jubins opened a new pull request, #28473:
URL: https://github.com/apache/flink/pull/28473

   ## What is the purpose of the change
   
   Fixes [FLINK-39951](https://issues.apache.org/jira/browse/FLINK-39951) — 
`ArrayConstructor.construct()` uses reference equality (`==`) to compare the 
pickle typecode string `"l"` instead of value equality (`.equals()`). Since 
`args[0]` is a deserialized `String` object at runtime and not an interned 
literal, the `==` comparison always returns `false`, making the `long[]` code 
path effectively dead code.
   
   As a result, any Python array with typecode `'l'` (64-bit longs) silently 
falls through to `super.construct()`, which treats the values as `int[]` and 
truncates them to 32 bits. Users passing long values larger than 
`Integer.MAX_VALUE` receive silently corrupted data with no error or warning.
   
   The fix replaces `args[0] == "l"` with `"l".equals(args[0])` on line 30 of 
`ArrayConstructor.java`.
   
   ## Brief change log
   
   - Fixed `ArrayConstructor.construct()` to use `"l".equals(args[0])` instead 
of `args[0] == "l"` for typecode comparison
   - Added `ArrayConstructorTest` with four test cases covering: large long 
values above `Integer.MAX_VALUE`, single value arrays, empty arrays, and 
delegation to the parent class for non-`'l'` typecodes
   
   ## Verifying this change
   
   This change is covered by new unit tests in `ArrayConstructorTest`:
   - `testLongArrayPreservesValuesAboveIntMax()` — the core regression test; 
uses `new String("l")` to guarantee a non-interned reference (proving the old 
`==` would have failed), then asserts that values like `3_000_000_000L` and 
`Long.MAX_VALUE` are preserved correctly as `long[]`
   - `testLongArrayWithSingleValue()` — verifies a single large long value is 
handled correctly
   - `testLongArrayWithEmptyList()` — verifies empty input produces an empty 
`long[]`
   - `testNonLongTypecodesDelegateToSuper()` — confirms other typecodes (e.g. 
`'i'`) still fall through to the parent `super.construct()` correctly
   
   ## Does this pull request potentially affect one of the following parts
   
   - **Dependencies** (does it add or upgrade a dependency): no
   - **The public API**, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
   - **The serializers**: no
   - **The runtime per-record code paths** (performance sensitive): no
   - **Anything that affects deployment or recovery** (JobManager, 
Checkpointing, Kubernetes/Yarn, ZooKeeper): no
   - **The S3 file system connector**: no
   
   ## Documentation
   
   - Does this pull request introduce a new feature? no — it fixes a bug in 
existing functionality
   - If yes, how is the feature documented? not applicable
   
   ## Was generative AI tooling used to co-author this PR?
   
   - [x] Yes — Claude Code was used as a pair-programming assistant. All code 
was written, understood, and verified by the author.
   Generated-by: Claude Opus 4.8


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to