LuciferYang opened a new pull request, #56564:
URL: https://github.com/apache/spark/pull/56564
### What changes were proposed in this pull request?
`IndexShuffleBlockResolver.getChecksums` opens the checksum file with
`var in: DataInputStream = null` inside a `try` and closes it in
`finally { in.close() }`. The method already catches
`IOException`/`EOFException`
to return `null`, so it is meant to degrade gracefully when the file cannot
be
read. But the `finally` is not null-safe: if the failure originates in the
stream
constructor (before `in` is assigned), it dereferences `null` and throws a
`NullPointerException` that masks the original error.
The two sibling methods in the same file already handle constructor failure
correctly: `checkIndexAndDataFile` constructs the stream in its own
`try/catch`,
and `getMergedBlockData` uses `Utils.tryWithResource`. This change makes
`getChecksums` consistent by using `Utils.tryWithResource`, which evaluates
the
constructor before its own `try`, so a constructor `IOException` reaches the
existing `catch` and the method returns `null` as intended.
### Why are the changes needed?
`getChecksums` has a latent `NullPointerException`: the non-null-safe
`finally { in.close() }` would mask the real error if opening the checksum
file
fails at construction time, which is inconsistent with the rest of the file.
No
common code path is known to trigger it today; this is a robustness and
consistency improvement that removes the latent NPE and aligns the method
with
its siblings.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Added a unit test in `IndexShuffleBlockResolverSuite` that forces the stream
constructor to fail (a checksum file whose `exists()` returns true but which
cannot be opened) and asserts `getChecksums` returns `null` instead of
throwing.
The test fails on the old code (NPE) and passes with the fix.
### Was this patch authored or co-authored using generative AI tooling?
Generated-by: Claude Code (Claude Opus 4.8)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]