cloud-fan commented on PR #56104: URL: https://github.com/apache/spark/pull/56104#issuecomment-4594790509
@MaxGekk Good catch, this was a real sharp edge introduced by the migration. Fixed in a2d763f. I made it fail loudly rather than write through. A uniform write-through is not safe: for an off-heap view, or a view pointing into a shared `UnsafeRow` / `ColumnVector` buffer, mutating in place would corrupt memory the value does not own — which is exactly what the `copy()`-before-mutate contract (already documented on `Geometry.copy()` / `Geography.copy()`) exists to prevent. So instead of silently mutating a throwaway buffer, `setSrid` now throws and directs the caller to `copy()` first. Concretely: - Added `BinaryView.hasTightOnHeapArray()`, which exposes the exact condition under which `getBytes()` returns the live backing array (tight on-heap `byte[]`), and refactored `getBytes()` plus the serializers onto it. - `Geometry.setSrid` / `Geography.setSrid` now check it and throw `IllegalStateException` (pointing to `copy()`) when the value is a slice / sub-range / off-heap view. The sole caller (`STUtils.stGeom/GeogSetSrid`) already does `.copy()` first, so it is unaffected. Added tests in `BinaryViewSuite` (the predicate across tight / sub-range / slice / off-heap / post-`copy()`) and in `Geometry`/`GeographyExecutionSuite` asserting `setSrid` throws on a non-owning view and succeeds after `copy()`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
