atwam commented on issue #9716:
URL: https://github.com/apache/arrow-rs/issues/9716#issuecomment-4245646012

   I had a closer look at the arrow-cpp implementation for this. In 
`GetZeroBasedValueOffsets`, we don't canonicalize empty variable-size offsets on 
write; we just reuse whatever the in-memory array already has. So if we already 
have a canonical one-element buffer, we keep it. If we have a null or 
zero-length offsets buffer, C++ preserves that on IPC write.
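   
   To make the difference concrete, here is a minimal sketch in Rust of what canonicalizing on write could look like. `canonical_offsets` is a hypothetical helper, not an arrow-rs or arrow-cpp function, and it assumes 32-bit offsets and a plain slice instead of a real Arrow buffer. It relies on the columnar format's rule that an offsets buffer holds `length + 1` entries, so an empty array would carry a single `0`.

```rust
/// Hypothetical helper for illustration only; not part of arrow-rs or arrow-cpp.
/// Returns zero-based offsets, emitting the single-entry `[0]` form when the
/// input offsets buffer is null or zero-length.
fn canonical_offsets(offsets: &[i32]) -> Vec<i32> {
    match offsets.first() {
        // Empty input: an array of length 0 still gets length + 1 = 1 offset.
        None => vec![0],
        // Non-empty input: rebase so the first offset is zero.
        Some(&base) => offsets.iter().map(|o| o - base).collect(),
    }
}
```

   The permissive behavior described above corresponds to skipping the `None` branch and writing the empty buffer through unchanged.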
   
   Strikingly, C++ explicitly exercises this permissive case, including in 
[validation](https://github.com/apache/arrow/blob/4eca50770f7f2c5938a676f0719fbfc8aae4803c/cpp/src/arrow/array/validate.cc#L916).
 Now the question is whether the spec should be updated (in which case other Arrow 
libraries such as polars/arrow2 would have to change), or whether we stick with 
the spec and output strictly compliant files.

