etseidl commented on issue #7040:
URL: https://github.com/apache/arrow-rs/issues/7040#issuecomment-2626112376

   The Parquet spec does say:
   > If a stored value is larger than the maximum allowed by the annotation, the behavior is not defined and can be determined by the implementation. Implementations must not write values that are larger than the annotation allows.
   
   So there is no bug here, as @tustvold pointed out, but the writer that 
produced the file violated the spec.
   
   > Should we consider a reader option that allows 'java compatible' behavior?
   
   I think we should figure out what we want to do with this malformed data, 
and then do it consistently. I'd be inclined to do as arrow-cpp does and mask 
8- and 16-bit integers after reading, but that might have a small performance 
penalty. Gating this behavior behind an option may be the way to go.
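
To make the two choices concrete, here is a minimal sketch of what "mask after reading" versus a strict check could look like for values stored as 32-bit physical integers. The names `NarrowIntPolicy`, `narrow_to_i8`, and `narrow_to_i16` are hypothetical and are not part of arrow-rs or arrow-cpp. "Masking" is assumed here to mean keeping only the low 8 or 16 bits, which is what a truncating `as` cast does in Rust.

```rust
/// Hypothetical policy for values that exceed an 8- or 16-bit annotation.
/// This is an illustration only, not an existing arrow-rs reader option.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum NarrowIntPolicy {
    /// Return an error for any value outside the annotated range.
    Strict,
    /// Keep only the low bits and discard the rest (assumed "masking").
    Mask,
}

/// Narrow 32-bit physical values to i8 according to `policy`.
pub fn narrow_to_i8(values: &[i32], policy: NarrowIntPolicy) -> Result<Vec<i8>, String> {
    values
        .iter()
        .map(|&v| match policy {
            // `as` truncates: only the low 8 bits survive.
            NarrowIntPolicy::Mask => Ok(v as i8),
            NarrowIntPolicy::Strict => {
                i8::try_from(v).map_err(|_| format!("value {v} does not fit in 8 bits"))
            }
        })
        .collect()
}

/// Narrow 32-bit physical values to i16 according to `policy`.
pub fn narrow_to_i16(values: &[i32], policy: NarrowIntPolicy) -> Result<Vec<i16>, String> {
    values
        .iter()
        .map(|&v| match policy {
            // `as` truncates: only the low 16 bits survive.
            NarrowIntPolicy::Mask => Ok(v as i16),
            NarrowIntPolicy::Strict => {
                i16::try_from(v).map_err(|_| format!("value {v} does not fit in 16 bits"))
            }
        })
        .collect()
}
```

With `Mask`, an out-of-range value such as 300 becomes 44 (its low 8 bits), while `Strict` reports an error instead. Gating would amount to letting the caller pick the policy, so the extra per-value work is only paid when requested.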


