Re: [I] Allow Parquet reader to read incorrectly written (negative) uint8, uint16 values for compatibility [arrow-rs]

via GitHub Fri, 07 Feb 2025 10:36:43 -0800


tustvold commented on issue #7040:
URL: https://github.com/apache/arrow-rs/issues/7040#issuecomment-2643705411


   Aah I see the confusion, I interpreted 
   
   > ExampleParquetWriter (from parquet-java) used here in Spark unit tests:
   > https://github.com/apache/spark/blob/ece14704cc083f17689d2e0b9ab8e31cf71a7a2d/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetIOSuite.scala#L871
   > (Spark itself writes only signed integers but will read the file written by
   > the ExampleParquetWriter successfully).
   
   as meaning that some buggy Spark test harness was producing invalid files.
   I agree that if parquet-java can produce such invalid files, users may want
   to control what behaviour they get.
   
   Given this is inherently not standardised, providing a way to configure
   this seems reasonable to me.


