Edward Seidl created PARQUET-2435:
-------------------------------------

             Summary: Clarify behavior of DELTA_BINARY_PACKED encoders/decoders
                 Key: PARQUET-2435
                 URL: https://issues.apache.org/jira/browse/PARQUET-2435
             Project: Parquet
          Issue Type: Improvement
          Components: parquet-format
            Reporter: Edward Seidl


I brought this issue up some time ago on the mailing list [1]; in short, I 
would like to add some clarification to the DELTA_BINARY_PACKED section of 
Encodings.md.  The issue is that while the specification does not limit the 
number of bits that can be used to encode deltas, some readers expect a maximum 
of 32 bits for INT32 data and 64 bits for INT64 data [2]. I propose adding 
wording to the specification to the effect that while using 33 bits to encode 
INT32 data (or 65 bits for INT64 data) is allowed, it is not recommended, and 
that readers _should_ be able to read such data but are not required to.
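
To illustrate how a 33-bit width can come about, here is a minimal sketch in 
Python. It assumes the writer computes deltas, subtracts the block's minimum 
delta, and sizes the bit width from the largest result. The helper names 
(to_signed, adjusted_bit_width) are hypothetical and not taken from any Parquet 
implementation; the sketch ignores blocks and miniblocks entirely.

```python
INT32_MIN = -(1 << 31)
INT32_MAX = (1 << 31) - 1


def to_signed(x, bits):
    """Wrap an integer into the two's-complement signed range of `bits` bits."""
    x &= (1 << bits) - 1
    return x - (1 << bits) if x >> (bits - 1) else x


def adjusted_bit_width(values, bits=None):
    """Bit width needed for (delta - min_delta) over a run of values.

    bits=None uses unbounded arithmetic (a writer that widens before
    subtracting); bits=32 uses wrapping 32-bit arithmetic throughout.
    """
    deltas = [b - a for a, b in zip(values, values[1:])]
    if bits is not None:
        deltas = [to_signed(d, bits) for d in deltas]
    min_delta = min(deltas)
    adjusted = [d - min_delta for d in deltas]
    if bits is not None:
        # Adjusted deltas are stored as unsigned values of the native width.
        adjusted = [a & ((1 << bits) - 1) for a in adjusted]
    return max(adjusted).bit_length()


# Worst case for INT32: swing from one extreme to the other and back.
values = [INT32_MIN, INT32_MAX, INT32_MIN]
wide_width = adjusted_bit_width(values)             # unbounded arithmetic
wrapped_width = adjusted_bit_width(values, bits=32)  # wrapping arithmetic
```

Under these assumptions the unbounded variant needs 33 bits for this input, 
while the wrapping variant stays within 32. That difference is the gap between 
what the specification allows and what some readers expect.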

[1] https://lists.apache.org/thread/2wj88oghc0t6qqj8ojp5p5tf8wg11840


[2] https://github.com/apache/arrow/issues/20374



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
