This is an automated email from the ASF dual-hosted git repository.

apitrou pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/parquet-format.git


The following commit(s) were added to refs/heads/master by this push:
     new f65d4e1  PARQUET-2435: Clarify behavior of DELTA_BINARY_PACKED encoding (#231)
f65d4e1 is described below

commit f65d4e19a00955cc7b964c418708750055dde9d1
Author: Ed Seidl <[email protected]>
AuthorDate: Wed Feb 28 03:35:08 2024 -0800

    PARQUET-2435: Clarify behavior of DELTA_BINARY_PACKED encoding (#231)
    
    Address the issue of using more bits in the encoding than are used in
    the underlying type being encoded.
---
 Encodings.md | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/Encodings.md b/Encodings.md
index aaf7a36..5040094 100644
--- a/Encodings.md
+++ b/Encodings.md
@@ -245,7 +245,9 @@ Subtractions in steps 1) and 2) may incur signed arithmetic overflow, and so
 will the corresponding additions when decoding. Overflow should be allowed
 and handled as wrapping around in 2's complement notation so that the original
 values are correctly restituted. This may require explicit care in some programming
-languages (for example by doing all arithmetic in the unsigned domain).
+languages (for example by doing all arithmetic in the unsigned domain). Writers
+must not use more bits when bit packing the miniblock data than would be required
+to PLAIN encode the physical type (e.g. INT32 data must not use more than 32 bits).
 
 The following examples use 8 as the block size to keep the examples short,
 but in real cases it would be invalid.
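
To illustrate the clarified rule, here is a minimal Python sketch of why wrapping
arithmetic keeps INT32 miniblock data within 32 bits. It is not taken from any
Parquet implementation: the helper names (wrapping_sub32, to_signed32,
miniblock_bit_width, decode) are hypothetical, and it simplifies the format by
treating all deltas as a single miniblock with one min delta.

```python
MASK32 = 0xFFFFFFFF  # INT32 arithmetic is carried out modulo 2**32


def wrapping_sub32(a, b):
    """Return a - b modulo 2**32 (the unsigned view of 2's complement)."""
    return (a - b) & MASK32


def to_signed32(u):
    """Reinterpret an unsigned 32-bit pattern as a signed INT32 value."""
    return u - (1 << 32) if u & 0x80000000 else u


def miniblock_bit_width(values):
    """Bit width for the adjusted deltas of at least two INT32 values.

    Assumption: one miniblock covers every delta, which the real format
    does not require.
    """
    # Step 1: deltas between consecutive values, wrapping on overflow.
    deltas = [to_signed32(wrapping_sub32(b, a))
              for a, b in zip(values, values[1:])]
    min_delta = min(deltas)
    # Step 2: subtract the min delta, wrapping again; results fit in 32 bits.
    adjusted = [wrapping_sub32(d, min_delta) for d in deltas]
    return max(adjusted).bit_length(), min_delta, adjusted


def decode(first, min_delta, adjusted):
    """Rebuild the values with the matching wrapping additions."""
    out = [first]
    for adj in adjusted:
        out.append(to_signed32((out[-1] + min_delta + adj) & MASK32))
    return out


values = [-2**31, 2**31 - 1, -2**31]
width, min_delta, adjusted = miniblock_bit_width(values)
print(width, min_delta, adjusted)
print(decode(values[0], min_delta, adjusted) == values)
```

For these three extreme values, arbitrary-precision (non-wrapping) subtraction
would give deltas of 2**32 - 1 and -(2**32 - 1), so the adjusted deltas would
need 33 bits, which the added sentence forbids for INT32 data.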
