prtkgaur commented on code in PR #557:
URL: https://github.com/apache/parquet-format/pull/557#discussion_r3196529528


##########
Encodings.md:
##########
@@ -391,3 +375,518 @@ After applying the transformation, the data has the following representation:
 ```
 Bytes  AA 00 A3 BB 11 B4 CC 22 C5 DD 33 D6
 ```
+
+<a name="ALP"></a>
+### Adaptive Lossless floating-Point: (ALP = 10)
+
+Supported Types: FLOAT, DOUBLE
+
+This encoding is adapted from the paper
+["ALP: Adaptive Lossless floating-Point Compression"](https://dl.acm.org/doi/10.1145/3626717)
+by Afroozeh and Boncz (SIGMOD 2024).
+
+ALP works by converting floating-point values to integers using decimal scaling
+(controlled by an *exponent* `e` and *factor* `f`), then applying Frame of
+Reference (FOR) encoding and bit-packing. Values that cannot be losslessly
+converted are stored separately as *exceptions*. The encoding achieves high
+compression for decimal-like floating-point data (e.g., monetary values, sensor
+readings) while remaining fully lossless. Each value is encoded independently,
+enabling random access to individual vectors and parallel encode/decode.
+
+#### Overview

Review Comment:
   Yes, this makes sense. I also thought it had become long, and a separate file would be good.
   Let me take a stab at it.
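
   For readers following the thread, the scheme summarized in the quoted section (decimal scaling with an exponent `e` and factor `f`, Frame of Reference, exceptions) can be sketched roughly as below. This is a minimal illustration, not the encoding proposed in the PR: the helper names `alp_encode` and `alp_decode` are hypothetical, the scaling formula `round(v * 10^e * 10^-f)` follows the ALP paper as I understand it, exception slots are assumed to hold a placeholder equal to the reference, and actual bit-packing is omitted (only the required bit width is computed).

```python
import math
import struct


def _bits(x: float) -> bytes:
    # Bit pattern of a double, so that -0.0 and 0.0 compare as different.
    return struct.pack("<d", x)


def _unscale(d: int, e: int, f: int) -> float:
    # Inverse of the decimal scaling: d * 10^f * 10^-e.
    return d * 10.0**f * 10.0**-e


def alp_encode(values, e, f):
    """Hypothetical helper: returns (reference, bit_width, deltas, exceptions)."""
    ints, exceptions = [], {}
    for i, v in enumerate(values):
        scaled = v * 10.0**e * 10.0**-f
        d = round(scaled) if math.isfinite(scaled) else None
        # Keep the integer only if decoding reproduces v bit for bit.
        if d is not None and _bits(_unscale(d, e, f)) == _bits(v):
            ints.append(d)
        else:
            ints.append(None)
            exceptions[i] = v  # stored separately, keyed by position
    present = [d for d in ints if d is not None]
    reference = min(present) if present else 0
    # Frame of Reference: store non-negative offsets from the minimum.
    # Assumption: exception slots get a placeholder delta of 0.
    deltas = [0 if d is None else d - reference for d in ints]
    bit_width = max(deltas, default=0).bit_length()
    return reference, bit_width, deltas, exceptions


def alp_decode(reference, deltas, exceptions, e, f):
    out = [_unscale(delta + reference, e, f) for delta in deltas]
    for i, v in exceptions.items():
        out[i] = v  # patch exceptions back in
    return out
```

   With `e = 2` and `f = 0`, a value such as `math.pi` cannot be recovered from a two-decimal integer and ends up in `exceptions`, while the round trip through `alp_decode` still returns the original values exactly. A real encoder would also choose `e` and `f` adaptively and bound the integers to a fixed width; both are left out here.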



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

