emkornfield commented on code in PR #3390:
URL: https://github.com/apache/parquet-java/pull/3390#discussion_r2719334713


##########
parquet-column/src/main/java/org/apache/parquet/column/values/alp/AlpConstants.java:
##########
@@ -0,0 +1,136 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.parquet.column.values.alp;
+
+/**
+ * Constants for the ALP (Adaptive Lossless floating-Point) encoding.
+ *
+ * <p>ALP encoding converts floating-point values to integers using decimal scaling,
+ * then applies Frame of Reference (FOR) encoding and bit-packing.
+ * Values that cannot be losslessly converted are stored as exceptions.
+ *
+ * <p>Based on the paper: "ALP: Adaptive Lossless floating-Point Compression" (SIGMOD 2024)
+ *
+ * @see <a href="https://dl.acm.org/doi/10.1145/3626717">ALP Paper</a>
+ */
+public final class AlpConstants {
+
+  private AlpConstants() {
+    // Utility class
+  }
+
+  // ========== Page Header Constants ==========
+
+  /** Current ALP format version */
+  public static final int ALP_VERSION = 1;
+
+  /** ALP compression mode identifier (0 = ALP) */
+  public static final int ALP_COMPRESSION_MODE = 0;
+
+  /** FOR encoding for integers (0 = FOR) */
+  public static final int ALP_INTEGER_ENCODING_FOR = 0;
+
+  /** Size of the ALP page header in bytes */
+  public static final int ALP_HEADER_SIZE = 8;
+
+  // ========== Vector Constants ==========
+
+  /** Default number of elements per compressed vector (2^10 = 1024) */
+  public static final int ALP_VECTOR_SIZE = 1024;
+
+  /** Log2 of the default vector size */
+  public static final int ALP_VECTOR_SIZE_LOG = 10;
+
+  // ========== Exponent/Factor Limits ==========
+
+  /** Maximum exponent for float encoding (10^10 ~ 10 billion) */
+  public static final int FLOAT_MAX_EXPONENT = 10;
+
+  /** Maximum exponent for double encoding (10^18 ~ 1 quintillion) */
+  public static final int DOUBLE_MAX_EXPONENT = 18;
+
+  /** Number of (exponent, factor) combinations for float: sum(1..11) = 66 */
+  public static final int FLOAT_COMBINATIONS = 66;
+
+  /** Number of (exponent, factor) combinations for double: sum(1..19) = 190 */
+  public static final int DOUBLE_COMBINATIONS = 190;
+
+  // ========== Sampling Constants ==========
+
+  /** Number of values sampled per vector */
+  public static final int SAMPLER_SAMPLES_PER_VECTOR = 256;

Review Comment:
   I wonder if these should be configurable somehow? Probably OK if not.
   
   Can these be package-private?
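
To make the package-private question concrete, here is a minimal sketch of what that could look like. It is not the code in this PR: the class name `AlpConstantsSketch` and the `combinations` helper are hypothetical, and the helper assumes the counts in the Javadoc (sum(1..11) = 66, sum(1..19) = 190) come from pairs with 0 <= factor <= exponent <= max exponent.

```java
// Hypothetical sketch, not the code in this PR: same values, narrower visibility.
final class AlpConstantsSketch {

  private AlpConstantsSketch() {
    // Utility class
  }

  // No access modifier means package-private: in the real file these would be
  // visible only inside org.apache.parquet.column.values.alp
  // (the package declaration is omitted to keep the sketch self-contained).
  static final int FLOAT_MAX_EXPONENT = 10;
  static final int DOUBLE_MAX_EXPONENT = 18;

  // Derived from the limits instead of hard-coded, so the counts cannot drift.
  static final int FLOAT_COMBINATIONS = combinations(FLOAT_MAX_EXPONENT);
  static final int DOUBLE_COMBINATIONS = combinations(DOUBLE_MAX_EXPONENT);

  /**
   * Counts (exponent, factor) pairs, assuming 0 <= factor <= exponent <= maxExponent.
   * That is 1 + 2 + ... + (maxExponent + 1).
   */
  static int combinations(int maxExponent) {
    return (maxExponent + 1) * (maxExponent + 2) / 2;
  }
}
```

One trade-off: fields initialized through a method call are no longer compile-time constants, so they cannot be used in `switch` labels or annotation values. If that matters, the literal values could stay and a unit test could check them against the formula.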



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

