xiangfu0 commented on code in PR #18092:
URL: https://github.com/apache/pinot/pull/18092#discussion_r3036489141


##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/io/writer/impl/BaseChunkForwardIndexWriter.java:
##########
@@ -88,13 +90,40 @@ public abstract class BaseChunkForwardIndexWriter implements Closeable {
   protected BaseChunkForwardIndexWriter(File file, ChunkCompressionType compressionType, int totalDocs,
       int numDocsPerChunk, long chunkSize, int sizeOfEntry, int version, boolean fixed)
       throws IOException {
-    Preconditions.checkArgument(version == 2 || version == 3 || (fixed && (version == 4 || version == 5)),
+    this(file, compressionType, null, sizeOfEntry, totalDocs, numDocsPerChunk, chunkSize, sizeOfEntry, version, fixed);
+  }
+
+  /**
+   * Constructor with optional transform codec support.
+   *
+   * @param file Data file to write into
+   * @param compressionType Type of compression
+   * @param transformCodec Optional transform codec to apply before compression
+   * @param valueSizeInBytes Size of each typed value (4 for INT, 8 for LONG); used by the transform
+   * @param totalDocs Total docs to write
+   * @param numDocsPerChunk Number of docs per data chunk
+   * @param chunkSize Size of chunk
+   * @param sizeOfEntry Size of entry (in bytes); max size for the variable byte implementation
+   * @param version Version of the file format
+   * @param fixed Whether the data type is fixed width (required for version validation)
+   * @throws IOException if the file isn't found or can't be mapped
+   */
+  protected BaseChunkForwardIndexWriter(File file, ChunkCompressionType compressionType,
+      @Nullable TransformCodec transformCodec, int valueSizeInBytes, int totalDocs,
+      int numDocsPerChunk, long chunkSize, int sizeOfEntry, int version, boolean fixed)
+      throws IOException {
+    boolean hasTransform = transformCodec != null && transformCodec != TransformCodec.NONE;
+    Preconditions.checkArgument(
+        version == 2 || version == 3 || (fixed && (version == 4 || version == 5 || version == 6)),
         "Illegal version: %s for %s bytes values", version, fixed ? "fixed" : "variable");
+    if (hasTransform) {
+      Preconditions.checkArgument(version == 6, "TransformCodec requires writer version 6, got: %s", version);
+    }
     Preconditions.checkArgument(chunkSize <= Integer.MAX_VALUE, "Chunk size limited to 2GB");
     _chunkSize = (int) chunkSize;
-    _chunkCompressor = ChunkCompressorFactory.getCompressor(compressionType);
+    _chunkCompressor = ChunkCompressorFactory.getCompressor(compressionType, transformCodec, valueSizeInBytes);
     _headerEntryChunkOffsetSize = version == 2 ? Integer.BYTES : Long.BYTES;
-    _dataOffset = writeHeader(compressionType, totalDocs, numDocsPerChunk, sizeOfEntry, version);
+    _dataOffset = writeHeader(compressionType, transformCodec, totalDocs, numDocsPerChunk, sizeOfEntry, version);
     _chunkBuffer = ByteBuffer.allocateDirect(_chunkSize);

Review Comment:
   The writer version was changed from 6 to 7. Full end-to-end roundtrip tests exist in `ChunkCodecPipelineTest`: tests like `testDeltaPipelineEndToEnd`, `testDoubleDeltaPipelineEndToEnd`, and `testXorPipelineEndToEnd*` write v7 indexes with transforms and read them back, validating header parsing and data correctness.
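The version gating in the constructor above can be sketched in isolation. This is an illustrative standalone version following the checks as they appear in this revision of the diff (which gates on version 6; the review note mentions a later bump to 7). The class name and the `XOR` enum constant are assumptions for illustration:

```java
// Standalone sketch of the writer-version validation shown in the diff.
// TransformCodec values other than NONE are assumed for illustration.
public class VersionValidationSketch {
  enum TransformCodec { NONE, XOR }

  static void validate(int version, boolean fixed, TransformCodec codec) {
    boolean hasTransform = codec != null && codec != TransformCodec.NONE;
    // v2/v3 are always legal; v4..v6 only for fixed-width values
    if (!(version == 2 || version == 3 || (fixed && (version == 4 || version == 5 || version == 6)))) {
      throw new IllegalArgumentException("Illegal version: " + version);
    }
    // a transform codec is only representable in the v6 header
    if (hasTransform && version != 6) {
      throw new IllegalArgumentException("TransformCodec requires writer version 6, got: " + version);
    }
  }

  public static void main(String[] args) {
    validate(3, false, null);                // variable-width, no transform: ok
    validate(6, true, TransformCodec.XOR);   // fixed-width transform on v6: ok
    try {
      validate(5, true, TransformCodec.XOR); // transform on v5 must be rejected
      throw new AssertionError("expected IllegalArgumentException");
    } catch (IllegalArgumentException expected) {
      // rejected as expected
    }
    System.out.println("validation ok");
  }
}
```

Note the ordering: the transform check runs after the general version check, so a v5 writer with a transform fails with the version message rather than slipping through.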



##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/readers/forward/BaseChunkForwardIndexReader.java:
##########
@@ -84,14 +87,26 @@ protected BaseChunkForwardIndexReader(PinotDataBuffer dataBuffer, DataType store
       headerOffset += Integer.BYTES;

       _compressionType = ChunkCompressionType.valueOf(_dataBuffer.getInt(headerOffset));
-      _chunkDecompressor = ChunkCompressorFactory.getDecompressor(_compressionType);
-      _isCompressed = !_compressionType.equals(ChunkCompressionType.PASS_THROUGH);
-
       headerOffset += Integer.BYTES;
+
+      // Version 6+ includes a transform codec field
+      TransformCodec transformCodec = null;
+      if (version >= 6) {
+        transformCodec = TransformCodec.valueOf(_dataBuffer.getInt(headerOffset));
+        headerOffset += Integer.BYTES;
+      }
+      _transformCodec = transformCodec;
+
+      int valueSizeInBytes = storedType.isFixedWidth() ? storedType.size() : 0;
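The header read in this hunk can be sketched standalone: after the compression-type int, a version 6+ header carries one extra int holding the transform codec. The enum constants and their ordinals below are placeholders; Pinot's actual `valueOf(int)` mapping may differ:

```java
import java.nio.ByteBuffer;

// Sketch of the v6+ header layout read in the diff above. Enum names and
// ordinals are illustrative assumptions, not Pinot's real definitions.
public class HeaderReadSketch {
  enum ChunkCompressionType { PASS_THROUGH, SNAPPY, ZSTANDARD, LZ4 }
  enum TransformCodec { NONE, XOR }

  static TransformCodec readTransformCodec(ByteBuffer header, int version) {
    // the compression type occupies the first int of this header region
    ChunkCompressionType compression = ChunkCompressionType.values()[header.getInt()];
    TransformCodec codec = null;
    if (version >= 6) { // only v6+ headers carry the transform codec field
      codec = TransformCodec.values()[header.getInt()];
    }
    return codec;
  }

  public static void main(String[] args) {
    ByteBuffer v6 = ByteBuffer.allocate(8);
    v6.putInt(1); // SNAPPY (placeholder ordinal)
    v6.putInt(1); // XOR (placeholder ordinal)
    v6.flip();
    if (readTransformCodec(v6, 6) != TransformCodec.XOR) {
      throw new AssertionError("v6 header should yield the codec field");
    }

    ByteBuffer v3 = ByteBuffer.allocate(4);
    v3.putInt(1); // SNAPPY only; pre-v6 headers have no codec field
    v3.flip();
    if (readTransformCodec(v3, 3) != null) {
      throw new AssertionError("pre-v6 header should yield no codec");
    }
    System.out.println("header read ok");
  }
}
```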

Review Comment:
   Intentionally allowing all fixed-width types (INT, LONG, FLOAT, DOUBLE). The XOR transform is designed for FLOAT/DOUBLE columns, and the validation now checks `storedType.isFixedWidth()`, which correctly covers INT/LONG/FLOAT/DOUBLE.
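The XOR transform mentioned here is the Gorilla-style trick of XORing each value's raw bits with the previous value's bits, which turns runs of identical or slowly changing doubles into mostly-zero words that compress well. A minimal sketch for DOUBLE values, standalone and not Pinot's actual `TransformCodec` implementation:

```java
// Illustrative Gorilla-style XOR transform for DOUBLE values.
public class XorTransformSketch {
  static long[] encode(double[] values) {
    long[] out = new long[values.length];
    long prev = 0L;
    for (int i = 0; i < values.length; i++) {
      long bits = Double.doubleToRawLongBits(values[i]);
      out[i] = bits ^ prev; // repeated values become all-zero words
      prev = bits;
    }
    return out;
  }

  static double[] decode(long[] encoded) {
    double[] out = new double[encoded.length];
    long prev = 0L;
    for (int i = 0; i < encoded.length; i++) {
      long bits = encoded[i] ^ prev; // XOR is its own inverse
      out[i] = Double.longBitsToDouble(bits);
      prev = bits;
    }
    return out;
  }

  public static void main(String[] args) {
    double[] values = {1.0, 1.0, 1.5, 1.5, 2.0};
    long[] encoded = encode(values);
    if (encoded[1] != 0L) {
      throw new AssertionError("a repeated value should encode to zero");
    }
    double[] decoded = decode(encoded);
    for (int i = 0; i < values.length; i++) {
      if (values[i] != decoded[i]) {
        throw new AssertionError("roundtrip mismatch at index " + i);
      }
    }
    System.out.println("roundtrip ok");
  }
}
```

The same bit-level idea applies to FLOAT via `Float.floatToRawIntBits`, which is presumably why the value size (4 or 8 bytes) is threaded through to the compressor factory.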



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

