steveloughran commented on code in PR #14853:
URL: https://github.com/apache/iceberg/pull/14853#discussion_r2699781090


##########
arrow/src/main/java/org/apache/iceberg/arrow/vectorized/parquet/VectorizedPageIterator.java:
##########
@@ -100,6 +101,14 @@ protected void initDataReader(Encoding dataEncoding, ByteBufferInputStream in, i
         case DELTA_BINARY_PACKED:
           valuesReader = new VectorizedDeltaEncodedValuesReader();
           break;
+        case RLE:
+          if (desc.getPrimitiveType().getPrimitiveTypeName()
+              == PrimitiveType.PrimitiveTypeName.BOOLEAN) {
+            valuesReader =
+                new VectorizedRunLengthEncodedParquetValuesReader(setArrowValidityVector);
+            break;
+          }
+          // fall through

Review Comment:
   This is quite a serious fall-through, given that the Parquet spec limits 
RLE to booleans, repetition and definition levels, and dictionary indices. 
Is a non-boolean RLE data page likely to occur in the wild?
   
   If so, it probably merits a test case checking that a column whose 
type != BOOLEAN cannot be init()-ed with RLE data encoding.
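
A rough sketch of the kind of check meant here, written against hypothetical stand-in enums rather than the real Parquet and Iceberg classes so that it compiles on its own. `RleDispatchSketch`, `readerFor`, `rejects`, the returned reader names, and the choice of `UnsupportedOperationException` are all assumptions for illustration; the real test would go through `VectorizedPageIterator` and assert on whatever it actually throws.

```java
// Hypothetical stand-in for the encoding switch in initDataReader().
// None of these names come from Iceberg or Parquet; they only model the shape.
final class RleDispatchSketch {
  enum Encoding { PLAIN, RLE, DELTA_BINARY_PACKED }

  enum TypeName { BOOLEAN, INT32, INT64, BINARY }

  // RLE is accepted only for BOOLEAN columns. Every other type is rejected
  // explicitly instead of relying on a fall-through to the next case.
  static String readerFor(Encoding encoding, TypeName type) {
    switch (encoding) {
      case PLAIN:
        return "plain";
      case DELTA_BINARY_PACKED:
        return "delta-binary-packed";
      case RLE:
        if (type == TypeName.BOOLEAN) {
          return "rle-boolean";
        }
        // Assumed exception type; the real code may throw something else.
        throw new UnsupportedOperationException(
            "RLE data encoding is not supported for column type " + type);
      default:
        throw new UnsupportedOperationException("Unsupported encoding " + encoding);
    }
  }

  // Returns true when initialising a reader for this combination fails.
  static boolean rejects(Encoding encoding, TypeName type) {
    try {
      readerFor(encoding, type);
      return false;
    } catch (UnsupportedOperationException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    // The test case: RLE must be rejected for every non-BOOLEAN type.
    for (TypeName type : TypeName.values()) {
      boolean shouldReject = type != TypeName.BOOLEAN;
      if (rejects(Encoding.RLE, type) != shouldReject) {
        throw new AssertionError("Unexpected RLE handling for " + type);
      }
    }
    System.out.println("RLE accepted only for BOOLEAN");
  }
}
```

Looping over every type name, rather than picking one non-boolean type, keeps the check honest if new primitive types are added later.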



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]