Re: [PR] Spark, Arrow, Parquet: Add vectorized read support for parquet RLE encoded data pages [iceberg]

via GitHub Mon, 16 Mar 2026 05:11:02 -0700


pvary commented on code in PR #15410:
URL: https://github.com/apache/iceberg/pull/15410#discussion_r2939969688



##########
spark/v3.4/spark/src/test/java/org/apache/iceberg/spark/data/parquet/vectorized/TestParquetVectorizedReads.java:
##########
@@ -316,23 +324,30 @@ public void testSupportedReadsForParquetV2() throws Exception {
   }
 
   @Test
-  public void testUnsupportedReadsForParquetV2() throws Exception {
-    // Longs, ints, string types etc use delta encoding and which are not supported for vectorized
-    // reads
-    Schema schema = new Schema(SUPPORTED_PRIMITIVES.fields());
-    OutputFile outputFile = new InMemoryOutputFile();
-    Iterable<GenericData.Record> data =
-        generateData(schema, 30000, 0L, RandomData.DEFAULT_NULL_PERCENTAGE, IDENTITY);
-    try (FileAppender<GenericData.Record> writer = getParquetV2Writer(schema, outputFile)) {
-      writer.addAll(data);
-    }
+  public void testRLEEncodingOnlySupportsBooleanDataPage() {
+    MessageType schema =
+        new MessageType(
+            "test",
+            primitive(PrimitiveTypeName.INT32, Type.Repetition.OPTIONAL).id(1).named("int_col"));

Review Comment:
   Do I understand correctly that the previous test covered every 
`SUPPORTED_PRIMITIVES` field, whereas the new test only covers a single `int` 
column? Is this intentional?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


