Re: [PR] Expose variantShreddingFunc() in Parquet.DataWriteBuilder [iceberg]

via GitHub Sat, 18 Oct 2025 03:34:44 -0700


amogh-jahagirdar commented on code in PR #14153:
URL: https://github.com/apache/iceberg/pull/14153#discussion_r2377284529



##########
parquet/src/test/java/org/apache/iceberg/parquet/TestParquetDataWriter.java:
##########
@@ -113,13 +121,17 @@ public void testDataWriter() throws IOException {
     List<Record> writtenRecords;
     try (CloseableIterable<Record> reader =
         Parquet.read(file.toInputFile())
-            .project(SCHEMA)
-            .createReaderFunc(fileSchema -> GenericParquetReaders.buildReader(SCHEMA, fileSchema))
+            .project(schema)
+            .createReaderFunc(fileSchema -> GenericParquetReaders.buildReader(schema, fileSchema))
             .build()) {
       writtenRecords = Lists.newArrayList(reader);
     }
 
-    assertThat(writtenRecords).as("Written records should match").isEqualTo(records);
+    assertThat(writtenRecords).hasSameSizeAs(records);
+
+    for (int i = 0; i < records.size(); i++) {
+      InternalTestHelpers.assertEquals(schema.asStruct(), records.get(i), writtenRecords.get(i));

Review Comment:
   @huaxingao I'm OK if we want to do that. I think we'd have to read the
footer, get the schema from it, and check that the variant has a
typed_value, etc. It's not too much additional work, and it does give us
confidence that the Parquet file is shredded as expected.
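
   A rough sketch of what that check could look like, not the actual test
code. In the real test the schema would come from the Parquet footer (for
example through parquet-java's `ParquetFileReader` footer metadata). The
`Node` class, the `ShreddingCheck` helper, and the column name `v` below are
all hypothetical stand-ins so the snippet runs without Parquet on the
classpath.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a Parquet schema node. A real test would walk
// the schema taken from the file footer instead of this toy model.
final class Node {
  final String name;
  final List<Node> fields; // empty for primitive columns

  Node(String name, Node... fields) {
    this.name = name;
    this.fields = Arrays.asList(fields);
  }

  // Returns the direct child with the given name, or null if absent.
  Node child(String childName) {
    for (Node field : fields) {
      if (field.name.equals(childName)) {
        return field;
      }
    }
    return null;
  }
}

final class ShreddingCheck {
  private ShreddingCheck() {}

  // Assumption: a variant column counts as shredded when its group
  // contains a field named typed_value.
  static boolean isShredded(Node fileSchema, String variantColumn) {
    Node variant = fileSchema.child(variantColumn);
    return variant != null && variant.child("typed_value") != null;
  }

  public static void main(String[] args) {
    Node schema =
        new Node(
            "table",
            new Node("id"),
            new Node("v", new Node("metadata"), new Node("value"), new Node("typed_value")));
    System.out.println("shredded: " + isShredded(schema, "v"));
  }
}
```

   The point is only that the assertion runs against the file schema rather
than the records, so it would fail if the writer silently fell back to an
unshredded layout.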



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


