[GitHub] [iceberg] singhpk234 commented on a diff in pull request #7391: Build: Run Iceberg with JDK 17

via GitHub Fri, 21 Apr 2023 02:03:00 -0700


singhpk234 commented on code in PR #7391:
URL: https://github.com/apache/iceberg/pull/7391#discussion_r1173527238



##########
spark/v3.4/spark/src/test/java/org/apache/iceberg/spark/source/TestMetadataTableReadableMetrics.java:
##########
@@ -190,6 +191,7 @@ private GenericRecord createNestedRecord(Long longCol, Double doubleCol) {
   }
 
   @Test
+  @Ignore

Review Comment:
   This is interesting, and it happens only in the Java 17 environment: 
   the Parquet file that gets written has a different `column_size` value in 
the `files` metadata table. On Java 8 it is 44, and on Java 17 it is 43.
   Effectively, this line produces different results: 
   
https://github.com/apache/iceberg/blob/d04efee702fcdcdbe3659c12f7442f5000aa246a/parquet/src/main/java/org/apache/iceberg/parquet/ParquetUtil.java#L127
   
   
   Is it because Java 17 provides better compression than Java 8? See this 
blog: https://dkomanov.medium.com/java-compression-performance-fb373078cfde
   
   Since this value comes directly from the Parquet footer, we are 
effectively reading back what was written, and the stats are not getting 
altered in between, which seems correct to me. I might be wrong here, though, 
so I will wait for feedback from others on whether a deeper investigation is 
required.
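
   One way to probe the compression hypothesis is to compress an identical 
payload on both JDKs and compare the output sizes. The following is a minimal 
sketch, not the Parquet write path: it assumes a gzip (zlib-based) codec, and 
the `CompressedSizeProbe` class, its methods, and the sample payload are 
hypothetical names introduced only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of Iceberg or Parquet.
public class CompressedSizeProbe {

  // Gzip-compress the input with the JDK's default settings.
  public static byte[] gzip(byte[] input) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
      gz.write(input);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    // The gzip stream is closed (and therefore finished) at this point.
    return out.toByteArray();
  }

  // Decompress gzip bytes, used here only to confirm the round trip.
  public static byte[] gunzip(byte[] compressed) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try (GZIPInputStream gz =
        new GZIPInputStream(new ByteArrayInputStream(compressed))) {
      byte[] buf = new byte[256];
      int n;
      while ((n = gz.read(buf)) != -1) {
        out.write(buf, 0, n);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return out.toByteArray();
  }

  public static void main(String[] args) {
    // Assumed sample payload; the real column chunk bytes will differ.
    byte[] sample =
        "hypothetical column chunk payload".getBytes(StandardCharsets.UTF_8);
    byte[] compressed = gzip(sample);
    System.out.println(
        System.getProperty("java.version") + " -> " + compressed.length + " bytes");
  }
}
```

   Running `main` on Java 8 and on Java 17 and comparing the printed sizes 
would show whether the JDK's gzip output length differs for this input. A 
difference there would be consistent with the 44 vs 43 observation, though it 
would not by itself prove the cause.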
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

