openinx commented on a change in pull request #3987:
URL: https://github.com/apache/iceberg/pull/3987#discussion_r810839863



##########
File path: flink/v1.14/build.gradle
##########
@@ -67,6 +67,7 @@ project(':iceberg-flink:iceberg-flink-1.14') {
       exclude group: 'org.apache.hive', module: 'hive-storage-api'
     }
 
+    testImplementation "org.apache.spark:spark-sql_2.12:3.2.0"

Review comment:
       Seems like this dependency was added for writing the timestamp as INT96 
in the unit test, but in fact Apache Flink's `ParquetRowDataWriter` already 
supports writing a `timestamp_with_local_time_zone` as INT96. So I would 
suggest using the Flink parquet writer rather than the Spark parquet writer. 
(It seems strange to me to introduce a Spark module inside the Flink module.)
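
For reference, the Impala-style INT96 layout both writers produce is 8 
little-endian bytes of nanos-of-day followed by 4 little-endian bytes of 
Julian day. A rough sketch of the encoding (the helper name and the 
non-negative-epoch simplification here are illustrative, not the actual 
`ParquetRowDataWriter` code):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import org.apache.flink.table.data.TimestampData;
import org.apache.parquet.io.api.Binary;

public class Int96Encoding {
  // Julian day number of the Unix epoch (1970-01-01).
  private static final long JULIAN_EPOCH_OFFSET_DAYS = 2_440_588L;
  private static final long MILLIS_IN_DAY = 86_400_000L;

  // Sketch: encode TimestampData as a 12-byte INT96 value.
  // Assumes a non-negative epoch timestamp for simplicity.
  static Binary timestampToInt96(TimestampData timestamp) {
    long millis = timestamp.getMillisecond();
    long julianDay = millis / MILLIS_IN_DAY + JULIAN_EPOCH_OFFSET_DAYS;
    long nanosOfDay =
        (millis % MILLIS_IN_DAY) * 1_000_000L + timestamp.getNanoOfMillisecond();

    ByteBuffer buf = ByteBuffer.allocate(12).order(ByteOrder.LITTLE_ENDIAN);
    buf.putLong(nanosOfDay);      // first 8 bytes: nanoseconds within the day
    buf.putInt((int) julianDay);  // last 4 bytes: Julian day number
    buf.flip();
    return Binary.fromConstantByteBuffer(buf);
  }
}
```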

##########
File path: flink/v1.14/flink/src/main/java/org/apache/iceberg/flink/data/FlinkParquetReaders.java
##########
@@ -321,6 +327,26 @@ public DecimalData read(DecimalData ignored) {
     }
   }
 
+  private static class TimestampInt96Reader extends ParquetValueReaders.UnboxedReader<Long> {
+
+    TimestampInt96Reader(ColumnDescriptor desc) {
+      super(desc);
+    }
+
+    @Override
+    public Long read(Long ignored) {
+      return readLong();
+    }
+
+    @Override
+    public long readLong() {

Review comment:
       Since we are writing the `TimestampData` as an int96, I think we will 
need to restore the int96 back to a `TimestampData` here, rather than 
returning the raw long.
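
For example, something along these lines (a rough sketch only; the 
`PrimitiveReader` base class and the `column` field follow the existing 
`ParquetValueReaders` conventions, but the exact shape is an assumption, not 
the final PR code):

```java
// Sketch: restore the 12-byte INT96 value to Flink's TimestampData.
// `column` is the protected ColumnIterator field inherited from PrimitiveReader.
private static class TimestampInt96Reader
    extends ParquetValueReaders.PrimitiveReader<TimestampData> {

  private static final long JULIAN_EPOCH_OFFSET_DAYS = 2_440_588L;
  private static final long MILLIS_IN_DAY = 86_400_000L;

  TimestampInt96Reader(ColumnDescriptor desc) {
    super(desc);
  }

  @Override
  public TimestampData read(TimestampData ignored) {
    // INT96 layout: 8 little-endian bytes of nanos-of-day,
    // then 4 little-endian bytes of Julian day.
    ByteBuffer buf = column.nextBinary().toByteBuffer().order(ByteOrder.LITTLE_ENDIAN);
    long nanosOfDay = buf.getLong();
    long julianDay = buf.getInt();

    long epochMillis = (julianDay - JULIAN_EPOCH_OFFSET_DAYS) * MILLIS_IN_DAY
        + nanosOfDay / 1_000_000L;
    int nanoOfMillis = (int) (nanosOfDay % 1_000_000L);
    return TimestampData.fromEpochMillis(epochMillis, nanoOfMillis);
  }
}
```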




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


