kbendick commented on a change in pull request #1572:
URL: https://github.com/apache/iceberg/pull/1572#discussion_r502843848
##########
File path:
spark/src/main/java/org/apache/iceberg/spark/data/SparkParquetReaders.java
##########
@@ -655,6 +653,11 @@ public int numElements() {
return numElements;
}
+ @Override
+ public int getNumElements() {
Review comment:
It looks like the function `numElements` is still there (it's part of
the catalyst abstract class), in the lines just above these ones.
Unless there's something going on with Spark, I agree with @rdblue.
I also checked whether `getNumElements` is needed on the Flink side. Flink
highly favors POJOs for performance reasons, since its own serialization can
infer them, and for style reasons it also tends to use POJO-style names even
when they aren't strictly necessary. However, the Flink `ArrayData` interface
uses `size` instead, so keeping the Iceberg `ReusableArrayData` interface on
`numElements` would kill two birds with one stone by also fulfilling the
catalyst contract.
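To make the point concrete, here is a minimal sketch. The `SparkStyleArrayData` and `FlinkStyleArrayData` types below are illustrative stand-ins for the real Spark catalyst and Flink `ArrayData` contracts, not the actual library classes:

```java
// Stand-in for the Spark catalyst ArrayData abstract class,
// which declares numElements() (hypothetical, for illustration).
abstract class SparkStyleArrayData {
  public abstract int numElements();
}

// Stand-in for the Flink ArrayData interface, which uses size()
// rather than getNumElements() (hypothetical, for illustration).
interface FlinkStyleArrayData {
  int size();
}

// A reusable holder: numElements() already satisfies the Spark-style
// contract, and a Flink-facing adapter would delegate to size(),
// so an extra getNumElements() accessor adds nothing.
class ReusableArrayData extends SparkStyleArrayData {
  private int numElements;

  void setNumElements(int numElements) {
    this.numElements = numElements;
  }

  @Override
  public int numElements() {
    return numElements;
  }
}
```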
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]