pvary commented on pull request #2171:
URL: https://github.com/apache/iceberg/pull/2171#issuecomment-769674534


   @qphien: We have some internal deadlines at the end of this week, so we will 
come back to you in more detail next week.
   
   In the meantime, we discussed this briefly with @marton-bod.
   
   For the current solution I have the following concerns:
   - This issue is better solved by 
[TEZ-4248](https://issues.apache.org/jira/browse/TEZ-4248). Since it is not yet 
in any release, I understand why you are looking for a solution in Iceberg.
   - Creating an extra `Record` object for every row in a table can be costly. 
We want to avoid running any extra code on a codepath that runs for every row, 
and at a minimum we want to avoid creating objects on that codepath. I haven't 
checked, but we might want to push this logic down into `Record` generation.
   - Once we solve the vectorization problem, we can enable vectorization in 
the tests here: 
https://github.com/apache/iceberg/blob/19622dcfcb426485748fa017a6181e23df5732dc/mr/src/test/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandlerTestUtils.java#L91-L93
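
   To illustrate the per-row allocation concern, here is a minimal sketch. 
`MutableRow` and both methods are hypothetical stand-ins made up for this 
example; this is not Iceberg's `Record` API and not the code in this PR. It 
only contrasts allocating a wrapper for every row with reusing one container 
across rows, and counts the allocations on each path.

```java
import java.util.ArrayList;
import java.util.List;

class RowReuseSketch {
    // Hypothetical row container used only for this illustration.
    static final class MutableRow {
        private final Object[] values;

        MutableRow(int size) {
            this.values = new Object[size];
            allocations++; // count every container we create
        }

        void set(int pos, Object value) { values[pos] = value; }

        Object get(int pos) { return values[pos]; }
    }

    // Number of MutableRow instances created so far.
    static int allocations = 0;

    // Costly variant: one new container per row on the hot path.
    static long sumFirstColumnWithCopy(List<Object[]> rows) {
        long sum = 0;
        for (Object[] raw : rows) {
            MutableRow row = new MutableRow(raw.length);
            for (int i = 0; i < raw.length; i++) {
                row.set(i, raw[i]);
            }
            sum += (Integer) row.get(0);
        }
        return sum;
    }

    // Cheaper variant: a single container is allocated once and refilled per row.
    // Assumes every row has exactly `width` columns.
    static long sumFirstColumnWithReuse(List<Object[]> rows, int width) {
        MutableRow row = new MutableRow(width);
        long sum = 0;
        for (Object[] raw : rows) {
            for (int i = 0; i < width; i++) {
                row.set(i, raw[i]);
            }
            sum += (Integer) row.get(0);
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Object[]> rows = new ArrayList<>();
        rows.add(new Object[] {1, "a"});
        rows.add(new Object[] {2, "b"});
        rows.add(new Object[] {3, "c"});

        allocations = 0;
        long copySum = sumFirstColumnWithCopy(rows);
        System.out.println("copy:  sum=" + copySum + " allocations=" + allocations);

        allocations = 0;
        long reuseSum = sumFirstColumnWithReuse(rows, 2);
        System.out.println("reuse: sum=" + reuseSum + " allocations=" + allocations);
    }
}
```

   Reuse is only safe when the consumer does not hold on to the container after 
moving to the next row, so whether it applies here depends on how the rows are 
consumed downstream.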
   
   Thanks,
   Peter

