pvary commented on pull request #2171: URL: https://github.com/apache/iceberg/pull/2171#issuecomment-769674534
@qphien: We have some internal deadlines at the end of this week, so we will come back to you in more detail next week. In the meantime, we discussed this briefly with @marton-bod. I have the following concerns about the current solution:

- This issue is better solved by [TEZ-4248](https://issues.apache.org/jira/browse/TEZ-4248). Since it is not yet in any release, I understand why you are looking for a solution in Iceberg.
- Creating an extra `Record` object for every row in a table can be costly. We want to avoid running any extra code on a codepath that runs for every row, and at a minimum we want to avoid creating objects on that codepath. I haven't checked, but we might want to push this down into the `Record` generation.
- When we solve the vectorization problem, we can enable it in the tests here: https://github.com/apache/iceberg/blob/19622dcfcb426485748fa017a6181e23df5732dc/mr/src/test/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandlerTestUtils.java#L91-L93

Thanks,
Peter
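To illustrate the per-row allocation concern raised above, here is a minimal Java sketch. It is not code from the pull request and does not use Iceberg's `Record` API: `RowView`, `sumAllocating`, and `sumReusing` are hypothetical names made up for this example. It only contrasts allocating a wrapper object for every row with reusing one mutable wrapper across rows.

```java
import java.util.ArrayList;
import java.util.List;

public class RowWrapperSketch {
    // Hypothetical stand-in for a per-row wrapper; NOT Iceberg's Record interface.
    static final class RowView {
        private Object[] values;

        // Points this view at a new backing row and returns itself for chaining.
        RowView wrap(Object[] values) {
            this.values = values;
            return this;
        }

        Object get(int pos) {
            return values[pos];
        }
    }

    // Allocates one wrapper per row: the pattern the comment wants to avoid on a hot path.
    static long sumAllocating(List<Object[]> rows) {
        long sum = 0;
        for (Object[] row : rows) {
            RowView view = new RowView().wrap(row);
            sum += (Long) view.get(0);
        }
        return sum;
    }

    // Reuses a single wrapper for all rows, so the loop body allocates no wrapper objects.
    static long sumReusing(List<Object[]> rows) {
        RowView view = new RowView();
        long sum = 0;
        for (Object[] row : rows) {
            sum += (Long) view.wrap(row).get(0);
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Object[]> rows = new ArrayList<>();
        for (long i = 1; i <= 3; i++) {
            rows.add(new Object[] {i});
        }
        System.out.println(sumAllocating(rows));
        System.out.println(sumReusing(rows));
    }
}
```

Both methods return the same result; the difference is only in how many short-lived objects the loop creates. Whether reuse is safe in the actual reader depends on whether downstream code holds on to the wrapper between rows, which this sketch does not model.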
