Tamas Mate has posted comments on this change. ( http://gerrit.cloudera.org:8080/18514 )
Change subject: IMPALA-801, IMPALA-8011: Add INPUT__FILE__NAME virtual column for file name
......................................................................

Patch Set 7: Code-Review+1

Thanks for the update and for opening the Jira, Zoltan! The change LGTM!

Just wanted to discuss one more thing regarding estimations: I think the nodes above the scan nodes could underestimate the memory requirements for virtual columns. E.g., when joining two tables where each table has 10000 files and the scanner returns 1 row from each file. Currently the INPUT__FILE__NAME column is estimated to be 12B, but a path can be 100+B, which means that the join will have to deal with 2MB instead of the estimated 0.24MB.

--
To view, visit http://gerrit.cloudera.org:8080/18514
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I498591f1db08a91a5c846df59086d2291df4ff61
Gerrit-Change-Number: 18514
Gerrit-PatchSet: 7
Gerrit-Owner: Zoltan Borok-Nagy <[email protected]>
Gerrit-Reviewer: Gergely Fürnstáhl <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Tamas Mate <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Tue, 07 Jun 2022 14:08:26 +0000
Gerrit-HasComments: No
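
For reference, the figures in the review comment above can be reproduced with a short back-of-the-envelope calculation. This is only a sketch of the arithmetic, not Impala planner code: the constant and function names are hypothetical, 100 bytes is used as a lower bound for the "100+B" path length, and MB is taken as 10^6 bytes.

```python
# Back-of-the-envelope check of the numbers quoted in the review comment.
# All names here are illustrative; none of this is Impala planner code.

FILES_PER_TABLE = 10_000   # each joined table has 10000 files
TABLES = 2                 # two tables are joined
ROWS_PER_FILE = 1          # the scanner returns 1 row per file

ESTIMATED_BYTES_PER_VALUE = 12   # current estimate for INPUT__FILE__NAME
ACTUAL_BYTES_PER_VALUE = 100     # assumed lower bound for a real path ("100+B")


def column_bytes(bytes_per_value: int) -> int:
    """Total bytes of the file-name column across all rows reaching the join."""
    rows = FILES_PER_TABLE * TABLES * ROWS_PER_FILE
    return rows * bytes_per_value


estimated = column_bytes(ESTIMATED_BYTES_PER_VALUE)
actual = column_bytes(ACTUAL_BYTES_PER_VALUE)

print(f"estimated: {estimated / 1_000_000:.2f} MB")   # 0.24 MB
print(f"actual:    {actual / 1_000_000:.2f} MB")      # 2.00 MB
print(f"underestimate factor: {actual / estimated:.1f}x")
```

Under these assumptions the per-value estimate is off by roughly a factor of eight, and the gap grows linearly with the number of rows that carry the virtual column into the join.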
