Tamas Mate has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18514 )

Change subject: IMPALA-801, IMPALA-8011: Add INPUT__FILE__NAME virtual column 
for file name
......................................................................


Patch Set 7: Code-Review+1

Thanks for the update and opening the Jira, Zoltan! The change LGTM!

One more thing I wanted to discuss regarding estimates: I think the nodes 
above the scan nodes could underestimate the memory requirements for virtual 
columns.
E.g., consider joining two tables that have 10000 files each, where the 
scanner returns 1 row from each file. Currently the INPUT__FILE__NAME column 
is estimated at 12B per value, but a path can be 100+B. Across the 20000 
rows, this means the join will have to deal with 2MB or more instead of the 
estimated 0.24MB.
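
The arithmetic behind these figures can be sketched as follows. This is only 
a rough illustration: the function name is hypothetical, the 100-byte path 
length is an assumed lower bound, and none of this is Impala planner code.

```python
def join_input_bytes(tables, files_per_table, rows_per_file, bytes_per_value):
    """Total bytes of INPUT__FILE__NAME values reaching the join.

    Hypothetical helper for illustration; it only multiplies the row count
    by the per-value width and ignores any other per-row overhead.
    """
    rows = tables * files_per_table * rows_per_file
    return rows * bytes_per_value


# Two tables, 10000 files each, 1 row returned per file.
estimated = join_input_bytes(2, 10_000, 1, 12)    # current 12B estimate
actual = join_input_bytes(2, 10_000, 1, 100)      # assumed 100B path

print(f"estimated: {estimated / 1_000_000:.2f} MB")
print(f"actual:    {actual / 1_000_000:.2f} MB")
print(f"underestimated by a factor of {actual / estimated:.1f}")
```

Longer paths widen the gap further, since the estimate stays fixed at 12B 
while the real value grows with the path length.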


--
To view, visit http://gerrit.cloudera.org:8080/18514
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I498591f1db08a91a5c846df59086d2291df4ff61
Gerrit-Change-Number: 18514
Gerrit-PatchSet: 7
Gerrit-Owner: Zoltan Borok-Nagy <[email protected]>
Gerrit-Reviewer: Gergely Fürnstáhl <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Tamas Mate <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Tue, 07 Jun 2022 14:08:26 +0000
Gerrit-HasComments: No
