Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15370 )

Change subject: IMPALA-6636: Use async IO in ORC scanner
......................................................................


Patch Set 27: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/15370/25/be/src/exec/hdfs-orc-scanner.cc
File be/src/exec/hdfs-orc-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/15370/25/be/src/exec/hdfs-orc-scanner.cc@1375
PS25, Line 1375: stitute("HdfsOrc
> You are correct. Until now, we were reading the last 100KB, but didn't actu
Thanks for the explanation!

I would prefer to reduce the initial range size to 16KB (it is ok to move this 
to another patch).

It should be easy to do this by passing a size to
HdfsScanner::IssueFooterRanges instead of using a constant:
https://github.com/apache/impala/blob/57982efc21746f6994c11b623fc3cdd1dbbac8a2/be/src/exec/hdfs-scanner.cc#L832
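
A rough sketch of the idea, not the actual Impala code: FooterRange,
ComputeFooterRange and both constant names are hypothetical stand-ins, and
the real IssueFooterRanges signature is not reproduced here. The point is
only that the caller chooses the footer size, so the ORC scanner could ask
for 16KB without changing what other formats read.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a scan range; not Impala's actual type.
struct FooterRange {
  int64_t offset;  // Start of the range within the file.
  int64_t len;     // Number of bytes to read.
};

// Illustrative values: the 100KB read today and the 16KB proposed for ORC.
constexpr int64_t kDefaultFooterSize = 100 * 1024;
constexpr int64_t kOrcFooterSize = 16 * 1024;

// Returns the range covering the last 'footer_size' bytes of a file, or the
// whole file if it is shorter than that. The size is a parameter chosen by
// the caller rather than a single hard-coded constant.
FooterRange ComputeFooterRange(int64_t file_len, int64_t footer_size) {
  assert(file_len >= 0 && footer_size > 0);
  const int64_t len = std::min(footer_size, file_len);
  return FooterRange{file_len - len, len};
}
```

With this shape, the ORC path would pass kOrcFooterSize and the other
formats could keep passing kDefaultFooterSize.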

With the larger range we not only read data that is never used, but also
waste space in the data cache:
https://github.com/apache/impala/blob/master/be/src/runtime/io/data-cache.h#L73



--
To view, visit http://gerrit.cloudera.org:8080/15370
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074
Gerrit-Change-Number: 15370
Gerrit-PatchSet: 27
Gerrit-Owner: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Kurt Deschler <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Comment-Date: Thu, 03 Feb 2022 08:22:19 +0000
Gerrit-HasComments: Yes
