Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18327 )

Change subject: IMPALA-11123: Optimize count(star) for ORC scans
......................................................................


Patch Set 2:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/18327/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18327/2//COMMIT_MSG@18
PS2, Line 18: 'currentTransaction' column. This patch also drops 'parquet' from names
nit. in table's special schema.


http://gerrit.cloudera.org:8080/#/c/18327/2/be/src/exec/hdfs-orc-scanner.cc
File be/src/exec/hdfs-orc-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/18327/2/be/src/exec/hdfs-orc-scanner.cc@797
PS2, Line 797: int
Should this be uint64_t?


http://gerrit.cloudera.org:8080/#/c/18327/2/be/src/exec/hdfs-orc-scanner.cc@804
PS2, Line 804:  TupleRow* dst_row = row_batch->GetRow(row_batch->AddRow());
I wonder if we can compute a single tuple for all the stripes available from 
reader_ by adding all of their row count stats together. That way, we can avoid 
allocating multiple tuples.

If I read the new code correctly, the current logic produces one tuple per 
stripe.
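
A minimal sketch of the idea, not taken from the patch: StripeInfo and 
TotalRowCount are hypothetical stand-ins for whatever per-stripe metadata 
reader_ exposes, and the single total would then be written into one tuple.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for per-stripe metadata; not the ORC or Impala type.
struct StripeInfo {
  uint64_t num_rows;
};

// Adds up the row counts of all stripes so the caller can emit one tuple
// holding the total instead of one tuple per stripe.
uint64_t TotalRowCount(const std::vector<StripeInfo>& stripes) {
  uint64_t total = 0;
  for (const StripeInfo& stripe : stripes) total += stripe.num_rows;
  return total;
}
```

Using uint64_t for the running total also avoids narrowing the per-stripe 
counts to int.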


http://gerrit.cloudera.org:8080/#/c/18327/2/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/18327/2/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@355
PS2, Line 355: isAcidTable
nit. missing the trailing underscore suffix (isAcidTable_).


http://gerrit.cloudera.org:8080/#/c/18327/2/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@400
PS2, Line 400: Parquet
nit. remove?


http://gerrit.cloudera.org:8080/#/c/18327/2/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@406
PS2, Line 406: (!hasOrc(fileFormats) || isAcidTable)
nit. This implies the ACID table property is not checked for Parquet tables.



--
To view, visit http://gerrit.cloudera.org:8080/18327
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0fafa1182f97323aeb9ee39dd4e8ecd418fa6091
Gerrit-Change-Number: 18327
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Qifan Chen <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Comment-Date: Tue, 22 Mar 2022 13:51:38 +0000
Gerrit-HasComments: Yes
