Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19927 )

Change subject: IMPALA-11123: Reimplement ORC optimized count star
......................................................................


Patch Set 4:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/19927/3/be/src/exec/orc/hdfs-orc-scanner.cc
File be/src/exec/orc/hdfs-orc-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/19927/3/be/src/exec/orc/hdfs-orc-scanner.cc@409
PS3, Line 409:   if (scan_node_->optimize_count_star()) {
> it is possible for row_batches_need_validation_ to be true while scan_node_
In HdfsScanNode.java, I chose not to enable optimized count star for ACID
tables, so row_batches_need_validation_ should be false here.
This is the isFullAcidTable function in AcidUtils.java:

  public static boolean isFullAcidTable(Map<String, String> props) {
    return isTransactionalTable(props) && !isInsertOnlyTable(props);
  }

Should I disable optimized count star for all transactional tables, just to be
safe (that is, whenever isTransactionalTable(props) is true)?
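
To make the two options concrete, here is a minimal, self-contained sketch. It assumes the current gate is the full-ACID check, as the isFullAcidTable snippet suggests. The class name and the two allow... methods are hypothetical, and the bodies of isTransactionalTable and isInsertOnlyTable are simplified stand-ins keyed on the Hive table properties "transactional" and "transactional_properties"; this is not the actual code in AcidUtils.java or HdfsScanNode.java.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; not the code from AcidUtils.java or HdfsScanNode.java.
final class AcidGateSketch {
  // Hive table property keys/values. The helper bodies below are simplified.
  static final String TABLE_IS_TRANSACTIONAL = "transactional";
  static final String TABLE_TRANSACTIONAL_PROPERTIES = "transactional_properties";
  static final String INSERT_ONLY = "insert_only";

  static boolean isTransactionalTable(Map<String, String> props) {
    // equalsIgnoreCase(null) is false, so a missing key means "not transactional".
    return "true".equalsIgnoreCase(props.get(TABLE_IS_TRANSACTIONAL));
  }

  static boolean isInsertOnlyTable(Map<String, String> props) {
    return isTransactionalTable(props)
        && INSERT_ONLY.equalsIgnoreCase(props.get(TABLE_TRANSACTIONAL_PROPERTIES));
  }

  static boolean isFullAcidTable(Map<String, String> props) {
    return isTransactionalTable(props) && !isInsertOnlyTable(props);
  }

  // Hypothetical gate A: skip the optimization for full ACID tables only.
  static boolean allowCountStarSkippingFullAcid(Map<String, String> props) {
    return !isFullAcidTable(props);
  }

  // Hypothetical gate B: skip the optimization for every transactional table.
  static boolean allowCountStarSkippingAllTransactional(Map<String, String> props) {
    return !isTransactionalTable(props);
  }

  public static void main(String[] args) {
    Map<String, String> plain = new HashMap<>();

    Map<String, String> insertOnly = new HashMap<>();
    insertOnly.put(TABLE_IS_TRANSACTIONAL, "true");
    insertOnly.put(TABLE_TRANSACTIONAL_PROPERTIES, INSERT_ONLY);

    Map<String, String> fullAcid = new HashMap<>();
    fullAcid.put(TABLE_IS_TRANSACTIONAL, "true");

    // Print both gate decisions for each kind of table.
    System.out.println("plain:       A=" + allowCountStarSkippingFullAcid(plain)
        + " B=" + allowCountStarSkippingAllTransactional(plain));
    System.out.println("insert-only: A=" + allowCountStarSkippingFullAcid(insertOnly)
        + " B=" + allowCountStarSkippingAllTransactional(insertOnly));
    System.out.println("full ACID:   A=" + allowCountStarSkippingFullAcid(fullAcid)
        + " B=" + allowCountStarSkippingAllTransactional(fullAcid));
  }
}
```

Under these assumptions the two gates differ only for insert-only tables: gate A still allows the optimization there, while gate B does not.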


http://gerrit.cloudera.org:8080/#/c/19927/3/be/src/exec/orc/hdfs-orc-scanner.cc@811
PS3, Line 811:     // batches and check their validity. In that case 'currentTransaction' is the only
> this formatting is very unusual in Impala - can you move the else if to lin
Done


http://gerrit.cloudera.org:8080/#/c/19927/3/testdata/workloads/functional-query/queries/QueryTest/partition-key-scans.test
File testdata/workloads/functional-query/queries/QueryTest/partition-key-scans.test:

http://gerrit.cloudera.org:8080/#/c/19927/3/testdata/workloads/functional-query/queries/QueryTest/partition-key-scans.test@18
PS3, Line 18: ---- QUERY
> Is there a reason behind returning number of files in Parquet and 0 in ORC?
Fixed. The same assertion should now hold for both Parquet and ORC.


http://gerrit.cloudera.org:8080/#/c/19927/3/tests/util/test_file_parser.py
File tests/util/test_file_parser.py:

http://gerrit.cloudera.org:8080/#/c/19927/3/tests/util/test_file_parser.py@269
PS3, Line 269:         elif subsection_comment == 'ANY_OF':
> this block looks more complex than necessary
Done. Fixed a couple of flake8 errors as well.



--
To view, visit http://gerrit.cloudera.org:8080/19927
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5971c8f278e1dee44e2a8dd4d2f043d22ebf5d17
Gerrit-Change-Number: 19927
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: David Rorke <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Comment-Date: Thu, 22 Feb 2024 20:19:35 +0000
Gerrit-HasComments: Yes
