Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18327 )

Change subject: IMPALA-11123: Optimize count(star) for ORC scans
......................................................................


Patch Set 3:

(17 comments)

http://gerrit.cloudera.org:8080/#/c/18327/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18327/1//COMMIT_MSG@16
PS1, Line 16: CID table
> for full ACID tables
Done


http://gerrit.cloudera.org:8080/#/c/18327/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18327/2//COMMIT_MSG@18
PS2, Line 18: 'currentTransaction' column in table's special schema.
> nit. in table's special schema.
Done


http://gerrit.cloudera.org:8080/#/c/18327/1/be/src/exec/hdfs-orc-scanner.cc
File be/src/exec/hdfs-orc-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/18327/1/be/src/exec/hdfs-orc-scanner.cc@772
PS1, Line 772: Status 
HdfsOrcScanner::GetNextWithCountStarOptimization(RowBatch* row_batch) {
> This method is more than 100 lines now. Could you extract the count-star co
Done


http://gerrit.cloudera.org:8080/#/c/18327/1/be/src/exec/hdfs-orc-scanner.cc@786
PS1, Line 786:   TupleRow* dst_row = row_batch->GetRow(row_batch->AddRow());
             :   dst_row->SetTuple(0, dst_tuple);
> Each scanner thread processes a split which could be part of the file. Only
Patch set 3 serve count(star) just from footer scan range.


http://gerrit.cloudera.org:8080/#/c/18327/1/be/src/exec/hdfs-orc-scanner.cc@798
PS1, Line 798:   // selected field from the file (in
> Should I just drop the timer, or move it up to cover ResizeAndAllocateTuple
I decide to drop it for optimized count(star) code path.


http://gerrit.cloudera.org:8080/#/c/18327/1/be/src/exec/hdfs-orc-scanner.cc@818
PS1, Line 818:       DCHECK_LT(stripe_rows_read_, file_rows);
             :       int64_t rows_remaining = file_rows -
> This is a special case that only the scanner of the footer split will proce
Yes, this stay the same in patch set 3. We still serve it just from footer scan 
range.


http://gerrit.cloudera.org:8080/#/c/18327/2/be/src/exec/hdfs-orc-scanner.cc
File be/src/exec/hdfs-orc-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/18327/2/be/src/exec/hdfs-orc-scanner.cc@797
PS2, Line 797: che
> uint64_t?
Done


http://gerrit.cloudera.org:8080/#/c/18327/2/be/src/exec/hdfs-orc-scanner.cc@804
PS2, Line 804: / This is an optimized count(*) case.
> I wonder if we can compute one tuple only for all the stripes available fro
Patch set 3 serve count(star) just from footer scan range (1 row per file 
instead of per stripe).


http://gerrit.cloudera.org:8080/#/c/18327/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/18327/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@334
PS1, Line 334: isFullAcidT
> I think that it should contain "full" to make the semantics clearer.
Done


http://gerrit.cloudera.org:8080/#/c/18327/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@405
PS1, Line 405:     if (fileFormats.size() != 1) return false;
> Not sure if this worth the effort, but mixed ORC / Parquet tables could use
Technically, we should be able to. But in practice, we will not have such 
situation, right?
It will require different SerDe to handle both file format simlutaneously, and 
I have not see Impala deal with mixed file format before.
Please correct me if I'm wrong. Otherwise, I prefer to keep the old behavior.


http://gerrit.cloudera.org:8080/#/c/18327/2/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/18327/2/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@355
PS2, Line 355: isFullAcidT
> nit. missing suffix _.
Done


http://gerrit.cloudera.org:8080/#/c/18327/2/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@400
PS2, Line 400: count(*
> nit. remove?
Done


http://gerrit.cloudera.org:8080/#/c/18327/2/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@406
PS2, Line 406: se;
> nit. This implies the acid table property is not checked against parquet ta
Currently full acid table only available with ORC. But it looks like Parquet 
can support full acid schema too in the future:
https://issues.apache.org/jira/browse/HIVE-8123
I updated this logic in patch set 3.


http://gerrit.cloudera.org:8080/#/c/18327/2/testdata/workloads/functional-planner/queries/PlannerTest/orc-stats-agg.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/orc-stats-agg.test:

http://gerrit.cloudera.org:8080/#/c/18327/2/testdata/workloads/functional-planner/queries/PlannerTest/orc-stats-agg.test@4
PS2, Line 4: functional_orc_def.uncomp_src_alltypes
> This table is a managed table. I'm not sure why its table properties don't
This table follow schema from functional.alltypes, but without 
"transactional=true" property.
https://github.com/apache/impala/blob/c10e951/testdata/datasets/functional/functional_schema_template.sql#L2559
Thus, it is a managed ORC, but not full acid.

I think it is okay to use this table. This will align with 
parquet-stats-agg.test that use table derived from functional.alltypes as well 
and exercise similar set of test cases.


http://gerrit.cloudera.org:8080/#/c/18327/1/testdata/workloads/functional-query/queries/QueryTest/orc-stats-agg.test
File testdata/workloads/functional-query/queries/QueryTest/orc-stats-agg.test:

http://gerrit.cloudera.org:8080/#/c/18327/1/testdata/workloads/functional-query/queries/QueryTest/orc-stats-agg.test@100
PS1, Line 100: =====
> For Parquet we also have a multi block test:
Yes, this is exercised in TestOrc::test_misaligned_orc_stripes for ORC.
Added count(star) test there as well.


http://gerrit.cloudera.org:8080/#/c/18327/2/tests/query_test/test_aggregation.py
File tests/query_test/test_aggregation.py:

http://gerrit.cloudera.org:8080/#/c/18327/2/tests/query_test/test_aggregation.py@279
PS2, Line 279:     if (vector.get_value('table_format').file_format != 'text' or
             :           vector.get_value('table_format').compression_codec != 
'none'):
> Shouldn't we only run this on ORC? It seems the other tests for parquet and
This test constraint, along with parquet and kudu ones, is a bit misleading. It 
constraint to 'text/none' table format only, but actually exercise the 
orc/parquet/kudu testcase.
We should probably fix the constraint for these tests to better reflect what 
they're testing against.


http://gerrit.cloudera.org:8080/#/c/18327/2/tests/query_test/test_aggregation.py@283
PS2, Line 283:
> It seems we don't need 'unique_database' since all tables we used are in fu
Dropped the 'unique_database' for test_orc_count_star_optimization and 
test_kudu_count_star_optimization.
test_parquet_count_star_optimization still need 'unique_database' since it has 
custom test case that creates and insert new table:
https://github.com/apache/impala/blob/408b0aa/testdata/workloads/functional-query/queries/QueryTest/parquet-stats-agg.test#L119-L125



--
To view, visit http://gerrit.cloudera.org:8080/18327
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0fafa1182f97323aeb9ee39dd4e8ecd418fa6091
Gerrit-Change-Number: 18327
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Qifan Chen <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Comment-Date: Tue, 22 Mar 2022 17:56:19 +0000
Gerrit-HasComments: Yes

Reply via email to