Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/18327 )
Change subject: IMPALA-11123: Optimize count(star) for ORC scans ...................................................................... Patch Set 3: (17 comments) http://gerrit.cloudera.org:8080/#/c/18327/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/18327/1//COMMIT_MSG@16 PS1, Line 16: CID table > for full ACID tables Done http://gerrit.cloudera.org:8080/#/c/18327/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/18327/2//COMMIT_MSG@18 PS2, Line 18: 'currentTransaction' column in table's special schema. > nit. in table's special schema. Done http://gerrit.cloudera.org:8080/#/c/18327/1/be/src/exec/hdfs-orc-scanner.cc File be/src/exec/hdfs-orc-scanner.cc: http://gerrit.cloudera.org:8080/#/c/18327/1/be/src/exec/hdfs-orc-scanner.cc@772 PS1, Line 772: Status HdfsOrcScanner::GetNextWithCountStarOptimization(RowBatch* row_batch) { > This method is more than 100 lines now. Could you extract the count-star co Done http://gerrit.cloudera.org:8080/#/c/18327/1/be/src/exec/hdfs-orc-scanner.cc@786 PS1, Line 786: TupleRow* dst_row = row_batch->GetRow(row_batch->AddRow()); : dst_row->SetTuple(0, dst_tuple); > Each scanner thread processes a split which could be part of the file. Only Patch set 3 serve count(star) just from footer scan range. http://gerrit.cloudera.org:8080/#/c/18327/1/be/src/exec/hdfs-orc-scanner.cc@798 PS1, Line 798: // selected field from the file (in > Should I just drop the timer, or move it up to cover ResizeAndAllocateTuple I decide to drop it for optimized count(star) code path. http://gerrit.cloudera.org:8080/#/c/18327/1/be/src/exec/hdfs-orc-scanner.cc@818 PS1, Line 818: DCHECK_LT(stripe_rows_read_, file_rows); : int64_t rows_remaining = file_rows - > This is a special case that only the scanner of the footer split will proce Yes, this stay the same in patch set 3. We still serve it just from footer scan range. http://gerrit.cloudera.org:8080/#/c/18327/2/be/src/exec/hdfs-orc-scanner.cc File be/src/exec/hdfs-orc-scanner.cc: http://gerrit.cloudera.org:8080/#/c/18327/2/be/src/exec/hdfs-orc-scanner.cc@797 PS2, Line 797: che > uint64_t? Done http://gerrit.cloudera.org:8080/#/c/18327/2/be/src/exec/hdfs-orc-scanner.cc@804 PS2, Line 804: / This is an optimized count(*) case. > I wonder if we can compute one tuple only for all the stripes available fro Patch set 3 serve count(star) just from footer scan range (1 row per file instead of per stripe). http://gerrit.cloudera.org:8080/#/c/18327/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java: http://gerrit.cloudera.org:8080/#/c/18327/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@334 PS1, Line 334: isFullAcidT > I think that it should contain "full" to make the semantics clearer. Done http://gerrit.cloudera.org:8080/#/c/18327/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@405 PS1, Line 405: if (fileFormats.size() != 1) return false; > Not sure if this worth the effort, but mixed ORC / Parquet tables could use Technically, we should be able to. But in practice, we will not have such situation, right? It will require different SerDe to handle both file format simlutaneously, and I have not see Impala deal with mixed file format before. Please correct me if I'm wrong. Otherwise, I prefer to keep the old behavior. http://gerrit.cloudera.org:8080/#/c/18327/2/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java: http://gerrit.cloudera.org:8080/#/c/18327/2/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@355 PS2, Line 355: isFullAcidT > nit. missing suffix _. Done http://gerrit.cloudera.org:8080/#/c/18327/2/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@400 PS2, Line 400: count(* > nit. remove? Done http://gerrit.cloudera.org:8080/#/c/18327/2/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@406 PS2, Line 406: se; > nit. This implies the acid table property is not checked against parquet ta Currently full acid table only available with ORC. But it looks like Parquet can support full acid schema too in the future: https://issues.apache.org/jira/browse/HIVE-8123 I updated this logic in patch set 3. http://gerrit.cloudera.org:8080/#/c/18327/2/testdata/workloads/functional-planner/queries/PlannerTest/orc-stats-agg.test File testdata/workloads/functional-planner/queries/PlannerTest/orc-stats-agg.test: http://gerrit.cloudera.org:8080/#/c/18327/2/testdata/workloads/functional-planner/queries/PlannerTest/orc-stats-agg.test@4 PS2, Line 4: functional_orc_def.uncomp_src_alltypes > This table is a managed table. I'm not sure why its table properties don't This table follow schema from functional.alltypes, but without "transactional=true" property. https://github.com/apache/impala/blob/c10e951/testdata/datasets/functional/functional_schema_template.sql#L2559 Thus, it is a managed ORC, but not full acid. I think it is okay to use this table. This will align with parquet-stats-agg.test that use table derived from functional.alltypes as well and exercise similar set of test cases. http://gerrit.cloudera.org:8080/#/c/18327/1/testdata/workloads/functional-query/queries/QueryTest/orc-stats-agg.test File testdata/workloads/functional-query/queries/QueryTest/orc-stats-agg.test: http://gerrit.cloudera.org:8080/#/c/18327/1/testdata/workloads/functional-query/queries/QueryTest/orc-stats-agg.test@100 PS1, Line 100: ===== > For Parquet we also have a multi block test: Yes, this is exercised in TestOrc::test_misaligned_orc_stripes for ORC. Added count(star) test there as well. http://gerrit.cloudera.org:8080/#/c/18327/2/tests/query_test/test_aggregation.py File tests/query_test/test_aggregation.py: http://gerrit.cloudera.org:8080/#/c/18327/2/tests/query_test/test_aggregation.py@279 PS2, Line 279: if (vector.get_value('table_format').file_format != 'text' or : vector.get_value('table_format').compression_codec != 'none'): > Shouldn't we only run this on ORC? It seems the other tests for parquet and This test constraint, along with parquet and kudu ones, is a bit misleading. It constraint to 'text/none' table format only, but actually exercise the orc/parquet/kudu testcase. We should probably fix the constraint for these tests to better reflect what they're testing against. http://gerrit.cloudera.org:8080/#/c/18327/2/tests/query_test/test_aggregation.py@283 PS2, Line 283: > It seems we don't need 'unique_database' since all tables we used are in fu Dropped the 'unique_database' for test_orc_count_star_optimization and test_kudu_count_star_optimization. test_parquet_count_star_optimization still need 'unique_database' since it has custom test case that creates and insert new table: https://github.com/apache/impala/blob/408b0aa/testdata/workloads/functional-query/queries/QueryTest/parquet-stats-agg.test#L119-L125 -- To view, visit http://gerrit.cloudera.org:8080/18327 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0fafa1182f97323aeb9ee39dd4e8ecd418fa6091 Gerrit-Change-Number: 18327 Gerrit-PatchSet: 3 Gerrit-Owner: Riza Suminto <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Qifan Chen <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Reviewer: Riza Suminto <[email protected]> Gerrit-Comment-Date: Tue, 22 Mar 2022 17:56:19 +0000 Gerrit-HasComments: Yes
