Hello Abhishek Rawat, Csaba Ringhofer, Wenzhe Zhou, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/21966

to look at the new patch set (#3).

Change subject: IMPALA-13469: Deflake test_query_cpu_count_on_insert
......................................................................

IMPALA-13469: Deflake test_query_cpu_count_on_insert

A new test case from IMPALA-13445 reveals a pre-existing bug where
cost-based planning may raise expectedNumInputInstance above
inputFragment.getNumInstances(), which violates a Precondition.
All of the following held when the Precondition was hit:

1. The environment is either Erasure Coded HDFS or Ozone.
2. The source table has neither stats nor a numRows table property.
3. There is only one fragment consisting of a ScanNode in the plan tree
   before the addition of DML fragment.
4. Byte-based cardinality estimation logic kicks in.
5. Byte-based cardinality causes high scan cost, which leads to
   maxScanThread exceeding inputFragment.getNumInstances().
6. expectedNumInputInstance is assigned equal to maxScanThread.
7. Precondition expectedNumInputInstance < inputFragment.getNumInstances()
   is violated.
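
The chain from byte-based estimation to the violated Precondition can be
modeled roughly as below. This is only an illustrative sketch: every
function name, the row width, and the rows-per-thread constant are
assumptions made up for the example, not the code in
DistributedPlanner.java, and the clamp shown is one possible guard rather
than necessarily what this patch does.

```python
def estimate_rows_from_bytes(total_bytes, assumed_row_size_bytes):
    """Fallback row estimate for a table without stats (hypothetical model)."""
    return max(1, total_bytes // assumed_row_size_bytes)


def max_scan_threads(estimated_rows, rows_per_thread):
    """Hypothetical cost model: one scan thread per rows_per_thread rows."""
    return max(1, -(-estimated_rows // rows_per_thread))  # ceiling division


def expected_input_instances(scan_threads, input_num_instances, clamp):
    """Without the clamp, the expectation can exceed the real instance count."""
    if clamp:
        return min(scan_threads, input_num_instances)
    return scan_threads


def precondition_holds(expected, input_num_instances):
    """Treats 'greater than the fragment's instance count' as the violation."""
    return expected <= input_num_instances


# A 10 GiB table with no stats, scanned by a fragment with 3 instances.
rows = estimate_rows_from_bytes(10 * 1024**3, 8)
threads = max_scan_threads(rows, 1_000_000)
unclamped = expected_input_instances(threads, 3, clamp=False)
clamped = expected_input_instances(threads, 3, clamp=True)
print(precondition_holds(unclamped, 3), precondition_holds(clamped, 3))
```

In this model the inflated row estimate alone is enough to push the
unclamped value past the fragment's instance count.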

This scenario triggers a special condition that attempts to lower
expectedNumInputInstance. But instead of lowering
expectedNumInputInstance, the special logic increases it due to higher
byte-based cardinality estimation.

There is also a new bug where DistributedPlanner.java mistakenly passes
root.getInputCardinality() instead of root.getCardinality().
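
To illustrate why the two getters are not interchangeable, here is a toy
model. It assumes that input cardinality means rows read by a node and
cardinality means rows the node produces after filtering; the class, its
fields, and the selectivity value are hypothetical and are not Impala's
PlanNode.

```python
from dataclasses import dataclass


@dataclass
class ScanNodeModel:
    """Toy stand-in for a scan node; not Impala's actual PlanNode class."""
    input_cardinality: int  # rows read from storage (assumed meaning)
    selectivity: float      # fraction of rows surviving predicates (assumed)

    def get_input_cardinality(self):
        return self.input_cardinality

    def get_cardinality(self):
        # Output rows after predicates are applied, at least 1.
        return max(1, round(self.input_cardinality * self.selectivity))


root = ScanNodeModel(input_cardinality=1_000_000, selectivity=0.01)
# Passing the wrong getter downstream overstates the row count by 100x here.
print(root.get_input_cardinality(), root.get_cardinality())
```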

This patch fixes both issues and does minor refactoring to rename
variables to camelCase. It also relaxes validation of the last test
case of test_query_cpu_count_on_insert so that it passes in Erasure
Coded HDFS and Ozone setups.

Testing:
- Made several assertions in test_executor_groups.py more verbose.
- Passed test_executor_groups.py in Erasure Coded HDFS and Ozone setups.
- Added new Planner tests with unknown cardinality estimation.
- Passed core tests in regular setup.
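
As a rough idea of what a more verbose assertion can look like, the
sketch below attaches the observed values to the failure message. The
helper name and the counter names are placeholders chosen for this
example, not the actual code in test_executor_groups.py.

```python
def assert_counter(counters, name, expected):
    """Assert a counter value and report full context on failure.

    `counters` is a plain dict standing in for values parsed from a query
    profile; this helper is hypothetical.
    """
    actual = counters.get(name)
    assert actual == expected, (
        "Counter '{0}': expected {1}, got {2}. All counters: {3}".format(
            name, expected, actual, sorted(counters.items())))


counters = {"CpuAsk": 9, "EffectiveParallelism": 9}
assert_counter(counters, "CpuAsk", 9)  # passes silently
```

On a mismatch, the message shows the counter name, both values, and the
full set of counters, which makes failures from a remote test run easier
to diagnose without rerunning.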

Change-Id: I834eb6bf896752521e733cd6b77a03f746e6a447
---
M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-ddl-iceberg.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-ddl-parquet.test
M tests/custom_cluster/test_executor_groups.py
5 files changed, 161 insertions(+), 43 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/21966/3
--
To view, visit http://gerrit.cloudera.org:8080/21966
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I834eb6bf896752521e733cd6b77a03f746e6a447
Gerrit-Change-Number: 21966
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto <[email protected]>
Gerrit-Reviewer: Abhishek Rawat <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Reviewer: Wenzhe Zhou <[email protected]>
