Riza Suminto has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/20922


Change subject: IMPALA-12726: Simulate large scale query in 
TpcdsCpuCostPlannerTest
......................................................................

IMPALA-12726: Simulate large scale query in TpcdsCpuCostPlannerTest

Querying against large scale database is a good way to test Impala.
However, it is impractical to do in single node development machine.

Frontend testing does not actually run the test query in backend
executor and can benefit from simulated large scale test cases. This
patch attempt to do it by instrumenting the CatalogD metadata loading
code to multiply partitions numRows, tables numRows, numNull, numTrues,
and numFalses to 1000x in TpcdsCpuCostPlannerTest. The scaling factor is
supplied through RuntimeEnv. Code that populates stats metadata (caller
of FeCatalogUtils.getRowCount() and FeCatalogUtils.injectColumnStats())
is instrumented to check against this scaling factor on whether to
multiply the stats for a particular table or not. Tables that is scaled
up must also be invalidated so that they will be reloaded with new
scaled stats.

Total byte sizes are not scaled up in this patch because it does not
impact query plan unless stats extrapolation is being used.

Testing:
- Scale the fact tables of tpcds_partitioned_parquet_snap in
  TpcdsCpuCostPlannerTest to 1000x to simulate 1TB TPC-DS. Number of
  executor is raised from 3 to 10, and REPLICA_PREFERENCE is set to
  REMOTE to ignore data locality.
- Compare with the afternative methods where instrumentation is done
  during stats collection (COMPUTE STATS) and confirm that the resulting
  query plans are the same with this patch.
- Pass FE tests.

Change-Id: Iaffddd70c2da8376ca6c40f65606bbac46c34de7
---
M fe/src/main/java/org/apache/impala/catalog/Column.java
M fe/src/main/java/org/apache/impala/catalog/ColumnStats.java
M fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java
M fe/src/main/java/org/apache/impala/common/RuntimeEnv.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
M fe/src/test/java/org/apache/impala/planner/TpcdsCpuCostPlannerTest.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q01.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q02.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q03.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q04.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q05.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q06.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q07.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q08.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q09.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q10a.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q11.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q12.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q13.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q14a.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q14b.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q15.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q16.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q17.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q18.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q19.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q20.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q21.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q22.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q23a.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q23b.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q24a.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q24b.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q25.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q26.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q27.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q28.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q29.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q30.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q31.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q32.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q33.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q34.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q35a.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q36.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q37.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q38.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q39a.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q39b.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q40.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q41.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q42.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q43.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q44.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q45.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q46.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q47.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q48.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q49.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q50.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q51.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q52.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q53.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q54.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q55.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q56.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q57.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q58.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q59.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q60.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q61.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q62.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q63.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q64.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q65.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q66.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q67.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q68.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q69.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q70.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q71.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q72.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q73.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q74.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q75.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q76.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q77.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q78.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q79.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q80.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q81.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q82.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q83.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q84.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q85.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q86.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q87.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q88.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q89.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q90.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q91.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q92.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q93.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q94.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q95.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q96.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q97.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q98.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q99.test
116 files changed, 38,206 insertions(+), 39,006 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/22/20922/1
--
To view, visit http://gerrit.cloudera.org:8080/20922
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Iaffddd70c2da8376ca6c40f65606bbac46c34de7
Gerrit-Change-Number: 20922
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto <[email protected]>

Reply via email to