Aman Sinha has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/16723


Change subject: IMPALA-10314: Optimize planning time for simple limits
......................................................................

IMPALA-10314: Optimize planning time for simple limits

This patch optimizes the planning time for simple limit
queries by only considering a minimal set of partitions
whose file descriptors add up to N (the specified limit).
Each file is conservatively estimated to contain 1 row.

This reduces the number of partitions processed by
HdfsScanNode.computeScanRangeLocations(). Further, within
each partition, we only consider the number of files that
brings the total to N.

This is an opt-in optimization. A new planner option
OPTIMIZE_SIMPLE_LIMIT enables this optimization.

Testing:
 - Added new planner tests that enable this optimization
 - Ran PlannerTest
 - Ran query_test.py tests by enabling the optimize_simple_limit
 - TODO: add e2e tests (I haven't added them yet since the
   results are non-determinstic. Only simple count(*) queries
   on top of subqueries with limit could probably be added)

Change-Id: I9d6a79263bc092e0f3e9a1d72da5618f3cc35574
---
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A 
testdata/workloads/functional-planner/queries/PlannerTest/optimize-simple-limit.test
10 files changed, 302 insertions(+), 5 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/16723/1
--
To view, visit http://gerrit.cloudera.org:8080/16723
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I9d6a79263bc092e0f3e9a1d72da5618f3cc35574
Gerrit-Change-Number: 16723
Gerrit-PatchSet: 1
Gerrit-Owner: Aman Sinha <amsi...@cloudera.com>

Reply via email to