Impala Public Jenkins has submitted this change and it was merged.
( http://gerrit.cloudera.org:8080/16204 )

Change subject: IMPALA-8125: Add query option to limit number of hdfs writer instances
......................................................................

IMPALA-8125: Add query option to limit number of hdfs writer instances

This patch adds a new query option MAX_FS_WRITERS that limits the
number of HDFS writer instances.

Highlights:
- Depending on the plan, it either restricts the number of instances
  of the root fragment or adds an exchange and then limits the number
  of instances of the fragment that reads from it.
- Assigns instances evenly across available backends.
- The "no-shuffle" query hint is ignored when this query option is set.
- Plan behavior changes only when this query option is set.
- The one exception to the previous point is that the optimization
  logic that decides whether to add an exchange now looks at the
  number of instances instead of the number of nodes.
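
As a rough illustration of the "assigns instances evenly" point above,
here is a minimal sketch. It is not the logic in scheduler.cc; the
function name spread_instances and the rule that the first backends
absorb any remainder are assumptions made for this example.

```python
def spread_instances(backends, num_instances):
    """Hypothetical helper: split num_instances across backends so that
    no backend holds more than one instance above any other."""
    if not backends:
        raise ValueError("need at least one backend")
    if num_instances < 0:
        raise ValueError("num_instances must be non-negative")
    # Every backend gets `base` instances; the first `extra` backends
    # get one more (assumed tie-breaking rule, for illustration only).
    base, extra = divmod(num_instances, len(backends))
    return {b: base + (1 if i < extra else 0) for i, b in enumerate(backends)}


if __name__ == "__main__":
    # Two writer instances over three backends.
    print(spread_instances(["host1", "host2", "host3"], 2))
```

With more instances than backends, e.g. 5 instances over 2 backends,
the same helper yields counts that differ by at most one.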

Limitation:
A mismatch in cluster state between query planning and scheduling can
result in more or fewer fragment instances being scheduled than
expected. E.g., if max_fs_writers is 2 and the planner sees only 2
executors, then it might not add an exchange between a scan node and
the table sink, but if there are 3 nodes during scheduling, then that
scan+tablesink fragment will be scheduled on 3 backends.
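
The example above can be traced with a toy model. This is a sketch
under stated assumptions, not Impala's planner or scheduler code: both
function names are hypothetical, each executor is assumed to run one
instance, and the planner is assumed to add an exchange only when the
executors it sees outnumber the limit.

```python
def planner_adds_exchange(executors_seen_by_planner, max_fs_writers):
    """Toy planner decision (assumption): add an exchange only when the
    number of instances the planner expects exceeds the writer limit."""
    return executors_seen_by_planner > max_fs_writers


def scheduled_writer_instances(has_exchange, executors_at_scheduling,
                               max_fs_writers):
    """Toy scheduler (assumption): the limit can only be enforced on a
    fragment that sits above an exchange."""
    if has_exchange:
        return min(max_fs_writers, executors_at_scheduling)
    # Without an exchange the sink runs wherever the scan runs.
    return executors_at_scheduling


if __name__ == "__main__":
    limit = 2
    # Planner sees 2 executors, so it skips the exchange.
    exchange = planner_adds_exchange(2, limit)
    # A third executor is present by the time the query is scheduled.
    writers = scheduled_writer_instances(exchange, 3, limit)
    print(exchange, writers)
```

In this model the limit of 2 is exceeded only because the exchange
decision was made against a stale view of the cluster.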

Testing:
- Added planner tests to cover all cases where this enforcement kicks
  in and to highlight the behavior.
- Added e2e tests to confirm that the scheduler enforces the limit
  and distributes the instances evenly across backends for different
  plan shapes.

Change-Id: I17c8e61b9a32d908eec82c83618ff9caa41078a5
Reviewed-on: http://gerrit.cloudera.org:8080/16204
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
---
M be/src/scheduling/scheduler.cc
M be/src/scheduling/scheduler.h
M be/src/service/query-options-test.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java
M fe/src/main/java/org/apache/impala/planner/HdfsTableSink.java
M fe/src/main/java/org/apache/impala/planner/PlanFragment.java
M fe/src/main/java/org/apache/impala/planner/TableSink.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/insert-hdfs-writer-limit.test
M tests/query_test/test_insert.py
16 files changed, 903 insertions(+), 34 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/16204
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I17c8e61b9a32d908eec82c83618ff9caa41078a5
Gerrit-Change-Number: 16204
Gerrit-PatchSet: 10
Gerrit-Owner: Bikramjeet Vig <[email protected]>
Gerrit-Reviewer: Aman Sinha <[email protected]>
Gerrit-Reviewer: Bikramjeet Vig <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
