Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16204 )
Change subject: IMPALA-8125: Add query option to limit number of hdfs writer instances ...................................................................... IMPALA-8125: Add query option to limit number of hdfs writer instances This patch adds a new query option MAX_FS_WRITERS that limits the number of HDFS writer instances. Highlights: - Depending on the plan, it either restricts the num of instances of the root fragment or adds an exchange and then limits the num of instances of that. - Assigns instances evenly across available backends. - "no-shuffle" query hint is ignored when using query option. - Change in behavior of plans is only when this query option is used. - The only exception to the previous point is that the optimization logic that decides to add an exchange now looks at the num of instances instead of the number of nodes. Limitation: A mismatch of cluster state during query planning and scheduling can result in more or less fragment instances to be scheduled than expected. Eg. If max_fs_writers in 2 and the planner sees only 2 executors then it might not add an exchange between a scan node and the table sink, but during scheduling if there are 3 nodes then that scan+tablesink instance will be scheduled on 3 backends. Testing: - Added planner tests to cover all cases where this enforcement kicks in and to highlight the behavior. - Added e2e tests to confirm that the scheduler is enforcing the limit and distributing the instance evenly across backends for different plan shapes. Change-Id: I17c8e61b9a32d908eec82c83618ff9caa41078a5 Reviewed-on: http://gerrit.cloudera.org:8080/16204 Reviewed-by: Impala Public Jenkins <[email protected]> Tested-by: Impala Public Jenkins <[email protected]> --- M be/src/scheduling/scheduler.cc M be/src/scheduling/scheduler.h M be/src/service/query-options-test.cc M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java M fe/src/main/java/org/apache/impala/planner/HdfsTableSink.java M fe/src/main/java/org/apache/impala/planner/PlanFragment.java M fe/src/main/java/org/apache/impala/planner/TableSink.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java A testdata/workloads/functional-planner/queries/PlannerTest/insert-hdfs-writer-limit.test M tests/query_test/test_insert.py 16 files changed, 903 insertions(+), 34 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/16204 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I17c8e61b9a32d908eec82c83618ff9caa41078a5 Gerrit-Change-Number: 16204 Gerrit-PatchSet: 10 Gerrit-Owner: Bikramjeet Vig <[email protected]> Gerrit-Reviewer: Aman Sinha <[email protected]> Gerrit-Reviewer: Bikramjeet Vig <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]>
