Hello Michael Ho, Lars Volker, Philip Zeyliger, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/12037

to look at the new patch set (#2).

Change subject: IMPALA-7928: Consistent remote read scheduling
......................................................................

IMPALA-7928: Consistent remote read scheduling

Currently, remote reads for a particular file are not scheduled to a
consistent set of nodes. This reduces the efficiency of the HDFS file
handle cache (and any other cache that operates at the file level).

This change schedules remote reads consistently by generating a set of
simulated remote replicas for each file. The simulated remote replicas
are generated by hashing the filename multiple times and finding the
closest nodes in a hash ring. This is a consistent hash, designed to
limit the number of files remapped when cluster nodes come and go. The
number of simulated remote replicas is controlled by the query option
'num_simulated_remote_replicas', which defaults to 3.

Once the simulated remote replicas are chosen, a specific replica is
picked with the same algorithm used for local replicas: pick the node
with the minimum number of assigned bytes and use
'schedule_random_replica' to determine how to break ties.

The normal algorithms remain in place for local files, Kudu, and HBase.
If 'num_simulated_remote_replicas' is set to 0, simulated remote
replicas are disabled and the previous remote scheduling algorithm is
used.

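For readers who want a concrete picture of the scheme described above, here is a minimal C++ sketch. It is not the code in scheduler.cc: the names HashRing, SimulatedReplicas and PickReplica, the use of std::hash, and the 16 ring points per node are all assumptions made for illustration. Tie-breaking is also simplified to "first candidate wins" rather than modelling 'schedule_random_replica'.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <functional>
#include <iterator>
#include <map>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical hash ring: every node owns several points on the ring so
// that load spreads out and removing a node only remaps its own keys.
class HashRing {
 public:
  explicit HashRing(int points_per_node = 16)
    : points_per_node_(points_per_node) {}

  void AddNode(const std::string& node) {
    for (int i = 0; i < points_per_node_; ++i) ring_[Hash(node, i)] = node;
  }

  void RemoveNode(const std::string& node) {
    for (auto it = ring_.begin(); it != ring_.end();) {
      it = (it->second == node) ? ring_.erase(it) : std::next(it);
    }
  }

  // Hash the filename 'num_replicas' times. Each hash selects the first
  // ring point at or after it (wrapping around). Duplicate nodes are
  // collapsed, so fewer than 'num_replicas' nodes may be returned.
  std::vector<std::string> SimulatedReplicas(const std::string& filename,
                                             int num_replicas) const {
    std::vector<std::string> replicas;
    if (ring_.empty()) return replicas;
    for (int i = 0; i < num_replicas; ++i) {
      auto it = ring_.lower_bound(Hash(filename, i));
      if (it == ring_.end()) it = ring_.begin();
      if (std::find(replicas.begin(), replicas.end(), it->second) ==
          replicas.end()) {
        replicas.push_back(it->second);
      }
    }
    return replicas;
  }

 private:
  // Assumption: std::hash stands in for whatever hash the patch uses.
  static uint64_t Hash(const std::string& key, int seed) {
    return std::hash<std::string>{}(key + "#" + std::to_string(seed));
  }

  int points_per_node_;
  std::map<uint64_t, std::string> ring_;
};

// Pick the candidate with the fewest assigned bytes. Nodes with no entry
// count as 0 bytes. Ties go to the earliest candidate (a simplification).
std::string PickReplica(
    const std::vector<std::string>& candidates,
    const std::unordered_map<std::string, int64_t>& assigned_bytes) {
  assert(!candidates.empty());
  auto bytes = [&](const std::string& node) {
    auto it = assigned_bytes.find(node);
    return it == assigned_bytes.end() ? int64_t{0} : it->second;
  };
  return *std::min_element(
      candidates.begin(), candidates.end(),
      [&](const std::string& a, const std::string& b) {
        return bytes(a) < bytes(b);
      });
}
```

In this sketch, a caller would add every executor to the ring, call SimulatedReplicas(filename, 3) for a remote scan range, and pass the result to PickReplica together with the running per-node byte counts. Removing a node that is not among a file's simulated replicas leaves that file's replica set unchanged, which is the property that keeps file-level caches warm as the cluster changes.
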
Change-Id: Icbf74088a8bd8c285ab7285ea3a01acd1bb53a45
---
M be/src/experiments/CMakeLists.txt
A be/src/experiments/hash-ring-test.cc
M be/src/scheduling/scheduler-test-util.h
M be/src/scheduling/scheduler-test.cc
M be/src/scheduling/scheduler.cc
M be/src/scheduling/scheduler.h
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
10 files changed, 332 insertions(+), 8 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/12037/2
--
To view, visit http://gerrit.cloudera.org:8080/12037
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Icbf74088a8bd8c285ab7285ea3a01acd1bb53a45
Gerrit-Change-Number: 12037
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Lars Volker <l...@cloudera.com>
Gerrit-Reviewer: Michael Ho <k...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <phi...@cloudera.com>