[Impala-ASF-CR] IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8414 ) Change subject: IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation .. Patch Set 25: Verified+1 Verified with https://jenkins.impala.io/job/gerrit-verify-dryrun/1991/ -- To view, visit http://gerrit.cloudera.org:8080/8414 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1 Gerrit-Change-Number: 8414 Gerrit-PatchSet: 25 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 23 Feb 2018 04:17:36 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation
Tim Armstrong has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/8414 ) Change subject: IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation .. IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation In preparation for switching the I/O mgr to the buffer pool, this removes and cleans up a lot of code so that the switchover patch starts from a cleaner slate. * Remove the free buffer cache (which will be replaced by buffer pool's own caching). * Make memory limit exceeded error checking synchronous (in anticipation of having to propagate buffer pool errors synchronously). * Simplify error propagation - remove the (ineffectual) code that enqueued BufferDescriptors containing error statuses. * Document locking scheme better in a few places, make it part of the function signature when it seemed reasonable. * Move ReturnBuffer() to ScanRange, because it is intrinsically connected with the lifecycle of a scan range. * Separate external ReturnBuffer() and internal CleanUpBuffer() interfaces - previously callers of ReturnBuffer() were fudging the num_buffers_in_reader accounting to make the external interface work. * Eliminate redundant state in ScanRange: 'eosr_returned_' and 'is_cancelled_'. * Clarify the logic around calling Close() for the last BufferDescriptor. -> There appeared to be an implicit assumption that buffers would be freed in the order they were returned from the scan range, so that the "eos" buffer was returned last. Instead just count the number of outstanding buffers to detect the last one. -> Touching the is_cancelled_ field without holding a lock was hard to reason about - violated locking rules and it was unclear that it was race-free. * Remove DiskIoMgr::Read() to simplify the interface. It is trivial to inline at the callsites. This will probably regress performance somewhat because of the cache removal, so my plan is to merge it around the same time as switching the I/O mgr to allocate from the buffer pool. I'm keeping the patches separate to make reviewing easier. Testing: * Ran exhaustive tests * Ran the disk-io-mgr-stress-test overnight Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1 Reviewed-on: http://gerrit.cloudera.org:8080/8414 Reviewed-by: Tim ArmstrongTested-by: Tim Armstrong --- M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node.cc M be/src/exec/scanner-context.cc M be/src/runtime/exec-env.cc M be/src/runtime/io/disk-io-mgr-stress.cc M be/src/runtime/io/disk-io-mgr-test.cc M be/src/runtime/io/disk-io-mgr.cc M be/src/runtime/io/disk-io-mgr.h M be/src/runtime/io/request-context.cc M be/src/runtime/io/request-context.h M be/src/runtime/io/request-ranges.h M be/src/runtime/io/scan-range.cc M be/src/runtime/mem-tracker.h M be/src/runtime/test-env.cc M be/src/runtime/tmp-file-mgr-test.cc M be/src/runtime/tmp-file-mgr.cc M be/src/util/impalad-metrics.cc M be/src/util/impalad-metrics.h 19 files changed, 577 insertions(+), 909 deletions(-) Approvals: Tim Armstrong: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/8414 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1 Gerrit-Change-Number: 8414 Gerrit-PatchSet: 26 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8414 ) Change subject: IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation .. Patch Set 25: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/8414 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1 Gerrit-Change-Number: 8414 Gerrit-PatchSet: 25 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 22 Feb 2018 23:06:53 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8414 ) Change subject: IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation .. Patch Set 24: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/8414 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1 Gerrit-Change-Number: 8414 Gerrit-PatchSet: 24 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 22 Feb 2018 21:40:52 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8414 ) Change subject: IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation .. Patch Set 23: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/8414 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1 Gerrit-Change-Number: 8414 Gerrit-PatchSet: 23 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 19 Feb 2018 18:17:56 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation
Hello Tianyi Wang, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8414 to look at the new patch set (#22). Change subject: IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation .. IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation In preparation for switching the I/O mgr to the buffer pool, this removes and cleans up a lot of code so that the switchover patch starts from a cleaner slate. * Remove the free buffer cache (which will be replaced by buffer pool's own caching). * Make memory limit exceeded error checking synchronous (in anticipation of having to propagate buffer pool errors synchronously). * Simplify error propagation - remove the (ineffectual) code that enqueued BufferDescriptors containing error statuses. * Document locking scheme better in a few places, make it part of the function signature when it seemed reasonable. * Move ReturnBuffer() to ScanRange, because it is intrinsically connected with the lifecycle of a scan range. * Separate external ReturnBuffer() and internal CleanUpBuffer() interfaces - previously callers of ReturnBuffer() were fudging the num_buffers_in_reader accounting to make the external interface work. * Eliminate redundant state in ScanRange: 'eosr_returned_' and 'is_cancelled_'. * Clarify the logic around calling Close() for the last BufferDescriptor. -> There appeared to be an implicit assumption that buffers would be freed in the order they were returned from the scan range, so that the "eos" buffer was returned last. Instead just count the number of outstanding buffers to detect the last one. -> Touching the is_cancelled_ field without holding a lock was hard to reason about - violated locking rules and it was unclear that it was race-free. * Remove DiskIoMgr::Read() to simplify the interface. It is trivial to inline at the callsites. This will probably regress performance somewhat because of the cache removal, so my plan is to merge it around the same time as switching the I/O mgr to allocate from the buffer pool. I'm keeping the patches separate to make reviewing easier. Testing: * Ran exhaustive tests * Ran the disk-io-mgr-stress-test overnight Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1 --- M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node.cc M be/src/exec/scanner-context.cc M be/src/runtime/exec-env.cc M be/src/runtime/io/disk-io-mgr-stress.cc M be/src/runtime/io/disk-io-mgr-test.cc M be/src/runtime/io/disk-io-mgr.cc M be/src/runtime/io/disk-io-mgr.h M be/src/runtime/io/request-context.cc M be/src/runtime/io/request-context.h M be/src/runtime/io/request-ranges.h M be/src/runtime/io/scan-range.cc M be/src/runtime/mem-tracker.h M be/src/runtime/test-env.cc M be/src/runtime/tmp-file-mgr-test.cc M be/src/runtime/tmp-file-mgr.cc M be/src/util/impalad-metrics.cc M be/src/util/impalad-metrics.h 19 files changed, 577 insertions(+), 909 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/8414/22 -- To view, visit http://gerrit.cloudera.org:8080/8414 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1 Gerrit-Change-Number: 8414 Gerrit-PatchSet: 22 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8414 ) Change subject: IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation .. Patch Set 20: Code-Review+2 carry -- To view, visit http://gerrit.cloudera.org:8080/8414 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1 Gerrit-Change-Number: 8414 Gerrit-PatchSet: 20 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 23 Jan 2018 00:09:24 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation
Hello Tianyi Wang, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8414 to look at the new patch set (#20). Change subject: IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation .. IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation In preparation for switching the I/O mgr to the buffer pool, this removes and cleans up a lot of code so that the switchover patch starts from a cleaner slate. * Remove the free buffer cache (which will be replaced by buffer pool's own caching). * Make memory limit exceeded error checking synchronous (in anticipation of having to propagate buffer pool errors synchronously). * Simplify error propagation - remove the (ineffectual) code that enqueued BufferDescriptors containing error statuses. * Document locking scheme better in a few places, make it part of the function signature when it seemed reasonable. * Move ReturnBuffer() to ScanRange, because it is intrinsically connected with the lifecycle of a scan range. * Separate external ReturnBuffer() and internal CleanUpBuffer() interfaces - previously callers of ReturnBuffer() were fudging the num_buffers_in_reader accounting to make the external interface work. * Eliminate redundant state in ScanRange: 'eosr_returned_' and 'is_cancelled_'. * Clarify the logic around calling Close() for the last BufferDescriptor. -> There appeared to be an implicit assumption that buffers would be freed in the order they were returned from the scan range, so that the "eos" buffer was returned last. Instead just count the number of outstanding buffers to detect the last one. -> Touching the is_cancelled_ field without holding a lock was hard to reason about - violated locking rules and it was unclear that it was race-free. * Remove DiskIoMgr::Read() to simplify interface. It is trivial to inline at the callsites. This will probably regress performance somewhat because of the cache removal, so my plan is to merge it around the same time as switching the I/O mgr to allocate from the buffer pool. I'm keeping the patches separate to make reviewing easier. Testing: * Ran exhaustive tests * Ran the disk-io-mgr-stress-test overnight Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1 --- M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node.cc M be/src/exec/scanner-context.cc M be/src/runtime/exec-env.cc M be/src/runtime/io/disk-io-mgr-stress.cc M be/src/runtime/io/disk-io-mgr-test.cc M be/src/runtime/io/disk-io-mgr.cc M be/src/runtime/io/disk-io-mgr.h M be/src/runtime/io/request-context.cc M be/src/runtime/io/request-context.h M be/src/runtime/io/request-ranges.h M be/src/runtime/io/scan-range.cc M be/src/runtime/mem-tracker.h M be/src/runtime/test-env.cc M be/src/runtime/tmp-file-mgr-test.cc M be/src/runtime/tmp-file-mgr.cc M be/src/util/impalad-metrics.cc M be/src/util/impalad-metrics.h 19 files changed, 577 insertions(+), 909 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/8414/20 -- To view, visit http://gerrit.cloudera.org:8080/8414 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1 Gerrit-Change-Number: 8414 Gerrit-PatchSet: 20 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8414 ) Change subject: IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation .. Patch Set 18: Code-Review+2 rebase -- To view, visit http://gerrit.cloudera.org:8080/8414 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1 Gerrit-Change-Number: 8414 Gerrit-PatchSet: 18 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 16 Jan 2018 23:05:30 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8414 ) Change subject: IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation .. Patch Set 17: Code-Review+2 rebase -- To view, visit http://gerrit.cloudera.org:8080/8414 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1 Gerrit-Change-Number: 8414 Gerrit-PatchSet: 17 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 11 Jan 2018 00:34:56 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation
Tim Armstrong has uploaded a new patch set (#17). ( http://gerrit.cloudera.org:8080/8414 ) Change subject: IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation .. IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation In preparation for switching the I/O mgr to the buffer pool, this removes and cleans up a lot of code so that the switchover patch starts from a cleaner slate. * Remove the free buffer cache (which will be replaced by buffer pool's own caching). * Make memory limit exceeded error checking synchronous (in anticipation of having to propagate buffer pool errors synchronously). * Simplify error propagation - remove the (ineffectual) code that enqueued BufferDescriptors containing error statuses. * Document locking scheme better in a few places, make it part of the function signature when it seemed reasonable. * Move ReturnBuffer() to ScanRange, because it is intrinsically connected with the lifecycle of a scan range. * Separate external ReturnBuffer() and internal CleanUpBuffer() interfaces - previously callers of ReturnBuffer() were fudging the num_buffers_in_reader accounting to make the external interface work. * Eliminate redundant state in ScanRange: 'eosr_returned_' and 'is_cancelled_'. * Clarify the logic around calling Close() for the last BufferDescriptor. -> There appeared to be an implicit assumption that buffers would be freed in the order they were returned from the scan range, so that the "eos" buffer was returned last. Instead just count the number of outstanding buffers to detect the last one. -> Touching the is_cancelled_ field without holding a lock was hard to reason about - violated locking rules and it was unclear that it was race-free. * Remove DiskIoMgr::Read() to simplify interface. It is trivial to inline at the callsites. This will probably regress performance somewhat because of the cache removal, so my plan is to merge it around the same time as switching the I/O mgr to allocate from the buffer pool. I'm keeping the patches separate to make reviewing easier. Testing: * Ran exhaustive tests * Ran the disk-io-mgr-stress-test overnight Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1 --- M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node.cc M be/src/exec/scanner-context.cc M be/src/runtime/exec-env.cc M be/src/runtime/io/disk-io-mgr-stress.cc M be/src/runtime/io/disk-io-mgr-test.cc M be/src/runtime/io/disk-io-mgr.cc M be/src/runtime/io/disk-io-mgr.h M be/src/runtime/io/request-context.cc M be/src/runtime/io/request-context.h M be/src/runtime/io/request-ranges.h M be/src/runtime/io/scan-range.cc M be/src/runtime/mem-tracker.h M be/src/runtime/test-env.cc M be/src/runtime/tmp-file-mgr-test.cc M be/src/runtime/tmp-file-mgr.cc M be/src/util/impalad-metrics.cc M be/src/util/impalad-metrics.h 19 files changed, 575 insertions(+), 908 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/8414/17 -- To view, visit http://gerrit.cloudera.org:8080/8414 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1 Gerrit-Change-Number: 8414 Gerrit-PatchSet: 17 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation
Hello Tianyi Wang, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8414 to look at the new patch set (#16). Change subject: IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation .. IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation In preparation for switching the I/O mgr to the buffer pool, this removes and cleans up a lot of code so that the switchover patch starts from a cleaner slate. * Remove the free buffer cache (which will be replaced by buffer pool's own caching). * Make memory limit exceeded error checking synchronous (in anticipation of having to propagate buffer pool errors synchronously). * Simplify error propagation - remove the (ineffectual) code that enqueued BufferDescriptors containing error statuses. * Document locking scheme better in a few places, make it part of the function signature when it seemed reasonable. * Move ReturnBuffer() to ScanRange, because it is intrinsically connected with the lifecycle of a scan range. * Separate external ReturnBuffer() and internal CleanUpBuffer() interfaces - previously callers of ReturnBuffer() were fudging the num_buffers_in_reader accounting to make the external interface work. * Eliminate redundant state in ScanRange: 'eosr_returned_' and 'is_cancelled_'. * Clarify the logic around calling Close() for the last BufferDescriptor. -> There appeared to be an implicit assumption that buffers would be freed in the order they were returned from the scan range, so that the "eos" buffer was returned last. Instead just count the number of outstanding buffers to detect the last one. -> Touching the is_cancelled_ field without holding a lock was hard to reason about - violated locking rules and it was unclear that it was race-free. * Remove DiskIoMgr::Read() to simplify interface. It is trivial to inline at the callsites. This will probably regress performance somewhat because of the cache removal, so my plan is to merge it around the same time as switching the I/O mgr to allocate from the buffer pool. I'm keeping the patches separate to make reviewing easier. Testing: * Ran exhaustive tests * Ran the disk-io-mgr-stress-test overnight Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1 --- M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node.cc M be/src/exec/scanner-context.cc M be/src/runtime/exec-env.cc M be/src/runtime/io/disk-io-mgr-stress.cc M be/src/runtime/io/disk-io-mgr-test.cc M be/src/runtime/io/disk-io-mgr.cc M be/src/runtime/io/disk-io-mgr.h M be/src/runtime/io/request-context.cc M be/src/runtime/io/request-context.h M be/src/runtime/io/request-ranges.h M be/src/runtime/io/scan-range.cc M be/src/runtime/mem-tracker.h M be/src/runtime/test-env.cc M be/src/runtime/tmp-file-mgr-test.cc M be/src/runtime/tmp-file-mgr.cc M be/src/util/impalad-metrics.cc M be/src/util/impalad-metrics.h 19 files changed, 575 insertions(+), 904 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/8414/16 -- To view, visit http://gerrit.cloudera.org:8080/8414 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1 Gerrit-Change-Number: 8414 Gerrit-PatchSet: 16 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8414 ) Change subject: IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation .. Patch Set 15: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/8414 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1 Gerrit-Change-Number: 8414 Gerrit-PatchSet: 15 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 14 Dec 2017 22:53:07 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation
Tim Armstrong has uploaded a new patch set (#14). ( http://gerrit.cloudera.org:8080/8414 ) Change subject: IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation .. IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation In preparation for switching the I/O mgr to the buffer pool, this removes and cleans up a lot of code so that the switchover patch starts from a cleaner slate. * Remove the free buffer cache (which will be replaced by buffer pool's own caching). * Make memory limit exceeded error checking synchronous (in anticipation of having to propagate buffer pool errors synchronously). * Simplify error propagation - remove the (ineffectual) code that enqueued BufferDescriptors containing error statuses. * Document locking scheme better in a few places, make it part of the function signature when it seemed reasonable. * Move ReturnBuffer() to ScanRange, because it is intrinsically connected with the lifecycle of a scan range. * Separate external ReturnBuffer() and internal CleanUpBuffer() interfaces - previously callers of ReturnBuffer() were fudging the num_buffers_in_reader accounting to make the external interface work. * Eliminate redundant state in ScanRange: 'eosr_returned_' and 'is_cancelled_'. * Clarify the logic around calling Close() for the last BufferDescriptor. -> There appeared to be an implicit assumption that buffers would be freed in the order they were returned from the scan range, so that the "eos" buffer was returned last. Instead just count the number of outstanding buffers to detect the last one. -> Touching the is_cancelled_ field without holding a lock was hard to reason about - violated locking rules and it was unclear that it was race-free. * Remove DiskIoMgr::Read() to simplify interface. It is trivial to inline at the callsites. This will probably regress performance somewhat because of the cache removal, so my plan is to merge it around the same time as switching the I/O mgr to allocate from the buffer pool. I'm keeping the patches separate to make reviewing easier. Testing: * Ran exhaustive tests * Ran the disk-io-mgr-stress-test overnight Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1 --- M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node.cc M be/src/exec/scanner-context.cc M be/src/runtime/exec-env.cc M be/src/runtime/io/disk-io-mgr-stress.cc M be/src/runtime/io/disk-io-mgr-test.cc M be/src/runtime/io/disk-io-mgr.cc M be/src/runtime/io/disk-io-mgr.h M be/src/runtime/io/request-context.cc M be/src/runtime/io/request-context.h M be/src/runtime/io/request-ranges.h M be/src/runtime/io/scan-range.cc M be/src/runtime/mem-tracker.h M be/src/runtime/test-env.cc M be/src/runtime/tmp-file-mgr-test.cc M be/src/runtime/tmp-file-mgr.cc M be/src/util/impalad-metrics.cc M be/src/util/impalad-metrics.h 19 files changed, 575 insertions(+), 904 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/8414/14 -- To view, visit http://gerrit.cloudera.org:8080/8414 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1 Gerrit-Change-Number: 8414 Gerrit-PatchSet: 14 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8414 ) Change subject: IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation .. Patch Set 13: Code-Review+2 carry -- To view, visit http://gerrit.cloudera.org:8080/8414 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1 Gerrit-Change-Number: 8414 Gerrit-PatchSet: 13 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 06 Dec 2017 16:33:24 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation
Tim Armstrong has uploaded a new patch set (#13). ( http://gerrit.cloudera.org:8080/8414 ) Change subject: IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation .. IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation In preparation for switching the I/O mgr to the buffer pool, this removes and cleans up a lot of code so that the switchover patch starts from a cleaner slate. * Remove the free buffer cache (which will be replaced by buffer pool's own caching). * Make memory limit exceeded error checking synchronous (in anticipation of having to propagate buffer pool errors synchronously). * Simplify error propagation - remove the (ineffectual) code that enqueued BufferDescriptors containing error statuses. * Document locking scheme better in a few places, make it part of the function signature when it seemed reasonable. * Move ReturnBuffer() to ScanRange, because it is intrinsically connected with the lifecycle of a scan range. * Separate external ReturnBuffer() and internal CleanUpBuffer() interfaces - previously callers of ReturnBuffer() were fudging the num_buffers_in_reader accounting to make the external interface work. * Eliminate redundant state in ScanRange: 'eosr_returned_' and 'is_cancelled_'. * Clarify the logic around calling Close() for the last BufferDescriptor. -> There appeared to be an implicit assumption that buffers would be freed in the order they were returned from the scan range, so that the "eos" buffer was returned last. Instead just count the number of outstanding buffers to detect the last one. -> Touching the is_cancelled_ field without holding a lock was hard to reason about - violated locking rules and it was unclear that it was race-free. * Remove DiskIoMgr::Read() to simplify interface. It is trivial to inline at the callsites. This will probably regress performance somewhat because of the cache removal, so my plan is to merge it around the same time as switching the I/O mgr to allocate from the buffer pool. I'm keeping the patches separate to make reviewing easier. Testing: * Ran exhaustive tests * Ran the disk-io-mgr-stress-test overnight Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1 --- M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node.cc M be/src/exec/scanner-context.cc M be/src/runtime/exec-env.cc M be/src/runtime/io/disk-io-mgr-stress.cc M be/src/runtime/io/disk-io-mgr-test.cc M be/src/runtime/io/disk-io-mgr.cc M be/src/runtime/io/disk-io-mgr.h M be/src/runtime/io/request-context.cc M be/src/runtime/io/request-context.h M be/src/runtime/io/request-ranges.h M be/src/runtime/io/scan-range.cc M be/src/runtime/mem-tracker.h M be/src/runtime/test-env.cc M be/src/runtime/tmp-file-mgr-test.cc M be/src/runtime/tmp-file-mgr.cc M be/src/util/impalad-metrics.cc M be/src/util/impalad-metrics.h 19 files changed, 575 insertions(+), 904 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/8414/13 -- To view, visit http://gerrit.cloudera.org:8080/8414 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1 Gerrit-Change-Number: 8414 Gerrit-PatchSet: 13 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8414 ) Change subject: IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation .. Patch Set 12: Code-Review+2 (1 comment) Please see if Tianyi has any more comments too. http://gerrit.cloudera.org:8080/#/c/8414/9/be/src/runtime/io/request-context.h File be/src/runtime/io/request-context.h: http://gerrit.cloudera.org:8080/#/c/8414/9/be/src/runtime/io/request-context.h@93 PS9, Line 93: void Cancel(); > Let me take a look through the current patchset first. Even the other methods seem to operate entirely (or almost entirely?) on the 'reader' state, rather than the diskiomgr 'this' itself. Even the lock is the reader lock. So it seems they are more operations on the context. I think the current change is okay as is, but i think we could consider moving even more things into here in a purely mechanical change in the future. -- To view, visit http://gerrit.cloudera.org:8080/8414 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1 Gerrit-Change-Number: 8414 Gerrit-PatchSet: 12 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 21 Nov 2017 04:27:40 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation
Hello Tianyi Wang, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8414 to look at the new patch set (#12). Change subject: IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation .. IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation In preparation for switching the I/O mgr to the buffer pool, this removes and cleans up a lot of code so that the switchover patch starts from a cleaner slate. * Remove the free buffer cache (which will be replaced by buffer pool's own caching). * Make memory limit exceeded error checking synchronous (in anticipation of having to propagate buffer pool errors synchronously). * Simplify error propagation - remove the (ineffectual) code that enqueued BufferDescriptors containing error statuses. * Document locking scheme better in a few places, make it part of the function signature when it seemed reasonable. * Move ReturnBuffer() to ScanRange, because it is intrinsically connected with the lifecycle of a scan range. * Separate external ReturnBuffer() and internal CleanUpBuffer() interfaces - previously callers of ReturnBuffer() were fudging the num_buffers_in_reader accounting to make the external interface work. * Eliminate redundant state in ScanRange: 'eosr_returned_' and 'is_cancelled_'. * Clarify the logic around calling Close() for the last BufferDescriptor. -> There appeared to be an implicit assumption that buffers would be freed in the order they were returned from the scan range, so that the "eos" buffer was returned last. Instead just count the number of outstanding buffers to detect the last one. -> Touching the is_cancelled_ field without holding a lock was hard to reason about - violated locking rules and it was unclear that it was race-free. * Remove DiskIoMgr::Read() to simplify interface. It is trivial to inline at the callsites. This will probably regress performance somewhat because of the cache removal, so my plan is to merge it around the same time as switching the I/O mgr to allocate from the buffer pool. I'm keeping the patches separate to make reviewing easier. Testing: * Ran exhaustive tests * Ran the disk-io-mgr-stress-test overnight Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1 --- M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node.cc M be/src/exec/scanner-context.cc M be/src/runtime/exec-env.cc M be/src/runtime/io/disk-io-mgr-stress.cc M be/src/runtime/io/disk-io-mgr-test.cc M be/src/runtime/io/disk-io-mgr.cc M be/src/runtime/io/disk-io-mgr.h M be/src/runtime/io/request-context.cc M be/src/runtime/io/request-context.h M be/src/runtime/io/request-ranges.h M be/src/runtime/io/scan-range.cc M be/src/runtime/mem-tracker.h M be/src/runtime/test-env.cc M be/src/runtime/tmp-file-mgr-test.cc M be/src/runtime/tmp-file-mgr.cc M be/src/util/impalad-metrics.cc M be/src/util/impalad-metrics.h 19 files changed, 575 insertions(+), 904 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/8414/12 -- To view, visit http://gerrit.cloudera.org:8080/8414 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1 Gerrit-Change-Number: 8414 Gerrit-PatchSet: 12 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8414 ) Change subject: IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation .. Patch Set 9: (19 comments) http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/disk-io-mgr.h File be/src/runtime/io/disk-io-mgr.h: http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/disk-io-mgr.h@a222 PS11, Line 222: : : > did you mean to just delete that? Done http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/disk-io-mgr.h@457 PS11, Line 457: XCEED > queued? Done http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/disk-io-mgr.cc File be/src/runtime/io/disk-io-mgr.cc: http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/disk-io-mgr.cc@333 PS11, Line 333: CANCELLED > seems like the right thing to do but are you sure we weren't depending on p Before this patch the only place when RequestContext::Cancel() was called with a non-CANCELLED status was with MEM_LIMIT_EXCEEDED. This patch removes that call and the need for any asynchronous error propagation via RequestContext, so can't introduce any new bugs in this area. Error propagation bugs are still possible in client code if they cancel the context before propagating the original error, but I believe the HDFS scan node code is written to avoid this. http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/disk-io-mgr.cc@581 PS11, Line 581: DCHECK(!buffer->is_cached()) << "HDFS cache reads don't go through this code path."; > maybe put that dcheck upstream too (like ReadRange() or even earlier) to he There is already a DCHECK in ReadRange() that catches this, but its non-obvious. I added a string to it to make it clearer what it's checking. http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/disk-io-mgr.cc@665 PS11, Line 665: DCHECK > nit: _EQ The strong typed enums prevent that unfortunately. I did change this to print out the integer value on failure. http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/request-context.h File be/src/runtime/io/request-context.h: http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/request-context.h@448 PS11, Line 448: again. > disk threads? Done http://gerrit.cloudera.org:8080/#/c/8414/9/be/src/runtime/io/request-context.h File be/src/runtime/io/request-context.h: http://gerrit.cloudera.org:8080/#/c/8414/9/be/src/runtime/io/request-context.h@27 PS9, Line 27: For most I/O manager clients it is an : /// opaque pointer, but some clients may need to include this heade > is that still true? Done http://gerrit.cloudera.org:8080/#/c/8414/9/be/src/runtime/io/request-context.h@93 PS9, Line 93: void Cancel(); > how did you decide to make this a first class thing in the RequestContext, Yeah, I'd probably call it DiskIoMgrClient or something like that if I was writing this from scratch. The ones I moved were the trivial methods that were only called during RequestContext set-up and tear-down. The other DiskIoMgr methods are called from more places and are more about the I/O operations than setting up the context. It isn't really principled now that I think about it. http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/request-context.cc File be/src/runtime/io/request-context.cc: http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/request-context.cc@42 PS11, Line 42: << static_cast(range->external_buffer_tag_); > oh, I guess DCHECK_EQ doesn't work with enum or something? Yeah, not with the strongly-typed "enum class" enums. It works with weakly typed enums because they get implicitly converted to integers. http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/request-context.cc@75 PS11, Line 75: reader > what about the writing case? is that relevant? Yep, this needed updating. http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/request-context.cc@76 PS11, Line 76: GetNext > GetNext() Done http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/request-context.cc@79 PS11, Line 79: it > does this referring to the context or something else? Done http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/request-context.cc@90 PS11, Line 90: cancelled Status > does that mean the CANCELLED status or the status that lead to cancellation The RequestContext can't be cancelled with an arbitrary status. If there was an earlier error for a ScanRange, the cancelled status doesn't overwrite the prior status. http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/request-context.cc@91 PS11, Line 91: the cancelled state. > what does that mean? reworded. http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/request-context.cc@95 PS11, Line 95: decrements the number of disk queues the context : // is on. > which code is this talking about? Oh, I guess that's
[Impala-ASF-CR] IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation
Hello Tianyi Wang, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8414 to look at the new patch set (#10). Change subject: IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation .. IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation In preparation for switching the I/O mgr to the buffer pool, this removes and cleans up a lot of code so that the switchover patch starts from a cleaner slate. * Remove the free buffer cache (which will be replaced by buffer pool's own caching). * Make memory limit exceeded error checking synchronous (in anticipation of having to propagate buffer pool errors synchronously). * Simplify error propagation - remove the (ineffectual) code that enqueued BufferDescriptors containing error statuses. * Document locking scheme better in a few places, make it part of the function signature when it seemed reasonable. * Move ReturnBuffer() to ScanRange, because it is intrinsically connected with the lifecycle of a scan range. * Separate external ReturnBuffer() and internal CleanUpBuffer() interfaces - previously callers of ReturnBuffer() were fudging the num_buffers_in_reader accounting to make the external interface work. * Eliminate redundant state in ScanRange: 'eosr_returned_' and 'is_cancelled_'. * Clarify the logic around calling Close() for the last BufferDescriptor. -> There appeared to be an implicit assumption that buffers would be freed in the order they were returned from the scan range, so that the "eos" buffer was returned last. Instead just count the number of outstanding buffers to detect the last one. -> Touching the is_cancelled_ field without holding a lock was hard to reason about - violated locking rules and it was unclear that it was race-free. This will probably regress performance somewhat because of the cache removal, so my plan is to merge it around the same time as switching the I/O mgr to allocate from the buffer pool. I'm keeping the patches separate to make reviewing easier. Testing: * Ran exhaustive tests * Ran the disk-io-mgr-stress-test overnight Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1 --- M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node.cc M be/src/exec/scanner-context.cc M be/src/runtime/exec-env.cc M be/src/runtime/io/disk-io-mgr-stress.cc M be/src/runtime/io/disk-io-mgr-test.cc M be/src/runtime/io/disk-io-mgr.cc M be/src/runtime/io/disk-io-mgr.h M be/src/runtime/io/request-context.cc M be/src/runtime/io/request-context.h M be/src/runtime/io/request-ranges.h M be/src/runtime/io/scan-range.cc M be/src/runtime/mem-tracker.h M be/src/runtime/test-env.cc M be/src/runtime/tmp-file-mgr-test.cc M be/src/runtime/tmp-file-mgr.cc M be/src/util/impalad-metrics.cc M be/src/util/impalad-metrics.h 19 files changed, 552 insertions(+), 861 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/8414/10 -- To view, visit http://gerrit.cloudera.org:8080/8414 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1 Gerrit-Change-Number: 8414 Gerrit-PatchSet: 10 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8414 ) Change subject: IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation .. Patch Set 8: > (3 comments) > > I kept on pulling at the error propagation string and I've realised > that we need a few more changes to get this into a good state where > it's more clearly correct. I'm going to continue working on that > and post a patchset once I'm done. Is that complete in the latest patchset or should we wait for another before reviewing? -- To view, visit http://gerrit.cloudera.org:8080/8414 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1 Gerrit-Change-Number: 8414 Gerrit-PatchSet: 8 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 16 Nov 2017 19:28:29 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8414 ) Change subject: IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation .. Patch Set 8: Ended up doing some further changes to make the DecrementRequestThread*(), locking and error propagation a bit more comprehensible. -- To view, visit http://gerrit.cloudera.org:8080/8414 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1 Gerrit-Change-Number: 8414 Gerrit-PatchSet: 8 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 16 Nov 2017 19:08:11 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation
Hello Tianyi Wang, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8414 to look at the new patch set (#8). Change subject: IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation .. IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation In preparation for switching the I/O mgr to the buffer pool, this removes and cleans up a lot of code so that the switchover patch starts from a cleaner slate. * Remove the free buffer cache (which will be replaced by buffer pool's own caching). * Make memory limit exceeded error checking synchronous (in anticipation of having to propagate buffer pool errors synchronously). * Simplify error propagation - remove the (ineffectual) code that enqueued BufferDescriptors containing error statuses. * Document locking scheme better in a few places, make it part of the function signature when it seemed reasonable. * Move ReturnBuffer() to ScanRange, because it is intrinsically connected with the lifecycle of a scan range. * Separate external ReturnBuffer() and internal CleanUpBuffer() interfaces - previously callers of ReturnBuffer() were fudging the num_buffers_in_reader accounting to make the external interface work. * Eliminate redundant state in ScanRange: 'eosr_returned_' and 'is_cancelled_'. * Clarify the logic around calling Close() for the last BufferDescriptor. -> There appeared to be an implicit assumption that buffers would be freed in the order they were returned from the scan range, so that the "eos" buffer was returned last. Instead just count the number of outstanding buffers to detect the last one. -> Touching the is_cancelled_ field without holding a lock was hard to reason about - violated locking rules and it was unclear that it was race-free. This will probably regress performance somewhat because of the cache removal, so my plan is to merge it around the same time as switching the I/O mgr to allocate from the buffer pool. I'm keeping the patches separate to make reviewing easier. Testing: * Ran exhaustive tests * Ran the disk-io-mgr-stress-test overnight Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1 --- M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node.cc M be/src/exec/scanner-context.cc M be/src/runtime/exec-env.cc M be/src/runtime/io/disk-io-mgr-stress.cc M be/src/runtime/io/disk-io-mgr-test.cc M be/src/runtime/io/disk-io-mgr.cc M be/src/runtime/io/disk-io-mgr.h M be/src/runtime/io/request-context.cc M be/src/runtime/io/request-context.h M be/src/runtime/io/request-ranges.h M be/src/runtime/io/scan-range.cc M be/src/runtime/mem-tracker.h M be/src/runtime/test-env.cc M be/src/runtime/tmp-file-mgr-test.cc M be/src/runtime/tmp-file-mgr.cc M be/src/util/impalad-metrics.cc M be/src/util/impalad-metrics.h 19 files changed, 537 insertions(+), 847 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/8414/8 -- To view, visit http://gerrit.cloudera.org:8080/8414 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1 Gerrit-Change-Number: 8414 Gerrit-PatchSet: 8 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong