[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes
Joe McDonnell has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/19475 ) Change subject: IMPALA-11886: Data cache should support asynchronous writes .. IMPALA-11886: Data cache should support asynchronous writes This patch implements asynchronous writes to the data cache to improve scan performance when a cache miss happens. Previously, writes to the data cache are synchronous with hdfs file reads, and both are handled by remote hdfs IO threads. In other words, if a cache miss occurs, the IO thread needs to take additional responsibility for cache writes, which will lead to scan performance deterioration. This patch uses a thread pool for asynchronous writes, and the number of threads in the pool is determined by the new configuration 'data_cache_num_write_threads'. In asynchronous write mode, the IO thread only needs to copy data to the temporary buffer when storing data into the data cache. The additional memory consumption caused by temporary buffers can be limited, depending on the new configuration 'data_cache_write_buffer_limit'. Testing: - Add test cases for asynchronous data writing to the original DataCacheTest using different number of threads. - Add DataCacheTest,#OutOfWriteBufferLimit Used to test the limit of memory consumed by temporary buffers in the case of asynchronous writes - Add a timer to the MultiThreadedReadWrite function to get the average time of multithreaded writes. Here are some test cases and their time that differ significantly between synchronous and asynchronous: Test case| Policy | Sync/Async | write time in ms MultiThreadedNoMisses| LRU| Sync | 12.20 MultiThreadedNoMisses| LRU| Async | 20.74 MultiThreadedNoMisses| LIRS | Sync |9.42 MultiThreadedNoMisses| LIRS | Async | 16.75 MultiThreadedWithMisses | LRU| Sync | 510.87 MultiThreadedWithMisses | LRU| Async | 10.06 MultiThreadedWithMisses | LIRS | Sync | 1872.11 MultiThreadedWithMisses | LIRS | Async | 11.02 MultiPartitions | LRU| Sync |1.20 MultiPartitions | LRU| Async |5.23 MultiPartitions | LIRS | Sync |1.26 MultiPartitions | LIRS | Async |7.91 AccessTraceAnonymization | LRU| Sync | 1963.89 AccessTraceAnonymization | LRU| Sync | 2073.62 AccessTraceAnonymization | LRU| Async |9.43 AccessTraceAnonymization | LRU| Async | 13.13 AccessTraceAnonymization | LIRS | Sync | 1663.93 AccessTraceAnonymization | LIRS | Sync | 1501.86 AccessTraceAnonymization | LIRS | Async | 12.83 AccessTraceAnonymization | LIRS | Async | 12.74 Change-Id: I878f7486d485b6288de1a9145f49576b7155d312 Reviewed-on: http://gerrit.cloudera.org:8080/19475 Reviewed-by: Joe McDonnell Tested-by: Impala Public Jenkins --- M be/src/runtime/io/data-cache-test.cc M be/src/runtime/io/data-cache-trace.cc M be/src/runtime/io/data-cache.cc M be/src/runtime/io/data-cache.h M be/src/runtime/io/disk-io-mgr.cc M be/src/util/impalad-metrics.cc M be/src/util/impalad-metrics.h M bin/run-all-tests.sh M bin/start-impala-cluster.py M common/thrift/metrics.json 10 files changed, 427 insertions(+), 52 deletions(-) Approvals: Joe McDonnell: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/19475 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312 Gerrit-Change-Number: 19475 Gerrit-PatchSet: 13 Gerrit-Owner: Anonymous Coward <18770832...@163.com> Gerrit-Reviewer: Anonymous Coward <18770832...@163.com> Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell
[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19475 ) Change subject: IMPALA-11886: Data cache should support asynchronous writes .. Patch Set 12: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/19475 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312 Gerrit-Change-Number: 19475 Gerrit-PatchSet: 12 Gerrit-Owner: Anonymous Coward <18770832...@163.com> Gerrit-Reviewer: Anonymous Coward <18770832...@163.com> Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Thu, 23 Mar 2023 08:32:11 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/19475 ) Change subject: IMPALA-11886: Data cache should support asynchronous writes .. Patch Set 12: Code-Review+2 My internal test run looks good, this looks ready -- To view, visit http://gerrit.cloudera.org:8080/19475 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312 Gerrit-Change-Number: 19475 Gerrit-PatchSet: 12 Gerrit-Owner: Anonymous Coward <18770832...@163.com> Gerrit-Reviewer: Anonymous Coward <18770832...@163.com> Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Thu, 23 Mar 2023 03:22:22 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19475 ) Change subject: IMPALA-11886: Data cache should support asynchronous writes .. Patch Set 12: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/9171/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/19475 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312 Gerrit-Change-Number: 19475 Gerrit-PatchSet: 12 Gerrit-Owner: Anonymous Coward <18770832...@163.com> Gerrit-Reviewer: Anonymous Coward <18770832...@163.com> Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Thu, 23 Mar 2023 03:22:50 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19475 ) Change subject: IMPALA-11886: Data cache should support asynchronous writes .. Patch Set 11: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/12674/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/19475 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312 Gerrit-Change-Number: 19475 Gerrit-PatchSet: 11 Gerrit-Owner: Anonymous Coward <18770832...@163.com> Gerrit-Reviewer: Anonymous Coward <18770832...@163.com> Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Thu, 23 Mar 2023 03:12:30 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes
18770832...@163.com has posted comments on this change. ( http://gerrit.cloudera.org:8080/19475 ) Change subject: IMPALA-11886: Data cache should support asynchronous writes .. Patch Set 10: (2 comments) Thanks again for your code review and suggestions. http://gerrit.cloudera.org:8080/#/c/19475/10/be/src/runtime/io/data-cache-test.cc File be/src/runtime/io/data-cache-test.cc: http://gerrit.cloudera.org:8080/#/c/19475/10/be/src/runtime/io/data-cache-test.cc@107 PS10, Line 107: usleep(10 * 1000); > One small thing: Since we call this so frequently, it matters a lot to exec Done http://gerrit.cloudera.org:8080/#/c/19475/10/be/src/runtime/io/data-cache.cc File be/src/runtime/io/data-cache.cc: http://gerrit.cloudera.org:8080/#/c/19475/10/be/src/runtime/io/data-cache.cc@960 PS10, Line 960: if (buffer_limit < (1 << 23 /* 8MB */)) { : return Status(Substitute("Configured data cache write buffer limit $0 is too small " : "(less than 8MB)", FLAGS_data_cache_async_write_buffer_limit)); : } > Unfortunately, this causes problems for the OutOfWriteBufferLimit test case Done -- To view, visit http://gerrit.cloudera.org:8080/19475 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312 Gerrit-Change-Number: 19475 Gerrit-PatchSet: 10 Gerrit-Owner: Anonymous Coward <18770832...@163.com> Gerrit-Reviewer: Anonymous Coward <18770832...@163.com> Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Thu, 23 Mar 2023 02:57:13 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes
Hello Joe McDonnell, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/19475 to look at the new patch set (#11). Change subject: IMPALA-11886: Data cache should support asynchronous writes .. IMPALA-11886: Data cache should support asynchronous writes This patch implements asynchronous writes to the data cache to improve scan performance when a cache miss happens. Previously, writes to the data cache are synchronous with hdfs file reads, and both are handled by remote hdfs IO threads. In other words, if a cache miss occurs, the IO thread needs to take additional responsibility for cache writes, which will lead to scan performance deterioration. This patch uses a thread pool for asynchronous writes, and the number of threads in the pool is determined by the new configuration 'data_cache_num_write_threads'. In asynchronous write mode, the IO thread only needs to copy data to the temporary buffer when storing data into the data cache. The additional memory consumption caused by temporary buffers can be limited, depending on the new configuration 'data_cache_write_buffer_limit'. Testing: - Add test cases for asynchronous data writing to the original DataCacheTest using different number of threads. - Add DataCacheTest,#OutOfWriteBufferLimit Used to test the limit of memory consumed by temporary buffers in the case of asynchronous writes - Add a timer to the MultiThreadedReadWrite function to get the average time of multithreaded writes. Here are some test cases and their time that differ significantly between synchronous and asynchronous: Test case| Policy | Sync/Async | write time in ms MultiThreadedNoMisses| LRU| Sync | 12.20 MultiThreadedNoMisses| LRU| Async | 20.74 MultiThreadedNoMisses| LIRS | Sync |9.42 MultiThreadedNoMisses| LIRS | Async | 16.75 MultiThreadedWithMisses | LRU| Sync | 510.87 MultiThreadedWithMisses | LRU| Async | 10.06 MultiThreadedWithMisses | LIRS | Sync | 1872.11 MultiThreadedWithMisses | LIRS | Async | 11.02 MultiPartitions | LRU| Sync |1.20 MultiPartitions | LRU| Async |5.23 MultiPartitions | LIRS | Sync |1.26 MultiPartitions | LIRS | Async |7.91 AccessTraceAnonymization | LRU| Sync | 1963.89 AccessTraceAnonymization | LRU| Sync | 2073.62 AccessTraceAnonymization | LRU| Async |9.43 AccessTraceAnonymization | LRU| Async | 13.13 AccessTraceAnonymization | LIRS | Sync | 1663.93 AccessTraceAnonymization | LIRS | Sync | 1501.86 AccessTraceAnonymization | LIRS | Async | 12.83 AccessTraceAnonymization | LIRS | Async | 12.74 Change-Id: I878f7486d485b6288de1a9145f49576b7155d312 --- M be/src/runtime/io/data-cache-test.cc M be/src/runtime/io/data-cache-trace.cc M be/src/runtime/io/data-cache.cc M be/src/runtime/io/data-cache.h M be/src/runtime/io/disk-io-mgr.cc M be/src/util/impalad-metrics.cc M be/src/util/impalad-metrics.h M bin/run-all-tests.sh M bin/start-impala-cluster.py M common/thrift/metrics.json 10 files changed, 427 insertions(+), 52 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/19475/11 -- To view, visit http://gerrit.cloudera.org:8080/19475 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312 Gerrit-Change-Number: 19475 Gerrit-PatchSet: 11 Gerrit-Owner: Anonymous Coward <18770832...@163.com> Gerrit-Reviewer: Anonymous Coward <18770832...@163.com> Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell
[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/19475 ) Change subject: IMPALA-11886: Data cache should support asynchronous writes .. Patch Set 10: Code-Review+1 (2 comments) This is looking good. I only have a couple small things, then I think we are good to go. I will kick off another round of tests. Thank you for your patience on this review. http://gerrit.cloudera.org:8080/#/c/19475/10/be/src/runtime/io/data-cache-test.cc File be/src/runtime/io/data-cache-test.cc: http://gerrit.cloudera.org:8080/#/c/19475/10/be/src/runtime/io/data-cache-test.cc@107 PS10, Line 107: usleep(10 * 1000); One small thing: Since we call this so frequently, it matters a lot to execution time of these tests. In my experiments, these tests run much faster if we bring this down to 500 vs 1. http://gerrit.cloudera.org:8080/#/c/19475/10/be/src/runtime/io/data-cache.cc File be/src/runtime/io/data-cache.cc: http://gerrit.cloudera.org:8080/#/c/19475/10/be/src/runtime/io/data-cache.cc@960 PS10, Line 960: if (buffer_limit < (1 << 23 /* 8MB */)) { : return Status(Substitute("Configured data cache write buffer limit $0 is too small " : "(less than 8MB)", FLAGS_data_cache_async_write_buffer_limit)); : } Unfortunately, this causes problems for the OutOfWriteBufferLimit test case, because it uses a smaller buffer. As a workaround, let's disable this check for backend tests: if (!TestInfo::is_be_test() && buffer_limit < (1 << 23 /* 8MB */)) { -- To view, visit http://gerrit.cloudera.org:8080/19475 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312 Gerrit-Change-Number: 19475 Gerrit-PatchSet: 10 Gerrit-Owner: Anonymous Coward <18770832...@163.com> Gerrit-Reviewer: Anonymous Coward <18770832...@163.com> Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Tue, 21 Mar 2023 21:13:17 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19475 ) Change subject: IMPALA-11886: Data cache should support asynchronous writes .. Patch Set 10: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/12621/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/19475 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312 Gerrit-Change-Number: 19475 Gerrit-PatchSet: 10 Gerrit-Owner: Anonymous Coward <18770832...@163.com> Gerrit-Reviewer: Anonymous Coward <18770832...@163.com> Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Tue, 14 Mar 2023 08:21:14 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19475 ) Change subject: IMPALA-11886: Data cache should support asynchronous writes .. Patch Set 9: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/12620/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/19475 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312 Gerrit-Change-Number: 19475 Gerrit-PatchSet: 9 Gerrit-Owner: Anonymous Coward <18770832...@163.com> Gerrit-Reviewer: Anonymous Coward <18770832...@163.com> Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Tue, 14 Mar 2023 08:15:43 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes
18770832...@163.com has posted comments on this change. ( http://gerrit.cloudera.org:8080/19475 ) Change subject: IMPALA-11886: Data cache should support asynchronous writes .. Patch Set 7: (14 comments) Thank you for your suggestion! I have updated the code and commit message, and added time-consuming log printing in the test code for easier replication of test results. http://gerrit.cloudera.org:8080/#/c/19475/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/19475/4//COMMIT_MSG@24 PS4, Line 24: Testing: > Thank you for the performance numbers. This is interesting enough that we s Done http://gerrit.cloudera.org:8080/#/c/19475/7//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/19475/7//COMMIT_MSG@9 PS7, Line 9: write > Nit: writes Done http://gerrit.cloudera.org:8080/#/c/19475/7//COMMIT_MSG@10 PS7, Line 10: when cache miss happens > Nit: "when a cache miss happens" Done http://gerrit.cloudera.org:8080/#/c/19475/7//COMMIT_MSG@11 PS7, Line 11: synchronized > Nit: synchronous Done http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache-test.cc File be/src/runtime/io/data-cache-test.cc: http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache-test.cc@822 PS7, Line 822: EXPECT_LT(count, NUM_CACHE_ENTRIES_NO_EVICT); > On our test jobs, this test fails on this assert. I think we might > want to use a lower number for data_cache_async_write_threads for > this test (e.g. 1 or 2). Sorry, this was caused by me missing some code. In order not to interfere with the DataCache in TraceReplayer, I changed the number of asynchronous threads to the constructor parameter of DataCache, but I forgot to make the corresponding modifications in the test code. Now this issue has been fixed and all test cases should pass. http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache.cc File be/src/runtime/io/data-cache.cc: http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache.cc@128 PS7, Line 128: const int MAX_STORE_TASK_QUEUE_SIZE = 1 << 20; > Let's add a comment here saying that this large value for the queue size is Done http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache.cc@385 PS7, Line 385: abstruct > Nit: abstract Done http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache.cc@393 PS7, Line 393: explicit StoreTask(const std::string& filename, int64_t mtime, int64_t offset, : const uint8_t* buffer, int64_t buffer_len, AtomicInt64& total_size) : : key_(filename, mtime, offset), > I would like to make StoreTask a simple struct with minimal logic, and move Done http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache.cc@975 PS7, Line 975: if (buffer_limit < PAGE_SIZE) { : return Status(Substitute("Configured data cache write buffer limit $0 is too small", : FLAGS_data_cache_async_write_buffer_limit)); : } > Let's require the limit to be higher. I think the minimum should be 8MB. If Done http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache.cc@1059 PS7, Line 1059: if (UNLIKELY(current_buffer_size_.Load() + buffer_len > store_buffer_capacity_)) { : VLOG(2) << Substitute("Failed to create store task due to buffer size limitation, " : "current buffer size: $0 size limitation: $1 require: $2", : current_buffer_size_.Load(), store_buffer_capacity_, buffer_len); : ImpaladMetrics::IO_MGR_REMOTE_DATA_CACHE_ASYNC_WRITES_DROPPED_BYTES-> : Increment(buffer_len); : ImpaladMetrics::IO_MGR_REMOTE_DATA_CACHE_ASYNC_WRITES_DROPPED_ENTRIES->Increment(1); : return false; : } > This would become a CompareAndSwap loop where either we are at the limit an Done http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache.cc@1069 PS7, Line 1069: const StoreTask* task = new StoreTask(filename, mtime, offset, buffer, buffer_len, : current_buffer_size_); > Move the logic from the StoreTask constructor (incrementing counters, alloc Done http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache.cc@1074 PS7, Line 1074: > We would also add a CompleteStoreTask() function here that would be called Done http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/disk-io-mgr.cc File be/src/runtime/io/disk-io-mgr.cc: http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/disk-io-mgr.cc@80 PS7, Line 80: Write threads need to bound the extra memory consumption for holding the " : "temporary buffer though. > Nit: Let's change this sentence to say that the extra memory for temporary Done http://gerrit.cloudera.org:8080/#/c/19475/7/common/thrift/metrics.
[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19475 ) Change subject: IMPALA-11886: Data cache should support asynchronous writes .. Patch Set 8: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/12619/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/19475 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312 Gerrit-Change-Number: 19475 Gerrit-PatchSet: 8 Gerrit-Owner: Anonymous Coward <18770832...@163.com> Gerrit-Reviewer: Anonymous Coward <18770832...@163.com> Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Tue, 14 Mar 2023 08:11:24 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes
Hello Joe McDonnell, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/19475 to look at the new patch set (#10). Change subject: IMPALA-11886: Data cache should support asynchronous writes .. IMPALA-11886: Data cache should support asynchronous writes This patch implements asynchronous writes to the data cache to improve scan performance when a cache miss happens. Previously, writes to the data cache are synchronous with hdfs file reads, and both are handled by remote hdfs IO threads. In other words, if a cache miss occurs, the IO thread needs to take additional responsibility for cache writes, which will lead to scan performance deterioration. This patch uses a thread pool for asynchronous writes, and the number of threads in the pool is determined by the new configuration 'data_cache_num_write_threads'. In asynchronous write mode, the IO thread only needs to copy data to the temporary buffer when storing data into the data cache. The additional memory consumption caused by temporary buffers can be limited, depending on the new configuration 'data_cache_write_buffer_limit'. Testing: - Add test cases for asynchronous data writing to the original DataCacheTest using different number of threads. - Add DataCacheTest,#OutOfWriteBufferLimit Used to test the limit of memory consumed by temporary buffers in the case of asynchronous writes - Add a timer to the MultiThreadedReadWrite function to get the average time of multithreaded writes. Here are some test cases and their time that differ significantly between synchronous and asynchronous: Test case| Policy | Sync/Async | write time in ms MultiThreadedNoMisses| LRU| Sync | 12.20 MultiThreadedNoMisses| LRU| Async | 20.74 MultiThreadedNoMisses| LIRS | Sync |9.42 MultiThreadedNoMisses| LIRS | Async | 16.75 MultiThreadedWithMisses | LRU| Sync | 510.87 MultiThreadedWithMisses | LRU| Async | 10.06 MultiThreadedWithMisses | LIRS | Sync | 1872.11 MultiThreadedWithMisses | LIRS | Async | 11.02 MultiPartitions | LRU| Sync |1.20 MultiPartitions | LRU| Async |5.23 MultiPartitions | LIRS | Sync |1.26 MultiPartitions | LIRS | Async |7.91 AccessTraceAnonymization | LRU| Sync | 1963.89 AccessTraceAnonymization | LRU| Sync | 2073.62 AccessTraceAnonymization | LRU| Async |9.43 AccessTraceAnonymization | LRU| Async | 13.13 AccessTraceAnonymization | LIRS | Sync | 1663.93 AccessTraceAnonymization | LIRS | Sync | 1501.86 AccessTraceAnonymization | LIRS | Async | 12.83 AccessTraceAnonymization | LIRS | Async | 12.74 Change-Id: I878f7486d485b6288de1a9145f49576b7155d312 --- M be/src/runtime/io/data-cache-test.cc M be/src/runtime/io/data-cache-trace.cc M be/src/runtime/io/data-cache.cc M be/src/runtime/io/data-cache.h M be/src/runtime/io/disk-io-mgr.cc M be/src/util/impalad-metrics.cc M be/src/util/impalad-metrics.h M bin/run-all-tests.sh M bin/start-impala-cluster.py M common/thrift/metrics.json 10 files changed, 427 insertions(+), 52 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/19475/10 -- To view, visit http://gerrit.cloudera.org:8080/19475 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312 Gerrit-Change-Number: 19475 Gerrit-PatchSet: 10 Gerrit-Owner: Anonymous Coward <18770832...@163.com> Gerrit-Reviewer: Anonymous Coward <18770832...@163.com> Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell
[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes
Hello Joe McDonnell, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/19475 to look at the new patch set (#9). Change subject: IMPALA-11886: Data cache should support asynchronous writes .. IMPALA-11886: Data cache should support asynchronous writes This patch implements asynchronous writes to the data cache to improve scan performance when a cache miss happens. Previously, writes to the data cache are synchronous with hdfs file reads, and both are handled by remote hdfs IO threads. In other words, if a cache miss occurs, the IO thread needs to take additional responsibility for cache writes, which will lead to scan performance deterioration. This patch uses a thread pool for asynchronous writes, and the number of threads in the pool is determined by the new configuration 'data_cache_num_write_threads'. In asynchronous write mode, the IO thread only needs to copy data to the temporary buffer when storing data into the data cache. The additional memory consumption caused by temporary buffers can be limited, depending on the new configuration 'data_cache_write_buffer_limit'. Testing: - Add test cases for asynchronous data writing to the original DataCacheTest using different number of threads. - Add DataCacheTest,#OutOfWriteBufferLimit Used to test the limit of memory consumed by temporary buffers in the case of asynchronous writes - Add a timer to the MultiThreadedReadWrite function to get the average time of multithreaded writes. Here are some test cases and their time that differ significantly between synchronous and asynchronous: Test case| Policy | Sync/Async | write time in ms MultiThreadedNoMisses| LRU| Sync | 12.20 MultiThreadedNoMisses| LRU| Async | 20.74 MultiThreadedNoMisses| LIRS | Sync |9.42 MultiThreadedNoMisses| LIRS | Async | 16.75 MultiThreadedWithMisses | LRU| Sync | 510.87 MultiThreadedWithMisses | LRU| Async | 10.06 MultiThreadedWithMisses | LIRS | Sync | 1872.11 MultiThreadedWithMisses | LIRS | Async | 11.02 MultiPartitions | LRU| Sync |1.20 MultiPartitions | LRU| Async |5.23 MultiPartitions | LIRS | Sync |1.26 MultiPartitions | LIRS | Async |7.91 AccessTraceAnonymization | LRU| Sync | 1963.89 AccessTraceAnonymization | LRU| Sync | 2073.62 AccessTraceAnonymization | LRU| Async |9.43 AccessTraceAnonymization | LRU| Async | 13.13 AccessTraceAnonymization | LIRS | Sync | 1663.93 AccessTraceAnonymization | LIRS | Sync | 1501.86 AccessTraceAnonymization | LIRS | Async | 12.83 AccessTraceAnonymization | LIRS | Async | 12.74 Change-Id: I878f7486d485b6288de1a9145f49576b7155d312 --- M be/src/runtime/io/data-cache-test.cc M be/src/runtime/io/data-cache-trace.cc M be/src/runtime/io/data-cache.cc M be/src/runtime/io/data-cache.h M be/src/runtime/io/disk-io-mgr.cc M be/src/util/impalad-metrics.cc M be/src/util/impalad-metrics.h M bin/run-all-tests.sh M bin/start-impala-cluster.py M common/thrift/metrics.json 10 files changed, 427 insertions(+), 52 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/19475/9 -- To view, visit http://gerrit.cloudera.org:8080/19475 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312 Gerrit-Change-Number: 19475 Gerrit-PatchSet: 9 Gerrit-Owner: Anonymous Coward <18770832...@163.com> Gerrit-Reviewer: Anonymous Coward <18770832...@163.com> Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell
[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19475 ) Change subject: IMPALA-11886: Data cache should support asynchronous writes .. Patch Set 8: (2 comments) http://gerrit.cloudera.org:8080/#/c/19475/8/bin/start-impala-cluster.py File bin/start-impala-cluster.py: http://gerrit.cloudera.org:8080/#/c/19475/8/bin/start-impala-cluster.py@409 PS8, Line 409: flake8: W293 blank line contains whitespace http://gerrit.cloudera.org:8080/#/c/19475/8/bin/start-impala-cluster.py@409 PS8, Line 409: line has trailing whitespace -- To view, visit http://gerrit.cloudera.org:8080/19475 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312 Gerrit-Change-Number: 19475 Gerrit-PatchSet: 8 Gerrit-Owner: Anonymous Coward <18770832...@163.com> Gerrit-Reviewer: Anonymous Coward <18770832...@163.com> Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Tue, 14 Mar 2023 07:52:29 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes
Hello Joe McDonnell, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/19475 to look at the new patch set (#8). Change subject: IMPALA-11886: Data cache should support asynchronous writes .. IMPALA-11886: Data cache should support asynchronous writes This patch implements asynchronous writes to the data cache to improve scan performance when a cache miss happens. Previously, writes to the data cache are synchronous with hdfs file reads, and both are handled by remote hdfs IO threads. In other words, if a cache miss occurs, the IO thread needs to take additional responsibility for cache writes, which will lead to scan performance deterioration. This patch uses a thread pool for asynchronous writes, and the number of threads in the pool is determined by the new configuration 'data_cache_num_write_threads'. In asynchronous write mode, the IO thread only needs to copy data to the temporary buffer when storing data into the data cache. The additional memory consumption caused by temporary buffers can be limited, depending on the new configuration 'data_cache_write_buffer_limit'. Testing: - Add test cases for asynchronous data writing to the original DataCacheTest using different number of threads. - Add DataCacheTest,#OutOfWriteBufferLimit Used to test the limit of memory consumed by temporary buffers in the case of asynchronous writes - Add a timer to the MultiThreadedReadWrite function to get the average time of multithreaded writes. Here are some test cases and their time that differ significantly between synchronous and asynchronous: Test case| Policy | Sync/Async | write time in ms MultiThreadedNoMisses| LRU| Sync | 12.20 MultiThreadedNoMisses| LRU| Async | 20.74 MultiThreadedNoMisses| LIRS | Sync |9.42 MultiThreadedNoMisses| LIRS | Async | 16.75 MultiThreadedWithMisses | LRU| Sync | 510.87 MultiThreadedWithMisses | LRU| Async | 10.06 MultiThreadedWithMisses | LIRS | Sync | 1872.11 MultiThreadedWithMisses | LIRS | Async | 11.02 MultiPartitions | LRU| Sync |1.20 MultiPartitions | LRU| Async |5.23 MultiPartitions | LIRS | Sync |1.26 MultiPartitions | LIRS | Async |7.91 AccessTraceAnonymization | LRU| Sync | 1963.89 AccessTraceAnonymization | LRU| Sync | 2073.62 AccessTraceAnonymization | LRU| Async |9.43 AccessTraceAnonymization | LRU| Async | 13.13 AccessTraceAnonymization | LIRS | Sync | 1663.93 AccessTraceAnonymization | LIRS | Sync | 1501.86 AccessTraceAnonymization | LIRS | Async | 12.83 AccessTraceAnonymization | LIRS | Async | 12.74 Change-Id: I878f7486d485b6288de1a9145f49576b7155d312 --- M be/src/runtime/io/data-cache-test.cc M be/src/runtime/io/data-cache-trace.cc M be/src/runtime/io/data-cache.cc M be/src/runtime/io/data-cache.h M be/src/runtime/io/disk-io-mgr.cc M be/src/util/impalad-metrics.cc M be/src/util/impalad-metrics.h M bin/run-all-tests.sh M bin/start-impala-cluster.py M common/thrift/metrics.json 10 files changed, 427 insertions(+), 52 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/19475/8 -- To view, visit http://gerrit.cloudera.org:8080/19475 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312 Gerrit-Change-Number: 19475 Gerrit-PatchSet: 8 Gerrit-Owner: Anonymous Coward <18770832...@163.com> Gerrit-Reviewer: Anonymous Coward <18770832...@163.com> Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell
[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/19475 ) Change subject: IMPALA-11886: Data cache should support asynchronous writes .. Patch Set 7: (1 comment) http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache-test.cc File be/src/runtime/io/data-cache-test.cc: http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache-test.cc@822 PS7, Line 822: EXPECT_LT(count, NUM_CACHE_ENTRIES_NO_EVICT); On our test jobs, this test fails on this assert. I think we might want to use a lower number for data_cache_async_write_threads for this test (e.g. 1 or 2). -- To view, visit http://gerrit.cloudera.org:8080/19475 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312 Gerrit-Change-Number: 19475 Gerrit-PatchSet: 7 Gerrit-Owner: Anonymous Coward <18770832...@163.com> Gerrit-Reviewer: Anonymous Coward <18770832...@163.com> Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Sun, 12 Mar 2023 06:34:53 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/19475 ) Change subject: IMPALA-11886: Data cache should support asynchronous writes .. Patch Set 7: (13 comments) This is interesting, and I think we should go forward with merging this. I have a few code organization comments, but I think this is getting very close. I am running our test jobs with the data cache enabled and async writes turned on. To make that easier in future, can you add code in bin/run-all-tests.sh that defines a new environment variable similar to the one we have for DATA_CACHE_EVICTION_POLICY (e.g. DATA_CACHE_ASYNC_WRITE_THREADS)? See the current code we have here: https://github.com/apache/impala/blob/master/bin/run-all-tests.sh#L67 https://github.com/apache/impala/blob/master/bin/run-all-tests.sh#L82-L85 That let's us run our full test job with this enabled. http://gerrit.cloudera.org:8080/#/c/19475/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/19475/4//COMMIT_MSG@24 PS4, Line 24: Testing: > Thank you for your question, I have done some simple performance tests for Thank you for the performance numbers. This is interesting enough that we should go ahead and plan on checking this in. Can you incorporate a few of the interesting performance results into the commit message? http://gerrit.cloudera.org:8080/#/c/19475/7//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/19475/7//COMMIT_MSG@9 PS7, Line 9: write Nit: writes http://gerrit.cloudera.org:8080/#/c/19475/7//COMMIT_MSG@10 PS7, Line 10: when cache miss happens Nit: "when a cache miss happens" http://gerrit.cloudera.org:8080/#/c/19475/7//COMMIT_MSG@11 PS7, Line 11: synchronized Nit: synchronous http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache.cc File be/src/runtime/io/data-cache.cc: http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache.cc@128 PS7, Line 128: const int MAX_STORE_TASK_QUEUE_SIZE = 1 << 20; Let's add a comment here saying that this large value for the queue size is harmless because the total size of the entries on the queue are bound by the data_cache_async_write_bufer_limit. http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache.cc@385 PS7, Line 385: abstruct Nit: abstract http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache.cc@393 PS7, Line 393: explicit StoreTask(const std::string& filename, int64_t mtime, int64_t offset, : const uint8_t* buffer, int64_t buffer_len, AtomicInt64& total_size) : : key_(filename, mtime, offset), I would like to make StoreTask a simple struct with minimal logic, and move the counter update / buffer allocation outside of this class. Basically, DataCache::SubmitStoreTask() would: - Increment the counter atomically with the CompareAndSwap() subject to the limit - Allocate the memory buffer and copy into it - Increment the async writes outstanding bytes - Construct the StoreTask passing in the new buffer and all the other data needed (but not the total_size AtomicInt64 that we have today) - Unlike today, StoreTask would take in a pointer to the DataCache object. Then, we would add a DataCache::CompleteStoreTask(const StoreTask& store_task), which does the work that the StoreTask destructor does today (decrementing the counters). StoreTask becomes a simple holder of data. StoreTask's destructor calls CompleteStoreTask(this) on the DataCache pointer that was passed into the constructor. Having the allocation logic in DataCache will make it easier for us to hook up the memory tracking. http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache.cc@975 PS7, Line 975: if (buffer_limit < PAGE_SIZE) { : return Status(Substitute("Configured data cache write buffer limit $0 is too small", : FLAGS_data_cache_async_write_buffer_limit)); : } Let's require the limit to be higher. I think the minimum should be 8MB. If someone configures this to a value that is smaller than most IOs, then nothing will ever get cached. Most IOs we do are smaller than 8MB. http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache.cc@1059 PS7, Line 1059: if (UNLIKELY(current_buffer_size_.Load() + buffer_len > store_buffer_capacity_)) { : VLOG(2) << Substitute("Failed to create store task due to buffer size limitation, " : "current buffer size: $0 size limitation: $1 require: $2", : current_buffer_size_.Load(), store_buffer_capacity_, buffer_len); : ImpaladMetrics::IO_MGR_REMOTE_DATA_CACHE_ASYNC_WRITES_DROPPED_BYTES-> : Increment(buffer_len); : ImpaladMetrics::IO_MGR_REMOTE_DATA_CACHE_ASYNC_WRITES_DROPPED_ENTRIES->Increment(1); : return f
[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19475 ) Change subject: IMPALA-11886: Data cache should support asynchronous writes .. Patch Set 7: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/12503/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/19475 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312 Gerrit-Change-Number: 19475 Gerrit-PatchSet: 7 Gerrit-Owner: Anonymous Coward <18770832...@163.com> Gerrit-Reviewer: Anonymous Coward <18770832...@163.com> Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Wed, 01 Mar 2023 10:53:31 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes
18770832...@163.com has posted comments on this change. ( http://gerrit.cloudera.org:8080/19475 ) Change subject: IMPALA-11886: Data cache should support asynchronous writes .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/19475/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/19475/4//COMMIT_MSG@24 PS4, Line 24: Testing: > Thank you for your question, I have done some simple performance tests for I modified ThreadFn function in data-cache-test.cc and added a timer for writing, like this: int64_t write_start_time = UnixMicros(); for (int64_t offset : offsets) { cache->Store(custom_fname, MTIME, offset, test_buffer() + offset, TEMP_BUFFER_SIZE); } *write_time_us = UnixMicros() - start_time; -- To view, visit http://gerrit.cloudera.org:8080/19475 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312 Gerrit-Change-Number: 19475 Gerrit-PatchSet: 4 Gerrit-Owner: Anonymous Coward <18770832...@163.com> Gerrit-Reviewer: Anonymous Coward <18770832...@163.com> Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Wed, 01 Mar 2023 10:36:42 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes
Hello Joe McDonnell, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/19475 to look at the new patch set (#7). Change subject: IMPALA-11886: Data cache should support asynchronous writes .. IMPALA-11886: Data cache should support asynchronous writes This patch implements asynchronous write to the data cache to improve scan performance when cache miss happens. Previously, writes to the data cache are synchronized with hdfs file reads, and both are handled by remote hdfs IO threads. In other words, if a cache miss occurs, the IO thread needs to take additional responsibility for cache writes, which will lead to scan performance deterioration. This patch uses a thread pool for asynchronous writes, and the number of threads in the pool is determined by the new configuration 'data_cache_num_write_threads'. In asynchronous write mode, the IO thread only needs to copy data to the temporary buffer when storing data into the data cache. The additional memory consumption caused by temporary buffers can be limited, depending on the new configuration 'data_cache_write_buffer_limit'. Testing: - Add test cases for asynchronous data writing to the original DataCacheTest using different number of threads. - Add DataCacheTest,#OutOfWriteBufferLimit Used to test the limit of memory consumed by temporary buffers in the case of asynchronous writes Change-Id: I878f7486d485b6288de1a9145f49576b7155d312 --- M be/src/runtime/io/data-cache-test.cc M be/src/runtime/io/data-cache-trace.cc M be/src/runtime/io/data-cache.cc M be/src/runtime/io/data-cache.h M be/src/runtime/io/disk-io-mgr.cc M be/src/util/impalad-metrics.cc M be/src/util/impalad-metrics.h M common/thrift/metrics.json 8 files changed, 330 insertions(+), 29 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/19475/7 -- To view, visit http://gerrit.cloudera.org:8080/19475 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312 Gerrit-Change-Number: 19475 Gerrit-PatchSet: 7 Gerrit-Owner: Anonymous Coward <18770832...@163.com> Gerrit-Reviewer: Anonymous Coward <18770832...@163.com> Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell
[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes
18770832...@163.com has posted comments on this change. ( http://gerrit.cloudera.org:8080/19475 ) Change subject: IMPALA-11886: Data cache should support asynchronous writes .. Patch Set 4: (4 comments) http://gerrit.cloudera.org:8080/#/c/19475/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/19475/4//COMMIT_MSG@24 PS4, Line 24: Testing: > I'm not convinced that this change will improve performance noticeably. Whe Thank you for your question, I have done some simple performance tests for this. Firstly, I modified ThreadFn function in data-cache-test.cc and added a timer for writing, like this:X Then I calculated the average writing time for each test case. Some of the tests had similar results between synchronous and asynchronous; while there were obvious differences in some cases as shown in the following table: Test case| Policy | Sync/Async | write time(ms) MultiThreadedNoMisses| LRU| Sync | 12.20 MultiThreadedNoMisses| LRU| Async | 20.74 MultiThreadedNoMisses| LIRS | Sync |9.42 MultiThreadedNoMisses| LIRS | Async | 16.75 MultiThreadedWithMisses | LRU| Sync | 510.87 MultiThreadedWithMisses | LRU| Async | 10.06 MultiThreadedWithMisses | LIRS | Sync | 1872.11 MultiThreadedWithMisses | LIRS | Async | 11.02 MultiPartitions | LRU| Sync |1.20 MultiPartitions | LRU| Async |5.23 MultiPartitions | LIRS | Sync |1.26 MultiPartitions | LIRS | Async |7.91 AccessTraceAnonymization | LRU| Sync | 1963.89 AccessTraceAnonymization | LRU| Sync | 2073.62 AccessTraceAnonymization | LRU| Async |9.43 AccessTraceAnonymization | LRU| Async | 13.13 AccessTraceAnonymization | LIRS | Sync | 1663.93 AccessTraceAnonymization | LIRS | Sync | 1501.86 AccessTraceAnonymization | LIRS | Async | 12.83 AccessTraceAnonymization | LIRS | Async | 12.74 I also conducted some practical application tests on the HDFS data of the production environment. I selected some queries that were bottlenecks for HDFS scans and executed them several times on the newly started cluster, and recorded the time consumption (in seconds) for the HDFS scans. The results are as follows: Test case | M/H | No DataCache | Sync DataCache | Async DataCache case1 | MISS | 29.06| 89.00 | 34.15 | HIT | 25.23| 23.13 | | HIT | 30.49| 22.75 | case2 | MISS | 6.92| 9.63 | 6.33 | HIT | 5.19| 4.04 | | HIT | 6.66| 3.40 | case3 | MISS | 11.96| 36.73 | 17.41 | HIT | 7.94| 5.75 | | HIT | 15.82| 5.73 | >From the test result, I think asynchronous writing can still improve the >performance of scanning when cache miss, especially when there is cache entry >eviction. http://gerrit.cloudera.org:8080/#/c/19475/4/be/src/runtime/io/data-cache.h File be/src/runtime/io/data-cache.h: http://gerrit.cloudera.org:8080/#/c/19475/4/be/src/runtime/io/data-cache.h@271 PS4, Line 271: uint8_t* buffer_; > Yes, this should be unique_ptr. It will have a similar amount of Done http://gerrit.cloudera.org:8080/#/c/19475/4/be/src/runtime/io/data-cache.h@203 PS4, Line 203: : /// The key used for look up in the cache. : struct CacheKey { :public: : explicit CacheKey(const string& filename, int64_t mtime, int64_t offset) : : key_(filename.size() + sizeof(mtime) + sizeof(offset)) { : DCHECK_GE(mtime, 0); : DCHECK_GE(offset, 0); : key_.append(&mtime, sizeof(mtime)); : key_.append(&offset, sizeof(offset)); : key_.append(filename); : } : : int64_t Hash() const { : return HashUtil::FastHash64(key_.data(), key_.size(), 0); : } : : Slice filename() const { : return Slice(key_.data() + OFFSETOF_FILENAME, key_.size() - OFFSETOF_FILENAME); : } : : int64_t mtime() const { : return UNALIGNED_LOAD64(key_.data() + OFFSETOF_MTIME); : } : : int64_t offset() const { : return UNALIGNED_LOAD64(key_.data() + OFFSETOF_OFFSET); : } : : Slice ToSlice() const { : return key_; : } : :private: : // Key encoding stored in key_:
[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19475 ) Change subject: IMPALA-11886: Data cache should support asynchronous writes .. Patch Set 6: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/12502/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/19475 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312 Gerrit-Change-Number: 19475 Gerrit-PatchSet: 6 Gerrit-Owner: Anonymous Coward <18770832...@163.com> Gerrit-Reviewer: Anonymous Coward <18770832...@163.com> Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Wed, 01 Mar 2023 10:10:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes
Hello Joe McDonnell, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/19475 to look at the new patch set (#6). Change subject: IMPALA-11886: Data cache should support asynchronous writes .. IMPALA-11886: Data cache should support asynchronous writes This patch implements asynchronous write to the data cache to improve scan performance when cache miss happens. Previously, writes to the data cache are synchronized with hdfs file reads, and both are handled by remote hdfs IO threads. In other words, if a cache miss occurs, the IO thread needs to take additional responsibility for cache writes, which will lead to scan performance deterioration. This patch uses a thread pool for asynchronous writes, and the number of threads in the pool is determined by the new configuration 'data_cache_num_write_threads'. In asynchronous write mode, the IO thread only needs to copy data to the temporary buffer when storing data into the data cache. The additional memory consumption caused by temporary buffers can be limited, depending on the new configuration 'data_cache_write_buffer_limit'. Testing: - Add test cases for asynchronous data writing to the original DataCacheTest using different number of threads. - Add DataCacheTest,#OutOfWriteBufferLimit Used to test the limit of memory consumed by temporary buffers in the case of asynchronous writes Change-Id: I878f7486d485b6288de1a9145f49576b7155d312 --- M be/src/runtime/io/data-cache-test.cc M be/src/runtime/io/data-cache-trace.cc M be/src/runtime/io/data-cache.cc M be/src/runtime/io/data-cache.h M be/src/runtime/io/disk-io-mgr.cc M be/src/util/impalad-metrics.cc M be/src/util/impalad-metrics.h M common/thrift/metrics.json 8 files changed, 330 insertions(+), 29 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/19475/6 -- To view, visit http://gerrit.cloudera.org:8080/19475 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312 Gerrit-Change-Number: 19475 Gerrit-PatchSet: 6 Gerrit-Owner: Anonymous Coward <18770832...@163.com> Gerrit-Reviewer: Anonymous Coward <18770832...@163.com> Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell
[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/19475 ) Change subject: IMPALA-11886: Data cache should support asynchronous writes .. Patch Set 5: (4 comments) http://gerrit.cloudera.org:8080/#/c/19475/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/19475/4//COMMIT_MSG@24 PS4, Line 24: Testing: > > What circumstances show better performance? I'm not convinced that this change will improve performance noticeably. When the thread writes to the data cache, the OS should buffer the write in memory, and the writer thread should not need to wait for the actual write to the underlying IO device. That may change when we start to use Direct IO for the data cache (IMPALA-11905). I'm looking for some circumstance where data_cache_num_write_threads=N performs better than data_cache_num_write_threads=0. IMPALA-11886 mentions that performance deteriorates when using the data cache. I would like to see concrete numbers about how much this code change helps. In theory, that would apply to the first run of a query when the cache is empty. It could also apply to workloads where the data cache is too small and the cache is being thrashed (e.g. a data cache of 1GB when there is a 5GB working set). http://gerrit.cloudera.org:8080/#/c/19475/4/be/src/runtime/io/data-cache.h File be/src/runtime/io/data-cache.h: http://gerrit.cloudera.org:8080/#/c/19475/4/be/src/runtime/io/data-cache.h@271 PS4, Line 271: uint8_t* buffer_; > > I think it would be cleaner for this to be a unique_ptr. Yes, this should be unique_ptr. It will have a similar amount of code, but it does automatic cleanup of the buffer when the StoreTask is deleted. For example, if the buffer_ is nullptr, unique_ptr knows not to try to call free on it. Impala's code style prefers to use unique_ptr (or other smart pointers) wherever possible to avoid dealing with manual allocations / frees. https://google.github.io/styleguide/cppguide.html#Ownership_and_Smart_Pointers http://gerrit.cloudera.org:8080/#/c/19475/4/be/src/runtime/io/data-cache.h@203 PS4, Line 203: : /// The key used for look up in the cache. : struct CacheKey { :public: : explicit CacheKey(const string& filename, int64_t mtime, int64_t offset) : : key_(filename.size() + sizeof(mtime) + sizeof(offset)) { : DCHECK_GE(mtime, 0); : DCHECK_GE(offset, 0); : key_.append(&mtime, sizeof(mtime)); : key_.append(&offset, sizeof(offset)); : key_.append(filename); : } : : int64_t Hash() const { : return HashUtil::FastHash64(key_.data(), key_.size(), 0); : } : : Slice filename() const { : return Slice(key_.data() + OFFSETOF_FILENAME, key_.size() - OFFSETOF_FILENAME); : } : : int64_t mtime() const { : return UNALIGNED_LOAD64(key_.data() + OFFSETOF_MTIME); : } : : int64_t offset() const { : return UNALIGNED_LOAD64(key_.data() + OFFSETOF_OFFSET); : } : : Slice ToSlice() const { : return key_; : } : :private: : // Key encoding stored in key_: : // : // int64_t mtime; : // int64_t offset; : // filename; : static constexpr int OFFSETOF_MTIME = 0; : static constexpr int OFFSETOF_OFFSET = OFFSETOF_MTIME + sizeof(int64_t); : static constexpr int OFFSETOF_FILENAME = OFFSETOF_OFFSET + sizeof(int64_t); : kudu::faststring key_; : }; : : /// The class to abstruct store behavior, including copying the buffer and holding it : /// until store complete. : class StoreTask { :public: : /// Creating a store task requires the filename, mtime, offset that constitutes the : /// cache key, and the buffer and length of the cached data is required too. We : /// allocate a new buffer in the constructor and copy the cache data and update : /// total_size which keeps track of the total buffer size allocate by all store tasks. : explicit StoreTask(const std::string& filename, int64_t mtime, int64_t offset, : const uint8_t* buffer, int64_t buffer_len, AtomicInt64& total_size); : : /// When the store task is destroyed, the allocated buffer is freed and total_size is
[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes
18770832...@163.com has posted comments on this change. ( http://gerrit.cloudera.org:8080/19475 ) Change subject: IMPALA-11886: Data cache should support asynchronous writes .. Patch Set 4: (9 comments) Thanks for the code review! http://gerrit.cloudera.org:8080/#/c/19475/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/19475/4//COMMIT_MSG@24 PS4, Line 24: Testing: > What circumstances show better performance? How can I show that? Maybe post a record of the comparative tests? I have no relevant experience, please give me some advice. http://gerrit.cloudera.org:8080/#/c/19475/4/be/src/runtime/io/data-cache-test.cc File be/src/runtime/io/data-cache-test.cc: http://gerrit.cloudera.org:8080/#/c/19475/4/be/src/runtime/io/data-cache-test.cc@106 PS4, Line 106: while (cache.current_buffer_size_.Load() != 0) continue; > Let's add a short sleep so that this is not spinning. Done http://gerrit.cloudera.org:8080/#/c/19475/4/be/src/runtime/io/data-cache.h File be/src/runtime/io/data-cache.h: http://gerrit.cloudera.org:8080/#/c/19475/4/be/src/runtime/io/data-cache.h@271 PS4, Line 271: uint8_t* buffer_; > I think it would be cleaner for this to be a unique_ptr. Did you mean using unique_ptr instead of uint8_t*? I tried but did not reduce the amount of code, did I misunderstand something? http://gerrit.cloudera.org:8080/#/c/19475/4/be/src/runtime/io/data-cache.h@203 PS4, Line 203: : /// The key used for look up in the cache. : struct CacheKey { :public: : explicit CacheKey(const string& filename, int64_t mtime, int64_t offset) : : key_(filename.size() + sizeof(mtime) + sizeof(offset)) { : DCHECK_GE(mtime, 0); : DCHECK_GE(offset, 0); : key_.append(&mtime, sizeof(mtime)); : key_.append(&offset, sizeof(offset)); : key_.append(filename); : } : : int64_t Hash() const { : return HashUtil::FastHash64(key_.data(), key_.size(), 0); : } : : Slice filename() const { : return Slice(key_.data() + OFFSETOF_FILENAME, key_.size() - OFFSETOF_FILENAME); : } : : int64_t mtime() const { : return UNALIGNED_LOAD64(key_.data() + OFFSETOF_MTIME); : } : : int64_t offset() const { : return UNALIGNED_LOAD64(key_.data() + OFFSETOF_OFFSET); : } : : Slice ToSlice() const { : return key_; : } : :private: : // Key encoding stored in key_: : // : // int64_t mtime; : // int64_t offset; : // filename; : static constexpr int OFFSETOF_MTIME = 0; : static constexpr int OFFSETOF_OFFSET = OFFSETOF_MTIME + sizeof(int64_t); : static constexpr int OFFSETOF_FILENAME = OFFSETOF_OFFSET + sizeof(int64_t); : kudu::faststring key_; : }; : : /// The class to abstruct store behavior, including copying the buffer and holding it : /// until store complete. : class StoreTask { :public: : /// Creating a store task requires the filename, mtime, offset that constitutes the : /// cache key, and the buffer and length of the cached data is required too. We : /// allocate a new buffer in the constructor and copy the cache data and update : /// total_size which keeps track of the total buffer size allocate by all store tasks. : explicit StoreTask(const std::string& filename, int64_t mtime, int64_t offset, : const uint8_t* buffer, int64_t buffer_len, AtomicInt64& total_size); : : /// When the store task is destroyed, the allocated buffer is freed and total_size is : /// updated. : ~StoreTask(); : : const CacheKey& key() const { return key_; } : const uint8_t* buffer() const { return buffer_; } : int64_t buffer_len() const { return buffer_len_; } : :private: : DISALLOW_COPY_AND_ASSIGN(StoreTask); : : CacheKey key_; : uint8_t* buffer_; : int64_t buffer_len_; : AtomicInt64& total_size_; : }; > Small thing: It would be nice to keep these structures defined in > data-cache.cc. The way to do that is to have a
[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19475 ) Change subject: IMPALA-11886: Data cache should support asynchronous writes .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/12384/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/19475 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312 Gerrit-Change-Number: 19475 Gerrit-PatchSet: 5 Gerrit-Owner: Anonymous Coward <18770832...@163.com> Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Thu, 16 Feb 2023 03:09:17 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes
Hello Joe McDonnell, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/19475 to look at the new patch set (#5). Change subject: IMPALA-11886: Data cache should support asynchronous writes .. IMPALA-11886: Data cache should support asynchronous writes This patch implements asynchronous write to the data cache to improve scan performance when cache miss happens. Previously, writes to the data cache are synchronized with hdfs file reads, and both are handled by remote hdfs IO threads. In other words, if a cache miss occurs, the IO thread needs to take additional responsibility for cache writes, which will lead to scan performance deterioration. This patch uses a thread pool for asynchronous writes, and the number of threads in the pool is determined by the new configuration 'data_cache_num_write_threads'. In asynchronous write mode, the IO thread only needs to copy data to the temporary buffer when storing data into the data cache. The additional memory consumption caused by temporary buffers can be limited, depending on the new configuration 'data_cache_write_buffer_limit'. Testing: - Add test cases for asynchronous data writing to the original DataCacheTest using different number of threads. - Add DataCacheTest,#OutOfWriteBufferLimit Used to test the limit of memory consumed by temporary buffers in the case of asynchronous writes Change-Id: I878f7486d485b6288de1a9145f49576b7155d312 --- M be/src/runtime/io/data-cache-test.cc M be/src/runtime/io/data-cache.cc M be/src/runtime/io/data-cache.h M be/src/util/impalad-metrics.cc M be/src/util/impalad-metrics.h M common/thrift/metrics.json 6 files changed, 366 insertions(+), 69 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/19475/5 -- To view, visit http://gerrit.cloudera.org:8080/19475 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312 Gerrit-Change-Number: 19475 Gerrit-PatchSet: 5 Gerrit-Owner: Anonymous Coward <18770832...@163.com> Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell
[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/19475 ) Change subject: IMPALA-11886: Data cache should support asynchronous writes .. Patch Set 4: (10 comments) http://gerrit.cloudera.org:8080/#/c/19475/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/19475/4//COMMIT_MSG@24 PS4, Line 24: Testing: What circumstances show better performance? http://gerrit.cloudera.org:8080/#/c/19475/4/be/src/runtime/io/data-cache-test.cc File be/src/runtime/io/data-cache-test.cc: http://gerrit.cloudera.org:8080/#/c/19475/4/be/src/runtime/io/data-cache-test.cc@106 PS4, Line 106: while (cache.current_buffer_size_.Load() != 0) continue; Let's add a short sleep so that this is not spinning. http://gerrit.cloudera.org:8080/#/c/19475/4/be/src/runtime/io/data-cache.h File be/src/runtime/io/data-cache.h: http://gerrit.cloudera.org:8080/#/c/19475/4/be/src/runtime/io/data-cache.h@271 PS4, Line 271: uint8_t* buffer_; I think it would be cleaner for this to be a unique_ptr. http://gerrit.cloudera.org:8080/#/c/19475/4/be/src/runtime/io/data-cache.h@203 PS4, Line 203: : /// The key used for look up in the cache. : struct CacheKey { :public: : explicit CacheKey(const string& filename, int64_t mtime, int64_t offset) : : key_(filename.size() + sizeof(mtime) + sizeof(offset)) { : DCHECK_GE(mtime, 0); : DCHECK_GE(offset, 0); : key_.append(&mtime, sizeof(mtime)); : key_.append(&offset, sizeof(offset)); : key_.append(filename); : } : : int64_t Hash() const { : return HashUtil::FastHash64(key_.data(), key_.size(), 0); : } : : Slice filename() const { : return Slice(key_.data() + OFFSETOF_FILENAME, key_.size() - OFFSETOF_FILENAME); : } : : int64_t mtime() const { : return UNALIGNED_LOAD64(key_.data() + OFFSETOF_MTIME); : } : : int64_t offset() const { : return UNALIGNED_LOAD64(key_.data() + OFFSETOF_OFFSET); : } : : Slice ToSlice() const { : return key_; : } : :private: : // Key encoding stored in key_: : // : // int64_t mtime; : // int64_t offset; : // filename; : static constexpr int OFFSETOF_MTIME = 0; : static constexpr int OFFSETOF_OFFSET = OFFSETOF_MTIME + sizeof(int64_t); : static constexpr int OFFSETOF_FILENAME = OFFSETOF_OFFSET + sizeof(int64_t); : kudu::faststring key_; : }; : : /// The class to abstruct store behavior, including copying the buffer and holding it : /// until store complete. : class StoreTask { :public: : /// Creating a store task requires the filename, mtime, offset that constitutes the : /// cache key, and the buffer and length of the cached data is required too. We : /// allocate a new buffer in the constructor and copy the cache data and update : /// total_size which keeps track of the total buffer size allocate by all store tasks. : explicit StoreTask(const std::string& filename, int64_t mtime, int64_t offset, : const uint8_t* buffer, int64_t buffer_len, AtomicInt64& total_size); : : /// When the store task is destroyed, the allocated buffer is freed and total_size is : /// updated. : ~StoreTask(); : : const CacheKey& key() const { return key_; } : const uint8_t* buffer() const { return buffer_; } : int64_t buffer_len() const { return buffer_len_; } : :private: : DISALLOW_COPY_AND_ASSIGN(StoreTask); : : CacheKey key_; : uint8_t* buffer_; : int64_t buffer_len_; : AtomicInt64& total_size_; : }; Small thing: It would be nice to keep these structures defined in data-cache.cc. The way to do that is to have a forward declaration of CacheKey and StoreTask here in data-cache.h, move these definitions to data-cache.cc, then also move the definition of DataCache's constructor and destructor to data-cache.cc. Here in data-cache.h: explicit DataCache(const std::string config, bool trace_replay = false); ~Data
[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19475 ) Change subject: IMPALA-11886: Data cache should support asynchronous writes .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/12319/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/19475 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312 Gerrit-Change-Number: 19475 Gerrit-PatchSet: 4 Gerrit-Owner: Anonymous Coward <18770832...@163.com> Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 07 Feb 2023 02:40:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes
Hello Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/19475 to look at the new patch set (#4). Change subject: IMPALA-11886: Data cache should support asynchronous writes .. IMPALA-11886: Data cache should support asynchronous writes This patch implements asynchronous write to the data cache to improve scan performance when cache miss happens. Previously, writes to the data cache are synchronized with hdfs file reads, and both are handled by remote hdfs IO threads. In other words, if a cache miss occurs, the IO thread needs to take additional responsibility for cache writes, which will lead to scan performance deterioration. This patch uses a thread pool for asynchronous writes, and the number of threads in the pool is determined by the new configuration 'data_cache_num_write_threads'. In asynchronous write mode, the IO thread only needs to copy data to the temporary buffer when storing data into the data cache. The additional memory consumption caused by temporary buffers can be limited, depending on the new configuration 'data_cache_write_buffer_limit'. Testing: - Add test cases for asynchronous data writing to the original DataCacheTest using different number of threads. - Add DataCacheTest,#OutOfWriteBufferLimit Used to test the limit of memory consumed by temporary buffers in the case of asynchronous writes Change-Id: I878f7486d485b6288de1a9145f49576b7155d312 --- M be/src/runtime/io/data-cache-test.cc M be/src/runtime/io/data-cache.cc M be/src/runtime/io/data-cache.h M be/src/util/impalad-metrics.cc M be/src/util/impalad-metrics.h M common/thrift/metrics.json 6 files changed, 332 insertions(+), 68 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/19475/4 -- To view, visit http://gerrit.cloudera.org:8080/19475 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312 Gerrit-Change-Number: 19475 Gerrit-PatchSet: 4 Gerrit-Owner: Anonymous Coward <18770832...@163.com> Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19475 ) Change subject: IMPALA-11886: Data cache should support asynchronous writes .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/12315/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/19475 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312 Gerrit-Change-Number: 19475 Gerrit-PatchSet: 1 Gerrit-Owner: Anonymous Coward <18770832...@163.com> Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 06 Feb 2023 12:42:17 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes
Hello Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/19475 to look at the new patch set (#3). Change subject: IMPALA-11886: Data cache should support asynchronous writes .. IMPALA-11886: Data cache should support asynchronous writes This patch implements asynchronous write to the data cache to improve scan performance when cache miss happens. Previously, writes to the data cache are synchronized with hdfs file reads, and both are handled by remote hdfs IO threads. In other words, if a cache miss occurs, the IO thread needs to take additional responsibility for cache writes, which will lead to scan performance deterioration. This patch uses a thread pool for asynchronous writes, and the number of threads in the pool is determined by the new configuration 'data_cache_num_write_threads'. In asynchronous write mode, the IO thread only needs to copy data to the temporary buffer when storing data into the data cache. The additional memory consumption caused by temporary buffers can be limited, depending on the new configuration 'data_cache_write_buffer_limit'. Testing: - Add test cases for asynchronous data writing to the original DataCacheTest using different number of threads. - Add DataCacheTest,#OutOfWriteBufferLimit Used to test the limit of memory consumed by temporary buffers in the case of asynchronous writes Change-Id: I878f7486d485b6288de1a9145f49576b7155d312 --- M be/src/runtime/io/data-cache-test.cc M be/src/runtime/io/data-cache.cc M be/src/runtime/io/data-cache.h M be/src/util/impalad-metrics.cc M be/src/util/impalad-metrics.h M common/thrift/metrics.json 6 files changed, 330 insertions(+), 68 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/19475/3 -- To view, visit http://gerrit.cloudera.org:8080/19475 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312 Gerrit-Change-Number: 19475 Gerrit-PatchSet: 3 Gerrit-Owner: Anonymous Coward <18770832...@163.com> Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes
Hello Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/19475 to look at the new patch set (#2). Change subject: IMPALA-11886: Data cache should support asynchronous writes .. IMPALA-11886: Data cache should support asynchronous writes This patch implements asynchronous write to the data cache to improve scan performance when cache miss happens. Previously, writes to the data cache are synchronized with hdfs file reads, and both are handled by remote hdfs IO threads. In other words, if a cache miss occurs, the IO thread needs to take additional responsibility for cache writes, which will lead to scan performance deterioration. This patch uses a thread pool for asynchronous writes, and the number of threads in the pool is determined by the new configuration 'data_cache_num_write_threads'. In asynchronous write mode, the IO thread only needs to copy data to the temporary buffer when storing data into the data cache. The additional memory consumption caused by temporary buffers can be limited, depending on the new configuration 'data_cache_write_buffer_limit'. Testing: - Add test cases for asynchronous data writing to the original DataCacheTest using different number of threads. - Add DataCacheTest,#OutOfWriteBufferLimit Used to test the limit of memory consumed by temporary buffers in the case of asynchronous writes Change-Id: I878f7486d485b6288de1a9145f49576b7155d312 --- M be/src/runtime/io/data-cache-test.cc M be/src/runtime/io/data-cache.cc M be/src/runtime/io/data-cache.h M be/src/util/impalad-metrics.cc M be/src/util/impalad-metrics.h M common/thrift/metrics.json 6 files changed, 330 insertions(+), 68 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/19475/2 -- To view, visit http://gerrit.cloudera.org:8080/19475 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312 Gerrit-Change-Number: 19475 Gerrit-PatchSet: 2 Gerrit-Owner: Anonymous Coward <18770832...@163.com> Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes
18770832...@163.com has uploaded this change for review. ( http://gerrit.cloudera.org:8080/19475 Change subject: IMPALA-11886: Data cache should support asynchronous writes .. IMPALA-11886: Data cache should support asynchronous writes This patch implements asynchronous write to the data cache to improve scan performance when cache miss happens. Previously, writes to the data cache are synchronized with hdfs file reads, and both are handled by remote hdfs IO threads. In other words, if a cache miss occurs, the IO thread needs to take additional responsibility for cache writes, which will lead to scan performance deterioration. This patch uses a thread pool for asynchronous writes, and the number of threads in the pool is determined by the new configuration 'data_cache_num_write_threads'. In asynchronous write mode, the IO thread only needs to copy data to the temporary buffer when storing data into the data cache. The additional memory consumption caused by temporary buffers can be limited, depending on the new configuration 'data_cache_write_buffer_limit'. Testing: - Add test cases for asynchronous data writing to the original DataCacheTest using different number of threads. - Add DataCacheTest,#OutOfWriteBufferLimit Used to test the limit of memory consumed by temporary buffers in the case of asynchronous writes Change-Id: I878f7486d485b6288de1a9145f49576b7155d312 --- M be/src/runtime/io/data-cache-test.cc M be/src/runtime/io/data-cache.cc M be/src/runtime/io/data-cache.h M be/src/util/impalad-metrics.cc M be/src/util/impalad-metrics.h M common/thrift/metrics.json 6 files changed, 330 insertions(+), 68 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/19475/1 -- To view, visit http://gerrit.cloudera.org:8080/19475 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312 Gerrit-Change-Number: 19475 Gerrit-PatchSet: 1 Gerrit-Owner: Anonymous Coward <18770832...@163.com>
[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19475 ) Change subject: IMPALA-11886: Data cache should support asynchronous writes .. Patch Set 1: (8 comments) http://gerrit.cloudera.org:8080/#/c/19475/1/be/src/runtime/io/data-cache-test.cc File be/src/runtime/io/data-cache-test.cc: http://gerrit.cloudera.org:8080/#/c/19475/1/be/src/runtime/io/data-cache-test.cc@104 PS1, Line 104: /// before any lookup when running test case in async write mode. line has trailing whitespace http://gerrit.cloudera.org:8080/#/c/19475/1/be/src/runtime/io/data-cache.h File be/src/runtime/io/data-cache.h: http://gerrit.cloudera.org:8080/#/c/19475/1/be/src/runtime/io/data-cache.h@499 PS1, Line 499: /// Limit of the total buffer size used by asynchronous store tasks, when the current line has trailing whitespace http://gerrit.cloudera.org:8080/#/c/19475/1/be/src/runtime/io/data-cache.cc File be/src/runtime/io/data-cache.cc: http://gerrit.cloudera.org:8080/#/c/19475/1/be/src/runtime/io/data-cache.cc@117 PS1, Line 117: DEFINE_int32(data_cache_num_write_threads, 0, line has trailing whitespace http://gerrit.cloudera.org:8080/#/c/19475/1/be/src/runtime/io/data-cache.cc@122 PS1, Line 122: DEFINE_string(data_cache_write_buffer_limit, "1GB", line has trailing whitespace http://gerrit.cloudera.org:8080/#/c/19475/1/be/src/runtime/io/data-cache.cc@895 PS1, Line 895: int64_t buffer_limit = ParseUtil::ParseMemSpec(FLAGS_data_cache_write_buffer_limit, line has trailing whitespace http://gerrit.cloudera.org:8080/#/c/19475/1/be/src/util/impalad-metrics.cc File be/src/util/impalad-metrics.cc: http://gerrit.cloudera.org:8080/#/c/19475/1/be/src/util/impalad-metrics.cc@84 PS1, Line 84: const char* ImpaladMetricKeys::IO_MGR_REMOTE_DATA_CACHE_ACTIVE_BUFFER_BYTES = line has trailing whitespace http://gerrit.cloudera.org:8080/#/c/19475/1/be/src/util/impalad-metrics.cc@86 PS1, Line 86: const char* ImpaladMetricKeys::IO_MGR_REMOTE_DATA_CACHE_STORE_TASKS_CREATED = line has trailing whitespace http://gerrit.cloudera.org:8080/#/c/19475/1/be/src/util/impalad-metrics.cc@88 PS1, Line 88: const char* ImpaladMetricKeys::IO_MGR_REMOTE_DATA_CACHE_OUT_OF_BUFFER_LIMIT_BYTES = line has trailing whitespace -- To view, visit http://gerrit.cloudera.org:8080/19475 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312 Gerrit-Change-Number: 19475 Gerrit-PatchSet: 1 Gerrit-Owner: Anonymous Coward <18770832...@163.com> Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 06 Feb 2023 12:21:10 + Gerrit-HasComments: Yes