Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/19475 )
Change subject: IMPALA-11886: Data cache should support asynchronous writes ...................................................................... Patch Set 7: (13 comments) This is interesting, and I think we should go forward with merging this. I have a few code organization comments, but I think this is getting very close. I am running our test jobs with the data cache enabled and async writes turned on. To make that easier in future, can you add code in bin/run-all-tests.sh that defines a new environment variable similar to the one we have for DATA_CACHE_EVICTION_POLICY (e.g. DATA_CACHE_ASYNC_WRITE_THREADS)? See the current code we have here: https://github.com/apache/impala/blob/master/bin/run-all-tests.sh#L67 https://github.com/apache/impala/blob/master/bin/run-all-tests.sh#L82-L85 That let's us run our full test job with this enabled. http://gerrit.cloudera.org:8080/#/c/19475/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/19475/4//COMMIT_MSG@24 PS4, Line 24: Testing: > Thank you for your question, I have done some simple performance tests for Thank you for the performance numbers. This is interesting enough that we should go ahead and plan on checking this in. Can you incorporate a few of the interesting performance results into the commit message? http://gerrit.cloudera.org:8080/#/c/19475/7//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/19475/7//COMMIT_MSG@9 PS7, Line 9: write Nit: writes http://gerrit.cloudera.org:8080/#/c/19475/7//COMMIT_MSG@10 PS7, Line 10: when cache miss happens Nit: "when a cache miss happens" http://gerrit.cloudera.org:8080/#/c/19475/7//COMMIT_MSG@11 PS7, Line 11: synchronized Nit: synchronous http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache.cc File be/src/runtime/io/data-cache.cc: http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache.cc@128 PS7, Line 128: const int MAX_STORE_TASK_QUEUE_SIZE = 1 << 20; Let's add a comment here saying that this large value for the queue size is harmless because the total size of the entries on the queue are bound by the data_cache_async_write_bufer_limit. http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache.cc@385 PS7, Line 385: abstruct Nit: abstract http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache.cc@393 PS7, Line 393: explicit StoreTask(const std::string& filename, int64_t mtime, int64_t offset, : const uint8_t* buffer, int64_t buffer_len, AtomicInt64& total_size) : : key_(filename, mtime, offset), I would like to make StoreTask a simple struct with minimal logic, and move the counter update / buffer allocation outside of this class. Basically, DataCache::SubmitStoreTask() would: - Increment the counter atomically with the CompareAndSwap() subject to the limit - Allocate the memory buffer and copy into it - Increment the async writes outstanding bytes - Construct the StoreTask passing in the new buffer and all the other data needed (but not the total_size AtomicInt64 that we have today) - Unlike today, StoreTask would take in a pointer to the DataCache object. Then, we would add a DataCache::CompleteStoreTask(const StoreTask& store_task), which does the work that the StoreTask destructor does today (decrementing the counters). StoreTask becomes a simple holder of data. StoreTask's destructor calls CompleteStoreTask(this) on the DataCache pointer that was passed into the constructor. Having the allocation logic in DataCache will make it easier for us to hook up the memory tracking. http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache.cc@975 PS7, Line 975: if (buffer_limit < PAGE_SIZE) { : return Status(Substitute("Configured data cache write buffer limit $0 is too small", : FLAGS_data_cache_async_write_buffer_limit)); : } Let's require the limit to be higher. I think the minimum should be 8MB. If someone configures this to a value that is smaller than most IOs, then nothing will ever get cached. Most IOs we do are smaller than 8MB. http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache.cc@1059 PS7, Line 1059: if (UNLIKELY(current_buffer_size_.Load() + buffer_len > store_buffer_capacity_)) { : VLOG(2) << Substitute("Failed to create store task due to buffer size limitation, " : "current buffer size: $0 size limitation: $1 require: $2", : current_buffer_size_.Load(), store_buffer_capacity_, buffer_len); : ImpaladMetrics::IO_MGR_REMOTE_DATA_CACHE_ASYNC_WRITES_DROPPED_BYTES-> : Increment(buffer_len); : ImpaladMetrics::IO_MGR_REMOTE_DATA_CACHE_ASYNC_WRITES_DROPPED_ENTRIES->Increment(1); : return false; : } This would become a CompareAndSwap loop where either we are at the limit and return false or we successfully bump the current_buffer_size_. http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache.cc@1069 PS7, Line 1069: const StoreTask* task = new StoreTask(filename, mtime, offset, buffer, buffer_len, : current_buffer_size_); Move the logic from the StoreTask constructor (incrementing counters, allocating the buffer, copying the data) here and then pass in the new buffer directly to StoreTask. StoreTask also gets a pointer to this DataCache object. http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache.cc@1074 PS7, Line 1074: We would also add a CompleteStoreTask() function here that would be called from StoreTask's destructor (and do the decrements). http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/disk-io-mgr.cc File be/src/runtime/io/disk-io-mgr.cc: http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/disk-io-mgr.cc@80 PS7, Line 80: Write threads need to bound the extra memory consumption for holding the " : "temporary buffer though. Nit: Let's change this sentence to say that the extra memory for temporary buffers is limited by data_cache_async_write_buffer_limit. http://gerrit.cloudera.org:8080/#/c/19475/7/common/thrift/metrics.json File common/thrift/metrics.json: http://gerrit.cloudera.org:8080/#/c/19475/7/common/thrift/metrics.json@653 PS7, Line 653: bytes async Nit: "bytes of async" -- To view, visit http://gerrit.cloudera.org:8080/19475 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312 Gerrit-Change-Number: 19475 Gerrit-PatchSet: 7 Gerrit-Owner: Anonymous Coward <[email protected]> Gerrit-Reviewer: Anonymous Coward <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Joe McDonnell <[email protected]> Gerrit-Comment-Date: Sat, 11 Mar 2023 21:12:36 +0000 Gerrit-HasComments: Yes
