Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19475 )

Change subject: IMPALA-11886: Data cache should support asynchronous writes
......................................................................


Patch Set 7:

(13 comments)

This is interesting, and I think we should go forward with merging this. I have 
a few code organization comments, but I think this is getting very close.

I am running our test jobs with the data cache enabled and async writes turned 
on.

To make that easier in future, can you add code in bin/run-all-tests.sh that 
defines a new environment variable similar to the one we have for 
DATA_CACHE_EVICTION_POLICY (e.g. DATA_CACHE_ASYNC_WRITE_THREADS)? See the 
current code we have here:
https://github.com/apache/impala/blob/master/bin/run-all-tests.sh#L67
https://github.com/apache/impala/blob/master/bin/run-all-tests.sh#L82-L85
That let's us run our full test job with this enabled.

http://gerrit.cloudera.org:8080/#/c/19475/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19475/4//COMMIT_MSG@24
PS4, Line 24: Testing:
> Thank you for your question, I have done some simple performance tests for
Thank you for the performance numbers. This is interesting enough that we 
should go ahead and plan on checking this in. Can you incorporate a few of the 
interesting performance results into the commit message?


http://gerrit.cloudera.org:8080/#/c/19475/7//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19475/7//COMMIT_MSG@9
PS7, Line 9: write
Nit: writes


http://gerrit.cloudera.org:8080/#/c/19475/7//COMMIT_MSG@10
PS7, Line 10: when cache miss happens
Nit: "when a cache miss happens"


http://gerrit.cloudera.org:8080/#/c/19475/7//COMMIT_MSG@11
PS7, Line 11: synchronized
Nit: synchronous


http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache.cc
File be/src/runtime/io/data-cache.cc:

http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache.cc@128
PS7, Line 128: const int MAX_STORE_TASK_QUEUE_SIZE = 1 << 20;
Let's add a comment here saying that this large value for the queue size is 
harmless because the total size of the entries on the queue are bound by the 
data_cache_async_write_bufer_limit.


http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache.cc@385
PS7, Line 385: abstruct
Nit: abstract


http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache.cc@393
PS7, Line 393:   explicit StoreTask(const std::string& filename, int64_t mtime, 
int64_t offset,
             :       const uint8_t* buffer, int64_t buffer_len, AtomicInt64& 
total_size)
             :     : key_(filename, mtime, offset),
I would like to make StoreTask a simple struct with minimal logic, and move the 
counter update / buffer allocation outside of this class.

Basically, DataCache::SubmitStoreTask() would:
 - Increment the counter atomically with the CompareAndSwap() subject to the 
limit
 - Allocate the memory buffer and copy into it
 - Increment the async writes outstanding bytes
 - Construct the StoreTask passing in the new buffer and all the other data 
needed (but not the total_size AtomicInt64 that we have today)
 - Unlike today, StoreTask would take in a pointer to the DataCache object.

Then, we would add a DataCache::CompleteStoreTask(const StoreTask& store_task), 
which does the work that the StoreTask destructor does today (decrementing the 
counters).

StoreTask becomes a simple holder of data. StoreTask's destructor calls 
CompleteStoreTask(this) on the DataCache pointer that was passed into the 
constructor.

Having the allocation logic in DataCache will make it easier for us to hook up 
the memory tracking.


http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache.cc@975
PS7, Line 975:     if (buffer_limit < PAGE_SIZE) {
             :       return Status(Substitute("Configured data cache write 
buffer limit $0 is too small",
             :           FLAGS_data_cache_async_write_buffer_limit));
             :     }
Let's require the limit to be higher. I think the minimum should be 8MB. If 
someone configures this to a value that is smaller than most IOs, then nothing 
will ever get cached. Most IOs we do are smaller than 8MB.


http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache.cc@1059
PS7, Line 1059:   if (UNLIKELY(current_buffer_size_.Load() + buffer_len > 
store_buffer_capacity_)) {
              :     VLOG(2) << Substitute("Failed to create store task due to 
buffer size limitation, "
              :         "current buffer size: $0 size limitation: $1 require: 
$2",
              :         current_buffer_size_.Load(), store_buffer_capacity_, 
buffer_len);
              :     
ImpaladMetrics::IO_MGR_REMOTE_DATA_CACHE_ASYNC_WRITES_DROPPED_BYTES->
              :         Increment(buffer_len);
              :     
ImpaladMetrics::IO_MGR_REMOTE_DATA_CACHE_ASYNC_WRITES_DROPPED_ENTRIES->Increment(1);
              :     return false;
              :   }
This would become a CompareAndSwap loop where either we are at the limit and 
return false or we successfully bump the current_buffer_size_.


http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache.cc@1069
PS7, Line 1069:   const StoreTask* task = new StoreTask(filename, mtime, 
offset, buffer, buffer_len,
              :       current_buffer_size_);
Move the logic from the StoreTask constructor (incrementing counters, 
allocating the buffer, copying the data) here and then pass in the new buffer 
directly to StoreTask. StoreTask also gets a pointer to this DataCache object.


http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache.cc@1074
PS7, Line 1074:
We would also add a CompleteStoreTask() function here that would be called from 
StoreTask's destructor (and do the decrements).


http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/disk-io-mgr.cc
File be/src/runtime/io/disk-io-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/disk-io-mgr.cc@80
PS7, Line 80: Write threads need to bound the extra memory consumption for 
holding the "
            :     "temporary buffer though.
Nit: Let's change this sentence to say that the extra memory for temporary 
buffers is limited by data_cache_async_write_buffer_limit.


http://gerrit.cloudera.org:8080/#/c/19475/7/common/thrift/metrics.json
File common/thrift/metrics.json:

http://gerrit.cloudera.org:8080/#/c/19475/7/common/thrift/metrics.json@653
PS7, Line 653: bytes async
Nit: "bytes of async"



--
To view, visit http://gerrit.cloudera.org:8080/19475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
Gerrit-Change-Number: 19475
Gerrit-PatchSet: 7
Gerrit-Owner: Anonymous Coward <[email protected]>
Gerrit-Reviewer: Anonymous Coward <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Comment-Date: Sat, 11 Mar 2023 21:12:36 +0000
Gerrit-HasComments: Yes

Reply via email to