elek commented on a change in pull request #1336: URL: https://github.com/apache/hadoop-ozone/pull/1336#discussion_r478353429
##########
File path: hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/common/ChunkBuffer.java
##########
@@ -44,9 +45,6 @@ static ChunkBuffer allocate(int capacity) {
    * When increment <= 0, entire buffer is allocated in the beginning.
    */
   static ChunkBuffer allocate(int capacity, int increment) {
-    if (increment > 0 && increment < capacity) {
-      return new IncrementalChunkBuffer(capacity, increment, false);

Review comment:

> @elek , how much of perf gain we will have if we still do incremental buffer allocation?

I can repeat the test to get exact numbers, but I couldn't get good performance without removing the incremental buffer. You can easily test it with the new unit test: if you write a lot of data with byte=1, the throughput is still low.

> IncrementalChunkBuffer was added to address cases where the ozone client was running into OOM with keys less than the chunk size, as without it the smallest buffer which will be allocated will always be equal to the chunk size (4MB by default).

I think it's a valid (and important) question, but as far as I can see it's safe to remove the `IncrementalChunkBuffer`. The situation is slightly different since HDDS-2331. I tried to test this patch with the commands from HDDS-2331:

```
ozone freon rk --numOfThreads 1 --numOfVolumes 1 --numOfBuckets 1 --replicationType RATIS --factor ONE --keySize 1048576 --numOfKeys 5120 --bufferSize 65536
```

I couldn't reproduce the OOM. Based on my understanding:

1. We already grow the allocation incrementally (one more `ByteBuffer` is added when required), but the size of each increment is 4MB.

2. A 4MB increment seems to be acceptable even with many clients in the same JVM, especially if it buys acceptable performance. Say I have 100 Ozone clients (in the same JVM!!!) which write 1kb keys. Without the `IncrementalChunkBuffer` I will have (4MB-1kb) * 100 overhead (as far as I understood). That is still <400MB in exchange for a 30-100% performance gain, which sounds like a good deal. But let me know if you see any problems here. A back-of-the-envelope version of this calculation is sketched after the list.

3. Let's say the 400MB overhead is unacceptable (or my calculation is wrong and the overhead is higher ;-) ). As far as I can see the `BufferPool` is created per key, so I think it would be possible to set the buffer size to `min(keySize, bufferSize)`. With this approach the first and only buffer of the `BufferPool` would have exactly the required size, which covers all the cases where the key size is < 4MB. A small sketch of this idea is also shown below.
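To make the <400MB estimate in point 2 concrete, here is the back-of-the-envelope arithmetic. This is only my own calculation for the scenario above (100 clients, 1kb keys, 4MB buffers); the class is illustrative and the numbers are not measurements from the patch:

```java
// Rough worst-case memory overhead of always allocating a full 4MB buffer,
// compared to allocating only what a 1kb key needs.
public final class OverheadEstimate {
  public static void main(String[] args) {
    long chunkSize = 4L * 1024 * 1024; // 4MB: default chunk/buffer size
    long keySize = 1024;               // 1kb key per client
    int clients = 100;                 // 100 Ozone clients in the same JVM
    long overheadBytes = (chunkSize - keySize) * clients;
    // Prints roughly 399 MB, which is where the <400MB figure comes from.
    System.out.printf("overhead ~%d MB%n", overheadBytes / (1024 * 1024));
  }
}
```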
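And a minimal sketch of the `min(keySize, bufferSize)` idea from point 3. `BufferSizeSketch` and `firstBufferSize` are hypothetical names used only for illustration; this is not the actual `BufferPool` API, just the sizing rule I have in mind:

```java
// Illustration of point 3: cap the size of the per-key buffer at the key
// size, so keys smaller than the chunk size never allocate a full 4MB buffer.
public final class BufferSizeSketch {
  // 4MB: the default chunk size discussed above.
  private static final int DEFAULT_BUFFER_SIZE = 4 * 1024 * 1024;

  /** Hypothetical helper: size of the first (and only) buffer for a key. */
  static int firstBufferSize(long keySize, int configuredBufferSize) {
    return (int) Math.min(keySize, configuredBufferSize);
  }

  public static void main(String[] args) {
    // A 1kb key would get a 1kb buffer instead of a full 4MB one.
    System.out.println(firstBufferSize(1024, DEFAULT_BUFFER_SIZE));
    // A 10MB key still uses the configured 4MB buffer size.
    System.out.println(firstBufferSize(10L * 1024 * 1024, DEFAULT_BUFFER_SIZE));
  }
}
```

With such a rule a 1kb key would allocate a 1kb buffer while larger keys keep the configured 4MB buffer size, so the small-key OOM concern should be covered without the `IncrementalChunkBuffer`.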