[ https://issues.apache.org/jira/browse/HDDS-2331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16955814#comment-16955814 ]

Shashikant Banerjee edited comment on HDDS-2331 at 10/21/19 7:04 AM:
---------------------------------------------------------------------

In Ozone, the buffer size defaults to the chunk size (16 MB by default). When a 
write call happens, a buffer is allocated and data is written only into that 
buffer until it fills up or the stream is flushed/closed; the data is then 
pushed to the datanode. The buffer is released only after the watchForCommit 
call for the corresponding putBlock log index succeeds. So until the Ozone 
client's watchForCommit call is acknowledged, we keep holding onto the buffer 
so that, if the Ratis request fails, the user data is still cached in the 
client buffer and can be written to the next block.
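
To make that lifecycle concrete, here is a minimal sketch of the behaviour 
described above. This is not the actual Ozone client code: the class and the 
names {{write}}, {{flushChunk}}, {{sendWriteChunkAndPutBlock}} and the blocking 
{{watchForCommit}} below are illustrative assumptions only.

{code:java}
// Illustrative sketch of buffer retention, not the real BlockOutputStream.
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;

class ChunkBufferSketch {
  private static final int CHUNK_SIZE = 16 * 1024 * 1024; // 16 MB default

  // Buffers already pushed to the datanode whose putBlock log index has not
  // yet been acknowledged via watchForCommit.
  private final Deque<ByteBuffer> retained = new ArrayDeque<>();
  private ByteBuffer current = ByteBuffer.allocate(CHUNK_SIZE);

  void write(byte[] data) {
    current.put(data);            // data only goes into the client buffer
    if (!current.hasRemaining()) {
      flushChunk();               // full (flush/close would also trigger this)
    }
  }

  private void flushChunk() {
    long logIndex = sendWriteChunkAndPutBlock(current); // hypothetical RPC
    retained.add(current);        // keep the data in case it must be replayed
    current = ByteBuffer.allocate(CHUNK_SIZE);
    watchForCommit(logIndex);     // buffer is released only after this succeeds
  }

  private void watchForCommit(long logIndex) {
    // Simplified: wait until the Ratis log index is committed; on success the
    // oldest retained buffer can finally be dropped. On failure the retained
    // data would be written to the next block instead.
    retained.poll();
  }

  private long sendWriteChunkAndPutBlock(ByteBuffer buf) {
    return 0L; // placeholder for the actual datanode call
  }
}
{code}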

We have had multiple discussions around reducing the default buffer size and 
implementing a true streaming client, but this is still under consideration. 

[~adoroszlai], for your test, you can try changing the default chunk size to, 
say, 1 MB and see if it works well. It is also possible that buffer release 
handling was broken by some recent changes; this needs to be verified.
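
For reference, a sketch of that change, assuming the chunk size in this version 
is still controlled by {{ozone.scm.chunk.size}} in ozone-site.xml (property name 
and value syntax should be checked against ozone-default.xml for your build):

{noformat}
<!-- ozone-site.xml: reduce the per-chunk (and hence per-buffer) size to 1 MB -->
<property>
  <name>ozone.scm.chunk.size</name>
  <value>1MB</value>
</property>
{noformat}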



> Client OOME due to buffer retention
> -----------------------------------
>
>                 Key: HDDS-2331
>                 URL: https://issues.apache.org/jira/browse/HDDS-2331
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: Ozone Client
>    Affects Versions: 0.5.0
>            Reporter: Attila Doroszlai
>            Assignee: Shashikant Banerjee
>            Priority: Critical
>         Attachments: profiler.png
>
>
> Freon random key generator exhausts the default heap after just a few hundred 
> 1 MB keys.  A heap dump on OOME reveals 150+ instances of 
> {{ContainerCommandRequestMessage}}, each holding a 16 MB {{byte[]}}.
> Steps to reproduce:
> # Start an Ozone cluster with 1 datanode
> # Start Freon (5K keys of 1 MB each)
> Result: OOME after a few hundred keys
> {noformat}
> $ cd hadoop-ozone/dist/target/ozone-0.5.0-SNAPSHOT/compose/ozone
> $ docker-compose up -d
> $ docker-compose exec scm bash
> $ export HADOOP_OPTS='-XX:+HeapDumpOnOutOfMemoryError'
> $ ozone freon rk --numOfThreads 1 --numOfVolumes 1 --numOfBuckets 1 
> --replicationType RATIS --factor ONE --keySize 1048576 --numOfKeys 5120 
> --bufferSize 65536
> ...
> java.lang.OutOfMemoryError: Java heap space
> Dumping heap to java_pid289.hprof ...
> Heap dump file created [1456141975 bytes in 7.760 secs]
> {noformat}


