[ 
https://issues.apache.org/jira/browse/HDDS-4228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glen Geng updated HDDS-4228:
----------------------------
    Priority: Minor  (was: Blocker)

> ALLOCATE_BLOCK of scm audit log miss num
> ----------------------------------------
>
>                 Key: HDDS-4228
>                 URL: https://issues.apache.org/jira/browse/HDDS-4228
>             Project: Hadoop Distributed Data Store
>          Issue Type: Improvement
>            Reporter: Glen Geng
>            Assignee: Marton Elek
>            Priority: Minor
>              Labels: pull-request-available
>
> Teragen reported to be slow with low number of mappers compared to HDFS.
> In my test (one pipeline, 3 yarn nodes) 10 g teragen with HDFS was ~3 mins 
> but with Ozone it was 6 mins. It could be fixed with using more mappers, but 
> when I investigated the execution I found a few problems reagrding to the 
> BufferPool management.
>  1. IncrementalChunkBuffer is slow and it might not be required as BufferPool 
> itself is incremental
>  2. For each write operation the bufferPool.allocateBufferIfNeeded is called 
> which can be a slow operation (positions should be calculated).
>  3. There is no explicit support for write(byte) operations
> In the flamegraph it's clearly visible that with low number of mappers the 
> client is busy with buffer operations. After the patch the rpc call and the 
> checksum calculation give the majority of the time. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org

Reply via email to