[ 
https://issues.apache.org/jira/browse/HDFS-13601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16483210#comment-16483210
 ] 

Andrew Wang commented on HDFS-13601:
------------------------------------

I've attached a patch to give a flavor of the approach and to get a precommit 
run. The basic idea is to cache the ByteString conversions of strings that are 
likely to be fixed or to come from a limited set.

I tested this on CDH 5, but the same findings should also apply to trunk. For a 
pure listing-with-locations workload, JMC shows TLAB allocation dropping from 
499 MB/s to 384 MB/s (about 23% lower) after applying this patch. Previously, 
13% of stacks were in StringEncoder.encode (converting from String to byte 
array for PB); that is now down to 5.6%. The remaining hotspot is creating the 
LocatedBlocks and adding all the StorageIDs, which is a separate problem to 
tackle.
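
To illustrate the caching idea, here is a minimal sketch. It is not the code 
from the attached patch: the class name StringConversionCache, the maxEntries 
bound, and the generic converter are all assumptions made so the example runs 
on the JDK alone. It memoizes the result of converting a String so that 
repeated values are converted only once.

```java
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

/**
 * Hypothetical helper: caches the converted form of strings that repeat often.
 * V should be an immutable type, since cached values are shared across callers.
 */
final class StringConversionCache<V> {
    private final ConcurrentMap<String, V> cache = new ConcurrentHashMap<>();
    private final Function<String, V> converter;
    private final int maxEntries;

    StringConversionCache(Function<String, V> converter, int maxEntries) {
        this.converter = Objects.requireNonNull(converter);
        this.maxEntries = maxEntries;
    }

    /** Returns the cached conversion of key, converting it on a miss. */
    V get(String key) {
        V cached = cache.get(key);
        if (cached != null) {
            return cached;
        }
        // Assumption: the converter never returns null.
        V converted = Objects.requireNonNull(converter.apply(key));
        if (cache.size() >= maxEntries) {
            // Cache is full: still return the value, but do not store it.
            // The bound is approximate when several threads insert at once.
            return converted;
        }
        V raced = cache.putIfAbsent(key, converted);
        return raced != null ? raced : converted;
    }

    /** Number of entries currently cached. */
    int size() {
        return cache.size();
    }
}
```

With protobuf on the classpath, the converter could be 
ByteString::copyFromUtf8, e.g. new StringConversionCache<>(ByteString::copyFromUtf8, 1000). 
A size bound of some kind matters here because only strings from a limited 
set are worth caching; unbounded input such as arbitrary paths would otherwise 
grow the map without limit.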

> Optimize ByteString conversions in PBHelper
> -------------------------------------------
>
>                 Key: HDFS-13601
>                 URL: https://issues.apache.org/jira/browse/HDFS-13601
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 3.1.0, 2.9.1
>            Reporter: Andrew Wang
>            Assignee: Andrew Wang
>            Priority: Major
>         Attachments: HDFS-13601.001.patch
>
>
> While doing some profiling of the NN with JMC, I saw a lot of time being 
> spent on String->ByteString conversions. These are often the same strings 
> being converted over and over again, meaning there's room for optimization.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
