[
https://issues.apache.org/jira/browse/HDFS-13601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16483210#comment-16483210
]
Andrew Wang commented on HDFS-13601:
------------------------------------
I've attached a patch to give a flavor of the approach and to trigger a
precommit run. The basic idea is to cache the encoded form of strings that are
likely to be fixed, or that come from a limited set. I tested this on CDH 5, but
the same findings should also apply to trunk. For a pure listing-with-locations
workload, JMC shows TLAB allocation dropping from 499MB/s to 384MB/s after
applying this patch. Previously, 13% of stacks showed up in
StringEncoder.encode (converting from String to byte array for PB); that is now
down to 5.6%. The hotspot is now creating the LocatedBlocks and adding all the
StorageIDs, which is a separate problem to tackle.
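
To illustrate the caching idea described above, here is a minimal sketch; it is
not the code in the attached patch. The class name EncodedStringCache, the
maxEntries bound, and the generic encoder parameter are all hypothetical. In
PBHelper the encoder would presumably be something like
ByteString::copyFromUtf8, but the sketch stays generic so it compiles without
protobuf on the classpath.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/**
 * Hypothetical sketch: caches the encoded form of strings that are expected
 * to come from a small, mostly fixed set, so that each distinct string is
 * encoded once instead of on every RPC response.
 */
final class EncodedStringCache<V> {
    private final ConcurrentHashMap<String, V> cache = new ConcurrentHashMap<>();
    private final Function<String, V> encoder;
    private final int maxEntries;

    EncodedStringCache(Function<String, V> encoder, int maxEntries) {
        this.encoder = encoder;
        this.maxEntries = maxEntries;
    }

    /** Returns the cached encoding of key, encoding it on first use. */
    V get(String key) {
        V cached = cache.get(key);
        if (cached != null) {
            return cached;
        }
        V encoded = encoder.apply(key);
        // Soft bound: stop adding entries once the cache is full, so an
        // unexpectedly large key set cannot grow the map without limit.
        if (cache.size() < maxEntries) {
            V previous = cache.putIfAbsent(key, encoded);
            if (previous != null) {
                return previous; // another thread cached it first
            }
        }
        return encoded;
    }

    int size() {
        return cache.size();
    }
}
```

The cached value should be immutable (as ByteString is), since the same
instance is handed to every caller. The size bound is only approximate under
concurrent inserts, which is acceptable when the goal is just to cap memory use.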
> Optimize ByteString conversions in PBHelper
> -------------------------------------------
>
> Key: HDFS-13601
> URL: https://issues.apache.org/jira/browse/HDFS-13601
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 3.1.0, 2.9.1
> Reporter: Andrew Wang
> Assignee: Andrew Wang
> Priority: Major
> Attachments: HDFS-13601.001.patch
>
>
> While doing some profiling of the NN with JMC, I saw a lot of time being
> spent on String->ByteString conversions. These are often the same strings
> being converted over and over again, meaning there's room for optimization.