[
https://issues.apache.org/jira/browse/HDFS-16639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
BugFinder updated HDFS-16639:
-----------------------------
Description:
Hi,
We have been profiling the performance of a few versions of HDFS (including
3.0.0 and 3.3.3) with our in-house tools, and we have noticed some places for
possible optimization. According to what we have seen, the method
org.apache.hadoop.hdfs.util.LightWeightHashSet.resize
has possibly quadratic behavior (linear at the least), which might be
impactful depending on the data stored in the instance (e.g. too many blocks
to be removed, as in HDFS-16574). Although this behavior might be reasonable or
even unnoticeable in some cases, it could become an issue and a source of
performance degradation when it runs under wide locks, as in the following
call chain:
FSNamesystem.reportBadBlocks *// Holding the write lock*
BlockManager.findAndMarkBlockAsCorrupt
BlockManager.markBlockAsCorrupt
BlockManager.addToInvalidates
InvalidateBlocks.add
LightWeightHashSet.add
LightWeightHashSet.expandIfNecessary
LightWeightHashSet.resize
Several call trees seem to end in resize while holding locks, so an
improvement there could improve NameNode (NN) performance in many cases. Of
course, not all of these call trees are problematic in every workload. We do
not have a proposed solution yet, as we are still doing exploratory work with
our in-house tools. We believe this issue is present not only in 3.0.0 but
also in more recent versions.
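To make the concern concrete, here is a minimal sketch of why a resize is at least linear in the number of stored entries, and why that matters when the caller holds a write lock. It does not reproduce the Hadoop code: the class ResizeCostSketch, the initial capacity of 16, the 0.75 load factor and the doubling policy are all assumptions chosen for illustration, and the lock only stands in for the namesystem write lock.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Simplified chained hash set for illustration; NOT Hadoop's LightWeightHashSet. */
public class ResizeCostSketch {
    /** Singly linked bucket entry. */
    private static final class Node {
        final long value;
        Node next;
        Node(long value, Node next) { this.value = value; this.next = next; }
    }

    private Node[] buckets = new Node[16]; // assumed initial capacity
    private int size = 0;
    private long moved = 0;                // entries relinked by all resizes so far

    private static int indexFor(long value, int capacity) {
        return (Long.hashCode(value) & 0x7fffffff) % capacity;
    }

    public boolean contains(long value) {
        for (Node n = buckets[indexFor(value, buckets.length)]; n != null; n = n.next) {
            if (n.value == value) return true;
        }
        return false;
    }

    public boolean add(long value) {
        if (contains(value)) return false;
        int i = indexFor(value, buckets.length);
        buckets[i] = new Node(value, buckets[i]);
        size++;
        // Assumed policy: double the table once the load factor exceeds 0.75.
        if (size > buckets.length * 3 / 4) resize(buckets.length * 2);
        return true;
    }

    /** Walks every stored entry once, so a single call costs O(size). */
    private void resize(int newCapacity) {
        Node[] fresh = new Node[newCapacity];
        for (Node head : buckets) {
            Node n = head;
            while (n != null) {
                Node next = n.next;
                int i = indexFor(n.value, newCapacity);
                n.next = fresh[i];
                fresh[i] = n;
                moved++;
                n = next;
            }
        }
        buckets = fresh;
    }

    public int size() { return size; }
    public long movedDuringResize() { return moved; }

    /** Adds n distinct values while holding a write lock; returns entries relinked. */
    public static long simulate(int n) {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock(); // stand-in lock
        ResizeCostSketch set = new ResizeCostSketch();
        lock.writeLock().lock();
        try {
            for (long v = 0; v < n; v++) set.add(v);
        } finally {
            lock.writeLock().unlock();
        }
        return set.movedDuringResize();
    }

    public static void main(String[] args) {
        System.out.println("entries relinked for 100000 adds: " + simulate(100_000));
    }
}
```

Under the doubling policy assumed here, the total relinking work stays within a small multiple of the number of adds, but each individual resize still touches every stored entry while the lock is held. A different growth policy or a poor hash distribution could make the total worse; the sketch makes no claim about which case applies to the real class.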
was:
Hi,
We have been profiling the performance of a few versions of HDFS (including
3.0.0) with our in-house tools, and we have noticed some places for possible
optimization. According to what we have seen, the method
org.apache.hadoop.hdfs.util.LightWeightHashSet.resize
has possibly quadratic behavior (linear at the least), which might be
impactful depending on the data stored in the instance (e.g. too many blocks
to be removed, like
[here|https://issues.apache.org/jira/browse/HDFS-16574]). Although this
behavior might be reasonable or even unnoticeable in some cases, it could
become an issue and a source of performance degradation when it runs under
wide locks, as in the following call chain:
FSNamesystem.reportBadBlocks *// Holding the write lock*
BlockManager.findAndMarkBlockAsCorrupt
BlockManager.markBlockAsCorrupt
BlockManager.addToInvalidates
InvalidateBlocks.add
LightWeightHashSet.add
LightWeightHashSet.expandIfNecessary
LightWeightHashSet.resize
Several call trees seem to end in resize while holding locks, so an
improvement there could improve NameNode (NN) performance in many cases. Of
course, not all of these call trees are problematic in every workload. We do
not have a proposed solution yet, as we are still doing exploratory work with
our in-house tools. We believe this issue is present not only in 3.0.0 but
also in more recent versions.
> LightWeightHashSet.resize possibly quadratic behavior could affect performance
> ------------------------------------------------------------------------------
>
> Key: HDFS-16639
> URL: https://issues.apache.org/jira/browse/HDFS-16639
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs
> Affects Versions: 3.0.0, 3.3.3
> Reporter: BugFinder
> Priority: Major