BugFinder created HDFS-16639:
--------------------------------

             Summary: LightWeightHashSet.resize possibly quadratic behavior 
could affect performance
                 Key: HDFS-16639
                 URL: https://issues.apache.org/jira/browse/HDFS-16639
             Project: Hadoop HDFS
          Issue Type: Test
          Components: hdfs
    Affects Versions: 3.0.0
            Reporter: BugFinder


Hi,

We have been profiling the performance of a few versions of HDFS (including 
3.0.0) with our in-house tools, and we have noticed some places for possible 
optimization. According to what we have seen, the method 

org.apache.hadoop.hdfs.util.LightWeightHashSet.resize

has possibly quadratic behavior (linear at the least), which might be 
impactful depending on the data stored in the instance (e.g. too many blocks 
to be removed, as 
[here|https://issues.apache.org/jira/browse/HDFS-16574]). Although this 
behavior might be reasonable or even unnoticeable in some cases, it can become 
a problem when resize runs under a wide lock, as in

FSNamesystem.reportBadBlocks *// Holding the write lock*
  BlockManager.findAndMarkBlockAsCorrupt
    BlockManager.markBlockAsCorrupt
      BlockManager.addToInvalidates
        InvalidateBlocks.add
          LightWeightHashSet.add
            LightWeightHashSet.expandIfNecessary
              LightWeightHashSet.resize

 

There are several call trees that seem to end in resize while holding locks. 
Not all of these are bad or, better said, not all of these are problematic in 
every workload. We do not have a proposal for a solution yet, as we are doing 
exploratory work with our in-house tools. We believe this issue is present not 
only in 3.0.0 but also in more recent versions. 
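
To make the concern concrete, here is a minimal sketch of the pattern. It is 
not the actual LightWeightHashSet code: the class ResizeCostSketch, its 
fields, the 0.75 load factor and the ReentrantReadWriteLock standing in for 
the namesystem write lock are all assumptions made for illustration. It only 
shows that a single add can trigger a resize that relinks every stored 
element while the lock is held.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical, simplified chained hash set; NOT the real LightWeightHashSet. */
public class ResizeCostSketch {
    private static final class Node {
        final long value;
        Node next;
        Node(long value, Node next) { this.value = value; this.next = next; }
    }

    private Node[] buckets = new Node[16]; // capacity is kept a power of two
    private int size = 0;
    private long totalRelinked = 0;  // elements moved by all resizes so far
    private long longestResize = 0;  // elements moved by the largest single resize

    // Assumption: stands in for a wide lock such as the namesystem write lock.
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    private static int indexFor(long value, int capacity) {
        return Long.hashCode(value) & (capacity - 1);
    }

    /** Adds a value; the whole call, including any resize, runs under the write lock. */
    public boolean add(long value) {
        lock.writeLock().lock();
        try {
            int idx = indexFor(value, buckets.length);
            for (Node n = buckets[idx]; n != null; n = n.next) {
                if (n.value == value) {
                    return false; // already present
                }
            }
            buckets[idx] = new Node(value, buckets[idx]);
            size++;
            if (size > buckets.length * 3 / 4) { // assumed load factor of 0.75
                resize(buckets.length * 2);
            }
            return true;
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Relinks every stored element into a new table: linear in the current size. */
    private void resize(int newCapacity) {
        Node[] fresh = new Node[newCapacity];
        long moved = 0;
        for (Node head : buckets) {
            Node n = head;
            while (n != null) {
                Node next = n.next;
                int idx = indexFor(n.value, newCapacity);
                n.next = fresh[idx];
                fresh[idx] = n;
                n = next;
                moved++;
            }
        }
        buckets = fresh;
        totalRelinked += moved;
        longestResize = Math.max(longestResize, moved);
    }

    public int size() { return size; }
    public long totalRelinked() { return totalRelinked; }
    public long longestResize() { return longestResize; }

    public static void main(String[] args) {
        ResizeCostSketch set = new ResizeCostSketch();
        for (long i = 0; i < 1_000_000; i++) {
            set.add(i);
        }
        System.out.println("size=" + set.size()
            + " totalRelinked=" + set.totalRelinked()
            + " longestResize=" + set.longestResize());
    }
}
```

In this sketch the total work across all resizes stays proportional to the 
number of elements, but the largest single resize is itself proportional to 
the set size, and every other thread waiting on the lock is blocked for that 
long. Whether the real implementation behaves the same way would need to be 
confirmed against the HDFS source.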



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
