[ https://issues.apache.org/jira/browse/HADOOP-14412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jason Lowe updated HADOOP-14412:
--------------------------------
Attachment: HADOOP-14412.001.patch
Posting a patch that uses AtomicReference to remove the need for any locking.
The basic idea is to keep track of an immutable snapshot of the current state.
If the state changes, we build a new snapshot and atomically replace the
old one. When we need to look at the current state, we grab the snapshot
reference _once_ and then use only that reference for the duration of the
operation that needs to examine the state.
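
For readers who have not opened the attachment, the approach described above can be sketched roughly as follows. This is only an illustration of the general snapshot-in-an-AtomicReference pattern, not the contents of HADOOP-14412.001.patch: the class name HostSnapshotSketch, the nested HostDetails holder, the refresh method, and the include/exclude check in isValidNode (where an empty include set is assumed to admit every host) are all hypothetical names and assumptions made for the example.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical sketch of the lock-free snapshot idea; not the actual patch. */
public class HostSnapshotSketch {

  /** Immutable snapshot of the include and exclude sets (assumed shape). */
  public static final class HostDetails {
    private final Set<String> includes;
    private final Set<String> excludes;

    HostDetails(Set<String> includes, Set<String> excludes) {
      // Copy once, when the snapshot is built, and never mutate afterwards.
      this.includes = Collections.unmodifiableSet(new HashSet<>(includes));
      this.excludes = Collections.unmodifiableSet(new HashSet<>(excludes));
    }

    public Set<String> getIncludes() { return includes; }
    public Set<String> getExcludes() { return excludes; }
  }

  // The only shared mutable state: a reference to the current snapshot.
  private final AtomicReference<HostDetails> current =
      new AtomicReference<>(new HostDetails(
          Collections.<String>emptySet(), Collections.<String>emptySet()));

  /** Writer side: build a complete new snapshot, then publish it atomically. */
  public void refresh(Set<String> includes, Set<String> excludes) {
    current.set(new HostDetails(includes, excludes));
  }

  /** Reader side: hand out the shared snapshot, with no per-call copy. */
  public HostDetails getHostDetails() {
    return current.get();
  }

  /** Assumed validity rule, for illustration only. */
  public boolean isValidNode(String host) {
    // Grab the reference once so both sets come from the same snapshot,
    // even if refresh() runs concurrently.
    HostDetails details = current.get();
    Set<String> in = details.getIncludes();
    Set<String> ex = details.getExcludes();
    return (in.isEmpty() || in.contains(host)) && !ex.contains(host);
  }
}
```

In a sketch like this, the copy cost is paid once per refresh rather than once per heartbeat, and readers never see an include set from one refresh paired with an exclude set from another.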
> HostsFileReader#getHostDetails is very expensive on large clusters
> ------------------------------------------------------------------
>
> Key: HADOOP-14412
> URL: https://issues.apache.org/jira/browse/HADOOP-14412
> Project: Hadoop Common
> Issue Type: Bug
> Components: util
> Affects Versions: 2.8.0
> Reporter: Jason Lowe
> Assignee: Jason Lowe
> Attachments: HADOOP-14412.001.patch
>
>
> After upgrading one of our large clusters to 2.8 we noticed many IPC server
> threads of the resourcemanager spending time in NodesListManager#isValidNode
> which in turn was calling HostsFileReader#getHostDetails. The latter is
> creating complete copies of the include and exclude sets for every node
> heartbeat, and these sets are not small due to the size of the cluster.
> These copies are causing multiple resizes of the underlying HashSets being
> filled and creating lots of garbage.