sodonnel opened a new pull request #3085:
URL: https://github.com/apache/ozone/pull/3085
## What changes were proposed in this pull request?
The container report handling code has some issues that this Jira intends to
address.
When handling full container reports, we make several copies of large sets
to identify containers SCM knows about but the DN did not report (i.e.
replicas that have somehow been lost on the DN).
Looking at the current code:
```
synchronized (datanodeDetails) {
  final List<ContainerReplicaProto> replicas =
      containerReport.getReportsList();
  final Set<ContainerID> containersInSCM =
      nodeManager.getContainers(datanodeDetails);

  final Set<ContainerID> containersInDn = replicas.parallelStream()
      .map(ContainerReplicaProto::getContainerID)
      .map(ContainerID::valueOf).collect(Collectors.toSet());

  final Set<ContainerID> missingReplicas = new HashSet<>(containersInSCM);
  missingReplicas.removeAll(containersInDn);

  processContainerReplicas(datanodeDetails, replicas, publisher);
  processMissingReplicas(datanodeDetails, missingReplicas);

  /*
   * Update the latest set of containers for this datanode in
   * NodeManager
   */
  nodeManager.setContainers(datanodeDetails, containersInDn);
  containerManager.notifyContainerReportProcessing(true, true);
}
```
The Set "containersInSCM" comes from NodeStateMap:
```
public Set<ContainerID> getContainers(UUID uuid)
    throws NodeNotFoundException {
  lock.readLock().lock();
  try {
    checkIfNodeExist(uuid);
    return Collections
        .unmodifiableSet(new HashSet<>(nodeToContainer.get(uuid)));
  } finally {
    lock.readLock().unlock();
  }
}
```
```
This already returns a new copy of the set, so there is no need to wrap it
in an unmodifiable view, and the report handler does not need to copy it again.
The current code ends up with three copies of this potentially large set.
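As a sketch of that simplification, the defensive `HashSet` copy alone is
enough to isolate callers from the internal map, so the `unmodifiableSet`
wrapper can be dropped. This uses a hypothetical `NodeContainerMap` class with
`Long` standing in for `ContainerID`, not the actual Ozone code:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class NodeContainerMap {
  private final Map<UUID, Set<Long>> nodeToContainer = new HashMap<>();
  private final ReadWriteLock lock = new ReentrantReadWriteLock();

  public void addContainer(UUID uuid, long id) {
    lock.writeLock().lock();
    try {
      nodeToContainer.computeIfAbsent(uuid, k -> new HashSet<>()).add(id);
    } finally {
      lock.writeLock().unlock();
    }
  }

  // The defensive HashSet copy already isolates callers from the internal
  // state, so an unmodifiableSet wrapper would add nothing here.
  public Set<Long> getContainers(UUID uuid) {
    lock.readLock().lock();
    try {
      return new HashSet<>(
          nodeToContainer.getOrDefault(uuid, Collections.emptySet()));
    } finally {
      lock.readLock().unlock();
    }
  }

  public static void main(String[] args) {
    NodeContainerMap map = new NodeContainerMap();
    UUID dn = UUID.randomUUID();
    map.addContainer(dn, 7L);
    Set<Long> copy = map.getContainers(dn);
    copy.add(99L); // caller mutation does not leak back into the map
    System.out.println(map.getContainers(dn)); // prints [7]
  }
}
```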
Next, we take the FCR and stream it into a set of ContainerID. This is used
for two purposes: first, to subtract from "containersInSCM" to yield the
missing containers; second, to replace the set already held in nodeManager.
We can avoid this second large set by adding each ContainerID to nodeManager
only if it is not already there, and then removing any "missing" IDs from
nodeManager at the end. This also avoids replacing the entire set of
ContainerIDs, and all the tenured objects associated with it, with
effectively identical new object instances.
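The reconciliation described above can be sketched as a single pass that
mutates the live per-DN set in place. This is an illustrative standalone
example, not the PR's actual code; `reconcile` is a hypothetical helper and
`Long` stands in for `ContainerID`:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class FcrReconcile {
  // One-pass reconciliation: mutate the live per-DN container set in place
  // and return the containers SCM knew about but the DN did not report.
  static Set<Long> reconcile(Set<Long> containersInScm, List<Long> reported) {
    Set<Long> missing = new HashSet<>(containersInScm); // single working copy
    for (Long id : reported) {
      missing.remove(id);       // reported by the DN, so not missing
      containersInScm.add(id);  // add any container new to SCM in place
    }
    containersInScm.removeAll(missing); // drop stale entries in place
    return missing;
  }

  public static void main(String[] args) {
    Set<Long> live = new HashSet<>(Arrays.asList(1L, 2L, 3L, 4L));
    Set<Long> missing = reconcile(live, Arrays.asList(2L, 3L, 5L));
    System.out.println("missing: " + missing); // 1 and 4 were not reported
    System.out.println("live: " + live);       // now matches the report
  }
}
```

The key point is that no second full-size set of reported IDs is ever built,
and the existing set's surviving entries (often tenured objects) stay in place.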
Finally, we can process the replicas in the same pass and re-use the
ContainerID object, which is currently created twice: one copy is stored in
nodeManager, and another distinct copy in each ContainerReplica.
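The object re-use can be illustrated with a canonicalising map: later code
paths look up the instance the map already holds instead of constructing a
new one. `Cid` below is a minimal hypothetical stand-in for `ContainerID`:

```java
import java.util.HashMap;
import java.util.Map;

public class IdReuseSketch {
  // Minimal stand-in for ContainerID (hypothetical, not the Ozone class).
  static final class Cid {
    final long id;
    Cid(long id) { this.id = id; }
    @Override public boolean equals(Object o) {
      return o instanceof Cid && ((Cid) o).id == id;
    }
    @Override public int hashCode() { return Long.hashCode(id); }
  }

  public static void main(String[] args) {
    // One canonical instance per container id, e.g. the one the node map
    // already holds; later code paths look it up rather than calling `new`.
    Map<Long, Cid> canonical = new HashMap<>();
    Cid inNodeMap = canonical.computeIfAbsent(42L, Cid::new);
    Cid inReplica = canonical.computeIfAbsent(42L, Cid::new);
    System.out.println(inNodeMap == inReplica); // prints true: one object
  }
}
```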
I checked how many ContainerID objects are present in SCM, by loading 10
closed containers with 3 replicas, and capturing a heap histogram with jmap:
```
bash-4.2$ jmap -histo:live 8 | grep ContainerID
 262:    81    1944  org.apache.hadoop.hdds.scm.container.ContainerID
 301:    30    1440  org.apache.hadoop.hdds.scm.container.ContainerReplica
 419:    10     800  org.apache.hadoop.hdds.scm.container.ContainerInfo
```
There are 10 Containers and 30 replicas as expected, but 81 ContainerID
objects. Ideally there would be only 10, matching the number of Containers,
though some of the extras may be created by other code paths, such as
pipeline creation.
## What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-6307
## How was this patch tested?
Existing tests cover the functionality.
For performance testing, I used the existing Freon SCMThroughputBenchmark
class. With it, I simulated FCR handling for a single DN reporting 10k, 20k
and 30k containers. The original code took approximately:
```
10k: 750 - 800ms per report.
20k: 1550 - 1650ms per report.
30k: 2300 - 2500ms per report.
```
The code in this PR:
```
10k: 530 - 570ms per report.
20k: 1100 - 1200ms per report.
30k: 1550 - 1650ms per report.
```
Both scale proportionally with the number of containers, but the revised
code is about 30% faster and has a reduced memory overhead.
After loading 30k containers with 1 replica each I checked the ContainerID
object count using `jmap -histo:live`:
```
Original:
  13:  120000  2880000  org.apache.hadoop.hdds.scm.container.ContainerID
Revised:
  37:   30000   720000  org.apache.hadoop.hdds.scm.container.ContainerID
```