sodonnel commented on issue #668:
URL: https://github.com/apache/hadoop-ozone/pull/668#issuecomment-616555768
I am a little concerned about the expense of forming the list of healthy
nodes on large clusters. We have to do quite a lot of work to form a list and
then only use 3 nodes from the list. Even the method `currentPipelineCount()`
needs to do a few map lookups per node to get the current pipeline count. This
is the case even before this change. Creating a pipeline on a large cluster
would be expensive already, but this change probably makes it worse, due to the
sort needed. I know it was me who suggested the sort.
I think the code as it is will work OK upto about 1000 nodes, and then the
performance will drop off as the number of nodes goes toward 10k.
Eg here are some benchmarks I created using this test code, which is similar
to what we are doing in `filterViableNodes()`:
```
public List<Object> sortingWithMap(BenchmarkState state) {
return state.otherList.stream()
.map(o -> new Mock(o, state.rand.nextInt(20)))
.filter(o -> o.getSize() <= 20)
.sorted(Comparator.comparingInt(Mock::getSize))
.map(o -> o.getObject())
.collect(Collectors.toList());
}
```
The OPs per second for various list sizes are:
```
Benchmark (listSize) Mode Cnt Score Error Units
Sorting.sortingWithMap 100 thrpt 3 113948.345 ± 446.426 ops/s
Sorting.sortingWithMap 1000 thrpt 3 9468.507 ± 894.138 ops/s
Sorting.sortingWithMap 5000 thrpt 3 1931.612 ± 263.919 ops/s
Sorting.sortingWithMap 10000 thrpt 3 970.745 ± 25.823 ops/s
Sorting.sortingWithMap 100000 thrpt 3 87.684 ± 35.438 ops/s
```
For a 1000 node cluster, with 10 pipelines per node, we would be looking at
about 1 second to form all the piplines.
For a 5k node cluster, it would be about 25 seconds
For a 10k node cluster it would be 103 seconds, but even here, that would be
at close to 1000 pipelines per second.
Those approx times are just for the filterViableNodes step, and not any
other logic.
Removing the sort step from the above code makes it go about 5 times faster
and removing the two map steps and the sort makes it overall 10x faster.
It we want to allocate the pipelines evenly, where the lowest loaded nodes
always get the new pipeline, then I don't think we can avoid the sort. However
on a larger cluster, picking nodes that have not exceeded the limit at random
would probably result in quite an even spread, and we could do that more
efficiently I think.
What do you think?
If Sammi does not have bandwidth to give this a further review, I will see
if I can get someone else to take a look too.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]