[
https://issues.apache.org/jira/browse/HDDS-3466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sammi Chen updated HDDS-3466:
-----------------------------
Target Version/s: 0.7.0 (was: 0.6.0)
> Improve filterViableNodes performance in pipeline creation
> ----------------------------------------------------------
>
> Key: HDDS-3466
> URL: https://issues.apache.org/jira/browse/HDDS-3466
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Components: SCM
> Affects Versions: 0.5.0
> Reporter: Li Cheng
> Priority: Major
> Labels: TriagePending
>
> Per [~sodonnell]'s investigation, pipeline creation may have performance
> issue once the load-sorting algorithm in HDDS-3139.
>
> This task is to track potential performance bottleneck caused by this sorting
> operation for pipeline creation in large scale cluster.
>
> I am a little concerned about the expense of forming the list of healthy
> nodes on large clusters. We have to do quite a lot of work to form a list and
> then only use 3 nodes from the list. Even the method
> {{currentPipelineCount()}} needs to do a few map lookups per node to get the
> current pipeline count. This is the case even before this change. Creating a
> pipeline on a large cluster would be expensive already, but this change
> probably makes it worse, due to the sort needed. I know it was me who
> suggested the sort.
> I think the code as it is will work OK upto about 1000 nodes, and then the
> performance will drop off as the number of nodes goes toward 10k.
> Eg here are some benchmarks I created using this test code, which is similar
> to what we are doing in {{filterViableNodes()}}:
>
> {{ public List<Object> sortingWithMap(BenchmarkState state) \{
> return state.otherList.stream()
> .map(o -> new Mock(o, state.rand.nextInt(20)))
> .filter(o -> o.getSize() <= 20)
> .sorted(Comparator.comparingInt(Mock::getSize))
> .map(o -> o.getObject())
> .collect(Collectors.toList());
> }}}
> The OPs per second for various list sizes are:
>
> {{Benchmark (listSize) Mode Cnt Score Error Units
> Sorting.sortingWithMap 100 thrpt 3 113948.345 ± 446.426 ops/s
> Sorting.sortingWithMap 1000 thrpt 3 9468.507 ± 894.138 ops/s
> Sorting.sortingWithMap 5000 thrpt 3 1931.612 ± 263.919 ops/s
> Sorting.sortingWithMap 10000 thrpt 3 970.745 ± 25.823 ops/s
> Sorting.sortingWithMap 100000 thrpt 3 87.684 ± 35.438 ops/s}}
> For a 1000 node cluster, with 10 pipelines per node, we would be looking at
> about 1 second to form all the piplines.
> For a 5k node cluster, it would be about 25 seconds
> For a 10k node cluster it would be 103 seconds, but even here, that would be
> at close to 1000 pipelines per second.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]