On Sunday, 21 April 2013 at 12:08:54 UTC, bearophile wrote:
Arrays#parallelSort uses Fork/Join framework introduced in Java 7 to assign the sorting tasks to multiple threads available in the thread pool. This is called eating your own dog food. Fork/Join implements a work stealing algorithm where in a idle thread can steal tasks queued up in another thread.
An overview of Arrays#parallelSort:

The method uses a threshold value and any array of size lesser than the threshold value is sorted using the Arrays#sort() API (i.e sequential sorting). And the threshold is calculated considering the parallelism of the machine, size of the array and is calculated as:

private static final int getSplitThreshold(int n) {
  int p = ForkJoinPool.getCommonPoolParallelism();
  int t = (p > 1) ? (1 + n / (p << 3)) : n;
  return t < MIN_ARRAY_SORT_GRAN ? MIN_ARRAY_SORT_GRAN : t;
}

Once its decided whether to sort the array in parallel or in serial, its now to decide how to divide the array in to multiple parts and then assign each part to a Fork/Join task which will take care of sorting it and then another Fork/Join task which will take care of merging the sorted arrays. The implementation in JDK 8 uses this approach:
- Divide the array into 4 parts.
- Sort the first two parts and then merge them.
- Sort the next two parts and then merge them.
And the above steps are repeated recursively with each part until the size of the part to sort is not lesser than the threshold value calculated above.



I think it's worth adding something similar as strategy of std.algorithm.sort.

FWIW, I created a parallel sort in D a while back using std.parallelism. It was part of std.parallel_algorithm, a half-finished project that I abandoned because I was disappointed at how poorly most of it was scaling in practice, probably due to memory bandwidth. If you have some expensive-to-compare types, though, it may be worthwhile.

https://github.com/dsimcha/parallel_algorithm/blob/master/parallel_algorithm.d

Reply via email to