On 08/02/2016 01:18 AM, Peter Geoghegan wrote:
> No merging in parallel
>
> Currently, merging worker *output* runs may only occur in the leader
> process. In other words, we always keep n worker processes busy with
> scanning-and-sorting (and maybe some merging), but then all processes
> but the leader process grind to a halt (note that the leader process
> can participate as a scan-and-sort tuplesort worker, just as it will
> everywhere else, which is why I specified "parallel_workers = 7" but
> talked about 8 workers).
>
> One leader process is kept busy with merging these n output runs on
> the fly, so things will bottleneck on that, which you saw in the
> example above. As already described, workers will sometimes merge in
> parallel, but only their own runs -- never another worker's runs. I
> did attempt to address the leader merge bottleneck by implementing
> cross-worker run merging in workers. I got as far as implementing a
> very rough version of this, but initial results were disappointing,
> and so that was not pursued further than the experimentation stage.
>
> Parallel merging is a possible future improvement that could be added
> to what I've come up with, but I don't think that it will move the
> needle in a really noticeable way.

It'd be good if you could overlap the final merges in the workers with
the merge in the leader. ISTM it would be quite straightforward to
replace the final tape of each worker with a shared memory queue, so
that the leader could start merging and returning tuples as soon as it
gets the first tuple from each worker, instead of having to wait for
all the workers to complete first.
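
To illustrate the idea, here is a minimal sketch in Python rather than
PostgreSQL C. It rests on several assumptions: threads stand in for
worker processes, a bounded queue.Queue stands in for the shared memory
queue, and sorted() stands in for each worker's final merge. The names
worker, drain and leader_merge are hypothetical and do not come from the
patch. What it shows is only the overlap: the leader can emit its first
tuple once every worker has produced one tuple.

```python
import heapq
import queue
import threading

_DONE = object()  # sentinel marking the end of one worker's sorted stream


def worker(chunk, out_q):
    """Sort one chunk, then stream it to the leader one tuple at a time."""
    for item in sorted(chunk):   # stand-in for the worker's final merge
        out_q.put(item)          # blocks while the bounded queue is full
    out_q.put(_DONE)


def drain(q):
    """Yield tuples from a worker's queue until its sentinel arrives."""
    while True:
        item = q.get()
        if item is _DONE:
            return
        yield item


def leader_merge(chunks, queue_size=4):
    """Start one worker per chunk and lazily merge their output streams."""
    queues = [queue.Queue(maxsize=queue_size) for _ in chunks]
    threads = [
        threading.Thread(target=worker, args=(c, q), daemon=True)
        for c, q in zip(chunks, queues)
    ]
    for t in threads:
        t.start()
    # heapq.merge only needs the head of each stream to produce output,
    # so merging starts before any worker has finished its whole stream.
    yield from heapq.merge(*(drain(q) for q in queues))
    for t in threads:
        t.join()
```

Calling list(leader_merge([[5, 1, 9], [4, 8, 2], [7, 3, 6]])) returns
the nine values in sorted order. The bounded queue also gives
backpressure: a worker that runs ahead of the leader simply blocks
until the leader consumes more of its stream.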

- Heikki

