On Mon, Aug 8, 2016 at 4:44 PM, Peter Geoghegan <p...@heroku.com> wrote:
> The basic idea I have in mind is that we create runs in workers in the
> same way that the parallel CREATE INDEX patch does (one output run per
> worker). However, rather than merging in the leader, we use a
> splitting algorithm to determine partition boundaries on-the-fly. The
> logical tape stuff then does a series of binary searches to find those
> exact split points within each worker's "final" tape. Each worker
> reports the boundary points of its original materialized output run in
> shared memory. Then, the leader instructs workers to "redistribute"
> slices of their final runs among each other, by changing the tapeset
> metadata to reflect that each worker has nworker input tapes with
> redrawn offsets into a unified BufFile. Workers immediately begin
> their own private on-the-fly merges.
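
A toy sketch of the quoted scheme as I read it, not the patch's code. It treats each worker's final run as an in-memory sorted list rather than a tape, and every name here (choose_splitters, slice_offsets, partitioned_merge) is hypothetical. The sampling used to pick boundaries is my own placeholder, since the quoted text does not say which splitting algorithm is meant.

```python
import bisect
import heapq


def choose_splitters(runs, nparts):
    """Pick nparts-1 boundary keys from a coarse sample of every run.

    Placeholder heuristic (an assumption, not from the proposal).
    """
    sample = sorted(
        key for run in runs for key in run[:: max(1, len(run) // 8)]
    )
    if not sample:
        return None
    return [sample[len(sample) * i // nparts] for i in range(1, nparts)]


def slice_offsets(run, splitters):
    """Binary-search one sorted run for each boundary key.

    Returns len(splitters) + 2 offsets delimiting the run's slices.
    """
    return [0] + [bisect.bisect_left(run, s) for s in splitters] + [len(run)]


def partitioned_merge(runs):
    """Give each of the len(runs) workers one key range to merge privately."""
    nparts = len(runs)
    splitters = choose_splitters(runs, nparts)
    if splitters is None:  # every run is empty
        return [[] for _ in runs]
    # One offset list per run: stands in for the boundary points
    # each worker would report in shared memory.
    offsets = [slice_offsets(run, splitters) for run in runs]
    outputs = []
    for p in range(nparts):
        # Worker p reads slice p of every run, starting mid-run.
        slices = [run[off[p]:off[p + 1]] for run, off in zip(runs, offsets)]
        outputs.append(list(heapq.merge(*slices)))
    return outputs


runs = [[1, 4, 9, 12, 20], [2, 3, 10, 11, 30], [5, 6, 7, 8, 40]]
parts = partitioned_merge(runs)
# Concatenating the per-worker outputs in order yields the full sort.
assert [k for part in parts for k in part] == sorted(k for r in runs for k in r)
```

Note that worker p starts reading every run at an arbitrary offset, and the binary searches probe arbitrary positions too, so the sketch only works because a Python list can be indexed anywhere.
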

I think it's a great design, but it requires the per-worker final tapes to always be random-access. I'm not hugely familiar with the code, but IIUC there's some penalty to making them random-access, right?