On Tue, Sep 27, 2016 at 3:31 PM, Robert Haas <robertmh...@gmail.com> wrote:
> You seem to have erased the subject line from this email somehow.
I think that that was Gmail. Maybe that's new? Generally, I have to go
out of my way to change the subject line, so it seems unlikely that I
fat-fingered it. I wish that they'd stop changing things...
> On Tue, Sep 27, 2016 at 10:18 AM, Peter Geoghegan <p...@heroku.com> wrote:
> No, a shm_mq is not a mutual exclusion mechanism. It's a queue!
> A data dependency and a lock aren't the same thing. If your point in
> saying "we need something like an LWLock" was actually "the leader
> will have to wait if it's merged all tuples from a worker and no more
> are immediately available" then I think that's a pretty odd way to
> make that point. To me, the first statement appears to be false,
> while the second is obviously true.
Okay. Sorry for being unclear.
> OK, well I'm not taking any position on whether what Heikki is
> proposing will turn out to be good from a performance point of view.
> My intuitions about sorting performance haven't turned out to be
> particularly good in the past. I'm only saying that if you do decide
> to queue the tuples passing from worker to leader in a shm_mq, you
> shouldn't read from the shm_mq objects in a round robin, but rather
> read multiple tuples at a time from the same worker whenever that is
> possible without blocking. If you read tuples from workers one at a
> time, you get a lot of context-switch thrashing, because the worker
> wakes up, writes one tuple (or even part of a tuple) and immediately
> goes back to sleep. Then it shortly thereafter does it again.
> Whereas if you drain the worker's whole queue each time you read from
> that queue, then the worker can wake up, refill it completely, and go
> back to sleep again. So you incur fewer context switches for the same
> amount of work. Or at least, that's how it worked out for Gather, and
> what I am saying is that it will probably work out the same way for
> sorting, if somebody chooses to try to implement this.
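
To make the context-switch argument above concrete, here is a minimal toy simulation. It does not use the real shm_mq API: `SimWorker`, `read_round_robin`, `read_draining` and `total_wakeups` are hypothetical names, the bounded queue is a plain in-process deque, and a "wakeup" is simply counted each time a worker gets to write into freed queue space. It models a Gather-style reader only, not a sorted merge.

```python
from collections import deque


class SimWorker:
    """Hypothetical stand-in for a worker feeding a bounded queue (not shm_mq)."""

    def __init__(self, items, capacity):
        self.backlog = deque(items)   # tuples the worker still has to send
        self.queue = deque()          # bounded queue shared with the leader
        self.capacity = capacity
        self.wakeups = 0
        self.refill()                 # initial fill counts as one wakeup

    def refill(self):
        """Wake once, write as many tuples as fit, then go back to sleep."""
        if not self.backlog or len(self.queue) >= self.capacity:
            return                    # nothing to send, or no free space
        self.wakeups += 1
        while self.backlog and len(self.queue) < self.capacity:
            self.queue.append(self.backlog.popleft())

    def done(self):
        return not self.queue and not self.backlog


def read_round_robin(workers):
    """Leader takes one tuple per worker per pass."""
    out = []
    while not all(w.done() for w in workers):
        for w in workers:
            if w.queue:
                out.append(w.queue.popleft())
                w.refill()            # worker wakes to write a single tuple
    return out


def read_draining(workers):
    """Leader takes everything available from a worker before moving on."""
    out = []
    while not all(w.done() for w in workers):
        for w in workers:
            while w.queue:
                out.append(w.queue.popleft())
            w.refill()                # worker wakes once and refills fully
    return out


def total_wakeups(reader, n_workers=2, n_items=8, capacity=4):
    """Run one reader strategy; return (tuples read, total worker wakeups)."""
    workers = [
        SimWorker(range(i * n_items, (i + 1) * n_items), capacity)
        for i in range(n_workers)
    ]
    out = reader(workers)
    return len(out), sum(w.wakeups for w in workers)


if __name__ == "__main__":
    print("round robin:", total_wakeups(read_round_robin))
    print("draining:   ", total_wakeups(read_draining))
```

In this toy model both strategies deliver the same tuples, but the draining reader triggers fewer worker wakeups. Real behaviour depends on scheduling, queue size and tuple width, none of which the sketch captures.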
Maybe. This would involve the overlapping of multiple worker merges
that are performed in parallel with a single leader merge. I don't
think it would be all that great. There would be a trade-off around
disk bandwidth utilization too, so I'm particularly doubtful that the
complexity would pay for itself. But, of course, I too have been wrong
about sorting performance characteristics in the past, so I can't be
Sent via pgsql-hackers mailing list (email@example.com)