Re: accounting for memory used for BufFile during hash joins

Tomas Vondra Wed, 08 May 2019 06:38:03 -0700

On Tue, May 07, 2019 at 05:43:56PM -0700, Melanie Plageman wrote:

  On Tue, May 7, 2019 at 6:59 AM Tomas Vondra <tomas.von...@2ndquadrant.com>
  wrote:


    On Tue, May 07, 2019 at 04:28:36PM +1200, Thomas Munro wrote:
    >On Tue, May 7, 2019 at 3:15 PM Tomas Vondra
    ><tomas.von...@2ndquadrant.com> wrote:
    >> On Tue, May 07, 2019 at 01:48:40PM +1200, Thomas Munro wrote:
    >> Switching to some other algorithm during execution moves the goal
    posts
    >> to the next galaxy, I'm afraid.
    >
    >The main problem I'm aware of with sort-merge join is: not all that is
    >hashable is sortable.  So BNL is actually the only solution I'm aware
    >of for problem B that doesn't involve changing a fundamental thing
    >about PostgreSQL's data type requirements.
    >

    Sure, each of those algorithms has limitations. But I think that's
    mostly
    irrelevant to the main issue - switching between algorithms
    mid-execution.
    At that point some of the tuples might have been already sent sent to
    the
    other nodes, and I have no idea how to "resume" the tuple stream short
    of
    buffering everything locally until the join completes. And that would be
    rather terrible, I guess.

  What if you switched to NLJ on a batch-by-batch basis and did it before
  starting
  execution of the join but after building the inner side of the hash
  table.  That
  way, no tuples will have been sent to other nodes yet.


Interesting idea! I think you're right doing it on a per-batch basis
would solve that problem. Essentially, if all (or >95%) of the tuples
has the same hash value, we could switch to a special "degraded" mode
doing something like a NL. At that point the hash table benefits are
lost anyway, because all the tuples are in a single chain, so it's not
going to be much slower.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com

PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: accounting for memory used for BufFile during hash joins

Reply via email to