Re: [HACKERS] PATCH: hashjoin - gracefully increasing NTUP_PER_BUCKET instead of batching

Robert Haas Thu, 11 Dec 2014 13:18:17 -0800

On Thu, Dec 11, 2014 at 2:51 PM, Tomas Vondra <[email protected]> wrote:
> No, it's not rescanned. It's scanned only once (for the batch #0), and
> tuples belonging to the other batches are stored in files. If the number
> of batches needs to be increased (e.g. because of incorrect estimate of
> the inner table), the tuples are moved later.


Yeah, I think I sort of knew that, but I got confused.  Thanks for clarifying.
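
For readers following along, here is a minimal sketch of how a tuple's hash
value might be split into a bucket number and a batch number, assuming both
counts are powers of two. The names (Slot, locate) are hypothetical and this is
not a copy of the executor code. It only illustrates why growing the number of
batches lets tuples be moved later without rescanning the inner relation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: where a tuple with a given hash value lands. */
typedef struct {
    uint32_t bucketno;  /* bucket inside the in-memory hash table */
    uint32_t batchno;   /* 0 = kept in memory now, >0 = spilled to a batch file */
} Slot;

/*
 * Low bits of the hash pick the bucket, the next bits pick the batch.
 * Assumes nbuckets = 2^log2_nbuckets and nbatch is a power of two.
 */
static Slot locate(uint32_t hash, uint32_t log2_nbuckets, uint32_t nbatch)
{
    assert(log2_nbuckets < 32);
    assert(nbatch > 0 && (nbatch & (nbatch - 1)) == 0);

    Slot s;
    s.bucketno = hash & ((UINT32_C(1) << log2_nbuckets) - 1);
    s.batchno = (hash >> log2_nbuckets) & (nbatch - 1);
    return s;
}
```

Under this scheme, doubling nbatch exposes one more hash bit, so a tuple
either keeps its batch number or moves to (old batch number + old nbatch).
It never moves to an earlier batch.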

> The idea was that if we could increase the load a bit (e.g. using 2
> tuples per bucket instead of 1), we will still use a single batch in
> some cases (when we miss the work_mem threshold by just a bit). The
> lookups will be slower, but we'll save the I/O.

Yeah.  That seems like a valid theory, but your test results so far
seem to indicate that it's not working out like that - which I find
quite surprising, but, I mean, it is what it is, right?
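
To make the sizing trade-off concrete, here is a rough sketch of the decision
being discussed: try a higher tuples-per-bucket load before giving up and
splitting into batches. Everything in it is an assumption made for
illustration (the HashPlan struct, the 8-byte bucket pointer, the doubling of
the load, the MAX_NBATCH cap), and it says nothing about which choice is
actually faster.

```c
#include <assert.h>
#include <stddef.h>

#define BUCKET_PTR_BYTES 8      /* assumed size of one bucket pointer */
#define MAX_NBATCH (1 << 20)    /* arbitrary cap so the loop always ends */

/* Hypothetical plan: how many batches, and how loaded each bucket is. */
typedef struct {
    int nbatch;
    int ntup_per_bucket;
    size_t nbuckets;
} HashPlan;

/* Smallest power-of-two bucket count that holds ntuples at the given load. */
static size_t buckets_for(size_t ntuples, int ntup_per_bucket)
{
    assert(ntup_per_bucket > 0);
    size_t load = (size_t) ntup_per_bucket;
    size_t wanted = (ntuples + load - 1) / load;
    size_t n = 1;
    while (n < wanted)
        n <<= 1;
    return n;
}

/* Estimated memory: the tuples themselves plus the bucket array. */
static size_t hash_bytes(size_t ntuples, size_t tuple_bytes, int load)
{
    return ntuples * tuple_bytes + buckets_for(ntuples, load) * BUCKET_PTR_BYTES;
}

static HashPlan plan_hash(size_t ntuples, size_t tuple_bytes,
                          size_t work_mem_bytes, int max_load)
{
    HashPlan plan;

    /* First try to stay in a single batch by accepting longer bucket chains. */
    for (int load = 1; load <= max_load; load *= 2) {
        if (hash_bytes(ntuples, tuple_bytes, load) <= work_mem_bytes) {
            plan.nbatch = 1;
            plan.ntup_per_bucket = load;
            plan.nbuckets = buckets_for(ntuples, load);
            return plan;
        }
    }

    /* Otherwise fall back to batching with one tuple per bucket. */
    int nbatch = 2;
    size_t per_batch = (ntuples + (size_t) nbatch - 1) / (size_t) nbatch;
    while (nbatch < MAX_NBATCH &&
           hash_bytes(per_batch, tuple_bytes, 1) > work_mem_bytes) {
        nbatch *= 2;
        per_batch = (ntuples + (size_t) nbatch - 1) / (size_t) nbatch;
    }
    plan.nbatch = nbatch;
    plan.ntup_per_bucket = 1;
    plan.nbuckets = buckets_for(per_batch, 1);
    return plan;
}
```

With 1000 tuples of 64 bytes each, the sketch needs 72192 bytes at one tuple
per bucket and 68096 bytes at two. A 70000-byte limit is therefore the "missed
by just a bit" case where raising the load avoids a second batch. Only the
bucket array shrinks, so the saving is small compared with the tuples
themselves.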

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


