Hello KaiGai-san,

On 08/21/2015 02:28 AM, Kouhei Kaigai wrote:
...

But what is the impact on queries that actually need more than 1GB
of buckets? I assume we'd only limit the initial allocation and
still allow the resize based on the actual data (i.e. the 9.5
improvement), so the queries would start with 1GB and then resize
once finding out the optimal size (as done in 9.5). The resize is
not very expensive, but it's not free either, and with so many
tuples (requiring more than 1GB of buckets, i.e. ~130M tuples) it's
probably just noise in the total query runtime. But it'd be nice
to see some proof of that ...
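
For reference, the "~130M tuples" figure can be reproduced with a quick back-of-the-envelope calculation. This is only a sketch: the 8-byte bucket pointer (64-bit build) and the one-tuple-per-bucket target are assumptions not stated in this mail, and the names are made up for illustration.

```python
# Rough arithmetic behind "1GB of buckets, i.e. ~130M tuples".
# Assumptions (not stated in the mail): each bucket is one 8-byte pointer
# and the hash table targets one tuple per bucket on average.
BUCKET_POINTER_BYTES = 8
TUPLES_PER_BUCKET = 1
ONE_GB = 1024 ** 3


def tuples_for_bucket_bytes(nbytes):
    """Number of tuples whose bucket array needs roughly nbytes of memory."""
    nbuckets = nbytes // BUCKET_POINTER_BYTES
    return nbuckets * TUPLES_PER_BUCKET


tuples_at_1gb = tuples_for_bucket_bytes(ONE_GB)
print(tuples_at_1gb)  # 134217728, i.e. 2**27
```

Under those assumptions, only inner relations with more than about 134M tuples would need a bucket array larger than 1GB.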

The problem here is that we cannot know the exact size unless the
Hash node reads the entire inner relation. All we can do is rely on
the planner's estimate; however, it often computes a crazy number of
rows. I think resizing the hash buckets is a reasonable compromise.
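
The compromise can be illustrated with a toy model: size the bucket array from the planner's estimate, cap the initial allocation, and rebuild once after loading if the real row count needs more. This is a simplified sketch, not the executor code; the function names and the 2**27 cap (1GB of 8-byte pointers) are assumptions made for the example.

```python
def next_power_of_two(n):
    """Smallest power of two >= n (at least 1)."""
    return 1 << max(n - 1, 0).bit_length()


def initial_buckets(estimated_rows, max_initial_buckets):
    """Initial bucket count: sized from the estimate, but capped.

    max_initial_buckets is assumed to be a power of two.
    """
    return min(next_power_of_two(estimated_rows), max_initial_buckets)


def buckets_after_load(initial, actual_rows):
    """Bucket count once all rows are loaded, and whether a rebuild was needed.

    Models a single resize at the end of the load: the target doubles until
    it covers the actual row count (one tuple per bucket assumed).
    """
    optimal = initial
    while optimal < actual_rows:
        optimal *= 2
    return optimal, optimal != initial


CAP = 2 ** 27  # hypothetical cap: 1GB of 8-byte bucket pointers

# Over-estimated: planner says 10 billion rows, only 1000 arrive.
over = buckets_after_load(initial_buckets(10 ** 10, CAP), 1000)

# Well-estimated but large: 500M rows estimated, 500M rows arrive.
large = buckets_after_load(initial_buckets(500_000_000, CAP), 500_000_000)

print(over, large)
```

In this model the over-estimated query never allocates more than the cap, while the well-estimated large query starts at the cap and pays for one rebuild.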

I understand the estimation problem. The question I think we need to answer is how to balance the behavior for well- and poorly-estimated cases. It'd be unfortunate if we lower the memory consumption in the over-estimated case while significantly slowing down the well-estimated ones.

I don't think we have a clear answer at this point - maybe it's not a problem at all and it'll be a win no matter what threshold we choose. But it's a separate problem from the bugfix.

I believe the patch proposed by KaiGai-san is the right one to fix
the bug discussed in this thread. My understanding is KaiGai-san
withdrew the patch as he wants to extend it to address the
over-estimation issue.

I don't think we should do that - IMHO that's an unrelated
improvement and should be addressed in a separate patch.

OK, it might not be a problem we should try to settle within a few
days, just before the beta release.

I don't quite see a reason to wait for the over-estimation patch. We probably should backpatch the bugfix anyway (although it's much less likely to run into that before 9.5), and we can't really backpatch the behavior change there (as there's no hash resize).

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


