On 26.8.2014 21:38, Jeff Davis wrote:
> On Tue, 2014-08-26 at 12:39 +0300, Heikki Linnakangas wrote:
>> I think this is enough for this commitfest - we have consensus on
>> the design. For the next one, please address those open items, and
> Agreed, return with feedback.
> I need to get the accounting patch in first, which needs to address
> some performance issues, but there's a chance of wrapping those up
Sounds good to me.
I'd like to coordinate our efforts on this a bit, if you're interested.
I've been working on a PoC of the hashjoin-like batching approach
(because I proposed it, so it's fair I do the work), and I came to the
conclusion that it's pretty much impossible to implement on top of
dynahash. I ended up replacing dynahash with a custom hash table (similar
to the one in the hashjoin patch, unsurprisingly), which supports the
batching approach well, is more memory efficient, and is actually faster
(I see ~25% speedup in most cases, although YMMV).
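
To illustrate what "hashjoin-like batching" means here, this is a minimal sketch of splitting one hash value into a bucket number and a batch number. It is not code from the PoC; the function name and parameters are hypothetical, and it assumes both the bucket count and the batch count are powers of two.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): derive the in-memory bucket
 * and the batch number from a single 32-bit hash value.
 *
 * Assumptions: nbuckets == 1 << log2_nbuckets, and nbatch is a power of two.
 */
static void
get_bucket_and_batch(uint32_t hashvalue,
                     uint32_t log2_nbuckets,
                     uint32_t nbatch,
                     uint32_t *bucketno,
                     uint32_t *batchno)
{
    uint32_t nbuckets = (uint32_t) 1 << log2_nbuckets;

    /* nbatch must be a non-zero power of two */
    assert(nbatch != 0 && (nbatch & (nbatch - 1)) == 0);

    /* low-order bits select the bucket within the in-memory table */
    *bucketno = hashvalue & (nbuckets - 1);

    /* the next bits select the batch; with a single batch it is always 0 */
    if (nbatch > 1)
        *batchno = (hashvalue >> log2_nbuckets) & (nbatch - 1);
    else
        *batchno = 0;
}
```

Because the batch bits sit above the bucket bits, doubling the number of batches only ever moves a group to a later batch, which is the property that makes this scheme awkward to bolt onto a table that does not expose the hash value.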
I plan to address this in 4 patches:

(1) replacement of dynahash by the custom hash table (done)

(2) memory accounting (not sure what your plan is; I've used the
    approach I proposed on 23/8 for now, with a few bugfixes/cleanups)

(3) applying your HashWork patch on top of this (I have this mostly
    completed, but need to do more testing over the weekend)

(4) extending this with the batching I proposed, initially only for
    aggregates with states that we can serialize/deserialize easily
    (e.g. types passed by value) - I'd like to hack on this next week
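
As a rough illustration of why pass-by-value states are the easy case in (4): the state is a fixed-size value with no pointers, so it can be written to a batch file and read back verbatim. The sketch below is not from any of the patches; the struct layout and function names are hypothetical, and it only covers a round trip within one process on one machine.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical group entry: the hash value (to recompute bucket/batch later),
 * a by-value grouping key, and a by-value aggregate state such as a running
 * count or sum.
 */
typedef struct GroupState
{
    uint32_t hashvalue;
    int64_t  key;
    int64_t  state;
} GroupState;

/* Write one group to a batch file; returns 1 on success, 0 on failure. */
static int
spill_group(FILE *batchfile, const GroupState *g)
{
    return fwrite(g, sizeof(GroupState), 1, batchfile) == 1;
}

/* Read one group back; returns 1 on success, 0 on EOF or error. */
static int
load_group(FILE *batchfile, GroupState *g)
{
    return fread(g, sizeof(GroupState), 1, batchfile) == 1;
}
```

A state passed by reference (e.g. an array or an internal struct with pointers) would need a type-specific serialize/deserialize step instead of a plain copy, which is why it is left out of the initial scope.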
So at this point I have (1) and (2) pretty much ready, (3) is almost
complete and I plan to start hacking on (4). Also, this does not address
the open items listed in your message.
But I agree this is more complex than the patch you proposed. So if you
choose to pursue your patch, I have no problem with that - I'll then
rebase my changes on top of your patch and submit them separately.