On 2016-09-23 17:40:13 -0400, Tom Lane wrote:
> My idea of an appropriate fix would be to resume the scan at the same
> point where the last scan stopped, and work circularly around the table
> when necessary.

I've played with that idea, and it does help a great deal. Not that
surprisingly, it's better than starting at a random point (which in turn
is better than starting at one end all the time).
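
To illustrate, a circular resume is roughly the following (a
simplified sketch with made-up names and a fixed-size open-addressing
table, not the actual tidbitmap.c code):

/*
 * Simplified sketch of resuming the lossify scan where the previous
 * one stopped, wrapping around circularly.  Fixed-size open-addressing
 * table; all names are made up for illustration.
 */
#include <stdbool.h>
#include <stddef.h>

#define TABLE_SIZE 1024

typedef struct Entry
{
	bool	used;
	/* ... per-page bitmap data would live here ... */
} Entry;

static Entry table[TABLE_SIZE];
static size_t lossify_start;	/* persists across lossify calls */

static void
lossify_some(size_t max_to_lossify)
{
	size_t	done = 0;
	size_t	i;

	/* scan circularly, starting where the previous scan stopped */
	for (i = 0; i < TABLE_SIZE && done < max_to_lossify; i++)
	{
		size_t	cur = (lossify_start + i) % TABLE_SIZE;

		if (table[cur].used)
		{
			/* ... convert this entry to a lossy page here ... */
			done++;
		}
	}

	/* next call resumes just past the last bucket examined */
	lossify_start = (lossify_start + i) % TABLE_SIZE;
}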

> But I'm not sure there is any really good way to do that
> in the dynahash infrastructure.  Maybe it'd work to keep the iteration
> state around, but I don't remember how well that interacts with other
> insertions/deletions.  What about in your implementation?

It's easy enough to specify a start point. It requires exposing some
things that I don't necessarily want to expose - that's why I played
around with a random start point - but other than that it's easy to
implement. Internally growing the hashtable would be kind of an issue,
since growth invalidates the saved point, but a) by the time we're
lossifying, the table has usually stopped growing, and b) it'd be
quite harmless to resume at the wrong point anyway.
> There's also the point mentioned in the existing comment, that it'd be
> better to go after pages with more bits set first.  Not sure of an
> inexpensive way to do that (ie, one that doesn't involve multiple
> scans of the hashtable).  But your results suggest that maybe it'd
> be worth making tbm_lossify slower in order to get better results.

It's not easy, I agree.
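
At least the raw ingredient is cheap: counting the bits set in one
page's entry is just a few popcounts (sketch with made-up field names;
__builtin_popcountll is the GCC/Clang builtin). The hard part is using
those counts to order the scan without a second pass over the table.

#include <stdint.h>

#define WORDS_PER_PAGE 8	/* made up; bits for one heap page's offsets */

typedef struct PageEntry
{
	uint64_t	words[WORDS_PER_PAGE];	/* one bit per tuple offset */
} PageEntry;

/* number of tuples this exact (non-lossy) page entry represents */
static int
entry_bits_set(const PageEntry *entry)
{
	int		count = 0;

	for (int i = 0; i < WORDS_PER_PAGE; i++)
		count += __builtin_popcountll(entry->words[i]);
	return count;
}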


Andres Freund
