Re: [HACKERS] BRIN index and aborted transaction

Alvaro Herrera Thu, 23 Jul 2015 10:24:43 -0700

Robert Haas wrote:

> Maybe I'm confused here, but it seems like the only time
> re-summarization can be needed is when tuples are pruned.  The mere
> act of deleting a tuple, even if the delete goes on to commit, doesn't
> create a scenario where re-summarization can work out to a win,
> because there may still be snapshots that can see it.  At the point
> where we prune the tuple, though, there might well be a benefit in
> re-summarizing, because now a newly-computed summary value won't need
> to cover a value that previously had to be there.


Hm, well, I am not sure that we want to pay the overhead of
re-summarization every time we prune a single tuple from a block range.
That's going to make vacuum much slower, I assume (without measuring);
many page ranges are going to be re-summarized without this actually
changing the range.

For minmax, it would work well to be able to tell whether the deleted
tuple had a value that was either the min or the max; if so it is
possible that the range can be decreased, otherwise not.  I'm not sure
that this would work for inclusion, though.  For geometric types it
means you check whether the value in the deleted tuple overlaps one of
the borders of the bounding box.  I don't know whether this actually
makes sense.  (The obvious thing, which is whether the value overlaps
the bounding box, is also obviously useless because all values overlap
the bounding box by definition.)

I think this would require a new support procedure for opclasses.
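
As a rough illustration of the check described above, here is a minimal sketch in C. The types and function names (MinMaxSummary, Box, minmax_delete_may_shrink, box_delete_may_shrink) are hypothetical and are not part of PostgreSQL or of any existing BRIN opclass; they only show the kind of question a new support procedure might answer. Whether the border test is actually meaningful for inclusion opclasses is left open, as noted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of one block range for a minmax opclass.
 * This is NOT the real on-disk BRIN tuple layout. */
typedef struct MinMaxSummary
{
    int32_t min;
    int32_t max;
} MinMaxSummary;

/* Returns true when removing "deleted" could let the summary shrink,
 * i.e. the value sits exactly on one end of the summarized interval. */
static bool
minmax_delete_may_shrink(const MinMaxSummary *summary, int32_t deleted)
{
    assert(summary != NULL);
    return deleted == summary->min || deleted == summary->max;
}

/* Hypothetical axis-aligned bounding box, for an inclusion-style opclass. */
typedef struct Box
{
    double xlo, ylo, xhi, yhi;
} Box;

/* Returns true when "deleted" touches at least one border of the summary
 * box.  The deleted value is assumed to lie inside the summary box. */
static bool
box_delete_may_shrink(const Box *summary, const Box *deleted)
{
    assert(summary != NULL && deleted != NULL);
    return deleted->xlo <= summary->xlo || deleted->ylo <= summary->ylo ||
           deleted->xhi >= summary->xhi || deleted->yhi >= summary->yhi;
}
```

A true result only means the range might shrink: another live tuple may carry the same extreme value, so a re-summarization can still leave the summary unchanged.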

> But it seems obviously impractical to re-summarize when we HOT-prune,
> so it seems like the obvious thing to do is make vacuum do it.

Agreed.

> We know during phase one of vacuum whether we saw any dead tuples in
> page range X-Y; if yes, re-summarize.  The only reason not to do this
> is if it causes us to do a lot of resummarization that frequently
> fails to produce a smaller range. Do you have any experimental data
> suggesting that this is or is not a problem?

Well, the other issue is that vacuum is at arm's length from a BRIN
index.  Vacuum doesn't provide the deleted-tuples array in a format
convenient for BRIN to access it; currently the only way we provide
access is a callback function that the index AM can call for every
single indexed TID to indicate whether it is to be removed or not.  BRIN
doesn't have TIDs, so it cannot call it usefully.  (We could make it
call once for every possible TID in a page, but that would be very
wasteful.)

I guess we could provide a different callback that provides per-block
information rather than per-tuple; or perhaps something completely
different like simply the pointer to the deleted-TIDs array.
I vaguely recall somebody mentioning that the current setup isn't great
for GIN either, so maybe we can find something that solves both cases?
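
To make the per-block idea concrete, here is a minimal sketch in C. Everything in it is an assumption for illustration: BlockNo, block_has_dead_cb, DeadBlocks, dead_blocks_lookup and collect_ranges_to_resummarize are hypothetical names, not existing PostgreSQL APIs, and the sorted array of block numbers stands in for whatever vacuum would actually expose.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t BlockNo;   /* stand-in for a heap block number */

/* Hypothetical per-block callback: did vacuum see dead tuples in blkno? */
typedef bool (*block_has_dead_cb) (BlockNo blkno, void *state);

/* Example callback state: sorted, duplicate-free list of affected blocks. */
typedef struct DeadBlocks
{
    const BlockNo *blocks;
    size_t      nblocks;
} DeadBlocks;

/* Binary search over the sorted block list. */
static bool
dead_blocks_lookup(BlockNo blkno, void *state)
{
    const DeadBlocks *d = (const DeadBlocks *) state;
    size_t      lo = 0, hi = d->nblocks;

    while (lo < hi)
    {
        size_t      mid = lo + (hi - lo) / 2;

        if (d->blocks[mid] == blkno)
            return true;
        if (d->blocks[mid] < blkno)
            lo = mid + 1;
        else
            hi = mid;
    }
    return false;
}

/* Stores the first block of every page range containing at least one block
 * with dead tuples into out[] (up to out_cap entries) and returns the total
 * number of such ranges. */
static size_t
collect_ranges_to_resummarize(BlockNo heap_blocks, BlockNo pages_per_range,
                              block_has_dead_cb cb, void *state,
                              BlockNo *out, size_t out_cap)
{
    size_t      n = 0;

    assert(pages_per_range > 0);
    /* 64-bit loop counters avoid wraparound near the top of the range */
    for (uint64_t start = 0; start < heap_blocks; start += pages_per_range)
    {
        uint64_t    end = start + pages_per_range;

        if (end > heap_blocks)
            end = heap_blocks;
        for (uint64_t blk = start; blk < end; blk++)
        {
            if (cb((BlockNo) blk, state))
            {
                if (n < out_cap)
                    out[n] = (BlockNo) start;
                n++;
                break;          /* one hit is enough for this range */
            }
        }
    }
    return n;
}
```

With this shape the index AM asks one question per heap block rather than one per possible TID. Handing over the block list itself, as in the second alternative above, would let the AM skip untouched ranges without probing every block.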

I think this requires that BRIN call heap_fetch() for each deleted
tuple as it is pruned.  This seems terrible from a performance point of
view.

There has to be a better way.  I'll give it a spin.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

