On Thu, Apr 30, 2015 at 5:35 PM, Simon Riggs <si...@2ndquadrant.com> wrote:
>
> On 25 April 2015 at 01:12, Amit Kapila <amit.kapil...@gmail.com> wrote:
>>
>> On Sat, Apr 25, 2015 at 1:58 AM, Jim Nasby <jim.na...@bluetreble.com> wrote:
>> >
>> > On 4/23/15 10:40 PM, Amit Kapila wrote:
>> >>
>> >> I agree with you and what I think one of the major reasons of bloat is that
>> >> Index segment doesn't have visibility information due to which clearing of
>> >> Index needs to be tied along with heap.  Now if we can move transaction
>> >> information at page level, then we can even think of having it in Index
>> >> segment as well and then Index can delete/prune it's tuples on it's own
>> >> which can reduce the bloat in index significantly and there is a benefit
>> >> to Vacuum as well.
>> >
>> >
>> > I don't see how putting visibility at the page level helps indexes at
>> > all. We could already put XMIN in indexes if we wanted, but it won't help,
>> > because...
>> >
>>
>> We can do that by putting transaction info at tuple level in index as
>> well but that will be huge increase in size of index unless we devise
>> a way to have variable index tuple header rather than a fixed.
>>
>> >> Now this has some downsides as well like Delete
>> >> needs to traverse Index segment as well to Delete mark the tuples, but
>> >> I think the upsides of reducing bloat can certainly outweigh the
>> >> downsides.
>> >
>> >
>> > ... which isn't possible. You can not go from a heap tuple to an index
>> > tuple.
>>
>> We will have the access to index value during delete, so why do you
>> think that we need linkage between heap and index tuple to perform
>> Delete operation?  I think we need to think more to design Delete
>> .. by CTID, but that should be doable.
>
>
> I see some assumptions here that need to be challenged.
>
> We can keep xmin and/or xmax on index entries. The above discussion
> assumes that the information needs to be updated synchronously. We already
> store visibility information on index entries using the lazily updated
> killtuple mechanism, so I don't see much problem in setting the xmin in a
> similar lazy manner. That way when we use the index if xmax is set we use
> it, if it is not we check the heap. (And then you get to freeze indexes as
> well ;-( )
> Anyway, I have no objection to making index AM pass visibility
> information to indexes that wish to know the information, as long as it is
> provided lazily.
>

Providing such information lazily can help to an extent, but I think
it won't help much with bloat reduction. For example, when an insert
tries to add a row to an index page and finds that there is no space,
it can't kill or overwrite any tuple that is actually dead unless that
tuple has already been marked lazily by then, which I think is one of
the main reasons for index bloat.

> The second assumption is that if we had visibility information in the
> index that it would make a difference to bloat. Since as I mention, we
> already do have visibility information, I don't see that adding xmax or
> xmin would make any difference at all to bloat. So -1 to adding it **for
> that reason**.
>

The visibility information is only updated when such an index item
is accessed (lazy update), and by that time new space would already
have been used for the index insertion. If the information were
provided synchronously, the dead space could be reclaimed much earlier
(for insertions on a page that has dead tuples), which would reduce
the chances of bloat.
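
To illustrate the difference, here is a toy model in Python. It is only
a sketch and not PostgreSQL code: ToyIndex, Entry, known_dead and
run_workload are hypothetical names, a "page split" is reduced to
appending a new page, and pruning is assumed to be allowed whenever
the page knows an entry is dead.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    key: int
    dead: bool = False        # true state, as the heap would report it
    known_dead: bool = False  # what the index page itself knows (hint)


@dataclass
class ToyIndex:
    capacity: int             # entries per page (assumed fixed)
    synchronous: bool         # True: delete marks the index entry at once
    pages: list = field(default_factory=lambda: [[]])

    def insert(self, key: int) -> None:
        page = self.pages[-1]
        if len(page) >= self.capacity:
            # Page is full: reclaim only entries the index knows are dead.
            page[:] = [e for e in page if not e.known_dead]
        if len(page) >= self.capacity:
            # Nothing reclaimable: grow the index (stands in for a split).
            page = []
            self.pages.append(page)
        page.append(Entry(key))

    def delete(self, key: int) -> None:
        for page in self.pages:
            for e in page:
                if e.key == key and not e.dead:
                    e.dead = True
                    if self.synchronous:
                        e.known_dead = True
                    return

    def scan(self, key: int) -> bool:
        """Look up key; lazily set the dead hint on entries found dead."""
        found_live = False
        for page in self.pages:
            for e in page:
                if e.key == key:
                    if e.dead:
                        e.known_dead = True  # lazy, kill-tuple style hint
                    else:
                        found_live = True
        return found_live


def run_workload(synchronous: bool, scan_before_insert: bool = False) -> int:
    """Fill one page, delete everything, insert again; return page count."""
    idx = ToyIndex(capacity=4, synchronous=synchronous)
    for k in range(4):
        idx.insert(k)
    for k in range(4):
        idx.delete(k)
    if scan_before_insert:
        for k in range(4):
            idx.scan(k)
    for k in range(4, 8):
        idx.insert(k)
    return len(idx.pages)
```

In this model the lazy variant ends up with two pages unless a scan
happens to visit the dead entries before the next insert arrives,
whereas the synchronous variant reuses the original page.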

>
> A much better idea is to work out how to avoid index bloat at cause. If
> we are running an UPDATE and we cannot get a cleanup lock, we give up and
> do a non-HOT update, causing the index to bloat. It seems better to wait
> for a short period to see if we can get the cleanup lock. The short period
> is currently 0, so lets start there and vary the duration of wait upwards
> proportionally as the index gets more bloated.
>

I think this is a separate and good way of avoiding bloat, but
independently of it, having something like what we discussed above
would further reduce the chances of bloat for a non-HOT update in
the scenario you describe.
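
For reference, the proportional wait quoted above could be sketched
roughly as below (Python, purely illustrative). The function names,
the bloat_fraction input and the 10 ms cap are assumptions made for
the example; nothing here corresponds to an existing PostgreSQL
setting or API.

```python
def cleanup_lock_wait_ms(bloat_fraction: float, max_wait_ms: float = 10.0) -> float:
    """Wait budget that grows linearly with index bloat.

    bloat_fraction is a hypothetical estimate in [0, 1]; 0 gives the
    current behaviour of not waiting at all.
    """
    clamped = min(max(bloat_fraction, 0.0), 1.0)
    return clamped * max_wait_ms


def update_path(lock_available_after_ms: float, bloat_fraction: float,
                max_wait_ms: float = 10.0) -> str:
    """Return which kind of update the model would perform.

    lock_available_after_ms is how long the cleanup lock stays busy.
    If it frees up within the wait budget the update stays HOT,
    otherwise the model gives up and does a non-HOT update.
    """
    wait = cleanup_lock_wait_ms(bloat_fraction, max_wait_ms)
    return "HOT" if lock_available_after_ms <= wait else "non-HOT"
```

With no bloat the budget is zero, so any contention on the lock falls
back to a non-HOT update; as the bloat estimate rises the same
contention is increasingly waited out.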

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
