On 2016-06-16 09:50:09 -0500, Kevin Grittner wrote:
> On Wed, Jun 15, 2016 at 8:57 PM, Andres Freund <and...@anarazel.de> wrote:
> > The simplest version of the scenario I'm concerned about is that a tuple
> > in a tuple is *not* vacuumed, even though it's elegible to be removed
> > due to STO. If that tuple has toasted columns, it could be the that the
> > toast table was independently vacuumed (autovac considers main/toast
> > tables separately,
> If that were really true, why would we not have the problem in
> current production versions that the toast table could be vacuumed
> before the heap, leading to exactly the issue you are talking
> about?

The issue isn't there without the feature, because we (should) never
access a tuple/detoast a column when it's invisible enough for the
corresponding toast tuple to be vacuumed away. But with
old_snapshot_timeout that's obviously (intentionally) not the case
anymore.  Due to old_snapshot_threshold we'll prune tuples which,
without it, would still be considered HEAPTUPLE_RECENTLY_DEAD.

> It seems to me that a simple test shows that it is not the
> case that the heap is vacuumed without considering toast:

That's why I mentioned autovacuum:
         * Scan pg_class to determine which tables to vacuum.
         * We do this in two passes: on the first one we collect the list of 
         * relations and materialized views, and on the second one we collect
         * TOAST tables. The reason for doing the second pass is that during it 
         * want to use the main relation's pg_class.reloptions entry if the 
         * table does not have any, and we cannot obtain it unless we know
         * beforehand what's the main  table OID.
         * We need to check TOAST tables separately because in cases with short,
         * wide tables there might be proportionally much more activity in the
         * TOAST table than in its parent.
                tab->at_vacoptions = VACOPT_SKIPTOAST |
                        (dovacuum ? VACOPT_VACUUM : 0) |
                        (doanalyze ? VACOPT_ANALYZE : 0) |
                        (!wraparound ? VACOPT_NOWAIT : 0);
(note the skiptoast)
         * Remember the relation's TOAST relation for later, if the caller asked
         * us to process it.  In VACUUM FULL, though, the toast table is
         * automatically rebuilt by cluster_rel so we shouldn't recurse to it.
        if (!(options & VACOPT_SKIPTOAST) && !(options & VACOPT_FULL))
                toast_relid = onerel->rd_rel->reltoastrelid;
                toast_relid = InvalidOid;
        if (toast_relid != InvalidOid)
                vacuum_rel(toast_relid, relation, options, params);

> > or the horizon could change between the two heap scans,
> Not a problem in current production why?

Because the horizon will never go to a value which allows "surely dead"
tuples to be read, thus we never detoast columns from a tuple for which
we'd removed toast data. That's why we're performing visibility tests
(hopefully) everywhere, before accessing tuple contents (as opposed to
inspecting the header).

> > or pins could prevent vacuuming of one page, or ...).
> Not a problem in current production why?

Same reason.

> So far I am not seeing any way for TOAST tuples to be pruned in
> advance of the referencing heap tuples, nor any way for the problem
> you describe to happen in the absence of that.

Didn't I just list three different ways, only one of which you doubted
the veracity of? Saying "Not a problem in current production why"
doesn't change it being a problem.

> > It seems the easiest way to fix this would be to make
> > TestForOldSnapshot() "accept" SnapshotToast as well.
> I don't think that would be appropriate without testing the
> characteristics of the underlying table to see whether it should be
> excluded.

You mean checking whether it's a toast table? We could check that, but
since we never use a toast scan outside of toast, it doesn't seem

> But is the TOAST data ever updated without an update to
> the referencing heap tuple?

It shouldn't.

> If not, I don't see any benefit.



Andres Freund

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

Reply via email to