Re: [HACKERS] [Proposal] Allow users to specify multiple tables in VACUUM commands

Michael Paquier Sun, 24 Sep 2017 22:42:53 -0700

On Sat, Sep 23, 2017 at 12:56 AM, Bossart, Nathan <bossa...@amazon.com> wrote:
> On 9/21/17, 9:55 PM, "Michael Paquier" <michael.paqu...@gmail.com> wrote:
>> I still think that ExecVacuum() should pass a list of VacuumRelation
>> objects to vacuum(), and get_rel_oids() should take in input a list,
>> and return a completed lists. This way the decision-making of doing
>> everything in the same transaction should happens once in vacuum().
>> And actually, if several relations are defined with VACUUM, your patch
>> would not use one transaction per table as use_own_xacts would be set
>> to false. I think that Tom meant that relations whose processed has
>> finished have to be committed immediately. Per your patch, the commit
>> happens once all relations are committed.
>
> Sorry, I must have misunderstood.  I've attached an updated patch that
> looks more like what you described.  I also cleaned up the test cases
> a bit.
>
> IIUC the only time use_own_xacts will be false is when we are only
> doing ANALYZE and at least one of the following is true:
>
>         1. We are in a transaction block.
>         2. We are processing only one relation.


Yes.

> From the code, it appears that vacuum_rel() always starts and commits a
> new transaction for each relation:
>
>          * vacuum_rel expects to be entered with no transaction active; it 
> will
>          * start and commit its own transaction.  But we are called by an SQL

Yes.

> So, by ensuring that get_rel_oids() returns a list whenever multiple
> tables are specified, we are making sure that commands like
>
>         ANALYZE table1, table2, table3;
>
> create transactions for each processed relation (as long as they are
> not inside a transaction block).

Yes.

> I suppose the alternative would be
> to call vacuum() for each relation and to remove the restriction that
> we must be processing more than one relation for use_own_xacts to be
> true.

The main point of my comment is that like ExecVacuum(), vacuum()
should be a high-level function where is decided if multiple
transactions should be run or not. By calling vacuum() multiple times
you break this promise. vacuum_rel should be the one working with
individual transactions.

Here is the diff between v19 and v21 that matters here:
    /* Now go through the common routine */
-   if (vacstmt->rels == NIL)
-       vacuum(vacstmt->options, NULL, &params, NULL, isTopLevel);
-   else
-   {
-       ListCell *lc;
-       foreach(lc, vacstmt->rels)
-           vacuum(vacstmt->options, lfirst_node(VacuumRelation, lc),
-                  &params, NULL, isTopLevel);
-   }
+   vacuum(vacstmt->options, vacstmt->rels, &params, NULL, isTopLevel);
If you do so, for an ANALYZE with multiple relations you would finish
by using the same transaction for all relations. I think that we had
better be consistent with VACUUM when not using an outer transaction
so as tables are analyzed and committed one by one. This does not
happen here: a unique transaction is used when using a list of
non-partitioned tables.

On Sun, Sep 24, 2017 at 4:37 AM, Bossart, Nathan <bossa...@amazon.com> wrote:
> Here is a version of the patch without the switch to AutovacMemCxt in
> autovacuum_do_vac_analyze(), which should no longer be necessary after
> 335f3d04.

Thanks for the new version.

+       if (!IsAutoVacuumWorkerProcess())
+           ereport(WARNING,
+                 (errmsg("skipping \"%s\" --- relation no longer exists",
+                         relation->relname)));
I like the use of WARNING here, but we could use as well a LOG to be
consistent when a lock obtention is skipped.

+            * going to commit this transaction and begin a new one between now
+            * and then.
+            */
+           relid = RangeVarGetRelid(relinfo->relation, NoLock, false);
We will likely have to wait that the matters discussed in
https://www.postgresql.org/message-id/25023.1506107...@sss.pgh.pa.us
are settled.

+VACUUM FULL vactst, vactst, vactst, vactst;
This is actually a waste of cycles.

I think I don't have much other comments about this patch.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [Proposal] Allow users to specify multiple tables in VACUUM commands

Reply via email to