On 11/19/15 1:18 AM, Amit Langote wrote:
1) General purpose interface for (maintenance?) commands to report a set

I'm surprised no one has picked up on using this for DML. Certainly anyone works with ETL processes would love to be able to get some clue on the status of a long running query...

About pass 2, ie, lazy_index_vacuum() and
lazy_vacuum_heap(), I don't see how we can do better than reporting its
progress only after finishing all of it without any finer-grained
instrumentation. They are essentially block-box as far as the proposed
instrumentation approach is concerned. Being able to report progress per
index seems good but as a whole, a user would have to wait arbitrarily
long before numbers move forward. We might as well just report a bool
saying we're about to enter a potentially time-consuming index vacuum
round with possibly multiple indexes followed by lazy_vacuum_heap()
processing. Additionally, we can report the incremented count of the
vacuuming round (pass 2) once we are through it.

Another option is to provide the means for the index scan routines to report their progress. Maybe every index AM won't use it, but it'd certainly be a lot better than staring at a long_running boolean.

Note that we can leave them out of
percent_done of overall vacuum progress. Until we have a good solution for
number (3) above, it seems to difficult to incorporate index pages into
overall progress.

IMHO we need to either put a big caution sign on any % estimate that it could be wildly off, or just forgo it completely for now. I'll bet that if we don't provide it some enterprising users will figure out the best way to do this (similar to how the bloat estimate query has evolved over time).

Even if we never get a % done indicator, just being able to see what 'position' a command is at will be very valuable.

As someone pointed out upthread, the final heap truncate phase can take
arbitrarily long and is outside the scope of lazy_scan_heap() to
instrument. Perhaps a bool, say, waiting_heap_trunc could be reported for
the same. Note that, it would have to be reported from lazy_vacuum_rel().

ISTM this is similar to the problem of reporting index status, namely that a progress reporting method needs to accept reports from multiple places in the code.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to