On Tue, Jan 26, 2016 at 11:37 PM, Vinayak Pokale <vinpok...@gmail.com> wrote:
> Hi,
> Please find attached updated patch with an updated interface.

Well, this isn't right.  You've got this sort of thing:

+            scanned_index_pages += RelationGetNumberOfBlocks(Irel[i]);
+            /* Report progress to the statistics collector */
+            pgstat_report_progress_update_message(0, progress_message);
+            pgstat_report_progress_update_counter(1, scanned_heap_pages);
+            pgstat_report_progress_update_counter(3, scanned_index_pages);
+            pgstat_report_progress_update_counter(4,
vacrelstats->num_index_scans + 1);

The point of having pgstat_report_progress_update_counter() is so that
you can efficiently update a single counter without having to update
everything, when only one counter has changed.  But here you are
calling this function a whole bunch of times in a row, which
completely misses the point - if you are updating all the counters,
it's more efficient to use an interface that does them all at once
instead of one at a time.

But there's a second problem here, too, which is that I think you've
got this code in the wrong place.  The second point of the
pgstat_report_progress_update_counter interface is that this should be
cheap enough that we can do it every time the counter changes.  That's
not what you are doing here.  You're updating the counters at various
points in the code and just pushing new values for all of them
regardless of which ones have changed.  I think you should find a way
that you can update the value immediately at the exact moment it
changes.  If that seems like too much of a performance hit we can talk
about it, but I think the value of this feature will be greatly
weakened if users can't count on it to be fully and continuously up to
the moment.  When something gets stuck, you want to know where it's
stuck, not approximately kinda where it's stuck.

+                if(!scan_all)
+                    scanned_heap_pages = scanned_heap_pages +

I don't want to be too much of a stickler for details here, but it
seems to me that this is an outright lie.  The number of scanned pages
does not go up when we decide to skip some pages, because scanning and
skipping aren't the same thing.  If we're only going to report one
number here, it needs to be called something like "current heap page",
and then you can just report blkno at the top of each iteration of
lazy_scan_heap's main loop.  If you want to report the numbers of
scanned and skipped pages separately that'd be OK too, but you can't
call it the number of scanned pages and then actually report a value
that is not that.

+        /*
+         * Reporting vacuum progress to statistics collector
+         */

This patch doesn't report anything to the statistics collector, nor should it.

Instead of making the SQL-visible function
pg_stat_get_vacuum_progress(), I think it should be something more
generic like pg_stat_get_command_progress().  Maybe VACUUM will be the
only command that reports into that feature for right now, but I'd
hope for us to change that pretty soon after we get the first patch

Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

Reply via email to