Re: [HACKERS] More stats about skipped vacuums

2018-02-09 Thread Kyotaro HORIGUCHI
Hello.

At Thu, 08 Feb 2018 18:21:56 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI 
 wrote in 
<20180208.182156.96551245.horiguchi.kyot...@lab.ntt.co.jp>
> I suppose that the problem has not been resolved yet.

I found several bugs while studying this, but my conclusion is
that the decision required here is whether we regard the
unavailability of DSM as a fatal error, as we do for out of
memory. Maybe we can go in that direction, but just doing it
would certainly put some buildfarm animals (at least anole? smew
is no longer listed) out of commission.

I have not found the exact cause of the problem where the
regression test on the buildfarm animals always succeeds using
SysV shmem, while "postgres --boot" run by initdb alone can fail
using the same mechanism. But the regression test seems to keep
working if initdb sets max_connections to 20 or more. At least it
succeeds for me with max_connections = 20 and shared_buffers =
50MB on CentOS.

Finally, I'd like to propose the following.

 - Kill dynamic_shared_memory_type = none right now.

   * The server stops at startup if DSM is not available.

 - Let initdb set max_connections = 20 as the fallback value in
   that case (another proposed patch). The regression tests
   should succeed with that.

If we agree on this, I will be able to go forward.


I would like to hear opinions on this from experienced people.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: [HACKERS] More stats about skipped vacuums

2018-02-08 Thread Kyotaro HORIGUCHI
At Thu, 08 Feb 2018 18:04:15 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI 
 wrote in 
<20180208.180415.112312013.horiguchi.kyot...@lab.ntt.co.jp>
> > > I suggest we remove support for dynamic_shared_memory_type = none first,
> > > and see if we get any complaints.  If we don't, then future patches can
> > > rely on it being present.
> > 
> > If we remove it in v11, it'd still be maybe a year from now before we'd
> > have much confidence from that alone that nobody cares.  I think the lack
> > of complaints about it in 9.6 and 10 is a more useful data point.
> 
> So that means we can be assumed to be able to rely on the
> existence of DSM at present, since for over a year we have had no
> complaints despite the fact that DSM is silently turned on? And
> apart from that, are we ready to remove 'none' from the options
> of dynamic_shared_memory_type right now?

I found the following commit related to this.

| commit d41ab71712a4457ed39d5471b23949872ac91def
| Author: Robert Haas 
| Date:   Wed Oct 16 09:41:03 2013 -0400
| 
| initdb: Suppress dynamic shared memory when probing for max_connections.
| 
| This might not be the right long-term solution here, but it will
| hopefully turn the buildfarm green again.
| 
| Oversight noted by Andres Freund

The discussion is found here.

https://www.postgresql.org/message-id/ca+tgmoyhiigrcvssjhmbsebmof2zx_9_9rwd75cwvu99yrd...@mail.gmail.com

I suppose that the problem has not been resolved yet.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: [HACKERS] More stats about skipped vacuums

2018-02-08 Thread Kyotaro HORIGUCHI
Hello,

At Wed, 07 Feb 2018 16:59:20 -0500, Tom Lane  wrote in 
<3246.1518040...@sss.pgh.pa.us>
> Robert Haas  writes:
> > It seems to me that there was a thread where Tom proposed removing
> > support for dynamic_shared_memory_type = none.
> 
> I think you're recalling <32138.1502675...@sss.pgh.pa.us>, wherein
> I pointed out that
> 
> >>> Whether that's worth the trouble is debatable.  The current code
> >>> in initdb believes that every platform has some type of DSM support
> >>> (see choose_dsm_implementation).  Nobody's complained about that,
> >>> and it certainly works on every buildfarm animal.  So for all we know,
> >>> dynamic_shared_memory_type = none is broken already.
> 
> (That was in fact in the same thread Kyotaro-san just linked to about
> reimplementing the stats collector.)
> 
> It's still true that we've no reason to believe there are any supported
> platforms that haven't got some sort of DSM.  Performance might be a
> different question, of course ... but it's hard to believe that
> transferring stats through DSM wouldn't be better than writing them
> out to files.

Good to hear. Thanks.

> > I suggest we remove support for dynamic_shared_memory_type = none first,
> > and see if we get any complaints.  If we don't, then future patches can
> > rely on it being present.
> 
> If we remove it in v11, it'd still be maybe a year from now before we'd
> have much confidence from that alone that nobody cares.  I think the lack
> of complaints about it in 9.6 and 10 is a more useful data point.

So that means we can be assumed to be able to rely on the
existence of DSM at present, since for over a year we have had no
complaints despite the fact that DSM is silently turned on? And
apart from that, are we ready to remove 'none' from the options
of dynamic_shared_memory_type right now?

If I can rely on DSM, the fallback machinery will not be required.


>   regards, tom lane

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: [HACKERS] More stats about skipped vacuums

2018-02-07 Thread Tom Lane
Robert Haas  writes:
> It seems to me that there was a thread where Tom proposed removing
> support for dynamic_shared_memory_type = none.

I think you're recalling <32138.1502675...@sss.pgh.pa.us>, wherein
I pointed out that

>>> Whether that's worth the trouble is debatable.  The current code
>>> in initdb believes that every platform has some type of DSM support
>>> (see choose_dsm_implementation).  Nobody's complained about that,
>>> and it certainly works on every buildfarm animal.  So for all we know,
>>> dynamic_shared_memory_type = none is broken already.

(That was in fact in the same thread Kyotaro-san just linked to about
reimplementing the stats collector.)

It's still true that we've no reason to believe there are any supported
platforms that haven't got some sort of DSM.  Performance might be a
different question, of course ... but it's hard to believe that
transferring stats through DSM wouldn't be better than writing them
out to files.

> I suggest we remove support for dynamic_shared_memory_type = none first,
> and see if we get any complaints.  If we don't, then future patches can
> rely on it being present.

If we remove it in v11, it'd still be maybe a year from now before we'd
have much confidence from that alone that nobody cares.  I think the lack
of complaints about it in 9.6 and 10 is a more useful data point.

regards, tom lane



Re: [HACKERS] More stats about skipped vacuums

2018-02-07 Thread Robert Haas
On Tue, Feb 6, 2018 at 8:34 PM, Kyotaro HORIGUCHI
 wrote:
> On top of that, it fails to run when
> dynamic_shared_memory_type = none, and it was accompanied by
> several cleanup complexities. The decision there was that we
> should just use static shared memory rather than add complexity
> for nothing, and that if it really needs to be expandable in the
> future, that is the time to use DSA (while still maintaining a
> fallback).

It seems to me that there was a thread where Tom proposed removing
support for dynamic_shared_memory_type = none.  The main reason that I
included that option initially was because it seemed silly to risk
causing problems for users whose dynamic shared memory facilities
didn't work for the sake of a feature that, at the time (9.4), had no
in-core users.

But things have shifted a bit since then.  We have had few complaints
about dynamic shared memory causing portability problems (except for
performance: apparently some implementations perform better than
others on some systems, and we need support for huge pages, but
neither of those things are a reason to disable it) and we now have
in-core use that is enabled by default.  I suggest we remove support
for dynamic_shared_memory_type = none first, and see if we get any
complaints.  If we don't, then future patches can rely on it being
present.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] More stats about skipped vacuums

2018-02-06 Thread Kyotaro HORIGUCHI
At Tue, 06 Feb 2018 19:24:37 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI 
 wrote in 
<20180206.192437.229464841.horiguchi.kyot...@lab.ntt.co.jp>
> At Tue, 6 Feb 2018 14:50:01 +0900, Masahiko Sawada  
> wrote in 
> > The implementation of autovacuum work-items was changed (by commit
> > 31ae1638) because dynamic shared memory is not portable enough. IIUC
> > this patch is going to do a similar thing. Since the stats collector is
> > also a critical part of the server, should we consider whether we can
> > change it? Or is the portability problem not relevant to this patch?
> 
> Thank you for the pointer. I dug out the following thread from
> this, and the patch seems to be the consequence of that
> discussion. I'll study it and think about what to do here.
> 
> https://www.postgresql.org/message-id/20170814005656.d5tvz464qkmz6...@alap3.anarazel.de

Done. The dominant reason for ripping it out was that the
work-items array was allocated in a fixed-size DSA segment at
process startup time and would not be resized.

On top of that, it fails to run when
dynamic_shared_memory_type = none, and it was accompanied by
several cleanup complexities. The decision there was that we
should just use static shared memory rather than add complexity
for nothing, and that if it really needs to be expandable in the
future, that is the time to use DSA (while still maintaining a
fallback).


Thinking about this, I believe this patch has good reason to use
DSA but needs some additional work.

- A fallback mechanism for dynamic_shared_memory_type = none

  This means that the old file-based code lives alongside the
  DSA code. It is used when '= none' is set and on failure of the
  DSA mechanism.

- Out-of-transaction cleanup code

  Something like the following patch.

  
https://www.postgresql.org/message-id/20170814231638.x6vgnzlr7eww4bui@alvherre.pgsql

And as known problems:

- Make it use fewer LWLocks.

- A dshash shrink mechanism. We may need to find a way to detect
  when shrinking is necessary.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: [HACKERS] More stats about skipped vacuums

2018-02-06 Thread Kyotaro HORIGUCHI
At Tue, 6 Feb 2018 14:50:01 +0900, Masahiko Sawada  
wrote in 
> On Mon, Dec 11, 2017 at 8:15 PM, Kyotaro HORIGUCHI
>  wrote:
> > I considered dshash for pgstat.c, and the attached is a *PoC*
> > patch, which is not full-fledged and only works under a
> > not-so-concurrent situation.
> >
> > - Made the stats collector an auxiliary process. A crash of the
> >   stats collector leads to a whole-server restart.
> >
> > - dshash lacks the capability of sequential scan, so I added it.
> >
> > - Also added a snapshot function to dshash. It just copies the
> >   underlying DSA segments into local memory, but currently it
> >   doesn't acquire dshash-level locks at all. I tried the same
> >   thing with resize, but it leads to very quick exhaustion of
> >   LWLocks. An LWLock for the whole dshash would be required (and
> >   it would also be useful for resize() and sequential scan).
> >
> > - The current dshash doesn't shrink at all. Such a feature will
> >   also be required. (A server restart shrinks the hashes in the
> >   same way as before, but a bloated dshash requires copying more
> >   memory than necessary when taking a snapshot.)
> >
> > The minimum size of a DSA is about 1MB. Copying entry-by-entry
> > into a (non-DS) hash might be better than copying the underlying
> > DSA as a whole, and the DSA/dshash snapshot feels a bit dirty
> > anyway.
> >
> >
> > Could anyone give me opinions or suggestions?
> >
> 
> The implementation of autovacuum work-items was changed (by commit
> 31ae1638) because dynamic shared memory is not portable enough. IIUC
> this patch is going to do a similar thing. Since the stats collector is
> also a critical part of the server, should we consider whether we can
> change it? Or is the portability problem not relevant to this patch?

Thank you for the pointer. I dug out the following thread from
this, and the patch seems to be the consequence of that
discussion. I'll study it and think about what to do here.

https://www.postgresql.org/message-id/20170814005656.d5tvz464qkmz6...@alap3.anarazel.de

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: [HACKERS] More stats about skipped vacuums

2018-02-05 Thread Masahiko Sawada
On Mon, Dec 11, 2017 at 8:15 PM, Kyotaro HORIGUCHI
 wrote:
> At Mon, 27 Nov 2017 13:51:22 -0500, Robert Haas  wrote 
> in 
>> On Mon, Nov 27, 2017 at 1:49 AM, Kyotaro HORIGUCHI
>>  wrote:
>> > Hmmm. Okay, we must make the stats collector more efficient if we
>> > want to have additional counters of smaller significance in the
>> > table stats. Currently sizeof(PgStat_StatTabEntry) is 168
>> > bytes. The whole of the patchset increases it to 232 bytes. Thus
>> > the size of a stats file for a database with 10000 tables
>> > increases from about 1.7MB to 2.4MB.  DSM and shared dynahash are
>> > not dynamically expandable, so placing stats in a shared hash
>> > doesn't seem feasible. Stats as a regular table could work, but
>> > it seems too much.
>>
>> dshash, which is already committed, is both DSM-based and dynamically
>> expandable.
>
> Yes, I forgot about that. We can just copy memory blocks to take
> a snapshot of stats.
>
>> > Is it acceptable to add a new section containing these new
>> > counters, which is just loaded as a byte sequence, with parsing
>> > (and filling the corresponding hash) postponed until a counter
>> > in the section is really requested?  The new counters need to be
>> > shown in a separate stats view (maybe named pg_stat_vacuum).
>>
>> Still makes the stats file bigger.
>
> I considered dshash for pgstat.c, and the attached is a *PoC*
> patch, which is not full-fledged and only works under a
> not-so-concurrent situation.
>
> - Made the stats collector an auxiliary process. A crash of the
>   stats collector leads to a whole-server restart.
>
> - dshash lacks the capability of sequential scan, so I added it.
>
> - Also added a snapshot function to dshash. It just copies the
>   underlying DSA segments into local memory, but currently it
>   doesn't acquire dshash-level locks at all. I tried the same
>   thing with resize, but it leads to very quick exhaustion of
>   LWLocks. An LWLock for the whole dshash would be required (and
>   it would also be useful for resize() and sequential scan).
>
> - The current dshash doesn't shrink at all. Such a feature will
>   also be required. (A server restart shrinks the hashes in the
>   same way as before, but a bloated dshash requires copying more
>   memory than necessary when taking a snapshot.)
>
> The minimum size of a DSA is about 1MB. Copying entry-by-entry
> into a (non-DS) hash might be better than copying the underlying
> DSA as a whole, and the DSA/dshash snapshot feels a bit dirty
> anyway.
>
>
> Could anyone give me opinions or suggestions?
>

The implementation of autovacuum work-items was changed (by commit
31ae1638) because dynamic shared memory is not portable enough. IIUC
this patch is going to do a similar thing. Since the stats collector is
also a critical part of the server, should we consider whether we can
change it? Or is the portability problem not relevant to this patch?

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] More stats about skipped vacuums

2017-12-11 Thread Kyotaro HORIGUCHI
At Mon, 27 Nov 2017 13:51:22 -0500, Robert Haas  wrote 
in 
> On Mon, Nov 27, 2017 at 1:49 AM, Kyotaro HORIGUCHI
>  wrote:
> > Hmmm. Okay, we must make the stats collector more efficient if we
> > want to have additional counters of smaller significance in the
> > table stats. Currently sizeof(PgStat_StatTabEntry) is 168
> > bytes. The whole of the patchset increases it to 232 bytes. Thus
> > the size of a stats file for a database with 10000 tables
> > increases from about 1.7MB to 2.4MB.  DSM and shared dynahash are
> > not dynamically expandable, so placing stats in a shared hash
> > doesn't seem feasible. Stats as a regular table could work, but
> > it seems too much.
> 
> dshash, which is already committed, is both DSM-based and dynamically
> expandable.

Yes, I forgot about that. We can just copy memory blocks to take
a snapshot of stats.

> > Is it acceptable to add a new section containing these new
> > counters, which is just loaded as a byte sequence, with parsing
> > (and filling the corresponding hash) postponed until a counter
> > in the section is really requested?  The new counters need to be
> > shown in a separate stats view (maybe named pg_stat_vacuum).
> 
> Still makes the stats file bigger.

I considered dshash for pgstat.c, and the attached is a *PoC*
patch, which is not full-fledged and only works under a
not-so-concurrent situation.

- Made the stats collector an auxiliary process. A crash of the
  stats collector leads to a whole-server restart.

- dshash lacks the capability of sequential scan, so I added it.

- Also added a snapshot function to dshash. It just copies the
  underlying DSA segments into local memory, but currently it
  doesn't acquire dshash-level locks at all. I tried the same
  thing with resize, but it leads to very quick exhaustion of
  LWLocks. An LWLock for the whole dshash would be required (and
  it would also be useful for resize() and sequential scan).

- The current dshash doesn't shrink at all. Such a feature will
  also be required. (A server restart shrinks the hashes in the
  same way as before, but a bloated dshash requires copying more
  memory than necessary when taking a snapshot.)

The minimum size of a DSA is about 1MB. Copying entry-by-entry
into a (non-DS) hash might be better than copying the underlying
DSA as a whole, and the DSA/dshash snapshot feels a bit dirty
anyway.
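
To illustrate the entry-by-entry idea, here is a minimal standalone
sketch (plain C with toy structures; the names are made up and this is
not backend code): only the occupied slots are copied into
backend-local memory, so the copy is proportional to the number of
live entries rather than to the size of the shared area.

#include <stdio.h>
#include <stdlib.h>

typedef struct StatEntry
{
    unsigned    rel;            /* stand-in for a relation OID */
    unsigned long dead;         /* some counter, e.g. dead tuples */
    int         used;           /* is this slot occupied? */
} StatEntry;

#define SHARED_CAPACITY 131072  /* stands in for the >= 1MB DSA-backed hash */

static StatEntry shared[SHARED_CAPACITY];

/*
 * Copy only the occupied entries into a freshly allocated local array;
 * the result grows with the number of live entries, not with the size
 * of the shared area.
 */
static StatEntry *
snapshot_entries(size_t *n_out)
{
    size_t      n = 0;
    size_t      j = 0;
    StatEntry  *local;

    for (size_t i = 0; i < SHARED_CAPACITY; i++)
        if (shared[i].used)
            n++;

    local = malloc((n > 0 ? n : 1) * sizeof(StatEntry));
    for (size_t i = 0; i < SHARED_CAPACITY; i++)
        if (shared[i].used)
            local[j++] = shared[i];

    *n_out = n;
    return local;
}

int
main(void)
{
    size_t      n;
    StatEntry  *snap;

    shared[42] = (StatEntry) {.rel = 16385, .dead = 77, .used = 1};

    snap = snapshot_entries(&n);
    printf("copied %zu of %d slots\n", n, SHARED_CAPACITY);
    free(snap);
    return 0;
}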


Could anyone give me opinions or suggestions?

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From f5c9c45384ec43734c0890dd875101defe6590bc Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Fri, 1 Dec 2017 14:34:47 +0900
Subject: [PATCH 1/4] Simple implementation of a local snapshot of dshash.

Add a snapshot feature to dshash. This makes a palloc'ed copy of the
underlying DSA and returns an unmodifiable dshash using the copied DSA.
---
 src/backend/lib/dshash.c | 74 +++-
 src/backend/utils/mmgr/dsa.c | 57 +-
 src/include/lib/dshash.h |  1 +
 src/include/utils/dsa.h  |  1 +
 4 files changed, 124 insertions(+), 9 deletions(-)

diff --git a/src/backend/lib/dshash.c b/src/backend/lib/dshash.c
index dd87573..973a826 100644
--- a/src/backend/lib/dshash.c
+++ b/src/backend/lib/dshash.c
@@ -112,6 +112,7 @@ struct dshash_table
 	size_t		size_log2;		/* log2(number of buckets) */
 	bool		find_locked;	/* Is any partition lock held by 'find'? */
 	bool		find_exclusively_locked;	/* ... exclusively? */
+	bool		is_snapshot;	/* Is this hash is a local snapshot?*/
 };
 
 /* Given a pointer to an item, find the entry (user data) it holds. */
@@ -228,6 +229,7 @@ dshash_create(dsa_area *area, const dshash_parameters *params, void *arg)
 
 	hash_table->find_locked = false;
 	hash_table->find_exclusively_locked = false;
+	hash_table->is_snapshot = false;
 
 	/*
 	 * Set up the initial array of buckets.  Our initial size is the same as
@@ -279,6 +281,7 @@ dshash_attach(dsa_area *area, const dshash_parameters *params,
 	hash_table->control = dsa_get_address(area, control);
 	hash_table->find_locked = false;
 	hash_table->find_exclusively_locked = false;
+	hash_table->is_snapshot = false;
 	Assert(hash_table->control->magic == DSHASH_MAGIC);
 
 	/*
@@ -321,6 +324,15 @@ dshash_destroy(dshash_table *hash_table)
 	size_t		i;
 
 	Assert(hash_table->control->magic == DSHASH_MAGIC);
+
+	/* this is a local copy */
+	if (hash_table->is_snapshot)
+	{
+		pfree(hash_table->area);
+		pfree(hash_table);
+		return;
+	}
+
 	ensure_valid_bucket_pointers(hash_table);
 
 	/* Free all the entries. */
@@ -355,6 +367,29 @@ dshash_destroy(dshash_table *hash_table)
 }
 
 /*
+ * take a local snapshot of a dshash table
+ */
+dshash_table *
+dshash_take_snapshot(dshash_table *org_table, dsa_area *new_area)
+{
+	dshash_table *new_table;
+
+	if (org_table->is_snapshot)
+		elog(ERROR, "cannot make local copy of 

Re: [HACKERS] More stats about skipped vacuums

2017-11-28 Thread Magnus Hagander
On Tue, Nov 28, 2017 at 12:16 AM, Tom Lane  wrote:

> Magnus Hagander  writes:
> > What I've been thinking about for that one before is if we could just
> > invent a protocol (shmq based maybe) whereby autovacuum can ask the stats
> > collector for a single table or index stat. If autovacuum never needs to
> > see a consistent view between multiple tables, I would think that's going
> > to be a win in a lot of cases.
>
> Perhaps.  Autovac might run through quite a few tables before it finds
> one in need of processing, though, so I'm not quite certain this would
> yield such great benefits in isolation.
>

Hmm. Good point.



> > However, when it comes to the stats system, I'd say that on any busy
> system
> > (which would be the ones to care about), the stats structures are still
> > going to be *written* a lot more than they are read.
>
> Uh, what?  The stats collector doesn't write those files at all except
> on-demand.
>

Oops. Missing one important word. They're going to be *written to* a lot
more than they are read. Meaning that each individual value is likely to be
updated many times before it's ever read. In memory, in the stats
collector.  So not talking about the files at all -- just the numbers,
independent of implementation.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: [HACKERS] More stats about skipped vacuums

2017-11-27 Thread Tom Lane
Magnus Hagander  writes:
> What I've been thinking about for that one before is if we could just
> invent a protocol (shmq based maybe) whereby autovacuum can ask the stats
> collector for a single table or index stat. If autovacuum never needs to
> see a consistent view between multiple tables, I would think that's going
> to be a win in a lot of cases.

Perhaps.  Autovac might run through quite a few tables before it finds
one in need of processing, though, so I'm not quite certain this would
yield such great benefits in isolation.

> However, when it comes to the stats system, I'd say that on any busy system
> (which would be the ones to care about), the stats structures are still
> going to be *written* a lot more than they are read.

Uh, what?  The stats collector doesn't write those files at all except
on-demand.

regards, tom lane



Re: [HACKERS] More stats about skipped vacuums

2017-11-27 Thread Magnus Hagander
On Mon, Nov 27, 2017 at 7:53 PM, Robert Haas  wrote:

> On Sun, Nov 26, 2017 at 3:19 PM, Tom Lane  wrote:
> > Robert Haas  writes:
> >> On Sat, Nov 25, 2017 at 12:09 PM, Tom Lane  wrote:
> >>> Mumble.  It's a property I'm pretty hesitant to give up, especially
> >>> since the stats views have worked like that since day one.  It's
> >>> inevitable that weakening that guarantee would break peoples' queries,
> >>> probably subtly.
> >
> >> You mean, queries against the stats views, or queries in general?  If
> >> the latter, by what mechanism would the breakage happen?
> >
> > Queries against the stats views, of course.
>
> Hmm.  Those are probably rare.  If we only took a snapshot of the
> statistics for the backends that explicitly access those views, that
> probably wouldn't be too crazy.
>
> Sorry if this is a stupid question, but how often and for what purpose
> do regular backends need the stats collector data, other than for
> querying the stats views?  I thought that the data was only used
> to decide whether to VACUUM/ANALYZE, and therefore would be accessed
> mostly by autovacuum, and for that you'd actually want the most
> up-to-date view of the stats for a particular table that is available,
> not any older snapshot.
>
>
Autovacuum resets the stats snapshot to make sure it sees fresh data.
Autovacuum in particular can probably be made a lot more efficient,
because it only ever looks at one relation at a time, I think.

What I've been thinking about for that one before is if we could just
invent a protocol (shmq based maybe) whereby autovacuum can ask the stats
collector for a single table or index stat. If autovacuum never needs to
see a consistent view between multiple tables, I would think that's going
to be a win in a lot of cases.

I don't think regular backends use them at all. But anybody looking at the
stats does, and it is pretty important there.

However, when it comes to the stats system, I'd say that on any busy system
(which would be the ones to care about), the stats structures are still
going to be *written* a lot more than they are read. We certainly don't
read them at the rate of once per transaction. A lot of the reads are also
limited to one database of course.

I wonder if we want to implement some sort of copy-on-read snapshot in the
stats collector itself. So instead of unconditionally publishing
everything, have the backends ask for it. When a backend asks for it, it
gets a "snapshot counter" or something from the stats collector, and on the
next write after that we do a copy-on-write if the snapshot is still
available. (No, I have not thought this through in detail.)

Or -- if we keep a per-database hashtable in dynamic shared memory (which
we can now), can we copy it into local memory in the backend fast enough
that we can hold a lock and just queue up the stats updates during the
copy? If we can copy the complete structure, that would fix one of the
bigger bottlenecks today, which is that we dump and rebuild the
hashtables as we go through the temp files.
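
To make that a bit more concrete, here is a minimal single-process
sketch (plain C, toy structures, no locking; all names are invented
for illustration and none of this is collector code): a reader asks
for a snapshot, and the collector preserves an entry only the first
time that entry is written after the request, so entries that are
never touched are never copied.

#include <stdbool.h>
#include <stdint.h>
#include <string.h>
#include <stdio.h>

#define NTABLES 4

typedef struct TableStats
{
    uint64_t    n_tup_ins;
    uint64_t    n_dead_tup;
} TableStats;

static TableStats live[NTABLES];    /* continuously updated by the collector */
static TableStats snap[NTABLES];    /* frozen values handed out to readers */
static bool snap_copied[NTABLES];   /* copied since the snapshot was requested? */
static bool snap_requested = false;

/* a backend asks for a snapshot: nothing is copied yet */
static void
request_snapshot(void)
{
    memset(snap_copied, 0, sizeof(snap_copied));
    snap_requested = true;
}

/* the collector applies an update, preserving the old value first if needed */
static void
record_insert(int rel, uint64_t n)
{
    if (snap_requested && !snap_copied[rel])
    {
        snap[rel] = live[rel];
        snap_copied[rel] = true;
    }
    live[rel].n_tup_ins += n;
}

/* reader: entries never written since the request are still current in "live" */
static TableStats
read_snapshot(int rel)
{
    return snap_copied[rel] ? snap[rel] : live[rel];
}

int
main(void)
{
    record_insert(0, 10);
    request_snapshot();
    record_insert(0, 5);        /* triggers the copy-on-write for table 0 */
    printf("snapshot sees %llu inserts, live has %llu\n",
           (unsigned long long) read_snapshot(0).n_tup_ins,
           (unsigned long long) live[0].n_tup_ins);
    return 0;
}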

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: [HACKERS] More stats about skipped vacuums

2017-11-27 Thread Robert Haas
On Sun, Nov 26, 2017 at 3:19 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Sat, Nov 25, 2017 at 12:09 PM, Tom Lane  wrote:
>>> Mumble.  It's a property I'm pretty hesitant to give up, especially
>>> since the stats views have worked like that since day one.  It's
>>> inevitable that weakening that guarantee would break peoples' queries,
>>> probably subtly.
>
>> You mean, queries against the stats views, or queries in general?  If
>> the latter, by what mechanism would the breakage happen?
>
> Queries against the stats views, of course.

Hmm.  Those are probably rare.  If we only took a snapshot of the
statistics for the backends that explicitly access those views, that
probably wouldn't be too crazy.

Sorry if this is a stupid question, but how often and for what purpose
do regular backends need the stats collector data, other than for
querying the stats views?  I thought that the data was only used
to decide whether to VACUUM/ANALYZE, and therefore would be accessed
mostly by autovacuum, and for that you'd actually want the most
up-to-date view of the stats for a particular table that is available,
not any older snapshot.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] More stats about skipped vacuums

2017-11-27 Thread Robert Haas
On Mon, Nov 27, 2017 at 1:49 AM, Kyotaro HORIGUCHI
 wrote:
> Hmmm. Okay, we must make the stats collector more efficient if we
> want to have additional counters of smaller significance in the
> table stats. Currently sizeof(PgStat_StatTabEntry) is 168
> bytes. The whole of the patchset increases it to 232 bytes. Thus
> the size of a stats file for a database with 10000 tables
> increases from about 1.7MB to 2.4MB.  DSM and shared dynahash are
> not dynamically expandable, so placing stats in a shared hash
> doesn't seem feasible. Stats as a regular table could work, but
> it seems too much.

dshash, which is already committed, is both DSM-based and dynamically
expandable.

> Is it acceptable to add a new section containing these new
> counters, which is just loaded as a byte sequence, with parsing
> (and filling the corresponding hash) postponed until a counter
> in the section is really requested?  The new counters need to be
> shown in a separate stats view (maybe named pg_stat_vacuum).

Still makes the stats file bigger.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] More stats about skipped vacuums

2017-11-26 Thread Michael Paquier
On Mon, Nov 27, 2017 at 5:19 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Sat, Nov 25, 2017 at 12:09 PM, Tom Lane  wrote:
>>> Mumble.  It's a property I'm pretty hesitant to give up, especially
>>> since the stats views have worked like that since day one.  It's
>>> inevitable that weakening that guarantee would break peoples' queries,
>>> probably subtly.
>
>> You mean, queries against the stats views, or queries in general?  If
>> the latter, by what mechanism would the breakage happen?
>
> Queries against the stats views, of course.

There has been much discussion on this thread, and the set of patches
as presented may hurt performance for users with a large number of
tables, so I am marking it as returned with feedback.
-- 
Michael



Re: [HACKERS] More stats about skipped vacuums

2017-11-26 Thread Michael Paquier
On Mon, Nov 27, 2017 at 2:49 AM, Tom Lane  wrote:
> Michael Paquier  writes:
>> ... Would you think
>> that it is acceptable to add the number of index scans that happened
>> with the verbose output then?
>
> I don't have an objection to it, but can't you tell that from VACUUM
> VERBOSE already?  There should be an "INFO:  scanned index" line for
> each scan.

Of course, sorry for the noise.
-- 
Michael



Re: [HACKERS] More stats about skipped vacuums

2017-11-26 Thread Robert Haas
On Sat, Nov 25, 2017 at 12:09 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> Of course, the other obvious question is whether we really need a
>> consistent snapshot, because that's bound to be pretty expensive even
>> if you eliminate the I/O cost.  Taking a consistent snapshot across
>> all 100,000 tables in the database even if we're only ever going to
>> access 5 of those tables doesn't seem like a good or scalable design.
>
> Mumble.  It's a property I'm pretty hesitant to give up, especially
> since the stats views have worked like that since day one.  It's
> inevitable that weakening that guarantee would break peoples' queries,
> probably subtly.

You mean, queries against the stats views, or queries in general?  If
the latter, by what mechanism would the breakage happen?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] More stats about skipped vacuums

2017-11-26 Thread Tom Lane
Michael Paquier  writes:
> ... Would you think
> that it is acceptable to add the number of index scans that happened
> with the verbose output then?

I don't have an objection to it, but can't you tell that from VACUUM
VERBOSE already?  There should be an "INFO:  scanned index" line for
each scan.

regards, tom lane



Re: [HACKERS] More stats about skipped vacuums

2017-11-26 Thread Michael Paquier
On Sun, Nov 26, 2017 at 9:59 AM, Tom Lane  wrote:
> I'd say so ... that's something the average user will never bother with,
> and even if they knew to bother, it's far from obvious what to do with
> the information.  Besides, I don't think you could just save the number
> of scans and nothing else.  For it to be meaningful, you'd at least have
> to know the prevailing work_mem setting and the number of dead tuples
> removed ... and then you'd need some info about your historical average
> and maximum number of dead tuples removed, so that you knew whether the
> last vacuum operation was at all representative.  So this is sounding
> like quite a lot of new counters, in support of perhaps 0.1% of the
> user population.  Most people are just going to set maintenance_work_mem
> as high as they can tolerate and call it good.

In all the PostgreSQL deployments I deal with, the database is
embedded with other things running in parallel and memory is something
that's shared between components, so being able to tune any of the
*_work_mem parameters more precisely has value (a couple of
applications are also doing autovacuum tuning at relation level).
Would you think it acceptable to add the number of index scans that
happened to the verbose output, then? Personally I could live with
that information. I also recall a thread complaining that VACUUM
VERBOSE already shows too much information, though I cannot put my
finger on it specifically now. With autovacuum_log_min_duration, it
is easy enough to trace a vacuum pattern. The thing is that for now
the tuning is not that evident, and CPU cycles can be worth saving in
some deployments while memory can be extended more easily.
-- 
Michael



Re: [HACKERS] More stats about skipped vacuums

2017-11-25 Thread Tom Lane
Michael Paquier  writes:
> I am not arguing about skipped vacuum data here (I don't think much of
> it, by the way), but about the number of index scans done by the last
> vacuum or autovacuum. This helps in tuning autovacuum_work_mem and
> maintenance_work_mem. Is the bar still too high for that?

I'd say so ... that's something the average user will never bother with,
and even if they knew to bother, it's far from obvious what to do with
the information.  Besides, I don't think you could just save the number
of scans and nothing else.  For it to be meaningful, you'd at least have
to know the prevailing work_mem setting and the number of dead tuples
removed ... and then you'd need some info about your historical average
and maximum number of dead tuples removed, so that you knew whether the
last vacuum operation was at all representative.  So this is sounding
like quite a lot of new counters, in support of perhaps 0.1% of the
user population.  Most people are just going to set maintenance_work_mem
as high as they can tolerate and call it good.

regards, tom lane



Re: [HACKERS] More stats about skipped vacuums

2017-11-25 Thread Michael Paquier
On Sun, Nov 26, 2017 at 12:34 AM, Tom Lane  wrote:
> Michael Paquier  writes:
>> On Sat, Nov 25, 2017 at 12:55 AM, Robert Haas  wrote:
>>> I think that's a good thing to worry about.   In the past, Tom has
>>> expressed reluctance to make stats tables that have a row per table
>>> any wider at all, IIRC.
>
>> Tom, any opinions to offer here? pg_stat_all_tables is currently at 22
>> columns (this takes two full lines on my terminal with a font size at
>> 13). With the first patch of what's proposed on this thread there
>> would be 24 columns. Perhaps it would be time to split the vacuum
>> statistics into a new view like pg_stat_tables_vacuum or similar?
>
> My concern is not with the width of any view --- you can always select a
> subset of columns if a view is too wide for your screen.  The issue is the
> number of counters in the stats collector's state.  We know, without any
> doubt, that widening PgStat_StatTabEntry causes serious pain to people
> with large numbers of tables; and they have no recourse when we do it.
> So the bar to adding more counters is very high IMO.  I won't say never,
> but I do doubt that something like skipped vacuums should make the cut.

I am not arguing about skipped vacuum data here (I don't think much of
it, by the way), but about the number of index scans done by the last
vacuum or autovacuum. This helps in tuning autovacuum_work_mem and
maintenance_work_mem. Is the bar still too high for that?
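
To spell out the arithmetic behind that (a standalone sketch in plain
C; the 6-byte figure is sizeof(ItemPointerData), everything else is a
made-up example): the number of index scans is roughly the dead-tuple
count divided by how many TIDs fit in the memory budget, which is why
a reported scan count above one points directly at the *_work_mem
setting to raise.

#include <stdint.h>
#include <stdio.h>

int
main(void)
{
    const uint64_t dead_tuples = 50000000;                  /* removed by the last (auto)vacuum */
    const uint64_t work_mem_bytes = 64ULL * 1024 * 1024;    /* autovacuum_work_mem = 64MB */
    const uint64_t tid_size = 6;                            /* sizeof(ItemPointerData) */

    uint64_t    tids_per_pass = work_mem_bytes / tid_size;
    uint64_t    index_scans = (dead_tuples + tids_per_pass - 1) / tids_per_pass;

    printf("expected index scans: %llu\n", (unsigned long long) index_scans);
    /* more than one scan suggests raising the relevant *_work_mem, or vacuuming earlier */
    return 0;
}
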
-- 
Michael



Re: [HACKERS] More stats about skipped vacuums

2017-11-25 Thread Tom Lane
Robert Haas  writes:
> Of course, the other obvious question is whether we really need a
> consistent snapshot, because that's bound to be pretty expensive even
> if you eliminate the I/O cost.  Taking a consistent snapshot across
> all 100,000 tables in the database even if we're only ever going to
> access 5 of those tables doesn't seem like a good or scalable design.

Mumble.  It's a property I'm pretty hesitant to give up, especially
since the stats views have worked like that since day one.  It's
inevitable that weakening that guarantee would break peoples' queries,
probably subtly.

What we need is a way to have a consistent snapshot without implementing
it by copying 100,000 tables' worth of data for every query.  Hmm, I heard
of a technique called MVCC once ...

regards, tom lane



Re: [HACKERS] More stats about skipped vacuums

2017-11-25 Thread Robert Haas
On Sat, Nov 25, 2017 at 10:34 AM, Tom Lane  wrote:
> If we could get rid of the copy-to-a-temporary-file technology for
> transferring the stats collector's data to backends, then this problem
> would probably vanish or at least get a lot less severe.  But that seems
> like a nontrivial project.  With the infrastructure we have today, we
> could probably keep the stats tables in a DSM segment; but how would
> a backend get a consistent snapshot of them?

I suppose the obvious approach is to have a big lock around the
statistics data proper; this could be taken in shared mode to take a
snapshot or in exclusive mode to update statistics.  In addition,
create one or more queues where statistics messages can be enqueued in
lieu of updating the main statistics data directly.  If that doesn't
perform well enough, you could keep two copies of the statistics, A
and B.  At any given time, one copy is quiescent and the other copy is
being updated.  Periodically, at a time when we know that nobody is
taking a snapshot of the statistics, they reverse roles.
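
For what it's worth, here is a minimal single-process sketch of the
A/B idea (plain C, toy types, no locks or message queues; the names
are invented and this is only an illustration, not a design): readers
only ever see the quiescent copy, and the roles reverse only when no
snapshot is open.

#include <stdint.h>
#include <stdio.h>

typedef struct Stats
{
    uint64_t    vacuum_count;
    uint64_t    n_dead_tup;
} Stats;

static Stats copies[2];
static int  write_idx = 0;          /* copy currently receiving updates */
static int  open_snapshots = 0;     /* snapshots open against copies[1 - write_idx] */

/* queued stats messages drain into the write copy */
static void
apply_update(uint64_t dead)
{
    copies[write_idx].n_dead_tup += dead;
}

/* readers only ever look at the quiescent copy */
static const Stats *
open_snapshot(void)
{
    open_snapshots++;
    return &copies[1 - write_idx];
}

static void
close_snapshot(void)
{
    open_snapshots--;
}

/* role reversal is only legal while nobody holds a snapshot */
static void
maybe_swap_roles(void)
{
    if (open_snapshots == 0)
    {
        copies[1 - write_idx] = copies[write_idx];  /* catch the idle copy up */
        write_idx = 1 - write_idx;
    }
}

int
main(void)
{
    const Stats *s;

    apply_update(100);
    maybe_swap_roles();
    s = open_snapshot();
    printf("snapshot sees %llu dead tuples\n", (unsigned long long) s->n_dead_tup);
    close_snapshot();
    return 0;
}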

Of course, the other obvious question is whether we really need a
consistent snapshot, because that's bound to be pretty expensive even
if you eliminate the I/O cost.  Taking a consistent snapshot across
all 100,000 tables in the database even if we're only ever going to
access 5 of those tables doesn't seem like a good or scalable design.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] More stats about skipped vacuums

2017-11-25 Thread Tom Lane
Michael Paquier  writes:
> On Sat, Nov 25, 2017 at 12:55 AM, Robert Haas  wrote:
>> I think that's a good thing to worry about.   In the past, Tom has
>> expressed reluctance to make stats tables that have a row per table
>> any wider at all, IIRC.

> Tom, any opinions to offer here? pg_stat_all_tables is currently at 22
> columns (this takes two full lines on my terminal with a font size at
> 13). With the first patch of what's proposed on this thread there
> would be 24 columns. Perhaps it would be time to split the vacuum
> statistics into a new view like pg_stat_tables_vacuum or similar?

My concern is not with the width of any view --- you can always select a
subset of columns if a view is too wide for your screen.  The issue is the
number of counters in the stats collector's state.  We know, without any
doubt, that widening PgStat_StatTabEntry causes serious pain to people
with large numbers of tables; and they have no recourse when we do it.
So the bar to adding more counters is very high IMO.  I won't say never,
but I do doubt that something like skipped vacuums should make the cut.

If we could get rid of the copy-to-a-temporary-file technology for
transferring the stats collector's data to backends, then this problem
would probably vanish or at least get a lot less severe.  But that seems
like a nontrivial project.  With the infrastructure we have today, we
could probably keep the stats tables in a DSM segment; but how would
a backend get a consistent snapshot of them?

regards, tom lane



Re: [HACKERS] More stats about skipped vacuums

2017-11-25 Thread Michael Paquier
On Sat, Nov 25, 2017 at 12:55 AM, Robert Haas  wrote:
> On Tue, Nov 21, 2017 at 2:09 AM, Kyotaro HORIGUCHI
>  wrote:
>> Yes, my concern here is how many columns we can allow in a stats
>> view. I think I'm a bit too worried about that.
>
> I think that's a good thing to worry about.   In the past, Tom has
> expressed reluctance to make stats tables that have a row per table
> any wider at all, IIRC.

Tom, any opinions to offer here? pg_stat_all_tables is currently at 22
columns (this takes two full lines on my terminal with a font size at
13). With the first patch of what's proposed on this thread there
would be 24 columns. Perhaps it would be time to split the vacuum
statistics into a new view like pg_stat_tables_vacuum or similar?
-- 
Michael



Re: [HACKERS] More stats about skipped vacuums

2017-11-24 Thread Robert Haas
On Tue, Nov 21, 2017 at 2:09 AM, Kyotaro HORIGUCHI
 wrote:
> Yes, my concern here is how many columns we can allow in a stats
> view. I think I'm a bit too worried about that.

I think that's a good thing to worry about.   In the past, Tom has
expressed reluctance to make stats tables that have a row per table
any wider at all, IIRC.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] More stats about skipped vacuums

2017-11-21 Thread Michael Paquier
On Wed, Nov 22, 2017 at 1:08 PM, Kyotaro HORIGUCHI
 wrote:
> At Wed, 22 Nov 2017 08:20:22 +0900, Michael Paquier 
>  wrote in 
> 
>> On Tue, Nov 21, 2017 at 4:09 PM, Kyotaro HORIGUCHI
>>  wrote:
>> > By the way, I'm uneasy that 'last_vacuum_index_scans' (and
>> > vacuum_fail_count in 0002, and others in 0003 and 0004) covers
>> > both the VACUUM command and autovacuum, while last_vacuum and
>> > vacuum_count cover only the command. Splitting it into
>> > vacuum/autovacuum seems like nonsense, but the name is confusing.
>> > Do you have any idea?
>>
>> Hm. I think that you should actually have two fields, one for manual
>> vacuum and one for autovacuum, because each is tied respectively to
>> maintenance_work_mem and autovacuum_work_mem. This way admins are able
>
> That's very convincing to me. Thanks for the suggestion.
>
>> to tune each one of those parameters depending on a look at
>> pg_stat_all_tables. So those should be named perhaps
>> last_vacuum_index_scans and last_autovacuum_index_scans?
>
> Agreed. I'll do so in the next version.

Thanks for considering the suggestion.

> # I forgot to add the version to the patch files...

Don't worry about that. That's not a problem for me; I'll just keep
track of the last entry. With the room I have, I'll stay focused on
0001, by the way. Others are of course welcome to look at 0002 and
onwards.
-- 
Michael



Re: [HACKERS] More stats about skipped vacuums

2017-11-21 Thread Kyotaro HORIGUCHI
Hello,

At Wed, 22 Nov 2017 08:20:22 +0900, Michael Paquier  
wrote in 
> On Tue, Nov 21, 2017 at 4:09 PM, Kyotaro HORIGUCHI
>  wrote:
> > By the way, I'm uneasy that 'last_vacuum_index_scans' (and
> > vacuum_fail_count in 0002, and others in 0003 and 0004) covers
> > both the VACUUM command and autovacuum, while last_vacuum and
> > vacuum_count cover only the command. Splitting it into
> > vacuum/autovacuum seems like nonsense, but the name is confusing.
> > Do you have any idea?
> 
> Hm. I think that you should actually have two fields, one for manual
> vacuum and one for autovacuum, because each is tied respectively to
> maintenance_work_mem and autovacuum_work_mem. This way admins are able

That's very convincing to me. Thanks for the suggestion.

> to tune each one of those parameters depending on a look at
> pg_stat_all_tables. So those should be named perhaps
> last_vacuum_index_scans and last_autovacuum_index_scans?

Agreed. I'll do so in the next version.

# I forgot to add the version to the patch files...

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: [HACKERS] More stats about skipped vacuums

2017-11-21 Thread Michael Paquier
On Tue, Nov 21, 2017 at 4:09 PM, Kyotaro HORIGUCHI
 wrote:
> By the way, I'm uneasy that 'last_vacuum_index_scans' (and
> vacuum_fail_count in 0002, and others in 0003 and 0004) covers
> both the VACUUM command and autovacuum, while last_vacuum and
> vacuum_count cover only the command. Splitting it into
> vacuum/autovacuum seems like nonsense, but the name is confusing.
> Do you have any idea?

Hm. I think that you should actually have two fields, one for manual
vacuum and one for autovacuum, because each is tied respectively to
maintenance_work_mem and autovacuum_work_mem. This way admins are able
to tune each one of those parameters depending on a look at
pg_stat_all_tables. So those should be named perhaps
last_vacuum_index_scans and last_autovacuum_index_scans?
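
Just to make the shape of that suggestion concrete, the counter side
would be something like the sketch below (hypothetical struct and
field names, not the actual PgStat_StatTabEntry definition):

#include <stdint.h>

typedef struct PgStat_StatTabEntrySketch
{
    /* ... existing per-table counters ... */
    int32_t     last_vacuum_index_scans;        /* manual VACUUM; tune maintenance_work_mem */
    int32_t     last_autovacuum_index_scans;    /* autovacuum; tune autovacuum_work_mem */
} PgStat_StatTabEntrySketch;
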
-- 
Michael



Re: [HACKERS] More stats about skipped vacuums

2017-11-20 Thread Kyotaro HORIGUCHI
Thank you for the comments.

At Sat, 18 Nov 2017 22:23:20 +0900, Michael Paquier  
wrote in 

Re: [HACKERS] More stats about skipped vacuums

2017-11-18 Thread Michael Paquier
On Thu, Nov 16, 2017 at 7:34 PM, Kyotaro HORIGUCHI
 wrote:
> At Wed, 15 Nov 2017 16:13:01 +0900, Michael Paquier 
>  wrote in 
> 
>>  pg_stat_get_mod_since_analyze(C.oid) AS n_mod_since_analyze,
>> +   pg_stat_get_vacuum_necessity(C.oid) AS vacuum_required,
>>  pg_stat_get_last_vacuum_time(C.oid) as last_vacuum,
>>  pg_stat_get_last_autovacuum_time(C.oid) as last_autovacuum,
>>  pg_stat_get_last_analyze_time(C.oid) as last_analyze,
>>  pg_stat_get_last_autoanalyze_time(C.oid) as last_autoanalyze,
>>  pg_stat_get_vacuum_count(C.oid) AS vacuum_count,
>> Please use spaces instead of tabs. Indentation is not consistent.
>
> Done. Thank you for pointing that out. (whitespace-mode showed me some
> similar inconsistencies in other places in the file...)

Yes, I am aware of those that get introduced here and there. Let's
not make things worse.

>> +   case PGSTAT_VACUUM_CANCELED:
>> +   phase = tabentry->vacuum_last_phase;
>> +   /* number of elements of phasestr above */
>> +   if (phase >= 0 && phase <= 7)
>> +   result = psprintf("%s while %s",
>> + status == PGSTAT_VACUUM_CANCELED ?
>> + "canceled" : "error",
>> + phasestr[phase]);
>> Such complication is not necessary. The phase parameter is updated by
>> individual calls of pgstat_progress_update_param(), so the information
>> shown here overlaps with the existing information in the "phase"
>> field.
>
> The "phase" is pg_stat_progress_vacuum's? If "complexy" means
> phasestr[phase], the "phase" cannot be overlap with
> last_vacuum_status since pg_stat_progress_vacuum's entry has
> already gone when someone looks into pg_stat_all_tables and see a
> failed vacuum status. Could you give a bit specific comment?

I mean that if you intend to report this information, you should just
use a separate column for it. Having a single column report two pieces
of information, here the type of error and potentially the moment
where it appeared, is harder to parse.

>> However, progress reports are here to allow users to do decisions
>> based on the activity of how things are working. This patch proposes
>> to add multiple new fields:
>> - oldest Xmin.
>> - number of index scans.
>> - number of pages truncated.
>> - number of pages that should have been truncated, but are not truncated.
>> Among all this information, as Sawada-san has already mentioned
>> upthread, the more index scans the less dead tuples you can store at
>> once, so autovacuum_work_mem ought to be increased. This is useful for
>> tuning and should be documented properly if reported to give
>> indications about vacuum behavior. The rest though, could indicate how
>> aggressive autovacuum is able to remove tail blocks and do its work.
>> But what really matters for users to decide if autovacuum should be
>> more aggressive is tracking the number of dead tuples, something which
>> is already evaluated.
>
> Hmm. I tend to agree. Such numbers are better shown as an
> average of the last n vacuums, or a maximum. I decided to show
> only last_vacuum_index_scans, and I think that someone can record
> it continuously elsewhere if they want.

As a user, what would you make of those numbers? How would they help
in tuning autovacuum for a relation? We need to clear up those
questions before deciding whether there are cases where they are useful.

>> Tracking the number of failed vacuum attempts is also something
>> helpful to understand how much the job is able to complete. As there
>> is already tracking vacuum jobs that have completed, it could be
>> possible, instead of logging activity when a vacuum job has failed, to
>> track the number of *begun* jobs on a relation. Then it is possible to
>> guess how many have failed by taking the difference between those that
>> completed properly. Having counters per failure types could also be a
>> possibility.
>
> Maybe pg_stat_all_tables is not the place to hold so many kinds
> of vacuum-specific information. pg_stat_vacuum_all_tables or
> something like that?

What do you have in mind? pg_stat_all_tables already includes counters
about the number of vacuums and analyze runs completed. I guess that
the number of failures, and the types of failures ought to be similar
counters at the same level.

>> For this commit fest, I would suggest a patch that simply adds
>> tracking for the number of index scans done, with documentation to
>> give recommendations about parameter tuning. I am switching the patch
>> to "waiting on author".
>
> OK, the patch has been split into the following four parts (not
> split by function, but by the kind of information to add).
> The first one is that patch.
>
> 0001. Adds pg_stat_all_tables.last_vacuum_index_scans. 

Re: [HACKERS] More stats about skipped vacuums

2017-11-16 Thread Kyotaro HORIGUCHI
Thank you for reviewing this.

At Wed, 15 Nov 2017 16:13:01 +0900, Michael Paquier  
wrote in 
>  pg_stat_get_mod_since_analyze(C.oid) AS n_mod_since_analyze,
> +   pg_stat_get_vacuum_necessity(C.oid) AS vacuum_required,
>  pg_stat_get_last_vacuum_time(C.oid) as last_vacuum,
>  pg_stat_get_last_autovacuum_time(C.oid) as last_autovacuum,
>  pg_stat_get_last_analyze_time(C.oid) as last_analyze,
>  pg_stat_get_last_autoanalyze_time(C.oid) as last_autoanalyze,
>  pg_stat_get_vacuum_count(C.oid) AS vacuum_count,
> Please use spaces instead of tabs. Indentation is not consistent.

Done. Thank you for pointing that out. (whitespace-mode showed me some
similar inconsistencies in other places in the file...)

> +   case PGSTAT_VACUUM_CANCELED:
> +   phase = tabentry->vacuum_last_phase;
> +   /* number of elements of phasestr above */
> +   if (phase >= 0 && phase <= 7)
> +   result = psprintf("%s while %s",
> + status == PGSTAT_VACUUM_CANCELED ?
> + "canceled" : "error",
> + phasestr[phase]);
> Such complication is not necessary. The phase parameter is updated by
> individual calls of pgstat_progress_update_param(), so the information
> shown here overlaps with the existing information in the "phase"
> field.

The "phase" is pg_stat_progress_vacuum's? If "complexy" means
phasestr[phase], the "phase" cannot be overlap with
last_vacuum_status since pg_stat_progress_vacuum's entry has
already gone when someone looks into pg_stat_all_tables and see a
failed vacuum status. Could you give a bit specific comment?

> @@ -210,7 +361,6 @@ pg_stat_get_blocks_fetched(PG_FUNCTION_ARGS)
> PG_RETURN_INT64(result);
>  }
> 
> -
>  Datum
>  pg_stat_get_blocks_hit(PG_FUNCTION_ARGS)
> Noise diff.

Removed.

> Thinking about trying to get something into core by the end of the
> commit fest, this patch presents multiple concepts at once which could
> be split into separate patches for simplicity:
> 1) Additional data fields to help in debugging completed vacuums.
> 2) Tracking of interrupted vacuum jobs in progress table.
> 3) Get state of vacuum job on error.
> 
> However, progress reports are here to allow users to do decisions
> based on the activity of how things are working. This patch proposes
> to add multiple new fields:
> - oldest Xmin.
> - number of index scans.
> - number of pages truncated.
> - number of pages that should have been truncated, but are not truncated.
> Among all this information, as Sawada-san has already mentioned
> upthread, the more index scans the less dead tuples you can store at
> once, so autovacuum_work_mem ought to be increased. This is useful for
> tuning and should be documented properly if reported to give
> indications about vacuum behavior. The rest though, could indicate how
> aggressive autovacuum is able to remove tail blocks and do its work.
> But what really matters for users to decide if autovacuum should be
> more aggressive is tracking the number of dead tuples, something which
> is already evaluated.

Hmm. I tend to agree. Such numbers are better shown as an
average of the last n vacuums, or a maximum. I decided to show
only last_vacuum_index_scans, and I think that someone can record
it continuously elsewhere if they want.

> Tracking the number of failed vacuum attempts is also something
> helpful to understand how much the job is able to complete. As there
> is already tracking vacuum jobs that have completed, it could be
> possible, instead of logging activity when a vacuum job has failed, to
> track the number of *begun* jobs on a relation. Then it is possible to
> guess how many have failed by taking the difference between those that
> completed properly. Having counters per failure types could also be a
> possibility.

Maybe pg_stat_all_tables is not the place to hold so many kinds
of vacuum-specific information. pg_stat_vacuum_all_tables or
something like that?

> For this commit fest, I would suggest a patch that simply adds
> tracking for the number of index scans done, with documentation to
> give recommendations about parameter tuning. I am switching the patch
> to "waiting on author".

OK, the patch has been split into the following four parts (not
split by function, but by the kind of information to add).
The first one is that patch.

0001. Adds pg_stat_all_tables.last_vacuum_index_scans. Documentation is added.

0002. Adds pg_stat_all_tables.vacuum_required, with primitive documentation.

0003. Adds pg_stat_all_tables.last_vacuum_status/autovacuum_fail_count,
   plus primitive documentation.

0004. Truncation information.


One concern about the pg_stat_all_tables view is the number of
predefined functions it uses. Currently it is 20 functions, and this
patch adds seven more. I feel it's 

Re: [HACKERS] More stats about skipped vacuums

2017-11-14 Thread Michael Paquier
On Mon, Oct 30, 2017 at 8:57 PM, Kyotaro HORIGUCHI
 wrote:
> At Thu, 26 Oct 2017 15:06:30 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI 
>  wrote in 
> <20171026.150630.115694437.horiguchi.kyot...@lab.ntt.co.jp>
>> At Fri, 20 Oct 2017 19:15:16 +0900, Masahiko Sawada  
>> wrote in