Re: [HACKERS] Storing pg_stat_statements query texts externally, pg_stat_statements in core

2014-01-28 Thread Peter Geoghegan
I noticed a minor omission in the patch as committed. Attached patch
corrects this.


-- 
Peter Geoghegan
*** a/contrib/pg_stat_statements/pg_stat_statements.c
--- b/contrib/pg_stat_statements/pg_stat_statements.c
*** generate_normalized_query(pgssJumbleStat
*** 2726,2732 
  		if (tok_len < 0)
  			continue;			/* ignore any duplicates */
  
! 		/* Copy next chunk, or as much as will fit */
  		len_to_wrt = off - last_off;
  		len_to_wrt -= last_tok_len;
  
--- 2726,2732 
  		if (tok_len < 0)
  			continue;			/* ignore any duplicates */
  
! 		/* Copy next chunk */
  		len_to_wrt = off - last_off;
  		len_to_wrt -= last_tok_len;
  

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Storing pg_stat_statements query texts externally, pg_stat_statements in core

2014-01-28 Thread Tom Lane
Peter Geoghegan p...@heroku.com writes:
 I noticed a minor omission in the patch as committed. Attached patch
 corrects this.

Duh.  Thanks.

regards, tom lane




Re: [HACKERS] Storing pg_stat_statements query texts externally, pg_stat_statements in core

2014-01-27 Thread Tom Lane
Peter Geoghegan p...@heroku.com writes:
 On Sun, Jan 26, 2014 at 10:38 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Meh.  This line of argument seems to reduce to "we don't need to worry
 about performance of this code path because it won't be reached often."

 I think I may have over-elaborated, giving you the false impression
 that this was something I felt strongly about. I'm glad that the
 overhead has been shown to be quite low, and I think that lexing
 without the lock held will be fine.

OK.  Committed after a couple of small further revisions.

regards, tom lane




Re: [HACKERS] Storing pg_stat_statements query texts externally, pg_stat_statements in core

2014-01-27 Thread Peter Geoghegan
On Mon, Jan 27, 2014 at 12:39 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 OK.  Committed after a couple of small further revisions.

Great. Thank you.


-- 
Peter Geoghegan




Re: [HACKERS] Storing pg_stat_statements query texts externally, pg_stat_statements in core

2014-01-26 Thread Tom Lane
Peter Geoghegan p...@heroku.com writes:
 On Sat, Jan 25, 2014 at 2:20 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Well, it's fairly expensive to generate that text, in the case of a
 large/complex statement.  It's possible that continuing to hold the lock
 is nonetheless the right thing to do because release+reacquire is also
 expensive; but there is no proof of that AFAIK, and I believe that
 release+reacquire is not likely to be expensive unless the lock is heavily
 contended, which would be exactly the conditions under which keeping it
 wouldn't be such a good idea anyway.

 My reason for only acquiring the shared lock once, and performing text
 normalization with it held was that in practice most query texts are
 not all that expensive to lex/normalize, and the cost of holding the
 lock for the vast majority of sessions (that won't experience
 contention) is nil.

Meh.  This line of argument seems to reduce to "we don't need to worry
about performance of this code path because it won't be reached often."
I don't particularly buy that; it may be true in some usage patterns but
probably not all.  And if you *do* buy it, it can be turned around to say
that we needn't worry about the costs of an extra locking cycle, anyway.

However, the only thing that is really going to settle this argument is
data, so here's a simple test case.  I made a script that just consisted
of many repetitions of 
   \dd
   SELECT pg_stat_statements_reset();
and ran it in psql while tracing the system with perf.  (\dd produces
a query that is middlingly complex, but certainly not a patch on those
I've seen produced by many real applications.  The point of the reset
is to force us through the stats-entry-creation code path each time.)

The hits above 1%, in a non-assert-enabled backend:

13.92%  34576  postmaster  [kernel.kallsyms]  [k] 0x8103ed21
 8.64%  21645  postmaster  postgres           [.] AllocSetAlloc
 5.09%  12502  postmaster  postgres           [.] SearchCatCache
 2.37%   6025  postmaster  postgres           [.] MemoryContextStrdup
 2.22%   5624  postmaster  libc-2.12.so       [.] __strcmp_sse42
 1.93%   4860  postmaster  postgres           [.] core_yylex
 1.84%   4501  postmaster  postgres           [.] base_yyparse
 1.83%   4601  postmaster  postgres           [.] ScanKeywordLookup
 1.54%   3893  postmaster  postgres           [.] copyObject
 1.54%   3849  postmaster  postgres           [.] MemoryContextAllocZeroAligned
 1.30%   3237  postmaster  postgres           [.] expression_tree_walker
 1.23%   3069  postmaster  postgres           [.] palloc
 1.15%   2903  postmaster  libc-2.12.so       [.] memcpy
 1.04%   2566  postmaster  postgres           [.] hash_uint32

So 3.76% of the entire runtime is going into core_yylex+ScanKeywordLookup,
and presumably half of that can be blamed on generate_normalized_query.
On the other hand, the earliest mention of lock-related functions in the
profile is

 0.17%    411  postmaster  postgres           [.] LWLockAcquire

and LWLockRelease is down here:

 0.11%    264  postmaster  postgres           [.] LWLockRelease

So on this example, the *total* cost of LWLocks, across the entire
backend, is something like 15% of the cost of generate_normalized_query.
This seems to me to be reasonably strong evidence that we should worry
about that cost, and that releasing/reacquiring the pgss LWLock is
trivial by comparison.

Of course, if the lock were contended then the locking cost would go
up substantially --- but that would also be a scenario in which it'd
be important not to hold the lock unnecessarily.

regards, tom lane




Re: [HACKERS] Storing pg_stat_statements query texts externally, pg_stat_statements in core

2014-01-26 Thread Peter Geoghegan
On Sun, Jan 26, 2014 at 10:38 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Meh.  This line of argument seems to reduce to "we don't need to worry
 about performance of this code path because it won't be reached often."

I think I may have over-elaborated, giving you the false impression
that this was something I felt strongly about. I'm glad that the
overhead has been shown to be quite low, and I think that lexing
without the lock held will be fine.


-- 
Peter Geoghegan




Re: [HACKERS] Storing pg_stat_statements query texts externally, pg_stat_statements in core

2014-01-25 Thread Tom Lane
Peter Geoghegan p...@heroku.com writes:
 On Wed, Jan 22, 2014 at 1:22 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 BTW, have you got any sort of test scenario you could share for this
 purpose?  I'm sure I could build something, but if you've already
 done it ...

 I simply ran the standard regression tests, and then straced a backend
 as it executed the pgss-calling query. I'm not sure that I've
 understood your question.

 As previously noted, my approach to testing this patch involved variations of:
 running make installcheck-parallel

Do the regression tests fail for you when doing this?

What I see is a duplicate occurrence of an escape_string_warning bleat:

*** /home/postgres/pgsql/src/test/regress/expected/plpgsql.out  Fri Jan  3 17:07
:46 2014
--- /home/postgres/pgsql/src/test/regress/results/plpgsql.out   Sat Jan 25 13:37
:20 2014
***
*** 4568,4573 
--- 4568,4579 
  HINT:  Use the escape string syntax for backslashes, e.g., E'\\'.
  QUERY:  SELECT 'foo\\bar\041baz'
  CONTEXT:  PL/pgSQL function strtest() line 4 at RETURN
+ WARNING:  nonstandard use of \\ in a string literal
+ LINE 1: SELECT 'foo\\bar\041baz'
+^
+ HINT:  Use the escape string syntax for backslashes, e.g., E'\\'.
+ QUERY:  SELECT 'foo\\bar\041baz'
+ CONTEXT:  PL/pgSQL function strtest() line 4 at RETURN
     strtest   
  -------------
   foo\bar!baz

==

which seems to happen because generate_normalized_query() reruns the
core lexer on the given statement, and if you got a warning the first
time, you'll get another one.

It seems this only happens with rather small values of
pg_stat_statements.max, else there's already a RETURN text-constant
query in the hashtable so we don't need to do reparsing right here.
(I'm testing with max = 100 to exercise the garbage collection code.)

I'm not sure this is a big deal for real-world use, because surely by now
everyone has either fixed their escape-string issues or disabled the
warning.  But it's annoying for testing.

The big picture of course is that having this warning issued this way
is broken anyhow, and has been since day one, so it's not entirely
pg_stat_statements' fault.  I'm just wondering if it's worth taking
the trouble to turn off the warning when reparsing.
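
For illustration, the save/disable/restore pattern being contemplated might look like the stand-in sketch below. This is not the actual PostgreSQL code; `toy_lex`, `relex_without_warning`, and the counter are hypothetical stand-ins for the real lexer and the escape_string_warning GUC machinery.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Stand-in for the escape_string_warning GUC; names are illustrative,
 * not the actual PostgreSQL variables. */
static bool escape_string_warning = true;
static int warnings_emitted = 0;

/* Toy "lexer" that bleats once per call when the flag is on and the
 * query contains a nonstandard backslash escape. */
static void toy_lex(const char *query)
{
    if (escape_string_warning && strstr(query, "\\\\") != NULL)
        warnings_emitted++;
}

/* Re-lex a statement with the warning suppressed, restoring the old
 * setting afterwards -- the pattern under discussion. */
static void relex_without_warning(const char *query)
{
    bool save = escape_string_warning;

    escape_string_warning = false;
    toy_lex(query);             /* no duplicate warning here */
    escape_string_warning = save;
}
```

The point of the sketch is only that the reparse path could suppress the warning locally without disturbing the user-visible setting.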

regards, tom lane




Re: [HACKERS] Storing pg_stat_statements query texts externally, pg_stat_statements in core

2014-01-25 Thread Peter Geoghegan
On Sat, Jan 25, 2014 at 11:04 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Do the regression tests fail for you when doing this?

 What I see is a duplicate occurrence of an escape_string_warning bleat:

 *** /home/postgres/pgsql/src/test/regress/expected/plpgsql.out  Fri Jan  3 
 17:07
 :46 2014
 --- /home/postgres/pgsql/src/test/regress/results/plpgsql.out   Sat Jan 25 
 13:37
 :20 2014
 ***
 *** 4568,4573 
 --- 4568,4579 
   HINT:  Use the escape string syntax for backslashes, e.g., E'\\'.
   QUERY:  SELECT 'foo\\bar\041baz'
   CONTEXT:  PL/pgSQL function strtest() line 4 at RETURN
 + WARNING:  nonstandard use of \\ in a string literal
 + LINE 1: SELECT 'foo\\bar\041baz'
 +^
 + HINT:  Use the escape string syntax for backslashes, e.g., E'\\'.
 + QUERY:  SELECT 'foo\\bar\041baz'
 + CONTEXT:  PL/pgSQL function strtest() line 4 at RETURN
      strtest   
   -------------
    foo\bar!baz

 ==

 which seems to happen because generate_normalized_query() reruns the
 core lexer on the given statement, and if you got a warning the first
 time, you'll get another one.

Oh, yes, I noticed that and reached the same conclusion. Sorry, I
probably should have mentioned this proactively.


-- 
Peter Geoghegan




Re: [HACKERS] Storing pg_stat_statements query texts externally, pg_stat_statements in core

2014-01-25 Thread Peter Geoghegan
On Sat, Jan 25, 2014 at 1:11 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I've chosen to handle failures to load query
 text data by just returning NULL for that query text, which seems
 reasonable given the new mindset that the query text is auxiliary data
 less important than the actual counts.

I guess that's okay.

 I've not done much more than smoke-testing on this; I'm thinking about
 hot-wiring the code to sometimes force it through the error paths,
 just to make sure they act as expected.  I've also not done any
 real performance testing.

I was never too worried about the impact on performance, because any
price that must be paid is paid only when new entries are created,
which is typically a very rare event that was already expensive due to
requiring an exclusive lock. This patch will make it much rarer in
practice by increasing the pg_stat_statements.max default, and thus
reducing the impact on such exclusive locks on *all* sessions, not
just those executing less popular queries. I don't know about you, but
the reason that I didn't performance test this is that it's really
hard to think of a representative scenario where it could possibly
matter, even though it seems like there might be one.

-- 
Peter Geoghegan




Re: [HACKERS] Storing pg_stat_statements query texts externally, pg_stat_statements in core

2014-01-25 Thread Peter Geoghegan
Why do you think it's better to release the shared lock while
generating a normalized query text, only to acquire it once more? I'm
not suggesting that it's the wrong thing to do. I'm curious about the
reasoning around assessing the costs.

-- 
Peter Geoghegan




Re: [HACKERS] Storing pg_stat_statements query texts externally, pg_stat_statements in core

2014-01-25 Thread Tom Lane
Peter Geoghegan p...@heroku.com writes:
 Why do you think it's better to release the shared lock while
 generating a normalized query text, only to acquire it once more? I'm
 not suggesting that it's the wrong thing to do. I'm curious about the
 reasoning around assessing the costs.

Well, it's fairly expensive to generate that text, in the case of a
large/complex statement.  It's possible that continuing to hold the lock
is nonetheless the right thing to do because release+reacquire is also
expensive; but there is no proof of that AFAIK, and I believe that
release+reacquire is not likely to be expensive unless the lock is heavily
contended, which would be exactly the conditions under which keeping it
wouldn't be such a good idea anyway.  So I'd prefer to leave it doing what
it did before, until there's some concrete evidence that keeping the lock
is a better idea.

regards, tom lane




Re: [HACKERS] Storing pg_stat_statements query texts externally, pg_stat_statements in core

2014-01-25 Thread Peter Geoghegan
On Sat, Jan 25, 2014 at 2:20 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Well, it's fairly expensive to generate that text, in the case of a
 large/complex statement.  It's possible that continuing to hold the lock
 is nonetheless the right thing to do because release+reacquire is also
 expensive; but there is no proof of that AFAIK, and I believe that
 release+reacquire is not likely to be expensive unless the lock is heavily
 contended, which would be exactly the conditions under which keeping it
 wouldn't be such a good idea anyway.

My reason for only acquiring the shared lock once, and performing text
normalization with it held was that in practice most query texts are
not all that expensive to lex/normalize, and the cost of holding the
lock for the vast majority of sessions (that won't experience
contention) is nil. Acquiring the shared lock twice for something like
that struck me as unjustified. I thought it was closer to the original
coding to lex with the lock, because of course the original coding
will only ever acquire the shared lock once. The original
lex-with-no-lock coding is only justified by "well, might as well".

-- 
Peter Geoghegan




Re: [HACKERS] Storing pg_stat_statements query texts externally, pg_stat_statements in core

2014-01-22 Thread Peter Geoghegan
On Tue, Jan 21, 2014 at 8:01 PM, Peter Geoghegan p...@heroku.com wrote:
 Actually, I think the whole thing is rather badly designed, as warm
 cache or no you're still proposing to do thousands of kernel calls
 while holding a potentially contended LWLock.  I suggest what we
 do is (1) read in the whole file, (2) acquire the lock, (3)
 re-read the whole file in the same buffer, (4) work from the buffer.

 fseeko() and fread() are part of the standard library, not any kernel.
 I don't believe that what I've done in qtext_fetch() implies thousands
 of kernel calls, or thousands of context switches.

I ran an strace on a pg_stat_statements backend with a full ~5,000
entries. It seems that glibc is actually content to always make
lseek() and read() syscalls in the event of an fseek(), even though it
does maintain its own buffer and could in principle have a lot more
gumption about that [1]. I agree that we should copy everything into a
buffer in one go.

I think that making the LWLocking duration as brief as possible isn't
critically important. Automated tools will only be interested in new
query texts (implying possible I/O while share locking) as they
observe new entries. But new entries are the only thing worth
considering that may potentially block on that shared lock (although
not while they write out their query texts, which almost always
requires just a shared lock). New entries ought to be a very rare
occurrence
compared to the execution of queries - certainly, on the busy
production systems I use pg_stat_statements on, it may be weeks before
a new query is observed (uh, with the possible exception of *my*
ad-hoc pg_stat_statements queries). On my laptop, "explain analyze
select * from pg_stat_statements" indicates a total runtime of ~9ms
with 5,000 regression test entries, even with the (still modest)
overhead of doing all those read()/lseek() calls. So we're talking
about a small impact on a new entry, that will only be affected once,
that was always going to have to write out its query text to file
anyway. On top of all this, in general it's very improbable that any
given new entry will be affected at all, because two unlikely things
need to happen at the same time for that to be possible.

The most important aspect of keeping the overhead of
pg_stat_statements low is that there not be too much cache pressure.
Obviously the huge reduction in its shared memory footprint that this
patch allows helps with that.

[1] http://www.pixelbeat.org/programming/stdio_buffering/
-- 
Peter Geoghegan




Re: [HACKERS] Storing pg_stat_statements query texts externally, pg_stat_statements in core

2014-01-22 Thread Peter Geoghegan
On Tue, Jan 21, 2014 at 8:01 PM, Peter Geoghegan p...@heroku.com wrote:
 BTW shouldn't there be an fflush() in qtext_store?

 I don't think so, no. Whenever qtext_store() callers don't fflush()
 before releasing their exclusive lock, they FreeFile() before doing so
 instead. In the case of pgss_shmem_startup() there is no lock held
 anyway, because it doesn't matter, for the same reason it doesn't
 matter that we're modifying shmem structures in a way that ordinarily
 necessitates an exclusive lock. Why fflush() every qtext_store() call
 if there is expected to be more than one, as is the case for several
 callers?

Having said all that, it occurs to me now that I could have been
smarter about where I fflush(). I'm pretty sure pgss_store() should
have two fflush() calls. The first can occur with just a shared LWLock
held, before the lock strength promotion interim (you probably noticed
that I no longer generate a normalized query text in that interim, for
reasons that are obvious). The second fflush() may occur only in the
very unlikely event of a garbage collection necessitating a new
qtext_store() call with an exclusive lock held, a few lines after the
first fflush(). No need to fflush() with the exclusive lock the vast
majority of the time.


-- 
Peter Geoghegan




Re: [HACKERS] Storing pg_stat_statements query texts externally, pg_stat_statements in core

2014-01-22 Thread Tom Lane
Peter Geoghegan p...@heroku.com writes:
 On Tue, Jan 21, 2014 at 8:01 PM, Peter Geoghegan p...@heroku.com wrote:
 Actually, I think the whole thing is rather badly designed, as warm
 cache or no you're still proposing to do thousands of kernel calls
 while holding a potentially contended LWLock.  I suggest what we
 do is (1) read in the whole file, (2) acquire the lock, (3)
 re-read the whole file in the same buffer, (4) work from the buffer.

 fseeko() and fread() are part of the standard library, not any kernel.
 I don't believe that what I've done in qtext_fetch() implies thousands
 of kernel calls, or thousands of context switches.

 I ran an strace on a pg_stat_statements backend with a full ~5,000
 entries. It seems that glibc is actually content to always make
 lseek() and read() syscalls in the event of an fseek(), even though it
 does maintain its own buffer and could in principle have a lot more
 gumption about that [1]. I agree that we should copy everything into a
 buffer in one go.

Yeah, that was what I thought might happen.  Even if glibc were smarter,
a lot of other libc implementations aren't.  Also, ISTM that traversal
of the hash table will generally lead to more-or-less-random access into
the text file.  So unless you assume that libc has mmap'd the whole
file, it'd have to do a lot of kernel calls anyway to get different pages
of the file into its buffer at different times.

I think that "read the whole file" is probably going to net out not any
more complex as far as our code goes, and it'll be making a lot fewer
assumptions about how smart the libc level is and what's happening at
the kernel level.  I'll have a go at coding it, anyway.

 I think that making the LWLocking duration as brief as possible isn't
 critically important.

Maybe, but I don't like the idea that calling pg_stat_statements()
frequently could result in random glitches in response time for other
queries.

 Automated tools will only be interested in new
 query texts (implying possible I/O while share locking) as they
 observe new entries.

This argument would be more plausible if there were a way to fetch
the text for a previously-observed entry without invoking
pg_stat_statements().  I'm surprised you've not submitted a patch
to add such a function.

regards, tom lane




Re: [HACKERS] Storing pg_stat_statements query texts externally, pg_stat_statements in core

2014-01-22 Thread Tom Lane
Peter Geoghegan p...@heroku.com writes:
 I ran an strace on a pg_stat_statements backend with a full ~5,000
 entries.

BTW, have you got any sort of test scenario you could share for this
purpose?  I'm sure I could build something, but if you've already
done it ...

regards, tom lane




Re: [HACKERS] Storing pg_stat_statements query texts externally, pg_stat_statements in core

2014-01-22 Thread Peter Geoghegan
On Wed, Jan 22, 2014 at 1:22 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 BTW, have you got any sort of test scenario you could share for this
 purpose?  I'm sure I could build something, but if you've already
 done it ...

I simply ran the standard regression tests, and then straced a backend
as it executed the pgss-calling query. I'm not sure that I've
understood your question.

As previously noted, my approach to testing this patch involved variations of:

running make installcheck-parallel

Concurrently, running pgbench with multiple clients each calling
pg_stat_statements() and pg_stat_statements_reset(). The latter was
sometimes rigged to do a direct garbage collection to stress test
things. My expectation was that the sanity checking done everywhere
would complain if there were any race conditions or other bugs. This
was pretty effective as a smoke test during development.

-- 
Peter Geoghegan




Re: [HACKERS] Storing pg_stat_statements query texts externally, pg_stat_statements in core

2014-01-22 Thread Tom Lane
Peter Geoghegan p...@heroku.com writes:
 On Wed, Jan 22, 2014 at 1:22 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 BTW, have you got any sort of test scenario you could share for this
 purpose?  I'm sure I could build something, but if you've already
 done it ...

 I simply ran the standard regression tests, and then straced a backend
 as it executed the pgss-calling query. I'm not sure that I've
 understood your question.

Hm, are there 5K distinct queries in the regression tests?  I'd have
thought they weren't enough to fill a large pgss hash.

regards, tom lane




Re: [HACKERS] Storing pg_stat_statements query texts externally, pg_stat_statements in core

2014-01-22 Thread Peter Geoghegan
On Wed, Jan 22, 2014 at 1:38 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Hm, are there 5K distinct queries in the regression tests?  I'd have
 thought they weren't enough to fill a large pgss hash.

Well, yes. They're a really bad case for pg_stat_statements, in that
almost all distinct queries are only executed once or twice and yet
there are enough of them. I think this is generally useful for
testing.


-- 
Peter Geoghegan




Re: [HACKERS] Storing pg_stat_statements query texts externally, pg_stat_statements in core

2014-01-21 Thread Tom Lane
BTW, what is the value of using posix_fadvise() in warm_fcache?

ISTM that if you're worried about waiting for I/O while holding the
LWLock (as I concur you should be), this coding is entirely inadequate
for preventing that, as the whole point of posix_fadvise is that it
won't wait around for the I/O to happen.  It strains credulity to
the breaking point to imagine that the kernel will have gotten
around to reading the file in the small number of nanoseconds that
will elapse before you start needing the data.  Unless of course the file
is already in buffers, but then the whole thing was a waste of code.

I think we should flush the posix_fadvise() code path as being neither
useful nor portable, and just do the actual read() as in the other
path (which btw is lacking an essential fseek).

Actually, I think the whole thing is rather badly designed, as warm
cache or no you're still proposing to do thousands of kernel calls
while holding a potentially contended LWLock.  I suggest what we
do is (1) read in the whole file, (2) acquire the lock, (3)
re-read the whole file in the same buffer, (4) work from the buffer.
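
The four-step scheme above could be sketched as follows. This is a minimal, self-contained illustration, not the actual pg_stat_statements code; the lock stand-ins and the names `read_whole_file`/`load_query_texts` are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

static void lock_acquire(void) { /* stand-in for LWLockAcquire */ }
static void lock_release(void) { /* stand-in for LWLockRelease */ }

/* Read the whole file at 'path' into a malloc'd, NUL-terminated
 * buffer; returns NULL on failure, else the buffer with *len set. */
static char *read_whole_file(const char *path, long *len)
{
    FILE *fp = fopen(path, "rb");
    char *buf = NULL;

    if (fp == NULL)
        return NULL;
    if (fseek(fp, 0, SEEK_END) == 0 && (*len = ftell(fp)) >= 0)
    {
        rewind(fp);
        buf = malloc(*len + 1);
        if (buf != NULL && fread(buf, 1, *len, fp) == (size_t) *len)
            buf[*len] = '\0';
        else
        {
            free(buf);
            buf = NULL;
        }
    }
    fclose(fp);
    return buf;
}

/* (1) read the file without the lock, warming the OS cache;
 * (2) take the lock; (3) re-read, which should now hit cache;
 * (4) the caller works from the returned buffer. */
static char *load_query_texts(const char *path, long *len)
{
    char *first = read_whole_file(path, len);   /* (1) */

    free(first);
    lock_acquire();                             /* (2) */
    first = read_whole_file(path, len);         /* (3) */
    lock_release();
    return first;                               /* (4) */
}
```

The second read is what guarantees a consistent view under the lock; the first read exists only so the locked read is (normally) serviced from cache.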

It might be possible to optimize the re-read away in the common case
that the file didn't actually change since we started to read it;
we'd need to put a bit more reporting into qtext_store probably, so
that it's possible to tell if any space is reserved but still unwritten.
BTW shouldn't there be an fflush() in qtext_store?

We could also eliminate some palloc/pfree cycles by using the string
directly in the buffer (at the small cost of needing to include the
strings' trailing nulls in the file).

regards, tom lane




Re: [HACKERS] Storing pg_stat_statements query texts externally, pg_stat_statements in core

2014-01-21 Thread Peter Geoghegan
On Tue, Jan 21, 2014 at 6:22 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 BTW, what is the value of using posix_fadvise() in warm_fcache?

 ISTM that if you're worried about waiting for I/O while holding the
 LWLock (as I concur you should be), this coding is entirely inadequate
 for preventing that, as the whole point of posix_fadvise is that it
 won't wait around for the I/O to happen.  It strains credulity to
 the breaking point to imagine that the kernel will have gotten
 around to reading the file in the small number of nanoseconds that
 will elapse before you start needing the data.  Unless of course the file
 is already in buffers, but then the whole thing was a waste of code.

Not necessarily. Under Linux for example, POSIX_FADV_SEQUENTIAL sets
the readahead window to the default size for the backing device. That
seems like a useful thing, though I will admit that it's hard to
quantify how useful here. If it is useful, and furthermore if it's
more useful than just reading the file into a temp buffer (which, I
admit, is not clear), then why not do it where we can? Adopting
support for the non-portable posix_fadvise is a sunk cost.
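
The kind of hint under discussion amounts to the following sketch (illustrative only, not the actual warm_fcache() code; `advise_sequential` is a hypothetical name, and posix_fadvise() is not available on all platforms):

```c
#define _XOPEN_SOURCE 600
#include <fcntl.h>
#include <unistd.h>

/* Advise the kernel that 'path' will be read front to back, so it can
 * enlarge the readahead window.  Returns 0 on success; posix_fadvise
 * returns an errno value (not -1/errno) on failure, per POSIX. */
static int advise_sequential(const char *path)
{
    int fd = open(path, O_RDONLY);
    int rc;

    if (fd < 0)
        return -1;
    /* offset 0, len 0 means "the whole file" */
    rc = posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);
    close(fd);
    return rc;
}
```

Note that, as discussed above, this is purely a hint: it adjusts readahead behavior but does not wait for any I/O to complete.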

 I think we should flush the posix_fadvise() code path as being neither
 useful nor portable, and just do the actual read() as in the other
 path

Perhaps.

 Actually, I think the whole thing is rather badly designed, as warm
 cache or no you're still proposing to do thousands of kernel calls
 while holding a potentially contended LWLock.  I suggest what we
 do is (1) read in the whole file, (2) acquire the lock, (3)
 re-read the whole file in the same buffer, (4) work from the buffer.

fseeko() and fread() are part of the standard library, not any kernel.
I don't believe that what I've done in qtext_fetch() implies thousands
of kernel calls, or thousands of context switches.

Now, using a buffer not managed by libc might be better, but I should
note what holding that LWLock actually represents. As you know, it's a
shared lock that does not block the updating of existing entries. It
will only block the creation of entirely new ones. That ought to be
very rare. You may recall that when I worked with you on
pg_stat_statements in the past, I characterized the rate of growth as
being logarithmic. Once the system has been running for half an hour,
new entries become very infrequent indeed. Admittedly, I may be
failing to consider a worst case there, but in the average case this
works fine. Encouraging automated tools to lazily retrieve query texts
is a big part of why I thought this might be the least worst thing.

It's not obvious to me that always avoiding blocking new entries while
performing physical I/O is important enough to warrant the additional
complexity of copying to a buffer first, and mostly working off that.
Also, you must consider that there may have been a garbage collection
in the interim, so you have to get the shared lock twice (to check the
garbage collection count, to later guard against your buffer having
been invalidated by a garbage collection) instead of just getting the
shared lock once. I discussed this with Pavel, actually. Getting the
shared lock twice on every pg_stat_statements() call doesn't seem very
good either. Especially since you'll have to open the file twice,
*with* the shared lock second time around, when your scheme misses new
entries. Your scheme won't work out under exactly the same
circumstances that mine doesn't do too well, which is when there is a
new entry to consider. When that doesn't happen, holding a shared lock
is not that important.

 It might be possible to optimize the re-read away in the common case
 that the file didn't actually change since we started to read it;
 we'd need to put a bit more reporting into qtext_store probably, so
 that it's possible to tell if any space is reserved but still unwritten.

So, to be clear, you'd still be reading from the file if that proved
necessary? I think you'll have to, because it seems useful that
pg_stat_statements offers a limited form of consistency, in that you
only see entries at a point in time, even if the figures themselves
are not exactly consistent. Maybe I've missed something, but it's not
obvious to me that you're any better off as compared to just using a
FILE/stream.

 BTW shouldn't there be an fflush() in qtext_store?

I don't think so, no. Whenever qtext_store() callers don't fflush()
before releasing their exclusive lock, they FreeFile() before doing so
instead. In the case of pgss_shmem_startup() there is no lock held
anyway, because it doesn't matter, for the same reason it doesn't
matter that we're modifying shmem structures in a way that ordinarily
necessitates an exclusive lock. Why fflush() every qtext_store() call
if there is expected to be more than one, as is the case for several
callers?

FreeFile() calls fclose(). fclose() is described by the standard as
flushing its stream. It is my understanding that calling fflush() on a
stream immediately prior to fclose()ing it is therefore redundant.

Re: [HACKERS] Storing pg_stat_statements query texts externally, pg_stat_statements in core

2014-01-21 Thread Peter Geoghegan
On Tue, Jan 21, 2014 at 8:01 PM, Peter Geoghegan p...@heroku.com wrote:
 Not necessarily. Under Linux for example, POSIX_FADV_SEQUENTIAL sets
 the readahead window to the default size for the backing device

Excuse me; I meant to put POSIX_FADV_SEQUENTIAL doubles this size
[default size for the backing device].

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Storing pg_stat_statements query texts externally, pg_stat_statements in core

2014-01-20 Thread Tom Lane
Peter Geoghegan p...@heroku.com writes:
 On Sat, Dec 7, 2013 at 9:26 AM, Fujii Masao masao.fu...@gmail.com wrote:
 The patch doesn't apply cleanly against the master due to recent change of
 pg_stat_statements. Could you update the patch?

 Revision is attached, including changes recently discussed on other
 thread [1] to allow avoiding sending query text where that's not
 desirable and increase in default pg_stat_statements.max to 5000.

I see this is marked as ready for committer.  Where does it stand in
relation to the other long-running thread about calls under-estimation
propagation?  I was surprised to find that there isn't any CommitFest
entry linked to that thread, so I'm wondering if that proposal is
abandoned or what.  If it's not, is committing this going to blow up
that patch?

BTW, I'm also thinking that the detected_version kluge is about ready
to collapse of its own weight, or at least is clearly going to break in
future.  What we need to do is embed the API version in the C name of the
pg_stat_statements function instead.

regards, tom lane




Re: [HACKERS] Storing pg_stat_statements query texts externally, pg_stat_statements in core

2014-01-20 Thread Peter Geoghegan
On Mon, Jan 20, 2014 at 12:55 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I see this is marked as ready for committer.  Where does it stand in
 relation to the other long-running thread about calls under-estimation
 propagation?  I was surprised to find that there isn't any CommitFest
 entry linked to that thread, so I'm wondering if that proposal is
 abandoned or what.  If it's not, is committing this going to blow up
 that patch?

I believe that proposal was withdrawn. I think the conclusion there
was that we should just expose queryid and be done with it. In a way,
exposing the queryid enabled this work, because it provides an
identifier that can be used instead of sending large query texts each
call.

 BTW, I'm also thinking that the detected_version kluge is about ready
 to collapse of its own weight, or at least is clearly going to break in
 future.  What we need to do is embed the API version in the C name of the
 pg_stat_statements function instead.

I agree that it isn't scalable.

-- 
Peter Geoghegan




Re: [HACKERS] Storing pg_stat_statements query texts externally, pg_stat_statements in core

2014-01-20 Thread Tom Lane
Peter Geoghegan p...@heroku.com writes:
 On Mon, Jan 20, 2014 at 12:55 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 BTW, I'm also thinking that the detected_version kluge is about ready
 to collapse of its own weight, or at least is clearly going to break in
 future.  What we need to do is embed the API version in the C name of the
 pg_stat_statements function instead.

 I agree that it isn't scalable.

Yeah.  It was more or less tolerable as long as we were only using it in
connection with identifying the set of output columns, but using it to
identify the expected input arguments too seems darn rickety.  I'll
refactor so there are separate C call points for each supported API
version.  (Well, I guess it's too late to separate 1.0 and 1.1, but
we can make 1.2 and future versions separate.)

regards, tom lane




Re: [HACKERS] Storing pg_stat_statements query texts externally, pg_stat_statements in core

2013-12-09 Thread Peter Geoghegan
On Sat, Dec 7, 2013 at 9:26 AM, Fujii Masao masao.fu...@gmail.com wrote:
 The patch doesn't apply cleanly against the master due to recent change of
 pg_stat_statements. Could you update the patch?

Revision is attached, including changes recently discussed on other
thread [1] to allow avoiding sending query text where that's not
desirable and increase in default pg_stat_statements.max to 5000.

I've found myself tweaking things a bit more here and there. I've
added additional clarification around why we do an in-place garbage
collection (i.e. why we don't just write a new file and atomically
rename -- grep "over-engineer" to find what I mean). I also fully
documented the pg_stat_statements function. If we want external tool
authors to use it to avoid sending query texts, they have to know
about it in the first place.

I now use the pg_stat_tmp directory for my temp file (so
pg_stat_statements behaves more like the statistics collector). The
stats_temp_directory GUC is not used, for reasons that a patch comment
explains.

I've fuzz-tested this throughout with the make installcheck-parallel
regression tests. I found it useful to run that in one terminal, while
running pgbench with multiple clients in another. The pgbench script
looks something like:


select * from pg_stat_statements;
select pg_stat_statements_reset();


pg_stat_statements_reset() was temporarily rigged to perform a garbage
collection (and not a reset) for the benefit of this stress-test. The
function pg_stat_statements(), the garbage collection function and
pgss_store() will complain loudly if they notice anything out of the
ordinary. I was not able to demonstrate any problems through this
technique (though I think it focused my thinking on the right areas
during development). Clearly, missed subtleties around managing the
external file while locking in order to ensure correct behavior (that
no session can observe something that it shouldn't, or miss something
that it should) will be something that a committer will want to pay
particular attention to. I go to some lengths to avoid doing this with
only a shared lock.

FYI, without hacking things up, it's quite hard to make external query
text garbage collection occur - need_gc_qtexts() will almost always
say no. It will only automatically happen once with
pg_stat_statements.max set to 1000 when the standard regression tests
are run. This is a really extreme workload in terms of causing
pg_stat_statements churn, which shows just how unlikely garbage
collection is in the real world. Still, it ought to be totally robust
when it happens.

In passing, I did this:

/*
!* Rename file into place, to atomically commit to serializing.
 */

Because at this juncture, there ought to be no file to replace (not
since Magnus had pg_stat_statements not keep the main global file
around for as long as the server is running, in the style of the
statistics collector).

Thanks

[1] 
http://www.postgresql.org/message-id/cam3swzsvff-vbnc8sc-0cpwzxvqh49reybchb95t95fcrgs...@mail.gmail.com
-- 
Peter Geoghegan


pg_stat_statements_ext_text.v5.2013_12_09.patch.gz
Description: GNU Zip compressed data



Re: [HACKERS] Storing pg_stat_statements query texts externally, pg_stat_statements in core

2013-12-09 Thread Peter Geoghegan
On Mon, Dec 9, 2013 at 7:31 PM, Peter Geoghegan p...@heroku.com wrote:
 I go to some lengths to to avoid doing this with only a shared lock.

I should have said: I go to great lengths to do this with only a
shared lock, and not an exclusive (see gc_count stuff).


-- 
Peter Geoghegan




Re: [HACKERS] Storing pg_stat_statements query texts externally, pg_stat_statements in core

2013-12-07 Thread Fujii Masao
On Sun, Nov 17, 2013 at 1:15 PM, Peter Geoghegan p...@heroku.com wrote:
 On Sat, Nov 16, 2013 at 4:36 PM, Peter Geoghegan p...@heroku.com wrote:
 I'll post a revision with fixes for those. Another concern is that I
 see some race conditions that only occur when pg_stat_statements cache
 is under a lot of pressure with a lot of concurrency, that wasn't
 revealed by my original testing. I'm working on a fix for that.

 Revision attached.

The patch doesn't apply cleanly against the master due to recent change of
pg_stat_statements. Could you update the patch?

Regards,

-- 
Fujii Masao




Re: [HACKERS] Storing pg_stat_statements query texts externally, pg_stat_statements in core

2013-11-20 Thread Peter Geoghegan
With a pg_stat_statements.max of 1,000, on my system with the patch
applied the additional amount of shared memory required is only
192KiB. That compares with about 1.17MiB for the same setting in
master's version of pg_stat_statements. Surely with this huge
reduction, we should revisit that default - I think that a default
setting of 5,000 for pg_stat_statements.max makes more sense.

entry_dealloc() requires an exclusive lock, locking all queries out of
simply recording their costs. With a lot of cache pressure this could
be really expensive. In my opinion that risk alone justifies the
change.

Without adding another GUC, we might even go so far as to have a
mechanism like checkpoint_warning warn that entry_dealloc() calls
occur too frequently...

-- 
Peter Geoghegan




Re: [HACKERS] Storing pg_stat_statements query texts externally, pg_stat_statements in core

2013-11-16 Thread Peter Geoghegan
On Fri, Nov 15, 2013 at 8:51 AM, Peter Eisentraut pete...@gmx.net wrote:
 Compiler warnings with fortify settings:

I'll post a revision with fixes for those. Another concern is that I
see some race conditions that only occur when pg_stat_statements cache
is under a lot of pressure with a lot of concurrency, that wasn't
revealed by my original testing. I'm working on a fix for that.


-- 
Peter Geoghegan




Re: [HACKERS] Storing pg_stat_statements query texts externally, pg_stat_statements in core

2013-11-16 Thread Peter Geoghegan
On Sat, Nov 16, 2013 at 4:36 PM, Peter Geoghegan p...@heroku.com wrote:
 I'll post a revision with fixes for those. Another concern is that I
 see some race conditions that only occur when pg_stat_statements cache
 is under a lot of pressure with a lot of concurrency, that wasn't
 revealed by my original testing. I'm working on a fix for that.

Revision attached.


-- 
Peter Geoghegan


pg_stat_statements_ext_text.v2.2013_11_16.patch.gz
Description: GNU Zip compressed data



Re: [HACKERS] Storing pg_stat_statements query texts externally, pg_stat_statements in core

2013-11-15 Thread Peter Eisentraut
On 11/14/13, 3:17 AM, Peter Geoghegan wrote:
 Attached patch allows pg_stat_statements to store arbitrarily long
 query texts, using an external, transparently managed lookaside file.

Compiler warnings with fortify settings:

gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith 
-Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute 
-Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g 
-fpic -I. -I. -I../../src/include -D_FORTIFY_SOURCE=2 -D_GNU_SOURCE 
-I/usr/include/libxml2   -c -o pg_stat_statements.o pg_stat_statements.c -MMD 
-MP -MF .deps/pg_stat_statements.Po
pg_stat_statements.c: In function ‘entry_reset’:
pg_stat_statements.c:1827:11: warning: ignoring return value of ‘ftruncate’, 
declared with attribute warn_unused_result [-Wunused-result]
pg_stat_statements.c: In function ‘garbage_collect_query_texts’:
pg_stat_statements.c:1779:11: warning: ignoring return value of ‘ftruncate’, 
declared with attribute warn_unused_result [-Wunused-result]


