Re: [HACKERS] Improvement of pg_stat_statement usage about buffer hit ratio

2013-12-04 Thread Magnus Hagander
On Mon, Dec 2, 2013 at 10:42 PM, Peter Eisentraut pete...@gmx.net wrote:

 On 11/19/13, 11:30 PM, Peter Geoghegan wrote:
  +1 from me.
  
  
   That's +1 for *not* including this?
  Right.

 I agree with not including this.


If you're looking for more of those, here's another +1 for not including
it. (And of course also for Peters comment about that we need to figure out
if something is actually missing for building this higher level tool that
can provide this information and more)

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] Improvement of pg_stat_statement usage about buffer hit ratio

2013-12-02 Thread Peter Eisentraut
On 11/19/13, 11:30 PM, Peter Geoghegan wrote:
 +1 from me.
 
 
  That's +1 for *not* including this?
 Right.

I agree with not including this.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improvement of pg_stat_statement usage about buffer hit ratio

2013-11-19 Thread Jeff Janes
On Mon, Nov 18, 2013 at 10:56 AM, Peter Geoghegan p...@heroku.com wrote:

 On Mon, Nov 18, 2013 at 10:49 AM, Fujii Masao masao.fu...@gmail.com
 wrote:
  The same idea was proposed before but not committed because
  Itagaki thought that pg_stat_statements view should report only raw
 values.
  Please read the following thread. I have the same feeling with him.
  Anyway we should listen to more opinions from other people, though.
 
 http://www.postgresql.org/message-id/20091222172719.8b86.52131...@oss.ntt.co.jp

 +1 from me.


That's +1 for *not* including this?  If so, I agree as well.  It would be
easy enough to create a sql view that computes this if one wanted (unlike
the min and max execution time, which can't be done externally).  I also
have a theological opposition to exposing the buffer hit ratio.  There are
two ways to improve the buffer hit ratio.  One is to have fewer misses,
which is worthwhile but you can more easily do it by looking directly at
the number of misses or misses per execution, rather than a buffer hit
ratio. Or you can dilute out the misses by gratuitously increasing the
hits by uselessly reading cached buffers over and over again, which is
counter-productive and evil and perverse.

Take a small to medium look-up table, drop all the indexes so it has to be
full scanned all the time, and maybe rebuild it with a lower fillfactor
(but not so low that it becomes too big to cache), and watch that buffer
hit ratio go through the roof, while true performance goes the other way.



 I think that a higher level tool needs to be built on
 pg_stat_statements to make things easy for those that want a slick,
 pre-packaged solution. As I said on the min/max thread, if we're not
 doing enough to help people who would like to build such a tool, we
 should discuss how we can do better.


I'd like to be able to separate queries by the application_name
and/or client_addr which issued them.

Cheers,

Jeff


Re: [HACKERS] Improvement of pg_stat_statement usage about buffer hit ratio

2013-11-19 Thread Peter Geoghegan
On Tue, Nov 19, 2013 at 12:12 PM, Jeff Janes jeff.ja...@gmail.com wrote:
  The same idea was proposed before but not committed because
  Itagaki thought that pg_stat_statements view should report only raw
  values.
  Please read the following thread. I have the same feeling with him.
  Anyway we should listen to more opinions from other people, though.
 
  http://www.postgresql.org/message-id/20091222172719.8b86.52131...@oss.ntt.co.jp

 +1 from me.


 That's +1 for *not* including this?

Right. Unfortunately, discussion of the other pg_stat_statements
patches spilled into this thread.

 I think that a higher level tool needs to be built on
 pg_stat_statements to make things easy for those that want a slick,
 pre-packaged solution. As I said on the min/max thread, if we're not
 doing enough to help people who would like to build such a tool, we
 should discuss how we can do better.


 I'd like to be able to separate queries by the application_name and/or
 client_addr which issued them.

I see why that would be desirable (it would also be desirable to
sometimes *not* break costs down by user), but it's hard to see a way
to make that work without hurting cases where it doesn't matter (i.e.
the majority of cases).

So thinking about it some more, you could hack together a patch fairly
easily that adds a GUC that makes the current userid a union. A
setting remembers the current preference as to what that actually
stores. You remember the setting on shutdown, when serializing the
stats, in the file header. You can set the setting to hash a whole
bunch of alternative things along the lines you refer to here, such as
client_addr or application_name.

I can see why some aspects of that would be controversial, but it
could probably be made to work. Maybe the hash is used in a second
shared dynahash, where the original values are looked-up, so a new
text column is added to the view that displays things as needed that
way (when the GUC isn't set to its default, userid is NULL, else this
new column is NULL). It would probably be okay to make this spill to
disk, simply because it wouldn't come up that often, and you could be
clever about the locking. Notably, something like a client_addr is
stable within a backend, so it would be cheap to cache the hash value.

The logical/physical I/O divide is probably the main factor that makes
the track_io_timing stuff compelling.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improvement of pg_stat_statement usage about buffer hit ratio

2013-11-18 Thread Haribabu kommi
On 18 October 2013 13:35 KONDO Mitsumasa wrote: 
 Hi,
 
 I submit improvement of pg_stat_statement usage patch in CF3.
 
 In pg_stat_statement, I think buffer hit ratio is very important value.
 However, it is difficult to calculate it, and it need complicated SQL.
 This patch makes it more simple usage and documentation.
 
  -bench=# SELECT query, calls, total_time, rows, 100.0 *
 shared_blks_hit /
  -   nullif(shared_blks_hit + shared_blks_read, 0) AS
 hit_percent
  +bench=# SELECT query, calls, total_time, rows,
  +shared_blks_hit_percent
 FROM pg_stat_statements ORDER BY total_time DESC LIMIT 5;
 
 It will be very simple:-)
 
 This patch conflicts pg_stat_statement_min_max_exectime patch which I
 submitted, and pg_stat_statement_min_max_exectime patch also adds new
 columns which are min_time and max_time. So I'd like to change it in
 this opportunity.

This patch adds another column shared_blks_hit_percent to pg_stat_statements 
view
Which is very beneficial to the user to know how much percentage of blks are 
hit.

All changes are fine and working as described. Marked as ready for committer.

Regards,
Hari babu



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improvement of pg_stat_statement usage about buffer hit ratio

2013-11-18 Thread KONDO Mitsumasa

(2013/11/18 20:16), Haribabu kommi wrote:

On 18 October 2013 13:35 KONDO Mitsumasa wrote:

This patch conflicts pg_stat_statement_min_max_exectime patch which I
submitted, and pg_stat_statement_min_max_exectime patch also adds new
columns which are min_time and max_time. So I'd like to change it in
this opportunity.


This patch adds another column shared_blks_hit_percent to pg_stat_statements 
view
Which is very beneficial to the user to know how much percentage of blks are 
hit.

All changes are fine and working as described. Marked as ready for committer.

Thank you for your reviewing!

However, I'd like to add average time in each statement, too. Attached patch is 
my latest one. Adding shared_blks_hit_percent and ave_time. This is the adding 
main code.

+ total_time / calls::float AS avg_time,


If this patch and min/max and stddev patch will be commited, we can see more 
detail and simple information in pg_stat_statements, by light-weight coding.


Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center
diff --git a/contrib/pg_stat_statements/pg_stat_statements--1.1--1.2.sql b/contrib/pg_stat_statements/pg_stat_statements--1.1--1.2.sql
new file mode 100644
index 000..f0a8e0f
--- /dev/null
+++ b/contrib/pg_stat_statements/pg_stat_statements--1.1--1.2.sql
@@ -0,0 +1,63 @@
+/* contrib/pg_stat_statements/pg_stat_statements--1.1--1.2.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use ALTER EXTENSION pg_stat_statements UPDATE to load this file. \quit
+
+/* First we have to remove them from the extension */
+ALTER EXTENSION pg_stat_statements DROP VIEW pg_stat_statements;
+ALTER EXTENSION pg_stat_statements DROP FUNCTION pg_stat_statements();
+
+/* Then we can drop them */
+DROP VIEW pg_stat_statements;
+DROP FUNCTION pg_stat_statements();
+
+/* Now redefine */
+CREATE FUNCTION pg_stat_statements(
+OUT userid oid,
+OUT dbid oid,
+OUT query text,
+OUT calls int8,
+OUT total_time float8,
+OUT rows int8,
+OUT shared_blks_hit int8,
+OUT shared_blks_read int8,
+OUT shared_blks_dirtied int8,
+OUT shared_blks_written int8,
+OUT local_blks_hit int8,
+OUT local_blks_read int8,
+OUT local_blks_dirtied int8,
+OUT local_blks_written int8,
+OUT temp_blks_read int8,
+OUT temp_blks_written int8,
+OUT blk_read_time float8,
+OUT blk_write_time float8
+)
+RETURNS SETOF record
+AS 'MODULE_PATHNAME'
+LANGUAGE C;
+
+CREATE VIEW pg_stat_statements AS
+  SELECT userid,
+ dbid,
+ query,
+ calls,
+ total_time,
+ rows,
+ CASE WHEN shared_blks_hit + shared_blks_read  0
+   THEN 100.0 * (shared_blks_hit::float / (shared_blks_hit + shared_blks_read))
+   ELSE 0 END AS shared_blks_hit_percent,
+ shared_blks_hit,
+ shared_blks_read,
+ shared_blks_dirtied,
+ shared_blks_written,
+ local_blks_hit,
+ local_blks_read,
+ local_blks_dirtied,
+ local_blks_written,
+ temp_blks_read,
+ temp_blks_written,
+ blk_read_time,
+ blk_write_time
+  FROM pg_stat_statements();
+
+GRANT SELECT ON pg_stat_statements TO PUBLIC;
diff --git a/contrib/pg_stat_statements/pg_stat_statements--1.2.sql b/contrib/pg_stat_statements/pg_stat_statements--1.2.sql
new file mode 100644
index 000..41dc16b
--- /dev/null
+++ b/contrib/pg_stat_statements/pg_stat_statements--1.2.sql
@@ -0,0 +1,65 @@
+/* contrib/pg_stat_statements/pg_stat_statements--1.2.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use CREATE EXTENSION pg_stat_statements to load this file. \quit
+
+-- Register functions.
+CREATE FUNCTION pg_stat_statements_reset()
+RETURNS void
+AS 'MODULE_PATHNAME'
+LANGUAGE C;
+
+CREATE FUNCTION pg_stat_statements(
+OUT userid oid,
+OUT dbid oid,
+OUT query text,
+OUT calls int8,
+OUT total_time float8,
+OUT rows int8,
+OUT shared_blks_hit int8,
+OUT shared_blks_read int8,
+OUT shared_blks_dirtied int8,
+OUT shared_blks_written int8,
+OUT local_blks_hit int8,
+OUT local_blks_read int8,
+OUT local_blks_dirtied int8,
+OUT local_blks_written int8,
+OUT temp_blks_read int8,
+OUT temp_blks_written int8,
+OUT blk_read_time float8,
+OUT blk_write_time float8
+)
+RETURNS SETOF record
+AS 'MODULE_PATHNAME'
+LANGUAGE C;
+
+-- Register a view on the function for ease of use.
+CREATE VIEW pg_stat_statements AS
+  SELECT userid,
+ dbid,
+ query,
+ calls,
+ total_time,
+ total_time / calls::float AS avg_time,
+ rows,
+ CASE WHEN shared_blks_hit + shared_blks_read  0
+   THEN 100.0 * (shared_blks_hit::float / (shared_blks_hit + shared_blks_read))
+   ELSE 0 END AS shared_blks_hit_percent,
+ shared_blks_hit,
+ shared_blks_read,
+ shared_blks_dirtied,
+ shared_blks_written,
+   

Re: [HACKERS] Improvement of pg_stat_statement usage about buffer hit ratio

2013-11-18 Thread Fujii Masao
On Mon, Nov 18, 2013 at 8:36 PM, KONDO Mitsumasa
kondo.mitsum...@lab.ntt.co.jp wrote:
 (2013/11/18 20:16), Haribabu kommi wrote:

 On 18 October 2013 13:35 KONDO Mitsumasa wrote:

 This patch conflicts pg_stat_statement_min_max_exectime patch which I
 submitted, and pg_stat_statement_min_max_exectime patch also adds new
 columns which are min_time and max_time. So I'd like to change it in
 this opportunity.


 This patch adds another column shared_blks_hit_percent to
 pg_stat_statements view
 Which is very beneficial to the user to know how much percentage of blks
 are hit.

The same idea was proposed before but not committed because
Itagaki thought that pg_stat_statements view should report only raw values.
Please read the following thread. I have the same feeling with him.
Anyway we should listen to more opinions from other people, though.
http://www.postgresql.org/message-id/20091222172719.8b86.52131...@oss.ntt.co.jp

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improvement of pg_stat_statement usage about buffer hit ratio

2013-11-18 Thread Peter Geoghegan
On Mon, Nov 18, 2013 at 10:49 AM, Fujii Masao masao.fu...@gmail.com wrote:
 The same idea was proposed before but not committed because
 Itagaki thought that pg_stat_statements view should report only raw values.
 Please read the following thread. I have the same feeling with him.
 Anyway we should listen to more opinions from other people, though.
 http://www.postgresql.org/message-id/20091222172719.8b86.52131...@oss.ntt.co.jp

+1 from me. I think that a higher level tool needs to be built on
pg_stat_statements to make things easy for those that want a slick,
pre-packaged solution. As I said on the min/max thread, if we're not
doing enough to help people who would like to build such a tool, we
should discuss how we can do better.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improvement of pg_stat_statement usage about buffer hit ratio

2013-11-18 Thread KONDO Mitsumasa

(2013/11/19 3:56), Peter Geoghegan wrote:

On Mon, Nov 18, 2013 at 10:49 AM, Fujii Masao masao.fu...@gmail.com wrote:

The same idea was proposed before but not committed because
Itagaki thought that pg_stat_statements view should report only raw values.
Please read the following thread. I have the same feeling with him.
Anyway we should listen to more opinions from other people, though.
http://www.postgresql.org/message-id/20091222172719.8b86.52131...@oss.ntt.co.jp
I confirmed that Itagaki-san and Mr Cerdic disscution. He said that raw values 
be just simple. However, were his changes just simple? I cannot understand his 
aesthetics sense and also you, too:-(


PG8.4 before:

012 CREATE FUNCTION pg_stat_statements(
013 OUT userid oid,
014 OUT dbid oid,
015 OUT query text,
016 OUT calls int8,
017 OUT total_time float8,
018 OUT rows int8
019 )


PG9.0 after:

012 CREATE FUNCTION pg_stat_statements(
013 OUT userid oid,
014 OUT dbid oid,
015 OUT query text,
016 OUT calls int8,
017 OUT total_time float8,
018 OUT rows int8,
019 OUT shared_blks_hit int8,
020 OUT shared_blks_read int8,
021 OUT shared_blks_written int8,
022 OUT local_blks_hit int8,
023 OUT local_blks_read int8,
024 OUT local_blks_written int8,
025 OUT temp_blks_read int8,
026 OUT temp_blks_written int8
027 )
It's too complicated, and do you know how to tuning PG from information of 
local_* and temp_*?
At least, I think that most user cannot tuning from these information, and it 
might not be useful information only part of them.



+1 from me. I think that a higher level tool needs to be built on
pg_stat_statements to make things easy for those that want a slick,
pre-packaged solution.
No. It's not for geek tools and people having pre-packaged solution in big 
company, but also for common DBA tools.


By the way, MySQL and Oracle database which are very popular have these 
statistics. I think that your argument might disturb people who wants to 
migration from these database and will accelerate popularity of these database more.



As I said on the min/max thread, if we're not
doing enough to help people who would like to build such a tool, we
should discuss how we can do better.

Could you tell me how to get min/max statistics with low cost?
I'm not sure about detail of your patch in CF, but it seems very high cost.
Repeatedly, I think that if we want to get drastic detail statistics, we have to 
create another tools of statistics. Your patch will be these statistics tools. 
However, pg_stat_statement sholud be just light weight.


Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improvement of pg_stat_statement usage about buffer hit ratio

2013-11-18 Thread KONDO Mitsumasa

(2013/11/19 11:12), KONDO Mitsumasa wrote:

(2013/11/19 3:56), Peter Geoghegan wrote:

On Mon, Nov 18, 2013 at 10:49 AM, Fujii Masao masao.fu...@gmail.com wrote:

The same idea was proposed before but not committed because
Itagaki thought that pg_stat_statements view should report only raw values.
Please read the following thread. I have the same feeling with him.
Anyway we should listen to more opinions from other people, though.
http://www.postgresql.org/message-id/20091222172719.8b86.52131...@oss.ntt.co.jp

I confirmed that Itagaki-san and Mr Cerdic disscution. He said that raw values 
be
just simple. However, were his changes just simple? I cannot understand his
aesthetics sense and also you, too:-(

PG8.4 before:

012 CREATE FUNCTION pg_stat_statements(
013 OUT userid oid,
014 OUT dbid oid,
015 OUT query text,
016 OUT calls int8,
017 OUT total_time float8,
018 OUT rows int8
019 )


PG9.0 after:

012 CREATE FUNCTION pg_stat_statements(
013 OUT userid oid,
014 OUT dbid oid,
015 OUT query text,
016 OUT calls int8,
017 OUT total_time float8,
018 OUT rows int8,
019 OUT shared_blks_hit int8,
020 OUT shared_blks_read int8,
021 OUT shared_blks_written int8,
022 OUT local_blks_hit int8,
023 OUT local_blks_read int8,
024 OUT local_blks_written int8,
025 OUT temp_blks_read int8,
026 OUT temp_blks_written int8
027 )

It's too complicated, and do you know how to tuning PG from information of
local_* and temp_*?
At least, I think that most user cannot tuning from these information, and it
might not be useful information only part of them.


+1 from me. I think that a higher level tool needs to be built on
pg_stat_statements to make things easy for those that want a slick,
pre-packaged solution.

No. It's not for geek tools and people having pre-packaged solution in big
company, but also for common DBA tools.

By the way, MySQL and Oracle database which are very popular have these
statistics. I think that your argument might disturb people who wants to
migration from these database and will accelerate popularity of these database 
more.


As I said on the min/max thread, if we're not
doing enough to help people who would like to build such a tool, we
should discuss how we can do better.

Could you tell me how to get min/max statistics with low cost?
I'm not sure about detail of your patch in CF, but it seems very high cost.
Repeatedly, I think that if we want to get drastic detail statistics, we have to
create another tools of statistics. Your patch will be these statistics tools.
However, pg_stat_statement sholud be just light weight.

Regards,

Oh, I forgot to say it...
I beleive that essence of information technology is to show more informative 
information in little information with low cost. If it wrong, please let me know 
the reason.


Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improvement of pg_stat_statement usage about buffer hit ratio

2013-11-18 Thread Peter Geoghegan
On Mon, Nov 18, 2013 at 6:12 PM, KONDO Mitsumasa
kondo.mitsum...@lab.ntt.co.jp wrote:
 I confirmed that Itagaki-san and Mr Cerdic disscution. He said that raw
 values be just simple. However, were his changes just simple? I cannot
 understand his aesthetics sense and also you, too:-(

 It's too complicated, and do you know how to tuning PG from information of
 local_* and temp_*?
 At least, I think that most user cannot tuning from these information, and
 it might not be useful information only part of them.

All of those costs are cumulative aggregates. If we didn't aggregate
them, then the user couldn't possibly determine them on their own, to
any approximation. That's the difference. If you think the local_* and
temp_* aren't very useful, I'm inclined to agree, but it's too late to
do anything about that now.

 No. It's not for geek tools and people having pre-packaged solution in big
 company, but also for common DBA tools.

I don't think that the tool needs to be expensive. If selecting from
the pg_stat_statements view every 1-3 seconds is too expensive for
such a tool, we can have a discussion about being smarter, because
there certainly are ways to optimize it.

Regarding your min/max patch: I'm opposed to adding even more to the
spinlock-protected counters struct, so that we can get an exact answer
to a question where an approximate answer would very likely be good
enough. And as Itagaki-san said 4 years ago, who is to say that what
you've done here for buffers (or equally, what you've done in your
min/max patch) is more interesting than the same thing but for another
cost? The point of having what you've removed from the
pg_stat_statements docs about calculating averages is that it is an
example that can be generalized from. I certainly think there should
be better tooling to make displaying costs over time easier, or
characterizing the distribution, but unfortunately this isn't it.

Something like pg_stat_statements is allowed to be approximate. That's
considered an appropriate trade-off. Most obviously, today there can
be hash table collisions, and some of the entries can therefore be
plain wrong. Previously, I put the probability of 1 collision in the
hash table at about 1% when pg_stat_statements.max is set to 10,000.
So if your min/max patch was implemented in userspace, and an
outlier is lost in the noise with just one second of activity, I'm not
terribly concerned about that. It's a trade-off, and if you don't
think it's the correct one, well then I'm afraid that's just where you
and I differ. As I've said many times, if you want to have a
discussion about making aggressive snapshotting of the
pg_stat_statements view more practical, I think that would be very
useful.

 By the way, MySQL and Oracle database which are very popular have these
 statistics. I think that your argument might disturb people who wants to
 migration from these database and will accelerate popularity of these
 database more.

I think that there should be better tooling built on top of
pg_stat_statements. I don't know what Oracle does, but I'm pretty sure
that MySQL has nothing like pg_stat_statements. Please correct me if
I'm mistaken.

 As I said on the min/max thread, if we're not
 doing enough to help people who would like to build such a tool, we
 should discuss how we can do better.

 Could you tell me how to get min/max statistics with low cost?

See my previous comments on the other thread about making
pg_stat_statements only return changed entries, and only sending the
query text once.

 I'm not sure about detail of your patch in CF, but it seems very high cost.

I think you should actually go and read the code and read my
explanation of it, and refrain from making uninformed remarks like
this. Whatever overhead my patch may add, the important point is that
it obviously and self-evidently adds exactly zero overhead to
maintaining statistics for existing entries - we only care about the
query text when first creating an entry, or when viewing an entry when
the view is selected from. With the reduced shared memory consumption,
more entries can be created, making the cost of creating new entries
(and thus whatever might have been added to that cost) matter less.
Having said that, the additional cost is thought to be very low
anyway.

If you read my mail to the list about this, you'd know that I
specifically worried about the implications of what I'd proposed for
tools like your tool. That's part of the reason I keep urging you to
think about making pg_stat_statements cheaper for consumer tools.
There is no reason to send query texts to such tools more than once
per entry, and no reason to send unchanged entries.

 Repeatedly, I think that if we want to get drastic detail statistics, we
 have to create another tools of statistics. Your patch will be these
 statistics tools. However, pg_stat_statement sholud be just light weight.

This is incomprehensible. As with the cumulative aggregate costs, how
is a consumer 

Re: [HACKERS] Improvement of pg_stat_statement usage about buffer hit ratio

2013-11-18 Thread KONDO Mitsumasa

(2013/11/19 12:03), Peter Geoghegan wrote:

On Mon, Nov 18, 2013 at 6:12 PM, KONDO Mitsumasa
kondo.mitsum...@lab.ntt.co.jp wrote:

I confirmed that Itagaki-san and Mr Cerdic disscution. He said that raw
values be just simple. However, were his changes just simple? I cannot
understand his aesthetics sense and also you, too:-(



It's too complicated, and do you know how to tuning PG from information of
local_* and temp_*?
At least, I think that most user cannot tuning from these information, and
it might not be useful information only part of them.


All of those costs are cumulative aggregates. If we didn't aggregate
them, then the user couldn't possibly determine them on their own, to
any approximation. That's the difference. If you think the local_* and
temp_* aren't very useful, I'm inclined to agree, but it's too late to
do anything about that now.
I regret past decision of Itagaki-san's patch, and improvement might not be 
possible. However, we can change it, if we get have logical reason to change it.



No. It's not for geek tools and people having pre-packaged solution in big
company, but also for common DBA tools.


I don't think that the tool needs to be expensive. If selecting from
the pg_stat_statements view every 1-3 seconds is too expensive for
such a tool, we can have a discussion about being smarter, because
there certainly are ways to optimize it.
I can understand why you say my patch is heavy now! Your monitoring methods are 
redically heavy. In general, we get pg_stat_statements view every 1 min - 5min. 
It is because monitoring SQLs must not heavier than common main SQLs. If we 
measure the real performance, we don't measure with monitoring SQL's cost. And, I 
cannot still understand you'd like to collect drastic detail performance of 
statements. I'd like to only know max, avg and stddev in each statement. They are 
enough, because we can improve them by using these information.



Regarding your min/max patch: I'm opposed to adding even more to the
spinlock-protected counters struct, so that we can get an exact answer
to a question where an approximate answer would very likely be good
enough. And as Itagaki-san said 4 years ago, who is to say that what
you've done here for buffers (or equally, what you've done in your
min/max patch) is more interesting than the same thing but for another
cost? The point of having what you've removed from the
pg_stat_statements docs about calculating averages is that it is an
example that can be generalized from. I certainly think there should
be better tooling to make displaying costs over time easier, or
characterizing the distribution, but unfortunately this isn't it.

Something like pg_stat_statements is allowed to be approximate. That's
considered an appropriate trade-off. Most obviously, today there can
be hash table collisions, and some of the entries can therefore be
plain wrong. Previously, I put the probability of 1 collision in the
hash table at about 1% when pg_stat_statements.max is set to 10,000.
So if your min/max patch was implemented in userspace, and an
outlier is lost in the noise with just one second of activity, I'm not
terribly concerned about that. It's a trade-off, and if you don't
think it's the correct one, well then I'm afraid that's just where you
and I differ. As I've said many times, if you want to have a
discussion about making aggressive snapshotting of the
pg_stat_statements view more practical, I think that would be very
useful.
In summary of your comment, your patch is lower cost than I proposed patch. 
Because my patch has long lock problem, on the other hands your patch doesn't 
these problem. Is it right? If it true, I can understand your theoretically, but 
I'm not sure without real practice or benchmark that it is really or not. So we 
will need benchmark test in my patch and yours. I try it.



By the way, MySQL and Oracle database which are very popular have these
statistics. I think that your argument might disturb people who wants to
migration from these database and will accelerate popularity of these
database more.


I think that there should be better tooling built on top of
pg_stat_statements. I don't know what Oracle does, but I'm pretty sure
that MySQL has nothing like pg_stat_statements. Please correct me if
I'm mistaken.
I joined the db tech show case 2013 which is held in japan last week. Oracle 
speaker intoroduced performance schema and like these in MySQL 5.6. This is the 
slide of his. It's almost in japanese, but please see it since 31page. It is 
wirtten in SQL.

http://www.slideshare.net/yoyamasaki/20131110-tuning-onmysql56

In MySQL 5.6, it has information which are sum_time, min_time, avg_time, max_time 
and sum_lock_time. I think it is useful for understanding our executing statements.



As I said on the min/max thread, if we're not
doing enough to help people who would like to build such a tool, we
should discuss how we can do better.


Could you tell me how to get min/max statistics 

Re: [HACKERS] Improvement of pg_stat_statement usage about buffer hit ratio

2013-11-18 Thread Peter Geoghegan
On Mon, Nov 18, 2013 at 10:08 PM, KONDO Mitsumasa
kondo.mitsum...@lab.ntt.co.jp wrote:
 I regret past decision of Itagaki-san's patch, and improvement might not be
 possible. However, we can change it, if we get have logical reason to change
 it.

I suppose so.

 I don't think that the tool needs to be expensive. If selecting from
 the pg_stat_statements view every 1-3 seconds is too expensive for
 such a tool, we can have a discussion about being smarter, because
 there certainly are ways to optimize it.

 I can understand why you say my patch is heavy now! Your monitoring methods
 are redically heavy. In general, we get pg_stat_statements view every 1 min
 - 5min. It is because monitoring SQLs must not heavier than common main
 SQLs.

Certainly, not everyone would want to take snapshots so frequently.
Most users would not. However, if you're worried about the overhead of
parsing the pg_stat_statements call, and then executing
pg_stat_statements, you're worried about the wrong thing. What about
the overhead of fingerprinting every single statement executed against
the server? Now, as I've already acknowledged I might actually have
regressed things when pg_stat_statements is selected from very
frequently. That increases the need for the tool to support external
aggregation tools well.

 If we measure the real performance, we don't measure with monitoring
 SQL's cost. And, I cannot still understand you'd like to collect drastic
 detail performance of statements. I'd like to only know max, avg and stddev
 in each statement. They are enough, because we can improve them by using
 these information.

It's not just the performance cost of this one addition. It's also a
question of where do we draw the line?, particularly given the
current lack of atomic portable addition. Most users of
pg_stat_statements don't care about the min/max. Furthermore, I'm not
convinced that you've adequately addressed the question of local
minima and maxima, even in your most recent revision.

 In summary of your comment, your patch is lower cost than I proposed patch.
 Because my patch has long lock problem, on the other hands your patch
 doesn't these problem. Is it right? If it true, I can understand your
 theoretically, but I'm not sure without real practice or benchmark that it
 is really or not. So we will need benchmark test in my patch and yours. I
 try it.

I don't know why you decided to compare our two patches at all -
what's the point? How are you going to benchmark my patch? If the
answer is pgbench, let me save you some time: there is no added
overhead there, because any additional cost that may exist is paid
only once per entry (at a time when an exclusive lock already needs to
be acquired on the entire table to create a new entry anyway).
Furthermore, my patch might actually be cost negative (i.e. faster),
because we can create a query text in the external file with only a
shared lock held (just using a spinlock to read-and-advance the offset
into the external file), rather than, with an *exclusive* lock held,
performing memcmp() to the 1KiB text shared memory buffer.

 I joined the db tech show case 2013 which is held in japan last week. Oracle
 speaker intoroduced performance schema and like these in MySQL 5.6. This is
 the slide of his. It's almost in japanese, but please see it since 31page.
 It is wirtten in SQL.
 http://www.slideshare.net/yoyamasaki/20131110-tuning-onmysql56

 In MySQL 5.6, it has information which are sum_time, min_time, avg_time,
 max_time and sum_lock_time. I think it is useful for understanding our
 executing statements.

Based on a reading of
https://dev.mysql.com/doc/refman/5.6/en/events-statements-current-table.html,
I don't think that is comparable to pg_stat_statements. It looks like
this doesn't do any normalization - if it did, why is there one row
per thread?

 This is incomprehensible. As with the cumulative aggregate costs, how
 is a consumer of pg_stat_statements possibly going to get the full
 query text from anywhere else? It *has* to come from
 pg_stat_statements directly.

 I think that it is necessary to make it clear with real test.

I'd have started with a benchmark of your changes to
pg_stat_statements for an unsympathetic case. Even though there is
surely some demand for what you're asking for, you haven't adequately
demonstrated or even argued that what you've proposed is the right way
to do it, nor have you even tried to quantify the costs, or given any
indication that you've considered them.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Improvement of pg_stat_statement usage about buffer hit ratio

2013-10-18 Thread KONDO Mitsumasa
Hi,

I submit improvement of pg_stat_statement usage patch in CF3.

In pg_stat_statement, I think buffer hit ratio is very important value. However,
it is difficult to calculate it, and it need complicated SQL. This patch makes 
it
more simple usage and documentation.

 -bench=# SELECT query, calls, total_time, rows, 100.0 * shared_blks_hit /
 -   nullif(shared_blks_hit + shared_blks_read, 0) AS hit_percent
 +bench=# SELECT query, calls, total_time, rows, shared_blks_hit_percent
FROM pg_stat_statements ORDER BY total_time DESC LIMIT 5;

It will be very simple:-)

This patch conflicts pg_stat_statement_min_max_exectime patch which I submitted,
and pg_stat_statement_min_max_exectime patch also adds new columns which are
min_time and max_time. So I'd like to change it in this opportunity.

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center
diff --git a/contrib/pg_stat_statements/Makefile b/contrib/pg_stat_statements/Makefile
index e8aed61..5c63940 100644
--- a/contrib/pg_stat_statements/Makefile
+++ b/contrib/pg_stat_statements/Makefile
@@ -4,8 +4,10 @@ MODULE_big = pg_stat_statements
 OBJS = pg_stat_statements.o
 
 EXTENSION = pg_stat_statements
-DATA = pg_stat_statements--1.1.sql pg_stat_statements--1.0--1.1.sql \
-	pg_stat_statements--unpackaged--1.0.sql
+DATA = pg_stat_statements--1.2.sql \
+   pg_stat_statements--1.0--1.1.sql \
+   pg_stat_statements--1.1--1.2.sql \
+   pg_stat_statements--unpackaged--1.0.sql
 
 ifdef USE_PGXS
 PG_CONFIG = pg_config
diff --git a/contrib/pg_stat_statements/pg_stat_statements--1.0--1.1.sql b/contrib/pg_stat_statements/pg_stat_statements--1.0--1.1.sql
index 5be281e..5662273 100644
--- a/contrib/pg_stat_statements/pg_stat_statements--1.0--1.1.sql
+++ b/contrib/pg_stat_statements/pg_stat_statements--1.0--1.1.sql
@@ -1,7 +1,7 @@
 /* contrib/pg_stat_statements/pg_stat_statements--1.0--1.1.sql */
 
 -- complain if script is sourced in psql, rather than via ALTER EXTENSION
-\echo Use ALTER EXTENSION pg_stat_statements UPDATE TO '1.1' to load this file. \quit
+\echo Use ALTER EXTENSION pg_stat_statements UPDATE to load this file. \quit
 
 /* First we have to remove them from the extension */
 ALTER EXTENSION pg_stat_statements DROP VIEW pg_stat_statements;
diff --git a/contrib/pg_stat_statements/pg_stat_statements--1.1--1.2.sql b/contrib/pg_stat_statements/pg_stat_statements--1.1--1.2.sql
new file mode 100644
index 000..f0a8e0f
--- /dev/null
+++ b/contrib/pg_stat_statements/pg_stat_statements--1.1--1.2.sql
@@ -0,0 +1,63 @@
+/* contrib/pg_stat_statements/pg_stat_statements--1.1--1.2.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use ALTER EXTENSION pg_stat_statements UPDATE to load this file. \quit
+
+/* First we have to remove them from the extension */
+ALTER EXTENSION pg_stat_statements DROP VIEW pg_stat_statements;
+ALTER EXTENSION pg_stat_statements DROP FUNCTION pg_stat_statements();
+
+/* Then we can drop them */
+DROP VIEW pg_stat_statements;
+DROP FUNCTION pg_stat_statements();
+
+/* Now redefine */
+CREATE FUNCTION pg_stat_statements(
+OUT userid oid,
+OUT dbid oid,
+OUT query text,
+OUT calls int8,
+OUT total_time float8,
+OUT rows int8,
+OUT shared_blks_hit int8,
+OUT shared_blks_read int8,
+OUT shared_blks_dirtied int8,
+OUT shared_blks_written int8,
+OUT local_blks_hit int8,
+OUT local_blks_read int8,
+OUT local_blks_dirtied int8,
+OUT local_blks_written int8,
+OUT temp_blks_read int8,
+OUT temp_blks_written int8,
+OUT blk_read_time float8,
+OUT blk_write_time float8
+)
+RETURNS SETOF record
+AS 'MODULE_PATHNAME'
+LANGUAGE C;
+
+CREATE VIEW pg_stat_statements AS
+  SELECT userid,
+ dbid,
+ query,
+ calls,
+ total_time,
+ rows,
+ CASE WHEN shared_blks_hit + shared_blks_read  0
+   THEN 100.0 * (shared_blks_hit::float / (shared_blks_hit + shared_blks_read))
+   ELSE 0 END AS shared_blks_hit_percent,
+ shared_blks_hit,
+ shared_blks_read,
+ shared_blks_dirtied,
+ shared_blks_written,
+ local_blks_hit,
+ local_blks_read,
+ local_blks_dirtied,
+ local_blks_written,
+ temp_blks_read,
+ temp_blks_written,
+ blk_read_time,
+ blk_write_time
+  FROM pg_stat_statements();
+
+GRANT SELECT ON pg_stat_statements TO PUBLIC;
diff --git a/contrib/pg_stat_statements/pg_stat_statements--1.1.sql b/contrib/pg_stat_statements/pg_stat_statements--1.1.sql
deleted file mode 100644
index 42e4d68..000
--- a/contrib/pg_stat_statements/pg_stat_statements--1.1.sql
+++ /dev/null
@@ -1,43 +0,0 @@
-/* contrib/pg_stat_statements/pg_stat_statements--1.1.sql */
-
--- complain if script is sourced in psql, rather than via CREATE EXTENSION
-\echo Use CREATE EXTENSION pg_stat_statements to load this file. \quit
-
--- Register functions.
-CREATE FUNCTION pg_stat_statements_reset()