Re: [HACKERS] Improvement of pg_stat_statement usage about buffer hit ratio
On Mon, Dec 2, 2013 at 10:42 PM, Peter Eisentraut pete...@gmx.net wrote: On 11/19/13, 11:30 PM, Peter Geoghegan wrote: +1 from me. That's +1 for *not* including this? Right. I agree with not including this. If you're looking for more of those, here's another +1 for not including it. (And of course also for Peters comment about that we need to figure out if something is actually missing for building this higher level tool that can provide this information and more) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] Improvement of pg_stat_statement usage about buffer hit ratio
On 11/19/13, 11:30 PM, Peter Geoghegan wrote: +1 from me. That's +1 for *not* including this? Right. I agree with not including this. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Improvement of pg_stat_statement usage about buffer hit ratio
On Mon, Nov 18, 2013 at 10:56 AM, Peter Geoghegan p...@heroku.com wrote: On Mon, Nov 18, 2013 at 10:49 AM, Fujii Masao masao.fu...@gmail.com wrote: The same idea was proposed before but not committed because Itagaki thought that pg_stat_statements view should report only raw values. Please read the following thread. I have the same feeling with him. Anyway we should listen to more opinions from other people, though. http://www.postgresql.org/message-id/20091222172719.8b86.52131...@oss.ntt.co.jp +1 from me. That's +1 for *not* including this? If so, I agree as well. It would be easy enough to create a sql view that computes this if one wanted (unlike the min and max execution time, which can't be done externally). I also have a theological opposition to exposing the buffer hit ratio. There are two ways to improve the buffer hit ratio. One is to have fewer misses, which is worthwhile but you can more easily do it by looking directly at the number of misses or misses per execution, rather than a buffer hit ratio. Or you can dilute out the misses by gratuitously increasing the hits by uselessly reading cached buffers over and over again, which is counter-productive and evil and perverse. Take a small to medium look-up table, drop all the indexes so it has to be full scanned all the time, and maybe rebuild it with a lower fillfactor (but not so low that it becomes too big to cache), and watch that buffer hit ratio go through the roof, while true performance goes the other way. I think that a higher level tool needs to be built on pg_stat_statements to make things easy for those that want a slick, pre-packaged solution. As I said on the min/max thread, if we're not doing enough to help people who would like to build such a tool, we should discuss how we can do better. I'd like to be able to separate queries by the application_name and/or client_addr which issued them. Cheers, Jeff
Re: [HACKERS] Improvement of pg_stat_statement usage about buffer hit ratio
On Tue, Nov 19, 2013 at 12:12 PM, Jeff Janes jeff.ja...@gmail.com wrote: The same idea was proposed before but not committed because Itagaki thought that pg_stat_statements view should report only raw values. Please read the following thread. I have the same feeling with him. Anyway we should listen to more opinions from other people, though. http://www.postgresql.org/message-id/20091222172719.8b86.52131...@oss.ntt.co.jp +1 from me. That's +1 for *not* including this? Right. Unfortunately, discussion of the other pg_stat_statements patches spilled into this thread. I think that a higher level tool needs to be built on pg_stat_statements to make things easy for those that want a slick, pre-packaged solution. As I said on the min/max thread, if we're not doing enough to help people who would like to build such a tool, we should discuss how we can do better. I'd like to be able to separate queries by the application_name and/or client_addr which issued them. I see why that would be desirable (it would also be desirable to sometimes *not* break costs down by user), but it's hard to see a way to make that work without hurting cases where it doesn't matter (i.e. the majority of cases). So thinking about it some more, you could hack together a patch fairly easily that adds a GUC that makes the current userid a union. A setting remembers the current preference as to what that actually stores. You remember the setting on shutdown, when serializing the stats, in the file header. You can set the setting to hash a whole bunch of alternative things along the lines you refer to here, such as client_addr or application_name. I can see why some aspects of that would be controversial, but it could probably be made to work. Maybe the hash is used in a second shared dynahash, where the original values are looked-up, so a new text column is added to the view that displays things as needed that way (when the GUC isn't set to its default, userid is NULL, else this new column is NULL). It would probably be okay to make this spill to disk, simply because it wouldn't come up that often, and you could be clever about the locking. Notably, something like a client_addr is stable within a backend, so it would be cheap to cache the hash value. The logical/physical I/O divide is probably the main factor that makes the track_io_timing stuff compelling. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Improvement of pg_stat_statement usage about buffer hit ratio
On 18 October 2013 13:35 KONDO Mitsumasa wrote: Hi, I submit improvement of pg_stat_statement usage patch in CF3. In pg_stat_statement, I think buffer hit ratio is very important value. However, it is difficult to calculate it, and it need complicated SQL. This patch makes it more simple usage and documentation. -bench=# SELECT query, calls, total_time, rows, 100.0 * shared_blks_hit / - nullif(shared_blks_hit + shared_blks_read, 0) AS hit_percent +bench=# SELECT query, calls, total_time, rows, +shared_blks_hit_percent FROM pg_stat_statements ORDER BY total_time DESC LIMIT 5; It will be very simple:-) This patch conflicts pg_stat_statement_min_max_exectime patch which I submitted, and pg_stat_statement_min_max_exectime patch also adds new columns which are min_time and max_time. So I'd like to change it in this opportunity. This patch adds another column shared_blks_hit_percent to pg_stat_statements view Which is very beneficial to the user to know how much percentage of blks are hit. All changes are fine and working as described. Marked as ready for committer. Regards, Hari babu -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Improvement of pg_stat_statement usage about buffer hit ratio
(2013/11/18 20:16), Haribabu kommi wrote: On 18 October 2013 13:35 KONDO Mitsumasa wrote: This patch conflicts pg_stat_statement_min_max_exectime patch which I submitted, and pg_stat_statement_min_max_exectime patch also adds new columns which are min_time and max_time. So I'd like to change it in this opportunity. This patch adds another column shared_blks_hit_percent to pg_stat_statements view Which is very beneficial to the user to know how much percentage of blks are hit. All changes are fine and working as described. Marked as ready for committer. Thank you for your reviewing! However, I'd like to add average time in each statement, too. Attached patch is my latest one. Adding shared_blks_hit_percent and ave_time. This is the adding main code. + total_time / calls::float AS avg_time, If this patch and min/max and stddev patch will be commited, we can see more detail and simple information in pg_stat_statements, by light-weight coding. Regards, -- Mitsumasa KONDO NTT Open Source Software Center diff --git a/contrib/pg_stat_statements/pg_stat_statements--1.1--1.2.sql b/contrib/pg_stat_statements/pg_stat_statements--1.1--1.2.sql new file mode 100644 index 000..f0a8e0f --- /dev/null +++ b/contrib/pg_stat_statements/pg_stat_statements--1.1--1.2.sql @@ -0,0 +1,63 @@ +/* contrib/pg_stat_statements/pg_stat_statements--1.1--1.2.sql */ + +-- complain if script is sourced in psql, rather than via ALTER EXTENSION +\echo Use ALTER EXTENSION pg_stat_statements UPDATE to load this file. \quit + +/* First we have to remove them from the extension */ +ALTER EXTENSION pg_stat_statements DROP VIEW pg_stat_statements; +ALTER EXTENSION pg_stat_statements DROP FUNCTION pg_stat_statements(); + +/* Then we can drop them */ +DROP VIEW pg_stat_statements; +DROP FUNCTION pg_stat_statements(); + +/* Now redefine */ +CREATE FUNCTION pg_stat_statements( +OUT userid oid, +OUT dbid oid, +OUT query text, +OUT calls int8, +OUT total_time float8, +OUT rows int8, +OUT shared_blks_hit int8, +OUT shared_blks_read int8, +OUT shared_blks_dirtied int8, +OUT shared_blks_written int8, +OUT local_blks_hit int8, +OUT local_blks_read int8, +OUT local_blks_dirtied int8, +OUT local_blks_written int8, +OUT temp_blks_read int8, +OUT temp_blks_written int8, +OUT blk_read_time float8, +OUT blk_write_time float8 +) +RETURNS SETOF record +AS 'MODULE_PATHNAME' +LANGUAGE C; + +CREATE VIEW pg_stat_statements AS + SELECT userid, + dbid, + query, + calls, + total_time, + rows, + CASE WHEN shared_blks_hit + shared_blks_read 0 + THEN 100.0 * (shared_blks_hit::float / (shared_blks_hit + shared_blks_read)) + ELSE 0 END AS shared_blks_hit_percent, + shared_blks_hit, + shared_blks_read, + shared_blks_dirtied, + shared_blks_written, + local_blks_hit, + local_blks_read, + local_blks_dirtied, + local_blks_written, + temp_blks_read, + temp_blks_written, + blk_read_time, + blk_write_time + FROM pg_stat_statements(); + +GRANT SELECT ON pg_stat_statements TO PUBLIC; diff --git a/contrib/pg_stat_statements/pg_stat_statements--1.2.sql b/contrib/pg_stat_statements/pg_stat_statements--1.2.sql new file mode 100644 index 000..41dc16b --- /dev/null +++ b/contrib/pg_stat_statements/pg_stat_statements--1.2.sql @@ -0,0 +1,65 @@ +/* contrib/pg_stat_statements/pg_stat_statements--1.2.sql */ + +-- complain if script is sourced in psql, rather than via CREATE EXTENSION +\echo Use CREATE EXTENSION pg_stat_statements to load this file. \quit + +-- Register functions. +CREATE FUNCTION pg_stat_statements_reset() +RETURNS void +AS 'MODULE_PATHNAME' +LANGUAGE C; + +CREATE FUNCTION pg_stat_statements( +OUT userid oid, +OUT dbid oid, +OUT query text, +OUT calls int8, +OUT total_time float8, +OUT rows int8, +OUT shared_blks_hit int8, +OUT shared_blks_read int8, +OUT shared_blks_dirtied int8, +OUT shared_blks_written int8, +OUT local_blks_hit int8, +OUT local_blks_read int8, +OUT local_blks_dirtied int8, +OUT local_blks_written int8, +OUT temp_blks_read int8, +OUT temp_blks_written int8, +OUT blk_read_time float8, +OUT blk_write_time float8 +) +RETURNS SETOF record +AS 'MODULE_PATHNAME' +LANGUAGE C; + +-- Register a view on the function for ease of use. +CREATE VIEW pg_stat_statements AS + SELECT userid, + dbid, + query, + calls, + total_time, + total_time / calls::float AS avg_time, + rows, + CASE WHEN shared_blks_hit + shared_blks_read 0 + THEN 100.0 * (shared_blks_hit::float / (shared_blks_hit + shared_blks_read)) + ELSE 0 END AS shared_blks_hit_percent, + shared_blks_hit, + shared_blks_read, + shared_blks_dirtied, + shared_blks_written, +
Re: [HACKERS] Improvement of pg_stat_statement usage about buffer hit ratio
On Mon, Nov 18, 2013 at 8:36 PM, KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp wrote: (2013/11/18 20:16), Haribabu kommi wrote: On 18 October 2013 13:35 KONDO Mitsumasa wrote: This patch conflicts pg_stat_statement_min_max_exectime patch which I submitted, and pg_stat_statement_min_max_exectime patch also adds new columns which are min_time and max_time. So I'd like to change it in this opportunity. This patch adds another column shared_blks_hit_percent to pg_stat_statements view Which is very beneficial to the user to know how much percentage of blks are hit. The same idea was proposed before but not committed because Itagaki thought that pg_stat_statements view should report only raw values. Please read the following thread. I have the same feeling with him. Anyway we should listen to more opinions from other people, though. http://www.postgresql.org/message-id/20091222172719.8b86.52131...@oss.ntt.co.jp Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Improvement of pg_stat_statement usage about buffer hit ratio
On Mon, Nov 18, 2013 at 10:49 AM, Fujii Masao masao.fu...@gmail.com wrote: The same idea was proposed before but not committed because Itagaki thought that pg_stat_statements view should report only raw values. Please read the following thread. I have the same feeling with him. Anyway we should listen to more opinions from other people, though. http://www.postgresql.org/message-id/20091222172719.8b86.52131...@oss.ntt.co.jp +1 from me. I think that a higher level tool needs to be built on pg_stat_statements to make things easy for those that want a slick, pre-packaged solution. As I said on the min/max thread, if we're not doing enough to help people who would like to build such a tool, we should discuss how we can do better. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Improvement of pg_stat_statement usage about buffer hit ratio
(2013/11/19 3:56), Peter Geoghegan wrote: On Mon, Nov 18, 2013 at 10:49 AM, Fujii Masao masao.fu...@gmail.com wrote: The same idea was proposed before but not committed because Itagaki thought that pg_stat_statements view should report only raw values. Please read the following thread. I have the same feeling with him. Anyway we should listen to more opinions from other people, though. http://www.postgresql.org/message-id/20091222172719.8b86.52131...@oss.ntt.co.jp I confirmed that Itagaki-san and Mr Cerdic disscution. He said that raw values be just simple. However, were his changes just simple? I cannot understand his aesthetics sense and also you, too:-( PG8.4 before: 012 CREATE FUNCTION pg_stat_statements( 013 OUT userid oid, 014 OUT dbid oid, 015 OUT query text, 016 OUT calls int8, 017 OUT total_time float8, 018 OUT rows int8 019 ) PG9.0 after: 012 CREATE FUNCTION pg_stat_statements( 013 OUT userid oid, 014 OUT dbid oid, 015 OUT query text, 016 OUT calls int8, 017 OUT total_time float8, 018 OUT rows int8, 019 OUT shared_blks_hit int8, 020 OUT shared_blks_read int8, 021 OUT shared_blks_written int8, 022 OUT local_blks_hit int8, 023 OUT local_blks_read int8, 024 OUT local_blks_written int8, 025 OUT temp_blks_read int8, 026 OUT temp_blks_written int8 027 ) It's too complicated, and do you know how to tuning PG from information of local_* and temp_*? At least, I think that most user cannot tuning from these information, and it might not be useful information only part of them. +1 from me. I think that a higher level tool needs to be built on pg_stat_statements to make things easy for those that want a slick, pre-packaged solution. No. It's not for geek tools and people having pre-packaged solution in big company, but also for common DBA tools. By the way, MySQL and Oracle database which are very popular have these statistics. I think that your argument might disturb people who wants to migration from these database and will accelerate popularity of these database more. As I said on the min/max thread, if we're not doing enough to help people who would like to build such a tool, we should discuss how we can do better. Could you tell me how to get min/max statistics with low cost? I'm not sure about detail of your patch in CF, but it seems very high cost. Repeatedly, I think that if we want to get drastic detail statistics, we have to create another tools of statistics. Your patch will be these statistics tools. However, pg_stat_statement sholud be just light weight. Regards, -- Mitsumasa KONDO NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Improvement of pg_stat_statement usage about buffer hit ratio
(2013/11/19 11:12), KONDO Mitsumasa wrote: (2013/11/19 3:56), Peter Geoghegan wrote: On Mon, Nov 18, 2013 at 10:49 AM, Fujii Masao masao.fu...@gmail.com wrote: The same idea was proposed before but not committed because Itagaki thought that pg_stat_statements view should report only raw values. Please read the following thread. I have the same feeling with him. Anyway we should listen to more opinions from other people, though. http://www.postgresql.org/message-id/20091222172719.8b86.52131...@oss.ntt.co.jp I confirmed that Itagaki-san and Mr Cerdic disscution. He said that raw values be just simple. However, were his changes just simple? I cannot understand his aesthetics sense and also you, too:-( PG8.4 before: 012 CREATE FUNCTION pg_stat_statements( 013 OUT userid oid, 014 OUT dbid oid, 015 OUT query text, 016 OUT calls int8, 017 OUT total_time float8, 018 OUT rows int8 019 ) PG9.0 after: 012 CREATE FUNCTION pg_stat_statements( 013 OUT userid oid, 014 OUT dbid oid, 015 OUT query text, 016 OUT calls int8, 017 OUT total_time float8, 018 OUT rows int8, 019 OUT shared_blks_hit int8, 020 OUT shared_blks_read int8, 021 OUT shared_blks_written int8, 022 OUT local_blks_hit int8, 023 OUT local_blks_read int8, 024 OUT local_blks_written int8, 025 OUT temp_blks_read int8, 026 OUT temp_blks_written int8 027 ) It's too complicated, and do you know how to tuning PG from information of local_* and temp_*? At least, I think that most user cannot tuning from these information, and it might not be useful information only part of them. +1 from me. I think that a higher level tool needs to be built on pg_stat_statements to make things easy for those that want a slick, pre-packaged solution. No. It's not for geek tools and people having pre-packaged solution in big company, but also for common DBA tools. By the way, MySQL and Oracle database which are very popular have these statistics. I think that your argument might disturb people who wants to migration from these database and will accelerate popularity of these database more. As I said on the min/max thread, if we're not doing enough to help people who would like to build such a tool, we should discuss how we can do better. Could you tell me how to get min/max statistics with low cost? I'm not sure about detail of your patch in CF, but it seems very high cost. Repeatedly, I think that if we want to get drastic detail statistics, we have to create another tools of statistics. Your patch will be these statistics tools. However, pg_stat_statement sholud be just light weight. Regards, Oh, I forgot to say it... I beleive that essence of information technology is to show more informative information in little information with low cost. If it wrong, please let me know the reason. Regards, -- Mitsumasa KONDO NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Improvement of pg_stat_statement usage about buffer hit ratio
On Mon, Nov 18, 2013 at 6:12 PM, KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp wrote: I confirmed that Itagaki-san and Mr Cerdic disscution. He said that raw values be just simple. However, were his changes just simple? I cannot understand his aesthetics sense and also you, too:-( It's too complicated, and do you know how to tuning PG from information of local_* and temp_*? At least, I think that most user cannot tuning from these information, and it might not be useful information only part of them. All of those costs are cumulative aggregates. If we didn't aggregate them, then the user couldn't possibly determine them on their own, to any approximation. That's the difference. If you think the local_* and temp_* aren't very useful, I'm inclined to agree, but it's too late to do anything about that now. No. It's not for geek tools and people having pre-packaged solution in big company, but also for common DBA tools. I don't think that the tool needs to be expensive. If selecting from the pg_stat_statements view every 1-3 seconds is too expensive for such a tool, we can have a discussion about being smarter, because there certainly are ways to optimize it. Regarding your min/max patch: I'm opposed to adding even more to the spinlock-protected counters struct, so that we can get an exact answer to a question where an approximate answer would very likely be good enough. And as Itagaki-san said 4 years ago, who is to say that what you've done here for buffers (or equally, what you've done in your min/max patch) is more interesting than the same thing but for another cost? The point of having what you've removed from the pg_stat_statements docs about calculating averages is that it is an example that can be generalized from. I certainly think there should be better tooling to make displaying costs over time easier, or characterizing the distribution, but unfortunately this isn't it. Something like pg_stat_statements is allowed to be approximate. That's considered an appropriate trade-off. Most obviously, today there can be hash table collisions, and some of the entries can therefore be plain wrong. Previously, I put the probability of 1 collision in the hash table at about 1% when pg_stat_statements.max is set to 10,000. So if your min/max patch was implemented in userspace, and an outlier is lost in the noise with just one second of activity, I'm not terribly concerned about that. It's a trade-off, and if you don't think it's the correct one, well then I'm afraid that's just where you and I differ. As I've said many times, if you want to have a discussion about making aggressive snapshotting of the pg_stat_statements view more practical, I think that would be very useful. By the way, MySQL and Oracle database which are very popular have these statistics. I think that your argument might disturb people who wants to migration from these database and will accelerate popularity of these database more. I think that there should be better tooling built on top of pg_stat_statements. I don't know what Oracle does, but I'm pretty sure that MySQL has nothing like pg_stat_statements. Please correct me if I'm mistaken. As I said on the min/max thread, if we're not doing enough to help people who would like to build such a tool, we should discuss how we can do better. Could you tell me how to get min/max statistics with low cost? See my previous comments on the other thread about making pg_stat_statements only return changed entries, and only sending the query text once. I'm not sure about detail of your patch in CF, but it seems very high cost. I think you should actually go and read the code and read my explanation of it, and refrain from making uninformed remarks like this. Whatever overhead my patch may add, the important point is that it obviously and self-evidently adds exactly zero overhead to maintaining statistics for existing entries - we only care about the query text when first creating an entry, or when viewing an entry when the view is selected from. With the reduced shared memory consumption, more entries can be created, making the cost of creating new entries (and thus whatever might have been added to that cost) matter less. Having said that, the additional cost is thought to be very low anyway. If you read my mail to the list about this, you'd know that I specifically worried about the implications of what I'd proposed for tools like your tool. That's part of the reason I keep urging you to think about making pg_stat_statements cheaper for consumer tools. There is no reason to send query texts to such tools more than once per entry, and no reason to send unchanged entries. Repeatedly, I think that if we want to get drastic detail statistics, we have to create another tools of statistics. Your patch will be these statistics tools. However, pg_stat_statement sholud be just light weight. This is incomprehensible. As with the cumulative aggregate costs, how is a consumer
Re: [HACKERS] Improvement of pg_stat_statement usage about buffer hit ratio
(2013/11/19 12:03), Peter Geoghegan wrote: On Mon, Nov 18, 2013 at 6:12 PM, KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp wrote: I confirmed that Itagaki-san and Mr Cerdic disscution. He said that raw values be just simple. However, were his changes just simple? I cannot understand his aesthetics sense and also you, too:-( It's too complicated, and do you know how to tuning PG from information of local_* and temp_*? At least, I think that most user cannot tuning from these information, and it might not be useful information only part of them. All of those costs are cumulative aggregates. If we didn't aggregate them, then the user couldn't possibly determine them on their own, to any approximation. That's the difference. If you think the local_* and temp_* aren't very useful, I'm inclined to agree, but it's too late to do anything about that now. I regret past decision of Itagaki-san's patch, and improvement might not be possible. However, we can change it, if we get have logical reason to change it. No. It's not for geek tools and people having pre-packaged solution in big company, but also for common DBA tools. I don't think that the tool needs to be expensive. If selecting from the pg_stat_statements view every 1-3 seconds is too expensive for such a tool, we can have a discussion about being smarter, because there certainly are ways to optimize it. I can understand why you say my patch is heavy now! Your monitoring methods are redically heavy. In general, we get pg_stat_statements view every 1 min - 5min. It is because monitoring SQLs must not heavier than common main SQLs. If we measure the real performance, we don't measure with monitoring SQL's cost. And, I cannot still understand you'd like to collect drastic detail performance of statements. I'd like to only know max, avg and stddev in each statement. They are enough, because we can improve them by using these information. Regarding your min/max patch: I'm opposed to adding even more to the spinlock-protected counters struct, so that we can get an exact answer to a question where an approximate answer would very likely be good enough. And as Itagaki-san said 4 years ago, who is to say that what you've done here for buffers (or equally, what you've done in your min/max patch) is more interesting than the same thing but for another cost? The point of having what you've removed from the pg_stat_statements docs about calculating averages is that it is an example that can be generalized from. I certainly think there should be better tooling to make displaying costs over time easier, or characterizing the distribution, but unfortunately this isn't it. Something like pg_stat_statements is allowed to be approximate. That's considered an appropriate trade-off. Most obviously, today there can be hash table collisions, and some of the entries can therefore be plain wrong. Previously, I put the probability of 1 collision in the hash table at about 1% when pg_stat_statements.max is set to 10,000. So if your min/max patch was implemented in userspace, and an outlier is lost in the noise with just one second of activity, I'm not terribly concerned about that. It's a trade-off, and if you don't think it's the correct one, well then I'm afraid that's just where you and I differ. As I've said many times, if you want to have a discussion about making aggressive snapshotting of the pg_stat_statements view more practical, I think that would be very useful. In summary of your comment, your patch is lower cost than I proposed patch. Because my patch has long lock problem, on the other hands your patch doesn't these problem. Is it right? If it true, I can understand your theoretically, but I'm not sure without real practice or benchmark that it is really or not. So we will need benchmark test in my patch and yours. I try it. By the way, MySQL and Oracle database which are very popular have these statistics. I think that your argument might disturb people who wants to migration from these database and will accelerate popularity of these database more. I think that there should be better tooling built on top of pg_stat_statements. I don't know what Oracle does, but I'm pretty sure that MySQL has nothing like pg_stat_statements. Please correct me if I'm mistaken. I joined the db tech show case 2013 which is held in japan last week. Oracle speaker intoroduced performance schema and like these in MySQL 5.6. This is the slide of his. It's almost in japanese, but please see it since 31page. It is wirtten in SQL. http://www.slideshare.net/yoyamasaki/20131110-tuning-onmysql56 In MySQL 5.6, it has information which are sum_time, min_time, avg_time, max_time and sum_lock_time. I think it is useful for understanding our executing statements. As I said on the min/max thread, if we're not doing enough to help people who would like to build such a tool, we should discuss how we can do better. Could you tell me how to get min/max statistics
Re: [HACKERS] Improvement of pg_stat_statement usage about buffer hit ratio
On Mon, Nov 18, 2013 at 10:08 PM, KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp wrote: I regret past decision of Itagaki-san's patch, and improvement might not be possible. However, we can change it, if we get have logical reason to change it. I suppose so. I don't think that the tool needs to be expensive. If selecting from the pg_stat_statements view every 1-3 seconds is too expensive for such a tool, we can have a discussion about being smarter, because there certainly are ways to optimize it. I can understand why you say my patch is heavy now! Your monitoring methods are redically heavy. In general, we get pg_stat_statements view every 1 min - 5min. It is because monitoring SQLs must not heavier than common main SQLs. Certainly, not everyone would want to take snapshots so frequently. Most users would not. However, if you're worried about the overhead of parsing the pg_stat_statements call, and then executing pg_stat_statements, you're worried about the wrong thing. What about the overhead of fingerprinting every single statement executed against the server? Now, as I've already acknowledged I might actually have regressed things when pg_stat_statements is selected from very frequently. That increases the need for the tool to support external aggregation tools well. If we measure the real performance, we don't measure with monitoring SQL's cost. And, I cannot still understand you'd like to collect drastic detail performance of statements. I'd like to only know max, avg and stddev in each statement. They are enough, because we can improve them by using these information. It's not just the performance cost of this one addition. It's also a question of where do we draw the line?, particularly given the current lack of atomic portable addition. Most users of pg_stat_statements don't care about the min/max. Furthermore, I'm not convinced that you've adequately addressed the question of local minima and maxima, even in your most recent revision. In summary of your comment, your patch is lower cost than I proposed patch. Because my patch has long lock problem, on the other hands your patch doesn't these problem. Is it right? If it true, I can understand your theoretically, but I'm not sure without real practice or benchmark that it is really or not. So we will need benchmark test in my patch and yours. I try it. I don't know why you decided to compare our two patches at all - what's the point? How are you going to benchmark my patch? If the answer is pgbench, let me save you some time: there is no added overhead there, because any additional cost that may exist is paid only once per entry (at a time when an exclusive lock already needs to be acquired on the entire table to create a new entry anyway). Furthermore, my patch might actually be cost negative (i.e. faster), because we can create a query text in the external file with only a shared lock held (just using a spinlock to read-and-advance the offset into the external file), rather than, with an *exclusive* lock held, performing memcmp() to the 1KiB text shared memory buffer. I joined the db tech show case 2013 which is held in japan last week. Oracle speaker intoroduced performance schema and like these in MySQL 5.6. This is the slide of his. It's almost in japanese, but please see it since 31page. It is wirtten in SQL. http://www.slideshare.net/yoyamasaki/20131110-tuning-onmysql56 In MySQL 5.6, it has information which are sum_time, min_time, avg_time, max_time and sum_lock_time. I think it is useful for understanding our executing statements. Based on a reading of https://dev.mysql.com/doc/refman/5.6/en/events-statements-current-table.html, I don't think that is comparable to pg_stat_statements. It looks like this doesn't do any normalization - if it did, why is there one row per thread? This is incomprehensible. As with the cumulative aggregate costs, how is a consumer of pg_stat_statements possibly going to get the full query text from anywhere else? It *has* to come from pg_stat_statements directly. I think that it is necessary to make it clear with real test. I'd have started with a benchmark of your changes to pg_stat_statements for an unsympathetic case. Even though there is surely some demand for what you're asking for, you haven't adequately demonstrated or even argued that what you've proposed is the right way to do it, nor have you even tried to quantify the costs, or given any indication that you've considered them. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Improvement of pg_stat_statement usage about buffer hit ratio
Hi, I submit improvement of pg_stat_statement usage patch in CF3. In pg_stat_statement, I think buffer hit ratio is very important value. However, it is difficult to calculate it, and it need complicated SQL. This patch makes it more simple usage and documentation. -bench=# SELECT query, calls, total_time, rows, 100.0 * shared_blks_hit / - nullif(shared_blks_hit + shared_blks_read, 0) AS hit_percent +bench=# SELECT query, calls, total_time, rows, shared_blks_hit_percent FROM pg_stat_statements ORDER BY total_time DESC LIMIT 5; It will be very simple:-) This patch conflicts pg_stat_statement_min_max_exectime patch which I submitted, and pg_stat_statement_min_max_exectime patch also adds new columns which are min_time and max_time. So I'd like to change it in this opportunity. Regards, -- Mitsumasa KONDO NTT Open Source Software Center diff --git a/contrib/pg_stat_statements/Makefile b/contrib/pg_stat_statements/Makefile index e8aed61..5c63940 100644 --- a/contrib/pg_stat_statements/Makefile +++ b/contrib/pg_stat_statements/Makefile @@ -4,8 +4,10 @@ MODULE_big = pg_stat_statements OBJS = pg_stat_statements.o EXTENSION = pg_stat_statements -DATA = pg_stat_statements--1.1.sql pg_stat_statements--1.0--1.1.sql \ - pg_stat_statements--unpackaged--1.0.sql +DATA = pg_stat_statements--1.2.sql \ + pg_stat_statements--1.0--1.1.sql \ + pg_stat_statements--1.1--1.2.sql \ + pg_stat_statements--unpackaged--1.0.sql ifdef USE_PGXS PG_CONFIG = pg_config diff --git a/contrib/pg_stat_statements/pg_stat_statements--1.0--1.1.sql b/contrib/pg_stat_statements/pg_stat_statements--1.0--1.1.sql index 5be281e..5662273 100644 --- a/contrib/pg_stat_statements/pg_stat_statements--1.0--1.1.sql +++ b/contrib/pg_stat_statements/pg_stat_statements--1.0--1.1.sql @@ -1,7 +1,7 @@ /* contrib/pg_stat_statements/pg_stat_statements--1.0--1.1.sql */ -- complain if script is sourced in psql, rather than via ALTER EXTENSION -\echo Use ALTER EXTENSION pg_stat_statements UPDATE TO '1.1' to load this file. \quit +\echo Use ALTER EXTENSION pg_stat_statements UPDATE to load this file. \quit /* First we have to remove them from the extension */ ALTER EXTENSION pg_stat_statements DROP VIEW pg_stat_statements; diff --git a/contrib/pg_stat_statements/pg_stat_statements--1.1--1.2.sql b/contrib/pg_stat_statements/pg_stat_statements--1.1--1.2.sql new file mode 100644 index 000..f0a8e0f --- /dev/null +++ b/contrib/pg_stat_statements/pg_stat_statements--1.1--1.2.sql @@ -0,0 +1,63 @@ +/* contrib/pg_stat_statements/pg_stat_statements--1.1--1.2.sql */ + +-- complain if script is sourced in psql, rather than via ALTER EXTENSION +\echo Use ALTER EXTENSION pg_stat_statements UPDATE to load this file. \quit + +/* First we have to remove them from the extension */ +ALTER EXTENSION pg_stat_statements DROP VIEW pg_stat_statements; +ALTER EXTENSION pg_stat_statements DROP FUNCTION pg_stat_statements(); + +/* Then we can drop them */ +DROP VIEW pg_stat_statements; +DROP FUNCTION pg_stat_statements(); + +/* Now redefine */ +CREATE FUNCTION pg_stat_statements( +OUT userid oid, +OUT dbid oid, +OUT query text, +OUT calls int8, +OUT total_time float8, +OUT rows int8, +OUT shared_blks_hit int8, +OUT shared_blks_read int8, +OUT shared_blks_dirtied int8, +OUT shared_blks_written int8, +OUT local_blks_hit int8, +OUT local_blks_read int8, +OUT local_blks_dirtied int8, +OUT local_blks_written int8, +OUT temp_blks_read int8, +OUT temp_blks_written int8, +OUT blk_read_time float8, +OUT blk_write_time float8 +) +RETURNS SETOF record +AS 'MODULE_PATHNAME' +LANGUAGE C; + +CREATE VIEW pg_stat_statements AS + SELECT userid, + dbid, + query, + calls, + total_time, + rows, + CASE WHEN shared_blks_hit + shared_blks_read 0 + THEN 100.0 * (shared_blks_hit::float / (shared_blks_hit + shared_blks_read)) + ELSE 0 END AS shared_blks_hit_percent, + shared_blks_hit, + shared_blks_read, + shared_blks_dirtied, + shared_blks_written, + local_blks_hit, + local_blks_read, + local_blks_dirtied, + local_blks_written, + temp_blks_read, + temp_blks_written, + blk_read_time, + blk_write_time + FROM pg_stat_statements(); + +GRANT SELECT ON pg_stat_statements TO PUBLIC; diff --git a/contrib/pg_stat_statements/pg_stat_statements--1.1.sql b/contrib/pg_stat_statements/pg_stat_statements--1.1.sql deleted file mode 100644 index 42e4d68..000 --- a/contrib/pg_stat_statements/pg_stat_statements--1.1.sql +++ /dev/null @@ -1,43 +0,0 @@ -/* contrib/pg_stat_statements/pg_stat_statements--1.1.sql */ - --- complain if script is sourced in psql, rather than via CREATE EXTENSION -\echo Use CREATE EXTENSION pg_stat_statements to load this file. \quit - --- Register functions. -CREATE FUNCTION pg_stat_statements_reset()