Re: [HACKERS] proposal for 9.5: monitoring lock time for slow queries

2014-10-05 Thread Gregory Smith

On 8/13/14, 5:14 AM, MauMau wrote:
I'm interested in b, too.  I was thinking of proposing a performance 
diagnostics feature like Oracle's wait events (V$SYSTEM_EVENT and 
V$SESSION_EVENT).  So, if you do this, I'd like to contribute to the 
functional design, code and doc review, and testing.


I already wrote up a design like this once: 
http://www.postgresql.org/message-id/509300f7.5000...@2ndquadrant.com


The main problem when I tried to code it was figuring out how to store 
the data.  When I wrote that, I thought I could just stuff it into a 
shared memory block the way pg_stat_statements did.  That didn't really 
work out.  I think it's manageable now because the Dynamic Shared Memory 
management written since then has the right sort of shape to do the job.


This is one of those jobs where I think the coding itself is the hard 
part, not the design nor the review.  What I really want is something 
that dumps this data into memory, then a second process that persists to 
disk in batches.  I think that's the only way we'll get high performance 
on reads while still saving enough data to be useful after a crash.


We're working on getting a few things in this area fully funded to dig 
into them harder.  The idea of designing for high-speed audit logs into 
memory and then persisting to disk has a lot of overlap with this one 
too, and that may get picked up too.


--
Greg Smith greg.sm...@crunchydatasolutions.com
Chief PostgreSQL Evangelist - http://crunchydatasolutions.com/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal for 9.5: monitoring lock time for slow queries

2014-08-19 Thread MauMau

From: Alvaro Herrera alvhe...@2ndquadrant.com

Is this supposed to be session-local data, or is it visible from remote
sessions too?  How durable is it supposed to be?  Keep in mind that in
case of a crash, all pgstats data is erased.


I want it to be visible from other sessions.  I'm okay about the data 
erasure during recovery.  We can probably extend pg_statsinfo to save the 
new info for long-term trend analysis.  TBH, I want a feature like 
pg_statsinfo in core.


Regards
MauMau



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal for 9.5: monitoring lock time for slow queries

2014-08-18 Thread Pavel Stehule
2014-08-18 7:42 GMT+02:00 Alvaro Herrera alvhe...@2ndquadrant.com:

 Pavel Stehule wrote:
  2014-08-13 15:22 GMT+02:00 MauMau maumau...@gmail.com:

   I didn't mean performance statistics data to be stored in database
 tables.
   I just meant:
  
   * pg_stat_system_events is a view to show data on memory, which returns
   one row for each event across the system.  This is similar to
   V$SYSTEM_EVENT in Oracle.
  
   * pg_stat_session_events is a view to show data on memory, which
 returns
   one row for each event on one session.  This is similar to
 V$SESSION_EVENT
   in Oracle.
  
   * The above views represent the current accumulated data like other
   pg_stat_xxx views.
  
   * EXPLAIN ANALYZE and auto_explain shows all events for one query.  The
   lock waits you are trying to record in the server log is one of the
 events.
 
  I am little bit sceptic about only memory based structure. Is it this
  concept acceptable for commiters?

 Is this supposed to be session-local data, or is it visible from remote
 sessions too?  How durable is it supposed to be?  Keep in mind that in
 case of a crash, all pgstats data is erased.


surely it should be visible from all sessions and least 48 hours.

I have no problem with cleaning pgstats after crash -  it is cost related
to minimal overhead. And on server related hw there are (should be)  a
minimal number of crash.

Regards

Pavel



 --
 Álvaro Herrerahttp://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services



Re: [HACKERS] proposal for 9.5: monitoring lock time for slow queries

2014-08-17 Thread Alvaro Herrera
Pavel Stehule wrote:
 2014-08-13 15:22 GMT+02:00 MauMau maumau...@gmail.com:

  I didn't mean performance statistics data to be stored in database tables.
  I just meant:
 
  * pg_stat_system_events is a view to show data on memory, which returns
  one row for each event across the system.  This is similar to
  V$SYSTEM_EVENT in Oracle.
 
  * pg_stat_session_events is a view to show data on memory, which returns
  one row for each event on one session.  This is similar to V$SESSION_EVENT
  in Oracle.
 
  * The above views represent the current accumulated data like other
  pg_stat_xxx views.
 
  * EXPLAIN ANALYZE and auto_explain shows all events for one query.  The
  lock waits you are trying to record in the server log is one of the events.
 
 I am little bit sceptic about only memory based structure. Is it this
 concept acceptable for commiters?

Is this supposed to be session-local data, or is it visible from remote
sessions too?  How durable is it supposed to be?  Keep in mind that in
case of a crash, all pgstats data is erased.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal for 9.5: monitoring lock time for slow queries

2014-08-13 Thread MauMau

From: Pavel Stehule pavel.steh...@gmail.com

There are two relative independent tasks

a) monitor and show total lock time of living queries

b) monitor and log total lock time of executed queries.

I am interested by @b now. When we work with slow query log, then we would
to identify reason for long duration. Locks are important source of these
queries on some systems.


I'm interested in b, too.  I was thinking of proposing a performance 
diagnostics feature like Oracle's wait events (V$SYSTEM_EVENT and 
V$SESSION_EVENT).  So, if you do this, I'd like to contribute to the 
functional design, code and doc review, and testing.


The point is to collect as much information about bottlenecks as possible, 
including lock waits.  The rough sketch is:


What info to collect:
* heavyweight lock waits shown by pg_locks
* lightweight lock waits
* latch waits
* socket waits (mainly for client input)

How the info is delivered:
* pg_stat_system_events shows the accumulated total accross the server 
instance

* pg_stat_session_events shows the accumulated total for each session
* EXPLAIN ANALYZE and auto_explain shows the accumulated total for each 
query


We need to describe in the manual how to diagnose and tne the system with 
these event info.


Regards
MauMau



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal for 9.5: monitoring lock time for slow queries

2014-08-13 Thread Pavel Stehule
2014-08-13 11:14 GMT+02:00 MauMau maumau...@gmail.com:

 From: Pavel Stehule pavel.steh...@gmail.com

  There are two relative independent tasks

 a) monitor and show total lock time of living queries

 b) monitor and log total lock time of executed queries.

 I am interested by @b now. When we work with slow query log, then we would
 to identify reason for long duration. Locks are important source of these
 queries on some systems.


 I'm interested in b, too.  I was thinking of proposing a performance
 diagnostics feature like Oracle's wait events (V$SYSTEM_EVENT and
 V$SESSION_EVENT).  So, if you do this, I'd like to contribute to the
 functional design, code and doc review, and testing.


isn't it too heavy?

I have just terrible negative experience with Vertica, where this design is
used - almost all information about queries are available, but any query to
related tables are terrible slow, so I am inclined to more simple design
oriented to log based solution. Table based solutions is not practical when
you exec billions queries per day. I understand to motivation, but I afraid
so it can be very expensive and slow on highly load servers.



 The point is to collect as much information about bottlenecks as possible,
 including lock waits.  The rough sketch is:

 What info to collect:
 * heavyweight lock waits shown by pg_locks
 * lightweight lock waits
 * latch waits
 * socket waits (mainly for client input)






 How the info is delivered:
 * pg_stat_system_events shows the accumulated total accross the server
 instance
 * pg_stat_session_events shows the accumulated total for each session
 * EXPLAIN ANALYZE and auto_explain shows the accumulated total for each
 query

 We need to describe in the manual how to diagnose and tne the system with
 these event info.

 Regards
 MauMau




Re: [HACKERS] proposal for 9.5: monitoring lock time for slow queries

2014-08-13 Thread MauMau

From: Pavel Stehule pavel.steh...@gmail.com

isn't it too heavy?


Are you concerned about the impactof collection overhead on the queries 
diagnosed?  Maybe not light, but I'm optimistic.  Oracle has the track 
record of long use, and MySQL provides performance schema starting from 5.6.


I have just terrible negative experience with Vertica, where this design 
is
used - almost all information about queries are available, but any query 
to

related tables are terrible slow, so I am inclined to more simple design
oriented to log based solution. Table based solutions is not practical 
when
you exec billions queries per day. I understand to motivation, but I 
afraid

so it can be very expensive and slow on highly load servers.


Which do you mean by query related to tables, the queries from 
applications being diagnosed, or the queries that diagnose the performance 
using statistics views?


Could you elaborate on your experience with Vertica?  That trouble may be 
just because Vertica's implementation is not refined.


I understand the feeling of being inclined to log based solution for its 
implementation simplicity.  However, the server log is difficult or 
impossible to access using SQL queries.  This prevents the development of 
performance diagnostics functionality in GUI administration tools.  Also, 
statistics views allow for easy access on PAAS like Amazon RDS and Heroku.


Regards
MauMau



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal for 9.5: monitoring lock time for slow queries

2014-08-13 Thread Pavel Stehule
2014-08-13 13:59 GMT+02:00 MauMau maumau...@gmail.com:

 From: Pavel Stehule pavel.steh...@gmail.com

  isn't it too heavy?


 Are you concerned about the impactof collection overhead on the queries
 diagnosed?  Maybe not light, but I'm optimistic.  Oracle has the track
 record of long use, and MySQL provides performance schema starting from 5.6.


partially, I afraid about total performance (about impact on IO) - when we
use a usual tables, then any analyses without indexes are slow, so you need
a indexes, and we cannot deferred index update. You should thinking about
retention policy - and without partitioning you got massive deletes. So I
cannot to imagine a usage of table based solution together with some higher
load. Our MVCC storage is not practical for storing only inserted data, and
some custom storage has no indexes - so this design is relative big project.




  I have just terrible negative experience with Vertica, where this design
 is
 used - almost all information about queries are available, but any query
 to
 related tables are terrible slow, so I am inclined to more simple design
 oriented to log based solution. Table based solutions is not practical
 when
 you exec billions queries per day. I understand to motivation, but I
 afraid
 so it can be very expensive and slow on highly load servers.


 Which do you mean by query related to tables, the queries from
 applications being diagnosed, or the queries that diagnose the performance
 using statistics views?

 Could you elaborate on your experience with Vertica?  That trouble may be
 just because Vertica's implementation is not refined.


sure - Vertica is not mature database. More it has only one storage type
optimized for OLAP, what is wrong for long catalog, and for working with
performance events too.


 I understand the feeling of being inclined to log based solution for its
 implementation simplicity.  However, the server log is difficult or
 impossible to access using SQL queries.  This prevents the development of
 performance diagnostics functionality in GUI administration tools.  Also,
 statistics views allow for easy access on PAAS like Amazon RDS and Heroku.


I prefer a possibility to read log via SQL (maybe some FDW) than use tables
for storing log. These tables can be relative very large in few days - and
we cannot to write specialized engine like MySQL simply.

Pavel



 Regards
 MauMau




Re: [HACKERS] proposal for 9.5: monitoring lock time for slow queries

2014-08-13 Thread MauMau

From: Pavel Stehule pavel.steh...@gmail.com

2014-08-13 13:59 GMT+02:00 MauMau maumau...@gmail.com:

Are you concerned about the impactof collection overhead on the queries
diagnosed?  Maybe not light, but I'm optimistic.  Oracle has the track
record of long use, and MySQL provides performance schema starting from 
5.6.



partially, I afraid about total performance (about impact on IO) - when we
use a usual tables, then any analyses without indexes are slow, so you 
need

a indexes, and we cannot deferred index update. You should thinking about
retention policy - and without partitioning you got massive deletes. So I
cannot to imagine a usage of table based solution together with some 
higher
load. Our MVCC storage is not practical for storing only inserted data, 
and
some custom storage has no indexes - so this design is relative big 
project.


I prefer a possibility to read log via SQL (maybe some FDW) than use 
tables

for storing log. These tables can be relative very large in few days - and
we cannot to write specialized engine like MySQL simply.


I didn't mean performance statistics data to be stored in database tables. 
I just meant:


* pg_stat_system_events is a view to show data on memory, which returns one 
row for each event across the system.  This is similar to V$SYSTEM_EVENT in 
Oracle.


* pg_stat_session_events is a view to show data on memory, which returns one 
row for each event on one session.  This is similar to V$SESSION_EVENT in 
Oracle.


* The above views represent the current accumulated data like other 
pg_stat_xxx views.


* EXPLAIN ANALYZE and auto_explain shows all events for one query.  The lock 
waits you are trying to record in the server log is one of the events.


Regards
MauMau



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal for 9.5: monitoring lock time for slow queries

2014-08-13 Thread Pavel Stehule
2014-08-13 15:22 GMT+02:00 MauMau maumau...@gmail.com:

 From: Pavel Stehule pavel.steh...@gmail.com

 2014-08-13 13:59 GMT+02:00 MauMau maumau...@gmail.com:

 Are you concerned about the impactof collection overhead on the queries

 diagnosed?  Maybe not light, but I'm optimistic.  Oracle has the track
 record of long use, and MySQL provides performance schema starting from
 5.6.



 partially, I afraid about total performance (about impact on IO) - when we
 use a usual tables, then any analyses without indexes are slow, so you
 need
 a indexes, and we cannot deferred index update. You should thinking about
 retention policy - and without partitioning you got massive deletes. So I
 cannot to imagine a usage of table based solution together with some
 higher
 load. Our MVCC storage is not practical for storing only inserted data,
 and
 some custom storage has no indexes - so this design is relative big
 project.

 I prefer a possibility to read log via SQL (maybe some FDW) than use
 tables
 for storing log. These tables can be relative very large in few days - and
 we cannot to write specialized engine like MySQL simply.


 I didn't mean performance statistics data to be stored in database tables.
 I just meant:

 * pg_stat_system_events is a view to show data on memory, which returns
 one row for each event across the system.  This is similar to
 V$SYSTEM_EVENT in Oracle.

 * pg_stat_session_events is a view to show data on memory, which returns
 one row for each event on one session.  This is similar to V$SESSION_EVENT
 in Oracle.

 * The above views represent the current accumulated data like other
 pg_stat_xxx views.

 * EXPLAIN ANALYZE and auto_explain shows all events for one query.  The
 lock waits you are trying to record in the server log is one of the events.


I am little bit sceptic about only memory based structure. Is it this
concept acceptable for commiters?

Pavel



 Regards
 MauMau




[HACKERS] proposal for 9.5: monitoring lock time for slow queries

2014-08-12 Thread Pavel Stehule
Hello

I was asked about possibility to show a lock time of slow queries.

I proposed a design based on log line prefix, but it was rejected.

Any idea how to show a lock time in some practical form together with
logged slow query?


Regards

Pavel


Re: [HACKERS] proposal for 9.5: monitoring lock time for slow queries

2014-08-12 Thread Michael Paquier
On Wed, Aug 13, 2014 at 4:59 AM, Pavel Stehule pavel.steh...@gmail.com wrote:
 Any idea how to show a lock time in some practical form together with logged
 slow query?

Doing a join on pg_stat_activity and pg_locks is not going to help
much as you could only get the moment when query has started or its
state has changed. Have you thought about the addition of a new column
in pg_locks containing the timestamp of the moment a lock has been
taken? I am sure that we are concerned about the performance impact
that extra calls to gettimeofday could have though...
Regards,
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal for 9.5: monitoring lock time for slow queries

2014-08-12 Thread Tom Lane
Michael Paquier michael.paqu...@gmail.com writes:
 Doing a join on pg_stat_activity and pg_locks is not going to help
 much as you could only get the moment when query has started or its
 state has changed. Have you thought about the addition of a new column
 in pg_locks containing the timestamp of the moment a lock has been
 taken? I am sure that we are concerned about the performance impact
 that extra calls to gettimeofday could have though...

In theory this could be driven off the same gettimeofday needed to
start the deadlock_timeout timer.  Not sure how messy that'd be.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal for 9.5: monitoring lock time for slow queries

2014-08-12 Thread Pavel Stehule
Hi


2014-08-13 6:18 GMT+02:00 Michael Paquier michael.paqu...@gmail.com:

 On Wed, Aug 13, 2014 at 4:59 AM, Pavel Stehule pavel.steh...@gmail.com
 wrote:
  Any idea how to show a lock time in some practical form together with
 logged
  slow query?

 Doing a join on pg_stat_activity and pg_locks is not going to help
 much as you could only get the moment when query has started or its
 state has changed. Have you thought about the addition of a new column
 in pg_locks containing the timestamp of the moment a lock has been
 taken? I am sure that we are concerned about the performance impact
 that extra calls to gettimeofday could have though...
 Regards,


There are two relative independent tasks

a) monitor and show total lock time of living queries

b) monitor and log total lock time of executed queries.

I am interested by @b now. When we work with slow query log, then we would
to identify reason for long duration. Locks are important source of these
queries on some systems.

What I know, we do monitoring these times for deadlock identification
trigger and for log_lock_waits - so there should not be any new pressure to
performance.

Regards

Pavel


 --
 Michael



Re: [HACKERS] proposal for 9.5: monitoring lock time for slow queries

2014-08-12 Thread Pavel Stehule
2014-08-13 7:19 GMT+02:00 Tom Lane t...@sss.pgh.pa.us:

 Michael Paquier michael.paqu...@gmail.com writes:
  Doing a join on pg_stat_activity and pg_locks is not going to help
  much as you could only get the moment when query has started or its
  state has changed. Have you thought about the addition of a new column
  in pg_locks containing the timestamp of the moment a lock has been
  taken? I am sure that we are concerned about the performance impact
  that extra calls to gettimeofday could have though...

 In theory this could be driven off the same gettimeofday needed to
 start the deadlock_timeout timer.  Not sure how messy that'd be.


we use it in out custom patch without problems

Pavel



 regards, tom lane