Re: [HACKERS] four minor proposals for 9.5

2014-04-17 Thread Pavel Stehule
2014-04-17 7:12 GMT+02:00 Amit Kapila amit.kapil...@gmail.com:

 On Mon, Apr 14, 2014 at 6:27 PM, Robert Haas robertmh...@gmail.com
 wrote:
  I agree.  I don't think the idea of pushing this into the
  log_line_prefix stuff as a one-off is a very good one.  Sure, we could
  wedge it in there, but we've got an existing precedent that everything
  that you can get with log_line_prefix also shows up in the CSV output
  file.  And it's easy to imagine LOTS more counters that somebody might
  want to have.  Time spent planning, time spent executing, time spent
  waiting for disk I/O, time spent returning results to client, and I'm
  sure people will think of many others.  I think this will balloon out
  of control if we don't have a more systematic design for this sort of
  thing.

 Can't we think of some infrastructure similar to what is done for
 log_duration and log_min_duration_statement?
 Currently it prints like below:
 LOG:  duration: 343.000 ms  statement: create table t1(c1 int);

 Let us say a user wants to track the lock wait time a statement has
 spent; then they enable some config parameter (either log_lock_duration
 or some other convenient way):

 LOG:  lock duration: 'x' ms  statement: create table t1(c1 int);


Isn't that the log_line_prefix analogy?

We can introduce the new feature without a hard dependency on the CSV format.

I think there are two clear requirements (illustrated below):

simply parseable - a vector of numbers is ideal
simply activated and deactivated - maybe a list of flags in a GUC
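
For illustration only - a hypothetical flag-list GUC and the kind of
fixed-position numeric vector it could emit (the names here are invented,
not from any patch):

log_statement_metrics = 'duration, lock_time'
LOG:  metrics: 343.000 12.500  statement: create table t1(c1 int);

A consumer then only has to split the vector on whitespace; the enabled
flags determine which positions are present.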

Regards

Pavel



 With Regards,
 Amit Kapila.
 EnterpriseDB: http://www.enterprisedb.com



Re: [HACKERS] four minor proposals for 9.5

2014-04-17 Thread Tom Lane
Pavel Stehule pavel.steh...@gmail.com writes:
 We can introduce the new feature without a hard dependency on the CSV format

Look, the long and the short of it is that there is not consensus
that this measurement is worth creating a new CSV log column for.
And from that, there is also not consensus that it's worth putting
into log_line_prefix.  Therefore, this proposal is dead.  Please
stop prolonging the argument.

regards, tom lane




Re: [HACKERS] four minor proposals for 9.5

2014-04-16 Thread Amit Kapila
On Mon, Apr 14, 2014 at 6:27 PM, Robert Haas robertmh...@gmail.com wrote:
 I agree.  I don't think the idea of pushing this into the
 log_line_prefix stuff as a one-off is a very good one.  Sure, we could
 wedge it in there, but we've got an existing precedent that everything
 that you can get with log_line_prefix also shows up in the CSV output
 file.  And it's easy to imagine LOTS more counters that somebody might
 want to have.  Time spent planning, time spent executing, time spent
 waiting for disk I/O, time spent returning results to client, and I'm
 sure people will think of many others.  I think this will balloon out
 of control if we don't have a more systematic design for this sort of
 thing.

Can't we think of some infrastructure similar to what is done for
log_duration and log_min_duration_statement?
Currently it prints like below:
LOG:  duration: 343.000 ms  statement: create table t1(c1 int);

Let us say a user wants to track the lock wait time a statement has
spent; then they enable some config parameter (either log_lock_duration
or some other convenient way):

LOG:  lock duration: 'x' ms  statement: create table t1(c1 int);
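
For what it's worth, a minimal sketch of that shape, modeled loosely on how
check_log_duration() in src/backend/tcop/postgres.c formats the existing
duration line; the GUC log_lock_duration and the accumulator below are
assumptions for illustration, not existing PostgreSQL code:

/*
 * Sketch only: accumulate lock wait time per statement and report it
 * the way log_duration reports elapsed time.  log_lock_duration and
 * statement_lock_usecs are hypothetical.
 */
#include <stdbool.h>
#include <stdio.h>

static bool log_lock_duration = true;  /* hypothetical GUC */
static long statement_lock_usecs = 0;  /* reset at statement start */

/* Called from the lock-wait path with the measured wait in microseconds. */
static void
record_lock_wait(long wait_usecs)
{
    statement_lock_usecs += wait_usecs;
}

/* Called once at statement end, next to the existing duration report. */
static void
report_lock_duration(const char *query)
{
    if (log_lock_duration && statement_lock_usecs > 0)
        printf("LOG:  lock duration: %ld.%03ld ms  statement: %s\n",
               statement_lock_usecs / 1000, statement_lock_usecs % 1000,
               query);
    statement_lock_usecs = 0;
}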

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] four minor proposals for 9.5

2014-04-14 Thread Robert Haas
On Tue, Apr 8, 2014 at 12:34 PM, Gregory Smith gregsmithpg...@gmail.com wrote:
 On 4/6/14 2:46 PM, Pavel Stehule wrote:
 The proposed options are interesting for enterprise use, when you have
 smarter tools for log entry processing and need a complex view of the
 performance of billions of queries - where cancel time and lock time are
 important pieces in the mosaic of the server's fitness.

 I once sent a design proposal over for something I called Performance
 Events that included this.  It will be difficult to get everything people
 want to track into log_line_prefix macro form.  You're right that this
 particular one can probably be pushed into there, but you're adding four
 macros just for this feature. And that's only a fraction of what people
 expect from database per-query performance metrics.

I agree.  I don't think the idea of pushing this into the
log_line_prefix stuff as a one-off is a very good one.  Sure, we could
wedge it in there, but we've got an existing precedent that everything
that you can get with log_line_prefix also shows up in the CSV output
file.  And it's easy to imagine LOTS more counters that somebody might
want to have.  Time spent planning, time spent executing, time spent
waiting for disk I/O, time spent returning results to client, and I'm
sure people will think of many others.  I think this will balloon out
of control if we don't have a more systematic design for this sort of
thing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] four minor proposals for 9.5

2014-04-14 Thread Pavel Stehule
2014-04-14 14:57 GMT+02:00 Robert Haas robertmh...@gmail.com:

 On Tue, Apr 8, 2014 at 12:34 PM, Gregory Smith gregsmithpg...@gmail.com
 wrote:
  On 4/6/14 2:46 PM, Pavel Stehule wrote:
  The proposed options are interesting for enterprise use, when you have
  smarter tools for log entry processing and need a complex view of the
  performance of billions of queries - where cancel time and lock time
  are important pieces in the mosaic of the server's fitness.
 
  I once sent a design proposal over for something I called Performance
  Events that included this.  It will be difficult to get everything
 people
  want to track into log_line_prefix macro form.  You're right that this
  particular one can probably be pushed into there, but you're adding four
  macros just for this feature. And that's only a fraction of what people
  expect from database per-query performance metrics.

 I agree.  I don't think the idea of pushing this into the
 log_line_prefix stuff as a one-off is a very good one.  Sure, we could
 wedge it in there, but we've got an existing precedent that everything
 that you can get with log_line_prefix also shows up in the CSV output
 file.  And it's easy to imagine LOTS more counters that somebody might
 want to have.  Time spent planning, time spent executing, time spent
 waiting for disk I/O, time spent returning results to client, and I'm
 sure people will think of many others.  I think this will balloon out
 of control if we don't have a more systematic design for this sort of
 thing.



I am thinking about this feature - and it has more dimensions.

1. In the context of relational databases, query duration and query lock
time are relatively basic metrics, and so they should be in
log_line_prefix.

2. I can imagine that someone would want to monitor a lot of metrics at
the query level - and we should provide some method for doing that.
log_line_prefix may or may not be the solution.

Any log_line_prefix-like solution can be useful. Do you have some idea
about the future direction of PostgreSQL logging?

Requests (sketched below):

a) possibility to choose a format: CSV, JSON, XML
b) possibility to choose a set of metrics: ...
c) possibility to choose a target: syslog, file, ..
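
For illustration only, such knobs might look like this in postgresql.conf
(log_destination is the one real parameter here; log_format and
log_metrics are purely hypothetical):

log_format = 'json'                             # csv | json | xml (hypothetical)
log_metrics = 'duration, lock_time, temp_bytes' # hypothetical metric list
log_destination = 'syslog'                      # exists today: stderr, csvlog, syslog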

Pavel


--
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company



Re: [HACKERS] four minor proposals for 9.5

2014-04-14 Thread Tom Lane
Pavel Stehule pavel.steh...@gmail.com writes:
 2014-04-14 14:57 GMT+02:00 Robert Haas robertmh...@gmail.com:
 I agree.  I don't think the idea of pushing this into the
 log_line_prefix stuff as a one-off is a very good one.  Sure, we could
 wedge it in there, but we've got an existing precedent that everything
 that you can get with log_line_prefix also shows up in the CSV output
 file.  And it's easy to imagine LOTS more counters that somebody might
 want to have.

I think this argument is sufficient to kill this line of proposal
entirely.

regards, tom lane




Re: [HACKERS] four minor proposals for 9.5

2014-04-09 Thread Amit Kapila
On Tue, Apr 8, 2014 at 9:07 PM, Pavel Stehule pavel.steh...@gmail.com wrote:
 2014-04-08 6:27 GMT+02:00 Amit Kapila amit.kapil...@gmail.com:
 So do you want to just print lock time for error'd statements? Won't it
 be better to do it for non-error'd statements as well? Rather, I feel it
 can be more useful for non-error statements. Do we already have some easy
 way to get wait-time for non-error statements?


 There are two points:

 a) we have no infrastructure other than log_line_prefix for attaching
 some specific info to a query.

Can't we do that by using log_duration and log_min_duration_statement?
For example, if I enable these parameters, below is the log:

LOG:  duration: 343.000 ms  statement: create table t1(c1 int);

 And I have no idea how to implement anything better, and I don't think
 any new infrastructure (mechanism) is necessary. log_line_prefix
 increases log size, but it is very well parseable - Splunk and similar
 software have no problem with it.

One thing that could happen if we implement total lock time in
log_line_prefix is that if the user enables log_lock_waits, it will start
printing the duration for each lock wait; not sure, again it depends on
the implementation.

 b) lock time can be interesting for error'd statements too - for example,
 you can cancel locked queries - so you would want to see the lock time
 and duration for cancelled queries. So this implementation can be
 sensible too.

Agreed, I just said it will be quite useful for non-error'd long-running
statements as well, so it might be a good idea to see if we can implement
it for successful statements as well.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] four minor proposals for 9.5

2014-04-09 Thread Pavel Stehule
2014-04-10 5:50 GMT+02:00 Amit Kapila amit.kapil...@gmail.com:

 On Tue, Apr 8, 2014 at 9:07 PM, Pavel Stehule pavel.steh...@gmail.com
 wrote:
  2014-04-08 6:27 GMT+02:00 Amit Kapila amit.kapil...@gmail.com:
  So do you want to just print lock time for error'd statements? Won't it
  be better to do it for non-error'd statements as well? Rather, I feel
  it can be more useful for non-error statements. Do we already have some
  easy way to get wait-time for non-error statements?
 
 
  There are two points:
 
  a) we have no infrastructure other than log_line_prefix for attaching
  some specific info to a query.

 Can't we do that by using log_duration and log_min_duration_statement?
 For example, if I enable these parameters, below is the log:

 LOG:  duration: 343.000 ms  statement: create table t1(c1 int);


Yes, sorry, you are right. I thought about this possibility, and I chose
log_line_prefix due to its simple configurability and better parseability.
But again, an enhanced log_duration feature can be implemented together
with an enhanced log_line_prefix.

I'll try to visualise it in a prototype.

Regards

Pavel



  And I have no idea how to implement anything better, and I don't think
  any new infrastructure (mechanism) is necessary. log_line_prefix
  increases log size, but it is very well parseable - Splunk and similar
  software have no problem with it.

 One thing that could happen if we implement total lock time in
 log_line_prefix is that if the user enables log_lock_waits, it will start
 printing the duration for each lock wait; not sure, again it depends on
 the implementation.

  b) lock time can be interesting for error'd statements too - for
  example, you can cancel locked queries - so you would want to see the
  lock time and duration for cancelled queries. So this implementation
  can be sensible too.

 Agreed, I just said it will be quite useful for non-error'd long-running
 statements as well, so it might be a good idea to see if we can implement
 it for successful statements as well.


 With Regards,
 Amit Kapila.
 EnterpriseDB: http://www.enterprisedb.com



Re: [HACKERS] four minor proposals for 9.5

2014-04-08 Thread Pavel Stehule
2014-04-08 6:27 GMT+02:00 Amit Kapila amit.kapil...@gmail.com:

 On Mon, Apr 7, 2014 at 12:16 AM, Pavel Stehule pavel.steh...@gmail.com
 wrote:
  2014-04-04 6:51 GMT+02:00 Amit Kapila amit.kapil...@gmail.com:
  On Tue, Apr 1, 2014 at 11:42 PM, Pavel Stehule pavel.steh...@gmail.com
 
  wrote:
   2014-03-27 17:56 GMT+01:00 Pavel Stehule pavel.steh...@gmail.com:
   So I'll prepare a some prototypes in next month for
  
   1. log a execution time for cancelled queries,
   2. track a query lock time
  Yes. Initially I thought only about cancelled queries, but now I am
  thinking that some wider solution can be better. Sometimes long queries
  can be stopped by Postgres or by the system - and info about the
  duration can be just as useful.

 Right.

 
  One more thing: I think currently we can also find that in a crude way
  (pg_stat_activity has query_start time and log_line_prefix has end
  time), but I think having it in one place, the log, will be more useful.
  ??

 I just wanted to say that if someone wants to calculate the duration
 of a cancelled query (or other error'd query), you can do that by checking
 the start time from pg_stat_activity and the end time from the log (using
 log_line_prefix), but this is of course cumbersome.

   The same technique I would use for printing lock time - it can be
   printed via the symbol %L.
 
  Above you have mentioned that you are planning to have three different
  lock times (Table, Tuple and others), so will this one symbol (%L)
  enable all three lock times?
 
 
  My idea is to start with %L as total lock time, which is useful for a
  wide group of users, and next (or at the same time) we can enhance it
  with two-character symbols

 So do you want to just print lock time for error'd statements? Won't it
 be better to do it for non-error'd statements as well? Rather, I feel it
 can be more useful for non-error statements. Do we already have some easy
 way to get wait-time for non-error statements?


There are two points:

a) we have no infrastructure other than log_line_prefix for attaching some
specific info to a query. And I have no idea how to implement anything
better, and I don't think any new infrastructure (mechanism) is necessary.
log_line_prefix increases log size, but it is very well parseable - Splunk
and similar software have no problem with it.

b) lock time can be interesting for error'd statements too - for example,
you can cancel locked queries - so you would want to see the lock time and
duration for cancelled queries. So this implementation can be sensible too.



  so
 
  %L .. total lock time
  %Lr .. lock relation
  %Lt .. lock tuples
  %Lo .. lock others
 
  L = Lr + Lt + Lo
 
 
  Are you planning to add new logs for logging this, or planning to use
  existing infrastructure?
 
 
  I don't have a prototype yet, so I don't know what will be necessary

 Okay, I think then it's better to discuss after your initial analysis/
 prototype, but I think you might need to add some infrastructure code
 to make it possible, as locks on database objects (relation, tuple, ..)
 and other locks (buffers, ..) have different locking strategies, so to
 get the total wait time for a statement due to different kinds of locks
 you need to accumulate the different wait times.


Yes, we can wait until the prototype is ready.



  One general point: won't it be a bit difficult to parse a log line
  having so many times? Should we consider logging with some special
  marker for each time (for example: tup_lock_wait_time:1000ms)?
 
 
  We should optimize log_line_prefix processing instead.
 
  The proposed options are interesting for enterprise use, when you have
  smarter tools for log entry processing and need a complex view of the
  performance of billions of queries - where cancel time and lock time
  are important pieces in the mosaic of the server's fitness.

 Exactly - though this might not be directly related to this patch, having
 it would be useful.


I am not afraid of the performance impact (surely, it should be tested
first). My previous implementation at GoodData was based on an actively
used mechanism - it didn't introduce any new overhead.

But it should be verified on the prototype.

regards

Pavel



 With Regards,
 Amit Kapila.
 EnterpriseDB: http://www.enterprisedb.com



Re: [HACKERS] four minor proposals for 9.5

2014-04-08 Thread Gregory Smith

On 4/6/14 2:46 PM, Pavel Stehule wrote:


The proposed options are interesting for enterprise use, when you have
smarter tools for log entry processing and need a complex view of the
performance of billions of queries - where cancel time and lock time are
important pieces in the mosaic of the server's fitness.


I once sent a design proposal over for something I called Performance 
Events that included this.  It will be difficult to get everything 
people want to track into log_line_prefix macro form.  You're right that 
this particular one can probably be pushed into there, but you're adding 
four macros just for this feature. And that's only a fraction of what 
people expect from database per-query performance metrics.


The problem I got stuck on with the performance event project was where 
to store the data collected.  If you want to keep up with read rates, 
you can't use the existing log infrastructure.  It has to be something 
faster, lighter.  I wanted to push the data into shared memory somewhere 
instead.  Then some sort of logging consumer could drain that queue and 
persist it to disk.
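
For what it's worth, a minimal sketch of that shape - a fixed-size ring
that producer backends push event records into and a single consumer
drains to disk. All names, sizes, and the locking model are assumptions
for illustration (a real implementation would use PostgreSQL spinlocks or
LWLocks in dynamic shared memory, not a pthread mutex):

#include <pthread.h>

#define EVENT_QUEUE_SIZE 4096        /* power of two, so % wraps cleanly */

typedef struct PerfEvent
{
    long duration_us;                /* elapsed time, microseconds */
    long lock_wait_us;               /* lock wait, microseconds */
    char tag[128];                   /* e.g. a normalized query id */
} PerfEvent;

typedef struct PerfEventQueue
{
    pthread_mutex_t lock;            /* stand-in for a PG spinlock */
    unsigned        head;            /* next write slot (monotonic) */
    unsigned        tail;            /* next read slot (monotonic) */
    PerfEvent       events[EVENT_QUEUE_SIZE];
} PerfEventQueue;

/* Producer: drop the event if the queue is full rather than block. */
static int
perf_event_push(PerfEventQueue *q, const PerfEvent *ev)
{
    int ok = 0;
    pthread_mutex_lock(&q->lock);
    if (q->head - q->tail < EVENT_QUEUE_SIZE)
    {
        q->events[q->head++ % EVENT_QUEUE_SIZE] = *ev;
        ok = 1;
    }
    pthread_mutex_unlock(&q->lock);
    return ok;
}

/* Consumer: pop one event; returns 0 when the queue is empty. */
static int
perf_event_pop(PerfEventQueue *q, PerfEvent *out)
{
    int ok = 0;
    pthread_mutex_lock(&q->lock);
    if (q->tail != q->head)
    {
        *out = q->events[q->tail++ % EVENT_QUEUE_SIZE];
        ok = 1;
    }
    pthread_mutex_unlock(&q->lock);
    return ok;
}

Dropping events when the ring is full is the usual trade-off here: it
keeps producers from ever stalling behind a slow consumer.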


Since then, we've had a number of advances, particularly these two:

-Dynamic shared memory allocation.
-Query log data from pg_stat_statements can persist.

With those important fundamentals available, I'm wandering around right 
now trying to get development resources to pick the whole event logging 
idea up again.  The hardest parts of the infrastructure I was stuck on 
in the past are in the code today.


--
Greg Smith greg.sm...@crunchydatasolutions.com
Chief PostgreSQL Evangelist - http://crunchydatasolutions.com/




Re: [HACKERS] four minor proposals for 9.5

2014-04-08 Thread Pavel Stehule
2014-04-08 18:34 GMT+02:00 Gregory Smith gregsmithpg...@gmail.com:

 On 4/6/14 2:46 PM, Pavel Stehule wrote:


 The proposed options are interesting for enterprise use, when you have
 smarter tools for log entry processing and need a complex view of the
 performance of billions of queries - where cancel time and lock time are
 important pieces in the mosaic of the server's fitness.


 I once sent a design proposal over for something I called Performance
 Events that included this.  It will be difficult to get everything people
 want to track into log_line_prefix macro form.  You're right that this
 particular one can probably be pushed into there, but you're adding four
 macros just for this feature. And that's only a fraction of what people
 expect from database per-query performance metrics.

 The problem I got stuck on with the performance event project was where to
 store the data collected.  If you want to keep up with read rates, you
 can't use the existing log infrastructure.  It has to be something faster,
 lighter.  I wanted to push the data into shared memory somewhere instead.
  Then some sort of logging consumer could drain that queue and persist it
 to disk.

 Since then, we've had a number of advances, particularly these two:

 -Dynamic shared memory allocation.
 -Query log data from pg_stat_statements can persist.


I know nothing about your proposal, so I cannot talk about it. But I am
sure that any memory-based solution is not practical for us. It can work
well for cumulative values (per database), but we need two views -
individual (per query) and cumulative (per database, per database server).
We process a billion queries per day, and for us it is more practical to
use external log processing tools. But I understand well that for a large
group of users a memory-based solution can be perfect, and I think these
designs should coexist - we log slow queries (we log plans) and we use
pg_stat_statements. And users can choose the best method for their
environment.

Probably some API (some data) can be shared by both designs.

Regards

Pavel



 With those important fundamentals available, I'm wandering around right
 now trying to get development resources to pick the whole event logging
 idea up again.  The hardest parts of the infrastructure I was stuck on in
 the past are in the code today.

 --
 Greg Smith greg.sm...@crunchydatasolutions.com
 Chief PostgreSQL Evangelist - http://crunchydatasolutions.com/



Re: [HACKERS] four minor proposals for 9.5

2014-04-07 Thread Amit Kapila
On Mon, Apr 7, 2014 at 12:16 AM, Pavel Stehule pavel.steh...@gmail.com wrote:
 2014-04-04 6:51 GMT+02:00 Amit Kapila amit.kapil...@gmail.com:
 On Tue, Apr 1, 2014 at 11:42 PM, Pavel Stehule pavel.steh...@gmail.com
 wrote:
  2014-03-27 17:56 GMT+01:00 Pavel Stehule pavel.steh...@gmail.com:
  So I'll prepare a some prototypes in next month for
 
  1. log a execution time for cancelled queries,
  2. track a query lock time
 Yes. Initially I thought only about cancelled queries, but now I am
 thinking that some wider solution can be better. Sometimes long queries
 can be stopped by Postgres or by the system - and info about the duration
 can be just as useful.

Right.


 One more thing: I think currently we can also find that in a crude way
 (pg_stat_activity has query_start time and log_line_prefix has end time),
 but I think having it in one place, the log, will be more useful.
 ??

I just wanted to say that if someone wants to calculate the duration
of a cancelled query (or other error'd query), you can do that by checking
the start time from pg_stat_activity and the end time from the log (using
log_line_prefix), but this is of course cumbersome.

  The same technique I would use for printing lock time - it can be
  printed via the symbol %L.

 Above you have mentioned that you are planning to have three different
 lock times (Table, Tuple and others), so will this one symbol (%L) enable
 all three lock times?


 My idea is to start with %L as total lock time, which is useful for a wide
 group of users, and next (or at the same time) we can enhance it with
 two-character symbols

So do you want to just print lock time for error'd statements? Won't it be
better to do it for non-error'd statements as well? Rather, I feel it can
be more useful for non-error statements. Do we already have some easy way
to get wait-time for non-error statements?

 so

 %L .. total lock time
 %Lr .. lock relation
 %Lt .. lock tuples
 %Lo .. lock others

 L = Lr + Lt + Lo


 Are you planning to add new logs for logging this, or planning to use
 existing infrastructure?


 I don't have a prototype yet, so I don't know what will be necessary

Okay, I think then it's better to discuss after your initial analysis/
prototype, but I think you might need to add some infrastructure code
to make it possible, as locks on database objects (relation, tuple, ..) and
other locks (buffers, ..) have different locking strategies, so to get the
total wait time for a statement due to different kinds of locks you need to
accumulate the different wait times.
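
For what it's worth, a minimal sketch of such an accumulator, with one
counter per lock category so the per-statement total is just their sum;
the category names and the plain timeval arithmetic are assumptions for
illustration - none of this exists in the tree:

#include <sys/time.h>

typedef enum LockWaitKind
{
    LOCK_WAIT_RELATION,     /* heavyweight locks on relations */
    LOCK_WAIT_TUPLE,        /* tuple locks */
    LOCK_WAIT_OTHER,        /* page extension, buffer pins, ... */
    NUM_LOCK_WAIT_KINDS
} LockWaitKind;

/* Reset to zero at the start of each statement. */
static long lock_wait_usecs[NUM_LOCK_WAIT_KINDS];

/* Called from each locking code path with its measured wait interval. */
static void
lock_wait_accumulate(LockWaitKind kind, struct timeval start,
                     struct timeval end)
{
    lock_wait_usecs[kind] += (end.tv_sec - start.tv_sec) * 1000000L
                           + (end.tv_usec - start.tv_usec);
}

/* Total for a %L-style escape; the per-kind values would feed %Lr etc. */
static long
lock_wait_total(void)
{
    long total = 0;
    for (int i = 0; i < NUM_LOCK_WAIT_KINDS; i++)
        total += lock_wait_usecs[i];
    return total;
}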

 One general point: won't it be a bit difficult to parse a log line having
 so many times? Should we consider logging with some special marker for
 each time (for example: tup_lock_wait_time:1000ms)?


  We should optimize log_line_prefix processing instead.

  The proposed options are interesting for enterprise use, when you have
  smarter tools for log entry processing and need a complex view of the
  performance of billions of queries - where cancel time and lock time
  are important pieces in the mosaic of the server's fitness.

Exactly - though this might not be directly related to this patch, having
it would be useful.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] four minor proposals for 9.5

2014-04-06 Thread Pavel Stehule
2014-04-04 6:51 GMT+02:00 Amit Kapila amit.kapil...@gmail.com:

 On Tue, Apr 1, 2014 at 11:42 PM, Pavel Stehule pavel.steh...@gmail.com
 wrote:
  2014-03-27 17:56 GMT+01:00 Pavel Stehule pavel.steh...@gmail.com:
  So I'll prepare a some prototypes in next month for
 
  1. log a execution time for cancelled queries,
  2. track a query lock time
 
 
  When I thought about this proposal, I found that our implementation is
  more or less a hack that can work well at GoodData but can be
  inconsistent with native PostgreSQL. So in this proposal I plan
  something different than what we use, but I hope it is more consistent
  with upstream.
 
  So I miss an execution time for cancelled queries. The same can be
  interesting for queries that were stopped for any reason - temp file
  limits can stop queries after 5 minutes, out of memory, etc.
 
  It is not hard to implement printing the duration for cancelled queries,
  but it is impossible to do it for every kind of exception. But there is
  a way: we can use the log_line_prefix space. Now there is no possibility
  to print the duration there - so we can introduce a new symbol %D as
  duration.

 This means for any exception/error it will print the duration if %D is
 used, not only for cancelled queries - or are you planning it just for
 some specific logs, like when a query is cancelled?


Yes. Initially I thought only about cancelled queries, but now I am
thinking that some wider solution can be better. Sometimes long queries
can be stopped by Postgres or by the system - and info about the duration
can be just as useful.


 One more thing: I think currently we can also find that in a crude way
 (pg_stat_activity has query_start time and log_line_prefix has end time),
 but I think having it in one place, the log, will be more useful.


??




  The same technique I would use for printing lock time - it can be
  printed via the symbol %L.

 Above you have mentioned that you are planning to have three different
 lock times (Table, Tuple and others), so will this one symbol (%L) enable
 all three lock times?


My idea is to start with %L as total lock time, which is useful for a wide
group of users, and next (or at the same time) we can enhance it with
two-character symbols

so

%L .. total lock time
%Lr .. lock relation
%Lt .. lock tuples
%Lo .. lock others

L = Lr + Lt + Lo
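
For illustration, under this proposal (none of these escapes exist today)
the per-kind escapes could be used selectively, e.g.:

log_line_prefix = '%t [%p] L=%L Lr=%Lr Lt=%Lt Lo=%Lo '

which would stamp every log line with the total and per-kind lock waits
for the statement being reported.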


 Are you planning to add new logs for logging this, or planning to use
 existing infrastructure?


I don't have a prototype yet, so I don't know what will be necessary.

I expect that almost all the infrastructure is already there - I don't
expect significant changes.



 One general point: won't it be a bit difficult to parse a log line having
 so many times? Should we consider logging with some special marker for
 each time (for example: tup_lock_wait_time:1000ms)?


We should optimize log_line_prefix processing instead.

The proposed options are interesting for enterprise use, when you have
smarter tools for log entry processing and need a complex view of the
performance of billions of queries - where cancel time and lock time are
important pieces in the mosaic of the server's fitness.

Regards

Pavel




 With Regards,
 Amit Kapila.
 EnterpriseDB: http://www.enterprisedb.com



Re: [HACKERS] four minor proposals for 9.5

2014-04-03 Thread Amit Kapila
On Tue, Apr 1, 2014 at 11:42 PM, Pavel Stehule pavel.steh...@gmail.com wrote:
 2014-03-27 17:56 GMT+01:00 Pavel Stehule pavel.steh...@gmail.com:
 So I'll prepare a some prototypes in next month for

 1. log a execution time for cancelled queries,
 2. track a query lock time


 When I thought about this proposal, I found that our implementation is
 more or less a hack that can work well at GoodData but can be inconsistent
 with native PostgreSQL. So in this proposal I plan something different
 than what we use, but I hope it is more consistent with upstream.

 So I miss an execution time for cancelled queries. The same can be
 interesting for queries that were stopped for any reason - temp file
 limits can stop queries after 5 minutes, out of memory, etc.

 It is not hard to implement printing the duration for cancelled queries,
 but it is impossible to do it for every kind of exception. But there is a
 way: we can use the log_line_prefix space. Now there is no possibility to
 print the duration there - so we can introduce a new symbol %D as
 duration.

This means for any exception/error it will print the duration if %D is
used, not only for cancelled queries - or are you planning it just for some
specific logs, like when a query is cancelled?
One more thing: I think currently we can also find that in a crude way
(pg_stat_activity has query_start time and log_line_prefix has end time),
but I think having it in one place, the log, will be more useful.


 The same technique I would use for printing lock time - it can be printed
 via the symbol %L.

Above you have mentioned that you are planning to have three different
lock times (Table, Tuple and others), so will this one symbol (%L) enable
all three lock times?
Are you planning to add new logs for logging this, or planning to use
existing infrastructure?

One general point: won't it be a bit difficult to parse a log line having
so many times? Should we consider logging with some special marker for
each time (for example: tup_lock_wait_time:1000ms)?
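
For illustration only, a line with such keyed markers (field names
hypothetical) could look like:

LOG:  duration: 343.000 ms rel_lock_wait_time:200ms
tup_lock_wait_time:1000ms  statement: update t1 set c1 = 0;

Keyed fields cost a few extra bytes per entry but make the line
self-describing for tools that don't know the server's log_line_prefix.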


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] four minor proposals for 9.5

2014-04-01 Thread Pavel Stehule
2014-03-27 17:56 GMT+01:00 Pavel Stehule pavel.steh...@gmail.com:

 Hello

After a week, I can evaluate the community's feedback:


 2014-03-19 16:34 GMT+01:00 Pavel Stehule pavel.steh...@gmail.com:

 Hello

 I wrote a few patches that we use in our production. These patches are
 small, but I hope they can be interesting for upstream:

 1. cancel time - we log the execution time of cancelled statements


there is interest



 2. fatal verbose - this patch ensures a verbose log for fatal errors. It
 simplifies the investigation of the reasons for an error.


 not too much



 3. relation limit - the possibility to set a session limit for the maximum
 size of relations. No relation can be extended over this limit in a session
 when this value is higher than zero. Motivation - we use a lot of queries
 like CREATE TABLE AS SELECT .., and some very big results decreased disk
 free space too much. It was a high risk in our multi-user environment. The
 motivation is similar to temp_files_limit.


there is not much interest



 4. track statement lock - we are able to track the locking time for a
 query and print this data in the slow query log and auto_explain log. It
 helps us with later slow query log analysis.


there is interest

 So I'll prepare some prototypes in the next month for

  1. log the execution time for cancelled queries,
  2. track query lock time


When I thought about this proposal, I found that our implementation is
more or less a hack that can work well at GoodData but can be inconsistent
with native PostgreSQL. So in this proposal I plan something different than
what we use, but I hope it is more consistent with upstream.

So I miss an execution time for cancelled queries. The same can be
interesting for queries that were stopped for any reason - temp file limits
can stop queries after 5 minutes, out of memory, etc.

It is not hard to implement printing the duration for cancelled queries,
but it is impossible to do it for every kind of exception. But there is a
way: we can use the log_line_prefix space. Now there is no possibility to
print the duration there - so we can introduce a new symbol %D as duration.
The same technique I would use for printing lock time - it can be printed
via the symbol %L.

So a log entry should look like:

timestamp, duration, locktime ERROR: query was cancelled

Anybody can activate or deactivate these features by using %D or %L in
log_line_prefix.
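
For illustration, with the proposed escapes (again, %D and %L do not exist
today) a configuration and the resulting entry for a cancelled query might
look like:

log_line_prefix = '%t %D %L '
2014-04-01 12:00:01 CEST 343.000 12.500 ERROR:  canceling statement due to user request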

Comments, notes?

Regards

Pavel





Regards

 Pavel



 Do you think these patches can be generally useful?

 Regards

 Pavel





Re: [HACKERS] four minor proposals for 9.5

2014-03-27 Thread Pavel Stehule
Hello


2014-03-20 1:28 GMT+01:00 Josh Berkus j...@agliodbs.com:

 Pavel,

  I wrote a few patches that we use in our production. These patches are
  small, but I hope they can be interesting for upstream:
 
  1. cancel time - we log the execution time of cancelled statements

 Manually cancelled?  statement_timeout?

 Anyway, +1 to add the time to the existing log message, but not in an
 additional log line.


Probably it will be possible.



 BTW, what do folks think of the idea of adding a new log column called
 timing, which would record duration etc.?  It would be really nice not
 to have to parse this out of the text message all the time ...

  2. fatal verbose - this patch ensures a verbose log for fatal errors. It
  simplifies the investigation of the reasons for an error.

 Configurable, or not?

  3. relation limit - the possibility to set a session limit for the
  maximum size of relations. No relation can be extended over this limit
  in a session when this value is higher than zero. Motivation - we use a
  lot of queries like CREATE TABLE AS SELECT .., and some very big
  results decreased disk free space too much. It was a high risk in our
  multi-user environment. The motivation is similar to temp_files_limit.

 I'd think the size of the relation you were creating would be difficult
 to measure.  Also, would this apply to REINDEX/VACUUM FULL/ALTER?  Or
 just CREATE TABLE AS/SELECT INTO?

  4. track statement lock - we are able to track the locking time for a
  query and print this data in the slow query log and auto_explain log.
  It helps us with later slow query log analysis.

 I'm very interested in this.  What does it look like?

 --
 Josh Berkus
 PostgreSQL Experts Inc.
 http://pgexperts.com



Re: [HACKERS] four minor proposals for 9.5

2014-03-27 Thread Pavel Stehule
Hello

After a week, I can evaluate the community's feedback:


2014-03-19 16:34 GMT+01:00 Pavel Stehule pavel.steh...@gmail.com:

 Hello

 I wrote a few patches that we use in our production. These patches are
 small, but I hope they can be interesting for upstream:

 1. cancel time - we log the execution time of cancelled statements


there is interest



 2. fatal verbose - this patch ensures a verbose log for fatal errors. It
 simplifies the investigation of the reasons for an error.


not too much



 3. relation limit - the possibility to set a session limit for the maximum
 size of relations. No relation can be extended over this limit in a session
 when this value is higher than zero. Motivation - we use a lot of queries
 like CREATE TABLE AS SELECT .., and some very big results decreased disk
 free space too much. It was a high risk in our multi-user environment. The
 motivation is similar to temp_files_limit.


there is not much interest



 4. track statement lock - we are able to track the locking time for a
 query and print this data in the slow query log and auto_explain log.
 It helps us with later slow query log analysis.


there is interest

So I'll prepare some prototypes in the next month for

1. log the execution time for cancelled queries,
2. track query lock time

Regards

Pavel



 Do you think these patches can be generally useful?

 Regards

 Pavel



Re: [HACKERS] four minor proposals for 9.5

2014-03-20 Thread Mark Kirkwood

On 20/03/14 13:28, Josh Berkus wrote:


3. relation limit - the possibility to set a session limit for the maximum
size of relations. No relation can be extended over this limit in a session
when this value is higher than zero. Motivation - we use a lot of queries
like CREATE TABLE AS SELECT .., and some very big results decreased disk
free space too much. It was a high risk in our multi-user environment. The
motivation is similar to temp_files_limit.


I'd think the size of the relation you were creating would be difficult
to measure.  Also, would this apply to REINDEX/VACUUM FULL/ALTER?  Or
just CREATE TABLE AS/SELECT INTO?



Also I think this would probably only make sense for TEMPORARY tables - 
otherwise you can get this sort of thing going on:


- you create a table and you have set a relation size limit
- you commit and keep working
- I add a whole lot of rows to your new table (taking it over the limit)
- you go to add some more rows to this table...

Should you now be stopped from working? Does this feature need to track *who* 
added which chunks of a table (suspect very difficult to do sensibly)?


Regards

Mark


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] four minor proposals for 9.5

2014-03-20 Thread Pavel Stehule
2014-03-20 5:36 GMT+01:00 Amit Kapila amit.kapil...@gmail.com:

 On Wed, Mar 19, 2014 at 9:04 PM, Pavel Stehule pavel.steh...@gmail.com
 wrote:
  Hello
 
  I wrote a few patches that we use in our production. These patches are
  small, but I hope they can be interesting for upstream:
 
  1. cancel time - we log the execution time of cancelled statements
 
  2. fatal verbose - this patch ensures a verbose log for fatal errors. It
  simplifies the investigation of the reasons for an error.
 
  3. relation limit - the possibility to set a session limit for the
  maximum size of relations. No relation can be extended over this limit
  in a session when this value is higher than zero. Motivation - we use a
  lot of queries like CREATE TABLE AS SELECT .., and some very big
  results decreased disk free space too much. It was a high risk in our
  multi-user environment. The motivation is similar to temp_files_limit.

 So if there is an error on reaching the max threshold size, won't it lose
 all the data, or is that expected?


An exception is expected.



 Also I think it might not be applicable for all table inserts, as normally
 the checkpointer/bgwriter flushes data, so you might not be able to
 estimate the size immediately while your SQL statement is executing.

 Won't it be better to have LIMIT rows in the SELECT statement, or
 generically have a table-level option for max rows?


It is significantly harder to estimate N for LIMIT - there are a lot of
varlena types, so sometimes you would be too strict, sometimes too
tolerant.




 With Regards,
 Amit Kapila.
 EnterpriseDB: http://www.enterprisedb.com



Re: [HACKERS] four minor proposals for 9.5

2014-03-20 Thread Pavel Stehule
2014-03-20 7:25 GMT+01:00 Mark Kirkwood mark.kirkw...@catalyst.net.nz:

 On 20/03/14 13:28, Josh Berkus wrote:

  3. relation limit - the possibility to set a session limit for the
  maximum size of relations. No relation can be extended over this limit
  in a session when this value is higher than zero. Motivation - we use a
  lot of queries like CREATE TABLE AS SELECT .., and some very big
  results decreased disk free space too much. It was a high risk in our
  multi-user environment. The motivation is similar to temp_files_limit.


 I'd think the size of the relation you were creating would be difficult
 to measure.  Also, would this apply to REINDEX/VACUUM FULL/ALTER?  Or
 just CREATE TABLE AS/SELECT INTO?


 Also I think this would probably only make sense for TEMPORARY tables -
 otherwise you can get this sort of thing going on:

 - you create a table and you have set a relation size limit
 - you commit and keep working
 - I add a whole lot of rows to your new table (taking it over the limit)
 - you go to add some more rows to this table...


You cannot cross the session limit, and it is not important whether you do
the inserts multiple times or once.

This patch is very simple - it is only one safeguard against too little
free space on disc in a very dynamic multi-user environment:

--- ./src/backend/storage/smgr/md.c 2014-02-26 17:29:36.864189192 +0100
***************
*** 27,32 ****
--- 27,33 ----
  #include "storage/bufmgr.h"
  #include "storage/relfilenode.h"
  #include "storage/smgr.h"
+ #include "utils/guc.h"
  #include "utils/hsearch.h"
  #include "utils/memutils.h"
  #include "pg_trace.h"
***************
*** 180,185 ****
--- 181,191 ----
  static BlockNumber _mdnblocks(SMgrRelation reln, ForkNumber forknum,
                                MdfdVec *seg);

+ /*
+  * limit for relation size
+  */
+ int max_blocks;

  /*
   * mdinit() -- Initialize private state for magnetic disk storage manager.
***************
*** 475,480 ****
--- 481,494 ----
      Assert(blocknum >= mdnblocks(reln, forknum));
  #endif

+     if (max_blocks != -1 && blocknum > (BlockNumber) max_blocks)
+         ereport(ERROR,
+                 (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
+                  errmsg("cannot extend file beyond %u blocks",
+                         max_blocks),
+                  errhint("Session file limit defined by "
+                          "\"hard_relation_limit\" (%s) is over.",
+                          GetConfigOptionByName("hard_relation_limit",
+                                                NULL))));
+
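
For illustration, with this patch and a GUC defined along these lines (the
excerpt does not show the GUC registration, so the unit and the mapping to
max_blocks below are assumptions), a session could protect itself like:

SET hard_relation_limit = '10GB';  -- hypothetical setting, mapped to max_blocks
CREATE TABLE big AS SELECT ...;
ERROR:  cannot extend file beyond 1310720 blocks

10GB at the default 8kB block size is 1310720 blocks, which is how
max_blocks would be derived.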





  Should you now be stopped from working? Does this feature need to track *who*
 added which chunks of a table (suspect very difficult to do sensibly)?

 Regards

 Mark



Re: [HACKERS] four minor proposals for 9.5

2014-03-20 Thread Mark Kirkwood

On 20/03/14 20:08, Pavel Stehule wrote:




2014-03-20 7:25 GMT+01:00 Mark Kirkwood mark.kirkw...@catalyst.net.nz
Also I think this would probably only make sense for TEMPORARY
tables - otherwise you can get this sort of thing going on:

- you create a table and you have set a relation size limit
- you commit and keep working
- I add a whole lot of rows to your new table (taking it over the limit)
- you go to add some more rows to this table...


you cannot cross the session limit, and it is not important whether you
do the inserts multiple times or once.



Sorry Pavel - what you have said above is difficult for me to understand 
- if the limit is intended as a *session* limit then concurrent activity 
from multiple sessions makes it behave - well - strangely to say the 
least, as tables are essentially shared resources.


Regards

Mark





Re: [HACKERS] four minor proposals for 9.5

2014-03-20 Thread Pavel Stehule
2014-03-20 9:47 GMT+01:00 Mark Kirkwood mark.kirkw...@catalyst.net.nz:

 On 20/03/14 20:08, Pavel Stehule wrote:




 2014-03-20 7:25 GMT+01:00 Mark Kirkwood mark.kirkw...@catalyst.net.nz
 Also I think this would probably only make sense for TEMPORARY
 tables - otherwise you can get this sort of thing going on:

 - you create a table and you have set a relation size limit
 - you commit and keep working
 - I add a whole lot of rows to your new table (taking it over the
 limit)
 - you go to add some more rows to this table...


  you cannot cross the session limit, and it is not important whether you
  do the inserts multiple times or once.


 Sorry Pavel - what you have said above is difficult for me to understand -
 if the limit is intended as a *session* limit then concurrent activity from
 multiple sessions makes it behave - well - strangely to say the least, as
 tables are essentially shared resources.


I am sorry, I should have explained our use case first. Our product
supports multidimensional modelling - usually we have a few (less than
1000) unlimited user data tables. When a user wants to see some view
(report), our engine generates 10 - 100 queries, and the results of these
queries are stored in tables. The result of one calculation can then be
shared between reports and users. These tables (caches) are semi-temporal
- the life cycle is about an hour, at most days. Some queries in
multidimensional analysis are Cartesian products - we are not able to
estimate the sizes of these tables well - due to the free schema - users
can create their own logical model (users can fill these data freely) -
and the variability of the generated queries is too great.

So we need some safeguards in the background.

Regards

Pavel





 Regards

 Mark




Re: [HACKERS] four minor proposals for 9.5

2014-03-19 Thread Vik Fearing
On 03/19/2014 04:34 PM, Pavel Stehule wrote:
 Hello

 I wrote a few patches that we use in our production. These patches
 are small, but I hope they can be interesting for upstream:

 1. cancel time - we log the execution time of cancelled statements

+1

I even wrote a patch to do this, but it wasn't pretty so I never posted
it.  I would be happy to review yours or revive mine.

 2. fatal verbose - this patch ensures a verbose log for fatal errors.
 It simplifies the investigation of the reasons for an error.

+0

 3. relation limit - the possibility to set a session limit for the
 maximum size of relations. No relation can be extended over this limit
 in a session when this value is higher than zero. Motivation - we use a
 lot of queries like CREATE TABLE AS SELECT .., and some very big results
 decreased disk free space too much. It was a high risk in our multi-user
 environment. The motivation is similar to temp_files_limit.

-1

Also, I can't find any temp_files_limit setting anywhere.

 4. track statement lock - we are able to track the locking time for a
 query and print this data in the slow query log and auto_explain log.
 It helps us with later slow query log analysis.

-0

 Do you think these patches can be generally useful?

Yes and no, as scored above.

-- 
Vik





Re: [HACKERS] four minor proposals for 9.5

2014-03-19 Thread Pavel Stehule
2014-03-19 23:31 GMT+01:00 Vik Fearing vik.fear...@dalibo.com:

 On 03/19/2014 04:34 PM, Pavel Stehule wrote:
  Hello
 
  I wrote a few patches that we use in our production. These patches
  are small, but I hope they can be interesting for upstream:
 
  1. cancel time - we log the execution time of cancelled statements

 +1

 I even wrote a patch to do this, but it wasn't pretty so I never posted
 it.  I would be happy to review yours or revive mine.

  2. fatal verbose - this patch ensures a verbose log for fatal errors.
  It simplifies the investigation of the reasons for an error.

 +0

  3. relation limit - the possibility to set a session limit for the
  maximum size of relations. No relation can be extended over this limit
  in a session when this value is higher than zero. Motivation - we use a
  lot of queries like CREATE TABLE AS SELECT .., and some very big
  results decreased disk free space too much. It was a high risk in our
  multi-user environment. The motivation is similar to temp_files_limit.

 -1

 Also, I can't find any temp_files_limit setting anywhere.


Any of our servers serves about 100 users per second. We have a system
that generates relatively complex queries, where some Cartesian products
are valid (although we are not happy about it). This limit removes a few
(but real) critical issues, when we didn't have enough free space on disc.
After implementing this limit we are able to ensure safe long-term
production with about 20% free space. In a multi-user environment it is
safer to kill some queries than to lose all your production.



  4. track statement lock - we are able to track the locking time for a
  query and print this data in the slow query log and auto_explain log.
  It helps us with later slow query log analysis.

 -0

  Do you think these patches can be generally useful?

 Yes and no, as scored above.

 --
 Vik




Re: [HACKERS] four minor proposals for 9.5

2014-03-19 Thread Josh Berkus
Pavel,

 I wrote a few patches that we use in our production. These patches are
 small, but I hope they can be interesting for upstream:
 
 1. cancel time - we log the execution time of cancelled statements

Manually cancelled?  statement_timeout?

Anyway, +1 to add the time to the existing log message, but not in an
additional log line.

BTW, what do folks think of the idea of adding a new log column called
timing, which would record duration etc.?  It would be really nice not
to have to parse this out of the text message all the time ...

 2. fatal verbose - this patch ensures a verbose log for fatal errors. It
 simplifies the investigation of the reasons for an error.

Configurable, or not?

 3. relation limit - the possibility to set a session limit for the maximum
 size of relations. No relation can be extended over this limit in a session
 when this value is higher than zero. Motivation - we use a lot of queries
 like CREATE TABLE AS SELECT .., and some very big results decreased disk
 free space too much. It was a high risk in our multi-user environment. The
 motivation is similar to temp_files_limit.

I'd think the size of the relation you were creating would be difficult
to measure.  Also, would this apply to REINDEX/VACUUM FULL/ALTER?  Or
just CREATE TABLE AS/SELECT INTO?

 4. track statement lock - we are able to track the locking time for a query
 and print this data in the slow query log and auto_explain log. It helps us
 with later slow query log analysis.

I'm very interested in this.  What does it look like?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] four minor proposals for 9.5

2014-03-19 Thread Pavel Stehule
2014-03-20 1:28 GMT+01:00 Josh Berkus j...@agliodbs.com:

 Pavel,

  I wrote a few patches that we use in our production. These patches are
  small, but I hope they can be interesting for upstream:
 
  1. cancel time - we log the execution time of cancelled statements

 Manually cancelled?  statement_timeout?


Manually cancelled - we have more levels - user cancel at 3..10 minutes,
query executor timeout at 20 minutes, and a hard-core statement limit at
25 minutes.

Logging the execution time helps us identify the reason for the cancel,
and helps us identify impatient users (and where the limit is). It is the
same as query time, when you log it.



 Anyway, +1 to add the time to the existing log message, but not in an
 additional log line.

 BTW, what do folks think of the idea of adding a new log column called
 timing, which would record duration etc.?  It would be really nice not
 to have to parse this out of the text message all the time ...

  2. fatal verbose - this patch ensures a verbose log for fatal errors. It
  simplifies the investigation of the reasons for an error.

 Configurable, or not?


We have no configuration for it, but it should naturally be configurable.
The main motivation for this feature was the same message appearing for
multiple errors - the verbose fatal level helps us identify the source.
But we cannot enable the verbose level as the default due to log size and
log overhead.



  3. relation limit - the possibility to set a session limit for the
  maximum size of relations. No relation can be extended over this limit
  in a session when this value is higher than zero. Motivation - we use a
  lot of queries like CREATE TABLE AS SELECT .., and some very big
  results decreased disk free space too much. It was a high risk in our
  multi-user environment. The motivation is similar to temp_files_limit.

 I'd think the size of the relation you were creating would be difficult
 to measure.  Also, would this apply to REINDEX/VACUUM FULL/ALTER?  Or
 just CREATE TABLE AS/SELECT INTO?


It was only a relation limit, without indexes or anything else. It is just
an early stop for queries where statement_timeout is too late (and the
space allocated on disc is too large). Our statement limit is 20 minutes,
and in that time a query can create a table of about 100GB - only ten
unlimited users would be enough to take our full free space on the Amazon
disc.



  4. track statement lock - we are able to track the locking time for a
  query and print this data in the slow query log and auto_explain log.
  It helps us with later slow query log analysis.

 I'm very interested in this.  What does it look like?


We divided locks into three kinds (levels): tables, tuples, and others. As
a result we print three numbers for any SQL statement - waiting on table
locks, waiting on tuple locks, and waiting on other locks (page extension
locks and similar). I don't remember that we used any info at this level
of detail, but it is interesting for slow queries. You don't spend time
analysing mysteriously fast/slow queries - you see clearly that the
problem was in locks.
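
For illustration only, a slow-query log entry carrying those three numbers
could look like this (the format here is invented, not from the patch):

LOG:  duration: 15000.000 ms  lock waits: table=9500.000 ms
tuple=1200.000 ms other=0.000 ms  statement: update orders set ...

A statement whose lock waits account for most of its duration is then
immediately distinguishable from one that is genuinely slow to execute.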

Regards

Pavel



 --
 Josh Berkus
 PostgreSQL Experts Inc.
 http://pgexperts.com



Re: [HACKERS] four minor proposals for 9.5

2014-03-19 Thread Amit Kapila
On Wed, Mar 19, 2014 at 9:04 PM, Pavel Stehule pavel.steh...@gmail.com wrote:
 Hello

 I wrote a few patches that we use in our production. These patches are
 small, but I hope they can be interesting for upstream:

 1. cancel time - we log the execution time of cancelled statements

 2. fatal verbose - this patch ensures a verbose log for fatal errors. It
 simplifies the investigation of the reasons for an error.

 3. relation limit - the possibility to set a session limit for the maximum
 size of relations. No relation can be extended over this limit in a session
 when this value is higher than zero. Motivation - we use a lot of queries
 like CREATE TABLE AS SELECT .., and some very big results decreased disk
 free space too much. It was a high risk in our multi-user environment. The
 motivation is similar to temp_files_limit.

So if there is an error on reaching the max threshold size, won't it lose
all the data, or is that expected?

Also I think it might not be applicable for all table inserts, as normally
the checkpointer/bgwriter flushes data, so you might not be able to
estimate the size immediately while your SQL statement is executing.

Won't it be better to have LIMIT rows in the SELECT statement, or
generically have a table-level option for max rows?


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

