Re: idea: log_statement_sample_rate - bottom limit for sampling

2019-11-06 Thread Tomas Vondra

On Wed, Nov 06, 2019 at 08:00:57PM +0100, Adrien Nayrat wrote:

On 11/6/19 7:21 PM, Tomas Vondra wrote:

I've pushed the reworked version of log_statement_sample_rate patch [1].
If I understand correctly, that makes this patch unnecessary, and we
should mark it as rejected. Or do we still need it?


Yes, the goal of this patch was to disable sampling and log all queries whose
duration exceed  log_statement_sample_limit.

For now it is possible just with log_min_duration_statement which log all
queries whose duration exceed it.



OK, I've marked it as rejected. If someone thinks we should still have
something like it, please submit a patch implementing it.

regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 





Re: idea: log_statement_sample_rate - bottom limit for sampling

2019-11-06 Thread Adrien Nayrat
On 11/6/19 7:21 PM, Tomas Vondra wrote:
> I've pushed the reworked version of log_statement_sample_rate patch [1].
> If I understand correctly, that makes this patch unnecessary, and we
> should mark it as rejected. Or do we still need it?

Yes, the goal of this patch was to disable sampling and log all queries whose
duration exceed  log_statement_sample_limit.

For now it is possible just with log_min_duration_statement which log all
queries whose duration exceed it.

-- 
Adrien





Re: idea: log_statement_sample_rate - bottom limit for sampling

2019-11-06 Thread Tomas Vondra

On Sun, Aug 04, 2019 at 11:41:54PM +0200, Tomas Vondra wrote:

On Sun, Aug 04, 2019 at 10:48:48PM +0200, Tomas Vondra wrote:

On Sun, Aug 04, 2019 at 04:25:12PM -0400, Tom Lane wrote:

Tomas Vondra  writes:

On Sun, Aug 04, 2019 at 03:16:12PM -0400, Tom Lane wrote:

Isn't the issue here the interaction between log_transaction_sample_rate
and log_min_duration_statement?



No, that interaction only affects statement-level sampling.


OK, I was confusing the features.


For transaction-level sampling we do the sampling independently of the
statement duration, i.e. we when starting a transaction we determine
whether the whole transaction will be sampled. It has nothing to do with
the proposed log_statement_sample_limit.


So, to clarify: our plan is that a given statement will be logged
if any of these various partial-logging features says to do so?



Yes, I think that's the expected behavior.

- did it exceed log_min_duration_statement? -> log it
- is it part of sampled xact? -> log it
- maybe sample the statement (to be reverted / reimplemented)


(And the knock on HEAD's behavior is exactly that it breaks that
independence for log_min_duration_statement.)



Yeah. There's no way to use sampling, while ensure logging of all
queries longer than some limit.



FWIW I've reverted the log_statement_sample_rate (both from master and
REL_12_STABLE). May the buildfarm be merciful to me.

I've left the log_transaction_sample_rate in, as that seems unaffected
by this discussion.



I've pushed the reworked version of log_statement_sample_rate patch [1].
If I understand correctly, that makes this patch unnecessary, and we
should mark it as rejected. Or do we still need it?

regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 





Re: idea: log_statement_sample_rate - bottom limit for sampling

2019-08-04 Thread Tomas Vondra

On Sun, Aug 04, 2019 at 10:48:48PM +0200, Tomas Vondra wrote:

On Sun, Aug 04, 2019 at 04:25:12PM -0400, Tom Lane wrote:

Tomas Vondra  writes:

On Sun, Aug 04, 2019 at 03:16:12PM -0400, Tom Lane wrote:

Isn't the issue here the interaction between log_transaction_sample_rate
and log_min_duration_statement?



No, that interaction only affects statement-level sampling.


OK, I was confusing the features.


For transaction-level sampling we do the sampling independently of the
statement duration, i.e. we when starting a transaction we determine
whether the whole transaction will be sampled. It has nothing to do with
the proposed log_statement_sample_limit.


So, to clarify: our plan is that a given statement will be logged
if any of these various partial-logging features says to do so?



Yes, I think that's the expected behavior.

- did it exceed log_min_duration_statement? -> log it
- is it part of sampled xact? -> log it
- maybe sample the statement (to be reverted / reimplemented)


(And the knock on HEAD's behavior is exactly that it breaks that
independence for log_min_duration_statement.)



Yeah. There's no way to use sampling, while ensure logging of all
queries longer than some limit.



FWIW I've reverted the log_statement_sample_rate (both from master and
REL_12_STABLE). May the buildfarm be merciful to me.

I've left the log_transaction_sample_rate in, as that seems unaffected
by this discussion.


regards


--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 





Re: idea: log_statement_sample_rate - bottom limit for sampling

2019-08-04 Thread Tomas Vondra

On Sun, Aug 04, 2019 at 04:25:12PM -0400, Tom Lane wrote:

Tomas Vondra  writes:

On Sun, Aug 04, 2019 at 03:16:12PM -0400, Tom Lane wrote:

Isn't the issue here the interaction between log_transaction_sample_rate
and log_min_duration_statement?



No, that interaction only affects statement-level sampling.


OK, I was confusing the features.


For transaction-level sampling we do the sampling independently of the
statement duration, i.e. we when starting a transaction we determine
whether the whole transaction will be sampled. It has nothing to do with
the proposed log_statement_sample_limit.


So, to clarify: our plan is that a given statement will be logged
if any of these various partial-logging features says to do so?



Yes, I think that's the expected behavior.

- did it exceed log_min_duration_statement? -> log it
- is it part of sampled xact? -> log it
- maybe sample the statement (to be reverted / reimplemented)


(And the knock on HEAD's behavior is exactly that it breaks that
independence for log_min_duration_statement.)



Yeah. There's no way to use sampling, while ensure logging of all
queries longer than some limit.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 





Re: idea: log_statement_sample_rate - bottom limit for sampling

2019-08-04 Thread Tom Lane
Tomas Vondra  writes:
> On Sun, Aug 04, 2019 at 03:16:12PM -0400, Tom Lane wrote:
>> Isn't the issue here the interaction between log_transaction_sample_rate
>> and log_min_duration_statement?

> No, that interaction only affects statement-level sampling.

OK, I was confusing the features.

> For transaction-level sampling we do the sampling independently of the
> statement duration, i.e. we when starting a transaction we determine
> whether the whole transaction will be sampled. It has nothing to do with
> the proposed log_statement_sample_limit.

So, to clarify: our plan is that a given statement will be logged
if any of these various partial-logging features says to do so?

(And the knock on HEAD's behavior is exactly that it breaks that
independence for log_min_duration_statement.)

regards, tom lane




Re: idea: log_statement_sample_rate - bottom limit for sampling

2019-08-04 Thread Tomas Vondra

On Sun, Aug 04, 2019 at 03:16:12PM -0400, Tom Lane wrote:

Tomas Vondra  writes:

OK, I have the revert ready. The one thing I'm wondering about is
whether we need to revert log_transaction_sample_rate too? I think it's
pretty much independent feature, so I think we can keep that. Opinions?


Isn't the issue here the interaction between log_transaction_sample_rate
and log_min_duration_statement?  Seems like we have that question
regardless of whether log_statement_sample_limit exists.



No, that interaction only affects statement-level sampling.

For transaction-level sampling we do the sampling independently of the
statement duration, i.e. we when starting a transaction we determine
whether the whole transaction will be sampled. It has nothing to do with
the proposed log_statement_sample_limit.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 





Re: idea: log_statement_sample_rate - bottom limit for sampling

2019-08-04 Thread Tom Lane
Tomas Vondra  writes:
> OK, I have the revert ready. The one thing I'm wondering about is
> whether we need to revert log_transaction_sample_rate too? I think it's
> pretty much independent feature, so I think we can keep that. Opinions?

Isn't the issue here the interaction between log_transaction_sample_rate
and log_min_duration_statement?  Seems like we have that question
regardless of whether log_statement_sample_limit exists.

regards, tom lane




Re: idea: log_statement_sample_rate - bottom limit for sampling

2019-08-04 Thread Tomas Vondra

On Fri, Aug 02, 2019 at 09:53:40AM +0200, Adrien Nayrat wrote:

On 8/1/19 12:04 PM, Tomas Vondra wrote:

On Thu, Aug 01, 2019 at 11:47:46AM +0200, Adrien Nayrat wrote:

Hi,

As we are at the end of this CF and there is still discussions about whether we
should revert log_statement_sample_limit and log_statement_sample_rate, or try
to fix it in v12.
I moved this patch to next commit fest and change status from "ready for
commiter" to "need review". I hope I didn't make a mistake.



Thanks. The RFC status was clearly stale, so thanks for updating. I should
have done that after my review. I think the patch would be moved to the
next CF at the end, but I might be wrong. In any case, I don't think
you've done any mistake.

As for the sampling patch - I think we'll end up reverting the feature for
v12 - it's far too late to rework it at this point. Sorry about that, I
know it's not a warm feeling when you get something done, and then it gets
reverted on the last minute. :-(



Don't worry, I understand. It is better to add straigforward GUC in v13 than
confusionning in v12 we will regret.



OK, I have the revert ready. The one thing I'm wondering about is
whether we need to revert log_transaction_sample_rate too? I think it's
pretty much independent feature, so I think we can keep that. Opinions?

regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 





Re: idea: log_statement_sample_rate - bottom limit for sampling

2019-08-02 Thread Adrien Nayrat
On 8/1/19 12:04 PM, Tomas Vondra wrote:
> On Thu, Aug 01, 2019 at 11:47:46AM +0200, Adrien Nayrat wrote:
>> Hi,
>>
>> As we are at the end of this CF and there is still discussions about whether 
>> we
>> should revert log_statement_sample_limit and log_statement_sample_rate, or 
>> try
>> to fix it in v12.
>> I moved this patch to next commit fest and change status from "ready for
>> commiter" to "need review". I hope I didn't make a mistake.
>>
> 
> Thanks. The RFC status was clearly stale, so thanks for updating. I should
> have done that after my review. I think the patch would be moved to the
> next CF at the end, but I might be wrong. In any case, I don't think
> you've done any mistake.
> 
> As for the sampling patch - I think we'll end up reverting the feature for
> v12 - it's far too late to rework it at this point. Sorry about that, I
> know it's not a warm feeling when you get something done, and then it gets
> reverted on the last minute. :-(
> 

Don't worry, I understand. It is better to add straigforward GUC in v13 than
confusionning in v12 we will regret.






Re: idea: log_statement_sample_rate - bottom limit for sampling

2019-08-01 Thread Tomas Vondra

On Thu, Aug 01, 2019 at 11:47:46AM +0200, Adrien Nayrat wrote:

Hi,

As we are at the end of this CF and there is still discussions about whether we
should revert log_statement_sample_limit and log_statement_sample_rate, or try
to fix it in v12.
I moved this patch to next commit fest and change status from "ready for
commiter" to "need review". I hope I didn't make a mistake.



Thanks. The RFC status was clearly stale, so thanks for updating. I should
have done that after my review. I think the patch would be moved to the
next CF at the end, but I might be wrong. In any case, I don't think
you've done any mistake.

As for the sampling patch - I think we'll end up reverting the feature for
v12 - it's far too late to rework it at this point. Sorry about that, I
know it's not a warm feeling when you get something done, and then it gets
reverted on the last minute. :-(

regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Re: idea: log_statement_sample_rate - bottom limit for sampling

2019-08-01 Thread Adrien Nayrat
Hi,

As we are at the end of this CF and there is still discussions about whether we
should revert log_statement_sample_limit and log_statement_sample_rate, or try
to fix it in v12.
I moved this patch to next commit fest and change status from "ready for
commiter" to "need review". I hope I didn't make a mistake.

Best regards,



signature.asc
Description: OpenPGP digital signature


Re: idea: log_statement_sample_rate - bottom limit for sampling

2019-07-31 Thread Adrien Nayrat
On 7/28/19 12:19 AM, Tomas Vondra wrote:
> Hi,
> 
> I've started reviewing this patch, thinking that maybe I could get it
> committed as it's marked as RFC. In general I agree with having this
> fuature, but I think we need to rethink the GUC because the current
> approach is just confusing.
> 
> The way the current patch works is that we have three GUCs:
> 
>  log_min_duration_statement
>  log_statement_sample_limit
>  log_statement_sample_rate
> 
> and it essentially works like this:
> 
> - If the duration exceeds log_min_duration_statement, we start sampling
>  the commands with log_statement_sample rate.
> 
> - If the duration exceeds log_statement_sample_limit, we just log the
>  command every time (i.e. we disable sampling, using sample rate 1.0).
> 
> IMO that's bound to be confusing for users, because one threshold
> behaves as minimum while the other behaves as maximum.

I agree, it took me a while to understand how it behave with the three GUC. That
why I tried to enrich documentation, but it may mean that the functionality is
not properly implemented.

> 
> 
> What I think we should do instead is to use two minimum thresholds.
> 
> 1) log_min_duration_sample - enables sampling of commands, using the
> existing GUC log_statement_sample_rate
> 
> 2) log_min_duration_statement - logs all commands exceeding this
> 
> 
> I think this is going to be much easier for users to understand.

+1, I like this idea.

I don't really have an opinion if we have to revert the whole feature or try to
fix it for v12. But it is true it is a late to fix it.

Furthermore, users who really want this feature in v12 can use an extension for
that purpose [1].

1: I made this extension with same kind of features :
https://github.com/anayrat/pg_sampletolog



signature.asc
Description: OpenPGP digital signature


Re: idea: log_statement_sample_rate - bottom limit for sampling

2019-07-30 Thread Michael Paquier
On Tue, Jul 30, 2019 at 03:43:58PM -0400, Tom Lane wrote:
> Well, we do not need to have a backwards-compatibility problem
> here, because we have yet to release a version containing
> log_statement_sample_rate.  I do not think it's too late to decide
> that v12's semantics for that are broken, and either revert that
> patch in v12, or back-patch a fix to make it match this idea.

With my RTM hat on, if we think that the current semantics of
log_statement_sample_rate are broken and need a redesign, then I would
take the safest path and just revert the original patch in v12, and
finally make sure that it brews correctly for v13.  We are in beta2
and close to a beta3, so redesigning things at this stage on a stable
branch sounds wrong.
--
Michael


signature.asc
Description: PGP signature


Re: idea: log_statement_sample_rate - bottom limit for sampling

2019-07-30 Thread Tomas Vondra

On Tue, Jul 30, 2019 at 03:43:58PM -0400, Tom Lane wrote:

Tomas Vondra  writes:

I've started reviewing this patch, thinking that maybe I could get it
committed as it's marked as RFC. In general I agree with having this
fuature, but I think we need to rethink the GUC because the current
approach is just confusing.
...
What I think we should do instead is to use two minimum thresholds.
1) log_min_duration_sample - enables sampling of commands, using the
existing GUC log_statement_sample_rate
2) log_min_duration_statement - logs all commands exceeding this
I think this is going to be much easier for users to understand.


I agree with Tomas' idea.


The one difference between those approaches is in how they work with
existing current settings. That is, let's say you have
  log_min_duration_statement = 1000
  log_statement_sample_rate = 0.01
then no queries below 1000ms will be logged, and 1% of longer queries
will be sampled. And with the original config (as proposed in v3 of the
patch), this would still work the same way.
With the new approach (two min thresholds) it'd behave differently,
because we'd log *all* queries longer than 1000ms (not just 1%). And
whether we'd sample any queries (using log_statement_sample_rate) would
depend on how we'd pick the default value for the other threshold.


Well, we do not need to have a backwards-compatibility problem
here, because we have yet to release a version containing
log_statement_sample_rate.  I do not think it's too late to decide
that v12's semantics for that are broken, and either revert that
patch in v12, or back-patch a fix to make it match this idea.



I'm willing to try fixing this to salvage the feature for v12. The
question is how would that fix look like - IMO we'd need to introduce
the new threshold GUC, essentially implementing what this thread is
about. It's not a complex patch, but it kinda flies in the face of
feature freeze. OTOH if we call it a fix ...

The patch itself is not that complicated - attached is a WIP version,
(particularly) the docs need more work.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index c91e3e1550..8cb695044b 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -5850,8 +5850,7 @@ local0.*/var/log/postgresql
 
  Causes the duration of each completed statement to be logged
  if the statement ran for at least the specified number of
- milliseconds, modulated by 
log_statement_sample_rate.
- Setting this to zero prints all statement durations.
+ milliseconds. Setting this to zero prints all statement durations.
  -1 (the default) disables logging statements due to
  exceeding duration threshold; for example, if you set it to
  250ms, then all SQL statements that run 250ms or
@@ -5882,6 +5881,48 @@ local0.*/var/log/postgresql

   
 
+ 
+  log_min_duration_sample (integer)
+  
+   log_min_duration_sample configuration 
parameter
+  
+  
+   
+
+ Causes the duration of each completed statement to be logged
+ if the statement ran for at least the specified number of
+ milliseconds, modulated by 
log_statement_sample_rate.
+ Setting this to zero samples all statement durations.
+ -1 (the default) disables sampling of statements;
+ for example, if you set it to 250ms, then all
+ SQL statements that run 250ms or longer will be sampled for
+ logging. Statements with duration exceeding
+ log_min_duration_statement are not subject to
+ sampling and are logged every time.
+
+
+
+ For clients using extended query protocol, durations of the Parse,
+ Bind, and Execute steps are logged independently.
+
+
+   
+
+ When using this option together with
+ ,
+ the text of statements that are logged because of
+ log_statement will not be repeated in the
+ duration log message.
+ If you are not using syslog, it is 
recommended
+ that you log the PID or session ID using
+ 
+ so that you can link the statement message to the later
+ duration message using the process ID or session ID.
+
+   
+   
+  
+
  
   log_statement_sample_rate (real)
   
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index a6505c7335..ae1def0f37 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -2213,7 +2213,8 @@ check_log_statement(List *stmt_list)
 /*
  * check_log_duration
  * Determine whether current command's duration should be logged.
- * If log_statement_sample_rate < 1.0, log only a sample.
+ * If log_statement_sample_rate < 1.0, 

Re: idea: log_statement_sample_rate - bottom limit for sampling

2019-07-30 Thread Tom Lane
Tomas Vondra  writes:
> I've started reviewing this patch, thinking that maybe I could get it
> committed as it's marked as RFC. In general I agree with having this
> fuature, but I think we need to rethink the GUC because the current
> approach is just confusing.
> ...
> What I think we should do instead is to use two minimum thresholds.
> 1) log_min_duration_sample - enables sampling of commands, using the
> existing GUC log_statement_sample_rate
> 2) log_min_duration_statement - logs all commands exceeding this
> I think this is going to be much easier for users to understand.

I agree with Tomas' idea.

> The one difference between those approaches is in how they work with
> existing current settings. That is, let's say you have
>   log_min_duration_statement = 1000
>   log_statement_sample_rate = 0.01
> then no queries below 1000ms will be logged, and 1% of longer queries
> will be sampled. And with the original config (as proposed in v3 of the
> patch), this would still work the same way.
> With the new approach (two min thresholds) it'd behave differently,
> because we'd log *all* queries longer than 1000ms (not just 1%). And
> whether we'd sample any queries (using log_statement_sample_rate) would
> depend on how we'd pick the default value for the other threshold.

Well, we do not need to have a backwards-compatibility problem
here, because we have yet to release a version containing
log_statement_sample_rate.  I do not think it's too late to decide
that v12's semantics for that are broken, and either revert that
patch in v12, or back-patch a fix to make it match this idea.

regards, tom lane




Re: idea: log_statement_sample_rate - bottom limit for sampling

2019-07-27 Thread Tomas Vondra

Hi,

I've started reviewing this patch, thinking that maybe I could get it
committed as it's marked as RFC. In general I agree with having this
fuature, but I think we need to rethink the GUC because the current
approach is just confusing.

The way the current patch works is that we have three GUCs:

 log_min_duration_statement
 log_statement_sample_limit
 log_statement_sample_rate

and it essentially works like this:

- If the duration exceeds log_min_duration_statement, we start sampling
 the commands with log_statement_sample rate.

- If the duration exceeds log_statement_sample_limit, we just log the
 command every time (i.e. we disable sampling, using sample rate 1.0).

IMO that's bound to be confusing for users, because one threshold
behaves as minimum while the other behaves as maximum.


What I think we should do instead is to use two minimum thresholds.

1) log_min_duration_sample - enables sampling of commands, using the
existing GUC log_statement_sample_rate

2) log_min_duration_statement - logs all commands exceeding this


I think this is going to be much easier for users to understand.


The one difference between those approaches is in how they work with
existing current settings. That is, let's say you have

 log_min_duration_statement = 1000
 log_statement_sample_rate = 0.01

then no queries below 1000ms will be logged, and 1% of longer queries
will be sampled. And with the original config (as proposed in v3 of the
patch), this would still work the same way.

With the new approach (two min thresholds) it'd behave differently,
because we'd log *all* queries longer than 1000ms (not just 1%). And
whether we'd sample any queries (using log_statement_sample_rate) would
depend on how we'd pick the default value for the other threshold.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 





Re: idea: log_statement_sample_rate - bottom limit for sampling

2019-07-12 Thread Pavel Stehule
Hi

pá 12. 7. 2019 v 13:07 odesílatel Adrien Nayrat 
napsal:

> The following review has been posted through the commitfest application:
> make installcheck-world:  tested, passed
> Implements feature:   tested, passed
> Spec compliant:   tested, passed
> Documentation:tested, passed
>
> Sorry, I forgot to tick "passed" boxes.


Thank you

Pavel


Re: idea: log_statement_sample_rate - bottom limit for sampling

2019-07-12 Thread Adrien Nayrat
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   tested, passed
Documentation:tested, passed

Sorry, I forgot to tick "passed" boxes.

Re: idea: log_statement_sample_rate - bottom limit for sampling

2019-07-12 Thread Adrien Nayrat
The following review has been posted through the commitfest application:
make installcheck-world:  tested, failed
Implements feature:   tested, failed
Spec compliant:   tested, failed
Documentation:tested, failed

I test the latest patch attached to this thread 
(log_statement_sample_limit-3.patch). Everything looks good to me.

The new status of this patch is: Ready for Committer


Re: idea: log_statement_sample_rate - bottom limit for sampling

2019-06-19 Thread Pavel Stehule
st 19. 6. 2019 v 10:49 odesílatel Adrien Nayrat 
napsal:

> On 6/18/19 8:29 PM, Pavel Stehule wrote:
> >
> >
> > út 18. 6. 2019 v 14:03 odesílatel Adrien Nayrat <
> adrien.nay...@anayrat.info
> > > napsal:
> >
> > Hi,
> >
> > I tried the patch, here my comment:
> >
> > > gettext_noop("Zero effective disables sampling. "
> > >  "-1 use sampling every time (without
> limit)."),
> >
> > I do not agree with the zero case. In fact, sampling is disabled as
> soon as
> > setting is less than log_min_duration_statements. Furthermore, I
> think we should
> > provide a more straightforward description for users.
> >
> >
> > You have true, but I have not a idea,how to describe it in one line. In
> this
> > case the zero is corner case, and sampling is disabled without
> dependency on
> > log_min_duration_statement.
> >
> > I think so this design has only few useful values and ranges
> >
> > a) higher than log_min_duration_statement .. sampling is active with
> limit
> > b) 0 .. for this case - other way how to effective disable sampling - no
> > dependency on other
> > c) -1 or negative value - sampling is allowed every time.
> >
> > Sure, there is range (0..log_min_duration_statement), but for this range
> this
> > value has not sense. I think so this case cannot be mentioned in short
> > description. But it should be mentioned in documentation.
>
> Yes, it took me a while to understand :) I am ok to keep simple in GUC
> description and give more information in documentation.
>

Maybe some like. "The zero block sampling. Negative value forces sampling
without limit"


> >
> >
> > I changed few comments and documentation:
> >
> >   * As we added much more logic in this function with statement and
> transaction
> > sampling. And now with statement_sample_rate, it is not easy to
> understand the
> > logic on first look. I reword comment in check_log_duration, I hope
> it is more
> > straightforward.
> >
> >   * I am not sure if "every_time" is a good naming for the variable.
> In fact, if
> > duration exceeds limit we disable sampling. Maybe sampling_disabled
> is more
> > clear?
> >
> >
> > For me important is following line
> >
> > (exceeded && (in_sample || every_time))
> >
> > I think so "every_time" or "always" or "every" is in this context more
> > illustrative than "sampling_disabled". But my opinion is not strong in
> this
> > case, and I have not a problem accept common opinion.
>
> Oh, yes, that's correct. I do not have a strong opinion too. Maybe someone
> else
> will have better idea.
>

the naming in this case is not hard issue, and comitter  can decide.

Regards

Pavel

>
> --
> Adrien
>
>


Re: idea: log_statement_sample_rate - bottom limit for sampling

2019-06-19 Thread Adrien Nayrat
On 6/18/19 8:29 PM, Pavel Stehule wrote:
> 
> 
> út 18. 6. 2019 v 14:03 odesílatel Adrien Nayrat  > napsal:
> 
> Hi,
> 
> I tried the patch, here my comment:
> 
> > gettext_noop("Zero effective disables sampling. "
> >                          "-1 use sampling every time (without limit)."),
> 
> I do not agree with the zero case. In fact, sampling is disabled as soon 
> as
> setting is less than log_min_duration_statements. Furthermore, I think we 
> should
> provide a more straightforward description for users.
> 
> 
> You have true, but I have not a idea,how to describe it in one line. In this
> case the zero is corner case, and sampling is disabled without dependency on
> log_min_duration_statement.
>  
> I think so this design has only few useful values and ranges
> 
> a) higher than log_min_duration_statement .. sampling is active with limit
> b) 0 .. for this case - other way how to effective disable sampling - no
> dependency on other
> c) -1 or negative value - sampling is allowed every time.
> 
> Sure, there is range (0..log_min_duration_statement), but for this range this
> value has not sense. I think so this case cannot be mentioned in short
> description. But it should be mentioned in documentation.

Yes, it took me a while to understand :) I am ok to keep simple in GUC
description and give more information in documentation.

> 
> 
> I changed few comments and documentation:
> 
>   * As we added much more logic in this function with statement and 
> transaction
> sampling. And now with statement_sample_rate, it is not easy to 
> understand the
> logic on first look. I reword comment in check_log_duration, I hope it is 
> more
> straightforward.
> 
>   * I am not sure if "every_time" is a good naming for the variable. In 
> fact, if
> duration exceeds limit we disable sampling. Maybe sampling_disabled is 
> more
> clear?
> 
> 
> For me important is following line
> 
> (exceeded && (in_sample || every_time))
> 
> I think so "every_time" or "always" or "every" is in this context more
> illustrative than "sampling_disabled". But my opinion is not strong in this
> case, and I have not a problem accept common opinion.

Oh, yes, that's correct. I do not have a strong opinion too. Maybe someone else
will have better idea.

-- 
Adrien



signature.asc
Description: OpenPGP digital signature


Re: idea: log_statement_sample_rate - bottom limit for sampling

2019-06-18 Thread Pavel Stehule
út 18. 6. 2019 v 14:03 odesílatel Adrien Nayrat 
napsal:

> Hi,
>
> I tried the patch, here my comment:
>
> > gettext_noop("Zero effective disables sampling. "
> >  "-1 use sampling every time (without limit)."),
>
> I do not agree with the zero case. In fact, sampling is disabled as soon as
> setting is less than log_min_duration_statements. Furthermore, I think we
> should
> provide a more straightforward description for users.
>

You have true, but I have not a idea,how to describe it in one line. In
this case the zero is corner case, and sampling is disabled without
dependency on log_min_duration_statement.

I think so this design has only few useful values and ranges

a) higher than log_min_duration_statement .. sampling is active with limit
b) 0 .. for this case - other way how to effective disable sampling - no
dependency on other
c) -1 or negative value - sampling is allowed every time.

Sure, there is range (0..log_min_duration_statement), but for this range
this value has not sense. I think so this case cannot be mentioned in short
description. But it should be mentioned in documentation.


> I changed few comments and documentation:
>
>   * As we added much more logic in this function with statement and
> transaction
> sampling. And now with statement_sample_rate, it is not easy to understand
> the
> logic on first look. I reword comment in check_log_duration, I hope it is
> more
> straightforward.
>
>   * I am not sure if "every_time" is a good naming for the variable. In
> fact, if
> duration exceeds limit we disable sampling. Maybe sampling_disabled is
> more clear?
>

For me important is following line

(exceeded && (in_sample || every_time))

I think so "every_time" or "always" or "every" is in this context more
illustrative than "sampling_disabled". But my opinion is not strong in this
case, and I have not a problem accept common opinion.


>
>   * I propose to add some words in log_min_duration_statement and
> log_statement_sample_rate documentation.
>
>   * Rephrased log_statement_sample_limit documentation, I hope it help
> understanding.
>
> Patch attached.
>
> Regards,
>
> --
> Adrien
>


Re: idea: log_statement_sample_rate - bottom limit for sampling

2019-06-18 Thread Adrien Nayrat
Hi,

I tried the patch, here my comment:

> gettext_noop("Zero effective disables sampling. "
>  "-1 use sampling every time (without limit)."),

I do not agree with the zero case. In fact, sampling is disabled as soon as
setting is less than log_min_duration_statements. Furthermore, I think we should
provide a more straightforward description for users.

I changed few comments and documentation:

  * As we added much more logic in this function with statement and transaction
sampling. And now with statement_sample_rate, it is not easy to understand the
logic on first look. I reword comment in check_log_duration, I hope it is more
straightforward.

  * I am not sure if "every_time" is a good naming for the variable. In fact, if
duration exceeds limit we disable sampling. Maybe sampling_disabled is more 
clear?

  * I propose to add some words in log_min_duration_statement and
log_statement_sample_rate documentation.

  * Rephrased log_statement_sample_limit documentation, I hope it help
understanding.

Patch attached.

Regards,

-- 
Adrien
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 84341a30e5..ad965a084c 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -5850,7 +5850,8 @@ local0.*/var/log/postgresql
 
  Causes the duration of each completed statement to be logged
  if the statement ran for at least the specified number of
- milliseconds, modulated by log_statement_sample_rate.
+ milliseconds, modulated by log_statement_sample_rate
+ and log_statement_sample_limit.
  Setting this to zero prints all statement durations.
  -1 (the default) disables logging statements due to
  exceeding duration threshold; for example, if you set it to
@@ -5894,6 +5895,8 @@ local0.*/var/log/postgresql
   to be logged.
  The default is 1.0, meaning log all such
  statements.
+ log_statement_sample_limit allows to disable sampling
+ if duration exceeds the limit.
  Setting this to zero disables logging by duration, same as setting
  log_min_duration_statement to
  -1.
@@ -5903,6 +5906,26 @@ local0.*/var/log/postgresql

   
 
+ 
+  log_statement_sample_limit (integer)
+  
+   log_statement_sample_limit configuration parameter
+  
+  
+   
+
+ Disable effect of log_statement_sample_rate when
+ statements last longer than the specified limit in milliseconds.
+ Otherwise, statements that last longer than log_min_duration_statement
+ but less than log_statement_sample_limit are subject
+ to log sampling according to log_statement_sample_rate.
+ When log_statement_sample_limit is less or equal to
+ log_min_duration_statement then log sampling is
+ never applied.
+
+   
+  
+
  
   log_transaction_sample_rate (real)
   
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 44a59e1d4f..50b3f839da 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -2192,10 +2192,15 @@ check_log_statement(List *stmt_list)
 
 /*
  * check_log_duration
- *		Determine whether current command's duration should be logged.
- *		If log_statement_sample_rate < 1.0, log only a sample.
- *		We also check if this statement in this transaction must be logged
- *		(regardless of its duration).
+ *
+ * Determine whether current command's duration should be logged. There is also
+ * sampling mechanism following these rules:
+ *		- If log_statement_sample_rate < 1.0, log only a sample.
+ *		- If log_statement_sample_limit >= 0 and duration exceeds this limit,
+ *		  sampling is ignored, so the command is logged.
+ *
+ * If xact_is_sampled is true (due to log_transaction_sample_rate), it means
+ * the whole transaction is logged (regardless of its duration).
  *
  * Returns:
  *		0 if no logging is needed
@@ -2217,6 +,7 @@ check_log_duration(char *msec_str, bool was_logged)
 		int			usecs;
 		int			msecs;
 		bool		exceeded;
+		bool		every_time;
 		bool		in_sample;
 
 		TimestampDifference(GetCurrentStatementStartTimestamp(),
@@ -2234,6 +2240,14 @@ check_log_duration(char *msec_str, bool was_logged)
 	 (secs > log_min_duration_statement / 1000 ||
 	  secs * 1000 + msecs >= log_min_duration_statement)));
 
+		/*
+		 * When log_statement_sample_limit is exceeded, then query is
+		 * logged every time.
+		 */
+		every_time = (log_statement_sample_limit >= 0 &&
+	   (secs > log_statement_sample_limit / 1000 ||
+		secs * 1000 + msecs >= log_statement_sample_limit));
+
 		/*
 		 * Do not log if log_statement_sample_rate = 0. Log a sample if
 		 * log_statement_sample_rate <= 1 and avoid unnecessary random() call
@@ -2245,7 +2259,7 @@ check_log_duration(char *msec_str, bool was_logged)
 			(log_statement_sample_rate == 1 ||
 			 random() <= 

Re: idea: log_statement_sample_rate - bottom limit for sampling

2019-06-17 Thread Pavel Stehule
Hi

po 17. 6. 2019 v 22:40 odesílatel Pavel Stehule 
napsal:

> Hi
>
> čt 6. 6. 2019 v 10:48 odesílatel Gilles Darold  napsal:
>
>> Le 06/06/2019 à 10:38, Pavel Stehule a écrit :
>> > Hi
>> >
>> > I like the idea of sampling slow statements via
>> > log_statement_sample_rate. But I miss some parameter that can ensure
>> > so every query executed over this limit is logged.
>> >
>> > Can we introduce new option
>> >
>> > log_statement_sampling_limit
>> >
>> > The query with execution time over this limit is logged every time.
>> >
>> > What do you think about this?
>> >
>> > Regards
>> >
>> > Pavel
>>
>>
>> +1, log_min_duration_statement is modulated by log_statement_sample_rate
>> that mean that there is no more way to log all statements over a certain
>> duration limit. log_statement_sampling_limit might probably always be
>> upper than log_min_duration_statement.
>>
>
> Here is a patch
>

I did error in logic - fixed



> Regards
>
> Pavel
>
>
>>
>> --
>> Gilles Darold
>> http://www.darold.net/
>>
>>
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 84341a30e5..690f54f731 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -5903,6 +5903,21 @@ local0.*/var/log/postgresql

   
 
+ 
+  log_statement_sample_limit (integer)
+  
+   log_statement_sample_limit configuration parameter
+  
+  
+   
+
+ Only if query duration is less than this value, the log sampling is active.
+ When log_statement_sample_limit is -1
+ then log sampling is without limit.
+
+   
+  
+
  
   log_transaction_sample_rate (real)
   
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 44a59e1d4f..11ad1dd012 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -2193,7 +2193,9 @@ check_log_statement(List *stmt_list)
 /*
  * check_log_duration
  *		Determine whether current command's duration should be logged.
- *		If log_statement_sample_rate < 1.0, log only a sample.
+ *		If log_statement_sample_rate < 1.0, log only a sample and duration
+ *		is less than log_statement_sample_limit. log_statement_sample_limit
+ *		is used only if this value is different than -1.
  *		We also check if this statement in this transaction must be logged
  *		(regardless of its duration).
  *
@@ -2217,6 +2219,7 @@ check_log_duration(char *msec_str, bool was_logged)
 		int			usecs;
 		int			msecs;
 		bool		exceeded;
+		bool		every_time;
 		bool		in_sample;
 
 		TimestampDifference(GetCurrentStatementStartTimestamp(),
@@ -2234,6 +2237,14 @@ check_log_duration(char *msec_str, bool was_logged)
 	 (secs > log_min_duration_statement / 1000 ||
 	  secs * 1000 + msecs >= log_min_duration_statement)));
 
+		/*
+		 * When log_statement_sample_limit is exceeded, then query is
+		 * logged every time.
+		 */
+		every_time = (log_statement_sample_limit >= 0 &&
+	   (secs > log_statement_sample_limit / 1000 ||
+		secs * 1000 + msecs >= log_statement_sample_limit));
+
 		/*
 		 * Do not log if log_statement_sample_rate = 0. Log a sample if
 		 * log_statement_sample_rate <= 1 and avoid unnecessary random() call
@@ -2245,7 +2256,7 @@ check_log_duration(char *msec_str, bool was_logged)
 			(log_statement_sample_rate == 1 ||
 			 random() <= log_statement_sample_rate * MAX_RANDOM_VALUE);
 
-		if ((exceeded && in_sample) || log_duration || xact_is_sampled)
+		if ((exceeded && (in_sample || every_time)) || log_duration || xact_is_sampled)
 		{
 			snprintf(msec_str, 32, "%ld.%03d",
 	 secs * 1000 + msecs, usecs % 1000);
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 1208eb9a68..b0c7d9df0e 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -510,6 +510,7 @@ int			log_min_error_statement = ERROR;
 int			log_min_messages = WARNING;
 int			client_min_messages = NOTICE;
 int			log_min_duration_statement = -1;
+int			log_statement_sample_limit = -1;
 int			log_temp_files = -1;
 double		log_statement_sample_rate = 1.0;
 double		log_xact_sample_rate = 0;
@@ -2715,6 +2716,19 @@ static struct config_int ConfigureNamesInt[] =
 		NULL, NULL, NULL
 	},
 
+	{
+		{"log_statement_sample_limit", PGC_SUSET, LOGGING_WHEN,
+			gettext_noop("Sets the maximum execution time for sampling "
+		 "logged statements."),
+			gettext_noop("Zero effective disables sampling. "
+		 "-1 use sampling every time (without limit)."),
+			GUC_UNIT_MS
+		},
+		_statement_sample_limit,
+		-1, -1, INT_MAX,
+		NULL, NULL, NULL
+	},
+
 	{
 		{"log_autovacuum_min_duration", PGC_SIGHUP, LOGGING_WHAT,
 			gettext_noop("Sets the minimum execution time above which "
diff --git a/src/include/utils/guc.h b/src/include/utils/guc.h
index e709177c37..8f0c2c59d4 100644
--- a/src/include/utils/guc.h
+++ b/src/include/utils/guc.h
@@ -252,6 +252,7 @@ extern int	log_min_error_statement;
 extern PGDLLIMPORT int log_min_messages;
 extern PGDLLIMPORT int 

Re: idea: log_statement_sample_rate - bottom limit for sampling

2019-06-17 Thread Pavel Stehule
Hi

čt 6. 6. 2019 v 10:48 odesílatel Gilles Darold  napsal:

> Le 06/06/2019 à 10:38, Pavel Stehule a écrit :
> > Hi
> >
> > I like the idea of sampling slow statements via
> > log_statement_sample_rate. But I miss some parameter that can ensure
> > so every query executed over this limit is logged.
> >
> > Can we introduce new option
> >
> > log_statement_sampling_limit
> >
> > The query with execution time over this limit is logged every time.
> >
> > What do you think about this?
> >
> > Regards
> >
> > Pavel
>
>
> +1, log_min_duration_statement is modulated by log_statement_sample_rate
> that mean that there is no more way to log all statements over a certain
> duration limit. log_statement_sampling_limit might probably always be
> upper than log_min_duration_statement.
>

Here is a patch

Regards

Pavel


>
> --
> Gilles Darold
> http://www.darold.net/
>
>
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 84341a30e5..690f54f731 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -5903,6 +5903,21 @@ local0.*/var/log/postgresql

   
 
+ 
+  log_statement_sample_limit (integer)
+  
+   log_statement_sample_limit configuration parameter
+  
+  
+   
+
+ Only if query duration is less than this value, the log sampling is active.
+ When log_statement_sample_limit is -1
+ then log sampling is without limit.
+
+   
+  
+
  
   log_transaction_sample_rate (real)
   
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 44a59e1d4f..2d8cc7d411 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -2193,7 +2193,9 @@ check_log_statement(List *stmt_list)
 /*
  * check_log_duration
  *		Determine whether current command's duration should be logged.
- *		If log_statement_sample_rate < 1.0, log only a sample.
+ *		If log_statement_sample_rate < 1.0, log only a sample and duration
+ *		is less than log_statement_sample_limit. log_statement_sample_limit
+ *		is used only if this value is different than -1.
  *		We also check if this statement in this transaction must be logged
  *		(regardless of its duration).
  *
@@ -2217,6 +2219,7 @@ check_log_duration(char *msec_str, bool was_logged)
 		int			usecs;
 		int			msecs;
 		bool		exceeded;
+		bool		every_time;
 		bool		in_sample;
 
 		TimestampDifference(GetCurrentStatementStartTimestamp(),
@@ -2234,6 +2237,15 @@ check_log_duration(char *msec_str, bool was_logged)
 	 (secs > log_min_duration_statement / 1000 ||
 	  secs * 1000 + msecs >= log_min_duration_statement)));
 
+		/*
+		 * When log_statement_sample_limit is exceeded, then query is
+		 * logged every time.
+		 */
+		every_time = (log_statement_sample_limit == 0 ||
+	  (log_statement_sample_limit > 0 &&
+	   (secs > log_statement_sample_limit / 1000 ||
+		secs * 1000 + msecs >= log_statement_sample_limit)));
+
 		/*
 		 * Do not log if log_statement_sample_rate = 0. Log a sample if
 		 * log_statement_sample_rate <= 1 and avoid unnecessary random() call
@@ -2245,7 +2257,7 @@ check_log_duration(char *msec_str, bool was_logged)
 			(log_statement_sample_rate == 1 ||
 			 random() <= log_statement_sample_rate * MAX_RANDOM_VALUE);
 
-		if ((exceeded && in_sample) || log_duration || xact_is_sampled)
+		if ((exceeded && (in_sample || every_time)) || log_duration || xact_is_sampled)
 		{
 			snprintf(msec_str, 32, "%ld.%03d",
 	 secs * 1000 + msecs, usecs % 1000);
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 1208eb9a68..b0c7d9df0e 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -510,6 +510,7 @@ int			log_min_error_statement = ERROR;
 int			log_min_messages = WARNING;
 int			client_min_messages = NOTICE;
 int			log_min_duration_statement = -1;
+int			log_statement_sample_limit = -1;
 int			log_temp_files = -1;
 double		log_statement_sample_rate = 1.0;
 double		log_xact_sample_rate = 0;
@@ -2715,6 +2716,19 @@ static struct config_int ConfigureNamesInt[] =
 		NULL, NULL, NULL
 	},
 
+	{
+		{"log_statement_sample_limit", PGC_SUSET, LOGGING_WHEN,
+			gettext_noop("Sets the maximum execution time for sampling "
+		 "logged statements."),
+			gettext_noop("Zero effective disables sampling. "
+		 "-1 use sampling every time (without limit)."),
+			GUC_UNIT_MS
+		},
+		_statement_sample_limit,
+		-1, -1, INT_MAX,
+		NULL, NULL, NULL
+	},
+
 	{
 		{"log_autovacuum_min_duration", PGC_SIGHUP, LOGGING_WHAT,
 			gettext_noop("Sets the minimum execution time above which "
diff --git a/src/include/utils/guc.h b/src/include/utils/guc.h
index e709177c37..8f0c2c59d4 100644
--- a/src/include/utils/guc.h
+++ b/src/include/utils/guc.h
@@ -252,6 +252,7 @@ extern int	log_min_error_statement;
 extern PGDLLIMPORT int log_min_messages;
 extern PGDLLIMPORT int client_min_messages;
 extern int	log_min_duration_statement;
+extern int	log_statement_sample_limit;
 

Re: idea: log_statement_sample_rate - bottom limit for sampling

2019-06-06 Thread Gilles Darold
Le 06/06/2019 à 10:38, Pavel Stehule a écrit :
> Hi
>
> I like the idea of sampling slow statements via 
> log_statement_sample_rate. But I miss some parameter that can ensure
> so every query executed over this limit is logged.
>
> Can we introduce new option
>
> log_statement_sampling_limit
>
> The query with execution time over this limit is logged every time.
>
> What do you think about this?
>
> Regards
>
> Pavel


+1, log_min_duration_statement is modulated by log_statement_sample_rate
that mean that there is no more way to log all statements over a certain
duration limit. log_statement_sampling_limit might probably always be
upper than log_min_duration_statement.


-- 
Gilles Darold
http://www.darold.net/