Re: [HACKERS] auto_explain sample rate

2016-03-13 Thread Magnus Hagander
On Sat, Mar 12, 2016 at 3:20 PM, Julien Rouhaud 
wrote:

> On 11/03/2016 17:55, Robert Haas wrote:
> > On Fri, Mar 11, 2016 at 11:33 AM, Tomas Vondra
> >  wrote:
> >> A bit late, but I think we should rename the GUC variable to
> >> "sampling_rate" (instead of sample_ratio) as that's what pgbench uses
> >> for the same thing. That'd be more consistent.
> >
> > I like that idea.  It seems like slightly better terminology.
> >
>
> I like it too. I also just noticed that I duplicated the var type by
> mistake in the documentation :/
>
> Attached patch fixes both.
>
>
I also agree, and thanks for doing the work :) Applied!

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] auto_explain sample rate

2016-03-12 Thread Julien Rouhaud
On 11/03/2016 17:55, Robert Haas wrote:
> On Fri, Mar 11, 2016 at 11:33 AM, Tomas Vondra
>  wrote:
>> A bit late, but I think we should rename the GUC variable to
>> "sampling_rate" (instead of sample_ratio) as that's what pgbench uses
>> for the same thing. That'd be more consistent.
> 
> I like that idea.  It seems like slightly better terminology.
> 

I like it too. I also just noticed that I duplicated the var type by
mistake in the documentation :/

Attached patch fixes both.

-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org
diff --git a/contrib/auto_explain/auto_explain.c 
b/contrib/auto_explain/auto_explain.c
index 76d1831..55529af 100644
--- a/contrib/auto_explain/auto_explain.c
+++ b/contrib/auto_explain/auto_explain.c
@@ -29,7 +29,7 @@ static bool auto_explain_log_triggers = false;
 static bool auto_explain_log_timing = true;
 static int auto_explain_log_format = EXPLAIN_FORMAT_TEXT;
 static bool auto_explain_log_nested_statements = false;
-static double auto_explain_sample_ratio = 1;
+static double auto_explain_sample_rate = 1;
 
 static const struct config_enum_entry format_options[] = {
{"text", EXPLAIN_FORMAT_TEXT, false},
@@ -163,10 +163,10 @@ _PG_init(void)
 NULL,
 NULL);
 
-   DefineCustomRealVariable("auto_explain.sample_ratio",
+   DefineCustomRealVariable("auto_explain.sample_rate",
 "Fraction of queries 
to process.",
NULL,
-   
&auto_explain_sample_ratio,
+   
&auto_explain_sample_rate,
1.0,
0.0,
1.0,
@@ -209,11 +209,11 @@ static void
 explain_ExecutorStart(QueryDesc *queryDesc, int eflags)
 {
/*
-* For ratio sampling, randomly choose top-level statement. Either
+* For rate sampling, randomly choose top-level statement. Either
 * all nested statements will be explained or none will.
 */
if (auto_explain_log_min_duration >= 0 && nesting_level == 0)
-   current_query_sampled = (random() < auto_explain_sample_ratio *
+   current_query_sampled = (random() < auto_explain_sample_rate *
MAX_RANDOM_VALUE);
 
if (auto_explain_enabled() && current_query_sampled)
diff --git a/doc/src/sgml/auto-explain.sgml b/doc/src/sgml/auto-explain.sgml
index 6f1bde0..38e6f50 100644
--- a/doc/src/sgml/auto-explain.sgml
+++ b/doc/src/sgml/auto-explain.sgml
@@ -206,18 +206,17 @@ LOAD 'auto_explain';
 

 
- auto_explain.sample_ratio (real)
+ auto_explain.sample_rate (real)
  
-  auto_explain.sample_ratio configuration 
parameter
+  auto_explain.sample_rate configuration 
parameter
  
 
 
  
-  auto_explain.sample_ratio (floating 
point)
-  causes auto_explain to only explain a fraction of the statements in each
-  session.  The default is 1, meaning explain all the queries.  In case
-  of nested statements, either all will be explained or none. Only
-  superusers can change this setting.
+  auto_explain.sample_rate causes auto_explain to only
+  explain a fraction of the statements in each session.  The default is 1,
+  meaning explain all the queries.  In case of nested statements, either 
all
+  will be explained or none. Only superusers can change this setting.
  
 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] auto_explain sample rate

2016-03-11 Thread Robert Haas
On Fri, Mar 11, 2016 at 11:33 AM, Tomas Vondra
 wrote:
> A bit late, but I think we should rename the GUC variable to
> "sampling_rate" (instead of sample_ratio) as that's what pgbench uses
> for the same thing. That'd be more consistent.

I like that idea.  It seems like slightly better terminology.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] auto_explain sample rate

2016-03-11 Thread Tomas Vondra
Hi,

On Fri, 2016-03-11 at 15:11 +0100, Magnus Hagander wrote:
> 
> > 
> Applied with a minor word-fix in the documentation and removal of
> some unrelated whitespace changes. 

A bit late, but I think we should rename the GUC variable to
"sampling_rate" (instead of sample_ratio) as that's what pgbench uses
for the same thing. That'd be more consistent.

regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] auto_explain sample rate

2016-03-11 Thread Julien Rouhaud
On 11/03/2016 15:11, Magnus Hagander wrote:
> 
> 
> On Fri, Mar 11, 2016 at 3:03 PM, Magnus Hagander  > wrote:
> 
> 
> 
> On Fri, Mar 11, 2016 at 2:51 PM, Julien Rouhaud
> mailto:julien.rouh...@dalibo.com>> wrote:
> 
> On 11/03/2016 11:45, Magnus Hagander wrote:
> >
> > Coming back to the previous discussions about random() - AFAICT this
> > patch will introduce the random() call always (in 
> explain_ExecutorStart):
> >
> > +if (auto_explain_log_min_duration >= 0 && nesting_level == 0)
> > +current_query_sampled = (random() < auto_explain_sample_ratio *
> > +MAX_RANDOM_VALUE);
> >
> >
> > Not sure what the overhead is, but wouldn't it be better to include 
> a
> > check for current_query_sampled>0 in the  if part of that statement?
> > Regardless of performance, that feels cleaner to me. Or am I missing
> > something?
> 
> You mean check for auto_explain_sample_ratio > 0 ?
> 
> 
> I did, but I think what I should have meant is
> auto_explain_sample_ratio < 1.
> 
>  
> 
> There would be a corner case if a query is sampled
> (current_query_sampled is true) and then
> auto_explain_sample_ratio is
> set to 0, all subsequent queries in this backend would be processed.
> 
> 
> There would have to be an else block as well of course, that set it
> back.
>  
>  
> 
> We could add a specific test for this case to spare a random()
> call, but
> I fear it'd be overkill. Maybe it's better to document that the
> good way
> to deactivate auto_explain is to set
> auto_explain.log_min_duration to -1.
> 
> 
> I guess in the end it probably is - a general random() call is
> pretty cheap compared to all the things we're doing. I think my mind
> was stuck in crypto-random which can be more expensive :)
> 
> 
> Applied with a minor word-fix in the documentation and removal of some
> unrelated whitespace changes. 
> 

Thanks!

-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] auto_explain sample rate

2016-03-11 Thread Magnus Hagander
On Fri, Mar 11, 2016 at 3:03 PM, Magnus Hagander 
wrote:

>
>
> On Fri, Mar 11, 2016 at 2:51 PM, Julien Rouhaud  > wrote:
>
>> On 11/03/2016 11:45, Magnus Hagander wrote:
>> >
>> > Coming back to the previous discussions about random() - AFAICT this
>> > patch will introduce the random() call always (in
>> explain_ExecutorStart):
>> >
>> > +if (auto_explain_log_min_duration >= 0 && nesting_level == 0)
>> > +current_query_sampled = (random() < auto_explain_sample_ratio *
>> > +MAX_RANDOM_VALUE);
>> >
>> >
>> > Not sure what the overhead is, but wouldn't it be better to include a
>> > check for current_query_sampled>0 in the  if part of that statement?
>> > Regardless of performance, that feels cleaner to me. Or am I missing
>> > something?
>>
>> You mean check for auto_explain_sample_ratio > 0 ?
>>
>
> I did, but I think what I should have meant is auto_explain_sample_ratio <
> 1.
>
>
>
>> There would be a corner case if a query is sampled
>> (current_query_sampled is true) and then auto_explain_sample_ratio is
>> set to 0, all subsequent queries in this backend would be processed.
>>
>
> There would have to be an else block as well of course, that set it back.
>
>
>
>> We could add a specific test for this case to spare a random() call, but
>> I fear it'd be overkill. Maybe it's better to document that the good way
>> to deactivate auto_explain is to set auto_explain.log_min_duration to -1.
>>
>
> I guess in the end it probably is - a general random() call is pretty
> cheap compared to all the things we're doing. I think my mind was stuck in
> crypto-random which can be more expensive :)
>
>
Applied with a minor word-fix in the documentation and removal of some
unrelated whitespace changes.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] auto_explain sample rate

2016-03-11 Thread Petr Jelinek

On 11/03/16 11:45, Magnus Hagander wrote:

On Thu, Mar 10, 2016 at 10:07 PM, Petr Jelinek mailto:p...@2ndquadrant.com>> wrote:

On 10/03/16 20:59, Julien Rouhaud wrote:

On 10/03/2016 04:37, Petr Jelinek wrote:

On 17/02/16 01:17, Julien Rouhaud wrote:


Agreed, it's too obscure. Attached v4 fixes as you said.


Seems to be simple enough patch and works. However I would like
documentation to say that the range is 0 to 1 and represents
fraction of
the queries sampled, because right now both the GUC
description and the
documentation say it's in percent but that's not really true
as percent
is 0 to 100.


Agreed. v5 attached fixes that.


Great, I will test it once more (just because when I don't bugs
suddenly appear out of nowhere) and mark it ready for committer.


Coming back to the previous discussions about random() - AFAICT this
patch will introduce the random() call always (in explain_ExecutorStart):

+if (auto_explain_log_min_duration >= 0 && nesting_level == 0)
+current_query_sampled = (random() < auto_explain_sample_ratio *
+MAX_RANDOM_VALUE);



No it doesn't as the documented way to turn off auto_explain is to set 
auto_explain_log_min_duration to -1 which is also the default.


In any case the code for that would have to be something like
if (auto_explain_sample_ratio == 0)
 current_query_sampled = false
else if 

Not sure if I consider that cleaner but it would definitely remove the 
call to random() in case user has set auto_explain_log_min_duration to 
something else than -1 and "turned off" the auto_explain by setting 
auto_explain_sample_ratio to 0.


--
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] auto_explain sample rate

2016-03-11 Thread Magnus Hagander
On Fri, Mar 11, 2016 at 2:51 PM, Julien Rouhaud 
wrote:

> On 11/03/2016 11:45, Magnus Hagander wrote:
> >
> > Coming back to the previous discussions about random() - AFAICT this
> > patch will introduce the random() call always (in explain_ExecutorStart):
> >
> > +if (auto_explain_log_min_duration >= 0 && nesting_level == 0)
> > +current_query_sampled = (random() < auto_explain_sample_ratio *
> > +MAX_RANDOM_VALUE);
> >
> >
> > Not sure what the overhead is, but wouldn't it be better to include a
> > check for current_query_sampled>0 in the  if part of that statement?
> > Regardless of performance, that feels cleaner to me. Or am I missing
> > something?
>
> You mean check for auto_explain_sample_ratio > 0 ?
>

I did, but I think what I should have meant is auto_explain_sample_ratio <
1.



> There would be a corner case if a query is sampled
> (current_query_sampled is true) and then auto_explain_sample_ratio is
> set to 0, all subsequent queries in this backend would be processed.
>

There would have to be an else block as well of course, that set it back.



> We could add a specific test for this case to spare a random() call, but
> I fear it'd be overkill. Maybe it's better to document that the good way
> to deactivate auto_explain is to set auto_explain.log_min_duration to -1.
>

I guess in the end it probably is - a general random() call is pretty cheap
compared to all the things we're doing. I think my mind was stuck in
crypto-random which can be more expensive :)


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] auto_explain sample rate

2016-03-11 Thread Julien Rouhaud
On 11/03/2016 11:45, Magnus Hagander wrote:
> 
> Coming back to the previous discussions about random() - AFAICT this
> patch will introduce the random() call always (in explain_ExecutorStart):
> 
> +if (auto_explain_log_min_duration >= 0 && nesting_level == 0)
> +current_query_sampled = (random() < auto_explain_sample_ratio *
> +MAX_RANDOM_VALUE);
> 
> 
> Not sure what the overhead is, but wouldn't it be better to include a
> check for current_query_sampled>0 in the  if part of that statement?
> Regardless of performance, that feels cleaner to me. Or am I missing
> something?

You mean check for auto_explain_sample_ratio > 0 ?

There would be a corner case if a query is sampled
(current_query_sampled is true) and then auto_explain_sample_ratio is
set to 0, all subsequent queries in this backend would be processed.

We could add a specific test for this case to spare a random() call, but
I fear it'd be overkill. Maybe it's better to document that the good way
to deactivate auto_explain is to set auto_explain.log_min_duration to -1.

-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] auto_explain sample rate

2016-03-11 Thread Magnus Hagander
On Thu, Mar 10, 2016 at 10:07 PM, Petr Jelinek  wrote:

> On 10/03/16 20:59, Julien Rouhaud wrote:
>
>> On 10/03/2016 04:37, Petr Jelinek wrote:
>>
>>> On 17/02/16 01:17, Julien Rouhaud wrote:
>>>

 Agreed, it's too obscure. Attached v4 fixes as you said.


>>> Seems to be simple enough patch and works. However I would like
>>> documentation to say that the range is 0 to 1 and represents fraction of
>>> the queries sampled, because right now both the GUC description and the
>>> documentation say it's in percent but that's not really true as percent
>>> is 0 to 100.
>>>
>>>
>> Agreed. v5 attached fixes that.
>>
>>
> Great, I will test it once more (just because when I don't bugs suddenly
> appear out of nowhere) and mark it ready for committer.
>
>
Coming back to the previous discussions about random() - AFAICT this patch
will introduce the random() call always (in explain_ExecutorStart):

+ if (auto_explain_log_min_duration >= 0 && nesting_level == 0)
+ current_query_sampled = (random() < auto_explain_sample_ratio *
+ MAX_RANDOM_VALUE);


Not sure what the overhead is, but wouldn't it be better to include a check
for current_query_sampled>0 in the  if part of that statement? Regardless
of performance, that feels cleaner to me. Or am I missing something?


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] auto_explain sample rate

2016-03-10 Thread Petr Jelinek

On 10/03/16 20:59, Julien Rouhaud wrote:

On 10/03/2016 04:37, Petr Jelinek wrote:

On 17/02/16 01:17, Julien Rouhaud wrote:


Agreed, it's too obscure. Attached v4 fixes as you said.



Seems to be simple enough patch and works. However I would like
documentation to say that the range is 0 to 1 and represents fraction of
the queries sampled, because right now both the GUC description and the
documentation say it's in percent but that's not really true as percent
is 0 to 100.



Agreed. v5 attached fixes that.



Great, I will test it once more (just because when I don't bugs suddenly 
appear out of nowhere) and mark it ready for committer.


--
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] auto_explain sample rate

2016-03-10 Thread Julien Rouhaud
On 10/03/2016 04:37, Petr Jelinek wrote:
> On 17/02/16 01:17, Julien Rouhaud wrote:
>>
>> Agreed, it's too obscure. Attached v4 fixes as you said.
>>
> 
> Seems to be simple enough patch and works. However I would like
> documentation to say that the range is 0 to 1 and represents fraction of
> the queries sampled, because right now both the GUC description and the
> documentation say it's in percent but that's not really true as percent
> is 0 to 100.
> 

Agreed. v5 attached fixes that.

-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org
diff --git a/contrib/auto_explain/auto_explain.c b/contrib/auto_explain/auto_explain.c
index 0950e18..2755fd1 100644
--- a/contrib/auto_explain/auto_explain.c
+++ b/contrib/auto_explain/auto_explain.c
@@ -29,6 +29,7 @@ static bool auto_explain_log_triggers = false;
 static bool auto_explain_log_timing = true;
 static int	auto_explain_log_format = EXPLAIN_FORMAT_TEXT;
 static bool auto_explain_log_nested_statements = false;
+static double auto_explain_sample_ratio = 1;
 
 static const struct config_enum_entry format_options[] = {
 	{"text", EXPLAIN_FORMAT_TEXT, false},
@@ -47,10 +48,14 @@ static ExecutorRun_hook_type prev_ExecutorRun = NULL;
 static ExecutorFinish_hook_type prev_ExecutorFinish = NULL;
 static ExecutorEnd_hook_type prev_ExecutorEnd = NULL;
 
+/* Is the current query sampled, per backend */
+static bool current_query_sampled = true;
+
 #define auto_explain_enabled() \
 	(auto_explain_log_min_duration >= 0 && \
 	 (nesting_level == 0 || auto_explain_log_nested_statements))
 
+
 void		_PG_init(void);
 void		_PG_fini(void);
 
@@ -62,6 +67,7 @@ static void explain_ExecutorFinish(QueryDesc *queryDesc);
 static void explain_ExecutorEnd(QueryDesc *queryDesc);
 
 
+
 /*
  * Module load callback
  */
@@ -159,6 +165,19 @@ _PG_init(void)
 			 NULL,
 			 NULL);
 
+	DefineCustomRealVariable("auto_explain.sample_ratio",
+			 "Fraction of queries to process.",
+			NULL,
+			&auto_explain_sample_ratio,
+			1.0,
+			0.0,
+			1.0,
+			PGC_SUSET,
+			0,
+			NULL,
+			NULL,
+			NULL);
+
 	EmitWarningsOnPlaceholders("auto_explain");
 
 	/* Install hooks. */
@@ -191,7 +210,15 @@ _PG_fini(void)
 static void
 explain_ExecutorStart(QueryDesc *queryDesc, int eflags)
 {
-	if (auto_explain_enabled())
+	/*
+	 * For ratio sampling, randomly choose top-level statement. Either
+	 * all nested statements will be explained or none will.
+	 */
+	if (auto_explain_log_min_duration >= 0 && nesting_level == 0)
+		current_query_sampled = (random() < auto_explain_sample_ratio *
+MAX_RANDOM_VALUE);
+
+	if (auto_explain_enabled() && current_query_sampled)
 	{
 		/* Enable per-node instrumentation iff log_analyze is required. */
 		if (auto_explain_log_analyze && (eflags & EXEC_FLAG_EXPLAIN_ONLY) == 0)
@@ -210,7 +237,7 @@ explain_ExecutorStart(QueryDesc *queryDesc, int eflags)
 	else
 		standard_ExecutorStart(queryDesc, eflags);
 
-	if (auto_explain_enabled())
+	if (auto_explain_enabled() && current_query_sampled)
 	{
 		/*
 		 * Set up to track total elapsed time in ExecutorRun.  Make sure the
@@ -280,7 +307,7 @@ explain_ExecutorFinish(QueryDesc *queryDesc)
 static void
 explain_ExecutorEnd(QueryDesc *queryDesc)
 {
-	if (queryDesc->totaltime && auto_explain_enabled())
+	if (queryDesc->totaltime && auto_explain_enabled() && current_query_sampled)
 	{
 		double		msec;
 
diff --git a/doc/src/sgml/auto-explain.sgml b/doc/src/sgml/auto-explain.sgml
index d527208..978ef35 100644
--- a/doc/src/sgml/auto-explain.sgml
+++ b/doc/src/sgml/auto-explain.sgml
@@ -203,6 +203,24 @@ LOAD 'auto_explain';
  
 

+
+   
+
+ auto_explain.sample_ratio (real)
+ 
+  auto_explain.sample_ratio configuration parameter
+ 
+
+
+ 
+  auto_explain.sample_ratio (floating point)
+  causes auto_explain to only explain a fraction of the statements in each
+  session.  The default is 1, meaning explaining all the queries.  In case
+  of nested statements, either all will be explained or none. Only
+  superusers can change this setting.
+ 
+
+   
   
 
   

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] auto_explain sample rate

2016-03-09 Thread Petr Jelinek

On 17/02/16 01:17, Julien Rouhaud wrote:


Agreed, it's too obscure. Attached v4 fixes as you said.



Seems to be simple enough patch and works. However I would like 
documentation to say that the range is 0 to 1 and represents fraction of 
the queries sampled, because right now both the GUC description and the 
documentation say it's in percent but that's not really true as percent 
is 0 to 100.


--
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] auto_explain sample rate

2016-02-16 Thread Julien Rouhaud
On 16/02/2016 22:51, Alvaro Herrera wrote:
> Julien Rouhaud wrote:
> 
> Hijacking this macro is just too obscure:
> 
>>  #define auto_explain_enabled() \
>>  (auto_explain_log_min_duration >= 0 && \
>> - (nesting_level == 0 || auto_explain_log_nested_statements))
>> + (nesting_level == 0 || auto_explain_log_nested_statements) && \
>> + current_query_sampled)
> 
> because it then becomes hard to figure out that assigning to _sampled is
> what makes the enabled() check pass or not depending on sampling:
> 
>> @@ -191,6 +211,14 @@ _PG_fini(void)
>>  static void
>>  explain_ExecutorStart(QueryDesc *queryDesc, int eflags)
>>  {
>> +/*
>> + * For ratio sampling, randomly choose top-level statement. Either
>> + * all nested statements will be explained or none will.
>> + */
>> +if (auto_explain_log_min_duration >= 0 && nesting_level == 0)
>> +current_query_sampled = (random() < auto_explain_sample_ratio *
>> +MAX_RANDOM_VALUE);
>> +
>>  if (auto_explain_enabled())
>>  {
> 
> I think it's better to keep the "enabled" macro unmodified, and just add
> another conditional to the "if" test there.
> 

Thanks for looking at this!

Agreed, it's too obscure. Attached v4 fixes as you said.

-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org
diff --git a/contrib/auto_explain/auto_explain.c b/contrib/auto_explain/auto_explain.c
index 0950e18..fa564fd 100644
--- a/contrib/auto_explain/auto_explain.c
+++ b/contrib/auto_explain/auto_explain.c
@@ -29,6 +29,7 @@ static bool auto_explain_log_triggers = false;
 static bool auto_explain_log_timing = true;
 static int	auto_explain_log_format = EXPLAIN_FORMAT_TEXT;
 static bool auto_explain_log_nested_statements = false;
+static double auto_explain_sample_ratio = 1;
 
 static const struct config_enum_entry format_options[] = {
 	{"text", EXPLAIN_FORMAT_TEXT, false},
@@ -47,10 +48,14 @@ static ExecutorRun_hook_type prev_ExecutorRun = NULL;
 static ExecutorFinish_hook_type prev_ExecutorFinish = NULL;
 static ExecutorEnd_hook_type prev_ExecutorEnd = NULL;
 
+/* Is the current query sampled, per backend */
+static bool current_query_sampled = true;
+
 #define auto_explain_enabled() \
 	(auto_explain_log_min_duration >= 0 && \
 	 (nesting_level == 0 || auto_explain_log_nested_statements))
 
+
 void		_PG_init(void);
 void		_PG_fini(void);
 
@@ -62,6 +67,7 @@ static void explain_ExecutorFinish(QueryDesc *queryDesc);
 static void explain_ExecutorEnd(QueryDesc *queryDesc);
 
 
+
 /*
  * Module load callback
  */
@@ -159,6 +165,19 @@ _PG_init(void)
 			 NULL,
 			 NULL);
 
+	DefineCustomRealVariable("auto_explain.sample_ratio",
+			 "Percentage of queries to process.",
+			NULL,
+			&auto_explain_sample_ratio,
+			1.0,
+			0.0,
+			1.0,
+			PGC_SUSET,
+			0,
+			NULL,
+			NULL,
+			NULL);
+
 	EmitWarningsOnPlaceholders("auto_explain");
 
 	/* Install hooks. */
@@ -191,7 +210,15 @@ _PG_fini(void)
 static void
 explain_ExecutorStart(QueryDesc *queryDesc, int eflags)
 {
-	if (auto_explain_enabled())
+	/*
+	 * For ratio sampling, randomly choose top-level statement. Either
+	 * all nested statements will be explained or none will.
+	 */
+	if (auto_explain_log_min_duration >= 0 && nesting_level == 0)
+		current_query_sampled = (random() < auto_explain_sample_ratio *
+MAX_RANDOM_VALUE);
+
+	if (auto_explain_enabled() && current_query_sampled)
 	{
 		/* Enable per-node instrumentation iff log_analyze is required. */
 		if (auto_explain_log_analyze && (eflags & EXEC_FLAG_EXPLAIN_ONLY) == 0)
@@ -210,7 +237,7 @@ explain_ExecutorStart(QueryDesc *queryDesc, int eflags)
 	else
 		standard_ExecutorStart(queryDesc, eflags);
 
-	if (auto_explain_enabled())
+	if (auto_explain_enabled() && current_query_sampled)
 	{
 		/*
 		 * Set up to track total elapsed time in ExecutorRun.  Make sure the
@@ -280,7 +307,7 @@ explain_ExecutorFinish(QueryDesc *queryDesc)
 static void
 explain_ExecutorEnd(QueryDesc *queryDesc)
 {
-	if (queryDesc->totaltime && auto_explain_enabled())
+	if (queryDesc->totaltime && auto_explain_enabled() && current_query_sampled)
 	{
 		double		msec;
 
diff --git a/doc/src/sgml/auto-explain.sgml b/doc/src/sgml/auto-explain.sgml
index d527208..9d40e65 100644
--- a/doc/src/sgml/auto-explain.sgml
+++ b/doc/src/sgml/auto-explain.sgml
@@ -203,6 +203,23 @@ LOAD 'auto_explain';
  
 

+
+   
+
+ auto_explain.sample_ratio (real)
+ 
+  auto_explain.sample_ratio configuration parameter
+ 
+
+
+ 
+  auto_explain.sample_ratio causes auto_explain to only
+  explain X percent statements in each session.  In case of nested
+  statements, either all will be explained or none. Only superusers can
+  change this setting.
+ 
+
+   
   
 
   

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] auto_explain sample rate

2016-02-16 Thread Alvaro Herrera
Julien Rouhaud wrote:

Hijacking this macro is just too obscure:

>  #define auto_explain_enabled() \
>   (auto_explain_log_min_duration >= 0 && \
> -  (nesting_level == 0 || auto_explain_log_nested_statements))
> +  (nesting_level == 0 || auto_explain_log_nested_statements) && \
> +  current_query_sampled)

because it then becomes hard to figure out that assigning to _sampled is
what makes the enabled() check pass or not depending on sampling:

> @@ -191,6 +211,14 @@ _PG_fini(void)
>  static void
>  explain_ExecutorStart(QueryDesc *queryDesc, int eflags)
>  {
> + /*
> +  * For ratio sampling, randomly choose top-level statement. Either
> +  * all nested statements will be explained or none will.
> +  */
> + if (auto_explain_log_min_duration >= 0 && nesting_level == 0)
> + current_query_sampled = (random() < auto_explain_sample_ratio *
> + MAX_RANDOM_VALUE);
> +
>   if (auto_explain_enabled())
>   {

I think it's better to keep the "enabled" macro unmodified, and just add
another conditional to the "if" test there.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] auto_explain sample rate

2016-02-16 Thread Julien Rouhaud
On 25/08/2015 14:45, Michael Paquier wrote:
> On Fri, Jul 17, 2015 at 2:53 PM, Craig Ringer  wrote:
>> On 7 July 2015 at 21:37, Julien Rouhaud  wrote:
>>
>>> Well, I obviously missed that pg_srand48() is only used if the system
>>> lacks random/srandom, sorry for the noise.  So yes, random() must be
>>> used instead of pg_lrand48().
>>>
>>> I'm attaching a new version of the patch fixing this issue just in case.
>>
>> Thanks for picking this up. I've been trying to find time to come back
>> to it but been swamped in priority work.
> 
> For now I am marking that as returned with feedback.
> 

PFA v3 of the patch, rebased on current head. It fixes the last issue
(sample a percentage of queries).

I'm adding it to the next commitfest.

-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org
diff --git a/contrib/auto_explain/auto_explain.c b/contrib/auto_explain/auto_explain.c
index 0950e18..4edb08e 100644
--- a/contrib/auto_explain/auto_explain.c
+++ b/contrib/auto_explain/auto_explain.c
@@ -29,6 +29,7 @@ static bool auto_explain_log_triggers = false;
 static bool auto_explain_log_timing = true;
 static int	auto_explain_log_format = EXPLAIN_FORMAT_TEXT;
 static bool auto_explain_log_nested_statements = false;
+static double auto_explain_sample_ratio = 1;
 
 static const struct config_enum_entry format_options[] = {
 	{"text", EXPLAIN_FORMAT_TEXT, false},
@@ -47,9 +48,14 @@ static ExecutorRun_hook_type prev_ExecutorRun = NULL;
 static ExecutorFinish_hook_type prev_ExecutorFinish = NULL;
 static ExecutorEnd_hook_type prev_ExecutorEnd = NULL;
 
+/* Is the current query sampled, per backend */
+static bool current_query_sampled = true;
+
 #define auto_explain_enabled() \
 	(auto_explain_log_min_duration >= 0 && \
-	 (nesting_level == 0 || auto_explain_log_nested_statements))
+	 (nesting_level == 0 || auto_explain_log_nested_statements) && \
+	 current_query_sampled)
+
 
 void		_PG_init(void);
 void		_PG_fini(void);
@@ -62,6 +68,7 @@ static void explain_ExecutorFinish(QueryDesc *queryDesc);
 static void explain_ExecutorEnd(QueryDesc *queryDesc);
 
 
+
 /*
  * Module load callback
  */
@@ -159,6 +166,19 @@ _PG_init(void)
 			 NULL,
 			 NULL);
 
+	DefineCustomRealVariable("auto_explain.sample_ratio",
+			 "Percentage of queries to process.",
+			NULL,
+			&auto_explain_sample_ratio,
+			1.0,
+			0.0,
+			1.0,
+			PGC_SUSET,
+			0,
+			NULL,
+			NULL,
+			NULL);
+
 	EmitWarningsOnPlaceholders("auto_explain");
 
 	/* Install hooks. */
@@ -191,6 +211,14 @@ _PG_fini(void)
 static void
 explain_ExecutorStart(QueryDesc *queryDesc, int eflags)
 {
+	/*
+	 * For ratio sampling, randomly choose top-level statement. Either
+	 * all nested statements will be explained or none will.
+	 */
+	if (auto_explain_log_min_duration >= 0 && nesting_level == 0)
+		current_query_sampled = (random() < auto_explain_sample_ratio *
+MAX_RANDOM_VALUE);
+
 	if (auto_explain_enabled())
 	{
 		/* Enable per-node instrumentation iff log_analyze is required. */
diff --git a/doc/src/sgml/auto-explain.sgml b/doc/src/sgml/auto-explain.sgml
index d527208..9d40e65 100644
--- a/doc/src/sgml/auto-explain.sgml
+++ b/doc/src/sgml/auto-explain.sgml
@@ -203,6 +203,23 @@ LOAD 'auto_explain';
  
 

+
+   
+
+ auto_explain.sample_ratio (real)
+ 
+  auto_explain.sample_ratio configuration parameter
+ 
+
+
+ 
+  auto_explain.sample_ratio causes auto_explain to only
+  explain X percent statements in each session.  In case of nested
+  statements, either all will be explained or none. Only superusers can
+  change this setting.
+ 
+
+   
   
 
   

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] auto_explain sample rate

2015-08-25 Thread Michael Paquier
On Fri, Jul 17, 2015 at 2:53 PM, Craig Ringer  wrote:
> On 7 July 2015 at 21:37, Julien Rouhaud  wrote:
>
>> Well, I obviously missed that pg_srand48() is only used if the system
>> lacks random/srandom, sorry for the noise.  So yes, random() must be
>> used instead of pg_lrand48().
>>
>> I'm attaching a new version of the patch fixing this issue just in case.
>
> Thanks for picking this up. I've been trying to find time to come back
> to it but been swamped in priority work.

For now I am marking that as returned with feedback.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] auto_explain sample rate

2015-07-16 Thread Craig Ringer
On 7 July 2015 at 21:37, Julien Rouhaud  wrote:

> Well, I obviously missed that pg_srand48() is only used if the system
> lacks random/srandom, sorry for the noise.  So yes, random() must be
> used instead of pg_lrand48().
>
> I'm attaching a new version of the patch fixing this issue just in case.

Thanks for picking this up. I've been trying to find time to come back
to it but been swamped in priority work.


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] auto_explain sample rate

2015-07-07 Thread Julien Rouhaud
On 05/07/2015 18:22, Julien Rouhaud wrote:
> On 03/06/2015 15:00, Craig Ringer wrote:
>>
>>
>> On 3 June 2015 at 20:04, Andres Freund > > wrote:
>>
>> On 2015-06-03 18:54:24 +0800, Craig Ringer wrote:
>> > OK, here we go.
>>
>> Hm. Wouldn't random sampling be better than what you do? If your queries
>> have a pattern to them (e.g. you always issue the same 10 queries in
>> succession), this will possibly only show a subset of the queries.
>>
>> I think a formulation in fraction (i.e. a float between 0 and 1) will
>> also be easier to understand.
>>
>>
>> Could be, yeah. I was thinking about the cost of generating a random
>> each time, but it's going to vanish in the noise compared to the rest of
>> the costs in query execution.
>>
> 
> Hello, I've just reviewed the patch.
> 
> I'm not sure if there's a consensus on the sample rate format.  FWIW, I
> also think a fraction would be easier to understand.  Any news about
> generating a random at each call to avoid the query pattern problem ?
> 
> The patch applies without error.  I wonder if there's any reason for
> using pg_lrand48() instead of random(), as there's a port for random()
> if the system lacks it.
> 
> After some quick checks, I found that auto_explain_sample_counter is
> always initialized with the same value.  After some digging, it seems
> that pg_lrand48() always returns the same values in the same order, at
> least on my computer.  Have I missed something?
> 

Well, I obviously missed that pg_srand48() is only used if the system
lacks random/srandom, sorry for the noise.  So yes, random() must be
used instead of pg_lrand48().

I'm attaching a new version of the patch fixing this issue just in case.


> Otherwise, after replacing the pg_lrand48() call with a random(), it
> works just fine.
> 
>> ---
>>  Craig Ringer   http://www.2ndQuadrant.com/
>>  PostgreSQL Development, 24x7 Support, Training & Services
> 
> 


-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org
*** a/contrib/auto_explain/auto_explain.c
--- b/contrib/auto_explain/auto_explain.c
***
*** 29,34  static bool auto_explain_log_triggers = false;
--- 29,35 
  static bool auto_explain_log_timing = true;
  static int	auto_explain_log_format = EXPLAIN_FORMAT_TEXT;
  static bool auto_explain_log_nested_statements = false;
+ static int  auto_explain_sample_ratio = 1;
  
  static const struct config_enum_entry format_options[] = {
  	{"text", EXPLAIN_FORMAT_TEXT, false},
***
*** 47,55  static ExecutorRun_hook_type prev_ExecutorRun = NULL;
  static ExecutorFinish_hook_type prev_ExecutorFinish = NULL;
  static ExecutorEnd_hook_type prev_ExecutorEnd = NULL;
  
  #define auto_explain_enabled() \
  	(auto_explain_log_min_duration >= 0 && \
! 	 (nesting_level == 0 || auto_explain_log_nested_statements))
  
  void		_PG_init(void);
  void		_PG_fini(void);
--- 48,61 
  static ExecutorFinish_hook_type prev_ExecutorFinish = NULL;
  static ExecutorEnd_hook_type prev_ExecutorEnd = NULL;
  
+ /* per-backend counter used for ratio sampling */
+ static int  auto_explain_sample_counter = 0;
+ 
  #define auto_explain_enabled() \
  	(auto_explain_log_min_duration >= 0 && \
! 	 (nesting_level == 0 || auto_explain_log_nested_statements)) && \
! 	 (auto_explain_sample_ratio == 1 || auto_explain_sample_counter == 0)
! 
  
  void		_PG_init(void);
  void		_PG_fini(void);
***
*** 61,66  static void explain_ExecutorRun(QueryDesc *queryDesc,
--- 67,85 
  static void explain_ExecutorFinish(QueryDesc *queryDesc);
  static void explain_ExecutorEnd(QueryDesc *queryDesc);
  
+ static void
+ auto_explain_sample_ratio_assign_hook(int newval, void *extra)
+ {
+ 	if (auto_explain_sample_ratio != newval)
+ 	{
+ 		/* Schedule a counter reset when the sample ratio changed */
+ 		auto_explain_sample_counter = -1;
+ 	}
+ 
+ 	auto_explain_sample_ratio = newval;
+ }
+ 
+ 
  
  /*
   * Module load callback
***
*** 159,164  _PG_init(void)
--- 178,195 
  			 NULL,
  			 NULL);
  
+ 	DefineCustomIntVariable("auto_explain.sample_ratio",
+ 		"Only explain one in approx. every sample_ratio queries, or 1 for all",
+ 			NULL,
+ 			&auto_explain_sample_ratio,
+ 			1,
+ 			1, INT_MAX - 1,
+ 			PGC_SUSET,
+ 			0,
+ 			NULL,
+ 			auto_explain_sample_ratio_assign_hook,
+ 			NULL);
+ 
  	EmitWarningsOnPlaceholders("auto_explain");
  
  	/* Install hooks. */
***
*** 191,196  _PG_fini(void)
--- 222,250 
  static void
  explain_ExecutorStart(QueryDesc *queryDesc, int eflags)
  {
+ 	/*
+ 	 * For ratio sampling, only increment the counter for top-level
+ 	 * statements. Either all nested statements will be explained
+ 	 * or none will, because we need to know at ExecutorEnd hook time
+ 	 * whether or not we explained any given statement.
+ 	 */
+ 	if (nesting_level == 0 && auto_explain_sample_ratio > 1)
+ 	{
+ 		if (auto_explain_sam

Re: [HACKERS] auto_explain sample rate

2015-07-05 Thread Julien Rouhaud
On 03/06/2015 15:00, Craig Ringer wrote:
> 
> 
> On 3 June 2015 at 20:04, Andres Freund  > wrote:
> 
> On 2015-06-03 18:54:24 +0800, Craig Ringer wrote:
> > OK, here we go.
> 
> Hm. Wouldn't random sampling be better than what you do? If your queries
> have a pattern to them (e.g. you always issue the same 10 queries in
> succession), this will possibly only show a subset of the queries.
> 
> I think a formulation in fraction (i.e. a float between 0 and 1) will
> also be easier to understand.
> 
> 
> Could be, yeah. I was thinking about the cost of generating a random
> each time, but it's going to vanish in the noise compared to the rest of
> the costs in query execution.
> 

Hello, I've just reviewed the patch.

I'm not sure if there's a consensus on the sample rate format.  FWIW, I
also think a fraction would be easier to understand.  Any news about
generating a random at each call to avoid the query pattern problem ?

The patch applies without error.  I wonder if there's any reason for
using pg_lrand48() instead of random(), as there's a port for random()
if the system lacks it.

After some quick checks, I found that auto_explain_sample_counter is
always initialized with the same value.  After some digging, it seems
that pg_lrand48() always returns the same values in the same order, at
least on my computer.  Have I missed something?

Otherwise, after replacing the pg_lrand48() call with a random(), it
works just fine.

> ---
>  Craig Ringer   http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services


-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] auto_explain sample rate

2015-06-03 Thread Craig Ringer
On 3 June 2015 at 20:04, Andres Freund  wrote:

> On 2015-06-03 18:54:24 +0800, Craig Ringer wrote:
> > OK, here we go.
>
> Hm. Wouldn't random sampling be better than what you do? If your queries
> have a pattern to them (e.g. you always issue the same 10 queries in
> succession), this will possibly only show a subset of the queries.
>
> I think a formulation in fraction (i.e. a float between 0 and 1) will
> also be easier to understand.


Could be, yeah. I was thinking about the cost of generating a random each
time, but it's going to vanish in the noise compared to the rest of the
costs in query execution.

---
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] auto_explain sample rate

2015-06-03 Thread Andres Freund
On 2015-06-03 18:54:24 +0800, Craig Ringer wrote:
> OK, here we go.

Hm. Wouldn't random sampling be better than what you do? If your queries
have a pattern to them (e.g. you always issue the same 10 queries in
succession), this will possibly only show a subset of the queries.

I think a formulation in fraction (i.e. a float between 0 and 1) will
also be easier to understand.

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] auto_explain sample rate

2015-06-03 Thread Craig Ringer
On 2 June 2015 at 15:07, Craig Ringer  wrote:

> On 29 May 2015 at 11:35, Tom Lane  wrote:
>
>> Craig Ringer  writes:
>> > It's sometimes desirable to collect auto_explain data with ANALYZE in
>> order
>> > to track down hard-to-reproduce issues, but the performance impacts can
>> be
>> > pretty hefty on the DB.
>>
>> > I'm inclined to add a sample rate to auto_explain so that it can trigger
>> > only on x percent of queries,
>>
>> That sounds reasonable ...
>>
>
> Cool, I'll cook that up then. Thanks for the sanity check.
>

OK, here we go.

To make sure it doesn't trigger on all backends at once, and to ensure it
doesn't rely on a shared point of contention in shmem, this sets up a
counter with a random value on each backend start.

Because it needs to either always run both the Start and End hooks, or run
neither, this doesn't count nested statements for sampling purposes. So if
you run my_huge_plpgsql_function() then either all its statements will be
explained or none of them will. This only applies if nested statement
explain is enabled. It's possible to get around this by adding a separate
nested statement counter that's reset at each top level End hook, but it
doesn't seem worthwhile.

The sample rate has no effect on ANALYZE, which remains enabled or disabled
for all queries. I don't see any point adding a separate sample rate
control to ANALYZE only some sub-proportion of EXPLAINed statements.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Tra
From 1f395e11bad41a77881970c7179f65e33c523cdd Mon Sep 17 00:00:00 2001
From: Craig Ringer 
Date: Wed, 3 Jun 2015 17:48:02 +0800
Subject: [PATCH] Allow sampling of only some queries by auto_explain

The new parameter auto_explain.sample_ratio specifies the proportion
of queries that are to be explained.  The starting offset is randomized
so that somewhere between the 1st and n'th statement gets explained,
then every n'th statement. No statement classification is performed.

If nested statement auto_explain is enabled, then for any given
top level statement either all sub-statements or none will be explained,
i.e. sampling doesn't apply to nested statements.
---
 contrib/auto_explain/auto_explain.c | 56 -
 doc/src/sgml/auto-explain.sgml  | 18 
 2 files changed, 73 insertions(+), 1 deletion(-)

diff --git a/contrib/auto_explain/auto_explain.c b/contrib/auto_explain/auto_explain.c
index 2a184ed..161e24e 100644
--- a/contrib/auto_explain/auto_explain.c
+++ b/contrib/auto_explain/auto_explain.c
@@ -29,6 +29,7 @@ static bool auto_explain_log_triggers = false;
 static bool auto_explain_log_timing = true;
 static int	auto_explain_log_format = EXPLAIN_FORMAT_TEXT;
 static bool auto_explain_log_nested_statements = false;
+static int  auto_explain_sample_ratio = 1;
 
 static const struct config_enum_entry format_options[] = {
 	{"text", EXPLAIN_FORMAT_TEXT, false},
@@ -47,9 +48,14 @@ static ExecutorRun_hook_type prev_ExecutorRun = NULL;
 static ExecutorFinish_hook_type prev_ExecutorFinish = NULL;
 static ExecutorEnd_hook_type prev_ExecutorEnd = NULL;
 
+/* per-backend counter used for ratio sampling */
+static int  auto_explain_sample_counter = 0;
+
 #define auto_explain_enabled() \
 	(auto_explain_log_min_duration >= 0 && \
-	 (nesting_level == 0 || auto_explain_log_nested_statements))
+	 (nesting_level == 0 || auto_explain_log_nested_statements)) && \
+	 (auto_explain_sample_ratio == 1 || auto_explain_sample_counter == 0)
+
 
 void		_PG_init(void);
 void		_PG_fini(void);
@@ -61,6 +67,19 @@ static void explain_ExecutorRun(QueryDesc *queryDesc,
 static void explain_ExecutorFinish(QueryDesc *queryDesc);
 static void explain_ExecutorEnd(QueryDesc *queryDesc);
 
+static void
+auto_explain_sample_ratio_assign_hook(int newval, void *extra)
+{
+	if (auto_explain_sample_ratio != newval)
+	{
+		/* Schedule a counter reset when the sample ratio changed */
+		auto_explain_sample_counter = -1;
+	}
+
+	auto_explain_sample_ratio = newval;
+}
+
+
 
 /*
  * Module load callback
@@ -159,6 +178,18 @@ _PG_init(void)
 			 NULL,
 			 NULL);
 
+	DefineCustomIntVariable("auto_explain.sample_ratio",
+		"Only explain one in approx. every sample_ratio queries, or 1 for all",
+			NULL,
+			&auto_explain_sample_ratio,
+			1,
+			1, INT_MAX - 1,
+			PGC_SUSET,
+			0,
+			NULL,
+			auto_explain_sample_ratio_assign_hook,
+			NULL);
+
 	EmitWarningsOnPlaceholders("auto_explain");
 
 	/* Install hooks. */
@@ -191,6 +222,29 @@ _PG_fini(void)
 static void
 explain_ExecutorStart(QueryDesc *queryDesc, int eflags)
 {
+	/*
+	 * For ratio sampling, only increment the counter for top-level
+	 * statements. Either all nested statements will be explained
+	 * or none will, because we need to know at ExecutorEnd hook time
+	 * whether or not we explained any given statement.
+	 */
+	if (nesting_level == 0 && auto_explain_sample_ratio > 1)
+	{
+		if (auto_explain

Re: [HACKERS] auto_explain sample rate

2015-06-03 Thread Craig Ringer
>
>
>> lot of variants - I would to see cost and times for EXPLAIN ANALYZE every
> time - but the precision of time can be reduced to 1ms. It is question if
> we can significantly reduce the cost (or number of calls) of getting system
> time.
>
> Pavel
>
>
>
OK, so you're suggesting a sampling-based EXPLAIN.

That'd be interesting, but is totally unrelated to this work on
auto_explain.


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] auto_explain sample rate

2015-06-03 Thread Pavel Stehule
2015-06-03 9:46 GMT+02:00 Craig Ringer :

>
> On 3 June 2015 at 15:22, Pavel Stehule  wrote:
>
>>
>>
>> 2015-06-03 9:17 GMT+02:00 Craig Ringer :
>>
>>>
>>>
>>> On 2 June 2015 at 15:11, Pavel Stehule  wrote:
>>>


 2015-06-02 9:07 GMT+02:00 Craig Ringer :

>
> For the majority of users I'm sure it's sufficient to just have a
> sample rate.
>
> Anything that's trying to match individual queries could be interested
> in all sorts of different things. Queries that touch a particular table
> being one of the more obvious things, or queries that mention a particular
> literal. Rather than try to design something complicated in advance that
> anticipates all needs, I'm thinking it makes sense to just throw a hook in
> there. If some patterns start to emerge in terms of useful real world
> filtering criteria then that'd better inform any more user accessible
> design down the track.
>

 same method can be interesting for interactive EXPLAIN ANALYZE too.
 TIMING has about 20%-30% overhead and usually we don't need a perfectly
 exact numbers

>>>
>>> I don't understand what you are suggesting here.
>>>
>>
>> using some sampling for EXPLAIN ANALYZE statement
>>
>>
> Do you mean that you'd like to be able to set a fraction of queries on
> which auto_explain does ANALYZE, so most of the time it just outputs an
> ordinary EXPLAIN?
>
> Or maybe we're talking about different things re the original proposal? I
> don't see how this would work. If you run EXPLAIN ANALYZE interactively
> like you said above. You'd surely want it to report costs and timings, or
> whatever it is that you ask for, all the time. Not just some of the time
> based on some background setting.
>
> Are you advocating a profiling-based approach for EXPLAIN ANALYZE timing
> where we sample which executor node we're under at regular intervals,
> instead of timing everything? Or suggesting a way to filter out sub-trees
> so you only get timing data on some sub-portion of a query?
>
>
lot of variants - I would to see cost and times for EXPLAIN ANALYZE every
time - but the precision of time can be reduced to 1ms. It is question if
we can significantly reduce the cost (or number of calls) of getting system
time.

Pavel


>
> --
>  Craig Ringer   http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services
>


Re: [HACKERS] auto_explain sample rate

2015-06-03 Thread Craig Ringer
On 3 June 2015 at 15:22, Pavel Stehule  wrote:

>
>
> 2015-06-03 9:17 GMT+02:00 Craig Ringer :
>
>>
>>
>> On 2 June 2015 at 15:11, Pavel Stehule  wrote:
>>
>>>
>>>
>>> 2015-06-02 9:07 GMT+02:00 Craig Ringer :
>>>

 For the majority of users I'm sure it's sufficient to just have a
 sample rate.

 Anything that's trying to match individual queries could be interested
 in all sorts of different things. Queries that touch a particular table
 being one of the more obvious things, or queries that mention a particular
 literal. Rather than try to design something complicated in advance that
 anticipates all needs, I'm thinking it makes sense to just throw a hook in
 there. If some patterns start to emerge in terms of useful real world
 filtering criteria then that'd better inform any more user accessible
 design down the track.

>>>
>>> same method can be interesting for interactive EXPLAIN ANALYZE too.
>>> TIMING has about 20%-30% overhead and usually we don't need a perfectly
>>> exact numbers
>>>
>>
>> I don't understand what you are suggesting here.
>>
>
> using some sampling for EXPLAIN ANALYZE statement
>
>
Do you mean that you'd like to be able to set a fraction of queries on
which auto_explain does ANALYZE, so most of the time it just outputs an
ordinary EXPLAIN?

Or maybe we're talking about different things re the original proposal? I
don't see how this would work. If you run EXPLAIN ANALYZE interactively
like you said above. You'd surely want it to report costs and timings, or
whatever it is that you ask for, all the time. Not just some of the time
based on some background setting.

Are you advocating a profiling-based approach for EXPLAIN ANALYZE timing
where we sample which executor node we're under at regular intervals,
instead of timing everything? Or suggesting a way to filter out sub-trees
so you only get timing data on some sub-portion of a query?


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] auto_explain sample rate

2015-06-03 Thread Pavel Stehule
2015-06-03 9:17 GMT+02:00 Craig Ringer :

>
>
> On 2 June 2015 at 15:11, Pavel Stehule  wrote:
>
>>
>>
>> 2015-06-02 9:07 GMT+02:00 Craig Ringer :
>>
>>>
>>> For the majority of users I'm sure it's sufficient to just have a sample
>>> rate.
>>>
>>> Anything that's trying to match individual queries could be interested
>>> in all sorts of different things. Queries that touch a particular table
>>> being one of the more obvious things, or queries that mention a particular
>>> literal. Rather than try to design something complicated in advance that
>>> anticipates all needs, I'm thinking it makes sense to just throw a hook in
>>> there. If some patterns start to emerge in terms of useful real world
>>> filtering criteria then that'd better inform any more user accessible
>>> design down the track.
>>>
>>
>> same method can be interesting for interactive EXPLAIN ANALYZE too.
>> TIMING has about 20%-30% overhead and usually we don't need a perfectly
>> exact numbers
>>
>
> I don't understand what you are suggesting here.
>

using some sampling for EXPLAIN ANALYZE statement

Pavel

>
>
>
> --
>  Craig Ringer   http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services
>


Re: [HACKERS] auto_explain sample rate

2015-06-03 Thread Craig Ringer
On 2 June 2015 at 15:11, Pavel Stehule  wrote:

>
>
> 2015-06-02 9:07 GMT+02:00 Craig Ringer :
>
>>
>> For the majority of users I'm sure it's sufficient to just have a sample
>> rate.
>>
>> Anything that's trying to match individual queries could be interested in
>> all sorts of different things. Queries that touch a particular table being
>> one of the more obvious things, or queries that mention a particular
>> literal. Rather than try to design something complicated in advance that
>> anticipates all needs, I'm thinking it makes sense to just throw a hook in
>> there. If some patterns start to emerge in terms of useful real world
>> filtering criteria then that'd better inform any more user accessible
>> design down the track.
>>
>
> same method can be interesting for interactive EXPLAIN ANALYZE too. TIMING
> has about 20%-30% overhead and usually we don't need a perfectly exact
> numbers
>

I don't understand what you are suggesting here.



-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] auto_explain sample rate

2015-06-02 Thread Pavel Stehule
2015-06-02 9:07 GMT+02:00 Craig Ringer :

> On 29 May 2015 at 11:35, Tom Lane  wrote:
>
>> Craig Ringer  writes:
>> > It's sometimes desirable to collect auto_explain data with ANALYZE in
>> order
>> > to track down hard-to-reproduce issues, but the performance impacts can
>> be
>> > pretty hefty on the DB.
>>
>> > I'm inclined to add a sample rate to auto_explain so that it can trigger
>> > only on x percent of queries,
>>
>> That sounds reasonable ...
>>
>
> Cool, I'll cook that up then. Thanks for the sanity check.
>
>
>> > and also add a sample test hook that can be
>> > used to target statements of interest more narrowly (using a C hook
>> > function).
>>
>> You'd have to be pretty desperate, *and* knowledgeable, to write a
>> C function for that.  Can't we invent something a bit more user-friendly
>> for the purpose?  No idea what it should look like though.
>>
>
> I've been that desperate.
>
> For the majority of users I'm sure it's sufficient to just have a sample
> rate.
>
> Anything that's trying to match individual queries could be interested in
> all sorts of different things. Queries that touch a particular table being
> one of the more obvious things, or queries that mention a particular
> literal. Rather than try to design something complicated in advance that
> anticipates all needs, I'm thinking it makes sense to just throw a hook in
> there. If some patterns start to emerge in terms of useful real world
> filtering criteria then that'd better inform any more user accessible
> design down the track.
>

same method can be interesting for interactive EXPLAIN ANALYZE too. TIMING
has about 20%-30% overhead and usually we don't need a perfectly exact
numbers

Regards

Pavel


>
>
> --
>  Craig Ringer   http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services
>


Re: [HACKERS] auto_explain sample rate

2015-06-02 Thread Craig Ringer
On 29 May 2015 at 11:35, Tom Lane  wrote:

> Craig Ringer  writes:
> > It's sometimes desirable to collect auto_explain data with ANALYZE in
> order
> > to track down hard-to-reproduce issues, but the performance impacts can
> be
> > pretty hefty on the DB.
>
> > I'm inclined to add a sample rate to auto_explain so that it can trigger
> > only on x percent of queries,
>
> That sounds reasonable ...
>

Cool, I'll cook that up then. Thanks for the sanity check.


> > and also add a sample test hook that can be
> > used to target statements of interest more narrowly (using a C hook
> > function).
>
> You'd have to be pretty desperate, *and* knowledgeable, to write a
> C function for that.  Can't we invent something a bit more user-friendly
> for the purpose?  No idea what it should look like though.
>

I've been that desperate.

For the majority of users I'm sure it's sufficient to just have a sample
rate.

Anything that's trying to match individual queries could be interested in
all sorts of different things. Queries that touch a particular table being
one of the more obvious things, or queries that mention a particular
literal. Rather than try to design something complicated in advance that
anticipates all needs, I'm thinking it makes sense to just throw a hook in
there. If some patterns start to emerge in terms of useful real world
filtering criteria then that'd better inform any more user accessible
design down the track.


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] auto_explain sample rate

2015-05-28 Thread Tom Lane
Craig Ringer  writes:
> It's sometimes desirable to collect auto_explain data with ANALYZE in order
> to track down hard-to-reproduce issues, but the performance impacts can be
> pretty hefty on the DB.

> I'm inclined to add a sample rate to auto_explain so that it can trigger
> only on x percent of queries,

That sounds reasonable ...

> and also add a sample test hook that can be
> used to target statements of interest more narrowly (using a C hook
> function).

You'd have to be pretty desperate, *and* knowledgeable, to write a
C function for that.  Can't we invent something a bit more user-friendly
for the purpose?  No idea what it should look like though.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] auto_explain sample rate

2015-05-28 Thread Craig Ringer
Hi all

It's sometimes desirable to collect auto_explain data with ANALYZE in order
to track down hard-to-reproduce issues, but the performance impacts can be
pretty hefty on the DB.

I'm inclined to add a sample rate to auto_explain so that it can trigger
only on x percent of queries, and also add a sample test hook that can be
used to target statements of interest more narrowly (using a C hook
function).

Sound like a reasonable approach?

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services