On 06/27/2018 11:13 PM, Adrien Nayrat wrote:
>> 3) Is it intentional to only sample with log_min_duration_statement and
>> not also with log_duration?  It seems like it should affect both.  In
>> both cases, the name is too generic.  Something called "log_sample_rate"
>> should sample **everything**.
> I do not think it could be useful to sample other case such as log_duration.
> 
> But yes, the GUC is confusing and I am not comfortable to introduce a new GUC 
> in
> my initial patch.
> 
> Maybe we should adapt current GUC with something like :
> 
> log_min_duration_statement = <time>,<sample rate>>
> This give :
> 
> log_min_duration_statement = 0,0.1
> 
> Equivalent to :
> log_min_duration_statement = 0
> log_sample_rate = 0.1
> 
> Thought?
> 

After reflection it seems a bad idea :

  * it breaks compatibility with external tools
  * it introduce a kind of "composite" GUC which may add complexity to use. For
example in pg_settings view.

What do you think of : log_min_duration_statement_sample ? Is it too long?


I saw a few days ago this error on http://commitfest.cputube.org

postgres.sgml:5202: element xref: validity error : IDREF attribute linkend
references an unknown ID "log_min_duration_statement"

Patch attached with fix on linkend marker

Regards,

-- 
Adrien

commit d388b1e31926c2f5f12aa5453cc4df6e7d9e5f60
Author: anayrat <adrien.nay...@anayrat.info>
Date:   Tue Jul 10 19:58:16 2018 +0200

    Add a new GUC, log_sample_rate, to log a fraction of queries.

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index e307bb4e8e..72791ad2a6 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -5190,6 +5190,26 @@ local0.*    /var/log/postgresql
        </listitem>
       </varlistentry>
 
+     <varlistentry id="guc-log-sample-rate" xreflabel="log_sample_rate">
+      <term><varname>log_sample_rate</varname> (<type>real</type>)
+      <indexterm>
+       <primary><varname>log_sample_rate</varname> configuration parameter</primary>
+      </indexterm>
+      </term>
+       <listitem>
+        <para>
+         Causes logging only a fraction of the statements when <xref
+         linkend="guc-log-min-duration-statement"/> is used. The default is
+         <literal>1</literal>, meaning log all statements longer than
+         <varname>log_min_duration_statement</varname>.  Setting this to zero
+         disables logging, same as setting
+         <varname>log_min_duration_statement</varname> to
+         <literal>-1</literal>.  Using <varname>log_sample_rate</varname> is
+         helpful when the traffic is too high to log all queries.
+        </para>
+       </listitem>
+      </varlistentry>
+
      </variablelist>
 
     <para>
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index f4133953be..5590f9a9d4 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -2115,7 +2115,8 @@ check_log_statement(List *stmt_list)
 
 /*
  * check_log_duration
- *		Determine whether current command's duration should be logged
+ *		Determine whether current command's duration should be logged.
+ *		If log_sample_rate < 1.0, log only a sample.
  *
  * Returns:
  *		0 if no logging is needed
@@ -2137,6 +2138,7 @@ check_log_duration(char *msec_str, bool was_logged)
 		int			usecs;
 		int			msecs;
 		bool		exceeded;
+		bool		in_sample;
 
 		TimestampDifference(GetCurrentStatementStartTimestamp(),
 							GetCurrentTimestamp(),
@@ -2153,7 +2155,15 @@ check_log_duration(char *msec_str, bool was_logged)
 					 (secs > log_min_duration_statement / 1000 ||
 					  secs * 1000 + msecs >= log_min_duration_statement)));
 
-		if (exceeded || log_duration)
+		/*
+		 * Do not log if log_sample_rate = 0.
+		 * Log a sample if log_sample_rate <= 1 and avoid unecessary random()
+		 * call if log_sample_rate = 1.
+		 */
+		in_sample = log_sample_rate != 0 &&
+			(log_sample_rate == 1 || random() <= log_sample_rate * MAX_RANDOM_VALUE);
+
+		if (exceeded && in_sample || log_duration)
 		{
 			snprintf(msec_str, 32, "%ld.%03d",
 					 secs * 1000 + msecs, usecs % 1000);
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 17292e04fe..daf1ae3af1 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -455,6 +455,7 @@ int			log_min_messages = WARNING;
 int			client_min_messages = NOTICE;
 int			log_min_duration_statement = -1;
 int			log_temp_files = -1;
+double 		log_sample_rate = 1.0;
 int			trace_recovery_messages = LOG;
 
 int			temp_file_limit = -1;
@@ -3257,6 +3258,17 @@ static struct config_real ConfigureNamesReal[] =
 		NULL, NULL, NULL
 	},
 
+	{
+		{"log_sample_rate", PGC_SUSET, LOGGING_WHEN,
+			gettext_noop("Fraction of statements to log."),
+			gettext_noop("1.0 logs all statements."),
+			NULL
+		},
+		&log_sample_rate,
+		1.0, 0.0, 1.0,
+		NULL, NULL, NULL
+	},
+
 	/* End-of-list marker */
 	{
 		{NULL, 0, 0, NULL, NULL}, NULL, 0.0, 0.0, 0.0, NULL, NULL, NULL
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 657c3f81f8..b8ebbe9f7f 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -446,6 +446,8 @@
 					# statements running at least this number
 					# of milliseconds
 
+#log_sample_rate = 1  # Fraction of logged statements. 1 means log all
+					# statements.
 
 # - What to Log -
 
diff --git a/src/include/utils/guc.h b/src/include/utils/guc.h
index f462eabe59..f50a0d1f1b 100644
--- a/src/include/utils/guc.h
+++ b/src/include/utils/guc.h
@@ -252,6 +252,7 @@ extern PGDLLIMPORT int log_min_messages;
 extern PGDLLIMPORT int client_min_messages;
 extern int	log_min_duration_statement;
 extern int	log_temp_files;
+extern double	log_sample_rate;
 
 extern int	temp_file_limit;
 

Attachment: signature.asc
Description: OpenPGP digital signature

Reply via email to