On 05/31/2018 03:34 AM, David Rowley wrote:
> On 31 May 2018 at 06:44, Adrien Nayrat <adrien.nay...@anayrat.info> wrote:
>> Here is a naive SELECT only bench with a dataset which fit in ram (scale 
>> factor
>> = 100) and PGDATA and log on a ramdisk:
>> shared_buffers = 4GB
>> seq_page_cost = random_page_cost = 1.0
>> logging_collector = on (no rotation)
> 
> It would be better to just: SELECT 1; to try to get the true overhead
> of the additional logging code.

Right, I wonder why I didn't have the idea :)

> 
>> I don't know the cost of random() call?
> 
> It's probably best to test in Postgres to see if there's an overhead
> to the new code.  It may be worth special casing the 0 and 1 case so
> random() is not called.

Thanks, I based on auto_explain's code. I will send a patch for auto_explain
with same changes.

> 
> +    (random() < log_sample_rate * MAX_RANDOM_VALUE);
> 
> this should be <=, or you'll randomly miss logging a query when
> log_sample_rate is 1.0 every 4 billion or so queries.

Right, it is the same in auto_explain.

> 
> Of course, it would be better if we had a proper profiler, but I can
> see your need for this. Enabling logging of all queries in production
> is currently reserved for people with low traffic servers and the
> insane.


Yes, here are results with "SELECT 1" and changes to avoid random() call with 0
and 1.


master :
log_min_duration_statement = 0
TPS: 41294
log size: 2477589 lines (199MB)

log_min_duration_statement = -1
TPS: 49478
log size: 0 lines


patched:
log_min_duration_statement = 0
log_sample_rate = 1
TPS: 41635
log size: 2497994 lines (201MB)

log_min_duration_statement = 0
log_sample_rate = 0.1
TPS: 46915
log size: 282303 lines (23MB)

log_min_duration_statement = 0
log_sample_rate = 0.01
TPS: 48867
log size: 29193 lines (2.4MB)

log_min_duration_statement = 0
log_sample_rate = 0
TPS: 48912
log size: 0 lines

log_min_duration_statement = -1
log_sample_rate = 1
TPS: 49061
log size: 0 lines

Difference between l_m_d_s = 0 and -1 is about 16% and logger process eat around
30% CPU on my laptop.


Revised patch attached.


Thanks,


-- 
Adrien NAYRAT

commit d7d82d9d07a9285a45bb28b8dbe3cff6c8b91408
Author: anayrat <adrien.nay...@anayrat.info>
Date:   Wed May 30 20:27:54 2018 +0200

    Add a new GUC to log a fraction of queries.

diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index f4133953be..10d4e44209 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -2115,7 +2115,8 @@ check_log_statement(List *stmt_list)
 
 /*
  * check_log_duration
- *		Determine whether current command's duration should be logged
+ *		Determine whether current command's duration should be logged.
+ *		If log_sample_rate < 1.0, log only a fraction of the command.
  *
  * Returns:
  *		0 if no logging is needed
@@ -2151,7 +2152,9 @@ check_log_duration(char *msec_str, bool was_logged)
 		exceeded = (log_min_duration_statement == 0 ||
 					(log_min_duration_statement > 0 &&
 					 (secs > log_min_duration_statement / 1000 ||
-					  secs * 1000 + msecs >= log_min_duration_statement)));
+					  secs * 1000 + msecs >= log_min_duration_statement))) &&
+				   log_sample_rate != 0 && (log_sample_rate == 1 ||
+					 random() <= log_sample_rate * MAX_RANDOM_VALUE);
 
 		if (exceeded || log_duration)
 		{
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index ee1444c427..b219682b83 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -455,6 +455,7 @@ int			log_min_messages = WARNING;
 int			client_min_messages = NOTICE;
 int			log_min_duration_statement = -1;
 int			log_temp_files = -1;
+double		log_sample_rate = 1.0;
 int			trace_recovery_messages = LOG;
 
 int			temp_file_limit = -1;
@@ -3257,6 +3258,17 @@ static struct config_real ConfigureNamesReal[] =
 		NULL, NULL, NULL
 	},
 
+	{
+		{"log_sample_rate", PGC_SUSET, LOGGING_WHEN,
+			gettext_noop("Fraction of statements to log."),
+			gettext_noop("1.0 log all statements."),
+			NULL
+		},
+		&log_sample_rate,
+		1.0, 0.0, 1.0,
+		NULL, NULL, NULL
+	},
+
 	/* End-of-list marker */
 	{
 		{NULL, 0, 0, NULL, NULL}, NULL, 0.0, 0.0, 0.0, NULL, NULL, NULL
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 3d88e80a20..c990f6c20b 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -447,6 +447,8 @@
 					# statements running at least this number
 					# of milliseconds
 
+#log_sample_rate = 1  # Fraction of logged statements. 1 means log all
+					# statements.
 
 # - What to Log -
 
diff --git a/src/include/utils/guc.h b/src/include/utils/guc.h
index 3d13a33b94..4480756954 100644
--- a/src/include/utils/guc.h
+++ b/src/include/utils/guc.h
@@ -252,6 +252,7 @@ extern PGDLLIMPORT int log_min_messages;
 extern PGDLLIMPORT int client_min_messages;
 extern int	log_min_duration_statement;
 extern int	log_temp_files;
+extern double	log_sample_rate;
 
 extern int	temp_file_limit;
 

Attachment: signature.asc
Description: OpenPGP digital signature

Reply via email to