On Wed, Mar 30, 2022 at 3:04 PM Magnus Hagander <mag...@hagander.net> wrote:

> On Tue, Mar 29, 2022 at 10:06 PM David Rowley <dgrowle...@gmail.com>
> wrote:
>
>> On Wed, 30 Mar 2022 at 02:38, Robert Haas <robertmh...@gmail.com> wrote:
>> > I think WARNING is fine. After all, the parameter is called
>> > "jit_warn_above_fraction".
>>
>> I had a think about this patch.  I guess it's a little similar to
>> checkpoint_warning. The good thing about the checkpoint_warning is
>> that in the LOG message we give instructions about how the DBA can fix
>> the issue, i.e increase max_wal_size.
>>
>> With the proposed patch I see there is no hint about what might be
>> done to remove/reduce the warnings.  I imagine that's because it's not
>> all that clear which GUC should be changed. In my view, likely
>> jit_above_cost is the most relevant but there is also
>> jit_inline_above_cost, jit_optimize_above_cost, jit_tuple_deforming
>> and jit_expressions which are relevant too.
>>
>> If we go with this patch,  the problem I see here is that the amount
>> of work the JIT compiler must do for a given query depends mostly on
>> the number of expressions that must be compiled in the query (also to
>> a lesser extent jit_inline_above_cost, jit_optimize_above_cost,
>> jit_tuple_deforming and jit_expressions). The DBA does not really have
>> much control over the number of expressions in the query.  All he or
>> she can do to get rid of the warning is something like increase
>> jit_above_cost.  After a few iterations of that, the end result is
>> that jit_above_cost is now high enough that JIT no longer triggers
>> for, say, that query to that table with 1000 partitions where no
>> plan-time pruning takes place.  Is that really a good thing? It likely
>> means that we just rarely JIT anything at all!
>>
>
> I don't agree with the conclusion of that.
>
> What the parameter would be useful for is to be able to tune those costs
> (or just turn it off) *for that individual query*. That doesn't mean you
> "rarely JIT anything atll", it just means you rarely JIT that particular
> query.
>
> In fact, my goal is to specifically make people do that and *not* just
> turn off JIT globally.
>
>
> I'd much rather see us address the costing problem before adding some
>> warning, especially a warning where it's not clear how to make go
>> away.
>>
>
> The easiest way would be to add a HINT that says turn off jit for this
> particular query or something?
>
> I do agree that if we can make  "spending too much time on JIT vs query
> runtime" go away completely, then there is no need for a parameter like
> this.
>
> I still think the warning is useful. And I think it may stay useful even
> after we have made the JIT costing smarter -- though that's not certain of
> course.
>
>
This patch is still sitting at "ready for committer".

As an example, I have added such a hint in the attached.

I still stand  by that this patch is better than nothing. Sure, I would
love for us to adapt the JIT costing model and algorithm to make this not a
problem. And once we've done that, we should remove the parameter again.

It's not on by default, and it's trivial to remove in the future.


Yes, we're right up at the deadline. I'd still like to get it in, so I'd
really appreciate some further voices :)

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ <http://www.hagander.net/>
 Work: https://www.redpill-linpro.com/ <http://www.redpill-linpro.com/>
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 6e3e27bed7..fa35fe04e8 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -6718,6 +6718,21 @@ local0.*    /var/log/postgresql
        </listitem>
      </varlistentry>
 
+     <varlistentry id="guc-jit-warn-above-fraction" xreflabel="jit_warn_above_fraction">
+      <term><varname>jit_warn_above_fraction</varname> (<type>floating point</type>)
+      <indexterm>
+       <primary><varname>jit_warn_above_fraction</varname> configuration parameter</primary>
+      </indexterm>
+      </term>
+       <listitem>
+        <para>
+         Causes a warning to be written to the log if the time spent on JIT (see <xref linkend="jit"/>)
+         goes above this fraction of the total query runtime.
+         A value of 0 (the default) disables the warning.
+        </para>
+       </listitem>
+     </varlistentry>
+
      <varlistentry id="guc-log-startup-progress-interval" xreflabel="log_startup_progress_interval">
       <term><varname>log_startup_progress_interval</varname> (<type>integer</type>)
       <indexterm>
diff --git a/src/backend/tcop/fastpath.c b/src/backend/tcop/fastpath.c
index d429aa4663..a1bff893a3 100644
--- a/src/backend/tcop/fastpath.c
+++ b/src/backend/tcop/fastpath.c
@@ -196,6 +196,7 @@ HandleFunctionRequest(StringInfo msgBuf)
 	struct fp_info *fip;
 	bool		callit;
 	bool		was_logged = false;
+	int			msecs;
 	char		msec_str[32];
 
 	/*
@@ -305,7 +306,7 @@ HandleFunctionRequest(StringInfo msgBuf)
 	/*
 	 * Emit duration logging if appropriate.
 	 */
-	switch (check_log_duration(msec_str, was_logged))
+	switch (check_log_duration(&msecs, msec_str, was_logged))
 	{
 		case 1:
 			ereport(LOG,
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 95dc2e2c83..d1dd2273ad 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -998,7 +998,9 @@ exec_simple_query(const char *query_string)
 	bool		save_log_statement_stats = log_statement_stats;
 	bool		was_logged = false;
 	bool		use_implicit_block;
+	int			msecs;
 	char		msec_str[32];
+	int64		jit_time = 0;
 
 	/*
 	 * Report query to various monitoring facilities.
@@ -1257,6 +1259,16 @@ exec_simple_query(const char *query_string)
 
 		receiver->rDestroy(receiver);
 
+		/* Collect JIT timings in case it's active */
+		if (jit_enabled && jit_warn_above_fraction > 0 && portal->queryDesc && portal->queryDesc->estate->es_jit)
+		{
+			jit_time +=
+				INSTR_TIME_GET_MILLISEC(portal->queryDesc->estate->es_jit->instr.generation_counter) +
+				INSTR_TIME_GET_MILLISEC(portal->queryDesc->estate->es_jit->instr.inlining_counter) +
+				INSTR_TIME_GET_MILLISEC(portal->queryDesc->estate->es_jit->instr.optimization_counter) +
+				INSTR_TIME_GET_MILLISEC(portal->queryDesc->estate->es_jit->instr.emission_counter);
+		}
+
 		PortalDrop(portal, false);
 
 		if (lnext(parsetree_list, parsetree_item) == NULL)
@@ -1327,7 +1339,7 @@ exec_simple_query(const char *query_string)
 	/*
 	 * Emit duration logging if appropriate.
 	 */
-	switch (check_log_duration(msec_str, was_logged))
+	switch (check_log_duration(&msecs, msec_str, was_logged))
 	{
 		case 1:
 			ereport(LOG,
@@ -1343,6 +1355,17 @@ exec_simple_query(const char *query_string)
 			break;
 	}
 
+	if (jit_enabled && jit_warn_above_fraction > 0)
+	{
+		if (jit_time > msecs * jit_warn_above_fraction)
+		{
+			ereport(WARNING,
+					(errmsg("JIT total processing time was %lld ms of %d ms",
+							(long long) jit_time, msecs)),
+					errhint("Consider turning of JIT or changing the threshold for this particular query"));
+		}
+	}
+
 	if (save_log_statement_stats)
 		ShowUsage("QUERY STATISTICS");
 
@@ -1370,6 +1393,7 @@ exec_parse_message(const char *query_string,	/* string to execute */
 	CachedPlanSource *psrc;
 	bool		is_named;
 	bool		save_log_statement_stats = log_statement_stats;
+	int			msecs;
 	char		msec_str[32];
 
 	/*
@@ -1563,7 +1587,7 @@ exec_parse_message(const char *query_string,	/* string to execute */
 	/*
 	 * Emit duration logging if appropriate.
 	 */
-	switch (check_log_duration(msec_str, false))
+	switch (check_log_duration(&msecs, msec_str, false))
 	{
 		case 1:
 			ereport(LOG,
@@ -1610,6 +1634,7 @@ exec_bind_message(StringInfo input_message)
 	MemoryContext oldContext;
 	bool		save_log_statement_stats = log_statement_stats;
 	bool		snapshot_set = false;
+	int			msecs;
 	char		msec_str[32];
 	ParamsErrorCbData params_data;
 	ErrorContextCallback params_errcxt;
@@ -2022,7 +2047,7 @@ exec_bind_message(StringInfo input_message)
 	/*
 	 * Emit duration logging if appropriate.
 	 */
-	switch (check_log_duration(msec_str, false))
+	switch (check_log_duration(&msecs, msec_str, false))
 	{
 		case 1:
 			ereport(LOG,
@@ -2068,6 +2093,7 @@ exec_execute_message(const char *portal_name, long max_rows)
 	bool		is_xact_command;
 	bool		execute_is_fetch;
 	bool		was_logged = false;
+	int			msecs;
 	char		msec_str[32];
 	ParamsErrorCbData params_data;
 	ErrorContextCallback params_errcxt;
@@ -2259,7 +2285,7 @@ exec_execute_message(const char *portal_name, long max_rows)
 	/*
 	 * Emit duration logging if appropriate.
 	 */
-	switch (check_log_duration(msec_str, was_logged))
+	switch (check_log_duration(&msecs, msec_str, was_logged))
 	{
 		case 1:
 			ereport(LOG,
@@ -2282,6 +2308,24 @@ exec_execute_message(const char *portal_name, long max_rows)
 			break;
 	}
 
+	if (jit_enabled && jit_warn_above_fraction > 0)
+	{
+		int64 jit_time =
+			INSTR_TIME_GET_MILLISEC(portal->queryDesc->estate->es_jit->instr.generation_counter) +
+			INSTR_TIME_GET_MILLISEC(portal->queryDesc->estate->es_jit->instr.inlining_counter) +
+			INSTR_TIME_GET_MILLISEC(portal->queryDesc->estate->es_jit->instr.optimization_counter) +
+			INSTR_TIME_GET_MILLISEC(portal->queryDesc->estate->es_jit->instr.emission_counter);
+
+		if (jit_time > msecs * jit_warn_above_fraction)
+		{
+			ereport(WARNING,
+					(errmsg("JIT total processing time was %lld ms of %d ms",
+							(long long) jit_time, msecs)),
+					errhint("Consider turning of JIT or changing the threshold for this particular query"));
+
+		}
+	}
+
 	if (save_log_statement_stats)
 		ShowUsage("EXECUTE MESSAGE STATISTICS");
 
@@ -2328,6 +2372,10 @@ check_log_statement(List *stmt_list)
  *		1 if just the duration should be logged
  *		2 if duration and query details should be logged
  *
+ * The millisecond runtime of the query is stored in msecs if any logging parameter
+ * that requires the calculation of time is enabled, even if the result is not that
+ * the entry should be logged. In other cases, it is set to -1.
+ *
  * If logging is needed, the duration in msec is formatted into msec_str[],
  * which must be a 32-byte buffer.
  *
@@ -2335,14 +2383,13 @@ check_log_statement(List *stmt_list)
  * essentially prevents 2 from being returned).
  */
 int
-check_log_duration(char *msec_str, bool was_logged)
+check_log_duration(int *msecs, char *msec_str, bool was_logged)
 {
 	if (log_duration || log_min_duration_sample >= 0 ||
-		log_min_duration_statement >= 0 || xact_is_sampled)
+	log_min_duration_statement >= 0 || xact_is_sampled || jit_warn_above_fraction > 0)
 	{
 		long		secs;
 		int			usecs;
-		int			msecs;
 		bool		exceeded_duration;
 		bool		exceeded_sample_duration;
 		bool		in_sample = false;
@@ -2350,7 +2397,7 @@ check_log_duration(char *msec_str, bool was_logged)
 		TimestampDifference(GetCurrentStatementStartTimestamp(),
 							GetCurrentTimestamp(),
 							&secs, &usecs);
-		msecs = usecs / 1000;
+		*msecs = usecs / 1000;
 
 		/*
 		 * This odd-looking test for log_min_duration_* being exceeded is
@@ -2360,12 +2407,12 @@ check_log_duration(char *msec_str, bool was_logged)
 		exceeded_duration = (log_min_duration_statement == 0 ||
 							 (log_min_duration_statement > 0 &&
 							  (secs > log_min_duration_statement / 1000 ||
-							   secs * 1000 + msecs >= log_min_duration_statement)));
+							   secs * 1000 + *msecs >= log_min_duration_statement)));
 
 		exceeded_sample_duration = (log_min_duration_sample == 0 ||
 									(log_min_duration_sample > 0 &&
 									 (secs > log_min_duration_sample / 1000 ||
-									  secs * 1000 + msecs >= log_min_duration_sample)));
+									  secs * 1000 + *msecs >= log_min_duration_sample)));
 
 		/*
 		 * Do not log if log_statement_sample_rate = 0. Log a sample if
@@ -2380,13 +2427,17 @@ check_log_duration(char *msec_str, bool was_logged)
 		if (exceeded_duration || in_sample || log_duration || xact_is_sampled)
 		{
 			snprintf(msec_str, 32, "%ld.%03d",
-					 secs * 1000 + msecs, usecs % 1000);
+					 secs * 1000 + *msecs, usecs % 1000);
 			if ((exceeded_duration || in_sample || xact_is_sampled) && !was_logged)
 				return 2;
 			else
 				return 1;
 		}
 	}
+	else
+	{
+		*msecs = -1;
+	}
 
 	return 0;
 }
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 22b5571a70..7e9859a9e1 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -642,6 +642,7 @@ int			log_parameter_max_length_on_error = 0;
 int			log_temp_files = -1;
 double		log_statement_sample_rate = 1.0;
 double		log_xact_sample_rate = 0;
+double		jit_warn_above_fraction = 0;
 int			trace_recovery_messages = LOG;
 char	   *backtrace_functions;
 char	   *backtrace_symbol_list;
@@ -3943,6 +3944,19 @@ static struct config_real ConfigureNamesReal[] =
 		NULL, NULL, NULL
 	},
 
+	{
+		{"jit_warn_above_fraction", PGC_USERSET, LOGGING_WHEN,
+			gettext_noop("Sets the fraction of query time spent on JIT before writing "
+						 "a warning to the log."),
+			gettext_noop("Write a message to the server log if more than this "
+						 "fraction of the query runtime is spent on JIT. "
+						 "Zero turns off the warning.")
+		},
+		&jit_warn_above_fraction,
+		0.0, 0.0, 1.0,
+		NULL, NULL, NULL
+	},
+
 	/* End-of-list marker */
 	{
 		{NULL, 0, 0, NULL, NULL}, NULL, 0.0, 0.0, 0.0, NULL, NULL, NULL
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 94270eb0ec..82d0f5135b 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -411,6 +411,8 @@
 #jit_optimize_above_cost = 500000	# use expensive JIT optimizations if
 					# query is more expensive than this;
 					# -1 disables
+#jit_warn_above_fraction = 0		# warn if JIT is more than this fraction of
+					# query runtime. 0 disables.
 
 # - Genetic Query Optimizer -
 
diff --git a/src/include/jit/jit.h b/src/include/jit/jit.h
index 707176d9ed..59b37d6dc0 100644
--- a/src/include/jit/jit.h
+++ b/src/include/jit/jit.h
@@ -89,6 +89,7 @@ extern bool jit_tuple_deforming;
 extern double jit_above_cost;
 extern double jit_inline_above_cost;
 extern double jit_optimize_above_cost;
+extern double jit_warn_above_fraction;
 
 
 extern void jit_reset_after_error(void);
diff --git a/src/include/tcop/tcopprot.h b/src/include/tcop/tcopprot.h
index 92291a750d..3f12f7fe25 100644
--- a/src/include/tcop/tcopprot.h
+++ b/src/include/tcop/tcopprot.h
@@ -87,7 +87,7 @@ extern void PostgresMain(const char *dbname,
 extern long get_stack_depth_rlimit(void);
 extern void ResetUsage(void);
 extern void ShowUsage(const char *title);
-extern int	check_log_duration(char *msec_str, bool was_logged);
+extern int	check_log_duration(int *msec, char *msec_str, bool was_logged);
 extern void set_debug_options(int debug_flag,
 							  GucContext context, GucSource source);
 extern bool set_plan_disabling_options(const char *arg,

Reply via email to