On Fri, Feb 25, 2022 at 5:23 PM Justin Pryzby <pry...@telsasoft.com> wrote: > > On Fri, Feb 25, 2022 at 04:16:01PM +0100, Magnus Hagander wrote: > > + { > > + {"jit_warn_above_fraction", PGC_SUSET, LOGGING_WHEN, > > + gettext_noop("Sets the fraction of query time spent > > on JIT before writing" > > + "a warning to the log."), > > + gettext_noop("Write a message tot he server log if > > more than this" > > + "fraction of the query > > runtime is spent on JIT." > > + "Zero turns off the > > warning.") > > + }, > > + &jit_warn_above_fraction, > > + 0.0, 0.0, 1.0, > > + NULL, NULL, NULL > > + }, > > Should be PGC_USERSET ?
Yes. Definitely. Copy/paste thinko. > + gettext_noop("Write a message tot he server log if > more than this" > > to the Yup. > + if (jit_enabled && jit_warn_above_fraction > 0) > + { > + int64 jit_time = > + > INSTR_TIME_GET_MILLISEC(portal->queryDesc->estate->es_jit->instr.generation_counter) > + > + > INSTR_TIME_GET_MILLISEC(portal->queryDesc->estate->es_jit->instr.inlining_counter) > + > + > INSTR_TIME_GET_MILLISEC(portal->queryDesc->estate->es_jit->instr.optimization_counter) > + > + > INSTR_TIME_GET_MILLISEC(portal->queryDesc->estate->es_jit->instr.emission_counter); > + > + if (jit_time > msecs * jit_warn_above_fraction) > + { > + ereport(WARNING, > + (errmsg("JIT time was %ld ms of %d > ms", > + jit_time, msecs))); > + } > + } > > > I think it should be a NOTICE (or less?) Hmm. I'm not so sure. Other similar parameters often use LOG, but the downside of that is that it won't be sent to the client. The problem with using NOTICE is that it won't go to the log by default. It needs to be at least warning to do that. > Is it more useful if this is applied combined with log_min_duration_statement > ? > > It's easy to imagine a query for which the planner computes a high cost, but > actually runs quickly. You might get a bunch of WARNINGs that the query took > 10 MS and JIT was 75% of that, even if you don't care about queries that take > less than 10 SEC. Yeah, that's definitely a problem. But which way would you want to tie it to log_min_duration_statement? That a statement would both have to take longer than log_min_duration_statement *and* have JIT above a certain percentage? In my experience that is instead likely to miss most of the interesting times. Maybe it would need a separate guc for the timing, but I'd like to avoid that, I'm not sure it's a function worth *that* much... > I should say that this is already available by processing the output of > autoexplain. True. You just can't trigger it based on it. (and it can be somewhat of a PITA to parse things out of auto_explain on busy systems, but it does give a lot of very useful details) Meanwhile here is an updated based on your other comments above, as well as those from Julien. -- Magnus Hagander Me: https://www.hagander.net/ Work: https://www.redpill-linpro.com/
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index 7ed8c82a9d..ec4308a480 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -6592,6 +6592,21 @@ local0.* /var/log/postgresql </listitem> </varlistentry> + <varlistentry id="guc-jit-warn-above-fraction" xreflabel="jit_warn_above_fraction"> + <term><varname>jit_warn_above_fraction</varname> (<type>floating point</type>) + <indexterm> + <primary><varname>jit_warn_above_fraction</varname> configuration parameter</primary> + </indexterm> + </term> + <listitem> + <para> + Causes a warning to be written to the log if the time spent on JIT (see <xref linkend="jit"/>) + goes above this fraction of the total query runtime. + A value of 0 (the default) disables the warning. + </para> + </listitem> + </varlistentry> + <varlistentry id="guc-log-startup-progress-interval" xreflabel="log_startup_progress_interval"> <term><varname>log_startup_progress_interval</varname> (<type>integer</type>) <indexterm> diff --git a/src/backend/tcop/fastpath.c b/src/backend/tcop/fastpath.c index d429aa4663..a1bff893a3 100644 --- a/src/backend/tcop/fastpath.c +++ b/src/backend/tcop/fastpath.c @@ -196,6 +196,7 @@ HandleFunctionRequest(StringInfo msgBuf) struct fp_info *fip; bool callit; bool was_logged = false; + int msecs; char msec_str[32]; /* @@ -305,7 +306,7 @@ HandleFunctionRequest(StringInfo msgBuf) /* * Emit duration logging if appropriate. */ - switch (check_log_duration(msec_str, was_logged)) + switch (check_log_duration(&msecs, msec_str, was_logged)) { case 1: ereport(LOG, diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c index d7e39aed64..fcf20428a0 100644 --- a/src/backend/tcop/postgres.c +++ b/src/backend/tcop/postgres.c @@ -1016,7 +1016,9 @@ exec_simple_query(const char *query_string) bool save_log_statement_stats = log_statement_stats; bool was_logged = false; bool use_implicit_block; + int msecs; char msec_str[32]; + int64 jit_time = 0; /* * Report query to various monitoring facilities. @@ -1275,6 +1277,16 @@ exec_simple_query(const char *query_string) receiver->rDestroy(receiver); + /* Collect JIT timings in case it's active */ + if (jit_enabled && jit_warn_above_fraction > 0 && portal->queryDesc && portal->queryDesc->estate->es_jit) + { + jit_time += + INSTR_TIME_GET_MILLISEC(portal->queryDesc->estate->es_jit->instr.generation_counter) + + INSTR_TIME_GET_MILLISEC(portal->queryDesc->estate->es_jit->instr.inlining_counter) + + INSTR_TIME_GET_MILLISEC(portal->queryDesc->estate->es_jit->instr.optimization_counter) + + INSTR_TIME_GET_MILLISEC(portal->queryDesc->estate->es_jit->instr.emission_counter); + } + PortalDrop(portal, false); if (lnext(parsetree_list, parsetree_item) == NULL) @@ -1345,7 +1357,7 @@ exec_simple_query(const char *query_string) /* * Emit duration logging if appropriate. */ - switch (check_log_duration(msec_str, was_logged)) + switch (check_log_duration(&msecs, msec_str, was_logged)) { case 1: ereport(LOG, @@ -1361,6 +1373,16 @@ exec_simple_query(const char *query_string) break; } + if (jit_enabled && jit_warn_above_fraction > 0) + { + if (jit_time > msecs * jit_warn_above_fraction) + { + ereport(WARNING, + (errmsg("JIT total processing time was %ld ms of %d ms", + jit_time, msecs))); + } + } + if (save_log_statement_stats) ShowUsage("QUERY STATISTICS"); @@ -1388,6 +1410,7 @@ exec_parse_message(const char *query_string, /* string to execute */ CachedPlanSource *psrc; bool is_named; bool save_log_statement_stats = log_statement_stats; + int msecs; char msec_str[32]; /* @@ -1581,7 +1604,7 @@ exec_parse_message(const char *query_string, /* string to execute */ /* * Emit duration logging if appropriate. */ - switch (check_log_duration(msec_str, false)) + switch (check_log_duration(&msecs, msec_str, false)) { case 1: ereport(LOG, @@ -1628,6 +1651,7 @@ exec_bind_message(StringInfo input_message) MemoryContext oldContext; bool save_log_statement_stats = log_statement_stats; bool snapshot_set = false; + int msecs; char msec_str[32]; ParamsErrorCbData params_data; ErrorContextCallback params_errcxt; @@ -2040,7 +2064,7 @@ exec_bind_message(StringInfo input_message) /* * Emit duration logging if appropriate. */ - switch (check_log_duration(msec_str, false)) + switch (check_log_duration(&msecs, msec_str, false)) { case 1: ereport(LOG, @@ -2086,6 +2110,7 @@ exec_execute_message(const char *portal_name, long max_rows) bool is_xact_command; bool execute_is_fetch; bool was_logged = false; + int msecs; char msec_str[32]; ParamsErrorCbData params_data; ErrorContextCallback params_errcxt; @@ -2277,7 +2302,7 @@ exec_execute_message(const char *portal_name, long max_rows) /* * Emit duration logging if appropriate. */ - switch (check_log_duration(msec_str, was_logged)) + switch (check_log_duration(&msecs, msec_str, was_logged)) { case 1: ereport(LOG, @@ -2300,6 +2325,22 @@ exec_execute_message(const char *portal_name, long max_rows) break; } + if (jit_enabled && jit_warn_above_fraction > 0) + { + int64 jit_time = + INSTR_TIME_GET_MILLISEC(portal->queryDesc->estate->es_jit->instr.generation_counter) + + INSTR_TIME_GET_MILLISEC(portal->queryDesc->estate->es_jit->instr.inlining_counter) + + INSTR_TIME_GET_MILLISEC(portal->queryDesc->estate->es_jit->instr.optimization_counter) + + INSTR_TIME_GET_MILLISEC(portal->queryDesc->estate->es_jit->instr.emission_counter); + + if (jit_time > msecs * jit_warn_above_fraction) + { + ereport(WARNING, + (errmsg("JIT total processing time was %ld ms of %d ms", + jit_time, msecs))); + } + } + if (save_log_statement_stats) ShowUsage("EXECUTE MESSAGE STATISTICS"); @@ -2346,6 +2387,10 @@ check_log_statement(List *stmt_list) * 1 if just the duration should be logged * 2 if duration and query details should be logged * + * The millisecond runtime of the query is stored in msecs if any logging parameter + * that requires the calculation of time is enabled, even if the result is not that + * the entry should be logged. In other cases, it is set to -1. + * * If logging is needed, the duration in msec is formatted into msec_str[], * which must be a 32-byte buffer. * @@ -2353,14 +2398,13 @@ check_log_statement(List *stmt_list) * essentially prevents 2 from being returned). */ int -check_log_duration(char *msec_str, bool was_logged) +check_log_duration(int *msecs, char *msec_str, bool was_logged) { if (log_duration || log_min_duration_sample >= 0 || - log_min_duration_statement >= 0 || xact_is_sampled) + log_min_duration_statement >= 0 || xact_is_sampled || jit_warn_above_fraction > 0) { long secs; int usecs; - int msecs; bool exceeded_duration; bool exceeded_sample_duration; bool in_sample = false; @@ -2368,7 +2412,7 @@ check_log_duration(char *msec_str, bool was_logged) TimestampDifference(GetCurrentStatementStartTimestamp(), GetCurrentTimestamp(), &secs, &usecs); - msecs = usecs / 1000; + *msecs = usecs / 1000; /* * This odd-looking test for log_min_duration_* being exceeded is @@ -2378,12 +2422,12 @@ check_log_duration(char *msec_str, bool was_logged) exceeded_duration = (log_min_duration_statement == 0 || (log_min_duration_statement > 0 && (secs > log_min_duration_statement / 1000 || - secs * 1000 + msecs >= log_min_duration_statement))); + secs * 1000 + *msecs >= log_min_duration_statement))); exceeded_sample_duration = (log_min_duration_sample == 0 || (log_min_duration_sample > 0 && (secs > log_min_duration_sample / 1000 || - secs * 1000 + msecs >= log_min_duration_sample))); + secs * 1000 + *msecs >= log_min_duration_sample))); /* * Do not log if log_statement_sample_rate = 0. Log a sample if @@ -2398,13 +2442,17 @@ check_log_duration(char *msec_str, bool was_logged) if (exceeded_duration || in_sample || log_duration || xact_is_sampled) { snprintf(msec_str, 32, "%ld.%03d", - secs * 1000 + msecs, usecs % 1000); + secs * 1000 + *msecs, usecs % 1000); if ((exceeded_duration || in_sample || xact_is_sampled) && !was_logged) return 2; else return 1; } } + else + { + *msecs = -1; + } return 0; } diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c index 6d11f9c71b..cefb05c653 100644 --- a/src/backend/utils/misc/guc.c +++ b/src/backend/utils/misc/guc.c @@ -608,6 +608,7 @@ int log_parameter_max_length_on_error = 0; int log_temp_files = -1; double log_statement_sample_rate = 1.0; double log_xact_sample_rate = 0; +double jit_warn_above_fraction = 0; int trace_recovery_messages = LOG; char *backtrace_functions; char *backtrace_symbol_list; @@ -3873,6 +3874,19 @@ static struct config_real ConfigureNamesReal[] = NULL, NULL, NULL }, + { + {"jit_warn_above_fraction", PGC_USERSET, LOGGING_WHEN, + gettext_noop("Sets the fraction of query time spent on JIT before writing " + "a warning to the log."), + gettext_noop("Write a message to the server log if more than this " + "fraction of the query runtime is spent on JIT. " + "Zero turns off the warning.") + }, + &jit_warn_above_fraction, + 0.0, 0.0, 1.0, + NULL, NULL, NULL + }, + /* End-of-list marker */ { {NULL, 0, 0, NULL, NULL}, NULL, 0.0, 0.0, 0.0, NULL, NULL, NULL diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample index 4a094bb38b..19795e5398 100644 --- a/src/backend/utils/misc/postgresql.conf.sample +++ b/src/backend/utils/misc/postgresql.conf.sample @@ -404,6 +404,8 @@ #jit_optimize_above_cost = 500000 # use expensive JIT optimizations if # query is more expensive than this; # -1 disables +#jit_warn_above_fraction = 0 # warn if JIT is more than this fraction of + # query runtime. 0 disables. # - Genetic Query Optimizer - diff --git a/src/include/jit/jit.h b/src/include/jit/jit.h index 707176d9ed..59b37d6dc0 100644 --- a/src/include/jit/jit.h +++ b/src/include/jit/jit.h @@ -89,6 +89,7 @@ extern bool jit_tuple_deforming; extern double jit_above_cost; extern double jit_inline_above_cost; extern double jit_optimize_above_cost; +extern double jit_warn_above_fraction; extern void jit_reset_after_error(void); diff --git a/src/include/tcop/tcopprot.h b/src/include/tcop/tcopprot.h index 92291a750d..3f12f7fe25 100644 --- a/src/include/tcop/tcopprot.h +++ b/src/include/tcop/tcopprot.h @@ -87,7 +87,7 @@ extern void PostgresMain(const char *dbname, extern long get_stack_depth_rlimit(void); extern void ResetUsage(void); extern void ShowUsage(const char *title); -extern int check_log_duration(char *msec_str, bool was_logged); +extern int check_log_duration(int *msec, char *msec_str, bool was_logged); extern void set_debug_options(int debug_flag, GucContext context, GucSource source); extern bool set_plan_disabling_options(const char *arg,