On 2018-09-18 10:03:02 +0530, Amit Khandekar wrote:
> Attached v3 patch that does the above change.

Attached is a revised version of that patch.  I've changed quite a few
things:
- I've reverted the split of "base" and "provider specific" contexts - I
  don't think it really buys us anything here.

- I've reverted the context creation changes - instead of creating a
  context in the leader just to store instrumentation in the worker,
  there's now a new EState->es_jit_combined_instr.

- That also means worker instrumentation doesn't get folded into the
  leader's instrumentation. This seems good for the future and for
  extensions - it's not actually "linear" time that's spent doing
  JIT in workers (& leader), as all of that work happens in
  parallel. Being able to disentangle that seems important.

- Only allocate / transport JIT instrumentation if instrumentation is
  enabled.

- Smaller cleanups.

Forcing parallelism and JIT to be on, I get:

postgres[20216][1]=# EXPLAIN (ANALYZE, VERBOSE, BUFFERS) SELECT count(*) FROM 
pg_class;

Finalize Aggregate  (cost=15.43..15.44 rows=1 width=8) (actual 
time=183.282..183.282 rows=1 loops=1)
  Output: count(*)
  Buffers: shared hit=13
  ->  Gather  (cost=15.41..15.42 rows=2 width=8) (actual time=152.835..152.904 
rows=2 loops=1)
        Output: (PARTIAL count(*))
        Workers Planned: 2
        Workers Launched: 2
        JIT for worker 0:
          Functions: 2
          Options: Inlining true, Optimization true, Expressions true, 
Deforming true
          Timing: Generation 0.480 ms, Inlining 78.789 ms, Optimization 5.797 
ms, Emission 5.554 ms, Total 90.620 ms
        JIT for worker 1:
          Functions: 2
          Options: Inlining true, Optimization true, Expressions true, 
Deforming true
          Timing: Generation 0.475 ms, Inlining 78.632 ms, Optimization 5.954 
ms, Emission 5.900 ms, Total 90.962 ms
        Buffers: shared hit=13
        ->  Partial Aggregate  (cost=15.41..15.42 rows=1 width=8) (actual 
time=90.608..90.609 rows=1 loops=2)
              Output: PARTIAL count(*)
              Buffers: shared hit=13
              Worker 0: actual time=90.426..90.426 rows=1 loops=1
                Buffers: shared hit=7
              Worker 1: actual time=90.791..90.792 rows=1 loops=1
                Buffers: shared hit=6
              ->  Parallel Seq Scan on pg_catalog.pg_class  (cost=0.00..14.93 
rows=193 width=0) (actual time=0.006..0.048 rows=193 loops=2)
                    Output: relname, relnamespace, reltype, reloftype, 
relowner, relam, relfilenode, reltablespace, relpages, reltuples, 
relallvisible, reltoastrelid, relhasindex, relisshared, relpersistence, 
relkind, relnatts, relchecks, relhasoids, relhasrules, relhastriggers, 
relhassubclass, relrowsecurity, relfo
                    Buffers: shared hit=13
                    Worker 0: actual time=0.006..0.047 rows=188 loops=1
                      Buffers: shared hit=7
                    Worker 1: actual time=0.006..0.048 rows=198 loops=1
                      Buffers: shared hit=6
Planning Time: 0.152 ms
JIT:
  Functions: 4
  Options: Inlining true, Optimization true, Expressions true, Deforming true
  Timing: Generation 0.955 ms, Inlining 157.422 ms, Optimization 11.751 ms, 
Emission 11.454 ms, Total 181.582 ms
Execution Time: 184.197 ms


Instead of "JIT for worker 0" we could instead do "Worker 0: JIT", but
I'm not sure that's better, given that the JIT group is multi-line
output.

Currently structured formats show this as:
   "JIT": {
     "Worker Number": 1,
     "Functions": 2,
     "Options": {
       "Inlining": true,
       "Optimization": true,
       "Expressions": true,
       "Deforming": true
     },
     "Timing": {
       "Generation": 0.374,
       "Inlining": 70.341,
       "Optimization": 8.006,
       "Emission": 7.670,
       "Total": 86.390
     }
   },


This needs a bit more polish tomorrow, but I'm starting to like where
this is going.  Comments?

Greetings,

Andres Freund
>From dd7d504df9fca0ee0cc0a48930980cda2ad8be1d Mon Sep 17 00:00:00 2001
From: Andres Freund <and...@anarazel.de>
Date: Tue, 25 Sep 2018 01:46:34 -0700
Subject: [PATCH] Collect JIT instrumentation from workers.

Author: Amit Khandekar and Andres Freund
Reviewed-By:
Discussion: https://postgr.es/m/caj3gd9elrz51rk_gtkod+71idcjpb_n8ec6vu2aw-vicsae...@mail.gmail.com
Backpatch: 11-
---
 contrib/auto_explain/auto_explain.c |  6 +-
 src/backend/commands/explain.c      | 87 +++++++++++++++++++----------
 src/backend/executor/execMain.c     | 15 +++++
 src/backend/executor/execParallel.c | 82 +++++++++++++++++++++++++++
 src/backend/jit/jit.c               | 11 ++++
 src/backend/jit/llvm/llvmjit.c      | 14 ++---
 src/backend/jit/llvm/llvmjit_expr.c |  2 +-
 src/include/commands/explain.h      |  3 +-
 src/include/executor/execParallel.h |  1 +
 src/include/jit/jit.h               | 27 +++++++--
 src/include/nodes/execnodes.h       |  8 +++
 11 files changed, 209 insertions(+), 47 deletions(-)

diff --git a/contrib/auto_explain/auto_explain.c b/contrib/auto_explain/auto_explain.c
index 0c0eb3fb9e0..a19c1d2a3c3 100644
--- a/contrib/auto_explain/auto_explain.c
+++ b/contrib/auto_explain/auto_explain.c
@@ -362,9 +362,9 @@ explain_ExecutorEnd(QueryDesc *queryDesc)
 			ExplainPrintPlan(es, queryDesc);
 			if (es->analyze && auto_explain_log_triggers)
 				ExplainPrintTriggers(es, queryDesc);
-			if (queryDesc->estate->es_jit && es->costs &&
-				queryDesc->estate->es_jit->created_functions > 0)
-				ExplainPrintJIT(es, queryDesc);
+			if (es->costs)
+				ExplainPrintJIT(es, queryDesc->estate->es_jit_flags,
+								queryDesc->estate->es_jit_combined_instr, -1);
 			ExplainEndOutput(es);
 
 			/* Remove last line break */
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 01520154e48..2ecd278f7e3 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -563,9 +563,9 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 	 * depending on build options.  Might want to separate that out from COSTS
 	 * at a later stage.
 	 */
-	if (queryDesc->estate->es_jit && es->costs &&
-		queryDesc->estate->es_jit->created_functions > 0)
-		ExplainPrintJIT(es, queryDesc);
+	if (es->costs)
+		ExplainPrintJIT(es, queryDesc->estate->es_jit_flags,
+						queryDesc->estate->es_jit_combined_instr, -1);
 
 	/*
 	 * Close down the query and free resources.  Include time for this in the
@@ -690,20 +690,26 @@ ExplainPrintTriggers(ExplainState *es, QueryDesc *queryDesc)
 
 /*
  * ExplainPrintJIT -
- *	  Append information about JITing to es->str.
+ *	  Append information about JITing to es->str. Can be used to print
+ *    the JIT instrumentation of the backend or that of a specific worker.
  */
 void
-ExplainPrintJIT(ExplainState *es, QueryDesc *queryDesc)
+ExplainPrintJIT(ExplainState *es, int jit_flags,
+				JitInstrumentation *ji, int worker_num)
 {
-	JitContext *jc = queryDesc->estate->es_jit;
 	instr_time	total_time;
+	bool		for_workers = (worker_num >= 0);
+
+	/* don't print information if no JITing happened */
+	if (!ji || ji->created_functions == 0)
+		return;
 
 	/* calculate total time */
 	INSTR_TIME_SET_ZERO(total_time);
-	INSTR_TIME_ADD(total_time, jc->generation_counter);
-	INSTR_TIME_ADD(total_time, jc->inlining_counter);
-	INSTR_TIME_ADD(total_time, jc->optimization_counter);
-	INSTR_TIME_ADD(total_time, jc->emission_counter);
+	INSTR_TIME_ADD(total_time, ji->generation_counter);
+	INSTR_TIME_ADD(total_time, ji->inlining_counter);
+	INSTR_TIME_ADD(total_time, ji->optimization_counter);
+	INSTR_TIME_ADD(total_time, ji->emission_counter);
 
 	ExplainOpenGroup("JIT", "JIT", true, es);
 
@@ -711,27 +717,30 @@ ExplainPrintJIT(ExplainState *es, QueryDesc *queryDesc)
 	if (es->format == EXPLAIN_FORMAT_TEXT)
 	{
 		appendStringInfoSpaces(es->str, es->indent * 2);
-		appendStringInfo(es->str, "JIT:\n");
+		if (for_workers)
+			appendStringInfo(es->str, "JIT for worker %u:\n", worker_num);
+		else
+			appendStringInfo(es->str, "JIT:\n");
 		es->indent += 1;
 
-		ExplainPropertyInteger("Functions", NULL, jc->created_functions, es);
+		ExplainPropertyInteger("Functions", NULL, ji->created_functions, es);
 
 		appendStringInfoSpaces(es->str, es->indent * 2);
 		appendStringInfo(es->str, "Options: %s %s, %s %s, %s %s, %s %s\n",
-						 "Inlining", jc->flags & PGJIT_INLINE ? "true" : "false",
-						 "Optimization", jc->flags & PGJIT_OPT3 ? "true" : "false",
-						 "Expressions", jc->flags & PGJIT_EXPR ? "true" : "false",
-						 "Deforming", jc->flags & PGJIT_DEFORM ? "true" : "false");
+						 "Inlining", jit_flags & PGJIT_INLINE ? "true" : "false",
+						 "Optimization", jit_flags & PGJIT_OPT3 ? "true" : "false",
+						 "Expressions", jit_flags & PGJIT_EXPR ? "true" : "false",
+						 "Deforming", jit_flags & PGJIT_DEFORM ? "true" : "false");
 
 		if (es->analyze && es->timing)
 		{
 			appendStringInfoSpaces(es->str, es->indent * 2);
 			appendStringInfo(es->str,
 							 "Timing: %s %.3f ms, %s %.3f ms, %s %.3f ms, %s %.3f ms, %s %.3f ms\n",
-							 "Generation", 1000.0 * INSTR_TIME_GET_DOUBLE(jc->generation_counter),
-							 "Inlining", 1000.0 * INSTR_TIME_GET_DOUBLE(jc->inlining_counter),
-							 "Optimization", 1000.0 * INSTR_TIME_GET_DOUBLE(jc->optimization_counter),
-							 "Emission", 1000.0 * INSTR_TIME_GET_DOUBLE(jc->emission_counter),
+							 "Generation", 1000.0 * INSTR_TIME_GET_DOUBLE(ji->generation_counter),
+							 "Inlining", 1000.0 * INSTR_TIME_GET_DOUBLE(ji->inlining_counter),
+							 "Optimization", 1000.0 * INSTR_TIME_GET_DOUBLE(ji->optimization_counter),
+							 "Emission", 1000.0 * INSTR_TIME_GET_DOUBLE(ji->emission_counter),
 							 "Total", 1000.0 * INSTR_TIME_GET_DOUBLE(total_time));
 		}
 
@@ -739,13 +748,14 @@ ExplainPrintJIT(ExplainState *es, QueryDesc *queryDesc)
 	}
 	else
 	{
-		ExplainPropertyInteger("Functions", NULL, jc->created_functions, es);
+		ExplainPropertyInteger("Worker Number", NULL, worker_num, es);
+		ExplainPropertyInteger("Functions", NULL, ji->created_functions, es);
 
 		ExplainOpenGroup("Options", "Options", true, es);
-		ExplainPropertyBool("Inlining", jc->flags & PGJIT_INLINE, es);
-		ExplainPropertyBool("Optimization", jc->flags & PGJIT_OPT3, es);
-		ExplainPropertyBool("Expressions", jc->flags & PGJIT_EXPR, es);
-		ExplainPropertyBool("Deforming", jc->flags & PGJIT_DEFORM, es);
+		ExplainPropertyBool("Inlining", jit_flags & PGJIT_INLINE, es);
+		ExplainPropertyBool("Optimization", jit_flags & PGJIT_OPT3, es);
+		ExplainPropertyBool("Expressions", jit_flags & PGJIT_EXPR, es);
+		ExplainPropertyBool("Deforming", jit_flags & PGJIT_DEFORM, es);
 		ExplainCloseGroup("Options", "Options", true, es);
 
 		if (es->analyze && es->timing)
@@ -753,16 +763,16 @@ ExplainPrintJIT(ExplainState *es, QueryDesc *queryDesc)
 			ExplainOpenGroup("Timing", "Timing", true, es);
 
 			ExplainPropertyFloat("Generation", "ms",
-								 1000.0 * INSTR_TIME_GET_DOUBLE(jc->generation_counter),
+								 1000.0 * INSTR_TIME_GET_DOUBLE(ji->generation_counter),
 								 3, es);
 			ExplainPropertyFloat("Inlining", "ms",
-								 1000.0 * INSTR_TIME_GET_DOUBLE(jc->inlining_counter),
+								 1000.0 * INSTR_TIME_GET_DOUBLE(ji->inlining_counter),
 								 3, es);
 			ExplainPropertyFloat("Optimization", "ms",
-								 1000.0 * INSTR_TIME_GET_DOUBLE(jc->optimization_counter),
+								 1000.0 * INSTR_TIME_GET_DOUBLE(ji->optimization_counter),
 								 3, es);
 			ExplainPropertyFloat("Emission", "ms",
-								 1000.0 * INSTR_TIME_GET_DOUBLE(jc->emission_counter),
+								 1000.0 * INSTR_TIME_GET_DOUBLE(ji->emission_counter),
 								 3, es);
 			ExplainPropertyFloat("Total", "ms",
 								 1000.0 * INSTR_TIME_GET_DOUBLE(total_time),
@@ -1554,6 +1564,25 @@ ExplainNode(PlanState *planstate, List *ancestors,
 					ExplainPropertyInteger("Workers Launched", NULL,
 										   nworkers, es);
 				}
+
+				/*
+				 * Print per-worker Jit instrumentation. Use same conditions
+				 * as for the leader's JIT instrumentation, see comment there.
+				 */
+				if (es->costs && es->verbose &&
+					outerPlanState(planstate)->worker_jit_instrument)
+				{
+					PlanState *child = outerPlanState(planstate);
+					int			n;
+					SharedJitInstrumentation *w = child->worker_jit_instrument;
+
+					for (n = 0; n < w->num_workers; ++n)
+					{
+						ExplainPrintJIT(es, child->state->es_jit_flags,
+										&w->jit_instr[n], n);
+					}
+				}
+
 				if (gather->single_copy || es->format != EXPLAIN_FORMAT_TEXT)
 					ExplainPropertyBool("Single Copy", gather->single_copy, es);
 			}
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 85d980356b7..e36f58a0151 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -48,6 +48,7 @@
 #include "executor/execdebug.h"
 #include "executor/nodeSubplan.h"
 #include "foreign/fdwapi.h"
+#include "jit/jit.h"
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
 #include "optimizer/clauses.h"
@@ -494,6 +495,20 @@ standard_ExecutorEnd(QueryDesc *queryDesc)
 
 	ExecEndPlan(queryDesc->planstate, estate);
 
+	/*
+	 * If this process has done JIT, either merge stats into worker stats, or
+	 * this process' stats as the global stats if no workers did JIT.
+	 */
+	if (estate->es_instrument && queryDesc->estate->es_jit)
+	{
+		if (queryDesc->estate->es_jit_combined_instr)
+			InstrJitAgg(queryDesc->estate->es_jit_combined_instr,
+						&queryDesc->estate->es_jit->instr);
+		else
+			queryDesc->estate->es_jit_combined_instr =
+				&queryDesc->estate->es_jit->instr;
+	}
+
 	/* do away with our snapshots */
 	UnregisterSnapshot(estate->es_snapshot);
 	UnregisterSnapshot(estate->es_crosscheck_snapshot);
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index c93084e4d2a..84ee0098c30 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -37,6 +37,7 @@
 #include "executor/nodeSort.h"
 #include "executor/nodeSubplan.h"
 #include "executor/tqueue.h"
+#include "jit/jit.h"
 #include "nodes/nodeFuncs.h"
 #include "optimizer/planmain.h"
 #include "optimizer/planner.h"
@@ -62,6 +63,7 @@
 #define PARALLEL_KEY_INSTRUMENTATION	UINT64CONST(0xE000000000000006)
 #define PARALLEL_KEY_DSA				UINT64CONST(0xE000000000000007)
 #define PARALLEL_KEY_QUERY_TEXT		UINT64CONST(0xE000000000000008)
+#define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
 
 #define PARALLEL_TUPLE_QUEUE_SIZE		65536
 
@@ -573,9 +575,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	char	   *paramlistinfo_space;
 	BufferUsage *bufusage_space;
 	SharedExecutorInstrumentation *instrumentation = NULL;
+	SharedJitInstrumentation *jit_instrumentation = NULL;
 	int			pstmt_len;
 	int			paramlistinfo_len;
 	int			instrumentation_len = 0;
+	int			jit_instrumentation_len;
 	int			instrument_offset = 0;
 	Size		dsa_minsize = dsa_minimum_size();
 	char	   *query_string;
@@ -669,6 +673,17 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 					 mul_size(e.nnodes, nworkers));
 		shm_toc_estimate_chunk(&pcxt->estimator, instrumentation_len);
 		shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+		/* Estimate space for JIT instrumentation, if required. */
+		if (estate->es_jit_flags != PGJIT_NONE)
+		{
+			jit_instrumentation_len =
+				offsetof(SharedJitInstrumentation, jit_instr) +
+				sizeof(JitInstrumentation) * nworkers;
+			jit_instrumentation_len = MAXALIGN(jit_instrumentation_len);
+			shm_toc_estimate_chunk(&pcxt->estimator, jit_instrumentation_len);
+			shm_toc_estimate_keys(&pcxt->estimator, 1);
+		}
 	}
 
 	/* Estimate space for DSA area. */
@@ -742,6 +757,18 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 		shm_toc_insert(pcxt->toc, PARALLEL_KEY_INSTRUMENTATION,
 					   instrumentation);
 		pei->instrumentation = instrumentation;
+
+		if (estate->es_jit_flags != PGJIT_NONE)
+		{
+			jit_instrumentation = shm_toc_allocate(pcxt->toc,
+												   jit_instrumentation_len);
+			jit_instrumentation->num_workers = nworkers;
+			memset(jit_instrumentation->jit_instr, 0,
+				   sizeof(JitInstrumentation) * nworkers);
+			shm_toc_insert(pcxt->toc, PARALLEL_KEY_JIT_INSTRUMENTATION,
+						   jit_instrumentation);
+			pei->jit_instrumentation = jit_instrumentation;
+		}
 	}
 
 	/*
@@ -1003,6 +1030,45 @@ ExecParallelRetrieveInstrumentation(PlanState *planstate,
 								 instrumentation);
 }
 
+/*
+ * Add up the workers' JIT instrumentation from dynamic shared memory.
+ */
+static void
+ExecParallelRetrieveJitInstrumentation(PlanState *planstate,
+									   SharedJitInstrumentation *shared_jit)
+{
+	JitInstrumentation *combined;
+	int			ibytes;
+
+	int			n;
+
+	/*
+	 * Accumulate worker JIT instrumentation into the combined JIT, allocating
+	 * it if required. Note this is separate from the leader instrumentation.
+	 */
+	if (!planstate->state->es_jit_combined_instr)
+		planstate->state->es_jit_combined_instr =
+			MemoryContextAllocZero(planstate->state->es_query_cxt, sizeof(JitInstrumentation));
+	combined = planstate->state->es_jit_combined_instr;
+
+	/* Accummulate all the workers' instrumentations. */
+	for (n = 0; n < shared_jit->num_workers; ++n)
+		InstrJitAgg(combined, &shared_jit->jit_instr[n]);
+
+	/*
+	 * Store the per-worker detail.
+	 *
+	 * Similar to ExecParallelRetrieveInstrumentation(), allocate the
+	 * instrumentation in per-query context.
+	 */
+	ibytes = offsetof(SharedJitInstrumentation, jit_instr)
+			 + mul_size(shared_jit->num_workers, sizeof(JitInstrumentation));
+	planstate->worker_jit_instrument =
+		MemoryContextAlloc(planstate->state->es_query_cxt, ibytes);
+
+	memcpy(planstate->worker_jit_instrument, shared_jit, ibytes);
+}
+
 /*
  * Finish parallel execution.  We wait for parallel workers to finish, and
  * accumulate their buffer usage.
@@ -1068,6 +1134,11 @@ ExecParallelCleanup(ParallelExecutorInfo *pei)
 		ExecParallelRetrieveInstrumentation(pei->planstate,
 											pei->instrumentation);
 
+	/* Accumulate JIT instrumentation, if any. */
+	if (pei->jit_instrumentation)
+		ExecParallelRetrieveJitInstrumentation(pei->planstate,
+											pei->jit_instrumentation);
+
 	/* Free any serialized parameters. */
 	if (DsaPointerIsValid(pei->param_exec))
 	{
@@ -1274,6 +1345,7 @@ ParallelQueryMain(dsm_segment *seg, shm_toc *toc)
 	DestReceiver *receiver;
 	QueryDesc  *queryDesc;
 	SharedExecutorInstrumentation *instrumentation;
+	SharedJitInstrumentation *jit_instrumentation;
 	int			instrument_options = 0;
 	void	   *area_space;
 	dsa_area   *area;
@@ -1287,6 +1359,8 @@ ParallelQueryMain(dsm_segment *seg, shm_toc *toc)
 	instrumentation = shm_toc_lookup(toc, PARALLEL_KEY_INSTRUMENTATION, true);
 	if (instrumentation != NULL)
 		instrument_options = instrumentation->instrument_options;
+	jit_instrumentation = shm_toc_lookup(toc, PARALLEL_KEY_JIT_INSTRUMENTATION,
+										 true);
 	queryDesc = ExecParallelGetQueryDesc(toc, receiver, instrument_options);
 
 	/* Setting debug_query_string for individual workers */
@@ -1350,6 +1424,14 @@ ParallelQueryMain(dsm_segment *seg, shm_toc *toc)
 		ExecParallelReportInstrumentation(queryDesc->planstate,
 										  instrumentation);
 
+	/* Report JIT instrumentation data if any */
+	if (queryDesc->estate->es_jit && jit_instrumentation != NULL)
+	{
+		Assert(ParallelWorkerNumber < jit_instrumentation->num_workers);
+		jit_instrumentation->jit_instr[ParallelWorkerNumber] =
+			queryDesc->estate->es_jit->instr;
+	}
+
 	/* Must do this after capturing instrumentation. */
 	ExecutorEnd(queryDesc);
 
diff --git a/src/backend/jit/jit.c b/src/backend/jit/jit.c
index 7d3883d3bf4..5d1f2e57be1 100644
--- a/src/backend/jit/jit.c
+++ b/src/backend/jit/jit.c
@@ -182,6 +182,17 @@ jit_compile_expr(struct ExprState *state)
 	return false;
 }
 
+/* Aggregate JIT instrumentation information */
+void
+InstrJitAgg(JitInstrumentation *dst, JitInstrumentation *add)
+{
+	dst->created_functions += add->created_functions;
+	INSTR_TIME_ADD(dst->generation_counter, add->generation_counter);
+	INSTR_TIME_ADD(dst->inlining_counter, add->inlining_counter);
+	INSTR_TIME_ADD(dst->optimization_counter, add->optimization_counter);
+	INSTR_TIME_ADD(dst->emission_counter, add->emission_counter);
+}
+
 static bool
 file_exists(const char *name)
 {
diff --git a/src/backend/jit/llvm/llvmjit.c b/src/backend/jit/llvm/llvmjit.c
index 640c27fc408..7510698f863 100644
--- a/src/backend/jit/llvm/llvmjit.c
+++ b/src/backend/jit/llvm/llvmjit.c
@@ -224,7 +224,7 @@ llvm_expand_funcname(struct LLVMJitContext *context, const char *basename)
 {
 	Assert(context->module != NULL);
 
-	context->base.created_functions++;
+	context->base.instr.created_functions++;
 
 	/*
 	 * Previously we used dots to separate, but turns out some tools, e.g.
@@ -504,7 +504,7 @@ llvm_compile_module(LLVMJitContext *context)
 		INSTR_TIME_SET_CURRENT(starttime);
 		llvm_inline(context->module);
 		INSTR_TIME_SET_CURRENT(endtime);
-		INSTR_TIME_ACCUM_DIFF(context->base.inlining_counter,
+		INSTR_TIME_ACCUM_DIFF(context->base.instr.inlining_counter,
 							  endtime, starttime);
 	}
 
@@ -524,7 +524,7 @@ llvm_compile_module(LLVMJitContext *context)
 	INSTR_TIME_SET_CURRENT(starttime);
 	llvm_optimize_module(context, context->module);
 	INSTR_TIME_SET_CURRENT(endtime);
-	INSTR_TIME_ACCUM_DIFF(context->base.optimization_counter,
+	INSTR_TIME_ACCUM_DIFF(context->base.instr.optimization_counter,
 						  endtime, starttime);
 
 	if (jit_dump_bitcode)
@@ -575,7 +575,7 @@ llvm_compile_module(LLVMJitContext *context)
 	}
 #endif
 	INSTR_TIME_SET_CURRENT(endtime);
-	INSTR_TIME_ACCUM_DIFF(context->base.emission_counter,
+	INSTR_TIME_ACCUM_DIFF(context->base.instr.emission_counter,
 						  endtime, starttime);
 
 	context->module = NULL;
@@ -596,9 +596,9 @@ llvm_compile_module(LLVMJitContext *context)
 
 	ereport(DEBUG1,
 			(errmsg("time to inline: %.3fs, opt: %.3fs, emit: %.3fs",
-					INSTR_TIME_GET_DOUBLE(context->base.inlining_counter),
-					INSTR_TIME_GET_DOUBLE(context->base.optimization_counter),
-					INSTR_TIME_GET_DOUBLE(context->base.emission_counter)),
+					INSTR_TIME_GET_DOUBLE(context->base.instr.inlining_counter),
+					INSTR_TIME_GET_DOUBLE(context->base.instr.optimization_counter),
+					INSTR_TIME_GET_DOUBLE(context->base.instr.emission_counter)),
 			 errhidestmt(true),
 			 errhidecontext(true)));
 }
diff --git a/src/backend/jit/llvm/llvmjit_expr.c b/src/backend/jit/llvm/llvmjit_expr.c
index 0f3109334e8..7454d05acaf 100644
--- a/src/backend/jit/llvm/llvmjit_expr.c
+++ b/src/backend/jit/llvm/llvmjit_expr.c
@@ -2557,7 +2557,7 @@ llvm_compile_expr(ExprState *state)
 	llvm_leave_fatal_on_oom();
 
 	INSTR_TIME_SET_CURRENT(endtime);
-	INSTR_TIME_ACCUM_DIFF(context->base.generation_counter,
+	INSTR_TIME_ACCUM_DIFF(context->base.instr.generation_counter,
 						  endtime, starttime);
 
 	return true;
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 9b75baae6e6..b0b6f1e15a4 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -81,7 +81,8 @@ extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
 extern void ExplainPrintPlan(ExplainState *es, QueryDesc *queryDesc);
 extern void ExplainPrintTriggers(ExplainState *es, QueryDesc *queryDesc);
 
-extern void ExplainPrintJIT(ExplainState *es, QueryDesc *queryDesc);
+extern void ExplainPrintJIT(ExplainState *es, int jit_flags,
+				struct JitInstrumentation *jit_instr, int worker_i);
 
 extern void ExplainQueryText(ExplainState *es, QueryDesc *queryDesc);
 
diff --git a/src/include/executor/execParallel.h b/src/include/executor/execParallel.h
index 626a66c27a0..1b4c35e5f14 100644
--- a/src/include/executor/execParallel.h
+++ b/src/include/executor/execParallel.h
@@ -27,6 +27,7 @@ typedef struct ParallelExecutorInfo
 	ParallelContext *pcxt;		/* parallel context we're using */
 	BufferUsage *buffer_usage;	/* points to bufusage area in DSM */
 	SharedExecutorInstrumentation *instrumentation; /* optional */
+	struct SharedJitInstrumentation *jit_instrumentation; /* optional */
 	dsa_area   *area;			/* points to DSA area in DSM */
 	dsa_pointer param_exec;		/* serialized PARAM_EXEC parameters */
 	bool		finished;		/* set true by ExecParallelFinish */
diff --git a/src/include/jit/jit.h b/src/include/jit/jit.h
index b451f4027e7..6cb3bdc89f0 100644
--- a/src/include/jit/jit.h
+++ b/src/include/jit/jit.h
@@ -24,13 +24,8 @@
 #define PGJIT_DEFORM   (1 << 4)
 
 
-typedef struct JitContext
+typedef struct JitInstrumentation
 {
-	/* see PGJIT_* above */
-	int			flags;
-
-	ResourceOwner resowner;
-
 	/* number of emitted functions */
 	size_t		created_functions;
 
@@ -45,6 +40,25 @@ typedef struct JitContext
 
 	/* accumulated time for code emission */
 	instr_time	emission_counter;
+} JitInstrumentation;
+
+/*
+ * DSM structure for accumulating jit instrumentation of all workers.
+ */
+typedef struct SharedJitInstrumentation
+{
+	int			num_workers;
+	JitInstrumentation jit_instr[FLEXIBLE_ARRAY_MEMBER];
+} SharedJitInstrumentation;
+
+typedef struct JitContext
+{
+	/* see PGJIT_* above */
+	int			flags;
+
+	ResourceOwner resowner;
+
+	JitInstrumentation instr;
 } JitContext;
 
 typedef struct JitProviderCallbacks JitProviderCallbacks;
@@ -85,6 +99,7 @@ extern void jit_release_context(JitContext *context);
  * not be able to perform JIT (i.e. return false).
  */
 extern bool jit_compile_expr(struct ExprState *state);
+extern void InstrJitAgg(JitInstrumentation *dst, JitInstrumentation *add);
 
 
 #endif							/* JIT_H */
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 687d7cd2f41..03ad5169762 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -569,9 +569,14 @@ typedef struct EState
 	 * JIT information. es_jit_flags indicates whether JIT should be performed
 	 * and with which options.  es_jit is created on-demand when JITing is
 	 * performed.
+	 *
+	 * es_jit_combined_instr, at the end of query execution with
+	 * instrumentation enabled, is the the combined instrumentation
+	 * information of leader and followers.
 	 */
 	int			es_jit_flags;
 	struct JitContext *es_jit;
+	struct JitInstrumentation *es_jit_combined_instr;
 } EState;
 
 
@@ -923,6 +928,9 @@ typedef struct PlanState
 	Instrumentation *instrument;	/* Optional runtime stats for this node */
 	WorkerInstrumentation *worker_instrument;	/* per-worker instrumentation */
 
+	/* Per-worker JIT instrumentation */
+	struct SharedJitInstrumentation *worker_jit_instrument;
+
 	/*
 	 * Common structural data for all Plan types.  These links to subsidiary
 	 * state trees parallel links in the associated plan tree (except for the
-- 
2.18.0.rc2.dirty

Reply via email to