On 2018-09-18 10:03:02 +0530, Amit Khandekar wrote:
> Attached v3 patch that does the above change.
Attached is a revised version of that patch. I've changed quite a few
things:
- I've reverted the split of "base" and "provider specific" contexts - I
don't think it really buys us anything here.
- I've reverted the context creation changes - instead of creating a
context in the leader just to store instrumentation in the worker,
there's now a new EState->es_jit_combined_instr.
- That also means worker instrumentation doesn't get folded into the
leader's instrumentation. This seems good for the future and for
extensions - it's not actually "linear" time that's spent doing
JIT in workers (& leader), as all of that work happens in
parallel. Being able to disentangle that seems important.
- Only allocate / transport JIT instrumentation if instrumentation is
enabled.
- Smaller cleanups.
Forcing parallelism and JIT to be on, I get:
postgres[20216][1]=# EXPLAIN (ANALYZE, VERBOSE, BUFFERS) SELECT count(*) FROM
pg_class;
Finalize Aggregate (cost=15.43..15.44 rows=1 width=8) (actual
time=183.282..183.282 rows=1 loops=1)
Output: count(*)
Buffers: shared hit=13
-> Gather (cost=15.41..15.42 rows=2 width=8) (actual time=152.835..152.904
rows=2 loops=1)
Output: (PARTIAL count(*))
Workers Planned: 2
Workers Launched: 2
JIT for worker 0:
Functions: 2
Options: Inlining true, Optimization true, Expressions true,
Deforming true
Timing: Generation 0.480 ms, Inlining 78.789 ms, Optimization 5.797
ms, Emission 5.554 ms, Total 90.620 ms
JIT for worker 1:
Functions: 2
Options: Inlining true, Optimization true, Expressions true,
Deforming true
Timing: Generation 0.475 ms, Inlining 78.632 ms, Optimization 5.954
ms, Emission 5.900 ms, Total 90.962 ms
Buffers: shared hit=13
-> Partial Aggregate (cost=15.41..15.42 rows=1 width=8) (actual
time=90.608..90.609 rows=1 loops=2)
Output: PARTIAL count(*)
Buffers: shared hit=13
Worker 0: actual time=90.426..90.426 rows=1 loops=1
Buffers: shared hit=7
Worker 1: actual time=90.791..90.792 rows=1 loops=1
Buffers: shared hit=6
-> Parallel Seq Scan on pg_catalog.pg_class (cost=0.00..14.93
rows=193 width=0) (actual time=0.006..0.048 rows=193 loops=2)
Output: relname, relnamespace, reltype, reloftype,
relowner, relam, relfilenode, reltablespace, relpages, reltuples,
relallvisible, reltoastrelid, relhasindex, relisshared, relpersistence,
relkind, relnatts, relchecks, relhasoids, relhasrules, relhastriggers,
relhassubclass, relrowsecurity, relfo
Buffers: shared hit=13
Worker 0: actual time=0.006..0.047 rows=188 loops=1
Buffers: shared hit=7
Worker 1: actual time=0.006..0.048 rows=198 loops=1
Buffers: shared hit=6
Planning Time: 0.152 ms
JIT:
Functions: 4
Options: Inlining true, Optimization true, Expressions true, Deforming true
Timing: Generation 0.955 ms, Inlining 157.422 ms, Optimization 11.751 ms,
Emission 11.454 ms, Total 181.582 ms
Execution Time: 184.197 ms
Instead of "JIT for worker 0" we could instead do "Worker 0: JIT", but
I'm not sure that's better, given that the JIT group is multi-line
output.
Currently structured formats show this as:
"JIT": {
"Worker Number": 1,
"Functions": 2,
"Options": {
"Inlining": true,
"Optimization": true,
"Expressions": true,
"Deforming": true
},
"Timing": {
"Generation": 0.374,
"Inlining": 70.341,
"Optimization": 8.006,
"Emission": 7.670,
"Total": 86.390
}
},
This needs a bit more polish tomorrow, but I'm starting to like where
this is going. Comments?
Greetings,
Andres Freund
>From dd7d504df9fca0ee0cc0a48930980cda2ad8be1d Mon Sep 17 00:00:00 2001
From: Andres Freund <[email protected]>
Date: Tue, 25 Sep 2018 01:46:34 -0700
Subject: [PATCH] Collect JIT instrumentation from workers.
Author: Amit Khandekar and Andres Freund
Reviewed-By:
Discussion: https://postgr.es/m/caj3gd9elrz51rk_gtkod+71idcjpb_n8ec6vu2aw-vicsae...@mail.gmail.com
Backpatch: 11-
---
contrib/auto_explain/auto_explain.c | 6 +-
src/backend/commands/explain.c | 87 +++++++++++++++++++----------
src/backend/executor/execMain.c | 15 +++++
src/backend/executor/execParallel.c | 82 +++++++++++++++++++++++++++
src/backend/jit/jit.c | 11 ++++
src/backend/jit/llvm/llvmjit.c | 14 ++---
src/backend/jit/llvm/llvmjit_expr.c | 2 +-
src/include/commands/explain.h | 3 +-
src/include/executor/execParallel.h | 1 +
src/include/jit/jit.h | 27 +++++++--
src/include/nodes/execnodes.h | 8 +++
11 files changed, 209 insertions(+), 47 deletions(-)
diff --git a/contrib/auto_explain/auto_explain.c b/contrib/auto_explain/auto_explain.c
index 0c0eb3fb9e0..a19c1d2a3c3 100644
--- a/contrib/auto_explain/auto_explain.c
+++ b/contrib/auto_explain/auto_explain.c
@@ -362,9 +362,9 @@ explain_ExecutorEnd(QueryDesc *queryDesc)
ExplainPrintPlan(es, queryDesc);
if (es->analyze && auto_explain_log_triggers)
ExplainPrintTriggers(es, queryDesc);
- if (queryDesc->estate->es_jit && es->costs &&
- queryDesc->estate->es_jit->created_functions > 0)
- ExplainPrintJIT(es, queryDesc);
+ if (es->costs)
+ ExplainPrintJIT(es, queryDesc->estate->es_jit_flags,
+ queryDesc->estate->es_jit_combined_instr, -1);
ExplainEndOutput(es);
/* Remove last line break */
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 01520154e48..2ecd278f7e3 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -563,9 +563,9 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
* depending on build options. Might want to separate that out from COSTS
* at a later stage.
*/
- if (queryDesc->estate->es_jit && es->costs &&
- queryDesc->estate->es_jit->created_functions > 0)
- ExplainPrintJIT(es, queryDesc);
+ if (es->costs)
+ ExplainPrintJIT(es, queryDesc->estate->es_jit_flags,
+ queryDesc->estate->es_jit_combined_instr, -1);
/*
* Close down the query and free resources. Include time for this in the
@@ -690,20 +690,26 @@ ExplainPrintTriggers(ExplainState *es, QueryDesc *queryDesc)
/*
* ExplainPrintJIT -
- * Append information about JITing to es->str.
+ * Append information about JITing to es->str. Can be used to print
+ * the JIT instrumentation of the backend or that of a specific worker.
*/
void
-ExplainPrintJIT(ExplainState *es, QueryDesc *queryDesc)
+ExplainPrintJIT(ExplainState *es, int jit_flags,
+ JitInstrumentation *ji, int worker_num)
{
- JitContext *jc = queryDesc->estate->es_jit;
instr_time total_time;
+ bool for_workers = (worker_num >= 0);
+
+ /* don't print information if no JITing happened */
+ if (!ji || ji->created_functions == 0)
+ return;
/* calculate total time */
INSTR_TIME_SET_ZERO(total_time);
- INSTR_TIME_ADD(total_time, jc->generation_counter);
- INSTR_TIME_ADD(total_time, jc->inlining_counter);
- INSTR_TIME_ADD(total_time, jc->optimization_counter);
- INSTR_TIME_ADD(total_time, jc->emission_counter);
+ INSTR_TIME_ADD(total_time, ji->generation_counter);
+ INSTR_TIME_ADD(total_time, ji->inlining_counter);
+ INSTR_TIME_ADD(total_time, ji->optimization_counter);
+ INSTR_TIME_ADD(total_time, ji->emission_counter);
ExplainOpenGroup("JIT", "JIT", true, es);
@@ -711,27 +717,30 @@ ExplainPrintJIT(ExplainState *es, QueryDesc *queryDesc)
if (es->format == EXPLAIN_FORMAT_TEXT)
{
appendStringInfoSpaces(es->str, es->indent * 2);
- appendStringInfo(es->str, "JIT:\n");
+ if (for_workers)
+ appendStringInfo(es->str, "JIT for worker %u:\n", worker_num);
+ else
+ appendStringInfo(es->str, "JIT:\n");
es->indent += 1;
- ExplainPropertyInteger("Functions", NULL, jc->created_functions, es);
+ ExplainPropertyInteger("Functions", NULL, ji->created_functions, es);
appendStringInfoSpaces(es->str, es->indent * 2);
appendStringInfo(es->str, "Options: %s %s, %s %s, %s %s, %s %s\n",
- "Inlining", jc->flags & PGJIT_INLINE ? "true" : "false",
- "Optimization", jc->flags & PGJIT_OPT3 ? "true" : "false",
- "Expressions", jc->flags & PGJIT_EXPR ? "true" : "false",
- "Deforming", jc->flags & PGJIT_DEFORM ? "true" : "false");
+ "Inlining", jit_flags & PGJIT_INLINE ? "true" : "false",
+ "Optimization", jit_flags & PGJIT_OPT3 ? "true" : "false",
+ "Expressions", jit_flags & PGJIT_EXPR ? "true" : "false",
+ "Deforming", jit_flags & PGJIT_DEFORM ? "true" : "false");
if (es->analyze && es->timing)
{
appendStringInfoSpaces(es->str, es->indent * 2);
appendStringInfo(es->str,
"Timing: %s %.3f ms, %s %.3f ms, %s %.3f ms, %s %.3f ms, %s %.3f ms\n",
- "Generation", 1000.0 * INSTR_TIME_GET_DOUBLE(jc->generation_counter),
- "Inlining", 1000.0 * INSTR_TIME_GET_DOUBLE(jc->inlining_counter),
- "Optimization", 1000.0 * INSTR_TIME_GET_DOUBLE(jc->optimization_counter),
- "Emission", 1000.0 * INSTR_TIME_GET_DOUBLE(jc->emission_counter),
+ "Generation", 1000.0 * INSTR_TIME_GET_DOUBLE(ji->generation_counter),
+ "Inlining", 1000.0 * INSTR_TIME_GET_DOUBLE(ji->inlining_counter),
+ "Optimization", 1000.0 * INSTR_TIME_GET_DOUBLE(ji->optimization_counter),
+ "Emission", 1000.0 * INSTR_TIME_GET_DOUBLE(ji->emission_counter),
"Total", 1000.0 * INSTR_TIME_GET_DOUBLE(total_time));
}
@@ -739,13 +748,14 @@ ExplainPrintJIT(ExplainState *es, QueryDesc *queryDesc)
}
else
{
- ExplainPropertyInteger("Functions", NULL, jc->created_functions, es);
+ ExplainPropertyInteger("Worker Number", NULL, worker_num, es);
+ ExplainPropertyInteger("Functions", NULL, ji->created_functions, es);
ExplainOpenGroup("Options", "Options", true, es);
- ExplainPropertyBool("Inlining", jc->flags & PGJIT_INLINE, es);
- ExplainPropertyBool("Optimization", jc->flags & PGJIT_OPT3, es);
- ExplainPropertyBool("Expressions", jc->flags & PGJIT_EXPR, es);
- ExplainPropertyBool("Deforming", jc->flags & PGJIT_DEFORM, es);
+ ExplainPropertyBool("Inlining", jit_flags & PGJIT_INLINE, es);
+ ExplainPropertyBool("Optimization", jit_flags & PGJIT_OPT3, es);
+ ExplainPropertyBool("Expressions", jit_flags & PGJIT_EXPR, es);
+ ExplainPropertyBool("Deforming", jit_flags & PGJIT_DEFORM, es);
ExplainCloseGroup("Options", "Options", true, es);
if (es->analyze && es->timing)
@@ -753,16 +763,16 @@ ExplainPrintJIT(ExplainState *es, QueryDesc *queryDesc)
ExplainOpenGroup("Timing", "Timing", true, es);
ExplainPropertyFloat("Generation", "ms",
- 1000.0 * INSTR_TIME_GET_DOUBLE(jc->generation_counter),
+ 1000.0 * INSTR_TIME_GET_DOUBLE(ji->generation_counter),
3, es);
ExplainPropertyFloat("Inlining", "ms",
- 1000.0 * INSTR_TIME_GET_DOUBLE(jc->inlining_counter),
+ 1000.0 * INSTR_TIME_GET_DOUBLE(ji->inlining_counter),
3, es);
ExplainPropertyFloat("Optimization", "ms",
- 1000.0 * INSTR_TIME_GET_DOUBLE(jc->optimization_counter),
+ 1000.0 * INSTR_TIME_GET_DOUBLE(ji->optimization_counter),
3, es);
ExplainPropertyFloat("Emission", "ms",
- 1000.0 * INSTR_TIME_GET_DOUBLE(jc->emission_counter),
+ 1000.0 * INSTR_TIME_GET_DOUBLE(ji->emission_counter),
3, es);
ExplainPropertyFloat("Total", "ms",
1000.0 * INSTR_TIME_GET_DOUBLE(total_time),
@@ -1554,6 +1564,25 @@ ExplainNode(PlanState *planstate, List *ancestors,
ExplainPropertyInteger("Workers Launched", NULL,
nworkers, es);
}
+
+ /*
+ * Print per-worker Jit instrumentation. Use same conditions
+ * as for the leader's JIT instrumentation, see comment there.
+ */
+ if (es->costs && es->verbose &&
+ outerPlanState(planstate)->worker_jit_instrument)
+ {
+ PlanState *child = outerPlanState(planstate);
+ int n;
+ SharedJitInstrumentation *w = child->worker_jit_instrument;
+
+ for (n = 0; n < w->num_workers; ++n)
+ {
+ ExplainPrintJIT(es, child->state->es_jit_flags,
+ &w->jit_instr[n], n);
+ }
+ }
+
if (gather->single_copy || es->format != EXPLAIN_FORMAT_TEXT)
ExplainPropertyBool("Single Copy", gather->single_copy, es);
}
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 85d980356b7..e36f58a0151 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -48,6 +48,7 @@
#include "executor/execdebug.h"
#include "executor/nodeSubplan.h"
#include "foreign/fdwapi.h"
+#include "jit/jit.h"
#include "mb/pg_wchar.h"
#include "miscadmin.h"
#include "optimizer/clauses.h"
@@ -494,6 +495,20 @@ standard_ExecutorEnd(QueryDesc *queryDesc)
ExecEndPlan(queryDesc->planstate, estate);
+ /*
+ * If this process has done JIT, either merge stats into worker stats, or
+ * this process' stats as the global stats if no workers did JIT.
+ */
+ if (estate->es_instrument && queryDesc->estate->es_jit)
+ {
+ if (queryDesc->estate->es_jit_combined_instr)
+ InstrJitAgg(queryDesc->estate->es_jit_combined_instr,
+ &queryDesc->estate->es_jit->instr);
+ else
+ queryDesc->estate->es_jit_combined_instr =
+ &queryDesc->estate->es_jit->instr;
+ }
+
/* do away with our snapshots */
UnregisterSnapshot(estate->es_snapshot);
UnregisterSnapshot(estate->es_crosscheck_snapshot);
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index c93084e4d2a..84ee0098c30 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -37,6 +37,7 @@
#include "executor/nodeSort.h"
#include "executor/nodeSubplan.h"
#include "executor/tqueue.h"
+#include "jit/jit.h"
#include "nodes/nodeFuncs.h"
#include "optimizer/planmain.h"
#include "optimizer/planner.h"
@@ -62,6 +63,7 @@
#define PARALLEL_KEY_INSTRUMENTATION UINT64CONST(0xE000000000000006)
#define PARALLEL_KEY_DSA UINT64CONST(0xE000000000000007)
#define PARALLEL_KEY_QUERY_TEXT UINT64CONST(0xE000000000000008)
+#define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
#define PARALLEL_TUPLE_QUEUE_SIZE 65536
@@ -573,9 +575,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
char *paramlistinfo_space;
BufferUsage *bufusage_space;
SharedExecutorInstrumentation *instrumentation = NULL;
+ SharedJitInstrumentation *jit_instrumentation = NULL;
int pstmt_len;
int paramlistinfo_len;
int instrumentation_len = 0;
+ int jit_instrumentation_len;
int instrument_offset = 0;
Size dsa_minsize = dsa_minimum_size();
char *query_string;
@@ -669,6 +673,17 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
mul_size(e.nnodes, nworkers));
shm_toc_estimate_chunk(&pcxt->estimator, instrumentation_len);
shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+ /* Estimate space for JIT instrumentation, if required. */
+ if (estate->es_jit_flags != PGJIT_NONE)
+ {
+ jit_instrumentation_len =
+ offsetof(SharedJitInstrumentation, jit_instr) +
+ sizeof(JitInstrumentation) * nworkers;
+ jit_instrumentation_len = MAXALIGN(jit_instrumentation_len);
+ shm_toc_estimate_chunk(&pcxt->estimator, jit_instrumentation_len);
+ shm_toc_estimate_keys(&pcxt->estimator, 1);
+ }
}
/* Estimate space for DSA area. */
@@ -742,6 +757,18 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
shm_toc_insert(pcxt->toc, PARALLEL_KEY_INSTRUMENTATION,
instrumentation);
pei->instrumentation = instrumentation;
+
+ if (estate->es_jit_flags != PGJIT_NONE)
+ {
+ jit_instrumentation = shm_toc_allocate(pcxt->toc,
+ jit_instrumentation_len);
+ jit_instrumentation->num_workers = nworkers;
+ memset(jit_instrumentation->jit_instr, 0,
+ sizeof(JitInstrumentation) * nworkers);
+ shm_toc_insert(pcxt->toc, PARALLEL_KEY_JIT_INSTRUMENTATION,
+ jit_instrumentation);
+ pei->jit_instrumentation = jit_instrumentation;
+ }
}
/*
@@ -1003,6 +1030,45 @@ ExecParallelRetrieveInstrumentation(PlanState *planstate,
instrumentation);
}
+/*
+ * Add up the workers' JIT instrumentation from dynamic shared memory.
+ */
+static void
+ExecParallelRetrieveJitInstrumentation(PlanState *planstate,
+ SharedJitInstrumentation *shared_jit)
+{
+ JitInstrumentation *combined;
+ int ibytes;
+
+ int n;
+
+ /*
+ * Accumulate worker JIT instrumentation into the combined JIT, allocating
+ * it if required. Note this is separate from the leader instrumentation.
+ */
+ if (!planstate->state->es_jit_combined_instr)
+ planstate->state->es_jit_combined_instr =
+ MemoryContextAllocZero(planstate->state->es_query_cxt, sizeof(JitInstrumentation));
+ combined = planstate->state->es_jit_combined_instr;
+
+ /* Accummulate all the workers' instrumentations. */
+ for (n = 0; n < shared_jit->num_workers; ++n)
+ InstrJitAgg(combined, &shared_jit->jit_instr[n]);
+
+ /*
+ * Store the per-worker detail.
+ *
+ * Similar to ExecParallelRetrieveInstrumentation(), allocate the
+ * instrumentation in per-query context.
+ */
+ ibytes = offsetof(SharedJitInstrumentation, jit_instr)
+ + mul_size(shared_jit->num_workers, sizeof(JitInstrumentation));
+ planstate->worker_jit_instrument =
+ MemoryContextAlloc(planstate->state->es_query_cxt, ibytes);
+
+ memcpy(planstate->worker_jit_instrument, shared_jit, ibytes);
+}
+
/*
* Finish parallel execution. We wait for parallel workers to finish, and
* accumulate their buffer usage.
@@ -1068,6 +1134,11 @@ ExecParallelCleanup(ParallelExecutorInfo *pei)
ExecParallelRetrieveInstrumentation(pei->planstate,
pei->instrumentation);
+ /* Accumulate JIT instrumentation, if any. */
+ if (pei->jit_instrumentation)
+ ExecParallelRetrieveJitInstrumentation(pei->planstate,
+ pei->jit_instrumentation);
+
/* Free any serialized parameters. */
if (DsaPointerIsValid(pei->param_exec))
{
@@ -1274,6 +1345,7 @@ ParallelQueryMain(dsm_segment *seg, shm_toc *toc)
DestReceiver *receiver;
QueryDesc *queryDesc;
SharedExecutorInstrumentation *instrumentation;
+ SharedJitInstrumentation *jit_instrumentation;
int instrument_options = 0;
void *area_space;
dsa_area *area;
@@ -1287,6 +1359,8 @@ ParallelQueryMain(dsm_segment *seg, shm_toc *toc)
instrumentation = shm_toc_lookup(toc, PARALLEL_KEY_INSTRUMENTATION, true);
if (instrumentation != NULL)
instrument_options = instrumentation->instrument_options;
+ jit_instrumentation = shm_toc_lookup(toc, PARALLEL_KEY_JIT_INSTRUMENTATION,
+ true);
queryDesc = ExecParallelGetQueryDesc(toc, receiver, instrument_options);
/* Setting debug_query_string for individual workers */
@@ -1350,6 +1424,14 @@ ParallelQueryMain(dsm_segment *seg, shm_toc *toc)
ExecParallelReportInstrumentation(queryDesc->planstate,
instrumentation);
+ /* Report JIT instrumentation data if any */
+ if (queryDesc->estate->es_jit && jit_instrumentation != NULL)
+ {
+ Assert(ParallelWorkerNumber < jit_instrumentation->num_workers);
+ jit_instrumentation->jit_instr[ParallelWorkerNumber] =
+ queryDesc->estate->es_jit->instr;
+ }
+
/* Must do this after capturing instrumentation. */
ExecutorEnd(queryDesc);
diff --git a/src/backend/jit/jit.c b/src/backend/jit/jit.c
index 7d3883d3bf4..5d1f2e57be1 100644
--- a/src/backend/jit/jit.c
+++ b/src/backend/jit/jit.c
@@ -182,6 +182,17 @@ jit_compile_expr(struct ExprState *state)
return false;
}
+/* Aggregate JIT instrumentation information */
+void
+InstrJitAgg(JitInstrumentation *dst, JitInstrumentation *add)
+{
+ dst->created_functions += add->created_functions;
+ INSTR_TIME_ADD(dst->generation_counter, add->generation_counter);
+ INSTR_TIME_ADD(dst->inlining_counter, add->inlining_counter);
+ INSTR_TIME_ADD(dst->optimization_counter, add->optimization_counter);
+ INSTR_TIME_ADD(dst->emission_counter, add->emission_counter);
+}
+
static bool
file_exists(const char *name)
{
diff --git a/src/backend/jit/llvm/llvmjit.c b/src/backend/jit/llvm/llvmjit.c
index 640c27fc408..7510698f863 100644
--- a/src/backend/jit/llvm/llvmjit.c
+++ b/src/backend/jit/llvm/llvmjit.c
@@ -224,7 +224,7 @@ llvm_expand_funcname(struct LLVMJitContext *context, const char *basename)
{
Assert(context->module != NULL);
- context->base.created_functions++;
+ context->base.instr.created_functions++;
/*
* Previously we used dots to separate, but turns out some tools, e.g.
@@ -504,7 +504,7 @@ llvm_compile_module(LLVMJitContext *context)
INSTR_TIME_SET_CURRENT(starttime);
llvm_inline(context->module);
INSTR_TIME_SET_CURRENT(endtime);
- INSTR_TIME_ACCUM_DIFF(context->base.inlining_counter,
+ INSTR_TIME_ACCUM_DIFF(context->base.instr.inlining_counter,
endtime, starttime);
}
@@ -524,7 +524,7 @@ llvm_compile_module(LLVMJitContext *context)
INSTR_TIME_SET_CURRENT(starttime);
llvm_optimize_module(context, context->module);
INSTR_TIME_SET_CURRENT(endtime);
- INSTR_TIME_ACCUM_DIFF(context->base.optimization_counter,
+ INSTR_TIME_ACCUM_DIFF(context->base.instr.optimization_counter,
endtime, starttime);
if (jit_dump_bitcode)
@@ -575,7 +575,7 @@ llvm_compile_module(LLVMJitContext *context)
}
#endif
INSTR_TIME_SET_CURRENT(endtime);
- INSTR_TIME_ACCUM_DIFF(context->base.emission_counter,
+ INSTR_TIME_ACCUM_DIFF(context->base.instr.emission_counter,
endtime, starttime);
context->module = NULL;
@@ -596,9 +596,9 @@ llvm_compile_module(LLVMJitContext *context)
ereport(DEBUG1,
(errmsg("time to inline: %.3fs, opt: %.3fs, emit: %.3fs",
- INSTR_TIME_GET_DOUBLE(context->base.inlining_counter),
- INSTR_TIME_GET_DOUBLE(context->base.optimization_counter),
- INSTR_TIME_GET_DOUBLE(context->base.emission_counter)),
+ INSTR_TIME_GET_DOUBLE(context->base.instr.inlining_counter),
+ INSTR_TIME_GET_DOUBLE(context->base.instr.optimization_counter),
+ INSTR_TIME_GET_DOUBLE(context->base.instr.emission_counter)),
errhidestmt(true),
errhidecontext(true)));
}
diff --git a/src/backend/jit/llvm/llvmjit_expr.c b/src/backend/jit/llvm/llvmjit_expr.c
index 0f3109334e8..7454d05acaf 100644
--- a/src/backend/jit/llvm/llvmjit_expr.c
+++ b/src/backend/jit/llvm/llvmjit_expr.c
@@ -2557,7 +2557,7 @@ llvm_compile_expr(ExprState *state)
llvm_leave_fatal_on_oom();
INSTR_TIME_SET_CURRENT(endtime);
- INSTR_TIME_ACCUM_DIFF(context->base.generation_counter,
+ INSTR_TIME_ACCUM_DIFF(context->base.instr.generation_counter,
endtime, starttime);
return true;
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 9b75baae6e6..b0b6f1e15a4 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -81,7 +81,8 @@ extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
extern void ExplainPrintPlan(ExplainState *es, QueryDesc *queryDesc);
extern void ExplainPrintTriggers(ExplainState *es, QueryDesc *queryDesc);
-extern void ExplainPrintJIT(ExplainState *es, QueryDesc *queryDesc);
+extern void ExplainPrintJIT(ExplainState *es, int jit_flags,
+ struct JitInstrumentation *jit_instr, int worker_i);
extern void ExplainQueryText(ExplainState *es, QueryDesc *queryDesc);
diff --git a/src/include/executor/execParallel.h b/src/include/executor/execParallel.h
index 626a66c27a0..1b4c35e5f14 100644
--- a/src/include/executor/execParallel.h
+++ b/src/include/executor/execParallel.h
@@ -27,6 +27,7 @@ typedef struct ParallelExecutorInfo
ParallelContext *pcxt; /* parallel context we're using */
BufferUsage *buffer_usage; /* points to bufusage area in DSM */
SharedExecutorInstrumentation *instrumentation; /* optional */
+ struct SharedJitInstrumentation *jit_instrumentation; /* optional */
dsa_area *area; /* points to DSA area in DSM */
dsa_pointer param_exec; /* serialized PARAM_EXEC parameters */
bool finished; /* set true by ExecParallelFinish */
diff --git a/src/include/jit/jit.h b/src/include/jit/jit.h
index b451f4027e7..6cb3bdc89f0 100644
--- a/src/include/jit/jit.h
+++ b/src/include/jit/jit.h
@@ -24,13 +24,8 @@
#define PGJIT_DEFORM (1 << 4)
-typedef struct JitContext
+typedef struct JitInstrumentation
{
- /* see PGJIT_* above */
- int flags;
-
- ResourceOwner resowner;
-
/* number of emitted functions */
size_t created_functions;
@@ -45,6 +40,25 @@ typedef struct JitContext
/* accumulated time for code emission */
instr_time emission_counter;
+} JitInstrumentation;
+
+/*
+ * DSM structure for accumulating jit instrumentation of all workers.
+ */
+typedef struct SharedJitInstrumentation
+{
+ int num_workers;
+ JitInstrumentation jit_instr[FLEXIBLE_ARRAY_MEMBER];
+} SharedJitInstrumentation;
+
+typedef struct JitContext
+{
+ /* see PGJIT_* above */
+ int flags;
+
+ ResourceOwner resowner;
+
+ JitInstrumentation instr;
} JitContext;
typedef struct JitProviderCallbacks JitProviderCallbacks;
@@ -85,6 +99,7 @@ extern void jit_release_context(JitContext *context);
* not be able to perform JIT (i.e. return false).
*/
extern bool jit_compile_expr(struct ExprState *state);
+extern void InstrJitAgg(JitInstrumentation *dst, JitInstrumentation *add);
#endif /* JIT_H */
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 687d7cd2f41..03ad5169762 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -569,9 +569,14 @@ typedef struct EState
* JIT information. es_jit_flags indicates whether JIT should be performed
* and with which options. es_jit is created on-demand when JITing is
* performed.
+ *
+ * es_jit_combined_instr, at the end of query execution with
+ * instrumentation enabled, is the the combined instrumentation
+ * information of leader and followers.
*/
int es_jit_flags;
struct JitContext *es_jit;
+ struct JitInstrumentation *es_jit_combined_instr;
} EState;
@@ -923,6 +928,9 @@ typedef struct PlanState
Instrumentation *instrument; /* Optional runtime stats for this node */
WorkerInstrumentation *worker_instrument; /* per-worker instrumentation */
+ /* Per-worker JIT instrumentation */
+ struct SharedJitInstrumentation *worker_jit_instrument;
+
/*
* Common structural data for all Plan types. These links to subsidiary
* state trees parallel links in the associated plan tree (except for the
--
2.18.0.rc2.dirty