Re: [HACKERS] [parallel query] random server crash while running tpc-h query on power2
On Tue, Aug 16, 2016 at 1:05 AM, Rushabh Lathiawrote: > I agree, this make sense. > > Here is the patch to allocate worker instrumentation into same context > as the regular instrumentation which is per-query context. Looks good, committed. I am not sure it was a very good idea for af33039317ddc4a0e38a02e2255c2bf453115fd2 by Tom Lane to change the current memory context for the entire execution of gather_readnext(); this might not be the only or the last bug that results from that decision. However, I don't really want to get an argument about that right now, and this at least fixes the problem we know about. Thanks for the report and patch. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [parallel query] random server crash while running tpc-h query on power2
On Mon, Aug 15, 2016 at 6:02 PM, Robert Haaswrote: > On Sat, Aug 13, 2016 at 4:36 AM, Amit Kapila > wrote: > > AFAICS, your patch seems to be the right fix for this issue, unless we > > need the instrumentation information during execution (other than for > > explain) for some purpose. > > Hmm, I disagree. It should be the job of > ExecParallelRetrieveInstrumentation to allocate its data in the > correct context, not the responsibility of nodeGather.c to work around > the fact that it doesn't. The worker instrumentation should be > allocated in the same context as the regular instrumentation > information, which I assume is probably the per-query context. > I agree, this make sense. Here is the patch to allocate worker instrumentation into same context as the regular instrumentation which is per-query context. PFA patch. -- Rushabh Lathia www.EnterpriseDB.com diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c index 380d743..5aa6f02 100644 --- a/src/backend/executor/execParallel.c +++ b/src/backend/executor/execParallel.c @@ -500,6 +500,7 @@ ExecParallelRetrieveInstrumentation(PlanState *planstate, int n; int ibytes; int plan_node_id = planstate->plan->plan_node_id; + MemoryContext oldcontext; /* Find the instumentation for this node. */ for (i = 0; i < instrumentation->num_plan_nodes; ++i) @@ -514,10 +515,19 @@ ExecParallelRetrieveInstrumentation(PlanState *planstate, for (n = 0; n < instrumentation->num_workers; ++n) InstrAggNode(planstate->instrument, [n]); - /* Also store the per-worker detail. */ + /* + * Also store the per-worker detail. + * + * Worker instrumentation should be allocated in the same context as + * the regular instrumentation information, which is the per-query + * context. Switch into per-query memory context. + */ + oldcontext = MemoryContextSwitchTo(planstate->state->es_query_cxt); ibytes = mul_size(instrumentation->num_workers, sizeof(Instrumentation)); planstate->worker_instrument = palloc(ibytes + offsetof(WorkerInstrumentation, instrument)); + MemoryContextSwitchTo(oldcontext); + planstate->worker_instrument->num_workers = instrumentation->num_workers; memcpy(>worker_instrument->instrument, instrument, ibytes); -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [parallel query] random server crash while running tpc-h query on power2
On Sat, Aug 13, 2016 at 4:36 AM, Amit Kapilawrote: > AFAICS, your patch seems to be the right fix for this issue, unless we > need the instrumentation information during execution (other than for > explain) for some purpose. Hmm, I disagree. It should be the job of ExecParallelRetrieveInstrumentation to allocate its data in the correct context, not the responsibility of nodeGather.c to work around the fact that it doesn't. The worker instrumentation should be allocated in the same context as the regular instrumentation information, which I assume is probably the per-query context. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [parallel query] random server crash while running tpc-h query on power2
On Sat, Aug 13, 2016 at 11:10 AM, Rushabh Lathiawrote: > Hi All, > > Recently while running tpc-h queries on postgresql master branch, I am > noticed > random server crash. Most of the time server crash coming while turn tpch > query > number 3 - (but its very random). > > > Here its clear that work_instrument is either corrupted or Un-inililized > that is the > reason its ending up with server crash. > > With bit more debugging and looked at git history I found that issue started > coming > with commit af33039317ddc4a0e38a02e2255c2bf453115fd2. gather_readnext() > calls > ExecShutdownGatherWorkers() when nreaders == 0. ExecShutdownGatherWorkers() > calls ExecParallelFinish() which collects the instrumentation before marking > ParallelExecutorInfo to finish. ExecParallelRetrieveInstrumentation() do the > allocation > of planstate->worker_instrument. > > With commit af33039317 now we calling the gather_readnext() with per-tuple > context, > but with nreader == 0 with ExecShutdownGatherWorkers() we end up with > allocation > of planstate->worker_instrument into per-tuple context - which is wrong. > > Now fix can be: > > 1) Avoid calling ExecShutdownGatherWorkers() from the gather_readnext() and > let > ExecEndGather() do that things. > I don't think we can wait till ExecEndGather() to collect statistics, as we need it before that for explain path. However, we do call ExecShutdownNode() from ExecutePlan() when there are no more tuples which can take care of ensuring the shutdown of Gather node. I think the advantage of calling it in gather_readnext() is that it will resources to be released early and populating the instrumentation/statistics as early as possible. > But with this change, gather_readread() and > gather_getnext() depend on planstate->reader structure to continue reading > tuple. > Now either we can change those condition to be depend on planstate->nreaders > or > just pfree(planstate->reader) into gather_readnext() instead of calling > ExecShutdownGatherWorkers(). > > > Attaching patch, which fix the issue with approach 1). > AFAICS, your patch seems to be the right fix for this issue, unless we need the instrumentation information during execution (other than for explain) for some purpose. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] [parallel query] random server crash while running tpc-h query on power2
Hi All, Recently while running tpc-h queries on postgresql master branch, I am noticed random server crash. Most of the time server crash coming while turn tpch query number 3 - (but its very random). Call Stack of server crash: (gdb) bt #0 0x102aa9ac in ExplainNode (planstate=0x1001a7471d8, ancestors=0x1001a745348, relationship=0x10913d10 "Outer", plan_name=0x0, es=0x1001a65ad48) at explain.c:1516 #1 0x102aaeb8 in ExplainNode (planstate=0x1001a746b50, ancestors=0x1001a745348, relationship=0x10913d10 "Outer", plan_name=0x0, es=0x1001a65ad48) at explain.c:1599 #2 0x102aaeb8 in ExplainNode (planstate=0x1001a7468b8, ancestors=0x1001a745348, relationship=0x10913d10 "Outer", plan_name=0x0, es=0x1001a65ad48) at explain.c:1599 #3 0x102aaeb8 in ExplainNode (planstate=0x1001a745e48, ancestors=0x1001a745348, relationship=0x10913d10 "Outer", plan_name=0x0, es=0x1001a65ad48) at explain.c:1599 #4 0x102aaeb8 in ExplainNode (planstate=0x1001a745bb0, ancestors=0x1001a745348, relationship=0x10913d10 "Outer", plan_name=0x0, es=0x1001a65ad48) at explain.c:1599 #5 0x102aaeb8 in ExplainNode (planstate=0x1001a745870, ancestors=0x1001a745348, relationship=0x0, plan_name=0x0, es=0x1001a65ad48) at explain.c:1599 #6 0x102a81d8 in ExplainPrintPlan (es=0x1001a65ad48, queryDesc=0x1001a7442a0) at explain.c:599 #7 0x102a7e20 in ExplainOnePlan (plannedstmt=0x1001a744208, into=0x0, es=0x1001a65ad48, queryString=0x1001a6a8c08 "explain (analyze,buffers,verbose) select\n\tl_orderkey,\n\tsum(l_extendedprice * (1 - l_discount)) as revenue,\n\to_orderdate,\n\to_shippriority\nfrom\n\tcustomer,\n\torders,\n\tlineitem\nwhere\n\tc_mktsegment = 'BUILD"..., params=0x0, planduration=0x3fffe17c6488) at explain.c:515 #8 0x102a7968 in ExplainOneQuery (query=0x1001a65ab98, into=0x0, es=0x1001a65ad48, queryString=0x1001a6a8c08 "explain (analyze,buffers,verbose) select\n\tl_orderkey,\n\tsum(l_extendedprice * (1 - l_discount)) as revenue,\n\to_orderdate,\n\to_shippriority\nfrom\n\tcustomer,\n\torders,\n\tlineitem\nwhere\n\tc_mktsegment = 'BUILD"..., params=0x0) at explain.c:357 #9 0x102a74b8 in ExplainQuery (stmt=0x1001a6fe468, queryString=0x1001a6a8c08 "explain (analyze,buffers,verbose) select\n\tl_orderkey,\n\tsum(l_extendedprice * (1 - l_discount)) as revenue,\n\to_orderdate,\n\to_shippriority\nfrom\n\tcustomer,\n\torders,\n\tlineitem\nwhere\n\tc_mktsegment = 'BUILD"..., params=0x0, dest=0x1001a65acb0) at explain.c:245 (gdb) p *w $2 = {num_workers = 2139062143, instrument = 0x1001af8a8d0} (gdb) p planstate $3 = (PlanState *) 0x1001a7471d8 (gdb) p *planstate $4 = {type = T_HashJoinState, plan = 0x1001a740730, state = 0x1001a745758, instrument = 0x1001a7514a8, worker_instrument = 0x1001af8a8c8, targetlist = 0x1001a747448, qual = 0x0, lefttree = 0x1001a74a0e8, righttree = 0x1001a74f2f8, initPlan = 0x0, subPlan = 0x0, chgParam = 0x0, ps_ResultTupleSlot = 0x1001a750c10, ps_ExprContext = 0x1001a7472f0, ps_ProjInfo = 0x1001a751220, ps_TupFromTlist = 0 '\000'} (gdb) p *planstate->worker_instrument $5 = {num_workers = 2139062143, instrument = 0x1001af8a8d0} (gdb) p n $6 = 13055 (gdb) p n $7 = 13055 Here its clear that work_instrument is either corrupted or Un-inililized that is the reason its ending up with server crash. With bit more debugging and looked at git history I found that issue started coming with commit af33039317ddc4a0e38a02e2255c2bf453115fd2. gather_readnext() calls ExecShutdownGatherWorkers() when nreaders == 0. ExecShutdownGatherWorkers() calls ExecParallelFinish() which collects the instrumentation before marking ParallelExecutorInfo to finish. ExecParallelRetrieveInstrumentation() do the allocation of planstate->worker_instrument. With commit af33039317 now we calling the gather_readnext() with per-tuple context, but with nreader == 0 with ExecShutdownGatherWorkers() we end up with allocation of planstate->worker_instrument into per-tuple context - which is wrong. Now fix can be: 1) Avoid calling ExecShutdownGatherWorkers() from the gather_readnext() and let ExecEndGather() do that things. But with this change, gather_readread() and gather_getnext() depend on planstate->reader structure to continue reading tuple. Now either we can change those condition to be depend on planstate->nreaders or just pfree(planstate->reader) into gather_readnext() instead of calling ExecShutdownGatherWorkers(). 2) Teach ExecParallelRetrieveInstrumentation() to do allocation of planstate->worker_instrument into proper memory context. Attaching patch, which fix the issue with approach 1). Regards. Rushabh Lathia www.EnterpriseDB.com diff --git a/src/backend/executor/nodeGather.c b/src/backend/executor/nodeGather.c index 438d1b2..bc56f0e 100644 --- a/src/backend/executor/nodeGather.c +++ b/src/backend/executor/nodeGather.c @@ -348,7 +348,8 @@ gather_readnext(GatherState *gatherstate) --gatherstate->nreaders; if