On 24 June 2016 at 05:17, Umair Shahid <umair.sha...@gmail.com> wrote:
> On Fri, Jun 24, 2016 at 2:14 AM, Umair Shahid <umair.sha...@2ndquadrant.com> wrote:
>
>> ---------- Forwarded message ----------
>> From: Tom Lane <t...@sss.pgh.pa.us>
>> Date: Thu, Jun 23, 2016 at 9:32 PM
>> Subject: Re: [pgsql-packagers] PG 9.6beta2 tarballs are ready
>> To: Magnus Hagander <mag...@hagander.net>
>> Cc: Umair Shahid <umair.sha...@2ndquadrant.com>, Dave Page <dp...@postgresql.org>, PostgreSQL Packagers <pgsql-packag...@postgresql.org>
>>
>> Magnus Hagander <mag...@hagander.net> writes:
>> > That makes more sense as the joinrel stuff *has* been changed between the
>> > two betas. I'm sure someone who's touched that code (Tom?) can comment on
>> > that part..
>>
>> It still makes little sense to me, as the previous reports say that the
>> problem happened during bootstrap, and the planner does not run
>> during bootstrap.
>>
>> Could we get a look at debug_query_string in the coredump, to possibly
>> narrow down where the crash is really happening?
>>
>
> Moving thread to -hackers ...
>
> debug_query_string is
>
> * "INSERT INTO pg_description SELECT t.objoid, c.oid, t.objsubid,
> t.description FROM tmp_pg_description t, pg_class c WHERE c.relname =
> t.classname;"*
>
> Happening in "setup_description"
>

I was helping Haroon with this last night. I don't have access to the original thread and he's not around so I don't know how much he said. I'll repeat our findings here.

During debugging I found that:

* A VS 2013 build (performed by Haroon and copied to the test host) crashes consistently with the reported symptoms - "performing post-bootstrap initialization ... child process was terminated by exception 0xC0000005"

* The issue doesn't happen in a VS 2015 build done on the test host

* I couldn't use just-in-time debugging because the restricted execution token setup isolated the process. For the same reason, breakpoints stop working in initdb.c after line 3557.

* To get a backtrace, I had to:

  * Launch a VS x86 command prompt
  * devenv /debugexe bin\initdb.exe -D test
  * Set a breakpoint in initdb.c:3557 and initdb.c:3307
  * Run
  * When it traps at get_restricted_token(), manually move the execution pointer over the setup of the restricted execution token by dragging & dropping the yellow instruction pointer arrow. Yes, really. Or, y'know, comment it out and rebuild, but I was working with a supplied binary.
  * Continue until next breakpoint
  * Launch Process Explorer and find the pid of the postgres child process
  * Debug->attach to process, attach to the child postgres. This doesn't detach the parent, VS does multiprocess debugging.
  * Continue execution
  * VS will trap on the child when it crashes

* It is an access violation (segfault) in postgres.exe when attempting to read memory at 0xFFFFFFFFFFFFFFFF in calc_joinrel_size_estimate() at costsize.c:3940

    fkselec = get_foreign_key_join_selectivity(root,
                                               outer_rel->relids,
                                               inner_rel->relids,
                                               sjinfo,
                                               &restrictlist);

  with debug_query_string:

    0x0000000009bf6140 "INSERT INTO pg_description SELECT t.objoid, c.oid, t.objsubid, t.description FROM tmp_pg_description t, pg_class c WHERE c.relname = t.classname;\n"

Backtrace:

Exception thrown at 0x00000001401A5A81 in postgres.exe: 0xC0000005: Access violation reading location 0xFFFFFFFFFFFFFFFF.

> postgres.exe!calc_joinrel_size_estimate(PlannerInfo * root, RelOptInfo * outer_rel, RelOptInfo * inner_rel, double outer_rows, double inner_rows, SpecialJoinInfo * sjinfo, List * restrictlist) Line 3944 C
  postgres.exe!set_joinrel_size_estimates(PlannerInfo * root, RelOptInfo * rel, RelOptInfo * outer_rel, RelOptInfo * inner_rel, SpecialJoinInfo * sjinfo, List * restrictlist) Line 3852 C
  postgres.exe!build_join_rel(PlannerInfo * root, Bitmapset * joinrelids, RelOptInfo * outer_rel, RelOptInfo * inner_rel, SpecialJoinInfo * sjinfo, List * * restrictlist_ptr) Line 521 C
  postgres.exe!make_join_rel(PlannerInfo * root, RelOptInfo * rel1, RelOptInfo * rel2) Line 721 C
  postgres.exe!make_rels_by_clause_joins(PlannerInfo * root, RelOptInfo * old_rel, ListCell * other_rels) Line 266 C
  postgres.exe!join_search_one_level(PlannerInfo * root, int level) Line 69 C
  postgres.exe!standard_join_search(PlannerInfo * root, int levels_needed, List * initial_rels) Line 2172 C
  postgres.exe!query_planner(PlannerInfo * root, List * tlist, void(*)(PlannerInfo *, void *) qp_callback, void * qp_extra) Line 255 C
  postgres.exe!grouping_planner(PlannerInfo * root, char inheritance_update, double tuple_fraction) Line 1695 C
  postgres.exe!subquery_planner(PlannerGlobal * glob, Query * parse, PlannerInfo * parent_root, char hasRecursion, double tuple_fraction) Line 775 C
  postgres.exe!standard_planner(Query * parse, int cursorOptions, ParamListInfoData * boundParams) Line 312 C
  postgres.exe!pg_plan_query(Query * querytree, int cursorOptions, ParamListInfoData * boundParams) Line 800 C
  postgres.exe!exec_simple_query(const char * query_string) Line 1023 C
  postgres.exe!PostgresMain(int argc, char * * argv, const char * dbname, const char * username) Line 4076 C
  postgres.exe!main(int argc, char * * argv) Line 227 C

Local vars:

+ inner_rel 0x0000000009dfd170 {type=T_EquivalenceClass (537) reloptkind=RELOPT_BASEREL (0) relids=0x0000000009d6d718 {...} ...} RelOptInfo *
  inner_rows 270.00000000000000 double
+ outer_rel 0x00000001401ded48 {postgres.exe!build_joinrel_tlist(PlannerInfo * root, RelOptInfo * joinrel, RelOptInfo * input_rel), Line 646} {...} RelOptInfo *
  outer_rows 2.653352065130e-314#DEN double
+ restrictlist 0x0000000009d6f7f8 {type=T_List (656) length=1 head=0x0000000009d6f7d8 {data={ptr_value=0x0000000009d6e980 ...} ...} ...} List *
+ root 0x0000000009dfd800 {type=1 parse=0x000000000067d220 {type=T_AllocSetContext (601) commandType=CMD_UNKNOWN (0) ...} ...} PlannerInfo *
+ sjinfo 0x000000000043f870 {type=T_SpecialJoinInfo (543) min_lefthand=0x0000000009dfcfd8 {nwords=1 words=0x0000000009dfcfdc {...} } ...} SpecialJoinInfo *

-- 
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
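
A note on reading the locals above: outer_rows is shown as a denormal double (#DEN), and a denormal is often what a pointer or other integer looks like when its bits are read back as a double. The following is a minimal sketch, not part of the original report, of one way to check that. The helper names double_bits and same_region are made up for illustration, and comparing the top bits at a 1 MB granularity is an arbitrary heuristic, not anything the debugger or PostgreSQL does.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The bit-copy below only makes sense if double is 64 bits wide. */
static_assert(sizeof(double) == sizeof(uint64_t), "double must be 64 bits");

/*
 * Return the raw IEEE 754 bit pattern of a double.
 * memcpy is used instead of a pointer cast to avoid strict-aliasing issues.
 */
static uint64_t double_bits(double d)
{
    uint64_t bits;

    memcpy(&bits, &d, sizeof bits);
    return bits;
}

/*
 * Hypothetical heuristic: report whether a bit pattern falls in the same
 * 1 MB-aligned region as a reference address, for example an address
 * already known to lie inside postgres.exe.
 */
static int same_region(uint64_t bits, uint64_t reference_addr)
{
    return (bits >> 20) == (reference_addr >> 20);
}
```

For example, same_region(double_bits(2.653352065130e-314), 0x00000001401A5A81) compares the outer_rows value from the locals with the exception address from the backtrace. A match would only hint that the argument slot holds an address rather than a row estimate; it says nothing about why.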