On Wed, Apr 18, 2018 at 11:43 AM, Jonathan Rudenberg
<jonat...@titanous.com> wrote:
> On Tue, Apr 17, 2018, at 19:31, Thomas Munro wrote:
>> On Wed, Apr 18, 2018 at 11:01 AM, Jonathan Rudenberg
>> <jonat...@titanous.com> wrote:
>> > Yep, I think I know approximately what it looked like, I've attached a
>> > lightly redacted plan. All of the hung queries were running some variant
>> > of this plan as far as I can tell.
>>
>> Hmm, that isn't a parallel query. I was expecting to see "Gather" and
>> "Parallel" in there.
>
> Oops, I'm really sorry about that. I only have the first part of the hung
> queries, and there are a few variants. Here's one that's parallel.

I spent some time trying to reproduce this failure without any luck,
using query plans similar to your Gather plan fragment, and using some
test harness code for the allocator stuff in isolation.

I had an idea that (1) freeing a large object that releases and unpins
a segment in one backend and then (2) freeing it again in another
backend (illegally) might produce this effect with sufficiently bad
luck. I'm still trying to reproduce that without any success, but I
get other kinds of failures which I think you'd be seeing too if that
hunch were right. Still looking...

-- 
Thomas Munro
http://www.enterprisedb.com