Re: [HACKERS] WIP: [[Parallel] Shared] Hash

Rafia Sabih Wed, 01 Feb 2017 19:58:12 -0800

On Thu, Feb 2, 2017 at 1:19 AM, Thomas Munro
<[email protected]> wrote:
> On Thu, Feb 2, 2017 at 3:34 AM, Rafia Sabih
> <[email protected]> wrote:
>> 9         | 62928.88   | 59077.909
>
> Thanks Rafia.  At first glance this plan is using the Parallel Shared
> Hash in one place where it should pay off, that is loading the orders
> table, but the numbers are terrible.  I noticed that it uses batch
> files and then has to increase the number of batch files, generating a
> bunch of extra work, even though it apparently overestimated the
> number of rows, though that's only ~9 seconds of ~60.  I am
> investigating.


Hi Thomas,
Apart from the previously reported regression, there appear one more
issue in this set of patches. At times, running a query using parallel
hash it hangs up and all the workers including the master shows the
following backtrace,

#0  0x00003fff880c7de8 in __epoll_wait_nocancel () from /lib64/power8/libc.so.6
#1  0x00000000104e2718 in WaitEventSetWaitBlock (set=0x100157bde90,
cur_timeout=-1, occurred_events=0x3fffdbe69698, nevents=1) at
latch.c:998
#2  0x00000000104e255c in WaitEventSetWait (set=0x100157bde90,
timeout=-1, occurred_events=0x3fffdbe69698, nevents=1,
wait_event_info=134217745) at latch.c:950
#3  0x0000000010512970 in ConditionVariableSleep (cv=0x3ffd736e05a4,
wait_event_info=134217745) at condition_variable.c:132
#4  0x00000000104dbb1c in BarrierWaitSet (barrier=0x3ffd736e0594,
new_phase=1, wait_event_info=134217745) at barrier.c:97
#5  0x00000000104dbb9c in BarrierWait (barrier=0x3ffd736e0594,
wait_event_info=134217745) at barrier.c:127
#6  0x00000000103296a8 in ExecHashShrink (hashtable=0x3ffd73747dc0) at
nodeHash.c:1075
#7  0x000000001032c46c in dense_alloc_shared
(hashtable=0x3ffd73747dc0, size=40, shared=0x3fffdbe69eb8,
respect_work_mem=1 '\001') at nodeHash.c:2618
#8  0x000000001032a2f0 in ExecHashTableInsert
(hashtable=0x3ffd73747dc0, slot=0x100158f9e90, hashvalue=2389907270)
at nodeHash.c:1476
#9  0x0000000010327fd0 in MultiExecHash (node=0x100158f9800) at nodeHash.c:296
#10 0x0000000010306730 in MultiExecProcNode (node=0x100158f9800) at
execProcnode.c:577

The issue is not deterministic and straightforwardly reproducible,
sometimes after make clean, etc. queries run sometimes they hang up
again. I wanted to bring this to your notice hoping you might be
faster than me in picking up the exact reason behind this anomaly.

-- 
Regards,
Rafia Sabih
EnterpriseDB: http://www.enterprisedb.com/


-- 
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: [[Parallel] Shared] Hash

Reply via email to