Ming LI created HAWQ-978:
----------------------------

             Summary: long running query got hang on master and can't be 
terminated
                 Key: HAWQ-978
                 URL: https://issues.apache.org/jira/browse/HAWQ-978
             Project: Apache HAWQ
          Issue Type: Bug
            Reporter: Ming LI
            Assignee: Lei Chang


One backend process on master had been running for several days and can't be 
terminated.
The session is idle on all segments but master instance.

pstack/strace/back trace of the backend process.

```
[gpadmin@avw7hdm2p1 ~]$ pstack 431263
Thread 2 (Thread 0x7f4c93aa2700 (LWP 431264)):
#0  0x00007f4c9013f0d3 in poll () from /lib64/libc.so.6
#1  0x0000000000ba8294 in rxThreadFunc ()
#2  0x00007f4c9101f9d1 in start_thread () from /lib64/libpthread.so.0
#3  0x00007f4c901488fd in clone () from /lib64/libc.so.6
Thread 1 (Thread 0x7f4c93af48e0 (LWP 431263)):
#0  0x00007f4c9015805e in __lll_lock_wait_private () from /lib64/libc.so.6
#1  0x00007f4c900dd16b in _L_lock_9503 () from /lib64/libc.so.6
#2  0x00007f4c900da6a6 in malloc () from /lib64/libc.so.6
#3  0x00007f4c9008fb39 in _nl_make_l10nflist () from /lib64/libc.so.6
#4  0x00007f4c9008ddf5 in _nl_find_domain () from /lib64/libc.so.6
#5  0x00007f4c9008d6e0 in __dcigettext () from /lib64/libc.so.6
#6  0x00007f4c6fabcfe3 in Rf_onsigusr1 () from /usr/local/lib64/R/lib/libR.so
#7  <signal handler called>
#8  0x00007f4c9014079a in brk () from /lib64/libc.so.6
#9  0x00007f4c90140845 in sbrk () from /lib64/libc.so.6
#10 0x00007f4c900dd769 in __default_morecore () from /lib64/libc.so.6
#11 0x00007f4c900d87a2 in _int_free () from /lib64/libc.so.6
#12 0x0000000000b3ff24 in gp_free2 ()
#13 0x0000000000b356fc in AllocSetDelete ()
#14 0x0000000000b38391 in MemoryContextDeleteImpl ()
#15 0x000000000077c851 in ExecEndAgg ()
#16 0x00000000007592ad in ExecEndNode ()
#17 0x000000000075186c in ExecEndPlan ()
#18 0x000000000079dffa in ExecEndSubqueryScan ()
#19 0x000000000075921d in ExecEndNode ()
#20 0x000000000075186c in ExecEndPlan ()
#21 0x0000000000752565 in ExecutorEnd ()
#22 0x00000000006dd9bd in PortalCleanup ()
#23 0x0000000000b3f077 in AtCommit_Portals ()
#24 0x000000000051abe5 in CommitTransaction ()
#25 0x000000000051f1d5 in CommitTransactionCommand ()
#26 0x000000000099809e in PostgresMain ()
#27 0x00000000008f1031 in BackendStartup ()
#28 0x00000000008f70e0 in PostmasterMain ()
#29 0x00000000007f63da in main ()
[gpadmin@avw7hdm2p1 ~]$


[gpadmin@avw7hdm2p1 ~]$ strace -p 431263
Process 431263 attached - interrupt to quit
futex(0x7f4c903efe80, FUTEX_WAIT_PRIVATE, 2, NULL^C <unfinished ...>
Process 431263 detached
[gpadmin@avw7hdm2p1 ~]$



(gdb) thread apply all bt

Thread 2 (Thread 0x7f4c93af48e0 (LWP 431263)):
#0  0x00007f4c9015805e in __lll_lock_wait_private () from /lib64/libc.so.6
#1  0x00007f4c900dd16b in _L_lock_9503 () from /lib64/libc.so.6
#2  0x00007f4c900da6a6 in malloc () from /lib64/libc.so.6
#3  0x00007f4c9008fb39 in _nl_make_l10nflist () from /lib64/libc.so.6
#4  0x00007f4c9008ddf5 in _nl_find_domain () from /lib64/libc.so.6
#5  0x00007f4c9008d6e0 in __dcigettext () from /lib64/libc.so.6
#6  0x00007f4c6fabcfe3 in Rf_onsigusr1 (dummy=<value optimized out>) at 
errors.c:178
#7  <signal handler called>
#8  0x00007f4c9014079a in brk () from /lib64/libc.so.6
#9  0x00007f4c90140845 in sbrk () from /lib64/libc.so.6
#10 0x00007f4c900dd769 in __default_morecore () from /lib64/libc.so.6
#11 0x00007f4c900d87a2 in _int_free () from /lib64/libc.so.6
#12 0x0000000000b3ff24 in gp_free2 (ptr=0x191c3b000, sz=0) at memprot.c:808
#13 0x0000000000b356fc in AllocSetDelete (context=<value optimized out>) at 
aset.c:981
#14 0x0000000000b38391 in MemoryContextDeleteImpl (context=0x4a46da0, 
sfile=0x0, func=<value optimized out>, sline=-1) at mcxt.c:232
#15 MemoryContextDeleteChildren (context=0x4a46da0, sfile=0x0, func=<value 
optimized out>, sline=-1) at mcxt.c:251
#16 MemoryContextDeleteImpl (context=0x4a46da0, sfile=0x0, func=<value 
optimized out>, sline=-1) at mcxt.c:205
#17 0x000000000077c851 in ExecEndAgg (node=0x325eb00) at nodeAgg.c:2641
#18 0x00000000007592ad in ExecEndNode (node=0x325eb00) at execProcnode.c:1687
#19 0x000000000075186c in ExecEndPlan (planstate=0x325eb00, estate=0x323f9e8) 
at execMain.c:2825
#20 0x000000000079dffa in ExecEndSubqueryScan (node=0x325cd20) at 
nodeSubqueryscan.c:294
#21 0x000000000075921d in ExecEndNode (node=0x325cd20) at execProcnode.c:1638
#22 0x000000000075186c in ExecEndPlan (planstate=0x325cd20, estate=0x323f010) 
at execMain.c:2825
#23 0x0000000000752565 in ExecutorEnd (queryDesc=<value optimized out>) at 
execMain.c:1321
#24 0x00000000006dd9bd in PortalCleanupHelper (portal=<value optimized out>) at 
portalcmds.c:366
#25 PortalCleanup (portal=<value optimized out>) at portalcmds.c:302
#26 0x0000000000b3f077 in PortalDrop () at portalmem.c:402
#27 AtCommit_Portals () at portalmem.c:643
#28 0x000000000051abe5 in CommitTransaction () at xact.c:3379
#29 0x000000000051f1d5 in CommitTransactionCommand () at xact.c:4535
#30 0x000000000099809e in finish_xact_command (argc=<value optimized out>, 
argv=<value optimized out>, username=<value optimized out>) at postgres.c:3180
#31 PostgresMain (argc=<value optimized out>, argv=<value optimized out>, 
username=<value optimized out>) at postgres.c:5260
#32 0x00000000008f1031 in BackendRun (port=0x2aa5520) at postmaster.c:6811
#33 BackendStartup (port=0x2aa5520) at postmaster.c:6408
#34 0x00000000008f70e0 in ServerLoop (argc=<value optimized out>, argv=<value 
optimized out>) at postmaster.c:2350
#35 PostmasterMain (argc=<value optimized out>, argv=<value optimized out>) at 
postmaster.c:1556
#36 0x00000000007f63da in main (argc=18, argv=0x2aa1270) at main.c:217

Thread 1 (Thread 0x7f4c93aa2700 (LWP 431264)):
#0  0x00007f4c9013f0d3 in poll () from /lib64/libc.so.6
#1  0x0000000000ba8294 in rxThreadFunc (arg=<value optimized out>) at 
ic_udp.c:6263
#2  0x00007f4c9101f9d1 in start_thread () from /lib64/libpthread.so.0
#3  0x00007f4c901488fd in clone () from /lib64/libc.so.6
(gdb)
```



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to