Kuien Liu created HAWQ-1529:
-------------------------------

             Summary: "segment resource manager" will NOT exit when postmaster died
                 Key: HAWQ-1529
                 URL: https://issues.apache.org/jira/browse/HAWQ-1529
             Project: Apache HAWQ
          Issue Type: Improvement
          Components: Core
            Reporter: Kuien Liu
            Assignee: Radar Lei


If I send SIGKILL to the postmaster of a segment with 'kill -9', the
postmaster dies, BUT the "segment resource manager" and the "logger process"
stay alive and keep emitting a "WARNING" every 30s.

To my understanding, the "logger process" is waiting for the "segment resource
manager", but the resource manager does not check whether the postmaster is
still alive, so it keeps waiting. Does that make sense? Why not quit once the
postmaster is gone?

The call stack of RM when postmaster is killed:
#0  0x00007f19023ccab6 in poll () from /lib64/libc.so.6
#1  0x0000000000a48c9e in processAllCommFileDescs () at rmcomm_AsyncComm.c:156
#2  0x0000000000a8ce5e in MainHandlerLoop_RMSEG () at resourcemanager_RMSEG.c:166
#3  0x0000000000a8cba3 in ResManagerMainSegment2ndPhase () at resourcemanager_RMSEG.c:71
#4  0x0000000000a8d966 in ResManagerMain (argc=0x3, argv=0x7fffa018b890) at resourcemanager.c:346
#5  0x0000000000a8db45 in ResManagerProcessStartup () at resourcemanager.c:411
#6  0x0000000000899b89 in CommenceNormalOperations () at postmaster.c:3673
#7  0x000000000089a562 in do_reaper () at postmaster.c:4021
#8  0x00000000008969bb in ServerLoop () at postmaster.c:2136
#9  0x0000000000895a78 in PostmasterMain (argc=0xc, argv=0x229a730) at postmaster.c:1454
#10 0x00000000007b185d in main (argc=0xc, argv=0x229a730) at main.c:226
#11 0x00007f190231e994 in __libc_start_main () from /lib64/libc.so.6
#12 0x00000000004bde89 in _start ()
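
The stack above shows the RM blocked in poll(). One common way for a child
process to notice that its parent is gone is to include in that poll() the
read end of a pipe whose write end is held open only by the parent. A minimal
sketch of the idea follows. It is not HAWQ code: the function name
wait_or_parent_gone and the descriptors death_fd and work_fd are hypothetical,
and it assumes such a pipe has been created before the fork.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of HAWQ).
 *
 * death_fd: read end of a pipe whose write end is held open only by the
 *           parent (postmaster). The parent never writes to it.
 * work_fd:  any descriptor the caller normally waits on, or -1 for none
 *           (poll() ignores negative descriptors).
 *
 * Returns 1 if the parent appears to be gone, 0 if it is still there
 * (or the wait was interrupted by a signal), and -1 on error.
 */
int
wait_or_parent_gone(int death_fd, int work_fd, int timeout_ms)
{
    struct pollfd fds[2];
    int           rc;

    assert(death_fd >= 0);

    fds[0].fd = death_fd;
    fds[0].events = POLLIN;
    fds[0].revents = 0;
    fds[1].fd = work_fd;
    fds[1].events = POLLIN;
    fds[1].revents = 0;

    rc = poll(fds, 2, timeout_ms);
    if (rc < 0)
        return (errno == EINTR) ? 0 : -1;

    if (fds[0].revents & POLLNVAL)
        return -1;              /* death_fd is not an open descriptor */

    /* Once every writer has closed the pipe, the read end reports a hangup. */
    if (fds[0].revents & (POLLHUP | POLLERR))
        return 1;

    /* Some systems report end-of-file as readable instead: read() gives 0. */
    if (fds[0].revents & POLLIN)
    {
        char    c;

        if (read(death_fd, &c, 1) == 0)
            return 1;
    }

    return 0;
}
```

A loop such as MainHandlerLoop_RMSEG could, under these assumptions, call a
helper like this once per iteration and exit when it returns 1. A simpler but
less immediate alternative is to compare getppid() with the parent PID saved
at startup after each poll() timeout.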




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
