On 31.10.2011 14:09, Andreas Piesk wrote:
> 
> after looking at runtime/modules.c i suspected that the mutex doesn't get 
> unlocked.
> 
> after inserting some more debug printf:
> 
> 2426.965520000:2ad3d4b29320: cfline: '$ModLoad imfile'
> 2426.965545000:2ad3d4b29320: Requested to load module 'imfile'
> 2426.965548000:2ad3d4b29320: setting module load/unlock lock
> 2426.965551000:2ad3d4b29320: module load/unload lock set
> 2426.965555000:2ad3d4b29320: loading module '/lib64/rsyslog/imfile.so'
> 2426.965591000:419d5940: strm 0x2ad3e9475220: file 8 read 537 bytes
> 2426.965733000:419d5940: MsgSetTAG in: len 9, pszBuf: rsyslogd:
> 2426.965740000:419d5940: MsgSetTAG exit: pMsg->iLenTAG 9, pMsg->TAG.szBuf: 
> rsyslogd:
> 2426.965776000:419d5940: XXXXX:  tryDoAction 0x2ad3e94705a0, pnElem 1, nElem 1
> 2426.965780000:419d5940: Action 0x2ad3e94705a0 transitioned to state: itx
> 2426.965783000:419d5940: entering actionCalldoAction(), state: itx
> 2426.965797000:419d5940:  10.10.0.254
> 2426.965818000:419d5940: caller requested object 'nsd_ptcp', not found (iRet 
> -3003)
> 2426.965843000:419d5940: Requested to load module 'lmnsd_ptcp'
> 2426.965846000:419d5940: setting module load/unlock lock
> 
> i don't see "imfile: version %s initializing\n", so the problem seems to be 
> imfile because it
> doesn't get initialized and blocks the mutex.
> 

i've debugged the issue a little bit further and it turned out that 
imfile:modInit() hangs at
UseObj(), or to be precise at pthread_mutex_lock(&mutObjGlobalOp).

i think it's a deadlock between mutObjGlobalOp and mutLoadUnload:

- thread 2ad3d4b29320 calls modules:Load() which acquires lock mutLoadUnload
- thread 419d5940 calls obj:UseObj() which aquires lock mutObjGlobalOp
- thread 2ad3d4b29320 calls imfile:modInit() which calls obj:UseObj() and hangs 
at lock mutObjGlobalOp
- thread 419d5940 calls modules:Load() and hangs at lock mutLoadUnload

Rainer, am i right so far? if yes, do you have an idea how to fix it?

regards,
-ap

BTW: the issue is not easily reproducible because the timing has to be right. 
in a test-VM i
ususally get a hanging rsyslogd every 5 reboots.
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com

Reply via email to