Hello,
I have noticed that all workers dequeing messages in main queue are using the
same objects instances (at least in 5.8.10).
This creates a lot of contention when using regexp since matching a string
against a regexp take a lock on the regexp instance (in libc)
As a result, all threads but one are waiting to acquire the lock.
Here is a copy of one backtrace:
Thread 4 (Thread 0x7f7ad75fe700 (LWP 25735)):
#0 0x00007f7b532b306e in __lll_lock_wait_private () from /lib64/libc.so.6
#1 0x00007f7b5328852a in _L_lock_49262 () from /lib64/libc.so.6
#2 0x00007f7b532845fc in regexec@@GLIBC_2.3.4 () from /lib64/libc.so.6
#3 0x00007f7b541e4b80 in MsgGetProp (pMsg=<value optimized out>,
pTpe=0x7f7b555fa500,
propID=<value optimized out>, pPropLen=0x7f7ad75fd158,
pbMustBeFreed=0x7f7ad75fd16e)
at msg.c:2621
#4 0x00007f7b54204979 in tplToString (pTpl=0x7f7b555f9370,
pMsg=0x7f7aa3ed1a40,
ppBuf=0x7f7b519984a0, pLenBuf=0x7f7b519984c0) at ../template.c:129
#5 0x00007f7b54202305 in prepareDoActionParams (pAction=0x7f7b5560b370,
pBatch=0x7f7b55641e48, pbShutdownImmediate=<value optimized out>) at
../action.c:780
#6 prepareBatch (pAction=0x7f7b5560b370, pBatch=0x7f7b55641e48,
pbShutdownImmediate=<value optimized out>) at ../action.c:1166
#7 processBatchMain (pAction=0x7f7b5560b370, pBatch=0x7f7b55641e48,
pbShutdownImmediate=<value optimized out>) at ../action.c:1207
#8 0x00007f7b541ff93f in doQueueEnqObjDirectBatch (pAction=0x7f7b5560b370,
pBatch=0x7f7b55641e48) at ../action.c:1659
#9 0x00007f7b54200f2b in doSubmitToActionQNotAllMarkBatch
(pAction=0x7f7b5560b370,
pBatch=0x7f7b55641e48) at ../action.c:1568
#10 0x00007f7b541fbfd3 in processBatchDoActions (pData=0x7f7b5560b370,
pParam=<value optimized out>) at rule.c:101
#11 0x00007f7b541e653f in llExecFunc (pThis=0x7f7b556090b0,
pFunc=0x7f7b541fbfa0 <processBatchDoActions>, pParam=0x7f7b55641e48) at
linkedlist.c:389
#12 0x00007f7b541fc3b3 in processBatch (pThis=0x7f7b55609060,
pBatch=0x7f7b55641e48)
at rule.c:292
#13 0x00007f7b541fb3de in processBatchDoRules (pData=0x7f7b55609060,
pParam=0x7f7b55641e48)
at ruleset.c:150
#14 0x00007f7b541e653f in llExecFunc (pThis=0x7f7b555f4410,
pFunc=0x7f7b541fb3b0 <processBatchDoRules>, pParam=0x7f7b55641e48) at
linkedlist.c:389
#15 0x00007f7b541fb866 in processBatch (pBatch=0x7f7b55641e48) at ruleset.c:232
#16 0x00007f7b541d3641 in msgConsumer (notNeeded=<value optimized out>,
pBatch=0x7f7b55641e48, pbShutdownImmediate=0x7f7b55641028) at syslogd.c:719
#17 0x00007f7b541faedb in ConsumerReg (pThis=0x7f7b55641010,
pWti=0x7f7b55641e20)
at queue.c:1760
#18 0x00007f7b541f3e46 in wtiWorker (pThis=0x7f7b55641e20) at wti.c:313
#19 0x00007f7b541f393a in wtpWorker (arg=0x7f7b55641e20) at wtp.c:387
#20 0x00007f7b53b77851 in start_thread () from /lib64/libpthread.so.0
#21 0x00007f7b532a390d in clone () from /lib64/libc.so.6
This is sad because each worker could (should?) create its own instance of the
regex.
Do you agree with that conclusion?
The configuration except is:
$template All,"/var/opt/logs/technicals/%msg:R,ERE,1,ZERO:^
\+([a-zA-Z/0-9\-]+)\+(.*)--end%.log"
$template RemoveCR, "%TIMESTAMP:::date-rfc3339% %HOSTNAME% %msg:R,ERE,2,FIELD:^
\+([a-zA-Z/0-9\-]+)\+(.*)--end%\n"
:fromhost-ip,!isequal,"127.0.0.1" -?All;RemoveCR
I can include the full configuration if needed.
Right now the only solution I see would be to load balance messages between
several actions doing exactly the same treatment (but maybe this would create
new issues).
--
Gregoire
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of
sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE
THAT.