Hello, 

With the following config: 
... 
# Send all logs onto the local relay 
# 
*.*;syslog.!=info @@log1;RSYSLOG_ForwardFormat 
$ActionExecOnlyWhenPreviousIsSuspended on 
& @@log2 
& /var/spool/rsyslog-buffer 
$ActionExecOnlyWhenPreviousIsSuspended off 


We have systems where rsyslogd is stuck in a blocking connect() and does not
appear to recover. strace, a gdb backtrace and lsof output follow:

# strace -p 32318 
Process 32318 attached - interrupt to quit 
connect(1, {sa_family=AF_INET, sin_port=htons(514), sin_addr=inet_addr("log2")}, 16 <unfinished ...>
Process 32318 detached 


Loaded symbols for /lib64/libnss_dns.so.2 
Reading symbols from /lib64/libresolv.so.2...done. 
Loaded symbols for /lib64/libresolv.so.2 
Reading symbols from /lib64/rsyslog/lmnsd_ptcp.so...done. 
Loaded symbols for /lib64/rsyslog/lmnsd_ptcp.so 
0x000000319ca0cf2b in connect () from /lib64/libpthread.so.0 
#0 0x000000319ca0cf2b in connect () from /lib64/libpthread.so.0 
#1 0x00002aaaab0d1d65 in Connect (pNsd=0x2aaaac8abe10, family=<value optimized out>, port=<value optimized out>, host=<value optimized out>) at nsd_ptcp.c:684
#2 0x000000000040ff29 in TCPSendInit () 
#3 0x0000000000410038 in doTryResume () 
#4 0x0000000000436d30 in actionTryResume () 
#5 0x0000000000437393 in submitBatch () 
#6 0x0000000000437978 in processBatchMain () 
#7 0x0000000000435896 in doSubmitToActionQBatch () 
#8 0x00000000004361f9 in doSubmitToActionQNotAllMarkBatch () 
#9 0x00000000004325b8 in processBatchDoActions () 
#10 0x000000000041d5d8 in llExecFunc () 
#11 0x0000000000432933 in processBatch () 
#12 0x00000000004319de in processBatchDoRules () 
#13 0x000000000041d5d8 in llExecFunc () 
#14 0x0000000000431f04 in processBatch () 
#15 0x000000000040b5cf in msgConsumer () 
#16 0x0000000000430dcd in ConsumerReg () 
#17 0x000000000042a51c in wtiWorker () 
#18 0x000000000042a136 in wtpWorker () 
#19 0x000000319ca062f7 in start_thread () from /lib64/libpthread.so.0 
#20 0x000000319c2d1b6d in clone () from /lib64/libc.so.6 

$ sudo /usr/sbin/lsof -p 20064 
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME 
rsyslogd 20064 root cwd DIR 8,7 4096 2 / 
rsyslogd 20064 root rtd DIR 8,7 4096 2 / 
rsyslogd 20064 root txt REG 8,7 441537 1310792 /sbin/rsyslogd 
rsyslogd 20064 root mem REG 8,7 134400 885017 /lib64/ld-2.5.so 
rsyslogd 20064 root mem REG 8,7 1699912 885018 /lib64/libc-2.5.so 
rsyslogd 20064 root mem REG 8,7 23360 885019 /lib64/libdl-2.5.so 
rsyslogd 20064 root mem REG 8,7 141440 885023 /lib64/libpthread-2.5.so 
rsyslogd 20064 root mem REG 8,7 53448 885024 /lib64/librt-2.5.so 
rsyslogd 20064 root mem REG 8,6 85928 164425 /usr/lib64/libz.so.1.2.3 
rsyslogd 20064 root mem REG 8,7 92736 884796 /lib64/libresolv-2.5.so 
rsyslogd 20064 root mem REG 8,7 53880 884764 /lib64/libnss_files-2.5.so 
rsyslogd 20064 root mem REG 8,7 23632 884762 /lib64/libnss_dns-2.5.so 
rsyslogd 20064 root mem REG 8,7 93320 885005 /lib64/rsyslog/lmnsd_ptcp.so 
rsyslogd 20064 root mem REG 8,7 75802 884921 /lib64/rsyslog/lmnet.so 
rsyslogd 20064 root mem REG 8,7 1295631 884806 /lib64/rsyslog/imuxsock.so 
rsyslogd 20064 root mem REG 8,7 81914 884794 /lib64/rsyslog/imklog.so 
rsyslogd 20064 root mem REG 8,7 57594 884804 /lib64/rsyslog/imudp.so 
rsyslogd 20064 root mem REG 8,7 37373 884801 /lib64/rsyslog/impstats.so 
rsyslogd 20064 root mem REG 8,7 90803 884994 /lib64/rsyslog/lmnetstrms.so 
rsyslogd 20064 root mem REG 8,7 35770 884873 /lib64/rsyslog/lmtcpclt.so 
rsyslogd 20064 root 0u unix 0xffff81031e95e0c0 876257540 /dev/log 
rsyslogd 20064 root 1u IPv4 1000073229 TCP app1.lhr.acx:43293->192.168.132.143:shell (SYN_SENT)
rsyslogd 20064 root 2r 0000 0,10 0 876257542 eventpoll 
rsyslogd 20064 root 3u IPv6 876257538 UDP *:syslog 
rsyslogd 20064 root 4u IPv4 876257539 UDP *:syslog 
rsyslogd 20064 root 8r REG 0,3 0 4026531849 /proc/kmsg 


The suspected code is in nsd_ptcp.c. It calls connect() on a blocking socket
and does not appear to bound the call with a timeout, whether through
O_NONBLOCK (O_NDELAY) or some other mechanism:

    if((pThis->sock = socket(res->ai_family, res->ai_socktype,
                             res->ai_protocol)) == -1) {
        ABORT_FINALIZE(RS_RET_IO_ERROR);
    }

    if(connect(pThis->sock, res->ai_addr, res->ai_addrlen) != 0) {
        ABORT_FINALIZE(RS_RET_IO_ERROR);
    }



Rgds 
Rodney 
