Re: [rsyslog] Qualys scan against rsyslog causes it to segfault
On 10/23/2017 7:51 PM, deoren wrote: On 10/23/2017 7:38 PM, deoren wrote: On 10/23/2017 7:11 PM, David Lang wrote: do you have a tcpdump or info from Qualys saying what it sends as part of the scan? David Lang Thankfully (for troubleshooting purposes), the problem isn't specific to the Qualys scan. I later learned that messages coming from our ESXi hosts trigger the problem as well. It may be that ANY message arriving on an input where I'm attempting to check for an empty $!origin!hostname property is enough to trigger the segfault. That said, the messages sent by the Qualys scan along with messages I've seen coming from our ESXi hosts are often missing information (such as the hostname). I'll do further testing and post back. According to what I captured with tcpdump, this is what Wireshark translated the conversation as: <166>2017-10-24T00:48:08.071Z vms1.example.com Hostd: [23140B70 verbose 'Solo.VmwareCLI' opID=hostd-3963 user=root] Result (type boolean) (wsdl boolean) (kind 1) <166>2017-10-24T00:48:08.071Z vms1.example.com Hostd: [22DC2B70 verbose 'Hostsvc.SyslogConfigProvider'] Received syslog cli invalidation message <166>2017-10-24T00:48:08.071Z vms1.example.com Hostd: [22DC2B70 verbose 'Hostsvc.SyslogConfigProvider'] Running '/sbin/localcli system syslog config get' <166>2017-10-24T00:48:08.073Z vms1.example.com Hostd: [22DC2B70 info 'SysCommandPosix'] ForkExec(/sbin/localcli) 9164454 <166>2017-10-24T00:48:08.077Z vms1.example.com Hostd: [226B0B70 verbose 'Default' opID=hostd-10a7 user=root] CloseSession called for session id=0896d7c3-f4a1-d872-7b76-a01bf0543edf <166>2017-10-24T00:48:08.077Z vms1.example.com Hostd: [226B0B70 info 'Vimsvc.ha-eventmgr' opID=hostd-10a7 user=root] Event 743 : User root@127.0.0.1 logged out (login time: Tuesday, 24 October, 2017 00:48:07, number of API invocations: 0, user agent: ) <166>2017-10-24T00:48:08.106Z vms1.example.com Rhttpproxy: [FF9CFB70 verbose 'Proxy Req 85506'] The client closed the stream, not unexpectedly. <166>2017-10-24T00:48:08.408Z vms1.example.com Hostd: [22DC2B70 verbose 'Hostsvc.SyslogConfigProvider'] Running '/sbin/localcli system syslog config logger list' <166>2017-10-24T00:48:08.409Z vms1.example.com Hostd: [22DC2B70 info 'SysCommandPosix'] ForkExec(/sbin/localcli) 916445 I'll next test with the logger command from a remote host and see where that goes. logger --tcp --port 514 --server sawmill3.example.com "hello!" That's enough to cause a segfault. ___ rsyslog mailing list http://lists.adiscon.net/mailman/listinfo/rsyslog http://www.rsyslog.com/professional-services/ What's up with rsyslog? Follow https://twitter.com/rgerhards NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.
Re: [rsyslog] Qualys scan against rsyslog causes it to segfault
On 10/23/2017 7:38 PM, deoren wrote: On 10/23/2017 7:11 PM, David Lang wrote: do you have a tcpdump or info from Qualys saying what it sends as part of the scan? David Lang Thankfully (for troubleshooting purposes), the problem isn't specific to the Qualys scan. I later learned that messages coming from our ESXi hosts trigger the problem as well. It may be that ANY message arriving on an input where I'm attempting to check for an empty $!origin!hostname property is enough to trigger the segfault. That said, the messages sent by the Qualys scan along with messages I've seen coming from our ESXi hosts are often missing information (such as the hostname). I'll do further testing and post back. According to what I captured with tcpdump, this is what Wireshark translated the conversation as: <166>2017-10-24T00:48:08.071Z vms1.example.com Hostd: [23140B70 verbose 'Solo.VmwareCLI' opID=hostd-3963 user=root] Result (type boolean) (wsdl boolean) (kind 1) <166>2017-10-24T00:48:08.071Z vms1.example.com Hostd: [22DC2B70 verbose 'Hostsvc.SyslogConfigProvider'] Received syslog cli invalidation message <166>2017-10-24T00:48:08.071Z vms1.example.com Hostd: [22DC2B70 verbose 'Hostsvc.SyslogConfigProvider'] Running '/sbin/localcli system syslog config get' <166>2017-10-24T00:48:08.073Z vms1.example.com Hostd: [22DC2B70 info 'SysCommandPosix'] ForkExec(/sbin/localcli) 9164454 <166>2017-10-24T00:48:08.077Z vms1.example.com Hostd: [226B0B70 verbose 'Default' opID=hostd-10a7 user=root] CloseSession called for session id=0896d7c3-f4a1-d872-7b76-a01bf0543edf <166>2017-10-24T00:48:08.077Z vms1.example.com Hostd: [226B0B70 info 'Vimsvc.ha-eventmgr' opID=hostd-10a7 user=root] Event 743 : User root@127.0.0.1 logged out (login time: Tuesday, 24 October, 2017 00:48:07, number of API invocations: 0, user agent: ) <166>2017-10-24T00:48:08.106Z vms1.example.com Rhttpproxy: [FF9CFB70 verbose 'Proxy Req 85506'] The client closed the stream, not unexpectedly. <166>2017-10-24T00:48:08.408Z vms1.example.com Hostd: [22DC2B70 verbose 'Hostsvc.SyslogConfigProvider'] Running '/sbin/localcli system syslog config logger list' <166>2017-10-24T00:48:08.409Z vms1.example.com Hostd: [22DC2B70 info 'SysCommandPosix'] ForkExec(/sbin/localcli) 916445 I'll next test with the logger command from a remote host and see where that goes. ___ rsyslog mailing list http://lists.adiscon.net/mailman/listinfo/rsyslog http://www.rsyslog.com/professional-services/ What's up with rsyslog? Follow https://twitter.com/rgerhards NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.
Re: [rsyslog] Qualys scan against rsyslog causes it to segfault
On 10/23/2017 7:11 PM, David Lang wrote: do you have a tcpdump or info from Qualys saying what it sends as part of the scan? David Lang Thankfully (for troubleshooting purposes), the problem isn't specific to the Qualys scan. I later learned that messages coming from our ESXi hosts trigger the problem as well. It may be that ANY message arriving on an input where I'm attempting to check for an empty $!origin!hostname property is enough to trigger the segfault. That said, the messages sent by the Qualys scan along with messages I've seen coming from our ESXi hosts are often missing information (such as the hostname). I'll do further testing and post back. ___ rsyslog mailing list http://lists.adiscon.net/mailman/listinfo/rsyslog http://www.rsyslog.com/professional-services/ What's up with rsyslog? Follow https://twitter.com/rgerhards NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.
Re: [rsyslog] Qualys scan against rsyslog causes it to segfault
do you have a tcpdump or info from Qualys saying what it sends as part of the scan? David Lang ___ rsyslog mailing list http://lists.adiscon.net/mailman/listinfo/rsyslog http://www.rsyslog.com/professional-services/ What's up with rsyslog? Follow https://twitter.com/rgerhards NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.
Re: [rsyslog] Qualys scan against rsyslog causes it to segfault
On 10/7/2017 10:44 AM, deoren wrote: On 10/7/2017 5:25 AM, Rainer Gerhards wrote: 2017-10-07 7:57 GMT+02:00 deoren: As I dig more into this, I'm beginning to think the only thing the Qualys scan did was aggravate an existing problem and cause rsyslog to tip over more quickly. While I didn't observe the problem at the time, I recently made some configuration changes to introduce a lookup table to map known IPs to known hostnames (something I plan to programmatically build from a central db at some point in the future). I opted to try this in an effort to stop using $hostname directly (which gives odd results with invalid messages) for dynamic directory/filenames. A long story short, I found that systemd is now (without a Qualys scan actively running) restarting rsyslog after it dies within a minute or so of running. My guess is that when a remote client sees that the rsyslog receiver is running again it retries delivery and when the receiver is passed the message it performs the lookup. At that point the real cause for the segfault is triggered and rsyslog goes down. Looking at the output of the dmesg command I found many instances of messages like this one: [364581.440049] rs:main Q:Reg[82570]: segfault at 0 ip 7faeb1fa55f4 sp 7faeaac431d0 error 4 in libfastjson.so.4.1.0[7faeb1fa1000+a000] I stopped rsyslog and then manually started it up like so: rsyslogd -dn > rsyslog-debug.log 2>&1 This was printed to the console: Segmentation fault (core dumped) Here are the last 20 lines from the debug log file: 7587.842410212:main Q:Reg/w0 : ../action.c: action 'action 21': set suspended state to 0 7587.842412422:main Q:Reg/w0 : rainerscript.c: STOP 7587.842418787:main Q:Reg/w0 : ruleset.c: processBATCH: next msg 44: <167>2017-10-07T01:15:37.701Z vms2.example.com Rhttpproxy: verbose rhttpproxy[79FC5B70] [Originator@6876 sub=Proxy Req 829 7587.842421471:main Q:Reg/w0 : rainerscript.c: CALL [receiver-origin-hostname-lookup, queue:0] 7587.842427317:main Q:Reg/w0 : rainerscript.c: IF 7587.842433062:main Q:Reg/w0 : rainerscript.c: var 'fromhost-ip' 7587.842442260:main Q:Reg/w0 : rainerscript.c: == 7587.842449663:main Q:Reg/w0 : rainerscript.c: string '127.0.0.1' 7587.842460890:main Q:Reg/w0 : rainerscript.c: eval expr 0x22147a0, type 'CMP_EQ' 7587.842463187:main Q:Reg/w0 : rainerscript.c: eval expr 0x22138e0, type 'V' 7587.842466060:main Q:Reg/w0 : rainerscript.c: rainerscript: (string) var 8: '192.168.5.142' 7587.842468406:main Q:Reg/w0 : rainerscript.c: eval expr 0x22138e0, return datatype 'S':0 7587.842470812:main Q:Reg/w0 : rainerscript.c: eval expr 0x22147a0, return datatype 'N':0 7587.842473056:main Q:Reg/w0 : ruleset.c: if condition result is 0 7587.842475315:main Q:Reg/w0 : rainerscript.c: IF 7587.842480945:main Q:Reg/w0 : rainerscript.c: var '!origin!hostname' 7587.842490085:main Q:Reg/w0 : rainerscript.c: == 7587.842497393:main Q:Reg/w0 : rainerscript.c: string '' 7587.842508499:main Q:Reg/w0 : rainerscript.c: eval expr 0x22148f0, type 'CMP_EQ' 7587.842511262:main Q:Reg/w0 : rainerscript.c: eval expr 0x22146f0, type 'V' I checked my JSON lookup table file and found a duplicate key (but with different values). This was unintentional and was a result of a mistake when building out the file (copy, paste, modify, oops!). I was hopeful that fixing that would resolve the issue, but no such luck. Just in case it isn't readable above, here is the same output plus the ruleset I'm using: https://gist.github.com/deoren/7e8e799428d0d87c4e467fd4add51d09 Anything stand out as obviously wrong? I've gone back over the Expressions, Control Structures and Variable (Property) types docs and it doesn't appear that I'm attempting to access/set the variables improperly, but I could be missing something. The lookup table feature was added in 8.17.0 as an experimental feature. There seems to be at least one hard-to-find bug inside it, see here: https://github.com/rsyslog/rsyslog/issues/1071 In the past days, I have looked at the issue as the original contributor seems to have lost interest in it. It sounds like what you experience is related to this problem. It would be great if you could run rsyslog under valgrind and see if the problem persists and, if so, what valgrind reports. You can use valgrind simply by installing the distro-provided valgrind package and prefixing the command with "valgrind", as such: $ valgrind rsyslogd -dn > rsyslog-debug.log 2>&1 This would be very helpful. It might also be very, very useful if you could add your findings to above-mentioned github issue tracker. I am 99.999% sure it is the same root cause. Rainer Thanks in advance for your help. Sure, I can do that. I ended up eventually getting my configuration to a "stable" point where where rsyslog was no longer crashing, though I don't yet fully
[rsyslog] RES: Client/server architecture with omrelp/imrelp - Lost of messages in case of server side failure
Hello, I realized the parameter RebindInterval was causing duplicated messages in both balanced servers. I removed that and apparentely the messages haven't been lost anymore. Tomorrow I will stress it more. Regards, Jether De: Jether B. Santos Enviada em: segunda-feira, 23 de outubro de 2017 11:35 Para: 'rsyslog@lists.adiscon.com'Assunto: Client/server architecture with omrelp/imrelp - Lost of messages in case of server side failure Hi everybody, I'm testing the following rsyslog client/server architecture: - Client side: a rsyslog server that consumes messages from a local apache log file through imfile module. The main worker threads save the messages in a disk assisted queue of an omrelp action. The action threads dequeue the messages and send them to rsyslog servers behind a haproxy instance. Rsyslog's client configuration: --- module(load="imfile" mode="inotify") input(type="imfile" file="/var/log/httpd/access_log" tag="apache:" facility="local3" severity="notice" ruleset="remote_logserver") module(load="omrelp") ruleset(name="remote_logserver"){ action(type="omrelp" Target="10.0.0.43" Port="514" queue.filename="fila-msg" queue.size="180" queue.saveonshutdown="on" queue.type="LinkedList" action.resumeretrycount="-1" action.resumeInterval="1" RebindInterval="72") } --- - Server side: two rsyslog servers behind a haproxy load balancer instance. The servers listen the logs through imrelp module. Rsyslog's servers configuration: --- module(load="imrelp") input(type="imrelp" port="514") local3.*/var/log/httpd-access.log --- The problem: Everytime I stop and start the rsyslog daemon in the remote rsyslog server where the client is conected, despite haproxy switches the connection to the other rsyslog server, at least a message is lost. I tried configure retrycount="-1" but without success. I read about similar problem in the mail list (http://lists.adiscon.net/pipermail/rsyslog/2016-March/042379.html) but I've not understood exaclly what's been done to solve that (https://github.com/rsyslog/rsyslog/issues/974?_pjax=%23js-repo-pjax-container): - Is it really mandatory to configure batch of only one message a time (queue.dequeuebatchsize="1") in order to not lose any message in case of the remote rsyslog server fail? - Has the "isolation" of a "corrupt message" implemented (Rainer said in the mail list mentioned above: I thought that if a batch failed, it pushed all the messages back on the queue and retried with a half size batch until it got to the one message that could not be processed and only did a fatal fail on that message)? - What about the action retry feature? When it is enabled and you have a batch of many messages and just one fails, what's expected to happen? My servers environment: - Centos 7.3 - Rsyslog version: 8.24 Regards, Jether "Esta mensagem e reservada e sua divulgacao, distribuicao, reproducao ou qualquer forma de uso e proibida e depende de previa autorizacao desta instituicao. O remetente utiliza o correio eletronico no exercicio do seu trabalho ou em razao dele, eximindo esta instituicao de qualquer responsabilidade por utilizacao indevida. Se voce recebeu esta mensagem por engano, favor elimina-la imediatamente." "This message is reserved and its disclosure, distribution, reproduction or any other form of use is prohibited and shall depend upon previous proper authorization. The sender uses the electronic mail in the exercise of his/her work or by virtue thereof, and the institution accepts no liability for its undue use. If you have received this e-mail by mistake, please delete it immediately." ___ rsyslog mailing list http://lists.adiscon.net/mailman/listinfo/rsyslog http://www.rsyslog.com/professional-services/ What's up with rsyslog? Follow https://twitter.com/rgerhards NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.
[rsyslog] Client/server architecture with omrelp/imrelp - Lost of messages in case of server side failure
Hi everybody, I'm testing the following rsyslog client/server architecture: - Client side: a rsyslog server that consumes messages from a local apache log file through imfile module. The main worker threads save the messages in a disk assisted queue of an omrelp action. The action threads dequeue the messages and send them to rsyslog servers behind a haproxy instance. Rsyslog's client configuration: --- module(load="imfile" mode="inotify") input(type="imfile" file="/var/log/httpd/access_log" tag="apache:" facility="local3" severity="notice" ruleset="remote_logserver") module(load="omrelp") ruleset(name="remote_logserver"){ action(type="omrelp" Target="10.0.0.43" Port="514" queue.filename="fila-msg" queue.size="180" queue.saveonshutdown="on" queue.type="LinkedList" action.resumeretrycount="-1" action.resumeInterval="1" RebindInterval="72") } --- - Server side: two rsyslog servers behind a haproxy load balancer instance. The servers listen the logs through imrelp module. Rsyslog's servers configuration: --- module(load="imrelp") input(type="imrelp" port="514") local3.*/var/log/httpd-access.log --- The problem: Everytime I stop and start the rsyslog daemon in the remote rsyslog server where the client is conected, despite haproxy switches the connection to the other rsyslog server, at least a message is lost. I tried configure retrycount="-1" but without success. I read about similar problem in the mail list (http://lists.adiscon.net/pipermail/rsyslog/2016-March/042379.html) but I've not understood exaclly what's been done to solve that (https://github.com/rsyslog/rsyslog/issues/974?_pjax=%23js-repo-pjax-container): - Is it really mandatory to configure batch of only one message a time (queue.dequeuebatchsize="1") in order to not lose any message in case of the remote rsyslog server fail? - Has the "isolation" of a "corrupt message" implemented (Rainer said in the mail list mentioned above: I thought that if a batch failed, it pushed all the messages back on the queue and retried with a half size batch until it got to the one message that could not be processed and only did a fatal fail on that message)? - What about the action retry feature? When it is enabled and you have a batch of many messages and just one fails, what's expected to happen? My servers environment: - Centos 7.3 - Rsyslog version: 8.24 Regards, Jether "Esta mensagem e reservada e sua divulgacao, distribuicao, reproducao ou qualquer forma de uso e proibida e depende de previa autorizacao desta instituicao. O remetente utiliza o correio eletronico no exercicio do seu trabalho ou em razao dele, eximindo esta instituicao de qualquer responsabilidade por utilizacao indevida. Se voce recebeu esta mensagem por engano, favor elimina-la imediatamente." "This message is reserved and its disclosure, distribution, reproduction or any other form of use is prohibited and shall depend upon previous proper authorization. The sender uses the electronic mail in the exercise of his/her work or by virtue thereof, and the institution accepts no liability for its undue use. If you have received this e-mail by mistake, please delete it immediately." ___ rsyslog mailing list http://lists.adiscon.net/mailman/listinfo/rsyslog http://www.rsyslog.com/professional-services/ What's up with rsyslog? Follow https://twitter.com/rgerhards NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.