Hi,

We're using rsyslog to send logs to ES through omelasticsearch. One issue we're having is that from time to time, after a network issue, the queue for messages going to ES keeps growing. We are writing to multiple indices in the same cluster, and this issue only happens for some of the indices (and not always the same ones).
What I found so far:

Dropping traffic from ES is a relatively reliable way to trigger the issue:

    iptables -A INPUT -p tcp --source <ES proxy IP> --sport 9200 -j DROP ; sleep 180 ; iptables -D INPUT -p tcp --source <ES proxy IP> --sport 9200 -j DROP

Force-closing the ES sockets in the worker (via GDB) fixes the issue (queue :

    for FD in `lsof -p $PID -n -a -i TCP:wap-wsp -a -s TCP:ESTABLISHED -F f | grep '^f' | cut -d f -f 2`; do
        gdb -batch -ex 'set logging on' -ex 'set logging redirect on' -ex "attach $PID" -ex "call shutdown($FD, 1)" -ex 'detach' -ex 'quit'
    done

My current theory is that one or more libcurl requests get into a state where rsyslog has sent a request to ES, ES has tried to send the answer back to rsyslog for some time but then gave up and closed the connection, and since there is no timeout on the request, omelasticsearch is stuck forever. But of course I might be dead wrong and there is a simple explanation.

I wanted to try setting a timeout on the ES request to see if this fixes the issue, but first I wanted to ask whether there is a specific reason why there is no option to set a timeout on the ES request, or whether it just was not implemented at the time.

Kind regards,
Mattia
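
P.S. To make the theory above a bit more concrete, here is a minimal sketch of the failure mode in plain Python sockets. It is not rsyslog or libcurl code, and the helper names (start_silent_server, request_with_timeout) are made up for illustration. A local server accepts the connection and then never answers, which stands in for the case where the reply from ES is lost. With a timeout the client gives up; without one, the recv() call would block indefinitely, which is what I suspect happens in the worker. In libcurl terms the equivalent knob would presumably be something like CURLOPT_TIMEOUT on the request handle, but that is my assumption, not something I have tested in omelasticsearch.

```python
import socket
import threading


def start_silent_server():
    """Listen on a free local port, accept connections, and never reply."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    srv.listen(1)
    held = []  # keep accepted sockets referenced so they are never closed

    def serve():
        while True:
            try:
                conn, _ = srv.accept()
            except OSError:
                return  # listening socket was closed
            held.append(conn)  # accept, then stay silent

    threading.Thread(target=serve, daemon=True).start()
    return srv, held


def request_with_timeout(port, timeout_s):
    """Send a request and wait for a reply, giving up after timeout_s seconds."""
    with socket.create_connection(("127.0.0.1", port), timeout=timeout_s) as c:
        c.sendall(b"POST /_bulk HTTP/1.1\r\nHost: localhost\r\n\r\n")
        try:
            data = c.recv(4096)  # would block forever if no timeout were set
            return "reply" if data else "closed"
        except socket.timeout:
            return "timed out"


if __name__ == "__main__":
    srv, held = start_silent_server()
    port = srv.getsockname()[1]
    print(request_with_timeout(port, 0.5))
```

With the timeout in place the caller gets control back and can retry or reconnect, instead of holding the queue hostage on a request that will never complete.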

