I have now confirmed that all the buffered logs are sitting in the omelasticsearch-queue.00000001 file. I grabbed the last log message reported to ES, found the next one that should have been reported, and located it at the top of omelasticsearch-queue.00000001. So either my queues are configured incorrectly (see my original post) or not all queue-related bugs were fixed in 8.18.
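In case it helps, the check can be sketched roughly like this (purely illustrative: a synthetic file stands in for the real /var/spool/rsyslog/omelasticsearch-queue.00000001, and the timestamp regex assumes an ISO-dated log format like mine):

```python
import os
import re
import tempfile

# Illustrative: write a tiny fake queue file. In practice you would read
# /var/spool/rsyslog/omelasticsearch-queue.00000001 directly (as root).
data = b"<header bytes>2016-05-27 11:02:03 first buffered message\nmore...\n"
path = os.path.join(tempfile.mkdtemp(), "omelasticsearch-queue.00000001")
with open(path, "wb") as f:
    f.write(data)

# Read the head of the file and look for the first timestamped record;
# compare it against the last message that actually made it into ES.
with open(path, "rb") as f:
    head = f.read(64 * 1024).decode("utf-8", errors="replace")
m = re.search(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}[^\n]*", head)
print(m.group(0) if m else "no timestamped record found")
# -> 2016-05-27 11:02:03 first buffered message
```

If the first record in the spool file is the message immediately after the last one indexed in ES, the backlog really is sitting in the disk-assisted queue.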
I restarted rsyslog in debug mode. Is there anything else I can do to help troubleshoot this?

> ls -lat /var/spool/rsyslog/
total 17944
drwxr-xr-x 2 root root     4096 May 27 15:43 .
-rw------- 1 root root      561 May 27 15:23 omelasticsearch-queue.qi
-rw------- 1 root root 17170428 May 27 15:23 omelasticsearch-queue.00000001

> rpm -qa | grep rsyslog
rsyslog-elasticsearch-8.18.0-1.el6.x86_64
rsyslog-8.18.0-1.el6.x86_64
rsyslog-mmnormalize-8.18.0-1.el6.x86_64

Thanks,

Alec

On Fri, May 27, 2016 at 9:15 AM, Alec Swan <[email protected]> wrote:
> I just found out that the reason rsyslog stopped writing to ES but
> continued writing the copy of the log to a separate log file was that
> puppet ran and overwrote my manual change to write to the separate log
> file :(
> So I am currently in a state where rsyslog is still writing messages to
> the log file, but not to Elasticsearch. Since startmsg.regex applies to
> both inputs, it's definitely not the culprit.
>
> Before I restarted rsyslog in debug mode I decided to grab impstats to
> see the omelasticsearch queue state. The last impstats logs were sent to
> Elasticsearch 4 hours ago and are shown below.
> I see failed.http=5 failed.httprequests=3 failed.es=0 not long before
> logs stop being sent to ES. Could this have caused the logs to start
> buffering?
>
> May 27th 2016, 05:27:07.960 message:omelasticsearch-myapp-launcher.log: origin=core.action processed=3979 failed=0 suspended=0 suspended.duration=0 resumed=0 @timestamp:May 27th 2016, 05:27:07.960 host:myhost hostip:10.0.0.1 severity:debug facility:syslog syslogtag:rsyslogd-pstats: programname:rsyslogd-pstats logtag:syslog _id:AVTx99mVggzpxkzyDyY9 _type:syslog _index:myapp-2016.05.27 _score:
> May 27th 2016, 05:27:07.960 message:omelasticsearch-myapp-launcher.log queue[DA]: origin=core.queue size=0 enqueued=0 full=0 discarded.full=0 discarded.nf=0 maxqsize=0 @timestamp:May 27th 2016, 05:27:07.960 host:myhost hostip:10.0.0.1 severity:debug facility:syslog syslogtag:rsyslogd-pstats: programname:rsyslogd-pstats logtag:syslog _id:AVTx99mVggzpxkzyDyZL _type:syslog _index:myapp-2016.05.27 _score:
> May 27th 2016, 05:27:07.960 message:omelasticsearch-myapp-http-request.log que[DA]: origin=core.queue size=0 enqueued=0 full=0 discarded.full=0 discarded.nf=0 maxqsize=0 @timestamp:May 27th 2016, 05:27:07.960 host:myhost hostip:10.0.0.1 severity:debug facility:syslog syslogtag:rsyslogd-pstats: programname:rsyslogd-pstats logtag:syslog _id:AVTx99mVggzpxkzyDyZN _type:syslog _index:myapp-2016.05.27 _score:
> May 27th 2016, 05:27:07.960 message:omelasticsearch-myapp-http-request.log que: origin=core.queue size=995 enqueued=37639 full=0 discarded.full=0 discarded.nf=0 maxqsize=20002 @timestamp:May 27th 2016, 05:27:07.960 host:myhost hostip:10.0.0.1 severity:debug facility:syslog syslogtag:rsyslogd-pstats: programname:rsyslogd-pstats logtag:syslog _id:AVTx99mVggzpxkzyDyZO _type:syslog _index:myapp-2016.05.27 _score:
> May 27th 2016, 05:27:07.960 message:omelasticsearch: origin=omelasticsearch submitted=54608 failed.http=5 failed.httprequests=3 failed.es=0 @timestamp:May 27th 2016, 05:27:07.960 host:myhost hostip:10.0.0.1 severity:debug facility:syslog syslogtag:rsyslogd-pstats: programname:rsyslogd-pstats logtag:syslog _id:AVTx99hfggzpxkzyDyY3 _type:syslog _index:myapp-2016.05.27 _score:
> May 27th 2016, 05:27:07.960 message:omelasticsearch-myapp-http-request.log: origin=core.action processed=37639 failed=0 suspended=0 suspended.duration=0 resumed=0 @timestamp:May 27th 2016, 05:27:07.960 host:myhost hostip:10.0.0.1 severity:debug facility:syslog syslogtag:rsyslogd-pstats: programname:rsyslogd-pstats logtag:syslog _id:AVTx99mVggzpxkzyDyZA _type:syslog _index:myapp-2016.05.27 _score:
> May 27th 2016, 05:27:07.960 message:omelasticsearch-syslog queue[DA]: origin=core.queue size=0 enqueued=0 full=0 discarded.full=0 discarded.nf=0 maxqsize=0 @timestamp:May 27th 2016, 05:27:07.960 host:myhost hostip:10.0.0.1 severity:debug facility:syslog syslogtag:rsyslogd-pstats: programname:rsyslogd-pstats logtag:syslog _id:AVTx99mVggzpxkzyDyZJ _type:syslog _index:myapp-2016.05.27 _score:
> May 27th 2016, 05:27:07.960 message:omelasticsearch-syslog: origin=core.action processed=35488 failed=0 suspended=0 suspended.duration=0 resumed=0 @timestamp:May 27th 2016, 05:27:07.960 host:myhost hostip:10.0.0.1 severity:debug facility:syslog syslogtag:rsyslogd-pstats: programname:rsyslogd-pstats logtag:syslog _id:AVTx99mVggzpxkzyDyY5 _type:syslog _index:myapp-2016.05.27 _score:
> May 27th 2016, 05:27:07.960 message:omelasticsearch-syslog queue: origin=core.queue size=20024 enqueued=35491 full=0 discarded.full=0 discarded.nf=0 maxqsize=20024 @timestamp:May 27th 2016, 05:27:07.960 host:myhost hostip:10.0.0.1 severity:debug facility:syslog syslogtag:rsyslogd-pstats: programname:rsyslogd-pstats logtag:syslog _id:AVTx99mVggzpxkzyDyZK _type:syslog _index:myapp-2016.05.27 _score:
> May 27th 2016, 05:27:07.960 message:omelasticsearch-myapp-launcher.log queue: origin=core.queue size=1482 enqueued=3979 full=0 discarded.full=0 discarded.nf=0 maxqsize=1482 @timestamp:May 27th 2016, 05:27:07.960 host:myhost hostip:10.0.0.1 severity:debug facility:syslog syslogtag:rsyslogd-pstats: programname:rsyslogd-pstats logtag:syslog _id:AVTx99mVggzpxkzyDyZM _type:syslog _index:myapp-2016.05.27 _score:
> May 27th 2016, 05:26:37.931 message:omelasticsearch-syslog queue[DA]: origin=core.queue size=0 enqueued=0 full=0 discarded.full=0 discarded.nf=0 maxqsize=0 @timestamp:May 27th 2016, 05:26:37.931 host:myhost hostip:10.0.0.1 severity:debug facility:syslog syslogtag:rsyslogd-pstats: programname:rsyslogd-pstats logtag:syslog _id:AVTx99hfggzpxkzyDyYu _type:syslog _index:myapp-2016.05.27 _score:
> May 27th 2016, 05:26:37.931 message:omelasticsearch-myapp-launcher.log queue[DA]: origin=core.queue size=0 enqueued=0 full=0 discarded.full=0 discarded.nf=0 maxqsize=0 @timestamp:May 27th 2016, 05:26:37.931 host:myhost hostip:10.0.0.1 severity:debug facility:syslog syslogtag:rsyslogd-pstats: programname:rsyslogd-pstats logtag:syslog _id:AVTx99hfggzpxkzyDyYw _type:syslog _index:myapp-2016.05.27 _score:
>
> Thanks,
>
> Alec
>
> On Thu, May 26, 2016 at 11:41 PM, Rainer Gerhards <[email protected]> wrote:
>> 2016-05-27 3:23 GMT+02:00 Peter Portante <[email protected]>:
>> > On Thu, May 26, 2016 at 2:11 PM, Alec Swan <[email protected]> wrote:
>> >> David,
>> >>
>> >> After enabling the omfile action to write to a log file before sending
>> >> to elasticsearch (see below), I confirmed that logs stop being written
>> >> to /var/log/rsyslog/output.log and to ES at the same time, even though
>> >> the log file being monitored, /var/log/file-to-monitor.log, keeps
>> >> getting new logs. So the problem seems to be related to the imfile
>> >> input and not to the output plugins or queue settings.
>> >>
>> >> Maybe it's related to how I am monitoring the log file?
>> >>
>> >> module(load = "imfile")
>> >>
>> >> input(
>> >>     type = "imfile"
>> >>     File = "/var/log/file-to-monitor.log"
>> >>     Tag = "myapp"
>> >>     startmsg.regex = "^[[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}"
>> >
>> > Are you sure the startmsg.regex pattern always matches?
>>
>> This might indeed be the problem. To gain more insight, I suggest
>> creating a debug log. That will most probably give us good information
>> about what is going on.
>>
>> Essentially, you can do this interactively by stopping the service and
>> running it via
>>
>> $ rsyslogd -dn ..other opts... &> rsyslog.debug
>>
>> You can also use the service version. Full details are here:
>>
>> http://www.rsyslog.com/doc/v8-stable/troubleshooting/debug.html
>>
>> We would then need to see the *full* debug log (no excerpts, please).
>> You can post it e.g. to pastebin.
>>
>> Rainer
>>
>> >>     Facility = "local2"
>> >>     Ruleset = "myruleset"
>> >> )
>> >>
>> >> ruleset(name = "myruleset") {
>> >>     action(type = "mmnormalize" rulebase = "/etc/rsyslog.d/rules/myrule.rb")
>> >>     action(type = "omfile" template="es-payload"
>> >>            file="/var/log/rsyslog/output.log" FileCreateMode="0644")
>> >>     action(type = "omelasticsearch" queue.xxx)
>> >> }
>> >>
>> >> Thanks,
>> >>
>> >> Alec
>> >>
>> >> On Wed, May 25, 2016 at 11:13 AM, Alec Swan <[email protected]> wrote:
>> >> > David,
>> >> >
>> >> > The rate of delivery is about 1 log per second. If you are referring
>> >> > to the queue.dequeuebatchsize="1000" batch size, then I would expect
>> >> > the logs to be batched for 15-20 minutes. However, I am observing
>> >> > delays of multiple hours.
>> >> >
>> >> > When I restart rsyslog, all buffered logs get sent to Elasticsearch.
>> >> > I had logging to a log file enabled before and could see all logs
>> >> > being written to log files correctly. I enabled it again and will
>> >> > keep an eye on it, but I am sure the problem is in pushing to ES.
>> >> >
>> >> > I currently have a host which hasn't sent logs for about 12 hours.
>> >> > The following are the last logs I received from that node. Anything
>> >> > I can do to troubleshoot while the host is in a bad state?
>> >> >
>> >> > May 25th 2016, 05:28:05.880 @timestamp:May 25th 2016, 05:28:05.880 host:myhost hostip:10.0.0.1 severity:debug facility:syslog syslogtag:rsyslogd-pstats: programname:rsyslogd-pstats logtag:syslog message:omelasticsearch-myapp-launcher.log: origin=core.action processed=333 failed=0 suspended=0 suspended.duration=0 resumed=0 _id:AVTnrAgpFO4BDB55DTh2 _type:syslog _index:logstash-syslog-myapp-2016.05.25 _score:
>> >> > May 25th 2016, 05:28:05.880 @timestamp:May 25th 2016, 05:28:05.880 host:myhost hostip:10.0.0.1 severity:debug facility:syslog syslogtag:rsyslogd-pstats: programname:rsyslogd-pstats logtag:syslog message:action 4: origin=core.action processed=336 failed=0 suspended=0 suspended.duration=0 resumed=0 _id:AVTnrAgpFO4BDB55DTh3 _type:syslog _index:logstash-syslog-myapp-2016.05.25 _score:
>> >> > May 25th 2016, 05:28:05.880 @timestamp:May 25th 2016, 05:28:05.880 host:myhost hostip:10.0.0.1 severity:debug facility:syslog syslogtag:rsyslogd-pstats: programname:rsyslogd-pstats logtag:syslog message:omelasticsearch-syslog queue[DA]: origin=core.queue size=0 enqueued=0 full=0 discarded.full=0 discarded.nf=0 maxqsize=0 _id:AVTnrAlsFO4BDB55DTiD _type:syslog _index:logstash-syslog-myapp-2016.05.25 _score:
>> >> > May 25th 2016, 05:28:05.880 @timestamp:May 25th 2016, 05:28:05.880 host:myhost hostip:10.0.0.1 severity:debug facility:syslog syslogtag:rsyslogd-pstats: programname:rsyslogd-pstats logtag:syslog message:omelasticsearch-myapp-launcher.log queue[DA]: origin=core.queue size=0 enqueued=0 full=0 discarded.full=0 discarded.nf=0 maxqsize=0 _id:AVTnrAlsFO4BDB55DTiF _type:syslog _index:logstash-syslog-myapp-2016.05.25 _score:
>> >> >
>> >> > Thanks,
>> >> >
>> >> > Alec
>> >> >
>> >> > On Tue, May 24, 2016 at 10:35 PM, David Lang <[email protected]> wrote:
>> >> >> It shouldn't matter, but what is the rate of log delivery? Is there
>> >> >> any chance that it is waiting to deliver a full batch?
>> >> >>
>> >> >> I would consider putting this in a ruleset and moving the queue to
>> >> >> the ruleset. I would then have the ruleset contain two items:
>> >> >>
>> >> >> 1. the output to ES
>> >> >> 2. a write to a debug log locally (not necessarily the full
>> >> >>    messages, a timestamp would be enough)
>> >> >>
>> >> >> You can then see if the local file is growing while things are not
>> >> >> yet showing up in ES, to tell whether the issue is on the sending
>> >> >> side or on the receiving side.
>> >> >>
>> >> >> David Lang
>> >> >>
>> >> >> On Tue, 24 May 2016, Alec Swan wrote:
>> >> >>
>> >> >>> Date: Tue, 24 May 2016 22:17:22 -0600
>> >> >>> From: Alec Swan <[email protected]>
>> >> >>> Reply-To: rsyslog-users <[email protected]>
>> >> >>> To: rsyslog-users <[email protected]>
>> >> >>> Subject: [rsyslog] Logs are delayed being pushed to Elasticsearch
>> >> >>>
>> >> >>> Hello,
>> >> >>>
>> >> >>> I recently upgraded to rsyslog 8.18 and was happy to see that
>> >> >>> on-disk queues no longer cause rsyslog to get into a bad state.
>> >> >>> However, now I see very long delays (several hours) in logs being
>> >> >>> pushed to Elasticsearch. It seems that the logs are somehow being
>> >> >>> buffered on the client for several hours, because eventually they
>> >> >>> do show up in Elasticsearch. I don't see any errors in
>> >> >>> /var/log/rsyslog/ES-error.log (see config below) or
>> >> >>> /var/log/messages.
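For reference, I enabled the impstats counters quoted in this thread with a module load along these lines (the interval shown is illustrative):

```
module(load="impstats"
       interval="30"      # emit counters every 30 seconds
       severity="7"       # log counters at debug severity
       log.syslog="on")   # inject counters as syslog messages
```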
>> >>>
>> >>> I enabled impstats but didn't see any errors related to
>> >>> omelasticsearch. What else can I do to troubleshoot this?
>> >>>
>> >>> Here is my omelasticsearch config:
>> >>>
>> >>> action(
>> >>>     type = "omelasticsearch"
>> >>>     template = "es-payload"
>> >>>     dynSearchIndex = "on"
>> >>>     searchIndex = "logstash-index"
>> >>>     searchType = "syslog"
>> >>>     server = "127.0.0.1"
>> >>>     serverport = "9200"
>> >>>     uid = "xxx"
>> >>>     pwd = "yyy"
>> >>>     errorFile = "/var/log/rsyslog/ES-error.log"
>> >>>     bulkmode = "on"
>> >>>     action.resumeretrycount="-1"  # retry if ES is unreachable (-1 for infinite retries)
>> >>>     action.resumeInterval="60"
>> >>>     queue.dequeuebatchsize="1000" # ES bulk size
>> >>>     queue.type="linkedlist"
>> >>>     queue.size="100000"
>> >>>     queue.workerthreads="5"
>> >>>     queue.timeoutworkerthreadshutdown="2000"
>> >>>     queue.spoolDirectory="/var/spool/rsyslog"
>> >>>     queue.filename="omelasticsearch-queue"
>> >>>     queue.maxfilesize="100m"
>> >>>     queue.maxdiskspace="1g"
>> >>>     queue.highwatermark="80000"   # when to start spilling to disk
>> >>>     queue.lowwatermark="20000"    # when to stop spilling to disk
>> >>>     queue.saveonshutdown="on"
>> >>> )
>> >>>
>> >>> Thanks,
>> >>>
>> >>> Alec
>> >>> _______________________________________________
>> >>> rsyslog mailing list
>> >>> http://lists.adiscon.net/mailman/listinfo/rsyslog
>> >>> http://www.rsyslog.com/professional-services/
>> >>> What's up with rsyslog? Follow https://twitter.com/rgerhards
>> >>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a
>> >>> myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT
>> >>> POST if you DON'T LIKE THAT.
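P.S. Regarding Peter's question about whether startmsg.regex always matches: the pattern can be checked offline with a quick script (the sample lines are illustrative, and the POSIX class [[:digit:]] is translated to \d because Python's re module does not support POSIX classes):

```python
import re

# Python translation of the POSIX ERE used for startmsg.regex:
#   ^[[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}
startmsg = re.compile(r"^\d{4}-\d{2}-\d{2}")

# Illustrative sample lines; only the first one starts a new message.
samples = [
    "2016-05-27 15:23:01 INFO starting up",         # matches -> new message
    "    at com.example.Foo.bar(Foo.java:42)",      # continuation line
    "May 27 15:23:01 myhost myapp: legacy format",  # also a continuation!
]

for line in samples:
    print("NEW " if startmsg.match(line) else "CONT", line)
```

If lines like the last one ever appear at the tail of the monitored file, imfile may hold the partial multi-line message until the next matching line arrives, which can look exactly like delayed delivery.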

