2016-05-28 4:36 GMT+02:00 Alec Swan <[email protected]>: > I ran rsyslog in debug mode and now I am in the same state where a copy of > the logs is still being written to an omfile but nothing gets reported to > elasticsearch. Rsyslog generated 728MB of debug logs. What would you like > me to do with it? Should I upload it to pastebin or search for something? > The debug logs are of the following format: > > 1673.866510445:omelasticsearch-event-collector-kafka-rest-http-request.log > que: actionDoRetry: > omelasticsearch-event-collector-kafka-rest-http-request.log enter loop, > iRetries=294 > 1673.866533581:omelasticsearch-event-collector-kafka-rest-http-request.log > que: omelasticsearch: tryResume called > 1673.949301652:omelasticsearch-event-collector-kafka-rest-http-request.log > que: omelasticsearch: checkConn() curl_easy_perform() failed: HTTP response > code said error
That's probably sufficient :-) I found a regression that we have had for a couple of versions where data-induced errors cause suspension instead of writing the errors to the error file. This is patched now. I suggest trying the git master branch. It is pretty safe to do so now; it will turn into 8.19.0 next Tuesday.

Rainer

> 1673.949335309:omelasticsearch-event-collector-kafka-rest-http-request.log > que: actionDoRetry: > omelasticsearch-event-collector-kafka-rest-http-request.log > action->tryResume returned -2007 > 1673.949342989:omelasticsearch-event-collector-kafka-rest-http-request.log > que: actionDoRetry: > omelasticsearch-event-collector-kafka-rest-http-request.log check for max > retries, iResumeRetryCount -1, iRetries 294 > > Thanks, > > Alec > > On Fri, May 27, 2016 at 9:48 AM, Alec Swan <[email protected]> wrote: > >> I finally conclusively confirmed that all the buffered logs are sitting in >> the omelasticsearch-queue.00000001 file. I grabbed the last log message >> reported to ES, then found the next one that should have been reported and >> found it at the top of omelasticsearch-queue.00000001. >> So, it's either that my queues are configured incorrectly (see my original >> post) or not all queue related bugs were fixed in 8.18. >> >> I restarted rsyslog in debug mode. Is there anything else I can do to help >> troubleshoot this? >> >> > ls -lat /var/spool/rsyslog/ >> total 17944 >> drwxr-xr-x 2 root root 4096 May 27 15:43 . >> -rw------- 1 root root 561 May 27 15:23 omelasticsearch-queue.qi >> -rw------- 1 root root 17170428 May 27 15:23 >> omelasticsearch-queue.00000001 >> >> > rpm -qa | grep rsyslog >> rsyslog-elasticsearch-8.18.0-1.el6.x86_64 >> rsyslog-8.18.0-1.el6.x86_64 >> rsyslog-mmnormalize-8.18.0-1.el6.x86_64 >> >> >> >> Thanks, >> >> Alec >> >> On Fri, May 27, 2016 at 9:15 AM, Alec Swan <[email protected]> wrote: >> >>> I just found out that the reason why rsyslog stopped writing to ES but >>> continued writing the copy of the log to a separate log file was because >>> puppet ran and overwrote my manual change to write to the separate log file >>> :( >>> So, I am currently in the state where rsyslog is still writing messages >>> to the log file, but not to elasticsearch. Since startmsg.regex applies to >>> both inputs it's definitely not the culprit. >>> >>> Before I restarted rsyslog in debug mode I decided to grab impstats to >>> see omelasticsearch queue state. The last impstats logs were sent to >>> elasticsearch 4 hours ago and are shown below. >>> I see failed.http=5 failed.httprequests=3 failed.es=0 not long before >>> logs stop being sent to ES. Could this have caused the logs to start >>> buffering?
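For reference, the impstats records quoted below come from the impstats module; the exact settings used on this host are not shown in the thread, so the following is only a minimal sketch with assumed values. The 30-second interval is inferred from the spacing of the record timestamps below, and severity 7 matches the severity:debug field in the records.

    module(load="impstats"
           interval="30"        # assumed: emit counters every 30 seconds (matches the record spacing below)
           severity="7"         # debug severity, as shown in the quoted records
           log.syslog="on")     # emit counters as regular syslog messages so the normal rulesets (and ES) see them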
>>> >>> May 27th 2016, 05:27:07.960 message:omelasticsearch-myapp-launcher.log: >>> origin=core.action processed=3979 failed=0 suspended=0 suspended.duration=0 >>> resumed=0 @timestamp:May 27th 2016, 05:27:07.960 host:myhost >>> hostip:10.0.0.1 severity:debug facility:syslog syslogtag:rsyslogd-pstats: >>> programname:rsyslogd-pstats logtag:syslog _id:AVTx99mVggzpxkzyDyY9 >>> _type:syslog _index:myapp-2016.05.27 _score: >>> May 27th 2016, 05:27:07.960 message:omelasticsearch-myapp-launcher.log >>> queue[DA]: origin=core.queue size=0 enqueued=0 full=0 discarded.full=0 >>> discarded.nf=0 maxqsize=0 @timestamp:May 27th 2016, 05:27:07.960 >>> host:myhost hostip:10.0.0.1 severity:debug facility:syslog >>> syslogtag:rsyslogd-pstats: programname:rsyslogd-pstats logtag:syslog >>> _id:AVTx99mVggzpxkzyDyZL _type:syslog _index:myapp-2016.05.27 _score: >>> May 27th 2016, 05:27:07.960 message:omelasticsearch-myapp-http-request.log >>> que[DA]: origin=core.queue size=0 enqueued=0 full=0 discarded.full=0 >>> discarded.nf=0 maxqsize=0 @timestamp:May 27th 2016, 05:27:07.960 >>> host:myhost hostip:10.0.0.1 severity:debug facility:syslog >>> syslogtag:rsyslogd-pstats: programname:rsyslogd-pstats logtag:syslog >>> _id:AVTx99mVggzpxkzyDyZN _type:syslog _index:myapp-2016.05.27 _score: >>> May 27th 2016, 05:27:07.960 message:omelasticsearch-myapp-http-request.log >>> que: origin=core.queue size=995 enqueued=37639 full=0 discarded.full=0 >>> discarded.nf=0 maxqsize=20002 @timestamp:May 27th 2016, 05:27:07.960 >>> host:myhost hostip:10.0.0.1 severity:debug facility:syslog >>> syslogtag:rsyslogd-pstats: programname:rsyslogd-pstats logtag:syslog >>> _id:AVTx99mVggzpxkzyDyZO _type:syslog _index:myapp-2016.05.27 _score: >>> May 27th 2016, 05:27:07.960 message:omelasticsearch: >>> origin=omelasticsearch submitted=54608 failed.http=5 failed.httprequests=3 >>> failed.es=0 @timestamp:May 27th 2016, 05:27:07.960 host:myhost >>> hostip:10.0.0.1 severity:debug facility:syslog syslogtag:rsyslogd-pstats: >>> programname:rsyslogd-pstats logtag:syslog _id:AVTx99hfggzpxkzyDyY3 >>> _type:syslog _index:myapp-2016.05.27 _score: >>> May 27th 2016, 05:27:07.960 message:omelasticsearch-myapp-http-request.log: >>> origin=core.action processed=37639 failed=0 suspended=0 >>> suspended.duration=0 resumed=0 @timestamp:May 27th 2016, 05:27:07.960 >>> host:myhost hostip:10.0.0.1 severity:debug facility:syslog >>> syslogtag:rsyslogd-pstats: programname:rsyslogd-pstats logtag:syslog >>> _id:AVTx99mVggzpxkzyDyZA _type:syslog _index:myapp-2016.05.27 _score: >>> May 27th 2016, 05:27:07.960 message:omelasticsearch-syslog queue[DA]: >>> origin=core.queue size=0 enqueued=0 full=0 discarded.full=0 discarded.nf=0 >>> maxqsize=0 @timestamp:May 27th 2016, 05:27:07.960 host:myhost >>> hostip:10.0.0.1 severity:debug facility:syslog syslogtag:rsyslogd-pstats: >>> programname:rsyslogd-pstats logtag:syslog _id:AVTx99mVggzpxkzyDyZJ >>> _type:syslog _index:myapp-2016.05.27 _score: >>> May 27th 2016, 05:27:07.960 message:omelasticsearch-syslog: >>> origin=core.action processed=35488 failed=0 suspended=0 >>> suspended.duration=0 resumed=0 @timestamp:May 27th 2016, 05:27:07.960 >>> host:myhost hostip:10.0.0.1 severity:debug facility:syslog >>> syslogtag:rsyslogd-pstats: programname:rsyslogd-pstats logtag:syslog >>> _id:AVTx99mVggzpxkzyDyY5 _type:syslog _index:myapp-2016.05.27 _score: >>> May 27th 2016, 05:27:07.960 message:omelasticsearch-syslog queue: >>> origin=core.queue size=20024 enqueued=35491 full=0 discarded.full=0 >>> discarded.nf=0 maxqsize=20024 
@timestamp:May 27th 2016, 05:27:07.960 >>> host:myhost hostip:10.0.0.1 severity:debug facility:syslog >>> syslogtag:rsyslogd-pstats: programname:rsyslogd-pstats logtag:syslog >>> _id:AVTx99mVggzpxkzyDyZK _type:syslog _index:myapp-2016.05.27 _score: >>> May 27th 2016, 05:27:07.960 message:omelasticsearch-myapp-launcher.log >>> queue: origin=core.queue size=1482 enqueued=3979 full=0 discarded.full=0 >>> discarded.nf=0 maxqsize=1482 @timestamp:May 27th 2016, 05:27:07.960 >>> host:myhost hostip:10.0.0.1 severity:debug facility:syslog >>> syslogtag:rsyslogd-pstats: programname:rsyslogd-pstats logtag:syslog >>> _id:AVTx99mVggzpxkzyDyZM _type:syslog _index:myapp-2016.05.27 _score: >>> May 27th 2016, 05:26:37.931 message:omelasticsearch-syslog queue[DA]: >>> origin=core.queue size=0 enqueued=0 full=0 discarded.full=0 discarded.nf=0 >>> maxqsize=0 @timestamp:May 27th 2016, 05:26:37.931 host:myhost >>> hostip:10.0.0.1 severity:debug facility:syslog syslogtag:rsyslogd-pstats: >>> programname:rsyslogd-pstats logtag:syslog _id:AVTx99hfggzpxkzyDyYu >>> _type:syslog _index:myapp-2016.05.27 _score: >>> May 27th 2016, 05:26:37.931 message:omelasticsearch-myapp-launcher.log >>> queue[DA]: origin=core.queue size=0 enqueued=0 full=0 discarded.full=0 >>> discarded.nf=0 maxqsize=0 @timestamp:May 27th 2016, 05:26:37.931 >>> host:myhost hostip:10.0.0.1 severity:debug facility:syslog >>> syslogtag:rsyslogd-pstats: programname:rsyslogd-pstats logtag:syslog >>> _id:AVTx99hfggzpxkzyDyYw >>> _type:syslog _index:myapp-2016.05.27 _score: >>> >>> >>> >>> Thanks, >>> >>> Alec >>> >>> On Thu, May 26, 2016 at 11:41 PM, Rainer Gerhards < >>> [email protected]> wrote: >>> >>>> 2016-05-27 3:23 GMT+02:00 Peter Portante <[email protected]>: >>>> > On Thu, May 26, 2016 at 2:11 PM, Alec Swan <[email protected]> wrote: >>>> > >>>> >> David, >>>> >> >>>> >> After enabling the omfile action to write to a log file before >>>> sending to >>>> >> elasticsearch (see below) I confirmed that logs stop being written to >>>> >> /var/log/rsyslog/output.log and to ES at the same time even though >>>> the log >>>> >> file being monitored /var/log/file-to-monitor.log keeps getting new >>>> logs. >>>> >> So, the problem seems to be related to the imfile input and not to output >>>> >> plugins or queue settings. >>>> >> >>>> >> Maybe it's related to how I am monitoring the log file? >>>> >> >>>> >> >>>> >> *module(load = "imfile")* >>>> >> >>>> >> *input(* >>>> >> * type = "imfile"* >>>> >> * File = "/var/log/file-to-monitor.log"* >>>> >> * Tag = "myapp"* >>>> >> >>>> >> * startmsg.regex = >>>> "^[[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}"* >>>> >> >>>> > >>>> > Are you sure the startmsg.regex pattern always matches? >>>> >>>> This might indeed be the problem. To gain more insight, I suggest >>>> creating a debug log. That will most probably get us good information to >>>> know what is going on. >>>> >>>> Essentially, you can do this interactively by stopping the service and >>>> running it via >>>> >>>> $ rsyslogd -dn ..other opts... &> rsyslog.debug >>>> >>>> You can also use the service version. Full details are here: >>>> >>>> http://www.rsyslog.com/doc/v8-stable/troubleshooting/debug.html >>>> >>>> We would then need to see the *full* debug log (no excerpts, please). >>>> You can post it e.g. to pastebin.
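One quick way to test Peter's question outside of rsyslog is to run the same POSIX character-class pattern against the monitored file and compare the count of message-start lines with the total line count. This is only a sketch; it assumes the file path from the quoted config and that grep's extended regex handles the pattern the same way rsyslog does.

    $ grep -Ec '^[[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}' /var/log/file-to-monitor.log   # lines that start a new message
    $ wc -l /var/log/file-to-monitor.log                                                      # total lines in the file
    $ tail -n 5 /var/log/file-to-monitor.log                                                  # do the newest entries start with a date?

If the newest entries do not match the pattern then, if I read the multi-line handling correctly, imfile treats them as continuation lines and will not submit that last message until the next matching start line arrives, which can look like a delay for the tail of the file.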
>>>> >>>> Rainer >>>> >>>> > >>>> > >>>> >> * Facility = "local2"* >>>> >> * Ruleset = "myruleset"* >>>> >> * )* >>>> >> >>>> >> >>>> >> *ruleset(name = "myruleset") {* >>>> >> * action(type = "mmnormalize" rulebase = >>>> >> "/etc/rsyslog.d/rules/myrule.rb")* >>>> >> * action(type = "omfile" template="es-payload" >>>> >> file="/var/log/rsyslog/output.log" FileCreateMode="0644")* >>>> >> * action(type = "omelasticsearch" queue.xxx)* >>>> >> *}* >>>> >> >>>> >> Thanks, >>>> >> >>>> >> Alec >>>> >> >>>> >> On Wed, May 25, 2016 at 11:13 AM, Alec Swan <[email protected]> >>>> wrote: >>>> >> >>>> >> > David, >>>> >> > >>>> >> > The rate of delivery is about 1 log per second. If you are >>>> referring to >>>> >> queue.dequeuebatchsize="1000" >>>> >> > batch size, then I would expect the logs to be batched for 15-20 >>>> minutes. >>>> >> > However, I am observing delays of multiple hours. >>>> >> > >>>> >> > When I restart rsyslog all buffered logs get sent to elasticsearch. >>>> I had >>>> >> > logging to log file enabled before and could see all logs being >>>> written >>>> >> to >>>> >> > log files correctly. I enabled it again and will keep an eye on it, >>>> but I >>>> >> > am sure the problem is pushing to ES. >>>> >> > >>>> >> > I currently have a host which hasn't sent logs for about 12 hours. >>>> The >>>> >> > following are the last logs I received from that node. Anything I >>>> can do >>>> >> to >>>> >> > troubleshoot while the host is in a bad state? >>>> >> > >>>> >> > May 25th 2016, 05:28:05.880 @timestamp:May 25th 2016, 05:28:05.880 >>>> >> > host:myhost hostip:10.0.0.1 severity:debug facility:syslog >>>> >> > syslogtag:rsyslogd-pstats: programname:rsyslogd-pstats logtag:syslog >>>> >> > message:omelasticsearch-myapp-launcher.log: origin=core.action >>>> >> > processed=333 failed=0 suspended=0 suspended.duration=0 resumed=0 >>>> >> > _id:AVTnrAgpFO4BDB55DTh2 _type:syslog >>>> >> > _index:logstash-syslog-myapp-2016.05.25 _score: >>>> >> > May 25th 2016, 05:28:05.880 @timestamp:May 25th 2016, 05:28:05.880 >>>> >> > host:myhost hostip:10.0.0.1 severity:debug facility:syslog >>>> >> > syslogtag:rsyslogd-pstats: programname:rsyslogd-pstats logtag:syslog >>>> >> > message:action 4: origin=core.action processed=336 failed=0 >>>> suspended=0 >>>> >> > suspended.duration=0 resumed=0 _id:AVTnrAgpFO4BDB55DTh3 _type:syslog >>>> >> > _index:logstash-syslog-myapp-2016.05.25 _score: >>>> >> > May 25th 2016, 05:28:05.880 @timestamp:May 25th 2016, 05:28:05.880 >>>> >> > host:myhost hostip:10.0.0.1 severity:debug facility:syslog >>>> >> > syslogtag:rsyslogd-pstats: programname:rsyslogd-pstats logtag:syslog >>>> >> > message:omelasticsearch-syslog queue[DA]: origin=core.queue size=0 >>>> >> > enqueued=0 full=0 discarded.full=0 discarded.nf=0 maxqsize=0 >>>> >> > _id:AVTnrAlsFO4BDB55DTiD _type:syslog >>>> >> > _index:logstash-syslog-myapp-2016.05.25 _score: >>>> >> > May 25th 2016, 05:28:05.880 @timestamp:May 25th 2016, 05:28:05.880 >>>> >> > host:myhost hostip:10.0.0.1 severity:debug facility:syslog >>>> >> > syslogtag:rsyslogd-pstats: programname:rsyslogd-pstats logtag:syslog >>>> >> > message:omelasticsearch-myapp-launcher.log queue[DA]: >>>> origin=core.queue >>>> >> > size=0 enqueued=0 full=0 discarded.full=0 discarded.nf=0 maxqsize=0 >>>> >> > _id:AVTnrAlsFO4BDB55DTiF _type:syslog >>>> >> > _index:logstash-syslog-myapp-2016.05.25 _score: >>>> >> > >>>> >> > Thanks, >>>> >> > >>>> >> > Alec >>>> >> > >>>> >> > On Tue, May 24, 2016 at 10:35 PM, David Lang <[email protected]> wrote: >>>> >> > >>>> 
>> >> It shouldn't matter, but what is the rate of log delivery? Is >>>> there any >>>> >> >> chance that it is waiting to deliver a full batch? >>>> >> >> >>>> >> >> I would consider putting this in a ruleset and moving the queue to >>>> the >>>> >> >> ruleset. I would then have the ruleset contain two items >>>> >> >> >>>> >> >> 1. the output to ES >>>> >> >> 2. a write to a debug log locally (not necessarily the full >>>> messages, >>>> >> >> timestamp would be enough) >>>> >> >> >>>> >> >> You can then see if the local file is growing while things are not >>>> yet >>>> >> >> showing up in ES to see if the issue is on the sending side or on >>>> the >>>> >> >> receiving side. >>>> >> >> >>>> >> >> David Lang >>>> >> >> >>>> >> >> On Tue, 24 May 2016, Alec Swan wrote: >>>> >> >> >>>> >> >> Date: Tue, 24 May 2016 22:17:22 -0600 >>>> >> >>> From: Alec Swan <[email protected]> >>>> >> >>> Reply-To: rsyslog-users <[email protected]> >>>> >> >>> To: rsyslog-users <[email protected]> >>>> >> >>> Subject: [rsyslog] Logs are delayed being pushed to Elasticsearch >>>> >> >>> >>>> >> >>> >>>> >> >>> Hello, >>>> >> >>> >>>> >> >>> I recently upgraded to rsyslog 8.18 and was happy to see that on >>>> disk >>>> >> >>> queues no longer cause rsyslog to get in a bad state. However, >>>> now I >>>> >> see >>>> >> >>> very long delays (several hours) of logs being pushed to >>>> Elasticsearch. >>>> >> >>> It >>>> >> >>> seems that somehow the logs are being buffered on the client for >>>> >> several >>>> >> >>> hours because eventually they do show up in Elasticsearch. I >>>> don't see >>>> >> >>> any >>>> >> >>> errors in /var/log/rsyslog/ES-error.log (see config below) or >>>> >> >>> /var/log/messages. >>>> >> >>> >>>> >> >>> I enabled impstats but didn't see any errors related to >>>> >> omelasticsearch. >>>> >> >>> What else can I do to troubleshoot this? >>>> >> >>> >>>> >> >>> Here is my omelasticsearch config: >>>> >> >>> >>>> >> >>> action( >>>> >> >>> type = "omelasticsearch" >>>> >> >>> template = "es-payload" >>>> >> >>> dynSearchIndex = "on" >>>> >> >>> searchIndex = "logstash-index" >>>> >> >>> searchType = "syslog" >>>> >> >>> server = "127.0.0.1" >>>> >> >>> serverport = "9200" >>>> >> >>> uid = "xxx" >>>> >> >>> pwd = "yyy" >>>> >> >>> errorFile = "/var/log/rsyslog/ES-error.log" >>>> >> >>> bulkmode = "on" >>>> >> >>> action.resumeretrycount="-1" # retry if ES is >>>> unreachable (-1 >>>> >> >>> for >>>> >> >>> infinite retries) >>>> >> >>> action.resumeInterval="60" >>>> >> >>> queue.dequeuebatchsize="1000" # ES bulk size >>>> >> >>> queue.type="linkedlist" >>>> >> >>> queue.size="100000" >>>> >> >>> queue.workerthreads="5" >>>> >> >>> queue.timeoutworkerthreadshutdown="2000" >>>> >> >>> queue.spoolDirectory="/var/spool/rsyslog" >>>> >> >>> queue.filename="omelasticsearch-queue" >>>> >> >>> queue.maxfilesize="100m" >>>> >> >>> queue.maxdiskspace="1g" >>>> >> >>> queue.highwatermark="80000" # when to start spilling to >>>> disk >>>> >> >>> queue.lowwatermark="20000" # when to stop spilling to >>>> disk >>>> >> >>> queue.saveonshutdown="on" >>>> >> >>> ) >>>> >> >>> >>>> >> >>> Thanks, >>>> >> >>> >>>> >> >>> Alec
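David Lang's suggestion above (move the queue to a ruleset and have that ruleset write both to Elasticsearch and to a local timestamp-only debug file) could look roughly like the sketch below. The template name, ruleset name and debug-file path are illustrative assumptions, not configuration from this thread; the omelasticsearch and queue parameters are abbreviated to the ones already posted above.

    template(name="ts-only" type="string" string="%timereported:::date-rfc3339%\n")

    ruleset(name="to-elasticsearch"
            queue.type="linkedlist"
            queue.filename="omelasticsearch-queue"
            queue.spoolDirectory="/var/spool/rsyslog"
            queue.dequeuebatchsize="1000"
            queue.saveonshutdown="on") {

        # 1. the output to ES (remaining parameters as in the action quoted above)
        action(type="omelasticsearch" template="es-payload" server="127.0.0.1"
               serverport="9200" bulkmode="on" errorFile="/var/log/rsyslog/ES-error.log")

        # 2. a local debug trail: one timestamp per message that passes through the ruleset
        action(type="omfile" file="/var/log/rsyslog/es-debug.log" template="ts-only")
    }

    # inputs are then pointed at the ruleset, e.g. input(type="imfile" ... Ruleset="to-elasticsearch")

This mirrors David's decision logic: if the local debug file keeps growing while entries are not yet showing up in ES, the issue is on the sending or receiving side of the ES output; if both stop at the same time, the issue is upstream of the actions.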
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.

