2016-05-28 4:36 GMT+02:00 Alec Swan <[email protected]>:
I ran rsyslog in debug mode and now I am in the same state where a copy of the logs is still being written to an omfile but nothing gets reported to elasticsearch. Rsyslog generated 728MB of debug logs. What would you like me to do with them? Should I upload them to pastebin or search for something? The debug logs are of the following format:
1673.866510445:omelasticsearch-event-collector-kafka-rest-http-request.log que: actionDoRetry: omelasticsearch-event-collector-kafka-rest-http-request.log enter loop, iRetries=294
1673.866533581:omelasticsearch-event-collector-kafka-rest-http-request.log que: omelasticsearch: tryResume called
1673.949301652:omelasticsearch-event-collector-kafka-rest-http-request.log que: omelasticsearch: checkConn() curl_easy_perform() failed: HTTP response code said error
That's probably sufficient :-)
I found a regression that we have had for a couple of versions where data-induced errors cause suspensions instead of the errors being written to the error file. This is patched now. I suggest trying the git master branch. It is pretty safe to do so now; it will turn into 8.19.0 next Tuesday.
Rainer
1673.949335309:omelasticsearch-event-collector-kafka-rest-http-request.log que: actionDoRetry: omelasticsearch-event-collector-kafka-rest-http-request.log action->tryResume returned -2007
1673.949342989:omelasticsearch-event-collector-kafka-rest-http-request.log que: actionDoRetry: omelasticsearch-event-collector-kafka-rest-http-request.log check for max retries, iResumeRetryCount -1, iRetries 294
Thanks,
Alec
On Fri, May 27, 2016 at 9:48 AM, Alec Swan <[email protected]> wrote:
I finally conclusively confirmed that all the buffered logs are sitting in the omelasticsearch-queue.00000001 file. I grabbed the last log message reported to ES, then found the next one that should have been reported, and found it at the top of omelasticsearch-queue.00000001.
So, either my queues are configured incorrectly (see my original post) or not all queue-related bugs were fixed in 8.18.
I restarted rsyslog in debug mode. Is there anything else I can do to help troubleshoot this?
ls -lat /var/spool/rsyslog/
total 17944
drwxr-xr-x 2 root root 4096 May 27 15:43 .
-rw------- 1 root root 561 May 27 15:23 omelasticsearch-queue.qi
-rw------- 1 root root 17170428 May 27 15:23 omelasticsearch-queue.00000001
rpm -qa | grep rsyslog
rsyslog-elasticsearch-8.18.0-1.el6.x86_64
rsyslog-8.18.0-1.el6.x86_64
rsyslog-mmnormalize-8.18.0-1.el6.x86_64
Thanks,
Alec
On Fri, May 27, 2016 at 9:15 AM, Alec Swan <[email protected]> wrote:
I just found out that the reason the copy of the log stopped being written to the separate log file was that puppet ran and overwrote my manual change to write to that file :( In other words, rsyslog stopped writing to ES but continued writing the copy of the log.
So, I am currently in the state where rsyslog is still writing messages to the log file, but not to elasticsearch. Since startmsg.regex applies to the input that feeds both outputs, it's definitely not the culprit.
Before I restarted rsyslog in debug mode I decided to grab impstats to see the omelasticsearch queue state. The last impstats logs were sent to elasticsearch 4 hours ago and are shown below.
I see failed.http=5 failed.httprequests=3 failed.es=0 not long before logs stop being sent to ES. Could this have caused the logs to start buffering?
May 27th 2016, 05:27:07.960 message:omelasticsearch-myapp-launcher.log: origin=core.action processed=3979 failed=0 suspended=0 suspended.duration=0 resumed=0 @timestamp:May 27th 2016, 05:27:07.960 host:myhost hostip:10.0.0.1 severity:debug facility:syslog syslogtag:rsyslogd-pstats: programname:rsyslogd-pstats logtag:syslog _id:AVTx99mVggzpxkzyDyY9 _type:syslog _index:myapp-2016.05.27 _score:
May 27th 2016, 05:27:07.960 message:omelasticsearch-myapp-launcher.log queue[DA]: origin=core.queue size=0 enqueued=0 full=0 discarded.full=0 discarded.nf=0 maxqsize=0 @timestamp:May 27th 2016, 05:27:07.960 host:myhost hostip:10.0.0.1 severity:debug facility:syslog syslogtag:rsyslogd-pstats: programname:rsyslogd-pstats logtag:syslog _id:AVTx99mVggzpxkzyDyZL _type:syslog _index:myapp-2016.05.27 _score:
May 27th 2016, 05:27:07.960 message:omelasticsearch-myapp-http-request.log que[DA]: origin=core.queue size=0 enqueued=0 full=0 discarded.full=0 discarded.nf=0 maxqsize=0 @timestamp:May 27th 2016, 05:27:07.960 host:myhost hostip:10.0.0.1 severity:debug facility:syslog syslogtag:rsyslogd-pstats: programname:rsyslogd-pstats logtag:syslog _id:AVTx99mVggzpxkzyDyZN _type:syslog _index:myapp-2016.05.27 _score:
May 27th 2016, 05:27:07.960 message:omelasticsearch-myapp-http-request.log que: origin=core.queue size=995 enqueued=37639 full=0 discarded.full=0 discarded.nf=0 maxqsize=20002 @timestamp:May 27th 2016, 05:27:07.960 host:myhost hostip:10.0.0.1 severity:debug facility:syslog syslogtag:rsyslogd-pstats: programname:rsyslogd-pstats logtag:syslog _id:AVTx99mVggzpxkzyDyZO _type:syslog _index:myapp-2016.05.27 _score:
May 27th 2016, 05:27:07.960 message:omelasticsearch: origin=omelasticsearch submitted=54608 failed.http=5 failed.httprequests=3 failed.es=0 @timestamp:May 27th 2016, 05:27:07.960 host:myhost hostip:10.0.0.1 severity:debug facility:syslog syslogtag:rsyslogd-pstats: programname:rsyslogd-pstats logtag:syslog _id:AVTx99hfggzpxkzyDyY3 _type:syslog _index:myapp-2016.05.27 _score:
May 27th 2016, 05:27:07.960 message:omelasticsearch-myapp-http-request.log: origin=core.action processed=37639 failed=0 suspended=0 suspended.duration=0 resumed=0 @timestamp:May 27th 2016, 05:27:07.960 host:myhost hostip:10.0.0.1 severity:debug facility:syslog syslogtag:rsyslogd-pstats: programname:rsyslogd-pstats logtag:syslog _id:AVTx99mVggzpxkzyDyZA _type:syslog _index:myapp-2016.05.27 _score:
May 27th 2016, 05:27:07.960 message:omelasticsearch-syslog queue[DA]: origin=core.queue size=0 enqueued=0 full=0 discarded.full=0 discarded.nf=0 maxqsize=0 @timestamp:May 27th 2016, 05:27:07.960 host:myhost hostip:10.0.0.1 severity:debug facility:syslog syslogtag:rsyslogd-pstats: programname:rsyslogd-pstats logtag:syslog _id:AVTx99mVggzpxkzyDyZJ _type:syslog _index:myapp-2016.05.27 _score:
May 27th 2016, 05:27:07.960 message:omelasticsearch-syslog: origin=core.action processed=35488 failed=0 suspended=0 suspended.duration=0 resumed=0 @timestamp:May 27th 2016, 05:27:07.960 host:myhost hostip:10.0.0.1 severity:debug facility:syslog syslogtag:rsyslogd-pstats: programname:rsyslogd-pstats logtag:syslog _id:AVTx99mVggzpxkzyDyY5 _type:syslog _index:myapp-2016.05.27 _score:
May 27th 2016, 05:27:07.960 message:omelasticsearch-syslog queue: origin=core.queue size=20024 enqueued=35491 full=0 discarded.full=0 discarded.nf=0 maxqsize=20024 @timestamp:May 27th 2016, 05:27:07.960 host:myhost hostip:10.0.0.1 severity:debug facility:syslog syslogtag:rsyslogd-pstats: programname:rsyslogd-pstats logtag:syslog _id:AVTx99mVggzpxkzyDyZK _type:syslog _index:myapp-2016.05.27 _score:
May 27th 2016, 05:27:07.960 message:omelasticsearch-myapp-launcher.log queue: origin=core.queue size=1482 enqueued=3979 full=0 discarded.full=0 discarded.nf=0 maxqsize=1482 @timestamp:May 27th 2016, 05:27:07.960 host:myhost hostip:10.0.0.1 severity:debug facility:syslog syslogtag:rsyslogd-pstats: programname:rsyslogd-pstats logtag:syslog _id:AVTx99mVggzpxkzyDyZM _type:syslog _index:myapp-2016.05.27 _score:
May 27th 2016, 05:26:37.931 message:omelasticsearch-syslog queue[DA]: origin=core.queue size=0 enqueued=0 full=0 discarded.full=0 discarded.nf=0 maxqsize=0 @timestamp:May 27th 2016, 05:26:37.931 host:myhost hostip:10.0.0.1 severity:debug facility:syslog syslogtag:rsyslogd-pstats: programname:rsyslogd-pstats logtag:syslog _id:AVTx99hfggzpxkzyDyYu _type:syslog _index:myapp-2016.05.27 _score:
May 27th 2016, 05:26:37.931 message:omelasticsearch-myapp-launcher.log queue[DA]: origin=core.queue size=0 enqueued=0 full=0 discarded.full=0 discarded.nf=0 maxqsize=0 @timestamp:May 27th 2016, 05:26:37.931 host:myhost hostip:10.0.0.1 severity:debug facility:syslog syslogtag:rsyslogd-pstats: programname:rsyslogd-pstats logtag:syslog _id:AVTx99hfggzpxkzyDyYw _type:syslog _index:myapp-2016.05.27 _score:
Thanks,
Alec
On Thu, May 26, 2016 at 11:41 PM, Rainer Gerhards <[email protected]> wrote:
2016-05-27 3:23 GMT+02:00 Peter Portante <[email protected]>:
On Thu, May 26, 2016 at 2:11 PM, Alec Swan <[email protected]> wrote:
David,
After enabling the omfile action to write to a log file before sending to elasticsearch (see below) I confirmed that logs stop being written to /var/log/rsyslog/output.log and to ES at the same time, even though the monitored log file /var/log/file-to-monitor.log keeps getting new logs. So, the problem seems to be related to the imfile input and not to the output plugins or queue settings.
Maybe it's related to how I am monitoring the log file?
module(load = "imfile")

input(
    type = "imfile"
    File = "/var/log/file-to-monitor.log"
    Tag = "myapp"
    startmsg.regex = "^[[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}"
Are you sure the startmsg.regex pattern always matches?
This might indeed be the problem. To gain more insight, I suggest creating a debug log. That will most probably give us good information about what is going on.
Essentially, you can do this interactively by stopping the service and running it via
$ rsyslogd -dn ..other opts... &> rsyslog.debug
You can also use the service version. Full details are here:
http://www.rsyslog.com/doc/v8-stable/troubleshooting/debug.html
We would then need to see the *full* debug log (no excerpts, please). You can post it e.g. to pastebin.
Rainer
    Facility = "local2"
    Ruleset = "myruleset"
)

ruleset(name = "myruleset") {
    action(type = "mmnormalize" rulebase = "/etc/rsyslog.d/rules/myrule.rb")
    action(type = "omfile" template="es-payload" file="/var/log/rsyslog/output.log" FileCreateMode="0644")
    action(type = "omelasticsearch" queue.xxx)
}
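To make the startmsg.regex question concrete: with this pattern, only lines beginning with an ISO-style date start a new message, and everything else is folded into the previous one. A minimal sketch, with illustrative log lines that are not taken from the thread, and with an assumption (not confirmed here) about how imfile 8.18 flushes the final multi-line message:

# startmsg.regex = "^[[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}"
#   2016-05-27 15:23:01 INFO request handled      <- matches, starts a new message
#   java.lang.RuntimeException: something failed  <- no match, appended to the previous message
#
# Assumption: the current multi-line message is only emitted once the *next*
# matching start line is read, so a final message with no successor can stay
# buffered and look like a delivery delay.
input(type="imfile"
      File="/var/log/file-to-monitor.log"
      Tag="myapp"
      startmsg.regex="^[[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}"
      Facility="local2"
      Ruleset="myruleset")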
Thanks,
Alec
On Wed, May 25, 2016 at 11:13 AM, Alec Swan <[email protected]> wrote:
David,
The rate of delivery is about 1 log per second. If you are referring to the queue.dequeuebatchsize="1000" batch size, then I would expect the logs to be batched for at most 15-20 minutes (1000 messages at roughly one per second). However, I am observing delays of multiple hours.
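For what it's worth, one quick way to rule batching in or out would be to temporarily shrink the batch size on the ES action. A minimal sketch only; the values are illustrative and all other parameters would stay as in the original config:

# Small batch so that waiting for a full batch cannot plausibly explain hours
# of delay; illustrative values, other parameters as in the original action.
action(type="omelasticsearch"
       template="es-payload"
       server="127.0.0.1"
       serverport="9200"
       bulkmode="on"
       errorFile="/var/log/rsyslog/ES-error.log"
       queue.type="linkedlist"
       queue.dequeuebatchsize="100")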
When I restart rsyslog, all buffered logs get sent to elasticsearch. I had logging to a log file enabled before and could see all logs being written to the log files correctly. I enabled it again and will keep an eye on it, but I am sure the problem is in pushing to ES.
I currently have a host which hasn't sent logs for about 12 hours. The following are the last logs I received from that node. Is there anything I can do to troubleshoot while the host is in a bad state?
May 25th 2016, 05:28:05.880 @timestamp:May 25th 2016, 05:28:05.880 host:myhost hostip:10.0.0.1 severity:debug facility:syslog syslogtag:rsyslogd-pstats: programname:rsyslogd-pstats logtag:syslog message:omelasticsearch-myapp-launcher.log: origin=core.action processed=333 failed=0 suspended=0 suspended.duration=0 resumed=0 _id:AVTnrAgpFO4BDB55DTh2 _type:syslog _index:logstash-syslog-myapp-2016.05.25 _score:
May 25th 2016, 05:28:05.880 @timestamp:May 25th 2016, 05:28:05.880 host:myhost hostip:10.0.0.1 severity:debug facility:syslog syslogtag:rsyslogd-pstats: programname:rsyslogd-pstats logtag:syslog message:action 4: origin=core.action processed=336 failed=0 suspended=0 suspended.duration=0 resumed=0 _id:AVTnrAgpFO4BDB55DTh3 _type:syslog _index:logstash-syslog-myapp-2016.05.25 _score:
May 25th 2016, 05:28:05.880 @timestamp:May 25th 2016, 05:28:05.880 host:myhost hostip:10.0.0.1 severity:debug facility:syslog syslogtag:rsyslogd-pstats: programname:rsyslogd-pstats logtag:syslog message:omelasticsearch-syslog queue[DA]: origin=core.queue size=0 enqueued=0 full=0 discarded.full=0 discarded.nf=0 maxqsize=0 _id:AVTnrAlsFO4BDB55DTiD _type:syslog _index:logstash-syslog-myapp-2016.05.25 _score:
May 25th 2016, 05:28:05.880 @timestamp:May 25th 2016, 05:28:05.880 host:myhost hostip:10.0.0.1 severity:debug facility:syslog syslogtag:rsyslogd-pstats: programname:rsyslogd-pstats logtag:syslog message:omelasticsearch-myapp-launcher.log queue[DA]: origin=core.queue size=0 enqueued=0 full=0 discarded.full=0 discarded.nf=0 maxqsize=0 _id:AVTnrAlsFO4BDB55DTiF _type:syslog _index:logstash-syslog-myapp-2016.05.25 _score:
Thanks,
Alec
On Tue, May 24, 2016 at 10:35 PM, David Lang <[email protected]> wrote:
It shouldn't matter, but what is the rate of log delivery? Is there any chance that it is waiting to deliver a full batch?
I would consider putting this in a ruleset and moving the queue to the ruleset. I would then have the ruleset contain two items (sketched below):
1. the output to ES
2. a write to a debug log locally (not necessarily the full messages, a timestamp would be enough)
You can then see if the local file is growing while things are not yet showing up in ES, to tell whether the issue is on the sending side or on the receiving side.
David Lang
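A rough sketch of the two-item ruleset David describes; the ruleset name, template, file paths and ES connection details here are illustrative and not taken from the thread:

# Template that records only the timestamp of each message.
template(name="ts-only" type="string" string="%timereported:::date-rfc3339%\n")

ruleset(name="es-with-debug"
        queue.type="linkedlist"
        queue.filename="es-ruleset-queue"
        queue.spoolDirectory="/var/spool/rsyslog") {
    # 1. the output to ES
    action(type="omelasticsearch"
           server="127.0.0.1"
           serverport="9200"
           bulkmode="on")
    # 2. a local debug log holding just one timestamp per message, so the
    #    sending side can be compared against what actually arrives in ES
    action(type="omfile" file="/var/log/rsyslog/es-debug.log" template="ts-only")
}

If timestamps keep accumulating in the local file while nothing new reaches ES, the problem is on the sending/output side; if the file stops growing as well, the input side is suspect.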
On Tue, 24 May 2016, Alec Swan wrote:
Date: Tue, 24 May 2016 22:17:22 -0600
From: Alec Swan <[email protected]>
Reply-To: rsyslog-users <[email protected]>
To: rsyslog-users <[email protected]>
Subject: [rsyslog] Logs are delayed being pushed to Elasticsearch
Hello,
I recently upgraded to rsyslog 8.18 and was happy to see that on-disk queues no longer cause rsyslog to get into a bad state. However, now I see very long delays (several hours) in logs being pushed to Elasticsearch. It seems that the logs are somehow being buffered on the client for several hours, because eventually they do show up in Elasticsearch. I don't see any errors in /var/log/rsyslog/ES-error.log (see config below) or in /var/log/messages.
I enabled impstats but didn't see any errors related to omelasticsearch. What else can I do to troubleshoot this?
Here is my omelasticsearch config:
action(
    type = "omelasticsearch"
    template = "es-payload"
    dynSearchIndex = "on"
    searchIndex = "logstash-index"
    searchType = "syslog"
    server = "127.0.0.1"
    serverport = "9200"
    uid = "xxx"
    pwd = "yyy"
    errorFile = "/var/log/rsyslog/ES-error.log"
    bulkmode = "on"
    action.resumeretrycount="-1"             # retry if ES is unreachable (-1 for infinite retries)
    action.resumeInterval="60"
    queue.dequeuebatchsize="1000"            # ES bulk size
    queue.type="linkedlist"
    queue.size="100000"
    queue.workerthreads="5"
    queue.timeoutworkerthreadshutdown="2000"
    queue.spoolDirectory="/var/spool/rsyslog"
    queue.filename="omelasticsearch-queue"
    queue.maxfilesize="100m"
    queue.maxdiskspace="1g"
    queue.highwatermark="80000"              # when to start spilling to disk
    queue.lowwatermark="20000"               # when to stop spilling to disk
    queue.saveonshutdown="on"
)
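For reference, the rsyslogd-pstats records quoted earlier in the thread come from the impstats module. A minimal way to load it might look like this; the interval and severity are illustrative values, not taken from the post:

# Roughly how impstats (the source of the rsyslogd-pstats records shown
# earlier in the thread) can be enabled; values here are illustrative.
module(load="impstats"
       interval="30"
       severity="7"
       log.syslog="on")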
Thanks,
Alec
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
DON'T LIKE THAT.