I agree that's the core of the issue.
Can we see more of the config so that we can reproduce it?
I want to be sure that some complaint isn't already being made but getting lost
because the init settings throw away stdout/stderr or something like that.
I also want to see what impstats shows.
This debug line says that it's calling LogMsg with the message:
"fatal error on disk queue 'main_json_token queue[DA]', emergency switch to
direct mode"
What happens to this message? If it's being logged and sent to stderr, I don't
know how much louder we can complain.
David Lang
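One way to check whether the complaint is being made but lost is to scan whatever output does get captured (a log file, the journal, redirected stderr) for the message. The sketch below is only an illustration: the pattern is modelled on the message text quoted in this thread, other rsyslog versions may word it differently, and the function name find_disk_queue_failures is hypothetical.

```python
import re

# Pattern modelled on the message quoted in this thread (assumption:
# other rsyslog versions may phrase it differently).
FATAL_DISK_QUEUE = re.compile(
    r"fatal error on disk queue '(?P<queue>[^']+)', "
    r"emergency switch to direct mode"
)


def find_disk_queue_failures(lines):
    """Return the queue names mentioned in fatal disk-queue messages."""
    found = []
    for line in lines:
        match = FATAL_DISK_QUEUE.search(line)
        if match:
            found.append(match.group("queue"))
    return found


# Sample input: the rsyslogd line from the debug output in this thread,
# plus one unrelated line that should be ignored.
sample = [
    "rsyslogd: fatal error on disk queue 'main_json_token queue[DA]', "
    "emergency switch to direct mode [v8.12.0 try http://www.rsyslog.com/e/2040 ]",
    "rsyslogd: some unrelated message",
]
print(find_disk_queue_failures(sample))
```

If this finds nothing in the captured output of a normal (non-debug) run, that would suggest the message is being discarded somewhere between rsyslog and the place you are looking.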
On Fri, 21 Aug 2015, Otis Gospodnetić wrote:
Btw, I think the main part that Ciprian wants to point out is:
2631.119110261:main_json_token queue[DA]:Reg/w0: main_json_token queue[DA]: DA queue is in emergency mode, disabling DA in parent
2631.119146744:main_json_token queue[DA]:Reg/w0: Called LogMsg, msg: fatal error on disk queue 'main_json_token queue[DA]', emergency switch to direct mode
But I think the problem Ciprian is trying to point out here is this:
* there is a problem with disk queue
* but rsyslog doesn't report it
* Ciprian saw it *only* because he restarted rsyslog in *debug* mode.
I would think that if there is a problem with the on disk queue rsyslog
would complain about it immediately and loudly, but I guess it doesn't?
Small bug?
Thanks,
Otis
--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/
On Fri, Aug 21, 2015 at 7:27 AM, Ciprian Hacman wrote:
I just started rsyslog in debug mode to see if there is any issue, and found
that we have one. Maybe this can help:
2631.117432280:main_json_token queue[DA]:Reg/w0: wti 0x1348150: worker starting
2631.117464099:main_json_token queue[DA]:Reg/w0: DeleteProcessedBatch: we deleted 0 objects and enqueued 0 objects
2631.117492596:main_json_token queue[DA]:Reg/w0: doDeleteBatch: delete batch from store, new sizes: log 4530, phys 4530
2631.117522634:main_json_token queue[DA]:Reg/w0: strm 0x134b970: file 19 read 0 bytes
2631.117551194:main_json_token queue[DA]:Reg/w0: strm 0x134b970: file 19 EOF
2631.117580484:main_json_token queue[DA]:Reg/w0: strm 0x134b970: file 19(json_and_token_action) closing
2631.117626778:main_json_token queue[DA]:Reg/w0: file '/mnt/rsyslog/queues/json_and_token_action.00000070' opened as #-1 with mode 384
2631.117668522:main_json_token queue[DA]:Reg/w0: strm 0x134b970: open error 2, file '/mnt/rsyslog/queues/json_and_token_action.00000070': No such file or directory
2631.117700793:main_json_token queue[DA]:Reg/w0: objDeserialize error -2040 during header processing - trying to recover
2631.117732844:main_json_token queue[DA]:Reg/w0: file '/mnt/rsyslog/queues/json_and_token_action.00000070' opened as #-1 with mode 384
2631.117764597:main_json_token queue[DA]:Reg/w0: strm 0x134b970: open error 2, file '/mnt/rsyslog/queues/json_and_token_action.00000070': No such file or directory
2631.117794460:main_json_token queue[DA]:Reg/w0: deserializer has possibly been able to re-sync and recover, state -2040
2631.117823674:main_json_token queue[DA]:Reg/w0: main_json_token queue[DA]: error -2040 dequeueing element - ignoring, but strange things may happen
2631.117852905:main_json_token queue[DA]:Reg/w0: main_json_token queue[DA]: got 'file not found' error -2040, queue defunct
2631.117882095:main_json_token queue[DA]:Reg/w0: strm 0x1349450: file 17(json_and_token_action) closing
2631.117911082:main_json_token queue[DA]:Reg/w0: strm 0x1349450: file 17(json_and_token_action) flush, buflen 0 (no need to flush)
2631.117942739:main_json_token queue[DA]:Reg/w0: strm 0x134b970: file -1(json_and_token_action) closing
2631.117974536:main_json_token queue[DA]:Reg/w0: strm 0x134a6e0: file 18(json_and_token_action) closing
2631.118004770:main_json_token queue[DA]:Reg/w0: strmCloseFile: deleting '/mnt/rsyslog/queues/json_and_token_action.00000069'
2631.119110261:main_json_token queue[DA]:Reg/w0: main_json_token queue[DA]: DA queue is in emergency mode, disabling DA in parent
2631.119146744:main_json_token queue[DA]:Reg/w0: Called LogMsg, msg: fatal error on disk queue 'main_json_token queue[DA]', emergency switch to direct mode
2631.119183273:main_json_token queue[DA]:Reg/w0: main Q: qqueueAdd: entry added, size now log 1, phys 9 entries
2631.119212733:main_json_token queue[DA]:Reg/w0: main Q: EnqueueMsg advised worker start
rsyslogd: fatal error on disk queue 'main_json_token queue[DA]', emergency switch to direct mode [v8.12.0 try http://www.rsyslog.com/e/2040 ]
2631.119266265:main_json_token queue[DA]:Reg/w0: regular consumer finished, iret=-2183, szlog 0 sz phys 0
2631.119293213:main_json_token queue[DA]:Reg/w0: DDDD: wti 0x1348150: worker cleanup action instances
The queue dir only contained a meta file, no data files:
<OPB:1:qqueue:1:
+iQueueSize:2:4:4530:
+tVars.disk.sizeOnDisk:2:7:9517544:
>End
.
<Obj:1:strm:1:
+iCurrFNum:2:2:69:
+pszFName:1:21:json_and_token_action:
+iMaxFiles:2:8:10000000:
+bDeleteOnClose:2:1:0:
+sType:2:1:1:
+tOperationsMode:2:1:2:
+tOpenMode:2:3:384:
+iCurrOffs:2:7:9517544:
+inode:2:1:0:
+bPrevWasNL:2:1:0:
>End
.
<Obj:1:strm:1:
+iCurrFNum:2:2:69:
+pszFName:1:21:json_and_token_action:
+iMaxFiles:2:8:10000000:
+bDeleteOnClose:2:1:1:
+sType:2:1:1:
+tOperationsMode:2:1:1:
+tOpenMode:2:3:384:
+iCurrOffs:2:7:9517544:
+inode:2:1:0:
+bPrevWasNL:2:1:0:
>End
.
Ciprian
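To make the "meta file but no data files" situation easy to spot, the meta file can be compared against the directory contents. The sketch below is a rough illustration, not an rsyslog tool. It assumes that property lines have the shape "+name:type:length:value:" (inferred only from the dump above, not from rsyslog documentation) and that data files are named with an 8-digit zero-padded suffix, as in the json_and_token_action.00000069 path in the debug log. The function names parse_qi and expected_data_files are hypothetical.

```python
import os
import re

# Property lines in the dump look like "+name:type:length:value:".
# This reading of the fields is inferred from the dump, not from
# rsyslog documentation (assumption).
PROP = re.compile(r"^\+(?P<name>[^:]+):\d+:\d+:(?P<value>.*):$")


def parse_qi(text):
    """Split a meta-file dump into objects (dicts of property -> value)."""
    objects, current = [], None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("<"):          # object header, e.g. "<Obj:1:strm:1:"
            current = {}
        elif line.startswith(">End"):     # object trailer
            if current is not None:
                objects.append(current)
            current = None
        elif current is not None:
            m = PROP.match(line)
            if m:
                current[m.group("name")] = m.group("value")
    return objects


def expected_data_files(objects, queue_dir):
    """Build the data-file paths that the stream objects refer to."""
    files = set()
    for obj in objects:
        if "iCurrFNum" in obj and "pszFName" in obj:
            # Suffix width of 8 digits is taken from the paths in the debug log.
            name = "%s.%08d" % (obj["pszFName"], int(obj["iCurrFNum"]))
            files.add(os.path.join(queue_dir, name))
    return sorted(files)


# Abbreviated copy of the dump from this message.
SAMPLE = """<OPB:1:qqueue:1:
+iQueueSize:2:4:4530:
+tVars.disk.sizeOnDisk:2:7:9517544:
>End
.
<Obj:1:strm:1:
+iCurrFNum:2:2:69:
+pszFName:1:21:json_and_token_action:
+iCurrOffs:2:7:9517544:
>End
.
"""

objects = parse_qi(SAMPLE)
files = expected_data_files(objects, "/mnt/rsyslog/queues")
missing = [f for f in files if not os.path.exists(f)]
print("referenced:", files)
print("missing:", missing)
```

Run against a real queue directory before starting rsyslog, a non-empty "missing" list would flag the same condition that the debug log reports as error -2040.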
On Fri, Aug 21, 2015 at 12:58 PM, Ciprian Hacman wrote:
> Hi David,
>
> We see Rsyslog starting to use a lot of memory when it cannot send data to
> Elasticsearch.
> We expected to see logs written to disk, but instead found a message on
> startup that looked like this:
>
>> fatal error on disk queue 'main_nojson queue[DA]', emergency switch to
>> direct mode [v8.11.0 try http://www.rsyslog.com/e/2040 ]
>
>
> Permissions were correct on the queue files.
>
> Thanks,
> Ciprian
>
> On Fri, Aug 21, 2015 at 12:34 PM, David Lang wrote:
>
>> On Fri, 21 Aug 2015, Otis Gospodnetić wrote:
>>
>>> Hi,
>>>
>>> Are there known situations where rsyslog disk queues can get corrupt?
>>>
>>> Sorry for such a high-level and open question, but we sometimes see disk
>>> queue corruption and I wanted to see if anyone else sees that in certain
>>> situations (e.g. when rsyslog is under pressure, when it is stopped in a
>>> certain way, when it runs out of memory, or something along these lines)?
>>>
>>
>> Well, if it crashes as it's writing data, or if it's trying to flush the
>> queue to disk on shutdown and gets killed by kill -9 while it's doing so, I
>> would expect problems (some distros send a kill -15, wait a bit, and then do
>> kill -9; if there's too much work to do in writing the memory queue to disk,
>> rsyslog will be caught by this).
>>
>> Other than that, nothing specific to disk queues.
>>
>> what sort of corruption are you seeing?
>>
>> David Lang
>> _______________________________________________
>> rsyslog mailing list
>> http://lists.adiscon.net/mailman/listinfo/rsyslog
>> http://www.rsyslog.com/professional-services/
>> What's up with rsyslog? Follow https://twitter.com/rgerhards
>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
>> of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
>> DON'T LIKE THAT.
>>
>
>
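The shutdown sequence David describes in the quoted message (kill -15, a wait, then kill -9) can be sketched as below. This is a generic illustration of the pattern, not how any particular distro's init scripts are written. The function name stop_gracefully and the grace period are hypothetical, and the child process is a stand-in for rsyslogd. The point is that the grace period has to be long enough for the memory queue to be written to disk before the SIGKILL arrives.

```python
import signal
import subprocess
import sys


def stop_gracefully(proc, grace_seconds):
    """Send SIGTERM, wait up to grace_seconds, then fall back to SIGKILL.

    Returns True if the process exited before the deadline, False if it
    had to be killed (the case where a queue flush would be interrupted).
    """
    proc.send_signal(signal.SIGTERM)
    try:
        proc.wait(timeout=grace_seconds)
        return True
    except subprocess.TimeoutExpired:
        proc.kill()   # SIGKILL: the process gets no chance to finish writing
        proc.wait()
        return False


# Stand-in child process; a real deployment would target rsyslogd instead.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
exited_cleanly = stop_gracefully(child, grace_seconds=5.0)
print("exited before SIGKILL:", exited_cleanly)
```

If a wrapper like this reports False for rsyslog on a busy system, the grace period is probably too short for the amount of queued data.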