On Fri, Mar 6, 2015 at 8:53 AM, Andrew Ruch <[email protected]> wrote:
> On Thu, Mar 5, 2015 at 11:45 AM, Andrew Ruch <[email protected]> wrote:
>> On Thu, Mar 5, 2015 at 11:38 AM, David Lang <[email protected]> wrote:
>>> What do you mean when you say it's getting corrupted?
>>>
>>> What is the log message that's being delivered, and what does it look like
>>> after the corruption?
>>>
>>> David Lang
>>>
>>>
>>> On Thu, 5 Mar 2015, Andrew Ruch wrote:
>>>
>>>> Hello,
>>>>
>>>> I'm using Rsyslog 8.8.0 on RHEL 6.6 on the sender and receiver.
>>>>
>>>> I'm experiencing an issue where my disk queue is getting corrupted,
>>>> even just sending one message. The Rsyslog server is configured to
>>>> receive messages over TCP. The message is then put into a disk queue
>>>> before being sent to another local application. When I comment out the
>>>> 4 “queue” lines on the action, everything works fine. The remote logs
>>>> are received and written to “/var/log/remote.log” successfully. Is
>>>> there something else I need to configure for the disk queue to work?
>>>>
>>>>
>>>> Below is a simple configuration file created for debugging this
>>>> problem. Also included are some relevant lines from the Rsyslog debug
>>>> log.
>>>>
>>>>
>>>> ##########################################################################
>>>>
>>>> SERVER CONFIG FILE
>>>>
>>>> ##########################################################################
>>>>
>>>> # rsyslog v8 configuration file (server)
>>>>
>>>> #### GLOBAL DIRECTIVES ####
>>>>
>>>> $DebugLevel 2
>>>> $DebugFile /var/log/rsyslog-debug.log
>>>>
>>>> $umask 0000
>>>>
>>>> # Working directory for spooling and file monitoring
>>>> $WorkDirectory /var/lib/rsyslog
>>>>
>>>> # Set the message size to 32k (up from 2k default)
>>>> $MaxMessageSize 32768
>>>>
>>>> #### MODULES ####
>>>>
>>>> # Configure file output defaults
>>>> module( load="builtin:omfile"
>>>>        template="RSYSLOG_TraditionalFileFormat"
>>>>        fileCreateMode="0600"
>>>>      )
>>>>
>>>> # provides input for tcp connections
>>>> module( load="imtcp"
>>>>        NotifyOnConnectionClose="on"
>>>>      )
>>>>
>>>> #### NETSTREAM DRIVER ####
>>>> $DefaultNetstreamDriver ptcp
>>>>
>>>> #### RULESETS ####
>>>>
>>>> ruleset(name="remote"){
>>>>
>>>> # To file
>>>> *.*    /var/log/remote.log
>>>>
>>>> # To local process
>>>>    action( template="RSYSLOG_SyslogProtocol23Format"
>>>>            name="remotefwd"
>>>>            type="omfwd"
>>>>            action.resumeRetryCount="-1"
>>>>            target="127.0.0.1"
>>>>            port="7871"
>>>>            protocol="tcp"
>>>>            TCP_Framing="octet-counted"
>>>>            queue.filename="remotefwd"
>>>>            queue.size="1000000"
>>>>            queue.type="Disk"
>>>>            queue.saveonshutdown="on"
>>>>            StreamDriver="ptcp"
>>>>          )
>>>>
>>>> }
>>>>
>>>> #### INPUTS ####
>>>> input(type="imtcp" port="10514" ruleset="remote")
>>>>
>>>>
>>>> #### LOCAL RULES ####
>>>>
>>>> *.*    /var/log/messages
>>>>
>>>>
>>>>
>>>>
>>>> ##########################################################################
>>>>
>>>> DEBUG LOG
>>>>
>>>> ##########################################################################
>>>>
>>>> 2608.731281300:main Q:Reg/w0  : processBATCH: next msg 0: <166>1
>>>> 2015-03-05T16:23:28.729746+00:00 localci-hdf dev - - -  test 4d
>>>> 2608.731286050:main Q:Reg/w0  :     ACTION 0
>>>> [builtin:omfile:/var/log/remote.log]
>>>> 2608.731296440:main Q:Reg/w0  : executing action 0
>>>> 2608.731300781:main Q:Reg/w0  : Called action, logging to builtin:omfile
>>>> 2608.731321713:main Q:Reg/w0  : action 0 is transactional - executing
>>>> in commit phase
>>>> 2608.731327093:main Q:Reg/w0  : wti 0x7f697aa7e200: we need to create
>>>> a new action worker instance for action 0
>>>> 2608.731331936:main Q:Reg/w0  : Action 0 transitioned to state: itx
>>>> 2608.731336015:main Q:Reg/w0  :     ACTION 1
>>>> [builtin:omfwd:action(type="builtin:omfwd" ...)]
>>>> 2608.731345835:main Q:Reg/w0  : executing action 1
>>>> 2608.731349861:main Q:Reg/w0  : Called action, logging to builtin:omfwd
>>>> 2608.731373560:main Q:Reg/w0  : strm 0x7f697aa79d70: file
>>>> -1(remotefwd) flush, buflen 575
>>>> 2608.731378696:main Q:Reg/w0  : strmPhysWrite, stream 0x7f697aa79d70, len
>>>> 575
>>>> 2608.731440218:main Q:Reg/w0  : file
>>>> '/var/lib/rsyslog/remotefwd.00000001' opened as #2 with mode 384
>>>> 2608.731447636:main Q:Reg/w0  : strm 0x7f697aa79d70: opened file
>>>> '/var/lib/rsyslog/remotefwd.00000001' for WRITE as 2
>>>> 2608.731457063:main Q:Reg/w0  : strm 0x7f697aa79d70: file 2 write
>>>> wrote 575 bytes
>>>> 2608.731462184:main Q:Reg/w0  : remotefwd queue: write wrote 575
>>>> octets to disk, queue disk size now 575 octets, EnqOnly:0
>>>> 2608.731466581:main Q:Reg/w0  : remotefwd queue: qqueueAdd: entry
>>>> added, size now log 1, phys 1 entries
>>>> 2608.731471191:main Q:Reg/w0  : remotefwd queue:Reg: high activity -
>>>> starting 1 additional worker thread(s).
>>>> 2608.731525047:main Q:Reg/w0  : remotefwd queue:Reg: started with
>>>> state 0, num workers now 1
>>>> 2608.731534075:main Q:Reg/w0  : remotefwd queue: EnqueueMsg advised worker
>>>> start
>>>> 2608.731540956:main Q:Reg/w0  : END batch execution phase, entering to
>>>> commit phase
>>>> 2608.731553966:main Q:Reg/w0  : actionCommitAll: action 0, state 1,
>>>> nbr to commit 1 isTransactional 1
>>>> 2608.731567908:main Q:Reg/w0  : doTransaction: have commitTransaction
>>>> IF, using that, pWrkrInfo 0x7f697aa7e2d0
>>>> 2608.731580406:main Q:Reg/w0  : entering
>>>> actionCallCommitTransaction(), state: itx, actionNbr 0, nMsgs 1
>>>> 2608.731598707:7f69754f4700: thread created, tid 3379, name
>>>> 'rs:remotefwd queue:Reg'
>>>> 2608.731642781:remotefwd queue:Reg/w0: wti 0x7f697aa7d4f0: worker starting
>>>> 2608.731656561:remotefwd queue:Reg/w0: DeleteProcessedBatch: we
>>>> deleted 0 objects and enqueued 0 objects
>>>> 2608.731669096:remotefwd queue:Reg/w0: doDeleteBatch: delete batch
>>>> from store, new sizes: log 1, phys 1
>>>> 2608.731685925:main Q:Reg/w0  : file stream remote.log params: flush
>>>> interval 0, async write 0
>>>> 2608.731711325:main Q:Reg/w0  : omfile: write to stream, pData->pStrm
>>>> 0x7f6968003980, lenBuf 40, strt data Mar  5 16:23:28 localci-hdf dev
>>>> test 4d
>>>> 2608.731720487:main Q:Reg/w0  : strm 0x7f6968003980: file
>>>> -1(remote.log) flush, buflen 40
>>>> 2608.731726623:main Q:Reg/w0  : strmPhysWrite, stream 0x7f6968003980, len
>>>> 40
>>>> 2608.731743671:main Q:Reg/w0  : file '/var/log/remote.log' opened as
>>>> #7 with mode 420
>>>> 2608.731756634:main Q:Reg/w0  : strm 0x7f6968003980: opened file
>>>> '/var/log/remote.log' for WRITE as 7
>>>> 2608.731768282:main Q:Reg/w0  : strm 0x7f6968003980: file 7 write wrote 40
>>>> bytes
>>>> 2608.731774441:main Q:Reg/w0  : Action 0 transitioned to state: rdy
>>>> 2608.731781078:main Q:Reg/w0  : Action 0 transitioned to state: itx
>>>> 2608.731786633:main Q:Reg/w0  : Action 0 transitioned to state: rdy
>>>> 2608.731791771:main Q:Reg/w0  : actionCommit, in retry loop, iRet 0
>>>> 2608.731797654:main Q:Reg/w0  : actionCommitAll: action 2, state 0,
>>>> nbr to commit 0 isTransactional 1
>>>> 2608.731807120:main Q:Reg/w0  : processBATCH: batch of 1 elements has
>>>> been processed
>>>> 2608.731812961:main Q:Reg/w0  : regular consumer finished, iret=0,
>>>> szlog 0 sz phys 1
>>>> 2608.731820159:main Q:Reg/w0  : DeleteProcessedBatch: we deleted 1
>>>> objects and enqueued 0 objects
>>>> 2608.731825717:main Q:Reg/w0  : doDeleteBatch: delete batch from
>>>> store, new sizes: log 0, phys 0
>>>> 2608.731831642:main Q:Reg/w0  : regular consumer finished, iret=4,
>>>> szlog 0 sz phys 0
>>>> 2608.731837955:main Q:Reg/w0  : main Q:Reg/w0: worker IDLE, waiting for
>>>> work.
>>>> 2608.731967228:remotefwd queue:Reg/w0: file
>>>> '/var/lib/rsyslog/remotefwd.00000001' opened as #6 with mode 384
>>>> 2608.731981761:remotefwd queue:Reg/w0: strm 0x7f697aa7aff0: opened
>>>> file '/var/lib/rsyslog/remotefwd.00000001' for READ as 6
>>>> 2608.731989606:remotefwd queue:Reg/w0: strm 0x7f697aa7aff0: file 6
>>>> read 575 bytes
>>>> 2608.732006243:remotefwd queue:Reg/w0: remotefwd queue: error -2308
>>>> dequeueing element - ignoring, but strange things may happen
>>>> 2608.732011036:remotefwd queue:Reg/w0: regular consumer finished,
>>>> iret=-2308, szlog 0 sz phys 1
>>>> 2608.732015422:remotefwd queue:Reg/w0: DeleteProcessedBatch: we
>>>> deleted 0 objects and enqueued 0 objects
>>>> 2608.732019638:remotefwd queue:Reg/w0: doDeleteBatch: delete batch
>>>> from store, new sizes: log 0, phys 1
>>>> 2608.732024016:remotefwd queue:Reg/w0: regular consumer finished,
>>>> iret=4, szlog 0 sz phys 1
>>>> 2608.732028228:remotefwd queue:Reg/w0: remotefwd queue:Reg/w0: worker
>>>> IDLE, waiting for work.
>>>>
>>>>
>>>> Thanks,
>>>> Andrew
>>>> _______________________________________________
>>>> rsyslog mailing list
>>>> http://lists.adiscon.net/mailman/listinfo/rsyslog
>>>> http://www.rsyslog.com/professional-services/
>>>> What's up with rsyslog? Follow https://twitter.com/rgerhards
>>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
>>>> of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you 
>>>> DON'T
>>>> LIKE THAT.
>>>
>>>
>>> _______________________________________________
>>> rsyslog mailing list
>>> http://lists.adiscon.net/mailman/listinfo/rsyslog
>>> http://www.rsyslog.com/professional-services/
>>> What's up with rsyslog? Follow https://twitter.com/rgerhards
>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of
>>> sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T
>>> LIKE THAT.
>>
>>
>> By corrupt I mean the log never leaves the queue. The text of the log
>> message that corresponds to the debug log I included is "test 4d". The
>> debug log shows this error message towards the bottom:
>>
>> 2608.732006243:remotefwd queue:Reg/w0: remotefwd queue: error -2308
>> dequeueing element - ignoring, but strange things may happen
>>
>>
>> Something bad is happening when the message is read from the queue.
>>
>>
>> Thanks,
>> Andrew
>
>
> In doing some more research, it appears that local messages work just
> fine when being sent to the disk queue. (Note that I changed the
> server configuration slightly to allow for local messages to be sent
> to the "remote" action.)
>
> Here are the steps:
>
> 1) Call "logger" from local server. This log message is successfully
> passed through the queue and sent to my local application using omfwd.
>
> RSYSLOG DEBUG LOGS:
>
> 4981.388342088:main Q:Reg/w0  : remotefwd queue: write wrote 395
> octets to disk, queue disk size now 790 octets, EnqOnly:0
> 4981.388462240:remotefwd queue:Reg/w0: strm 0x7f4faae85050: file 2
> read 395 bytes
>
> 2) Call “logger” from remote client. This log message causes an error
> in the queue.
>
> RSYSLOG DEBUG LOGS:
>
> 5000.069186937:main Q:Reg/w0  : remotefwd queue: write wrote 572
> octets to disk, queue disk size now 1362 octets, EnqOnly:0
> 5000.069278979:remotefwd queue:Reg/w0: strm 0x7f4faae85050: file 2
> read 572 bytes
> 5000.069308645:remotefwd queue:Reg/w0: remotefwd queue: error -2308
> dequeueing element - ignoring, but strange things may happen
>
> 3) Call “logger” from local server again. This log message is successful.
>
> RSYSLOG DEBUG LOGS:
>
> 5033.404190090:main Q:Reg/w0  : remotefwd queue: write wrote 395
> octets to disk, queue disk size now 1757 octets, EnqOnly:0
> 5033.404195439:main Q:Reg/w0  : remotefwd queue: qqueueAdd: entry
> added, size now log 1, phys 2 entries
> 5033.404273139:remotefwd queue:Reg/w0: doDeleteBatch: delete batch
> from store, new sizes: log 1, phys 2
> 5033.404279061:remotefwd queue:Reg/w0: objDeserialize error -2029
> during header processing - trying to recover
> 5033.404348551:remotefwd queue:Reg/w0: strm 0x7f4faae85050: file 2
> read 395 bytes
> 5033.404362158:remotefwd queue:Reg/w0: deserializer has possibly been
> able to re-sync and recover, state 0
>
>
>
> To me, this implies that something is different about the message
> being written to disk when it originates from imtcp. However, it still
> works to write this message to file, so I think everything is working
> as far as the TCP transfer of the message.
>
> Andrew


The MsgSerialize and MsgDeserialize functions are out of sync in
msg.c. They label the structured data field differently and expect it
in a different order. When I fix the name and fix the ordering in
MsgDeserialize, my systems work fine.

I created a bug (#264).
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of 
sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE 
THAT.

Reply via email to