I can see where it is deleting from the database action queue in chunks of
10,000 now:

6702.235489189:SecInfra2ArchiveDB queue:Reg/w0: DeleteProcessedBatch: we
deleted 10000 objects and enqueued 0 objects

I assume that means they are being inserted in chunks, too.  But watching
the processlist in mysql, it appears that they are still being inserted
one-at-a-time, albeit in a transaction, so maybe that's what is meant by
"batch" processing?
I see lots of entries like the following, but these appear to originate
from the main queue, not the db action queue.  Is that a correct
interpretation?

...
6702.597844624:main Q:Reg/w0  : SecInfra2ArchiveDB queue: EnqueueMsg
advised worker start
6702.597854174:main Q:Reg/w0  : action 'SecInfra2ArchiveDB': set suspended
state to 0
6702.597861111:main Q:Reg/w0  : processBATCH: next msg 649: <14>1
2017-11-07T15:31:42.255-05:00 JF-ITE-PM2-00C RT_FLOW -
RT_FLOW_SESSION_CREATE [junos@2636.1.1.1.2.28 source-address="10.24
6702.597867433:main Q:Reg/w0  :
6702.597880487:SecInfra2ArchiveDB queue:Reg/w0: dnscache: entry
0x7febc4017ed0 found
6702.597894312:main Q:Reg/w0  : ACTION 0 [ommysql:action(type="ommysql"
...)]
6702.597907683:main Q:Reg/w0  : executing action 0
6702.597919469:main Q:Reg/w0  : action 'SecInfra2ArchiveDB': called,
logging to ommysql (susp 0/0, direct q 0)
...

Thanks,
Steven Libby

On Wed, Nov 8, 2017 at 4:32 PM, David Lang <da...@lang.hm> wrote:

> are the inserts into MySQL happening in large chunks now?
>
> if the queue is filling, it should be dequeueing the messages in large
> batches and submitting them to MySQL in batches, if the batches to MySQL
> are too small, you could have trouble keeping up.
>
> look in the debug logs for how many message are processed between calls to
> insert them into MySQL.
>
> David Lang
>
>  On Wed, 8 Nov 2017, Steven Libby wrote:
>
> Date: Wed, 8 Nov 2017 16:26:08 -0500
>> From: Steven Libby <libby.ste...@gmail.com>
>> To: David Lang <da...@lang.hm>
>> Cc: Steven Libby via rsyslog <rsyslog@lists.adiscon.com>
>> Subject: Re: [rsyslog] ommysql database queue bottleneck
>>
>>
>> I've converted the legacy $MainQueue% options over to the new RainerScript
>> style and confirmed the queues (main and action) are dequeueing in the
>> proper designated chunk sizes now.
>>
>> I've seen nothing else in the rsyslog debug log which would indicate a
>> problem, but then again I might be missing something.  Any help or
>> guidance
>> on how to properly debug this bottleneck would be greatly appreciated.
>>
>> Thanks,
>> Steven Libby
>>
>> On Tue, Nov 7, 2017 at 2:06 PM, Steven Libby <libby.ste...@gmail.com>
>> wrote:
>>
>> We had configured large batches (8192) for the main queue, action queue,
>>> and udp input module.  However, after a brief debugging session, it
>>> appears
>>> that at least two of these settings are not being honored...
>>>
>>> The database action queue seems to be deleting maximum batch sizes of 8
>>> at
>>> a time:
>>>
>>> 7968.410369112:SecInfra2ArchiveDB queue[DA]:Reg/w0: regular consumer
>>> finished, iret=0, szlog 235270 sz phys 235278
>>> 7968.410545345:SecInfra2ArchiveDB queue[DA]:Reg/w0:
>>> DeleteProcessedBatch:
>>> we deleted 8 objects and enqueued 0 objects
>>> 7968.410554591:SecInfra2ArchiveDB queue[DA]:Reg/w0: doDeleteBatch:
>>> delete
>>> batch from store, new sizes: log 235270, phys 235270
>>>
>>> And the main queue appears to be dequeueing in maximum chunks of 256:
>>>
>>> 7963.597767589:main Q:Reg/w0  : END batch execution phase, entering to
>>> commit phase [processed 25 of 25 messages]
>>> 7963.616701318:main Q:Reg/w0  : END batch execution phase, entering to
>>> commit phase [processed 38 of 38 messages]
>>> 7963.669420594:main Q:Reg/w0  : END batch execution phase, entering to
>>> commit phase [processed 224 of 224 messages]
>>> 7963.731337893:main Q:Reg/w0  : END batch execution phase, entering to
>>> commit phase [processed 256 of 256 messages]
>>> 7963.800962451:main Q:Reg/w0  : END batch execution phase, entering to
>>> commit phase [processed 256 of 256 messages]
>>> 7963.857743064:main Q:Reg/w0  : END batch execution phase, entering to
>>> commit phase [processed 256 of 256 messages]
>>> 7963.904639838:main Q:Reg/w0  : END batch execution phase, entering to
>>> commit phase [processed 256 of 256 messages]
>>> 7963.985528145:main Q:Reg/w0  : END batch execution phase, entering to
>>> commit phase [processed 256 of 256 messages]
>>> ...
>>>
>>> I'm starting to think some of the options in our config file are
>>> incorrect
>>> or no longer supported...?
>>>
>>> Also, is there any specific keywords I should be looking for in the debug
>>> log?  Or any specific debug options I should be turning on (i.e.
>>> LogFuncFlow, PrintMutexAction, OutputTidToStderr, etc...)?
>>>
>>>
>>> On Mon, Nov 6, 2017 at 4:54 PM, David Lang <da...@lang.hm> wrote:
>>>
>>> On Mon, 6 Nov 2017, Steven Libby wrote:
>>>>
>>>> I've tried removing the ruleset altogether and just using an action to
>>>>
>>>>> minimize locking overhead (which I assume would be the same thing
>>>>> you're
>>>>> referring to here).  But just to make sure, I also tried changing the
>>>>> ruleset queue type to "Direct" and got the same result.
>>>>>
>>>>>
>>>> yes
>>>>
>>>> Also, most of the time during my testing I used resumeRetryCount="-1"
>>>> but
>>>>
>>>>> mistakenly left it at "0" when I posted to this mailing list.
>>>>> Reverting
>>>>> back to "-1" still results in a backlog in the database action queue.
>>>>> When
>>>>> you say "...some insert attempts...are not being accepted by mysql...",
>>>>> are
>>>>> you implying that some insert attempts are actually being rejected by
>>>>> mysql?
>>>>>
>>>>>
>>>> that was one possibility when I thought you were not retrying logs.
>>>>
>>>>  I have enabled the general query log with the highest level of
>>>>
>>>>> verbosity, and also installed the audit plugin, but I am unable to see
>>>>> any
>>>>> indication that any insert attempts are being rejected.  I have yet to
>>>>> enable the DBUG package yet for further investigation, but will try
>>>>> that
>>>>> next to see if I can glean any new information.
>>>>>
>>>>>
>>>> are logs getting inserted one at a time or in large batches? try to
>>>> increase batching, get a debug log for a bit on the rsyslog side and
>>>> see if
>>>> it's having to make multiple attempts to insert messages (for example,
>>>> you
>>>> try to deliver up to 8K messages at once, rsyslog could be rejecting
>>>> this
>>>> because the total request size is too large and rsyslog then tries to
>>>> deliver 4K, 2K, 1K... messages until it gets small enough to be
>>>> accepted)
>>>>
>>>> David Lang
>>>>
>>>>
>>>> On a side note, I created a separate test table using the MyISAM storage
>>>>
>>>>> engine and the insert rate was improved, but still filled up the
>>>>> database
>>>>> action queue within several minutes.  I've reduced the durability of
>>>>> TokuDB
>>>>> by disabling tokudb_commit_sync and tokudb_fsync_log_period global
>>>>> variables, but there was zero noticeable performance difference here.
>>>>> (Background: we use TokuDB for it's compression and secondary
>>>>> clustering
>>>>> indexes, mainly).  I would expect to see excessive CPU utilization or
>>>>> Disk
>>>>> I/O if the database was struggling to keep up with the rsyslog queue,
>>>>> but
>>>>> perhaps that expectation is incorrect?
>>>>>
>>>>> Regards,
>>>>> Steven
>>>>>
>>>>> On Sat, Nov 4, 2017 at 5:03 PM, David Lang <da...@lang.hm> wrote:
>>>>>
>>>>> don't put a queue on the ruleset and the only item in the ruleset
>>>>>
>>>>>>
>>>>>> when you have to queues like this, you end up with lots of locking
>>>>>> overhead and poor batch sizing.
>>>>>>
>>>>>> with retrycount=0 you never attempt to re-deliver a message that
>>>>>> fails,
>>>>>> if
>>>>>> you set it to -1 it will retry forever
>>>>>>
>>>>>> try setting it to -1 and see if logs are getting stuck (delivery
>>>>>> stops),
>>>>>> if so, you are getting some insert attempts that are not being
>>>>>> accepted
>>>>>> by
>>>>>> mysql and since you have retries disabled, the entire batch is being
>>>>>> thrown
>>>>>> away.
>>>>>>
>>>>>>
>>>>>> On Fri, 3 Nov 2017, Steven Libby via rsyslog wrote:
>>>>>>
>>>>>> Date: Fri, 3 Nov 2017 14:53:17 -0400
>>>>>>
>>>>>> From: Steven Libby via rsyslog <rsyslog@lists.adiscon.com>
>>>>>>> To: rsyslog@lists.adiscon.com
>>>>>>> Cc: Steven Libby <libby.ste...@gmail.com>
>>>>>>> Subject: [rsyslog] ommysql database queue bottleneck
>>>>>>>
>>>>>>>
>>>>>>> We are running rsyslog 8.21.0 on a RHEL 6.7 box with 256GB of RAM and
>>>>>>> 96
>>>>>>> cores, writing to a MySQL instance (Percona Server 5.6.28-76) using
>>>>>>> TokuDB
>>>>>>> as the storage engine.  This is our centralized log server, which
>>>>>>> receives
>>>>>>> messages from numerous network devices at an average rate of 7k
>>>>>>> messages
>>>>>>> per second.  This message rate is fairly consistent throughout the
>>>>>>> course
>>>>>>> of the day.
>>>>>>>
>>>>>>> We noticed records missing from the database when comparing with the
>>>>>>> raw
>>>>>>> records being written directly to a file on another server.  Based on
>>>>>>> the
>>>>>>> investigation so far, it appears the lost messages are being
>>>>>>> discarded
>>>>>>> due
>>>>>>> to queues filling up.
>>>>>>>
>>>>>>> After enabling the impstats module, we discovered that the bottleneck
>>>>>>> appears to be in the database queue; but we're having trouble
>>>>>>> determining
>>>>>>> the exact cause of the bottleneck.  Database resource utilization is
>>>>>>> relatively low (CPU utilization < 4%, Disk I/O < 1%, 5+GB free RAM)
>>>>>>> and
>>>>>>> we've tuned the database configuration to remove any unnecessary
>>>>>>> overhead.
>>>>>>> We've made numerous configuration changes, both on the DB side and in
>>>>>>> rsyslogd, and the queue seems to build at roughly the same rate
>>>>>>> regardless
>>>>>>> of of everything we've tried.
>>>>>>>
>>>>>>> We are successfully writing 3k-4k records per-second into the DB, but
>>>>>>> the
>>>>>>> action queue is also building at about that same rate.
>>>>>>>
>>>>>>> Here is the latest rsyslog configuration we've tried (below) cleaned
>>>>>>> up
>>>>>>> for
>>>>>>> the purpose of this message.  Most of the incoming messages arrive
>>>>>>> via
>>>>>>> imudp.  We've tried adjusting the number of threads in that module,
>>>>>>> as
>>>>>>> well
>>>>>>> as the batchSize, and the number of messages coming in stays about
>>>>>>> the
>>>>>>> same, regardless.  We've also tried different combinations of MainMsg
>>>>>>> queue
>>>>>>> worker threads and batch sizes, as well as similar adjustments in the
>>>>>>> ruleset and action configurations.  We've gone through the changelogs
>>>>>>> up
>>>>>>> to
>>>>>>> rsyslog 8.31.0 and didn't see anything to indicate that upgrading
>>>>>>> would
>>>>>>> help with the performance.  We've also gone through most of the
>>>>>>> online
>>>>>>> documentation at rsyslog.com/doc/v8-stable/ and read some articles
>>>>>>> by
>>>>>>> David
>>>>>>> Long regarding performance improvements and the work he did with
>>>>>>> Intuit.
>>>>>>>
>>>>>>> At this point, any additional help or guidance from the professionals
>>>>>>> would
>>>>>>> be greatly appreciated.  Thanks for taking the time to read this...
>>>>>>>
>>>>>>>
>>>>>>> #rsyslog v8.x config file
>>>>>>> $ActionFileDefaultTemplate RSYSLOG_TraditionalFileFormat
>>>>>>> $RepeatedMsgContainsOriginalMsg on
>>>>>>> $OptimizeForUniprocessor off
>>>>>>> $PreserveFQDN on
>>>>>>> $WorkDirectory /HSD1/syslogQ
>>>>>>>
>>>>>>> $MainMsgQueueFileName mainq-SecInfra
>>>>>>> $MainMsgQueueMaxDiskSpace 200g
>>>>>>> $MainMsgQueueMaxFileSize 1g
>>>>>>> $MainMsgQueueSaveOnShutdown on
>>>>>>> $MainMsgQueueType LinkedList
>>>>>>> $MainMsgQueueSize 10485760
>>>>>>> $MainMsgQueueHighWatermark 10000000
>>>>>>> $MainMsgQueueWorkerThreads 8
>>>>>>> $MainMsgDequeueBatchSize 8192
>>>>>>>
>>>>>>> module(load="imudp" threads="8" batchSize="8192" timeRequery="10")
>>>>>>> module(load="imptcp" threads="8" processOnPoller="off" )
>>>>>>> module(load="imuxsock")
>>>>>>> module(load="imklog")
>>>>>>> module(load="immark")
>>>>>>> module(load="ommysql")
>>>>>>> module(load="impstats" interval="5" resetCounters="off"
>>>>>>> format="legacy"
>>>>>>> log.syslog="off" log.file="/HSD1/locallog/pstat
>>>>>>> s-SecInfra_2017-11-03.log")
>>>>>>> module(load="mmutf8fix")
>>>>>>>
>>>>>>> $template SecInfraSyslogTmpl,"insert into SecInfraSyslog.syslogRaw
>>>>>>> (recievedAt,deviceReportedTime,deviceTimezone,fromHost,
>>>>>>> fromHostIpv4,hostName,facility,priority,localPriority,infoUnitId,
>>>>>>> inputName,syslogTag,programName,appName,procId,msgId,message
>>>>>>> ,rawMessage)
>>>>>>> values ('%timegenerated:::date-mysql%.%timegenerated:::date-
>>>>>>> subseconds%','%timereported:::date-mysql%.%timereported:::
>>>>>>> date-subseconds%','%timereported:::date-tzoffsdirection%%
>>>>>>> timereported:
>>>>>>> ::date-tzoffshour%','%FROMHOST%','%fromhost-ip%','%
>>>>>>> HOSTNAME%',%syslogfacility%,%syslogpriority%,'%PRI%',%iut%,
>>>>>>> '%inputname%','%syslogtag%','%programname%','%APP-NAME%','%
>>>>>>> PROCID%','%MSGID%','%msg%','%rawmsg%')",SQL
>>>>>>>
>>>>>>> ruleset(name="SecInfra"
>>>>>>>  queue.saveonshutdown="on"
>>>>>>>  queue.type="LinkedList"
>>>>>>>  queue.filename="SecInfra-ruleQ"
>>>>>>>  queue.highwatermark="1000000"
>>>>>>>  queue.size="1310720"
>>>>>>>  queue.dequeuebatchsize="8192"
>>>>>>>  queue.maxdiskspace="200g"
>>>>>>>  queue.maxfilesize="10g"
>>>>>>>  queue.workerthreads="2"
>>>>>>> ) {
>>>>>>>  action(name="SecInfra2ArchiveDB"
>>>>>>>    type="ommysql"
>>>>>>>    server="127.0.0.1"
>>>>>>>    db="SecInfraSyslog"
>>>>>>>    uid="rsyslog"
>>>>>>>    pwd="************"
>>>>>>>    serverport="3306"
>>>>>>>    template="SecInfraSyslogTmpl"
>>>>>>>    queue.saveonshutdown="on"
>>>>>>>    queue.type="LinkedList"
>>>>>>>    queue.filename="SecInfraArchive-dbQ"
>>>>>>>    queue.highwatermark="1000000"
>>>>>>>    queue.size="1310720"
>>>>>>>    queue.dequeuebatchsize="8192"
>>>>>>>    queue.maxdiskspace="200g"
>>>>>>>    queue.maxfilesize="10g"
>>>>>>>    queue.workerthreads="2"
>>>>>>>    action.resumeRetryCount="0"
>>>>>>>    action.reportSuspension="on"
>>>>>>>    action.reportSuspensionContinuation="on"
>>>>>>>  )
>>>>>>> }
>>>>>>>
>>>>>>> input(type="imudp" name="SecInfraUDP" address="************"
>>>>>>> port="514"
>>>>>>> ruleset="SecInfra")
>>>>>>> input(type="imtcp" name="SecInfraTCP" address="************"
>>>>>>> port="514"
>>>>>>> ruleset="SecInfra")
>>>>>>>
>>>>>>> Regards,
>>>>>>> Steven Libby
>>>>>>> _______________________________________________
>>>>>>> rsyslog mailing list
>>>>>>> http://lists.adiscon.net/mailman/listinfo/rsyslog
>>>>>>> http://www.rsyslog.com/professional-services/
>>>>>>> What's up with rsyslog? Follow https://twitter.com/rgerhards
>>>>>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a
>>>>>>> myriad
>>>>>>> of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if
>>>>>>> you
>>>>>>> DON'T LIKE THAT.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>
>>
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of 
sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE 
THAT.

Reply via email to