On 03/20/2012 09:54 PM, John P. Rouillard wrote:
>
> Hi all:
>
> When sec creates a dump file, the input sources are reported as:
>
> Input sources:
> ============================================================
> /var/log/messages (status: Open, type: regular file, device/inode: 
> 64774/8339528, received data: 180797 lines, context: _FILE_EVENT_messages)
> /var/spool/sec/Control (status: Open, type: pipe, device/inode: 64770/584644, 
> received data: 0 lines, context: CONTROL)
>
> Would it be possible to get the byte offset at which SEC is reading put
> in there as well for "type: regular file"?
>
> What I am trying to do is see how far SEC has to process to reach the
> end of the file (aka realtime). Ideally I should be able to ls -l the
> file and be able to compare that to the reported offset. The
> difference between the two is how many bytes sec has to process to
> reach realtime.
>
...
> Which leads me to wonder if some sort
> of profiling mode could be added to sec that tells me for every rule:
>
>    how many events are compared to the rule (the current metrics only
>         tell me how many events were matched by the rule including
>         context processing). With this info I can restructure the order
>         to put more expensive/inefficient rules later in processing.
>
>    how many times an event is processed by a rule. This is sort of the
>         inverse of the stat above, but is a good metric to use to tune
>         rules as reducing the number of event ->  rule applications
>         seems to be the best way to reduce processing time.
>
>    how long on average (in real and cpu second/centiseconds) it takes
>         to process an event against a rule (provides an indication of
>         how efficient the regexp/rule is).
>
> I expect this would toss performance through the floor, but I could
> see it being a useful offline mode: throw thousands of sample events
> at the ruleset and use the results to tune rule order, regexps, etc.
>

The file position indicator should be easy to implement -- it only 
requires a few extra system calls during the dump, which adds no CPU 
overhead to regular rule matching. There is one crucial difference 
between the number of processed lines and the file position, though. The 
former reflects lines successfully read and processed from a given file. 
However, the file position can lie beyond the end of the last processed 
line, since SEC implements a line buffering layer on top of read(2) 
system calls. For example, there could be a very long line for which 
read(2) has not yet seen a terminating newline. Therefore, the reported 
file position does not necessarily mean that all the data before it have 
been processed by SEC -- it merely indicates that the data have been 
read (but could still reside in a buffer).
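To illustrate the distinction, here is a minimal Python sketch (SEC 
itself is written in Perl, and the file contents below are made up): a 
single read(2) can pull in a partial last line, so the kernel file 
offset runs ahead of the bytes belonging to complete, processed lines.

```python
import os
import tempfile

# Hypothetical log file whose last line has no terminating newline yet.
tmp = tempfile.NamedTemporaryFile(mode="w", suffix=".log", delete=False)
tmp.write("complete line 1\ncomplete line 2\npartial line without newline")
tmp.close()

fd = os.open(tmp.name, os.O_RDONLY)
buf = os.read(fd, 4096)                # one read(2) call pulls in everything
offset = os.lseek(fd, 0, os.SEEK_CUR)  # kernel file position after the read

# Line-buffering layer: only data up to the last newline counts as processed.
processed = buf.rfind(b"\n") + 1       # bytes belonging to complete lines
pending = buf[processed:]              # partial line still held in the buffer

print(offset)        # 60 -- all bytes read so far
print(processed)     # 32 -- bytes consumed as complete lines
print(len(pending))  # 28 -- read, but not yet processed

os.close(fd)
os.unlink(tmp.name)
```

The gap between the reported offset and the processed-line count is 
exactly the data sitting in the buffer.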

The rest of the ideas need some thought, though, since they would add 
overhead to the main matching loop (even when the extra measuring code 
is not in use, one still needs to check the command line flags to see 
whether the functionality is switched on).
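For what it's worth, the per-rule bookkeeping being proposed could be 
sketched along these lines -- a hypothetical Python mock-up, not SEC 
code, with invented rule names and patterns: each rule gets counters 
for events compared, events matched, and cumulative CPU time spent in 
the regexp engine.

```python
import re
import time

# Invented rules for illustration -- not an actual SEC ruleset.
rules = [
    ("sshd_fail", re.compile(r"sshd\[\d+\]: Failed password")),
    ("kernel_oom", re.compile(r"kernel: .*Out of memory")),
]

# Per-rule counters: events compared, events matched, CPU seconds in matching.
stats = {name: {"compared": 0, "matched": 0, "cpu": 0.0} for name, _ in rules}

def process(event):
    for name, rx in rules:
        s = stats[name]
        s["compared"] += 1
        t0 = time.process_time()
        hit = rx.search(event)
        s["cpu"] += time.process_time() - t0
        if hit:
            s["matched"] += 1
            break                  # first matching rule consumes the event

for line in ("sshd[1234]: Failed password for root",
             "kernel: [567] Out of memory: kill process",
             "crond[89]: job started"):
    process(line)

for name, s in stats.items():
    avg = s["cpu"] / s["compared"] if s["compared"] else 0.0
    print("%s: compared=%d matched=%d avg_cpu=%.6fs"
          % (name, s["compared"], s["matched"], avg))
```

The compared/matched ratio shows how often a rule is tried in vain, 
which is the signal one would use to reorder rules.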
kind regards,
risto

_______________________________________________
Simple-evcorr-users mailing list
Simple-evcorr-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/simple-evcorr-users
