On 03/20/2012 09:54 PM, John P. Rouillard wrote:
> Hi all:
>
> When sec creates a dump file, the input sources are reported as:
>
> Input sources:
> ============================================================
> /var/log/messages (status: Open, type: regular file, device/inode:
> 64774/8339528, received data: 180797 lines, context: _FILE_EVENT_messages)
> /var/spool/sec/Control (status: Open, type: pipe, device/inode: 64770/584644,
> received data: 0 lines, context: CONTROL)
>
> Would it be possible to get the byte offset at which SEC is reading put
> in there as well for "type: regular file"?
>
> What I am trying to do is see how far SEC has to process to reach the
> end of the file (aka realtime). Ideally, I should be able to ls -l the
> file and compare that to the reported offset. The difference between
> the two is how many bytes SEC has to process to reach realtime.
> ...
> Which leads me to wonder if some sort of profiling mode could be added
> to sec that tells me, for every rule:
>
> how many events are compared to the rule (the current metrics only
> tell me how many events were matched by the rule, including context
> processing). With this info I can restructure the order to put more
> expensive/inefficient rules later in processing;
>
> how many times an event is processed by a rule. This is sort of the
> inverse of the stat above, but it is a good metric for tuning rules,
> as reducing the number of event -> rule applications seems to be the
> best way to reduce processing time;
>
> how long on average (in real and CPU seconds/centiseconds) it takes to
> process an event against a rule (this provides an indication of how
> efficient the regexp/rule is).
>
> I expect this would toss performance through the floor, but I could
> see this being a useful offline mode to throw thousands of identical
> events against, and use it to tune the ruleset order, regexps, etc.
The file position indicator should be easy to implement -- one only needs to
invoke a few extra system calls during the dump, which adds no CPU overhead
to regular rule matching.

There is one crucial difference between the number of processed lines and the
file position, though. The former reflects the lines successfully read and
processed from a given file. However, the file position can be located beyond
the end of the last processed line, since SEC implements a line buffering
layer on top of read(2) system calls. For example, there could be a very long
line for which read(2) hasn't seen a terminating newline yet. Therefore, the
reported file position does not necessarily mean that all the data before it
have been processed by SEC -- it merely indicates that the data have been
read (but could still reside in a buffer). A small sketch in the postscript
below illustrates this.

The rest of the ideas need some thought, though, since they add overhead to
the main matching loop (even if the extra code is not executed, one still
needs to check command line flags to see whether the measuring functionality
is switched on -- see the second sketch below).

kind regards,
risto
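P.S. To make the buffering subtlety concrete, here is a minimal sketch of a
line buffering layer on top of raw read() calls. It is written in Python
rather than SEC's actual Perl code, and all names are made up for
illustration:

import os

class LineBufferedInput:
    """Hypothetical reader: complete lines out, partial lines buffered."""

    def __init__(self, path):
        self.fd = os.open(path, os.O_RDONLY)
        self.buf = b""           # bytes read but not yet returned as lines
        self.processed_pos = 0   # offset just past the last complete line

    def file_pos(self):
        # What the dump would report: the kernel's file offset.  One
        # lseek(2) per dump, so nothing is added to the matching loop.
        return os.lseek(self.fd, 0, os.SEEK_CUR)

    def read_lines(self):
        # Return only complete lines; keep any partial line in the buffer.
        self.buf += os.read(self.fd, 8192)
        while True:
            nl = self.buf.find(b"\n")
            if nl < 0:
                break            # partial line stays buffered
            line, self.buf = self.buf[:nl], self.buf[nl + 1:]
            self.processed_pos += nl + 1
            yield line

Here file_pos() always equals processed_pos plus len(self.buf), so comparing
file_pos() with the size from ls -l shows how many bytes SEC still has to
read, while the bytes between processed_pos and file_pos() have been read
but not yet processed as lines.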
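As for the per-rule profiling, the following sketch (again Python with
invented names; in SEC itself this would live in the Perl matching loop,
enabled by a hypothetical --profile option) shows the kind of counters and
timing that could be collected, and also why the flag check itself ends up
in the hot path:

import re
import time
import resource

PROFILING = True   # would come from a hypothetical --profile option

class RuleStats:
    def __init__(self):
        self.compared = 0   # events this rule was applied to
        self.matched = 0    # events the rule actually matched
        self.real = 0.0     # cumulative wall clock seconds
        self.cpu = 0.0      # cumulative CPU seconds (user + system)

def cpu_time():
    ru = resource.getrusage(resource.RUSAGE_SELF)
    return ru.ru_utime + ru.ru_stime

def process_event(event, rules, stats):
    for name, regexp in rules:
        if PROFILING:                     # this test runs for every event
            stats[name].compared += 1     # even when profiling is off
            t0, c0 = time.perf_counter(), cpu_time()
        m = regexp.search(event)
        if PROFILING:
            stats[name].real += time.perf_counter() - t0
            stats[name].cpu += cpu_time() - c0
            if m:
                stats[name].matched += 1
        if m:
            break   # in this toy loop the first match ends processing

rules = [("ssh-fail", re.compile(r"sshd\[\d+\]: Failed password")),
         ("any-error", re.compile(r"error", re.IGNORECASE))]
stats = {name: RuleStats() for name, _ in rules}
for line in ("sshd[42]: Failed password for root",
             "kernel: I/O error on sda"):
    process_event(line, rules, stats)
for name, s in stats.items():
    print(name, s.compared, s.matched, s.real, s.cpu)

Dividing a rule's cumulative real/CPU time by its compared counter then
gives the average cost of applying the rule to an event, which is the
metric one would use for reordering the ruleset.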