RE: Module for simple processing of log files
On Tuesday, 29 March 2005 at 17:52, Orton, Yves wrote:

> > I started working on a project like this but never got around to
> > finishing it. I called it Generic Record Processing System, i.e. GRPS.
> > The point being that this isn't a facility related to parsing log
> > files; it's a facility for processing any file of parsable records in
> > a mechanical way.
>
> Then what do you think of Record::Processor?

Great. Although you might want to take a little bit of time to think
about how you would subdivide that space. For instance, I could imagine:

  Record::Processor::Parser
  Record::Processor::Writer
  Record::Processor::Writer::XML
  Record::Processor::Writer::xSV
  Record::Processor::Writer::Packed
  Record::Processor::Reader::XML
  Record::Processor::Reader::xSV
  Record::Processor::Reader::Packed
  ...

Etc. If the framework makes sense, it should be fairly easy to extend it
for new data representations, output formats and the like. For instance,
if I have some kind of specially encoded records that need to be
preprocessed before your framework can be executed, it should be fairly
easy to add a new subclass and have it DWIM.

Also, when I say these classes, what I'm thinking is that they
encapsulate the knowledge about how to convert a rule specification into
_source_code_; I'm not thinking that they should have methods that are
executed inside of the parse loop. IMO there shouldn't be ANY subroutines
inside of the parse loop. That way the resulting parser is lean and mean
and fast: no method lookup BS or subroutine call stack overhead.

Anyway, as I said, I look forward to seeing your work. :-)

Yves
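The "convert a rule specification into source code, with no subroutines inside the parse loop" idea can be sketched roughly as follows. This is a minimal illustration, not Yves's actual GRPS code: the fixed-width rule format and field names are made up for the example. The point is that the rule spec is compiled once, via string eval, into a sub whose body is a single unpack, so nothing is looked up or called per record.

```perl
use strict;
use warnings;

# Hypothetical rule spec: fixed-width fields as [ name => width ] pairs.
my @rules = ( [ host => 15 ], [ code => 3 ], [ bytes => 8 ] );

# Compile the spec down to an unpack template and a key list.
my $template = join '', map { "A$_->[1]" } @rules;           # 'A15A3A8'
my $names    = join ',', map { "'$_->[0]'" } @rules;

# Generate the record parser as source code and eval it once;
# the resulting sub does one unpack per record, no method dispatch.
my $src = "sub { my %r; \@r{$names} = unpack('$template', \$_[0]); return \\%r }";
my $parse = eval $src or die $@;

my $rec = $parse->("www.example.com20012345678");
printf "host=%s code=%s bytes=%s\n", @$rec{qw(host code bytes)};
```

A real Record::Processor::Reader::Packed subclass would presumably emit the whole `while (<$fh>)` loop as source the same way, so the hot loop stays free of calls.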
TRIEs in the core (was: Re: Module for simple processing of log files)
Orton, Yves wrote:

[...]

> <shameless plug> But David and the other Regexp authors need to update
> their code to take advantage of 5.9.2 and later's innate TRIE
> optimisation. They still have room for optimising the patterns that
> they build, but they will need to build fairly different-looking
> patterns to really harness the TRIE regop. </shameless plug>

No, I've been following the threads on p5p. I've been looking hard at
the stuff I do, and the patterns I generate come from little patterns
that all tend to feature lots of metacharacters (otherwise I'd be doing
hash lookups or index()). Correct me if I'm wrong, but such patterns
don't benefit from your trie optimisations. E.g., what happens with

  FROM MRS\. [A-Z]+ [A-Z]+
  FROM MRS [A-Z]+ [A-Z]+
  FROM MR [A-Z]+ [A-Z]+
  FROM MR\. [A-Z]+ [A-Z]+
  FROM: MRS\. [A-Z]+ [A-Z]+
  FROM: MRS [A-Z]+ [A-Z]+
  FROM: MR [A-Z]+ [A-Z]+
  FROM: MR\. [A-Z]+ [A-Z]+

(actual patterns lifted from Nigerian spam)? R::A produces

  FROM:? MRS?\.? [A-Z]+ [A-Z]+

instead of the whole mess or'ed together. I'm seriously lacking time to
benchmark the differences.

David
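Even without benchmarking, the two forms can be checked for equivalence: the `FROM:? MRS?\.? [A-Z]+ [A-Z]+` factoring expands to exactly the eight or'ed branches (2 × 2 × 2 combinations). A quick sketch of that sanity check, with sample lines made up for illustration:

```perl
use strict;
use warnings;

# The eight branches as they would be or'ed together.
my @branches = (
    'FROM MRS\. [A-Z]+ [A-Z]+',  'FROM MRS [A-Z]+ [A-Z]+',
    'FROM MR [A-Z]+ [A-Z]+',     'FROM MR\. [A-Z]+ [A-Z]+',
    'FROM: MRS\. [A-Z]+ [A-Z]+', 'FROM: MRS [A-Z]+ [A-Z]+',
    'FROM: MR [A-Z]+ [A-Z]+',    'FROM: MR\. [A-Z]+ [A-Z]+',
);
my $ored     = do { my $p = join '|', @branches; qr/^(?:$p)$/ };
my $factored = qr/^FROM:? MRS?\.? [A-Z]+ [A-Z]+$/;

# Both patterns should agree on every input.
for my $line ('FROM: MR. SANI ABACHA', 'FROM MRS MARIAM ABACHA', 'DEAR FRIEND') {
    my $a = $line =~ $ored     ? 1 : 0;
    my $b = $line =~ $factored ? 1 : 0;
    printf "%-24s ored=%d factored=%d\n", $line, $a, $b;
}
```

Running perl with `-Mre=debug` on each version would show whether the alternation form actually compiles to a TRIE regop on 5.9.2+; the factored form trades that for fewer, simpler nodes.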
Re: Should DSLIP codes be updated?
* Robert Rothenberg [EMAIL PROTECTED] [2005-03-29T18:03:09]:

> On 29/03/2005 22:14 Andy Lester wrote:
> > Or thrown away entirely, along with the rest of the archaic idea of
> > module registration.
>
> I'm sympathetic to the idea, but some of the information in DSLIP is
> useful and shouldn't be thrown away (such as support level, development
> stage -- alpha/beta/mature -- and license). What isn't in META.yml
> should go there.

I assume you mean "What isn't in META.yml should go in DSLIP." Why not
"What isn't in META.yml should go in META.yml"? No reason every module
that wants to provide this information can't.

-- 
rjbs
Re: Should DSLIP codes be updated?
Ricardo SIGNES writes:

> I assume you mean "What isn't in META.yml should go in DSLIP." Why not
> "What isn't in META.yml should go in META.yml"?

META.yml sounds much more sensible to me. It wasn't around when DSLIP
was created, but it is now.

Of course, even if we change _where_ this metadata is stored, we still
have to address Robert's original points about the data itself.

Smylers
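For concreteness, carrying the DSLIP-style information in META.yml might look something like this. A sketch only: `name`, `version`, and `license` are standard META.yml fields, but the development-stage and support-level keys below are hypothetical extensions invented for this example, not part of any spec.

```yaml
# Standard META.yml fields:
name: Record-Processor
version: 0.01
license: perl

# Hypothetical extension fields standing in for DSLIP's D and S codes:
x_development_stage: beta          # alpha / beta / mature
x_support_level: mailing-list      # how supported
```

Any module that wants to provide this could simply add such keys, which is Ricardo's point: there's no reason the data has to live outside META.yml.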