date:20050330

RE: Module for simple processing of log files

2005-03-30 Thread Orton, Yves


On Tuesday 29 March 2005 at 17:52, Orton, Yves wrote:

I started working on a project like this but never got around to finishing
it. I called it Generic Record Processing System, i.e. GRPS. The point being
that this isn't a facility related to parsing log files, it's a facility
relating to processing any file of parsable records in a mechanical way.

Then what do you think of Record::Processor?

Great. Although you might want to take a little bit of time to think about how you would subdivide that space. For instance, I could imagine:

Record::Processor::Parser
Record::Processor::Writer
Record::Processor::Writer::XML
Record::Processor::Writer::xSV
Record::Processor::Writer::Packed
Record::Processor::Reader::XML
Record::Processor::Reader::xSV
Record::Processor::Reader::Packed

... Etc...

If the framework makes sense, it should be fairly easy to extend it for new data representations, output formats and the like. For instance, if I have some kind of specially encoded records that need to be preprocessed before your framework can run, then it should be fairly easy to add a new subclass and have it DWIM.

Also, when I say these classes, what I'm thinking is that they encapsulate the knowledge about how to convert a rule specification into _source_code_. I'm not thinking that they should have methods that are executed inside of the parse loop. IMO there shouldn't be ANY subroutines inside of the parse loop. That way the resulting parser is lean and mean and fast. No method lookup BS or subroutine call stack overhead.
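As a rough illustration of the "rule specification into source code" idea, here is a minimal sketch. It is written in Python rather than Perl, and every name in it (build_parser, pattern, fields) is hypothetical, not part of any proposed Record::Processor API. The rule specification is assumed to be just a regex plus a list of field names. The generator unrolls the field list into literal source text, so the generated loop does no per-field handler or method dispatch; the regex match and list append are still calls in Python, so this only approximates the "no subroutines in the loop" goal.

```python
import re


def build_parser(pattern, fields):
    """Generate and compile source for a parser specialised to one record format.

    `pattern` and `fields` stand in for a rule specification; both are
    assumptions made for this sketch.
    """
    lines = [
        "def parse(lines):",
        "    match = _regex.match",
        "    out = []",
        "    append = out.append",
        "    for line in lines:",
        "        m = match(line)",
        "        if m is None:",
        "            continue",
        "        g = m.groups()",
    ]
    # Unroll the field list into one literal dict expression, so the
    # generated loop contains no per-field lookup or handler call.
    items = ", ".join(f"{name!r}: g[{i}]" for i, name in enumerate(fields))
    lines.append(f"        append({{{items}}})")
    lines.append("    return out")
    source = "\n".join(lines) + "\n"

    # Compile the generated text; the precompiled regex is the only
    # thing the generated code needs from the outside.
    namespace = {"_regex": re.compile(pattern)}
    exec(source, namespace)
    return namespace["parse"], source


parse, source = build_parser(r"(\S+) (\S+) (\d+)", ["host", "user", "status"])
records = parse(["alpha bob 200", "garbage", "beta eve 404"])
```

Printing `source` shows the specialised loop that was generated, which is the part a Reader or Writer class would be responsible for emitting in this reading of the design.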

Anyway, as I said, I look forward to seeing your work. :-)
Yves

TRIEs in the core (was: Re: Module for simple processing of log files)

2005-03-30 Thread David Landgren

Orton, Yves wrote:
[...]
<shameless plug>
But David and the other Regexp authors need to update their code to take
advantage of 5.9.2 and later innate TRIE optimisation. They still have
room for optimising the patterns that they build but they will need to
build fairly different looking patterns to really harness the TRIE regop.
</shameless plug>

No, I've been following the threads on p5p. I've been looking hard at
the stuff I do, and the patterns I generate come from little patterns
that all tend to feature lots of metacharacters (otherwise I'd be doing
hash lookups or index()). Correct me if I'm wrong, but such patterns
don't benefit from your trie optimisations. E.g., what happens with

FROM MRS\. [A-Z]+ [A-Z]+
FROM MRS [A-Z]+ [A-Z]+
FROM MR [A-Z]+ [A-Z]+
FROM MR\. [A-Z]+ [A-Z]+
FROM: MRS\. [A-Z]+ [A-Z]+
FROM: MRS [A-Z]+ [A-Z]+
FROM: MR [A-Z]+ [A-Z]+
FROM: MR\. [A-Z]+ [A-Z]+
(actual patterns lifted from Nigerian spam)? R::A (Regexp::Assemble) produces

FROM:? MRS?\.? [A-Z]+ [A-Z]+

instead of the whole mess or'ed together. I'm seriously lacking time to
benchmark the differences.

David

Re: Should DSLIP codes be updated?

2005-03-30 Thread Ricardo SIGNES

* Robert Rothenberg [2005-03-29T18:03:09]
 On 29/03/2005 22:14 Andy Lester wrote:
 
 Or thrown away entirely, along with the rest of the archaic idea of
 module registration.
 
 I'm sympathetic to the idea, but some of the information in DSLIP is 
 useful and shouldn't be thrown away (such as how supported, 
 alpha/beta/mature, and license). What isn't in META.yml should go there.

I assume you mean "What isn't in META.yml should go in DSLIP."

Why not "What isn't in META.yml should go in META.yml"?

No reason every module that wants to provide this information can't.

-- 
rjbs



Re: Should DSLIP codes be updated?

2005-03-30 Thread Smylers

Ricardo SIGNES writes:

 I assume you mean "What isn't in META.yml should go in DSLIP."
 
 Why not "What isn't in META.yml should go in META.yml"?

META.yml sounds much more sensible to me.  It wasn't around when DSLIP
was created, but it is now.

Of course, even if we change _where_ this metadata is stored, we still
have to address Robert's original points about the data itself.

Smylers
