Re: Is this fun?

A. Pagaltzis Tue, 15 Jul 2003 08:19:44 -0700

* Keith C. Ivey <[EMAIL PROTECTED]> [2003-07-15 14:42]:
> which will be handled by the regex but may cause a parser to
> blow up (though some are more tolerant than others)


Did you read what I said? You do indeed need a tolerant parser. Did
you take a look at HTML::Parser at all?

> | That leaves input data munging, which I do a lot of, and a
> | lot of input data these days is XML. Now here's the dirty
> | secret; most of it is machine-generated XML,

Is yours?

> | I've even gone to the length of writing a prefilter to glue
> | together tags that got split across multiple lines, just so I
> | could do the regexp trick.

Do you?

Sure, as long as you know your input follows narrower
specifications than "arbitrary valid markup", you can use that
knowledge to your advantage.

The deficiency of parsers lies in their interfaces; what we
really need is a generic matching engine that can be applied to
ordered collections not only of characters, but of arbitrary
objects, so that we could apply a pattern to, say, a
stream of XML parser events.
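
To make that idea concrete, here is a rough sketch in Python, not an
existing module. It uses the standard library's tolerant
html.parser.HTMLParser to turn markup into a list of event tuples, then
runs a tiny pattern matcher over those tuples. The names EventCollector,
Star, match_at, find_all and is_ are all made up for illustration, and
the "engine" only knows single-item predicates and a greedy star, so it
is nowhere near a real regex engine.

```python
from html.parser import HTMLParser


class EventCollector(HTMLParser):
    """Record parser events as (kind, value) tuples instead of acting on them."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        if data.strip():  # ignore whitespace-only text between tags
            self.events.append(("text", data.strip()))


class Star:
    """Hypothetical pattern element: zero or more items satisfying pred (greedy)."""

    def __init__(self, pred):
        self.pred = pred


def match_at(pattern, items, i=0, p=0):
    """Return the end index of a match of pattern[p:] starting at items[i], or None."""
    if p == len(pattern):
        return i
    elem = pattern[p]
    if isinstance(elem, Star):
        j = i
        while j < len(items) and elem.pred(items[j]):
            j += 1
        # Greedy with backtracking: try the longest run first, then shorter ones.
        for k in range(j, i - 1, -1):
            end = match_at(pattern, items, k, p + 1)
            if end is not None:
                return end
        return None
    if i < len(items) and elem(items[i]):
        return match_at(pattern, items, i + 1, p + 1)
    return None


def find_all(pattern, items):
    """Yield non-overlapping slices of items that match pattern, left to right."""
    i = 0
    while i < len(items):
        end = match_at(pattern, items, i)
        if end is not None and end > i:
            yield items[i:end]
            i = end
        else:
            i += 1


def is_(kind, value=None):
    """Build a predicate matching an event by kind and, optionally, by value."""
    return lambda ev: ev[0] == kind and (value is None or ev[1] == value)


collector = EventCollector()
# A tag split across lines trips up line-based regexes, but not the parser.
collector.feed("<ul><li\n>one</li><li>two <b>bold</b></li></ul>")
collector.close()

# Pattern over events: <li>, then any events other than </li>, then </li>.
li_pattern = [
    is_("start", "li"),
    Star(lambda ev: ev != ("end", "li")),
    is_("end", "li"),
]
matches = list(find_all(li_pattern, collector.events))
```

The pattern is the event-level analogue of /<li>.*?<\/li>/, except that
it never sees how the tags were spelled or where the line breaks fell,
so no prefilter is needed.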

-- 
Regards,
Aristotle
