I've got the regex doing what I want, but it's still open to the client
breaking it.

I need to extract all the text, including any HTML tags, between the first
opening <p> and the first closing </p>.

So from:

        <p>I want this paragraph. <b>Including me</b></p><p>But not me</p>

I want:

        I want this paragraph. <b>Including me</b>

I have gotten this far:

        /<p>([^<]+)<\/p>/

but of course this fails as soon as any HTML tag is nested between the <p>
tags, because [^<]+ stops at the first <. I've tried adding extra character
classes after the one already there, but that doesn't work either. Am I
missing something simple here? Any help would be very much appreciated.
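
For what it's worth, one common way around this is a non-greedy match, which
stops at the first </p> instead of refusing every <. The sketch below uses
Python's re module only so that it can be run as-is; the function name
first_paragraph is made up for illustration, and the pattern assumes a bare
<p> with no attributes. In Perl the same idea would be /<p>(.*?)<\/p>/is.

```python
import re

# Non-greedy capture between the first <p> and the first </p>.
# re.DOTALL lets "." cross line breaks; re.IGNORECASE also accepts <P>.
# Assumption: the opening tag is a bare <p> with no attributes.
FIRST_PARAGRAPH = re.compile(r"<p>(.*?)</p>", re.DOTALL | re.IGNORECASE)


def first_paragraph(html):
    """Return the contents of the first <p>...</p> pair, or None."""
    match = FIRST_PARAGRAPH.search(html)
    return match.group(1) if match else None


sample = "<p>I want this paragraph. <b>Including me</b></p><p>But not me</p>"
print(first_paragraph(sample))  # I want this paragraph. <b>Including me</b>
```

This is still only as robust as the markup it is fed: a <p class="..."> or an
unclosed paragraph would need the pattern adjusted.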

Does anyone know of a site with solutions to common regex problems,
something beyond the basic tutorials I keep finding?

Thanks

Ade

-----Original Message-----
From: Paul Johnston [mailto:[EMAIL PROTECTED]
Sent: 13 June 2003 10:20
To: [EMAIL PROTECTED]
Subject: RE: [ cf-dev ] OT - Perl RegEx


> > > href\s*=\s*(['"])[^\1]+\1
> >
> > I take it this means if a single opens the attribute, only a
> > single one will be looked for as the closing quote? The back 
> > reference \1 holds the value of (['"])?
> 
> Yup, that's right.
> 
> But then again it may be that the data you're supplied with 
> is something like href="...' - in which case you wouldn't 
> want to use it...  ;)

The assumption (note: assumption) would be that any hrefs you get would be
URL-encoded, so you wouldn't have an issue with " or ' inside the URL.  Then
you could justifiably let either quote close the string, covering a typo like
href="...' (although it's VERY bad).
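
One caveat on the pattern quoted above: inside a character class, \1 is not a
backreference. Perl and Python both read [^\1] as "any character except
\x01", so the class does not stop at the opening quote. A minimal sketch of
the difference follows, again in Python so it can be run directly; the sample
markup and the names as_quoted and lazy are made up for illustration.

```python
import re

# Hypothetical sample: two links on one line, both double-quoted.
html = '<a href="a.html">x</a> <a href="b.html">y</a>'

# Pattern as quoted in the thread. [^\1] is a character class, so \1 is an
# octal escape (character \x01) there, not a backreference.
as_quoted = re.compile(r"""href\s*=\s*(['"])[^\1]+\1""")

# Sketch of an alternative: lazily match up to the first quote that equals
# the opening one. Outside a class, \1 really is a backreference.
lazy = re.compile(r"""href\s*=\s*(['"])(.*?)\1""")

# The greedy class runs on to the LAST double quote on the line.
print(as_quoted.search(html).group(0))
# href="a.html">x</a> <a href="b.html"

print([m.group(2) for m in lazy.finditer(html)])
# ['a.html', 'b.html']
```

With the lazy version, mismatched quotes such as href="...' simply don't
match, which is the strict behaviour discussed above.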

Remember, regexes can only catch MOST of the cases, not all of them.  Don't
even attempt to catch them all.  If there is an error, fix that one and move
on.  Trying to write a be-all and end-all regex almost never works.

Paul



-- 
** Archive: http://www.mail-archive.com/dev%40lists.cfdeveloper.co.uk/

To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
For human help, e-mail: [EMAIL PROTECTED]
