Dave M G wrote:
> Jochem,
> 
> Thank you for responding.
> 
>>
>> does this one work?:
>> preg_replace('#^<\!DOCTYPE(.*)<ul[^>]*>#is', '', $htmlPage);
> 
> Yes, that works. I don't think I would have ever figured that out on my
> own - it's certainly much more complicated than the ereg equivalent.

1. the '^' at the start of the regexp states 'must match start of the
string (or line in multiline mode)'
2. the 'i' after the closing regexp delimiter states 'match
case-insensitively'
3. the 's' after the closing regexp delimiter states 'the dot also
matches newlines'
4. the '<ul[^>]*>' matches a UL tag with any number of attributes ... the
'[^>]*' matches any number (zero or more) of characters that are not a
'>' character - the square brackets denote a character class (in this
case with just one character in it) and the '^' at the start of the
character class definition negates the class (i.e. turns the character
class definition to mean every character *not* defined in the class)
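
to see points 2 and 3 in action, here is a small sketch. the $htmlPage
content is a made-up sample (an assumption, not your real page); the
pattern is the one from the top of this message, run once as-is and
once with a modifier dropped:

```php
<?php
// Hypothetical sample page, only here to exercise the pattern.
$htmlPage = "<!DOCTYPE html>\n<html>\n<body>\n"
          . "<UL class=\"menu\">\n<li>one<br>\n</ul>\n</body>\n</html>";

// Full pattern: 'i' lets <ul match <UL, 's' lets '.' cross newlines.
$stripped = preg_replace('#^<\!DOCTYPE(.*)<ul[^>]*>#is', '', $htmlPage);

// Without 's' the dot stops at the first newline, so nothing matches
// and preg_replace() hands back the subject unchanged.
$noDotAll = preg_replace('#^<\!DOCTYPE(.*)<ul[^>]*>#i', '', $htmlPage);

// Without 'i' the lowercase <ul in the pattern never matches <UL.
$caseSensitive = preg_replace('#^<\!DOCTYPE(.*)<ul[^>]*>#s', '', $htmlPage);

var_dump($stripped === "\n<li>one<br>\n</ul>\n</body>\n</html>"); // bool(true)
var_dump($noDotAll === $htmlPage);                                // bool(true)
var_dump($caseSensitive === $htmlPage);                           // bool(true)
```

note that '(.*)' is greedy: on a page with several UL tags it removes
everything up to the *last* opening '<ul ...>' tag, not the first.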

PCRE is a lot more powerful [imho], the downside is it has more modifiers
and syntax to control the meaning of the patterns...

read and become familiar with these 2 pages:
http://php.net/manual/en/reference.pcre.pattern.modifiers.php
http://php.net/manual/en/reference.pcre.pattern.syntax.php

and remember that writing patterns is often quite complex - when you build one
just take it one 'assertion' at a time, i.e. build the pattern up step by step...
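
one way to do the step-by-step build, sketched under assumptions: the
$page string and the intermediate patterns are made up for
illustration, each step adds one piece and is checked with
preg_match() before moving on:

```php
<?php
// Hypothetical sample input.
$page = "<!DOCTYPE html>\n<body>\n<ul id=\"nav\">\n<li>x</li>\n</ul>";

// Each entry adds one more piece to the previous pattern.
$steps = [
    '#^<\!DOCTYPE#i',                // 1. anchored start of the page
    '#^<\!DOCTYPE(.*)<ul#is',        // 2. anything (newlines too) up to "<ul"
    '#^<\!DOCTYPE(.*)<ul[^>]*>#is',  // 3. the rest of the opening UL tag
];

$results = [];
foreach ($steps as $pattern) {
    // preg_match() returns 1 on a match, 0 on no match, false on error.
    $results[] = preg_match($pattern, $page);
    echo $pattern, ' => ', end($results), "\n";
}
```

if a step stops returning 1, the piece you just added is the one to
look at.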

if you give it a good go and get stuck, then there is always the list.

> 
> If I may push for just one more example of how to properly use regular
> expressions with preg:
> 
> It occurs to me that I've been assuming that with regular expressions I
> could only remove or change specified text.

essentially regexps are pattern syntax for asserting whether something matches
a pattern (or not) - there are various functions that allow you to act upon the
results of the pattern matching depending on your needs (see below)

> 
> What if I wanted to get rid of everything *other* than the specified text?
> 
> Can I form an expression that would take $htmlPage and delete everything
> *except* text that is between a <li> tag and a <br> tag?

yes, but you wouldn't use preg_replace() but rather preg_match() or
preg_match_all(), which give you back an array (via the 3rd argument,
which is passed by reference) containing the texts that matched (and
that you therefore want to keep).
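
a minimal sketch of that, assuming a made-up $htmlPage and assuming
every <li> you care about is followed by a <br> (the pattern is my
guess at what you need, adjust it to your real markup):

```php
<?php
// Hypothetical sample input.
$htmlPage = "<ul>\n<li>First item<br>\n<li>Second\nitem<BR />\n</ul>";

// '(.*?)' is non-greedy, so each match stops at the nearest <br> tag.
// '[^>]*' allows attributes in <li ...> and variants such as <br />.
$count = preg_match_all('#<li[^>]*>(.*?)<br[^>]*>#is', $htmlPage, $matches);

// $matches[0] holds the full matches including the tags,
// $matches[1] holds only what the first (...) group captured.
$kept = $matches[1];

print_r($kept);
```

preg_match_all() returns the number of full matches (2 here), and
$kept holds just the text between each <li> and <br>.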

> 
> Or is that something that requires much more than a single use of
> preg_replace?
> 
> -- 
> Dave M G
> 

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
