Hello all

I have been scratching my head for the last two days about this regular 
expression problem. I would be really VERY happy if someone could help me!

I have the following text in the file 'text.htm', for example:

--

<BLOCKQUOTE><P>
Cow, Cow, Cow, Cow, Cow
Cow, Cow, Cow, Cow, Cow
Cow, Cow, Cow, Cow, Cow
a lot of lines
</P></BLOCKQUOTE>

<p>boring stuff - we are not interested in this....</p>

<BLOCKQUOTE><P>
Chicken, Chicken, Chicken
Chicken, Chicken, Chicken
Chicken, Chicken, Chicken
more lines
</P></BLOCKQUOTE>

<p>more boring stuff - we are not interested in this....</p>

<BLOCKQUOTE><P>
Rabbit, Rabbit, Rabbit, Rabbit

</P></BLOCKQUOTE>

<p>even more boring stuff - we are not interested in this....</p>

<BLOCKQUOTE><P>
Pig, Pig, Pig, Pig, Pig
</P></BLOCKQUOTE>

--

I want to return all the text between <BLOCKQUOTE><P> ... </P></BLOCKQUOTE> 
in an array, one element per match. For example, for the above text, I would 
like to get back an array like this:

array(
        "Cow, Cow, Cow, Cow, Cow Cow, Cow, Cow, Cow, Cow Cow, Cow, Cow, Cow, Cow a lot of lines",
        "Chicken, Chicken, Chicken Chicken, Chicken, Chicken Chicken, Chicken, Chicken more lines",
        "Rabbit, Rabbit, Rabbit, Rabbit",
        "Pig, Pig, Pig, Pig, Pig"
)

I have been trying to do this with (many variations of) the following code:

--

<?php

// open file
$fd = fopen("./text.htm", "r");

// load contents into a variable (initialised first, so .= does not
// append to an undefined variable)
$content = "";
while (!feof($fd))
{
    $content .= fgets($fd, 4096);
}

// close file
fclose($fd);

// replace line breaks (\r\n, \n\r, \n or \r) with a space
$content = preg_replace("/(\r\n)|(\n\r)|(\n|\r)/", " ", $content);

// match against regex -- this does not work correctly....
if (preg_match("/<BLOCKQUOTE><P>(.*)<\/P><\/BLOCKQUOTE>/i", $content, $matches))
{
        echo "<pre>";
        var_dump($matches);
        echo "</pre>";
}

?>

--

For the above, var_dump() returns this:

--

array(2) {
  [0]=>
  string(556) "<BLOCKQUOTE><P> Cow, Cow, Cow, Cow, Cow Cow, Cow, Cow, Cow, 
Cow Cow, Cow, Cow, Cow, Cow a lot of lines </P></BLOCKQUOTE>  <p>boring 
stuff - we are not interested in this....</p>  <BLOCKQUOTE><P> Chicken, 
Chicken, Chicken Chicken, Chicken, Chicken Chicken, Chicken, Chicken more 
lines </P></BLOCKQUOTE>  <p>more boring stuff - we are not interested in 
this....</p>  <BLOCKQUOTE><P> Rabbit, Rabbit, Rabbit, Rabbit  
</P></BLOCKQUOTE>  <p>even more boring stuff - we are not interested in 
this....</p>  <BLOCKQUOTE><P> Pig, Pig, Pig, Pig, Pig </P></BLOCKQUOTE>"
  [1]=>
  string(524) " Cow, Cow, Cow, Cow, Cow Cow, Cow, Cow, Cow, Cow Cow, Cow, 
Cow, Cow, Cow a lot of lines </P></BLOCKQUOTE>  <p>boring stuff - we are not 
interested in this....</p>  <BLOCKQUOTE><P> Chicken, Chicken, Chicken 
Chicken, Chicken, Chicken Chicken, Chicken, Chicken more lines 
</P></BLOCKQUOTE>  <p>more boring stuff - we are not interested in 
this....</p>  <BLOCKQUOTE><P> Rabbit, Rabbit, Rabbit, Rabbit  
</P></BLOCKQUOTE>  <p>even more boring stuff - we are not interested in 
this....</p>  <BLOCKQUOTE><P> Pig, Pig, Pig, Pig, Pig "
}

--

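Clearly not what I want: I get a single match that runs from the first 
<BLOCKQUOTE><P> to the last </P></BLOCKQUOTE>, instead of one match per block.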

Is my approach here incorrect? Or is it indeed possible to construct a regex 
to do what I want (with just one pass of the text)?
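
One direction that might work, as a minimal sketch rather than a tested solution: the (.*) in the pattern above is greedy, so it runs on to the last closing tag, whereas the lazy form (.*?) stops at the first one it can. Combined with preg_match_all(), which collects every match instead of only the first, that should give one element per block in a single pass. The inline sample text and the variable name $blocks are assumptions for illustration, and the line-break cleanup is simplified here to collapse any run of whitespace.

```php
<?php

// Shortened sample standing in for the contents of text.htm (assumption).
$content = <<<HTML
<BLOCKQUOTE><P>
Cow, Cow, Cow
a lot of lines
</P></BLOCKQUOTE>

<p>boring stuff</p>

<BLOCKQUOTE><P>
Rabbit, Rabbit

</P></BLOCKQUOTE>
HTML;

// Collapse every run of whitespace (line breaks included) into one space.
$content = preg_replace('/\s+/', ' ', $content);

// (.*?) is the lazy form of (.*): it ends at the first closing tag it can.
// preg_match_all() gathers all matches rather than stopping after the first.
preg_match_all('/<BLOCKQUOTE><P>(.*?)<\/P><\/BLOCKQUOTE>/i', $content, $matches);

// $matches[1] holds the text captured by the parentheses, one entry per block;
// trim() drops the space left over from the line breaks next to the tags.
$blocks = array_map('trim', $matches[1]);

print_r($blocks);
```

The U pattern modifier (PCRE_UNGREEDY) would be another way to get the same lazy behaviour without changing the quantifier itself.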

Thank you in advance.

:-))

S.








-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
