[PHP] preg_match_all nested html text

2008-08-20 Thread ioannes
I am trying to get the text between nested html tags within arrays 
produced by preg_match_all.  The simple situation would be:


<tr><td>test</td></tr>

I want to return 'test'.

Assume $post_results holds a string derived from an HTML page source 
with lots of nested tags.


Replacing new line (seems to be a good idea to do this):
$post_results = ereg_replace("[\n\r]", " ", $post_results);
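
A note for readers on current PHP: the ereg_* functions were removed in PHP 7.0, so the line above no longer runs there. A minimal sketch of the same newline replacement with preg_replace() follows; the sample string is made up for illustration and stands in for a fetched page source.

```php
<?php
// Sample input is an assumption, not taken from the original post.
$post_results = "<tr>\r\n<td>test</td>\n</tr>";

// Replace every carriage return or line feed with a single space.
$post_results = preg_replace('/[\r\n]/', ' ', $post_results);

echo $post_results, "\n";
```

Alternatively, the s modifier makes . match newlines as well, so a pattern using it can be applied without this preprocessing step.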

I tried this:
$pattern = '/<[^>]*>(.+)<\/[^>]*>/i';

Explanation (as far as I understand, please feel free to correct):
/.../ start end
< - opening html tag
[^>] end tag
* any number of same
> end tag - don't understand why needed in addition to above

(.+) group: any number of any characters
< opening tag
\/ literal forward slash
[^>] end with tag end
* any number of same
> end tag - don't know why needed again
i - modifier, can't remember what it means, something like case 
insensitive, yes, that would be it
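
On the two points marked as not understood: [^>] is a character class that matches any single character other than >, so [^>]* consumes the tag name and attributes, and the literal > after it is what matches the end of the tag. Because (.+) is greedy, it runs up to the last closing tag in the string. A small sketch on the sample row, with the pattern written out in full as an assumption about what was intended:

```php
<?php
$pattern = '/<[^>]*>(.+)<\/[^>]*>/i';
$html = '<tr><td>test</td></tr>';

preg_match_all($pattern, $html, $outputs);

// $outputs[0] holds the full matches, $outputs[1] the first capture group.
echo $outputs[0][0], "\n"; // <tr><td>test</td></tr>
echo $outputs[1][0], "\n"; // <td>test</td>
```

So a single pass strips only the outermost pair of tags; the inner td pair is still inside the captured group.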


  
//Main expression for first try, substituting tags:

preg_match_all($pattern,$post_results,$outputs);

//this only replaces the outer tag eg tr, not the td, so:
while(stristr($outputs[0][1],"<")) {
   preg_match_all($pattern,$outputs[0][1],$outputs,PREG_PATTERN_ORDER);
}
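
One way the peeling loop could be written, as a sketch only. With PREG_PATTERN_ORDER, $outputs[0] holds the full matches and $outputs[1] the text captured by the first group, so the inner markup is found at index [1][0] rather than [0][1]. The function name innermost_text is hypothetical, and the approach only handles a single chain of nested tags, not sibling elements.

```php
<?php
// Hypothetical helper: repeatedly strips the outermost tag pair.
function innermost_text(string $html): string
{
    $pattern = '/<[^>]*>(.+)<\/[^>]*>/i';
    $current = $html;

    // Each successful match yields a strictly shorter capture,
    // so the loop ends once no enclosing tag pair remains.
    while (preg_match($pattern, $current, $m) === 1) {
        $current = $m[1];
    }

    return $current;
}

echo innermost_text('<tr><td>test</td></tr>'), "\n"; // test
```

For rows with several cells, the greedy group spans from the first opening tag to the last closing tag, so this will not separate the cells.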


Is there a neat expression to get the inner text within nested html tags?

Thanks,

John

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP] preg_match_all nested html text

2008-08-20 Thread Ashley Sheridan
Do you just wish to remove all the HTML tags from a given string? If so,
the strip_tags() function should do this.

Ash
www.ashleysheridan.co.uk
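
A short sketch of the strip_tags() suggestion, using a made-up two-cell row. Since strip_tags() removes every tag, the text of adjacent cells runs together; if the cells need to stay separate, one possible approach (an assumption, not something proposed in the thread) is a non-greedy capture per td followed by strip_tags() on each capture.

```php
<?php
// Sample row is an assumption for illustration.
$row = '<tr><td>one</td><td><b>two</b></td></tr>';

// strip_tags() removes all tags, so the cell texts are concatenated.
echo strip_tags($row), "\n"; // onetwo

// Non-greedy capture per <td>, then strip any markup left inside each cell.
preg_match_all('/<td[^>]*>(.*?)<\/td>/is', $row, $m);
$cells = array_map('strip_tags', $m[1]);

print_r($cells);
```

The regex here is only a sketch for simple, well-formed rows; nested tables or unusual markup would call for a real HTML parser.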