M5 wrote:
> I am trying to write a regex function to extract the readable (visible,
> screen-rendered) portion of any web page. Specifically, I only want the
> text between the <body> tags, excluding any <script> or <style> tags
> within the document, also excluding comments. Has anyone here seen such
> a regex? Is it possible to do in one expression?

Alternative suggestion: skip the regex and run "lynx --dump http://www.example.com/",
which prints the page as rendered plain text.
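
For anyone who wants to try this from PHP, here is a rough sketch, not a tested recipe. It assumes lynx is installed and on the PATH and that shell_exec() is not disabled; the function names are made up for illustration. The second function is a pure-PHP fallback that does what the original question asked (body only, no script, style or comments), but in a few passes rather than one expression, and regexes like these will trip over badly malformed HTML.

```php
<?php
// Hypothetical helper: build the lynx command line for a URL.
// -dump writes the rendered page to stdout; -nolist omits the link list.
function lynx_dump_command(string $url): string
{
    // escapeshellarg() keeps a hostile URL from injecting shell commands.
    return 'lynx -dump -nolist ' . escapeshellarg($url);
}

// Hypothetical helper: return the rendered text, or null if lynx failed.
function lynx_page_text(string $url): ?string
{
    $out = shell_exec(lynx_dump_command($url));
    return is_string($out) ? $out : null;
}

// Hypothetical fallback without lynx: several regex passes, not one.
function visible_text(string $html): string
{
    // Keep only what sits between <body> and </body>, if both are present.
    if (preg_match('~<body\b[^>]*>(.*?)</body\s*>~is', $html, $m)) {
        $html = $m[1];
    }
    // Drop comments, then script and style elements with their contents.
    $html = preg_replace(
        ['~<!--.*?-->~s', '~<(script|style)\b[^>]*>.*?</\1\s*>~is'],
        '',
        $html
    );
    // Strip the remaining tags, decode entities, collapse whitespace.
    $text = html_entity_decode(strip_tags($html), ENT_QUOTES, 'UTF-8');
    return trim(preg_replace('~\s+~', ' ', $text));
}
```

Usage would be something like "echo lynx_page_text('http://www.example.com/');", or "echo visible_text(file_get_contents($url));" when lynx is not available.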

Col

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
