I am working on a script that reads in an HTML file and outputs formatted
plain text.  Not a significant task, but one area that I am having
difficulty with is gracefully converting the 'A' element.  The desired
outcome is text that maintains the link reference in brackets:

$body = "<p><a href=\"mailto:[EMAIL PROTECTED]\">first link</a></p>
         <p><a href=\"http://www.tao.ca/\">second link</a></p>
         <p><a href=\"http://www.tao.ca\">http://www.tao.ca</a></p>
         <p><a href=\"mailto:[EMAIL PROTECTED]\">[EMAIL PROTECTED]</a></p>";

// Regexes for dealing with HTML elements here
// Most of them omitted for simplicity
// http links: keep the link text, then the full URL in brackets
$body = preg_replace('/<a href="(http:\/\/)(.*)".*>(.*)<\/a>/Usi', "\\3 (\\1\\2)", $body);
// mailto links: keep the link text, then the address without "mailto:"
$body = preg_replace('/<a href="(mailto:)(.*)".*>(.*)<\/a>/Usi', "\\3 (\\2)", $body);

// output        //


first link ([EMAIL PROTECTED])
second link (http://www.tao.ca/)
http://www.tao.ca (http://www.tao.ca)



The regexes above handle the first and second links fine, but leave redundant
text in the third and fourth.  Ideally, when the link text is identical to the
link target, as in the 3rd and 4th links, the output would not include the
text in brackets.  This is what I am having difficulty with.  How can I
incorporate such logic into my regexes?
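
To make the desired behaviour concrete, here is a minimal sketch of the logic,
not a finished solution.  It assumes PHP 7 or later; the helper name
link_to_text() is hypothetical, and someone@example.org is a placeholder
address.  It uses preg_replace_callback() to compare the link text with the
href target and adds the bracketed reference only when the two differ.

```php
<?php
// Hypothetical helper: the name and signature are illustrative only.
function link_to_text(string $html): string
{
    // Group 1 captures the href value, group 2 the visible link text.
    $pattern = '/<a\s+href="([^"]*)"[^>]*>(.*?)<\/a>/si';

    return preg_replace_callback($pattern, function (array $m): string {
        $text = $m[2];
        // Drop a leading "mailto:" so only the address is shown.
        $target = preg_replace('/^mailto:/i', '', $m[1]);
        // If the visible text already is the target, skip the brackets.
        if (strcasecmp(trim($text), $target) === 0) {
            return $text;
        }
        return $text . ' (' . $target . ')';
    }, $html);
}

// Usage with placeholder data (someone@example.org is an assumption).
$body = '<p><a href="mailto:someone@example.org">first link</a></p>
<p><a href="http://www.tao.ca/">second link</a></p>
<p><a href="http://www.tao.ca">http://www.tao.ca</a></p>
<p><a href="mailto:someone@example.org">someone@example.org</a></p>';

echo link_to_text($body), "\n";
// <p>first link (someone@example.org)</p>
// <p>second link (http://www.tao.ca/)</p>
// <p>http://www.tao.ca</p>
// <p>someone@example.org</p>
```

The comparison is case-insensitive but otherwise exact, so a link whose text
differs from its target only by a trailing slash would still get the
bracketed reference.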

Thanks for your help,

Michael Caplan
Institute for Social Ecology

1118 Maple Hill Road
Plainfield, VT, 05667 USA

PHP General Mailing List (http://www.php.net/)
