I need some advice about reading rel="me" links in arbitrary web pages using PHP. I intend to use this to help build a lifestream-style function. The basic aim is to cut down the amount of data entry the user has to do: when they give me a MyBlogLog, FriendFeed, or Plaxo Pulse page that lists links to their other profile pages, I shouldn't have to ask them for all of those links again. So:

- User gives me a URL for one of their profile pages
- Use cURL to fetch the source
- Parse the source looking for links with rel="me"
- Extract an array of link URL / link text pairs
- Do something useful with the array. (???? followed by Profit!)

I've been searching this morning for a PHP library to do the parsing and link extraction, or for PHP examples, or for an example regex to use with preg_match_all, or something/anything, without success. Since the source data is probably badly written, broken HTML, I don't think I can use XML methods, as all the XML unserialising code I've used barfs on badly formed XML. One possibility, I suppose, is to run it through HTML Tidy first, but then I run the (admittedly small) risk of HTML Tidy wiping out some of the links.
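
To make the steps listed above concrete, here is a rough sketch of one possible shape for them. It assumes PHP's bundled cURL and DOM extensions; DOMDocument::loadHTML() is generally more forgiving of broken markup than strict XML parsers, and libxml warnings are suppressed here. The helper names fetch_html and extract_rel_me_links are made up for illustration, and whether this copes with real-world profile pages is an assumption, not something verified.

```php
<?php
// Hypothetical helper: fetch a page with cURL.
// Returns the response body, or null if the request failed.
function fetch_html(string $url): ?string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,  // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,  // follow redirects
        CURLOPT_TIMEOUT        => 10,    // seconds
    ]);
    $body = curl_exec($ch);
    return is_string($body) ? $body : null;
}

// Hypothetical helper: return a list of ['url' => ..., 'text' => ...]
// for every <a href> whose rel attribute contains the token "me".
function extract_rel_me_links(string $html): array
{
    if (trim($html) === '') {
        return [];
    }

    // Collect libxml parse warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // rel is a space-separated token list, so match " me " as a whole token.
    // translate() folds "ME"/"Me" to lower case, since XPath 1.0 has no lower-case().
    $query = '//a[@href][contains(concat(" ", '
           . 'normalize-space(translate(@rel, "ME", "me")), " "), " me ")]';

    $links = [];
    $xpath = new DOMXPath($doc);
    foreach ($xpath->query($query) as $a) {
        $links[] = [
            'url'  => $a->getAttribute('href'),
            'text' => trim($a->textContent),
        ];
    }
    return $links;
}
```

Usage would be something like extract_rel_me_links(fetch_html($url) ?? ''), which gives an empty array when the fetch fails. Relative hrefs are returned as-is, so they would still need resolving against the page URL.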

So what do people use to consume XFN with PHP?

--
Julian Bond  E&MSN: julian_bond at voidstar.com  M: +44 (0)77 5907 2173
Webmaster:          http://www.ecademy.com/      T: +44 (0)192 0412 433
Personal WebLog:    http://www.voidstar.com/     skype:julian.bond?chat
                        Not Tested On Animals
_______________________________________________
microformats-discuss mailing list
microformats-discuss@microformats.org
http://microformats.org/mailman/listinfo/microformats-discuss
