It's a good question with a very simple answer: many, many feeds out there are completely broken. Sometimes they merely don't conform to the standards, and that's the good scenario; often they have unmatched tags or unclosed attributes.
At first I tried using the xml function, but I quickly discovered that it breaks down on roughly 20% of the feeds out there. A deplorable situation, but that's the way it is.

Sorry about the file I sent you lacking items; it must be an Atom feed, not RSS. In that case you look for <entry>...</entry> instead, but be careful, because that format allows attributes in the tag, i.e. <entry attr="attr">...</entry>.

I have attached my current rss.l, which is able to parse all of the 800+ feeds I subscribe to. Note that I do use (xml) for the OPML format: these are files containing my subscriptions, which a feed reader should be able to import and export; my reader can currently import them. The reason I can use (xml) there is that the two readers mine can currently import from are Google Reader and the desktop app called simply FeedReader, and at least these two manage to export valid XML files.

/Henrik

On Sun, Nov 1, 2009 at 1:25 PM, Alexander Burger <a...@software-lab.de> wrote:
> Hi Henrik,
>
>> The problem is using from in combination with till repeatedly to parse
>> input in order to for instance get at the contents of the <item></item>
>> elements, there is a twist though, the contents can contain more
>> markup so a check is needed every time till encounters for instance <,
>> if that one is to be used as a stop char.
>
> This is indeed a bit tedious, because we would need to manually collect
> strings and match them until the proper patterns are found.
>
>
> But before we start doing that: I'm wondering why this should be
> necessary. Can't we just use the 'xml' function? It was written for
> that purpose after all (though it is also based on 'from' and 'till'):
>
>    (load "lib/xml.l")
>    (setq Lst (in "rss.xml" (and (xml?) (xml))))
>
> Now 'Lst' contains the whole XML tree, which can be handled easily with
> Lisp functions.
>
>
> For example, to collect all <item> expressions nested somewhere in that
> list, you could use 'fish'
>
>    (fish '((L) (== 'item (car L))) Lst)
>
> Actually, the sample "rss.xml" you've attached does not seem to contain
> any 'item' tags. But if I try 'author'
>
>    (fish '((L) (== 'author (car L))) Lst)
>
> I get a long result list.
>
> To inspect it conveniently, I usually do
>
>    (more (fish '((L) (== 'author (car L))) Lst) pretty)
>
> Cheers,
> - Alex
> --
> UNSUBSCRIBE: mailto:picol...@software-lab.de?subject=unsubscribe
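
For illustration, the tolerant scanning discussed in this thread (find <item> or <entry>, allow attributes in the opening tag, then collect everything up to the matching close tag without parsing the markup inside) could be sketched roughly as follows. This is not the code from the attached rss.l: the function name feedItems and its structure are assumptions made up for this example, and it will still be fooled by a ">" inside an attribute value or by nested elements of the same name.

```picolisp
# Hypothetical helper, not taken from rss.l: return a list with the raw
# text of every <item>...</item> or <entry ...>...</entry> found in File.
(de feedItems (File)
   (in File
      (make
         (while (from "<item" "<entry")
            # 'End' holds the reversed characters of "</item>" or "</entry>"
            (let (End (reverse (chop (pack "</" (cdr (chop @)) ">")))
                  Buf NIL )
               # Only accept the exact tag name, not e.g. <items>
               (when (member (peek) '(" " ">" "^I" "^J" "^M"))
                  (till ">")   # Skip attributes such as attr="attr"
                  (char)       # Consume the ">" itself
                  # Push characters (newest first) until the close tag
                  # sits at the head of 'Buf', or the input ends
                  (until (or (eof) (head End Buf))
                     (push 'Buf (char)) )
                  (link
                     (pack
                        (flip
                           (if (head End Buf)
                              (nth Buf (inc (length End)))  # Drop close tag
                              Buf ) ) ) ) ) ) ) ) ) )
```

Calling (feedItems "rss.xml") would then give one string per element, with any inner markup left untouched, so broken tags inside an item do not stop the scan. An element that is never closed simply runs to the end of the input.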