I started with this approach yesterday, at first just to capture the
feed type, which I am now able to do.

I noticed that some RSS feeds have attributes in their <item> tags,
so matching the literal "<item>" won't work 100% of the time.

(in "rss.xml"
   (while (from "<item")               # Match "<item" without the closing '>'
      (println
         (make
            (loop
               (NIL (chain (till ">")))
               (char)
               (T (tail '`(chop "item") @)) ) ) ) ) )

I think this will reliably find the opening <item> tag, but then we
need some way of discarding the attributes and the closing >. I tried
an immediate (till ">") after the (from), but it didn't have the
intended result. Any suggestions?
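
One possibility, as an untested sketch rather than a confirmed fix: after
(from "<item") the input is positioned just past "<item", so a (till ">")
should read whatever attributes are there, but it leaves the ">" itself
unread. A following (char) would consume it. This assumes no attribute
value contains a ">" and that nothing else in the feed starts with "<item"
(such as a hypothetical <items> tag).

```picolisp
(in "rss.xml"
   (while (from "<item")
      (till ">")                        # Read and discard any attributes
      (char)                            # Consume the closing '>'
      (println
         (make
            (loop
               (NIL (chain (till ">"))) # Collect until next tag
               (char)                   # Skip '>'
               (T (tail '`(chop "item") @)) ) ) ) ) )
```

If the earlier attempt had the (till ">") without the (char), the first
pass of the inner loop would stop right away at the pending ">" and
collect nothing, which might explain the odd result.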


On Sun, Nov 1, 2009 at 6:26 PM, Alexander Burger <a...@software-lab.de> wrote:
> On Sun, Nov 01, 2009 at 01:49:59PM +0100, Henrik Sarvell wrote:
>> It's a good question with a very simple answer, many many feeds out
>> there are completely broken, sometimes they don't conform to
>> standards, that's a good scenario but often they have unmatched tags
>> or unclosed attributes.
> Ouch. I see.
> So what do you think about the following:
> (while (from "<item>")
>    (println                               # Instead of printing
>       (make                               # do further matching
>          (loop
>             (NIL (chain (till ">")))      # Collect until next tag
>             (char)                        # Skip '>'
>             (T (tail '`(chop "item") @)) ) ) ) )  # See if we got <item>
> The 'make' will give you smaller chunks of data, which are easier to
> 'match'.
> Cheers,
> - Alex
> --
> UNSUBSCRIBE: mailto:picol...@software-lab.de?subject=unsubscribe
