Re: @lib/scrape.l questions

Luis P. Mendes Wed, 24 Jun 2015 04:08:36 -0700

2015-06-23 17:44 GMT+01:00 Alexander Burger <a...@software-lab.de>:
> Hi Luis,
>
>> I want to go through every link to perform further parsing in some
>> links, based on the first word of the title of the link, but ran into
>> some difficulties.
>>
>> Is this the right way to access each of the links?
>> : (for L *Links (cond ((=T (pre? "Quarto" (car L))) (scrape this link...
>> ...
>> I know that scrape.l is available, but it is still far too much for me
>> to understand.
>
> I would think that "@lib/scrape.l" is not so much for parsing general
> websites. It is tailored for communication with - and controlling of -
> interactive PicoLisp GUI applications, and thus rather overkill.
>
> You can directly access the contents of a site with 'client', 'from' and
> 'till'.
>
> For example, this prints every link ('href' anchor):
>
>    (client "picolisp.com" 80 "wiki/?home"
>       (while (from "<a href=\"")
>          (msg (till "<" T)) ) )
>
> Instead of the final 'msg', you could do further processing of the data,
> and/or omit the 'T' in the 'till' call to get lists of characters
> instead of strings.
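
To make the difference between the two forms of 'till' concrete, here is a
minimal sketch. It does not touch the network: 'pipe' feeds a fixed piece of
HTML to the reading code. The function name `readUrl`, its parameter `Flg`
and the example.com anchor are made up for illustration, and the sketch stops
at the closing double quote rather than at "<", so only the URL is returned.

```picolisp
# Hypothetical helper: read the first href value from a fixed HTML snippet.
# With Flg = T the result is a string, with Flg = NIL a list of characters.
(de readUrl (Flg)
   (pipe
      (prin "<a href=\"http://example.com\">Example</a>")  # child: emit HTML
      (from "<a href=\"")      # skip input up to and including the pattern
      (till "\"" Flg) ) )      # read up to (not including) the closing quote

(println (readUrl T))          # the URL as a single string
(println (readUrl))            # the same URL as a list of characters
```

The list form is convenient when the data is to be split or matched
character by character; the string form is enough when it is only passed on
or printed.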


These are beginner questions, but why doesn't this work?

(in "i.html"
   (while (from "<a href=\"")
      (parseLink (till "<" T))))

(in "i.html"
   (while (from "<a href=\"")
      (setq Tt (till "<" T))
      (parseLink Tt)))

The only change is replacing `msg` with a call to `parseLink`, yet
`parseLink` never seems to get called.


Just out of curiosity: in this expression from my previous message,
(for L *Links (cond ((=T (pre? "Quarto" (car L))) (scrape this link...

does (car L) return the same as (caar L), (caaar L), etc.?


Luis