2015-06-23 17:44 GMT+01:00 Alexander Burger <a...@software-lab.de>:
> Hi Luis,
>
>> I want to go through every link to perform further parsing in some
>> links, based on the first word of the title of the link, but ran into
>> some difficulties.
>>
>> Is this the right way to access each of the links?
>> : (for L *Links (cond ((=T (pre? "Quarto" (car L))) (scrape this link...
>> ...
>> I know that scrape.l is available, but still a ton too much for me to
>> understand it.
>
> I would think that "@lib/scrape.l" is not so much for parsing general
> websites. It is tailored for communication with - and controlling of -
> interactive PicoLisp GUI applications, and thus rather overkill.
>
> You can directly access the contents of a site with 'client', 'from' and
> 'till'.
>
> For example, this prints every link ('href' anchor):
>
>    (client "picolisp.com" 80 "wiki/?home"
>       (while (from "<a href=\"")
>          (msg (till "<" T)) ) )
>
> Instead of the final 'msg', you could do further processing of the data,
> and/or omit the 'T' in the 'till' call to get lists of characters
> instead of strings.

These are beginner questions, but why doesn't this work?

   (in "i.html"
      (while (from "<a href=\"")
         (parseLink (till "<" T)) ) )

   (in "i.html"
      (while (from "<a href=\"")
         (setq Tt (till "<" T))
         (parseLink Tt) ) )

It's just a substitution of `msg` with a call to `parseLink`, which doesn't
seem to get called.

Just out of curiosity, in my previous message, in

   (for L *Links (cond ((=T (pre? "Quarto" (car L))) (scrape this link...

does (car L) return the same as (caar L), (caaar L), etc.?

Luis