David K. Storrs wrote on 12/09/2015 08:50 PM:
1) Is there a web-spidering package that people recommend?  I could use wget 
and then parse things from disk, but I'd like to have something that's easily 
composable into CLI scripts.

I've done a lot of Web crawling and scraping successfully with Racket and Scheme, over the last 14-15 years. I released an HTML parser ("http://www.neilvandyke.org/racket-html-parsing/"), which I still use today. From that parse, you might then extract the info you need with `sxml-match` ("http://planet.racket-lang.org/display.ss?package=sxml-match.plt&owner=jim") and/or SXPath. For HTTP, the client modules in Racket are often satisfactory, and other times I've used my own packages that implement HTTP in pure Racket or that wrap `curl` or `wget` for special requirements. For storing pages and links/metadata, there's the filesystem, the core Racket RDBMS support, and cloud stores like AWS S3. You might have to handle the un-AJAX-ing and site-specific scraping behavior yourself, if you need it. (I have a backlog of related tools to release someday.)
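
To make the parse-then-extract approach concrete, here is a minimal sketch, not a definitive implementation. It assumes the `html-parsing` and `sxml` packages are installed (for example with `raco pkg install html-parsing sxml`). The helper names `extract-links` and `fetch-page` are hypothetical, chosen only for illustration, and it uses SXPath rather than `sxml-match` for the extraction step.

```racket
#lang racket/base
;; Sketch only: assumes the `html-parsing` and `sxml` packages are installed.
(require html-parsing   ; html->xexp: permissive HTML parser producing SXML/xexp
         sxml           ; sxpath
         net/url        ; string->url, call/input-url, get-pure-port
         racket/port)   ; port->string

;; Hypothetical helper: parse an HTML string and return the href value
;; of every <a> element, in document order.
(define (extract-links html-string)
  (define doc (html->xexp html-string))
  ((sxpath '(// a @ href *text*)) doc))

;; Hypothetical helper: fetch a page with Racket's built-in HTTP client
;; and return the response body as a string (following up to 5 redirects).
(define (fetch-page url-string)
  (call/input-url (string->url url-string)
                  (lambda (u) (get-pure-port u #:redirections 5))
                  port->string))

;; Offline demonstration on a literal string; no network access needed.
(extract-links
 "<html><body><a href=\"/a\">A</a> <p><a href=\"http://example.com/b\">B</a></body></html>")
```

Composing the two, as in `(extract-links (fetch-page "http://example.com/"))` with a placeholder URL, gives the core of a spider that fits in a CLI script. Queueing, deduplication, and politeness delays are left out of this sketch.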

P.S. Fortunately, the `sxml-match` package has been preserved on the official Racket PLaneT package server :) since the author's Web site, which hosted the package home page, has gone down or disappeared.

Neil V.

