I've been playing with Plucker, trying to get it to download
http://slashdot.org/palm the way I want it to. What I'm going for is to
get the current news postings, the top-rated comments (the URL I
mentioned generates a page with the top 10), as well as any of the links in
the articles. With a MAXDEPTH of 3 I get the articles, the links in
the articles (which, for those unfamiliar, normally point to a web server
other than slashdot.org), and the page that lists the comments. The
comment pages themselves, however, are at a depth of 4, and sometimes 5
if a page is really long and is split in two. If I specify MAXDEPTH as 4,
I get the actual comments.
The problem is that I also want the off-site links, but only the page
linked to, no deeper. Those pages are mostly at depth 3, while the comment
pages are at depth 4. If I specify STAYONHOST, I get the comments but
not the off-site links, and if I don't specify it, I get 1000+
pages.
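To make the rule I'm after concrete, here is a rough Python sketch of it.
This is not Plucker code and doesn't use any Plucker option; plan_crawl,
the in-memory link graph, and every URL other than the /palm page are
made up for illustration. It assumes the home page counts as depth 1
(which matches the depths above) and that "off-site" means the host name
differs exactly from the home page's host.

```python
from collections import deque
from urllib.parse import urlsplit


def plan_crawl(start_url, links, max_depth):
    """Return {url: depth} for the pages the desired policy would fetch.

    `links` is a hypothetical in-memory link graph {url: [linked urls]};
    a real spider would fetch and parse each page instead.
    """
    home_host = urlsplit(start_url).hostname
    fetched = {start_url: 1}  # assumption: the home page is depth 1
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        depth = fetched[url]
        # Off-site pages are fetched, but their links are never followed.
        if urlsplit(url).hostname != home_host:
            continue
        # On-site pages at the depth limit are fetched but not expanded.
        if depth >= max_depth:
            continue
        for target in links.get(url, []):
            if target not in fetched:
                fetched[target] = depth + 1
                queue.append(target)
    return fetched


# Made-up link graph: home page -> article -> (off-site story, comment list)
links = {
    "http://slashdot.org/palm": ["http://slashdot.org/article1"],
    "http://slashdot.org/article1": [
        "http://example.com/story",
        "http://slashdot.org/comments1",
    ],
    "http://slashdot.org/comments1": ["http://slashdot.org/comment1a"],
    "http://example.com/story": ["http://example.com/deeper"],
}

plan = plan_crawl("http://slashdot.org/palm", links, max_depth=4)
for url, depth in sorted(plan.items(), key=lambda item: item[1]):
    print(depth, url)
```

With a limit of 4 this picks up the comment page at depth 4 and the
off-site story at depth 3, but never follows the story's own links.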
Has anyone tried something similar with slashdot and had any success?
I'm also open to using the "light" version of Slashdot, but I can't
figure out how using that would be easier. Thanks in advance for any
suggestions. I've looked at sitescooper, which supports the Plucker
format, but it doesn't seem to give me what I want, and it seems really
messy. I believe this is the right place to ask, and I apologize
if it isn't.
Thanks,
Justin