On Thu, 13 Jun 2002, Bill Janssen wrote:

> > That was going to be another question.  What is the advantage of using
> > Sitescooper if you are already using Plucker?
> 
> In general, Plucker has focussed on producing good documents from
> available Web pages, rather than on all the technicalities of fetching
> Web pages and the many impediments placed in the way of that
> fetching.  Sitescooper, on the other hand, has put much more work
> into fetching remote pages and storing them locally.

Also, Sitescooper has a mature library of site URLs and post-processing
scripts, which Plucker is still in the process of building.

Sitescooper also lets you specify in a sitefile which bits and pieces
should be stripped from the fetched pages.

For instance :-

URL:            http://www.mozillazine.org/contents.rdf
Name:           MozillaZine
Description:    Your source for Mozilla news, advocacy, interviews, builds, and more!
ContentsFormat: rss

StoryURL:       /talkback\.html\?article=\d+

# You may also want to add a StoryStart and StoryEnd line to
# clean up the stories. Here's sample lines (you need to edit them):
#
StoryStart: --features--
StoryEnd: form method="post" action
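
As a rough illustration of what StoryURL, StoryStart and StoryEnd do with
those values, here is a small Python sketch.  It is not Sitescooper's own
code (Sitescooper is written in Perl); the function names extract_story and
is_story_url are made up for the example, and it assumes the three values
are treated as regular expressions and that the matched start and end
markers themselves are dropped from the story.

```python
import re


def extract_story(page, story_start, story_end):
    """Return the text between the first story_start match and the next
    story_end match (assumed behaviour, for illustration only)."""
    start = re.search(story_start, page)
    # If the start marker is missing, keep everything from the top.
    begin = start.end() if start else 0
    end = re.search(story_end, page[begin:])
    # If the end marker is missing, keep everything to the bottom.
    finish = begin + end.start() if end else len(page)
    return page[begin:finish]


def is_story_url(path, story_url=r"/talkback\.html\?article=\d+"):
    """Check whether a link path matches the StoryURL pattern in full."""
    return re.fullmatch(story_url, path) is not None


if __name__ == "__main__":
    page = 'nav --features-- <p>Story</p> <form method="post" action="/x">'
    print(extract_story(page, "--features--", 'form method="post" action'))
    print(is_story_url("/talkback.html?article=2345"))
```

With patterns like these, the navigation above the "--features--" marker and
the talkback form below the story would both be thrown away.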

Cheers,   Andy!
