You could try WebSphinx. It's a good spider with a rich set of features, and
it's free.
www.cs.cmu.edu/~rcm/websphinx
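
For what it's worth, the kind of control asked for in the original post
(follow links by URL regexp, save pages by URL regexp and MIME type) comes
down to two small predicates. Below is a rough sketch in plain Python. It is
not WebSphinx's or Harvest-NG's API; the function names, patterns and the
example.org host are all made up for illustration.

```python
import re

# Hypothetical rules; none of these names come from WebSphinx or Harvest-NG.
FOLLOW_URL = re.compile(r"^https?://example\.org/docs/")
SAVE_URL = re.compile(r"\.(html?|pdf)$", re.IGNORECASE)
SAVE_MIME = {"text/html", "application/pdf"}


def should_follow(url):
    """Crawl control: follow a link only if its URL matches the pattern."""
    return FOLLOW_URL.search(url) is not None


def should_save(url, content_type):
    """Gather control: save only if both the URL and the MIME type match."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    mime = content_type.split(";", 1)[0].strip().lower()
    return SAVE_URL.search(url) is not None and mime in SAVE_MIME
```

A crawler would presumably call should_follow() on each extracted link and
should_save() once the response headers are known.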

--

  Alexandr "Xenocid" Koloskov     |  E-Mail: mailto:[EMAIL PROTECTED]
  G5 Software                     |  WWW:    www.g5software.com

...I'm the screen, the blinding light
I'm the screen, I work at night...
DaySleeper by R.E.M.



> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]On Behalf Of Jim
> MacDiarmid
> Sent: Monday, January 22, 2001 5:22 PM
> To: [EMAIL PROTECTED]
> Subject: Re: Looking for a gatherer.
>
>
> Is there anything like this that would run on a Windows 98 or NT platform?
>
> > Jim MacDiarmid, Senior Software Engineer
> > PACEL Corp.
> > 8870 Rixlew Lane
> > Manassas, VA 20109
> > (703) 257-4759
> > FAX:  (703) 361-6706
> > www.pacel.com
> >
> >
> >
> > -----Original Message-----
> > From: Simon Wilkinson [SMTP:[EMAIL PROTECTED]]
> > Sent: Sunday, January 14, 2001 4:37 PM
> > To:   [EMAIL PROTECTED]
> > Subject:      Re: Looking for a gatherer.
> >
> > > I am looking for a spider/gatherer with the following characteristics:
> > >     * Enables the control of the crawling process by URL substring/regexp
> > >       and HTML context of the link.
> > >     * Enables the control of the gathering (i.e. saving) processes by URL
> > >       substring/regexp, MIME type, other header information and ideally by
> > >       some predicates on the HTML source.
> > >     * Some way to save page/document metadata, ideally in a database.
> > >     * Freeware, shareware or otherwise inexpensive would be nice.
> >
> > You might like to take a look at Harvest-NG, which is free software.
> > (http://webharvest.sourceforge.net/ng) It will allow all of what you
> > detail above. It saves the metadata in a Perl DBM database - some work
> > has been done, but not completed, on working with the DBI interface
> > to a remote database. You may find that some knowledge of Perl is helpful
> > in adapting it exactly to your needs (much use is made of Perl regular
> > expressions in the pattern matching, for instance).
> >
> > Cheers,
> >
> > Simon.
