You could try WebSphinx. It's a good spider with a rich set of features, and it's free: www.cs.cmu.edu/~rcm/websphinx
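Whichever spider you pick, the rules asked for in the original request quoted below (crawl control by URL regexp and link context, saving by URL regexp and MIME type, metadata kept in a database) come down to a few small predicates. Here is a minimal sketch in Python. It is not WebSphinx's or Harvest-NG's API: the patterns, the function names, the example.com URLs and the table layout are all made up for illustration, and SQLite merely stands in for whatever database you prefer.

```python
import re
import sqlite3

# Hypothetical rules for illustration; not taken from WebSphinx or Harvest-NG.
CRAWL_URL = re.compile(r"^https?://www\.example\.com/docs/")  # where to crawl
SAVE_URL = re.compile(r"\.(html?|pdf)$")                      # what to keep
SAVE_MIME = {"text/html", "application/pdf"}


def should_crawl(url, anchor_text):
    """Follow a link only if its URL matches and its link text is acceptable."""
    return bool(CRAWL_URL.search(url)) and "logout" not in anchor_text.lower()


def should_save(url, mime_type):
    """Save a fetched document only if both URL pattern and MIME type match."""
    return bool(SAVE_URL.search(url)) and mime_type in SAVE_MIME


def open_store(path=":memory:"):
    """Open (or create) the metadata table; in-memory by default."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS pages "
        "(url TEXT PRIMARY KEY, mime TEXT, title TEXT)"
    )
    return db


def record(db, url, mime_type, title):
    """Store one row of page metadata, replacing any earlier row for the URL."""
    db.execute(
        "INSERT OR REPLACE INTO pages VALUES (?, ?, ?)",
        (url, mime_type, title),
    )
    db.commit()
```

A crawler loop would call should_crawl() on each extracted link, should_save() on each response, and record() for every document it keeps.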
-- 
Alexandr "Xenocid" Koloskov | E-Mail: mailto:[EMAIL PROTECTED]
G5 Software | WWW: www.g5software.com

...I'm the screen, the blinding light
I'm the screen, I work at night...
DaySleeper by R.E.M.

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Jim MacDiarmid
> Sent: Monday, January 22, 2001 5:22 PM
> To: [EMAIL PROTECTED]
> Subject: Re: Looking for a gatherer.
>
> Is there anything like this that would run on a Windows 98 or NT platform?
>
> Jim MacDiarmid, Senior Software Engineer
> PACEL Corp.
> 8870 Rixlew Lane
> Manassas, VA 20109
> (703) 257-4759
> FAX: (703) 361-6706
> www.pacel.com
>
> > -----Original Message-----
> > From: Simon Wilkinson [SMTP:[EMAIL PROTECTED]]
> > Sent: Sunday, January 14, 2001 4:37 PM
> > To: [EMAIL PROTECTED]
> > Subject: Re: Looking for a gatherer.
> >
> > > I am looking for a spider/gatherer with the following characteristics:
> > > * Enables the control of the crawling process by URL substring/regexp
> > >   and HTML context of the link.
> > > * Enables the control of the gathering (i.e. saving) processes by URL
> > >   substring/regexp, MIME type, other header information and ideally by
> > >   some predicates on the HTML source.
> > > * Some way to save page/document metadata, ideally in a database.
> > > * Freeware, shareware or otherwise inexpensive would be nice.
> >
> > You might like to take a look at Harvest-NG, which is free software.
> > (http://webharvest.sourceforge.net/ng) It will allow all of what you
> > detail above. It saves the metadata in a Perl DBM database - some work
> > has been done, but not completed, on working with the DBI interface
> > to a remote database. You may find that some knowledge of Perl is helpful
> > in adapting it exactly to your needs (much use is made of Perl regular
> > expressions in the pattern matching, for instance).
> >
> > Cheers,
> >
> > Simon.