That URL is incorrect. This is the correct one: http://sourceforge.net/projects/jcrawler/
Otis

P.S.
Hitting "Reply All" put [EMAIL PROTECTED] on the 'To' line twice. Is this a list setup problem maybe?

--- Otis Gospodnetic <[EMAIL PROTECTED]> wrote:
> I guess you could then try JCrawler
> (http://jcrawler.sourceforge.net/ I believe).
>
> JCrawler uses the WebSphinx package.
>
> Otis
>
>
> --- "Alexandr \"Xenocid\" Koloskov" <[EMAIL PROTECTED]> wrote:
> > You could try a WebSphinx. It's a good spider with rich set of
> > features and it's free.
> > www.cs.cmu.edu/~rcm/websphinx
> >
> > --
> >
> > Alexandr "Xenocid" Koloskov | E-Mail: mailto:[EMAIL PROTECTED]
> > G5 Software                 | WWW: www.g5software.com
> >
> > ...I'm the screen, the blinding light
> > I'm the screen, I work at night...
> > DaySleeper by R.E.M.
> >
> > > -----Original Message-----
> > > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]On Behalf Of
> > > Jim MacDiarmid
> > > Sent: Monday, January 22, 2001 5:22 PM
> > > To: [EMAIL PROTECTED]
> > > Subject: Re: Looking for a gatherer.
> > >
> > >
> > > Is there anything like this that would run on a Windows 98 or NT
> > > platform?
> > >
> > > > Jim MacDiarmid, Senior Software Engineer
> > > > PACEL Corp.
> > > > 8870 Rixlew Lane
> > > > Manassas, VA 20109
> > > > (703) 257-4759
> > > > FAX: (703) 361-6706
> > > > www.pacel.com
> > > >
> > > > -----Original Message-----
> > > > From: Simon Wilkinson [SMTP:[EMAIL PROTECTED]]
> > > > Sent: Sunday, January 14, 2001 4:37 PM
> > > > To: [EMAIL PROTECTED]
> > > > Subject: Re: Looking for a gatherer.
> > > >
> > > > > I am looking for a spider/gatherer with the following
> > > > > characteristics:
> > > > > * Enables the control of the crawling process by URL
> > > > >   substring/regexp and HTML context of the link.
> > > > > * Enables the control of the gathering (i.e. saving) processes
> > > > >   by URL substring/regexp, MIME type, other header information
> > > > >   and ideally by some predicates on the HTML source.
> > > > > * Some way to save page/document metadata, ideally in a
> > > > >   database.
> > > > > * Freeware, shareware or otherwise inexpensive would be nice.
> > > >
> > > > You might like to take a look at Harvest-NG, which is free
> > > > software. (http://webharvest.sourceforge.net/ng) It will allow
> > > > all of what you detail above. It saves the metadata in a Perl
> > > > DBM database - some work has been done, but not completed, on
> > > > working with the DBI interface to a remote database. You may
> > > > find that some knowledge of Perl is helpful in adapting it
> > > > exactly to your needs (much use is made of Perl regular
> > > > expressions in the pattern matching, for instance).
> > > >
> > > > Cheers,
> > > >
> > > > Simon.

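For anyone comparing the tools mentioned in this thread, the crawl and gather controls from the original request (URL substring/regexp, MIME type) can be sketched roughly as below. This is only an illustration in Python, not how JCrawler, WebSphinx, or Harvest-NG actually implement it; the `GatherRules` class, its field names, and the example.org URLs are all made up for the sketch.

```python
import re
from dataclasses import dataclass, field
from typing import List, Pattern, Set


@dataclass
class GatherRules:
    """Hypothetical rule set: which URLs to follow and which pages to save."""
    crawl_patterns: List[Pattern[str]] = field(default_factory=list)
    save_patterns: List[Pattern[str]] = field(default_factory=list)
    save_mime_types: Set[str] = field(default_factory=set)

    def should_crawl(self, url: str) -> bool:
        # Follow a link only if its URL matches at least one crawl regexp.
        return any(p.search(url) for p in self.crawl_patterns)

    def should_save(self, url: str, content_type: str) -> bool:
        # Drop parameters such as "; charset=utf-8" before comparing.
        mime = content_type.split(";", 1)[0].strip().lower()
        url_ok = any(p.search(url) for p in self.save_patterns)
        return url_ok and mime in self.save_mime_types


# Assumed example configuration: stay under /docs/ and keep only HTML pages.
rules = GatherRules(
    crawl_patterns=[re.compile(r"^http://example\.org/docs/")],
    save_patterns=[re.compile(r"\.html?$")],
    save_mime_types={"text/html"},
)
```

A crawler loop would call `should_crawl` on each extracted link before queueing it, and `should_save` once the response headers are known. Predicates on the HTML source and storing metadata in a database are left out of the sketch.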