Re: Looking for a gatherer.

Otis Gospodnetic Tue, 23 Jan 2001 06:20:50 -0800

That URL is incorrect.
This is the correct one:
http://sourceforge.net/projects/jcrawler/


Otis
P.S.
Hitting "Reply All" put [EMAIL PROTECTED] on the 'To'
line twice. Is this a list setup problem maybe?


--- Otis Gospodnetic <[EMAIL PROTECTED]> wrote:
> I guess you could then try JCrawler
> (http://jcrawler.sourceforge.net/ I believe).
>
> JCrawler uses the WebSphinx package.
>
> Otis
>
>
> --- "Alexandr \"Xenocid\" Koloskov" <[EMAIL PROTECTED]> wrote:
> > You could try WebSphinx. It's a good spider with a
> > rich set of features, and it's free.
> > www.cs.cmu.edu/~rcm/websphinx
> >
> > --
> >
> >   Alexandr "Xenocid" Koloskov     |  E-Mail: mailto:[EMAIL PROTECTED]
> >   G5 Software                     |  WWW:    www.g5software.com
> >
> > ...I'm the screen, the blinding light
> > I'm the screen, I work at night...
> > DaySleeper by R.E.M.
> >
> >
> >
> > > -----Original Message-----
> > > From: [EMAIL PROTECTED]
> > > [mailto:[EMAIL PROTECTED]] On Behalf Of Jim MacDiarmid
> > > Sent: Monday, January 22, 2001 5:22 PM
> > > To: [EMAIL PROTECTED]
> > > Subject: Re: Looking for a gatherer.
> > >
> > >
> > > Is there anything like this that would run on a Windows 98 or NT platform?
> > >
> > > > Jim MacDiarmid, Senior Software Engineer
> > > > PACEL Corp.
> > > > 8870 Rixlew Lane
> > > > Manassas, VA 20109
> > > > (703) 257-4759
> > > > FAX:  (703) 361-6706
> > > > www.pacel.com
> > > >
> > > >
> > > >
> > > > -----Original Message-----
> > > > From: Simon Wilkinson [SMTP:[EMAIL PROTECTED]]
> > > > Sent: Sunday, January 14, 2001 4:37 PM
> > > > To:   [EMAIL PROTECTED]
> > > > Subject:      Re: Looking for a gatherer.
> > > >
> > > > > I am looking for a spider/gatherer with the following characteristics:
> > > > >     * Enables the control of the crawling process by URL substring/regexp
> > > > > and HTML context of the link.
> > > > >     * Enables the control of the gathering (i.e. saving) processes by URL
> > > > > substring/regexp, MIME type, other header information and ideally by some
> > > > > predicates on the HTML source.
> > > > >     * Some way to save page/document metadata, ideally in a database.
> > > > >     * Freeware, shareware or otherwise inexpensive would be nice.
> > > >
> > > > You might like to take a look at Harvest-NG, which is free software.
> > > > (http://webharvest.sourceforge.net/ng) It will allow all of what you
> > > > detail above. It saves the metadata in a Perl DBM database - some work
> > > > has been done, but not completed, on working with the DBI interface
> > > > to a remote database. You may find that some knowledge of Perl is helpful
> > > > in adapting it exactly to your needs (much use is made of Perl regular
> > > > expressions in the pattern matching, for instance).
> > > >
> > > > Cheers,
> > > >
> > > > Simon.
>
>
> __________________________________________________
> Do You Yahoo!?
> Yahoo! Auctions - Buy the things you want at great
> prices.
> http://auctions.yahoo.com/
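
As a rough illustration of the filtering asked for in the original request
quoted above (control crawling by URL substring/regexp, control saving by URL
pattern and MIME type), here is a minimal Python sketch. It is not taken from
Harvest-NG, WebSphinx, or JCrawler; the class name GatherRules, its fields,
and the example.org URLs are all made up for illustration.

```python
import re
from dataclasses import dataclass
from typing import List, Pattern, Set


@dataclass
class GatherRules:
    """Hypothetical rule set; not the API of any crawler named in this thread."""
    follow_patterns: List[Pattern[str]]  # a link is crawled if any pattern matches its URL
    save_patterns: List[Pattern[str]]    # a page is saved only if a pattern matches its URL
    save_mime_types: Set[str]            # ... and its MIME type is in this set

    def should_follow(self, url: str) -> bool:
        """Crawl control: follow the link only if its URL matches a pattern."""
        return any(p.search(url) for p in self.follow_patterns)

    def should_save(self, url: str, content_type: str) -> bool:
        """Gather control: save only if both the URL and the MIME type qualify."""
        # Drop parameters such as "; charset=..." from the Content-Type value.
        mime = content_type.split(";", 1)[0].strip().lower()
        url_ok = any(p.search(url) for p in self.save_patterns)
        return url_ok and mime in self.save_mime_types


# Example rules (assumed values): stay under /docs/, keep only HTML and PDF.
rules = GatherRules(
    follow_patterns=[re.compile(r"^https?://example\.org/docs/")],
    save_patterns=[re.compile(r"\.(html?|pdf)$")],
    save_mime_types={"text/html", "application/pdf"},
)
```

The "HTML context of the link", the predicates on the HTML source, and the
metadata storage from the request are not covered by this sketch.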

