Hi,

Sure, just use WWW::Search, which runs in Perl on both Linux and Windows. It's a great tool. Check it out at http://cpan.perl.org; I think it's at http://www.freshmeat.net as well.
in His grip,
Gil Vidals / CEO
http://www.truepath.com
your Christ-centered web host

On Wed, 10 May 2000 14:26:25 -0700 Avi Rappoport <[EMAIL PROTECTED]> wrote:
> There are several Java spider products available, some with source
> code. Try the listings at
> <http://www.searchtools.com/tools/tools-java.html>
>
> At 4:14 PM -0400 5/9/2000, Corey Wineman wrote:
> >Hello,
> >
> >I have just joined this mailing list. I haven't seen any messages and don't
> >know if anyone is listening.
> >Anyway, I have been working on a webspider for my company for some time now.
> >I inherited much of the code from a previous employee. It is written
> >completely in Java, and I have spent a long time trying to make it run
> >properly. It is still plagued with memory leaks and other networking
> >problems. The biggest problem has been dealing with threading, recognizing
> >blackholes and keeping track of a huge number of nodes.
> >What I want to do is traverse through a site and do processing on certain
> >files, storing the results (things like: if the file meets certain
> >criteria, what is the IP of the site, when did I visit the site) to a
> >database. I would like to be able to configure the spider: limiting the
> >depth from a source URL, limiting the depth it will search onto external
> >sites, and setting the defaults on various timeouts.
> >
> >Does anyone know of a webspider that does some of these things and is
> >available along with the source code?
> >
> >Thanks,
> >Corey
>
> --
> ________________________________________________________________
> The Complete Guide to Site Indexing and Local Search Engines
> <mailto:[EMAIL PROTECTED]>  <http://www.searchtools.com>

Gil Vidals
(877) TOP-GEEK (877.867-4335)
http://PositionGeek.com
search engine positioning that works
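
P.S. For what it's worth, the configurable limits Corey asks about (depth from the source URL, depth onto external sites, timeouts) can be sketched in a few lines. This is only a rough illustration in Python, not taken from WWW::Search or any of the Java spiders listed above. The names CrawlConfig, crawl and fetch_links are made up for the example, and the fetcher is assumed to be supplied by the caller, so nothing here touches the network or a database.

```python
from collections import deque
from dataclasses import dataclass
from urllib.parse import urljoin, urlparse


@dataclass
class CrawlConfig:
    # All three fields are hypothetical settings for this sketch.
    max_depth: int = 3             # link hops allowed from the start URL
    max_external_depth: int = 1    # consecutive hops allowed off the start host
    timeout_seconds: float = 10.0  # passed to the fetcher; not enforced here


def crawl(start_url, fetch_links, config):
    """Breadth-first traversal; returns the visited URLs in visit order.

    fetch_links(url, timeout) is a caller-supplied function returning the
    href values found on the page at `url`.
    """
    start_host = urlparse(start_url).netloc
    seen = {start_url}                  # never queue the same URL twice
    queue = deque([(start_url, 0, 0)])  # (url, depth, external depth)
    visited = []

    while queue:
        url, depth, ext_depth = queue.popleft()
        visited.append(url)
        if depth >= config.max_depth:
            continue  # at the depth limit: visit the page but do not expand it

        for href in fetch_links(url, config.timeout_seconds):
            link = urljoin(url, href)   # resolve relative links
            parts = urlparse(link)
            if parts.scheme not in ("http", "https"):
                continue                # skip mailto:, ftp:, and so on
            # Count consecutive hops spent away from the start host.
            next_ext = ext_depth + 1 if parts.netloc != start_host else 0
            if next_ext > config.max_external_depth or link in seen:
                continue
            seen.add(link)
            queue.append((link, depth + 1, next_ext))

    return visited
```

The seen set only stops the spider from revisiting a URL it already knows. Blackholes that keep generating fresh URLs are bounded by max_depth rather than detected, so a real spider would probably want something smarter there. Threading and the database writes are left out entirely.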
