Let me add:

Please, for the love of Pete, respect the directives in the robots.txt file.
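
For anyone rolling their own spider, here is a rough sketch of what that check could look like. It is only an illustration in Python using the standard library's urllib.robotparser; the sample robots.txt content, the "MySpider" user agent, the example.com URLs, and the filter_allowed helper are all made up for the example and are not from any particular product.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content. A real spider would fetch this from
# http://<host>/robots.txt before requesting anything else on that host.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /cgi-bin/
"""


def filter_allowed(robots_txt, user_agent, urls):
    """Return only the URLs that robots_txt permits user_agent to fetch."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if parser.can_fetch(user_agent, url)]


if __name__ == "__main__":
    # Assumed candidate URLs, for illustration only.
    candidates = [
        "http://www.example.com/index.cfm",
        "http://www.example.com/private/report.cfm",
    ]
    for url in filter_allowed(SAMPLE_ROBOTS_TXT, "MySpider", candidates):
        print(url)
```

Running the candidate list through a filter like this before making any requests keeps the spider out of the paths the site owner has disallowed.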

> -----Original Message-----
> From: Jim Davis [mailto:[EMAIL PROTECTED]]
> Sent: Wednesday, February 05, 2003 10:45 AM
> To: CF-Talk
> Subject: RE: Remotely spider websites
> 
> 
> > -----Original Message-----
> > From: Jillian Carroll [mailto:[EMAIL PROTECTED]] 
> > Sent: Wednesday, February 05, 2003 10:18 AM
> > To: CF-Talk
> > Subject: Remotely spider websites
> > 
> > 
> > From what I have been able to find, the Verity spider will only
> > spider sites on a single network domain... so a remote machine must
> > reside on the same domain.  What I'm wondering is if anybody
> 
> Yup.  Verity (and MS Index Server) are what's called "Worms" - they
> "burrow" through the file system of the machine itself.  "Spiders", on
> the other hand, "walk the web" and see only what's public through HTTP.
> 
> There are pros and cons to both, of course, but that's the main
> difference.
> 
> > can recommend (or has written) a tool that will let me spider remote
> > sites in a CF-friendly manner?
> 
> There are a ton, but I don't know how well supported they are (or if
> they even exist any longer).  Most of the major search engines have a
> "home game" version of their software (either as a software package or
> as a service they offer):
> 
> Alta Vista: http://solutions.altavista.com/
> 
> Google: http://www.google.com/services/
> 
> Lycos: http://enterprise.lycos.com/Search/SiteSearch.asp
> 
> You can dig up a lot more of these types of services by visiting
> http://www.searchengines.com/.  Unfortunately there's no specific
> content directed towards DIY, but they do have the largest collection
> of public engines around.
> 
> Now the site to go for more general information is
> http://www.searchenginewatch.com/
> 
> This is probably the most relevant page for you:
> 
> http://www.searchenginewatch.com/resources/software.html
> 
> But there's a lot more material on the site aimed at webmasters.
> 
> Hope this helps,
> 
> Jim Davis
> President, Depressed Press of Boston: http://www.DepressedPress.com/
> Webmaster, First Night Boston:  http://www.firstnight.org/
> Senior Consultant, Metlife eCommerce IT:  http://www.metlife.com/ 
> 
> 
> 