That's a good question. I know at Websense our crawler can still find those pages, and it PISSES some people off. Some people watch their logs very closely, see our crawlers hitting the page, and spaz out because they think they are the only people in the whole universe who know of that page's existence. We can find those pages if they get sent to us by our customers via an opt-in mechanism.
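To make that concrete, here is a toy sketch of how an unlinked page can still end up known to a crawler. This is only an illustration, not how our crawler or any search engine actually works; the function name `discover`, the `links` table, and the example.com URLs are all made up for the example.

```python
from collections import deque


def discover(seed_urls, links, submitted_urls=()):
    """Return every URL a crawler would learn about.

    links: dict mapping a page URL to the URLs it links to. This is a
        stand-in for fetching and parsing real pages.
    submitted_urls: URLs that arrive from a non-crawl source, for example
        an opt-in reporting mechanism (hypothetical here).
    """
    frontier = deque(seed_urls)
    frontier.extend(submitted_urls)
    seen = set()
    while frontier:
        url = frontier.popleft()
        if url in seen:
            continue
        seen.add(url)
        # Follow only the links that the page itself exposes.
        frontier.extend(links.get(url, ()))
    return seen


# Hypothetical site: /secret.html exists, but nothing links to it.
links = {
    "http://example.com/": ["http://example.com/about.html"],
    "http://example.com/about.html": [],
    "http://example.com/secret.html": [],
}

crawl_only = discover(["http://example.com/"], links)
with_submissions = discover(
    ["http://example.com/"],
    links,
    submitted_urls=["http://example.com/secret.html"],
)
```

With link-following alone the unlinked page never enters the queue. As soon as any other source hands over the URL, it gets visited like any other page.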
So if a crawler has no other source of data, then yes, that page should not get indexed. But Google, Yahoo, MSN, etc. have many, many sources of data for URLs, not just what they can discover via a crawl. They COULD (not saying they do or don't) register all the URLs they see through their toolbars, the email they host, or other data sources. If you just want to hide a page, password protect the page or directory, and even use a self-signed SSL cert to encrypt it if you're after privacy.

Thanks,
------------------------------------------
Ali Mesdaq
Security Researcher II
Websense Security Labs
http://www.WebsenseSecurityLabs.com
------------------------------------------

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of j maccraw
Sent: Tuesday, May 15, 2007 2:28 PM
To: The Hardware List
Subject: Re: [H] take off search list

If a web server does not allow directory browsing, you don't use a common name filename, and don't link to it, does it still get indexed?

Mesdaq, Ali wrote:
> Sorry for not getting back earlier I totally forgot about this. But I
> have never actually used the robots.txt file but I looked at their spec
> and you seem to have it correct. I also saw that html elements are also
> used in some cases.
