[PHP] Blatantly Evil Question
What is the best way to cloak a site, i.e. send search engines different content than real users? Yes, I know it's bad practice, and I know the domain will eventually be banned.

I've found lots of different methods, including huge tables of all the possible client types sent by various spiders. I postulate that the simplest/fastest way to do it, and no less reliable, would be to simply consider any user whose client type includes msie, netscape, or safari to be a person, and let the rest go.

Anyone have any practical experience with success that they'd like to share? I know there are plenty of negative stories and reasons NOT to do this, but no need to take up the bandwidth with that; heard 'em already. :)

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
Re: [PHP] Blatantly Evil Question
On Aug 11, 2005, at 3:44 PM, Evert | Collab wrote:

> Use robots.txt
> 'evil' search engines will spoof the user-agent string anyway

Can you be more specific about what you mean by "use robots.txt"? I just want to cloak for Google, MSN, and Yahoo. I couldn't care less about what any other search engine (evil or not) does or sees.
Re: [PHP] Blatantly Evil Question
On Aug 11, 2005, at 4:06 PM, Evert | Collab wrote:

> First hit on google: http://www.searchengineworld.com/robots/robots_tutorial.htm
>
> Search engines check for a robots.txt on your site. In the robots.txt file you can specify that certain or all search engines shouldn't index your site.

I know what robots.txt is; I meant, how would you use that to cloak the site? Put PHP code in robots.txt to log the IP of any requests to a db, and then use that db to decide whether or not to cloak the rest of the site?
Re: [PHP] Blatantly Evil Question
Brian Dunning wrote:
> On Aug 11, 2005, at 3:44 PM, Evert | Collab wrote:
> > Use robots.txt
> > 'evil' search engines will spoof the user-agent string anyway
>
> Can you be more specific about what you mean by "use robots.txt"? I just want to cloak for Google, MSN, and Yahoo. I couldn't care less about what any other search engine (evil or not) does or sees.

robots.txt will not do what you want it to. Just sniff for those robots' User-Agents (Google, MSN and Yahoo all publish their UA strings on their websites, AFAIK) and send different content if it's one of those.

Jasper
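For reference, the User-Agent check described above can be sketched roughly as follows. This is only an illustration, not anyone's production code: the function name `isKnownCrawler` and the token list are assumptions, the tokens should be checked against what each search engine actually publishes, and the sketch assumes PHP 7 or later. As noted elsewhere in the thread, the User-Agent header is trivially spoofed, so this identifies only clients that announce themselves honestly.

```php
<?php
// Hypothetical helper: reports whether a User-Agent string contains one of
// a few crawler tokens. The token list is an illustrative assumption; verify
// it against the strings the search engines publish.
function isKnownCrawler(string $userAgent): bool
{
    $tokens = ['googlebot', 'msnbot', 'slurp'];

    foreach ($tokens as $token) {
        // stripos() does a case-insensitive substring search and returns
        // false (not 0) when the token is absent, hence the strict check.
        if (stripos($userAgent, $token) !== false) {
            return true;
        }
    }

    return false;
}
```

In a page script the input would typically come from `$_SERVER['HTTP_USER_AGENT'] ?? ''`, since the header may be missing entirely.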
Re: [PHP] Blatantly Evil Question
Jasper Bryant-Greene wrote:
> Brian Dunning wrote:
> > On Aug 11, 2005, at 3:44 PM, Evert | Collab wrote:
> > > Use robots.txt
> > > 'evil' search engines will spoof the user-agent string anyway
> >
> > Can you be more specific about what you mean by "use robots.txt"? I just want to cloak for Google, MSN, and Yahoo. I couldn't care less about what any other search engine (evil or not) does or sees.
>
> robots.txt will not do what you want it to. Just sniff for those robots' User-Agents (Google, MSN and Yahoo all publish their UA strings on their websites, AFAIK) and send different content if it's one of those.

they will hammer you for it eventually - AFAICT all major SEs send out their spiders occasionally with faked user-agent strings - to catch out crap like this.

oh and the guy that invented php is a really big cheese down at yahoo... and he reads this list :-) though I doubt he has the time or desire to chase you personally.

I would recommend you don't go down this road. it's bad for your business in the longer term and it's bad for the web because you're filling it with shite.

> Jasper
Re: [PHP] Blatantly Evil Question
Jochem Maas wrote:
> Jasper Bryant-Greene wrote:
> > robots.txt will not do what you want it to. Just sniff for those robots' User-Agents (Google, MSN and Yahoo all publish their UA strings on their websites, AFAIK) and send different content if it's one of those.
>
> they will hammer you for it eventually - AFAICT all major SEs send out their spiders occasionally with faked user-agent strings - to catch out crap like this.
>
> oh and the guy that invented php is a really big cheese down at yahoo... and he reads this list :-) though I doubt he has the time or desire to chase you personally.
>
> I would recommend you don't go down this road. it's bad for your business in the longer term and it's bad for the web because you're filling it with shite.

Of course it is, but in his original post he said that he realised that it was bad, and he didn't want to hear reasons not to do it.

I would never even attempt to do something like this on a website of my own -- as I said in an off-list email to this guy (it was OT for the list) it's going to harm his website more than help it. It's not exactly hard for the search engines to detect cloaking.

Jasper
Re: [PHP] Blatantly Evil Question
> > robots.txt will not do what you want it to. Just sniff for those robots' User-Agents (Google, MSN and Yahoo all publish their UA strings on their websites, AFAIK) and send different content if it's one of those.
>
> they will hammer you for it eventually - AFAICT all major SEs send out their spiders occasionally with faked user-agent strings - to catch out crap like this.

google adsense won't. I explicitly asked them about this. Well, what I asked was: if I had a password-protected area, could I allow them access to spider the content so that normal users could see the ads? I told them the layout would be different, but the content the same. They said that was fine.

2cents.
Re: [PHP] Blatantly Evil Question
Philip Hallstrom wrote:
> > > robots.txt will not do what you want it to. Just sniff for those robots' User-Agents (Google, MSN and Yahoo all publish their UA strings on their websites, AFAIK) and send different content if it's one of those.
> >
> > they will hammer you for it eventually - AFAICT all major SEs send out their spiders occasionally with faked user-agent strings - to catch out crap like this.
>
> google adsense won't. I explicitly asked them about this. Well, what I asked was: if I had a password-protected area, could I allow them access to spider the content so that normal users could see the ads? I told them the layout would be different, but the content the same. They said that was fine.

but you didn't ask - 'heh is it okay to fill my public page with SEO crud but only if a spider comes round'

they might just take a different view on that :-)

> 2cents.
Re: [PHP] Blatantly Evil Question
* Brian Dunning:
> On Aug 11, 2005, at 4:06 PM, Evert | Collab wrote:
> > First hit on google: http://www.searchengineworld.com/robots/robots_tutorial.htm
> >
> > Search engines check for a robots.txt on your site. In the robots.txt file you can specify that certain or all search engines shouldn't index your site.
>
> I know what robots.txt is; I meant, how would you use that to cloak the site? Put PHP code in robots.txt to log the IP of any requests to a db, and then use that db to decide whether or not to cloak the rest of the site?

If you want to dynamically determine what to disallow based on the User-Agent string, simply tell Apache, via an .htaccess file, to pass robots.txt to PHP for handling. Then have that script do the processing and return output compatible with the robots.txt specification.

--
Matthew Weier O'Phinney
Zend Certified Engineer
http://weierophinney.net/matthew/
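A rough sketch of the approach just described, under several assumptions: the script name `robots.php`, the function name `buildRobotsTxt`, the `/private/` path, and the single-token policy are all hypothetical, the rewrite rule in the comment assumes Apache with mod_rewrite enabled, and the code assumes PHP 7 or later. It shows only the mechanics of generating robots.txt output per User-Agent.

```php
<?php
// robots.php (hypothetical name). Assumes an .htaccess rule along these
// lines routes requests for /robots.txt to this script:
//   RewriteEngine On
//   RewriteRule ^robots\.txt$ robots.php [L]

// Builds a robots.txt body for the given User-Agent string.
// The policy is an illustrative assumption: clients announcing themselves
// as Googlebot may crawl everything, all others are kept out of /private/.
function buildRobotsTxt(string $userAgent): string
{
    if (stripos($userAgent, 'googlebot') !== false) {
        // An empty Disallow value means nothing is disallowed.
        return "User-agent: *\nDisallow:\n";
    }

    return "User-agent: *\nDisallow: /private/\n";
}

// Emit output only when served over the web, so the function can also be
// included and exercised from the command line.
if (PHP_SAPI !== 'cli') {
    header('Content-Type: text/plain');
    echo buildRobotsTxt($_SERVER['HTTP_USER_AGENT'] ?? '');
}
```

Keeping the policy in a plain function separates it from the request handling, so the rules can be checked without a web server.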
Re: [PHP] Blatantly Evil Question
Evert | Collab wrote:
> Let's just put it this way: if you don't want your site indexed, use robots.txt. If you want to hide your site from search engines [which won't even touch your files if you use robots.txt], check the UA string.
>
> I can't imagine a situation where you want to hide your content from the major search engines, since they all use robots.txt

You misunderstand his original question. He wants to show different content to search engines than to users. He understands this is a bad thing to do, but just wants to know how to do it anyway.

Jasper
Re: [PHP] Blatantly Evil Question
Jasper Bryant-Greene wrote:
> Jochem Maas wrote:
> > Jasper Bryant-Greene wrote:
> > > robots.txt will not do what you want it to. Just sniff for those robots' User-Agents (Google, MSN and Yahoo all publish their UA strings on their websites, AFAIK) and send different content if it's one of those.
> >
> > they will hammer you for it eventually - AFAICT all major SEs send out their spiders occasionally with faked user-agent strings - to catch out crap like this.
> >
> > oh and the guy that invented php is a really big cheese down at yahoo... and he reads this list :-) though I doubt he has the time or desire to chase you personally.
> >
> > I would recommend you don't go down this road. it's bad for your business in the longer term and it's bad for the web because you're filling it with shite.
>
> Of course it is, but in his original post he said that he realised that it was bad, and he didn't want to hear reasons not to do it.

I know - I only really replied to voice my total disdain for idiots who are filling the search engines with shite. that's bad for all of us (well those of us that use search engines - you get the impression that some people here don't know what one is ;-)

if he didn't want to hear this stuff he should have googled - there is tons of code that does this - he didn't really need the list at all to figure out how to do it.

> I would never even attempt to do something like this on a website of my own -- as I said in an off-list email to this guy (it was OT for the

you have to go pretty far to be off topic for this list ;-)

> list) it's going to harm his website more than help it. It's not exactly hard for the search engines to detect cloaking.

I concur.

> Jasper