Re: [Flightgear-devel] http://wiki.flightgear.org/robots.txt
Hi Guys,

Is the syntax of the robots.txt correct? Could be wrong.

To my knowledge this is what google likes,

User-agent: *
Disallow: /

User-agent: Googlebot
Allow: /

Happy flying!
Rob

On 02/18/2010 02:03 AM, John Denker wrote:
> On 02/17/2010 04:54 PM, Jon Stockill wrote:
>> Presumably because there are some truly awful bots out there, and google at least is known to be well behaved.
>
> But the truly awful bots don't look at robots.txt. In fact one of the easiest ways to catch rogue bots is to disallow a small part of the site and then blacklist anybody who goes there.

--
Download Intel® Parallel Studio Eval
Try the new software tools for yourself. Speed compiling, find bugs proactively, and fine-tune applications for parallel performance. See why Intel Parallel Studio got high marks during beta.
http://p.sf.net/sfu/intel-sw-dev
___
Flightgear-devel mailing list
Flightgear-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/flightgear-devel
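The trap John describes (disallow a small part of the site, then blacklist anybody who goes there) could be sketched roughly as follows. This is only an illustration under stated assumptions: the trap path `/bot-trap/`, the Common Log Format layout, and the sample log lines are all made up for the example and do not come from the wiki's actual setup.

```python
import re

# Hypothetical trap path; it would also be listed under "Disallow:" in robots.txt,
# so a well-behaved crawler never requests it.
TRAP_PREFIX = "/bot-trap/"

# Pulls the client address and request path out of a Common Log Format line
# (the log format is an assumption for this sketch).
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(?:GET|HEAD|POST) (\S+)')


def find_rogue_clients(log_lines, trap_prefix=TRAP_PREFIX):
    """Return the set of client addresses that requested anything under trap_prefix."""
    rogue = set()
    for line in log_lines:
        match = LOG_LINE.match(line)
        if match and match.group(2).startswith(trap_prefix):
            rogue.add(match.group(1))
    return rogue


# Invented sample data using documentation-only addresses (192.0.2.0/24).
sample_log = [
    '192.0.2.10 - - [18/Feb/2010:02:03:00 +0000] "GET /Main_Page HTTP/1.1" 200 5120',
    '192.0.2.66 - - [18/Feb/2010:02:03:05 +0000] "GET /bot-trap/index.html HTTP/1.1" 200 128',
    '192.0.2.10 - - [18/Feb/2010:02:03:09 +0000] "GET /robots.txt HTTP/1.1" 200 64',
]

print(sorted(find_rogue_clients(sample_log)))
```

The returned addresses would then be fed to whatever blocking mechanism the server uses; that part is deliberately left out, since it depends on the hosting setup.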
Re: [Flightgear-devel] http://wiki.flightgear.org/robots.txt
On 02/18/2010 04:07 AM, Rob / EViLSLuT wrote:
> Is the syntax of the robots.txt correct? Could be wrong.

Well, technically, it should say Googlebot instead of just Google. But this is such a common mistake that Googlebot answers to the name Google, and no harm is done.

> To my knowledge this is what google likes,
>
> User-agent: *
> Disallow: /
>
> User-agent: Googlebot
> Allow: /

That's not the recommended form. According to
  http://www.robotstxt.org/robotstxt.html
there is no Allow: directive. Certainly there is no advantage to saying
  Allow: /
... and no disadvantage to using the canonical form
  Disallow:
which disallows nothing.

There are situations where an Allow: directive would be helpful, but this is not one of them.

Also, due to differences in opinion as to the interpretation of the robots.txt non-standard, it is a bit unpredictable whether bots will respond to the first match or best match ... so it is good practice to put more-specific directives ahead of less-specific ones. In particular, the * wildcard should be last, as it is currently on the site.

In any case, the larger point remains: There are plenty of perfectly reasonable, desirable bots that are being excluded by the current file. Conversely there are plenty of truly horrible bots that will never be excluded by any robots.txt file.
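The first-match versus best-match point can be illustrated with a small sketch. Both selection functions below are simplified models written for this example, not the behaviour of any particular crawler; the substring matching rule and the "longest pattern wins" tie-break are assumptions.

```python
def matches(pattern, agent):
    """A record applies if its pattern is '*' or a case-insensitive substring of the agent name."""
    return pattern == "*" or pattern.lower() in agent.lower()


def first_match(records, agent):
    """Model of a bot that obeys the first record, in file order, that applies to it."""
    for pattern, rules in records:
        if matches(pattern, agent):
            return rules
    return None


def best_match(records, agent):
    """Model of a bot that obeys the most specific applicable record ('*' is least specific)."""
    candidates = [
        (0 if pattern == "*" else len(pattern), rules)
        for pattern, rules in records
        if matches(pattern, agent)
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda candidate: candidate[0])[1]


# Ordering proposed in the quoted message: wildcard record first.
wildcard_first = [("*", "Disallow: /"), ("Googlebot", "Allow: /")]

# Ordering currently on the site: specific record first, wildcard last.
specific_first = [("Google", "Disallow:"), ("*", "Disallow: /")]

for name, records in (("wildcard first", wildcard_first), ("specific first", specific_first)):
    print(name, "|", first_match(records, "Googlebot"), "|", best_match(records, "Googlebot"))
```

With the wildcard first, the two models disagree about what Googlebot may do; with the specific record first, they agree, which is the reason for keeping the * record last.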
Re: [Flightgear-devel] http://wiki.flightgear.org/robots.txt
Excessive traffic? The wiki has been getting 503 all the time lately.

> http://wiki.flightgear.org/robots.txt
>
> User-agent: Google
> Disallow:
>
> User-agent: *
> Disallow: /
>
> #User-agent: Slurp
> #Crawl-delay: 5
> #Disallow:
>
> Really? A collective, open-source project that doesn't allow anybody other than google to index the documentation? Is there a reason for this?
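For reference, one way to see how a parser reads the quoted file is Python's standard `urllib.robotparser`. This is only one interpretation of robots.txt and other crawlers may differ; the agent name `ExampleBot` and the page path `/Main_Page` are placeholders chosen for the sketch.

```python
from urllib.robotparser import RobotFileParser

# The file as quoted in the message above.
ROBOTS_TXT = """\
User-agent: Google
Disallow:

User-agent: *
Disallow: /

#User-agent: Slurp
#Crawl-delay: 5
#Disallow:
"""


def allowed(agent, url="http://wiki.flightgear.org/Main_Page"):
    """Ask Python's robots.txt parser whether `agent` may fetch `url` under ROBOTS_TXT."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


# "ExampleBot" is a hypothetical crawler name that does not contain "Google".
for agent in ("Googlebot", "ExampleBot"):
    print(agent, allowed(agent))
```

Under this parser, an agent whose name contains "Google" is let in by the empty Disallow: line, and every other agent falls through to the * record and is refused.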
Re: [Flightgear-devel] http://wiki.flightgear.org/robots.txt
John Denker wrote:
> Really? A collective, open-source project that doesn't allow anybody other than google to index the documentation? Is there a reason for this?

Presumably because there are some truly awful bots out there, and google at least is known to be well behaved.

Jon