Hi Chris,

Based on the discussion below, I'm +1 on having a default value of off.
I've met many web admins recently who want to crawl and index their
entire DNS but do not wish to disable their robots.txt filter in order to
do so.
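
For illustration only, a minimal sketch of how such a switch could be wired
in, assuming a hypothetical boolean property. The property name
"fetcher.ignore.robots" and the guard class below are invented for this
example and are not existing Nutch configuration:

    import org.apache.hadoop.conf.Configuration;

    /** Hypothetical guard around the robots.txt check in a fetcher. */
    public class RobotsGate {

        // Invented property name; defaults to false so robots.txt is
        // honored unless an operator explicitly opts out.
        private static final String IGNORE_ROBOTS = "fetcher.ignore.robots";

        public static boolean shouldCheckRobots(Configuration conf) {
            return !conf.getBoolean(IGNORE_ROBOTS, false);
        }
    }

A fetcher would then skip the robots.txt lookup only when
shouldCheckRobots(conf) returns false, so today's behavior stays the default.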



I've recently been made aware of some situations in which
we are using crawlers like Nutch and we explicitly do not want
to honor robots.txt (some for research purposes; some for
other purposes). Right now, of course, this isn't possible since
honoring robots.txt is always enforced.

What would you guys think of an optional configuration (turned
off by default) that allows bypassing of robots.txt rules?

Cheers,
Chris
