Hi Kaya,

Yes, if you don't use a front-end web server (i.e. Apache or nginx), you should
put robots.txt directly into Tomcat's webapps/ROOT directory (provided Tomcat
listens on port 80). You can then test your setup by requesting
http://yourdomain.org/robots.txt. If you can't reach it that way, bots won't
find it either.
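
If you would rather script that check than use a browser, here is a minimal
sketch in Python. The helper names robots_url and robots_reachable are made up
for illustration, and the domain is a placeholder you would replace with your
own. It only assumes that a correctly placed robots.txt answers with HTTP 200.

```python
from urllib.error import URLError
from urllib.parse import urljoin
from urllib.request import urlopen


def robots_url(site_root):
    """Build the robots.txt URL for a site root such as http://yourdomain.org.

    Crawlers only look at the root of the host, so any path in site_root
    (for example /xwiki/) is deliberately dropped.
    """
    return urljoin(site_root.rstrip("/") + "/", "/robots.txt")


def robots_reachable(site_root, timeout=5):
    """Return True if robots.txt is served with HTTP 200, False otherwise."""
    try:
        with urlopen(robots_url(site_root), timeout=timeout) as resp:
            return resp.status == 200
    except (URLError, OSError):
        # Covers HTTP errors such as 404, DNS failures, refused connections
        # and timeouts.
        return False
```

Calling robots_reachable("http://yourdomain.org") should return True once the
file sits in the right directory.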

As for the Disallow directives, it is up to you to decide what the bots may
index. My advice would be to make an inventory of the spaces and actions you
don't want indexed.
You could take this one as an example: http://cdlsworld.xwiki.com/robots.txt
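
To try out a set of rules before deploying them, something like the sketch
below may help. It uses Python's standard urllib.robotparser. The rules are
only an illustration: the Photos space comes from your question, while the
edit and export lines are my assumption of actions you might want to hide, so
adapt them to your own inventory. The helper name is_allowed is made up too.

```python
from urllib.robotparser import RobotFileParser

# Illustrative rules only: Photos is the space from the question, the
# edit and export actions are assumed examples, not a recommended list.
SAMPLE_RULES = """\
User-agent: *
Disallow: /xwiki/bin/edit/
Disallow: /xwiki/bin/export/
Disallow: /xwiki/bin/view/Photos/
"""


def is_allowed(rules_text, path, agent="Googlebot"):
    """Return True if the given robots.txt text lets `agent` fetch `path`."""
    parser = RobotFileParser()
    parser.parse(rules_text.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for path in ("/xwiki/bin/view/Main/WebHome",
                 "/xwiki/bin/view/Photos/WebHome",
                 "/xwiki/bin/edit/Main/WebHome"):
        print(path, is_allowed(SAMPLE_RULES, path))
```

With these sample rules, only the first path (a normal view of the Main space)
is reported as allowed.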

Finally, it's funny that you're asking whether bots could harass
your server, because almost everyone wants them (except for bad robots) to
come and index their websites :-)
Anyway, I don't think robots will generate a significant amount of traffic.
But the users who find your content through search engines will ;-) I
guess that's what you want.


Guillaume Fenollar
XWiki SysAdmin
Tel : +33 (0)

2012/1/8 Kaya Saman <kayasa...@gmail.com>

> Hi,
> in the Xwiki documentation for the robots.txt file it says to put it in
> the webserver configuration.
> http://platform.xwiki.org/xwiki/bin/view/AdminGuide/Performances#HRobots.txt
> On Tomcat where would this go? - Directly on the webapps/ROOT/ directory??
> Also the directives used it claims:
> # It could be also usefull to block certain spaces from crawling,
> # especially if this spaces doesn't provide new content
> Should the: /xwiki/bin/view/Photos/ portion also be excluded??
> Just as a last thing, what kind of performance benefits would be adhered
> to by stopping crawlers?
> I am imagining: CPU, RAM, Network B/W........
> Regards,
> Kaya
> _______________________________________________
> users mailing list
> users@xwiki.org
> http://lists.xwiki.org/mailman/listinfo/users