Hi guys,

It would be great if you could update the existing wiki documentation with the information in this thread, since it apparently wasn't good enough in the first place :)
Thanks!
-Vincent

On Jan 8, 2012, at 4:49 PM, Kaya Saman wrote:

> On 01/08/2012 03:53 PM, Guillaume Fenollar wrote:
>> Hi Kaya,
>>
>> Yes, if you don't use any front webserver (i.e. Apache or nginx), you should
>> put robots.txt directly into the /ROOT directory of tomcat (if this one
>> listens on port 80). After that, you can simply test your setup by trying to
>> reach http://youdomain.org/robots.txt. If you don't find it this way, bots
>> won't find it either.
>
> Thanks for the response Guillaume!
>
> I found a site: http://www.frobee.com/robots-txt-check
>
> which actually tests the compliance of the robots.txt, and it seems mine are
> fine.
>
>>
>> Concerning the disallow directives, it is your choice to let the bots
>> index what you want/need. My advice would be to make an inventory of the
>> spaces and actions you don't want indexed.
>> You could take this one as an example: http://cdlsworld.xwiki.com/robots.txt
>
> I took a look at it and will compare that to the example off the Xwiki site.
>
>>
>> Finally, it's funny you're asking about the fact that bots could harass
>> your server, because almost everyone wants them (except for bad robots) to
>> come and index their websites :-)
>> Anyway, I don't think that robots could take up a remarkable amount of
>> traffic. But the users who find your content through search engines will ;-)
>> I guess that's what you want.
>
> It's not that I don't want things to be indexed or viewed, but I am getting a
> strange issue on one of my Xwiki sites: whenever I load the site, i.e.
> start tomcat, the memory usage is really low, ~600MB; then after a while the
> cpu will start working a little, ~10%, and the memory consumed by the process
> will jump up to 1.6GB. There's not much on that site to begin with. I mean, my
> Wiki site has more information and images etc. than this site, which is my
> www site, yet the www site is consuming way more memory??
>
> I'm not really sure how to even begin debugging, as I have both webalizer
> and awstats working on my reverse Squid proxy in front of tomcat. So far
> awstats, which has been working from the beginning (3rd Jan this year), shows
> nearly 9000 hits :-S out of which a lot come from Googlebot.
>
> That was my only issue.
>
> The URLs of both sites are here:
>
> http://www.optiplex-networks.com
>
> http://wiki.optiplex-networks.com
>
> and footprints are shown here:
>
>   PID JID USERNAME THR PRI NICE  SIZE   RES STATE C TIME  WCPU COMMAND
> 51547  22 www       46  44    0 3545M 1590M ucond 1 6:04 0.00% java
> 28878  14 www       49  44    0 3544M  404M ucond 0 3:47 0.00% java
>
> with JID 14 being the wiki. site and JID 22 being the www. site.....
>
>>
>> Regards,
>>
>
> Regards,
>
> Kaya
> _______________________________________________
> users mailing list
> [email protected]
> http://lists.xwiki.org/mailman/listinfo/users
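
For the wiki page, a small offline check of the disallow rules discussed above might be a useful illustration. The following is only a minimal sketch using Python's standard urllib.robotparser module. The rules in ROBOTS_TXT and the tested paths are hypothetical examples, not taken from either of the robots.txt files mentioned in this thread, and the helper names build_parser and is_allowed are made up for the illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; the real paths to disallow
# depend on the inventory of spaces and actions you do not want indexed.
ROBOTS_TXT = """\
User-agent: *
Disallow: /xwiki/bin/edit/
Disallow: /xwiki/bin/export/
"""


def build_parser(text):
    """Parse robots.txt content from a string, without any network access."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(parser, path, agent="Googlebot"):
    """Return True if the given user agent may fetch the given path."""
    return parser.can_fetch(agent, path)


parser = build_parser(ROBOTS_TXT)
for path in ("/xwiki/bin/view/Main/WebHome", "/xwiki/bin/edit/Main/WebHome"):
    status = "allowed" if is_allowed(parser, path) else "blocked"
    print(path, status)
```

This only checks how a well-behaved parser reads the rules. Whether the file is actually reachable still has to be tested by requesting /robots.txt on the live site, as Guillaume describes.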
