Hi Guys,

Would be great if you could update the existing wiki documentation with the 
information in this thread since it wasn't good enough in the first place 
apparently :)

Thanks!
-Vincent

On Jan 8, 2012, at 4:49 PM, Kaya Saman wrote:

> On 01/08/2012 03:53 PM, Guillaume Fenollar wrote:
>> Hi Kaya,
>> 
>> Yes, if you don't use a front-end web server (e.g. Apache or nginx), you should
>> put robots.txt directly into the /ROOT directory of Tomcat (if it listens
>> on port 80). After that, you can simply test your setup by trying to reach
>> http://youdomain.org/robots.txt. If you can't find it this way, bots won't
>> find it either.
> 
> Thanks for the response Guillaume!
> 
> I found a site: http://www.frobee.com/robots-txt-check
> 
> which actually tests the compliance of the robots.txt, and it seems mine are fine.
> 
>> 
>> Concerning the disallow directives, it is your choice what to let the bots
>> index. My advice would be to make an inventory of the spaces
>> and actions you don't want indexed.
>> You could take this one as an example: http://cdlsworld.xwiki.com/robots.txt
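
A minimal sketch of how a set of disallow rules could be checked before deploying it, using Python's standard urllib.robotparser. The rules and paths below are illustrative assumptions only, not the contents of the robots.txt linked above and not a recommended XWiki rule set; build_parser and is_allowed are hypothetical helper names.

```python
from urllib.robotparser import RobotFileParser

# Illustrative rules only: these paths are assumptions made for the example,
# not a recommended or official XWiki rule set.
ROBOTS_TXT = """\
User-agent: *
Disallow: /xwiki/bin/edit/
Disallow: /xwiki/bin/viewrev/
Disallow: /xwiki/bin/pdf/
"""


def build_parser(text):
    """Parse robots.txt content given as a string (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(parser, path, agent="Googlebot"):
    """Return True if the given user agent may fetch the given path."""
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    parser = build_parser(ROBOTS_TXT)
    for path in ("/xwiki/bin/view/Main/WebHome", "/xwiki/bin/edit/Main/WebHome"):
        print(path, is_allowed(parser, path))
```

Checking the rules locally like this only shows how a standards-following parser reads them; individual crawlers may interpret a robots.txt differently.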
> 
> I took a look at it and will compare that to the example on the XWiki site.
> 
>> 
>> Finally, it's funny you're asking whether bots could harass
>> your server, because almost everyone wants them (except for bad robots) to
>> come and index their websites :-)
>> Anyway, I don't think that robots would account for a significant amount of traffic.
>> But the users who find your content through search engines will ;-) I
>> guess that's what you want.
> 
> It's not that I don't want things to be indexed or viewed, but I am getting a 
> strange issue on one of my XWiki sites: whenever I load the site, i.e. 
> start Tomcat, the memory usage is really low, ~600MB; then after a while the 
> CPU will start working a little, ~10%, and the memory consumed by the process 
> will jump up to 1.6GB. There's not much on that site to begin with; I mean my 
> wiki site has more information, images, etc. than this site, which is my 
> www site, yet the www site is consuming way more memory.
> 
> I'm not really sure how to even begin debugging, as I have both Webalizer 
> and AWStats working on my reverse Squid proxy in front of Tomcat. So far 
> AWStats, which has been working from the beginning (3rd Jan this year), shows 
> nearly 9000 hits :-S, a lot of which come from Googlebot.
> 
> That was my only issue.
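
One rough way to see how much of that traffic comes from crawlers is to count hits per user agent in the proxy's access log. The sketch below assumes the log is written in the common "combined" format, where the user agent is the last double-quoted field; that is an assumption about the setup, and the sample lines are made up for illustration. count_agents and bot_share are hypothetical helper names.

```python
import re
from collections import Counter

# Assumption: "combined" log format, where the user agent is the last
# double-quoted field on each line.
UA_PATTERN = re.compile(r'"([^"]*)"\s*$')

# Made-up sample lines for illustration; not taken from any real log.
SAMPLE_LINES = [
    '192.0.2.1 - - [08/Jan/2012:16:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.2 - - [08/Jan/2012:16:00:05 +0000] "GET /robots.txt HTTP/1.1" 200 64 "-" '
    '"Mozilla/5.0 (X11; Linux x86_64)"',
]


def count_agents(lines):
    """Count hits per user-agent string; lines without a match are skipped."""
    counts = Counter()
    for line in lines:
        match = UA_PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def bot_share(counts, marker="Googlebot"):
    """Return (hits whose user agent contains marker, total hits)."""
    total = sum(counts.values())
    bot_hits = sum(n for agent, n in counts.items() if marker in agent)
    return bot_hits, total


if __name__ == "__main__":
    hits, total = bot_share(count_agents(SAMPLE_LINES))
    print(f"Googlebot hits: {hits} of {total}")
```

A hit count by itself does not explain the memory growth; it only helps to confirm or rule out crawler traffic as a factor.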
> 
> The URLs of both sites are here:
> 
> 
> http://www.optiplex-networks.com
> 
> http://wiki.optiplex-networks.com
> 
> 
> and footprints are shown here:
> 
>   PID JID USERNAME   THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
> 51547  22 www         46  44    0  3545M  1590M ucond   1   6:04  0.00% java
> 28878  14 www         49  44    0  3544M   404M ucond   0   3:47  0.00% java
> 
> 
> with JID 14 being the wiki. site and JID 22 being the www. site.
> 
>> 
>> Regards,
>> 
> 
> Regards,
> 
> Kaya
> _______________________________________________
> users mailing list
> [email protected]
> http://lists.xwiki.org/mailman/listinfo/users
