[Dspace-tech] Google bots and web crawlers

2009-01-14 Thread Jeffrey Trimble
Is there something simple I can place in the JSP that will prevent crawlers from using my server resources? TIA, Jeff -- Jeffrey Trimble, Systems Librarian, Maag Library, Youngstown State University, 330-941-2483 (Office), jtrim...@cc.ysu.edu, http://www.maag.ysu.edu, http://digital.maag.ysu.edu

Re: [Dspace-tech] Google bots and web crawlers

2009-01-14 Thread Shane Beers
Jeff: We had an issue with our local Google instance crawling our DSpace installation and causing huge problems. I rewrote the robots.txt to disallow anything besides the item pages themselves - no browsing pages, search pages, and whatnot. Here is a copy of ours: User-agent: * Disallow:
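The archive preview cuts off before Shane's actual rules. A robots.txt in the spirit he describes - item pages crawlable, browse and search pages blocked - might look like the sketch below. The paths shown are the common DSpace 1.x JSPUI URL patterns, but they are illustrative; check them against the URLs your own instance actually serves.

```
User-agent: *
Disallow: /browse
Disallow: /browse-title
Disallow: /browse-author
Disallow: /browse-date
Disallow: /simple-search
Disallow: /advanced-search
```

Item pages (e.g. /handle/...) match none of these prefixes, so they remain crawlable by default.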

Re: [Dspace-tech] Google bots and web crawlers

2009-01-14 Thread Robert Tansley
As of DSpace 1.5, sitemaps are supported which allow search engines to selectively crawl only new items, while massively reducing the server load: http://www.dspace.org/1_5_1Documentation/ch03.html#N10B44 Unfortunately, it seems that relatively few DSpace instances actually use this feature. I
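For the 1.5.x series, the documentation linked above describes generating sitemaps with a script shipped under the DSpace bin directory and regenerating them on a schedule. A sketch of that setup follows; the script name and URLs are as given in the 1.5-era docs, and [dspace] and the hostname are placeholders you must substitute for your installation:

```shell
# Generate/refresh the sitemaps once by hand
[dspace]/bin/generate-sitemaps

# Typical crontab entry to regenerate nightly at 03:00
0 3 * * * [dspace]/bin/generate-sitemaps

# Then advertise the sitemap location to crawlers via robots.txt, e.g.:
# Sitemap: http://your.repository.url/sitemap
```

Because crawlers that honor sitemaps fetch only new or changed item URLs, this avoids the full-site recrawls that cause the load problems described in this thread.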

Re: [Dspace-tech] Google bots and web crawlers

2009-01-14 Thread George Kozak
Jeff: What I am using is a robots.txt file that I put in the dspace webapps directory in tomcat. I think it's working (at least we haven't crashed lately). If you're interested in seeing my robots.txt file, I can send it to you. At 01:09 PM 1/14/2009, Jeffrey Trimble wrote: Is there
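George's "I think it's working" can be checked more directly than by waiting to see whether the server crashes: Python's standard-library urllib.robotparser evaluates a robots.txt against sample URLs before you deploy it. A minimal sketch (the rules and URLs below are illustrative, not George's actual file):

```python
import urllib.robotparser

# A hypothetical robots.txt along the lines discussed in this thread:
# block browse/search pages, leave item (handle) pages crawlable.
robots_txt = """\
User-agent: *
Disallow: /browse
Disallow: /simple-search
Allow: /handle/
"""

rp = urllib.robotparser.RobotFileParser()
rp.parse(robots_txt.splitlines())

# Browse and search pages should be blocked (expect False)...
print(rp.can_fetch("Googlebot", "http://example.org/browse?type=title"))
print(rp.can_fetch("Googlebot", "http://example.org/simple-search?query=x"))
# ...while item pages remain crawlable (expect True).
print(rp.can_fetch("Googlebot", "http://example.org/handle/123456789/42"))
```

Running the same checks against the deployed file (rp.set_url(".../robots.txt"); rp.read()) also confirms Tomcat is actually serving it from the webapps directory.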

Re: [Dspace-tech] Google bots and web crawlers

2009-01-14 Thread Tom De Mulder
On Wed, 14 Jan 2009, Shane Beers wrote: We had an issue with our local google instance crawling our DSpace installation and causing huge issues. I re-wrote the robots.txt to disallow anything besides the item pages themselves - no browsing pages or search pages and whatnot. Here is a

Re: [Dspace-tech] Google bots and web crawlers

2009-01-14 Thread Van Ly
[Quoting Robert Tansley:] As of DSpace 1.5, sitemaps are supported which allow search engines to selectively crawl only new items, while massively reducing the server load: http://www.dspace.org/1_5_1Documentation/ch03.html