I've seen this as well…

Early one Tuesday morning I was making a DSpace system change and, right
afterward, the system load average spiked, giving me that bad feeling that
something I had changed was no good. Analysis proved otherwise, however. I
found that the jsvc process was consuming all available CPU. The Tomcat logs
are of little use when pursuing problems like this, but the Apache access_log
showed that our site's Google Search Appliance (GSA) was intensely indexing the
content served by this system. When that indexing completed, the load average
went back to nil. Looking back through our system monitoring graphs, I saw that
this load spike occurred every Tuesday morning, when the GSA was doing a full
indexing run. Servers running just Apache had no such issues. (Apache is
implemented in a compiled programming language.)
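
For anyone wanting to repeat that kind of access_log check, here is a minimal sketch in Python. It is not the procedure I actually used. It assumes the log is in Apache's "combined" format (so the user agent is the last quoted field), and the function names are made up for illustration. It simply counts requests per user agent, and per hour for one chosen agent, so that a crawler's indexing run can be lined up against the load graphs.

```python
import re
import sys
from collections import Counter

# Assumed layout (Apache "combined" LogFormat):
# host ident user [time] "request" status bytes "referer" "user-agent"
# Lines that do not fit (e.g. escaped quotes inside a field) are skipped.
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def count_by_agent(lines):
    """Return a Counter of request counts keyed by user-agent string."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match:
            counts[match.group("agent")] += 1
    return counts


def count_by_hour(lines, agent_substring):
    """Return per-hour request counts for agents containing agent_substring."""
    wanted = agent_substring.lower()
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and wanted in match.group("agent").lower():
            # The time field looks like 09/Apr/2013:06:15:02 -0400;
            # the first 14 characters are the date plus the hour.
            counts[match.group("time")[:14]] += 1
    return counts


if __name__ == "__main__":
    # Usage: python agents.py /path/to/access_log [agent-substring]
    with open(sys.argv[1], errors="replace") as log:
        lines = log.readlines()
    for agent, hits in count_by_agent(lines).most_common(10):
        print(f"{hits:8d}  {agent}")
    if len(sys.argv) > 2:
        for hour, hits in sorted(count_by_hour(lines, sys.argv[2]).items()):
            print(f"{hour}  {hits}")
```

Run it against the access_log covering the period of the spike. If one crawler's user agent dominates the top of the list and its hourly counts match the load graph, that is a strong hint the crawler is the cause. Which substring identifies a given bot depends on the bot, so check the strings that actually appear in your own log.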

Java is notorious for its overhead, security issues, and JVM programming 
errors, and this is one more manifestation of the issues one can suffer when 
implementing applications in Java.

There isn't a lot you can do about this, as long as DSpace is implemented in 
Java and you want your site's content to be findable via search sites such as 
Google. You could try blocking non-mainstream indexing bots via firewall 
settings, but that quickly gets messy.

Richard Sims
Sr. Systems Engineer, Information Services & Technology
Boston University
http://people.bu.edu/rbs

On Apr 6, 2013, at 6:26 AM, Hilton Gibson <[email protected]> wrote:

> "We are currently experiencing that bots (e.g. googlebot, bingbot) are using 
> up all the cpu on our dspace installation" Can you explain how you determined 
> this?
> 
> 
> On 5 April 2013 20:42, Ene Rammer Nielsen <[email protected]> wrote:
> Hi,
> We are currently experiencing that bots (e.g. googlebot, bingbot) are using 
> up all the cpu on our dspace installation. We were wondering whether anybody 
> has any good advice? We have discussed different solutions, but we also 
> noticed that the problem could be due to missing database indexes, so we would 
> like to hear if anybody has any good ideas.
> 
> Hope someone can help.


_______________________________________________
DSpace-tech mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspace-tech
List Etiquette: https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette
