----- "Tim Donohue" <[email protected]> wrote:

| Quick followup, in case it isn't clear (as I was asked about this
| off-list). The preference would be to share your DSpace
| setup/configuration information directly on this listserv
Let me kick things off, then (questions truncated a bit for formatting reasons):

1) Contact Info

   a) Bill Anderson / Georgia Institute of Technology / [email protected]

2) DSpace Setup and Configuration details

   a) What DSpace version are you using?
      1. DSpace 1.6.2
      2. Currently using JSPUI, migrating to XMLUI
      3. 30,498 Items
      4. 610 Communities/Collections

   b) What Postgres/Oracle version are you using?
      1. PostgreSQL 8.1.4

   c) What Tomcat version are you using?
      1. Tomcat/6.0.26 + mod_jk/1.2.30 + Apache/2.0.52

   d) Is everything running on one server (DSpace/Tomcat/Postgres/etc)?
      1. Everything is (currently) on the same server
      2. PowerEdge 2850: 2x Intel Xeon CPU 2.80 GHz, 12 GB Memory,
         Red Hat AS 4 (Nahant Update 8), RAID5 Disk array

   e) How much memory are you making available to Tomcat/Java?
      1. (lb worker) JAVA_OPTS="-server -Xmx462M -Xms462M -XX:+UseParallelGC -Dfile.encoding=UTF-8",
         webapps: jspui lni oai sword xmlui
      2. (lb worker) JAVA_OPTS="-server -Xmx462M -Xms462M -XX:+UseParallelGC -Dfile.encoding=UTF-8",
         webapps: jspui lni oai sword xmlui
      3. JAVA_OPTS="-server -Xmx600M -Xms600M -XX:+UseParallelGC -Dfile.encoding=UTF-8",
         webapps: solr
      4. lb worker method=request, socket_keepalive=True, socket_timeout=0, ping_mode=A
      5. Postgres max_connections=300

3) Performance / Scalability Issues noticed

   1. We've had intermittent performance problems since upgrading to 1.6
      in May. At first, the problems seemed strictly SOLR-related; SOLR
      was grabbing hundreds of postgres connections, and eventually
      generating these in dspace.log:

        org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool error: Timeout waiting for idle object

      and these in catalina.out:

        SEVERE: org.apache.solr.common.SolrException: Error opening new searcher. exceeded limit of maxWarmingSearchers=2, try again later.

      ...followed by permgen errors and death.

   2. We heavily revised our solrconfig.xml, which alleviated the problem
      but didn't eliminate it. We also split our jspui between two
      load-balanced Tomcat instances, and moved the SOLR webapp to a
      third instance, which also helped. Following OR 2010, on a
      suggestion from Peter Dietz, we revised the SOLR JSP code to use
      the auto-commit functionality rather than manually committing
      every transaction.

      All of this got us to the point where we weren't crashing
      routinely, but we still have major problems during times of heavy
      traffic. Generally, these take the form of a gradual slowdown
      followed by a complete failure to respond; this sometimes ends in
      spontaneous recovery, and sometimes in permgen errors and a crash.

      At the end of last week, following a bad patch caused by a LOCKSS
      harvest, we implemented a restart schedule, with our two jspui
      Tomcat instances automatically restarted every 6 hours,
      alternating between instance one and instance two. We haven't had
      any crashes since, but we're not at all sure we've solved the
      problem.

   3. On restart, we sometimes get a bunch of these:

        Sep 28, 2010 9:00:06 AM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
        SEVERE: A web application appears to have started a thread named [FinalizableReferenceQueue] but has failed to stop it. This is very likely to create a memory leak

   4. Other errors that lead to a service/application outage:

        Sep 23, 2010 3:47:14 PM org.apache.tomcat.util.threads.ThreadPool$ControlRunnable run
        SEVERE: Caught exception (java.lang.OutOfMemoryError: PermGen space) executing org.apache.jk.common.channelsocket$socketconnect...@3aff776, terminating thread

        Sep 23, 2010 10:37:04 AM org.apache.catalina.connector.CoyoteAdapter service
        SEVERE: An exception or error occurred in the container during the request processing
        java.lang.OutOfMemoryError: PermGen space
            at java.lang.Throwable.getStackTraceElement(Native Method)
            at java.lang.Throwable.getOurStackTrace(Throwable.java:591)
            at java.lang.Throwable.getStackTrace(Throwable.java:582)
            at org.apache.juli.logging.DirectJDKLog.log(DirectJDKLog.java:155)
            at org.apache.juli.logging.DirectJDKLog.error(DirectJDKLog.java:135)
            at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:274)
            at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
            at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
            at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
            at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
            at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
            at org.apache.jk.server.JkCoyoteHandler.invoke(JkCoyoteHandler.java:190)
            at org.apache.jk.common.HandlerRequest.invoke(HandlerRequest.java:291)
            at org.apache.jk.common.ChannelSocket.invoke(ChannelSocket.java:769)
            at org.apache.jk.common.ChannelSocket.processConnection(ChannelSocket.java:698)
            at org.apache.jk.common.ChannelSocket$SocketConnection.runIt(ChannelSocket.java:891)
            at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:690)
            at java.lang.Thread.run(Thread.java:619)

        Sep 23, 2010 10:38:19 AM org.apache.catalina.connector.CoyoteAdapter service
        SEVERE: An exception or error occurred in the container during the request processing
        java.lang.OutOfMemoryError: PermGen space

4) Volunteer To Help?

   a) Would you be willing to volunteer some time to work on a fix

      Yes. We have a large DSpace installation and several smaller
      ones, with four systems analysts and a system administrator who
      work at least part time on them, and several administrative
      users/submitters with significant knowledge of DSpace from a
      user's perspective. We would be interested in helping with
      testing and development as time permits.

      Please use me as a primary contact: [email protected]

Bill Anderson
Software Developer
Digital Library Development
Georgia Tech Library
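
One way to check whether the PermGen failures reported under 3) cluster on particular days (for example around a harvest, or before the restart schedule went in) is to tally them from catalina.out. The following is only a minimal sketch, assuming the two-line java.util.logging layout visible in the excerpts above (a timestamp header line followed by the message) and an English locale for month names. The function name `count_permgen_errors` and the sample lines are illustrative and are not part of the setup described in this message.

```python
import re
from collections import Counter
from datetime import datetime

# Header line of a log record as it appears in the excerpts above,
# e.g. "Sep 23, 2010 3:47:14 PM org.apache.catalina... service".
# Only the date part is captured; the time is matched but ignored.
HEADER = re.compile(r"^([A-Z][a-z]{2} \d{1,2}, \d{4}) \d{1,2}:\d{2}:\d{2} [AP]M ")

PERMGEN = "OutOfMemoryError: PermGen space"


def count_permgen_errors(lines):
    """Return a Counter mapping each calendar day to the number of log
    lines that mention a PermGen OutOfMemoryError on that day."""
    counts = Counter()
    current_day = None
    for line in lines:
        match = HEADER.match(line)
        if match:
            # Remember the day of the most recent record header.
            current_day = datetime.strptime(match.group(1), "%b %d, %Y").date()
        if PERMGEN in line and current_day is not None:
            counts[current_day] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample built from the log excerpts quoted in this message.
    sample = [
        "Sep 23, 2010 10:37:04 AM org.apache.catalina.connector.CoyoteAdapter service",
        "SEVERE: An exception or error occurred in the container during the request processing",
        "java.lang.OutOfMemoryError: PermGen space",
        "Sep 28, 2010 9:00:06 AM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads",
        "SEVERE: A web application appears to have started a thread named [FinalizableReferenceQueue] but has failed to stop it.",
    ]
    for day, n in sorted(count_permgen_errors(sample).items()):
        print(day, n)
```

In practice the function could be fed an open file handle for catalina.out instead of a list; the per-day counts can then be compared against the dates of restarts or harvests.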

