Re: [Dspace-tech] seeking help in tracking down why Tomcat is dying on us

2011-07-13 Thread Mark H. Wood
I keep a monitoring gadget like LambdaProbe (or PsiProbe) running on
production Tomcat instances so I can watch what is happening to
memory.  It helps with tuning and sometimes diagnosis.

-- 
Mark H. Wood, Lead System Programmer   mw...@iupui.edu
Asking whether markets are efficient is like asking whether people are smart.


--
AppSumo Presents a FREE Video for the SourceForge Community by Eric 
Ries, the creator of the Lean Startup Methodology on Lean Startup 
Secrets Revealed. This video shows you how to validate your ideas, 
optimize your ideas and identify your business strategy.
http://p.sf.net/sfu/appsumosfdev2dev
___
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech


Re: [Dspace-tech] seeking help in tracking down why Tomcat is dying on us

2011-07-13 Thread Pottinger, Hardy J.
Thanks, Mark. Many, many moons ago you gave me this same very helpful
advice, and I've been running LambdaProbe ever since. The problem with
LambdaProbe is that it requires Tomcat to function, and if Tomcat is down...
;-)
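
One way around that dependency is to sample the process from outside the JVM, so the record survives a Tomcat crash. The following is a minimal sketch, not something from this thread: the function name sample_mem, the pid file, and the log path are all made up for illustration, and it assumes a ps that accepts the POSIX -o and -p options.

```shell
#!/bin/sh
# Print one sample line for a process: timestamp, resident size (KB),
# and virtual size (KB). Returns non-zero once the process is gone.
sample_mem() {
    pid=$1
    stats=$(ps -o rss= -o vsz= -p "$pid") || return 1
    [ -n "$stats" ] || return 1
    # Unquoted on purpose: word splitting yields the two numbers.
    set -- $stats
    echo "$(date '+%Y-%m-%dT%H:%M:%S') rss_kb=$1 vsz_kb=$2"
}

# Smoke test: sample the current shell itself.
sample_mem "$$"

# Hypothetical logging loop (pid file and log path are placeholders):
# while sample_mem "$(cat /var/run/tomcat5.pid)" >> /var/log/tomcat-mem.log; do
#     sleep 60
# done
```

The loop stops on its own when the process disappears, so the last line in the log shows the sizes shortly before the crash. The virtual size is the figure to watch if a 32-bit process size limit is suspected.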

Now I'm looking into getting JMX monitoring set up by my sysadmins, so
they can use Zabbix to monitor our Tomcat installations. I'd love to have
logs of memory usage to dig into, the next time something comes up.
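
For reference, remote JMX access on a Sun/Oracle JVM is normally switched on with system properties passed through JAVA_OPTS. The fragment below is only a sketch under assumptions: the port number and the two file paths are placeholders rather than values from this installation, and how Zabbix is then pointed at the port is not covered here.

```shell
# Keep whatever JAVA_OPTS was set earlier in the same file.
JAVA_OPTS="${JAVA_OPTS:-}"

# Enable the JMX agent and expose it on a TCP port (9010 is a placeholder).
JAVA_OPTS="$JAVA_OPTS -Dcom.sun.management.jmxremote"
JAVA_OPTS="$JAVA_OPTS -Dcom.sun.management.jmxremote.port=9010"

# Require a username and password; the file locations are placeholders.
JAVA_OPTS="$JAVA_OPTS -Dcom.sun.management.jmxremote.authenticate=true"
JAVA_OPTS="$JAVA_OPTS -Dcom.sun.management.jmxremote.password.file=/etc/tomcat5/jmxremote.password"
JAVA_OPTS="$JAVA_OPTS -Dcom.sun.management.jmxremote.access.file=/etc/tomcat5/jmxremote.access"

# SSL is turned off here only to keep the sketch short.
JAVA_OPTS="$JAVA_OPTS -Dcom.sun.management.jmxremote.ssl=false"
```

Tomcat has to be restarted for the properties to take effect, and the port should be firewalled so that only the monitoring host can reach it.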

--
HARDY POTTINGER pottinge...@umsystem.edu
University of Missouri Library Systems
http://lso.umsystem.edu/~pottingerhj/
No matter how far down the wrong road you've gone,
turn back. --Turkish proverb










[Dspace-tech] seeking help in tracking down why Tomcat is dying on us

2011-07-12 Thread Pottinger, Hardy J.
Hi, I thought I'd ask the community for help debugging this one. For the
past week, we've been having troubles with Tomcat dying on us. The error
message in the Tomcat logs indicates that the JVM has run out of memory
(snippets of log files and config files below). The system has 8GB of RAM,
and here's what we have configured for Tomcat memory settings:

###  tomcat memory settings, from /etc/tomcat5/tomcat5.conf

# recommended settings for a production DSpace environment
# from Mark Wood (mw...@iupui.edu)
JAVA_OPTS="-Xmx1024M -Xms768M"
JAVA_OPTS="$JAVA_OPTS -XX:MaxPermSize=128M"
JAVA_OPTS="$JAVA_OPTS -XX:PermSize=32M"

# tweak from Bill Anderson at GAtech: use the parallel garbage collector
JAVA_OPTS="$JAVA_OPTS -XX:+UseParallelGC"



###  Here is the error message we're seeing in the tomcat logs:

# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate 32756 bytes for ChunkPool::allocate
# An error report file with more information is saved as:
# /home/dspace/hs_err_pid7188.log


###  contents of /home/dspace/hs_err_pid7188.log

# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate 32756 bytes for ChunkPool::allocate
# Possible reasons:
#   The system is out of physical RAM or swap space
#   In 32 bit mode, the process size limit was hit
# Possible solutions:
#   Reduce memory load on the system
#   Increase physical memory or swap space
#   Check if swap backing store is full
#   Use 64 bit Java on a 64 bit OS
#   Decrease Java heap size (-Xmx/-Xms)
#   Decrease number of Java threads
#   Decrease Java thread stack sizes (-Xss)
#   Set larger code cache with -XX:ReservedCodeCacheSize=
# This output file may be truncated or incomplete.
#
#  Out of Memory Error (allocation.cpp:211), pid=7188, tid=1800473488
#
# JRE version: 6.0_26-b03
# Java VM: Java HotSpot(TM) Server VM (20.1-b02 mixed mode linux-x86 )

---  T H R E A D  ---

Current thread (0x6ba00800):  JavaThread C2 CompilerThread0 daemon [_thread_in_native, id=7204, stack(0x6b49,0x6b511000)]

Stack: [0x6b49,0x6b511000],  sp=0x6b50de80,  free space=503k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x7248b0]

###  there are two similar hs_err.log files, reporting similar error
conditions (insufficient memory), dated 07/05/2011 and 07/06/2011


I can post snippets from other log files, but I figured I'd first ask whether
anyone has advice on where to look or what to look for.
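
For a rough sense of scale: the hs_err header above reports a linux-x86 VM, which is a 32-bit process, and the report itself lists the process size limit as one possible cause. The sketch below simply adds up the heap and PermGen ceilings from the settings quoted above. The function name jvm_reserved_mb is made up for illustration, and it only understands values written with an M suffix.

```shell
#!/bin/sh
# Sum the -Xmx and -XX:MaxPermSize ceilings found in a JAVA_OPTS string.
# Result is in MB; options without an M suffix are ignored.
jvm_reserved_mb() {
    total=0
    for opt in $1; do
        case "$opt" in
            -Xmx*M)             val=${opt#-Xmx} ;;
            -XX:MaxPermSize=*M) val=${opt#-XX:MaxPermSize=} ;;
            *)                  continue ;;
        esac
        total=$((total + ${val%M}))
    done
    echo "$total"
}

# The settings quoted in this message; prints 1152.
jvm_reserved_mb "-Xmx1024M -Xms768M -XX:MaxPermSize=128M -XX:PermSize=32M -XX:+UseParallelGC"
```

So roughly 1.1 GB is claimed by the heap and PermGen ceilings alone. A 32-bit Linux process commonly has no more than about 3 GB of address space in total, depending on the kernel, and thread stacks, the code cache, and native allocations such as the compiler's ChunkPool have to fit in what is left. This is arithmetic only and does not show that address-space exhaustion is the actual cause.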


--
HARDY POTTINGER pottinge...@umsystem.edu
University of Missouri Library Systems
http://lso.umsystem.edu/~pottingerhj/
No matter how far down the wrong road you've gone,
turn back. --Turkish proverb







Re: [Dspace-tech] seeking help in tracking down why Tomcat is dying on us

2011-07-12 Thread Pottinger, Hardy J.
Hi, George, now that you mention it, the only entries showing up in the logs
at the time of the memory error were lines about Solr statistics. I figured
those were just an indication of normal usage. On the off chance these lines
might help with diagnosis, I'm pasting a few below:

Jul 12, 2011 9:48:19 AM org.apache.solr.core.SolrDeletionPolicy onInit
INFO: SolrDeletionPolicy.onInit: commits:num=1

commit{dir=/dspace/solr/statistics/data/index,segFN=segments_tl02,
version=1282198870198,generation=1380242,filenames=[_wvbt.nrm, _ghmn.frq,
_px24.tii, _wuy1.fdt, _wvbt.prx, _75l4.tis, _75l4.tii, _wu4u.fnm,
_wvbw.fdt, _wvbu.fnm, _wvbw.tii, _px24.fnm, _wvbw.fdx, _px24.tis,
_wvbw.frq, _wvbs.fnm, _wv9i.frq, _75l4.fdx, _wv0t.frq, _wvbr.fnm,
_wsrd.tis, _px24.prx, _px24.fdt, _75l4.fdt, _75l4.prx, _px24.fdx,
_wuy1.fdx, _wu4u.fdt, _wv5l.tis, _wsrd.frq, _wu4u.fdx, _wvbv.fnm,
_wsrd.tii, _wsrd.fdx, _wvby.fnm, _wvbu.tis, _wvbv.tis, _wsrd.fdt,
_wvby.tii, _wv3m.frq, _wvbu.nrm, _wv5l.fnm, _wvbr.prx, _wvbx.fnm,
_wsrd.prx, _wv9i.fdt, _wvbu.prx, _wv9i.fdx, _wvbt.tis, _wvby.prx,
_wtdv.prx, _wuy1.prx, _wuy1.fnm, _wvbt.fnm, _wu4u.tii, _75l4.frq,
_wu4u.tis, _wvbs.tis, _wtdv.tis, _wv9i.tis, _wvby.nrm, segments_tl02,
_ghmn.fdx, _wuy1.tis, _ghmn.fdt, _wvby.fdt, _wv9i.tii, _wvbv.fdx,
_wv5l.fdx, _wvbs.tii, _wtdv.tii, _wv3m.prx, _wvbv.fdt, _wvbw.nrm,
_ghmn.fnm, _wvbt.tii, _wvbv.prx, _px24.frq, _wvbt.fdt, _wtdv.fnm,
_wvbv.nrm, _wvbt.fdx, _wvbr.tii, _wv3m.tis, _wvbu.frq, _wv0t.fdx,
_wv5l.prx, _wvbr.tis, _wu4u.frq, _wv0t.fnm, _wv9i.fnm, _wv0t.fdt,
_wvbv.tii, _wvby.frq, _wsrd.fnm, _wvbw.prx, _wv0t.prx, _wv5l.fdt,
_wvby.fdx, _wvbv.frq, _wvbs.frq, _wvbr.frq, _wvby.tis, _wvbw.fnm,
_wvbu.tii, _wv3m.fnm, _wvbs.fdx, _wv5l.frq, _wv0t.tii, _wvbr.fdx,
_wvbs.fdt, _wvbr.fdt, _wv9i.prx, _ghmn.tii, _wv3m.fdx, _wvbu.fdx,
_wtdv.fdx, _wv3m.tii, _wvbs.nrm, _wv3m.fdt, _wvbu.fdt, _ghmn.tis,
_wvbx.frq, _wtdv.frq, _wvbx.fdx, _wv0t.tis, _wu4u.prx, _wtdv.fdt,
_wvbx.tis, _wvbx.fdt, _wuy1.frq, _ghmn.prx, _wvbs.prx, _wvbt.frq,
_75l4.fnm, _wvbw.tis, _wvbx.nrm, _wvbx.prx, _wuy1.tii, _wv5l.tii,
_wvbx.tii]
Jul 12, 2011 9:48:19 AM org.apache.solr.core.SolrDeletionPolicy updateCommits
INFO: newest commit = 1282198870198
Jul 12, 2011 9:48:19 AM org.apache.solr.update.processor.LogUpdateProcessor finish
INFO: {add=[null]} 0 6
Jul 12, 2011 9:48:19 AM org.apache.solr.core.SolrCore execute
INFO: [statistics] webapp=/solr path=/update params={wt=javabin&version=1} status=0 QTime=6
Jul 12, 2011 9:49:43 AM org.apache.solr.update.processor.LogUpdateProcessor finish
INFO: {add=[null]} 0 1
Jul 12, 2011 9:49:43 AM org.apache.solr.core.SolrCore execute
INFO: [statistics] webapp=/solr path=/update params={wt=javabin&version=1} status=0 QTime=1
Jul 12, 2011 9:49:44 AM org.apache.solr.update.processor.LogUpdateProcessor finish
INFO: {add=[null]} 0 0
Jul 12, 2011 9:49:44 AM org.apache.solr.core.SolrCore execute
INFO: [statistics] webapp=/solr path=/update params={wt=javabin&version=1} status=0 QTime=0
Jul 12, 2011 9:51:01 AM org.apache.solr.update.processor.LogUpdateProcessor finish
INFO: {add=[null]} 0 1
Jul 12, 2011 9:51:01 AM org.apache.solr.core.SolrCore execute
INFO: [statistics] webapp=/solr path=/update params={wt=javabin&version=1} status=0 QTime=1
Jul 12, 2011 9:51:06 AM org.apache.solr.update.processor.LogUpdateProcessor finish
INFO: {add=[null]} 0 1
Jul 12, 2011 9:51:06 AM org.apache.solr.core.SolrCore execute
INFO: [statistics] webapp=/solr path=/update params={wt=javabin&version=1} status=0 QTime=1
Jul 12, 2011 9:51:11 AM org.apache.solr.update.processor.LogUpdateProcessor finish
INFO: {add=[null]} 0 1
Jul 12, 2011 9:51:11 AM org.apache.solr.core.SolrCore execute
INFO: [statistics] webapp=/solr path=/update params={wt=javabin&version=1} status=0 QTime=1
Jul 12, 2011 9:51:15 AM org.apache.solr.update.processor.LogUpdateProcessor finish
INFO: {add=[null]} 0 1
Jul 12, 2011 9:51:15 AM org.apache.solr.core.SolrCore execute
INFO: [statistics] webapp=/solr path=/update params={wt=javabin&version=1} status=0 QTime=1
Jul 12, 2011 9:51:19 AM org.apache.solr.update.processor.LogUpdateProcessor finish
INFO: {add=[null]} 0 1
Jul 12, 2011 9:51:19 AM org.apache.solr.core.SolrCore execute
INFO: [statistics] webapp=/solr path=/update params={wt=javabin&version=1} status=0 QTime=1
Jul 12, 2011 9:51:33 AM org.apache.solr.update.processor.LogUpdateProcessor finish
INFO: {add=[null]} 0 1
Jul 12, 2011 9:51:33 AM org.apache.solr.core.SolrCore execute
INFO: [statistics] webapp=/solr path=/update params={wt=javabin&version=1} status=0 QTime=1
Jul 12, 2011 9:51:49 AM org.apache.solr.update.processor.LogUpdateProcessor finish
INFO: {add=[null]} 0 1
Jul 12, 2011 9:51:49 AM org.apache.solr.core.SolrCore execute
INFO: [statistics] webapp=/solr path=/update params={wt=javabin&version=1} status=0 QTime=1
Jul 12, 2011 9:52:05 AM org.apache.solr.update.processor.LogUpdateProcessor finish
INFO: {add=[null]} 0 1
Jul 12, 2011 9:52:05 AM org.apache.solr.core.SolrCore execute
INFO: [statistics] webapp=/solr