Re: [Dspace-tech] JAVA_OPTS for cron jobs?

2014-05-30 Thread Alan Orth
Peter,

Ahh, that's very interesting.  I just looked up the -server flag and it
seems on recent Sun/Oracle JVMs -server is implied on 64-bit Linux
platforms[0].

It seems my problem was that the OOM killer's heuristics were killing
Tomcat's java rather than whichever cron job (filter-media, etc.)
happened to be the final straw in exhausting the server's memory.  I've
since re-evaluated Tomcat's -Xmx and -Xms values and determined there
wasn't enough physical RAM to run both Tomcat's java and the background
tasks, even though DSpace's control panel shows Tomcat's java is actually
underutilizing the RAM we've allocated.  Reducing the allocation there
made a little more room for the background tasks, and things have been
stable since then.
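
As a rough illustration of the budgeting described above, the following sketch adds up heap caps and compares them with physical RAM. The function name and all numbers are made up for the example; they are not the values used on this server. A JVM also needs memory beyond its -Xmx heap (permgen, thread stacks, native allocations), so a sum that only just fits is still too tight.

```shell
#!/bin/sh
# fits_in_ram TOTAL_MB HEAP_MB...
# Hypothetical helper: succeeds if the listed heap caps sum to at most TOTAL_MB.
fits_in_ram() {
  total_mb=$1; shift
  sum=0
  for heap_mb in "$@"; do
    sum=$((sum + heap_mb))
  done
  echo "heap caps: ${sum} MB of ${total_mb} MB"
  [ "$sum" -le "$total_mb" ]
}

# Illustrative numbers only: a 4096 MB machine, Tomcat capped at 3072 MB,
# plus two cron jobs capped at 1200 MB each.
fits_in_ram 4096 3072 1200 1200 || echo "over-committed"

# The same machine with a smaller Tomcat heap and smaller cron heaps.
fits_in_ram 4096 2048 512 512 && echo "fits"
```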

Also, I suspect it was the checksum checker job (runs at 3am for us)
which was actually the final straw in exhausting the memory, so I've
modified it to work for 1 hour each run, instead of attempting to crawl
the whole repository (default):

0 3 * * * nice -n19 /blah/dspace/bin/dspace checker -d 1h -p

Cheers,

Alan

[0] http://docs.oracle.com/javase/7/docs/technotes/guides/vm/server-class.html

On 05/28/2014 05:33 PM, Peter Dietz wrote:
 Hi Alan,
 
 At Longsight, we customize the JAVA_OPTS in dspace/bin/dspace
 https://github.com/LongsightGroup/DSpace/blob/longsight-4_x/dspace/bin/dspace#L66
 
 #Allow user to specify java options through JAVA_OPTS variable
 if [ "$JAVA_OPTS" = "" ]; then
   #Default Java to use 256MB of memory
   JAVA_OPTS="-server -Xmx256m"
 fi
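
The same "use the caller's value, else a default" behaviour as the if-test above can also be written with POSIX parameter expansion. This is only a sketch of an equivalent form, not what either fork's bin/dspace contains; the function name default_java_opts is made up for the example.

```shell
#!/bin/sh
# default_java_opts OPTS
# Hypothetical helper: prints OPTS if it is non-empty, otherwise the default.
# ${1:-word} expands to word when $1 is unset or empty, which is the same
# condition the if-test checks.
default_java_opts() {
  printf '%s\n' "${1:--server -Xmx256m}"
}

# Keep whatever the caller exported; otherwise fall back to the default.
JAVA_OPTS="$(default_java_opts "${JAVA_OPTS:-}")"
export JAVA_OPTS
echo "JAVA_OPTS is: $JAVA_OPTS"
```

Either form lets a cron entry or wrapper script override the default simply by setting JAVA_OPTS before calling the launcher.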
 
 
 Previously, when I was at Ohio State, I had more in my JAVA_OPTS, to
 help with permgen issues.
 https://github.com/osulibraries/DSpace/blob/osukb/dspace/bin/dspace#L66
 
 #Allow user to specify java options through JAVA_OPTS variable
 if [ "$JAVA_OPTS" = "" ]; then
   #Default Java to use 512MB of memory
   JAVA_OPTS="-server -Xmx512m -XX:MaxPermSize=128m -XX:+CMSClassUnloadingEnabled"
 fi
 
 
 By adding -server you're ensuring that Java runs in server mode,
 as opposed to client mode. Server has slower initial startup, but a
 better memory footprint, and better performance for a longer running
 task, as per: 
 http://stackoverflow.com/questions/198577/real-differences-between-java-server-and-java-client
 
 Then, if one of our clients has some jumbo-sized content and the cron
 jobs just aren't completing, we'll temporarily bump the -Xmx memory
 limit higher, to something like 4G.
 
 Peter Dietz
 Longsight
 www.longsight.com
 pe...@longsight.com
 p: 740-599-5005 x809
 
 
 On Tue, May 27, 2014 at 7:03 PM, Terry Brady tw...@georgetown.edu wrote:
 Alan,

 We override JAVA_OPTS for the nightly filter-media task in our cron.

 export JAVA_OPTS=-Xmx1200m;dspace filter-media ...
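
 A related variant, shown here only as a sketch: putting the assignment directly in front of a command sets the variable for that one command, so later jobs in the same script do not inherit the larger heap. The probe string below is a stand-in for "dspace filter-media" (it just reports the JAVA_OPTS it sees) and is not part of DSpace.

```shell
#!/bin/sh
# probe stands in for "dspace filter-media": it only prints the JAVA_OPTS
# visible in its own environment (hypothetical stand-in, not a DSpace command).
probe='printf "%s\n" "${JAVA_OPTS:-unset}"'

unset JAVA_OPTS
# Prefix assignment: applies to this single command's environment only.
with_prefix=$(JAVA_OPTS=-Xmx1200m sh -c "$probe")
# The next command starts without it.
afterwards=$(sh -c "$probe")

echo "filter-media would see: $with_prefix"
echo "next job would see:     $afterwards"
```

 Since cron hands each entry to a shell, the same prefix form should also work on a crontab line in front of the dspace command.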

 We have a set of automated ingest tools.  We set JAVA_OPTS in some of the
 workflows that are run by those tools.

 https://github.com/Georgetown-University-Libraries/batch-tools/blob/master/bin-src/dspaceBatch.sh

 Terry



 On Tue, May 20, 2014 at 1:33 AM, Alan Orth alan.o...@gmail.com wrote:

 Hi,

 I'm curious if anyone sets memory limits for DSpace's various cron jobs?

 Lately we've been having Tomcat's java process get killed every morning
 around the same time, and all dmesg shows is that java was killed by
 the kernel's OOM killer.  Catalina logs don't show any SEVERE errors,
 so I have to assume it's the cron jobs which are using up loads of
 memory and then confusing the kernel, which then identifies Tomcat's
 java as the memory hog and kills it.

 So I'm just curious if anyone has had these kinds of problems, and
 if/what they set their JAVA_OPTS to in crontab.

 The long term plan of course is to move to a machine with more memory
 (currently 4GB).

 Thanks,

 DSpace version is 3.1, OS is Ubuntu 12.04.

 --
 Alan Orth
 alan.o...@gmail.com
 http://alaninkenya.org
 http://mjanja.co.ke
 I have always wished for my computer to be as easy to use as my
 telephone; my wish has come true because I can no longer figure out how
 to use my telephone. -Bjarne Stroustrup, inventor of C++
 GPG public key ID: 0x8cb0d0acb5cd81ec209c6cdfbd1a0e09c2f836c0







 --
 Terry Brady
 Applications Programmer Analyst
 Georgetown University Library Information Technology
 https://www.library.georgetown.edu/lit/code
 425-298-5498

 

Re: [Dspace-tech] JAVA_OPTS for cron jobs?

2014-05-30 Thread Peter Dietz
My hammer JAVA_OPTS on our production server, for when some site has
crazy big content, is to temporarily run it with:

JAVA_OPTS="-server -Xms256m -Xmx4g -XX:MaxPermSize=256m"

We have 64GB of RAM on our boxes, so we'll survive.


Not to derail onto a tangent, but one thing I'd like to see DSpace support
is some type of background-processing-queue.

For example, newly submitted content should be queued for: an initial
checksum, a virus check, media filters (thumbnail generation and full-text
extraction), and Discovery indexing.

And then there are maintenance jobs: Recompute the checksum, OAI harvest,
index-maintenance, ...

New submissions add to the queue, and some scheduler can add maintenance
tasks to the queue. This way you don't run into the issue of 3+ concurrent cron
jobs because they didn't complete in time. Maybe you can even tie this in
to the curation task queue system too. In the past we had a GitHub
Enterprise/Firewall, and being an admin of that shows you fancy admin bells
and whistles, where you can even inspect the queue.
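
Short of a real queue, overlapping cron jobs can at least be prevented with a lock. The sketch below is one possible approach and not a DSpace feature: the run_exclusive function and the lock path are made up for the example. It relies on mkdir being atomic, so only one caller can create the lock directory at a time. A job that is killed before it reaches rmdir leaves a stale lock behind that has to be removed by hand.

```shell
#!/bin/sh
# run_exclusive LOCKDIR COMMAND [ARGS...]
# Hypothetical wrapper: runs COMMAND only if LOCKDIR can be created.
# Returns 75 without running anything if the lock is already held.
run_exclusive() {
  lockdir=$1; shift
  if ! mkdir "$lockdir" 2>/dev/null; then
    echo "skipping: $lockdir is held by another job" >&2
    return 75
  fi
  "$@"
  status=$?
  rmdir "$lockdir"
  return "$status"
}

# Demo with a harmless command standing in for a DSpace maintenance task.
lock="${TMPDIR:-/tmp}/dspace-maintenance.lock.$$"
run_exclusive "$lock" echo "job ran"
```

Wrapping each maintenance task this way with one shared lock directory means a job that overruns causes the next one to be skipped rather than started alongside it.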

As for what happens if the queue grows faster than it can be processed,
we'll cross that bridge when we get there.


Peter Dietz
Longsight
www.longsight.com
pe...@longsight.com
p: 740-599-5005 x809



Re: [Dspace-tech] JAVA_OPTS for cron jobs?

2014-05-30 Thread Alan Orth
Peter,

A queue would be awesome.  You're absolutely right regarding the cron
jobs; it's almost like you need to set a weekly reminder to go check the
execution times of your DSpace maintenance cron jobs to make sure
they're all completing and not running at the same time. :)  I find that
I tweak everything and then we add a bunch more content, get a bunch
more hits, etc, and all the timings are off again. :P

Cheers,

Alan


Re: [Dspace-tech] JAVA_OPTS for cron jobs?

2014-05-27 Thread Terry Brady
Alan,

We override JAVA_OPTS for the nightly filter-media task in our cron.

export JAVA_OPTS=-Xmx1200m;dspace filter-media ...

We have a set of automated ingest tools.  We set JAVA_OPTS in some of the
workflows that are run by those tools.

https://github.com/Georgetown-University-Libraries/batch-tools/blob/master/bin-src/dspaceBatch.sh

Terry




-- 
Terry Brady
Applications Programmer Analyst
Georgetown University Library Information Technology
https://www.library.georgetown.edu/lit/code
425-298-5498