My "hammer" JAVA_OPTS on our production server, for when some site has
unusually large content, is to temporarily run the job with:

JAVA_OPTS="-server -Xms256m -Xmx4g -XX:MaxPermSize=256m"

We have 64 GB of RAM on our boxes, so we'll survive.
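
For a one-off run, the override can be scoped to a single command so the
larger heap does not leak into other jobs. A minimal sketch, assuming the
launcher honors a JAVA_OPTS that is already set in the environment (as the
dspace/bin/dspace snippet quoted below does). BIG_JAVA_OPTS and
run_with_big_heap are hypothetical names, not part of DSpace:

```shell
#!/bin/sh
# Hypothetical helper: run one command with the "hammer" JAVA_OPTS,
# leaving the caller's own environment untouched.
BIG_JAVA_OPTS="-server -Xms256m -Xmx4g -XX:MaxPermSize=256m"

run_with_big_heap() {
  # env sets JAVA_OPTS only for the command it launches
  env JAVA_OPTS="$BIG_JAVA_OPTS" "$@"
}
```

Usage would look something like
run_with_big_heap /path/to/dspace/bin/dspace filter-media
where the path is a placeholder for your own install.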


Not to go off on a tangent, but one thing I'd like to see DSpace support
is some type of background processing queue.

For example, newly submitted content would be queued for: an initial
checksum, a virus check, media filters to generate thumbnails and extract
full text, and indexing by Discovery.

And then there are maintenance jobs: recomputing checksums, OAI harvest,
index maintenance, ...

New submissions would add to the queue, and a scheduler could add
maintenance tasks to it. This way you don't end up with 3+ concurrent cron
jobs because earlier runs didn't complete in time. Maybe this could even
tie in to the curation task queue system. In the past we ran GitHub
Enterprise/Firewall, and its admin interface had all the bells and
whistles, including the ability to inspect the queue.

As for what happens if the queue grows faster than it can be processed,
we'll cross that bridge when we get there.
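
To make the idea concrete, here is a rough sketch of a spool-directory
queue in plain shell. It is only an illustration under my own assumptions,
not anything DSpace provides: QUEUE_DIR, enqueue and run_queue are
hypothetical names, each job is stored as one command line in a file, and a
mkdir-based lock keeps a second worker from starting while one is running.

```shell
#!/bin/sh
# Sketch of a file-based job queue (hypothetical, not part of DSpace).
QUEUE_DIR="${QUEUE_DIR:-/tmp/dspace-queue}"
SEQ=0

enqueue() {
  # Store the command line as one job file. The timestamp, PID and
  # per-process counter give names that sort in submission order.
  mkdir -p "$QUEUE_DIR/jobs" "$QUEUE_DIR/tmp"
  SEQ=$((SEQ + 1))
  name=$(printf '%s-%07d-%04d' "$(date +%Y%m%d%H%M%S)" "$$" "$SEQ")
  # Write to tmp first, then move, so a worker never sees a partial file.
  printf '%s\n' "$*" > "$QUEUE_DIR/tmp/$name"
  mv "$QUEUE_DIR/tmp/$name" "$QUEUE_DIR/jobs/$name"
}

run_queue() {
  mkdir -p "$QUEUE_DIR/jobs"
  # mkdir is atomic, so only one worker can hold the lock at a time.
  # Caveat: a killed worker leaves the lock behind; a real version
  # would need stale-lock handling.
  if ! mkdir "$QUEUE_DIR/lock" 2>/dev/null; then
    echo "worker already running, leaving jobs queued" >&2
    return 0
  fi
  # The glob is expanded once, in sorted order; jobs added while this
  # loop runs wait for the next call.
  for job in "$QUEUE_DIR"/jobs/*; do
    [ -f "$job" ] || continue
    sh "$job" || echo "job failed: $job" >&2
    rm -f "$job"
  done
  rmdir "$QUEUE_DIR/lock"
}
```

With something like this, submissions and the scheduler would only ever
call enqueue, and a single frequent cron entry would call run_queue, so a
slow job delays the ones behind it instead of overlapping with them.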

________________
Peter Dietz
Longsight
www.longsight.com
pe...@longsight.com
p: 740-599-5005 x809


On Fri, May 30, 2014 at 6:11 AM, Alan Orth <alan.o...@gmail.com> wrote:

> Peter,
>
> Ahh, that's very interesting.  I just looked up the -server flag and it
> seems on recent Sun/Oracle JVMs -server is implied on 64-bit Linux
> platforms[0].
>
> It seems my problem was the fact that heuristics used by the OOM killer
> were killing Tomcat's java instead of whatever filter-media, etc cron
> job which happened to be the final straw in exhausting the server's
> memory.  I've since re-evaluated my Tomcat's -Xmx and -Xms values, and
> determined there wasn't enough physical RAM to run both Tomcat's java as
> well as the background tasks, yet DSpace's control panel shows Tomcat's
> java is actually underutilizing the RAM we've allocated.  Reducing the
> allocation there made a little more room for the background tasks and
> things have been stable since then.
>
> Also, I suspect it was the checksum checker job (runs at 3am for us)
> which was actually the final straw in exhausting the memory, so I've
> modified it to run for 1 hour at a time, instead of attempting to crawl
> the whole repository (the default):
>
> 0 3 * * * nice -n19 /blah/dspace/bin/dspace checker -d 1h -p
>
> Cheers,
>
> Alan
>
> [0]
> http://docs.oracle.com/javase/7/docs/technotes/guides/vm/server-class.html
>
> On 05/28/2014 05:33 PM, Peter Dietz wrote:
> > Hi Alan,
> >
> > At Longsight, we customize the JAVA_OPTS in dspace/bin/dspace
> >
> > https://github.com/LongsightGroup/DSpace/blob/longsight-4_x/dspace/bin/dspace#L66
> >
> > #Allow user to specify java options through JAVA_OPTS variable
> > if [ "$JAVA_OPTS" = "" ]; then
> >   #Default Java to use 256MB of memory
> >   JAVA_OPTS="-server -Xmx256m"
> > fi
> >
> >
> > Previously, when I was at Ohio State, I had more in my JAVA_OPTS, to
> > help with permgen issues.
> > https://github.com/osulibraries/DSpace/blob/osukb/dspace/bin/dspace#L66
> >
> > #Allow user to specify java options through JAVA_OPTS variable
> > if [ "$JAVA_OPTS" = "" ]; then
> >   #Default Java to use 512MB of memory
> >   JAVA_OPTS="-server -Xmx512m -XX:MaxPermSize=128m -XX:+CMSClassUnloadingEnabled"
> > fi
> >
> >
> > By adding "-server" you're ensuring that Java runs in server mode,
> > as opposed to client mode. Server has slower initial startup, but a
> > better memory footprint, and better performance for a longer running
> > task, as per:
> > http://stackoverflow.com/questions/198577/real-differences-between-java-server-and-java-client
> >
> > Then, if one of our clients has some jumbo-sized content that just
> > isn't completing the cron jobs, then we'll temporarily bump the Xmx
> > memory limit high, such as 4G.
> > ________________
> > Peter Dietz
> > Longsight
> > www.longsight.com
> > pe...@longsight.com
> > p: 740-599-5005 x809
> >
> >
> > On Tue, May 27, 2014 at 7:03 PM, Terry Brady <tw...@georgetown.edu> wrote:
> >> Alan,
> >>
> >> We override JAVA_OPTS for the nightly filter-media task in our cron.
> >>
> >> export JAVA_OPTS=-Xmx1200m;dspace filter-media ...
> >>
> >> We have a set of automated ingest tools.  We set JAVA_OPTS in some of
> >> the workflows that are run by those tools.
> >>
> >>
> >> https://github.com/Georgetown-University-Libraries/batch-tools/blob/master/bin-src/dspaceBatch.sh
> >>
> >> Terry
> >>
> >>
> >>
> >> On Tue, May 20, 2014 at 1:33 AM, Alan Orth <alan.o...@gmail.com> wrote:
> >>>
> >>> Hi,
> >>>
> >>> I'm curious if anyone sets memory limits for DSpace's various cron
> >>> jobs?
> >>>
> >>> Lately we've been having Tomcat's java process get killed every morning
> >>> around the same time, and all dmesg shows is that "java" was killed by
> >>> the kernel's OOM killer.  Catalina logs don't show any "SEVERE" errors,
> >>> so I have to assume it's the cron jobs which are using up loads of
> >>> memory and then confusing the kernel, which then identifies Tomcat's
> >>> java as the memory hog and kills it.
> >>>
> >>> So I'm just curious if anyone has had these kinds of problems, and
> >>> if/what they set their JAVA_OPTS to in crontab.
> >>>
> >>> The long term plan of course is to move to a machine with more memory
> >>> (currently 4GB).
> >>>
> >>> Thanks,
> >>>
> >>> DSpace version is 3.1, OS is Ubuntu 12.04.
> >>>
> >>> --
> >>> Alan Orth
> >>> alan.o...@gmail.com
> >>> http://alaninkenya.org
> >>> http://mjanja.co.ke
> >>> "I have always wished for my computer to be as easy to use as my
> >>> telephone; my wish has come true because I can no longer figure out how
> >>> to use my telephone." -Bjarne Stroustrup, inventor of C++
> >>> GPG public key ID: 0x8cb0d0acb5cd81ec209c6cdfbd1a0e09c2f836c0
> >>>
> >>>
> >>>
> >>>
> >>> _______________________________________________
> >>> DSpace-tech mailing list
> >>> DSpace-tech@lists.sourceforge.net
> >>> https://lists.sourceforge.net/lists/listinfo/dspace-tech
> >>> List Etiquette:
> >>> https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette
> >>
> >>
> >>
> >>
> >> --
> >> Terry Brady
> >> Applications Programmer Analyst
> >> Georgetown University Library Information Technology
> >> https://www.library.georgetown.edu/lit/code
> >> 425-298-5498
> >>
> >>
>
>
> --
> Alan Orth
> alan.o...@gmail.com
> http://alaninkenya.org
> http://mjanja.co.ke
> "I have always wished for my computer to be as easy to use as my
> telephone; my wish has come true because I can no longer figure out how
> to use my telephone." -Bjarne Stroustrup, inventor of C++
> GPG public key ID: 0x8cb0d0acb5cd81ec209c6cdfbd1a0e09c2f836c0