Re: [Dspace-tech] JAVA_OPTS for cron jobs?
Peter,

Ahh, that's very interesting. I just looked up the -server flag, and it seems that on recent Sun/Oracle JVMs -server is implied on 64-bit Linux platforms [0].

It seems my problem was that the heuristics used by the OOM killer were killing Tomcat's java instead of whichever cron job (filter-media, etc.) happened to be the final straw in exhausting the server's memory. I've since re-evaluated my Tomcat's -Xmx and -Xms values and determined that there wasn't enough physical RAM to run both Tomcat's java and the background tasks, yet DSpace's control panel shows that Tomcat's java is actually underutilizing the RAM we've allocated. Reducing the allocation there made a little more room for the background tasks, and things have been stable since then.

Also, I suspect it was the checksum checker job (which runs at 3am for us) that was actually the final straw in exhausting the memory, so I've modified it to work for 1 hour each run instead of attempting to crawl the whole repository (the default):

    0 3 * * * nice -n19 /blah/dspace/bin/dspace checker -d 1h -p

Cheers,

Alan

[0] http://docs.oracle.com/javase/7/docs/technotes/guides/vm/server-class.html

On 05/28/2014 05:33 PM, Peter Dietz wrote:
> Hi Alan,
>
> At Longsight, we customize the JAVA_OPTS in dspace/bin/dspace:
> https://github.com/LongsightGroup/DSpace/blob/longsight-4_x/dspace/bin/dspace#L66
>
>     #Allow user to specify java options through JAVA_OPTS variable
>     if [ "$JAVA_OPTS" = "" ]; then
>       #Default Java to use 256MB of memory
>       JAVA_OPTS="-server -Xmx256m"
>     fi
>
> Previously, when I was at Ohio State, I had more in my JAVA_OPTS, to help with permgen issues:
> https://github.com/osulibraries/DSpace/blob/osukb/dspace/bin/dspace#L66
>
>     #Allow user to specify java options through JAVA_OPTS variable
>     if [ "$JAVA_OPTS" = "" ]; then
>       #Default Java to use 512MB of memory
>       JAVA_OPTS="-server -Xmx512m -XX:MaxPermSize=128m -XX:+CMSClassUnloadingEnabled"
>     fi
>
> By adding -server you're ensuring that Java runs in server mode, as opposed to client mode. Server has slower initial startup, but a better memory footprint, and better performance for a longer running task, as per:
> http://stackoverflow.com/questions/198577/real-differences-between-java-server-and-java-client
>
> Then, if one of our clients has some jumbo-sized content that just isn't completing the cron jobs, we'll temporarily bump the Xmx memory limit high, such as 4G.
>
> Peter Dietz
> Longsight
> www.longsight.com
> pe...@longsight.com
> p: 740-599-5005 x809
>
> On Tue, May 27, 2014 at 7:03 PM, Terry Brady tw...@georgetown.edu wrote:
> > Alan,
> >
> > We override JAVA_OPTS for the nightly filter-media task in our cron.
> >
> >     export JAVA_OPTS=-Xmx1200m;dspace filter-media ...
> >
> > We have a set of automated ingest tools. We set JAVA_OPTS in some of the workflows that are run by those tools.
> > https://github.com/Georgetown-University-Libraries/batch-tools/blob/master/bin-src/dspaceBatch.sh
> >
> > Terry
> >
> > On Tue, May 20, 2014 at 1:33 AM, Alan Orth alan.o...@gmail.com wrote:
> > > Hi,
> > >
> > > I'm curious if anyone sets memory limits for DSpace's various cron jobs? Lately we've been having Tomcat's java process get killed every morning around the same time, and all dmesg shows is that java was killed by the kernel's OOM killer. Catalina logs don't show any SEVERE errors, so I have to assume it's the cron jobs which are using up loads of memory and then confusing the kernel, which then identifies Tomcat's java as the memory hog and kills it.
> > >
> > > So I'm just curious if anyone has had these kinds of problems, and if/what they set their JAVA_OPTS to in crontab. The long term plan of course is to move to a machine with more memory (currently 4GB).
> > >
> > > Thanks,
> > >
> > > DSpace version is 3.1, OS is Ubuntu 12.04.
> > >
> > > --
> > > Alan Orth
> > > alan.o...@gmail.com
> > > http://alaninkenya.org
> > > http://mjanja.co.ke
> >
> > --
> > Terry Brady
> > Applications Programmer Analyst
> > Georgetown University Library Information Technology
> > https://www.library.georgetown.edu/lit/code
> > 425-298-5498
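The memory reasoning in this message comes down to simple arithmetic: if the heap ceilings of every JVM that can run at the same time add up to more than the machine can spare, the OOM killer will eventually pick a victim. Here is a minimal sketch of that check. The helper name `heap_budget_ok`, the 1GB reserve, and the 2GB Tomcat heap are illustrative assumptions, not values from the thread; only the 4GB total and the 1200MB job heap appear above. A JVM also uses memory beyond its -Xmx value (permgen, thread stacks, native buffers), so the reserve should be generous.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the combined -Xmx ceilings of all JVMs
# that may run at once fit into physical RAM minus a reserve.
# All values are in megabytes and purely illustrative.
heap_budget_ok() {
    total_mb=$1      # physical RAM, e.g. 4096 for a 4GB server
    reserve_mb=$2    # assumed headroom for the OS and non-heap JVM memory
    shift 2
    sum_mb=0
    for xmx_mb in "$@"; do       # one -Xmx value per concurrently running JVM
        sum_mb=$((sum_mb + xmx_mb))
    done
    [ "$sum_mb" -le $((total_mb - reserve_mb)) ]
}

# Example: 4GB box, 1GB reserve, Tomcat at 2048MB (assumed), one job at 1200MB.
# 2048 + 1200 = 3248MB, which exceeds the 3072MB budget.
if heap_budget_ok 4096 1024 2048 1200; then
    echo "fits"
else
    echo "overcommitted"
fi
```

Lowering the assumed Tomcat heap to 1536MB in the same call would bring the total to 2736MB and pass the check, which mirrors the "reduce Tomcat's allocation" fix described above.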
Re: [Dspace-tech] JAVA_OPTS for cron jobs?
My hammer JAVA_OPTS on our production server, for when some site has crazy big content, is to temporarily run it with:

    JAVA_OPTS="-server -Xms256m -Xmx4g -XX:MaxPermSize=256m"

We have 64GB of RAM on our boxes, so we'll survive.

Not to derail onto a tangent, but one thing I'd like to see DSpace support is some type of background processing queue. That is, newly submitted content should be queued to get: an initial checksum, a virus check, media filters to generate the thumbnail and extract full text, and indexing by Discovery. And then there are maintenance jobs: recomputing the checksum, OAI harvest, index maintenance, ...

New submissions add to the queue, and some scheduler can add maintenance tasks to the queue. This way you don't run into the issue of 3+ concurrent cron jobs because they didn't complete in time. Maybe you can even tie this in to the curation task queue system too. In the past we had a GitHub Enterprise/Firewall, and being an admin of that shows you fancy admin bells and whistles, where you can even inspect the queue. As for what happens if queue growth exceeds its throughput, we'll cross that bridge when we get there.

Peter Dietz
Longsight
www.longsight.com
pe...@longsight.com
p: 740-599-5005 x809
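Short of a real queue, the "3+ concurrent cron jobs" problem can be reduced by making the maintenance jobs share a lock, so that a job is skipped when another one is still running. The following is a minimal sketch, not something DSpace ships. The function name `run_exclusive`, the lock path, and the exit code 75 are assumptions chosen for illustration. It relies on `mkdir` being atomic.

```shell
#!/bin/sh
# Hypothetical lock location; any path writable by the cron user would do.
LOCK_DIR="${LOCK_DIR:-/tmp/dspace-maintenance.lock}"

# Run the given command only if no other wrapped job currently holds the lock.
# Returns 75 (an arbitrary "try again later" code) when the lock is taken,
# otherwise the exit status of the command itself.
run_exclusive() {
    if ! mkdir "$LOCK_DIR" 2>/dev/null; then
        echo "another maintenance job is running, skipping: $*" >&2
        return 75
    fi
    "$@"
    status=$?
    rmdir "$LOCK_DIR"    # release the lock once the job has finished
    return "$status"
}
```

A cron entry would then call a small wrapper script that sources this function and runs, for example, `run_exclusive /blah/dspace/bin/dspace filter-media`. One caveat of this design: if the job is killed outright (for instance by the OOM killer), the lock directory is left behind and has to be removed by hand. The `flock(1)` utility from util-linux avoids that, because the kernel drops the lock when the process exits.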
Re: [Dspace-tech] JAVA_OPTS for cron jobs?
Peter,

A queue would be awesome. You're absolutely right regarding the cron jobs; it's almost like you need to set a weekly reminder to go check the execution times of your DSpace maintenance cron jobs to make sure they're all completing and not running at the same time. :)

I find that I tweak everything and then we add a bunch more content, get a bunch more hits, etc., and all the timings are off again. :P

Cheers,

Alan
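Checking execution times by hand can be replaced by having each cron job record its own duration. Below is a minimal sketch, assuming a hypothetical wrapper function `timed_run` and a hypothetical log path. It relies on `date +%s`, which GNU and BSD `date` both support.

```shell
#!/bin/sh
# Hypothetical log location for job timings.
TIMING_LOG="${TIMING_LOG:-/tmp/dspace-cron-timings.log}"

# Run a command and append one line to the log:
#   <start epoch> <elapsed seconds> <exit status> <command line>
timed_run() {
    start=$(date +%s)
    "$@"
    status=$?
    end=$(date +%s)
    echo "$start $((end - start)) $status $*" >> "$TIMING_LOG"
    return "$status"
}
```

With every maintenance job wrapped this way (for example `timed_run /blah/dspace/bin/dspace checker -d 1h -p`), the start time plus the elapsed seconds of one job can be compared against the start time of the next to spot overlaps as the repository grows.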
Re: [Dspace-tech] JAVA_OPTS for cron jobs?
Alan,

We override JAVA_OPTS for the nightly filter-media task in our cron.

    export JAVA_OPTS=-Xmx1200m;dspace filter-media ...

We have a set of automated ingest tools. We set JAVA_OPTS in some of the workflows that are run by those tools.
https://github.com/Georgetown-University-Libraries/batch-tools/blob/master/bin-src/dspaceBatch.sh

Terry

--
Terry Brady
Applications Programmer Analyst
Georgetown University Library Information Technology
https://www.library.georgetown.edu/lit/code
425-298-5498
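The per-job override works because a launcher that only applies its default when JAVA_OPTS is empty will honour whatever the caller exports. Here is a minimal sketch of that interaction. `run_job` is a hypothetical stand-in for the real launcher and only prints what it would use; the 256MB default is borrowed from the wrapper snippet quoted earlier in the thread.

```shell
#!/bin/sh
# Hypothetical stand-in for the dspace launcher: use the caller's JAVA_OPTS
# when it is set and non-empty, otherwise fall back to a small default heap.
run_job() {
    echo "JAVA_OPTS=${JAVA_OPTS:--Xmx256m} args=$*"
}

unset JAVA_OPTS    # start the demonstration from a clean environment

# Heavy job: the override lives in a subshell, so it cannot leak into later jobs.
( export JAVA_OPTS="-Xmx1200m"; run_job filter-media )

# Ordinary job: no override, so the default applies.
run_job checker -d 1h -p

# In a crontab the same effect can be had per line, because cron hands the
# whole line to the shell (path and schedule here are placeholders):
#   0 2 * * * JAVA_OPTS="-Xmx1200m" /path/to/dspace/bin/dspace filter-media
```

Keeping the large heap on the one job that needs it, instead of raising the default for every job, limits how much memory the jobs can claim if several happen to overlap.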