Hi Eric,

Did your standby GC eventually fail with an OOME? I was able to reproduce the standby GC failure on my Ops cluster: it again took about 6 days of running in standby mode, this time after a cluster restart following the Thanksgiving holiday week.
Thanks,
Terry

On Wed, Dec 4, 2013 at 4:33 PM, Terry P. <[email protected]> wrote:

> Thanks for testing, Eric. I have nproc set hard to 32000 on all my nodes,
> and just double-checked that it's correct on the Secondary Namenode where
> this is happening. nofile is set hard to 64000, so it's not a
> files/sockets issue either.
>
> Please let me know if it eventually fails for you.
>
> On Wed, Dec 4, 2013 at 3:41 PM, Eric Newton <[email protected]> wrote:
>
>> I fired up a standby GC after I reduced the wait time between zookeeper
>> lock checks to 10ms, and changed the memory from 256M to 56M. It's been
>> running for the last hour. I didn't expect it to fail, but I wanted to
>> make sure it wasn't reproducible.
>>
>> It's possible that you are running up against an nproc limit. We've
>> seen out of memory issues when the JVM can't create a new thread.
>>
>> (I ran the test with 1.4.5-SNAPSHOT)
>>
>> -Eric
>>
>> On Wed, Dec 4, 2013 at 2:31 PM, Eric Newton <[email protected]> wrote:
>>
>>> I don't know if anyone is running a standby GC. Can you go ahead and
>>> open a ticket?
>>>
>>> -Eric
>>>
>>> On Wed, Dec 4, 2013 at 2:29 PM, Terry P. <[email protected]> wrote:
>>>
>>>> Greetings folks,
>>>> With Accumulo 1.4.2 I'm running Standby Master and GC processes on my
>>>> Secondary Namenode. I've found that my Standby GC gets terminated due to
>>>> OutOfMemoryError errors after about 6 days of running, even though it is
>>>> running in standby mode only. The Standby Master is still running fine
>>>> after 3 weeks of standby mode.
>>>>
>>>> My accumulo-env.sh script is using the 3GB environment default GC
>>>> memory options of -Xmx256m -Xms256m, same as on the Master where the
>>>> primary GC runs and has never gotten an OOME. At this point I don't see any
>>>> reason to try increasing that as my bet is it will only delay the OOME.
>>>>
>>>> Anyone else running standby GC and running into this (or working fine)?
