Should steps 1 and 2 below be exchanged?
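Also, to make sure I'm reading the proposal right, here is a rough and completely untested JRuby sketch of the steps J-D lists below. It assumes the client API exposes HBaseAdmin#balanceSwitch(boolean) and HBaseAdmin#move(encodedRegionName, serverName), and that region-to-server assignments can be read from the info:server column of .META.; please correct me if any of those assumptions don't hold for 0.90.

# drain_rs.rb -- rough, untested sketch of the draining steps discussed below.
# Run with:  bin/hbase org.jruby.Main drain_rs.rb <servername>
# where <servername> is the full RS name as the master reports it,
# e.g. host187.example.com,60020,1289493121758
include Java
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.HConstants
import org.apache.hadoop.hbase.client.HBaseAdmin
import org.apache.hadoop.hbase.client.HTable
import org.apache.hadoop.hbase.client.Scan
import org.apache.hadoop.hbase.util.Bytes
import org.apache.hadoop.hbase.util.Writables

target = ARGV[0] or abort "Usage: drain_rs.rb host,port,startcode"
host, port = target.split(",")
target_hostport = "#{host}:#{port}"

conf  = HBaseConfiguration.create
admin = HBaseAdmin.new(conf)

# 1- regions currently served by the target RS, read from the info:server
#    column of .META. (assumption: that column holds "host:port")
regions = []
meta    = HTable.new(conf, HConstants::META_TABLE_NAME)
scan    = Scan.new
scan.addFamily(HConstants::CATALOG_FAMILY)
scanner = meta.getScanner(scan)
while (row = scanner.next)
  server = row.getValue(HConstants::CATALOG_FAMILY, Bytes.toBytes("server"))
  next if server.nil? || Bytes.toString(server) != target_hostport
  hri = Writables.getHRegionInfo(
          row.getValue(HConstants::CATALOG_FAMILY, Bytes.toBytes("regioninfo")))
  regions << hri unless hri.nil? || hri.isOffline
end
scanner.close

# 2- disable master balancing so it doesn't fight the moves
#    (arguably this should happen before step 1 -- hence my question)
admin.balanceSwitch(false)

# 3- move every region off the target, round-robin over the other RSs
others = admin.getClusterStatus.getServerInfo.map { |si| si.getServerName }
others = others.reject { |name| name == target }
abort "No other regionservers to move regions to" if others.empty?
regions.each_with_index do |hri, i|
  admin.move(Bytes.toBytes(hri.getEncodedName), Bytes.toBytes(others[i % others.size]))
end

# 4- the RS should now be empty; stop it from that host, e.g.
#    bin/hbase-daemon.sh stop regionserver
# 5- once the node is back up (or gone for good), re-enable balancing
#    with admin.balanceSwitch(true)
puts "Moved #{regions.size} regions off #{target}; balancing left OFF -- re-enable it after the restart."

Regards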
On Thu, Jan 27, 2011 at 3:53 PM, Jean-Daniel Cryans <[email protected]> wrote:

> To mitigate heap fragmentation, you could consider adding more nodes to the cluster :)
>
> Regarding rolling restarts, there is currently one major issue:
> https://issues.apache.org/jira/browse/HBASE-3441
>
> How it currently works is a bit dumb: when you cleanly close a region server, it first closes all incoming connections, then proceeds to close the regions, and it isn't until it is fully done that it reports to the master. What this means for your clients is that a portion of the regions will be unavailable for some time, until the region server has finished shutting down. How long, you ask? Well, it depends on 1) how many regions you have, but mostly on 2) how much data needs to be flushed from the MemStores. On one of our clusters, shutting down HBase takes a few minutes, since our write pattern is almost perfectly distributed, meaning the MemStore space is always full across all the regions (luckily it's a cluster that serves only mapreduce jobs).
>
> Writing this gives me an idea... I think one "easy" way we could solve this region-draining problem is by writing a jruby script that:
>
> 1- Retrieves the list of regions served by an RS
> 2- Disables master balancing
> 3- Moves every region out of the RS one by one, assigning them to the other RSs in round-robin fashion
> 4- Shuts down the RS
> 5- Re-enables master balancing
>
> I wonder if it would work... At least it's a process that you could stop at any time without breaking everything.
>
> J-D
>
> On Thu, Jan 27, 2011 at 11:38 AM, Wayne <[email protected]> wrote:
> > I assumed GC was *trying* to roll. It shows the last 30 minutes of logs with control characters at the end.
> >
> > We are not all writes. In terms of writes we can wait, and the zookeeper timeout can go way up, but we also need to support real-time (end-user) reads, and that is why increasing the zookeeper timeout is not our first choice (we would rather decrease it). The funny part is that .90 seems faster for us and churns through writes at a faster clip, thereby probably becoming less stable sooner due to the JVM not being able to handle it. Should we schedule a rolling restart every 24 hours? How do production systems accept heavy write volumes through the front door without melting the JVM due to fragmentation? We can possibly switch to bulk writes, but performance is not our problem... stability is. We are pushing 40k writes/node/sec sustained with well-balanced regions, hour after hour, day after day (until a zookeeper tear-down).
> >
> > Great to hear it is actively being looked at. I will keep an eye on #3455.
> >
> > Below are our GC options, many of which come from work with the other java database. Should I go back to the default settings? Should I use those referenced in Jira #3455 (-XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=65 -Xms8g -Xmx8g)? We are also using Java6u23.
> >
> > export HBASE_HEAPSIZE=8192
> > export HBASE_OPTS="-XX:+UseCMSInitiatingOccupancyOnly
> > -XX:CMSInitiatingOccupancyFraction=60 -XX:+CMSParallelRemarkEnabled
> > -XX:SurvivorRatio=8 -XX:NewRatio=3 -XX:MaxTenuringThreshold=1
> > -XX:+HeapDumpOnOutOfMemoryError -XX:+UseConcMarkSweepGC
> > -XX:+CMSIncrementalMode"
> > export HBASE_OPTS="$HBASE_OPTS -verbose:gc -XX:+PrintGCDetails
> > -XX:+PrintGCDateStamps -Xloggc:$HBASE_HOME/logs/gc-hbase.log"
> >
> > Thanks for your help!
