Is there a way to disable splitting (on a particular region server)?

On Thu, Jan 27, 2011 at 4:20 PM, Jean-Daniel Cryans <[email protected]> wrote:
> Mmm yes for the sake of not having a single region that moved, but it
> wouldn't be so bad... it just means that those regions will be closed
> when the RS closes.
>
> Also it's possible to have splits during that time, again it's not
> dramatic as long as the script doesn't freak out because a region is
> gone.
>
> J-D
>
> On Thu, Jan 27, 2011 at 4:13 PM, Ted Yu <[email protected]> wrote:
> > Should steps 1 and 2 below be exchanged?
> >
> > Regards
> >
> > On Thu, Jan 27, 2011 at 3:53 PM, Jean-Daniel Cryans <[email protected]> wrote:
> >
> >> To mitigate heap fragmentation, you could consider adding more nodes
> >> to the cluster :)
> >>
> >> Regarding rolling restarts, currently there's one major issue:
> >> https://issues.apache.org/jira/browse/HBASE-3441
> >>
> >> How it currently works is a bit dumb: when you cleanly close a region
> >> server it will first close all incoming connections and then will
> >> proceed to close the regions, and it's not until it's fully done that
> >> it will report to the master. What it means for your clients is that a
> >> portion of the regions will become unavailable for some time until the
> >> region server is done shutting down. How long, you ask? Well it depends
> >> on 1) how many regions you have but also mostly 2) how much data needs
> >> to be flushed from the MemStores. On one of our clusters, shutting
> >> down HBase takes a few minutes since our write pattern is almost
> >> perfectly distributed, meaning that all the memstore space is always
> >> full from all the regions (luckily it's a cluster that serves only
> >> mapreduce jobs).
> >>
> >> Writing this gives me an idea... I think one "easy" way we could
> >> solve this region draining problem is by writing a jruby script
> >> that:
> >>
> >> 1- Retrieves the list of regions served by a RS
> >> 2- Disables master balancing
> >> 3- Moves one by one every region out of the RS, assigning them to the
> >> other RSs in a round-robin fashion
> >> 4- Shuts down the RS
> >> 5- Reenables master balancing
> >>
> >> I wonder if it would work... At least it's a process that you could
> >> stop at any time without breaking everything.
> >>
> >> J-D
> >>
> >> On Thu, Jan 27, 2011 at 11:38 AM, Wayne <[email protected]> wrote:
> >> > I assumed GC was *trying* to roll. It shows the last 30min of logs with
> >> > control characters at the end.
> >> >
> >> > We are not all writes. In terms of writes we can wait and the zookeeper
> >> > timeout can go way up, but we also need to support real-time reads (end
> >> > user based) and that is why the zookeeper timeout is not our first choice
> >> > to increase (we would rather decrease it). The funny part is that .90
> >> > seems faster for us and churns through writes at a faster clip, thereby
> >> > probably becoming less stable sooner due to the JVM not being able to
> >> > handle it. Should we schedule a rolling restart every 24 hours? How do
> >> > production systems accept volume writes through the front door without
> >> > melting the JVM due to fragmentation? We can possibly switch to bulk
> >> > writes but performance is not our problem...stability is. We are pushing
> >> > 40k writes/node/sec sustained with well balanced regions hour after hour
> >> > day after day (until a zookeeper tear down).
> >> >
> >> > Great to hear it is actively being looked at. I will keep an eye on
> >> > #3455.
> >> >
> >> > Below are our GC options, many of which are from work with the other java
> >> > database. Should I go back to the default settings? Should I use those
> >> > referenced in the Jira #3455 (-XX:+UseConcMarkSweepGC
> >> > -XX:CMSInitiatingOccupancyFraction=65 -Xms8g -Xmx8g)? We are also using
> >> > Java6u23.
> >> >
> >> >
> >> > export HBASE_HEAPSIZE=8192
> >> > export HBASE_OPTS="-XX:+UseCMSInitiatingOccupancyOnly
> >> > -XX:CMSInitiatingOccupancyFraction=60 -XX:+CMSParallelRemarkEnabled
> >> > -XX:SurvivorRatio=8 -XX:NewRatio=3 -XX:MaxTenuringThreshold=1
> >> > -XX:+HeapDumpOnOutOfMemoryError -XX:+UseConcMarkSweepGC
> >> > -XX:+CMSIncrementalMode"
> >> > export HBASE_OPTS="$HBASE_OPTS -verbose:gc -XX:+PrintGCDetails
> >> > -XX:+PrintGCDateStamps -Xloggc:$HBASE_HOME/logs/gc-hbase.log"
> >> >
> >> >
> >> > Thanks for your help!
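
For reference, the draining procedure quoted above (steps 1-5) could be sketched roughly as follows. This is only a Python model of the control flow, not the JRuby script J-D has in mind: every name here (`plan_drain`, `drain`, `RegionGone`, and the four callables) is hypothetical, and the real calls to list regions, switch the balancer, move a region, and stop the RS would have to come from the HBase admin interface, which is not shown. It disables balancing before listing regions, as Ted suggests, and skips regions that vanish (for example because of a split) instead of aborting.

```python
from itertools import cycle
from typing import Callable, Iterable, List, Tuple


class RegionGone(Exception):
    """Hypothetical error: the region no longer exists (e.g. it split)."""


def plan_drain(regions: Iterable[str], targets: List[str]) -> List[Tuple[str, str]]:
    """Pair each region with a destination server in round-robin order."""
    if not targets:
        raise ValueError("need at least one destination region server")
    return list(zip(regions, cycle(targets)))


def drain(
    list_regions: Callable[[], List[str]],
    targets: List[str],
    set_balancer: Callable[[bool], None],
    move_region: Callable[[str, str], None],
    stop_server: Callable[[], None],
) -> List[str]:
    """Drain one region server; returns the regions that were skipped."""
    skipped: List[str] = []
    set_balancer(False)                      # disable master balancing first
    try:
        regions = list_regions()             # regions served by the RS
        for region, dest in plan_drain(regions, targets):
            try:
                move_region(region, dest)    # move one region at a time
            except RegionGone:
                skipped.append(region)       # tolerate splits, keep going
        stop_server()                        # shut down the now-empty RS
    finally:
        set_balancer(True)                   # always re-enable balancing
    return skipped
```

Because the balancer is re-enabled in a `finally` block, interrupting the loop partway leaves the cluster in a usable state, which matches the "stop at any time without breaking everything" property mentioned in the thread.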
