Bryan:
W.r.t. gc_log_visualizer, is there a plan to open source it?

bq. while backend throughput will be better/cheaper with ParallelGC.
Does the above mean that HBase servers are still using ParallelGC?

Thanks

On Wed, Apr 27, 2016 at 7:39 AM, Bryan Beaudreault <[email protected]> wrote:

> We have 6 production clusters and all of them are tuned differently, so I'm
> not sure there is a setting I could easily give you. It really depends on
> the usage. One of our devs wrote a blog post on G1GC fundamentals
> recently. It's rather long, but could be worth a read:
>
> http://product.hubspot.com/blog/g1gc-fundamentals-lessons-from-taming-garbage-collection
>
> We will also have a blog post coming out in the next week or so that talks
> specifically to tuning G1GC for HBase. I can update this thread when that's
> available.
>
> On Tue, Apr 26, 2016 at 8:08 PM Saad Mufti <[email protected]> wrote:
>
> > That is interesting. Would it be possible for you to share what GC settings
> > you ended up on that gave you the most predictable performance?
> >
> > Thanks.
> >
> > ----
> > Saad
> >
> >
> > On Tue, Apr 26, 2016 at 11:56 AM, Bryan Beaudreault <[email protected]> wrote:
> >
> > > We were seeing this for a while with our CDH5 HBase clusters too. We
> > > eventually correlated it very closely to GC pauses. Through heavily tuning
> > > our GC we were able to drastically reduce the logs, by keeping most GC's
> > > under 100ms.
> > >
> > > On Tue, Apr 26, 2016 at 6:25 AM Saad Mufti <[email protected]> wrote:
> > >
> > > > From what I can see in the source code, the default is actually even lower
> > > > at 100 ms (can be overridden with hbase.regionserver.hlog.slowsync.ms).
> > > >
> > > > ----
> > > > Saad
> > > >
> > > >
> > > > On Tue, Apr 26, 2016 at 3:13 AM, Kevin Bowling <[email protected]> wrote:
> > > >
> > > > > I see similar log spam while system has reasonable performance. Was the
> > > > > 250ms default chosen with SSDs and 10ge in mind or something?
> > > > > I guess I'm surprised a sync write several times through JVMs to 2 remote
> > > > > datanodes would be expected to consistently happen that fast.
> > > > >
> > > > > Regards,
> > > > >
> > > > > On Mon, Apr 25, 2016 at 12:18 PM, Saad Mufti <[email protected]> wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > In our large HBase cluster based on CDH 5.5 in AWS, we're constantly seeing
> > > > > > the following messages in the region server logs:
> > > > > >
> > > > > > 2016-04-25 14:02:55,178 INFO
> > > > > > org.apache.hadoop.hbase.regionserver.wal.FSHLog: Slow sync cost: 258 ms,
> > > > > > current pipeline:
> > > > > > [DatanodeInfoWithStorage[10.99.182.165:50010,DS-281d4c4f-23bd-4541-bedb-946e57a0f0fd,DISK],
> > > > > > DatanodeInfoWithStorage[10.99.182.236:50010,DS-f8e7e8c9-6fa0-446d-a6e5-122ab35b6f7c,DISK],
> > > > > > DatanodeInfoWithStorage[10.99.182.195:50010,DS-3beae344-5a4a-4759-ad79-a61beabcc09d,DISK]]
> > > > > >
> > > > > >
> > > > > > These happen regularly while HBase appears to be operating normally with
> > > > > > decent read and write performance. We do have occasional performance
> > > > > > problems when regions are auto-splitting, and at first I thought this was
> > > > > > related but now I see it happens all the time.
> > > > > >
> > > > > >
> > > > > > Can someone explain what this means really and should we be concerned? I
> > > > > > tracked down the source code that outputs it in
> > > > > >
> > > > > > hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java
> > > > > >
> > > > > > but after going through the code I think I'd need to know much more about
> > > > > > the code to glean anything from it or the associated JIRA ticket
> > > > > > https://issues.apache.org/jira/browse/HBASE-11240.
> > > > > >
> > > > > > Also, what is this "pipeline" the ticket and code talk about?
> > > > > >
> > > > > > Thanks in advance for any information and/or clarification anyone can
> > > > > > provide.
> > > > > >
> > > > > > ----
> > > > > >
> > > > > > Saad
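
For anyone who wants to check whether the "Slow sync cost" messages discussed in this thread cluster around particular datanodes, here is a minimal sketch in Python. It is not part of HBase. The function name `summarize_slow_syncs` and the `threshold_ms` parameter are made up for illustration, and it assumes each log entry sits on a single line in the region server log (the copy quoted in the thread is wrapped by the mail client).

```python
import re
from collections import Counter

# Assumption: one log entry per physical line, in the format quoted in the thread.
SLOW_SYNC = re.compile(r"Slow sync cost: (\d+) ms, current pipeline: \[(.*)\]")
# Captures the host:port at the start of each DatanodeInfoWithStorage[...] entry.
DATANODE = re.compile(r"DatanodeInfoWithStorage\[([\d.]+:\d+)")


def summarize_slow_syncs(lines, threshold_ms=100):
    """Return (list of sync costs in ms, Counter of datanode -> occurrences).

    threshold_ms is a hypothetical filter for this sketch; entries with a
    cost below it are ignored.
    """
    costs = []
    per_datanode = Counter()
    for line in lines:
        match = SLOW_SYNC.search(line)
        if not match:
            continue
        cost = int(match.group(1))
        if cost < threshold_ms:
            continue
        costs.append(cost)
        # Count every datanode that appears in the pipeline of a slow sync.
        per_datanode.update(DATANODE.findall(match.group(2)))
    return costs, per_datanode


# Sample entry taken from the original message, joined onto one line.
sample = [
    "2016-04-25 14:02:55,178 INFO org.apache.hadoop.hbase.regionserver.wal.FSHLog: "
    "Slow sync cost: 258 ms, current pipeline: "
    "[DatanodeInfoWithStorage[10.99.182.165:50010,DS-281d4c4f-23bd-4541-bedb-946e57a0f0fd,DISK], "
    "DatanodeInfoWithStorage[10.99.182.236:50010,DS-f8e7e8c9-6fa0-446d-a6e5-122ab35b6f7c,DISK], "
    "DatanodeInfoWithStorage[10.99.182.195:50010,DS-3beae344-5a4a-4759-ad79-a61beabcc09d,DISK]]",
]

costs, per_datanode = summarize_slow_syncs(sample)
print(costs)
print(per_datanode.most_common())
```

If one datanode shows up in far more slow pipelines than the others, that may point at a host-level problem. An even spread would fit the region-server-side GC pauses mentioned earlier in the thread.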
