Thanks, that is a lot of useful information. I have a lot of things to look at now in my cluster and API clients.
Cheers.

----
Saad

On Wed, Apr 27, 2016 at 3:28 PM, Bryan Beaudreault <[email protected]> wrote:

> We turned off auto-splitting by setting our region sizes very large
> (100gb). We split them manually when they become too unwieldy from a
> compaction POV.
>
> We do use BufferedMutators in a number of places. They are pretty
> straightforward, and definitely improve performance. The main lesson
> learned there would be to use low buffer sizes. You'll get a lot of
> benefit from just a 1MB size, but if you want to go higher than that,
> you should aim for less than half of your G1GC region size. Anything
> larger than that is considered a humongous object, which has
> implications for garbage collection. The blog post I linked earlier
> goes into humongous objects:
>
> http://product.hubspot.com/blog/g1gc-fundamentals-lessons-from-taming-garbage-collection#HumongousObjects
>
> We've seen them be very bad for GC performance when many of them come
> in at once.
>
> So for us, most of our regionservers have 40gb+ heaps, for which we
> use 32mb G1GC regions. With 32mb G1GC regions, we aim for all buffered
> mutators to use less than 16mb buffer sizes -- we even go further and
> limit them to around 10mb just to be safe. We also do the same for
> reads -- we try to limit all scanner and multiget responses to less
> than 10mb.
>
> We've created a dashboard with our internal monitoring system which
> shows the count of requests we consider too large, for all
> applications (we have many 100s of deployed applications hitting these
> clusters). It's on the individual teams that own the applications to
> drive that count down to 0. We've also built into HBase a detention
> queue (similar to quotas), where we can put any of these applications,
> based on their username, if they are doing something that adversely
> affects the rest of the system -- for instance, if they start spamming
> a lot of too-large requests, badly filtered scans, etc.
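The buffer-sizing rule described above (stay under half the G1GC region size so flushes never allocate humongous objects, then cap near 10mb to be safe) boils down to simple arithmetic. A minimal sketch using the numbers from the thread; the helper name is ours for illustration, not HubSpot's code:

```java
public class MutatorSizing {
    // Sizing rule from the thread: stay under half the G1GC region size
    // (a single allocation at or above half a region is "humongous" for
    // G1), and cap around 10 MB to be safe.
    static long safeWriteBufferSize(long g1RegionBytes) {
        return Math.min(g1RegionBytes / 2, 10L * 1024 * 1024);
    }

    public static void main(String[] args) {
        // With the 32mb G1 regions mentioned above:
        System.out.println(safeWriteBufferSize(32L * 1024 * 1024));
    }
}
```

In client code, the resulting size would typically be passed to `BufferedMutatorParams.writeBufferSize(...)` when creating the `BufferedMutator` from a `Connection`.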
> In the detention queue, they use their own RPC handlers, which we can
> aggressively limit or reject if need be to preserve the cluster.
>
> Hope this helps.
>
> On Wed, Apr 27, 2016 at 2:54 PM Saad Mufti <[email protected]> wrote:
>
> > Hi Bryan,
> >
> > At HubSpot, do you use a single shared (per-JVM) BufferedMutator
> > anywhere in an attempt to get better performance? Any lessons learned
> > from any attempts? Has it hurt or helped?
> >
> > Also, do you have any experience with write performance when
> > auto-splitting activity kicks in, either with BufferedMutator or
> > separately with just direct Puts?
> >
> > Thanks.
> >
> > ----
> > Saad
> >
> > On Wed, Apr 27, 2016 at 2:22 PM, Bryan Beaudreault
> > <[email protected]> wrote:
> >
> > > Hey Ted,
> > >
> > > Actually, gc_log_visualizer is open sourced; I will ask the author
> > > to update the post with links:
> > > https://github.com/HubSpot/gc_log_visualizer
> > >
> > > The author was taking a foundational approach with this blog post.
> > > We do use ParallelGC for backend non-API deployables, such as Kafka
> > > consumers, long-running daemons, etc. However, we treat HBase like
> > > our APIs, in that it must serve low-latency requests. So we use
> > > G1GC for HBase.
> > >
> > > Expect another blog post from another HubSpot engineer soon, with
> > > all the details on how we approached G1GC tuning for HBase. I will
> > > update this list when it's published, and will put some pressure on
> > > that author to get it out there :)
> > >
> > > On Wed, Apr 27, 2016 at 2:01 PM Ted Yu <[email protected]> wrote:
> > >
> > > > Bryan:
> > > > w.r.t. gc_log_visualizer, is there a plan to open source it?
> > > >
> > > > bq. while backend throughput will be better/cheaper with
> > > > ParallelGC.
> > > >
> > > > Does the above mean that hbase servers are still using ParallelGC?
> > > > Thanks
> > > >
> > > > On Wed, Apr 27, 2016 at 7:39 AM, Bryan Beaudreault
> > > > <[email protected]> wrote:
> > > >
> > > > > We have 6 production clusters and all of them are tuned
> > > > > differently, so I'm not sure there is a setting I could easily
> > > > > give you. It really depends on the usage. One of our devs
> > > > > recently wrote a blog post on G1GC fundamentals. It's rather
> > > > > long, but could be worth a read:
> > > > >
> > > > > http://product.hubspot.com/blog/g1gc-fundamentals-lessons-from-taming-garbage-collection
> > > > >
> > > > > We will also have a blog post coming out in the next week or so
> > > > > that speaks specifically to tuning G1GC for HBase. I can update
> > > > > this thread when that's available.
> > > > >
> > > > > On Tue, Apr 26, 2016 at 8:08 PM Saad Mufti <[email protected]>
> > > > > wrote:
> > > > >
> > > > > > That is interesting. Would it be possible for you to share
> > > > > > what GC settings you ended up on that gave you the most
> > > > > > predictable performance?
> > > > > >
> > > > > > Thanks.
> > > > > >
> > > > > > ----
> > > > > > Saad
> > > > > >
> > > > > > On Tue, Apr 26, 2016 at 11:56 AM, Bryan Beaudreault
> > > > > > <[email protected]> wrote:
> > > > > >
> > > > > > > We were seeing this for a while with our CDH5 HBase
> > > > > > > clusters too. We eventually correlated it very closely to
> > > > > > > GC pauses. By heavily tuning our GC we were able to
> > > > > > > drastically reduce the log spam, keeping most GCs under
> > > > > > > 100ms.
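Pulling together the G1GC details scattered through the thread (40gb+ heaps, 32mb regions, most pauses under 100ms), a regionserver JVM along those lines might be started with flags shaped like the following. This is an illustrative fragment only, not HubSpot's actual settings, which their forthcoming blog post presumably details:

```
-Xms40g -Xmx40g          # fixed heap, per the 40gb+ heaps mentioned above
-XX:+UseG1GC
-XX:G1HeapRegionSize=32m # the 32mb G1 regions mentioned above
-XX:MaxGCPauseMillis=100 # pause target in line with "most GCs under 100ms"
```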
> > > > > > > On Tue, Apr 26, 2016 at 6:25 AM Saad Mufti
> > > > > > > <[email protected]> wrote:
> > > > > > >
> > > > > > > > From what I can see in the source code, the default is
> > > > > > > > actually even lower, at 100 ms (it can be overridden with
> > > > > > > > hbase.regionserver.hlog.slowsync.ms).
> > > > > > > >
> > > > > > > > ----
> > > > > > > > Saad
> > > > > > > >
> > > > > > > > On Tue, Apr 26, 2016 at 3:13 AM, Kevin Bowling
> > > > > > > > <[email protected]> wrote:
> > > > > > > >
> > > > > > > > > I see similar log spam while the system has reasonable
> > > > > > > > > performance. Was the 250ms default chosen with SSDs and
> > > > > > > > > 10GbE in mind or something? I guess I'm surprised that
> > > > > > > > > a sync that passes through several JVMs to 2 remote
> > > > > > > > > datanodes would be expected to consistently happen that
> > > > > > > > > fast.
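The threshold Saad mentions is configurable per regionserver; note that raising it only quiets the log line and does nothing to make the underlying syncs faster. A hypothetical override in hbase-site.xml (the 500 ms value here is an arbitrary example, not a recommendation):

```xml
<property>
  <name>hbase.regionserver.hlog.slowsync.ms</name>
  <!-- Log "Slow sync cost" only for WAL syncs slower than this; default 100 ms -->
  <value>500</value>
</property>
```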
> > > > > > > > > Regards,
> > > > > > > > >
> > > > > > > > > On Mon, Apr 25, 2016 at 12:18 PM, Saad Mufti
> > > > > > > > > <[email protected]> wrote:
> > > > > > > > >
> > > > > > > > > > Hi,
> > > > > > > > > >
> > > > > > > > > > In our large HBase cluster based on CDH 5.5 in AWS,
> > > > > > > > > > we're constantly seeing the following messages in the
> > > > > > > > > > region server logs:
> > > > > > > > > >
> > > > > > > > > > 2016-04-25 14:02:55,178 INFO
> > > > > > > > > > org.apache.hadoop.hbase.regionserver.wal.FSHLog: Slow
> > > > > > > > > > sync cost: 258 ms, current pipeline:
> > > > > > > > > > [DatanodeInfoWithStorage[10.99.182.165:50010,DS-281d4c4f-23bd-4541-bedb-946e57a0f0fd,DISK],
> > > > > > > > > > DatanodeInfoWithStorage[10.99.182.236:50010,DS-f8e7e8c9-6fa0-446d-a6e5-122ab35b6f7c,DISK],
> > > > > > > > > > DatanodeInfoWithStorage[10.99.182.195:50010,DS-3beae344-5a4a-4759-ad79-a61beabcc09d,DISK]]
> > > > > > > > > >
> > > > > > > > > > These happen regularly while HBase appears to be
> > > > > > > > > > operating normally, with decent read and write
> > > > > > > > > > performance. We do have occasional performance
> > > > > > > > > > problems when regions are auto-splitting, and at
> > > > > > > > > > first I thought this was related, but now I see it
> > > > > > > > > > happens all the time.
> > > > > > > > > >
> > > > > > > > > > Can someone explain what this really means, and
> > > > > > > > > > should we be concerned?
> > > > > > > > > > I tracked down the source code that outputs it, in
> > > > > > > > > >
> > > > > > > > > > hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java
> > > > > > > > > >
> > > > > > > > > > but after going through the code I think I'd need to
> > > > > > > > > > know much more about it to glean anything from it or
> > > > > > > > > > from the associated JIRA ticket
> > > > > > > > > > https://issues.apache.org/jira/browse/HBASE-11240.
> > > > > > > > > >
> > > > > > > > > > Also, what is this "pipeline" the ticket and code
> > > > > > > > > > talk about?
> > > > > > > > > >
> > > > > > > > > > Thanks in advance for any information and/or
> > > > > > > > > > clarification anyone can provide.
> > > > > > > > > >
> > > > > > > > > > ----
> > > > > > > > > > Saad
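For context on the question above: the "pipeline" in that log line is the chain of HDFS DataNodes the WAL block is being replicated through; a WAL sync has to be acknowledged down that chain before it completes. The log message itself amounts to a simple timing check around each sync. A simplified sketch of that check, not the actual FSHLog code:

```java
import java.util.concurrent.TimeUnit;

public class SlowSyncSketch {
    // Default threshold for hbase.regionserver.hlog.slowsync.ms
    static final long SLOW_SYNC_MS = 100;

    // FSHLog times each WAL sync (an HDFS flush through the DataNode
    // pipeline); if it took longer than the threshold, the "Slow sync
    // cost" message is logged along with the current pipeline.
    static boolean isSlowSync(long startNanos, long endNanos) {
        return TimeUnit.NANOSECONDS.toMillis(endNanos - startNanos) > SLOW_SYNC_MS;
    }

    public static void main(String[] args) {
        // The 258 ms sync from the log line above would trip this check.
        System.out.println(isSlowSync(0L, TimeUnit.MILLISECONDS.toNanos(258)));
    }
}
```

So the message is informational: it flags syncs slower than the threshold, which usually traces back to GC pauses on the regionserver or to slow DataNodes in the pipeline.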
