Re: Slow sync cost

2016-04-27 Thread Kevin Bowling
Even G1GC will have a 100ms pause time which would trigger this warning. Are there any real production clusters that don't constantly trigger this warning? What was the thought process behind 100ms? When you go through multiple JVMs that could be doing GCs over a network, 100ms is not a long time!

Re: Slow sync cost

2016-04-27 Thread Ted Yu
There might be a typo: bq. After the Evacuation phase, Eden and Survivor To are devoid of live data and reclaimed. From the graph below it, it seems Survivor From is reclaimed, not Survivor To. FYI On Wed, Apr 27, 2016 at 7:39 AM, Bryan Beaudreault wrote: > We

Question on writing scan coprocessors

2016-04-27 Thread James Johansville
Hello, I'd like to write a similar coprocessor to the example RegionObserverExample at http://www.3pillarglobal.com/insights/hbase-coprocessors : that is, a scan coprocessor which intercepts and selectively filters scan results. My problem is, I need to be able to filter out Results based on a

Re: Slow sync cost

2016-04-27 Thread Saad Mufti
Thanks, that is a lot of useful information. I have a lot of things to look at now in my cluster and API clients. Cheers. Saad On Wed, Apr 27, 2016 at 3:28 PM, Bryan Beaudreault wrote: > We turned off auto-splitting by setting our region sizes to very large >

Re: HBase Write Performance Under Auto-Split

2016-04-27 Thread Saad Mufti
Thanks for the feedback. We already disabled automatic major compaction, looks like we have to do the same for auto-splitting. Saad On Wed, Apr 27, 2016 at 3:26 PM, Vladimir Rodionov wrote: > Every split results in major compactions for both daughter regions. >

Re: Slow sync cost

2016-04-27 Thread Bryan Beaudreault
We turned off auto-splitting by setting our region sizes to very large (100gb). We split them manually when they become too unwieldy from a compaction POV. We do use BufferedMutators in a number of places. They are pretty straightforward, and definitely improve performance. The only lessons
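For reference, a minimal sketch of the region-size setting described in this message, assuming it is applied cluster-wide in hbase-site.xml and that "100gb" means 100 * 1024^3 bytes. The message does not say where or how the value was actually set, and whether a large maximum alone suppresses splits depends on the split policy in use.

```xml
<!-- hbase-site.xml fragment: illustrative sketch only, not the poster's actual config -->
<property>
  <name>hbase.hregion.max.filesize</name>
  <!-- assumed reading of "100gb": 100 * 1024^3 bytes -->
  <value>107374182400</value>
</property>
```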

Re: HBase Write Performance Under Auto-Split

2016-04-27 Thread Vladimir Rodionov
Every split results in major compactions for both daughter regions. Concurrent major compactions across a cluster are bad. I recommend setting DisabledRegionSplitPolicy on your table(s) and running splits manually - you will have control over what is split and when. The same is true for major
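For reference, a minimal sketch of how the two settings discussed in this thread might look if applied cluster-wide in hbase-site.xml. This is an assumption about deployment style: the message recommends setting the split policy per table, which this fragment does not do.

```xml
<!-- hbase-site.xml fragment: illustrative sketch only, not a tuned configuration -->
<property>
  <!-- default split policy for all tables; per-table settings can override it -->
  <name>hbase.regionserver.region.split.policy</name>
  <value>org.apache.hadoop.hbase.regionserver.DisabledRegionSplitPolicy</value>
</property>
<property>
  <!-- 0 turns off the periodic (time-based) major compaction schedule -->
  <name>hbase.hregion.majorcompaction</name>
  <value>0</value>
</property>
```

With these values, regions are split only on manual request and major compactions are no longer scheduled on a timer, so both have to be run by an operator or an external job.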

Re: Slow sync cost

2016-04-27 Thread Saad Mufti
Hi Bryan, At HubSpot do you use a single shared (per-JVM) BufferedMutator anywhere in an attempt to get better performance? Any lessons learned from any attempts? Has it hurt or helped? Also do you have any experience with write performance in conjunction with auto-splitting activity kicking in,

Re: Slow sync cost

2016-04-27 Thread Bryan Beaudreault
Hey Ted, Actually, gc_log_visualizer is open-sourced, I will ask the author to update the post with links: https://github.com/HubSpot/gc_log_visualizer The author was taking a foundational approach with this blog post. We do use ParallelGC for backend non-API deployables, such as kafka consumers

Re: Slow sync cost

2016-04-27 Thread Ted Yu
Bryan: w.r.t. gc_log_visualizer, is there a plan to open source it? bq. while backend throughput will be better/cheaper with ParallelGC. Does the above mean that hbase servers are still using ParallelGC? Thanks On Wed, Apr 27, 2016 at 7:39 AM, Bryan Beaudreault

Re: [ANNOUNCE] PhoenixCon 2016 on Wed, May 25th 9am-1pm

2016-04-27 Thread anil gupta
Cool, Thanks. Let me send the talk proposal to higher management. On Wed, Apr 27, 2016 at 8:16 AM, James Taylor wrote: > Yes, that sounds great - please let me know when I can add you to the > agenda. > > James > > On Tuesday, April 26, 2016, Anil Gupta

Re: Slow sync cost

2016-04-27 Thread Saad Mufti
Thanks, looks like an interesting read, will go try to absorb all the information. Saad On Wed, Apr 27, 2016 at 10:39 AM, Bryan Beaudreault < bbeaudrea...@hubspot.com> wrote: > We have 6 production clusters and all of them are tuned differently, so I'm > not sure there is a setting I

HBase Write Performance Under Auto-Split

2016-04-27 Thread Saad Mufti
Hi, Does anyone have experience with HBase write performance under auto-split conditions? Our keyspace is randomized so all regions roughly start auto-splitting around the same time, although early on when we had the 1024 regions we started with, they all decided to do so within an hour or so and

RE: Append Visibility Labels?

2016-04-27 Thread benedict.whittamsmith
> > Good to know that we haven't killed off the old products! But I'm not sure > > the archaeological approach would scale. > I'm curious if it would get you over your current hump. Yes, I think this will see us through the proof-of-concept work and early performance testing. In fact, from

Re: [ANNOUNCE] PhoenixCon 2016 on Wed, May 25th 9am-1pm

2016-04-27 Thread James Taylor
Yes, that sounds great - please let me know when I can add you to the agenda. James On Tuesday, April 26, 2016, Anil Gupta wrote: > Hi James, > I spoke to my manager and he is fine with the idea of giving the talk. > Now, he is gonna ask higher management for final

Re: Slow sync cost

2016-04-27 Thread Bryan Beaudreault
We have 6 production clusters and all of them are tuned differently, so I'm not sure there is a setting I could easily give you. It really depends on the usage. One of our devs wrote a blog post on G1GC fundamentals recently. It's rather long, but could be worth a read: