Proc v2 can't fix the fact that it's harder to get a write into meta when going over RPC. Our attempt at QoS doesn't fix it either. As long as critical meta operations are competing with user requests, meta will be unstable.

I am absolutely confident that meta on master makes HBase lose less data. The ITBLL tests bear this out. Real-world experience bears this out.

On Apr 8, 2016 8:03 AM, "Matteo Bertozzi" <theo.berto...@gmail.com> wrote:

> # Without meta on master, we double assign and lose data.
>
> I doubt meta on master solves this problem.
> This has more to do with the fact that balancer, assignment, split, and merge
> are disjoint operations that are not aware of each other.
> Also, those operations in general consist of multiple steps, and if the master
> crashes you may end up in an inconsistent state.
>
> This is what proc-v2 should solve. Since we are aware of each operation,
> there is no chance of double assignment and similar problems, by design.
>
> The master doesn't need the full meta to operate properly;
> it just needs the "state" (at which point of the operation am I),
> which is the WAL of proc-v2. Given that, we can split meta or make meta
> remote without any problem, since we only have 1 update to meta, to
> update the location when the assignment is completed.
>
> Also, at the moment the master has a copy of the information in meta:
> a map with the RegionInfo, state and locations. But we are still doing
> a query on meta instead of using that local map directly.
> If we move meta onto master we can remove that extra copy, but that
> will tie meta and master together, making it impossible to offload meta if
> we need to.
>
> In my opinion, with the new assignment you have all the main problems solved.
> We can keep regions on master as we have now,
> so you can configure it to get more performance (avoid the remote RPC),
> but our design should allow meta to be split and to be hosted somewhere
> else.
>
> Matteo
>
> On Fri, Apr 8, 2016 at 2:08 AM, 张铎 <palomino...@gmail.com> wrote:
>
> > Agree on the performance concerns. IMO we should not hurt the performance
> > of small (maybe normal?) clusters when scaling for huge clusters.
> >
> > And I also agree that the current implementation, which allows Master to
> > carry system regions, is not good (sorry for the Chinglish...). At least, it
> > makes the master startup really complicated.
> >
> > So IMO, we should let the master process or master machine also carry
> > system regions, but in another way. Start another RS instance on the same
> > machine or in the same JVM? Or build a new storage based on the procedure
> > store and convert it to a normal table when it is too large?
> >
> > Thanks.
> >
> > 2016-04-08 16:42 GMT+08:00 Elliott Clark <ecl...@apache.org>:
> >
> > > # Without meta on master, we double assign and lose data.
> > >
> > > That is currently a fact that I have seen over and over on multiple loaded
> > > clusters. Some abstract clean up of deployment vs losing data is a
> > > no-brainer for me. Master assignment, region split, region merge are all
> > > risky, and all places that HBase can lose data. Meta being hosted on the
> > > master makes communication easier and less flaky. Running ITBLL on a loop
> > > that creates a new table every time, and without meta on master everything
> > > will fail pretty reliably in ~2 days. With meta on master things pass MUCH
> > > more.
> > >
> > > # Master hosting the system tables locates the system tables as close as
> > > possible to the machine that will be mutating the data.
> > >
> > > Data locality is something that we all work for. Short circuit local reads,
> > > caching blocks in JVM, etc. Bringing data closer to the interested party
> > > has a long history of making things faster and better. Master is in charge
> > > of just about all mutations of all system tables. It's in charge of
> > > changing meta, changing ACLs, creating new namespaces, etc. So put the
> > > memstore as close as possible to the system that's going to mutate meta.
> > >
> > > # If you want to make meta faster then moving it to other regionservers
> > > makes things worse.
> > >
> > > Meta can get pretty hot. Putting it with other regions that clients will be
> > > trying to access makes everything worse. It means that meta is competing
> > > with user requests. Either meta gets served and other requests don't,
> > > causing more requests to meta; or requests to user regions get served and
> > > other clients get starved.
> > > At FB we've seen read throughput to meta doubled or more by swapping it to
> > > master. Writes to meta are also much faster since there's no RPC hop, no
> > > queueing, no fighting with reads. So far it has been the single biggest
> > > thing to make meta faster.
> > >
> > > On Thu, Apr 7, 2016 at 10:11 PM, Stack <st...@duboce.net> wrote:
> > >
> > > > I would like to start a discussion on whether Master should be carrying
> > > > regions or not. No hurry. I see this thread going on a while and what with
> > > > 2.0 being a ways out yet, there is no need to rush to a decision.
> > > >
> > > > First, some background.
> > > >
> > > > Currently in the master branch, HMaster hosts 'system tables': e.g.
> > > > hbase:meta. HMaster is doing more than just gardening the cluster,
> > > > bootstrapping and keeping all up and serving healthy as in branch-1; in
> > > > master branch, it is actually in the write path for the most critical
> > > > system regions.
> > > >
> > > > Master is this way because HMaster and HRegionServer servers have so much
> > > > in common, they should be just one binary, w/ HMaster as any other server
> > > > with the HMaster function a minor appendage runnable by any running
> > > > HRegionServer.
> > > >
> > > > I like this idea, but the unification work was just never finished. What is
> > > > in master branch is a compromise. HMaster is not a RegionServer but a
> > > > sort-of RegionServer doing part serving.
> > > > So we have HMaster role, a new
> > > > part-RegionServer-carrying-special-regions role and then a full-on
> > > > HRegionServer role. We need to fix this messiness. We could revert to plain
> > > > branch-1 roles or carry the
> > > > HMaster-function-is-something-any-RegionServer-could-execute through to
> > > > completion.
> > > >
> > > > More background from a time long-past with good comments by the likes of
> > > > our Francis Liu and Mighty Matteo Bertozzi are here [1], on unifying master
> > > > and meta-serving. Slightly related are old discussions on being able to
> > > > scale by splitting meta with good comments by our Elliott Clark [2].
> > > >
> > > > Also for consideration, the landscape has since changed. [1] was written
> > > > before we had ProcedureV2 available to us where we could record
> > > > intermediate transition states local to the Master rather than remote as
> > > > intermediate updates to an hbase:meta over RPC running on another node.
> > > >
> > > > Enough on the background.
> > > >
> > > > Let me provoke discussion by making the statement that we should undo
> > > > HMaster carrying any regions ever; that the HMaster function is work enough
> > > > for a single dedicated server and that it is important enough that it cannot
> > > > take a background role on a serving RegionServer (I could go back from this
> > > > position if there is evidence the HMaster role could be backgrounded).
> > > > Notions of a Master carrying system tables only are just not on given system
> > > > tables will be too big for a single server especially when hbase:meta is
> > > > split (so we can scale). This simple distinction of HMaster and RegionServer
> > > > roles is also what our users know and have gotten used to, so there needs to
> > > > be a good reason to change it (we can still pursue the single binary that
> > > > can do HMaster or HRegionServer role determined at runtime).
> > > >
> > > > Thanks,
> > > > St.Ack
> > > >
> > > > 1. https://docs.google.com/document/d/1xC-bCzAAKO59Xo3XN-Cl6p-5CM_4DMoR-WpnkmYZgpw/edit#heading=h.j5yqy7n04bkn
> > > > 2. https://docs.google.com/document/d/1eCuqf7i2dkWHL0PxcE1HE1nLRQ_tCyXI4JsOB6TAk60/edit#heading=h.80vcerzbkj93