[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114633#comment-14114633
 ] 

Mikhail Antonov commented on HBASE-11165:
-----------------------------------------

bq. Please pile on all with thoughts. We need to put stake in grounds soon for 
hbase 2.0 cluster topology.

My two humble cents:

 - I thought the primary requirement for splittable meta is not really
read/write throughput, but rather that on a large enough cluster with 1M+
regions, meta may simply not fit in the master's memory (or the JVM would
have a hard time keeping a process that consumes that much memory healthy)?
That would only get worse if we take any steps toward keeping more metadata
in the meta table. [~enis] I believe you brought up before other things we
may want to add to the meta table, which would inflate its size? Couldn't
find that jira/thread. I would think that if we want to keep regions as the
unit of assignment/recovery, splittable meta is a must (so far I haven't
seen an approach describing how to avoid it); see the rough arithmetic
right below.
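
To make the sizing point concrete, here's a back-of-envelope sketch (the
~1 KB/region row size is my assumption, not a measurement; a meta row
carries info:regioninfo, info:server, info:serverstartcode and friends):

{code:java}
// Back-of-envelope only; ~1 KB per meta row is an assumption.
public class MetaSizeEstimate {
  public static void main(String[] args) {
    long bytesPerRegion = 1024L;   // assumed avg row: info:regioninfo etc.
    for (long regions : new long[] { 1_000_000L, 50_000_000L }) {
      double rawGb = regions * (double) bytesPerRegion / (1L << 30);
      // in-heap representation is easily 2-4x the raw KV bytes:
      System.out.printf("%,d regions: ~%.0f GB raw, ~%.0f-%.0f GB on heap%n",
          regions, rawGb, 2 * rawGb, 4 * rawGb);
    }
  }
}
{code}

Even if the per-row guess is off by 2x either way, at 50M regions that's
tens to hundreds of GB of heap just for meta, which is past what a single
JVM handles gracefully today.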

bq. ..and until we have HBASE-10295 "Refactor the replication implementation to 
eliminate permanent zk node" and/or HBASE-11467 "New impl of Registry interface 
not using ZK + new RPCs on master protocol" (Maybe a later phase of HBASE-10070 
when followers can run closer in to the leader state would work here) or a new 
master layout where we partition meta across multiple master server.
Unless I've missed some recent developments, HBASE-10070 is about region
replicas, while HBASE-11467 is about a ZK-less client (the patch there is
growing big enough to provide a fully ZK-less client; it's absorbing other
subtasks :) ). It may be worth reiterating that the ZK-less client is a
prerequisite for, or a component of, the multi-master approach we're working
on now, but it would work fine with the current single-active/many-backup
masters schema as well. A hypothetical sketch of the registry shape follows.
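
For anyone skimming, here's what a ZK-less Registry could look like (names
and signatures are mine, illustrative only -- not the actual HBASE-11467
patch): the client bootstraps from a configured list of master endpoints and
fetches over master-protocol RPCs what it reads from ZK today:

{code:java}
import java.io.IOException;
import java.util.List;

// Illustrative only -- not the HBASE-11467 patch; all names are made up.
interface ClusterRegistry {
  String getClusterId() throws IOException;           // today: /hbase/hbaseid
  String getActiveMasterAddress() throws IOException;  // today: /hbase/master
  String getMetaRegionLocation() throws IOException;   // today: meta-region-server
}

// Backed by new RPCs on the master protocol instead of ZK reads; the client
// needs only a bootstrap list of master host:port pairs in its config, so
// client configs carry no ZK quorum at all.
class MasterBackedRegistry implements ClusterRegistry {
  private final List<String> masterEndpoints;

  MasterBackedRegistry(List<String> masterEndpoints) {
    this.masterEndpoints = masterEndpoints;
  }

  public String getClusterId() throws IOException {
    return rpcToAnyMaster("GetClusterId");
  }
  public String getActiveMasterAddress() throws IOException {
    return rpcToAnyMaster("GetActiveMaster");
  }
  public String getMetaRegionLocation() throws IOException {
    return rpcToAnyMaster("GetMetaLocation");
  }

  // Stand-in for the real RPC stub: would try each configured master in turn.
  private String rpcToAnyMaster(String method) throws IOException {
    throw new IOException("RPC layer elided in this sketch: " + method);
  }
}
{code}

If HBASE-10070-style followers run close to leader state, such reads could
even be served by backup masters, which is where this dovetails with the
multi-master work.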
I'm thinking that multi-masters and partitioned-masters (if we go with these
approaches) need to be discussed closely together, each designed with the
other in mind; otherwise it'd be really hard to merge them later on.

Also, on this:
bq. A plus split meta has over colocated master and meta is that master 
currently can be down for some period of time and the cluster keeps working; no 
splits and no merges and if a machine crashes while master is down, data is 
offline till master comes back (needs more exercise). This is less the case 
when colocated master and meta.
I'd be curious to hear more opinions/assessments on how bad it is when the
master is down, and what downtime various people would consider "generally
ok", "kind of long, really want it to be faster", and "unacceptably long".

{quote}So far it seems to me the driving requirements are:
+ scale
+ high availability
+ stop using zookeeper completely/for persistence
{quote}
Yeah, I think those are exactly the points, and they should be discussed
together. Besides scale, HA here probably consists of two parts: HA for data
via region replicas (read-only and, later, read-write), and improved HA for
the master. Improved master HA (multi-master) is being researched/worked on
now.

On "stop using ZK completely" there are general changes here coming along (like 
see HBASE-7767, on stopping using ZK for keeping table state.. a patch from 
[~octo47] is there ready for reviews), and proposed changes on client side to 
make hbase client non-dependent on ZK (that's HBASE-11467 [~stack] mentioned 
above, and that's what would be complementary to multi-master work).
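
As a rough illustration of the table-state half (a hypothetical snippet; the
exact storage the HBASE-7767 patch lands on may differ -- the direction is an
HBase-owned store, e.g. a column in the meta table, instead of a znode):

{code:java}
import java.io.IOException;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class TableStateFromMeta {
  // Hypothetical: read a table's state from a meta column rather than a ZK
  // znode. The table:state column names are assumptions for illustration.
  public static byte[] getState(Connection conn, TableName table)
      throws IOException {
    try (Table meta = conn.getTable(TableName.META_TABLE_NAME)) {
      Get get = new Get(table.getName())          // row key = the table name
          .addColumn(Bytes.toBytes("table"), Bytes.toBytes("state"));
      Result r = meta.get(get);
      return r.getValue(Bytes.toBytes("table"), Bytes.toBytes("state"));
    }
  }
}
{code}

Once state lives somewhere like that, the ZKTable watcher machinery can go
away and table state gets the same persistence/recovery story as the rest of
meta.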

> Scaling so cluster can host 1M regions and beyond (50M regions?)
> ----------------------------------------------------------------
>
>                 Key: HBASE-11165
>                 URL: https://issues.apache.org/jira/browse/HBASE-11165
>             Project: HBase
>          Issue Type: Brainstorming
>            Reporter: stack
>         Attachments: HBASE-11165.zip, Region Scalability test.pdf, 
> zk_less_assignment_comparison_2.pdf
>
>
> This discussion issue comes out of "Co-locate Meta And Master HBASE-10569" 
> and comments on the doc posted there.
> A user -- our Francis Liu -- needs to be able to scale a cluster to do 1M 
> regions maybe even 50M later.  This issue is about discussing how we will do 
> that (or if not 50M on a cluster, how otherwise we can attain same end).
> More detail to follow.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
