[
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114920#comment-14114920
]
Mikhail Antonov commented on HBASE-11165:
-----------------------------------------
bq. That is the issue where client takes a quorum of masters instead of a
quorum of zks, right? I was extrapolating that the endpoint of this issue is a
quorum of masters where we could read from any (or at least some reads could be
stale...) as another means of scaling out the read i/o when master and meta
colocated.
Yes, that's this one. It should help with scaling read I/O (which, I believe,
is the vast majority of meta access calls). By the way, could somebody estimate
the distribution of reads vs. writes to the meta table, in terms of requests per
second, network traffic, and disk access? Are there metrics for that?
bq. This issue seems to be arriving at single master to serve millions of regions
I believe there are 2 dimensions here, right?
- For multi-masters (replicated master), the objective, I believe, is to 1)
maintain up-to-date in-memory state on >1 master, thus avoiding the startup cost
for the second master (that's actually why I solicited estimates of how long a
full restart takes now on that big cluster), and 2) scale out reads by
serving them out of copies located on different machines. But multi-masters
do not solve the problem when meta doesn't fit in memory, or when writes need to
scale better.
- Partitioned masters, in turn, address this second aspect: fitting meta in
memory and scaling writes.
Does that look like an accurate summary to you? A viable combined solution might
be a multi-master setup where masters serve a split meta and are grouped by meta
region replicas (e.g., HM1 and HM2 serve replicas of metaRegion1, metaRegion2,
and metaRegion3, while HM3 and HM4 serve replicas of metaRegion4 and
metaRegion5). With masters being region servers now, might having many masters
in the cluster be just the right direction to go?
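To make the grouped layout above concrete, here is a minimal sketch of how
requests could be routed under it. This is not HBase code: the function names,
the round-robin read policy, and the rule that the first master in a group
takes writes are all assumptions made for illustration. Only the HM1..HM4 and
metaRegion1..metaRegion5 grouping comes from the example above.

```python
# Hypothetical layout from the example: each group of masters serves
# replicas of a fixed subset of meta regions.
MASTER_GROUPS = {
    ("HM1", "HM2"): ["metaRegion1", "metaRegion2", "metaRegion3"],
    ("HM3", "HM4"): ["metaRegion4", "metaRegion5"],
}


def build_replica_index(groups):
    """Invert the layout into: meta region -> masters holding a replica."""
    index = {}
    for masters, regions in groups.items():
        for region in regions:
            index[region] = list(masters)
    return index


def pick_master_for_read(index, region, request_id):
    """Spread reads across a region's replicas (round-robin by request id)."""
    replicas = index[region]
    return replicas[request_id % len(replicas)]


def pick_master_for_write(index, region):
    """Assumption: the first master in the group acts as primary for writes."""
    return index[region][0]


index = build_replica_index(MASTER_GROUPS)
```

In this sketch, replication within a group is what scales reads, while the
split across groups is what keeps each master's share of meta small and
spreads the writes.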
bq. In fact I don't even think a client can connect to the cluster currently if
master is down which makes a bit of a farce of the above notion and needs
fixing.
That probably is an argument for multi-master layout too :)
bq. Let me look at HBASE-7767
That would be great. I also did a first pass reviewing the patch on the review
board and plan to get back to it for a closer look.
> Scaling so cluster can host 1M regions and beyond (50M regions?)
> ----------------------------------------------------------------
>
> Key: HBASE-11165
> URL: https://issues.apache.org/jira/browse/HBASE-11165
> Project: HBase
> Issue Type: Brainstorming
> Reporter: stack
> Attachments: HBASE-11165.zip, Region Scalability test.pdf,
> zk_less_assignment_comparison_2.pdf
>
>
> This discussion issue comes out of "Co-locate Meta And Master HBASE-10569"
> and comments on the doc posted there.
> A user -- our Francis Liu -- needs to be able to scale a cluster to do 1M
> regions maybe even 50M later. This issue is about discussing how we will do
> that (or if not 50M on a cluster, how otherwise we can attain same end).
> More detail to follow.