[
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14102555#comment-14102555
]
Francis Liu commented on HBASE-11165:
-------------------------------------
{quote}
Can I have some pointers on how to read the above. Zk-less AM is better because
you scan a table – you don't have to ls znodes? What is the 1M znodes vs 1M
rows about in above?
{quote}
Essentially the APIs are better. E.g. with 1M rows we can iterate over the rows
with a scan instead of doing an ls and getting back one huge chunk of data.
Likewise, deleting 1M znodes takes too long, whereas deletes against an HBase
table could be parallelized.
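A rough sketch of the difference, in Python, with a plain dict standing in for
the store. This is not the HBase or ZooKeeper client API; the names
(scan_in_batches, parallel_delete) and the batch/worker sizes are made up for
illustration only.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def scan_in_batches(keys, batch_size):
    """Scan-style iteration: yield bounded batches instead of one huge list."""
    it = iter(keys)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def parallel_delete(store, batch_size=1000, workers=4):
    """Delete every key in `store`, one batch per task, across a thread pool."""
    keys = sorted(store)  # snapshot, so workers never mutate what we iterate

    def delete_batch(batch):
        for key in batch:
            store.pop(key, None)
        return len(batch)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(delete_batch, scan_in_batches(keys, batch_size)))


if __name__ == "__main__":
    # Hypothetical stand-in for region-state rows; sizes are arbitrary.
    store = {f"region-{i:05d}": b"state" for i in range(10_000)}
    print(parallel_delete(store), "rows deleted;", len(store), "left")
```

The point is only that each request stays bounded by the batch size and the
batches are independent, so the work can be spread over clients or threads.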
For 2.a, the response is below. For 2.b, it's mainly a concern whether we'll hit
other ZK issues when there are that many child znodes (1M and beyond). The HDFS
guys are already looking into scaling the number of child directories for the NN.
Will update the doc.
{quote}
Francis Liu Is the above the basis for your "...As our experiments shows
splitting is a must for scaling."? If split meta, then more read/write
throughput?
{quote}
If we split meta, then: 1) Less write amplification (i.e. no large compactions),
so better write throughput. 2) More disks, so more R/W throughput. 3) More heap
to fit meta, so better read throughput.
{quote}
Because the meta table could be served by many machines so field more
reads/writes? The reads/writes are needed at starttime or during cluster
lifetime in your judgement? Thanks.
{quote}
Yep, needed for startup. For the cluster-lifetime case we still need to run
experiments for 1-rack and 2-rack failures. Large compactions would creep up on
you though, so splitting would still be worthwhile for cluster lifetime IMHO.
> Scaling so cluster can host 1M regions and beyond (50M regions?)
> ----------------------------------------------------------------
>
> Key: HBASE-11165
> URL: https://issues.apache.org/jira/browse/HBASE-11165
> Project: HBase
> Issue Type: Brainstorming
> Reporter: stack
> Attachments: HBASE-11165.zip, Region Scalability test.pdf,
> zk_less_assignment_comparison_2.pdf
>
>
> This discussion issue comes out of "Co-locate Meta And Master HBASE-10569"
> and comments on the doc posted there.
> A user -- our Francis Liu -- needs to be able to scale a cluster to do 1M
> regions maybe even 50M later. This issue is about discussing how we will do
> that (or if not 50M on a cluster, how otherwise we can attain same end).
> More detail to follow.