[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

Francis Liu (JIRA) Tue, 19 Aug 2014 11:02:18 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14102555#comment-14102555
 ]


Francis Liu commented on HBASE-11165:
-------------------------------------

{quote}
Can I have some pointers on how to read the above. Zk-less AM is better because 
you scan a table – you don't have to ls znodes? What is the 1M znodes vs 1M 
rows about in above?
{quote}
Essentially the APIs are better. E.g. with 1M rows we can iterate over the rows 
instead of doing an ls and getting back one huge chunk of data. Also, deleting 
1M znodes takes too long; against an HBase table the deletes could be 
parallelized.
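
To make the API difference concrete, here is a toy model in Python. It is not 
HBase or ZooKeeper client code: the in-memory dict and the names `list_all`, 
`paged_scan` and `parallel_delete` are hypothetical stand-ins. It only contrasts 
one listing call that returns every key at once with a paged scan whose pages 
can be deleted in parallel.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def list_all(store):
    """Models a znode-style `ls`: one call, every key returned at once."""
    return sorted(store)


def paged_scan(store, page_size):
    """Models a table scan: yields keys in sorted order, one page at a time."""
    keys = iter(sorted(store))
    while True:
        page = list(islice(keys, page_size))
        if not page:
            return
        yield page


def parallel_delete(store, page_size=1000, workers=4):
    """Deletes every key, handing each page to a worker thread.

    Returns the number of keys deleted.
    """
    def delete_page(page):
        for key in page:
            store.pop(key, None)
        return len(page)

    # Pages are collected up front here for simplicity; a real client
    # would submit each page as the scan produces it.
    pages = list(paged_scan(store, page_size))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(delete_page, pages))
```

With 1M keys, `list_all` has to build and return the full listing in one 
response, while `paged_scan` never hands back more than `page_size` keys at a 
time.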

For 2.a, response is below. For 2.b, it's mainly a concern whether we'll hit 
other ZK issues when having that many child znodes (1M and beyond). The HDFS 
guys are already looking into scaling the number of child directories for the 
NN.

Will update doc.

{quote}
Francis Liu Is the above the basis for your "...As our experiments shows 
splitting is a must for scaling."? If split meta, then more read/write 
throughput? 
{quote}
If meta is split, then: 1) Less write amplification (i.e. no large 
compactions), better write throughput. 2) More disks, more R/W throughput. 
3) More heap to fit meta, better read throughput.
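
A back-of-the-envelope sketch of points 1 and 3, assuming an even split. The 
1 KiB-per-row figure is an illustrative assumption, not a measurement, and 
`per_region_share` is a hypothetical helper.

```python
def per_region_share(total_rows, bytes_per_row, num_meta_regions):
    """Returns (rows, bytes) each meta region holds under an even split."""
    rows = -(-total_rows // num_meta_regions)  # ceiling division
    return rows, rows * bytes_per_row


# Assumed: 1M regions, one meta row per region, 1 KiB per row.
for n in (1, 10, 100):
    rows, size = per_region_share(1_000_000, 1024, n)
    print(f"{n:>3} meta regions: {rows:>9,} rows, {size / 2**20:8.1f} MiB each")
```

The per-region size is what has to fit in one server's heap, and it is also 
the upper bound on what a single compaction of that region rewrites.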

{quote}
Because the meta table could be served by many machines so field more 
reads/writes? The reads/writes are needed at starttime or during cluster 
lifetime in your judgement? Thanks.
{quote}
Yep, needed for startup. We need to run experiments on 1-rack and 2-rack 
failures for the cluster-lifetime case. Large compactions would creep up on 
you though, so there is still motivation to split for cluster lifetime IMHO.


> Scaling so cluster can host 1M regions and beyond (50M regions?)
> ----------------------------------------------------------------
>
>                 Key: HBASE-11165
>                 URL: https://issues.apache.org/jira/browse/HBASE-11165
>             Project: HBase
>          Issue Type: Brainstorming
>            Reporter: stack
>         Attachments: HBASE-11165.zip, Region Scalability test.pdf, 
> zk_less_assignment_comparison_2.pdf
>
>
> This discussion issue comes out of "Co-locate Meta And Master HBASE-10569" 
> and comments on the doc posted there.
> A user -- our Francis Liu -- needs to be able to scale a cluster to do 1M 
> regions maybe even 50M later.  This issue is about discussing how we will do 
> that (or if not 50M on a cluster, how otherwise we can attain same end).
> More detail to follow.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
