[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

Lars Hofhansl (JIRA) Fri, 16 May 2014 17:47:31 -0700

    [ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13999619#comment-13999619 ]


Lars Hofhansl commented on HBASE-11165:
---------------------------------------

I guess we'll run into other limitations before we hit META size issues. Each 
column family of each region has its own memstore. With (say) a 30 GB heap, 
128 MB memstores, and 40% of the heap used for memstores, you can host only 
96 regions per region server (that is with one column family per region and 
every memstore full). We'd need roughly 10k servers for 1M regions.
Even if we assume that on average the memstores are only 50% full, we'd still 
need about 5k servers for 1M regions.
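
The arithmetic behind those numbers can be sketched as below. This is only a 
back-of-the-envelope calculation, not HBase code: the function and parameter 
names are made up for illustration, and it assumes one column family (so one 
memstore) per region.

```python
def regions_per_server(heap_gb=30, memstore_pct=40, memstore_mb=128, fill_pct=100):
    """Regions one region server can host before memstores exhaust their heap share.

    Assumes one memstore per region (a single column family); fill_pct is the
    assumed average memstore fill level.
    """
    budget_mb = heap_gb * 1024 * memstore_pct // 100  # heap reserved for memstores
    return budget_mb * 100 // (memstore_mb * fill_pct)


def servers_needed(total_regions, per_server):
    """Ceiling division: servers required to host total_regions."""
    return -(-total_regions // per_server)


full = regions_per_server()             # all memstores full
half = regions_per_server(fill_pct=50)  # memstores 50% full on average
print(full, servers_needed(1_000_000, full))
print(half, servers_needed(1_000_000, half))
```

With the defaults this gives 96 regions per server and a bit over 10k servers 
for 1M regions, and about half that server count at 50% fill.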

Now, maybe only a few regions are being written to; in that case we need much 
less heap for the memstores.
And maybe we can make the memstores smaller (64 or 32 MB), but then we'd get 
lots of flushes and heavy write amplification.
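
A rough way to see that trade-off, again as an illustrative sketch rather 
than anything from HBase itself: it assumes a memstore is flushed only when 
it is full, reuses the 30 GB heap / 40% figures from above, and the 1 GB of 
writes per region is an arbitrary example value.

```python
MEMSTORE_BUDGET_MB = 30 * 1024 * 40 // 100  # 40% of a 30 GB heap


def flushes_for_writes(written_mb, memstore_mb):
    """Flushes needed to push written_mb through one memstore (ceiling division)."""
    return -(-written_mb // memstore_mb)


for size_mb in (128, 64, 32):
    hosted = MEMSTORE_BUDGET_MB // size_mb          # regions per server
    flushes = flushes_for_writes(1024, size_mb)     # per 1 GB written to a region
    print(f"{size_mb} MB memstore: {hosted} regions/server, {flushes} flushes per GB")
```

Halving the memstore size doubles the regions a server can hold, but also 
doubles the number of flushes (and so the number of small files that later 
have to be compacted) for the same amount of written data.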

We should also discuss why few, large regions are bad, and whether we can 
decouple the unit of distribution (a region) from whatever unit we're trying to 
operate on. Maybe a mapper per region is not good if regions can grow to 20 GB 
(assuming we can ideally read around 100 MB/s, we'd need about 3.4 minutes to 
scan through 20 GB).
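
The scan-time estimate works out as follows. The helper names are 
hypothetical, and the 60-second target per task is just an example value; 
this is arithmetic only, not a description of how HBase or MapReduce actually 
splits work.

```python
import math


def scan_seconds(region_gb, throughput_mb_per_s):
    """Ideal time to read a whole region sequentially at the given throughput."""
    return region_gb * 1024 / throughput_mb_per_s


def splits_for_target(region_gb, throughput_mb_per_s, target_seconds):
    """How many equal sub-ranges a region would need so each scan fits the target."""
    return math.ceil(scan_seconds(region_gb, throughput_mb_per_s) / target_seconds)


secs = scan_seconds(20, 100)
print(f"{secs:.1f} s ({secs / 60:.1f} min) to scan a 20 GB region at 100 MB/s")
print(splits_for_target(20, 100, 60), "sub-ranges for ~1 minute per task")
```

A 20 GB region at 100 MB/s comes out at about 205 seconds, a little under 
three and a half minutes, which is why a unit of work smaller than a region 
may be worth considering.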

> Scaling so cluster can host 1M regions and beyond (50M regions?)
> ----------------------------------------------------------------
>
>                 Key: HBASE-11165
>                 URL: https://issues.apache.org/jira/browse/HBASE-11165
>             Project: HBase
>          Issue Type: Brainstorming
>            Reporter: stack
>
> This discussion issue comes out of "Co-locate Meta And Master HBASE-10569" 
> and comments on the doc posted there.
> A user -- our Francis Liu -- needs to be able to scale a cluster to do 1M 
> regions maybe even 50M later.  This issue is about discussing how we will do 
> that (or if not 50M on a cluster, how otherwise we can attain same end).
> More detail to follow.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
