[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14343882#comment-14343882
 ] 

Lars Hofhansl commented on HBASE-11165:
---------------------------------------

Lemme step back... The fundamental conflict is that for assignment we want to 
assign in large enough chunks (i.e. a region size), but for other parts of the 
system (compactions, input splits for M/R, etc.) smaller chunks would be 
better. Right?

The unit of failure is always going to be a region server; whether we have 2000 
1GB regions or 20 100GB regions makes no difference from that angle. Assuming 
a server with 16TB of disk space or so, the granularity of assignment is also 
not at issue as long as we stay in that ballpark (i.e. 1GB - 1TB or so).
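
To put rough numbers on that, here is a back-of-the-envelope sketch. The 
function name is made up for illustration, and it assumes binary units and 
ignores HDFS replication and any per-region overhead, so treat the counts as 
orders of magnitude only.

```python
# Back-of-the-envelope sketch. Assumptions: binary units, no HDFS replication,
# no per-region overhead. regions_per_server is a hypothetical helper name.
GB = 1
TB = 1024 * GB


def regions_per_server(disk_gb: int, region_gb: int) -> int:
    """Number of whole regions of size region_gb that fit in disk_gb."""
    return disk_gb // region_gb


# The two layouts from the comment hold the same amount of data per server.
assert 2000 * (1 * GB) == 20 * (100 * GB)

# Region counts for a 16TB server across the 1GB - 1TB range.
for region_gb in (1 * GB, 10 * GB, 100 * GB, 1 * TB):
    count = regions_per_server(16 * TB, region_gb)
    print(f"{region_gb:>5} GB regions -> {count:>6} per 16 TB server")
```

Either way the server that fails takes the same amount of data with it; only 
the number of pieces to reassign changes.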

So what are the exact problems with large regions:
# compactions and write amplification
# input split calculation for M/R
# log replay upon recovery (is that an issue, i.e. is it worse to replay 1 
large log than to replay 100 small ones?)
# (more?)

Can we *only* solve these three with many small regions? (Or would stripe 
compactions, simple width stats for M/R, etc. do instead?)
I'm trying to get from a statement about an implementation (that might just 
shift complexity from one part of HBase to another) to an exact problem 
statement.


> Scaling so cluster can host 1M regions and beyond (50M regions?)
> ----------------------------------------------------------------
>
>                 Key: HBASE-11165
>                 URL: https://issues.apache.org/jira/browse/HBASE-11165
>             Project: HBase
>          Issue Type: Brainstorming
>            Reporter: stack
>         Attachments: HBASE-11165.zip, Region Scalability test.pdf, 
> ScalableMeta.pdf, zk_less_assignment_comparison_2.pdf
>
>
> This discussion issue comes out of "Co-locate Meta And Master HBASE-10569" 
> and comments on the doc posted there.
> A user -- our Francis Liu -- needs to be able to scale a cluster to do 1M 
> regions maybe even 50M later.  This issue is about discussing how we will do 
> that (or if not 50M on a cluster, how otherwise we can attain same end).
> More detail to follow.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
