[
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019184#comment-14019184
]
stack commented on HBASE-11165:
-------------------------------
Keep it coming. Let's have this issue as the location for high-level scaling
discussion. Let's hang actual scaling issues (and their fixes) off this
umbrella issue too, as [~toffer] and [~virag] have been doing already.
A few of us had a chat this morning on this topic (Francis, Andrew, LarsH).
Here are some very rough notes:
h4. How to do 1M+ regions on a cluster?
* Distributed Master?
* Split Meta? We'll want to store more in meta?
* Fix assignment
3 high-level issues to address:
# Scanning is too slow
# Write amplification
# Moving regions is hard because they are big (lots of data offline at a time)
On amplification, on stripe compaction, need more data.
Do we add complexity to the region itself or to the container?
Region is unit of distribution. Don't depend on changing say its size to
address issues listed above if possible. Disentangle dependencies.
Support a lot of regions. Can we disentangle the need for lots of regions from
the need for more parallel scans: e.g. keep region midkeys or one-hundredth-keys
somewhere so mapreduce can do part-region-scan rather than all-region-scan?
Same for compactions.
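To make the part-region-scan idea concrete, here is a minimal sketch of how stored sample keys could cut one region into several scan ranges. The function name, the `(start, stop)` tuple shape, and the use of an empty byte string for an open-ended boundary are assumptions for illustration, not an existing HBase API.

```python
from typing import List, Tuple

# (start_inclusive, stop_exclusive); b"" stands for an open-ended boundary.
# This representation is an assumption made for the sketch.
KeyRange = Tuple[bytes, bytes]


def sub_region_ranges(region_start: bytes,
                      region_end: bytes,
                      sample_keys: List[bytes]) -> List[KeyRange]:
    """Split one region's key range at stored sample keys (hypothetical helper).

    sample_keys would be the midkey or the one-hundredth-keys kept
    "somewhere"; where they are stored is left open here.
    """
    # Keep only sample keys strictly inside the region, sorted and de-duplicated.
    inside = sorted({
        k for k in sample_keys
        if k > region_start and (region_end == b"" or k < region_end)
    })
    boundaries = [region_start] + inside + [region_end]
    # Pair up consecutive boundaries: each pair is one independent scan range.
    return list(zip(boundaries[:-1], boundaries[1:]))


if __name__ == "__main__":
    for start, stop in sub_region_ranges(b"a", b"m", [b"g", b"d", b"z"]):
        print(start, stop)
```

Each returned range could back its own mapreduce split, so scan parallelism would come from the number of sample keys rather than from the number of regions.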
Distributed master will be hard.
Multimaster will be hard.
Where to put the complexity? Make meta ops/master more complex? Currently
single master is 'simple' (if slow, etc.).
MultiMaster could partition the work but also do HA... Needs to be a plan for
multimaster.
Should we split meta? Can distribute load for assignment when other metas
(startup is a fat read then many .
Issues with HDFS when there are millions of regions. Need 4 GB of heap to list a
directory of millions of regions. Six hours to create a million regions in
HDFS.
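For a sense of scale, the six-hour figure can be turned into a per-region cost and projected forward. This is back-of-envelope arithmetic only: the linear-scaling assumption is mine and was not measured, and the function names are made up for the sketch.

```python
# Figures taken from the notes above: one million regions took six hours.
MEASURED_REGIONS = 1_000_000
MEASURED_HOURS = 6


def per_region_ms(regions: int = MEASURED_REGIONS,
                  hours: float = MEASURED_HOURS) -> float:
    """Average creation time per region, in milliseconds."""
    return hours * 3600 * 1000 / regions


def projected_hours(target_regions: int,
                    regions: int = MEASURED_REGIONS,
                    hours: float = MEASURED_HOURS) -> float:
    """Projected creation time for target_regions.

    ASSUMPTION: cost grows linearly with region count. Real behaviour at
    larger scale may well be worse.
    """
    return hours * target_regions / regions


if __name__ == "__main__":
    print(per_region_ms())              # average ms per region
    print(projected_hours(50_000_000))  # hours for 50M regions, if linear
```

That works out to about 21.6 ms per region, and roughly 300 hours (12.5 days) for 50M regions if the cost stayed linear.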
Treeify the regions or just take regions out of HDFS (Matteo's region facade,
how BT does regions with files up in meta table). Just so can scale.
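One possible reading of "treeify" is a bucketed directory layout, so that no single HDFS directory holds millions of entries. The sketch below is an assumption about what that could look like; the hash choice, bucket widths, function name, and example path are all hypothetical and do not describe the current HBase layout.

```python
import hashlib


def bucketed_region_path(table_dir: str,
                         region_name: str,
                         levels: int = 2,
                         width: int = 2) -> str:
    """Place a region directory under hash-prefix buckets (hypothetical layout).

    With levels=2 and width=2 hex characters there are 256 * 256 = 65536
    leaf buckets, so one million regions average about 15 per bucket.
    """
    # Hash choice is arbitrary for the sketch; any stable hash would do.
    digest = hashlib.sha256(region_name.encode("utf-8")).hexdigest()
    # Take the leading hex characters of the digest as nested bucket names.
    buckets = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return "/".join([table_dir.rstrip("/")] + buckets + [digest])


if __name__ == "__main__":
    # The table directory here is only an example path.
    print(bucketed_region_path("/hbase/data/default/t1", "t1,row-0001,1400000000000"))
```

The alternative raised in the notes, taking regions out of HDFS and tracking their files in the meta table, avoids the directory listing altogether instead of spreading it out.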
Action item: can we try zkless assignment on the big cluster? Can zkless
assignment apply to 0.98?
Going forward, let's keep high-level discussion up in this issue and break out
sub-issues for sub-discussion and work.
> Scaling so cluster can host 1M regions and beyond (50M regions?)
> ----------------------------------------------------------------
>
> Key: HBASE-11165
> URL: https://issues.apache.org/jira/browse/HBASE-11165
> Project: HBase
> Issue Type: Brainstorming
> Reporter: stack
> Attachments: HBASE-11165.zip, Region Scalability test.pdf
>
>
> This discussion issue comes out of "Co-locate Meta And Master HBASE-10569"
> and comments on the doc posted there.
> A user -- our Francis Liu -- needs to be able to scale a cluster to do 1M
> regions maybe even 50M later. This issue is about discussing how we will do
> that (or if not 50M on a cluster, how otherwise we can attain same end).
> More detail to follow.