[
https://issues.apache.org/jira/browse/HBASE-25302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Xiaolin Ha updated HBASE-25302:
-------------------------------
Description:
We have implemented a fast method for continuously splitting regions using
HFileLink, which depends on the stripe store file manager.
It is simple and efficient. We have implemented all the ideas described in
the design doc and use it on our production clusters. A region of about 600G
can be split into eight 75G regions in about five minutes, with less than 5G
rewritten in total (all in L0) during the whole process, while a normal
continuous split needs to rewrite 600G*3=1800G. If movement is used for
same-table HFileLinks, the rewritten size is less than 50G (the size of two
stripes), because rebuilding HFileLinks into stripes may bring some files to
L0.
Details are in the doc:
[https://docs.google.com/document/d/1hzBMdEFCckw18RE-kQQCe2ArW0MXhmLiiqyqpngItBM/edit?usp=sharing]
was:
We have implemented a fast method for continuously splitting regions using
HFileLink, which depends on the stripe store file manager.
It is simple and efficient. We have implemented all the ideas and use it on
our production clusters. A region of about 600G can be split into eight 75G
regions in about five minutes, with less than 50G rewritten in total during
the whole process, while a normal continuous split needs to rewrite
600G*3=1800G.
Details are in the doc:
[https://docs.google.com/document/d/1hzBMdEFCckw18RE-kQQCe2ArW0MXhmLiiqyqpngItBM/edit?usp=sharing]
> Fast continuous split regions
> -----------------------------
>
> Key: HBASE-25302
> URL: https://issues.apache.org/jira/browse/HBASE-25302
> Project: HBase
> Issue Type: Improvement
> Reporter: Xiaolin Ha
> Assignee: Xiaolin Ha
> Priority: Major
> Attachments: Fast continuous split regions.pdf
>
>
> We have implemented a fast method for continuously splitting regions using
> HFileLink, which depends on the stripe store file manager.
> It is simple and efficient. We have implemented all the ideas described in
> the design doc and use it on our production clusters. A region of about 600G
> can be split into eight 75G regions in about five minutes, with less than 5G
> rewritten in total (all in L0) during the whole process, while a normal
> continuous split needs to rewrite 600G*3=1800G. If movement is used for
> same-table HFileLinks, the rewritten size is less than 50G (the size of two
> stripes), because rebuilding HFileLinks into stripes may bring some files to
> L0.
> Details are in the doc:
> [https://docs.google.com/document/d/1hzBMdEFCckw18RE-kQQCe2ArW0MXhmLiiqyqpngItBM/edit?usp=sharing]
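
For readers checking the arithmetic, here is a minimal sketch of where the 600G*3=1800G figure comes from. It assumes that a normal continuous split halves regions round by round and that each round rewrites the full data set once before the next split can start. The function name is hypothetical and is not part of HBase; the 5G and 50G values are the upper bounds quoted in the description, not measurements reproduced here.

```python
def binary_split_rewrite_gb(region_gb: int, target_regions: int) -> int:
    """Data rewritten (in GB) by repeated binary splits.

    Assumption: every split round rewrites all of the data once, so the
    total is region size times the number of rounds (log2 of the target
    region count).
    """
    if target_regions < 1 or target_regions & (target_regions - 1):
        raise ValueError("target_regions must be a power of two")
    rounds = target_regions.bit_length() - 1  # 8 regions -> 3 rounds
    return region_gb * rounds


normal_gb = binary_split_rewrite_gb(600, 8)

# Upper bounds taken from the issue description.
fast_l0_bound_gb = 5        # fast split, all rewrites in L0
fast_moved_bound_gb = 50    # same-table HFileLink movement, about two stripes

print(f"normal continuous split: {normal_gb} GB rewritten")
print(f"fast split: under {fast_l0_bound_gb} GB "
      f"(at least {normal_gb // fast_l0_bound_gb}x less)")
print(f"fast split with HFileLink movement: under {fast_moved_bound_gb} GB "
      f"(at least {normal_gb // fast_moved_bound_gb}x less)")
```

Under these assumptions the reduction is at least 360x for the L0-only case and at least 36x when HFileLinks are moved.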