[ 
https://issues.apache.org/jira/browse/HBASE-25302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaolin Ha updated HBASE-25302:
-------------------------------
    Description: 
We have implemented a fast continuous region split method that uses HFileLink 
and depends on the stripe store file manager.

It is simple and efficient. We have implemented all the ideas described in 
the design doc and use it on our production clusters. A region of about 600G 
can be split into eight regions of about 75G each in about five minutes, with 
less than 5G rewritten in total (all in L0) during the whole process, while a 
normal continuous split needs to rewrite 600G*3=1800G. If movement is used for 
same-table HFileLinks, the rewritten size is less than 50G (the size of two 
stripes), because rebuilding the HFileLinks into stripes may insert some files 
into L0.
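
The 600G*3=1800G figure for a normal continuous split can be reproduced with a 
short calculation. This is only a sketch under one assumption that the 
description does not state explicitly: each round of binary splitting is 
followed by compactions that rewrite the whole data set once. The function 
name below is hypothetical and is not part of HBase.

```python
def conventional_rewrite_gb(region_gb: int, target_regions: int) -> int:
    """Estimate the GB rewritten by repeated binary splits.

    Assumption (not from the issue text): every split round rewrites
    all of the data once when the daughter regions are compacted.
    """
    # Number of binary split rounds needed to reach target_regions,
    # i.e. ceil(log2(target_regions)) computed with integers only.
    rounds = (target_regions - 1).bit_length()
    return region_gb * rounds


region_gb = 600       # size of the original region, from the description
target_regions = 8    # 75G * 8 regions, from the description

rounds = (target_regions - 1).bit_length()
per_region_gb = region_gb / target_regions
conventional = conventional_rewrite_gb(region_gb, target_regions)

print(f"split rounds: {rounds}")
print(f"size per region: {per_region_gb}G")
print(f"conventional rewrite: {conventional}G")
```

Under this assumption, eight regions need three split rounds, which matches 
the 1800G quoted above and can be compared against the reported figures of 
less than 5G and less than 50G.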

Details are in the design doc:

[https://docs.google.com/document/d/1hzBMdEFCckw18RE-kQQCe2ArW0MXhmLiiqyqpngItBM/edit?usp=sharing]

If anyone is interested in this issue, please let me know. Thanks.

 


> Fast continuous split regions
> -----------------------------
>
>                 Key: HBASE-25302
>                 URL: https://issues.apache.org/jira/browse/HBASE-25302
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Xiaolin Ha
>            Assignee: Xiaolin Ha
>            Priority: Major
>         Attachments: Fast continuous split regions.pdf
>
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
