[ 
https://issues.apache.org/jira/browse/HBASE-27826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18076859#comment-18076859
 ] 

Hudson commented on HBASE-27826:
--------------------------------

Results for branch master
        [build #53 on 
builds.a.o|https://ci-hbase.apache.org/job/HBase-Integration-Test/job/master/53/]:
 (/) *{color:green}+1 overall{color}*
----
details (if available):



(/) {color:green}+1 client integration test for 3.3.5 {color}
(/) {color:green}+1 client integration test for 3.3.5 with shaded hadoop 
client{color}


(/) {color:green}+1 client integration test for 3.3.6 {color}
(/) {color:green}+1 client integration test for 3.3.6 with shaded hadoop 
client{color}


(/) {color:green}+1 client integration test for 3.4.0 {color}
(/) {color:green}+1 client integration test for 3.4.0 with shaded hadoop 
client{color}


(/) {color:green}+1 client integration test for 3.4.1 {color}
(/) {color:green}+1 client integration test for 3.4.1 with shaded hadoop 
client{color}


(/) {color:green}+1 client integration test for 3.4.2 {color}
(/) {color:green}+1 client integration test for 3.4.2 with shaded hadoop 
client{color}


(/) {color:green}+1 client integration test for 3.4.3 {color}
(/) {color:green}+1 client integration test for 3.4.3 with shaded hadoop 
client{color}


> Region split and merge time while offline is O(n) with respect to number of 
> store files
> ---------------------------------------------------------------------------------------
>
>                 Key: HBASE-27826
>                 URL: https://issues.apache.org/jira/browse/HBASE-27826
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 2.5.4
>            Reporter: Andrew Kyle Purtell
>            Assignee: Prathyusha
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0-alpha-1, 2.7.0
>
>
> This is a significant availability issue when HFiles are on S3. =
> HBASE-26079 ({_}Use StoreFileTracker when splitting and merging{_}) changed 
> the split and merge table procedure implementations to indirect through the 
> StoreFileTracker implementation when selecting HFiles to be merged or split, 
> rather than directly listing those using file system APIs. It also changed 
> the commit logic in HRegionFileSystem to add the link/ref files on resulting 
> split or merged regions to the StoreFileTracker. However, the creation of a 
> link file is still a filesystem operation and creating a “file” on S3 can 
> take well over a second. If, for example there are 20 store files in a 
> region, which is not uncommon, after the region is taken offline for a split 
> (or merge) it may require more than 20 seconds to create the link files 
> before the results can be brought back online, creating a severe availability 
> problem. Splits and merges are supposed to be fast, completing in less than a 
> second, certainly less than a few seconds. This has been true when HFiles are 
> stored on HDFS only because file creation operations there are nearly 
> instantaneous. 
> There are two issues but both can be handled with modifications to the store 
> file tracker interface and the file based store file tracker implementation. 
> When the file based store file file tracker is enabled the HFile links should 
> be virtual entities that only exist in the file manifest. We do not require 
> physical files in the filesystem to serve as links now. That is the magic of 
> the this file tracker, the manifest file replaces requirements to list the 
> filesystem.
> Then, when splitting or merging, the HFile links should be collected into a 
> list and committed in one batch using a new FILE file tracker interface, 
> requiring only one update of the manifest file in S3, bringing the time 
> requirement for this operation to O(1) down from O[n].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to