Vishal Khandelwal created HBASE-20933:
-----------------------------------------

             Summary: multiple splits may result into forever uncleaned split 
region
                 Key: HBASE-20933
                 URL: https://issues.apache.org/jira/browse/HBASE-20933
             Project: HBase
          Issue Type: Bug
            Reporter: Vishal Khandelwal
            Assignee: Vishal Khandelwal


In case of multiple successive splits with an open handle on an old reference 
file, a split region can end up in a state where it is never cleaned up.

 So there are two issues here:
 # A region gets split even while it still holds a reference to its parent.
 # A region goes offline/into archive mode even while references are still 
pending in its store.

*Repro Steps*
 # Split a region (P).
 # Before the major compaction that follows the split starts, open a handle on 
a store file in the new regions (DA & DB).
 # Let compaction complete on DA. (The compaction will not clear the reference 
store files, because they are still open.)
 # Split the new region (DA) again. (shouldSplit returns true because, even 
before the compaction cleanup runs, the compacted files and references have 
already been removed from the in-memory list.)
 # Now CatalogJanitor will not remove this region because it has store 
references, and major compaction/CompactedHFilesDischarger will not do the 
cleanup because it looks only at online regions.
 # After the above steps, region DA, which is offline, will always remain among 
the split regions and never get cleaned up.
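
The steps above can be replayed on a toy model. This is only a sketch to make 
the ordering problem visible: SplitLeakModel, StoreFile, Region and all of 
their methods are hypothetical names invented for illustration, not HBase 
classes. The janitor rule in it is an assumption taken from the wording of 
this report (an offline region is kept while it holds reference files or is 
still referenced).

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the scenario in this report. None of these types are HBase classes.
final class SplitLeakModel {

    static final class StoreFile {
        final String referencedRegion; // null for a plain (non-reference) file
        boolean handleOpen;            // true while some reader keeps the file open
        StoreFile(String referencedRegion) { this.referencedRegion = referencedRegion; }
        boolean isReference() { return referencedRegion != null; }
    }

    static final class Region {
        final String name;
        boolean online = true;
        final List<StoreFile> inMemory = new ArrayList<>(); // list the split check consults
        final List<StoreFile> onDisk = new ArrayList<>();   // files physically present
        Region(String name) { this.name = name; }
    }

    final List<Region> regions = new ArrayList<>();

    Region create(String name) {
        Region r = new Region(name);
        StoreFile f = new StoreFile(null);
        r.inMemory.add(f);
        r.onDisk.add(f);
        regions.add(r);
        return r;
    }

    // Modeled split check: only the in-memory list is inspected.
    boolean shouldSplit(Region r) {
        return r.online && r.inMemory.stream().noneMatch(StoreFile::isReference);
    }

    // Split: the parent goes offline, each daughter starts with one reference file to it.
    Region[] split(Region parent, String nameA, String nameB) {
        if (!shouldSplit(parent)) throw new IllegalStateException(parent.name + " not splittable");
        parent.online = false;
        Region[] daughters = { new Region(nameA), new Region(nameB) };
        for (Region d : daughters) {
            StoreFile ref = new StoreFile(parent.name);
            d.inMemory.add(ref);
            d.onDisk.add(ref);
            regions.add(d);
        }
        return daughters;
    }

    // Compaction: old files leave the in-memory list immediately, but an old file
    // that a reader still holds open stays on disk until a later discharge.
    void compact(Region r) {
        List<StoreFile> old = new ArrayList<>(r.inMemory);
        r.inMemory.clear();
        StoreFile compacted = new StoreFile(null);
        r.inMemory.add(compacted);
        r.onDisk.add(compacted);
        r.onDisk.removeIf(f -> old.contains(f) && !f.handleOpen);
    }

    // Modeled discharger: visits online regions only.
    void discharge() {
        for (Region r : regions) {
            if (!r.online) continue;
            r.onDisk.removeIf(f -> !r.inMemory.contains(f) && !f.handleOpen);
        }
    }

    // Assumed janitor rule: an offline region is removable only when it holds no
    // reference files on disk and no other region still references it.
    boolean janitorCanClean(Region r) {
        if (r.online) return false;
        boolean holdsRefs = r.onDisk.stream().anyMatch(StoreFile::isReference);
        boolean referenced = regions.stream().flatMap(x -> x.onDisk.stream())
                .anyMatch(f -> r.name.equals(f.referencedRegion));
        return !holdsRefs && !referenced;
    }

    public static void main(String[] args) {
        SplitLeakModel m = new SplitLeakModel();
        Region p = m.create("P");
        Region da = m.split(p, "DA", "DB")[0];   // step 1
        StoreFile ref = da.onDisk.get(0);
        ref.handleOpen = true;                   // step 2
        m.compact(da);                           // step 3: reference stays on disk
        m.split(da, "DAA", "DAB");               // step 4: split is still allowed
        ref.handleOpen = false;                  // the reader closes later
        m.discharge();                           // step 5: DA is offline, so it is skipped
        System.out.println("janitor can clean DA: " + m.janitorCanClean(da));
        System.out.println("janitor can clean P: " + m.janitorCanClean(p));
    }
}
```

In this model the leftover reference file in DA is the only thing blocking 
cleanup, and nothing ever visits DA again once it is offline.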

We found that CatalogJanitor is also unable to clean regions which are 
offline (split parents) because, as daughters of their own parent, they still 
hold references that are not getting cleaned up. As a result, many store files 
are never cleaned, which takes up more space in the local index store and 
leaves many split regions lingering.

A unit test that reproduces the scenario has been attached.

A fix could be either to have CompactedHFilesDischarger also look into offline 
regions, or to stop the split in such cases.
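
Both directions can be sketched roughly as follows. This is not a patch: 
SplitFixSketch, RegionState, and the includeOffline flag are hypothetical names 
used only to show the two checks, and the reference counts stand in for 
whatever the real store file bookkeeping would provide.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the two fix directions; these are not HBase classes or APIs.
final class SplitFixSketch {

    // Minimal stand-in for the per-region state that matters here.
    static final class RegionState {
        final String name;
        final boolean online;
        final int inMemoryReferences; // reference files in the in-memory store file list
        int onDiskReferences;         // reference files still physically present
        RegionState(String name, boolean online, int inMemoryReferences, int onDiskReferences) {
            this.name = name;
            this.online = online;
            this.inMemoryReferences = inMemoryReferences;
            this.onDiskReferences = onDiskReferences;
        }
    }

    // Direction 1: stop the split while any reference file survives on disk,
    // not only while the in-memory list still shows one.
    static boolean shouldSplit(RegionState r) {
        return r.online && r.inMemoryReferences == 0 && r.onDiskReferences == 0;
    }

    // Direction 2: let the discharger also visit offline regions.
    // Returns the names of the regions whose leftover references it removed.
    static List<String> discharge(List<RegionState> regions, boolean includeOffline) {
        List<String> cleaned = new ArrayList<>();
        for (RegionState r : regions) {
            if (!r.online && !includeOffline) continue;
            if (r.inMemoryReferences == 0 && r.onDiskReferences > 0) {
                r.onDiskReferences = 0; // assumes no reader still holds the files open
                cleaned.add(r.name);
            }
        }
        return cleaned;
    }
}
```

Either check alone would break the cycle in this simplified picture: the first 
keeps DA online until its references are gone, the second cleans DA after it 
has gone offline.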



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
