[
https://issues.apache.org/jira/browse/HBASE-20933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vishal Khandelwal updated HBASE-20933:
--------------------------------------
Description:
In case of multiple subsequent splits with an open handle on an old reference
file, a split region may be left behind that can never be cleaned up.
There are two issues here:
# A region gets split even while it still has a reference to its parent.
# A region goes offline/into archive mode even while references are still
pending in its store.
*Repro Steps*
# Split a region (P).
# Before major compaction starts after the split, open a handle on a store file
in the new regions (DA & DB).
# Let compaction complete on DA. (Compaction will not clear the reference
store files here, because they are open.)
# Split the new region (DA) again. (shouldSplit returns true because, even
before the cleanup happens, compaction removes the compacted files and
references from the in-memory list.)
# CatalogJanitor will now not remove this region, as it has store references,
and majorCompaction/CompactedHFilesDischarger will not do the cleanup, as it
looks only at online regions.
# After the above steps, region DA, which is offline, stays among the split
regions forever and is never cleaned up.
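The steps above can be sketched as a small toy model. This is only an illustration of the reported sequence, not HBase code and not the attached unit test: the class SplitLeakModel, its Region type, and every method in it are hypothetical names, and the rules (split check reads only the in-memory list, discharger visits only online regions, janitor refuses regions with references left on disk) are simplified from the description in this report.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model of the reported scenario; none of these types are HBase classes. */
final class SplitLeakModel {

    /** A region: reference files on disk versus those tracked in the store's in-memory list. */
    static final class Region {
        final String name;
        final List<Region> daughters = new ArrayList<>();
        final Set<String> refsOnDisk = new HashSet<>();
        final Set<String> refsInMemory = new HashSet<>();
        final Set<String> openHandles = new HashSet<>();
        boolean online = true;

        Region(String name) { this.name = name; }
    }

    final Map<String, Region> regions = new LinkedHashMap<>();

    Region create(String name) {
        Region r = new Region(name);
        regions.put(name, r);
        return r;
    }

    /** Splits a region: the parent goes offline, each daughter starts with one reference file. */
    void split(Region parent, String a, String b) {
        parent.online = false;
        for (String name : new String[] {a, b}) {
            Region d = create(name);
            String ref = name + "->" + parent.name;
            d.refsOnDisk.add(ref);
            d.refsInMemory.add(ref);
            parent.daughters.add(d);
        }
    }

    /** Compaction drops references from the in-memory list; files with open handles stay on disk. */
    void compact(Region r) {
        r.refsInMemory.clear();
        r.refsOnDisk.removeIf(f -> !r.openHandles.contains(f));
    }

    /** Split check as described in the report: only the in-memory list is consulted. */
    boolean shouldSplit(Region r) {
        return r.online && r.refsInMemory.isEmpty();
    }

    /** Discharger as described in the report: visits online regions only. */
    void dischargeCompactedFiles() {
        for (Region r : regions.values()) {
            if (r.online) {
                r.refsOnDisk.removeIf(f -> !r.refsInMemory.contains(f) && !r.openHandles.contains(f));
            }
        }
    }

    /** Janitor (assumed rule): removable only if the region and its daughters hold no references. */
    boolean janitorCanClean(Region r) {
        if (r.online || !r.refsOnDisk.isEmpty()) return false;
        return r.daughters.stream().allMatch(d -> d.refsOnDisk.isEmpty());
    }

    /** Runs the repro steps and returns the names of split regions the janitor cannot clean. */
    static List<String> reproduce() {
        SplitLeakModel m = new SplitLeakModel();
        Region p = m.create("P");
        m.split(p, "DA", "DB");                 // step 1: split P
        Region da = m.regions.get("DA");
        da.openHandles.add("DA->P");            // step 2: open a handle on DA's reference file
        m.compact(da);                          // step 3: reference survives on disk
        m.compact(m.regions.get("DB"));
        if (!m.shouldSplit(da)) throw new IllegalStateException("model expects DA to be splittable");
        m.split(da, "DAA", "DAB");              // step 4: DA splits again and goes offline
        m.compact(m.regions.get("DAA"));
        m.compact(m.regions.get("DAB"));
        da.openHandles.clear();                 // the handle is closed later
        m.dischargeCompactedFiles();            // step 5: DA is offline, so it is skipped
        List<String> stuck = new ArrayList<>();
        for (Region r : m.regions.values()) {
            if (!r.online && !m.janitorCanClean(r)) stuck.add(r.name);
        }
        return stuck;
    }

    public static void main(String[] args) {
        System.out.println("Split regions that can never be cleaned: " + reproduce());
    }
}
```

In this model, reproduce() reports both P and DA as stuck: DA keeps its leftover reference on disk because nothing visits offline regions, and P stays because its daughter DA still holds a reference to it.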
We found that the catalog janitor is also unable to clean regions that are
offline (split parents), because a daughter still holds a reference to its
parent that is not getting cleaned up. As a result, many store files are never
cleaned, which consumes more space in the local index store and leaves many
lingering split regions.
A unit test that reproduces the scenario is attached.
The fix can be in CompactedHFilesDischarger or CatalogJanitor to handle such cases.
was:
In case of multiple subsequent splits with an open handle on an old reference
file, a split region may be left behind that can never be cleaned up.
There are two issues here:
# A region gets split even while it still has a reference to its parent.
# A region goes offline/into archive mode even while references are still
pending in its store.
*Repro Steps*
# Split a region (P).
# Before major compaction starts after the split, open a handle on a store file
in the new regions (DA & DB).
# Let compaction complete on DA. (Compaction will not clear the reference
store files here, because they are open.)
# Split the new region (DA) again. (shouldSplit returns true because, even
before the cleanup happens, compaction removes the compacted files and
references from the in-memory list.)
# CatalogJanitor will now not remove this region, as it has store references,
and majorCompaction/CompactedHFilesDischarger will not do the cleanup, as it
looks only at online regions.
# After the above steps, region DA, which is offline, stays among the split
regions forever and is never cleaned up.
We found that the catalog janitor is also unable to clean regions that are
offline (split parents), because a daughter still holds a reference to its
parent that is not getting cleaned up. As a result, many store files are never
cleaned, which consumes more space in the local index store and leaves many
lingering split regions.
A unit test that reproduces the scenario is attached.
The fix can be in CompactedHFilesDischarger, to look into offline regions, or
splits can be stopped in such cases.
> multiple splits may result into forever uncleaned split region
> --------------------------------------------------------------
>
> Key: HBASE-20933
> URL: https://issues.apache.org/jira/browse/HBASE-20933
> Project: HBase
> Issue Type: Bug
> Reporter: Vishal Khandelwal
> Assignee: Vishal Khandelwal
> Priority: Major
> Attachments: Test123.java
>
>