[jira] [Comment Edited] (HBASE-25130) Masters in-memory serverHoldings map is not cleared during hbck repair

2021-01-29 Thread Rahul Kumar (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17274369#comment-17274369
 ] 

Rahul Kumar edited comment on HBASE-25130 at 1/29/21, 3:16 PM:
---

While trying to clear overlapped region entries from the _serverHoldings_ map 
during hbck repair, I noticed that the HRI for which _regionAssignments_ 
(a _TreeMap_) has an entry is not equal to the HRI that has to be offlined. 
For example:

{ENCODED => fa18b66587f8f7a1de791ffefe364a48, NAME => 
'test,,1611911426615.fa18b66587f8f7a1de791ffefe364a48.', STARTKEY => '', ENDKEY 
=> ''}

is the metadata of the _HRI_ in the _regionAssignments_ map, whereas 

{ENCODED => fa18b66587f8f7a1de791ffefe364a48, NAME => 
'test,,1611911426615.fa18b66587f8f7a1de791ffefe364a48.', STARTKEY => '', ENDKEY 
=> '', OFFLINE => true, SPLIT => true}

is the metadata of the _HRI_ that has to go offline. 
So when the code tries to remove the HRI to be offlined from 
_regionAssignments_ via _regionAssignments.remove(hri)_, it does not find a 
matching entry and therefore cannot proceed with the _regionOffline_ operation. 

I am confused here: shouldn't both _HRI_ objects, i.e. the one that has to go 
offline and the one in the _regionAssignments_ map, point to the same object? 
A random thought: do we need to update the equals logic for HRI (to compare on 
the basis of _encodedRegionName_) so that the two HRIs above are considered 
equal? 
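
To make the failed lookup concrete, here is a minimal, self-contained sketch. It does not use the real HBase classes: _RegionKey_ is a hypothetical stand-in for HRI, and the assumption that its ordering takes the offline flag into account is mine, made only to reproduce the symptom described above. One thing that is certain from the JDK: a _TreeMap_ without a custom comparator locates keys through _compareTo_, not through _equals_ or _hashCode_, so changing _equals_ alone would not change what _remove_ finds.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;

public class TreeMapLookupSketch {

    /** Hypothetical stand-in for HRI; NOT the real HBase class. */
    public static final class RegionKey implements Comparable<RegionKey> {
        public final String encodedName;
        public final boolean offline;
        public final boolean split;

        public RegionKey(String encodedName, boolean offline, boolean split) {
            this.encodedName = encodedName;
            this.offline = offline;
            this.split = split;
        }

        // Assumption for illustration: the ordering also looks at the offline flag,
        // so two keys for the same region can compare as different.
        @Override
        public int compareTo(RegionKey other) {
            int byName = encodedName.compareTo(other.encodedName);
            if (byName != 0) {
                return byName;
            }
            return Boolean.compare(offline, other.offline);
        }
    }

    /** Plain TreeMap removal: matches only when compareTo returns 0. */
    public static String removeByKey(TreeMap<RegionKey, String> assignments, RegionKey key) {
        return assignments.remove(key);
    }

    /** Alternative: find the entry by encoded name only, ignoring the flags. */
    public static String removeByEncodedName(TreeMap<RegionKey, String> assignments,
                                             String encodedName) {
        Iterator<Map.Entry<RegionKey, String>> it = assignments.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<RegionKey, String> entry = it.next();
            if (entry.getKey().encodedName.equals(encodedName)) {
                String server = entry.getValue();
                it.remove();
                return server;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        TreeMap<RegionKey, String> assignments = new TreeMap<>();
        String name = "fa18b66587f8f7a1de791ffefe364a48";
        assignments.put(new RegionKey(name, false, false), "rs1");

        RegionKey toOffline = new RegionKey(name, true, true);
        System.out.println(removeByKey(assignments, toOffline));      // prints null
        System.out.println(removeByEncodedName(assignments, name));   // prints rs1
    }
}
```

If the real ordering behaves like this sketch, the fix would have to touch the comparison used by the map (or look the entry up by encoded name), rather than _equals_ only.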

[~vjasani] [~apurtell] Can you please help? Thanks.

Btw, I reproduced the overlap scenario by deliberately introducing a bug in the 
split-and-rollback path, in case that matters.



> Masters in-memory serverHoldings map is not cleared during hbck repair
> --
>
> Key: HBASE-25130
> URL: https://issues.apache.org/jira/browse/HBASE-25130
> Project: HBase
> Issue Type: Bug
> Reporter: Sandeep Guggilam
> Assignee: Rahul Kumar
> Priority: Major
>
> In case of repairing overlaps, hbck essentially calls the closeRegion RPC on 
> the RS followed by the offline RPC on the Master to offline all the overlap 
> regions that would be merged into a new region. 
> However, the offline RPC doesn’t remove the region from the serverHoldings 
> map unless the new state is MERGED/SPLIT 
> ([https://github.com/apache/hbase/blob/branch-1/hbase-server/src/main/java/org/apache/hadoop/hbase/master/RegionStates.java#L719]), 
> but the new state in this case is OFFLINE. 
> This is actually intended to match the META entries, and the region would be 
> removed later when it comes online on a different server. However, in our 
> case, the region would never come online on a new server, hence the region 
> info is never cleared from the map, which the balancer and SCP then use, 
> leading to incorrect reassignment. 
> We might need to tackle this by removing the entries from the map when hbck 
> actually deletes the META entries for this region, which keeps the in-memory 
> map consistent with the META state.
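
The behaviour described in the issue can be modelled with a small, self-contained sketch. This is not the RegionStates code from branch-1: the class name, the use of plain strings for servers and regions, and the method signatures are all simplifications assumed here for illustration. It only shows why an entry offlined with state OFFLINE lingers in the holdings map until the region comes online somewhere else.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Simplified model of the described behaviour; NOT the real RegionStates. */
public class OfflineStateSketch {

    public enum State { OFFLINE, MERGED, SPLIT }

    // server name -> encoded names of regions the server is believed to hold
    private final Map<String, Set<String>> serverHoldings = new HashMap<>();
    // encoded region name -> server currently assigned
    private final Map<String, String> regionAssignments = new HashMap<>();

    public void regionOnline(String region, String server) {
        // Coming online on a server clears stale holdings on every other server.
        for (Map.Entry<String, Set<String>> e : serverHoldings.entrySet()) {
            if (!e.getKey().equals(server)) {
                e.getValue().remove(region);
            }
        }
        regionAssignments.put(region, server);
        serverHoldings.computeIfAbsent(server, s -> new HashSet<>()).add(region);
    }

    public void regionOffline(String region, State newState) {
        String oldServer = regionAssignments.remove(region);
        if (oldServer == null) {
            return;
        }
        // Only MERGED/SPLIT clear the holdings entry right away.
        if (newState == State.MERGED || newState == State.SPLIT) {
            Set<String> held = serverHoldings.get(oldServer);
            if (held != null) {
                held.remove(region);
            }
        }
        // With OFFLINE the holdings entry is left in place on purpose.
    }

    public boolean isHeld(String server, String region) {
        Set<String> held = serverHoldings.get(server);
        return held != null && held.contains(region);
    }
}
```

In this model, a region that hbck offlines with OFFLINE and then merges away never triggers _regionOnline_ again, so _isHeld_ keeps returning true for the old server.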



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HBASE-25130) Masters in-memory serverHoldings map is not cleared during hbck repair

2021-01-04 Thread Rahul Kumar (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17258189#comment-17258189
 ] 

Rahul Kumar edited comment on HBASE-25130 at 1/4/21, 1:01 PM:
--

Possible approaches I could think of for tackling the above issue:

1. Remove the region entry from the serverHoldings map in case of mergeOverlap 
repair.
 * Once deleteMetaRegion() gets executed, we can call HbckRepair to remove the 
region from serverHoldings.
 * Here HbckRepair would call ServerManager, and ServerManager would in turn 
call AssignmentManager to remove the region entry from the serverHoldings map 
via RegionStates.

2. Keep cleaning out the unwanted entries from the serverHoldings map via a 
cleaner chore, if the entry for the region is not present in META. This could 
be an expensive operation; also, would it be safe to clean up based on the 
above logic?
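
The two approaches above can be contrasted in a short, self-contained sketch. Everything in it is hypothetical: the class, the method names, and the string-based model of servers and regions are assumptions for illustration and do not correspond to existing HBase APIs.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical model of the two cleanup approaches; NOT HBase code. */
public class ServerHoldingsCleanupSketch {

    // server name -> encoded names of regions the server is believed to hold
    private final Map<String, Set<String>> serverHoldings = new HashMap<>();

    public void hold(String server, String region) {
        serverHoldings.computeIfAbsent(server, s -> new HashSet<>()).add(region);
    }

    /** Approach 1: targeted removal, called right after the META row is deleted. */
    public boolean removeRegion(String region) {
        boolean removed = false;
        for (Set<String> held : serverHoldings.values()) {
            removed |= held.remove(region);
        }
        return removed;
    }

    /** Approach 2: periodic sweep that drops every region absent from a META snapshot. */
    public int sweep(Set<String> regionsInMeta) {
        int dropped = 0;
        for (Set<String> held : serverHoldings.values()) {
            int before = held.size();
            held.retainAll(regionsInMeta);
            dropped += before - held.size();
        }
        return dropped;
    }

    /** Returns a copy so callers cannot mutate the internal state. */
    public Set<String> heldBy(String server) {
        return new HashSet<>(serverHoldings.getOrDefault(server, Collections.emptySet()));
    }
}
```

The targeted removal touches only the region hbck just deleted, while the sweep needs a full view of META on every run and would also drop any region that is legitimately held but not yet visible in that snapshot, which is one way the safety question raised in approach 2 could show up.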

[~apurtell] [~vjasani] Please let me know your feedback on the above 
approaches, or whether we can handle it better. Thanks.

 


