[jira] [Updated] (HBASE-21421) Do not kill RS if reportOnlineRegions fails

2019-03-04 Thread Guanghao Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-21421:
---
Fix Version/s: 2.2.0

> Do not kill RS if reportOnlineRegions fails
> ---
>
> Key: HBASE-21421
> URL: https://issues.apache.org/jira/browse/HBASE-21421
> Project: HBase
> Issue Type: Sub-task
> Affects Versions: 2.1.1, 2.0.2
> Reporter: Allan Yang
> Assignee: Allan Yang
> Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.0.3, 2.1.2
>
> Attachments: HBASE-21421.branch-2.0.001.patch, 
> HBASE-21421.branch-2.0.002.patch, HBASE-21421.branch-2.0.003.patch, 
> HBASE-21421.branch-2.0.004.patch
>
>
> In the periodic regionServerReport from the RS to the master, we call 
> master.getAssignmentManager().reportOnlineRegions() to make sure the RS has 
> the same state as the Master. If the RS holds a region which the master thinks 
> should be on another RS, the Master will kill the RS.
> But the regionServerReport could be lagging (due to network delay, for 
> example), so it may not represent the current state of the RegionServer. 
> Besides, when onlining a region, we call reportRegionStateTransition and retry 
> forever until it is successfully reported to the master. We can count on 
> reportRegionStateTransition calls.
> I have encountered cases where regions were closed on the RS and 
> reportRegionStateTransition to the master succeeded. But later, a lagging 
> regionServerReport told the master that the region was online on the RS (which 
> it was not at that moment; the call may have been generated some time ago and 
> delayed by the network). The master then thought the region should be on 
> another RS and killed the RS, which should not happen.
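
The race described above can be replayed with a small toy model. This is only a sketch under stated assumptions, not HBase code and not the attached patch: the class StaleReportSketch, its methods, the Outcome enum and the killOnMismatch flag are all hypothetical names made up for illustration. It models the master's region-to-server map, the reliable transition reports, and a periodic report that arrives late. Treating a mismatch as a warning is one possible reading of the issue title.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the master's region bookkeeping; hypothetical, not HBase code. */
final class StaleReportSketch {
    enum Outcome { CONSISTENT, WARN_ONLY, KILL_SERVER }

    // Region name -> server that the master believes is hosting it.
    private final Map<String, String> regionToServer = new HashMap<>();
    private final boolean killOnMismatch;

    StaleReportSketch(boolean killOnMismatch) {
        this.killOnMismatch = killOnMismatch;
    }

    // Stands in for a reportRegionStateTransition "opened" report (the retried path).
    void regionOpened(String region, String server) {
        regionToServer.put(region, server);
    }

    // Stands in for a reportRegionStateTransition "closed" report.
    void regionClosed(String region, String server) {
        // Only removes the entry if the region is still mapped to this server.
        regionToServer.remove(region, server);
    }

    // Stands in for reportOnlineRegions(): checks a periodic report that may be stale.
    Outcome reportOnlineRegions(String server, List<String> onlineRegions) {
        for (String region : onlineRegions) {
            if (!server.equals(regionToServer.get(region))) {
                return killOnMismatch ? Outcome.KILL_SERVER : Outcome.WARN_ONLY;
            }
        }
        return Outcome.CONSISTENT;
    }

    // Replays the sequence from the description and returns the master's reaction.
    static Outcome replay(boolean killOnMismatch) {
        StaleReportSketch master = new StaleReportSketch(killOnMismatch);
        master.regionOpened("r1", "rs1");
        List<String> delayedReport = List.of("r1"); // snapshot taken while r1 is on rs1
        master.regionClosed("r1", "rs1");           // rs1 closes r1 and reports it
        master.regionOpened("r1", "rs2");           // r1 is then opened on rs2
        return master.reportOnlineRegions("rs1", delayedReport); // stale report arrives
    }

    public static void main(String[] args) {
        System.out.println("kill on mismatch: " + replay(true));
        System.out.println("warn on mismatch: " + replay(false));
    }
}
```

With killOnMismatch set to true the stale report makes the model decide to kill rs1 even though rs1 had already closed the region and reported it. With the flag set to false the same mismatch is only flagged, and the transition reports remain the source of truth.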



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21421) Do not kill RS if reportOnlineRegions fails

2018-11-05 Thread Allan Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allan Yang updated HBASE-21421:
---
   Resolution: Fixed
Fix Version/s: 2.1.2, 2.0.3, 3.0.0
       Status: Resolved  (was: Patch Available)

Pushed to branch-2.0+. Thanks for reviewing, [~Apache9].

> Do not kill RS if reportOnlineRegions fails
> ---
>
> Key: HBASE-21421
> URL: https://issues.apache.org/jira/browse/HBASE-21421
> Project: HBase
> Issue Type: Sub-task
> Affects Versions: 2.1.1, 2.0.2
> Reporter: Allan Yang
> Assignee: Allan Yang
> Priority: Major
> Fix For: 3.0.0, 2.0.3, 2.1.2
>
> Attachments: HBASE-21421.branch-2.0.001.patch, 
> HBASE-21421.branch-2.0.002.patch, HBASE-21421.branch-2.0.003.patch, 
> HBASE-21421.branch-2.0.004.patch
>
>


[jira] [Updated] (HBASE-21421) Do not kill RS if reportOnlineRegions fails

2018-11-05 Thread Allan Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allan Yang updated HBASE-21421:
---
Attachment: HBASE-21421.branch-2.0.004.patch

> Do not kill RS if reportOnlineRegions fails
> ---
>
> Key: HBASE-21421
> URL: https://issues.apache.org/jira/browse/HBASE-21421
> Project: HBase
> Issue Type: Sub-task
> Affects Versions: 2.1.1, 2.0.2
> Reporter: Allan Yang
> Assignee: Allan Yang
> Priority: Major
> Attachments: HBASE-21421.branch-2.0.001.patch, 
> HBASE-21421.branch-2.0.002.patch, HBASE-21421.branch-2.0.003.patch, 
> HBASE-21421.branch-2.0.004.patch
>
>


[jira] [Updated] (HBASE-21421) Do not kill RS if reportOnlineRegions fails

2018-11-05 Thread Allan Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allan Yang updated HBASE-21421:
---
Attachment: HBASE-21421.branch-2.0.003.patch

> Do not kill RS if reportOnlineRegions fails
> ---
>
> Key: HBASE-21421
> URL: https://issues.apache.org/jira/browse/HBASE-21421
> Project: HBase
> Issue Type: Sub-task
> Affects Versions: 2.1.1, 2.0.2
> Reporter: Allan Yang
> Assignee: Allan Yang
> Priority: Major
> Attachments: HBASE-21421.branch-2.0.001.patch, 
> HBASE-21421.branch-2.0.002.patch, HBASE-21421.branch-2.0.003.patch
>
>


[jira] [Updated] (HBASE-21421) Do not kill RS if reportOnlineRegions fails

2018-11-04 Thread Allan Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allan Yang updated HBASE-21421:
---
Attachment: HBASE-21421.branch-2.0.002.patch

> Do not kill RS if reportOnlineRegions fails
> ---
>
> Key: HBASE-21421
> URL: https://issues.apache.org/jira/browse/HBASE-21421
> Project: HBase
> Issue Type: Sub-task
> Affects Versions: 2.1.1, 2.0.2
> Reporter: Allan Yang
> Assignee: Allan Yang
> Priority: Major
> Attachments: HBASE-21421.branch-2.0.001.patch, 
> HBASE-21421.branch-2.0.002.patch
>
>


[jira] [Updated] (HBASE-21421) Do not kill RS if reportOnlineRegions fails

2018-11-04 Thread Allan Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allan Yang updated HBASE-21421:
---
Attachment: (was: HBASE-21421.branch-2.0.002.patch)

> Do not kill RS if reportOnlineRegions fails
> ---
>
> Key: HBASE-21421
> URL: https://issues.apache.org/jira/browse/HBASE-21421
> Project: HBase
> Issue Type: Sub-task
> Affects Versions: 2.1.1, 2.0.2
> Reporter: Allan Yang
> Assignee: Allan Yang
> Priority: Major
> Attachments: HBASE-21421.branch-2.0.001.patch, 
> HBASE-21421.branch-2.0.002.patch
>
>


[jira] [Updated] (HBASE-21421) Do not kill RS if reportOnlineRegions fails

2018-11-03 Thread Allan Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allan Yang updated HBASE-21421:
---
Attachment: HBASE-21421.branch-2.0.002.patch

> Do not kill RS if reportOnlineRegions fails
> ---
>
> Key: HBASE-21421
> URL: https://issues.apache.org/jira/browse/HBASE-21421
> Project: HBase
> Issue Type: Sub-task
> Affects Versions: 2.1.1, 2.0.2
> Reporter: Allan Yang
> Assignee: Allan Yang
> Priority: Major
> Attachments: HBASE-21421.branch-2.0.001.patch, 
> HBASE-21421.branch-2.0.002.patch
>
>


[jira] [Updated] (HBASE-21421) Do not kill RS if reportOnlineRegions fails

2018-11-02 Thread Allan Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allan Yang updated HBASE-21421:
---
Attachment: (was: HBASE-21421.branch-2.0.001.patch)

> Do not kill RS if reportOnlineRegions fails
> ---
>
> Key: HBASE-21421
> URL: https://issues.apache.org/jira/browse/HBASE-21421
> Project: HBase
> Issue Type: Sub-task
> Affects Versions: 2.1.1, 2.0.2
> Reporter: Allan Yang
> Assignee: Allan Yang
> Priority: Major
> Attachments: HBASE-21421.branch-2.0.001.patch
>
>


[jira] [Updated] (HBASE-21421) Do not kill RS if reportOnlineRegions fails

2018-11-02 Thread Allan Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allan Yang updated HBASE-21421:
---
Attachment: HBASE-21421.branch-2.0.001.patch

> Do not kill RS if reportOnlineRegions fails
> ---
>
> Key: HBASE-21421
> URL: https://issues.apache.org/jira/browse/HBASE-21421
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.1, 2.0.2
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-21421.branch-2.0.001.patch
>
>


[jira] [Updated] (HBASE-21421) Do not kill RS if reportOnlineRegions fails

2018-11-01 Thread Allan Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allan Yang updated HBASE-21421:
---
Status: Patch Available  (was: Open)

> Do not kill RS if reportOnlineRegions fails
> ---
>
> Key: HBASE-21421
> URL: https://issues.apache.org/jira/browse/HBASE-21421
> Project: HBase
> Issue Type: Sub-task
> Affects Versions: 2.0.2, 2.1.1
> Reporter: Allan Yang
> Assignee: Allan Yang
> Priority: Major
> Attachments: HBASE-21421.branch-2.0.001.patch
>
>


[jira] [Updated] (HBASE-21421) Do not kill RS if reportOnlineRegions fails

2018-11-01 Thread Allan Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allan Yang updated HBASE-21421:
---
Attachment: HBASE-21421.branch-2.0.001.patch

> Do not kill RS if reportOnlineRegions fails
> ---
>
> Key: HBASE-21421
> URL: https://issues.apache.org/jira/browse/HBASE-21421
> Project: HBase
> Issue Type: Sub-task
> Affects Versions: 2.1.1, 2.0.2
> Reporter: Allan Yang
> Assignee: Allan Yang
> Priority: Major
> Attachments: HBASE-21421.branch-2.0.001.patch
>
>


[jira] [Updated] (HBASE-21421) Do not kill RS if reportOnlineRegions fails

2018-11-01 Thread Allan Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allan Yang updated HBASE-21421:
---
Description: 
In the periodic regionServerReport from the RS to the master, we call 
master.getAssignmentManager().reportOnlineRegions() to make sure the RS has 
the same state as the Master. If the RS holds a region which the master thinks 
should be on another RS, the Master will kill the RS.

But the regionServerReport could be lagging (due to network delay, for example), 
so it may not represent the current state of the RegionServer. Besides, when 
onlining a region, we call reportRegionStateTransition and retry forever until 
it is successfully reported to the master. We can count on 
reportRegionStateTransition calls.

I have encountered cases where regions were closed on the RS and 
reportRegionStateTransition to the master succeeded. But later, a lagging 
regionServerReport told the master that the region was online on the RS (which 
it was not at that moment; the call may have been generated some time ago and 
delayed by the network). The master then thought the region should be on 
another RS and killed the RS, which should not happen.

  was:
In the periodic regionServerReport call from the RS to the master, we check 
master.getAssignmentManager().reportOnlineRegions() to make sure the RS has a 
different state from the Master. If the RS holds a region which the master 
thinks should be on another RS, the Master will kill the RS.

But the regionServerReport could be lagging (due to network delay, for example), 
so it may not represent the current state of the RegionServer. Besides, when 
onlining a region, we call reportRegionStateTransition and retry forever until 
it is successfully reported to the master. We can count on 
reportRegionStateTransition calls.

I have encountered cases where regions were closed on the RS and 
reportRegionStateTransition to the master succeeded. But later, a lagging 
regionServerReport told the master that the region was online on the RS (which 
it was not at that moment; the call may have been generated some time ago and 
delayed by the network). The master then thought the region should be on 
another RS and killed the RS, which should not happen.


> Do not kill RS if reportOnlineRegions fails
> ---
>
> Key: HBASE-21421
> URL: https://issues.apache.org/jira/browse/HBASE-21421
> Project: HBase
> Issue Type: Sub-task
> Affects Versions: 2.1.1, 2.0.2
> Reporter: Allan Yang
> Assignee: Allan Yang
> Priority: Major
>