[jira] [Commented] (HBASE-5829) Inconsistency between the regions map and the servers map in AssignmentManager

2012-04-26 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13262793#comment-13262793
 ] 

stack commented on HBASE-5829:
--

@Ted Make a new issue?

 Inconsistency between the regions map and the servers map in AssignmentManager
 ------------------------------------------------------------------------------

                 Key: HBASE-5829
                 URL: https://issues.apache.org/jira/browse/HBASE-5829
             Project: HBase
          Issue Type: Bug
          Components: master
    Affects Versions: 0.90.6, 0.92.1
            Reporter: Maryann Xue
         Attachments: HBASE-5829-0.90.patch, HBASE-5829-trunk.patch


 There are occurrences in AssignmentManager (AM) where this.servers is not
 kept consistent with this.regions. This might cause the balancer to offline a
 region from an RS that already returned NotServingRegionException at a
 previous offline attempt.

 In AssignmentManager.unassign(HRegionInfo, boolean):

     try {
       // TODO: We should consider making this look more like it does for the
       // region open where we catch all throwables and never abort
       if (serverManager.sendRegionClose(server, state.getRegion(),
           versionOfClosingNode)) {
         LOG.debug("Sent CLOSE to " + server + " for region " +
           region.getRegionNameAsString());
         return;
       }
       // This never happens. Currently regionserver close always return true.
       LOG.warn("Server " + server + " region CLOSE RPC returned false for " +
         region.getRegionNameAsString());
     } catch (NotServingRegionException nsre) {
       LOG.info("Server " + server + " returned " + nsre + " for " +
         region.getRegionNameAsString());
       // Presume that master has stale data.  Presume remote side just split.
       // Presume that the split message when it comes in will fix up the master's
       // in memory cluster state.
     } catch (Throwable t) {
       if (t instanceof RemoteException) {
         t = ((RemoteException)t).unwrapRemoteException();
         if (t instanceof NotServingRegionException) {
           if (checkIfRegionBelongsToDisabling(region)) {
             // Remove from the regionsinTransition map
             LOG.info("While trying to recover the table "
                 + region.getTableNameAsString()
                 + " to DISABLED state the region " + region
                 + " was offlined but the table was in DISABLING state");
             synchronized (this.regionsInTransition) {
               this.regionsInTransition.remove(region.getEncodedName());
             }
             // Remove from the regionsMap
             synchronized (this.regions) {
               this.regions.remove(region);
             }
             deleteClosingOrClosedNode(region);
           }
         }
         // RS is already processing this region, only need to update the timestamp
         if (t instanceof RegionAlreadyInTransitionException) {
           LOG.debug("update " + state + " the timestamp.");
           state.update(state.getState());
         }
       }

 In AssignmentManager.assign(HRegionInfo, RegionState, boolean, boolean, boolean):

     synchronized (this.regions) {
       this.regions.put(plan.getRegionInfo(), plan.getDestination());
     }
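
 One way to read the report is that the two maps are two views of the same
 assignment state, so a change to one has to be mirrored in the other while
 the same lock is held. The following is only a minimal sketch of that idea,
 not the attached patch: it uses plain strings instead of HRegionInfo and
 ServerName, and RegionBookkeeping, addRegion, removeRegion, regionsOn and
 isConsistent are hypothetical names that do not exist in AssignmentManager.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical stand-in for the two maps discussed in this issue. */
final class RegionBookkeeping {
  // region -> hosting server (plays the role of this.regions)
  private final Map<String, String> regions = new HashMap<>();
  // server -> regions it hosts (plays the role of this.servers)
  private final Map<String, Set<String>> servers = new HashMap<>();

  /** Records an assignment in both maps, dropping any previous host. */
  void addRegion(String region, String server) {
    synchronized (regions) {
      String previous = regions.put(region, server);
      if (previous != null && !previous.equals(server)) {
        Set<String> old = servers.get(previous);
        if (old != null) {
          old.remove(region);
        }
      }
      servers.computeIfAbsent(server, s -> new TreeSet<String>()).add(region);
    }
  }

  /** Removes the region from both maps under one lock. */
  void removeRegion(String region) {
    synchronized (regions) {
      String server = regions.remove(region);
      if (server == null) {
        return; // region was not assigned
      }
      Set<String> hosted = servers.get(server);
      if (hosted != null) {
        hosted.remove(region);
      }
    }
  }

  /** Returns a copy of the regions recorded for the given server. */
  Set<String> regionsOn(String server) {
    synchronized (regions) {
      Set<String> hosted = servers.get(server);
      return hosted == null ? new TreeSet<String>() : new TreeSet<String>(hosted);
    }
  }

  /** Returns true if every entry in one map is mirrored in the other. */
  boolean isConsistent() {
    synchronized (regions) {
      for (Map.Entry<String, String> e : regions.entrySet()) {
        Set<String> hosted = servers.get(e.getValue());
        if (hosted == null || !hosted.contains(e.getKey())) {
          return false;
        }
      }
      for (Map.Entry<String, Set<String>> e : servers.entrySet()) {
        for (String region : e.getValue()) {
          if (!e.getKey().equals(regions.get(region))) {
            return false;
          }
        }
      }
      return true;
    }
  }

  public static void main(String[] args) {
    RegionBookkeeping book = new RegionBookkeeping();
    book.addRegion("r1", "rs1");
    book.addRegion("r2", "rs1");
    book.removeRegion("r1");
    System.out.println(book.regionsOn("rs1") + " consistent=" + book.isConsistent());
  }
}
```

 Routing every update through a pair of helpers like these is a design choice
 rather than a requirement; the point is only that no code path touches one
 map without the other.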

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5829) Inconsistency between the regions map and the servers map in AssignmentManager

2012-04-26 Thread Zhihong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13262798#comment-13262798
 ] 

Zhihong Yu commented on HBASE-5829:
---

The latest patch is good to go.
Useless statement can be addressed elsewhere.





[jira] [Commented] (HBASE-5829) Inconsistency between the regions map and the servers map in AssignmentManager

2012-04-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13263372#comment-13263372
 ] 

Hudson commented on HBASE-5829:
---

Integrated in HBase-TRUNK-security #186 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/186/])
HBASE-5829 Inconsistency between the regions map and the servers map in 
AssignmentManager (Revision 1330993)

 Result = SUCCESS
stack : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java


 Inconsistency between the regions map and the servers map in 
 AssignmentManager
 --

 Key: HBASE-5829
 URL: https://issues.apache.org/jira/browse/HBASE-5829
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.6, 0.92.1
Reporter: Maryann Xue
Assignee: Maryann Xue
 Fix For: 0.96.0

 Attachments: HBASE-5829-0.90.patch, HBASE-5829-trunk.patch






[jira] [Commented] (HBASE-5829) Inconsistency between the regions map and the servers map in AssignmentManager

2012-04-25 Thread Maryann Xue (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261327#comment-13261327
 ] 

Maryann Xue commented on HBASE-5829:


@ For the second, I think we should guarantee that the region is also added to
the this.servers map.





[jira] [Commented] (HBASE-5829) Inconsistency between the regions map and the servers map in AssignmentManager

2012-04-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261353#comment-13261353
 ] 

Hadoop QA commented on HBASE-5829:
--

-1 overall.  Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12524120/HBASE-5829-trunk.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to introduce 5 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed these unit tests:
                  org.apache.hadoop.hbase.TestRegionRebalancing

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1643//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1643//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1643//console

This message is automatically generated.


  

[jira] [Commented] (HBASE-5829) Inconsistency between the regions map and the servers map in AssignmentManager

2012-04-25 Thread Zhihong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261655#comment-13261655
 ] 

Zhihong Yu commented on HBASE-5829:
---

Patch makes sense.
w.r.t. this.servers, I found a useless statement (at least in trunk):
{code}
  void unassignCatalogRegions() {
    this.servers.entrySet();
{code}
that should be removed.





[jira] [Commented] (HBASE-5829) Inconsistency between the regions map and the servers map in AssignmentManager

2012-04-23 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13260238#comment-13260238
 ] 

stack commented on HBASE-5829:
--

Do you have a patch for us Maryann?  The first at least seems legit (For the 
second, there is no associated server, right?)





[jira] [Commented] (HBASE-5829) Inconsistency between the regions map and the servers map in AssignmentManager

2012-04-20 Thread Maryann Xue (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13258074#comment-13258074
 ] 

Maryann Xue commented on HBASE-5829:


In AssignmentManager.unassign(HRegionInfo, boolean):

    // Remove from the regionsMap
    synchronized (this.regions) {
      this.regions.remove(region);
    }

In AssignmentManager.assign(HRegionInfo, RegionState, boolean, boolean, boolean):

    synchronized (this.regions) {
      this.regions.put(plan.getRegionInfo(), plan.getDestination());
    }

In both places, the region is not updated in or removed from this.servers,
which might cause the balancer to generate incorrect region plans.
After the fix for HBASE-5563, it seems this problem won't cause an endless loop
of wrong balances or leave a region always in transition.
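
To make the failure mode concrete, here is a small simulation of a stale
servers map feeding a balancer. It is a sketch under loud assumptions:
StalePlanDemo and pickRegionToMove are hypothetical, the "balancer" is a toy
that looks only at the servers map and picks the first region on the most
loaded server, and regions and servers are plain strings. The real balancer
is more involved; only the shape of the problem is the same.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class StalePlanDemo {
  /**
   * Toy balancer (hypothetical): returns the first region listed for the
   * most loaded server, judged only by the servers map.
   */
  static String pickRegionToMove(Map<String, List<String>> servers) {
    String busiest = null;
    int max = 0;
    for (Map.Entry<String, List<String>> e : servers.entrySet()) {
      if (e.getValue().size() > max) {
        max = e.getValue().size();
        busiest = e.getKey();
      }
    }
    return busiest == null ? null : servers.get(busiest).get(0);
  }

  public static void main(String[] args) {
    Map<String, String> regions = new HashMap<>();
    Map<String, List<String>> servers = new LinkedHashMap<>();
    servers.put("rs1", new ArrayList<>(Arrays.asList("r1", "r2", "r3")));
    servers.put("rs2", new ArrayList<>(Arrays.asList("r4")));
    for (Map.Entry<String, List<String>> e : servers.entrySet()) {
      for (String region : e.getValue()) {
        regions.put(region, e.getKey());
      }
    }

    // Simulate the reported path: the region is dropped from regions only,
    // so the servers map still lists it under rs1.
    regions.remove("r1");

    String candidate = pickRegionToMove(servers);
    // Prints "r1 still in regions map: false"
    System.out.println(candidate + " still in regions map: "
        + regions.containsKey(candidate));
  }
}
```

The toy balancer proposes moving r1 off rs1 even though the regions map no
longer knows about it, which corresponds to the incorrect region plan
described above.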
 





[jira] [Commented] (HBASE-5829) Inconsistency between the regions map and the servers map in AssignmentManager

2012-04-19 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13257704#comment-13257704
 ] 

stack commented on HBASE-5829:
--

Please explain where in the code the disparity between this.servers and
this.regions is, Maryann.
