[
https://issues.apache.org/jira/browse/HBASE-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jim Kellerman updated HBASE-1104:
---------------------------------
Attachment: 1104.patch.1
> stack - 07/Jan/09 08:42 PM
> Did you mean to add in changes to Index: src/webapps/master/WEB-INF/web.xml?
No, and I'm not sure how it got changed. Reverted.
> Want to add more javadoc to the @return below? (Not important...)
{code}
Index: src/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java
===================================================================
--- src/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java (revision 732591)
+++ src/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java (working copy)
@@ -126,6 +126,7 @@
* @param regionName name of the region to update
* @param b BatchUpdate
* @param expectedValues map of column names to expected data values.
+ * @return true if
{code}
Done. It was missing the @return altogether, and I just forgot to finish the
comment.
> Tell me about this change:
{code}
storedInfo = this.master.serverManager.getServerInfo(serverName);
deadServer = this.master.serverManager.isDead(serverName);
- deadServerAndLogsSplit =
- this.master.serverManager.isDeadServerLogsSplit(serverName);
and...
- if ((deadServerAndLogsSplit ||
- (!deadServer && (storedInfo == null ||
- (storedInfo.getStartCode() != startCode)))) &&
- this.regionManager.assignable(info)) {
+ if ((deadServer ||
+ (storedInfo == null || storedInfo.getStartCode() != startCode))) {
+
{code}
> It doesn't look right. The changes I made for 1099 were "allow assigning
> if it's a dead server and its commit logs HAVE been split" or "if NOT a
> dead server", because if it's a dead server and it didn't pass the first
> test, then its logs are being split. We don't want BaseScanner
> assigning to servers on the dead list. If regions are assigned to a server
> on the dead list, then when the dead server runs its scan in the shutdown
> handler, we'll reassign these regions as though they'd been on a crashed
> server, which makes for double assignment and a mess.
You're right. It was a half finished change. What I meant to do was
not assign regions that are offline, in transition or were assigned to
a dead server since ProcessServerShutdown does that.
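Roughly, the intended check looks like the sketch below. This is only an illustration of the rule described above, not the code in the patch: `AssignmentCheck`, `RegionSnapshot` and all of their fields are made-up names, and the start-code comparison is carried over from the condition in the diff.

```java
/** Hypothetical stand-in for the scanner's per-region decision; not the patch code. */
final class AssignmentCheck {

  /** What the scanner is assumed to know about one region row (illustrative only). */
  static final class RegionSnapshot {
    final boolean offline;         // region is marked offline
    final boolean inTransition;    // region is already being opened or closed
    final boolean lastServerDead;  // server recorded for the region is on the dead list
    final Long liveStartCode;      // start code of the live server, or null if unknown
    final long metaStartCode;      // start code recorded with the region

    RegionSnapshot(boolean offline, boolean inTransition, boolean lastServerDead,
        Long liveStartCode, long metaStartCode) {
      this.offline = offline;
      this.inTransition = inTransition;
      this.lastServerDead = lastServerDead;
      this.liveStartCode = liveStartCode;
      this.metaStartCode = metaStartCode;
    }
  }

  /** Returns true only if the scanner itself should hand the region out. */
  static boolean shouldAssign(RegionSnapshot r) {
    // Offline and in-transition regions are skipped; regions that were on a
    // dead server are left to the server shutdown processing.
    if (r.offline || r.inTransition || r.lastServerDead) {
      return false;
    }
    // Otherwise assign when the recorded server is unknown or has restarted.
    return r.liveStartCode == null || r.liveStartCode.longValue() != r.metaStartCode;
  }
}
```

The point of putting the dead-server test first is that a region on a dead server never reaches the start-code comparison, so the scanner and the shutdown handler cannot both assign it.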
> You also remove the new method assignable. Don't we want to check if
> region is 'assignable' before dropping into this assigning code block?
> (Not sure... so asking).
If we get this far, we know the region is assignable because of the
test above.
> Your patch does this, which, as discussed on IRC, is not what's wanted:
{code}
@@ -1088,12 +1088,8 @@
byte [] closestKey = store.getRowKeyAtOrBefore(row);
// If it happens to be an exact match, we can stop looping.
// Otherwise, we need to check if it's the max and move to the next
- if (HStoreKey.equalsTwoRowKeys(regionInfo, row, closestKey)) {
+ if (closestKey != null) {
key = new HStoreKey(closestKey, this.regionInfo);
- } else if (closestKey != null &&
- (key == null || HStoreKey.compareTwoRowKeys(
- regionInfo,closestKey, key.getRow()) > 0) ) {
- key = new HStoreKey(closestKey, this.regionInfo);
} else {
return null;
}
{code}
After some discussion with Stack, we determined that neither
implementation was correct. The new code is:
{code}
// get the closest key. (HStore.getRowKeyAtOrBefore can return null)
byte [] closestKey = store.getRowKeyAtOrBefore(row);
// If it happens to be an exact match, we can stop.
// Otherwise, we need to check if it's the max and move to the next
if (closestKey != null) {
  if (HStoreKey.equalsTwoRowKeys(regionInfo, row, closestKey)) {
    key = new HStoreKey(closestKey, this.regionInfo);
  }
  if (key == null) {
    key = new HStoreKey(closestKey, this.regionInfo);
  }
}
if (key == null) {
  return null;
}
{code}
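To make the selection easier to exercise in isolation, here is the same decision restated as a pure function. It is a sketch only: `ClosestKeyPick.pick` is a made-up name, plain byte arrays stand in for HStoreKey, and `java.util.Arrays.equals` stands in for HStoreKey.equalsTwoRowKeys, which is an assumption about how rows compare for equality.

```java
import java.util.Arrays;

/** Hypothetical restatement of the key selection above; not the patch code. */
final class ClosestKeyPick {

  /**
   * @param row        the row being looked up
   * @param closestKey the row at or before {@code row}, or null if there is none
   * @param current    the value of key before the block runs, or null
   * @return the key to carry forward; null means nothing was found
   */
  static byte[] pick(byte[] row, byte[] closestKey, byte[] current) {
    if (closestKey != null) {
      if (Arrays.equals(row, closestKey)) {
        return closestKey;  // exact match replaces whatever was held before
      }
      if (current == null) {
        return closestKey;  // nothing held yet, so take this candidate
      }
    }
    return current;         // keep the existing key (possibly null)
  }
}
```

Note that a non-exact candidate never displaces a key that is already held; only an exact match or an empty `current` changes the result.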
> Jim, do you think the change below is safe?
{code}
@@ -564,9 +566,10 @@
// the messages we've received. In this case, a close could be
// processed before an open resulting in the master not agreeing on
// the region's state.
+ master.regionManager.setClosed(region.getRegionName());
{code}
> Will we have the problem where state changes are processed out of
> order? Thinking on it, it doesn't seem so but asking just to check.
No, I don't think it is a problem, because the region is still in
transition and cannot be reassigned until the RegionState is removed
from the map.
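As a sketch of why that holds, assuming assignment consults a map of regions in transition: only `setClosed` is a name taken from the patch; `TransitionTracker`, its `State` enum, `isAssignable` and `removeRegion` are invented for illustration and do not mirror the real RegionManager.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical model of the in-transition map; not the RegionManager code. */
final class TransitionTracker {

  enum State { PENDING_OPEN, OPEN, CLOSING, CLOSED }

  // Region name -> current transition state. Presence in the map means
  // "in transition", whatever the state value is.
  private final Map<String, State> regionsInTransition = new ConcurrentHashMap<>();

  /** Records the close, but leaves the region in transition. */
  void setClosed(String regionName) {
    regionsInTransition.put(regionName, State.CLOSED);
  }

  /** A region can be handed out only when it has no entry in the map. */
  boolean isAssignable(String regionName) {
    return !regionsInTransition.containsKey(regionName);
  }

  /** Called once close processing is finished; the region becomes assignable. */
  void removeRegion(String regionName) {
    regionsInTransition.remove(regionName);
  }
}
```

Under this model, marking a region closed early does not open a window for reassignment, because the assignability test looks at membership in the map rather than at the state stored in it.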
> Doubly-assigned regions redux
> -----------------------------
>
> Key: HBASE-1104
> URL: https://issues.apache.org/jira/browse/HBASE-1104
> Project: Hadoop HBase
> Issue Type: Bug
> Environment: pset cluster with TRUNK.
> Reporter: stack
> Assignee: Jim Kellerman
> Fix For: 0.19.0
>
> Attachments: 1104.patch, 1104.patch.1
>
>
> Testing, I see doubly assigned regions. Below is from master log for
> TestTable,0000135598,1230761605500.
> {code}
> 2008-12-31 22:13:35,528 [IPC Server handler 2 on 60000] INFO
> org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_SPLIT:
> TestTable,0000116170,1230761152219: TestTable,0000116170,1230761152219 split;
> daughters: TestTable,0000116170,1230761605500,
> TestTable,0000135598,1230761605500 from XX.XX.XX.142:60020
> 2008-12-31 22:13:35,528 [IPC Server handler 2 on 60000] INFO
> org.apache.hadoop.hbase.master.RegionManager: assigning region
> TestTable,0000135598,1230761605500 to server XX.XX.XX.142:60020
> 2008-12-31 22:13:38,561 [IPC Server handler 6 on 60000] INFO
> org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_OPEN:
> TestTable,0000135598,1230761605500 from XX.XX.XX.142:60020
> 2008-12-31 22:13:38,562 [HMaster] INFO
> org.apache.hadoop.hbase.master.ProcessRegionOpen$1:
> TestTable,0000135598,1230761605500 open on XX.XX.XX.142:60020
> 2008-12-31 22:13:38,562 [HMaster] INFO
> org.apache.hadoop.hbase.master.ProcessRegionOpen$1: updating row
> TestTable,0000135598,1230761605500 in region .META.,,1 with startcode
> 1230759988953 and server XX.XX.XX.142:60020
> 2008-12-31 22:13:44,640 [IPC Server handler 4 on 60000] DEBUG
> org.apache.hadoop.hbase.master.RegionManager: Going to close region
> TestTable,0000135598,1230761605500
> 2008-12-31 22:13:50,441 [IPC Server handler 9 on 60000] INFO
> org.apache.hadoop.hbase.master.RegionManager: assigning region
> TestTable,0000135598,1230761605500 to server XX.XX.XX.139:60020
> 2008-12-31 22:13:53,457 [IPC Server handler 5 on 60000] INFO
> org.apache.hadoop.hbase.master.ServerManager: Received
> MSG_REPORT_PROCESS_OPEN: TestTable,0000135598,1230761605500 from
> XX.XX.XX.139:60020
> 2008-12-31 22:13:53,458 [IPC Server handler 5 on 60000] INFO
> org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_OPEN:
> TestTable,0000135598,1230761605500 from XX.XX.XX.139:60020
> 2008-12-31 22:13:53,458 [HMaster] INFO
> org.apache.hadoop.hbase.master.ProcessRegionOpen$1:
> TestTable,0000135598,1230761605500 open on XX.XX.XX.139:60020
> 2008-12-31 22:13:53,458 [HMaster] INFO
> org.apache.hadoop.hbase.master.ProcessRegionOpen$1: updating row
> TestTable,0000135598,1230761605500 in region .META.,,1 with startcode
> 1230759988788 and server XX.XX.XX.139:60020
> 2008-12-31 22:13:53,688 [IPC Server handler 6 on 60000] INFO
> org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_CLOSE:
> TestTable,0000135598,1230761605500 from XX.XX.XX.142:60020
> 2008-12-31 22:13:53,688 [HMaster] DEBUG
> org.apache.hadoop.hbase.master.HMaster: Processing todo: ProcessRegionClose
> of TestTable,0000135598,1230761605500, false
> 2008-12-31 22:13:54,263 [IPC Server handler 7 on 60000] INFO
> org.apache.hadoop.hbase.master.RegionManager: assigning region
> TestTable,0000135598,1230761605500 to server XX.XX.XX.141:60020
> 2008-12-31 22:13:57,273 [IPC Server handler 9 on 60000] INFO
> org.apache.hadoop.hbase.master.ServerManager: Received
> MSG_REPORT_PROCESS_OPEN: TestTable,0000135598,1230761605500 from
> XX.XX.XX.141:60020
> 2008-12-31 22:14:03,917 [IPC Server handler 0 on 60000] INFO
> org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_OPEN:
> TestTable,0000135598,1230761605500 from XX.XX.XX.141:60020
> 2008-12-31 22:14:03,917 [HMaster] INFO
> org.apache.hadoop.hbase.master.ProcessRegionOpen$1:
> TestTable,0000135598,1230761605500 open on XX.XX.XX.141:60020
> 2008-12-31 22:14:03,918 [HMaster] INFO
> org.apache.hadoop.hbase.master.ProcessRegionOpen$1: updating row
> TestTable,0000135598,1230761605500 in region .META.,,1 with startcode
> 1230759989031 and server XX.XX.XX.141:60020
> 2008-12-31 22:14:29,350 [RegionManager.metaScanner] DEBUG
> org.apache.hadoop.hbase.master.BaseScanner:
> TestTable,0000135598,1230761605500 no longer has references to
> TestTable,0000116170,1230761152219
> {code}
> See how we choose to assign before we get the close back from the
> regionserver.