[
https://issues.apache.org/jira/browse/HBASE-9387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13754895#comment-13754895
]
stack commented on HBASE-9387:
------------------------------
-1 on making new returns other than -1, at least not w/o review and design;
ain't this all complex enough?
On the patch: it should be log.error, not log.warn, and it is not needed anyways
since doesn't abort log?
This is an awful workaround but I have no better idea currently.
Regarding the other places needing review in the open region handler, I think
they are fine... it is just this edge, where control of the region is being
handed off, that is the issue.
Are there other edges, around close say, that have similar issues?
> Region could get lost during assignment
> ---------------------------------------
>
> Key: HBASE-9387
> URL: https://issues.apache.org/jira/browse/HBASE-9387
> Project: HBase
> Issue Type: Bug
> Components: Region Assignment
> Affects Versions: 0.95.2
> Reporter: Ted Yu
> Assignee: Ted Yu
> Priority: Critical
> Attachments: 9387-v1.txt, 9387-v3.txt, hbase-9387.patch,
> org.apache.hadoop.hbase.TestFullLogReconstruction-output.txt
>
>
> I observed test timeout running against hadoop 2.1.0 with distributed log
> replay turned on.
> Looks like region state for 1588230740 became inconsistent between master and
> the surviving region server:
> {code}
> 2013-08-29 22:15:34,180 INFO [AM.ZK.Worker-pool2-t4] master.RegionStates(299): Onlined 1588230740 on kiyo.gq1.ygridcore.net,57016,1377814510039
> ...
> 2013-08-29 22:15:34,587 DEBUG [Thread-221] client.HConnectionManager$HConnectionImplementation(1269): locateRegionInMeta parentTable=hbase:meta, metaLocation={region=hbase:meta,,1.1588230740, hostname=kiyo.gq1.ygridcore.net,57016,1377814510039, seqNum=0}, attempt=2 of 35 failed; retrying after sleep of 302 because: org.apache.hadoop.hbase.exceptions.RegionOpeningException: Region is being opened: 1588230740
> at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:2574)
> at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:3949)
> at org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:2733)
> at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:26965)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2063)
> at org.apache.hadoop.hbase.ipc.RpcServer$CallRunner.run(RpcServer.java:1800)
> at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:165)
> at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:41)
> {code}