[ https://issues.apache.org/jira/browse/HBASE-3380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12973945#action_12973945 ]

Jonathan Gray commented on HBASE-3380:
--------------------------------------

bq. I don't mind the changes as it helps on cold starts, but I think that in 
the failover case calling regionServerTracker.getOnlineServers().size() would 
be simple and reliable.

I don't think it's necessarily simple, and I would prefer not to do halfway 
fixes where we are still actually relying on heartbeats.
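
For concreteness, that suggestion amounts to something like the sketch below, 
written against the raw ZooKeeper client rather than our RegionServerTracker 
API; the /hbase/rs parent path is an assumption here (the real path is 
configurable):

{code:java}
// Minimal sketch, not HBase's actual RegionServerTracker: each live RS holds
// an ephemeral znode under the RS parent node, so counting the children gives
// the number of servers ZooKeeper currently considers alive. "/hbase/rs" is
// assumed for this sketch; the real parent path is configurable.
import java.util.List;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooKeeper;

public class LiveServerCount {
  static int countOnlineServers(ZooKeeper zk)
      throws KeeperException, InterruptedException {
    List<String> children = zk.getChildren("/hbase/rs", false);
    return children.size();
  }
}
{code}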

The simplest implementation of this fix would create a new bug where the RS has 
actually died but its znode is not gone yet.  The master would then wait until 
it reaches N regionservers but only ever get heartbeats from N-1.  So we'd need 
a new maxTimeout parameter to deal with this case (and now you're basically 
where we are with my patch, but without the added benefits of the new configs).
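
To make that failure mode concrete, here is a hypothetical wait loop (the 
names are invented for illustration, not real HBase APIs): with one dead 
server whose znode still lingers, the checked-in count stalls at N-1 and only 
the added maxTimeout lets the master proceed:

{code:java}
// Hypothetical sketch of the new bug: 'expected' was taken from the znode
// count, but one of those servers is actually dead and never heartbeats, so
// checkedInCount() stalls at expected - 1. Without the extra maxTimeoutMs
// parameter this loop would never exit.
public class WaitForCheckin {
  static int expected;            // znode count snapshotted at failover
  static int checkedInCount() {   // stand-in for the master's heartbeat count
    return expected - 1;          // simulates one dead-but-znode-present RS
  }

  static void waitForRegionServers(long maxTimeoutMs)
      throws InterruptedException {
    long start = System.currentTimeMillis();
    while (checkedInCount() < expected
        && System.currentTimeMillis() - start < maxTimeoutMs) {
      Thread.sleep(100);
    }
  }
}
{code}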

I think these timeout changes are all that is necessary to fix both cases until 
we remove heartbeats in 0.92.
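
For context, the timeout-based approach is roughly a quiet-period wait like 
the sketch below; the config names are my shorthand for the new configs and 
should be treated as illustrative, and the checked-in counter is a 
hypothetical supplier, not a real HBase API:

{code:java}
// Illustrative quiet-period wait: proceed once at least minToStart servers
// have checked in and no new server has arrived for 'interval' ms, or once
// 'timeout' ms have elapsed overall. Config names are stand-ins.
import java.util.function.IntSupplier;
import org.apache.hadoop.conf.Configuration;

public class QuietPeriodWait {
  static void waitForRegionServers(Configuration conf, IntSupplier checkedIn)
      throws InterruptedException {
    long interval = conf.getLong("hbase.master.wait.on.regionservers.interval", 1500);
    long timeout = conf.getLong("hbase.master.wait.on.regionservers.timeout", 4500);
    int minToStart = conf.getInt("hbase.master.wait.on.regionservers.mintostart", 1);

    long start = System.currentTimeMillis();
    long lastChange = start;
    int lastCount = -1;
    while (true) {
      long now = System.currentTimeMillis();
      int count = checkedIn.getAsInt();
      if (count != lastCount) {   // a new server arrived; reset quiet period
        lastCount = count;
        lastChange = now;
      }
      if ((count >= minToStart && now - lastChange >= interval)
          || now - start >= timeout) {
        return;                   // quiet long enough, or out of time
      }
      Thread.sleep(100);
    }
  }
}
{code}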

> Master failover can split logs of live servers
> ----------------------------------------------
>
>                 Key: HBASE-3380
>                 URL: https://issues.apache.org/jira/browse/HBASE-3380
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Jean-Daniel Cryans
>            Priority: Blocker
>             Fix For: 0.90.0
>
>         Attachments: HBASE-3380-v1.patch
>
>
> The reason why TestMasterFailover fails is that when it does the master 
> failover, the new master doesn't wait long enough for all region servers to 
> check in, so it goes ahead and splits the logs... which doesn't work because 
> of the way lease timeouts work:
> {noformat}
> 2010-12-21 07:30:36,977 DEBUG [Master:0;vesta.apache.org:33170] wal.HLogSplitter(256): Splitting hlog 1 of 1: hdfs://localhost:49187/user/hudson/.logs/vesta.apache.org,38743,1292916616340/vesta.apache.org%3A38743.1292916617204, length=0
> 2010-12-21 07:30:36,977 DEBUG [WriterThread-1] wal.HLogSplitter$WriterThread(619): Writer thread Thread[WriterThread-1,5,main]: starting
> 2010-12-21 07:30:36,977 DEBUG [WriterThread-2] wal.HLogSplitter$WriterThread(619): Writer thread Thread[WriterThread-2,5,main]: starting
> 2010-12-21 07:30:36,977 INFO  [Master:0;vesta.apache.org:33170] util.FSUtils(625): Recovering file hdfs://localhost:49187/user/hudson/.logs/vesta.apache.org,38743,1292916616340/vesta.apache.org%3A38743.1292916617204
> 2010-12-21 07:30:36,979 WARN  [IPC Server handler 8 on 49187] namenode.FSNamesystem(1122): DIR* NameSystem.startFile: failed to create file /user/hudson/.logs/vesta.apache.org,38743,1292916616340/vesta.apache.org%3A38743.1292916617204 for DFSClient_hb_m_vesta.apache.org:33170_1292916630791 on client 127.0.0.1, because this file is already being created by DFSClient_hb_rs_vesta.apache.org,38743,1292916616340_1292916617166 on 127.0.0.1
> ...
> 2010-12-21 07:33:44,332 WARN  [Master:0;vesta.apache.org:33170] util.FSUtils(644): Waited 187354ms for lease recovery on hdfs://localhost:49187/user/hudson/.logs/vesta.apache.org,38743,1292916616340/vesta.apache.org%3A38743.1292916617204: org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to create file /user/hudson/.logs/vesta.apache.org,38743,1292916616340/vesta.apache.org%3A38743.1292916617204 for DFSClient_hb_m_vesta.apache.org:33170_1292916630791 on client 127.0.0.1, because this file is already being created by DFSClient_hb_rs_vesta.apache.org,38743,1292916616340_1292916617166 on 127.0.0.1
> {noformat}
> I think that we should always check in ZK for the number of live region 
> servers before waiting for them to check in; that way we know how many to 
> expect during failover. There's also a case where we still want to time out, 
> since an RS can die during that time, but we should wait a bit longer.
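
Putting the description's proposal together, a hedged sketch (hypothetical 
names; the znode count is taken as in the earlier sketch, and the checked-in 
counter is a stand-in, not a real API):

{code:java}
// Sketch of the proposal: take the expected server count from the ZK znodes
// before waiting, then wait with a longer timeout in case an RS dies in the
// meantime. checkedInCount() and the deadline value are illustrative.
import org.apache.zookeeper.ZooKeeper;

public class FailoverWait {
  static int checkedInCount() { return 0; } // master's checked-in RS count

  static void waitForFailover(ZooKeeper zk) throws Exception {
    int expected = zk.getChildren("/hbase/rs", false).size();
    long deadline = System.currentTimeMillis() + 180000L; // "a bit longer"
    while (checkedInCount() < expected
        && System.currentTimeMillis() < deadline) {
      Thread.sleep(100);
    }
  }
}
{code}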
