[ 
https://issues.apache.org/jira/browse/HBASE-15019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15068637#comment-15068637
 ] 

Matteo Bertozzi commented on HBASE-15019:
-----------------------------------------

So, on the RS we know when we failed to close a WAL, and we know when the open 
failure is caused by a file that was not closed (lease recovery never called).
 * We can abort the RS once we get stuck on this problem, and the WAL split 
will take care of it.
 * We can try to call recoverLease() on the RS when replication is not able to 
open the file (see the sketch after this list). In the worst case (e.g. a 
partition) the Master and the RS will fight for the lease and we will get 
stuck there.
 * We can add an RPC to the Master and ask it to do the lease recovery, so 
only the Master performs lease recovery and we avoid the RS and the Master 
fighting over it.
 * We can keep a list of the not-closed streams in the FSHLog and try to close 
them every once in a while. If we are able to append to a new file, we should 
be able to close the old one.
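
For reference, a minimal sketch of what the lease recovery in the second and 
third option boils down to, assuming the WAL lives on a DistributedFileSystem. 
The recoverWalLease() helper and the polling loop are just my illustration; 
only recoverLease() and isFileClosed() are real HDFS calls.
{noformat}
import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class WalLeaseRecovery {
  /**
   * Ask the NameNode to recover the lease on an un-closed WAL and wait
   * (bounded) until the file is closed, so a reader can see its real length.
   */
  public static boolean recoverWalLease(FileSystem fs, Path wal, long timeoutMs)
      throws IOException, InterruptedException {
    if (!(fs instanceof DistributedFileSystem)) {
      return true; // nothing to recover on local/other file systems
    }
    DistributedFileSystem dfs = (DistributedFileSystem) fs;
    long deadline = System.currentTimeMillis() + timeoutMs;
    // recoverLease() returns true if the file is already closed.
    boolean closed = dfs.recoverLease(wal);
    while (!closed && System.currentTimeMillis() < deadline) {
      Thread.sleep(1000);              // give the NameNode time to finish
      closed = dfs.isFileClosed(wal);  // poll until the last block is finalized
    }
    return closed;
  }
}
{noformat}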

The first option is the easiest one, but we kill the RS. 
The second one is probably a no-go, since it may cause a deadlock in the worst 
case.
The third one requires a new RPC, which may be ok, but meh...
The fourth looks like a hack, but it is simple and isolated. The only problem 
is the strong assumption that we are able to close a stream we are hanging on 
to even after an HDFS restart (which seems to work, I'm trying to test it). 
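
To make the fourth option concrete, here is a hedged sketch (not the actual 
FSHLog code) of the "remember the not-closed writers and retry" idea; the 
UnclosedWriterJanitor name and the chore wiring are made up for illustration.
{noformat}
import java.io.Closeable;
import java.io.IOException;
import java.util.Iterator;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

public class UnclosedWriterJanitor {
  // Writers whose close() failed during a log roll (e.g. while HDFS was down).
  private final Queue<Closeable> unclosedWriters =
      new ConcurrentLinkedQueue<Closeable>();

  /** Called by the log roller when closing the old writer fails. */
  public void rememberFailedClose(Closeable writer) {
    unclosedWriters.add(writer);
  }

  /** Invoked periodically (e.g. from a chore); retries the pending closes. */
  public void retryCloses() {
    Iterator<Closeable> it = unclosedWriters.iterator();
    while (it.hasNext()) {
      Closeable writer = it.next();
      try {
        writer.close();   // a successful close releases the lease, so
        it.remove();      // replication can finally open the old WAL
      } catch (IOException e) {
        // HDFS may still be unhappy; keep the writer and try again later.
      }
    }
  }
}
{noformat}
Whether close() still succeeds on the old stream after an HDFS restart is 
exactly the assumption mentioned above.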

> Replication stuck when HDFS is restarted
> ----------------------------------------
>
>                 Key: HBASE-15019
>                 URL: https://issues.apache.org/jira/browse/HBASE-15019
>             Project: HBase
>          Issue Type: Bug
>          Components: Replication, wal
>    Affects Versions: 2.0.0, 1.2.0, 1.1.2, 1.0.3, 0.98.16.1
>            Reporter: Matteo Bertozzi
>            Assignee: Matteo Bertozzi
>
> The RS is working normally and writing to the WAL.
> HDFS is killed and restarted, and the RS tries to do a log roll.
> The close fails, but the roll succeeds (because HDFS is now up) and 
> everything works.
> {noformat}
> 2015-12-11 21:52:28,058 ERROR org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter: Got IOException while writing trailer
> java.io.IOException: All datanodes 10.51.30.152:50010 are bad. Aborting...
>   at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1147)
>   at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:945)
>   at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:496)
> 2015-12-11 21:52:28,059 ERROR org.apache.hadoop.hbase.regionserver.wal.FSHLog: Failed close of HLog writer
> java.io.IOException: All datanodes 10.51.30.152:50010 are bad. Aborting...
>   at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1147)
>   at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:945)
>   at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:496)
> 2015-12-11 21:52:28,059 WARN org.apache.hadoop.hbase.regionserver.wal.FSHLog: Riding over HLog close failure! error count=1
> {noformat}
> The problem is on the replication side: the log we rolled but were not able 
> to close is waiting for a lease recovery.
> {noformat}
> 2015-12-11 21:16:31,909 ERROR org.apache.hadoop.hbase.regionserver.wal.HLogFactory: Can't open after 267 attempts and 301124ms 
> {noformat}
> The WALFactory notifies us about that, but there is nothing on the RS side 
> that performs the WAL recovery.
> {noformat}
> 2015-12-11 21:11:30,921 WARN org.apache.hadoop.hbase.regionserver.wal.HLogFactory: Lease should have recovered. This is not expected. Will retry
> java.io.IOException: Cannot obtain block length for LocatedBlock{BP-1547065147-10.51.30.152-1446756937665:blk_1073801614_61243; getBlockSize()=83; corrupt=false; offset=0; locs=[10.51.30.154:50010, 10.51.30.152:50010, 10.51.30.155:50010]}
>   at org.apache.hadoop.hdfs.DFSInputStream.readBlockLength(DFSInputStream.java:358)
>   at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:300)
>   at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:237)
>   at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:230)
>   at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1448)
>   at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:301)
>   at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:297)
>   at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:297)
>   at org.apache.hadoop.fs.FilterFileSystem.open(FilterFileSystem.java:161)
>   at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:766)
>   at org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createReader(HLogFactory.java:116)
>   at org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createReader(HLogFactory.java:89)
>   at org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createReader(HLogFactory.java:77)
>   at org.apache.hadoop.hbase.replication.regionserver.ReplicationHLogReaderManager.openReader(ReplicationHLogReaderManager.java:68)
>   at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.openReader(ReplicationSource.java:508)
>   at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:321)
> {noformat}
> The only way to trigger a WAL recovery is to restart the RS and force the 
> master to trigger the lease recovery during the WAL split.
> Since we know that the RS is still alive, should we try to recover the lease 
> on the RS side?
> Or is it better/safer to trigger an abort on the RS, so that only the master 
> does lease recovery?



