[ 
https://issues.apache.org/jira/browse/HBASE-5995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-5995:
---------------------------------

    Attachment: hbase-5995_v1.patch

Attaching a candidate patch for this. Per my initial analysis, it seems that 
we have to call recoverLease() before opening the WAL files for read. 

We do not crash the region servers in this test, so normally all log files 
should be closed, and recoverLease() should not be necessary. However, we do 
restart all the datanodes, and when we then trigger a log roll, 
DFSOutputStream.close() fails with an exception: 
{code}
2013-05-03 11:38:28,366 ERROR [RegionServer:1;10.11.3.18,51418,1367606292279.logRoller] wal.FSHLog(691): Failed close of HLog writer
java.io.IOException: All datanodes 127.0.0.1:51404 are bad. Aborting...
  at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:941)
  at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:756)
  at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:425)
{code}

We just ride over the close() exception, so the file may be left unclosed 
with its lease still held; we should therefore call recoverLease() afterwards. 
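
To illustrate the idea only (this is not the code in the attached patch), 
here is a minimal sketch of "recover the lease, then open for read". The 
LeaseRecoverer interface is a hypothetical stand-in for HDFS's 
DistributedFileSystem.recoverLease(Path), which returns true once the file 
is closed; the class name, method name, retry count, and pause are all 
assumptions made for the example. 

```java
import java.io.IOException;

public class WalLeaseRecovery {

    /**
     * Hypothetical stand-in for the single HDFS call used here,
     * DistributedFileSystem.recoverLease(Path): returns true once the
     * file is closed and therefore safe to open for read.
     */
    public interface LeaseRecoverer {
        boolean recoverLease(String path) throws IOException;
    }

    /**
     * Asks for lease recovery until the file is reported closed or the
     * attempts run out. The caller should open the WAL for read only
     * when this returns true.
     */
    public static boolean recoverLeaseBeforeRead(LeaseRecoverer fs, String walPath,
            int maxAttempts, long pauseMillis) throws IOException, InterruptedException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (fs.recoverLease(walPath)) {
                return true; // file is closed, reading is safe
            }
            if (attempt < maxAttempts) {
                Thread.sleep(pauseMillis); // give recovery time to finish
            }
        }
        return false; // still not closed; let the caller decide what to do
    }
}
```

With a real cluster, the LeaseRecoverer would simply delegate to the 
DistributedFileSystem instance. A bounded retry is used here because lease 
recovery is asynchronous and the first call may return false. 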

I have yet to check why on earth we are able to run this successfully with 
Hadoop1. 

The test succeeds with the patch. 
                
> Fix and reenable TestLogRolling.testLogRollOnPipelineRestart
> ------------------------------------------------------------
>
>                 Key: HBASE-5995
>                 URL: https://issues.apache.org/jira/browse/HBASE-5995
>             Project: HBase
>          Issue Type: Sub-task
>          Components: test
>    Affects Versions: 0.95.2
>            Reporter: stack
>            Assignee: Enis Soztutar
>            Priority: Blocker
>             Fix For: 0.95.1
>
>         Attachments: hbase-5995_v1.patch
>
>
> HBASE-5984 disabled this flaky test (see that issue for more).  This issue is 
> about getting it enabled again.  Made a blocker on 0.96.0 so it gets 
> attention.

