[ 
https://issues.apache.org/jira/browse/HBASE-8204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13619360#comment-13619360
 ] 

Jeffrey Zhong commented on HBASE-8204:
--------------------------------------

[~nkeywal] Here is one more motivation for removing the {code} FSDataOutputStream out = 
fs.append(p);{code} call from the recoverFileLease function. Today I found that the test case 
TestDistributedLogSplitting#testWorkerAbort is flaky because we don't handle 
java.nio.channels.ClosedByInterruptException when "recoverLease" is 
interrupted. Even after the interrupt, the code still calls fs.append, which 
results in an extra one-minute wait and makes the test case fail with a timeout error. The 
relevant traces are below:
{code}
2013-04-01 14:45:04,735 DEBUG [SplitLogWorker-10.11.2.103,58161,1364852631051] 
util.FSHDFSUtils(95): Failed fs.recoverLease invocation, java.io.IOException: 
Call to localhost/127.0.0.1:58147 failed on local exception: 
java.nio.channels.ClosedByInterruptException, trying fs.append instead
2013-04-01 14:45:04,735 DEBUG [SplitLogWorker-10.11.2.103,58161,1364852631051] 
util.FSHDFSUtils(100): trying fs.append for 
hdfs://localhost:58147/user/jzhong/hbase/.logs/10.11.2.103,58161,1364852631051/10.11.2.103%2C58161%2C1364852631051.1364852632043
 with java.io.IOException: Call to localhost/127.0.0.1:58147 failed on local 
exception: java.nio.channels.ClosedByInterruptException
...
2013-04-01 14:46:04,737 WARN  [SplitLogWorker-10.11.2.103,58161,1364852631051] 
regionserver.SplitLogWorker$1(124): log splitting of 
hdfs://localhost:58147/user/jzhong/hbase/.logs/10.11.2.103,58161,1364852631051/10.11.2.103%2C58161%2C1364852631051.1364852632043
 failed, returning error
java.io.IOException: Failed to open 
hdfs://localhost:58147/user/jzhong/hbase/.logs/10.11.2.103,58161,1364852631051/10.11.2.103%2C58161%2C1364852631051.1364852632043
 for append
        at 
org.apache.hadoop.hbase.util.FSHDFSUtils.recoverFileLease(FSHDFSUtils.java:126)
        at 
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.getReader(HLogSplitter.java:743)
        at 
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.splitLogFile(HLogSplitter.java:436)
        at 
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.splitLogFile(HLogSplitter.java:397)
        at 
org.apache.hadoop.hbase.regionserver.SplitLogWorker$1.exec(SplitLogWorker.java:111)
        at 
org.apache.hadoop.hbase.regionserver.SplitLogWorker.grabTask(SplitLogWorker.java:274)
        at 
org.apache.hadoop.hbase.regionserver.SplitLogWorker.taskLoop(SplitLogWorker.java:195)
        at 
org.apache.hadoop.hbase.regionserver.SplitLogWorker.run(SplitLogWorker.java:162)
        at java.lang.Thread.run(Thread.java:680)
Caused by: java.io.IOException: Call to localhost/127.0.0.1:58147 failed on 
local exception: java.nio.channels.ClosedByInterruptException
        at org.apache.hadoop.ipc.Client.wrapException(Client.java:1144)
        at org.apache.hadoop.ipc.Client.call(Client.java:1112)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
….
{code}  
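
For illustration only, here is a rough sketch of how the interrupt could be detected so that the fs.append fallback is skipped. It is not the actual patch. The class LeaseRecoveryInterruptCheck, the IoAction interface, and the recoverOrAppend method are hypothetical names, and the two HDFS calls are replaced by plain callbacks so the example runs without Hadoop. It also assumes the wrapping IOException keeps the ClosedByInterruptException in its cause chain; the thread's interrupt flag is checked as well in case it does not.

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.nio.channels.ClosedByInterruptException;

public class LeaseRecoveryInterruptCheck {

  /** Hypothetical stand-in for an HDFS call such as recoverLease or append. */
  public interface IoAction {
    void run() throws IOException;
  }

  /** Returns true if t, or any exception in its cause chain, signals an interrupt. */
  public static boolean causedByInterrupt(Throwable t) {
    for (Throwable cur = t; cur != null; cur = cur.getCause()) {
      if (cur instanceof ClosedByInterruptException
          || cur instanceof InterruptedIOException
          || cur instanceof InterruptedException) {
        return true;
      }
    }
    return false;
  }

  /**
   * Tries recoverLease first. Falls back to appendFallback only when the
   * failure was not caused by an interrupt; otherwise rethrows immediately
   * so the caller does not sit through the append retry.
   */
  public static void recoverOrAppend(IoAction recoverLease, IoAction appendFallback)
      throws IOException {
    try {
      recoverLease.run();
    } catch (IOException e) {
      // isInterrupted() reads the flag without clearing it.
      if (causedByInterrupt(e) || Thread.currentThread().isInterrupted()) {
        InterruptedIOException iioe =
            new InterruptedIOException("lease recovery interrupted, skipping append");
        iioe.initCause(e);
        throw iioe;
      }
      appendFallback.run();
    }
  }
}
```

With a check like this, an interrupted worker would surface an InterruptedIOException right away instead of entering the append path that produced the one-minute gap in the log above.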
                
> Logging improvements and try to recover lease even when append is not 
> activated.
> --------------------------------------------------------------------------------
>
>                 Key: HBASE-8204
>                 URL: https://issues.apache.org/jira/browse/HBASE-8204
>             Project: HBase
>          Issue Type: Bug
>          Components: wal
>    Affects Versions: 0.96.0
>            Reporter: Nicolas Liochon
>            Assignee: Nicolas Liochon
>            Priority: Minor
>         Attachments: 8204.v2.patch, 8204.v2.patch, 8204.v3.patch, 
> HBASE-8204.v1.patch
>
>
> See patch :-).

