Jin Feng created HADOOP-9564:
--------------------------------
Summary: DFSClient$DFSOutputStream.closeInternal locks up waiting
for namenode.complete
Key: HADOOP-9564
URL: https://issues.apache.org/jira/browse/HADOOP-9564
Project: Hadoop Common
Issue Type: Bug
Components: fs
Reporter: Jin Feng
Priority: Minor
Hi,
Our component uses FileSystem.copyFromLocalFile to copy a local file to an HDFS
cluster. It works fine in our production environment. Its integration tests
used to run fine on our dev's local Mac laptop until recently, when (exact point
in time unknown) they started to freeze up very frequently with this stack:
{code}
"com.twitter.ads.billing.spendaggregator.jobs.maintenance.billinghourlyarchive.BillingHourlyAggArchiveJobIT-runner" prio=5 tid=0x00007f9ce52cb800 nid=0x4a503 waiting on condition [0x0000000165815000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x0000000152f41378> (a java.util.concurrent.FutureTask$Sync)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:994)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1303)
    at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:248)
    at java.util.concurrent.FutureTask.get(FutureTask.java:111)
    at org.apache.hadoop.ipc.Client$Connection.sendParam(Client.java:790)
    - locked <0x000000014f568720> (a java.lang.Object)
    at org.apache.hadoop.ipc.Client.call(Client.java:1080)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:226)
    at $Proxy37.complete(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
    at $Proxy37.complete(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.closeInternal(DFSClient.java:3566)
    - locked <0x0000000152f3f658> (a org.apache.hadoop.hdfs.DFSClient$DFSOutputStream)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.close(DFSClient.java:3481)
    at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:61)
    at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:86)
    at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:59)
    at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:89)
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:224)
    at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1295)
....
....
{code}
Our version is 0.20.2.cdh3u2-t1.
In the test suite, we use org.apache.hadoop.hdfs.MiniDFSCluster. I've searched
around but couldn't find anything that resembles this symptom. Any help is
really appreciated!
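For anyone trying to reproduce this, here is a minimal sketch of how a test could bound the blocking call so that it fails fast instead of freezing. It is not what our tests currently do. BoundedCall and runWithTimeout are hypothetical names, and the Hadoop call is replaced by a plain Callable so that the snippet needs only the JDK (Java 8 or later for the lambdas). The trace above shows the wait going through acquireSharedInterruptibly, so interrupting the worker may release it, but we have not verified that against this Hadoop version.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical test helper; not part of Hadoop. */
final class BoundedCall {
    private BoundedCall() {}

    /**
     * Runs the task on a separate daemon thread and waits at most the given time.
     * Throws TimeoutException if the task has not finished by then.
     */
    static <T> T runWithTimeout(Callable<T> task, long timeout, TimeUnit unit)
            throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "bounded-call");
            t.setDaemon(true); // a stuck task must not keep the JVM alive
            return t;
        });
        Future<T> future = executor.submit(task);
        try {
            return future.get(timeout, unit);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker thread
            throw e;
        } catch (ExecutionException e) {
            // Surface the task's own exception where possible.
            Throwable cause = e.getCause();
            if (cause instanceof Exception) {
                throw (Exception) cause;
            }
            throw e;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a call such as fs.copyFromLocalFile(src, dst).
        String result = runWithTimeout(() -> "copied", 1, TimeUnit.SECONDS);
        System.out.println(result);

        try {
            // Stand-in for a call that never returns.
            runWithTimeout(() -> {
                Thread.sleep(60_000);
                return "never";
            }, 200, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            System.out.println("timed out");
        }
    }
}
```

In a test, the Callable would wrap the copyFromLocalFile call, and a TimeoutException would be the point at which to capture a thread dump of both the client and the MiniDFSCluster threads.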