[jira] [Commented] (HADOOP-9564) DFSClient$DFSOutputStream.closeInternal locks up waiting for namenode.complete
    [ https://issues.apache.org/jira/browse/HADOOP-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13658114#comment-13658114 ]

Suresh Srinivas commented on HADOOP-9564:
-----------------------------------------

Can you please see if you can duplicate this issue on an Apache release? It will most likely happen on an Apache release as well. If not, would it be a good idea to move this to the CDH-related JIRAs?

DFSClient$DFSOutputStream.closeInternal locks up waiting for namenode.complete
------------------------------------------------------------------------------

                Key: HADOOP-9564
                URL: https://issues.apache.org/jira/browse/HADOOP-9564
            Project: Hadoop Common
         Issue Type: Bug
         Components: fs
           Reporter: Jin Feng
           Priority: Minor

Hi,

Our component uses FileSystem.copyFromLocalFile to copy a local file to an HDFS cluster. It works fine in our production environment. Its integration tests used to run fine on our dev's local Mac laptop until recently (exact point in time unknown), when the tests started to freeze up very frequently with this stack:

{code}
java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for 0x000152f41378 (a java.util.concurrent.FutureTask$Sync)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:994)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1303)
    at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:248)
    at java.util.concurrent.FutureTask.get(FutureTask.java:111)
    at org.apache.hadoop.ipc.Client$Connection.sendParam(Client.java:790)
    - locked 0x00014f568720 (a java.lang.Object)
    at org.apache.hadoop.ipc.Client.call(Client.java:1080)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:226)
    at $Proxy37.complete(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
    at $Proxy37.complete(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.closeInternal(DFSClient.java:3566)
    - locked 0x000152f3f658 (a org.apache.hadoop.hdfs.DFSClient$DFSOutputStream)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.close(DFSClient.java:3481)
    at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:61)
    at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:86)
    at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:59)
    at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:89)
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:224)
    at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1295)
{code}

Our version is 0.20.2.cdh3u2-t1. In the test suite, we use org.apache.hadoop.hdfs.MiniDFSCluster.

I've searched around but couldn't find anything that resembles this symptom; any help is really appreciated!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
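Because the reported stack parks indefinitely inside close(), a test that calls copyFromLocalFile directly will hang rather than fail. The following is a minimal sketch, not anything taken from the thread, of bounding such a call with a timeout so that the test fails fast and can collect diagnostics. The class and method names (BoundedCall, callWithTimeout) are hypothetical, and it assumes Java 8 or later; the copy call itself appears only in a comment because the test's FileSystem and Path objects are not shown in the report.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper: runs a possibly blocking call with an upper time bound.
final class BoundedCall {

    /**
     * Runs the task on a helper thread and waits at most timeoutMillis for it.
     * Throws TimeoutException if the task has not finished by then.
     */
    static <T> T callWithTimeout(Callable<T> task, long timeoutMillis) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "bounded-call");
            t.setDaemon(true); // a stuck task must not keep the JVM alive
            return t;
        });
        Future<T> future = executor.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Requests interruption of the worker; whether the blocked call
            // reacts to the interrupt depends on that call.
            future.cancel(true);
            throw e;
        } catch (ExecutionException e) {
            // Surface the task's own exception instead of the wrapper.
            Throwable cause = e.getCause();
            if (cause instanceof Exception) {
                throw (Exception) cause;
            }
            throw e;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // In a real test the task would wrap the blocking call, for example
        //   () -> { fs.copyFromLocalFile(src, dst); return null; }
        // where fs, src and dst are the test's own objects (not shown here).
        String result = callWithTimeout(() -> "done", 1000L);
        System.out.println(result);
        try {
            callWithTimeout(() -> { Thread.sleep(60_000); return null; }, 100L);
        } catch (TimeoutException expected) {
            System.out.println("timed out");
        }
    }
}
```

A timeout of this kind does not fix the underlying hang; it only turns a frozen test run into a failure that can be investigated.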
[jira] [Commented] (HADOOP-9564) DFSClient$DFSOutputStream.closeInternal locks up waiting for namenode.complete
    [ https://issues.apache.org/jira/browse/HADOOP-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13658121#comment-13658121 ]

Lohit Vijayarenu commented on HADOOP-9564:
------------------------------------------

Will try to see if this is something specific to environment and update this JIRA.
[jira] [Commented] (HADOOP-9564) DFSClient$DFSOutputStream.closeInternal locks up waiting for namenode.complete
    [ https://issues.apache.org/jira/browse/HADOOP-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13657707#comment-13657707 ]

Jin Feng commented on HADOOP-9564:
----------------------------------

This is captured from our test output. It seems the DataBlockScanner is really slow to come up with verification results, even though the data/files generated in our tests are minimal.

{noformat}
13/05/04 06:50:55 INFO hdfs.StateChange: BLOCK* NameSystem.allocateBlock: our_test_file_name.lzo-f773a37f-3dac-4337-a6cb-004fb94c1d31. blk_-8485988660073681466_1002
13/05/04 06:50:55 INFO datanode.DataNode: Receiving block blk_-8485988660073681466_1002 src: /127.0.0.1:42563 dest: /127.0.0.1:35830
13/05/04 06:50:55 INFO DataNode.clienttrace: src: /127.0.0.1:42563, dest: /127.0.0.1:35830, bytes: 303, op: HDFS_WRITE, cliID: DFSClient_-854844208, offset: 0, srvID: DS-1070312150-10.35.8.106-35830-1367650255272, blockid: blk_-8485988660073681466_1002, duration: 778000
13/05/04 06:50:55 INFO datanode.DataNode: PacketResponder 0 for block blk_-8485988660073681466_1002 terminating
13/05/04 06:50:55 INFO hdfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:35830 is added to blk_-8485988660073681466_1002 size 303
13/05/04 06:52:48 INFO datanode.DataBlockScanner: Verification succeeded for blk_-8485988660073681466_1002
13/05/04 07:00:40 INFO datanode.DataBlockScanner: Verification succeeded for blk_586310994067086116_1001
{noformat}

Could this be related to HADOOP-4584?
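To put a number on "really slow", the gap between two of the log lines above can be computed from their timestamps. This is a small sketch assuming the `yy/MM/dd HH:mm:ss` prefix seen in the excerpt and Java 8 or later; the class and method names (LogGap, secondsBetween) are made up for illustration. It only measures the distance between two log lines and does not show whether the scanner is involved in the hang.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper for measuring the time between two log lines.
final class LogGap {

    // Timestamp layout at the start of each log line, e.g. "13/05/04 06:50:55".
    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("yy/MM/dd HH:mm:ss");

    // Length of the timestamp prefix: 8 date chars, a space, 8 time chars.
    private static final int STAMP_LENGTH = 17;

    /** Returns the number of seconds from the first line's timestamp to the second's. */
    static long secondsBetween(String earlierLine, String laterLine) {
        LocalDateTime start = LocalDateTime.parse(earlierLine.substring(0, STAMP_LENGTH), STAMP);
        LocalDateTime end = LocalDateTime.parse(laterLine.substring(0, STAMP_LENGTH), STAMP);
        return Duration.between(start, end).getSeconds();
    }

    public static void main(String[] args) {
        String stored = "13/05/04 06:50:55 INFO hdfs.StateChange: BLOCK* NameSystem.addStoredBlock: "
                + "blockMap updated: 127.0.0.1:35830 is added to blk_-8485988660073681466_1002 size 303";
        String verified = "13/05/04 06:52:48 INFO datanode.DataBlockScanner: "
                + "Verification succeeded for blk_-8485988660073681466_1002";
        // 06:50:55 to 06:52:48 is 1 minute 53 seconds.
        System.out.println(secondsBetween(stored, verified)); // prints 113
    }
}
```

For block blk_-8485988660073681466_1002, verification was logged 113 seconds after the block was added to the blockMap.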
[jira] [Commented] (HADOOP-9564) DFSClient$DFSOutputStream.closeInternal locks up waiting for namenode.complete
    [ https://issues.apache.org/jira/browse/HADOOP-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13657832#comment-13657832 ]

Todd Lipcon commented on HADOOP-9564:
-------------------------------------

What's the thread doing on the NN side? Keep in mind you're on a very old version (and a twitter-local build that no one else has), so you may not have a lot of luck getting people to help you diagnose further.
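One way to look at the NN side in this setup: the tests use MiniDFSCluster, which, as far as I know, runs the NameNode inside the test JVM, so its threads can be dumped from the same process with the standard java.lang.management API. The sketch below rests on that assumption. The class name ThreadDump is hypothetical, and the filter string "IPC Server handler" mentioned in the comment is an assumed thread-name prefix for the NameNode's RPC handlers that should be checked against a real dump.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper: prints the stacks of live threads in the current JVM.
final class ThreadDump {

    /** Returns state and full stack for every live thread whose name contains nameFilter. */
    static String dumpThreads(String nameFilter) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();
        // false, false: skip lock details, which keeps the call supported on every JVM.
        for (ThreadInfo info : bean.dumpAllThreads(false, false)) {
            if (info == null || !info.getThreadName().contains(nameFilter)) {
                continue;
            }
            out.append('"').append(info.getThreadName()).append("\" ")
               .append(info.getThreadState()).append('\n');
            // Append frames manually; ThreadInfo.toString() truncates long stacks.
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // In the hanging test, a filter such as "IPC Server handler" (assumed name
        // prefix) would narrow the dump to the RPC handler threads.
        System.out.print(dumpThreads("main"));
    }
}
```

Calling this from a watchdog when the copy has been stuck for a while, or running `jstack` against the test process, should show whether a NameNode handler thread is working on the complete call or whether the request never arrived.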