[jira] [Commented] (HADOOP-9564) DFSClient$DFSOutputStream.closeInternal locks up waiting for namenode.complete

2013-05-15 Thread Suresh Srinivas (JIRA)

[ https://issues.apache.org/jira/browse/HADOOP-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13658114#comment-13658114 ]

Suresh Srinivas commented on HADOOP-9564:
-

Can you please check whether you can reproduce this issue on an Apache
release? It will most likely happen on an Apache release as well. If not,
would it be a good idea to move this to the CDH-related JIRAs?

 DFSClient$DFSOutputStream.closeInternal locks up waiting for namenode.complete
 --

 Key: HADOOP-9564
 URL: https://issues.apache.org/jira/browse/HADOOP-9564
 Project: Hadoop Common
 Issue Type: Bug
 Components: fs
 Reporter: Jin Feng
 Priority: Minor

 Hi,
 Our component uses FileSystem.copyFromLocalFile to copy a local file to an
 HDFS cluster. It works fine in our production environment. Its integration
 tests used to run fine on our dev's local Mac laptop, but recently (exact
 point in time unknown) they started to freeze very frequently with this stack:
 {code}
java.lang.Thread.State: WAITING (parking)
   at sun.misc.Unsafe.park(Native Method)
   - parking to wait for 0x000152f41378 (a java.util.concurrent.FutureTask$Sync)
   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
   at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
   at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:994)
   at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1303)
   at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:248)
   at java.util.concurrent.FutureTask.get(FutureTask.java:111)
   at org.apache.hadoop.ipc.Client$Connection.sendParam(Client.java:790)
   - locked 0x00014f568720 (a java.lang.Object)
   at org.apache.hadoop.ipc.Client.call(Client.java:1080)
   at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:226)
   at $Proxy37.complete(Unknown Source)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:601)
   at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
   at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
   at $Proxy37.complete(Unknown Source)
   at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.closeInternal(DFSClient.java:3566)
   - locked 0x000152f3f658 (a org.apache.hadoop.hdfs.DFSClient$DFSOutputStream)
   at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.close(DFSClient.java:3481)
   at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:61)
   at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:86)
   at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:59)
   at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:89)
   at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:224)
   at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1295)
 {code}
 Our version is 0.20.2.cdh3u2-t1.
 In the test suite, we use org.apache.hadoop.hdfs.MiniDFSCluster. I've
 searched around but couldn't find anything that resembles this symptom. Any
 help is really appreciated!
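
 Until the cause is found, one way to keep the test suite from hanging
 indefinitely is to run the blocking copy on a worker thread and bound the
 wait. The following is only a minimal sketch using the JDK's
 java.util.concurrent classes. The class name CopyWithTimeout, the method
 runWithTimeout and the sleeping stand-in task are hypothetical; in a real
 test the Callable would wrap the FileSystem.copyFromLocalFile call. The trace
 above shows the thread parked in an interruptible acquire, so cancelling with
 an interrupt may release it, but that has not been verified against this
 build.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class CopyWithTimeout {

    /**
     * Runs the task on a daemon worker thread and waits at most timeoutMillis.
     * Returns true if the task finished in time, false if it timed out.
     */
    static boolean runWithTimeout(Callable<Void> task, long timeoutMillis)
            throws InterruptedException, ExecutionException {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "copy-worker");
            t.setDaemon(true); // a stuck worker must not keep the test JVM alive
            return t;
        });
        Future<Void> future = executor.submit(task);
        try {
            future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker so it can give up
            return false;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical stand-in for the blocking copyFromLocalFile call.
        Callable<Void> stuckCopy = () -> {
            Thread.sleep(60_000);
            return null;
        };
        boolean finished = runWithTimeout(stuckCopy, 200);
        System.out.println(finished ? "copy finished" : "copy timed out");
    }
}
```

 A timed-out run turns the freeze into a test failure, which also gives a
 natural point at which to capture a thread dump.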

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-9564) DFSClient$DFSOutputStream.closeInternal locks up waiting for namenode.complete

2013-05-15 Thread Lohit Vijayarenu (JIRA)

[ https://issues.apache.org/jira/browse/HADOOP-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13658121#comment-13658121 ]

Lohit Vijayarenu commented on HADOOP-9564:
--

I will check whether this is specific to the environment and update this
JIRA.



[jira] [Commented] (HADOOP-9564) DFSClient$DFSOutputStream.closeInternal locks up waiting for namenode.complete

2013-05-14 Thread Jin Feng (JIRA)

[ https://issues.apache.org/jira/browse/HADOOP-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13657707#comment-13657707 ]

Jin Feng commented on HADOOP-9564:
--

This is captured from our test output. It seems the DataBlockScanner is
really slow to come up with verification results, even though the data files
generated in the tests are minimal.

{noformat}
13/05/04 06:50:55 INFO hdfs.StateChange: BLOCK* NameSystem.allocateBlock: our_test_file_name.lzo-f773a37f-3dac-4337-a6cb-004fb94c1d31. blk_-8485988660073681466_1002
13/05/04 06:50:55 INFO datanode.DataNode: Receiving block blk_-8485988660073681466_1002 src: /127.0.0.1:42563 dest: /127.0.0.1:35830
13/05/04 06:50:55 INFO DataNode.clienttrace: src: /127.0.0.1:42563, dest: /127.0.0.1:35830, bytes: 303, op: HDFS_WRITE, cliID: DFSClient_-854844208, offset: 0, srvID: DS-1070312150-10.35.8.106-35830-1367650255272, blockid: blk_-8485988660073681466_1002, duration: 778000
13/05/04 06:50:55 INFO datanode.DataNode: PacketResponder 0 for block blk_-8485988660073681466_1002 terminating
13/05/04 06:50:55 INFO hdfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:35830 is added to blk_-8485988660073681466_1002 size 303
13/05/04 06:52:48 INFO datanode.DataBlockScanner: Verification succeeded for blk_-8485988660073681466_1002
13/05/04 07:00:40 INFO datanode.DataBlockScanner: Verification succeeded for blk_586310994067086116_1001
{noformat}


Could this be related to this bug: HADOOP-4584?
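
To put numbers on "really slow", the gaps between the quoted log lines can be
computed from their timestamps. This is a minimal sketch, assuming every line
starts with a 17-character "yy/MM/dd HH:mm:ss" stamp as in the excerpt above.
The class name LogGaps and its helper methods are hypothetical. The gaps only
show when the scanner logged its result; on their own they do not show that
the scanner is involved in the close() path.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

class LogGaps {

    // Assumed timestamp layout at the start of each log line.
    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("yy/MM/dd HH:mm:ss");

    /** Parses the leading timestamp of a log line. */
    static LocalDateTime stampOf(String logLine) {
        return LocalDateTime.parse(logLine.substring(0, 17), STAMP);
    }

    /** Whole seconds elapsed between two log lines. */
    static long secondsBetween(String earlier, String later) {
        return Duration.between(stampOf(earlier), stampOf(later)).getSeconds();
    }

    public static void main(String[] args) {
        String written = "13/05/04 06:50:55 INFO datanode.DataNode: PacketResponder 0 for block "
                + "blk_-8485988660073681466_1002 terminating";
        String verified1 = "13/05/04 06:52:48 INFO datanode.DataBlockScanner: Verification "
                + "succeeded for blk_-8485988660073681466_1002";
        String verified2 = "13/05/04 07:00:40 INFO datanode.DataBlockScanner: Verification "
                + "succeeded for blk_586310994067086116_1001";

        System.out.println(secondsBetween(written, verified1));   // prints 113
        System.out.println(secondsBetween(verified1, verified2)); // prints 472
    }
}
```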


[jira] [Commented] (HADOOP-9564) DFSClient$DFSOutputStream.closeInternal locks up waiting for namenode.complete

2013-05-14 Thread Todd Lipcon (JIRA)

[ https://issues.apache.org/jira/browse/HADOOP-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13657832#comment-13657832 ]

Todd Lipcon commented on HADOOP-9564:
-

What's the thread doing on the NN side? Keep in mind you're on a very old
version (and a Twitter-local build that no one else has), so you may not have a
lot of luck getting people to help you diagnose further.
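
Since MiniDFSCluster runs the NameNode inside the test JVM, the NN-side
threads can be inspected from the same process that hangs, either with jstack
or programmatically. The following is a minimal sketch using the JDK's
ThreadMXBean. The class name HandlerThreadDump is hypothetical, and the
"IPC Server handler" name fragment is an assumption about how the RPC handler
threads are named in this build, so adjust it to whatever a full dump shows.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

class HandlerThreadDump {

    /** Returns one stack dump per live thread whose name contains the fragment. */
    static List<String> dumpThreads(String nameFragment) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> dumps = new ArrayList<>();
        // false, false: skip lock details, which not every JVM supports.
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (!info.getThreadName().contains(nameFragment)) {
                continue;
            }
            StringBuilder sb = new StringBuilder();
            sb.append('"').append(info.getThreadName()).append("\" ")
              .append(info.getThreadState()).append('\n');
            // Print every frame; ThreadInfo.toString() truncates the stack.
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("    at ").append(frame).append('\n');
            }
            dumps.add(sb.toString());
        }
        return dumps;
    }

    public static void main(String[] args) {
        // Assumed name fragment for the NameNode's RPC handler threads.
        for (String dump : dumpThreads("IPC Server handler")) {
            System.out.println(dump);
        }
    }
}
```

Calling dumpThreads from the test at the moment the copy is detected as stuck
would show whether a handler is blocked while serving complete().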
