[ https://issues.apache.org/jira/browse/HADOOP-7964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13181718#comment-13181718 ]

Kihwal Lee commented on HADOOP-7964:
------------------------------------

The following is a partial stack trace of the hung dfs put.

{noformat}
Full thread dump Java HotSpot(TM) Server VM (17.0-b16 mixed mode):

"TGT Renewer for xxx@XXXXXX" daemon prio=10 tid=0x08263c00 nid=0x71f6 in Object.wait() [0xe6b9a000]
   java.lang.Thread.State: RUNNABLE
        at org.apache.hadoop.security.SecurityUtil.setTokenServiceUseIp(SecurityUtil.java:71)
        at org.apache.hadoop.security.SecurityUtil.<clinit>(SecurityUtil.java:62)
        at org.apache.hadoop.security.UserGroupInformation.getTGT(UserGroupInformation.java:528)
        - locked <0xf283f1e0> (a org.apache.hadoop.security.UserGroupInformation)
        at org.apache.hadoop.security.UserGroupInformation.access$800(UserGroupInformation.java:77)
        at org.apache.hadoop.security.UserGroupInformation$1.run(UserGroupInformation.java:555)
        at java.lang.Thread.run(Thread.java:619)

"main" prio=10 tid=0x08066c00 nid=0x71e4 in Object.wait() [0xf7440000]
   java.lang.Thread.State: RUNNABLE
        at org.apache.hadoop.net.NetUtils.<clinit>(NetUtils.java:80)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:174)
        at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:109)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2032)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:78)
        at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2066)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2048)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:284)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:151)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:268)
        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:190)
        at org.apache.hadoop.fs.shell.PathData.expandAsGlob(PathData.java:262)
        at org.apache.hadoop.fs.shell.CommandWithDestination.getRemoteDestination(CommandWithDestination.java:80)
        at org.apache.hadoop.fs.shell.CopyCommands$Put.processOptions(CopyCommands.java:164)
        at org.apache.hadoop.fs.shell.Command.run(Command.java:153)
        at org.apache.hadoop.fs.FsShell.run(FsShell.java:254)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:69)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:83)
        at org.apache.hadoop.fs.FsShell.main(FsShell.java:296)
{noformat} 

We haven't seen this in 0.20.205, even though it has the same code. It may not 
occur there, or may occur only rarely, because of timing differences.
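
For anyone who wants to see the general failure mode in isolation, here is a 
minimal sketch. It is not the Hadoop code: the classes A and B, the latch, and 
the demo class name are all made up for illustration. A and B each touch the 
other from a static initializer, and the latch only forces the unlucky timing 
so that the hang happens every time instead of occasionally.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical demo; A and B stand in for any two classes whose
// static initializers depend on each other.
class ClassInitDeadlockDemo {
    // Opens once both static initializers have started running.
    static final CountDownLatch bothStarted = new CountDownLatch(2);

    static void awaitOtherInitializer() {
        bothStarted.countDown();
        try {
            bothStarted.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    static class A {
        static int value;
        static {
            awaitOtherInitializer();
            // B is being initialized by the other thread, so this access blocks.
            value = B.value + 1;
        }
    }

    static class B {
        static int value;
        static {
            awaitOtherInitializer();
            // A is being initialized by the other thread, so this access blocks.
            value = A.value + 1;
        }
    }

    // Returns true if both threads are still stuck after waiting waitMillis on each.
    public static boolean deadlocks(long waitMillis) throws InterruptedException {
        Thread t1 = new Thread(() -> System.out.println(A.value), "init-A");
        Thread t2 = new Thread(() -> System.out.println(B.value), "init-B");
        // Daemon threads, so the JVM can still exit while they are stuck.
        t1.setDaemon(true);
        t2.setDaemon(true);
        t1.start();
        t2.start();
        t1.join(waitMillis);
        t2.join(waitMillis);
        return t1.isAlive() && t2.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("deadlocked: " + deadlocks(2000));
    }
}
```

Each thread holds the in-progress initialization of one class while waiting 
for the other class to finish initializing, so neither can make progress. 
Without the latch the same cycle would only hang when the two threads happen 
to enter the initializers at about the same time.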
                
> Deadlock in class init.
> -----------------------
>
>                 Key: HADOOP-7964
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7964
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: security, util
>    Affects Versions: 0.20.205.0, 0.24.0, 0.23.1, 1.0.0, 1.1.0
>            Reporter: Kihwal Lee
>            Priority: Blocker
>
> After HADOOP-7808, client-side commands hang occasionally. There are cyclic 
> dependencies in NetUtils and SecurityUtil class initialization. From an initial 
> look at the stack trace, two threads deadlock when each enters one of the two 
> class initializers at the same time.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
