[
https://issues.apache.org/jira/browse/HDFS-2229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13079910#comment-13079910
]
Vinod Kumar Vavilapalli commented on HDFS-2229:
-----------------------------------------------
Thread dump:
{quote}
Found one Java-level deadlock:
=============================
"Thread[Thread-10,5,main]":
  waiting to lock monitor 0x08e19b8c (object 0x31b0b7f0, a org.apache.hadoop.hdfs.server.namenode.FSNamesystem$SafeModeInfo),
  which is held by "main"
"main":
  waiting for ownable synchronizer 0x31641a50, (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync),
  which is held by "Thread[Thread-10,5,main]"
Java stack information for the threads listed above:
===================================================
"Thread[Thread-10,5,main]":
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem$SafeModeInfo.isOn(FSNamesystem.java:3183)
    - waiting to lock <0x31b0b7f0> (a org.apache.hadoop.hdfs.server.namenode.FSNamesystem$SafeModeInfo)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.isInSafeMode(FSNamesystem.java:3563)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.logUpdateMasterKey(FSNamesystem.java:4523)
    at org.apache.hadoop.hdfs.security.token.delegation.DelegationTokenSecretManager.logUpdateMasterKey(DelegationTokenSecretManager.java:279)
    at org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.updateCurrentKey(AbstractDelegationTokenSecretManager.java:144)
    at org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.rollMasterKey(AbstractDelegationTokenSecretManager.java:168)
    at org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager$ExpiredTokenRemover.run(AbstractDelegationTokenSecretManager.java:373)
    at java.lang.Thread.run(Thread.java:619)
"main":
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x31641a50> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:811)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:842)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1178)
    at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:807)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeLock(FSNamesystem.java:382)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processMisReplicatedBlocks(BlockManager.java:1743)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem$SafeModeInfo.initializeReplQueues(FSNamesystem.java:3257)
    - locked <0x31b0b7f0> (a org.apache.hadoop.hdfs.server.namenode.FSNamesystem$SafeModeInfo)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem$SafeModeInfo.leave(FSNamesystem.java:3228)
    - locked <0x31b0b7f0> (a org.apache.hadoop.hdfs.server.namenode.FSNamesystem$SafeModeInfo)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem$SafeModeInfo.checkMode(FSNamesystem.java:3315)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem$SafeModeInfo.setBlockTotal(FSNamesystem.java:3342)
    - locked <0x31b0b7f0> (a org.apache.hadoop.hdfs.server.namenode.FSNamesystem$SafeModeInfo)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setBlockTotal(FSNamesystem.java:3619)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.activate(FSNamesystem.java:322)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.activate(NameNode.java:489)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:452)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:561)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:553)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1538)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1578)
Found 1 deadlock.
{quote}
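Reading the dump: "main" holds the SafeModeInfo monitor (locked in setBlockTotal, leave and initializeReplQueues) and is parked waiting for the FSNamesystem write lock, while Thread-10 owns that ReentrantReadWriteLock and is blocked in SafeModeInfo.isOn() waiting for the same monitor. This looks like a plain lock-order inversion. Below is a minimal sketch of the pattern, not the HDFS code itself: the class, thread and variable names are hypothetical stand-ins, and it assumes a JVM that supports ownable-synchronizer monitoring (as the dump above evidently does). It uses ThreadMXBean.findDeadlockedThreads() to confirm the cycle, which is the same kind of check that produces the "Found one Java-level deadlock" report.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for the two code paths in the dump; not HDFS code.
final class LockOrderInversionSketch {

    /** Returns how many threads the JVM reports as deadlocked (0 if none within ~5 s). */
    static int reproduceAndCount() {
        // Stand-in for the SafeModeInfo monitor.
        final Object safeModeMonitor = new Object();
        // Stand-in for the fair FSNamesystem read/write lock.
        final ReentrantReadWriteLock fsLock = new ReentrantReadWriteLock(true);
        // Makes sure each thread holds its first lock before asking for the second.
        final CountDownLatch firstLocksHeld = new CountDownLatch(2);

        // Like "main" in the dump: monitor first, then the write lock.
        Thread startup = new Thread(() -> {
            synchronized (safeModeMonitor) {
                awaitBoth(firstLocksHeld);
                fsLock.writeLock().lock();   // parks forever
                fsLock.writeLock().unlock();
            }
        }, "startup-like");

        // Like Thread-10 in the dump: write lock first, then the monitor.
        Thread remover = new Thread(() -> {
            fsLock.writeLock().lock();
            try {
                awaitBoth(firstLocksHeld);
                synchronized (safeModeMonitor) {  // blocks forever
                    // never reached
                }
            } finally {
                fsLock.writeLock().unlock();
            }
        }, "remover-like");

        // Daemon threads so the stuck pair cannot keep the JVM alive.
        startup.setDaemon(true);
        remover.setDaemon(true);
        startup.start();
        remover.start();

        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        for (int i = 0; i < 100; i++) {
            // Covers both object monitors and ownable synchronizers.
            long[] ids = mx.findDeadlockedThreads();
            if (ids != null) {
                return ids.length;
            }
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return 0;
            }
        }
        return 0;
    }

    // Signals that this thread holds its first lock, then waits for the other one.
    private static void awaitBoth(CountDownLatch latch) {
        latch.countDown();
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("deadlocked threads: " + reproduceAndCount());
    }
}
```

In the sketch, the cycle disappears if both paths take the two locks in the same order, or if the monitor is released before the write lock is requested. Whether either option suits FSNamesystem is a separate question that the dump alone does not answer.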
The logs show that the RPC server has started, but not the HTTP server:
{quote}
...
2011-08-05 08:54:35,286 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: Number of files = 1
2011-08-05 08:54:35,286 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: Number of files under construction = 0
2011-08-05 08:54:35,287 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: Image file of size 113 loaded in 0 seconds.
2011-08-05 08:54:35,287 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: Loaded image for txid 0 from /tmp/hdfs/hadoop/var/hdfs/name/current/fsimage_0000000000000000000
2011-08-05 08:54:35,288 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: Edits file /tmp/hdfs/hadoop/var/hdfs/name/current/edits_0000000000000000001-0000000000000000002 of size 1048576 edits # 2 loaded in 0 seconds.
2011-08-05 08:54:35,291 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Starting log segment at 3
2011-08-05 08:54:35,363 INFO org.apache.hadoop.hdfs.server.namenode.NameCache: initialized with 0 entries 0 lookups
2011-08-05 08:54:35,363 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Finished loading FSImage in 387 msecs
2011-08-05 08:54:35,428 INFO org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 8020
2011-08-05 08:54:35,430 INFO org.apache.hadoop.ipc.Server: Starting Socket Reader #2 for port 8020
2011-08-05 08:54:35,432 INFO org.apache.hadoop.ipc.Server: Starting Socket Reader #3 for port 8020
2011-08-05 08:54:35,433 INFO org.apache.hadoop.ipc.Server: Starting Socket Reader #4 for port 8020
2011-08-05 08:54:35,443 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Registered source RpcActivityForPort8020
2011-08-05 08:54:35,451 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Registered source RpcDetailedActivityForPort8020
2011-08-05 08:54:35,455 INFO org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager: Updating the current master key for generating delegation tokens
2011-08-05 08:54:35,457 INFO org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager: Starting expired delegation token remover thread, tokenRemoverScanInterval=60 min(s)
2011-08-05 08:54:35,457 INFO org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager: Updating the current master key for generating delegation tokens
2011-08-05 08:54:35,457 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of blocks under construction: 0
2011-08-05 08:54:35,457 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: initializing replication queues
^Stuck here
{quote}
> Deadlock in NameNode
> --------------------
>
> Key: HDFS-2229
> URL: https://issues.apache.org/jira/browse/HDFS-2229
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs client
> Affects Versions: 0.23.0
> Reporter: Vinod Kumar Vavilapalli
>
> Either I am doing something incredibly stupid, or something about my
> environment is completely weird, or maybe it really is a valid bug. I am
> consistently running into a NameNode deadlock with 0.23 HDFS. I have never
> been able to start the NN successfully.