[ https://issues.apache.org/jira/browse/HDFS-10729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei-Chiu Chuang updated HDFS-10729:
-----------------------------------
    Description: 
We encountered a bug where the Standby NameNode crashes due to an NPE when loading edits.

{noformat}
2016-08-05 15:06:00,983 ERROR org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: Encountered exception on operation AddOp [length=0, inodeId=789272719, path=[path], replication=3, mtime=1470379597935, atime=1470379597935, blockSize=134217728, blocks=[], permissions=<user>:supergroup:rw-r--r--, aclEntries=null, clientName=DFSClient_NONMAPREDUCE_1495395702_1, clientMachine=10.210.119.136, overwrite=true, RpcClientId=a1512eeb-65e4-43dc-8aa8-d7a1af37ed30, RpcCallId=417, storagePolicyId=0, opCode=OP_ADD, txid=4212503758]
java.lang.NullPointerException
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.getFileEncryptionInfo(FSDirectory.java:2914)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.createFileStatus(FSDirectory.java:2469)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:375)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:230)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:139)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:829)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:810)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:232)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:331)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$200(EditLogTailer.java:284)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:301)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:360)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1651)
        at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:410)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:297)
{noformat}

The NameNode crashes and cannot be restarted. After some investigation, we turned on debug logging for org.apache.hadoop.hdfs.StateChange, restarted the NN, and saw the following exception, which induced the NPE:

{noformat}
16/08/07 18:51:15 DEBUG hdfs.StateChange: DIR* FSDirectory.unprotectedAddFile: exception when add [path] to the file system
org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException: The directory item limit of [path] is exceeded: limit=1048576 items=1049332
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.verifyMaxDirItems(FSDirectory.java:2060)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:2112)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addLastINode(FSDirectory.java:2081)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addINode(FSDirectory.java:1900)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedAddFile(FSDirectory.java:368)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:365)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:230)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:139)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:829)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:810)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:232)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$1.run(EditLogTailer.java:188)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$1.run(EditLogTailer.java:182)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
        at org.apache.hadoop.security.SecurityUtil.doAsUser(SecurityUtil.java:445)
        at org.apache.hadoop.security.SecurityUtil.doAsLoginUser(SecurityUtil.java:426)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.catchupDuringFailover(EditLogTailer.java:182)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startActiveServices(FSNamesystem.java:1205)
        at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.startActiveServices(NameNode.java:1762)
        at org.apache.hadoop.hdfs.server.namenode.ha.ActiveState.enterState(ActiveState.java:61)
        at org.apache.hadoop.hdfs.server.namenode.ha.HAState.setStateInternal(HAState.java:64)
        at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.setState(StandbyState.java:49)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.transitionToActive(NameNode.java:1635)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.transitionToActive(NameNodeRpcServer.java:1351)
        at org.apache.hadoop.ha.protocolPB.HAServiceProtocolServerSideTranslatorPB.transitionToActive(HAServiceProtocolServerSideTranslatorPB.java:107)
        at org.apache.hadoop.ha.proto.HAServiceProtocolProtos$HAServiceProtocolService$2.callBlockingMethod(HAServiceProtocolProtos.java:4460)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
{noformat}

The exception is thrown, caught, and logged at debug level in {{FSDirectory#unprotectedAddFile}}. Afterwards, null is returned to the caller, which dereferences it and triggers the NPE. HDFS-7567 reported a similar bug, but there it was deemed that the NPE could only be thrown if the edits were corrupt. Here, however, we see an example where the NPE can be thrown without corrupt edits.
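
To make the failure mode concrete, here is a minimal, self-contained sketch of the pattern (illustrative names only, not the actual HDFS source): the add path swallows the exception and signals failure with null, and the edit-replay path uses the result without a null check:

{code:java}
// A sketch of the catch-log-return-null pattern described above.
// All names and messages are illustrative, not the real HDFS code.
public class NullReturnNpeSketch {

    static class MaxDirectoryItemsExceededException extends Exception {
        MaxDirectoryItemsExceededException(String msg) { super(msg); }
    }

    // Stand-in for FSDirectory#unprotectedAddFile.
    static String unprotectedAddFile(String path) {
        try {
            // Stand-in for the addINode -> addChild -> verifyMaxDirItems
            // chain, which throws when the parent directory is full.
            throw new MaxDirectoryItemsExceededException(
                "The directory item limit of " + path
                + " is exceeded: limit=1048576 items=1049332");
        } catch (MaxDirectoryItemsExceededException e) {
            // Logged only at debug level, so the real cause is easy to miss.
            System.err.println("DEBUG: exception when add " + path
                + " to the file system: " + e);
            return null;  // failure reported as null, not rethrown
        }
    }

    public static void main(String[] args) {
        // Stand-in for FSEditLogLoader#applyEditLogOp, which dereferences
        // the result unconditionally.
        String added = unprotectedAddFile("/some/huge/dir/file");
        System.out.println(added.length());  // NullPointerException here
    }
}
{code}

Either rethrowing the original exception or null-checking in the caller would surface the real cause instead of a bare NPE.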

It looks like the maximum number of items per directory was exceeded. This is similar to HDFS-6102 and HDFS-7482, but here it happens while loading edits rather than while loading the fsimage.
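
For context, the guard that fires is a simple count-versus-limit check. A sketch in the style of {{FSDirectory#verifyMaxDirItems}} (again illustrative, not the real source; note the default limit 1048576 is 2^20):

{code:java}
// Illustrative guard matching the "limit=1048576 items=1049332" message
// in the log above; not the actual FSDirectory#verifyMaxDirItems code.
public class DirItemLimitSketch {

    static class MaxDirectoryItemsExceededException extends Exception {
        MaxDirectoryItemsExceededException(String msg) { super(msg); }
    }

    static void verifyMaxDirItems(int childCount, int maxDirItems)
            throws MaxDirectoryItemsExceededException {
        if (childCount >= maxDirItems) {
            throw new MaxDirectoryItemsExceededException(
                "The directory item limit is exceeded: limit=" + maxDirItems
                + " items=" + (childCount + 1));
        }
    }

    public static void main(String[] args) throws Exception {
        // The directory in the log already held far more than the limit,
        // so replaying the next OP_ADD trips the check.
        verifyMaxDirItems(1049331, 1048576);  // throws, as in the log
    }
}
{code}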

A possible workaround is to increase the value of {{dfs.namenode.fs-limits.max-directory-items}} to 6400000, but I am not sure whether doing so has any side effects.
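
For reference, a sketch of what that workaround would look like in hdfs-site.xml on the NameNodes. The value 6400000 is used here because it appears to be the upper bound the property accepts; the NameNodes would presumably need a restart for it to take effect:

{code:xml}
<!-- hdfs-site.xml: raise the per-directory item limit from the
     default of 1048576 (2^20). 6400000 is, to our understanding,
     the largest value this property allows. -->
<property>
  <name>dfs.namenode.fs-limits.max-directory-items</name>
  <value>6400000</value>
</property>
{code}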

> NameNode crashes when loading edits because max directory items is exceeded
> ---------------------------------------------------------------------------
>
>                 Key: HDFS-10729
>                 URL: https://issues.apache.org/jira/browse/HDFS-10729
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.7.2
>            Reporter: Wei-Chiu Chuang
>            Assignee: Wei-Chiu Chuang
>            Priority: Critical
>


