[
https://issues.apache.org/jira/browse/HDFS-11197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15729387#comment-15729387
]
Wei-Chiu Chuang edited comment on HDFS-11197 at 12/7/16 6:01 PM:
-----------------------------------------------------------------
That fails because of this error:
{noformat}
2016-12-07 00:51:50,706 [IPC Server handler 6 on 59757] INFO ipc.Server (Server.java:logException(2697)) - IPC Server handler 6 on 59757, call Call#819 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.create from 127.0.0.1:42394
org.apache.hadoop.hdfs.server.namenode.RetryStartFileException: Preconditions for creating a file failed because of a transient error, retry create later.
	at org.apache.hadoop.hdfs.server.namenode.FSDirEncryptionZoneOp.getFileEncryptionInfo(FSDirEncryptionZoneOp.java:330)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2249)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2175)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:742)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:420)
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:522)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:867)
	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:813)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1857)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2653)
{noformat}
I have seen this exact test error in the past, so it is most likely not caused by
your patch. [~xiaochen] previously filed HDFS-11093 for it.
> Listing encryption zones fails when deleting a EZ that is on a snapshotted
> directory
> ------------------------------------------------------------------------------------
>
> Key: HDFS-11197
> URL: https://issues.apache.org/jira/browse/HDFS-11197
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs
> Affects Versions: 2.6.0
> Reporter: Wellington Chevreuil
> Assignee: Wellington Chevreuil
> Priority: Minor
> Attachments: HDFS-11197-1.patch, HDFS-11197-2.patch,
> HDFS-11197-3.patch, HDFS-11197-4.patch, HDFS-11197-5.patch,
> HDFS-11197-6.patch, HDFS-11197-7.patch
>
>
> If an EZ directory is under a snapshottable directory, and a snapshot has been
> taken, then permanently deleting this EZ causes the *hdfs crypto
> listZones* command to fail without showing any of the still available zones.
> This happens only after the EZ is removed from the Trash folder. For example,
> consider that the */test-snap* folder is snapshottable and a snapshot
> of it already exists:
> {noformat}
> $ hdfs crypto -listZones
> /user/systest my-key
> /test-snap/EZ-1 my-key
> $ hdfs dfs -rmr /test-snap/EZ-1
> INFO fs.TrashPolicyDefault: Moved: 'hdfs://ns1/test-snap/EZ-1' to trash at: hdfs://ns1/user/hdfs/.Trash/Current/test-snap/EZ-1
> $ hdfs crypto -listZones
> /user/systest my-key
> /user/hdfs/.Trash/Current/test-snap/EZ-1 my-key
> $ hdfs dfs -rmr /user/hdfs/.Trash/Current/test-snap/EZ-1
> Deleted /user/hdfs/.Trash/Current/test-snap/EZ-1
> $ hdfs crypto -listZones
> RemoteException: Absolute path required
> {noformat}
> Once this error happens, *hdfs crypto -listZones* works again only if we
> remove the snapshot:
> {noformat}
> $ hdfs dfs -deleteSnapshot /test-snap snap1
> $ hdfs crypto -listZones
> /user/systest my-key
> {noformat}
> If we instead delete the EZ using the *skipTrash* option, *hdfs crypto
> -listZones* does not break:
> {noformat}
> $ hdfs crypto -listZones
> /user/systest my-key
> /test-snap/EZ-2 my-key
> $ hdfs dfs -rmr -skipTrash /test-snap/EZ-2
> Deleted /test-snap/EZ-2
> $ hdfs crypto -listZones
> /user/systest my-key
> {noformat}
> The different behaviour seems to occur because, when the EZ is removed from the
> Trash folder, its related INode is left with no parent INode. This causes
> *EncryptionZoneManager.listEncryptionZones* to throw the error seen above when
> trying to resolve the inodes in the given path.
> I am proposing a patch that fixes this issue by simply performing an additional
> check in *EncryptionZoneManager.listEncryptionZones* for the case where an inode
> has no parent, so that it is skipped in the listing without trying to
> resolve it. Feedback on the proposal is appreciated.
>
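The null-parent check proposed above can be sketched with a minimal, hypothetical model. The `INode` and `listZones` shapes below are illustrative stand-ins, not the actual Hadoop classes: the point is only that a zone inode whose parent link is gone (deleted from Trash but still pinned by a snapshot) is skipped instead of being resolved to an absolute path, which is where the "Absolute path required" failure would otherwise arise.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical minimal model of the proposed guard; not the real
// org.apache.hadoop.hdfs.server.namenode classes.
public class ListZonesSketch {
    static class INode {
        final String name;
        final INode parent;
        INode(String name, INode parent) { this.name = name; this.parent = parent; }
        // Rebuild an absolute path by walking up to the root ("" name);
        // only valid while every ancestor link is intact.
        String fullPath() {
            if (parent == null || parent.name.isEmpty()) return "/" + name;
            return parent.fullPath() + "/" + name;
        }
    }

    // Sketch of listing encryption zones with the extra null-parent check:
    // an orphaned (deleted) zone inode is skipped rather than resolved.
    static List<String> listZones(List<INode> zoneInodes) {
        List<String> zones = new ArrayList<>();
        for (INode inode : zoneInodes) {
            if (inode.parent == null && !inode.name.isEmpty()) {
                continue; // zone was deleted; skip instead of failing
            }
            zones.add(inode.fullPath());
        }
        return zones;
    }

    public static void main(String[] args) {
        INode root = new INode("", null);
        INode user = new INode("user", root);
        INode live = new INode("systest", user);
        INode orphan = new INode("EZ-1", null); // deleted EZ, parent link gone
        System.out.println(listZones(Arrays.asList(live, orphan)));
    }
}
```

With this guard, the listing above prints only the surviving zone (`/user/systest`) instead of throwing when it reaches the orphaned inode.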
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)