[jira] [Commented] (HDFS-6488) HDFS superuser unable to access user's Trash files using NFSv3 mount

2015-02-27 Thread Stephen Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14340562#comment-14340562
 ] 

Stephen Chu commented on HDFS-6488:
---

Hi [~brandonli], the patch looks good. I verified on my env (CentOS 6.4) that 
the new configuration gives full superuser access. +1 (non-binding)
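
For reference, the setting I verified goes on the NFS gateway roughly like this in hdfs-site.xml (the property name below is my reading of the patch; confirm against the committed patch/docs):
{code}
<property>
  <!-- NFS gateway superuser: this user gets full access to HDFS files through the mount.
       Property name assumed from the HDFS-6488 patch description. -->
  <name>nfs.superuser</name>
  <value>hdfs</value>
</property>
{code}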

 HDFS superuser unable to access user's Trash files using NFSv3 mount
 

 Key: HDFS-6488
 URL: https://issues.apache.org/jira/browse/HDFS-6488
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: nfs
Affects Versions: 2.3.0
Reporter: Stephen Chu
Assignee: Brandon Li
 Attachments: HDFS-6488.001.patch


 As hdfs superuser on the NFS mount, I cannot cd or ls the 
 /user/schu/.Trash directory:
 {code}
 bash-4.1$ cd .Trash/
 bash: cd: .Trash/: Permission denied
 bash-4.1$ ls -la
 total 2
 drwxr-xr-x 4 schu 2584148964 128 Jan  7 10:42 .
 drwxr-xr-x 4 hdfs 2584148964 128 Jan  6 16:59 ..
 drwx------ 2 schu 2584148964  64 Jan  7 10:45 .Trash
 drwxr-xr-x 2 hdfs hdfs        64 Jan  7 10:42 tt
 bash-4.1$ ls .Trash
 ls: cannot open directory .Trash: Permission denied
 bash-4.1$
 {code}
 When using FsShell as hdfs superuser, I have superuser permissions to schu's 
 .Trash contents:
 {code}
 bash-4.1$ hdfs dfs -ls -R /user/schu/.Trash
 drwx------   - schu supergroup  0 2014-01-07 10:48 /user/schu/.Trash/Current
 drwx------   - schu supergroup  0 2014-01-07 10:48 /user/schu/.Trash/Current/user
 drwx------   - schu supergroup  0 2014-01-07 10:48 /user/schu/.Trash/Current/user/schu
 -rw-r--r--   1 schu supergroup  4 2014-01-07 10:48 /user/schu/.Trash/Current/user/schu/tf1
 {code}
 The NFSv3 logs don't produce any error when superuser tries to access schu 
 Trash contents. However, for other permission errors (e.g. schu tries to 
 delete a directory owned by hdfs), there will be a permission error in the 
 logs.
 I think this is perhaps not specific to the .Trash directory.
 I created a /user/schu/dir1 which has the same permissions as .Trash (700). 
 When I try cd'ing into the directory from the NFSv3 mount as hdfs superuser, 
 I get the same permission denied.
 {code}
 [schu@hdfs-nfs ~]$ hdfs dfs -ls
 Found 4 items
 drwx------   - schu supergroup  0 2014-01-07 10:57 .Trash
 drwx------   - schu supergroup  0 2014-01-07 11:05 dir1
 -rw-r--r--   1 schu supergroup  4 2014-01-07 11:05 tf1
 drwxr-xr-x   - hdfs hdfs        0 2014-01-07 10:42 tt
 bash-4.1$ whoami
 hdfs
 bash-4.1$ pwd
 /hdfs_nfs_mount/user/schu
 bash-4.1$ cd dir1
 bash: cd: dir1: Permission denied
 bash-4.1$
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6488) HDFS superuser unable to access user's Trash files using NFSv3 mount

2015-02-26 Thread Stephen Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14339509#comment-14339509
 ] 

Stephen Chu commented on HDFS-6488:
---

[~brandonli]

Thank you, Brandon! I will test on RHEL/CentOS env. Will update with results. 
Let me know if there are any other specific platforms you want me to try on.

 HDFS superuser unable to access user's Trash files using NFSv3 mount
 

 Key: HDFS-6488
 URL: https://issues.apache.org/jira/browse/HDFS-6488
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: nfs
Affects Versions: 2.3.0
Reporter: Stephen Chu
 Attachments: HDFS-6488.001.patch


 As hdfs superuser on the NFS mount, I cannot cd or ls the 
 /user/schu/.Trash directory:
 {code}
 bash-4.1$ cd .Trash/
 bash: cd: .Trash/: Permission denied
 bash-4.1$ ls -la
 total 2
 drwxr-xr-x 4 schu 2584148964 128 Jan  7 10:42 .
 drwxr-xr-x 4 hdfs 2584148964 128 Jan  6 16:59 ..
 drwx------ 2 schu 2584148964  64 Jan  7 10:45 .Trash
 drwxr-xr-x 2 hdfs hdfs        64 Jan  7 10:42 tt
 bash-4.1$ ls .Trash
 ls: cannot open directory .Trash: Permission denied
 bash-4.1$
 {code}
 When using FsShell as hdfs superuser, I have superuser permissions to schu's 
 .Trash contents:
 {code}
 bash-4.1$ hdfs dfs -ls -R /user/schu/.Trash
 drwx------   - schu supergroup  0 2014-01-07 10:48 /user/schu/.Trash/Current
 drwx------   - schu supergroup  0 2014-01-07 10:48 /user/schu/.Trash/Current/user
 drwx------   - schu supergroup  0 2014-01-07 10:48 /user/schu/.Trash/Current/user/schu
 -rw-r--r--   1 schu supergroup  4 2014-01-07 10:48 /user/schu/.Trash/Current/user/schu/tf1
 {code}
 The NFSv3 logs don't produce any error when superuser tries to access schu 
 Trash contents. However, for other permission errors (e.g. schu tries to 
 delete a directory owned by hdfs), there will be a permission error in the 
 logs.
 I think this is perhaps not specific to the .Trash directory.
 I created a /user/schu/dir1 which has the same permissions as .Trash (700). 
 When I try cd'ing into the directory from the NFSv3 mount as hdfs superuser, 
 I get the same permission denied.
 {code}
 [schu@hdfs-nfs ~]$ hdfs dfs -ls
 Found 4 items
 drwx------   - schu supergroup  0 2014-01-07 10:57 .Trash
 drwx------   - schu supergroup  0 2014-01-07 11:05 dir1
 -rw-r--r--   1 schu supergroup  4 2014-01-07 11:05 tf1
 drwxr-xr-x   - hdfs hdfs        0 2014-01-07 10:42 tt
 bash-4.1$ whoami
 hdfs
 bash-4.1$ pwd
 /hdfs_nfs_mount/user/schu
 bash-4.1$ cd dir1
 bash: cd: dir1: Permission denied
 bash-4.1$
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6488) HDFS superuser unable to access user's Trash files using NFSv3 mount

2015-02-25 Thread Stephen Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337651#comment-14337651
 ] 

Stephen Chu commented on HDFS-6488:
---

Thank you, [~brandonli]! I can help test on other environments if that's 
helpful.

 HDFS superuser unable to access user's Trash files using NFSv3 mount
 

 Key: HDFS-6488
 URL: https://issues.apache.org/jira/browse/HDFS-6488
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: nfs
Affects Versions: 2.3.0
Reporter: Stephen Chu

 As hdfs superuser on the NFS mount, I cannot cd or ls the 
 /user/schu/.Trash directory:
 {code}
 bash-4.1$ cd .Trash/
 bash: cd: .Trash/: Permission denied
 bash-4.1$ ls -la
 total 2
 drwxr-xr-x 4 schu 2584148964 128 Jan  7 10:42 .
 drwxr-xr-x 4 hdfs 2584148964 128 Jan  6 16:59 ..
 drwx------ 2 schu 2584148964  64 Jan  7 10:45 .Trash
 drwxr-xr-x 2 hdfs hdfs        64 Jan  7 10:42 tt
 bash-4.1$ ls .Trash
 ls: cannot open directory .Trash: Permission denied
 bash-4.1$
 {code}
 When using FsShell as hdfs superuser, I have superuser permissions to schu's 
 .Trash contents:
 {code}
 bash-4.1$ hdfs dfs -ls -R /user/schu/.Trash
 drwx------   - schu supergroup  0 2014-01-07 10:48 /user/schu/.Trash/Current
 drwx------   - schu supergroup  0 2014-01-07 10:48 /user/schu/.Trash/Current/user
 drwx------   - schu supergroup  0 2014-01-07 10:48 /user/schu/.Trash/Current/user/schu
 -rw-r--r--   1 schu supergroup  4 2014-01-07 10:48 /user/schu/.Trash/Current/user/schu/tf1
 {code}
 The NFSv3 logs don't produce any error when superuser tries to access schu 
 Trash contents. However, for other permission errors (e.g. schu tries to 
 delete a directory owned by hdfs), there will be a permission error in the 
 logs.
 I think this is perhaps not specific to the .Trash directory.
 I created a /user/schu/dir1 which has the same permissions as .Trash (700). 
 When I try cd'ing into the directory from the NFSv3 mount as hdfs superuser, 
 I get the same permission denied.
 {code}
 [schu@hdfs-nfs ~]$ hdfs dfs -ls
 Found 4 items
 drwx------   - schu supergroup  0 2014-01-07 10:57 .Trash
 drwx------   - schu supergroup  0 2014-01-07 11:05 dir1
 -rw-r--r--   1 schu supergroup  4 2014-01-07 11:05 tf1
 drwxr-xr-x   - hdfs hdfs        0 2014-01-07 10:42 tt
 bash-4.1$ whoami
 hdfs
 bash-4.1$ pwd
 /hdfs_nfs_mount/user/schu
 bash-4.1$ cd dir1
 bash: cd: dir1: Permission denied
 bash-4.1$
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-4505) Balancer failure with nameservice configuration.

2015-01-12 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu updated HDFS-4505:
--
Attachment: HDFS-4505.002.patch

[~stayhf], your patch looks good.

I believe this problem still exists. It's been a while, so I've rebased your 
patch on trunk and attached.

 Balancer failure with nameservice configuration.
 

 Key: HDFS-4505
 URL: https://issues.apache.org/jira/browse/HDFS-4505
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: balancer & mover
Affects Versions: 2.0.2-alpha
 Environment: OS: Mac OS X Server 10.6.8/ Linux 2.6.32 x86_64
Reporter: QueryIO
Assignee: Chu Tong
  Labels: balancer, hdfs
 Attachments: HADOOP-9172.patch, HDFS-4505.002.patch

   Original Estimate: 24h
  Remaining Estimate: 24h

 This set of properties ...
 <property><name>dfs.namenode.https-address.NameNode1</name><value>192.168.0.10:50470</value></property>
 <property><name>dfs.namenode.http-address.NameNode1</name><value>192.168.0.10:50070</value></property>
 <property><name>dfs.namenode.rpc-address.NameNode1</name><value>192.168.0.10:9000</value></property>
 <property><name>dfs.nameservice.id</name><value>NameNode1</value></property>
 <property><name>dfs.nameservices</name><value>NameNode1</value></property>
 gives the following issue while running the balancer ...
 2012-12-27 15:42:36,193 INFO org.apache.hadoop.hdfs.server.balancer.Balancer: 
 namenodes = [hdfs://queryio10.local:9000, hdfs://192.168.0.10:9000]
 2012-12-27 15:42:36,194 INFO org.apache.hadoop.hdfs.server.balancer.Balancer: 
 p = Balancer.Parameters[BalancingPolicy.Node, threshold=10.0]
 2012-12-27 15:42:37,433 INFO org.apache.hadoop.net.NetworkTopology: Adding a 
 new node: /default-rack/192.168.0.10:50010
 2012-12-27 15:42:37,433 INFO org.apache.hadoop.hdfs.server.balancer.Balancer: 
 0 over-utilized: []
 2012-12-27 15:42:37,433 INFO org.apache.hadoop.hdfs.server.balancer.Balancer: 
 0 underutilized: []
 2012-12-27 15:42:37,436 INFO org.apache.hadoop.net.NetworkTopology: Adding a 
 new node: /default-rack/192.168.0.10:50010
 2012-12-27 15:42:37,436 INFO org.apache.hadoop.hdfs.server.balancer.Balancer: 
 0 over-utilized: []
 2012-12-27 15:42:37,436 INFO org.apache.hadoop.hdfs.server.balancer.Balancer: 
 0 underutilized: []
 2012-12-27 15:42:37,570 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer 
 Exception
 org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException):
  No lease on /system/balancer.id File does not exist. Holder 
 DFSClient_NONMAPREDUCE_1926739478_1 does not have any open files.
   at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2315)
   at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2306)
   at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2102)
   at 
 org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:469)
   at 
 org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:294)
   at 
 org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:43138)
   at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:454)
   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:910)
   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1694)
   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1690)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1367)
   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1688)
   at org.apache.hadoop.ipc.Client.call(Client.java:1164)
   at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
   at $Proxy10.addBlock(Unknown Source)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
   at 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
   at $Proxy10.addBlock(Unknown Source)
   at 
 org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:285)
   at 

[jira] [Commented] (HDFS-4505) Balancer failure with nameservice configuration.

2015-01-12 Thread Stephen Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14274569#comment-14274569
 ] 

Stephen Chu commented on HDFS-4505:
---

Hadoop QA ran successfully. The -1 is from a bug (HADOOP-11474).

 Balancer failure with nameservice configuration.
 

 Key: HDFS-4505
 URL: https://issues.apache.org/jira/browse/HDFS-4505
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: balancer & mover
Affects Versions: 2.0.2-alpha
 Environment: OS: Mac OS X Server 10.6.8/ Linux 2.6.32 x86_64
Reporter: QueryIO
Assignee: Chu Tong
  Labels: balancer, hdfs
 Attachments: HADOOP-9172.patch, HDFS-4505.002.patch

   Original Estimate: 24h
  Remaining Estimate: 24h

 This set of properties ...
 <property><name>dfs.namenode.https-address.NameNode1</name><value>192.168.0.10:50470</value></property>
 <property><name>dfs.namenode.http-address.NameNode1</name><value>192.168.0.10:50070</value></property>
 <property><name>dfs.namenode.rpc-address.NameNode1</name><value>192.168.0.10:9000</value></property>
 <property><name>dfs.nameservice.id</name><value>NameNode1</value></property>
 <property><name>dfs.nameservices</name><value>NameNode1</value></property>
 gives the following issue while running the balancer ...
 2012-12-27 15:42:36,193 INFO org.apache.hadoop.hdfs.server.balancer.Balancer: 
 namenodes = [hdfs://queryio10.local:9000, hdfs://192.168.0.10:9000]
 2012-12-27 15:42:36,194 INFO org.apache.hadoop.hdfs.server.balancer.Balancer: 
 p = Balancer.Parameters[BalancingPolicy.Node, threshold=10.0]
 2012-12-27 15:42:37,433 INFO org.apache.hadoop.net.NetworkTopology: Adding a 
 new node: /default-rack/192.168.0.10:50010
 2012-12-27 15:42:37,433 INFO org.apache.hadoop.hdfs.server.balancer.Balancer: 
 0 over-utilized: []
 2012-12-27 15:42:37,433 INFO org.apache.hadoop.hdfs.server.balancer.Balancer: 
 0 underutilized: []
 2012-12-27 15:42:37,436 INFO org.apache.hadoop.net.NetworkTopology: Adding a 
 new node: /default-rack/192.168.0.10:50010
 2012-12-27 15:42:37,436 INFO org.apache.hadoop.hdfs.server.balancer.Balancer: 
 0 over-utilized: []
 2012-12-27 15:42:37,436 INFO org.apache.hadoop.hdfs.server.balancer.Balancer: 
 0 underutilized: []
 2012-12-27 15:42:37,570 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer 
 Exception
 org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException):
  No lease on /system/balancer.id File does not exist. Holder 
 DFSClient_NONMAPREDUCE_1926739478_1 does not have any open files.
   at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2315)
   at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2306)
   at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2102)
   at 
 org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:469)
   at 
 org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:294)
   at 
 org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:43138)
   at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:454)
   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:910)
   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1694)
   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1690)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1367)
   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1688)
   at org.apache.hadoop.ipc.Client.call(Client.java:1164)
   at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
   at $Proxy10.addBlock(Unknown Source)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
   at 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
   at $Proxy10.addBlock(Unknown Source)
   at 
 org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:285)
   at 
 

[jira] [Commented] (HDFS-7542) Add an option to DFSAdmin -safemode wait to ignore connection failures

2014-12-19 Thread Stephen Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14254549#comment-14254549
 ] 

Stephen Chu commented on HDFS-7542:
---

The TestDataNodeVolumeFailureToleration and TestDatanodeManager failures are 
unrelated. There are outstanding JIRAs tracking them. I reran them 
successfully locally.

 Add an option to DFSAdmin -safemode wait to ignore connection failures
 --

 Key: HDFS-7542
 URL: https://issues.apache.org/jira/browse/HDFS-7542
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: tools
Affects Versions: 2.6.0
Reporter: Stephen Chu
Assignee: Stephen Chu
Priority: Minor
 Attachments: HDFS-7542.001.patch, HDFS-7542.002.patch


 Currently, the _dfsadmin -safemode wait_ command aborts when connection to 
 the NN fails (network glitch, ConnectException when NN is unreachable, 
 EOFException if network link shut down). 
 In certain situations, users have asked for an option to make the command 
 resilient to connection failures. This is useful so that the admin can 
 initiate the wait command despite the NN not being fully up or survive 
 intermittent network issues. With this option, the admin can rely on the wait 
 command continuing to poll instead of aborting.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7542) Add an option to DFSAdmin -safemode wait to ignore connection failures

2014-12-18 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu updated HDFS-7542:
--
Attachment: HDFS-7542.002.patch

 Add an option to DFSAdmin -safemode wait to ignore connection failures
 --

 Key: HDFS-7542
 URL: https://issues.apache.org/jira/browse/HDFS-7542
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: tools
Affects Versions: 2.6.0
Reporter: Stephen Chu
Assignee: Stephen Chu
Priority: Minor
 Attachments: HDFS-7542.001.patch, HDFS-7542.002.patch


 Currently, the _dfsadmin -safemode wait_ command aborts when connection to 
 the NN fails (network glitch, ConnectException when NN is unreachable, 
 EOFException if network link shut down). 
 In certain situations, users have asked for an option to make the command 
 resilient to connection failures. This is useful so that the admin can 
 initiate the wait command despite the NN not being fully up or survive 
 intermittent network issues. With this option, the admin can rely on the wait 
 command continuing to poll instead of aborting.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7542) Add an option to DFSAdmin -safemode wait to ignore connection failures

2014-12-18 Thread Stephen Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14251365#comment-14251365
 ] 

Stephen Chu commented on HDFS-7542:
---

The TestRollingUpgradeRollback failure is unrelated to these DFSAdmin command 
changes. I re-ran the test a few times successfully. The release audit warning 
also seems to be incorrect because all modified files have the Apache license. 
It's hard to see the exact test name that timed out, so I'm retrying Jenkins 
with the same patch. 

 Add an option to DFSAdmin -safemode wait to ignore connection failures
 --

 Key: HDFS-7542
 URL: https://issues.apache.org/jira/browse/HDFS-7542
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: tools
Affects Versions: 2.6.0
Reporter: Stephen Chu
Assignee: Stephen Chu
Priority: Minor
 Attachments: HDFS-7542.001.patch, HDFS-7542.002.patch


 Currently, the _dfsadmin -safemode wait_ command aborts when connection to 
 the NN fails (network glitch, ConnectException when NN is unreachable, 
 EOFException if network link shut down). 
 In certain situations, users have asked for an option to make the command 
 resilient to connection failures. This is useful so that the admin can 
 initiate the wait command despite the NN not being fully up or survive 
 intermittent network issues. With this option, the admin can rely on the wait 
 command continuing to poll instead of aborting.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7542) Add an option to DFSAdmin -safemode wait to ignore connection failures

2014-12-17 Thread Stephen Chu (JIRA)
Stephen Chu created HDFS-7542:
-

 Summary: Add an option to DFSAdmin -safemode wait to ignore 
connection failures
 Key: HDFS-7542
 URL: https://issues.apache.org/jira/browse/HDFS-7542
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: tools
Affects Versions: 2.6.0
Reporter: Stephen Chu
Assignee: Stephen Chu
Priority: Minor


Currently, the _dfsadmin -safemode wait_ command aborts when connection to the 
NN fails (network glitch, ConnectException when NN is unreachable, EOFException 
if network link shut down). 

In certain situations, users have asked for an option to make the command 
resilient to connection failures. This is useful so that the admin can initiate 
the wait command despite the NN not being fully up or survive intermittent 
network issues. With this option, the admin can rely on the wait command 
continuing to poll instead of aborting.
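
A minimal sketch of the polling behavior the option would enable, written against the public DistributedFileSystem API (illustrative only, not the actual DFSAdmin patch):
{code}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.HdfsConstants.SafeModeAction;

public class SafeModeWaitWithRetries {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    DistributedFileSystem dfs = (DistributedFileSystem) FileSystem.get(conf);
    while (true) {
      try {
        if (!dfs.setSafeMode(SafeModeAction.SAFEMODE_GET)) {
          System.out.println("Safe mode is OFF");
          return;
        }
        System.out.println("Safe mode is ON");
      } catch (IOException e) {
        // ConnectException/EOFException etc.: keep polling instead of aborting,
        // which is the behavior the proposed option would opt into.
        System.out.println("Could not reach the NameNode, retrying: " + e);
      }
      Thread.sleep(5000);
    }
  }
}
{code}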



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7542) Add an option to DFSAdmin -safemode wait to ignore connection failures

2014-12-17 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu updated HDFS-7542:
--
Attachment: HDFS-7542.001.patch

 Add an option to DFSAdmin -safemode wait to ignore connection failures
 --

 Key: HDFS-7542
 URL: https://issues.apache.org/jira/browse/HDFS-7542
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: tools
Affects Versions: 2.6.0
Reporter: Stephen Chu
Assignee: Stephen Chu
Priority: Minor
 Attachments: HDFS-7542.001.patch


 Currently, the _dfsadmin -safemode wait_ command aborts when connection to 
 the NN fails (network glitch, ConnectException when NN is unreachable, 
 EOFException if network link shut down). 
 In certain situations, users have asked for an option to make the command 
 resilient to connection failures. This is useful so that the admin can 
 initiate the wait command despite the NN not being fully up or survive 
 intermittent network issues. With this option, the admin can rely on the wait 
 command continuing to poll instead of aborting.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7542) Add an option to DFSAdmin -safemode wait to ignore connection failures

2014-12-17 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu updated HDFS-7542:
--
Status: Patch Available  (was: Open)

 Add an option to DFSAdmin -safemode wait to ignore connection failures
 --

 Key: HDFS-7542
 URL: https://issues.apache.org/jira/browse/HDFS-7542
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: tools
Affects Versions: 2.6.0
Reporter: Stephen Chu
Assignee: Stephen Chu
Priority: Minor
 Attachments: HDFS-7542.001.patch


 Currently, the _dfsadmin -safemode wait_ command aborts when connection to 
 the NN fails (network glitch, ConnectException when NN is unreachable, 
 EOFException if network link shut down). 
 In certain situations, users have asked for an option to make the command 
 resilient to connection failures. This is useful so that the admin can 
 initiate the wait command despite the NN not being fully up or survive 
 intermittent network issues. With this option, the admin can rely on the wait 
 command continuing to poll instead of aborting.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HDFS-7501) TransactionsSinceLastCheckpoint can be negative on SBNs

2014-12-09 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu reassigned HDFS-7501:
-

Assignee: Stephen Chu

 TransactionsSinceLastCheckpoint can be negative on SBNs
 ---

 Key: HDFS-7501
 URL: https://issues.apache.org/jira/browse/HDFS-7501
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.5.0
Reporter: Harsh J
Assignee: Stephen Chu
Priority: Trivial

 The metric TransactionsSinceLastCheckpoint is derived as FSEditLog.txid minus 
 NNStorage.mostRecentCheckpointTxId.
 In Standby mode, the former does not increment beyond the loaded or 
 last-when-active value, but the latter does change due to checkpoints done 
 regularly in this mode. Thereby, the SBN will eventually end up showing 
 negative values for TransactionsSinceLastCheckpoint.
 This is not an issue as the metric only makes sense to be monitored on the 
 Active NameNode, but we should perhaps just show the value 0 by detecting if 
 the NN is in SBN form, as allowing a negative number is confusing to view 
 within a chart that tracks it.
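
Concretely, the clamping could look like this standalone sketch (names are illustrative, not the actual FSNamesystem code):
{code}
class CheckpointMetricSketch {
  /**
   * TransactionsSinceLastCheckpoint = last written/loaded txid minus the txid of
   * the most recent checkpoint. On a standby NN the checkpoint txid keeps
   * advancing while the local txid does not, so the raw difference can go
   * negative; report 0 instead.
   */
  static long transactionsSinceLastCheckpoint(long lastTxId,
                                              long mostRecentCheckpointTxId,
                                              boolean isStandby) {
    long delta = lastTxId - mostRecentCheckpointTxId;
    return isStandby ? Math.max(0, delta) : delta;
  }
}
{code}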



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6741) Improve permission denied message when FSPermissionChecker#checkOwner fails

2014-10-28 Thread Stephen Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186486#comment-14186486
 ] 

Stephen Chu commented on HDFS-6741:
---

Thanks a lot, Harsh!

 Improve permission denied message when FSPermissionChecker#checkOwner fails
 ---

 Key: HDFS-6741
 URL: https://issues.apache.org/jira/browse/HDFS-6741
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 3.0.0, 2.5.0
Reporter: Stephen Chu
Assignee: Harsh J
Priority: Trivial
  Labels: supportability
 Fix For: 2.7.0

 Attachments: HDFS-6741.1.patch, HDFS-6741.2.patch, HDFS-6741.2.patch


 Currently, FSPermissionChecker#checkOwner throws an AccessControlException 
 with a simple "Permission denied" message.
 When users try to set an ACL without ownership permissions, they'll see 
 something like:
 {code}
 [schu@hdfs-vanilla-1 hadoop]$ hdfs dfs -setfacl -m user:schu:--- /tmp
 setfacl: Permission denied
 {code}
 It'd be helpful if the message had an explanation why the permission was 
 denied to avoid confusion for users who aren't familiar with permissions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6741) Improve permission denied message when FSPermissionChecker#checkOwner fails

2014-10-27 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu updated HDFS-6741:
--
Attachment: HDFS-6741.2.patch

Hi Harsh, I think the patch you posted in HDFS-7292 is better. Submitting that 
patch here right now and assigning to you because you made the improvements.

+1 (non-binding)

Applied the patch and ran the new unit tests successfully.

 Improve permission denied message when FSPermissionChecker#checkOwner fails
 ---

 Key: HDFS-6741
 URL: https://issues.apache.org/jira/browse/HDFS-6741
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 3.0.0, 2.5.0
Reporter: Stephen Chu
Assignee: Stephen Chu
Priority: Trivial
  Labels: supportability
 Attachments: HDFS-6741.1.patch, HDFS-6741.2.patch


 Currently, FSPermissionChecker#checkOwner throws an AccessControlException 
 with a simple "Permission denied" message.
 When users try to set an ACL without ownership permissions, they'll see 
 something like:
 {code}
 [schu@hdfs-vanilla-1 hadoop]$ hdfs dfs -setfacl -m user:schu:--- /tmp
 setfacl: Permission denied
 {code}
 It'd be helpful if the message had an explanation why the permission was 
 denied to avoid confusion for users who aren't familiar with permissions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6741) Improve permission denied message when FSPermissionChecker#checkOwner fails

2014-10-27 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu updated HDFS-6741:
--
Attachment: HDFS-6741.2.patch

 Improve permission denied message when FSPermissionChecker#checkOwner fails
 ---

 Key: HDFS-6741
 URL: https://issues.apache.org/jira/browse/HDFS-6741
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 3.0.0, 2.5.0
Reporter: Stephen Chu
Assignee: Stephen Chu
Priority: Trivial
  Labels: supportability
 Attachments: HDFS-6741.1.patch, HDFS-6741.2.patch, HDFS-6741.2.patch


 Currently, FSPermissionChecker#checkOwner throws an AccessControlException 
 with a simple "Permission denied" message.
 When users try to set an ACL without ownership permissions, they'll see 
 something like:
 {code}
 [schu@hdfs-vanilla-1 hadoop]$ hdfs dfs -setfacl -m user:schu:--- /tmp
 setfacl: Permission denied
 {code}
 It'd be helpful if the message had an explanation why the permission was 
 denied to avoid confusion for users who aren't familiar with permissions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HDFS-6741) Improve permission denied message when FSPermissionChecker#checkOwner fails

2014-10-27 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu reassigned HDFS-6741:
-

Assignee: Harsh J  (was: Stephen Chu)

 Improve permission denied message when FSPermissionChecker#checkOwner fails
 ---

 Key: HDFS-6741
 URL: https://issues.apache.org/jira/browse/HDFS-6741
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 3.0.0, 2.5.0
Reporter: Stephen Chu
Assignee: Harsh J
Priority: Trivial
  Labels: supportability
 Attachments: HDFS-6741.1.patch, HDFS-6741.2.patch, HDFS-6741.2.patch


 Currently, FSPermissionChecker#checkOwner throws an AccessControlException 
 with a simple "Permission denied" message.
 When users try to set an ACL without ownership permissions, they'll see 
 something like:
 {code}
 [schu@hdfs-vanilla-1 hadoop]$ hdfs dfs -setfacl -m user:schu:--- /tmp
 setfacl: Permission denied
 {code}
 It'd be helpful if the message had an explanation why the permission was 
 denied to avoid confusion for users who aren't familiar with permissions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7292) Improve error messages for checkOwner permission related failures

2014-10-26 Thread Stephen Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184751#comment-14184751
 ] 

Stephen Chu commented on HDFS-7292:
---

Hi, Harsh. I think this is the same as HDFS-6741. I submitted a patch there, 
and can further add more information about the inode in the message if you 
would like.

 Improve error messages for checkOwner permission related failures
 -

 Key: HDFS-7292
 URL: https://issues.apache.org/jira/browse/HDFS-7292
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.2.0
Reporter: Harsh J
Priority: Trivial

 If a bad file create request fails, you get a juicy error that almost fully 
 self-describes the reason:
 {code}Caused by: 
 org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException):
  Permission denied: user=root, access=WRITE, 
 inode=/:hdfs:supergroup:drwxr-xr-x{code}
 However, if a setPermission fails, one only gets a vague:
 {code}Caused by: 
 org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException):
  Permission denied{code}
 It would be nicer if all forms of permission failures logged the accessed 
 inode and current ownership and permissions in the same way.
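
For the ownership check specifically, a message in the same style could be built along these lines (a sketch with illustrative names, not the actual FSPermissionChecker change):
{code}
import org.apache.hadoop.security.AccessControlException;

class OwnerCheckSketch {
  // Fail with the same self-describing style as the READ/WRITE checks:
  // the user, the inode path, and its owner:group:permissions.
  static void checkOwner(String user, String path, String owner,
                         String group, String perms) throws AccessControlException {
    if (!user.equals(owner)) {
      throw new AccessControlException("Permission denied. user=" + user
          + " is not the owner of inode=" + path + ":" + owner + ":" + group + ":" + perms);
    }
  }
}
{code}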



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7283) Bump DataNode OOM log from WARN to ERROR

2014-10-24 Thread Stephen Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14182935#comment-14182935
 ] 

Stephen Chu commented on HDFS-7283:
---

Thanks for the review, [~wheat9]! The TestGroupsWithHA failure is unrelated. I 
re-ran the test successfully locally.

 Bump DataNode OOM log from WARN to ERROR
 

 Key: HDFS-7283
 URL: https://issues.apache.org/jira/browse/HDFS-7283
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode
Affects Versions: 2.0.0-alpha
Reporter: Stephen Chu
Assignee: Stephen Chu
Priority: Trivial
  Labels: supportability
 Attachments: HDFS-7283.1.patch


 When the DataNode OOMs, it logs the following WARN message which should be 
 bumped up to ERROR because DataNode OOM often leads to DN process abortion.
 {code}
 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DataNode is out of 
 memory. Will retry in 30 seconds. 
 4751 java.lang.OutOfMemoryError: unable to create new native thread
 {code}
 Thanks to Roland Teague for identifying this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6741) Improve permission denied message when FSPermissionChecker#checkOwner fails

2014-10-24 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu updated HDFS-6741:
--
Priority: Trivial  (was: Minor)
Target Version/s: 3.0.0, 2.7.0  (was: 3.0.0, 2.6.0)
  Labels: supportability  (was: )

 Improve permission denied message when FSPermissionChecker#checkOwner fails
 ---

 Key: HDFS-6741
 URL: https://issues.apache.org/jira/browse/HDFS-6741
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 3.0.0, 2.5.0
Reporter: Stephen Chu
Assignee: Stephen Chu
Priority: Trivial
  Labels: supportability
 Attachments: HDFS-6741.1.patch


 Currently, FSPermissionChecker#checkOwner throws an AccessControlException 
 with a simple "Permission denied" message.
 When users try to set an ACL without ownership permissions, they'll see 
 something like:
 {code}
 [schu@hdfs-vanilla-1 hadoop]$ hdfs dfs -setfacl -m user:schu:--- /tmp
 setfacl: Permission denied
 {code}
 It'd be helpful if the message had an explanation why the permission was 
 denied to avoid confusion for users who aren't familiar with permissions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7283) Bump DataNode OOM log from WARN to ERROR

2014-10-24 Thread Stephen Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14183127#comment-14183127
 ] 

Stephen Chu commented on HDFS-7283:
---

Thanks again, [~wheat9].

 Bump DataNode OOM log from WARN to ERROR
 

 Key: HDFS-7283
 URL: https://issues.apache.org/jira/browse/HDFS-7283
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode
Affects Versions: 2.0.0-alpha
Reporter: Stephen Chu
Assignee: Stephen Chu
Priority: Trivial
  Labels: supportability
 Fix For: 2.7.0

 Attachments: HDFS-7283.1.patch


 When the DataNode OOMs, it logs the following WARN message which should be 
 bumped up to ERROR because DataNode OOM often leads to DN process abortion.
 {code}
 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DataNode is out of 
 memory. Will retry in 30 seconds. 
 4751 java.lang.OutOfMemoryError: unable to create new native thread
 {code}
 Thanks to Roland Teague for identifying this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7283) Bump DataNode OOM log from WARN to ERROR

2014-10-23 Thread Stephen Chu (JIRA)
Stephen Chu created HDFS-7283:
-

 Summary: Bump DataNode OOM log from WARN to ERROR
 Key: HDFS-7283
 URL: https://issues.apache.org/jira/browse/HDFS-7283
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode
Affects Versions: 2.0.0-alpha
Reporter: Stephen Chu
Assignee: Stephen Chu
Priority: Trivial


When the DataNode OOMs, it logs the following WARN message which should be 
bumped up to ERROR because DataNode OOM often leads to DN process abortion.

{code}
WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DataNode is out of 
memory. Will retry in 30 seconds. 
4751 java.lang.OutOfMemoryError: unable to create new native thread
{code}

Thanks to Roland Teague for identifying this.
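
The change itself is tiny; schematically it amounts to the following (a standalone sketch, not the actual DataNode accept-loop code):
{code}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

class OomLogSeveritySketch {
  private static final Logger LOG = LoggerFactory.getLogger(OomLogSeveritySketch.class);

  static void runGuarded(Runnable work) {
    try {
      work.run();
    } catch (OutOfMemoryError oom) {
      // Previously logged at WARN; since an OOM usually precedes DN abortion,
      // surface it at ERROR so it stands out during triage.
      LOG.error("DataNode is out of memory. Will retry in 30 seconds.", oom);
    }
  }
}
{code}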



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7283) Bump DataNode OOM log from WARN to ERROR

2014-10-23 Thread Stephen Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14181897#comment-14181897
 ] 

Stephen Chu commented on HDFS-7283:
---

[~philip], I haven't tried that setting yet. Do you mean that we should verify 
that the DN can be configured to use -XX:OnOutOfMemoryError properly?

 Bump DataNode OOM log from WARN to ERROR
 

 Key: HDFS-7283
 URL: https://issues.apache.org/jira/browse/HDFS-7283
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode
Affects Versions: 2.0.0-alpha
Reporter: Stephen Chu
Assignee: Stephen Chu
Priority: Trivial
  Labels: supportability

 When the DataNode OOMs, it logs the following WARN message which should be 
 bumped up to ERROR because DataNode OOM often leads to DN process abortion.
 {code}
 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DataNode is out of 
 memory. Will retry in 30 seconds. 
 4751 java.lang.OutOfMemoryError: unable to create new native thread
 {code}
 Thanks to Roland Teague for identifying this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7283) Bump DataNode OOM log from WARN to ERROR

2014-10-23 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu updated HDFS-7283:
--
Attachment: HDFS-7283.1.patch

Ah, that makes sense, Phil. Thanks. I've attached a patch to bump the log 
severity.

 Bump DataNode OOM log from WARN to ERROR
 

 Key: HDFS-7283
 URL: https://issues.apache.org/jira/browse/HDFS-7283
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode
Affects Versions: 2.0.0-alpha
Reporter: Stephen Chu
Assignee: Stephen Chu
Priority: Trivial
  Labels: supportability
 Attachments: HDFS-7283.1.patch


 When the DataNode OOMs, it logs the following WARN message which should be 
 bumped up to ERROR because DataNode OOM often leads to DN process abortion.
 {code}
 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DataNode is out of 
 memory. Will retry in 30 seconds. 
 4751 java.lang.OutOfMemoryError: unable to create new native thread
 {code}
 Thanks to Roland Teague for identifying this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7283) Bump DataNode OOM log from WARN to ERROR

2014-10-23 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu updated HDFS-7283:
--
Status: Patch Available  (was: Open)

 Bump DataNode OOM log from WARN to ERROR
 

 Key: HDFS-7283
 URL: https://issues.apache.org/jira/browse/HDFS-7283
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode
Affects Versions: 2.0.0-alpha
Reporter: Stephen Chu
Assignee: Stephen Chu
Priority: Trivial
  Labels: supportability
 Attachments: HDFS-7283.1.patch


 When the DataNode OOMs, it logs the following WARN message which should be 
 bumped up to ERROR because DataNode OOM often leads to DN process abortion.
 {code}
 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DataNode is out of 
 memory. Will retry in 30 seconds. 
 4751 java.lang.OutOfMemoryError: unable to create new native thread
 {code}
 Thanks to Roland Teague for identifying this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7283) Bump DataNode OOM log from WARN to ERROR

2014-10-23 Thread Stephen Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14181989#comment-14181989
 ] 

Stephen Chu commented on HDFS-7283:
---

No added unit tests because this just changes the log severity.

 Bump DataNode OOM log from WARN to ERROR
 

 Key: HDFS-7283
 URL: https://issues.apache.org/jira/browse/HDFS-7283
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode
Affects Versions: 2.0.0-alpha
Reporter: Stephen Chu
Assignee: Stephen Chu
Priority: Trivial
  Labels: supportability
 Attachments: HDFS-7283.1.patch


 When the DataNode OOMs, it logs the following WARN message which should be 
 bumped up to ERROR because DataNode OOM often leads to DN process abortion.
 {code}
 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DataNode is out of 
 memory. Will retry in 30 seconds. 
 4751 java.lang.OutOfMemoryError: unable to create new native thread
 {code}
 Thanks to Roland Teague for identifying this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7094) [ HDFS NFS ] TYPO in NFS configurations from documentation.

2014-09-19 Thread Stephen Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14140915#comment-14140915
 ] 

Stephen Chu commented on HDFS-7094:
---

Hi, [~brahma].

The documentation (which corresponds to version 2.5.1) is actually correct 
because in HDFS-6056 (which went into 2.5.0), the NFS configs were changed from 
{{dfs.nfs.keytab.file}} to {{dfs.nfs.keytab.file}}. Note that in the patch, 
DeprecationDeltas were added, so users can use the {{dfs.nfs.}} prefix for some 
configs.

If you look at {{org.apache.hadoop.hdfs.nfs.conf.NfsConfigKeys}} on branch-2, 
you'll see the up-to-date NFS config names.
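
The deprecation mapping works roughly like this, using the public Configuration API (an illustrative sketch; the real list lives in the NFS config classes):
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configuration.DeprecationDelta;

class NfsConfDeprecationSketch {
  static void registerDeprecations() {
    // Old dfs.nfs.* names keep working but resolve to the new nfs.* names.
    Configuration.addDeprecations(new DeprecationDelta[] {
        new DeprecationDelta("dfs.nfs.keytab.file", "nfs.keytab.file"),
        new DeprecationDelta("dfs.nfs.kerberos.principal", "nfs.kerberos.principal")
    });
  }
}
{code}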

Thanks,
Stephen



 [ HDFS NFS ] TYPO in NFS configurations from documentation.
 ---

 Key: HDFS-7094
 URL: https://issues.apache.org/jira/browse/HDFS-7094
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: documentation
Reporter: Brahma Reddy Battula
Assignee: Brahma Reddy Battula
Priority: Critical

  
   *{color:blue}Config from Documentation{color}* ( 
 https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsNfsGateway.html
  )
 <property>
   <name>*{color:red}nfs.keytab.file{color}*</name>
   <value>/etc/hadoop/conf/nfsserver.keytab</value> <!-- path to the nfs gateway keytab -->
 </property>
 <property>
   <name>*{color:red}nfs.kerberos.principal{color}*</name>
   <value>nfsserver/_h...@your-realm.com</value>
 </property>
  *{color:blue}Config From Code{color}* 
 {code}
   public static final String DFS_NFS_KEYTAB_FILE_KEY = "dfs.nfs.keytab.file";
   public static final String DFS_NFS_USER_NAME_KEY = "dfs.nfs.kerberos.principal";
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7094) [ HDFS NFS ] TYPO in NFS configurations from documentation.

2014-09-19 Thread Stephen Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14140920#comment-14140920
 ] 

Stephen Chu commented on HDFS-7094:
---

Sorry, made a typo above. I meant the NFS configs were changed from, for 
example, {{dfs.nfs.keytab.file}} to {{nfs.keytab.file}}.

 [ HDFS NFS ] TYPO in NFS configurations from documentation.
 ---

 Key: HDFS-7094
 URL: https://issues.apache.org/jira/browse/HDFS-7094
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: documentation
Reporter: Brahma Reddy Battula
Assignee: Brahma Reddy Battula
Priority: Critical

  
   *{color:blue}Config from Documentation{color}* ( 
 https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsNfsGateway.html
  )
 <property>
   <name>*{color:red}nfs.keytab.file{color}*</name>
   <value>/etc/hadoop/conf/nfsserver.keytab</value> <!-- path to the nfs gateway keytab -->
 </property>
 <property>
   <name>*{color:red}nfs.kerberos.principal{color}*</name>
   <value>nfsserver/_h...@your-realm.com</value>
 </property>
  *{color:blue}Config From Code{color}* 
 {code}
   public static final String DFS_NFS_KEYTAB_FILE_KEY = "dfs.nfs.keytab.file";
   public static final String DFS_NFS_USER_NAME_KEY = "dfs.nfs.kerberos.principal";
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7061) Add test to verify encryption zone creation after NameNode restart without saving namespace

2014-09-12 Thread Stephen Chu (JIRA)
Stephen Chu created HDFS-7061:
-

 Summary: Add test to verify encryption zone creation after 
NameNode restart without saving namespace
 Key: HDFS-7061
 URL: https://issues.apache.org/jira/browse/HDFS-7061
 Project: Hadoop HDFS
  Issue Type: Test
  Components: encryption, test
Affects Versions: 3.0.0, 2.6.0
Reporter: Stephen Chu
Assignee: Stephen Chu
Priority: Minor


Right now we verify that encryption zones are as expected after saving the 
namespace and restarting the NameNode.

We should also verify that encryption zone modifications are as expected after 
restarting the NameNode without saving the namespace.

This is similar to TestFSImageWithXAttr and TestFSImageWithAcl, where we toggle 
NN restarts with and without saving the namespace.
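
The core of such a test would look roughly like this (a sketch that assumes the MiniDFSCluster is already wired to a KeyProvider and the key exists, as the existing encryption tests set up):
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.MiniDFSCluster;
import org.apache.hadoop.hdfs.client.HdfsAdmin;

class EzRestartSketch {
  static void verifyZoneSurvivesRestart(MiniDFSCluster cluster, Configuration conf,
                                        String keyName) throws Exception {
    FileSystem fs = cluster.getFileSystem();
    Path zone = new Path("/zone1");
    fs.mkdirs(zone);
    HdfsAdmin admin = new HdfsAdmin(fs.getUri(), conf);
    admin.createEncryptionZone(zone, keyName);
    // Restart *without* saveNamespace so the zone has to be replayed from the edit log.
    cluster.restartNameNode();
    if (admin.getEncryptionZoneForPath(zone) == null) {
      throw new AssertionError("encryption zone not found after NN restart");
    }
  }
}
{code}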



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7061) Add test to verify encryption zone creation after NameNode restart without saving namespace

2014-09-12 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu updated HDFS-7061:
--
Issue Type: Sub-task  (was: Test)
Parent: HDFS-6891

 Add test to verify encryption zone creation after NameNode restart without 
 saving namespace
 ---

 Key: HDFS-7061
 URL: https://issues.apache.org/jira/browse/HDFS-7061
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: encryption, test
Affects Versions: 3.0.0, 2.6.0
Reporter: Stephen Chu
Assignee: Stephen Chu
Priority: Minor

 Right now we verify that encryption zones are as expected after saving the 
 namespace and restarting the NameNode.
 We should also verify that encryption zone modifications are as expected after 
 restarting the NameNode without saving the namespace.
 This is similar to TestFSImageWithXAttr and TestFSImageWithAcl, where we 
 toggle NN restarts with and without saving the namespace.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7061) Add test to verify encryption zone creation after NameNode restart without saving namespace

2014-09-12 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu updated HDFS-7061:
--
Status: Patch Available  (was: Open)

 Add test to verify encryption zone creation after NameNode restart without 
 saving namespace
 ---

 Key: HDFS-7061
 URL: https://issues.apache.org/jira/browse/HDFS-7061
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: encryption, test
Affects Versions: 3.0.0, 2.6.0
Reporter: Stephen Chu
Assignee: Stephen Chu
Priority: Minor
 Attachments: HDFS-7061.1.patch


 Right now we verify that encryption zones are as expected after saving the 
 namespace and restarting the NameNode.
 We should also verify that encryption zone modifications are as expected after 
 restarting the NameNode without saving the namespace.
 This is similar to TestFSImageWithXAttr and TestFSImageWithAcl, where we 
 toggle NN restarts with and without saving the namespace.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7061) Add test to verify encryption zone creation after NameNode restart without saving namespace

2014-09-12 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu updated HDFS-7061:
--
Attachment: HDFS-7061.1.patch

 Add test to verify encryption zone creation after NameNode restart without 
 saving namespace
 ---

 Key: HDFS-7061
 URL: https://issues.apache.org/jira/browse/HDFS-7061
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: encryption, test
Affects Versions: 3.0.0, 2.6.0
Reporter: Stephen Chu
Assignee: Stephen Chu
Priority: Minor
 Attachments: HDFS-7061.1.patch


 Right now we verify that encryption zones are as expected after saving the 
 namespace and restarting the NameNode.
 We should also verify that encryption zone modifications are as expected after 
 restarting the NameNode without saving the namespace.
 This is similar to TestFSImageWithXAttr and TestFSImageWithAcl, where we 
 toggle NN restarts with and without saving the namespace.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6966) Add additional unit tests for encryption zones

2014-09-10 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu updated HDFS-6966:
--
Attachment: HDFS-6966.4.patch

Re-kick Hadoop QA.

 Add additional unit tests for encryption zones
 --

 Key: HDFS-6966
 URL: https://issues.apache.org/jira/browse/HDFS-6966
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: encryption
Affects Versions: 3.0.0, 2.6.0
Reporter: Stephen Chu
Assignee: Stephen Chu
 Attachments: HDFS-6966.1.patch, HDFS-6966.2.patch, HDFS-6966.3.patch, 
 HDFS-6966.4.patch, HDFS-6966.4.patch, HDFS-6966.4.patch


 There are some more unit tests that can be added for test encryption zones. 
 For example, more encryption zone + snapshot tests, running fsck on 
 encryption zones, and more.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6966) Add additional unit tests for encryption zones

2014-09-10 Thread Stephen Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14128989#comment-14128989
 ] 

Stephen Chu commented on HDFS-6966:
---

The above 3 test failures are unrelated. I ran them successfully locally.

 Add additional unit tests for encryption zones
 --

 Key: HDFS-6966
 URL: https://issues.apache.org/jira/browse/HDFS-6966
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: encryption
Affects Versions: 3.0.0, 2.6.0
Reporter: Stephen Chu
Assignee: Stephen Chu
 Attachments: HDFS-6966.1.patch, HDFS-6966.2.patch, HDFS-6966.3.patch, 
 HDFS-6966.4.patch, HDFS-6966.4.patch, HDFS-6966.4.patch


 There are some more unit tests that can be added for test encryption zones. 
 For example, more encryption zone + snapshot tests, running fsck on 
 encryption zones, and more.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6966) Add additional unit tests for encryption zones

2014-09-09 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu updated HDFS-6966:
--
Attachment: HDFS-6966.4.patch

Hey [~andrew.wang], thanks a lot for the comments. I updated the patch to 
address them.

* Added double stars to javadoc the test descriptions
* Assert true on the successful rename test case
* Added verification that the contents read from the snapshotted encrypted file 
are as expected
* Added a read of the file contents after failover to make sure they're the same
* It turns out that the KeyProviders of the two NNs in the HA test are not the 
same object (fails ==). I couldn't find a good way to set them to be the same 
object, so I changed DFSTestUtil to allow creating a key while specifying a NN 
index and created the same key for both NameNodes. Let me know if you think 
there's a better way.
* Set the client's provider to NN0's KeyProvider.

 Add additional unit tests for encryption zones
 --

 Key: HDFS-6966
 URL: https://issues.apache.org/jira/browse/HDFS-6966
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: encryption
Affects Versions: 3.0.0, 2.6.0
Reporter: Stephen Chu
Assignee: Stephen Chu
 Attachments: HDFS-6966.1.patch, HDFS-6966.2.patch, HDFS-6966.3.patch, 
 HDFS-6966.4.patch


 There are some more unit tests that can be added for test encryption zones. 
 For example, more encryption zone + snapshot tests, running fsck on 
 encryption zones, and more.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6966) Add additional unit tests for encryption zones

2014-09-09 Thread Stephen Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14127393#comment-14127393
 ] 

Stephen Chu commented on HDFS-6966:
---

java.lang.RuntimeException: The forked VM terminated without saying properly 
goodbye. VM crash or System.exit called during test build.

Going to re-update the patch to re-kick Hadoop QA.

 Add additional unit tests for encryption zones
 --

 Key: HDFS-6966
 URL: https://issues.apache.org/jira/browse/HDFS-6966
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: encryption
Affects Versions: 3.0.0, 2.6.0
Reporter: Stephen Chu
Assignee: Stephen Chu
 Attachments: HDFS-6966.1.patch, HDFS-6966.2.patch, HDFS-6966.3.patch, 
 HDFS-6966.4.patch, HDFS-6966.4.patch


 There are some more unit tests that can be added for test encryption zones. 
 For example, more encryption zone + snapshot tests, running fsck on 
 encryption zones, and more.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6966) Add additional unit tests for encryption zones

2014-09-09 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu updated HDFS-6966:
--
Attachment: HDFS-6966.4.patch

 Add additional unit tests for encryption zones
 --

 Key: HDFS-6966
 URL: https://issues.apache.org/jira/browse/HDFS-6966
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: encryption
Affects Versions: 3.0.0, 2.6.0
Reporter: Stephen Chu
Assignee: Stephen Chu
 Attachments: HDFS-6966.1.patch, HDFS-6966.2.patch, HDFS-6966.3.patch, 
 HDFS-6966.4.patch, HDFS-6966.4.patch


 There are some more unit tests that can be added for test encryption zones. 
 For example, more encryption zone + snapshot tests, running fsck on 
 encryption zones, and more.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6966) Add additional unit tests for encryption zones

2014-09-09 Thread Stephen Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14127811#comment-14127811
 ] 

Stephen Chu commented on HDFS-6966:
---

TestPipelinesFailover and TestFileConcurrentReader are unrelated. I ran them 
locally successfully.

TestEncryptionZones timed out in testStartFileRetry, which I don't believe was 
modified by this patch. Looking through the patch's changes in 
TestEncryptionZones and DFSTestUtil, they seem isolated from the failed test. I 
looped TestEncryptionZones successfully 20x on my local machine to try to 
repro. [~andrew.wang], any thoughts on whether this patch could have affected the 
testStartFileRetry failure? I couldn't figure out the reason for the timeout, 
which happens in the "// Flip-flop between two EZs to repeatedly fail" section.



 Add additional unit tests for encryption zones
 --

 Key: HDFS-6966
 URL: https://issues.apache.org/jira/browse/HDFS-6966
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: encryption
Affects Versions: 3.0.0, 2.6.0
Reporter: Stephen Chu
Assignee: Stephen Chu
 Attachments: HDFS-6966.1.patch, HDFS-6966.2.patch, HDFS-6966.3.patch, 
 HDFS-6966.4.patch, HDFS-6966.4.patch


 There are some more unit tests that can be added to test encryption zones. 
 For example, more encryption zone + snapshot tests, running fsck on 
 encryption zones, and more.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6966) Add additional unit tests for encryption zones

2014-09-08 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu updated HDFS-6966:
--
Attachment: HDFS-6966.3.patch

Rebasing patch.

 Add additional unit tests for encryption zones
 --

 Key: HDFS-6966
 URL: https://issues.apache.org/jira/browse/HDFS-6966
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: encryption
Affects Versions: 3.0.0, 2.6.0
Reporter: Stephen Chu
Assignee: Stephen Chu
 Attachments: HDFS-6966.1.patch, HDFS-6966.2.patch, HDFS-6966.3.patch


 There are some more unit tests that can be added to test encryption zones. 
 For example, more encryption zone + snapshot tests, running fsck on 
 encryption zones, and more.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7032) Add WebHDFS support for reading and writing to encryption zones

2014-09-08 Thread Stephen Chu (JIRA)
Stephen Chu created HDFS-7032:
-

 Summary: Add WebHDFS support for reading and writing to encryption 
zones
 Key: HDFS-7032
 URL: https://issues.apache.org/jira/browse/HDFS-7032
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: encryption, webhdfs
Affects Versions: 3.0.0, 2.6.0
Reporter: Stephen Chu


Currently, decrypting files within encryption zones does not work through 
WebHDFS. Users will get returned the raw data.

For example:
{code}
bash-4.1$ hdfs crypto -listZones
/enc2 key128 
/jenkins  key128 

bash-4.1$ hdfs dfs -cat /enc2/hello
hello and goodbye
bash-4.1$ hadoop fs -cat 
webhdfs://hdfs-cdh5-vanilla-1.host.com:20101/enc2/hello
14/09/08 15:55:26 WARN ssl.FileBasedKeyStoresFactory: The property 
'ssl.client.truststore.location' has not been set, no TrustStore will be loaded
å¿¡?~?A
?`?y???Wbash-4.1$ 
bash-4.1$ curl -i -L 
http://hdfs-cdh5-vanilla-1.host.com:20101/webhdfs/v1/enc2/hello?user.name=hdfsop=OPEN;
HTTP/1.1 307 TEMPORARY_REDIRECT
Cache-Control: no-cache
Expires: Mon, 08 Sep 2014 22:56:08 GMT
Date: Mon, 08 Sep 2014 22:56:08 GMT
Pragma: no-cache
Expires: Mon, 08 Sep 2014 22:56:08 GMT
Date: Mon, 08 Sep 2014 22:56:08 GMT
Pragma: no-cache
Content-Type: application/octet-stream
Set-Cookie: 
hadoop.auth=u=hdfsp=hdfst=simplee=1410252968270s=QzpylAy1ltts1F6hHpsVFGC0TfA=;
 Version=1; Path=/; Expires=Tue, 09-Sep-2014 08:56:08 GMT; HttpOnly
Location: 
http://hdfs-cdh5-vanilla-1.host.com:20003/webhdfs/v1/enc2/hello?op=OPENuser.name=hdfsnamenoderpcaddress=hdfs-cdh5-vanilla-1.host.com:8020offset=0
Content-Length: 0
Server: Jetty(6.1.26)

HTTP/1.1 200 OK
Cache-Control: no-cache
Expires: Mon, 08 Sep 2014 22:56:08 GMT
Date: Mon, 08 Sep 2014 22:56:08 GMT
Pragma: no-cache
Expires: Mon, 08 Sep 2014 22:56:08 GMT
Date: Mon, 08 Sep 2014 22:56:08 GMT
Pragma: no-cache
Content-Type: application/octet-stream
Content-Length: 18
Access-Control-Allow-Methods: GET
Access-Control-Allow-Origin: *
Server: Jetty(6.1.26)

å¿¡?~?A
?`?y???Wbash-4.1$ 
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7003) Add NFS Gateway support for reading and writing to encryption zones

2014-09-05 Thread Stephen Chu (JIRA)
Stephen Chu created HDFS-7003:
-

 Summary: Add NFS Gateway support for reading and writing to 
encryption zones
 Key: HDFS-7003
 URL: https://issues.apache.org/jira/browse/HDFS-7003
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: encryption, nfs
Affects Versions: 2.6.0
Reporter: Stephen Chu


Currently, reading and writing within encryption zones does not work through 
the NFS gateway.

For example, we have an encryption zone {{/enc}}. Here's the difference of 
reading the file from hadoop fs and the NFS gateway:

{code}
[hdfs@schu-enc2 ~]$ hadoop fs -cat /enc/hi
hi
[hdfs@schu-enc2 ~]$ cat /hdfs_nfs/enc/hi
??
{code}

If we write a file using the NFS gateway, we'll see behavior like this:

{code}
[hdfs@schu-enc2 ~]$ echo hello  /hdfs_nfs/enc/hello
[hdfs@schu-enc2 ~]$ cat /hdfs_nfs/enc/hello
hello
[hdfs@schu-enc2 ~]$ hdfs dfs -cat /enc/hello
???tp[hdfs@schu-enc2 ~]$ 
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6966) Add additional unit tests for encryption zones

2014-09-02 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu updated HDFS-6966:
--
Attachment: HDFS-6966.2.patch

Adding new patch with more tests.

In the most recent patch we:

* Add HA test to verify standby NN tracks encryption zones.
* Assert null when calling getEncryptionZoneForPath on a nonexistent path.
* Verify success of renaming a dir and file within an encryption zone
* Run fsck on a system with encryption zones
* Add more snapshot unit testing. In particular, after snapshotting an 
encryption zone, remove the encryption zone and recreate the dir and take a 
snapshot. Verify that the new snapshot does not have an encryption zone. Delete 
the snapshots out of order and verify that the remaining snapshots have the 
correct encryption zone paths.
* Add tests for symlinks within the same encryption zone and within different 
encryption zones.
* Add test to run the OfflineImageViewer on a system of encryption zones.

Again, if it's better, I can merge some tests to save on MiniDFSCluster spin up 
and shutdown time.
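As an illustration, the rename check is along the lines of the following sketch 
(the {{fs}}, {{dfsAdmin}}, and {{TEST_KEY}} fields are assumed to come from the 
suite's setup, so this is not the exact patch code):

{code}
// Renames that stay inside the same encryption zone should succeed.
final Path zone = new Path("/zone");
fs.mkdirs(zone);
dfsAdmin.createEncryptionZone(zone, TEST_KEY);

final Path srcDir = new Path(zone, "d1");
final Path srcFile = new Path(srcDir, "f1");
fs.mkdirs(srcDir);
DFSTestUtil.createFile(fs, srcFile, 1024, (short) 1, 0xFEED);

assertTrue(fs.rename(srcDir, new Path(zone, "d2")));
assertTrue(fs.rename(new Path(zone, "d2/f1"), new Path(zone, "d2/f2")));
{code}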

 Add additional unit tests for encryption zones
 --

 Key: HDFS-6966
 URL: https://issues.apache.org/jira/browse/HDFS-6966
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: encryption
Affects Versions: 3.0.0, 2.6.0
Reporter: Stephen Chu
Assignee: Stephen Chu
 Attachments: HDFS-6966.1.patch, HDFS-6966.2.patch


 There are some more unit tests that can be added to test encryption zones. 
 For example, more encryption zone + snapshot tests, running fsck on 
 encryption zones, and more.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6966) Add additional unit tests for encryption zones

2014-09-02 Thread Stephen Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119216#comment-14119216
 ] 

Stephen Chu commented on HDFS-6966:
---

TestPipelinesFailover is unrelated to this patch.

 Add additional unit tests for encryption zones
 --

 Key: HDFS-6966
 URL: https://issues.apache.org/jira/browse/HDFS-6966
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: encryption
Affects Versions: 3.0.0, 2.6.0
Reporter: Stephen Chu
Assignee: Stephen Chu
 Attachments: HDFS-6966.1.patch, HDFS-6966.2.patch


 There are some more unit tests that can be added to test encryption zones. 
 For example, more encryption zone + snapshot tests, running fsck on 
 encryption zones, and more.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6966) Add additional unit tests for encryption zones

2014-08-29 Thread Stephen Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14115376#comment-14115376
 ] 

Stephen Chu commented on HDFS-6966:
---

The above test failures are unrelated to the patch's changes, which just touch 
encryption tests and the {{DFSTestUtil#createKey}} helper method. I re-ran them 
successfully locally.

 Add additional unit tests for encryption zones
 --

 Key: HDFS-6966
 URL: https://issues.apache.org/jira/browse/HDFS-6966
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: encryption
Affects Versions: 3.0.0, 2.6.0
Reporter: Stephen Chu
Assignee: Stephen Chu
 Attachments: HDFS-6966.1.patch


 There are some more unit tests that can be added to test encryption zones. 
 For example, more encryption zone + snapshot tests, running fsck on 
 encryption zones, and more.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HDFS-6966) Add additional unit tests for encryption zones

2014-08-28 Thread Stephen Chu (JIRA)
Stephen Chu created HDFS-6966:
-

 Summary: Add additional unit tests for encryption zones
 Key: HDFS-6966
 URL: https://issues.apache.org/jira/browse/HDFS-6966
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: encryption
Affects Versions: 3.0.0
Reporter: Stephen Chu
Assignee: Stephen Chu


There are some more unit tests that can be added to test encryption zones. For 
example, more encryption zone + snapshot tests, running fsck on encryption 
zones, and more.





--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6946) TestBalancerWithSaslDataTransfer fails in trunk

2014-08-28 Thread Stephen Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14114095#comment-14114095
 ] 

Stephen Chu commented on HDFS-6946:
---

[~airbots], [~kihwal], any thoughts on this one? Checking because you worked on 
HDFS-5803.

 TestBalancerWithSaslDataTransfer fails in trunk
 ---

 Key: HDFS-6946
 URL: https://issues.apache.org/jira/browse/HDFS-6946
 Project: Hadoop HDFS
  Issue Type: Test
Reporter: Ted Yu
Assignee: Stephen Chu
Priority: Minor
 Attachments: HDFS-6946.1.patch


 From build #1849 :
 {code}
 REGRESSION:  
 org.apache.hadoop.hdfs.server.balancer.TestBalancerWithSaslDataTransfer.testBalancer0Integrity
 Error Message:
 Cluster failed to reached expected values of totalSpace (current: 750, 
 expected: 750), or usedSpace (current: 140, expected: 150), in more than 
 4 msec.
 Stack Trace:
 java.util.concurrent.TimeoutException: Cluster failed to reached expected 
 values of totalSpace (current: 750, expected: 750), or usedSpace (current: 
 140, expected: 150), in more than 4 msec.
 at 
 org.apache.hadoop.hdfs.server.balancer.TestBalancer.waitForHeartBeat(TestBalancer.java:253)
 at 
 org.apache.hadoop.hdfs.server.balancer.TestBalancer.runBalancer(TestBalancer.java:578)
 at 
 org.apache.hadoop.hdfs.server.balancer.TestBalancer.doTest(TestBalancer.java:551)
 at 
 org.apache.hadoop.hdfs.server.balancer.TestBalancer.doTest(TestBalancer.java:437)
 at 
 org.apache.hadoop.hdfs.server.balancer.TestBalancer.oneNodeTest(TestBalancer.java:645)
 at 
 org.apache.hadoop.hdfs.server.balancer.TestBalancer.testBalancer0Internal(TestBalancer.java:759)
 at 
 org.apache.hadoop.hdfs.server.balancer.TestBalancerWithSaslDataTransfer.testBalancer0Integrity(TestBalancerWithSaslDataTransfer.java:34)
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6946) TestBalancerWithSaslDataTransfer fails in trunk

2014-08-28 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu updated HDFS-6946:
--

Attachment: testBalancer0Integrity-failure.log

Attaching the testBalancer0Integrity failure log from the Jenkins report. Will go 
through it again, too.

 TestBalancerWithSaslDataTransfer fails in trunk
 ---

 Key: HDFS-6946
 URL: https://issues.apache.org/jira/browse/HDFS-6946
 Project: Hadoop HDFS
  Issue Type: Test
Reporter: Ted Yu
Assignee: Stephen Chu
Priority: Minor
 Attachments: HDFS-6946.1.patch, testBalancer0Integrity-failure.log


 From build #1849 :
 {code}
 REGRESSION:  
 org.apache.hadoop.hdfs.server.balancer.TestBalancerWithSaslDataTransfer.testBalancer0Integrity
 Error Message:
 Cluster failed to reached expected values of totalSpace (current: 750, 
 expected: 750), or usedSpace (current: 140, expected: 150), in more than 
 4 msec.
 Stack Trace:
 java.util.concurrent.TimeoutException: Cluster failed to reached expected 
 values of totalSpace (current: 750, expected: 750), or usedSpace (current: 
 140, expected: 150), in more than 4 msec.
 at 
 org.apache.hadoop.hdfs.server.balancer.TestBalancer.waitForHeartBeat(TestBalancer.java:253)
 at 
 org.apache.hadoop.hdfs.server.balancer.TestBalancer.runBalancer(TestBalancer.java:578)
 at 
 org.apache.hadoop.hdfs.server.balancer.TestBalancer.doTest(TestBalancer.java:551)
 at 
 org.apache.hadoop.hdfs.server.balancer.TestBalancer.doTest(TestBalancer.java:437)
 at 
 org.apache.hadoop.hdfs.server.balancer.TestBalancer.oneNodeTest(TestBalancer.java:645)
 at 
 org.apache.hadoop.hdfs.server.balancer.TestBalancer.testBalancer0Internal(TestBalancer.java:759)
 at 
 org.apache.hadoop.hdfs.server.balancer.TestBalancerWithSaslDataTransfer.testBalancer0Integrity(TestBalancerWithSaslDataTransfer.java:34)
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6966) Add additional unit tests for encryption zones

2014-08-28 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu updated HDFS-6966:
--

Attachment: HDFS-6966.1.patch

Submitting patch that adds more unit tests for encryption zones.

* Add HA test to verify standby NN tracks encryption zones.
* Assert null when calling {{getEncryptionZoneForPath}} on a nonexistent path.
* Verify success of renaming a dir and file within an encryption zone
* Run fsck on a system with encryption zones
* Add more snapshot unit testing. In particular, after snapshotting an 
encryption zone, remove the encryption zone and recreate the dir and take a 
snapshot. Verify that the new snapshot does not have an encryption zone. Delete 
the snapshots out of order and verify that the remaining snapshots have the 
correct encryption zone paths.
* Add tests for symlinks within the same encryption zone and within different 
encryption zones.

I can merge some of these tests to existing tests to save on MiniDFSCluster 
spin up / shutdown time if preferable.
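For illustration, the fsck check is roughly the following sketch (not the exact 
patch code; {{conf}} is assumed to be the MiniDFSCluster's configuration and the 
zone path is illustrative):

{code}
// Run fsck over a namespace containing an encryption zone and expect a
// healthy, zero exit-code result.
ByteArrayOutputStream bStream = new ByteArrayOutputStream();
PrintStream out = new PrintStream(bStream, true);
int errCode = ToolRunner.run(new DFSck(conf, out), new String[]{ "/zone" });
assertEquals("fsck should succeed on a namespace with encryption zones",
    0, errCode);
assertTrue(bStream.toString().contains("HEALTHY"));
{code}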


 Add additional unit tests for encryption zones
 --

 Key: HDFS-6966
 URL: https://issues.apache.org/jira/browse/HDFS-6966
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: encryption
Affects Versions: 3.0.0
Reporter: Stephen Chu
Assignee: Stephen Chu
 Attachments: HDFS-6966.1.patch


 There are some more unit tests that can be added to test encryption zones. 
 For example, more encryption zone + snapshot tests, running fsck on 
 encryption zones, and more.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6966) Add additional unit tests for encryption zones

2014-08-28 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu updated HDFS-6966:
--

Status: Patch Available  (was: Open)

 Add additional unit tests for encryption zones
 --

 Key: HDFS-6966
 URL: https://issues.apache.org/jira/browse/HDFS-6966
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: encryption
Affects Versions: 3.0.0
Reporter: Stephen Chu
Assignee: Stephen Chu
 Attachments: HDFS-6966.1.patch


 There are some more unit tests that can be added to test encryption zones. 
 For example, more encryption zone + snapshot tests, running fsck on 
 encryption zones, and more.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6966) Add additional unit tests for encryption zones

2014-08-28 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu updated HDFS-6966:
--

 Target Version/s: 3.0.0, 2.6.0  (was: 3.0.0)
Affects Version/s: 2.6.0

 Add additional unit tests for encryption zones
 --

 Key: HDFS-6966
 URL: https://issues.apache.org/jira/browse/HDFS-6966
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: encryption
Affects Versions: 3.0.0, 2.6.0
Reporter: Stephen Chu
Assignee: Stephen Chu
 Attachments: HDFS-6966.1.patch


 There are some more unit tests that can be added to test encryption zones. 
 For example, more encryption zone + snapshot tests, running fsck on 
 encryption zones, and more.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6946) TestBalancerWithSaslDataTransfer fails in trunk

2014-08-27 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu updated HDFS-6946:
--

Attachment: HDFS-6946.1.patch

I ran TestBalancerWithSaslDataTransfer multiple times against a build of current 
trunk and a build from when the test was first introduced (HDFS-2856, git hash 
8d5e8c860ed361ed792affcfe06f1a34b017e421).

git hash 8d5e8c860ed361ed792affcfe06f1a34b017e421 (sec):
90.254
84.332
73.909
74.057
84.397
76.889
83.687

Current trunk (sec):
79.968
62.116
73.788
79.336
68.545
76.846
70.014

There doesn't seem to be a performance regression (the runs average roughly 81s 
on the older build versus roughly 73s on current trunk), and the test cases in 
both versions look the same. Today's (8/27/2014) Hdfs-trunk build also had a 
test failure due to a 40s timeout, in {{TestBlockTokenWithDFS.testEnd2End}}.

I think we should bump up {{TestBalancer#TIMEOUT}} from 40s to 60s. Attaching a 
patch.
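For clarity, the proposed change amounts to something like the following in 
{{TestBalancer}} (the exact declaration in the test may differ):

{code}
// Bump the shared balancer test timeout from 40 seconds to 60 seconds.
static final long TIMEOUT = 60000L; // was 40000L
{code}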

 TestBalancerWithSaslDataTransfer fails in trunk
 ---

 Key: HDFS-6946
 URL: https://issues.apache.org/jira/browse/HDFS-6946
 Project: Hadoop HDFS
  Issue Type: Test
Reporter: Ted Yu
Assignee: Stephen Chu
Priority: Minor
 Attachments: HDFS-6946.1.patch


 From build #1849 :
 {code}
 REGRESSION:  
 org.apache.hadoop.hdfs.server.balancer.TestBalancerWithSaslDataTransfer.testBalancer0Integrity
 Error Message:
 Cluster failed to reached expected values of totalSpace (current: 750, 
 expected: 750), or usedSpace (current: 140, expected: 150), in more than 
 4 msec.
 Stack Trace:
 java.util.concurrent.TimeoutException: Cluster failed to reached expected 
 values of totalSpace (current: 750, expected: 750), or usedSpace (current: 
 140, expected: 150), in more than 4 msec.
 at 
 org.apache.hadoop.hdfs.server.balancer.TestBalancer.waitForHeartBeat(TestBalancer.java:253)
 at 
 org.apache.hadoop.hdfs.server.balancer.TestBalancer.runBalancer(TestBalancer.java:578)
 at 
 org.apache.hadoop.hdfs.server.balancer.TestBalancer.doTest(TestBalancer.java:551)
 at 
 org.apache.hadoop.hdfs.server.balancer.TestBalancer.doTest(TestBalancer.java:437)
 at 
 org.apache.hadoop.hdfs.server.balancer.TestBalancer.oneNodeTest(TestBalancer.java:645)
 at 
 org.apache.hadoop.hdfs.server.balancer.TestBalancer.testBalancer0Internal(TestBalancer.java:759)
 at 
 org.apache.hadoop.hdfs.server.balancer.TestBalancerWithSaslDataTransfer.testBalancer0Integrity(TestBalancerWithSaslDataTransfer.java:34)
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6946) TestBalancerWithSaslDataTransfer fails in trunk

2014-08-27 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu updated HDFS-6946:
--

Status: Patch Available  (was: Open)

 TestBalancerWithSaslDataTransfer fails in trunk
 ---

 Key: HDFS-6946
 URL: https://issues.apache.org/jira/browse/HDFS-6946
 Project: Hadoop HDFS
  Issue Type: Test
Reporter: Ted Yu
Assignee: Stephen Chu
Priority: Minor
 Attachments: HDFS-6946.1.patch


 From build #1849 :
 {code}
 REGRESSION:  
 org.apache.hadoop.hdfs.server.balancer.TestBalancerWithSaslDataTransfer.testBalancer0Integrity
 Error Message:
 Cluster failed to reached expected values of totalSpace (current: 750, 
 expected: 750), or usedSpace (current: 140, expected: 150), in more than 
 4 msec.
 Stack Trace:
 java.util.concurrent.TimeoutException: Cluster failed to reached expected 
 values of totalSpace (current: 750, expected: 750), or usedSpace (current: 
 140, expected: 150), in more than 4 msec.
 at 
 org.apache.hadoop.hdfs.server.balancer.TestBalancer.waitForHeartBeat(TestBalancer.java:253)
 at 
 org.apache.hadoop.hdfs.server.balancer.TestBalancer.runBalancer(TestBalancer.java:578)
 at 
 org.apache.hadoop.hdfs.server.balancer.TestBalancer.doTest(TestBalancer.java:551)
 at 
 org.apache.hadoop.hdfs.server.balancer.TestBalancer.doTest(TestBalancer.java:437)
 at 
 org.apache.hadoop.hdfs.server.balancer.TestBalancer.oneNodeTest(TestBalancer.java:645)
 at 
 org.apache.hadoop.hdfs.server.balancer.TestBalancer.testBalancer0Internal(TestBalancer.java:759)
 at 
 org.apache.hadoop.hdfs.server.balancer.TestBalancerWithSaslDataTransfer.testBalancer0Integrity(TestBalancerWithSaslDataTransfer.java:34)
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6773) MiniDFSCluster should skip edit log fsync by default

2014-08-27 Thread Stephen Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14112802#comment-14112802
 ] 

Stephen Chu commented on HDFS-6773:
---

Thank you, Colin!

 MiniDFSCluster should skip edit log fsync by default
 

 Key: HDFS-6773
 URL: https://issues.apache.org/jira/browse/HDFS-6773
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Daryn Sharp
Assignee: Stephen Chu
 Fix For: 2.6.0

 Attachments: HDFS-6773.1.patch, HDFS-6773.2.patch, HDFS-6773.2.patch


 The mini cluster is unnecessarily running with durable edit logs.  The 
 following change cut runtime of a single test from ~30s to ~10s.
 {code}EditLogFileOutputStream.setShouldSkipFsyncForTesting(true);{code}
 The mini cluster should default to this behavior after identifying the few 
 edit log tests that probably depend on durable logs.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6946) TestBalancerWithSaslDataTransfer fails in trunk

2014-08-27 Thread Stephen Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14112995#comment-14112995
 ] 

Stephen Chu commented on HDFS-6946:
---

The two failing tests above are not due to the timeout change. I re-ran them 
locally successfully.

 TestBalancerWithSaslDataTransfer fails in trunk
 ---

 Key: HDFS-6946
 URL: https://issues.apache.org/jira/browse/HDFS-6946
 Project: Hadoop HDFS
  Issue Type: Test
Reporter: Ted Yu
Assignee: Stephen Chu
Priority: Minor
 Attachments: HDFS-6946.1.patch


 From build #1849 :
 {code}
 REGRESSION:  
 org.apache.hadoop.hdfs.server.balancer.TestBalancerWithSaslDataTransfer.testBalancer0Integrity
 Error Message:
 Cluster failed to reached expected values of totalSpace (current: 750, 
 expected: 750), or usedSpace (current: 140, expected: 150), in more than 
 4 msec.
 Stack Trace:
 java.util.concurrent.TimeoutException: Cluster failed to reached expected 
 values of totalSpace (current: 750, expected: 750), or usedSpace (current: 
 140, expected: 150), in more than 4 msec.
 at 
 org.apache.hadoop.hdfs.server.balancer.TestBalancer.waitForHeartBeat(TestBalancer.java:253)
 at 
 org.apache.hadoop.hdfs.server.balancer.TestBalancer.runBalancer(TestBalancer.java:578)
 at 
 org.apache.hadoop.hdfs.server.balancer.TestBalancer.doTest(TestBalancer.java:551)
 at 
 org.apache.hadoop.hdfs.server.balancer.TestBalancer.doTest(TestBalancer.java:437)
 at 
 org.apache.hadoop.hdfs.server.balancer.TestBalancer.oneNodeTest(TestBalancer.java:645)
 at 
 org.apache.hadoop.hdfs.server.balancer.TestBalancer.testBalancer0Internal(TestBalancer.java:759)
 at 
 org.apache.hadoop.hdfs.server.balancer.TestBalancerWithSaslDataTransfer.testBalancer0Integrity(TestBalancerWithSaslDataTransfer.java:34)
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6773) MiniDFSCluster should skip edit log fsync by default

2014-08-26 Thread Stephen Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14110953#comment-14110953
 ] 

Stephen Chu commented on HDFS-6773:
---

The above two test failures aren't related to this patch. I ran them locally 
successfully to double-check.

 MiniDFSCluster should skip edit log fsync by default
 

 Key: HDFS-6773
 URL: https://issues.apache.org/jira/browse/HDFS-6773
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Daryn Sharp
Assignee: Stephen Chu
 Attachments: HDFS-6773.1.patch, HDFS-6773.2.patch, HDFS-6773.2.patch


 The mini cluster is unnecessarily running with durable edit logs.  The 
 following change cut runtime of a single test from ~30s to ~10s.
 {code}EditLogFileOutputStream.setShouldSkipFsyncForTesting(true);{code}
 The mini cluster should default to this behavior after identifying the few 
 edit log tests that probably depend on durable logs.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6946) TestBalancerWithSaslDataTransfer fails in trunk

2014-08-26 Thread Stephen Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111455#comment-14111455
 ] 

Stephen Chu commented on HDFS-6946:
---

Similar to HDFS-5803, where TestBalancer#TIMEOUT was bumped from 20s to 40s.

We can compare TestBalancer runtimes between current trunk and the point where 
HDFS-5803 was fixed to see if there is a performance regression, taking test 
code changes into account. If there isn't a regression, perhaps we should bump 
up the timeout.

 TestBalancerWithSaslDataTransfer fails in trunk
 ---

 Key: HDFS-6946
 URL: https://issues.apache.org/jira/browse/HDFS-6946
 Project: Hadoop HDFS
  Issue Type: Test
Reporter: Ted Yu
Priority: Minor

 From build #1849 :
 {code}
 REGRESSION:  
 org.apache.hadoop.hdfs.server.balancer.TestBalancerWithSaslDataTransfer.testBalancer0Integrity
 Error Message:
 Cluster failed to reached expected values of totalSpace (current: 750, 
 expected: 750), or usedSpace (current: 140, expected: 150), in more than 
 4 msec.
 Stack Trace:
 java.util.concurrent.TimeoutException: Cluster failed to reached expected 
 values of totalSpace (current: 750, expected: 750), or usedSpace (current: 
 140, expected: 150), in more than 4 msec.
 at 
 org.apache.hadoop.hdfs.server.balancer.TestBalancer.waitForHeartBeat(TestBalancer.java:253)
 at 
 org.apache.hadoop.hdfs.server.balancer.TestBalancer.runBalancer(TestBalancer.java:578)
 at 
 org.apache.hadoop.hdfs.server.balancer.TestBalancer.doTest(TestBalancer.java:551)
 at 
 org.apache.hadoop.hdfs.server.balancer.TestBalancer.doTest(TestBalancer.java:437)
 at 
 org.apache.hadoop.hdfs.server.balancer.TestBalancer.oneNodeTest(TestBalancer.java:645)
 at 
 org.apache.hadoop.hdfs.server.balancer.TestBalancer.testBalancer0Internal(TestBalancer.java:759)
 at 
 org.apache.hadoop.hdfs.server.balancer.TestBalancerWithSaslDataTransfer.testBalancer0Integrity(TestBalancerWithSaslDataTransfer.java:34)
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Assigned] (HDFS-6946) TestBalancerWithSaslDataTransfer fails in trunk

2014-08-26 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu reassigned HDFS-6946:
-

Assignee: Stephen Chu

 TestBalancerWithSaslDataTransfer fails in trunk
 ---

 Key: HDFS-6946
 URL: https://issues.apache.org/jira/browse/HDFS-6946
 Project: Hadoop HDFS
  Issue Type: Test
Reporter: Ted Yu
Assignee: Stephen Chu
Priority: Minor

 From build #1849 :
 {code}
 REGRESSION:  
 org.apache.hadoop.hdfs.server.balancer.TestBalancerWithSaslDataTransfer.testBalancer0Integrity
 Error Message:
 Cluster failed to reached expected values of totalSpace (current: 750, 
 expected: 750), or usedSpace (current: 140, expected: 150), in more than 
 4 msec.
 Stack Trace:
 java.util.concurrent.TimeoutException: Cluster failed to reached expected 
 values of totalSpace (current: 750, expected: 750), or usedSpace (current: 
 140, expected: 150), in more than 4 msec.
 at 
 org.apache.hadoop.hdfs.server.balancer.TestBalancer.waitForHeartBeat(TestBalancer.java:253)
 at 
 org.apache.hadoop.hdfs.server.balancer.TestBalancer.runBalancer(TestBalancer.java:578)
 at 
 org.apache.hadoop.hdfs.server.balancer.TestBalancer.doTest(TestBalancer.java:551)
 at 
 org.apache.hadoop.hdfs.server.balancer.TestBalancer.doTest(TestBalancer.java:437)
 at 
 org.apache.hadoop.hdfs.server.balancer.TestBalancer.oneNodeTest(TestBalancer.java:645)
 at 
 org.apache.hadoop.hdfs.server.balancer.TestBalancer.testBalancer0Internal(TestBalancer.java:759)
 at 
 org.apache.hadoop.hdfs.server.balancer.TestBalancerWithSaslDataTransfer.testBalancer0Integrity(TestBalancerWithSaslDataTransfer.java:34)
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6951) Saving namespace and restarting NameNode will remove existing encryption zones

2014-08-26 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu updated HDFS-6951:
--

Attachment: HDFS-6951-testrepo.patch

 Saving namespace and restarting NameNode will remove existing encryption zones
 --

 Key: HDFS-6951
 URL: https://issues.apache.org/jira/browse/HDFS-6951
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: encryption
Affects Versions: 3.0.0
Reporter: Stephen Chu
 Fix For: 3.0.0

 Attachments: HDFS-6951-testrepo.patch


 Currently, when users save namespace and restart the NameNode, pre-existing 
 encryption zones will be wiped out.
 To reproduce:
 * Create an encryption zone
 * List encryption zones and verify the newly created zone is present
 * Save the namespace
 * Kill and restart the NameNode
 * List the encryption zones and you'll find the encryption zone is missing
 I've attached a test case for {{TestEncryptionZones}} that reproduces this as 
 well. Removing the saveNamespace call will get the test to pass.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HDFS-6951) Saving namespace and restarting NameNode will remove existing encryption zones

2014-08-26 Thread Stephen Chu (JIRA)
Stephen Chu created HDFS-6951:
-

 Summary: Saving namespace and restarting NameNode will remove 
existing encryption zones
 Key: HDFS-6951
 URL: https://issues.apache.org/jira/browse/HDFS-6951
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: encryption
Affects Versions: 3.0.0
Reporter: Stephen Chu
 Fix For: 3.0.0
 Attachments: HDFS-6951-testrepo.patch

Currently, when users save namespace and restart the NameNode, pre-existing 
encryption zones will be wiped out.

To reproduce:
* Create an encryption zone
* List encryption zones and verify the newly created zone is present
* Save the namespace
* Kill and restart the NameNode
* List the encryption zones and you'll find the encryption zone is missing

I've attached a test case for {{TestEncryptionZones}} that reproduces this as 
well. Removing the saveNamespace call will get the test to pass.
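A sketch along these lines reproduces the issue (field names such as {{fs}}, 
{{dfsAdmin}}, and {{cluster}} are assumed to come from the setup in 
{{TestEncryptionZones}}; the key and zone names are illustrative, and this is 
not the attached patch verbatim):

{code}
@Test(timeout = 120000)
public void testEncryptionZonesSurviveSaveNamespace() throws Exception {
  final Path zone = new Path("/zone");
  fs.mkdirs(zone);
  // assumes "testKey" was created via DFSTestUtil#createKey in setup
  dfsAdmin.createEncryptionZone(zone, "testKey");

  // Save the namespace and restart the NameNode.
  fs.setSafeMode(HdfsConstants.SafeModeAction.SAFEMODE_ENTER);
  fs.saveNamespace();
  fs.setSafeMode(HdfsConstants.SafeModeAction.SAFEMODE_LEAVE);
  cluster.restartNameNode(true);

  // The zone should still be listed after the restart; dropping the
  // saveNamespace call above makes this assertion pass on current trunk.
  RemoteIterator<EncryptionZone> it = dfsAdmin.listEncryptionZones();
  assertTrue("expected encryption zone to survive saveNamespace + restart",
      it.hasNext());
  assertEquals(zone.toString(), it.next().getPath());
}
{code}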



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6951) Saving namespace and restarting NameNode will remove existing encryption zones

2014-08-26 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu updated HDFS-6951:
--

Description: 
Currently, when users save namespace and restart the NameNode, pre-existing 
encryption zones will be wiped out.

I could reproduce this on a pseudo-distributed cluster:
* Create an encryption zone
* List encryption zones and verify the newly created zone is present
* Save the namespace
* Kill and restart the NameNode
* List the encryption zones and you'll find the encryption zone is missing

I've attached a test case for {{TestEncryptionZones}} that reproduces this as 
well. Removing the saveNamespace call will get the test to pass.

  was:
Currently, when users save namespace and restart the NameNode, pre-existing 
encryption zones will be wiped out.

To reproduce:
* Create an encryption zone
* List encryption zones and verify the newly created zone is present
* Save the namespace
* Kill and restart the NameNode
* List the encryption zones and you'll find the encryption zone is missing

I've attached a test case for {{TestEncryptionZones}} that reproduces this as 
well. Removing the saveNamespace call will get the test to pass.


 Saving namespace and restarting NameNode will remove existing encryption zones
 --

 Key: HDFS-6951
 URL: https://issues.apache.org/jira/browse/HDFS-6951
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: encryption
Affects Versions: 3.0.0
Reporter: Stephen Chu
 Fix For: 3.0.0

 Attachments: HDFS-6951-testrepo.patch


 Currently, when users save namespace and restart the NameNode, pre-existing 
 encryption zones will be wiped out.
 I could reproduce this on a pseudo-distributed cluster:
 * Create an encryption zone
 * List encryption zones and verify the newly created zone is present
 * Save the namespace
 * Kill and restart the NameNode
 * List the encryption zones and you'll find the encryption zone is missing
 I've attached a test case for {{TestEncryptionZones}} that reproduces this as 
 well. Removing the saveNamespace call will get the test to pass.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6773) MiniDFSCluster should skip edit log fsync by default

2014-08-25 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu updated HDFS-6773:
--

Attachment: HDFS-6773.2.patch

Thanks a lot for the review, [~cmccabe]. Attaching a rebased patch.

 MiniDFSCluster should skip edit log fsync by default
 

 Key: HDFS-6773
 URL: https://issues.apache.org/jira/browse/HDFS-6773
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Daryn Sharp
Assignee: Stephen Chu
 Attachments: HDFS-6773.1.patch, HDFS-6773.2.patch


 The mini cluster is unnecessarily running with durable edit logs.  The 
 following change cut runtime of a single test from ~30s to ~10s.
 {code}EditLogFileOutputStream.setShouldSkipFsyncForTesting(true);{code}
 The mini cluster should default to this behavior after identifying the few 
 edit log tests that probably depend on durable logs.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6773) MiniDFSCluster should skip edit log fsync by default

2014-08-25 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu updated HDFS-6773:
--

Status: Open  (was: Patch Available)

 MiniDFSCluster should skip edit log fsync by default
 

 Key: HDFS-6773
 URL: https://issues.apache.org/jira/browse/HDFS-6773
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Daryn Sharp
Assignee: Stephen Chu
 Attachments: HDFS-6773.1.patch, HDFS-6773.2.patch


 The mini cluster is unnecessarily running with durable edit logs.  The 
 following change cut runtime of a single test from ~30s to ~10s.
 {code}EditLogFileOutputStream.setShouldSkipFsyncForTesting(true);{code}
 The mini cluster should default to this behavior after identifying the few 
 edit log tests that probably depend on durable logs.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6773) MiniDFSCluster should skip edit log fsync by default

2014-08-25 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu updated HDFS-6773:
--

Status: Patch Available  (was: Open)

 MiniDFSCluster should skip edit log fsync by default
 

 Key: HDFS-6773
 URL: https://issues.apache.org/jira/browse/HDFS-6773
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Daryn Sharp
Assignee: Stephen Chu
 Attachments: HDFS-6773.1.patch, HDFS-6773.2.patch


 The mini cluster is unnecessarily running with durable edit logs.  The 
 following change cut runtime of a single test from ~30s to ~10s.
 {code}EditLogFileOutputStream.setShouldSkipFsyncForTesting(true);{code}
 The mini cluster should default to this behavior after identifying the few 
 edit log tests that probably depend on durable logs.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6773) MiniDFSCluster should skip edit log fsync by default

2014-08-25 Thread Stephen Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14109982#comment-14109982
 ] 

Stephen Chu commented on HDFS-6773:
---

It looks like the hdfs test run was aborted, but I'm not exactly sure why. I 
reapplied the patch to a clean trunk and ran some snapshot tests successfully 
to double-check. Re-triggering Hadoop QA.

 MiniDFSCluster should skip edit log fsync by default
 

 Key: HDFS-6773
 URL: https://issues.apache.org/jira/browse/HDFS-6773
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Daryn Sharp
Assignee: Stephen Chu
 Attachments: HDFS-6773.1.patch, HDFS-6773.2.patch


 The mini cluster is unnecessarily running with durable edit logs.  The 
 following change cut runtime of a single test from ~30s to ~10s.
 {code}EditLogFileOutputStream.setShouldSkipFsyncForTesting(true);{code}
 The mini cluster should default to this behavior after identifying the few 
 edit log tests that probably depend on durable logs.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6773) MiniDFSCluster should skip edit log fsync by default

2014-08-25 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu updated HDFS-6773:
--

Attachment: HDFS-6773.2.patch

 MiniDFSCluster should skip edit log fsync by default
 

 Key: HDFS-6773
 URL: https://issues.apache.org/jira/browse/HDFS-6773
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Daryn Sharp
Assignee: Stephen Chu
 Attachments: HDFS-6773.1.patch, HDFS-6773.2.patch, HDFS-6773.2.patch


 The mini cluster is unnecessarily running with durable edit logs.  The 
 following change cut runtime of a single test from ~30s to ~10s.
 {code}EditLogFileOutputStream.setShouldSkipFsyncForTesting(true);{code}
 The mini cluster should default to this behavior after identifying the few 
 edit log tests that probably depend on durable logs.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6773) MiniDFSCluster can run dramatically faster

2014-08-22 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu updated HDFS-6773:
--

Status: Patch Available  (was: Open)

 MiniDFSCluster can run dramatically faster
 --

 Key: HDFS-6773
 URL: https://issues.apache.org/jira/browse/HDFS-6773
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Daryn Sharp
Assignee: Stephen Chu
 Attachments: HDFS-6773.1.patch


 The mini cluster is unnecessarily running with durable edit logs.  The 
 following change cut runtime of a single test from ~30s to ~10s.
 {code}EditLogFileOutputStream.setShouldSkipFsyncForTesting(true);{code}
 The mini cluster should default to this behavior after identifying the few 
 edit log tests that probably depend on durable logs.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6773) MiniDFSCluster can run dramatically faster

2014-08-22 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu updated HDFS-6773:
--

Attachment: HDFS-6773.1.patch

Attaching a patch.

* Add a {{skipFsyncForTesting}} builder option, defaulting to true, to 
MiniDFSCluster.
* Remove the re-enabling of fsync in {{TestFsDatasetCache}} and 
{{TestCacheDirectives}} because it's not needed.

I left the existing instances of 
{{EditLogFileOutputStream.setShouldSkipFsyncForTesting(true);}} in this first 
patch. Let me know if it's better just to remove them all, or to use the new 
Builder option in some of them so that readers of those tests are aware of this 
option.

A quick scan through the tests searching for fsync suggests that no current 
tests require fsync.
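A minimal usage sketch of the new option, assuming the {{skipFsyncForTesting}} 
builder method named above (only the few durability-sensitive edit log tests 
would opt back in like this):

{code}
Configuration conf = new HdfsConfiguration();
MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf)
    .numDataNodes(1)
    .skipFsyncForTesting(false)  // re-enable fsync for a durability-sensitive test
    .build();
try {
  cluster.waitActive();
  // ... test body that actually needs durable edit logs ...
} finally {
  cluster.shutdown();
}
{code}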

 MiniDFSCluster can run dramatically faster
 --

 Key: HDFS-6773
 URL: https://issues.apache.org/jira/browse/HDFS-6773
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Daryn Sharp
Assignee: Stephen Chu
 Attachments: HDFS-6773.1.patch


 The mini cluster is unnecessarily running with durable edit logs.  The 
 following change cut runtime of a single test from ~30s to ~10s.
 {code}EditLogFileOutputStream.setShouldSkipFsyncForTesting(true);{code}
 The mini cluster should default to this behavior after identifying the few 
 edit log tests that probably depend on durable logs.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6134) Transparent data at rest encryption

2014-08-02 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu updated HDFS-6134:
--

Attachment: HDFS-6134_test_plan.pdf

I've attached a test plan we will execute for this feature. Feel free to 
comment and make suggestions.

 Transparent data at rest encryption
 ---

 Key: HDFS-6134
 URL: https://issues.apache.org/jira/browse/HDFS-6134
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: security
Affects Versions: 2.3.0
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
 Attachments: HDFS-6134_test_plan.pdf, 
 HDFSDataatRestEncryptionProposal_obsolete.pdf, 
 HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf


 Because of privacy and security regulations, for many industries, sensitive 
 data at rest must be in encrypted form. For example: the healthcare industry 
 (HIPAA regulations), the card payment industry (PCI DSS regulations) or the 
 US government (FISMA regulations).
 This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can 
 be used transparently by any application accessing HDFS via Hadoop Filesystem 
 Java API, Hadoop libhdfs C library, or WebHDFS REST API.
 The resulting implementation should be able to be used in compliance with 
 different regulation requirements.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6773) MiniDFSCluster can run dramatically faster

2014-08-01 Thread Stephen Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14081959#comment-14081959
 ] 

Stephen Chu commented on HDFS-6773:
---

Thanks, Daryn. That sounds like a good approach.

I see 2 tests that call 
{{EditLogFileOutputStream.setShouldSkipFsyncForTesting(false);}}:
TestFsDatasetCache.java
TestCacheDirectives.java

I checked with Andrew and Colin, and we think that fsync is probably not a 
requirement for the caching tests because the unit tests aren't meant to be run 
with a power cycle in between. Will look into it more, as well as go through 
the rest of the HDFS tests to see if any need fsync.

 MiniDFSCluster can run dramatically faster
 --

 Key: HDFS-6773
 URL: https://issues.apache.org/jira/browse/HDFS-6773
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Daryn Sharp
Assignee: Stephen Chu

 The mini cluster is unnecessarily running with durable edit logs.  The 
 following change cut runtime of a single test from ~30s to ~10s.
 {code}EditLogFileOutputStream.setShouldSkipFsyncForTesting(true);{code}
 The mini cluster should default to this behavior after identifying the few 
 edit log tests that probably depend on durable logs.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6665) Add tests for XAttrs in combination with viewfs

2014-07-30 Thread Stephen Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14078956#comment-14078956
 ] 

Stephen Chu commented on HDFS-6665:
---

Thank you, Andrew!

 Add tests for XAttrs in combination with viewfs
 ---

 Key: HDFS-6665
 URL: https://issues.apache.org/jira/browse/HDFS-6665
 Project: Hadoop HDFS
  Issue Type: Test
  Components: hdfs-client
Affects Versions: 2.5.0
Reporter: Stephen Chu
Assignee: Stephen Chu
 Fix For: 2.6.0

 Attachments: HDFS-6665.1.patch, HDFS-6665.2.patch


 This is similar to HDFS-5624 (Add tests for ACLs in combination with viewfs)
 We should verify that XAttr operations work properly with viewfs, and that 
 XAttr commands are routed to the correct namenode in a federated deployment.
 Also, we should make sure that the behavior of XAttr commands on internal 
 dirs is consistent with other commands. For example, setPermission will throw 
 the readonly AccessControlException for paths above the root mount entry.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6692) Add more HDFS encryption tests

2014-07-30 Thread Stephen Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14079049#comment-14079049
 ] 

Stephen Chu commented on HDFS-6692:
---

Hi Andrew, nice tests.

Nits:
Can we remove the concatenation in the following? {{"Too many retries because of " 
+ "encryption zone operations"}}
The line with the following comment goes over 80 chars: {{* Tests the retry logic 
in startFile. We release the lock while generating an}}

One additional test that could be added is attempting to create an encryption 
zone on a parent of an existing encryption zone (a rough sketch is below).

A minor additional test is checking that {{HdfsAdmin#listEncryptionZones}} 
succeeds / throws a reasonable exception in {{testCreateEZWithNoProvider}}.

I can add these tests later, too, if you prefer.
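A rough sketch of the suggested additional test, assuming the usual {{fs}}, 
{{dfsAdmin}}, and {{TEST_KEY}} test fixtures (the exact exception raised is left 
unchecked here, since the message is not specified):

{code}
// Creating an encryption zone on a parent of an existing zone should fail.
final Path parent = new Path("/parent");
final Path child = new Path(parent, "child");
fs.mkdirs(child);
dfsAdmin.createEncryptionZone(child, TEST_KEY);
try {
  dfsAdmin.createEncryptionZone(parent, TEST_KEY);
  fail("expected creating an EZ on a parent of an existing EZ to fail");
} catch (IOException e) {
  // expected: the parent is non-empty and already contains a zone
}
{code}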

 Add more HDFS encryption tests
 --

 Key: HDFS-6692
 URL: https://issues.apache.org/jira/browse/HDFS-6692
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: security
Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134)
Reporter: Andrew Wang
Assignee: Andrew Wang
 Attachments: hdfs-6692.001.patch


 Now that we have the basic pieces in place for encryption, it's a good time 
 to look at our test coverage and add new tests.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Assigned] (HDFS-6773) MiniDFSCluster can run dramatically faster

2014-07-30 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu reassigned HDFS-6773:
-

Assignee: Stephen Chu

Hi, [~daryn]. Thanks for filing this. I can take it on. If you were already 
doing work for it, feel free to reassign.

 MiniDFSCluster can run dramatically faster
 --

 Key: HDFS-6773
 URL: https://issues.apache.org/jira/browse/HDFS-6773
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Daryn Sharp
Assignee: Stephen Chu

 The mini cluster is unnecessarily running with durable edit logs.  The 
 following change cut runtime of a single test from ~30s to ~10s.
 {code}EditLogFileOutputStream.setShouldSkipFsyncForTesting(true);{code}
 The mini cluster should default to this behavior after identifying the few 
 edit log tests that probably depend on durable logs.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HDFS-6785) Should not be able to create encryption zone using path to a non-directory file

2014-07-30 Thread Stephen Chu (JIRA)
Stephen Chu created HDFS-6785:
-

 Summary: Should not be able to create encryption zone using path 
to a non-directory file
 Key: HDFS-6785
 URL: https://issues.apache.org/jira/browse/HDFS-6785
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: security
Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134)
Reporter: Stephen Chu


Currently, users can create an encryption zone while specifying a path to a 
file, as seen below.

{code}
[hdfs@schu-enc2 ~]$ cat hi
hi
[hdfs@schu-enc2 ~]$ hadoop fs -put hi /hi
[hdfs@schu-enc2 ~]$ hadoop key create testKey
testKey has been successfully created.
KMSClientProvider[http://schu-enc2.vpc.com:16000/kms/v1/] has been updated.
[hdfs@schu-enc2 ~]$ hdfs crypto -createZone -keyName testKey -path /hi
Added encryption zone /hi
[hdfs@schu-enc2 ~]$ hdfs crypto -listZones
/hi  testKey
{code}

Based on my understanding, admins should be able to create encryption zones 
only on empty directories, not files.

If the design changed to allow creating an EZ on a file, then we should change 
the javadoc of {{HdfsAdmin#createEncryptionZone}}, which currently states: 
"Create an encryption zone rooted at an empty existing directory, using the 
specified encryption key. An encryption zone has an associated encryption key 
used when reading and writing files within the zone."
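A rough sketch of a regression test for the desired behavior (the {{fs}}, 
{{dfsAdmin}}, and {{TEST_KEY}} fixtures are assumed, and the expectation 
describes the eventual fix, not current behavior):

{code}
// Once fixed, createEncryptionZone should reject a path to a regular file.
final Path file = new Path("/hi");
DFSTestUtil.createFile(fs, file, 1024, (short) 1, 0xFEED);
try {
  dfsAdmin.createEncryptionZone(file, TEST_KEY);
  fail("expected createEncryptionZone to reject a non-directory path");
} catch (IOException e) {
  // expected once the fix lands
}
{code}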



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HDFS-6767) Cannot remove directory within encryption zone to Trash

2014-07-29 Thread Stephen Chu (JIRA)
Stephen Chu created HDFS-6767:
-

 Summary: Cannot remove directory within encryption zone to Trash
 Key: HDFS-6767
 URL: https://issues.apache.org/jira/browse/HDFS-6767
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: security
Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134)
Reporter: Stephen Chu


Currently, users that want to remove an encrypted directory using the FsShell 
remove commands need to skip the trash.

If users try to remove an encrypted directory while Trash is enabled, they will 
see the following error:

{code}
[hdfs@schu-enc2 ~]$ hdfs dfs -rm -r /user/hdfs/enc
2014-07-29 13:47:28,799 INFO  [main] hdfs.DFSClient 
(DFSClient.java:init(604)) - Found KeyProvider: KeyProviderCryptoExtension: 
jceks://file@/home/hdfs/hadoop-data/test.jks
2014-07-29 13:47:29,563 INFO  [main] fs.TrashPolicyDefault 
(TrashPolicyDefault.java:initialize(92)) - Namenode trash configuration: 
Deletion interval = 1440 minutes, Emptier interval = 0 minutes.
rm: Failed to move to trash: hdfs://schu-enc2.vpc.com:8020/user/hdfs/enc. 
Consider using -skipTrash option
{code}

This is because the encrypted dir cannot be moved from an encryption zone, as 
the NN log explains:

{code}
2014-07-29 13:47:29,596 INFO  [IPC Server handler 8 on 8020] ipc.Server 
(Server.java:run(2120)) - IPC Server handler 8 on 8020, call 
org.apache.hadoop.hdfs.protocol.ClientProtocol.rename from 172.25.3.153:48295 
Call#9 Retry#0
java.io.IOException: /user/hdfs/enc can't be moved from an encryption zone.
at 
org.apache.hadoop.hdfs.server.namenode.EncryptionZoneManager.checkMoveValidity(EncryptionZoneManager.java:175)
at 
org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedRenameTo(FSDirectory.java:526)
at 
org.apache.hadoop.hdfs.server.namenode.FSDirectory.renameTo(FSDirectory.java:440)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.renameToInternal(FSNamesystem.java:3593)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.renameToInt(FSNamesystem.java:3555)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.renameTo(FSNamesystem.java:3522)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.rename(NameNodeRpcServer.java:727)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.rename(ClientNamenodeProtocolServerSideTranslatorPB.java:542)
at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:607)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:932)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2099)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2095)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1626)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2093)
{code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6767) Cannot remove directory within encryption zone to Trash

2014-07-29 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu updated HDFS-6767:
--

Attachment: HDFS-6767.1.patch

I agree that improving the error message and requiring users to skip the trash 
is appropriate and reasonable.

Attaching a sample patch that appends the cause to the "Failed to move to 
trash" error message. I tested it out, and the output is now changed to the 
clearer:
{code}
rm: Failed to move to trash: hdfs://schu-enc2.vpc.com:8020/user/hdfs/enc: 
/user/hdfs/enc can't be moved from an encryption zone.
{code}
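The idea in the sample patch is roughly the following, assuming the FsShell 
delete path calls {{Trash.moveToAppropriateTrash}} (the call site and variable 
names here are illustrative, not the literal patch):

{code}
boolean moved;
try {
  moved = Trash.moveToAppropriateTrash(fs, path, getConf());
} catch (IOException e) {
  // Append the underlying cause so the user sees why the move failed, e.g.
  // ": /user/hdfs/enc can't be moved from an encryption zone."
  throw new IOException("Failed to move to trash: " + path + ": "
      + e.getLocalizedMessage(), e);
}
{code}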

This is the same issue as HDFS-6760, "Deletion of directories with snapshots 
will not output reason for trash move failure".

 Cannot remove directory within encryption zone to Trash
 ---

 Key: HDFS-6767
 URL: https://issues.apache.org/jira/browse/HDFS-6767
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: security
Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134)
Reporter: Stephen Chu
 Attachments: HDFS-6767.1.patch


 Currently, users that want to remove an encrypted directory using the FsShell 
 remove commands need to skip the trash.
 If users try to remove an encrypted directory while Trash is enabled, they 
 will see the following error:
 {code}
 [hdfs@schu-enc2 ~]$ hdfs dfs -rm -r /user/hdfs/enc
 2014-07-29 13:47:28,799 INFO  [main] hdfs.DFSClient 
 (DFSClient.java:init(604)) - Found KeyProvider: KeyProviderCryptoExtension: 
 jceks://file@/home/hdfs/hadoop-data/test.jks
 2014-07-29 13:47:29,563 INFO  [main] fs.TrashPolicyDefault 
 (TrashPolicyDefault.java:initialize(92)) - Namenode trash configuration: 
 Deletion interval = 1440 minutes, Emptier interval = 0 minutes.
 rm: Failed to move to trash: hdfs://schu-enc2.vpc.com:8020/user/hdfs/enc. 
 Consider using -skipTrash option
 {code}
 This is because the encrypted dir cannot be moved from an encryption zone, as 
 the NN log explains:
 {code}
 2014-07-29 13:47:29,596 INFO  [IPC Server handler 8 on 8020] ipc.Server 
 (Server.java:run(2120)) - IPC Server handler 8 on 8020, call 
 org.apache.hadoop.hdfs.protocol.ClientProtocol.rename from 172.25.3.153:48295 
 Call#9 Retry#0
 java.io.IOException: /user/hdfs/enc can't be moved from an encryption zone.
   at 
 org.apache.hadoop.hdfs.server.namenode.EncryptionZoneManager.checkMoveValidity(EncryptionZoneManager.java:175)
   at 
 org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedRenameTo(FSDirectory.java:526)
   at 
 org.apache.hadoop.hdfs.server.namenode.FSDirectory.renameTo(FSDirectory.java:440)
   at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.renameToInternal(FSNamesystem.java:3593)
   at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.renameToInt(FSNamesystem.java:3555)
   at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.renameTo(FSNamesystem.java:3522)
   at 
 org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.rename(NameNodeRpcServer.java:727)
   at 
 org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.rename(ClientNamenodeProtocolServerSideTranslatorPB.java:542)
   at 
 org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
   at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:607)
   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:932)
   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2099)
   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2095)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1626)
   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2093)
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6755) There is an unnecessary sleep in the code path where DFSOutputStream#close gives up its attempt to contact the namenode

2014-07-27 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu updated HDFS-6755:
--

   Resolution: Fixed
Fix Version/s: 2.6.0
   3.0.0
   Status: Resolved  (was: Patch Available)

 There is an unnecessary sleep in the code path where DFSOutputStream#close 
 gives up its attempt to contact the namenode
 ---

 Key: HDFS-6755
 URL: https://issues.apache.org/jira/browse/HDFS-6755
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.6.0
Reporter: Mit Desai
Assignee: Mit Desai
 Fix For: 3.0.0, 2.6.0

 Attachments: HDFS-6755.patch


 DFSOutputStream#close has a loop where it tries to contact the NameNode, to 
 call {{complete}} on the file which is open-for-write.  This loop includes a 
 sleep which increases exponentially (exponential backoff).  It makes sense to 
 sleep before re-contacting the NameNode, but the code also sleeps even in the 
 case where it has already decided to give up and throw an exception back to 
 the user.  It should not sleep after it has already decided to give up, since 
 there's no point.
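
A minimal sketch of the backoff shape the fix aims for, with hypothetical names 
rather than the actual DFSOutputStream code: sleep between retries, but never 
after the final failed attempt.

{code}
import java.io.IOException;

public class CompleteRetryExample {
  interface NamenodeCall {
    boolean complete() throws IOException;
  }

  static void completeFile(NamenodeCall namenode, int maxRetries)
      throws IOException {
    long sleepMs = 400;                 // initial backoff
    for (int attempt = 0; ; attempt++) {
      if (namenode.complete()) {
        return;                         // file successfully closed
      }
      if (attempt >= maxRetries) {
        // Give up immediately; sleeping here would only delay the exception.
        throw new IOException("Unable to close file: retries exhausted");
      }
      try {
        Thread.sleep(sleepMs);          // back off before the next attempt
      } catch (InterruptedException ie) {
        Thread.currentThread().interrupt();
        throw new IOException("Interrupted while waiting to retry", ie);
      }
      sleepMs *= 2;                     // exponential backoff
    }
  }
}
{code}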



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HDFS-6760) Deletion of directories with snapshots will not output reason for trash move failure

2014-07-27 Thread Stephen Chu (JIRA)
Stephen Chu created HDFS-6760:
-

 Summary: Deletion of directories with snapshots will not output 
reason for trash move failure
 Key: HDFS-6760
 URL: https://issues.apache.org/jira/browse/HDFS-6760
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.4.0
Reporter: Stephen Chu
Assignee: Stephen Chu
Priority: Minor


When using the trash-enabled FsShell to delete a directory that has snapshots, 
we see an error message saying "Failed to move to trash" but no explanation.

{code}
[hdfs@schu-enc2 ~]$ hdfs dfs -rm -r snap
2014-07-28 05:45:29,527 INFO  [main] fs.TrashPolicyDefault 
(TrashPolicyDefault.java:initialize(92)) - Namenode trash configuration: 
Deletion interval = 1440 minutes, Emptier interval = 0 minutes.
rm: Failed to move to trash: hdfs://schu-enc2.vpc.com:8020/user/hdfs/snap. 
Consider using -skipTrash option
{code}

If we use -skipTrash, then we'll get the explanation: "rm: The directory 
/user/hdfs/snap cannot be deleted since /user/hdfs/snap is snapshottable and 
already has snapshots".

It'd be an improvement to make it clear that dirs with snapshots cannot be 
deleted when we're using the trash.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6741) Improve permission denied message when FSPermissionChecker#checkOwner fails

2014-07-24 Thread Stephen Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14072859#comment-14072859
 ] 

Stephen Chu commented on HDFS-6741:
---

No unit tests were added because this is a small change to an exception message.

The failing tests are not related to this change. I locally re-ran TestWebHDFS 
and TestPipelinesFailover multiple times successfully to double-check.

 Improve permission denied message when FSPermissionChecker#checkOwner fails
 ---

 Key: HDFS-6741
 URL: https://issues.apache.org/jira/browse/HDFS-6741
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 3.0.0, 2.5.0
Reporter: Stephen Chu
Assignee: Stephen Chu
Priority: Minor
 Attachments: HDFS-6741.1.patch


 Currently, FSPermissionChecker#checkOwner throws an AccessControlException 
 with a simple Permission denied message.
 When users try to set an ACL without ownership permissions, they'll see 
 something like:
 {code}
 [schu@hdfs-vanilla-1 hadoop]$ hdfs dfs -setfacl -m user:schu:--- /tmp
 setfacl: Permission denied
 {code}
 It'd be helpful if the message had an explanation why the permission was 
 denied to avoid confusion for users who aren't familiar with permissions.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6715) webhdfs wont fail over when it gets java.io.IOException: Namenode is in startup mode

2014-07-24 Thread Stephen Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14073466#comment-14073466
 ] 

Stephen Chu commented on HDFS-6715:
---

Thank you for fixing this, [~jingzhao]. Changes LGTM. I manually deployed and 
verified. +1 (non-binding).

 webhdfs wont fail over when it gets java.io.IOException: Namenode is in 
 startup mode
 

 Key: HDFS-6715
 URL: https://issues.apache.org/jira/browse/HDFS-6715
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: ha, webhdfs
Affects Versions: 2.2.0
Reporter: Arpit Gupta
Assignee: Jing Zhao
 Attachments: HDFS-6715.000.patch, HDFS-6715.001.patch


 Noticed in our HA testing when we run MR job with webhdfs file system we some 
 times run into 
 {code}
 2014-04-17 05:08:06,346 INFO [AsyncDispatcher event handler] 
 org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics 
 report from attempt_1397710493213_0001_r_08_0: Container killed by the 
 ApplicationMaster.
 Container killed on request. Exit code is 143
 Container exited with a non-zero exit code 143
 2014-04-17 05:08:10,205 ERROR [CommitterEvent Processor #1] 
 org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler: Could not 
 commit job
 java.io.IOException: Namenode is in startup mode
   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
   at 
 sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
   at java.lang.reflect.Constructor.newInstance(Constructor.java:525)
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6665) Add tests for XAttrs in combination with viewfs

2014-07-24 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu updated HDFS-6665:
--

Target Version/s: 3.0.0, 2.6.0  (was: 2.6.0)
  Status: Patch Available  (was: Open)

 Add tests for XAttrs in combination with viewfs
 ---

 Key: HDFS-6665
 URL: https://issues.apache.org/jira/browse/HDFS-6665
 Project: Hadoop HDFS
  Issue Type: Test
  Components: hdfs-client
Affects Versions: 2.5.0
Reporter: Stephen Chu
Assignee: Stephen Chu
 Attachments: HDFS-6665.1.patch


 This is similar to HDFS-5624 (Add tests for ACLs in combination with viewfs)
 We should verify that XAttr operations work properly with viewfs, and that 
 XAttr commands are routed to the correct namenode in a federated deployment.
 Also, we should make sure that the behavior of XAttr commands on internal 
 dirs is consistent with other commands. For example, setPermission will throw 
 the readonly AccessControlException for paths above the root mount entry.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6665) Add tests for XAttrs in combination with viewfs

2014-07-24 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu updated HDFS-6665:
--

Attachment: HDFS-6665.1.patch

Attaching a patch that adds two tests verifying XAttrs with ViewFs and 
ViewFileSystem. The tests check that XAttr operations are routed to the correct 
NameNode.
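
For illustration, a rough sketch of the kind of assertion such a test makes. 
This is not the attached patch; it assumes a viewfs mount table that maps 
/mount1 to the root of one of the federated NameNodes and a pre-existing file 
at that path.

{code}
import static org.junit.Assert.assertArrayEquals;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ViewFsXAttrRoutingSketch {
  /**
   * An xattr set through the viewfs wrapper must be visible on the FileSystem
   * backing that mount point (and only there), proving the call was routed to
   * the correct NameNode.
   */
  static void assertXAttrRouted(FileSystem viewFs, FileSystem backingFs)
      throws Exception {
    byte[] value = "value1".getBytes("UTF-8");
    viewFs.setXAttr(new Path("/mount1/file"), "user.a1", value);
    assertArrayEquals(value, backingFs.getXAttr(new Path("/file"), "user.a1"));
  }
}
{code}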

 Add tests for XAttrs in combination with viewfs
 ---

 Key: HDFS-6665
 URL: https://issues.apache.org/jira/browse/HDFS-6665
 Project: Hadoop HDFS
  Issue Type: Test
  Components: hdfs-client
Affects Versions: 2.5.0
Reporter: Stephen Chu
Assignee: Stephen Chu
 Attachments: HDFS-6665.1.patch


 This is similar to HDFS-5624 (Add tests for ACLs in combination with viewfs)
 We should verify that XAttr operations work properly with viewfs, and that 
 XAttr commands are routed to the correct namenode in a federated deployment.
 Also, we should make sure that the behavior of XAttr commands on internal 
 dirs is consistent with other commands. For example, setPermission will throw 
 the readonly AccessControlException for paths above the root mount entry.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6665) Add tests for XAttrs in combination with viewfs

2014-07-24 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu updated HDFS-6665:
--

Attachment: HDFS-6665.2.patch

Thanks for the review and catching that, [~andrew.wang]!

Uploading a new patch to fix the comment.

Looked into test results of the TestBlockTokenWithDFS and 
TestNamenodeCapacityReport, and they're not related to these WebHDFS test 
changes. Re-ran them locally successfully.

 Add tests for XAttrs in combination with viewfs
 ---

 Key: HDFS-6665
 URL: https://issues.apache.org/jira/browse/HDFS-6665
 Project: Hadoop HDFS
  Issue Type: Test
  Components: hdfs-client
Affects Versions: 2.5.0
Reporter: Stephen Chu
Assignee: Stephen Chu
 Attachments: HDFS-6665.1.patch, HDFS-6665.2.patch


 This is similar to HDFS-5624 (Add tests for ACLs in combination with viewfs)
 We should verify that XAttr operations work properly with viewfs, and that 
 XAttr commands are routed to the correct namenode in a federated deployment.
 Also, we should make sure that the behavior of XAttr commands on internal 
 dirs is consistent with other commands. For example, setPermission will throw 
 the readonly AccessControlException for paths above the root mount entry.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6665) Add tests for XAttrs in combination with viewfs

2014-07-24 Thread Stephen Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074032#comment-14074032
 ] 

Stephen Chu commented on HDFS-6665:
---

The TestPipelinesFailover failure is not due to the patch changes. I re-ran the 
test locally successfully a couple of times to be sure. Other than that, the 
Hadoop QA job looks good.

 Add tests for XAttrs in combination with viewfs
 ---

 Key: HDFS-6665
 URL: https://issues.apache.org/jira/browse/HDFS-6665
 Project: Hadoop HDFS
  Issue Type: Test
  Components: hdfs-client
Affects Versions: 2.5.0
Reporter: Stephen Chu
Assignee: Stephen Chu
 Attachments: HDFS-6665.1.patch, HDFS-6665.2.patch


 This is similar to HDFS-5624 (Add tests for ACLs in combination with viewfs)
 We should verify that XAttr operations work properly with viewfs, and that 
 XAttr commands are routed to the correct namenode in a federated deployment.
 Also, we should make sure that the behavior of XAttr commands on internal 
 dirs is consistent with other commands. For example, setPermission will throw 
 the readonly AccessControlException for paths above the root mount entry.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6665) Add tests for XAttrs in combination with viewfs

2014-07-24 Thread Stephen Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074059#comment-14074059
 ] 

Stephen Chu commented on HDFS-6665:
---

Ah, yes, this seems to be the same "too many open files" issue discussed in 
that JIRA. Thanks for pointing me to it, Vinay.

 Add tests for XAttrs in combination with viewfs
 ---

 Key: HDFS-6665
 URL: https://issues.apache.org/jira/browse/HDFS-6665
 Project: Hadoop HDFS
  Issue Type: Test
  Components: hdfs-client
Affects Versions: 2.5.0
Reporter: Stephen Chu
Assignee: Stephen Chu
 Attachments: HDFS-6665.1.patch, HDFS-6665.2.patch


 This is similar to HDFS-5624 (Add tests for ACLs in combination with viewfs)
 We should verify that XAttr operations work properly with viewfs, and that 
 XAttr commands are routed to the correct namenode in a federated deployment.
 Also, we should make sure that the behavior of XAttr commands on internal 
 dirs is consistent with other commands. For example, setPermission will throw 
 the readonly AccessControlException for paths above the root mount entry.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6665) Add tests for XAttrs in combination with viewfs

2014-07-23 Thread Stephen Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14072049#comment-14072049
 ] 

Stephen Chu commented on HDFS-6665:
---

I've submitted a patch for HADOOP-10887, similar to what we did for ACLs in 
HADOOP-10845.

Once that's resolved, we can add HDFS tests for XAttrs + ViewFileSystem and 
ViewFs.

 Add tests for XAttrs in combination with viewfs
 ---

 Key: HDFS-6665
 URL: https://issues.apache.org/jira/browse/HDFS-6665
 Project: Hadoop HDFS
  Issue Type: Test
  Components: hdfs-client
Affects Versions: 2.5.0
Reporter: Stephen Chu
Assignee: Stephen Chu

 This is similar to HDFS-5624 (Add tests for ACLs in combination with viewfs)
 We should verify that XAttr operations work properly with viewfs, and that 
 XAttr commands are routed to the correct namenode in a federated deployment.
 Also, we should make sure that the behavior of XAttr commands on internal 
 dirs is consistent with other commands. For example, setPermission will throw 
 the readonly AccessControlException for paths above the root mount entry.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HDFS-6741) Improve permission denied message when FSPermissionChecker#checkOwner fails

2014-07-23 Thread Stephen Chu (JIRA)
Stephen Chu created HDFS-6741:
-

 Summary: Improve permission denied message when 
FSPermissionChecker#checkOwner fails
 Key: HDFS-6741
 URL: https://issues.apache.org/jira/browse/HDFS-6741
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 3.0.0, 2.5.0
Reporter: Stephen Chu
Assignee: Stephen Chu
Priority: Minor


Currently, FSPermissionChecker#checkOwner throws an AccessControlException with 
a simple "Permission denied" message.

When users try to set an ACL without ownership permissions, they'll see 
something like:

{code}
[schu@hdfs-vanilla-1 hadoop]$ hdfs dfs -setfacl -m user:schu:--- /tmp
setfacl: Permission denied
{code}

It'd be helpful if the message had an explanation why the permission was denied 
to avoid confusion for users who aren't familiar with permissions.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6741) Improve permission denied message when FSPermissionChecker#checkOwner fails

2014-07-23 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu updated HDFS-6741:
--

Target Version/s: 3.0.0, 2.6.0
  Status: Patch Available  (was: Open)

 Improve permission denied message when FSPermissionChecker#checkOwner fails
 ---

 Key: HDFS-6741
 URL: https://issues.apache.org/jira/browse/HDFS-6741
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 3.0.0, 2.5.0
Reporter: Stephen Chu
Assignee: Stephen Chu
Priority: Minor
 Attachments: HDFS-6741.1.patch


 Currently, FSPermissionChecker#checkOwner throws an AccessControlException 
 with a simple Permission denied message.
 When users try to set an ACL without ownership permissions, they'll see 
 something like:
 {code}
 [schu@hdfs-vanilla-1 hadoop]$ hdfs dfs -setfacl -m user:schu:--- /tmp
 setfacl: Permission denied
 {code}
 It'd be helpful if the message had an explanation why the permission was 
 denied to avoid confusion for users who aren't familiar with permissions.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6741) Improve permission denied message when FSPermissionChecker#checkOwner fails

2014-07-23 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu updated HDFS-6741:
--

Attachment: HDFS-6741.1.patch

Attaching a small change that improves the error message with an explanation:

"Permission denied" ->
"Permission denied: User {user} does not own {inode.getFullPathName()}"

No unit test added because this just changes an exception message, which isn't 
being depended on right now.
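
A minimal sketch of the improved check, simplified to plain strings; the real 
FSPermissionChecker works on INode objects, so the names and signature here are 
illustrative only.

{code}
import org.apache.hadoop.security.AccessControlException;

public class OwnerCheckExample {
  /** Throws if {@code user} is not the owner of the given path. */
  static void checkOwner(String user, String pathOwner, String fullPath)
      throws AccessControlException {
    if (user.equals(pathOwner)) {
      return;
    }
    // Say who the caller is and which path failed, instead of a bare
    // "Permission denied".
    throw new AccessControlException(
        "Permission denied: user=" + user + " does not own " + fullPath);
  }
}
{code}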

 Improve permission denied message when FSPermissionChecker#checkOwner fails
 ---

 Key: HDFS-6741
 URL: https://issues.apache.org/jira/browse/HDFS-6741
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 3.0.0, 2.5.0
Reporter: Stephen Chu
Assignee: Stephen Chu
Priority: Minor
 Attachments: HDFS-6741.1.patch


 Currently, FSPermissionChecker#checkOwner throws an AccessControlException 
 with a simple Permission denied message.
 When users try to set an ACL without ownership permissions, they'll see 
 something like:
 {code}
 [schu@hdfs-vanilla-1 hadoop]$ hdfs dfs -setfacl -m user:schu:--- /tmp
 setfacl: Permission denied
 {code}
 It'd be helpful if the message had an explanation why the permission was 
 denied to avoid confusion for users who aren't familiar with permissions.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HDFS-6733) Creating encryption zone results in NPE when KeyProvider is null

2014-07-22 Thread Stephen Chu (JIRA)
Stephen Chu created HDFS-6733:
-

 Summary: Creating encryption zone results in NPE when KeyProvider 
is null
 Key: HDFS-6733
 URL: https://issues.apache.org/jira/browse/HDFS-6733
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: security
Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134)
Reporter: Stephen Chu


When users try to create an encryption zone on a system that is not configured 
with a KeyProvider, they will run into a NullPointerException.

For example:
{code}
[hdfs@schu-enc2 ~]$ hdfs crypto -createZone -keyName abc123 -path /user/hdfs
2014-07-22 23:18:23,273 WARN  [main] crypto.CryptoCodec 
(CryptoCodec.java:getInstance(70)) - Crypto codec 
org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec is not available.
RemoteException: java.lang.NullPointerException
{code}

This error happens in FSNamesystem.createEncryptionZone(FSNamesystem.java:8456):

{code}
try {
  if (keyName == null || keyName.isEmpty()) {
keyName = UUID.randomUUID().toString();
createNewKey(keyName, src);
createdKey = true;
  } else {
KeyVersion keyVersion = provider.getCurrentKey(keyName);
if (keyVersion == null) {
{code}

provider can be null.

An improvement would be to make the error message more specific and say that 
the KeyProvider was not found.
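
A minimal sketch of such a guard, checking the provider before it is used and 
failing with a descriptive message (illustrative only, not the FSNamesystem 
code):

{code}
import java.io.IOException;

import org.apache.hadoop.crypto.key.KeyProvider;

public class KeyProviderGuardExample {
  /** Fail clearly when no KeyProvider is configured, instead of an NPE. */
  static void checkKeyProvider(KeyProvider provider) throws IOException {
    if (provider == null) {
      throw new IOException(
          "Cannot create an encryption zone: no KeyProvider is configured");
    }
  }
}
{code}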



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6704) Fix the command to launch JournalNode in HDFS-HA document

2014-07-18 Thread Stephen Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14066509#comment-14066509
 ] 

Stephen Chu commented on HDFS-6704:
---

LGTM, +1 (non-binding). 

 Fix the command to launch JournalNode in HDFS-HA document
 -

 Key: HDFS-6704
 URL: https://issues.apache.org/jira/browse/HDFS-6704
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.4.0
Reporter: Akira AJISAKA
Assignee: Akira AJISAKA
Priority: Minor
  Labels: newbie
 Attachments: HDFS-6704.patch


 In HDFSHighAvailabilityWithQJM.html,
 {code}
 After all of the necessary configuration options have been set, you must 
 start the JournalNode daemons on the set of machines where they will run. 
 This can be done by running the command hdfs-daemon.sh journalnode and 
 waiting for the daemon to start on each of the relevant machines.
 {code}
 hdfs-daemon.sh should be hadoop-daemon.sh since hdfs-daemon.sh does not exist.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5624) Add HDFS tests for ACLs in combination with viewfs.

2014-07-15 Thread Stephen Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14062919#comment-14062919
 ] 

Stephen Chu commented on HDFS-5624:
---

Thank you, [~cnauroth]!

 Add HDFS tests for ACLs in combination with viewfs.
 ---

 Key: HDFS-5624
 URL: https://issues.apache.org/jira/browse/HDFS-5624
 Project: Hadoop HDFS
  Issue Type: Test
  Components: hdfs-client, test
Affects Versions: 3.0.0, 2.5.0
Reporter: Chris Nauroth
Assignee: Stephen Chu
 Attachments: HDFS-5624.001.patch, HDFS-5624.002.patch, 
 HDFS-5624.003.patch


 Add tests verifying that in a federated deployment, a viewfs wrapped over 
 multiple federated NameNodes will dispatch the ACL operations to the correct 
 NameNode.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-5624) Add tests for ACLs in combination with viewfs.

2014-07-14 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu updated HDFS-5624:
--

Attachment: HDFS-5624.002.patch

Attaching a new patch to address Chris's review comments.

# Fixed to the consistent internaldir pattern.
# Ditto.
# Implemented ACL methods for ViewFs. This included adding the ACL methods to 
ChRootedFs as well. Added unit tests to verify ACL + internal dir behavior in 
ViewFsBaseTest.
# Renamed ACL + ViewFileSystem test to TestViewFileSystemWithAcls. Added new 
test suite TestViewFsWithAcls to cover ACL + ViewFs.

Thanks again, Chris.

 Add tests for ACLs in combination with viewfs.
 --

 Key: HDFS-5624
 URL: https://issues.apache.org/jira/browse/HDFS-5624
 Project: Hadoop HDFS
  Issue Type: Test
  Components: hdfs-client
Affects Versions: 2.4.0
Reporter: Chris Nauroth
Assignee: Stephen Chu
 Attachments: HDFS-5624.001.patch, HDFS-5624.002.patch


 Add tests verifying that in a federated deployment, a viewfs wrapped over 
 multiple federated NameNodes will dispatch the ACL operations to the correct 
 NameNode.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5624) Add tests for ACLs in combination with viewfs.

2014-07-14 Thread Stephen Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14061455#comment-14061455
 ] 

Stephen Chu commented on HDFS-5624:
---

The failing tests above are unrelated to the patch's changes. I ran the tests 
successfully locally on a patched build.

 Add tests for ACLs in combination with viewfs.
 --

 Key: HDFS-5624
 URL: https://issues.apache.org/jira/browse/HDFS-5624
 Project: Hadoop HDFS
  Issue Type: Test
  Components: hdfs-client
Affects Versions: 2.4.0
Reporter: Chris Nauroth
Assignee: Stephen Chu
 Attachments: HDFS-5624.001.patch, HDFS-5624.002.patch


 Add tests verifying that in a federated deployment, a viewfs wrapped over 
 multiple federated NameNodes will dispatch the ACL operations to the correct 
 NameNode.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5624) Add tests for ACLs in combination with viewfs.

2014-07-13 Thread Stephen Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14060124#comment-14060124
 ] 

Stephen Chu commented on HDFS-5624:
---

[~cnauroth], thanks a lot for the review and comments! I'll work on updating 
the patch to address your comments.

# Oops, missed that. Will update the patch to follow the pattern. Thanks for 
catching.
# Ditto.
# I'll take a shot at implementing the ViewFs ACL methods in this patch. 
Because the code is similar, seems it'll be nice to get that into one patch.
# Agreed. Will rename the test, and will also add another test suite to go with 
the added ViewFs ACL implementation.

 Add tests for ACLs in combination with viewfs.
 --

 Key: HDFS-5624
 URL: https://issues.apache.org/jira/browse/HDFS-5624
 Project: Hadoop HDFS
  Issue Type: Test
  Components: hdfs-client
Affects Versions: 2.4.0
Reporter: Chris Nauroth
Assignee: Stephen Chu
 Attachments: HDFS-5624.001.patch


 Add tests verifying that in a federated deployment, a viewfs wrapped over 
 multiple federated NameNodes will dispatch the ACL operations to the correct 
 NameNode.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5624) Add tests for ACLs in combination with viewfs.

2014-07-11 Thread Stephen Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14059018#comment-14059018
 ] 

Stephen Chu commented on HDFS-5624:
---

The above failing unit tests are unrelated to this patch.

I filed a similar patch to cover XAttrs with viewfs at 
https://issues.apache.org/jira/browse/HDFS-6665.

Perhaps I should rename TestViewFsWithAcls to TestViewFsRouteToNamenode so that 
in HDFS-6665 I can add the XAttr tests there? I could also move the tests in 
TestViewFsWithAcls into TestViewFileSystemHdfs if that is ideal.

 Add tests for ACLs in combination with viewfs.
 --

 Key: HDFS-5624
 URL: https://issues.apache.org/jira/browse/HDFS-5624
 Project: Hadoop HDFS
  Issue Type: Test
  Components: hdfs-client
Affects Versions: 2.4.0
Reporter: Chris Nauroth
Assignee: Stephen Chu
 Attachments: HDFS-5624.001.patch


 Add tests verifying that in a federated deployment, a viewfs wrapped over 
 multiple federated NameNodes will dispatch the ACL operations to the correct 
 NameNode.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-5624) Add tests for ACLs in combination with viewfs.

2014-07-10 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu updated HDFS-5624:
--

Attachment: HDFS-5624.001.patch

Attaching a patch with ACL + ViewFs tests. Thanks, Chris, for the guidance. To 
test these changes, I ran all the hadoop-common and hadoop-hdfs viewfs tests 
successfully.

Changes:

* For the modifying ACL operations in ViewFileSystem.java, if the target path 
is an internal dir, then we throw the "InternalDir of ViewFileSystem is 
readonly" AccessControlException (see the sketch after this list). This makes 
the ACL operations consistent with the other methods like setPermission. 
Previously, we would throw an UnsupportedOperationException.

* In the case of getAclStatus, we made changes to make it consistent with 
getFileStatus in the case of an internal dir. User is set to the FileSystem 
user's name, group is set to the FileSystem user's primary group (first group), 
sticky bit is false, and ACL entries are equivalent to 555 permissions.

* Add ACL tests for internal dir paths in ViewFileSystemBaseTest.java

* Add TestViewFsWithAcls which brings up a federated MiniDFSCluster with 2 
NameNodes and verifies that ACL operations + ViewFs are directed to the correct 
NameNodes.
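
For illustration, a rough sketch of the readonly behavior described in the 
first bullet above. The class and helper names are hypothetical and the 
exception text is paraphrased, not copied from ViewFileSystem.

{code}
import java.io.IOException;

import org.apache.hadoop.security.AccessControlException;

public class InternalDirAclSketch {
  /** Builds the same kind of exception setPermission uses on internal dirs. */
  static AccessControlException readOnlyMountTable(String operation,
      String path) {
    return new AccessControlException(
        "InternalDir of ViewFileSystem is readonly; operation=" + operation
            + " not permitted on path " + path);
  }

  /** Any ACL-modifying operation on an internal dir simply throws. */
  static void setAcl(String internalDirPath) throws IOException {
    throw readOnlyMountTable("setAcl", internalDirPath);
  }
}
{code}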



 Add tests for ACLs in combination with viewfs.
 --

 Key: HDFS-5624
 URL: https://issues.apache.org/jira/browse/HDFS-5624
 Project: Hadoop HDFS
  Issue Type: Test
  Components: hdfs-client
Affects Versions: 2.4.0
Reporter: Chris Nauroth
Assignee: Stephen Chu
 Attachments: HDFS-5624.001.patch


 Add tests verifying that in a federated deployment, a viewfs wrapped over 
 multiple federated NameNodes will dispatch the ACL operations to the correct 
 NameNode.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6640) [ Web HDFS ] Syntax for MKDIRS, CREATESYMLINK, and SETXATTR are given wrongly(missed webhdfs/v1).).

2014-07-10 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu updated HDFS-6640:
--

Target Version/s: 3.0.0, 2.6.0
 Summary: [ Web HDFS ] Syntax for MKDIRS, CREATESYMLINK, and 
SETXATTR are given wrongly(missed webhdfs/v1).).  (was: [ Web HDFS ] Syntax for 
MKDIRS and Symbolic link are given wrongly(missed webhdfs/v1).).)

I noticed that the SETXATTR command is incorrect as well.

It has an extraneous "op=":

{code}
curl -i -X PUT http://HOST:PORT/webhdfs/v1/PATH?op=op=SETXATTR
{code}

Updating the patch to fix this.

 [ Web HDFS ] Syntax for MKDIRS, CREATESYMLINK, and SETXATTR are given 
 wrongly(missed webhdfs/v1).).
 ---

 Key: HDFS-6640
 URL: https://issues.apache.org/jira/browse/HDFS-6640
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: documentation, webhdfs
Affects Versions: 2.4.1
Reporter: Brahma Reddy Battula
Assignee: Stephen Chu
 Attachments: HDFS-6640.001.patch, HDFS-6640.002.patch


 Need to correct the following :
 Make a Directory
 Submit a HTTP PUT request.
 curl -i -X PUT http://HOST:PORT/PATH?op=MKDIRS[permission=OCTAL]
 Create a Symbolic Link
 Submit a HTTP PUT request.
 curl -i -X PUT http://HOST:PORT/PATH?op=CREATESYMLINK
   destination=PATH[createParent=true|false]
 webhdfs/v1 is missed.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6640) [ Web HDFS ] Syntax for MKDIRS, CREATESYMLINK, and SETXATTR are given wrongly(missed webhdfs/v1).).

2014-07-10 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu updated HDFS-6640:
--

Attachment: HDFS-6640.002.patch

 [ Web HDFS ] Syntax for MKDIRS, CREATESYMLINK, and SETXATTR are given 
 wrongly(missed webhdfs/v1).).
 ---

 Key: HDFS-6640
 URL: https://issues.apache.org/jira/browse/HDFS-6640
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: documentation, webhdfs
Affects Versions: 2.4.1
Reporter: Brahma Reddy Battula
Assignee: Stephen Chu
 Attachments: HDFS-6640.001.patch, HDFS-6640.002.patch


 Need to correct the following :
 Make a Directory
 Submit a HTTP PUT request.
 curl -i -X PUT http://HOST:PORT/PATH?op=MKDIRS[permission=OCTAL]
 Create a Symbolic Link
 Submit a HTTP PUT request.
 curl -i -X PUT http://HOST:PORT/PATH?op=CREATESYMLINK
   destination=PATH[createParent=true|false]
 webhdfs/v1 is missed.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-5624) Add tests for ACLs in combination with viewfs.

2014-07-10 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu updated HDFS-5624:
--

Status: Patch Available  (was: Open)

 Add tests for ACLs in combination with viewfs.
 --

 Key: HDFS-5624
 URL: https://issues.apache.org/jira/browse/HDFS-5624
 Project: Hadoop HDFS
  Issue Type: Test
  Components: hdfs-client
Affects Versions: 2.4.0
Reporter: Chris Nauroth
Assignee: Stephen Chu
 Attachments: HDFS-5624.001.patch


 Add tests verifying that in a federated deployment, a viewfs wrapped over 
 multiple federated NameNodes will dispatch the ACL operations to the correct 
 NameNode.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HDFS-6665) Add tests for XAttrs in combination with viewfs

2014-07-10 Thread Stephen Chu (JIRA)
Stephen Chu created HDFS-6665:
-

 Summary: Add tests for XAttrs in combination with viewfs
 Key: HDFS-6665
 URL: https://issues.apache.org/jira/browse/HDFS-6665
 Project: Hadoop HDFS
  Issue Type: Test
  Components: hdfs-client
Affects Versions: 2.5.0
Reporter: Stephen Chu
Assignee: Stephen Chu


This is similar to HDFS-5624 (Add tests for ACLs in combination with viewfs)

We should verify that XAttr operations work properly with viewfs, and that 
XAttr commands are routed to the correct namenode in a federated deployment.

Also, we should make sure that the behavior of XAttr commands on internal dirs 
is consistent with other commands. For example, setPermission will throw the 
readonly AccessControlException for paths above the root mount entry.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6640) [ Web HDFS ] Syntax for MKDIRS, CREATESYMLINK, and SETXATTR are given wrongly(missed webhdfs/v1).).

2014-07-10 Thread Stephen Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14058205#comment-14058205
 ] 

Stephen Chu commented on HDFS-6640:
---

Thank you, Akira and Jing!

 [ Web HDFS ] Syntax for MKDIRS, CREATESYMLINK, and SETXATTR are given 
 wrongly(missed webhdfs/v1).).
 ---

 Key: HDFS-6640
 URL: https://issues.apache.org/jira/browse/HDFS-6640
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: documentation, webhdfs
Affects Versions: 2.4.1
Reporter: Brahma Reddy Battula
Assignee: Stephen Chu
 Fix For: 2.6.0

 Attachments: HDFS-6640.001.patch, HDFS-6640.002.patch


 Need to correct the following :
 Make a Directory
 Submit a HTTP PUT request.
 curl -i -X PUT http://HOST:PORT/PATH?op=MKDIRS[permission=OCTAL]
 Create a Symbolic Link
 Submit a HTTP PUT request.
 curl -i -X PUT http://HOST:PORT/PATH?op=CREATESYMLINK
   destination=PATH[createParent=true|false]
 webhdfs/v1 is missed.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

