[ https://issues.apache.org/jira/browse/ACCUMULO-2971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Christopher Tubbs resolved ACCUMULO-2971.
-----------------------------------------
    Resolution: Fixed

> ChangeSecret tool should refuse to run if no write access to HDFS
> -----------------------------------------------------------------
>
>                 Key: ACCUMULO-2971
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-2971
>             Project: Accumulo
>          Issue Type: Bug
>    Affects Versions: 1.5.0, 1.5.1, 1.6.0
>            Reporter: Sean Busbey
>            Assignee: Michael Miller
>              Labels: newbie
>             Fix For: 1.8.1
>
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> Currently, the ChangeSecret tool does not check that the user running it can write to /accumulo/instance_id.
> If an admin knows the instance secret but runs the command as a user who cannot write to instance_id, the result is an unhelpful error message and a disconnect between HDFS and ZooKeeper.
> Example for a cluster with an instance named "foobar":
> {code}
> [busbey@edge ~]$ hdfs dfs -ls /accumulo/instance_id
> Found 1 items
> -rw-r--r-- 3 accumulo accumulo 0 2014-07-02 09:05 /accumulo/instance_id/cb977c77-3e13-4522-b718-2b487d722fd4
> [busbey@edge ~]$ accumulo org.apache.accumulo.server.util.ChangeSecret
> old zookeeper password:
> new zookeeper password:
> Thread "org.apache.accumulo.server.util.ChangeSecret" died Permission denied: user=busbey, access=WRITE, inode="/accumulo":accumulo:accumulo:drwxr-x--x
>         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:224)
>         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:204)
>         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:152)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:4846)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:2911)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInt(FSNamesystem.java:2872)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:2859)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:642)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:408)
>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44968)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1752)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1748)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1746)
> org.apache.hadoop.security.AccessControlException: Permission denied: user=busbey, access=WRITE, inode="/accumulo":accumulo:accumulo:drwxr-x--x
>         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:224)
>         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:204)
>         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:152)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:4846)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:2911)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInt(FSNamesystem.java:2872)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:2859)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:642)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:408)
>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44968)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1752)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1748)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1746)
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>         at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>         at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:90)
>         at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:57)
>         at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:1489)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:355)
>         at org.apache.accumulo.server.util.ChangeSecret.updateHdfs(ChangeSecret.java:150)
>         at org.apache.accumulo.server.util.ChangeSecret.main(ChangeSecret.java:66)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.accumulo.start.Main$1.run(Main.java:141)
>         at java.lang.Thread.run(Thread.java:662)
> Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=busbey, access=WRITE, inode="/accumulo":accumulo:accumulo:drwxr-x--x
>         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:224)
>         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:204)
>         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:152)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:4846)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:2911)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInt(FSNamesystem.java:2872)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:2859)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:642)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:408)
>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44968)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1752)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1748)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1746)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1238)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
>         at $Proxy16.delete(Unknown Source)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.delete(ClientNamenodeProtocolTranslatorPB.java:408)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
>         at $Proxy17.delete(Unknown Source)
>         at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:1487)
>         ... 9 more
> [busbey@edge ~]$ hdfs dfs -ls /accumulo/instance_id
> Found 1 items
> -rw-r--r-- 3 accumulo accumulo 0 2014-07-02 09:05 /accumulo/instance_id/cb977c77-3e13-4522-b718-2b487d722fd4
> [busbey@edge ~]$ zookeeper-client
> Connecting to localhost:2181
> Welcome to ZooKeeper!
> JLine support is enabled
> WATCHER::
> WatchedEvent state:SyncConnected type:None path:null
> [zk: localhost:2181(CONNECTED) 0] get /accumulo/instances/foobar
> 1528cc95-2600-4649-a50e-1645404e9d6c
> cZxid = 0xe00034f45
> ctime = Wed Jul 02 09:27:58 PDT 2014
> mZxid = 0xe00034f45
> mtime = Wed Jul 02 09:27:58 PDT 2014
> pZxid = 0xe00034f45
> cversion = 0
> dataVersion = 0
> aclVersion = 0
> ephemeralOwner = 0x0
> dataLength = 36
> numChildren = 0
> [zk: localhost:2181(CONNECTED) 1] ls /accumulo/1528cc95-2600-4649-a50e-1645404e9d6c
> [users, monitor, problems, root_tablet, gc, hdfs_reservations, table_locks, namespaces, recovery, fate, tservers, tables, next_file, tracers, config, dead, bulk_failed_copyq, masters]
> [zk: localhost:2181(CONNECTED) 2] ls /accumulo/cb977c77-3e13-4522-b718-2b487d722fd4
> [users, problems, monitor, root_tablet, hdfs_reservations, gc, table_locks, namespaces, recovery, fate, tservers, tables, next_file, tracers, config, masters, bulk_failed_copyq, dead]
> {code}
> What's worse, in this condition the cluster will properly come up and show everything fine if the old instance secret is used.
> However, clients and servers will now end up looking at different ZooKeeper nodes, depending on whether they read the instance_id from HDFS or resolved it via a ZK instance-name lookup, so long as they use the corresponding instance secret.
> Furthermore, if an admin runs the CleanZooKeeper utility after this failure, it will destroy the ZooKeeper nodes the server processes are still using.
> The utility should sanity-check that /accumulo/instance_id is writable before changing anything in ZooKeeper (a sketch of such a check follows below). It should also wait to update the instance-name-to-instance_id pointer in ZooKeeper until after HDFS has been updated.
> Workaround: manually edit the HDFS instance_id to match the new instance id found in ZK for the instance name, and proceed as though the secret change had succeeded (also sketched below).
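> For illustration, a minimal sketch of the kind of pre-flight check proposed above, using the Hadoop FileSystem API. The class and method names are hypothetical (this is not the committed patch), and FileSystem.access() requires Hadoop 2.6+; on older Hadoop the same test would have to be derived from the directory's FileStatus permissions:
> {code}
> import java.io.IOException;
>
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.Path;
> import org.apache.hadoop.fs.permission.FsAction;
> import org.apache.hadoop.security.AccessControlException;
>
> public class ChangeSecretPreflight {
>
>   // Fail fast if the current user cannot write to the instance_id directory.
>   // Running this before any ZooKeeper update means a permissions problem can
>   // no longer leave HDFS and ZooKeeper pointing at different instance ids.
>   static void requireWriteAccess(FileSystem fs, Path instanceIdDir) throws IOException {
>     try {
>       // Asks the NameNode to evaluate WRITE permission for the calling user
>       // without modifying any state.
>       fs.access(instanceIdDir, FsAction.WRITE);
>     } catch (AccessControlException e) {
>       throw new IOException("ChangeSecret requires write access to " + instanceIdDir
>           + "; re-run as a user that can modify it (e.g. the accumulo user)", e);
>     }
>   }
>
>   public static void main(String[] args) throws IOException {
>     FileSystem fs = FileSystem.get(new Configuration());
>     requireWriteAccess(fs, new Path("/accumulo/instance_id"));
>     System.out.println("instance_id is writable; safe to change the secret");
>   }
> }
> {code}
> Because the check runs before any ZooKeeper write, a permissions problem now aborts the tool with a clear message instead of the stack trace and split-brain state shown above.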
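> The workaround itself amounts to swapping the zero-length marker file under /accumulo/instance_id (the id is carried entirely by the file name, as the 0-byte listing above shows). A sketch using the FileSystem API, with the old and new ids taken from the example session and a hypothetical class name; it must run as a user that can write to the directory, and the same two steps can be done from the shell with hdfs dfs -rm and hdfs dfs -touchz:
> {code}
> import java.io.IOException;
>
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.Path;
>
> public class RepointInstanceId {
>   public static void main(String[] args) throws IOException {
>     FileSystem fs = FileSystem.get(new Configuration());
>     Path idDir = new Path("/accumulo/instance_id");
>     // old id: the marker still present in HDFS after the failed run
>     Path oldId = new Path(idDir, "cb977c77-3e13-4522-b718-2b487d722fd4");
>     // new id: what the instance-name pointer in ZooKeeper already references
>     Path newId = new Path(idDir, "1528cc95-2600-4649-a50e-1645404e9d6c");
>     if (!fs.delete(oldId, false)) {
>       throw new IOException("could not remove " + oldId);
>     }
>     // the instance_id marker is an empty file; create the new one and close it
>     fs.create(newId).close();
>   }
> }
> {code}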