[ https://issues.apache.org/jira/browse/ZOOKEEPER-1045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15472985#comment-15472985 ]
Rakesh R edited comment on ZOOKEEPER-1045 at 9/8/16 6:56 AM:
-------------------------------------------------------------

Thanks, everyone, for laying out the use cases and for the active discussion. Let me try to summarize the problems and the proposed solutions.

*Point-1)* ??Prevent another host with the same krb user and realm from joining my zookeeper cluster, for example {{zk/<badhost>@EXAMPLE.COM}}??

As [~phunt] and [~hanm] explained, we could use zoo.cfg as the source of authorization information and replace the _HOST part of {{user/_HOST@REALM}}. Since admins can configure ZK server details in zoo.cfg as a host name, an IP address, or an FQDN, the server needs a mechanism to resolve the configured value to the fully qualified domain name for that IP address. Some time back I attached the {{HOST_RESOLVER-ZK-1045.patch}} idea (thanks to the Hadoop project, [hadoop SecurityUtil ref.|https://github.com/apache/hadoop/blob/branch-2.8/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/SecurityUtil.java#L570]). It is an independent patch that prepares the QuorumServer Kerberos principal by resolving the host address of the ZK server to InetAddress.getLocalHost().getCanonicalHostName(), and it expects a principal like zk/ho...@example.com. The quorum peer learner will use this principal to talk to another quorum peer server during FLE. The implicit requirement here is that the admin must ensure the configured Kerberos principal name resolves to the fully qualified domain name for this IP address.

* For authorization, every server will compare the full principal name, which is composed of {{user/host@realm}}. To do this, every server will cross-check against the list of known quorum server principals built from the zoo.cfg file.
+Ensemble-1 :-+ quorum server principal list => {{zk/ho...@example.com, zk/ho...@example.com, zk/ho...@example.com}}
+Ensemble-2 :-+ quorum server principal list => {{zk/ho...@example.com, zk/ho...@example.com, zk/ho...@example.com}}
* For authentication, the quorum learner server will get the remote quorum server's principal name and then authenticate. For example, host1 will get host2's principal {{zk/ho...@example.com}} and authenticate against it.

Does this make sense?

*Point-2)* ??A feature of the KDC is that it treats repeated attempts to log in with the same Kerberos principal within a short period of time as replay attacks and rejects such login requests. Since we are supporting a shared Kerberos credential, we might hit this issue.??

Good catch, Michael. [~cnauroth], I assume you are pointing me to the Hadoop code [Client.java#L699|https://github.com/apache/hadoop/blob/branch-2.8/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Client.java#L699], is that right? I'll try to understand this part and get back to you.

*Point-3)* ??How would we authorize against something that is not pre-configured? Basically, the dynamic reconfiguration of servers (addition and removal). Also, support upgrade from 3.4 to 3.5 and above.??

[~shralex], IIUC, the dynamic reconfig feature continues to use the zoo.cfg configuration file to keep the quorum info, and while processing a reconfiguration request it always updates the zoo.cfg file and ensures the file is up to date. In that case each server will get the details of the newly added server, and at that point we should add the logic to update the {{quorum server principal list}} with the details of the added or removed server, if any. But there is one case: a <badhost> server tries to join as {{zk/<badhost>@EXAMPLE.COM}}. I think this has to be restricted on the reconfig command execution side rather than in FLE; the ZOOKEEPER-2014 jira will probably help resolve this problem. Am I missing anything?
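To make the Point-1 proposal a bit more concrete, here is a minimal sketch of the idea; it is not the attached {{HOST_RESOLVER-ZK-1045.patch}}. It resolves each configured server address to a canonical host name, substitutes it for _HOST, and checks a remote principal against the resulting list. The class and method names ({{QuorumPrincipalSketch}}, {{canonicalize}}, {{toPrincipal}}, {{buildPrincipalList}}, {{isAuthorized}}), the example hosts under {{example.test}}, and the lower-casing of the host part are all assumptions made for illustration, not existing ZooKeeper API.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

/** Illustrative sketch only; every name here is hypothetical, not ZooKeeper API. */
public class QuorumPrincipalSketch {

    /** Resolves a configured host name, IP address, or FQDN to its canonical host name via DNS. */
    public static String canonicalize(String hostOrIp) {
        try {
            return InetAddress.getByName(hostOrIp).getCanonicalHostName();
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("Cannot resolve " + hostOrIp, e);
        }
    }

    /** Substitutes the _HOST placeholder in a pattern such as zk/_HOST@EXAMPLE.COM. */
    public static String toPrincipal(String pattern, String canonicalHost) {
        // Lower-casing the host part is an assumption of this sketch.
        return pattern.replace("_HOST", canonicalHost.toLowerCase(Locale.ROOT));
    }

    /** Builds the list of known quorum server principals from the configured server hosts. */
    public static Set<String> buildPrincipalList(String pattern,
                                                 List<String> configuredHosts,
                                                 Function<String, String> resolver) {
        Set<String> principals = new LinkedHashSet<>();
        for (String host : configuredHosts) {
            String fqdn = resolver.apply(host);
            if (fqdn == null) {
                throw new IllegalArgumentException("Cannot resolve " + host);
            }
            principals.add(toPrincipal(pattern, fqdn));
        }
        return principals;
    }

    /** Authorization check: the full user/host@realm string must be in the known list. */
    public static boolean isAuthorized(Set<String> knownPrincipals, String remotePrincipal) {
        return knownPrincipals.contains(remotePrincipal);
    }

    public static void main(String[] args) {
        // Stub resolver so the example runs without DNS; the hosts are made up.
        Map<String, String> fakeDns = Map.of(
                "10.0.0.1", "zk1.example.test",
                "zk2", "zk2.example.test");
        Set<String> known = buildPrincipalList(
                "zk/_HOST@EXAMPLE.COM", List.of("10.0.0.1", "zk2"), fakeDns::get);
        System.out.println(known);
        System.out.println(isAuthorized(known, "zk/zk2.example.test@EXAMPLE.COM"));
        System.out.println(isAuthorized(known, "zk/badhost.example.test@EXAMPLE.COM"));
    }
}
```

In a real deployment the resolver argument would be {{QuorumPrincipalSketch::canonicalize}} rather than a stub map. For Point-3, one simple option under the same assumptions is to rebuild the set from the updated server list each time a reconfiguration is applied.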
was (Author: rakeshr):
Thanks, everyone, for laying out the use cases and for the active discussion. Let me try to summarize the problems and the proposed solutions.

*Point-1)* ??Prevent another user from getting Kerberos credentials for {{zk/<badhost>@EXAMPLE.COM}}; I don't want them to be able to join my cluster.??

As [~phunt] and [~hanm] explained, we could use zoo.cfg as the source of authorization information and replace the _HOST part of {{user/_HOST@REALM}}. Since admins can configure ZK server details in zoo.cfg as a host name, an IP address, or an FQDN, the server needs a mechanism to resolve the configured value to the fully qualified domain name for that IP address. Some time back I attached the {{HOST_RESOLVER-ZK-1045.patch}} idea (thanks to the Hadoop project, [hadoop SecurityUtil ref.|https://github.com/apache/hadoop/blob/branch-2.8/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/SecurityUtil.java#L570]). It is an independent patch that prepares the QuorumServer Kerberos principal by resolving the host address of the ZK server to InetAddress.getLocalHost().getCanonicalHostName(), and it expects a principal like zk/ho...@example.com. The quorum peer learner will use this principal to talk to another quorum peer server during FLE. The implicit requirement here is that the admin must ensure the configured Kerberos principal name resolves to the fully qualified domain name for this IP address.

* For authorization, every server will compare the full principal name, which is composed of {{user/host@realm}}. To do this, every server will cross-check against the list of known quorum server principals built from the zoo.cfg file.
+Ensemble-1 :-+ quorum server principal list => {{zk/ho...@example.com, zk/ho...@example.com, zk/ho...@example.com}}
+Ensemble-2 :-+ quorum server principal list => {{zk/ho...@example.com, zk/ho...@example.com, zk/ho...@example.com}}
* For authentication, the quorum learner server will get the remote quorum server's principal name and then authenticate. For example, host1 will get host2's principal {{zk/ho...@example.com}} and authenticate against it.

Does this make sense?

*Point-2)* ??A feature of the KDC is that it treats repeated attempts to log in with the same Kerberos principal within a short period of time as replay attacks and rejects such login requests. Since we are supporting a shared Kerberos credential, we might hit this issue.??

Good catch, Michael. [~cnauroth], I assume you are pointing me to the Hadoop code [Client.java#L699|https://github.com/apache/hadoop/blob/branch-2.8/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Client.java#L699], is that right? I'll try to understand this part and get back to you.

*Point-3)* ??How would we authorize against something that is not pre-configured? Basically, the dynamic reconfiguration of servers (addition and removal). Also, support upgrade from 3.4 to 3.5 and above.??

[~shralex], IIUC, the dynamic reconfig feature continues to use the zoo.cfg configuration file to keep the quorum info, and while processing a reconfiguration request it always updates the zoo.cfg file and ensures the file is up to date. In that case each server will get the details of the newly added server, and at that point we should add the logic to update the {{quorum server principal list}} with the details of the added or removed server, if any. But there is one case: a <badhost> server tries to join as {{zk/<badhost>@EXAMPLE.COM}}. I think this has to be restricted on the reconfig command execution side rather than in FLE; the ZOOKEEPER-2014 jira will probably help resolve this problem. Am I missing anything?

> Support Quorum Peer mutual authentication via SASL
> --------------------------------------------------
>
> Key: ZOOKEEPER-1045
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1045
> Project: ZooKeeper
> Issue Type: New Feature
> Components: server
> Reporter: Eugene Koontz
> Assignee: Rakesh R
> Priority: Critical
> Fix For: 3.4.10, 3.5.3
>
> Attachments: 0001-ZOOKEEPER-1045-br-3-4.patch,
> 1045_failing_phunt.tar.gz, HOST_RESOLVER-ZK-1045.patch,
> TEST-org.apache.zookeeper.server.quorum.auth.QuorumAuthUpgradeTest.txt,
> ZK-1045-test-case-failure-logs.zip, ZOOKEEPER-1045-00.patch,
> ZOOKEEPER-1045-Rolling Upgrade Design Proposal.pdf,
> ZOOKEEPER-1045-br-3-4.patch, ZOOKEEPER-1045-br-3-4.patch,
> ZOOKEEPER-1045-br-3-4.patch, ZOOKEEPER-1045-br-3-4.patch,
> ZOOKEEPER-1045-br-3-4.patch, ZOOKEEPER-1045-br-3-4.patch,
> ZOOKEEPER-1045TestValidationDesign.pdf
>
>
> ZOOKEEPER-938 addresses mutual authentication between clients and servers.
> This bug, on the other hand, is for authentication among quorum peers.
> Hopefully much of the work done on SASL integration with Zookeeper for
> ZOOKEEPER-938 can be used as a foundation for this enhancement.