That's interesting. I don't remember any change in 0.7.1 that touched the SSL part. Maybe check the zkclient version.

On Fri, Jul 15, 2016 at 1:58 AM, Siddharth Wagle <[email protected]> wrote:

> So I was able to get this to work by doing 2 things:
>
> - Upgraded Helix from 0.6.5 to 0.7.1, with the resulting client-side changes
> - Changed the znode when security is enabled (separate znodes for secure /
>   unsecure)
>
> I do not see any ACLs set by Helix on the znode, so I am still verifying
> whether change #2 is necessary.
>
> BR,
> Sid
>
> ------------------------------
> *From:* Siddharth Wagle
> *Sent:* Wednesday, July 13, 2016 11:24 AM
> *To:* [email protected]
> *Subject:* Re: MIT-Kerberos support for ZkHelixAdmin
>
> Hi Kishore,
>
> Quick summary of what I am doing (Controller and Participant are in the
> same JVM):
>
> - Instantiate ZkHelixAdmin and create a cluster
> - Add the host as an instance to the cluster
> - Add the state model def
> - Add resources and rebalance
> - Start the participant
> *- Start the controller*
>
> https://github.com/apache/ambari/blob/trunk/ambari-metrics/ambari-metrics-timelineservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/metrics/timeline/availability/MetricCollectorHAController.java
>
> So I have everything up to the Controller part working. When security is
> enabled, the ZKHelixManager for the Controller stops working and throws the
> following error:
>
> 2016-07-13 17:56:41,263 INFO org.apache.helix.manager.zk.ZKHelixManager: KeeperState: SyncConnected, zookeeper:State:CONNECTED Timeout:30000 sessionid:0x355e190123e001d local:/10.240.0.32:54434 remoteserver:ambari-sid-3.c.pramod-thangali.internal/10.240.0.30:2181 lastZxid:8589935298 xid:9 sent:9 recv:11 queuedpkts:0 pendingresp:0 queuedevents:0
>
> 2016-07-13 17:57:41,264 ERROR org.apache.helix.manager.zk.ZKHelixManager: fail to connect zkserver: ambari-sid-1.c.pramod-thangali.internal:2181,ambari-sid-2.c.pramod-thangali.internal:2181,ambari-sid-3.c.pramod-thangali.internal:2181 in 60000ms. expiredSessionId: null, clusterName: ambari-metrics-cluster
>
> (It re-throws the same exception and never gets out of this state.) The
> Zookeeper server logs do not indicate an incoming client connection.
>
> If I put a breakpoint which forces a new client session to Zookeeper for
> the CONTROLLER, what I am seeing is:
>
> 2016-07-13 18:04:39,994 INFO org.apache.helix.manager.zk.ZKHelixManager: KeeperState: SyncConnected, zookeeper:State:CONNECTED Timeout:30000 sessionid:0x255e190122a0025 local:null remoteserver:null lastZxid:0 xid:2 sent:1 recv:1 queuedpkts:0 pendingresp:0 queuedevents:0
>
> 2016-07-13 18:04:39,994 INFO org.apache.helix.manager.zk.ZKHelixManager: KeeperState:Disconnected, disconnectedSessionId: 255e190122a0025, instance: ambari-sid-5.c.pramod-thangali.internal, type: CONTROLLER
> 2016-07-13 18:04:39,995 ERROR org.apache.helix.manager.zk.ZKHelixManager: fail to createClient.
> org.apache.helix.HelixException: Cluster structure is not set up for cluster: ambari-metrics-cluster
>
> Any idea what is going on?
>
> I see an instance type called CONTROLLER_PARTICIPANT. Is this something
> that would avoid having to create a separate session for the controller
> during initialization?
>
> Best Regards,
> Sid
>
> ------------------------------
> *From:* kishore g <[email protected]>
> *Sent:* Wednesday, July 13, 2016 12:52 AM
> *To:* [email protected]
> *Subject:* Re: MIT-Kerberos support for ZkHelixAdmin
>
> These do not look like the standard system variables used by Zookeeper.
>
> Take a look at this wiki:
>
> https://cwiki.apache.org/confluence/display/ZOOKEEPER/ZooKeeper+SSL+User+Guide
>
> export CLIENT_JVMFLAGS="
> -Dzookeeper.clientCnxnSocket=org.apache.zookeeper.ClientCnxnSocketNetty
> -Dzookeeper.client.secure=true
> -Dzookeeper.ssl.keyStore.location=/root/zookeeper/ssl/testKeyStore.jks
> -Dzookeeper.ssl.keyStore.password=testpass
> -Dzookeeper.ssl.trustStore.location=/root/zookeeper/ssl/testTrustStore.jks
> -Dzookeeper.ssl.trustStore.password=testpass"
>
> On Tue, Jul 12, 2016 at 8:12 PM, Siddharth Wagle <[email protected]>
> wrote:
>
>> Thanks Kishore, appreciate the help.
>>
>> I do have a jaas.conf on the class path which works for the Phoenix client
>> connecting to ZK (in the same JVM) but does not work for Helix:
>>
>> -Djava.security.auth.login.config=/etc/ams-hbase/conf/ams_collector_jaas.conf
>>
>> [root@ambari-sid-4 ~]# cat /etc/ams-hbase/conf/ams_collector_jaas.conf
>>
>> Client {
>>   com.sun.security.auth.module.Krb5LoginModule required
>>   useKeyTab=true
>>   storeKey=true
>>   useTicketCache=false
>>   keyTab="/etc/security/keytabs/ams.collector.keytab"
>>   principal="amshbase/[email protected]";
>> };
>>
>> ------------------------------
>> *From:* kishore g <[email protected]>
>> *Sent:* Tuesday, July 12, 2016 6:36 PM
>> *To:* [email protected]
>> *Subject:* Re: MIT-Kerberos support for ZkHelixAdmin
>>
>> We haven't tried ZK with authentication. I think ZK authentication can be
>> enabled by setting system properties. Will take a look at it and get back
>> to you.
>>
>> On Tue, Jul 12, 2016 at 5:12 PM, Siddharth Wagle <[email protected]>
>> wrote:
>>
>>> Hi,
>>>
>>> I am working on Ambari Metrics System HA,
>>> https://issues.apache.org/jira/browse/AMBARI-15901
>>> and using Helix for task partitioning as well as service discovery.
>>>
>>> The issue I am facing is that as soon as I enable Kerberos, Helix stops
>>> working as it cannot connect to the secure Zookeeper.
>>>
>>> Are there any examples or recommendations of how to get the ZkHelixAdmin
>>> to work with secure Zookeeper? I was unable to find any mention of this in
>>> the codebase.
>>>
>>> Thanks,
>>> Sid.
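
For reference, the system-property approach discussed in the thread can also be applied programmatically instead of through -D flags. The sketch below is only an illustration, not something confirmed by the thread as the fix: `SecureZkClientProps` and its `apply` method are hypothetical names, not part of Helix or ZooKeeper. It sets `java.security.auth.login.config` (the flag quoted above) together with the ZooKeeper client properties `zookeeper.sasl.client` and `zookeeper.sasl.clientconfig`, the latter naming the JAAS section (`Client` in the file shown above).

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not part of Helix or ZooKeeper. */
final class SecureZkClientProps {
    private SecureZkClientProps() {}

    /**
     * Sets the JVM-wide properties that a ZooKeeper client reads for SASL/Kerberos
     * login and returns the values that were applied.
     *
     * @param jaasConfPath path to the JAAS file, e.g. the ams_collector_jaas.conf above
     * @param loginContext name of the JAAS section to use, e.g. "Client"
     */
    static Map<String, String> apply(String jaasConfPath, String loginContext) {
        Map<String, String> props = new LinkedHashMap<>();
        // Same effect as -Djava.security.auth.login.config=... on the command line.
        props.put("java.security.auth.login.config", jaasConfPath);
        // ZooKeeper client SASL switch and JAAS section name.
        props.put("zookeeper.sasl.client", "true");
        props.put("zookeeper.sasl.clientconfig", loginContext);
        // System properties are global to the JVM, so every client in it sees them.
        props.forEach(System::setProperty);
        return props;
    }
}
```

Such a call would have to run before the first ZooKeeper client in the JVM is created (for example, before instantiating ZkHelixAdmin), since the properties are read when the client is set up. Whether this alone is enough for Helix is not established here; the thread reports success only after the upgrade to 0.7.1 and the separate znode.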
