So I was able to get this to work by doing 2 things:

- Upgraded Helix from 0.6.5 to 0.7.1 and made the resulting client-side changes

- Changed the znode when security is enabled (separate znodes for secure and 
unsecure modes)


I do not see any ACLs set by Helix on the znode, so I am still verifying 
whether change #2 is necessary.
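
A minimal sketch of what change #2 could look like, assuming the Helix cluster name doubles as the top-level znode (Helix keeps a cluster under /<clusterName>). The class name, method names, and the "-secure" suffix are hypothetical and not taken from the Ambari code; only the base name ambari-metrics-cluster comes from the logs quoted below.

```java
final class ClusterNaming {
    // Cluster name as it appears in the logs quoted in this thread.
    static final String BASE_CLUSTER_NAME = "ambari-metrics-cluster";

    // Hypothetical suffix used to keep the secure cluster in its own znode.
    static final String SECURE_SUFFIX = "-secure";

    /** Returns the cluster name to use for the given security mode. */
    static String clusterName(boolean securityEnabled) {
        return securityEnabled ? BASE_CLUSTER_NAME + SECURE_SUFFIX : BASE_CLUSTER_NAME;
    }

    /** Returns the top-level znode path that corresponds to the cluster name. */
    static String clusterZnode(boolean securityEnabled) {
        return "/" + clusterName(securityEnabled);
    }
}
```

With this, ClusterNaming.clusterZnode(true) and ClusterNaming.clusterZnode(false) resolve to different znodes, so a cluster created before Kerberos was enabled is never reused afterwards.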


BR,

Sid


________________________________
From: Siddharth Wagle
Sent: Wednesday, July 13, 2016 11:24 AM
To: [email protected]
Subject: Re: MIT-Kerberos support for ZkHelixAdmin


Hi Kishore,


Quick summary of what I am doing (Controller and Participant run in the same JVM):


- Instantiate ZkHelixAdmin and create a cluster

- Add the host as an instance to the cluster

- Add the state model def

- Add resources and rebalance

- Start the participant

- Start the controller


https://github.com/apache/ambari/blob/trunk/ambari-metrics/ambari-metrics-timelineservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/metrics/timeline/availability/MetricCollectorHAController.java


So I have everything up to the Controller part working. When security is 
enabled, the ZKHelixManager for the Controller stops working and throws the 
following error:


2016-07-13 17:56:41,263 INFO org.apache.helix.manager.zk.ZKHelixManager: 
KeeperState: SyncConnected, zookeeper:State:CONNECTED Timeout:30000 
sessionid:0x355e190123e001d local:/10.240.0.32:54434 
remoteserver:ambari-sid-3.c.pramod-thangali.internal/10.240.0.30:2181 
lastZxid:8589935298 xid:9 sent:9 recv:11 queuedpkts:0 pendingresp:0 queuedevents:0


2016-07-13 17:57:41,264 ERROR org.apache.helix.manager.zk.ZKHelixManager: fail 
to connect zkserver: 
ambari-sid-1.c.pramod-thangali.internal:2181,ambari-sid-2.c.pramod-thangali.internal:2181,ambari-sid-3.c.pramod-thangali.internal:2181
 in 60000ms. expiredSessionId: null, clusterName: ambari-metrics-cluster


(Re-throws same exception and never gets out of this state). Zookeeper server 
logs do not indicate an incoming client connection.
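
Since the server logs show no incoming connection, one way to rule out plain network reachability is to probe each server in the connect string over TCP from the same host. The sketch below uses only the JDK; the class and method names are hypothetical, and it does not perform any SASL or ZooKeeper handshake, so a successful probe only shows that the port accepts connections.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;

final class ZkConnectString {
    // ZooKeeper's default client port, used when an entry has no explicit port.
    static final int DEFAULT_PORT = 2181;

    /**
     * Splits "host1:2181,host2:2181" into unresolved socket addresses.
     * A chroot suffix such as "/path" is not handled in this sketch.
     */
    static List<InetSocketAddress> parse(String connectString) {
        List<InetSocketAddress> servers = new ArrayList<>();
        for (String entry : connectString.split(",")) {
            String hostPort = entry.trim();
            if (hostPort.isEmpty()) {
                continue;
            }
            int colon = hostPort.lastIndexOf(':');
            String host = colon < 0 ? hostPort : hostPort.substring(0, colon);
            int port = colon < 0 ? DEFAULT_PORT : Integer.parseInt(hostPort.substring(colon + 1));
            servers.add(InetSocketAddress.createUnresolved(host, port));
        }
        return servers;
    }

    /** Returns true if a TCP connection to the server succeeds within the timeout. */
    static boolean canConnect(InetSocketAddress server, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            // Resolve the host name here; an unknown host surfaces as an IOException.
            socket.connect(new InetSocketAddress(server.getHostString(), server.getPort()), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }
}
```

Calling ZkConnectString.parse(...) on the connect string from the error above and passing each address to canConnect(...) shows whether every quorum member is reachable before looking further into the Kerberos setup.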


If I set a breakpoint that forces a new client session to Zookeeper for the 
CONTROLLER, this is what I see:


2016-07-13 18:04:39,994 INFO org.apache.helix.manager.zk.ZKHelixManager: 
KeeperState: SyncConnected, zookeeper:State:CONNECTED Timeout:30000 
sessionid:0x255e190122a0025 local:null remoteserver:null lastZxid:0 xid:2 
sent:1 recv:1 queuedpkts:0 pendingresp:0 queuedevents:0

2016-07-13 18:04:39,994 INFO org.apache.helix.manager.zk.ZKHelixManager: 
KeeperState:Disconnected, disconnectedSessionId: 255e190122a0025, instance: 
ambari-sid-5.c.pramod-thangali.internal, type: CONTROLLER
2016-07-13 18:04:39,995 ERROR org.apache.helix.manager.zk.ZKHelixManager: fail 
to createClient.
org.apache.helix.HelixException: Cluster structure is not set up for cluster: 
ambari-metrics-cluster


Any idea what is going on?

I see an instance type called CONTROLLER_PARTICIPANT. Would this avoid 
having to create a separate session for the controller during 
initialization?


Best Regards,

Sid


________________________________
From: kishore g <[email protected]>
Sent: Wednesday, July 13, 2016 12:52 AM
To: [email protected]
Subject: Re: MIT-Kerberos support for ZkHelixAdmin

Does not look like standard system variables used by Zookeeper.

Take a look at this wiki

https://cwiki.apache.org/confluence/display/ZOOKEEPER/ZooKeeper+SSL+User+Guide

export CLIENT_JVMFLAGS="
-Dzookeeper.clientCnxnSocket=org.apache.zookeeper.ClientCnxnSocketNetty
-Dzookeeper.client.secure=true
-Dzookeeper.ssl.keyStore.location=/root/zookeeper/ssl/testKeyStore.jks
-Dzookeeper.ssl.keyStore.password=testpass
-Dzookeeper.ssl.trustStore.location=/root/zookeeper/ssl/testTrustStore.jks
-Dzookeeper.ssl.trustStore.password=testpass"


On Tue, Jul 12, 2016 at 8:12 PM, Siddharth Wagle 
<[email protected]> wrote:

Thanks Kishore, appreciate the help.


I do have a jaas.conf on the class path which works for the Phoenix client 
connecting to ZK (in the same JVM) but does not work for Helix:


-Djava.security.auth.login.config=/etc/ams-hbase/conf/ams_collector_jaas.conf


[root@ambari-sid-4 ~]# cat /etc/ams-hbase/conf/ams_collector_jaas.conf

Client {
com.sun.security.auth.module.Krb5LoginModule required
useKeyTab=true
storeKey=true
useTicketCache=false
keyTab="/etc/security/keytabs/ams.collector.keytab"
principal="amshbase/[email protected]";
};
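
To confirm that the JVM running Helix actually sees this file, a small check can ask the JAAS Configuration whether it defines the section the ZooKeeper client looks up. This is a sketch under stated assumptions: the class and method names are hypothetical; "Client" is the ZooKeeper client's default section name, which the zookeeper.sasl.clientconfig system property can override. It only verifies that the section is parsed, not that the keytab or principal is valid.

```java
import javax.security.auth.login.AppConfigurationEntry;
import javax.security.auth.login.Configuration;

final class JaasCheck {
    /** Returns true if the JVM-wide JAAS configuration defines the given section. */
    static boolean hasSection(String sectionName) {
        try {
            Configuration config = Configuration.getConfiguration();
            // Re-read the file named by -Djava.security.auth.login.config.
            config.refresh();
            AppConfigurationEntry[] entries = config.getAppConfigurationEntry(sectionName);
            return entries != null && entries.length > 0;
        } catch (SecurityException e) {
            // Raised when no login configuration can be located or parsed.
            return false;
        }
    }

    /** Checks the section name the ZooKeeper client would use for SASL. */
    static boolean zkClientSectionPresent() {
        String section = System.getProperty("zookeeper.sasl.clientconfig", "Client");
        return hasSection(section);
    }
}
```

Logging JaasCheck.zkClientSectionPresent() just before creating the ZkHelixAdmin would show whether the JAAS settings are visible at that point in the same JVM.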



________________________________
From: kishore g <[email protected]>
Sent: Tuesday, July 12, 2016 6:36 PM
To: [email protected]
Subject: Re: MIT-Kerberos support for ZkHelixAdmin

We haven't tried ZK with authentication. I think ZK authentication can be 
enabled by setting system properties. I will take a look at it and get back to you.

On Tue, Jul 12, 2016 at 5:12 PM, Siddharth Wagle 
<[email protected]> wrote:

Hi,


I am working on Ambari Metrics System HA, 
https://issues.apache.org/jira/browse/AMBARI-15901

and using Helix for task partitioning as well as service discovery.


The issue I am facing is that as soon as I enable Kerberos, Helix stops working 
as it cannot connect to the secure Zookeeper.


Are there any examples or recommendations for getting the ZkHelixAdmin to 
work with secure Zookeeper? I was unable to find any mention of this in the 
codebase.


Thanks,

Sid.



