Richard Dorman wrote:
I'm trying to start up a quorum of Zookeeper servers in a cluster,
however, Zookeeper is failing to start because it cannot find its
hostname in the list of Zookeeper quorum servers.

Can you provide the contents of the ZK log for this? The thing is, we don't, as you say, "look up our hostname in the list of ZK quorum servers". Rather, we rely on the "myid" file, which resides in the data directory (you should have created it when you set up the cluster), to identify who "we" (meaning the server) are during server start.

So:
1) myid file has the server id
2) config file on each server has something like

server.1=host1:2888:2889
server.2=host2:2988:2989

where host1 will have myid file with "1"
host2 will have myid file with "2"
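
To make the idea concrete, here is a rough sketch of that lookup. This is not the actual ZooKeeper startup code; the class and method names (MyIdLookup, lookupSelf) are made up for illustration, and it only assumes the myid file holds a single number and the config uses the server.N=host:port:port lines shown above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical illustration only, not ZooKeeper's real implementation.
public class MyIdLookup {

    // Given the text of the myid file and the text of the config file,
    // return the "host:port:port" entry that belongs to this server,
    // or null if the config has no matching server.N line.
    static String lookupSelf(String myidContents, String configContents) {
        long myId = Long.parseLong(myidContents.trim());
        Properties config = new Properties();
        try {
            config.load(new StringReader(configContents));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return config.getProperty("server." + myId);
    }

    public static void main(String[] args) {
        String config = "server.1=host1:2888:2889\n"
                      + "server.2=host2:2988:2989\n";
        // A server whose myid file contains "2" finds the host2 entry
        // without ever consulting its own hostname.
        System.out.println(lookupSelf("2\n", config));
    }
}
```

The point of the sketch is that the server's identity comes from the number in myid, so the local hostname never enters the comparison.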

I know this problem is well documented on the WIKI, however, my
situation is a little different. The allocation of a node to become a
Zookeeper is done dynamically by a management service running elsewhere
on the cluster. This node then associates its IP with a hostname in the
Zookeeper quorum list. The hostname is not the default hostname of the
node. The node may associate its IP with multiple hostnames for each
service that it is allocated.

We register a server socket as follows:

  ss = new ServerSocket(self.getQuorumAddress().getPort());

Note: we only specify the port number here, not the host name/address. This should mean that the socket binds to all interfaces on the host, for all possible IP addresses (wildcard match).
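
If you want to check the wildcard behaviour on one of your nodes, something like the following small test should do. It is a standalone sketch, not ZooKeeper code; the class name WildcardBind and the use of port 0 (let the OS pick a free port) are my own choices for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;

// Hypothetical standalone check, not part of ZooKeeper.
public class WildcardBind {

    // Open a ServerSocket with only a port number, as in the snippet above,
    // and report whether it ended up bound to the wildcard address.
    static boolean bindsToWildcard(int port) {
        try (ServerSocket ss = new ServerSocket(port)) {
            return ss.getInetAddress().isAnyLocalAddress();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Port 0 asks the OS for any free port, so this does not clash
        // with a running server.
        System.out.println("bound to wildcard: " + bindsToWildcard(0));
    }
}
```

A socket bound this way accepts connections on whichever address the node has associated with its allocated hostname, so the listening side should not care what the default hostname is.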

This causes a problem when Zookeeper starts. Zookeeper does a
getdefaulthost, which returns the node's default hostname and not the
associated hostname.

As I mentioned, I'd like to see the log for this error.

So my questions are:

1. Is it possible to resolve this some other way? We are not running DNS
(hostname associations are managed by our own services). We also cannot
use the node's IP address as the nodes are allocated dynamically.
Dynamically updating the config files is also not practical.
2. Why does Zookeeper need to test whether its hostname is in the
Zookeeper quorum list? Can this safely be disabled?

AFAICT we are not doing this. If you could send your config file in addition to the log of the error, that would be interesting to see.

Is this EC2 or something else? What version of ZK are you running?

Patrick
