Re: The idea behind 'myid'
We just need a unique identifier for every server. If such an identifier magically appears somehow, then I believe our protocols will be equally happy.

Now, a mechanism to assign ids would also have to take into consideration the group scheme we have for hierarchical quorums. To assign servers to groups, we currently use the identifiers we assign manually to servers. If we don't have such identifiers, then we need a different way of configuring groups.

-Flavio

On Sep 29, 2009, at 7:01 PM, Patrick Hunt wrote:

Jason Venner wrote:

I do find having to have a custom file in each zk root somewhat awkward, as I like to rsync my configuration files around. I also would prefer not to have to have all of my zk nodes listed in the configuration file by id. I think I would prefer it if there was a mechanism for each log directory to come up with a cluster unique id, and then the jvm running on that log dir would advertise that ID.

You are saying to list the server's ip/port in each of the server config files, but not the id's, correct? The server id is actually used as part of the quorum (zab) protocol - servers with lower id's don't attempt to connect to servers with higher id when forming quorum (to avoid 2 servers negotiating on 2 channels/threads). Perhaps this can be worked around though. Flavio? I'll enter a jira for tracking this feature as soon as apache issue tracking comes back (seems to be down right now).

If there was a control that would allow runtime changing of the required quorum, then I could dynamically add and remove nodes.

https://issues.apache.org/jira/browse/ZOOKEEPER-107

Patrick

2009/9/29 Ørjan Horpestad orj...@gmail.com

Thanks for all of your answers. I can see more clearly why using an IP as id could be a bad idea in a ZK setup. Patrick: I will indeed try out your Zkconf tool, thanks.

Regards, Orjan
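The connection rule Patrick describes in the quoted message above (servers with lower ids don't connect to servers with higher ids) can be pictured with a toy model. This is only a sketch of the rule as stated in the thread, not ZooKeeper's actual connection code; `should_initiate` and `connections` are hypothetical names made up for the illustration.

```python
from itertools import permutations


def should_initiate(my_id, peer_id):
    """Rule as stated in the thread: only the higher id opens the connection."""
    return my_id > peer_id


def connections(ids):
    """Return the (initiator, acceptor) pairs the rule yields for a set of ids."""
    return sorted((a, b) for a, b in permutations(ids, 2) if should_initiate(a, b))
```

With ids 1, 2 and 3, every pair of servers ends up with exactly one initiator, which is why the rule needs ids that are unique and comparable, whatever mechanism assigns them.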
Re: The idea behind 'myid'
Not sure if you'll find this interesting but my zk configuration generator is available on github: http://github.com/phunt/zkconf

zkconf.py will generate all of the configuration needed to run a ZooKeeper ensemble. I mainly use this tool for localhost based testing, but it can generate configurations for any list of servers (see the --server option).

Patrick

Eric Bowman wrote:

Benjamin Reed wrote:

you and ted are correct. the id gives zookeeper a stable identifier to use even if the ip address changes. if the ip address doesn't change, we could use that, but we didn't want to make that a built in assumption. if you really do have a rock solid ip address, you could make a wrapper startup script that starts up and creates the myid file based on the ip address. i gotta say though, i've found that such assumptions are often found to be invalid.

Yeah, it can be tricky. In more than one cluster, I've seen a set of static configuration files that gets replicated everywhere. If an individual instance needs per-instance configuration, we do that from the command line (using -D). Maybe logic can do it, or maybe a start script has to load a machine local file, whatever. It's a pretty common paradigm, though.

It's hardly the end of the world, but it is definitely something my ops people stumbled over.
Re: The idea behind 'myid'
can you clarify what you are asking for? are you just looking for motivation? or are you trying to find out how to use it?

the myid file just has the unique identifier (number) of the server in the cluster. that number is matched against the id in the configuration file. there isn't much to say about it: http://hadoop.apache.org/zookeeper/docs/r3.2.1/zookeeperStarted.html

ben

Ørjan Horpestad wrote:

Hi! Can someone pin-point me to a site (or please explain ) where I can read about the use of the myid-file for configuring the id of the ZooKeeper servers? I'm sure there is a good reason for using this approach, but it is the first time I have come over this type of non-automatic way for administrating replicas.

Regards, Orjan
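To make the matching Ben describes concrete, here is a minimal sketch of how the number in myid lines up with a `server.N=host:port:port` entry in the configuration file. The host names, ports and paths in `EXAMPLE_CFG` are illustrative assumptions, and `parse_servers` and `check_myid` are hypothetical helpers, not part of ZooKeeper.

```python
import re

# Matches lines such as "server.1=zoo1:2888:3888".
SERVER_LINE = re.compile(r"^server\.(\d+)\s*=\s*([^:\s]+):(\d+):(\d+)")


def parse_servers(cfg_text):
    """Return {id: host} for every server.N=host:port:port line."""
    servers = {}
    for line in cfg_text.splitlines():
        m = SERVER_LINE.match(line.strip())
        if m:
            servers[int(m.group(1))] = m.group(2)
    return servers


def check_myid(cfg_text, myid_text):
    """Return the host that the myid number refers to, or raise if unknown."""
    my_id = int(myid_text.strip())
    servers = parse_servers(cfg_text)
    if my_id not in servers:
        raise ValueError(f"myid {my_id} has no server.{my_id} entry")
    return servers[my_id]


# Illustrative configuration; every value here is an assumption.
EXAMPLE_CFG = """\
tickTime=2000
dataDir=/var/zookeeper
clientPort=2181
initLimit=5
syncLimit=2
server.1=zoo1:2888:3888
server.2=zoo2:2888:3888
server.3=zoo3:2888:3888
"""
```

The configuration file can be identical on every machine; the myid file, containing a single number such as `2`, is the only per-machine piece and tells the process which `server.N` line describes it.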
Re: The idea behind 'myid'
Hi Ben

Well, I'm just wondering why the server's own unique IP address isn't good enough as a valid identifier; it strikes me as a bit exhausting to manually set the id for each server in the cluster. Or maybe there are some details I'm not getting here :-)

Regards, Orjan

On Fri, Sep 25, 2009 at 3:56 PM, Benjamin Reed br...@yahoo-inc.com wrote:

can you clarify what you are asking for? are you just looking for motivation? or are you trying to find out how to use it?

the myid file just has the unique identifier (number) of the server in the cluster. that number is matched against the id in the configuration file. there isn't much to say about it: http://hadoop.apache.org/zookeeper/docs/r3.2.1/zookeeperStarted.html

ben

Ørjan Horpestad wrote:

Hi! Can someone pin-point me to a site (or please explain ) where I can read about the use of the myid-file for configuring the id of the ZooKeeper servers? I'm sure there is a good reason for using this approach, but it is the first time I have come over this type of non-automatic way for administrating replicas.

Regards, Orjan
Re: The idea behind 'myid'
A server doesn't have a unique IP address. Each interface can have 1 or more IP addresses and there can be many interfaces. Furthermore, an IP address can move from one machine to another.

2009/9/25 Ørjan Horpestad orj...@gmail.com

Hi Ben

Well, I'm just wondering why the server's own unique IP address isn't good enough as a valid identifier; it strikes me as a bit exhausting to manually set the id for each server in the cluster. Or maybe there are some details I'm not getting here :-)

Regards, Orjan

On Fri, Sep 25, 2009 at 3:56 PM, Benjamin Reed br...@yahoo-inc.com wrote:

can you clarify what you are asking for? are you just looking for motivation? or are you trying to find out how to use it?

the myid file just has the unique identifier (number) of the server in the cluster. that number is matched against the id in the configuration file. there isn't much to say about it: http://hadoop.apache.org/zookeeper/docs/r3.2.1/zookeeperStarted.html

ben

Ørjan Horpestad wrote:

Hi! Can someone pin-point me to a site (or please explain ) where I can read about the use of the myid-file for configuring the id of the ZooKeeper servers? I'm sure there is a good reason for using this approach, but it is the first time I have come over this type of non-automatic way for administrating replicas.

Regards, Orjan

-- Ted Dunning, CTO DeepDyve
Re: The idea behind 'myid'
Another way of doing it, though, would be to tell each instance which IP to use at startup. That way the config can be identical for all users, and there can be whatever logic is required to figure out the right IP address, in the place where logic is executing anyhow.

I do agree that maintaining the myid file is awkward compared to other approaches that are working elsewhere. It's not really clear what purpose the myid serves except to bind an ip address to a running instance.

cheers, Eric

Ted Dunning wrote:

A server doesn't have a unique IP address. Each interface can have 1 or more IP addresses and there can be many interfaces. Furthermore, an IP address can move from one machine to another.

2009/9/25 Ørjan Horpestad orj...@gmail.com

Hi Ben

Well, I'm just wondering why the server's own unique IP address isn't good enough as a valid identifier; it strikes me as a bit exhausting to manually set the id for each server in the cluster. Or maybe there are some details I'm not getting here :-)

Regards, Orjan

On Fri, Sep 25, 2009 at 3:56 PM, Benjamin Reed br...@yahoo-inc.com wrote:

can you clarify what you are asking for? are you just looking for motivation? or are you trying to find out how to use it?

the myid file just has the unique identifier (number) of the server in the cluster. that number is matched against the id in the configuration file. there isn't much to say about it: http://hadoop.apache.org/zookeeper/docs/r3.2.1/zookeeperStarted.html

ben

Ørjan Horpestad wrote:

Hi! Can someone pin-point me to a site (or please explain ) where I can read about the use of the myid-file for configuring the id of the ZooKeeper servers? I'm sure there is a good reason for using this approach, but it is the first time I have come over this type of non-automatic way for administrating replicas.

Regards, Orjan

-- Eric Bowman
Boboco Ltd
ebow...@boboco.ie
http://www.boboco.ie/ebowman/pubkey.pgp
+35318394189/+353872801532
Re: The idea behind 'myid'
you and ted are correct. the id gives zookeeper a stable identifier to use even if the ip address changes. if the ip address doesn't change, we could use that, but we didn't want to make that a built in assumption. if you really do have a rock solid ip address, you could make a wrapper startup script that starts up and creates the myid file based on the ip address. i gotta say though, i've found that such assumptions are often found to be invalid.

ben

Eric Bowman wrote:

Another way of doing it, though, would be to tell each instance which IP to use at startup. That way the config can be identical for all users, and there can be whatever logic is required to figure out the right IP address, in the place where logic is executing anyhow.

I do agree that maintaining the myid file is awkward compared to other approaches that are working elsewhere. It's not really clear what purpose the myid serves except to bind an ip address to a running instance.

cheers, Eric

Ted Dunning wrote:

A server doesn't have a unique IP address. Each interface can have 1 or more IP addresses and there can be many interfaces. Furthermore, an IP address can move from one machine to another.

2009/9/25 Ørjan Horpestad orj...@gmail.com

Hi Ben

Well, I'm just wondering why the server's own unique IP address isn't good enough as a valid identifier; it strikes me as a bit exhausting to manually set the id for each server in the cluster. Or maybe there are some details I'm not getting here :-)

Regards, Orjan

On Fri, Sep 25, 2009 at 3:56 PM, Benjamin Reed br...@yahoo-inc.com wrote:

can you clarify what you are asking for? are you just looking for motivation? or are you trying to find out how to use it?

the myid file just has the unique identifier (number) of the server in the cluster. that number is matched against the id in the configuration file. there isn't much to say about it: http://hadoop.apache.org/zookeeper/docs/r3.2.1/zookeeperStarted.html

ben

Ørjan Horpestad wrote:

Hi! Can someone pin-point me to a site (or please explain ) where I can read about the use of the myid-file for configuring the id of the ZooKeeper servers? I'm sure there is a good reason for using this approach, but it is the first time I have come over this type of non-automatic way for administrating replicas.

Regards, Orjan
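The wrapper startup script Ben suggests could look roughly like the sketch below, assuming the server entries in the configuration are given as IP addresses and that the myid file lives in the data directory. All three function names are hypothetical, and `local_addresses` is only a best-effort lookup that may miss interfaces. The sketch refuses to guess when zero or several entries match, which is exactly the situation Ted warns about.

```python
import os
import socket


def local_addresses():
    """Best-effort set of this host's IPv4 addresses (may be incomplete)."""
    _, _, addrs = socket.gethostbyname_ex(socket.gethostname())
    return set(addrs)


def pick_id(servers, addrs):
    """servers maps id -> IP string; return the single id whose IP is local."""
    matches = [sid for sid, ip in servers.items() if ip in addrs]
    if len(matches) != 1:
        raise RuntimeError(f"expected exactly one local server entry, found {matches}")
    return matches[0]


def write_myid(data_dir, server_id):
    """Write the id as text to <data_dir>/myid and return the file path."""
    path = os.path.join(data_dir, "myid")
    with open(path, "w") as f:
        f.write(f"{server_id}\n")
    return path
```

A wrapper would call `write_myid(data_dir, pick_id(servers, local_addresses()))` before launching the server. This only helps if the IP addresses really are stable, which is the assumption Ben cautions against.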
Re: The idea behind 'myid'
Benjamin Reed wrote:

you and ted are correct. the id gives zookeeper a stable identifier to use even if the ip address changes. if the ip address doesn't change, we could use that, but we didn't want to make that a built in assumption. if you really do have a rock solid ip address, you could make a wrapper startup script that starts up and creates the myid file based on the ip address. i gotta say though, i've found that such assumptions are often found to be invalid.

Yeah, it can be tricky. In more than one cluster, I've seen a set of static configuration files that gets replicated everywhere. If an individual instance needs per-instance configuration, we do that from the command line (using -D). Maybe logic can do it, or maybe a start script has to load a machine local file, whatever. It's a pretty common paradigm, though.

It's hardly the end of the world, but it is definitely something my ops people stumbled over.

-- Eric Bowman
Boboco Ltd
ebow...@boboco.ie
http://www.boboco.ie/ebowman/pubkey.pgp
+35318394189/+353872801532