Another reconfiguration weirdness question:

Steps: Three servers in total; servers 1 and 2 know only about themselves, server 3 knows everyone. I first start up 1 and 2. After quorum is formed I start 3.
Expected: server 3 cannot join, or at most becomes an observer.

Actually: server 3 is allowed into the quorum as a follower and I can do writes through it. The configuration, however, doesn't include server 3 (4-letter conf or zkCli config), as all three servers list only 1 and 2.

I think I saw some discussion about accepting unannounced new members, but couldn't find it now. Even then, the new member obviously should be part of the official configuration.

BR,
Niko Vuokko

2014-06-25 10:10 GMT+03:00 Niko Vuokko <[email protected]>:

> JIRA+patch available @
> https://issues.apache.org/jira/browse/ZOOKEEPER-1946
>
>
> 2014-06-19 17:52 GMT+03:00 Alexander Shraer <[email protected]>:
>
> Hi,
>>
>> 1) yes. The only consistent way to change a configuration is to agree on a
>> new one and currently you need to use the reconfig api to do that.
>> 2) yeah seems confusing indeed. It probably makes sense to update the
>> printed info when the config is updated (in this case when it learns it
>> during sync). Can you open a JIRA for this? If you can take a stab at
>> fixing it I'll review it.
>>
>> Thanks
>> Alex
>> On Jun 19, 2014 4:52 PM, "Niko Vuokko" <[email protected]> wrote:
>>
>> > I basically try to simulate having a clean slate server coming up with
>> > myid=3. So it would have no data, no dynamic configuration file since the
>> > ZK service has never run. I was hoping to test the reconfiguration API to
>> > let the new server 3 into the quorum, but then this happened.
>> >
>> > "Static configuration" = $ZOOCFG on service start
>> > Reconfiguration API used: Not yet...
>> > Manual change: Yes, I kill server 3 manually, delete all its data, change
>> > its configuration and start it up as "new, empty server"
>> >
>> > Yes, the remaining quorum obviously tells the "new" server 3 to configure
>> > itself with the old ports, but that just sounds *really* weird to me.
>> > Two questions:
>> >
>> > 1) Is it really so that it doesn't matter how I configure the new member if
>> > the quorum already knows about an old server with the same myid? The old
>> > configuration will just be forced upon the new server even if that is
>> > unwanted.
>> > 2) All the log entries refer to the new port 2184 although that is not
>> > actually used. For example, once the "new" server 3 has joined as a
>> > follower, I'm still getting rows like
>> >
>> > 2014-06-19 16:25:10,883 [myid:3] - INFO
>> > [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2184:QuorumPeer@972] - FOLLOWING
>> >
>> > And as I mentioned, there is nothing in port 2184, the "new" server 3
>> > responds at 2183... The problematic piece seems to come from QuorumPeer@857
>> > where the thread name is set. This bug causes confusion (especially if
>> > hostnames change as well), probably nothing worse in most cases though.
>> >
>> > (btw, 4-letter word conf gives the currently known member configuration)
>> >
>> > Thanks for the help,
>> > niko
>> >
>> >
>> > 2014-06-19 16:22 GMT+03:00 Alexander Shraer <[email protected]>:
>> >
>> > > when you say "change its static configuration", what exactly do you mean?
>> > > this part of the configuration should be located in the membership file
>> > > (dynamic part of the config). do you use the reconfiguration API to change
>> > > it (this is the right way)? do you manually change it at all/some of the
>> > > servers (this would have no effect because servers read these files only
>> > > when they boot)?
>> > >
>> > > In the scenario you describe I expect server 3 to come up. It's possible,
>> > > if you changed the config files manually, that during sync with the leader
>> > > the leader pushes the old config that it has in memory to server 3 (since
>> > > the files you updated are not read); that's why it effectively has the old
>> > > parameters.
>> > >
>> > > Please use the dynamic reconfig API when you make changes to existing
>> > > servers or add/remove servers.
>> > >
>> > > the "config" command in CLI can show you what config is latest at the
>> > > server you're connected to (if I remember correctly there's also a 4
>> > > letter command). You can also check the config file(s) of server 3 after it
>> > > syncs with the leader. I suspect that it will contain the old config.
>> > >
>> > >
>> > > Alex
>> > >
>> > >
>> > > On Thu, Jun 19, 2014 at 2:03 PM, Niko Vuokko <[email protected]>
>> > > wrote:
>> > >
>> > > > Starting from a stable 3-member quorum:
>> > > >
>> > > > server.1=localhost:2801:3801;2181
>> > > > server.2=localhost:2802:3802;2182
>> > > > server.3=localhost:2803:3803;2183
>> > > >
>> > > > I then kill server 3, clear its data directory, keep its myid=3 and change
>> > > > its static configuration to
>> > > >
>> > > > server.1=localhost:2801:3801;2181
>> > > > server.2=localhost:2802:3802;2182
>> > > > server.3=localhost:2804:3804;2184
>> > > >
>> > > > Now what I would expect is that this "new" server 3 will not join the
>> > > > quorum since the ports don't match what the servers 1 and 2 expect.
>> > > > However, it can join. The problem is that the "new" server 3 does not
>> > > > respect its configuration. Its logs will contain the new port number 2184,
>> > > > but it will actually pick up the dynamic configuration offered by the
>> > > > quorum and open up the old ports 2183 etc. After joining again, the dynamic
>> > > > configuration file for server 3 contains
>> > > >
>> > > > server.3=localhost:2803:3803:participant;0.0.0.0:2183
>> > > >
>> > > > Also, echo conf | nc localhost 2184 never replies but
>> > > > echo conf | nc localhost 2183 returns
>> > > >
>> > > > server.3=localhost:2803:3803:participant;0.0.0.0:2183
>> > > >
>> > > > Is this actually intentional or a bug?
>> > > >
>> > > >
>> > > > Best,
>> > > > Niko Vuokko
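
For anyone reproducing the port mismatch described in the oldest message of this thread, the following is a minimal Python sketch. It compares the entry a server was started with against the entry it ends up with after syncing. It assumes the `server.N=host:quorumPort:electionPort[:role];[clientAddress:]clientPort` syntax seen in the entries quoted in this thread, and that an omitted role means participant. The names `ServerEntry`, `parse_entry` and `diff_entries` are hypothetical helpers, not part of ZooKeeper. IPv6 addresses are not handled.

```python
import re
from typing import Dict, NamedTuple, Optional, Tuple


class ServerEntry(NamedTuple):
    """One parsed server.N=... line (hypothetical helper type)."""
    sid: int
    host: str
    quorum_port: int
    election_port: int
    role: str
    client_port: Optional[int]


# Assumed syntax: server.N=host:qport:eport[:role];[clientAddr:]clientPort
_ENTRY = re.compile(
    r"^server\.(?P<sid>\d+)=(?P<host>[^:;]+):(?P<qp>\d+):(?P<ep>\d+)"
    r"(?::(?P<role>participant|observer))?"
    r"(?:;(?:[^:;]+:)?(?P<cp>\d+))?$"
)


def parse_entry(line: str) -> ServerEntry:
    """Parse a single server entry; the client bind address is ignored."""
    m = _ENTRY.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognised server entry: {line!r}")
    return ServerEntry(
        sid=int(m.group("sid")),
        host=m.group("host"),
        quorum_port=int(m.group("qp")),
        election_port=int(m.group("ep")),
        # Assumption: a missing role is treated as "participant".
        role=m.group("role") or "participant",
        client_port=int(m.group("cp")) if m.group("cp") else None,
    )


def diff_entries(
    configured: ServerEntry, effective: ServerEntry
) -> Dict[str, Tuple[object, object]]:
    """Return {field: (configured, effective)} for every differing field."""
    return {
        field: (getattr(configured, field), getattr(effective, field))
        for field in ServerEntry._fields
        if getattr(configured, field) != getattr(effective, field)
    }


# Entries taken from the thread: what server 3 was started with, and what
# its dynamic configuration file contained after rejoining the quorum.
configured = parse_entry("server.3=localhost:2804:3804;2184")
effective = parse_entry("server.3=localhost:2803:3803:participant;0.0.0.0:2183")

for field, (want, got) in diff_entries(configured, effective).items():
    print(f"{field}: configured {want}, effective {got}")
```

Feeding it the output of the 4-letter `conf` command or the contents of the dynamic configuration file after a restart gives a quick way to spot when the quorum has overridden locally edited ports.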
