On 4 Jul 2014, at 14:16, Flavio Junqueira <[email protected]> wrote:
> Ok, so a couple of obvious checks

Sure…

> - Are you passing a connection string with all five servers?

Yes, most definitely. Prior to deploying this I did some extensive testing where I killed off ZK servers at random to test our clients’ ability to reconnect to another server in the cluster. I know that if they absolutely need to, they can connect elsewhere, but the graphs show they almost always pick the same server.

> - Are you calling zoo_deterministic_conn_order(1) by any chance (you
> shouldn't if you want shuffling)?

No, I wasn’t aware of that function, but in mentioning it you’ve led me to the code that does the shuffling.

Is there anything on the server side to force a client to move elsewhere if the server has a disproportionate number of the clients connected to it? That is the behaviour I thought I had read about.

That said, given a sufficiently random random() function, it looks like the permute should do enough to stop all clients arriving on the same server initially anyway. Perhaps I’ll need to add some instrumentation to dump out the permuted connection list and see how it varies across the clients?

-James

> -Flavio
>
>
> On Friday, July 4, 2014 2:01 PM, James Mulcahy <[email protected]>
> wrote:
>
>> Hi Flavio,
>>
>> Thanks for the quick response, and apologies for not including these
>> details up front!
>>
>> - C client binding
>> - 99.99% MacOS X Clients (10.9.2), with a couple of Linux Clients (Ubuntu
>> 14.04)
>> - All ZK nodes are Linux (Ubuntu 14.04)
>> - ZooKeeper 3.4.6
>>
>> No Windows involved here….
>>
>> -James
>>
>>
>> On 4 Jul 2014, at 13:57, Flavio Junqueira <[email protected]>
>> wrote:
>>
>>> Hi James,
>>>
>>> Are you using the C or the Java client binding? What's the OS? I'm asking
>>> because there is an issue with the randomization of the connect string on
>>> Windows we found, but I haven't created a jira for it yet.
>>>
>>> -Flavio
>>>
>>>
>>> On Friday, July 4, 2014 10:41 AM, James Mulcahy <[email protected]>
>>> wrote:
>>>
>>>> Hello,
>>>>
>>>> I run a 5 node ZooKeeper ensemble, with ~900 clients connected at a given
>>>> time. I’m noticing that at any one point in time, all the clients are
>>>> generally connected to the same ZooKeeper node.
>>>>
>>>> Looking back over the graphs I have which track this, there has only been
>>>> one brief period where one node didn’t have >90% of the clients; and
>>>> during that period, two nodes shared roughly 50% of the clients each.
>>>>
>>>> Is this expected behaviour? Is there anything I can do to tune this, to
>>>> encourage the clients to be more balanced?
>>>>
>>>> My expectation was that the clients would self-balance. I thought I’d
>>>> read that somewhere in the documentation, but I can’t find a reference for
>>>> that now.
>>>>
>>>> Thanks in advance,
>>>>
>>>> -James
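
For what it's worth, the instrumentation idea in the reply at the top of this message could be prototyped outside the client first. The following is only a rough, standalone sketch and is not taken from the ZooKeeper sources: next_rand(), permute_servers(), tally_first_choice() and print_counts() are hypothetical names, the generator is a splitmix64 stand-in rather than random(), and the "every client ends up with the same seed" case is an assumption made purely to show what a skewed distribution would look like, not a claim about how the C client seeds itself.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define NUM_SERVERS 5

/* splitmix64: a small, well-mixed PRNG used here only as a stand-in for
 * random(). It is NOT the generator the ZooKeeper C client uses. */
static uint64_t next_rand(uint64_t *state)
{
    uint64_t z = (*state += 0x9E3779B97F4A7C15ULL);
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    return z ^ (z >> 31);
}

/* Fill order[] with a Fisher-Yates permutation of 0..NUM_SERVERS-1,
 * driven entirely by the given seed. */
static void permute_servers(int order[NUM_SERVERS], uint64_t seed)
{
    uint64_t state = seed;
    for (int i = 0; i < NUM_SERVERS; i++)
        order[i] = i;
    for (int i = NUM_SERVERS - 1; i > 0; i--) {
        int j = (int)(next_rand(&state) % (uint64_t)(i + 1));
        int tmp = order[i];
        order[i] = order[j];
        order[j] = tmp;
    }
}

/* Simulate num_clients clients and count which server each would try first.
 * distinct_seeds != 0: client c is seeded with base_seed + c.
 * distinct_seeds == 0: every client shares base_seed (hypothetical bad case). */
static void tally_first_choice(int num_clients, uint64_t base_seed,
                               int distinct_seeds, int counts[NUM_SERVERS])
{
    assert(num_clients >= 0);
    for (int s = 0; s < NUM_SERVERS; s++)
        counts[s] = 0;
    for (int c = 0; c < num_clients; c++) {
        int order[NUM_SERVERS];
        uint64_t seed = distinct_seeds ? base_seed + (uint64_t)c : base_seed;
        permute_servers(order, seed);
        counts[order[0]]++;
    }
}

/* Dump the per-server tally on one line. */
static void print_counts(const char *label, const int counts[NUM_SERVERS])
{
    printf("%s:", label);
    for (int s = 0; s < NUM_SERVERS; s++)
        printf(" server%d=%d", s, counts[s]);
    printf("\n");
}
```

Calling tally_first_choice(900, 42, 0, counts) followed by print_counts() shows the shared-seed case, where every simulated client picks the same first server; passing 1 as the third argument gives each client its own seed and spreads the first choices across the servers. Comparing a real dump of the permuted list from each client against these two shapes should indicate whether the seeding is the place to look.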
