Looks like it to me. I marked it as such.

Patrick

On Mon, Jan 9, 2012 at 6:49 PM, Neha Narkhede <neha.narkh...@gmail.com> wrote:
> Patrick,
>
> Looks like https://issues.apache.org/jira/browse/ZOOKEEPER-1356 is a
> duplicate of ZOOKEEPER-338? If so, I'll mark it accordingly.
>
> Thanks,
> Neha
>
> On Mon, Jan 9, 2012 at 5:36 PM, Patrick Hunt <ph...@apache.org> wrote:
>> dup of https://issues.apache.org/jira/browse/ZOOKEEPER-338 ?
>>
>> Patrick
>>
>> On Mon, Jan 9, 2012 at 3:17 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:
>>> Neha
>>>
>>> Filing a jira is a great way to further the discussion.
>>>
>>> Sent from my iPhone
>>>
>>> On Jan 9, 2012, at 15:33, Neha Narkhede <neha.narkh...@gmail.com> wrote:
>>>
>>>>>> If you just have machine names in a list that you pass in, then yes, we
>>>>>> could re-resolve on every reconnect and you could just re-alias that name
>>>>>> to a new IP. But you'll have to put in logic that will do that but not
>>>>>> break people using DNS RR.
>>>>
>>>> Having a list of machine names that can be changed to point to new IPs
>>>> seems reasonable too. To be able to do the upgrade without having to
>>>> restart all clients, besides turning off DNS caching in the JVM, we
>>>> still have to solve the problem of the zookeeper client caching the IPs
>>>> in code. Having 2 levels of DNS caching, one in the JVM and one in code
>>>> (which cannot be turned off), doesn't look like a good idea. Unless I'm
>>>> missing the purpose of such IP caching in zookeeper?
>>>>
>>>>>> I realize that moving machines is difficult when you have lots of clients.
>>>>>> I'm a bit surprised your admins can't maintain machine IP addresses on a
>>>>>> machine move given a cluster of that complexity, though
>>>>
>>>> It's not that it can't be done, but it definitely has quite some
>>>> operational overhead. We are trying to brainstorm various approaches
>>>> and come up with one that will involve the least overhead on such
>>>> upgrades going forward.
>>>>
>>>> Having said that, re-resolving host names on reconnect doesn't look
>>>> like a bad idea, provided it doesn't break the DNS RR use case. If
>>>> that sounds good, can I go ahead and file a JIRA for this?
>>>>
>>>> Thanks,
>>>> Neha
>>>>
>>>> On Mon, Jan 9, 2012 at 11:04 AM, Camille Fournier <cami...@apache.org> 
>>>> wrote:
>>>>> We don't shuffle IPs after the initial resolution of IP addresses.
>>>>>
>>>>> In DNS RR, you resolve to a list of IPs, shuffle these, and then we round
>>>>> robin through them trying to connect. If you re-resolve on every
>>>>> round-robin, you have to put in logic to know which ones have changed and
>>>>> somehow maintain that shuffle order or you aren't doing a fair back end
>>>>> round robin, which people using the ZK client against DNS RR are relying
>>>>> on today.
>>>>>
>>>>> If you just have machine names in a list that you pass in, then yes, we
>>>>> could re-resolve on every reconnect and you could just re-alias that name
>>>>> to a new IP. But you'll have to put in logic that will do that but not
>>>>> break people using DNS RR.
>>>>>
>>>>> I realize that moving machines is difficult when you have lots of clients.
>>>>> I'm a bit surprised your admins can't maintain machine IP addresses on a
>>>>> machine move given a cluster of that complexity, though. I also think that
>>>>> if we're going to be putting special cases like this in we might just want
>>>>> to go all the way to a pluggable reconnection scheme, but maybe that is
>>>>> too aggressive.
>>>>>
>>>>> C
>>>>>
>>>>> On Mon, Jan 9, 2012 at 1:51 PM, Neha Narkhede <neha.narkh...@gmail.com> wrote:
>>>>>
>>>>>> Maybe I didn't express myself clearly. When I said DNS RR, I meant its
>>>>>> simplest implementation which resolves a hostname to multiple IPs.
>>>>>>
>>>>>> Whatever method you use to map host names to IPs, the problem is that
>>>>>> the zookeeper client code will always cache the IPs. So to be able to
>>>>>> swap out a machine, all clients would have to be restarted, which if
>>>>>> you have 100s of clients, is a major pain. If you want to move the
>>>>>> entire cluster to new machines, this becomes even harder.
>>>>>>
>>>>>> I don't see why re-resolving host names to IPs in the reconnect logic
>>>>>> is a problem for zookeeper, since you shuffle the list of IPs anyways.
>>>>>>
>>>>>> Thanks,
>>>>>> Neha
>>>>>>
>>>>>>
>>>>>> On Mon, Jan 9, 2012 at 10:31 AM, Camille Fournier <cami...@apache.org>
>>>>>> wrote:
>>>>>>> You can't sensibly round robin within the client code if you re-resolve
>>>>>>> on every reconnect, if you're using dns rr. If that's your goal you'd
>>>>>>> want a list of dns alias names and re-resolve each hostname when you hit
>>>>>>> it on reconnect. But that will break people using dns rr.
>>>>>>> You can look into writing a pluggable reconnect logic into the zk
>>>>>>> client, that's what would be required to do this but at the end of the
>>>>>>> day you'll have to give your users special clients to make that work.
>>>>>>>
>>>>>>> C
>>>>>>> On Jan 9, 2012 1:16 PM, "Neha Narkhede" <neha.narkh...@gmail.com> wrote:
>>>>>>>
>>>>>>>> I was reading through the client code and saw that the zookeeper client
>>>>>>>> caches the server IPs during startup and maintains them for the rest of
>>>>>>>> its lifetime. If we go with the DNS RR approach or a load balancer
>>>>>>>> approach, and later swap out a server with a new one (with a new IP),
>>>>>>>> all clients would have to be restarted to be able to "forget" the
>>>>>>>> old IP and see the new one. That doesn't look like a clean approach to
>>>>>>>> such upgrades. One way of getting around this problem is adding the
>>>>>>>> resolution of host names to IPs in the "reconnect" logic in addition
>>>>>>>> to the constructor. So when such upgrades happen and the client
>>>>>>>> reconnects, it will see the new list of IPs and won't need to be
>>>>>>>> restarted.
>>>>>>>>
>>>>>>>> Does this approach sound good or am I missing something here ?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Neha
>>>>>>>>
>>>>>>>> On Wed, Dec 21, 2011 at 7:21 PM, Camille Fournier <cami...@apache.org>
>>>>>>>> wrote:
>>>>>>>>> DNS RR is good. I had good experiences using that for my client
>>>>>>>>> configs for exactly the reasons you are listing.
>>>>>>>>>
>>>>>>>>> On Wed, Dec 21, 2011 at 8:43 PM, Neha Narkhede <neha.narkh...@gmail.com> wrote:
>>>>>>>>>> Thanks for the responses!
>>>>>>>>>>
>>>>>>>>>>>> How are your clients configured to find the zks now?
>>>>>>>>>>
>>>>>>>>>> Our clients currently use the list of hostnames and ports that
>>>>>>>>>> comprise the zookeeper cluster. For example,
>>>>>>>>>> zoo1:port1,zoo2:port2,zoo3:port3
>>>>>>>>>>
>>>>>>>>>>>> - switch DNS,
>>>>>>>>>>>> - wait for caches to die,
>>>>>>>>>>
>>>>>>>>>> This is something we thought about. However, if I understand it
>>>>>>>>>> correctly, doesn't the JVM cache DNS entries forever until it is
>>>>>>>>>> restarted? We haven't specifically turned DNS caching off on our
>>>>>>>>>> clients. So this solution would require us to restart the clients to
>>>>>>>>>> see the new list of zookeeper hosts.
>>>>>>>>>>
>>>>>>>>>> Another thought is to use DNS RR and have the client zk url have one
>>>>>>>>>> name that resolves to and returns a list of IPs to the zookeeper
>>>>>>>>>> client. This has the advantage of being able to perform hardware
>>>>>>>>>> migration without changing the client connection url, in the future.
>>>>>>>>>> Do people have thoughts about using a DNS RR ?
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Neha
>>>>>>>>>>
>>>>>>>>>> On Tue, Dec 20, 2011 at 1:06 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:
>>>>>>>>>>> In particular, aren't you using DNS names?  If you are, then you can
>>>>>>>>>>>
>>>>>>>>>>> - expand the quorum with the new hardware on new IP addresses,
>>>>>>>>>>> - switch DNS,
>>>>>>>>>>> - wait for caches to die,
>>>>>>>>>>> - restart applications without reconfig or otherwise force new
>>>>>>>>>>>   connections,
>>>>>>>>>>> - decrease quorum size again
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Dec 20, 2011 at 12:26 PM, Camille Fournier <cami...@apache.org> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> How are your clients configured to find the zks now? How many
>>>>>>>>>>>> clients do you have?
>>>>>>>>>>>>
>>>>>>>>>>>> From my phone
>>>>>>>>>>>> On Dec 20, 2011 3:14 PM, "Neha Narkhede" <neha.narkh...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> As part of upgrading to Zookeeper 3.3.4, we also have to migrate our
>>>>>>>>>>>>> zookeeper cluster to new hardware. I'm trying to figure out the best
>>>>>>>>>>>>> strategy to achieve that with no downtime.
>>>>>>>>>>>>> Here are some possible solutions I see at the moment, I could have
>>>>>>>>>>>>> missed a few though -
>>>>>>>>>>>>>
>>>>>>>>>>>>> 1. Swap each machine out with a new machine, but with the same
>>>>>>>>>>>>> host/IP.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Pros: No client side config needs to be changed.
>>>>>>>>>>>>> Cons: Relatively tedious task for Operations
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2. Add new machines, with different host/IPs to the existing
>>>>>>>>>>>>> cluster, and remove the older machines, taking care to maintain the
>>>>>>>>>>>>> quorum at all times
>>>>>>>>>>>>>
>>>>>>>>>>>>> Pros: Easier for Operations
>>>>>>>>>>>>> Cons: Client side configs need to be changed and clients need to be
>>>>>>>>>>>>> restarted/bounced. Another problem is having a large quorum for
>>>>>>>>>>>>> some time (potentially 9 nodes).
>>>>>>>>>>>>>
>>>>>>>>>>>>> 3. Hide the new cluster behind either a Hardware load balancer or a
>>>>>>>>>>>>> DNS server resolving to all host ips.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Pros: Makes it easier to move hardware around in the future
>>>>>>>>>>>>> Cons: Possible timeout issues with load balancers messing with
>>>>>>>>>>>>> zookeeper functionality or performance
>>>>>>>>>>>>>
>>>>>>>>>>>>> Read this and found it helpful -
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://apache.markmail.org/message/44tbj53q2jufplru?q=load+balancer+list:org%2Eapache%2Ehadoop%2Ezookeeper-user&page=1
>>>>>>>>>>>>> But would like to hear from the authors and the users who might have
>>>>>>>>>>>>> tried this in a real production setup.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'm very interested in finding a long term solution for masking the
>>>>>>>>>>>>> zookeeper host names. Any inputs here are appreciated !
>>>>>>>>>>>>>
>>>>>>>>>>>>> In addition to this, it will also be great to know what people think
>>>>>>>>>>>>> about options 1 and 2, as a solution for hardware changes in
>>>>>>>>>>>>> Zookeeper.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Neha
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>
>>>>>>

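A rough sketch of the re-resolve-on-reconnect idea from the thread, for the case where the client is handed a list of host names (one alias per server). ReResolvingHostList and its Resolver interface are hypothetical names made up for illustration; they are not part of the ZooKeeper client. The sketch deliberately ignores the DNS RR case (one name resolving to many IPs), which, as Camille points out above, would need extra logic to keep the round robin fair.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration only; not a ZooKeeper class. */
class ReResolvingHostList {

    /** Pluggable lookup so the sketch can be exercised without real DNS. */
    interface Resolver {
        InetAddress[] resolve(String host) throws UnknownHostException;
    }

    private final List<String> hosts;
    private final Resolver resolver;
    private int next = 0;

    ReResolvingHostList(List<String> hosts, Resolver resolver) {
        if (hosts.isEmpty()) {
            throw new IllegalArgumentException("need at least one host name");
        }
        this.hosts = new ArrayList<>(hosts);
        this.resolver = resolver;
    }

    /** Uses the JVM resolver, so the JVM's own DNS cache still applies. */
    ReResolvingHostList(List<String> hosts) {
        this(hosts, InetAddress::getAllByName);
    }

    /**
     * Meant to be called on every reconnect attempt: walks the host names
     * round robin and resolves the chosen name at call time, instead of
     * reusing an IP that was resolved once at startup.
     */
    synchronized InetAddress nextAddress() throws UnknownHostException {
        String host = hosts.get(next);
        next = (next + 1) % hosts.size();
        InetAddress[] addrs = resolver.resolve(host);
        if (addrs.length == 0) {
            throw new UnknownHostException(host);
        }
        // Assumption: one alias per server, so the first address is enough.
        return addrs[0];
    }
}
```

With the single-argument constructor, lookups go through InetAddress.getAllByName, so a re-aliased name is only picked up once the JVM's DNS cache entry expires; that lifetime is governed by the networkaddress.cache.ttl security property.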