[GitHub] zookeeper issue #451: ZOOKEEPER-2184: Zookeeper Client should re-resolve hos...

2018-07-23 Thread ijuma
Github user ijuma commented on the issue:

https://github.com/apache/zookeeper/pull/451
  
Really happy to see this in a released version, thanks everyone.

@anmolnar after testing this with Apache Kafka, it seems like the 
combination of a reasonably long backoff (1 second) with randomization when the 
list is small can cause connection timeouts in some cases that one would not 
expect. The concrete example we saw was usage of localhost which caused ipv4 
and ipv6 addresses to be resolved while the only the ipv4 one worked. We had 
tests that would fail quite often with a connection timeout of 6 seconds 
because the ipv6 one would be picked continuously.

@rajinisivaram filed a ZooKeeper issue to track a potential future 
improvement to this logic:
https://issues.apache.org/jira/browse/ZOOKEEPER-3100


---


[GitHub] zookeeper issue #451: ZOOKEEPER-2184: Zookeeper Client should re-resolve hos...

2018-05-30 Thread ijuma
Github user ijuma commented on the issue:

https://github.com/apache/zookeeper/pull/451
  
Also, as I'm sure many are wondering, a discussion about the next release 
including this fix is taking place in the mailing list:


https://lists.apache.org/thread.html/8d3dcd84b0ad47563e2cf2619eb0775066ff975453865938dd4e7380@%3Cdev.zookeeper.apache.org%3E


---


[GitHub] zookeeper issue #451: ZOOKEEPER-2184: Zookeeper Client should re-resolve hos...

2018-05-30 Thread ijuma
Github user ijuma commented on the issue:

https://github.com/apache/zookeeper/pull/451
  
Thanks @anmolnar, @fpj and @hanm for getting this in.


---


[GitHub] zookeeper issue #451: ZOOKEEPER-2184: Zookeeper Client should re-resolve hos...

2018-05-28 Thread anmolnar
Github user anmolnar commented on the issue:

https://github.com/apache/zookeeper/pull/451
  
Thanks @hanm 


---


[GitHub] zookeeper issue #451: ZOOKEEPER-2184: Zookeeper Client should re-resolve hos...

2018-05-28 Thread hanm
Github user hanm commented on the issue:

https://github.com/apache/zookeeper/pull/451
  
Committed to branch-3.4. Please close the pull request @anmolnar.

Please port it to branch-3.5 and master (if you like :-), and create 
necessary JIRAs for auxiliary tasks (documentation update, etc). 


---


[GitHub] zookeeper issue #451: ZOOKEEPER-2184: Zookeeper Client should re-resolve hos...

2018-05-28 Thread anmolnar
Github user anmolnar commented on the issue:

https://github.com/apache/zookeeper/pull/451
  
@hanm I updated the pull request according to your suggestions. Please take 
a look.
I also updated javadoc accordingly. ZooKeeper docs could be updated in a 
separate PR as suggested.


---


[GitHub] zookeeper issue #451: ZOOKEEPER-2184: Zookeeper Client should re-resolve hos...

2018-05-25 Thread hanm
Github user hanm commented on the issue:

https://github.com/apache/zookeeper/pull/451
  
The patch looks good, I think we just need to get the public API change 
issue settled before we can merge this in.


---


[GitHub] zookeeper issue #451: ZOOKEEPER-2184: Zookeeper Client should re-resolve hos...

2018-05-25 Thread hanm
Github user hanm commented on the issue:

https://github.com/apache/zookeeper/pull/451
  
There are also some initial comments about JVM DNS caching

>> Re-resolving at StaticHostProvider level may not be sufficient as 
InetAddress.getAllByName(String host) itself uses a Java-level cache inside 
InetAddress and turns to name service (e.g. DNS) only if the host could not be 
found in the Java-level cache.

I think it's a good point that we brought this up - and I think we can't do 
much here as there are multiple level of DNS caching not just in JVM but also 
in OS, switches, etc, so the best we could do here is to make sure we 
re-resolve address when necessary, as this pull request is doing, and meanwhile 
let user know this might not work in the end due to multiple level of DNS 
caching. In that case, I think some links to the Oracle documents about JVM DNS 
caching would be helpful so users don't blame ZooKeeper if things don't work, 
but I think we can do that documentation update separately after this pull 
request is merged.


---


[GitHub] zookeeper issue #451: ZOOKEEPER-2184: Zookeeper Client should re-resolve hos...

2018-05-22 Thread hanm
Github user hanm commented on the issue:

https://github.com/apache/zookeeper/pull/451
  
I am reviewing this and will merge before Memorial Day if no other issues.


---


[GitHub] zookeeper issue #451: ZOOKEEPER-2184: Zookeeper Client should re-resolve hos...

2018-05-22 Thread anmolnar
Github user anmolnar commented on the issue:

https://github.com/apache/zookeeper/pull/451
  
I agree with @ijuma 
@afine You were involved in this patch too. Are you willing to merge it in 
@fpj 's absence?


---


[GitHub] zookeeper issue #451: ZOOKEEPER-2184: Zookeeper Client should re-resolve hos...

2018-05-22 Thread ijuma
Github user ijuma commented on the issue:

https://github.com/apache/zookeeper/pull/451
  
If there are no other concerns, it would be great for this to be merged.


---


[GitHub] zookeeper issue #451: ZOOKEEPER-2184: Zookeeper Client should re-resolve hos...

2018-05-14 Thread anmolnar
Github user anmolnar commented on the issue:

https://github.com/apache/zookeeper/pull/451
  
@fpj I think this patch is ready for merging as it is. Are you still having 
concerns?


---


[GitHub] zookeeper issue #451: ZOOKEEPER-2184: Zookeeper Client should re-resolve hos...

2018-03-23 Thread anmolnar
Github user anmolnar commented on the issue:

https://github.com/apache/zookeeper/pull/451
  
@afine @ijuma I've finished refactoring of StaticHostProvider.

The implementation follows that I explained in my email as **Option-3**:

> - Do not cache IPs, 
> - Shuffle the names and resolve with getAllByName() every time when 
next() is called,
> - Use getAllByName(), but shuffle the list and return the first IP to 
properly handle hostnames associated with multiple IPs,
> - JDK's built-in caching will prevent name servers from being flooded and 
will do the re-resolution automatically when cache expires,

The `Resolver` interface which is also introduced in this patch is the 
solution for the problem that @afine raised: to properly mock out the 
`getAllByName()` call in unit tests.


---


[GitHub] zookeeper issue #451: ZOOKEEPER-2184: Zookeeper Client should re-resolve hos...

2018-03-20 Thread anmolnar
Github user anmolnar commented on the issue:

https://github.com/apache/zookeeper/pull/451
  
@ijuma I feel your pain. :)
No worries, I'm on it. Doing my best to push committers and others to 
review changes.

Additionally I'd like to make a small refactoring to the proposed logic 
before committing, because I'm not entirely convinced about this manual 
caching/shuffling logic that was implemented originally.

You can see the details in one of my comments above and on the `dev` 
mailing list.


---


[GitHub] zookeeper issue #451: ZOOKEEPER-2184: Zookeeper Client should re-resolve hos...

2018-03-20 Thread ijuma
Github user ijuma commented on the issue:

https://github.com/apache/zookeeper/pull/451
  
@anmolnar, it's been more than 1 month since the last comment on this PR. 
Is there anything that still needs to be addressed? The original PR was 
submitted in January 2017 and it got stalled after a while, I'm hoping the same 
doesn't happen here. :)


---


[GitHub] zookeeper issue #451: ZOOKEEPER-2184: Zookeeper Client should re-resolve hos...

2018-03-09 Thread anmolnar
Github user anmolnar commented on the issue:

https://github.com/apache/zookeeper/pull/451
  
@ijuma Sure, no problem.
I'm waiting for some feedback from the community here and on the mailing 
list and I hope I can commit soon.


---


[GitHub] zookeeper issue #451: ZOOKEEPER-2184: Zookeeper Client should re-resolve hos...

2018-03-08 Thread ijuma
Github user ijuma commented on the issue:

https://github.com/apache/zookeeper/pull/451
  
Thanks for picking this up @anmolnar, looking forward to this being fixed. 
:)


---


[GitHub] zookeeper issue #451: ZOOKEEPER-2184: Zookeeper Client should re-resolve hos...

2018-02-08 Thread anmolnar
Github user anmolnar commented on the issue:

https://github.com/apache/zookeeper/pull/451
  
Going one step back I wonder why we try to deal with multiple addresses at 
all.

HostProvider should just make a transformation from unresolved 
InetSocketAddresses to resolved InetSocketAddresses. The easiest way as I can 
see it is to create a new instance of InetSocketAddress if the input is 
unresolved every time `next()` is called. Otherwise just pass it through.

JVM will deal with the rest: resolution, caching and re-resolution once the 
cache is expires (30 secs).


---


[GitHub] zookeeper issue #451: ZOOKEEPER-2184: Zookeeper Client should re-resolve hos...

2018-02-02 Thread anmolnar
Github user anmolnar commented on the issue:

https://github.com/apache/zookeeper/pull/451
  
@jeffwidman 

> Should this PR be targeting branch-3.4 or target trunk and then backport 
to the 3.4 series?

The original PR targets 3.4 which is explained in a comment from @fpj on 
the jira:

https://issues.apache.org/jira/browse/ZOOKEEPER-2184?focusedCommentId=15823099=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15823099

> For the 3.5 branch, we will need a different patch because of the 
reconfiguration changes to StaticHostProvider. I'll work on it once the 3.4 
patch gets a +1.



---


[GitHub] zookeeper issue #451: ZOOKEEPER-2184: Zookeeper Client should re-resolve hos...

2018-02-01 Thread jeffwidman
Github user jeffwidman commented on the issue:

https://github.com/apache/zookeeper/pull/451
  
Should this PR be targeting `branch-3.4` or target `trunk` and then 
backport to the 3.4 series?


---


[GitHub] zookeeper issue #451: ZOOKEEPER-2184: Zookeeper Client should re-resolve hos...

2018-02-01 Thread anmolnar
Github user anmolnar commented on the issue:

https://github.com/apache/zookeeper/pull/451
  
@phunt @afine Did you have a chance to take a look?
I think we've addressed all issues that were mentioned in the original PR.


---


[GitHub] zookeeper issue #451: ZOOKEEPER-2184: Zookeeper Client should re-resolve hos...

2018-01-26 Thread mfenes
Github user mfenes commented on the issue:

https://github.com/apache/zookeeper/pull/451
  
Looking at the static initialization block in InetAddressCachePolicy more 
deeply, the default TTL is 30 seconds if there is no SecurityManager installed.
So caching a positive lookup forever in the Java-level cache is the default 
only if there is a SecurityManager installed and the TTL is not overridden by 
"networkaddress.cache.ttl" to a different value.
Default caching policy for a negative lookup is 0 (never cache).
Now the only question is whether 30 seconds default caching is ok or too 
much for ZK.


---


[GitHub] zookeeper issue #451: ZOOKEEPER-2184: Zookeeper Client should re-resolve hos...

2018-01-26 Thread anmolnar
Github user anmolnar commented on the issue:

https://github.com/apache/zookeeper/pull/451
  
Just confirmed on 3.4 branch: ZK uses 30 secs cache TTL on my mac.


---


[GitHub] zookeeper issue #451: ZOOKEEPER-2184: Zookeeper Client should re-resolve hos...

2018-01-26 Thread anmolnar
Github user anmolnar commented on the issue:

https://github.com/apache/zookeeper/pull/451
  
@mfenes The only solution I can think of is to set DNS cache TTL 
`networkaddress.cache.ttl` to a configurable, non-infinite value.


---


[GitHub] zookeeper issue #451: ZOOKEEPER-2184: Zookeeper Client should re-resolve hos...

2018-01-26 Thread mfenes
Github user mfenes commented on the issue:

https://github.com/apache/zookeeper/pull/451
  
Re-resolving at StaticHostProvider level may not be sufficient as 
InetAddress.getAllByName(String host) itself uses a Java-level cache inside 
InetAddress and turns to name service (e.g. DNS) only if the host could not be 
found in the Java-level cache.
Unfortunately, when Java resolves a new host using the name service, it 
puts the host and its addresses in the cache with TTL cache FOREVER. 
This means, once a host gets resolved by Java, it will never again turn to 
the name service to re-resolve it. If a host's addresses get updated in DNS, 
the address cache in Java will still contain the old entry forever.
So re-resolving at StaticHostProvider won't help in this case, as 
InetAddress.getAllByName(String host) will still return the old address(es) I 
think.
Check the getCachedAddresses method inside InetAddress, the get() method of 
static final class Cache inside InetAddress and 
sun.net.InetAddressCachePolicy.get() which returns cachePolicy with default 
value -1 (FOREVER) if it is not overridden by Security properties 
"networkaddress.cache.ttl" and "networkaddress.cache.negative.ttl".


---