Re: ZK crashing with NPE ( WARN Exception caught (org.apache.zookeeper.server.NettyServerCnxnFactory) [nioEventLoopGroup-4-2] java.lang.NullPointerException

2022-05-03 Thread Flavio Junqueira
Hello Sreejesh,

My read is that the servers are unable to connect to each other for leader 
election.

-Flavio
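
One way to test that reading is to check, from inside each ZooKeeper pod, whether the election port of every other peer is reachable. The following is only a minimal sketch in Python: the helper names (parse_servers, can_connect, unreachable_peers) are made up for illustration, it assumes the server.N=host:quorumPort:electionPort layout shown in the quoted configuration, and it does not handle bracketed IPv6 hosts. A successful plain TCP connect only shows that the port is reachable; with sslQuorum=true the TLS handshake can still fail afterwards.

```python
import re
import socket

# Matches entries such as:
#   server.1=host:2888:3888:participant;127.0.0.1:12181
SERVER_LINE = re.compile(r"^server\.(\d+)=([^:]+):(\d+):(\d+)")


def parse_servers(config_text):
    """Return {server_id: (host, quorum_port, election_port)}."""
    servers = {}
    for line in config_text.splitlines():
        match = SERVER_LINE.match(line.strip())
        if match:
            sid, host, quorum, election = match.groups()
            servers[int(sid)] = (host, int(quorum), int(election))
    return servers


def can_connect(host, port, timeout=3.0):
    """True if a plain TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers DNS failures, refusals and timeouts
        return False


def unreachable_peers(servers, my_id):
    """Ids of the other servers whose election port cannot be reached."""
    return [
        sid
        for sid, (host, _quorum, election) in sorted(servers.items())
        if sid != my_id and not can_connect(host, election)
    ]
```

Run on the node with myid=2, unreachable_peers(parse_servers(open("/tmp/zookeeper.properties").read()), 2) would list the peers that cannot be reached on port 3888; a non-empty result would be consistent with the election never completing.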

> On 3 May 2022, at 15:50, Sreejesh Radhakrishnan wrote:
> 
> Classification: Public
> 
> Hi
>  
> Sorry, I am not sure if this is the right list, but I am trying my luck.
>  
> I am getting an NPE, as noted in the subject line. I am using Strimzi Kafka 
> on GKE (1.21.9-gke.1002) with n1-standard-4 infrastructure.
>  
> ZooKeeper is throwing the error below. Is this something you have seen, or 
> do you have an idea of what is causing it?
>  
> get ZK log
>  
> Detected Zookeeper ID 2
> Preparing truststore
> Adding /opt/kafka/cluster-ca-certs/ca.crt to truststore 
> /tmp/zookeeper/cluster.truststore.p12 with alias ca
> Certificate was added to keystore
> Preparing truststore is complete
> Looking for the right CA
> Found the right CA: /opt/kafka/cluster-ca-certs/ca.crt
> Preparing keystore for client and quorum listeners
> Preparing keystore for client and quorum listeners is complete
> Starting Zookeeper with configuration:
> # The directory where the snapshot is stored.
> dataDir=/var/lib/zookeeper/data
>  
> # Other options
> 4lw.commands.whitelist=*
> standaloneEnabled=false
> reconfigEnabled=true
> clientPort=12181
> clientPortAddress=127.0.0.1
>  
> # TLS options
> serverCnxnFactory=org.apache.zookeeper.server.NettyServerCnxnFactory
> ssl.clientAuth=need
> ssl.quorum.clientAuth=need
> secureClientPort=2181
> sslQuorum=true
>  
> ssl.trustStore.location=/tmp/zookeeper/cluster.truststore.p12
> ssl.trustStore.***
> ssl.trustStore.type=PKCS12
> ssl.quorum.trustStore.location=/tmp/zookeeper/cluster.truststore.p12
> ssl.quorum.trustStore.***
> ssl.quorum.trustStore.type=PKCS12
>  
> ssl.keyStore.location=/tmp/zookeeper/cluster.keystore.p12
> ssl.keyStore.***
> ssl.keyStore.type=PKCS12
> ssl.quorum.keyStore.location=/tmp/zookeeper/cluster.keystore.p12
> ssl.quorum.keyStore.***
> ssl.quorum.keyStore.type=PKCS12
>  
> # Provided configuration
> tickTime=2000
> initLimit=5
> syncLimit=2
> autopurge.purgeInterval=1
>  
>  
> # Zookeeper nodes configuration
> server.1=uwq-cluster-zookeeper-0.uwq-cluster-zookeeper-nodes.uwq-kafka.svc:2888:3888:participant;127.0.0.1:12181
> server.2=uwq-cluster-zookeeper-1.uwq-cluster-zookeeper-nodes.uwq-kafka.svc:2888:3888:participant;127.0.0.1:12181
> server.3=uwq-cluster-zookeeper-2.uwq-cluster-zookeeper-nodes.uwq-kafka.svc:2888:3888:participant;127.0.0.1:12181
>  
> + exec /usr/bin/tini -w -e 143 -- /opt/kafka/bin/zookeeper-server-start.sh 
> /tmp/zookeeper.properties
> 2022-05-03 09:15:52,181 INFO Reading configuration from: 
> /tmp/zookeeper.properties 
> (org.apache.zookeeper.server.quorum.QuorumPeerConfig) [main]
> 2022-05-03 09:15:52,193 INFO clientPortAddress is 127.0.0.1:12181 
> (org.apache.zookeeper.server.quorum.QuorumPeerConfig) [main]
> 2022-05-03 09:15:52,193 INFO secureClientPortAddress is 0.0.0.0:2181 
> (org.apache.zookeeper.server.quorum.QuorumPeerConfig) [main]
> 2022-05-03 09:15:52,197 INFO Setting -D 
> jdk.tls.rejectClientInitiatedRenegotiation=true to disable client-initiated 
> TLS renegotiation (org.apache.zookeeper.common.X509Util) [main]
> 2022-05-03 09:15:52,198 INFO observerMasterPort is not set 
> (org.apache.zookeeper.server.quorum.QuorumPeerConfig) [main]
> 2022-05-03 09:15:52,198 INFO metricsProvider.className is 
> org.apache.zookeeper.metrics.impl.DefaultMetricsProvider 
> (org.apache.zookeeper.server.quorum.QuorumPeerConfig) [main]
> 2022-05-03 09:15:52,233 INFO autopurge.snapRetainCount set to 3 
> (org.apache.zookeeper.server.DatadirCleanupManager) [main]
> 2022-05-03 09:15:52,233 INFO autopurge.purgeInterval set to 1 
> (org.apache.zookeeper.server.DatadirCleanupManager) [main]
> 2022-05-03 09:15:52,241 INFO Log4j 1.2 jmx support found and enabled. 
> (org.apache.zookeeper.jmx.ManagedUtil) [main]
> 2022-05-03 09:15:52,235 INFO Purge task started. 
> (org.apache.zookeeper.server.DatadirCleanupManager) [PurgeTask]
> 2022-05-03 09:15:52,266 INFO zookeeper.snapshot.trust.empty : false 
> (org.apache.zookeeper.server.persistence.FileTxnSnapLog) [PurgeTask]
> 2022-05-03 09:15:52,268 INFO Starting quorum peer, myid=2 
> (org.apache.zookeeper.server.quorum.QuorumPeerMain) [main]
> 2022-05-03 09:15:52,291 INFO zookeeper.snapshot.compression.method = CHECKED 
> (org.apache.zookeeper.server.persistence.SnapStream) [PurgeTask]
> 2022-05-03 09:15:52,298 INFO Purge task completed. 
> (org.apache.zookeeper.server.DatadirCleanupManager) [PurgeTask]
> 2022-05-03 09:15:52,306 INFO ServerMetrics initialized with provider 
> org.apache.zookeeper.metrics.impl.DefaultMetricsProvider@5276e6b0 
> (org.apache.zookeeper.server.ServerMetrics) [main]
> 2022-05-03 09:15:52,404 INFO zookeeper.client.portUnification=false 
> (org.apache.zookeeper.server.NettyServerCnxnFactory) [main]
> 2022-05-03 09:15:52,404 INFO zookeeper.netty.advancedFlowControl.enabled = 
> false 

Re: Logback

2021-12-15 Thread Flavio Junqueira
We use logback in Pravega, it works fine for us. I'd be ok with the change.

-Flavio

> On 15 Dec 2021, at 12:02, Andor Molnar  wrote:
> 
> Hi ZK folks,
> 
> What do you think about migrating ZK to logback?
> The idea just crossed my mind due to the recent turbulence with log4j.
> 
> Checking some migration guides, it doesn't seem like the end of the world.
> 
> Andor
> 



Re: Contribution:Official document notes

2021-05-11 Thread Flavio Junqueira
Hi Mo Qian,

I think it would be great to have you contribute your notes. It would be good 
to find someone to review it, though.

-Flavio

> On 11 May 2021, at 11:09, 李新婷  wrote:
> 
> Hello!
>  I'm currently learning from zookeeper's official documents and taking 
> notes (https://blog.csdn.net/moqianmoqian/category_11015863.html). The notes 
> are relatively simple (in Chinese). I feel that they can help others read the 
> official documents and understand ZooKeeper, so I want to contribute these 
> articles (still being updated) to more people through you. Thanks for reading 
> my letter; I look forward to your reply.
> Wishing you happiness every day! 
> Mo Qian



Re: [VOTE] Apache ZooKeeper release 3.6.3 candidate 2

2021-04-12 Thread Flavio Junqueira
+1

- Built from sources locally (there are a few flaky tests, though)
- Verified digest and signature
- Ran rat for license issues
- Checked NOTICE and LICENSE files
- Resolved zk dependency from staging repo
- Ran a few smoke tests locally using the cli tool

-Flavio
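
For anyone repeating the digest step, here is a minimal sketch in Python. It assumes the release candidate ships a .sha512 file in the "<hex digest>  <file name>" layout written by sha512sum; the function names are made up for illustration. It covers the digest only: verifying the signature still needs GnuPG and the project's KEYS file.

```python
import hashlib


def sha512_of(path, chunk_size=1 << 20):
    """Stream the file through SHA-512 and return the hex digest."""
    digest = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def digest_matches(artifact_path, checksum_path):
    """Compare the artifact's digest with the first token of the checksum file.

    Assumption: the checksum file starts with the hex digest, as in the
    output of sha512sum. Other layouts would need different parsing.
    """
    with open(checksum_path) as f:
        expected = f.read().split()[0].lower()
    return sha512_of(artifact_path) == expected
```

Calling digest_matches on a downloaded tarball and its .sha512 file returns True only when the two digests agree.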

> On 12 Apr 2021, at 18:15, Patrick Hunt  wrote:
> 
> +1 xsum/sig validate. rat ran clean. I ran dependency check and
> launched/tested a few different cluster sizes manually and they all ran
> fine.
> 
> Patrick
> 
> On Thu, Apr 8, 2021 at 10:19 AM Mohammad Arshad  wrote:
> 
>> This is a bug fix release candidate for 3.6.3. It fixes 52 issues,
>> including multiple CVE fixes.
>> 
>> The full release notes are available at:
>> 
>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310801&version=12348703
>> 
>>  Please download, test and vote by Sunday, April 11th 2021, 23:59
>> UTC+0. 
>> 
>> Source and binary files:
>> https://people.apache.org/~arshad/zookeeper-3.6.3-candidate-2/
>> 
>> Maven staging repo:
>> https://repository.apache.org/content/repositories/orgapachezookeeper-1071
>> 
>> The release candidate tag in git to be voted upon: release-3.6.3-2
>> https://github.com/apache/zookeeper/tree/release-3.6.3-2
>> 
>> ZooKeeper's KEYS file containing PGP keys we use to sign the release:
>> https://www.apache.org/dist/zookeeper/KEYS
>> 
>> The staging version of the website is:
>> https://people.apache.org/~arshad/zookeeper-3.6.3-candidate-2/website/
>> 
>> Should we release this candidate?
>> 
>> Thanks & Regards
>> Arshad
>> 



Re: rebase and retest on github

2021-01-28 Thread Flavio Junqueira
I do it from the command line and push. You can do it from an IDE too, but I'm 
not aware of a way of doing it directly on GitHub.

-Flavio
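
A minimal sketch of what the command-line route could look like, written as a small Python wrapper around git. The remote names ("upstream" for apache/zookeeper, "origin" for the contributor's fork), the base branch and the PR branch name are assumptions for illustration, not something taken from this thread. Pushing the rebased branch updates the pull request, which normally causes the checks to run again.

```python
import subprocess


def rebase_commands(pr_branch, upstream_remote="upstream",
                    base_branch="master", fork_remote="origin"):
    """Build the git commands that rebase a PR branch onto the latest base."""
    return [
        ["git", "fetch", upstream_remote],
        ["git", "checkout", pr_branch],
        ["git", "rebase", f"{upstream_remote}/{base_branch}"],
        # --force-with-lease refuses to overwrite commits pushed by someone
        # else since the last fetch, unlike a plain --force.
        ["git", "push", "--force-with-lease", fork_remote, pr_branch],
    ]


def run_all(commands, cwd="."):
    """Run each command in order and stop at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, cwd=cwd, check=True)
```

Only someone with push access to the PR branch (the author, or a committer if the author allowed maintainer edits) can run the last step.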

> On 29 Jan 2021, at 07:34, Benjamin Reed  wrote:
> 
> i would really like to get ZOOKEEPER-3922 but it needs to be rebased
> and retested. is there a nice way to do that on github? or does the
> pull requestor need to do that?
> 
> happy new year,
> ben



Test failures on 3.7.0 RC1 (was: Re: [VOTE] Apache ZooKeeper release 3.7.0 candidate 1)

2021-01-25 Thread Flavio Junqueira
I don't want to mess up the vote thread, so I'm responding to this 
separately. I have also been trying to build locally, without success. I've 
been trying on an Ubuntu VM, Java 8 (build 1.8.0_181-b13), Maven 3.6.0. The 
set of failing tests varies from build to build. If it makes sense, I can try 
to collect all the test failures I have seen and post them.

-Flavio
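
Collecting failures across builds can be scripted. The following is only a sketch in Python: it assumes each build's Surefire XML reports (the TEST-*.xml files under target/surefire-reports) were copied to a separate directory, and the function names are made up for illustration. It counts in how many builds each test case failed, which helps separate flaky tests from consistently broken ones.

```python
import glob
import os
import xml.etree.ElementTree as ET
from collections import Counter


def failed_tests(report_dir):
    """Names of test cases that have a <failure> or <error> child element."""
    failed = []
    for path in sorted(glob.glob(os.path.join(report_dir, "TEST-*.xml"))):
        root = ET.parse(path).getroot()
        for case in root.iter("testcase"):
            if case.find("failure") is not None or case.find("error") is not None:
                failed.append(f"{case.get('classname')}.{case.get('name')}")
    return failed


def tally(report_dirs):
    """Count, per test case, the number of builds in which it failed."""
    counts = Counter()
    for report_dir in report_dirs:
        counts.update(set(failed_tests(report_dir)))
    return counts
```

tally(["build1", "build2", "build3"]).most_common() would then list the tests that fail most often first. A forked JVM that crashes may leave no report at all, so this cannot see failures like the "forked VM terminated" errors quoted below.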

> On 25 Jan 2021, at 13:38, Szalay-Bekő Máté  wrote:
> 
> +0 (and not even binding :) )
> 
> - I built the source code (-Pfull-build) on Ubuntu 18.04.3 using OpenJDK
> 8u265 and maven 3.6.3.
> - I also built and executed unit tests for zkpython
> - the unit tests passed for the C-client and for python client
> - checkstyle and spotbugs passed
> - apache-rat passed
> - owasp (CVE check) passed
> - fatjar built (-Pfatjar)
> - I executed a quick rolling-upgrade test from 3.5.9 and from 3.6.2. (using
> https://github.com/symat/zk-rolling-upgrade-test)
> 
> for some reason the java unit tests failed for me.
> 
> On mac (jdk 1.8.212 and maven 3.6.3), I got all the unit tests executed
> successfully, but then the maven job still failed for hbase-server test
> with error message (with -DforkCount=4 and even with -DforkCount=1) like:
> -
> [ERROR] ExecutionException There was an error in the forked process
> [ERROR] unable to create new native thread
> [ERROR] org.apache.maven.surefire.booter.SurefireBooterForkException:
> ExecutionException There was an error in the forked process
> [ERROR] unable to create new native thread
> [ERROR] at
> org.apache.maven.plugin.surefire.booterclient.ForkStarter.awaitResultsDone(ForkStarter.java:510)
> -
> 
> 
> Then I tried on a dockerized environment (Ubuntu 18.04, OpenJDK 8u265 and
> maven 3.6.3) and I got other kinds of strange maven errors:
> ---
> [ERROR] Caused by:
> org.apache.maven.surefire.booter.SurefireBooterForkException: The forked VM
> terminated without properly saying goodbye. VM crash or System.exit called?
> [ERROR] Command was /bin/sh -c cd
> /tmp/zk/apache-zookeeper-3.7.0/zookeeper-server &&
> /home/symat/.sdkman/candidates/java/8.0.265-open/jre/bin/java -Xmx512m
> -Dtest.junit.threads=8 -Dzookeeper.junit.threadid=3
> -javaagent:/home/symat/.m2/repository/org/jmockit/jmockit/1.48/jmockit-1.48.jar
> -jar
> /tmp/zk/apache-zookeeper-3.7.0/zookeeper-server/target/surefire/surefirebooter8828313385463488429.jar
> /tmp/zk/apache-zookeeper-3.7.0/zookeeper-server/target/surefire
> 2021-01-25T11-54-03_621-jvmRun3 surefire4024538135165099286tmp
> surefire_37800399112966511000tmp
> [ERROR] Process Exit Code: 0
> [ERROR] at
> org.apache.maven.plugin.surefire.booterclient.ForkStarter.fork(ForkStarter.java:669)
> [ERROR] at
> org.apache.maven.plugin.surefire.booterclient.ForkStarter.access$600(ForkStarter.java:115)
> [ERROR] at
> org.apache.maven.plugin.surefire.booterclient.ForkStarter$2.call(ForkStarter.java:444)
> [ERROR] at
> org.apache.maven.plugin.surefire.booterclient.ForkStarter$2.call(ForkStarter.java:420)
> [ERROR] at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> [ERROR] at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> [ERROR] at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> [ERROR] at java.lang.Thread.run(Thread.java:748)
> [ERROR]
> --
> 
> 
> These issues might very well be specific to my local (mac or docker on mac)
> environments. This is why I didn't vote with -1.
> Can someone else run the java unit tests successfully locally?
> 
> I also tried to check if the CI was green for the last PR on 3.7.0 (
> https://github.com/apache/zookeeper/pull/1586/checks), but it looks like the CI
> hasn't even started to execute the tests, due to errors in the "install C
> dependencies" step.
> 
> Regards,
> Mate
> 
> On Sun, Jan 24, 2021 at 11:39 PM Patrick Hunt  wrote:
> 
>> +1. xsum/sig verified. rat ran clean. built and dependency checks are fine.
>> Tried running some manual clusters and it was successful.
>> 
>> Regards,
>> 
>> Patrick
>> 
>> 
>> On Sun, Jan 24, 2021 at 12:11 PM Damien Diederen 
>> wrote:
>> 
>>> 
>>> Dear all,
>>> 
>>> This is a second release candidate for ZooKeeper 3.7.0.  Compared to
>>> RC0, it fixes a tarball generation issue, includes a description of the
>>> 'whoami' CLI command, and incorporates a contribution to ZooInspector.
>>> 
>>> ZooKeeper 3.7.0 introduces a number of new features, notably:
>>> 
>>>  * An API to start a ZooKeeper server from Java (ZOOKEEPER-3874);
>>> 
>>>  * Quota enforcement (ZOOKEEPER-3301);
>>> 
>>>  * Host name canonicalization in quorum SASL authentication
>>> (ZOOKEEPER-4030);
>>> 
>>>  * Support for BCFKS key/trust store format (ZOOKEEPER-3950);
>>> 
>>>  * A choice of mandatory authentication scheme(s) (ZOOKEEPER-3561);
>>> 
>>>  * A "whoami" API and CLI command (ZOOKEEPER-3969);
>>> 
>>>  * The possibility of disabling digest 

Re: ZooKeeper Operator

2021-01-18 Thread Flavio Junqueira
It sounds like a good idea to document it and add relevant pointers, Pat.

-Flavio

> On 18 Jan 2021, at 19:00, Patrick Hunt  wrote:
> 
> FYI: The awesome operator list has a few including Pravega:
> https://github.com/operator-framework/awesome-operators
> 
> I've seen a few more while investigating kubebuilder, operator-sdk (rh) and
> the like:
> https://github.com/Ghostbaby/zookeeper-operator
> 
> Perhaps the first thing we might consider is adding a wiki page detailing
> the available options and insights from the community? Esp if folks are
> using them. Similar to the client and tools pages:
> https://cwiki.apache.org/confluence/display/ZOOKEEPER/ZKClientBindings
> https://cwiki.apache.org/confluence/display/ZOOKEEPER/UsefulTools
> 
> Patrick
> 
> On Mon, Jan 18, 2021 at 2:36 AM Enrico Olivelli  wrote:
> 
>> Thanks for sharing!
>> 
>> We need more support for K8s in the OS community, this is a good step
>> 
>> Enrico
>> 
>> On Mon, 18 Jan 2021 at 11:18, Flavio Junqueira wrote:
>> 
>>> We've been getting questions and sometimes contributions to the ZooKeeper
>>> Kubernetes Operator we originally did for Pravega, so I feel that it has
>>> been useful more broadly. Perhaps this is something that others might be
>>> interested in too, and I thought of mentioning it here.
>>> 
>>> https://github.com/pravega/zookeeper-operator
>>> 
>>> Thanks,
>>> -Flavio
>> 



ZooKeeper Operator

2021-01-18 Thread Flavio Junqueira
We've been getting questions and sometimes contributions to the ZooKeeper 
Kubernetes Operator we originally did for Pravega, so I feel that it has been 
useful more broadly. Perhaps this is something that others might be interested 
in too, and I thought of mentioning it here.

https://github.com/pravega/zookeeper-operator

Thanks,
-Flavio  

Re: [ANNOUNCE] Apache ZooKeeper 3.5.9

2021-01-18 Thread Flavio Junqueira
+1

-Flavio

> On 18 Jan 2021, at 09:16, Szalay-Bekő Máté  wrote:
> 
> Thank you Norbert for driving this! :)
> 
> Regards,
> Mate
> 
> On Fri, Jan 15, 2021 at 4:04 PM Norbert Kalmar  wrote:
> 
>> The Apache ZooKeeper team is proud to announce Apache ZooKeeper version
>> 3.5.9
>> 
>> ZooKeeper is a high-performance coordination service for distributed
>> applications. It exposes common services - such as naming,
>> configuration management, synchronization, and group services - in a
>> simple interface so you don't have to write them from scratch. You can
>> use it off-the-shelf to implement consensus, group management, leader
>> election, and presence protocols. And you can build on it for your
>> own, specific needs.
>> 
>> For ZooKeeper release details and downloads,
>> visit:https://zookeeper.apache.org/releases.html
>> 
>> ZooKeeper 3.5.9 Release Notes are
>> at:https://zookeeper.apache.org/doc/r3.5.9/releasenotes.html
>> 
>> We would like to thank the contributors that made the release possible.
>> 
>> Regards,
>> The ZooKeeper Team
>> 



Re: [ANNOUNCE] New Committer: Damien Diederen

2020-10-28 Thread Flavio Junqueira
Congrats, Damien.

-Flavio


> On 28 Oct 2020, at 12:59, Szalay-Bekő Máté  wrote:
> 
> Congratulations Damien! :)
> 
> On Wed, Oct 28, 2020 at 12:37 PM Tamas Penzes 
> wrote:
> 
>> Congrats Damien!
>> 
>> On Wed, Oct 28, 2020 at 11:34 AM Enrico Olivelli 
>> wrote:
>> 
>>> The Project Management Committee (PMC) for Apache ZooKeeper
>>> has invited Damien Diederen to become a committer and we are pleased
>>> to announce that he has accepted.
>>> 
>>> Damien contributed lots of improvements and bug fixes on the ZooKeeper C
>>> client, and he also participates in the community with good code reviews
>>> and discussions on our mailing lists.
>>> For instance he is the author of the SASL support in the C client
>>> https://github.com/apache/zookeeper/pull/1134
>>> 
>>> Being a committer enables easier contribution to the
>>> project since there is no need to go via the patch
>>> submission process. This should enable better productivity.
>>> Being a PMC member enables assistance with the management
>>> and to guide the direction of the project.
>>> 
>>> Congratulations Damien !
>>> 
>>> Enrico Olivelli
>>> 
>> 



Re: [DISCUSS][PROPOSAL] Require JDK 11 to build for 3.7

2020-10-22 Thread Flavio Junqueira
There are three points that stand out for me in this thread:

- How do we determine how such a change affects our user base?
- How much effort do the different options induce with respect to maintenance?
- What's the right timeline for changes and how do we communicate them so that 
our users have enough time to prepare?

Someone mentioned a PMC vote, and I don't think this should be a closed vote, 
independent of how the conversation goes.

-Flavio

> On 22 Oct 2020, at 08:39, Alessandro Luccaroni - Diennea 
>  wrote:
> 
> Hi all,
> If I might chime in as a zookeeper user (in multiple products) and a follower 
> of the project I think the drop of Java8 support (official and/or unofficial) 
> could be a big mistake.
> 
> From my own company's point of view, we already support Java 11 in all our 
> applications, so we are not directly impacted (and we have an upgrade path for 
> older versions to provide to our customers).
> My worry lies in the (high) probability of userbase fragmentation: in 
> the recent past ZooKeeper development picked up speed thanks to a bunch of 
> new committers and PMC members after a period of mostly maintenance-focused 
> work, but the number of active committers and PMC members is still very low 
> for a project like this.
> 
> I foresee the risk of spreading the project's resources thin if we force 
> the userbase to stick to an older version and, in turn, we are forced to 
> backport many issues to the 3.6 branch.
> 
> Alessandro Luccaroni
> Platform Manager @ Diennea - MagNews
> Tel.: (+39) 0546 066100 Int. 924 - Mob.: (+39) 393 7273519
> Viale G.Marconi 30/14 - 48018 Faenza (RA) - Italy
> 
> -Original Message-
> From: Christopher 
> Sent: Thursday, 22 October 2020 05:21
> To: dev@zookeeper.apache.org
> Subject: Re: [DISCUSS][PROPOSAL] Require JDK 11 to build for 3.7
> 
> I'm happy that this discussion has been so lively! I just want to emphasize a 
> few things:
> 
> I really do understand the desire to continue to support Java 8... I get it. 
> But all the conversations around this seem based on what people are doing 
> *today*. But, ZK 3.7 is *tomorrow's* version... a
> *future* release... so it should be based more on reasonable expectations for 
> users in the future, and less based on what is happening today. I suspect 
> *most* people today are still using 3.4 anyway (it was just so stable for so 
> long...), but that shouldn't mean the developers should hold back development 
> on 3.5 and 3.6, any more than today's users of 3.5/3.6 should hold back 3.7.
> 
> Some of the opinions expressed in this discussion seem to propose a scenario 
> where users are going to be updating to "bleeding edge"
> versions of ZooKeeper, but are going to insist on using Java 8.
> Personally, I find this to be implausible. In my experience, people either 
> upgrade everything as soon as they are able to, or they upgrade each thing 
> individually, only when they are forced to. The first group will be happy to 
> move to Java 11 and ZK 3.7. The second group will probably avoid 3.7 anyway, 
> and are fine sticking with 3.6, but if they had to update to 3.7, they'd also 
> be fine updating to Java 11 if they had to in order to use 3.7. I can't 
> imagine the scenario where people are eagerly choosing to upgrade to ZK 3.7, 
> but miserly insisting on using Java 8. Perhaps that scenario exists, but it's 
> hard for me to imagine. Even so, my proposal would still support even that 
> group of people.
> 
> I think there are now effectively three proposals being discussed in this 
> thread:
> 
> 1. (Christopher's original proposal) passively support Java 8 at runtime by 
> making JDK 11 the minimum requirement to build and test.
> This scenario involves continuing to fix bugs, as they are discovered and 
> reported, that affect JDK 8, but passively, rather than proactively. This 
> proposal does *not* drop Java 8 support, but merely de-emphasizes it in 
> development of what will be 3.7 in the future, and drops the requirement to 
> do dedicated testing with Java 8. I think this is low risk, because it is 
> very unlikely that the ZK devs would introduce a bug that would affect only 
> Java 8 and the compiler wouldn't catch it... because the cross-compilation 
> features of newer JDKs are really good.
> 
> 2. (Enrico's alternate proposal) this variation of my proposal would involve 
> continuing to proactively support Java 8 by creating a dedicated testing 
> suite to test client code on Java 8. I think this is a good option, but since 
> it involves a significantly higher amount of work than option 1, I think the 
> cost-benefit analysis would show this to be not worth the effort. Also, if it 
> were implemented, it would need to be done carefully to avoid requiring 
> developers to have concurrently installed both Java 8 and Java 11 in order to 
> perform a build, because requiring Java 8 at build time while developing 
> would be worse than we have today.
> 
> 3. (Andor's preference) move to 

Re: ApacheCon Bug Bash

2020-10-02 Thread Flavio Junqueira
That's very cool. If I understand this correctly, these are not automated; 
there are real contributors behind the PRs, right? Closing the PRs would be 
harsh, so why not simply ask the contributors to create an issue and update 
the PR?

-Flavio

> On 2 Oct 2020, at 17:26, Enrico Olivelli  wrote:
> 
> Hey !
> it looks like the Bug bash has brought a few Pull Requests
> https://github.com/apache/zookeeper/pulls
> 
> Unfortunately they are not following the contribution guidelines (for
> instance there is no associated JIRA)
> https://cwiki.apache.org/confluence/display/ZOOKEEPER/HowToContribute
> 
> Most of the PRs are about trivial fixes; I am not sure if a JIRA is deserved.
> 
> What should we do?
> My proposal is to ping the contributors so that they follow the guide and then
> finally accept the patches, as Michael Han did in this patch
> https://github.com/apache/zookeeper/pull/1470
> 
> I don't want to see those patches remaining on github as low hanging fruit,
> so it is better that we decide how to work on them,
> another option is to close them as invalid (It would be a pity IMHO)
> 
> Enrico
> 
> 
> 
> On Mon, 28 Sep 2020 at 15:03, Tom DuBuisson wrote:
> 
>> Enrico,
>> That sounds great.  We'll get the repo activated.
>> 
>> Tom
>> 
>> 
>> On Sun, Sep 27, 2020, 11:11 PM Enrico Olivelli 
>> wrote:
>> 
>>> Tom
>>> Overall I think that we can move forward.
>>> 
>>> This thread has been around for a while, there are no objections, every
>>> question has been answered.
>>> 
>>> Thank you very much
>>> 
>>> I hope this activity will help in growing Zookeeper project both in code
>>> quality and with more contributions, that is to help the community to
>> grow.
>>> 
>>> Best regards
>>> 
>>> Enrico
>>> 
>>> On Mon, 28 Sep 2020 at 01:27, Tom DuBuisson wrote:
>>> 
>>>> Norbert,
>>>>
>>>> Yes, you understand that correctly.  And those analyzers are FindSecBugs,
>>>> Error Prone and Infer.  All open source and in moderate to wide use
>>>> already.  Only find sec bugs is security specific - Infer and Error Prone
>>>> might find security bugs but they are more general purpose in nature.
>>>>
>>>> -Tom
>>>>
>>>> On Sun, Sep 27, 2020 at 3:43 PM Norbert Kalmar
>>>> wrote:
>>>>
>>>>> Hello Tom,
>>>>>
>>>>> +1 on the initiative, thanks for bringing this to our attention.
>>>>>
>>>>> If I understand correctly, there will be no disclosed security issues
>>>>> which cannot be found with open source static analyzers.
>>>>>
>>>>> Regards,
>>>>> Norbert
>>>>>
>>>>>
>>>>> On Sun, Sep 27, 2020 at 8:23 AM Szalay-Bekő Máté <
>>>>> szalay.beko.m...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hello Guys,
>>>>>>
>>>>>> In general I like the idea, but unfortunately I can not really
>>>>>> participate (either in the coding or in the review) as I have a few
>>>>>> important projects close to deadline at the moment.
>>>>>>
>>>>>> My only concern is with the security bugs, which I don't like to be
>>>>>> openly reported before publishing a release with the fix. But for any
>>>>>> other kind of bugfixes / improvements, I am very positive with the
>>>>>> initiative.
>>>>>>
>>>>>>
>>>>>> Best regards,
>>>>>> Mate
>>>>>>
>>>>>> On Sun, Sep 27, 2020, 07:06 Tom DuBuisson  wrote:
>>>>>>
>>>>>>> Enrico et al,
>>>>>>>
>>>>>>> Are there other thoughts on this?  It would be great to get setup
>>>>>>> before the bash actually begins.  Enrico, lacking other voices would
>>>>>>> you like to make a final call?
>>>>>>>
>>>>>>> -Tom
>>>>>>>
>>>>>>> On Thu, Sep 24, 2020 at 3:30 AM Enrico Olivelli <eolive...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Tom,
>>>>>>>> Personally I am +1 with this proposal. Thanks for your
>>>>>>>> clarifications.
>>>>>>>>
>>>>>>>> But we should hear opinions from other people in this list
>>>>>>>>
>>>>>>>>
>>>>>>>> Enrico
>>>>>>>>
>>>>>>>> On Wed, 23 Sep 2020 at 23:51, Tom DuBuisson <to...@muse.dev> wrote:
>>>>>>>>
>>>>>>>>> Enrico,
>>>>>>>>>
>>>>>>>>> On the topic of security issues and reporting:  Muse's default
>>>>>>>>> configuration is open source tools and here it is run on open source
>>>>>>>>> projects.  The results are thus already available publicly (in this
>>>>>>>>> case from FSB, Infer, and Error Prone).  Muse doesn't post anything
>>>>>>>>> to GitHub except in the case of pull requests and then only if the
>>>>>>>>> bug is deemed to have been "introduced" as part of the PR - meaning
>>>>>>>>> it shouldn't be a vulnerability in currently shipped software.
>>>>>>>>>
>>>>>>>>> If there are desires or proposals about more control over bug
>>>>>>>>> reports in a convenient, configurable, manner then we'd really like
>>>>>>>>> to dig in and hear how to help.  In case there is more discussion on
>>>>>>>>> this point I'm CCing Andrew who leads 

Re: ApacheCon Bug Bash

2020-09-28 Thread Flavio Junqueira
It does sound like a good initiative, thanks for including us. I still have the 
concern that others have expressed below around exposing security issues. We 
have guidelines to follow and shouldn't be exposing them openly. I see that Tom 
said:

>> All open source and in moderate to wide use already.  Only find sec
>> bugs is security specific - Infer and Error Prone might find security
>> bugs but they are more general purpose in nature.

But I'm unsure about what this is saying. Otherwise, I'm good with bug bashing.

-Flavio


> On 28 Sep 2020, at 08:11, Enrico Olivelli  wrote:
> 
> Tom
> Overall I think that we can move forward.
> 
> This thread has been around for a while, there are no objections, every
> question has been answered.
> 
> Thank you very much
> 
> I hope this activity will help in growing Zookeeper project both in code
> quality and with more contributions, that is to help the community to grow.
> 
> Best regards
> 
> Enrico
> 
> On Mon, 28 Sep 2020 at 01:27, Tom DuBuisson wrote:
> 
>> Norbert,
>> 
>> Yes, you understand that correctly.  And those analyzers are FindSecBugs,
>> Error Prone and Infer.  All open source and in moderate to wide use
>> already.  Only find sec bugs is security specific - Infer and Error Prone
>> might find security bugs but they are more general purpose in nature.
>> 
>> -Tom
>> 
>> On Sun, Sep 27, 2020 at 3:43 PM Norbert Kalmar
>> 
>> wrote:
>> 
>>> Hello Tom,
>>> 
>>> +1 on the initiative, thanks for bringing this to our attention.
>>> 
>>> If I understand correctly, there will be no disclosed security issues
>> which
>>> cannot be found with open source static analyzers.
>>> 
>>> Regards,
>>> Norbert
>>> 
>>> 
>>> On Sun, Sep 27, 2020 at 8:23 AM Szalay-Bekő Máté <
>>> szalay.beko.m...@gmail.com>
>>> wrote:
>>> 
>>>> Hello Guys,
>>>>
>>>> In general I like the idea, but unfortunately I can not really
>>>> participate (either in the coding or in the review) as I have a few
>>>> important projects close to deadline at the moment.
>>>>
>>>> My only concern is with the security bugs, which I don't like to be
>>>> openly reported before publishing a release with the fix. But for any
>>>> other kind of bugfixes / improvements, I am very positive with the
>>>> initiative.
>>>>
>>>>
>>>> Best regards,
>>>> Mate
>>>>
>>>> On Sun, Sep 27, 2020, 07:06 Tom DuBuisson  wrote:
>>>>
>>>>> Enrico et al,
>>>>>
>>>>> Are there other thoughts on this?  It would be great to get setup
>>>>> before the bash actually begins.  Enrico, lacking other voices would
>>>>> you like to make a final call?
>>>>>
>>>>> -Tom
>>>>>
>>>>> On Thu, Sep 24, 2020 at 3:30 AM Enrico Olivelli wrote:
>>>>>
>>>>>> Tom,
>>>>>> Personally I am +1 with this proposal. Thanks for your clarifications.
>>>>>>
>>>>>> But we should hear opinions from other people in this list
>>>>>>
>>>>>>
>>>>>> Enrico
>>>>>>
>>>>>> On Wed, 23 Sep 2020 at 23:51, Tom DuBuisson <to...@muse.dev> wrote:
>>>>>>
>>>>>>> Enrico,
>>>>>>>
>>>>>>> On the topic of security issues and reporting:  Muse's default
>>>>>>> configuration is open source tools and here it is run on open source
>>>>>>> projects.  The results are thus already available publicly (in this
>>>>>>> case from FSB, Infer, and Error Prone).  Muse doesn't post anything to
>>>>>>> GitHub except in the case of pull requests and then only if the bug is
>>>>>>> deemed to have been "introduced" as part of the PR - meaning it
>>>>>>> shouldn't be a vulnerability in currently shipped software.
>>>>>>>
>>>>>>> If there are desires or proposals about more control over bug reports
>>>>>>> in a convenient, configurable, manner then we'd really like to dig in
>>>>>>> and hear how to help.  In case there is more discussion on this point
>>>>>>> I'm CCing Andrew who leads Muse's product design.
>>>>>>>
>>>>>>> -Tom
>>>>>>>
>>>>>>> On Wed, Sep 23, 2020 at 1:09 PM Enrico Olivelli <eolive...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> On Wed, 23 Sep 2020 at 19:02, Tom DuBuisson wrote:
>>>>>>>>
>>>>>>>>> Enrico,
>>>>>>>>>
>>>>>>>>> The Muse App requires two main abilities.  First is events, such as
>>>>>>>>> notification when pull requests are opened or updated.  Second is
>>>>>>>>> permission to post comments (which is always possible for humans but
>>>>>>>>> more tightly controlled when the poster authenticates as a github
>>>>>>>>> application).
>>>>>>>>> The repository being public has allowed us to run the app and observe
>>>>>>>>> ErrorProne, Infer, and FindSecBugs all run out of the box and without
>>>>>>>>> custom configuration.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Makes sense.
>>>>>>>>
>>>>>>>> One last question from my side
>>>>>>>> What about security issues?
>>>>>>>> Our policy is to have them reported to secur...@zookeeper.apache.org
>>>>>>>> before
Re: [VOTE] Apache ZooKeeper 3.6.0 candidate 4

2020-03-03 Thread Flavio Junqueira
+1 (binding)

- Built from sources (there are a good number of flaky tests, but it eventually 
built correctly)
- Checked LICENSE and NOTICE
- Checked release notes
- Checked that the maven dependency resolves for the staging artifact
- Ran some local smoke tests

-Flavio

> On 3 Mar 2020, at 11:01, Andor Molnar  wrote:
> 
> +1 (binding)
> 
> + verified signatures, checksums
> + successful build on Mac and Centos 7.5 (including C tests)
> + run various smoke tests and latency tests with 3-node cluster
> + verified rolling upgrade from 3.5.7
> 
> Thanks Enrico, I think you’re now good to go.
> 
> Andor
> 
> 
> 
>> On 2020. Mar 1., at 10:03, Enrico Olivelli  wrote:
>> 
>> +1 (binding)
>> verified signatures and checksums
>> ran a few smoke tests from binaries (standalone mode)
>> tested Prometheus.io metrics endpoint
>> build from sources, run automatic QA tests (rat, checkstyle, spotbugs...)
>> all on Linux with Java 8 (AdoptOpenJDK)
>> 
>> We need at least one more PMC to vote please
>> 
>> Enrico
>> 
>> On Sun, 1 Mar 2020 at 01:58, Patrick Hunt
>> wrote:
>>> 
>>> +1. xsum/sig verified. rat ran clean. Compiled and ran some manual tests
>>> with various ensemble sizes successfully.
>>> 
>>> Regards,
>>> 
>>> Patrick
>>> 
>>> On Fri, Feb 28, 2020 at 6:53 AM Enrico Olivelli  wrote:
>>> 
 Thank you guys for voting.
 
 We need more votes please
 
 Enrico
 
 On Thu, 27 Feb 2020 at 14:14, Norbert Kalmar
 wrote:
> 
> +1 (non-binding)
> 
> - unit tests pass (PurgeTxnTest as well)
> - source tarball: compiled and started ZK + run few commands from source
> tarball
> - bin tarball: license files checked, started ZK + run few commands
> - signatures OK.
> - compared source tarball with git repository checked out at RC tag using
> meld. Found no divergence.
> 
> Tested on MacOS and Ubuntu 16, using openJDK 1.8.242.
> 
> - Norbert
> 
> On Thu, Feb 27, 2020 at 11:17 AM Szalay-Bekő Máté <
> szalay.beko.m...@gmail.com> wrote:
> 
>> +1 (non-binding)
>> 
>> - I built the code and executed the Java/C unit tests using 8u242
>> (everything passed except PurgeTxnTest.testPurgeWhenLogRollingInProgress,
>> which never seems to work on my machine. I have seen it be flaky on the
>> Apache Jenkins too, so I created a Jira ticket for fixing it:
>> https://issues.apache.org/jira/browse/ZOOKEEPER-3740)
>> - Using https://github.com/symat/zk-rolling-upgrade-test
>> - I tested rolling upgrade from 3.5.7 to 3.6.0
>> - I tested rolling restart on 3.6.0 to enable the multi-address feature
>> with the new quorum protocol version
>> - Using https://github.com/symat/zookeeper-docker-test I also tested the
>> multi-address feature (disabling and re-enabling different virtual network
>> interfaces to see that the cluster always recovers)
>> 
>> On Tue, Feb 25, 2020 at 4:13 PM Enrico Olivelli 
>> wrote:
>> 
>>> This is the fifth release candidate for 3.6.0.
>>> 
>>> It is a major release and it introduces a lot of new features, most
>>> notably:
>>> - Built-in data consistency check inside ZooKeeper
>>> - Allow Followers to host Observers
>>> - A new feature proposal to ZooKeeper: authentication enforcement
>>> - Pluggable metrics system for ZooKeeper (and Prometheus.io integration)
>>> - TLS Port unification
>>> - Audit logging in ZooKeeper servers
>>> - Improve resilience to network (advertise multiple addresses for
>>> members of a Zookeeper cluster)
>>> - Persistent Recursive Watch
>>> - add an API and the corresponding CLI to get total count of recursive
>>> sub nodes under a specific path
>>> 
>>> The full release notes are available at:
>>> 
>>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310801&version=12346617
>>> 
>>> *** Please download, test and vote by February 28th 2020, 23:59 UTC+0. ***
>>> 
>>> Source files:
>>> https://people.apache.org/~eolivelli/zookeeper-3.6.0-candidate-4/
>>> 
>>> Maven staging repo:
>>> https://repository.apache.org/content/repositories/orgapachezookeeper-1053/
>>> 
>>> The release candidate tag in git to be voted upon: release-3.6.0-4
>>> https://github.com/apache/zookeeper/tree/release-3.6.0-4
>>> 
>>> ZooKeeper's KEYS file containing PGP keys we use to sign the release:
>>> https://www.apache.org/dist/zookeeper/KEYS
>>> 
>>> 
>>> Please note that this new major release introduces a new JAR for
>>> zookeeper client users: zookeeper-metrics-providers
>>> 
>>> The staging version of the website is:
>>> https://people.apache.org/~eolivelli/zookeeper-3.6.0-candidate-4/website/
>>> 
>>> 
>>> Should we release this 

Re: [CANCELLED] Re: [VOTE] Apache ZooKeeper release 3.6.0 candidate 2

2020-02-11 Thread Flavio Junqueira
Good catch, Enrico...

> On 11 Feb 2020, at 09:43, Enrico Olivelli  wrote:
> 
> -1
> I saw that 3.6 servers are not able to join a 3.5 cluster; this makes
> it difficult or rather impossible to have a graceful rolling upgrade.
> 
> In a separate thread on this list we discussed this problem and
> we decided to work on adding support for this scenario (rolling
> upgrade without service interruption)
> 
> I am also cancelling this VOTE.
> 
> Thank you to everyone who tested the release.
> 
> 
> Enrico
> 
> On Sun, 9 Feb 2020 at 16:39, Jordan Zimmerman
> wrote:
>> 
>> The CURATOR-549-zk36-updates branch tests pass
>> 
>> +1 (non binding)
>> 
>> -Jordan
>> 
>>> On Feb 5, 2020, at 2:34 PM, Enrico Olivelli  wrote:
>>> 
>>> This is the third release candidate for Apache ZooKeeper 3.6.0.
>>> 
>>> It is a major release and it introduces a lot of new features, most notably:
>>> - Built-in data consistency check inside ZooKeeper
>>> - Allow Followers to host Observers
>>> - Authentication enforcement
>>> - Pluggable metrics system for ZooKeeper (and Prometheus.io integration)
>>> - TLS Port unification
>>> - Audit logging in ZooKeeper servers
>>> - Improve resilience to network (advertise multiple addresses for
>>> members of a Zookeeper cluster)
>>> - Persistent Recursive Watches
>>> - add an API and the corresponding CLI to get total count of recursive
>>> sub nodes under a specific path
>>> 
>>> The full release notes are available at:
>>> 
>>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310801&version=12326518
>>> 
>>> *** Please download, test and vote by February 8th 2020, 23:59 UTC+0. ***
>>> 
>>> Source files:
>>> https://people.apache.org/~eolivelli/zookeeper-3.6.0-candidate-2/
>>> 
>>> Maven staging repo:
>>> https://repository.apache.org/content/repositories/orgapachezookeeper-1049/
>>> 
>>> The staging version of the website is:
>>> https://people.apache.org/~eolivelli/zookeeper-3.6.0-candidate-2/website/
>>> 
>>> The release candidate tag in git to be voted upon: release-3.6.0-2
>>> https://github.com/apache/zookeeper/tree/release-3.6.0-2
>>> 
>>> ZooKeeper's KEYS file containing PGP keys we use to sign the release:
>>> https://www.apache.org/dist/zookeeper/KEYS
>>> 
>>> Please note that we are adding a new jar to the dependency set for
>>> clients: zookeeper-metrics-providers.
>>> 
>>> Should we release this candidate?
>>> 
>>> Enrico Olivelli
>> 



Re: [VOTE] Apache ZooKeeper release 3.6.0 candidate 2

2020-02-08 Thread Flavio Junqueira
+1 binding

- Built from sources and ran tests (some tests fail intermittently)
- Checked SHA 512 and signature
- Checked license and notice
- Ran local smoke tests

-Flavio

> On 7 Feb 2020, at 10:04, Norbert Kalmar  wrote:
> 
> +1 (non-binding)
> 
> - unit tests pass
> - source tarball: compiled and started ZK + run few commands from source
> tarball
> - bin tarball: license files checked, started ZK + run few commands
> - signature OK.
> 
> Tested on MacOS and Linux, openJDK 1.8.242.
> 
> Thanks Enrico!
> 
> - Norbert
> 
> On Thu, Feb 6, 2020 at 11:20 AM Szalay-Bekő Máté 
> wrote:
> 
>> +1 (non-binding)
>> 
>> - I compiled and ran all the unit tests using Ubuntu 18.04, using Maven
>> 3.3.9 and OpenJDK 1.8.242 (the tests that failed with this JDK for the
>> previous RC now run without a problem)
>> - I compiled and tested the C client and the Python client (we added SSL
>> feature / tests in this release for the C and Python clients)
>> - I did some manual tests for the multi-address feature with multiple
>> virtual networks (using https://github.com/symat/zookeeper-docker-test)
>> and the cluster did recover quickly after I disabled / enabled various
>> virtual network interfaces
>> 
>> 
>> On Thu, Feb 6, 2020 at 4:36 AM Patrick Hunt  wrote:
>> 
>>> +1 - sig/xsum verified, rat ran clean, I compiled and ran various tests
>>> and they passed.
>>> 
>>> Patrick
>>> 
>>> On Wed, Feb 5, 2020 at 11:34 AM Enrico Olivelli 
>>> wrote:
>>> 
 This is the third release candidate for Apache ZooKeeper 3.6.0.
 
 It is a major release and it introduces a lot of new features, most
 notably:
 - Built-in data consistency check inside ZooKeeper
 - Allow Followers to host Observers
 - Authentication enforcement
 - Pluggable metrics system for ZooKeeper (and Prometheus.io integration)
 - TLS Port unification
 - Audit logging in ZooKeeper servers
 - Improve resilience to network (advertise multiple addresses for
 members of a Zookeeper cluster)
 - Persistent Recursive Watches
 - add an API and the corresponding CLI to get total count of recursive
 sub nodes under a specific path
 
 The full release notes are available at:
 
 https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310801&version=12326518
 
 *** Please download, test and vote by February 8th 2020, 23:59 UTC+0. ***
 
 Source files:
 https://people.apache.org/~eolivelli/zookeeper-3.6.0-candidate-2/
 
 Maven staging repo:
 https://repository.apache.org/content/repositories/orgapachezookeeper-1049/
 
 The staging version of the website is:
 https://people.apache.org/~eolivelli/zookeeper-3.6.0-candidate-2/website/
 
 The release candidate tag in git to be voted upon: release-3.6.0-2
 https://github.com/apache/zookeeper/tree/release-3.6.0-2
 
 ZooKeeper's KEYS file containing PGP keys we use to sign the release:
 https://www.apache.org/dist/zookeeper/KEYS
 
 Please note that we are adding a new jar to the dependency set for
 clients: zookeeper-metrics-providers.
 
 Should we release this candidate?
 
 Enrico Olivelli
 
>>> 
>> 
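
Several voters in this thread report checking the SHA-512 checksum of the release tarball. As a rough illustration only, the check could be sketched in Python as below. The function names and the idea of passing the published digest as a plain hex string are assumptions made for this sketch; it does not cover PGP signature verification, which still has to be done separately against the KEYS file.

```python
import hashlib


def sha512_of(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path, expected_hex):
    """Compare a file's SHA-512 with a published digest (hypothetical helper).

    Whitespace and letter case in `expected_hex` are ignored, since
    published digests are sometimes grouped or upper-cased.
    """
    expected = "".join(expected_hex.split()).lower()
    return sha512_of(path) == expected
```

A voter would call checksum_matches() with the path of the downloaded tarball and the digest taken from the accompanying checksum file.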



[ANNOUNCE] Enrico Olivelli new ZooKeeper PMC Member

2020-01-21 Thread Flavio Junqueira
I'm pleased to announce that Enrico Olivelli recently became the newest 
ZooKeeper PMC member. Enrico has contributed immensely to this community; he 
became a ZooKeeper committer in May 2019 and now he joins the PMC.

Join me in congratulating him on the achievement. Congrats, Enrico!

-Flavio on behalf of the Apache ZooKeeper PMC

Re: [VOTE] Apache ZooKeeper release 3.6.0 candidate 0

2020-01-15 Thread Flavio Junqueira
I can't parse Rudy's message, is it an issue with my mail application?

-Flavio

> On 15 Jan 2020, at 15:00, rudy_steiner  wrote:
> 
> environment:
> * MacOS High Sierra 10.13.1
> * JDK 1.8.0_172
> 
> I try to run junit test on branch-3.6, and unit test thread get stuck,
> log as follows:
> 
> [INFO] Running org.apache.zookeeper.common.X509UtilTest
> [INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 27.797 s - in org.apache.zookeeper.server.SnapshotDigestTest
> [INFO] Running org.apache.zookeeper.common.TimeTest
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.718 s - in org.apache.zookeeper.common.TimeTest
> [INFO] Tests run: 352, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 7.425 s - in org.apache.zookeeper.common.X509UtilTest
> [INFO] Running org.apache.zookeeper.common.PEMFileLoaderTest
> [INFO] Running org.apache.zookeeper.common.KeyStoreFileTypeTest
> [INFO] Tests run: 9, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.144 s - in org.apache.zookeeper.common.KeyStoreFileTypeTest
> [INFO] Running org.apache.zookeeper.audit.AuditEventTest
> [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.084 s - in org.apache.zookeeper.audit.AuditEventTest
> [INFO] Running org.apache.zookeeper.audit.StandaloneServerAuditTest
> [INFO] Tests run: 72, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.027 s - in org.apache.zookeeper.common.PEMFileLoaderTest
> [INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 16.197 s - in org.apache.zookeeper.common.FileChangeWatcherTest
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.755 s - in org.apache.zookeeper.audit.StandaloneServerAuditTest
> [INFO] Running org.apache.zookeeper.audit.Log4jAuditLoggerTest
> [INFO] Running org.apache.zookeeper.ZKUtilTest
> [ERROR] Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.194 s <<< FAILURE! - in org.apache.zookeeper.ZKUtilTest
> [ERROR] testUnreadableFileInput(org.apache.zookeeper.ZKUtilTest)  Time elapsed: 0.014 s  <<< FAILURE!
> java.lang.AssertionError
>   at org.apache.zookeeper.ZKUtilTest.testUnreadableFileInput(ZKUtilTest.java:83)
> [INFO] Running org.apache.zookeeper.PortAssignmentTest
> [INFO] Tests run: 13, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.157 s - in org.apache.zookeeper.PortAssignmentTest
> [INFO] Running org.apache.zookeeper.VerGenTest
> [INFO] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.747 s - in org.apache.zookeeper.audit.Log4jAuditLoggerTest
> [INFO] Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.327 s - in org.apache.zookeeper.VerGenTest
> [INFO] Running org.apache.zookeeper.ZooKeeperTest
> [INFO] Running org.apache.zookeeper.GetAllChildrenNumberTest
> [INFO] Running org.apache.zookeeper.RemoveWatchesCmdTest
> [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.511 s - in org.apache.zookeeper.GetAllChildrenNumberTest
> [INFO] Running org.apache.zookeeper.ClientRequestTimeoutTest
> [INFO] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.004 s - in org.apache.zookeeper.RemoveWatchesCmdTest
> [INFO] Running org.apache.zookeeper.ClientCanonicalizeTest
> [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.861 s - in org.apache.zookeeper.ClientCanonicalizeTest
> [INFO] Running org.apache.zookeeper.client.ZKClientConfigTest
> [INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.155 s - in org.apache.zookeeper.client.ZKClientConfigTest
> [INFO] Tests run: 35, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 10.74 s - in org.apache.zookeeper.ZooKeeperTest
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 16.372 s - in org.apache.zookeeper.ClientRequestTimeoutTest
> [INFO] Tests run: 46, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 61.592 s - in org.apache.zookeeper.RemoveWatchesTest
> [INFO] Tests run: 24, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 166.152 s - in org.apache.zookeeper.server.quorum.QuorumPeerMainTest
> [INFO] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 158.386 s - in org.apache.zookeeper.server.quorum.ReconfigRecoveryTest
> [INFO] Tests run: 13, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 416.635 s - in org.apache.zookeeper.server.quorum.QuorumSSLTest
> 
> and I found several processes by ps -ef|grep java:
> 
> 0  6809 87919   0  9:28下午 ?? 2:13.75 
> /Library/Java/JavaVirtualMachines/jdk1.8.0_172.jdk/Contents/Home/bin/java 
> -agentlib:jdwp=transport=dt_socket,address=127.0.0.1:62202,suspend=y,server=n 
> -Dvisualvm.id=962684532457553 
> -Dmaven.multiModuleProjectDirectory=/export/workspace/zookeeper 
> -Dmaven.home=/Applications/IntelliJ 
> IDEA.app/Contents/plugins/maven/lib/maven3 
> -Dclassworlds.conf=/Applications/IntelliJ 
> IDEA.app/Contents/plugins/maven/lib/maven3/bin/m2.conf 
> 

Re: Hadoop logo

2019-12-12 Thread Flavio Junqueira
ZooKeeper was a subproject of Hadoop in the early Apache days, and we still 
carry that flag... ;-)

-Flavio

> On 12 Dec 2019, at 16:16, Norbert Kalmar  wrote:
> 
> Oh, wow, I didn't even notice that until now.
> Makes sense, knowing a lot of the time ZK is used "standalone" (I mean
> outside of any hadoop ecosystem).
> 
> Regards,
> Norbert
> 
> On Thu, Dec 12, 2019 at 2:52 PM Flavio Junqueira  wrote:
> 
>> Should we remove that Hadoop logo from the documentation? It has been a
>> while since we stopped being a subproject of Hadoop.
>> 
>> -Flavio



Hadoop logo

2019-12-12 Thread Flavio Junqueira
Should we remove that Hadoop logo from the documentation? It has been a while 
since we stopped being a subproject of Hadoop.

-Flavio

Re: ZK makes apache 2019 "top 5" projects

2019-12-12 Thread Flavio Junqueira
+1, thank you all for the hard work.

-Flavio

> On 12 Dec 2019, at 08:36, Enrico Olivelli  wrote:
> 
> Yes, great.
> 
> Please also note that Kafka and Lucene/Solr, which are also on that
> list, are using ZooKeeper :)
> 
> 
> Enrico
> 
> On Thu, 12 Dec 2019, 05:46 tison wrote:
> 
>> Kudos!
>> 
>> Best,
>> tison.
>> 
>> 
>> On Thu, 12 Dec 2019 at 11:32 AM, Patrick Hunt wrote:
>> 
>>> This is really awesome, check it out:
>>> https://twitter.com/phunt/status/1204966326118141952
>>> 
>>> Kudos ZooKeeper community on all the hard work and efforts!
>>> 
>>> Patrick
>>> 
>> 



Re: [VOTE] Apache ZooKeeper release 3.5.6 candidate 4

2019-10-15 Thread Flavio Junqueira
+1

- Built from sources
- Validated digest and signature for sources tar.gz
- Verified that dependency from staging repo resolves
- Checked release notes
- Ran some local smoke tests

-Flavio


> On 11 Oct 2019, at 08:45, Zili Chen  wrote:
> 
> +1 (non-binding)
> 
> + verify source tarball contains no binary files
> + verify binary tarball contains no source files
> 
> + locally built with `mvn clean install -DskipTests -Pfull-build` and verified
> with `mvn verify`, tests pass
> 
> + Verified basic zookeeper operation through Cli
> 
> Best,
> tison.
> 
> 
> On Fri, 11 Oct 2019 at 2:38 PM, Mohammad arshad wrote:
> 
>> +1 (non-binding)
>> Verified checksums and signature
>> Run UT on ubuntu 16.04 with jdk1.8.0_221 with non-root user. All 2554 UTs
>> passed
>> Verified basic zk operation through Cli
>> Executed 4 letter word commands
>> Verified admin server commands
>> Ran rat
>> All are OK :-)
>> 
>> 
>> Thanks & Regards
>> Mohammad Arshad
>> 
>> -Original Message-
>> From: sujith simon [mailto:sujithsimo...@gmail.com]
>> Sent: Friday, October 11, 2019 11:32 AM
>> To: dev@zookeeper.apache.org
>> Subject: Re: [VOTE] Apache ZooKeeper release 3.5.6 candidate 4
>> 
>> +1 (non-binding)
>> 
>> - Verified Checksum
>> - Verified Signature
>> - Verified that tags are correct
>> - Verified build, installed ZooKeeper and tested basic commands from
>> source tarball
>> - Installed ZooKeeper from bin tarball on a 3 node cluster and tested
>> basic ZooKeeper commands
>> - Verified that Unit tests are passing
>> 
>> Thanks :)
>> 
>> 
>> On Thu, Oct 10, 2019 at 2:53 PM Norbert Kalmar wrote:
>> 
>>> +1 (non-binding)
>>> 
>>> - unit tests pass
>>> - built and started ZK + run few commands from source tarball
>>> - checked bin tarball, license files, run ZK + few commands
>>> - signature OK.
>>> - git tag OK :)
>>> 
>>> Thanks Enrico!
>>> 
>>> Norbert
>>> 
>>> On Wed, Oct 9, 2019 at 10:45 PM Andor Molnar  wrote:
>>> 
 +1
 
 Verified…
 
 - checksums, signature,
 - unit tests
 - 3-node cluster smoke tests.
 
 Andor
 
 
> On 2019. Oct 9., at 22:40, Enrico Olivelli wrote:
> 
> On Wed, 9 Oct 2019, 21:14 Patrick Hunt wrote:
> 
>> +1 checksums/sig validated. rat ran clean and I was able to build and
>> exercise the code just fine with java 8.
>> 
>> Note dep check is failing again however:
>> 
>> jackson-databind-2.9.10.jar
>> (pkg:maven/com.fasterxml.jackson.core/jackson-databind@2.9.10,
>> cpe:2.3:a:fasterxml:jackson:2.9.10:*:*:*:*:*:*:*,
>> cpe:2.3:a:fasterxml:jackson-databind:2.9.10:*:*:*:*:*:*:*) :
>> CVE-2019-16942, CVE-2019-16943
>> 
>> I looked at the issues and they seem very specific. Given that and
>> the status of databind these days, I think we should get this one
>> next time around vs re-re... spinning the rc. What do you think?
>> 
> 
> Agreed.
> And as we make very limited use of Jackson we can look for
> a replacement
> 
> Enrico
> 
>> 
>> Patrick
>> 
>> 
>> On Tue, Oct 8, 2019 at 1:46 PM Enrico Olivelli wrote:
>> 
>>> This is a bugfix release candidate for 3.5.6.
>>> 
>>> It fixes 29 issues, including upgrade of third party libraries,
>>> TTL Node APIs for C API, support for PKCS12 Keystores, upgrade
>>> of Netty 4 and better procedure for the upgrade of servers from 3.4 to 3.5.
>>> 
>>> The full release notes are available at:
>>> 
>>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310801&version=12345243
>>> 
>>> *** Please download, test and vote by October 11th 2019, 23:59 UTC+0. ***
>>> Source files:
>>> https://people.apache.org/~eolivelli/zookeeper-3.5.6-candidate-4
>>> 
>>> Maven staging repo:
>>> https://repository.apache.org/content/repositories/orgapachezookeeper-1044
>>> 
>>> The release candidate tag in git to be voted upon:
>>> release-3.5.6-rc4
>>> https://github.com/apache/zookeeper/tree/release-3.5.6-rc4
>>> 
>>> ZooKeeper's KEYS file containing PGP keys we use to sign the release:
>>> https://www.apache.org/dist/zookeeper/KEYS
>>> 
>>> Should we release this candidate?
>>> 
>>> Enrico Olivelli
 
 
>>> 
>> 



Re: Apache Zookeeper - Export Commodity Control Classification (ECCN) Needed

2019-06-17 Thread Flavio Junqueira
Since we are not developing any cryptography code and are only consumers, my 
understanding is that we should be classified as ECCN 5D002, like most of the 
software on the page Pat shared. Is there a document somewhere listing the 
dependencies we use for security?

-Flavio

> On 16 Jun 2019, at 07:19, Enrico Olivelli  wrote:
> 
> On Sat, 15 Jun 2019, 18:40 Patrick Hunt wrote:
> 
>> I don't believe we've updated it to include ZK, seems we need to go through
>> this process:
>> http://www.apache.org/dev/crypto.html
>> 
>> I haven't been involved with this before - anyone have experience with
>> this, perhaps on another ASF project?
>> 
> 
> 
> I don't. Sorry
> 
> Enrico
> 
> 
>> Patrick
>> 
>> 
>> On Wed, Jun 12, 2019 at 7:01 AM DANOS, Bertrand wrote:
>> 
>>> Dear team,
>>> 
>>> I've checked the page http://www.apache.org/licenses/exports/ however the
>>> "Apache Zookeeper" project is not referenced on it.
>>> Can you tell me which ECCN number is associated with the ZooKeeper
>>> project?
>>> 
>>> Thanks in advance
>>> 
>>> Regards
>>> 
>>> Bertrand Danos
>>> 
>>> 
>>> 
>> 



Re: [DISCUSS] Voting on pull requests

2019-06-06 Thread Flavio Junqueira
That's covered in the project bylaws, right?

https://zookeeper.apache.org/bylaws.html 


-Flavio

> On 6 Jun 2019, at 13:49, Enrico Olivelli  wrote:
> 
> On Thu, 6 Jun 2019, 12:44 Andor Molnar wrote:
> 
>> Hi folks,
>> 
>> I’ve seen 2 patches committed recently with “-1s" from committers on it.
>> 
>> https://github.com/apache/zookeeper/pull/899
>> https://github.com/apache/zookeeper/pull/944
>> 
>> Not a big deal in this case and I think they were in a good shape and
>> ready to commit, but I’d like to clarify how we handle voting on pull
>> requests. We use github to prepare patches by creating pull requests.
>> Github also has a feature of “reviewing” which means that reviewers are
>> able to “approve”, “comment” and “request for changes”. In terms of voting
>> this means:
>> 
>> - “approve” = +1
>> - “comment” = 0
>> - “request for changes” = -1
>> 
> 
> We should enhance the script (we already did it on Bookkeeper for instance)
> 
>> 
>> In order to commit a patch we need at least 2 binding +1s without binding
>> -1. Committers/PMCs are able to veto this way.
>> 
>> Do we agree on this process completely?
>> 
> 
> Sure
> 
>> 
>> I know that activity in ZooKeeper community is usually very flaky and
>> sometimes it’s hard to find committers to review patches.
> 
> 
> We have a new wave of contributions and new committers, so fortunately this
> is changing.
> 
> 
>> In these cases we usually just commit smaller patches with a single binding
>> vote, but I think we should be more careful about binding -1s.
>> 
>> Please in the future if you see my -1 on a patch which you think is ready
>> to commit, bug me as hard as it takes. I’ll make every effort to review as
>> soon as possible and apologies for any delay.
>> 
> 
> Sure.
> 
> 
> Enrico
> 
> 
>> Thanks,
>> Andor



Re: [ANNOUNCE] New ZooKeeper committer: Norbert Kalmar

2019-05-27 Thread Flavio Junqueira
Congrats, Norbert!

-Flavio

> On 27 May 2019, at 11:01, Tamas Penzes  wrote:
> 
> Congrats Norbert!
> 
> On Sun, May 26, 2019 at 11:21 PM Patrick Hunt  wrote:
> 
>> The Apache ZooKeeper PMC recently extended committer karma to Norbert
>> and he has accepted. Norbert has made some great contributions and we
>> are looking forward to even more :)
>> 
>> Congratulations and welcome aboard, Norbert!
>> 



Re: Github & Jira notifications

2019-05-17 Thread Flavio Junqueira
To add to this conversation, we have 5 mailing lists as of today:

dev@z.a.o
commits@z.a.o
private@z.a.o
security@z.a.o
user@z.a.o

The commits@ one was historically supposed to be for notifications.

-Flavio 


> On 17 May 2019, at 16:50, Lars Francke  wrote:
> 
> Alright, I will do so, thank you
> 
> On Fri, May 17, 2019 at 11:46 AM Norbert Kalmar wrote:
> 
>> I think we should start a vote.
>> 
>> Lars, feel free to start it, as it is your initiative, you can place your
>> argument in the initial email.
>> Thanks for bringing this up!
>> 
>> Regards,
>> Norbert
>> 
>> 
>> On Fri, May 17, 2019 at 11:38 AM Lars Francke wrote:
>> 
>>> I don't really have anything else to add to this conversation. I agree
>>> with Norbert & Andor.
>>> Separating the lists is (slowly) becoming the standard at the ASF in the
>>> projects I participate in and it makes it easier for newcomers.
>>> 
>>> Do we want to put this to a vote or abandon this?
>>> 
>>> On Mon, May 13, 2019 at 5:33 PM Andor Molnar  wrote:
>>> 
 Without responding to this, let’s clarify which emails we are talking about:
 
 - Jenkins build notifications: failures / successes,
 - Jira notifications: Created, Updated, Status change, Commented, etc.
 - Github notifications: opened/closed pull request, commented on pull
 request, etc.
 
 Researching a bit around, it seems that an issues@ mailing list is a
 common pattern across some Hadoop projects:
 https://hbase.apache.org/mailing-lists.html
 https://hadoop.apache.org/mailing_lists.html
 https://kafka.apache.org/contact
 
 But not on:
 https://hive.apache.org/mailing_lists.html
 https://oozie.apache.org/mail-lists.html
 
 Regards,
 Andor
 
 
 
> On 2019. May 13., at 16:43, Patrick Hunt  wrote:
> 
> I seem to remember that in the early days of Apache the intent was for all
> developers to participate in development of the project. MLs were used for
> this initially and there has been concern in the past about the move to
> JIRA as it removes critical discussions from the general dev flow. JIRA
> discussions are cc'd to an ML for just this reason. Granted another aspect
> of this mirroring is for archival purposes.
> 
> As such the intent is for developers to participate in development
> discussion. It sounds like you're turning this on its head.
> 
> Isn't this why we have a user list? For folks that are only casually
> interested in project activity?
> 
> Patrick
> 
> On Mon, May 13, 2019 at 3:31 AM Andor Molnar wrote:
> 
>> Hi,
>> 
>> Sorry for starting the vote too early, for some reason I thought it’s a
>> straightforward thing, but the concerns are valid.
>> 
>> I believe this is a good thing to do now, because the activity in
>> ZooKeeper community has reached a level which the dev list with all
>> notifications included hasn’t been prepared for. I see emails every now and
>> then from people who are fed up with the flood of notification emails and
>> decide to unsubscribe instead of taking the hassle of dealing with email
>> filters.
>> 
>> *Filters*
>> Pretty much everything which can be done with separate mailing lists can
>> be achieved by some clever email filters too. Though I think having
>> multiple mailing lists is technologically a better distinction for emails:
>> separate archives, different retention policies, less email to be delivered
>> (filters working on client side) and more convenient for new subscribers.
>> 
>> *Markmail*
>> I’m a power user of markmail too, but I’m not sure how it works. Tbh I
>> cannot see the benefits of searching for automated emails from Gitbox, but
>> I’m sure we can set it up properly, if needed.
>> 
>> *Existing users*
>> It’s completely valid that on the flip side we’ll mess up the config of
>> existing users. I don’t think we can avoid that or do this in a backward
>> compatible way, but I think it’s worth paying this price. What will happen
>> to them?
>> - Stop receiving emails from Jira and Github,
>> - Get the announcement on the dev list: hey, from now on you need to
>> subscribe to a...@zk.org if you want to be notified about A and b...@zk.org
>> if you want to be notified about B, etc.
>> I don’t think we can automatically subscribe them to the new lists, it
>> sounds like it is against the law (GDPR?), but I’m not a lawyer.
>> 
>> My 2 cents.
>> 
>> Regards,
>> Andor
>> 
>> 

Re: Crypto Policy (was: Re: [VOTE] Apache ZooKeeper release 3.5.5 candidate 5)

2019-04-30 Thread Flavio Junqueira
Thanks for the input, Andor and Norbert. I managed to get it to work. Using 
"hostname -f" like in the instructions was not working for me. I tried adding 
names to /etc/hosts, but it was not liking that either.  It worked when I used 
the loopback IP address. 

On the RC, could you please clarify the status? Is the plan to have a new RC?

-Flavio

> On 30 Apr 2019, at 13:37, Norbert Kalmar  wrote:
> 
> Hi Flavio,
> 
> When I tested the TLS setup on a single machine, I also had issues, java
> errors. It turns out those errors were completely unrelated to the problem,
> namely I didn't have the localhost setup in my hostname file. If I remember
> correctly, I needed to add localhost 127.0.0.1, because I only had the
> machine's name setup there. So it couldn't find "localhost".
> 
> Not entirely sure of the solution anymore, but it was definitely something
> with localhost.
> 
> After that, it worked fine for me, but I also used a single key for all
> instances on the machine.
> 
> Regards,
> Norbert
> 
> On Tue, Apr 30, 2019 at 1:29 PM Andor Molnar  wrote:
> 
>> Hi Flavio,
>> 
>> Works for me on a single machine with the following keystores.
>> Aliases are the hostname of the machine (all the same).
>> 
>> keystore1.jks
>> ~~
>> Keystore type: JKS
>> Keystore provider: SUN
>> 
>> Your keystore contains 1 entry
>> 
>> andors-macbook-pro.local, Apr 30, 2019, PrivateKeyEntry,
>> Certificate fingerprint (SHA1):
>> 14:46:5F:92:D9:03:88:7E:C9:0A:95:9E:F5:74:08:F4:27:89:36:9D
>> 
>> keystore2.jks
>> ~~
>> Keystore type: JKS
>> Keystore provider: SUN
>> 
>> Your keystore contains 1 entry
>> 
>> andors-macbook-pro.local, Apr 30, 2019, PrivateKeyEntry,
>> Certificate fingerprint (SHA1):
>> 61:11:F3:FC:97:B1:3D:DB:6C:65:11:AE:FB:26:39:C0:4F:8E:A7:F7
>> 
>> keystore3.jks
>> ~~
>> Keystore type: JKS
>> Keystore provider: SUN
>> 
>> Your keystore contains 1 entry
>> 
>> andors-macbook-pro.local, Apr 30, 2019, PrivateKeyEntry,
>> Certificate fingerprint (SHA1):
>> E0:84:2A:37:A0:8E:22:67:B3:50:21:43:34:D0:FD:E8:A4:50:C4:3F
>> 
>> 
>> …and a single truststore:
>> 
>> $ keytool -list -keystore truststore.jks
>> Enter keystore password:
>> Keystore type: JKS
>> Keystore provider: SUN
>> 
>> Your keystore contains 4 entries
>> 
>> mycert3, Apr 30, 2019, trustedCertEntry,
>> Certificate fingerprint (SHA1):
>> E0:84:2A:37:A0:8E:22:67:B3:50:21:43:34:D0:FD:E8:A4:50:C4:3F
>> mycert2, Apr 30, 2019, trustedCertEntry,
>> Certificate fingerprint (SHA1):
>> 61:11:F3:FC:97:B1:3D:DB:6C:65:11:AE:FB:26:39:C0:4F:8E:A7:F7
>> mycert1, Apr 30, 2019, trustedCertEntry,
>> Certificate fingerprint (SHA1):
>> 14:46:5F:92:D9:03:88:7E:C9:0A:95:9E:F5:74:08:F4:27:89:36:9D
>> 
>> Aliases (mycert1, mycert2, mycert3) don’t matter here, ZooKeeper only
>> checks if the given certificate is included in the truststore or not.
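For anyone comparing the listings above by hand: the SHA1 fingerprint that keytool prints is the SHA-1 digest of the DER-encoded certificate, so matching a keystore entry to a truststore entry is a matter of comparing those digests, whatever the alias is. Below is a minimal sketch of that comparison. `CertFingerprint` and its method names are made up for illustration and are not part of ZooKeeper, and it says nothing about how ZooKeeper itself validates certificates.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.cert.Certificate;
import java.security.cert.CertificateEncodingException;

// Hypothetical helper for illustration only; not ZooKeeper code.
class CertFingerprint {

    // Formats the SHA-1 digest of the given bytes as colon-separated
    // upper-case hex, the same layout keytool uses in its listings.
    static String sha1Hex(byte[] encoded) {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance("SHA-1");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 is not available", e);
        }
        byte[] hash = digest.digest(encoded);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < hash.length; i++) {
            if (i > 0) {
                sb.append(':');
            }
            sb.append(String.format("%02X", hash[i] & 0xFF));
        }
        return sb.toString();
    }

    // Fingerprint of a certificate: digest of its encoded (DER) form.
    static String of(Certificate cert) throws CertificateEncodingException {
        return sha1Hex(cert.getEncoded());
    }

    // Two entries refer to the same certificate if their fingerprints
    // match; the aliases they are stored under play no role here.
    static boolean sameCertificate(Certificate a, Certificate b)
            throws CertificateEncodingException {
        return of(a).equals(of(b));
    }
}
```

The certificates themselves could be read from each store with `KeyStore.getCertificate(alias)` and passed to `sameCertificate`.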
>> 
>> Regards,
>> Andor
>> 
>> 
>> 
>>> On 2019. Apr 30., at 12:48, Andor Molnar  wrote:
>>> 
>>> Makes sense.
>>> 
>>> I’ve tested it on a single machine with the same cert/key for all
>> instances: keystore / truststore only contained a single entry and it
>> worked fine.
>>> We’ve also tested on multiple instances with multiple keys / instances
>> which also worked fine.
>>> 
>>> Let me give it another go with single machine / multiple certs combo.
>>> 
>>> I might need to modify the docs to emphasize keys must be generated on a
>> per machine basis, not ZK instance.
>>> 
>>> Regards,
>>> Andor
>>> 
>>> 
>>> 
>>>> On 2019. Apr 29., at 16:53, Flavio Junqueira  wrote:
>>>> 
>>>> I'm also +1 for adding a comment to the release notes (thanks for the
>> suggestion, Ted). Updating the readme makes sense, but the release notes
>> will be the main source to indicate that we require a specific or later
>> version of Java from that particular release. My preference would be to
>> update the release notes.
>>>> 
>>>> As for running TLS on a single node, have you been able to do it? I
>> haven't had a chance to look further into it throughout my day, so if
>> anyone has successfully done it and can share some instructions, it would
>> help me. Otherwise, I'll keep investigating once I have a chance. To be
>> specific, I created the keystore, certificate and truststore fil

Re: Crypto Policy (was: Re: [VOTE] Apache ZooKeeper release 3.5.5 candidate 5)

2019-04-29 Thread Flavio Junqueira
I'm also +1 for adding a comment to the release notes (thanks for the 
suggestion, Ted). Updating the readme makes sense, but the release notes will 
be the main source to indicate that we require a specific or later version of 
Java from that particular release. My preference would be to update the release 
notes.

As for running TLS on a single node, have you been able to do it? I haven't had 
a chance to look further into it throughout my day, so if anyone has 
successfully done it and can share some instructions, it would help me. 
Otherwise, I'll keep investigating once I have a chance. To be specific, I 
created the keystore, certificate and truststore files according to 
instructions, but the instructions assume that the aliases are different when 
it comes to populating the truststore. At that point, I had to get creative and 
I have tried a couple of options that didn't work. Either way, I think that 
being able to run locally and documenting is desirable, although not a blocker. 
If I can get it right, then I can write a gist describing it that we can use as 
a reference until we properly document it.

-Flavio

> On 29 Apr 2019, at 15:42, Andor Molnar  wrote:
> 
> Thanks Flavio for the investigation. I’ll update the README file to include 
> instructions on supported Java 8 versions.
> I’m wondering if I have to update the admin docs based on your problems 
> running TLS quorum on a single machine.
> 
> Andor
> 
> 
> 
>> On 2019. Apr 29., at 15:06, Enrico Olivelli  wrote:
>> 
>> Il lun 29 apr 2019, 13:44 Ted Dunning  ha scritto:
>> 
>>> Other changes in u211+ substantially improve how Java 8 applications behave
>>> in containers. I am seeing this more and more with customers.
>>> 
>>> Combined with the crypto issues, it might be worth a release note
>>> suggesting that if you are going to compile with Java 1.8, you should use a
>>> recent release at u211 (u212?) Or above.
>>> 
>> 
>> +1 for a note on release docs
>> 
>> 
>> Enrico
>> 
>> 
>> 
>> 
>>> On Mon, Apr 29, 2019, 11:43 AM Flavio Junqueira  wrote:
>>> 
>>>> I did a bit more research and it turns out that the crypto.policy option
>>>> was introduced u151:
>>>> 
>>>> 
>>> https://www.oracle.com/technetwork/java/javase/8u151-relnotes-3850493.html
>>>> <
>>>> 
>>> https://www.oracle.com/technetwork/java/javase/8u151-relnotes-3850493.html
>>>>> 
>>>> 
>>>> And started being defined by default with the "unlimited" option in u161:
>>>> 
>>>> 
>>> https://www.oracle.com/technetwork/java/javase/8u161-relnotes-4021379.html
>>>> <
>>>> 
>>> https://www.oracle.com/technetwork/java/javase/8u161-relnotes-4021379.html
>>>>> 
>>>> 
>>>> I have installed a more recent version, 1.8.0_211, and it builds fine
>>>> (all tests pass consistently for me).
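The two thresholds mentioned above (crypto.policy introduced in u151, "unlimited" by default from u161) can be turned into a quick check on a Java 8 version string. This is only a sketch under that reading of the release notes. `Java8CryptoPolicy` is a hypothetical name, and the parsing only understands the legacy "1.8.0_NNN" format, so it returns -1 (and both checks return false) for anything else, including newer Java versions.

```java
// Hypothetical helper for illustration only; thresholds are taken from
// the 8u151 and 8u161 release notes referenced in this thread.
class Java8CryptoPolicy {

    private static final String PREFIX = "1.8.0_";

    // Returns the update number of a version such as "1.8.0_152",
    // or -1 if the string is not in the legacy Java 8 format.
    static int java8Update(String version) {
        if (version == null || !version.startsWith(PREFIX)) {
            return -1;
        }
        String rest = version.substring(PREFIX.length());
        int end = 0;
        while (end < rest.length() && Character.isDigit(rest.charAt(end))) {
            end++;
        }
        if (end == 0) {
            return -1;
        }
        return Integer.parseInt(rest.substring(0, end));
    }

    // The crypto.policy option exists in java.security from u151 on.
    static boolean hasCryptoPolicyOption(String version) {
        return java8Update(version) >= 151;
    }

    // From u161 on, the option defaults to "unlimited".
    static boolean unlimitedByDefault(String version) {
        return java8Update(version) >= 161;
    }

    public static void main(String[] args) {
        String version = System.getProperty("java.version");
        System.out.println(version
                + " hasCryptoPolicyOption=" + hasCryptoPolicyOption(version)
                + " unlimitedByDefault=" + unlimitedByDefault(version));
    }
}
```

With the 1.8.0_152 build reported earlier in the thread, the option exists but is not yet enabled by default, which is consistent with having to uncomment it by hand.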
>>>> 
>>>> 
>>>> I'm now trying to start an ensemble with ssl enabled locally, but it is
>>>> failing for me. It looks like the instructions in the admin doc assumes
>>>> different hosts. I need to look more closely into it to determine what
>>>> it is that I'm doing wrong, but in any case, the instructions do not make it
>>>> very clear whether one can run locally.
>>>> 
>>>> -Flavio
>>>> 
>>>>> On 27 Apr 2019, at 19:28, Patrick Hunt  wrote:
>>>>> 
>>>>> Odd. I had done my testing on jdk11/macos which is fine.
>>>>> 
>>>>> I just tried jdk8 and 3 times in a row it's failing with:
>>>>> [ERROR]   SaslAuthTest.testZKOperationsAfterClientSaslAuthFailure:176 »
>>>>> Timeout Failed t...
>>>>> 
>>>>> I don't see the error Flavio is seeing. I have never installed special
>>>>> crypto libraries, etc... just vanilla jdk.
>>>>> 
>>>>> ⌂102% [phunt:~/Downloads/z/apache-zookeeper-3.5.5] 3s $ mvn --version
>>>>> Apache Maven 3.6.1 (d66c9c0b3152b2e69ee9bac180bb8fcc8e6af555;
>>>>> 2019-04-04T12:00:29-07:00)
>>>>> Maven home: /usr/local/Cellar/maven/3.6.1/libexec
>>>>> Java version: 1.8.0_201, vendor: Oracle Corporation, runtime:
>>>>> /Library/Java/JavaVirtualMachines/jdk1.8.0_201.jdk/Contents/Home/jre
>>>>> Default locale: en_US, platform encoding: UTF-8
>>>>> OS name: &qu

Re: Crypto Policy (was: Re: [VOTE] Apache ZooKeeper release 3.5.5 candidate 5)

2019-04-29 Thread Flavio Junqueira
 with a more recent version of Java?
>>> 
>>> Andor
>>> 
>>> 
>>> 
>>>> On 2019. Apr 27., at 17:33, Andor Molnar  wrote:
>>>> 
>>>> Good catch, thanks Flavio for reporting this. We need to double check
>> the tests with Ilya I believe.
>>>> 
>>>> Having tests failure means that you were actually able to _build_
>> ZooKeeper successfully without changing the crypto policy setting. Have you
>> tried to start an ensemble with Quorum TLS by any chance? That would add
>> some more color to this issue.
>>>> 
>>>> This might be just a testing issue.
>>>> 
>>>> Regards,
>>>> Andor
>>>> 
>>>> 
>>>> 
>>>>> On 2019. Apr 27., at 16:09, Flavio Junqueira  wrote:
>>>>> 
>>>>> Hi Enrico,
>>>>> 
>>>>> Here is the info you are requesting:
>>>>> 
>>>>> *Java version*
>>>>> 
>>>>> $ java -version
>>>>> java version "1.8.0_152"
>>>>> Java(TM) SE Runtime Environment (build 1.8.0_152-b16)
>>>>> Java HotSpot(TM) 64-Bit Server VM (build 25.152-b16, mixed mode)
>>>>> 
>>>>> *Test case errors*
>>>>> 
>>>>> I won’t post all of them, I get a good number of errors:
>>>>> 
>>>>> 
>>>>> [ERROR] Tests run: 64, Failures: 0, Errors: 16, Skipped: 0, Time
>> elapsed: 9.21 s <<< FAILURE! - in org.apache.zookeeper.util.PemReaderTest
>>>>> [ERROR]
>> testLoadCertificateFromKeyStore[1](org.apache.zookeeper.util.PemReaderTest)
>> Time elapsed: 1.593 s  <<< ERROR!
>>>>> java.io.IOException:
>> org.bouncycastle.operator.OperatorCreationException: Illegal key size or
>> default parameters
>>>>> at
>> org.apache.zookeeper.util.PemReaderTest.testLoadCertificateFromKeyStore(PemReaderTest.java:125)
>>>>> Caused by: org.bouncycastle.operator.OperatorCreationException:
>> Illegal key size or default parameters
>>>>> at
>> org.apache.zookeeper.util.PemReaderTest.testLoadCertificateFromKeyStore(PemReaderTest.java:125)
>>>>> Caused by: java.security.InvalidKeyException: Illegal key size or
>> default parameters
>>>>> at
>> org.apache.zookeeper.util.PemReaderTest.testLoadCertificateFromKeyStore(PemReaderTest.java:125)
>>>>> 
>>>>> [ERROR]
>> testLoadEncryptedPrivateKeyFromKeyStoreWithWrongPassword[1](org.apache.zookeeper.util.PemReaderTest)
>> Time elapsed: 0.004 s  <<< ERROR!
>>>>> java.lang.Exception: Unexpected exception,
>> expected but
>> was
>>>>> at
>> org.apache.zookeeper.util.PemReaderTest.testLoadEncryptedPrivateKeyFromKeyStoreWithWrongPassword(PemReaderTest.java:93)
>>>>> Caused by: org.bouncycastle.operator.OperatorCreationException:
>> Illegal key size or default parameters
>>>>> at
>> org.apache.zookeeper.util.PemReaderTest.testLoadEncryptedPrivateKeyFromKeyStoreWithWrongPassword(PemReaderTest.java:93)
>>>>> Caused by: java.security.InvalidKeyException: Illegal key size or
>> default parameters
>>>>> at
>> org.apache.zookeeper.util.PemReaderTest.testLoadEncryptedPrivateKeyFromKeyStoreWithWrongPassword(PemReaderTest.java:93)
>>>>> ...
>>>>> 
>>>>> 
>>>>> 
>>>>> *Crypto policy*
>>>>> If I uncomment this configuration option:
>>>>> 
>>>>> # Please see the JCA documentation for additional information on these
>>>>> # files and formats.
>>>>> # crypto.policy=unlimited
>>>>> 
>>>>> in:
>>>>> 
>>>>> $JAVA_HOME/jre/lib/security/java.security
>>>>> 
>>>>> then it all works and I get no error at all. This option controls
>> cryptographic strengths according to the documentation, and is present
>> because of crypto regulations in different countries.
>>>>> 
>>>>> Thanks,
>>>>> -Flavio
>>>>> 
>>>>>> On 27 Apr 2019, at 15:52, Enrico Olivelli 
>> wrote:
>>>>>> 
>>>>>> Il sab 27 apr 2019, 14:18 Flavio Junqueira  ha
>> scritto:
>>>>>> 
>>>>>>> I have a clarification question about the RC. To build th

[ANNOUNCE] New ZooKeeper PMC member: Andor Molnar

2019-04-29 Thread Flavio Junqueira
The Apache ZooKeeper PMC recently invited Andor to join the PMC and he has 
accepted. Andor has made some significant contributions to the project, 
including driving releases.  We are looking forward to even greater 
contributions from Andor now as part of the PMC.

Congratulations and welcome aboard Andor!

-Flavio on behalf of the Apache ZooKeeper PMC

Re: Request for review & contribution of Apache Training

2019-04-27 Thread Flavio Junqueira
This is great, Lars. Thanks for putting this together. I have some additional 
comments:

- I'd rather not characterize zookeeper as a key-value store, one reason being 
that to access a znode we give a "path" and not a "key". I don't want to open 
this up to a discussion on what a key-value store actually is, but I typically 
say that it is a hierarchy of znodes and explain what a znode is (Slide 2). If 
it is early to talk about znodes, then say hierarchy of simple data files.
- It would be nice to have a slide illustrating how to get going with 
zookeeper, like starting a server with `bin/zkServer.sh start` and using the 
`bin/zkCli.sh` to issue some commands.
- In Slide 8, the statement saying that "in an ensemble we always have a single 
leader" isn't accurate. I think you want to say something along the lines of 
"in an ensemble there is at most one leader server supported by a majority of 
followers".
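To illustrate the first comment (a hierarchy of znodes addressed by path rather than a flat set of keys), here is a toy in-memory sketch that could back a slide. It is not the ZooKeeper client API: `ZnodeTreeSketch` and its methods are invented for this example, and it leaves out versions, watches, ACLs and ephemeral nodes entirely.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model for illustration only; not the ZooKeeper API.
class ZnodeTreeSketch {

    // Full path -> data. The root "/" always exists.
    private final Map<String, byte[]> nodes = new TreeMap<>();

    ZnodeTreeSketch() {
        nodes.put("/", new byte[0]);
    }

    // Creates a node; unlike a flat key-value put, the parent must exist.
    void create(String path, byte[] data) {
        if (!path.startsWith("/") || path.length() < 2 || path.endsWith("/")) {
            throw new IllegalArgumentException("invalid path: " + path);
        }
        if (nodes.containsKey(path)) {
            throw new IllegalStateException("node exists: " + path);
        }
        if (!nodes.containsKey(parentOf(path))) {
            throw new IllegalStateException("no parent for: " + path);
        }
        nodes.put(path, data.clone());
    }

    // Returns a copy of the data stored at the given path.
    byte[] getData(String path) {
        byte[] data = nodes.get(path);
        if (data == null) {
            throw new IllegalStateException("no node: " + path);
        }
        return data.clone();
    }

    // Lists the names of the direct children of the given path.
    List<String> getChildren(String path) {
        List<String> children = new ArrayList<>();
        for (String candidate : nodes.keySet()) {
            if (!candidate.equals("/") && parentOf(candidate).equals(path)) {
                children.add(candidate.substring(candidate.lastIndexOf('/') + 1));
            }
        }
        return children;
    }

    private static String parentOf(String path) {
        int slash = path.lastIndexOf('/');
        return slash == 0 ? "/" : path.substring(0, slash);
    }
}
```

The point for learners is that `create("/app/config", ...)` fails until `/app` exists and that children can be listed per node, neither of which a plain key-value store offers.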

Thanks again for the initiative.

-Flavio

> On 27 Apr 2019, at 00:55, Lars Francke  wrote:
> 
> Thank you Patrick, that's great!
> 
> I've incorporated the feedback (and a bit more) and have uploaded a new
> version.
> 
> This is just a first start and I hope that over time the community will
> expand on it (e.g. slides on use cases/patterns/recipes, API usage, the Zab
> protocol, etc.). This is also an invitation to the ZooKeeper community to
> add anything at any point if you feel it's worthwhile to teach to potential
> learners. If you don't want to fiddle around with slides that's fine as
> well, just filing an issue in Jira in the TRAINING project will already
> help.
> 
> Cheers,
> Lars
> 
> On Fri, Apr 26, 2019 at 10:49 PM Patrick Hunt  wrote:
> 
>> Wow, this is great, thanks Lars. I reviewed, here are my comments:
>> 
>> 3: you might highlight that it's used outside hadoop (e.g. solr and many
>> others inside and outside apache). Also heavily used in the tech industry;
>> facebook, twitter, linkedin, etc...
>> 
>> You might also reference Curator as it's a good way to get started from a
>> client perspective.
>> 
>> Regards,
>> 
>> Patrick
>> 
>> On Fri, Apr 26, 2019 at 7:20 AM Lars Francke 
>> wrote:
>> 
>>> Hi ZooKeeper devs,
>>> 
>>> we started the Apache Training (incubating) project back in February.
>>> Our aim is to develop training material (slides, labs, tests etc.) for all
>>> kinds of projects (mostly focusing on Apache projects but not
>>> exclusively).
>>> 
>>> We're still getting off the ground with homepage and so. We're also
>>> working
>>> on the very first content donation which is a super simple four slide
>>> ZooKeeper introduction.
>>> 
>>> I chose this because it's small and we can test our processes.
>>> 
>>> I still wanted to reach out and see if anyone here is interested in
>>> contributing anything and/or reviewing the existing material. It can be
>>> found in Jira: 
>>> 
>>> Thank you very much!
>>> 
>>> Cheers,
>>> Lars
>>> 
>> 



Re: Crypto Policy (was: Re: [VOTE] Apache ZooKeeper release 3.5.5 candidate 5)

2019-04-27 Thread Flavio Junqueira
Hi Enrico,

Here is the info you are requesting:

*Java version*

$ java -version
java version "1.8.0_152"
Java(TM) SE Runtime Environment (build 1.8.0_152-b16)
Java HotSpot(TM) 64-Bit Server VM (build 25.152-b16, mixed mode)

*Test case errors*

I won’t post all of them, I get a good number of errors:


[ERROR] Tests run: 64, Failures: 0, Errors: 16, Skipped: 0, Time elapsed: 9.21 
s <<< FAILURE! - in org.apache.zookeeper.util.PemReaderTest
[ERROR] 
testLoadCertificateFromKeyStore[1](org.apache.zookeeper.util.PemReaderTest)  
Time elapsed: 1.593 s  <<< ERROR!
java.io.IOException: org.bouncycastle.operator.OperatorCreationException: 
Illegal key size or default parameters
at 
org.apache.zookeeper.util.PemReaderTest.testLoadCertificateFromKeyStore(PemReaderTest.java:125)
Caused by: org.bouncycastle.operator.OperatorCreationException: Illegal key 
size or default parameters
at 
org.apache.zookeeper.util.PemReaderTest.testLoadCertificateFromKeyStore(PemReaderTest.java:125)
Caused by: java.security.InvalidKeyException: Illegal key size or default 
parameters
at 
org.apache.zookeeper.util.PemReaderTest.testLoadCertificateFromKeyStore(PemReaderTest.java:125)

[ERROR] 
testLoadEncryptedPrivateKeyFromKeyStoreWithWrongPassword[1](org.apache.zookeeper.util.PemReaderTest)
  Time elapsed: 0.004 s  <<< ERROR!
java.lang.Exception: Unexpected exception, 
expected but was
at 
org.apache.zookeeper.util.PemReaderTest.testLoadEncryptedPrivateKeyFromKeyStoreWithWrongPassword(PemReaderTest.java:93)
Caused by: org.bouncycastle.operator.OperatorCreationException: Illegal key 
size or default parameters
at 
org.apache.zookeeper.util.PemReaderTest.testLoadEncryptedPrivateKeyFromKeyStoreWithWrongPassword(PemReaderTest.java:93)
Caused by: java.security.InvalidKeyException: Illegal key size or default 
parameters
at 
org.apache.zookeeper.util.PemReaderTest.testLoadEncryptedPrivateKeyFromKeyStoreWithWrongPassword(PemReaderTest.java:93)
...



*Crypto policy*
If I uncomment this configuration option:

  # Please see the JCA documentation for additional information on these
  # files and formats.
  # crypto.policy=unlimited

in:

   $JAVA_HOME/jre/lib/security/java.security

then it all works and I get no error at all. This option controls cryptographic 
strengths according to the documentation, and is present because of crypto 
regulations in different countries.
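A quick way to see which policy a given JRE is actually running with, without reading java.security by hand, is to ask the JCE for the maximum allowed AES key length. The sketch below uses only standard JDK calls (`Cipher.getMaxAllowedKeyLength` and `Security.getProperty`). The class name `CryptoPolicyCheck` and the 256-bit cut-off used in the message are my own choices for illustration.

```java
import java.security.NoSuchAlgorithmException;
import java.security.Security;
import javax.crypto.Cipher;

// Illustrative check only; the class name and wording are assumptions.
class CryptoPolicyCheck {

    // Turns the maximum allowed AES key length into a readable verdict.
    // Under the limited policy the JDK caps AES at 128 bits.
    static String describe(int maxAesKeyBits) {
        if (maxAesKeyBits >= 256) {
            return "unlimited (AES keys of 256 bits or more are allowed)";
        }
        return "limited (max AES key: " + maxAesKeyBits + " bits)";
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        // May print null when the property is commented out in java.security.
        System.out.println("crypto.policy = " + Security.getProperty("crypto.policy"));
        System.out.println("policy in effect: "
                + describe(Cipher.getMaxAllowedKeyLength("AES")));
    }
}
```

If this reports "limited", errors such as "Illegal key size or default parameters" in the tests above would be expected.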

Thanks,
-Flavio

> On 27 Apr 2019, at 15:52, Enrico Olivelli  wrote:
> 
> Il sab 27 apr 2019, 14:18 Flavio Junqueira  ha scritto:
> 
>> I have a clarification question about the RC. To build the RC, I had to
>> enable crypto.policy unlimited in the jre (I'm using build 1.8.0_152-b16).
> 
> 
> Flavio
> What do you mean with 'build' ?
> Make tests pass?
> AFAIK we are not using tweaked jdks in CI builds, so in theory there is no
> need.
> 
> Can you please share your error?
> 
> Enrico
> 
> 
> I'm wondering if this is going to be an issue for some users as this option
>> is related to import/export regulation. Has anyone looked into it and could
>> clarify it to me, please?
>> 
>> Thanks,
>> -Flavio
>> 
>> 
>>> On 25 Apr 2019, at 15:10, Andor Molnar  wrote:
>>> 
>>> This is the first stable release of 3.5 branch: 3.5.5. It resolves 117
>> issues, including Maven migration, Quorum TLS, TTL nodes and lots of other
>> performance and stability improvements.
>>> 
>>> The full release notes is available at:
>>> 
>>> 
>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310801&version=12343268
>>> 
>>> *** Please download, test and vote by May 3rd 2019, 23:59 UTC+0. ***
>>> 
>>> Source files:
>>> https://dist.apache.org/repos/dist/dev/zookeeper/zookeeper-3.5.5-rc5/
>>> 
>>> Maven staging repos:
>>> 
>> https://repository.apache.org/content/groups/staging/org/apache/zookeeper/parent/3.5.5/
>>> 
>> https://repository.apache.org/content/groups/staging/org/apache/zookeeper/zookeeper-jute/3.5.5/
>>> 
>> https://repository.apache.org/content/groups/staging/org/apache/zookeeper/zookeeper/3.5.5/
>>> 
>>> The release candidate tag in git to be voted upon: release-3.5.5-rc5
>>> 
>>> ZooKeeper's KEYS file containing PGP keys we use to sign the release:
>>> http://www.apache.org/dist/zookeeper/KEYS
>>> 
>>> Should we release this candidate?
>>> 
>> 
>> 



Crypto Policy (was: Re: [VOTE] Apache ZooKeeper release 3.5.5 candidate 5)

2019-04-27 Thread Flavio Junqueira
I have a clarification question about the RC. To build the RC, I had to enable 
crypto.policy unlimited in the jre (I'm using build 1.8.0_152-b16). I'm 
wondering if this is going to be an issue for some users as this option is 
related to import/export regulation. Has anyone looked into it and could 
clarify it to me, please?

Thanks,
-Flavio


> On 25 Apr 2019, at 15:10, Andor Molnar  wrote:
> 
> This is the first stable release of 3.5 branch: 3.5.5. It resolves 117 
> issues, including Maven migration, Quorum TLS, TTL nodes and lots of other 
> performance and stability improvements.
> 
> The full release notes is available at:
> 
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310801&version=12343268
> 
> *** Please download, test and vote by May 3rd 2019, 23:59 UTC+0. ***
> 
> Source files:
> https://dist.apache.org/repos/dist/dev/zookeeper/zookeeper-3.5.5-rc5/
> 
> Maven staging repos:
> https://repository.apache.org/content/groups/staging/org/apache/zookeeper/parent/3.5.5/
> https://repository.apache.org/content/groups/staging/org/apache/zookeeper/zookeeper-jute/3.5.5/
> https://repository.apache.org/content/groups/staging/org/apache/zookeeper/zookeeper/3.5.5/
> 
> The release candidate tag in git to be voted upon: release-3.5.5-rc5
> 
> ZooKeeper's KEYS file containing PGP keys we use to sign the release:
> http://www.apache.org/dist/zookeeper/KEYS
> 
> Should we release this candidate?
> 



Re: [VOTE] Apache ZooKeeper release 3.4.14 candidate 5

2019-03-27 Thread Flavio Junqueira
+1, I have checked the following:

- Signature and checksums
- Builds locally, unit tests pass
- NOTICE file has been updated to reflect the present year
- Smoke tests with a local ensemble
- Maven project is able to resolve zookeeper dependency using the staging 
repository
- Release notes

-Flavio

> On 13 Mar 2019, at 19:07, Patrick Hunt  wrote:
> 
> +1 - sig/xsum verified, rat ran clean, was able to run various ensemble
> sizes successfully.
> 
> Patrick
> 
> On Wed, Mar 6, 2019 at 10:56 AM Andor Molnar  wrote:
> 
>> This is a bugfix release candidate for 3.4.14. It fixes 8 issues, mostly
>> build / unit tests issues,
>> dependency updates flagged by OWASP, NPE and a name resolution problem.
>> Among these it also supports
>> experimental Maven build and Markdown based documentation generation.
>> 
>> The full release notes is available at:
>> 
>> 
>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310801&version=12343587
>> 
>> *** Please download, test and vote by March 10th 2019, 23:59 UTC+0. ***
>> 
>> Source files:
>> https://dist.apache.org/repos/dist/dev/zookeeper/zookeeper-3.4.14-rc5/
>> 
>> Maven staging repo:
>> 
>> https://repository.apache.org/content/groups/staging/org/apache/zookeeper/zookeeper/3.4.14/
>> 
>> The release candidate tag in git to be voted upon: release-3.4.14-rc5
>> 
>> ZooKeeper's KEYS file containing PGP keys we use to sign the release:
>> http://www.apache.org/dist/zookeeper/KEYS
>> 
>> Should we release this candidate?
>> 
>> Regards,
>> Andor
>> 
>> 
>> 



Re: Question on merge script

2018-05-09 Thread Flavio Junqueira
Thanks for the feedback, Pat. I think the wiki page with merge script 
instructions needs updating. I'll explore it a bit further and will update it.

-Flavio

> On 9 May 2018, at 20:05, Patrick Hunt <ph...@apache.org> wrote:
> 
> On Wed, May 9, 2018 at 1:18 AM, Flavio Junqueira <f...@apache.org> wrote:
> 
>> Hey Michael,
>> 
>> I was trying to merge yesterday a PR generated against branch-3.5, and
>> fetching the PR branch did not give me the merge script. I ended up asking
>> the contributor to change the target branch to master so that I avoid any
>> small hacks with the merge script.
>> 
>> 
> fwiw that's not the workflow I use. I always fetch the latest repo content,
> then switch to the master and use the script to merge/push a PR. It doesn't
> matter which PR or branch you want to merge, you just run the script off
> master and it handles the rest. If the branch/PR is off 3.4 it all just
> works.
> 
> 
>> We should consider doing the following two things, and let me know if it
>> makes sense:
>> 1- Clarifying that if a change is supposed to go to both branch-3.5 and
>> master, the PR should be against master
>> 
> 
> As long as it applies cleanly to master and br35 (etc...) this is not
> really necessary. You use the merge script to merge it into the target
> branch, then after you push that change to apache git repo it will ask you
> if you want to merge to other branches. Typically I would ask the OP to
> post multiple PRs if there are conflicts. I don't usually commit to just
> one branch if the change is necessary for multiple branches and there are
> conflicts. (I wait for all the PRs covering all the branches cleanly)
> 
> 
>> 2- Perhaps merging to branch-3.5 so that I see the script when I fetch a
>> PR branch off branch-3.5. This is unusual, but it is not unreasonable that
>> we have eventually PRs for branch-3.5 only.
>> 
>> I'm focusing on 3.5, but the same reasoning applies to 3.4.
>> 
>> 
> I always just start with master checked out and run the script. Seems fine
> to me and it means we don't need to maintain multiple versions of the
> scripts and keep them in sync. What's the benefit of doing otw?
> 
> Patrick
> 
> 
>> -Flavio
>> 
>> 
>>> On 9 May 2018, at 01:49, Michael Han <h...@apache.org> wrote:
>>> 
>>> Hi Flavio,
>>> 
>>> The merge script is branch agnostic - it only cares about the pull
>> request
>>> number. As long as in the pull request the correct target branch is
>>> specified, the merge script will do its job by merging the change to the
>>> specified target branch. I guess we could commit the same script to
>>> branch-3.5 but the current script in master should be able to do what you
>>> asked.
>>> 
>>> On Tue, May 8, 2018 at 4:06 PM, Flavio Junqueira <f...@apache.org> wrote:
>>> 
>>>> Could anyone remind me why we don't have the merge script on branch-3.5?
>>>> Say I have a change that targets branch-3.5 alone. Shouldn't I be able
>> to
>>>> have a PR that targets branch-3.5 and use the merge script?
>>>> 
>>>> Thanks,
>>>> -Flavio
>>> 
>>> 
>>> 
>>> 
>>> --
>>> Cheers
>>> Michael
>> 
>> 



Re: Name resolution in StaticHostProvider

2018-05-09 Thread Flavio Junqueira
I'm actually now wondering whether we should be using an unchecked exception 
instead. A lot of things have changed with exception handling since we wrote 
this code base initially. An unchecked exception would actually match better my 
current mental model of what that signature should look like.

-Flavio

> On 9 May 2018, at 16:44, Flavio Junqueira <f...@apache.org> wrote:
> 
> I like the idea of indicating to the application that there is something 
> wrong with the list of servers so that it has a chance to look into it. With 
> the current code in `ClientCnxn`, we will log at warn level and hope that 
> someone sees it, but we are not really stopping the client. Throwing might 
> actually be an improvement as it will output a log message, but I'm now 
> wondering if we should propagate it all the way to the application. 
> Responding to myself, one reason for not doing it is that it is not a fatal 
> error unless no server can be resolved.
> 
> -Flavio
> 
>> On 8 May 2018, at 16:06, Andor Molnar <an...@cloudera.com> wrote:
>> 
>> Hi,
>> 
>> Updating this thread, because the PR is still being reviewed on GitHub.
>> 
>> So, the reason why I refactored the original behaviour of
>> StaticHostProvider is that I believe that it's trying to do something which
>> is not its responsibility. Please tell me if there's a good historical
>> reason for that.
>> 
>> My approach is giving the user the following two options:
>> 1- Use static IP addresses, if you don't want to deal with DNS resolution
>> at all - we guarantee that no DNS logic will be involved in this case at all.
>> 2- Use DNS hostnames if you have a reliable DNS service for resolution
>> (with HA, secondary servers, backups, etc.) - we must use DNS in the right
>> way in this case e.g. do NOT cache IP addresses for longer than the DNS
>> server allows and re-resolve after the TTL expires, because it's mandatory
>> by protocol.
>> 
>> My 2 cents here:
>> - the fix which was originally posted for re-resolution is a workaround and
>> doesn't satisfy the requirement for #2,
>> - the solution is already built-in in JDK and DNS clients in the right way
>> - can't see a reason why we shouldn't use that
>> 
>> I checked this in some other projects as well and found very similar
>> approach in hadoop-common's SecurityUtil.java. It has 2 built-in plugins
>> for that:
>> - Standard resolver uses java's built-in getByName().
>> - Qualified resolver still uses getByName(), but adds some logic to avoid
>> incorrect re-resolutions and reverse IP lookups.
>> 
>> Please let me know your thoughts.
>> 
>> Regards,
>> Andor
>> 
>> 
>> 
>> 
>> 
>> 
>> On Tue, Mar 6, 2018 at 8:12 AM, Andor Molnar <an...@cloudera.com> wrote:
>> 
>>> Hi Abe,
>>> 
>>> Unfortunately we haven't got any feedback yet. What do you think of
>>> implementing Option #3?
>>> 
>>> Regards,
>>> Andor
>>> 
>>> 
>>> On Thu, Feb 22, 2018 at 6:06 PM, Andor Molnar <an...@cloudera.com> wrote:
>>> 
>>>> Did anybody happen to take a quick look by any chance?
>>>> 
>>>> I don't want to push this too hard, because I know it's a time consuming
>>>> topic to think about, but this is a blocker in 3.5 which has been hanging
>>>> around for a while and any feedback would be extremely helpful to close it
>>>> quickly.
>>>> 
>>>> Thanks,
>>>> Andor
>>>> 
>>>> 
>>>> 
>>>> On Mon, Feb 19, 2018 at 12:18 PM, Andor Molnar <an...@cloudera.com>
>>>> wrote:
>>>> 
>>>>> Hi all,
>>>>> 
>>>>> We need more eyes and brains on the following PR:
>>>>> 
>>>>> https://github.com/apache/zookeeper/pull/451
>>>>> 
>>>>> I added a comment few days ago about the way we currently do DNS name
>>>>> resolution in this class and a suggestion on how we could simplify things a
>>>>> little bit. We talked about it with Abe Fine, but we're a little bit unsure
>>>>> and cannot get a conclusion. It would be extremely handy to get more
>>>>> feedback from you.
>>>>> 
>>>>> To add some colour to it, let me elaborate on the situation here:
>>>>> 
>>>>> In general, the task that StaticHostProvider does is to get a list of
>>>>> potentially unresolved InetSocketAddress obje

Re: Name resolution in StaticHostProvider

2018-05-09 Thread Flavio Junqueira
I like the idea of indicating to the application that there is something wrong 
with the list of servers so that it has a chance to look into it. With the 
current code in `ClientCnxn`, we will log at warn level and hope that someone 
sees it, but we are not really stopping the client. Throwing might actually be 
an improvement as it will output a log message, but I'm now wondering if we 
should propagate it all the way to the application. Responding to myself, one 
reason for not doing it is that it is not a fatal error unless no server can be 
resolved.
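To make the trade-off concrete, here is a rough sketch of the behaviour I have in mind: warn for each name that fails to resolve, and only throw (unchecked) when nothing resolves at all. None of this is existing ZooKeeper code. `LenientResolver`, `Lookup` and `resolveAll` are hypothetical names, and the lookup is passed in so that the logic can be exercised without a real DNS server.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch for discussion; not the StaticHostProvider code.
class LenientResolver {

    // Pluggable lookup; in production this would be InetAddress::getByName.
    interface Lookup {
        InetAddress resolve(String host) throws UnknownHostException;
    }

    // Resolves every name it can. A single failure only produces a
    // warning; the call fails with an unchecked exception when no
    // server at all can be resolved.
    static List<InetAddress> resolveAll(List<String> hosts, Lookup lookup) {
        List<InetAddress> resolved = new ArrayList<>();
        for (String host : hosts) {
            try {
                resolved.add(lookup.resolve(host));
            } catch (UnknownHostException e) {
                System.err.println("WARN: unable to resolve " + host);
            }
        }
        if (resolved.isEmpty()) {
            throw new IllegalStateException(
                    "none of the configured servers could be resolved: " + hosts);
        }
        return resolved;
    }

    public static void main(String[] args) {
        // IP literals are parsed locally, so this does not need DNS.
        System.out.println(resolveAll(
                java.util.Arrays.asList("127.0.0.1"), InetAddress::getByName));
    }
}
```

Whether `IllegalStateException` is the right type is open; the sketch only shows where the fatal/non-fatal line would be drawn.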

-Flavio
 
> On 8 May 2018, at 16:06, Andor Molnar  wrote:
> 
> Hi,
> 
> Updating this thread, because the PR is still being reviewed on GitHub.
> 
> So, the reason why I refactored the original behaviour of
> StaticHostProvider is that I believe that it's trying to do something which
> is not its responsibility. Please tell me if there's a good historical
> reason for that.
> 
> My approach is giving the user the following two options:
> 1- Use static IP addresses, if you don't want to deal with DNS resolution
> at all - we guarantee that no DNS logic will be involved in this case at all.
> 2- Use DNS hostnames if you have a reliable DNS service for resolution
> (with HA, secondary servers, backups, etc.) - we must use DNS in the right
> way in this case e.g. do NOT cache IP addresses for longer than the DNS
> server allows and re-resolve after the TTL expires, because it's mandatory
> by protocol.
> 
> My 2 cents here:
> - the fix which was originally posted for re-resolution is a workaround and
> doesn't satisfy the requirement for #2,
> - the solution is already built-in in JDK and DNS clients in the right way
> - can't see a reason why we shouldn't use that
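As a reading aid for the proposal quoted above, this is roughly what "let the JDK do the resolution" could look like: keep only the names, shuffle them once, and call `InetAddress.getByName()` every time `next()` is invoked, leaving caching and re-resolution to the JDK (its positive cache lifetime is governed by the `networkaddress.cache.ttl` security property). This is a sketch, not the code in the PR. `ResolveOnNextSketch` is a made-up class, and wrapping the failure in `UncheckedIOException` is just one possible choice.

```java
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch; not the StaticHostProvider implementation.
class ResolveOnNextSketch {

    // Only the names (or IP literals) are kept, never resolved addresses.
    private final List<String> names;
    private int index = -1;

    ResolveOnNextSketch(Collection<String> serverNames) {
        if (serverNames.isEmpty()) {
            throw new IllegalArgumentException("at least one server is required");
        }
        names = new ArrayList<>(serverNames);
        Collections.shuffle(names);
    }

    // Moves to the next name in round-robin order and resolves it now.
    // IP literals are parsed without any DNS traffic.
    InetAddress next() {
        index = (index + 1) % names.size();
        String name = names.get(index);
        try {
            return InetAddress.getByName(name);
        } catch (UnknownHostException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With static IP addresses in the list this never touches DNS, which matches the first of the two options described above.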
> 
> I checked this in some other projects as well and found very similar
> approach in hadoop-common's SecurityUtil.java. It has 2 built-in plugins
> for that:
> - Standard resolver uses java's built-in getByName().
> - Qualified resolver still uses getByName(), but adds some logic to avoid
> incorrect re-resolutions and reverse IP lookups.
> 
> Please let me know your thoughts.
> 
> Regards,
> Andor
> 
> 
> 
> 
> 
> 
> On Tue, Mar 6, 2018 at 8:12 AM, Andor Molnar  wrote:
> 
>> Hi Abe,
>> 
>> Unfortunately we haven't got any feedback yet. What do you think of
>> implementing Option #3?
>> 
>> Regards,
>> Andor
>> 
>> 
>> On Thu, Feb 22, 2018 at 6:06 PM, Andor Molnar  wrote:
>> 
>>> Did anybody happen to take a quick look by any chance?
>>> 
>>> I don't want to push this too hard, because I know it's a time consuming
>>> topic to think about, but this is a blocker in 3.5 which has been hanging
>>> around for a while and any feedback would be extremely helpful to close it
>>> quickly.
>>> 
>>> Thanks,
>>> Andor
>>> 
>>> 
>>> 
>>> On Mon, Feb 19, 2018 at 12:18 PM, Andor Molnar 
>>> wrote:
>>> 
 Hi all,
 
 We need more eyes and brains on the following PR:
 
 https://github.com/apache/zookeeper/pull/451
 
 I added a comment a few days ago about the way we currently do DNS name
 resolution in this class and a suggestion on how we could simplify things a
 little bit. We talked about it with Abe Fine, but we're a little bit unsure
 and cannot reach a conclusion. It would be extremely handy to get more
 feedback from you.
 
 To add some colour to it, let me elaborate on the situation here:
 
 In general, the task that StaticHostProvider does is to get a list of
 potentially unresolved InetSocketAddress objects, resolve them and iterate
 over the resolved objects by calling next() method.
 
 *Option #1 (current logic)*
 - Resolve addresses with getAllByName() which returns a list of IP
 addresses associated with the address.
 - Cache all these IP's, shuffle them and iterate over.
 - If client is unable to connect to an IP, remove all IPs from the list
 which the original servername was resolved to and re-resolve it.
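
The Option #1 behaviour can be sketched roughly as follows. This is not the StaticHostProvider code; CachingResolverSketch, Entry and onConnectFailure are hypothetical names, and the resolver is injected so the idea can be tried without a DNS server.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

final class CachingResolverSketch {
    // One cached entry: a resolved IP plus the server name it came from.
    static final class Entry {
        final String host;
        final InetAddress ip;
        Entry(String host, InetAddress ip) { this.host = host; this.ip = ip; }
    }

    private final Function<String, List<InetAddress>> resolver;
    private final List<Entry> cache = new ArrayList<>();

    CachingResolverSketch(List<String> hosts, Function<String, List<InetAddress>> resolver) {
        this.resolver = resolver;
        for (String h : hosts) addAll(h);   // resolve every name once, up front
        Collections.shuffle(cache);         // iterate over the IPs in random order
    }

    private void addAll(String host) {
        for (InetAddress ip : resolver.apply(host)) cache.add(new Entry(host, ip));
    }

    // On a failed connection, drop every IP that came from the same name,
    // then resolve that name again.
    void onConnectFailure(Entry failed) {
        cache.removeIf(e -> e.host.equals(failed.host));
        addAll(failed.host);
    }

    List<Entry> entries() { return Collections.unmodifiableList(cache); }

    // Resolver backed by getAllByName(); unresolvable names yield no entries.
    static List<InetAddress> jdkResolve(String host) {
        try {
            return Arrays.asList(InetAddress.getAllByName(host));
        } catch (UnknownHostException e) {
            return Collections.emptyList();
        }
    }

    public static void main(String[] args) {
        // Literal documentation-range IPs stand in for real server names.
        CachingResolverSketch s = new CachingResolverSketch(
                Arrays.asList("192.0.2.1", "192.0.2.2"), CachingResolverSketch::jdkResolve);
        s.onConnectFailure(s.entries().get(0));
        System.out.println(s.entries().size());
    }
}
```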
 
 *Option #2 (getByName())*
 - Resolve address with getByName() instead which returns only the first
 IP address of the name,
 - Do not cache IPs,
 - Shuffle the *names* and resolve with getByName() *every time* when
 next() is called,
 - JDK's built-in caching will prevent name servers from being flooded
 and will do the re-resolution automatically when cache expires,
 - Names with multiple IPs will be handled by DNS servers which (if
 configured properly) return IPs in different order - this is called DNS
 Round Robin -, so getByName() will return different IP on each call.
 
 *Option #3*
 - There's a small problem with option #2: if the DNS server is not configured
 properly and handles the round-robin case in a way that it always returns
 the IP list in the same order, getByName() will never return the next IP,
 - In order to overcome that, use 

Re: Question on merge script

2018-05-09 Thread Flavio Junqueira
Hey Michael,

Yesterday I was trying to merge a PR opened against branch-3.5, and fetching 
the PR branch did not give me the merge script. I ended up asking the 
contributor to change the target branch to master so that I could avoid any 
small hacks with the merge script.

We should consider doing the following two things; let me know if they make 
sense:
1- Clarifying that if a change is supposed to go to both branch-3.5 and master, 
the PR should be against master.
2- Perhaps merging the script to branch-3.5 so that I see it when I fetch a PR 
branch off branch-3.5. This is unusual, but it is not unreasonable to expect 
that we will eventually have PRs for branch-3.5 only.

I'm focusing on 3.5, but the same reasoning applies to 3.4.

-Flavio

 
> On 9 May 2018, at 01:49, Michael Han <h...@apache.org> wrote:
> 
> Hi Flavio,
> 
> The merge script is branch agnostic - it only cares about the pull request
> number. As long as in the pull request the correct target branch is
> specified, the merge script will do its job by merging the change to the
> specified target branch. I guess we could commit the same script to
> branch-3.5 but the current script in master should be able to do what you
> asked.
> 
> On Tue, May 8, 2018 at 4:06 PM, Flavio Junqueira <f...@apache.org> wrote:
> 
>> Could anyone remind me why we don't have the merge script on branch-3.5?
>> Say I have a change that targets branch-3.5 alone. Shouldn't I be able to
>> have a PR that targets branch-3.5 and use the merge script?
>> 
>> Thanks,
>> -Flavio
> 
> 
> 
> 
> -- 
> Cheers
> Michael



Question on merge script

2018-05-08 Thread Flavio Junqueira
Could anyone remind me why we don't have the merge script on branch-3.5? Say I 
have a change that targets branch-3.5 alone. Shouldn't I be able to have a PR 
that targets branch-3.5 and use the merge script?

Thanks,
-Flavio

[jira] [Resolved] (ZOOKEEPER-2982) Re-try DNS hostname -> IP resolution

2018-05-08 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira resolved ZOOKEEPER-2982.
-
Resolution: Fixed

Issue resolved by pull request 513
[https://github.com/apache/zookeeper/pull/513]

> Re-try DNS hostname -> IP resolution
> 
>
>                 Key: ZOOKEEPER-2982
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2982
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.5.0, 3.5.1, 3.5.3
>            Reporter: Eron Wright
>            Assignee: Flavio Junqueira
>            Priority: Blocker
>             Fix For: 3.6.0, 3.5.4
>
>         Attachments: 3.5.3-beta.zip, fixed.log
>
>
> ZOOKEEPER-1506 fixed a DNS resolution issue in 3.4.  Some portions of the fix 
> haven't yet been ported to 3.5.
> To recap the outstanding problem in 3.5, if a given ZK server is started 
> before all peer addresses are resolvable, that server may cache a negative 
> lookup result and forever fail to resolve the address. For example, 
> deploying ZK 3.5 to Kubernetes using a StatefulSet plus a Service (headless) 
> may fail because the DNS records are created lazily.
> {code}
> 2018-02-18 09:11:22,583 [myid:0] - WARN  [QuorumPeer[myid=0](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled):Follower@95] - Exception when following the leader
> java.net.UnknownHostException: zk-2.zk.default.svc.cluster.local
>     at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:184)
>     at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>     at java.net.Socket.connect(Socket.java:589)
>     at org.apache.zookeeper.server.quorum.Learner.sockConnect(Learner.java:227)
>     at org.apache.zookeeper.server.quorum.Learner.connectToLeader(Learner.java:256)
>     at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:76)
>     at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1133)
> {code}
> In the above example, the address `zk-2.zk.default.svc.cluster.local` was not 
> resolvable when the server started, but became resolvable shortly thereafter. 
> The server should eventually succeed but doesn't.
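
To illustrate the general failure mode described in the issue (this is a sketch under stated assumptions, not the change made in pull request 513): an InetSocketAddress built from a host name tries to resolve it once, in the constructor, and the instance keeps that outcome. Building a fresh instance before each connection attempt gives resolution another chance. ReResolveSketch and refresh() are hypothetical names, and a literal documentation-range IP stands in for a peer that became resolvable after startup.

```java
import java.net.InetSocketAddress;

final class ReResolveSketch {
    // Builds a fresh InetSocketAddress from the same host string and port.
    // The constructor attempts resolution again, whereas the old instance
    // keeps whatever result (or failure) it recorded when it was created.
    static InetSocketAddress refresh(InetSocketAddress stale) {
        return new InetSocketAddress(stale.getHostString(), stale.getPort());
    }

    public static void main(String[] args) {
        // createUnresolved() stands in for an address whose lookup failed at startup.
        InetSocketAddress stale = InetSocketAddress.createUnresolved("192.0.2.5", 2888);
        InetSocketAddress fresh = refresh(stale);
        System.out.println(stale.isUnresolved() + " -> " + fresh.isUnresolved());
    }
}
```

A retry loop around the connection attempt would call refresh() each time instead of reusing the address object that was created at startup.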



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Name resolution in StaticHostProvider

2018-05-08 Thread Flavio Junqueira
The refactoring did not seem justifiable at first, hence my reaction to it. You 
have clarified the reason for including the changes, and I actually like it.

About the exception, there are two points for me:

1- You don't really need to throw to execute the plan you described.
2- In the case we do throw, which is not entirely unreasonable, I'd think about 
the expected behaviour of a method called "next()". In general, I'd expect it 
to either return the next item or raise an error saying that it cannot return 
an item. The "UnknownHostException" is not doing the latter, though. It is 
indicating that one of the elements of the host provider set is "broken" (not 
resolvable). That's one interpretation. Another interpretation is that the 
logic in "next" needs to indicate to the caller that there is a problem with an 
item in the set of elements of the host provider. 

For (2) let's converge on what semantics we are trying to provide first, please.
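
The two candidate semantics for next() can be put side by side in a small sketch. The names NextSemanticsSketch, nextOrThrow and nextOrKeep are hypothetical and are not part of the HostProvider interface; the host name zk-2.example is a placeholder.

```java
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

final class NextSemanticsSketch {
    // Variant A: an unresolvable element is reported to the caller by throwing.
    static InetSocketAddress nextOrThrow(InetSocketAddress candidate) throws UnknownHostException {
        if (candidate.isUnresolved()) {
            throw new UnknownHostException(candidate.getHostString());
        }
        return candidate;
    }

    // Variant B: next() never throws; the element is handed back as-is and the
    // caller finds out it is unresolved when it inspects or connects to it.
    static InetSocketAddress nextOrKeep(InetSocketAddress candidate) {
        return candidate;
    }

    public static void main(String[] args) {
        // createUnresolved() performs no lookup, so this runs without DNS.
        InetSocketAddress broken = InetSocketAddress.createUnresolved("zk-2.example", 2181);
        try {
            nextOrThrow(broken);
        } catch (UnknownHostException e) {
            System.out.println("variant A: caller told about " + e.getMessage());
        }
        System.out.println("variant B: unresolved=" + nextOrKeep(broken).isUnresolved());
    }
}
```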

-Flavio

> On 8 May 2018, at 21:20, Andor Molnar <an...@cloudera.com> wrote:
> 
> Sorry, I thought you were against the whole refactoring.
> 
> "2- That we discuss separately whether we want to change the behaviour of
> the next()in the HostProvider interface."
> 
> From this it seemed to me, it's not just a polishing issue, but maybe I've
> gotten you wrong.
> Anyway, there're 2 contention points we've encountered so far:
> 
> 1. Do we need to refactor at all?
> 2. If we do so, shall next() throw UnknownHostException or deal with error
> inside.
> 
> And I'd still go with:
> 1. Yes
> 2. Yes, throw
> 
> So, I would leave the PR as it is now.
> 
> Andor
> 
> 
> 
> 
> 
> On Tue, May 8, 2018 at 12:11 PM, Flavio Junqueira <f...@apache.org> wrote:
> 
>> Can you list what the contention points are according to you? Feel free to
>> do that in the PR as well, I want to make sure we have the same
>> understanding of the points that still need to be resolved. From where I
>> stand, there is no major issue pending other than one polishing issue that
>> I brought up in the PR.
>> 
>> -Flavio
>> 
>>> On 8 May 2018, at 21:08, Andor Molnar <an...@cloudera.com> wrote:
>>> 
>>> I'm happy to do that once we have an agreement.
>>> 
>>> 
>>> 
>>> 
>>> On Tue, May 8, 2018 at 8:34 AM, Flavio Junqueira <f...@apache.org> wrote:
>>> 
>>>> It might be a good idea to document whatever we end up doing.
>>>> 
>>>> -Flavio
>>>> 
>>>>> On 8 May 2018, at 17:22, Andor Molnar <an...@cloudera.com> wrote:
>>>>> 
>>>>> "If refactoring is necessary to address the issue, then it should be
>> part
>>>>> of the PR."
>>>>> 
>>>>> Agreed. I think it is, and it also makes things a lot simpler, so it's
>>>>> easier to review.
>>>>> 
>>>>> "I'm not sure what kind of confirmation you are after here. Could you
>>>>> elaborate, please?"
>>>>> 
>>>>> I'm just wondering what could have been the reason for caching host
>>>>> addresses in StaticHostProvider in the first place.
>>>>> 
>>>>> "The other solution, if I remember enough of it, tried to avoid
>> resolving
>>>>> unnecessarily, but perhaps the DNS client cache is enough..."
>>>>> 
>>>>> That's exactly my point: what JDK is doing internally is perfectly fine
>>>> for
>>>>> us and we don't need extra logic here.
>>>>> 
>>>>> "Could you elaborate on this point, please? What exactly do you mean
>> with
>>>>> this statement?"
>>>>> 
>>>>> See my previous point. Caching is already done in all DNS servers in the
>>>>> chain and there's also caching in the JDK already, which means that by
>>>>> calling getByName() you don't necessarily fire a DNS request, only when
>>>>> the TTL has expired.
>>>>> 
>>>>> Andor
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> On Tue, May 8, 2018 at 8:12 AM, Flavio Junqueira <f...@apache.org>
>> wrote:
>>>>> 
>>>>>> Hi Andor,
>>>>>> 
>>>>>> Thanks for your work on addressing the issue.
>>>>>> 
>>>>>>> On 8 May 2018, at 16:06, Andor Molnar <an...@cloudera.com> wr

Re: Name resolution in StaticHostProvider

2018-05-08 Thread Flavio Junqueira
Can you list what the contention points are according to you? Feel free to do 
that in the PR as well, I want to make sure we have the same understanding of 
the points that still need to be resolved. From where I stand, there is no 
major issue pending other than one polishing issue that I brought up in the 
PR.

-Flavio

> On 8 May 2018, at 21:08, Andor Molnar <an...@cloudera.com> wrote:
> 
> I'm happy to do that once we have an agreement.
> 
> 
> 
> 
> On Tue, May 8, 2018 at 8:34 AM, Flavio Junqueira <f...@apache.org> wrote:
> 
>> It might be a good idea to document whatever we end up doing.
>> 
>> -Flavio
>> 
>>> On 8 May 2018, at 17:22, Andor Molnar <an...@cloudera.com> wrote:
>>> 
>>> "If refactoring is necessary to address the issue, then it should be part
>>> of the PR."
>>> 
>>> Agreed. I think it is, and it also makes things a lot simpler, so it's
>>> easier to review.
>>> 
>>> "I'm not sure what kind of confirmation you are after here. Could you
>>> elaborate, please?"
>>> 
>>> I'm just wondering what could have been the reason for caching host
>>> addresses in StaticHostProvider in the first place.
>>> 
>>> "The other solution, if I remember enough of it, tried to avoid resolving
>>> unnecessarily, but perhaps the DNS client cache is enough..."
>>> 
>>> That's exactly my point: what JDK is doing internally is perfectly fine
>> for
>>> us and we don't need extra logic here.
>>> 
>>> "Could you elaborate on this point, please? What exactly do you mean with
>>> this statement?"
>>> 
>>> See my previous point. Caching is already done in all DNS servers in the
>>> chain and there's also caching in the JDK already, which means that by
>>> calling getByName() you don't necessarily fire a DNS request, only when
>>> the TTL has expired.
>>> 
>>> Andor
>>> 
>>> 
>>> 
>>> 
>>> On Tue, May 8, 2018 at 8:12 AM, Flavio Junqueira <f...@apache.org> wrote:
>>> 
>>>> Hi Andor,
>>>> 
>>>> Thanks for your work on addressing the issue.
>>>> 
>>>>> On 8 May 2018, at 16:06, Andor Molnar <an...@cloudera.com> wrote:
>>>>> 
>>>>> Hi,
>>>>> 
>>>>> Updating this thread, because the PR is still being reviewed on GitHub.
>>>>> 
>>>>> So, the reason why I refactored the original behaviour of
>>>>> StaticHostProvider is that I believe that it's trying to do something
>>>> which
>>>>> is not its responsibility.
>>>> 
>>>> If refactoring is necessary to address the issue, then it should be part
>>>> of the PR. Otherwise, it is better to refactor in a separate PR.
>>>> 
>>>> 
>>>>> Please tell me if there's a good historical
>>>>> reason for that.
>>>> 
>>>> I'm not sure what kind of confirmation you are after here. Could you
>>>> elaborate, please?
>>>> 
>>>>> 
>>>>> My approach is giving the user the following two options:
>>>>> 1- Use static IP addresses, if you don't want to deal with DNS resolution
>>>>> at all - we guarantee that no DNS logic will be involved in this case at
>>>>> all.
>>>> 
>>>> Sounds fine.
>>>> 
>>>>> 2- Use DNS hostnames if you have a reliable DNS service for resolution
>>>>> (with HA, secondary servers, backups, etc.) - we must use DNS in the right
>>>>> way in this case, e.g. do NOT cache an IP address for longer than the DNS
>>>>> server allows, and re-resolve after the TTL expires, because it's mandatory
>>>>> by protocol.
>>>> 
>>>> Sounds fine.
>>>> 
>>>>> 
>>>>> My 2 cents here:
>>>>> - the fix which was originally posted for re-resolution is a workaround
>>>> and
>>>>> doesn't satisfy the requirement for #2,
>>>> 
>>>> The other solution, if I remember enough of it, tried to avoid resolving
>>>> unnecessarily, but perhaps the DNS client cache is enough to avoid the
>>>> unnecessary round-trips. This observation actually brings me to the next
>>>> point:
>>>> 
>>>>> - the solution is already bu

Re: Name resolution in StaticHostProvider

2018-05-08 Thread Flavio Junqueira
It might be a good idea to document whatever we end up doing.

-Flavio

> On 8 May 2018, at 17:22, Andor Molnar <an...@cloudera.com> wrote:
> 
> "If refactoring is necessary to address the issue, then it should be part
> of the PR."
> 
> Agreed. I think it is, and it also makes things a lot simpler, so it's
> easier to review.
> 
> "I'm not sure what kind of confirmation you are after here. Could you
> elaborate, please?"
> 
> I'm just wondering what could have been the reason for caching host
> addresses in StaticHostProvider in the first place.
> 
> "The other solution, if I remember enough of it, tried to avoid resolving
> unnecessarily, but perhaps the DNS client cache is enough..."
> 
> That's exactly my point: what JDK is doing internally is perfectly fine for
> us and we don't need extra logic here.
> 
> "Could you elaborate on this point, please? What exactly do you mean with
> this statement?"
> 
> See my previous point. Caching is already done in all DNS servers in the
> chain and there's also caching in the JDK already, which means that by calling
> getByName() you don't necessarily fire a DNS request, only when the TTL has
> expired.
> 
> Andor
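
For reference, the JDK-side cache mentioned above is tunable through two Java security properties, networkaddress.cache.ttl and networkaddress.cache.negative.ttl. As far as I know, these configured lifetimes govern the JDK cache rather than the TTL carried in each DNS record, and they generally need to be set before the first lookup. The sketch below is only an illustration; DnsCacheTtlSketch and the chosen values are assumptions, not project settings.

```java
import java.security.Security;

final class DnsCacheTtlSketch {
    // Sets the lifetime (in seconds) of successful and failed lookups in the
    // JDK's InetAddress cache. Should run before the first name lookup.
    static void configure(String positiveTtlSeconds, String negativeTtlSeconds) {
        Security.setProperty("networkaddress.cache.ttl", positiveTtlSeconds);
        Security.setProperty("networkaddress.cache.negative.ttl", negativeTtlSeconds);
    }

    public static void main(String[] args) {
        // Example values only: cache successes for 30 seconds, do not cache failures.
        configure("30", "0");
        System.out.println(Security.getProperty("networkaddress.cache.ttl"));
        System.out.println(Security.getProperty("networkaddress.cache.negative.ttl"));
    }
}
```

A negative TTL of 0 is relevant to deployments where DNS records appear shortly after the process starts, since a cached failure would otherwise be served until it expires.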
> 
> 
> 
> 
> On Tue, May 8, 2018 at 8:12 AM, Flavio Junqueira <f...@apache.org> wrote:
> 
>> Hi Andor,
>> 
>> Thanks for your work on addressing the issue.
>> 
>>> On 8 May 2018, at 16:06, Andor Molnar <an...@cloudera.com> wrote:
>>> 
>>> Hi,
>>> 
>>> Updating this thread, because the PR is still being reviewed on GitHub.
>>> 
>>> So, the reason why I refactored the original behaviour of
>>> StaticHostProvider is that I believe that it's trying to do something
>> which
>>> is not its responsibility.
>> 
>> If refactoring is necessary to address the issue, then it should be part
>> of the PR. Otherwise, it is better to refactor in a separate PR.
>> 
>> 
>>> Please tell me if there's a good historical
>>> reason for that.
>> 
>> I'm not sure what kind of confirmation you are after here. Could you
>> elaborate, please?
>> 
>>> 
>>> My approach is giving the user the following two options:
>>> 1- Use static IP addresses, if you don't want to deal with DNS resolution
>>> at all - we guarantee that no DNS logic will be involved in this case at
>>> all.
>> 
>> Sounds fine.
>> 
>>> 2- Use DNS hostnames if you have a reliable DNS service for resolution
>>> (with HA, secondary servers, backups, etc.) - we must use DNS in the right
>>> way in this case, e.g. do NOT cache an IP address for longer than the DNS
>>> server allows, and re-resolve after the TTL expires, because it's mandatory
>>> by protocol.
>> 
>> Sounds fine.
>> 
>>> 
>>> My 2 cents here:
>>> - the fix which was originally posted for re-resolution is a workaround
>> and
>>> doesn't satisfy the requirement for #2,
>> 
>> The other solution, if I remember enough of it, tried to avoid resolving
>> unnecessarily, but perhaps the DNS client cache is enough to avoid the
>> unnecessary round-trips. This observation actually brings me to the next
>> point:
>> 
>>> - the solution is already built-in in JDK and DNS clients in the right
>> way
>> 
>> Could you elaborate on this point, please? What exactly do you mean with
>> this statement?
>> 
>>> - can't see a reason why we shouldn't use that
>>> 
>>> I checked this in some other projects as well and found very similar
>>> approach in hadoop-common's SecurityUtil.java. It has 2 built-in plugins
>>> for that:
>>> - Standard resolver uses java's built-in getByName().
>>> - Qualified resolver still uses getByName(), but adds some logic to avoid
>>> incorrect re-resolutions and reverse IP lookups.
>>> 
>>> Please let me know your thoughts.
>>> 
>>> Regards,
>>> Andor
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> On Tue, Mar 6, 2018 at 8:12 AM, Andor Molnar <an...@cloudera.com> wrote:
>>> 
>>>> Hi Abe,
>>>> 
>>>> Unfortunately we haven't got any feedback yet. What do you think of
>>>> implementing Option #3?
>>>> 
>>>> Regards,
>>>> Andor
>>>> 
>>>> 
>>>> On Thu, Feb 22, 2018 at 6:06 PM, Andor Molnar <an...@cloudera.com>
>> wrote:
>>>> 
>>>>

Re: Name resolution in StaticHostProvider

2018-05-08 Thread Flavio Junqueira
Hi Andor,

Thanks for your work on addressing the issue.

> On 8 May 2018, at 16:06, Andor Molnar  wrote:
> 
> Hi,
> 
> Updating this thread, because the PR is still being reviewed on GitHub.
> 
> So, the reason why I refactored the original behaviour of
> StaticHostProvider is that I believe that it's trying to do something which
> is not its responsibility.

If refactoring is necessary to address the issue, then it should be part of the 
PR. Otherwise, it is better to refactor in a separate PR.


> Please tell me if there's a good historical
> reason for that.

I'm not sure what kind of confirmation you are after here. Could you elaborate, 
please?

> 
> My approach is giving the user the following two options:
> 1- Use static IP addresses, if you don't want to deal with DNS resolution
> at all - we guarantee that no DNS logic will be involved in this case at all.

Sounds fine.

> 2- Use DNS hostnames if you have a reliable DNS service for resolution
> (with HA, secondary servers, backups, etc.) - we must use DNS in the right
> way in this case, e.g. do NOT cache an IP address for longer than the DNS
> server allows, and re-resolve after the TTL expires, because it's mandatory
> by protocol.

Sounds fine.

> 
> My 2 cents here:
> - the fix which was originally posted for re-resolution is a workaround and
> doesn't satisfy the requirement for #2,

The other solution, if I remember enough of it, tried to avoid resolving 
unnecessarily, but perhaps the DNS client cache is enough to avoid the 
unnecessary round-trips. This observation actually brings me to the next point:

> - the solution is already built-in in JDK and DNS clients in the right way

Could you elaborate on this point, please? What exactly do you mean with this 
statement?

> - can't see a reason why we shouldn't use that
> 
> I checked this in some other projects as well and found very similar
> approach in hadoop-common's SecurityUtil.java. It has 2 built-in plugins
> for that:
> - Standard resolver uses java's built-in getByName().
> - Qualified resolver still uses getByName(), but adds some logic to avoid
> incorrect re-resolutions and reverse IP lookups.
> 
> Please let me know your thoughts.
> 
> Regards,
> Andor
> 
> 
> 
> 
> 
> 
> On Tue, Mar 6, 2018 at 8:12 AM, Andor Molnar  wrote:
> 
>> Hi Abe,
>> 
>> Unfortunately we haven't got any feedback yet. What do you think of
>> implementing Option #3?
>> 
>> Regards,
>> Andor
>> 
>> 
>> On Thu, Feb 22, 2018 at 6:06 PM, Andor Molnar  wrote:
>> 
>>> Did anybody happen to take a quick look by any chance?
>>> 
>>> I don't want to push this too hard, because I know it's a time consuming
>>> topic to think about, but this is a blocker in 3.5 which has been hanging
>>> around for a while and any feedback would be extremely helpful to close it
>>> quickly.
>>> 
>>> Thanks,
>>> Andor
>>> 
>>> 
>>> 
>>> On Mon, Feb 19, 2018 at 12:18 PM, Andor Molnar 
>>> wrote:
>>> 
 Hi all,
 
 We need more eyes and brains on the following PR:
 
 https://github.com/apache/zookeeper/pull/451
 
 I added a comment a few days ago about the way we currently do DNS name
 resolution in this class and a suggestion on how we could simplify things a
 little bit. We talked about it with Abe Fine, but we're a little bit unsure
 and cannot reach a conclusion. It would be extremely handy to get more
 feedback from you.
 
 To add some colour to it, let me elaborate on the situation here:
 
 In general, the task that StaticHostProvider does is to get a list of
 potentially unresolved InetSocketAddress objects, resolve them and iterate
 over the resolved objects by calling next() method.
 
 *Option #1 (current logic)*
 - Resolve addresses with getAllByName() which returns a list of IP
 addresses associated with the address.
 - Cache all these IP's, shuffle them and iterate over.
 - If client is unable to connect to an IP, remove all IPs from the list
 which the original servername was resolved to and re-resolve it.
 
 *Option #2 (getByName())*
 - Resolve address with getByName() instead which returns only the first
 IP address of the name,
 - Do not cache IPs,
 - Shuffle the *names* and resolve with getByName() *every time* when
 next() is called,
 - JDK's built-in caching will prevent name servers from being flooded
 and will do the re-resolution automatically when cache expires,
 - Names with multiple IPs will be handled by DNS servers which (if
 configured properly) return IPs in different order - this is called DNS
 Round Robin -, so getByName() will return different IP on each call.
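
A rough sketch of Option #2, assuming only JDK classes: the provider keeps shuffled, unresolved names and calls getByName() on every next(), leaving caching and re-resolution to the JDK. ResolveOnNextSketch is a hypothetical name and this is not the code from the PR.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class ResolveOnNextSketch {
    private final List<InetSocketAddress> names = new ArrayList<>(); // kept unresolved
    private int index = -1;

    ResolveOnNextSketch(List<InetSocketAddress> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("at least one server is required");
        }
        for (InetSocketAddress s : servers) {
            // Store the name only; no IP is cached by this class.
            names.add(InetSocketAddress.createUnresolved(s.getHostString(), s.getPort()));
        }
        Collections.shuffle(names); // shuffle the names, not the IPs
    }

    // Resolves the next name at call time; the JDK cache decides whether this
    // triggers an actual DNS request.
    InetSocketAddress next() throws UnknownHostException {
        index = (index + 1) % names.size();
        InetSocketAddress name = names.get(index);
        InetAddress ip = InetAddress.getByName(name.getHostString());
        return new InetSocketAddress(ip, name.getPort());
    }

    public static void main(String[] args) throws UnknownHostException {
        // Literal documentation-range IPs stand in for real server names.
        ResolveOnNextSketch provider = new ResolveOnNextSketch(Arrays.asList(
                InetSocketAddress.createUnresolved("192.0.2.1", 2181),
                InetSocketAddress.createUnresolved("192.0.2.2", 2181)));
        System.out.println(provider.next());
        System.out.println(provider.next());
    }
}
```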
 
 *Option #3*
 - There's a small problem with option #2: if the DNS server is not configured
 properly and handles the round-robin case in a way that it always returns
 the IP list in the same order, getByName() will never 

Re: Let's cut a ZK 3.5.4-beta release

2018-05-08 Thread Flavio Junqueira
Hi Pat,

I'm ready to merge ZK-2982. It is a one-line change that fixes a mistake made 
during the port of the changes from 3.4 to 3.5: it just brings in the line 
that was missed.

As you are the RM, I'll follow your instructions. I'm fine with either merging 
it today or doing a 3.5.5 immediately after. Let me know where you are with the 
release candidate and we can decide on the best course of action.

-Flavio

> On 8 May 2018, at 05:56, Patrick Hunt <ph...@apache.org> wrote:
> 
> There's been plenty of time for someone to get these into 3.5.4 before now.
> I really want to get 3.5.4 out so that folks can access/assess what's
> currently committed.
> 
> I'm happy to drive a follow-on release with these changes soon after 3.5.4
> ships if there's sufficient progress.
> 
> Patrick
> 
> On Mon, May 7, 2018 at 2:00 PM, Alexander Shraer <shra...@gmail.com> wrote:
> 
>> Lets also get this in: https://issues.apache.org/
>> jira/browse/ZOOKEEPER-2959
>> This affects anyone running observers.
>> 
>> On Mon, May 7, 2018 at 1:45 PM, Flavio Junqueira <f...@apache.org> wrote:
>> 
>>> We should strongly consider merging this:
>>> 
>>> https://issues.apache.org/jira/browse/ZOOKEEPER-2982
>>> 
>>> It is causing problems for anyone using ZK on K8s.
>>> 
>>> -Flavio
>>> 
>>>> On 7 May 2018, at 17:32, Rakesh Radhakrishnan <rake...@apache.org>
>>> wrote:
>>>> 
>>>> I have added my feedback to the PR. Please take a look at it.
>>>> 
>>>> 
>>>> Rakesh
>>>> 
>>>> On Mon, May 7, 2018 at 8:38 AM, Patrick Hunt <ph...@apache.org> wrote:
>>>> 
>>>>> No one has looked at 2901, while I'm not super happy with it it seems
>>> fine
>>>>> - I'll commit it as-is if I don't hear anything by EOD tomorrow
>>> (Monday).
>>>>> After which I'll start the release process.
>>>>> 
>>>>> Patrick
>>>>> 
>>>>> On Mon, Mar 26, 2018 at 2:09 AM, Andor Molnar <an...@cloudera.com>
>>> wrote:
>>>>> 
>>>>>> I'm currently working on ZOOKEEPER-2184. PR has been open for ages on
>>> 3.4
>>>>>> branch, please review if you have some capacity.
>>>>>> I'll port the fix to the 3.5 branch too, if we have an agreement and
>>>>>> 3.4-version is merged.
>>>>>> 
>>>>>> ZK-2982 is somewhat related, I believe my changes will fix that one
>>> too.
>>>>>> 
>>>>>> ZK-1818 is probably tough, but patch is already available. Somebody
>>>>> should
>>>>>> pick it up, which I'm happy to do once finished with above stuff.
>>>>>> 
>>>>>> Regards,
>>>>>> Andor
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Mon, Mar 26, 2018 at 3:39 AM, Michael Han <h...@apache.org>
>> wrote:
>>>>>> 
>>>>>>> +1 on 3.5.4 release planning.
>>>>>>> 
>>>>>>>>> There are 10 open blocker issues marked for 3.5.4. Can I get some
>>>>> help
>>>>>>> to sort out those issues?
>>>>>>> 
>>>>>>> The url posted does not work for me, here is the query I use:
>>>>>>> https://goo.gl/3MJZMN
>>>>>>> 
>>>>>>> Just had a chance to go through the JIRA and did some clean ups.
>>>>>>> - 1159: lower the priority from blocker to major.
>>>>>>> - 761: resolved because the code is merged.
>>>>>>> 
>>>>>>> ZOOKEEPER-2903, 2184, 2982 look like real blockers for the 3.5.4
>>>>>>> release. The rest of the blockers are legacy issues that get postponed
>>>>>>> indefinitely.
>>>>>>> 
>>>>>>> 
>>>>>>> On Wed, Mar 14, 2018 at 11:59 AM, Flavio Junqueira <f...@apache.org>
>>>>>> wrote:
>>>>>>> 
>>>>>>>> Ok, I can have a look at ZK-2901. I'd like to get ZK-2982 in as
>> well
>>>>> as
>>>>>>> it
>>>>>>>> is causing us problems with Kubernetes. The fix is simple, but I'm
>>>>>>>> wondering about adding a test case.
>>>>>>>> 
>>>>>>>> -Flavio

[jira] [Commented] (ZOOKEEPER-2982) Re-try DNS hostname -> IP resolution

2018-05-08 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16467049#comment-16467049
 ] 

Flavio Junqueira commented on ZOOKEEPER-2982:
-

Based on this comment, I'm +1 on this change:

https://issues.apache.org/jira/browse/ZOOKEEPER-2982?focusedCommentId=16371886=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16371886

It would have been good to have a test case, but we haven't been able to come 
up with anything, so I suggest we leave it for future work. We also have a +1 
from [~andorm] in the pull request.

> Re-try DNS hostname -> IP resolution
> 
>
>                 Key: ZOOKEEPER-2982
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2982
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.5.0, 3.5.1, 3.5.3
>            Reporter: Eron Wright
>            Assignee: Flavio Junqueira
>            Priority: Blocker
>             Fix For: 3.5.4, 3.6.0
>
>         Attachments: 3.5.3-beta.zip, fixed.log
>
>
> ZOOKEEPER-1506 fixed a DNS resolution issue in 3.4.  Some portions of the fix 
> haven't yet been ported to 3.5.
> To recap the outstanding problem in 3.5, if a given ZK server is started 
> before all peer addresses are resolvable, that server may cache a negative 
> lookup result and forever fail to resolve the address. For example, 
> deploying ZK 3.5 to Kubernetes using a StatefulSet plus a Service (headless) 
> may fail because the DNS records are created lazily.
> {code}
> 2018-02-18 09:11:22,583 [myid:0] - WARN  [QuorumPeer[myid=0](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled):Follower@95] - Exception when following the leader
> java.net.UnknownHostException: zk-2.zk.default.svc.cluster.local
>     at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:184)
>     at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>     at java.net.Socket.connect(Socket.java:589)
>     at org.apache.zookeeper.server.quorum.Learner.sockConnect(Learner.java:227)
>     at org.apache.zookeeper.server.quorum.Learner.connectToLeader(Learner.java:256)
>     at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:76)
>     at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1133)
> {code}
> In the above example, the address `zk-2.zk.default.svc.cluster.local` was not 
> resolvable when the server started, but became resolvable shortly thereafter. 
> The server should eventually succeed but doesn't.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (ZOOKEEPER-2982) Re-try DNS hostname -> IP resolution

2018-05-08 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira reassigned ZOOKEEPER-2982:
---

Assignee: Flavio Junqueira

> Re-try DNS hostname -> IP resolution
> 
>
>                 Key: ZOOKEEPER-2982
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2982
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.5.0, 3.5.1, 3.5.3
>            Reporter: Eron Wright
>            Assignee: Flavio Junqueira
>            Priority: Blocker
>             Fix For: 3.5.4, 3.6.0
>
>         Attachments: 3.5.3-beta.zip, fixed.log
>
>
> ZOOKEEPER-1506 fixed a DNS resolution issue in 3.4.  Some portions of the fix 
> haven't yet been ported to 3.5.
> To recap the outstanding problem in 3.5, if a given ZK server is started 
> before all peer addresses are resolvable, that server may cache a negative 
> lookup result and forever fail to resolve the address. For example, 
> deploying ZK 3.5 to Kubernetes using a StatefulSet plus a Service (headless) 
> may fail because the DNS records are created lazily.
> {code}
> 2018-02-18 09:11:22,583 [myid:0] - WARN  [QuorumPeer[myid=0](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled):Follower@95] - Exception when following the leader
> java.net.UnknownHostException: zk-2.zk.default.svc.cluster.local
>     at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:184)
>     at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>     at java.net.Socket.connect(Socket.java:589)
>     at org.apache.zookeeper.server.quorum.Learner.sockConnect(Learner.java:227)
>     at org.apache.zookeeper.server.quorum.Learner.connectToLeader(Learner.java:256)
>     at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:76)
>     at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1133)
> {code}
> In the above example, the address `zk-2.zk.default.svc.cluster.local` was not 
> resolvable when the server started, but became resolvable shortly thereafter. 
> The server should eventually succeed but doesn't.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Let's cut a ZK 3.5.4-beta release

2018-05-07 Thread Flavio Junqueira
We should strongly consider merging this:

https://issues.apache.org/jira/browse/ZOOKEEPER-2982

It is causing problems for anyone using ZK on K8s. 

-Flavio

> On 7 May 2018, at 17:32, Rakesh Radhakrishnan <rake...@apache.org> wrote:
> 
> I have added my feedback to the PR. Please take a look at it.
> 
> 
> Rakesh
> 
> On Mon, May 7, 2018 at 8:38 AM, Patrick Hunt <ph...@apache.org> wrote:
> 
>> No one has looked at 2901, while I'm not super happy with it it seems fine
>> - I'll commit it as-is if I don't hear anything by EOD tomorrow (Monday).
>> After which I'll start the release process.
>> 
>> Patrick
>> 
>> On Mon, Mar 26, 2018 at 2:09 AM, Andor Molnar <an...@cloudera.com> wrote:
>> 
>>> I'm currently working on ZOOKEEPER-2184. PR has been open for ages on 3.4
>>> branch, please review if you have some capacity.
>>> I'll port the fix to the 3.5 branch too, if we have an agreement and
>>> 3.4-version is merged.
>>> 
>>> ZK-2982 is somewhat related, I believe my changes will fix that one too.
>>> 
>>> ZK-1818 is probably tough, but patch is already available. Somebody
>> should
>>> pick it up, which I'm happy to do once finished with above stuff.
>>> 
>>> Regards,
>>> Andor
>>> 
>>> 
>>> 
>>> On Mon, Mar 26, 2018 at 3:39 AM, Michael Han <h...@apache.org> wrote:
>>> 
>>>> +1 on 3.5.4 release planning.
>>>> 
>>>>>> There are 10 open blocker issues marked for 3.5.4. Can I get some
>> help
>>>> to sort out those issues?
>>>> 
>>>> The url posted does not work for me, here is the query I use:
>>>> https://goo.gl/3MJZMN
>>>> 
>>>> Just had a chance to go through the JIRA and did some clean ups.
>>>> - 1159: lower the priority from blocker to major.
>>>> - 761: resolved because the code is merged.
>>>> 
>>>> ZOOKEEPER-2903, 2184, 2982 look like real blockers for the 3.5.4 release.
>>>> The rest of the blockers are legacy issues that get postponed indefinitely.
>>>> 
>>>> 
>>>> On Wed, Mar 14, 2018 at 11:59 AM, Flavio Junqueira <f...@apache.org>
>>> wrote:
>>>> 
>>>>> Ok, I can have a look at ZK-2901. I'd like to get ZK-2982 in as well
>> as
>>>> it
>>>>> is causing us problems with Kubernetes. The fix is simple, but I'm
>>>>> wondering about adding a test case.
>>>>> 
>>>>> -Flavio
>>>>> 
>>>>>> On 14 Mar 2018, at 17:30, Patrick Hunt <ph...@apache.org> wrote:
>>>>>> 
>>>>>> I would like to cut 3.5.4. Need more eyes on 2901 though.
>>>>>> 
>>>>>> Patrick
>>>>>> 
>>>>>> On Wed, Mar 14, 2018 at 4:26 AM, Flavio Junqueira <f...@apache.org>
>>>>> wrote:
>>>>>> 
>>>>>>> I think ZK-2901 is close to being merged, yes? And with that, will
>>> we
>>>>> cut
>>>>>>> a 3.5.4 release?
>>>>>>> 
>>>>>>> -Flavio
>>>>>>> 
>>>>>>>> On 7 Dec 2017, at 00:27, Patrick Hunt <ph...@apache.org> wrote:
>>>>>>>> 
>>>>>>>> I haven't forgotten about this - we've been stuck on
>> ZOOKEEPER-2901
>>>> . I
>>>>>>>> think we're getting closer but it's been tricky to navigate
>> addressing
>>>> the
>>>>>>>> issue vs backward compat vs making things worse. I was about to
>>> sign
>>>>> off
>>>>>>>> then noticed I had missed something. Jordan has been working to
>>>>> address.
>>>>>>>> 
>>>>>>>> Camille, Edward -- it would be good if you could take a look at
>>> 2901
>>>>>>> given
>>>>>>>> you participated in the original creation/commit of this feature.
>>>>>>>> 
>>>>>>>> Regards,
>>>>>>>> 
>>>>>>>> Patrick
>>>>>>>> 
>>>>>>>> On Tue, Nov 21, 2017 at 11:32 AM, Karan Mehta <
>>>> karanmeht...@gmail.com>
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> Can we get ZOOKEEPER-2770
>>>>>>>>> <https://issues.apache.org/jira/browse/ZOOKEEPER-2770> in as
>>> well?
>>>>> The
>>>>>>> PR
>>>>>>>>> is ready for review at https://github.com/apache/
>>> zookeeper/pull/307
>>>>>>>>> It will be a nice feature addition. Thanks!
>>>>>>>>> 
>>>>>>>>> Regards
>>>>>>>>> Karan
>>>>>>>>> ᐧ
>>>>>>>>> 
>>>>>>>>> On Tue, Nov 21, 2017 at 11:00 AM, Jordan Zimmerman <
>>>>>>>>> jor...@jordanzimmerman.com> wrote:
>>>>>>>>> 
>>>>>>>>>>> Afaict the only real blocker for the release at this point is
>>>>>>>>>>> ZOOKEEPER-2901 - Jordan can you resolve the comments, after
>>> which
>>>> we
>>>>>>>>>> should
>>>>>>>>>>> be good to go. LMK if there's anything I'm missing.
>>>>>>>>>> 
>>>>>>>>>> I'll have this done in the next day or so. Please wait for me
>> if
>>>> you
>>>>>>> can!
>>>>>>>>>> 
>>>>>>>>>> -Jordan
>>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>> 
>>>>> 
>>>> 
>>> 
>> 



Re: [VOTE] Apache ZooKeeper release 3.4.12 candidate 1

2018-04-23 Thread Flavio Junqueira
+1, verified the following:

- checksums and signature
- build passes
- rat tool output does not indicate any problem
- LICENSE and NOTICE both look ok
- local simple smoke tests work

-Flavio


> On 2 Apr 2018, at 02:01, Michael Han  wrote:
> 
> +1
> 
> - verified xsum/sig.
> - release notes look good.
> - verified cluster with different sizes.
> - verified with a few 4lw commands.
> - verified data / log dir swap was fixed.
> - all unit tests passed.
> 
> 
> On Wed, Mar 28, 2018 at 11:55 AM, Patrick Hunt  wrote:
> 
>> +1. sig/xsum verified, RAT ran OK. I tested a few operational scenarios
>> which seemed fine. Ran the tests and they passed. LGTM.
>> 
>> Patrick
>> 
>> On Mon, Mar 26, 2018 at 10:05 PM, Abraham Fine  wrote:
>> 
>>> This is a bugfix release candidate for 3.4.12. It fixes 22 issues,
>>> including issues that
>>> affect incorrect handling of the dataDir and the dataLogDir.
>>> 
>>> This candidate fixes an issue in the release notes of candidate 0.
>>> 
>>> The full release notes are available at:
>>> 
>>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?
>>> projectId=12310801=12342040
>>> 
>>> *** Please download, test and vote by March 31st 2018, 23:59 UTC+0. ***
>>> 
>>> Source files:
>>> http://people.apache.org/~afine/zookeeper-3.4.12-candidate-1/
>>> 
>>> Maven staging repo:
>>> https://repository.apache.org/content/groups/staging/org/
>>> apache/zookeeper/zookeeper/3.4.12/
>>> 
>>> The release candidate tag in git to be voted upon: release-3.4.12-rc1
>>> 
>>> ZooKeeper's KEYS file containing PGP keys we use to sign the release:
>>> http://www.apache.org/dist/zookeeper/KEYS
>>> 
>>> Should we release this candidate?
>>> 
>> 



Re: [VOTE] Migrate ZK to Maven build

2018-04-20 Thread Flavio Junqueira
Definitely +1

> On 20 Apr 2018, at 16:06, Norbert Kalmar  wrote:
> 
> Hi,
> 
> Let's start a vote on migrating to maven instead of ant.
> https://issues.apache.org/jira/browse/ZOOKEEPER-3021
> 
> *Shall we migrate ZooKeeper build from ant to Maven?*
> 
> Please reply with [Yes / +1] or [No / -1] to this thread.
> 
> Thanks,
> Norbert



Re: [VOTE] Apache ZooKeeper release 3.4.12 candidate 0

2018-04-05 Thread Flavio Junqueira
We should consider not using md5 for the next RC as per the ASF policy:

https://www.apache.org/dev/release-distribution.html#sigs-and-sums
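
As a rough illustration of verifying a release artifact with a stronger digest than md5, here is a minimal Java sketch. It assumes SHA-256 is an acceptable replacement (the linked policy page is the authority on which algorithms to publish); the class and method names are made up for this example, and a real check would hash the downloaded tarball rather than a string literal.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class ChecksumSketch {
    /** Returns the lowercase hex SHA-256 digest of the given bytes. */
    static String sha256Hex(byte[] data) {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest(data)) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        // A literal keeps the sketch self-contained; substitute the artifact's bytes in practice.
        System.out.println(sha256Hex("abc".getBytes(StandardCharsets.UTF_8)));
    }
}
```

The printed value would then be compared against the checksum file published next to the release candidate.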

-Flavio

> On 26 Mar 2018, at 21:37, Abraham Fine  wrote:
> 
> Thank you everyone for your votes so far. Due to the issues that Michael Han 
> pointed out I will cancel this RC and release a new one with the correct 
> release notes.
> 
> Expect the new RC very soon.
> 
> Thanks,
> Abe
> 
> On Mon, Mar 26, 2018, at 12:29, Patrick Hunt wrote:
>> +1 - sig/xsum verify, RAT looks ok, was able to build/test successfully
>> under jdk7/mac. Tested out a few deployment combinations and it seemed ok.
>> Manually verified swapping data and datalogdir resulted in the server not
>> coming up and human readable error in the logs.
>> 
>> Patrick
>> 
>> On Thu, Mar 22, 2018 at 1:05 PM, Abraham Fine  wrote:
>> 
>>> This is a bugfix release candidate for 3.4.12. It fixes 22 issues,
>>> including issues that
>>> affect incorrect handling of the dataDir and the dataLogDir.
>>> 
>>> The full release notes are available at:
>>> 
>>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?
>>> projectId=12310801=12342040
>>> 
>>> *** Please download, test and vote by March 27th 2018, 23:59 UTC+0. ***
>>> 
>>> Source files:
>>> http://people.apache.org/~afine/zookeeper-3.4.12-candidate-0/
>>> 
>>> Maven staging repo:
>>> https://repository.apache.org/content/groups/staging/org/
>>> apache/zookeeper/zookeeper/3.4.12/
>>> 
>>> The release candidate tag in git to be voted upon: release-3.4.12-rc0
>>> 
>>> ZooKeeper's KEYS file containing PGP keys we use to sign the release:
>>> http://www.apache.org/dist/zookeeper/KEYS
>>> 
>>> Should we release this candidate?
>>> 



Re: 3.4.12

2018-03-15 Thread Flavio Junqueira
+1 for cutting a 3.4.12 RC. Thanks for volunteering, Abe.

-Flavio

> On 8 Mar 2018, at 18:53, Rakesh Radhakrishnan  wrote:
> 
> Appreciate Abe for the initiative and efforts!
> 
> +1, for "3.4.12" releasing.
> 
> Please feel free to ping me if any help needed when making this release.
> 
> Regards,
> Rakesh
> 
> On Sat, Mar 3, 2018 at 4:19 AM, Abraham Fine  wrote:
> 
>> I am very much interested in taking a turn as a RM and I think it is a
>> great time to do a release (now that 2967, 2249, and 2960 are merged in).
>> 
>> I agree that ZOOKEEPER-2184 can be pushed again and I don't think there is
>> anything else that we need to merge in before cutting a release.
>> 
>> Abe
>> 
>> On Thu, Mar 1, 2018, at 21:52, Patrick Hunt wrote:
>>> There are 19 resolved issues http://bit.ly/2oK9aTx
>>> and 14 unresolved http://bit.ly/2oFWywS
>>> ZOOKEEPER-2184 is the only unresolved blocker, however that's not a
>>> regression and was pushed from 3.4.11, we could do so again given it's
>>> still being worked on.
>>> 
>>> Abe are you interested in taking a turn as RM?
>>> 
>>> Patrick
>>> 
>>> On Thu, Mar 1, 2018 at 4:38 PM, Andor Molnar  wrote:
>>> 
>>>> Hi dev,
>>>>
>>>> User has recently run into the regression of 3.4.11 (ZOOKEEPER-2960)
>>>> (again?)
>>>> Are we good to cut 3.4.12 soon or still waiting for something to be
>>>> committed?
>>>>
>>>> Andor
>>>>
>> 



Re: Let's cut a ZK 3.5.4-beta release

2018-03-14 Thread Flavio Junqueira
Ok, I can have a look at ZK-2901. I'd like to get ZK-2982 in as well as it is 
causing us problems with Kubernetes. The fix is simple, but I'm wondering about 
adding a test case.

-Flavio

> On 14 Mar 2018, at 17:30, Patrick Hunt <ph...@apache.org> wrote:
> 
> I would like to cut 3.5.4. Need more eyes on 2901 though.
> 
> Patrick
> 
> On Wed, Mar 14, 2018 at 4:26 AM, Flavio Junqueira <f...@apache.org> wrote:
> 
>> I think ZK-2901 is close to being merged, yes? And with that, will we cut
>> a 3.5.4 release?
>> 
>> -Flavio
>> 
>>> On 7 Dec 2017, at 00:27, Patrick Hunt <ph...@apache.org> wrote:
>>> 
>>> I haven't forgotten about this - we've been stuck on ZOOKEEPER-2901 . I
>>> think we're getting closer but it's been tricky to navigate addressing the
>>> issue vs backward compat vs making things worse. I was about to sign off
>>> then noticed I had missed something. Jordan has been working to address.
>>> 
>>> Camille, Edward -- it would be good if you could take a look at 2901
>> given
>>> you participated in the original creation/commit of this feature.
>>> 
>>> Regards,
>>> 
>>> Patrick
>>> 
>>> On Tue, Nov 21, 2017 at 11:32 AM, Karan Mehta <karanmeht...@gmail.com>
>>> wrote:
>>> 
>>>> Can we get ZOOKEEPER-2770
>>>> <https://issues.apache.org/jira/browse/ZOOKEEPER-2770> in as well? The
>> PR
>>>> is ready for review at https://github.com/apache/zookeeper/pull/307
>>>> It will be a nice feature addition. Thanks!
>>>> 
>>>> Regards
>>>> Karan
>>>> ᐧ
>>>> 
>>>> On Tue, Nov 21, 2017 at 11:00 AM, Jordan Zimmerman <
>>>> jor...@jordanzimmerman.com> wrote:
>>>> 
>>>>>> Afaict the only real blocker for the release at this point is
>>>>>> ZOOKEEPER-2901 - Jordan can you resolve the comments, after which we
>>>>> should
>>>>>> be good to go. LMK if there's anything I'm missing.
>>>>> 
>>>>> I'll have this done in the next day or so. Please wait for me if you
>> can!
>>>>> 
>>>>> -Jordan
>>>> 
>> 
>> 



[jira] [Commented] (ZOOKEEPER-2982) Re-try DNS hostname -> IP resolution

2018-03-14 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16398500#comment-16398500
 ] 

Flavio Junqueira commented on ZOOKEEPER-2982:
-

Hey guys, here are some comments based on the latest points:

[~abrahamfine] the improvement you are proposing requires further discussion. 
For one thing, the user has told us to use names and now we are trying to 
second-guess how to connect to the server. I'm not saying this is necessarily a 
bad idea, but I feel it needs to be addressed separately. I think we should go 
for now with the fix that [~eronwright] is proposing as it fixes an issue with 
the port.

[~andorm] do you think you will be able to come up with a test case? To recap, 
I think we need to test that we are able to resolve names correctly despite 
changes in the mapping of name to address. I'm not sure what a good way of 
testing it would be.
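
One possible shape for such a test, offered only as a sketch: put name resolution behind a small in-memory stand-in for DNS, so the test can change the name-to-address mapping between connection attempts. Everything here (`FakeDns`, the host name, port 2888, the 10.0.0.x addresses) is hypothetical and not part of ZooKeeper; wiring a resolver like this into the quorum code would be a separate change.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class ChangingDnsTestSketch {
    /** Hypothetical in-memory stand-in for DNS; a test can change mappings at will. */
    static final class FakeDns {
        private final Map<String, byte[]> records = new ConcurrentHashMap<>();

        /** Adds or replaces the IPv4 record for a host name. */
        void put(String host, int a, int b, int c, int d) {
            records.put(host, new byte[] {(byte) a, (byte) b, (byte) c, (byte) d});
        }

        /** Resolves on every call, so a changed mapping is picked up immediately. */
        InetSocketAddress resolve(String host, int port) {
            byte[] ip = records.get(host);
            if (ip == null) {
                return InetSocketAddress.createUnresolved(host, port);
            }
            try {
                // getByAddress builds the address from raw bytes without a real lookup.
                return new InetSocketAddress(InetAddress.getByAddress(host, ip), port);
            } catch (UnknownHostException e) {
                throw new IllegalStateException(e); // only thrown for an illegal byte length
            }
        }
    }

    public static void main(String[] args) {
        FakeDns dns = new FakeDns();
        String host = "zk-2.zk.default.svc.cluster.local";

        // Before the record exists, the address stays unresolved.
        System.out.println(dns.resolve(host, 2888).isUnresolved());

        // The record appears, then later points somewhere else.
        dns.put(host, 10, 0, 0, 2);
        System.out.println(dns.resolve(host, 2888).getAddress().getHostAddress());
        dns.put(host, 10, 0, 0, 7);
        System.out.println(dns.resolve(host, 2888).getAddress().getHostAddress());
    }
}
```

A test built this way would assert that each connection attempt sees the current mapping rather than the one observed at startup.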

> Re-try DNS hostname -> IP resolution
> ------------------------------------
>
>                 Key: ZOOKEEPER-2982
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2982
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.5.0, 3.5.1, 3.5.3
>            Reporter: Eron Wright
>            Priority: Blocker
>             Fix For: 3.5.4, 3.6.0
>
>         Attachments: 3.5.3-beta.zip, fixed.log
>
>
> ZOOKEEPER-1506 fixed a DNS resolution issue in 3.4. Some portions of the fix 
> haven't yet been ported to 3.5.
> To recap the outstanding problem in 3.5, if a given ZK server is started 
> before all peer addresses are resolvable, that server may cache a negative 
> lookup result and forever fail to resolve the address. For example, 
> deploying ZK 3.5 to Kubernetes using a StatefulSet plus a Service (headless) 
> may fail because the DNS records are created lazily.
> {code}
> 2018-02-18 09:11:22,583 [myid:0] - WARN [QuorumPeer[myid=0](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled):Follower@95] - Exception when following the leader
> java.net.UnknownHostException: zk-2.zk.default.svc.cluster.local
> at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:184)
> at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
> at java.net.Socket.connect(Socket.java:589)
> at org.apache.zookeeper.server.quorum.Learner.sockConnect(Learner.java:227)
> at org.apache.zookeeper.server.quorum.Learner.connectToLeader(Learner.java:256)
> at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:76)
> at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1133)
> {code}
> In the above example, the address `zk-2.zk.default.svc.cluster.local` was not 
> resolvable when the server started, but became resolvable shortly thereafter. 
> The server should eventually succeed but doesn't.
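
One way a failed lookup can persist, stated here as an assumption rather than a reading of the ZooKeeper source: a java.net.InetSocketAddress attempts resolution once, when it is constructed, and an instance that ended up unresolved never retries. The sketch below uses `createUnresolved` to stand in for a lookup that failed at startup, and an IP literal so that no real DNS is needed; the class name, helper name and port are illustrative only.

```java
import java.net.InetSocketAddress;

final class StaleAddressSketch {
    /** Builds a fresh address from the stored host string, which triggers a new resolution attempt. */
    static InetSocketAddress recreate(InetSocketAddress old) {
        return new InetSocketAddress(old.getHostString(), old.getPort());
    }

    public static void main(String[] args) {
        // Stand-in for "the name did not resolve when the server started".
        InetSocketAddress stale = InetSocketAddress.createUnresolved("127.0.0.1", 2888);

        // The stored instance stays unresolved no matter how long we wait.
        System.out.println(stale.isUnresolved());           // true

        // Only constructing a new instance attempts the lookup again.
        System.out.println(recreate(stale).isUnresolved()); // false
    }
}
```

Under that assumption, any code path that keeps reusing the original instance keeps failing, while a path that recreates the address recovers once the name becomes resolvable.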





Re: Let's cut a ZK 3.5.4-beta release

2018-03-14 Thread Flavio Junqueira
There are 10 open blocker issues marked for 3.5.4. Can I get some help to sort 
out those issues?

https://issues.apache.org/jira/projects/ZOOKEEPER/versions/12340141

-Flavio

> On 14 Mar 2018, at 12:28, Enrico Olivelli <eolive...@gmail.com> wrote:
> 
> 2018-03-14 12:26 GMT+01:00 Flavio Junqueira <f...@apache.org>:
> 
>> I think ZK-2901 is close to being merged, yes? And with that, will we cut
>> a 3.5.4 release?
>> 
> 
> 
> It will be great !
> 
> ZK 3.5.4 has Quorum Peer Mutual Auth which is very important in order to
> move from 3.4 branch to 3.5
> 
> Enrico
> 
> 
>> 
>> -Flavio
>> 
>>> On 7 Dec 2017, at 00:27, Patrick Hunt <ph...@apache.org> wrote:
>>> 
>>> I haven't forgotten about this - we've been stuck on ZOOKEEPER-2901 . I
>>> think we're getting closer but it's been tricky to navigate addressing the
>>> issue vs backward compat vs making things worse. I was about to sign off
>>> then noticed I had missed something. Jordan has been working to address.
>>> 
>>> Camille, Edward -- it would be good if you could take a look at 2901
>> given
>>> you participated in the original creation/commit of this feature.
>>> 
>>> Regards,
>>> 
>>> Patrick
>>> 
>>> On Tue, Nov 21, 2017 at 11:32 AM, Karan Mehta <karanmeht...@gmail.com>
>>> wrote:
>>> 
>>>> Can we get ZOOKEEPER-2770
>>>> <https://issues.apache.org/jira/browse/ZOOKEEPER-2770> in as well? The
>> PR
>>>> is ready for review at https://github.com/apache/zookeeper/pull/307
>>>> It will be a nice feature addition. Thanks!
>>>> 
>>>> Regards
>>>> Karan
>>>> ᐧ
>>>> 
>>>> On Tue, Nov 21, 2017 at 11:00 AM, Jordan Zimmerman <
>>>> jor...@jordanzimmerman.com> wrote:
>>>> 
>>>>>> Afaict the only real blocker for the release at this point is
>>>>>> ZOOKEEPER-2901 - Jordan can you resolve the comments, after which we
>>>>> should
>>>>>> be good to go. LMK if there's anything I'm missing.
>>>>> 
>>>>> I'll have this done in the next day or so. Please wait for me if you
>> can!
>>>>> 
>>>>> -Jordan
>>>> 
>> 
>> 



Re: Let's cut a ZK 3.5.4-beta release

2018-03-14 Thread Flavio Junqueira
I think ZK-2901 is close to being merged, yes? And with that, will we cut a 
3.5.4 release?

-Flavio

> On 7 Dec 2017, at 00:27, Patrick Hunt  wrote:
> 
> I haven't forgotten about this - we've been stuck on ZOOKEEPER-2901 . I
> think we're getting closer but it's been tricky to navigate addressing the
> issue vs backward compat vs making things worse. I was about to sign off
> then noticed I had missed something. Jordan has been working to address.
> 
> Camille, Edward -- it would be good if you could take a look at 2901 given
> you participated in the original creation/commit of this feature.
> 
> Regards,
> 
> Patrick
> 
> On Tue, Nov 21, 2017 at 11:32 AM, Karan Mehta 
> wrote:
> 
>> Can we get ZOOKEEPER-2770
>> <https://issues.apache.org/jira/browse/ZOOKEEPER-2770> in as well? The PR
>> is ready for review at https://github.com/apache/zookeeper/pull/307
>> It will be a nice feature addition. Thanks!
>> 
>> Regards
>> Karan
>> ᐧ
>> 
>> On Tue, Nov 21, 2017 at 11:00 AM, Jordan Zimmerman <
>> jor...@jordanzimmerman.com> wrote:
>> 
>>>> Afaict the only real blocker for the release at this point is
>>>> ZOOKEEPER-2901 - Jordan can you resolve the comments, after which we should
>>>> be good to go. LMK if there's anything I'm missing.
>>> 
>>> I'll have this done in the next day or so. Please wait for me if you can!
>>> 
>>> -Jordan
>> 



[jira] [Commented] (ZOOKEEPER-2982) Re-try DNS hostname -> IP resolution

2018-02-22 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16372542#comment-16372542
 ] 

Flavio Junqueira commented on ZOOKEEPER-2982:
-

I have tried your recipe for reproducing as well [~andorm] by changing 
{{/etc/hosts}} and got the same issue. The problem is that the leader fails to 
bind to the port, which actually makes me wonder whether we need to do anything 
about the leader with respect to this issue:

```
java.net.SocketException: Unresolved address
at java.net.ServerSocket.bind(ServerSocket.java:368)
at java.net.ServerSocket.bind(ServerSocket.java:329)
at org.apache.zookeeper.server.quorum.Leader.<init>(Leader.java:240)
at org.apache.zookeeper.server.quorum.QuorumPeer.makeLeader(QuorumPeer.java:1023)
at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1226)
```

Your suggestion of the alternative change is sensible, but I'd say that for 
consistency, it is better that we simply do the same that we have in 3.4, which 
is to make the change in {{findLeader}}.

One thing that I believe we haven't been able to do is to have a test case to 
report it. It would be good to have it, but I'm not sure what would be a good 
way.
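
The bind failure in the trace above can be reproduced outside ZooKeeper with plain JDK classes. This is a minimal sketch, not the Leader code: it hands `ServerSocket.bind` an address that is still unresolved and reports the resulting message. The class name, the helper `tryBind`, the host name and the port are all placeholders.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.SocketException;

final class UnresolvedBindSketch {
    /** Tries to bind to the address; returns the failure message, or null on success. */
    static String tryBind(InetSocketAddress addr) {
        try (ServerSocket socket = new ServerSocket()) {
            socket.bind(addr);
            return null;
        } catch (SocketException e) {
            // bind() rejects unresolved addresses before touching the network.
            return e.getMessage();
        } catch (IOException e) {
            return e.toString();
        }
    }

    public static void main(String[] args) {
        // Stand-in for the server's own quorum address when its name did not resolve.
        InetSocketAddress own = InetSocketAddress.createUnresolved("zk-0.invalid", 2888);
        System.out.println(tryBind(own));
    }
}
```

If the server's own address is held in such an unresolved instance, the bind fails regardless of whether the name has become resolvable in the meantime.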



[jira] [Comment Edited] (ZOOKEEPER-2982) Re-try DNS hostname -> IP resolution

2018-02-22 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16372542#comment-16372542
 ] 

Flavio Junqueira edited comment on ZOOKEEPER-2982 at 2/22/18 8:33 AM:
--

I have tried your recipe for reproducing as well [~andorm] by changing 
{{/etc/hosts}} and got the same issue. The problem is that the leader fails to 
bind to the port, which actually makes me wonder whether we need to do anything 
about the leader with respect to this issue:

{noformat}
java.net.SocketException: Unresolved address
at java.net.ServerSocket.bind(ServerSocket.java:368)
at java.net.ServerSocket.bind(ServerSocket.java:329)
at org.apache.zookeeper.server.quorum.Leader.<init>(Leader.java:240)
at org.apache.zookeeper.server.quorum.QuorumPeer.makeLeader(QuorumPeer.java:1023)
at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1226)
{noformat}

Your suggestion of the alternative change is sensible, but I'd say that for 
consistency, it is better that we simply do the same that we have in 3.4, which 
is to make the change in {{findLeader}}.

One thing that I believe we haven't been able to do is to have a test case to 
report it. It would be good to have it, but I'm not sure what would be a good 
way.





Re: [SUGGESTION] Target branches 3.5 and master (3.6) to Java 8

2018-02-21 Thread Flavio Junqueira
Hi Tamaas,

Thanks for the feedback. I'm fine with the plan. We might want to send a 
message to the user list once we reach some agreement here to assess whether 
users have a concern.

-Flavio

> On 20 Feb 2018, at 20:49, Tamás Pénzes <tam...@cloudera.com> wrote:
> 
> Hi All,
> 
> Just to add my 2 cents. // Might be five, I write long. :)
> Hope, you find valuable bits.
> 
> As many of us I also hope that ZooKeeper 3.5 will be released soon.
> Until then most of the changes go into master and branch-3.5 too, so I
> would keep them on the same Java version for code compatibility. In the
> same time I'd be happy if it was Java 8.
> 
> ZK 3.5+ supports Java 7 since December 2014, an almost 7 year old Java
> version today.
> It was a perfect decision in 2014, when nobody expected ZK 3.5 coming so
> late, but things might be different four years later.
> 
> Since we have to keep compatibility with Java 6 on branch-3.4 we already
> need manual changes when cherry picking into that branch. Not much
> difference if branch-3.5 is Java 8.
> 
> 
> As Flavio said changing branch-3.5 to Java 8 might cause issues for users
> already using ZK 3.5.x-beta.
> I totally agree with that concern, but using a beta state software means
> you accept the risk of facing changes.
> And Java 8 is four years old now, so we would not change to bleeding edge,
> which I guess nobody wanted.
> 
> 
> So what I would propose is the following:
> 
>   - Upgrade branches "master" and "branch-3.5" to Java 8 (LTS) asap.
>   - After releasing 3.5 GA and the next LTS Java version (Java 11 /
>   18.9-LTS) gets released upgrade "master" branch to Java 11-LTS. (
>   http://www.oracle.com/technetwork/java/eol-135779.html)
>   - I would not upgrade Java to a non-LTS version.
> 
> 
> What do you think about it?
> 
> Thanks, Tamaas
> 
> 
> On Mon, Feb 19, 2018 at 10:32 PM, Flavio Junqueira <f...@apache.org> wrote:
> 
>> I'm fine with moving to Java 8 or even 9 in 3.6. Does anyone have a
>> different opinion? Otherwise, should we start a vote?
>> 
>> -Flavio
>> 
>> 
>>> On 16 Feb 2018, at 21:28, Abraham Fine <af...@apache.org> wrote:
>>> 
>>> I'm a -1 on requiring different minimum versions of java for the client
>> and the server.  I think this has the potential to create a lot of
>> confusion for users and contributors.
>>> 
>>> I would support moving master (3.6) to java 8, I also think it is worth
>> considering moving to java 9. Given how long our release cycle tends to be
>> I think targeting the latest and greatest this early in the development
>> cycle is reasonable.
>>> 
>>> Thanks,
>>> Abe
>>> 
>>> On Fri, Feb 16, 2018, at 06:48, Enrico Olivelli wrote:
>>>> 2018-02-16 14:20 GMT+01:00 Andor Molnar <an...@cloudera.com>:
>>>> 
>>>>> +1 for setting the Java8 requirement on server side.
>>>>> 
>>>>> *Client side.*
>>>>> I'd like the idea of the setting the requirement on client side too
>> without
>>>>> introducing anything Java8 specific. I'm not planning to use Java8
>> features
>>>>> right on, just thinking of opening the gates would be useful in the
>> long
>>>>> run.
>>>>> 
>>>>> Additionally, I don't see heavy development on the client side. Users
>> who
>>>>> are tightly coupled to Java7 are still able to use existing clients as
>> long
>>>>> as we introduce something breaking which they're forced to upgrade to
>> for
>>>>> whatever reason. I'm not sure what are the odds of that to happen.
>>>>> 
>>>> 
>>>> 
>>>> My two cents
>>>> Actually ZooKeeper is distributed as a single JAR which contains both
>>>> server and client side code, requiring Java 7 for the client and Java 8
>> for
>>>> the server will require a new way of packaging the artifacts and
>> building
>>>> the project (and this will require separating client side and server
>> side
>>>> code base).
>>>> Maybe I am missing something.
>>>> 
>>>> 
>>>> Enrico
>>>> 
>>>> 
>>>>> 
>>>>> Andor
>>>>> 
>>>>> 
>>>>> 
>>>>> On Fri, Feb 16, 2018 at 12:31 PM, Flavio Junqueira <f...@apache.org>
>> wrote:
>>>>> 
>>>>>> We have this section in the admin doc that talks about the system
&

[jira] [Commented] (ZOOKEEPER-2982) Re-try DNS hostname -> IP resolution

2018-02-21 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371886#comment-16371886
 ] 

Flavio Junqueira commented on ZOOKEEPER-2982:
-

I think I know what's going on. It is correct that {{connectOne}} invokes 
{{recreateSocketAddresses}}, but that invocation won't happen in the case that 
the server receives a connection request rather than starting the connection. 
In fact, servers with larger ids are always supposed to start the connections.

I think the patch proposed here of invoking {{recreateSocketAddresses}} in 
{{findLeader}} in the learner class makes sense to compensate for 
{{recreateSocketAddresses}} not being invoked during leader election. Any other 
insight or anything I'm missing?
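
The reasoning above can be restated as a toy model. This is not ZooKeeper code: `PeerEntry`, `electionStep` and the counter are invented for illustration, and the only behaviour modelled is the one described in this comment, namely that the address refresh happens on the side that starts the connection and that `findLeader` would add a second refresh point.

```java
final class RefreshPathSketch {
    /** Hypothetical stand-in for a peer's address entry; it only counts refreshes. */
    static final class PeerEntry {
        int refreshCount = 0;

        void recreateSocketAddresses() {
            refreshCount++;
        }
    }

    /** Simplified election step: only the side that starts the connection refreshes. */
    static void electionStep(boolean weStartConnection, PeerEntry peer) {
        if (weStartConnection) {
            peer.recreateSocketAddresses();
        }
        // When the peer connects to us instead, we only accept; nothing is refreshed.
    }

    /** The proposed compensation: refresh again right before contacting the leader. */
    static void findLeader(PeerEntry leader) {
        leader.recreateSocketAddresses();
    }

    public static void main(String[] args) {
        PeerEntry leader = new PeerEntry();
        electionStep(false, leader);             // we were on the accepting side
        System.out.println(leader.refreshCount); // address never refreshed during election
        findLeader(leader);
        System.out.println(leader.refreshCount); // refreshed once by the proposed change
    }
}
```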





[jira] [Commented] (ZOOKEEPER-2982) Re-try DNS hostname -> IP resolution

2018-02-21 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371452#comment-16371452
 ] 

Flavio Junqueira commented on ZOOKEEPER-2982:
-

From the logs, we can see the same exception being raised when the server is 
trying to connect to elect a leader:

{noformat}
2018-02-20 20:41:25,669 [myid:1] - WARN  
[QuorumPeer[myid=1](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled):QuorumPeer$QuorumServer@173]
 - Failed to resolve address: 
pravega-zookeeper-2.pravega-zookeeper-headless.default.svc.cluster.local
java.net.UnknownHostException: 
pravega-zookeeper-2.pravega-zookeeper-headless.default.svc.cluster.local
at java.net.InetAddress.getAllByName0(InetAddress.java:1280)
at java.net.InetAddress.getAllByName(InetAddress.java:1192)
at java.net.InetAddress.getAllByName(InetAddress.java:1126)
at java.net.InetAddress.getByName(InetAddress.java:1076)
at 
org.apache.zookeeper.server.quorum.QuorumPeer$QuorumServer.recreateSocketAddresses(QuorumPeer.java:171)
at 
org.apache.zookeeper.server.quorum.QuorumPeer.recreateSocketAddresses(QuorumPeer.java:727)
at 
org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:682)
at 
org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:716)
at 
org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:919)
at 
org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1190)
{noformat}

Once the address resolves and it can connect, the exception goes away and the 
notification messages flow regularly. The question is why the update performed 
during leader election to the quorum view in {{QuorumCnxManager.connectOne}} is 
not taking any effect in the view that {{Learner.findLeader}} uses to get the 
`QuorumServer` instance to connect to the leader. Two possibilities I can think 
of:

1- The server hasn't connected to the elected server during leader election, in 
which case the address wasn't updated.
2- The quorum view that the learner is using to get the quorum server instance 
is not the one that was updated in {{QuorumCnxManager}}.





[jira] [Commented] (ZOOKEEPER-2982) Re-try DNS hostname -> IP resolution

2018-02-20 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16370721#comment-16370721
 ] 

Flavio Junqueira commented on ZOOKEEPER-2982:
-

[~eronwright] could you upload some server logs so that we can have a look, 
please?



[jira] [Commented] (ZOOKEEPER-2982) Re-try DNS hostname -> IP resolution

2018-02-20 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16370695#comment-16370695
 ] 

Flavio Junqueira commented on ZOOKEEPER-2982:
-

[~eronwright] in your set up, once you get that exception, is it the case that 
the ensemble never recovers (it is never able to elect a leader)?



[jira] [Commented] (ZOOKEEPER-2982) Re-try DNS hostname -> IP resolution

2018-02-20 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16370690#comment-16370690
 ] 

Flavio Junqueira commented on ZOOKEEPER-2982:
-

[~eronwright] a follower invokes followLeader after leader election completes. 
followLeader makes a call to connectToLeader.



[jira] [Commented] (ZOOKEEPER-2982) Re-try DNS hostname -> IP resolution

2018-02-20 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16370682#comment-16370682
 ] 

Flavio Junqueira commented on ZOOKEEPER-2982:
-

I don't see a reason why it would be a problem to invoke 
{{recreateSocketAddresses}} from {{Learner.findLeader}}. That method is simply 
returning the {{QuorumServer}} instance corresponding to the server it believes 
to be the leader.

The thing that is puzzling is how a server ended up voting for another server 
that it can't talk to (because the name doesn't resolve). Is it a race that 
eventually goes away?
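
To make the suggestion concrete, here is a rough sketch with simplified stand-ins for the quorum view and the quorum server entry. The class `LeaderLookupSketch`, its nested `Server` type and the `Resolver` hook are assumptions made for illustration and do not mirror the actual ZooKeeper classes; only the idea of refreshing the address inside the leader lookup is taken from the comment above:

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative stand-in for the learner's view of the quorum; not ZooKeeper code. */
final class LeaderLookupSketch {
    /** Lookup hook so the sketch can run without real DNS. */
    interface Resolver {
        InetAddress resolve(String host) throws UnknownHostException;
    }

    /** Simplified stand-in for a quorum member entry. */
    static final class Server {
        final String host;
        final int port;
        volatile InetSocketAddress addr;

        Server(String host, int port) {
            this.host = host;
            this.port = port;
            this.addr = InetSocketAddress.createUnresolved(host, port);
        }

        /** Repeats the lookup; keeps the previous address if the name is still unknown. */
        void recreateSocketAddresses(Resolver resolver) {
            try {
                addr = new InetSocketAddress(resolver.resolve(host), port);
            } catch (UnknownHostException e) {
                // Not resolvable yet: leave addr untouched so the caller can retry later.
            }
        }
    }

    private final Map<Long, Server> view = new ConcurrentHashMap<>();
    private final Resolver resolver;

    LeaderLookupSketch(Resolver resolver) {
        this.resolver = resolver;
    }

    void addServer(long id, Server server) {
        view.put(id, server);
    }

    /** Returns the entry for the presumed leader, refreshing its address first. */
    Server findLeader(long leaderId) {
        Server leader = view.get(leaderId);
        if (leader != null) {
            leader.recreateSocketAddresses(resolver);
        }
        return leader;
    }
}
```

In this sketch a follower that calls `findLeader` again after a failed connection picks up the address as soon as the name becomes resolvable.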



Re: [Discuss] ZOOKEEPER-2982 DNS negative caching in 3.5

2018-02-20 Thread Flavio Junqueira
Thanks for catching this, Eron. It looks like the port to 3.5 misses changes as 
you correctly pointed out:

   
https://github.com/apache/zookeeper/commit/d2a49163b7bc7c9589140dbba7f60e591028f908
 


In particular, changes in Learner.java. I would say this should definitely be 
in 3.5.4.

-Flavio

> On 20 Feb 2018, at 01:14, Eron Wright  wrote:
> 
> Hello,
> 
> I attempted to run ZK 3.5.3-beta in a Kubernetes cluster, using the typical
> approach of a StatefulSet plus a pair of Services.   I observed that some
> of my ZK servers would fail to resolve the DNS addresses of its peers
> indefinitely.   It is normal that addresses cannot be resolved immediately
> at startup because the records are created asynchronously by Kubernetes.
> One would expect ZK to keep trying and eventually succeed.   Note that
> this issue affects 3.5 only; 3.4 seems to work fine.
> 
> I tracked the root cause down to a regression in 3.5.  ZOOKEEPER-1506 made
> an improvement in 3.4 that wasn't ported to 3.5.  I opened ZOOKEEPER-2982 to
> track this, and have a PR ready.   Could we shoot to get the fix into 3.5.4?
> 
> Thanks,
> Eron Wright



Re: [SUGGESTION] Target branches 3.5 and master (3.6) to Java 8

2018-02-19 Thread Flavio Junqueira
I'm fine with moving to Java 8 or even 9 in 3.6. Does anyone have a different 
opinion? Otherwise, should we start a vote?

-Flavio


> On 16 Feb 2018, at 21:28, Abraham Fine <af...@apache.org> wrote:
> 
> I'm a -1 on requiring different minimum versions of java for the client and 
> the server.  I think this has the potential to create a lot of confusion for 
> users and contributors. 
> 
> I would support moving master (3.6) to java 8, I also think it is worth 
> considering moving to java 9. Given how long our release cycle tends to be I 
> think targeting the latest and greatest this early in the development cycle 
> is reasonable.
> 
> Thanks,
> Abe
> 
> On Fri, Feb 16, 2018, at 06:48, Enrico Olivelli wrote:
>> 2018-02-16 14:20 GMT+01:00 Andor Molnar <an...@cloudera.com>:
>> 
>>> +1 for setting the Java8 requirement on server side.
>>> 
>>> *Client side.*
>>> I like the idea of setting the requirement on the client side too, without
>>> introducing anything Java 8 specific. I'm not planning to use Java 8 features
>>> right away; I just think opening the gates would be useful in the long
>>> run.
>>> 
>>> Additionally, I don't see heavy development on the client side. Users who
>>> are tightly coupled to Java7 are still able to use existing clients as long
>>> as we introduce something breaking which they're forced to upgrade to for
>>> whatever reason. I'm not sure what are the odds of that to happen.
>>> 
>> 
>> 
>> My two cents
>> Actually ZooKeeper is distributed as a single JAR which contains both
>> server and client side code, requiring Java 7 for the client and Java 8 for
>> the server will require a new way of packaging the artifacts and building
>> the project (and this will require separating client side and server side
>> code base).
>> Maybe I am missing something.
>> 
>> 
>> Enrico
>> 
>> 
>>> 
>>> Andor
>>> 
>>> 
>>> 
>>> On Fri, Feb 16, 2018 at 12:31 PM, Flavio Junqueira <f...@apache.org> wrote:
>>> 
>>>> We have this section in the admin doc that talks about the system
>>>> requirements:
>>>> 
>>>> https://zookeeper.apache.org/doc/r3.5.3-beta/zookeeperAdmin.html#sc_requiredSoftware
>>>> 
>>>> If we change, then we have to update that section. Specifically about
>>>> client and server, I'd think that there is no problem with requiring
>>> Java 8
>>>> on the server. The potential concern is with the client as it affects
>>>> applications that build against it. It would be best to not force
>>>> applications to upgrade themselves. Looking at the compatibility guide
>>> for
>>>> Java 8:
>>>> 
>>>> http://www.oracle.com/technetwork/java/javase/8-compatibility-guide-2156366.html
>>>> 
>>>> The risk is that an application is strictly using Java 7 because of some
>>>> incompatibility listed in that guide, in which case, it wouldn't be able
>>> to
>>>> compile the ZK client assuming we get it to use some Java 8 construct.
>>> One
>>>> option is that we raise the requirement to Java 8, but we do not really
>>>> introduce anything that breaks compatibility for the next version. Users
>>>> should take this as a warning that they need to migrate to Java 8. I'm
>>> not
>>>> sure this makes the situation any better, though. Another option is that
>>> we
>>>> set a release to be the one in which we migrate and let everyone know
>>> that
>>>> they need to migrate.
>>>> 
>>>> -Flavio
>>>> 
>>>> 
>>>>> On 16 Feb 2018, at 12:05, Andor Molnar <an...@cloudera.com> wrote:
>>>>> 
>>>>> Hi all,
>>>>> 
>>>>> I think it would be nice to draw a line at branch-3.5 and target Java
>>>>> version 8 onwards. It seems to be a good opportunity for the upgrade
>>>> before
>>>>> we release a stable version of 3.5.
>>>>> 
>>>>> The benefit would be the ability to use new features of Java 8 in the
>>>> code:
>>>>> 
>>>>> Do you think it's feasible?
>>>>> 
>>>>> Regards,
>>>>> Andor
>>>> 
>>>> 
>>> 



[jira] [Assigned] (ZOOKEEPER-2184) Zookeeper Client should re-resolve hosts when connection attempts fail

2018-02-19 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira reassigned ZOOKEEPER-2184:
---

Assignee: Andor Molnar  (was: Flavio Junqueira)

> Zookeeper Client should re-resolve hosts when connection attempts fail
> --
>
> Key: ZOOKEEPER-2184
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2184
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: java client
>Affects Versions: 3.4.6, 3.4.7, 3.4.8, 3.4.9, 3.4.10, 3.5.0, 3.5.1, 3.5.2, 
> 3.5.3, 3.4.11
> Environment: Ubuntu 14.04 host, Docker containers for Zookeeper & 
> Kafka
>Reporter: Robert P. Thille
>Assignee: Andor Molnar
>Priority: Blocker
>  Labels: easyfix, patch
> Fix For: 3.5.4, 3.4.12
>
> Attachments: ZOOKEEPER-2184.patch
>
>
> Testing in a Docker environment with a single Kafka instance using a single 
> Zookeeper instance. Restarting the Zookeeper container will cause it to 
> receive a new IP address. Kafka will never be able to reconnect to Zookeeper 
> and will hang indefinitely. Updating DNS or /etc/hosts with the new IP 
> address will not help the client to reconnect as the 
> zookeeper/client/StaticHostProvider resolves the connection string hosts at 
> creation time and never re-resolves.
> A solution would be for the client to notice that connection attempts fail 
> and attempt to re-resolve the hostnames in the connectString.
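
A rough sketch of the proposed behaviour, under stated assumptions: `ReResolvingHostProvider` and its `Resolver` hook are hypothetical and are not the real `StaticHostProvider`; every entry of the connect string is assumed to have the form `host:port`. The hostname is looked up each time a server is handed out, so a changed DNS record is picked up on the next connection attempt:

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; not the real StaticHostProvider. */
final class ReResolvingHostProvider {
    /** Lookup hook so the sketch can run without real DNS. */
    interface Resolver {
        InetAddress resolve(String host) throws UnknownHostException;
    }

    private final List<String> hosts = new ArrayList<>();
    private final List<Integer> ports = new ArrayList<>();
    private final Resolver resolver;
    private int index = -1;

    /** Parses "host1:port1,host2:port2,..." but does not resolve anything yet. */
    ReResolvingHostProvider(String connectString, Resolver resolver) {
        this.resolver = resolver;
        for (String part : connectString.split(",")) {
            int colon = part.lastIndexOf(':');
            hosts.add(part.substring(0, colon).trim());
            ports.add(Integer.parseInt(part.substring(colon + 1).trim()));
        }
    }

    /** Returns the next server, resolving its hostname at call time rather than at creation time. */
    synchronized InetSocketAddress next() {
        index = (index + 1) % hosts.size();
        String host = hosts.get(index);
        int port = ports.get(index);
        try {
            return new InetSocketAddress(resolver.resolve(host), port);
        } catch (UnknownHostException e) {
            // Name not known right now: return an unresolved address so the caller can move on.
            return InetSocketAddress.createUnresolved(host, port);
        }
    }
}
```

The trade-off is one lookup per connection attempt; a real implementation might re-resolve only after a failed attempt, as the issue suggests.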





Re: [SUGGESTION] Target branches 3.5 and master (3.6) to Java 8

2018-02-16 Thread Flavio Junqueira
We have this section in the admin doc that talks about the system requirements:

https://zookeeper.apache.org/doc/r3.5.3-beta/zookeeperAdmin.html#sc_requiredSoftware
 


If we change, then we have to update that section. Specifically about client 
and server, I'd think that there is no problem with requiring Java 8 on the 
server. The potential concern is with the client as it affects applications 
that build against it. It would be best to not force applications to upgrade 
themselves. Looking at the compatibility guide for Java 8:

http://www.oracle.com/technetwork/java/javase/8-compatibility-guide-2156366.html
 


The risk is that an application is strictly using Java 7 because of some 
incompatibility listed in that guide, in which case, it wouldn't be able to 
compile the ZK client assuming we get it to use some Java 8 construct. One 
option is that we raise the requirement to Java 8, but we do not really 
introduce anything that breaks compatibility for the next version. Users should 
take this as a warning that they need to migrate to Java 8. I'm not sure this 
makes the situation any better, though. Another option is that we set a release 
to be the one in which we migrate and let everyone know that they need to 
migrate.
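
To illustrate what "some Java 8 construct" could look like, here is a small sketch. The class `Java8Constructs` and the method `onPort` are made up for this example and are not taken from the ZK client. Lambda expressions and the Stream API were both introduced in Java 8, so an application built with a Java 7 toolchain could not compile code like this:

```java
import java.util.List;
import java.util.stream.Collectors;

/** Illustrative only: client-style code that needs a Java 8 compiler and runtime. */
final class Java8Constructs {
    /** Keeps the "host:port" entries that use the given port. */
    static List<String> onPort(List<String> servers, int port) {
        String suffix = ":" + port;
        return servers.stream()                    // Stream API (java.util.stream), new in Java 8
                .filter(s -> s.endsWith(suffix))   // lambda expression, new in Java 8
                .collect(Collectors.toList());
    }
}
```

Raising the requirement without adding code of this kind, as in the first option above, keeps the sources compilable on Java 7 for one more release.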

-Flavio


> On 16 Feb 2018, at 12:05, Andor Molnar  wrote:
> 
> Hi all,
> 
> I think it would be nice to draw a line at branch-3.5 and target Java
> version 8 onwards. It seems to be a good opportunity for the upgrade before
> we release a stable version of 3.5.
> 
> The benefit would be the ability to use new features of Java 8 in the code:
> 
> Do you think it's feasible?
> 
> Regards,
> Andor



Re: Review request for ZOOKEEPER-2845

2018-02-13 Thread Flavio Junqueira
Give me some time to catch up with the discussion thread in the issue, please.

-Flavio

> On 5 Feb 2018, at 15:43, Bobby Evans  wrote:
> 
> I was really hoping to get a review for ZOOKEEPER-2845
> 
> https://github.com/apache/zookeeper/pull/453 (master)
> https://github.com/apache/zookeeper/pull/454 (3.5 line)
> https://github.com/apache/zookeeper/pull/455 (3.4 line)
> 
> The bug was exposed by changes made in ZOOKEEPER-2678 which went into
> 3.4.10.
> There is a real, although rare, possibility of data corruption and because
> ZooKeeper is mission critical to so many other projects I would love to get
> this in before 3.4.12 is released.
> 
> Thanks,
> 
> Bobby Evans



Criticism on ZK

2018-02-13 Thread Flavio Junqueira
Hello community,

I came across this blog post:

  https://banzaicloud.com/blog/kafka-on-etcd/

And I thought it would be a good idea to discuss the criticism as a community. 
Let me copy the points here and add some notes:

• Unlike Kafka it does not have a vibrant and huge community (merge 
those PR’s please, anyone?)
I have personally met and worked with a lot of great people in this community 
over the years, and as such, I probably have a pretty biased view. But, it is a 
common concern that we are not fast enough at responding. We also don't have 
conferences and large meetups compared to other communities. Are those really 
necessary, though? What can we do to be a better community?

• It uses a protocol which is hard to understand and it’s hard to 
maintain a large Zookeeper cluster
I can't really speak for the hard to understand part, and I don't understand 
what "maintain a large ZooKeeper cluster" is referring to. How large is it and 
why do we need it to be large? We have features like observers that enable 
large clusters, but whether it solves the problem depends on what they are 
after.

• It’s a bit outdated, compared say with Raft
When we wrote about Zab years back, we had as a goal to explain the protocol in 
a way that could be reproduced. We had other goals too, like explaining how we 
had been successful in implementing a system like ZooKeeper with that protocol, 
the properties it guaranteed and so on. Raft focused on the simplicity of 
understanding, which makes a lot of sense given that there was interest in 
reproducing it. Given its focus, and clearly the quality of the people behind 
it, Raft has been more successful in popularizing the implementation of 
replicated state machines. At a protocol level, however, I don't think there is 
anything that makes Zab outdated with respect to Raft.

• It’s written in Java (yes, it’s opinionated but this is a problem for 
us as ZK is an infrastructure component)
This is arguable, there are pros and cons both ways.

• We run everything in Kubernetes and k8s by default has an in-built 
Raft implementation, etcd
I can totally understand this point. No one wants to have to operate two 
systems doing similar things. To consolidate operations, it clearly makes sense 
to pick one. Ironically, this post talks about pluggability, but Kubernetes does 
not really give the option of using zk rather than etcd if that's what I want 
to use.  

• Linearizability (if there is a word like this) - check this 
comparison chart
We do provide linearizable reads with sync(), although I understand that it is 
arguable whether that is truly linearizable. There has been a long running 
discussion about whether we should make sync() truly linearizable by making it 
a first-class txn. Back in the day, we haven't done it because we wanted reads 
to be fast, so we implemented it in a way that it didn't have to go through the 
whole pipeline of request processors, but it still reaches out to the leader. 
See the issue for more detail: 
https://issues.apache.org/jira/browse/ZOOKEEPER-2136
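
As a toy model of why sync() followed by a read helps, consider the sketch below. It is not ZooKeeper code: `SyncReadModel` is a hypothetical single-threaded simulation in which writes commit at a leader and a follower applies them lazily. It ignores the asynchronous nature of the real call and says nothing about the corner cases discussed in the issue above:

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model, not ZooKeeper code: a follower that applies the leader's log lazily. */
final class SyncReadModel {
    private final List<String> leaderLog = new ArrayList<>(); // committed values, in order
    private int followerApplied = 0;                           // entries the follower has applied

    /** Commits a value at the leader only; the follower does not see it yet. */
    void write(String value) {
        leaderLog.add(value);
    }

    /** Plain read: served from the follower's local state, which may lag behind. */
    String read() {
        return followerApplied == 0 ? null : leaderLog.get(followerApplied - 1);
    }

    /** Models sync(): the follower catches up with everything committed so far. */
    void sync() {
        followerApplied = leaderLog.size();
    }
}
```

In the model, a read issued right after a write can return the old value, while a read issued after `sync()` returns the latest committed one.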

• Performance and inherent scalability issues
I don't know if those experiments were done using a dedicated device for the txn 
log, which is well known to matter for zk's performance. Incremental 
snapshotting is clearly a good way to reduce the amount of disk load for 
snapshots, but I wonder whether that's really a primary concern given that 
servers these days often have multiple devices.

I don't understand the max CPU utilization reported for zk 
(https://coreos.com/blog/performance-of-etcd.html). Perhaps this is something 
to be investigated.

• Client side complexity and thick clients
Due to the set of features we wanted to offer, we have indeed chosen this path. 

• Lack of service discovery
I don't have a good sense of how many users are actually bothered by this. I 
have heard complaints over time about service discovery with ZooKeeper, but I'm 
not sure there was any conclusion about whether service discovery is a good use 
case for such coordination systems, including etcd for that matter.

Any feedback?

Thanks,
-Flavio

New PMC Member: Michael Han

2017-06-27 Thread Flavio Junqueira
I'm very happy to announce that the Apache ZooKeeper PMC has voted to invite 
Michael Han to join the PMC and Michael accepted. Michael has done outstanding 
work in the community over the recent past and we felt it was time for Michael 
to deepen his level of engagement by joining the PMC.

Please join me in congratulating Michael for his achievement. Congratulations, 
Michael!

-Flavio




[jira] [Commented] (ZOOKEEPER-1260) Audit logging in ZooKeeper servers.

2017-06-09 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16044550#comment-16044550
 ] 

Flavio Junqueira commented on ZOOKEEPER-1260:
-

Should we revive this patch? It seems to be stale.

> Audit logging in ZooKeeper servers.
> ---
>
> Key: ZOOKEEPER-1260
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1260
> Project: ZooKeeper
>  Issue Type: New Feature
>Reporter: Mahadev konar
>Assignee: Mohammad Arshad
> Fix For: 3.5.4, 3.6.0
>
> Attachments: ZOOKEEPER-1260-01.patch, zookeeperAuditLogs.pdf
>
>
> Lots of users have had questions on debugging which client changed what znode 
> and what updates went through a znode. We should add audit logging as in 
> Hadoop (look at Namenode Audit logging) to log which client changed what in 
> the zookeeper servers. This could just be a log4j audit logger.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (ZOOKEEPER-2804) Node creation fails with NPE if ACLs are null

2017-06-09 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16044549#comment-16044549
 ] 

Flavio Junqueira commented on ZOOKEEPER-2804:
-

Good catch [~Bhupendra]! Do you want to submit a patch?

> Node creation fails with NPE if ACLs are null
> -
>
> Key: ZOOKEEPER-2804
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2804
> Project: ZooKeeper
>  Issue Type: Bug
>Reporter: Bhupendra Kumar Jain
>
> If null ACLs are passed then zk node creation fails with NPE
> {code}
> java.lang.NullPointerException
>   at 
> org.apache.zookeeper.server.PrepRequestProcessor.removeDuplicates(PrepRequestProcessor.java:1301)
>   at 
> org.apache.zookeeper.server.PrepRequestProcessor.fixupACL(PrepRequestProcessor.java:1341)
>   at 
> org.apache.zookeeper.server.PrepRequestProcessor.pRequest2Txn(PrepRequestProcessor.java:519)
>   at 
> org.apache.zookeeper.server.PrepRequestProcessor.pRequest(PrepRequestProcessor.java:1126)
>   at 
> org.apache.zookeeper.server.PrepRequestProcessor.run(PrepRequestProcessor.java:178)
> {code}
> The following APIs have the problem.
> {code}
> public void create(final String path, byte data[], List<ACL> acl,
> CreateMode createMode, StringCallback cb, Object ctx)
> public void create(final String path, byte data[], List<ACL> acl,
> CreateMode createMode, Create2Callback cb, Object ctx)
> {code}
> Solution: 
> a)  Need to handle NULL ACLs in removeDuplicates method in server.  
> b) Also add the client side validation for null / empty ACL for above API 
> similar to other create API
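
The two parts of the proposed solution could look roughly like the sketch below. It is generic on purpose: `AclChecks` and `validateAcl` are hypothetical names, the element type `T` stands in for the ACL type, and `IllegalArgumentException` is only a placeholder for whatever exception the client API should raise. Only the method name `removeDuplicates` comes from the stack trace above:

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

/** Hypothetical sketch of null-safe ACL handling; not the actual ZooKeeper code. */
final class AclChecks {
    /** Server side (a): tolerate a null list instead of dereferencing it. */
    static <T> List<T> removeDuplicates(List<T> acls) {
        if (acls == null) {
            return new ArrayList<T>();
        }
        // LinkedHashSet drops repeated entries and keeps the original order.
        return new ArrayList<T>(new LinkedHashSet<T>(acls));
    }

    /** Client side (b): reject null or empty ACL lists before the request is sent. */
    static <T> void validateAcl(List<T> acls) {
        if (acls == null || acls.isEmpty()) {
            throw new IllegalArgumentException("ACL list must not be null or empty");
        }
    }
}
```

Whether the server should then reject the empty list or fall back to some default is a separate decision that this sketch does not make.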





[jira] [Commented] (ZOOKEEPER-2355) Ephemeral node is never deleted if follower fails while reading the proposal packet

2017-06-09 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16044197#comment-16044197
 ] 

Flavio Junqueira commented on ZOOKEEPER-2355:
-

It does look like a good candidate to be resolved soon. There is a patch 
available, but it seems to be stale. I also have had a look at it some time 
back, so I need to refresh my view.

In any case, help is appreciated.

> Ephemeral node is never deleted if follower fails while reading the proposal 
> packet
> ---
>
> Key: ZOOKEEPER-2355
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2355
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum, server
>Reporter: Mohammad Arshad
>Assignee: Mohammad Arshad
>Priority: Critical
> Fix For: 3.5.4, 3.6.0
>
> Attachments: ZOOKEEPER-2355-01.patch, ZOOKEEPER-2355-02.patch, 
> ZOOKEEPER-2355-03.patch, ZOOKEEPER-2355-04.patch, ZOOKEEPER-2355-05.patch
>
>
> ZooKeeper ephemeral node is never deleted if a follower fails while reading the 
> proposal packet.
> The scenario is as follows:
> # Configure a three-node ZooKeeper cluster, let's say the nodes are A, B and C; 
> start all of them, and assume A is the leader and B and C are followers.
> # Connect to any of the servers and create ephemeral node /e1.
> # Close the session; ephemeral node /e1 will go for deletion.
> # While receiving the delete proposal, make Follower B fail with 
> {{SocketTimeoutException}}. We need to do this to reproduce the scenario; 
> in a production environment it happens because of a network fault.
> # Remove the fault and check that the faulted Follower is now connected to 
> the quorum.
> # Connect to any of the servers and create the same ephemeral node /e1; the 
> creation succeeds.
> # Close the session; ephemeral node /e1 will go for deletion.
> # {color:red}/e1 is not deleted from the faulted Follower B. It should have 
> been deleted, as it was created again with another session.{color}
> # {color:green}/e1 is deleted from Leader A and the other Follower C.{color}





Re: Build failure for recent commit

2017-06-08 Thread Flavio Junqueira
One of the project committers needs to cut a release candidate and put it up 
for a vote. If it is a bug fix release, then it should be relatively 
straightforward.

-Flavio

> On 06 Jun 2017, at 20:39, Ben Sherman  wrote:
> 
> Looking at https://issues.apache.org/jira/browse/ZOOKEEPER-1748 and
> https://github.com/apache/zookeeper/pull/83
> 
> It looks like jenkins is trying to post that the build worked and can't,
> resulting in what looks like a failure.  Can I get a hand on fixing this?
> 
> Also, what is the process for proposing a new release getting cut?  I'd
> like to see this change go into 3.4.11 asap.



[jira] [Commented] (ZOOKEEPER-2779) Add option to not set ACL for reconfig node

2017-06-01 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033055#comment-16033055
 ] 

Flavio Junqueira commented on ZOOKEEPER-2779:
-

I'm sorry for jumping in late, but just so that I understand: if the problem is 
that we want to give the application the ability to set a specific ACL 
without having an initial window of vulnerability, then would it be possible to 
have a parameter that sets that ACL rather than a parameter that skips the 
default? A parameter that skips the default works, but sounds a bit hacky.

> Add option to not set ACL for reconfig node
> ---
>
> Key: ZOOKEEPER-2779
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2779
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Affects Versions: 3.5.3
>Reporter: Jordan Zimmerman
>Assignee: Jordan Zimmerman
> Fix For: 3.5.4, 3.6.0
>
>
> ZOOKEEPER-2014 changed the behavior of the /zookeeper/config node by setting 
> the ACL to {{ZooDefs.Ids.READ_ACL_UNSAFE}}. This change makes it very 
> cumbersome to use the reconfig APIs. It also, perversely, makes security 
> worse as the entire ZooKeeper instance must be opened to the "super" user while 
> enabling reconfig (per {{ReconfigExceptionTest.java}}). Provide a mechanism 
> for savvy users to disable this ACL so that an application-specific custom 
> ACL can be set.





[jira] [Commented] (ZOOKEEPER-1936) Server exits when unable to create data directory due to race

2017-05-29 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16028326#comment-16028326
 ] 

Flavio Junqueira commented on ZOOKEEPER-1936:
-

It also looks like the diff was broken as the number of commits listed is 
large. I haven't looked closely but it seems that merges weren't done 
appropriately.

> Server exits when unable to create data directory due to race 
> --
>
> Key: ZOOKEEPER-1936
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1936
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.4.6, 3.5.0
>Reporter: Harald Musum
>Assignee: Ted Yu
>Priority: Minor
> Fix For: 3.5.4, 3.6.0
>
> Attachments: ZOOKEEPER-1936.branch-3.4.patch, ZOOKEEPER-1936.patch, 
> ZOOKEEPER-1936.v2.patch, ZOOKEEPER-1936.v3.patch, ZOOKEEPER-1936.v3.patch, 
> ZOOKEEPER-1936.v4.patch, ZOOKEEPER-1936.v5.patch
>
>
> We sometimes see the ZooKeeper server fail to start, with this error in the 
> log:
> [2014-05-27 09:29:48.248] ERROR   : -   
> .org.apache.zookeeper.server.ZooKeeperServerMainUnexpected exception,
> exiting abnormally\nexception=\njava.io.IOException: Unable to create data
> directory /home/y/var/zookeeper/version-2\n\tat
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.<init>(FileTxnSnapLog.java:85)\n\tat
> org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:103)\n\tat
> org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:86)\n\tat
> org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:52)\n\tat
> org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:116)\n\tat
> org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)\n\t
> [...]
> Stack trace from JVM gives this:
> "PurgeTask" daemon prio=10 tid=0x0201d000 nid=0x1727 runnable
> [0x7f55d7dc7000]
>java.lang.Thread.State: RUNNABLE
> at java.io.UnixFileSystem.createDirectory(Native Method)
> at java.io.File.mkdir(File.java:1310)
> at java.io.File.mkdirs(File.java:1337)
> at
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.<init>(FileTxnSnapLog.java:84)
> at org.apache.zookeeper.server.PurgeTxnLog.purge(PurgeTxnLog.java:68)
> at
> org.apache.zookeeper.server.DatadirCleanupManager$PurgeTask.run(DatadirCleanupManager.java:140)
> at java.util.TimerThread.mainLoop(Timer.java:555)
> at java.util.TimerThread.run(Timer.java:505)
> "zookeeper server" prio=10 tid=0x027df800 nid=0x1715 runnable
> [0x7f55d7ed8000]
>java.lang.Thread.State: RUNNABLE
> at java.io.UnixFileSystem.createDirectory(Native Method)
> at java.io.File.mkdir(File.java:1310)
> at java.io.File.mkdirs(File.java:1337)
> at
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.<init>(FileTxnSnapLog.java:84)
> at
> org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:103)
> at
> org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:86)
> at
> org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:52)
> at
> org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:116)
> at
> org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)
> [...]
> So it seems that when autopurge is used (as it is in our case), it might 
> happen at the same time as starting the server itself. In FileTxnSnapLog() it 
> will check if the directory exists and create it if not. These two tasks do 
> this at the same time, and mkdir fails and server exits the JVM.
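
The check-then-create race described in the report can be sketched as follows. This is only a minimal illustration, not the patch attached to the issue: DataDirs and ensureDirectory are hypothetical names, and the assumption is that a failed mkdirs() should count as an error only if the directory still does not exist afterwards.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical helper, not ZooKeeper's actual code.
final class DataDirs {
    // Creates dir if needed and tolerates another thread creating it first.
    static void ensureDirectory(File dir) throws IOException {
        if (dir.isDirectory()) {
            return;
        }
        // mkdirs() returns false when the directory was not created by this
        // call, which includes the case where a concurrent caller won the
        // race. Re-check before treating it as a failure.
        if (!dir.mkdirs() && !dir.isDirectory()) {
            throw new IOException("Unable to create data directory " + dir);
        }
    }
}
```

With this shape, the purge task and the server startup path could both call ensureDirectory on the same path without either of them failing just because the other got there first.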





[jira] [Commented] (ZOOKEEPER-1936) Server exits when unable to create data directory due to race

2017-05-29 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16028323#comment-16028323
 ] 

Flavio Junqueira commented on ZOOKEEPER-1936:
-

I think it was simply closed; I had a few comments there that were never 
addressed.






Re: FYI - zetcd: running ZooKeeper apps without ZooKeeper

2017-05-23 Thread Flavio Junqueira
Yeah, that's a good reference, thanks for starting the thread.

@phunt
I see a lot of systems trying to implement the API of others in the same space 
hoping that they can make the migration easier. I feel that it isn't so much 
about us being a de facto standard, but more about the widespread use. Another 
point is that some applications have decided that they like etcd/Consul better, 
and they still need to have that pesky zookeeper because of some other 
dependency, e.g., Kafka.

@jordan
I'm really interested in your observation about the Consul herding.  Could you 
elaborate on the herding argument? Perhaps we should write a post about it...

-Flavio

> On 20 May 2017, at 18:44, Jordan Zimmerman  wrote:
> 
> I think they're trying to displace ZK for places that use things like Kafka 
> too. They can make the argument that you can install Hashicorp everything and 
> get rid of having to manage anything else. That said, both etcd and consul 
> are more k/v stores than they are distributed coordinators. I was playing 
> around with Consul this past week and its locking recipes are far inferior 
> to ZooKeeper, for example (it suffers serious herding when contending for a 
> new lock holder).
> 
> -Jordan
> 
>> On May 20, 2017, at 6:40 PM, Patrick Hunt  wrote:
>> 
>> I saw that a few days ago, seems like it could be a real boon for folks
>> running in K8s (for example). The long term stability of our APIs really
>> reduce the pain of implementing something like this. Does Hashicorp have
>> something like this yet?
>> 
>> If I knew ten years ago that we would become the standard I would have
>> pushed harder to fix some of the rough(er) edges. ;-)
>> 
>> Patrick
>> 
>> On Fri, May 19, 2017 at 10:34 PM, Jordan Zimmerman <
>> jor...@jordanzimmerman.com> wrote:
>> 
>>> "The zetcd proxy sits in front of an etcd cluster and serves an emulated
>>> ZooKeeper client port, letting unmodified ZooKeeper applications run on top
>>> of etcd."
>>> 
>>> https://coreos.com/blog/introducing-zetcd
> 



Question about license

2017-04-14 Thread Flavio Junqueira
I think this is a question more for Pat Hunt. When going over the RAT report 
for the 3.5.3 RC, I noticed a bunch of doxygen-related files that have been 
there for quite some time and don't have the Apache License header. What 
actually caught my attention is this observation in the legal FAQ:

CAN WE USE DOXYGEN-GENERATED CONFIG FILES?
As long as the generated comments are removed from the Doxygen-generated files, 
these files may be used.

https://www.apache.org/legal/resolved.html

I'm not entirely sure what comments this is referring to. Does Pat or anyone 
else remember if we have done a license sanity check on those files? 

Thanks,
-Flavio 

Re: [VOTE] Apache ZooKeeper release 3.5.3-beta candidate 1

2017-04-14 Thread Flavio Junqueira
+1, I have checked license and notice, RAT report, digests and signature. I 
have also run a few local tests. Looks good!

I have an observation about the RAT report, but I'll send it to the list 
separately.

-Flavio

> On 13 Apr 2017, at 05:47, sandeep shrids  wrote:
> 
> +1
> Tests passed for me. Started zk in standalone and in cluster mode.
> Used command-line for my tests, java version "1.8.0_121".
> 
> Minor comment: the artifact ending with .tar.gz is not a gzip archive but a
> tar.
> 
> -Sandeep
> 
> On Thu, Apr 13, 2017 at 5:28 AM, Patrick Hunt  wrote:
> 
>> +1 xsum/sig verified, RAT ran clean. Tests passed for me. I started a few
>> different cluster sizes, ran the command line client and it all seemed
>> fine.
>> 
>> Patrick
>> 
>> On Mon, Apr 3, 2017 at 10:26 AM, Michael Han  wrote:
>> 
>>> Hi all,
>>> 
>>> This is the second release candidate (RC1) of 3.5.3-beta, following the
>>> first release candidate (RC0) cut off last week [1]. It fixes incorrect
>>> Implementation-Version information contained in the manifest file of the
>>> generated artifacts. There are no code changes, just regenerated and
>>> re-signed artifacts.
>>> 
>>> The full release notes are available at:
>>> 
>>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?
>>> projectId=12310801=12335444
>>> 
>>>  Please download, test and vote by April 8th 2017, 23:59 UTC+0. 
>>> 
>>> Source files:
>>> https://home.apache.org/~hanm/zookeeper/zookeeper-3.5.3-beta-rc1/
>>> 
>>> Maven staging repo:
>>> https://repository.apache.org/content/groups/staging/org/
>>> apache/zookeeper/zookeeper/3.5.3-beta/
>>> 
>>> The tag to be voted upon:
>>> https://github.com/apache/zookeeper/tree/release-3.5.3-rc1
>>> 
>>> ZooKeeper's KEYS file containing PGP keys we use to sign the release:
>>> http://www.apache.org/dist/zookeeper/KEYS
>>> 
>>> Should we release this candidate?
>>> 
>>> [1]
>>> http://mail-archives.apache.org/mod_mbox/zookeeper-dev/
>> 201703.mbox/%3CCA%
>>> 2Bi0x1LTONuaT6H%2BiOHAT2O1SVPKnQzzQh8vKMzWJVijCrHUrw%40mail.gmail.com%3E
>>> 
>>> --
>>> Cheers
>>> Michael.
>>> 
>> 



[jira] [Created] (ZOOKEEPER-2757) Incorrect path crashes zkCli

2017-04-14 Thread Flavio Junqueira (JIRA)
Flavio Junqueira created ZOOKEEPER-2757:
---

 Summary: Incorrect path crashes zkCli
 Key: ZOOKEEPER-2757
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2757
 Project: ZooKeeper
  Issue Type: Bug
Affects Versions: 3.5.3
Reporter: Flavio Junqueira
Priority: Minor
 Fix For: 3.5.4


If I try {{delete test}} without the leading /, then the CLI crashes with this 
exception:

{noformat}
Exception in thread "main" java.lang.IllegalArgumentException: Path must start 
with / character
at org.apache.zookeeper.common.PathUtils.validatePath(PathUtils.java:51)
at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:1659)
at org.apache.zookeeper.cli.DeleteCommand.exec(DeleteCommand.java:83)
at 
org.apache.zookeeper.ZooKeeperMain.processZKCmd(ZooKeeperMain.java:655)
at org.apache.zookeeper.ZooKeeperMain.processCmd(ZooKeeperMain.java:586)
at 
org.apache.zookeeper.ZooKeeperMain.executeLine(ZooKeeperMain.java:370)
at org.apache.zookeeper.ZooKeeperMain.run(ZooKeeperMain.java:330)
at org.apache.zookeeper.ZooKeeperMain.main(ZooKeeperMain.java:292)
{noformat}

It should really fail the operation rather than crash the CLI.
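
A rough sketch of the behaviour I have in mind, assuming a simplified command loop: CliSketch, executeLine and the local validatePath below are stand-ins made up for illustration, not the actual ZooKeeperMain or PathUtils code. The point is only that the validation error is caught per command and reported, so the CLI keeps running.

```java
// Hypothetical stand-in for the CLI command dispatch, for illustration only.
final class CliSketch {
    // Simplified version of the path check that throws in the trace above.
    static void validatePath(String path) {
        if (path == null || !path.startsWith("/")) {
            throw new IllegalArgumentException("Path must start with / character");
        }
    }

    // Executes one command line and returns a message for the user. An
    // invalid argument fails this command only; it does not escape the loop.
    static String executeLine(String line) {
        String[] args = line.trim().split("\\s+");
        try {
            if (args.length == 2 && args[0].equals("delete")) {
                validatePath(args[1]);
                return "deleted " + args[1];
            }
            return "unknown command: " + line;
        } catch (IllegalArgumentException e) {
            return "error: " + e.getMessage();
        }
    }
}
```

Here executeLine("delete test") yields an error message for that one command, and the next command can still be processed.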





Re: Can't verify signature of RC (Re: [VOTE] Apache ZooKeeper release 3.5.3-beta candidate 1)

2017-04-14 Thread Flavio Junqueira
Yeah, this one worked, thanks.

-Flavio

> On 14 Apr 2017, at 20:08, Michael Han  wrote:
> 
> gpg --keyserver keys.gnupg.net  --recv 767E7473



Can't verify signature of RC (Re: [VOTE] Apache ZooKeeper release 3.5.3-beta candidate 1)

2017-04-14 Thread Flavio Junqueira
Michael,

I can't verify the signature of the RC:

$ gpg2 --verify zookeeper-3.5.3-beta.tar.gz.asc zookeeper-3.5.3-beta.tar.gz
gpg: Signature made Mon Apr  3 18:39:14 2017 CEST using RSA key ID 767E7473
gpg: requesting key 767E7473 from hkp server keys.gnupg.net
: can't connect to `keys.gnupg.net': host not found
gpgkeys: HTTP fetch error 7: couldn't connect: Not found
gpg: no valid OpenPGP data found.
gpg: Total number processed: 0
gpg: Can't check signature: No public key

Is there anything I'm doing wrong?

-Flavio

> On 03 Apr 2017, at 19:26, Michael Han  wrote:
> 
> Hi all,
> 
> This is the second release candidate (RC1) of 3.5.3-beta, following the
> first release candidate (RC0) cut off last week [1]. It fixes incorrect
> Implementation-Version information contained in the manifest file of the
> generated artifacts. There are no code changes, just regenerated and
> re-signed artifacts.
> 
> The full release notes are available at:
> 
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?
> projectId=12310801=12335444
> 
>  Please download, test and vote by April 8th 2017, 23:59 UTC+0. 
> 
> Source files:
> https://home.apache.org/~hanm/zookeeper/zookeeper-3.5.3-beta-rc1/
> 
> Maven staging repo:
> https://repository.apache.org/content/groups/staging/org/
> apache/zookeeper/zookeeper/3.5.3-beta/
> 
> The tag to be voted upon:
> https://github.com/apache/zookeeper/tree/release-3.5.3-rc1
> 
> ZooKeeper's KEYS file containing PGP keys we use to sign the release:
> http://www.apache.org/dist/zookeeper/KEYS
> 
> Should we release this candidate?
> 
> [1]
> http://mail-archives.apache.org/mod_mbox/zookeeper-dev/201703.mbox/%3CCA%2Bi0x1LTONuaT6H%2BiOHAT2O1SVPKnQzzQh8vKMzWJVijCrHUrw%40mail.gmail.com%3E
> 
> -- 
> Cheers
> Michael.



Extending the vote (was: Re: [VOTE] Apache ZooKeeper release 3.5.3-beta candidate 1)

2017-04-07 Thread Flavio Junqueira
Would it be ok to extend the vote until mid next week?

-Flavio




[jira] [Commented] (ZOOKEEPER-2076) Improve Leader Change Mechanism

2017-03-31 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15950735#comment-15950735
 ] 

Flavio Junqueira commented on ZOOKEEPER-2076:
-

Go for it, [~atris], I've assigned it to you.

> Improve Leader Change Mechanism
> ---
>
> Key: ZOOKEEPER-2076
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2076
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Affects Versions: 3.5.0
>Reporter: Alexander Shraer
>Assignee: Atri Sharma
>
> When a leader is removed during a reconfiguration, ZOOKEEPER-107 uses a 
> mechanism where the old leader nominates the new one. Although it reduces the 
> time for a new leader to be elected, it still takes too long. This JIRA is 
> for two things:
> 1. Improve the mechanism, e.g., avoid loading snapshots, etc. during the 
> handoff.
> 2. Make it a first-class citizen & export it as a client API. We get 
> questions about this once in a while - how do I cause a different leader to 
> be elected ? Currently the response is either kill or reconfigure the current 
> leader.
> Any one interested to work on this ?





[jira] [Assigned] (ZOOKEEPER-2076) Improve Leader Change Mechanism

2017-03-31 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira reassigned ZOOKEEPER-2076:
---

Assignee: Atri Sharma






Re: Patch for ZOOKEEPER-2184 feedback

2017-03-31 Thread Flavio Junqueira
I'd love to see ZK-2184 fixed. If you have come up with a PR, Powell, I'd be 
happy to have a look and see if we can converge to a common set of changes.

Michael is right that there is already a PR there, so we would eventually need 
to decide whether to make changes to it, drop it, or do something else.

-Flavio

> On 31 Mar 2017, at 07:23, powell molleti  wrote:
> 
> Hi Michael,
> 
> 
> I did look at it and I can attempt to rebase to it that should not be a 
> problem but that again the changes could undo most of it. 
> I am pointing to the comment:
> https://issues.apache.org/jira/browse/ZOOKEEPER-2184?focusedCommentId=15873730=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15873730
> 
> Which quotes 
> "The ideal situation for the problematic scenario is that we resolve the host 
> name every time we try to connect to a server, but that would be a fairly 
> fundamental change to how we resolve addresses in ZooKeeper."
> 
> We can move this conversation to the Jira. I posted my changes after reading 
> that comment since I felt like these changes could address this issue or 
> at-least is headed in that direction.
> 
> The general idea is that we could let hostname resolution happen right before 
> the socket connect call. If a customer never provided a hostname, perhaps it is 
> incorrect to perform a reverse name lookup for it. If a customer did provide a 
> hostname, then perhaps it is incorrect to perform a reverse name lookup using 
> the address we resolved (for the given name) later on and use this instead of 
> the given hostname.
> 
> Please advise.
> 
> thanks
> Powell.
> On Thursday, March 30, 2017 3:55 PM, Michael Han  wrote:
> 
> 
> 
> HI Powell,
> 
> Have you looked at the existing PR (
> https://github.com/apache/zookeeper/pull/150) for ZOOKEEPER-2184? I think
> that's what community is working on, and it's close to get merged, so
> probably worth to adjust your work on top of that issue?
> 
> 
> On Wed, Mar 29, 2017 at 9:25 PM, powell molleti wrote:
> 
>> Hi,
>> 
>> 
>> I was wondering if anyone has cycles to look at the PR I have for
>> ZOOKEEPER-2184: Resolve address only on demand (
>> https://github.com/apache/zookeeper/pull/199 ).
>> 
>> Let me know if I am heading in the wrong direction any pointers will help
>> me to use these changes or drop them from a different PR I have.
>> 
>> thanks
>> Powell.
>> 
> 
> 
> 
> -- 
> Cheers
> Michael.



Re: [VOTE] Apache ZooKeeper release 3.4.10 candidate 1

2017-03-28 Thread Flavio Junqueira
+1, ran tests, checked signature and digests, checked license and notice, ran 
some smoke tests locally.

-Flavio

> On 28 Mar 2017, at 22:44, Michael Han  wrote:
> 
> Yes, I forgot to mention that I did some backward compatible verification
> as well. I was using 3.4.10 client to connect to a 3.5.3 ensemble and it
> works as expected.
> 
> On Tue, Mar 28, 2017 at 2:40 PM, Patrick Hunt  wrote:
> 
>> fwiw I also tried running a 3 and 23 node cluster and it worked fine. 3.4.9
>> client accessing 3.4.10 rc1 ensemble worked fine as well.
>> 
>> Patrick
>> 
>> On Mon, Mar 27, 2017 at 5:04 PM, Patrick Hunt  wrote:
>> 
>>> +1 - xsum/sig verified, RAT ran clean, was able to build and run the
>> tests
>>> successfully (mac).
>>> 
>>> Patrick
>>> 
>>> On Thu, Mar 23, 2017 at 5:40 AM, Rakesh Radhakrishnan <
>> rake...@apache.org>
>>> wrote:
>>> 
>>>> This is the second release candidate for 3.4.10. This candidate fixed the
>>>> MiniKDC authentication test case failures reported in the previous
>>>> candidate.
>>>>
>>>> This is a bug fix release candidate for 3.4.10. It fixes 43 issues,
>>>> including
>>>> security feature QuorumPeer mutual authentication via SASL and other
>>>> potential bugs.
>>>>
>>>> The full release notes are available at:
>>>>
>>>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?proje
>>>> ctId=12310801=12338036
>>>>
>>>> *** Please download, test and vote by March 29th 2017, 23:59 UTC+0. ***
>>>>
>>>> Source files:
>>>> http://people.apache.org/~rakeshr/zookeeper-3.4.10-candidate-1/
>>>>
>>>> Maven staging repo:
>>>> https://repository.apache.org/content/groups/staging/org/apa
>>>> che/zookeeper/zookeeper/3.4.10/
>>>>
>>>> The release candidate tag in git to be voted upon: release-3.4.10-rc1
>>>> https://github.com/apache/zookeeper/tree/release-3.4.10-rc1
>>>>
>>>> ZooKeeper's KEYS file containing PGP keys we use to sign the release:
>>>> http://www.apache.org/dist/zookeeper/KEYS
>>>>
>>>> Should we release this candidate?
>>>>
>>>>
>>>> Rakesh
>>>>
>>> 
>>> 
>> 
> 
> 
> 
> -- 
> Cheers
> Michael.



[jira] [Assigned] (ZOOKEEPER-2709) Clarify documentation around "auth" ACL scheme

2017-03-12 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira reassigned ZOOKEEPER-2709:
---

Assignee: Josh Elser

> Clarify documentation around "auth" ACL scheme
> --
>
> Key: ZOOKEEPER-2709
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2709
> Project: ZooKeeper
>  Issue Type: Task
>  Components: documentation
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Minor
> Fix For: 3.5.3, 3.6.0
>
>
> We recently found in HBASE-17717 that we were incorrectly setting an ACL 
> on our "sensitive" znodes after the output of {{getACL}} on these nodes 
> didn't match what was expected.
> In referencing the documentation about how the {{auth}} ACL scheme was 
> supposed to work, it was unclear if it was a ZooKeeper bug or an HBase bug. 
> After reading some ZooKeeper code, we found that it was an HBase bug, but it 
> would be nice to clarify the docs around this ACL scheme.





Re: v 3.6.0

2017-03-12 Thread Flavio Junqueira
I had volunteered to manage it, but I have had no cycles at all in the past 2-3 
months. We are also a bit late with the other 3.4 and 3.5 releases, which 
contributed to the delay. I agree otherwise that it would be great to 
have it sooner rather than later, though.

-Flavio

> On 11 Mar 2017, at 23:02, Jordan Zimmerman  wrote:
> 
> Is there any idea of date for 3.6.0? We really need TTL nodes and have 
> resorted to a fork internally to get it. I know that 3.5.x is the current 
> focus but at this rate how long, realistically, would we have 3.6.0? 
> 
> -Jordan



[jira] [Commented] (ZOOKEEPER-2184) Zookeeper Client should re-resolve hosts when connection attempts fail

2017-02-19 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15873730#comment-15873730
 ] 

Flavio Junqueira commented on ZOOKEEPER-2184:
-

I haven't had much time to work on this issue, but here is my current 
assessment.

This issue seemed easy to fix at first, but it is fairly fundamental with 
respect to how we resolve host names. Currently, we resolve host names when we 
start a client and never resolve them again. This is the cause of the problem 
reported in the issue because in the scenario described, the zookeeper 
container is restarted and changes addresses, which prevents the client from 
connecting to the zookeeper server. 

The proposed patch here tries to re-resolve the hostname every time the client 
fails to connect to the resolved address. It kind of works, but it makes 
{{StaticHostProvider}} a bit messy because the expectation with the current 
wiring is that we won't have to resolve again.

The ideal situation for the problematic scenario is that we resolve the host 
name every time we try to connect to a server, but that would be a fairly 
fundamental change to how we resolve addresses in ZooKeeper. 

I was also looking at the C client and it might get a bit messy too there 
because I don't think we currently keep the association between the host name 
and the resolved address, so we don't really know what to resolve again. It 
might be possible to do it via the canonical name in {{getaddrinfo}}, but I'm 
not sure how that works with windows.

One specific proposal to avoid clients never finding a server again, 
without deep changes to the current wiring, is to resolve everything again in 
the case that the client tries all servers and none succeeds. That would be a 
fairly straightforward change to both the Java and C clients, but it would not 
resolve addresses again in the case that a strict subset has changed addresses 
and at least one server is reachable.
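
To make the "resolve every time we connect" option concrete, here is a minimal sketch. It is not the {{StaticHostProvider}} code nor the proposed patch: ResolvingHostProvider is a hypothetical class, it assumes entries of the form host:port (no IPv6 literals), and it ignores shuffling and back-off. The provider keeps only unresolved entries and looks the name up when the next server is requested.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not ZooKeeper's StaticHostProvider.
final class ResolvingHostProvider {
    // Host names are stored unresolved so the name is never lost.
    private final List<InetSocketAddress> unresolved = new ArrayList<>();
    private int index = -1;

    ResolvingHostProvider(List<String> hostPorts) {
        if (hostPorts.isEmpty()) {
            throw new IllegalArgumentException("at least one server is required");
        }
        for (String hp : hostPorts) {
            int colon = hp.lastIndexOf(':');
            String host = hp.substring(0, colon);
            int port = Integer.parseInt(hp.substring(colon + 1));
            unresolved.add(InetSocketAddress.createUnresolved(host, port));
        }
    }

    // Returns the next server in round-robin order, resolving the host name
    // at call time rather than at construction time.
    InetSocketAddress next() throws UnknownHostException {
        index = (index + 1) % unresolved.size();
        InetSocketAddress entry = unresolved.get(index);
        InetAddress addr = InetAddress.getByName(entry.getHostString());
        return new InetSocketAddress(addr, entry.getPort());
    }
}
```

One caveat with this approach is that the JVM keeps its own cache of name lookups, so a fresh call to getByName does not necessarily reach DNS; how long entries are cached depends on the JVM's networkaddress.cache.ttl setting.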




> Zookeeper Client should re-resolve hosts when connection attempts fail
> --
>
> Key: ZOOKEEPER-2184
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2184
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: java client
>Affects Versions: 3.4.6, 3.5.0
> Environment: Ubuntu 14.04 host, Docker containers for Zookeeper & 
> Kafka
>Reporter: Robert P. Thille
>Assignee: Flavio Junqueira
>  Labels: easyfix, patch
> Fix For: 3.5.3, 3.4.11
>
> Attachments: ZOOKEEPER-2184.patch
>
>
> Testing in a Docker environment with a single Kafka instance using a single 
> Zookeeper instance. Restarting the Zookeeper container will cause it to 
> receive a new IP address. Kafka will never be able to reconnect to Zookeeper 
> and will hang indefinitely. Updating DNS or /etc/hosts with the new IP 
> address will not help the client to reconnect as the 
> zookeeper/client/StaticHostProvider resolves the connection string hosts at 
> creation time and never re-resolves.
> A solution would be for the client to notice that connection attempts fail 
> and attempt to re-resolve the hostnames in the connectString.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (ZOOKEEPER-2260) Paginated getChildren call

2017-01-28 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15844081#comment-15844081
 ] 

Flavio Junqueira commented on ZOOKEEPER-2260:
-

I really like this feature! Please submit a pull request, though, see the 
how-to-contribute page:

https://cwiki.apache.org/confluence/display/ZOOKEEPER/HowToContribute

> Paginated getChildren call
> --
>
> Key: ZOOKEEPER-2260
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2260
> Project: ZooKeeper
>  Issue Type: New Feature
>Affects Versions: 3.4.6, 3.5.0
>Reporter: Marco P.
>Assignee: Marco P.
>  Labels: api, features
> Fix For: 3.5.3, 3.6.0
>
> Attachments: ZOOKEEPER-2260.patch, ZOOKEEPER-2260.patch
>
>
> Add pagination support to the getChildren() call, allowing clients to iterate 
> over children N at a time.
> Motivations for this include:
>   - Getting out of a situation where so many children were created that 
> listing them exceeded the network buffer sizes (making it impossible to 
> recover by deleting)[1]
>  - More efficient traversal of nodes with large number of children [2]
> I do have a patch (for 3.4.6) we've been using successfully for a while, but 
> I suspect much more work is needed for this to be accepted. 
> [1] https://issues.apache.org/jira/browse/ZOOKEEPER-272
> [2] https://issues.apache.org/jira/browse/ZOOKEEPER-282
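
As a purely illustrative sketch of what "iterate over children N at a time" 
means, the Java snippet below pages over an in-memory list of child names. 
It is not the API from the attached patch and not an existing ZooKeeper call; 
the {{ChildPager}} class and its {{page}} method are hypothetical. A real 
implementation would have to page on the server side, since the motivation is 
precisely that the full list may not fit in one response.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper; shows offset/limit paging over child names. */
public class ChildPager {

    /** Returns up to pageSize child names, in sorted order, starting at offset. */
    public static List<String> page(List<String> children, int offset, int pageSize) {
        if (offset < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("offset must be >= 0 and pageSize > 0");
        }
        // Sort a copy so that consecutive pages see a stable order.
        List<String> sorted = new ArrayList<>(children);
        Collections.sort(sorted);
        if (offset >= sorted.size()) {
            return Collections.emptyList(); // past the end: no more pages
        }
        // Use long arithmetic so a huge pageSize cannot overflow.
        int end = (int) Math.min((long) sorted.size(), (long) offset + pageSize);
        return new ArrayList<>(sorted.subList(offset, end));
    }
}
```

A caller would request pages with increasing offsets until an empty page 
comes back.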





[jira] [Commented] (ZOOKEEPER-2125) SSL on Netty client-server communication

2017-01-27 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15843297#comment-15843297
 ] 

Flavio Junqueira commented on ZOOKEEPER-2125:
-

I remember having a different user on our list also requesting a backport of 
this feature. I'd have to look for the thread, but the message we gave at the 
time was that it is ok to work on a patch and share it with the community, but we 
won't be merging it to 3.4 because it is a stable branch. We generally avoid 
merging new features into a stable branch and target merging only bug fixes.

As for it being technically possible, I don't see why it wouldn't be, but I 
haven't actually tried.

> SSL on Netty client-server communication
> 
>
> Key: ZOOKEEPER-2125
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2125
> Project: ZooKeeper
>  Issue Type: Sub-task
>Reporter: Hongchao Deng
>Assignee: Hongchao Deng
> Fix For: 3.5.1, 3.6.0
>
> Attachments: testKeyStore.jks, testTrustStore.jks, 
> ZOOKEEPER-2125-build.patch, ZOOKEEPER-2125.patch, ZOOKEEPER-2125.patch, 
> ZOOKEEPER-2125.patch, ZOOKEEPER-2125.patch, ZOOKEEPER-2125.patch, 
> ZOOKEEPER-2125.patch, ZOOKEEPER-2125.patch, ZOOKEEPER-2125.patch, 
> ZOOKEEPER-2125.patch, ZOOKEEPER-2125.patch, ZOOKEEPER-2125.patch, 
> ZOOKEEPER-2125.patch, ZOOKEEPER-2125.patch, ZOOKEEPER-2125.patch, 
> ZOOKEEPER-2125.patch, ZOOKEEPER-2125.patch, ZOOKEEPER-2125.patch, 
> ZOOKEEPER-2125.patch
>
>
> Supporting SSL on Netty client-server communication. 
> 1. It supports keystore and truststore usage. 
> 2. It adds an additional ZK server port which supports SSL. This would be 
> useful for rolling upgrade.
> RB: https://reviews.apache.org/r/31277/
> The patch includes three files: 
> * testing purpose keystore and truststore under 
> "$(ZK_REPO_HOME)/src/java/test/data/ssl". Might need to create "ssl/".
> * latest ZOOKEEPER-2125.patch
> h2. How to use it
> You need to set some parameters on both ZK server and client.
> h3. Server
> You need to specify a listening SSL port in "zoo.cfg":
> {code}
> secureClientPort=2281
> {code}
> Just like what you did with "clientPort". And then set some jvm flags:
> {code}
> export SERVER_JVMFLAGS="-Dzookeeper.serverCnxnFactory=org.apache.zookeeper.server.NettyServerCnxnFactory \
>  -Dzookeeper.ssl.keyStore.location=/root/zookeeper/ssl/testKeyStore.jks \
>  -Dzookeeper.ssl.keyStore.password=testpass \
>  -Dzookeeper.ssl.trustStore.location=/root/zookeeper/ssl/testTrustStore.jks \
>  -Dzookeeper.ssl.trustStore.password=testpass"
> {code}
> Please change keystore and truststore parameters accordingly.
> h3. Client
> You need to set jvm flags:
> {code}
> export CLIENT_JVMFLAGS="-Dzookeeper.clientCnxnSocket=org.apache.zookeeper.ClientCnxnSocketNetty \
>  -Dzookeeper.client.secure=true \
>  -Dzookeeper.ssl.keyStore.location=/root/zookeeper/ssl/testKeyStore.jks \
>  -Dzookeeper.ssl.keyStore.password=testpass \
>  -Dzookeeper.ssl.trustStore.location=/root/zookeeper/ssl/testTrustStore.jks \
>  -Dzookeeper.ssl.trustStore.password=testpass"
> {code}
> change keystore and truststore parameters accordingly.
> And then connect to the server's SSL port, in this case:
> {code}
> bin/zkCli.sh -server 127.0.0.1:2281
> {code}
> If you have any feedback, you are more than welcome to discuss it here!





[jira] [Commented] (ZOOKEEPER-2125) SSL on Netty client-server communication

2017-01-27 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15842884#comment-15842884
 ] 

Flavio Junqueira commented on ZOOKEEPER-2125:
-

Hello [~shivamvds], this is not a bug fix but a feature, and as such we have no 
plan to backport it to the 3.4 branch. We are working towards a stable 3.5.x 
release, though.






[jira] [Commented] (ZOOKEEPER-2672) Remove CHANGE.txt

2017-01-26 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15839537#comment-15839537
 ] 

Flavio Junqueira commented on ZOOKEEPER-2672:
-

I'm not aware of any dependency on CHANGES.txt, so I'm +1 for removing it. 
According to the project bylaws, I'd say that this change corresponds to a 
change to the code base, and as such, the vote is by lazy approval, switching 
to lazy majority in the case of at least one -1, where the binding votes are 
from active committers.

> Remove CHANGE.txt
> -
>
> Key: ZOOKEEPER-2672
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2672
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: build
>Affects Versions: 3.4.9, 3.5.2
>Reporter: Michael Han
>Assignee: Michael Han
>
> CHANGE.txt is no longer the source of truth for what has changed since we 
> migrated to git - most of the git commits in the recent couple of months don't 
> update CHANGE.txt. Updating CHANGE.txt automatically during the commit flow 
> is non-trivial, and doing that manually is cumbersome and error 
> prone.
> The consensus is we would rely on source control revision logs instead of 
> CHANGE.txt moving forward; see 
> https://www.mail-archive.com/dev@zookeeper.apache.org/msg37108.html for more 
> details.





Re: ZooKeeper 3.4.10 release discussion

2017-01-26 Thread Flavio Junqueira
Here are a few comments on the proposal of changes to the release process:

- It might be a better idea to preserve the HowToRelease document for future 
reference, clone the document, and change the cloned document to reflect the 
git commands rather than svn.  
- We still need to modify Step 2 to be git oriented, otherwise it will look odd 
that we have svn there.
- In Step 4, I thought that we had informally agreed to rely on the git log 
rather than maintain the CHANGES.txt file. If we aren't all on board with the 
idea of no longer using CHANGES.txt, then we need to discuss this separately.
- Steps 5 and 6: I'm not sure why the steps to produce the release notes 
change. We still resolve issues on jira, which is pretty much the source of 
data for the release notes.
- Step 10: I personally don't like using "git commit -a" unless you're pretty 
sure that it is what you want. A much safer approach is to run "git status" and 
"git add" the individual files/directories.
- Step 11: Why are we tagging with -s? Is that standard practice in other 
projects?

-Flavio

> On 26 Jan 2017, at 03:30, Rakesh Radhakrishnan <rake...@apache.org> wrote:
> 
> Agreed, will try to resolve ZK-2184. I have included this to 3.4.10
> releasing. I could see few open review comments in the PR, probably will
> push once this is concluded.
> 
> Thanks,
> Rakesh
> 
> On Thu, Jan 26, 2017 at 2:01 AM, Flavio Junqueira <f...@apache.org> wrote:
> 
>> I'd like to have ZK-2184 in as well. I have seen many cases in which
>> applications are affected by that problem. If folks can help me push it
>> through, I'd appreciate.
>> 
>> -Flavio
>> 
>>> On 25 Jan 2017, at 17:01, Rakesh Radhakrishnan <rake...@apache.org>
>> wrote:
>>> 
>>> I've reviewed ZOOKEEPER-2044 pull request and added few comments. I hope
>>> this will be committed soon.
>>> 
>>> I'm planning to keep the CHANGE.txt file for this release. But, not
>>> updating the commit history considering that git revision can be used as
>> a
>>> reference. Please see my comment https://goo.gl/wu5V2M in ZOOKEEPER-2672
>>> jira.
>>> 
>>> Sometime back, I've filtered the issues which was marked for 3.4.10 and
>>> moved out these to 3.4.11 release.
>>> 
>>> Thanks,
>>> Rakesh
>>> 
>>> On Wed, Jan 25, 2017 at 5:41 AM, Michael Han <h...@cloudera.com> wrote:
>>> 
>>>> Hi Rakesh,
>>>> 
>>>> Thanks for driving 3.4.10 release.
>>>> 
>>>> I've been looking at https://issues.apache.org/
>> jira/browse/ZOOKEEPER-2044
>>>> today I think this could be a good addition to 3.4.10 release - what do
>> you
>>>> think? Should we get this in 3.4.10?
>>>> 
>>>> 
>>>> On Tue, Jan 24, 2017 at 9:13 AM, Rakesh Radhakrishnan <
>> rake...@apache.org>
>>>> wrote:
>>>> 
>>>>> Hi folks,
>>>>> 
>>>>> ZOOKEEPER-2573 fix is agreed and will be resolved soon. After
>> committing
>>>>> this jira, I'm planning to start cutting a release candidate based on
>> my
>>>>> proposed "HowToRelease" ZK cwiki changes.
>>>>> 
>>>>> Appreciate feedback on proposed ZK cwiki https://cwiki.apache.org/
>>>>> confluence/display/ZOOKEEPER/HowToRelease changes. Please refer my
>>>>> previous
>>>>> mail to understand more about it.
>>>>> 
>>>>> Thanks,
>>>>> Rakesh
>>>>> 
>>>>> On Tue, Jan 17, 2017 at 12:11 PM, Rakesh Radhakrishnan <
>>>> rake...@apache.org
>>>>>> 
>>>>> wrote:
>>>>> 
>>>>>> OK. I have modified ZK cwiki page https://cwiki.apache.org/
>>>>>> confluence/display/ZOOKEEPER/HowToRelease directly. Please review the
>>>>> newly
>>>>>> added lines in orange color to understand the changes. The following
>>>>>> sections has been modified:
>>>>>> 
>>>>>>  - *Updating the release branch -> modified steps **1, 4, 10, 11*
>>>>>>  - *Building -> modified step 9*
>>>>>>  - *Publishing -> modified step 1*
>>>>>> 
>>>>>> Thanks,
>>>>>> Rakesh
>>>>>> 
>>>>>> On Tue, Jan 17, 2017 at 11:36 AM, Patrick Hunt <ph...@apache.org>
>>>> wrote:
>>>>>> 
>>>>>>> Perhaps yo

Re: ZooKeeper 3.4.10 release discussion

2017-01-25 Thread Flavio Junqueira
I'd like to have ZK-2184 in as well. I have seen many cases in which 
applications are affected by that problem. If folks can help me push it 
through, I'd appreciate it.

-Flavio

> On 25 Jan 2017, at 17:01, Rakesh Radhakrishnan  wrote:
> 
> I've reviewed ZOOKEEPER-2044 pull request and added few comments. I hope
> this will be committed soon.
> 
> I'm planning to keep the CHANGE.txt file for this release. But, not
> updating the commit history considering that git revision can be used as a
> reference. Please see my comment https://goo.gl/wu5V2M in ZOOKEEPER-2672
> jira.
> 
> Sometime back, I've filtered the issues which was marked for 3.4.10 and
> moved out these to 3.4.11 release.
> 
> Thanks,
> Rakesh
> 
> On Wed, Jan 25, 2017 at 5:41 AM, Michael Han  wrote:
> 
>> Hi Rakesh,
>> 
>> Thanks for driving 3.4.10 release.
>> 
>> I've been looking at https://issues.apache.org/jira/browse/ZOOKEEPER-2044
>> today I think this could be a good addition to 3.4.10 release - what do you
>> think? Should we get this in 3.4.10?
>> 
>> 
>> On Tue, Jan 24, 2017 at 9:13 AM, Rakesh Radhakrishnan 
>> wrote:
>> 
>>> Hi folks,
>>> 
>>> ZOOKEEPER-2573 fix is agreed and will be resolved soon. After committing
>>> this jira, I'm planning to start cutting a release candidate based on my
>>> proposed "HowToRelease" ZK cwiki changes.
>>> 
>>> Appreciate feedback on proposed ZK cwiki https://cwiki.apache.org/
>>> confluence/display/ZOOKEEPER/HowToRelease changes. Please refer my
>>> previous
>>> mail to understand more about it.
>>> 
>>> Thanks,
>>> Rakesh
>>> 
>>> On Tue, Jan 17, 2017 at 12:11 PM, Rakesh Radhakrishnan <rake...@apache.org>
>>> wrote:
>>> 
>>>> OK. I have modified ZK cwiki page
>>>> https://cwiki.apache.org/confluence/display/ZOOKEEPER/HowToRelease
>>>> directly. Please review the newly added lines in orange color to
>>>> understand the changes. The following sections has been modified:
>>>> 
>>>>   - *Updating the release branch -> modified steps **1, 4, 10, 11*
>>>>   - *Building -> modified step 9*
>>>>   - *Publishing -> modified step 1*
>>>> 
>>>> Thanks,
>>>> Rakesh
>>>> 
>>>> On Tue, Jan 17, 2017 at 11:36 AM, Patrick Hunt wrote:
>>>> 
>>>>> Perhaps you can make the changes directly on the wiki page as a
>>>>> duplicate line item under the original in a different color? It's hard
>>>>> for me to really follow, esp as it's not a 1:1 replacement iiuc. Could
>>>>> you try editing the wiki directly to start with, leave the original
>>>>> line and add the new line(s) but in another color or some other
>>>>> indication?
>>>>> 
>>>>> Thanks Rakesh.
>>>>> 
>>>>> Patrick
>>>>> 
>>>>> On Mon, Jan 16, 2017 at 8:48 AM, Rakesh Radhakrishnan
>>>>> <rake...@apache.org> wrote:
>>>>> 
>>>>>> Hi folks,
>>>>>> 
>>>>>> As we all know, 3.4.10 release is the first ZooKeeper release after
>>>>>> the github repository migration. I have tried an attempt to modify
>>>>>> the steps described in the
>>>>>> 'https://cwiki.apache.org/confluence/display/ZOOKEEPER/HowToRelease'
>>>>>> page to make the release. Since this release is from an already
>>>>>> created branch, I have focused only the branch related parts in cwiki
>>>>>> and below sections in the page needed changes like,
>>>>>> 
>>>>>> 
>>>>>> *Updating the release branch*
>>>>>> 1. Check out the branch with:
>>>>>> git clone -b branch-X.Y
>>>>>> https://git-wip-us.apache.org/repos/asf/zookeeper.git
>>>>>> 
>>>>>> 2. I'm skipping this step, which is not required now.
>>>>>> 
>>>>>> 4. Update CHANGES.txt with the committed jira details. As we follow
>>>>>> PR merging, most of the jira info is not updated in this file. I
>>>>>> believe release manager need to update this file to capture the jira
>>>>>> details marked for that release.
>>>>>> 
>>>>>> 10. Commit these changes.
>>>>>> git commit -a -m "Preparing for release X.Y.Z"
>>>>>> git push  
>>>>>> 
>>>>>> 11. Tag the release candidate (R is the release candidate number,
>>>>>> and starts from 0):
>>>>>> git tag -s release-X.Y.Z-rcR -m "ZooKeeper X.Y.Z-rcR release."
>>>>>> 
>>>>>> Push the newly created rc tag to the remote repo.
>>>>>> git push  release-X.Y.Z-rcR
>>>>>> 
>>>>>> 
>>>>>> *Building*
>>>>>> 9. Call for a release vote on dev
>>>>>>  In the release candidate dev mail format, it needs to change the
>>>>>>  tag like,
>>>>>> 
>>>>>>  "The RC tag in git to be voted upon: release-X.Y.Z-rcR"
>>>>>> 
>>>>>> 
>>>>>> *Publishing*
>>>>>> 1. Tag the release:
>>>>>> git tag -s release-X.Y.Z -m "ZooKeeper X.Y.Z release."
>>>>>> 
>>>>>> Push the newly created release tag to the remote repo.
>>>>>> git push  release-X.Y.Z
>>>>>> 
>>>>>> 
>>>>>> I'd like to know whether I'm going in the right direction and start
>>>>>> cutting the 3.4.10 release by following this approach. Thanks!
>>>>>> 
>>>>>> Thanks,
>>>>>> Rakesh
>>>>>> 
>>>>>> On Mon, Jan 

[jira] [Updated] (ZOOKEEPER-2633) Build failure in contrib/zkfuse with gcc 6.x

2017-01-25 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira updated ZOOKEEPER-2633:

Assignee: Raghavendra Prabhu

> Build failure in contrib/zkfuse with gcc 6.x
> 
>
> Key: ZOOKEEPER-2633
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2633
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: contrib-zkfuse
> Environment: gcc --version
> gcc (GCC) 6.2.1 20160830
> Copyright (C) 2016 Free Software Foundation, Inc.
> This is free software; see the source for copying conditions.  There is NO
> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
> g++ --version
> g++ (GCC) 6.2.1 20160830
> Copyright (C) 2016 Free Software Foundation, Inc.
> This is free software; see the source for copying conditions.  There is NO
> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
> CFLAGS, CXXFLAGS, and LDFLAGS are unset, hence default.
> uname -a
> Linux lative 4.8.8-1-ARCH #1 SMP PREEMPT Tue Nov 15 08:25:24 CET 2016 x86_64 
> GNU/Linux
>Reporter: Raghavendra Prabhu
>Assignee: Raghavendra Prabhu
>Priority: Minor
> Fix For: 3.4.10, 3.5.3, 3.6.0
>
>
> The build in contrib/zkfuse fails with
> {noformat}
> make
> (CDPATH="${ZSH_VERSION+.}:" && cd . && /bin/sh 
> /home/raghu/zookeeper/src/contrib/zkfuse/missing autoheader)
> rm -f stamp-h1
> touch config.h.in
> cd . && /bin/sh ./config.status config.h
> config.status: creating config.h
> config.status: config.h is unchanged
> make  all-recursive
> make[1]: Entering directory '/home/raghu/zookeeper/src/contrib/zkfuse'
> Making all in src
> make[2]: Entering directory '/home/raghu/zookeeper/src/contrib/zkfuse/src'
> g++ -DHAVE_CONFIG_H -I. -I..
> -I/home/raghu/zookeeper/src/contrib/zkfuse/../../c/include 
> -I/home/raghu/zookeeper/src/contrib/zkfuse/../../c/generated -I../include 
> -I/usr/include -D_FILE_OFFSET_BITS=64 -D_REENTRANT -march=x86-64 
> -mtune=generic -O2 -pipe -fstack-protector-strong -MT zkfuse.o -MD -MP -MF 
> .deps/zkfuse.Tpo -c -o zkfuse.o zkfuse.cc
> g++ -DHAVE_CONFIG_H -I. -I..
> -I/home/raghu/zookeeper/src/contrib/zkfuse/../../c/include 
> -I/home/raghu/zookeeper/src/contrib/zkfuse/../../c/generated -I../include 
> -I/usr/include -D_FILE_OFFSET_BITS=64 -D_REENTRANT -march=x86-64 
> -mtune=generic -O2 -pipe -fstack-protector-strong -MT zkadapter.o -MD -MP -MF 
> .deps/zkadapter.Tpo -c -o zkadapter.o zkadapter.cc
> In file included from zkadapter.h:34:0,
>  from zkadapter.cc:24:
> event.h:216:9: error: reference to ‘shared_ptr’ is ambiguous
>  shared_ptr m_eventWrapper;
>  ^~
> In file included from /usr/include/boost/throw_exception.hpp:42:0,
>  from /usr/include/boost/smart_ptr/shared_ptr.hpp:27,
>  from /usr/include/boost/shared_ptr.hpp:17,
>  from event.h:30,
>  from zkadapter.h:34,
>  from zkadapter.cc:24:
> /usr/include/boost/exception/exception.hpp:148:11: note: candidates are: 
> template class boost::shared_ptr
>  class shared_ptr;
>^~
> In file included from /usr/include/c++/6.2.1/bits/shared_ptr.h:52:0,
>  from /usr/include/c++/6.2.1/memory:82,
>  from /usr/include/boost/config/no_tr1/memory.hpp:21,
>  from /usr/include/boost/smart_ptr/shared_ptr.hpp:23,
>  from /usr/include/boost/shared_ptr.hpp:17,
>  from event.h:30,
>  from zkadapter.h:34,
>  from zkadapter.cc:24:
> /usr/include/c++/6.2.1/bits/shared_ptr_base.h:343:11: note: 
> template class std::shared_ptr
>  class shared_ptr;
>^~
> In file included from zkadapter.h:34:0,
>  from zkadapter.cc:24:
> event.h: In constructor ‘zkfuse::GenericEvent::GenericEvent(int, 
> zkfuse::AbstractEventWrapper*)’:
> event.h:189:27: error: class ‘zkfuse::GenericEvent’ does not have any field 
> named ‘m_eventWrapper’
>  m_type(type), m_eventWrapper(eventWrapper) {
>^~
> event.h: In member function ‘void* zkfuse::GenericEvent::getEvent() const’:
> event.h:204:41: error: ‘m_eventWrapper’ was not declared in this scope
>  void *getEvent() const { return m_eventWrapper->getWrapee(); }
>  ^~
> In file included from zkadapte
