Re: [ANNOUNCE] new ZooKeeper PMC member: Mate Szalay-Beko

2022-03-28 Thread Jordan Zimmerman
Congrats!!!

> On Mar 28, 2022, at 7:42 AM, Enrico Olivelli  wrote:
> 
> I am happy to announce that Mate Szalay-Beko has been invited to join
> the Apache ZooKeeper PMC and he accepted.
> 
> Mate is doing great work for our community.
> 
> Please join me in congratulating him
> 
> Congrats Mate !
> 
> 
> If you want to know more about how the ASF works and what a PMC is, you can
> read more here:
> https://www.apache.org/foundation/how-it-works.html#pmc
> 
> Enrico



Re: Double Locking Issue

2021-07-20 Thread Jordan Zimmerman
A few things...

SUSPENDED in Curator means only that the connection has been lost, not that the
session has ended. LOST is the state that means the session has ended.

Be aware of how GCs can affect Curator. See the Tech Note here:
https://cwiki.apache.org/confluence/display/CURATOR/TN10

Also read this Tech Note on session handling: 
https://cwiki.apache.org/confluence/display/CURATOR/TN14 


I don't follow the timeline given in the original post. Why is the session 
timing out at T4? Has there been a network partition? Can you provide details 
on the partition and the behavior you're seeing? 

As I recall, InterProcessMutex creates a new node every time acquire is called. 
It's the client code's responsibility to manage SUSPENDED/LOST. I'd like to 
see your code for handling that. If there is a network partition, your code 
would not be able to delete the lock's node until the partition is repaired.
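
As a rough illustration of the SUSPENDED/LOST bookkeeping described above, here
is a minimal, library-free sketch. The LockGuard class, its State enum and the
method names are hypothetical stand-ins invented for this example; they are not
Curator's API, and a real application would drive something like this from its
own connection state listener. It assumes that LOST means the session (and so
the lock's ephemeral node) is gone, and that SUSPENDED followed by RECONNECTED
means the session survived.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class LockGuard {
    /** Local stand-in for the connection states named in this thread (not Curator's class). */
    public enum State { CONNECTED, SUSPENDED, RECONNECTED, LOST }

    private final AtomicBoolean holdsLock = new AtomicBoolean(false);
    private final AtomicBoolean paused = new AtomicBoolean(false);

    /** Call after the application has successfully acquired the lock. */
    public void lockAcquired() {
        holdsLock.set(true);
        paused.set(false);
    }

    /** Feed every connection state change into the guard. */
    public void onStateChange(State newState) {
        switch (newState) {
            case SUSPENDED:
                // Connection lost, session may still be alive: stop protected work, keep the claim.
                paused.set(true);
                break;
            case RECONNECTED:
                // Connection is back; this only resumes work if the claim was not dropped by LOST.
                paused.set(false);
                break;
            case LOST:
                // Session ended: the lock can no longer be trusted and must be re-acquired.
                holdsLock.set(false);
                paused.set(false);
                break;
            default:
                break;
        }
    }

    /** True only while the lock is held and the connection is not suspended. */
    public boolean maySafelyWork() {
        return holdsLock.get() && !paused.get();
    }
}
```

The point of the sketch is that protected work checks maySafelyWork() before
each step, and that after LOST the application has to call acquire again
rather than assume its old lock is still valid.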

In any event, I'd need to see much more detail and some sample code to better 
comment on this.

-Jordan

> On Jul 20, 2021, at 8:47 AM, Enrico Olivelli  wrote:
> 
> (crossposting to dev@zookeeper)
> 
> Hi ZooKeepers,
> can anyone take a look at this problem a user found while using Curator?
> 
> Thanks in advance
> Enrico
> 
> On Tue, Jul 20, 2021 at 09:01, Cameron McKenzie <cammcken...@apache.org> wrote:
> 
>> hey Viswa,
>> I'm by no means an expert on this chunk of code, but I've done a bit of
>> digging and it certainly seems that you've uncovered an issue.
>> 
>> Ultimately the root cause of the issue is the weirdness in the way that ZK
>> is handling ephemeral nodes. I'm not sure if this is intentional or a bug,
>> but I would have thought that if the ephemeral nodes are tied to a session
>> then they should be removed as soon as the session has expired.
>> 
>> From the Curator standpoint, it appears that the InterProcessMutex has been
>> written with the assumption that ephemeral nodes are deleted when their
>> session expires. To fix it on the Curator side, I think that we would need
>> to provide a way to interrupt the acquire() method, so that when the
>> connection goes into a SUSPENDED state we can force the restart of the
>> acquisition method. I guess you could just explicitly interrupt the thread
>> when your ConnectionStateListener gets a SUSPENDED event, but this is a bit
>> ugly.
>> 
>> Might be worth raising the issue on the ZK lists to see if this is a bug or
>> by design.
>> 
>> Any other devs have any thoughts?
>> cheers
>> 
>> 
>> 
>> 
>> 
>> 
>> On Tue, Jul 20, 2021 at 3:45 AM Viswanathan Rajagopal
>>  wrote:
>> 
>>> Hi Team,
>>> 
>>> Good day.
>>> 
>>> We recently came across a “Double Locking Issue” (i.e. two clients acquiring
>>> the lock) while using Curator code (InterProcessMutex lock APIs) in our
>>> application.
>>> 
>>> Our use case:
>>> 
>>>  *   Two clients attempt to acquire the ZooKeeper lock using Curator
>>> InterProcessMutex, and whoever owns it releases it once it sees the
>>> connection disconnect (on receiving Connection.SUSPENDED / Connection.LOST
>>> Curator connection events from the connection listener).
>>> 
>>> Issue we noticed:
>>> 
>>>  *   After the session expired and the clients reconnected with a new session,
>>> both clients seem to have acquired the lock. Interestingly, we found that
>>> one of the clients still held the lock while its lock node (ephemeral) was
>>> gone.
>>> 
>>> Things we found:
>>> 
>>>  *   Based on our initial analysis and a few test runs, we saw that the Curator
>>> acquire() method acquires the lock based on the “about to be deleted lock node
>>> of the previous session”. Explanation: the ephemeral node created by the
>>> previous session was still seen by the client that reconnected with a new
>>> session id, until the server cleans it up. If this happens, Curator acquire()
>>> would hold the lock.
>>> 
>>> 
>>> 
>>>  *   We could clearly see a race condition (in ZooKeeper code) between
>>> 1) the client reconnecting to the server with a new session id and 2) the
>>> server deleting the ephemeral nodes of the client’s previous session. We
>>> were able to reproduce this issue using the following approach:
>>> *   Artificially break the socket connection between client and
>>> server for 30s
>>> *   Artificially pause a set of server code paths for a minute, then
>>> unpause them
>>> 
>>> 
>>>  *   In the above-mentioned race condition, if the client manages to reconnect
>>> to the server with a new session id before the server cleans up the ephemeral
>>> nodes of the client’s previous session, the Curator acquire() call that is
>>> trying to acquire the lock will hold the lock, as it still sees the lock node
>>> in the ZooKeeper directory. Eventually the server cleans up the ephemeral
>>> nodes, leaving the Curator local lock thread data stale and giving the
>>> illusion that it still holds the lock while its ephemeral node is gone.
>>> 
>>> 
>>>  *   Timeline events 

Re: An official API for starting ZooKeeper from a Java program

2020-07-01 Thread Jordan Zimmerman
The issue is trying to start ZooKeeper programmatically. ZooKeeperServerMain 
assumes starting from the CLI with zoo.cfg, etc. If you want to do all that 
programmatically it's more difficult. What is needed is a version of 
ZooKeeperServerMain for programmatic use.
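
To make the idea concrete, here is a small sketch of what such a programmatic
entry point could look like, built around the four operations proposed further
down this thread (init, start, stop, isAlive). Everything in it is hypothetical:
the ServerLifecycle interface, the FakeServer class and the choice of
java.util.Properties for configuration are assumptions made for illustration,
not an existing ZooKeeper API. The fake implementation only tracks state in
memory; a real one would delegate to ZooKeeper's server classes.

```java
import java.util.Properties;
import java.util.concurrent.atomic.AtomicBoolean;

public class EmbeddedServerSketch {
    /** Hypothetical lifecycle contract, named after the operations listed in the thread. */
    public interface ServerLifecycle {
        void init(Properties configuration);
        void start();
        void stop();
        boolean isAlive();
    }

    /** Toy in-memory implementation used only to show the expected call order. */
    public static class FakeServer implements ServerLifecycle {
        private Properties config;
        private final AtomicBoolean running = new AtomicBoolean(false);

        @Override
        public void init(Properties configuration) {
            // Copy the settings so later changes by the caller do not affect the server.
            Properties copy = new Properties();
            copy.putAll(configuration);
            this.config = copy;
        }

        @Override
        public void start() {
            if (config == null) {
                throw new IllegalStateException("init(configuration) must be called before start()");
            }
            running.set(true);
        }

        @Override
        public void stop() {
            running.set(false);
        }

        @Override
        public boolean isAlive() {
            return running.get();
        }
    }

    public static void main(String[] args) {
        Properties cfg = new Properties();
        cfg.setProperty("clientPort", "2181"); // same key as in zoo.cfg
        ServerLifecycle server = new FakeServer();
        server.init(cfg);
        server.start();
        System.out.println("alive after start: " + server.isAlive());
        server.stop();
        System.out.println("alive after stop: " + server.isAlive());
    }
}
```

The benefit of a contract like this is that a launcher passes configuration as
an object instead of pointing at a zoo.cfg file, and can poll isAlive() for a
minimal health check.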

> On Jul 1, 2020, at 3:41 PM, Christopher  wrote:
> 
> In my experience, the current ZooKeeperServerMain is adequate for
> this. I currently wrap that for my (unofficial) zookeeper-maven-plugin
> project (https://github.com/revelc/zookeeper-maven-plugin). I'm
> curious what additional benefits having a new API would add that can't
> be done with the current ones.
> 
> I'm in favor of removing the reliance on the scripts to start
> ZooKeeper, but from what I can tell, the dependence on those is
> already pretty minimal (I didn't need it for my wrapping, and it also
> doesn't seem to have been necessary for Apache Accumulo's
> MiniAccumuloCluster, which also launches ZooKeeper in Java code with
> minimal effort).
> 
> On Wed, Jul 1, 2020 at 2:36 AM Enrico Olivelli  wrote:
>> 
>> On Thu, Jun 25, 2020 at 22:48, Patrick Hunt wrote:
>> 
>>> On Thu, Jun 25, 2020 at 1:45 PM Enrico Olivelli 
>>> wrote:
>>> 
 On Thu, Jun 25, 2020, 22:13 Patrick Hunt wrote:
 
> On Thu, Jun 25, 2020 at 3:20 AM Enrico Olivelli 
> wrote:
> 
>> Hi,
>> we recently got into ZOOKEEPER-3803 FileTxnSnapLog.fastForwardFromEdits()
>> throws NPE if TestingServer is started from another thread (see [1])
>> and I have similar cases in other non OS products that run ZooKeeper from
>> Java.
>> 
>> The case of ZOOKEEPER-3803 is for Curator TestingServer, and here we are
>> talking about running a ZK server for testing purposes.
>> But I have other usecases in which the ZK server must be launched, for
>> production usage, from a Java based bootstrap launcher.
>> 
>> We are now officially supporting our binary package distribution that runs
>> ZooKeeper from a bash script and the bash script is coupled with
>> ZooKeeperMain and QuorumPeerMain.
>> 
>> In my opinion we should provide a well known and supported API, better than
>> using directly those two classes, to run ZooKeeper safely in production,
>> but launched from Java.
>> 
>> I am not here supporting or suggesting the idea of running ZooKeeper inside
>> the same process of a client application,
>> but only to provide a clear and stable API to start/stop and do minimal
>> health checks to a ZooKeeper peer.
>> 
>> I will be happy to work on it
>> 
>> Thoughts ?
>> 
> 
> Sounds reasonable to me. That said I'm not sure I follow you entirely.
> Isn't that the goal of the two Main classes? Is it that they are deficient
> (and can therefore be fixed to address) or they are serving a different
> role entirely from what you intend to provide?
> 
 
 Those classes do not have a clear interface.
 
 We need at least
 - init(configuration)
 - start()
 - stop()
 - boolean isAlive()
 
 
>>> Makes sense to me. Can we refactor the Main classes to include/use that and
>>> also doc/pub it as an public/enduser interface? Possible user/consumer
>>> visible regressions may make that a bad idea? Or do you want to have just
>>> one interface that supports any cluster type, distributed or non. I could
>>> see both approaches working.
>>> 
>> 
>> I would prefer to not change existing code, and create a wrapper around
>> ZooKeeperMain and other classes.
>> ZooKeeper bootstrap code is already used in many fancy ways, the less we
>> change it the better it is for the sake of stability.
>> 
>> This is the JIRA with complete design choices
>> https://issues.apache.org/jira/browse/ZOOKEEPER-3874
>> 
>> I am planning to work on it as soon as possible, I would like this to be
>> delivered with 3.7.0
>> 
>> Enrico
>> 
>> 
>>> 
>>> Patrick
>>> 
>>> 
 Optionally we can provide some utility to get the endpoint address.
 
 Very similar to Curator TestingServer but:
 - for production
 - maintained by Zookeeper project
 
 
 Enrico
 
 
 
> Patrick
> 
> 
>> 
>> Enrico
>> 
>> 
>> [1] https://issues.apache.org/jira/browse/ZOOKEEPER-3803
>> 
> 
 
>>> 



[jira] [Created] (ZOOKEEPER-3831) Add a test that does a minimal validation of Apache Curator

2020-05-15 Thread Jordan Zimmerman (Jira)
Jordan Zimmerman created ZOOKEEPER-3831:
---

 Summary: Add a test that does a minimal validation of Apache 
Curator
 Key: ZOOKEEPER-3831
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3831
 Project: ZooKeeper
  Issue Type: Improvement
  Components: tests
Affects Versions: 3.6.1
Reporter: Jordan Zimmerman
Assignee: Jordan Zimmerman


Given that Apache Curator is one of the most widely used ZooKeeper clients it 
would be beneficial for ZooKeeper to have a minimal test to ensure that the 
codebase doesn't cause incompatibilities with Curator in the future.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [ANNOUNCE] New ZooKeeper committer: Mate Szalay-Beko

2020-04-03 Thread Jordan Zimmerman
Congrats!!!

> On Apr 3, 2020, at 3:42 AM, Andor Molnar  wrote:
> 
> The Apache ZooKeeper PMC recently extended committer karma to Mate and he has 
> accepted. 
> Mate has made some great contributions (including C client!) and we are 
> looking forward to even more. :) 
> 
> Congratulations and welcome aboard, Mate!
> 
> 



Re: [DISCUSS] Sending 3.4 release line to End-Of-Life status

2020-04-02 Thread Jordan Zimmerman
+1 FYI - the next release of Curator will drop 3.4.x support.

-Jordan

> On Apr 1, 2020, at 11:30 PM, Michael Han  wrote:
> 
> +1.
> 
> For the EOL policy statement, just to throw out a few things I can think
> of:
> 
> * Define what EOL means (such as: not supported by community dev team
> anymore, no future 3.4 releases .. still accessible at download page for X
> years..) and a date of EOL.
> 
> * Provide guidelines for upgrading paths to 3.5 / 3.6.
> 
> * State interoperability guarantees another post pointed out previously ^
> 
> On Wed, Apr 1, 2020 at 2:04 AM Andor Molnar  wrote:
> 
>> Hi folks,
>> 
>> Based on Enrico’s latest post about a 3.4 client problem I’d like to push
>> this initiative.
>> I am asking the more senior members of the community: what communicated
>> policy exactly is needed to say 3.4 is EoL?
>> 
>> In terms of timing, I like Patrick’s suggestion of the 1st of June, 2020.
>> 
>> Any objections?
>> 
>> Andor
>> 
>> 
>> 
>> 
>>> On 2020. Mar 4., at 18:45, Michael K. Edwards 
>> wrote:
>>> 
>>> I think it would be useful for an EOL statement about 3.4.x to include a
>>> policy on interoperability of newer ZooKeeper servers with 3.4.x client
>>> code.  Stacks that build on top of Kafka and Hadoop (I'm looking at you,
>>> Spark) often wind up having an indirect dependency on a comically stale
>>> ZooKeeper library.  Even if this library isn't really exercised by the
>>> client side of the stack, it's there in the mountain of jars; and when
>>> application code also wants to use ZooKeeper more directly, using a newer
>>> client library can get kind of messy.  The approach I've taken has been
>> to
>>> rebuild large swathes of the stack around a consistent, recent ZooKeeper
>>> build; but I think it would be relevant to a lot of people to know
>> whether,
>>> say, a 3.4.14 client will work reliably with a 3.6.x quorum.
>>> 
>>> On Wed, Mar 4, 2020 at 9:28 AM Enrico Olivelli 
>> wrote:
>>> 
>>>> On Wed, Mar 4, 2020 at 17:23, Patrick Hunt wrote:
>>>>> 
>>>>> It seems like we should have a stated/communicated policy around
>> release
>>>>> lifecycles before sending an EOL message. That way folks have some
>> runway
>>>>> to plan for the event, both near term (3.4) as well as long term.
>>>> 
>>>> Shall we set a deadline ?
>>>> Something like "3.4 will be EOL by the end of 2020" ?
>>>> At this point we are only "discussing" sending 3.4 to EOL, no
>>>> decision has been made yet
>>>> 
>>>> 
>>>> Enrico
>>>> 
>>>> 
>>>>> 
>>>>> Patrick
>>>>> 
>>>>> On Wed, Mar 4, 2020 at 5:16 AM Szalay-Bekő Máté <
>>>> szalay.beko.m...@gmail.com>
>>>>> wrote:
>>>>> 
>>>>>> Also a minor thing to consider: we wanted to ask the HBase community
>> to
>>>>>> upgrade to ZooKeeper 3.5 before, and the conclusion there was that
>> they
>>>>>> will do so only when the EOL will be at least scheduled / announced on
>>>> the
>>>>>> ZooKeeper 3.4 versions. Maybe there are other ZooKeeper users as well
>>>> who
>>>>>> will not upgrade until they get 'an official' statement about the 3.4
>>>>>> versions.
>>>>>> 
>>>>>> On Wed, Mar 4, 2020 at 1:44 PM Jordan Zimmerman <
>>>>>> jor...@jordanzimmerman.com>
>>>>>> wrote:
>>>>>> 
>>>>>>> I'm +1 on this. We're planning to drop support for 3.4.x in the next
>>>>>>> release of Apache Curator, FYI.
>>>>>>> 
>>>>>>> -Jordan
>>>>>>> 
>>>>>>>> On Mar 4, 2020, at 7:36 AM, Enrico Olivelli 
>>>>>> wrote:
>>>>>>>> 
>>>>>>>> Hi,
>>>>>>>> we are releasing 3.6.0 (I am waiting for mirrors to sync before
>>>>>>>> updating the website).
>>>>>>>> 
>>>>>>>> In my opinion it is time to officially send 3.4 branch to EOL
>>>> status,
>>>>>>> that is:
>>>>>>>> - we are not expecting new releases
>>>>>>>> - drop 3.4 from download area (it will stay on archives as usual)
>>>>>>>> - strongly encourage people to update to 3.5/3.6
>>>>>>>> 
>>>>>>>> 3.4 is far away from master branch and even from 3.6.
>>>>>>>> There is a clean upgrade path from 3.4.LATEST to 3.5.7 and to 3.6
>>>> so
>>>>>>>> users are able to upgrade.
>>>>>>>> 
>>>>>>>> I am not sure we need a VOTE, if we simply agree I can drop 3.4 from
>>>>>>>> the "dist" area as soon as I push the new website.
>>>>>>>> 
>>>>>>>> Best regards
>>>>>>>> Enrico
>>>>>>> 
>>>>>>> 
>>>>>> 
>>>> 
>> 
>> 



Re: ZK does not compile on JDK14 due to "java.lang.Record"

2020-03-27 Thread Jordan Zimmerman
Boy - it seems it's a mistake for the JDK to have Record in the java.lang 
package. Putting it in a different package would fix this. I wonder if we 
should file a bug or bring it up on the Amber list?
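
For anyone wondering why the clash happens at all: java.lang is implicitly
imported on demand, so a second on-demand import that also offers a Record
(for example `import org.apache.jute.*;`, assuming that is what the generated
code uses) leaves the simple name ambiguous, while a single-type import takes
precedence over both. The sketch below shows the same rule with two JDK classes
that happen to share a simple name, java.net.Proxy and java.lang.reflect.Proxy.
They were picked only so the example compiles without ZooKeeper on the
classpath; the class name ImportShadowingDemo is made up.

```java
import java.lang.reflect.*; // on-demand import: offers java.lang.reflect.Proxy
import java.net.*;          // on-demand import: offers java.net.Proxy
import java.net.Proxy;      // single-type import: shadows both on-demand candidates

public class ImportShadowingDemo {
    // Without the single-type import above, the simple name "Proxy" would be
    // ambiguous between the two on-demand imports and this file would not compile.
    static Proxy direct() {
        return Proxy.NO_PROXY;
    }

    public static void main(String[] args) {
        System.out.println(direct().type()); // prints DIRECT
    }
}
```

Deleting the `import java.net.Proxy;` line reproduces the same kind of
"reference to ... is ambiguous" compiler error reported in this thread.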

-Jordan

> On Mar 27, 2020, at 11:26 AM, Enrico Olivelli  wrote:
> 
> Let me file an INFRA issue and a patch for jute
> 
> Stay tuned
> 
> Enrico
> 
> On Fri, Mar 27, 2020, 17:03 Patrick Hunt wrote:
> 
>> Confirmed locally with oracle jdk 14 and zk trunk.
>> 
>> Patrick
>> 
>> On Fri, Mar 27, 2020 at 6:19 AM Enrico Olivelli 
>> wrote:
>> 
>>> On Thu, Mar 26, 2020 at 23:45, Patrick Hunt wrote:
 
 Seems the new JEP 359 record feature is added to jdk14 as a preview and
 it's introduced a regression wrt our "Record"
 https://openjdk.java.net/jeps/359
 
 So two things then - we should disambiguate our Record and see why the
 jenkins job is not seeing this... odd.
>>> 
>>> Maybe Jenkins has an early version of JDK14 without records support
>>> 
>>> having a mvn -v on jenkins will help
>>> 
>>> Can anyone try locally ?
>>> You can download the jdk and use it just by unpacking the tar.gz file,
>>> no need to "install" it
>>> 
>>> Enrico
>>> 
 
 Patrick
 
 
 On Thu, Mar 26, 2020 at 3:26 PM Enrico Olivelli 
>>> wrote:
 
> Patrick
> you are right
> it looks like it is using "/home/jenkins/tools/java/latest14"
> 
> this is my maven version info:
> 
> [eolivelli@localhost target]$ mvn -v
> Apache Maven 3.6.3 (cecedd343002696d0abb50b32b541b8a6ba2883f)
> Maven home: /home/eolivelli/dev/maven
> Java version: 14, vendor: AdoptOpenJDK, runtime:
> /home/eolivelli/dev/jdk-14+36
> Default locale: en_US, platform encoding: UTF-8
> OS name: "linux", version: "5.5.10-200.fc31.x86_64", arch: "amd64",
> family: "unix"
> 
> we should add some "mvn -v" to be executed as a pre build step
> 
> Enrico
> 
> On Thu, Mar 26, 2020 at 23:22, Patrick Hunt wrote:
>> 
>> The jenkins job for jdk14 is passing - any ideas why you are seeing
>> different? Is the jenkins job setup incorrectly?
>> 
>> 
> 
>>> 
>> https://builds.apache.org/view/Z/view/ZooKeeper/job/zookeeper-master-maven-jdk14/
>> 
>> Patrick
>> 
>> 
>> On Thu, Mar 26, 2020 at 3:13 PM Enrico Olivelli <
>> eolive...@gmail.com
 
> wrote:
>> 
>>> Hi,
>>> it looks like ZK cannot be built on JDK14 due to a small source
>>> compatibility issue.
>>> The error is below.
>>> 
>>> The fix is trivial: we only have to explicitly import the full
>>> classname of "Record"
>>> 
>>> Enrico
>>> 
>>> both interface org.apache.jute.Record in org.apache.jute and class
>>> java.lang.Record in java.lang match
>>> [ERROR] /home/eolivelli/dev/zookeeper/zookeeper-jute/target/generated-sources/java/org/apache/zookeeper/proto/GetMaxChildrenResponse.java:25:
>>> error: reference to Record is ambiguous
>>> [ERROR] public class GetMaxChildrenResponse implements Record {
>>> 
> 
>>> 
>> 



[jira] [Created] (ZOOKEEPER-3762) Add Client/Server API to return available features

2020-03-18 Thread Jordan Zimmerman (Jira)
Jordan Zimmerman created ZOOKEEPER-3762:
---

 Summary: Add Client/Server API to return available features
 Key: ZOOKEEPER-3762
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3762
 Project: ZooKeeper
  Issue Type: New Feature
  Components: c client, java client, server
Affects Versions: 3.6.0
Reporter: Jordan Zimmerman


Recent versions have introduced several new features/changes. Clients would 
benefit from an API that reports the feature set that a server instance 
supports. Something like (in Java):

{code}
public enum ServerFeatures {
    TTL_NODES,
    PERSISTENT_WATCHERS,
    // ... etc ... full set of features TBD
}

public Collection<ServerFeatures> getServerFeatures() {
    ...
}
{code}
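
A client could then pick its behavior based on what the server reports. The
following is only a sketch of that idea under stated assumptions: the proposed
API does not exist yet, so getServerFeatures() is stubbed here with a fixed
set, and the FeatureCheckSketch class, the chooseWatchStrategy helper and the
strategy strings are hypothetical names used for illustration.

```java
import java.util.Collection;
import java.util.EnumSet;

public class FeatureCheckSketch {
    /** The two example features named in the proposal; the full set is TBD. */
    public enum ServerFeatures { TTL_NODES, PERSISTENT_WATCHERS }

    /** Stub for the proposed call: pretends the server supports persistent watchers only. */
    static Collection<ServerFeatures> getServerFeatures() {
        return EnumSet.of(ServerFeatures.PERSISTENT_WATCHERS);
    }

    /** Hypothetical helper: use the newer feature when reported, otherwise fall back. */
    static String chooseWatchStrategy(Collection<ServerFeatures> features) {
        return features.contains(ServerFeatures.PERSISTENT_WATCHERS)
                ? "persistent-watch"
                : "re-register-one-shot-watch";
    }

    public static void main(String[] args) {
        System.out.println(chooseWatchStrategy(getServerFeatures()));
    }
}
```

An enum-based set keeps the check cheap and lets older clients ignore feature
names they do not know about.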



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Sending 3.4 release line to End-Of-Life status

2020-03-04 Thread Jordan Zimmerman
I'm +1 on this. We're planning to drop support for 3.4.x in the next release of 
Apache Curator, FYI.

-Jordan

> On Mar 4, 2020, at 7:36 AM, Enrico Olivelli  wrote:
> 
> Hi,
> we are releasing 3.6.0 (I am waiting for mirrors to sync before
> updating the website).
> 
> In my opinion it is time to officially send 3.4 branch to EOL status, that is:
> - we are not expecting new releases
> - drop 3.4 from download area (it will stay on archives as usual)
> - strongly encourage people to update to 3.5/3.6
> 
> 3.4 is far away from master branch and even from 3.6.
> There is a clean upgrade path from 3.4.LATEST to 3.5.7 and to 3.6 so
> users are able to upgrade.
> 
> I am not sure we need a VOTE, if we simply agree I can drop 3.4 from
> the "dist" area as soon as I push the new website.
> 
> Best regards
> Enrico



Re: [RESULT] [VOTE] Apache ZooKeeper 3.6.0 candidate 4

2020-03-03 Thread Jordan Zimmerman
Congrats to all.

-Jordan

> On Mar 3, 2020, at 4:45 PM, Enrico Olivelli  wrote:
> 
> I'm happy to announce that we have unanimously approved this release.
> 
> There are 7 approving votes, 5 of which are binding:
> 
> - Szalay-Bekő Máté
> - Norbert Kalmar
> - Enrico Olivelli (binding)
> - Patrick Hunt (binding)
> - Andor Molnar (binding)
> - Flavio Junqueira (binding)
> - Michael Han (binding)
> 
> There are no disapproving votes.
> 
> I will promote the artifacts and complete the release procedure.
> 
> Thanks to everyone who contributed to this great release!
> 
> Enrico Olivelli
> 
> On Tue, Mar 3, 2020 at 22:17, Michael Han wrote:
>> 
>> +1
>> 
>> - verified checksum/sig.
>> - verified release notes.
>> - verified regenerated documentations.
>> - verified both java and c unit tests pass (ubuntu 18 / java11).
>> - verified with a few jetty admin commands and zk cli commands.
>> 
>> On Tue, Mar 3, 2020 at 2:24 AM Flavio Junqueira  wrote:
>> 
>>> +1 (binding)
>>> 
>>> - Built from sources (there are a good number of flaky tests, but it
>>> eventually built correctly)
>>> - Checked LICENSE and NOTICE
>>> - Checked release notes
>>> - Checked that the maven dependencies resolve for the staging artifact
>>> - Ran some local smoke tests
>>> 
>>> -Flavio
>>> 
 On 3 Mar 2020, at 11:01, Andor Molnar  wrote:
 
 +1 (binding)
 
 + verified signatures, checksums
 + successful build on Mac and Centos 7.5 (including C tests)
 + run various smoke tests and latency tests with 3-node cluster
 + verified rolling upgrade from 3.5.7
 
 Thanks Enrico, I think you’re now good to go.
 
 Andor
 
 
 
> On 2020. Mar 1., at 10:03, Enrico Olivelli  wrote:
> 
> +1 (binding)
> verified signatures and checksums
> run a few smoke tests from binaries (standalone mode)
> tested Prometheus.io metrics endpoint
> build from sources, run automatic QA tests (rat, checkstyle,
>>> spotbugs...)
> all on Linux with Java 8 (AdoptOpenJDK)
> 
> We need at least one more PMC to vote please
> 
> Enrico
> 
> On Sun, Mar 1, 2020 at 01:58, Patrick Hunt wrote:
>> 
>> +1. xsum/sig verified. rat ran clean. Compiled and ran some manual
>>> tests
>> with various ensemble sizes successfully.
>> 
>> Regards,
>> 
>> Patrick
>> 
>> On Fri, Feb 28, 2020 at 6:53 AM Enrico Olivelli 
>>> wrote:
>> 
>>> Thank you guys for voting.
>>> 
>>> We need more votes please
>>> 
>>> Enrico
>>> 
>>> On Thu, Feb 27, 2020 at 14:14, Norbert Kalmar wrote:
 
 +1 (non-binding)
 
 - unit tests pass (PurgeTxnTest as well)
 - source tarball: compiled and started ZK + run few commands from
>>> source
 tarball
 - bin tarball: license files checked, started ZK + run few commands
 - signatures OK.
 - compared source tarball with git repository checked out at RC tag
>>> using
 meld. Found no divergence.
 
 Tested on MacOS and Ubuntu 16, using openJDK 1.8.242.
 
 - Norbert
 
 On Thu, Feb 27, 2020 at 11:17 AM Szalay-Bekő Máté <
 szalay.beko.m...@gmail.com> wrote:
 
> +1 (non-binding)
> 
> - I built the code and executed the java/C unit tests using 8u242
> (everything passed, except PurgeTxnTest.testPurgeWhenLogRollingInProgress,
> which seems to never work on my machine. I have seen it be flaky before
> on the Apache Jenkins as well; I created a Jira ticket for fixing it:
> https://issues.apache.org/jira/browse/ZOOKEEPER-3740)
> - Using https://github.com/symat/zk-rolling-upgrade-test
> - I tested rolling upgrade from 3.5.7 to 3.6.0
> - I tested rolling restart on 3.6.0 to enable the multi-address
>>> feature
> with the new quorum protocol version
> - Using https://github.com/symat/zookeeper-docker-test I also
>>> tested
>>> the
> multi-address feature (disabling and re-enabling different virtual
>>> network
> interfaces to see that the cluster always recovers)
> 
> On Tue, Feb 25, 2020 at 4:13 PM Enrico Olivelli <
>>> eolive...@gmail.com>
> wrote:
> 
>> This is the fifth release candidate for 3.6.0.
>> 
>> It is a major release and it introduces a lot of new features, most
>> notably:
>> - Built-in data consistency check inside ZooKeeper
>> - Allow Followers to host Observers
>> - A new feature proposal to ZooKeeper: authentication enforcement
>> - Pluggable metrics system for ZooKeeper (and Prometheus.io
>>> integration)
>> - TLS Port unification
>> - Audit logging in ZooKeeper servers
>> - Improve resilience to network (advertise multiple addresses 

Re: [VOTE] Apache ZooKeeper release 3.5.7 candidate 2

2020-02-10 Thread Jordan Zimmerman
I ran Curator tests and they pass

+1 (non binding)

-Jordan

> On Feb 10, 2020, at 6:52 AM, Norbert Kalmar  wrote:
> 
> This is the third bugfix release candidate for 3.5.7. It fixes 25 issues,
> including third party CVE fixes, potential data loss and potential split
> brain if some rare conditions exist.
> 
> There are 4 additional patches compared to rc0 and rc1:
> - ZOOKEEPER-3453: missing 'SET' in zkCli on windows
> - ZOOKEEPER-3716: upgrade netty 4.1.42 to address CVE-2019-20444 CVE-20…
> - ZOOKEEPER-3718: The tarball generated by assembly is missing some files
> - ZOOKEEPER-3719: Fix C Client compilation issues
> 
> The full release notes are available at:
> 
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310801=12346098
> 
> *** Please download, test and vote by February 13th 2020, 23:59 UTC+0. ***
> 
> Source files:
> https://people.apache.org/~nkalmar/zookeeper-3.5.7-candidate-2/
> 
> Maven staging repo:
> https://repository.apache.org/content/groups/staging/org/apache/zookeeper/zookeeper/3.5.7/
> 
> The release candidate tag in git to be voted upon: release-3.5.7-rc2
> 
> ZooKeeper's KEYS file containing PGP keys we use to sign the release:
> https://www.apache.org/dist/zookeeper/KEYS
> 
> Should we release this candidate?



Re: [VOTE] Apache ZooKeeper release 3.5.7 candidate 1

2020-02-09 Thread Jordan Zimmerman
3.5.7 is not in the staging repo. I'd like to test with Curator. 

https://repository.apache.org/content/groups/staging/org/apache/zookeeper/zookeeper/
 


-Jordan

> On Feb 7, 2020, at 7:29 AM, Norbert Kalmar  wrote:
> 
> This is the second bugfix release candidate for 3.5.7. It fixes 21 issues,
> including third party CVE fixes, potential data loss and potential split
> brain if some rare conditions exist.
> 
> (I have signed rc0 with the wrong key - sorry for that). Everything else is
> unchanged from rc0.
> 
> The full release notes are available at:
> 
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310801=12346098
> 
> *** Please download, test and vote by February 11th 2020, 23:59 UTC+0. ***
> 
> Source files:
> https://people.apache.org/~nkalmar/zookeeper-3.5.7-candidate-1/
> 
> Maven staging repo:
> https://repository.apache.org/content/groups/staging/org/apache/zookeeper/zookeeper/3.5.7/
> 
> The release candidate tag in git to be voted upon: release-3.5.7-rc1
> (points to the same commit as rc0)
> 
> ZooKeeper's KEYS file containing PGP keys we use to sign the release:
> https://www.apache.org/dist/zookeeper/KEYS
> 
> Should we release this candidate?



Re: [VOTE] Apache ZooKeeper release 3.6.0 candidate 2

2020-02-09 Thread Jordan Zimmerman
The CURATOR-549-zk36-updates branch tests pass

+1 (non binding)

-Jordan

> On Feb 5, 2020, at 2:34 PM, Enrico Olivelli  wrote:
> 
> This is the third release candidate for Apache ZooKeeper 3.6.0.
> 
> It is a major release and it introduces a lot of new features, most notably:
> - Built-in data consistency check inside ZooKeeper
> - Allow Followers to host Observers
> - Authentication enforcement
> - Pluggable metrics system for ZooKeeper (and Prometheus.io integration)
> - TLS Port unification
> - Audit logging in ZooKeeper servers
> - Improve resilience to network (advertise multiple addresses for
> members of a Zookeeper cluster)
> - Persistent Recursive Watches
> - add an API and the corresponding CLI to get total count of recursive
> sub nodes under a specific path
> 
> The full release notes are available at:
> 
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310801=12326518
> 
> *** Please download, test and vote by February 8th 2020, 23:59 UTC+0. ***
> 
> Source files:
> https://people.apache.org/~eolivelli/zookeeper-3.6.0-candidate-2/
> 
> Maven staging repo:
> https://repository.apache.org/content/repositories/orgapachezookeeper-1049/
> 
> The staging version of the website is:
> https://people.apache.org/~eolivelli/zookeeper-3.6.0-candidate-2/website/
> 
> The release candidate tag in git to be voted upon: release-3.6.0-2
> https://github.com/apache/zookeeper/tree/release-3.6.0-2
> 
> ZooKeeper's KEYS file containing PGP keys we use to sign the release:
> https://www.apache.org/dist/zookeeper/KEYS
> 
> Please note that we are adding a new jar to the dependency set for
> clients: zookeeper-metrics-providers.
> 
> Should we release this candidate?
> 
> Enrico Olivelli



Re: [VOTE] Apache ZooKeeper release 3.6.0 candidate 1

2020-02-03 Thread Jordan Zimmerman
the new versions of java
>> doesn't
>>>>> like
>>>>>> :(
>>>>>>>>> 
>>>>>>>>> I'm not sure either if it's a showstopper or not. But possibly
>>>> this
>>>>>>> could
>>>>>>>>> come out when using kerberized ZK? Unfortunately kind of hard
>> to
>>>>> test
>>>>>>>>> "live".
>>>>>>>>> 
>>>>>>>>> Regards,
>>>>>>>>> Norbert
>>>>>>>>> 
>>>>>>>>> On Mon, Feb 3, 2020 at 12:38 PM Szalay-Bekő Máté <
>>>>>>>>> szalay.beko.m...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>>> - I compiled and run all the unit tests using Ubuntu 18.04
>>>> (incl.
>>>>>> the
>>>>>>> C
>>>>>>>>>> client), using OpenJDK 1.8.212
>>>>>>>>>> - I also built and unit tested the python client
>>>>>>>>>> - I did some manual tests for the multi-address feature with
>>>>>> multiple
>>>>>>>>>> virtual networks (using
>>>>>>> https://github.com/symat/zookeeper-docker-test)
>>>>>>>>>> 
>>>>>>>>>> everything seemed to be OK, however...
>>>>>>>>>> 
>>>>>>>>>> using OpenJDK 1.8.242 or OpenJDK 11.0.6, I got some kerberos
>>>>> related
>>>>>>>>>> exceptions when running the following tests:
>>>>>>>>>> - QuorumKerberosAuthTest
>>>>>>>>>> - QuorumKerberosHostBasedAuthTest
>>>>>>>>>> - SaslKerberosAuthOverSSLTest
>>>>>>>>>> 
>>>>>>>>>> the error:
>>>>>>>>>> 2020-02-03 12:11:07,197 [myid:localhost:11223] - ERROR
>>>>>>>>>> [main-SendThread(localhost:11223):ZooKeeperSaslClient@336]
>> -
>>> An
>>>>>>> error:
>>>>>>>>>> (java.security.PrivilegedActionException:
>>>>>>>>>> javax.security.sasl.SaslException: GSS initiate failed
>> [Caused
>>>> by
>>>>>>>>>> GSSException: No valid credentials provided (Mechanism
>> level:
>>>> null
>>>>>>>>>> (5001))]) occurred when evaluating Zookeeper Quorum Member's
>>>>>> received
>>>>>>>>> SASL
>>>>>>>>>> token. Zookeeper Client will go to AUTH_FAILED state.
>>>>>>>>>> 
>>>>>>>>>> I tried it with Zulu 11.0.3 version and OpenJDK 11.0.2
>> version
>>>> and
>>>>>>> both
>>>>>>>>>> were working fine. So it looks there might some
>>> incompatibility
>>>>> with
>>>>>>> the
>>>>>>>>>> more recent JDK releases. (between 1.8.212 - 1.8.242, and
>> also
>>>>>> between
>>>>>>>>>> 11.0.3 and 11.0.6)
>>>>>>>>>> 
>>>>>>>>>> I also tested on OpenJDK 13.ea.30 and that worked.
>>>>>>>>>> 
>>>>>>>>>> I am not sure if it is a -1 or not... clearly these are some
>>>> test
>>>>>> and
>>>>>>>>> JDK
>>>>>>>>>> related issues. Also it can be only some strange thing with
>> my
>>>>>>>>> environment.
>>>>>>>>>> Can someone try to reproduce my problem?
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> Cheers,
>>>>>>>>>> Mate
>>>>>>>>>> 
>>>>>>>>>> On Mon, Feb 3, 2020 at 4:31 AM Jordan Zimmerman <
>>>>>>>>>> jor...@jordanzimmerman.com>
>>>>>>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>>> No big issues with Curator that I could find
>>>>>>>>>>> 
>>>>>>>>>>> +1 (non binding)
>>>>>>>>>>> 
>>>>>>>>>>> -Jordan
>>>>>>>>>>> 
>>>>>>>>>>>> On Feb 1, 2020, at 10:02 AM, Enrico Olivelli <eolive...@gmail.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> This is the second release candidate for Apache ZooKeeper 3.6.0.
>>>>>>>>>>>> 
>>>>>>>>>>>> It is a major release and it introduces a lot of new features, most notably:
>>>>>>>>>>>> - Built-in data consistency check inside ZooKeeper
>>>>>>>>>>>> - Allow Followers to host Observers
>>>>>>>>>>>> - Authentication enforcement
>>>>>>>>>>>> - Pluggable metrics system for ZooKeeper (and Prometheus.io integration)
>>>>>>>>>>>> - TLS Port unification
>>>>>>>>>>>> - Audit logging in ZooKeeper servers
>>>>>>>>>>>> - Improve resilience to network (advertise multiple addresses for
>>>>>>>>>>>> members of a Zookeeper cluster)
>>>>>>>>>>>> - Persistent Recursive Watches
>>>>>>>>>>>> - add an API and the corresponding CLI to get total count of recursive
>>>>>>>>>>>> sub nodes under a specific path
>>>>>>>>>>>> 
>>>>>>>>>>>> The full release notes is available at:
>>>>>>>>>>>> 
>>>>>>>>>>>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310801=12326518
>>>>>>>>>>>> 
>>>>>>>>>>>> *** Please download, test and vote by February 4th 2020, 23:59 UTC+0. ***
>>>>>>>>>>>> 
>>>>>>>>>>>> Source files:
>>>>>>>>>>>> https://people.apache.org/~eolivelli/zookeeper-3.6.0-candidate-1/
>>>>>>>>>>>> 
>>>>>>>>>>>> Maven staging repo:
>>>>>>>>>>>> https://repository.apache.org/content/repositories/orgapachezookeeper-1047/
>>>>>>>>>>>> 
>>>>>>>>>>>> The staging version of the website is:
>>>>>>>>>>>> https://people.apache.org/~eolivelli/zookeeper-3.6.0-candidate-1/website/
>>>>>>>>>>>> 
>>>>>>>>>>>> The release candidate tag in git to be voted upon: release-3.6.0-1
>>>>>>>>>>>> https://github.com/apache/zookeeper/tree/release-3.6.0-1
>>>>>>>>>>>> 
>>>>>>>>>>>> ZooKeeper's KEYS file containing PGP keys we use to sign the release:
>>>>>>>>>>>> https://www.apache.org/dist/zookeeper/KEYS
>>>>>>>>>>>> 
>>>>>>>>>>>> Please note that we are adding a new jar to the dependency set for
>>>>>>>>>>>> clients: zookeeper-metrics-providers.
>>>>>>>>>>>> 
>>>>>>>>>>>> Should we release this candidate?
>>>>>>>>>>>> 
>>>>>>>>>>>> Enrico Olivelli



Re: [VOTE] Apache ZooKeeper release 3.6.0 candidate 1

2020-02-02 Thread Jordan Zimmerman
No big issues with Curator that I could find

+1 (non binding)

-Jordan

> On Feb 1, 2020, at 10:02 AM, Enrico Olivelli  wrote:
> 
> This is the second release candidate for Apache ZooKeeper 3.6.0.
> 
> It is a major release and it introduces a lot of new features, most notably:
> - Built-in data consistency check inside ZooKeeper
> - Allow Followers to host Observers
> - Authentication enforcement
> - Pluggable metrics system for ZooKeeper (and Prometheus.io integration)
> - TLS Port unification
> - Audit logging in ZooKeeper servers
> - Improve resilience to network (advertise multiple addresses for
> members of a Zookeeper cluster)
> - Persistent Recursive Watches
> - add an API and the corresponding CLI to get total count of recursive
> sub nodes under a specific path
> 
> The full release notes is available at:
> 
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310801=12326518
> 
> *** Please download, test and vote by February 4th 2020, 23:59 UTC+0. ***
> 
> Source files:
> https://people.apache.org/~eolivelli/zookeeper-3.6.0-candidate-1/
> 
> Maven staging repo:
> https://repository.apache.org/content/repositories/orgapachezookeeper-1047/
> 
> The staging version of the website is:
> https://people.apache.org/~eolivelli/zookeeper-3.6.0-candidate-1/website/
> 
> The release candidate tag in git to be voted upon: release-3.6.0-1
> https://github.com/apache/zookeeper/tree/release-3.6.0-1
> 
> ZooKeeper's KEYS file containing PGP keys we use to sign the release:
> https://www.apache.org/dist/zookeeper/KEYS
> 
> Please note that we are adding a new jar to the dependency set for
> clients: zookeeper-metrics-providers.
> 
> Should we release this candidate?
> 
> Enrico Olivelli



Re: [VOTE] Apache ZooKeeper release 3.6.0 candidate 1

2020-02-01 Thread Jordan Zimmerman
Maybe we can get 3.6.1 quickly after then? 

> On Feb 1, 2020, at 10:02 AM, Enrico Olivelli  wrote:
> 
> This is the second release candidate for Apache ZooKeeper 3.6.0.
> 
> It is a major release and it introduces a lot of new features, most notably:
> - Built-in data consistency check inside ZooKeeper
> - Allow Followers to host Observers
> - Authentication enforcement
> - Pluggable metrics system for ZooKeeper (and Prometheus.io integration)
> - TLS Port unification
> - Audit logging in ZooKeeper servers
> - Improve resilience to network (advertise multiple addresses for
> members of a Zookeeper cluster)
> - Persistent Recursive Watches
> - add an API and the corresponding CLI to get total count of recursive
> sub nodes under a specific path
> 
> The full release notes is available at:
> 
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310801=12326518
> 
> *** Please download, test and vote by February 4th 2020, 23:59 UTC+0. ***
> 
> Source files:
> https://people.apache.org/~eolivelli/zookeeper-3.6.0-candidate-1/
> 
> Maven staging repo:
> https://repository.apache.org/content/repositories/orgapachezookeeper-1047/
> 
> The staging version of the website is:
> https://people.apache.org/~eolivelli/zookeeper-3.6.0-candidate-1/website/
> 
> The release candidate tag in git to be voted upon: release-3.6.0-1
> https://github.com/apache/zookeeper/tree/release-3.6.0-1
> 
> ZooKeeper's KEYS file containing PGP keys we use to sign the release:
> https://www.apache.org/dist/zookeeper/KEYS
> 
> Please note that we are adding a new jar to the dependency set for
> clients: zookeeper-metrics-providers.
> 
> Should we release this candidate?
> 
> Enrico Olivelli



Re: [VOTE] Apache ZooKeeper release 3.6.0 candidate 1

2020-02-01 Thread Jordan Zimmerman
Any chance of getting https://github.com/apache/zookeeper/pull/1229 into 3.6.0?

-JZ

> On Feb 1, 2020, at 10:02 AM, Enrico Olivelli  wrote:
> 
> This is the second release candidate for Apache ZooKeeper 3.6.0.
> 
> It is a major release and it introduces a lot of new features, most notably:
> - Built-in data consistency check inside ZooKeeper
> - Allow Followers to host Observers
> - Authentication enforcement
> - Pluggable metrics system for ZooKeeper (and Prometheus.io integration)
> - TLS Port unification
> - Audit logging in ZooKeeper servers
> - Improve resilience to network (advertise multiple addresses for
> members of a Zookeeper cluster)
> - Persistent Recursive Watches
> - add an API and the corresponding CLI to get total count of recursive
> sub nodes under a specific path
> 
> The full release notes is available at:
> 
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310801=12326518
> 
> *** Please download, test and vote by February 4th 2020, 23:59 UTC+0. ***
> 
> Source files:
> https://people.apache.org/~eolivelli/zookeeper-3.6.0-candidate-1/
> 
> Maven staging repo:
> https://repository.apache.org/content/repositories/orgapachezookeeper-1047/
> 
> The staging version of the website is:
> https://people.apache.org/~eolivelli/zookeeper-3.6.0-candidate-1/website/
> 
> The release candidate tag in git to be voted upon: release-3.6.0-1
> https://github.com/apache/zookeeper/tree/release-3.6.0-1
> 
> ZooKeeper's KEYS file containing PGP keys we use to sign the release:
> https://www.apache.org/dist/zookeeper/KEYS
> 
> Please note that we are adding a new jar to the dependency set for
> clients: zookeeper-metrics-providers.
> 
> Should we release this candidate?
> 
> Enrico Olivelli



Re: Looking for go binding

2020-01-31 Thread Jordan Zimmerman
No design discussions there. 


Jordan Zimmerman

> On Jan 31, 2020, at 1:16 PM, Giorgio Zoppi  wrote:
> 
> Hello smart people,
> we are looking for binding in go, possibly with grpc communication.
> Is there any working/high quality work around with zookeeper and golang?
> Best Regards,
> Giorgio


Re: Looking for go binding

2020-01-31 Thread Jordan Zimmerman
Please join Apache Slack if you can - we have ongoing discussions in the 
#zookeeper channel. 

the-asf.slack.com

> On Jan 31, 2020, at 1:57 PM, Giorgio Zoppi  wrote:
> 
> Ok,
> Thanks, i will give it a try and i come to you if i have issues.
> Best Regards,
> Giorgio.
> 
> On Fri, Jan 31, 2020 at 19:55, Jordan Zimmerman (<
> jor...@jordanzimmerman.com>) wrote:
> 
>> They're all my branches right now. See the doc I linked to - it has links
>> to all the branches with detailed explanations. If you just want the
>> latest/latest, the wip-jute-rpc-module is the branch you want (it's based
>> off the others).
>> 
>> -JZ
>> 
>>> On Jan 31, 2020, at 1:53 PM, Giorgio Zoppi 
>> wrote:
>>> 
>>> Ok Jordan,
>>> looks like that you need something like
>>> https://github.com/codesenberg/bombardier but customized for zookeeper,
>> but
>>> for helping on that i need modified code. Is the code already in the
>>> current branch,?  Or is it on one fo your branches?
>>> Best Regards,
>>> Giorgio
>>> 
>>> 
>>> 
>>> On Fri, Jan 31, 2020 at 19:44, Jordan Zimmerman (<
>>> jor...@jordanzimmerman.com>) wrote:
>>> 
>>>> There's very little change to ZooKeeper itself. All of the ZK specific
>>>> changes are currently in the wip-jute-rpc-encapsulate module and all
>>>> involve removing the direct use of ByteBuffers. ZooKeeper already has an
>>>> abstraction for alternate server handlers (ServerCnxnFactory and
>>>> ServerCnxn) - the wip-jute-rpc-module uses that.
>>>> 
>>>> I mostly need help with testing and validation. What does it look like
>> to
>>>> write a client with this? I started on a Java client for test purposes.
>> How
>>>> does this perform compared to standard ZK? Etc. What should be done
>>>> differently etc?
>>>> 
>>>> -JZ
>>>> 
>>>>> On Jan 31, 2020, at 1:40 PM, Giorgio Zoppi 
>>>> wrote:
>>>>> 
>>>>> Hi Jordan,
>>>>> this looks at first feeling a bit risky at later stages since zookeeper
>>>> is
>>>>> tightly coupled in many parts.   So you have the github branch name of
>>>> the
>>>>> first task proposal?
>>>>> i am in #backtobasics period, so i might help.
>>>>> Best Regards,
>>>>> Giorgio
>>>> 
>>>> 
>>> 
>>> --
>>> Life is a chess game - Anonymous.
>> 
>> 
> 
> -- 
> Life is a chess game - Anonymous.



Re: Looking for go binding

2020-01-31 Thread Jordan Zimmerman
They're all my branches right now. See the doc I linked to - it has links to 
all the branches with detailed explanations. If you just want the 
latest/latest, the wip-jute-rpc-module is the branch you want (it's based off 
the others).

-JZ

> On Jan 31, 2020, at 1:53 PM, Giorgio Zoppi  wrote:
> 
> Ok Jordan,
> looks like that you need something like
> https://github.com/codesenberg/bombardier but customized for zookeeper, but
> for helping on that i need modified code. Is the code already in the
> current branch,?  Or is it on one fo your branches?
> Best Regards,
> Giorgio
> 
> 
> 
> On Fri, Jan 31, 2020 at 19:44, Jordan Zimmerman (<
> jor...@jordanzimmerman.com>) wrote:
> 
>> There's very little change to ZooKeeper itself. All of the ZK specific
>> changes are currently in the wip-jute-rpc-encapsulate module and all
>> involve removing the direct use of ByteBuffers. ZooKeeper already has an
>> abstraction for alternate server handlers (ServerCnxnFactory and
>> ServerCnxn) - the wip-jute-rpc-module uses that.
>> 
>> I mostly need help with testing and validation. What does it look like to
>> write a client with this? I started on a Java client for test purposes. How
>> does this perform compared to standard ZK? Etc. What should be done
>> differently etc?
>> 
>> -JZ
>> 
>>> On Jan 31, 2020, at 1:40 PM, Giorgio Zoppi 
>> wrote:
>>> 
>>> Hi Jordan,
>>> this looks at first feeling a bit risky at later stages since zookeeper
>> is
>>> tightly coupled in many parts.   So you have the github branch name of
>> the
>>> first task proposal?
>>> i am in #backtobasics period, so i might help.
>>> Best Regards,
>>> Giorgio
>> 
>> 
> 
> -- 
> Life is a chess game - Anonymous.



Re: Looking for go binding

2020-01-31 Thread Jordan Zimmerman
There's very little change to ZooKeeper itself. All of the ZK specific changes 
are currently in the wip-jute-rpc-encapsulate module and all involve removing 
the direct use of ByteBuffers. ZooKeeper already has an abstraction for 
alternate server handlers (ServerCnxnFactory and ServerCnxn) - the 
wip-jute-rpc-module uses that.

I mostly need help with testing and validation. What does it look like to write 
a client with this? I started on a Java client for test purposes. How does this 
perform compared to standard ZK? Etc. What should be done differently etc?

-JZ 

> On Jan 31, 2020, at 1:40 PM, Giorgio Zoppi  wrote:
> 
> Hi Jordan,
> this looks at first feeling a bit risky at later stages since zookeeper is
> tightly coupled in many parts.   So you have the github branch name of the
> first task proposal?
> i am in #backtobasics period, so i might help.
> Best Regards,
> Giorgio



Re: Looking for go binding

2020-01-31 Thread Jordan Zimmerman
I've been working on gRPC for ZooKeeper. Here is the proposal with links to 
the current branches I'm using. I really could use help on this so any/all 
interested please get involved.

https://docs.google.com/document/d/1wP61nCDeLLNPdXGxHxi_MIbt7c2pCl0ttm9KMjyOisQ/edit#heading=h.200w7srpqmog
 


-Jordan

> On Jan 31, 2020, at 1:15 PM, Giorgio Zoppi  wrote:
> 
> Hello smart people,
> we are looking for binding in go, possibly with grpc communication.
> Is there any working/high quality work around with zookeeper and golang?
> Best Regards,
> Giorgio



Re: Great Meetup

2020-01-30 Thread Jordan Zimmerman
Very good, thank you - also thanks for working to get the remote channel up.

-Jordan

> On Jan 30, 2020, at 6:44 PM, David Mollitor  wrote:
> 
> Just wanted to send a note to express my gratitude for setting up the ZK
> meetup @FB.  I had a great time and enjoyed meeting you all.
> 
> Belugabehr (David)



Re: [ANNOUNCE] Enrico Olivelli new ZooKeeper PMC Member

2020-01-21 Thread Jordan Zimmerman
Well deserved. Congratulations. 


Jordan Zimmerman

> On Jan 21, 2020, at 4:40 PM, Flavio Junqueira  wrote:
> 
> I'm pleased to announce that Enrico Olivelli recently became the newest 
> ZooKeeper PMC member. Enrico has contributed immensely to this community; he 
> became a ZooKeeper committer in May 2019 and now he joins the PMC.
> 
> Join me in congratulating him on the achievement. Congrats, Enrico!
> 
> -Flavio on behalf of the Apache ZooKeeper PMC


[jira] [Created] (ZOOKEEPER-3703) Publish a Test-Jar from ZooKeeper Server

2020-01-21 Thread Jordan Zimmerman (Jira)
Jordan Zimmerman created ZOOKEEPER-3703:
---

 Summary: Publish a Test-Jar from ZooKeeper Server
 Key: ZOOKEEPER-3703
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3703
 Project: ZooKeeper
  Issue Type: Improvement
  Components: tests
Affects Versions: 3.5.6
Reporter: Jordan Zimmerman
Assignee: Jordan Zimmerman
 Fix For: 3.6.0


It would be very helpful to Apache Curator and others if ZooKeeper published 
its testing code as a Maven Test JAR. Curator, for example, could use it to 
improve its testing server to make it easier to inject error conditions without 
having to have forced time delays and other hacks.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: FYI - gRPC project for ZooKeeper

2020-01-17 Thread Jordan Zimmerman
Thanks Enrico. 


Jordan Zimmerman

> On Jan 17, 2020, at 3:58 AM, Enrico Olivelli  wrote:
> 
> Jordan,
> I have been following this work and I appreciate that very much.
> 
> Your doc draws a good picture of the status of our codebase.
> 
> Personally I see much value in opening Zookeeper to non Java native clients.
> 
> Reworking the internals (zkdatabase, server-to-server) as you state in your
> docs, is very dangerous and I am not sure it is worth to do in the
> short/mid term.
> 
> The very trade-off we should accept will come when we decide how much
> efficiently non-jute client requests are to be processed.
> My mind is mostly over problems like zero-copy memory handling, saving
> resources on  decode/encode.
> 
> My other concern is about the concurrency model on clients. Zookeeper
> client API/contract relies heavily on a strict ordering of event delivery
> to the application. I feel we can implement this correctly but it won't be
> easy.
> 
> To summarize I totally sponsor this work, your plan is reasonable, but I am
> not sure how much deep we can go inside the core of zk server.
> 
> Starting with a gRPC endpoint is a good starting point
> 
> Thank you for this hard work
> 
> 
> Enrico
> 
> On Fri, Jan 17, 2020, 02:21 Jordan Zimmerman  wrote:
> 
>> Hello folks,
>> 
>> I've been working on gRPC support for ZooKeeper. Please see the doc here
>> for latest details, links to branches, etc.:
>> 
>> https://docs.google.com/document/d/1wP61nCDeLLNPdXGxHxi_MIbt7c2pCl0ttm9KMjyOisQ/edit#
>> 
>> Also see:
>> 
>> https://issues.apache.org/jira/browse/ZOOKEEPER-102?focusedCommentId=17017618=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17017618
>> 
>> I'd appreciate some help, comments, etc. In particular, I need a ZooKeeper
>> committer to champion this.
>> 
>> -Jordan
>> 


FYI - gRPC project for ZooKeeper

2020-01-16 Thread Jordan Zimmerman
Hello folks,

I've been working on gRPC support for ZooKeeper. Please see the doc here
for latest details, links to branches, etc.:
https://docs.google.com/document/d/1wP61nCDeLLNPdXGxHxi_MIbt7c2pCl0ttm9KMjyOisQ/edit#

Also see:
https://issues.apache.org/jira/browse/ZOOKEEPER-102?focusedCommentId=17017618=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17017618

I'd appreciate some help, comments, etc. In particular, I need a ZooKeeper
committer to champion this.

-Jordan


Re: [VOTE] Apache ZooKeeper release 3.6.0 candidate 0

2020-01-16 Thread Jordan Zimmerman
Thanks for managing this Enrico! Any chance of deploying a 
3.6.0-CANDIDATE-0-SNAPSHOT (or some other similar name) so I can run Curator 
tests? It’s a real pain to manually insert the JAR via Maven. 


Jordan Zimmerman

> On Jan 15, 2020, at 1:05 PM, Enrico Olivelli  wrote:
> 
> Alexander,
> I have pasted a wrong link in the VOTE email, I am sorry
> 
> The good link is
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310801=12326518
> 
> You can also see the staged released notes in the website
> https://people.apache.org/~eolivelli/zookeeper-3.6.0-candidate-0/website/releasenotes.html
> 
> Thank you so much for reporting this issue
> 
> Happy testing
> 
> Enrico
> 
>> On Wed, Jan 15, 2020 at 16:55 Alexander Shraer 
>> wrote:
>> 
>> Hi Enrico,
>> 
>> Thank you for driving this release!
>> 
>> I have a question - i believe that Zookeeper-2024 (an order of magnitude
>> throughput improvement for mixed workloads) is part of the 3.6.0 release,
>> but it isn't mentioned in the release notes or the summary.
>> Could you please clarify ?
>> 
>> Thanks,
>> Alex
>> 
>> 
>>> On Wed, Jan 15, 2020 at 7:29 AM Flavio Junqueira  wrote:
>>> 
>>> I can't parse Rudy's message, is it an issue with my mail application?
>>> 
>>> -Flavio
>>> 
>>>> On 15 Jan 2020, at 15:00, rudy_steiner  wrote:
>>>> 
>>>> environment:* MacOS High Sierra 10.13.1* JDK
>>> 1.8.0_172I try to run junit test on branch-3.6, and unit test
>>> thread get stuck, log as follows:.INFO] Running
>>> org.apache.zookeeper.common.X509UtilTest[INFO] Tests run: 3,
>> Failures:
>>> 0, Errors: 0, Skipped: 0, Time elapsed: 27.797 s - in
>>> org.apache.zookeeper.server.SnapshotDigestTest[INFO] Running
>>> org.apache.zookeeper.common.TimeTest[INFO] Tests run: 1, Failures:
>> 0,
>>> Errors: 0, Skipped: 0, Time elapsed: 0.718 s - in
>>> org.apache.zookeeper.common.TimeTest[INFO] Tests run: 352, Failures:
>>> 0, Errors: 0, Skipped: 0, Time elapsed: 7.425 s - in
>>> org.apache.zookeeper.common.X509UtilTest[INFO] Running
>>> org.apache.zookeeper.common.PEMFileLoaderTest[INFO] Running
>>> org.apache.zookeeper.common.KeyStoreFileTypeTest[INFO] Tests run: 9,
>>> Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.144 s - in
>>> org.apache.zookeeper.common.KeyStoreFileTypeTest[INFO] Running
>>> org.apache.zookeeper.audit.AuditEventTest[INFO] Tests run: 2,
>>> Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.084 s - in
>>> org.apache.zookeeper.audit.AuditEventTest[INFO] Running
>>> org.apache.zookeeper.audit.StandaloneServerAuditTest[INFO] Tests
>> run:
>>> 72, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.027 s - in
>>> org.apache.zookeeper.common.PEMFileLoaderTest[INFO] Tests run: 5,
>>> Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 16.197 s - in
>>> org.apache.zookeeper.common.FileChangeWatcherTest[INFO] Tests run:
>> 1,
>>> Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.755 s - in
>>> org.apache.zookeeper.audit.StandaloneServerAuditTest[INFO] Running
>>> org.apache.zookeeper.audit.Log4jAuditLoggerTest[INFO] Running
>>> org.apache.zookeeper.ZKUtilTest[ERROR] Tests run: 4, Failures: 1,
>>> Errors: 0, Skipped: 0, Time elapsed: 0.194 s  FAILURE! - in
>>> org.apache.zookeeper.ZKUtilTest[ERROR]
>>> testUnreadableFileInput(org.apache.zookeeper.ZKUtilTest)  Time elapsed:
>>> 0.014 s   FAILURE!java.lang.AssertionError  at
>>> 
>> org.apache.zookeeper.ZKUtilTest.testUnreadableFileInput(ZKUtilTest.java:83)[INFO]
>>> Running org.apache.zookeeper.PortAssignmentTest[INFO] Tests run: 13,
>>> Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.157 s - in
>>> org.apache.zookeeper.PortAssignmentTest[INFO] Running
>>> org.apache.zookeeper.VerGenTest[INFO] Tests run: 6, Failures: 0,
>>> Errors: 0, Skipped: 0, Time elapsed: 1.747 s - in
>>> org.apache.zookeeper.audit.Log4jAuditLoggerTest[INFO] Tests run: 14,
>>> Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.327 s - in
>>> org.apache.zookeeper.VerGenTest[INFO] Running
>>> org.apache.zookeeper.ZooKeeperTest[INFO] Running
>>> org.apache.zookeeper.GetAllChildrenNumberTest[INFO] Running
>>> org.apache.zookeeper.RemoveWatchesCmdTest[INFO] Tests run: 2,
>>> Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.511 s - in
>>> org.apache.zookeeper.GetAllChildrenNumberTest[INFO] Running

Re: ZooKeeper Meetup @ Facebook - Jan 29th

2020-01-10 Thread Jordan Zimmerman
If there’s video conferencing available I could do a remote presentation on the 
ideas I have for gRPC support. 


Jordan Zimmerman

> On Jan 10, 2020, at 3:01 PM, Mohamed Jeelani  wrote:
> 
> Lets Meetup!  (Please Register at https://zkmeetup2020.splashthat.com/)
> 
> Your ZooKeeper friends @ Facebook would like to invite you to our office to 
> share and learn whats new with ZooKeeper. We will not only share what we at 
> Facebook have been up to, but we have exciting talks from speakers from the 
> ZooKeeper community lined up, including from Cloudera, Twitter, Salesforce 
> and SJSU who are eager to share what they've been working on
> 
> ...and of course we've got some cool swag for you as well.
> 
> January 29th 2020 4.30pm - 8pm
> 
> Facebook Campus, MPK 28
> Talks: 5pm - 7pm
> Networking & Happy Hour: 7pm - 8pm
> 
> This event will be live streamed via Facebook Live at 
> https://www.facebook.com/zkmeetup
> 
> Please Register at https://zkmeetup2020.splashthat.com/
> 


Re: ZK makes apache 2019 "top 5" projects

2019-12-12 Thread Jordan Zimmerman
Fantastic 


Jordan Zimmerman

> On Dec 12, 2019, at 3:49 AM, Flavio Junqueira  wrote:
> 
> +1, thank you all for the hard work.
> 
> -Flavio
> 
>> On 12 Dec 2019, at 08:36, Enrico Olivelli  wrote:
>> 
>> Yes, great.
>> 
>> Please also note that Kafka and Lucene/Solr that are still listed in that
>> list  are using Zookeeper :)
>> 
>> 
>> Enrico
>> 
>> On Thu, Dec 12, 2019, 05:46 tison  wrote:
>> 
>>> Kudos!
>>> 
>>> Best,
>>> tison.
>>> 
>>> 
>>> On Thu, Dec 12, 2019 at 11:32 AM, Patrick Hunt  wrote:
>>> 
>>>> This is really awesome, check it out:
>>>> https://twitter.com/phunt/status/1204966326118141952
>>>> 
>>>> Kudos ZooKeeper community on all the hard work and efforts!
>>>> 
>>>> Patrick
>>>> 
>>> 
> 


Re: Any interest in a gRPC version of ZooKeeper

2019-11-27 Thread Jordan Zimmerman
FYI

We have an open discussion regarding replacing Jute, using gRPC and related 
things in this sub channel on the ASF Slack board. All are welcome to join in:

https://the-asf.slack.com/archives/CQKS7A3FT

-Jordan

> On Nov 18, 2019, at 9:25 AM, Jordan Zimmerman  
> wrote:
> 
> Hi Folks,
> 
> I've written a proof of concept implementation of a ServerCnxnFactory that 
> implements gRPC. The goal is to make it possible to easily write ZooKeeper 
> clients in non-JVM languages. Using the proof of concept I was able to write 
> a Golang client easily. What's the interest level of something like this? 
> Let's discuss if it's worth pursuing. I'd be willing to move this from proof 
> of concept to production but I'll need help (1 or 2 co-developers).
> 
> If you want to try it, I've pushed the Golang client and some instructions 
> here (let me know if you have any issues - I'm a go neophyte). Note: 
> "zookeeper/test.go" is the interesting file:
> 
>   https://github.com/Randgalt/zkgrpc
> 
> Here's the proof of concept on the ZK server side (the interesting files are 
> RpcServerCnxn.java, RpcServerCnxnFactory.java, RpcZooKeeperServer.java and 
> zookeeper.proto):
> 
>   https://github.com/apache/zookeeper/compare/master...Randgalt:wip-grpc
> 
> Issues:
> - Writing a client, even with gRPC, will require some work. Sessions have to be 
>   maintained, watchers have to be maintained, etc.
> - Currently, Jute is deeply embedded in ZooKeeper. The proof of concept has to 
>   emulate Jute byte buffers. Ideally, this will be abstracted so that only 
>   records could be used so that the gRPC connection doesn't have to keep 
>   marshalling/unmarshalling byte buffers.
> - I don't know enough about the gRPC client/server implementations to know if 
>   it will meet the needs of ZooKeeper. Anyone have experience here?
> - I haven't completely thought through how much work it will take to write 
>   useful clients. As I've shown with the proof of concept, simple ZK CRUD db 
>   operations work well. I need to spend time writing a recipe such as Leader 
>   Election to see how much work is required.
> - I'm not sure how things like SASL and reconfig would work with gRPC.
> 
> -Jordan



Re: Releasing 3.6.0 - ALPHA or not ?

2019-11-15 Thread Jordan Zimmerman
The 3.5.x-ALPHA scheme was extremely confusing for ZK's user base (doubly so 
given how long it remained in alpha then beta). Many companies used it anyway 
so adding the qualifier didn't serve much purpose. Better to leave it off and 
communicate known issues through standard channels.

My 0.02

-Jordan

> On Nov 15, 2019, at 3:52 PM, Fangmin Lv  wrote:
> 
> I like the idea of keeping the naming simple and getting rid of 3.6.x. And
> it seems reasonable
> to me to keep beta for a while before we make it a 'stable' version even
> though for the features
> or fixes contributed from different individuals/orgs may have run on their
> prod for a while.
> 
> I would suggest to cut 3.7.0 for the next branch, and only bump major
> version if there is a real
> big and non-compatible change we'd like to release.
> 
> Best,
> Fangmin
> 
> 
> On Wed, Oct 2, 2019 at 9:00 AM Zili Chen  wrote:
> 
>> Might be worth coming up with a proposal
>> (ie review all the existing 4.x jira and other wish list and put a
>> "proposal" wiki page together for 4.0?)
>> 
>> An option is start with a couple of proposals pages under our
>> wiki page[1] whether or not we process the major bump it
>> helps memorize ideas and consensus of our community
>> discussions and can be fast referred to.
>> 
>> Best,
>> tison.
>> 
>> [1] https://cwiki.apache.org/confluence/display/HADOOP2/ZooKeeper
>> 
>> 
>> On Wed, Oct 2, 2019 at 11:40 PM, Patrick Hunt  wrote:
>> 
>>> On Wed, Oct 2, 2019 at 2:22 AM Enrico Olivelli 
>>> wrote:
>>> 
>>>> If we release a 3.6.0-beta, shall the master point to 3.6.x ? or will we
>>>> bump the version to 4.0.0 or 3.7.0 ?
>>>> are we creating a branch-3.6, will it be open for new features/refactors ?
>>>> 
>>>> 
>>> Major version change means "not backward compatible". We've been at
>> Apache
>>> for > 10 years and never had to do this. Is it justified? ie what changes
>>> would we make. I can think of a few; update the API to address some long
>>> standing painpoints - ie version numbers are 32 bit ->64, fix the "epoch
>>> overlaps zxid" which is a major PITA IMO, no checksum in the messages,
>>> replace jute with protobuf, etc...
>>> 
>>> That's going to break alot of downstreams. As such it would make sense to
>>> have 3.7... while 4.0 was in play? Or keep b/w compat and address the
>>> things I mentioned above (doable but more costly and time consuming, less
>>> clean, etc...)
>>> 
>>> Depends what we want to accomplish. Given the uptick in community
>> activity
>>> it might be a great time to try. Might be worth coming up with a proposal
>>> (ie review all the existing 4.x jira and other wish list and put a
>>> "proposal" wiki page together for 4.0?)
>>> 
>>> 
>>>> Ideally once we cut a major release we move all the development and all of
>>>> the new features to master branch = next major release.
>>>> 
>>>> In BookKeeper we have a concept of "latest stable" and "last released":
>>>> - master branch -> not ready for production, not released yet
>>>> - last released (3.6.0 in our case) -> latest and greatest, no blocker
>>>> issues, it can be used in production, maybe not yet widespread, no more API
>>>> changes, allow minor improvements backported from master branch
>>>> - latest stable (3.5.6 in our case) -> last point release of latest release
>>>> branch, the branch has been around for some time and it is proven to be
>>>> stable in production, only critical fixes accepted
>>>> 
>>>> So I am leaning toward a 3.6.0 release, it is simpler for users (every
>>>> role) to understand.
>>>> People know that as soon as a major release is cut some issue may be
>>>> encountered, this is why many companies wait to move to next major version
>>>> only after one or two point releases are available.
>>>> 
>>>> btw I can live with a 3.6.0-beta, but with some constraint on a release
>>>> within a couple of months, ZooKeeper community is more and more active, it
>>>> is becoming simpler to commit patches and cut releases.
>>>> 
>>>> 
>>> Yes. Make it short whatever you do. But providing an "onramp" sounds
>> like a
>>> reasonable approach to me.
>>> 
>>> Regards,
>>> 
>>> Patrick
>>> 
>>> 
>>>> I will also be happy to drive this release as RM, whatever path we decide
>>>> as a community
>>>> 
>>>> Enrico
>>>> 
>>>> On Wed, Oct 2, 2019 at 11:04 Norbert Kalmar wrote:
>>>> 
>>>>> So if I understand this, "3.6.0-beta" (let's cut the 1 here as maybe no
>>>>> need for a second beta?) and after a fixed time (say about 3 months)
>>>>> "3.6.0-beta2" OR "3.6.0" if it seems fit (vote on it again).
>>>>> This sounds good to me, +1 (non-binding).
>>>>> 
>>>>> Regards,
>>>>> Norbert
>>>>> 
>>>>> On Wed, Oct 2, 2019 at 10:54 AM Andor Molnar wrote:
>>>>> 
>>>>>> Hi,
>>>>>> 
>>>>>> I second Pat’s suggestion about release in beta for a fixed period and
>>>>>> after that follow Norbert’s versioning scheme: 3.6.0-beta1, 3.6.0-beta2, …
>>>>>> , 3.6.0
>>>>>> 

[jira] [Created] (ZOOKEEPER-3605) ZOOKEEPER-3242 add a connection throttle. Default constructor needs to set it

2019-11-02 Thread Jordan Zimmerman (Jira)
Jordan Zimmerman created ZOOKEEPER-3605:
---

 Summary: ZOOKEEPER-3242 add a connection throttle. Default 
constructor needs to set it
 Key: ZOOKEEPER-3605
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3605
 Project: ZooKeeper
  Issue Type: Bug
  Components: server
Affects Versions: 3.6.0
Reporter: Jordan Zimmerman


ZOOKEEPER-3242 added a connection throttle. It is set in the main constructor 
but not in the alternate constructor, which breaks Apache Curator's testing 
framework. It should also be set in the alternate constructor to avoid an NPE.
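
A minimal sketch of the usual remedy for this class of bug, assuming nothing about the actual ZooKeeper code: the alternate constructor delegates to the main one, so the field is assigned on every construction path. The names Throttle, ExampleServer and DEFAULT_MAX_TOKENS are hypothetical stand-ins, not ZooKeeper classes or settings.

```java
import java.util.Objects;

// Hypothetical stand-in for the throttle object; not a ZooKeeper class.
final class Throttle {
    private final int maxTokens;

    Throttle(int maxTokens) {
        this.maxTokens = maxTokens;
    }

    int maxTokens() {
        return maxTokens;
    }
}

// Hypothetical stand-in for the class with two constructors.
final class ExampleServer {
    private static final int DEFAULT_MAX_TOKENS = 0; // assumed default, for illustration

    private final Throttle throttle;

    // Alternate constructor: delegates instead of leaving the field null.
    ExampleServer() {
        this(new Throttle(DEFAULT_MAX_TOKENS));
    }

    // Main constructor: the single place where the field is assigned.
    ExampleServer(Throttle throttle) {
        this.throttle = Objects.requireNonNull(throttle, "throttle");
    }

    int maxTokens() {
        return throttle.maxTokens(); // safe on both construction paths
    }
}
```

Making the field final means the compiler rejects any constructor that forgets to assign it, which is one way to keep this kind of NPE from reappearing.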



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


ZOOKEEPER-1416 - please validate my assumption

2019-10-23 Thread Jordan Zimmerman
The Persistent/Recursive watches (ZOOKEEPER-1416) PR is ready. But, I'm 
concerned about an assumption I've made. I worry about event ordering regarding 
multiple writes from multiple clients and watchers. Here's my assumption:

If you successfully set a Persistent watcher (i.e. you get confirmation via a 
synchronous addWatch() completing or the callback to an asynchronous addWatch) 
then any future ZooKeeper calls (getData, setData, etc.) have the same 
guarantees as any other Watcher. ZOOKEEPER-1416 adds the persistent watchers to 
both the dataWatchers and childWatchers managed by DataTree. This is what all 
other watchers do.

So, I believe it's true but we really need some confirmation from someone who 
knows those internals better than me. 

-Jordan

Re: [VOTE] Apache ZooKeeper release 3.5.6 candidate 4

2019-10-09 Thread Jordan Zimmerman
It's required by the Apache Release process:

http://www.apache.org/dev/release-distribution 


For every artifact distributed to the public through Apache channels, the PMC

• MUST supply a valid OpenPGP-compatible ASCII-armored detached 
signature file
• MUST supply at least one checksum file
• SHOULD supply a SHA-256 and/or SHA-512 checksum file
• SHOULD NOT supply a MD5 or SHA-1 checksum file (because these are 
deprecated)

For new releases, PMCs MUST supply SHA-256 and/or SHA-512; and SHOULD NOT 
supply MD5 or SHA-1. Existing releases do not need to be changed.
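
For illustration only, the checksum part of that policy can be sketched with the JDK's MessageDigest. This is not the project's release tooling; the class and method names are made up for the example, and the detached OpenPGP signature is a separate step that is not shown.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class ReleaseChecksum {
    // Returns the lowercase hex digest of data under the named algorithm,
    // for example "SHA-256" or "SHA-512".
    static String hexDigest(String algorithm, byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance(algorithm);
            byte[] digest = md.digest(data);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(algorithm + " is not available", e);
        }
    }
}
```

A caller would read the artifact's bytes (for instance with java.nio.file.Files.readAllBytes) and write the resulting hex string to a .sha512 file next to the artifact.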

> On Oct 9, 2019, at 6:50 PM, Andor Molnar  wrote:
> 
> Checking.
> Why do we generate SHA512 sums for the gpg signatures?
> Is that intentional?
> 
> Andor
> 
> 
> 
>> On 2019. Oct 8., at 22:36, Enrico Olivelli  wrote:
>> 
>> This is a bugfix release candidate for 3.5.6.
>> 
>> It fixes 29 issues, including upgrade of third party libraries,
>> TTL Node APIs for C API, support for PKCS12 Keystores, upgrade of Netty 4
>> and better procedure for the upgrade of servers from 3.4 to 3.5.
>> 
>> The full release notes are available at:
>> 
>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310801=12345243
>> 
>> *** Please download, test and vote by October 11th 2019, 23:59 UTC+0. ***
>> 
>> Source files:
>> https://people.apache.org/~eolivelli/zookeeper-3.5.6-candidate-4
>> 
>> Maven staging repo:
>> https://repository.apache.org/content/repositories/orgapachezookeeper-1044
>> 
>> The release candidate tag in git to be voted upon: release-3.5.6-rc4
>> https://github.com/apache/zookeeper/tree/release-3.5.6-rc4
>> 
>> ZooKeeper's KEYS file containing PGP keys we use to sign the release:
>> https://www.apache.org/dist/zookeeper/KEYS
>> 
>> Should we release this candidate?
>> 
>> Enrico Olivelli
> 



Re: PoweredBy Zookeeper

2019-09-13 Thread Jordan Zimmerman
FYI - we've had this on our wiki forever. By extension all these are powered by 
ZK:

https://cwiki.apache.org/confluence/display/CURATOR/Powered+By 


-Jordan

> On Sep 13, 2019, at 3:21 PM, Enrico Olivelli  wrote:
> 
> Hi,
> Maoling started this initiative to have a great Zookeeper use cases page:
> 
> https://github.com/apache/zookeeper/pull/1073
> 
> I think this is a good idea, but before 'approving' that change I feel we
> must discuss more about it.
> The major point for me is whether we should ask for consent for publishing
> such information
> 
> Usually maintainers of projects actively ask for their project to be added,
> with this page we are citing third party products and companies
> 
> Once we clear this point I will be happy to merge that change.
> 
> Thanks Maoling
> 
> 
> Enrico



Re: Consistency Guarantees

2019-08-18 Thread Jordan Zimmerman
Isn’t this a “no true Scotsman” argument? By this definition any eventually 
consistent system can never be considered linearizable. Right?


Jordan Zimmerman

> On Aug 18, 2019, at 1:47 PM, Karolos Antoniadis  wrote:
> 
> Hi Jordan,
> 
> When Aphyr tested ZooKeeper, he did not seem to know that it is not
> linearizable. See here: https://github.com/jepsen-io/jepsen/issues/399, where
> I pointed-out that even with *sync + read*, ZooKeeper might return stale
> data.
> 
> ZooKeeper can only be considered linearizable if we assume that specific
> timing constraints apply. Naturally, in a real system, we cannot make such
> assumptions if we want to be 100% safe.
> For instance, if the time(TCP timeout) > (syncLimit * tickTime), ZooKeeper
> provides linearizable reads. However, if this does not hold (e.g., skewed
> clocks), then ZooKeeper might return stale data.
> To conclude, I do not think we can argue that ZooKeeper is linearizable.
> 
> Cheers,
> Karolos
> 
> 
> On Sun, 18 Aug 2019 at 11:34, Jordan Zimmerman 
> wrote:
> 
>> ZooKeeper _is_ linearizable. I’m pretty sure the ZAB paper talks about it.
>> Aphyr does as well here: https://aphyr.com/posts/291-jepsen-zookeeper
>> 
>> 
>> Jordan Zimmerman
>> 
>>> On Aug 18, 2019, at 1:23 PM, Karolos Antoniadis 
>> wrote:
>>> 
>>> Hello everyone,
>>> 
>>> I was wondering on the exact consistency guarantees that ZooKeeper
>> provides.
>>> It seems that ZooKeeper does not provide strong consistency (i.e.,
>>> linearizability) since reads could potentially return arbitrarily old
>>> values.
>>> On the other hand, ZooKeeper provides sequential consistency, since the
>>> order of operations of a specific client is respected and all operations
>>> appear to take place in some total order (
>>> https://jepsen.io/consistency/models/sequential).
>>> However, ZooKeeper provides linearizable writes, and therefore it
>> provides
>>> something stronger than sequential consistency, but still not as strong
>> as
>>> linearizability. In other words, ZooKeeper guarantees are somewhere
>> between
>>> sequential consistency and linearizability.
>>> Is there a specific name for the specific consistency guarantees that
>>> ZooKeeper provides?
>>> What would the ZooKeeper community claim about the consistency guarantees
>>> of ZooKeeper?
>>> 
>>> Best Regards,
>>> Karolos
>> 


Re: Consistency Guarantees

2019-08-18 Thread Jordan Zimmerman
ZooKeeper _is_ linearizable. I’m pretty sure the ZAB paper talks about it. 
Aphyr does as well here: https://aphyr.com/posts/291-jepsen-zookeeper


Jordan Zimmerman

> On Aug 18, 2019, at 1:23 PM, Karolos Antoniadis  wrote:
> 
> Hello everyone,
> 
> I was wondering on the exact consistency guarantees that ZooKeeper provides.
> It seems that ZooKeeper does not provide strong consistency (i.e.,
> linearizability) since reads could potentially return arbitrarily old
> values.
> On the other hand, ZooKeeper provides sequential consistency, since the
> order of operations of a specific client is respected and all operations
> appear to take place in some total order (
> https://jepsen.io/consistency/models/sequential).
> However, ZooKeeper provides linearizable writes, and therefore it provides
> something stronger than sequential consistency, but still not as strong as
> linearizability. In other words, ZooKeeper guarantees are somewhere between
> sequential consistency and linearizability.
> Is there a specific name for the specific consistency guarantees that
> ZooKeeper provides?
> What would the ZooKeeper community claim about the consistency guarantees
> of ZooKeeper?
> 
> Best Regards,
> Karolos


Re: thoughts about extension to multi semantics

2019-08-17 Thread Jordan Zimmerman



> On Aug 17, 2019, at 4:41 PM, Ted Dunning  wrote:
> 
> On Sat, Aug 17, 2019 at 4:01 PM Jordan Zimmerman 
> wrote:
> 
>> 
>> 
>> ...
>>> I don't understand that. Watches can be set in a multi.
>> 
>> Not in the public API:
>> https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/Op.java
>> - is it supported in the back-end?
>> 
> 
> Who designed that mess?!?

lol


> 
>> ...>
>>> I don't understand that, either. But this time I just don't understand
>> what you are suggesting and how it helps.
>> 
>> The standard lock recipe creates an ephemeral-sequential node. Once your
>> node (with its sequence number) is returned you call getChildren() to see
>> if you have the lowest numbered node. The lowest numbered node is defined
>> to be the lock holder (or leader, etc.). This requires two round trips.
> 
> 
> Hmm... well looking at the directory in the same operation as the create
> sequential should be easy.

... big snip ...

It seems like the majority of ZK client use cases are a variant of: a) set an 
ephemeral node; b) query the children of the parent; c) watch for some changes; 
d) act and reset. It would be nice if the server provided something more than 
primitives. Of course, we now have Curator to mitigate the difficulty but when 
you need something that Curator doesn't provide you're faced with the 
complexity. 

-JZ

Re: thoughts about extension to multi semantics

2019-08-17 Thread Jordan Zimmerman


> On Aug 17, 2019, at 2:50 PM, Ted Dunning  wrote:
> 
> 
> 
> On Sat, Aug 17, 2019 at 10:19 AM Jordan Zimmerman wrote:
> Some thoughts:
> 
> It doesn't really help with any of the "standard" recipes as they all need to 
> set watches.
> 
> I don't understand that. Watches can be set in a multi.

Not in the public API: 
https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/Op.java
 - is it supported in the back-end?

>  
> Not to open a can of worms, but if there were a firehose version of watches 
> that could be set independently, this type of multi-op could radically 
> simplify some of the recipes. i.e. one could imagine a multi-op that creates 
> an ephemeral node and then returns a sorted list of child node names so that 
> leader election and locks can be done in one shot. 
> 
> I don't understand that, either. But this time I just don't understand what 
> you are suggesting and how it helps.

The standard lock recipe creates an ephemeral-sequential node. Once your node 
(with its sequence number) is returned you call getChildren() to see if you 
have the lowest numbered node. The lowest numbered node is defined to be the 
lock holder (or leader, etc.). This requires two round trips. It would be nice 
to consolidate this into 1 API call. Further, if you're not the lowest numbered 
node, you must set a watch on the node that precedes you so you know when to 
check again. This is all very cumbersome to do in client code (thus Curator). 
Maybe there's a way to specify this entire behavior in a multi call. 
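
The client-side decision in that recipe can be sketched as a pure function over the child list. This is a simplified illustration, not Curator's implementation; it assumes node names end in "-" followed by the sequence number (for example "lock-0000000003"), and the names LockOrder, sequenceOf and nodeToWatch are made up for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

final class LockOrder {
    // Parses the numeric suffix of a sequential node name.
    // Assumes the name ends in "-<digits>".
    static long sequenceOf(String nodeName) {
        return Long.parseLong(nodeName.substring(nodeName.lastIndexOf('-') + 1));
    }

    // Returns the node to watch (the one immediately preceding ours), or
    // empty if ourNode has the lowest sequence number and holds the lock.
    static Optional<String> nodeToWatch(List<String> children, String ourNode) {
        List<String> sorted = new ArrayList<>(children);
        sorted.sort(Comparator.comparingLong(LockOrder::sequenceOf));
        int index = sorted.indexOf(ourNode);
        if (index < 0) {
            throw new IllegalArgumentException(ourNode + " is not among the children");
        }
        return index == 0 ? Optional.empty() : Optional.of(sorted.get(index - 1));
    }
}
```

In a real client the children list comes from getChildren() after the create call returns, which is the second round trip mentioned above; watching only the preceding node avoids waking every waiter when the lock is released.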

---

I'll read/review the queue idea separately.

-Jordan



Re: thoughts about extension to multi semantics

2019-08-17 Thread Jordan Zimmerman
Some thoughts:

It doesn't really help with any of the "standard" recipes as they all need to 
set watches. Not to open a can of worms, but if there were a firehose version 
of watches that could be set independently, this type of multi-op could 
radically simplify some of the recipes. i.e. one could imagine a multi-op that 
creates an ephemeral node and then returns a sorted list of child node names so 
that leader election and locks can be done in one shot. 
An atomic counter could be done much more simply than how Curator does it now 
as the test/increment could be done server side
Queues would be easier (possibly - I need to think about this some more). 
Curator's queue code is very complex.

Anyway - I'll try to spend some time in Curator's various recipes to see how 
they would be simplified if this server-side feature was available.

-Jordan

> On Aug 16, 2019, at 11:51 AM, Ted Dunning  wrote:
> 
> The recent discussion about if/then/else idioms in ZK has raised the
> thought that it might be nice to have some extended semantics.
> 
> One version that I could see would be to extend the current multi-op to
> allow multiple alternatives. The idea would be that there would effectively
> be multiple branches to be tried. The first one that succeeds atomically
> (all or nothing) would be used. The returned value would need to somehow
> indicate which alternative succeeded and would need to return any data
> accessed. The testing of alternatives would also be atomic so it wouldn't
> be possible for things to change within a single operation.
> 
> This extension would allow the previous question to be answered like this:
> 
>   pick_first {
> create(...)
>   } {
> set(...)
>   }
> 
> (the syntax here is just made up and wouldn't actually be supported ... it
> is just for pseudo code purposes).
> 
> 
> My theory is that this would be relatively easy to implement based on the
> current multi operation. Risk due to the change is pretty low given that
> there is code to copy.
> 
> My question is whether this would actually have all that much benefit.
> 
> Does anybody have an opinion on that?
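
To make the proposed semantics concrete, here is a small in-memory sketch of "first alternative that succeeds atomically wins". It is purely illustrative: it models the znode tree as a map, uses a synchronized method as a stand-in for server-side atomicity, and every name in it (PickFirstSketch, Op, create, set, pickFirst) is hypothetical rather than part of any ZooKeeper API.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class PickFirstSketch {
    // A single operation: it either can apply to the current state or cannot.
    interface Op {
        boolean canApply(Map<String, String> state);
        void apply(Map<String, String> state);
    }

    // create succeeds only if the path does not exist yet.
    static Op create(String path, String data) {
        return new Op() {
            public boolean canApply(Map<String, String> s) { return !s.containsKey(path); }
            public void apply(Map<String, String> s) { s.put(path, data); }
        };
    }

    // set succeeds only if the path already exists.
    static Op set(String path, String data) {
        return new Op() {
            public boolean canApply(Map<String, String> s) { return s.containsKey(path); }
            public void apply(Map<String, String> s) { s.put(path, data); }
        };
    }

    // Tries each alternative in order against a scratch copy of the state.
    // The first alternative whose operations all succeed is committed as a
    // whole. Returns the index of that alternative, or -1 if none succeeded.
    static synchronized int pickFirst(Map<String, String> state, List<List<Op>> alternatives) {
        for (int i = 0; i < alternatives.size(); i++) {
            Map<String, String> scratch = new HashMap<>(state);
            boolean ok = true;
            for (Op op : alternatives.get(i)) {
                if (!op.canApply(scratch)) {
                    ok = false;
                    break;
                }
                op.apply(scratch);
            }
            if (ok) {
                state.clear();
                state.putAll(scratch);
                return i;
            }
        }
        return -1;
    }
}
```

With alternatives of create("/a", ...) followed by set("/a", ...), the first call on an empty state takes branch 0 and later calls take branch 1, which is the create-or-set behavior the pseudocode above is after. The returned index plays the role of "which alternative succeeded".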



Re: KIP-500: Replace ZooKeeper with a Self-Managed Metadata Quorum

2019-08-13 Thread Jordan Zimmerman
This is the amount of code a client must write to achieve this in ZooKeeper: 
https://github.com/apache/curator/blob/master/curator-recipes/src/main/java/org/apache/curator/framework/recipes/cache/TreeCache.java
 - note that this class takes advantage of Curator as well. Comparing this use 
case to Kafka is mistaken. While some users might want durable and complete 
events, most really only want a simple way to follow everything that happens 
from a given parent downward. That this is effectively impossible to do in 
ZooKeeper (other than using Curator) is a large hole IMO.

-Jordan

> On Aug 13, 2019, at 8:20 AM, Jordan Zimmerman  
> wrote:
> 
> Also see https://issues.apache.org/jira/browse/ZOOKEEPER-1416 
> 
> There are many use cases where a client wants to see all events from a given 
> parent path down. The semantics of setting one-time watches on a single node 
> in ZK are cumbersome for these use cases. FWIW I had a working PR a few years 
> ago but it's fallen far behind 3.6 now.
> 
> -Jordan
> 
>> On Aug 13, 2019, at 8:18 AM, Andor Molnar wrote:
>> 
>> Subscriber API
>> https://issues.apache.org/jira/browse/ZOOKEEPER-153 
>> 
>> Is it supposed to be something like a generic Observer API on the client 
>> side?
>> Observers essentially consume ordered updates of ZAB, so we would need to 
>> provide a way for users to implement their own “observers”. They should be 
>> able to filter for path to be more convenient.
>> 
>> Andor
>> 
>> 
>> 
>>> On 2019. Aug 2., at 20:48, Patrick Hunt  wrote:
>>> 
>>> Michael I think you are describing subscribe - this?
>>> https://issues.apache.org/jira/browse/ZOOKEEPER-153
>>> wasn't there some work done to keep tlogs around for a while? Or am I
>>> misremembering? (fb folks?)
>>> 
>>> I'll also add that we haven't done any benchmarking in quite some time. It
>>> would be interesting to collect a few of these use cases from the
>>> community, esp downstreams, and evaluate performance, see if we can address.
>>> 
>>> Patrick
>>> 
>>> On Fri, Aug 2, 2019 at 11:03 AM Michael Han  wrote:
>>> 
>>>> Folks,
>>>> 
>>>> Some of you might already see this. Comments?
>>>> 
>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-500%3A+Replace+ZooKeeper+with+a+Self-Managed+Metadata+Quorum
>>>> 
>>>> 
>>>> What caught my eyes are:
>>>> 
>>>> *Worse still, although ZooKeeper is the store of record, the state in
>>>> ZooKeeper often doesn't match the state that is held in memory in the
>>>> controller.  For example, when a partition leader changes its ISR in ZK,
>>>> the controller will typically not learn about these changes for many
>>>> seconds.  There is no generic way for the controller to follow the
>>>> ZooKeeper event log.  Although the controller can set one-shot watches, the
>>>> number of watches is limited for performance reasons.  When a watch
>>>> triggers, it doesn't tell the controller the current state-- only that the
>>>> state has changed.  By the time the controller re-reads the znode and sets
>>>> up a new watch, the state may have changed from what it was when the watch
>>>> originally fired.  If there is no watch set, the controller may not learn
>>>> about the change at all.  In some cases, restarting the controller is the
>>>> only way to resolve the discrepancy.*
>>>> 
>>>> I've seen some similar zookeeper use cases that ended up like what's
>>>> described here. How can ZooKeeper solve this? It seems to me that the only
>>>> solution is to provide linearizable read on watched operations. Thoughts?
>>>> 
>>>> Michael.
>>>> 
>> 
> 



Re: KIP-500: Replace ZooKeeper with a Self-Managed Metadata Quorum

2019-08-13 Thread Jordan Zimmerman
Also see https://issues.apache.org/jira/browse/ZOOKEEPER-1416 


There are many use cases where a client wants to see all events from a given 
parent path down. The semantics of setting one-time watches on a single node in 
ZK are cumbersome for these use cases. FWIW I had a working PR a few years ago 
but it's fallen far behind 3.6 now.

-Jordan

> On Aug 13, 2019, at 8:18 AM, Andor Molnar  wrote:
> 
> Subscriber API
> https://issues.apache.org/jira/browse/ZOOKEEPER-153
> 
> Is it supposed to be something like a generic Observer API on the client side?
> Observers essentially consume ordered updates of ZAB, so we would need to 
> provide a way for users to implement their own “observers”. They should be 
> able to filter for path to be more convenient.
> 
> Andor
> 
> 
> 
>> On 2019. Aug 2., at 20:48, Patrick Hunt  wrote:
>> 
>> Michael I think you are describing subscribe - this?
>> https://issues.apache.org/jira/browse/ZOOKEEPER-153
>> wasn't there some work done to keep tlogs around for a while? Or am I
>> misremembering? (fb folks?)
>> 
>> I'll also add that we haven't done any benchmarking in quite some time. It
>> would be interesting to collect a few of these use cases from the
>> community, esp downstreams, and evaluate performance, see if we can address.
>> 
>> Patrick
>> 
>> On Fri, Aug 2, 2019 at 11:03 AM Michael Han  wrote:
>> 
>>> Folks,
>>> 
>>> Some of you might already see this. Comments?
>>> 
>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-500%3A+Replace+ZooKeeper+with+a+Self-Managed+Metadata+Quorum
>>> 
>>> 
>>> What caught my eyes are:
>>> 
>>> *Worse still, although ZooKeeper is the store of record, the state in
>>> ZooKeeper often doesn't match the state that is held in memory in the
>>> controller.  For example, when a partition leader changes its ISR in ZK,
>>> the controller will typically not learn about these changes for many
>>> seconds.  There is no generic way for the controller to follow the
>>> ZooKeeper event log.  Although the controller can set one-shot watches, the
>>> number of watches is limited for performance reasons.  When a watch
>>> triggers, it doesn't tell the controller the current state-- only that the
>>> state has changed.  By the time the controller re-reads the znode and sets
>>> up a new watch, the state may have changed from what it was when the watch
>>> originally fired.  If there is no watch set, the controller may not learn
>>> about the change at all.  In some cases, restarting the controller is the
>>> only way to resolve the discrepancy.*
>>> 
>>> I've seen some similar zookeeper use cases that ended up like what's
>>> described here. How can ZooKeeper solve this? It seems to me that the only
>>> solution is to provide linearizable read on watched operations. Thoughts?
>>> 
>>> Michael.
>>> 
> 



Re: Intellij Idea warning as error with -Xdoclint

2019-06-19 Thread Jordan Zimmerman
I'm getting this too - FYI

> On Jun 19, 2019, at 3:46 PM, Andor Molnar  wrote:
> 
> Hi Enrico,
> 
> I have the following error message in Idea since -Xdoclint is enabled in the 
> main pom.xml file:
> 
> /Users/andormolnar/git/my-zookeeper/zookeeper-jute/src/main/java/org/apache/jute/Utils.java
> Error:(194, 15) java: @param name not found
> Error:(231, 15) java: @param name not found
> 
> Strange that I don’t see the same warnings in console when running ‘mvn 
> install’.
> I confirm that removing “-Xdoclint” (or fixing javadoc issues) solves the 
> problem.
> 
> Andor
> 
> 



Re: Time to think about a 3.6.0 release?

2019-06-15 Thread Jordan Zimmerman
On Persistent/Recursive watches: I’m willing to rebase, etc if there’s 
confidence it will be merged. 


Jordan Zimmerman

> On Jun 15, 2019, at 10:59 AM, Andor Molnar  wrote:
> 
> Hi Enrico!
> 
> Very good point, I entirely support the idea.
> 
> Question to Friends@Facebook and Twitter contributors: how many outstanding
> Jiras/PRs do you have which you would like to see in 3.6?
> 
> I'd also like to highlight the long outstanding PR from Mapr:
> https://github.com/apache/zookeeper/pull/730
> 
> And some great new features which are still waiting to be merged:
> - Persistent recursive watchers:
> https://github.com/apache/zookeeper/pull/136
> - Enforce client auth: https://github.com/apache/zookeeper/pull/118
> - Slow operation log
> - Jetty port unification
> 
> Regards,
> Andor
> 
> 
> 
> 
> 
>> On Sat, Jun 15, 2019 at 1:31 PM Enrico Olivelli  wrote:
>> 
>> Hi Zookeepers !
>> I checked on JIRA and it seems that master is in good shape, no real blockers
>> that undermine the stability of the code.
>> 
>> We have plenty of cool pull requests almost ready to be merged (mostly from
>> Facebook friends and Twitter fork)
>> 
>> Current master branch is full of great features with respect to 3.5.
>> 
>> AFAIK There is no incompatibility with 3.5 so it is okay to stay with
>> 3.6.0, although I think that there is so much stuff to justify a switch to
>> 4.0.0 (but we can reserve such bump for the time we will separate the java
>> client and create a minimal compatibility breakage)
>> 
>> Thoughts?
>> 
>> Enrico
>> 


Re: [ANNOUNCE] Apache ZooKeeper 3.5.5

2019-05-21 Thread Jordan Zimmerman
AFAIK, the version of ZooKeeper 3.5.5 being released is fully compatible with 
Curator. Please let us know if you find any issues.

-Jordan

> On May 21, 2019, at 10:44 AM, rammohan ganapavarapu  
> wrote:
> 
> I am curious to know what are the compatible "Apache Curator" client
> version for 3.5.5 version.
> Also is there any upgrade path from 3.4.5 to 3.5.5. Can i add a 3.5.5
> follower/observer for a 3.4.5 leader and vice versa?
> 
> Thanks,
> Ram
> 
> On Tue, May 21, 2019 at 1:34 AM Tamas Penzes 
> wrote:
> 
>> Congratulations!
>> 
>> We waited for this release for a really long time. I'm looking forward to
>> using it.
>> ZooKeeper has arrived at a new era.
>> 
>> Regards, Tamaas
>> 
>> On Tue, May 21, 2019 at 4:18 AM Zili Chen  wrote:
>> 
>>> Congratulations!
>>> 
>>> rammohan ganapavarapu  于2019年5月21日周二 上午7:25写道:
>>> 
 Congratulations, finally it's out 
 
 On Mon, May 20, 2019, 11:59 AM Enrico Olivelli 
 wrote:
 
> Congratulations!
> 
> Enrico
> 
> Il lun 20 mag 2019, 19:28 Lars Francke  ha
> scritto:
> 
>> Congratulations on this release! It looks great and I'm looking
>>> forward
> to
>> using all those new features.
>> 
>> Thank you, everyone, for your work on this.
>> 
>> On Mon, May 20, 2019 at 7:06 PM Andor Molnar 
>>> wrote:
>> 
>>> The Apache ZooKeeper team is proud to announce Apache ZooKeeper
 version
>>> 3.5.5
>>> 
>>> ZooKeeper is a high-performance coordination service for
>>> distributed
>>> applications. It exposes common services - such as naming,
>>> configuration management, synchronization, and group services -
>> in
>>> a
>>> simple interface so you don't have to write them from scratch.
>> You
 can
>>> use it off-the-shelf to implement consensus, group management,
>>> leader
>>> election, and presence protocols. And you can build on it for
>> your
>>> own, specific needs.
>>> 
>>> For ZooKeeper release details and downloads, visit:
>>> https://zookeeper.apache.org/releases.html
>>> 
>>> ZooKeeper 3.5.5 Release Notes are at:
>>> https://zookeeper.apache.org/doc/r3.5.5/releasenotes.html
>>> 
>>> We would like to thank the contributors that made the release
 possible.
>>> 
>>> Regards,
>>> 
>>> The ZooKeeper Team
>>> 
>>> 
>>> 
>> 
> 
 
>>> 
>> 



Re: [ANNOUNCE] Apache ZooKeeper 3.5.5

2019-05-20 Thread Jordan Zimmerman
Maven Central shows the first version of 3.5.x as Aug 2014. Here's to a five 
year dev cycle :D But, of course, congrats. 

-JZ

> On May 20, 2019, at 12:25 PM, Jeff Widman  wrote:
> 
> Congrats on dropping the "beta" tag, that's a major milestone!
> 
> Thanks to the team for all your hard work!
> 
> On Mon, May 20, 2019 at 10:06 AM Andor Molnar  wrote:
> 
>> The Apache ZooKeeper team is proud to announce Apache ZooKeeper version
>> 3.5.5
>> 
>> ZooKeeper is a high-performance coordination service for distributed
>> applications. It exposes common services - such as naming,
>> configuration management, synchronization, and group services - in a
>> simple interface so you don't have to write them from scratch. You can
>> use it off-the-shelf to implement consensus, group management, leader
>> election, and presence protocols. And you can build on it for your
>> own, specific needs.
>> 
>> For ZooKeeper release details and downloads, visit:
>> https://zookeeper.apache.org/releases.html
>> 
>> ZooKeeper 3.5.5 Release Notes are at:
>> https://zookeeper.apache.org/doc/r3.5.5/releasenotes.html
>> 
>> We would like to thank the contributors that made the release possible.
>> 
>> Regards,
>> 
>> The ZooKeeper Team
>> 
>> 
>> 
> 
> -- 
> 
> *Jeff Widman*
> jeffwidman.com  | 740-WIDMAN-J (943-6265)
> <><



Re: Jute buffer size increasing.

2019-03-08 Thread Jordan Zimmerman
The Jute buffer size is a manual setting, so it wouldn't increase by itself. If 
you find that you have to keep increasing it, there are a few possible causes. 
One is ZNodes with lots of children, especially children with long names. Any 
single ZooKeeper API call is limited by jute max buffer, so a call to 
getChildren() on a node with many children, long child names, or both can 
exceed jute max buffer. Another possibility is ZNodes with large payloads.
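
As a rough way to reason about the getChildren() case, the sketch below estimates the size of a child-name list. The encoding it assumes (a 4-byte element count, then each name as a 4-byte length plus its UTF-8 bytes) is an assumption made for illustration, and response headers are ignored, so treat the result as a lower-bound estimate rather than an exact wire size. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;

final class ChildrenSizeEstimate {
    // Assumed encoding: 4-byte element count, then for each name a
    // 4-byte length followed by the name's UTF-8 bytes.
    static long estimateBytes(List<String> childNames) {
        long total = 4;
        for (String name : childNames) {
            total += 4 + name.getBytes(StandardCharsets.UTF_8).length;
        }
        return total;
    }

    // True if the estimated list size stays within the configured limit.
    static boolean fitsWithin(List<String> childNames, long maxBufferBytes) {
        return estimateBytes(childNames) <= maxBufferBytes;
    }
}
```

Under this estimate, 100,000 children with 40-character ASCII names come to about 4.4 MB, which shows how child count and name length multiply together.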

-JZ

> On Mar 8, 2019, at 7:21 PM, Asim Zafir  wrote:
> 
> 
> + ZK dev community. Please assist. 
> On Fri, Mar 8, 2019 at 4:10 PM Asim Zafir  > wrote:
> Hi Jordon, 
> 
> We are seeing a constant increase in jute buffer size on our zookeeper 
> instance. Right now it is set to 128. We are primarily using zookeeper for 
> an HBase cluster. I want to see what is contributing to the increase of jute 
> buffer size, but so far, after investigating the code and studying the 
> protocol itself, it appears to be a function of the number of watches that 
> get set on the znodes. To see how many zookeeper watch objects are on the 
> zookeeper jvm/instance I did a jmap -histo:live on the zookeeper pid and got 
> the following output (please see below). I am not sure what [C and [B are 
> here; they don't appear to refer to any class, and I don't see them on the 
> dev instance of zookeeper. Could this be due to a suspected memory leak or 
> another issue? Please guide me through this, as I can't find anyone who can 
> go far enough to give me any hint as to what may be happening on my end. 
> Also, is it safe for ZK sizes to increase that much? I will greatly 
> appreciate your feedback and help on this.
> 
>  num  #instances      #bytes  class name
> ----------------------------------------------
>    1:    220810   140582448  [C
>    2:    109370    34857168  [B
>    3:    103842     7476624  org.apache.zookeeper.data.StatPersisted
>    4:    220703     5296872  java.lang.String
>    5:     28682     3783712
>    6:     28682     3681168
>    7:    111000     3552000  java.util.HashMap$Entry
>    8:    107569     3442208  java.util.concurrent.ConcurrentHashMap$HashEntry
>    9:    103842     3322944  org.apache.zookeeper.server.DataNode
>   10:      2655     3179640
>   11:      2313     2017056
>   12:      2655     1842456
>   13:       318     1241568  [Ljava.util.concurrent.ConcurrentHashMap$HashEntry;
>   14:      7526     1221504  [Ljava.util.HashMap$Entry;
>   15:      1820      812976
>   16:      8228      394944  java.util.HashMap
>   17:      2903      348432  java.lang.Class
>   18:      4077      229688  [S
>   19:      4138      221848  [[I
>   20:       231      125664
>   21:      7796      124736  java.util.HashSet
>   22:      6771      108336  java.util.HashMap$KeySet
>   23:      1263       62968  [Ljava.lang.Object;
>   24:       746       59680  java.lang.reflect.Method
>   25:      3570       57120  java.lang.Object
>   26:       502       36144  org.apache.zookeeper.server.Request
>   27:       649       25960  java.lang.ref.SoftReference
>   28:       501       24048  org.apache.zookeeper.txn.TxnHeader
>   29:       188       21704  [I
>   30:       861       20664  java.lang.Long
>   31:       276       19872  java.lang.reflect.Constructor
>   32:       559       17888  java.util.concurrent.locks.ReentrantLock$NonfairSync
>   33:       422       16880  java.util.LinkedHashMap$Entry
>   34:       502       16064  org.apache.zookeeper.server.quorum.QuorumPacket
>   35:       455       14560  java.util.Hashtable$Entry
>   36:       495       14368  [Ljava.lang.String;
>   37:       318       12720  java.util.concurrent.ConcurrentHashMap$Segment
>   38:         3       12336  [Ljava.nio.ByteBuffer;
>   39:       514       12336  javax.management.ObjectName$Property
>   40:       505       12120  java.util.LinkedList$Node
>   41:       501       12024  org.apache.zookeeper.server.quorum.Leader$Proposal
>   42:       619       11920  [Ljava.lang.Class;
>   43:        74       11840  org.apache.zookeeper.server.NIOServerCnxn
>   44:       145       11672  [Ljava.util.Hashtable$Entry;
>   45:       729       11664  java.lang.Integer
>   46:       346       11072  java.lang.ref.WeakReference
>   47:       449       10776  org.apache.zookeeper.txn.SetDataTxn
>   48:       156        9984  com.cloudera.cmf.event.shaded.org.apache.avro.Schema$Props
>   49:       266        8512  java.util.Vector
>   50:        75        8400  sun.nio.ch.SocketChannelImpl
>   51:       175        8400  java.nio.HeapByteBuffer
>   52:       247        8320

Re: Question on ZK commit/patch policy.

2019-03-05 Thread Jordan Zimmerman
Even on "trivial" changes having a Jira is very useful. Jira issues show up in 
Release Notes and when end users search for problems/solutions. Even a trivial 
change may be important to some user of ZooKeeper who might want to be able to 
check Jira to see when/why something happened.

-JZ

> On Mar 5, 2019, at 4:16 AM, Justin Ling Mao  wrote:
> 
> agree with this from Brian Nixon. ---> "For trivial changes like spelling, 
> whitespace, pruning of import, does it make sense to have one super/umbrella 
> ticket with multiple PRs attached"
> - Original Message -
> From: Brian Nixon 
> To: dev@zookeeper.apache.org
> Subject: Re: Question on ZK commit/patch policy.
> Date: 2019-03-05 05:54
> 
> I like having JIRAs for all changes because it allows one to track all the
> changes to given components through the JIRA web interface and it forces
> the contributor to spend some time upfront making sure their change is a
> single coherent unit.
> For trivial changes like spelling, whitespace, pruning of import, does it
> make sense to have one super/umbrella ticket with multiple PRs attached?
> -Brian
> On Wed, Feb 27, 2019 at 1:04 PM Enrico Olivelli  wrote:
>> I think that having a JIRA makes it simpler to create release notes and
>> track bugfixes/new features.
>> Trivial changes, like typos are not worth a JIRA.
>> 
>> My 2 cents
>> Enrico
>> 
>> Il mer 27 feb 2019, 17:57 Patrick Hunt  ha scritto:
>> 
>>> Yea, the commit I just did was a single missing space so no big deal.
>>> Jordan's link is to curator current policy which seems very similar to
>>> ours.
>>> 
>>> I know what current state is. My question though is what do people think?
>>> Stay with the current mechanism or move to something else? Staying put is
>>> fine, I just wanted to review given it's been a while (10+ years!) since
>> we
>>> last considered this and with github/gitbox and time baselines have
>> changed
>>> considerably over that time.
>>> 
>>> Patrick
>>> 
>>> 
>>> On Wed, Feb 27, 2019 at 8:44 AM Andor Molnar >> 
>>> wrote:
>>> 
>>>> There were a few typo/language/cosmetic related patches which were so
>>>> small that we've decided it's probably not worth the effort to create a
>>>> Jira for every one of them.
>>>> Similarly, I haven't created Jiras for issues that were found in release
>>>> candidates.
>>>>
>>>> Other than this we generally still don't accept patches without Jira
>>>> ticket and properly formatted title / commit message.
>>>>
>>>> Andor
>>>>
>>>> On Wed, Feb 27, 2019 at 5:38 PM Patrick Hunt  wrote:
>>>>
>>>>> Historically we've only committed changes that have an associated JIRA.
>>>>> Now with the move to gitbox we are seeing increased submissions (PRs) that
>>>>> don't include a JIRA - I just committed one and then realized that it
>>>>> didn't include a JIRA (sorry about that!). Given github and the recent
>>>>> move to gitbox significantly streamlines the contribution process I'm
>>>>> wondering if we should reconsider our process. Any thoughts? Anyone work
>>>>> on another Apache project that does things differently and has pro/con
>>>>> to share?
>>>>>
>>>>> Regards,
>>>>>
>>>>> Patrick
>>>>>
>>>>
>>> 
>> 



Re: Question on ZK commit/patch policy.

2019-02-27 Thread Jordan Zimmerman
For Curator we require a Jira. If we get a PR without a Jira we always ask them 
to create one.

-JZ

> On Feb 27, 2019, at 11:37 AM, Patrick Hunt  wrote:
> 
> Historically we've only committed changes that have an associated JIRA. Now
> with the move to gitbox we are seeing increased submissions (PRs) that
> don't include a JIRA - I just committed one and then realized that it
> didn't include a JIRA (sorry about that!). Given github and the recent move
> to gitbox significantly streamlines the contribution process I'm wondering
> if we should reconsider our process. Any thoughts? Anyone work on another
> Apache project that does things differently and has pro/con to share?
> 
> Regards,
> 
> Patrick



Re: Question on ZK commit/patch policy.

2019-02-27 Thread Jordan Zimmerman
Note: we discuss this on our wiki too: 
https://cwiki.apache.org/confluence/display/CURATOR/Submitting+Pull+Requests

> On Feb 27, 2019, at 11:37 AM, Patrick Hunt  wrote:
> 
> Historically we've only committed changes that have an associated JIRA. Now
> with the move to gitbox we are seeing increased submissions (PRs) that
> don't include a JIRA - I just committed one and then realized that it
> didn't include a JIRA (sorry about that!). Given github and the recent move
> to gitbox significantly streamlines the contribution process I'm wondering
> if we should reconsider our process. Any thoughts? Anyone work on another
> Apache project that does things differently and has pro/con to share?
> 
> Regards,
> 
> Patrick



[jira] [Created] (ZOOKEEPER-3269) Testable facade would benefit from a queueEvent() method

2019-02-03 Thread Jordan Zimmerman (JIRA)
Jordan Zimmerman created ZOOKEEPER-3269:
---

 Summary: Testable facade would benefit from a queueEvent() method
 Key: ZOOKEEPER-3269
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3269
 Project: ZooKeeper
  Issue Type: New Feature
  Components: java client
Reporter: Jordan Zimmerman
 Fix For: 3.6.0


For testing and other reasons it would be very useful to add a way to inject an 
event into ZooKeeper's event queue. ZooKeeper already has the {{Testable}} 
facade for features such as this (low level, backdoor, testing, etc.). This 
queueEvent method would be particularly helpful to Apache Curator and we'd very 
much appreciate its inclusion.

The method should have the signature:

{code}
void queueEvent(WatchedEvent event);
{code}

Calling this would have the effect of queueing an event into the client's queue.
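
A minimal sketch of how such a hook might behave, using a toy in-memory 
queue rather than ZooKeeper's real client internals. ToyEvent, ToyClient and 
drainEvents() are hypothetical names invented for illustration; only the 
method name queueEvent comes from the request above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

// Hypothetical stand-in for WatchedEvent: just an event type and a path.
final class ToyEvent {
    final String type;
    final String path;

    ToyEvent(String type, String path) {
        this.type = type;
        this.path = path;
    }
}

// Hypothetical stand-in for a client that delivers events to a watcher
// from an internal queue.
final class ToyClient {
    private final BlockingQueue<ToyEvent> eventQueue = new LinkedBlockingQueue<>();
    private final Consumer<ToyEvent> watcher;

    ToyClient(Consumer<ToyEvent> watcher) {
        this.watcher = watcher;
    }

    // The test hook: put an event on the same queue that server-originated
    // events would use, so the watcher cannot tell the difference.
    void queueEvent(ToyEvent event) {
        eventQueue.add(event);
    }

    // Deliver everything currently queued, in FIFO order, and return the
    // number of events delivered. A real client would do this on a
    // dedicated event thread.
    int drainEvents() {
        int delivered = 0;
        ToyEvent event;
        while ((event = eventQueue.poll()) != null) {
            watcher.accept(event);
            delivered++;
        }
        return delivered;
    }
}

final class QueueEventDemo {
    public static void main(String[] args) {
        List<String> seen = new ArrayList<>();
        ToyClient client = new ToyClient(e -> seen.add(e.type + ":" + e.path));

        // Inject two synthetic events, then let the client dispatch them.
        client.queueEvent(new ToyEvent("NodeDeleted", "/locks/a"));
        client.queueEvent(new ToyEvent("Expired", ""));
        client.drainEvents();

        System.out.println(seen);
    }
}
```

The point of routing injected events through the same queue is that a test 
can exercise watcher code paths (for example session expiration handling) 
without having to provoke the condition on a real server.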



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Leader election

2018-12-06 Thread Jordan Zimmerman
> Old service leader will detect network partition max 15 seconds after it
> happened.

If the old service leader is in a very long GC it will not detect the 
partition. In the face of VM pauses, etc. it's not possible to avoid 2 leaders 
for a short period of time.

-JZ

Re: Leader election

2018-12-06 Thread Jordan Zimmerman
> it seems like the
> inconsistency may be caused by the partition of the Zookeeper cluster
> itself

Yes - there are many ways in which you can end up with 2 leaders. However, if 
properly tuned and configured, it will be for a few seconds at most. During a 
GC pause no work is being done anyway. But, this stuff is very tricky. 
Requiring an atomically unique leader is actually a design smell and you should 
reconsider your architecture.

> Maybe we can use a more
> lightweight Hazelcast for example?

There is no distributed system that can guarantee a single leader. Instead you 
need to adjust your design and algorithms to deal with this (using optimistic 
locking, etc.).
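
To make the optimistic locking idea concrete, here is a minimal sketch under 
stated assumptions: VersionedStore is a hypothetical in-memory store, not a 
ZooKeeper or Curator class. Every write carries the version the writer last 
read, and the store rejects the write if anything changed in between, so a 
stale leader cannot clobber the new leader's work. ZooKeeper's own 
setData(path, data, version) takes an expected version in a similar spirit.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical in-memory store holding one value plus a version counter.
final class VersionedStore {
    // Immutable snapshot of the stored value and its version.
    static final class Snapshot {
        final String value;
        final int version;

        Snapshot(String value, int version) {
            this.value = value;
            this.version = version;
        }
    }

    private final AtomicReference<Snapshot> current =
            new AtomicReference<>(new Snapshot("", 0));

    Snapshot read() {
        return current.get();
    }

    // Conditional write: succeeds only if nobody has written since the
    // caller read expectedVersion. Returns false instead of overwriting.
    boolean writeIfVersion(String newValue, int expectedVersion) {
        Snapshot seen = current.get();
        if (seen.version != expectedVersion) {
            return false;
        }
        return current.compareAndSet(seen, new Snapshot(newValue, expectedVersion + 1));
    }
}

final class OptimisticLockingDemo {
    public static void main(String[] args) {
        VersionedStore store = new VersionedStore();

        // Old leader reads, then stalls (for example in a long GC pause).
        int oldLeaderVersion = store.read().version;

        // Meanwhile a new leader is elected, reads, and writes.
        int newLeaderVersion = store.read().version;
        boolean newLeaderWrote = store.writeIfVersion("from-new-leader", newLeaderVersion);

        // Old leader wakes up still believing it leads and tries to write
        // with the version it read before the pause.
        boolean oldLeaderWrote = store.writeIfVersion("from-old-leader", oldLeaderVersion);

        System.out.println("new leader wrote: " + newLeaderWrote);
        System.out.println("old leader wrote: " + oldLeaderWrote);
        System.out.println("stored value: " + store.read().value);
    }
}
```

With this shape, two clients briefly believing they are leader is tolerated 
rather than prevented: correctness comes from the conditional write, not from 
the election.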

-Jordan

> On Dec 6, 2018, at 1:52 PM, Michael Borokhovich  wrote:
> 
> Thanks Jordan,
> 
> Yes, I will try Curator.
> Also, beyond the problem described in the Tech Note, it seems like the
> inconsistency may be caused by the partition of the Zookeeper cluster
> itself. E.g., if a "leader" client is connected to the partitioned ZK node,
> it may be notified not at the same time as the other clients connected to
> the other ZK nodes. So, another client may take leadership while the
> current leader still unaware of the change. Is it true?
> 
> Another follow up question. If Zookeeper can guarantee a single leader, is
> it worth using it just for leader election? Maybe we can use a more
> lightweight Hazelcast for example?
> 
> Michael.
> 
> 
> On Thu, Dec 6, 2018 at 4:50 AM Jordan Zimmerman 
> wrote:
> 
>> It is not possible to achieve the level of consistency you're after in an
>> eventually consistent system such as ZooKeeper. There will always be an
>> edge case where two ZooKeeper clients will believe they are leaders (though
>> for a short period of time). In terms of how it affects Apache Curator, we
>> have this Tech Note on the subject:
>> https://cwiki.apache.org/confluence/display/CURATOR/TN10 (the
>> description is true for any ZooKeeper client, not just Curator clients). If
>> you do still intend to use a ZooKeeper lock/leader I suggest you try Apache
>> Curator as writing these "recipes" is not trivial and has many gotchas
>> that aren't obvious.
>> 
>> -Jordan
>> 
>> http://curator.apache.org
>> 
>> 
>>> On Dec 5, 2018, at 6:20 PM, Michael Borokhovich 
>> wrote:
>>> 
>>> Hello,
>>> 
>>> We have a service that runs on 3 hosts for high availability. However, at
>>> any given time, exactly one instance must be active. So, we are thinking
>> to
>>> use Leader election using Zookeeper.
>>> To this goal, on each service host we also start a ZK server, so we have
>> a
>>> 3-nodes ZK cluster and each service instance is a client to its dedicated
>>> ZK server.
>>> Then, we implement a leader election on top of Zookeeper using a basic
>>> recipe:
>>> https://zookeeper.apache.org/doc/r3.1.2/recipes.html#sc_leaderElection.
>>> 
>>> I have the following questions and doubts regarding the approach:
>>> 
>>> 1. It seems like we can run into inconsistency issues when network
>>> partition occurs. Zookeeper documentation says that the inconsistency
>>> period may last “tens of seconds”. Am I understanding correctly that
>> during
>>> this time we may have 0 or 2 leaders?
>>> 2. Is it possible to reduce this inconsistency time (let's say to 3
>>> seconds) by tweaking tickTime and syncLimit parameters?
>>> 3. Is there a way to guarantee exactly one leader all the time? Should we
>>> implement a more complex leader election algorithm than the one suggested
>>> in the recipe (using ephemeral_sequential nodes)?
>>> 
>>> Thanks,
>>> Michael.
>> 
>> 



Re: Leader election

2018-12-06 Thread Jordan Zimmerman
It is not possible to achieve the level of consistency you're after in an 
eventually consistent system such as ZooKeeper. There will always be an edge 
case where two ZooKeeper clients will believe they are leaders (though for a 
short period of time). In terms of how it affects Apache Curator, we have this 
Tech Note on the subject: 
https://cwiki.apache.org/confluence/display/CURATOR/TN10 
(the description is true for any ZooKeeper client, not just Curator clients). 
If you do still intend to use a ZooKeeper lock/leader I suggest you try Apache 
Curator as writing these "recipes" is not trivial and has many gotchas that 
aren't obvious. 

-Jordan

http://curator.apache.org 


> On Dec 5, 2018, at 6:20 PM, Michael Borokhovich  wrote:
> 
> Hello,
> 
> We have a service that runs on 3 hosts for high availability. However, at
> any given time, exactly one instance must be active. So, we are thinking to
> use Leader election using Zookeeper.
> To this goal, on each service host we also start a ZK server, so we have a
> 3-nodes ZK cluster and each service instance is a client to its dedicated
> ZK server.
> Then, we implement a leader election on top of Zookeeper using a basic
> recipe:
> https://zookeeper.apache.org/doc/r3.1.2/recipes.html#sc_leaderElection.
> 
> I have the following questions and doubts regarding the approach:
> 
> 1. It seems like we can run into inconsistency issues when network
> partition occurs. Zookeeper documentation says that the inconsistency
> period may last “tens of seconds”. Am I understanding correctly that during
> this time we may have 0 or 2 leaders?
> 2. Is it possible to reduce this inconsistency time (let's say to 3
> seconds) by tweaking tickTime and syncLimit parameters?
> 3. Is there a way to guarantee exactly one leader all the time? Should we
> implement a more complex leader election algorithm than the one suggested
> in the recipe (using ephemeral_sequential nodes)?
> 
> Thanks,
> Michael.



Re: [VOTE] Migrate ZK to Maven build

2018-04-23 Thread Jordan Zimmerman
+1 (non binding)

> On Apr 23, 2018, at 6:21 PM, Mohammad arshad  
> wrote:
> 
> +1
> 
> -Original Message-
> From: Andor Molnar [mailto:an...@cloudera.com] 
> Sent: Monday, April 23, 2018 4:43 PM
> To: dev@zookeeper.apache.org
> Subject: Re: [VOTE] Migrate ZK to Maven build
> 
> +1 (non-binding)
> 
> On Mon, Apr 23, 2018 at 10:30 AM, Tamas Penzes  wrote:
> 
>> +1 (non-binding)
>> 
>> On Fri, Apr 20, 2018 at 4:06 PM, Norbert Kalmar 
>> wrote:
>> 
>>> Hi,
>>> 
>>> Let's start a vote on migrating to maven instead of ant.
>>> https://issues.apache.org/jira/browse/ZOOKEEPER-3021
>>> 
>>> *Shall we migrate ZooKeeper build from ant to Maven?*
>>> 
>>> Please reply with [Yes / +1] or [No / -1] to this thread.
>>> 
>>> Thanks,
>>> Norbert
>>> 
>> 
>> 
>> 
>> --
>> *Tamás Pénzes* | Engineering Manager
>> e. tam...@cloudera.com
>> cloudera.com 
>> 
>> --
>> 



Re: [VOTE] Upgrade 3.5 and trunk to Java8

2018-03-22 Thread Jordan Zimmerman
+1 (non binding)

> On Mar 22, 2018, at 12:57 PM, Andor Molnar  wrote:
> 
> Hi all,
> 
> Let's start the vote on upgrading to Java8.
> https://issues.apache.org/jira/browse/ZOOKEEPER-3002
> 
> *Shall we upgrade the minimum required Java version to compile and run
> ZooKeeper on 3.5 and master branches to Java 1.8?*
> 
> Please cast your vote by replying 'Yes' or 'No' to this thread.
> 
> Thanks,
> Andor



[jira] [Commented] (ZOOKEEPER-2963) standalone

2018-02-14 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16364826#comment-16364826
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2963:
-

This is my favorite bug of all time.

> standalone
> --
>
> Key: ZOOKEEPER-2963
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2963
> Project: ZooKeeper
>  Issue Type: Bug
>Reporter: wu xiaoxue
>Assignee: maoling
>Priority: Major
>
> Today is Valentine's Day.I am still a single dog.
> When reading this line code annotation, I burst into tear.
> My New Year's Resolution is girlfriend(s)





Re: Criticism on ZK

2018-02-13 Thread Jordan Zimmerman
> • Unlike Kafka it does not have a vibrant and huge community (merge those 
> PR’s please, anyone?)

This is clearly true. The community was active 5 or so years ago but in the 
past few years it's almost non-existent. Patrick is the only active committer. 
It can take years (!!) and numerous cajoling emails to get engagement on pull 
requests. Releases happen only once or twice a year. The worst culprit has been 
the so-called alpha/beta of 3.5.x. Whatever the beliefs of the ZooKeeper team 
are, 3.5.x has been in production at major tech companies for _years_ yet it's 
still treated as a non-released version. Even if we were to accept the 
alpha/beta label, the original 3.5.0 alpha was 3 and a half years ago! That's 
crazy and has contributed dramatically to the negative perception of ZK.

> It uses a protocol which is hard to understand and it’s hard to maintain a 
> large Zookeeper cluster

This is a red herring. Raft may be easy to understand from the whitepaper but 
any distributed protocol is difficult in practice. Further, no user of a tool 
such as etcd or ZooKeeper remotely cares about the protocol. That's an 
implementation detail.

> It’s a bit outdated, compared say with Raft

Another red herring. Raft and ZAB are, essentially, the same protocol.

> It’s written in Java (yes, it’s opinionated but this is a problem for us as 
> ZK is an infrastructure component)

There is a current bias against Java. The reasons for this are beyond the scope 
of what we can discuss here. But, in my view, it's ludicrous. That said, the 
non-Java clients for ZooKeeper are lacking and this is a problem. I don't 
believe there is a good Go client for ZooKeeper for example.

> We run everything in Kubernetes and k8s by default has an in-built Raft 
> implementation, etcd

etcd is a good key/value store. However, I'm not sure how well it does for 
leaders/locks/etc. at scale. Also, is there a good Java/JVM client for it? I 
know they've been working on one but what is its status? We are working 
against trends in the DevOps world here. DevOps has moved almost entirely to Go 
and the Hashicorp borg. If it's not in Go they're not really interested. This 
is not a problem for ZooKeeper as it addresses a different space - 
applications. But, the Ops people IMO confuse the two products and think "we 
already have etcd why do we need another system to support." A good white paper 
detailing the real differences between etcd/consul and ZooKeeper is needed.

> Linearizability (if there is a word like this) - check this comparison chart

This is just wrong. All operations in ZooKeeper are ordered. This, I think, 
comes up when using etcd as a k/v store. These two use cases, 
locks/leaders/register vs k/v store keep coming up. ZooKeeper is not a 
database. etcd _can_ be used as a k/v store.

> Performance and inherent scalability issues

ZK's performance is better than etcd's AFAIK for the use cases it was designed 
for. However, operating ZooKeeper can be a bear. I know that it's very 
difficult to find qualified ops engineers who can manage ZK ensembles at high 
scale. In particular, if ZK is used as a quasi-database it can be very 
difficult to operate (we're having that problem at Elasticsearch Cloud).

> Client side complexity and thick clients

Well, as the author of Apache Curator, I don't see why this is a problem. What 
does it matter whether the client or the server does most of the work? It's opaque 
to application writers. In any event, most of the "recipes" in Curator are not 
in-the-box with consul/etcd. These need to be written and then you have a thick 
client again. Most of the things you want to do with ZooKeeper are already 
implemented in Curator. However, if you're not on the JVM you don't get those. 

> Lack of service discovery

Curator has had Service Discovery since its beginning: 
http://curator.apache.org/curator-x-discovery/index.html 
 

-Jordan

> On Feb 13, 2018, at 6:02 AM, Flavio Junqueira  wrote:
> 
> Hello community,
> 
> I came across this blog post:
> 
>  https://banzaicloud.com/blog/kafka-on-etcd/
> 
> And I thought it would be a good idea to discuss the criticism as a 
> community. Let me copy the points here and add some notes:
> 
>   • Unlike Kafka it does not have a vibrant and huge community (merge 
> those PR’s please, anyone?)
> I have personally met and worked with a lot of great people in this community 
> over the years, and as such, I probably have a pretty biased view. But, it is 
> a common concern that we are not fast enough at responding. We also don't 
> have conferences and large meetups compared to other communities. Are those 
> really necessary, though? What can we do to be a better community?
> 
>   • It uses a protocol which is hard to understand and it’s hard to 
> maintain a large Zookeeper cluster
> I can't really speak for the hard to understand part, and I don't understand 
> what 

Re: [ANNOUNCE] New ZooKeeper committer: Abraham Fine

2018-01-30 Thread Jordan Zimmerman
Gratz!

> On Jan 30, 2018, at 1:35 PM, Brian Nixon  wrote:
> 
> Congratulations, Abe!
> 
> On Tue, Jan 30, 2018 at 3:23 AM, Michelle Tan  wrote:
> 
>> Congratulations Abe! :D
>> 
>> Regards,
>> Michelle
>> 
>> On Tue, Jan 30, 2018 at 11:18 AM, 岭秀  wrote:
>> 
>>> Congratulations to Abe!  A well-deserved honor
>>> 
>>> 
>>> -
>>>> On Tue, Jan 30, 2018 at 1:22 AM, Patrick Hunt  wrote:
>>>>
>>>>> The Apache ZooKeeper PMC recently extended committer karma to Abe and he
>>>>> has accepted. Abe has made some great contributions and we are looking
>>>>> forward to even more :)
>>>>>
>>>>> Congratulations and welcome aboard Abe!
>>>>>
>>>>> Patrick
>>>>>
>>>>
>>> 
>> 



[jira] [Created] (ZOOKEEPER-2971) Create release notes for 3.5.4

2018-01-28 Thread Jordan Zimmerman (JIRA)
Jordan Zimmerman created ZOOKEEPER-2971:
---

 Summary: Create release notes for 3.5.4
 Key: ZOOKEEPER-2971
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2971
 Project: ZooKeeper
  Issue Type: Improvement
  Components: documentation
Affects Versions: 3.5.3
Reporter: Jordan Zimmerman
Assignee: Patrick Hunt
 Fix For: 3.5.4


ZOOKEEPER-2901 and ZOOKEEPER-2903 fix a serious bug with TTL nodes in 3.5.3. 
The release notes for 3.5.4 should describe the problem and how it was 
worked-around/fixed.





Re: Failed: ZOOKEEPER- PreCommit Build #1427

2018-01-26 Thread Jordan Zimmerman

> [exec] -1 findbugs.  The patch appears to introduce 1 new Findbugs 
> (version 3.0.1) warnings.


-- RV_RETURN_VALUE_IGNORED_NO_SIDE_EFFECT: Return value of method without side 
effect is ignored


This findbugs warning is wrong. It's not noticing that there's an override for 
the method in the Enum for TTL. How do we handle this? I don't see that the 
Findbugs annotations are included in the build.
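
A small sketch of the pattern that can trigger this kind of report. The names 
NodeKind, checkedValue and validate are hypothetical and this is not the 
actual ZooKeeper code: the base method on the enum has no side effect, but 
one constant overrides it to throw on bad input, so calling it and ignoring 
the result is meaningful for that constant.

```java
// Hypothetical enum: the default method is a pure pass-through, while the
// TTL constant overrides it with validation that can throw.
enum NodeKind {
    NORMAL,
    TTL {
        @Override
        long checkedValue(long raw) {
            if (raw < 0) {
                throw new IllegalArgumentException("negative TTL: " + raw);
            }
            return raw;
        }
    };

    // Base implementation: no validation and no side effect. An analysis
    // that only considers this body concludes that ignoring the return
    // value makes the call pointless.
    long checkedValue(long raw) {
        return raw;
    }
}

final class FindbugsOverrideDemo {
    // The return value is ignored on purpose; the call is made only so
    // that the TTL override can reject invalid input.
    static void validate(NodeKind kind, long raw) {
        kind.checkedValue(raw);
    }

    public static void main(String[] args) {
        validate(NodeKind.NORMAL, -1L); // passes: base method does nothing

        try {
            validate(NodeKind.TTL, -1L); // throws: the override validates
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Two possible ways to handle it, neither settled here: restructure the call 
so the result is used (or the validation lives in a void method), or 
suppress the report, for example with a FindBugs exclude filter file if the 
annotations jar is not on the build path.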

-Jordan

[jira] [Commented] (ZOOKEEPER-984) jenkins failure in testSessionMoved - NPE in quorum

2018-01-03 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16310264#comment-16310264
 ] 

Jordan Zimmerman commented on ZOOKEEPER-984:


FWIW - we just saw this in a 3.5.3 instance. 

> jenkins failure in testSessionMoved - NPE in quorum
> ---
>
> Key: ZOOKEEPER-984
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-984
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.3.2
>Reporter: Patrick Hunt
>Priority: Blocker
> Fix For: 3.5.0
>
> Attachments: consoleText16.txt
>
>
> Got the following NPE on my internal jenkins setup running against released 
> 3.3.2 (see attached log)
> {noformat}
> [junit] 2011-02-06 10:39:56,988 - WARN  
> [QuorumPeer:/0.0.0.0:11365:Follower@116] - Got zxid 0x10001 expected 0x1
> [junit] 2011-02-06 10:39:56,988 - INFO  [SyncThread:3:FileTxnLog@197] - 
> Creating new log file: log.10001
> [junit] 2011-02-06 10:39:56,989 - WARN  
> [QuorumPeer:/0.0.0.0:11364:Follower@116] - Got zxid 0x10001 expected 0x1
> [junit] 2011-02-06 10:39:56,989 - INFO  [SyncThread:2:FileTxnLog@197] - 
> Creating new log file: log.10001
> [junit] 2011-02-06 10:39:56,990 - WARN  
> [QuorumPeer:/0.0.0.0:11363:Follower@116] - Got zxid 0x10001 expected 0x1
> [junit] 2011-02-06 10:39:56,990 - INFO  [SyncThread:5:FileTxnLog@197] - 
> Creating new log file: log.10001
> [junit] 2011-02-06 10:39:56,990 - WARN  
> [QuorumPeer:/0.0.0.0:11366:Follower@116] - Got zxid 0x10001 expected 0x1
> [junit] 2011-02-06 10:39:56,990 - INFO  [SyncThread:1:FileTxnLog@197] - 
> Creating new log file: log.10001
> [junit] 2011-02-06 10:39:56,991 - INFO  [SyncThread:4:FileTxnLog@197] - 
> Creating new log file: log.10001
> [junit] 2011-02-06 10:39:56,995 - INFO  
> [main-SendThread(localhost.localdomain:11363):ClientCnxn$SendThread@738] - 
> Session establishment complete on server 
> localhost.localdomain/127.0.0.1:11363, sessionid = 0x12dfc45e6dd, 
> negotiated timeout = 3
> [junit] 2011-02-06 10:39:56,996 - INFO  
> [CommitProcessor:1:NIOServerCnxn@1580] - Established session 
> 0x12dfc45e6dd with negotiated timeout 3 for client /127.0.0.1:37810
> [junit] 2011-02-06 10:39:56,999 - INFO  [main:ZooKeeper@436] - Initiating 
> client connection, connectString=127.0.0.1:11364 sessionTimeout=3 
> watcher=org.apache.zookeeper.test.QuorumTest$5@248523a0 
> sessionId=85001345146093568 sessionPasswd=
> [junit] 2011-02-06 10:39:57,000 - INFO  
> [main-SendThread():ClientCnxn$SendThread@1041] - Opening socket connection to 
> server /127.0.0.1:11364
> [junit] 2011-02-06 10:39:57,000 - INFO  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11364:NIOServerCnxn$Factory@251] - 
> Accepted socket connection from /127.0.0.1:36682
> [junit] 2011-02-06 10:39:57,001 - INFO  
> [main-SendThread(localhost.localdomain:11364):ClientCnxn$SendThread@949] - 
> Socket connection established to localhost.localdomain/127.0.0.1:11364, 
> initiating session
> [junit] 2011-02-06 10:39:57,002 - INFO  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11364:NIOServerCnxn@770] - Client 
> attempting to renew session 0x12dfc45e6dd at /127.0.0.1:36682
> [junit] 2011-02-06 10:39:57,002 - INFO  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11364:Learner@95] - Revalidating 
> client: 85001345146093568
> [junit] 2011-02-06 10:39:57,003 - INFO  
> [QuorumPeer:/0.0.0.0:11364:NIOServerCnxn@1580] - Established session 
> 0x12dfc45e6dd with negotiated timeout 3 for client /127.0.0.1:36682
> [junit] 2011-02-06 10:39:57,004 - INFO  
> [main-SendThread(localhost.localdomain:11364):ClientCnxn$SendThread@738] - 
> Session establishment complete on server 
> localhost.localdomain/127.0.0.1:11364, sessionid = 0x12dfc45e6dd, 
> negotiated timeout = 3
> [junit] 2011-02-06 10:39:57,005 - WARN  
> [CommitProcessor:2:NIOServerCnxn@1524] - Unexpected exception. Destruction 
> averted.
> [junit] java.lang.NullPointerException
> [junit]   at 
> org.apache.jute.BinaryOutputArchive.writeRecord(BinaryOutputArchive.java:123)
> [junit]   at 
> org.apache.zookeeper.proto.SetDataResponse.serialize(SetDataResponse.java:40)
> [junit]   at 
> org.apache.jute.BinaryOutputArchive.writeRecord(BinaryOutputArchive.java:123)
> [junit]   at 
> org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1500)
> [junit]   at 
> org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.jav

[jira] [Commented] (ZOOKEEPER-2901) Session ID that is negative causes mis-calculation of Ephemeral Type

2017-11-27 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16267445#comment-16267445
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2901:
-

[~phunt] - the ttlNodesEnabled now applies to stand alone mode too. I ported 
this to ZOOKEEPER-2903 as well.

> Session ID that is negative causes mis-calculation of Ephemeral Type
> 
>
> Key: ZOOKEEPER-2901
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2901
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3
> Environment: Running 3.5.3-beta in Docker container
>Reporter: Mark Johnson
>    Assignee: Jordan Zimmerman
>Priority: Blocker
>
> In the code that determines the EphemeralType it is looking at the owner 
> (which is the client ID or connection ID):
> EphemeralType.java:
>     public static EphemeralType get(long ephemeralOwner) {
>         if (ephemeralOwner == CONTAINER_EPHEMERAL_OWNER) {
>             return CONTAINER;
>         }
>         if (ephemeralOwner < 0) {
>             return TTL;
>         }
>         return (ephemeralOwner == 0) ? VOID : NORMAL;
>     }
> However my connection ID is:
> header.getClientId(): -720548323429908480
> This causes the code to think this is a TTL Ephemeral node instead of a
> NORMAL Ephemeral node.
> This also explains why this is random - if my client ID is non-negative
> then the node gets added correctly.
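
The misclassification described above can be reproduced with a 
self-contained copy of the quoted logic. ToyEphemeralType is a hypothetical 
stand-in for the real class, and the value used for 
CONTAINER_EPHEMERAL_OWNER (Long.MIN_VALUE) is an assumption made for the 
sketch.

```java
// Stand-alone copy of the classification logic from the report.
enum ToyEphemeralType {
    VOID, NORMAL, CONTAINER, TTL;

    // Assumption for this sketch: the container marker is Long.MIN_VALUE.
    static final long CONTAINER_EPHEMERAL_OWNER = Long.MIN_VALUE;

    static ToyEphemeralType get(long ephemeralOwner) {
        if (ephemeralOwner == CONTAINER_EPHEMERAL_OWNER) {
            return CONTAINER;
        }
        if (ephemeralOwner < 0) {
            // Any owner with the sign bit set lands here, including a
            // perfectly ordinary session ID that happens to be negative.
            return TTL;
        }
        return (ephemeralOwner == 0) ? VOID : NORMAL;
    }
}

final class EphemeralTypeDemo {
    public static void main(String[] args) {
        // The session ID from the report: negative, so treated as TTL.
        long reportedSessionId = -720548323429908480L;
        System.out.println(reportedSessionId + " -> "
                + ToyEphemeralType.get(reportedSessionId));

        // A positive session ID is classified as a normal ephemeral owner.
        long positiveSessionId = 85001345146093568L;
        System.out.println(positiveSessionId + " -> "
                + ToyEphemeralType.get(positiveSessionId));
    }
}
```

The sketch only demonstrates the symptom: the sign bit alone cannot 
distinguish a TTL marker from a session ID, so whichever fix is chosen needs 
some other way to tell the two apart.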



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: Let's cut a ZK 3.5.4-beta release

2017-11-21 Thread Jordan Zimmerman
> Afaict the only real blocker for the release at this point is
> ZOOKEEPER-2901 - Jordan can you resolve the comments, after which we should
> be good to go. LMK if there's anything I'm missing.

I'll have this done in the next day or so. Please wait for me if you can!

-Jordan

[jira] [Commented] (ZOOKEEPER-2901) Session ID that is negative causes mis-calculation of Ephemeral Type

2017-11-07 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16242349#comment-16242349
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2901:
-

[~phunt] Anything else needed before this can be merged along with 
ZOOKEEPER-2903?


> Session ID that is negative causes mis-calculation of Ephemeral Type
> 
>
> Key: ZOOKEEPER-2901
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2901
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3
> Environment: Running 3.5.3-beta in Docker container
>Reporter: Mark Johnson
>    Assignee: Jordan Zimmerman
>Priority: Blocker
>
> In the code that determines the EphemeralType it is looking at the owner 
> (which is the client ID or connection ID):
> EphemeralType.java:
>     public static EphemeralType get(long ephemeralOwner) {
>         if (ephemeralOwner == CONTAINER_EPHEMERAL_OWNER) {
>             return CONTAINER;
>         }
>         if (ephemeralOwner < 0) {
>             return TTL;
>         }
>         return (ephemeralOwner == 0) ? VOID : NORMAL;
>     }
> However my connection ID is:
> header.getClientId(): -720548323429908480
> This causes the code to think this is a TTL Ephemeral node instead of a
> NORMAL Ephemeral node.
> This also explains why this is random - if my client ID is non-negative
> then the node gets added correctly.





Re: Meetup at Cloudera

2017-10-23 Thread Jordan Zimmerman
>  One way we considered was
> to move to a primarily chat based system rather than relying entirely on
> emails.

What about Gitter? It was too easy so I created this:

https://gitter.im/apache/apache-zookeeper 


Why not use this?

-Jordan

[jira] [Commented] (ZOOKEEPER-2921) fsyncWarningThresholdMS is applied on each getChannel().force() - also needed on entire commit

2017-10-19 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211143#comment-16211143
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2921:
-

Fair point - I updated the description...

Do we need a new threshold value or is re-using {{fsyncWarningThresholdMS}} 
sufficient?

> fsyncWarningThresholdMS is applied on each getChannel().force() - also needed 
> on entire commit
> --
>
> Key: ZOOKEEPER-2921
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2921
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Affects Versions: 3.5.3
>    Reporter: Jordan Zimmerman
>Priority: Minor
>
> FileTxnLog.commit() has a warning when an individual sync takes longer than 
> {{fsyncWarningThresholdMS}}. However, it would also be useful to warn when 
> the entire commit operation takes longer than {{fsyncWarningThresholdMS}} as 
> this can cause client connection failures. Currently, commit() can take 
> longer than 2/3 of a session but still not log a warning.





[jira] [Updated] (ZOOKEEPER-2921) fsyncWarningThresholdMS is applied on each getChannel().force() - also needed on entire commit

2017-10-19 Thread Jordan Zimmerman (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jordan Zimmerman updated ZOOKEEPER-2921:

Description: FileTxnLog.commit() has a warning when an individual sync 
takes longer than {{fsyncWarningThresholdMS}}. However, it would also be useful 
to warn when the entire commit operation takes longer than 
{{fsyncWarningThresholdMS}} as this can cause client connection failures. 
Currently, commit() can take longer than 2/3 of a session but still not log a 
warning.  (was: FileTxnLog.commit() has a warning when an individual sync takes 
longer than {{fsyncWarningThresholdMS}}. However, it would be more useful to 
warn when the entire commit operation takes longer than 
{{fsyncWarningThresholdMS}} as this can cause client connection failures. 
Currently, commit() can take longer than 2/3 of a session but still not log a 
warning.)

> fsyncWarningThresholdMS is applied on each getChannel().force() - also needed 
> on entire commit
> --
>
> Key: ZOOKEEPER-2921
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2921
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Affects Versions: 3.5.3
>    Reporter: Jordan Zimmerman
>Priority: Minor
>
> FileTxnLog.commit() has a warning when an individual sync takes longer than 
> {{fsyncWarningThresholdMS}}. However, it would also be useful to warn when 
> the entire commit operation takes longer than {{fsyncWarningThresholdMS}} as 
> this can cause client connection failures. Currently, commit() can take 
> longer than 2/3 of a session but still not log a warning.





[jira] [Created] (ZOOKEEPER-2921) fsyncWarningThresholdMS is applied on each getChannel().force() - also needed on entire commit

2017-10-19 Thread Jordan Zimmerman (JIRA)
Jordan Zimmerman created ZOOKEEPER-2921:
---

 Summary: fsyncWarningThresholdMS is applied on each 
getChannel().force() - also needed on entire commit
 Key: ZOOKEEPER-2921
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2921
 Project: ZooKeeper
  Issue Type: Improvement
  Components: server
Affects Versions: 3.5.3
Reporter: Jordan Zimmerman
Priority: Minor


FileTxnLog.commit() has a warning when an individual sync takes longer than 
{{fsyncWarningThresholdMS}}. However, it would be more useful to warn when the 
entire commit operation takes longer than {{fsyncWarningThresholdMS}} as this 
can cause client connection failures. Currently, commit() can take longer than 
2/3 of a session but still not log a warning.
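
A rough sketch of the difference between the two checks, assuming a toy 
model rather than the real FileTxnLog code. CommitTimer, its injectable 
clock and the warning strings are hypothetical; the threshold of 1000 ms in 
the demo is arbitrary.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.LongSupplier;

// Hypothetical model of a commit that syncs several log streams in turn.
final class CommitTimer {
    private final long thresholdMs;
    private final LongSupplier clockMs; // injectable so the sketch is deterministic
    final List<String> warnings = new ArrayList<>();

    CommitTimer(long thresholdMs, LongSupplier clockMs) {
        this.thresholdMs = thresholdMs;
        this.clockMs = clockMs;
    }

    void commit(List<Runnable> syncs) {
        long commitStart = clockMs.getAsLong();

        for (int i = 0; i < syncs.size(); i++) {
            long syncStart = clockMs.getAsLong();
            syncs.get(i).run();
            long syncMs = clockMs.getAsLong() - syncStart;

            // Per-sync check: warns only when a single sync is slow.
            if (syncMs > thresholdMs) {
                warnings.add("sync " + i + " took " + syncMs + "ms");
            }
        }

        // Whole-commit check: catches many moderately slow syncs that
        // together exceed the threshold.
        long totalMs = clockMs.getAsLong() - commitStart;
        if (totalMs > thresholdMs) {
            warnings.add("commit took " + totalMs + "ms");
        }
    }
}

final class CommitTimerDemo {
    public static void main(String[] args) {
        long[] now = {0}; // fake clock, advanced by each simulated sync
        CommitTimer timer = new CommitTimer(1000, () -> now[0]);

        // Each simulated sync takes 600 ms: under the threshold on its own.
        Runnable sync600 = () -> now[0] += 600;
        timer.commit(Arrays.asList(sync600, sync600, sync600));

        System.out.println(timer.warnings);
    }
}
```

In the demo no individual sync crosses the threshold, so the per-sync check 
alone stays silent even though the commit as a whole ran well past it.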





Re: Meetup at Cloudera

2017-10-19 Thread Jordan Zimmerman
It would be awesome if it can be recorded for those of us who can't attend.

-Jordan

> On Oct 18, 2017, at 7:21 PM, Patrick Hunt <ph...@apache.org> wrote:
> 
> That link didn't work for me - perhaps because of the "preview" portion of
> the URL?
> 
> I ended up using:
> https://www.meetup.com/Bay-Area-Cloudera-User-Group/events/244294554/
> 
> Patrick
> 
> 
> On Wed, Oct 18, 2017 at 9:58 AM, Abraham Fine <af...@apache.org> wrote:
> 
>> Yes, there is still going to be a meetup. Here is the meetup page,
>> please sign up here:
>> https://www.meetup.com/preview/Bay-Area-Cloudera-
>> User-Group/events/244294554
>> 
>> There will be food and beer!
>> 
>> On Wed, Oct 18, 2017, at 07:09, Benjamin Reed wrote:
>>> is there still going to be a meetup tomorrow? i don't see an
>>> announcement anywhere.
>>> 
>>> ben
>>> 
>>> On Fri, Oct 13, 2017 at 12:02 PM, Jordan Zimmerman
>>> <jor...@jordanzimmerman.com> wrote:
>>>> Damn - I'm in Germany - that's 2 in the morning. Can't make it. If
>> it's the week after, though, I can do it.
>>>> 
>>>> -Jordan
>>>> 
>>>>> On Oct 13, 2017, at 9:00 PM, Abraham Fine <af...@apache.org> wrote:
>>>>> 
>>>>> The current plan is 5PM-8PM PST, is that acceptable?
>>>>> 
>>>>> Abe
>>>>> 
>>>>> On Fri, Oct 13, 2017, at 11:43, Jordan Zimmerman wrote:
>>>>>> OK - however, it's turning out the Oct 19 may not work for me. It
>> depends
>>>>>> on the time. The following week is much better. FYI
>>>>>> 
>>>>>>> On Oct 13, 2017, at 8:00 PM, Abraham Fine <af...@apache.org> wrote:
>>>>>>> 
>>>>>>> Hey Jordan-
>>>>>>> 
>>>>>>> I think having a presentation on persistent watches would be great.
>>>>>>> 
>>>>>>> I'll send out info on joining remotely as soon as it's available to
>> me.
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> Abe
>>>>>>> 
>>>>>>> On Wed, Oct 11, 2017, at 00:40, Jordan Zimmerman wrote:
>>>>>>>> As usual I'd like to attend remotely if that's possible. I'm in
>> Europe
>>>>>>>> until the 25th though but if it's at the right time I can present
>> on some
>>>>>>>> nice new features in Curator or possibly the work I've been doing
>> for
>>>>>>>> Persistent Watches in ZooKeeper itself.
>>>>>>>> 
>>>>>>>> -Jordan
>>>>>>>> 
>>>>>>>>> On Oct 11, 2017, at 1:06 AM, Abraham Fine <af...@apache.org>
>> wrote:
>>>>>>>>> 
>>>>>>>>> Hello ZooKeeper Community-
>>>>>>>>> 
>>>>>>>>> It has been a while since our last meetup and it would be great
>> to bring
>>>>>>>>> everyone together again. Cloudera would be able to host a meetup
>> at our
>>>>>>>>> headquarters in Palo Alto, CA next week (I'm thinking 10/19).
>>>>>>>>> 
>>>>>>>>> I was hoping to use the mailing lists to gauge interest. Please
>> reply if
>>>>>>>>> you think you would be able to attend or would prefer a different
>> date.
>>>>>>>>> 
>>>>>>>>> Looking forward to hearing from everyone.
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> Abe
>>>>>>>> 
>>>>>> 
>>>> 
>> 



[jira] [Commented] (ZOOKEEPER-2901) Session ID that is negative causes mis-calculation of Ephemeral Type

2017-10-17 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16207880#comment-16207880
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2901:
-

derp - fixed

> Session ID that is negative causes mis-calculation of Ephemeral Type
> 
>
> Key: ZOOKEEPER-2901
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2901
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3
> Environment: Running 3.5.3-beta in Docker container
>Reporter: Mark Johnson
>    Assignee: Jordan Zimmerman
>Priority: Blocker
>
> In the code that determines the EphemeralType it is looking at the owner 
> (which is the client ID or connection ID):
> EphemeralType.java:
>     public static EphemeralType get(long ephemeralOwner) {
>         if (ephemeralOwner == CONTAINER_EPHEMERAL_OWNER) {
>             return CONTAINER;
>         }
>         if (ephemeralOwner < 0) {
>             return TTL;
>         }
>         return (ephemeralOwner == 0) ? VOID : NORMAL;
>     }
> However my connection ID is:
> header.getClientId(): -720548323429908480
> This causes the code to think this is a TTL Ephemeral node instead of a
> NORMAL Ephemeral node.
> This also explains why this is random - if my client ID is non-negative
> then the node gets added correctly.
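
The misclassification described above can be reproduced in isolation. The sketch below copies the quoted get() logic into a standalone class; the enum, the class name, and the value chosen for CONTAINER_EPHEMERAL_OWNER (Long.MIN_VALUE) are assumptions made so the example compiles on its own, not a copy of the ZooKeeper source.

```java
// Standalone illustration of the logic quoted in the issue description.
class EphemeralTypeSketch {
    enum Type { VOID, NORMAL, CONTAINER, TTL }

    // Assumed sentinel value, used here only so the sketch is self-contained.
    static final long CONTAINER_EPHEMERAL_OWNER = Long.MIN_VALUE;

    // Same decision order as the quoted EphemeralType.get().
    static Type get(long ephemeralOwner) {
        if (ephemeralOwner == CONTAINER_EPHEMERAL_OWNER) {
            return Type.CONTAINER;
        }
        if (ephemeralOwner < 0) {
            return Type.TTL;
        }
        return (ephemeralOwner == 0) ? Type.VOID : Type.NORMAL;
    }

    // Unsigned value of the high-order byte of a 64-bit ID.
    static long highByte(long id) {
        return id >>> 56;
    }

    public static void main(String[] args) {
        long reported = -720548323429908480L; // the ID from the report
        System.out.println(get(reported));      // TTL
        System.out.println(highByte(reported)); // 246
    }
}
```

The reported ID has a high-order byte of 246, so its sign bit is set. Any owner ID whose top byte is 128 or more is negative as a long and therefore takes the TTL branch, while IDs with a smaller top byte are classified as NORMAL.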





[jira] [Comment Edited] (ZOOKEEPER-2901) Session ID that is negative causes mis-calculation of Ephemeral Type

2017-10-17 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16207580#comment-16207580
 ] 

Jordan Zimmerman edited comment on ZOOKEEPER-2901 at 10/17/17 4:34 PM:
---

Default is set back to false. I really don't think we need to deprecate the 
high bit as it's now documented and you have to opt in to it. So, keep using 
Server IDs up to 255 unless you want TTLS then it's 254.


was (Author: randgalt):
Default is set back to false. I really don't think we need to deprecate the 
high bit as it's now documented and you have to opt in to it. So, keep using 
Session IDs up to 255 unless you want TTLS then it's 254.

> Session ID that is negative causes mis-calculation of Ephemeral Type
> 
>
> Key: ZOOKEEPER-2901
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2901
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3
> Environment: Running 3.5.3-beta in Docker container
>Reporter: Mark Johnson
>    Assignee: Jordan Zimmerman
>Priority: Blocker
>
> In the code that determines the EphemeralType it is looking at the owner 
> (which is the client ID or connection ID):
> EphemeralType.java:
>     public static EphemeralType get(long ephemeralOwner) {
>         if (ephemeralOwner == CONTAINER_EPHEMERAL_OWNER) {
>             return CONTAINER;
>         }
>         if (ephemeralOwner < 0) {
>             return TTL;
>         }
>         return (ephemeralOwner == 0) ? VOID : NORMAL;
>     }
> However my connection ID is:
> header.getClientId(): -720548323429908480
> This causes the code to think this is a TTL Ephemeral node instead of a
> NORMAL Ephemeral node.
> This also explains why this is random - if my client ID is non-negative
> then the node gets added correctly.





[jira] [Commented] (ZOOKEEPER-2901) Session ID that is negative causes mis-calculation of Ephemeral Type

2017-10-17 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16207580#comment-16207580
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2901:
-

Default is set back to false. I really don't think we need to deprecate the 
high bit as it's now documented and you have to opt in to it. So, keep using 
Session IDs up to 255 unless you want TTLS then it's 254.

> Session ID that is negative causes mis-calculation of Ephemeral Type
> 
>
> Key: ZOOKEEPER-2901
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2901
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3
> Environment: Running 3.5.3-beta in Docker container
>Reporter: Mark Johnson
>    Assignee: Jordan Zimmerman
>Priority: Blocker
>
> In the code that determines the EphemeralType it is looking at the owner 
> (which is the client ID or connection ID):
> EphemeralType.java:
>     public static EphemeralType get(long ephemeralOwner) {
>         if (ephemeralOwner == CONTAINER_EPHEMERAL_OWNER) {
>             return CONTAINER;
>         }
>         if (ephemeralOwner < 0) {
>             return TTL;
>         }
>         return (ephemeralOwner == 0) ? VOID : NORMAL;
>     }
> However my connection ID is:
> header.getClientId(): -720548323429908480
> This causes the code to think this is a TTL Ephemeral node instead of a
> NORMAL Ephemeral node.
> This also explains why this is random - if my client ID is non-negative
> then the node gets added correctly.





Re: Meetup at Cloudera

2017-10-13 Thread Jordan Zimmerman
Damn - I'm in Germany - that's 2 in the morning. Can't make it. If it's the 
week after, though, I can do it.

-Jordan

> On Oct 13, 2017, at 9:00 PM, Abraham Fine <af...@apache.org> wrote:
> 
> The current plan is 5PM-8PM PST, is that acceptable?
> 
> Abe
> 
> On Fri, Oct 13, 2017, at 11:43, Jordan Zimmerman wrote:
>> OK - however, it's turning out the Oct 19 may not work for me. It depends
>> on the time. The following week is much better. FYI
>> 
>>> On Oct 13, 2017, at 8:00 PM, Abraham Fine <af...@apache.org> wrote:
>>> 
>>> Hey Jordan-
>>> 
>>> I think having a presentation on persistent watches would be great.
>>> 
>>> I'll send out info on joining remotely as soon as it's available to me.
>>> 
>>> Thanks,
>>> Abe
>>> 
>>> On Wed, Oct 11, 2017, at 00:40, Jordan Zimmerman wrote:
>>>> As usual I'd like to attend remotely if that's possible. I'm in Europe
>>>> until the 25th though but if it's at the right time I can present on some
>>>> nice new features in Curator or possibly the work I've been doing for
>>>> Persistent Watches in ZooKeeper itself.
>>>> 
>>>> -Jordan
>>>> 
>>>>> On Oct 11, 2017, at 1:06 AM, Abraham Fine <af...@apache.org> wrote:
>>>>> 
>>>>> Hello ZooKeeper Community-
>>>>> 
>>>>> It has been a while since our last meetup and it would be great to bring
>>>>> everyone together again. Cloudera would be able to host a meetup at our
>>>>> headquarters in Palo Alto, CA next week (I'm thinking 10/19).
>>>>> 
>>>>> I was hoping to use the mailing lists to gauge interest. Please reply if
>>>>> you think you would be able to attend or would prefer a different date.
>>>>> 
>>>>> Looking forward to hearing from everyone. 
>>>>> 
>>>>> Thanks,
>>>>> Abe
>>>> 
>> 



Re: Meetup at Cloudera

2017-10-13 Thread Jordan Zimmerman
OK - however, it's turning out that Oct 19 may not work for me. It depends on 
the time. The following week is much better. FYI

> On Oct 13, 2017, at 8:00 PM, Abraham Fine <af...@apache.org> wrote:
> 
> Hey Jordan-
> 
> I think having a presentation on persistent watches would be great.
> 
> I'll send out info on joining remotely as soon as it's available to me.
> 
> Thanks,
> Abe
> 
> On Wed, Oct 11, 2017, at 00:40, Jordan Zimmerman wrote:
>> As usual I'd like to attend remotely if that's possible. I'm in Europe
>> until the 25th though but if it's at the right time I can present on some
>> nice new features in Curator or possibly the work I've been doing for
>> Persistent Watches in ZooKeeper itself.
>> 
>> -Jordan
>> 
>>> On Oct 11, 2017, at 1:06 AM, Abraham Fine <af...@apache.org> wrote:
>>> 
>>> Hello ZooKeeper Community-
>>> 
>>> It has been a while since our last meetup and it would be great to bring
>>> everyone together again. Cloudera would be able to host a meetup at our
>>> headquarters in Palo Alto, CA next week (I'm thinking 10/19).
>>> 
>>> I was hoping to use the mailing lists to gauge interest. Please reply if
>>> you think you would be able to attend or would prefer a different date.
>>> 
>>> Looking forward to hearing from everyone. 
>>> 
>>> Thanks,
>>> Abe
>> 



Re: Let's cut a ZK 3.4.11 release

2017-10-13 Thread Jordan Zimmerman
I have 2 issues I'm really anxious to get in 3.5.4:

https://github.com/apache/zookeeper/pull/378 


https://github.com/apache/zookeeper/pull/332

> On Oct 13, 2017, at 8:25 PM, Enrico Olivelli  wrote:
> 
> It would be great to have a 3.5.4 beta release, 3.5 branch is great as it
> has SSL support, but it still lacks quorum peer auth.
> 
> Thank you
> Enrico
> 
> Il ven 13 ott 2017, 20:12 Abraham Fine  ha scritto:
> 
>> Great idea, +1
>> 
>> On Fri, Oct 13, 2017, at 11:11, Patrick Hunt wrote:
>>> Hi folks, any objection to cutting a 3.4.11 release? It's been awhile
>>> since
>>> 3.4.10 and we have over 50 JIRA that have gone into the 3.4 branch.
>>> 
>>> If there are no objections I'll start the process sometime next week.
>>> 
>>> Regards,
>>> 
>>> Patrick
>> 
> -- 
> 
> 
> -- Enrico Olivelli



Re: Meetup at Cloudera

2017-10-11 Thread Jordan Zimmerman
As usual, I'd like to attend remotely if that's possible. I'm in Europe until 
the 25th, but if the timing works I can present on some nice new features in 
Curator or possibly the work I've been doing on Persistent Watches in 
ZooKeeper itself.

-Jordan

> On Oct 11, 2017, at 1:06 AM, Abraham Fine  wrote:
> 
> Hello ZooKeeper Community-
> 
> It has been a while since our last meetup and it would be great to bring
> everyone together again. Cloudera would be able to host a meetup at our
> headquarters in Palo Alto, CA next week (I'm thinking 10/19).
> 
> I was hoping to use the mailing lists to gauge interest. Please reply if
> you think you would be able to attend or would prefer a different date.
> 
> Looking forward to hearing from everyone. 
> 
> Thanks,
> Abe



[jira] [Commented] (ZOOKEEPER-2503) Inconsistency between myid documentation and implementation

2017-10-03 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16189920#comment-16189920
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2503:
-

FYI - Please consider ZOOKEEPER-2901 when making this change. 

> Inconsistency between myid documentation and implementation
> ---
>
> Key: ZOOKEEPER-2503
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2503
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.4.9, 3.5.2
>Reporter: Michael Han
> Fix For: 3.5.4, 3.6.0, 3.4.11
>
>
> In ZK documentation, we have:
> "The myid file consists of a single line containing only the text of that 
> machine's id. So myid of server 1 would contain the text "1" and nothing 
> else. The id must be unique within the ensemble and should have a value 
> between 1 and 255."
> This however is not enforced in code, which should be fixed either in 
> documentation that we remove the restriction of the range 1-255 or in code we 
> enforce such constraint.
> Discussion thread:
> http://zookeeper-user.578899.n2.nabble.com/Is-myid-actually-limited-to-1-255-td7581270.html





[jira] [Commented] (ZOOKEEPER-2901) Session ID that is negative causes mis-calculation of Ephemeral Type

2017-10-03 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16189858#comment-16189858
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2901:
-

[~mjohnson207] - no. It wasn't checked before and there's already an issue for 
this: https://issues.apache.org/jira/browse/ZOOKEEPER-2503

> Session ID that is negative causes mis-calculation of Ephemeral Type
> 
>
> Key: ZOOKEEPER-2901
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2901
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3
> Environment: Running 3.5.3-beta in Docker container
>Reporter: Mark Johnson
>    Assignee: Jordan Zimmerman
>Priority: Blocker
>
> In the code that determines the EphemeralType it is looking at the owner 
> (which is the client ID or connection ID):
> EphemeralType.java:
>     public static EphemeralType get(long ephemeralOwner) {
>         if (ephemeralOwner == CONTAINER_EPHEMERAL_OWNER) {
>             return CONTAINER;
>         }
>         if (ephemeralOwner < 0) {
>             return TTL;
>         }
>         return (ephemeralOwner == 0) ? VOID : NORMAL;
>     }
> However my connection ID is:
> header.getClientId(): -720548323429908480
> This causes the code to think this is a TTL Ephemeral node instead of a
> NORMAL Ephemeral node.
> This also explains why this is random - if my client ID is non-negative
> then the node gets added correctly.





[jira] [Updated] (ZOOKEEPER-2907) Logged request buffer isn't useful

2017-09-28 Thread Jordan Zimmerman (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jordan Zimmerman updated ZOOKEEPER-2907:

Description: 
There are two places in the server code that log request errors with a message 
ala "Dumping request buffer..." followed by a hex dump of the request buffer. 
There are 2 major problems with this output:

# The request type is not output
# The byte-to-hex inline code doesn't pad numbers < 16

These two combine to make the output data nearly useless.

PrepRequestProcessor#pRequest() and FinalRequestProcessor#processRequest()

  was:
There are two places in the server code that log request errors with a message 
ala "Dumping request buffer..." followed by a hex dump of the request buffer. 
There are 2 major problems with this output:

# The request type is not output
# The byte-to-hex inline code doesn't pad numbers < 16

These two combine to make the output data nearly useless.


> Logged request buffer isn't useful
> --
>
> Key: ZOOKEEPER-2907
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2907
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Affects Versions: 3.4.10, 3.5.3
>Reporter: Jordan Zimmerman
>Priority: Minor
>
> There are two places in the server code that log request errors with a 
> message ala "Dumping request buffer..." followed by a hex dump of the request 
> buffer. There are 2 major problems with this output:
> # The request type is not output
> # The byte-to-hex inline code doesn't pad numbers < 16
> These two combine to make the output data nearly useless.
> PrepRequestProcessor#pRequest() and FinalRequestProcessor#processRequest()





[jira] [Created] (ZOOKEEPER-2907) Logged request buffer isn't useful

2017-09-28 Thread Jordan Zimmerman (JIRA)
Jordan Zimmerman created ZOOKEEPER-2907:
---

 Summary: Logged request buffer isn't useful
 Key: ZOOKEEPER-2907
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2907
 Project: ZooKeeper
  Issue Type: Improvement
  Components: server
Affects Versions: 3.5.3, 3.4.10
Reporter: Jordan Zimmerman
Priority: Minor


There are two places in the server code that log request errors with a message 
ala "Dumping request buffer..." followed by a hex dump of the request buffer. 
There are 2 major problems with this output:

# The request type is not output
# The byte-to-hex inline code doesn't pad numbers < 16

These two combine to make the output data nearly useless.
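
As an illustration of the second point, the sketch below compares an unpadded byte-to-hex conversion with a zero-padded one. It assumes the inline conversion behaves like Integer.toHexString on each byte; the class and method names are hypothetical and are not taken from the server code.

```java
// Hypothetical helper class showing why unpadded hex dumps are ambiguous.
class HexDumpSketch {
    // Unpadded: a byte below 16 contributes a single hex digit.
    static String unpadded(byte[] buf) {
        StringBuilder sb = new StringBuilder();
        for (byte b : buf) {
            sb.append(Integer.toHexString(b & 0xff));
        }
        return sb.toString();
    }

    // Padded: every byte contributes exactly two hex digits.
    static String padded(byte[] buf) {
        StringBuilder sb = new StringBuilder();
        for (byte b : buf) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] first = {0x01, 0x10};
        byte[] second = {0x11, 0x00};
        System.out.println(unpadded(first) + " vs " + unpadded(second)); // 110 vs 110
        System.out.println(padded(first) + " vs " + padded(second));     // 0110 vs 1100
    }
}
```

Two different buffers produce the same unpadded dump, so the original bytes cannot be recovered from the log. With two digits per byte the dump is unambiguous.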





Re: Major issue with Container Nodes/TTL nodes!!!

2017-09-28 Thread Jordan Zimmerman
I pulled in the build file changes and that fixed the tests - good news.

So, https://github.com/apache/zookeeper/pull/377 is ready.

-Jordan

> On Sep 27, 2017, at 11:07 PM, Jordan Zimmerman <jor...@jordanzimmerman.com> 
> wrote:
> 
> This is on Jenkins. 
> 
> https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/1051/testReport/
>  
> <https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/1051/testReport/>
> 
>> On Sep 27, 2017, at 11:06 PM, Patrick Hunt <ph...@apache.org> wrote:
>> 
>> Check your classpath (typ the build/libs and build/test/libs directories) - 
>> how many log4j jar files do you have? Are there conflicting versions? (same 
>> jar diff versions I mean).
>> 
>> Patrick
>> 
>> On Wed, Sep 27, 2017 at 8:57 PM, Jordan Zimmerman 
>> <jor...@jordanzimmerman.com <mailto:jor...@jordanzimmerman.com>> wrote:
>> Now I'm getting a different error:
>> 
>> 2017-09-28 03:47:24,878 [myid:2] - ERROR [Thread-1:AppenderDynamicMBean@209] 
>> - Could not add DynamicLayoutMBean for 
>> [CONSOLE,layout=org.apache.log4j.PatternLayout].
>> javax.management.InstanceAlreadyExistsException: 
>> log4j:appender=CONSOLE,layout=org.apache.log4j.PatternLayout
>> 
>> 
>>> On Sep 27, 2017, at 1:17 PM, Jordan Zimmerman <jor...@jordanzimmerman.com 
>>> <mailto:jor...@jordanzimmerman.com>> wrote:
>>> 
>>> I didn't change anything. I branched from master. What should I do? Any 
>>> ideas?
>>> 
>>>> On Sep 27, 2017, at 1:15 PM, Patrick Hunt <ph...@apache.org 
>>>> <mailto:ph...@apache.org>> wrote:
>>>> 
>>>> Has the log4j configuration changed at all? iirc the console appender 
>>>> needs to be setup for those tests to function.
>>>> 
>>>> Patrick
>>>> 
>>>> On Sat, Sep 23, 2017 at 8:01 AM, Jordan Zimmerman 
>>>> <jor...@jordanzimmerman.com <mailto:jor...@jordanzimmerman.com>> wrote:
>>>> There are 4 tests throwing NPEs in Jenkins due to:
>>>> 
>>>> Layout layout = Logger.getRootLogger().getAppender("CONSOLE")
>>>> .getLayout();
>>>> 
>>>> Is this a known issue? Any workaround?
>>>> 
>>>> -Jordan
>>>> 
>>>>> On Sep 21, 2017, at 9:17 AM, Jordan Zimmerman <jor...@jordanzimmerman.com 
>>>>> <mailto:jor...@jordanzimmerman.com>> wrote:
>>>>> 
>>>>> In LeaderSessionTracker.java there is this bit of code:
>>>>> 
>>>>> if (!localSessionsEnabled
>>>>> || (getServerIdFromSessionId(sessionId) == serverId)) {
>>>>> throw new SessionExpiredException();
>>>>> }
>>>>> 
>>>>> "serverId" is a long. This can only work if Server IDs are 255 or less. I 
>>>>> realize this is in the docs. But is it enforced? See: 
>>>>> https://issues.apache.org/jira/browse/ZOOKEEPER-2503 
>>>>> <https://issues.apache.org/jira/browse/ZOOKEEPER-2503>
>>>>> 
>>>>> 
>>>>> 
>>>>>> On Sep 20, 2017, at 3:10 PM, Raúl Gutiérrez Segalés 
>>>>>> <r...@itevenworks.net <mailto:r...@itevenworks.net>> wrote:
>>>>>> 
>>>>>> On 20 September 2017 at 12:54, Camille Fournier <cami...@apache.org 
>>>>>> <mailto:cami...@apache.org>> wrote:
>>>>>> Ok let's take this back to either public mailing list or jira. I'd write 
>>>>>> up
>>>>>> thoughts on jira and ask there+ml to look. I'll try to look tonight
>>>>>> 
>>>>>> Thanks Camille!
>>>>>> 
>>>>>> Also, I merged this originally so I will work with Jordan on getting 
>>>>>> this fixed. Let me know
>>>>>> when you have a write up of your proposed solution and I'll take a look. 
>>>>>> Thanks!
>>>>>> 
>>>>>> 
>>>>>> -rgs
>>>>>>  
>>>>>> 
>>>>>> 
>>>>>> On Sep 20, 2017 3:52 PM, "Jordan Zimmerman" <jor...@jordanzimmerman.com 
>>>>>> <mailto:jor...@jordanzimmerman.com>>
>>>>>> wrote:
>>>>>> 
>>>>>> > I'd like to fix it as my company an

Re: Major issue with Container Nodes/TTL nodes!!!

2017-09-27 Thread Jordan Zimmerman
This is on Jenkins. 

https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/1051/testReport/

> On Sep 27, 2017, at 11:06 PM, Patrick Hunt <ph...@apache.org> wrote:
> 
> Check your classpath (typ the build/libs and build/test/libs directories) - 
> how many log4j jar files do you have? Are there conflicting versions? (same 
> jar diff versions I mean).
> 
> Patrick
> 
> On Wed, Sep 27, 2017 at 8:57 PM, Jordan Zimmerman <jor...@jordanzimmerman.com 
> <mailto:jor...@jordanzimmerman.com>> wrote:
> Now I'm getting a different error:
> 
> 2017-09-28 03:47:24,878 [myid:2] - ERROR [Thread-1:AppenderDynamicMBean@209] 
> - Could not add DynamicLayoutMBean for 
> [CONSOLE,layout=org.apache.log4j.PatternLayout].
> javax.management.InstanceAlreadyExistsException: 
> log4j:appender=CONSOLE,layout=org.apache.log4j.PatternLayout
> 
> 
>> On Sep 27, 2017, at 1:17 PM, Jordan Zimmerman <jor...@jordanzimmerman.com 
>> <mailto:jor...@jordanzimmerman.com>> wrote:
>> 
>> I didn't change anything. I branched from master. What should I do? Any ideas?
>> 
>>> On Sep 27, 2017, at 1:15 PM, Patrick Hunt <ph...@apache.org 
>>> <mailto:ph...@apache.org>> wrote:
>>> 
>>> Has the log4j configuration changed at all? iirc the console appender needs 
>>> to be setup for those tests to function.
>>> 
>>> Patrick
>>> 
>>> On Sat, Sep 23, 2017 at 8:01 AM, Jordan Zimmerman 
>>> <jor...@jordanzimmerman.com <mailto:jor...@jordanzimmerman.com>> wrote:
>>> There are 4 tests throwing NPEs in Jenkins due to:
>>> 
>>> Layout layout = Logger.getRootLogger().getAppender("CONSOLE")
>>> .getLayout();
>>> 
>>> Is this a known issue? Any workaround?
>>> 
>>> -Jordan
>>> 
>>>> On Sep 21, 2017, at 9:17 AM, Jordan Zimmerman <jor...@jordanzimmerman.com 
>>>> <mailto:jor...@jordanzimmerman.com>> wrote:
>>>> 
>>>> In LeaderSessionTracker.java there is this bit of code:
>>>> 
>>>> if (!localSessionsEnabled
>>>> || (getServerIdFromSessionId(sessionId) == serverId)) {
>>>> throw new SessionExpiredException();
>>>> }
>>>> 
>>>> "serverId" is a long. This can only work if Server IDs are 255 or less. I 
>>>> realize this is in the docs. But is it enforced? See: 
>>>> https://issues.apache.org/jira/browse/ZOOKEEPER-2503 
>>>> <https://issues.apache.org/jira/browse/ZOOKEEPER-2503>
>>>> 
>>>> 
>>>> 
>>>>> On Sep 20, 2017, at 3:10 PM, Raúl Gutiérrez Segalés <r...@itevenworks.net 
>>>>> <mailto:r...@itevenworks.net>> wrote:
>>>>> 
>>>>> On 20 September 2017 at 12:54, Camille Fournier <cami...@apache.org 
>>>>> <mailto:cami...@apache.org>> wrote:
>>>>> Ok let's take this back to either public mailing list or jira. I'd write 
>>>>> up
>>>>> thoughts on jira and ask there+ml to look. I'll try to look tonight
>>>>> 
>>>>> Thanks Camille!
>>>>> 
>>>>> Also, I merged this originally so I will work with Jordan on getting this 
>>>>> fixed. Let me know
>>>>> when you have a write up of your proposed solution and I'll take a look. 
>>>>> Thanks!
>>>>> 
>>>>> 
>>>>> -rgs
>>>>>  
>>>>> 
>>>>> 
>>>>> On Sep 20, 2017 3:52 PM, "Jordan Zimmerman" <jor...@jordanzimmerman.com 
>>>>> <mailto:jor...@jordanzimmerman.com>>
>>>>> wrote:
>>>>> 
>>>>> > I'd like to fix it as my company and probably many others are now using 
>>>>> > it
>>>>> > in production. The question is how to fix it safely and correctly. Is 
>>>>> > email
>>>>> > the best way to discuss this? Jira? Something else?
>>>>> >
>>>>> > I must say that there appears to be a trivial fix but I need the ZK
>>>>> > committers to think about this. In 
>>>>> > SessionTrackerImpl#initializeNextSession()
>>>>> > only some of the server ID bits are used. We could easily just mask the 
>>>>> > 2
>>>>> > high bits as well. But, what are the implications of this? Where is this
>>>>> > serve

Re: Major issue with Container Nodes/TTL nodes!!!

2017-09-27 Thread Jordan Zimmerman
Now I'm getting a different error:

2017-09-28 03:47:24,878 [myid:2] - ERROR [Thread-1:AppenderDynamicMBean@209] - 
Could not add DynamicLayoutMBean for 
[CONSOLE,layout=org.apache.log4j.PatternLayout].
javax.management.InstanceAlreadyExistsException: 
log4j:appender=CONSOLE,layout=org.apache.log4j.PatternLayout


> On Sep 27, 2017, at 1:17 PM, Jordan Zimmerman <jor...@jordanzimmerman.com> 
> wrote:
> 
> I didn't change anything. I branched from master. What should I do? Any ideas?
> 
>> On Sep 27, 2017, at 1:15 PM, Patrick Hunt <ph...@apache.org 
>> <mailto:ph...@apache.org>> wrote:
>> 
>> Has the log4j configuration changed at all? iirc the console appender needs 
>> to be setup for those tests to function.
>> 
>> Patrick
>> 
>> On Sat, Sep 23, 2017 at 8:01 AM, Jordan Zimmerman 
>> <jor...@jordanzimmerman.com <mailto:jor...@jordanzimmerman.com>> wrote:
>> There are 4 tests throwing NPEs in Jenkins due to:
>> 
>> Layout layout = Logger.getRootLogger().getAppender("CONSOLE")
>> .getLayout();
>> 
>> Is this a known issue? Any workaround?
>> 
>> -Jordan
>> 
>>> On Sep 21, 2017, at 9:17 AM, Jordan Zimmerman <jor...@jordanzimmerman.com 
>>> <mailto:jor...@jordanzimmerman.com>> wrote:
>>> 
>>> In LeaderSessionTracker.java there is this bit of code:
>>> 
>>> if (!localSessionsEnabled
>>> || (getServerIdFromSessionId(sessionId) == serverId)) {
>>> throw new SessionExpiredException();
>>> }
>>> 
>>> "serverId" is a long. This can only work if Server IDs are 255 or less. I 
>>> realize this is in the docs. But is it enforced? See: 
>>> https://issues.apache.org/jira/browse/ZOOKEEPER-2503 
>>> <https://issues.apache.org/jira/browse/ZOOKEEPER-2503>
>>> 
>>> 
>>> 
>>>> On Sep 20, 2017, at 3:10 PM, Raúl Gutiérrez Segalés <r...@itevenworks.net 
>>>> <mailto:r...@itevenworks.net>> wrote:
>>>> 
>>>> On 20 September 2017 at 12:54, Camille Fournier <cami...@apache.org 
>>>> <mailto:cami...@apache.org>> wrote:
>>>> Ok let's take this back to either public mailing list or jira. I'd write up
>>>> thoughts on jira and ask there+ml to look. I'll try to look tonight
>>>> 
>>>> Thanks Camille!
>>>> 
>>>> Also, I merged this originally so I will work with Jordan on getting this 
>>>> fixed. Let me know
>>>> when you have a write up of your proposed solution and I'll take a look. 
>>>> Thanks!
>>>> 
>>>> 
>>>> -rgs
>>>>  
>>>> 
>>>> 
>>>> On Sep 20, 2017 3:52 PM, "Jordan Zimmerman" <jor...@jordanzimmerman.com 
>>>> <mailto:jor...@jordanzimmerman.com>>
>>>> wrote:
>>>> 
>>>> > I'd like to fix it as my company and probably many others are now using 
>>>> > it
>>>> > in production. The question is how to fix it safely and correctly. Is 
>>>> > email
>>>> > the best way to discuss this? Jira? Something else?
>>>> >
>>>> > I must say that there appears to be a trivial fix but I need the ZK
>>>> > committers to think about this. In 
>>>> > SessionTrackerImpl#initializeNextSession()
>>>> > only some of the server ID bits are used. We could easily just mask the 2
>>>> > high bits as well. But, what are the implications of this? Where is this
>>>> > serverId byte used? What must be double checked?
>>>> >
>>>> > -Jordan
>>>> >
>>>> > On Sep 20, 2017, at 2:46 PM, Camille Fournier <cami...@apache.org 
>>>> > <mailto:cami...@apache.org>> wrote:
>>>> >
>>>> > Would you rather roll back the feature or put in a fix?
>>>> >
>>>> > On Sep 20, 2017 3:44 PM, "Jordan Zimmerman" <jor...@jordanzimmerman.com 
>>>> > <mailto:jor...@jordanzimmerman.com>>
>>>> > wrote:
>>>> >
>>>> >> Hey Folks,
>>>> >>
>>>> >> This is very serious. Please - let's discuss immediately. I'm not 
>>>> >> certain
>>>> >> how to fix this.
>>>> >>
>>>> >> -JZ
>>>> >>
>>>> >> On Sep 20, 2017, at 2:17 PM, Jordan Zimmerman 
>>>> >> <jor...@jordanzimmerman.com <mailto:jor...@jordanzimmerman.com>>
>>>> >> wrote:
>>>> >>
>>>> >> See: https://issues.apache.org/jira/browse/ZOOKEEPER-2901 
>>>> >> <https://issues.apache.org/jira/browse/ZOOKEEPER-2901>
>>>> >>
>>>> >> It appears that the high order byte of a session ID is reserved for the
>>>> >> ServerID. I don't know how I could have missed this or how this got by 
>>>> >> code
>>>> >> review, but Container Nodes and TTL nodes are using the 2 high bits to
>>>> >> denote container/TTL. I'll work on a fix ASAP. But, can someone validate
>>>> >> this?
>>>> >>
>>>> >> -Jordan
>>>> >>
>>>> >>
>>>> >>
>>>> >
>>>> 
>>> 
>> 
>> 
> 



Re: Major issue with Container Nodes/TTL nodes!!!

2017-09-27 Thread Jordan Zimmerman
I didn't change anything. I branched from master. What should I do? Any ideas?

> On Sep 27, 2017, at 1:15 PM, Patrick Hunt <ph...@apache.org> wrote:
> 
> Has the log4j configuration changed at all? iirc the console appender needs 
> to be setup for those tests to function.
> 
> Patrick
> 
> On Sat, Sep 23, 2017 at 8:01 AM, Jordan Zimmerman <jor...@jordanzimmerman.com 
> <mailto:jor...@jordanzimmerman.com>> wrote:
> There are 4 tests throwing NPEs in Jenkins due to:
> 
> Layout layout = Logger.getRootLogger().getAppender("CONSOLE")
> .getLayout();
> 
> Is this a known issue? Any workaround?
> 
> -Jordan
> 
>> On Sep 21, 2017, at 9:17 AM, Jordan Zimmerman <jor...@jordanzimmerman.com 
>> <mailto:jor...@jordanzimmerman.com>> wrote:
>> 
>> In LeaderSessionTracker.java there is this bit of code:
>> 
>> if (!localSessionsEnabled
>> || (getServerIdFromSessionId(sessionId) == serverId)) {
>> throw new SessionExpiredException();
>> }
>> 
>> "serverId" is a long. This can only work if Server IDs are 255 or less. I 
>> realize this is in the docs. But is it enforced? See: 
>> https://issues.apache.org/jira/browse/ZOOKEEPER-2503 
>> <https://issues.apache.org/jira/browse/ZOOKEEPER-2503>
>> 
>> 
>> 
>>> On Sep 20, 2017, at 3:10 PM, Raúl Gutiérrez Segalés <r...@itevenworks.net 
>>> <mailto:r...@itevenworks.net>> wrote:
>>> 
>>> On 20 September 2017 at 12:54, Camille Fournier <cami...@apache.org 
>>> <mailto:cami...@apache.org>> wrote:
>>> Ok let's take this back to either public mailing list or jira. I'd write up
>>> thoughts on jira and ask there+ml to look. I'll try to look tonight
>>> 
>>> Thanks Camille!
>>> 
>>> Also, I merged this originally so I will work with Jordan on getting this 
>>> fixed. Let me know
>>> when you have a write up of your proposed solution and I'll take a look. 
>>> Thanks!
>>> 
>>> 
>>> -rgs
>>>  
>>> 
>>> 
>>> On Sep 20, 2017 3:52 PM, "Jordan Zimmerman" <jor...@jordanzimmerman.com 
>>> <mailto:jor...@jordanzimmerman.com>>
>>> wrote:
>>> 
>>> > I'd like to fix it as my company and probably many others are now using it
>>> > in production. The question is how to fix it safely and correctly. Is 
>>> > email
>>> > the best way to discuss this? Jira? Something else?
>>> >
>>> > I must say that there appears to be a trivial fix but I need the ZK
>>> > committers to think about this. In 
>>> > SessionTrackerImpl#initializeNextSession()
>>> > only some of the server ID bits are used. We could easily just mask the 2
>>> > high bits as well. But, what are the implications of this? Where is this
>>> > serverId byte used? What must be double checked?
>>> >
>>> > -Jordan
>>> >
>>> > On Sep 20, 2017, at 2:46 PM, Camille Fournier <cami...@apache.org> wrote:
>>> >
>>> > Would you rather roll back the feature or put in a fix?
>>> >
>>> > On Sep 20, 2017 3:44 PM, "Jordan Zimmerman" <jor...@jordanzimmerman.com> wrote:
>>> >
>>> >> Hey Folks,
>>> >>
>>> >> This is very serious. Please - let's discuss immediately. I'm not certain
>>> >> how to fix this.
>>> >>
>>> >> -JZ
>>> >>
>>> >> On Sep 20, 2017, at 2:17 PM, Jordan Zimmerman <jor...@jordanzimmerman.com> wrote:
>>> >>
>>> >> See: https://issues.apache.org/jira/browse/ZOOKEEPER-2901
>>> >>
>>> >> It appears that the high order byte of a session ID is reserved for the
>>> >> ServerID. I don't know how I could have missed this or how this got by 
>>> >> code
>>> >> review, but Container Nodes and TTL nodes are using the 2 high bits to
>>> >> denote container/TTL. I'll work on a fix ASAP. But, can someone validate
>>> >> this?
>>> >>
>>> >> -Jordan
>>> >>
>>> >>
>>> >>
>>> >
>>> 
>> 
> 
> 



More Jenkins problems

2017-09-25 Thread Jordan Zimmerman
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/1040/console 


There's some kind of ivy exception here. 


 [exec] BUILD FAILED
 [exec] /home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/build.xml:373: impossible to ivy retrieve: java.lang.RuntimeException: problem during retrieve of org.apache.zookeeper#zookeeper: java.text.ParseException: failed to parse report: /home/jenkins/.ant/cache/org.apache.zookeeper-zookeeper-default.xml: The markup in the document following the root element must be well-formed.
 [exec] at org.apache.ivy.core.retrieve.RetrieveEngine.retrieve(RetrieveEngine.java:249)
 [exec] at org.apache.ivy.Ivy.retrieve(Ivy.java:561)
 [exec] at org.apache.ivy.ant.IvyRetrieve.doExecute(IvyRetrieve.java:98)
 [exec] at org.apache.ivy.ant.IvyTask.execute(IvyTask.java:271)
 [exec] at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:293)
 [exec] at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
 [exec] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 [exec] at java.lang.reflect.Method.invoke(Method.java:606)
 [exec] at org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:106)
 [exec] at org.apache.tools.ant.Task.perform(Task.java:348)
 [exec] at org.apache.tools.ant.Target.execute(Target.java:435)
 [exec] at org.apache.tools.ant.Target.performTasks(Target.java:456)
 [exec] at org.apache.tools.ant.Project.executeSortedTargets(Project.java:1405)
 [exec] at org.apache.tools.ant.Project.executeTarget(Project.java:1376)
 [exec] at org.apache.tools.ant.helper.DefaultExecutor.executeTargets(DefaultExecutor.java:41)
 [exec] at org.apache.tools.ant.Project.executeTargets(Project.java:1260)
 [exec] at org.apache.tools.ant.Main.runBuild(Main.java:857)
 [exec] at org.apache.tools.ant.Main.startAnt(Main.java:236)
 [exec] at org.apache.tools.ant.launch.Launcher.run(Launcher.java:287)
 [exec] at org.apache.tools.ant.launch.Launcher.main(Launcher.java:113)
 [exec] Caused by: java.text.ParseException: failed to parse report: /home/jenkins/.ant/cache/org.apache.zookeeper-zookeeper-default.xml: The markup in the 



Re: Major issue with Container Nodes/TTL nodes!!!

2017-09-23 Thread Jordan Zimmerman
There are 4 tests throwing NPEs in Jenkins due to:

    Layout layout = Logger.getRootLogger().getAppender("CONSOLE")
            .getLayout();

Is this a known issue? Any workaround?

-Jordan

> On Sep 21, 2017, at 9:17 AM, Jordan Zimmerman <jor...@jordanzimmerman.com> 
> wrote:
> 
> In LeaderSessionTracker.java there is this bit of code:
> 
>     if (!localSessionsEnabled
>             || (getServerIdFromSessionId(sessionId) == serverId)) {
>         throw new SessionExpiredException();
>     }
> 
> "serverId" is a long. This can only work if Server IDs are 255 or less. I 
> realize this is in the docs. But is it enforced? See: 
> https://issues.apache.org/jira/browse/ZOOKEEPER-2503
> 
> 
> 
>> On Sep 20, 2017, at 3:10 PM, Raúl Gutiérrez Segalés <r...@itevenworks.net> wrote:
>> 
>> On 20 September 2017 at 12:54, Camille Fournier <cami...@apache.org> wrote:
>> Ok let's take this back to either public mailing list or jira. I'd write up
>> thoughts on jira and ask there+ml to look. I'll try to look tonight
>> 
>> Thanks Camille!
>> 
>> Also, I merged this originally so I will work with Jordan on getting this 
>> fixed. Let me know
>> when you have a write up of your proposed solution and I'll take a look. 
>> Thanks!
>> 
>> 
>> -rgs
>>  
>> 
>> 
>> On Sep 20, 2017 3:52 PM, "Jordan Zimmerman" <jor...@jordanzimmerman.com> wrote:
>> 
>> > I'd like to fix it as my company and probably many others are now using it
>> > in production. The question is how to fix it safely and correctly. Is email
>> > the best way to discuss this? Jira? Something else?
>> >
>> > I must say that there appears to be a trivial fix but I need the ZK
>> > committers to think about this. In 
>> > SessionTrackerImpl#initializeNextSession()
>> > only some of the server ID bits are used. We could easily just mask the 2
>> > high bits as well. But, what are the implications of this? Where is this
>> > serverId byte used? What must be double checked?
>> >
>> > -Jordan
>> >
>> > On Sep 20, 2017, at 2:46 PM, Camille Fournier <cami...@apache.org> wrote:
>> >
>> > Would you rather roll back the feature or put in a fix?
>> >
>> > On Sep 20, 2017 3:44 PM, "Jordan Zimmerman" <jor...@jordanzimmerman.com> wrote:
>> >
>> >> Hey Folks,
>> >>
>> >> This is very serious. Please - let's discuss immediately. I'm not certain
>> >> how to fix this.
>> >>
>> >> -JZ
>> >>
>> >> On Sep 20, 2017, at 2:17 PM, Jordan Zimmerman <jor...@jordanzimmerman.com> wrote:
>> >>
>> >> See: https://issues.apache.org/jira/browse/ZOOKEEPER-2901
>> >>
>> >> It appears that the high order byte of a session ID is reserved for the
>> >> ServerID. I don't know how I could have missed this or how this got by 
>> >> code
>> >> review, but Container Nodes and TTL nodes are using the 2 high bits to
>> >> denote container/TTL. I'll work on a fix ASAP. But, can someone validate
>> >> this?
>> >>
>> >> -Jordan
>> >>
>> >>
>> >>
>> >
>> 
> 



[jira] [Commented] (ZOOKEEPER-2901) Session ID that is negative causes mis-calculation of Ephemeral Type

2017-09-22 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16177575#comment-16177575
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2901:
-

I'm happy to switch the default to true. I was being cautious. Can we get to 
consensus?

> Session ID that is negative causes mis-calculation of Ephemeral Type
> 
>
>                 Key: ZOOKEEPER-2901
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2901
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.5.3
>         Environment: Running 3.5.3-beta in Docker container
>            Reporter: Mark Johnson
>            Assignee: Jordan Zimmerman
>            Priority: Blocker
>
> In the code that determines the EphemeralType it is looking at the owner 
> (which is the client ID or connection ID):
> EphemeralType.java:
>    public static EphemeralType get(long ephemeralOwner) {
>        if (ephemeralOwner == CONTAINER_EPHEMERAL_OWNER) {
>            return CONTAINER;
>        }
>        if (ephemeralOwner < 0) {
>            return TTL;
>        }
>        return (ephemeralOwner == 0) ? VOID : NORMAL;
>    }
> However my connection ID is:
> header.getClientId(): -720548323429908480
> This causes the code to think this is a TTL Ephemeral node instead of a
> NORMAL Ephemeral node.
> This also explains why this is random - if my client ID is non-negative
> then the node gets added correctly.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (ZOOKEEPER-2903) Port ZOOKEEPER-2901 to 3.5.4

2017-09-22 Thread Jordan Zimmerman (JIRA)
Jordan Zimmerman created ZOOKEEPER-2903:
---

             Summary: Port ZOOKEEPER-2901 to 3.5.4
                 Key: ZOOKEEPER-2903
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2903
             Project: ZooKeeper
          Issue Type: Sub-task
          Components: server
    Affects Versions: 3.5.3
            Reporter: Jordan Zimmerman
            Assignee: Jordan Zimmerman
            Priority: Blocker
             Fix For: 3.5.4


The TTL/Server ID bug is quite serious and its fix should be back-ported to the 
3.5.x branch.





[jira] [Commented] (ZOOKEEPER-2901) Session ID that is negative causes mis-calculation of Ephemeral Type

2017-09-21 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16175451#comment-16175451
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2901:
-

There is another option here. We update the documentation to say that if you're 
going to use container and TTL nodes then your server ID must be <= 127. Thoughts?
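
A minimal sketch of why 127 would be the cutoff, assuming (as discussed in this issue) that the server ID is shifted into the high-order byte of the 64-bit session ID. The class and method names are hypothetical and are not ZooKeeper APIs.

```java
// Hypothetical helper, not part of ZooKeeper: checks whether a server ID
// keeps the sign bit of a session ID clear when placed in the high byte.
final class ServerIdCheck {
    // Largest server ID whose top bit (bit 7) is zero.
    static final long MAX_SAFE_SERVER_ID = 127;

    static boolean isSafeForTtlNodes(long serverId) {
        return serverId >= 0 && serverId <= MAX_SAFE_SERVER_ID;
    }

    // Assumed layout: server ID shifted into bits 56..63 of the session ID.
    static long highByteOnly(long serverId) {
        return serverId << 56;
    }

    public static void main(String[] args) {
        System.out.println(highByteOnly(127) >= 0); // true: sign bit clear
        System.out.println(highByteOnly(128) < 0);  // true: sign bit set
    }
}
```

Under that assumption, any server ID of 128 or more sets the sign bit, so every session ID it issues is negative.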



[jira] [Commented] (ZOOKEEPER-2901) Session ID that is negative causes mis-calculation of Ephemeral Type

2017-09-21 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16175342#comment-16175342
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2901:
-

Update - after researching this further, the exposure from Container Nodes 
doesn't exist. Container Nodes are denoted by ephemeralOwner of 
{{Long.MIN_VALUE}}. There can never be a session ID with this value so we're 
safe. Thus, the only exposure is for TTL nodes. I'm still researching and will 
continue to report back here.
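
To make the TTL exposure concrete, here is a small standalone model of the classification quoted in the issue description, run against the client ID the reporter posted. It mirrors the quoted logic only; the enum and class names are made up for this sketch and are not the ZooKeeper classes.

```java
// Standalone model of the classification quoted in the issue description.
// It mirrors the quoted logic but is not the ZooKeeper class itself.
enum EphemeralKind { VOID, NORMAL, CONTAINER, TTL }

final class EphemeralKindSketch {
    // Per the comment above: container nodes use Long.MIN_VALUE as owner.
    static final long CONTAINER_EPHEMERAL_OWNER = Long.MIN_VALUE;

    static EphemeralKind classify(long ephemeralOwner) {
        if (ephemeralOwner == CONTAINER_EPHEMERAL_OWNER) {
            return EphemeralKind.CONTAINER;
        }
        if (ephemeralOwner < 0) {
            return EphemeralKind.TTL;
        }
        return (ephemeralOwner == 0) ? EphemeralKind.VOID : EphemeralKind.NORMAL;
    }

    // High-order byte of the ID, read as an unsigned value in 0..255.
    static long highByte(long id) {
        return (id >>> 56) & 0xFFL;
    }

    public static void main(String[] args) {
        long reported = -720548323429908480L; // client ID from the issue
        System.out.println(classify(reported)); // TTL
        System.out.println(highByte(reported)); // 246
    }
}
```

The high byte of the reported ID is 246, which is consistent with a server ID above 127 if the high byte does hold the server ID as assumed in this thread.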



[jira] [Issue Comment Deleted] (ZOOKEEPER-2901) Session ID that is negative causes mis-calculation of Ephemeral Type

2017-09-21 Thread Jordan Zimmerman (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jordan Zimmerman updated ZOOKEEPER-2901:

Comment: was deleted

(was: The fix for this is straightforward. The hard part is backward 
compatibility:

* End users have data files with potentially corrupted data
** If they've used a ServerId > 127 with ZK versions 3.5.1+
** If they've used a ServerId > 63 with ZK version 3.5.3
* ContainerManager will treat ephemeral nodes created by servers with the bad 
Server IDs as container or TTL nodes. 

The fix created here _must_ expire these sessions so that they don't cause 
problems. The tricky part is how to do this. We need a way to identify old 
session IDs and new ones. We _could_ bump the {{FileTxnLog.VERSION}} but that 
would also be tricky to do in a backward compatible way. I'd appreciate ideas 
here.)



[jira] [Commented] (ZOOKEEPER-2901) Session ID that is negative causes mis-calculation of Ephemeral Type

2017-09-21 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16175282#comment-16175282
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2901:
-

Update: I believe I have a mechanism to identify pre 3.5.4 files without 
changing `FileTxnLog.VERSION`. Every file (snapshot and transaction) has a 
header that maps to `FileHeader.java`. The field `dbId` isn't really used for 
anything. For snapshots it's -1 and transactions it's 0. So, we can easily use 
this. For post 3.5.3 files we can make dbId 1 for snapshots and 2 for 
transactions (or whatever). When loading older files, we can invalidate any 
sessions.

Question:

What is the best way to invalidate sessions when loading transactions and 
snapshot files?
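
A rough sketch of the dbId check described above. The old values (-1 and 0) and the proposed new ones (1 and 2) come from this comment; the class, constant, and method names are hypothetical and are not taken from the ZooKeeper source.

```java
// Hypothetical sketch of the dbId idea; the names below are illustrative
// and are not taken from the ZooKeeper source.
final class DbIdSketch {
    // Values described in the comment for files written by older versions.
    static final long OLD_SNAPSHOT_DB_ID = -1;
    static final long OLD_TXN_LOG_DB_ID = 0;
    // Values floated in the comment for files written after the fix.
    static final long NEW_SNAPSHOT_DB_ID = 1;
    static final long NEW_TXN_LOG_DB_ID = 2;

    // True when the header's dbId marks a file written before the fix,
    // meaning sessions loaded from it would be treated as invalid.
    static boolean isPreFixFile(long dbId) {
        return dbId == OLD_SNAPSHOT_DB_ID || dbId == OLD_TXN_LOG_DB_ID;
    }

    public static void main(String[] args) {
        System.out.println(isPreFixFile(OLD_SNAPSHOT_DB_ID)); // true
        System.out.println(isPreFixFile(NEW_TXN_LOG_DB_ID));  // false
    }
}
```

How the loader would actually invalidate the sessions it finds is the open question in the comment, so the sketch stops at detection.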



[jira] [Comment Edited] (ZOOKEEPER-2901) Session ID that is negative causes mis-calculation of Ephemeral Type

2017-09-21 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16175282#comment-16175282
 ] 

Jordan Zimmerman edited comment on ZOOKEEPER-2901 at 9/21/17 6:55 PM:
--

Update: I believe I have a mechanism to identify pre 3.5.4 files without 
changing {{FileTxnLog.VERSION}}. Every file (snapshot and transaction) has a 
header that maps to {{FileHeader.java}}. The field {{dbId}} isn't really used 
for anything. For snapshots it's -1 and transactions it's 0. So, we can easily 
use this. For post 3.5.3 files we can make dbId 1 for snapshots and 2 for 
transactions (or whatever). When loading older files, we can invalidate any 
sessions.

Question:

What is the best way to invalidate sessions when loading transactions and 
snapshot files?


was (Author: randgalt):
Update: I believe I have a mechanism to identify pre 3.5.4 files without 
changing `FileTxnLog.VERSION`. Every file (snapshot and transaction) has a 
header that maps to `FileHeader.java`. The field `dbId` isn't really used for 
anything. For snapshots it's -1 and transactions it's 0. So, we can easily use 
this. For post 3.5.3 files we can make dbId 1 for snapshots and 2 for 
transactions (or whatever). When loading older files, we can invalidate any 
sessions.

Question:

What is the best way to invalidate sessions when loading transactions and 
snapshot files?



[jira] [Commented] (ZOOKEEPER-2901) Session ID that is negative causes mis-calculation of Ephemeral Type

2017-09-21 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16175108#comment-16175108
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2901:
-

The fix for this is straightforward. The hard part is backward compatibility:

* End users have data files with potentially corrupted data
** If they've used a ServerId > 127 with ZK versions 3.5.1+
** If they've used a ServerId > 63 with ZK version 3.5.3
* ContainerManager will treat ephemeral nodes created by servers with the bad 
Server IDs as container or TTL nodes. 

The fix created here _must_ expire these sessions so that they don't cause 
problems. The tricky part is how to do this. We need a way to identify old 
session IDs and new ones. We _could_ bump the {{FileTxnLog.VERSION}} but that 
would also be tricky to do in a backward compatible way. I'd appreciate ideas 
here.



Re: Major issue with Container Nodes/TTL nodes!!!

2017-09-21 Thread Jordan Zimmerman
In LeaderSessionTracker.java there is this bit of code:

    if (!localSessionsEnabled
            || (getServerIdFromSessionId(sessionId) == serverId)) {
        throw new SessionExpiredException();
    }

"serverId" is a long. This can only work if Server IDs are 255 or less. I 
realize this is in the docs. But is it enforced? See: 
https://issues.apache.org/jira/browse/ZOOKEEPER-2503
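
To illustrate the 255 limit, here is a sketch that assumes the layout described in this thread: the server ID in the high-order byte and the remaining 56 bits derived from a timestamp. The helper names are made up for the example, and the real SessionTrackerImpl#initializeNextSession() and getServerIdFromSessionId() may differ in detail.

```java
// Sketch of the assumed session ID layout: server ID in bits 56..63,
// remaining bits from a timestamp. Names are illustrative only.
final class SessionIdLayoutSketch {
    static long makeSessionId(long serverId, long timeMillis) {
        long low56 = (timeMillis << 24) >>> 8; // clears the top byte
        return (serverId << 56) | low56;
    }

    // Reads the high byte back as an unsigned value in 0..255.
    static long serverIdOf(long sessionId) {
        return (sessionId >>> 56) & 0xFFL;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        System.out.println(serverIdOf(makeSessionId(200, now))); // 200
        System.out.println(serverIdOf(makeSessionId(300, now))); // 44: 300 does not fit in 8 bits
        System.out.println(makeSessionId(200, now) < 0);         // true: sign bit set
    }
}
```

With only one byte available, a server ID of 300 silently comes back as 44, so a comparison against the full long serverId could never match.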



> On Sep 20, 2017, at 3:10 PM, Raúl Gutiérrez Segalés <r...@itevenworks.net> 
> wrote:
> 
> On 20 September 2017 at 12:54, Camille Fournier <cami...@apache.org> wrote:
> Ok let's take this back to either public mailing list or jira. I'd write up
> thoughts on jira and ask there+ml to look. I'll try to look tonight
> 
> Thanks Camille!
> 
> Also, I merged this originally so I will work with Jordan on getting this 
> fixed. Let me know
> when you have a write up of your proposed solution and I'll take a look. 
> Thanks!
> 
> 
> -rgs
>  
> 
> 
> On Sep 20, 2017 3:52 PM, "Jordan Zimmerman" <jor...@jordanzimmerman.com> wrote:
> 
> > I'd like to fix it as my company and probably many others are now using it
> > in production. The question is how to fix it safely and correctly. Is email
> > the best way to discuss this? Jira? Something else?
> >
> > I must say that there appears to be a trivial fix but I need the ZK
> > committers to think about this. In 
> > SessionTrackerImpl#initializeNextSession()
> > only some of the server ID bits are used. We could easily just mask the 2
> > high bits as well. But, what are the implications of this? Where is this
> > serverId byte used? What must be double checked?
> >
> > -Jordan
> >
> > On Sep 20, 2017, at 2:46 PM, Camille Fournier <cami...@apache.org> wrote:
> >
> > Would you rather roll back the feature or put in a fix?
> >
> > On Sep 20, 2017 3:44 PM, "Jordan Zimmerman" <jor...@jordanzimmerman.com> wrote:
> >
> >> Hey Folks,
> >>
> >> This is very serious. Please - let's discuss immediately. I'm not certain
> >> how to fix this.
> >>
> >> -JZ
> >>
> >> On Sep 20, 2017, at 2:17 PM, Jordan Zimmerman <jor...@jordanzimmerman.com> wrote:
> >>
> >> See: https://issues.apache.org/jira/browse/ZOOKEEPER-2901
> >>
> >> It appears that the high order byte of a session ID is reserved for the
> >> ServerID. I don't know how I could have missed this or how this got by code
> >> review, but Container Nodes and TTL nodes are using the 2 high bits to
> >> denote container/TTL. I'll work on a fix ASAP. But, can someone validate
> >> this?
> >>
> >> -Jordan
> >>
> >>
> >>
> >
> 



[jira] [Resolved] (ZOOKEEPER-2902) Exhibitor

2017-09-21 Thread Jordan Zimmerman (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jordan Zimmerman resolved ZOOKEEPER-2902.
-
Resolution: Invalid

[~ANH] as we said when you opened the previous issue. The Exhibitor is not 
related to Apache ZooKeeper. Please stop opening issues for it here.

> Exhibitor
> -
>
> Key: ZOOKEEPER-2902
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2902
> Project: ZooKeeper
>  Issue Type: Test
> Environment: Ubuntu 16.04
>Reporter: ANH
>
> Any one can help me in configuring exhibitor other than giving this link 
> https://github.com/soabase/exhibitor ?? 
> Extremely sorry to raise tickets related to Exhibitor.  





[jira] [Comment Edited] (ZOOKEEPER-2902) Exhibitor

2017-09-21 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16174810#comment-16174810
 ] 

Jordan Zimmerman edited comment on ZOOKEEPER-2902 at 9/21/17 2:06 PM:
--

[~ANH] as we said when you opened the previous issue. The Exhibitor project is 
not related to Apache ZooKeeper. Please stop opening issues for it here.


was (Author: randgalt):
[~ANH] as we said when you opened the previous issue. The Exhibitor is not 
related to Apache ZooKeeper. Please stop opening issues for it here.



[jira] [Comment Edited] (ZOOKEEPER-2901) Session ID that is negative causes mis-calculation of Ephemeral Type

2017-09-21 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16174792#comment-16174792
 ] 

Jordan Zimmerman edited comment on ZOOKEEPER-2901 at 9/21/17 1:57 PM:
--

[~mjohnson207] - I think that can work. The hard part is deciding what to do 
about existing sessions when the new server loads. I think the only choice is 
to somehow invalidate those sessions. We need to do this because of this code 
in LeaderSessionTracker.java - which I don't understand TBH

{code}
/*
 * if local session is not enabled or it used to be our local session
 * throw sessions expires
 */
if (!localSessionsEnabled
        || (getServerIdFromSessionId(sessionId) == serverId)) {
    throw new SessionExpiredException();
}
{code}

It's the only place in the code where the ServerId from the session ID is used.


was (Author: randgalt):
[~mjohnson207] - I think that can work. The hard part is deciding what to do 
about existing sessions when the new server loads. I think the only choice is 
to somehow invalidate those sessions. We need to do this because of this code 
in LeaderSessionTracker.java - which I don't understand TBH

{code}
/*
 * if local session is not enabled or it used to be our local session
 * throw sessions expires
 */
if (!localSessionsEnabled
        || (getServerIdFromSessionId(sessionId) == serverId)) {
    throw new SessionExpiredException();
}
{code}


