Re: [3.4.12] Missing OVERSEER doc. solr-user@lucene....

2019-05-22 Thread Andor Molnar
Hi Will,

What is your question to the ZooKeeper community?

Andor



On Wed, May 22, 2019 at 5:53 AM Will Martin wrote:

> Cross-posting this for a sound reporter. He is a top technical
> resource on the list. Not given to hyperbole in bug reports.
>
>
>
> Is there an ACL’d JIRA for ZooKeeper?
>
> 
>
> to solr-user
>
> We have a 6.6.2 cluster in prod that appears to have no overseer. In
> /overseer_elect on ZK, there is an election folder, but no leader document.
> An OVERSEERSTATUS request fails with a timeout.
>
> I’m going to try ADDROLE, but I’d be delighted to hear any other ideas.
> We’ve diverted all the traffic to the backing cluster, so we can blow this
> one away and rebuild.
>
> Looking at the Zookeeper logs, I see a few instances of network failures
> across all three nodes.
>
> I **have the logs** from each of the Zookeepers.
>
> We are running 3.4.12.
>
>
>
> 
>


[3.4.12] Missing OVERSEER doc. solr-user@lucene....

2019-05-21 Thread Will Martin
Cross-posting this for a sound reporter. He is a top technical resource on 
the list. Not given to hyperbole in bug reports.

Is there an ACL'd JIRA for ZooKeeper?


to solr-user

We have a 6.6.2 cluster in prod that appears to have no overseer. In 
/overseer_elect on ZK, there is an election folder, but no leader document. An 
OVERSEERSTATUS request fails with a timeout.

I'm going to try ADDROLE, but I'd be delighted to hear any other ideas. We've 
diverted all the traffic to the backing cluster, so we can blow this one away 
and rebuild.

Looking at the Zookeeper logs, I see a few instances of network failures across 
all three nodes.


I *have the logs* from each of the Zookeepers.

We are running 3.4.12.




ZooKeeper 3.4.12 C client does not work on Solaris 11 machine

2018-08-08 Thread ashwinikgj
Hi team,

I compiled the ZooKeeper 3.4.12 C client on a Solaris 11 machine and am trying
to test it through cli.sh. Below is the sample code that I changed in cli.c.
Whenever I run the ./cli_mt binary, it always fails with the error below:

2018-08-08 09:49:44,437:18000(0x2):ZOO_ERROR@handle_socket_error_msg@1670:
Socket [10.272.80.184:4831] zk retcode=-4, errno=0(Error 0): connect() call
failed.

Is anybody else facing the same issue?

---
cli.c file


int main(int argc, char **argv) {
#ifndef THREADED
    fd_set rfds, wfds, efds;
    int processed = 0;
#endif

    char p[2048];
#ifdef YCA
    char *cert = 0;
    char appId[64];
#endif
    char buffer[4096];
    int bufoff;
    //FILE *fh;
    struct Stat stat;
    int buflen = sizeof(buffer);
    int rc;

    verbose = 0;
    zoo_set_debug_level(ZOO_LOG_LEVEL_WARN);
    zoo_deterministic_conn_order(1); // enable deterministic order
    hostPort = argv[1];
    printf("\n%s\n", "Lets connect");

    zh = zookeeper_init("sld06hzt:4831", watcher, 3, &myid, 0, 0);
    if (!zh) {
        printf("\n%d\n", errno);
        return errno;
    }

    strcpy(p, "sadmin:ldap");
    zoo_add_auth(zh, "digest", p, strlen(p), 0, 0);

    rc = zoo_get(zh, "/test", 0, buffer, &buflen, &stat);

    if (rc)
    {
        printf("\n%s\n", buffer);
    }

    zookeeper_close(zh);
    return 2;
}


Regards,
Ashwini.
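
A side note on the snippet above, offered as a hedged sketch rather than a
diagnosis of the connect() failure: per the C API documentation, zoo_get()
returns raw bytes and reports their length through its buffer_len argument
(-1 when the znode holds no data). It does not append a terminating NUL, so
printing the buffer with %s can run past the data. The helper below,
node_data_to_cstr, is a hypothetical name and not part of the ZooKeeper client
library; it only shows one way to bound the copy before printing.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not a ZooKeeper API): copy node data returned by a
 * zoo_get-style call into a NUL-terminated string.
 *
 * data      raw bytes filled in by the call (may be NULL)
 * data_len  length reported by the call; <= 0 is treated as "no data"
 * out       destination buffer
 * out_size  size of the destination buffer in bytes
 *
 * Returns the number of bytes copied (excluding the NUL), or -1 if the
 * destination buffer is unusable. Data longer than out_size - 1 is truncated.
 */
int node_data_to_cstr(const char *data, int data_len, char *out, size_t out_size)
{
    size_t n;

    if (out == NULL || out_size == 0)
        return -1;

    if (data == NULL || data_len <= 0) {
        out[0] = '\0';
        return 0;
    }

    n = (size_t)data_len;
    if (n > out_size - 1)
        n = out_size - 1;   /* leave room for the terminator */

    memcpy(out, data, n);
    out[n] = '\0';
    return (int)n;
}
```

With the variables from the snippet, a call would look like
node_data_to_cstr(buffer, buflen, text, sizeof text), made only after zoo_get()
has returned ZOK. Separately, the third argument of zookeeper_init() is
documented as a timeout in milliseconds, so the value 3 in the snippet may be
worth comparing against the stock cli.c, which appears to pass 30000.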


[ANNOUNCE] Apache ZooKeeper 3.4.12

2018-05-01 Thread Abraham Fine
The Apache ZooKeeper team is proud to announce Apache ZooKeeper version 3.4.12.

ZooKeeper is a high-performance coordination service for distributed 
applications. It exposes common services - such as naming, configuration 
management, synchronization, and group services - in a simple interface so you 
don't have to write them from scratch. You can use it off-the-shelf to 
implement consensus, group management, leader election, and presence protocols. 
And you can build on it for your own, specific needs.

For ZooKeeper release details and downloads, visit:
http://zookeeper.apache.org/releases.html

ZooKeeper 3.4.12 Release Notes are at:
http://zookeeper.apache.org/doc/r3.4.12/releasenotes.html

We would like to thank the contributors that made the release possible.

Regards,

The ZooKeeper Team


Re: [VOTE] Apache ZooKeeper release 3.4.12 candidate 1

2018-04-24 Thread Abraham Fine
Thank you to all voters. 

With 4 binding +1 votes and 1 non-binding +1 vote this vote passes. I'll cut 
this release tonight.

Thanks,
Abe

On Mon, Apr 23, 2018, at 08:24, Flavio Junqueira wrote:
> +1, verified the following:
> 
> - checksums and signature
> - build passes
> - rat tool output does not indicate any problem
> - LICENSE and NOTICE both look ok
> - local simple smoke tests work
> 
> -Flavio
> 
> 
> > On 2 Apr 2018, at 02:01, Michael Han <h...@apache.org> wrote:
> > 
> > +1
> > 
> > - verified xsum/sig.
> > - release notes look good.
> > - verified cluster with different sizes.
> > - verified with a few 4lw commands.
> > - verified data / log dir swap was fixed.
> > - all unit tests passed.
> > 
> > 
> > On Wed, Mar 28, 2018 at 11:55 AM, Patrick Hunt <ph...@apache.org> wrote:
> > 
> >> +1. sig/xsum verified, RAT ran OK. I tested a few operational scenarios
> >> which seemed fine. Ran the tests and they passed. LGTM.
> >> 
> >> Patrick
> >> 
> >> On Mon, Mar 26, 2018 at 10:05 PM, Abraham Fine <af...@apache.org> wrote:
> >> 
> >>> This is a bugfix release candidate for 3.4.12. It fixes 22 issues,
> >>> including issues with incorrect handling of the dataDir and the
> >>> dataLogDir.
> >>> 
> >>> This candidate fixes an issue in the release notes of candidate 0.
> >>> 
> >>> The full release notes are available at:
> >>> 
> >>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310801&version=12342040
> >>> 
> >>> *** Please download, test and vote by March 31st 2018, 23:59 UTC+0. ***
> >>> 
> >>> Source files:
> >>> http://people.apache.org/~afine/zookeeper-3.4.12-candidate-1/
> >>> 
> >>> Maven staging repo:
> >>> https://repository.apache.org/content/groups/staging/org/apache/zookeeper/zookeeper/3.4.12/
> >>> 
> >>> The release candidate tag in git to be voted upon: release-3.4.12-rc1
> >>> 
> >>> ZooKeeper's KEYS file containing PGP keys we use to sign the release:
> >>> http://www.apache.org/dist/zookeeper/KEYS
> >>> 
> >>> Should we release this candidate?
> >>> 
> >> 
> 


Re: [VOTE] Apache ZooKeeper release 3.4.12 candidate 1

2018-04-23 Thread Flavio Junqueira
+1, verified the following:

- checksums and signature
- build passes
- rat tool output does not indicate any problem
- LICENSE and NOTICE both look ok
- local simple smoke tests work

-Flavio


> On 2 Apr 2018, at 02:01, Michael Han <h...@apache.org> wrote:
> 
> +1
> 
> - verified xsum/sig.
> - release notes look good.
> - verified cluster with different sizes.
> - verified with a few 4lw commands.
> - verified data / log dir swap was fixed.
> - all unit tests passed.
> 
> 
> On Wed, Mar 28, 2018 at 11:55 AM, Patrick Hunt <ph...@apache.org> wrote:
> 
>> +1. sig/xsum verified, RAT ran OK. I tested a few operational scenarios
>> which seemed fine. Ran the tests and they passed. LGTM.
>> 
>> Patrick
>> 
>> On Mon, Mar 26, 2018 at 10:05 PM, Abraham Fine <af...@apache.org> wrote:
>> 
>>> This is a bugfix release candidate for 3.4.12. It fixes 22 issues,
>>> including issues with incorrect handling of the dataDir and the
>>> dataLogDir.
>>> 
>>> This candidate fixes an issue in the release notes of candidate 0.
>>> 
>>> The full release notes are available at:
>>> 
>>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310801&version=12342040
>>> 
>>> *** Please download, test and vote by March 31st 2018, 23:59 UTC+0. ***
>>> 
>>> Source files:
>>> http://people.apache.org/~afine/zookeeper-3.4.12-candidate-1/
>>> 
>>> Maven staging repo:
>>> https://repository.apache.org/content/groups/staging/org/apache/zookeeper/zookeeper/3.4.12/
>>> 
>>> The release candidate tag in git to be voted upon: release-3.4.12-rc1
>>> 
>>> ZooKeeper's KEYS file containing PGP keys we use to sign the release:
>>> http://www.apache.org/dist/zookeeper/KEYS
>>> 
>>> Should we release this candidate?
>>> 
>> 



Re: [VOTE] Apache ZooKeeper release 3.4.12 candidate 1

2018-04-20 Thread Patrick Hunt
The voting timeline for releases is a minimum, to ensure everyone has the
opportunity to participate; it's not a maximum. The vote can run for as long
as necessary.

Patrick

On Fri, Apr 20, 2018 at 7:50 AM, Ted Yu <yuzhih...@gmail.com> wrote:

> The original email said:
>
> bq. vote by March 31st 2018,
>
> IMHO Apr 30th is not far ahead :-)
>
> If you think RC1 should receive more votes, please extend the voting
> deadline.
>
> On Wed, Apr 18, 2018 at 1:29 PM, Abraham Fine <af...@apache.org> wrote:
>
> > I'm waiting for one more vote on the release. When that is done
> > it will be available.
> >
> > On Wed, Apr 18, 2018, at 12:38, Ted Yu wrote:
> > > I don't see 3.4.12 artifact under
> > > https://mvnrepository.com/artifact/org.apache.zookeeper/zookeeper
> > >
> > > Abraham:
> > > Can you clarify ?
> > >
> > > Thanks
> > >
> > > On Mon, Apr 16, 2018 at 9:35 AM, Ted Yu <yuzhih...@gmail.com> wrote:
> > >
> > > > Hi,
> > > > > If I understand correctly, zookeeper users can expect maven
> > > > > artifacts for 3.4.12 to be posted soon.
> > > >
> > > > Thanks
> > > >
> >
>


Re: [VOTE] Apache ZooKeeper release 3.4.12 candidate 1

2018-04-20 Thread Ted Yu
The original email said:

bq. vote by March 31st 2018,

IMHO Apr 30th is not far ahead :-)

If you think RC1 should receive more votes, please extend the voting
deadline.

On Wed, Apr 18, 2018 at 1:29 PM, Abraham Fine <af...@apache.org> wrote:

> I'm waiting for one more vote on the release. When that is done
> it will be available.
>
> On Wed, Apr 18, 2018, at 12:38, Ted Yu wrote:
> > I don't see 3.4.12 artifact under
> > https://mvnrepository.com/artifact/org.apache.zookeeper/zookeeper
> >
> > Abraham:
> > Can you clarify ?
> >
> > Thanks
> >
> > On Mon, Apr 16, 2018 at 9:35 AM, Ted Yu <yuzhih...@gmail.com> wrote:
> >
> > > Hi,
> > > If I understand correctly, zookeeper users can expect maven
> > > artifacts for 3.4.12 to be posted soon.
> > >
> > > Thanks
> > >
>


Re: [VOTE] Apache ZooKeeper release 3.4.12 candidate 1

2018-04-18 Thread Abraham Fine
I'm waiting for one more vote on the release. When that is done it
will be available.

On Wed, Apr 18, 2018, at 12:38, Ted Yu wrote:
> I don't see 3.4.12 artifact under
> https://mvnrepository.com/artifact/org.apache.zookeeper/zookeeper
> 
> Abraham:
> Can you clarify ?
> 
> Thanks
> 
> On Mon, Apr 16, 2018 at 9:35 AM, Ted Yu <yuzhih...@gmail.com> wrote:
> 
> > Hi,
> > If I understand correctly, zookeeper users can expect maven artifacts for
> > 3.4.12 to be posted soon.
> >
> > Thanks
> >


Re: [VOTE] Apache ZooKeeper release 3.4.12 candidate 1

2018-04-18 Thread Ted Yu
I don't see 3.4.12 artifact under
https://mvnrepository.com/artifact/org.apache.zookeeper/zookeeper

Abraham:
Can you clarify ?

Thanks

On Mon, Apr 16, 2018 at 9:35 AM, Ted Yu <yuzhih...@gmail.com> wrote:

> Hi,
> If I understand correctly, zookeeper users can expect maven artifacts for
> 3.4.12 to be posted soon.
>
> Thanks
>


Re: [VOTE] Apache ZooKeeper release 3.4.12 candidate 1

2018-04-16 Thread Ted Yu
Hi,
If I understand correctly, zookeeper users can expect maven artifacts for
3.4.12 to be posted soon.

Thanks


Re: [VOTE] Apache ZooKeeper release 3.4.12 candidate 1

2018-04-13 Thread Abraham Fine
Thanks for following up Alex.


On Fri, Apr 13, 2018, at 14:48, Alexander Shraer wrote:
> We discussed with Pat offline and agreed to go without this patch,
> especially since we need to patch 3 branches: 3.4, 3.5 and master.
> We'll prepare 3.5 and master and then commit all 3 together in time
> for the next release. So Abe, please go ahead with your release.
>
> Alex
>
> On Fri, Apr 13, 2018 at 2:26 PM, Patrick Hunt <ph...@apache.org> wrote:
>
> > Hey folks. I've been on vacation. My 0.02 - given the release candidate
> > is well underway, has sufficient votes/time to finalize, this is not a
> > regression in 3.4.12 and it's not yet committed I would think we
> > finalize/push 3.4.12 then quickly followup with a 3.4.13 that addresses
> > this. Alex could be the RM given his interest/advocacy.
> >
> > Regards,
> >
> > Patrick
> >
> > On Fri, Apr 13, 2018 at 11:55 AM, Abraham Fine <af...@apache.org> wrote:
> >
> > > Given that the primary driver of this release is to fix an issue with
> > > the misuse of dataDir and dataLogDir I would rather see this release
> > > make it out the door with minimal additional changes to core
> > > functionality so people can more confidently upgrade.
> > >
> > > What do you think Pat?
> > >
> > > Abe
> > >
> > > On Fri, Apr 13, 2018, at 11:37, Alexander Shraer wrote:
> > > > Now that we have the fix, why delay it to next release?
> > > >
> > > > On Fri, Apr 13, 2018 at 11:09 AM Abraham Fine <af...@apache.org> wrote:
> > > >
> > > > > Let's wait until the next release to include this fix.
> > > > >
> > > > > On Mon, Apr 9, 2018, at 15:14, Alexander Shraer wrote:
> > > > > > Hi,
> > > > > >
> > > > > > Please take a look on the new PR for ZK-2959:
> > > > > > https://github.com/apache/zookeeper/pull/500
> > > > > > If there are no further comments, I can commit it.
> > > > > >
> > > > > > Thanks,
> > > > > > Alex
> > > > > >
> > > > > > On Fri, Apr 6, 2018 at 11:33 AM, Alexander Shraer <shra...@gmail.com> wrote:
> > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > The bug described in ZOOKEEPER-2959
> > > > > > > <https://issues.apache.org/jira/browse/ZOOKEEPER-2959> is that
> > > > > > > getEpochToPropose and waitForEpochAck do not distinguish between
> > > > > > > followers and observers.
> > > > > > > This can cause a candidate leader's acceptedEpoch to be updated
> > > > > > > with only support from observers. Same for waitForEpochAck -
> > > > > > > passing this method allows the candidate leader to update the
> > > > > > > currentEpoch. The latter helps this server to win FLE elections
> > > > > > > continuously, and the former (acceptedEpoch) causes anyone trying
> > > > > > > to connect to the server to think that it has more up-to-date
> > > > > > > data and truncate their logs to match.
> > > > > > >
> > > > > > >
> > > > > > > Alex
> > > > > > >
> > > > > > > On Fri, Apr 6, 2018 at 10:04 AM, Fangmin Lv <lvfang...@gmail.com> wrote:
> > > > > > >
> > > > > > > > Hi Alex,
> > > > > > > >
> > > > > > > > Can you give more details about the data loss scenario in Jira
> > > > > > > > ZOOKEEPER-2959 <https://issues.apache.org/jira/browse/ZOOKEEPER-2959>?
> > > > > > > > As far as I know, the leader will ignore the observers'
> > > > > > > >

Re: [VOTE] Apache ZooKeeper release 3.4.12 candidate 1

2018-04-13 Thread Alexander Shraer
We discussed with Pat offline and agreed to go without this patch,
especially since we need to patch 3 branches: 3.4, 3.5 and master.
We'll prepare 3.5 and master and then commit all 3 together in time for the
next release. So Abe, please go ahead with your release.

Alex

On Fri, Apr 13, 2018 at 2:26 PM, Patrick Hunt <ph...@apache.org> wrote:

> Hey folks. I've been on vacation. My 0.02 - given the release candidate is
> well underway, has sufficient votes/time to finalize, this is not a
> regression in 3.4.12 and it's not yet committed I would think we
> finalize/push 3.4.12 then quickly followup with a 3.4.13 that addresses
> this. Alex could be the RM given his interest/advocacy.
>
> Regards,
>
> Patrick
>
> On Fri, Apr 13, 2018 at 11:55 AM, Abraham Fine <af...@apache.org> wrote:
>
> > Given that the primary driver of this release is to fix an issue with the
> > misuse of dataDir and dataLogDir I would rather see this release make it
> > out the door with minimal additional changes to core functionality so
> > people can more confidently upgrade.
> >
> > What do you think Pat?
> >
> > Abe
> >
> > On Fri, Apr 13, 2018, at 11:37, Alexander Shraer wrote:
> > > Now that we have the fix, why delay it to next release?
> > >
> > > > On Fri, Apr 13, 2018 at 11:09 AM Abraham Fine <af...@apache.org> wrote:
> > >
> > > > Let's wait until the next release to include this fix.
> > > >
> > > > On Mon, Apr 9, 2018, at 15:14, Alexander Shraer wrote:
> > > > > Hi,
> > > > >
> > > > > Please take a look on the new PR for ZK-2959:
> > > > > https://github.com/apache/zookeeper/pull/500
> > > > > If there are no further comments, I can commit it.
> > > > >
> > > > > Thanks,
> > > > > Alex
> > > > >
> > > > > > On Fri, Apr 6, 2018 at 11:33 AM, Alexander Shraer <shra...@gmail.com> wrote:
> > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > The bug described in ZOOKEEPER-2959
> > > > > > > <https://issues.apache.org/jira/browse/ZOOKEEPER-2959> is that
> > > > > > > getEpochToPropose and waitForEpochAck do not distinguish between
> > > > > > > followers and observers.
> > > > > > > This can cause a candidate leader's acceptedEpoch to be updated
> > > > > > > with only support from observers. Same for waitForEpochAck -
> > > > > > > passing this method allows the candidate leader to update the
> > > > > > > currentEpoch. The latter helps this server to win FLE elections
> > > > > > > continuously, and the former (acceptedEpoch) causes anyone trying
> > > > > > > to connect to the server to think that it has more up-to-date
> > > > > > > data and truncate their logs to match.
> > > > > >
> > > > > >
> > > > > > Alex
> > > > > >
> > > > > > On Fri, Apr 6, 2018 at 10:04 AM, Fangmin Lv <lvfang...@gmail.com
> >
> > > > wrote:
> > > > > >
> > > > > >> Hi Alex,
> > > > > >>
> > > > > >> Can you give more details about the data loss scenario in Jira
> > > > > >> ZOOKEEPER-2959 <https://issues.apache.org/
> > jira/browse/ZOOKEEPER-2959
> > > > >?
> > > > > >> As far as I know, the leader will ignore the observers' ACK in
> > > > > >> waitForNewLeaderAck, so it will not start serve traffic until it
> > > > received
> > > > > >> the actual quorum ACK, if it doesn't have enough followers
> support
> > > > before
> > > > > >> timeout, it will quit leading and it's learners will re-sync
> with
> > new
> > > > > >> leader.
> > > > > >>
> > > > > >> Thanks,
> > > > > >> Fangmin
> > > > > >>
> > > > > >> On Thu, Apr 5, 2018 at 12:57 PM, Alexander Shraer <
> > shra...@gmail.com>
> > > > > >> wrote:
> > > > > >>
> > > > > >>> Btw we actually observed the described issue (data loss),
> > thankfully
> > > > in a
> > > > > >>> test environment. So I thou

Re: [VOTE] Apache ZooKeeper release 3.4.12 candidate 1

2018-04-13 Thread Patrick Hunt
Hey folks. I've been on vacation. My 0.02 - given the release candidate is
well underway, has sufficient votes/time to finalize, this is not a
regression in 3.4.12 and it's not yet committed I would think we
finalize/push 3.4.12 then quickly followup with a 3.4.13 that addresses
this. Alex could be the RM given his interest/advocacy.

Regards,

Patrick

On Fri, Apr 13, 2018 at 11:55 AM, Abraham Fine <af...@apache.org> wrote:

> Given that the primary driver of this release is to fix an issue with the
> misuse of dataDir and dataLogDir I would rather see this release make it
> out the door with minimal additional changes to core functionality so
> people can more confidently upgrade.
>
> What do you think Pat?
>
> Abe
>
> On Fri, Apr 13, 2018, at 11:37, Alexander Shraer wrote:
> > Now that we have the fix, why delay it to next release?
> >
> > On Fri, Apr 13, 2018 at 11:09 AM Abraham Fine <af...@apache.org> wrote:
> >
> > > Let's wait until the next release to include this fix.
> > >
> > > On Mon, Apr 9, 2018, at 15:14, Alexander Shraer wrote:
> > > > Hi,
> > > >
> > > > Please take a look on the new PR for ZK-2959:
> > > > https://github.com/apache/zookeeper/pull/500
> > > > If there are no further comments, I can commit it.
> > > >
> > > > Thanks,
> > > > Alex
> > > >
> > > > > On Fri, Apr 6, 2018 at 11:33 AM, Alexander Shraer <shra...@gmail.com> wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > The bug described in ZOOKEEPER-2959
> > > > > > <https://issues.apache.org/jira/browse/ZOOKEEPER-2959> is that
> > > > > > getEpochToPropose and waitForEpochAck do not distinguish between
> > > > > > followers and observers.
> > > > > > This can cause a candidate leader's acceptedEpoch to be updated
> > > > > > with only support from observers. Same for waitForEpochAck -
> > > > > > passing this method allows the candidate leader to update the
> > > > > > currentEpoch. The latter helps this server to win FLE elections
> > > > > > continuously, and the former (acceptedEpoch) causes anyone trying
> > > > > > to connect to the server to think that it has more up-to-date
> > > > > > data and truncate their logs to match.
> > > > >
> > > > >
> > > > > Alex
> > > > >
> > > > > On Fri, Apr 6, 2018 at 10:04 AM, Fangmin Lv <lvfang...@gmail.com>
> > > wrote:
> > > > >
> > > > >> Hi Alex,
> > > > >>
> > > > >> Can you give more details about the data loss scenario in Jira
> > > > >> ZOOKEEPER-2959 <https://issues.apache.org/
> jira/browse/ZOOKEEPER-2959
> > > >?
> > > > >> As far as I know, the leader will ignore the observers' ACK in
> > > > >> waitForNewLeaderAck, so it will not start serve traffic until it
> > > received
> > > > >> the actual quorum ACK, if it doesn't have enough followers support
> > > before
> > > > >> timeout, it will quit leading and it's learners will re-sync with
> new
> > > > >> leader.
> > > > >>
> > > > >> Thanks,
> > > > >> Fangmin
> > > > >>
> > > > >> On Thu, Apr 5, 2018 at 12:57 PM, Alexander Shraer <
> shra...@gmail.com>
> > > > >> wrote:
> > > > >>
> > > > >>> Btw we actually observed the described issue (data loss),
> thankfully
> > > in a
> > > > >>> test environment. So I thought this is important to share with
> the
> > > > >>> community.
> > > > >>>
> > > > >>> Unfortunately I don’t have time to run a new ZK release for
> this, so
> > > I’m
> > > > >>> not going to -1 your candidate, but we are actively working on a
> fix
> > > (ie
> > > > >>> a
> > > > >>> test at this point) and I can commit that as soon as we have
> that.
> > > > >>>
> > > > >>> It may be worth while to delay the release by a few more days,
> but
> > > it’s
> > > > >>> totally up to you since you’re running it.
> > > > >>>
> > > > >>> Cheers
> > > > &

Re: [VOTE] Apache ZooKeeper release 3.4.12 candidate 1

2018-04-13 Thread Abraham Fine
> > > >>> >
> > > >>> >
> > > >>> >
> > > >>> > On Thu, Apr 5, 2018 at 9:05 PM, Alexander Shraer <
> > shra...@gmail.com>
> > > >>> > wrote:
> > > >>> >
> > > >>> > > Yes sort of, FLE is finished, then enough observer's messages
> > reach
> > > >>> the
> > > >>> > > leader before participant's messages do.
> > > >>> > > Whether its rare depends on the number of observers and
> > > >>> participants. For
> > > >>> > > example with very few participants and many observers
> > > >>> > > your chance of hitting this are quite high.
> > > >>> > >
> > > >>> > > Alex
> > > >>> > >
> > > >>> > > On Thu, Apr 5, 2018 at 11:44 AM, Andor Molnar <
> > an...@cloudera.com>
> > > >>> > wrote:
> > > >>> > >
> > > >>> > > > Maybe I'm missing something here, but this looks like a rare
> > edge
> > > >>> case
> > > >>> > to
> > > >>> > > > me. Participants must finish the leader election successfully
> > and
> > > >>> right
> > > >>> > > > after enough followers should fail to send epoch to the
> > leader, so
> > > >>> > > > observers can take it over.
> > > >>> > > >
> > > >>> > > > Is that description accurate?
> > > >>> > > >
> > > >>> > > > Andor
> > > >>> > > >
> > > >>> > > >
> > > >>> > > > On Thu, Apr 5, 2018 at 7:35 PM, Alexander Shraer <
> > > >>> shra...@gmail.com>
> > > >>> > > > wrote:
> > > >>> > > >
> > > >>> > > > > To clarify - in a deployment with observers this bug can
> > > >>> potentially
> > > >>> > > > cause
> > > >>> > > > > data loss. A server could be elected leader based just on the
> > > >>> support
> > > >>> > > of
> > > >>> > > > > observers, even if this servers data is stale wrt other
> > > >>> followers.
> > > >>> > > > >
> > > >>> > > > > It is certainly a blocker, just not sure if for 3.4.11 or
> > 3.4.12.
> > > >>> > > > >
> > > >>> > > > >
> > > >>> > > > > Alex
> > > >>> > > > > On Thu, Apr 5, 2018 at 10:29 AM Andor Molnar <
> > an...@cloudera.com
> > > >>> >
> > > >>> > > wrote:
> > > >>> > > > >
> > > >>> > > > > > I don't think it's a blocker.
> > > >>> > > > > > The jira and PR has been open since last December and
> > 3.4.11
> > > >>> has
> > > >>> > > > released
> > > >>> > > > > > without it.
> > > >>> > > > > >
> > > >>> > > > > > Although this bug is also important to fix, I believe it's
> > more
> > > >>> > > > important
> > > >>> > > > > > to release a fix for the regression we've found in 3.4.11
> > asap.
> > > >>> > > > > >
> > > >>> > > > > > Abe, any thoughts?
> > > >>> > > > > >
> > > >>> > > > > > Regards,
> > > >>> > > > > > Andor
> > > >>> > > > > >
> > > >>> > > > > >
> > > >>> > > > > >
> > > >>> > > > > > On Thu, Apr 5, 2018 at 7:00 PM, Alexander Shraer <
> > > >>> > shra...@gmail.com>
> > > >>> > > > > > wrote:
> > > >>> > > > > >
> > > >>> > > > > > > Sorry for coming in at the last moment. I'm not sure
> > when the
> > > >>> > next
> > > >>> > > > 3.4
> > > >>> > > > > > > release is scheduled, so just wanted to mention this bug,
> > > >>> > > > > > > which I believe is a blocker for either this or next
> > release:
> > > >>> > > > > > > https://issues.apache.org/jira/browse/ZOOKEEPER-2959
> > > >>> > > > > > >
> > > >>> > > > > > > Best,
> > > >>> > > > > > > Alex
> > > >>> > > > > > >
> > > >>> > > > > > > On Thu, Apr 5, 2018 at 9:09 AM, Ted Yu <
> > yuzhih...@gmail.com>
> > > >>> > > wrote:
> > > >>> > > > > > >
> > > >>> > > > > > > > Can the vote be closed ?
> > > >>> > > > > > > >
> > > >>> > > > > > > > It seems we have enough +1's
> > > >>> > > > > > > >
> > > >>> > > > > > > > Thanks
> > > >>> > > > > > > >
> > > >>> > > > > > >
> > > >>> > > > > >
> > > >>> > > > >
> > > >>> > > >
> > > >>> > >
> > > >>> >
> > > >>>
> > > >>
> > > >>
> > > >
> >


Re: [VOTE] Apache ZooKeeper release 3.4.12 candidate 1

2018-04-13 Thread Alexander Shraer
Now that we have the fix, why delay it to next release?

On Fri, Apr 13, 2018 at 11:09 AM Abraham Fine <af...@apache.org> wrote:

> Let's wait until the next release to include this fix.
>
> On Mon, Apr 9, 2018, at 15:14, Alexander Shraer wrote:
> > Hi,
> >
> > Please take a look on the new PR for ZK-2959:
> > https://github.com/apache/zookeeper/pull/500
> > If there are no further comments, I can commit it.
> >
> > Thanks,
> > Alex
> >
> > > On Fri, Apr 6, 2018 at 11:33 AM, Alexander Shraer <shra...@gmail.com> wrote:
> > >
> > > > Hi,
> > > >
> > > > The bug described in ZOOKEEPER-2959
> > > > <https://issues.apache.org/jira/browse/ZOOKEEPER-2959> is that
> > > > getEpochToPropose and waitForEpochAck do not distinguish between
> > > > followers and observers.
> > > > This can cause a candidate leader's acceptedEpoch to be updated with
> > > > only support from observers. Same for waitForEpochAck - passing this
> > > > method allows the candidate leader to update the currentEpoch. The
> > > > latter helps this server to win FLE elections continuously, and the
> > > > former (acceptedEpoch) causes anyone trying to connect to the server
> > > > to think that it has more up-to-date data and truncate their logs to
> > > > match.
> > >
> > >
> > > Alex
> > >
> > > On Fri, Apr 6, 2018 at 10:04 AM, Fangmin Lv <lvfang...@gmail.com>
> wrote:
> > >
> > >> Hi Alex,
> > >>
> > >> Can you give more details about the data loss scenario in Jira
> > >> ZOOKEEPER-2959 <https://issues.apache.org/jira/browse/ZOOKEEPER-2959
> >?
> > >> As far as I know, the leader will ignore the observers' ACK in
> > >> waitForNewLeaderAck, so it will not start serve traffic until it
> received
> > >> the actual quorum ACK, if it doesn't have enough followers support
> before
> > >> timeout, it will quit leading and it's learners will re-sync with new
> > >> leader.
> > >>
> > >> Thanks,
> > >> Fangmin
> > >>
> > >> On Thu, Apr 5, 2018 at 12:57 PM, Alexander Shraer <shra...@gmail.com>
> > >> wrote:
> > >>
> > >>> Btw we actually observed the described issue (data loss), thankfully
> in a
> > >>> test environment. So I thought this is important to share with the
> > >>> community.
> > >>>
> > >>> Unfortunately I don’t have time to run a new ZK release for this, so
> I’m
> > >>> not going to -1 your candidate, but we are actively working on a fix
> (ie
> > >>> a
> > >>> test at this point) and I can commit that as soon as we have that.
> > >>>
> > >>> It may be worth while to delay the release by a few more days, but
> it’s
> > >>> totally up to you since you’re running it.
> > >>>
> > >>> Cheers
> > >>> Alex
> > >>> On Thu, Apr 5, 2018 at 12:47 PM Andor Molnar <an...@cloudera.com>
> wrote:
> > >>>
> > >>> > Got that. I still believe it's a completely valid issue which has
> to be
> > >>> > addressed, but it's not a showstopper. I'm afraid we're not going
> to
> > >>> > convince each other, so it's probably Abe's call if he want to
> create
> > >>> > another release candidate for the fix.
> > >>> >
> > >>> > I reviewed the code on github and I think it just needs to be
> covered
> > >>> with
> > >>> > a unit test to be complete.
> > >>> >
> > >>> > Regards,
> > >>> > Andor
> > >>> >
> > >>> >
> > >>> >
> > >>> > On Thu, Apr 5, 2018 at 9:05 PM, Alexander Shraer <
> shra...@gmail.com>
> > >>> > wrote:
> > >>> >
> > >>> > > Yes sort of, FLE is finished, then enough observer's messages
> reach
> > >>> the
> > >>> > > leader before participant's messages do.
> > >>> > > Whether its rare depends on the number of observers and
> > >>> participants. For
> > >>> > > example with very few participants and many observers
> > >>> > > your chance of hitting this are quite high.
> > >>> > >
> > >>> > > Alex
> > >

Re: [VOTE] Apache ZooKeeper release 3.4.12 candidate 1

2018-04-13 Thread Abraham Fine
Let's wait until the next release to include this fix. 

On Mon, Apr 9, 2018, at 15:14, Alexander Shraer wrote:
> Hi,
> 
> Please take a look on the new PR for ZK-2959:
> https://github.com/apache/zookeeper/pull/500
> If there are no further comments, I can commit it.
> 
> Thanks,
> Alex
> 
> On Fri, Apr 6, 2018 at 11:33 AM, Alexander Shraer <shra...@gmail.com> wrote:
> 
> > Hi,
> >
> > The bug described in ZOOKEEPER-2959
> > <https://issues.apache.org/jira/browse/ZOOKEEPER-2959> is that
> > getEpochToPropose and waitForEpochAck do not distinguish between followers
> > and observers.
> > This can cause a candidate leader's acceptedEpoch to be updated with only
> > support from observers. Same for waitForEpochAck - passing this method
> > allows the candidate leader to update the currentEpoch. The latter helps
> > this server to win FLE elections continuously, and the former
> > (acceptedEpoch)
> > causes anyone trying to connect to the server to think that it has more
> > up-to-date data and truncate their logs to match.
> >
> >
> > Alex
> >
> > On Fri, Apr 6, 2018 at 10:04 AM, Fangmin Lv <lvfang...@gmail.com> wrote:
> >
> >> Hi Alex,
> >>
> >> Can you give more details about the data loss scenario in Jira
> >> ZOOKEEPER-2959 <https://issues.apache.org/jira/browse/ZOOKEEPER-2959>?
> >> As far as I know, the leader will ignore the observers' ACK in
> >> waitForNewLeaderAck, so it will not start serve traffic until it received
> >> the actual quorum ACK, if it doesn't have enough followers support before
> >> timeout, it will quit leading and it's learners will re-sync with new
> >> leader.
> >>
> >> Thanks,
> >> Fangmin
> >>
> >> On Thu, Apr 5, 2018 at 12:57 PM, Alexander Shraer <shra...@gmail.com>
> >> wrote:
> >>
> >>> Btw we actually observed the described issue (data loss), thankfully in a
> >>> test environment. So I thought this is important to share with the
> >>> community.
> >>>
> >>> Unfortunately I don’t have time to run a new ZK release for this, so I’m
> >>> not going to -1 your candidate, but we are actively working on a fix (ie
> >>> a
> >>> test at this point) and I can commit that as soon as we have that.
> >>>
> >>> It may be worth while to delay the release by a few more days, but it’s
> >>> totally up to you since you’re running it.
> >>>
> >>> Cheers
> >>> Alex
> >>> On Thu, Apr 5, 2018 at 12:47 PM Andor Molnar <an...@cloudera.com> wrote:
> >>>
> >>> > Got that. I still believe it's a completely valid issue which has to be
> >>> > addressed, but it's not a showstopper. I'm afraid we're not going to
> >>> > convince each other, so it's probably Abe's call if he want to create
> >>> > another release candidate for the fix.
> >>> >
> >>> > I reviewed the code on github and I think it just needs to be covered
> >>> with
> >>> > a unit test to be complete.
> >>> >
> >>> > Regards,
> >>> > Andor
> >>> >
> >>> >
> >>> >
> >>> > On Thu, Apr 5, 2018 at 9:05 PM, Alexander Shraer <shra...@gmail.com>
> >>> > wrote:
> >>> >
> >>> > > Yes sort of, FLE is finished, then enough observer's messages reach
> >>> the
> >>> > > leader before participant's messages do.
> >>> > > Whether its rare depends on the number of observers and
> >>> participants. For
> >>> > > example with very few participants and many observers
> >>> > > your chance of hitting this are quite high.
> >>> > >
> >>> > > Alex
> >>> > >
> >>> > > On Thu, Apr 5, 2018 at 11:44 AM, Andor Molnar <an...@cloudera.com>
> >>> > wrote:
> >>> > >
> >>> > > > Maybe I'm missing something here, but this looks like a rare edge
> >>> case
> >>> > to
> >>> > > > me. Participants must finish the leader election successfully and
> >>> right
> >>> > > > after enough followers should fail to send epoch to the leader, so
> >>> > > > observers can take it over.
> >>> > > >
> >>> > > > Is that description accurate?

Re: [VOTE] Apache ZooKeeper release 3.4.12 candidate 1

2018-04-11 Thread Ted Yu
Hi,
The PR for ZK-2959 already has +1.

Can the PR be merged?

Thanks


Re: [VOTE] Apache ZooKeeper release 3.4.12 candidate 1

2018-04-09 Thread Alexander Shraer
Hi,

Please take a look at the new PR for ZK-2959:
https://github.com/apache/zookeeper/pull/500
If there are no further comments, I can commit it.

Thanks,
Alex

On Fri, Apr 6, 2018 at 11:33 AM, Alexander Shraer <shra...@gmail.com> wrote:

> Hi,
>
> The bug described in  ZOOKEEPER-2959
> <https://issues.apache.org/jira/browse/ZOOKEEPER-2959>  is that
> getEpochToPropose an waitForEpochAck do not distinguish between followers
> and observers.
> This can cause a candidate leader's acceptedEpoch to be updated with only
> support from observers. Same for waitForEpochAck - passing this method
> allows the candidate leader to update the currentEpoch. The latter helps
> this server to win FLE elections continuously, and the former
> (acceptedEpoch)
> causes anyone trying to connect to the server to think that it has more
> up-to-date data and trucate their logs to match.
>
>
> Alex
>
> On Fri, Apr 6, 2018 at 10:04 AM, Fangmin Lv <lvfang...@gmail.com> wrote:
>
>> Hi Alex,
>>
>> Can you give more details about the data loss scenario in Jira
>> ZOOKEEPER-2959 <https://issues.apache.org/jira/browse/ZOOKEEPER-2959>?
>> As far as I know, the leader will ignore the observers' ACK in
>> waitForNewLeaderAck, so it will not start serve traffic until it received
>> the actual quorum ACK, if it doesn't have enough followers support before
>> timeout, it will quit leading and it's learners will re-sync with new
>> leader.
>>
>> Thanks,
>> Fangmin
>>
>> On Thu, Apr 5, 2018 at 12:57 PM, Alexander Shraer <shra...@gmail.com>
>> wrote:
>>
>>> Btw we actually observed the described issue (data loss), thankfully in a
>>> test environment. So I thought this is important to share with the
>>> community.
>>>
>>> Unfortunately I don’t have time to run a new ZK release for this, so I’m
>>> not going to -1 your candidate, but we are actively working on a fix (ie
>>> a
>>> test at this point) and I can commit that as soon as we have that.
>>>
>>> It may be worth while to delay the release by a few more days, but it’s
>>> totally up to you since you’re running it.
>>>
>>> Cheers
>>> Alex
>>> On Thu, Apr 5, 2018 at 12:47 PM Andor Molnar <an...@cloudera.com> wrote:
>>>
>>> > Got that. I still believe it's a completely valid issue which has to be
>>> > addressed, but it's not a showstopper. I'm afraid we're not going to
>>> > convince each other, so it's probably Abe's call if he want to create
>>> > another release candidate for the fix.
>>> >
>>> > I reviewed the code on github and I think it just needs to be covered
>>> with
>>> > a unit test to be complete.
>>> >
>>> > Regards,
>>> > Andor
>>> >
>>> >
>>> >
>>> > On Thu, Apr 5, 2018 at 9:05 PM, Alexander Shraer <shra...@gmail.com>
>>> > wrote:
>>> >
>>> > > Yes sort of, FLE is finished, then enough observer's messages reach
>>> the
>>> > > leader before participant's messages do.
>>> > > Whether its rare depends on the number of observers and
>>> participants. For
>>> > > example with very few participants and many observers
>>> > > your chance of hitting this are quite high.
>>> > >
>>> > > Alex
>>> > >
>>> > > On Thu, Apr 5, 2018 at 11:44 AM, Andor Molnar <an...@cloudera.com>
>>> > wrote:
>>> > >
>>> > > > Maybe I'm missing something here, but this looks like a rare edge
>>> case
>>> > to
>>> > > > me. Participants must finish the leader election successfully and
>>> right
>>> > > > after enough followers should fail to send epoch to the leader, so
>>> > > > observers can take it over.
>>> > > >
>>> > > > Is that description accurate?
>>> > > >
>>> > > > Andor
>>> > > >
>>> > > >
>>> > > > On Thu, Apr 5, 2018 at 7:35 PM, Alexander Shraer <
>>> shra...@gmail.com>
>>> > > > wrote:
>>> > > >
>>> > > > > To clarify - in a deployment with observers this bug can
>>> potentially
>>> > > > cause
>>> > > > > data loss. A server could be elected leader based just on the
>>> support
>>> > > of
> >>> > > > > observers, even if this servers data is stale wrt other followers.

Re: [VOTE] Apache ZooKeeper release 3.4.12 candidate 1

2018-04-06 Thread Alexander Shraer
Hi,

The bug described in ZOOKEEPER-2959
<https://issues.apache.org/jira/browse/ZOOKEEPER-2959> is that
getEpochToPropose and waitForEpochAck do not distinguish between followers
and observers.
This can cause a candidate leader's acceptedEpoch to be updated with
support from observers only. The same holds for waitForEpochAck: passing this
method allows the candidate leader to update its currentEpoch. The latter
helps this server win FLE elections continuously, and the former
(acceptedEpoch) causes anyone trying to connect to the server to think that
it has more up-to-date data and truncate their logs to match.


Alex
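
To make the failure mode concrete, here is a minimal sketch of the difference
between a majority check that only looks at the size of the connected set and
one that counts voting members only. This is not the ZooKeeper code: the class
name, method names, and server ids below are hypothetical, and the only
assumption carried over from the description above is that observers and
followers end up in the same set.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model of an epoch-quorum check; all names here are illustrative. */
final class EpochQuorumSketch {

    /** Size-only majority check: any sid in the set counts, voting or not. */
    static boolean sizeOnlyQuorum(Set<Long> connected, Set<Long> votingMembers) {
        return connected.size() > votingMembers.size() / 2;
    }

    /** Majority check that first drops sids that are not voting members. */
    static boolean votingQuorum(Set<Long> connected, Set<Long> votingMembers) {
        Set<Long> voters = new HashSet<>(connected);
        voters.retainAll(votingMembers);
        return voters.size() > votingMembers.size() / 2;
    }

    public static void main(String[] args) {
        Set<Long> voting = Set.of(1L, 2L, 3L);   // three participants
        Set<Long> connected = new HashSet<>();
        connected.add(1L);                       // the candidate leader itself
        connected.add(101L);                     // an observer connects first

        System.out.println("size-only check passes: " + sizeOnlyQuorum(connected, voting));
        System.out.println("voters-only check passes: " + votingQuorum(connected, voting));
    }
}
```

With three participants and one observer connected, the size-only check
already passes, while the voters-only check keeps waiting for a second
participant.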

On Fri, Apr 6, 2018 at 10:04 AM, Fangmin Lv <lvfang...@gmail.com> wrote:

> Hi Alex,
>
> Can you give more details about the data loss scenario in Jira
> ZOOKEEPER-2959 <https://issues.apache.org/jira/browse/ZOOKEEPER-2959>? As
> far as I know, the leader will ignore the observers' ACK in
> waitForNewLeaderAck, so it will not start serve traffic until it received
> the actual quorum ACK, if it doesn't have enough followers support before
> timeout, it will quit leading and it's learners will re-sync with new
> leader.
>
> Thanks,
> Fangmin
>
> On Thu, Apr 5, 2018 at 12:57 PM, Alexander Shraer <shra...@gmail.com>
> wrote:
>
>> Btw we actually observed the described issue (data loss), thankfully in a
>> test environment. So I thought this is important to share with the
>> community.
>>
>> Unfortunately I don’t have time to run a new ZK release for this, so I’m
>> not going to -1 your candidate, but we are actively working on a fix (ie a
>> test at this point) and I can commit that as soon as we have that.
>>
>> It may be worth while to delay the release by a few more days, but it’s
>> totally up to you since you’re running it.
>>
>> Cheers
>> Alex
>> On Thu, Apr 5, 2018 at 12:47 PM Andor Molnar <an...@cloudera.com> wrote:
>>
>> > Got that. I still believe it's a completely valid issue which has to be
>> > addressed, but it's not a showstopper. I'm afraid we're not going to
>> > convince each other, so it's probably Abe's call if he want to create
>> > another release candidate for the fix.
>> >
>> > I reviewed the code on github and I think it just needs to be covered
>> with
>> > a unit test to be complete.
>> >
>> > Regards,
>> > Andor
>> >
>> >
>> >
>> > On Thu, Apr 5, 2018 at 9:05 PM, Alexander Shraer <shra...@gmail.com>
>> > wrote:
>> >
>> > > Yes sort of, FLE is finished, then enough observer's messages reach
>> the
>> > > leader before participant's messages do.
>> > > Whether its rare depends on the number of observers and participants.
>> For
>> > > example with very few participants and many observers
>> > > your chance of hitting this are quite high.
>> > >
>> > > Alex
>> > >
>> > > On Thu, Apr 5, 2018 at 11:44 AM, Andor Molnar <an...@cloudera.com>
>> > wrote:
>> > >
>> > > > Maybe I'm missing something here, but this looks like a rare edge
>> case
>> > to
>> > > > me. Participants must finish the leader election successfully and
>> right
>> > > > after enough followers should fail to send epoch to the leader, so
>> > > > observers can take it over.
>> > > >
>> > > > Is that description accurate?
>> > > >
>> > > > Andor
>> > > >
>> > > >
>> > > > On Thu, Apr 5, 2018 at 7:35 PM, Alexander Shraer <shra...@gmail.com
>> >
>> > > > wrote:
>> > > >
>> > > > > To clarify - in a deployment with observers this bug can
>> potentially
>> > > > cause
>> > > > > data loss. A server could be elected leader based just on the
>> support
>> > > of
>> > > > > observers, even if this servers data is stale wrt other followers.
>> > > > >
>> > > > > It is certainly a blocker, just not sure if for 3.4.11 or 3.4.12.
>> > > > >
>> > > > >
>> > > > > Alex
>> > > > > On Thu, Apr 5, 2018 at 10:29 AM Andor Molnar <an...@cloudera.com>
>> > > wrote:
>> > > > >
>> > > > > > I don't think it's a blocker.
>> > > > > > The jira and PR has been open since last December and 3.4.11 has
>> > > > released
>> > > > > > > without it.

Re: [VOTE] Apache ZooKeeper release 3.4.12 candidate 1

2018-04-06 Thread Fangmin Lv
Hi Alex,

Can you give more details about the data loss scenario in Jira
ZOOKEEPER-2959 <https://issues.apache.org/jira/browse/ZOOKEEPER-2959>? As
far as I know, the leader ignores the observers' ACKs in
waitForNewLeaderAck, so it will not start serving traffic until it has
received ACKs from an actual quorum. If it doesn't have enough follower
support before the timeout, it will quit leading and its learners will
re-sync with the new leader.

Thanks,
Fangmin

On Thu, Apr 5, 2018 at 12:57 PM, Alexander Shraer <shra...@gmail.com> wrote:

> Btw we actually observed the described issue (data loss), thankfully in a
> test environment. So I thought this is important to share with the
> community.
>
> Unfortunately I don’t have time to run a new ZK release for this, so I’m
> not going to -1 your candidate, but we are actively working on a fix (ie a
> test at this point) and I can commit that as soon as we have that.
>
> It may be worth while to delay the release by a few more days, but it’s
> totally up to you since you’re running it.
>
> Cheers
> Alex
> On Thu, Apr 5, 2018 at 12:47 PM Andor Molnar <an...@cloudera.com> wrote:
>
> > Got that. I still believe it's a completely valid issue which has to be
> > addressed, but it's not a showstopper. I'm afraid we're not going to
> > convince each other, so it's probably Abe's call if he want to create
> > another release candidate for the fix.
> >
> > I reviewed the code on github and I think it just needs to be covered
> with
> > a unit test to be complete.
> >
> > Regards,
> > Andor
> >
> >
> >
> > On Thu, Apr 5, 2018 at 9:05 PM, Alexander Shraer <shra...@gmail.com>
> > wrote:
> >
> > > Yes sort of, FLE is finished, then enough observer's messages reach the
> > > leader before participant's messages do.
> > > Whether its rare depends on the number of observers and participants.
> For
> > > example with very few participants and many observers
> > > your chance of hitting this are quite high.
> > >
> > > Alex
> > >
> > > On Thu, Apr 5, 2018 at 11:44 AM, Andor Molnar <an...@cloudera.com>
> > wrote:
> > >
> > > > Maybe I'm missing something here, but this looks like a rare edge
> case
> > to
> > > > me. Participants must finish the leader election successfully and
> right
> > > > after enough followers should fail to send epoch to the leader, so
> > > > observers can take it over.
> > > >
> > > > Is that description accurate?
> > > >
> > > > Andor
> > > >
> > > >
> > > > On Thu, Apr 5, 2018 at 7:35 PM, Alexander Shraer <shra...@gmail.com>
> > > > wrote:
> > > >
> > > > > To clarify - in a deployment with observers this bug can
> potentially
> > > > cause
> > > > > data loss. A server could be elected leader based just on the
> support
> > > of
> > > > > observers, even if this servers data is stale wrt other followers.
> > > > >
> > > > > It is certainly a blocker, just not sure if for 3.4.11 or 3.4.12.
> > > > >
> > > > >
> > > > > Alex
> > > > > On Thu, Apr 5, 2018 at 10:29 AM Andor Molnar <an...@cloudera.com>
> > > wrote:
> > > > >
> > > > > > I don't think it's a blocker.
> > > > > > The jira and PR has been open since last December and 3.4.11 has
> > > > released
> > > > > > without it.
> > > > > >
> > > > > > Although this bug is also important to fix, I believe it's more
> > > > important
> > > > > > to release a fix for the regression we've found in 3.4.11 asap.
> > > > > >
> > > > > > Abe, any thoughts?
> > > > > >
> > > > > > Regards,
> > > > > > Andor
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Thu, Apr 5, 2018 at 7:00 PM, Alexander Shraer <
> > shra...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Sorry for coming in at the last moment. I'm not sure when the
> > next
> > > > 3.4
> > > > > > > release is scheduled, so just wanted to mention this bug,
> > > > > > > which I believe is a blocker for either this or next release:
> > > > > > > https://issues.apache.org/jira/browse/ZOOKEEPER-2959
> > > > > > >
> > > > > > > Best,
> > > > > > > Alex
> > > > > > >
> > > > > > > On Thu, Apr 5, 2018 at 9:09 AM, Ted Yu <yuzhih...@gmail.com>
> > > wrote:
> > > > > > >
> > > > > > > > Can the vote be closed ?
> > > > > > > >
> > > > > > > > It seems we have enough +1's
> > > > > > > >
> > > > > > > > Thanks
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: [VOTE] Apache ZooKeeper release 3.4.12 candidate 0

2018-04-05 Thread Flavio Junqueira
We should consider not using md5 for the next RC as per the ASF policy:

https://www.apache.org/dev/release-distribution.html#sigs-and-sums

-Flavio
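
For anyone verifying a candidate without md5, here is a minimal sketch of
checking an artifact against a SHA-512 digest with the JDK's MessageDigest.
It assumes SHA-512 is the checksum published next to the artifact; the class
name and the command-line arguments are hypothetical, and nothing here is
part of the ZooKeeper build.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Illustrative SHA-512 helper; not part of any ZooKeeper tooling. */
final class Sha512Sketch {

    static MessageDigest newSha512() {
        try {
            return MessageDigest.getInstance("SHA-512");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-512 is not available in this JRE", e);
        }
    }

    /** Lower-case hex encoding of a byte array. */
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    static String sha512Hex(byte[] data) {
        return toHex(newSha512().digest(data));
    }

    /** Streams the file so large release tarballs need not fit in memory. */
    static String sha512Hex(Path file) throws IOException {
        MessageDigest md = newSha512();
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);
            }
        }
        return toHex(md.digest());
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to the downloaded artifact, args[1]: expected hex digest
        if (args.length != 2) {
            System.err.println("usage: Sha512Sketch <file> <expected-hex-digest>");
            return;
        }
        String actual = sha512Hex(Path.of(args[0]));
        System.out.println(actual.equalsIgnoreCase(args[1].trim()) ? "OK" : "MISMATCH");
    }
}
```

A checksum only guards against corrupted downloads; the PGP signature check
against the KEYS file is still needed to establish who produced the artifact.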

> On 26 Mar 2018, at 21:37, Abraham Fine <af...@apache.org> wrote:
> 
> Thank you everyone for your votes so far. Due to the issues that Michael Han
> pointed out I will cancel this RC and release a new one with the correct 
> release notes.
> 
> Expect the new RC very soon.
> 
> Thanks,
> Abe
> 
> On Mon, Mar 26, 2018, at 12:29, Patrick Hunt wrote:
>> +1 - sig/xsum verify, RAT looks ok, was able to build/test successfully
>> under jdk7/mac. Tested out a few deployment combinations and it seemed ok.
>> Manually verified swapping data and datalogdir resulted in the server not
>> coming up and human readable error in the logs.
>> 
>> Patrick
>> 
>> On Thu, Mar 22, 2018 at 1:05 PM, Abraham Fine <af...@apache.org> wrote:
>> 
>>> This is a bugfix release candidate for 3.4.12. It fixes 22 issues,
>>> including issues that
>>> affect incorrect handling of the dataDir and the dataLogDir.
>>> 
>>> The full release notes are available at:
>>> 
>>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310801&version=12342040
>>> 
>>> *** Please download, test and vote by March 27th 2018, 23:59 UTC+0. ***
>>> 
>>> Source files:
>>> http://people.apache.org/~afine/zookeeper-3.4.12-candidate-0/
>>> 
>>> Maven staging repo:
>>> https://repository.apache.org/content/groups/staging/org/
>>> apache/zookeeper/zookeeper/3.4.12/
>>> 
>>> The release candidate tag in git to be voted upon: release-3.4.12-rc0
>>> 
>>> ZooKeeper's KEYS file containing PGP keys we use to sign the release:
>>> http://www.apache.org/dist/zookeeper/KEYS
>>> 
>>> Should we release this candidate?
>>> 



Re: [VOTE] Apache ZooKeeper release 3.4.12 candidate 1

2018-04-05 Thread Alexander Shraer
Btw, we actually observed the described issue (data loss), thankfully in a
test environment, so I thought it was important to share with the
community.

Unfortunately I don’t have time to run a new ZK release for this, so I’m
not going to -1 your candidate, but we are actively working on a fix (i.e., a
test at this point) and I can commit it as soon as we have it.

It may be worthwhile to delay the release by a few more days, but it’s
totally up to you since you’re running it.

Cheers
Alex
On Thu, Apr 5, 2018 at 12:47 PM Andor Molnar <an...@cloudera.com> wrote:

> Got that. I still believe it's a completely valid issue which has to be
> addressed, but it's not a showstopper. I'm afraid we're not going to
> convince each other, so it's probably Abe's call if he want to create
> another release candidate for the fix.
>
> I reviewed the code on github and I think it just needs to be covered with
> a unit test to be complete.
>
> Regards,
> Andor
>
>
>
> On Thu, Apr 5, 2018 at 9:05 PM, Alexander Shraer <shra...@gmail.com>
> wrote:
>
> > Yes sort of, FLE is finished, then enough observer's messages reach the
> > leader before participant's messages do.
> > Whether its rare depends on the number of observers and participants. For
> > example with very few participants and many observers
> > your chance of hitting this are quite high.
> >
> > Alex
> >
> > On Thu, Apr 5, 2018 at 11:44 AM, Andor Molnar <an...@cloudera.com>
> wrote:
> >
> > > Maybe I'm missing something here, but this looks like a rare edge case
> to
> > > me. Participants must finish the leader election successfully and right
> > > after enough followers should fail to send epoch to the leader, so
> > > observers can take it over.
> > >
> > > Is that description accurate?
> > >
> > > Andor
> > >
> > >
> > > On Thu, Apr 5, 2018 at 7:35 PM, Alexander Shraer <shra...@gmail.com>
> > > wrote:
> > >
> > > > To clarify - in a deployment with observers this bug can potentially
> > > cause
> > > > data loss. A server could be elected leader based just on the support
> > of
> > > > observers, even if this servers data is stale wrt other followers.
> > > >
> > > > It is certainly a blocker, just not sure if for 3.4.11 or 3.4.12.
> > > >
> > > >
> > > > Alex
> > > > On Thu, Apr 5, 2018 at 10:29 AM Andor Molnar <an...@cloudera.com>
> > wrote:
> > > >
> > > > > I don't think it's a blocker.
> > > > > The jira and PR has been open since last December and 3.4.11 has
> > > released
> > > > > without it.
> > > > >
> > > > > Although this bug is also important to fix, I believe it's more
> > > important
> > > > > to release a fix for the regression we've found in 3.4.11 asap.
> > > > >
> > > > > Abe, any thoughts?
> > > > >
> > > > > Regards,
> > > > > Andor
> > > > >
> > > > >
> > > > >
> > > > > On Thu, Apr 5, 2018 at 7:00 PM, Alexander Shraer <
> shra...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Sorry for coming in at the last moment. I'm not sure when the
> next
> > > 3.4
> > > > > > release is scheduled, so just wanted to mention this bug,
> > > > > > which I believe is a blocker for either this or next release:
> > > > > > https://issues.apache.org/jira/browse/ZOOKEEPER-2959
> > > > > >
> > > > > > Best,
> > > > > > Alex
> > > > > >
> > > > > > On Thu, Apr 5, 2018 at 9:09 AM, Ted Yu <yuzhih...@gmail.com>
> > wrote:
> > > > > >
> > > > > > > Can the vote be closed ?
> > > > > > >
> > > > > > > It seems we have enough +1's
> > > > > > >
> > > > > > > Thanks
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: [VOTE] Apache ZooKeeper release 3.4.12 candidate 1

2018-04-05 Thread Andor Molnar
Got that. I still believe it's a completely valid issue which has to be
addressed, but it's not a showstopper. I'm afraid we're not going to
convince each other, so it's probably Abe's call whether he wants to create
another release candidate for the fix.

I reviewed the code on GitHub and I think it just needs to be covered with
a unit test to be complete.

Regards,
Andor



On Thu, Apr 5, 2018 at 9:05 PM, Alexander Shraer <shra...@gmail.com> wrote:

> Yes sort of, FLE is finished, then enough observer's messages reach the
> leader before participant's messages do.
> Whether its rare depends on the number of observers and participants. For
> example with very few participants and many observers
> your chance of hitting this are quite high.
>
> Alex
>
> On Thu, Apr 5, 2018 at 11:44 AM, Andor Molnar <an...@cloudera.com> wrote:
>
> > Maybe I'm missing something here, but this looks like a rare edge case to
> > me. Participants must finish the leader election successfully and right
> > after enough followers should fail to send epoch to the leader, so
> > observers can take it over.
> >
> > Is that description accurate?
> >
> > Andor
> >
> >
> > On Thu, Apr 5, 2018 at 7:35 PM, Alexander Shraer <shra...@gmail.com>
> > wrote:
> >
> > > To clarify - in a deployment with observers this bug can potentially
> > cause
> > > data loss. A server could be elected leader based just on the support
> of
> > > observers, even if this servers data is stale wrt other followers.
> > >
> > > It is certainly a blocker, just not sure if for 3.4.11 or 3.4.12.
> > >
> > >
> > > Alex
> > > On Thu, Apr 5, 2018 at 10:29 AM Andor Molnar <an...@cloudera.com>
> wrote:
> > >
> > > > I don't think it's a blocker.
> > > > The jira and PR has been open since last December and 3.4.11 has
> > released
> > > > without it.
> > > >
> > > > Although this bug is also important to fix, I believe it's more
> > important
> > > > to release a fix for the regression we've found in 3.4.11 asap.
> > > >
> > > > Abe, any thoughts?
> > > >
> > > > Regards,
> > > > Andor
> > > >
> > > >
> > > >
> > > > On Thu, Apr 5, 2018 at 7:00 PM, Alexander Shraer <shra...@gmail.com>
> > > > wrote:
> > > >
> > > > > Sorry for coming in at the last moment. I'm not sure when the next
> > 3.4
> > > > > release is scheduled, so just wanted to mention this bug,
> > > > > which I believe is a blocker for either this or next release:
> > > > > https://issues.apache.org/jira/browse/ZOOKEEPER-2959
> > > > >
> > > > > Best,
> > > > > Alex
> > > > >
> > > > > On Thu, Apr 5, 2018 at 9:09 AM, Ted Yu <yuzhih...@gmail.com>
> wrote:
> > > > >
> > > > > > Can the vote be closed ?
> > > > > >
> > > > > > It seems we have enough +1's
> > > > > >
> > > > > > Thanks
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: [VOTE] Apache ZooKeeper release 3.4.12 candidate 1

2018-04-05 Thread Alexander Shraer
Yes, sort of: FLE finishes, then enough observers' messages reach the
leader before the participants' messages do.
Whether it's rare depends on the number of observers and participants. For
example, with very few participants and many observers,
your chances of hitting this are quite high.

Alex
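
As a rough illustration of how the odds scale, here is a toy calculation. It
assumes that every follower and every observer sends one message and that the
messages arrive in uniformly random order, which real network timing will not
match, and it assumes the majority check counts any connected server. The
class and method names are hypothetical. It computes the probability that at
least one observer is among the first messages that push the candidate leader
over the majority threshold.

```java
/** Toy arrival-order model; the uniform-order assumption is a simplification. */
final class ObserverRaceSketch {

    /**
     * Probability that at least one observer is counted among the first
     * `needed` learner messages, where `needed` is how many servers besides
     * the candidate leader a size-only majority check requires.
     */
    static double chanceObserverCounts(int participants, int observers) {
        int followers = participants - 1;  // voting learners, leader excluded
        int needed = participants / 2;     // acks needed besides the leader's own
        double allFollowers = 1.0;         // P(first `needed` arrivals are all followers)
        for (int i = 0; i < needed; i++) {
            allFollowers *= (double) (followers - i) / (followers + observers - i);
        }
        return 1.0 - allFollowers;
    }

    public static void main(String[] args) {
        int[][] cases = { {3, 0}, {3, 10}, {5, 1}, {5, 20} };
        for (int[] c : cases) {
            System.out.printf("participants=%d observers=%d chance=%.3f%n",
                    c[0], c[1], chanceObserverCounts(c[0], c[1]));
        }
    }
}
```

Under these assumptions, three participants with ten observers give a chance
of 10/12, while the same ensemble with no observers gives zero. An observer
being counted does not by itself mean data is lost; it only means the epoch
can be settled before a real quorum has responded.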

On Thu, Apr 5, 2018 at 11:44 AM, Andor Molnar <an...@cloudera.com> wrote:

> Maybe I'm missing something here, but this looks like a rare edge case to
> me. Participants must finish the leader election successfully and right
> after enough followers should fail to send epoch to the leader, so
> observers can take it over.
>
> Is that description accurate?
>
> Andor
>
>
> On Thu, Apr 5, 2018 at 7:35 PM, Alexander Shraer <shra...@gmail.com>
> wrote:
>
> > To clarify - in a deployment with observers this bug can potentially
> cause
> > data loss. A server could be elected leader based just on the support of
> > observers, even if this servers data is stale wrt other followers.
> >
> > It is certainly a blocker, just not sure if for 3.4.11 or 3.4.12.
> >
> >
> > Alex
> > On Thu, Apr 5, 2018 at 10:29 AM Andor Molnar <an...@cloudera.com> wrote:
> >
> > > I don't think it's a blocker.
> > > The jira and PR has been open since last December and 3.4.11 has
> released
> > > without it.
> > >
> > > Although this bug is also important to fix, I believe it's more
> important
> > > to release a fix for the regression we've found in 3.4.11 asap.
> > >
> > > Abe, any thoughts?
> > >
> > > Regards,
> > > Andor
> > >
> > >
> > >
> > > On Thu, Apr 5, 2018 at 7:00 PM, Alexander Shraer <shra...@gmail.com>
> > > wrote:
> > >
> > > > Sorry for coming in at the last moment. I'm not sure when the next
> 3.4
> > > > release is scheduled, so just wanted to mention this bug,
> > > > which I believe is a blocker for either this or next release:
> > > > https://issues.apache.org/jira/browse/ZOOKEEPER-2959
> > > >
> > > > Best,
> > > > Alex
> > > >
> > > > On Thu, Apr 5, 2018 at 9:09 AM, Ted Yu <yuzhih...@gmail.com> wrote:
> > > >
> > > > > Can the vote be closed ?
> > > > >
> > > > > It seems we have enough +1's
> > > > >
> > > > > Thanks
> > > > >
> > > >
> > >
> >
>


Re: [VOTE] Apache ZooKeeper release 3.4.12 candidate 1

2018-04-05 Thread Andor Molnar
Maybe I'm missing something here, but this looks like a rare edge case to
me. Participants must finish the leader election successfully, and right
afterwards enough followers must fail to send their epoch to the leader so
that observers can take over.

Is that description accurate?

Andor


On Thu, Apr 5, 2018 at 7:35 PM, Alexander Shraer <shra...@gmail.com> wrote:

> To clarify - in a deployment with observers this bug can potentially cause
> data loss. A server could be elected leader based just on the support of
> observers, even if this servers data is stale wrt other followers.
>
> It is certainly a blocker, just not sure if for 3.4.11 or 3.4.12.
>
>
> Alex
> On Thu, Apr 5, 2018 at 10:29 AM Andor Molnar <an...@cloudera.com> wrote:
>
> > I don't think it's a blocker.
> > The jira and PR has been open since last December and 3.4.11 has released
> > without it.
> >
> > Although this bug is also important to fix, I believe it's more important
> > to release a fix for the regression we've found in 3.4.11 asap.
> >
> > Abe, any thoughts?
> >
> > Regards,
> > Andor
> >
> >
> >
> > On Thu, Apr 5, 2018 at 7:00 PM, Alexander Shraer <shra...@gmail.com>
> > wrote:
> >
> > > Sorry for coming in at the last moment. I'm not sure when the next 3.4
> > > release is scheduled, so just wanted to mention this bug,
> > > which I believe is a blocker for either this or next release:
> > > https://issues.apache.org/jira/browse/ZOOKEEPER-2959
> > >
> > > Best,
> > > Alex
> > >
> > > On Thu, Apr 5, 2018 at 9:09 AM, Ted Yu <yuzhih...@gmail.com> wrote:
> > >
> > > > Can the vote be closed ?
> > > >
> > > > It seems we have enough +1's
> > > >
> > > > Thanks
> > > >
> > >
> >
>


Re: [VOTE] Apache ZooKeeper release 3.4.12 candidate 1

2018-04-05 Thread Alexander Shraer
To clarify: in a deployment with observers this bug can potentially cause
data loss. A server could be elected leader based on the support of
observers alone, even if this server's data is stale with respect to the
other followers.

It is certainly a blocker; I'm just not sure whether for 3.4.11 or 3.4.12.


Alex
On Thu, Apr 5, 2018 at 10:29 AM Andor Molnar <an...@cloudera.com> wrote:

> I don't think it's a blocker.
> The jira and PR has been open since last December and 3.4.11 has released
> without it.
>
> Although this bug is also important to fix, I believe it's more important
> to release a fix for the regression we've found in 3.4.11 asap.
>
> Abe, any thoughts?
>
> Regards,
> Andor
>
>
>
> On Thu, Apr 5, 2018 at 7:00 PM, Alexander Shraer <shra...@gmail.com>
> wrote:
>
> > Sorry for coming in at the last moment. I'm not sure when the next 3.4
> > release is scheduled, so just wanted to mention this bug,
> > which I believe is a blocker for either this or next release:
> > https://issues.apache.org/jira/browse/ZOOKEEPER-2959
> >
> > Best,
> > Alex
> >
> > On Thu, Apr 5, 2018 at 9:09 AM, Ted Yu <yuzhih...@gmail.com> wrote:
> >
> > > Can the vote be closed ?
> > >
> > > It seems we have enough +1's
> > >
> > > Thanks
> > >
> >
>


Re: [VOTE] Apache ZooKeeper release 3.4.12 candidate 1

2018-04-05 Thread Andor Molnar
I don't think it's a blocker.
The JIRA and PR have been open since last December, and 3.4.11 was released
without the fix.

Although this bug is also important to fix, I believe it's more important
to release a fix for the regression we've found in 3.4.11 asap.

Abe, any thoughts?

Regards,
Andor



On Thu, Apr 5, 2018 at 7:00 PM, Alexander Shraer  wrote:

> Sorry for coming in at the last moment. I'm not sure when the next 3.4
> release is scheduled, so just wanted to mention this bug,
> which I believe is a blocker for either this or next release:
> https://issues.apache.org/jira/browse/ZOOKEEPER-2959
>
> Best,
> Alex
>
> On Thu, Apr 5, 2018 at 9:09 AM, Ted Yu  wrote:
>
> > Can the vote be closed ?
> >
> > It seems we have enough +1's
> >
> > Thanks
> >
>


Re: [VOTE] Apache ZooKeeper release 3.4.12 candidate 1

2018-04-05 Thread Alexander Shraer
Sorry for coming in at the last moment. I'm not sure when the next 3.4
release is scheduled, so I just wanted to mention this bug,
which I believe is a blocker for either this or the next release:
https://issues.apache.org/jira/browse/ZOOKEEPER-2959

Best,
Alex

On Thu, Apr 5, 2018 at 9:09 AM, Ted Yu  wrote:

> Can the vote be closed ?
>
> It seems we have enough +1's
>
> Thanks
>


Re: [VOTE] Apache ZooKeeper release 3.4.12 candidate 1

2018-04-05 Thread Ted Yu
Can the vote be closed?

It seems we have enough +1's

Thanks


Re: [VOTE] Apache ZooKeeper release 3.4.12 candidate 1

2018-04-01 Thread Michael Han
+1

- verified xsum/sig.
- release notes look good.
- verified cluster with different sizes.
- verified with a few 4lw commands.
- verified data / log dir swap was fixed.
- all unit tests passed.


On Wed, Mar 28, 2018 at 11:55 AM, Patrick Hunt <ph...@apache.org> wrote:

> +1. sig/xsum verified, RAT ran OK. I tested a few operational scenarios
> which seemed fine. Ran the tests and they passed. LGTM.
>
> Patrick
>
> On Mon, Mar 26, 2018 at 10:05 PM, Abraham Fine <af...@apache.org> wrote:
>
> > This is a bugfix release candidate for 3.4.12. It fixes 22 issues,
> > including issues that
> > affect incorrect handling of the dataDir and the dataLogDir.
> >
> > This candidate fixes an issue in the release notes of candidate 0.
> >
> > The full release notes are available at:
> >
> > > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310801&version=12342040
> >
> > *** Please download, test and vote by March 31st 2018, 23:59 UTC+0. ***
> >
> > Source files:
> > http://people.apache.org/~afine/zookeeper-3.4.12-candidate-1/
> >
> > Maven staging repo:
> > https://repository.apache.org/content/groups/staging/org/
> > apache/zookeeper/zookeeper/3.4.12/
> >
> > The release candidate tag in git to be voted upon: release-3.4.12-rc1
> >
> > ZooKeeper's KEYS file containing PGP keys we use to sign the release:
> > http://www.apache.org/dist/zookeeper/KEYS
> >
> > Should we release this candidate?
> >
>


Re: [VOTE] Apache ZooKeeper release 3.4.12 candidate 1

2018-04-01 Thread Rakesh Radhakrishnan
+1

Verified using jdk 1.8.0_111 and CentOS 7.3.1611 env

- successfully compiled and ran the tests, all passed.
- xsum/sig verified, OK.
- set up 3 node cluster, verified through zk cmds and a couple of 4lws,
looks fine.


Rakesh

On Thu, Mar 29, 2018 at 12:25 AM, Patrick Hunt <ph...@apache.org> wrote:

> +1. sig/xsum verified, RAT ran OK. I tested a few operational scenarios
> which seemed fine. Ran the tests and they passed. LGTM.
>
> Patrick
>
> On Mon, Mar 26, 2018 at 10:05 PM, Abraham Fine <af...@apache.org> wrote:
>
> > This is a bugfix release candidate for 3.4.12. It fixes 22 issues,
> > including issues that
> > affect incorrect handling of the dataDir and the dataLogDir.
> >
> > This candidate fixes an issue in the release notes of candidate 0.
> >
> > The full release notes are available at:
> >
> > > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310801&version=12342040
> >
> > *** Please download, test and vote by March 31st 2018, 23:59 UTC+0. ***
> >
> > Source files:
> > http://people.apache.org/~afine/zookeeper-3.4.12-candidate-1/
> >
> > Maven staging repo:
> > https://repository.apache.org/content/groups/staging/org/
> > apache/zookeeper/zookeeper/3.4.12/
> >
> > The release candidate tag in git to be voted upon: release-3.4.12-rc1
> >
> > ZooKeeper's KEYS file containing PGP keys we use to sign the release:
> > http://www.apache.org/dist/zookeeper/KEYS
> >
> > Should we release this candidate?
> >
>


Re: [VOTE] Apache ZooKeeper release 3.4.12 candidate 1

2018-03-28 Thread Patrick Hunt
+1. sig/xsum verified, RAT ran OK. I tested a few operational scenarios
which seemed fine. Ran the tests and they passed. LGTM.

Patrick

On Mon, Mar 26, 2018 at 10:05 PM, Abraham Fine <af...@apache.org> wrote:

> This is a bugfix release candidate for 3.4.12. It fixes 22 issues,
> including issues that
> affect incorrect handling of the dataDir and the dataLogDir.
>
> This candidate fixes an issue in the release notes of candidate 0.
>
> The full release notes are available at:
>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?
> projectId=12310801&version=12342040
>
> *** Please download, test and vote by March 31st 2018, 23:59 UTC+0. ***
>
> Source files:
> http://people.apache.org/~afine/zookeeper-3.4.12-candidate-1/
>
> Maven staging repo:
> https://repository.apache.org/content/groups/staging/org/
> apache/zookeeper/zookeeper/3.4.12/
>
> The release candidate tag in git to be voted upon: release-3.4.12-rc1
>
> ZooKeeper's KEYS file containing PGP keys we use to sign the release:
> http://www.apache.org/dist/zookeeper/KEYS
>
> Should we release this candidate?
>


Re: [VOTE] Apache ZooKeeper release 3.4.12 candidate 1

2018-03-27 Thread Ted Yu
+1

checked signatures
unit tests passed on Linux


Re: [VOTE] Apache ZooKeeper release 3.4.12 candidate 1

2018-03-27 Thread Andor Molnar
+1 for candidate 1

- verified signatures
- unit test passed on Mac and Ubuntu 16.04 (including c++)
- verified 3 node cluster

Andor



On Tue, Mar 27, 2018 at 7:05 AM, Abraham Fine <af...@apache.org> wrote:

> This is a bugfix release candidate for 3.4.12. It fixes 22 issues,
> including issues that
> affect incorrect handling of the dataDir and the dataLogDir.
>
> This candidate fixes an issue in the release notes of candidate 0.
>
> The full release notes are available at:
>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?
> projectId=12310801&version=12342040
>
> *** Please download, test and vote by March 31st 2018, 23:59 UTC+0. ***
>
> Source files:
> http://people.apache.org/~afine/zookeeper-3.4.12-candidate-1/
>
> Maven staging repo:
> https://repository.apache.org/content/groups/staging/org/
> apache/zookeeper/zookeeper/3.4.12/
>
> The release candidate tag in git to be voted upon: release-3.4.12-rc1
>
> ZooKeeper's KEYS file containing PGP keys we use to sign the release:
> http://www.apache.org/dist/zookeeper/KEYS
>
> Should we release this candidate?
>


[VOTE] Apache ZooKeeper release 3.4.12 candidate 1

2018-03-26 Thread Abraham Fine
This is a bugfix release candidate for 3.4.12. It fixes 22 issues, including 
issues that
affect incorrect handling of the dataDir and the dataLogDir.

This candidate fixes an issue in the release notes of candidate 0.
 
The full release notes are available at:
 
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310801&version=12342040
 
*** Please download, test and vote by March 31st 2018, 23:59 UTC+0. ***
 
Source files:
http://people.apache.org/~afine/zookeeper-3.4.12-candidate-1/
 
Maven staging repo:
https://repository.apache.org/content/groups/staging/org/apache/zookeeper/zookeeper/3.4.12/
 
The release candidate tag in git to be voted upon: release-3.4.12-rc1
 
ZooKeeper's KEYS file containing PGP keys we use to sign the release:
http://www.apache.org/dist/zookeeper/KEYS
 
Should we release this candidate?


Re: [VOTE] Apache ZooKeeper release 3.4.12 candidate 0

2018-03-26 Thread Abraham Fine
Thank you everyone for your votes so far. Due to the issues that Michael Han 
pointed out, I will cancel this RC and release a new one with the correct 
release notes.

Expect the new RC very soon.

Thanks,
Abe

On Mon, Mar 26, 2018, at 12:29, Patrick Hunt wrote:
> +1 - sig/xsum verify, RAT looks ok, was able to build/test successfully
> under jdk7/mac. Tested out a few deployment combinations and it seemed ok.
> Manually verified swapping data and datalogdir resulted in the server not
> coming up and human readable error in the logs.
> 
> Patrick
> 
> On Thu, Mar 22, 2018 at 1:05 PM, Abraham Fine <af...@apache.org> wrote:
> 
> > This is a bugfix release candidate for 3.4.12. It fixes 22 issues,
> > including issues that
> > affect incorrect handling of the dataDir and the dataLogDir.
> >
> > The full release notes are available at:
> >
> > https://issues.apache.org/jira/secure/ReleaseNote.jspa?
> > projectId=12310801&version=12342040
> >
> > *** Please download, test and vote by March 27th 2018, 23:59 UTC+0. ***
> >
> > Source files:
> > http://people.apache.org/~afine/zookeeper-3.4.12-candidate-0/
> >
> > Maven staging repo:
> > https://repository.apache.org/content/groups/staging/org/
> > apache/zookeeper/zookeeper/3.4.12/
> >
> > The release candidate tag in git to be voted upon: release-3.4.12-rc0
> >
> > ZooKeeper's KEYS file containing PGP keys we use to sign the release:
> > http://www.apache.org/dist/zookeeper/KEYS
> >
> > Should we release this candidate?
> >


Re: [VOTE] Apache ZooKeeper release 3.4.12 candidate 0

2018-03-26 Thread Patrick Hunt
+1 - sig/xsum verify, RAT looks ok, was able to build/test successfully
under jdk7/mac. Tested out a few deployment combinations and it seemed ok.
Manually verified swapping data and datalogdir resulted in the server not
coming up and human readable error in the logs.
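
To make the swap scenario concrete, here is a rough Java sketch of the kind of startup check being exercised. It is not ZooKeeper's actual code: the class and method names are hypothetical, and it only relies on the usual file naming, where transaction logs start with "log." and snapshots start with "snapshot.".

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of a dataDir/dataLogDir sanity check.
final class DirContentCheck {

    /**
     * Returns true when the file names suggest the two directories were
     * swapped: transaction logs in the snapshot directory, or snapshots
     * in the transaction log directory.
     */
    static boolean looksSwapped(List<String> snapDirFiles, List<String> logDirFiles) {
        boolean logsInSnapDir = snapDirFiles.stream().anyMatch(n -> n.startsWith("log."));
        boolean snapsInLogDir = logDirFiles.stream().anyMatch(n -> n.startsWith("snapshot."));
        return logsInSnapDir || snapsInLogDir;
    }

    public static void main(String[] args) {
        // Sample listings only; a real check would list the configured directories.
        List<String> snapDir = Arrays.asList("log.100000001", "log.100000a3f");
        List<String> logDir = Arrays.asList("snapshot.0", "snapshot.100000a3e");
        if (looksSwapped(snapDir, logDir)) {
            System.out.println("dataDir and dataLogDir look swapped; refusing to start");
        }
    }
}
```

A check like this only makes sense when the two settings point at different directories; with a single shared directory both kinds of file legitimately live side by side.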

Patrick

On Thu, Mar 22, 2018 at 1:05 PM, Abraham Fine <af...@apache.org> wrote:

> This is a bugfix release candidate for 3.4.12. It fixes 22 issues,
> including issues that
> affect incorrect handling of the dataDir and the dataLogDir.
>
> The full release notes are available at:
>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?
> projectId=12310801&version=12342040
>
> *** Please download, test and vote by March 27th 2018, 23:59 UTC+0. ***
>
> Source files:
> http://people.apache.org/~afine/zookeeper-3.4.12-candidate-0/
>
> Maven staging repo:
> https://repository.apache.org/content/groups/staging/org/
> apache/zookeeper/zookeeper/3.4.12/
>
> The release candidate tag in git to be voted upon: release-3.4.12-rc0
>
> ZooKeeper's KEYS file containing PGP keys we use to sign the release:
> http://www.apache.org/dist/zookeeper/KEYS
>
> Should we release this candidate?
>


Re: [VOTE] Apache ZooKeeper release 3.4.12 candidate 0

2018-03-26 Thread Andor Molnar
Hi Abe,

Thanks for putting this important release together.

+1

I've run the following tests:
- verified signatures,
- all unit tests passed on Mac (java only) and Ubuntu 16.04 (java & c++
successful),
- tested standalone and 3-node cluster,
- verified the fix & validation of datadir/logdir swap,
- verified a few 4lw commands
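
For readers unfamiliar with the 4lw check in the last item, a small Java sketch of the exchange follows. A four-letter word is sent as plain ASCII over a TCP connection to the client port, and the server replies and closes the connection ("ruok" is answered with "imok"). The class name, the helper names, and the stub server are assumptions made so the example runs without a real ensemble; against a real server, the host and client port (2181 by default) would be used instead.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical client for four-letter-word commands.
final class FourLetterWord {

    /** Sends one four-letter command and returns the server's reply, trimmed. */
    static String send(String host, int port, String cmd) {
        try (Socket socket = new Socket(host, port)) {
            socket.setSoTimeout(5_000); // do not hang forever on a dead server
            OutputStream out = socket.getOutputStream();
            out.write(cmd.getBytes(StandardCharsets.US_ASCII));
            out.flush();
            socket.shutdownOutput(); // end of request, as nc would do

            // The server answers and then closes the connection, so read to EOF.
            InputStream in = socket.getInputStream();
            ByteArrayOutputStream reply = new ByteArrayOutputStream();
            byte[] buf = new byte[1024];
            int n;
            while ((n = in.read(buf)) != -1) {
                reply.write(buf, 0, n);
            }
            return new String(reply.toByteArray(), StandardCharsets.US_ASCII).trim();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Stub only (not ZooKeeper): answers a single connection with a fixed reply. */
    static int startStub(String reply) {
        try {
            ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
            Thread t = new Thread(() -> {
                try (ServerSocket ss = server; Socket client = ss.accept()) {
                    byte[] cmd = new byte[4];
                    int read = 0;
                    while (read < cmd.length) {
                        int r = client.getInputStream().read(cmd, read, cmd.length - read);
                        if (r < 0) break;
                        read += r;
                    }
                    client.getOutputStream().write(reply.getBytes(StandardCharsets.US_ASCII));
                    client.getOutputStream().flush();
                } catch (IOException ignored) {
                    // Stub: nothing useful to do on failure.
                }
            });
            t.setDaemon(true);
            t.start();
            return server.getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String host = InetAddress.getLoopbackAddress().getHostAddress();
        int port = startStub("imok");
        System.out.println(send(host, port, "ruok"));
    }
}
```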

Regards,
Andor



On Fri, Mar 23, 2018 at 1:16 AM, Ted Yu  wrote:

> Hi,
> I ran test suite for the RC.
>
> Testcase: testSessionTimeout took 22.686 sec
>   Caused an ERROR
> KeeperErrorCode = ConnectionLoss for /stest
> org.apache.zookeeper.KeeperException$ConnectionLossException:
> KeeperErrorCode = ConnectionLoss for /stest
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:54)
>   at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1105)
>   at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1133)
>   at
> org.apache.zookeeper.test.SessionTest.testSessionTimeout(SessionTest.java:300)
>
>
>
> Testcase: testWatcherAutoResetDisabledWithLocal took 8.545 sec
>   Caused an ERROR
> KeeperErrorCode = ConnectionLoss for /watchtest/child2
> org.apache.zookeeper.KeeperException$ConnectionLossException:
> KeeperErrorCode = ConnectionLoss for /watchtest/child2
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:54)
>   at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:876)
>   at
> org.apache.zookeeper.test.WatcherTest.testWatcherAutoReset(WatcherTest.java:369)
>   at
> org.apache.zookeeper.test.WatcherTest.testWatcherAutoResetWithLocal(WatcherTest.java:255)
>   at
> org.apache.zookeeper.test.WatcherTest.testWatcherAutoResetDisabledWithLocal(WatcherTest.java:268)
>
> Has anyone else seen the above test failures ?
>
> Cheers
>


Re: [VOTE] Apache ZooKeeper release 3.4.12 candidate 0

2018-03-25 Thread Michael Han
-1 because the release notes are not correct. ZOOKEEPER-2948 and ZOOKEEPER-2806
should be included in the release notes, and the corresponding JIRA needs an update.

Other than this, rc0 looks good.

- verified xsum/sig
- went through the release notes.
- verified clusters of different sizes.
- verified with a few 4lw commands.
- verified that the data / log dir swap was fixed.
- all unit tests passed.


On Fri, Mar 23, 2018 at 1:23 PM, Ted Yu  wrote:

> I just ran the test suite again - there was no test failure.
>
> So +1 from my side.
>
> On Fri, Mar 23, 2018 at 9:54 AM, Abraham Fine  wrote:
>
> > Do they always fail when run with the rest of the test suite or is it
> > inconsistent?
> >
> > > The reason I ask is that the failure you are reporting is a ConnectionLoss
> > > and testSessionTimeout has a history of being flaky (generally on ZooKeeper
> > > 3.5 though).
> >
> >
> > On Fri, Mar 23, 2018, at 09:45, Ted Yu wrote:
> > > Here is OS:
> > >
> > > > Linux h.com 3.10.0-327.28.3.el7.x86_64 #1 SMP Thu Aug 18 19:05:49 UTC 2016
> > > > x86_64 x86_64 x86_64 GNU/Linux
> > >
> > > java version "1.8.0_161"
> > > Java(TM) SE Runtime Environment (build 1.8.0_161-b12)
> > > Java HotSpot(TM) 64-Bit Server VM (build 25.161-b12, mixed mode)
> > >
> > > The tests don't fail when run alone.
> > >
> > > FYI
> > >
> > > > On Fri, Mar 23, 2018 at 9:41 AM, Abraham Fine  wrote:
> > >
> > > > Hi Ted-
> > > >
> > > > > Thanks for running the test cases on the RC. I am not able to reproduce
> > > > > the failures. Would you mind telling us a little bit more about the
> > > > > environment you are running the tests in (operating system, jvm)? In
> > > > > addition, do the failures occur every time you run the tests or just
> > > > > occasionally?
> > > >
> > > > Thanks,
> > > > Abe
> > > >
> > > > On Thu, Mar 22, 2018, at 17:16, Ted Yu wrote:
> > > > > Hi,
> > > > > I ran test suite for the RC.
> > > > >
> > > > > Testcase: testSessionTimeout took 22.686 sec
> > > > >   Caused an ERROR
> > > > > KeeperErrorCode = ConnectionLoss for /stest
> > > > > org.apache.zookeeper.KeeperException$ConnectionLossException:
> > > > > KeeperErrorCode = ConnectionLoss for /stest
> > > > > >   at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
> > > > > >   at org.apache.zookeeper.KeeperException.create(KeeperException.java:54)
> > > > > >   at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1105)
> > > > > >   at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1133)
> > > > > >   at
> > > > > > org.apache.zookeeper.test.SessionTest.testSessionTimeout(SessionTest.java:300)
> > > > > >
> > > > > >
> > > > > >
> > > > > > Testcase: testWatcherAutoResetDisabledWithLocal took 8.545 sec
> > > > > >   Caused an ERROR
> > > > > > KeeperErrorCode = ConnectionLoss for /watchtest/child2
> > > > > > org.apache.zookeeper.KeeperException$ConnectionLossException:
> > > > > > KeeperErrorCode = ConnectionLoss for /watchtest/child2
> > > > > >   at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
> > > > > >   at org.apache.zookeeper.KeeperException.create(KeeperException.java:54)
> > > > > >   at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:876)
> > > > > >   at
> > > > > > org.apache.zookeeper.test.WatcherTest.testWatcherAutoReset(WatcherTest.java:369)
> > > > > >   at
> > > > > > org.apache.zookeeper.test.WatcherTest.testWatcherAutoResetWithLocal(WatcherTest.java:255)
> > > > > >   at
> > > > > > org.apache.zookeeper.test.WatcherTest.testWatcherAutoResetDisabledWithLocal(WatcherTest.java:268)
> > > > >
> > > > > Has anyone else seen the above test failures ?
> > > > >
> > > > > Cheers
> > > >
> >
>


Re: [VOTE] Apache ZooKeeper release 3.4.12 candidate 0

2018-03-23 Thread Ted Yu
I just ran the test suite again - there was no test failure.

So +1 from my side.

On Fri, Mar 23, 2018 at 9:54 AM, Abraham Fine  wrote:

> Do they always fail when run with the rest of the test suite or is it
> inconsistent?
>
> The reason I ask is that the failure you are reporting is a ConnectionLoss
> and testSessionTimeout has a history of being flaky (generally on ZooKeeper
> 3.5 though).
>
>
> On Fri, Mar 23, 2018, at 09:45, Ted Yu wrote:
> > Here is OS:
> >
> > Linux h.com 3.10.0-327.28.3.el7.x86_64 #1 SMP Thu Aug 18 19:05:49 UTC 2016
> > x86_64 x86_64 x86_64 GNU/Linux
> >
> > java version "1.8.0_161"
> > Java(TM) SE Runtime Environment (build 1.8.0_161-b12)
> > Java HotSpot(TM) 64-Bit Server VM (build 25.161-b12, mixed mode)
> >
> > The tests don't fail when run alone.
> >
> > FYI
> >
> > On Fri, Mar 23, 2018 at 9:41 AM, Abraham Fine  wrote:
> >
> > > Hi Ted-
> > >
> > > Thanks for running the test cases on the RC. I am not able to reproduce
> > > the failures. Would you mind telling us a little bit more about the
> > > environment you are running the tests in (operating system, jvm)? In
> > > > addition, do the failures occur every time you run the tests or just
> > > occasionally?
> > >
> > > Thanks,
> > > Abe
> > >
> > > On Thu, Mar 22, 2018, at 17:16, Ted Yu wrote:
> > > > Hi,
> > > > I ran test suite for the RC.
> > > >
> > > > Testcase: testSessionTimeout took 22.686 sec
> > > >   Caused an ERROR
> > > > KeeperErrorCode = ConnectionLoss for /stest
> > > > org.apache.zookeeper.KeeperException$ConnectionLossException:
> > > > KeeperErrorCode = ConnectionLoss for /stest
> > > > >   at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
> > > > >   at org.apache.zookeeper.KeeperException.create(KeeperException.java:54)
> > > > >   at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1105)
> > > > >   at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1133)
> > > > >   at
> > > > > org.apache.zookeeper.test.SessionTest.testSessionTimeout(SessionTest.java:300)
> > > > >
> > > > >
> > > > >
> > > > > Testcase: testWatcherAutoResetDisabledWithLocal took 8.545 sec
> > > > >   Caused an ERROR
> > > > > KeeperErrorCode = ConnectionLoss for /watchtest/child2
> > > > > org.apache.zookeeper.KeeperException$ConnectionLossException:
> > > > > KeeperErrorCode = ConnectionLoss for /watchtest/child2
> > > > >   at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
> > > > >   at org.apache.zookeeper.KeeperException.create(KeeperException.java:54)
> > > > >   at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:876)
> > > > >   at
> > > > > org.apache.zookeeper.test.WatcherTest.testWatcherAutoReset(WatcherTest.java:369)
> > > > >   at
> > > > > org.apache.zookeeper.test.WatcherTest.testWatcherAutoResetWithLocal(WatcherTest.java:255)
> > > > >   at
> > > > > org.apache.zookeeper.test.WatcherTest.testWatcherAutoResetDisabledWithLocal(WatcherTest.java:268)
> > > >
> > > > Has anyone else seen the above test failures ?
> > > >
> > > > Cheers
> > >
>


Re: [VOTE] Apache ZooKeeper release 3.4.12 candidate 0

2018-03-23 Thread Abraham Fine
Do they always fail when run with the rest of the test suite or is it 
inconsistent?

The reason I ask is that the failure you are reporting is a ConnectionLoss and 
testSessionTimeout has a history of being flaky (generally on ZooKeeper 3.5 
though). 
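
As background on why a ConnectionLoss tends to show up as an intermittent failure rather than a consistent one: it reports a transient loss of the client connection, so callers that want to tolerate it usually retry. The Java sketch below shows that pattern in isolation. It is an assumption-laden illustration, not what the tests do: ConnectionLoss is a hypothetical stand-in for org.apache.zookeeper.KeeperException.ConnectionLossException so the example compiles without the ZooKeeper jar, and the retry helper and its parameters are invented for the example.

```java
import java.util.function.Supplier;

// Hypothetical retry helper for transient connection loss.
final class RetryOnConnectionLoss {

    /** Stand-in for KeeperException.ConnectionLossException (assumption). */
    static final class ConnectionLoss extends RuntimeException {
        ConnectionLoss(String path) {
            super("KeeperErrorCode = ConnectionLoss for " + path);
        }
    }

    /** Runs op, retrying with a linearly growing pause while it reports ConnectionLoss. */
    static <T> T retry(Supplier<T> op, int maxAttempts, long backoffMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        ConnectionLoss last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.get();
            } catch (ConnectionLoss e) {
                last = e;
                if (attempt < maxAttempts) {
                    pause(backoffMillis * attempt);
                }
            }
        }
        throw last; // every attempt failed; surface the last failure
    }

    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while backing off", e);
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated read that loses its connection twice before succeeding.
        String result = retry(() -> {
            if (++calls[0] < 3) {
                throw new ConnectionLoss("/stest");
            }
            return "exists";
        }, 5, 10L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

Blind retries like this are only safe for idempotent calls such as exists; a delete that hit a connection loss may already have been applied on the server.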


On Fri, Mar 23, 2018, at 09:45, Ted Yu wrote:
> Here is OS:
> 
> Linux h.com 3.10.0-327.28.3.el7.x86_64 #1 SMP Thu Aug 18 19:05:49 UTC 2016
> x86_64 x86_64 x86_64 GNU/Linux
> 
> java version "1.8.0_161"
> Java(TM) SE Runtime Environment (build 1.8.0_161-b12)
> Java HotSpot(TM) 64-Bit Server VM (build 25.161-b12, mixed mode)
> 
> The tests don't fail when run alone.
> 
> FYI
> 
> On Fri, Mar 23, 2018 at 9:41 AM, Abraham Fine  wrote:
> 
> > Hi Ted-
> >
> > Thanks for running the test cases on the RC. I am not able to reproduce
> > the failures. Would you mind telling us a little bit more about the
> > environment you are running the tests in (operating system, jvm)? In
> > addition, do the failures occur every time you run the tests or just
> > occasionally?
> >
> > Thanks,
> > Abe
> >
> > On Thu, Mar 22, 2018, at 17:16, Ted Yu wrote:
> > > Hi,
> > > I ran test suite for the RC.
> > >
> > > Testcase: testSessionTimeout took 22.686 sec
> > >   Caused an ERROR
> > > KeeperErrorCode = ConnectionLoss for /stest
> > > org.apache.zookeeper.KeeperException$ConnectionLossException:
> > > KeeperErrorCode = ConnectionLoss for /stest
> > >   at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
> > >   at org.apache.zookeeper.KeeperException.create(KeeperException.java:54)
> > >   at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1105)
> > >   at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1133)
> > >   at
> > > org.apache.zookeeper.test.SessionTest.testSessionTimeout(SessionTest.java:300)
> > >
> > >
> > >
> > > Testcase: testWatcherAutoResetDisabledWithLocal took 8.545 sec
> > >   Caused an ERROR
> > > KeeperErrorCode = ConnectionLoss for /watchtest/child2
> > > org.apache.zookeeper.KeeperException$ConnectionLossException:
> > > KeeperErrorCode = ConnectionLoss for /watchtest/child2
> > >   at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
> > >   at org.apache.zookeeper.KeeperException.create(KeeperException.java:54)
> > >   at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:876)
> > >   at
> > > org.apache.zookeeper.test.WatcherTest.testWatcherAutoReset(WatcherTest.java:369)
> > >   at
> > > org.apache.zookeeper.test.WatcherTest.testWatcherAutoResetWithLocal(WatcherTest.java:255)
> > >   at
> > > org.apache.zookeeper.test.WatcherTest.testWatcherAutoResetDisabledWithLocal(WatcherTest.java:268)
> > >
> > > Has anyone else seen the above test failures ?
> > >
> > > Cheers
> >


Re: [VOTE] Apache ZooKeeper release 3.4.12 candidate 0

2018-03-23 Thread Ted Yu
Here is OS:

Linux h.com 3.10.0-327.28.3.el7.x86_64 #1 SMP Thu Aug 18 19:05:49 UTC 2016
x86_64 x86_64 x86_64 GNU/Linux

java version "1.8.0_161"
Java(TM) SE Runtime Environment (build 1.8.0_161-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.161-b12, mixed mode)

The tests don't fail when run alone.

FYI

On Fri, Mar 23, 2018 at 9:41 AM, Abraham Fine  wrote:

> Hi Ted-
>
> Thanks for running the test cases on the RC. I am not able to reproduce
> the failures. Would you mind telling us a little bit more about the
> environment you are running the tests in (operating system, jvm)? In
> addition, do the failures occur every time you run the tests or just
> occasionally?
>
> Thanks,
> Abe
>
> On Thu, Mar 22, 2018, at 17:16, Ted Yu wrote:
> > Hi,
> > I ran test suite for the RC.
> >
> > Testcase: testSessionTimeout took 22.686 sec
> >   Caused an ERROR
> > KeeperErrorCode = ConnectionLoss for /stest
> > org.apache.zookeeper.KeeperException$ConnectionLossException:
> > KeeperErrorCode = ConnectionLoss for /stest
> >   at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
> >   at org.apache.zookeeper.KeeperException.create(KeeperException.java:54)
> >   at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1105)
> >   at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1133)
> >   at
> > org.apache.zookeeper.test.SessionTest.testSessionTimeout(SessionTest.java:300)
> >
> >
> >
> > Testcase: testWatcherAutoResetDisabledWithLocal took 8.545 sec
> >   Caused an ERROR
> > KeeperErrorCode = ConnectionLoss for /watchtest/child2
> > org.apache.zookeeper.KeeperException$ConnectionLossException:
> > KeeperErrorCode = ConnectionLoss for /watchtest/child2
> >   at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
> >   at org.apache.zookeeper.KeeperException.create(KeeperException.java:54)
> >   at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:876)
> >   at
> > org.apache.zookeeper.test.WatcherTest.testWatcherAutoReset(WatcherTest.java:369)
> >   at
> > org.apache.zookeeper.test.WatcherTest.testWatcherAutoResetWithLocal(WatcherTest.java:255)
> >   at
> > org.apache.zookeeper.test.WatcherTest.testWatcherAutoResetDisabledWithLocal(WatcherTest.java:268)
> >
> > Has anyone else seen the above test failures ?
> >
> > Cheers
>


Re: [VOTE] Apache ZooKeeper release 3.4.12 candidate 0

2018-03-23 Thread Abraham Fine
Hi Ted-

Thanks for running the test cases on the RC. I am not able to reproduce the 
failures. Would you mind telling us a little bit more about the environment you 
are running the tests in (operating system, jvm)? In addition, do the failures 
occur every time you run the tests or just occasionally?

Thanks,
Abe

On Thu, Mar 22, 2018, at 17:16, Ted Yu wrote:
> Hi,
> I ran test suite for the RC.
> 
> Testcase: testSessionTimeout took 22.686 sec
>   Caused an ERROR
> KeeperErrorCode = ConnectionLoss for /stest
> org.apache.zookeeper.KeeperException$ConnectionLossException:
> KeeperErrorCode = ConnectionLoss for /stest
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:54)
>   at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1105)
>   at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1133)
>   at
> org.apache.zookeeper.test.SessionTest.testSessionTimeout(SessionTest.java:300)
> 
> 
> 
> Testcase: testWatcherAutoResetDisabledWithLocal took 8.545 sec
>   Caused an ERROR
> KeeperErrorCode = ConnectionLoss for /watchtest/child2
> org.apache.zookeeper.KeeperException$ConnectionLossException:
> KeeperErrorCode = ConnectionLoss for /watchtest/child2
>   at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
>   at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:54)
>   at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:876)
>   at
> org.apache.zookeeper.test.WatcherTest.testWatcherAutoReset(WatcherTest.java:369)
>   at
> org.apache.zookeeper.test.WatcherTest.testWatcherAutoResetWithLocal(WatcherTest.java:255)
>   at
> org.apache.zookeeper.test.WatcherTest.testWatcherAutoResetDisabledWithLocal(WatcherTest.java:268)
> 
> Has anyone else seen the above test failures ?
> 
> Cheers


Re: [VOTE] Apache ZooKeeper release 3.4.12 candidate 0

2018-03-22 Thread Ted Yu
Hi,
I ran test suite for the RC.

Testcase: testSessionTimeout took 22.686 sec
  Caused an ERROR
KeeperErrorCode = ConnectionLoss for /stest
org.apache.zookeeper.KeeperException$ConnectionLossException:
KeeperErrorCode = ConnectionLoss for /stest
  at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
  at org.apache.zookeeper.KeeperException.create(KeeperException.java:54)
  at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1105)
  at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1133)
  at
org.apache.zookeeper.test.SessionTest.testSessionTimeout(SessionTest.java:300)



Testcase: testWatcherAutoResetDisabledWithLocal took 8.545 sec
  Caused an ERROR
KeeperErrorCode = ConnectionLoss for /watchtest/child2
org.apache.zookeeper.KeeperException$ConnectionLossException:
KeeperErrorCode = ConnectionLoss for /watchtest/child2
  at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
  at org.apache.zookeeper.KeeperException.create(KeeperException.java:54)
  at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:876)
  at
org.apache.zookeeper.test.WatcherTest.testWatcherAutoReset(WatcherTest.java:369)
  at
org.apache.zookeeper.test.WatcherTest.testWatcherAutoResetWithLocal(WatcherTest.java:255)
  at
org.apache.zookeeper.test.WatcherTest.testWatcherAutoResetDisabledWithLocal(WatcherTest.java:268)

Has anyone else seen the above test failures ?

Cheers


[VOTE] Apache ZooKeeper release 3.4.12 candidate 0

2018-03-22 Thread Abraham Fine
This is a bugfix release candidate for 3.4.12. It fixes 22 issues, including 
issues that
affect incorrect handling of the dataDir and the dataLogDir.
 
The full release notes are available at:
 
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310801&version=12342040
 
*** Please download, test and vote by March 27th 2018, 23:59 UTC+0. ***
 
Source files:
http://people.apache.org/~afine/zookeeper-3.4.12-candidate-0/
 
Maven staging repo:
https://repository.apache.org/content/groups/staging/org/apache/zookeeper/zookeeper/3.4.12/
 
The release candidate tag in git to be voted upon: release-3.4.12-rc0
 
ZooKeeper's KEYS file containing PGP keys we use to sign the release:
http://www.apache.org/dist/zookeeper/KEYS
 
Should we release this candidate?


Re: 3.4.12

2018-03-15 Thread Flavio Junqueira
+1 for cutting a 3.4.12 RC. Thanks for volunteering, Abe.

-Flavio

> On 8 Mar 2018, at 18:53, Rakesh Radhakrishnan <rake...@apache.org> wrote:
> 
> Appreciate Abe for the initiative and efforts!
> 
> +1, for "3.4.12" releasing.
> 
> Please feel free to ping me if any help needed when making this release.
> 
> Regards,
> Rakesh
> 
> On Sat, Mar 3, 2018 at 4:19 AM, Abraham Fine <af...@apache.org> wrote:
> 
>> I am very much interested in taking a turn as a RM and I think it is a
>> great time to do a release (now that 2967, 2249, and 2960 are merged in).
>> 
>> I agree that ZOOKEEPER-2184 can be pushed again and I don't think there is
>> anything else that we need to merge in before cutting a release.
>> 
>> Abe
>> 
>> On Thu, Mar 1, 2018, at 21:52, Patrick Hunt wrote:
>>> There are 19 resolved issues http://bit.ly/2oK9aTx
>>> and 14 unresolved http://bit.ly/2oFWywS
>>> ZOOKEEPER-2184 is the only unresolved blocker, however that's not a
>>> regression and was pushed from 3.4.11, we could do so again given it's
>>> still being worked on.
>>> 
>>> Abe are you interested in taking a turn as RM?
>>> 
>>> Patrick
>>> 
>>> On Thu, Mar 1, 2018 at 4:38 PM, Andor Molnar <an...@cloudera.com> wrote:
>>> 
>>>> Hi dev,
>>>> 
>>>> User has recently run into the regression of 3.4.11 (ZOOKEEPER-2960
>>>> <https://issues.apache.org/jira/browse/ZOOKEEPER-2960>) (again?)
>>>> Are we good to cut 3.4.12 soon or still waiting for something to be
>>>> committed?
>>>> 
>>>> Andor
>>>> 
>> 



Re: 3.4.12

2018-03-08 Thread Rakesh Radhakrishnan
Appreciate Abe for the initiative and efforts!

+1, for "3.4.12" releasing.

Please feel free to ping me if any help is needed when making this release.

Regards,
Rakesh

On Sat, Mar 3, 2018 at 4:19 AM, Abraham Fine <af...@apache.org> wrote:

> I am very much interested in taking a turn as a RM and I think it is a
> great time to do a release (now that 2967, 2249, and 2960 are merged in).
>
> I agree that ZOOKEEPER-2184 can be pushed again and I don't think there is
> anything else that we need to merge in before cutting a release.
>
> Abe
>
> On Thu, Mar 1, 2018, at 21:52, Patrick Hunt wrote:
> > There are 19 resolved issues http://bit.ly/2oK9aTx
> > and 14 unresolved http://bit.ly/2oFWywS
> > ZOOKEEPER-2184 is the only unresolved blocker, however that's not a
> > regression and was pushed from 3.4.11, we could do so again given it's
> > still being worked on.
> >
> > Abe are you interested in taking a turn as RM?
> >
> > Patrick
> >
> > On Thu, Mar 1, 2018 at 4:38 PM, Andor Molnar <an...@cloudera.com> wrote:
> >
> > > Hi dev,
> > >
> > > User has recently run into the regression of 3.4.11 (ZOOKEEPER-2960
> > > <https://issues.apache.org/jira/browse/ZOOKEEPER-2960>) (again?)
> > > Are we good to cut 3.4.12 soon or still waiting for something to be
> > > committed?
> > >
> > > Andor
> > >
>


Re: 3.4.12

2018-03-05 Thread Abraham Fine
Thanks, Pat.

I'll get started on this ASAP.

On Mon, Mar 5, 2018, at 10:14, Patrick Hunt wrote:
> Thanks for "volunteering" Abe. :-)
> 
> sgtm, +1 - there are some important fixes to get out for 3.4.
> 
> FYI the "how to release" page is here:
> https://cwiki.apache.org/confluence/display/ZOOKEEPER/HowToRelease+using+git
> Feel free to ping me if you run into issues (and update that page if you do
> find issues).
> 
> Regards,
> 
> Patrick
> 
> On Fri, Mar 2, 2018 at 2:49 PM, Abraham Fine <af...@apache.org> wrote:
> 
> > I am very much interested in taking a turn as a RM and I think it is a
> > great time to do a release (now that 2967, 2249, and 2960 are merged in).
> >
> > I agree that ZOOKEEPER-2184 can be pushed again and I don't think there is
> > anything else that we need to merge in before cutting a release.
> >
> > Abe
> >
> > On Thu, Mar 1, 2018, at 21:52, Patrick Hunt wrote:
> > > There are 19 resolved issues http://bit.ly/2oK9aTx
> > > and 14 unresolved http://bit.ly/2oFWywS
> > > ZOOKEEPER-2184 is the only unresolved blocker, however that's not a
> > > regression and was pushed from 3.4.11, we could do so again given it's
> > > still being worked on.
> > >
> > > Abe are you interested in taking a turn as RM?
> > >
> > > Patrick
> > >
> > > On Thu, Mar 1, 2018 at 4:38 PM, Andor Molnar <an...@cloudera.com> wrote:
> > >
> > > > Hi dev,
> > > >
> > > > User has recently run into the regression of 3.4.11 (ZOOKEEPER-2960
> > > > <https://issues.apache.org/jira/browse/ZOOKEEPER-2960>) (again?)
> > > > Are we good to cut 3.4.12 soon or still waiting for something to be
> > > > committed?
> > > >
> > > > Andor
> > > >
> >


Re: 3.4.12

2018-03-05 Thread Patrick Hunt
Thanks for "volunteering" Abe. :-)

sgtm, +1 - there are some important fixes to get out for 3.4.

FYI the "how to release" page is here:
https://cwiki.apache.org/confluence/display/ZOOKEEPER/HowToRelease+using+git
Feel free to ping me if you run into issues (and update that page if you do
find issues).

Regards,

Patrick

On Fri, Mar 2, 2018 at 2:49 PM, Abraham Fine <af...@apache.org> wrote:

> I am very much interested in taking a turn as a RM and I think it is a
> great time to do a release (now that 2967, 2249, and 2960 are merged in).
>
> I agree that ZOOKEEPER-2184 can be pushed again and I don't think there is
> anything else that we need to merge in before cutting a release.
>
> Abe
>
> On Thu, Mar 1, 2018, at 21:52, Patrick Hunt wrote:
> > There are 19 resolved issues http://bit.ly/2oK9aTx
> > and 14 unresolved http://bit.ly/2oFWywS
> > ZOOKEEPER-2184 is the only unresolved blocker, however that's not a
> > regression and was pushed from 3.4.11, we could do so again given it's
> > still being worked on.
> >
> > Abe are you interested in taking a turn as RM?
> >
> > Patrick
> >
> > On Thu, Mar 1, 2018 at 4:38 PM, Andor Molnar <an...@cloudera.com> wrote:
> >
> > > Hi dev,
> > >
> > > User has recently run into the regression of 3.4.11 (ZOOKEEPER-2960
> > > <https://issues.apache.org/jira/browse/ZOOKEEPER-2960>) (again?)
> > > Are we good to cut 3.4.12 soon or still waiting for something to be
> > > committed?
> > >
> > > Andor
> > >
>


Re: 3.4.12

2018-03-02 Thread Abraham Fine
I am very much interested in taking a turn as a RM and I think it is a great 
time to do a release (now that 2967, 2249, and 2960 are merged in).

I agree that ZOOKEEPER-2184 can be pushed again and I don't think there is 
anything else that we need to merge in before cutting a release.

Abe

On Thu, Mar 1, 2018, at 21:52, Patrick Hunt wrote:
> There are 19 resolved issues http://bit.ly/2oK9aTx
> and 14 unresolved http://bit.ly/2oFWywS
> ZOOKEEPER-2184 is the only unresolved blocker, however that's not a
> regression and was pushed from 3.4.11, we could do so again given it's
> still being worked on.
> 
> Abe are you interested in taking a turn as RM?
> 
> Patrick
> 
> On Thu, Mar 1, 2018 at 4:38 PM, Andor Molnar <an...@cloudera.com> wrote:
> 
> > Hi dev,
> >
> > User has recently run into the regression of 3.4.11 (ZOOKEEPER-2960
> > <https://issues.apache.org/jira/browse/ZOOKEEPER-2960>) (again?)
> > Are we good to cut 3.4.12 soon or still waiting for something to be
> > committed?
> >
> > Andor
> >


Re: 3.4.12

2018-03-01 Thread Patrick Hunt
There are 19 resolved issues http://bit.ly/2oK9aTx
and 14 unresolved http://bit.ly/2oFWywS
ZOOKEEPER-2184 is the only unresolved blocker, however that's not a
regression and was pushed from 3.4.11, we could do so again given it's
still being worked on.

Abe are you interested in taking a turn as RM?

Patrick

On Thu, Mar 1, 2018 at 4:38 PM, Andor Molnar <an...@cloudera.com> wrote:

> Hi dev,
>
> User has recently run into the regression of 3.4.11 (ZOOKEEPER-2960
> <https://issues.apache.org/jira/browse/ZOOKEEPER-2960>) (again?)
> Are we good to cut 3.4.12 soon or still waiting for something to be
> committed?
>
> Andor
>


3.4.12

2018-03-01 Thread Andor Molnar
Hi dev,

A user has recently run into the 3.4.11 regression (ZOOKEEPER-2960
<https://issues.apache.org/jira/browse/ZOOKEEPER-2960>) (again?).
Are we good to cut 3.4.12 soon, or are we still waiting for something to be
committed?

Andor