Re: Why does ZooKeeper follower shutdown itself when it can not read from leader

2019-05-22 Thread Patrick Hunt
That was/is the original intent.  ZK was built to "fail fast" when it
didn't know how to handle a particular case, or that case might be error
prone to handle. The expectation is that the parent will restart the ZK
server process when it fails.
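
A minimal sketch of the "parent restarts the server" expectation, assuming the parent is a small Java supervisor. In practice a service manager usually plays this role. The class name, the attempt limit, and the command passed in are illustrative assumptions and are not part of ZooKeeper.

```java
import java.io.IOException;
import java.util.List;

// Hypothetical supervisor (not ZooKeeper code): relaunches a child process
// each time it fails, up to a fixed number of attempts.
class RestartingSupervisor {

    /** Runs the command, relaunching after every failure; returns the number of attempts made. */
    static int supervise(List<String> command, int maxAttempts) {
        int attempts = 0;
        while (attempts < maxAttempts) {
            attempts++;
            try {
                Process child = new ProcessBuilder(command).inheritIO().start();
                int exitCode = child.waitFor();
                if (exitCode == 0) {
                    break; // clean shutdown: do not restart
                }
                System.err.println("child exited with code " + exitCode + ", restarting");
            } catch (IOException e) {
                // The child could not be launched at all; counted as a failed attempt.
                System.err.println("failed to launch child: " + e.getMessage());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return attempts;
    }
}
```

A call such as `RestartingSupervisor.supervise(Arrays.asList("bin/zkServer.sh", "start-foreground"), 5)` would keep a foreground server running across fail-fast exits; the path is an assumption about the local installation.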

Patrick

On Wed, May 22, 2019 at 6:27 PM Qian Zhang  wrote:

> Hi Andor,
>
> I am using ZooKeeper release 3.4.10.
>
> I checked the code: if a follower fails to read from the leader (e.g., a read
> timeout), it closes the socket; see
>
> https://github.com/apache/zookeeper/blob/release-3.4.10/src/java/main/org/apache/zookeeper/server/quorum/Follower.java#L91:L85
> for
> details. Once the socket is closed, the follower's writes fail as well
> (I guess the same socket is used here), which is treated as a severe
> unrecoverable error and shuts the follower down; see
>
> https://github.com/apache/zookeeper/blob/release-3.4.10/src/java/main/org/apache/zookeeper/server/quorum/FollowerRequestProcessor.java#L90:L95
>  and
>
> https://github.com/apache/zookeeper/blob/release-3.4.10/src/java/main/org/apache/zookeeper/server/ZooKeeperCriticalThread.java#L48:L51
> .
>
> So it seems that shutting down a follower when it cannot read from the leader
> is the designed behavior? If my understanding is wrong, could you please tell
> me the intended behavior in this case? Thanks!
>
>
> Regards,
> Qian Zhang
>
>
> On Wed, May 22, 2019 at 8:52 AM Qian Zhang  wrote:
>
> > Does anyone have any ideas?
> >
> > Regards,
> > Qian Zhang
> >
> >
> > On Sun, May 19, 2019 at 6:15 PM Qian Zhang  wrote:
> >
> >> Hi,
> >>
> >> I have a ZooKeeper cluster with 5 nodes. Today the leader became unreachable
> >> due to a hardware issue, and I then found that the 4 followers had simply
> >> shut down. Here are the logs:
> >>
> >>> May 18 15:34:28 MD001076 java[29148]: [myid:1] WARN [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@89] - Exception when following the leader
> >>> java.net.SocketTimeoutException: Read timed out
> >>>     at java.net.SocketInputStream.socketRead0(Native Method)
> >>>     at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
> >>>     at java.net.SocketInputStream.read(SocketInputStream.java:171)
> >>>     at java.net.SocketInputStream.read(SocketInputStream.java:141)
> >>>     at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
> >>>     at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
> >>>     at java.io.DataInputStream.readInt(DataInputStream.java:387)
> >>>     at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
> >>>     at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
> >>>     at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:99)
> >>>     at org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:153)
> >>>     at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
> >>>     at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:937)
> >>> May 18 15:34:28 MD001076 java[29148]: [myid:1] INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@192] - Accepted socket connection from /10.249.255.10:42306
> >>> May 18 15:34:28 MD001076 java[29148]: [myid:1] WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@896] - Connection request from old client /10.249.255.10:42306; will be dropped if server is in r-o mode
> >>> May 18 15:34:28 MD001076 java[29148]: [myid:1] INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@942] - Client attempting to establish new session at /10.249.255.10:42306
> >>> May 18 15:34:28 MD001076 java[29148]: [myid:1] ERROR [FollowerRequestProcessor:1:ZooKeeperCriticalThread@49] - Severe unrecoverable error, from thread : FollowerRequestProcessor:1
> >>> java.net.SocketException: Socket closed
> >>>     at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:118)
> >>>     at java.net.SocketOutputStream.write(SocketOutputStream.java:155)
> >>>     at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
> >>> at
> >>> 

Re: Why does ZooKeeper follower shutdown itself when it can not read from leader

2019-05-22 Thread Qian Zhang
Hi Andor,

I am using ZooKeeper release 3.4.10.

I checked the code: if a follower fails to read from the leader (e.g., a read
timeout), it closes the socket; see
https://github.com/apache/zookeeper/blob/release-3.4.10/src/java/main/org/apache/zookeeper/server/quorum/Follower.java#L91:L85
for
details. Once the socket is closed, the follower's writes fail as well
(I guess the same socket is used here), which is treated as a severe
unrecoverable error and shuts the follower down; see
https://github.com/apache/zookeeper/blob/release-3.4.10/src/java/main/org/apache/zookeeper/server/quorum/FollowerRequestProcessor.java#L90:L95
 and
https://github.com/apache/zookeeper/blob/release-3.4.10/src/java/main/org/apache/zookeeper/server/ZooKeeperCriticalThread.java#L48:L51
.

So it seems that shutting down a follower when it cannot read from the leader
is the designed behavior? If my understanding is wrong, could you please tell
me the intended behavior in this case? Thanks!
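
The failure chain described above can be reproduced in isolation with plain `java.net` sockets. The sketch below is not ZooKeeper code, and the class and method names are hypothetical. It only shows that once the reading side closes a socket after a read timeout, a later write on the same socket raises `SocketException`, which matches the pair of exceptions in the logs.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketException;
import java.net.SocketTimeoutException;

// Hypothetical stand-alone demo: one socket, a timed-out read, then a failed write.
class ClosedSocketSketch {

    /** Returns a short record of the events that occurred, in order. */
    static String run() {
        StringBuilder events = new StringBuilder();
        try (ServerSocket leader = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket follower = new Socket(leader.getInetAddress(), leader.getLocalPort())) {
            // Obtain the stream up front, as a long-lived writer thread would.
            OutputStream out = follower.getOutputStream();
            follower.setSoTimeout(100); // read timeout in milliseconds
            try {
                follower.getInputStream().read(); // the "leader" never sends anything
            } catch (SocketTimeoutException e) {
                events.append("read-timeout;");
                follower.close(); // the reading side gives up and closes the socket
            }
            try {
                out.write(1); // the writing side still uses the same socket
                out.flush();
            } catch (SocketException e) {
                events.append("write-failed");
            }
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return events.toString();
    }
}
```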


Regards,
Qian Zhang


On Wed, May 22, 2019 at 8:52 AM Qian Zhang  wrote:

> Does anyone have any ideas?
>
> Regards,
> Qian Zhang
>
>
> On Sun, May 19, 2019 at 6:15 PM Qian Zhang  wrote:
>
>> Hi,
>>
>> I have a ZooKeeper cluster with 5 nodes. Today the leader became unreachable
>> due to a hardware issue, and I then found that the 4 followers had simply
>> shut down. Here are the logs:
>>
>>> May 18 15:34:28 MD001076 java[29148]: [myid:1] WARN [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@89] - Exception when following the leader
>>> java.net.SocketTimeoutException: Read timed out
>>>     at java.net.SocketInputStream.socketRead0(Native Method)
>>>     at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
>>>     at java.net.SocketInputStream.read(SocketInputStream.java:171)
>>>     at java.net.SocketInputStream.read(SocketInputStream.java:141)
>>>     at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
>>>     at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
>>>     at java.io.DataInputStream.readInt(DataInputStream.java:387)
>>>     at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
>>>     at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
>>>     at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:99)
>>>     at org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:153)
>>>     at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
>>>     at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:937)
>>> May 18 15:34:28 MD001076 java[29148]: [myid:1] INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@192] - Accepted socket connection from /10.249.255.10:42306
>>> May 18 15:34:28 MD001076 java[29148]: [myid:1] WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@896] - Connection request from old client /10.249.255.10:42306; will be dropped if server is in r-o mode
>>> May 18 15:34:28 MD001076 java[29148]: [myid:1] INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@942] - Client attempting to establish new session at /10.249.255.10:42306
>>> May 18 15:34:28 MD001076 java[29148]: [myid:1] ERROR [FollowerRequestProcessor:1:ZooKeeperCriticalThread@49] - Severe unrecoverable error, from thread : FollowerRequestProcessor:1
>>> java.net.SocketException: Socket closed
>>>     at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:118)
>>>     at java.net.SocketOutputStream.write(SocketOutputStream.java:155)
>>>     at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
>>>     at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
>>>     at org.apache.zookeeper.server.quorum.Learner.writePacket(Learner.java:139)
>>>     at org.apache.zookeeper.server.quorum.Learner.request(Learner.java:188)
>>>     at org.apache.zookeeper.server.quorum.FollowerRequestProcessor.run(FollowerRequestProcessor.java:90)
>>> May 18 15:34:28 MD001076 java[29148]: [myid:1] INFO
>>> 

[jira] [Commented] (ZOOKEEPER-1147) Add support for local sessions

2019-05-22 Thread Brian Nixon (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16846245#comment-16846245
 ] 

Brian Nixon commented on ZOOKEEPER-1147:


[~larsfrancke] - just created ZOOKEEPER-3400 to create some documentation.

> Add support for local sessions
> ------------------------------
>
>                 Key: ZOOKEEPER-1147
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1147
>             Project: ZooKeeper
>          Issue Type: Improvement
>          Components: server
>    Affects Versions: 3.3.3
>            Reporter: Vishal Kathuria
>            Assignee: Thawan Kooburat
>            Priority: Major
>              Labels: api-change, scaling
>             Fix For: 3.5.0
>
>         Attachments: ZOOKEEPER-1147.patch, ZOOKEEPER-1147.patch, 
> ZOOKEEPER-1147.patch, ZOOKEEPER-1147.patch, ZOOKEEPER-1147.patch, 
> ZOOKEEPER-1147.patch, ZOOKEEPER-1147.patch, ZOOKEEPER-1147.patch, 
> ZOOKEEPER-1147.patch
>
>   Original Estimate: 840h
>  Remaining Estimate: 840h
>
> This improvement is in the bucket of making ZooKeeper work at a large scale. 
> We are planning on having about 1 million clients connect to a ZooKeeper 
> ensemble through a set of 50-100 observers. The majority of these clients are 
> read-only, i.e., they do not do any updates or create ephemeral nodes.
> In ZooKeeper today, the client creates a session and the session creation is 
> handled like any other update. In the above use case, the session create/drop 
> workload can easily overwhelm an ensemble. The following is a proposal for a 
> "local session", to support a larger number of connections.
> 1.   The idea is to introduce a new type of session: the "local" session. A 
> "local" session doesn't have the full functionality of a normal session.
> 2.   Local sessions cannot create ephemeral nodes.
> 3.   Once a local session is lost, you cannot re-establish it using the 
> session-id/password. The session and its watches are gone for good.
> 4.   When a local session connects, the session info is only maintained 
> on the ZooKeeper server (in this case, an observer) that it is connected to. 
> The leader is not aware of the creation of such a session and there is no 
> state written to disk.
> 5.   The pings and expiration are handled by the server that the session 
> is connected to.
> With the above changes, we can make ZooKeeper scale to a much larger number 
> of clients without making the core ensemble a bottleneck.
> In terms of API, there are two options being considered:
> 1. Let the client specify at connect time which kind of session it 
> wants.
> 2. All sessions connect as local sessions and automatically get promoted to 
> global sessions when they do an operation that requires a global session 
> (e.g., creating an ephemeral node).
> Chubby took the approach of lazily promoting all sessions to global, but I 
> don't think that would work in our case, where we want to keep sessions which 
> never create ephemeral nodes as always local. Option 2 would make it more 
> broadly usable but option 1 would be easier to implement.
> We are thinking of implementing option 1 as the first cut. There would be a 
> client flag, IsLocalSession (much like the current readOnly flag), that would 
> be used to determine whether to create a local session or a global session.
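
The rules in the proposal above can be summarised as a toy model. This is only a sketch of option 1 under simplifying assumptions. None of the class, method, or field names below are ZooKeeper APIs, and the "leader write" counter merely stands in for work that would go through the quorum.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the proposal (hypothetical names, not ZooKeeper code).
class LocalSessionSketch {
    private final Map<Long, Boolean> sessions = new HashMap<>(); // sessionId -> isLocal
    private long nextId = 1;
    private int leaderWrites = 0; // operations that would have to go through the leader

    /** Creates a session; only a global session costs a leader write (point 4). */
    long connect(boolean isLocalSession) {
        long id = nextId++;
        sessions.put(id, isLocalSession);
        if (!isLocalSession) {
            leaderWrites++;
        }
        return id;
    }

    /** Local (or unknown) sessions cannot create ephemeral nodes (point 2). */
    boolean createEphemeral(long sessionId) {
        Boolean local = sessions.get(sessionId);
        if (local == null || local) {
            return false;
        }
        leaderWrites++;
        return true;
    }

    /** A dropped local session is forgotten immediately; a global one is kept in this model. */
    void disconnect(long sessionId) {
        Boolean local = sessions.get(sessionId);
        if (local != null && local) {
            sessions.remove(sessionId);
        }
    }

    /** Re-establishing works only if the session is still known (point 3). */
    boolean reconnect(long sessionId) {
        return sessions.containsKey(sessionId);
    }

    int leaderWrites() {
        return leaderWrites;
    }
}
```

In this model a million read-only clients connecting with the local flag would leave `leaderWrites()` at zero, which is the scaling benefit the proposal is after.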



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (ZOOKEEPER-3400) Add documentation on local sessions

2019-05-22 Thread Brian Nixon (JIRA)
Brian Nixon created ZOOKEEPER-3400:
--------------------------------------

             Summary: Add documentation on local sessions
                 Key: ZOOKEEPER-3400
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3400
             Project: ZooKeeper
          Issue Type: Improvement
          Components: documentation
    Affects Versions: 3.6.0, 3.5.6
            Reporter: Brian Nixon


ZOOKEEPER-1147 added local sessions (client sessions not ratified by the 
leader) to ZooKeeper as a lightweight augmentation of the existing global 
sessions.

 

Add some outward facing documentation that describes this feature 
([https://zookeeper.apache.org/doc/r3.5.5/zookeeperProgrammers.html#ch_zkSessions]
 seems like a reasonable place).





[GitHub] [zookeeper] eolivelli commented on issue #948: ZOOKEEPER-3394: Delay observer reconnect when all learner masters hav…

2019-05-22 Thread GitBox
eolivelli commented on issue #948: ZOOKEEPER-3394: Delay observer reconnect 
when all learner masters hav…
URL: https://github.com/apache/zookeeper/pull/948#issuecomment-494976524
 
 
   Maybe @lvfangmin had some problem during the execution of the merge script 
and he did not notice


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [zookeeper] enixon commented on issue #948: ZOOKEEPER-3394: Delay observer reconnect when all learner masters hav…

2019-05-22 Thread GitBox
enixon commented on issue #948: ZOOKEEPER-3394: Delay observer reconnect when 
all learner masters hav…
URL: https://github.com/apache/zookeeper/pull/948#issuecomment-494969597
 
 
   Do we expect the same corrupt commit message if 944 is merged and, if so, 
how do I avoid it?




[GitHub] [zookeeper] enixon commented on issue #939: ZOOKEEPER-3385: Add admin command to display leader

2019-05-22 Thread GitBox
enixon commented on issue #939: ZOOKEEPER-3385: Add admin command to display 
leader
URL: https://github.com/apache/zookeeper/pull/939#issuecomment-494965575
 
 
   retest ant build




[GitHub] [zookeeper] enixon commented on issue #924: ZOOKEEPER-3371: Port unification for Jetty admin server

2019-05-22 Thread GitBox
enixon commented on issue #924: ZOOKEEPER-3371: Port unification for Jetty 
admin server
URL: https://github.com/apache/zookeeper/pull/924#issuecomment-494965124
 
 
   retest this please




[GitHub] [zookeeper] enixon commented on issue #948: ZOOKEEPER-3394: Delay observer reconnect when all learner masters hav…

2019-05-22 Thread GitBox
enixon commented on issue #948: ZOOKEEPER-3394: Delay observer reconnect when 
all learner masters hav…
URL: https://github.com/apache/zookeeper/pull/948#issuecomment-494960869
 
 
   @maoling @eolivelli this is strange to me. 
   
   The commit message on the commit that created the pull request was a single 
line "ZOOKEEPER-3394: Delay observer reconnect when all learner masters have 
been tried" that matched the convention of "Jira number: Jira title". Other 
pull requests seem to merge just fine when the title was long (see 937 for 
ZOOKEEPER-3383). I'd love to know what caused the corruption in this case so we 
can warn others.




Re: [DISCUSS] Should we move contrib and recipes into separate repos?

2019-05-22 Thread Ted Dunning
I think that the key question is whether independent releases make sense.

If you have two things that must always be released in synchrony, then
having them in the same repo is a good idea.

If you have two things that are released completely independently, then
separate repos make sense.


On Wed, May 22, 2019 at 10:22 AM Enrico Olivelli 
wrote:

> On Wed, May 22, 2019 at 09:59, Tamas Penzes wrote:
>
> > Hi All,
> >
> > There is a trend among Apache projects to minimise the content of the main
> > project and move non-crucial parts into separate repos. I mostly prefer the
> > mono-repo style of development, but in cases where the connection between
> > the main project and a subproject is weak, I find this idea supportable.
> >
> > We have discussed earlier that it might be a good idea to move
> > zookeeper-contrib and zookeeper-recipes into separate repositories still
> > maintained by the ZooKeeper team, but I would be curious about your
> > opinion.
> >
> > Do you find this idea useful?
> > What do you think would be the pros and cons of such a separation?
> >
>
> I guess that if we separate those modules they will soon be out of sync
> with the latest master.
> We won't ever 'release' them and/or try to compile against the latest
> ZooKeeper version.
>
> I feel 'recipes' should stay in the repo as they are like a reference live
> guide about using zookeeper.
>
> Are 'contrib' modules in use?
>
> If the answer is 'no' we should drop them.
> Current versions will stay on git forever but we won't maintain them
> anymore
>
>
> Enrico
>
>
>
> > Thanks, Tamaas
> >
>


[GitHub] [zookeeper] eolivelli commented on issue #918: ZOOKEEPER-3366: Pluggable metrics system for ZooKeeper - move remaining metrics to MetricsProvider

2019-05-22 Thread GitBox
eolivelli commented on issue #918: ZOOKEEPER-3366: Pluggable metrics system for 
ZooKeeper - move remaining metrics to MetricsProvider
URL: https://github.com/apache/zookeeper/pull/918#issuecomment-494956812
 
 
   Gently pinging @lvfangmin
   




[GitHub] [zookeeper] eolivelli commented on issue #953: ZOOKEEPER-3398 Learner.connectToLeader() may take too long to time-out

2019-05-22 Thread GitBox
eolivelli commented on issue #953: ZOOKEEPER-3398 Learner.connectToLeader() may 
take too long to time-out 
URL: https://github.com/apache/zookeeper/pull/953#issuecomment-494956125
 
 
   @vladimirvic
   Your idea is good.
   But if we introduce a socketTimeout option, we need to introduce one 
option per use case, as you are doing here.
   If we have a single socketTimeout, we will run into the same problem you are 
fixing here.




[GitHub] [zookeeper] eolivelli commented on issue #948: ZOOKEEPER-3394: Delay observer reconnect when all learner masters hav…

2019-05-22 Thread GitBox
eolivelli commented on issue #948: ZOOKEEPER-3394: Delay observer reconnect 
when all learner masters hav…
URL: https://github.com/apache/zookeeper/pull/948#issuecomment-494954601
 
 
   Good catch @maoling.
   I also noticed it today.
   But we can't rewrite git history.
   I should have written an email to the dev@ list.
   




[GitHub] [zookeeper] lvfangmin commented on issue #933: ZOOKEEPER-3379: De-flaky test in Quorum Packet Metrics

2019-05-22 Thread GitBox
lvfangmin commented on issue #933: ZOOKEEPER-3379: De-flaky test in Quorum 
Packet Metrics
URL: https://github.com/apache/zookeeper/pull/933#issuecomment-494895082
 
 
   retest this please




[GitHub] [zookeeper] jhuan31 commented on a change in pull request #933: ZOOKEEPER-3379: De-flaky test in Quorum Packet Metrics

2019-05-22 Thread GitBox
jhuan31 commented on a change in pull request #933: ZOOKEEPER-3379: De-flaky 
test in Quorum Packet Metrics
URL: https://github.com/apache/zookeeper/pull/933#discussion_r286594305
 
 

 ##
 File path: 
zookeeper-server/src/test/java/org/apache/zookeeper/server/quorum/LearnerHandlerMetricsTest.java
 ##
 @@ -60,36 +61,40 @@ public void setup() throws IOException {
 
         //adding 5ms artificial delay when sending each packet
         BinaryOutputArchive oa = mock(BinaryOutputArchive.class);
-        doAnswer(new Answer() {
-            @Override
-            public Object answer(InvocationOnMock invocationOnMock) throws Throwable {
-                Thread.sleep(5);
-                return  null;
-            }
-        }).when(oa).writeRecord(any(QuorumPacket.class), Matchers.anyString());
+        doAnswer(invocationOnMock -> {Thread.sleep(5); return null;})
+                .when(oa).writeRecord(any(QuorumPacket.class), Matchers.anyString());
+
+        BufferedOutputStream bos = mock(BufferedOutputStream.class);
+        // flush is called when all packets are sent and the queue is empty
+        doAnswer(invocationOnMock -> {allSentLatch.countDown(); return null;}).when(bos).flush();
 
 Review comment:
   I added a check for null to avoid an NPE. I try to separate setup from testing 
logic. Initializing the latch is, I think, part of the testing logic because it 
might not always be appropriate to set the latch at the beginning of the test. 
Where the latch is set, and the count it is set to, depend on 
what is being tested.
   There is only one test method in this test file so far, so I could move all the 
setup into the test. But again, I want to separate setup from the testing logic 
itself. If new test methods are added later and they don't share the 
common setup, then we may need to refactor the setup code, but if they don't 
share the common setup, they probably shouldn't be in the same test file.
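
The latch-based wait discussed in this review can be illustrated without Mockito. The following is a minimal sketch, assuming a background sender thread that signals completion once. The class and method names are hypothetical and the empty loop only stands in for sending packets.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration: wait on a latch instead of sleeping for a fixed time.
class LatchWaitSketch {

    /** Starts a background sender and waits, with a bound, until it signals completion. */
    static boolean sendAndWait(int packets, long timeoutSeconds) {
        // Created by the caller of the "test", right before the work starts.
        CountDownLatch allSent = new CountDownLatch(1);
        Thread sender = new Thread(() -> {
            for (int i = 0; i < packets; i++) {
                // stand-in for writing one packet
            }
            allSent.countDown(); // plays the role of the mocked flush() in the diff
        });
        sender.start();
        try {
            // Returns true as soon as the sender is done, or false once the timeout expires.
            return allSent.await(timeoutSeconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Compared with a fixed `Thread.sleep`, the bounded `await` returns as soon as the work is finished and only consumes the full timeout when something is actually wrong.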




[GitHub] [zookeeper] jhuan31 opened a new pull request #933: ZOOKEEPER-3379: De-flaky test in Quorum Packet Metrics

2019-05-22 Thread GitBox
jhuan31 opened a new pull request #933: ZOOKEEPER-3379: De-flaky test in Quorum 
Packet Metrics
URL: https://github.com/apache/zookeeper/pull/933
 
 
   To address a potential flaky test in PR #849 (LearnerHandlerMetricsTest)  




[GitHub] [zookeeper] jhuan31 closed pull request #933: ZOOKEEPER-3379: De-flaky test in Quorum Packet Metrics

2019-05-22 Thread GitBox
jhuan31 closed pull request #933: ZOOKEEPER-3379: De-flaky test in Quorum 
Packet Metrics
URL: https://github.com/apache/zookeeper/pull/933
 
 
   




[GitHub] [zookeeper] jhuan31 commented on a change in pull request #933: ZOOKEEPER-3379: De-flaky test in Quorum Packet Metrics

2019-05-22 Thread GitBox
jhuan31 commented on a change in pull request #933: ZOOKEEPER-3379: De-flaky 
test in Quorum Packet Metrics
URL: https://github.com/apache/zookeeper/pull/933#discussion_r286590804
 
 

 ##
 File path: 
zookeeper-server/src/test/java/org/apache/zookeeper/server/quorum/LearnerHandlerMetricsTest.java
 ##
 @@ -43,6 +43,7 @@
 public class LearnerHandlerMetricsTest {
 private MockLearnerHandler learnerHandler;
 private long sid = 5;
+private CountDownLatch allSentLatch;
 
 Review comment:
   yes.  made it volatile




[GitHub] [zookeeper] jhuan31 commented on a change in pull request #933: ZOOKEEPER-3379: De-flaky test in Quorum Packet Metrics

2019-05-22 Thread GitBox
jhuan31 commented on a change in pull request #933: ZOOKEEPER-3379: De-flaky 
test in Quorum Packet Metrics
URL: https://github.com/apache/zookeeper/pull/933#discussion_r286590875
 
 

 ##
 File path: 
zookeeper-server/src/test/java/org/apache/zookeeper/server/quorum/LearnerHandlerMetricsTest.java
 ##
 @@ -60,36 +61,40 @@ public void setup() throws IOException {
 
         //adding 5ms artificial delay when sending each packet
         BinaryOutputArchive oa = mock(BinaryOutputArchive.class);
-        doAnswer(new Answer() {
-            @Override
-            public Object answer(InvocationOnMock invocationOnMock) throws Throwable {
-                Thread.sleep(5);
-                return  null;
-            }
-        }).when(oa).writeRecord(any(QuorumPacket.class), Matchers.anyString());
+        doAnswer(invocationOnMock -> {Thread.sleep(5); return null;})
+                .when(oa).writeRecord(any(QuorumPacket.class), Matchers.anyString());
+
+        BufferedOutputStream bos = mock(BufferedOutputStream.class);
+        // flush is called when all packets are sent and the queue is empty
+        doAnswer(invocationOnMock -> {allSentLatch.countDown(); return null;}).when(bos).flush();
 
         learnerHandler = new MockLearnerHandler(socket, leader);
         learnerHandler.setOutputArchive(oa);
-        learnerHandler.setBufferedOutput(mock(BufferedOutputStream.class));
+        learnerHandler.setBufferedOutput(bos);
         learnerHandler.sid = sid;
     }
 
     @Test
-    public void testMetrics() {
+    public void testMetrics() throws InterruptedException {
         ServerMetrics.getMetrics().resetAll();
 
         //adding 1001 packets in the queue, two marker packets will be added since the interval is every 1000 packets
         for (int i=0; i<1001; i++) {
             learnerHandler.queuePacket(new QuorumPacket());
         }
+
+        allSentLatch = new CountDownLatch(1);
+
         learnerHandler.startSendingPackets();
 
+        allSentLatch.await(8, TimeUnit.SECONDS);
+
         //make sure we have enough time to send all the packets in the queue
-        try {
+        /*try {
 
 Review comment:
   removed. Thanks!




ZooKeeper_branch34_openjdk8 - Build # 333 - Still Failing

2019-05-22 Thread Apache Jenkins Server
See https://builds.apache.org/job/ZooKeeper_branch34_openjdk8/333/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 44.30 KB...]
[junit] Running org.apache.zookeeper.test.SaslAuthFailNotifyTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.594 sec
[junit] Running org.apache.zookeeper.test.SaslAuthFailTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.723 sec
[junit] Running org.apache.zookeeper.test.SaslAuthMissingClientConfigTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.617 sec
[junit] Running org.apache.zookeeper.test.SaslClientTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.081 sec
[junit] Running org.apache.zookeeper.test.SessionInvalidationTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.78 sec
[junit] Running org.apache.zookeeper.test.SessionTest
[junit] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 11.92 sec
[junit] Running org.apache.zookeeper.test.SessionTimeoutTest
[junit] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.896 sec
[junit] Running org.apache.zookeeper.test.StandaloneTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.953 sec
[junit] Running org.apache.zookeeper.test.StatTest
[junit] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.919 sec
[junit] Running org.apache.zookeeper.test.StaticHostProviderTest
[junit] Tests run: 13, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.74 sec
[junit] Running org.apache.zookeeper.test.SyncCallTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.717 sec
[junit] Running org.apache.zookeeper.test.TruncateTest
[junit] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 8.646 sec
[junit] Running org.apache.zookeeper.test.UpgradeTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.981 sec
[junit] Running org.apache.zookeeper.test.WatchedEventTest
[junit] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.104 sec
[junit] Running org.apache.zookeeper.test.WatcherFuncTest
[junit] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.319 sec
[junit] Running org.apache.zookeeper.test.WatcherTest
[junit] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 29.987 sec
[junit] Running org.apache.zookeeper.test.ZkDatabaseCorruptionTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 10.584 sec
[junit] Running org.apache.zookeeper.test.ZooKeeperQuotaTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.8 sec
[junit] Running org.apache.jute.BinaryInputArchiveTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.086 sec

fail.build.on.test.failure:

BUILD FAILED
/home/jenkins/jenkins-slave/workspace/ZooKeeper_branch34_openjdk8/build.xml:1425: The following error occurred while executing this line:
/home/jenkins/jenkins-slave/workspace/ZooKeeper_branch34_openjdk8/build.xml:1428: Tests failed!

Total time: 40 minutes 50 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Setting OPENJDK_8_ON_UBUNTU_ONLY__HOME=/usr/lib/jvm/java-8-openjdk-amd64/
Recording test results
Setting OPENJDK_8_ON_UBUNTU_ONLY__HOME=/usr/lib/jvm/java-8-openjdk-amd64/
Setting OPENJDK_8_ON_UBUNTU_ONLY__HOME=/usr/lib/jvm/java-8-openjdk-amd64/
Setting OPENJDK_8_ON_UBUNTU_ONLY__HOME=/usr/lib/jvm/java-8-openjdk-amd64/
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any
Setting OPENJDK_8_ON_UBUNTU_ONLY__HOME=/usr/lib/jvm/java-8-openjdk-amd64/
Setting OPENJDK_8_ON_UBUNTU_ONLY__HOME=/usr/lib/jvm/java-8-openjdk-amd64/
Setting OPENJDK_8_ON_UBUNTU_ONLY__HOME=/usr/lib/jvm/java-8-openjdk-amd64/
Setting OPENJDK_8_ON_UBUNTU_ONLY__HOME=/usr/lib/jvm/java-8-openjdk-amd64/



###
## FAILED TESTS (if any) 
##
1 tests failed.
FAILED:  
org.apache.zookeeper.server.quorum.EphemeralNodeDeletionTest.testEphemeralNodeDeletion

Error Message:
After session close ephemeral node must be deleted expected null, but 
was:<4294967302,4294967302,1558546917368,1558546917368,0,0,0,145110229481291776,1,0,4294967302
>

Stack Trace:
junit.framework.AssertionFailedError: After session close ephemeral node must 
be deleted expected null, but 
was:<4294967302,4294967302,1558546917368,1558546917368,0,0,0,145110229481291776,1,0,4294967302
>
at 

Re: [DISCUSS] Should we move contrib and recipes into separate repos?

2019-05-22 Thread Enrico Olivelli
On Wed, May 22, 2019, 09:59 Tamas Penzes wrote:

> Hi All,
>
> There is a trend among Apache projects to minimise the content of the main
> project and move non-crucial parts into separate repos. I mostly prefer the
> mono-repo style of development, but in cases where the connection between
> the main project and a subproject is weak I find this idea supportable.
>
> We discussed earlier that it might be a good idea to move
> zookeeper-contrib and zookeeper-recipes into separate repositories still
> maintained by the ZooKeeper team, but I would be curious about your
> opinion.
>
> Do you find this idea useful?
> What do you think the pros and cons of such a separation would be?
>

I guess that if we separate those modules they will soon be out of sync
with the latest master.
We won't ever 'release' them and/or try to compile them against the latest
ZooKeeper version.

I feel 'recipes' should stay in the repo, as they are like a living reference
guide to using ZooKeeper.

Are 'contrib' modules in use?

If the answer is 'no' we should drop them.
Current versions will stay in git forever, but we won't maintain them anymore.

Enrico



> Thanks, Tamaas
>


[jira] [Commented] (ZOOKEEPER-3311) Allow a delay to the transaction log flush

2019-05-22 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16846072#comment-16846072
 ] 

Hudson commented on ZOOKEEPER-3311:
---

FAILURE: Integrated in Jenkins build Zookeeper-trunk-single-thread #368 (See 
[https://builds.apache.org/job/Zookeeper-trunk-single-thread/368/])
ZOOKEEPER-3311: Allow a delay to the transaction log flush (eolivelli: rev 
cc431f70020b9a2028edcc61e41cff9ee85b078f)
* (edit) 
zookeeper-server/src/main/java/org/apache/zookeeper/server/ZooKeeperServer.java
* (edit) 
zookeeper-server/src/main/java/org/apache/zookeeper/server/SyncRequestProcessor.java
* (edit) 
zookeeper-server/src/main/java/org/apache/zookeeper/server/ZooKeeperServerBean.java
* (edit) zookeeper-docs/src/main/resources/markdown/zookeeperAdmin.md
* (edit) 
zookeeper-server/src/main/java/org/apache/zookeeper/server/ZooKeeperServerMXBean.java


> Allow a delay to the transaction log flush 
> ---
>
> Key: ZOOKEEPER-3311
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3311
> Project: ZooKeeper
>  Issue Type: New Feature
>  Components: server
>Affects Versions: 3.6.0
>Reporter: Brian Nixon
>Assignee: Brian Nixon
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.6.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> The SyncRequestProcessor flushes writes to disk either when 1000 writes are 
> pending to be flushed or when the processor fails to retrieve another write 
> from its incoming queue. The "flush when queue empty" condition operates 
> poorly under many workloads as it can quickly degrade into flushing after 
> every write -- losing all benefits of batching and leading to a continuous 
> stream of flushes + fsyncs which overwhelm the underlying disk.
>  
> A configurable flush delay would ensure flushes do not happen more frequently 
> than once every X milliseconds. This can be used in-place of or jointly with 
> batch size triggered flushes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
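
The flush-delay behaviour described in the ZOOKEEPER-3311 issue text above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the actual patch: the class FlushPolicy, its fields and its method names are hypothetical, and the choice that either trigger alone (batch size reached, or delay elapsed) is enough to flush is an assumption of this sketch.

```java
/**
 * Hypothetical sketch of a flush policy that combines a batch-size trigger
 * with a minimum delay between flushes. All names here are illustrative and
 * are not taken from SyncRequestProcessor.
 */
public final class FlushPolicy {
    private final int maxBatchSize;       // e.g. 1000 pending writes; 0 disables this trigger
    private final long flushDelayMillis;  // the "X milliseconds" delay; 0 disables this trigger
    private long lastFlushMillis;         // time of the most recent flush

    public FlushPolicy(int maxBatchSize, long flushDelayMillis, long startMillis) {
        this.maxBatchSize = maxBatchSize;
        this.flushDelayMillis = flushDelayMillis;
        this.lastFlushMillis = startMillis;
    }

    /** Returns true when the pending writes should be flushed to disk now. */
    public boolean shouldFlush(int pendingWrites, long nowMillis) {
        if (pendingWrites <= 0) {
            return false; // nothing to flush
        }
        boolean batchFull = maxBatchSize > 0 && pendingWrites >= maxBatchSize;
        boolean delayElapsed = flushDelayMillis > 0
                && nowMillis - lastFlushMillis >= flushDelayMillis;
        return batchFull || delayElapsed;
    }

    /** Must be called after each flush so the delay is measured from it. */
    public void recordFlush(long nowMillis) {
        this.lastFlushMillis = nowMillis;
    }
}
```

With a policy like this, a lightly loaded server that receives one write at a time would batch those writes until the delay elapses instead of issuing an fsync per write, while a heavily loaded server would still flush as soon as the batch fills.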


Re: [DISCUSS] Is it time to remove Apache Ant from ZooKeeper?

2019-05-22 Thread Enrico Olivelli
On Wed, May 22, 2019, 12:17 Norbert Kalmar wrote:

> And by "release with Ant" I mean the 3.5.5 src tarball containing the Ant
> script as well, so people have the option to build it with Ant.
>

IMHO this is not a big deal. The release has been performed with Maven and
the resulting artifacts are different from the Ant results.
But I don't have a strong opinion here.

Enrico

>
>
> On Wed, May 22, 2019 at 12:13 PM Norbert Kalmar 
> wrote:
>
> > Sorry, I was too quick to reply, didn't think it through.
> > We did release the first stable 3.5 ZooKeeper with Ant, so I guess we will
> > need to support it on 3.5.x.
> >
> > On Wed, May 22, 2019 at 12:12 PM Norbert Kalmar 
> > wrote:
> >
> >> +1, let's remove it. I would say let's remove Ant from the 3.5 branch as well.
> >> Having two build systems is just a huge source of confusion, especially since
> >> we have dependency versions in two different locations.
> >>
> >> Regards,
> >> Norbert
> >>
> >> On Wed, May 22, 2019 at 9:44 AM Tamas Penzes wrote:
> >>
> >>> Hi All,
> >>>
> >>> Not that long ago we had a discussion about removing Ant from ZooKeeper.
> >>> I'd like to restart that discussion since ZooKeeper 3.5.5 is GA and it was
> >>> built and released with Maven.
> >>> Is it time to remove Ant from the master branch?
> >>>
> >>> That would mean that Ant would not be available from the next minor
> >>> version, which is probably 3.6.0.
> >>>
> >>> Please share your opinion.
> >>>
> >>> Thanks, Tamaas
> >>>
> >>
>


Re: [DISCUSS] Is it time to remove Apache Ant from ZooKeeper?

2019-05-22 Thread Enrico Olivelli
We are not ready.
I am migrating the precommit script to Maven.
The Ant-based one performs a lot of checks, and
I am converting all of them to Maven.

I think we won't drop Ant from branch-3.4, only from 3.5 and master.

As soon as the script is migrated there is no need to keep Ant at all.

Indeed, I am going to send patches that add new Maven modules for metrics
providers, and I don't want to add new Ant-based stuff.

I would like to see new integration tests, and for this work we need a more
modular codebase, so Maven is our way forward.

I am doing as much as possible to drop Ant soon.


Enrico

On Wed, May 22, 2019, 12:17 Norbert Kalmar wrote:

> And by "release with Ant" I mean the 3.5.5 src tarball containing the Ant
> script as well, so people have the option to build it with Ant.
>
>
> On Wed, May 22, 2019 at 12:13 PM Norbert Kalmar 
> wrote:
>
> > Sorry, I was too quick to reply, didn't think it through.
> > We did release the first stable 3.5 ZooKeeper with Ant, so I guess we will
> > need to support it on 3.5.x.
> >
> > On Wed, May 22, 2019 at 12:12 PM Norbert Kalmar 
> > wrote:
> >
> >> +1, let's remove it. I would say let's remove Ant from the 3.5 branch as well.
> >> Having two build systems is just a huge source of confusion, especially since
> >> we have dependency versions in two different locations.
> >>
> >> Regards,
> >> Norbert
> >>
> >> On Wed, May 22, 2019 at 9:44 AM Tamas Penzes wrote:
> >>
> >>> Hi All,
> >>>
> >>> Not that long ago we had a discussion about removing Ant from ZooKeeper.
> >>> I'd like to restart that discussion since ZooKeeper 3.5.5 is GA and it was
> >>> built and released with Maven.
> >>> Is it time to remove Ant from the master branch?
> >>>
> >>> That would mean that Ant would not be available from the next minor
> >>> version, which is probably 3.6.0.
> >>>
> >>> Please share your opinion.
> >>>
> >>> Thanks, Tamaas
> >>>
> >>
>


[jira] [Commented] (ZOOKEEPER-3311) Allow a delay to the transaction log flush

2019-05-22 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16845987#comment-16845987
 ] 

Hudson commented on ZOOKEEPER-3311:
---

FAILURE: Integrated in Jenkins build ZooKeeper-trunk #533 (See 
[https://builds.apache.org/job/ZooKeeper-trunk/533/])
ZOOKEEPER-3311: Allow a delay to the transaction log flush (eolivelli: rev 
cc431f70020b9a2028edcc61e41cff9ee85b078f)
* (edit) 
zookeeper-server/src/main/java/org/apache/zookeeper/server/ZooKeeperServerMXBean.java
* (edit) 
zookeeper-server/src/main/java/org/apache/zookeeper/server/ZooKeeperServer.java
* (edit) 
zookeeper-server/src/main/java/org/apache/zookeeper/server/ZooKeeperServerBean.java
* (edit) 
zookeeper-server/src/main/java/org/apache/zookeeper/server/SyncRequestProcessor.java
* (edit) zookeeper-docs/src/main/resources/markdown/zookeeperAdmin.md


> Allow a delay to the transaction log flush 
> ---
>
> Key: ZOOKEEPER-3311
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3311
> Project: ZooKeeper
>  Issue Type: New Feature
>  Components: server
>Affects Versions: 3.6.0
>Reporter: Brian Nixon
>Assignee: Brian Nixon
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.6.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> The SyncRequestProcessor flushes writes to disk either when 1000 writes are 
> pending to be flushed or when the processor fails to retrieve another write 
> from its incoming queue. The "flush when queue empty" condition operates 
> poorly under many workloads as it can quickly degrade into flushing after 
> every write -- losing all benefits of batching and leading to a continuous 
> stream of flushes + fsyncs which overwhelm the underlying disk.
>  
> A configurable flush delay would ensure flushes do not happen more frequently 
> than once every X milliseconds. This can be used in-place of or jointly with 
> batch size triggered flushes.





ZooKeeper-trunk - Build # 533 - Failure

2019-05-22 Thread Apache Jenkins Server
See https://builds.apache.org/job/ZooKeeper-trunk/533/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 190.04 KB...]
[junit] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
2.096 sec, Thread: 2, Class: org.apache.zookeeper.test.SessionTimeoutTest
[junit] Running org.apache.zookeeper.test.SessionTrackerCheckTest in thread 
2
[junit] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
0.086 sec, Thread: 2, Class: org.apache.zookeeper.test.SessionTrackerCheckTest
[junit] Running org.apache.zookeeper.test.SessionUpgradeTest in thread 2
[junit] Tests run: 109, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
444.186 sec, Thread: 3, Class: org.apache.zookeeper.test.NioNettySuiteTest
[junit] Running org.apache.zookeeper.test.StandaloneTest in thread 3
[junit] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
2.689 sec, Thread: 3, Class: org.apache.zookeeper.test.StandaloneTest
[junit] Running org.apache.zookeeper.test.StatTest in thread 3
[junit] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
1.609 sec, Thread: 3, Class: org.apache.zookeeper.test.StatTest
[junit] Running org.apache.zookeeper.test.StaticHostProviderTest in thread 3
[junit] Tests run: 26, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
2.493 sec, Thread: 3, Class: org.apache.zookeeper.test.StaticHostProviderTest
[junit] Running org.apache.zookeeper.test.StringUtilTest in thread 3
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
0.07 sec, Thread: 3, Class: org.apache.zookeeper.test.StringUtilTest
[junit] Running org.apache.zookeeper.test.SyncCallTest in thread 3
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
1.142 sec, Thread: 3, Class: org.apache.zookeeper.test.SyncCallTest
[junit] Running org.apache.zookeeper.test.TruncateTest in thread 3
[junit] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
84.136 sec, Thread: 4, Class: org.apache.zookeeper.test.RestoreCommittedLogTest
[junit] Running org.apache.zookeeper.test.WatchEventWhenAutoResetTest in 
thread 4
[junit] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
22.531 sec, Thread: 2, Class: org.apache.zookeeper.test.SessionUpgradeTest
[junit] Running org.apache.zookeeper.test.WatchedEventTest in thread 2
[junit] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
0.092 sec, Thread: 2, Class: org.apache.zookeeper.test.WatchedEventTest
[junit] Running org.apache.zookeeper.test.WatcherFuncTest in thread 2
[junit] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
9.853 sec, Thread: 3, Class: org.apache.zookeeper.test.TruncateTest
[junit] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
3.048 sec, Thread: 2, Class: org.apache.zookeeper.test.WatcherFuncTest
[junit] Running org.apache.zookeeper.test.WatcherTest in thread 3
[junit] Running org.apache.zookeeper.test.X509AuthTest in thread 2
[junit] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
0.104 sec, Thread: 2, Class: org.apache.zookeeper.test.X509AuthTest
[junit] Running org.apache.zookeeper.test.ZkDatabaseCorruptionTest in 
thread 2
[junit] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
9.141 sec, Thread: 2, Class: org.apache.zookeeper.test.ZkDatabaseCorruptionTest
[junit] Running org.apache.zookeeper.test.ZooKeeperQuotaTest in thread 2
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
0.934 sec, Thread: 2, Class: org.apache.zookeeper.test.ZooKeeperQuotaTest
[junit] Running org.apache.zookeeper.util.PemReaderTest in thread 2
[junit] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
22.277 sec, Thread: 4, Class: 
org.apache.zookeeper.test.WatchEventWhenAutoResetTest
[junit] Running org.apache.jute.BinaryInputArchiveTest in thread 4
[junit] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
0.187 sec, Thread: 4, Class: org.apache.jute.BinaryInputArchiveTest
[junit] Running org.apache.jute.UtilsTest in thread 4
[junit] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
0.128 sec, Thread: 4, Class: org.apache.jute.UtilsTest
[junit] Running org.apache.jute.XmlInputArchiveTest in thread 4
[junit] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
0.21 sec, Thread: 4, Class: org.apache.jute.XmlInputArchiveTest
[junit] Tests run: 64, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
4.732 sec, Thread: 2, Class: org.apache.zookeeper.util.PemReaderTest
[junit] Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
31.419 sec, Thread: 3, Class: org.apache.zookeeper.test.WatcherTest
[junit] Tests run: 13, Failures: 0, Errors: 0, Skipped: 0, Time 

Re: [DISCUSS] Should we move contrib and recipes into separate repos?

2019-05-22 Thread Patrick Hunt
The downside is that you need to separately maintain, support, build infra
around, ... and release those repos. They get out of sync with the main
repo/releases, etc...

Another option is to move them to github or retire them entirely.

Patrick

On Wed, May 22, 2019 at 12:59 AM Tamas Penzes 
wrote:

> Hi All,
>
> There is a trend among Apache projects to minimise the content of the main
> project and move non-crucial parts into separate repos. I mostly prefer the
> mono-repo style of development, but in cases where the connection between
> the main project and a subproject is weak I find this idea supportable.
>
> We discussed earlier that it might be a good idea to move
> zookeeper-contrib and zookeeper-recipes into separate repositories still
> maintained by the ZooKeeper team, but I would be curious about your
> opinion.
>
> Do you find this idea useful?
> What do you think the pros and cons of such a separation would be?
>
> Thanks, Tamaas
>


[GitHub] [zookeeper] maoling commented on issue #948: ZOOKEEPER-3394: Delay observer reconnect when all learner masters hav…

2019-05-22 Thread GitBox
maoling commented on issue #948: ZOOKEEPER-3394: Delay observer reconnect when 
all learner masters hav…
URL: https://github.com/apache/zookeeper/pull/948#issuecomment-494800171
 
 
   @anmolnar 
   Sorry for being unclear.
   What I meant is the [corrupt commit 
message](https://github.com/apache/zookeeper/commit/39a316a3bab747d879ef974308ad08377983),
 which has no JIRA ID or description.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


Re: [DISCUSS] Is it time to remove Apache Ant from ZooKeeper?

2019-05-22 Thread Norbert Kalmar
Sorry, I was too quick to reply, didn't think it through.
We did release the first stable 3.5 ZooKeeper with Ant, so I guess we will
need to support it on 3.5.x.

On Wed, May 22, 2019 at 12:12 PM Norbert Kalmar 
wrote:

> +1, let's remove it. I would say let's remove Ant from the 3.5 branch as well.
> Having two build systems is just a huge source of confusion, especially since
> we have dependency versions in two different locations.
>
> Regards,
> Norbert
>
> On Wed, May 22, 2019 at 9:44 AM Tamas Penzes 
> wrote:
>
>> Hi All,
>>
>> Not that long ago we had a discussion about removing Ant from ZooKeeper.
>> I'd like to restart that discussion since ZooKeeper 3.5.5 is GA and it was
>> built and released with Maven.
>> Is it time to remove Ant from the master branch?
>>
>> That would mean that Ant would not be available from the next minor
>> version, which is probably 3.6.0.
>>
>> Please share your opinion.
>>
>> Thanks, Tamaas
>>
>


Re: [DISCUSS] Is it time to remove Apache Ant from ZooKeeper?

2019-05-22 Thread Norbert Kalmar
And by "release with Ant" I mean the 3.5.5 src tarball containing the Ant
script as well, so people have the option to build it with Ant.


On Wed, May 22, 2019 at 12:13 PM Norbert Kalmar 
wrote:

> Sorry, I was too quick to reply, didn't think it through.
> We did release the first stable 3.5 ZooKeeper with Ant, so I guess we will
> need to support it on 3.5.x.
>
> On Wed, May 22, 2019 at 12:12 PM Norbert Kalmar 
> wrote:
>
>> +1, let's remove it. I would say let's remove Ant from the 3.5 branch as well.
>> Having two build systems is just a huge source of confusion, especially since
>> we have dependency versions in two different locations.
>>
>> Regards,
>> Norbert
>>
>> On Wed, May 22, 2019 at 9:44 AM Tamas Penzes 
>> wrote:
>>
>>> Hi All,
>>>
>>> Not that long ago we had a discussion about removing Ant from ZooKeeper.
>>> I'd like to restart that discussion since ZooKeeper 3.5.5 is GA and it was
>>> built and released with Maven.
>>> Is it time to remove Ant from the master branch?
>>>
>>> That would mean that Ant would not be available from the next minor
>>> version, which is probably 3.6.0.
>>>
>>> Please share your opinion.
>>>
>>> Thanks, Tamaas
>>>
>>


Re: [DISCUSS] Is it time to remove Apache Ant from ZooKeeper?

2019-05-22 Thread Norbert Kalmar
+1, let's remove it. I would say let's remove Ant from the 3.5 branch as well.
Having two build systems is just a huge source of confusion, especially since
we have dependency versions in two different locations.

Regards,
Norbert

On Wed, May 22, 2019 at 9:44 AM Tamas Penzes 
wrote:

> Hi All,
>
> Not that long ago we had a discussion about removing Ant from ZooKeeper.
> I'd like to restart that discussion since ZooKeeper 3.5.5 is GA and it was
> built and released with Maven.
> Is it time to remove Ant from the master branch?
>
> That would mean that Ant would not be available from the next minor
> version, which is probably 3.6.0.
>
> Please share your opinion.
>
> Thanks, Tamaas
>


Re: [3.4.12] Missing OVERSEER doc. solr-user@lucene....

2019-05-22 Thread Andor Molnar
Hi Will,

What is your question to the ZooKeeper community?

Andor



On Wed, May 22, 2019 at 5:53 AM Will Martin  wrote:

> Cross-posting this for a sound reporter. He is a top technical
> resource on the list. Not given to hyperbole in bug reports.
>
>
>
> Is there an ACL'd JIRA for ZooKeeper?
>
> 
>
> to solr-user
>
>
> We have a 6.6.2 cluster in prod that appears to have no overseer. In
> /overseer_elect on ZK, there is an election folder, but no leader document.
> An OVERSEERSTATUS request fails with a timeout.
>
> I’m going to try ADDROLE, but I’d be delighted to hear any other ideas.
> We’ve diverted all the traffic to the backing cluster, so we can blow this
> one away and rebuild.
>
> Looking at the Zookeeper logs, I see a few instances of network failures
> across all three nodes.
>
>
>
>
>
> I **have the logs** from each of the Zookeepers.
>
> We are running 3.4.12.
>
>
>
> 
>


[jira] [Resolved] (ZOOKEEPER-3311) Allow a delay to the transaction log flush

2019-05-22 Thread Enrico Olivelli (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enrico Olivelli resolved ZOOKEEPER-3311.

   Resolution: Fixed
Fix Version/s: 3.6.0

Issue resolved by pull request 851
[https://github.com/apache/zookeeper/pull/851]

> Allow a delay to the transaction log flush 
> ---
>
> Key: ZOOKEEPER-3311
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3311
> Project: ZooKeeper
>  Issue Type: New Feature
>  Components: server
>Affects Versions: 3.6.0
>Reporter: Brian Nixon
>Assignee: Brian Nixon
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.6.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> The SyncRequestProcessor flushes writes to disk either when 1000 writes are 
> pending to be flushed or when the processor fails to retrieve another write 
> from its incoming queue. The "flush when queue empty" condition operates 
> poorly under many workloads as it can quickly degrade into flushing after 
> every write -- losing all benefits of batching and leading to a continuous 
> stream of flushes + fsyncs which overwhelm the underlying disk.
>  
> A configurable flush delay would ensure flushes do not happen more frequently 
> than once every X milliseconds. This can be used in-place of or jointly with 
> batch size triggered flushes.





[GitHub] [zookeeper] asfgit closed pull request #851: ZOOKEEPER-3311: Allow a delay to the transaction log flush

2019-05-22 Thread GitBox
asfgit closed pull request #851: ZOOKEEPER-3311: Allow a delay to the 
transaction log flush
URL: https://github.com/apache/zookeeper/pull/851
 
 
   




[GitHub] [zookeeper] anmolnar commented on issue #948: ZOOKEEPER-3394: Delay observer reconnect when all learner masters hav…

2019-05-22 Thread GitBox
anmolnar commented on issue #948: ZOOKEEPER-3394: Delay observer reconnect when 
all learner masters hav…
URL: https://github.com/apache/zookeeper/pull/948#issuecomment-494725584
 
 
   @maoling As long as the Jira clearly explains the motivation behind the 
patch, I think we're good to go, but apart from that, you're absolutely right. 
@enixon Next time, please document your patch properly in the PR description 
too.




Build failed in Jenkins: ZooKeeper-trunk-owasp #371

2019-05-22 Thread Apache Jenkins Server
See 


Changes:

[fangmin] ZOOKEEPER-3323: Add TxnSnapLog metrics

[fangmin] 948

--
[...truncated 30.64 KB...]
[ivy:retrieve]  [SUCCESSFUL ] 
org.apache.lucene#lucene-queryparser;7.6.0!lucene-queryparser.jar (1114ms)
[ivy:retrieve] downloading 
https://repo1.maven.org/maven2/org/apache/velocity/velocity/1.7/velocity-1.7.jar
 ...
[ivy:retrieve] ... (438kB)
[ivy:retrieve] .. (0kB)
[ivy:retrieve]  [SUCCESSFUL ] org.apache.velocity#velocity;1.7!velocity.jar 
(108ms)
[ivy:retrieve] downloading 
https://repo1.maven.org/maven2/org/glassfish/javax.json/1.0.4/javax.json-1.0.4.jar
 ...
[ivy:retrieve] . (83kB)
[ivy:retrieve] .. (0kB)
[ivy:retrieve]  [SUCCESSFUL ] 
org.glassfish#javax.json;1.0.4!javax.json.jar(bundle) (426ms)
[ivy:retrieve] downloading 
https://repo1.maven.org/maven2/org/jsoup/jsoup/1.11.3/jsoup-1.11.3.jar ...
[ivy:retrieve]  (386kB)
[ivy:retrieve] .. (0kB)
[ivy:retrieve]  [SUCCESSFUL ] org.jsoup#jsoup;1.11.3!jsoup.jar (42ms)
[ivy:retrieve] downloading 
https://repo1.maven.org/maven2/com/sun/mail/mailapi/1.6.3/mailapi-1.6.3.jar ...
[ivy:retrieve] .. (291kB)
[ivy:retrieve] .. (0kB)
[ivy:retrieve]  [SUCCESSFUL ] com.sun.mail#mailapi;1.6.3!mailapi.jar (32ms)
[ivy:retrieve] downloading 
https://repo1.maven.org/maven2/com/google/code/gson/gson/2.8.5/gson-2.8.5.jar 
...
[ivy:retrieve] .. (235kB)
[ivy:retrieve] .. (0kB)
[ivy:retrieve]  [SUCCESSFUL ] com.google.code.gson#gson;2.8.5!gson.jar (28ms)
[ivy:retrieve] downloading 
https://repo1.maven.org/maven2/com/google/guava/guava/27.0.1-jre/guava-27.0.1-jre.jar
 ...
[ivy:retrieve] 
...
 (2682kB)
[ivy:retrieve] .. (0kB)
[ivy:retrieve]  [SUCCESSFUL ] 
com.google.guava#guava;27.0.1-jre!guava.jar(bundle) (76ms)
[ivy:retrieve] downloading 
https://repo1.maven.org/maven2/com/h3xstream/retirejs/retirejs-core/3.0.1/retirejs-core-3.0.1.jar
 ...
[ivy:retrieve]  (26kB)
[ivy:retrieve] .. (0kB)
[ivy:retrieve]  [SUCCESSFUL ] 
com.h3xstream.retirejs#retirejs-core;3.0.1!retirejs-core.jar (23ms)
[ivy:retrieve] downloading 
https://repo1.maven.org/maven2/org/apache/lucene/lucene-queries/7.6.0/lucene-queries-7.6.0.jar
 ...
[ivy:retrieve]  (258kB)
[ivy:retrieve] .. (0kB)
[ivy:retrieve]  [SUCCESSFUL ] 
org.apache.lucene#lucene-queries;7.6.0!lucene-queries.jar (27ms)
[ivy:retrieve] downloading 
https://repo1.maven.org/maven2/org/apache/lucene/lucene-sandbox/7.6.0/lucene-sandbox-7.6.0.jar
 ...
[ivy:retrieve] . (271kB)
[ivy:retrieve] .. (0kB)
[ivy:retrieve]  [SUCCESSFUL ] 
org.apache.lucene#lucene-sandbox;7.6.0!lucene-sandbox.jar (26ms)
[ivy:retrieve] downloading 
https://repo1.maven.org/maven2/commons-lang/commons-lang/2.4/commons-lang-2.4.jar
 ...
[ivy:retrieve]  (255kB)
[ivy:retrieve] .. (0kB)
[ivy:retrieve]  [SUCCESSFUL ] commons-lang#commons-lang;2.4!commons-lang.jar 
(26ms)
[ivy:retrieve] downloading 
https://repo1.maven.org/maven2/com/google/guava/failureaccess/1.0.1/failureaccess-1.0.1.jar
 ...
[ivy:retrieve] ... (4kB)
[ivy:retrieve] .. (0kB)
[ivy:retrieve]  [SUCCESSFUL ] 
com.google.guava#failureaccess;1.0.1!failureaccess.jar(bundle) (21ms)
[ivy:retrieve] downloading 
https://repo1.maven.org/maven2/com/google/guava/listenablefuture/.0-empty-to-avoid-conflict-with-guava/listenablefuture-.0-empty-to-avoid-conflict-with-guava.jar
 ...
[ivy:retrieve] .. (2kB)
[ivy:retrieve] .. (0kB)
[ivy:retrieve]  [SUCCESSFUL ] 
com.google.guava#listenablefuture;.0-empty-to-avoid-conflict-with-guava!listenablefuture.jar
 (22ms)
[ivy:retrieve] downloading 
https://repo1.maven.org/maven2/org/checkerframework/checker-qual/2.5.2/checker-qual-2.5.2.jar
 ...
[ivy:retrieve] ... (188kB)
[ivy:retrieve] .. (0kB)
[ivy:retrieve]  [SUCCESSFUL ] 
org.checkerframework#checker-qual;2.5.2!checker-qual.jar (26ms)
[ivy:retrieve] downloading 
https://repo1.maven.org/maven2/com/google/errorprone/error_prone_annotations/2.2.0/error_prone_annotations-2.2.0.jar
 ...
[ivy:retrieve] ... (13kB)
[ivy:retrieve] .. (0kB)
[ivy:retrieve]  [SUCCESSFUL ] 
com.google.errorprone#error_prone_annotations;2.2.0!error_prone_annotations.jar 
(22ms)
[ivy:retrieve] downloading 
https://repo1.maven.org/maven2/com/google/j2objc/j2objc-annotations/1.1/j2objc-annotations-1.1.jar
 ...
[ivy:retrieve] . (8kB)
[ivy:retrieve] .. (0kB)
[ivy:retrieve]  [SUCCESSFUL ] 
com.google.j2objc#j2objc-annotations;1.1!j2objc-annotations.jar (21ms)
[ivy:retrieve] downloading 
https://repo1.maven.org/maven2/org/codehaus/mojo/animal-sniffer-annotations/1.17/animal-sniffer-annotations-1.17.jar

[GitHub] [zookeeper] eolivelli commented on issue #956: ZOOKEEPER-3399: Remove logging in getGlobalOutstandingLimit for optimal performance.

2019-05-22 Thread GitBox
eolivelli commented on issue #956: ZOOKEEPER-3399: Remove logging in 
getGlobalOutstandingLimit for optimal performance.
URL: https://github.com/apache/zookeeper/pull/956#issuecomment-494716596
 
 
   retest ant build




[GitHub] [zookeeper] maoling commented on issue #948: ZOOKEEPER-3394: Delay observer reconnect when all learner masters hav…

2019-05-22 Thread GitBox
maoling commented on issue #948: ZOOKEEPER-3394: Delay observer reconnect when 
all learner masters hav…
URL: https://github.com/apache/zookeeper/pull/948#issuecomment-494716164
 
 
   @lvfangmin 
   Is this PR merged with a corrupt description?




[DISCUSS] Should we move contrib and recipes into separate repos?

2019-05-22 Thread Tamas Penzes
Hi All,

There is a trend among Apache projects to minimise the content of the main
project and move non-crucial parts into separate repos. I mostly prefer the
mono-repo style of development, but in cases where the connection between
the main project and a subproject is weak I find this idea supportable.

We discussed earlier that it might be a good idea to move
zookeeper-contrib and zookeeper-recipes into separate repositories still
maintained by the ZooKeeper team, but I would be curious about your opinion.

Do you find this idea useful?
What do you think the pros and cons of such a separation would be?

Thanks, Tamaas


[jira] [Commented] (ZOOKEEPER-3323) Add TxnSnapLog metrics

2019-05-22 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16845643#comment-16845643
 ] 

Hudson commented on ZOOKEEPER-3323:
---

FAILURE: Integrated in Jenkins build Zookeeper-trunk-single-thread #367 (See 
[https://builds.apache.org/job/Zookeeper-trunk-single-thread/367/])
ZOOKEEPER-3323: Add TxnSnapLog metrics (fangmin: rev 
d08f51ad1514bfa512597b1ce4bbc2e8144be576)
* (add) 
zookeeper-server/src/test/java/org/apache/zookeeper/server/persistence/FileTxnSnapLogMetricsTest.java
* (edit) 
zookeeper-server/src/main/java/org/apache/zookeeper/server/persistence/FileTxnSnapLog.java
* (edit) 
zookeeper-server/src/main/java/org/apache/zookeeper/server/ServerMetrics.java


> Add TxnSnapLog metrics
> --
>
> Key: ZOOKEEPER-3323
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3323
> Project: ZooKeeper
>  Issue Type: Sub-task
>  Components: metric system
>Reporter: Jie Huang
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.6.0
>
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>






[DISCUSS] Is it time to remove Apache Ant from ZooKeeper?

2019-05-22 Thread Tamas Penzes
Hi All,

Not that long ago we had a discussion about removing Ant from ZooKeeper.
I'd like to restart that discussion since ZooKeeper 3.5.5 is GA and it was
built and released with Maven.
Is it time to remove Ant from the master branch?

That would mean that Ant would not be available from the next minor
version, which is probably 3.6.0.

Please share your opinion.

Thanks, Tamaas


[GitHub] [zookeeper] vladimirivic commented on issue #953: ZOOKEEPER-3398 Learner.connectToLeader() may take too long to time-out

2019-05-22 Thread GitBox
vladimirivic commented on issue #953: ZOOKEEPER-3398 Learner.connectToLeader() 
may take too long to time-out 
URL: https://github.com/apache/zookeeper/pull/953#issuecomment-494673203
 
 
   @lvfangmin @eolivelli @anmolnar - what I am thinking is that it may be worth 
going through the code to see whether this config could be generalized and used 
exclusively for the socket connection timeout wherever else we have similar logic. 
   
   That way we keep initLimit and syncLimit tied to quorum semantics, whereas 
this could be called socketTimeout (or some better name), which is somewhat 
lower level and belongs to the networking layer of ZooKeeper.
   
   Not sure if this is a good idea, just had this thought. I would need to do 
some homework on my end, but let me know what you think first.
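
The separation suggested in the comment above might look something like the following sketch. It is only an illustration under stated assumptions: SocketConnector and its parameter names are hypothetical and do not come from the pull request; the idea shown is simply that the connect timeout comes from a dedicated networking setting, while the read timeout is passed in separately (for example, derived by the caller from tickTime and syncLimit).

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/**
 * Hypothetical helper that keeps the socket connect timeout separate from
 * the quorum-level limits. Names are illustrative, not ZooKeeper's API.
 */
public final class SocketConnector {
    private SocketConnector() {
    }

    /**
     * Opens a socket to addr.
     *
     * @param connectTimeoutMs dedicated networking-layer connect timeout
     * @param readTimeoutMs    read timeout chosen by the caller, e.g. from quorum settings
     */
    public static Socket connect(InetSocketAddress addr, int connectTimeoutMs, int readTimeoutMs)
            throws IOException {
        Socket sock = new Socket();
        try {
            // Fails with SocketTimeoutException if no connection within connectTimeoutMs.
            sock.connect(addr, connectTimeoutMs);
            // Blocking reads on this socket time out after readTimeoutMs.
            sock.setSoTimeout(readTimeoutMs);
            return sock;
        } catch (IOException e) {
            sock.close(); // do not leak the socket on failure
            throw e;
        }
    }
}
```

Keeping the two values as separate parameters means an unreachable host is detected after the connect timeout alone, without changing what initLimit or syncLimit mean for the quorum.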

