date:20160911

See https://builds.apache.org/job/ZooKeeper_branch35_openjdk7/227/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 452895 lines...]
[junit] 2016-09-12 05:47:21,398 [myid:] - INFO  
[main:NettyServerCnxnFactory@464] - shutdown called 0.0.0.0/0.0.0.0:16852
[junit] 2016-09-12 05:47:21,404 [myid:] - INFO  [main:ZooKeeperServer@529] 
- shutting down
[junit] 2016-09-12 05:47:21,404 [myid:] - ERROR [main:ZooKeeperServer@501] 
- ZKShutdownHandler is not registered, so ZooKeeper server won't take any 
action on ERROR or SHUTDOWN server state changes
[junit] 2016-09-12 05:47:21,404 [myid:] - INFO  
[main:SessionTrackerImpl@232] - Shutting down
[junit] 2016-09-12 05:47:21,404 [myid:] - INFO  
[main:PrepRequestProcessor@965] - Shutting down
[junit] 2016-09-12 05:47:21,404 [myid:] - INFO  
[main:SyncRequestProcessor@191] - Shutting down
[junit] 2016-09-12 05:47:21,405 [myid:] - INFO  [ProcessThread(sid:0 
cport:16852)::PrepRequestProcessor@154] - PrepRequestProcessor exited loop!
[junit] 2016-09-12 05:47:21,405 [myid:] - INFO  
[SyncThread:0:SyncRequestProcessor@169] - SyncRequestProcessor exited!
[junit] 2016-09-12 05:47:21,405 [myid:] - INFO  
[main:FinalRequestProcessor@479] - shutdown of request processor complete
[junit] 2016-09-12 05:47:21,406 [myid:] - INFO  [main:MBeanRegistry@128] - 
Unregister MBean 
[org.apache.ZooKeeperService:name0=StandaloneServer_port16852,name1=InMemoryDataTree]
[junit] 2016-09-12 05:47:21,406 [myid:] - INFO  [main:MBeanRegistry@128] - 
Unregister MBean [org.apache.ZooKeeperService:name0=StandaloneServer_port16852]
[junit] 2016-09-12 05:47:21,407 [myid:] - INFO  
[main:FourLetterWordMain@85] - connecting to 127.0.0.1 16852
[junit] 2016-09-12 05:47:21,407 [myid:] - INFO  [main:JMXEnv@146] - 
ensureOnly:[]
[junit] 2016-09-12 05:47:21,415 [myid:] - INFO  [main:ClientBase@568] - 
fdcount after test is: 1373 at start it was 1385
[junit] 2016-09-12 05:47:21,416 [myid:] - INFO  [main:ZKTestCase$1@65] - 
SUCCEEDED testWatcherAutoResetWithLocal
[junit] 2016-09-12 05:47:21,416 [myid:] - INFO  [main:ZKTestCase$1@60] - 
FINISHED testWatcherAutoResetWithLocal
[junit] Tests run: 101, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
443.884 sec, Thread: 3, Class: org.apache.zookeeper.test.NioNettySuiteTest
[junit] 2016-09-12 05:47:21,503 [myid:127.0.0.1:16729] - INFO  
[main-SendThread(127.0.0.1:16729):ClientCnxn$SendThread@1113] - Opening socket 
connection to server 127.0.0.1/127.0.0.1:16729. Will not attempt to 
authenticate using SASL (unknown error)
[junit] 2016-09-12 05:47:21,505 [myid:127.0.0.1:16729] - WARN  
[main-SendThread(127.0.0.1:16729):ClientCnxn$SendThread@1235] - Session 
0x1005e9a0435 for server 127.0.0.1/127.0.0.1:16729, unexpected error, 
closing socket connection and attempting reconnect
[junit] java.net.ConnectException: Connection refused
[junit] at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
[junit] at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:744)
[junit] at 
org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:357)
[junit] at 
org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1214)
[junit] 2016-09-12 05:47:21,852 [myid:127.0.0.1:16735] - INFO  
[main-SendThread(127.0.0.1:16735):ClientCnxn$SendThread@1113] - Opening socket 
connection to server 127.0.0.1/127.0.0.1:16735. Will not attempt to 
authenticate using SASL (unknown error)
[junit] 2016-09-12 05:47:21,854 [myid:127.0.0.1:16735] - WARN  
[main-SendThread(127.0.0.1:16735):ClientCnxn$SendThread@1235] - Session 
0x3005e9a044f for server 127.0.0.1/127.0.0.1:16735, unexpected error, 
closing socket connection and attempting reconnect
[junit] java.net.ConnectException: Connection refused
[junit] at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
[junit] at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:744)
[junit] at 
org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:357)
[junit] at 
org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1214)
[junit] 2016-09-12 05:47:21,990 [myid:127.0.0.1:16732] - INFO  
[main-SendThread(127.0.0.1:16732):ClientCnxn$SendThread@1113] - Opening socket 
connection to server 127.0.0.1/127.0.0.1:16732. Will not attempt to 
authenticate using SASL (unknown error)
[junit] 2016-09-12 05:47:21,991 [myid:127.0.0.1:16732] - WARN  
[main-SendThread(127.0.0.1:16732):ClientCnxn$SendThread@1235] - Session 
0x2005e9a0434 for server 127.0.0.1/127.0.0.1:16732, unexpected error, 
closing socket connection and attempting reconnect
[junit] java.net.ConnectException: Connection refused
[junit] at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)

Re: [VOTE] move Apache Zookeeper to git

2016-09-11 Thread Patrick Hunt

afaik there has never been github integration for anything with ZK.
QAbot only runs against jira/svn.

FYI: I've gone through all the jenkins jobs (3.4/3.5/trunk) and gotten
them working again. There was a ton of cruft in there which I
attempted to cleanup. I think things should be ok, but I will be
monitoring over the next few days. If you notice obvious issues please
lmk (vs say flakey tests).

Additionally - qabot (precommit job) is broken. The zookeeper script
./src/java/test/bin/test-patch.sh is used by QAbot, and it uses svn
directly. We'll need to patch this script in order to get qabot
functional again - replace svn with git usage. There's only a few
lines but I'm not familiar with this script. If anyone wants to take a
stab please submit a jira/patch. I've turned off precommit job on
jenkins until we get this straightened out.
https://builds.apache.org/view/S-Z/view/ZooKeeper/job/PreCommit-ZOOKEEPER-Build/

Patrick

On Sun, Sep 11, 2016 at 9:32 PM, Benjamin Reed  wrote:
> sure. i'll update it to reference git rather than svn.
>
> if i understand correctly pull requests that were submitted via github were
> reviewed by the qa bot (or something like that) in the past, but it was
> turned off. we should turn that back on i think.
>
> thanx
> ben
>
> On Sun, Sep 11, 2016 at 8:49 PM, Patrick Hunt  wrote:
>>
>> FYI Apache INFRA has made the cutover -
>> https://issues.apache.org/jira/browse/INFRA-12573
>>
>> At this point we need to update the "how to contribute" etc... Ben do
>> you want to take a stab at that? I can update the respective Jenkins
>> jobs.
>>
>> What else is there?
>>
>> Patrick
>>
>> On Wed, Sep 7, 2016 at 9:59 AM, Chris Nauroth 
>> wrote:
>> > Thank you for doing this, Eddie.  I just picked up the code review.
>> >
>> > --Chris Nauroth
>> >
>> > On 9/7/16, 9:49 AM, "Edward Ribeiro"  wrote:
>> >
>> > Hey folks, as part of this major change, I took a look at the
>> > gitignore, and it lacks a lot of file extensions for a modern Java
>> > project. Therefore, I created a trivial patch (shameless plug) that
>> > updates it for more common extensions:
>> > https://issues.apache.org/jira/browse/ZOOKEEPER-2557
>> >
>> > Could you (the committers) please review it and, if everything is
>> > alright, incorporate it into the branches before the transition,
>> > whenever you have time? The final gitignore isn't particularly big
>> > and covers mostly the common IDE extensions and temporary files.
>> >
>> > Cheers,
>> > Eddie
>> >
>> >
>> > On Wed, Sep 7, 2016 at 7:31 AM, Flavio Junqueira 
>> > wrote:
>> >
>> > > +1
>> > >
>> > > > On 07 Sep 2016, at 06:10, Patrick Hunt  wrote:
>> > > >
>> > > > Quick update (more details on the INFRA jira). It might take upwards
>> > > > of 24 hours to do the svn->git migration although our repo isn't that
>> > > > large, likely less. INFRA can do it, for example, on Saturday around
>> > > > 18:00 UTC. Any concerns with such an approach?
>> > > >
>> > > > Patrick
>> > > >
>> > > > On Sun, Sep 4, 2016 at 9:20 PM, Patrick Hunt 
>> > wrote:
>> > > >
>> > > >> Follow along here:
>> > https://issues.apache.org/jira/browse/INFRA-12573
>> > > >>
>> > > >> Patrick
>> > > >>
>> > > >> On Sun, Sep 4, 2016 at 8:33 AM, Benjamin Reed
>> >  wrote:
>> > > >>
>> > > >>> with 10 votes for (5 of which are from the PMC) and no votes against,
>> > > >>> the vote passes.
>> > > >>>
>> > > >>> pat please make git happen! :)
>> > > >>>
>> > > >>> thanx for voting!
>> > > >>>
>> > > >>> On Thu, Sep 1, 2016 at 9:25 AM, Michael Han
>> >  wrote:
>> > > >>>
>> > >  +1
>> > > 
>> > >  On Thu, Sep 1, 2016 at 6:08 AM, Michelle Tan
>> > 
>> > > >>> wrote:
>> > > 
>> > > > +1
>> > > >
>> > > > On Thu, Sep 1, 2016 at 2:01 PM, Flavio Junqueira
>> > 
>> > > >>> wrote:
>> > > >
>> > > >> +1
>> > > >>
>> > > >>> On 01 Sep 2016, at 13:28, Edward Ribeiro <
>> > > >>> edward.ribe...@gmail.com>
>> > > >> wrote:
>> > > >>>
>> > > >>> +1 (non binding)
>> > > >>>
>> > > >>> On Thu, Sep 1, 2016 at 3:44 AM, Jordan Zimmerman <
>> > > >> jor...@jordanzimmerman.com
>> > >  wrote:
>> > > >>>
>> > >  +1 (non binding)
>> > > 
>> > > > On Aug 31, 2016, at 8:29 PM, Benjamin Reed
>> > 
>> > >  wrote:
>> > > >
>> > > > flip the switch to git and update the relevant scripts
>> > and docs.
>> > > >
>> > >

ZooKeeper-trunk-openjdk7 - Build # 1163 - Still Failing

See https://builds.apache.org/job/ZooKeeper-trunk-openjdk7/1163/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 465496 lines...]
[junit] 2016-09-12 05:35:08,372 [myid:] - INFO  [ProcessThread(sid:0 
cport:30317)::PrepRequestProcessor@647] - Processed session termination for 
sessionid: 0x100b066c4f1
[junit] 2016-09-12 05:35:08,396 [myid:] - WARN  [New I/O worker 
#6635:NettyServerCnxnFactory$CnxnChannelHandler@142] - Exception caught [id: 
0x9ecc96fe, /127.0.0.1:36643 :> /127.0.0.1:30317] EXCEPTION: 
java.nio.channels.ClosedChannelException
[junit] java.nio.channels.ClosedChannelException
[junit] at 
sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:270)
[junit] at 
sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:479)
[junit] at 
org.jboss.netty.channel.socket.nio.SocketSendBufferPool$UnpooledSendBuffer.transferTo(SocketSendBufferPool.java:203)
[junit] at 
org.jboss.netty.channel.socket.nio.AbstractNioWorker.write0(AbstractNioWorker.java:201)
[junit] at 
org.jboss.netty.channel.socket.nio.AbstractNioWorker.writeFromTaskLoop(AbstractNioWorker.java:151)
[junit] at 
org.jboss.netty.channel.socket.nio.AbstractNioChannel$WriteTask.run(AbstractNioChannel.java:315)
[junit] at 
org.jboss.netty.channel.socket.nio.AbstractNioSelector.processTaskQueue(AbstractNioSelector.java:391)
[junit] at 
org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:315)
[junit] at 
org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
[junit] at 
org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
[junit] at 
org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
[junit] at 
org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
[junit] at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
[junit] at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
[junit] at java.lang.Thread.run(Thread.java:745)
[junit] 2016-09-12 05:35:08,396 [myid:] - INFO  
[SyncThread:0:MBeanRegistry@128] - Unregister MBean 
[org.apache.ZooKeeperService:name0=StandaloneServer_port30317,name1=Connections,name2=127.0.0.1,name3=0x100b066c4f1]
[junit] 2016-09-12 05:35:08,497 [myid:] - INFO  [main:ZooKeeper@1313] - 
Session: 0x100b066c4f1 closed
[junit] 2016-09-12 05:35:08,497 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@82] - Memory used 134354
[junit] 2016-09-12 05:35:08,497 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@87] - Number of threads 1643
[junit] 2016-09-12 05:35:08,497 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@102] - FINISHED TEST METHOD 
testWatcherAutoResetWithLocal
[junit] 2016-09-12 05:35:08,497 [myid:] - INFO  [main:ClientBase@543] - 
tearDown starting
[junit] 2016-09-12 05:35:08,499 [myid:] - INFO  [main:ClientBase@513] - 
STOPPING server
[junit] 2016-09-12 05:35:08,499 [myid:] - INFO  
[main:NettyServerCnxnFactory@464] - shutdown called 0.0.0.0/0.0.0.0:30317
[junit] 2016-09-12 05:35:08,497 [myid:] - INFO  
[main-EventThread:ClientCnxn$EventThread@513] - EventThread shut down for 
session: 0x100b066c4f1
[junit] 2016-09-12 05:35:08,503 [myid:] - INFO  [main:ZooKeeperServer@529] 
- shutting down
[junit] 2016-09-12 05:35:08,503 [myid:] - ERROR [main:ZooKeeperServer@501] 
- ZKShutdownHandler is not registered, so ZooKeeper server won't take any 
action on ERROR or SHUTDOWN server state changes
[junit] 2016-09-12 05:35:08,503 [myid:] - INFO  
[main:SessionTrackerImpl@232] - Shutting down
[junit] 2016-09-12 05:35:08,503 [myid:] - INFO  
[main:PrepRequestProcessor@965] - Shutting down
[junit] 2016-09-12 05:35:08,504 [myid:] - INFO  
[main:SyncRequestProcessor@191] - Shutting down
[junit] 2016-09-12 05:35:08,504 [myid:] - INFO  [ProcessThread(sid:0 
cport:30317)::PrepRequestProcessor@154] - PrepRequestProcessor exited loop!
[junit] 2016-09-12 05:35:08,504 [myid:] - INFO  
[SyncThread:0:SyncRequestProcessor@169] - SyncRequestProcessor exited!
[junit] 2016-09-12 05:35:08,504 [myid:] - INFO  
[main:FinalRequestProcessor@479] - shutdown of request processor complete
[junit] 2016-09-12 05:35:08,505 [myid:] - INFO  [main:MBeanRegistry@128] - 
Unregister MBean 
[org.apache.ZooKeeperService:name0=StandaloneServer_port30317,name1=InMemoryDataTree]
[junit] 2016-09-12 05:35:08,505 [myid:] - INFO  [main:MBeanRegistry@128] - 
Unregister MBean [org.apache.ZooKeeperService:name0=StandaloneServer_port30317]
[junit] 2016-09-12 05:35:08,505 [myid:] - INFO  
[main:FourLetterWordMain@85] - connecting to 127.0.0.1 30317
[junit] 2016-09-12

[jira] [Updated] (ZOOKEEPER-2575) /./// does not have the form scheme:id:perm and client is quit.

2016-09-11 Thread Prabhunath Yadav (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhunath Yadav updated ZOOKEEPER-2575:

Description: 
When creating a node with random or otherwise malformed arguments, for example
create /  /.///
the client shows the message 
/./// does not have the form scheme:id:perm
along with Exception in thread "main" 
org.apache.zookeeper.KeeperException$InvalidACLException: 
KeeperErrorCode=InvalidACL
and then quits.

The client should give an accurate error message, but it should not close or quit.

  was:
while creating node using command 
create /  /./// or some wrong format it shows the message 
/./// does not have the form scheme:id:perm
with Exception in thread "main" 
org.apache.zookeeper.KeeperException$InvalidACLException: 
KeeperErrorCode=InvalidACL
.

It should give the accurate message but it should not get closed or quit.


> /./// does not have the form scheme:id:perm and client is quit.
> ---
>
> Key: ZOOKEEPER-2575
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2575
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: java client
>Affects Versions: 3.5.2
>Reporter: Prabhunath Yadav
>Priority: Minor
>
> When creating a node with random or otherwise malformed arguments, for example
> create /  /.///
> the client shows the message 
> /./// does not have the form scheme:id:perm
> along with Exception in thread "main" 
> org.apache.zookeeper.KeeperException$InvalidACLException: 
> KeeperErrorCode=InvalidACL
> and then quits.
> The client should give an accurate error message, but it should not close or quit.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Created] (ZOOKEEPER-2575) /./// does not have the form scheme:id:perm and client is quit.

2016-09-11 Thread Prabhunath Yadav (JIRA)

Prabhunath Yadav created ZOOKEEPER-2575:
---

 Summary: /./// does not have the form scheme:id:perm and client is 
quit.
 Key: ZOOKEEPER-2575
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2575
 Project: ZooKeeper
  Issue Type: Bug
  Components: java client
Affects Versions: 3.5.2
Reporter: Prabhunath Yadav
Priority: Minor


When creating a node with malformed arguments, for example
create /  /.///
the client shows the message 
/./// does not have the form scheme:id:perm
along with Exception in thread "main" 
org.apache.zookeeper.KeeperException$InvalidACLException: 
KeeperErrorCode=InvalidACL
and then quits.

The client should give an accurate error message, but it should not close or quit.
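
The behaviour requested above (report the malformed ACL argument, but keep the client running) could look roughly like the following. This is only a minimal sketch under stated assumptions: AclArgSketch, AclEntry, parse and handleCreate are hypothetical names invented for illustration and are not the ZooKeeper client's actual classes or methods.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; not the actual ZooKeeper command-line client code. */
public class AclArgSketch {

    /** One parsed "scheme:id:perm" entry (illustrative type, not a ZooKeeper class). */
    static final class AclEntry {
        final String scheme;
        final String id;
        final String perms;

        AclEntry(String scheme, String id, String perms) {
            this.scheme = scheme;
            this.id = id;
            this.perms = perms;
        }
    }

    /** Parses a comma-separated ACL argument; throws on bad input instead of exiting. */
    static List<AclEntry> parse(String aclArg) {
        List<AclEntry> entries = new ArrayList<>();
        for (String part : aclArg.split(",")) {
            int first = part.indexOf(':');
            int last = part.lastIndexOf(':');
            // Fewer than two colons means the entry cannot be scheme:id:perm.
            if (first == -1 || first == last) {
                throw new IllegalArgumentException(
                        part + " does not have the form scheme:id:perm");
            }
            entries.add(new AclEntry(
                    part.substring(0, first),
                    part.substring(first + 1, last),
                    part.substring(last + 1)));
        }
        return entries;
    }

    /** Handles one create-style command; returns true if the shell should keep running. */
    static boolean handleCreate(String path, String aclArg) {
        try {
            List<AclEntry> acl = parse(aclArg);
            System.out.println("would create " + path + " with " + acl.size() + " ACL entries");
        } catch (IllegalArgumentException e) {
            // Report the problem and return to the prompt rather than terminating.
            System.err.println(e.getMessage());
        }
        return true;
    }

    public static void main(String[] args) {
        handleCreate("/node", "/.///");              // reports the error, keeps running
        handleCreate("/node", "world:anyone:cdrwa"); // parses successfully
    }
}
```

The point of the sketch is only that the parse error is caught at the command level, so a bad argument produces a message without ending the session.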



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Comment Edited] (ZOOKEEPER-2574) PurgeTxnLog can inadvertently delete required txn log files

2016-09-11 Thread Rakesh R (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15483048#comment-15483048
 ] 

Rakesh R edited comment on ZOOKEEPER-2574 at 9/12/16 4:39 AM:
--

Good catch, [~abhishekrai]. Thanks for the proposed patch with good unit 
testing.

I agree the following scenario will occur on the learner side, where snapshotting 
has happened multiple times without an accompanying log rollover, [refer source 
code|https://github.com/apache/zookeeper/blob/branch-3.4/src/java/main/org/apache/zookeeper/server/quorum/Learner.java#L439].
 I think there is a mismatch with the ZooKeeper documentation, which says "This 
snapshot supercedes all previous logs", [please refer zk 
doc|https://zookeeper.apache.org/doc/r3.4.9/zookeeperAdmin.html#Ongoing+Data+Directory+Cleanup].
 But in this particular case the latest transactions after snapshotting are going 
to the existing transaction log file, which is behind if we look at the 
naming (zxid part). I'm not sure whether this is intentionally implemented. 
Again, iiuc, recovery will use a snapshot file + delta taken from the 
transaction log file name after snapshotted zxid, but in this case admins need 
to consider the log file just before the snapshot file. Should we make 
the snapshot file and log file consistent to avoid any corner cases in future?

[~phunt], any thoughts?

{code}
2. Following files exist:
log.100 spans transactions from zxid=100 till zxid=140 (inclusive)
snapshot.110 - snapshot as of zxid=110
snapshot.120 - snapshot as of zxid=120
snapshot.130 - snapshot as of zxid=130
{code}




was (Author: rakeshr):
Good catch, [~abhishekrai]. Thanks for the proposed patch with good unit 
testing.

I agree the following scenario will occur on the learner side, where snapshotting 
has happened multiple times without an accompanying log rollover, [refer source 
code|https://github.com/apache/zookeeper/blob/branch-3.4/src/java/main/org/apache/zookeeper/server/quorum/Learner.java#L439].
 I think there is a mismatch with the ZooKeeper documentation, which says "This 
snapshot supercedes all previous logs", [please refer zk 
doc|https://zookeeper.apache.org/doc/r3.4.9/zookeeperAdmin.html#Ongoing+Data+Directory+Cleanup].
 But in this particular case the latest transactions after snapshotting are going 
to the existing transaction log file, which is behind if we look at the 
naming (zxid part). I'm not sure whether this is intentionally implemented. 
Again, iiuc, recovery will use a snapshot file + delta taken from the 
transaction log file name after snapshotted zxid, but in this case admins need 
to consider the log file just before the snapshot file. Should we make 
the snapshot file and log file consistent?

[~phunt], any thoughts?

{code}
2. Following files exist:
log.100 spans transactions from zxid=100 till zxid=140 (inclusive)
snapshot.110 - snapshot as of zxid=110
snapshot.120 - snapshot as of zxid=120
snapshot.130 - snapshot as of zxid=130
{code}



> PurgeTxnLog can inadvertently delete required txn log files
> ---
>
> Key: ZOOKEEPER-2574
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2574
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.4.7, 3.4.8, 3.5.0, 3.5.1, 3.5.2
> Environment: Zookeeper 3.4.8, standalone, and 3-server quorum
>Reporter: Abhishek Rai
>Assignee: Abhishek Rai
>Priority: Blocker
> Fix For: 3.4.10, 3.5.3
>
> Attachments: ZOOKEEPER-2574.2.patch, ZOOKEEPER-2574.3.patch, 
> ZOOKEEPER-2574.patch
>
>
> As part of the fix for ZOOKEEPER-1797, the call to 
> FileTxnSnapLog.getSnapshotLogs() was removed from PurgeTxnLog.java.  As a 
> result, some old-looking but required txn log files can be deleted, resulting 
> in data corruption or loss.
> For example, consider the following:
> 1. Configuration:
> autopurge.snapRetainCount=3
> 2. Following files exist:
> log.100 spans transactions from zxid=100 till zxid=140 (inclusive)
> snapshot.110 - snapshot as of zxid=110
> snapshot.120 - snapshot as of zxid=120
> snapshot.130 - snapshot as of zxid=130
> Above scenario is possible when snapshotting has happened multiple times but 
> without accompanying log rollover, which is possible if the server was 
> running as a learner.
> 3. PurgeTxnLog retains all snapshots but deletes log.100 because its zxid is 
> older than the zxid of the oldest snapshot (110).  This results in loss of 
> transactions in the range 131-140.
> Before the fix for ZOOKEEPER-1797, this was avoided by the call to 
> FileTxnSnapLog.getSnapshotLogs() which finds and retains the newest txn log 
> file with starting zxid < oldest retained snapshot's highest zxid.
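
The retention rule described above (keep the newest log whose starting zxid is below the oldest retained snapshot's zxid, plus every later log) can be sketched roughly as follows. This is an illustrative sketch only, not the actual PurgeTxnLog or FileTxnSnapLog code: the class and method names are hypothetical, and log files are represented simply by their starting zxids.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical sketch of the retention rule; not the real PurgeTxnLog code. */
public class PurgeRetentionSketch {

    /**
     * Returns the starting zxids of the txn logs that must be kept, given the
     * starting zxids of all logs and the zxid of the oldest retained snapshot.
     */
    static List<Long> logsToRetain(TreeSet<Long> logStartZxids, long oldestRetainedSnapZxid) {
        List<Long> keep = new ArrayList<>();
        // Newest log that starts before the snapshot: it may still hold
        // transactions made after the snapshot was taken.
        Long covering = logStartZxids.lower(oldestRetainedSnapZxid);
        if (covering != null) {
            keep.add(covering);
        }
        // Every log that starts at or after the snapshot zxid is also needed.
        keep.addAll(logStartZxids.tailSet(oldestRetainedSnapZxid, true));
        return keep;
    }

    public static void main(String[] args) {
        // Mirrors the example: log.100 exists, oldest retained snapshot is 110.
        // A hypothetical older log.50 is added to show what may be purged.
        TreeSet<Long> logs = new TreeSet<>(Arrays.asList(50L, 100L));
        System.out.println(logsToRetain(logs, 110L)); // prints [100]
    }
}
```

Under this rule log.100 survives the purge even though its name looks older than snapshot.110, so transactions 131-140 are not lost; only the hypothetical log.50 would be eligible for deletion.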



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (ZOOKEEPER-2574) PurgeTxnLog can inadvertently delete required txn log files

2016-09-11 Thread Rakesh R (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15483048#comment-15483048
 ] 

Rakesh R commented on ZOOKEEPER-2574:
-

Good catch, [~abhishekrai]. Thanks for the proposed patch with good unit 
testing.

I agree the following scenario will occur on the learner side, where snapshotting 
has happened multiple times without an accompanying log rollover, [refer source 
code|https://github.com/apache/zookeeper/blob/branch-3.4/src/java/main/org/apache/zookeeper/server/quorum/Learner.java#L439].
 I think there is a mismatch with the ZooKeeper documentation, which says "This 
snapshot supercedes all previous logs", [please refer zk 
doc|https://zookeeper.apache.org/doc/r3.4.9/zookeeperAdmin.html#Ongoing+Data+Directory+Cleanup].
 But in this particular case the latest transactions after snapshotting are going 
to the existing transaction log file, which is behind if we look at the 
naming (zxid part). I'm not sure whether this is intentionally implemented. 
Again, iiuc, recovery will use a snapshot file + delta taken from the 
transaction log file name after snapshotted zxid, but in this case admins need 
to consider the log file just before the snapshot file. Should we make 
the snapshot file and log file consistent?

[~phunt], any thoughts?

{code}
2. Following files exist:
log.100 spans transactions from zxid=100 till zxid=140 (inclusive)
snapshot.110 - snapshot as of zxid=110
snapshot.120 - snapshot as of zxid=120
snapshot.130 - snapshot as of zxid=130
{code}



> PurgeTxnLog can inadvertently delete required txn log files
> ---
>
> Key: ZOOKEEPER-2574
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2574
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.4.7, 3.4.8, 3.5.0, 3.5.1, 3.5.2
> Environment: Zookeeper 3.4.8, standalone, and 3-server quorum
>Reporter: Abhishek Rai
>Assignee: Abhishek Rai
>Priority: Blocker
> Fix For: 3.4.10, 3.5.3
>
> Attachments: ZOOKEEPER-2574.2.patch, ZOOKEEPER-2574.3.patch, 
> ZOOKEEPER-2574.patch
>
>
> As part of the fix for ZOOKEEPER-1797, the call to 
> FileTxnSnapLog.getSnapshotLogs() was removed from PurgeTxnLog.java.  As a 
> result, some old-looking but required txn log files can be deleted, resulting 
> in data corruption or loss.
> For example, consider the following:
> 1. Configuration:
> autopurge.snapRetainCount=3
> 2. Following files exist:
> log.100 spans transactions from zxid=100 till zxid=140 (inclusive)
> snapshot.110 - snapshot as of zxid=110
> snapshot.120 - snapshot as of zxid=120
> snapshot.130 - snapshot as of zxid=130
> Above scenario is possible when snapshotting has happened multiple times but 
> without accompanying log rollover, which is possible if the server was 
> running as a learner.
> 3. PurgeTxnLog retains all snapshots but deletes log.100 because its zxid is 
> older than the zxid of the oldest snapshot (110).  This results in loss of 
> transactions in the range 131-140.
> Before the fix for ZOOKEEPER-1797, this was avoided by the call to 
> FileTxnSnapLog.getSnapshotLogs() which finds and retains the newest txn log 
> file with starting zxid < oldest retained snapshot's highest zxid.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

ZooKeeper-trunk - Build # 3076 - Still Failing

See https://builds.apache.org/job/ZooKeeper-trunk/3076/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 461002 lines...]
 [exec]  : elapsed 1000 : OK
 [exec] Zookeeper_simpleSystem::testAsyncWatcherAutoReset ZooKeeper server 
started : elapsed 10338 : OK
 [exec] Zookeeper_simpleSystem::testDeserializeString : elapsed 0 : OK
 [exec] Zookeeper_simpleSystem::testFirstServerDown : elapsed 1001 : OK
 [exec] Zookeeper_simpleSystem::testNullData : elapsed 1039 : OK
 [exec] Zookeeper_simpleSystem::testIPV6 : elapsed 1013 : OK
 [exec] Zookeeper_simpleSystem::testCreate : elapsed 1024 : OK
 [exec] Zookeeper_simpleSystem::testPath : elapsed 1050 : OK
 [exec] Zookeeper_simpleSystem::testPathValidation : elapsed 1166 : OK
 [exec] Zookeeper_simpleSystem::testPing : elapsed 17692 : OK
 [exec] Zookeeper_simpleSystem::testAcl : elapsed 1016 : OK
 [exec] Zookeeper_simpleSystem::testChroot : elapsed 3115 : OK
 [exec] Zookeeper_simpleSystem::testAuth ZooKeeper server started ZooKeeper 
server started : elapsed 30742 : OK
 [exec] Zookeeper_simpleSystem::testHangingClient : elapsed 1043 : OK
 [exec] Zookeeper_simpleSystem::testWatcherAutoResetWithGlobal ZooKeeper 
server started ZooKeeper server started ZooKeeper server started : elapsed 
16379 : OK
 [exec] Zookeeper_simpleSystem::testWatcherAutoResetWithLocal ZooKeeper 
server started ZooKeeper server started ZooKeeper server started : elapsed 
16177 : OK
 [exec] Zookeeper_simpleSystem::testGetChildren2 : elapsed 1064 : OK
 [exec] Zookeeper_simpleSystem::testLastZxid : elapsed 4527 : OK
 [exec] Zookeeper_simpleSystem::testRemoveWatchers ZooKeeper server started 
: elapsed 4405 : OK
 [exec] Zookeeper_readOnly::testReadOnly : elapsed 4111 : OK
 [exec] OK (72)
 [exec] PASS: zktest-mt
 [exec] ==
 [exec] All 2 tests passed
 [exec] ==
 [exec] make[1]: Leaving directory 
`/home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/build/test/test-cppunit'

test-core-cppunit:

test-core:

test-contrib:

BUILD SUCCESSFUL
Total time: 18 minutes 7 seconds
[FINDBUGS] Collecting findbugs analysis files...
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
[FINDBUGS] Finding all files that match the pattern 
trunk/artifacts/findbugs/*.xml
[FINDBUGS] Computing warning deltas based on reference build #3074
[WARNINGS] Parsing warnings in console log with parser Java Compiler (javac)
[WARNINGS] Parsing warnings in console log with parser JavaDoc Tool
[WARNINGS] Computing warning deltas based on reference build #3074
Archiving artifacts
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
ERROR: No artifacts found that match the file pattern 
"trunk/artifacts/*.tar.gz, trunk/artifacts/findbugs/*ml, 
trunk/build/tmp/zk.log, trunk/build/test/test-cppunit/*.log". Configuration 
error?
ERROR: 'trunk/artifacts/*.tar.gz' doesn't match anything: even 'trunk' doesn't 
exist
Build step 'Archive the artifacts' changed build result to FAILURE
Recording fingerprints
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Recording test results
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
ERROR: Step 'Publish JUnit test result report' failed: No test report files 
were found. Configuration error?
Publishing Javadoc
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7



###
## FAILED TESTS (if any) 
##
No tests ran.

ZooKeeper_branch35_jdk7 - Build # 661 - Failure

See https://builds.apache.org/job/ZooKeeper_branch35_jdk7/661/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 407577 lines...]
 [exec] Zookeeper_simpleSystem::testLogCallbackClearLog Message Received: 
[2016-09-12 04:30:41,688:9858(0x2aacc04be540):ZOO_INFO@log_env@1027: Client 
environment:zookeeper.version=zookeeper C client 3.5.2]
 [exec] Log Message Received: [2016-09-12 
04:30:41,688:9858(0x2aacc04be540):ZOO_INFO@log_env@1031: Client 
environment:host.name=pietas.apache.org]
 [exec] Log Message Received: [2016-09-12 
04:30:41,688:9858(0x2aacc04be540):ZOO_INFO@log_env@1038: Client 
environment:os.name=Linux]
 [exec] Log Message Received: [2016-09-12 
04:30:41,688:9858(0x2aacc04be540):ZOO_INFO@log_env@1039: Client 
environment:os.arch=3.13.0-92-generic]
 [exec] Log Message Received: [2016-09-12 
04:30:41,688:9858(0x2aacc04be540):ZOO_INFO@log_env@1040: Client 
environment:os.version=#139-Ubuntu SMP Tue Jun 28 20:42:26 UTC 2016]
 [exec] Log Message Received: [2016-09-12 
04:30:41,688:9858(0x2aacc04be540):ZOO_INFO@log_env@1048: Client 
environment:user.name=jenkins]
 [exec] Log Message Received: [2016-09-12 
04:30:41,689:9858(0x2aacc04be540):ZOO_INFO@log_env@1056: Client 
environment:user.home=/home/jenkins]
 [exec] Log Message Received: [2016-09-12 
04:30:41,689:9858(0x2aacc04be540):ZOO_INFO@log_env@1068: Client 
environment:user.dir=/home/jenkins/jenkins-slave/workspace/ZooKeeper_branch35_jdk7/build/test/test-cppunit]
 [exec] Log Message Received: [2016-09-12 
04:30:41,689:9858(0x2aacc04be540):ZOO_INFO@zookeeper_init_internal@: 
Initiating client connection, host=127.0.0.1:22181 sessionTimeout=1 
watcher=0x45d200 sessionId=0 sessionPasswd= context=0x7ffebfa55b40 
flags=0]
 [exec] Log Message Received: [2016-09-12 
04:30:41,689:9858(0x2aacc251b700):ZOO_INFO@check_events@2360: initiated 
connection to server [127.0.0.1:22181]]
 [exec] Log Message Received: [2016-09-12 
04:30:41,692:9858(0x2aacc251b700):ZOO_INFO@check_events@2412: session 
establishment complete on server [127.0.0.1:22181], 
sessionId=0x100af270645000f, negotiated timeout=1 ]
 [exec]  : elapsed 1001 : OK
 [exec] Zookeeper_simpleSystem::testAsyncWatcherAutoReset ZooKeeper server 
started : elapsed 10630 : OK
 [exec] Zookeeper_simpleSystem::testDeserializeString : elapsed 0 : OK
 [exec] Zookeeper_simpleSystem::testFirstServerDown : elapsed 1000 : OK
 [exec] Zookeeper_simpleSystem::testNullData : elapsed 1023 : OK
 [exec] Zookeeper_simpleSystem::testIPV6 : elapsed 1003 : OK
 [exec] Zookeeper_simpleSystem::testCreate : elapsed 1005 : OK
 [exec] Zookeeper_simpleSystem::testPath : elapsed 1011 : OK
 [exec] Zookeeper_simpleSystem::testPathValidation : elapsed 1035 : OK
 [exec] Zookeeper_simpleSystem::testPing : elapsed 17164 : OK
 [exec] Zookeeper_simpleSystem::testAcl : elapsed 1009 : OK
 [exec] Zookeeper_simpleSystem::testChroot : elapsed 4034 : OK
 [exec] Zookeeper_simpleSystem::testAuth ZooKeeper server started ZooKeeper 
server started : elapsed 30718 : OK
 [exec] Zookeeper_simpleSystem::testHangingClient : elapsed 1026 : OK
 [exec] Zookeeper_simpleSystem::testWatcherAutoResetWithGlobal ZooKeeper 
server started ZooKeeper server started ZooKeeper server started : elapsed 
15138 : OK
 [exec] Zookeeper_simpleSystem::testWatcherAutoResetWithLocal ZooKeeper 
server started ZooKeeper server started ZooKeeper server started : elapsed 
15122 : OK
 [exec] Zookeeper_simpleSystem::testGetChildren2 : elapsed 1026 : OK
 [exec] Zookeeper_simpleSystem::testLastZxid : elapsed 4510 : OK
 [exec] Zookeeper_simpleSystem::testRemoveWatchers ZooKeeper server started 
: elapsed 4496 : OK
 [exec] Zookeeper_readOnly::testReadOnly : elapsed 4166 : OK
 [exec] OK (72)
 [exec] PASS: zktest-mt
 [exec] ==
 [exec] All 2 tests passed
 [exec] ==
 [exec] make[1]: Leaving directory 
`/home/jenkins/jenkins-slave/workspace/ZooKeeper_branch35_jdk7/build/test/test-cppunit'

test-core-cppunit:

test-core:

test-contrib:

BUILD SUCCESSFUL
Total time: 16 minutes 12 seconds
Archiving artifacts
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
ERROR: No artifacts found that match the file pattern 
"branch-3.5/build/tmp/zk.log, branch-3.5/build/test/test-cppunit/*.log". 
Configuration error?
ERROR: 'branch-3.5/build/tmp/zk.log' doesn't match anything, but 
'build/tmp/zk.log' does. Perhaps that's what you mean?
Build step 'Archive the artifacts' changed build result to FAILURE
Recording test results
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
ERROR: Step 'Publish JUnit test result report' failed: No test report files 
were found. Configuration error?
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Email was

Re: [VOTE] move Apache Zookeeper to git

2016-09-11 Thread Benjamin Reed

sure. i'll update it to reference git rather than svn.

if i understand correctly pull requests that were submitted via github were
reviewed by the qa bot (or something like that) in the past, but it was
turned off. we should turn that back on i think.

thanx
ben

On Sun, Sep 11, 2016 at 8:49 PM, Patrick Hunt  wrote:

> FYI Apache INFRA has made the cutover -
> https://issues.apache.org/jira/browse/INFRA-12573
>
> At this point we need to update the "how to contribute" etc... Ben do
> you want to take a stab at that? I can update the respective Jenkins
> jobs.
>
> What else is there?
>
> Patrick
>
> On Wed, Sep 7, 2016 at 9:59 AM, Chris Nauroth 
> wrote:
> > Thank you for doing this, Eddie.  I just picked up the code review.
> >
> > --Chris Nauroth
> >
> > On 9/7/16, 9:49 AM, "Edward Ribeiro"  wrote:
> >
> > Hey folks, as part of this major change, I took a look at the
> gitignore and
> > it already lacks a lot of file extensions for a modern Java project.
> > Therefore, I created a trivial patch (shameless plug) that updates
> for more
> > commonly extensions:  https://issues.apache.org/
> jira/browse/ZOOKEEPER-2557
> >
> > Could you please review it and (the committers) this incorporated
> into
> > branches before the transition if everything is alright, whenever
> you have
> > time? The final gitignore doesn't look particularly big and cover
> only
> > mostly the common IDE extensions and temporary files.
> >
> > Cheers,
> > Eddie
> >
> >
> > On Wed, Sep 7, 2016 at 7:31 AM, Flavio Junqueira 
> wrote:
> >
> > > +1
> > >
> > > > On 07 Sep 2016, at 06:10, Patrick Hunt  wrote:
> > > >
> > > > Quick update (more details on the INFRA jira). It might take
> upwards of
> > > 24
> > > > hours to do the svn->git migration although our repo isn't that
> large,
> > > > likely less. INFRA can do it, for example, on Saturday around
> 18:00 UTC.
> > > > Any concerns with such an approach?
> > > >
> > > > Patrick
> > > >
> > > > On Sun, Sep 4, 2016 at 9:20 PM, Patrick Hunt 
> wrote:
> > > >
> > > >> Follow along here: https://issues.apache.org/
> jira/browse/INFRA-12573
> > > >>
> > > >> Patrick
> > > >>
> > > >> On Sun, Sep 4, 2016 at 8:33 AM, Benjamin Reed 
> wrote:
> > > >>
> > > >>> with 10 votes for (5 of which are from the PMC) on no votes
> against.
> > > the
> > > >>> vote passes.
> > > >>>
> > > >>> pat please make git happen! :)
> > > >>>
> > > >>> thanx for voting!
> > > >>>
> > > >>> On Thu, Sep 1, 2016 at 9:25 AM, Michael Han 
> wrote:
> > > >>>
> > >  +1
> > > 
> > >  On Thu, Sep 1, 2016 at 6:08 AM, Michelle Tan <
> pheyyin...@gmail.com>
> > > >>> wrote:
> > > 
> > > > +1
> > > >
> > > > On Thu, Sep 1, 2016 at 2:01 PM, Flavio Junqueira <
> f...@apache.org>
> > > >>> wrote:
> > > >
> > > >> +1
> > > >>
> > > >>> On 01 Sep 2016, at 13:28, Edward Ribeiro <
> > > >>> edward.ribe...@gmail.com>
> > > >> wrote:
> > > >>>
> > > >>> +1 (non binding)
> > > >>>
> > > >>> On Thu, Sep 1, 2016 at 3:44 AM, Jordan Zimmerman <
> > > >> jor...@jordanzimmerman.com
> > >  wrote:
> > > >>>
> > >  +1 (non binding)
> > > 
> > > > On Aug 31, 2016, at 8:29 PM, Benjamin Reed <
> br...@apache.org>
> > >  wrote:
> > > >
> > > > flip the switch to git and update the relevant scripts
> and docs.
> > > >
> > > > i couldn't figure out which timeframe this falls under
> in the
> > >  voting
> > > > procedure table, but i think it's safe to go with 3
> days, so the
> > >  vote
> > >  will
> > > > close on Saturday, September 3 at 6:30pm pdt.
> > > >
> > > > +1 from me
> > > 
> > > 
> > > >>
> > > >>
> > > >
> > > 
> > > 
> > > 
> > >  --
> > >  Cheers
> > >  Michael.
> > > 
> > > >>>
> > > >>
> > > >>
> > >
> > >
> >
> >
>

ZooKeeper-trunk-WinVS2008 - Build # 2249 - Failure

See https://builds.apache.org/job/ZooKeeper-trunk-WinVS2008/2249/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 100 lines...]
[ivy:retrieve]  found commons-cli#commons-cli;1.2 in maven2
[ivy:retrieve]  found log4j#log4j;1.2.17 in maven2
[ivy:retrieve]  found io.netty#netty;3.10.5.Final in maven2
[ivy:retrieve]  found net.java.dev.javacc#javacc;5.0 in maven2
[ivy:retrieve] :: resolution report :: resolve 864ms :: artifacts dl 20ms
-
|  |modules||   artifacts   |
|   conf   | number| search|dwnlded|evicted|| number|dwnlded|
-
|  default |   16  |   0   |   0   |   0   ||   16  |   0   |
-
[ivy:retrieve] :: retrieving :: org.apache.zookeeper#zookeeper
[ivy:retrieve]  confs: [default]
[ivy:retrieve]  16 artifacts copied, 0 already retrieved (4635kB/52ms)

generate_jute_parser:
[mkdir] Created dir: 
f:\hudson\hudson-slave\workspace\ZooKeeper-trunk-WinVS2008\build\jute_compiler\org\apache\jute\compiler\generated
[ivy:artifactproperty] DEPRECATED: 'ivy.conf.file' is deprecated, use 
'ivy.settings.file' instead
[ivy:artifactproperty] :: loading settings :: file = 
f:\hudson\hudson-slave\workspace\ZooKeeper-trunk-WinVS2008\ivysettings.xml
 [move] Moving 1 file to 
f:\hudson\hudson-slave\workspace\ZooKeeper-trunk-WinVS2008\build\lib
   [javacc] Java Compiler Compiler Version 5.0 (Parser Generator)
   [javacc] (type "javacc" with no arguments for help)
   [javacc] Reading from file 
f:\hudson\hudson-slave\workspace\ZooKeeper-trunk-WinVS2008\src\java\main\org\apache\jute\compiler\generated\rcc.jj
 . . .
   [javacc] File "TokenMgrError.java" does not exist.  Will create one.
   [javacc] File "ParseException.java" does not exist.  Will create one.
   [javacc] File "Token.java" does not exist.  Will create one.
   [javacc] File "SimpleCharStream.java" does not exist.  Will create one.
   [javacc] Parser generated successfully.

jute:
[javac] Compiling 39 source files to 
f:\hudson\hudson-slave\workspace\ZooKeeper-trunk-WinVS2008\build\classes

compile_jute_uptodate:

compile_jute:
[mkdir] Created dir: 
f:\hudson\hudson-slave\workspace\ZooKeeper-trunk-WinVS2008\src\java\generated
[mkdir] Created dir: 
f:\hudson\hudson-slave\workspace\ZooKeeper-trunk-WinVS2008\src\c\generated
 [java] ../../zookeeper.jute Parsed Successfully
 [java] ../../zookeeper.jute Parsed Successfully
[touch] Creating 
f:\hudson\hudson-slave\workspace\ZooKeeper-trunk-WinVS2008\src\java\generated\.generated

BUILD SUCCESSFUL
Total time: 6 seconds
[ZooKeeper-trunk-WinVS2008] $ cmd /c call 
C:\Users\CHRIST~1\AppData\Local\Temp\hudson5371991188661527381.bat

f:\hudson\hudson-slave\workspace\ZooKeeper-trunk-WinVS2008>set 
ZOOKEEPER_HOME=f:\hudson\hudson-slave\workspace\ZooKeeper-trunk-WinVS2008\trunk 

f:\hudson\hudson-slave\workspace\ZooKeeper-trunk-WinVS2008>msbuild 
trunk/src/c/zookeeper.sln /p:Configuration=Release 
Microsoft (R) Build Engine version 4.0.30319.34209
[Microsoft .NET Framework, version 4.0.30319.34209]
Copyright (C) Microsoft Corporation. All rights reserved.

MSBUILD : error MSB1009: Project file does not exist.
Switch: trunk/src/c/zookeeper.sln

f:\hudson\hudson-slave\workspace\ZooKeeper-trunk-WinVS2008>exit 1 
Build step 'Execute Windows batch command' marked build as failure
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any



###
## FAILED TESTS (if any) 
##
No tests ran.

Re: [VOTE] move Apache Zookeeper to git

2016-09-11 Thread Patrick Hunt

FYI Apache INFRA has made the cutover -
https://issues.apache.org/jira/browse/INFRA-12573

At this point we need to update the "how to contribute" etc... Ben do
you want to take a stab at that? I can update the respective Jenkins
jobs.

What else is there?

Patrick

On Wed, Sep 7, 2016 at 9:59 AM, Chris Nauroth  wrote:
> Thank you for doing this, Eddie.  I just picked up the code review.
>
> --Chris Nauroth
>
> On 9/7/16, 9:49 AM, "Edward Ribeiro"  wrote:
>
> Hey folks, as part of this major change, I took a look at the gitignore 
> and
> it already lacks a lot of file extensions for a modern Java project.
> Therefore, I created a trivial patch (shameless plug) that updates for 
> more
> commonly extensions:  https://issues.apache.org/jira/browse/ZOOKEEPER-2557
>
> Could you please review it and (the committers) this incorporated into
> branches before the transition if everything is alright, whenever you have
> time? The final gitignore doesn't look particularly big and cover only
> mostly the common IDE extensions and temporary files.
>
> Cheers,
> Eddie
>
>
> On Wed, Sep 7, 2016 at 7:31 AM, Flavio Junqueira  wrote:
>
> > +1
> >
> > > On 07 Sep 2016, at 06:10, Patrick Hunt  wrote:
> > >
> > > Quick update (more details on the INFRA jira). It might take upwards 
> of
> > 24
> > > hours to do the svn->git migration although our repo isn't that large,
> > > likely less. INFRA can do it, for example, on Saturday around 18:00 
> UTC.
> > > Any concerns with such an approach?
> > >
> > > Patrick
> > >
> > > On Sun, Sep 4, 2016 at 9:20 PM, Patrick Hunt  wrote:
> > >
> > >> Follow along here: https://issues.apache.org/jira/browse/INFRA-12573
> > >>
> > >> Patrick
> > >>
> > >> On Sun, Sep 4, 2016 at 8:33 AM, Benjamin Reed  
> wrote:
> > >>
> > >>> with 10 votes for (5 of which are from the PMC) on no votes against.
> > the
> > >>> vote passes.
> > >>>
> > >>> pat please make git happen! :)
> > >>>
> > >>> thanx for voting!
> > >>>
> > >>> On Thu, Sep 1, 2016 at 9:25 AM, Michael Han  
> wrote:
> > >>>
> >  +1
> > 
> >  On Thu, Sep 1, 2016 at 6:08 AM, Michelle Tan 
> > >>> wrote:
> > 
> > > +1
> > >
> > > On Thu, Sep 1, 2016 at 2:01 PM, Flavio Junqueira 
> > >>> wrote:
> > >
> > >> +1
> > >>
> > >>> On 01 Sep 2016, at 13:28, Edward Ribeiro <
> > >>> edward.ribe...@gmail.com>
> > >> wrote:
> > >>>
> > >>> +1 (non binding)
> > >>>
> > >>> On Thu, Sep 1, 2016 at 3:44 AM, Jordan Zimmerman <
> > >> jor...@jordanzimmerman.com
> >  wrote:
> > >>>
> >  +1 (non binding)
> > 
> > > On Aug 31, 2016, at 8:29 PM, Benjamin Reed 
> >  wrote:
> > >
> > > flip the switch to git and update the relevant scripts and 
> docs.
> > >
> > > i couldn't figure out which timeframe this falls under in the
> >  voting
> > > procedure table, but i think it's safe to go with 3 days, so 
> the
> >  vote
> >  will
> > > close on Saturday, September 3 at 6:30pm pdt.
> > >
> > > +1 from me
> > 
> > 
> > >>
> > >>
> > >
> > 
> > 
> > 
> >  --
> >  Cheers
> >  Michael.
> > 
> > >>>
> > >>
> > >>
> >
> >
>
>

[jira] [Comment Edited] (ZOOKEEPER-2080) ReconfigRecoveryTest fails intermittently

2016-09-11 Thread Michael Han (JIRA)

[
https://issues.apache.org/jira/browse/ZOOKEEPER-2080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15482766#comment-15482766
]

Michael Han edited comment on ZOOKEEPER-2080 at 9/12/16 1:50 AM:
-

bq. The shut down of leader election doesn't need to be in the synchronized
block.
This is what the patch did, by moving the shut down of LE out of the sync
block.

bq. it should be volatile.
Yes, agreed. Will update patch.

bq. is it possible that due to a race we miss that we are supposed to start LE?
Possible in theory - an example:
* shuttingDownLE is set to true in
[FastLeaderElection|https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/quorum/FastLeaderElection.java#L302]
* There are a couple of other places in Leader / Follower where
[processReconfig|https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/quorum/Learner.java#L438]
will be called with the restart LE flag set to true, which will set the
shuttingDownLE flag to false.
* So in theory, between the point where we set shuttingDownLE to true in
[FastLeaderElection|https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/quorum/FastLeaderElection.java#L302]
and the point where we check the same flag in
[QuorumPeer|https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/quorum/QuorumPeer.java#L1066],
there could be concurrent operations involving processReconfig that
change the state of shuttingDownLE from true to false. As a result, we miss the
state change of shuttingDownLE in QuorumPeer's main loop.
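
To make the interleaving above concrete, here is a minimal sketch that replays it sequentially. It is a hypothetical model, not the QuorumPeer code: the class and method names are made up for illustration, and the request counter is only one assumed way to keep the event from being overwritten, if it turns out that missing it matters.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical model of the flag hand-off; not the actual QuorumPeer code. */
public class ShuttingDownFlagSketch {
    // Plain signal flag, as in the discussion (volatile gives visibility only).
    private volatile boolean shuttingDownLE = false;
    // Assumed alternative: count requests so a later reset cannot erase them.
    private final AtomicInteger shutdownRequests = new AtomicInteger();

    /** Models FastLeaderElection setting the flag. */
    void requestShutdown() {
        shuttingDownLE = true;
        shutdownRequests.incrementAndGet();
    }

    /** Models processReconfig with the restart LE flag set, which clears the flag. */
    void processReconfig() {
        shuttingDownLE = false;
    }

    /** Main-loop check based on the flag alone. */
    boolean mainLoopSeesFlag() {
        return shuttingDownLE;
    }

    /** Main-loop check that consumes pending requests instead. */
    boolean mainLoopSeesRequest() {
        return shutdownRequests.getAndSet(0) > 0;
    }

    public static void main(String[] args) {
        ShuttingDownFlagSketch peer = new ShuttingDownFlagSketch();
        peer.requestShutdown();  // FLE sets the flag
        peer.processReconfig();  // reconfig resets it before the main loop looks
        System.out.println("flag check sees event:    " + peer.mainLoopSeesFlag());    // false
        System.out.println("counter check sees event: " + peer.mainLoopSeesRequest()); // true
    }
}
```

Making the flag volatile does not change the first result: visibility is not the problem here, the overwrite is.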

Now, does it matter if we miss such an event because interleaving processReconfig
operations set shuttingDownLE from true to false? I am not sure, as I am
still learning the semantics of reconfig and FLE and cannot make a better guess yet.

Other notes:
* Do we need to synchronize on [this
code|https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/quorum/QuorumPeer.java#L1066]
in QuorumPeer? I think we probably should, because similar 'test and set'
operations around shuttingDownLE in QuorumPeer (such as
[restartLeaderElection|https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/quorum/QuorumPeer.java#L1419])
are synchronized, which makes sense (typically in multithreading, what we
need to protect is not just the signal variable but also the code whose
execution depends on the value of that variable).

* I think we have another potential deadlock in
[QuorumPeer.restartLeaderElection|https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/quorum/QuorumPeer.java#L1415].
Here the method is synchronized, but part of its execution requires invoking
[FastLeaderElection.shutdown|https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/quorum/QuorumPeer.java#L1418],
which, as previously analyzed, could end up at a place in
[QuorumCnxManager|https://github.com/apache/zookeeper/blob/3c37184e83a3e68b73544cebccf9388eea26f523/src/java/main/org/apache/zookeeper/server/quorum/QuorumCnxManager.java#L475]
where it needs to obtain the same QuorumPeer lock. I don't think we have been
bitten by this yet, either because we don't have a test case covering it, or
because it is just a theory and in practice we never end up with such a code
execution flow.
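
A rough sketch of the lock pattern described above, under one loud assumption: that the shutdown path waits for another thread which itself needs the QuorumPeer lock (if the same thread re-entered the lock, Java monitors are reentrant and nothing would block). All names here are hypothetical, and a ReentrantLock with a timed tryLock stands in for the monitor so that the demo terminates instead of hanging.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical model of the hold-and-wait pattern; not ZooKeeper code. */
public class RestartDeadlockSketch {

    /** Returns whether the worker managed to take the lock within waitMillis. */
    static boolean workerCanLock(final long waitMillis) {
        final ReentrantLock peerLock = new ReentrantLock(); // stands in for the QuorumPeer monitor
        final AtomicBoolean acquired = new AtomicBoolean(false);

        // Assumed worker: a thread on the shutdown path that needs the same lock.
        Thread worker = new Thread(() -> {
            try {
                if (peerLock.tryLock(waitMillis, TimeUnit.MILLISECONDS)) {
                    acquired.set(true);
                    peerLock.unlock();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        peerLock.lock(); // models entering the synchronized restartLeaderElection
        try {
            worker.start();
            worker.join(); // models shutdown waiting for the worker while holding the lock
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            peerLock.unlock();
        }
        return acquired.get();
    }

    public static void main(String[] args) {
        System.out.println("worker acquired lock: " + workerCanLock(200)); // false
    }
}
```

With a real monitor and an untimed wait, both threads would block forever; the timeout only exists to make the pattern observable.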

* While looking through the code, I find it hard to reason about the current
behavior because of the mixed state changes that can be made by many parties. I
am wondering whether there is a better way of doing this, for example using an
actor-based model to restructure the code so that we have an event-driven
architecture where each object has a main thread that only dispatches events,
and entities (such as QuorumPeer and FastLeaderElection) communicate with each
other by passing event messages. There would not be any shared state, and state
changes for each entity would then be easier to reason about. I think I will
have a better evaluation after more investigation of the FLE codebase.

was (Author: hanm):
bq. The shut down of leader election doesn't need to be in the synchronized
block.
This is what the patch did, by moving the shut down of LE out of the sync
block.

bq. it should be volatile.
Yes, agreed. Will update patch.

[jira] [Comment Edited] (ZOOKEEPER-2080) ReconfigRecoveryTest fails intermittently

2016-09-11 Thread Michael Han (JIRA)

[
https://issues.apache.org/jira/browse/ZOOKEEPER-2080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15482766#comment-15482766
]

Michael Han edited comment on ZOOKEEPER-2080 at 9/12/16 1:48 AM:
-

bq. The shut down of leader election doesn't need to be in the synchronized
block.
This is what the patch did, by moving the shut down of LE out of the sync
block.

bq. it should be volatile.
Yes, agreed. Will update patch.

Other notes:
* Do we need to synchronize on this code in QuorumPeer? I think we probably
should, because similar 'test and set' operations around shuttingDownLE in
QuorumPeer (such as
[restartLeaderElection|https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/quorum/QuorumPeer.java#L1419])
are synchronized, which makes sense (typically in multithreading, what we
need to protect is not just the signal variable but also the code whose
execution depends on the value of that variable).

was (Author: hanm):
bq. The shut down of leader election doesn't need to be in the synchronized
block.
This is what the patch did, by moving the shut down of LE out of the sync
block.

bq. it should be volatile.
Yes, agreed. Will update patch.

[jira] [Commented] (ZOOKEEPER-2080) ReconfigRecoveryTest fails intermittently

2016-09-11 Thread Michael Han (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15482766#comment-15482766
 ] 

Michael Han commented on ZOOKEEPER-2080:


bq. The shut down of leader election doesn't need to be in the synchronized 
block.
This is what the patch did, by moving the shut down of LE out of the sync 
block. 

bq. it should be volatile.
Yes, agreed. Will update patch.

bq. is it possible that due to a race we miss that we are supposed to start LE? 
Possible in theory - an example:
* shuttingDownLE is set to true in 
[FastLeaderElection|https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/quorum/FastLeaderElection.java#L302]
* There are a couple of other places in Leader / Follower where 
[processReconfig|https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/quorum/FastLeaderElection.java#L302]
 will be called with restart LE flag set to true, which will set the 
shuttingDownLE flag to false.
* So in theory, between we set shuttingDownLE to true in 
[FastLeaderElection|https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/quorum/FastLeaderElection.java#L302]
 and we ran into check the same flag in 
[QuorumPeer|https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/quorum/QuorumPeer.java#L1066],
 there could be concurrent operations which involves processReconfig that 
change the state of shuttingDownLE from true to false. As a result, we miss the 
event of the state change of shuttingDownLE in QuorumPeer's main loop.

Now does it matter if we miss such event due to interleaving operations from 
processReconfig set shuttingDownLE from true to false? I am not sure, as I am 
still learning the semantic of reconfig and FLE to make a better guess here. 

Other notes:
* Do we need to synchronize on this code in QuorumPeer? I think probably we 
should, because similar 'test and set' operation around shuttingDownLE in 
QuorumPeer (such as 
[restartLeaderElection|https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/quorum/QuorumPeer.java#L1419]
 is synchronized, which makes sense (as typically in multithreading, what we 
need to protected is not just the signal variable but also the executions that 
depends on the value of the variable).

* I think we have another potential dead lock in 
[QuorumPeer.restartLeaderElection|https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/quorum/QuorumPeer.java#L1415].
 Here the method is synchronized, but part of its execution requires invokes of 
[FastLeaderElection.shutdown|https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/quorum/QuorumPeer.java#L1418],
 which, as previously analyzed could end up at a place in 
[QuorumCnxManager|https://github.com/apache/zookeeper/blob/3c37184e83a3e68b73544cebccf9388eea26f523/src/java/main/org/apache/zookeeper/server/quorum/QuorumCnxManager.java#L475]
 where it requires obtain same QuorumPeer lock. I don't think we are shot by 
this yet, could either because we don't have test case cover such case, or 
could because that is just a theory that in practice we will never end up with 
such code execution flow.

* While looking through code, I find it is hard to reason about current 
behavior due to the mixed state changes that could be done by many parties. I 
am thinking if there is better ways of doing this, for example using an actor 
based  mode to restructure the code such that we have an event driven 
architecture where each object has a main thread which only dispatch events and 
entities (such as QuorumPeer and FastLeaderElection) will communicate each 
other by passing event messages - there would not be any shared state and state 
changes for each entities will then be easier to reason about. I think I will 
have a better evaluation after more investigation around FLE codebase. 

> ReconfigRecoveryTest fails intermittently
> -
>
> Key: ZOOKEEPER-2080
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2080
> Project: ZooKeeper
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Michael Han
> Fix For: 3.5.3, 3.6.0
>
> Attachments: ZOOKEEPER-2080.patch, ZOOKEEPER-2080.patch, 
> ZOOKEEPER-2080.patch, ZOOKEEPER-2080.patch, 
> jacoco-ZOOKEEPER-2080.unzip-grows-to-70MB.7z, repro-20150816.log, 
> threaddump.log
>
>
> I got the following test failure on MacBook with trunk code:
> {code}
> Testcase: testCurrentObserverIsParticipantInNewConfig took 93.628 sec
>   FAILED
> waiting for server 2 being up
> junit.framework.AssertionFailedError: waiting for server 2 being up
>   at 
>

ZooKeeper-trunk - Build # 3075 - Failure

See https://builds.apache.org/job/ZooKeeper-trunk/3075/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 459100 lines...]
 [exec] Log Message Received: [2016-09-11 
23:31:55,355:15712(0x2af269b28540):ZOO_INFO@log_env@1040: Client 
environment:os.version=#63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014]
 [exec] Log Message Received: [2016-09-11 
23:31:55,355:15712(0x2af269b28540):ZOO_INFO@log_env@1048: Client 
environment:user.name=jenkins]
 [exec] Log Message Received: [2016-09-11 
23:31:55,355:15712(0x2af269b28540):ZOO_INFO@log_env@1056: Client 
environment:user.home=/home/jenkins]
 [exec] Log Message Received: [2016-09-11 
23:31:55,355:15712(0x2af269b28540):ZOO_INFO@log_env@1068: Client 
environment:user.dir=/home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/trunk/build/test/test-cppunit]
 [exec] Log Message Received: [2016-09-11 
23:31:55,355:15712(0x2af269b28540):ZOO_INFO@zookeeper_init_internal@: 
Initiating client connection, host=127.0.0.1:22181 sessionTimeout=1 
watcher=0x45d200 sessionId=0 sessionPasswd= context=0x7fff27fd3550 
flags=0]
 [exec] Log Message Received: [2016-09-11 
23:31:55,355:15712(0x2af26bb85700):ZOO_INFO@check_events@2360: initiated 
connection to server [127.0.0.1:22181]]
 [exec] Log Message Received: [2016-09-11 
23:31:55,372:15712(0x2af26bb85700):ZOO_INFO@check_events@2412: session 
establishment complete on server [127.0.0.1:22181], 
sessionId=0x10274814704000f, negotiated timeout=1 ]
 [exec]  : elapsed 1001 : OK
 [exec] Zookeeper_simpleSystem::testAsyncWatcherAutoReset ZooKeeper server 
started : elapsed 10293 : OK
 [exec] Zookeeper_simpleSystem::testDeserializeString : elapsed 0 : OK
 [exec] Zookeeper_simpleSystem::testFirstServerDown : elapsed 1001 : OK
 [exec] Zookeeper_simpleSystem::testNullData : elapsed 1038 : OK
 [exec] Zookeeper_simpleSystem::testIPV6 : elapsed 1012 : OK
 [exec] Zookeeper_simpleSystem::testCreate : elapsed 1016 : OK
 [exec] Zookeeper_simpleSystem::testPath : elapsed 1058 : OK
 [exec] Zookeeper_simpleSystem::testPathValidation : elapsed 1149 : OK
 [exec] Zookeeper_simpleSystem::testPing : elapsed 17668 : OK
 [exec] Zookeeper_simpleSystem::testAcl : elapsed 1181 : OK
 [exec] Zookeeper_simpleSystem::testChroot : elapsed 3100 : OK
 [exec] Zookeeper_simpleSystem::testAuth ZooKeeper server started ZooKeeper 
server started : elapsed 30651 : OK
 [exec] Zookeeper_simpleSystem::testHangingClient : elapsed 1050 : OK
 [exec] Zookeeper_simpleSystem::testWatcherAutoResetWithGlobal ZooKeeper 
server started ZooKeeper server started ZooKeeper server started : elapsed 
15045 : OK
 [exec] Zookeeper_simpleSystem::testWatcherAutoResetWithLocal ZooKeeper 
server started ZooKeeper server started ZooKeeper server started : elapsed 
15063 : OK
 [exec] Zookeeper_simpleSystem::testGetChildren2 : elapsed 1066 : OK
 [exec] Zookeeper_simpleSystem::testLastZxid : elapsed 4525 : OK
 [exec] Zookeeper_simpleSystem::testRemoveWatchers ZooKeeper server started 
: elapsed 4371 : OK
 [exec] Zookeeper_readOnly::testReadOnly : elapsed 4125 : OK
 [exec] OK (72)
 [exec] PASS: zktest-mt
 [exec] ==
 [exec] 1 of 2 tests failed
 [exec] Please report to u...@zookeeper.apache.org
 [exec] ==
 [exec] make[1]: Leaving directory 
`/home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/trunk/build/test/test-cppunit'
 [exec] make[1]: *** [check-TESTS] Error 1
 [exec] make: *** [check-am] Error 2

BUILD FAILED
/home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/trunk/build.xml:1322: The 
following error occurred while executing this line:
/home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/trunk/build.xml:1282: The 
following error occurred while executing this line:
/home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/trunk/build.xml:1292: 
exec returned: 2

Total time: 17 minutes 15 seconds
Build step 'Execute shell' marked build as failure
[FINDBUGS] Skipping publisher since build result is FAILURE
[WARNINGS] Skipping publisher since build result is FAILURE
Archiving artifacts
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Recording fingerprints
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Recording test results
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Publishing Javadoc
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7

ZooKeeper-trunk-openjdk7 - Build # 1162 - Still Failing

See https://builds.apache.org/job/ZooKeeper-trunk-openjdk7/1162/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 526531 lines...]
[junit] at 
org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
[junit] at 
org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
[junit] at 
org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
[junit] at 
org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
[junit] at 
org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
[junit] at 
org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
[junit] at 
org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
[junit] at 
org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
[junit] at 
org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
[junit] at 
org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
[junit] at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
[junit] at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
[junit] at java.lang.Thread.run(Thread.java:745)
[junit] 2016-09-11 20:13:04,998 [myid:] - INFO  [New I/O worker 
#10199:ClientCnxnSocketNetty$ZKClientHandler@384] - channel is disconnected: 
[id: 0x3c4ab0c5, /127.0.0.1:45690 :> 127.0.0.1/127.0.0.1:14060]
[junit] 2016-09-11 20:13:04,998 [myid:] - INFO  [New I/O worker 
#10199:ClientCnxnSocketNetty@208] - channel is told closing
[junit] 2016-09-11 20:13:04,998 [myid:127.0.0.1:14060] - INFO  
[main-SendThread(127.0.0.1:14060):ClientCnxn$SendThread@1231] - channel for 
sessionid 0x0 is lost, closing socket connection and attempting reconnect
[junit] 2016-09-11 20:13:05,628 [myid:127.0.0.1:14036] - INFO  
[main-SendThread(127.0.0.1:14036):ClientCnxn$SendThread@1113] - Opening socket 
connection to server 127.0.0.1/127.0.0.1:14036. Will not attempt to 
authenticate using SASL (unknown error)
[junit] 2016-09-11 20:13:05,629 [myid:] - INFO  [New I/O boss 
#9438:ClientCnxnSocketNetty$1@127] - future isn't success, cause: {}
[junit] java.net.ConnectException: Connection refused: 
127.0.0.1/127.0.0.1:14036
[junit] at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
[junit] at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:744)
[junit] at 
org.jboss.netty.channel.socket.nio.NioClientBoss.connect(NioClientBoss.java:152)
[junit] at 
org.jboss.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:105)
[junit] at 
org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:79)
[junit] at 
org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
[junit] at 
org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42)
[junit] at 
org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
[junit] at 
org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
[junit] at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
[junit] at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
[junit] at java.lang.Thread.run(Thread.java:745)
[junit] 2016-09-11 20:13:05,629 [myid:] - WARN  [New I/O boss 
#9438:ClientCnxnSocketNetty$ZKClientHandler@439] - Exception caught: [id: 
0xd000aceb] EXCEPTION: java.net.ConnectException: Connection refused: 
127.0.0.1/127.0.0.1:14036
[junit] java.net.ConnectException: Connection refused: 
127.0.0.1/127.0.0.1:14036
[junit] at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
[junit] at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:744)
[junit] at 
org.jboss.netty.channel.socket.nio.NioClientBoss.connect(NioClientBoss.java:152)
[junit] at 
org.jboss.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:105)
[junit] at 
org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:79)
[junit] at 
org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
[junit] at 
org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42)
[junit] at 
org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
[junit] at 
org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
[junit] at

[jira] [Commented] (ZOOKEEPER-2574) PurgeTxnLog can inadvertently delete required txn log files

2016-09-11 Thread Arshad Mohammad (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15482179#comment-15482179
 ] 

Arshad Mohammad commented on ZOOKEEPER-2574:


Latest patch LGTM +1(non-binding)

> PurgeTxnLog can inadvertently delete required txn log files
> ---
>
> Key: ZOOKEEPER-2574
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2574
> Project: ZooKeeper
> Issue Type: Bug
> Components: server
> Affects Versions: 3.4.7, 3.4.8, 3.5.0, 3.5.1, 3.5.2
> Environment: Zookeeper 3.4.8, standalone, and 3-server quorum
> Reporter: Abhishek Rai
> Assignee: Abhishek Rai
> Priority: Blocker
> Fix For: 3.4.10, 3.5.3
>
> Attachments: ZOOKEEPER-2574.2.patch, ZOOKEEPER-2574.3.patch, ZOOKEEPER-2574.patch
>
>
> As part of the fix for ZOOKEEPER-1797, the call to 
> FileTxnSnapLog.getSnapshotLogs() was removed from PurgeTxnLog.java.  As a 
> result, some old-looking but required txn log files can be deleted, resulting 
> in data corruption or loss.
> For example, consider the following:
> 1. Configuration:
>      autopurge.snapRetainCount=3
> 2. The following files exist:
>      log.100 - spans transactions from zxid=100 through zxid=140 (inclusive)
>      snapshot.110 - snapshot as of zxid=110
>      snapshot.120 - snapshot as of zxid=120
>      snapshot.130 - snapshot as of zxid=130
>    This scenario is possible when snapshotting has happened multiple times 
>    without an accompanying log rollover, which can occur if the server was 
>    running as a learner.
> 3. PurgeTxnLog retains all snapshots but deletes log.100 because its starting 
>    zxid is older than the zxid of the oldest snapshot (110).  This results in 
>    the loss of transactions in the range 131-140.
> Before the fix for ZOOKEEPER-1797, this was avoided by the call to 
> FileTxnSnapLog.getSnapshotLogs(), which finds and retains the newest txn log 
> file with starting zxid < oldest retained snapshot's highest zxid.
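
The scenario in the description can be replayed with plain numbers. The 
following is only a minimal sketch, not the PurgeTxnLog code: the class and 
method names are hypothetical, each txn log is represented by nothing more 
than its starting zxid, and treating a log that starts exactly at the snapshot 
zxid as "at or before" it is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical illustration; not ZooKeeper source code.
class PurgeScenario {

    // Naive rule: delete every log whose starting zxid is below the
    // zxid of the oldest retained snapshot.
    static List<Long> naiveDeletions(TreeSet<Long> logStarts, long oldestSnapZxid) {
        return new ArrayList<>(logStarts.headSet(oldestSnapZxid));
    }

    // Retaining rule: same candidates, but keep the newest log that starts
    // at or before the oldest retained snapshot (assumed boundary: <=).
    static List<Long> retainingDeletions(TreeSet<Long> logStarts, long oldestSnapZxid) {
        Long keep = logStarts.floor(oldestSnapZxid);
        List<Long> deletions = new ArrayList<>();
        for (Long start : logStarts.headSet(oldestSnapZxid)) {
            if (!start.equals(keep)) {
                deletions.add(start);
            }
        }
        return deletions;
    }

    // Transactions that exist only in a deleted log: those after the newest snapshot.
    static long[] lostRange(long newestSnapZxid, long lastZxidInDeletedLog) {
        return new long[] {newestSnapZxid + 1, lastZxidInDeletedLog};
    }

    public static void main(String[] args) {
        TreeSet<Long> logStarts = new TreeSet<>();
        logStarts.add(100L);      // log.100 covers zxid 100..140
        long oldestSnap = 110L;   // snapshot.110
        long newestSnap = 130L;   // snapshot.130

        System.out.println("naive deletes:     " + naiveDeletions(logStarts, oldestSnap));
        System.out.println("retaining deletes: " + retainingDeletions(logStarts, oldestSnap));
        long[] lost = lostRange(newestSnap, 140L);
        System.out.println("lost if log.100 is deleted: " + lost[0] + "-" + lost[1]);
    }
}
```

With only log.100 present, the naive rule selects log.100 for deletion while 
the retaining rule selects nothing, which matches the 131-140 loss described 
in step 3.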



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (ZOOKEEPER-2574) PurgeTxnLog can inadvertently delete required txn log files


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15482129#comment-15482129
 ] 

Hadoop QA commented on ZOOKEEPER-2574:
--

+1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12827931/ZOOKEEPER-2574.3.patch
  against trunk revision 1759917.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 2.0.3) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3414//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3414//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3414//console

This message is automatically generated.


Success: ZOOKEEPER-2574 PreCommit Build #3414

Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-2574
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3414/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 451850 lines...]
 [exec]   against trunk revision 1759917.
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 3 new or 
modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
(version 2.0.3) warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.
 [exec] 
 [exec] +1 core tests.  The patch passed core unit tests.
 [exec] 
 [exec] +1 contrib tests.  The patch passed contrib unit tests.
 [exec] 
 [exec] Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3414//testReport/
 [exec] Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3414//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
 [exec] Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3414//console
 [exec] 
 [exec] This message is automatically generated.
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Adding comment to Jira.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 
 [exec] Comment added.
 [exec] 54b961a65372c8b185c4611611cc5c26d03fd9e2 logged out
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Finished build.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 

BUILD SUCCESSFUL
Total time: 17 minutes 41 seconds
Archiving artifacts
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Recording test results
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
[description-setter] Description set: ZOOKEEPER-2574
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Email was triggered for: Success
Sending email for trigger: Success
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Updated] (ZOOKEEPER-2574) PurgeTxnLog can inadvertently delete required txn log files


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Rai updated ZOOKEEPER-2574:

Attachment: ZOOKEEPER-2574.3.patch

Thanks [~arshad.mohammad] for the review.  I've applied your suggestions and 
uploaded the latest patch.

Also, I noticed that on Hadoop QA, a test is failing 
(org.apache.zookeeper.test.QuorumTest) but I cannot reproduce this failure 
locally and it also seems unrelated.

Thanks!


ZooKeeper_branch35_solaris - Build # 246 - Still Failing

See https://builds.apache.org/job/ZooKeeper_branch35_solaris/246/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 436189 lines...]
[junit] 2016-09-11 17:24:13,386 [myid:] - INFO  [main:ClientBase@386] - 
CREATING server instance 127.0.0.1:11222
[junit] 2016-09-11 17:24:13,386 [myid:] - INFO  
[main:NIOServerCnxnFactory@673] - Configuring NIO connection handler with 10s 
sessionless connection timeout, 2 selector thread(s), 16 worker threads, and 64 
kB direct buffers.
[junit] 2016-09-11 17:24:13,387 [myid:] - INFO  
[main:NIOServerCnxnFactory@686] - binding to port 0.0.0.0/0.0.0.0:11222
[junit] 2016-09-11 17:24:13,388 [myid:] - INFO  [main:ClientBase@361] - 
STARTING server instance 127.0.0.1:11222
[junit] 2016-09-11 17:24:13,388 [myid:] - INFO  [main:ZooKeeperServer@889] 
- minSessionTimeout set to 6000
[junit] 2016-09-11 17:24:13,389 [myid:] - INFO  [main:ZooKeeperServer@898] 
- maxSessionTimeout set to 6
[junit] 2016-09-11 17:24:13,389 [myid:] - INFO  [main:ZooKeeperServer@159] 
- Created server with tickTime 3000 minSessionTimeout 6000 maxSessionTimeout 
6 datadir 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch35_solaris/branch-3.5/build/test/tmp/test2481539315367179239.junit.dir/version-2
 snapdir 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch35_solaris/branch-3.5/build/test/tmp/test2481539315367179239.junit.dir/version-2
[junit] 2016-09-11 17:24:13,389 [myid:] - INFO  [main:FileSnap@83] - 
Reading snapshot 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch35_solaris/branch-3.5/build/test/tmp/test2481539315367179239.junit.dir/version-2/snapshot.b
[junit] 2016-09-11 17:24:13,391 [myid:] - INFO  [main:FileTxnSnapLog@298] - 
Snapshotting: 0xb to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch35_solaris/branch-3.5/build/test/tmp/test2481539315367179239.junit.dir/version-2/snapshot.b
[junit] 2016-09-11 17:24:13,393 [myid:] - ERROR [main:ZooKeeperServer@501] 
- ZKShutdownHandler is not registered, so ZooKeeper server won't take any 
action on ERROR or SHUTDOWN server state changes
[junit] 2016-09-11 17:24:13,393 [myid:] - INFO  
[main:FourLetterWordMain@85] - connecting to 127.0.0.1 11222
[junit] 2016-09-11 17:24:13,394 [myid:] - INFO  
[NIOServerCxnFactory.AcceptThread:0.0.0.0/0.0.0.0:11222:NIOServerCnxnFactory$AcceptThread@296]
 - Accepted socket connection from /127.0.0.1:56246
[junit] 2016-09-11 17:24:13,398 [myid:] - INFO  
[NIOWorkerThread-1:NIOServerCnxn@485] - Processing stat command from 
/127.0.0.1:56246
[junit] 2016-09-11 17:24:13,398 [myid:] - INFO  
[NIOWorkerThread-1:StatCommand@49] - Stat command output
[junit] 2016-09-11 17:24:13,399 [myid:] - INFO  
[NIOWorkerThread-1:NIOServerCnxn@607] - Closed socket connection for client 
/127.0.0.1:56246 (no session established for client)
[junit] 2016-09-11 17:24:13,401 [myid:] - INFO  [main:JMXEnv@228] - 
ensureParent:[InMemoryDataTree, StandaloneServer_port]
[junit] 2016-09-11 17:24:13,402 [myid:] - INFO  [main:JMXEnv@245] - 
expect:InMemoryDataTree
[junit] 2016-09-11 17:24:13,402 [myid:] - INFO  [main:JMXEnv@249] - 
found:InMemoryDataTree 
org.apache.ZooKeeperService:name0=StandaloneServer_port11222,name1=InMemoryDataTree
[junit] 2016-09-11 17:24:13,402 [myid:] - INFO  [main:JMXEnv@245] - 
expect:StandaloneServer_port
[junit] 2016-09-11 17:24:13,403 [myid:] - INFO  [main:JMXEnv@249] - 
found:StandaloneServer_port 
org.apache.ZooKeeperService:name0=StandaloneServer_port11222
[junit] 2016-09-11 17:24:13,403 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@82] - Memory used 18014
[junit] 2016-09-11 17:24:13,403 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@87] - Number of threads 24
[junit] 2016-09-11 17:24:13,403 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@102] - FINISHED TEST METHOD 
testQuota
[junit] 2016-09-11 17:24:13,403 [myid:] - INFO  [main:ClientBase@543] - 
tearDown starting
[junit] 2016-09-11 17:24:13,472 [myid:] - INFO  [main:ZooKeeper@1313] - 
Session: 0x12397428eb6 closed
[junit] 2016-09-11 17:24:13,472 [myid:] - INFO  [main:ClientBase@513] - 
STOPPING server
[junit] 2016-09-11 17:24:13,472 [myid:] - INFO  
[main-EventThread:ClientCnxn$EventThread@513] - EventThread shut down for 
session: 0x12397428eb6
[junit] 2016-09-11 17:24:13,472 [myid:] - INFO  
[ConnnectionExpirer:NIOServerCnxnFactory$ConnectionExpirerThread@583] - 
ConnnectionExpirerThread interrupted
[junit] 2016-09-11 17:24:13,472 [myid:] - INFO  
[NIOServerCxnFactory.AcceptThread:0.0.0.0/0.0.0.0:11222:NIOServerCnxnFactory$AcceptThread@219]
 - accept thread exitted run method
[junit] 2016-09-11 17:24:13,472 [myid:] - INFO

ZooKeeper_branch34_openjdk7 - Build # 1207 - Still Failing

See https://builds.apache.org/job/ZooKeeper_branch34_openjdk7/1207/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 230709 lines...]
[junit] 2016-09-11 15:37:04,519 [myid:] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxnFactory@219] - 
NIOServerCnxn factory exited run method
[junit] 2016-09-11 15:37:04,519 [myid:] - INFO  [main:ZooKeeperServer@497] 
- shutting down
[junit] 2016-09-11 15:37:04,519 [myid:] - ERROR [main:ZooKeeperServer@472] 
- ZKShutdownHandler is not registered, so ZooKeeper server won't take any 
action on ERROR or SHUTDOWN server state changes
[junit] 2016-09-11 15:37:04,519 [myid:] - INFO  
[main:SessionTrackerImpl@225] - Shutting down
[junit] 2016-09-11 15:37:04,520 [myid:] - INFO  
[main:PrepRequestProcessor@765] - Shutting down
[junit] 2016-09-11 15:37:04,520 [myid:] - INFO  
[main:SyncRequestProcessor@208] - Shutting down
[junit] 2016-09-11 15:37:04,520 [myid:] - INFO  [ProcessThread(sid:0 
cport:11221)::PrepRequestProcessor@143] - PrepRequestProcessor exited loop!
[junit] 2016-09-11 15:37:04,520 [myid:] - INFO  
[SyncThread:0:SyncRequestProcessor@186] - SyncRequestProcessor exited!
[junit] 2016-09-11 15:37:04,520 [myid:] - INFO  
[main:FinalRequestProcessor@402] - shutdown of request processor complete
[junit] 2016-09-11 15:37:04,521 [myid:] - INFO  
[main:FourLetterWordMain@62] - connecting to 127.0.0.1 11221
[junit] 2016-09-11 15:37:04,522 [myid:] - INFO  [main:JMXEnv@146] - 
ensureOnly:[]
[junit] 2016-09-11 15:37:04,523 [myid:] - INFO  [main:ClientBase@445] - 
STARTING server
[junit] 2016-09-11 15:37:04,523 [myid:] - INFO  [main:ClientBase@366] - 
CREATING server instance 127.0.0.1:11221
[junit] 2016-09-11 15:37:04,524 [myid:] - INFO  
[main:NIOServerCnxnFactory@89] - binding to port 0.0.0.0/0.0.0.0:11221
[junit] 2016-09-11 15:37:04,524 [myid:] - INFO  [main:ClientBase@341] - 
STARTING server instance 127.0.0.1:11221
[junit] 2016-09-11 15:37:04,525 [myid:] - INFO  [main:ZooKeeperServer@173] 
- Created server with tickTime 3000 minSessionTimeout 6000 maxSessionTimeout 
6 datadir 
/home/jenkins/jenkins-slave/workspace/ZooKeeper_branch34_openjdk7/branch-3.4/build/test/tmp/test856078244964791796.junit.dir/version-2
 snapdir 
/home/jenkins/jenkins-slave/workspace/ZooKeeper_branch34_openjdk7/branch-3.4/build/test/tmp/test856078244964791796.junit.dir/version-2
[junit] 2016-09-11 15:37:04,529 [myid:] - ERROR [main:ZooKeeperServer@472] 
- ZKShutdownHandler is not registered, so ZooKeeper server won't take any 
action on ERROR or SHUTDOWN server state changes
[junit] 2016-09-11 15:37:04,529 [myid:] - INFO  
[main:FourLetterWordMain@62] - connecting to 127.0.0.1 11221
[junit] 2016-09-11 15:37:04,530 [myid:] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxnFactory@192] - 
Accepted socket connection from /127.0.0.1:55296
[junit] 2016-09-11 15:37:04,530 [myid:] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxn@827] - Processing 
stat command from /127.0.0.1:55296
[junit] 2016-09-11 15:37:04,531 [myid:] - INFO  
[Thread-4:NIOServerCnxn$StatCommand@663] - Stat command output
[junit] 2016-09-11 15:37:04,531 [myid:] - INFO  
[Thread-4:NIOServerCnxn@1008] - Closed socket connection for client 
/127.0.0.1:55296 (no session established for client)
[junit] 2016-09-11 15:37:04,531 [myid:] - INFO  [main:JMXEnv@229] - 
ensureParent:[InMemoryDataTree, StandaloneServer_port]
[junit] 2016-09-11 15:37:04,533 [myid:] - INFO  [main:JMXEnv@246] - 
expect:InMemoryDataTree
[junit] 2016-09-11 15:37:04,534 [myid:] - INFO  [main:JMXEnv@250] - 
found:InMemoryDataTree 
org.apache.ZooKeeperService:name0=StandaloneServer_port11221,name1=InMemoryDataTree
[junit] 2016-09-11 15:37:04,534 [myid:] - INFO  [main:JMXEnv@246] - 
expect:StandaloneServer_port
[junit] 2016-09-11 15:37:04,534 [myid:] - INFO  [main:JMXEnv@250] - 
found:StandaloneServer_port 
org.apache.ZooKeeperService:name0=StandaloneServer_port11221
[junit] 2016-09-11 15:37:04,534 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@58] - Memory used 32638
[junit] 2016-09-11 15:37:04,535 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@63] - Number of threads 20
[junit] 2016-09-11 15:37:04,535 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@78] - FINISHED TEST METHOD testQuota
[junit] 2016-09-11 15:37:04,535 [myid:] - INFO  [main:ClientBase@522] - 
tearDown starting
[junit] 2016-09-11 15:37:04,601 [myid:] - INFO  [main:ZooKeeper@684] - 
Session: 0x15719e50d9a closed
[junit] 2016-09-11 15:37:04,601 [myid:] - INFO  [main:ClientBase@492] - 
STOPPING server
[junit] 2016-09-11 15:37:04,601 [myid:] - INFO  
[main-EventThread:ClientCnxn$EventThread@519] - EventThread shut down for 
session: 0x15719e50d9a

ZooKeeper_branch34_solaris - Build # 1289 - Still Failing

See https://builds.apache.org/job/ZooKeeper_branch34_solaris/1289/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 179246 lines...]
[junit] 2016-09-11 13:48:57,064 [myid:] - INFO  [main:ZooKeeperServer@497] 
- shutting down
[junit] 2016-09-11 13:48:57,064 [myid:] - ERROR [main:ZooKeeperServer@472] 
- ZKShutdownHandler is not registered, so ZooKeeper server won't take any 
action on ERROR or SHUTDOWN server state changes
[junit] 2016-09-11 13:48:57,064 [myid:] - INFO  
[main:SessionTrackerImpl@225] - Shutting down
[junit] 2016-09-11 13:48:57,064 [myid:] - INFO  
[main:PrepRequestProcessor@765] - Shutting down
[junit] 2016-09-11 13:48:57,064 [myid:] - INFO  
[main:SyncRequestProcessor@208] - Shutting down
[junit] 2016-09-11 13:48:57,064 [myid:] - INFO  [ProcessThread(sid:0 
cport:11221)::PrepRequestProcessor@143] - PrepRequestProcessor exited loop!
[junit] 2016-09-11 13:48:57,065 [myid:] - INFO  
[SyncThread:0:SyncRequestProcessor@186] - SyncRequestProcessor exited!
[junit] 2016-09-11 13:48:57,065 [myid:] - INFO  
[main:FinalRequestProcessor@402] - shutdown of request processor complete
[junit] 2016-09-11 13:48:57,065 [myid:] - INFO  
[main:FourLetterWordMain@62] - connecting to 127.0.0.1 11221
[junit] 2016-09-11 13:48:57,066 [myid:] - INFO  [main:JMXEnv@146] - 
ensureOnly:[]
[junit] 2016-09-11 13:48:57,066 [myid:] - INFO  [main:ClientBase@445] - 
STARTING server
[junit] 2016-09-11 13:48:57,066 [myid:] - INFO  [main:ClientBase@366] - 
CREATING server instance 127.0.0.1:11221
[junit] 2016-09-11 13:48:57,067 [myid:] - INFO  
[main:NIOServerCnxnFactory@89] - binding to port 0.0.0.0/0.0.0.0:11221
[junit] 2016-09-11 13:48:57,067 [myid:] - INFO  [main:ClientBase@341] - 
STARTING server instance 127.0.0.1:11221
[junit] 2016-09-11 13:48:57,067 [myid:] - INFO  [main:ZooKeeperServer@173] 
- Created server with tickTime 3000 minSessionTimeout 6000 maxSessionTimeout 
6 datadir 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch34_solaris/trunk/build/test/tmp/test2711436226713463038.junit.dir/version-2
 snapdir 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch34_solaris/trunk/build/test/tmp/test2711436226713463038.junit.dir/version-2
[junit] 2016-09-11 13:48:57,070 [myid:] - ERROR [main:ZooKeeperServer@472] 
- ZKShutdownHandler is not registered, so ZooKeeper server won't take any 
action on ERROR or SHUTDOWN server state changes
[junit] 2016-09-11 13:48:57,070 [myid:] - INFO  
[main:FourLetterWordMain@62] - connecting to 127.0.0.1 11221
[junit] 2016-09-11 13:48:57,070 [myid:] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxnFactory@192] - 
Accepted socket connection from /127.0.0.1:38917
[junit] 2016-09-11 13:48:57,070 [myid:] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxn@827] - Processing 
stat command from /127.0.0.1:38917
[junit] 2016-09-11 13:48:57,071 [myid:] - INFO  
[Thread-5:NIOServerCnxn$StatCommand@663] - Stat command output
[junit] 2016-09-11 13:48:57,071 [myid:] - INFO  
[Thread-5:NIOServerCnxn@1008] - Closed socket connection for client 
/127.0.0.1:38917 (no session established for client)
[junit] 2016-09-11 13:48:57,071 [myid:] - INFO  [main:JMXEnv@229] - 
ensureParent:[InMemoryDataTree, StandaloneServer_port]
[junit] 2016-09-11 13:48:57,072 [myid:] - INFO  [main:JMXEnv@246] - 
expect:InMemoryDataTree
[junit] 2016-09-11 13:48:57,072 [myid:] - INFO  [main:JMXEnv@250] - 
found:InMemoryDataTree 
org.apache.ZooKeeperService:name0=StandaloneServer_port11221,name1=InMemoryDataTree
[junit] 2016-09-11 13:48:57,073 [myid:] - INFO  [main:JMXEnv@246] - 
expect:StandaloneServer_port
[junit] 2016-09-11 13:48:57,073 [myid:] - INFO  [main:JMXEnv@250] - 
found:StandaloneServer_port 
org.apache.ZooKeeperService:name0=StandaloneServer_port11221
[junit] 2016-09-11 13:48:57,073 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@58] - Memory used 9207
[junit] 2016-09-11 13:48:57,073 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@63] - Number of threads 20
[junit] 2016-09-11 13:48:57,073 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@78] - FINISHED TEST METHOD testQuota
[junit] 2016-09-11 13:48:57,073 [myid:] - INFO  [main:ClientBase@522] - 
tearDown starting
[junit] 2016-09-11 13:48:57,152 [myid:] - INFO  
[main-EventThread:ClientCnxn$EventThread@519] - EventThread shut down for 
session: 0x15719821071
[junit] 2016-09-11 13:48:57,152 [myid:] - INFO  [main:ZooKeeper@684] - 
Session: 0x15719821071 closed
[junit] 2016-09-11 13:48:57,152 [myid:] - INFO  [main:ClientBase@492] - 
STOPPING server
[junit] 2016-09-11 13:48:57,153 [myid:] - INFO  [main:ZooKeeperServer@497] 
- shutting down
[junit] 2016-09-11

[jira] [Commented] (ZOOKEEPER-2574) PurgeTxnLog can inadvertently delete required txn log files

2016-09-11 Thread Arshad Mohammad (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15481700#comment-15481700
 ] 

Arshad Mohammad commented on ZOOKEEPER-2574:


Thanks [~abhishekrai] for working on this issue. 
The solution looks good to me. I can see that FileTxnLog.getLogFiles has the 
logic to retain the highest transaction log that is less than 
leastZxidToBeRetain. 
A few comments:
1) Can we rename exclude to retainedTxnLogs or something similar?
{code}
final Set exclude = new HashSet();
{code}
2) The FileTxnSnapLog.getSnapshotLogs(long) javadoc is incomplete. 
Can you update it to indicate that it also returns "one snapshot log which is 
highest but less than the given zxid"?
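
The behavior discussed in this review (keep the highest txn log at or below a 
given zxid, and say so in the javadoc) might look roughly like the following. 
This is a minimal sketch under assumptions, not the patch itself: the class 
and both method names are hypothetical, the zxid suffix of a file name is 
assumed to be hexadecimal, and retaining a log that starts exactly at the 
given zxid is an assumption (the wording above says "less than").

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not ZooKeeper source code.
class RetainedLogs {

    // Parses the zxid suffix of names such as "log.100" (hex is assumed);
    // returns -1 when the name does not have the expected prefix.
    static long zxidFromName(String name, String prefix) {
        String[] parts = name.split("\\.");
        if (parts.length == 2 && parts[0].equals(prefix)) {
            try {
                return Long.parseLong(parts[1], 16);
            } catch (NumberFormatException e) {
                return -1;
            }
        }
        return -1;
    }

    /**
     * Returns the txn logs needed to recover from a snapshot taken at {@code zxid}:
     * every log that starts after {@code zxid}, plus the one log whose starting
     * zxid is the highest among those at or below {@code zxid}.
     */
    static List<File> retainedTxnLogs(File[] files, long zxid) {
        // First pass: find the highest starting zxid that is at or below zxid.
        long floor = -1;
        for (File f : files) {
            long start = zxidFromName(f.getName(), "log");
            if (start >= 0 && start <= zxid && start > floor) {
                floor = start;
            }
        }
        // Second pass: keep that log and every newer one.
        List<File> retained = new ArrayList<>();
        for (File f : files) {
            long start = zxidFromName(f.getName(), "log");
            if (start >= 0 && start >= floor) {
                retained.add(f);
            }
        }
        return retained;
    }

    public static void main(String[] args) {
        File[] files = {
            new File("log.90"), new File("log.100"),
            new File("log.150"), new File("snapshot.110")
        };
        for (File f : retainedTxnLogs(files, 0x110L)) {
            System.out.println("retain " + f.getName());
        }
    }
}
```

Naming the result after what is kept, rather than what is excluded from 
deletion, follows the rename suggested in comment 1.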


ZooKeeper-trunk-jdk8 - Build # 743 - Still Failing

See https://builds.apache.org/job/ZooKeeper-trunk-jdk8/743/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 466979 lines...]
[junit] 2016-09-11 11:59:56,392 [myid:] - INFO  [main:ZKTestCase$1@65] - 
SUCCEEDED testWatcherAutoResetWithLocal
[junit] 2016-09-11 11:59:56,392 [myid:] - INFO  [main:ZKTestCase$1@60] - 
FINISHED testWatcherAutoResetWithLocal
[junit] Tests run: 101, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
553.901 sec, Thread: 2, Class: org.apache.zookeeper.test.NioNettySuiteTest
[junit] 2016-09-11 11:59:56,830 [myid:127.0.0.1:14042] - INFO  
[main-SendThread(127.0.0.1:14042):ClientCnxn$SendThread@1113] - Opening socket 
connection to server 127.0.0.1/127.0.0.1:14042. Will not attempt to 
authenticate using SASL (unknown error)
[junit] 2016-09-11 11:59:56,831 [myid:127.0.0.1:14042] - WARN  
[main-SendThread(127.0.0.1:14042):ClientCnxn$SendThread@1235] - Session 
0x300c2efba12 for server 127.0.0.1/127.0.0.1:14042, unexpected error, 
closing socket connection and attempting reconnect
[junit] java.net.ConnectException: Connection refused
[junit] at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
[junit] at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
[junit] at 
org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:357)
[junit] at 
org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1214)
[junit] 2016-09-11 12:02:26,448 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@82] - Memory used 134370
[junit] 2016-09-11 12:02:26,448 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@87] - Number of threads 55
[junit] 2016-09-11 12:02:26,448 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@102] - FINISHED TEST METHOD 
testManyChildWatchersAutoReset
[junit] 2016-09-11 12:02:26,449 [myid:] - INFO  [main:ClientBase@543] - 
tearDown starting
[junit] 2016-09-11 12:02:26,450 [myid:] - INFO  [ProcessThread(sid:0 
cport:30076)::PrepRequestProcessor@647] - Processed session termination for 
sessionid: 0x100c2e856d5
[junit] 2016-09-11 12:02:26,472 [myid:] - INFO  [main:ZooKeeper@1313] - 
Session: 0x100c2e856d5 closed
[junit] 2016-09-11 12:02:26,473 [myid:] - INFO  
[NIOWorkerThread-31:MBeanRegistry@128] - Unregister MBean 
[org.apache.ZooKeeperService:name0=StandaloneServer_port30076,name1=Connections,name2=127.0.0.1,name3=0x100c2e856d5]
[junit] 2016-09-11 12:02:26,473 [myid:] - INFO  
[main-EventThread:ClientCnxn$EventThread@513] - EventThread shut down for 
session: 0x100c2e856d5
[junit] 2016-09-11 12:02:26,473 [myid:] - INFO  [ProcessThread(sid:0 
cport:30076)::PrepRequestProcessor@647] - Processed session termination for 
sessionid: 0x100c2e856d50001
[junit] 2016-09-11 12:02:26,474 [myid:] - INFO  
[NIOWorkerThread-31:NIOServerCnxn@607] - Closed socket connection for client 
/127.0.0.1:60626 which had sessionid 0x100c2e856d5
[junit] 2016-09-11 12:02:26,497 [myid:] - INFO  [main:ZooKeeper@1313] - 
Session: 0x100c2e856d50001 closed
[junit] 2016-09-11 12:02:26,497 [myid:] - INFO  
[main-EventThread:ClientCnxn$EventThread@513] - EventThread shut down for 
session: 0x100c2e856d50001
[junit] 2016-09-11 12:02:26,497 [myid:] - INFO  [main:ClientBase@513] - 
STOPPING server
[junit] 2016-09-11 12:02:26,497 [myid:] - INFO  
[NIOWorkerThread-11:MBeanRegistry@128] - Unregister MBean 
[org.apache.ZooKeeperService:name0=StandaloneServer_port30076,name1=Connections,name2=127.0.0.1,name3=0x100c2e856d50001]
[junit] 2016-09-11 12:02:26,498 [myid:] - INFO  
[ConnnectionExpirer:NIOServerCnxnFactory$ConnectionExpirerThread@583] - 
ConnnectionExpirerThread interrupted
[junit] 2016-09-11 12:02:26,499 [myid:] - INFO  
[NIOWorkerThread-11:NIOServerCnxn@607] - Closed socket connection for client 
/127.0.0.1:60625 which had sessionid 0x100c2e856d50001
[junit] 2016-09-11 12:02:26,500 [myid:] - INFO  
[NIOServerCxnFactory.AcceptThread:0.0.0.0/0.0.0.0:30076:NIOServerCnxnFactory$AcceptThread@219]
 - accept thread exitted run method
[junit] 2016-09-11 12:02:26,501 [myid:] - INFO  
[NIOServerCxnFactory.SelectorThread-1:NIOServerCnxnFactory$SelectorThread@420] 
- selector thread exitted run method
[junit] 2016-09-11 12:02:26,502 [myid:] - INFO  
[NIOServerCxnFactory.SelectorThread-0:NIOServerCnxnFactory$SelectorThread@420] 
- selector thread exitted run method
[junit] 2016-09-11 12:02:26,505 [myid:] - INFO  [main:ZooKeeperServer@529] 
- shutting down
[junit] 2016-09-11 12:02:26,505 [myid:] - ERROR [main:ZooKeeperServer@501] 
- ZKShutdownHandler is not registered, so ZooKeeper server won't take any 
action on ERROR or SHUTDOWN server state changes
[junit] 2016-09-11 12:02:26,506 [myid:] - INFO  
[main:SessionTrackerImpl@232] - Shutting down

ZooKeeper_branch35_openjdk7 - Build # 226 - Still Failing

See https://builds.apache.org/job/ZooKeeper_branch35_openjdk7/226/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 450506 lines...]
[junit] at 
org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
[junit] at 
org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
[junit] at 
org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
[junit] at 
org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
[junit] at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
[junit] at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
[junit] at java.lang.Thread.run(Thread.java:745)
[junit] 2016-09-11 10:07:20,845 [myid:] - INFO  
[SyncThread:0:MBeanRegistry@128] - Unregister MBean 
[org.apache.ZooKeeperService:name0=StandaloneServer_port11466,name1=Connections,name2=127.0.0.1,name3=0x1005a6499eb]
[junit] 2016-09-11 10:07:20,945 [myid:] - INFO  [main:ZooKeeper@1313] - 
Session: 0x1005a6499eb closed
[junit] 2016-09-11 10:07:20,946 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@82] - Memory used 55597
[junit] 2016-09-11 10:07:20,946 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@87] - Number of threads 468
[junit] 2016-09-11 10:07:20,946 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@102] - FINISHED TEST METHOD 
testWatcherAutoResetWithLocal
[junit] 2016-09-11 10:07:20,946 [myid:] - INFO  [main:ClientBase@543] - 
tearDown starting
[junit] 2016-09-11 10:07:20,946 [myid:] - INFO  [main:ClientBase@513] - 
STOPPING server
[junit] 2016-09-11 10:07:20,947 [myid:] - INFO  
[main:NettyServerCnxnFactory@464] - shutdown called 0.0.0.0/0.0.0.0:11466
[junit] 2016-09-11 10:07:20,946 [myid:] - INFO  
[main-EventThread:ClientCnxn$EventThread@513] - EventThread shut down for 
session: 0x1005a6499eb
[junit] 2016-09-11 10:07:20,952 [myid:] - INFO  [main:ZooKeeperServer@529] 
- shutting down
[junit] 2016-09-11 10:07:20,952 [myid:] - ERROR [main:ZooKeeperServer@501] 
- ZKShutdownHandler is not registered, so ZooKeeper server won't take any 
action on ERROR or SHUTDOWN server state changes
[junit] 2016-09-11 10:07:20,953 [myid:] - INFO  
[main:SessionTrackerImpl@232] - Shutting down
[junit] 2016-09-11 10:07:20,953 [myid:] - INFO  
[main:PrepRequestProcessor@965] - Shutting down
[junit] 2016-09-11 10:07:20,953 [myid:] - INFO  
[main:SyncRequestProcessor@191] - Shutting down
[junit] 2016-09-11 10:07:20,953 [myid:] - INFO  [ProcessThread(sid:0 
cport:11466)::PrepRequestProcessor@154] - PrepRequestProcessor exited loop!
[junit] 2016-09-11 10:07:20,954 [myid:] - INFO  
[SyncThread:0:SyncRequestProcessor@169] - SyncRequestProcessor exited!
[junit] 2016-09-11 10:07:20,954 [myid:] - INFO  
[main:FinalRequestProcessor@479] - shutdown of request processor complete
[junit] 2016-09-11 10:07:20,955 [myid:] - INFO  [main:MBeanRegistry@128] - 
Unregister MBean 
[org.apache.ZooKeeperService:name0=StandaloneServer_port11466,name1=InMemoryDataTree]
[junit] 2016-09-11 10:07:20,955 [myid:] - INFO  [main:MBeanRegistry@128] - 
Unregister MBean [org.apache.ZooKeeperService:name0=StandaloneServer_port11466]
[junit] 2016-09-11 10:07:20,956 [myid:] - INFO  
[main:FourLetterWordMain@85] - connecting to 127.0.0.1 11466
[junit] 2016-09-11 10:07:20,956 [myid:] - INFO  [main:JMXEnv@146] - 
ensureOnly:[]
[junit] 2016-09-11 10:07:20,965 [myid:] - INFO  [main:ClientBase@568] - 
fdcount after test is: 1377 at start it was 1377
[junit] 2016-09-11 10:07:20,965 [myid:] - INFO  [main:ZKTestCase$1@65] - 
SUCCEEDED testWatcherAutoResetWithLocal
[junit] 2016-09-11 10:07:20,966 [myid:] - INFO  [main:ZKTestCase$1@60] - 
FINISHED testWatcherAutoResetWithLocal
[junit] Tests run: 101, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
439.079 sec, Thread: 1, Class: org.apache.zookeeper.test.NioNettySuiteTest
[junit] 2016-09-11 10:07:21,145 [myid:] - INFO  
[SessionTracker:SessionTrackerImpl@158] - SessionTrackerImpl exited loop!
[junit] 2016-09-11 10:07:21,146 [myid:] - INFO  
[SessionTracker:SessionTrackerImpl@158] - SessionTrackerImpl exited loop!
[junit] 2016-09-11 10:07:21,424 [myid:127.0.0.1:11349] - INFO  
[main-SendThread(127.0.0.1:11349):ClientCnxn$SendThread@1113] - Opening socket 
connection to server 127.0.0.1/127.0.0.1:11349. Will not attempt to 
authenticate using SASL (unknown error)
[junit] 2016-09-11 10:07:21,425 [myid:127.0.0.1:11349] - WARN  
[main-SendThread(127.0.0.1:11349):ClientCnxn$SendThread@1235] - Session 
0x3005a61a436 for server 127.0.0.1/127.0.0.1:11349, unexpected error, 
closing socket connection and attempting reconnect
[junit]

[jira] [Commented] (ZOOKEEPER-2310) Snapshot files must be synced to prevent inconsistency or data loss


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15481335#comment-15481335
 ] 

Hadoop QA commented on ZOOKEEPER-2310:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12827922/ZOOKEEPER-2310.3.patch
  against trunk revision 1759917.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 2.0.3) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3413//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3413//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3413//console

This message is automatically generated.

> Snapshot files must be synced to prevent inconsistency or data loss
> ---
>
> Key: ZOOKEEPER-2310
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2310
> Project: ZooKeeper
> Issue Type: Bug
> Components: server
> Affects Versions: 3.4.6
> Reporter: Abhishek Rai
> Assignee: Abhishek Rai
> Attachments: ZOOKEEPER-2310.3.patch, zookeeper-2310-version-2.patch, 
> zookeeper-2310.patch
>
>
> Today, Zookeeper server syncs transaction log files to disk by default, but 
> does not sync snapshot files.  Consequently, an untimely crash may result in 
> a lost or incomplete snapshot file.  During recovery, if the server finds a 
> valid older snapshot file, it will load it and replay subsequent log(s), 
> skipping the incomplete snapshot file.  It's possible that the skipped file 
> had some transactions which are not present in the replayed transaction logs. 
>  Since quorum synchronization is based on last transaction ID of each server, 
> this will never get noticed, resulting in inconsistency between servers and 
> possible data loss.
> Following sequence of events describes a sample scenario where this can 
> happen:
> # Server F is a follower in a Zookeeper ensemble.
> # F's most recent valid snapshot file is named "snapshot.10" containing state 
> up to zxid = 10.  F is currently writing to the transaction log file 
> "log.11", with the most recent zxid = 20.
> # Fresh round of election.
> # F receives a few new transactions 21 to 30 from new leader L as the "diff". 
>  Current server behavior is to dump current state plus diff to a new snapshot 
> file, "snapshot.30".
> # F finalizes the snapshot file, but file contents are still buffered in OS 
> caches.  Zookeeper does not sync snapshot file contents to disk.
> # F receives a new transaction 31 from the leader, which it appends to the 
> existing transaction log file, "log.11" and syncs the file to disk.
> # Server machine crashes or is cold rebooted.
> # After recovery, snapshot file "snapshot.30" may not exist or may be empty.  
> See below for why that may happen.
> # In either case, F looks for the last finalized snapshot file, finds and 
> loads "snapshot.10".  It then replays transactions from "log.11".  
> Ultimately, its last seen zxid will be 31, but it would not have replayed 
> transactions 21 to 30 received via the "diff" from the leader.
> # Clients which are connected to F may see different data than clients 
> connected to other members of the ensemble, violating single system image 
> invariant.  Also, if F were to become a leader at some point, it could use 
> its state to seed other servers, and they all could lose the writes in the 
> missing interval above.
> *Notes:*
> - Reason why snapshot file may be missing or incomplete:
> -- Zookeeper does not sync the data directory after creating a snapshot file. 
>  Even if a newly created file is synced to disk, if the corresponding 
> directory entry is not, then the file will not be visible in the namespace.
> -- Zookeeper does not sync snapshot files.  So, they may be empty or 
> incomplete during recovery from an untimely crash.
> - In step (6) above, the server could also have written the new transaction 
> 31 to a new log file, "log.31".  The

Failed: ZOOKEEPER-2310 PreCommit Build #3413

Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-2310
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3413/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 460866 lines...]
 [exec] Please justify why no new tests are needed 
for this patch.
 [exec] Also please list what manual steps were 
performed to verify this patch.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
(version 2.0.3) warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.
 [exec] 
 [exec] +1 core tests.  The patch passed core unit tests.
 [exec] 
 [exec] +1 contrib tests.  The patch passed contrib unit tests.
 [exec] 
 [exec] Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3413//testReport/
 [exec] Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3413//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
 [exec] Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3413//console
 [exec] 
 [exec] This message is automatically generated.
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Adding comment to Jira.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 
 [exec] Comment added.
 [exec] e6c540bb28e11d6382df2bf6eaa287e09203ebb6 logged out
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Finished build.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 

BUILD FAILED
/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/trunk/build.xml:1605:
 exec returned: 1

Total time: 20 minutes 27 seconds
Build step 'Execute shell' marked build as failure
Archiving artifacts
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Recording test results
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
[description-setter] Description set: ZOOKEEPER-2310
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Updated] (ZOOKEEPER-2310) Snapshot files must be synced to prevent inconsistency or data loss


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Rai updated ZOOKEEPER-2310:

Attachment: ZOOKEEPER-2310.3.patch

> Snapshot files must be synced to prevent inconsistency or data loss
> ---
>
> Key: ZOOKEEPER-2310
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2310
> Project: ZooKeeper
> Issue Type: Bug
> Components: server
> Affects Versions: 3.4.6
> Reporter: Abhishek Rai
> Assignee: Abhishek Rai
> Attachments: ZOOKEEPER-2310.3.patch, zookeeper-2310-version-2.patch, 
> zookeeper-2310.patch
>
>
> Today, Zookeeper server syncs transaction log files to disk by default, but 
> does not sync snapshot files.  Consequently, an untimely crash may result in 
> a lost or incomplete snapshot file.  During recovery, if the server finds a 
> valid older snapshot file, it will load it and replay subsequent log(s), 
> skipping the incomplete snapshot file.  It's possible that the skipped file 
> had some transactions which are not present in the replayed transaction logs. 
>  Since quorum synchronization is based on last transaction ID of each server, 
> this will never get noticed, resulting in inconsistency between servers and 
> possible data loss.
> Following sequence of events describes a sample scenario where this can 
> happen:
> # Server F is a follower in a Zookeeper ensemble.
> # F's most recent valid snapshot file is named "snapshot.10" containing state 
> up to zxid = 10.  F is currently writing to the transaction log file 
> "log.11", with the most recent zxid = 20.
> # Fresh round of election.
> # F receives a few new transactions 21 to 30 from new leader L as the "diff". 
>  Current server behavior is to dump current state plus diff to a new snapshot 
> file, "snapshot.30".
> # F finalizes the snapshot file, but file contents are still buffered in OS 
> caches.  Zookeeper does not sync snapshot file contents to disk.
> # F receives a new transaction 31 from the leader, which it appends to the 
> existing transaction log file, "log.11" and syncs the file to disk.
> # Server machine crashes or is cold rebooted.
> # After recovery, snapshot file "snapshot.30" may not exist or may be empty.  
> See below for why that may happen.
> # In either case, F looks for the last finalized snapshot file, finds and 
> loads "snapshot.10".  It then replays transactions from "log.11".  
> Ultimately, its last seen zxid will be 31, but it would not have replayed 
> transactions 21 to 30 received via the "diff" from the leader.
> # Clients which are connected to F may see different data than clients 
> connected to other members of the ensemble, violating single system image 
> invariant.  Also, if F were to become a leader at some point, it could use 
> its state to seed other servers, and they all could lose the writes in the 
> missing interval above.
> *Notes:*
> - Reason why snapshot file may be missing or incomplete:
> -- Zookeeper does not sync the data directory after creating a snapshot file. 
>  Even if a newly created file is synced to disk, if the corresponding 
> directory entry is not, then the file will not be visible in the namespace.
> -- Zookeeper does not sync snapshot files.  So, they may be empty or 
> incomplete during recovery from an untimely crash.
> - In step (6) above, the server could also have written the new transaction 
> 31 to a new log file, "log.31".  The final outcome would still be the same.
> We are able to deterministically reproduce this problem using the following 
> steps:
> # Create a new Zookeeper ensemble on 3 hosts: A, B, and C.
> # Ensure each server has at least one snapshot file in its data dir.
> # Stop Zookeeper process on server A.
> # Slow down disk syncs on server A (see example script below). This ensures 
> that snapshot files written by Zookeeper don't make it to disk spontaneously. 
>  Log files will be written to disk as Zookeeper explicitly issues a sync call 
> on such files.
> # Connect to server B and create a new znode /test1.
> # Start Zookeeper process on A, wait for it to write a new snapshot to its 
> datadir.  This snapshot would contain /test1 but it won’t be synced to disk 
> yet.
> # Connect to A and verify that /test1 is visible.
> # Connect to B and create another znode /test2.  This will cause A’s 
> transaction log to grow further to receive /test2.
> # Cold reboot A.
> # A’s last snapshot is a zero-sized file or is missing altogether since it 
> did not get synced to disk before reboot.  We have seen both in different 
> runs.
> # Connect to A and verify that /test1 does not exist.  It exists on B and C.
> Slowing down disk syncs:
> {noformat}
> echo 36 | sudo tee /proc/sys/vm/dirty_writeback_centisecs
> echo 36 | sudo tee
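
The gap described in the issue above (snapshot contents and the new directory entry are never forced to disk) can be illustrated with a minimal Java sketch. It is not taken from the attached patches: the class DurableSnapshotWriter and its methods are hypothetical names used only for illustration. It assumes that opening a directory with FileChannel for reading is permitted, which holds on Linux but not on every platform, so the directory sync is best effort.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper for illustration; not part of ZooKeeper. */
final class DurableSnapshotWriter {
    private DurableSnapshotWriter() {}

    /** Writes data to dir/name, then forces the file and its directory entry to disk. */
    static Path writeDurably(Path dir, String name, byte[] data) {
        Path file = dir.resolve(name);
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer buf = ByteBuffer.wrap(data);
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
            // Flush file contents and metadata out of the OS cache.
            ch.force(true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Without this step the file may be on disk but not visible in the namespace.
        syncDirectory(dir);
        return file;
    }

    /** Best-effort sync of the directory entry (assumption: works on Linux-like systems). */
    static void syncDirectory(Path dir) {
        try (FileChannel ch = FileChannel.open(dir, StandardOpenOption.READ)) {
            ch.force(true);
        } catch (IOException e) {
            // Some platforms do not allow opening a directory as a channel; ignore.
        }
    }
}
```

In the scenario from the description, a call such as DurableSnapshotWriter.writeDurably(dataDir, "snapshot.30", bytes) would have to return before transaction 31 is appended to the log, so that a crash after the log sync cannot leave a log that is newer than the newest durable snapshot.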

[jira] [Commented] (ZOOKEEPER-2310) Snapshot files must be synced to prevent inconsistency or data loss


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15481300#comment-15481300
 ] 

Hadoop QA commented on ZOOKEEPER-2310:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12802042/zookeeper-2310-version-2.patch
  against trunk revision 1759917.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3412//console

This message is automatically generated.

> Snapshot files must be synced to prevent inconsistency or data loss
> ---
>
> Key: ZOOKEEPER-2310
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2310
> Project: ZooKeeper
> Issue Type: Bug
> Components: server
> Affects Versions: 3.4.6
> Reporter: Abhishek Rai
> Assignee: Abhishek Rai
> Attachments: zookeeper-2310-version-2.patch, zookeeper-2310.patch
>
>
> Today, Zookeeper server syncs transaction log files to disk by default, but 
> does not sync snapshot files.  Consequently, an untimely crash may result in 
> a lost or incomplete snapshot file.  During recovery, if the server finds a 
> valid older snapshot file, it will load it and replay subsequent log(s), 
> skipping the incomplete snapshot file.  It's possible that the skipped file 
> had some transactions which are not present in the replayed transaction logs. 
>  Since quorum synchronization is based on last transaction ID of each server, 
> this will never get noticed, resulting in inconsistency between servers and 
> possible data loss.
> Following sequence of events describes a sample scenario where this can 
> happen:
> # Server F is a follower in a Zookeeper ensemble.
> # F's most recent valid snapshot file is named "snapshot.10" containing state 
> up to zxid = 10.  F is currently writing to the transaction log file 
> "log.11", with the most recent zxid = 20.
> # Fresh round of election.
> # F receives a few new transactions 21 to 30 from new leader L as the "diff". 
>  Current server behavior is to dump current state plus diff to a new snapshot 
> file, "snapshot.30".
> # F finalizes the snapshot file, but file contents are still buffered in OS 
> caches.  Zookeeper does not sync snapshot file contents to disk.
> # F receives a new transaction 31 from the leader, which it appends to the 
> existing transaction log file, "log.11" and syncs the file to disk.
> # Server machine crashes or is cold rebooted.
> # After recovery, snapshot file "snapshot.30" may not exist or may be empty.  
> See below for why that may happen.
> # In either case, F looks for the last finalized snapshot file, finds and 
> loads "snapshot.10".  It then replays transactions from "log.11".  
> Ultimately, its last seen zxid will be 31, but it would not have replayed 
> transactions 21 to 30 received via the "diff" from the leader.
> # Clients which are connected to F may see different data than clients 
> connected to other members of the ensemble, violating single system image 
> invariant.  Also, if F were to become a leader at some point, it could use 
> its state to seed other servers, and they all could lose the writes in the 
> missing interval above.
> *Notes:*
> - Reason why snapshot file may be missing or incomplete:
> -- Zookeeper does not sync the data directory after creating a snapshot file. 
>  Even if a newly created file is synced to disk, if the corresponding 
> directory entry is not, then the file will not be visible in the namespace.
> -- Zookeeper does not sync snapshot files.  So, they may be empty or 
> incomplete during recovery from an untimely crash.
> - In step (6) above, the server could also have written the new transaction 
> 31 to a new log file, "log.31".  The final outcome would still be the same.
> We are able to deterministically reproduce this problem using the following 
> steps:
> # Create a new Zookeeper ensemble on 3 hosts: A, B, and C.
> # Ensure each server has at least one snapshot file in its data dir.
> # Stop Zookeeper process on server A.
> # Slow down disk syncs on server A (see example script below). This ensures 
> that snapshot files written by Zookeeper don't make it to disk spontaneously. 
>  Log files will be written to disk as Zookeeper explicitly issues a sync call 
> on such files.
> # Connect to server B and create a new znode /test1.
> # Start Zookeeper process on A, wait for it to write a new

Failed: ZOOKEEPER-2310 PreCommit Build #3412

Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-2310
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3412/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 119 lines...]
 [exec] 
 [exec] 
 [exec] 
 [exec] 
 [exec] -1 overall.  Here are the results of testing the latest attachment 
 [exec]   
http://issues.apache.org/jira/secure/attachment/12802042/zookeeper-2310-version-2.patch
 [exec]   against trunk revision 1759917.
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] -1 tests included.  The patch doesn't appear to include any new 
or modified tests.
 [exec] Please justify why no new tests are needed 
for this patch.
 [exec] Also please list what manual steps were 
performed to verify this patch.
 [exec] 
 [exec] -1 patch.  The patch command could not apply the patch.
 [exec] 
 [exec] Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3412//console
 [exec] 
 [exec] This message is automatically generated.
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Adding comment to Jira.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 
 [exec] Comment added.
 [exec] 3e6f0a10e090b14825c62abdc4a6921f0da6da18 logged out
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Finished build.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 

BUILD FAILED
/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/trunk/build.xml:1605:
 exec returned: 1

Total time: 47 seconds
Build step 'Execute shell' marked build as failure
Archiving artifacts
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Recording test results
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
ERROR: Step ‘Publish JUnit test result report’ failed: No test report files 
were found. Configuration error?
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
[description-setter] Description set: ZOOKEEPER-2310
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7



###
## FAILED TESTS (if any) 
##
No tests ran.

Failed: ZOOKEEPER-2574 PreCommit Build #3411

Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-2574
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3411/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 435798 lines...]
 [exec] +1 tests included.  The patch appears to include 3 new or 
modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
(version 2.0.3) warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.
 [exec] 
 [exec] -1 core tests.  The patch failed core unit tests.
 [exec] 
 [exec] +1 contrib tests.  The patch passed contrib unit tests.
 [exec] 
 [exec] Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3411//testReport/
 [exec] Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3411//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
 [exec] Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3411//console
 [exec] 
 [exec] This message is automatically generated.
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Adding comment to Jira.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 
 [exec] Comment added.
 [exec] ea843d4e4b8a2585fd32c91ec3ba7d73be382c15 logged out
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Finished build.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 

BUILD FAILED
/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/trunk/build.xml:1605:
 exec returned: 1

Total time: 22 minutes 40 seconds
Build step 'Execute shell' marked build as failure
Archiving artifacts
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Compressed 559.61 KB of artifacts by 45.7% relative to #3407
Recording test results
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
[description-setter] Description set: ZOOKEEPER-2574
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7



###
## FAILED TESTS (if any) 
##
1 tests failed.
FAILED:  org.apache.zookeeper.test.QuorumTest.testLeaderShutdown

Error Message:
Timeout occurred. Please note the time in the report does not reflect the time 
until the timeout.

Stack Trace:
junit.framework.AssertionFailedError: Timeout occurred. Please note the time in 
the report does not reflect the time until the timeout.
at java.lang.Thread.run(Thread.java:745)

[jira] [Commented] (ZOOKEEPER-2574) PurgeTxnLog can inadvertently delete required txn log files


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15481246#comment-15481246
 ] 

Hadoop QA commented on ZOOKEEPER-2574:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12827921/ZOOKEEPER-2574.2.patch
  against trunk revision 1759917.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 2.0.3) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3411//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3411//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3411//console

This message is automatically generated.

> PurgeTxnLog can inadvertently delete required txn log files
> ---
>
> Key: ZOOKEEPER-2574
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2574
> Project: ZooKeeper
> Issue Type: Bug
> Components: server
> Affects Versions: 3.4.7, 3.4.8, 3.5.0, 3.5.1, 3.5.2
> Environment: Zookeeper 3.4.8, standalone, and 3-server quorum
> Reporter: Abhishek Rai
> Assignee: Abhishek Rai
> Priority: Blocker
> Fix For: 3.4.10, 3.5.3
>
> Attachments: ZOOKEEPER-2574.2.patch, ZOOKEEPER-2574.patch
>
>
> As part of the fix for ZOOKEEPER-1797, the call to 
> FileTxnSnapLog.getSnapshotLogs() was removed from PurgeTxnLog.java.  As a 
> result, some old-looking but required txn log files can be deleted, resulting 
> in data corruption or loss.
> For example, consider the following:
> 1. Configuration:
> autopurge.snapRetainCount=3
> 2. Following files exist:
> log.100 spans transactions from zxid=100 till zxid=140 (inclusive)
> snapshot.110 - snapshot as of zxid=110
> snapshot.120 - snapshot as of zxid=120
> snapshot.130 - snapshot as of zxid=130
> Above scenario is possible when snapshotting has happened multiple times but 
> without accompanying log rollover, which is possible if the server was 
> running as a learner.
> 3. PurgeTxnLog retains all snapshots but deletes log.100 because its zxid is 
> older than the zxid of the oldest snapshot (110).  This results in loss of 
> transactions in the range 131-140.
> Before the fix for ZOOKEEPER-1797, this was avoided by the call to 
> FileTxnSnapLog.getSnapshotLogs() which finds and retains the newest txn log 
> file with starting zxid < oldest retained snapshot's highest zxid.
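
The retention rule described above can be sketched in Java as follows. This is a minimal illustration, not the code of PurgeTxnLog or FileTxnSnapLog: the class PurgeSketch and the method logsToRetain are hypothetical names, and log files are represented only by their starting zxids. The sketch also assumes that a log starting exactly at the snapshot's zxid counts as covering it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical illustration of the retention rule; not ZooKeeper code. */
final class PurgeSketch {
    private PurgeSketch() {}

    /**
     * Returns the starting zxids of the txn logs that must be kept so that
     * the oldest retained snapshot can still be rolled forward.
     */
    static List<Long> logsToRetain(List<Long> logStartZxids, long oldestRetainedSnapZxid) {
        TreeSet<Long> sorted = new TreeSet<>(logStartZxids);
        List<Long> keep = new ArrayList<>();

        // Newest log that starts at or before the snapshot: it may contain
        // transactions newer than the snapshot, so it must not be purged.
        Long covering = sorted.floor(oldestRetainedSnapZxid);
        if (covering != null) {
            keep.add(covering);
        }

        // Every log that starts after the snapshot is needed as well.
        keep.addAll(sorted.tailSet(oldestRetainedSnapZxid, false));
        return keep;
    }
}
```

With the files from the example (a single log starting at zxid 100 and an oldest retained snapshot at zxid 110), logsToRetain returns [100], so log.100 survives the purge and transactions 131-140 are not lost.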



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (ZOOKEEPER-2574) PurgeTxnLog can inadvertently delete required txn log files


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Rai updated ZOOKEEPER-2574:

Attachment: (was: ZOOKEEPER-2574.2.patch)

> PurgeTxnLog can inadvertently delete required txn log files
> ---
>
> Key: ZOOKEEPER-2574
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2574
> Project: ZooKeeper
> Issue Type: Bug
> Components: server
> Affects Versions: 3.4.7, 3.4.8, 3.5.0, 3.5.1, 3.5.2
> Environment: Zookeeper 3.4.8, standalone, and 3-server quorum
> Reporter: Abhishek Rai
> Assignee: Abhishek Rai
> Priority: Blocker
> Fix For: 3.4.10, 3.5.3
>
> Attachments: ZOOKEEPER-2574.2.patch, ZOOKEEPER-2574.patch
>
>
> As part of the fix for ZOOKEEPER-1797, the call to 
> FileTxnSnapLog.getSnapshotLogs() was removed from PurgeTxnLog.java.  As a 
> result, some old-looking but required txn log files can be deleted, resulting 
> in data corruption or loss.
> For example, consider the following:
> 1. Configuration:
> autopurge.snapRetainCount=3
> 2. Following files exist:
> log.100 spans transactions from zxid=100 till zxid=140 (inclusive)
> snapshot.110 - snapshot as of zxid=110
> snapshot.120 - snapshot as of zxid=120
> snapshot.130 - snapshot as of zxid=130
> Above scenario is possible when snapshotting has happened multiple times but 
> without accompanying log rollover, which is possible if the server was 
> running as a learner.
> 3. PurgeTxnLog retains all snapshots but deletes log.100 because its zxid is 
> older than the zxid of the oldest snapshot (110).  This results in loss of 
> transactions in the range 131-140.
> Before the fix for ZOOKEEPER-1797, this was avoided by the call to 
> FileTxnSnapLog.getSnapshotLogs() which finds and retains the newest txn log 
> file with starting zxid < oldest retained snapshot's highest zxid.




[jira] [Updated] (ZOOKEEPER-2574) PurgeTxnLog can inadvertently delete required txn log files


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Rai updated ZOOKEEPER-2574:

Attachment: ZOOKEEPER-2574.2.patch

> PurgeTxnLog can inadvertently delete required txn log files
> ---
>
> Key: ZOOKEEPER-2574
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2574
> Project: ZooKeeper
> Issue Type: Bug
> Components: server
> Affects Versions: 3.4.7, 3.4.8, 3.5.0, 3.5.1, 3.5.2
> Environment: Zookeeper 3.4.8, standalone, and 3-server quorum
> Reporter: Abhishek Rai
> Assignee: Abhishek Rai
> Priority: Blocker
> Fix For: 3.4.10, 3.5.3
>
> Attachments: ZOOKEEPER-2574.2.patch, ZOOKEEPER-2574.patch

[jira] [Commented] (ZOOKEEPER-2574) PurgeTxnLog can inadvertently delete required txn log files


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15481196#comment-15481196
 ] 

Hadoop QA commented on ZOOKEEPER-2574:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12827920/ZOOKEEPER-2574.2.patch
  against trunk revision 1759917.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

-1 javac.  The patch appears to cause tar ant target to fail.

+1 findbugs.  The patch does not introduce any new Findbugs (version 2.0.3) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3410//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3410//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3410//console

This message is automatically generated.


Failed: ZOOKEEPER-2574 PreCommit Build #3410

Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-2574
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3410/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 627 lines...]
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning messages.
 [exec] 
 [exec] -1 javac.  The patch appears to cause tar ant target to fail.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs (version 2.0.3) warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the total number of release audit warnings.
 [exec] 
 [exec] -1 core tests.  The patch failed core unit tests.
 [exec] 
 [exec] +1 contrib tests.  The patch passed contrib unit tests.
 [exec] 
 [exec] Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3410//testReport/
 [exec] Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3410//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
 [exec] Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3410//console
 [exec] 
 [exec] This message is automatically generated.
 [exec] 
 [exec] 
 [exec] ==
 [exec] ==
 [exec] Adding comment to Jira.
 [exec] ==
 [exec] ==
 [exec] 
 [exec] 
 [exec] Comment added.
 [exec] bccf1b6b4ab28dcab0016e0a55f4ee6df498633b logged out
 [exec] 
 [exec] 
 [exec] ==
 [exec] ==
 [exec] Finished build.
 [exec] ==
 [exec] ==
 [exec] 
 [exec] 

BUILD FAILED
/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/trunk/build.xml:1605:
 exec returned: 2

Total time: 2 minutes 26 seconds
Build step 'Execute shell' marked build as failure
Archiving artifacts
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Compressed 271.51 KB of artifacts by 23.6% relative to #3407
Recording test results
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
ERROR: Step 'Publish JUnit test result report' failed: No test report files 
were found. Configuration error?
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
[description-setter] Description set: ZOOKEEPER-2574
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any



###
## FAILED TESTS (if any) 
##
No tests ran.

[jira] [Updated] (ZOOKEEPER-2574) PurgeTxnLog can inadvertently delete required txn log files


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Rai updated ZOOKEEPER-2574:

Attachment: ZOOKEEPER-2574.2.patch

Uploading a patch for trunk; the previous patch does not work on trunk (it 
works on 3.4.8 and 3.5.2).


[jira] [Commented] (ZOOKEEPER-2574) PurgeTxnLog can inadvertently delete required txn log files


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15481163#comment-15481163
 ] 

Hadoop QA commented on ZOOKEEPER-2574:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12827918/ZOOKEEPER-2574.patch
  against trunk revision 1759917.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

-1 javac.  The patch appears to cause tar ant target to fail.

+1 findbugs.  The patch does not introduce any new Findbugs (version 2.0.3) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3409//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3409//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3409//console

This message is automatically generated.


Failed: ZOOKEEPER-2574 PreCommit Build #3409

Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-2574
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3409/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 628 lines...]
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning messages.
 [exec] 
 [exec] -1 javac.  The patch appears to cause tar ant target to fail.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs (version 2.0.3) warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the total number of release audit warnings.
 [exec] 
 [exec] -1 core tests.  The patch failed core unit tests.
 [exec] 
 [exec] +1 contrib tests.  The patch passed contrib unit tests.
 [exec] 
 [exec] Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3409//testReport/
 [exec] Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3409//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
 [exec] Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3409//console
 [exec] 
 [exec] This message is automatically generated.
 [exec] 
 [exec] 
 [exec] ==
 [exec] ==
 [exec] Adding comment to Jira.
 [exec] ==
 [exec] ==
 [exec] 
 [exec] 
 [exec] Comment added.
 [exec] 1b905055e910544b6973373c2fcb4310a3fbb21b logged out
 [exec] 
 [exec] 
 [exec] ==
 [exec] ==
 [exec] Finished build.
 [exec] ==
 [exec] ==
 [exec] 
 [exec] 

BUILD FAILED
/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/trunk/build.xml:1605:
 exec returned: 2

Total time: 2 minutes 26 seconds
Build step 'Execute shell' marked build as failure
Archiving artifacts
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Compressed 271.51 KB of artifacts by 23.6% relative to #3407
Recording test results
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
ERROR: Step 'Publish JUnit test result report' failed: No test report files 
were found. Configuration error?
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
[description-setter] Description set: ZOOKEEPER-2574
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any



###
## FAILED TESTS (if any) 
##
No tests ran.

[jira] [Updated] (ZOOKEEPER-2574) PurgeTxnLog can inadvertently delete required txn log files


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Rai updated ZOOKEEPER-2574:

Attachment: (was: ZOOKEEPER-2574.patch)


[jira] [Updated] (ZOOKEEPER-2574) PurgeTxnLog can inadvertently delete required txn log files


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Rai updated ZOOKEEPER-2574:

Attachment: ZOOKEEPER-2574.patch


[jira] [Commented] (ZOOKEEPER-2574) PurgeTxnLog can inadvertently delete required txn log files