[jira] [Commented] (ZOOKEEPER-2926) Data inconsistency issue due to the flaw in the session management

2018-08-08 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16572791#comment-16572791
 ] 

Hudson commented on ZOOKEEPER-2926:
---

SUCCESS: Integrated in Jenkins build ZooKeeper-trunk #143 (See 
[https://builds.apache.org/job/ZooKeeper-trunk/143/])
ZOOKEEPER-2926: Fix potential data consistency issue due to the session (hanm: 
rev cd209456b67cde5aba771b1a240ebe5607398459)
* (edit) src/java/test/org/apache/zookeeper/test/ClientBase.java
* (edit) src/java/main/org/apache/zookeeper/server/SessionTrackerImpl.java
* (edit) src/java/test/org/apache/zookeeper/server/MockServerCnxn.java
* (edit) src/java/main/org/apache/zookeeper/server/SessionTracker.java
* (edit) 
src/java/main/org/apache/zookeeper/server/quorum/LearnerSessionTracker.java
* (edit) src/java/test/org/apache/zookeeper/server/PrepRequestProcessorTest.java
* (add) 
src/java/test/org/apache/zookeeper/server/quorum/SessionUpgradeQuorumTest.java
* (edit) src/java/test/org/apache/zookeeper/test/QuorumBase.java
* (edit) src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java
* (edit) src/java/test/org/apache/zookeeper/server/SessionTrackerTest.java
* (edit) src/java/main/org/apache/zookeeper/server/PrepRequestProcessor.java
* (edit) src/java/test/org/apache/zookeeper/test/SessionTrackerCheckTest.java
* (edit) 
src/java/main/org/apache/zookeeper/server/quorum/LocalSessionTracker.java
* (edit) src/java/main/org/apache/zookeeper/server/ServerCnxn.java
* (edit) 
src/java/main/org/apache/zookeeper/server/quorum/LeaderSessionTracker.java
* (edit) 
src/java/main/org/apache/zookeeper/server/quorum/UpgradeableSessionTracker.java


> Data inconsistency issue due to the flaw in the session management
> --
>
> Key: ZOOKEEPER-2926
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2926
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3, 3.6.0
>Reporter: Fangmin Lv
>Assignee: Fangmin Lv
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 3.6.0
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> The local session upgrading feature will upgrade the session locally before 
> receving a quorum commit of creating global session. It's possible that the 
> server shutdown before the creating session request being sent to leader, if 
> we retained the ZKDatabase or there is Snapshot happened just before 
> shutdown, then only this server will have the global session. 
> If that server didn't become leader, then it will have more global sessions 
> than others, and those global sessions won't expire as the leader doesn't 
> know it's existence. If the server became leader, it will accept the client 
> renew session request and the client is allowed to create ephemeral nodes, 
> which means other servers only have ephemeral nodes but not that global 
> session. If there is follower going to have SNAP sync with it, then it will 
> also have the global session. If the server without that global session 
> becomes new leader, it will check and delete those dangling ephemeral node 
> before serving traffic. These could introduce the issues that the ephemeral 
> node being exist on some servers but not others. 
> There is dangling global session issue even without local session feature, 
> because on leader it will update the ZKDatabase when processing 
> ConnectionRequest and in the PrepRequestProcessor before it's quorum 
> committed, which also has this risk.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ZOOKEEPER-2926) Data inconsistency issue due to the flaw in the session management

2018-07-20 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16550299#comment-16550299
 ] 

Hadoop QA commented on ZOOKEEPER-2926:
--

+1 overall.  GitHub Pull Request  Build
  

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 20 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 3.0.1) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/1974//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/1974//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/1974//console

This message is automatically generated.

> Data inconsistency issue due to the flaw in the session management
> --
>
> Key: ZOOKEEPER-2926
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2926
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3, 3.6.0
>Reporter: Fangmin Lv
>Assignee: Fangmin Lv
>Priority: Critical
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The local session upgrading feature will upgrade the session locally before 
> receving a quorum commit of creating global session. It's possible that the 
> server shutdown before the creating session request being sent to leader, if 
> we retained the ZKDatabase or there is Snapshot happened just before 
> shutdown, then only this server will have the global session. 
> If that server didn't become leader, then it will have more global sessions 
> than others, and those global sessions won't expire as the leader doesn't 
> know it's existence. If the server became leader, it will accept the client 
> renew session request and the client is allowed to create ephemeral nodes, 
> which means other servers only have ephemeral nodes but not that global 
> session. If there is follower going to have SNAP sync with it, then it will 
> also have the global session. If the server without that global session 
> becomes new leader, it will check and delete those dangling ephemeral node 
> before serving traffic. These could introduce the issues that the ephemeral 
> node being exist on some servers but not others. 
> There is dangling global session issue even without local session feature, 
> because on leader it will update the ZKDatabase when processing 
> ConnectionRequest and in the PrepRequestProcessor before it's quorum 
> committed, which also has this risk.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ZOOKEEPER-2926) Data inconsistency issue due to the flaw in the session management

2018-01-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16327601#comment-16327601
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2926:
---

Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/447
  
@anmolnar I also find it's hard to track when using overwriting the commit, 
I'll use stack instead.

For the tests, I need to inject some failure in the Quorum being set up, 
that's why I need the customized way I'm doing in my code, I think I prefer to 
create a separate test class.


> Data inconsistency issue due to the flaw in the session management
> --
>
> Key: ZOOKEEPER-2926
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2926
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3, 3.6.0
>Reporter: Fangmin Lv
>Assignee: Fangmin Lv
>Priority: Critical
>
> The local session upgrading feature will upgrade the session locally before 
> receving a quorum commit of creating global session. It's possible that the 
> server shutdown before the creating session request being sent to leader, if 
> we retained the ZKDatabase or there is Snapshot happened just before 
> shutdown, then only this server will have the global session. 
> If that server didn't become leader, then it will have more global sessions 
> than others, and those global sessions won't expire as the leader doesn't 
> know it's existence. If the server became leader, it will accept the client 
> renew session request and the client is allowed to create ephemeral nodes, 
> which means other servers only have ephemeral nodes but not that global 
> session. If there is follower going to have SNAP sync with it, then it will 
> also have the global session. If the server without that global session 
> becomes new leader, it will check and delete those dangling ephemeral node 
> before serving traffic. These could introduce the issues that the ephemeral 
> node being exist on some servers but not others. 
> There is dangling global session issue even without local session feature, 
> because on leader it will update the ZKDatabase when processing 
> ConnectionRequest and in the PrepRequestProcessor before it's quorum 
> committed, which also has this risk.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ZOOKEEPER-2926) Data inconsistency issue due to the flaw in the session management

2018-01-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16327596#comment-16327596
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2926:
---

Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/447#discussion_r161855408
  
--- Diff: 
src/java/test/org/apache/zookeeper/test/SessionTrackerCheckTest.java ---
@@ -185,8 +182,8 @@ public void testLeaderSessionTracker() throws Exception 
{
 TICK_TIME, expirer.sid, false, testZKSListener());
 
 // Global session
-sessionId = 0xdeadbeef;
-tracker.addSession(sessionId, CONNECTION_TIMEOUT);
+sessionId = 0xdeadbef0;
--- End diff --

I think this is not changed by intention, might happen when merge patch, 
I'll get this fixed.


> Data inconsistency issue due to the flaw in the session management
> --
>
> Key: ZOOKEEPER-2926
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2926
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3, 3.6.0
>Reporter: Fangmin Lv
>Assignee: Fangmin Lv
>Priority: Critical
>
> The local session upgrading feature will upgrade the session locally before 
> receving a quorum commit of creating global session. It's possible that the 
> server shutdown before the creating session request being sent to leader, if 
> we retained the ZKDatabase or there is Snapshot happened just before 
> shutdown, then only this server will have the global session. 
> If that server didn't become leader, then it will have more global sessions 
> than others, and those global sessions won't expire as the leader doesn't 
> know it's existence. If the server became leader, it will accept the client 
> renew session request and the client is allowed to create ephemeral nodes, 
> which means other servers only have ephemeral nodes but not that global 
> session. If there is follower going to have SNAP sync with it, then it will 
> also have the global session. If the server without that global session 
> becomes new leader, it will check and delete those dangling ephemeral node 
> before serving traffic. These could introduce the issues that the ephemeral 
> node being exist on some servers but not others. 
> There is dangling global session issue even without local session feature, 
> because on leader it will update the ZKDatabase when processing 
> ConnectionRequest and in the PrepRequestProcessor before it's quorum 
> committed, which also has this risk.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ZOOKEEPER-2926) Data inconsistency issue due to the flaw in the session management

2018-01-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16325706#comment-16325706
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2926:
---

Github user anmolnar commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/447#discussion_r161405611
  
--- Diff: 
src/java/test/org/apache/zookeeper/test/SessionTrackerCheckTest.java ---
@@ -185,8 +182,8 @@ public void testLeaderSessionTracker() throws Exception 
{
 TICK_TIME, expirer.sid, false, testZKSListener());
 
 // Global session
-sessionId = 0xdeadbeef;
-tracker.addSession(sessionId, CONNECTION_TIMEOUT);
+sessionId = 0xdeadbef0;
--- End diff --

Why have you changed this? :)


> Data inconsistency issue due to the flaw in the session management
> --
>
> Key: ZOOKEEPER-2926
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2926
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3, 3.6.0
>Reporter: Fangmin Lv
>Assignee: Fangmin Lv
>Priority: Critical
>
> The local session upgrading feature will upgrade the session locally before 
> receving a quorum commit of creating global session. It's possible that the 
> server shutdown before the creating session request being sent to leader, if 
> we retained the ZKDatabase or there is Snapshot happened just before 
> shutdown, then only this server will have the global session. 
> If that server didn't become leader, then it will have more global sessions 
> than others, and those global sessions won't expire as the leader doesn't 
> know it's existence. If the server became leader, it will accept the client 
> renew session request and the client is allowed to create ephemeral nodes, 
> which means other servers only have ephemeral nodes but not that global 
> session. If there is follower going to have SNAP sync with it, then it will 
> also have the global session. If the server without that global session 
> becomes new leader, it will check and delete those dangling ephemeral node 
> before serving traffic. These could introduce the issues that the ephemeral 
> node being exist on some servers but not others. 
> There is dangling global session issue even without local session feature, 
> because on leader it will update the ZKDatabase when processing 
> ConnectionRequest and in the PrepRequestProcessor before it's quorum 
> committed, which also has this risk.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ZOOKEEPER-2926) Data inconsistency issue due to the flaw in the session management

2018-01-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16325703#comment-16325703
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2926:
---

Github user anmolnar commented on the issue:

https://github.com/apache/zookeeper/pull/447
  
@lvfangmin Got it. Recently we talked about why multiple commits are better 
for review over squashing. Now I see why.


> Data inconsistency issue due to the flaw in the session management
> --
>
> Key: ZOOKEEPER-2926
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2926
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3, 3.6.0
>Reporter: Fangmin Lv
>Assignee: Fangmin Lv
>Priority: Critical
>
> The local session upgrading feature will upgrade the session locally before 
> receving a quorum commit of creating global session. It's possible that the 
> server shutdown before the creating session request being sent to leader, if 
> we retained the ZKDatabase or there is Snapshot happened just before 
> shutdown, then only this server will have the global session. 
> If that server didn't become leader, then it will have more global sessions 
> than others, and those global sessions won't expire as the leader doesn't 
> know it's existence. If the server became leader, it will accept the client 
> renew session request and the client is allowed to create ephemeral nodes, 
> which means other servers only have ephemeral nodes but not that global 
> session. If there is follower going to have SNAP sync with it, then it will 
> also have the global session. If the server without that global session 
> becomes new leader, it will check and delete those dangling ephemeral node 
> before serving traffic. These could introduce the issues that the ephemeral 
> node being exist on some servers but not others. 
> There is dangling global session issue even without local session feature, 
> because on leader it will update the ZKDatabase when processing 
> ConnectionRequest and in the PrepRequestProcessor before it's quorum 
> committed, which also has this risk.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ZOOKEEPER-2926) Data inconsistency issue due to the flaw in the session management

2018-01-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16324707#comment-16324707
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2926:
---

Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/447
  
@anmolnar I changed the code based on your comments and did amend, is this 
the right process?


> Data inconsistency issue due to the flaw in the session management
> --
>
> Key: ZOOKEEPER-2926
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2926
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3, 3.6.0
>Reporter: Fangmin Lv
>Assignee: Fangmin Lv
>Priority: Critical
>
> The local session upgrading feature will upgrade the session locally before 
> receving a quorum commit of creating global session. It's possible that the 
> server shutdown before the creating session request being sent to leader, if 
> we retained the ZKDatabase or there is Snapshot happened just before 
> shutdown, then only this server will have the global session. 
> If that server didn't become leader, then it will have more global sessions 
> than others, and those global sessions won't expire as the leader doesn't 
> know it's existence. If the server became leader, it will accept the client 
> renew session request and the client is allowed to create ephemeral nodes, 
> which means other servers only have ephemeral nodes but not that global 
> session. If there is follower going to have SNAP sync with it, then it will 
> also have the global session. If the server without that global session 
> becomes new leader, it will check and delete those dangling ephemeral node 
> before serving traffic. These could introduce the issues that the ephemeral 
> node being exist on some servers but not others. 
> There is dangling global session issue even without local session feature, 
> because on leader it will update the ZKDatabase when processing 
> ConnectionRequest and in the PrepRequestProcessor before it's quorum 
> committed, which also has this risk.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ZOOKEEPER-2926) Data inconsistency issue due to the flaw in the session management

2018-01-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16324503#comment-16324503
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2926:
---

Github user anmolnar commented on the issue:

https://github.com/apache/zookeeper/pull/447
  
@lvfangmin Have you amended your commit?
I see a lot of changes since I last reviewed this PR.


> Data inconsistency issue due to the flaw in the session management
> --
>
> Key: ZOOKEEPER-2926
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2926
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3, 3.6.0
>Reporter: Fangmin Lv
>Assignee: Fangmin Lv
>Priority: Critical
>
> The local session upgrading feature will upgrade the session locally before 
> receving a quorum commit of creating global session. It's possible that the 
> server shutdown before the creating session request being sent to leader, if 
> we retained the ZKDatabase or there is Snapshot happened just before 
> shutdown, then only this server will have the global session. 
> If that server didn't become leader, then it will have more global sessions 
> than others, and those global sessions won't expire as the leader doesn't 
> know it's existence. If the server became leader, it will accept the client 
> renew session request and the client is allowed to create ephemeral nodes, 
> which means other servers only have ephemeral nodes but not that global 
> session. If there is follower going to have SNAP sync with it, then it will 
> also have the global session. If the server without that global session 
> becomes new leader, it will check and delete those dangling ephemeral node 
> before serving traffic. These could introduce the issues that the ephemeral 
> node being exist on some servers but not others. 
> There is dangling global session issue even without local session feature, 
> because on leader it will update the ZKDatabase when processing 
> ConnectionRequest and in the PrepRequestProcessor before it's quorum 
> committed, which also has this risk.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ZOOKEEPER-2926) Data inconsistency issue due to the flaw in the session management

2018-01-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16324478#comment-16324478
 ] 

Hadoop QA commented on ZOOKEEPER-2926:
--

-1 overall.  GitHub Pull Request  Build
  

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 26 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 3.0.1) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/1411//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/1411//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/1411//console

This message is automatically generated.

> Data inconsistency issue due to the flaw in the session management
> --
>
> Key: ZOOKEEPER-2926
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2926
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3, 3.6.0
>Reporter: Fangmin Lv
>Assignee: Fangmin Lv
>Priority: Critical
>
> The local session upgrading feature will upgrade the session locally before 
> receving a quorum commit of creating global session. It's possible that the 
> server shutdown before the creating session request being sent to leader, if 
> we retained the ZKDatabase or there is Snapshot happened just before 
> shutdown, then only this server will have the global session. 
> If that server didn't become leader, then it will have more global sessions 
> than others, and those global sessions won't expire as the leader doesn't 
> know it's existence. If the server became leader, it will accept the client 
> renew session request and the client is allowed to create ephemeral nodes, 
> which means other servers only have ephemeral nodes but not that global 
> session. If there is follower going to have SNAP sync with it, then it will 
> also have the global session. If the server without that global session 
> becomes new leader, it will check and delete those dangling ephemeral node 
> before serving traffic. These could introduce the issues that the ephemeral 
> node being exist on some servers but not others. 
> There is dangling global session issue even without local session feature, 
> because on leader it will update the ZKDatabase when processing 
> ConnectionRequest and in the PrepRequestProcessor before it's quorum 
> committed, which also has this risk.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ZOOKEEPER-2926) Data inconsistency issue due to the flaw in the session management

2018-01-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16324396#comment-16324396
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2926:
---

Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/447#discussion_r161300774
  
--- Diff: src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java ---
@@ -1187,13 +1187,8 @@ private ProcessTxnResult processTxn(Request request, 
TxnHeader hdr,
 if (opCode == OpCode.createSession) {
 if (hdr != null && txn instanceof CreateSessionTxn) {
 CreateSessionTxn cst = (CreateSessionTxn) txn;
-sessionTracker.addGlobalSession(sessionId, 
cst.getTimeOut());
-} else if (request != null && request.isLocalSession()) {
-request.request.rewind();
-int timeout = request.request.getInt();
-request.request.rewind();
-sessionTracker.addSession(request.sessionId, timeout);
-} else {
+sessionTracker.commitSession(sessionId, cst.getTimeOut());
+} else if (request == null || !request.isLocalSession()) {
--- End diff --

Local session creation is happened in ZooKeeperServer when the when 
processing connection request, PrepRequestProcessor only deals with the global 
sessions.

Local session creation will also send a createSession op into the processor 
pipeline, so if we remove the else check, a lot of warning log will show for 
local session createSession op, that's not what we wanted.


> Data inconsistency issue due to the flaw in the session management
> --
>
> Key: ZOOKEEPER-2926
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2926
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3, 3.6.0
>Reporter: Fangmin Lv
>Assignee: Fangmin Lv
>Priority: Critical
>
> The local session upgrading feature will upgrade the session locally before 
> receving a quorum commit of creating global session. It's possible that the 
> server shutdown before the creating session request being sent to leader, if 
> we retained the ZKDatabase or there is Snapshot happened just before 
> shutdown, then only this server will have the global session. 
> If that server didn't become leader, then it will have more global sessions 
> than others, and those global sessions won't expire as the leader doesn't 
> know it's existence. If the server became leader, it will accept the client 
> renew session request and the client is allowed to create ephemeral nodes, 
> which means other servers only have ephemeral nodes but not that global 
> session. If there is follower going to have SNAP sync with it, then it will 
> also have the global session. If the server without that global session 
> becomes new leader, it will check and delete those dangling ephemeral node 
> before serving traffic. These could introduce the issues that the ephemeral 
> node being exist on some servers but not others. 
> There is dangling global session issue even without local session feature, 
> because on leader it will update the ZKDatabase when processing 
> ConnectionRequest and in the PrepRequestProcessor before it's quorum 
> committed, which also has this risk.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ZOOKEEPER-2926) Data inconsistency issue due to the flaw in the session management

2018-01-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16324397#comment-16324397
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2926:
---

Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/447#discussion_r161296695
  
--- Diff: src/java/main/org/apache/zookeeper/server/SessionTrackerImpl.java 
---
@@ -280,6 +278,12 @@ public synchronized boolean addSession(long id, int 
sessionTimeout) {
 return added;
 }
 
+public synchronized boolean commitSession(long id, int sessionTimeout) 
{
--- End diff --

Yes, that's kind of commit changes to ZkDB, like the usual txns commit to 
DataTree in ZkDB.


> Data inconsistency issue due to the flaw in the session management
> --
>
> Key: ZOOKEEPER-2926
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2926
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3, 3.6.0
>Reporter: Fangmin Lv
>Assignee: Fangmin Lv
>Priority: Critical
>
> The local session upgrading feature will upgrade the session locally before 
> receving a quorum commit of creating global session. It's possible that the 
> server shutdown before the creating session request being sent to leader, if 
> we retained the ZKDatabase or there is Snapshot happened just before 
> shutdown, then only this server will have the global session. 
> If that server didn't become leader, then it will have more global sessions 
> than others, and those global sessions won't expire as the leader doesn't 
> know it's existence. If the server became leader, it will accept the client 
> renew session request and the client is allowed to create ephemeral nodes, 
> which means other servers only have ephemeral nodes but not that global 
> session. If there is follower going to have SNAP sync with it, then it will 
> also have the global session. If the server without that global session 
> becomes new leader, it will check and delete those dangling ephemeral node 
> before serving traffic. These could introduce the issues that the ephemeral 
> node being exist on some servers but not others. 
> There is dangling global session issue even without local session feature, 
> because on leader it will update the ZKDatabase when processing 
> ConnectionRequest and in the PrepRequestProcessor before it's quorum 
> committed, which also has this risk.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ZOOKEEPER-2926) Data inconsistency issue due to the flaw in the session management

2018-01-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16324395#comment-16324395
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2926:
---

Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/447#discussion_r161296493
  
--- Diff: 
src/java/main/org/apache/zookeeper/server/PrepRequestProcessor.java ---
@@ -579,13 +579,8 @@ protected void pRequest2Txn(int type, long zxid, 
Request request,
 int to = request.request.getInt();
 request.setTxn(new CreateSessionTxn(to));
 request.request.rewind();
-if (request.isLocalSession()) {
-// This will add to local session tracker if it is 
enabled
-zks.sessionTracker.addSession(request.sessionId, to);
-} else {
-// Explicitly add to global session if the flag is not 
set
-zks.sessionTracker.addGlobalSession(request.sessionId, 
to);
-}
+// only add the global session tracker but not to ZKDb
+zks.sessionTracker.addGlobalSession(request.sessionId, to);
--- End diff --

On LeaderSessionTracker, we need to differentiate add local session or 
global session, but since we only call createSession when add local session, I 
think I can simplify the interface more by removing addGlobalSession and rename 
addSession to trackSession.


> Data inconsistency issue due to the flaw in the session management
> --
>
> Key: ZOOKEEPER-2926
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2926
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3, 3.6.0
>Reporter: Fangmin Lv
>Assignee: Fangmin Lv
>Priority: Critical
>
> The local session upgrading feature will upgrade the session locally before 
> receving a quorum commit of creating global session. It's possible that the 
> server shutdown before the creating session request being sent to leader, if 
> we retained the ZKDatabase or there is Snapshot happened just before 
> shutdown, then only this server will have the global session. 
> If that server didn't become leader, then it will have more global sessions 
> than others, and those global sessions won't expire as the leader doesn't 
> know it's existence. If the server became leader, it will accept the client 
> renew session request and the client is allowed to create ephemeral nodes, 
> which means other servers only have ephemeral nodes but not that global 
> session. If there is follower going to have SNAP sync with it, then it will 
> also have the global session. If the server without that global session 
> becomes new leader, it will check and delete those dangling ephemeral node 
> before serving traffic. These could introduce the issues that the ephemeral 
> node being exist on some servers but not others. 
> There is dangling global session issue even without local session feature, 
> because on leader it will update the ZKDatabase when processing 
> ConnectionRequest and in the PrepRequestProcessor before it's quorum 
> committed, which also has this risk.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ZOOKEEPER-2926) Data inconsistency issue due to the flaw in the session management

2018-01-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16322341#comment-16322341
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2926:
---

Github user anmolnar commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/447#discussion_r160970304
  
--- Diff: 
src/java/main/org/apache/zookeeper/server/PrepRequestProcessor.java ---
@@ -579,13 +579,8 @@ protected void pRequest2Txn(int type, long zxid, 
Request request,
 int to = request.request.getInt();
 request.setTxn(new CreateSessionTxn(to));
 request.request.rewind();
-if (request.isLocalSession()) {
-// This will add to local session tracker if it is 
enabled
-zks.sessionTracker.addSession(request.sessionId, to);
-} else {
-// Explicitly add to global session if the flag is not 
set
-zks.sessionTracker.addGlobalSession(request.sessionId, 
to);
-}
+// only add the global session tracker but not to ZKDb
+zks.sessionTracker.addGlobalSession(request.sessionId, to);
--- End diff --

You could remove this check safely, because effectively there was no 
difference between `addSession()` and `addGlobalSession()` calls.


> Data inconsistency issue due to the flaw in the session management
> --
>
> Key: ZOOKEEPER-2926
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2926
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3, 3.6.0
>Reporter: Fangmin Lv
>Assignee: Fangmin Lv
>Priority: Critical
>
> The local session upgrading feature will upgrade the session locally before 
> receving a quorum commit of creating global session. It's possible that the 
> server shutdown before the creating session request being sent to leader, if 
> we retained the ZKDatabase or there is Snapshot happened just before 
> shutdown, then only this server will have the global session. 
> If that server didn't become leader, then it will have more global sessions 
> than others, and those global sessions won't expire as the leader doesn't 
> know it's existence. If the server became leader, it will accept the client 
> renew session request and the client is allowed to create ephemeral nodes, 
> which means other servers only have ephemeral nodes but not that global 
> session. If there is follower going to have SNAP sync with it, then it will 
> also have the global session. If the server without that global session 
> becomes new leader, it will check and delete those dangling ephemeral node 
> before serving traffic. These could introduce the issues that the ephemeral 
> node being exist on some servers but not others. 
> There is dangling global session issue even without local session feature, 
> because on leader it will update the ZKDatabase when processing 
> ConnectionRequest and in the PrepRequestProcessor before it's quorum 
> committed, which also has this risk.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ZOOKEEPER-2926) Data inconsistency issue due to the flaw in the session management

2018-01-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16322342#comment-16322342
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2926:
---

Github user anmolnar commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/447#discussion_r160971106
  
--- Diff: src/java/main/org/apache/zookeeper/server/SessionTrackerImpl.java 
---
@@ -280,6 +278,12 @@ public synchronized boolean addSession(long id, int 
sessionTimeout) {
 return added;
 }
 
+public synchronized boolean commitSession(long id, int sessionTimeout) 
{
--- End diff --

`sessionsWithTimeout` is a Map within ZKdb, so adding sessions to it is 
equivalent to persisting them.

I think you could also remove the `addGlobalSession()` method completely, 
because it doesn't do anything special than forwarding the call to 
`addSession()`.


> Data inconsistency issue due to the flaw in the session management
> --
>
> Key: ZOOKEEPER-2926
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2926
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3, 3.6.0
>Reporter: Fangmin Lv
>Assignee: Fangmin Lv
>Priority: Critical
>
> The local session upgrading feature will upgrade the session locally before 
> receving a quorum commit of creating global session. It's possible that the 
> server shutdown before the creating session request being sent to leader, if 
> we retained the ZKDatabase or there is Snapshot happened just before 
> shutdown, then only this server will have the global session. 
> If that server didn't become leader, then it will have more global sessions 
> than others, and those global sessions won't expire as the leader doesn't 
> know it's existence. If the server became leader, it will accept the client 
> renew session request and the client is allowed to create ephemeral nodes, 
> which means other servers only have ephemeral nodes but not that global 
> session. If there is follower going to have SNAP sync with it, then it will 
> also have the global session. If the server without that global session 
> becomes new leader, it will check and delete those dangling ephemeral node 
> before serving traffic. These could introduce the issues that the ephemeral 
> node being exist on some servers but not others. 
> There is dangling global session issue even without local session feature, 
> because on leader it will update the ZKDatabase when processing 
> ConnectionRequest and in the PrepRequestProcessor before it's quorum 
> committed, which also has this risk.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ZOOKEEPER-2926) Data inconsistency issue due to the flaw in the session management

2018-01-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16322340#comment-16322340
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2926:
---

Github user anmolnar commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/447#discussion_r160972325
  
--- Diff: src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java ---
@@ -1187,13 +1187,8 @@ private ProcessTxnResult processTxn(Request request, 
TxnHeader hdr,
 if (opCode == OpCode.createSession) {
 if (hdr != null && txn instanceof CreateSessionTxn) {
 CreateSessionTxn cst = (CreateSessionTxn) txn;
-sessionTracker.addGlobalSession(sessionId, 
cst.getTimeOut());
-} else if (request != null && request.isLocalSession()) {
-request.request.rewind();
-int timeout = request.request.getInt();
-request.request.rewind();
-sessionTracker.addSession(request.sessionId, timeout);
-} else {
+sessionTracker.commitSession(sessionId, cst.getTimeOut());
+} else if (request == null || !request.isLocalSession()) {
--- End diff --

Do you need this check here?
Basically you removed the part of requesting localSession creation, because 
it has already happened in PrepRequestProcessor and it's safe to leave the 
default 'else' handler as it was.


> Data inconsistency issue due to the flaw in the session management
> --
>
> Key: ZOOKEEPER-2926
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2926
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3, 3.6.0
>Reporter: Fangmin Lv
>Assignee: Fangmin Lv
>Priority: Critical
>
> The local session upgrading feature will upgrade the session locally before 
> receving a quorum commit of creating global session. It's possible that the 
> server shutdown before the creating session request being sent to leader, if 
> we retained the ZKDatabase or there is Snapshot happened just before 
> shutdown, then only this server will have the global session. 
> If that server didn't become leader, then it will have more global sessions 
> than others, and those global sessions won't expire as the leader doesn't 
> know it's existence. If the server became leader, it will accept the client 
> renew session request and the client is allowed to create ephemeral nodes, 
> which means other servers only have ephemeral nodes but not that global 
> session. If there is follower going to have SNAP sync with it, then it will 
> also have the global session. If the server without that global session 
> becomes new leader, it will check and delete those dangling ephemeral node 
> before serving traffic. These could introduce the issues that the ephemeral 
> node being exist on some servers but not others. 
> There is dangling global session issue even without local session feature, 
> because on leader it will update the ZKDatabase when processing 
> ConnectionRequest and in the PrepRequestProcessor before it's quorum 
> committed, which also has this risk.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ZOOKEEPER-2926) Data inconsistency issue due to the flaw in the session management

2018-01-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16322343#comment-16322343
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2926:
---

Github user anmolnar commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/447#discussion_r160976160
  
--- Diff: 
src/java/main/org/apache/zookeeper/server/quorum/LearnerSessionTracker.java ---
@@ -102,32 +101,42 @@ public boolean isGlobalSession(long sessionId) {
 }
 
 public boolean addGlobalSession(long sessionId, int sessionTimeout) {
+// no global session tracker, do nothing
+return false;
--- End diff --

Is it possible to throw UnsupportedOperationException instead?


> Data inconsistency issue due to the flaw in the session management
> --
>
> Key: ZOOKEEPER-2926
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2926
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3, 3.6.0
>Reporter: Fangmin Lv
>Assignee: Fangmin Lv
>Priority: Critical
>
> The local session upgrading feature will upgrade the session locally before 
> receving a quorum commit of creating global session. It's possible that the 
> server shutdown before the creating session request being sent to leader, if 
> we retained the ZKDatabase or there is Snapshot happened just before 
> shutdown, then only this server will have the global session. 
> If that server didn't become leader, then it will have more global sessions 
> than others, and those global sessions won't expire as the leader doesn't 
> know it's existence. If the server became leader, it will accept the client 
> renew session request and the client is allowed to create ephemeral nodes, 
> which means other servers only have ephemeral nodes but not that global 
> session. If there is follower going to have SNAP sync with it, then it will 
> also have the global session. If the server without that global session 
> becomes new leader, it will check and delete those dangling ephemeral node 
> before serving traffic. These could introduce the issues that the ephemeral 
> node being exist on some servers but not others. 
> There is dangling global session issue even without local session feature, 
> because on leader it will update the ZKDatabase when processing 
> ConnectionRequest and in the PrepRequestProcessor before it's quorum 
> committed, which also has this risk.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ZOOKEEPER-2926) Data inconsistency issue due to the flaw in the session management

2018-01-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16319101#comment-16319101
 ] 

Hadoop QA commented on ZOOKEEPER-2926:
--

-1 overall.  GitHub Pull Request  Build
  

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 26 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 3.0.1) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/1403//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/1403//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/1403//console

This message is automatically generated.

> Data inconsistency issue due to the flaw in the session management
> --
>
> Key: ZOOKEEPER-2926
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2926
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3, 3.6.0
>Reporter: Fangmin Lv
>Priority: Critical
>
> The local session upgrading feature will upgrade the session locally before 
> receving a quorum commit of creating global session. It's possible that the 
> server shutdown before the creating session request being sent to leader, if 
> we retained the ZKDatabase or there is Snapshot happened just before 
> shutdown, then only this server will have the global session. 
> If that server didn't become leader, then it will have more global sessions 
> than others, and those global sessions won't expire as the leader doesn't 
> know it's existence. If the server became leader, it will accept the client 
> renew session request and the client is allowed to create ephemeral nodes, 
> which means other servers only have ephemeral nodes but not that global 
> session. If there is follower going to have SNAP sync with it, then it will 
> also have the global session. If the server without that global session 
> becomes new leader, it will check and delete those dangling ephemeral node 
> before serving traffic. These could introduce the issues that the ephemeral 
> node being exist on some servers but not others. 
> There is dangling global session issue even without local session feature, 
> because on leader it will update the ZKDatabase when processing 
> ConnectionRequest and in the PrepRequestProcessor before it's quorum 
> committed, which also has this risk.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ZOOKEEPER-2926) Data inconsistency issue due to the flaw in the session management

2018-01-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16319086#comment-16319086
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2926:
---

GitHub user lvfangmin opened a pull request:

https://github.com/apache/zookeeper/pull/447

[ZOOKEEPER-2926] Fix potential data consistency issue due to the session 
management bug



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lvfangmin/zookeeper ZOOKEEPER-2926

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/zookeeper/pull/447.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #447


commit 04ec15841c5d164f1212eea9cdfd02e10aa4478e
Author: Fangmin Lyu 
Date:   2018-01-09T06:58:32Z

fix potential data consistency issue due to global session added too early




> Data inconsistency issue due to the flaw in the session management
> --
>
> Key: ZOOKEEPER-2926
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2926
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3, 3.6.0
>Reporter: Fangmin Lv
>Priority: Critical
>
> The local session upgrading feature will upgrade the session locally before 
> receving a quorum commit of creating global session. It's possible that the 
> server shutdown before the creating session request being sent to leader, if 
> we retained the ZKDatabase or there is Snapshot happened just before 
> shutdown, then only this server will have the global session. 
> If that server didn't become leader, then it will have more global sessions 
> than others, and those global sessions won't expire as the leader doesn't 
> know it's existence. If the server became leader, it will accept the client 
> renew session request and the client is allowed to create ephemeral nodes, 
> which means other servers only have ephemeral nodes but not that global 
> session. If there is follower going to have SNAP sync with it, then it will 
> also have the global session. If the server without that global session 
> becomes new leader, it will check and delete those dangling ephemeral node 
> before serving traffic. These could introduce the issues that the ephemeral 
> node being exist on some servers but not others. 
> There is dangling global session issue even without local session feature, 
> because on leader it will update the ZKDatabase when processing 
> ConnectionRequest and in the PrepRequestProcessor before it's quorum 
> committed, which also has this risk.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)