[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

krystal he updated ZOOKEEPER-4681:
----------------------------------
    Description: 
Using a [tool|https://github.com/kry4tall/CC-ZOO358] that I modified from 
[Filip Niksic's zootester|https://github.com/fniksic/zootester] for testing 
ZooKeeper, I discovered the following scenario, which causes uncommitted 
requests to be executed.

The Zab protocol has three rounds: PROPOSE, ACK, and COMMIT. By adding code to 
the ZooKeeper source, my tool can drop PROPOSAL, ACK, and COMMIT messages and 
collect the values of some variables of each server instance at the end of 
each round. Apart from affecting message reception, my code does not change 
any other ZooKeeper behavior.
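
To make the three rounds concrete, here is a minimal toy model of one broadcast round with droppable message types. It is only a sketch under simplifying assumptions: it is neither ZooKeeper's implementation nor the code in my tool, and the names Server, broadcast, and drop are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    name: str
    logged: list = field(default_factory=list)     # proposals written to the log
    committed: list = field(default_factory=list)  # proposals actually applied


def broadcast(leader, followers, value, drop=frozenset()):
    """Run one PROPOSE/ACK/COMMIT round.

    `drop` is a set of message types ("PROPOSAL", "ACK", "COMMIT") that are
    discarded on the way to or from every follower (hypothetical knob).
    Returns True if the value reached a quorum and was committed by the leader.
    """
    quorum = (len(followers) + 1) // 2 + 1
    leader.logged.append(value)
    acks = 1  # the leader counts its own acknowledgement
    for follower in followers:
        if "PROPOSAL" in drop:
            continue  # the follower never sees the proposal
        follower.logged.append(value)
        if "ACK" not in drop:
            acks += 1
    if acks < quorum:
        return False  # no quorum: the value must stay uncommitted
    leader.committed.append(value)
    for follower in followers:
        if "COMMIT" not in drop and value in follower.logged:
            follower.committed.append(value)
    return True


a, b, c = Server("A"), Server("B"), Server("C")
print(broadcast(a, [b, c], "set /key0 1000", drop={"PROPOSAL"}))  # False
print([s.committed for s in (a, b, c)])  # [[], [], []]
```

In this model, a request whose PROPOSAL or ACK messages are all lost never reaches a quorum and is therefore never applied on any server. The results reported below differ from that expectation.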

 

Setup:

Ubuntu 22.04.2, JDK 13.0.2, Maven 3.9.0, Ant 1.10.13.

Replace the directory called "zookeeper-server" in ZooKeeper 3.5.8 with the 
"zookeeper-server" in [my github repo|https://github.com/kry4tall/CC-ZOO358]. 
Build the modified ZooKeeper 3.5.8 with Ant to get zookeeper-3.5.8.jar, and use 
it to replace the zookeeper-3.5.8.jar downloaded by Maven.

Create a directory called "states" and a file called 
"[scenarios|https://github.com/kry4tall/CC-ZOO358/blob/krystal/zoo-tester/test/scenarios]".
 Write their paths to the test.properties file in zoo-tester's resource 
directory.

Use "-s scenario-X" (X = 1, 2, 3, 4, 5, 6) as the startup argument when running 
the main method of ZooTester.

 

Base scenario:

Initially, start an ensemble with 3 servers called A, B, and C, create 2 znodes 
called /key0 and /key1, and set them to 0 and 1 respectively.
 # Request to set /key0 to 1000 on 3 servers.
 # *(Optional) Isolate the proposal messages that the leader sends to the 2 followers.*
 # *(Optional) Isolate the ack messages that the 2 followers send to the leader.*
 # (Optional) Stop all servers and then restart them.
 # (Optional) Read /key0 and /key1 on each server.
 # Request to set /key1 to 1001 on 3 servers.
 # (Optional) Stop all servers and then restart them.
 # Read /key0 and /key1 on each server.

Label the execution step list [1,2,5,6,8] as {*}scenario1{*}, [1,2,4,5,6,8] as 
{*}scenario2{*}, [1,2,5,6,7,8] as {*}scenario3{*}, [1,2,4,5,6,7,8] as 
{*}scenario4{*}, [1,2,6,8] as {*}scenario5{*}, [1,2,6,7,8] as {*}scenario6{*}, 
[1,3,5,6,8] as {*}scenario7{*}, [1,3,4,5,6,8] as {*}scenario8{*}, [1,3,5,6,7,8] 
as {*}scenario9{*}, [1,3,4,5,6,7,8] as {*}scenario10{*}, [1,3,6,8] as 
{*}scenario11{*}, and [1,3,6,7,8] as {*}scenario12{*}.
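
For readers who prefer a lookup table, the step lists above can be written down as data. This is only an illustrative sketch: the names STEPS, SCENARIOS, describe, and dropped_message are hypothetical and do not reflect the format of the "scenarios" file used by the tool.

```python
# Short labels for the eight steps of the base scenario.
STEPS = {
    1: "set /key0 to 1000",
    2: "isolate PROPOSAL messages from the leader to the 2 followers",
    3: "isolate ACK messages from the 2 followers to the leader",
    4: "stop and restart all servers",
    5: "read /key0 and /key1 on each server",
    6: "set /key1 to 1001",
    7: "stop and restart all servers",
    8: "read /key0 and /key1 on each server",
}

# Step lists exactly as given in the text.
SCENARIOS = {
    "scenario1": [1, 2, 5, 6, 8],
    "scenario2": [1, 2, 4, 5, 6, 8],
    "scenario3": [1, 2, 5, 6, 7, 8],
    "scenario4": [1, 2, 4, 5, 6, 7, 8],
    "scenario5": [1, 2, 6, 8],
    "scenario6": [1, 2, 6, 7, 8],
    "scenario7": [1, 3, 5, 6, 8],
    "scenario8": [1, 3, 4, 5, 6, 8],
    "scenario9": [1, 3, 5, 6, 7, 8],
    "scenario10": [1, 3, 4, 5, 6, 7, 8],
    "scenario11": [1, 3, 6, 8],
    "scenario12": [1, 3, 6, 7, 8],
}


def describe(name):
    """Return the step labels of a scenario, in execution order."""
    return [STEPS[step] for step in SCENARIOS[name]]


def dropped_message(name):
    """Return which message type a scenario isolates, or None."""
    steps = SCENARIOS[name]
    if 2 in steps:
        return "PROPOSAL"
    if 3 in steps:
        return "ACK"
    return None


print(dropped_message("scenario6"))  # PROPOSAL
print(dropped_message("scenario7"))  # ACK
```

Written this way, it is easy to see that scenario1 to scenario6 isolate proposal messages and scenario7 to scenario12 isolate ack messages, and that every scenario contains the mandatory steps 1, 6, and 8.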

The output of these 12 scenarios is attached. For comparison, I have also 
attached the results of the scenario [1,5,6,8], in which {*}no message loss 
action was performed{*}. Its results show no problems.

The typical case of a bug caused by dropping proposal messages is scenario6. 
Of the optional steps, scenario6 selects step 2 and step 7. Performing these 
two operations yields the following final result, which violates data 
consistency: @ 0: /key0 -> 0, /key1 -> 1001; @ 1: /key0 -> 0, /key1 -> 1001; @ 
2: /key0 -> 1000, /key1 -> 1001. 

The typical case of a bug caused by dropping ack messages is scenario7. Of the 
optional steps, scenario7 selects step 3 and step 5. In this scenario, the 
following result, which violates data consistency, is obtained after step 5: 
@ 0: /key0 -> 0, /key1 -> 1; @ 1: /key0 -> 1000, /key1 -> 1; @ 2: /key0 -> 1000, 
/key1 -> 1. 
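
The inconsistency in the two results above can be checked mechanically. The following sketch parses a result line in the "@ server: /path -> value" form quoted above and reports the znodes whose values differ between servers. The helper names parse_result and divergent_znodes are hypothetical and are not part of the tool.

```python
import re

SCENARIO6 = ("@ 0: /key0 -> 0, /key1 -> 1001; @ 1: /key0 -> 0, /key1 -> 1001; "
             "@ 2: /key0 -> 1000, /key1 -> 1001.")
SCENARIO7 = ("@ 0: /key0 -> 0, /key1 -> 1; @ 1: /key0 -> 1000, /key1 -> 1; "
             "@ 2: /key0 -> 1000, /key1 -> 1.")


def parse_result(text):
    """Parse a result line into {server_id: {znode_path: value}}."""
    states = {}
    for server, body in re.findall(r"@\s*(\d+):\s*([^;@]+)", text):
        states[int(server)] = {
            path: int(value)
            for path, value in re.findall(r"(/\w+)\s*->\s*(\d+)", body)
        }
    return states


def divergent_znodes(states):
    """Return {znode_path: {server_id: value}} for znodes that differ across servers."""
    paths = sorted({path for znodes in states.values() for path in znodes})
    divergent = {}
    for path in paths:
        values = {server: znodes.get(path) for server, znodes in states.items()}
        if len(set(values.values())) > 1:
            divergent[path] = values
    return divergent


print(divergent_znodes(parse_result(SCENARIO6)))  # {'/key0': {0: 0, 1: 0, 2: 1000}}
print(divergent_znodes(parse_result(SCENARIO7)))  # {'/key0': {0: 0, 1: 1000, 2: 1000}}
```

In both cases only /key0 diverges: in scenario6 one server reports 1000, and in scenario7 two servers do, while /key1 is identical on all three.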

In addition, comparing scenario2 and scenario3 shows that restarting the 
cluster affects the results. Comparing scenario1 and scenario5 shows that 
step 5, the operation of reading the content of the znodes, also affects the 
results.


> Uncommitted requests have been executed
> ----------------------------------------
>
>                 Key: ZOOKEEPER-4681
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4681
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.5.8
>            Reporter: krystal he
>            Priority: Critical
>         Attachments: zookeeper-no-message-loss.patch, 
> zookeeper-scenario1.patch, zookeeper-scenario10.patch, 
> zookeeper-scenario11.patch, zookeeper-scenario12.patch, 
> zookeeper-scenario2.patch, zookeeper-scenario3.patch, 
> zookeeper-scenario4.patch, zookeeper-scenario5.patch, 
> zookeeper-scenario6.patch, zookeeper-scenario7.patch, 
> zookeeper-scenario8.patch, zookeeper-scenario9.patch



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
