[jira] [Updated] (RATIS-1014) checkLeadership() may make some test case become flaky under GitHub CI.

2020-07-30 Thread Glen Geng (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glen Geng updated RATIS-1014:
-
Description: 
 

After LeaderState::checkLeadership() was merged, some test cases, such as 
GroupManagementBaseTest and TestMultiRaftGroup, became hard to pass under GitHub CI.

These cases perform node restart or membership change operations, which 
make the leader vulnerable, especially when resources are limited.

 

[https://github.com/apache/incubator-ratis/runs/927310008?check_suite_focus=true]

[https://github.com/apache/incubator-ratis/runs/926606136?check_suite_focus=true]

 

!image-2020-07-31-12-33-35-755.png!

 

!image-2020-07-31-12-34-08-384.png!

 

!image-2020-07-31-12-40-11-183.png!

The current workaround is to enlarge the election timeout a little, e.g., from 
[150ms, 300ms] to [300ms, 600ms], because a larger election timeout makes the 
leader more stable.

 

TODO:

1) Is there a better way besides changing the election timeout?

2) Are other test cases affected by LeaderState::checkLeadership()?

 

  was:
 

After merge  LeaderState::checkLeadership(), some test case become hard to pass 
under GitHub CI, such as GroupManagementBaseTest and TestMultiRaftGroup.

Such case need do node restart operation or membership change operation, which 
will make leader vulnerable, especially resources is limited.

 

!image-2020-07-31-12-33-35-755.png!

 

!image-2020-07-31-12-34-08-384.png!

 

!image-2020-07-31-12-40-11-183.png!

Current walk around is to enlarge election timeout a little bit, e.g., from 
[150ms,300ms] to [300ms, 600ms], because larger election timeout will make 
leader become more stable.

 

TODO:

1) do we have better way besides changing election timeout ?

2) are there other test cases affected by LeaderState::checkLeadership() ?

 


> checkLeadership() may make some test case become flaky under GitHub CI.
> ---
>
> Key: RATIS-1014
> URL: https://issues.apache.org/jira/browse/RATIS-1014
> Project: Ratis
>  Issue Type: Test
>Affects Versions: 1.1.0
>Reporter: Glen Geng
>Assignee: Glen Geng
>Priority: Major
> Attachments: image-2020-07-31-12-33-35-755.png, 
> image-2020-07-31-12-34-08-384.png, image-2020-07-31-12-40-11-183.png
>
>
>  
> After LeaderState::checkLeadership() was merged, some test cases, such as 
> GroupManagementBaseTest and TestMultiRaftGroup, became hard to pass under 
> GitHub CI. These cases perform node restart or membership change operations, 
> which make the leader vulnerable, especially when resources are limited.
>  
> [https://github.com/apache/incubator-ratis/runs/927310008?check_suite_focus=true]
> [https://github.com/apache/incubator-ratis/runs/926606136?check_suite_focus=true]
>  
> !image-2020-07-31-12-33-35-755.png!
>  
> !image-2020-07-31-12-34-08-384.png!
>  
> !image-2020-07-31-12-40-11-183.png!
> The current workaround is to enlarge the election timeout a little, e.g., from 
> [150ms, 300ms] to [300ms, 600ms], because a larger election timeout makes the 
> leader more stable.
>  
> TODO:
> 1) Is there a better way besides changing the election timeout?
> 2) Are other test cases affected by LeaderState::checkLeadership()?
>  
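
As a rough illustration of the workaround described in the issue above, the sketch below picks a randomized election timeout from a [min, max] range and checks whether a scheduling stall is shorter than every timeout in that range. This is not Ratis code: the class name, the method names, and the 200 ms stall used in the usage note are hypothetical, and it assumes only that a leader is at risk once a stall reaches the smallest timeout in the range.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical sketch of randomized election timeouts; not part of Ratis. */
public final class ElectionTimeoutSketch {
  private ElectionTimeoutSketch() {}

  /** Picks a random election timeout in [minMs, maxMs], both bounds inclusive. */
  public static long randomTimeoutMs(long minMs, long maxMs) {
    return ThreadLocalRandom.current().nextLong(minMs, maxMs + 1);
  }

  /**
   * Returns true if a stall of stallMs is shorter than every timeout that can be
   * drawn from a range starting at minMs, i.e. shorter than minMs itself.
   */
  public static boolean rangeToleratesStall(long minMs, long stallMs) {
    return stallMs < minMs;
  }
}
```

Under these assumptions, a hypothetical 200 ms stall on a loaded CI runner can exceed some timeouts drawn from [150ms, 300ms], so `rangeToleratesStall(150, 200)` is false, while no timeout drawn from [300ms, 600ms] is that short, so `rangeToleratesStall(300, 200)` is true.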



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (RATIS-981) Step-down stale leader in case of split-brain

2020-07-30 Thread Lokesh Jain (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lokesh Jain resolved RATIS-981.
---
Fix Version/s: 1.1.0
   Resolution: Fixed

> Step-down stale leader in case of split-brain
> -
>
> Key: RATIS-981
> URL: https://issues.apache.org/jira/browse/RATIS-981
> Project: Ratis
>  Issue Type: Improvement
>Reporter: Nanda kumar
>Assignee: Glen Geng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.1.0
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> We should make sure that a stale leader steps down to the candidate state 
> before the next leader election.
> Proposal:
> In the heartbeat thread on the leader node, we should check whether the time 
> since each follower’s last response is less than the leader election timeout. 
> If that holds for a majority of the followers, the current leader is still the 
> active leader: a majority of the followers are heartbeating to it, so there 
> can’t be a new leader.
> If, for a majority of the followers, the time since the last response is 
> greater than the leader election timeout, the current leader should step down 
> and become a candidate.
> With this check, we can be sure that, in case of a network partition, the 
> current leader will step down and become a candidate before the new leader 
> election starts.
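
The check proposed above can be sketched roughly as follows. This is a minimal illustration and not the actual LeaderState::checkLeadership() implementation: the class and method names are hypothetical, and it assumes that the leader counts itself toward the quorum and that each follower is represented by the elapsed time since its last response.

```java
import java.util.List;

/** Hypothetical sketch of the proposed stale-leader check; not the Ratis code. */
public final class LeadershipCheckSketch {
  private LeadershipCheckSketch() {}

  /**
   * Returns true if the leader can still consider itself active.
   *
   * @param elapsedSinceLastResponseMs for each follower, the time since its last response
   * @param electionTimeoutMs the leader election timeout
   */
  public static boolean hasActiveMajority(
      List<Long> elapsedSinceLastResponseMs, long electionTimeoutMs) {
    int clusterSize = elapsedSinceLastResponseMs.size() + 1; // followers plus the leader
    int majority = clusterSize / 2 + 1;
    int active = 1; // assumption: the leader counts itself
    for (long elapsedMs : elapsedSinceLastResponseMs) {
      if (elapsedMs < electionTimeoutMs) {
        active++; // this follower responded within the election timeout
      }
    }
    return active >= majority;
  }
}
```

A caller in the heartbeat loop would step down when `hasActiveMajority` returns false. For a three-node group with a 300 ms timeout, follower gaps of 100 ms and 400 ms keep the leader, while gaps of 400 ms and 500 ms make it step down.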





[jira] [Updated] (RATIS-1014) checkLeadership() may make some test case become flaky under GitHub CI.

2020-07-30 Thread Glen Geng (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glen Geng updated RATIS-1014:
-
Attachment: image-2020-07-31-12-40-11-183.png

> checkLeadership() may make some test case become flaky under GitHub CI.
> ---
>
> Key: RATIS-1014
> URL: https://issues.apache.org/jira/browse/RATIS-1014
> Project: Ratis
>  Issue Type: Test
>Affects Versions: 1.1.0
>Reporter: Glen Geng
>Assignee: Glen Geng
>Priority: Major
> Attachments: image-2020-07-31-12-33-35-755.png, 
> image-2020-07-31-12-34-08-384.png, image-2020-07-31-12-40-11-183.png
>
>
>  
> After merge  LeaderState::checkLeadership(), some test case become hard to 
> pass under GitHub CI, such as GroupManagementBaseTest and TestMultiRaftGroup.
> Such case need do node restart operation or membership change operation, 
> which will make leader vulnerable, especially resources is limited.
>  
> !image-2020-07-31-12-33-35-755.png!
>  
> !image-2020-07-31-12-34-08-384.png!
>  
> Current walk around is to enlarge election timeout a little bit, e.g., from 
> [150ms,300ms] to [300ms, 600ms], because larger election timeout will make 
> leader become more stable.
>  
> TODO:
> 1) do we have better way besides changing election timeout ?
> 2) are there other test cases affected by LeaderState::checkLeadership() ?
>  





[jira] [Updated] (RATIS-1014) checkLeadership() may make some test case become flaky under GitHub CI.

2020-07-30 Thread Glen Geng (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glen Geng updated RATIS-1014:
-
Description: 
 

After merge  LeaderState::checkLeadership(), some test case become hard to pass 
under GitHub CI, such as GroupManagementBaseTest and TestMultiRaftGroup.

Such case need do node restart operation or membership change operation, which 
will make leader vulnerable, especially resources is limited.

 

!image-2020-07-31-12-33-35-755.png!

 

!image-2020-07-31-12-34-08-384.png!

 

!image-2020-07-31-12-40-11-183.png!

Current walk around is to enlarge election timeout a little bit, e.g., from 
[150ms,300ms] to [300ms, 600ms], because larger election timeout will make 
leader become more stable.

 

TODO:

1) do we have better way besides changing election timeout ?

2) are there other test cases affected by LeaderState::checkLeadership() ?

 

  was:
 

After merge  LeaderState::checkLeadership(), some test case become hard to pass 
under GitHub CI, such as GroupManagementBaseTest and TestMultiRaftGroup.

Such case need do node restart operation or membership change operation, which 
will make leader vulnerable, especially resources is limited.

 

!image-2020-07-31-12-33-35-755.png!

 

!image-2020-07-31-12-34-08-384.png!

 

Current walk around is to enlarge election timeout a little bit, e.g., from 
[150ms,300ms] to [300ms, 600ms], because larger election timeout will make 
leader become more stable.

 

TODO:

1) do we have better way besides changing election timeout ?

2) are there other test cases affected by LeaderState::checkLeadership() ?

 


> checkLeadership() may make some test case become flaky under GitHub CI.
> ---
>
> Key: RATIS-1014
> URL: https://issues.apache.org/jira/browse/RATIS-1014
> Project: Ratis
>  Issue Type: Test
>Affects Versions: 1.1.0
>Reporter: Glen Geng
>Assignee: Glen Geng
>Priority: Major
> Attachments: image-2020-07-31-12-33-35-755.png, 
> image-2020-07-31-12-34-08-384.png, image-2020-07-31-12-40-11-183.png
>
>
>  
> After merge  LeaderState::checkLeadership(), some test case become hard to 
> pass under GitHub CI, such as GroupManagementBaseTest and TestMultiRaftGroup.
> Such case need do node restart operation or membership change operation, 
> which will make leader vulnerable, especially resources is limited.
>  
> !image-2020-07-31-12-33-35-755.png!
>  
> !image-2020-07-31-12-34-08-384.png!
>  
> !image-2020-07-31-12-40-11-183.png!
> Current walk around is to enlarge election timeout a little bit, e.g., from 
> [150ms,300ms] to [300ms, 600ms], because larger election timeout will make 
> leader become more stable.
>  
> TODO:
> 1) do we have better way besides changing election timeout ?
> 2) are there other test cases affected by LeaderState::checkLeadership() ?
>  





[jira] [Updated] (RATIS-1014) checkLeadership() may make some test case become flaky under GitHub CI.

2020-07-30 Thread Glen Geng (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glen Geng updated RATIS-1014:
-
Attachment: image-2020-07-31-12-34-08-384.png

> checkLeadership() may make some test case become flaky under GitHub CI.
> ---
>
> Key: RATIS-1014
> URL: https://issues.apache.org/jira/browse/RATIS-1014
> Project: Ratis
>  Issue Type: Test
>Affects Versions: 1.1.0
>Reporter: Glen Geng
>Assignee: Glen Geng
>Priority: Major
> Attachments: image-2020-07-31-12-33-35-755.png, 
> image-2020-07-31-12-34-08-384.png
>
>
>  
> After merge  LeaderState::checkLeadership(), some test case become hard to 
> pass under GitHub CI, such as GroupManagementBaseTest and TestMultiRaftGroup.
> Such case need do node restart operation or membership change operation, 
> which will make leader vulnerable, especially resources is limited.
>  
> !image-2020-07-31-12-33-35-755.png!
>  
> !image-2020-07-31-12-34-08-384.png!
>  
> Current walk around is to enlarge election timeout a little bit, e.g., from 
> [150ms,300ms] to [300ms, 600ms], because larger election timeout will make 
> leader become more stable.
>  
> TODO:
> 1) do we have better way besides changing election timeout ?
> 2) are there other test cases affected by LeaderState::checkLeadership() ?
>  





[jira] [Updated] (RATIS-1014) checkLeadership() may make some test case become flaky under GitHub CI.

2020-07-30 Thread Glen Geng (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glen Geng updated RATIS-1014:
-
Description: 
 

After merge  LeaderState::checkLeadership(), some test case become hard to pass 
under GitHub CI, such as GroupManagementBaseTest and TestMultiRaftGroup.

Such case need do node restart operation or membership change operation, which 
will make leader vulnerable, especially resources is limited.

 

!image-2020-07-31-12-33-35-755.png!

 

!image-2020-07-31-12-34-08-384.png!

 

Current walk around is to enlarge election timeout a little bit, e.g., from 
[150ms,300ms] to [300ms, 600ms], because larger election timeout will make 
leader become more stable.

 

TODO:

1) do we have better way besides changing election timeout ?

2) are there other test cases affected by LeaderState::checkLeadership() ?

 

  was:
 

After merge  LeaderState::checkLeadership(), some test case become hard to pass 
under GitHub CI, such as GroupManagementBaseTest and TestMultiRaftGroup.

Such case need do node restart operation or membership change operation, which 
will make leader vulnerable, especially resources is limited.

 

Current walk around is to enlarge election timeout a little bit, e.g., from 
[150ms,300ms] to [300ms, 600ms], because larger election timeout will make 
leader become more stable.

 

TODO:

1) do we have better way besides changing election timeout ?

2) are there other test cases affected by LeaderState::checkLeadership() ?

 


> checkLeadership() may make some test case become flaky under GitHub CI.
> ---
>
> Key: RATIS-1014
> URL: https://issues.apache.org/jira/browse/RATIS-1014
> Project: Ratis
>  Issue Type: Test
>Affects Versions: 1.1.0
>Reporter: Glen Geng
>Assignee: Glen Geng
>Priority: Major
> Attachments: image-2020-07-31-12-33-35-755.png, 
> image-2020-07-31-12-34-08-384.png
>
>
>  
> After merge  LeaderState::checkLeadership(), some test case become hard to 
> pass under GitHub CI, such as GroupManagementBaseTest and TestMultiRaftGroup.
> Such case need do node restart operation or membership change operation, 
> which will make leader vulnerable, especially resources is limited.
>  
> !image-2020-07-31-12-33-35-755.png!
>  
> !image-2020-07-31-12-34-08-384.png!
>  
> Current walk around is to enlarge election timeout a little bit, e.g., from 
> [150ms,300ms] to [300ms, 600ms], because larger election timeout will make 
> leader become more stable.
>  
> TODO:
> 1) do we have better way besides changing election timeout ?
> 2) are there other test cases affected by LeaderState::checkLeadership() ?
>  





[jira] [Updated] (RATIS-1014) checkLeadership() may make some test case become flaky under GitHub CI.

2020-07-30 Thread Glen Geng (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glen Geng updated RATIS-1014:
-
Attachment: image-2020-07-31-12-33-35-755.png

> checkLeadership() may make some test case become flaky under GitHub CI.
> ---
>
> Key: RATIS-1014
> URL: https://issues.apache.org/jira/browse/RATIS-1014
> Project: Ratis
>  Issue Type: Test
>Affects Versions: 1.1.0
>Reporter: Glen Geng
>Assignee: Glen Geng
>Priority: Major
> Attachments: image-2020-07-31-12-33-35-755.png
>
>
>  
> After merge  LeaderState::checkLeadership(), some test case become hard to 
> pass under GitHub CI, such as GroupManagementBaseTest and TestMultiRaftGroup.
> Such case need do node restart operation or membership change operation, 
> which will make leader vulnerable, especially resources is limited.
>  
> Current walk around is to enlarge election timeout a little bit, e.g., from 
> [150ms,300ms] to [300ms, 600ms], because larger election timeout will make 
> leader become more stable.
>  
> TODO:
> 1) do we have better way besides changing election timeout ?
> 2) are there other test cases affected by LeaderState::checkLeadership() ?
>  





[jira] [Updated] (RATIS-1014) checkLeadership() may make some test case become flaky under GitHub CI.

2020-07-30 Thread Glen Geng (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glen Geng updated RATIS-1014:
-
Description: 
 

After merge  LeaderState::checkLeadership(), some test case become hard to pass 
under GitHub CI, such as GroupManagementBaseTest and TestMultiRaftGroup.

Such case need do node restart operation or membership change operation, which 
will make leader vulnerable, especially resources is limited.

 

Current walk around is to enlarge election timeout a little bit, e.g., from 
[150ms,300ms] to [300ms, 600ms], because larger election timeout will make 
leader become more stable.

 

TODO:

1) do we have better way besides changing election timeout ?

2) are there other test cases affected by LeaderState::checkLeadership() ?

 

  was:
 

After merge  LeaderState::checkLeadership(), some test case become hard to pass 
under GitHub CI, such as GroupManagementBaseTest and TestMultiRaftGroup.

Such case need do node restart operation or membership change operation, which 
will make leader vulnerable, especially resources is limited.

 

Current walk around is to enlarge election timeout a little bit, e.g., from 
[150ms,300ms] to [300ms, 600ms], because larger election timeout will make 
leader become more stable.

 

TODO:

1) do we have better way besides change election timeout ?

2) are there other test cases affected by LeaderState::checkLeadership() ?

 


> checkLeadership() may make some test case become flaky under GitHub CI.
> ---
>
> Key: RATIS-1014
> URL: https://issues.apache.org/jira/browse/RATIS-1014
> Project: Ratis
>  Issue Type: Test
>Affects Versions: 1.1.0
>Reporter: Glen Geng
>Assignee: Glen Geng
>Priority: Major
>
>  
> After merge  LeaderState::checkLeadership(), some test case become hard to 
> pass under GitHub CI, such as GroupManagementBaseTest and TestMultiRaftGroup.
> Such case need do node restart operation or membership change operation, 
> which will make leader vulnerable, especially resources is limited.
>  
> Current walk around is to enlarge election timeout a little bit, e.g., from 
> [150ms,300ms] to [300ms, 600ms], because larger election timeout will make 
> leader become more stable.
>  
> TODO:
> 1) do we have better way besides changing election timeout ?
> 2) are there other test cases affected by LeaderState::checkLeadership() ?
>  





[jira] [Updated] (RATIS-1014) checkLeadership() may make some test case become flaky under GitHub CI.

2020-07-30 Thread Glen Geng (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glen Geng updated RATIS-1014:
-
Description: 
 

After merge  LeaderState::checkLeadership(), some test case become hard to pass 
under GitHub CI, such as GroupManagementBaseTest and TestMultiRaftGroup.

Such case need do node restart operation or membership change operation, which 
will make leader vulnerable, especially resources is limited.

 

Current walk around is to enlarge election timeout a little bit, e.g., from 
[150ms,300ms] to [300ms, 600ms], because larger election timeout will make 
leader become more stable.

 

TODO:

1) do we have better way besides change election timeout ?

2) are there other test cases affected by LeaderState::checkLeadership() ?

 

  was:
 

After merge  LeaderState::checkLeadership(), some test case become hard to pass 
under GitHub CI, such as GroupManagementBaseTest and TestMultiRaftGroup.

Such case need do node restart operation or membership change operation, which 
will make leader vulnerable, especially resources is limited.

Current walk around is to enlarge election timeout a little bit, e.g., from 
[150ms,300ms] to [300ms, 600ms], because larger election timeout will make 
leader become more stable.

 

TODO:

1) do we have better way besides change election timeout ?

2) are there other test cases affected by LeaderState::checkLeadership() ?

 


> checkLeadership() may make some test case become flaky under GitHub CI.
> ---
>
> Key: RATIS-1014
> URL: https://issues.apache.org/jira/browse/RATIS-1014
> Project: Ratis
>  Issue Type: Test
>Affects Versions: 1.1.0
>Reporter: Glen Geng
>Assignee: Glen Geng
>Priority: Major
>
>  
> After merge  LeaderState::checkLeadership(), some test case become hard to 
> pass under GitHub CI, such as GroupManagementBaseTest and TestMultiRaftGroup.
> Such case need do node restart operation or membership change operation, 
> which will make leader vulnerable, especially resources is limited.
>  
> Current walk around is to enlarge election timeout a little bit, e.g., from 
> [150ms,300ms] to [300ms, 600ms], because larger election timeout will make 
> leader become more stable.
>  
> TODO:
> 1) do we have better way besides change election timeout ?
> 2) are there other test cases affected by LeaderState::checkLeadership() ?
>  





[jira] [Updated] (RATIS-1014) checkLeadership() may make some test case become flaky under GitHub CI.

2020-07-30 Thread Glen Geng (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glen Geng updated RATIS-1014:
-
Description: 
 

After merge  LeaderState::checkLeadership(), some test case become hard to pass 
under GitHub CI, such as GroupManagementBaseTest and TestMultiRaftGroup.

Such case need do node restart operation or membership change operation, which 
will make leader vulnerable, especially resources is limited.

Current walk around is to enlarge election timeout a little bit, e.g., from 
[150ms,300ms] to [300ms, 600ms], because larger election timeout will make 
leader become more stable.

 

TODO:

1) do we have better way besides change election timeout ?

2) are there other test cases affected by LeaderState::checkLeadership() ?

 

  was:
 

After merge  LeaderState::checkLeadership(), some test case become hard to pass 
under GitHub CI, such as GroupManagementBaseTest and TestMultiRaftGroup.

Such case need do node restart operation or membership change operation, which 
will make leader vulnerable.

Current walk around is to enlarge election timeout a little bit, e.g., from 
[150ms,300ms] to [300ms, 600ms], because larger election timeout will make 
leader become more stable.

 

TODO:

1) do we have better way besides change election timeout ?

2) are there other test cases affected by LeaderState::checkLeadership() ?

 


> checkLeadership() may make some test case become flaky under GitHub CI.
> ---
>
> Key: RATIS-1014
> URL: https://issues.apache.org/jira/browse/RATIS-1014
> Project: Ratis
>  Issue Type: Test
>Affects Versions: 1.1.0
>Reporter: Glen Geng
>Assignee: Glen Geng
>Priority: Major
>
>  
> After merge  LeaderState::checkLeadership(), some test case become hard to 
> pass under GitHub CI, such as GroupManagementBaseTest and TestMultiRaftGroup.
> Such case need do node restart operation or membership change operation, 
> which will make leader vulnerable, especially resources is limited.
> Current walk around is to enlarge election timeout a little bit, e.g., from 
> [150ms,300ms] to [300ms, 600ms], because larger election timeout will make 
> leader become more stable.
>  
> TODO:
> 1) do we have better way besides change election timeout ?
> 2) are there other test cases affected by LeaderState::checkLeadership() ?
>  





[jira] [Updated] (RATIS-1014) checkLeadership() may make some test case become flaky under GitHub CI.

2020-07-30 Thread Glen Geng (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glen Geng updated RATIS-1014:
-
Description: 
 

After merge  LeaderState::checkLeadership(), some test case become hard to pass 
under GitHub CI, such as GroupManagementBaseTest and TestMultiRaftGroup.

Such case need do node restart operation or membership change operation, which 
will make leader vulnerable.

Current walk around is to enlarge election timeout a little bit, e.g., from 
[150ms,300ms] to [300ms, 600ms], because larger election timeout will make 
leader become more stable.

 

TODO:

1) do we have better way besides change election timeout ?

2) are there other test cased affected by LeaderState::checkLeadership() ?

 

  was:
 

After merge  LeaderState::checkLeadership(), some test case become hard to pass 
under GitHub CI, such as GroupManagementBaseTest and TestMultiRaftGroup.

Such case need do node restart operation or membership change operation, which 
will make leader vulnerable.

Current walk around is to enlarge election timeout a little bit, e.g., from 
[150ms,300ms] to [300ms, 600ms], since larger election timeout will make leader 
become more stable.

 

TODO:

1) do we have better way besides change election timeout ?

2) are there other test cased affected by LeaderState::checkLeadership() ?

 


> checkLeadership() may make some test case become flaky under GitHub CI.
> ---
>
> Key: RATIS-1014
> URL: https://issues.apache.org/jira/browse/RATIS-1014
> Project: Ratis
>  Issue Type: Test
>Affects Versions: 1.1.0
>Reporter: Glen Geng
>Assignee: Glen Geng
>Priority: Major
>
>  
> After merge  LeaderState::checkLeadership(), some test case become hard to 
> pass under GitHub CI, such as GroupManagementBaseTest and TestMultiRaftGroup.
> Such case need do node restart operation or membership change operation, 
> which will make leader vulnerable.
> Current walk around is to enlarge election timeout a little bit, e.g., from 
> [150ms,300ms] to [300ms, 600ms], because larger election timeout will make 
> leader become more stable.
>  
> TODO:
> 1) do we have better way besides change election timeout ?
> 2) are there other test cased affected by LeaderState::checkLeadership() ?
>  





[jira] [Updated] (RATIS-1014) checkLeadership() may make some test case become flaky under GitHub CI.

2020-07-30 Thread Glen Geng (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glen Geng updated RATIS-1014:
-
Description: 
 

After merge  LeaderState::checkLeadership(), some test case become hard to pass 
under GitHub CI, such as GroupManagementBaseTest and TestMultiRaftGroup.

Such case need do node restart operation or membership change operation, which 
will make leader vulnerable.

Current walk around is to enlarge election timeout a little bit, e.g., from 
[150ms,300ms] to [300ms, 600ms], because larger election timeout will make 
leader become more stable.

 

TODO:

1) do we have better way besides change election timeout ?

2) are there other test cases affected by LeaderState::checkLeadership() ?

 

  was:
 

After merge  LeaderState::checkLeadership(), some test case become hard to pass 
under GitHub CI, such as GroupManagementBaseTest and TestMultiRaftGroup.

Such case need do node restart operation or membership change operation, which 
will make leader vulnerable.

Current walk around is to enlarge election timeout a little bit, e.g., from 
[150ms,300ms] to [300ms, 600ms], because larger election timeout will make 
leader become more stable.

 

TODO:

1) do we have better way besides change election timeout ?

2) are there other test cased affected by LeaderState::checkLeadership() ?

 


> checkLeadership() may make some test case become flaky under GitHub CI.
> ---
>
> Key: RATIS-1014
> URL: https://issues.apache.org/jira/browse/RATIS-1014
> Project: Ratis
>  Issue Type: Test
>Affects Versions: 1.1.0
>Reporter: Glen Geng
>Assignee: Glen Geng
>Priority: Major
>
>  
> After merge  LeaderState::checkLeadership(), some test case become hard to 
> pass under GitHub CI, such as GroupManagementBaseTest and TestMultiRaftGroup.
> Such case need do node restart operation or membership change operation, 
> which will make leader vulnerable.
> Current walk around is to enlarge election timeout a little bit, e.g., from 
> [150ms,300ms] to [300ms, 600ms], because larger election timeout will make 
> leader become more stable.
>  
> TODO:
> 1) do we have better way besides change election timeout ?
> 2) are there other test cases affected by LeaderState::checkLeadership() ?
>  





[jira] [Assigned] (RATIS-624) RaftServer should support pause/ unpause in its LifeCycle state

2020-07-30 Thread Arpit Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal reassigned RATIS-624:
---

Assignee: Rui Wang

> RaftServer should support pause/ unpause in its LifeCycle state
> ---
>
> Key: RATIS-624
> URL: https://issues.apache.org/jira/browse/RATIS-624
> Project: Ratis
>  Issue Type: Task
>Reporter: Hanisha Koneru
>Assignee: Rui Wang
>Priority: Major
>  Labels: ozone
> Fix For: 1.1.0
>
>
> This Jira aims to add support for pause and unpause to the RaftServer 
> LifeCycle state. When paused, the RaftServer should not accept any incoming 
> append log entries.





[jira] [Updated] (RATIS-1014) checkLeadership() may make some test case become flaky under GitHub CI.

2020-07-30 Thread Glen Geng (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glen Geng updated RATIS-1014:
-
Description: 
 

After merge  LeaderState::checkLeadership(), some test case become hard to pass 
under GitHub CI, such as GroupManagementBaseTest and TestMultiRaftGroup.

Such case need do node restart operation or membership change operation, which 
will make leader vulnerable.

Current walk around is to enlarge election timeout a little bit, e.g., from 
[150ms,300ms] to [300ms, 600ms], since larger election timeout will make leader 
become more stable.

 

TODO:

1) do we have better way besides change election timeout ?

2) are there other test cased affected by LeaderState::checkLeadership() ?

 

  was:
After merge  LeaderState::checkLeadership(), some test case become hard to pass 
under GitHub CI, such as GroupManagementBaseTest and TestMultiRaftGroup.

Such case need do node restart operation or membership change operation, which 
will make leader vulnerable.

The walk around is enlarge election timeout a little bit, e.g., from 
[150ms,300ms] to [300ms, 600ms], since larger election timeout will make leader 
become more stable.

 

TODO:

1) do we have better way besides change election timeout ?

2) are there other test cased affected by LeaderState::checkLeadership() ?

 


> checkLeadership() may make some test case become flaky under GitHub CI.
> ---
>
> Key: RATIS-1014
> URL: https://issues.apache.org/jira/browse/RATIS-1014
> Project: Ratis
>  Issue Type: Test
>Affects Versions: 1.1.0
>Reporter: Glen Geng
>Assignee: Glen Geng
>Priority: Major
>
>  
> After merge  LeaderState::checkLeadership(), some test case become hard to 
> pass under GitHub CI, such as GroupManagementBaseTest and TestMultiRaftGroup.
> Such case need do node restart operation or membership change operation, 
> which will make leader vulnerable.
> Current walk around is to enlarge election timeout a little bit, e.g., from 
> [150ms,300ms] to [300ms, 600ms], since larger election timeout will make 
> leader become more stable.
>  
> TODO:
> 1) do we have better way besides change election timeout ?
> 2) are there other test cased affected by LeaderState::checkLeadership() ?
>  





[jira] [Commented] (RATIS-624) RaftServer should support pause/ unpause in its LifeCycle state

2020-07-30 Thread Arpit Agarwal (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17168377#comment-17168377
 ] 

Arpit Agarwal commented on RATIS-624:
-

[~amaliujia] I have added you as a Ratis contributor. You will be able to 
assign issues to yourself now. Welcome aboard!

> RaftServer should support pause/ unpause in its LifeCycle state
> ---
>
> Key: RATIS-624
> URL: https://issues.apache.org/jira/browse/RATIS-624
> Project: Ratis
>  Issue Type: Task
>Reporter: Hanisha Koneru
>Assignee: Rui Wang
>Priority: Major
>  Labels: ozone
> Fix For: 1.1.0
>
>
> This Jira aims to add support for pause and unpause to the RaftServer 
> LifeCycle state. When paused, the RaftServer should not accept any incoming 
> append log entries.





[jira] [Updated] (RATIS-1014) checkLeadership() may make some test case become flaky under GitHub CI.

2020-07-30 Thread Glen Geng (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glen Geng updated RATIS-1014:
-
Description: 
After merging LeaderState::checkLeadership(), some test cases became hard to pass 
under GitHub CI, such as GroupManagementBaseTest and TestMultiRaftGroup.

Such cases need a node restart or a membership change, which leaves the leader 
vulnerable to stepping down.

The workaround is to enlarge the election timeout a little, e.g., from 
[150ms, 300ms] to [300ms, 600ms], since a larger election timeout makes the leader 
more stable.
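
To illustrate why a wider timeout range helps, here is a minimal sketch, not Ratis code. It assumes timeouts are drawn uniformly from [min, max) and that a resource-starved CI machine stalls the leader for a fixed time; the class name, method names and the 200ms stall are all hypothetical.

```java
import java.util.Random;

// Hypothetical sketch: how often would a sampled election timeout expire
// during a leader stall of a given length?
class ElectionTimeoutSketch {

    /** Picks a timeout uniformly from [minMs, maxMs). */
    static long randomTimeoutMs(long minMs, long maxMs, Random random) {
        return minMs + (long) (random.nextDouble() * (maxMs - minMs));
    }

    /** Counts how many of the sampled timeouts are no longer than the stall. */
    static int countExpired(long minMs, long maxMs, long stallMs, int trials, long seed) {
        Random random = new Random(seed);
        int expired = 0;
        for (int i = 0; i < trials; i++) {
            if (randomTimeoutMs(minMs, maxMs, random) <= stallMs) {
                expired++;
            }
        }
        return expired;
    }

    public static void main(String[] args) {
        long stallMs = 200; // assumed stall caused by limited CI resources
        // With [150ms, 300ms) a share of the timeouts fall at or below the stall.
        System.out.println(countExpired(150, 300, stallMs, 10_000, 42L));
        // With [300ms, 600ms) every timeout exceeds the stall, so this prints 0.
        System.out.println(countExpired(300, 600, stallMs, 10_000, 42L));
    }
}
```

The trade-off is that a larger timeout also delays detection of a leader that has really failed, which is why this is only a workaround.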

 

TODO:

1) Is there a better way besides changing the election timeout?

2) Are there other test cases affected by LeaderState::checkLeadership()?

 

  was:
After merging LeaderState::checkLeadership(), some test cases became hard to pass 
under GitHub CI, such as GroupManagementBaseTest and TestMultiRaftGroup.

 


> checkLeadership() may make some test case become flaky under GitHub CI.
> ---
>
> Key: RATIS-1014
> URL: https://issues.apache.org/jira/browse/RATIS-1014
> Project: Ratis
>  Issue Type: Test
>Reporter: Glen Geng
>Assignee: Glen Geng
>Priority: Major
>
> After merging LeaderState::checkLeadership(), some test cases became hard to 
> pass under GitHub CI, such as GroupManagementBaseTest and TestMultiRaftGroup.
> Such cases need a node restart or a membership change, which leaves the 
> leader vulnerable to stepping down.
> The workaround is to enlarge the election timeout a little, e.g., from 
> [150ms, 300ms] to [300ms, 600ms], since a larger election timeout makes the 
> leader more stable.
>  
> TODO:
> 1) Is there a better way besides changing the election timeout?
> 2) Are there other test cases affected by LeaderState::checkLeadership()?
>  





[jira] [Updated] (RATIS-1014) checkLeadership() may make some test case become flaky under GitHub CI.

2020-07-30 Thread Glen Geng (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glen Geng updated RATIS-1014:
-
Affects Version/s: 1.1.0

> checkLeadership() may make some test case become flaky under GitHub CI.
> ---
>
> Key: RATIS-1014
> URL: https://issues.apache.org/jira/browse/RATIS-1014
> Project: Ratis
>  Issue Type: Test
>Affects Versions: 1.1.0
>Reporter: Glen Geng
>Assignee: Glen Geng
>Priority: Major
>
> After merging LeaderState::checkLeadership(), some test cases became hard to 
> pass under GitHub CI, such as GroupManagementBaseTest and TestMultiRaftGroup.
> Such cases need a node restart or a membership change, which leaves the 
> leader vulnerable to stepping down.
> The workaround is to enlarge the election timeout a little, e.g., from 
> [150ms, 300ms] to [300ms, 600ms], since a larger election timeout makes the 
> leader more stable.
>  
> TODO:
> 1) Is there a better way besides changing the election timeout?
> 2) Are there other test cases affected by LeaderState::checkLeadership()?
>  





[jira] [Updated] (RATIS-1014) checkLeadership() may make some test case become flaky under GitHub CI.

2020-07-30 Thread Glen Geng (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glen Geng updated RATIS-1014:
-
Description: 
After merging LeaderState::checkLeadership(), some test cases became hard to pass 
under GitHub CI, such as GroupManagementBaseTest and TestMultiRaftGroup.

 

  was:
After merge 

 


> checkLeadership() may make some test case become flaky under GitHub CI.
> ---
>
> Key: RATIS-1014
> URL: https://issues.apache.org/jira/browse/RATIS-1014
> Project: Ratis
>  Issue Type: Test
>Reporter: Glen Geng
>Assignee: Glen Geng
>Priority: Major
>
> After merging LeaderState::checkLeadership(), some test cases became hard to 
> pass under GitHub CI, such as GroupManagementBaseTest and TestMultiRaftGroup.
>  





[jira] [Updated] (RATIS-1014) checkLeadership() may make some test case become flaky under GitHub CI.

2020-07-30 Thread Glen Geng (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glen Geng updated RATIS-1014:
-
Description: 
After merge 

 

  was:
We should make sure that the stale leader steps down to the candidate state 
before the next leader election.

Proposal:
In the heartbeat thread on the leader node, we should check whether the time 
since each follower's last response is less than the leader election timeout. 
If that holds for a majority of the followers, the current leader is still the 
active leader: a majority of the followers are heartbeating to it, so there 
cannot be a new leader.

If, for a majority of the followers, the time since the last response is 
greater than the leader election timeout, the current leader should step down 
and become a candidate.

With this check, we can be sure that, in case of a network partition, the 
current leader steps down and becomes a candidate before a new leader election 
starts.



> checkLeadership() may make some test case become flaky under GitHub CI.
> ---
>
> Key: RATIS-1014
> URL: https://issues.apache.org/jira/browse/RATIS-1014
> Project: Ratis
>  Issue Type: Test
>Reporter: Glen Geng
>Assignee: Glen Geng
>Priority: Major
>
> After merge 
>  





[jira] [Created] (RATIS-1014) checkLeadership() may make some test case become flaky under GitHub CI.

2020-07-30 Thread Glen Geng (Jira)
Glen Geng created RATIS-1014:


 Summary: checkLeadership() may make some test case become flaky 
under GitHub CI.
 Key: RATIS-1014
 URL: https://issues.apache.org/jira/browse/RATIS-1014
 Project: Ratis
  Issue Type: Improvement
Reporter: Glen Geng
Assignee: Glen Geng


We should make sure that the stale leader steps down to the candidate state 
before the next leader election.

Proposal:
In the heartbeat thread on the leader node, we should check whether the time 
since each follower's last response is less than the leader election timeout. 
If that holds for a majority of the followers, the current leader is still the 
active leader: a majority of the followers are heartbeating to it, so there 
cannot be a new leader.

If, for a majority of the followers, the time since the last response is 
greater than the leader election timeout, the current leader should step down 
and become a candidate.

With this check, we can be sure that, in case of a network partition, the 
current leader steps down and becomes a candidate before a new leader election 
starts.
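
The check proposed above can be sketched roughly as follows. This is a simplified illustration, not the actual LeaderState::checkLeadership() code: the class and method names are hypothetical, timestamps are plain milliseconds, and counting the leader itself toward the majority is an assumption (the proposal only speaks of a majority of the followers).

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the leadership check run from the heartbeat thread.
class LeadershipCheckSketch {

    /**
     * Returns true if the leader may stay leader: the leader plus the followers
     * that responded within the election timeout form a majority of the group.
     */
    static boolean hasMajority(List<Long> followerLastResponseMs, long nowMs, long electionTimeoutMs) {
        int groupSize = followerLastResponseMs.size() + 1; // followers plus the leader
        int active = 1; // assumption: the leader counts itself
        for (long lastResponseMs : followerLastResponseMs) {
            if (nowMs - lastResponseMs < electionTimeoutMs) {
                active++;
            }
        }
        return active > groupSize / 2;
    }

    public static void main(String[] args) {
        long now = 1000;
        long timeout = 300;
        // Both followers responded recently: the leader stays. Prints true.
        System.out.println(hasMajority(Arrays.asList(900L, 950L), now, timeout));
        // Neither follower responded within the timeout: step down. Prints false.
        System.out.println(hasMajority(Arrays.asList(100L, 200L), now, timeout));
    }
}
```

A leader for which hasMajority returns false would step down, so that it cannot keep acting as leader on the minority side of a partition.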






[jira] [Updated] (RATIS-1014) checkLeadership() may make some test case become flaky under GitHub CI.

2020-07-30 Thread Glen Geng (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glen Geng updated RATIS-1014:
-
Labels:   (was: pull-request-available)

> checkLeadership() may make some test case become flaky under GitHub CI.
> ---
>
> Key: RATIS-1014
> URL: https://issues.apache.org/jira/browse/RATIS-1014
> Project: Ratis
>  Issue Type: Improvement
>Reporter: Glen Geng
>Assignee: Glen Geng
>Priority: Major
>
> We should make sure that the stale leader steps down to the candidate state 
> before the next leader election.
> Proposal:
> In the heartbeat thread on the leader node, we should check whether the time 
> since each follower's last response is less than the leader election timeout. 
> If that holds for a majority of the followers, the current leader is still the 
> active leader: a majority of the followers are heartbeating to it, so there 
> cannot be a new leader.
> If, for a majority of the followers, the time since the last response is 
> greater than the leader election timeout, the current leader should step down 
> and become a candidate.
> With this check, we can be sure that, in case of a network partition, the 
> current leader steps down and becomes a candidate before a new leader election 
> starts.





[jira] [Updated] (RATIS-1014) checkLeadership() may make some test case become flaky under GitHub CI.

2020-07-30 Thread Glen Geng (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glen Geng updated RATIS-1014:
-
Issue Type: Test  (was: Improvement)

> checkLeadership() may make some test case become flaky under GitHub CI.
> ---
>
> Key: RATIS-1014
> URL: https://issues.apache.org/jira/browse/RATIS-1014
> Project: Ratis
>  Issue Type: Test
>Reporter: Glen Geng
>Assignee: Glen Geng
>Priority: Major
>
> We should make sure that the stale leader steps down to the candidate state 
> before the next leader election.
> Proposal:
> In the heartbeat thread on the leader node, we should check whether the time 
> since each follower's last response is less than the leader election timeout. 
> If that holds for a majority of the followers, the current leader is still the 
> active leader: a majority of the followers are heartbeating to it, so there 
> cannot be a new leader.
> If, for a majority of the followers, the time since the last response is 
> greater than the leader election timeout, the current leader should step down 
> and become a candidate.
> With this check, we can be sure that, in case of a network partition, the 
> current leader steps down and becomes a candidate before a new leader election 
> starts.





[jira] [Commented] (RATIS-1013) Fix failed UT caused by RATIS-757

2020-07-30 Thread runzhiwang (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-1013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17168340#comment-17168340
 ] 

runzhiwang commented on RATIS-1013:
---

[~ljain]  [~dineshchitlangia] Hi, it looks like 
https://github.com/apache/incubator-ratis/pull/103 made CI unstable: CI 
failed 6 times, which is abnormal. Could you have a look?

> Fix failed UT caused by RATIS-757
> -
>
> Key: RATIS-1013
> URL: https://issues.apache.org/jira/browse/RATIS-1013
> Project: Ratis
>  Issue Type: Sub-task
>Reporter: runzhiwang
>Priority: Major
>






[jira] [Updated] (RATIS-1013) Fix failed UT caused by RATIS-757

2020-07-30 Thread runzhiwang (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-1013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

runzhiwang updated RATIS-1013:
--
Parent: RATIS-863
Issue Type: Sub-task  (was: Bug)

> Fix failed UT caused by RATIS-757
> -
>
> Key: RATIS-1013
> URL: https://issues.apache.org/jira/browse/RATIS-1013
> Project: Ratis
>  Issue Type: Sub-task
>Reporter: runzhiwang
>Priority: Major
>






[jira] [Created] (RATIS-1013) Fix failed UT caused by RATIS-757

2020-07-30 Thread runzhiwang (Jira)
runzhiwang created RATIS-1013:
-

 Summary: Fix failed UT caused by RATIS-757
 Key: RATIS-1013
 URL: https://issues.apache.org/jira/browse/RATIS-1013
 Project: Ratis
  Issue Type: Bug
Reporter: runzhiwang








[jira] [Commented] (RATIS-624) RaftServer should support pause/ unpause in its LifeCycle state

2020-07-30 Thread Hanisha Koneru (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17168296#comment-17168296
 ] 

Hanisha Koneru commented on RATIS-624:
--

The requirement comes from Ozone. In Ozone, if one of the OM Ratis servers is 
lagging behind and needs to install a snapshot to catch up, it has to 
stop its Ratis server so that no transactions are applied in the 
meantime. If Ratis provided an option to pause/unpause its state, the OM Ratis 
server would not have to be stopped in this process.
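
As a rough illustration of the requested behaviour, here is a minimal sketch, not the Ratis LifeCycle implementation. The class, the two-state enum and all method names are hypothetical; it only shows a server that rejects incoming entries while paused and accepts them again once unpaused.

```java
// Hypothetical sketch of a server that can be paused instead of stopped.
class PausableServerSketch {
    enum State { RUNNING, PAUSED }

    private State state = State.RUNNING;
    private int acceptedCount = 0;

    /** Moves RUNNING -> PAUSED; returns false if the server was not running. */
    synchronized boolean pause() {
        if (state != State.RUNNING) {
            return false;
        }
        state = State.PAUSED;
        return true;
    }

    /** Moves PAUSED -> RUNNING; returns false if the server was not paused. */
    synchronized boolean unpause() {
        if (state != State.PAUSED) {
            return false;
        }
        state = State.RUNNING;
        return true;
    }

    /** Accepts an incoming entry only while running; returns whether it was accepted. */
    synchronized boolean appendEntry(String entry) {
        if (state != State.RUNNING) {
            return false; // paused: reject so nothing is applied during snapshot installation
        }
        acceptedCount++;
        return true;
    }

    synchronized int getAcceptedCount() {
        return acceptedCount;
    }

    public static void main(String[] args) {
        PausableServerSketch server = new PausableServerSketch();
        server.appendEntry("a");                     // accepted
        server.pause();                              // e.g. before installing a snapshot
        System.out.println(server.appendEntry("b")); // prints false
        server.unpause();
        System.out.println(server.appendEntry("c")); // prints true
    }
}
```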

> RaftServer should support pause/ unpause in its LifeCycle state
> ---
>
> Key: RATIS-624
> URL: https://issues.apache.org/jira/browse/RATIS-624
> Project: Ratis
>  Issue Type: Task
>Reporter: Hanisha Koneru
>Priority: Major
>  Labels: ozone
> Fix For: 1.1.0
>
>
> This Jira aims to add support for pause and unpause to the RaftServer's 
> LifeCycle state. When paused, the RaftServer should not accept any incoming 
> append log entries.





[jira] [Commented] (RATIS-624) RaftServer should support pause/ unpause in its LifeCycle state

2020-07-30 Thread Rui Wang (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17168119#comment-17168119
 ] 

Rui Wang commented on RATIS-624:


[~hanishakoneru] thanks!

Can you share a bit of context on this JIRA? Are there papers or systems doing 
this? Or is there a requirement from hadoop-ozone for ratis to support this?



> RaftServer should support pause/ unpause in its LifeCycle state
> ---
>
> Key: RATIS-624
> URL: https://issues.apache.org/jira/browse/RATIS-624
> Project: Ratis
>  Issue Type: Task
>Reporter: Hanisha Koneru
>Priority: Major
>  Labels: ozone
> Fix For: 1.1.0
>
>
> This Jira aims to add support for pause and unpause to the RaftServer's 
> LifeCycle state. When paused, the RaftServer should not accept any incoming 
> append log entries.





[jira] [Updated] (RATIS-1011) Define internal streaming APIs

2020-07-30 Thread Tsz-wo Sze (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz-wo Sze updated RATIS-1011:
--
Description: 
Similar to ratis rpc, ratis streaming should define a set of internal APIs in 
order to support pluggable implementations.

The APIs must support an asynchronous, event-driven model.

  was:Similar to ratis rpc, ratis streaming should define a set of internal 
APIs in order to support pluggable implementations.


> Define internal streaming APIs
> --
>
> Key: RATIS-1011
> URL: https://issues.apache.org/jira/browse/RATIS-1011
> Project: Ratis
>  Issue Type: Sub-task
>  Components: Streaming
>Reporter: Tsz-wo Sze
>Assignee: Ansh Khanna
>Priority: Major
>
> Similar to ratis rpc, ratis streaming should define a set of internal APIs in 
> order to support pluggable implementations.
> The APIs must support an asynchronous, event-driven model.





[jira] [Created] (RATIS-1012) Implement ratis streaming using netty

2020-07-30 Thread Tsz-wo Sze (Jira)
Tsz-wo Sze created RATIS-1012:
-

 Summary: Implement ratis streaming using netty
 Key: RATIS-1012
 URL: https://issues.apache.org/jira/browse/RATIS-1012
 Project: Ratis
  Issue Type: Sub-task
  Components: Streaming
Reporter: Tsz-wo Sze
Assignee: Ansh Khanna


Since we are getting good results from RATIS-1009, we will continue to work on 
the first ratis streaming implementation using netty with zero buffer copying.





[jira] [Created] (RATIS-1011) Define internal streaming APIs

2020-07-30 Thread Tsz-wo Sze (Jira)
Tsz-wo Sze created RATIS-1011:
-

 Summary: Define internal streaming APIs
 Key: RATIS-1011
 URL: https://issues.apache.org/jira/browse/RATIS-1011
 Project: Ratis
  Issue Type: Sub-task
  Components: Streaming
Reporter: Tsz-wo Sze
Assignee: Ansh Khanna


Similar to ratis rpc, ratis streaming should define a set of internal APIs in 
order to support pluggable implementations.





[jira] [Resolved] (RATIS-1009) A simple benchmark achieving zero-copy semantics using Netty.

2020-07-30 Thread Tsz-wo Sze (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-1009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz-wo Sze resolved RATIS-1009.
---
Fix Version/s: 1.1.0
   Resolution: Fixed

I have merged the pull request.  Thanks, Ansh!

> A simple benchmark achieving zero-copy semantics using Netty.
> -
>
> Key: RATIS-1009
> URL: https://issues.apache.org/jira/browse/RATIS-1009
> Project: Ratis
>  Issue Type: Sub-task
>  Components: Streaming
>Reporter: Ansh Khanna
>Assignee: Ansh Khanna
>Priority: Major
> Fix For: 1.1.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Patch: [https://github.com/apache/incubator-ratis/pull/155]





[jira] [Updated] (RATIS-1009) A simple benchmark achieving zero-copy semantics using Netty.

2020-07-30 Thread Tsz-wo Sze (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-1009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz-wo Sze updated RATIS-1009:
--
Component/s: Streaming

> A simple benchmark achieving zero-copy semantics using Netty.
> -
>
> Key: RATIS-1009
> URL: https://issues.apache.org/jira/browse/RATIS-1009
> Project: Ratis
>  Issue Type: Sub-task
>  Components: Streaming
>Reporter: Ansh Khanna
>Assignee: Ansh Khanna
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Patch: [https://github.com/apache/incubator-ratis/pull/155]





[jira] [Commented] (RATIS-624) RaftServer should support pause/ unpause in its LifeCycle state

2020-07-30 Thread Hanisha Koneru (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17168018#comment-17168018
 ] 

Hanisha Koneru commented on RATIS-624:
--

[~amaliujia], please go ahead.

> RaftServer should support pause/ unpause in its LifeCycle state
> ---
>
> Key: RATIS-624
> URL: https://issues.apache.org/jira/browse/RATIS-624
> Project: Ratis
>  Issue Type: Task
>Reporter: Hanisha Koneru
>Priority: Major
>  Labels: ozone
> Fix For: 1.1.0
>
>
> This Jira aims to add support for pause and unpause to the RaftServer's 
> LifeCycle state. When paused, the RaftServer should not accept any incoming 
> append log entries.





[jira] [Assigned] (RATIS-965) Add a metric for raftServer impl groups for a raft server

2020-07-30 Thread Lokesh Jain (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lokesh Jain reassigned RATIS-965:
-

Assignee: Cyrus Jackson  (was: Ansh Khanna)

> Add a metric for raftServer impl groups for a raft server
> -
>
> Key: RATIS-965
> URL: https://issues.apache.org/jira/browse/RATIS-965
> Project: Ratis
>  Issue Type: Sub-task
>Reporter: Shashikant Banerjee
>Assignee: Cyrus Jackson
>Priority: Major
> Fix For: 1.1.0
>
> Attachments: RATIS-965.001.patch
>
>
> Currently, a single raft server instance can contain multiple raftServerImpl 
> belonging to different raft groups. The idea here is to track the number of 
> RaftGroups a raft server is part of.





[jira] [Resolved] (RATIS-965) Add a metric for raftServer impl groups for a raft server

2020-07-30 Thread Lokesh Jain (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lokesh Jain resolved RATIS-965.
---
Fix Version/s: 1.1.0
   Resolution: Fixed

> Add a metric for raftServer impl groups for a raft server
> -
>
> Key: RATIS-965
> URL: https://issues.apache.org/jira/browse/RATIS-965
> Project: Ratis
>  Issue Type: Sub-task
>Reporter: Shashikant Banerjee
>Assignee: Ansh Khanna
>Priority: Major
> Fix For: 1.1.0
>
> Attachments: RATIS-965.001.patch
>
>
> Currently, a single raft server instance can contain multiple raftServerImpl 
> belonging to different raft groups. The idea here is to track the number of 
> RaftGroups a raft server is part of.





[jira] [Commented] (RATIS-624) RaftServer should support pause/ unpause in its LifeCycle state

2020-07-30 Thread Rui Wang (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17167653#comment-17167653
 ] 

Rui Wang commented on RATIS-624:


[~hanishakoneru]

Can I work on this JIRA?



> RaftServer should support pause/ unpause in its LifeCycle state
> ---
>
> Key: RATIS-624
> URL: https://issues.apache.org/jira/browse/RATIS-624
> Project: Ratis
>  Issue Type: Task
>Reporter: Hanisha Koneru
>Priority: Major
>  Labels: ozone
> Fix For: 1.1.0
>
>
> This Jira aims to add support for pause and unpause to the RaftServer's 
> LifeCycle state. When paused, the RaftServer should not accept any incoming 
> append log entries.


