[jira] [Commented] (RATIS-651) Add metrics related to leaderElection and HeartBeat

2019-08-27 Thread Shashikant Banerjee (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16917409#comment-16917409
 ] 

Shashikant Banerjee commented on RATIS-651:
---

Thanks [~avijayan] for updating the patch. The patch looks good to me. I am +1 
on this change. Will commit this shortly.

> Add metrics related to leaderElection and HeartBeat
> ---
>
> Key: RATIS-651
> URL: https://issues.apache.org/jira/browse/RATIS-651
> Project: Ratis
>  Issue Type: Sub-task
>  Components: server
>Affects Versions: 0.4.0
>Reporter: Shashikant Banerjee
>Assignee: Aravindan Vijayan
>Priority: Critical
> Attachments: RATIS-651-000.patch, RATIS-651-001.patch, 
> RATIS-651-002.patch, RATIS-651-003.patch
>
>
> The following metrics would be helpful for tracking leader election events 
> and timeouts:
>  
> |numLeaderElections|Number of leader elections since the creation of the 
> Ratis pipeline|
> |numLeaderElectionTimeouts|Number of leader election timeouts or failures|
> |LeaderElectionCompletionLatency|Time required to complete a leader election|
> |MaxNoLeaderInterval|Max time for which there has been no elected leader in 
> the Raft ring|
> |heartBeatMissCount|Number of times a heartbeat response is missed from a 
> server|
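As a rough illustration of the proposed counters, here is a plain-Java sketch; the field and method names are assumptions for illustration only, not the actual RATIS-651 patch, which would hook into Ratis' own metrics framework.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of the proposed leader-election/heartbeat counters.
class LeaderElectionMetrics {
  final AtomicLong numLeaderElections = new AtomicLong();
  final AtomicLong numLeaderElectionTimeouts = new AtomicLong();
  final AtomicLong heartBeatMissCount = new AtomicLong();
  volatile long maxNoLeaderIntervalMs;  // longest observed leaderless window

  void onLeaderElectionStarted() { numLeaderElections.incrementAndGet(); }
  void onLeaderElectionTimeout() { numLeaderElectionTimeouts.incrementAndGet(); }
  void onHeartbeatMiss()         { heartBeatMissCount.incrementAndGet(); }

  // Track the maximum interval during which the raft ring had no leader.
  void onNoLeaderInterval(long ms) {
    if (ms > maxNoLeaderIntervalMs) {
      maxNoLeaderIntervalMs = ms;
    }
  }
}
```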



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (RATIS-669) Allow Ratis gRPCTlsConfig to take Java Key/Cert Object in addition to File

2019-08-27 Thread Xiaoyu Yao (Jira)
Xiaoyu Yao created RATIS-669:


 Summary: Allow Ratis gRPCTlsConfig to take Java Key/Cert Object in 
addition to File
 Key: RATIS-669
 URL: https://issues.apache.org/jira/browse/RATIS-669
 Project: Ratis
  Issue Type: Improvement
Affects Versions: 0.3.0
Reporter: Xiaoyu Yao
Assignee: Xiaoyu Yao


This is needed for TLS clients that do not have their own local persistence of 
the cert file.

The CA cert will be decoded from the block token for clients external to the 
Ozone cluster (non-SCM/OM/DN).
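A minimal sketch of what the proposed overload might look like; the class and field names here are assumptions for illustration, not the actual gRPCTlsConfig API.

```java
import java.io.File;
import java.security.PrivateKey;
import java.security.cert.X509Certificate;

// Hypothetical sketch: a TLS config that accepts either file paths or
// in-memory key/cert objects, as RATIS-669 proposes.
class TlsConfigSketch {
  private final File privateKeyFile;
  private final PrivateKey privateKey;
  private final File certChainFile;
  private final X509Certificate certChain;
  private final boolean inMemory;

  // File-based configuration (the existing style).
  TlsConfigSketch(File privateKeyFile, File certChainFile) {
    this.privateKeyFile = privateKeyFile;
    this.certChainFile = certChainFile;
    this.privateKey = null;
    this.certChain = null;
    this.inMemory = false;
  }

  // Object-based configuration for clients without local cert persistence.
  TlsConfigSketch(PrivateKey privateKey, X509Certificate certChain) {
    this.privateKey = privateKey;
    this.certChain = certChain;
    this.privateKeyFile = null;
    this.certChainFile = null;
    this.inMemory = true;
  }

  boolean isInMemory() { return inMemory; }
}
```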





[jira] [Commented] (RATIS-543) Ratis GRPC client produces excessive logging while writing data.

2019-08-27 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16917292#comment-16917292
 ] 

Hadoop QA commented on RATIS-543:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
 8s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
21s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
48s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 15m  1s{color} 
| {color:red} root in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 23m 54s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | ratis.examples.filestore.TestFileStoreWithGrpc |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 Image:yetus/ratis:date2019-08-27 |
| JIRA Issue | RATIS-543 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12978715/r485_20190827.patch |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  
checkstyle  compile  |
| uname | Linux 91b564800372 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-RATIS-Build/yetus-personality.sh
 |
| git revision | master / 021165f |
| maven | version: Apache Maven 3.6.0 
(97c98ec64a1fdfee7767ce5ffb20918da4f719f3; 2018-10-24T18:41:47Z) |
| Default Java | 1.8.0_222 |
| unit | 
https://builds.apache.org/job/PreCommit-RATIS-Build/946/artifact/out/patch-unit-root.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-RATIS-Build/946/testReport/ |
| Max. process+thread count | 2200 (vs. ulimit of 5000) |
| modules | C: ratis-grpc U: ratis-grpc |
| Console output | 
https://builds.apache.org/job/PreCommit-RATIS-Build/946/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Ratis GRPC client produces excessive logging while writing data.
> 
>
> Key: RATIS-543
> URL: https://issues.apache.org/jira/browse/RATIS-543
> Project: Ratis
>  Issue Type: Bug
>  Components: gRPC
>Reporter: Aravindan Vijayan
>Assignee: Tsz Wo Nicholas Sze
>Priority: Blocker
>  Labels: ozone
> Attachments: r485_20190827.patch
>
>
> {code}
> 19/05/03 10:23:31 INFO client.GrpcClientProtocolClient: 
> 

[jira] [Commented] (RATIS-651) Add metrics related to leaderElection and HeartBeat

2019-08-27 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16917274#comment-16917274
 ] 

Hadoop QA commented on RATIS-651:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  2m 
14s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 1s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
10s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
24s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
56s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
19s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
43s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
6s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 15m 48s{color} 
| {color:red} root in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 27m  6s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
ratis.server.simulation.TestRaftStateMachineExceptionWithSimulatedRpc |
|   | ratis.examples.filestore.TestFileStoreWithNetty |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 Image:yetus/ratis:date2019-08-27 |
| JIRA Issue | RATIS-651 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12978708/RATIS-651-003.patch |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  
checkstyle  compile  |
| uname | Linux 53af9d7639d0 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-RATIS-Build/yetus-personality.sh
 |
| git revision | master / 021165f |
| maven | version: Apache Maven 3.6.0 
(97c98ec64a1fdfee7767ce5ffb20918da4f719f3; 2018-10-24T18:41:47Z) |
| Default Java | 1.8.0_222 |
| unit | 
https://builds.apache.org/job/PreCommit-RATIS-Build/945/artifact/out/patch-unit-root.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-RATIS-Build/945/testReport/ |
| Max. process+thread count | 2982 (vs. ulimit of 5000) |
| modules | C: ratis-server ratis-test U: . |
| Console output | 
https://builds.apache.org/job/PreCommit-RATIS-Build/945/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |





> Add metrics related to leaderElection and HeartBeat
> ---
>
> Key: RATIS-651
> URL: https://issues.apache.org/jira/browse/RATIS-651
> Project: Ratis
>  Issue Type: Sub-task
>  Components: server
>Affects Versions: 0.4.0
>Reporter: Shashikant Banerjee
> 

[jira] [Commented] (RATIS-556) Detect node failures and close the log to prevent additional writes

2019-08-27 Thread Rajeshbabu Chintaguntla (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16917139#comment-16917139
 ] 

Rajeshbabu Chintaguntla commented on RATIS-556:
---

[~an...@apache.org]
bq. can we please do this small change: instead of throwing an exception and 
catching outside, can we just WARN here itself and continue processing other 
logs (to avoid immediate retry of the same log; if it continues to fail, other 
logs will never be tried for close).
Done in v3 patch.

> Detect node failures and close the log to prevent additional writes
> ---
>
> Key: RATIS-556
> URL: https://issues.apache.org/jira/browse/RATIS-556
> Project: Ratis
>  Issue Type: Improvement
>Reporter: Rajeshbabu Chintaguntla
>Assignee: Rajeshbabu Chintaguntla
>Priority: Major
> Attachments: RATIS-556-wip.patch, RATIS-556_v1.patch, 
> RATIS-556_v2.patch, RATIS-556_v3.patch
>
>
> Currently there is no way to detect node failures at the master log servers 
> and add new nodes to the group serving the log. We need to analyze how Ozone 
> handles this case.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (RATIS-556) Detect node failures and close the log to prevent additional writes

2019-08-27 Thread Rajeshbabu Chintaguntla (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajeshbabu Chintaguntla updated RATIS-556:
--
Attachment: RATIS-556_v3.patch

> Detect node failures and close the log to prevent additional writes
> ---
>
> Key: RATIS-556
> URL: https://issues.apache.org/jira/browse/RATIS-556
> Project: Ratis
>  Issue Type: Improvement
>Reporter: Rajeshbabu Chintaguntla
>Assignee: Rajeshbabu Chintaguntla
>Priority: Major
> Attachments: RATIS-556-wip.patch, RATIS-556_v1.patch, 
> RATIS-556_v2.patch, RATIS-556_v3.patch
>
>
> Currently there is no way to detect node failures at the master log servers 
> and add new nodes to the group serving the log. We need to analyze how Ozone 
> handles this case.





[jira] [Commented] (RATIS-556) Detect node failures and close the log to prevent additional writes

2019-08-27 Thread Ankit Singhal (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16917105#comment-16917105
 ] 

Ankit Singhal commented on RATIS-556:
-

Thanks [~rajeshbabu] , v2 looks good to me.

can we please do this small change: instead of throwing an exception and 
catching outside, can we just WARN here itself and continue processing other 
logs (to avoid immediate retry of the same log; if it continues to fail, other 
logs will never be tried for close).

{code}
+try {
+  RaftClientReply reply = client.send(
+      () -> LogServiceProtoUtil.toChangeStateRequestProto(logName, LogStream.State.CLOSED)
+          .toByteString());
+  LogServiceProtos.ChangeStateReplyProto message =
+      LogServiceProtos.ChangeStateReplyProto.parseFrom(reply.getMessage().getContent());
+} catch (IOException e) {
+  throw new RuntimeException(e);
+}
{code}
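The warn-and-continue change suggested here could look roughly like the following sketch; `LogCloser`, `close(..)`, and the log message are illustrative stand-ins for the LogService code, not the actual patch.

```java
import java.io.IOException;
import java.util.List;
import java.util.logging.Logger;

// Hypothetical sketch of the suggested change: on failure, warn and move on
// to the next log instead of aborting the whole loop with a RuntimeException.
class LogCloser {
  static final Logger LOG = Logger.getLogger("LogCloser");

  // Returns the number of logs closed successfully; failures are logged
  // and skipped so the remaining logs still get a close attempt.
  int closeAll(List<String> logNames) {
    int closed = 0;
    for (String logName : logNames) {
      try {
        close(logName);  // stand-in for the ChangeState request shown above
        closed++;
      } catch (IOException e) {
        LOG.warning("Failed to close log " + logName + ", continuing: " + e);
      }
    }
    return closed;
  }

  // Toy close operation that fails for an empty log name.
  void close(String logName) throws IOException {
    if (logName.isEmpty()) {
      throw new IOException("cannot close log with empty name");
    }
  }
}
```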


> Detect node failures and close the log to prevent additional writes
> ---
>
> Key: RATIS-556
> URL: https://issues.apache.org/jira/browse/RATIS-556
> Project: Ratis
>  Issue Type: Improvement
>Reporter: Rajeshbabu Chintaguntla
>Assignee: Rajeshbabu Chintaguntla
>Priority: Major
> Attachments: RATIS-556-wip.patch, RATIS-556_v1.patch, 
> RATIS-556_v2.patch
>
>
> Currently there is no way to detect node failures at the master log servers 
> and add new nodes to the group serving the log. We need to analyze how Ozone 
> handles this case.





[jira] [Commented] (RATIS-556) Detect node failures and close the log to prevent additional writes

2019-08-27 Thread Rajeshbabu Chintaguntla (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16917083#comment-16917083
 ] 

Rajeshbabu Chintaguntla commented on RATIS-556:
---

[~elserj] [~an...@apache.org] Uploaded a patch that adds an inverted index 
mapping each peer to its logs and closes the logs when the peer is down.

> Detect node failures and close the log to prevent additional writes
> ---
>
> Key: RATIS-556
> URL: https://issues.apache.org/jira/browse/RATIS-556
> Project: Ratis
>  Issue Type: Improvement
>Reporter: Rajeshbabu Chintaguntla
>Assignee: Rajeshbabu Chintaguntla
>Priority: Major
> Attachments: RATIS-556-wip.patch, RATIS-556_v1.patch, 
> RATIS-556_v2.patch
>
>
> Currently there is no way to detect node failures at the master log servers 
> and add new nodes to the group serving the log. We need to analyze how Ozone 
> handles this case.





[jira] [Updated] (RATIS-556) Detect node failures and close the log to prevent additional writes

2019-08-27 Thread Rajeshbabu Chintaguntla (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajeshbabu Chintaguntla updated RATIS-556:
--
Attachment: RATIS-556_v2.patch

> Detect node failures and close the log to prevent additional writes
> ---
>
> Key: RATIS-556
> URL: https://issues.apache.org/jira/browse/RATIS-556
> Project: Ratis
>  Issue Type: Improvement
>Reporter: Rajeshbabu Chintaguntla
>Assignee: Rajeshbabu Chintaguntla
>Priority: Major
> Attachments: RATIS-556-wip.patch, RATIS-556_v1.patch, 
> RATIS-556_v2.patch
>
>
> Currently there is no way to detect node failures at the master log servers 
> and add new nodes to the group serving the log. We need to analyze how Ozone 
> handles this case.





[jira] [Commented] (RATIS-661) Add call in state machine to handle group removal

2019-08-27 Thread Tsz Wo Nicholas Sze (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16917069#comment-16917069
 ] 

Tsz Wo Nicholas Sze commented on RATIS-661:
---

>  Since the impl is removed earlier, RaftServer#getGroupIds would not give the 
>corresponding groupId ...

When the group is being removed, it is correct for RaftServer#getGroupIds 
not to return that id.  The Ozone datanode could use notifyGroupRemove() to 
detect when the server impl is shut down.

If the group is not removed from the map at the beginning, new calls, 
including client requests and another groupRemoveAsync(..) call, can still 
arrive, creating a race condition.
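The remove-first reasoning can be illustrated with a small sketch (not Ratis code): ConcurrentHashMap.remove(..) hands the entry to exactly one caller, so two concurrent removal calls cannot both shut down the same impl.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch of why removing the group from the map up front
// avoids the race between concurrent groupRemoveAsync-style calls.
class GroupMapSketch {
  final ConcurrentHashMap<String, Object> impls = new ConcurrentHashMap<>();
  final AtomicInteger shutdowns = new AtomicInteger();

  void removeGroup(String groupId) {
    // remove(..) is atomic: only one caller gets the non-null impl,
    // any concurrent second caller sees null and does nothing.
    Object impl = impls.remove(groupId);
    if (impl != null) {
      shutdowns.incrementAndGet();  // stand-in for impl.shutdown(..)
    }
  }
}
```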

 

> Add call in state machine to handle group removal
> -
>
> Key: RATIS-661
> URL: https://issues.apache.org/jira/browse/RATIS-661
> Project: Ratis
>  Issue Type: Bug
>Reporter: Lokesh Jain
>Assignee: Lokesh Jain
>Priority: Major
> Attachments: RATIS-661.001.patch, RATIS-661.002.patch, 
> RATIS-661.003.patch, RATIS-661.004.patch
>
>
> Currently during RaftServerProxy#groupRemoveAsync there is no way for 
> stateMachine to know that the RaftGroup will be removed. This Jira aims to 
> add a call in the stateMachine to handle group removal.
> It also changes the logic of groupRemoval api to remove the RaftServerImpl 
> from the RaftServerProxy#impls map after the shutdown is complete. This is 
> required to synchronize the removal with the corresponding api of 
> RaftServer#getGroupIds. RaftServer#getGroupIds uses the RaftServerProxy#impls 
> map to get the groupIds.





[jira] [Commented] (RATIS-661) Add call in state machine to handle group removal

2019-08-27 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16917067#comment-16917067
 ] 

Hadoop QA commented on RATIS-661:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
56s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
26s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
57s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
19s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
44s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
6s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 15m 27s{color} 
| {color:red} root in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 24m 52s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | ratis.netty.TestLogAppenderWithNetty |
|   | ratis.server.simulation.TestRaftStateMachineExceptionWithSimulatedRpc |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 Image:yetus/ratis:date2019-08-27 |
| JIRA Issue | RATIS-661 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12978677/RATIS-661.004.patch |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  
checkstyle  compile  |
| uname | Linux 2a046e678ea8 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-RATIS-Build/yetus-personality.sh
 |
| git revision | master / 021165f |
| maven | version: Apache Maven 3.6.0 
(97c98ec64a1fdfee7767ce5ffb20918da4f719f3; 2018-10-24T18:41:47Z) |
| Default Java | 1.8.0_222 |
| unit | 
https://builds.apache.org/job/PreCommit-RATIS-Build/944/artifact/out/patch-unit-root.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-RATIS-Build/944/testReport/ |
| Max. process+thread count | 4208 (vs. ulimit of 5000) |
| modules | C: ratis-server ratis-test U: . |
| Console output | 
https://builds.apache.org/job/PreCommit-RATIS-Build/944/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |





> Add call in state machine to handle group removal
> -
>
> Key: RATIS-661
> URL: https://issues.apache.org/jira/browse/RATIS-661
> Project: Ratis
>  Issue Type: Bug
>Reporter: Lokesh Jain
>Assignee: Lokesh Jain
>Priority: Major
> Attachments: 

[jira] [Commented] (RATIS-661) Add call in state machine to handle group removal

2019-08-27 Thread Lokesh Jain (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16917059#comment-16917059
 ] 

Lokesh Jain commented on RATIS-661:
---

[~szetszwo] Thanks for reviewing the patch!

| Why change remove(..) to get(..) below?

Since the impl is removed earlier, RaftServer#getGroupIds would not return the 
corresponding groupId even though the group has not yet been removed. 
RaftServer#getGroupIds is used in the Ozone datanode to know whether a pipeline 
exists. This can lead to a race condition, as the pipeline may still be active 
even though it is reported as non-existent.

| Just make the call there as below.

I was thinking of keeping it this way so that we notify after all the 
transactions have been applied.
{code:java}
impl.shutdown(deleteDirectory);
impl.getStateMachine().notifyGroupRemove();{code}

> Add call in state machine to handle group removal
> -
>
> Key: RATIS-661
> URL: https://issues.apache.org/jira/browse/RATIS-661
> Project: Ratis
>  Issue Type: Bug
>Reporter: Lokesh Jain
>Assignee: Lokesh Jain
>Priority: Major
> Attachments: RATIS-661.001.patch, RATIS-661.002.patch, 
> RATIS-661.003.patch, RATIS-661.004.patch
>
>
> Currently during RaftServerProxy#groupRemoveAsync there is no way for 
> stateMachine to know that the RaftGroup will be removed. This Jira aims to 
> add a call in the stateMachine to handle group removal.
> It also changes the logic of groupRemoval api to remove the RaftServerImpl 
> from the RaftServerProxy#impls map after the shutdown is complete. This is 
> required to synchronize the removal with the corresponding api of 
> RaftServer#getGroupIds. RaftServer#getGroupIds uses the RaftServerProxy#impls 
> map to get the groupIds.





[jira] [Moved] (RATIS-668) Fix NOTICE file

2019-08-27 Thread Arpit Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal moved HDDS-2046 to RATIS-668:
---

  Key: RATIS-668  (was: HDDS-2046)
 Target Version/s: 0.4.0  (was: 0.4.1)
Affects Version/s: (was: 0.4.1)
   0.4.0
 Workflow: no-reopen-closed, patch-avail  (was: patch-available, 
re-open possible)
  Project: Ratis  (was: Hadoop Distributed Data Store)

> Fix NOTICE file
> ---
>
> Key: RATIS-668
> URL: https://issues.apache.org/jira/browse/RATIS-668
> Project: Ratis
>  Issue Type: Bug
>Affects Versions: 0.4.0
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
>Priority: Blocker
>
> The NOTICE file needs to be updated based on Justin's comments here:
>  
> [https://mail-archives.apache.org/mod_mbox/incubator-general/201908.mbox/%3C8EA21F57-A972-4CBE-AC2F-D3830FE6BDB4%40classsoftware.com%3E]
>  
>  





[jira] [Commented] (RATIS-661) Add call in state machine to handle group removal

2019-08-27 Thread Tsz Wo Nicholas Sze (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16916965#comment-16916965
 ] 

Tsz Wo Nicholas Sze commented on RATIS-661:
---

[~ljain], thanks for working on this.
- Why change remove(..) to get(..) below?  It could cause a race condition 
when there are multiple groupRemoveAsync(..) calls.
{code}
 }
-final CompletableFuture<RaftServerImpl> f = impls.remove(groupId);
+final CompletableFuture<RaftServerImpl> f = impls.get(groupId);
 if (f == null) {
{code}
- Let's call the new method notifyGroupRemove() in StateMachine.
- Let's not change shutdown(..), since the groupRemoval parameter is always 
false except in groupRemoveAsync(..). Just make the call there as below.
{code}
@@ -403,6 +403,7 @@ public class RaftServerProxy implements RaftServer {
 }
 return f.thenApply(impl -> {
   final Collection<CommitInfoProto> commitInfos = impl.getCommitInfos();
+  impl.getStateMachine().notifyGroupRemove();
   impl.shutdown(deleteDirectory);
   return new RaftClientReply(request, commitInfos);
 });
{code}


> Add call in state machine to handle group removal
> -
>
> Key: RATIS-661
> URL: https://issues.apache.org/jira/browse/RATIS-661
> Project: Ratis
>  Issue Type: Bug
>Reporter: Lokesh Jain
>Assignee: Lokesh Jain
>Priority: Major
> Attachments: RATIS-661.001.patch, RATIS-661.002.patch, 
> RATIS-661.003.patch, RATIS-661.004.patch
>
>
> Currently during RaftServerProxy#groupRemoveAsync there is no way for 
> stateMachine to know that the RaftGroup will be removed. This Jira aims to 
> add a call in the stateMachine to handle group removal.
> It also changes the logic of groupRemoval api to remove the RaftServerImpl 
> from the RaftServerProxy#impls map after the shutdown is complete. This is 
> required to synchronize the removal with the corresponding api of 
> RaftServer#getGroupIds. RaftServer#getGroupIds uses the RaftServerProxy#impls 
> map to get the groupIds.





[jira] [Assigned] (RATIS-543) Ratis GRPC client produces excessive logging while writing data.

2019-08-27 Thread Tsz Wo Nicholas Sze (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze reassigned RATIS-543:
-

Assignee: Tsz Wo Nicholas Sze

> Ratis GRPC client produces excessive logging while writing data.
> 
>
> Key: RATIS-543
> URL: https://issues.apache.org/jira/browse/RATIS-543
> Project: Ratis
>  Issue Type: Bug
>  Components: gRPC
>Reporter: Aravindan Vijayan
>Assignee: Tsz Wo Nicholas Sze
>Priority: Blocker
>  Labels: ozone
> Attachments: r485_20190827.patch
>
>
> {code}
> 19/05/03 10:23:31 INFO client.GrpcClientProtocolClient: 
> client-FD23551CACEE->51711703-9f9d-4c79-bfb1-38726f0059da: receive 
> RaftClientReply:client-FD23551CACEE->51711703-9f9d-4c79-bfb1-38726f0059da@group-1EADCA052664,
>  cid=1352, SUCCESS, logIndex=15195,
>  commits[51711703-9f9d-4c79-bfb1-38726f0059da:c15201, 
> 0beac0f1-af74-43ac-ba73-0a92ecb9f0ae:c15189, 
> aaf673a3-95ac-43aa-8614-b1a324142430:c15186]
> 19/05/03 10:23:31 INFO client.GrpcClientProtocolClient: 
> client-FD23551CACEE->51711703-9f9d-4c79-bfb1-38726f0059da: receive 
> RaftClientReply:client-FD23551CACEE->51711703-9f9d-4c79-bfb1-38726f0059da@group-1EADCA052664,
>  cid=1355, SUCCESS, logIndex=15196,
>  commits[51711703-9f9d-4c79-bfb1-38726f0059da:c15201, 
> 0beac0f1-af74-43ac-ba73-0a92ecb9f0ae:c15189, 
> aaf673a3-95ac-43aa-8614-b1a324142430:c15186]
> 19/05/03 10:23:31 INFO client.GrpcClientProtocolClient: 
> client-FD23551CACEE->51711703-9f9d-4c79-bfb1-38726f0059da: receive 
> RaftClientReply:client-FD23551CACEE->51711703-9f9d-4c79-bfb1-38726f0059da@group-1EADCA052664,
>  cid=1357, SUCCESS, logIndex=15197,
>  commits[51711703-9f9d-4c79-bfb1-38726f0059da:c15201, 
> 0beac0f1-af74-43ac-ba73-0a92ecb9f0ae:c15189, 
> aaf673a3-95ac-43aa-8614-b1a324142430:c15186]
> 19/05/03 10:23:31 INFO client.GrpcClientProtocolClient: 
> client-C46A037579AA->5a076d87-abf9-4ade-ae37-adab741d99a6: receive 
> RaftClientReply:client-C46A037579AA->5a076d87-abf9-4ade-ae37-adab741d99a6@group-AE803AF42C5D,
>  cid=1370, SUCCESS, logIndex=0, 
> commits[5a076d87-abf9-4ade-ae37-adab741d99a6:c16423, 
> 6e21905d-9796-4248-834e-ed97ea6763ef:c16422, 
> 34e8d6e5-456f-4e2a-99a5-4f21fd9c4a7e:c16423]
> 19/05/03 10:23:31 INFO client.GrpcClientProtocolClient: 
> client-EBF618C3F968->a5729949-67f1-496e-a0d3-1bfc0e139836: receive 
> RaftClientReply:client-EBF618C3F968->a5729949-67f1-496e-a0d3-1bfc0e139836@group-4E41299EA191,
>  cid=1376, SUCCESS, logIndex=0, com
> mits[a5729949-67f1-496e-a0d3-1bfc0e139836:c4764, 
> 111d4c23-756f-4c8a-a48d-aa2a327a5179:c4764, 
> 287eccfb-8461-419a-8732-529d042380b3:c4764]
> 19/05/03 10:23:31 INFO client.GrpcClientProtocolClient: 
> client-4D5E3CDC8889->0bb45975-b0d2-499e-85cc-22ea22c57ecb: receive 
> RaftClientReply:client-4D5E3CDC8889->0bb45975-b0d2-499e-85cc-22ea22c57ecb@group-D1BB7F32F754,
>  cid=1382, FAILED org.apache.ratis.
> protocol.NotLeaderException: Server 0bb45975-b0d2-499e-85cc-22ea22c57ecb is 
> not the leader (f1a756c3-6b42-4ece-8093-dbcdac5f8d5b:10.17.200.18:9858). 
> Request must be sent to leader., logIndex=0, 
> commits[0bb45975-b0d2-499e-85cc-22ea22c57ecb:c15358, 6c7a
> 780f-5474-49da-b880-3eaf69d9d83d:c15358, 
> f1a756c3-6b42-4ece-8093-dbcdac5f8d5b:c15358]
> 19/05/03 10:23:31 INFO client.GrpcClientProtocolClient: 
> client-FD23551CACEE->51711703-9f9d-4c79-bfb1-38726f0059da: receive 
> RaftClientReply:client-FD23551CACEE->51711703-9f9d-4c79-bfb1-38726f0059da@group-1EADCA052664,
>  cid=1359, SUCCESS, logIndex=15208, 
> commits[51711703-9f9d-4c79-bfb1-38726f0059da:c15210, 
> 0beac0f1-af74-43ac-ba73-0a92ecb9f0ae:c15201, 
> aaf673a3-95ac-43aa-8614-b1a324142430:c15189]
> 19/05/03 10:23:31 INFO client.GrpcClientProtocolClient: 
> client-FD23551CACEE->51711703-9f9d-4c79-bfb1-38726f0059da: receive 
> RaftClientReply:client-FD23551CACEE->51711703-9f9d-4c79-bfb1-38726f0059da@group-1EADCA052664,
>  cid=1362, SUCCESS, logIndex=15209, 
> commits[51711703-9f9d-4c79-bfb1-38726f0059da:c15210, 
> 0beac0f1-af74-43ac-ba73-0a92ecb9f0ae:c15201, 
> aaf673a3-95ac-43aa-8614-b1a324142430:c15189]
> 19/05/03 10:23:31 INFO client.GrpcClientProtocolClient: 
> client-FD23551CACEE->51711703-9f9d-4c79-bfb1-38726f0059da: receive 
> RaftClientReply:client-FD23551CACEE->51711703-9f9d-4c79-bfb1-38726f0059da@group-1EADCA052664,
>  cid=1363, SUCCESS, logIndex=15210, 
> commits[51711703-9f9d-4c79-bfb1-38726f0059da:c15210, 
> 0beac0f1-af74-43ac-ba73-0a92ecb9f0ae:c15201, 
> aaf673a3-95ac-43aa-8614-b1a324142430:c15189]
> 19/05/03 10:23:32 INFO client.GrpcClientProtocolClient: 
> client-FD23551CACEE->51711703-9f9d-4c79-bfb1-38726f0059da: receive 
> RaftClientReply:client-FD23551CACEE->51711703-9f9d-4c79-bfb1-38726f0059da@group-1EADCA052664,
>  cid=1371, SUCCESS, logIndex=15211, 
> 

[jira] [Updated] (RATIS-543) Ratis GRPC client produces excessive logging while writing data.

2019-08-27 Thread Tsz Wo Nicholas Sze (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated RATIS-543:
--
Component/s: gRPC

r543_20190827.patch: change the log level from INFO to TRACE.
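The idea of the patch above (demoting the chatty per-reply client log from INFO to TRACE) can be sketched with JDK-only logging. The class and method names below are hypothetical stand-ins, not Ratis code, and java.util.logging's FINEST level plays the role of TRACE:

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class TraceLogDemo {
    private static final Logger LOG = Logger.getLogger("GrpcClientProtocolClientDemo");

    static int repliesLogged = 0;

    // Per-reply messages are demoted to FINEST (the JUL analogue of TRACE),
    // so they are suppressed unless trace-level logging is explicitly enabled.
    static void onReply(long cid, long logIndex) {
        if (LOG.isLoggable(Level.FINEST)) { // cheap guard: skip message formatting
            repliesLogged++;
            LOG.finest("receive RaftClientReply: cid=" + cid + ", logIndex=" + logIndex);
        }
    }

    public static void main(String[] args) {
        LOG.setLevel(Level.INFO); // typical production level
        for (long cid = 0; cid < 1000; cid++) {
            onReply(cid, cid + 15000);
        }
        // None of the 1000 per-reply messages are formatted at INFO level.
        System.out.println("messages formatted at INFO level: " + repliesLogged);
    }
}
```

With the guard in place, a busy write workload no longer pays the string-formatting cost of one log line per reply unless an operator turns trace logging on.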

> Ratis GRPC client produces excessive logging while writing data.
> 
>
> Key: RATIS-543
> URL: https://issues.apache.org/jira/browse/RATIS-543
> Project: Ratis
>  Issue Type: Bug
>  Components: gRPC
>Reporter: Aravindan Vijayan
>Priority: Blocker
>  Labels: ozone
> Attachments: r485_20190827.patch
>

[jira] [Updated] (RATIS-543) Ratis GRPC client produces excessive logging while writing data.

2019-08-27 Thread Tsz Wo Nicholas Sze (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated RATIS-543:
--
Attachment: r485_20190827.patch

> Ratis GRPC client produces excessive logging while writing data.
> 
>
> Key: RATIS-543
> URL: https://issues.apache.org/jira/browse/RATIS-543
> Project: Ratis
>  Issue Type: Bug
>Reporter: Aravindan Vijayan
>Priority: Blocker
>  Labels: ozone
> Attachments: r485_20190827.patch
>

[jira] [Commented] (RATIS-485) Load Generator OOMs if Ratis Unavailable

2019-08-27 Thread Tsz Wo Nicholas Sze (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16916920#comment-16916920
 ] 

Tsz Wo Nicholas Sze commented on RATIS-485:
---

Is the test creating a lot of RaftClients? Each client has its own 
TimeoutScheduler, which may cause the OOM. Let's make the scheduler static to 
see if that fixes the OOM: r485_20190827.patch
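The suggestion above can be sketched with plain JDK concurrency. SharedSchedulerDemo and onTimeout are hypothetical stand-ins for Ratis's TimeoutScheduler, not its actual API; the point is that one shared scheduler bounds thread creation no matter how many clients retry:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class SharedSchedulerDemo {
    // One static scheduler shared by all clients, instead of one per client.
    // A per-client scheduler means each retrying client may spawn its own
    // timer thread; thousands of clients can then exhaust native threads.
    private static final ScheduledExecutorService SCHEDULER =
        Executors.newScheduledThreadPool(1, r -> {
            Thread t = new Thread(r, "shared-timeout-scheduler");
            t.setDaemon(true);
            return t;
        });

    static final AtomicInteger fired = new AtomicInteger();

    // Hypothetical stand-in for a client scheduling a retry timeout.
    static void onTimeout(Runnable task, long delayMs) {
        SCHEDULER.schedule(task, delayMs, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws Exception {
        // Simulate many "clients" scheduling retries; thread count stays bounded.
        for (int i = 0; i < 1000; i++) {
            onTimeout(fired::incrementAndGet, 1);
        }
        Thread.sleep(500); // let the scheduled tasks run
        System.out.println("tasks fired=" + fired.get());
    }
}
```

With a shared daemon scheduler, a retry storm queues tasks instead of spawning threads, which is the failure mode behind "unable to create new native thread" in the stack trace below.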

> Load Generator OOMs if Ratis Unavailable
> 
>
> Key: RATIS-485
> URL: https://issues.apache.org/jira/browse/RATIS-485
> Project: Ratis
>  Issue Type: Bug
>  Components: examples
>Reporter: Clay B.
>Priority: Trivial
> Attachments: loadgen.log, r485_20190827.patch
>
>
> Running the load generator without a Ratis cluster (e.g. spurious node IPs) 
> results in an OOM.
> If one has a single Ratis server it tries seemingly indefinitely:
> {code:java}
> vagrant@ratis-server:~/incubator-ratis$ 
> ./ratis-examples/src/main/bin/client.sh filestore loadgen --size 1048576 
> --numFiles 100 --peers n0:127.0.0.1:1{code}
> If one has two Ratis servers it OOMs:
> {code:java}
> vagrant@ratis-server:~/incubator-ratis$ 
> ./ratis-examples/src/main/bin/client.sh filestore loadgen --size 1048576 
> --numFiles 100 --peers n0:127.0.0.1:1,n1:127.0.0.1:2
> [...]
> 1/787867107@5e5792a0 with java.util.concurrent.CompletionException: 
> java.io.IOException: 
> org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io 
> exception
> 2019-02-14 07:47:22 DEBUG RaftClient:417 - client-272A2E13A5DD: suggested new 
> leader: null. Failed 
> RaftClientRequest:client-272A2E13A5DD->n1@group-6F7570313233, cid=0, seq=0 
> RW, 
> org.apache.ratis.examples.filestore.FileStoreClient$$Lambda$41/787867107@5e5792a0
>  with java.io.IOException: 
> org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io 
> exception
> 2019-02-14 07:47:22 DEBUG RaftClient:437 - client-272A2E13A5DD: change Leader 
> from n1 to n0
> 2019-02-14 07:47:22 DEBUG RaftClient:291 - schedule attempt #10740 with 
> policy RetryForeverNoSleep for 
> RaftClientRequest:client-272A2E13A5DD->n1@group-6F7570313233, cid=0, seq=0 
> RW, 
> org.apache.ratis.examples.filestore.FileStoreClient$$Lambda$41/787867107@5e5792a0
> 2019-02-14 07:47:22 DEBUG RaftClient:323 - client-272A2E13A5DD: send* 
> RaftClientRequest:client-272A2E13A5DD->n0@group-6F7570313233, cid=0, seq=0 
> RW, 
> org.apache.ratis.examples.filestore.FileStoreClient$$Lambda$41/787867107@5e5792a0
> 2019-02-14 07:47:22 DEBUG RaftClient:338 - client-272A2E13A5DD: Failed 
> RaftClientRequest:client-272A2E13A5DD->n0@group-6F7570313233, cid=0, seq=0 
> RW, 
> org.apache.ratis.examples.filestore.FileStoreClient$$Lambda$41/787867107@5e5792a0
>  with java.util.concurrent.CompletionException: java.lang.OutOfMemoryError: 
> unable to create new native thread
> Exception in thread "main" java.util.concurrent.CompletionException: 
> java.lang.OutOfMemoryError: unable to create new native thread
>     at 
> org.apache.ratis.client.impl.RaftClientImpl.lambda$sendRequestAsync$14(RaftClientImpl.java:349)
>     at 
> java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:870)
>     at 
> java.util.concurrent.CompletableFuture.uniExceptionallyStage(CompletableFuture.java:884)
>     at 
> java.util.concurrent.CompletableFuture.exceptionally(CompletableFuture.java:2196)
>     at 
> org.apache.ratis.client.impl.RaftClientImpl.sendRequestAsync(RaftClientImpl.java:334)
>     at 
> org.apache.ratis.client.impl.RaftClientImpl.sendRequestWithRetryAsync(RaftClientImpl.java:286)
>     at 
> org.apache.ratis.util.SlidingWindow$Client.sendOrDelayRequest(SlidingWindow.java:243)
>     at 
> org.apache.ratis.util.SlidingWindow$Client.retry(SlidingWindow.java:259)
>     at 
> org.apache.ratis.client.impl.RaftClientImpl.lambda$null$10(RaftClientImpl.java:293)
>     at 
> org.apache.ratis.util.TimeoutScheduler.lambda$onTimeout$0(TimeoutScheduler.java:85)
>     at 
> org.apache.ratis.util.TimeoutScheduler.lambda$onTimeout$1(TimeoutScheduler.java:104)
>     at org.apache.ratis.util.LogUtils.runAndLog(LogUtils.java:50)
>     at org.apache.ratis.util.LogUtils$1.run(LogUtils.java:91)
>     at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>     at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>     at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  

[jira] [Updated] (RATIS-485) Load Generator OOMs if Ratis Unavailable

2019-08-27 Thread Tsz Wo Nicholas Sze (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated RATIS-485:
--
Attachment: r485_20190827.patch

> Load Generator OOMs if Ratis Unavailable
> 
>
> Key: RATIS-485
> URL: https://issues.apache.org/jira/browse/RATIS-485
> Project: Ratis
>  Issue Type: Bug
>  Components: examples
>Reporter: Clay B.
>Priority: Trivial
> Attachments: loadgen.log, r485_20190827.patch
>
>
> Running the load generator without a Ratis cluster (e.g. spurious node IPs) 
> results in an OOM.

[jira] [Updated] (RATIS-651) Add metrics related to leaderElection and HeartBeat

2019-08-27 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan updated RATIS-651:

Attachment: RATIS-651-003.patch

> Add metrics related to leaderElection and HeartBeat
> ---
>
> Key: RATIS-651
> URL: https://issues.apache.org/jira/browse/RATIS-651
> Project: Ratis
>  Issue Type: Sub-task
>  Components: server
>Affects Versions: 0.4.0
>Reporter: Shashikant Banerjee
>Assignee: Aravindan Vijayan
>Priority: Critical
> Attachments: RATIS-651-000.patch, RATIS-651-001.patch, 
> RATIS-651-002.patch, RATIS-651-003.patch
>
>
> The following metrics would be helpful to track leader election events 
> and timeouts:
>  
> |numLeaderElections|Number of leader elections since the creation of the 
> ratis pipeline|
> |numLeaderElectionTimeouts|Number of leader election timeouts or failures|
> |LeaderElectionCompletionLatency|Time required to complete a leader election|
> |MaxNoLeaderInterval|Max time during which there has been no elected leader 
> in the raft ring|
> |heartBeatMissCount|Number of times a heartbeat response is missed from a server|



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (RATIS-651) Add metrics related to leaderElection and HeartBeat

2019-08-27 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16916900#comment-16916900
 ] 

Hadoop QA commented on RATIS-651:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  2m 
14s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
56s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
27s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
59s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
21s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
45s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
6s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 15m 51s{color} 
| {color:red} root in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 27m 15s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | ratis.netty.TestRaftSnapshotWithNetty |
|   | ratis.netty.TestRaftStateMachineExceptionWithNetty |
|   | ratis.grpc.TestRaftSnapshotWithGrpc |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 Image:yetus/ratis:date2019-08-27 |
| JIRA Issue | RATIS-651 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12978641/RATIS-651-002.patch |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  
checkstyle  compile  |
| uname | Linux 620c4f44fc38 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-RATIS-Build/yetus-personality.sh
 |
| git revision | master / 021165f |
| maven | version: Apache Maven 3.6.0 
(97c98ec64a1fdfee7767ce5ffb20918da4f719f3; 2018-10-24T18:41:47Z) |
| Default Java | 1.8.0_222 |
| unit | 
https://builds.apache.org/job/PreCommit-RATIS-Build/943/artifact/out/patch-unit-root.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-RATIS-Build/943/testReport/ |
| Max. process+thread count | 2030 (vs. ulimit of 5000) |
| modules | C: ratis-server ratis-test U: . |
| Console output | 
https://builds.apache.org/job/PreCommit-RATIS-Build/943/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Add metrics related to leaderElection and HeartBeat
> ---
>
> Key: RATIS-651
> URL: https://issues.apache.org/jira/browse/RATIS-651
> Project: Ratis
>  Issue Type: Sub-task
>  Components: server
>Affects Versions: 0.4.0
>Reporter: Shashikant 

[jira] [Commented] (RATIS-651) Add metrics related to leaderElection and HeartBeat

2019-08-27 Thread Shashikant Banerjee (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16916886#comment-16916886
 ] 

Shashikant Banerjee commented on RATIS-651:
---

Thanks [~avijayan] for working on this. Overall, the patch looks good to me. 
Can we initialize and aggregate the heartBeatMetrics in LeaderState 
instead of the LogAppender class?
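The aggregation suggested above can be sketched as follows. LeaderMetrics and Appender are hypothetical stand-ins for LeaderState and LogAppender, assuming a simple AtomicLong counter rather than whatever metrics library the patch actually uses:

```java
import java.util.concurrent.atomic.AtomicLong;

public class HeartbeatMetricsDemo {
    // Shared leader-side metrics holder, standing in for LeaderState.
    static class LeaderMetrics {
        final AtomicLong heartBeatMissCount = new AtomicLong();
    }

    // Hypothetical per-follower appender (standing in for LogAppender) that
    // reports misses into the shared leader metrics, not a private counter.
    static class Appender {
        private final LeaderMetrics metrics;
        Appender(LeaderMetrics metrics) { this.metrics = metrics; }
        void onHeartbeatResponse(boolean received) {
            if (!received) {
                metrics.heartBeatMissCount.incrementAndGet();
            }
        }
    }

    // All appenders write into the same aggregated counter, so the leader
    // exposes one heartBeatMissCount across its followers.
    static long missesAcrossAppenders(boolean... responses) {
        LeaderMetrics metrics = new LeaderMetrics();
        for (boolean r : responses) {
            new Appender(metrics).onHeartbeatResponse(r);
        }
        return metrics.heartBeatMissCount.get();
    }

    public static void main(String[] args) {
        System.out.println("heartBeatMissCount="
            + missesAcrossAppenders(true, false, false));
    }
}
```

Keeping the counter in the leader-side state means the metric survives individual appender restarts and needs no per-follower summation at read time.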

> Add metrics related to leaderElection and HeartBeat
> ---
>
> Key: RATIS-651
> URL: https://issues.apache.org/jira/browse/RATIS-651
> Project: Ratis
>  Issue Type: Sub-task
>  Components: server
>Affects Versions: 0.4.0
>Reporter: Shashikant Banerjee
>Assignee: Aravindan Vijayan
>Priority: Critical
> Attachments: RATIS-651-000.patch, RATIS-651-001.patch, 
> RATIS-651-002.patch
>
>
> The following metrics would be helpful to track leader election events 
> and timeouts:
>  
> |numLeaderElections|Number of leader elections since the creation of the 
> ratis pipeline|
> |numLeaderElectionTimeouts|Number of leader election timeouts or failures|
> |LeaderElectionCompletionLatency|Time required to complete a leader election|
> |MaxNoLeaderInterval|Max time during which there has been no elected leader 
> in the raft ring|
> |heartBeatMissCount|Number of times a heartbeat response is missed from a server|



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (RATIS-661) Add call in state machine to handle group removal

2019-08-27 Thread Lokesh Jain (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lokesh Jain updated RATIS-661:
--
Attachment: RATIS-661.004.patch

> Add call in state machine to handle group removal
> -
>
> Key: RATIS-661
> URL: https://issues.apache.org/jira/browse/RATIS-661
> Project: Ratis
>  Issue Type: Bug
>Reporter: Lokesh Jain
>Assignee: Lokesh Jain
>Priority: Major
> Attachments: RATIS-661.001.patch, RATIS-661.002.patch, 
> RATIS-661.003.patch, RATIS-661.004.patch
>
>
> Currently, during RaftServerProxy#groupRemoveAsync there is no way for the 
> stateMachine to know that the RaftGroup will be removed. This Jira aims to 
> add a call in the stateMachine to handle group removal.
> It also changes the logic of the groupRemoval API to remove the RaftServerImpl 
> from the RaftServerProxy#impls map only after the shutdown is complete. This is 
> required to synchronize the removal with the corresponding 
> RaftServer#getGroupIds API, which uses the RaftServerProxy#impls 
> map to get the groupIds.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (RATIS-661) Add call in state machine to handle group removal

2019-08-27 Thread Lokesh Jain (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16916572#comment-16916572
 ] 

Lokesh Jain commented on RATIS-661:
---

[~msingh] Thanks for reviewing the patch! The v4 patch addresses your comments.

> Add call in state machine to handle group removal
> -
>
> Key: RATIS-661
> URL: https://issues.apache.org/jira/browse/RATIS-661
> Project: Ratis
>  Issue Type: Bug
>Reporter: Lokesh Jain
>Assignee: Lokesh Jain
>Priority: Major
> Attachments: RATIS-661.001.patch, RATIS-661.002.patch, 
> RATIS-661.003.patch, RATIS-661.004.patch
>
>
> Currently, during RaftServerProxy#groupRemoveAsync there is no way for the 
> stateMachine to know that the RaftGroup will be removed. This Jira aims to 
> add a call in the stateMachine to handle group removal.
> It also changes the logic of the groupRemoval API to remove the RaftServerImpl 
> from the RaftServerProxy#impls map only after the shutdown is complete. This is 
> required to synchronize the removal with the corresponding 
> RaftServer#getGroupIds API, which uses the RaftServerProxy#impls 
> map to get the groupIds.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (RATIS-661) Add call in state machine to handle group removal

2019-08-27 Thread Mukul Kumar Singh (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16916540#comment-16916540
 ] 

Mukul Kumar Singh commented on RATIS-661:
-

Thanks for working on this [~ljain]. The patch generally looks good to me.

Can we add the handleGroupRemove call in RaftServerImpl's shutdown after all 
the transactions have been applied and before deleting the directory?
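The ordering requested above can be sketched as a minimal sequence. All step names here are hypothetical labels for illustration, not the RaftServerImpl API: the state-machine hook must fire after pending transactions are applied and before the storage directory is deleted, with the impls-map removal last so getGroupIds stays consistent:

```java
import java.util.ArrayList;
import java.util.List;

public class GroupRemoveOrderDemo {
    // Hypothetical shutdown sequence for group removal. Returning the step
    // names makes the required ordering easy to inspect and assert on.
    static List<String> shutdownAndRemoveGroup(boolean deleteDirectory) {
        List<String> steps = new ArrayList<>();
        steps.add("applyPendingTransactions");       // finish applying the log
        steps.add("stateMachine.handleGroupRemove"); // hook fires before deletion
        if (deleteDirectory) {
            steps.add("deleteStorageDirectory");     // only if removal asked for it
        }
        steps.add("removeFromImplsMap");             // last, so getGroupIds is in sync
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(shutdownAndRemoveGroup(true));
    }
}
```

Firing the hook before directory deletion gives the state machine a last chance to read or archive its data; deferring the map removal until shutdown completes avoids a window where getGroupIds omits a group that is still live.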

> Add call in state machine to handle group removal
> -
>
> Key: RATIS-661
> URL: https://issues.apache.org/jira/browse/RATIS-661
> Project: Ratis
>  Issue Type: Bug
>Reporter: Lokesh Jain
>Assignee: Lokesh Jain
>Priority: Major
> Attachments: RATIS-661.001.patch, RATIS-661.002.patch, 
> RATIS-661.003.patch
>
>
> Currently, during RaftServerProxy#groupRemoveAsync there is no way for the 
> stateMachine to know that the RaftGroup will be removed. This Jira aims to 
> add a call in the stateMachine to handle group removal.
> It also changes the logic of the groupRemoval API to remove the RaftServerImpl 
> from the RaftServerProxy#impls map only after the shutdown is complete. This is 
> required to synchronize the removal with the corresponding 
> RaftServer#getGroupIds API, which uses the RaftServerProxy#impls 
> map to get the groupIds.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)