[jira] [Resolved] (RATIS-1006) Exclude netty.3.10 from Ratis dependencies

2020-07-17 Thread Mukul Kumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-1006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh resolved RATIS-1006.
--
Resolution: Fixed

Thanks for the review, [~ljain]. I have merged this.

> Exclude netty.3.10 from Ratis dependencies
> --
>
> Key: RATIS-1006
> URL: https://issues.apache.org/jira/browse/RATIS-1006
> Project: Ratis
>  Issue Type: Bug
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Exclude netty 3.10 from Ratis dependencies



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (RATIS-1006) Exclude netty.3.10 from Ratis dependencies

2020-07-17 Thread Mukul Kumar Singh (Jira)
Mukul Kumar Singh created RATIS-1006:


 Summary: Exclude netty.3.10 from Ratis dependencies
 Key: RATIS-1006
 URL: https://issues.apache.org/jira/browse/RATIS-1006
 Project: Ratis
  Issue Type: Bug
Reporter: Mukul Kumar Singh
Assignee: Mukul Kumar Singh


Exclude netty 3.10 from Ratis dependencies





[jira] [Assigned] (RATIS-981) Step-down stale leader in case of split-brain

2020-07-06 Thread Mukul Kumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh reassigned RATIS-981:
---

Assignee: Glen Geng  (was: Nanda kumar)

> Step-down stale leader in case of split-brain
> -
>
> Key: RATIS-981
> URL: https://issues.apache.org/jira/browse/RATIS-981
> Project: Ratis
>  Issue Type: Improvement
>Reporter: Nanda kumar
>Assignee: Glen Geng
>Priority: Major
>
> We should make sure that a stale leader steps down to the candidate state 
> before the next leader election.
> Proposal:
> In the heartbeat thread on the leader node, we should check whether the time 
> since each follower’s last response is less than the leader election timeout. 
> If that holds for a majority of the followers, the current leader is still 
> the active leader: a majority of the followers are heartbeating to it, so 
> there can’t be a new leader.
> If, for a majority of the followers, the time since the last response is 
> greater than the leader election timeout, the current leader should step down 
> and become a candidate.
> With this check, we can be sure that, in case of a network partition, the 
> current leader steps down and becomes a candidate before the new leader 
> election starts.
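
The check proposed above could be sketched roughly as follows. This is only an illustration and not Ratis code: the class StaleLeaderCheck, the method shouldStepDown and the way follower response times are passed in are all hypothetical. It also assumes that the leader counts itself towards the majority of the group, which the proposal does not spell out.

```java
import java.util.List;

/** Hypothetical helper; the names are illustrative and not part of the Ratis API. */
final class StaleLeaderCheck {
  private StaleLeaderCheck() {}

  /**
   * @param nowMs current time in milliseconds
   * @param followerLastResponseMs time of the last response from each follower, in milliseconds
   * @param electionTimeoutMs leader election timeout in milliseconds
   * @return true if the leader should step down and become a candidate
   */
  static boolean shouldStepDown(long nowMs, List<Long> followerLastResponseMs,
      long electionTimeoutMs) {
    // Assumption: the leader counts itself as one responsive member of the group.
    final int groupSize = followerLastResponseMs.size() + 1;
    int responsive = 1;
    for (long lastResponseMs : followerLastResponseMs) {
      if (nowMs - lastResponseMs < electionTimeoutMs) {
        responsive++;
      }
    }
    // Step down unless a strict majority of the group is still responsive.
    return responsive <= groupSize / 2;
  }
}
```

In a three-node group this keeps the leader as long as at least one follower has responded within the election timeout, and makes it step down once both followers have been silent for longer than that.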





[jira] [Commented] (RATIS-990) Update flatbuffers version

2020-06-29 Thread Mukul Kumar Singh (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148297#comment-17148297
 ] 

Mukul Kumar Singh commented on RATIS-990:
-

We can use the createString function, which provides the same functionality as 
createByteVector.

{code}
+int dataContentOffset = builder.createString(
+p.getMessage().getContent().asReadOnlyByteBuffer());
{code}

> Update flatbuffers version
> --
>
> Key: RATIS-990
> URL: https://issues.apache.org/jira/browse/RATIS-990
> Project: Ratis
>  Issue Type: Improvement
>  Components: thirdparty
>Reporter: Tsz-wo Sze
>Assignee: Ansh Khanna
>Priority: Major
>
> https://github.com/apache/incubator-ratis-thirdparty/blob/38b1c0c4201ec0856aed3230fd16ba26cb929e57/pom.xml#L76-L77
> {code}
> 
> 1.11.0
> {code}
> Currently, the flatbuffers version is 1.11.0 in ratis-thirdparty.  However, 
> 1.11.0 does not seem to support methods that take a ByteBuffer, so it does 
> not support zero-copy buffers.  We should update it to 1.12.0 or above.





[jira] [Resolved] (RATIS-963) Update gRPC to 1.29.0

2020-06-23 Thread Mukul Kumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh resolved RATIS-963.
-
Resolution: Fixed

Thanks for the reviews, [~arp] and [~dineshchitlangia]. I have merged this.

> Update gRPC to 1.29.0
> -
>
> Key: RATIS-963
> URL: https://issues.apache.org/jira/browse/RATIS-963
> Project: Ratis
>  Issue Type: Bug
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Update gRPC to 1.29.0





[jira] [Assigned] (RATIS-985) Reduce notify() in GrpcLogAppender::onNextImpl

2020-06-23 Thread Mukul Kumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh reassigned RATIS-985:
---

Assignee: Tsz-wo Sze

> Reduce notify() in GrpcLogAppender::onNextImpl
> --
>
> Key: RATIS-985
> URL: https://issues.apache.org/jira/browse/RATIS-985
> Project: Ratis
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Tsz-wo Sze
>Priority: Major
> Attachments: Screenshot 2020-06-23 at 12.33.41 PM.png
>
>
> !Screenshot 2020-06-23 at 12.33.41 PM.png|width=979,height=211!
> notify() can be called conditionally rather than on every call; this offers 
> a way to reduce synchronization contention.
>  
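
One way to notify only conditionally is sketched below. This is a generic illustration and not the GrpcLogAppender code: the class ConditionalNotifier and all of its members are hypothetical. The assumption is that the notifying side may skip the synchronized block whenever no thread is registered as waiting, so the common path does not touch the monitor at all.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch; the names are illustrative and not part of the Ratis API. */
final class ConditionalNotifier {
  private final Object lock = new Object();
  private final AtomicBoolean ready = new AtomicBoolean();
  private final AtomicInteger waiters = new AtomicInteger();
  private final AtomicInteger notifyCalls = new AtomicInteger();

  /** Producer side: publish the event, but enter the monitor only if a thread is waiting. */
  void signal() {
    ready.set(true);
    if (waiters.get() > 0) {
      synchronized (lock) {
        notifyCalls.incrementAndGet();
        lock.notifyAll();
      }
    }
  }

  /** Consumer side: returns true if an event was consumed before the timeout. */
  boolean await(long timeoutMs) {
    final long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
    synchronized (lock) {
      // Register as a waiter before checking the flag, so a concurrent signal()
      // either sees this waiter or has already set the flag.
      waiters.incrementAndGet();
      try {
        while (!ready.compareAndSet(true, false)) {
          final long remainingMs = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
          if (remainingMs <= 0) {
            return false;
          }
          lock.wait(remainingMs);
        }
        return true;
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      } finally {
        waiters.decrementAndGet();
      }
    }
  }

  /** Number of times signal() actually entered the monitor and notified. */
  int notifyCalls() {
    return notifyCalls.get();
  }
}
```

The waiter count is updated before the flag is checked, and the flag is set before the waiter count is read, so a skipped notification cannot leave a waiter blocked on an event that has already been published.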





[jira] [Updated] (RATIS-971) use flatbuffers in Server to Server communication to avoid buffer copies

2020-06-09 Thread Mukul Kumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated RATIS-971:

Attachment: server.flat

> use flatbuffers in Server to Server communication to avoid buffer copies
> 
>
> Key: RATIS-971
> URL: https://issues.apache.org/jira/browse/RATIS-971
> Project: Ratis
>  Issue Type: Bug
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
> Attachments: server.flat
>
>
> Currently, Ratis uses protobufs to serialize/de-serialize data during 
> server/server communication. This causes multiple buffer copies. This can be 
> avoided by using flatbuffers. 
> The idea here is to use flatbuffers for server to server communication.





[jira] [Updated] (RATIS-970) use flatbuffers in Client to Datanode communication to avoid buffer copies

2020-06-09 Thread Mukul Kumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated RATIS-970:

Attachment: client.flat

> use flatbuffers in Client to Datanode communication to avoid buffer copies
> --
>
> Key: RATIS-970
> URL: https://issues.apache.org/jira/browse/RATIS-970
> Project: Ratis
>  Issue Type: Bug
>  Components: performance, protocol
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
> Attachments: client.flat
>
>
> Currently, Ratis uses protobufs to serialize/de-serialize data during client 
> send and also when it is received on the server. This causes multiple buffer 
> copies. This can be avoided by using flatbuffers. 





[jira] [Assigned] (RATIS-971) use flatbuffers in Server to Server communication to avoid buffer copies

2020-06-09 Thread Mukul Kumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh reassigned RATIS-971:
---

Assignee: Mukul Kumar Singh

> use flatbuffers in Server to Server communication to avoid buffer copies
> 
>
> Key: RATIS-971
> URL: https://issues.apache.org/jira/browse/RATIS-971
> Project: Ratis
>  Issue Type: Bug
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
>
> Currently, Ratis uses protobufs to serialize/de-serialize data during 
> server/server communication. This causes multiple buffer copies. This can be 
> avoided by using flatbuffers. 
> The idea here is to use flatbuffers for server to server communication.





[jira] [Created] (RATIS-971) use flatbuffers in Server to Server communication to avoid buffer copies

2020-06-09 Thread Mukul Kumar Singh (Jira)
Mukul Kumar Singh created RATIS-971:
---

 Summary: use flatbuffers in Server to Server communication to 
avoid buffer copies
 Key: RATIS-971
 URL: https://issues.apache.org/jira/browse/RATIS-971
 Project: Ratis
  Issue Type: Bug
Reporter: Mukul Kumar Singh


Currently, Ratis uses protobufs to serialize/de-serialize data during 
server/server communication. This causes multiple buffer copies. This can be 
avoided by using flatbuffers. 
The idea here is to use flatbuffers for server to server communication.





[jira] [Created] (RATIS-970) use flatbuffers in Client to Datanode communication to avoid buffer copies

2020-06-09 Thread Mukul Kumar Singh (Jira)
Mukul Kumar Singh created RATIS-970:
---

 Summary: use flatbuffers in Client to Datanode communication to 
avoid buffer copies
 Key: RATIS-970
 URL: https://issues.apache.org/jira/browse/RATIS-970
 Project: Ratis
  Issue Type: Bug
  Components: performance, protocol
Reporter: Mukul Kumar Singh
Assignee: Mukul Kumar Singh


Currently, Ratis uses protobufs to serialize/de-serialize data during client 
send and also when it is received on the server. This causes multiple buffer 
copies. This can be avoided by using flatbuffers. 





[jira] [Created] (RATIS-969) Add flatbuffers to Ratis thirdparty.

2020-06-08 Thread Mukul Kumar Singh (Jira)
Mukul Kumar Singh created RATIS-969:
---

 Summary: Add flatbuffers to Ratis thirdparty.
 Key: RATIS-969
 URL: https://issues.apache.org/jira/browse/RATIS-969
 Project: Ratis
  Issue Type: Bug
  Components: thirdparty
Reporter: Mukul Kumar Singh
Assignee: Mukul Kumar Singh


Add flatbuffers to Ratis thirdparty.





[jira] [Created] (RATIS-963) Update gRPC to 1.29.0

2020-05-31 Thread Mukul Kumar Singh (Jira)
Mukul Kumar Singh created RATIS-963:
---

 Summary: Update gRPC to 1.29.0
 Key: RATIS-963
 URL: https://issues.apache.org/jira/browse/RATIS-963
 Project: Ratis
  Issue Type: Bug
Reporter: Mukul Kumar Singh
Assignee: Mukul Kumar Singh


Update gRPC to 1.29.0





[jira] [Updated] (RATIS-962) Update thirdparty gRPC to 1.29.0

2020-05-31 Thread Mukul Kumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated RATIS-962:

Summary: Update thirdparty gRPC to 1.29.0  (was: Update gRPC to 1.29.0)

> Update thirdparty gRPC to 1.29.0
> 
>
> Key: RATIS-962
> URL: https://issues.apache.org/jira/browse/RATIS-962
> Project: Ratis
>  Issue Type: Bug
>  Components: thirdparty
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
>
> Update gRPC to 1.29.0





[jira] [Created] (RATIS-962) Update gRPC to 1.29.0

2020-05-31 Thread Mukul Kumar Singh (Jira)
Mukul Kumar Singh created RATIS-962:
---

 Summary: Update gRPC to 1.29.0
 Key: RATIS-962
 URL: https://issues.apache.org/jira/browse/RATIS-962
 Project: Ratis
  Issue Type: Bug
  Components: thirdparty
Reporter: Mukul Kumar Singh
Assignee: Mukul Kumar Singh


Update gRPC to 1.29.0





[jira] [Created] (RATIS-961) Add various profiles to MiniOzoneChaosCluster to run different modes

2020-05-31 Thread Mukul Kumar Singh (Jira)
Mukul Kumar Singh created RATIS-961:
---

 Summary: Add various profiles to MiniOzoneChaosCluster to run 
different modes
 Key: RATIS-961
 URL: https://issues.apache.org/jira/browse/RATIS-961
 Project: Ratis
  Issue Type: Bug
  Components: test
Reporter: Mukul Kumar Singh
Assignee: Mukul Kumar Singh


Add various profiles to MiniOzoneChaosCluster to run different modes. This will 
make it easy to run the different modes from the MiniOzoneChaosCluster shell 
script.






[jira] [Created] (RATIS-936) Fix ratis-hadoop with non shaded protobuf dependency

2020-05-11 Thread Mukul Kumar Singh (Jira)
Mukul Kumar Singh created RATIS-936:
---

 Summary: Fix ratis-hadoop with non shaded protobuf dependency
 Key: RATIS-936
 URL: https://issues.apache.org/jira/browse/RATIS-936
 Project: Ratis
  Issue Type: Bug
Reporter: Mukul Kumar Singh
Assignee: Mukul Kumar Singh


With RATIS-932, dependency on shaded Ratis thirdparty has been removed. This 
has caused the ratis-hadoop protocol to break.

This jira proposes to fix this by using protobuf 2.5.0 for the rpc 
communication and then converting to protobuf 3.5.0 at the rpc endpoint.





[jira] [Moved] (RATIS-934) Make ExceptionDependentRetry$Builder setters public

2020-05-11 Thread Mukul Kumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh moved HDDS-3565 to RATIS-934:
---

 Key: RATIS-934  (was: HDDS-3565)
Target Version/s: 0.6.0  (was: 0.6.0)
Workflow: no-reopen-closed, patch-avail  (was: patch-available, 
re-open possible)
 Project: Ratis  (was: Hadoop Distributed Data Store)

> Make ExceptionDependentRetry$Builder setters public
> ---
>
> Key: RATIS-934
> URL: https://issues.apache.org/jira/browse/RATIS-934
> Project: Ratis
>  Issue Type: Bug
>Reporter: Lokesh Jain
>Assignee: Lokesh Jain
>Priority: Major
>






[jira] [Resolved] (RATIS-932) Avoid usage of ratis-thirdparty-hadoop in ratis-hadoop module

2020-05-10 Thread Mukul Kumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh resolved RATIS-932.
-
Fix Version/s: 0.6.0
 Assignee: Mukul Kumar Singh
   Resolution: Fixed

Thanks for the review [~ljain] and [~szetszwo]. I have merged this.

> Avoid usage of ratis-thirdparty-hadoop in ratis-hadoop module
> -
>
> Key: RATIS-932
> URL: https://issues.apache.org/jira/browse/RATIS-932
> Project: Ratis
>  Issue Type: Bug
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
> Fix For: 0.6.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Ratis takes both a direct dependency on Hadoop and a transitive one via Ratis 
> thirdparty.
> This removes the dependency on Ratis third-party hadoop.





[jira] [Created] (RATIS-933) Remove Ratis-thirdparty hadoop module from Ratis Thirdparty

2020-05-07 Thread Mukul Kumar Singh (Jira)
Mukul Kumar Singh created RATIS-933:
---

 Summary: Remove Ratis-thirdparty hadoop module from Ratis 
Thirdparty
 Key: RATIS-933
 URL: https://issues.apache.org/jira/browse/RATIS-933
 Project: Ratis
  Issue Type: Bug
  Components: thirdparty
Reporter: Mukul Kumar Singh
Assignee: Mukul Kumar Singh


This jira removes the ratis-thirdparty hadoop module from Ratis thirdparty. This 
will simplify the dependency management in Ratis.





[jira] [Created] (RATIS-932) Avoid usage of ratis-thirdparty-hadoop in ratis-hadoop module

2020-05-07 Thread Mukul Kumar Singh (Jira)
Mukul Kumar Singh created RATIS-932:
---

 Summary: Avoid usage of ratis-thirdparty-hadoop in ratis-hadoop 
module
 Key: RATIS-932
 URL: https://issues.apache.org/jira/browse/RATIS-932
 Project: Ratis
  Issue Type: Bug
Reporter: Mukul Kumar Singh


Ratis takes both a direct dependency on Hadoop and a transitive one via Ratis 
thirdparty.
This removes the dependency on Ratis third-party hadoop.





[jira] [Created] (RATIS-922) Avoid Shading Hadoop in Ratis Thirdparty.

2020-05-05 Thread Mukul Kumar Singh (Jira)
Mukul Kumar Singh created RATIS-922:
---

 Summary: Avoid Shading Hadoop in Ratis Thirdparty.
 Key: RATIS-922
 URL: https://issues.apache.org/jira/browse/RATIS-922
 Project: Ratis
  Issue Type: Bug
  Components: thirdparty
Reporter: Mukul Kumar Singh
Assignee: Mukul Kumar Singh


Ratis takes a dependency on Hadoop both directly and transitively via Ratis 
third party. 
This becomes difficult to manage because of CVEs reported in different versions 
of Hadoop.





[jira] [Resolved] (RATIS-855) Release Ratis 0.4.0 Thirdparty

2020-04-28 Thread Mukul Kumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh resolved RATIS-855.
-
Fix Version/s: 0.6.0
   Resolution: Fixed

Ratis artifacts have been released and the binary can be found at 
https://dist.apache.org/repos/dist/release/incubator/ratis/thirdparty/. 
Resolving this as Fixed.

> Release Ratis 0.4.0 Thirdparty
> --
>
> Key: RATIS-855
> URL: https://issues.apache.org/jira/browse/RATIS-855
> Project: Ratis
>  Issue Type: Bug
>  Components: thirdparty
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
> Fix For: 0.6.0
>
>
> With RATIS-852 and RATIS-847 resolved, release 0.4.0 thirdparty.





[jira] [Resolved] (RATIS-910) Update Ratis-thirdparty to 0.4.0

2020-04-28 Thread Mukul Kumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh resolved RATIS-910.
-
Fix Version/s: 0.6.0
   Resolution: Fixed

The PR has been merged. Thanks [~ljain] for review and merge.

> Update Ratis-thirdparty to 0.4.0
> 
>
> Key: RATIS-910
> URL: https://issues.apache.org/jira/browse/RATIS-910
> Project: Ratis
>  Issue Type: Bug
>  Components: thirdparty
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
> Fix For: 0.6.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This jira tracks the update of Ratis thirdparty after the 0.4.0 release.





[jira] [Created] (RATIS-910) Update Ratis-thirdparty to 0.4.0

2020-04-28 Thread Mukul Kumar Singh (Jira)
Mukul Kumar Singh created RATIS-910:
---

 Summary: Update Ratis-thirdparty to 0.4.0
 Key: RATIS-910
 URL: https://issues.apache.org/jira/browse/RATIS-910
 Project: Ratis
  Issue Type: Bug
  Components: thirdparty
Reporter: Mukul Kumar Singh
Assignee: Mukul Kumar Singh


This jira tracks the update of Ratis thirdparty after the 0.4.0 release.





[jira] [Created] (RATIS-880) Update github description and disable merge options apart from Squash and merge

2020-04-24 Thread Mukul Kumar Singh (Jira)
Mukul Kumar Singh created RATIS-880:
---

 Summary: Update github description and disable merge options apart 
from Squash and merge
 Key: RATIS-880
 URL: https://issues.apache.org/jira/browse/RATIS-880
 Project: Ratis
  Issue Type: Bug
Reporter: Mukul Kumar Singh
Assignee: Mukul Kumar Singh


Update github description and disable merge options apart from Squash and merge





[jira] [Resolved] (RATIS-875) Bump the copyright year in the NOTICE of thirdparty

2020-04-22 Thread Mukul Kumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh resolved RATIS-875.
-
Resolution: Fixed

I have merged this to Ratis thirdparty master.

> Bump the copyright year in the NOTICE of thirdparty
> ---
>
> Key: RATIS-875
> URL: https://issues.apache.org/jira/browse/RATIS-875
> Project: Ratis
>  Issue Type: Improvement
>Reporter: Marton Elek
>Assignee: Marton Elek
>Priority: Trivial
>
> Reported by [~arp]  during a 0.4.0 rc vote.





[jira] [Commented] (RATIS-860) Organize log4j dependency in pom.xml

2020-04-21 Thread Mukul Kumar Singh (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17088462#comment-17088462
 ] 

Mukul Kumar Singh commented on RATIS-860:
-

Thanks for working on the patch [~shashikant]. +1, the changes look good to me.

> Organize log4j dependency in pom.xml
> 
>
> Key: RATIS-860
> URL: https://issues.apache.org/jira/browse/RATIS-860
> Project: Ratis
>  Issue Type: Bug
>  Components: build
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Fix For: 0.6.0
>
> Attachments: RATIS-860.000.patch
>
>
> Currently, the log4j dependency in ozone is added as follows:
> {code:java}
> <dependency>
>   <groupId>log4j</groupId>
>   <artifactId>log4j</artifactId>
>   <version>1.2.17</version>
> </dependency>
> {code}
> The idea here is to add log4j.version as a property in pom.xml and reuse it 
> when defining the dependency.





[jira] [Commented] (RATIS-850) Allow log purge up to snapshot index

2020-04-21 Thread Mukul Kumar Singh (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17088433#comment-17088433
 ] 

Mukul Kumar Singh commented on RATIS-850:
-

Thanks for updating the patch. +1, the patch looks good to me.

> Allow log purge up to snapshot index
> 
>
> Key: RATIS-850
> URL: https://issues.apache.org/jira/browse/RATIS-850
> Project: Ratis
>  Issue Type: Improvement
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
>Priority: Major
> Attachments: RATIS-850.001.patch, RATIS-850.002.patch
>
>
> Ratis logs are purged only up to the least commit index on all the peers. But 
> if one peer is down, this stops log purging on all the peers. If the Ratis 
> server takes snapshots, then we can purge logs up to the snapshot index even 
> if some peer has not committed up to that index. When the peer rejoins the 
> ring, it can catch up from the snapshot instead of from the Ratis logs.
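
The purge rule described above can be illustrated with a small sketch. It is not taken from the patch: the class PurgeIndexSketch, the method purgeUpTo and the use of -1 to mean "no snapshot yet" are assumptions made for the example.

```java
import java.util.Collection;
import java.util.Collections;

/** Hypothetical helper; the names are illustrative and not part of the Ratis API. */
final class PurgeIndexSketch {
  private PurgeIndexSketch() {}

  /**
   * @param peerCommitIndices commit index of every peer (must not be empty)
   * @param snapshotIndex index of the latest snapshot, or -1 if none exists (assumed convention)
   * @return the highest log index that may be purged
   */
  static long purgeUpTo(Collection<Long> peerCommitIndices, long snapshotIndex) {
    // Existing behaviour: purge only up to the least commit index across all peers.
    final long minCommitIndex = Collections.min(peerCommitIndices);
    // Described improvement: a snapshot allows purging further, because a lagging
    // peer can catch up from the snapshot instead of from the purged log entries.
    return Math.max(minCommitIndex, snapshotIndex);
  }
}
```

With commit indices 100, 40 and 95 and a snapshot at index 80, the result is 80 rather than 40, so the peer stuck at 40 no longer holds back purging on the others.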





[jira] [Resolved] (RATIS-737) Release Ratis 0.3.0 Thirdparty

2020-04-17 Thread Mukul Kumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh resolved RATIS-737.
-
Fix Version/s: 0.3.0
   Resolution: Fixed

This is already released, resolving this.

> Release Ratis 0.3.0 Thirdparty
> --
>
> Key: RATIS-737
> URL: https://issues.apache.org/jira/browse/RATIS-737
> Project: Ratis
>  Issue Type: Bug
>  Components: thirdparty
>Reporter: Mukul Kumar Singh
>Priority: Major
> Fix For: 0.3.0
>
>






[jira] [Created] (RATIS-855) Release Ratis 0.4.0 Thirdparty

2020-04-17 Thread Mukul Kumar Singh (Jira)
Mukul Kumar Singh created RATIS-855:
---

 Summary: Release Ratis 0.4.0 Thirdparty
 Key: RATIS-855
 URL: https://issues.apache.org/jira/browse/RATIS-855
 Project: Ratis
  Issue Type: Bug
  Components: thirdparty
Reporter: Mukul Kumar Singh
Assignee: Mukul Kumar Singh


With RATIS-852 and RATIS-847 resolved, release 0.4.0 thirdparty.





[jira] [Updated] (RATIS-847) Upgrade netty to 4.1.48.Final

2020-04-16 Thread Mukul Kumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated RATIS-847:

Attachment: RATIS-847.003.patch

> Upgrade netty to 4.1.48.Final
> -
>
> Key: RATIS-847
> URL: https://issues.apache.org/jira/browse/RATIS-847
> Project: Ratis
>  Issue Type: Bug
>  Components: thirdparty
>Reporter: Lokesh Jain
>Priority: Major
> Attachments: RATIS-847.001.patch, RATIS-847.002.patch, 
> RATIS-847.003.patch
>
>
> We should upgrade netty version to 4.1.48.Final. We are currently on 
> 4.1.46.Final





[jira] [Updated] (RATIS-852) GrpcSslTest fails with CertificateExpiredException

2020-04-16 Thread Mukul Kumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated RATIS-852:

Attachment: RATIS-852.001.patch

> GrpcSslTest fails with CertificateExpiredException
> --
>
> Key: RATIS-852
> URL: https://issues.apache.org/jira/browse/RATIS-852
> Project: Ratis
>  Issue Type: Bug
>  Components: thirdparty
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
> Attachments: RATIS-852.001.patch
>
>
> GrpcSslTest fails with CertificateExpiredException
> {code}
> [INFO] Running org.apache.ratis.thirdparty.demo.GrpcSslTest
> 2020-04-16 11:40:30,624 [Thread-0] INFO  demo.GrpcSslTest 
> (GrpcSslTest.java:getResource(37)) - Getting Resource: 
> /Users/mukul/code/apache/ratis/thirdparty/test/target/test-classes/ssl/server.pem
> 2020-04-16 11:40:30,624 [main] INFO  demo.GrpcSslTest 
> (GrpcSslTest.java:getResource(37)) - Getting Resource: 
> /Users/mukul/code/apache/ratis/thirdparty/test/target/test-classes/ssl/client.pem
> 2020-04-16 11:40:30,627 [Thread-0] INFO  demo.GrpcSslTest 
> (GrpcSslTest.java:getResource(37)) - Getting Resource: 
> /Users/mukul/code/apache/ratis/thirdparty/test/target/test-classes/ssl/server.crt
> 2020-04-16 11:40:30,629 [main] INFO  demo.GrpcSslTest 
> (GrpcSslTest.java:getResource(37)) - Getting Resource: 
> /Users/mukul/code/apache/ratis/thirdparty/test/target/test-classes/ssl/ca.crt
> 2020-04-16 11:40:30,629 [Thread-0] INFO  demo.GrpcSslTest 
> (GrpcSslTest.java:getResource(37)) - Getting Resource: 
> /Users/mukul/code/apache/ratis/thirdparty/test/target/test-classes/ssl/client.crt
> 2020-04-16 11:40:30,630 [main] INFO  demo.GrpcSslTest 
> (GrpcSslTest.java:getResource(37)) - Getting Resource: 
> /Users/mukul/code/apache/ratis/thirdparty/test/target/test-classes/ssl/client.crt
> 2020-04-16 11:40:31,224 [Thread-0] INFO  demo.GrpcServer 
> (GrpcSslServer.java:start(69)) - GrpcSslServer started, listening on 50005
> 2020-04-16 11:40:31,454 [main] WARN  demo.GrpcSslClient 
> (GrpcSslClient.java:greet(86)) - RPC failed: {0}
> org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io 
> exception
> Channel Pipeline: [SslHandler#0, ProtocolNegotiators$ClientTlsHandler#0, 
> WriteBufferingAndExceptionHandler#0, DefaultChannelPipeline$TailContext#0]
>   at 
> org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:235)
>   at 
> org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:216)
>   at 
> org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:141)
>   at 
> org.apache.ratis.thirdparty.demo.GreeterGrpc$GreeterBlockingStub.hello(GreeterGrpc.java:156)
>   at 
> org.apache.ratis.thirdparty.demo.GrpcSslClient.greet(GrpcSslClient.java:82)
>   at 
> org.apache.ratis.thirdparty.demo.GrpcSslTest.testSslClientServer(GrpcSslTest.java:73)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> 

[jira] [Created] (RATIS-852) GrpcSslTest fails with CertificateExpiredException

2020-04-16 Thread Mukul Kumar Singh (Jira)
Mukul Kumar Singh created RATIS-852:
---

 Summary: GrpcSslTest fails with CertificateExpiredException
 Key: RATIS-852
 URL: https://issues.apache.org/jira/browse/RATIS-852
 Project: Ratis
  Issue Type: Bug
  Components: thirdparty
Reporter: Mukul Kumar Singh
Assignee: Mukul Kumar Singh


GrpcSslTest fails with CertificateExpiredException

{code}
[INFO] Running org.apache.ratis.thirdparty.demo.GrpcSslTest
2020-04-16 11:40:30,624 [Thread-0] INFO  demo.GrpcSslTest 
(GrpcSslTest.java:getResource(37)) - Getting Resource: 
/Users/mukul/code/apache/ratis/thirdparty/test/target/test-classes/ssl/server.pem

2020-04-16 11:40:30,624 [main] INFO  demo.GrpcSslTest 
(GrpcSslTest.java:getResource(37)) - Getting Resource: 
/Users/mukul/code/apache/ratis/thirdparty/test/target/test-classes/ssl/client.pem

2020-04-16 11:40:30,627 [Thread-0] INFO  demo.GrpcSslTest 
(GrpcSslTest.java:getResource(37)) - Getting Resource: 
/Users/mukul/code/apache/ratis/thirdparty/test/target/test-classes/ssl/server.crt

2020-04-16 11:40:30,629 [main] INFO  demo.GrpcSslTest 
(GrpcSslTest.java:getResource(37)) - Getting Resource: 
/Users/mukul/code/apache/ratis/thirdparty/test/target/test-classes/ssl/ca.crt

2020-04-16 11:40:30,629 [Thread-0] INFO  demo.GrpcSslTest 
(GrpcSslTest.java:getResource(37)) - Getting Resource: 
/Users/mukul/code/apache/ratis/thirdparty/test/target/test-classes/ssl/client.crt

2020-04-16 11:40:30,630 [main] INFO  demo.GrpcSslTest 
(GrpcSslTest.java:getResource(37)) - Getting Resource: 
/Users/mukul/code/apache/ratis/thirdparty/test/target/test-classes/ssl/client.crt

2020-04-16 11:40:31,224 [Thread-0] INFO  demo.GrpcServer 
(GrpcSslServer.java:start(69)) - GrpcSslServer started, listening on 50005
2020-04-16 11:40:31,454 [main] WARN  demo.GrpcSslClient 
(GrpcSslClient.java:greet(86)) - RPC failed: {0}
org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io 
exception
Channel Pipeline: [SslHandler#0, ProtocolNegotiators$ClientTlsHandler#0, 
WriteBufferingAndExceptionHandler#0, DefaultChannelPipeline$TailContext#0]
at 
org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:235)
at 
org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:216)
at 
org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:141)
at 
org.apache.ratis.thirdparty.demo.GreeterGrpc$GreeterBlockingStub.hello(GreeterGrpc.java:156)
at 
org.apache.ratis.thirdparty.demo.GrpcSslClient.greet(GrpcSslClient.java:82)
at 
org.apache.ratis.thirdparty.demo.GrpcSslTest.testSslClientServer(GrpcSslTest.java:73)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
at 
org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:383)
at 
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:344)
at 
org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:125)
at 
org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:417)
Caused by: 

[jira] [Updated] (RATIS-847) Upgrade netty to 4.1.48.Final

2020-04-15 Thread Mukul Kumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated RATIS-847:

Attachment: RATIS-847.002.patch

> Upgrade netty to 4.1.48.Final
> -
>
> Key: RATIS-847
> URL: https://issues.apache.org/jira/browse/RATIS-847
> Project: Ratis
>  Issue Type: Bug
>  Components: thirdparty
>Reporter: Lokesh Jain
>Priority: Major
> Attachments: RATIS-847.001.patch, RATIS-847.002.patch
>
>
> We should upgrade netty version to 4.1.48.Final. We are currently on 
> 4.1.46.Final



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (RATIS-847) Upgrade netty to 4.1.48.Final

2020-04-15 Thread Mukul Kumar Singh (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17083940#comment-17083940
 ] 

Mukul Kumar Singh commented on RATIS-847:
-

[~ljain] can you please review.

> Upgrade netty to 4.1.48.Final
> -
>
> Key: RATIS-847
> URL: https://issues.apache.org/jira/browse/RATIS-847
> Project: Ratis
>  Issue Type: Bug
>  Components: thirdparty
>Reporter: Lokesh Jain
>Priority: Major
> Attachments: RATIS-847.001.patch
>
>
> We should upgrade netty version to 4.1.48.Final. We are currently on 
> 4.1.46.Final





[jira] [Updated] (RATIS-847) Upgrade netty to 4.1.48.Final

2020-04-15 Thread Mukul Kumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated RATIS-847:

Attachment: RATIS-847.001.patch

> Upgrade netty to 4.1.48.Final
> -
>
> Key: RATIS-847
> URL: https://issues.apache.org/jira/browse/RATIS-847
> Project: Ratis
>  Issue Type: Bug
>  Components: thirdparty
>Reporter: Lokesh Jain
>Priority: Major
> Attachments: RATIS-847.001.patch
>
>
> We should upgrade netty version to 4.1.48.Final. We are currently on 
> 4.1.46.Final





[jira] [Commented] (RATIS-828) Logs cluttered by AlreadyExistsException

2020-04-14 Thread Mukul Kumar Singh (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17083778#comment-17083778
 ] 

Mukul Kumar Singh commented on RATIS-828:
-

Thanks for working on this [~swagle]. +1 the patch looks good to me.

> Logs cluttered by AlreadyExistsException
> 
>
> Key: RATIS-828
> URL: https://issues.apache.org/jira/browse/RATIS-828
> Project: Ratis
>  Issue Type: Wish
>  Components: server
>Reporter: Attila Doroszlai
>Assignee: Siddharth Wagle
>Priority: Major
> Attachments: RATIS-828.01.patch
>
>
> Follow-up for HDDS-3148:
> Ozone startup logs are cluttered by printing stack trace of 
> AlreadyExistsException related to group addition.  Example:
> {code}
> 2020-03-09 13:53:01,563 [grpc-default-executor-0] WARN  impl.RaftServerProxy 
> (RaftServerProxy.java:lambda$groupAddAsync$11(390)) - 
> 7a07f161-9144-44b2-8baa-73f0e9299675: Failed groupAdd* 
> GroupManagementRequest:client-27FB1A91809E->7a07f161-9144-44b2-8baa-73f0e9299675@group-E151028E3AC0,
>  cid=2, seq=0, RW, null, 
> Add:group-E151028E3AC0:[18f4e257-bf09-482e-b1bb-a2408a093ff7:172.17.0.2:43845,
>  7a07f161-9144-44b2-8baa-73f0e9299675:172.17.0.2:41551, 
> 8a66c80e-ab55-4975-92a9-8aaf06ab418a:172.17.0.2:36921]
> java.util.concurrent.CompletionException: 
> org.apache.ratis.protocol.AlreadyExistsException: 
> 7a07f161-9144-44b2-8baa-73f0e9299675: Failed to add 
> group-E151028E3AC0:[18f4e257-bf09-482e-b1bb-a2408a093ff7:172.17.0.2:43845, 
> 7a07f161-9144-44b2-8baa-73f0e9299675:172.17.0.2:41551, 
> 8a66c80e-ab55-4975-92a9-8aaf06ab418a:172.17.0.2:36921] since the group 
> already exists in the map.
>   at 
> java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
>   at 
> java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
>   at 
> java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:607)
>   at 
> java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:591)
>   at 
> java.util.concurrent.CompletableFuture.uniApplyStage(CompletableFuture.java:631)
>   at 
> java.util.concurrent.CompletableFuture.thenApplyAsync(CompletableFuture.java:2006)
>   at 
> org.apache.ratis.server.impl.RaftServerProxy.groupAddAsync(RaftServerProxy.java:379)
>   at 
> org.apache.ratis.server.impl.RaftServerProxy.groupManagementAsync(RaftServerProxy.java:363)
>   at 
> org.apache.ratis.grpc.server.GrpcAdminProtocolService.lambda$groupManagement$0(GrpcAdminProtocolService.java:42)
>   at org.apache.ratis.grpc.GrpcUtil.asyncCall(GrpcUtil.java:160)
>   at 
> org.apache.ratis.grpc.server.GrpcAdminProtocolService.groupManagement(GrpcAdminProtocolService.java:42)
>   at 
> org.apache.ratis.proto.grpc.AdminProtocolServiceGrpc$MethodHandlers.invoke(AdminProtocolServiceGrpc.java:358)
>   at 
> org.apache.ratis.thirdparty.io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:172)
>   at 
> org.apache.ratis.thirdparty.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:331)
>   at 
> org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:814)
>   at 
> org.apache.ratis.thirdparty.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
>   at 
> org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.ratis.protocol.AlreadyExistsException: 
> 7a07f161-9144-44b2-8baa-73f0e9299675: Failed to add 
> group-E151028E3AC0:[18f4e257-bf09-482e-b1bb-a2408a093ff7:172.17.0.2:43845, 
> 7a07f161-9144-44b2-8baa-73f0e9299675:172.17.0.2:41551, 
> 8a66c80e-ab55-4975-92a9-8aaf06ab418a:172.17.0.2:36921] since the group 
> already exists in the map.
>   at 
> org.apache.ratis.server.impl.RaftServerProxy$ImplMap.addNew(RaftServerProxy.java:83)
>   at 
> org.apache.ratis.server.impl.RaftServerProxy.groupAddAsync(RaftServerProxy.java:378)
>   ... 13 more
> {code}
> Since these are "normal", I think stack trace should be suppressed.
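A rough sketch of the idea in the description above, not the attached RATIS-828 patch: unwrap the failure to its root cause, and if it is an expected exception type, log only the message instead of the full stack trace. The class names below are hypothetical stand-ins; the real exception is org.apache.ratis.protocol.AlreadyExistsException.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Hypothetical stand-in for org.apache.ratis.protocol.AlreadyExistsException.
class AlreadyExistsExceptionSketch extends Exception {
  AlreadyExistsExceptionSketch(String message) {
    super(message);
  }
}

class QuietLogSketch {
  /** Walks the cause chain (e.g. through CompletionException) to the root cause. */
  static Throwable rootCause(Throwable t) {
    Throwable current = t;
    while (current.getCause() != null && current.getCause() != current) {
      current = current.getCause();
    }
    return current;
  }

  /** Returns the text to log: message only for expected failures, full trace otherwise. */
  static String format(String prefix, Throwable t) {
    Throwable root = rootCause(t);
    if (root instanceof AlreadyExistsExceptionSketch) {
      // Expected during startup: keep the log line short.
      return prefix + ": " + root.getMessage();
    }
    // Unexpected failure: keep the complete stack trace.
    StringWriter out = new StringWriter();
    t.printStackTrace(new PrintWriter(out, true));
    return prefix + ": " + out;
  }
}
```

With this shape, a CompletionException wrapping the "already exists" failure produces a single line, while any other exception still prints its trace.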





[jira] [Commented] (RATIS-850) Allow log purge up to snapshot index

2020-04-14 Thread Mukul Kumar Singh (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17083774#comment-17083774
 ] 

Mukul Kumar Singh commented on RATIS-850:
-

[~hanishakoneru], this was done on Ozone's datanode to avoid cases where 
aggressive log purging removes entries too early and triggers snapshot 
installation very frequently.

> Allow log purge up to snapshot index
> 
>
> Key: RATIS-850
> URL: https://issues.apache.org/jira/browse/RATIS-850
> Project: Ratis
>  Issue Type: Improvement
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
>Priority: Major
> Attachments: RATIS-850.001.patch
>
>
> Ratis logs are purged only up to the least commit index on all the peers. But 
> if one peer is down, it stops log purging on all the peers. If the Ratis 
> server takes snapshots, then we can purge logs up to the snapshot index even 
> if some peer has not committed up to that index. When the peer rejoins the 
> ring, instead of ratis logs, it can get the snapshot to catch up.
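One way to picture the proposal above, as a minimal sketch under assumptions rather than the attached patch: today the purge bound is the least commit index across peers, and the proposal additionally allows purging up to the latest snapshot index. The class, method, and NO_INDEX constant are hypothetical names used only for illustration.

```java
class PurgeIndexSketch {
  // Assumed marker meaning "no snapshot taken yet" / "no peers known".
  static final long NO_INDEX = -1L;

  /**
   * Returns the highest log index that may be purged: the least commit index
   * across peers, raised to the snapshot index when a snapshot covers more.
   */
  static long purgeUpTo(long[] peerCommitIndices, long snapshotIndex) {
    long minCommit = NO_INDEX;
    for (int i = 0; i < peerCommitIndices.length; i++) {
      minCommit = (i == 0)
          ? peerCommitIndices[i]
          : Math.min(minCommit, peerCommitIndices[i]);
    }
    // A lagging peer no longer blocks purging below the snapshot index;
    // it would catch up from the snapshot instead of from the log.
    return Math.max(minCommit, snapshotIndex);
  }
}
```

For example, with commit indices 100, 20, 100 and a snapshot at index 80, the sketch allows purging up to 80 instead of stopping at 20.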





[jira] [Created] (RATIS-837) Add read only based load generator to MiniOzoneChaosCluster

2020-04-01 Thread Mukul Kumar Singh (Jira)
Mukul Kumar Singh created RATIS-837:
---

 Summary: Add read only based load generator to 
MiniOzoneChaosCluster
 Key: RATIS-837
 URL: https://issues.apache.org/jira/browse/RATIS-837
 Project: Ratis
  Issue Type: Bug
Reporter: Mukul Kumar Singh
Assignee: Mukul Kumar Singh


This jira proposes to add a read-only workload to MiniOzoneChaosCluster. The 
set of files will be initially written and then read multiple times while nodes 
are shut down.





[jira] [Commented] (RATIS-836) It's a waste to send an empty request to check leader state

2020-04-01 Thread Mukul Kumar Singh (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17072527#comment-17072527
 ] 

Mukul Kumar Singh commented on RATIS-836:
-

cc [~ljain]

> It's a waste to send an empty request to check leader state
> ---
>
> Key: RATIS-836
> URL: https://issues.apache.org/jira/browse/RATIS-836
> Project: Ratis
>  Issue Type: Improvement
>Reporter: runzhiwang
>Priority: Major
>
> *What's the problem?*
> Before sending a [normal 
> request|https://github.com/apache/incubator-ratis/blob/master/ratis-client/src/main/java/org/apache/ratis/client/impl/OrderedAsync.java#L243],
>  the Ratis client sends an [empty 
> request|https://github.com/apache/incubator-ratis/blob/master/ratis-client/src/main/java/org/apache/ratis/client/impl/OrderedAsync.java#L235]
>  to the server to check the leader state, which costs about 5 milliseconds. 
> This is wasteful. 
> *How to improve?*
> I think it can be improved by sending the normal request directly to the 
> server, without sending the empty request. If the server is not the leader, 
> it responds to the client with a NotLeaderException and the client retries 
> the request. [~msingh] What do you think? If you agree, I will submit a PR.
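The flow proposed above could look roughly like the following. This is only a sketch under assumptions, not the OrderedAsync code: ServerSketch, NotLeaderSketchException, and the suggestedLeader hint are hypothetical, and the exception is unchecked here purely to keep the example short.

```java
import java.util.List;

// Hypothetical stand-in for Ratis's NotLeaderException, carrying a leader hint.
class NotLeaderSketchException extends RuntimeException {
  final int suggestedLeader;

  NotLeaderSketchException(int suggestedLeader) {
    super("not the leader, try server " + suggestedLeader);
    this.suggestedLeader = suggestedLeader;
  }
}

// Hypothetical server endpoint: either answers or rejects with a leader hint.
interface ServerSketch {
  String submit(String request);
}

class DirectSendSketch {
  /** Sends the real request right away; on rejection, retries at the hinted leader. */
  static String send(List<ServerSketch> servers, int firstGuess,
                     String request, int maxAttempts) {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be at least 1");
    }
    int target = firstGuess;
    NotLeaderSketchException last = null;
    for (int attempt = 0; attempt < maxAttempts; attempt++) {
      try {
        // No empty probe request first: the normal request doubles as the probe.
        return servers.get(target).submit(request);
      } catch (NotLeaderSketchException e) {
        last = e;
        target = e.suggestedLeader;
      }
    }
    throw last;
  }
}
```

When the first guess is already the leader, this costs one round trip instead of two; a wrong guess costs one rejected request plus the retry.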





[jira] [Commented] (RATIS-832) Add Metrics for retry cache count as well as size in bytes

2020-03-27 Thread Mukul Kumar Singh (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17068443#comment-17068443
 ] 

Mukul Kumar Singh commented on RATIS-832:
-

cc [~sammichen]

> Add Metrics for retry cache count as well as size in bytes
> --
>
> Key: RATIS-832
> URL: https://issues.apache.org/jira/browse/RATIS-832
> Project: Ratis
>  Issue Type: Sub-task
>  Components: server
>Affects Versions: 0.6.0
>Reporter: Shashikant Banerjee
>Priority: Major
>






[jira] [Commented] (RATIS-816) Use peerId in error log / exception of GrpcServerProtocolClient

2020-03-11 Thread Mukul Kumar Singh (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17057020#comment-17057020
 ] 

Mukul Kumar Singh commented on RATIS-816:
-

Thanks for updating the patch [~elek]. +1 the updated patch looks good to me.

> Use peerId in error log / exception of GrpcServerProtocolClient
> ---
>
> Key: RATIS-816
> URL: https://issues.apache.org/jira/browse/RATIS-816
> Project: Ratis
>  Issue Type: Improvement
>Reporter: Marton Elek
>Assignee: Marton Elek
>Priority: Major
> Attachments: RATIS-816.001.patch, RATIS-816.002.patch
>
>
> GrpcServerProtocolClient is used to send out requestVote and appendLogEntry 
> requests.
> I propose to persist raftPeerId in the constructor and use it in the error / 
> exception message.
> This is not just about getting a more meaningful message (which is nice to 
> have): in HDDS-3023 I am modifying the byte code to mock the 
> leader->follower communication. It's much easier to do if the required 
> raftPeerId is available in the class.





[jira] [Commented] (RATIS-816) Use peerId in error log / exception of GrpcServerProtocolClient

2020-03-11 Thread Mukul Kumar Singh (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17057008#comment-17057008
 ] 

Mukul Kumar Singh commented on RATIS-816:
-

Thanks for working on this [~elek]. Can we use parameterized logging?
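
For readers unfamiliar with the term, parameterized logging means passing the values as arguments and letting the logger substitute them only when the message is actually emitted, instead of concatenating strings up front. The sketch below uses java.util.logging and its {0} placeholder only so that it runs without extra dependencies; that choice is an assumption for illustration, and with SLF4J the placeholder would be {}. The class and handler names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;
import java.util.logging.SimpleFormatter;

class ParameterizedLoggingSketch {
  /** Collects formatted messages so the effect of the placeholders is visible. */
  static final class CapturingHandler extends Handler {
    final List<String> lines = new ArrayList<>();
    private final SimpleFormatter formatter = new SimpleFormatter();

    @Override public void publish(LogRecord record) {
      // formatMessage substitutes {0}, {1}, ... with the record's parameters.
      lines.add(formatter.formatMessage(record));
    }
    @Override public void flush() { }
    @Override public void close() { }
  }

  static List<String> demo(String peerId) {
    Logger log = Logger.getLogger("ParameterizedLoggingSketch");
    log.setUseParentHandlers(false);
    log.setLevel(Level.INFO);
    CapturingHandler handler = new CapturingHandler();
    log.addHandler(handler);
    try {
      // Emitted: the peer id is substituted into the message.
      log.log(Level.WARNING, "{0}: requestVote failed", peerId);
      // Below the INFO threshold: dropped before any formatting happens.
      log.log(Level.FINE, "{0}: appendEntries sent", peerId);
    } finally {
      log.removeHandler(handler);
    }
    return handler.lines;
  }
}
```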

> Use peerId in error log / exception of GrpcServerProtocolClient
> ---
>
> Key: RATIS-816
> URL: https://issues.apache.org/jira/browse/RATIS-816
> Project: Ratis
>  Issue Type: Improvement
>Reporter: Marton Elek
>Assignee: Marton Elek
>Priority: Major
> Attachments: RATIS-816.001.patch
>
>
> GrpcServerProtocolClient is used to send out requestVote and appendLogEntry 
> requests.
> I propose to persist raftPeerId in the constructor and use it in the error / 
> exception message.
> This is not just about getting a more meaningful message (which is nice to 
> have): in HDDS-3023 I am modifying the byte code to mock the 
> leader->follower communication. It's much easier to do if the required 
> raftPeerId is available in the class.





[jira] [Commented] (RATIS-814) High CPU usage by TimeoutScheduler due to JDK bug JDK-8129861

2020-03-02 Thread Mukul Kumar Singh (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17049181#comment-17049181
 ] 

Mukul Kumar Singh commented on RATIS-814:
-

Thanks for working on this [~ljain]. +1 I have committed this to master.

> High CPU usage by TimeoutScheduler due to JDK bug JDK-8129861
> -
>
> Key: RATIS-814
> URL: https://issues.apache.org/jira/browse/RATIS-814
> Project: Ratis
>  Issue Type: Bug
>Reporter: Lokesh Jain
>Assignee: Lokesh Jain
>Priority: Major
> Fix For: 0.6.0
>
> Attachments: RATIS-814.001.patch, flamegraph.svg
>
>
> TimeoutScheduler creates an instance of ScheduledThreadPoolExecutor with 0 
> core pool threads. There is a bug in JDK 
> [https://bugs.openjdk.java.net/browse/JDK-8129861] which causes high CPU 
> usage if ScheduledThreadPoolExecutor is instantiated with 0 core pool 
> threads. The bug was fixed in Java 9.
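
A minimal sketch of the kind of change the description points at, assuming the simplest workaround of giving the pool one core thread instead of zero. This is not the attached RATIS-814 patch; the class name, thread name, and helper methods are hypothetical.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class TimeoutSchedulerSketch {
  /** Builds a scheduler with one core thread, avoiding the 0-core-thread case. */
  static ScheduledThreadPoolExecutor newScheduler() {
    ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(1, r -> {
      Thread t = new Thread(r, "timeout-scheduler-sketch");
      t.setDaemon(true); // do not keep the JVM alive just for timeouts
      return t;
    });
    // Cancelled timeouts are removed from the queue immediately.
    executor.setRemoveOnCancelPolicy(true);
    return executor;
  }

  /** Schedules one task, waits for it, and shuts the scheduler down again. */
  static String runOnce(long delayMillis) {
    ScheduledThreadPoolExecutor executor = newScheduler();
    try {
      Callable<String> task = () -> "timeout fired";
      ScheduledFuture<String> future =
          executor.schedule(task, delayMillis, TimeUnit.MILLISECONDS);
      return future.get(5, TimeUnit.SECONDS);
    } catch (Exception e) {
      throw new IllegalStateException("scheduled task did not complete", e);
    } finally {
      executor.shutdownNow();
    }
  }
}
```

The trade-off is one idle thread per scheduler in exchange for not hitting the busy loop on Java 8.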



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (RATIS-814) High CPU usage by TimeoutScheduler due to JDK bug JDK-8129861

2020-03-02 Thread Mukul Kumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated RATIS-814:

Fix Version/s: 0.6.0

> High CPU usage by TimeoutScheduler due to JDK bug JDK-8129861
> -
>
> Key: RATIS-814
> URL: https://issues.apache.org/jira/browse/RATIS-814
> Project: Ratis
>  Issue Type: Bug
>Reporter: Lokesh Jain
>Assignee: Lokesh Jain
>Priority: Major
> Fix For: 0.6.0
>
> Attachments: RATIS-814.001.patch, flamegraph.svg
>
>
> TimeoutScheduler creates an instance of ScheduledThreadPoolExecutor with 0 
> core pool threads. There is a bug in JDK 
> [https://bugs.openjdk.java.net/browse/JDK-8129861] which causes high CPU 
> usage if ScheduledThreadPoolExecutor is instantiated with 0 core pool 
> threads. The bug was fixed in Java 9.





[jira] [Commented] (RATIS-814) High CPU usage by TimeoutScheduler due to JDK bug JDK-8129861

2020-03-02 Thread Mukul Kumar Singh (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17049178#comment-17049178
 ] 

Mukul Kumar Singh commented on RATIS-814:
-

Thanks for working on this [~ljain]. +1 the patch looks good to me.

> High CPU usage by TimeoutScheduler due to JDK bug JDK-8129861
> -
>
> Key: RATIS-814
> URL: https://issues.apache.org/jira/browse/RATIS-814
> Project: Ratis
>  Issue Type: Bug
>Reporter: Lokesh Jain
>Assignee: Lokesh Jain
>Priority: Major
> Attachments: RATIS-814.001.patch, flamegraph.svg
>
>
> TimeoutScheduler creates an instance of ScheduledThreadPoolExecutor with 0 
> core pool threads. There is a bug in JDK 
> [https://bugs.openjdk.java.net/browse/JDK-8129861] which causes high CPU 
> usage if ScheduledThreadPoolExecutor is instantiated with 0 core pool 
> threads. The bug was fixed in Java 9.





[jira] [Resolved] (RATIS-821) High processor load for ScheduledThreadPoolExecutor with 0 core threads

2020-03-02 Thread Mukul Kumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh resolved RATIS-821.
-
Resolution: Duplicate

> High processor load for ScheduledThreadPoolExecutor with 0 core threads
> ---
>
> Key: RATIS-821
> URL: https://issues.apache.org/jira/browse/RATIS-821
> Project: Ratis
>  Issue Type: Bug
>Affects Versions: 0.5.0
>Reporter: runzhiwang
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This jira is related to the bug of JDK8 
> https://bugs.openjdk.java.net/browse/JDK-8129861. 





[jira] [Commented] (RATIS-821) High processor load for ScheduledThreadPoolExecutor with 0 core threads

2020-03-02 Thread Mukul Kumar Singh (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17049176#comment-17049176
 ] 

Mukul Kumar Singh commented on RATIS-821:
-

[~yjxxtd], RATIS-814 tracks the same issue. I will resolve this as a 
duplicate of RATIS-814.

> High processor load for ScheduledThreadPoolExecutor with 0 core threads
> ---
>
> Key: RATIS-821
> URL: https://issues.apache.org/jira/browse/RATIS-821
> Project: Ratis
>  Issue Type: Bug
>Affects Versions: 0.5.0
>Reporter: runzhiwang
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This jira is related to the bug of JDK8 
> https://bugs.openjdk.java.net/browse/JDK-8129861. 





[jira] [Commented] (RATIS-755) Add a log dump command line utility inside ratis

2020-01-30 Thread Mukul Kumar Singh (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17026796#comment-17026796
 ] 

Mukul Kumar Singh commented on RATIS-755:
-

Thanks for the review [~ljain]. Patch v6 fixes the issue.

> Add a log dump command line utility inside ratis
> 
>
> Key: RATIS-755
> URL: https://issues.apache.org/jira/browse/RATIS-755
> Project: Ratis
>  Issue Type: Bug
>Affects Versions: 0.4.0
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
>  Labels: ozone
> Fix For: 0.5.0
>
> Attachments: RATIS-755.001.patch, RATIS-755.002.patch, 
> RATIS-755.003.patch, RATIS-755.004.patch, RATIS-755.005.patch, 
> RATIS-755.006.patch
>
>
> This tool proposes to add a utility to dump the following information to the 
> a) log index
> b) log term
> c) log entry type
> d) state machine data if present





[jira] [Updated] (RATIS-755) Add a log dump command line utility inside ratis

2020-01-30 Thread Mukul Kumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated RATIS-755:

Attachment: RATIS-755.006.patch

> Add a log dump command line utility inside ratis
> 
>
> Key: RATIS-755
> URL: https://issues.apache.org/jira/browse/RATIS-755
> Project: Ratis
>  Issue Type: Bug
>Affects Versions: 0.4.0
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
>  Labels: ozone
> Fix For: 0.5.0
>
> Attachments: RATIS-755.001.patch, RATIS-755.002.patch, 
> RATIS-755.003.patch, RATIS-755.004.patch, RATIS-755.005.patch, 
> RATIS-755.006.patch
>
>
> This tool proposes to add a utility to dump the following information to the 
> a) log index
> b) log term
> c) log entry type
> d) state machine data if present





[jira] [Created] (RATIS-805) Add grpc metrics to ratis

2020-01-30 Thread Mukul Kumar Singh (Jira)
Mukul Kumar Singh created RATIS-805:
---

 Summary: Add grpc metrics to ratis
 Key: RATIS-805
 URL: https://issues.apache.org/jira/browse/RATIS-805
 Project: Ratis
  Issue Type: Bug
  Components: server
Reporter: Mukul Kumar Singh
Assignee: Aravindan Vijayan


https://github.com/grpc-ecosystem/java-grpc-prometheus talks about some of the 
grpc metrics. It will be good to explore if we can add these metrics into ratis 
as well.





[jira] [Updated] (RATIS-755) Add a log dump command line utility inside ratis

2020-01-29 Thread Mukul Kumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated RATIS-755:

Attachment: RATIS-755.005.patch

> Add a log dump command line utility inside ratis
> 
>
> Key: RATIS-755
> URL: https://issues.apache.org/jira/browse/RATIS-755
> Project: Ratis
>  Issue Type: Bug
>Affects Versions: 0.4.0
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
>  Labels: ozone
> Fix For: 0.5.0
>
> Attachments: RATIS-755.001.patch, RATIS-755.002.patch, 
> RATIS-755.003.patch, RATIS-755.004.patch, RATIS-755.005.patch
>
>
> This tool proposes to add a utility to dump the following information to the 
> a) log index
> b) log term
> c) log entry type
> d) state machine data if present





[jira] [Updated] (RATIS-755) Add a log dump command line utility inside ratis

2020-01-29 Thread Mukul Kumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated RATIS-755:

Attachment: RATIS-755.004.patch

> Add a log dump command line utility inside ratis
> 
>
> Key: RATIS-755
> URL: https://issues.apache.org/jira/browse/RATIS-755
> Project: Ratis
>  Issue Type: Bug
>Affects Versions: 0.4.0
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
>  Labels: ozone
> Fix For: 0.5.0
>
> Attachments: RATIS-755.001.patch, RATIS-755.002.patch, 
> RATIS-755.003.patch, RATIS-755.004.patch
>
>
> This tool proposes to add a utility to dump the following information to the 
> a) log index
> b) log term
> c) log entry type
> d) state machine data if present





[jira] [Updated] (RATIS-755) Add a log dump command line utility inside ratis

2020-01-29 Thread Mukul Kumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated RATIS-755:

Attachment: RATIS-755.003.patch

> Add a log dump command line utility inside ratis
> 
>
> Key: RATIS-755
> URL: https://issues.apache.org/jira/browse/RATIS-755
> Project: Ratis
>  Issue Type: Bug
>Affects Versions: 0.4.0
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
>  Labels: ozone
> Fix For: 0.5.0
>
> Attachments: RATIS-755.001.patch, RATIS-755.002.patch, 
> RATIS-755.003.patch
>
>
> This tool proposes to add a utility to dump the following information to the 
> a) log index
> b) log term
> c) log entry type
> d) state machine data if present





[jira] [Commented] (RATIS-801) Ratis snapshot should consider stateMachine#appliedIndex for triggering snapshot

2020-01-28 Thread Mukul Kumar Singh (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024932#comment-17024932
 ] 

Mukul Kumar Singh commented on RATIS-801:
-

[~bharat], I feel it would be a great addition for StateMachine to have the 
ability to trigger snapshots. Can we add a shouldTakeSnapshot() method to 
StateMachine?

> Ratis snapshot should consider stateMachine#appliedIndex for triggering 
> snapshot
> 
>
> Key: RATIS-801
> URL: https://issues.apache.org/jira/browse/RATIS-801
> Project: Ratis
>  Issue Type: Improvement
>Affects Versions: 0.5.0
>Reporter: Shashikant Banerjee
>Assignee: Bharat Viswanadham
>Priority: Major
> Fix For: 0.5.0
>
>
> Currently, while triggering snapshot, snapshotUpdater#appliedIndex is taken 
> into account to decide whether it has exceeded the snapshot threshold from 
> the last snapshotIndex. This may lead to creating more snapshots than usual 
> as stateMachineUpdater#appliedIndex is updated as soon as the 
> applyTransaction call happens. Ideally, Ratis snapshot should be triggered 
> taking stateMachine's applied index into account.
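
The check described above can be sketched as follows. This is an illustration under assumptions, not the Ratis implementation: the method name follows the shouldTakeSnapshot() suggestion in the comment, while the class and parameter names are hypothetical.

```java
class SnapshotTriggerSketch {
  /**
   * Decides whether to snapshot, based on how far the state machine itself
   * (not the updater that dispatches applyTransaction) has applied entries
   * past the last snapshot.
   */
  static boolean shouldTakeSnapshot(long stateMachineAppliedIndex,
                                    long lastSnapshotIndex,
                                    long threshold) {
    return stateMachineAppliedIndex - lastSnapshotIndex >= threshold;
  }
}
```

For example, with a threshold of 1000 and the last snapshot at index 5000, an updater index of 6200 would already trigger a snapshot, whereas a state machine that has only finished applying up to 5900 would not.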





[jira] [Commented] (RATIS-755) Add a log dump command line utility inside ratis

2020-01-21 Thread Mukul Kumar Singh (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17020453#comment-17020453
 ] 

Mukul Kumar Singh commented on RATIS-755:
-

Thanks for the review [~ljain]. If the StateMachineFunction is not provided, 
then there won't be any SM string part.
Basically, in the above output there won't be any "State Machine: " lines.

{code}
+String smString = "";
+if (function != null) {
+  smString = "\n\t State Machine: " + function.apply(smLog);
+}
+return callIdString + smString;
{code}
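
A self-contained version of the excerpt above, for illustration only: the parameter types are assumptions (the patch works on the Ratis log entry types, not on plain strings), and the class and method names are hypothetical.

```java
import java.util.function.Function;

class LogDumpSketch {
  /**
   * Appends the "State Machine: " line only when a function for decoding
   * the state machine data was supplied.
   */
  static String describe(String callIdString, String smLog,
                         Function<String, String> function) {
    String smString = "";
    if (function != null) {
      smString = "\n\t State Machine: " + function.apply(smLog);
    }
    return callIdString + smString;
  }
}
```

Passing null as the function yields only the call id part, which matches the behaviour described in the comment.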

> Add a log dump command line utility inside ratis
> 
>
> Key: RATIS-755
> URL: https://issues.apache.org/jira/browse/RATIS-755
> Project: Ratis
>  Issue Type: Bug
>Affects Versions: 0.4.0
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
>  Labels: ozone
> Fix For: 0.5.0
>
> Attachments: RATIS-755.001.patch
>
>
> This tool proposes to add a utility to dump the following information to the 
> a) log index
> b) log term
> c) log entry type
> d) state machine data if present





[jira] [Commented] (RATIS-797) Ratis segment file corruption after server restart

2020-01-19 Thread Mukul Kumar Singh (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17018901#comment-17018901
 ] 

Mukul Kumar Singh commented on RATIS-797:
-

Hi [~Sammi],

Can you please save the datanode logs as well as the datanode ratis 
directory?

> Ratis segment file corruption after server restart
> --
>
> Key: RATIS-797
> URL: https://issues.apache.org/jira/browse/RATIS-797
> Project: Ratis
>  Issue Type: Bug
>  Components: server
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Blocker
> Fix For: 0.5.0
>
>
> While testing Ozone, it was observed that Ratis segment files show 
> corruption after a server restart:
> {code:java}
> 2020-01-08 02:06:46,576 INFO 
> org.apache.ratis.server.raftlog.segmented.LogSegment: Successfully read 1 
> entries from segment file 
> /metadata/hadoop-ozone/datanode/ratis/data/5e26b460-ca4e-4791-bf70-1fd535056988/current/log_inprogress_0
> 2020-01-08 02:06:46,576 WARN 
> org.apache.ratis.server.raftlog.segmented.LogSegment: Segment file is 
> corrupted: expected to have 0 entries but only 1 entries read successfully
> 2020-01-08 02:06:46,580 INFO 
> org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker: 
> 2d422fc8-f7c2-4e41-a59b-abbf76330dfe@group-1FD535056988-SegmentedRaftLogWorker:
>  flushIndex: setUnconditionally 0 -> 0
> 2020-01-08 02:06:46,618 INFO org.eclipse.jetty.util.log: Logging initialized 
> @1978ms
> 2020-01-08 02:06:46,738 INFO org.apache.ratis.server.RaftServerConfigKeys: 
> raft.server.snaps
> 2020-01-16 07:51:12,268 WARN 
> org.apache.ratis.server.raftlog.segmented.LogSegment: Segment file is 
> corrupted: expected to have -3668 entries but only 3500 entries read 
> successfully
> {code}





[jira] [Commented] (RATIS-797) Ratis segment file corruption after server restart

2020-01-16 Thread Mukul Kumar Singh (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17017451#comment-17017451
 ] 

Mukul Kumar Singh commented on RATIS-797:
-

[~szetszwo], can you please have a look at this jira.

> Ratis segment file corruption after server restart
> --
>
> Key: RATIS-797
> URL: https://issues.apache.org/jira/browse/RATIS-797
> Project: Ratis
>  Issue Type: Bug
>  Components: server
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Blocker
> Fix For: 0.5.0
>
>
> While testing Ozone, it was observed that Ratis segment files show 
> corruption after a server restart:
> {code:java}
> 2020-01-08 02:06:46,576 INFO 
> org.apache.ratis.server.raftlog.segmented.LogSegment: Successfully read 1 
> entries from segment file 
> /metadata/hadoop-ozone/datanode/ratis/data/5e26b460-ca4e-4791-bf70-1fd535056988/current/log_inprogress_0
> 2020-01-08 02:06:46,576 WARN 
> org.apache.ratis.server.raftlog.segmented.LogSegment: Segment file is 
> corrupted: expected to have 0 entries but only 1 entries read successfully
> 2020-01-08 02:06:46,580 INFO 
> org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker: 
> 2d422fc8-f7c2-4e41-a59b-abbf76330dfe@group-1FD535056988-SegmentedRaftLogWorker:
>  flushIndex: setUnconditionally 0 -> 0
> 2020-01-08 02:06:46,618 INFO org.eclipse.jetty.util.log: Logging initialized 
> @1978ms
> 2020-01-08 02:06:46,738 INFO org.apache.ratis.server.RaftServerConfigKeys: 
> raft.server.snaps
> 2020-01-16 07:51:12,268 WARN 
> org.apache.ratis.server.raftlog.segmented.LogSegment: Segment file is 
> corrupted: expected to have -3668 entries but only 3500 entries read 
> successfully
> {code}





[jira] [Commented] (RATIS-755) Add a log dump command line utility inside ratis

2020-01-14 Thread Mukul Kumar Singh (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17015000#comment-17015000
 ] 

Mukul Kumar Singh commented on RATIS-755:
-

The log dump produced by this utility looks like this:
{code}
20178:ratis2 mukul$ java -jar 
ratis-tools/target/ratis-tools-0.5.0-SNAPSHOT-shaded.jar 
/tmp/arithmetic/64656d6f-5261-6674-4772-6f7570313233/current/log_inprogress_0
file path is 
/tmp/arithmetic/64656d6f-5261-6674-4772-6f7570313233/current/log_inprogress_0
(t:1, i:0), CONFIGURATIONENTRY
(t:1, i:1), STATEMACHINELOGENTRY, client-6143FC438F2B, cid=0
 State Machine: a = 3
(t:1, i:2), METADATAENTRY(c1)
(t:1, i:3), STATEMACHINELOGENTRY, client-FBF7EA69AB2A, cid=0
 State Machine: b = 4
(t:1, i:4), METADATAENTRY(c3)
{code}

> Add a log dump command line utility inside ratis
> 
>
> Key: RATIS-755
> URL: https://issues.apache.org/jira/browse/RATIS-755
> Project: Ratis
>  Issue Type: Bug
>Affects Versions: 0.4.0
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
>  Labels: ozone
> Fix For: 0.5.0
>
> Attachments: RATIS-755.001.patch
>
>
> This tool proposes to add a utility to dump the following information to the 
> a) log index
> b) log term
> c) log entry type
> d) state machine data if present





[jira] [Updated] (RATIS-755) Add a log dump command line utility inside ratis

2020-01-14 Thread Mukul Kumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated RATIS-755:

Attachment: RATIS-755.001.patch

> Add a log dump command line utility inside ratis
> 
>
> Key: RATIS-755
> URL: https://issues.apache.org/jira/browse/RATIS-755
> Project: Ratis
>  Issue Type: Bug
>Affects Versions: 0.4.0
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
>  Labels: ozone
> Fix For: 0.5.0
>
> Attachments: RATIS-755.001.patch
>
>
> This tool proposes to add a utility to dump the following information to the 
> a) log index
> b) log term
> c) log entry type
> d) state machine data if present





[jira] [Assigned] (RATIS-755) Add a log dump command line utility inside ratis

2020-01-14 Thread Mukul Kumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh reassigned RATIS-755:
---

Assignee: Mukul Kumar Singh  (was: Prashant Pogde)

> Add a log dump command line utility inside ratis
> 
>
> Key: RATIS-755
> URL: https://issues.apache.org/jira/browse/RATIS-755
> Project: Ratis
>  Issue Type: Bug
>Affects Versions: 0.4.0
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
>  Labels: ozone
> Fix For: 0.5.0
>
>
> This tool proposes to add a utility to dump the following information to the 
> a) log index
> b) log term
> c) log entry type
> d) state machine data if present





[jira] [Created] (RATIS-791) Refactor MiniOzoneLoadGenerator to add more load generators to chaos testing

2020-01-12 Thread Mukul Kumar Singh (Jira)
Mukul Kumar Singh created RATIS-791:
---

 Summary: Refactor MiniOzoneLoadGenerator to add more load 
generators to chaos testing
 Key: RATIS-791
 URL: https://issues.apache.org/jira/browse/RATIS-791
 Project: Ratis
  Issue Type: Bug
  Components: test
Reporter: Mukul Kumar Singh
Assignee: Mukul Kumar Singh


This jira proposes to add new load generators to chaos testing to test new 
modes of failures.





[jira] [Assigned] (RATIS-790) GrpcLogAppenders on Ratis leader block on each other

2020-01-10 Thread Mukul Kumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh reassigned RATIS-790:
---

Assignee: Tsz-wo Sze  (was: Shashikant Banerjee)

> GrpcLogAppenders on Ratis leader block on each other
> 
>
> Key: RATIS-790
> URL: https://issues.apache.org/jira/browse/RATIS-790
> Project: Ratis
>  Issue Type: Bug
>  Components: server
>Reporter: Shashikant Banerjee
>Assignee: Tsz-wo Sze
>Priority: Major
> Fix For: 0.5.0
>
>
> {code:java}
> "org.apache.ratis.server.impl.LogAppender$AppenderDaemon$$Lambda$369/802019944@2f87467e"
>  #152 daemon prio=5 os_prio=0 tid=0x7f6648271000 nid=0x7aeb waiting on 
> condition [0x7f65d7803000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x0003d82f4208> (a 
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>         at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>         at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>         at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>         at 
> java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
>         at 
> java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
>         at 
> java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.offer(ScheduledThreadPoolExecutor.java:1010)
>         at 
> java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.add(ScheduledThreadPoolExecutor.java:1037)
>         at 
> java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.add(ScheduledThreadPoolExecutor.java:809)
>         at 
> java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:328)
>         at 
> java.util.concurrent.ScheduledThreadPoolExecutor.schedule(ScheduledThreadPoolExecutor.java:533)
>         at 
> org.apache.ratis.util.TimeoutScheduler$Scheduler.schedule(TimeoutScheduler.java:83)
>         at 
> org.apache.ratis.util.TimeoutScheduler.onTimeout(TimeoutScheduler.java:155)
>         - locked <0x0003d7ecd4e8> (a 
> org.apache.ratis.util.TimeoutScheduler)
>         at 
> org.apache.ratis.util.TimeoutScheduler.onTimeout(TimeoutScheduler.java:138)
>         at 
> org.apache.ratis.util.TimeoutScheduler.onTimeout(TimeoutScheduler.java:191)
>         at 
> org.apache.ratis.grpc.server.GrpcLogAppender.sendRequest(GrpcLogAppender.java:203)
>         at 
> org.apache.ratis.grpc.server.GrpcLogAppender.appendLog(GrpcLogAppender.java:194)
>         at 
> org.apache.ratis.grpc.server.GrpcLogAppender.runAppenderImpl(GrpcLogAppender.java:121)
>         at 
> org.apache.ratis.server.impl.LogAppender$AppenderDaemon.run(LogAppender.java:77)
>         at 
> org.apache.ratis.server.impl.LogAppender$AppenderDaemon$$Lambda$369/802019944.run(Unknown
>  Source)
>         at java.lang.Thread.run(Thread.java:748)
> "org.apache.ratis.server.impl.LogAppender$AppenderDaemon$$Lambda$369/802019944@19778529"
>  #150 daemon prio=5 os_prio=0 tid=0x7f664826f800 nid=0x7aea waiting for 
> monitor entry [0x7f65d7904000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>         at 
> org.apache.ratis.util.TimeoutScheduler.onTimeout(TimeoutScheduler.java:151)
>         - waiting to lock <0x0003d7ecd4e8> (a 
> org.apache.ratis.util.TimeoutScheduler)
>         at 
> org.apache.ratis.util.TimeoutScheduler.onTimeout(TimeoutScheduler.java:138)
>         at 
> org.apache.ratis.util.TimeoutScheduler.onTimeout(TimeoutScheduler.java:191)
>         at 
> org.apache.ratis.grpc.server.GrpcLogAppender.sendRequest(GrpcLogAppender.java:203)
>         at 
> org.apache.ratis.grpc.server.GrpcLogAppender.appendLog(GrpcLogAppender.java:194)
>         at 
> org.apache.ratis.grpc.server.GrpcLogAppender.runAppenderImpl(GrpcLogAppender.java:121)
>         at 
> org.apache.ratis.server.impl.LogAppender$AppenderDaemon.run(LogAppender.java:77)
>         at 
> org.apache.ratis.server.impl.LogAppender$AppenderDaemon$$Lambda$369/802019944.run(Unknown
>  Source)
>         at java.lang.Thread.run(Thread.java:748)
> {code}
> The GrpcLogAppenders on a Ratis leader, one for each follower, seem to block 
> each other because they share the same TimeoutScheduler instance.
>  
> cc [~msingh]
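
The thread dump above shows both appender threads calling into a single TimeoutScheduler, so one waits on the monitor held by the other. The sketch below shows one possible mitigation and is not the fix applied in Ratis. `PerFollowerSchedulers` and `forFollower` are hypothetical names, and it assumes that giving each follower its own scheduler is acceptable.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;

// Hypothetical helper, not a Ratis class: one scheduler per follower,
// so appenders for different followers never contend on a shared lock.
class PerFollowerSchedulers {
  private final Map<String, ScheduledExecutorService> schedulers = new ConcurrentHashMap<>();

  // Returns the scheduler dedicated to this follower, creating it on first use.
  ScheduledExecutorService forFollower(String followerId) {
    return schedulers.computeIfAbsent(followerId, id ->
        Executors.newSingleThreadScheduledExecutor(r -> {
          Thread t = new Thread(r, "timeout-scheduler-" + id);
          t.setDaemon(true); // do not keep the JVM alive for timeout tasks
          return t;
        }));
  }

  // Stops all schedulers and discards their queued tasks.
  void shutdown() {
    schedulers.values().forEach(ScheduledExecutorService::shutdownNow);
  }
}
```

The trade-off is one extra thread per follower in exchange for removing the shared monitor from the append path.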





[jira] [Updated] (RATIS-785) Statemachine updater fails with assertion

2020-01-06 Thread Mukul Kumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated RATIS-785:

Reporter: Sammi Chen  (was: Shashikant Banerjee)

> Statemachine updater fails with assertion
> -
>
> Key: RATIS-785
> URL: https://issues.apache.org/jira/browse/RATIS-785
> Project: Ratis
>  Issue Type: Bug
>  Components: server
>Reporter: Sammi Chen
>Assignee: Shashikant Banerjee
>Priority: Major
> Fix For: 0.5.0
>
>
> {code:java}
> java.lang.IllegalStateException: retry cache entry should be pending: 
> client-7E602ACF0902:70:done
>         at 
> org.apache.ratis.util.Preconditions.assertTrue(Preconditions.java:63)
>         at 
> org.apache.ratis.server.impl.RetryCache.getOrCreateEntry(RetryCache.java:170)
>         at 
> org.apache.ratis.server.impl.RaftServerImpl.replyPendingRequest(RaftServerImpl.java:1242)
>         at 
> org.apache.ratis.server.impl.RaftServerImpl.applyLogToStateMachine(RaftServerImpl.java:1303)
>         at 
> org.apache.ratis.server.impl.StateMachineUpdater.applyLog(StateMachineUpdater.java:226)
>         at 
> org.apache.ratis.server.impl.StateMachineUpdater.run(StateMachineUpdater.java:167)
>         at java.lang.Thread.run(Thread.java:748)
> 2019-12-20 11:27:24,343 ERROR 
> org.apache.ratis.server.impl.StateMachineUpdater: 
> ed90869c-317e-4303-8922-9fa83a3983cb@group-9D552F016938-StateMachineUpdater: 
> the StateMachineUpdater hits Throwable
> java.lang.IllegalStateException: retry cache entry should be pending: 
> client-7E602ACF0902:70:done
>         at 
> org.apache.ratis.util.Preconditions.assertTrue(Preconditions.java:63)
>         at 
> org.apache.ratis.server.impl.RetryCache.getOrCreateEntry(RetryCache.java:170)
>         at 
> org.apache.ratis.server.impl.RaftServerImpl.replyPendingRequest(RaftServerImpl.java:1242)
>         at 
> org.apache.ratis.server.impl.RaftServerImpl.applyLogToStateMachine(RaftServerImpl.java:1303)
>         at 
> org.apache.ratis.server.impl.StateMachineUpdater.applyLog(StateMachineUpdater.java:226)
>         at 
> org.apache.ratis.server.impl.StateMachineUpdater.run(StateMachineUpdater.java:167)
>         at java.lang.Thread.run(Thread.java:748)
> {code}
> The issue seems to be caused by a precondition: the reply future in the 
> retry cache is already marked complete, whereas it is expected to be in the 
> pending state.
>  
> One possible case: if the entry gets evicted from the cache, we end up 
> creating two different requests (two log entries) for the same client ID and 
> call ID, which together form the key to the retryCache. If the server now 
> restarts and starts reapplying the transactions, the earlier index might add 
> the entry to the retryCache, but when the other log index is applied, it 
> might already see the future marked complete, because both entries have the 
> same retry cache key.
> FYI, the issue happens only after a restart.
> cc [~msingh], [~ljain] [~szetszwo]
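
The suspected sequence can be modelled in a few lines. This is a simplified sketch, not the Ratis `RetryCache`. The class `RetryCacheSketch`, the method `applyLogEntry`, and the string key format are assumptions made for illustration.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical model of a retry cache keyed by (client ID, call ID).
class RetryCacheSketch {
  private final Map<String, CompletableFuture<String>> entries = new ConcurrentHashMap<>();

  private static String key(String clientId, long callId) {
    return clientId + ":" + callId;
  }

  /**
   * Applies one log entry. Returns true if the cache entry was still pending
   * (the state the precondition expects), false if it was already complete.
   */
  boolean applyLogEntry(String clientId, long callId, String reply) {
    CompletableFuture<String> future =
        entries.computeIfAbsent(key(clientId, callId), k -> new CompletableFuture<>());
    boolean wasPending = !future.isDone();
    future.complete(reply); // has no effect if the future is already complete
    return wasPending;
  }
}
```

Applying two log entries with the same client ID and call ID makes the second call return false, which corresponds to the state the assertion in the stack trace rejects.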





[jira] [Commented] (RATIS-783) RaftServerProxy#groupAddAsync removes the group on AlreadyExistsException

2019-12-18 Thread Mukul Kumar Singh (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16999671#comment-16999671
 ] 

Mukul Kumar Singh commented on RATIS-783:
-

+1, the patch looks good to me.

> RaftServerProxy#groupAddAsync removes the group on AlreadyExistsException
> -
>
> Key: RATIS-783
> URL: https://issues.apache.org/jira/browse/RATIS-783
> Project: Ratis
>  Issue Type: Bug
>  Components: server
>Reporter: Nanda kumar
>Assignee: Nanda kumar
>Priority: Major
> Attachments: RATIS-783.000.patch
>
>
> {{RaftServerProxy#groupAddAsync}} removes the existing group on 
> {{AlreadyExistsException}}.
> We can log a WARN message, but we should not remove the existing group.





[jira] [Commented] (RATIS-779) StateMachine#truncateStateMachineData should be called during instantiation of TruncateLog

2019-12-18 Thread Mukul Kumar Singh (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16999231#comment-16999231
 ] 

Mukul Kumar Singh commented on RATIS-779:
-

Thanks for working on this [~ljain]. +1 the patch looks good to me.

> StateMachine#truncateStateMachineData should be called during instantiation 
> of TruncateLog
> --
>
> Key: RATIS-779
> URL: https://issues.apache.org/jira/browse/RATIS-779
> Project: Ratis
>  Issue Type: Bug
>  Components: server
>Reporter: Lokesh Jain
>Assignee: Lokesh Jain
>Priority: Major
> Attachments: RATIS-779.001.patch, RATIS-779.002.patch
>
>
> Currently StateMachine#truncateStateMachineData should be called in the 
> constructor of TruncateLog. This makes sure that the function is called with 
> RaftLog write lock held. StateMachine#writeStateMachineData is also called 
> with RaftLog write lock held. It is important for these calls to be 
> synchronized.
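
The locking requirement can be illustrated with a minimal sketch in which writes and truncation both take the same write lock. `LockedLogSketch` and its methods are hypothetical and stand in for the RaftLog and StateMachine calls named above. The in-memory list is an assumption that replaces real state machine data.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a log whose data writes and truncations
// are serialized by a single write lock.
class LockedLogSketch {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final List<String> stateMachineData = new ArrayList<>();

  // Analogous to writing state machine data under the log write lock.
  void write(String data) {
    lock.writeLock().lock();
    try {
      stateMachineData.add(data);
    } finally {
      lock.writeLock().unlock();
    }
  }

  // Analogous to truncating state machine data under the same write lock:
  // drops every entry at or after fromIndex.
  void truncate(int fromIndex) {
    lock.writeLock().lock();
    try {
      stateMachineData.subList(fromIndex, stateMachineData.size()).clear();
    } finally {
      lock.writeLock().unlock();
    }
  }

  int size() {
    lock.readLock().lock();
    try {
      return stateMachineData.size();
    } finally {
      lock.readLock().unlock();
    }
  }
}
```

Because both operations hold the write lock, a truncation can never interleave with a write, which is also why the work done inside the lock should stay small.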





[jira] [Commented] (RATIS-779) StateMachine#truncateStateMachineData should be called during instantiation of TruncateLog

2019-12-17 Thread Mukul Kumar Singh (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16998419#comment-16998419
 ] 

Mukul Kumar Singh commented on RATIS-779:
-

Thanks for working on this, [~ljain]. Let's add a comment in StateMachine.java 
about the lock under which this call happens, along with an advisory that this 
call and writeStateMachineData should be as lightweight as possible.

> StateMachine#truncateStateMachineData should be called during instantiation 
> of TruncateLog
> --
>
> Key: RATIS-779
> URL: https://issues.apache.org/jira/browse/RATIS-779
> Project: Ratis
>  Issue Type: Bug
>  Components: server
>Reporter: Lokesh Jain
>Assignee: Lokesh Jain
>Priority: Major
> Attachments: RATIS-779.001.patch
>
>
> Currently StateMachine#truncateStateMachineData should be called in the 
> constructor of TruncateLog. This makes sure that the function is called with 
> RaftLog write lock held. StateMachine#writeStateMachineData is also called 
> with RaftLog write lock held. It is important for these calls to be 
> synchronized.





[jira] [Commented] (RATIS-782) Upgrade maven version to 3.6.3

2019-12-17 Thread Mukul Kumar Singh (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16998056#comment-16998056
 ] 

Mukul Kumar Singh commented on RATIS-782:
-

Thanks for working on this [~ljain]. +1 the patch looks good to me.

> Upgrade maven version to 3.6.3
> --
>
> Key: RATIS-782
> URL: https://issues.apache.org/jira/browse/RATIS-782
> Project: Ratis
>  Issue Type: Bug
>Reporter: Lokesh Jain
>Assignee: Lokesh Jain
>Priority: Major
> Attachments: RATIS-782.001.patch
>
>






[jira] [Created] (RATIS-770) Add a precondition that Ratis writeBufferSize is lesser than SegmentSize in Ratis

2019-12-05 Thread Mukul Kumar Singh (Jira)
Mukul Kumar Singh created RATIS-770:
---

 Summary: Add a precondition that Ratis writeBufferSize is lesser 
than SegmentSize in Ratis
 Key: RATIS-770
 URL: https://issues.apache.org/jira/browse/RATIS-770
 Project: Ratis
  Issue Type: Bug
  Components: server
Reporter: Mukul Kumar Singh
Assignee: Prashant Pogde


Add a precondition that the Ratis writeBufferSize is less than the SegmentSize 
in Ratis.
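
A minimal sketch of such a check is shown below. `SegmentSizePrecondition` is a hypothetical helper. The parameter names mirror the issue text rather than actual Ratis configuration keys, and the strict "less than" comparison is an assumption taken from the summary.

```java
// Hypothetical helper, not Ratis code.
class SegmentSizePrecondition {
  // Throws if the write buffer could not fit inside a single log segment.
  static void check(long writeBufferSize, long segmentSize) {
    if (writeBufferSize >= segmentSize) {
      throw new IllegalStateException("writeBufferSize (" + writeBufferSize
          + ") must be less than segmentSize (" + segmentSize + ")");
    }
  }
}
```

Running the check once at server start-up would surface a misconfiguration before any log segment is written.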





[jira] [Created] (RATIS-769) Ratis RaftGroupId should also include number of nodes in the log output

2019-12-05 Thread Mukul Kumar Singh (Jira)
Mukul Kumar Singh created RATIS-769:
---

 Summary: Ratis RaftGroupId should also include number of nodes in 
the log output
 Key: RATIS-769
 URL: https://issues.apache.org/jira/browse/RATIS-769
 Project: Ratis
  Issue Type: Bug
  Components: server
Reporter: Mukul Kumar Singh
Assignee: Prashant Pogde


For log lines like 

{code}
2019-12-05 12:06:33,805 INFO impl.RaftServerImpl: 
ff9ad02f-a8b1-4641-8ddb-fec62ddd3a63@group-6CDAAB81725E: changes role from 
CANDIDATE to FOLLOWER at term 1 for DISCOVERED_A_NEW_TERM
{code}

To tell whether a Raft group has 1 or 3 nodes, it would be helpful to include 
the number of nodes as another parameter.





[jira] [Updated] (RATIS-767) DirectByteBuffers leaked by BufferedWriteChannel in SegmentRaftLog

2019-12-04 Thread Mukul Kumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated RATIS-767:

Attachment: RATIS-767.002.patch

> DirectByteBuffers leaked by BufferedWriteChannel in SegmentRaftLog
> --
>
> Key: RATIS-767
> URL: https://issues.apache.org/jira/browse/RATIS-767
> Project: Ratis
>  Issue Type: Bug
>  Components: server
>Reporter: Rajesh Balamohan
>Assignee: Mukul Kumar Singh
>Priority: Major
> Attachments: RATIS-767.001.patch, RATIS-767.002.patch, Screenshot 
> 2019-10-25 at 12.20.05 PM.png
>
>
> As noticed by Rajesh, Ratis is leaking DirectByteBuffers in 
> BufferedWriteChannel. 
> As has been shared in multiple articles on the internet, it is best to 
> allocate a pool of direct byte buffers and reuse them from the pool. Please 
> refer to: https://www.javamex.com/tutorials/io/nio_buffer_direct.shtml
> This jira introduces a BufferPool to avoid memory leaks.
>  !Screenshot 2019-10-25 at 12.20.05 PM.png! 
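
The pooling idea can be sketched as follows. This is not the BufferPool from the attached patch. `DirectBufferPool`, `acquire`, and `release` are hypothetical names, and the fixed buffer size is an assumption.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical pool that reuses direct buffers instead of allocating
// a new one for every write channel.
class DirectBufferPool {
  private final ConcurrentLinkedQueue<ByteBuffer> free = new ConcurrentLinkedQueue<>();
  private final int bufferSize;

  DirectBufferPool(int bufferSize) {
    this.bufferSize = bufferSize;
  }

  // Returns a pooled buffer if one is available, otherwise allocates a new one.
  ByteBuffer acquire() {
    ByteBuffer buffer = free.poll();
    return buffer != null ? buffer : ByteBuffer.allocateDirect(bufferSize);
  }

  // Resets the buffer's position and limit, then returns it to the pool.
  void release(ByteBuffer buffer) {
    buffer.clear();
    free.offer(buffer);
  }
}
```

A caller would pair each `acquire()` with a `release()` in a `finally` block, so that a buffer returns to the pool even when a write fails.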





[jira] [Created] (RATIS-764) BlockOutputStreamEntryPool.java:allocateNewBlock spams the log with ExcludeList information

2019-11-29 Thread Mukul Kumar Singh (Jira)
Mukul Kumar Singh created RATIS-764:
---

 Summary: BlockOutputStreamEntryPool.java:allocateNewBlock spams 
the log with ExcludeList information
 Key: RATIS-764
 URL: https://issues.apache.org/jira/browse/RATIS-764
 Project: Ratis
  Issue Type: Bug
  Components: client
Reporter: Mukul Kumar Singh


BlockOutputStreamEntryPool.java:allocateNewBlock spams the log with ExcludeList 
information with the following log lines

{code}
2019-11-29 20:22:51,590 [pool-244-thread-9] INFO  io.BlockOutputStreamEntryPool 
(BlockOutputStreamEntryPool.java:allocateNewBlock(257)) - Allocating block with 
ExcludeList {datanodes = [], containerIds = [], pipelineIds = []}
{code}





[jira] [Created] (RATIS-755) Add a log dump command line utility inside ratis

2019-11-19 Thread Mukul Kumar Singh (Jira)
Mukul Kumar Singh created RATIS-755:
---

 Summary: Add a log dump command line utility inside ratis
 Key: RATIS-755
 URL: https://issues.apache.org/jira/browse/RATIS-755
 Project: Ratis
  Issue Type: Bug
Affects Versions: 0.4.0
Reporter: Mukul Kumar Singh
Assignee: Mukul Kumar Singh
 Fix For: 0.5.0


This tool proposes to add a utility to dump the following information to the 

a) log index
b) log term
c) log entry type
d) state machine data if present





[jira] [Updated] (RATIS-752) Update Ratis thirdparty to 0.3.0

2019-11-12 Thread Mukul Kumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated RATIS-752:

Component/s: (was: server)
 thirdparty

> Update Ratis thirdparty to 0.3.0
> 
>
> Key: RATIS-752
> URL: https://issues.apache.org/jira/browse/RATIS-752
> Project: Ratis
>  Issue Type: Bug
>  Components: thirdparty
>Reporter: Mukul Kumar Singh
>Priority: Major
> Attachments: RATIS-752.001.patch
>
>
> This jira updates the ratis thirdparty version to 0.3.0 and also updates the 
> protobuf.version to 3.10.0 and grpc.version to 1.24.0.





[jira] [Commented] (RATIS-752) Update Ratis thirdparty to 0.3.0

2019-11-12 Thread Mukul Kumar Singh (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16972245#comment-16972245
 ] 

Mukul Kumar Singh commented on RATIS-752:
-

This is a followup of RATIS-407.

> Update Ratis thirdparty to 0.3.0
> 
>
> Key: RATIS-752
> URL: https://issues.apache.org/jira/browse/RATIS-752
> Project: Ratis
>  Issue Type: Bug
>  Components: server
>Reporter: Mukul Kumar Singh
>Priority: Major
> Attachments: RATIS-752.001.patch
>
>
> This jira updates the ratis thirdparty version to 0.3.0 and also updates the 
> protobuf.version to 3.10.0 and grpc.version to 1.24.0.





[jira] [Updated] (RATIS-752) Update Ratis thirdparty to 0.3.0

2019-11-12 Thread Mukul Kumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated RATIS-752:

Attachment: RATIS-752.001.patch

> Update Ratis thirdparty to 0.3.0
> 
>
> Key: RATIS-752
> URL: https://issues.apache.org/jira/browse/RATIS-752
> Project: Ratis
>  Issue Type: Bug
>  Components: server
>Reporter: Mukul Kumar Singh
>Priority: Major
> Attachments: RATIS-752.001.patch
>
>
> This jira updates the ratis thirdparty version to 0.3.0 and also updates the 
> protobuf.version to 3.10.0 and grpc.version to 1.24.0.





[jira] [Created] (RATIS-752) Update Ratis thirdparty to 0.3.0

2019-11-12 Thread Mukul Kumar Singh (Jira)
Mukul Kumar Singh created RATIS-752:
---

 Summary: Update Ratis thirdparty to 0.3.0
 Key: RATIS-752
 URL: https://issues.apache.org/jira/browse/RATIS-752
 Project: Ratis
  Issue Type: Bug
  Components: server
Reporter: Mukul Kumar Singh


This jira updates the ratis thirdparty version to 0.3.0 and also updates the 
protobuf.version to 3.10.0 and grpc.version to 1.24.0.





[jira] [Resolved] (RATIS-710) GC pauses in leader should not penalize appendRequest response processing

2019-10-31 Thread Mukul Kumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh resolved RATIS-710.
-
Resolution: Not A Problem

Resolving this as not an issue for now, as it would be better to identify the 
cause of the memory/GC pauses before fixing this.

> GC pauses in leader should not penalize appendRequest response processing
> -
>
> Key: RATIS-710
> URL: https://issues.apache.org/jira/browse/RATIS-710
> Project: Ratis
>  Issue Type: Bug
>  Components: raft-group
>Reporter: Shashikant Banerjee
>Assignee: Tsz-wo Sze
>Priority: Critical
> Fix For: 0.5.0
>
>
> In Ozone perf testing, it was observed that once the leader goes through a 
> GC pause cycle and wakes up, it times out all append requests, but the 
> follower seems to be processing the append requests fine. It goes into a 
> loop and ends up failing the watch requests on the leader.





[jira] [Updated] (RATIS-739) Leader should notify state machine about log append completion events

2019-10-31 Thread Mukul Kumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated RATIS-739:

Attachment: RATIS-739.001.patch

> Leader should notify state machine about log append completion events
> -
>
> Key: RATIS-739
> URL: https://issues.apache.org/jira/browse/RATIS-739
> Project: Ratis
>  Issue Type: Bug
>  Components: server
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
>  Labels: ozone
> Attachments: RATIS-739.001.patch
>
>
> Leader should notify state machine about log append completion events. This 
> can be used in Ozone to control the behaviour of container state machine 
> cache.





[jira] [Created] (RATIS-739) Leader should notify state machine about log append completion events

2019-10-31 Thread Mukul Kumar Singh (Jira)
Mukul Kumar Singh created RATIS-739:
---

 Summary: Leader should notify state machine about log append 
completion events
 Key: RATIS-739
 URL: https://issues.apache.org/jira/browse/RATIS-739
 Project: Ratis
  Issue Type: Bug
  Components: server
Reporter: Mukul Kumar Singh
Assignee: Mukul Kumar Singh


Leader should notify state machine about log append completion events. This can 
be used in Ozone to control the behaviour of container state machine cache.





[jira] [Commented] (RATIS-710) GC pauses in leader should not penalize appendRequest response processing

2019-10-31 Thread Mukul Kumar Singh (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16963891#comment-16963891
 ] 

Mukul Kumar Singh commented on RATIS-710:
-

Yes, I agree that after RATIS-457 this might ideally not be needed.

> GC pauses in leader should not penalize appendRequest response processing
> -
>
> Key: RATIS-710
> URL: https://issues.apache.org/jira/browse/RATIS-710
> Project: Ratis
>  Issue Type: Bug
>  Components: raft-group
>Reporter: Shashikant Banerjee
>Assignee: Tsz-wo Sze
>Priority: Critical
> Fix For: 0.5.0
>
>
> In Ozone perf testing, it was observed that once the leader goes through a 
> GC pause cycle and wakes up, it times out all append requests, but the 
> follower seems to be processing the append requests fine. It goes into a 
> loop and ends up failing the watch requests on the leader.





[jira] [Created] (RATIS-737) Release Ratis 0.3.0 Thirdparty

2019-10-30 Thread Mukul Kumar Singh (Jira)
Mukul Kumar Singh created RATIS-737:
---

 Summary: Release Ratis 0.3.0 Thirdparty
 Key: RATIS-737
 URL: https://issues.apache.org/jira/browse/RATIS-737
 Project: Ratis
  Issue Type: Bug
  Components: thirdparty
Reporter: Mukul Kumar Singh








[jira] [Resolved] (RATIS-346) Provide an option for raft server to not join Raft ring on restart

2019-10-23 Thread Mukul Kumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh resolved RATIS-346.
-
Resolution: Invalid

> Provide an option for raft server to not join Raft ring on restart
> --
>
> Key: RATIS-346
> URL: https://issues.apache.org/jira/browse/RATIS-346
> Project: Ratis
>  Issue Type: Bug
>  Components: server
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
>
> On restart of a Ratis server, it may be desirable to not allow the node to 
> rejoin the Raft ring. This jira proposes to add a config parameter to let the 
> server do this.





[jira] [Commented] (RATIS-707) Test failures caused by minTimeout set to zero

2019-10-23 Thread Mukul Kumar Singh (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16957762#comment-16957762
 ] 

Mukul Kumar Singh commented on RATIS-707:
-

[~swagle] [~szetszwo], should we reopen RATIS-698 to start the discussion on 
how we can solve the initial leader election issue?

> Test failures caused by minTimeout set to zero
> --
>
> Key: RATIS-707
> URL: https://issues.apache.org/jira/browse/RATIS-707
> Project: Ratis
>  Issue Type: Bug
>  Components: server
>Affects Versions: 0.5.0
>Reporter: Siddharth Wagle
>Assignee: Siddharth Wagle
>Priority: Major
> Fix For: 0.5.0
>
> Attachments: RATIS-707.01.patch, RATIS-707.01.patch
>
>
> TestRaftAsyncWithGrpc#testBasicAppendEntriesAsync and other tests fail if the 
> initial minTimeout is 0: the server can then trigger a leader election much 
> more frequently, because the heartbeat interval is still at minTimeoutMs/2.
> {code}
> 2019-10-11 00:45:47,813 INFO  impl.FollowerState 
> (FollowerState.java:run(108)) - s0@group-C51B0F2AC202-FollowerState: change 
> to CANDIDATE, lastRpcTime:21ms, electionTimeout:17ms
> 2019-10-11 00:45:47,870 INFO  impl.FollowerState 
> (FollowerState.java:run(108)) - s0@group-C51B0F2AC202-FollowerState: change 
> to CANDIDATE, lastRpcTime:35ms, electionTimeout:31ms
> 2019-10-11 00:45:47,933 INFO  impl.FollowerState 
> (FollowerState.java:run(108)) - s0@group-C51B0F2AC202-FollowerState: change 
> to CANDIDATE, lastRpcTime:51ms, electionTimeout:51ms
> 2019-10-11 00:45:47,969 INFO  impl.FollowerState 
> (FollowerState.java:run(108)) - s0@group-C51B0F2AC202-FollowerState: change 
> to CANDIDATE, lastRpcTime:22ms, electionTimeout:21ms
> {code}
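
The relationship between the election timeout range and the heartbeat interval can be sketched as below. This is not Ratis code. `ElectionTimeoutSketch` and its methods are hypothetical, and drawing the timeout uniformly from `[minMs, maxMs)` is an assumption based on how Raft implementations commonly randomize it.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper, not Ratis code.
class ElectionTimeoutSketch {
  // Picks a random election timeout in [minMs, maxMs). Requires minMs < maxMs.
  static long randomTimeoutMs(long minMs, long maxMs) {
    return ThreadLocalRandom.current().nextLong(minMs, maxMs);
  }

  // True if every timeout that can be drawn is longer than the heartbeat
  // interval, so a follower receiving heartbeats on time does not start an election.
  static boolean alwaysOutlastsHeartbeat(long minMs, long heartbeatIntervalMs) {
    return minMs > heartbeatIntervalMs;
  }
}
```

With an illustrative configured minimum of 150 ms and a heartbeat every 75 ms, a lower bound of 150 ms passes the check, while a lower bound of 0 ms does not, which matches the short electionTimeout values in the log above.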





[jira] [Updated] (RATIS-603) Add a logStringSupplier for RaftServerImpl to optionally print SmLogEntry on errors

2019-10-23 Thread Mukul Kumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated RATIS-603:

Attachment: RATIS-603.008.patch

> Add a logStringSupplier for RaftServerImpl to optionally print SmLogEntry on 
> errors
> ---
>
> Key: RATIS-603
> URL: https://issues.apache.org/jira/browse/RATIS-603
> Project: Ratis
>  Issue Type: New Feature
>  Components: server
>Affects Versions: 0.4.0
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
>  Labels: ozone
> Attachments: RATIS-603.001.patch, RATIS-603.002.patch, 
> RATIS-603.003.patch, RATIS-603.004.patch, RATIS-603.005.patch, 
> RATIS-603.006.patch, RATIS-603.007.patch, RATIS-603.008.patch
>
>
> This jira proposes to add a SmLogEntryProto to toString converter so that 
> logEntry information can be printed on errors/exceptions.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (RATIS-603) Add a logStringSupplier for RaftServerImpl to optionally print SmLogEntry on errors

2019-10-23 Thread Mukul Kumar Singh (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16957756#comment-16957756
 ] 

Mukul Kumar Singh commented on RATIS-603:
-

Thanks for the review [~szetszwo]. Patch v8 fixes the review comments.

> Add a logStringSupplier for RaftServerImpl to optionally print SmLogEntry on 
> errors
> ---
>
> Key: RATIS-603
> URL: https://issues.apache.org/jira/browse/RATIS-603
> Project: Ratis
>  Issue Type: New Feature
>  Components: server
>Affects Versions: 0.4.0
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
>  Labels: ozone
> Attachments: RATIS-603.001.patch, RATIS-603.002.patch, 
> RATIS-603.003.patch, RATIS-603.004.patch, RATIS-603.005.patch, 
> RATIS-603.006.patch, RATIS-603.007.patch, RATIS-603.008.patch
>
>
> This jira proposes to add a SmLogEntryProto to toString converter so that 
> logEntry information can be printed on errors/exceptions.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (RATIS-603) Add a logStringSupplier for RaftServerImpl to optionally print SmLogEntry on errors

2019-10-23 Thread Mukul Kumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated RATIS-603:

Attachment: RATIS-603.007.patch

> Add a logStringSupplier for RaftServerImpl to optionally print SmLogEntry on 
> errors
> ---
>
> Key: RATIS-603
> URL: https://issues.apache.org/jira/browse/RATIS-603
> Project: Ratis
>  Issue Type: New Feature
>  Components: server
>Affects Versions: 0.4.0
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
>  Labels: ozone
> Attachments: RATIS-603.001.patch, RATIS-603.002.patch, 
> RATIS-603.003.patch, RATIS-603.004.patch, RATIS-603.005.patch, 
> RATIS-603.006.patch, RATIS-603.007.patch
>
>
> This jira proposes to add a SmLogEntryProto to toString converter so that 
> logEntry information can be printed on errors/exceptions.





[jira] [Updated] (RATIS-704) Invoke sendAsync as soon as OrderedAsync is created

2019-10-23 Thread Mukul Kumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated RATIS-704:

Labels: ozone  (was: )

> Invoke sendAsync as soon as OrderedAsync is created
> ---
>
> Key: RATIS-704
> URL: https://issues.apache.org/jira/browse/RATIS-704
> Project: Ratis
>  Issue Type: Improvement
>  Components: client
>Reporter: Tsz-wo Sze
>Assignee: Tsz-wo Sze
>Priority: Major
>  Labels: ozone
> Attachments: r704_20191009.patch, r704_20191011.patch, 
> r704_20191018.patch, r704_20191022.patch
>
>
> In OrderedAsync, the messages are sent asynchronously except for the first 
> message.  The first message is used to establish the connection.  
> OrderedAsync will wait for the first message to complete before sending the 
> following messages.
> Note that, when sending only two messages, the performance of sending the 
> messages asynchronously degenerates to that of sending them sequentially.
> [~msingh] has discovered a case that can be optimized: an application may 
> send two or more messages and the first message may take a long time to 
> process.  In this case, we may send a dummy lightweight message to establish 
> the connection, and then send the real messages.
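The optimization described above can be sketched as follows. This is a simplified model under stated assumptions, not the Ratis OrderedAsync implementation: OrderedAsyncSketch, transmit and the "DUMMY" message are hypothetical, and the network call is replaced by a stub that replies immediately.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical model of an ordered asynchronous sender; not the Ratis OrderedAsync class.
final class OrderedAsyncSketch {
  // Completes once the first (connection-establishing) message has been answered.
  private final CompletableFuture<Void> connected;
  private final List<String> sent = new ArrayList<>();

  OrderedAsyncSketch() {
    // Send a dummy lightweight message as soon as the object is created, so that
    // real messages never have to wait behind a slow first request.
    connected = transmit("DUMMY").thenApply(reply -> null);
  }

  // Real messages are chained after the connection future and then sent asynchronously.
  CompletableFuture<String> sendAsync(String message) {
    return connected.thenCompose(ignored -> transmit(message));
  }

  // Stand-in for the network call: records the message and replies immediately.
  private synchronized CompletableFuture<String> transmit(String message) {
    sent.add(message);
    return CompletableFuture.completedFuture("reply:" + message);
  }

  // Returns a copy of the messages transmitted so far, in order.
  synchronized List<String> sentMessages() {
    return new ArrayList<>(sent);
  }
}
```

With this shape, the only request that blocks the others is the cheap dummy message, so a slow first real message no longer delays the ones that follow it.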





[jira] [Resolved] (RATIS-724) Leader election seem to timeout in Ratis

2019-10-23 Thread Mukul Kumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh resolved RATIS-724.
-
Resolution: Not A Problem

Adding a hostname entry to /etc/hosts resolves the issue. Closing this.
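
A quick way to check whether a peer's hostname resolves from a given machine is sketched below. HostnameCheck is a hypothetical helper, not part of Ratis, and it makes no claim about how Ratis itself resolves peer addresses; it only uses the standard java.net.InetAddress lookup.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical diagnostic helper; not part of Ratis.
final class HostnameCheck {
  // Returns true if the given host name (or IP literal) can be resolved to an address.
  static boolean resolves(String host) {
    try {
      InetAddress.getByName(host);
      return true;
    } catch (UnknownHostException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    // Check the host given on the command line, or "localhost" by default.
    String host = args.length > 0 ? args[0] : "localhost";
    System.out.println(host + " resolves: " + resolves(host));
  }
}
```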

> Leader election seem to timeout in Ratis
> 
>
> Key: RATIS-724
> URL: https://issues.apache.org/jira/browse/RATIS-724
> Project: Ratis
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Priority: Blocker
> Fix For: 0.5.0
>
>
> {code:java}
> java.util.concurrent.ExecutionException: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: INTERNAL: e9477db3-627c-447b-9726-7a0202331e44@group-284D2F681BFD is not in [RUNNING]: current state is NEW
> java.util.concurrent.ExecutionException: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: INTERNAL: e9477db3-627c-447b-9726-7a0202331e44@group-284D2F681BFD is not in [RUNNING]: current state is NEW
>   at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>   at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>   at org.apache.ratis.server.impl.LeaderElection.waitForResults(LeaderElection.java:259)
>   at org.apache.ratis.server.impl.LeaderElection.askForVotes(LeaderElection.java:197)
>   at org.apache.ratis.server.impl.LeaderElection.run(LeaderElection.java:133)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: INTERNAL: e9477db3-627c-447b-9726-7a0202331e44@group-284D2F681BFD is not in [RUNNING]: current state is NEW
>   at org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:233)
>   at org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:214)
>   at org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:139)
>   at org.apache.ratis.proto.grpc.RaftServerProtocolServiceGrpc$RaftServerProtocolServiceBlockingStub.requestVote(RaftServerProtocolServiceGrpc.java:265)
>   at org.apache.ratis.grpc.server.GrpcServerProtocolClient.requestVote(GrpcServerProtocolClient.java:99)
>   at org.apache.ratis.grpc.server.GrpcService.requestVote(GrpcService.java:204)
>   at org.apache.ratis.server.impl.LeaderElection.lambda$submitRequests$1(LeaderElection.java:234)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   ... 1 more
> 2019-10-21 17:40:18,702 [4ed74939-427d-455d-852a-df3499c9dbb2@group-284D2F681BFD-LeaderElection1] INFO  impl.LeaderElection (LogUtils.java:infoOrTrace(149)) - 4ed74939-427d-455d-852a-df3499c9dbb2@group-284D2F681BFD-LeaderElection1 got exception when requesting votes: {}
> java.util.concurrent.ExecutionException: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: deadline exceeded after 2998885738ns
>   at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>   at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>   at org.apache.ratis.server.impl.LeaderElection.waitForResults(LeaderElection.java:259)
>   at org.apache.ratis.server.impl.LeaderElection.askForVotes(LeaderElection.java:197)
>   at org.apache.ratis.server.impl.LeaderElection.run(LeaderElection.java:133)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: deadline exceeded after 2998885738ns
>   at org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:233)
>   at org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:214)
>   at org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:139)
>   at org.apache.ratis.proto.grpc.RaftServerProtocolServiceGrpc$RaftServerProtocolServiceBlockingStub.requestVote(RaftServerProtocolServiceGrpc.java:265)
>   at org.apache.ratis.grpc.server.GrpcServerProtocolClient.requestVote(GrpcServerProtocolClient.java:99)
>   at org.apache.ratis.grpc.server.GrpcService.requestVote(GrpcService.java:204)
>   at org.apache.ratis.server.impl.LeaderElection.lambda$submitRequests$1(LeaderElection.java:234)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at

[jira] [Comment Edited] (RATIS-407) Update grpc version to 1.24.0 and protobuf to 3.10.0 in Ratis thirdparty

2019-10-22 Thread Mukul Kumar Singh (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16957520#comment-16957520
 ] 

Mukul Kumar Singh edited comment on RATIS-407 at 10/23/19 3:49 AM:
---

Hey [~elserj] and [~szetszwo], I have already tested this with the 
Ratis-thirdparty unit tests and the Ratis unit tests, and have also used this 
version in Ozone for verification.


was (Author: msingh):
Hey Josh, I have already tested this with the Ratis-thirdparty unit tests and 
the Ratis unit tests, and have also used this version in Ozone for verification.

> Update grpc version to 1.24.0 and protobuf to 3.10.0 in Ratis thirdparty
> 
>
> Key: RATIS-407
> URL: https://issues.apache.org/jira/browse/RATIS-407
> Project: Ratis
>  Issue Type: Bug
>  Components: thirdparty
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
> Attachments: RATIS-407.001.patch, RATIS-407.002.patch
>
>
> Ratis currently uses gRPC version 1.14.0. This can be updated to 1.16.0.
> Along with this, the protobuf version can be updated to 3.5.1 and Netty to 
> 4.1.30.Final.





[jira] [Commented] (RATIS-407) Update grpc version to 1.24.0 and protobuf to 3.10.0 in Ratis thirdparty

2019-10-22 Thread Mukul Kumar Singh (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16957520#comment-16957520
 ] 

Mukul Kumar Singh commented on RATIS-407:
-

Hey Josh, I have already tested this with the Ratis-thirdparty unit tests and 
the Ratis unit tests, and have also used this version in Ozone for verification.

> Update grpc version to 1.24.0 and protobuf to 3.10.0 in Ratis thirdparty
> 
>
> Key: RATIS-407
> URL: https://issues.apache.org/jira/browse/RATIS-407
> Project: Ratis
>  Issue Type: Bug
>  Components: thirdparty
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
> Attachments: RATIS-407.001.patch, RATIS-407.002.patch
>
>
> Ratis currently uses gRPC version 1.14.0. This can be updated to 1.16.0.
> Along with this, the protobuf version can be updated to 3.5.1 and Netty to 
> 4.1.30.Final.





[jira] [Commented] (RATIS-727) Garbage collection due to same request retries on a follower

2019-10-22 Thread Mukul Kumar Singh (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16957094#comment-16957094
 ] 

Mukul Kumar Singh commented on RATIS-727:
-

[~szetszwo], NotLeaderException (NLE) retries happen without a timeout only 
when the leader is known. So if an NLE points to a peer as the leader, the next 
request is sent directly to that leader.

> Garbage collection due to same request retries on a follower
> 
>
> Key: RATIS-727
> URL: https://issues.apache.org/jira/browse/RATIS-727
> Project: Ratis
>  Issue Type: Bug
>  Components: client
>Reporter: Lokesh Jain
>Priority: Major
>
> A heap dump shows a client request being retried on the same follower 
> multiple times, and each time the request is rejected with a 
> NotLeaderException. In the case of Ozone, it is a WriteChunk request, which 
> leads to garbage collection of 16MB for every request. In the heap dump, a 
> client request is retried multiple times, leading to garbage collection of ~100MB.
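The retry behaviour discussed in this issue can be sketched as follows. This is an illustrative model only: NotLeaderSketchException and LeaderAwareRetrySketch are hypothetical names rather than Ratis classes, and the cluster is simulated by an in-memory map. It shows a client redirecting to the leader suggested by a follower instead of resending the same request to that follower.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical exception carrying the leader suggested by a follower; not the Ratis class.
final class NotLeaderSketchException extends RuntimeException {
  final String suggestedLeader; // null when the follower does not know the leader

  NotLeaderSketchException(String suggestedLeader) {
    super("not leader, suggested leader: " + suggestedLeader);
    this.suggestedLeader = suggestedLeader;
  }
}

final class LeaderAwareRetrySketch {
  // Simulated cluster view: peer id -> the leader that peer knows about.
  private final Map<String, String> leaderKnownBy = new HashMap<>();
  int attempts = 0;

  LeaderAwareRetrySketch(String leader, String... followers) {
    leaderKnownBy.put(leader, leader);
    for (String follower : followers) {
      leaderKnownBy.put(follower, leader);
    }
  }

  // Simulated server side: only the leader accepts the request.
  private String sendTo(String peer, String request) {
    attempts++;
    String leader = leaderKnownBy.get(peer);
    if (!peer.equals(leader)) {
      throw new NotLeaderSketchException(leader);
    }
    return "ok:" + request + "@" + peer;
  }

  // Client side: on a not-leader reply, switch to the suggested leader before retrying.
  String send(String firstPeer, String request, int maxAttempts) {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be at least 1");
    }
    String target = firstPeer;
    NotLeaderSketchException last = null;
    for (int i = 0; i < maxAttempts; i++) {
      try {
        return sendTo(target, request);
      } catch (NotLeaderSketchException e) {
        last = e;
        if (e.suggestedLeader != null) {
          target = e.suggestedLeader; // redirect instead of hitting the same follower again
        }
      }
    }
    throw last;
  }
}
```

In this model a request first sent to a follower reaches the leader on the second attempt, so a large payload is rejected at most once per follower that knows the leader.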





[jira] [Updated] (RATIS-724) Leader election seem to timeout in Ratis

2019-10-21 Thread Mukul Kumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated RATIS-724:

Priority: Blocker  (was: Major)

> Leader election seem to timeout in Ratis
> 
>
> Key: RATIS-724
> URL: https://issues.apache.org/jira/browse/RATIS-724
> Project: Ratis
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Priority: Blocker
> Fix For: 0.5.0
>
>
> {code:java}
> java.util.concurrent.ExecutionException: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: INTERNAL: e9477db3-627c-447b-9726-7a0202331e44@group-284D2F681BFD is not in [RUNNING]: current state is NEW
> java.util.concurrent.ExecutionException: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: INTERNAL: e9477db3-627c-447b-9726-7a0202331e44@group-284D2F681BFD is not in [RUNNING]: current state is NEW
>   at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>   at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>   at org.apache.ratis.server.impl.LeaderElection.waitForResults(LeaderElection.java:259)
>   at org.apache.ratis.server.impl.LeaderElection.askForVotes(LeaderElection.java:197)
>   at org.apache.ratis.server.impl.LeaderElection.run(LeaderElection.java:133)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: INTERNAL: e9477db3-627c-447b-9726-7a0202331e44@group-284D2F681BFD is not in [RUNNING]: current state is NEW
>   at org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:233)
>   at org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:214)
>   at org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:139)
>   at org.apache.ratis.proto.grpc.RaftServerProtocolServiceGrpc$RaftServerProtocolServiceBlockingStub.requestVote(RaftServerProtocolServiceGrpc.java:265)
>   at org.apache.ratis.grpc.server.GrpcServerProtocolClient.requestVote(GrpcServerProtocolClient.java:99)
>   at org.apache.ratis.grpc.server.GrpcService.requestVote(GrpcService.java:204)
>   at org.apache.ratis.server.impl.LeaderElection.lambda$submitRequests$1(LeaderElection.java:234)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   ... 1 more
> 2019-10-21 17:40:18,702 [4ed74939-427d-455d-852a-df3499c9dbb2@group-284D2F681BFD-LeaderElection1] INFO  impl.LeaderElection (LogUtils.java:infoOrTrace(149)) - 4ed74939-427d-455d-852a-df3499c9dbb2@group-284D2F681BFD-LeaderElection1 got exception when requesting votes: {}
> java.util.concurrent.ExecutionException: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: deadline exceeded after 2998885738ns
>   at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>   at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>   at org.apache.ratis.server.impl.LeaderElection.waitForResults(LeaderElection.java:259)
>   at org.apache.ratis.server.impl.LeaderElection.askForVotes(LeaderElection.java:197)
>   at org.apache.ratis.server.impl.LeaderElection.run(LeaderElection.java:133)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: deadline exceeded after 2998885738ns
>   at org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:233)
>   at org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:214)
>   at org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:139)
>   at org.apache.ratis.proto.grpc.RaftServerProtocolServiceGrpc$RaftServerProtocolServiceBlockingStub.requestVote(RaftServerProtocolServiceGrpc.java:265)
>   at org.apache.ratis.grpc.server.GrpcServerProtocolClient.requestVote(GrpcServerProtocolClient.java:99)
>   at org.apache.ratis.grpc.server.GrpcService.requestVote(GrpcService.java:204)
>   at org.apache.ratis.server.impl.LeaderElection.lambda$submitRequests$1(LeaderElection.java:234)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   ... 1 more
> 2019-10-21 17:40:18,705

[jira] [Commented] (RATIS-724) Leader election seem to timeout in Ratis

2019-10-21 Thread Mukul Kumar Singh (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956082#comment-16956082
 ] 

Mukul Kumar Singh commented on RATIS-724:
-

cc [~szetszwo]

> Leader election seem to timeout in Ratis
> 
>
> Key: RATIS-724
> URL: https://issues.apache.org/jira/browse/RATIS-724
> Project: Ratis
>  Issue Type: Bug
>Reporter: Shashikant Banerjee
>Priority: Major
> Fix For: 0.5.0
>
>
> {code:java}
> java.util.concurrent.ExecutionException: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: INTERNAL: e9477db3-627c-447b-9726-7a0202331e44@group-284D2F681BFD is not in [RUNNING]: current state is NEW
> java.util.concurrent.ExecutionException: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: INTERNAL: e9477db3-627c-447b-9726-7a0202331e44@group-284D2F681BFD is not in [RUNNING]: current state is NEW
>   at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>   at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>   at org.apache.ratis.server.impl.LeaderElection.waitForResults(LeaderElection.java:259)
>   at org.apache.ratis.server.impl.LeaderElection.askForVotes(LeaderElection.java:197)
>   at org.apache.ratis.server.impl.LeaderElection.run(LeaderElection.java:133)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: INTERNAL: e9477db3-627c-447b-9726-7a0202331e44@group-284D2F681BFD is not in [RUNNING]: current state is NEW
>   at org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:233)
>   at org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:214)
>   at org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:139)
>   at org.apache.ratis.proto.grpc.RaftServerProtocolServiceGrpc$RaftServerProtocolServiceBlockingStub.requestVote(RaftServerProtocolServiceGrpc.java:265)
>   at org.apache.ratis.grpc.server.GrpcServerProtocolClient.requestVote(GrpcServerProtocolClient.java:99)
>   at org.apache.ratis.grpc.server.GrpcService.requestVote(GrpcService.java:204)
>   at org.apache.ratis.server.impl.LeaderElection.lambda$submitRequests$1(LeaderElection.java:234)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   ... 1 more
> 2019-10-21 17:40:18,702 [4ed74939-427d-455d-852a-df3499c9dbb2@group-284D2F681BFD-LeaderElection1] INFO  impl.LeaderElection (LogUtils.java:infoOrTrace(149)) - 4ed74939-427d-455d-852a-df3499c9dbb2@group-284D2F681BFD-LeaderElection1 got exception when requesting votes: {}
> java.util.concurrent.ExecutionException: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: deadline exceeded after 2998885738ns
>   at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>   at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>   at org.apache.ratis.server.impl.LeaderElection.waitForResults(LeaderElection.java:259)
>   at org.apache.ratis.server.impl.LeaderElection.askForVotes(LeaderElection.java:197)
>   at org.apache.ratis.server.impl.LeaderElection.run(LeaderElection.java:133)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: deadline exceeded after 2998885738ns
>   at org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:233)
>   at org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:214)
>   at org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:139)
>   at org.apache.ratis.proto.grpc.RaftServerProtocolServiceGrpc$RaftServerProtocolServiceBlockingStub.requestVote(RaftServerProtocolServiceGrpc.java:265)
>   at org.apache.ratis.grpc.server.GrpcServerProtocolClient.requestVote(GrpcServerProtocolClient.java:99)
>   at org.apache.ratis.grpc.server.GrpcService.requestVote(GrpcService.java:204)
>   at org.apache.ratis.server.impl.LeaderElection.lambda$submitRequests$1(LeaderElection.java:234)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   ... 1 more
> 2019-10-21

[jira] [Updated] (RATIS-407) Update grpc version to 1.24.0 and protobuf to 3.10.0 in Ratis thirdparty

2019-10-21 Thread Mukul Kumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated RATIS-407:

Summary: Update grpc version to 1.24.0 and protobuf to 3.10.0 in Ratis 
thirdparty  (was: Update grpc version to 1.16.0)

> Update grpc version to 1.24.0 and protobuf to 3.10.0 in Ratis thirdparty
> 
>
> Key: RATIS-407
> URL: https://issues.apache.org/jira/browse/RATIS-407
> Project: Ratis
>  Issue Type: Bug
>  Components: thirdparty
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
> Attachments: RATIS-407.001.patch, RATIS-407.002.patch
>
>
> Ratis currently uses gRPC version 1.14.0. This can be updated to 1.16.0.
> Along with this, the protobuf version can be updated to 3.5.1 and Netty to 
> 4.1.30.Final.





[jira] [Commented] (RATIS-407) Update grpc version to 1.16.0

2019-10-20 Thread Mukul Kumar Singh (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16955750#comment-16955750
 ] 

Mukul Kumar Singh commented on RATIS-407:
-

[~elserj] and [~szetszwo] please have a look.

> Update grpc version to 1.16.0
> -
>
> Key: RATIS-407
> URL: https://issues.apache.org/jira/browse/RATIS-407
> Project: Ratis
>  Issue Type: Bug
>  Components: thirdparty
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
> Attachments: RATIS-407.001.patch, RATIS-407.002.patch
>
>
> Ratis currently uses gRPC version 1.14.0. This can be updated to 1.16.0.
> Along with this, the protobuf version can be updated to 3.5.1 and Netty to 
> 4.1.30.Final.





[jira] [Commented] (RATIS-407) Update grpc version to 1.16.0

2019-10-20 Thread Mukul Kumar Singh (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16955749#comment-16955749
 ] 

Mukul Kumar Singh commented on RATIS-407:
-

I have tested this patch with Ozone and Ratis test cases.

> Update grpc version to 1.16.0
> -
>
> Key: RATIS-407
> URL: https://issues.apache.org/jira/browse/RATIS-407
> Project: Ratis
>  Issue Type: Bug
>  Components: thirdparty
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
> Attachments: RATIS-407.001.patch, RATIS-407.002.patch
>
>
> Ratis currently uses gRPC version 1.14.0. This can be updated to 1.16.0.
> Along with this, the protobuf version can be updated to 3.5.1 and Netty to 
> 4.1.30.Final.





[jira] [Updated] (RATIS-407) Update grpc version to 1.16.0

2019-10-19 Thread Mukul Kumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated RATIS-407:

Attachment: RATIS-407.002.patch

> Update grpc version to 1.16.0
> -
>
> Key: RATIS-407
> URL: https://issues.apache.org/jira/browse/RATIS-407
> Project: Ratis
>  Issue Type: Bug
>  Components: thirdparty
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
> Attachments: RATIS-407.001.patch, RATIS-407.002.patch
>
>
> Ratis currently uses gRPC version 1.14.0. This can be updated to 1.16.0.
> Along with this, the protobuf version can be updated to 3.5.1 and Netty to 
> 4.1.30.Final.





[jira] [Commented] (RATIS-716) RetryCache$CacheEntry$replyFuture holds onto RaftClientRequest until eviction

2019-10-15 Thread Mukul Kumar Singh (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16952467#comment-16952467
 ] 

Mukul Kumar Singh commented on RATIS-716:
-

Amazing observation, [~ljain]. +1, the patch looks good to me.

> RetryCache$CacheEntry$replyFuture holds onto RaftClientRequest until eviction
> -
>
> Key: RATIS-716
> URL: https://issues.apache.org/jira/browse/RATIS-716
> Project: Ratis
>  Issue Type: Bug
>  Components: server
>Reporter: Lokesh Jain
>Assignee: Lokesh Jain
>Priority: Major
> Attachments: RATIS-716.001.patch
>
>
> RetryCache$CacheEntry$replyFuture holds onto RaftClientRequest until 
> eviction. Multiple cache entries can therefore hold a lot of memory, leading 
> to GC pauses.
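One way to avoid this kind of retention is sketched below. It is not the attached patch: ClientRequestSketch and RetryCacheSketch are hypothetical classes, and the sketch assumes the retention comes from the cached future keeping a reference to the request. The entry stores only the small call id and the reply future, so the request payload can be collected as soon as the request has been handled.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical request type with a large payload; not the Ratis RaftClientRequest.
final class ClientRequestSketch {
  final long callId;
  final byte[] payload;

  ClientRequestSketch(long callId, byte[] payload) {
    this.callId = callId;
    this.payload = payload;
  }
}

// Hypothetical retry cache; not the Ratis RetryCache.
final class RetryCacheSketch {
  // An entry keeps only the call id and the reply future, never the request itself.
  static final class Entry {
    final long callId;
    final CompletableFuture<String> replyFuture = new CompletableFuture<>();

    Entry(long callId) {
      this.callId = callId;
    }
  }

  private final ConcurrentMap<Long, Entry> entries = new ConcurrentHashMap<>();

  // Returns the existing entry for a retried request, or creates a new one.
  Entry getOrCreate(ClientRequestSketch request) {
    // Only the primitive id is passed on; the mapping function does not capture the request.
    return entries.computeIfAbsent(request.callId, Entry::new);
  }
}
```

A retry with the same call id finds the existing entry and its future, while the original request object is no longer reachable through the cache.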




