[jira] [Created] (RATIS-697) Provide helper scripts for code quality checks

2019-09-30 Thread Marton Elek (Jira)
Marton Elek created RATIS-697:
-

 Summary: Provide helper scripts for code quality checks
 Key: RATIS-697
 URL: https://issues.apache.org/jira/browse/RATIS-697
 Project: Ratis
  Issue Type: Improvement
Reporter: Marton Elek
Assignee: Marton Elek


In Ozone we started to use simple shell scripts to check the quality of the
code (dev-support/check/checkstyle.sh). They help us run local Maven commands
quickly and collect all of the results.

They will also help us use GitHub Actions or other highly parallel CI in the
future.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (RATIS-750) Ratis server fails with "java.lang.ClassNotFoundException: com.codahale.metrics.jvm.GarbageCollectorMetricSet"

2019-11-11 Thread Marton Elek (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16971684#comment-16971684
 ] 

Marton Elek commented on RATIS-750:
---

The dependency is marked optional intentionally, as metrics may or may not be
used by the applications. It is the responsibility of the Ratis user to put a
specific version on the classpath. I think a better fix is to add it as a real
compile dependency only where it is required (e.g. add it to the logService and
to the example server).
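
A minimal sketch of what "optional" means at runtime, assuming an application
that wants to register JVM metrics only when metrics-jvm is actually on the
classpath. This is not the Ratis implementation; the class OptionalJvmMetrics
and the helper isPresent are hypothetical names used for illustration only.

```java
final class OptionalJvmMetrics {
  // Class from the optional metrics-jvm artifact, as named in the stack trace.
  static final String GC_METRIC_SET =
      "com.codahale.metrics.jvm.GarbageCollectorMetricSet";

  private OptionalJvmMetrics() {}

  /** Returns true if the named class can be located on the classpath. */
  static boolean isPresent(String className) {
    try {
      // initialize=false: only locate the class, do not run static initializers.
      Class.forName(className, false, OptionalJvmMetrics.class.getClassLoader());
      return true;
    } catch (ClassNotFoundException | LinkageError e) {
      return false;
    }
  }

  public static void main(String[] args) {
    if (isPresent(GC_METRIC_SET)) {
      System.out.println("metrics-jvm found, JVM metrics can be registered");
    } else {
      System.out.println("metrics-jvm missing, skipping JVM metrics");
    }
  }
}
```

Modules that always need the JVM metrics (such as the example server) would
instead declare the dependency with compile scope, so no probing is required.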

> Ratis server fails with "java.lang.ClassNotFoundException: 
> com.codahale.metrics.jvm.GarbageCollectorMetricSet"
> --
>
> Key: RATIS-750
> URL: https://issues.apache.org/jira/browse/RATIS-750
> Project: Ratis
>  Issue Type: Bug
>  Components: build
>Reporter: Clay B.
>Assignee: Clay B.
>Priority: Major
> Attachments: 
> 0001-RATIS-750.-Ratis-server-fails-with-java.lang.ClassNo.patch
>
>
> In testing the current master, starting the Ratis server via 
> {{./ratis-examples/src/main/bin/server.sh filestore server --storage $storage 
> --id $id --peers $peers 2>&1 | \}} I end up with the following failure to 
> start:
> {code:java}
> Found 
> /home/vagrant/incubator-ratis/ratis-examples/target/ratis-examples-0.5.0-SNAPSHOT.jar
> 2019-11-11 03:27:52 INFO  MetricRegistries:64 - Loaded MetricRegistries class 
> org.apache.ratis.metrics.impl.MetricRegistriesImpl
> 2019-11-11 03:27:52 WARN  MetricRegistriesImpl:61 - First MetricRegistry has 
> been created without registering reporters. You may need to call 
> MetricRegistries.global().addReportRegistration(...) before.
> Exception in thread "main" java.lang.NoClassDefFoundError: 
> com/codahale/metrics/jvm/GarbageCollectorMetricSet
> at 
> org.apache.ratis.metrics.JVMMetrics.addJvmMetrics(JVMMetrics.java:42)
> at 
> org.apache.ratis.metrics.JVMMetrics.initJvmMetrics(JVMMetrics.java:32)
> at org.apache.ratis.examples.filestore.cli.Server.run(Server.java:60)
> at org.apache.ratis.examples.common.Runner.main(Runner.java:58)
> Caused by: java.lang.ClassNotFoundException: 
> com.codahale.metrics.jvm.GarbageCollectorMetricSet
> at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> ... 4 more
> === Command terminated normally (Mon Nov 11 03:27:52 2019) === {code}
>  





[jira] [Assigned] (RATIS-752) Update Ratis thirdparty to 0.3.0

2019-11-12 Thread Marton Elek (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marton Elek reassigned RATIS-752:
-

Assignee: Mukul Kumar Singh

> Update Ratis thirdparty to 0.3.0
> 
>
> Key: RATIS-752
> URL: https://issues.apache.org/jira/browse/RATIS-752
> Project: Ratis
>  Issue Type: Bug
>  Components: thirdparty
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
> Attachments: RATIS-752.001.patch
>
>
> This jira updates the ratis thirdparty version to 0.3.0 and also updates the 
> protobuf.version to 3.10.0 and grpc.version to 1.24.0.





[jira] [Updated] (RATIS-752) Update Ratis thirdparty to 0.3.0

2019-11-12 Thread Marton Elek (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marton Elek updated RATIS-752:
--
Fix Version/s: 0.5.0

> Update Ratis thirdparty to 0.3.0
> 
>
> Key: RATIS-752
> URL: https://issues.apache.org/jira/browse/RATIS-752
> Project: Ratis
>  Issue Type: Bug
>  Components: thirdparty
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
> Fix For: 0.5.0
>
> Attachments: RATIS-752.001.patch
>
>
> This jira updates the ratis thirdparty version to 0.3.0 and also updates the 
> protobuf.version to 3.10.0 and grpc.version to 1.24.0.





[jira] [Commented] (RATIS-750) Ratis server fails with "java.lang.ClassNotFoundException: com.codahale.metrics.jvm.GarbageCollectorMetricSet"

2019-11-12 Thread Marton Elek (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16972388#comment-16972388
 ] 

Marton Elek commented on RATIS-750:
---

Thanks for the update, [~clayb]. Yes, I think it gives us more flexibility, as
Ratis can be used without metrics-jvm. It's optional.

Tested and it worked well. Committed to master.

> Ratis server fails with "java.lang.ClassNotFoundException: 
> com.codahale.metrics.jvm.GarbageCollectorMetricSet"
> --
>
> Key: RATIS-750
> URL: https://issues.apache.org/jira/browse/RATIS-750
> Project: Ratis
>  Issue Type: Bug
>  Components: build
>Reporter: Clay B.
>Assignee: Clay B.
>Priority: Major
> Attachments: 
> 0001-RATIS-750.-Ratis-server-fails-with-java.lang.ClassNo.patch, 
> 0002-RATIS-750.-Ratis-server-fails-with-java.lang.ClassNo.patch
>
>
> In testing the current master, starting the Ratis server via 
> {{./ratis-examples/src/main/bin/server.sh filestore server --storage $storage 
> --id $id --peers $peers 2>&1 | \}} I end up with the following failure to 
> start:
> {code:java}
> Found 
> /home/vagrant/incubator-ratis/ratis-examples/target/ratis-examples-0.5.0-SNAPSHOT.jar
> 2019-11-11 03:27:52 INFO  MetricRegistries:64 - Loaded MetricRegistries class 
> org.apache.ratis.metrics.impl.MetricRegistriesImpl
> 2019-11-11 03:27:52 WARN  MetricRegistriesImpl:61 - First MetricRegistry has 
> been created without registering reporters. You may need to call 
> MetricRegistries.global().addReportRegistration(...) before.
> Exception in thread "main" java.lang.NoClassDefFoundError: 
> com/codahale/metrics/jvm/GarbageCollectorMetricSet
> at 
> org.apache.ratis.metrics.JVMMetrics.addJvmMetrics(JVMMetrics.java:42)
> at 
> org.apache.ratis.metrics.JVMMetrics.initJvmMetrics(JVMMetrics.java:32)
> at org.apache.ratis.examples.filestore.cli.Server.run(Server.java:60)
> at org.apache.ratis.examples.common.Runner.main(Runner.java:58)
> Caused by: java.lang.ClassNotFoundException: 
> com.codahale.metrics.jvm.GarbageCollectorMetricSet
> at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> ... 4 more
> === Command terminated normally (Mon Nov 11 03:27:52 2019) === {code}
>  





[jira] [Assigned] (RATIS-702) Make metrics reporting implementation pluggable

2019-10-07 Thread Marton Elek (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marton Elek reassigned RATIS-702:
-

Assignee: Marton Elek

> Make metrics reporting implementation pluggable
> ---
>
> Key: RATIS-702
> URL: https://issues.apache.org/jira/browse/RATIS-702
> Project: Ratis
>  Issue Type: Wish
>  Components: metrics
>Reporter: Henrik Hegardt
>Assignee: Marton Elek
>Priority: Major
>
> It would be really nice if the metrics functionality also was pluggable so 
> one could choose how to report metrics.





[jira] [Commented] (RATIS-702) Make metrics reporting implementation pluggable

2019-10-07 Thread Marton Elek (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16945977#comment-16945977
 ] 

Marton Elek commented on RATIS-702:
---

Thank you very much, [~hheg], for reporting this issue. I totally agree. The
reporters should be registered by the user of the Ratis library.

> Make metrics reporting implementation pluggable
> ---
>
> Key: RATIS-702
> URL: https://issues.apache.org/jira/browse/RATIS-702
> Project: Ratis
>  Issue Type: Wish
>  Components: metrics
>Reporter: Henrik Hegardt
>Assignee: Marton Elek
>Priority: Major
>
> It would be really nice if the metrics functionality also was pluggable so 
> one could choose how to report metrics.





[jira] [Updated] (RATIS-702) Make metrics reporting implementation pluggable

2019-10-10 Thread Marton Elek (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marton Elek updated RATIS-702:
--
Attachment: RATIS-702.001.patch

> Make metrics reporting implementation pluggable
> ---
>
> Key: RATIS-702
> URL: https://issues.apache.org/jira/browse/RATIS-702
> Project: Ratis
>  Issue Type: Wish
>  Components: metrics
>Reporter: Henrik Hegardt
>Assignee: Marton Elek
>Priority: Major
> Attachments: RATIS-702.001.patch
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> It would be really nice if the metrics functionality also was pluggable so 
> one could choose how to report metrics.





[jira] [Commented] (RATIS-702) Make metrics reporting implementation pluggable

2019-10-10 Thread Marton Elek (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948429#comment-16948429
 ] 

Marton Elek commented on RATIS-702:
---

Oh, sorry if I misunderstood the goal of this Jira.

Making the Ratis metrics totally vendor-independent is a bigger task, as we
have Dropwizard interfaces in our own interfaces. Supporting both Dropwizard 3
and 4 seems to be easier. As far as I can see, after this patch only JVMMetrics
depends on Dropwizard Metrics 3, and all the other interfaces are compatible.

I made the JVM- and Ganglia-related dependencies optional. I haven't tested it
(yet), but I think if you use the metrics library and bump the version of the
Dropwizard dependencies, it should work with 4.

> Make metrics reporting implementation pluggable
> ---
>
> Key: RATIS-702
> URL: https://issues.apache.org/jira/browse/RATIS-702
> Project: Ratis
>  Issue Type: Wish
>  Components: metrics
>Reporter: Henrik Hegardt
>Assignee: Marton Elek
>Priority: Major
> Attachments: RATIS-702.001.patch
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> It would be really nice if the metrics functionality also was pluggable so 
> one could choose how to report metrics.





[jira] [Updated] (RATIS-702) Make metrics reporting implementation pluggable

2019-10-10 Thread Marton Elek (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marton Elek updated RATIS-702:
--
Attachment: RATIS-702.002.patch

> Make metrics reporting implementation pluggable
> ---
>
> Key: RATIS-702
> URL: https://issues.apache.org/jira/browse/RATIS-702
> Project: Ratis
>  Issue Type: Wish
>  Components: metrics
>Reporter: Henrik Hegardt
>Assignee: Marton Elek
>Priority: Major
> Attachments: RATIS-702.001.patch, RATIS-702.002.patch
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> It would be really nice if the metrics functionality also was pluggable so 
> one could choose how to report metrics.





[jira] [Commented] (RATIS-697) Provide helper scripts for code quality checks

2019-10-10 Thread Marton Elek (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948461#comment-16948461
 ] 

Marton Elek commented on RATIS-697:
---

I just enabled GitHub Actions as a test right inside this pull request.

> Provide helper scripts for code quality checks
> --
>
> Key: RATIS-697
> URL: https://issues.apache.org/jira/browse/RATIS-697
> Project: Ratis
>  Issue Type: Improvement
>Reporter: Marton Elek
>Assignee: Marton Elek
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In Ozone we started to use simple shell scripts to check the quality of the
> code (dev-support/check/checkstyle.sh). They help us run local Maven commands
> quickly and collect all of the results.
> They will also help us use GitHub Actions or other highly parallel CI in the
> future.





[jira] [Updated] (RATIS-706) Dead lock in GrpcClientRpc

2019-10-10 Thread Marton Elek (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marton Elek updated RATIS-706:
--
Attachment: jstack.txt

> Dead lock in GrpcClientRpc
> --
>
> Key: RATIS-706
> URL: https://issues.apache.org/jira/browse/RATIS-706
> Project: Ratis
>  Issue Type: Bug
>  Components: gRPC
>Reporter: Marton Elek
>Priority: Major
> Attachments: jstack.txt
>
>
> I started an Ozone cluster on Kubernetes and started a freon test (ozone
> freon ockg -n1).
> After a while I found that one freon instance was no longer creating keys. I
> checked the OM RPC endpoint with ozone insight and no RPC messages had
> arrived.
> Based on the jstack output we have a deadlock between
> PeerProxyMap.handleException and GrpcClientRpc.sendRequestAsync.
> I am not sure (yet) what the exact problem is, but based on the stack traces
> it seems to be Ratis related.





[jira] [Created] (RATIS-706) Dead lock in GrpcClientRpc

2019-10-10 Thread Marton Elek (Jira)
Marton Elek created RATIS-706:
-

 Summary: Dead lock in GrpcClientRpc
 Key: RATIS-706
 URL: https://issues.apache.org/jira/browse/RATIS-706
 Project: Ratis
  Issue Type: Bug
  Components: gRPC
Reporter: Marton Elek
 Attachments: jstack.txt

I started an Ozone cluster on Kubernetes and started a freon test (ozone freon
ockg -n1).

After a while I found that one freon instance was no longer creating keys. I
checked the OM RPC endpoint with ozone insight and no RPC messages had arrived.

Based on the jstack output we have a deadlock between
PeerProxyMap.handleException and GrpcClientRpc.sendRequestAsync.

I am not sure (yet) what the exact problem is, but based on the stack traces it
seems to be Ratis related.
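
The deadlock information that jstack prints can also be obtained from inside
the JVM. The following is a minimal sketch using the standard
java.lang.management API; the class name DeadlockProbe and the method report
are hypothetical and are not part of Ratis or Ozone.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

final class DeadlockProbe {
  private DeadlockProbe() {}

  /** Returns one line per deadlocked thread, or an empty string if there is none. */
  static String report() {
    ThreadMXBean mx = ManagementFactory.getThreadMXBean();
    // Covers both monitor locks and java.util.concurrent ownable synchronizers;
    // returns null when no deadlock exists.
    long[] ids = mx.findDeadlockedThreads();
    if (ids == null) {
      return "";
    }
    StringBuilder sb = new StringBuilder();
    for (ThreadInfo info : mx.getThreadInfo(ids, true, true)) {
      if (info == null) {
        continue; // the thread terminated in the meantime
      }
      sb.append(info.getThreadName())
          .append(" waits for ").append(info.getLockName())
          .append(" held by ").append(info.getLockOwnerName())
          .append('\n');
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    String r = report();
    System.out.println(r.isEmpty() ? "no deadlock detected" : r);
  }
}
```

Calling such a probe periodically from a load generator would flag a stuck
client without attaching jstack by hand.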





[jira] [Commented] (RATIS-706) Dead lock in GrpcClientRpc

2019-10-14 Thread Marton Elek (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16951107#comment-16951107
 ] 

Marton Elek commented on RATIS-706:
---

Thanks for the quick fix, [~szetszwo]. I deployed it and it worked well; I
couldn't see the deadlock any more.

I am not sure if I can judge the code, but I will try if nobody else reviews
it.

> Dead lock in GrpcClientRpc
> --
>
> Key: RATIS-706
> URL: https://issues.apache.org/jira/browse/RATIS-706
> Project: Ratis
>  Issue Type: Bug
>  Components: gRPC
>Reporter: Marton Elek
>Assignee: Tsz-wo Sze
>Priority: Major
> Attachments: jstack.txt, r706_20191011.patch, r706_20191011b.patch
>
>
> I started an Ozone cluster on Kubernetes and started a freon test (ozone
> freon ockg -n1).
> After a while I found that one freon instance was no longer creating keys. I
> checked the OM RPC endpoint with ozone insight and no RPC messages had
> arrived.
> Based on the jstack output we have a deadlock between
> PeerProxyMap.handleException and GrpcClientRpc.sendRequestAsync.
> I am not sure (yet) what the exact problem is, but based on the stack traces
> it seems to be Ratis related.
> {code}
> Found one Java-level deadlock:
> =
> "pool-2-thread-6":
>   waiting to lock monitor 0x7f80356c8800 (object 0x00033eb70a00, a 
> java.lang.Object),
>   which is held by 
> "java.util.concurrent.ThreadPoolExecutor$Worker@77329f41[State = -1, empty 
> queue]"
> "java.util.concurrent.ThreadPoolExecutor$Worker@77329f41[State = -1, empty 
> queue]":
>   waiting to lock monitor 0x01170980 (object 0x00033eb99b10, a 
> org.apache.ratis.util.SlidingWindow$Client),
>   which is held by 
> "java.util.concurrent.ThreadPoolExecutor$Worker@df368f8[State = -1, empty 
> queue]"
> "java.util.concurrent.ThreadPoolExecutor$Worker@df368f8[State = -1, empty 
> queue]":
>   waiting to lock monitor 0x7f80356c8800 (object 0x00033eb70a00, a 
> java.lang.Object),
>   which is held by 
> "java.util.concurrent.ThreadPoolExecutor$Worker@77329f41[State = -1, empty 
> queue]"
> Java stack information for the threads listed above:
> ===
> "pool-2-thread-6":
>   at org.apache.ratis.util.PeerProxyMap.getProxy(PeerProxyMap.java:103)
>   - waiting to lock <0x00033eb70a00> (a java.lang.Object)
>   at 
> org.apache.ratis.grpc.client.GrpcClientRpc.sendRequestAsyncUnordered(GrpcClientRpc.java:78)
>   at 
> org.apache.ratis.client.impl.UnorderedAsync.sendRequestWithRetry(UnorderedAsync.java:75)
>   at 
> org.apache.ratis.client.impl.UnorderedAsync.send(UnorderedAsync.java:59)
>   at 
> org.apache.ratis.client.impl.RaftClientImpl.sendWatchAsync(RaftClientImpl.java:139)
>   at 
> org.apache.hadoop.hdds.scm.XceiverClientRatis.watchForCommit(XceiverClientRatis.java:282)
>   at 
> org.apache.hadoop.hdds.scm.storage.CommitWatcher.watchForCommit(CommitWatcher.java:198)
>   at 
> org.apache.hadoop.hdds.scm.storage.CommitWatcher.watchOnLastIndex(CommitWatcher.java:161)
>   at 
> org.apache.hadoop.hdds.scm.storage.BlockOutputStream.watchForCommit(BlockOutputStream.java:346)
>   at 
> org.apache.hadoop.hdds.scm.storage.BlockOutputStream.handleFlush(BlockOutputStream.java:482)
>   at 
> org.apache.hadoop.hdds.scm.storage.BlockOutputStream.close(BlockOutputStream.java:496)
>   at 
> org.apache.hadoop.ozone.client.io.BlockOutputStreamEntry.close(BlockOutputStreamEntry.java:143)
>   at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleFlushOrClose(KeyOutputStream.java:435)
>   at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.close(KeyOutputStream.java:473)
>   at 
> org.apache.hadoop.ozone.client.io.OzoneOutputStream.close(OzoneOutputStream.java:60)
>   - locked <0x0003f2ba4240> (a 
> org.apache.hadoop.ozone.client.io.OzoneOutputStream)
>   at 
> org.apache.hadoop.ozone.freon.RandomKeyGenerator.createKey(RandomKeyGenerator.java:710)
>   at 
> org.apache.hadoop.ozone.freon.RandomKeyGenerator.access$1100(RandomKeyGenerator.java:88)
>   at 
> org.apache.hadoop.ozone.freon.RandomKeyGenerator$ObjectCreator.run(RandomKeyGenerator.java:615)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(java.base@11.0.3/Executors.java:515)
>   at 
> java.util.concurrent.FutureTask.run(java.base@11.0.3/FutureTask.java:264)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.3/ThreadPoolExecutor.java:1128)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.3/ThreadPoolExecutor.java:628)
>   at java.lang.Thread.run(java.base@11.0.3/Thread.java:834)
> "java.util.concurrent.ThreadPoolExecutor$Worker@77329f41[State = -1, empty 
> queue]":
>   at 
> 

[jira] [Created] (RATIS-820) Use https for the maven repositories

2020-02-28 Thread Marton Elek (Jira)
Marton Elek created RATIS-820:
-

 Summary: Use https for the maven repositories
 Key: RATIS-820
 URL: https://issues.apache.org/jira/browse/RATIS-820
 Project: Ratis
  Issue Type: Improvement
Reporter: Marton Elek



As reported here: https://github.com/apache/incubator-ratis/pull/53





[jira] [Resolved] (RATIS-820) Use https for the maven repositories

2020-02-28 Thread Marton Elek (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marton Elek resolved RATIS-820.
---
Fix Version/s: 0.6.0
   Resolution: Fixed

> Use https for the maven repositories
> 
>
> Key: RATIS-820
> URL: https://issues.apache.org/jira/browse/RATIS-820
> Project: Ratis
>  Issue Type: Improvement
>Reporter: Marton Elek
>Priority: Major
> Fix For: 0.6.0
>
>
> As reported here: https://github.com/apache/incubator-ratis/pull/53





[jira] [Commented] (RATIS-804) Race condition between cache evict and load in LogSegment

2020-01-29 Thread Marton Elek (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17025848#comment-17025848
 ] 

Marton Elek commented on RATIS-804:
---

The problematic segment is this (in CacheInvalidationPolicy.java):
{code:java}
if (result.isEmpty()) {
  for (int i = safeIndex; i >= j; i--) {
    LogSegment s = segments.get(i);
    if (s.getStartIndex() > lastAppliedIndex && s.hasCache()) {
      result.add(s);
      break;
    }
  }
} {code}
This is the last step of the algorithm. The evictImpl:
 # First checks which segments are not flushed. They should be kept.
 # (In case of a follower) Checks which segments are already applied.
 # (In case of a follower, and if no segments to remove were found up to this
point): *Removes the segments between the lastAppliedIndex and the
localFlushIndex* in the hope that they can be reloaded at any time. They can,
but only with locks.
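
To make the effect of the quoted loop concrete, here is a self-contained
sketch. The Segment class is a hypothetical stand-in for LogSegment that keeps
only the two accessors the loop uses, and pickNotYetApplied is an illustrative
name, not a method from CacheInvalidationPolicy.java.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class EvictSketch {
  private EvictSketch() {}

  /** Hypothetical stand-in for LogSegment. */
  static final class Segment {
    private final long startIndex;
    private final boolean cached;

    Segment(long startIndex, boolean cached) {
      this.startIndex = startIndex;
      this.cached = cached;
    }

    long getStartIndex() { return startIndex; }
    boolean hasCache() { return cached; }
  }

  /**
   * Mirrors the quoted loop: scanning from safeIndex down to j, pick the first
   * cached segment that starts after lastAppliedIndex.
   */
  static List<Segment> pickNotYetApplied(List<Segment> segments, int j,
      int safeIndex, long lastAppliedIndex) {
    List<Segment> result = new ArrayList<>();
    for (int i = safeIndex; i >= j; i--) {
      Segment s = segments.get(i);
      if (s.getStartIndex() > lastAppliedIndex && s.hasCache()) {
        result.add(s);
        break;
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<Segment> segments = Arrays.asList(
        new Segment(0, true), new Segment(100, true), new Segment(200, true));
    // lastAppliedIndex = 50: the segments starting at 100 and 200 are not applied yet.
    List<Segment> victims = pickNotYetApplied(segments, 0, 2, 50);
    System.out.println(victims.get(0).getStartIndex()); // prints 200
  }
}
```

The selected segment has not been applied yet, so the StateMachineUpdater will
have to load it again later, which is where the unlocked reload matters.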

> Race condition between cache evict and load in LogSegment
> -
>
> Key: RATIS-804
> URL: https://issues.apache.org/jira/browse/RATIS-804
> Project: Ratis
>  Issue Type: Bug
>Reporter: Marton Elek
>Priority: Critical
>
> I am doing some kind of stress testing with Ozone. I start one Datanode in
> FOLLOWER mode and the load generator (Freon) behaves like a LEADER.
> I am sending a huge number of AppendLogEntries to the FOLLOWER without any
> restraint.
> As a result I got an NPE:
> {code:java}
> 2020-01-28 15:08:20 ERROR StateMachineUpdater:184 - 
> 3fda0c39-ce3c-4540-a804-44d9ac1f4853@group-E1B13B4CA5C0-StateMachineUpdater: 
> the StateMachineUpdater hits Throwable
> org.apache.ratis.server.raftlog.RaftLogIOException: 
> java.lang.NullPointerException
> at 
> org.apache.ratis.server.raftlog.segmented.LogSegment.loadCache(LogSegment.java:320)
> at 
> org.apache.ratis.server.raftlog.segmented.SegmentedRaftLog.get(SegmentedRaftLog.java:293)
> at 
> org.apache.ratis.server.impl.StateMachineUpdater.applyLog(StateMachineUpdater.java:218)
> at 
> org.apache.ratis.server.impl.StateMachineUpdater.run(StateMachineUpdater.java:167)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.NullPointerException
> at java.util.Objects.requireNonNull(Objects.java:203)
> at 
> org.apache.ratis.server.raftlog.segmented.LogSegment$LogEntryLoader.load(LogSegment.java:214)
> at 
> org.apache.ratis.server.raftlog.segmented.LogSegment.loadCache(LogSegment.java:318)
> ... 4 more {code}
> It seems to be a race condition between LogSegment.evictCache() and 
> LogSegment.loadCache().
>  # StateMachineUpdater tries to update the StateMachine with the next log 
> entry
>  # It can't be found in the cache, therefore the LogSegment.loadCache() is 
> called
>  # The LogSegment.LogEntryLoader.load() reads the segment files from the disk
>  # After loading, it returns with the loaded entry
> If the gRPC thread evicts the cache between steps 3 and 4 (it's possible that
> the log segment is already flushed and can therefore be evicted), an NPE will
> be thrown.





[jira] [Commented] (RATIS-804) Race condition between cache evict and load in LogSegment

2020-02-07 Thread Marton Elek (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17032334#comment-17032334
 ] 

Marton Elek commented on RATIS-804:
---

{quote}[~elek], would you mind testing the patch?
{quote}
Sure, thanks for the patch. I have just started creating a new build to deploy
and test.

> Race condition between cache evict and load in LogSegment
> -
>
> Key: RATIS-804
> URL: https://issues.apache.org/jira/browse/RATIS-804
> Project: Ratis
>  Issue Type: Bug
>  Components: server
>Reporter: Marton Elek
>Assignee: Tsz-wo Sze
>Priority: Critical
> Attachments: r804_20200205.patch
>
>
> I am doing some kind of stress testing with Ozone. I start one Datanode in
> FOLLOWER mode and the load generator (Freon) behaves like a LEADER.
> I am sending a huge number of AppendLogEntries to the FOLLOWER without any
> restraint.
> As a result I got an NPE:
> {code:java}
> 2020-01-28 15:08:20 ERROR StateMachineUpdater:184 - 
> 3fda0c39-ce3c-4540-a804-44d9ac1f4853@group-E1B13B4CA5C0-StateMachineUpdater: 
> the StateMachineUpdater hits Throwable
> org.apache.ratis.server.raftlog.RaftLogIOException: 
> java.lang.NullPointerException
> at 
> org.apache.ratis.server.raftlog.segmented.LogSegment.loadCache(LogSegment.java:320)
> at 
> org.apache.ratis.server.raftlog.segmented.SegmentedRaftLog.get(SegmentedRaftLog.java:293)
> at 
> org.apache.ratis.server.impl.StateMachineUpdater.applyLog(StateMachineUpdater.java:218)
> at 
> org.apache.ratis.server.impl.StateMachineUpdater.run(StateMachineUpdater.java:167)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.NullPointerException
> at java.util.Objects.requireNonNull(Objects.java:203)
> at 
> org.apache.ratis.server.raftlog.segmented.LogSegment$LogEntryLoader.load(LogSegment.java:214)
> at 
> org.apache.ratis.server.raftlog.segmented.LogSegment.loadCache(LogSegment.java:318)
> ... 4 more {code}
> It seems to be a race condition between LogSegment.evictCache() and 
> LogSegment.loadCache().
>  # StateMachineUpdater tries to update the StateMachine with the next log 
> entry
>  # It can't be found in the cache, therefore the LogSegment.loadCache() is 
> called
>  # The LogSegment.LogEntryLoader.load() reads the segment files from the disk
>  # After loading, it returns with the loaded entry
> If the gRPC thread evicts the cache between steps 3 and 4 (it's possible that
> the log segment is already flushed and can therefore be evicted), an NPE will
> be thrown.





[jira] [Commented] (RATIS-804) Race condition between cache evict and load in LogSegment

2020-02-07 Thread Marton Elek (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17032519#comment-17032519
 ] 

Marton Elek commented on RATIS-804:
---

+1

 

I tested it and couldn't reproduce the exception any more.

(In fact, it's very hard to reproduce with a properly configured client. I used
a specific client which doesn't close the gRPC requests. With the fixed client,
it's very hard to see the exception during real tests.)

> Race condition between cache evict and load in LogSegment
> -
>
> Key: RATIS-804
> URL: https://issues.apache.org/jira/browse/RATIS-804
> Project: Ratis
>  Issue Type: Bug
>  Components: server
>Reporter: Marton Elek
>Assignee: Tsz-wo Sze
>Priority: Critical
> Attachments: r804_20200205.patch
>
>
> I am doing some kind of stress testing with Ozone. I start one Datanode in
> FOLLOWER mode and the load generator (Freon) behaves like a LEADER.
> I am sending a huge number of AppendLogEntries to the FOLLOWER without any
> restraint.
> As a result I got an NPE:
> {code:java}
> 2020-01-28 15:08:20 ERROR StateMachineUpdater:184 - 
> 3fda0c39-ce3c-4540-a804-44d9ac1f4853@group-E1B13B4CA5C0-StateMachineUpdater: 
> the StateMachineUpdater hits Throwable
> org.apache.ratis.server.raftlog.RaftLogIOException: 
> java.lang.NullPointerException
> at 
> org.apache.ratis.server.raftlog.segmented.LogSegment.loadCache(LogSegment.java:320)
> at 
> org.apache.ratis.server.raftlog.segmented.SegmentedRaftLog.get(SegmentedRaftLog.java:293)
> at 
> org.apache.ratis.server.impl.StateMachineUpdater.applyLog(StateMachineUpdater.java:218)
> at 
> org.apache.ratis.server.impl.StateMachineUpdater.run(StateMachineUpdater.java:167)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.NullPointerException
> at java.util.Objects.requireNonNull(Objects.java:203)
> at 
> org.apache.ratis.server.raftlog.segmented.LogSegment$LogEntryLoader.load(LogSegment.java:214)
> at 
> org.apache.ratis.server.raftlog.segmented.LogSegment.loadCache(LogSegment.java:318)
> ... 4 more {code}
> It seems to be a race condition between LogSegment.evictCache() and 
> LogSegment.loadCache().
>  # StateMachineUpdater tries to update the StateMachine with the next log 
> entry
>  # It can't be found in the cache, therefore the LogSegment.loadCache() is 
> called
>  # The LogSegment.LogEntryLoader.load() reads the segment files from the disk
>  # After loading, it returns with the loaded entry
> If the gRPC thread evicts the cache between steps 3 and 4 (it's possible that
> the log segment is already flushed and can therefore be evicted), an NPE will
> be thrown.





[jira] [Created] (RATIS-804) Race condition between cache evict and load in LogSegment

2020-01-28 Thread Marton Elek (Jira)
Marton Elek created RATIS-804:
-

 Summary: Race condition between cache evict and load in LogSegment
 Key: RATIS-804
 URL: https://issues.apache.org/jira/browse/RATIS-804
 Project: Ratis
  Issue Type: Bug
Reporter: Marton Elek


I am doing some kind of stress testing with Ozone. I start one Datanode in
FOLLOWER mode and the load generator (Freon) behaves like a LEADER.

I am sending a huge number of AppendLogEntries to the FOLLOWER without any
restraint.

As a result I got an NPE:
{code:java}
2020-01-28 15:08:20 ERROR StateMachineUpdater:184 - 
3fda0c39-ce3c-4540-a804-44d9ac1f4853@group-E1B13B4CA5C0-StateMachineUpdater: 
the StateMachineUpdater hits Throwable
org.apache.ratis.server.raftlog.RaftLogIOException: 
java.lang.NullPointerException
at 
org.apache.ratis.server.raftlog.segmented.LogSegment.loadCache(LogSegment.java:320)
at 
org.apache.ratis.server.raftlog.segmented.SegmentedRaftLog.get(SegmentedRaftLog.java:293)
at 
org.apache.ratis.server.impl.StateMachineUpdater.applyLog(StateMachineUpdater.java:218)
at 
org.apache.ratis.server.impl.StateMachineUpdater.run(StateMachineUpdater.java:167)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NullPointerException
at java.util.Objects.requireNonNull(Objects.java:203)
at 
org.apache.ratis.server.raftlog.segmented.LogSegment$LogEntryLoader.load(LogSegment.java:214)
at 
org.apache.ratis.server.raftlog.segmented.LogSegment.loadCache(LogSegment.java:318)
... 4 more {code}
It seems to be a race condition between LogSegment.evictCache() and 
LogSegment.loadCache().
 # StateMachineUpdater tries to update the StateMachine with the next log entry
 # It can't be found in the cache, therefore the LogSegment.loadCache() is 
called
 # The LogSegment.LogEntryLoader.load() reads the segment files from the disk
 # After loading, it returns with the loaded entry

If the gRPC thread evicts the cache between steps 3 and 4 (it's possible that
the log segment is already flushed and can therefore be evicted), an NPE will
be thrown.
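
The interleaving described above can be modelled in a few lines. This is a
simplified sketch under stated assumptions, not the Ratis code and not the
eventual patch: SegmentCacheSketch, readFromDisk, loadRacy and loadSafe are
hypothetical names, and the Runnable argument stands in for "another thread
runs here" so that the race becomes deterministic.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

final class SegmentCacheSketch {
  // Hypothetical cache: log index -> entry. A null reference means "evicted".
  private volatile Map<Long, String> cache;

  /** Stand-in for reading the segment file from disk. */
  private Map<Long, String> readFromDisk() {
    Map<Long, String> loaded = new HashMap<>();
    loaded.put(1L, "entry-1");
    loaded.put(2L, "entry-2");
    return loaded;
  }

  void evictCache() {
    cache = null;
  }

  /** Racy: re-reads the shared field after loading, so an eviction in between fails. */
  String loadRacy(long index, Runnable betweenLoadAndRead) {
    cache = readFromDisk();
    betweenLoadAndRead.run(); // simulates the evicting thread running at this point
    return Objects.requireNonNull(cache, "cache evicted").get(index);
  }

  /** Safer: answers from the local copy, so a concurrent eviction cannot null it out. */
  String loadSafe(long index, Runnable betweenLoadAndRead) {
    Map<Long, String> loaded = readFromDisk();
    cache = loaded;
    betweenLoadAndRead.run();
    return loaded.get(index);
  }

  public static void main(String[] args) {
    SegmentCacheSketch s = new SegmentCacheSketch();
    System.out.println(s.loadSafe(1L, s::evictCache)); // prints entry-1
    try {
      s.loadRacy(1L, s::evictCache);
    } catch (NullPointerException e) {
      System.out.println("NPE: " + e.getMessage()); // prints NPE: cache evicted
    }
  }
}
```

Returning the entry from the locally loaded data is only one possible remedy;
guarding load and evict with the same lock is another.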





[jira] [Created] (RATIS-816) Use peerId in error log / exception of GrpcServerProtocolClient

2020-02-17 Thread Marton Elek (Jira)
Marton Elek created RATIS-816:
-

 Summary: Use peerId in error log / exception of 
GrpcServerProtocolClient
 Key: RATIS-816
 URL: https://issues.apache.org/jira/browse/RATIS-816
 Project: Ratis
  Issue Type: Improvement
Reporter: Marton Elek


GrpcServerProtocolClient is used to send out requestVote and appendLogEntry 
requests.

I propose to store raftPeerId in the constructor and use it in the error / 
exception message.

This is not only about getting a more meaningful message (which is nice to 
have): in HDDS-3023 I am modifying the byte code to mock the leader->follower 
communication, and that is much easier to do if the required raftPeerId is 
available in the class.
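
A minimal sketch of the idea, assuming a simplified client: the class PeerAwareClient and its wrap method are hypothetical and are not the real GrpcServerProtocolClient API. The peer id is stored once in the constructor and reused in every error message, which also makes it reachable for instrumentation.

```java
import java.io.IOException;
import java.util.Objects;

/** Hypothetical client; it only illustrates keeping the peer id as a field. */
public class PeerAwareClient {
  private final String raftPeerId;

  public PeerAwareClient(String raftPeerId) {
    this.raftPeerId = Objects.requireNonNull(raftPeerId, "raftPeerId");
  }

  public String getRaftPeerId() {
    return raftPeerId;
  }

  /** Wraps a low-level failure so that the message names the target peer. */
  public IOException wrap(String operation, Throwable cause) {
    return new IOException(
        operation + " to peer " + raftPeerId + " failed: " + cause.getMessage(), cause);
  }

  public static void main(String[] args) {
    PeerAwareClient client = new PeerAwareClient("peer-1");
    IOException e = client.wrap("appendEntries", new RuntimeException("connection refused"));
    System.out.println(e.getMessage());
  }
}
```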





[jira] [Updated] (RATIS-816) Use peerId in error log / exception of GrpcServerProtocolClient

2020-02-17 Thread Marton Elek (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marton Elek updated RATIS-816:
--
Attachment: RATIS-816.001.patch

> Use peerId in error log / exception of GrpcServerProtocolClient
> ---
>
> Key: RATIS-816
> URL: https://issues.apache.org/jira/browse/RATIS-816
> Project: Ratis
>  Issue Type: Improvement
>Reporter: Marton Elek
>Assignee: Marton Elek
>Priority: Major
> Attachments: RATIS-816.001.patch
>
>
> GrpcServerProtocolClient is used to send out requestVote and appendLogEntry 
> requests.
> I propose to store raftPeerId in the constructor and use it in the error / 
> exception message.
> This is not only about getting a more meaningful message (which is nice to 
> have): in HDDS-3023 I am modifying the byte code to mock the leader->follower 
> communication, and that is much easier to do if the required raftPeerId is 
> available in the class.





[jira] [Assigned] (RATIS-816) Use peerId in error log / exception of GrpcServerProtocolClient

2020-02-17 Thread Marton Elek (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marton Elek reassigned RATIS-816:
-

Assignee: Marton Elek

> Use peerId in error log / exception of GrpcServerProtocolClient
> ---
>
> Key: RATIS-816
> URL: https://issues.apache.org/jira/browse/RATIS-816
> Project: Ratis
>  Issue Type: Improvement
>Reporter: Marton Elek
>Assignee: Marton Elek
>Priority: Major
>
> GrpcServerProtocolClient is used to send out requestVote and appendLogEntry 
> requests.
> I propose to store raftPeerId in the constructor and use it in the error / 
> exception message.
> This is not only about getting a more meaningful message (which is nice to 
> have): in HDDS-3023 I am modifying the byte code to mock the leader->follower 
> communication, and that is much easier to do if the required raftPeerId is 
> available in the class.





[jira] [Updated] (RATIS-816) Use peerId in error log / exception of GrpcServerProtocolClient

2020-03-12 Thread Marton Elek (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marton Elek updated RATIS-816:
--
Fix Version/s: 0.6.0

> Use peerId in error log / exception of GrpcServerProtocolClient
> ---
>
> Key: RATIS-816
> URL: https://issues.apache.org/jira/browse/RATIS-816
> Project: Ratis
>  Issue Type: Improvement
>Reporter: Marton Elek
>Assignee: Marton Elek
>Priority: Major
> Fix For: 0.6.0
>
> Attachments: RATIS-816.001.patch, RATIS-816.002.patch, 
> RATIS-816.003.patch, RATIS-816.004.patch
>
>
> GrpcServerProtocolClient is used to send out requestVote and appendLogEntry 
> requests.
> I propose to store raftPeerId in the constructor and use it in the error / 
> exception message.
> This is not only about getting a more meaningful message (which is nice to 
> have): in HDDS-3023 I am modifying the byte code to mock the leader->follower 
> communication, and that is much easier to do if the required raftPeerId is 
> available in the class.





[jira] [Commented] (RATIS-816) Use peerId in error log / exception of GrpcServerProtocolClient

2020-03-12 Thread Marton Elek (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17057810#comment-17057810
 ] 

Marton Elek commented on RATIS-816:
---

Thanks for the review, [~msingh] and [~arp]. The Checkstyle problem is fixed; I am 
merging it to master.

> Use peerId in error log / exception of GrpcServerProtocolClient
> ---
>
> Key: RATIS-816
> URL: https://issues.apache.org/jira/browse/RATIS-816
> Project: Ratis
>  Issue Type: Improvement
>Reporter: Marton Elek
>Assignee: Marton Elek
>Priority: Major
> Attachments: RATIS-816.001.patch, RATIS-816.002.patch, 
> RATIS-816.003.patch, RATIS-816.004.patch
>
>
> GrpcServerProtocolClient is used to send out requestVote and appendLogEntry 
> requests.
> I propose to store raftPeerId in the constructor and use it in the error / 
> exception message.
> This is not only about getting a more meaningful message (which is nice to 
> have): in HDDS-3023 I am modifying the byte code to mock the leader->follower 
> communication, and that is much easier to do if the required raftPeerId is 
> available in the class.





[jira] [Updated] (RATIS-816) Use peerId in error log / exception of GrpcServerProtocolClient

2020-03-12 Thread Marton Elek (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marton Elek updated RATIS-816:
--
Attachment: RATIS-816.004.patch

> Use peerId in error log / exception of GrpcServerProtocolClient
> ---
>
> Key: RATIS-816
> URL: https://issues.apache.org/jira/browse/RATIS-816
> Project: Ratis
>  Issue Type: Improvement
>Reporter: Marton Elek
>Assignee: Marton Elek
>Priority: Major
> Attachments: RATIS-816.001.patch, RATIS-816.002.patch, 
> RATIS-816.003.patch, RATIS-816.004.patch
>
>
> GrpcServerProtocolClient is used to send out requestVote and appendLogEntry 
> requests.
> I propose to store raftPeerId in the constructor and use it in the error / 
> exception message.
> This is not only about getting a more meaningful message (which is nice to 
> have): in HDDS-3023 I am modifying the byte code to mock the leader->follower 
> communication, and that is much easier to do if the required raftPeerId is 
> available in the class.





[jira] [Updated] (RATIS-816) Use peerId in error log / exception of GrpcServerProtocolClient

2020-03-11 Thread Marton Elek (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marton Elek updated RATIS-816:
--
Attachment: RATIS-816.002.patch

> Use peerId in error log / exception of GrpcServerProtocolClient
> ---
>
> Key: RATIS-816
> URL: https://issues.apache.org/jira/browse/RATIS-816
> Project: Ratis
>  Issue Type: Improvement
>Reporter: Marton Elek
>Assignee: Marton Elek
>Priority: Major
> Attachments: RATIS-816.001.patch, RATIS-816.002.patch
>
>
> GrpcServerProtocolClient is used to send out requestVote and appendLogEntry 
> requests.
> I propose to store raftPeerId in the constructor and use it in the error / 
> exception message.
> This is not only about getting a more meaningful message (which is nice to 
> have): in HDDS-3023 I am modifying the byte code to mock the leader->follower 
> communication, and that is much easier to do if the required raftPeerId is 
> available in the class.





[jira] [Commented] (RATIS-816) Use peerId in error log / exception of GrpcServerProtocolClient

2020-03-11 Thread Marton Elek (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17057015#comment-17057015
 ] 

Marton Elek commented on RATIS-816:
---

Sure, uploaded in the 2nd patch.

> Use peerId in error log / exception of GrpcServerProtocolClient
> ---
>
> Key: RATIS-816
> URL: https://issues.apache.org/jira/browse/RATIS-816
> Project: Ratis
>  Issue Type: Improvement
>Reporter: Marton Elek
>Assignee: Marton Elek
>Priority: Major
> Attachments: RATIS-816.001.patch, RATIS-816.002.patch
>
>
> GrpcServerProtocolClient is used to send out requestVote and appendLogEntry 
> requests.
> I propose to store raftPeerId in the constructor and use it in the error / 
> exception message.
> This is not only about getting a more meaningful message (which is nice to 
> have): in HDDS-3023 I am modifying the byte code to mock the leader->follower 
> communication, and that is much easier to do if the required raftPeerId is 
> available in the class.





[jira] [Updated] (RATIS-816) Use peerId in error log / exception of GrpcServerProtocolClient

2020-03-11 Thread Marton Elek (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marton Elek updated RATIS-816:
--
Attachment: RATIS-816.003.patch

> Use peerId in error log / exception of GrpcServerProtocolClient
> ---
>
> Key: RATIS-816
> URL: https://issues.apache.org/jira/browse/RATIS-816
> Project: Ratis
>  Issue Type: Improvement
>Reporter: Marton Elek
>Assignee: Marton Elek
>Priority: Major
> Attachments: RATIS-816.001.patch, RATIS-816.002.patch, 
> RATIS-816.003.patch
>
>
> GrpcServerProtocolClient is used to send out requestVote and appendLogEntry 
> requests.
> I propose to store raftPeerId in the constructor and use it in the error / 
> exception message.
> This is not only about getting a more meaningful message (which is nice to 
> have): in HDDS-3023 I am modifying the byte code to mock the leader->follower 
> communication, and that is much easier to do if the required raftPeerId is 
> available in the class.





[jira] [Created] (RATIS-827) Ratis example is leaking to the ratis-tools classpath

2020-03-13 Thread Marton Elek (Jira)
Marton Elek created RATIS-827:
-

 Summary: Ratis example is leaking to the ratis-tools classpath
 Key: RATIS-827
 URL: https://issues.apache.org/jira/browse/RATIS-827
 Project: Ratis
  Issue Type: Improvement
Reporter: Marton Elek
Assignee: Marton Elek


ratis-tools depends on the ratis-examples project, which means that all the 
projects using ratis-tools can get unexpected dependencies from the example 
project.

For example, I see the following in Ozone:

{code}
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/opt/hadoop/share/ozone/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/opt/hadoop/share/ozone/lib/ratis-examples-0.6.0-a320ae0-SNAPSHOT.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
{code}

I propose to move the example-dependent tools implementation to the example 
project and make the example project depend on the tools, instead of the 
other way around.





[jira] [Updated] (RATIS-827) Ratis example is leaking to the ratis-tools classpath

2020-03-13 Thread Marton Elek (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marton Elek updated RATIS-827:
--
Attachment: RATIS-827.001.patch

> Ratis example is leaking to the ratis-tools classpath
> -
>
> Key: RATIS-827
> URL: https://issues.apache.org/jira/browse/RATIS-827
> Project: Ratis
>  Issue Type: Improvement
>Reporter: Marton Elek
>Assignee: Marton Elek
>Priority: Critical
> Attachments: RATIS-827.001.patch
>
>
> ratis-tools depends on the ratis-examples project, which means that all the 
> projects using ratis-tools can get unexpected dependencies from the example 
> project.
> For example, I see the following in Ozone:
> {code}
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/opt/hadoop/share/ozone/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/opt/hadoop/share/ozone/lib/ratis-examples-0.6.0-a320ae0-SNAPSHOT.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> {code}
> I propose to move the example-dependent tools implementation to the example 
> project and make the example project depend on the tools, instead of the 
> other way around.





[jira] [Updated] (RATIS-827) Ratis example is leaking to the ratis-tools classpath

2020-03-13 Thread Marton Elek (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marton Elek updated RATIS-827:
--
Attachment: RATIS-827.002.patch

> Ratis example is leaking to the ratis-tools classpath
> -
>
> Key: RATIS-827
> URL: https://issues.apache.org/jira/browse/RATIS-827
> Project: Ratis
>  Issue Type: Improvement
>Reporter: Marton Elek
>Assignee: Marton Elek
>Priority: Critical
> Attachments: RATIS-827.001.patch, RATIS-827.002.patch
>
>
> ratis-tools depends on the ratis-examples project, which means that all the 
> projects using ratis-tools can get unexpected dependencies from the example 
> project.
> For example, I see the following in Ozone:
> {code}
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/opt/hadoop/share/ozone/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/opt/hadoop/share/ozone/lib/ratis-examples-0.6.0-a320ae0-SNAPSHOT.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> {code}
> I propose to move the example-dependent tools implementation to the example 
> project and make the example project depend on the tools, instead of the 
> other way around.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (RATIS-827) Ratis example is leaking to the ratis-tools classpath

2020-03-16 Thread Marton Elek (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marton Elek updated RATIS-827:
--
Attachment: RATIS-827.003.patch

> Ratis example is leaking to the ratis-tools classpath
> -
>
> Key: RATIS-827
> URL: https://issues.apache.org/jira/browse/RATIS-827
> Project: Ratis
>  Issue Type: Improvement
>Reporter: Marton Elek
>Assignee: Marton Elek
>Priority: Critical
> Attachments: RATIS-827.001.patch, RATIS-827.002.patch, 
> RATIS-827.003.patch
>
>
> ratis-tools depends on the ratis-examples project, which means that all the 
> projects using ratis-tools can get unexpected dependencies from the example 
> project.
> For example, I see the following in Ozone:
> {code}
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/opt/hadoop/share/ozone/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/opt/hadoop/share/ozone/lib/ratis-examples-0.6.0-a320ae0-SNAPSHOT.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> {code}
> I propose to move the example-dependent tools implementation to the example 
> project and make the example project depend on the tools, instead of the 
> other way around.





[jira] [Commented] (RATIS-840) Memory leak of LogAppender

2020-04-28 Thread Marton Elek (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17094548#comment-17094548
 ] 

Marton Elek commented on RATIS-840:
---

[~szetszwo] can you please help to review it? Ozone test results are very noisy 
because of this issue.

> Memory leak of LogAppender
> --
>
> Key: RATIS-840
> URL: https://issues.apache.org/jira/browse/RATIS-840
> Project: Ratis
>  Issue Type: Bug
>  Components: server
>Reporter: runzhiwang
>Assignee: runzhiwang
>Priority: Blocker
> Attachments: RATIS-840.001.patch, RATIS-840.002.patch, 
> RATIS-840.003.patch, image-2020-04-06-14-27-28-485.png, 
> image-2020-04-06-14-27-39-582.png, screenshot-1.png, screenshot-2.png
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> *What's the problem ?*
>  After running hadoop-ozone for 4 days, the datanode leaks memory. In a heap 
> dump I found 460710 instances of GrpcLogAppender, but only 6 instances of 
> SenderList, and each SenderList contains 1-2 instances of GrpcLogAppender. 
> There are also a lot of logs related to 
> [LeaderState::restartSender|https://github.com/apache/incubator-ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/impl/LeaderState.java#L428].
>  {code:java}INFO impl.RaftServerImpl: 
> 1665f5ea-ab17-4a0e-af6d-6958efd322fa@group-F64B465F37B5-LeaderState: 
> Restarting GrpcLogAppender for 
> 1665f5ea-ab17-4a0e-af6d-6958efd322fa@group-F64B465F37B5-\u003e229cbcc1-a3b2-4383-9c0d-c0f4c28c3d4a\n","stream":"stderr","time":"2020-04-06T03:59:53.37892512Z"}{code}
>  
>  So a lot of GrpcLogAppender instances did not stop their daemon thread when 
> removed from senders. 
>  !image-2020-04-06-14-27-28-485.png! 
>  !image-2020-04-06-14-27-39-582.png! 
>  
> *Why 
> [LeaderState::restartSender|https://github.com/apache/incubator-ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/impl/LeaderState.java#L428]
>  so many times ?*
> 1. As the image shows, when a group is removed, SegmentedRaftLog is closed, and 
> GrpcLogAppender throws an exception when it finds that SegmentedRaftLog was 
> closed. Then GrpcLogAppender will be 
> [restarted|https://github.com/apache/incubator-ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/impl/LogAppender.java#L94],
>  and the new GrpcLogAppender throws an exception again when it finds that 
> SegmentedRaftLog was closed, so GrpcLogAppender is restarted again, and so on. 
> This results in an infinite restart of GrpcLogAppender.
> 2. Actually, when a group is removed, GrpcLogAppender will be stopped: 
> RaftServerImpl::shutdown -> 
> [RoleInfo::shutdownLeaderState|https://github.com/apache/incubator-ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/impl/RaftServerImpl.java#L266]
>  -> LeaderState::stop -> LogAppender::stopAppender, then SegmentedRaftLog 
> will be closed:  RaftServerImpl::shutdown -> 
> [ServerState:close|https://github.com/apache/incubator-ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/impl/RaftServerImpl.java#L271]
>  ... . Although RoleInfo::shutdownLeaderState is called before 
> ServerState:close, the GrpcLogAppender is stopped asynchronously. So the 
> infinite restart of GrpcLogAppender happens when GrpcLogAppender stops after 
> SegmentedRaftLog is closed.
>  !screenshot-1.png! 
> *Why GrpcLogAppender did not stop the Daemon Thread when removed from senders 
> ?*
>  I found a lot of GrpcLogAppender instances blocked inside log4j. I think 
> GrpcLogAppender restarts too fast and then blocks in log4j.
>  !screenshot-2.png! 
> *Can the new GrpcLogAppender work normally ?*
> 1. Even without the above problem, the newly created GrpcLogAppender 
> still cannot work normally. 
> 2. When a new GrpcLogAppender is created, a new FollowerInfo is also created: 
> LeaderState::addAndStartSenders -> 
> LeaderState::addSenders->RaftServerImpl::newLogAppender -> [new 
> FollowerInfo|https://github.com/apache/incubator-ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/impl/RaftServerImpl.java#L129]
> 3. The newly created GrpcLogAppender appends an entry to the follower, and the 
> follower responds with SUCCESS.
> 4. Then LeaderState::updateCommit -> [LeaderState::getMajorityMin | 
> https://github.com/apache/incubator-ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/impl/LeaderState.java#L599]
>  -> 
> [voterLists.get(0) | 
> https://github.com/apache/incubator-ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/impl/LeaderState.java#L607].
>  {color:#DE350B}The error happens because voterLists.get(0) returns the 
> FollowerInfo of the old GrpcLogAppender, not the FollowerInfo of the new 
> GrpcLogAppender. {color}
> 5. Because the majority commit got from the FollowerInfo of the old 
> GrpcLogAppender never changes. So even though follower has append 

[jira] [Created] (RATIS-875) Bump the copyright year in the NOTICE of thirdparty

2020-04-22 Thread Marton Elek (Jira)
Marton Elek created RATIS-875:
-

 Summary: Bump the copyright year in the NOTICE of thirdparty
 Key: RATIS-875
 URL: https://issues.apache.org/jira/browse/RATIS-875
 Project: Ratis
  Issue Type: Improvement
Reporter: Marton Elek
Assignee: Marton Elek


Reported by [~arp]  during a 0.4.0 rc vote.





[jira] [Resolved] (RATIS-880) Update github description and disable merge options apart from Squash and merge

2020-04-25 Thread Marton Elek (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marton Elek resolved RATIS-880.
---
Fix Version/s: 0.6.0
   Resolution: Fixed

> Update github description and disable merge options apart from Squash and 
> merge
> ---
>
> Key: RATIS-880
> URL: https://issues.apache.org/jira/browse/RATIS-880
> Project: Ratis
>  Issue Type: Bug
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
> Fix For: 0.6.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Update github description and disable merge options apart from Squash and 
> merge





[jira] [Updated] (RATIS-840) Memory leak of LogAppender

2020-04-25 Thread Marton Elek (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marton Elek updated RATIS-840:
--
Priority: Blocker  (was: Critical)

> Memory leak of LogAppender
> --
>
> Key: RATIS-840
> URL: https://issues.apache.org/jira/browse/RATIS-840
> Project: Ratis
>  Issue Type: Bug
>  Components: server
>Reporter: runzhiwang
>Assignee: runzhiwang
>Priority: Blocker
> Attachments: RATIS-840.001.patch, RATIS-840.002.patch, 
> RATIS-840.003.patch, image-2020-04-06-14-27-28-485.png, 
> image-2020-04-06-14-27-39-582.png, screenshot-1.png, screenshot-2.png
>
>
> *What's the problem ?*
>  After running hadoop-ozone for 4 days, the datanode leaks memory. In a heap 
> dump I found 460710 instances of GrpcLogAppender, but only 6 instances of 
> SenderList, and each SenderList contains 1-2 instances of GrpcLogAppender. 
> There are also a lot of logs related to 
> [LeaderState::restartSender|https://github.com/apache/incubator-ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/impl/LeaderState.java#L428].
>  {code:java}INFO impl.RaftServerImpl: 
> 1665f5ea-ab17-4a0e-af6d-6958efd322fa@group-F64B465F37B5-LeaderState: 
> Restarting GrpcLogAppender for 
> 1665f5ea-ab17-4a0e-af6d-6958efd322fa@group-F64B465F37B5-\u003e229cbcc1-a3b2-4383-9c0d-c0f4c28c3d4a\n","stream":"stderr","time":"2020-04-06T03:59:53.37892512Z"}{code}
>  
>  So a lot of GrpcLogAppender instances did not stop their daemon thread when 
> removed from senders. 
>  !image-2020-04-06-14-27-28-485.png! 
>  !image-2020-04-06-14-27-39-582.png! 
>  
> *Why 
> [LeaderState::restartSender|https://github.com/apache/incubator-ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/impl/LeaderState.java#L428]
>  so many times ?*
> 1. As the image shows, when a group is removed, SegmentedRaftLog is closed, and 
> GrpcLogAppender throws an exception when it finds that SegmentedRaftLog was 
> closed. Then GrpcLogAppender will be 
> [restarted|https://github.com/apache/incubator-ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/impl/LogAppender.java#L94],
>  and the new GrpcLogAppender throws an exception again when it finds that 
> SegmentedRaftLog was closed, so GrpcLogAppender is restarted again, and so on. 
> This results in an infinite restart of GrpcLogAppender.
> 2. Actually, when a group is removed, GrpcLogAppender will be stopped: 
> RaftServerImpl::shutdown -> 
> [RoleInfo::shutdownLeaderState|https://github.com/apache/incubator-ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/impl/RaftServerImpl.java#L266]
>  -> LeaderState::stop -> LogAppender::stopAppender, then SegmentedRaftLog 
> will be closed:  RaftServerImpl::shutdown -> 
> [ServerState:close|https://github.com/apache/incubator-ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/impl/RaftServerImpl.java#L271]
>  ... . Although RoleInfo::shutdownLeaderState is called before 
> ServerState:close, the GrpcLogAppender is stopped asynchronously. So the 
> infinite restart of GrpcLogAppender happens when GrpcLogAppender stops after 
> SegmentedRaftLog is closed.
>  !screenshot-1.png! 
> *Why GrpcLogAppender did not stop the Daemon Thread when removed from senders 
> ?*
>  I found a lot of GrpcLogAppender instances blocked inside log4j. I think 
> GrpcLogAppender restarts too fast and then blocks in log4j.
>  !screenshot-2.png! 
> *Can the new GrpcLogAppender work normally ?*
> 1. Even without the above problem, the newly created GrpcLogAppender 
> still cannot work normally. 
> 2. When a new GrpcLogAppender is created, a new FollowerInfo is also created: 
> LeaderState::addAndStartSenders -> 
> LeaderState::addSenders->RaftServerImpl::newLogAppender -> [new 
> FollowerInfo|https://github.com/apache/incubator-ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/impl/RaftServerImpl.java#L129]
> 3. The newly created GrpcLogAppender appends an entry to the follower, and the 
> follower responds with SUCCESS.
> 4. Then LeaderState::updateCommit -> [LeaderState::getMajorityMin | 
> https://github.com/apache/incubator-ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/impl/LeaderState.java#L599]
>  -> 
> [voterLists.get(0) | 
> https://github.com/apache/incubator-ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/impl/LeaderState.java#L607].
>  {color:#DE350B}The error happens because voterLists.get(0) returns the 
> FollowerInfo of the old GrpcLogAppender, not the FollowerInfo of the new 
> GrpcLogAppender. {color}
> 5. Because the majority commit obtained from the FollowerInfo of the old 
> GrpcLogAppender never changes, the leader cannot update the commit even though 
> the follower has appended the entry successfully. So the newly created 
> GrpcLogAppender can never work normally.
> 6. The reason of unit test of 

[jira] [Commented] (RATIS-840) Memory leak of LogAppender

2020-04-29 Thread Marton Elek (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095242#comment-17095242
 ] 

Marton Elek commented on RATIS-840:
---

bq. Marton Elek Please wait for me, I have to make sure the patch does not 
cause new unit test failures. Because there are currently about 30 failing unit 
tests in Ratis even without my patch, it needs some time.

I am a huge +1 on this approach, but do you suggest fixing all the unit tests 
before RATIS-840? As I wrote, Ozone is bleeding. What is the plan?



> Memory leak of LogAppender
> --
>
> Key: RATIS-840
> URL: https://issues.apache.org/jira/browse/RATIS-840
> Project: Ratis
>  Issue Type: Bug
>  Components: server
>Reporter: runzhiwang
>Assignee: runzhiwang
>Priority: Blocker
> Attachments: RATIS-840.001.patch, RATIS-840.002.patch, 
> RATIS-840.003.patch, image-2020-04-06-14-27-28-485.png, 
> image-2020-04-06-14-27-39-582.png, screenshot-1.png, screenshot-2.png
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> *What's the problem ?*
>  After running hadoop-ozone for 4 days, the datanode leaks memory. In a heap 
> dump I found 460710 instances of GrpcLogAppender, but only 6 instances of 
> SenderList, and each SenderList contains 1-2 instances of GrpcLogAppender. 
> There are also a lot of logs related to 
> [LeaderState::restartSender|https://github.com/apache/incubator-ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/impl/LeaderState.java#L428].
>  {code:java}INFO impl.RaftServerImpl: 
> 1665f5ea-ab17-4a0e-af6d-6958efd322fa@group-F64B465F37B5-LeaderState: 
> Restarting GrpcLogAppender for 
> 1665f5ea-ab17-4a0e-af6d-6958efd322fa@group-F64B465F37B5-\u003e229cbcc1-a3b2-4383-9c0d-c0f4c28c3d4a\n","stream":"stderr","time":"2020-04-06T03:59:53.37892512Z"}{code}
>  
>  So a lot of GrpcLogAppender instances did not stop their daemon thread when 
> removed from senders. 
>  !image-2020-04-06-14-27-28-485.png! 
>  !image-2020-04-06-14-27-39-582.png! 
>  
> *Why 
> [LeaderState::restartSender|https://github.com/apache/incubator-ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/impl/LeaderState.java#L428]
>  so many times ?*
> 1. As the image shows, when a group is removed, SegmentedRaftLog is closed, and 
> GrpcLogAppender throws an exception when it finds that SegmentedRaftLog was 
> closed. Then GrpcLogAppender will be 
> [restarted|https://github.com/apache/incubator-ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/impl/LogAppender.java#L94],
>  and the new GrpcLogAppender throws an exception again when it finds that 
> SegmentedRaftLog was closed, so GrpcLogAppender is restarted again, and so on. 
> This results in an infinite restart of GrpcLogAppender.
> 2. Actually, when a group is removed, GrpcLogAppender will be stopped: 
> RaftServerImpl::shutdown -> 
> [RoleInfo::shutdownLeaderState|https://github.com/apache/incubator-ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/impl/RaftServerImpl.java#L266]
>  -> LeaderState::stop -> LogAppender::stopAppender, then SegmentedRaftLog 
> will be closed:  RaftServerImpl::shutdown -> 
> [ServerState:close|https://github.com/apache/incubator-ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/impl/RaftServerImpl.java#L271]
>  ... . Although RoleInfo::shutdownLeaderState is called before 
> ServerState:close, the GrpcLogAppender is stopped asynchronously. So the 
> infinite restart of GrpcLogAppender happens when GrpcLogAppender stops after 
> SegmentedRaftLog is closed.
>  !screenshot-1.png! 
> *Why GrpcLogAppender did not stop the Daemon Thread when removed from senders 
> ?*
>  I found a lot of GrpcLogAppender instances blocked inside log4j. I think 
> GrpcLogAppender restarts too fast and then blocks in log4j.
>  !screenshot-2.png! 
> *Can the new GrpcLogAppender work normally ?*
> 1. Even without the above problem, the newly created GrpcLogAppender 
> still cannot work normally. 
> 2. When a new GrpcLogAppender is created, a new FollowerInfo is also created: 
> LeaderState::addAndStartSenders -> 
> LeaderState::addSenders->RaftServerImpl::newLogAppender -> [new 
> FollowerInfo|https://github.com/apache/incubator-ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/impl/RaftServerImpl.java#L129]
> 3. The newly created GrpcLogAppender appends an entry to the follower, and the 
> follower responds with SUCCESS.
> 4. Then LeaderState::updateCommit -> [LeaderState::getMajorityMin | 
> https://github.com/apache/incubator-ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/impl/LeaderState.java#L599]
>  -> 
> [voterLists.get(0) | 
> https://github.com/apache/incubator-ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/impl/LeaderState.java#L607].
>  {color:#DE350B}Error happens because 

[jira] [Updated] (RATIS-948) Update Sonar statistics only from the apache repo, not from the forks

2020-05-28 Thread Marton Elek (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marton Elek updated RATIS-948:
--
Description: 
RATIS-940 enabled the Sonar check for all commits, but it doesn't work for 
forked repositories (unless somebody sets their own SONAR_TOKEN).

The fix is the same as in HDDS-2627: we can restrict the execution to the apache 
repository.

Thanks to [~ljain], who reported this issue.

Example failure:

https://github.com/lokeshj1703/incubator-ratis/runs/716253801

Note: PRs are not affected (no Sonar check runs there); only the 
builds of forked repos are affected.

  was:
RATIS-940 enabled the Sonar check for all commits, but it doesn't work for 
forked repositories (unless somebody sets their own SONAR_TOKEN).

The fix is the same as in HDDS-2627: we can restrict the execution to the apache 
repository.


> Update Sonar statistics only from the apache repo, not from the forks
> -
>
> Key: RATIS-948
> URL: https://issues.apache.org/jira/browse/RATIS-948
> Project: Ratis
>  Issue Type: Improvement
>Reporter: Marton Elek
>Assignee: Marton Elek
>Priority: Major
>
> RATIS-940 enabled the Sonar check for all commits, but it doesn't work for 
> forked repositories (unless somebody sets their own SONAR_TOKEN).
> The fix is the same as in HDDS-2627: we can restrict the execution to the apache 
> repository.
> Thanks to [~ljain], who reported this issue.
> Example failure:
> https://github.com/lokeshj1703/incubator-ratis/runs/716253801
> Note: PRs are not affected (no Sonar check runs there); only the 
> builds of forked repos are affected.
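
A minimal sketch of the restriction described above, written as a guard a CI step could call before starting the Sonar analysis. It is not the actual change made for this issue. GITHUB_REPOSITORY is the variable GitHub Actions sets to the "owner/name" of the repository running the build. The slug "apache/incubator-ratis" is an assumption taken from the GitHub URLs in this thread, and should_run_sonar is a hypothetical helper name.

```python
import os

# Assumption: upstream repository slug, inferred from the GitHub URLs in this thread.
UPSTREAM_REPO = "apache/incubator-ratis"


def should_run_sonar(env):
    """Return True only for builds of the upstream repo that have a Sonar token."""
    # GitHub Actions sets GITHUB_REPOSITORY to "owner/name" of the building repo,
    # so a fork such as "lokeshj1703/incubator-ratis" fails this comparison.
    is_upstream = env.get("GITHUB_REPOSITORY") == UPSTREAM_REPO
    # Without a token the analysis cannot upload results, so skip in that case too.
    has_token = bool(env.get("SONAR_TOKEN"))
    return is_upstream and has_token


if __name__ == "__main__":
    if should_run_sonar(os.environ):
        print("Running Sonar analysis")
    else:
        print("Skipping Sonar analysis (fork or missing SONAR_TOKEN)")
```

With this guard, a build of a fork skips the analysis instead of failing, while builds of the apache repository behave as before.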





[jira] [Moved] (RATIS-948) Update Sonar statistics only from the apache repo, not from the forks

2020-05-28 Thread Marton Elek (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marton Elek moved HDDS-3675 to RATIS-948:
-

 Key: RATIS-948  (was: HDDS-3675)
Target Version/s: 0.6.0  (was: 0.6.0)
Workflow: no-reopen-closed, patch-avail  (was: patch-available, 
re-open possible)
 Project: Ratis  (was: Hadoop Distributed Data Store)

> Update Sonar statistics only from the apache repo, not from the forks
> -
>
> Key: RATIS-948
> URL: https://issues.apache.org/jira/browse/RATIS-948
> Project: Ratis
>  Issue Type: Improvement
>Reporter: Marton Elek
>Assignee: Marton Elek
>Priority: Major
>
> RATIS-940 enabled the Sonar check for all commits, but it doesn't work for 
> forked repositories (unless somebody sets their own SONAR_TOKEN).
> The fix is the same as in HDDS-2627: we can restrict the execution to the apache 
> repository.





[jira] [Resolved] (RATIS-948) Update Sonar statistics only from the apache repo, not from the forks

2020-06-02 Thread Marton Elek (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marton Elek resolved RATIS-948.
---
Resolution: Fixed

> Update Sonar statistics only from the apache repo, not from the forks
> -
>
> Key: RATIS-948
> URL: https://issues.apache.org/jira/browse/RATIS-948
> Project: Ratis
>  Issue Type: Improvement
>Reporter: Marton Elek
>Assignee: Marton Elek
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> RATIS-940 enabled the Sonar check for all commits, but it doesn't work for 
> forked repositories (unless somebody sets their own SONAR_TOKEN).
> The fix is the same as in HDDS-2627: we can restrict the execution to the apache 
> repository.
> Thanks to [~ljain], who reported this issue.
> Example failure:
> https://github.com/lokeshj1703/incubator-ratis/runs/716253801
> Note: PRs are not affected (no Sonar check runs there); only the 
> builds of forked repos are affected.


