[
https://issues.apache.org/jira/browse/RATIS-485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Josh Elser updated RATIS-485:
-----------------------------
Attachment: RATIS-485.003.patch
> TimeoutScheduler is leaked by gRPC client implementation
> --------------------------------------------------------
>
> Key: RATIS-485
> URL: https://issues.apache.org/jira/browse/RATIS-485
> Project: Ratis
> Issue Type: Bug
> Components: examples
> Reporter: Clay B.
> Assignee: Tsz Wo Nicholas Sze
> Priority: Major
> Fix For: 0.5.0
>
> Attachments: RATIS-485.003.patch, loadgen.log, r485_20190827.patch, r485_20190828.patch
>
>
> Running the load generator without a reachable Ratis cluster (e.g. with spurious
> node IPs) results in an OOM.
> With a single peer configured, the client retries seemingly indefinitely:
> {code:java}
> vagrant@ratis-server:~/incubator-ratis$ ./ratis-examples/src/main/bin/client.sh filestore loadgen --size 1048576 --numFiles 100 --peers n0:127.0.0.1:1
> {code}
> With two peers configured, the client fails with an OutOfMemoryError (unable to create new native thread):
> {code:java}
> vagrant@ratis-server:~/incubator-ratis$ ./ratis-examples/src/main/bin/client.sh filestore loadgen --size 1048576 --numFiles 100 --peers n0:127.0.0.1:1,n1:127.0.0.1:2
> [...]
> 1/787867107@5e5792a0 with java.util.concurrent.CompletionException: java.io.IOException: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io exception
> 2019-02-14 07:47:22 DEBUG RaftClient:417 - client-272A2E13A5DD: suggested new leader: null. Failed RaftClientRequest:client-272A2E13A5DD->n1@group-6F7570313233, cid=0, seq=0 RW, org.apache.ratis.examples.filestore.FileStoreClient$$Lambda$41/787867107@5e5792a0 with java.io.IOException: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io exception
> 2019-02-14 07:47:22 DEBUG RaftClient:437 - client-272A2E13A5DD: change Leader from n1 to n0
> 2019-02-14 07:47:22 DEBUG RaftClient:291 - schedule attempt #10740 with policy RetryForeverNoSleep for RaftClientRequest:client-272A2E13A5DD->n1@group-6F7570313233, cid=0, seq=0 RW, org.apache.ratis.examples.filestore.FileStoreClient$$Lambda$41/787867107@5e5792a0
> 2019-02-14 07:47:22 DEBUG RaftClient:323 - client-272A2E13A5DD: send* RaftClientRequest:client-272A2E13A5DD->n0@group-6F7570313233, cid=0, seq=0 RW, org.apache.ratis.examples.filestore.FileStoreClient$$Lambda$41/787867107@5e5792a0
> 2019-02-14 07:47:22 DEBUG RaftClient:338 - client-272A2E13A5DD: Failed RaftClientRequest:client-272A2E13A5DD->n0@group-6F7570313233, cid=0, seq=0 RW, org.apache.ratis.examples.filestore.FileStoreClient$$Lambda$41/787867107@5e5792a0 with java.util.concurrent.CompletionException: java.lang.OutOfMemoryError: unable to create new native thread
> Exception in thread "main" java.util.concurrent.CompletionException: java.lang.OutOfMemoryError: unable to create new native thread
> at org.apache.ratis.client.impl.RaftClientImpl.lambda$sendRequestAsync$14(RaftClientImpl.java:349)
> at java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:870)
> at java.util.concurrent.CompletableFuture.uniExceptionallyStage(CompletableFuture.java:884)
> at java.util.concurrent.CompletableFuture.exceptionally(CompletableFuture.java:2196)
> at org.apache.ratis.client.impl.RaftClientImpl.sendRequestAsync(RaftClientImpl.java:334)
> at org.apache.ratis.client.impl.RaftClientImpl.sendRequestWithRetryAsync(RaftClientImpl.java:286)
> at org.apache.ratis.util.SlidingWindow$Client.sendOrDelayRequest(SlidingWindow.java:243)
> at org.apache.ratis.util.SlidingWindow$Client.retry(SlidingWindow.java:259)
> at org.apache.ratis.client.impl.RaftClientImpl.lambda$null$10(RaftClientImpl.java:293)
> at org.apache.ratis.util.TimeoutScheduler.lambda$onTimeout$0(TimeoutScheduler.java:85)
> at org.apache.ratis.util.TimeoutScheduler.lambda$onTimeout$1(TimeoutScheduler.java:104)
> at org.apache.ratis.util.LogUtils.runAndLog(LogUtils.java:50)
> at org.apache.ratis.util.LogUtils$1.run(LogUtils.java:91)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.OutOfMemoryError: unable to create new native thread
> at java.lang.Thread.start0(Native Method)
> at java.lang.Thread.start(Thread.java:717)
> at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:957)
> at java.util.concurrent.ThreadPoolExecutor.ensurePrestart(ThreadPoolExecutor.java:1603)
> at java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:334)
> at java.util.concurrent.ScheduledThreadPoolExecutor.schedule(ScheduledThreadPoolExecutor.java:533)
> at org.apache.ratis.util.TimeoutScheduler.schedule(TimeoutScheduler.java:117)
> at org.apache.ratis.util.TimeoutScheduler.onTimeout(TimeoutScheduler.java:104)
> at org.apache.ratis.util.TimeoutScheduler.onTimeout(TimeoutScheduler.java:82)
> at org.apache.ratis.util.TimeoutScheduler.onTimeout(TimeoutScheduler.java:134)
> at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers.onNext(GrpcClientProtocolClient.java:234)
> at org.apache.ratis.grpc.client.GrpcClientRpc.sendRequestAsync(GrpcClientRpc.java:71)
> at org.apache.ratis.client.impl.RaftClientImpl.sendRequestAsync(RaftClientImpl.java:324)
> ... 15 more
> {code}
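
For readers unfamiliar with this failure mode: the "Caused by" frames in the trace show ScheduledThreadPoolExecutor.schedule() starting a new worker thread (ensurePrestart -> addWorker -> Thread.start) at the moment the JVM runs out of native threads. The sketch below is NOT Ratis code and is not taken from any of the attached patches. It only illustrates, with plain JDK classes, one way such a leak could arise: every scheduler instance that schedules a task starts its own worker thread, so creating a scheduler per timeout grows the thread count without bound, while a shared scheduler does not. The names SchedulerLeakSketch, CountingFactory, threadsWithSchedulerPerTimeout and threadsWithSharedScheduler are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; not the Ratis TimeoutScheduler.
class SchedulerLeakSketch {

  /** Counts every thread it creates so the sketch can report scheduler thread usage. */
  static final class CountingFactory implements ThreadFactory {
    final AtomicInteger created = new AtomicInteger();

    @Override
    public Thread newThread(Runnable r) {
      Thread t = new Thread(r, "sketch-timeout-" + created.incrementAndGet());
      t.setDaemon(true);
      return t;
    }
  }

  /** Leaky pattern: a fresh single-threaded scheduler for every timeout. */
  static int threadsWithSchedulerPerTimeout(int timeouts) {
    CountingFactory factory = new CountingFactory();
    List<ScheduledExecutorService> schedulers = new ArrayList<>();
    for (int i = 0; i < timeouts; i++) {
      ScheduledExecutorService s = new ScheduledThreadPoolExecutor(1, factory);
      // schedule() prestarts a core worker thread, even though the task is far in the future.
      s.schedule(() -> { }, 1, TimeUnit.HOURS);
      schedulers.add(s);
    }
    int created = factory.created.get();
    // Cleanup for the sketch only; in a real leak nobody shuts these down.
    schedulers.forEach(ScheduledExecutorService::shutdownNow);
    return created;
  }

  /** Bounded pattern: one shared scheduler handles every timeout. */
  static int threadsWithSharedScheduler(int timeouts) {
    CountingFactory factory = new CountingFactory();
    ScheduledExecutorService shared = new ScheduledThreadPoolExecutor(1, factory);
    for (int i = 0; i < timeouts; i++) {
      shared.schedule(() -> { }, 1, TimeUnit.HOURS);
    }
    int created = factory.created.get();
    shared.shutdownNow();
    return created;
  }

  public static void main(String[] args) {
    System.out.println("scheduler per timeout: " + threadsWithSchedulerPerTimeout(100) + " threads");
    System.out.println("shared scheduler:      " + threadsWithSharedScheduler(100) + " thread(s)");
  }
}
```

In this sketch the per-timeout variant creates one thread per scheduled timeout, whereas the shared variant creates a single thread regardless of how many timeouts are scheduled. With a retry policy that never sleeps, as in the log above, the first pattern would exhaust native threads quickly.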