[ 
https://issues.apache.org/jira/browse/HDDS-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Cheng updated HDDS-2186:
---------------------------
    Comment: was deleted

(was: https://github.com/apache/hadoop-ozone/pull/28)

> Fix tests using MiniOzoneCluster for its memory related exceptions
> ------------------------------------------------------------------
>
>                 Key: HDDS-2186
>                 URL: https://issues.apache.org/jira/browse/HDDS-2186
>             Project: Hadoop Distributed Data Store
>          Issue Type: Sub-task
>          Components: test
>    Affects Versions: 0.5.0
>            Reporter: Li Cheng
>            Assignee: Li Cheng
>            Priority: Major
>              Labels: flaky-test
>             Fix For: HDDS-1564
>
>
> After enabling multi-raft usage, MiniOzoneCluster appears to be unstable and reports a
> number of 'out of memory' exceptions in Ratis. Sample stack traces are attached.
>  
> 2019-09-26 15:12:22,824 [2e1e11ca-833a-4fbc-b948-3d93fc8e7288@group-218F3868CEA9-SegmentedRaftLogWorker] ERROR segmented.SegmentedRaftLogWorker (SegmentedRaftLogWorker.java:run(323)) - 2e1e11ca-833a-4fbc-b948-3d93fc8e7288@group-218F3868CEA9-SegmentedRaftLogWorker hit exception
> java.lang.OutOfMemoryError: Direct buffer memory
>     at java.nio.Bits.reserveMemory(Bits.java:694)
>     at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123)
>     at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:311)
>     at org.apache.ratis.server.raftlog.segmented.BufferedWriteChannel.<init>(BufferedWriteChannel.java:41)
>     at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogOutputStream.<init>(SegmentedRaftLogOutputStream.java:72)
>     at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker$StartLogSegment.execute(SegmentedRaftLogWorker.java:566)
>     at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker.run(SegmentedRaftLogWorker.java:289)
>     at java.lang.Thread.run(Thread.java:748)
>  
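> The pattern behind the OOM: each SegmentedRaftLogWorker opens a BufferedWriteChannel, which allocates a direct ByteBuffer, so with multi-raft every additional Raft group on a datanode pins another direct buffer for as long as the group is open. Below is a minimal, self-contained sketch (not Ratis code; the 8 MB buffer size and 32-group count are made-up illustration values) that reproduces the same "Direct buffer memory" error when run with a small cap such as -XX:MaxDirectMemorySize=64m:
>
> import java.nio.ByteBuffer;
> import java.util.ArrayList;
> import java.util.List;
>
> // Illustration only: each simulated "group" holds one direct buffer, mirroring the
> // ByteBuffer.allocateDirect(...) call in BufferedWriteChannel's constructor.
> public class DirectBufferPressure {
>   private static final int BUFFER_SIZE = 8 * 1024 * 1024; // hypothetical per-worker buffer
>   private static final int GROUPS = 32;                   // hypothetical multi-raft group count
>
>   public static void main(String[] args) {
>     List<ByteBuffer> buffers = new ArrayList<>();
>     for (int i = 0; i < GROUPS; i++) {
>       // One direct buffer per group; nothing is freed until the group closes,
>       // so total direct memory grows linearly with the number of groups.
>       buffers.add(ByteBuffer.allocateDirect(BUFFER_SIZE));
>       System.out.println("allocated direct buffer for group " + i);
>     }
>   }
> }
>
> If the direct-memory cap of the test JVM is the limiting factor, raising -XX:MaxDirectMemorySize for the test forks or lowering the number of pipelines/groups per MiniOzoneCluster datanode would be the knobs to look at.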
> which leads to:
> 2019-09-26 15:12:23,029 [RATISCREATEPIPELINE1] ERROR pipeline.RatisPipelineProvider (RatisPipelineProvider.java:lambda$null$2(181)) - Failed invoke Ratis rpc org.apache.hadoop.hdds.scm.pipeline.RatisPipelineProvider$$Lambda$297/1222454951@55d1e990 for c1f4d375-683b-42fe-983b-428a63aa8803
> org.apache.ratis.protocol.TimeoutIOException: deadline exceeded after 2999881264ns
>     at org.apache.ratis.grpc.GrpcUtil.tryUnwrapException(GrpcUtil.java:82)
>     at org.apache.ratis.grpc.GrpcUtil.unwrapException(GrpcUtil.java:75)
>     at org.apache.ratis.grpc.client.GrpcClientProtocolClient.blockingCall(GrpcClientProtocolClient.java:178)
>     at org.apache.ratis.grpc.client.GrpcClientProtocolClient.groupAdd(GrpcClientProtocolClient.java:147)
>     at org.apache.ratis.grpc.client.GrpcClientRpc.sendRequest(GrpcClientRpc.java:94)
>     at org.apache.ratis.client.impl.RaftClientImpl.sendRequest(RaftClientImpl.java:278)
>     at org.apache.ratis.client.impl.RaftClientImpl.groupAdd(RaftClientImpl.java:205)
>     at org.apache.hadoop.hdds.scm.pipeline.RatisPipelineProvider.lambda$initializePipeline$1(RatisPipelineProvider.java:142)
>     at org.apache.hadoop.hdds.scm.pipeline.RatisPipelineProvider.lambda$null$2(RatisPipelineProvider.java:177)
>     at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
>     at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
>     at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
>     at java.util.stream.ForEachOps$ForEachTask.compute(ForEachOps.java:291)
>     at java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:731)
>     at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
>     at java.util.concurrent.ForkJoinTask.doInvoke(ForkJoinTask.java:401)
>     at java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:734)
>     at java.util.stream.ForEachOps$ForEachOp.evaluateParallel(ForEachOps.java:160)
>     at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateParallel(ForEachOps.java:174)
>     at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233)
>     at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
>     at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:583)
>     at org.apache.hadoop.hdds.scm.pipeline.RatisPipelineProvider.lambda$callRatisRpc$3(RatisPipelineProvider.java:171)
>     at java.util.concurrent.ForkJoinTask$AdaptedRunnableAction.exec(ForkJoinTask.java:1386)
>     at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
>     at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
>     at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
>     at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
> Caused by: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: deadline exceeded after 2999881264ns
>     at org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:233)
>     at org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:214)
>     at org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:139)
>     at org.apache.ratis.proto.grpc.AdminProtocolServiceGrpc$AdminProtocolServiceBlockingStub.groupManagement(AdminProtocolServiceGrpc.java:274)
>     at org.apache.ratis.grpc.client.GrpcClientProtocolClient.lambda$groupAdd$3(GrpcClientProtocolClient.java:149)
>     at org.apache.ratis.grpc.client.GrpcClientProtocolClient.blockingCall(GrpcClientProtocolClient.java:176)
>     ... 25 more
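>
> As the description says, the second failure follows from the first: once the log worker on the datanode dies with the OOM, the SCM's groupAdd RPC never gets a reply, so the roughly 3 second gRPC deadline (2999881264ns in the log) expires and pipeline creation fails. A hedged sketch using plain java.util.concurrent (not the Ratis/gRPC client API) of how such a deadline surfaces on the caller side:
>
> import java.util.concurrent.CompletableFuture;
> import java.util.concurrent.TimeUnit;
> import java.util.concurrent.TimeoutException;
>
> // Illustration only: a reply that never arrives, bounded by a 3-second deadline,
> // mirrors how the stalled datanode turns into DEADLINE_EXCEEDED / TimeoutIOException
> // inside RatisPipelineProvider.
> public class DeadlinePropagation {
>   public static void main(String[] args) throws Exception {
>     CompletableFuture<String> groupAddReply = new CompletableFuture<>(); // never completed
>     try {
>       groupAddReply.get(3, TimeUnit.SECONDS); // analogous to the ~3s RPC deadline above
>     } catch (TimeoutException e) {
>       // The caller only sees the timeout and logs "Failed invoke Ratis rpc ...",
>       // not the underlying OutOfMemoryError on the remote datanode.
>       System.err.println("deadline exceeded, pipeline creation failed: " + e);
>     }
>   }
> }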



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

