[
https://issues.apache.org/jira/browse/BEAM-9651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ismaël Mejía updated BEAM-9651:
-------------------------------
Status: Open (was: Triage Needed)
> StreamingDataflowWorker stuck waiting for
> org.apache.beam.runners.dataflow.worker.windmill.DirectStreamObserver.onNext
> ----------------------------------------------------------------------------------------------------------------------
>
> Key: BEAM-9651
> URL: https://issues.apache.org/jira/browse/BEAM-9651
> Project: Beam
> Issue Type: Bug
> Components: runner-dataflow
> Reporter: Sam Whittle
> Assignee: Sam Whittle
> Priority: Major
>
> Operation ongoing in step <redacted> for at least 28h10m00s without
> outputting or completing in state windmill-read
>   at sun.misc.Unsafe.park(Native Method)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>   at java.util.concurrent.Phaser$QNode.block(Phaser.java:1140)
>   at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
>   at java.util.concurrent.Phaser.internalAwaitAdvance(Phaser.java:1067)
>   at java.util.concurrent.Phaser.awaitAdvanceInterruptibly(Phaser.java:758)
>   at org.apache.beam.runners.dataflow.worker.windmill.DirectStreamObserver.onNext(DirectStreamObserver.java:49)
>   at org.apache.beam.runners.dataflow.worker.windmill.GrpcWindmillServer$AbstractWindmillStream.send(GrpcWindmillServer.java:615)
>   at org.apache.beam.runners.dataflow.worker.windmill.GrpcWindmillServer$GrpcGetDataStream.onNewStream(GrpcWindmillServer.java:946)
>   at org.apache.beam.runners.dataflow.worker.windmill.GrpcWindmillServer$AbstractWindmillStream.startStream(GrpcWindmillServer.java:628)
>   at org.apache.beam.runners.dataflow.worker.windmill.GrpcWindmillServer$GrpcGetDataStream.<init>(GrpcWindmillServer.java:941)
>   at org.apache.beam.runners.dataflow.worker.windmill.GrpcWindmillServer.getDataStream(GrpcWindmillServer.java:506)
>   at org.apache.beam.runners.dataflow.worker.MetricTrackingWindmillServerStub$$Lambda$129/665137804.get(Unknown Source)
>   at org.apache.beam.runners.dataflow.worker.windmill.WindmillServerStub$StreamPool$StreamData.<init>(WindmillServerStub.java:159)
>   at org.apache.beam.runners.dataflow.worker.windmill.WindmillServerStub$StreamPool$StreamData.<init>(WindmillServerStub.java:158)
>   at org.apache.beam.runners.dataflow.worker.windmill.WindmillServerStub$StreamPool.getStream(WindmillServerStub.java:191)
>   at org.apache.beam.runners.dataflow.worker.MetricTrackingWindmillServerStub.getStateData(MetricTrackingWindmillServerStub.java:199)
>   at org.apache.beam.runners.dataflow.worker.WindmillStateReader.startBatchAndBlock(WindmillStateReader.java:433)
>   at org.apache.beam.runners.dataflow.worker.WindmillStateReader$WrappedFuture.get(WindmillStateReader.java:328)
>   at org.apache.beam.runners.dataflow.worker.WindmillStateInternals$WindmillValue.read(WindmillStateInternals.java:389)
>   at <redacted>
>
> Because the stream is started inside a synchronized block in StreamPool,
> every other thread that calls into StreamPool to get or release a stream
> also blocks. It is unclear whether the stream never became usable and so
> blocked forever, or whether a race in the use of the Phaser causes the hang.
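
The locking problem described above can be reproduced in isolation. The sketch below is not the Beam code: LockedPool, its factory, and the latches are hypothetical stand-ins, and a CountDownLatch takes the place of the stream that never becomes ready. It only shows that while one thread is stuck creating a stream inside a synchronized method, a second thread that merely wants to release a stream cannot enter the pool either. It does not model the possible Phaser race.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

class StreamPoolSketch {

    /** Hypothetical pool, loosely shaped like a stream pool; not the real StreamPool. */
    static final class LockedPool<S> {
        private final Deque<S> idle = new ArrayDeque<>();
        private final Supplier<S> factory;

        LockedPool(Supplier<S> factory) {
            this.factory = factory;
        }

        // The factory runs while the pool monitor is held, so a factory that
        // never returns keeps every other synchronized method locked out.
        synchronized S getStream() {
            S stream = idle.poll();
            return stream != null ? stream : factory.get();
        }

        synchronized void releaseStream(S stream) {
            idle.push(stream);
        }
    }

    /** Returns true if releaseStream was blocked while stream creation was stuck. */
    static boolean releaseBlockedDuringStuckCreate() {
        CountDownLatch creating = new CountDownLatch(1);
        CountDownLatch unblock = new CountDownLatch(1);
        CountDownLatch released = new CountDownLatch(1);

        // Stand-in for a stream start that waits on something that never arrives.
        LockedPool<String> pool = new LockedPool<>(() -> {
            creating.countDown();
            try {
                unblock.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "stream";
        });

        Thread getter = new Thread(pool::getStream);
        Thread releaser = new Thread(() -> {
            pool.releaseStream("other");
            released.countDown();
        });

        try {
            getter.start();
            creating.await(); // getter now holds the monitor and is stuck in the factory
            releaser.start();
            // The releaser cannot acquire the monitor, so this wait times out.
            boolean blocked = !released.await(200, TimeUnit.MILLISECONDS);
            unblock.countDown(); // let both threads finish so the demo terminates
            getter.join();
            releaser.join();
            return blocked;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("release blocked: " + releaseBlockedDuringStuckCreate());
    }
}
```

One possible way to limit the blast radius, stated here as an assumption and not as the fix adopted for this issue, is to decide under the lock whether a new stream is needed but to call the factory after leaving the synchronized block, so that a stuck creation stalls only the thread that requested it.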