Jaydeep Vaghasiya created BEAM-9096:
---------------------------------------
Summary: Processing stuck in step Write Valid Event To
GCS/WriteFiles/WriteShardedBundlesToTempFiles/GroupIntoShards/ReadStream
Key: BEAM-9096
URL: https://issues.apache.org/jira/browse/BEAM-9096
Project: Beam
Issue Type: Bug
Components: beam-community, io-java-gcp, runner-dataflow
Affects Versions: 2.16.0
Environment: Dataflow
Reporter: Jaydeep Vaghasiya
Assignee: Aizhamal Nurmamat kyzy
Using JAVA SDK of apache-beam (2.16) in dataflow I am trying to write streaming
data from Kafka to GCS using dataflow, I am facing an issue with writing data
to bucket in my pipeline. Looking at the error I can see that the processing
gets stuck for a while in step WriteShardedBundlesToTempFiles and eventually it
gets terminated, which results in data loss.
Also, this issue is not persistent. Restarting the pipeline resolves it some
time and it may also fail again.
below is a stack trace
```
Processing stuck in step Write Valid Event To
GCS/WriteFiles/WriteShardedBundlesToTempFiles/GroupIntoShards/ReadStream for at
least 05m01s without outputting or completing in state process at
org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:110)
at
org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:73) at
org.apache.avro.generic.GenericDatumWriter.writeField(GenericDatumWriter.java:153)
at
org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:143)
at
org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:105)
at
org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:73) at
org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:60) at
org.apache.beam.sdk.coders.AvroCoder.encode(AvroCoder.java:306) at
org.apache.beam.sdk.coders.Coder.getEncodedElementByteSize(Coder.java:297) at
org.apache.beam.sdk.coders.Coder.registerByteSizeObserver(Coder.java:291) at
org.apache.beam.sdk.coders.IterableLikeCoder.registerByteSizeObserver(IterableLikeCoder.java:203)
at
org.apache.beam.sdk.coders.IterableLikeCoder.registerByteSizeObserver(IterableLikeCoder.java:60)
at
org.apache.beam.sdk.coders.KvCoder.registerByteSizeObserver(KvCoder.java:128)
at org.apache.beam.sdk.coders.KvCoder.registerByteSizeObserver(KvCoder.java:36)
at
org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.registerByteSizeObserver(WindowedValue.java:613)
at
org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.registerByteSizeObserver(WindowedValue.java:529)
at
org.apache.beam.runners.dataflow.worker.IntrinsicMapTaskExecutorFactory$ElementByteSizeObservableCoder.registerByteSizeObserver(IntrinsicMapTaskExecutorFactory.java:400)
at
org.apache.beam.runners.dataflow.worker.util.common.worker.OutputObjectAndByteCounter.update(OutputObjectAndByteCounter.java:125)
at
org.apache.beam.runners.dataflow.worker.DataflowOutputCounter.update(DataflowOutputCounter.java:64)
at
org.apache.beam.runners.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:43)
at
org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowsParDoFn$1.output(GroupAlsoByWindowsParDoFn.java:182)
at
org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowFnRunner$1.outputWindowedValue(GroupAlsoByWindowFnRunner.java:102)
at
org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.ReduceFnRunner.lambda$onTrigger$1(ReduceFnRunner.java:1057)
at
org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.ReduceFnRunner$$Lambda$176/1157856648.output(Unknown
Source) at
org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.ReduceFnContextFactory$OnTriggerContextImpl.output(ReduceFnContextFactory.java:438)
at
org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SystemReduceFn.onTrigger(SystemReduceFn.java:125)
at
org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.ReduceFnRunner.onTrigger(ReduceFnRunner.java:1060)
at
org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.ReduceFnRunner.emit(ReduceFnRunner.java:930)
at
org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.ReduceFnRunner.onTimers(ReduceFnRunner.java:790)
at
org.apache.beam.runners.dataflow.worker.StreamingGroupAlsoByWindowViaWindowSetFn.processElement(StreamingGroupAlsoByWindowViaWindowSetFn.java:95)
at
org.apache.beam.runners.dataflow.worker.StreamingGroupAlsoByWindowViaWindowSetFn.processElement(StreamingGroupAlsoByWindowViaWindowSetFn.java:42)
at
org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowFnRunner.invokeProcessElement(GroupAlsoByWindowFnRunner.java:115)
at
org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowFnRunner.processElement(GroupAlsoByWindowFnRunner.java:73)
at
org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.LateDataDroppingDoFnRunner.processElement(LateDataDroppingDoFnRunner.java:80)
at
org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowsParDoFn.processElement(GroupAlsoByWindowsParDoFn.java:134)
at
org.apache.beam.runners.dataflow.worker.util.common.worker.ParDoOperation.process(ParDoOperation.java:44)
at
org.apache.beam.runners.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:49)
at
org.apache.beam.runners.dataflow.worker.util.common.worker.ReadOperation.runReadLoop(ReadOperation.java:201)
at
org.apache.beam.runners.dataflow.worker.util.common.worker.ReadOperation.start(ReadOperation.java:159)
at
org.apache.beam.runners.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:77)
at
org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:1316)
at
org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.access$1000(StreamingDataflowWorker.java:149)
at
org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker$6.run(StreamingDataflowWorker.java:1049)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)```
--
This message was sent by Atlassian Jira
(v8.3.4#803005)