[ 
https://issues.apache.org/jira/browse/BEAM-8980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michal Walenia updated BEAM-8980:
---------------------------------
    Description: 
When running a GBK Load test using Java harness image and JobServer image 
generated from master, the load test fails with a cryptic exception:
{code:java}
Exception in thread "main" java.lang.RuntimeException: Invalid job state: 
FAILED.
11:45:31        at 
org.apache.beam.sdk.loadtests.JobFailure.handleFailure(JobFailure.java:55)
11:45:31        at org.apache.beam.sdk.loadtests.LoadTest.run(LoadTest.java:106)
11:45:31        at 
org.apache.beam.sdk.loadtests.CombineLoadTest.run(CombineLoadTest.java:66)
11:45:31        at 
org.apache.beam.sdk.loadtests.CombineLoadTest.main(CombineLoadTest.java:169)
{code}
 

After some investigation, I found a stacktrace of the error:
{code:java}
org.apache.beam.vendor.grpc.v1p21p0.io.grpc.StatusRuntimeException: CANCELLED: 
call already 
cancelledorg.apache.beam.vendor.grpc.v1p21p0.io.grpc.StatusRuntimeException: 
CANCELLED: call already cancelled at 
org.apache.beam.vendor.grpc.v1p21p0.io.grpc.Status.asRuntimeException(Status.java:524)
 at 
org.apache.beam.vendor.grpc.v1p21p0.io.grpc.stub.ServerCalls$ServerCallStreamObserverImpl.onNext(ServerCalls.java:339)
 at 
org.apache.beam.sdk.fn.stream.DirectStreamObserver.onNext(DirectStreamObserver.java:98)
 at 
org.apache.beam.sdk.fn.data.BeamFnDataSizeBasedBufferingOutboundObserver.flush(BeamFnDataSizeBasedBufferingOutboundObserver.java:90)
 at 
org.apache.beam.sdk.fn.data.BeamFnDataSizeBasedBufferingOutboundObserver.accept(BeamFnDataSizeBasedBufferingOutboundObserver.java:102)
 at 
org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.processElements(FlinkExecutableStageFunction.java:278)
 at 
org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.mapPartition(FlinkExecutableStageFunction.java:201)
 at 
org.apache.flink.runtime.operators.MapPartitionDriver.run(MapPartitionDriver.java:103)
 at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:504) at 
org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:369) at 
org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:705) at 
org.apache.flink.runtime.taskmanager.Task.run(Task.java:530) at 
java.lang.Thread.run(Thread.java:748) Suppressed: 
org.apache.beam.vendor.grpc.v1p21p0.io.grpc.StatusRuntimeException: CANCELLED: 
call already cancelled at 
org.apache.beam.vendor.grpc.v1p21p0.io.grpc.Status.asRuntimeException(Status.java:524)
 at 
org.apache.beam.vendor.grpc.v1p21p0.io.grpc.stub.ServerCalls$ServerCallStreamObserverImpl.onNext(ServerCalls.java:339)
 at 
org.apache.beam.sdk.fn.stream.DirectStreamObserver.onNext(DirectStreamObserver.java:98)
 at 
org.apache.beam.sdk.fn.data.BeamFnDataSizeBasedBufferingOutboundObserver.close(BeamFnDataSizeBasedBufferingOutboundObserver.java:84)
 at 
org.apache.beam.runners.fnexecution.control.SdkHarnessClient$ActiveBundle.close(SdkHarnessClient.java:298)
 at 
org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.$closeResource(FlinkExecutableStageFunction.java:202)
 at 
org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.mapPartition(FlinkExecutableStageFunction.java:202)
 ... 6 more Suppressed: java.lang.IllegalStateException: Processing bundle 
failed, TODO: [BEAM-3962] abort bundle. at 
org.apache.beam.runners.fnexecution.control.SdkHarnessClient$ActiveBundle.close(SdkHarnessClient.java:320)
 ... 8 more
{code}
It seems that the core issue is an IllegalStateException thrown from 
SdkHarnessClient.java:320, related to BEAM-3962.

 

  was:
When running a GBK Load test using Java harness image and JobServer image 
generated from master, the load test fails with a cryptic exception:
{{}}
{code:java}
Exception in thread "main" java.lang.RuntimeException: Invalid job state: 
FAILED.
11:45:31        at 
org.apache.beam.sdk.loadtests.JobFailure.handleFailure(JobFailure.java:55)
11:45:31        at org.apache.beam.sdk.loadtests.LoadTest.run(LoadTest.java:106)
11:45:31        at 
org.apache.beam.sdk.loadtests.CombineLoadTest.run(CombineLoadTest.java:66)
11:45:31        at 
org.apache.beam.sdk.loadtests.CombineLoadTest.main(CombineLoadTest.java:169)
{code}
 

After some investigation, I found a stacktrace of the error:
{code:java}
org.apache.beam.vendor.grpc.v1p21p0.io.grpc.StatusRuntimeException: CANCELLED: 
call already 
cancelledorg.apache.beam.vendor.grpc.v1p21p0.io.grpc.StatusRuntimeException: 
CANCELLED: call already cancelled at 
org.apache.beam.vendor.grpc.v1p21p0.io.grpc.Status.asRuntimeException(Status.java:524)
 at 
org.apache.beam.vendor.grpc.v1p21p0.io.grpc.stub.ServerCalls$ServerCallStreamObserverImpl.onNext(ServerCalls.java:339)
 at 
org.apache.beam.sdk.fn.stream.DirectStreamObserver.onNext(DirectStreamObserver.java:98)
 at 
org.apache.beam.sdk.fn.data.BeamFnDataSizeBasedBufferingOutboundObserver.flush(BeamFnDataSizeBasedBufferingOutboundObserver.java:90)
 at 
org.apache.beam.sdk.fn.data.BeamFnDataSizeBasedBufferingOutboundObserver.accept(BeamFnDataSizeBasedBufferingOutboundObserver.java:102)
 at 
org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.processElements(FlinkExecutableStageFunction.java:278)
 at 
org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.mapPartition(FlinkExecutableStageFunction.java:201)
 at 
org.apache.flink.runtime.operators.MapPartitionDriver.run(MapPartitionDriver.java:103)
 at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:504) at 
org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:369) at 
org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:705) at 
org.apache.flink.runtime.taskmanager.Task.run(Task.java:530) at 
java.lang.Thread.run(Thread.java:748) Suppressed: 
org.apache.beam.vendor.grpc.v1p21p0.io.grpc.StatusRuntimeException: CANCELLED: 
call already cancelled at 
org.apache.beam.vendor.grpc.v1p21p0.io.grpc.Status.asRuntimeException(Status.java:524)
 at 
org.apache.beam.vendor.grpc.v1p21p0.io.grpc.stub.ServerCalls$ServerCallStreamObserverImpl.onNext(ServerCalls.java:339)
 at 
org.apache.beam.sdk.fn.stream.DirectStreamObserver.onNext(DirectStreamObserver.java:98)
 at 
org.apache.beam.sdk.fn.data.BeamFnDataSizeBasedBufferingOutboundObserver.close(BeamFnDataSizeBasedBufferingOutboundObserver.java:84)
 at 
org.apache.beam.runners.fnexecution.control.SdkHarnessClient$ActiveBundle.close(SdkHarnessClient.java:298)
 at 
org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.$closeResource(FlinkExecutableStageFunction.java:202)
 at 
org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.mapPartition(FlinkExecutableStageFunction.java:202)
 ... 6 more Suppressed: java.lang.IllegalStateException: Processing bundle 
failed, TODO: [BEAM-3962] abort bundle. at 
org.apache.beam.runners.fnexecution.control.SdkHarnessClient$ActiveBundle.close(SdkHarnessClient.java:320)
 ... 8 more
{code}
It seems that the core issue is an IllegalStateException thrown from 
SdkHarnessClient.java:320, related to BEAM-3962.

 


> Running GroupByKeyLoadTest on Portable Flink fails
> --------------------------------------------------
>
>                 Key: BEAM-8980
>                 URL: https://issues.apache.org/jira/browse/BEAM-8980
>             Project: Beam
>          Issue Type: Bug
>          Components: testing
>            Reporter: Michal Walenia
>            Priority: Major
>
> When running a GBK Load test using Java harness image and JobServer image 
> generated from master, the load test fails with a cryptic exception:
> {code:java}
> Exception in thread "main" java.lang.RuntimeException: Invalid job state: 
> FAILED.
> 11:45:31      at 
> org.apache.beam.sdk.loadtests.JobFailure.handleFailure(JobFailure.java:55)
> 11:45:31      at org.apache.beam.sdk.loadtests.LoadTest.run(LoadTest.java:106)
> 11:45:31      at 
> org.apache.beam.sdk.loadtests.CombineLoadTest.run(CombineLoadTest.java:66)
> 11:45:31      at 
> org.apache.beam.sdk.loadtests.CombineLoadTest.main(CombineLoadTest.java:169)
> {code}
>  
> After some investigation, I found a stacktrace of the error:
> {code:java}
> org.apache.beam.vendor.grpc.v1p21p0.io.grpc.StatusRuntimeException: 
> CANCELLED: call already 
> cancelledorg.apache.beam.vendor.grpc.v1p21p0.io.grpc.StatusRuntimeException: 
> CANCELLED: call already cancelled at 
> org.apache.beam.vendor.grpc.v1p21p0.io.grpc.Status.asRuntimeException(Status.java:524)
>  at 
> org.apache.beam.vendor.grpc.v1p21p0.io.grpc.stub.ServerCalls$ServerCallStreamObserverImpl.onNext(ServerCalls.java:339)
>  at 
> org.apache.beam.sdk.fn.stream.DirectStreamObserver.onNext(DirectStreamObserver.java:98)
>  at 
> org.apache.beam.sdk.fn.data.BeamFnDataSizeBasedBufferingOutboundObserver.flush(BeamFnDataSizeBasedBufferingOutboundObserver.java:90)
>  at 
> org.apache.beam.sdk.fn.data.BeamFnDataSizeBasedBufferingOutboundObserver.accept(BeamFnDataSizeBasedBufferingOutboundObserver.java:102)
>  at 
> org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.processElements(FlinkExecutableStageFunction.java:278)
>  at 
> org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.mapPartition(FlinkExecutableStageFunction.java:201)
>  at 
> org.apache.flink.runtime.operators.MapPartitionDriver.run(MapPartitionDriver.java:103)
>  at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:504) at 
> org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:369) at 
> org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:705) at 
> org.apache.flink.runtime.taskmanager.Task.run(Task.java:530) at 
> java.lang.Thread.run(Thread.java:748) Suppressed: 
> org.apache.beam.vendor.grpc.v1p21p0.io.grpc.StatusRuntimeException: 
> CANCELLED: call already cancelled at 
> org.apache.beam.vendor.grpc.v1p21p0.io.grpc.Status.asRuntimeException(Status.java:524)
>  at 
> org.apache.beam.vendor.grpc.v1p21p0.io.grpc.stub.ServerCalls$ServerCallStreamObserverImpl.onNext(ServerCalls.java:339)
>  at 
> org.apache.beam.sdk.fn.stream.DirectStreamObserver.onNext(DirectStreamObserver.java:98)
>  at 
> org.apache.beam.sdk.fn.data.BeamFnDataSizeBasedBufferingOutboundObserver.close(BeamFnDataSizeBasedBufferingOutboundObserver.java:84)
>  at 
> org.apache.beam.runners.fnexecution.control.SdkHarnessClient$ActiveBundle.close(SdkHarnessClient.java:298)
>  at 
> org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.$closeResource(FlinkExecutableStageFunction.java:202)
>  at 
> org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.mapPartition(FlinkExecutableStageFunction.java:202)
>  ... 6 more Suppressed: java.lang.IllegalStateException: Processing bundle 
> failed, TODO: [BEAM-3962] abort bundle. at 
> org.apache.beam.runners.fnexecution.control.SdkHarnessClient$ActiveBundle.close(SdkHarnessClient.java:320)
>  ... 8 more
> {code}
> It seems that the core issue is an IllegalStateException thrown from 
> SdkHarnessClient.java:320, related to BEAM-3962.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to