[ 
https://issues.apache.org/jira/browse/BEAM-7619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16869911#comment-16869911
 ] 

Daniel Oliveira edited comment on BEAM-7619 at 6/21/19 10:01 PM:
-----------------------------------------------------------------

Looks like there's an error occurring in the SDK Harness:


{noformat}
12:08:13 java.util.concurrent.ExecutionException: java.lang.RuntimeException: Error received from SDK harness for instruction -129: Traceback (most recent call last):
12:08:13   File "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker.py", line 157, in _execute
12:08:13     response = task()
12:08:13   File "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker.py", line 190, in <lambda>
12:08:13     self._execute(lambda: worker.do_instruction(work), work)
12:08:13   File "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker.py", line 342, in do_instruction
12:08:13     request.instruction_id)
12:08:13   File "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker.py", line 368, in process_bundle
12:08:13     bundle_processor.process_bundle(instruction_id))
12:08:13   File "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/bundle_processor.py", line 593, in process_bundle
12:08:13     data.ptransform_id].process_encoded(data.data)
12:08:13 KeyError: u'\n\x04-107\x12\x04-105'
{noformat}

The KeyError is raised at [bundle_processor.py line 593|https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/worker/bundle_processor.py#L593], 
where the lookup uses a ptransform_id that isn't present in the dict. I don't 
know why yet. I'll look into changes made around the time the test started 
failing, or try to find people familiar with this code.
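
One observation that may help whoever picks this up: the missing key, u'\n\x04-107\x12\x04-105', does not look like a plain id string. Its bytes happen to match the protobuf wire format for two length-delimited fields. The sketch below decodes it by hand. It is written for Python 3, the helper name is made up for illustration, and it assumes single-byte tags and lengths (all values under 128). It only shows what the key contains; it says nothing about why the lookup fails.

```python
def decode_length_delimited_fields(raw):
    """Decode bytes consisting only of length-delimited protobuf fields.

    Hypothetical helper for illustration, not part of Beam. Assumes every
    tag and every length fits in a single byte (value < 128).
    """
    fields = []
    pos = 0
    while pos < len(raw):
        tag = raw[pos]
        field_number = tag >> 3   # upper bits of the tag: field number
        wire_type = tag & 0x07    # lower 3 bits of the tag: wire type
        if wire_type != 2:        # wire type 2 means length-delimited
            raise ValueError("unexpected wire type %d" % wire_type)
        length = raw[pos + 1]
        start = pos + 2
        payload = raw[start:start + length]
        fields.append((field_number, payload.decode("utf-8")))
        pos = start + length
    return fields


# The key reported by the KeyError in the traceback.
key = b"\n\x04-107\x12\x04-105"
print(decode_length_delimited_fields(key))
```

Under those assumptions the key decodes to field 1 = "-107" and field 2 = "-105", so it may be a serialized message being used where a bare id is expected. That is a guess from the bytes alone and would need to be checked against the code.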




> [beam_PostCommit_Py_ValCont] [test_metrics_fnapi_it] 
> java.lang.IllegalStateException: Already closed.
> -----------------------------------------------------------------------------------------------------
>
>                 Key: BEAM-7619
>                 URL: https://issues.apache.org/jira/browse/BEAM-7619
>             Project: Beam
>          Issue Type: Bug
>          Components: test-failures
>            Reporter: Mikhail Gryzykhin
>            Assignee: Daniel Oliveira
>            Priority: Major
>              Labels: currently-failing
>
> _Use this form to file an issue for test failure:_
>  * [Jenkins Job|https://builds.apache.org/job/beam_PostCommit_Py_ValCont/3625/consoleFull]
>  
> Initial investigation:
> 12:08:13 ERROR: test_wordcount_fnapi_it 
> (apache_beam.examples.wordcount_it_test.WordCountIT)
> 12:08:13 
> ----------------------------------------------------------------------
> 12:08:13 Traceback (most recent call last):
> 12:08:13 File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_ValCont/src/sdks/python/apache_beam/examples/wordcount_it_test.py",
>  line 52, in test_wordcount_fnapi_it
> 12:08:13 self._run_wordcount_it(wordcount.run, experiment='beam_fn_api')
> 12:08:13 File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_ValCont/src/sdks/python/apache_beam/examples/wordcount_it_test.py",
>  line 84, in _run_wordcount_it
> 12:08:13 run_wordcount(test_pipeline.get_full_options_as_args(**extra_opts))
> 12:08:13 File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_ValCont/src/sdks/python/apache_beam/examples/wordcount.py",
>  line 114, in run
> 12:08:13 result = p.run()
> 12:08:13 File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_ValCont/src/sdks/python/apache_beam/pipeline.py",
>  line 406, in run
> 12:08:13 self._options).run(False)
> 12:08:13 File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_ValCont/src/sdks/python/apache_beam/pipeline.py",
>  line 419, in run
> 12:08:13 return self.runner.run_pipeline(self, self._options)
> 12:08:13 File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_ValCont/src/sdks/python/apache_beam/runners/dataflow/test_dataflow_runner.py",
>  line 64, in run_pipeline
> 12:08:13 self.result.wait_until_finish(duration=wait_duration)
> 12:08:13 File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_ValCont/src/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py",
>  line 1338, in wait_until_finish
> 12:08:13 (self.state, getattr(self._runner, 'last_error_msg', None)), self)
> 12:08:13 DataflowRuntimeException: Dataflow pipeline failed. State: FAILED, 
> Error:
> 12:08:13 java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
> Error received from SDK harness for instruction -129: Traceback (most recent 
> call last):
> 12:08:13 File 
> "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
>  line 157, in _execute
> 12:08:13 response = task()
> 12:08:13 File 
> "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
>  line 190, in <lambda>
> 12:08:13 self._execute(lambda: worker.do_instruction(work), work)
> 12:08:13 File 
> "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
>  line 342, in do_instruction
> 12:08:13 request.instruction_id)
> 12:08:13 File 
> "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
>  line 368, in process_bundle
> 12:08:13 bundle_processor.process_bundle(instruction_id))
> 12:08:13 File 
> "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/bundle_processor.py",
>  line 593, in process_bundle
> 12:08:13 data.ptransform_id].process_encoded(data.data)
> 12:08:13 KeyError: u'\n\x04-107\x12\x04-105'
> 12:08:13 
> 12:08:13 at 
> java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
> 12:08:13 at 
> java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
> 12:08:13 at org.apache.beam.sdk.util.MoreFutures.get(MoreFutures.java:57)
> 12:08:13 at 
> org.apache.beam.runners.dataflow.worker.fn.control.RegisterAndProcessBundleOperation.finish(RegisterAndProcessBundleOperation.java:285)
> 12:08:13 at 
> org.apache.beam.runners.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:85)
> 12:08:13 at 
> org.apache.beam.runners.dataflow.worker.fn.control.BeamFnMapTaskExecutor.execute(BeamFnMapTaskExecutor.java:125)
> 12:08:13 at 
> org.apache.beam.runners.dataflow.worker.BatchDataflowWorker.executeWork(BatchDataflowWorker.java:412)
> 12:08:13 at 
> org.apache.beam.runners.dataflow.worker.BatchDataflowWorker.doWork(BatchDataflowWorker.java:381)
> 12:08:13 at 
> org.apache.beam.runners.dataflow.worker.BatchDataflowWorker.getAndPerformWork(BatchDataflowWorker.java:306)
> 12:08:13 at 
> org.apache.beam.runners.dataflow.worker.DataflowRunnerHarness.start(DataflowRunnerHarness.java:195)
> 12:08:13 at 
> org.apache.beam.runners.dataflow.worker.DataflowRunnerHarness.main(DataflowRunnerHarness.java:123)
> 12:08:13 Suppressed: java.lang.IllegalStateException: Already closed.
> 12:08:13 at 
> org.apache.beam.sdk.fn.data.BeamFnDataBufferingOutboundObserver.close(BeamFnDataBufferingOutboundObserver.java:95)
> 12:08:13 at 
> org.apache.beam.runners.dataflow.worker.fn.data.RemoteGrpcPortWriteOperation.abort(RemoteGrpcPortWriteOperation.java:215)
> 12:08:13 at 
> org.apache.beam.runners.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:91)
> 12:08:13 ... 6 more
> 12:08:13 Caused by: java.lang.RuntimeException: Error received from SDK 
> harness for instruction -129: Traceback (most recent call last):
> 12:08:13 File 
> "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
>  line 157, in _execute
> 12:08:13 response = task()
> 12:08:13 File 
> "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
>  line 190, in <lambda>
> 12:08:13 self._execute(lambda: worker.do_instruction(work), work)
> 12:08:13 File 
> "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
>  line 342, in do_instruction
> 12:08:13 request.instruction_id)
> 12:08:13 File 
> "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
>  line 368, in process_bundle
> 12:08:13 bundle_processor.process_bundle(instruction_id))
> 12:08:13 File 
> "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/bundle_processor.py",
>  line 593, in process_bundle
> 12:08:13 data.ptransform_id].process_encoded(data.data)
> 12:08:13 KeyError: u'\n\x04-107\x12\x04-105'
> 12:08:13 
> 12:08:13 at 
> org.apache.beam.runners.fnexecution.control.FnApiControlClient$ResponseStreamObserver.onNext(FnApiControlClient.java:157)
> 12:08:13 at 
> org.apache.beam.runners.fnexecution.control.FnApiControlClient$ResponseStreamObserver.onNext(FnApiControlClient.java:140)
> 12:08:13 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.stub.ServerCalls$StreamingServerCallHandler$StreamingServerCallListener.onMessage(ServerCalls.java:248)
> 12:08:13 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.ForwardingServerCallListener.onMessage(ForwardingServerCallListener.java:33)
> 12:08:13 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.Contexts$ContextualizedServerCallListener.onMessage(Contexts.java:76)
> 12:08:13 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailable(ServerCallImpl.java:263)
> 12:08:13 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1MessagesAvailable.runInContext(ServerImpl.java:683)
> 12:08:13 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
> 12:08:13 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
> 12:08:13 at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 12:08:13 at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 12:08:13 at java.lang.Thread.run(Thread.java:745)
> ----
> _After you've filled out the above details, please [assign the issue to an 
> individual|https://beam.apache.org/contribute/postcommits-guides/index.html#find_specialist].
>  Assignee should [treat test failures as 
> high-priority|https://beam.apache.org/contribute/postcommits-policies/#assigned-failing-test],
>  helping to fix the issue or find a more appropriate owner. See [Apache Beam 
> Post-Commit 
> Policies|https://beam.apache.org/contribute/postcommits-policies]._



