[ https://issues.apache.org/jira/browse/BEAM-4649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pablo Estrada resolved BEAM-4649.
---------------------------------
Resolution: Fixed
> Failures in beam_PostCommit_Py_ValCont due to exception in read_log_control_messages
> ------------------------------------------------------------------------------------
>
> Key: BEAM-4649
> URL: https://issues.apache.org/jira/browse/BEAM-4649
> Project: Beam
> Issue Type: Bug
> Components: sdk-py-harness
> Affects Versions: 2.6.0
> Reporter: Alan Myrvold
> Assignee: Alan Myrvold
> Priority: Major
> Fix For: 2.6.0
>
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> Example
> https://console.cloud.google.com/dataflow/jobsDetail/locations/us-central1/jobs/2018-06-26_11_04_04-14383422363721841333?project=apache-beam-testing
>
> All of the failures have the same exception in the docker logs:
> I 2018/06/26 18:05:12 Executing: python -m apache_beam.runners.worker.sdk_worker_main
> I Exception in thread read_log_control_messages:
> I Traceback (most recent call last):
> I   File "/usr/local/lib/python2.7/threading.py", line 801, in __bootstrap_inner
> I     self.run()
> I   File "/usr/local/lib/python2.7/threading.py", line 754, in run
> I     self.__target(*self.__args, **self.__kwargs)
> I   File "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/log_handler.py", line 61, in <lambda>
> I     target=lambda: self._read_log_control_messages(log_control_messages),
> I   File "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/log_handler.py", line 107, in _read_log_control_messages
> I     for _ in log_control_iterator:
> I   File "/usr/local/lib/python2.7/site-packages/grpc/_channel.py", line 344, in next
> I     return self._next()
> I   File "/usr/local/lib/python2.7/site-packages/grpc/_channel.py", line 335, in _next
> I     raise self
> I _Rendezvous: <_Rendezvous of RPC that terminated with (StatusCode.UNAVAILABLE, Connect Failed)>
> I
> I Traceback (most recent call last):
> I   File "/usr/local/lib/python2.7/runpy.py", line 174, in _run_module_as_main
> I     "__main__", fname, loader, pkg_name)
> I   File "/usr/local/lib/python2.7/runpy.py", line 72, in _run_code
> I     exec code in run_globals
> I   File "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker_main.py", line 195, in <module>
> I     main(sys.argv)
> I   File "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker_main.py", line 134, in main
> I     worker_count=_get_worker_count(sdk_pipeline_options)).run()
> I   File "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker.py", line 104, in run
> I     for work_request in control_stub.Control(get_responses()):
> I   File "/usr/local/lib/python2.7/site-packages/grpc/_channel.py", line 344, in next
> I     return self._next()
> I   File "/usr/local/lib/python2.7/site-packages/grpc/_channel.py", line 324, in _next
> I     raise self
> I grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with (StatusCode.UNAVAILABLE, Connect Failed)>
>
> This looks like a race condition: the harness log entry announcing the
> logging service startup is timestamped about six seconds *after* the Docker
> log entry with the connection exception.
> Harness log:
> 2018-06-26 10:40:59.328 PDT Launched Beam Fn Logging service url: "localhost:12370"
> Docker log:
> 2018-06-26 10:40:53.361 PDT Exception in thread read_log_control_messages:
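
The ordering of the two log entries above suggests that the SDK worker tried to connect before the logging service at localhost:12370 was listening. As a rough illustration only (this is not the fix applied for this issue), a client could poll the endpoint with a bounded backoff before opening its streams. The sketch below assumes Python 3 and uses only the standard library; `wait_for_endpoint` and its parameters are hypothetical names and are not part of apache_beam or grpc.

```python
import socket
import time


def wait_for_endpoint(host, port, timeout_s=30.0,
                      initial_delay_s=0.1, max_delay_s=2.0):
    """Poll host:port until a TCP connection succeeds or timeout_s elapses.

    Hypothetical helper for illustration. Returns True once the endpoint
    accepts a connection, False if the deadline passes first.
    """
    deadline = time.monotonic() + timeout_s
    delay = initial_delay_s
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                return False
            # Exponential backoff, capped, never sleeping past the deadline.
            time.sleep(min(delay, remaining))
            delay = min(delay * 2, max_delay_s)
```

For example, `wait_for_endpoint("localhost", 12370)` would return True once the port accepts connections and False if nothing is listening within 30 seconds. For a gRPC channel specifically, `grpc.channel_ready_future(channel).result(timeout=...)` provides a comparable readiness check.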