Alan Myrvold created BEAM-4649:
----------------------------------
Summary: Failures in beam_PostCommit_Py_ValCont due to exception in read_log_control_messages
Key: BEAM-4649
URL: https://issues.apache.org/jira/browse/BEAM-4649
Project: Beam
Issue Type: Bug
Components: sdk-py-harness
Affects Versions: 2.6.0
Reporter: Alan Myrvold
Assignee: Alan Myrvold
Fix For: 2.6.0
Example job:
https://console.cloud.google.com/dataflow/jobsDetail/locations/us-central1/jobs/2018-06-26_11_04_04-14383422363721841333?project=apache-beam-testing
All of the failures have the same exception in the docker logs:
I 2018/06/26 18:05:12 Executing: python -m apache_beam.runners.worker.sdk_worker_main
I Exception in thread read_log_control_messages:
I Traceback (most recent call last):
I   File "/usr/local/lib/python2.7/threading.py", line 801, in __bootstrap_inner
I     self.run()
I   File "/usr/local/lib/python2.7/threading.py", line 754, in run
I     self.__target(*self.__args, **self.__kwargs)
I   File "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/log_handler.py", line 61, in <lambda>
I     target=lambda: self._read_log_control_messages(log_control_messages),
I   File "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/log_handler.py", line 107, in _read_log_control_messages
I     for _ in log_control_iterator:
I   File "/usr/local/lib/python2.7/site-packages/grpc/_channel.py", line 344, in next
I     return self._next()
I   File "/usr/local/lib/python2.7/site-packages/grpc/_channel.py", line 335, in _next
I     raise self
I _Rendezvous: <_Rendezvous of RPC that terminated with (StatusCode.UNAVAILABLE, Connect Failed)>
I
I Traceback (most recent call last):
I   File "/usr/local/lib/python2.7/runpy.py", line 174, in _run_module_as_main
I     "__main__", fname, loader, pkg_name)
I   File "/usr/local/lib/python2.7/runpy.py", line 72, in _run_code
I     exec code in run_globals
I   File "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker_main.py", line 195, in <module>
I     main(sys.argv)
I   File "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker_main.py", line 134, in main
I     worker_count=_get_worker_count(sdk_pipeline_options)).run()
I   File "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker.py", line 104, in run
I     for work_request in control_stub.Control(get_responses()):
I   File "/usr/local/lib/python2.7/site-packages/grpc/_channel.py", line 344, in next
I     return self._next()
I   File "/usr/local/lib/python2.7/site-packages/grpc/_channel.py", line 324, in _next
I     raise self
I grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with (StatusCode.UNAVAILABLE, Connect Failed)>
This looks like a race condition: the harness log reports the Beam Fn
Logging service being launched about six seconds *after* the docker log
records the connection exception.
Harness log:
2018-06-26 10:40:59.328 PDT Launched Beam Fn Logging service url: "localhost:12370"
Docker log:
2018-06-26 10:40:53.361 PDT Exception in thread read_log_control_messages:
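
If the ordering above really is the cause, one possible mitigation would be for the worker to wait until the logging endpoint accepts connections before opening its streams. The following is only a minimal sketch of that idea, not the fix applied in Beam: the function name wait_for_endpoint and its parameters are hypothetical, and it is written against the Python 3 standard library rather than the Python 2.7 runtime shown in the traceback.

```python
import socket
import time


def wait_for_endpoint(host, port, timeout_secs=30.0, poll_interval_secs=0.5):
    """Return True once host:port accepts a TCP connection, False on timeout.

    Hypothetical helper for illustration; not part of apache_beam or grpc.
    """
    deadline = time.monotonic() + timeout_secs
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=poll_interval_secs):
                return True
        except OSError:
            # Connection refused or timed out: the service is not up yet.
            if time.monotonic() >= deadline:
                return False
            time.sleep(poll_interval_secs)
```

A caller could check wait_for_endpoint("localhost", 12370) (the port from the harness log above) before starting the worker, and fail with a clear message if it returns False. A successful TCP connect only shows that the port is open, not that the gRPC service behind it is ready, so this narrows the window rather than closing it.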