Alan Myrvold created BEAM-4649:
----------------------------------

             Summary: Failures in beam_PostCommit_Py_ValCont due to exception in read_log_control_messages
                 Key: BEAM-4649
                 URL: https://issues.apache.org/jira/browse/BEAM-4649
             Project: Beam
          Issue Type: Bug
          Components: sdk-py-harness
    Affects Versions: 2.6.0
            Reporter: Alan Myrvold
            Assignee: Alan Myrvold
             Fix For: 2.6.0


Example job:

https://console.cloud.google.com/dataflow/jobsDetail/locations/us-central1/jobs/2018-06-26_11_04_04-14383422363721841333?project=apache-beam-testing

 

All of the failures have the same exception in the docker logs:
I  2018/06/26 18:05:12 Executing: python -m apache_beam.runners.worker.sdk_worker_main
I  Exception in thread read_log_control_messages:
I  Traceback (most recent call last):
I    File "/usr/local/lib/python2.7/threading.py", line 801, in __bootstrap_inner
I      self.run()
I    File "/usr/local/lib/python2.7/threading.py", line 754, in run
I      self.__target(*self.__args, **self.__kwargs)
I    File "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/log_handler.py", line 61, in <lambda>
I      target=lambda: self._read_log_control_messages(log_control_messages),
I    File "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/log_handler.py", line 107, in _read_log_control_messages
I      for _ in log_control_iterator:
I    File "/usr/local/lib/python2.7/site-packages/grpc/_channel.py", line 344, in next
I      return self._next()
I    File "/usr/local/lib/python2.7/site-packages/grpc/_channel.py", line 335, in _next
I      raise self
I  _Rendezvous: <_Rendezvous of RPC that terminated with (StatusCode.UNAVAILABLE, Connect Failed)>
I  
I  Traceback (most recent call last):
I    File "/usr/local/lib/python2.7/runpy.py", line 174, in _run_module_as_main
I      "__main__", fname, loader, pkg_name)
I    File "/usr/local/lib/python2.7/runpy.py", line 72, in _run_code
I      exec code in run_globals
I    File "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker_main.py", line 195, in <module>
I      main(sys.argv)
I    File "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker_main.py", line 134, in main
I      worker_count=_get_worker_count(sdk_pipeline_options)).run()
I    File "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker.py", line 104, in run
I      for work_request in control_stub.Control(get_responses()):
I    File "/usr/local/lib/python2.7/site-packages/grpc/_channel.py", line 344, in next
I      return self._next()
I    File "/usr/local/lib/python2.7/site-packages/grpc/_channel.py", line 324, in _next
I      raise self
I  grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with (StatusCode.UNAVAILABLE, Connect Failed)>

Looks like a race condition?
The harness log's startup message appears about 6 seconds *after* the docker 
log's connection exception.

Harness log:
2018-06-26 10:40:59.328 PDT Launched Beam Fn Logging service url: "localhost:12370"
Docker log:
2018-06-26 10:40:53.361 PDT Exception in thread read_log_control_messages:
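
If the ordering above does reflect a startup race, one generic way for a client 
to tolerate it is to wait until the endpoint accepts connections before opening 
the stream. The following is a minimal Python 3 sketch using only the standard 
library. `wait_for_port` and its parameters are hypothetical names chosen for 
illustration; they are not part of apache_beam or grpc, and this is not the fix 
adopted for this issue.

```python
import socket
import time


def wait_for_port(host, port, timeout_s=30.0, interval_s=0.5):
    """Return True once host:port accepts a TCP connection, False on timeout.

    Hypothetical helper for illustration only; not an apache_beam or grpc API.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # A successful plain TCP connect means something is listening.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            # Nothing listening yet (or unreachable): retry until the deadline.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval_s)
```

A worker could call `wait_for_port("localhost", 12370)` before connecting to the 
logging service. With gRPC specifically, 
`grpc.channel_ready_future(channel).result(timeout=...)` serves a similar 
purpose at the channel level.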



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
