TrevorEdwards commented on a change in pull request #3744: [AIRFLOW-2893] fix stuck dataflow job due to name mismatch
URL: https://github.com/apache/incubator-airflow/pull/3744#discussion_r210130649
 ##########
 File path: airflow/contrib/hooks/gcp_dataflow_hook.py
 ##########
 @@ -124,36 +127,48 @@ def __init__(self, cmd):

      def _line(self, fd):
          if fd == self._proc.stderr.fileno():
 -            lines = self._proc.stderr.readlines()
 -            for line in lines:
 -                self.log.warning(line[:-1])
 -            if lines:
 -                return lines[-1]
 +            line = ''.join(self._proc.stderr.readlines())
 +            self.log.warning(line[:-1])
 +            return line
          if fd == self._proc.stdout.fileno():
 -            line = self._proc.stdout.readline()
 +            line = ''.join(self._proc.stdout.readlines())
 +            self.log.info(line[:-1])
              return line

      @staticmethod
      def _extract_job(line):
 -        if line is not None:
 -            if line.startswith("Submitted job: "):
 -                return line[15:-1]
 +        # Job id info: https://goo.gl/SE29y9.
 +        job_id_pattern = re.compile(
 +            b'.*console.cloud.google.com/dataflow.*/jobs/([a-z|0-9|A-Z|\-|\_]+).*')

 Review comment:
   The linked line includes location. From looking at it, I think location is in the hierarchy as well.
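
For reference, here is a minimal sketch of how a pattern of this shape behaves when a location segment sits between `dataflow` and `/jobs/`. It is not the code in the PR: `extract_job_id` is a hypothetical helper, the two sample log lines and their URL layouts are assumptions made up for illustration, and the character class is written without the `|` separators (inside `[...]` those match a literal `|`).

```python
import re

# Same shape as the pattern in the diff. Assumption: dots are escaped and the
# stray '|' characters are dropped from the character class.
JOB_ID_PATTERN = re.compile(
    rb'.*console\.cloud\.google\.com/dataflow.*/jobs/([a-zA-Z0-9_-]+).*')


def extract_job_id(line):
    """Return the job id found in a bytes log line, or None if there is none."""
    if line is None:
        return None
    match = JOB_ID_PATTERN.search(line)
    return match.group(1) if match else None


# Hypothetical log lines; the exact console URL layout is an assumption.
with_location = (
    b'INFO: To access the Dataflow monitoring console, please navigate to '
    b'https://console.cloud.google.com/dataflow/jobsDetail/locations/'
    b'us-central1/jobs/2018-08-15_01_02_03-1234567890?project=my-project\n')
without_location = (
    b'https://console.cloud.google.com/dataflow/jobs/'
    b'2018-08-15_01_02_03-1234567890\n')

for sample in (with_location, without_location):
    print(extract_job_id(sample))
```

Because the `.*` before `/jobs/` is greedy and unanchored, it absorbs any location segment, so both sample lines yield the same job id under these assumptions.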