TrevorEdwards commented on a change in pull request #3744: [AIRFLOW-2893] fix 
stuck dataflow job due to name mismatch
URL: https://github.com/apache/incubator-airflow/pull/3744#discussion_r210130649
 
 

 ##########
 File path: airflow/contrib/hooks/gcp_dataflow_hook.py
 ##########
 @@ -124,36 +127,48 @@ def __init__(self, cmd):
 
     def _line(self, fd):
         if fd == self._proc.stderr.fileno():
-            lines = self._proc.stderr.readlines()
-            for line in lines:
-                self.log.warning(line[:-1])
-            if lines:
-                return lines[-1]
+            line = ''.join(self._proc.stderr.readlines())
+            self.log.warning(line[:-1])
+            return line
         if fd == self._proc.stdout.fileno():
-            line = self._proc.stdout.readline()
+            line = ''.join(self._proc.stdout.readlines())
+            self.log.info(line[:-1])
             return line
 
     @staticmethod
     def _extract_job(line):
-        if line is not None:
-            if line.startswith("Submitted job: "):
-                return line[15:-1]
+        # Job id info: https://goo.gl/SE29y9.
+        job_id_pattern = re.compile(
+            b'.*console.cloud.google.com/dataflow.*/jobs/([a-z|0-9|A-Z|\-|\_]+).*')
 
 Review comment:
   The linked line includes the location. From looking at it, I think the
location is part of the hierarchy as well.
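
   To illustrate what I mean, here is a rough sketch of a pattern that captures the location together with the job id. It is not the code in this PR. The `/locations/<location>/jobs/<job_id>` URL layout, the sample log line, and the names `JOB_URL_PATTERN` and `extract_job` are all assumptions made for the example.

```python
import re

# Hypothetical pattern, not the one from the PR. It assumes the console URL
# may carry an optional "/locations/<location>" segment before "/jobs/<id>".
JOB_URL_PATTERN = re.compile(
    rb'console\.cloud\.google\.com/dataflow.*?'
    rb'(?:/locations/(?P<location>[a-z0-9-]+))?'
    rb'/jobs/(?P<job_id>[A-Za-z0-9_-]+)')


def extract_job(line):
    """Return (location, job_id) parsed from a bytes log line, or None."""
    match = JOB_URL_PATTERN.search(line)
    if match is None:
        return None
    location = match.group('location')
    # location is None when the URL has no "/locations/..." segment.
    return (location.decode() if location else None,
            match.group('job_id').decode())


# Made-up sample line, shaped like the assumed URL layout.
sample = (b'navigate to https://console.cloud.google.com/dataflow/jobsDetail'
          b'/locations/us-central1/jobs/2018-08-14_10_00_00-1234567890'
          b'?project=my-project\n')
print(extract_job(sample))
```

   Returning the location alongside the id would let the caller query the job in the right region instead of assuming a default one.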

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
