dimberman commented on a change in pull request #11325:
URL: https://github.com/apache/airflow/pull/11325#discussion_r501182226
##########
File path: airflow/kubernetes/pod_launcher.py
##########
@@ -124,9 +125,21 @@ def monitor_pod(self, pod: V1Pod, get_logs: bool) -> Tuple[State, Optional[str]]
:return: Tuple[State, Optional[str]]
"""
if get_logs:
- logs = self.read_pod_logs(pod)
- for line in logs:
- self.log.info(line)
+ read_logs_since_sec = None
+ while True:
+ logs = self.read_pod_logs(pod, since_seconds=read_logs_since_sec)
+ for line in logs:
+ self.log.info(line)
+ curr_time = dt.now()
+ time.sleep(1)
+
+ if not self.base_container_is_running(pod):
+ break
+
+ self.log.warning('Pod %s log read interrupted', pod.metadata.name)
+ delta = dt.now() - curr_time
+ # Prefer logs duplication rather than loss
+ read_logs_since_sec = math.ceil(delta.total_seconds())
Review comment:
Is there any chance of log lines being missed in the window between the
last log read and the time computed here? Could we instead get the timestamp
from the read_pod_logs request itself?
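
To make the question concrete, here is a rough sketch of one way this could work. It is not a tested patch. It assumes the log request is made with `timestamps=True` (which the Kubernetes Python client's `read_namespaced_pod_log` accepts), so that each line starts with an RFC 3339 UTC timestamp followed by a space. The helper names `parse_log_line` and `since_seconds_from` are hypothetical and do not exist in the PR.

```python
import math
from datetime import datetime, timezone
from typing import Optional, Tuple


def parse_log_line(line: str) -> Tuple[datetime, str]:
    """Split a 'TIMESTAMP MESSAGE' line into an aware datetime and the message."""
    stamp, _, message = line.partition(" ")
    # Assumed format: RFC 3339 in UTC, e.g. 2020-10-07T17:00:00.123456789Z.
    base, _, fraction = stamp.rstrip("Z").partition(".")
    # datetime only holds microseconds, so truncate nanoseconds to 6 digits.
    micros = int((fraction + "000000")[:6])
    parsed = datetime.strptime(base, "%Y-%m-%dT%H:%M:%S")
    return parsed.replace(microsecond=micros, tzinfo=timezone.utc), message


def since_seconds_from(last_log_time: Optional[datetime], now: datetime) -> Optional[int]:
    """Return a since_seconds value covering everything after the last seen line."""
    if last_log_time is None:
        return None  # nothing read yet: request the full log
    # Round up so a boundary line is re-read (duplicated) rather than lost.
    return max(1, math.ceil((now - last_log_time).total_seconds()))
```

With something like this, the loop could remember the timestamp of the last line it actually logged and derive `since_seconds` from that, instead of from a wall-clock reading taken after the read returns. Lines near the boundary may still be duplicated, which matches the "prefer duplication over loss" intent in the diff.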