vandonr-amz commented on code in PR #29522:
URL: https://github.com/apache/airflow/pull/29522#discussion_r1136097037
##########
airflow/providers/amazon/aws/hooks/batch_client.py:
##########
@@ -419,8 +419,46 @@ def get_job_awslogs_info(self, job_id: str) -> dict[str, str] | None:
         :param job_id: AWS Batch Job ID
         """
-        job_container_desc = self.get_job_description(job_id=job_id).get("container", {})
-        log_configuration = job_container_desc.get("logConfiguration", {})
+        job_desc = self.get_job_description(job_id=job_id)
Review Comment:
ok I see your point, but should there really be one log link per job?
I'm looking at it, and it seems that in the case of a multinode job there are multiple `log_configuration` entries (one per node), and from each log config we get
- the log group
- the region
I'd imagine that multinode Batch jobs would not be multi-region, so the region would be constant across all nodes.
Also, I suppose that in the overwhelming majority of cases the log group would be the same for all nodes (it would be very weird if it weren't).
Then we get the stream name from the attempts, which does not depend on the number of nodes. In most cases there would be a single attempt; if there are more, we choose to return the stream name of the last attempt, which makes sense.
The job runs on many nodes, but the logs all end up in the same log stream.
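For illustration, a minimal sketch of that choice, assuming the boto3 `describe_jobs` response shape where each attempt carries a `container.logStreamName` (the names here are illustrative, not the PR's actual code):

```python
# Hypothetical sketch: pick the log stream of the most recent attempt.
# Assumes the boto3 describe_jobs() shape, where each entry in "attempts"
# carries container.logStreamName; yields None if there are no attempts yet.
attempts = job_desc.get("attempts", [])
stream_name = attempts[-1].get("container", {}).get("logStreamName") if attempts else None
```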
What we can do is iterate over the log configs to make sure they are all sending logs
- to AWS
- in the same region
- in the same group

and log a warning if that's not the case (a sketch of such a check follows).
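A minimal sketch of that validation, assuming the boto3 `describe_jobs` shape for multinode jobs (per-node `logConfiguration` under `nodeProperties.nodeRangeProperties[*].container`); the helper name and warning texts are hypothetical, not what the PR ships:

```python
import logging

log = logging.getLogger(__name__)


def check_log_configs_consistent(job_desc: dict) -> None:
    """Warn when per-node log configurations disagree (hypothetical helper)."""
    node_ranges = job_desc.get("nodeProperties", {}).get("nodeRangeProperties", [])
    log_configs = [nr.get("container", {}).get("logConfiguration", {}) for nr in node_ranges]
    # Collect (driver, region, group) per node; awslogs-region and awslogs-group
    # are the standard options of the awslogs (CloudWatch) log driver.
    seen = {
        (
            cfg.get("logDriver"),
            cfg.get("options", {}).get("awslogs-region"),
            cfg.get("options", {}).get("awslogs-group"),
        )
        for cfg in log_configs
    }
    if any(driver != "awslogs" for driver, _, _ in seen):
        log.warning("Some nodes do not send their logs to CloudWatch (awslogs driver)")
    if len(seen) > 1:
        log.warning("Nodes use different log regions or groups; the returned log link may be incomplete")
```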