hussein-awala commented on code in PR #31482:
URL: https://github.com/apache/airflow/pull/31482#discussion_r1203753961
##########
airflow/providers/amazon/aws/links/emr.py:
##########
@@ -54,9 +54,15 @@ def get_log_uri(
             "Requires either the output of a describe_cluster call or both an EMR Client and a job_flow_id."
         )
+    log_uri = None
+    parsed_log_uri = None
     if cluster:
-        log_uri = S3Hook.parse_s3_url(cluster["Cluster"]["LogUri"])
+        parsed_log_uri = S3Hook.parse_s3_url(cluster["Cluster"]["LogUri"])
     else:
         response = emr_client.describe_cluster(ClusterId=job_flow_id)
-        log_uri = S3Hook.parse_s3_url(response["Cluster"]["LogUri"])
-    return "/".join(log_uri)
+        if "LogUri" in response["Cluster"]:
+            parsed_log_uri = S3Hook.parse_s3_url(response["Cluster"]["LogUri"])
+
+    if parsed_log_uri:
+        log_uri = "/".join(parsed_log_uri)
+    return log_uri
Review Comment:
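When `cluster` is provided, it is the output of the same `describe_cluster`
call; callers pass it in only to avoid querying the API twice. It can therefore
be missing `LogUri` too, and the `if cluster:` branch would still raise the
same `KeyError`.
We can cover both cases and simplify the method with this code: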
```suggestion
    cluster_info = (cluster or emr_client.describe_cluster(ClusterId=job_flow_id))["Cluster"]
    if "LogUri" not in cluster_info:
        return None
    log_uri = S3Hook.parse_s3_url(cluster_info["LogUri"])
    return "/".join(log_uri)
```
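To illustrate the behaviour I have in mind, here is a minimal standalone sketch, not the provider code itself. `parse_s3_url` below is a simplified stand-in for `S3Hook.parse_s3_url` that only handles `s3://bucket/key` URLs, `FakeEmrClient` is a hypothetical stub, and the argument validation at the top of the real function is left out:

```python
from __future__ import annotations

from typing import Any
from urllib.parse import urlsplit


def parse_s3_url(s3url: str) -> tuple[str, str]:
    """Simplified stand-in for S3Hook.parse_s3_url (assumption: s3://bucket/key only)."""
    parts = urlsplit(s3url)
    if not parts.netloc:
        raise ValueError(f"No bucket name found in {s3url!r}")
    return parts.netloc, parts.path.lstrip("/")


class FakeEmrClient:
    """Hypothetical stub that mimics only describe_cluster."""

    def __init__(self, cluster: dict[str, Any]):
        self._cluster = cluster
        self.calls = 0  # lets us check how often the API would be queried

    def describe_cluster(self, ClusterId: str) -> dict[str, Any]:
        self.calls += 1
        return {"Cluster": dict(self._cluster, Id=ClusterId)}


def get_log_uri(*, cluster=None, emr_client=None, job_flow_id=None) -> str | None:
    # Use the cached describe_cluster output when given, otherwise query once.
    cluster_info = (cluster or emr_client.describe_cluster(ClusterId=job_flow_id))["Cluster"]
    # Both paths share this check, so a missing LogUri never raises KeyError.
    if "LogUri" not in cluster_info:
        return None
    return "/".join(parse_s3_url(cluster_info["LogUri"]))
```

With this shape, passing `cluster={"Cluster": {}}` and passing a client whose response lacks `LogUri` both return `None`, and the client is queried only when no `cluster` is supplied.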