josh-fell commented on code in PR #28837:
URL: https://github.com/apache/airflow/pull/28837#discussion_r1066521820
##########
airflow/providers/amazon/aws/operators/sagemaker.py:
##########
@@ -1049,3 +1050,54 @@ def execute(self, context: Context) -> dict | None:
self.check_interval,
)
return best
+
+
+class SageMakerCreateExperimentOperator(SageMakerBaseOperator):
+ """
+ Creates a SageMaker experiment, to be then associated to jobs etc.
+
+ .. seealso::
+ For more information on how to use this operator, take a look at the
guide:
+ :ref:`howto/operator:SageMakerCreateExperimentOperator`
+
+ :param name: name of the experiment, must be unique within the AWS account
+ :param description: description of the experiment, optional
+ :param tags: tags to attach to the experiment, optional
+
+ :returns: the ARN of the experiment created, though experiments are
referred to by name
+ """
+
+ template_fields: Sequence[str] = (
+ "name",
+ "description",
+ "tags",
+ )
+
+ def __init__(
+ self,
+ *,
+ name: str,
+ description: str | None = None,
+ tags: dict | None = None,
+ aws_conn_id: str = DEFAULT_CONN_ID,
+ **kwargs,
+ ):
+ super().__init__(config={}, aws_conn_id=aws_conn_id, **kwargs)
+ self.name = name
+ self.description = description
+ if tags:
+ self.tags_set = [{"Key": kvp[0], "Value": kvp[1]} for kvp in
tags.items()]
+ else:
+ self.tags_set = []
Review Comment:
There is a possibility of unexpected behavior here with `tags` being a
template field. Template field values are rendered _right before_ `execute()`
is called but after an operator is initialized. So a user could specify
```python
SageMakerCreateExperimentOperator(..., tag={"my_key": "{{ var.value.my_tag
}}")
```
when calling the operator, but `tag_set` would be literally `[{"Key":
"my_key", "Value": "{{ var.value.my_tag }}"}]` instead of whatever `{{
var.value.my_tag }}` would eventually render to be because the templated parts
of `tag` haven't been rendered yet.
IMO the easiest thing to do is move this code block under `execute()` and
not worry about this edge case.
##########
airflow/providers/amazon/aws/operators/sagemaker.py:
##########
@@ -1049,3 +1050,54 @@ def execute(self, context: Context) -> dict | None:
self.check_interval,
)
return best
+
+
+class SageMakerCreateExperimentOperator(SageMakerBaseOperator):
+ """
+ Creates a SageMaker experiment, to be then associated to jobs etc.
+
+ .. seealso::
+ For more information on how to use this operator, take a look at the
guide:
+ :ref:`howto/operator:SageMakerCreateExperimentOperator`
+
+ :param name: name of the experiment, must be unique within the AWS account
+ :param description: description of the experiment, optional
+ :param tags: tags to attach to the experiment, optional
+
+ :returns: the ARN of the experiment created, though experiments are
referred to by name
+ """
+
+ template_fields: Sequence[str] = (
+ "name",
+ "description",
+ "tags",
+ )
+
+ def __init__(
+ self,
+ *,
+ name: str,
+ description: str | None = None,
+ tags: dict | None = None,
+ aws_conn_id: str = DEFAULT_CONN_ID,
Review Comment:
I think it would be useful if this param was included in the docstring as
well. In the Python API doc for the operator, users explicitly then know this
param is available should they not want to use the default connection ID.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]