francesco-camussoni-ueno opened a new issue, #43552:
URL: https://github.com/apache/airflow/issues/43552
### Apache Airflow Provider(s)
amazon
### Versions of Apache Airflow Providers
8.2.0
### Apache Airflow version
2.6.3
### Operating System
mw1.small
### Deployment
Official Apache Airflow Helm Chart
### Deployment details
_No response_
### What happened
I have this task for a tuning job in a DAG:
`tuning_dict = {"task_id": "tuning", "config":
{"HyperParameterTuningJobConfig": {"ParameterRanges":
{"CategoricalParameterRanges": [{"Name": "max_features", "Values": ["sqrt",
"log2"]}, {"Name": "criterion", "Values": ["gini", "entropy", "log_loss"]}],
"ContinuousParameterRanges": [{"Name": "ccp_alpha", "MinValue": "0.0",
"MaxValue": "0.02"}], "IntegerParameterRanges": [{"Name": "min_samples_leaf",
"MinValue": "2", "MaxValue": "15"}, {"Name": "n_estimators", "MinValue": "50",
"MaxValue": "500"}]}, "HyperParameterTuningJobObjective": {"Name":
"validation:accuracy", "Type": "Maximize"}, "Strategy": "Bayesian",
"RandomSeed": 123}, ...`
The key ContinuousParameterRanges contains some hyperparameters for my
tuning job that are cast as strings. This is required according to the
TuningOperator example:
https://github.com/apache/airflow/blob/providers-amazon/3.4.0/airflow/providers/amazon/aws/example_dags/example_sagemaker.py
(line 202).
But I'm seeing that they are converted to float in the case of
ContinuousParameterRanges, or to int in the case of IntegerParameterRanges,
apparently because of this code:
https://github.com/apache/airflow/blob/providers-amazon/8.20.0/airflow/providers/amazon/aws/operators/sagemaker.py
(line 99, or the functions parse_config_integers/parse_integers)
So when I execute the DAG I get errors like these:
```
Invalid type for parameter
HyperParameterTuningJobConfig.ParameterRanges.ContinuousParameterRanges[0].MinValue,
value: 0.0, type: <class 'float'>, valid types: <class 'str'>
Invalid type for parameter
HyperParameterTuningJobConfig.ParameterRanges.ContinuousParameterRanges[0].MaxValue,
value: 0.02, type: <class 'float'>, valid types: <class 'str'>
Invalid type for parameter
HyperParameterTuningJobConfig.ParameterRanges.IntegerParameterRanges[0].MinValue,
value: 2, type: <class 'int'>, valid types: <class 'str'>
Invalid type for parameter
HyperParameterTuningJobConfig.ParameterRanges.IntegerParameterRanges[0].MaxValue,
value: 15, type: <class 'int'>, valid types: <class 'str'>
Invalid type for parameter
HyperParameterTuningJobConfig.ParameterRanges.IntegerParameterRanges[1].MinValue,
value: 50, type: <class 'int'>, valid types: <class 'str'>
Invalid type for parameter
HyperParameterTuningJobConfig.ParameterRanges.IntegerParameterRanges[1].MaxValue,
value: 500, type: <class 'int'>, valid types: <class 'str'>
```
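The validation errors above can be checked locally, without calling SageMaker, by inspecting the config just before it is submitted. The following is a minimal sketch, not part of Airflow or boto3: `find_non_string_bounds` is a hypothetical helper name, and it assumes the config has the same `HyperParameterTuningJobConfig.ParameterRanges` layout as the task shown earlier.

```python
from __future__ import annotations

from typing import Any

# Range types whose MinValue/MaxValue must be strings according to the errors above.
RANGE_KEYS = ("ContinuousParameterRanges", "IntegerParameterRanges")


def find_non_string_bounds(config: dict[str, Any]) -> list[str]:
    """Return the paths of MinValue/MaxValue entries that are not str (hypothetical helper)."""
    ranges = config.get("HyperParameterTuningJobConfig", {}).get("ParameterRanges", {})
    problems = []
    for key in RANGE_KEYS:
        for index, entry in enumerate(ranges.get(key, [])):
            for bound in ("MinValue", "MaxValue"):
                if bound in entry and not isinstance(entry[bound], str):
                    problems.append(
                        f"HyperParameterTuningJobConfig.ParameterRanges.{key}[{index}].{bound}"
                        f" ({type(entry[bound]).__name__})"
                    )
    return problems
```

Running it on the config before and after template rendering or operator parsing should show at which step the strings become numbers.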
Any help?
### What you think should happen instead
I think that those parameters should not be converted to float or int; they should stay as strings.
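Until the conversion is fixed, one possible workaround is to cast the range bounds back to strings right before the config reaches boto3. This is only a sketch under assumptions: `stringify_parameter_ranges` is a hypothetical helper, not an Airflow function, and where to call it (for example in a custom operator subclass) depends on where the conversion actually happens in your deployment.

```python
import copy


def stringify_parameter_ranges(config):
    """Return a copy of config with MinValue/MaxValue cast to str (hypothetical helper)."""
    fixed = copy.deepcopy(config)  # leave the caller's dict untouched
    ranges = fixed.get("HyperParameterTuningJobConfig", {}).get("ParameterRanges", {})
    for key in ("ContinuousParameterRanges", "IntegerParameterRanges"):
        for entry in ranges.get(key, []):
            for bound in ("MinValue", "MaxValue"):
                if bound in entry:
                    # The errors above show that only str is accepted for these fields.
                    entry[bound] = str(entry[bound])
    return fixed
```

The helper copies the config rather than mutating it, so it is safe to apply to a dict that is reused elsewhere in the DAG.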
### How to reproduce
Generate a dag with this task:
`tuning_dict = {"task_id": "tuning", "config":
{"HyperParameterTuningJobConfig": {"ParameterRanges":
{"CategoricalParameterRanges": [{"Name": "max_features", "Values": ["sqrt",
"log2"]}, {"Name": "criterion", "Values": ["gini", "entropy", "log_loss"]}],
"ContinuousParameterRanges": [{"Name": "ccp_alpha", "MinValue": "0.0",
"MaxValue": "0.02"}], "IntegerParameterRanges": [{"Name": "min_samples_leaf",
"MinValue": "2", "MaxValue": "15"}, {"Name": "n_estimators", "MinValue": "50",
"MaxValue": "500"}]}, "HyperParameterTuningJobObjective": {"Name":
"validation:accuracy", "Type": "Maximize"}, "Strategy": "Bayesian",
"RandomSeed": 123}, "ResourceLimits": {"MaxNumberOfTrainingJobs": 10,
"MaxParallelTrainingJobs": 4, "MaxRuntimeInSeconds": 7200}, "Tags": [{"Key":
"USER", "Value": "[email protected]"}, {"Key": "TRIBU", "Value":
"Central Data"}, {"Key": "SQUAD", "Value": "Personalization and Relevance"},
{"Key": "ONLINE_OR_BATCH", "Value": "batch"}, {"Key": "PREDICTION_TYPE",
"Value": "clasificacion binaria"}, {"Key": "VERSION_DESCRIPTION", "Value":
"Version inicial"}, {"Key": "DESCRIPTION", "Value": "Desarrollo de deployment
de pipeline de entrenamiento"}], "HyperParameterTuningJobName":
"mlpipeline-training-tuning", "TrainingJobDefinition":
{"AlgorithmSpecification": {"TrainingImage": "<Training_image>",
"TrainingInputMode": "File", "MetricDefinitions": [{"Name":
"validation:accuracy", "Regex": "validation-accuracy=(.*?);"}, {"Name":
"validation:recall", "Regex": "validation-recall=(.*?);"}, {"Name":
"validation:precision", "Regex": "validation-precision=(.*?);"}]},
"InputDataConfig": [{"ChannelName": "ingestion", "DataSource": {"S3DataSource":
{"S3DataType": "S3Prefix", "S3Uri": "<BUCKET>", "S3DataDistributionType":
"FullyReplicated"}}}], "OutputDataConfig": {"S3OutputPath":
"s3://pr-ueno-prod-sagemaker/ml-projects/mlpipeline/training_pipeline/tuning/output"},
"ResourceConfig": {"InstanceType": "ml.m5.large", "InstanceCount": 1,
"VolumeSizeInGB": 10},
"StoppingCondition": {"MaxRuntimeInSeconds": 7200}, "RoleArn": "<ROLE>",
"StaticHyperParameters": {}}}}`
### Anything else
_No response_
### Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)