kisssam opened a new issue, #39567:
URL: https://github.com/apache/airflow/issues/39567
### Apache Airflow Provider(s)
google
### Versions of Apache Airflow Providers
apache-airflow-providers-google==10.17.0
### Apache Airflow version
airflow-2.7.3
### Operating System
Running on Google Cloud Composer
### Deployment
Google Cloud Composer
### Deployment details
apache-airflow-providers-google==10.17.0
### What happened
When a task using BigQueryInsertJobOperator has exactly 64 characters in
its `task_id`, the task fails with the following error:
```
[2024-05-10TXX:XX:XX.XXX+0000] {standard_task_runner.py:104} ERROR - Failed
to execute job XXXXXXXX for task
task_id_with_exactly_64_characters_00000000000000000000000000000 (400 POST
https://bigquery.googleapis.com/bigquery/v2/projects/<PROJECT_ID>/jobs?prettyPrint=false:
Label value "task_id_with_exactly_64_characters_00000000000000000000000000000"
has invalid characters.
```
This happens with version 10.17.0 of the provider package
`apache-airflow-providers-google`.
### What you think should happen instead
If the task_id does not meet the requirements for BigQuery label values
(values can be empty, have a maximum length of 63 characters, and can contain
only lowercase letters, numeric characters, underscores, and dashes), then the
BigQuery job should still be created successfully, just without the default
labels, instead of failing as it currently does for a task_id with 64
characters.
### How to reproduce
* Create an Airflow environment with apache-airflow-providers-google==10.17.0.
* Create a task with the task_id as
"task_id_with_exactly_64_characters_00000000000000000000000000000" using the
BigQueryInsertJobOperator to create any BQ query job.
* Observe that the job fails with the error `Label value
"task_id_with_exactly_64_characters_00000000000000000000000000000" has invalid
characters.`
### Anything else
This is occurring as a result of the validation introduced in #37736.
#37736 automatically sets `airflow-dag` and `airflow-task` as job labels
on the BigQuery job it creates, as long as these identifiers match the regex
pattern `LABEL_REGEX = re.compile(r"^[a-z][\w-]{0,63}$")`. That pattern accepts
a value that starts with a lowercase letter, has a maximum length of 64
characters (one leading letter plus up to 63 more), and contains only
alphanumeric characters, underscores, or hyphens.
Otherwise, the BigQueryInsertJobOperator creates the job without adding
any default labels (for example, when the task_id is longer than 64
characters).
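The length boundary described above can be checked directly against the
pattern quoted in this issue. This is only a sketch: the helper name
`passes_label_regex` and the padded task ids are illustrative assumptions, not
part of the provider code.

```python
import re

# Pattern exactly as quoted in this issue.
LABEL_REGEX = re.compile(r"^[a-z][\w-]{0,63}$")


def passes_label_regex(task_id: str) -> bool:
    """Hypothetical helper: True if the id would be accepted by LABEL_REGEX."""
    return LABEL_REGEX.match(task_id) is not None


# Build ids of a known length by padding the prefix with zeros.
prefix = "task_id_with_exactly_64_characters_"
task_id_64 = prefix.ljust(64, "0")

for length in (63, 64, 65):
    candidate = prefix.ljust(length, "0")
    print(length, passes_label_regex(candidate))
```

The 64-character id is accepted because the pattern allows one leading letter
plus up to 63 further characters, which is one more than the 63-character
limit quoted from the BigQuery documentation.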
However, as per the [BigQuery documentation for
labels](https://cloud.google.com/bigquery/docs/labels-intro#requirements),
Values can be empty, and have a maximum length of 63 characters and can contain
only lowercase letters, numeric characters, underscores, and dashes.
Hence, the current validation regex `LABEL_REGEX` does not satisfy the
conditions for BigQuery label values.
For the edge case of a task_id with 64 characters, the task_id passes the
validation in `LABEL_REGEX`, but because BigQuery label values support at most
63 characters, the BigQuery job creation fails.
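One possible direction, sketched under assumptions, is a check that follows
the label value requirements quoted above (may be empty, at most 63
characters, only lowercase letters, digits, underscores, and dashes). The
names `STRICT_LABEL_VALUE_REGEX`, `is_valid_bq_label_value`, and
`default_labels` are hypothetical and are not the provider's actual code; the
all-or-nothing fallback is also an assumption about the desired behavior.

```python
import re

# Hypothetical stricter pattern based on the requirements quoted in this
# issue. Unlike `\w`, this character class rejects uppercase letters.
STRICT_LABEL_VALUE_REGEX = re.compile(r"[a-z0-9_-]{0,63}")


def is_valid_bq_label_value(value: str) -> bool:
    """Return True if the whole value fits the quoted label value rules."""
    return STRICT_LABEL_VALUE_REGEX.fullmatch(value) is not None


def default_labels(dag_id: str, task_id: str) -> dict:
    """Return the default labels, or no labels if either value is invalid."""
    candidates = {"airflow-dag": dag_id, "airflow-task": task_id}
    if all(is_valid_bq_label_value(v) for v in candidates.values()):
        return candidates
    # Assumed fallback: create the job without default labels.
    return {}


print(default_labels("my_dag", "my_task"))
print(default_labels("my_dag", "a" * 64))
```

With a check like this, a 64-character task_id would fall back to an empty
label set rather than being sent to BigQuery and rejected.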
### Are you willing to submit PR?
- [X] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)