mpeteuil opened a new pull request, #23536: URL: https://github.com/apache/airflow/pull/23536
#### Motivation

In certain databases, ID fields such as `dag_id` or `task_id` need a collation different from the database default. In MySQL with utf8mb4, the index on these fields would otherwise exceed MySQL's index size limit. Earlier pull requests handled this ([#7570](https://github.com/apache/airflow/pull/7570), [#17729](https://github.com/apache/airflow/pull/17729)), but the `root_dag_id` field on the DAG model was missed. Because this field is joined with `dag_id` in various other models ([and self-referentially](https://github.com/apache/airflow/blob/451c7cbc42a83a180c4362693508ed33dd1d1dab/airflow/models/dag.py#L2766)), it needs the same collation as the other ID fields.

To see the difference, set `sql_engine_collation_for_ids` in the configuration and run `airflow db reset` before and after applying this change.

Without this change, affected database setups can hit errors like the following:

```py
sqlalchemy.exc.OperationalError: (MySQLdb._exceptions.OperationalError) (1267, "Illegal mix of collations (utf8_unicode_ci,IMPLICIT) and (utf8_general_ci,IMPLICIT) for operation '='")
```

Other related pull requests: [#19408](https://github.com/apache/airflow/pull/19408)

#### Backwards Compatibility

Since the downgrade is just to drop the column, there shouldn't be any problems.

#### Tests?

The previously referenced pull requests added few tests for these cases, so tests didn't seem necessary here. Let me know if that's not the case, and we can talk about what they might look like.
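#### Illustration

For reviewers who want to see the collation requirement from the Motivation section in isolation, here is a minimal sketch using plain SQLAlchemy rather than Airflow's own model code. The table name `dag_sketch`, the helper `string_id`, the length `ID_LEN = 250`, and the collation value `utf8mb3_bin` are all assumptions chosen for illustration; in a real deployment the collation would come from `sql_engine_collation_for_ids`. The sketch only shows that both ID columns carry the same explicit `COLLATE` clause in the generated MySQL DDL, which is what keeps comparisons between `root_dag_id` and `dag_id` from mixing collations.

```python
from sqlalchemy import Column, MetaData, String, Table
from sqlalchemy.dialects import mysql
from sqlalchemy.schema import CreateTable

# Assumed values for illustration only; not read from any Airflow config.
ID_LEN = 250
ID_COLLATION = "utf8mb3_bin"


def string_id():
    """Hypothetical helper: a string type with an explicit collation for ID columns."""
    return String(ID_LEN, collation=ID_COLLATION)


metadata = MetaData()

# A stand-in for the DAG table with only the two columns relevant here.
dag_sketch = Table(
    "dag_sketch",
    metadata,
    Column("dag_id", string_id(), primary_key=True),
    # Declared with the same collation as dag_id so that joins between the
    # two columns compare values under a single collation.
    Column("root_dag_id", string_id()),
)

# Render the CREATE TABLE statement for MySQL without connecting to a database.
ddl = str(CreateTable(dag_sketch).compile(dialect=mysql.dialect()))
print(ddl)
```

If `root_dag_id` were declared with a plain `String(ID_LEN)` instead, its column definition would have no `COLLATE` clause and would fall back to the table or database default, which is the situation that can lead to the "Illegal mix of collations" error quoted above.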
