mpeteuil opened a new pull request, #23536:
URL: https://github.com/apache/airflow/pull/23536

   #### Motivation
   In certain databases, the collation for ID fields like `dag_id` or 
`task_id` needs to be set to something other than the database default. This is 
because in MySQL with utf8mb4 the index size exceeds the MySQL limits. Past 
pull requests handled this 
([#7570](https://github.com/apache/airflow/pull/7570), 
[#17729](https://github.com/apache/airflow/pull/17729)), but the `root_dag_id` 
field on the DAG model was missed. Since this field is joined against 
`dag_id` in various other models ([and 
self-referentially](https://github.com/apache/airflow/blob/451c7cbc42a83a180c4362693508ed33dd1d1dab/airflow/models/dag.py#L2766)),
 it also needs to have the same collation as the other ID fields.
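
To make the index-size concern concrete, here is a rough back-of-the-envelope sketch. It assumes the ID columns are 250 characters wide, that utf8mb4 needs up to 4 bytes per character and utf8mb3 up to 3, and that InnoDB limits an index key prefix to 767 bytes on the older COMPACT/REDUNDANT row formats and a whole index key to 3072 bytes. The helper names are hypothetical and are not part of Airflow.

```python
# Back-of-the-envelope check of MySQL index key sizes for ID columns.
# Every constant here is an assumption stated in the text above, not a
# value read from a live database or from the Airflow source.

ID_LEN = 250  # assumed width (in characters) of dag_id / task_id / root_dag_id

MAX_BYTES_PER_CHAR = {"utf8mb4": 4, "utf8mb3": 3}

PREFIX_LIMIT_OLD_ROW_FORMATS = 767  # assumed per-column limit (COMPACT/REDUNDANT)
TOTAL_KEY_LIMIT = 3072              # assumed limit for a whole index key


def index_key_bytes(charset: str, columns: int = 1, length: int = ID_LEN) -> int:
    """Worst-case bytes needed to index `columns` string columns."""
    return MAX_BYTES_PER_CHAR[charset] * length * columns


def fits_old_row_format(charset: str) -> bool:
    """True if a single ID column fits under the older per-column limit."""
    return index_key_bytes(charset) <= PREFIX_LIMIT_OLD_ROW_FORMATS


def fits_total_key(charset: str, columns: int) -> bool:
    """True if a composite index over `columns` ID columns fits in one key."""
    return index_key_bytes(charset, columns) <= TOTAL_KEY_LIMIT
```

Under these assumptions a single utf8mb4 ID column needs 1000 bytes and does not fit under the 767-byte limit, while the same column in utf8mb3 needs 750 bytes and does. That is why a narrower collation is configured for ID fields, and why every field joined against them has to use it too.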
   
   This can be seen by running `airflow db reset` before and after applying 
this change while also specifying `sql_engine_collation_for_ids` in the 
configuration.
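
As an illustration of how such a setting can flow into column definitions, the sketch below reads `sql_engine_collation_for_ids` from an INI-style configuration and turns it into keyword arguments for a string column type. It is not Airflow's actual code: the function name `id_collation_kwargs`, the `utf8mb3_bin` value, and the assumption that the option lives in a `[database]` section (with `[core]` as a fallback) are all made up for the example.

```python
import configparser

# Hypothetical configuration; the section name and the collation value
# are assumptions made for this example.
EXAMPLE_CFG = """
[database]
sql_engine_collation_for_ids = utf8mb3_bin
"""


def id_collation_kwargs(cfg_text: str) -> dict:
    """Return {'collation': ...} if an ID collation is configured, else {}."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    # Check the assumed current section first, then the assumed older one.
    for section in ("database", "core"):
        collation = parser.get(
            section, "sql_engine_collation_for_ids", fallback=None
        )
        if collation:
            return {"collation": collation}
    return {}


# The resulting dict could be splatted into a column type that accepts a
# `collation` keyword, for example SQLAlchemy's String(250, **kwargs).
kwargs = id_collation_kwargs(EXAMPLE_CFG)
```

The point of funnelling every ID column through one helper like this is that a field such as `root_dag_id` cannot silently fall back to the database default collation.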
   
   Without this change, affected database setups could see errors like the 
following:
   ```py
   sqlalchemy.exc.OperationalError: (MySQLdb._exceptions.OperationalError) (1267, "Illegal mix of collations (utf8_unicode_ci,IMPLICIT) and (utf8_general_ci,IMPLICIT) for operation '='")
   ```
   
   
   Other related pull requests: 
[#19408](https://github.com/apache/airflow/pull/19408)
   
   #### Backwards Compatibility
   
   Since the downgrade is just to drop the column, there shouldn't be any 
problems.
   
   #### Tests?
   Since the previously referenced pull requests did not add many tests 
covering these cases, tests didn't seem needed in this specific case. Let me 
know if you disagree, and we can talk about what those tests might look like.

