GitHub user jx2lee edited a discussion: What is best way to change isolation 
level in MySqlHook (`get_pandas_df`)

Hey, airflow community!
I have question of best way to use MySqlHook!

- We've EL pipeline using Airflow Custom Operator, often used `MySqlHook`.
- 
hook.[get_pandas_df](https://github.com/apache/airflow/blob/88d3c357c513f419737c7abe85d12bcabc810ea1/providers/src/airflow/providers/common/sql/hooks/sql.py#L294)
 is important in our pipeline! (selected by chunk size, concating more and 
more!)
- db(source) isolation level is repeatable read, but we need to change 
isolation level when extract source(to `READ COMITTED`!!).
- but due to [this 
line](https://github.com/apache/airflow/blob/88d3c357c513f419737c7abe85d12bcabc810ea1/providers/src/airflow/providers/common/sql/hooks/sql.py#L315),
 do not change isolation level. so we changed get_pandas_df to 
`pandas.read_sql` with connection adjusting changed isolation level

I want to keep using get_pandas_df, and changed isolation level when use it! is 
there best, safe way to change isolation level in MySqlHook?


---
e.g my case
```python
        from contextlib import closing

        check_query = 'select @@transaction_isolation'
        with closing(self.rdb_hook.get_conn()) as conn:
            with conn.cursor() as cursor:
                cursor.execute("set session aurora_read_replica_read_committed 
= ON;")
                cursor.execute("set session transaction isolation level read 
committed;")
                conn.commit()

                query = f"""
                    SELECT {','.join(self.columns)}
                    FROM {self.schema}.{self.table}"""

                query += f"""
                    WHERE {self.primary_key}
                """
```

GitHub link: https://github.com/apache/airflow/discussions/43929

----
This is an automatically sent email for [email protected].
To unsubscribe, please send an email to: [email protected]

Reply via email to