GitHub user hy144328 edited a discussion: Roadmap to SQLAlchemy 2

I would like to propose a plan to migrate from SQLAlchemy 1.4 to 2.0.
>From what I see in discussions, issues and PRs, we have not quite hit the 
>homerun yet.
I already have some PRs in preparation, and I am interested in feedback.

## Goals

*   Make migration to SQLAlchemy 2 as boring as possible.
*   Break down migration into smaller steps to take the pressure off LGTM 
reviews.

## Motivation

Personal motivation:
My company uses Superset with Dremio.
The SQLAlchemy plug-in for Dremio requires SQLAlchemy 2. 
https://github.com/narendrans/sqlalchemy_dremio/blob/c576c65318f58e9342bbdb3dd0d6d691af9299ba/setup.py#L12
Superset is still stuck at SQLAlchemy 1. 
https://github.com/apache/superset/blob/1230b9091b0ebba0e63a7853970330b549747bd8/pyproject.toml#L102
Currently, we use a fork that forcefully downgrades the SQLAlchemy plug-in for 
Dremio.
However, this is not sustainable in the long term.

General motivation:
While SQLAlchemy has no official EOL dates, it is virtually there. 
https://github.com/sqlalchemy/sqlalchemy/discussions/13009
SQLAlchemy 2.0 has been out for multiple years now.
SQLAlchemy 2.1 is around the corner.
This will make SQLAlchemy obsolete.

## Observations

SQLAlchemy 1.4 to 2.0 is a breaking change.
This includes both the direct dependency on SQLAlchemy and the indirect 
dependencies on other packages, e.g. Flask and Flask-SQLAlchemy.
There is a [single-shot PR](https://github.com/apache/superset/pull/35117) by 
@dpgaspar that attempts to bump everything in a single PR.
I am not sure whether the PR is ready.
The last commit is from September 2025.
The PR touches 92 files and over a thousand lines of code, which makes it a 
daunting task to review.
Therefore, I would like to suggest smaller steps to some pressure off before 
the final push of bumping the dependencies.

Firstly, SQLAlchemy 1.4 (current version) is already designed to be a 
transition version. 
https://docs.sqlalchemy.org/en/20/changelog/migration_20.html#the-1-4-2-0-migration-path
This means that it enables in many cases both version 1 and 2 styles.
So we are able to update the code without actually bumping the SQLAlchemy 
version.

Secondly, SQLAlchemy is able to emit warnings.
This way, we are able to use the existing unit tests to check on the progress 
of the migration.

Thirdly, Python `warnings` and PyTest integrate well with each other.
So we are able to turn warning into errors, and mitigate regressions during 
collaboration.

## Battleplan

1.  Enable SQLAlchemy 1.4 deprecation warning in unit testing set-up.

    - [ ] Turn on SQLAlchemy deprecation warnings. #40274 

2.  After running all unit tests, I get the following warning cases:

    ```
    $ SQLALCHEMY_WARN_20=1 TZ=UTC python3 -m pytest tests/unit_tests/ |& grep 
"ovedIn20Warning: " foo.txt | sed 's/.*ovedIn20Warning: //' | sort | uniq
    Passing a string to Connection.execute() is deprecated and will be removed 
in version 2.0.  Use the text() construct, or the Connection.exec_driver_sql() 
method to invoke a driver-level SQL string. (Background on SQLAlchemy 2.0 at: 
https://sqlalche.me/e/b8d9)
    "Query" object is being merged into a Session along the backref cascade 
path for relationship "Database.queries"; in SQLAlchemy 2.0, this reverse 
cascade will not take place.  Set cascade_backrefs to False in either the 
relationship() or backref() function for the 2.0 behavior; or to set globally 
for the whole Session, set the future=True flag (Background on this error at: 
https://sqlalche.me/e/14/s9r1) (Background on SQLAlchemy 2.0 at: 
https://sqlalche.me/e/b8d9)
    "SavedQuery" object is being merged into a Session along the backref 
cascade path for relationship "Database.saved_queries"; in SQLAlchemy 2.0, this 
reverse cascade will not take place.  Set cascade_backrefs to False in either 
the relationship() or backref() function for the 2.0 behavior; or to set 
globally for the whole Session, set the future=True flag (Background on this 
error at: https://sqlalche.me/e/14/s9r1) (Background on SQLAlchemy 2.0 at: 
https://sqlalche.me/e/b8d9)
    "SqlaTable" object is being merged into a Session along the backref cascade 
path for relationship "Database.tables"; in SQLAlchemy 2.0, this reverse 
cascade will not take place.  Set cascade_backrefs to False in either the 
relationship() or backref() function for the 2.0 behavior; or to set globally 
for the whole Session, set the future=True flag (Background on this error at: 
https://sqlalche.me/e/14/s9r1) (Background on SQLAlchemy 2.0 at: 
https://sqlalche.me/e/b8d9)
    "SqlMetric" object is being merged into a Session along the backref cascade 
path for relationship "SqlaTable.metrics"; in SQLAlchemy 2.0, this reverse 
cascade will not take place.  Set cascade_backrefs to False in either the 
relationship() or backref() function for the 2.0 behavior; or to set globally 
for the whole Session, set the future=True flag (Background on this error at: 
https://sqlalche.me/e/14/s9r1) (Background on SQLAlchemy 2.0 at: 
https://sqlalche.me/e/b8d9)
    "TableColumn" object is being merged into a Session along the backref 
cascade path for relationship "SqlaTable.columns"; in SQLAlchemy 2.0, this 
reverse cascade will not take place.  Set cascade_backrefs to False in either 
the relationship() or backref() function for the 2.0 behavior; or to set 
globally for the whole Session, set the future=True flag (Background on this 
error at: https://sqlalche.me/e/14/s9r1) (Background on SQLAlchemy 2.0 at: 
https://sqlalche.me/e/b8d9)
    "TaggedObject" object is being merged into a Session along the backref 
cascade path for relationship "Tag.objects"; in SQLAlchemy 2.0, this reverse 
cascade will not take place.  Set cascade_backrefs to False in either the 
relationship() or backref() function for the 2.0 behavior; or to set globally 
for the whole Session, set the future=True flag (Background on this error at: 
https://sqlalche.me/e/14/s9r1) (Background on SQLAlchemy 2.0 at: 
https://sqlalche.me/e/b8d9)
    The ``as_declarative()`` function is now available as 
sqlalchemy.orm.as_declarative() (deprecated since: 1.4) (Background on 
SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)
    The autoload parameter is deprecated and will be removed in version 2.0.  
Please use the autoload_with parameter, passing an engine or connection. 
(Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)
    The connection.execute() method in SQLAlchemy 2.0 will accept parameters as 
a single dictionary or a single sequence of dictionaries only. Parameters 
passed as keyword arguments, tuples or positionally oriented dictionaries 
and/or tuples will no longer be accepted. (Background on SQLAlchemy 2.0 at: 
https://sqlalche.me/e/b8d9)
    The current statement is being autocommitted using implicit autocommit, 
which will be removed in SQLAlchemy 2.0. Use the .begin() method of Engine or 
Connection in order to use an explicit transaction for DML and DDL statements. 
(Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)
    The `database` package is deprecated and will be removed in v2.0 of 
sqlalchemy. Use the `dialects` package instead. (Background on SQLAlchemy 2.0 
at: https://sqlalche.me/e/b8d9)
    The ``declarative_base()`` function is now available as 
sqlalchemy.orm.declarative_base(). (deprecated since: 1.4) (Background on 
SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)
    The Engine.execute() method is considered legacy as of the 1.x series of 
SQLAlchemy and will be removed in 2.0. All statement execution in SQLAlchemy 
2.0 is performed by the Connection.execute() method of Connection, or in the 
ORM by the Session.execute() method of Session. (Background on SQLAlchemy 2.0 
at: https://sqlalche.me/e/b8d9)
    The legacy calling style of select() is deprecated and will be removed in 
SQLAlchemy 2.0.  Please use the new calling style described at select(). 
(Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)
    The "whens" argument to case(), when referring to a sequence of items, is 
now passed as a series of positional elements, rather than as a list.  
(Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)
    "User" object is being merged into a Session along the backref cascade path 
for relationship "Role.user"; in SQLAlchemy 2.0, this reverse cascade will not 
take place.  Set cascade_backrefs to False in either the relationship() or 
backref() function for the 2.0 behavior; or to set globally for the whole 
Session, set the future=True flag (Background on this error at: 
https://sqlalche.me/e/14/s9r1) (Background on SQLAlchemy 2.0 at: 
https://sqlalche.me/e/b8d9)
    ```

    This means that we are able to work on each warning case more or less 
independently:

    - [ ] Passing a string to Connection.execute() is deprecated
    - [ ] "Query" object is being merged into a Session
    - [ ] "SavedQuery" object is being merged into a Session
    - [ ] "SqlaTable" object is being merged into a Session
    - [ ] "SqlMetric" object is being merged into a Session
    - [ ] "TableColumn" object is being merged into a Session
    - [ ] "TaggedObject" object is being merged into a Session
    - [ ] The `as_declarative()` function is now available
    - [ ] The autoload parameter is deprecated
    - [ ] The connection.execute() method
    - [ ] The current statement is being autocommitted using implicit autocommit
    - [ ] The `database` package is deprecated
    - [ ] The `declarative_base()` function is now available
    - [ ] The Engine.execute() method is considered legacy
    - [ ] The legacy calling style of select() is deprecated
    - [ ] The "whens" argument to case
    - [ ] "User" object is being merged into a Session

3.  Actually bump SQLAlchemy and indirect dependencies.
    By then, #35117 will hopefully have to do some less heavy lifting. :-)

    - [ ] Bump SQLAlchemy and indirect dependencies. #35117 

4.  Clean up the unit test set-up again.

    - [ ] Streamline unneeded SQLAlchemy deprecation warnings.

## Conclusion

As mentioned, I already have some PRs in mind, that I will add to the 
battleplan.
I do not see any downsides to the refactoring approach of incrementally 
mitigating the deprecation warnings before the actual bump (except boredom).
Curious to hear your thoughts.

GitHub link: https://github.com/apache/superset/discussions/40273

----
This is an automatically sent email for [email protected].
To unsubscribe, please send an email to: 
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to