We're working with a codebase that uses SQLAlchemy 1.0.12 to talk to MySQL 5.5. Periodically, we have issues with empty IN clauses. Specifically, we have queries that look like `MyTable.my_column.in_(my_sequence)`. When `my_sequence` is empty, SQLAlchemy compiles this to `my_table.my_column <> my_table.my_column`. On MySQL 5.5, this clause will be optimized out to false, as long `my_column` was declared NOT NULL. However, if `my_column` can take on NULL values, it cannot be optimized out and it will cause a table scan.
SQLAlchemy's rationale for compiling in this way is described here: http://article.gmane.org/gmane.comp.python.sqlalchemy.user/6749 specifically, if empty-IN were always compiled to false and empty-NOT-IN were always compiled to true, `~MyTable.my_column.in_(empty_sequence)` would return rows where `my_column IS NULL`, which would be counterintuitive. However, we're willing to accept this tradeoff in order to achieve the goals of (1) protecting our servers from unintentional table scans (2) not having to explicitly guard against empty-IN across our codebase, as we would if we made SQLAlchemy's warning into an exception. My question is: what's the best way to achieve this? I've considered: 1. Using the TypeEngine.Comparator API to change the way `in_` is compiled (this was suggested on freenode). http://docs.sqlalchemy.org/en/rel_1_0/core/custom_types.html#redefining-and-creating-new-operators This is the cleanest approach in terms of working with the established APIs. The difficulty I'm seeing is that it seems necessary to subclass all of the standard SQLAlchemy types (e.g., Integer, String, Float), then ensure that the subclasses are used in all model definitions. This seems error-prone, i.e., if we miss a type, we'll get table scans anyway. 2. Swapping out the default implementation of the `in_` operator by modifying `sqlalchemy.sql.default_comparator.operator_lookup`. I had trouble getting this to work, which is probably just as well: it's a monkeypatch and it requires working with non-public APIs. 3. Adding an in-house patch that compiles empty IN to 0. Something like this: https://gist.github.com/slingamn/8fc40ed59287267a89ad102ece75881e This seems relatively clean, but we'd like to stay on the stock upstream package if possible. So I'm interested in advice about these approaches, or suggestions for new ones. Has anyone else dealt with this issue recently? Thanks very much for your time. -- You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To post to this group, send email to [email protected]. Visit this group at https://groups.google.com/group/sqlalchemy. For more options, visit https://groups.google.com/d/optout.
