On 7/28/15 7:41 AM, David Allouche wrote:
On 28 Jul 2015, at 11:31, Jacob Magnusson <[email protected]> wrote:
I'm trying to avoid "implicit" joins (i.e. more than one entry in the FROM
clause) in my code and I'm wondering if you guys have a good idea on how to best achieve
this. I want to raise an exception if a query changes to have more than one entry in the
FROM clause. An alternative would be to have it happen only before the query is executed.
I'm guessing you would have a good idea on where to start Michael?
Just chiming in to say I think this would indeed be useful.
In my software, I have a few places where queries are built all over the place: column
selection here, filters there, and I did at least once have a bug caused by such an
"implicit" join.
Such bugs tend to be very dangerous, because if the executed query has a
DISTINCT clause, they can be invisible, and if the test dataset is small enough
they can go undetected. When they land in production with real dataset they
cause combinatorial explosions.
The ORM often has bugs where extra FROM objects are added by accident,
but usually these are issues that are broken on my end to fix in the
ORM. The general concept of "implicit joins" is quite useful, hence
they are in the tutorial. A lazy-load on a "secondary" relationship
uses them as well.
However, the SQL expression system is of course fully open to
inspection, so while I can't advise on when you'd want this check and
when you wouldn't (seems like the hard part), I'd at least say a way to
intercept SQL is to use a before_execute() event. But what is
difficult is that it's not easy to tell if a SELECT really has an
implicit join in it or not. Here's a simple way:
from sqlalchemy.sql import Select
@event.listens_for(e, "before_execute")
def catch_joins(conn, clauseelement, multiparams, params):
if isinstance(clauseelement, Select) and \
len(clauseelement.froms) > 1:
raise Exception("select has too many froms: %s" %
clauseelement)
That should work for simple cases. It won't work as is if the SELECT
with the implicit froms is buried inside, you'd need to traverse for
that. That becomes difficult because a SELECT might have multiple
froms, but they're correlated. You'd need to check that somehow as well
(or not use correlated SELECTs). That would be:
from sqlalchemy.sql import visitors, Selectable
@event.listens_for(e, "before_execute")
def catch_joins(conn, clauseelement, multiparams, params):
if isinstance(clauseelement, Selectable):
def visit_select(select):
if len(select.froms) > 1:
raise Exception("select has too many froms: %s" % select)
visitors.traverse(clauseelement, {}, {"select": visit_select})
as far as how to limit when this takes place, I'd probably put clues
into conn.execution_options(). You can check these like:
@event.listens_for(e, "before_execute")
def catch_joins(conn, clauseelement, multiparams, params):
if conn._execution_options.get('no_implicit_joins', False):
if isinstance(clauseelement, Selectable):
def visit_select(select):
if len(select.froms) > 1:
raise Exception("select has too many froms: %s" %
select)
visitors.traverse(clauseelement, {}, {"select": visit_select})
and set them on a Query:
s.query(A, B).execution_options(no_implicit_joins=True).all()
Regards.
--
You received this message because you are subscribed to the Google Groups
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.