Re: [sqlalchemy] Is there any way to get around the "redundancy" of coupled .join and contains_eager chains?

seaders Mon, 01 Oct 2018 10:47:45 -0700

*Actually*, is there a much, much simpler option of something like "no 
aliasing" for a query?


I.e.

  EventMarket.query
  .options(
    joinedload(EventMarket.event, with_aliases=False)
  )
  .filter(Event.date == '2018-10-01')

Or

  EventMarket.query
  .options(
    without_aliases,
    joinedload(EventMarket.event)
  )
  .filter(Event.date == '2018-10-01')

Which would produce

  SELECT ...
  FROM event_market
  LEFT OUTER JOIN event AS event ON event.id <http://event_1.id> = 
event_market_1.event_id
  WHERE event.us_date <http://event.id> = '2018-10-01'

Instead of

  SELECT ...
  FROM event_market, event
  LEFT OUTER JOIN event AS event_1 ON event_1.id = event_market_1.event_id
  WHERE event.us_date <http://event.id> = '2018-10-01'

Which is my main bone of contention when you don't do the multiple 
references, 

  EventMarket.query
  .options(
    joinedload(EventMarket.event)
  )
  .filter(Event.date == '2018-10-01')

And that joins the Event table as event_1, but also includes it in the FROM 
list, as event,


I consider that a pretty standard type of query which produces a *very* 
incorrect result (all the worse when you've 1000s of event_market rows, and 
the thing you're working on was quick, but is now really, really slow - 
because you only meant to get them for 1 day).  Now, I consider the result 
of that query so different from what you think would happen (you JOIN 
event, and filter those results in WHERE), that I would *at least* include 
a warning that this is probably not what you wanted to happen.  That's the 
thing with my feeling about the redundancy factor.  It'd be one thing if 
the Python code above *didn't* work, and produced an error in SQLAlchemy or 
with the dB engine, but my problem is the fact that it works, and produces 
what it does.

But anyway, you're the maintainer and driver of the API, so I'll say no 
more about that.

With `with_aliases=False` or `without_aliases`, that'd handle all, and even 
with pathing, it *could* handle that, too.  I've been caught previously 
with a very similar situation like this specifically because I didn't 
realise the join of a pathed relationship with a secondary join and table 
was put in brackets in the query.  It was basically the same thing, I 
wanted to be using the relationship as the join in the query, but be able 
to filter based on the tables in that relationship.

Basically,

class Source(Base):
    id = Column(Integer, primary_key=True)
    name = Column(String(50), unique=True)

class Sport(Base):
    id = Column(Integer, primary_key=True)
    name = Column(String(50), unique=True)

class Competition(Base):
    id = Column(Integer, primary_key=True)
    name = Column(String(50), unique=True)

    sport_id = Column(Integer, ForeignKey(Sport.id), primary_key=True)
    sport = relationship(Sport)

class SourceCompetition(Base):
    comp_id = Column(Integer, ForeignKey(Competition.id), primary_key=True)
    comp = relationship(Competition)

    source_id = Column(Integer, ForeignKey(Source.id), primary_key=True)
    source = relationship(Source)    

    source_comp_id = Column(String(50), unique=True)

class SourceSport(Base):
    sport_id = Column(Integer, ForeignKey(Sport.id), primary_key=True)
    sport = relationship(Sport)

    source_id = Column(Integer, ForeignKey(Source.id), primary_key=True)
    source = relationship(Source)

    source_sport_id = Column(String(50), unique=True)

    source_competitions = relationship(
        SourceCompetition,
        secondary=lambda: table_for(Competition),
        primaryjoin=lambda: and_(
            SourceSport.sport_id == Competition.sport_id,
            SourceSport.source_id == SourceCompetition.source_id,
        ),
        secondaryjoin=lambda: and_(
            Competition.id == SourceCompetition.comp_id,
        ),
        uselist=True,
    )


For the above, I kinda consider it a near identical issue, I've 
"front-loaded" the logic of how SourceSport and SourceCompetition are 
connected in the relationship, so then I could run a simple query like,

  SourceSport.query
  .options(
    joinedload(SourceSport.source_competitions)
   .joinedload(SourceCompetition.competition)
  )
  .filter(Source.id == 1,
           Sport.name == 'American Football')

I get problems *just with* source_competitions, and how it self-references 
tables,

SELECT ...
FROM sport, source_sport
  LEFT OUTER JOIN (competition AS competition_1 INNER JOIN source_competition 
AS source_competition_1
                   ON competition_1.id = source_competition_1.comp_id)
  ON source_sport.sport_id = competition_1.sport_id AND source_sport.source_id 
= source_competition.source_id

WHERE team_sports_sport.name = %s


Leads to,

_mysql_exceptions.OperationalError: (1054, "Unknown column 
'source_competition.source_id' in 'on clause'")


Never mind using those same tables for extra filtering outside!

Like the request above, if there were an option for without_aliasing, 

SELECT ...
FROM sport, source_sport
  LEFT OUTER JOIN (competition INNER JOIN source_competition
                   ON competition.id = source_competition.comp_id)
  ON source_sport.sport_id = competition.sport_id AND source_sport.source_id = 
source_competition.source_id

WHERE team_sports_sport.name = %s


It works.

How about that, then?  An explicit option for no aliasing?  That way, if 
someone (me) trips over and falls flat on their face by referencing the 
same table in two places incorrectly, that's explicitly our problem, 
because that's the option we've stipulated in the join.


On Monday, October 1, 2018 at 4:36:04 PM UTC+1, Mike Bayer wrote:
>
>
> Well again, there's two public API points that do what you want, you just 
> don't like calling both, so I don't think a fork is needed, there are a lot 
> of ways to extend Query:
>
> 1. you can subclass it.  Such as, override join() and add a flag 
> "eagerload=True", then apply the contains_eager() option.
>
> 2. you can use the with_transformation hook I illustrated earlier:  
> https://docs.sqlalchemy.org/en/latest/orm/query.html?highlight=with_transformation#sqlalchemy.orm.query.Query.with_transformation
>
> 3. you can make your own MapperOption (bug, it seems to not be linked in 
> the docs right now, it should be, here it is: 
> https://github.com/zzzeek/sqlalchemy/blob/master/lib/sqlalchemy/orm/interfaces.py#L572).
>    
> The only glitch there is that you have to modify the Query in place to do 
> the query.join() part which might not be that simple.
>
> 4. you can just use your own function (not too different from 
> with_transformation)
>
>
> right off, no API that I like is occurring to me as yet.    I dont like 
> hardcoding an AND and assuming the existing join condtion should stay, as 
> the extra_join_condition proposes.  That's a very specific and arbitrary 
> use case, does not support being able to change the ON clause entirely, 
> does not support being able to load the objects from a different entity 
> such as a SELECT statement, does not support being able to change the ORDER 
> BY, etc.    The Query can already generate exact joins/conditions/orderings 
> etc. using its existing API which is here focused on the join() method.  
>
> the query.join(..., eagerload=True) would be the best candidate but it 
> would only make sense with a limited set of arguments that can be passed to 
> joinI().  That is, this makes sense:
>
> query.join(Foo.bar, eagerload=True)
>
> because the relationship is apparent, "Foo.bar".
>
> these do not, because no relationship name is apparent:
>
> query.join(some_table)
>
> query.join(SomeClass, SomeClass.foo == OtherClass.bar)
>
>
> so then the most unambiguous API is :
>
> query.join(<whatever>, eagerload=Foo.bar)
>
> But even then, what if the eagerload here is A.bs -> B.foo -> Foo.bar, and 
> explicitly *not* A.cs -> C.foo -> Foo.bar ?  there's no pathing here and it 
> is again ambiguous (same problem with just a boolean flag also).   The 
> issue of "pathing" is one that users aren't generally aware of because it 
> doesn't come into play for simple queries, but once a mapping is more 
> complicated then they suddenly are aware of it, and it is the main reason 
> why the APIs here have to be very deliberate in their nature.    Part of 
> SQLAlchemy's nature is that everything has to be as generalizable as 
> possible.   We try really hard not to add one-offs that can't be part of a 
> larger composable vocabulary.
>
> It's unfortunately a bad API pattern to take a very explicit and clear 
> API, that is, query.join().options(contains_eager(...)) and attempt to 
> "shorten" it with another API that does the exact same thing (e.g. is 
> redundant) but also introduces complexity and ambiguity (e.g. doesn't even 
> work as well, introduces bugs, etc).   
>
> So I have no ideas here as of yet, though I agree if the existing pattern 
> is a little verbose.
>
> -- 
>> SQLAlchemy - 
>> The Python SQL Toolkit and Object Relational Mapper
>>  
>> http://www.sqlalchemy.org/
>>  
>> To post example code, please provide an MCVE: Minimal, Complete, and 
>> Verifiable Example. See http://stackoverflow.com/help/mcve for a full 
>> description.
>> --- 
>> You received this message because you are subscribed to the Google Groups 
>> "sqlalchemy" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected] <javascript:>.
>> To post to this group, send email to [email protected] 
>> <javascript:>.
>> Visit this group at https://groups.google.com/group/sqlalchemy.
>> For more options, visit https://groups.google.com/d/optout.
>>
>

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Re: [sqlalchemy] Is there any way to get around the "redundancy" of coupled .join and contains_eager chains?

Reply via email to