Hello,

I have a very simple data model with a one to many relationship - a thread 
could have many identifiers that link to it:

class Thread(Base):
        __tablename__ = 'threads'

        id = Column(Integer, primary_key=True)
        link = Column(String(length=512, collation=UTF8), nullable=False)
        identifiers = relationship('ThreadIdentifier', backref='thread')

        def __repr__(self):
                return u"<Thread: link {0} id {1}>".format(self.link, 
self.id)


class ThreadIdentifier(Base):
        __tablename__ = 'threads_identifiers'

        id = Column(Integer, primary_key=True) # lame, (identifier, 
thread_id could be PK)
        identifier = Column(String(length=255), index=True)
        thread_id = Column(Integer, ForeignKey('threads.id'), 
nullable=False, primary_key=True)

        def __repr__(self):
                return u"<ThreadIdentifier: {0} (tid 
{1})>".format(self.identifier, self.thread_id)

If I try to query for threads with a certain identifier, the following SQL 
gets generated:

> s.query(Thread).filter(Thread.identifiers.any(ThreadIdentifier.identifier 
== 'foo')).one()
015-06-10 19:02:26,181 INFO sqlalchemy.engine.base.Engine SELECT threads.id 
AS threads_id, threads.link AS threads_link
FROM threads
WHERE EXISTS (SELECT 1
FROM threads_identifiers
WHERE threads.id = threads_identifiers.thread_id AND 
threads_identifiers.identifier = %s)
2015-06-10 19:02:26,181 INFO sqlalchemy.engine.base.Engine ('foo',)

The query returns the correct results but is incredibly slow because 
MySQL's query engine apparently doesn't optimize this EXISTS clause. It 
does a full table scan of the threads table and a subquery for each row.

If I use an inner join instead (this was just raw SQL I typed in):

select threads.id, threads.link FROM threads INNER JOIN threads_identifiers 
ON threads_identifiers.thread_id = threads.id WHERE 
threads_identifiers.identifier = 'foo';

Then MySQL is smart enough to use indices and the query returns incredibly 
quickly. 

Is there a good way to leverage the .any() filter in MySQL? I can sort of 
work around this by doing something like: 

s.query(ThreadIdentifier).filter(ThreadIdentifier.identifier == 
'foo').one().thread

but the code is much trickier to read.

thanks
Eric

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Reply via email to