> -----Original Message-----
> From: sqlalchemy@googlegroups.com [mailto:sqlalchemy@googlegroups.com]
> On Behalf Of Knack
> Sent: 14 June 2011 18:43
> To: sqlalchemy
> Subject: [sqlalchemy] General questions of a newbee
> 
> Hi guys,
> 
> I've done some programming, but I'm new to RDBMS and ORMs. I've read
> some documentation, but before diving in deeper and doing some
> tutorials, I'm trying to understand what can be done with SQLAlchemy
> and get a coarse understanding of how it works.
> 
> Imagine some tables which are all related (like 'created by') to a
> user by a foreign key. If I query all tables by a certain user, I
> assume SQLAlchemy loads and creates all objects which have references
> in the column 'created by' to the certain user. Like a 'manual' eager
> loading. If I use the objects properties to follow the relations,
> does
> SQLA need to perform any more DB accesses? Or are the referenced
> objects directly referenced (maybe with properties that stores the
> direct reference after resolving after the first call)?
> 
> How about backrefs? Would every call to those require a new SQL query
> under the hood? Or are those 'stored' in the ORM after the first call?
> I guess this would impact how to model parent-children relations. On
> the one hand it seems like an easy life to me if the parents don't
> need references to the children in the database, as children could be
> added without modifing the parents. One the other hand, how's the
> performance impact if you need to get the children by backref calls?
> 

SQLAlchemy gives you a lot of control over when related objects are
accessed - the full details are at
http://www.sqlalchemy.org/docs/orm/loading.html.

When you configure a relationship between 2 classes, the default load
behaviour is known as "lazy loading". This means that the related object
will only be loaded when you first access the property on the parent.
Once an object is loaded, it is stored in the SQLAlchemy session object.
Subsequent requests for that same object (ie. same type and primary key)
will get the object from the session rather than going to the database.

That's not a very clear explanation - perhaps an example would help.
Imagine you were modelling a blog, and you had Post items and User
items. Posts have a 'created_by_id' foreign key to the User table, and a
'created_by' relationship which gives you the actual User object. Now
imagine that you have 3 posts in the database, created by 2 different
users.

Here's what happens when you load all the posts and then access their
'created_by' property, in the default configuration.

   posts = session.query(Post).all()

...runs something like 'SELECT * from post'

   print posts[0].created_by

...SA looks at the created_by_id on posts[0], then checks to see if it
already has a User with that id in the session. It doesn't, so it
retrieves it from the database ("SELECT * from user where id = :id"),
stores it in the session, and returns it to you.

   print posts[1].created_by

...SA checks posts[1].created_by_id again. It is the same as
posts[0].created_by_id. SA already has that user in the session, so it
returns the same user without going to the database.

  print posts[2].created_by

...this post was created by a different user, which isn't already in the
session, so SA goes to the database again.

The posts themselves have now been stored in the session, so if you
wrote the following:

  post = session.query(Post).get(1)

...SA would see that post 1 already exists in the session and not go
back to the database. Note that this only works for the 'get' method -
if you try to do any other kind of query, SA will still run the query.
However, when it's reading the rows back, it will try to match those
rows up with objects already in the session. If it finds a match, the
instance from the session will be returned. This ensures that (for a
given session) you will only ever have one instance representing a row
in the database.

If you were working with a large number of posts and users, it would be
very inefficient to (potentially) run a new query for each post just to
get the user that created it. SQLAlchemy allows you to request a
different loading strategy:

   posts = (session.query(Post)
            .options(joinedload('created_by'))
            .all())

...issues something like:

   SELECT *
   FROM post
   LEFT JOIN user ON post.created_by_id = user.id

ie. the users will be loaded in the same query as the posts. After this,
SA will not need to go back to the database when you access the
'created_by' property, even the first time.

Backrefs are not really any different from forward references, and the
same conditions apply. I think there may be a slight caveat though. If
you wrote:

   posts = session.query(Post).all()
   user = posts[0].created_by
   print user.posts

...I don't think SA has any way of knowing that all the posts from the
DB have been already been loaded into the session. It will run something
like 'SELECT * from post where created_by_id = :user_id', before
matching each row up with the Post instances that already exist in the
session.

I hope that helps,

Simon

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To post to this group, send email to sqlalchemy@googlegroups.com.
To unsubscribe from this group, send email to 
sqlalchemy+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en.

Reply via email to