or not available via `get`?

Mike Bayer Sun, 13 May 2018 19:27:51 -0700

On Sun, May 13, 2018 at 3:21 PM, Jonathan Vanasco <[email protected]> wrote:
> A better way of conveying my problem might be: `get()` returns any object
> which is in the identity map; I am having issues after the identity map is
> now populated with some objects which are 'partials' that were created via
> 'load_only' parameters.  The benefits of using `get` to minimize a database
> query are outnumbered by dozens of selects to the database.
>
> I'll try to re-explain everything with a more details below.
>
> On Sunday, May 13, 2018 at 11:41:56 AM UTC-4, Mike Bayer wrote:
>>
>>
>> Phase one is a query with lots of joinedload, and *also*, a
>> load_only() (Is that right?   your previous email shows a joinedload +
>> load_only, this email OTOH says "phase 1, lots of eagerloading and
>> joinedloading", is that the one with the "load_only" ?)
>>
>
> Phase 1 can have a load_only, usually not.  The problem with the load_only
> is in the stuff that now happens before phase 1.
>
>> Then, you say, the get() in phase 2 (which we assume is the get() you
>> refer towards in your previous email) hits objects from this
>> joinedload + loadonly 99% of the time, so 1% of the time they are from
>> a cache, which you haven't told me how objects get into that or what
>> state they are in.
>
>
> That 1% are direct queries against the Database.  The cache I'm talking
> about is SqlAlchemy's identity map.
>
> Phase-1:
> - SqlAlchemy queries PostgreSQL for the primary keys of objects under
> several criteria.
> - The app aggregates all the primary keys by object type, then selects the
> objects by the primary key, making extensive use of the joined/eager
> loading.
> - The generated SqlAlchemy objects are inspected and analyzed for foreign
> keys that are of interest, which are then loaded via the aggregated gets.
> - Batching selects like this enormously improved PostgreSQL performance by
> influencing how it loads/unloads and caches data. The queries only involve
> indexes, and then tables only get 1 (sometimes 2) operations with the
> primary index.
>
> Phase-2:
> - The objects are interacted with. Their attributes and relationships are
> accessed.
> - Sometimes this code needs to use a specific object that was
> not-necessarily loaded in Phase-1. SqlAlchemy's `get(primary_key)` is used
> to leverage the identity map, which would either return an existing object
> by primary key, or hit the database.  This almost always results in a "cache
> hit" from the identity map, so no sql is emitted. Occasionally the object
> was not loaded, so sql must be emitted to select it.
>
> The current problem was created when a new "Phase-0" was introduced.
>
> In Phase-0, SqlAlchemy queries a number of objects with `load_only`
> specifying a small subset of their columns.  If these objects are queried in
> Phase-2 via `get()`, the identity map usually returns the previously
> selected object with a small subset of the the columns; this causes sql to
> be emitted and data to load every time a previously untouched attributed or
> relationship is accessed for the bulk of the objects' columns and
> relationships.  This results in dozens of sql selects.
>
> What I'd like to accomplish: not return any objects which were loaded in
> Phase-0 via the `get` queries in Phase-2.


This is not necessarily what you want, if your phase-1 query happens
to locate an identity that's already present from phase-0, you just
want to populate it.  This is easy, call populate_existing() on your
phase-1 query.   Then any objects it locates will be fully loaded with
whatever your query has.


> I'd use a second session to handle the Phase-0 objects (which are readonly),
> except that means two database connections.

that's not true, you can use the same database connection on a second
Session.   Just say s2 = Session(bind=s1.connection()).

> Loading the 'full' objects isn't a good option either - some of these
> objects span a half dozen database tables and are mapped back into a single
> object via association_proxy.

I'm not sure which step/phase this refers towards. But there are two
solutions for you up above.


>
>
> --
> SQLAlchemy -
> The Python SQL Toolkit and Object Relational Mapper
>
> http://www.sqlalchemy.org/
>
> To post example code, please provide an MCVE: Minimal, Complete, and
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full
> description.
> ---
> You received this message because you are subscribed to the Google Groups
> "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at https://groups.google.com/group/sqlalchemy.
> For more options, visit https://groups.google.com/d/optout.

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Re: [sqlalchemy] is it possible to make load_only local to an object only and/or avoid the identity map and/or not available via `get`?

Reply via email to