or not available via `get`?

Jonathan Vanasco Sun, 13 May 2018 12:22:16 -0700

A better way of conveying my problem might be: `get()` returns any object 
which is in the identity map; I am having issues after the identity map is 
now populated with some objects which are 'partials' that were created via 
'load_only' parameters.  The benefits of using `get` to minimize a database 
query are outnumbered by dozens of selects to the database.

I'll try to re-explain everything with a more details below.

On Sunday, May 13, 2018 at 11:41:56 AM UTC-4, Mike Bayer wrote:
>
>
> Phase one is a query with lots of joinedload, and *also*, a 
> load_only() (Is that right?   your previous email shows a joinedload + 
> load_only, this email OTOH says "phase 1, lots of eagerloading and 
> joinedloading", is that the one with the "load_only" ?) 
>
>
Phase 1 can have a load_only, usually not.  The problem with the load_only 
is in the stuff that now happens before phase 1.

Then, you say, the get() in phase 2 (which we assume is the get() you 
> refer towards in your previous email) hits objects from this 
> joinedload + loadonly 99% of the time, so 1% of the time they are from 
> a cache, which you haven't told me how objects get into that or what 
> state they are in. 
>

That 1% are direct queries against the Database.  The cache I'm talking 
about is SqlAlchemy's identity map.

Phase-1:
- SqlAlchemy queries PostgreSQL for the primary keys of objects under 
several criteria.
- The app aggregates all the primary keys by object type, then selects the 
objects by the primary key, making extensive use of the joined/eager 
loading.
- The generated SqlAlchemy objects are inspected and analyzed for foreign 
keys that are of interest, which are then loaded via the aggregated gets.
- Batching selects like this enormously improved PostgreSQL performance by 
influencing how it loads/unloads and caches data. The queries only involve 
indexes, and then tables only get 1 (sometimes 2) operations with the 
primary index.

Phase-2:
- The objects are interacted with. Their attributes and relationships are 
accessed.
- Sometimes this code needs to use a specific object that was 
not-necessarily loaded in Phase-1. SqlAlchemy's `get(primary_key)` is used 
to leverage the identity map, which would either return an existing object 
by primary key, or hit the database.  This almost always results in a 
"cache hit" from the identity map, so no sql is emitted. Occasionally the 
object was not loaded, so sql must be emitted to select it.

The current problem was created when a new "Phase-0" was introduced.

In Phase-0, SqlAlchemy queries a number of objects with `load_only` 
specifying a small subset of their columns.  If these objects are queried 
in Phase-2 via `get()`, the identity map usually returns the previously 
selected object with a small subset of the the columns; this causes sql to 
be emitted and data to load every time a previously untouched attributed or 
relationship is accessed for the bulk of the objects' columns and 
relationships.  This results in dozens of sql selects.

What I'd like to accomplish: not return any objects which were loaded in 
Phase-0 via the `get` queries in Phase-2.
I'd use a second session to handle the Phase-0 objects (which are 
readonly), except that means two database connections.
Loading the 'full' objects isn't a good option either - some of these 
objects span a half dozen database tables and are mapped back into a single 
object via association_proxy.

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Re: [sqlalchemy] is it possible to make load_only local to an object only and/or avoid the identity map and/or not available via `get`?

Reply via email to