On Sun, May 13, 2018 at 3:21 PM, Jonathan Vanasco <[email protected]> wrote: > A better way of conveying my problem might be: `get()` returns any object > which is in the identity map; I am having issues after the identity map is > now populated with some objects which are 'partials' that were created via > 'load_only' parameters. The benefits of using `get` to minimize a database > query are outnumbered by dozens of selects to the database. > > I'll try to re-explain everything with a more details below. > > On Sunday, May 13, 2018 at 11:41:56 AM UTC-4, Mike Bayer wrote: >> >> >> Phase one is a query with lots of joinedload, and *also*, a >> load_only() (Is that right? your previous email shows a joinedload + >> load_only, this email OTOH says "phase 1, lots of eagerloading and >> joinedloading", is that the one with the "load_only" ?) >> > > Phase 1 can have a load_only, usually not. The problem with the load_only > is in the stuff that now happens before phase 1. > >> Then, you say, the get() in phase 2 (which we assume is the get() you >> refer towards in your previous email) hits objects from this >> joinedload + loadonly 99% of the time, so 1% of the time they are from >> a cache, which you haven't told me how objects get into that or what >> state they are in. > > > That 1% are direct queries against the Database. The cache I'm talking > about is SqlAlchemy's identity map. > > Phase-1: > - SqlAlchemy queries PostgreSQL for the primary keys of objects under > several criteria. > - The app aggregates all the primary keys by object type, then selects the > objects by the primary key, making extensive use of the joined/eager > loading. > - The generated SqlAlchemy objects are inspected and analyzed for foreign > keys that are of interest, which are then loaded via the aggregated gets. > - Batching selects like this enormously improved PostgreSQL performance by > influencing how it loads/unloads and caches data. The queries only involve > indexes, and then tables only get 1 (sometimes 2) operations with the > primary index. > > Phase-2: > - The objects are interacted with. Their attributes and relationships are > accessed. > - Sometimes this code needs to use a specific object that was > not-necessarily loaded in Phase-1. SqlAlchemy's `get(primary_key)` is used > to leverage the identity map, which would either return an existing object > by primary key, or hit the database. This almost always results in a "cache > hit" from the identity map, so no sql is emitted. Occasionally the object > was not loaded, so sql must be emitted to select it. > > The current problem was created when a new "Phase-0" was introduced. > > In Phase-0, SqlAlchemy queries a number of objects with `load_only` > specifying a small subset of their columns. If these objects are queried in > Phase-2 via `get()`, the identity map usually returns the previously > selected object with a small subset of the the columns; this causes sql to > be emitted and data to load every time a previously untouched attributed or > relationship is accessed for the bulk of the objects' columns and > relationships. This results in dozens of sql selects. > > What I'd like to accomplish: not return any objects which were loaded in > Phase-0 via the `get` queries in Phase-2.
This is not necessarily what you want, if your phase-1 query happens to locate an identity that's already present from phase-0, you just want to populate it. This is easy, call populate_existing() on your phase-1 query. Then any objects it locates will be fully loaded with whatever your query has. > I'd use a second session to handle the Phase-0 objects (which are readonly), > except that means two database connections. that's not true, you can use the same database connection on a second Session. Just say s2 = Session(bind=s1.connection()). > Loading the 'full' objects isn't a good option either - some of these > objects span a half dozen database tables and are mapped back into a single > object via association_proxy. I'm not sure which step/phase this refers towards. But there are two solutions for you up above. > > > -- > SQLAlchemy - > The Python SQL Toolkit and Object Relational Mapper > > http://www.sqlalchemy.org/ > > To post example code, please provide an MCVE: Minimal, Complete, and > Verifiable Example. See http://stackoverflow.com/help/mcve for a full > description. > --- > You received this message because you are subscribed to the Google Groups > "sqlalchemy" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to [email protected]. > To post to this group, send email to [email protected]. > Visit this group at https://groups.google.com/group/sqlalchemy. > For more options, visit https://groups.google.com/d/optout. -- SQLAlchemy - The Python SQL Toolkit and Object Relational Mapper http://www.sqlalchemy.org/ To post example code, please provide an MCVE: Minimal, Complete, and Verifiable Example. See http://stackoverflow.com/help/mcve for a full description. --- You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To post to this group, send email to [email protected]. Visit this group at https://groups.google.com/group/sqlalchemy. For more options, visit https://groups.google.com/d/optout.
