anyway, with some profiling of loading 1000 rows with six deferred cols, the function count with the defer was over 50K and without the defer around 37K; the patch is now committed and with the defer it's at 32K, so a bit less than that of loading the data, as it should be (more would be better sure, but this is a decent improvement). The change to get that much wasn't that big a deal so this is in 0.8.
On Jul 11, 2013, at 2:04 PM, Michael Bayer <[email protected]> wrote: > please try out this patch: > > http://www.sqlalchemy.org/trac/attachment/ticket/2778/2778.patch > > > which refactors this particular system to not require the production of a new > object per instance, which is the slowest part of this, and also inlines the > work of assembling the callable. This should give you 50% or more method > call improvement. if this is enough, this might be OK for 0.8. > > > > > > On Jul 11, 2013, at 1:40 PM, Michael Bayer <[email protected]> wrote: > >> well what kind of data are we talking about? defer()'s use case was for >> binary large objects and such, fields that are many K/Megs in size. if >> you're deferring a bunch of ints, then yes it's not optimized very well for >> that. >> >> Half of the overhead could be easily fixed here, creating those >> LoadDeferredColumns objects could be offloaded to a later point. The >> other half, setting up that callable, I'd have to spend some time reviewing >> the use cases here. The difference between an attribute that is "deferred" >> vs. one that is "expired" is that if you access some other expired >> attribute, the "deferred" attribute will still not load - because the use >> case is, you really don't want this BLOB column to load unless you touch it >> specifically. So to get that instruction into the state, "don't load these >> keys even on an unexpire", uses some kind of method call on every state. >> InstanceState._set_callable could be inlined more here to do less work, >> instructions up to the loader process just to populate a key in a dictionary >> maybe, though these reorganizations can destabilize the code. it's not >> something I'd be comfortable doing in 0.8, the ticket I created >> (http://www.sqlalchemy.org/trac/ticket/2778) has any potential work here for >> 0.9. >> >> The other way to go here is to provide a query option that explicitly >> delivers the attribute as "expired" as opposed to "deferred", looking at how >> that works right now I can give you the recipe below, but it still involves >> a function call per column so that the InstanceState knows the attribute is >> "expired". >> >> >> from sqlalchemy.orm.strategies import DeferredOption, DeferredColumnLoader >> >> >> class DontLoadColumnOption(DeferredOption): >> def get_strategy_class(self): >> return NoColumnLoader >> >> >> class NoColumnLoader(DeferredColumnLoader): >> def create_row_processor(self, context, path, mapper, row, adapter): >> if not self.is_class_level: >> def set_deferred_for_local_state(state, dict_, row): >> state.callables[self.key] = state >> return set_deferred_for_local_state, None, None >> else: >> return super(NoColumnLoader, self).create_row_processor( >> context, path, mapper, row, adapter) >> >> if __name__ == '__main__': >> from sqlalchemy import * >> from sqlalchemy.orm import * >> from sqlalchemy.ext.declarative import declarative_base >> >> Base = declarative_base() >> >> class A(Base): >> __tablename__ = 'a' >> >> id = Column(Integer, primary_key=True) >> >> x = Column(Integer) >> y = Column(Integer) >> z = Column(Integer) >> q = Column(Integer) >> >> e = create_engine("sqlite://", echo=True) >> Base.metadata.create_all(e) >> s = Session(e) >> s.add_all([ >> A(x="x%d" % i, y="y%d" % i, z="z%d" % i, q="q%d" % i) >> for i in xrange(1000) >> ]) >> s.commit() >> s.close() >> >> loaded = s.query(A).options(DontLoadColumnOption("y"), >> DontLoadColumnOption("z")).order_by(A.id).all() >> >> for a in loaded: >> assert 'y' not in a.__dict__ >> assert 'z' not in a.__dict__ >> assert 'x' in a.__dict__ >> assert 'q' in a.__dict__ >> >> assert a.z == "z%d" % (a.id - 1), a.z >> >> >> >> >> >> >> >> >> >> >> >> >> >> On Jul 11, 2013, at 10:23 AM, "Gombas, Gabor" >> <[email protected]> wrote: >> >>> >>> I did need the objects, not just the raw data, otherwise I’d had to >>> duplicate a bunch of existing code which expected full-blown objects to >>> operate on. Modifying the mapper is not really an option unless the >>> majority of the users have the same requirements, otherwise I end up having >>> to add a huge amount of undefer() calls everywhere else (and the query >>> storm caused by missing an undefer() would be much more painful). Maybe the >>> documentation of defer() could mention that the use of the option can in >>> fact reduce performance, because it’s not intuitive that loading unneeded >>> data is cheaper than not loading it. >>> >>> From: [email protected] [mailto:[email protected]] On >>> Behalf Of Michael Bayer >>> Sent: 11 July 2013 15:34 >>> To: [email protected] >>> Subject: Re: [sqlalchemy] The cost of defer() >>> >>> the path would be to figure out if the logic of a per-query defer option >>> can somehow be linked to the attribute when it hits its normal refresh >>> logic - if those attribute were set up as "deferred" at the mapper config >>> level (where deferred is usually used), you wouldn't see this overhead >>> since the deferred loader would be the default callable. But if you only >>> need a few attributes why not just go the other way and query for those >>> attributes directly? that would save you way more overhead than even if >>> we removed all overhead from defer(), since the most expensive thing is all >>> the mapper identity map logic that takes place when full entities are >>> loaded. >>> >>> >>> On Jul 11, 2013, at 4:02 AM, "Gombas, Gabor" >>> <[email protected]> wrote: >>> >>> >>> >>> Hi, >>> >>> I wrote a query joining a couple of tables, returning over a hundred >>> thousand rows. Since I only needed to access a couple of the attributes of >>> the returned objects for this specific use case, I thought to use a dozen >>> or so Query.options(defer(…)) calls to avoid loading the unneeded columns. >>> But to my surprise, the query became much slower. Profiling attributed >>> almost all the extra time to set_deferred_for_local_state() and >>> LoadDeferredColumns.__init__(): >>> >>> 827855 15.668 0.000 27.068 0.000 >>> …/SQLAlchemy-0.8.1-py2.6-linux-x86_64.egg/sqlalchemy/orm/strategies.py:184(set_deferred_for_local_state) >>> 827855 10.524 0.000 10.524 0.000 >>> …/SQLAlchemy-0.8.1-py2.6-linux-x86_64.egg/sqlalchemy/orm/strategies.py:259(__init__) >>> >>> If defer() is so expensive, then it is not very useful. Would it be >>> possible to make it cheaper? >>> >>> Gabor >>> >>> >>> >>> NOTICE: Morgan Stanley is not acting as a municipal advisor and the >>> opinions or views contained herein are not intended to be, and do not >>> constitute, advice within the meaning of Section 975 of the Dodd-Frank Wall >>> Street Reform and Consumer Protection Act. If you have received this >>> communication in error, please destroy all electronic and paper copies and >>> notify the sender immediately. Mistransmission is not intended to waive >>> confidentiality or privilege. Morgan Stanley reserves the right, to the >>> extent permitted under applicable law, to monitor electronic >>> communications. This message is subject to terms available at the following >>> link: http://www.morganstanley.com/disclaimers If you cannot access these >>> links, please notify us by reply message and we will send the contents to >>> you. By messaging with Morgan Stanley you consent to the foregoing. >>> >>> >>> >>> >>> -- >>> You received this message because you are subscribed to the Google Groups >>> "sqlalchemy" group. >>> To unsubscribe from this group and stop receiving emails from it, send an >>> email to [email protected]. >>> To post to this group, send email to [email protected]. >>> Visit this group at http://groups.google.com/group/sqlalchemy. >>> For more options, visit https://groups.google.com/groups/opt_out. >>> >>> >>> >>> -- >>> You received this message because you are subscribed to the Google Groups >>> "sqlalchemy" group. >>> To unsubscribe from this group and stop receiving emails from it, send an >>> email [email protected]. >>> To post to this group, send email to [email protected]. >>> Visit this group at http://groups.google.com/group/sqlalchemy. >>> For more options, visit https://groups.google.com/groups/opt_out. >>> >>> >>> >>> >>> >>> NOTICE: Morgan Stanley is not acting as a municipal advisor and the >>> opinions or views contained herein are not intended to be, and do not >>> constitute, advice within the meaning of Section 975 of the Dodd-Frank Wall >>> Street Reform and Consumer Protection Act. If you have received this >>> communication in error, please destroy all electronic and paper copies and >>> notify the sender immediately. Mistransmission is not intended to waive >>> confidentiality or privilege. Morgan Stanley reserves the right, to the >>> extent permitted under applicable law, to monitor electronic >>> communications. This message is subject to terms available at the following >>> link:http://www.morganstanley.com/disclaimers If you cannot access these >>> links, please notify us by reply message and we will send the contents to >>> you. By messaging with Morgan Stanley you consent to the foregoing. >>> >>> >>> >>> >>> -- >>> You received this message because you are subscribed to the Google Groups >>> "sqlalchemy" group. >>> To unsubscribe from this group and stop receiving emails from it, send an >>> email [email protected]. >>> To post to this group, send email to [email protected]. >>> Visit this group at http://groups.google.com/group/sqlalchemy. >>> For more options, visit https://groups.google.com/groups/opt_out. >> >> >> -- >> You received this message because you are subscribed to the Google Groups >> "sqlalchemy" group. >> To unsubscribe from this group and stop receiving emails from it, send an >> email to [email protected]. >> To post to this group, send email to [email protected]. >> Visit this group at http://groups.google.com/group/sqlalchemy. >> For more options, visit https://groups.google.com/groups/opt_out. >> >> > > > -- > You received this message because you are subscribed to the Google Groups > "sqlalchemy" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to [email protected]. > To post to this group, send email to [email protected]. > Visit this group at http://groups.google.com/group/sqlalchemy. > For more options, visit https://groups.google.com/groups/opt_out. > > -- You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To post to this group, send email to [email protected]. Visit this group at http://groups.google.com/group/sqlalchemy. For more options, visit https://groups.google.com/groups/opt_out.
