Re: [sqlalchemy] Model class creation and configure_mapper performance

Mike Bayer Thu, 08 Mar 2018 13:54:39 -0800

On Thu, Mar 8, 2018 at 2:55 PM,  <[email protected]> wrote:
> Hi,
>
> Our team has been intensively using SQLAlchemy for five years now. Over the
> years, we've accumulated a large number of models (roughly 350).
>
> Over time, our app start-up has also slowed quite a bit. This has some
> negative effects since it's tied to several dev processes like starting up a
> shell environment, locally reloading our web server, and running our tests.
>
> I have been doing some profiling to try to identify why, and it turns out a
> large portion of this time (~4.5s) is taken up setting up the model classes
> and configuring the mappers:
>
> 2.1s is spent setting up the initial mappings:
> sqlalchemy/ext/declarative/base.py:106(setup_mapping)
> 2.4s is spent configuring the mapping:
> sqlalchemy/orm/mapper.py:2945(configure_mappers)
>
> While each individual call does not take much time, it ends up taking a long
> time cumulatively for our hundreds of models.
>
> Does anybody have advice on changes we can make to our app or models to
> improve the performance here? Attached some SQLAlchemy perf files. Happy to
> dive more into these if they might be tackleable on my end!
>


Unfortunately not in the short term.  I have put enormous effort
over the past ten years into improving the performance of SQLAlchemy,
within virtually all areas of its *runtime* behavior - that is, the
speed of the connection pool, the speed of executing statements, the
speed of fetching results, the speed of mapped objects being loaded
and flushed, and within the generation of SQL statements within core
and ORM.  Speed and performance concerns are intrinsic to virtually
every coding decision I approve to go into SQLAlchemy.

A key strategy by which these performance improvements are organized
is to build something of a "just in time configurator" - that is, a
result set that calls down into datatypes as it iterates through rows
and columns is reorganized to quickly create a simplified collection
of callables when the result is first acquired, which then provide
minimal overhead for the same operation during row fetching.  Or, a
mapper that configures structures like "with_polymorphic_mappers",
"insert_cols_evaluating_none", "pk_keys_to_table" that are designed to
provide extremely fast datastructures tailored to extremely specific
steps in the query process or session's flush process, where the
expense of creating these structures is pushed into a single "up front"
calculation, in many cases within the mapper init or
configure_mappers() steps you refer to.  Most of the
performance-critical aspects of SQLAlchemy have a corresponding "up
front figure out what we'll be doing" phase, and since this phase
is not expected to occur more than once, it is held to the least
demanding performance requirements.  The problem is, within those
processes themselves, there's really not a way I can easily continue
to push the time spent up to some other phase, short of compiling
things into some kind of serialized cached structure that you load
from a file.    There are likely a lot of performance wins within
these areas if I had the resources to look for them but they aren't
likely to be larger than the 10-20% speedup variety.
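
To illustrate the "up front" idea in general terms, here is a minimal
sketch in plain Python.  It is not SQLAlchemy's actual code; every name
in it (slow_fetch, build_processors, fast_fetch, _CONVERTERS) is
hypothetical.  The first function decides how to convert each value on
every row; the second approach resolves one callable per column once,
so the per-row loop only has to call it.

```python
from datetime import date

# Hypothetical illustration only; none of these names are SQLAlchemy APIs.

def slow_fetch(rows, column_types):
    """Re-decide how to convert every value, on every row."""
    out = []
    for row in rows:
        converted = []
        for value, type_name in zip(row, column_types):
            if type_name == "int":
                converted.append(int(value))
            elif type_name == "date":
                converted.append(date.fromisoformat(value))
            else:
                converted.append(value)
        out.append(tuple(converted))
    return out


# Lookup table consulted once per column, not once per value.
_CONVERTERS = {"int": int, "date": date.fromisoformat}


def build_processors(column_types):
    """One-time "up front" step: resolve each column to a callable or None."""
    return [_CONVERTERS.get(type_name) for type_name in column_types]


def fast_fetch(rows, processors):
    """Per-row work is reduced to calling the prepared callables."""
    return [
        tuple(
            proc(value) if proc is not None else value
            for proc, value in zip(processors, row)
        )
        for row in rows
    ]


rows = [("1", "2018-03-08", "a"), ("2", "2018-03-09", "b")]
types = ["int", "date", "str"]

processors = build_processors(types)   # paid once
result = fast_fetch(rows, processors)  # cheap per row
assert result == slow_fetch(rows, types)
```

The cost of build_processors() does not go away; it is simply paid once
instead of per row, which is the same trade the mapper setup makes at
startup.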

That said, these results still look quite slow, and I wonder if these
are against low-CPU instances; 300 classes shouldn't be a 9 second
operation.  I see there are over 5000 schema object attachment
events (e.g. Column to Table), which implies you also have really
big table structures.  Anywhere you can avoid referring to columns or
tables you don't actually use in your application would cut way down
on these times.  There is also the notion of not having your whole
application import all modules at once, if you can swing it.  But
overall you're running a really large application server here within
an interpreted scripting language; startup overhead has to be expected
to some degree and is typical within large enterprise applications.
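
As a rough sketch of the "don't import everything at once" idea, one
option is to look classes up through a small registry that imports the
defining module only on first use.  The registry and the get_class()
helper below are hypothetical, and standard library modules stand in
for application model modules so that the example runs anywhere.

```python
import importlib

# Hypothetical registry: class name -> module that defines it.
# In a real application these would be your own model modules;
# stdlib modules are used here only as stand-ins.
_REGISTRY = {
    "Decimal": "decimal",
    "Fraction": "fractions",
}


def get_class(name):
    """Import the defining module on first use, then return the class."""
    module = importlib.import_module(_REGISTRY[name])
    return getattr(module, name)


# Only the module behind "Fraction" is imported by this call.
Fraction = get_class("Fraction")
total = Fraction(1, 2) + Fraction(1, 3)
```

One caveat, stated as an assumption about your setup: if your models
use relationship() with string class names, the target classes need to
be imported before the mappers are configured, so this kind of
deferral only works for groups of models that do not refer to each
other.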



> --
> SQLAlchemy -
> The Python SQL Toolkit and Object Relational Mapper
>
> http://www.sqlalchemy.org/
>
> To post example code, please provide an MCVE: Minimal, Complete, and
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full
> description.
> ---
> You received this message because you are subscribed to the Google Groups
> "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at https://groups.google.com/group/sqlalchemy.
> For more options, visit https://groups.google.com/d/optout.

