[sqlalchemy] SQLAlchemy 0.5.4 Released

Michael Bayer Sun, 17 May 2009 18:20:20 -0700

Hello list -

SQLAlchemy 0.5.4 is released, and this release is *highly* recommended
for all users.    For an indication of how high, let's just say, higher
than 0.5.3, 0.5.2, and 0.5.1 combined.


Not to worry, there are no security holes or memory leaks in previous  
versions.   But we have neutralized some major, major speed bumps in  
the flush() process, as well as made significant improvements to  
memory usage.     Due to the removal of these bugs, large dataset  
scenarios that were more or less impossible to work with for any  
version of SQLAlchemy now run at top speed, with the same rate of
performance regardless of how large the Session grows.    Other
performance issues that were proportional to the number of  
interconnected mapped classes, memory/speed issues related to the  
number of relation()s set up on mappers during a load, or just  
wasteful overhead during the flush have been mitigated.   The  
improvements are of a magnitude such that some applications that had  
abandoned the ORM due to latency related to large sets of objects may
be able to come back to it and regain all its advantages.

The key to all these improvements is that I finally have a job using
SQLAlchemy full time where I've gotten the opportunity to use the ORM  
with somewhat large amounts of data.   None of these issues were very  
deep and just required that I spend some more time with the profiler  
and bigger datasets.    My own use case here is a 6500 row spreadsheet  
of interconnected objects, representing about 25K rows - the process  
of ingesting that data, which requires that all of the objects stay
present in the session, has gone from 33 minutes to 8.     The
key is that the number of method calls to flush X number of objects is  
now the same for a session regardless of how many other non-dirty  
items are present.   Similarly, a mapping setup that has 30 mappers  
configured will not be slowed down by unnecessary traversal of all the  
mapper relations.
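
As a rough illustration of why this matters, here is a toy model in
plain Python. It is only a sketch and not SQLAlchemy's Session or its
internals: ToySession, its attributes, and work_for() are hypothetical
names. It counts how many objects a flush has to examine when it scans
the whole session versus when it consults only a set of dirty keys.

```python
class ToySession:
    """Toy model only: not SQLAlchemy's Session or its internals."""

    def __init__(self):
        self.identity_map = {}   # every loaded object, keyed by primary key
        self.dirty = set()       # keys of objects with pending changes
        self.visited = 0         # number of objects examined by flush calls

    def add_clean(self, key, obj):
        self.identity_map[key] = obj

    def mark_dirty(self, key):
        self.dirty.add(key)

    def flush_full_scan(self):
        """Examine every object in the session: cost grows with its size."""
        flushed = []
        for key in self.identity_map:
            self.visited += 1
            if key in self.dirty:
                flushed.append(key)
        self.dirty.clear()
        return flushed

    def flush_dirty_only(self):
        """Examine only the objects already known to be dirty."""
        flushed = sorted(self.dirty)
        self.visited += len(flushed)
        self.dirty.clear()
        return flushed


def work_for(session_size, dirty_count, method_name):
    """Return how many objects one flush examined (hypothetical helper)."""
    session = ToySession()
    for key in range(session_size):
        session.add_clean(key, object())
    for key in range(dirty_count):
        session.mark_dirty(key)
    getattr(session, method_name)()
    return session.visited
```

With 5 dirty objects, the full scan examines 1000 objects in a 1000
object session and 25000 in a 25000 object session, while the
dirty-only flush examines 5 in both cases.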

The Session itself, which for some time now has been "weak
referencing" with regards to its contents, has been repaired such that  
the weak referencing behavior is now fully operational.  Previously,  
objects which were related via mutual backrefs would not get cleared  
from the session when all external references were lost until you  
expunged them.     That is no longer necessary - the Session now has  
no strong references whatsoever to its contents, as long as no changes  
are pending on those objects.   Pending changes as always are strongly  
referenced until flushed.   So now you can iterate through as many  
tens of thousands of objects as you like (keeping in mind that an
individual Query still loads its full result set into memory unless
yield_per is enabled) and there's no need to expunge the session in
between chunks.
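
The general idea can be sketched with the standard library's weakref
module. This is a toy model under stated assumptions, not the actual
Session implementation: Node, ToyWeakSession and demo() are
hypothetical names. Unmodified objects are held only weakly, objects
with pending changes are held strongly until flush, and a parent/child
pair linked by mutual references is still released once nothing else
refers to it (gc.collect() is called because reference cycles are
reclaimed by the cycle collector rather than by reference counting).

```python
import gc
import weakref


class Node:
    """Plain object with a mutual reference to its parent (a "backref")."""

    def __init__(self, name):
        self.name = name
        self.parent = None
        self.children = []

    def add_child(self, child):
        self.children.append(child)
        child.parent = self   # parent <-> child is now a reference cycle


class ToyWeakSession:
    """Toy model only: not SQLAlchemy's Session."""

    def __init__(self):
        self._identity_map = weakref.WeakValueDictionary()
        self._modified = {}   # strong references to objects with pending changes

    def add(self, key, obj):
        self._identity_map[key] = obj

    def mark_modified(self, key):
        self._modified[key] = self._identity_map[key]

    def flush(self):
        # Pending changes are "written"; the strong references are dropped.
        self._modified.clear()

    def __len__(self):
        return len(self._identity_map)


def demo():
    session = ToyWeakSession()
    parent, child = Node("parent"), Node("child")
    parent.add_child(child)
    session.add(1, parent)
    session.add(2, child)
    session.mark_modified(2)

    del parent, child
    gc.collect()
    # The pending child is held strongly and keeps its parent alive too.
    held_while_pending = len(session)

    session.flush()
    gc.collect()
    # Nothing refers to the cycle any more, so both entries disappear.
    held_after_flush = len(session)
    return held_while_pending, held_after_flush
```

Under CPython, demo() returns (2, 0): both objects survive while a
change is pending, and both are gone after the flush without any
explicit expunge.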

The loading of objects has also been sped up and reduced in memory  
overhead by killing a wasteful structure of callables that was  
generated on a per-relation()/per-object basis whenever  
query.options() was used.

In other news I've backported a convenient extension from the 0.6  
series which allows you to create custom SQL expression elements with  
compiler functions.   This is the "compiler" extension and is  
described in the documentation.
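
To give a feel for the idea without reproducing the extension's real
API (which is in the documentation), here is a toy model in plain
Python. Everything in it is an assumption made for illustration:
register_compiler(), compile_element() and CurrentTimestamp are
hypothetical names, and the dialect is just a string. It shows one
custom element rendering differently depending on the target database.

```python
_compilers = {}   # maps (element class, dialect name or None) -> function


def register_compiler(element_cls, dialect=None):
    """Hypothetical decorator; the real extension's API is in its docs."""
    def decorate(fn):
        _compilers[(element_cls, dialect)] = fn
        return fn
    return decorate


def compile_element(element, dialect):
    """Use the dialect-specific function if one exists, else the default."""
    cls = type(element)
    fn = _compilers.get((cls, dialect)) or _compilers.get((cls, None))
    if fn is None:
        raise TypeError("no compiler registered for %s" % cls.__name__)
    return fn(element)


class CurrentTimestamp:
    """Hypothetical custom expression element."""


@register_compiler(CurrentTimestamp)
def _default_now(element):
    # Default rendering, used when no dialect-specific function matches.
    return "CURRENT_TIMESTAMP"


@register_compiler(CurrentTimestamp, "mssql")
def _mssql_now(element):
    # Rendering used only when compiling for the "mssql" dialect name.
    return "GETDATE()"
```

Here compile_element(CurrentTimestamp(), "mssql") gives "GETDATE()",
and any other dialect name falls back to "CURRENT_TIMESTAMP".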

Download SQLAlchemy 0.5.4 (right now !!  get rid of whatever buggy old  
version you're using) at:

http://www.sqlalchemy.org/download.html


0.5.4
=====

- orm
     - Significant performance enhancements regarding Sessions/flush()
       in conjunction with large mapper graphs, large numbers of
       objects:

       - Removed all* O(N) scanning behavior from the flush() process,
         i.e. operations that were scanning the full session,
         including an extremely expensive one that was erroneously
         assuming primary key values were changing when this
         was not the case.

         * one edge case remains which may invoke a full scan,
           if an existing primary key attribute is modified
           to a new value.

       - The Session's "weak referencing" behavior is now *full* -
         no strong references whatsoever are made to a mapped object
         or related items/collections in its __dict__.  Backrefs and
         other cycles in objects no longer affect the Session's ability
         to lose all references to unmodified objects.  Objects with
         pending changes still are maintained strongly until flush.
         [ticket:1398]

         The implementation also improves performance by moving
         the "resurrection" process of garbage collected items
         to only be relevant for mappings that map "mutable"
         attributes (i.e. PickleType, composite attrs).  This removes
         overhead from the gc process and simplifies internal
         behavior.

         If a "mutable" attribute change is the sole change on an object
         which is then dereferenced, the mapper will not have access to
         other attribute state when the UPDATE is issued.  This may
         present itself differently to some MapperExtensions.

         The change also affects the internal attribute API, but not
         the AttributeExtension interface nor any of the publicly
         documented attribute functions.

       - The unit of work no longer generates a graph of "dependency"
         processors for the full graph of mappers during flush(),
         instead creating such processors only for those mappers which
         represent objects with pending changes.  This saves a
         tremendous number of method calls in the context of a large
         interconnected graph of mappers.

       - Cached a wasteful "table sort" operation that previously
         occurred multiple times per flush, also removing significant
         method call count from flush().

       - Other redundant behaviors have been simplified in
         mapper._save_obj().

     - Modified query_cls on DynamicAttributeImpl to accept a full
       mixin version of the AppenderQuery, which allows subclassing
       the AppenderMixin.

     - The "polymorphic discriminator" column may be part of a
       primary key, and it will be populated with the correct
       discriminator value.  [ticket:1300]

     - Fixed the evaluator not being able to evaluate IS NULL clauses.

     - Fixed the "set collection" function on "dynamic" relations to
       initiate events correctly.  Previously a collection could only
       be assigned to a pending parent instance, otherwise modified
       events would not be fired correctly.  Set collection is now
       compatible with merge(), fixes [ticket:1352].

     - Allowed pickling of PropertyOption objects constructed with
       instrumented descriptors; previously, pickle errors would occur
       when pickling an object which was loaded with a descriptor-based
       option, such as query.options(eagerload(MyClass.foo)).

     - Lazy loader will not use get() if the "lazy load" SQL clause
       matches the clause used by get(), but contains some parameters
       hardcoded.  Previously the lazy strategy would fail with the
       get().  Ideally get() would be used with the hardcoded
       parameters but this would require further development.
       [ticket:1357]

     - MapperOptions and other state associated with query.options()
       is no longer bundled within callables associated with each
       lazy/deferred-loading attribute during a load.
       The options are now associated with the instance's
       state object just once when it's populated.  This removes
       the need in most cases for per-instance/attribute loader
       objects, improving load speed and memory overhead for
       individual instances. [ticket:1391]

     - Fixed another location where autoflush was interfering
       with session.merge().  autoflush is disabled completely
       for the duration of merge() now. [ticket:1360]

     - Fixed bug which prevented "mutable primary key" dependency
       logic from functioning properly on a one-to-one
       relation().  [ticket:1406]

     - Fixed bug in relation(), introduced in 0.5.3,
       whereby a self referential relation
       from a base class to a joined-table subclass would
       not configure correctly.

     - Fixed obscure mapper compilation issue when inheriting
       mappers are used which would result in un-initialized
       attributes.

     - Fixed documentation for session weak_identity_map -
       the default value is True, indicating a weak
       referencing map in use.

     - Fixed a unit of work issue whereby the foreign
       key attribute on an item contained within a collection
       owned by an object being deleted would not be set to
       None if the relation() was self-referential. [ticket:1376]

     - Fixed Query.update() and Query.delete() failures with eagerloaded
       relations. [ticket:1378]

     - It is now an error to specify both columns of a binary
       primaryjoin condition in the foreign_keys or remote_side
       collection.  Previously this was merely nonsensical, but would
       succeed in a non-deterministic way.

- schema
     - Added a quote_schema() method to the IdentifierPreparer class
       so that dialects can override how schemas get handled. This
       enables the MSSQL dialect to treat schemas as multipart
       identifiers, such as 'database.owner'. [ticket:594, 1341]

- sql
     - Back-ported the "compiler" extension from SQLA 0.6.  This
       is a standardized interface which allows the creation of custom
       ClauseElement subclasses and compilers.  In particular it's
       handy as an alternative to text() when you'd like to
       build a construct that has database-specific compilations.
       See the extension docs for details.

     - Exception messages are truncated when the list of bound
       parameters is larger than 10, preventing enormous
       multi-page exceptions from filling up screens and logfiles
       for large executemany() statements. [ticket:1413]

     - ``sqlalchemy.extract()`` is now dialect sensitive and can
       extract components of timestamps idiomatically across the
       supported databases, including SQLite.

     - Fixed __repr__() and other _get_colspec() methods on
       ForeignKey constructed from __clause_element__() style
       construct (i.e. declarative columns).  [ticket:1353]

- mysql
     - Reflecting a FOREIGN KEY construct will take into account
       a dotted schema.tablename combination, if the foreign key
       references a table in a remote schema. [ticket:1405]

- mssql
     - Modified how savepoint logic works to prevent it from
       stepping on non-savepoint oriented routines. Savepoint
       support is still very experimental.

     - Added reserved words for MSSQL covering version 2008
       and all prior versions. [ticket:1310]

     - Corrected problem with information schema not working with a
       binary collation based database. Cleaned up information schema
       since it is only used by mssql now. [ticket:1343]

- sqlite
     - Corrected the SLBoolean type so that it properly treats only 1
       as True. [ticket:1402]

     - Corrected the float type so that it correctly maps to a
       SLFloat type when being reflected. [ticket:1273]

- extensions

     - Fixed adding of deferred or other column properties to a
       declarative class. [ticket:1379]

     - Added "compiler" extension from 0.6
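
The exception message truncation listed under "sql" above can be
pictured roughly as follows. This is a sketch only: format_params() is
a hypothetical helper and the wording of the truncated message is an
assumption, not the text SQLAlchemy actually emits. The threshold of 10
is the one stated in the changelog entry.

```python
MAX_PARAMS_SHOWN = 10   # threshold taken from the changelog entry


def format_params(params, limit=MAX_PARAMS_SHOWN):
    """Hypothetical helper: render bound parameters for an error message."""
    params = list(params)
    if len(params) <= limit:
        # Short lists are shown in full.
        return repr(params)
    # Long lists (e.g. from executemany()) show a prefix and a total count.
    shown = ", ".join(repr(p) for p in params[:limit])
    return "[%s ... and a total of %d bound parameter sets]" % (
        shown, len(params))
```

A list of two parameter sets is rendered unchanged, while a list of
1000 is cut down to its first 10 entries plus the total count, which
keeps the message to a single line.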
