On Fri 22 May 2009 05:44:24 AM EDT, Oleg Broytmann wrote:
> On Fri, May 22, 2009 at 01:11:51AM -0400, Stef Telford wrote:
>> SQLObject development should -really- focus on is speed.
>
>    There are many areas to improve SQLObject - code, tests, documentation,
> development process, communication skills of the developers. Let's add
> speed to the list.
>    But well, it is open source, developers scratch their itches, and speed
> is not our biggest itch.
>    SQLObject certainly isn't optimized for mass insertion, this is a well
> known fact. SQLObject does a lot of thing behind the scene. For mass
> insertion it is better to use SQLBuilder lower-level API (underdocumented,
> yes).
>    You are welcome to start a project to improve SQLObject. Or at least you
> can help by explaining how SA optimizes mass insertion. On an INSERT
> SQLObject does:
>    -- creates an instance;
>    -- sends INSERT query;
>    -- sends SELECT query to get the row back - this is necessary to get
>       back autogenerated fields - autoincremented IDs, timestamps, system
>       variables and such; this is what slows down SQLObject so much;
>    -- caches the row in its internal cache; this is what makes SQLObject to
>       eat memory; SQLObject periodically clears the cache thus spending
>       even more time.
> What SA does and does not do?

The original comparison isn't fair.  There are two differences between
the SQLAlchemy (SA) approach and the SQLObject (SO) approach.

1.  The SA code does one single insert statement and the SQLObject code
    does a separate query for each object.

2.  The SO code instantiates a class after inserting each row.  The SA
    code doesn't instantiate anything.

It is possible to do get SO to act like SA, but like Oleg says, it is
underdocumented.

Here's how I do a big insert with SO:

    def bulky_insert(cls, insert_these):
        """
        Does a single insert, cramming everything from insert_these,
        which must be a list of dictionaries, and each dictionary must
        be like:

            d = dict(shift_template_id=sr.shift_templateID,
                shift_time_id=new_st.id, 
                shift_type_id=sr.shift_typeID,
                location_id=sr.locationID,
                build_id=bld.id,
                required_staff=sr.required_staff)
        """

        cls._connection.query(cls.sqlrepr(
            Insert(cls.sqlmeta.table, insert_these)))

The cls is a subclass of the typical SQLObject class.  I build a list of
dictionaries, where each dictionary has a key named with the column name
and the value is (duhh...) the value for that column name.

This does a single big insert and doesn't return anything.  No classes
are instantiated.  This is *really* fast.

I find myself using SQLBuilder tools *a lot*.  Maybe I should read the
code and see if I can help at least write some documentation for others,
because I've had to figure out most of this stuff through trial and
error.

Matt


------------------------------------------------------------------------------
Register Now for Creativity and Technology (CaT), June 3rd, NYC. CaT
is a gathering of tech-side developers & brand creativity professionals. Meet
the minds behind Google Creative Lab, Visual Complexity, Processing, & 
iPhoneDevCamp asthey present alongside digital heavyweights like Barbarian
Group, R/GA, & Big Spaceship. http://www.creativitycat.com 
_______________________________________________
sqlobject-discuss mailing list
sqlobject-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/sqlobject-discuss

Reply via email to