Michael Bayer wrote:
> On Dec 22, 2009, at 5:20 PM, Igor Katson wrote:
>
>>> 0.5 supports this using the "postgres_returning" argument to update() and
>>> insert(), but the ORM doesn't make use of it. In 0.6, the feature has been
>>> generalized using the returning() method on insert() and update(), and the
>>> ORM will use it to fetch new primary key identifiers.
>>>
>>> It does not take effect within the ORM for inline SQL other than primary
>>> key identifiers, but you can use an insert() with returning() manually to
>>> achieve that effect.
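The round trip that RETURNING saves can be sketched outside SQLAlchemy entirely; here is a minimal illustration using the stdlib sqlite3 module (the table and values are made up, and the RETURNING clause needs SQLite 3.35 or later):

```python
import sqlite3

# Hypothetical single-row table for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE obj (id INTEGER PRIMARY KEY, col INTEGER)")
conn.execute("INSERT INTO obj (id, col) VALUES (1, 10)")

# One statement both updates the row and hands back its new state,
# instead of an UPDATE followed by a separate SELECT round trip.
row = conn.execute(
    "UPDATE obj SET col = col + ? WHERE id = ? RETURNING id, col",
    (5, 1),
).fetchone()
print(row)  # -> (1, 15)
```

The same idea applies at the SQLAlchemy level, where the dialect emits the RETURNING clause for you.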
>>>
>> Why not, Michael? Is it hard to implement this? It seems pretty
>> useful for performance, reducing the number of SELECTs needed
>> when code is unaware of object expiry.
>
> because a) I don't have the time to implement conditional returning()
> awareness within the ORM and b) I don't think it will have as dramatic a
> performance impact as you expect. When I added returning() support for
> insert() statements by default, the number of function calls in our
> performance tests went up, due to the additional complexity. I have made the
> automatic-returning feature optional for all dialects since it's not really
> clear that performance is actually enhanced. So it's a race between Python
> method-call performance, which is abysmal, and the overhead of SQL round trips.
> Future enhancements like rewriting ResultProxy in C (on which some work has
> been done) could be game changers again.
>
> Other reasons performance might not increase much at all are:
>
> 1. A typical ORM application may create/modify a bunch of objects, then call
> commit(). By default everything is expired unconditionally in that case -
> RETURNING for non-primary-key columns is wasted overhead.
> 2. The "set an attribute to a SQL expression" use case is very rare. In most
> cases, deciding about returning() at the ORM level just adds complexity for
> no reason.
> 3. The use case of SQL-level defaults/onupdates on columns is there, but
> the typical use is for timestamps and such that usually aren't
> fetched in the same transaction as the one in which they are flushed -
> fetching them back just so they can be thrown away is again wasted overhead.
> 4. It's not clear overall that returning() is faster for all DBAPIs - in
> particular the mechanics of using RETURNING with Oracle and MSSQL are pretty
> complex.
>
>
>> P.S. And in SQLAlchemy 0.5, what's the most ORM-like way to solve
>> this? In my code it looks like this now, because I didn't find an easier
>> (fewer lines, less SQL-like) way to do it:
>> columns = ', '.join(str(col) for col in obj_table.c)
>> query = '''UPDATE %(table)s
>>     SET col = col + %(value)s
>>     WHERE id = %(id)s
>>     RETURNING %(columns)s''' % {
>>         'id': id,
>>         'columns': columns,
>>         'table': obj_table,
>>         'value': value,
>> }
>> obj = meta.Session.query(Obj).from_statement(query).first()
>
> As far as the rendering above, you should use the postgres_returning keyword
> in conjunction with an update() construct, instead of piecing together a
> string. It should be in the API documentation. The usage of query(Obj) to
> call it and get an object back from RETURNING is novel and I hadn't thought
> of that. You need to ensure that the rows you're matching are not already
> present in the session, or have been expired, else the new state will not be
> reflected in the objects.
>
> You also should take a measurement of performance enhancement with the above
> technique, vs. using query.update() or Session.execute(update()) and then
> re-selecting the rows back, or just allowing the natural expiration to result
> in individual SELECTs as needed.
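Such a measurement can be roughed out even without SQLAlchemy; a sketch with stdlib sqlite3 and time.perf_counter, timing UPDATE ... RETURNING against UPDATE followed by a separate SELECT (the table and iteration count are arbitrary, and RETURNING needs SQLite 3.35+):

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE obj (id INTEGER PRIMARY KEY, col INTEGER)")
conn.execute("INSERT INTO obj (id, col) VALUES (1, 0)")

N = 1000

# Variant 1: update and fetch new state in a single statement.
t0 = time.perf_counter()
for _ in range(N):
    returned = conn.execute(
        "UPDATE obj SET col = col + 1 WHERE id = 1 RETURNING col"
    ).fetchone()
t_returning = time.perf_counter() - t0

# Variant 2: update, then re-select the row in a second statement.
t0 = time.perf_counter()
for _ in range(N):
    conn.execute("UPDATE obj SET col = col + 1 WHERE id = 1")
    selected = conn.execute("SELECT col FROM obj WHERE id = 1").fetchone()
t_two_steps = time.perf_counter() - t0

# Both variants see the same data; the timings tell you whether the
# saved round trip actually matters for your workload.
print(returned, selected, t_returning, t_two_steps)
```

On a local in-memory database the difference may be negligible, which is exactly the point: only a measurement against your real database settles it.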
>
> If you truly have some very performance critical section where thousands of
> rows are being inserted/updated, the ORM is going to give you fairly mediocre
> performance with or without RETURNING, as it is designed to do lots of the
> thinking and effort for you in a very generic (read: unoptimized) way and
> hence is an expensive route to go no matter what. You'd be better off
> dropping into the usage of insert()/update()/delete() constructs and not
> attempting to load the new rows back into ORM objects, since that is
> an expensive process designed to make the production of source code easier.
> In this case your source code is being made more difficult to produce anyway
> so the ORM might not be worth it.
>
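A sketch of that Core-level route, assuming a hypothetical items table: a single insert() execution with a list of parameter dictionaries handles all the rows in one executemany pass, and no ORM objects are constructed at all:

```python
from sqlalchemy import (
    Column, Integer, MetaData, String, Table, create_engine, func, select,
)

# Hypothetical table; names are made up for the illustration.
engine = create_engine("sqlite://")
metadata = MetaData()
items = Table(
    "items",
    metadata,
    Column("id", Integer, primary_key=True),
    Column("name", String(50)),
)
metadata.create_all(engine)

with engine.begin() as conn:
    # One executemany round trip for all rows - no identity map,
    # no unit of work, no per-object bookkeeping.
    conn.execute(
        items.insert(),
        [{"name": "item-%d" % i} for i in range(1000)],
    )
    count = conn.execute(select(func.count()).select_from(items)).scalar_one()
print(count)  # -> 1000
```

If you later need a handful of the rows as ORM objects, a targeted query for just those rows is cheaper than materializing every inserted row.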
Thanks, Michael, it's clear now. I hadn't considered that Python's
overhead might be greater than the RDBMS's. After profiling my application
with repoze.profile and apache ab, I see that SQL takes only around 10%
of the request-serving time (it is extremely fast, serving 1000
requests in 1 second!); everything else is Python.
--
You received this message because you are subscribed to the Google Groups
"sqlalchemy" group.