Hi Ben,

I'm confused about the utility of this function.. Since it's possible to use external table with update statements, you certainly can do the whole update using a single request.
I prepared a little example that reproduces yours in pure SQL.

-- let's create a similar dataset:
create temporary table test (id integer, age integer, name text);
insert into test values (1, 10, 'Rose'), (42, 45, 'Bobby'), (23, 38, 'Sarrah');

-- Here are the values I want to update, in this example, null won't update the test table.
create temporary table source (id integer, age integer, name text);
insert into source values (42, null, 'Bob'), (23, 30, 'Sarah');

-- The "Bazinga" part:
update test t
set name = coalesce(s.name, t.name), -- a small workaround to keep the original value if the source is null age = coalesce(s.age, t.age) -- feel free to use any function you want, that works exactly the same way as usual.
  from source s
  where s.id = t.id;

-- Here it is:
select * from test order by id;

1;10;Rose
23;30;Sarah
42;50;Bob

-- Bazinga --

This method is the fastest I know, even on largest datasets where you might waste a lot of time and energy to execute millions and millions of requests, here you just update your whole dataset with one request and take profit of all the database optimisations (indexes etc.). With a modern dbms, you can even check the execution plan and get usefull tips on needed optimisations before you execute the nightmare. The only limitation I know is that you can't update more than 1 table at a time, but transactions give the ability to delay constraints checks at their end, so, 1 update per table seems legit.

I think not every DBMS can handle this type of update, but it's part of SQL standards. At least, the famous ones (postgres, mssql, oracle, mysql etc.) have these functionnalities, with more or less syntax variations.
I hope this will help.

Regards,

Christophe NINUCCI

Le 27/03/2013 21:41, Ben Hoyt a écrit :
Hi folks,

In our own utility file, I've added a function called multiple_update() that executes multiple separate update statements in one query. Note that this is not the same as a single db.update() that updates multiple rows -- with this function, the columns and vars for each row can be different.

In my testing (Windows 64-bit, db is PostgreSQL running on another machine) I've found this to be about 3-4x as fast as individual db.update()s wrapped in a transaction, so a worthwhile speed increase. The speed increase factor seems to be about constant regardless of the number of rows (initially I was hoping it go up with N, but it didn't in my tests.)

Any thoughts on the API signature, and whether this would be a good addition to web.DB? Any comments on this approach in general? Is the way I'm doing the cast to unicode and then '\n'.join(updates) the correct approach?

Couple of notes:

* If this were part of web.py, obviously it'd be a method on DB instead of a stand-alone function. * dbconn.db_connection is our "current database object". Again, this would go away if it were a DB method.

Code below:

-----
def multiple_update(table, where, values_vars, database=None, _test=False):
    """Execute multiple separate update statements in one query.

    >>> database = web.DB(None, {})
    >>> row1 = ({'name': 'Bob'}, {'id': 42})
    >>> row2 = ({'name': 'Sarah', 'age': 30}, {'id': 23})
>>> print multiple_update('foo', 'id = $id', [row1, row2], _test=True, database=database)
    UPDATE foo SET name = 'Bob' WHERE id = 42;
    UPDATE foo SET age = 30, name = 'Sarah' WHERE id = 23;
>>> print multiple_update('foo', 'id = $id', [], _test=True, database=database)
    None
    """
    if not values_vars:
        return
    if database is None:
        database = dbconn.db_connection

    updates = []
    for values, vars in values_vars:
        update = u'UPDATE {table} SET {values} WHERE {where};'.format(
                table=table,
                values=web.db.sqlwhere(values, ', '),
                where=database._where(where, vars))
        updates.append(unicode(update))

    query = '\n'.join(updates)
    if _test:
        return query
    database.query(query)
-----

Thanks,
Ben.
--
You received this message because you are subscribed to the Google Groups "web.py" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/webpy?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.



--
You received this message because you are subscribed to the Google Groups 
"web.py" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/webpy?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.


Reply via email to