Re: [sqlalchemy] keeping long filter statements easily readable?

2012-05-28 Thread Simon King
On Sun, May 27, 2012 at 4:37 PM, Julien Lacroix ne...@aradriel.de wrote:
 I've stumbled across some of my old queries and had trouble reading my
 long chains of where clauses.
 The editor's word-wrap function only makes things worse. What's your advice
 for keeping long filter statements readable?
 One where clause per line would be great.

 e.g.:
 obj = dbObj.query(table).filter(table.foo1 == False AND include more filter
 statements here).order_by(table.foo2).all()


I tend to write my SA queries like this:

obj = (session.query(dbObj)
       .filter(dbObj.foo1 == False)
       .filter(sa.or_(dbObj.foo2 == 'abc',
                      dbObj.foo3 == 'def'))
       .order_by(dbObj.foo2)
       .all())

Remember multiple filter clauses are automatically AND-ed together.
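
For example, the two chained .filter() calls above are equivalent to a single
filter built with sa.and_() (same illustrative dbObj names as above):

obj = (session.query(dbObj)
       .filter(sa.and_(dbObj.foo1 == False,
                       sa.or_(dbObj.foo2 == 'abc',
                              dbObj.foo3 == 'def')))
       .order_by(dbObj.foo2)
       .all())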

Simon




Re: [sqlalchemy] Session management with mod_wsgi and webapp2

2012-05-28 Thread Simon King
On Mon, May 28, 2012 at 12:58 AM, Bradley Mclain
bradley.james.mcl...@gmail.com wrote:
 On Saturday, May 26, 2012 9:31:55 AM UTC+10, Michael Bayer wrote:


 On May 24, 2012, at 11:28 PM, Bradley Mclain wrote:

 Hi,

 I'm currently working on a project that uses SQLAlchemy with mod_wsgi,
 webapp2 and prestans (a Python REST framework). The backend is Oracle XE
 using cx_Oracle.

 What is the appropriate way to manage sessions in a multithreaded app
 such as this? Currently I have been following the guide found here to
 implement a contextual/thread-local session.


 scoped_session produces a session factory/proxy object which maintains a
 single Session per thread, and as the guide indicates you can allow this
 factory to automatically create a Session on first use, then at request end
 time you do an unconditional remove() so that the scoped_session closes the
 Session it may or may not be handling, and disposes of it.
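
A minimal sketch of that pattern (the connection URL and the Thing model are
placeholders, not from the original thread):

from sqlalchemy import create_engine
from sqlalchemy.orm import scoped_session, sessionmaker

engine = create_engine('oracle+cx_oracle://user:pass@localhost/xe')  # placeholder URL

# scoped_session maintains one Session per thread, created lazily on first use
Session = scoped_session(sessionmaker(bind=engine))

def handle_request():
    try:
        things = Session.query(Thing).all()  # Thing is a placeholder mapped class
        return render(things)                # placeholder response step
    finally:
        Session.remove()   # closes and discards this thread's Session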



 I have then made a session that is automatically cleaned up by the request
 handler. The code for this looks something like:

 def dispatch(self):
     self.db_session = db.Session()
     super(BaseHandler, self).dispatch()
     self.db_session.close()
     db.Session.remove()


 db.Session.remove() should be enough here, it calls close() for you.
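
One way to apply that to the handler above (the try/finally is an addition
beyond the original snippet, to guarantee cleanup even if dispatch() raises):

def dispatch(self):
    self.db_session = db.Session()
    try:
        super(BaseHandler, self).dispatch()
    finally:
        # remove() closes the thread-local Session and discards it,
        # so an explicit close() is no longer needed
        db.Session.remove()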


 However this doesn't seem to clean up all database connections as viewed
 in netstat



 The Session uses an Engine as a source of connections.   The Engine is
 introduced at http://docs.sqlalchemy.org/en/rel_0_7/core/engines.html .
  Intrinsic to its behavior is that it uses a small connection pool which by
 default holds onto five connections out of the total accumulated.  Full
 information on how this pooling is configured or disabled is
 at http://docs.sqlalchemy.org/en/rel_0_7/core/pooling.html.

 and I get a connections exhausted error from Oracle XE after the
 application has been in use for a while. The same thing happens when I use
 MySQL as well.


 one of the many options here is the pool_recycle option which places a
 time limit on the age of connections.
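
For example (the URL and the specific numbers are illustrative only):

from sqlalchemy import create_engine

engine = create_engine(
    'oracle+cx_oracle://user:pass@localhost/xe',  # placeholder URL
    pool_size=5,         # connections the pool keeps open
    max_overflow=10,     # additional connections allowed under load
    pool_recycle=3600,   # replace connections older than an hour
)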



 If I add a call to db.engine.dispose() after the handler has returned, this
 appears to fix the problem, but the SQLAlchemy documentation indicates that
 this is only intended for dropped connections or for test suites.


 I'm glad the documentation made clear that dispose() is not generally
 appropriate, as the Engine is a factory/registry for connections, not a
 connection itself.


 Thanks for confirming my apprehensions about using engine.dispose().

 So I gather it is just a matter of me tweaking the pool_recycle and
 pool_size options to something more appropriate for my application?

 Can you offer any guidance as to appropriate values for a threaded REST
 application? Could pool_size be quite small, say 2, and pool_recycle something
 like 2 minutes?



How many connections are alive when you get the connections
exhausted error? Is it more than pool_size + max_overflow (default
15)? Also, are you using multiple processes with mod_wsgi? If so,
remember that each process will have its own pool, so if you had 2
processes, you could have up to 30 connections open.

Simon




[sqlalchemy] Re: Lock table, do things to table, unlock table: Best way?

2012-05-28 Thread Jeff
The unique constraint sounds like a workable solution! I'll implement
it with a try/except and report back on whether it was effective. Thanks!

On May 28, 5:43 am, Simon King si...@simonking.org.uk wrote:
 On Sun, May 27, 2012 at 6:18 PM, Jeff jeffalst...@gmail.com wrote:
  Thanks,

  I have indeed spent a lot of time looking at SELECT FOR UPDATE, but as
  far as I can tell that locks rows that have been selected. That is not
  helpful in this use case, in which the issue is rows not existing, and
  then later existing. Am I misunderstanding?

  On May 27, 11:48 am, A.M. age...@themactionfaction.com wrote:
  On May 27, 2012, at 1:07 AM, Jeff wrote:

   I have multiple processes accessing a table. All of these processes
   want to read a set of rows from the table, and if the rows are not
   present they will make a calculation and insert the rows themselves.
   The issue comes when process A queries to see whether the target set
   of rows is present in the table, finds that it isn't, and then
   starts calculating. While it's calculating, process B inserts the
   rows. Then process A inserts the rows, and now we have two copies of
   that set of rows. Bad.

  You should look at SELECT FOR UPDATE.

 http://docs.sqlalchemy.org/en/rel_0_7/orm/query.html?highlight=lockmo...

  Cheers,
  M

 Could you put unique constraints on the table so that the second
 process will get an error when it tries to insert the duplicate rows?
 It won't prevent you from performing the calculations twice, but at
 least you won't get the duplicates.
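
 A sketch of what that might look like with the declarative extension (the
 table and column names are hypothetical, since the schema isn't shown in the
 thread):

 from sqlalchemy import Column, Float, Integer, UniqueConstraint
 from sqlalchemy.ext.declarative import declarative_base

 Base = declarative_base()

 class Result(Base):
     __tablename__ = 'results'           # hypothetical table of calculated rows
     id = Column(Integer, primary_key=True)
     param_a = Column(Integer, nullable=False)   # hypothetical key columns
     param_b = Column(Integer, nullable=False)
     value = Column(Float)

     # a duplicate (param_a, param_b) INSERT from the second process now
     # raises IntegrityError instead of silently creating a second copy
     __table_args__ = (UniqueConstraint('param_a', 'param_b'),)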

 Another option would be to write some sort of pending marker into
 the table, so that subsequent processes know that the result is
 already being calculated.

 Simon




[sqlalchemy] Multiple inserts with .append()'d associations

2012-05-28 Thread Jeff
There are Analyses, which have Events. There's an
Analysis_Event_Association table. Events are written before the
Analyses. Later, I have a for loop calculating multiple Analyses. For
each Analysis, we identify the Events it includes and append them,
using SQLAlchemy's relationship plumbing. This just writes the
appropriate row to the association table. It works great:

all_events = list(session.query(Event).filter().order_by(...).all())

for a in analyses_parameters:
    analysis = Analysis()
    events_in_this_analysis = all_events[...]
    analysis.events = events_in_this_analysis
    session.add(analysis)

session.commit()
session.close()
session.bind.dispose()

The issue is that I'm in a situation where there are many, many
processes writing to the same database, and the database only allows
us a limited number of connections. So I'm trying to have connections
open only when they're actually being used, hence the
session.bind.dispose().

Writing all the analyses is a single INSERT, and thus is efficient
with regard to keeping a connection open. HOWEVER, it appears that
each of the Event associations we appended is being inserted
individually, which is bad. The connection sits there, then is used,
then sits there, then is used, etc. This counts against the maximum
number of open connections, which is a limited resource. I'd like all the
appended Event associations to be inserted in one go. Is there a way
to do this? Am I correct about what's going on?




Re: [sqlalchemy] Re: Lock table, do things to table, unlock table: Best way?

2012-05-28 Thread Michael Bayer
An option to add alongside the unique constraint, if you expect to get 
collisions often, is to use a SAVEPOINT so that a process can roll back 
partially if this particular INSERT fails and then use the existing row.  The 
Session offers SAVEPOINT via begin_nested():

from sqlalchemy import exc

session.begin_nested()
try:
    session.add(thing_that_may_exist_already)
    session.commit()  # flushes, and commits only the savepoint
except exc.IntegrityError:
    session.rollback()
    thing_that_may_exist_already = (
        session.query(Thing).filter_by(criterion).one())  # criterion is a placeholder lookup

The difference between using locks to prevent concurrent duplicates and using 
constraints while expecting duplicate inserts to fail is known as pessimistic 
versus optimistic locking.


On May 28, 2012, at 10:38 AM, Jeff wrote:

 The unique constraint sounds like a workable solution! I'll implement
 it with a try/except and report back on whether it was effective. Thanks!
 
 




Re: [sqlalchemy] Multiple inserts with .append()'d associations

2012-05-28 Thread Michael Bayer

On May 28, 2012, at 10:52 AM, Jeff wrote:

 There are Analyses, which have Events. There's an
 Analysis_Event_Association table. Events are written before the
 Analyses. Later, I have a for loop calculating multiple Analyses. For
 each Analysis, we identify the Events it includes and append them,
 using SQLAlchemy's relationship plumbing. This just writes the
 appropriate row to the association table. It works great:
 
 all_events = list(session.query(Event).filter().order_by(...).all())
 
 for a in analyses_parameters:
     analysis = Analysis()
     events_in_this_analysis = all_events[...]
     analysis.events = events_in_this_analysis
     session.add(analysis)
 
 session.commit()
 session.close()
 session.bind.dispose()

 
 The issue is that I'm in a situation where there are many, many
 processes writing to the same database, and the database only allows
 us a limited number of connections. So I'm trying to have connections
 open only when they're actually being used, hence the
 session.bind.dispose().

engine.dispose() is an inefficient operation, as the Engine is a factory for 
connections, not a connection itself.   If you'd like there to be zero actual 
database connections open when the application is idle, you can disable pooling 
using NullPool. 
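
A minimal sketch (the URL is a placeholder):

from sqlalchemy import create_engine
from sqlalchemy.pool import NullPool

# NullPool opens a fresh DBAPI connection on each checkout and really closes
# it on release, so no connections remain open while the application is idle
engine = create_engine('oracle+cx_oracle://user:pass@localhost/xe',
                       poolclass=NullPool)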



 
 Writing all the analyses is a single INSERT, and thus is efficient
 with regard to keeping a connection open. HOWEVER, it appears that
 each of the Event associations we appended is being inserted
 individually, which is bad. The connection sits there, then is used,
 then sits there, then is used, etc. This counts against the maximum
 number of open connections, which is a limited resource.

Just to make sure we're on the same page, the Session procures a single DBAPI 
connection, and uses it for the lifespan of that transaction.   So the above 
code is using only one connection until COMMIT is called.   Unless there is a 
significant period of latency in the above loop, it would be way less efficient 
and also defeat transactional guarantees to use a new connection for each 
INSERT.   If your app is using MyISAM tables then the transactional guarantees 
are moot in any case, though.   But breaking a tight loop apart so that it 
reconnects for each record being inserted, in an effort to use fewer concurrent 
connections, is very unlikely to help: you're just adding more work and time to 
the operation.

 I'd like all the
 appended Event associations to be inserted in one go. Is there a way
 to do this? Am I correct about what's going on?

SQLAlchemy 0.7 will also perform the INSERT of all the above records as a 
single prepared statement if you assign a primary key value to the objects 
ahead of time; the ORM no longer needs to fetch the newly inserted primary key 
in this case, so you'd see a single statement sent to the DBAPI, and the DBAPI 
then executes the statement in a more efficient manner for each set of 
parameters (typically using prepared statement mechanics).   It still 
ultimately emits individual INSERT statements over the wire but just at a 
lower, more performant level on the Python side.
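
A sketch of that approach applied to the loop from the original post (it
assumes the Analysis primary key column is named id and that ids can be
allocated up front, e.g. from a sequence; both are assumptions, not shown in
the thread):

next_id = allocate_starting_id()        # hypothetical helper reserving an id range

for a in analyses_parameters:
    analysis = Analysis(id=next_id)     # primary key assigned before the flush
    next_id += 1
    analysis.events = all_events[...]   # placeholder slice, as in the original
    session.add(analysis)

# with primary keys already set, the ORM can hand the INSERTs to the DBAPI
# as a single executemany instead of one round trip per row
session.commit()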

Or, you can get the same effect by inserting the records using a non-ORM 
executemany, where in this case you wouldn't need to pre-assign primary key 
values:

Session.execute(Event.__table__.insert(),
                params=[{'a': evt.a, 'b': evt.b, ...} for evt in events])

The disadvantage to the above is that the Session doesn't know about these 
newly inserted Event objects.  You'd need to make sure you don't add equivalent 
records to the Session itself or they will conflict, and if you want to work 
with these Event objects in the Session immediately subsequent to the above 
operation you'd need to load them in (as that would be the only way to get at 
their newly generated primary key values).

