[sqlalchemy] Re: Slightly confusing error when session set up incorrectly

Ben Sizer Wed, 29 Jun 2011 09:25:01 -0700

On Jun 29, 6:46 am, Michael Bayer <[email protected]> wrote:
> On Jun 28, 2011, at 9:34 PM, Ben Sizer wrote:
>
> > What does the "None None" signify? Would it be possible to change this
> > exception to be a bit more descriptive and a little less cryptic? eg.
> > Include the URI that failed?
>
> That error is raised by SQLite and we just propagate it out.


Ok, that's a bit annoying, but it does sound like something that is
not your problem to solve. Maybe if I get time I'll contact the
pysqlite maintainers as I think it would be useful to have a pseudo-
statement in there at least.

> > Additionally, I notice that sqlalchemy doesn't attempt to make an
> > actual connection to the database until you perform the first query.
>
> That is true, all connections/transactions are lazy initializing.    Sorry 
> this was a surprise, though I don't think this is much of an issue once you 
> get used to dealing with lazy initializing objects.
>
> The formal pattern at play with SQLAlchemy's connection pool, Engine and 
> Session behavior is called the Proxy Pattern, a decent description is at 
> http://sourcemaking.com/design_patterns/proxy.

Yeah, that's all fine. I don't have a problem with this choice, just
that it's not immediately obvious when following the docs.

> the lazy initializing behavior of the Session is documented:
>
> http://www.sqlalchemy.org/docs/orm/session.html#unitofwork-transaction

Not exactly. It says it maintains a connection for each SQL statement,
which is not logically equivalent to saying there is no connection
unless there has been an SQL statement. There is a similar implication
in the 'What does the Session do?' paragraph also.

> But that doesn't have anything to do with the Engine, which is its own thing. 
>   As far as the Engine, the docs currently use the term "connect" in 
> conjunction with create_engine() which is for simplicities' sake, but is 
> technically inaccurate, perhaps come up with some term other than "connect", 
> "configure a connection source" perhaps.   We can add an explicit sentence to 
> the top ofhttp://www.sqlalchemy.org/docs/core/connections.html.

I must admit I don't fully understand engines, connections, sessions,
etc. I will study the docs further! However when writing manual DB
code I would usually have a step where I initialise the driver (which
I had been thinking of as the create_engine stage), a step where I can
issue statements (which I think of as the session stage), and a point
temporally between those where I explicitly connect to the physical
storage, which I had thought was automatically taking place when
creating the engine but I now realise only happens on demand. It's not
a problem as such, just a surprise.

> > I expect I am not unusual
> > in wanting an error in session configuration to fail as soon as
> > possible,
>
> Nobody has ever raised this issue before to my recollection.

The problem with show-stopper bugs is that people often just give up
and switch to something else rather than tell you about it. ;)

(Of course in this case, the bug is mine, but developers do like to
blame their tools...)

> I'm not sure lots of users are phased whether the stack trace starts at the 
> Session.configure() line or if it starts later as soon as their first unit 
> test tries to hit the database - they get the same error, see that the 
> connection URL is bad, and fix it.

But this is exactly the problem: I had one unit test for setting up
the database, and one for actually using the database. The first one
passed - the second one failed. So I immediately think, "my usage of
the database is wrong", not "I set up the database connection
wrongly", because all the code for setting up the connection was
covered the test that passed, too. That is what the unit tests were
there for, after all - to separate out where a problem appears to be
coming from, by splitting your code coverage across tests.

> I suppose inefficiency is the only issue but its so unnecessarily 
> inefficient, most people would consider it to be wrong behavior.    A Session 
> may be configured such that depending on what's requested of it, it can 
> connect to any number of different engines - connecting to 
> several/dozens/hundreds of engines unconditionally upon construction is not 
> an option.

Yeah, that's fine. I just think it would be good to have had something
about this documented right in at the top of the Session docs so that
anyone starting out or writing unit tests for this sort of thing knows
that they're not actually testing the db connection just because they
made an engine and Session without error. Anywhere you pass in invalid
data, you hope that the system catches that as soon as possible so
that the problem's narrowed down, ideally to the exact statement that
is incorrect. In this case, it's not efficient to do so, therefore it
would be great for the docs to explicitly highlight that fact.

> Feel free to suggest what verbiage you're looking for, it's not hitting me 
> strongly what the confusion is - the application failed to connect, you get 
> an error describing the problem, the stack trace shows it happened upon 
> connect() inside the pool.     An application should always have unit tests 
> which will reveal basic issues like this immediately.

As noted above, the nature of the lazy evaluation is such that you end
up not testing the functionality that you think you're testing, which
combined with the rather vague exception (not your fault, I know),
makes this a bit confusing to debug. I'm really not sure of the
precise wording I'd pick: there's a lot of stuff in "What does the
Session do?" and other parts of the Session docs which -hint- at there
being no connection until a query is issued, but which don't
explicitly spell that out.


> > (In fact, this part
> > of the docs is a bit confusing in general - there seem to be a fair
> > few permutations of how to approach it - pass arguments to
> > sessionmaker? or to Session.configure? or to Session's constructor? -
> > and it's not clear why they all have to exist.)
>
> They exist because the intended usage is that sessionmaker() be placed as a 
> global variable in the module space of an application - it is then often the 
> case that the actual Engine does not exist at this point, so cannot be passed 
> to the sessionmaker().   A later configuration step can then call configure() 
> to add new Engine configurations.    This usage is driven by the needs of web 
> frameworks and such.

Ok, so:
    Session = sessionmaker(bind=some_engine) is for when you know
which engine you need at import time;
    Session.configure(bind=engine) is when you have decided on the
engine and want all subsequent sessions to use that engine, without
needing to keep a reference to that engine;
    session = Session(bind=engine) is when you want to be able to
specify the engine on a per-session basis.

Is that right? It seems to me like the middle one isn't strictly
necessary as you could just keep a reference to the engine yourself
rather than injecting it into the Session, but I guess that's just a
personal preference.

I can see why you'd call sessionmaker() at file scope to ensure it's
done at import time, but then why isn't this part just rolled into the
sqlalchemy package directly? Can't it just expose Session to me and
let me call .configure on it? Maybe that's exactly what it means when
the docs say, "Session is a regular Python class which can be directly
instantiated.", which is great, but that means there would be 4 ways
of approaching this, which seems a little much. It would be great if
the Session docs gave a short sentence on the rationale behind each of
these use cases so that a new user can know the implications of the
one he or she picks. At the moment it's sort of assumed that you will
know why you want one or the other, without the differences being
clear.

> The Session constructor itself is not mentioned there - when you see it say 
> Session(), that's an invocation of the sessionmaker() function.

The docs say, "the sessionmaker() call creates a class for us, which
we assign to the name Session". So when you call "session =
Session(bind=conn)", you're invoking the Session constructor/__init__
method, surely? (I appreciate that the Session object here might not
be the same as sqlalchemy.orm.session.Session, whatever that is, but
from the user's perspective, there is a class called Session which
we're instantiating.)

Thanks for your help so far.

--
Ben Sizer

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en.

[sqlalchemy] Re: Slightly confusing error when session set up incorrectly

Reply via email to