Re: [ZODB-Dev] RelStorage now in Subversion

2008-10-18 Thread Jim Fulton

On Jan 31, 2008, at 9:14 AM, Jim Fulton wrote:


 On Jan 31, 2008, at 3:08 AM, Shane Hathaway wrote:

 Jim Fulton wrote:
 On Jan 27, 2008, at 3:11 AM, Shane Hathaway wrote:
 I hope the patch, or a modified version of the patch, will be  
 accepted for inclusion in ZODB 3.9.  A monkey patch version is  
 possible, but I'm trying to avoid that.
 I'm tentatively planning on including it in 3.9.  I need to review  
 it carefully. I also plan to make other changes in the  
 invalidation mechanisms, so I'd like to take this into account at  
 the same time.

 I'm really glad to hear that.  I tried to make the patch as simple  
 as I could.  Here is an overview of the changes the patch makes:

 I can read patches. :) I've already reviewed the patch and  
 understand it.

Well, I understand what it does ...

For me to make progress on this, I'm going to need interfaces and  
tests.  I don't think I can write these, in part, because there isn't  
any explicit contract that the current patch implements/leverages.

I'd like to see an interface that describes the extra/alternate  
interface that relstorage exposes to ZODB.  I'd also like to see some  
tests that verify that ZODB is exercising the interface correctly.  It  
would probably be best to start with the interface.  I'll likely want  
to iterate on that first before we start working on tests.

Jim

--
Jim Fulton
Zope Corporation


___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] RelStorage now in Subversion

2008-02-05 Thread Dieter Maurer
Hello Shane,

Shane Hathaway wrote at 2008-2-3 23:57 -0700:
 ...
Looking into this more, I believe I found the semantic we need in the 
PostgreSQL reference for the LOCK statement [1].  It says this about 
obtaining a share lock in read committed mode: once you obtain the 
lock, there are no uncommitted writes outstanding.  My understanding of 
that statement and the rest of the paragraph suggests the following 
guarantee: in read committed mode, once a reader obtains a share lock on 
a table, it sees the effect of all previous transactions on that table.

I have been too pessimistic with respect to Postgres.

While Postgres uses the freedom of the ANSI isolation level definitions
(they say that some things must not happen but do not prescribe that
other things must necessarily happen), Postgres has a precise
specification for the read committed mode -- it says: in read
committed mode, each query sees the state as it has been when
the query started. This implies that it sees all transactions
that have been committed before the query started. This is sufficient
for your conflict resolution to be correct -- as you hold the commit
lock during conflict resolution such that no new transaction can happen
during the query in question.



-- 
Dieter


Re: [ZODB-Dev] RelStorage now in Subversion

2008-02-03 Thread Dieter Maurer
Meanwhile I have carefully studied your implementation.

There is only a single point I am not certain about:

  As I understand isolation levels, they guarantee that some bad
  things will not happen, but not that all non-bad things will happen.

  For read committed this means: it guarantees that I will
  only see committed transactions, but not necessarily that I will see
  the effect of a transaction as soon as it is committed.

  Your conflict resolution requires that it sees a transaction as
  soon as it is committed.

  The supported relational databases may have this property -- but
  I expect we do not have a written guarantee that this will definitely
  be the case.

I plan to make a test which tries to provoke a conflict resolution
failure -- and gives me confidence that the read committed mode of
Postgres really has the required property.



-- 
Dieter


Re: [ZODB-Dev] RelStorage now in Subversion

2008-02-03 Thread Shane Hathaway

Dieter Maurer wrote:

  Your conflict resolution requires that it sees a transaction as
  soon as it is committed.

  The supported relational databases may have this property -- but
  I expect we do not have a written guarantee that this will definitely
  be the case.

I plan to make a test which tries to provoke a conflict resolution
failure -- and gives me confidence that the read committed mode of
Postgres really has the required property.


You have a point.  I would be interested in that test as well.

Shane



Re: [ZODB-Dev] RelStorage now in Subversion

2008-02-03 Thread Shane Hathaway

Dieter Maurer wrote:

  For read committed this means: it guarantees that I will
  only see committed transactions, but not necessarily that I will see
  the effect of a transaction as soon as it is committed.

  Your conflict resolution requires that it sees a transaction as
  soon as it is committed.


Looking into this more, I believe I found the semantic we need in the 
PostgreSQL reference for the LOCK statement [1].  It says this about 
obtaining a share lock in read committed mode: once you obtain the 
lock, there are no uncommitted writes outstanding.  My understanding of 
that statement and the rest of the paragraph suggests the following 
guarantee: in read committed mode, once a reader obtains a share lock on 
a table, it sees the effect of all previous transactions on that table.


In RelStorage, all conflict detection and resolution already happens 
under the protection of an exclusive lock on the commit_lock table. 
However, the table we're using for conflict detection is current_object, 
not commit_lock, so we are not yet fulfilling the conditions of the 
above-mentioned guarantee.  We could be relying on undocumented 
behavior.  It's quite conceivable that Postgres might aggressively 
release locks at transaction commit, then allow the data updates to flow 
lazily to other sessions until a share lock is acquired.


To correct this, it appears we only need to add LOCK current_object IN 
SHARE MODE before the conflict detection and resolution code.  Do you 
agree that will plug the hole?
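
A minimal sketch of the proposed lock ordering, for illustration only. The
table names commit_lock and current_object come from the message above; the
exact lock mode taken on commit_lock, the function name and the recording
cursor are assumptions, and no real database is contacted.

```python
class RecordingCursor:
    """Stand-in for a DB-API cursor; it only records the SQL it is given."""

    def __init__(self):
        self.statements = []

    def execute(self, sql):
        self.statements.append(sql)


def lock_for_conflict_detection(cursor):
    """Hypothetical helper: take both locks before checking for conflicts."""
    # Serialize committers first (assumed mode; the message only says
    # "an exclusive lock on the commit_lock table").
    cursor.execute("LOCK TABLE commit_lock IN EXCLUSIVE MODE")
    # Proposed addition: a share lock on the table that conflict
    # detection actually reads.
    cursor.execute("LOCK TABLE current_object IN SHARE MODE")


cursor = RecordingCursor()
lock_for_conflict_detection(cursor)
```

With a real driver the same two statements would be passed to an actual
cursor inside the storing transaction, so both locks are released when that
transaction ends.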


Your diligence is much appreciated.

Shane

[1] http://www.postgresql.org/docs/8.1/interactive/sql-lock.html



Re: [ZODB-Dev] RelStorage now in Subversion

2008-02-02 Thread Shane Hathaway

Dieter Maurer wrote:

Unless you begin a new transaction on your load connection after
the write connection was committed,
your load connection will not see the data written over
your write connection.


Good point.  After a commit, we *must* poll.


This implies, the read connection must start a new transaction
at least after a ConflictError has occurred. Otherwise, the
ConflictError cannot go away.


Also a good point.  All these details will come into play if I attempt 
to poll less often.



What I fear is described by the following scenario:

   You start a transaction on your load connection L.
   L will see the world as it has been at the start of this transaction.

   Another transaction M modifies object o.

   L reads o, o is modified and committed.
   As L has used o's state before M's modification,
   the commit will try to write stale data.
   Hopefully, something lets the commit fail -- otherwise,
   we have lost a modification.


Yes, RelStorage uses standard ZODB conflict detection: all object 
changes must be derived from the most current state of the object in the 
database.  If any object has been changed by later transactions, 
conflict resolution is attempted, and if that fails, the transaction fails.
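
A minimal sketch of that rule, not RelStorage's actual code. The dictionary
standing in for the database, the store() function and the two-argument
resolve callback are all hypothetical simplifications (real ZODB conflict
resolution works on pickled states and also receives the old state).

```python
class ConflictError(Exception):
    """Local stand-in for ZODB's ConflictError."""


def store(db, oid, read_serial, new_state, new_serial, resolve=None):
    """Store new_state for oid; db maps oid -> (serial, state)."""
    committed_serial, committed_state = db[oid]
    if committed_serial != read_serial:
        # A later transaction changed the object after we read it, so
        # new_state was derived from stale data.
        if resolve is None:
            raise ConflictError(oid)
        new_state = resolve(committed_state, new_state)
        if new_state is None:
            raise ConflictError(oid)
    db[oid] = (new_serial, new_state)
    return new_state


# Dieter's scenario: L read object "o" at serial 1, then M committed.
db = {"o": (1, 10)}
store(db, "o", read_serial=1, new_state=11, new_serial=2)  # transaction M
try:
    store(db, "o", read_serial=1, new_state=12, new_serial=3)  # stale L
except ConflictError:
    outcome = "conflict"
else:
    outcome = "stored"
```

Here the stale write from L is rejected rather than silently overwriting
M's change, which is the property the scenario above asks for.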



I noticed another potential problem:

  When more than a single storage is involved, transactional
  consistency between these storages requires a true two phase
  commit.

  Postgres has only recently started to support two-phase commit (2PC), but
  as far as I know Python access libraries do not yet support the
  extended API (a few days ago, there was a discussion on
  [EMAIL PROTECTED] about a DB-API extension for two-phase commit).

  Unless you use your own binding to the Postgres 2PC API, RelStorage
  seems only safe for single storage use.


Actually, RelStorage inherited two phase commit support from PGStorage. 
 The 2PC API is accessible through psycopg2 if you simply issue the 
transaction control statements yourself.
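
A minimal sketch of what issuing the statements yourself could look like.
PREPARE TRANSACTION, COMMIT PREPARED and ROLLBACK PREPARED are PostgreSQL
statements; the helper name and the transaction identifier are hypothetical,
and the sketch only builds the SQL strings rather than talking to a server.

```python
def two_phase_statements(gid, commit=True):
    """Return the SQL for both phases of one prepared transaction."""
    if "'" in gid:
        # Keep the sketch simple: refuse identifiers that need escaping.
        raise ValueError("gid must not contain a quote")
    # Phase one: make the transaction durable without committing it.
    first = "PREPARE TRANSACTION '%s'" % gid
    # Phase two: finish or abandon the prepared transaction.
    verb = "COMMIT" if commit else "ROLLBACK"
    second = "%s PREPARED '%s'" % (verb, gid)
    return [first, second]


statements = two_phase_statements("example-txn")
```

Note that PostgreSQL does not allow COMMIT PREPARED or ROLLBACK PREPARED
inside a transaction block, so with a driver that opens transactions
implicitly the second statement has to be sent in autocommit mode.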


Shane



Re: [ZODB-Dev] RelStorage now in Subversion

2008-02-01 Thread Dieter Maurer
Hello Shane,

Shane Hathaway wrote at 2008-1-31 13:45 -0700:
 ...
No, RelStorage doesn't work like that either.  RelStorage opens a second
database connection when it needs to store data.  The store connection
will commit at the right time, regardless of the polling strategy.  The
load connection is already left open between connections; I'm only
talking about allowing the load connection to keep an idle transaction.
 I see nothing wrong with that, other than being a little surprising.

That looks very troublesome.

Unless you begin a new transaction on your load connection after
the write connection was committed,
your load connection will not see the data written over
your write connection.

  and you read older and older data
 which must increase serializability problems

I'm not sure what you're concerned about here.  If a storage instance
hasn't polled in a while, it should poll before loading anything.

Even if it has polled not too far in the past, it should
repoll when the storage is joined to a Zope request processing
(in Connection._setDB):
If it does not, then it may start work with an already outdated
state -- which can have adverse effects when the request bases modifications
on this outdated state.
If everything works fine, then a ConflictError results later
during the commit.

This implies, the read connection must start a new transaction
at least after a ConflictError has occurred. Otherwise, the
ConflictError cannot go away.

 
 (Postgres might
 not guarantee serializability even when the so-called isolation
 level is chosen; in this case, you may not see the problems
 directly but nevertheless they are there).

If that is true then RelStorage on PostgreSQL is already a failed
proposition.  If PostgreSQL ever breaks consistency by exposing later
updates to a load connection, even in the serializable isolation mode,
ZODB will lose consistency.  However, I think that fear is unfounded.
If PostgreSQL were a less stable database then I would be more concerned.

I do not expect that Postgres will expose later updates to the load
connection.

What I fear is described by the following scenario:

   You start a transaction on your load connection L.
   L will see the world as it has been at the start of this transaction.

   Another transaction M modifies object o.

   L reads o, o is modified and committed.
   As L has used o's state before M's modification,
   the commit will try to write stale data.
   Hopefully, something lets the commit fail -- otherwise,
   we have lost a modification.

If something causes a commit failure, then the probability of such
failures increases with the outdatedness of L's reads.

 ...
RelStorage only uses the serializable isolation level for loading, not
for storing.  A big commit lock prevents database-level conflicts while
storing.  RelStorage performs ZODB-level conflict resolution, but only
while the commit lock is held, so I don't yet see any opportunity for
consistency to be broken.  (Now I imagine you'll complain the commit
lock prevents scaling, but it uses the same design as ZEO, and that
seems to scale fine.)

Side note:

  We currently face problems with ZEO's commit lock: we have 24 clients
  that produce about 10 transactions per second. We observe
  occasional commit contention lasting a few minutes.

  We already have found several things that contribute to this problem --
  slow operations on clients while the commit lock is held on ZEO:
  Python garbage collections, invalidation processing, stupid
  application code.
  But there are still some mysteries and we do not yet have
  a good solution.

 
I noticed another potential problem:

  When more than a single storage is involved, transactional
  consistency between these storages requires a true two phase
  commit.

  Postgres has only recently started to support two-phase commit (2PC), but
  as far as I know Python access libraries do not yet support the
  extended API (a few days ago, there was a discussion on
  [EMAIL PROTECTED] about a DB-API extension for two-phase commit).

  Unless you use your own binding to the Postgres 2PC API, RelStorage
  seems only safe for single storage use.


-- 
Dieter


Re: [ZODB-Dev] RelStorage now in Subversion

2008-01-31 Thread Shane Hathaway

Jim Fulton wrote:


On Jan 27, 2008, at 3:11 AM, Shane Hathaway wrote:
I hope the patch, or a modified version of the patch, will be accepted 
for inclusion in ZODB 3.9.  A monkey patch version is possible, but 
I'm trying to avoid that.



I'm tentatively planning on including it in 3.9.  I need to review it 
carefully. I also plan to make other changes in the invalidation 
mechanisms, so I'd like to take this into account at the same time.


I'm really glad to hear that.  I tried to make the patch as simple as I 
could.  Here is an overview of the changes the patch makes:


* Connection.__init__() looks for a method of the storage called 
bind_connection.  If it exists, the Connection calls bind_connection() 
and gets back a Connection-specific storage object, which the Connection 
should use in place of the original storage.  This way, each storage 
instance can hold Connection-specific state.


* If the storage has an attribute called propagate_invalidations set to 
a false value, Connection.invalidate() no longer accepts any 
invalidation messages.  The storage.poll_invalidations() method becomes 
the sole source of invalidation information.


* Connection.close() calls storage.connection_closing() if the method 
exists.  This allows the storage to release resources while the 
connection is not in use.


* Connection._flush_invalidations() calls storage.poll_invalidations() 
if the method exists.  poll_invalidations() returns either a sequence of 
OIDs the Connection needs to invalidate, or the value None, which means 
the entire cache needs to be invalidated (which can happen if a 
connection has not been used in a long time).


* The DB class also calls connection_closing() after initialization, to 
free storage resources used during initialization.
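
The hooks listed above can be sketched roughly as follows. This is an
illustration under stated assumptions, not the patch itself: the hook names
bind_connection, propagate_invalidations, connection_closing and
poll_invalidations come from the overview, while SketchStorage,
SketchConnection, the in-memory log and max_backlog are hypothetical.

```python
class SketchStorage:
    """Hypothetical storage exposing the optional hooks from the overview."""

    propagate_invalidations = False  # poll_invalidations() is the only source
    max_backlog = 100  # assumed limit before the whole cache is dropped

    def __init__(self, log=None):
        # `log` stands in for the database: one list of OIDs per transaction.
        self.log = [] if log is None else log
        self.polled = len(self.log)  # transactions already seen
        self.released = False

    def bind_connection(self, connection):
        # Give each Connection its own instance sharing the same log.
        return SketchStorage(self.log)

    def connection_closing(self):
        # Free per-connection resources while the Connection is not in use.
        self.released = True

    def poll_invalidations(self):
        new = self.log[self.polled:]
        self.polled = len(self.log)
        if len(new) > self.max_backlog:
            return None  # too far behind: invalidate everything
        return sorted({oid for oids in new for oid in oids})


class SketchConnection:
    """Hypothetical stand-in for ZODB.Connection showing where hooks fire."""

    def __init__(self, storage):
        bind = getattr(storage, "bind_connection", None)
        self.storage = bind(self) if bind is not None else storage
        self.cache = {}

    def invalidate(self, oids):
        # Pushed invalidations are ignored when the storage opts out.
        if not getattr(self.storage, "propagate_invalidations", True):
            return
        for oid in oids:
            self.cache.pop(oid, None)

    def flush_invalidations(self):
        poll = getattr(self.storage, "poll_invalidations", None)
        if poll is None:
            return
        oids = poll()
        if oids is None:
            self.cache.clear()
        else:
            for oid in oids:
                self.cache.pop(oid, None)

    def close(self):
        closing = getattr(self.storage, "connection_closing", None)
        if closing is not None:
            closing()


shared = SketchStorage()
conn = SketchConnection(shared)
conn.cache.update({"a": 1, "b": 2})
shared.log.append(["a"])      # another connection commits a change to "a"
conn.invalidate(["b"])        # ignored: the storage polls instead
conn.flush_invalidations()    # drops "a" from the cache
conn.close()
```

Using getattr() for every hook keeps storages without these methods working
unchanged, which matches the "if the method exists" wording of the overview.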


I admit that polling for invalidations probably limits scalability, but 
I have not yet found a better way to match ZODB with relational 
databases.  Polling in both PostgreSQL and Oracle appears to cause no 
delays right now, but if the polling becomes a problem, within 
RelStorage I can probably find ways to reduce the impact of polling, 
such as limiting the polling frequency.


Shane



Re: [ZODB-Dev] RelStorage now in Subversion

2008-01-31 Thread Jim Fulton


On Jan 31, 2008, at 3:08 AM, Shane Hathaway wrote:


Jim Fulton wrote:

On Jan 27, 2008, at 3:11 AM, Shane Hathaway wrote:
I hope the patch, or a modified version of the patch, will be  
accepted for inclusion in ZODB 3.9.  A monkey patch version is  
possible, but I'm trying to avoid that.
I'm tentatively planning on including it in 3.9.  I need to review  
it carefully. I also plan to make other changes in the invalidation  
mechanisms, so I'd like to take this into account at the same time.


I'm really glad to hear that.  I tried to make the patch as simple  
as I could.  Here is an overview of the changes the patch makes:


I can read patches. :) I've already reviewed the patch and understand  
it. I want to think a bit harder about the consistency/timeliness  
impacts of polling rather than notification.  I may also implement  
this differently. For example, I *may* define multiple storage  
interfaces and do some sort of adapter dance to assemble things  
differently depending on storage capabilities.  <shrug/>


Jim

--
Jim Fulton
Zope Corporation




Re: [ZODB-Dev] RelStorage now in Subversion

2008-01-31 Thread Dieter Maurer
Shane Hathaway wrote at 2008-1-31 01:08 -0700:
 ...
I admit that polling for invalidations probably limits scalability, but 
I have not yet found a better way to match ZODB with relational 
databases.  Polling in both PostgreSQL and Oracle appears to cause no 
delays right now, but if the polling becomes a problem, within 
RelStorage I can probably find ways to reduce the impact of polling, 
such as limiting the polling frequency.

I am surprised that you think you can play with the polling
frequency.

  Postgres will deliver objects as they were when the
  transaction started.
  Therefore, when you start a Postgres transaction
  you must invalidate any object in your cache that
  has been modified between load time and the beginning of this
  transaction. Otherwise, your cache can deliver stale state
  that does not fit with the objects loaded directly from Postgres.

  I read this as meaning you do not have much room for maneuver.
  You must ask Postgres about invalidations when the transaction
  starts.

  Of course, you can in addition ask Postgres periodically
  in order to have a smaller and (hopefully) faster result
  when the transaction starts.
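
A minimal sketch of that rule, with everything in it hypothetical: the
SnapshotCache class, its begin() method and the integer transaction ids are
illustrative stand-ins, and the `changed` mapping represents whatever the
poll against Postgres reports at transaction start.

```python
class SnapshotCache:
    """Hypothetical object cache keyed by OID."""

    def __init__(self):
        # oid -> (tid the state was loaded at, state)
        self.entries = {}

    def begin(self, changed):
        """Drop entries that are stale for the new database snapshot.

        `changed` maps oid -> tid of the latest committed change that the
        new snapshot can see.
        """
        for oid, changed_tid in changed.items():
            entry = self.entries.get(oid)
            if entry is not None and entry[0] < changed_tid:
                # Modified between load time and the start of this
                # transaction: the cached state no longer fits.
                del self.entries[oid]


cache = SnapshotCache()
cache.entries = {"a": (5, "old a"), "b": (5, "b")}
cache.begin({"a": 7, "c": 8})  # "a" changed at tid 7, "c" was never cached
```

Polling more often in between only shrinks the `changed` mapping; the check
at transaction start is still required.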



-- 
Dieter


Re: [ZODB-Dev] RelStorage now in Subversion

2008-01-31 Thread Shane Hathaway
Dieter Maurer wrote:
 I am surprised that you think you can play with the polling
 frequency.
 
   Postgres will deliver objects as they were when the
   transaction started.
   Therefore, when you start a Postgres transaction
   you must invalidate any object in your cache that
   has been modified between load time and the beginning of this
   transaction. Otherwise, your cache can deliver stale state
   that does not fit with the objects loaded directly from Postgres.
 
   I read this as meaning you do not have much room for maneuver.
   You must ask Postgres about invalidations when the transaction
   starts.

Yes, quite right!

However, we don't necessarily have to roll back the Postgres transaction
on every ZODB.Connection close, as we're doing now.  If we leave the
Postgres transaction open even after the ZODB.Connection closes, then
when the ZODB.Connection reopens, we have the option of not polling,
since at that point ZODB's view of the database remains unchanged from
the last time the Connection was open.

It's not usually good practice to leave sessions idle in a transaction,
but this case seems like a good exception since it should significantly
reduce the database traffic.

Shane



Re: [ZODB-Dev] RelStorage now in Subversion

2008-01-31 Thread Dieter Maurer
Shane Hathaway wrote at 2008-1-31 11:55 -0700:
 ...
Yes, quite right!

However, we don't necessarily have to roll back the Postgres transaction
on every ZODB.Connection close, as we're doing now.

That sounds very nasty!

In Zope, I definitely *WANT* to either commit or roll back the
transaction when the request finishes. I definitely do not
want to let the following completely unrelated request
decide about the fate of my modifications.

 If we leave the
Postgres transaction open even after the ZODB.Connection closes, then
when the ZODB.Connection reopens, we have the option of not polling,
since at that point ZODB's view of the database remains unchanged from
the last time the Connection was open.

Yes, but you leave the fate of your previous activities to
the future -- and you read older and older data
which must increase serializability problems (Postgres might
not guarantee serializability even when the so-called isolation
level is chosen; in this case, you may not see the problems
directly but nevertheless they are there).

It's not usually good practice to leave sessions idle in a transaction,
but this case seems like a good exception since it should significantly
reduce the database traffic.

I agree that it can reduce traffic, but I am almost convinced that
the price will be high (in the form of either failures to serialize
concurrent updates or serializability violations that are not directly
noticeable).



-- 
Dieter


Re: [ZODB-Dev] RelStorage now in Subversion

2008-01-27 Thread Shane Hathaway

Andreas Jung wrote:

Thanks a lot for the great work!

One question: what is the ZODB patch actually doing? Is it for 
optimization purposes? Is it specific to RelStorage or does it provide 
some general ZODB improvement(s)?


The patch is intended to be a general ZODB improvement that enables ZODB 
to interact in a natural way with data stores that already provide MVCC 
semantics.  It does this by allowing the storage to bind a separate 
storage instance to each ZODB.Connection, rather than attaching all 
ZODB.Connections to a single storage.  The patch also allows the storage 
to provide object invalidations through polling.


I hope the patch, or a modified version of the patch, will be accepted 
for inclusion in ZODB 3.9.  A monkey patch version is possible, but I'm 
trying to avoid that.


Shane



Re: [ZODB-Dev] RelStorage now in Subversion

2008-01-26 Thread Andreas Jung



--On 27 January 2008 00:26:04 -0700 Shane Hathaway [EMAIL PROTECTED] 
wrote:



Hi all,

RelStorage now exists in the Zope subversion repository here:

   http://svn.zope.org/relstorage/trunk/

I have also created a wiki page:

   http://wiki.zope.org/ZODB/RelStorage




Thanks a lot for the great work!

One question: what is the ZODB patch actually doing? Is it for optimization 
purposes? Is it specific to RelStorage or does it provide some general ZODB 
improvement(s)?


Andreas
