Re: [ZODB-Dev] Another interesting ZODB cache inconsistency

2006-01-15 Thread Dieter Maurer
Jim Fulton wrote at 2006-1-13 17:54 -0500:
 ...
 This means that interrupting ZEO while it is sending invalidation messages
 can cause inconsistent states in the ZODB caches of its clients.

What do you mean by inconsistent states?  Do you mean inconsistent
between clients?

The inconsistent ZEO clients were in fact Zope instances.

   Viewing the same page (via the same Zope) repeatedly sometimes
   showed the old and sometimes the new state.

   I concluded that some ZODB caches must still contain the old
   state while others already contain the new state.

   The inconsistency was thus already within a single ZEO client.


   I agree that this contradicts my previous assumption
   that partial reception of invalidation messages was to blame,
   because all DBs in a single ZEO client receive the invalidation
   messages at the same time.
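
To make the observation above concrete, here is a toy model of one
process that holds several object caches and serves successive requests
from different ones. Everything in it (the class name, the round-robin
choice of cache, the `storage` dict) is an assumption for illustration
only; it is not ZODB or Zope code and it does not claim to explain why
some caches missed the invalidation.

```python
from itertools import cycle


class ToyZopeProcess:
    """Hypothetical stand-in for one process with several object caches."""

    def __init__(self, n_caches, oid, value):
        # Every cache starts out holding the same (old) state for oid.
        self.caches = [{oid: value} for _ in range(n_caches)]
        # Assumption: requests are served by the caches in round-robin order.
        self._next = cycle(range(n_caches))

    def invalidate(self, oid, cache_indexes):
        # Drop oid only from the listed caches, to model the situation
        # where just some of the caches have lost the old state.
        for i in cache_indexes:
            self.caches[i].pop(oid, None)

    def view(self, oid, storage):
        # Serve from the next cache; reload from storage on a cache miss.
        cache = self.caches[next(self._next)]
        if oid not in cache:
            cache[oid] = storage[oid]
        return cache[oid]
```

With three caches of which only one has dropped the object, repeated
calls to `view` for the same object return a mix of old and new values,
which matches what we saw when reloading the same page.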


-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Another interesting ZODB cache inconsistency

2006-01-13 Thread Jeremy Hylton
Is the problem with the consistency of results served across the ZEO
clients or with the consistency of the database itself?  It seems like
it must be the former.

In the case of an intolerable ZEO failure, I would expect to lose
execution time consistency among peers but preserve consistency of
committed state.  ZEO can't really provide consistency across the
clients anyway, since one client could be executing before a
particular transaction commits and another after it commits.  If two
web clients talk to the two different ZEO clients, they'll see
different results.  A big transaction exacerbates the problem, because
it takes longer to do everything (including the underlying commit on
the storage).

A few thoughts about the effects:

- Each client should process all of the invalidations from a
transaction or none.  If a client loses contact with the server while
invalidations are being sent, it should not process any of them. 
Maybe there's a bug in the code here?  I haven't looked at the code
lately.

- If a client is disconnected, regardless of the state it was in with
respect to this one transaction, it should revalidate its cache and
invalidate any stale data that it held as a result of the disconnect.
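
The two points above can be sketched roughly as follows. This is a toy
model and not the actual ZEO client code: the class, its method names,
and the explicit begin/end markers around a batch of invalidations are
all assumptions made for illustration.

```python
class ToyClientCache:
    """Toy model of a client cache; not ZEO's actual implementation."""

    def __init__(self):
        self.objects = {}    # oid -> (serial, value)
        self._pending = {}   # tid -> oids buffered until the batch ends

    def store(self, oid, serial, value):
        self.objects[oid] = (serial, value)

    def begin_invalidation(self, tid):
        # Start buffering the invalidations of one transaction.
        self._pending[tid] = []

    def receive_invalidation(self, tid, oid):
        # Buffer only; nothing is dropped from the cache yet.
        self._pending[tid].append(oid)

    def end_invalidation(self, tid):
        # The whole batch has arrived, so apply all of it at once.
        for oid in self._pending.pop(tid):
            self.objects.pop(oid, None)

    def on_disconnect(self):
        # Incomplete batches are discarded: all or none.
        self._pending.clear()

    def revalidate(self, server_serials):
        # After reconnecting, drop every object whose cached serial
        # differs from the server's current one (hypothetical mapping
        # oid -> serial supplied by the caller).
        stale = [oid for oid, (serial, _) in self.objects.items()
                 if server_serials.get(oid) != serial]
        for oid in stale:
            del self.objects[oid]
        return stale
```

A client that loses its connection between `begin_invalidation` and
`end_invalidation` keeps its cache unchanged, and the later call to
`revalidate` is what removes the data that went stale in the meantime.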

Jeremy

On 1/13/06, Dieter Maurer [EMAIL PROTECTED] wrote:
 We recently observed another ZODB cache inconsistency:

   The commit of a huge transaction caused our ZEO server to be late
   in responding to the HA monitoring probe. The HA monitor responded
   with a SIGTERM targeted to the ZEO server. ZEO restarted.

   The ZEO client performing the huge transaction reported an
   error in the second phase of its commit state.

   The ZODB states of other ZEO clients were inconsistent:
   some of them had received invalidation messages and saw
   the objects modified by the huge transaction with their new
   values. Others had not yet received the invalidation messages
   and treated the objects as still unchanged.


 This means that interrupting ZEO while it is sending invalidation messages
 can cause inconsistent states in the ZODB caches of its clients.

 I do not know what can be done about it...


 --
 Dieter
