Re: [ZODB-Dev] Re: POSKeyError in zodb-3.6.0

2006-11-16 Thread Dieter Maurer
Chris Bainbridge wrote at 2006-11-15 18:14 +:
> ...
>Another interesting thing; if I add time.sleep(1) to the end of the
>while loop, then the problem goes away. Possibly there is some kind of
>cache race condition, where the ZEO server sends invalidations
>immediately after the client has committed?

The effect of invalidations is synchronized.
Invalidations become effective only at transaction boundaries
and when a connection is opened.
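
As a rough illustration of that rule, here is a toy model. It is not ZODB's actual implementation, and ToyConnection and its method names are hypothetical. Invalidations that arrive in the middle of a transaction are only queued, and the stale cached state is dropped at the next transaction boundary.

```python
class ToyConnection(object):
    """Toy model of deferred invalidation (hypothetical, not ZODB code)."""

    def __init__(self, server_data):
        self._server = server_data      # stands in for the ZEO server
        self._cache = {}                # oid -> state loaded by this client
        self._pending = set()           # invalidations not yet applied

    def load(self, oid):
        # Serve from the cache if possible, otherwise fetch from the "server".
        if oid not in self._cache:
            self._cache[oid] = self._server[oid]
        return self._cache[oid]

    def receive_invalidation(self, oid):
        # May be called at any time; it only records that the oid is stale.
        self._pending.add(oid)

    def _apply_invalidations(self):
        for oid in self._pending:
            self._cache.pop(oid, None)
        self._pending.clear()

    # Transaction boundaries are the only places where the queue is applied.
    def begin(self):
        self._apply_invalidations()

    def commit(self):
        self._apply_invalidations()

    abort = commit


server = {"a": 1}
conn = ToyConnection(server)
conn.begin()
first = conn.load("a")           # caches the value 1
server["a"] = 2                  # another client commits a change
conn.receive_invalidation("a")   # arrives mid-transaction, only queued
during = conn.load("a")          # still served from the cache
conn.commit()                    # boundary: the invalidation takes effect
after = conn.load("a")           # reloaded from the "server"
```

In this model a sleep between iterations changes nothing about when invalidations take effect; only begin, commit and abort do.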

-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Re: POSKeyError in zodb-3.6.0

2006-11-15 Thread Chris Bainbridge

Ok, I have the asyncore loop in, I've added explicit transaction begin
and aborts, and cleaned up the test case a bit:

import thread
import asyncore
import random
from ZEO.ClientStorage import ClientStorage
from ZODB import DB
from persistent.list import PersistentList
from ZODB.POSException import ConflictError
import transaction

storage = ClientStorage(('bw64node01', 12345))
db = DB(storage)
conn = db.open()
root = conn.root()
conn.sync()
thread.start_new_thread(asyncore.loop,())

if 'test' not in root:
    try:
        transaction.begin()
        root['test'] = PersistentList([0, 1])
        transaction.commit()
    except ConflictError:
        transaction.abort()

g = root['test']
y = PersistentList()
while 1:
    try:
        transaction.begin()
        g[g.index(random.choice(g))] = y
        #g[g.index(random.choice(g))] = PersistentList()
        transaction.commit()
    except ConflictError:
        transaction.abort()

Now, when 4 or so instances are run in parallel, this will fail with
POSKeyError corruption of the ZODB database. However, if you uncomment
the commented-out line, it's fine. Maybe I'm missing something - why
can't I create a PersistentList outside of the transaction, and then
add multiple entries inside root['test'] pointing to it?

Another interesting thing; if I add time.sleep(1) to the end of the
while loop, then the problem goes away. Possibly there is some kind of
cache race condition, where the ZEO server sends invalidations
immediately after the client has committed?


Re: [ZODB-Dev] Re: POSKeyError in zodb-3.6.0

2006-11-15 Thread Chris Withers

Chris Bainbridge wrote:
> Hi Alan,
>
>>   - You can't just catch ConflictError and pass
>
> I do conn.sync() at the top of the loop which is supposed to abort the
> connection and re-sync the objects with the zeo server.

Urm, sounds like you're looking for transaction.abort().

Also, be aware of the weirdness that can occur if you run ZEO clients
without an asyncore loop. These can lead you to need to call .sync()...

>>   - I think you can catch a ReadConflictError and *retry* that is ok.

Eep, in this day and age you shouldn't be seeing any of these ;-)

>>   - But a ConflictError needs to be *retried* manually in your client
>> code.

Yup, abort the transaction and try again...

> afaik, this may be better coding style, but isn't actually required,
> since doesn't each commit implicitly begin a new transaction?


Urm, the abort and possibly the .sync are absolutely necessary to get 
all the objects back into a sane, consistent state...
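
The abort-and-retry advice can be sketched roughly as follows. This is a minimal sketch under assumptions: run_with_retries, FakeTxn and FakeConflict are hypothetical names used so the example runs without a ZEO server. Any object with begin, commit and abort will do as txn.

```python
def run_with_retries(work, txn, conflict_error, max_attempts=5):
    """Run work() in a transaction, aborting and retrying on conflicts."""
    for attempt in range(1, max_attempts + 1):
        txn.begin()
        try:
            result = work()
            txn.commit()          # a conflict may also surface here
            return result
        except conflict_error:
            txn.abort()           # discard the failed attempt, then retry
        except Exception:
            txn.abort()           # never leave a half-finished transaction
            raise
    raise RuntimeError("gave up after %d conflicting attempts" % max_attempts)


# Stand-ins so the sketch is self-contained (hypothetical, not ZODB classes).
class FakeConflict(Exception):
    pass


class FakeTxn(object):
    def __init__(self):
        self.log = []

    def begin(self):
        self.log.append("begin")

    def commit(self):
        self.log.append("commit")

    def abort(self):
        self.log.append("abort")


attempts = []


def flaky():
    # Conflicts on the first two attempts, succeeds on the third.
    attempts.append(1)
    if len(attempts) < 3:
        raise FakeConflict()
    return "done"


txn = FakeTxn()
outcome = run_with_retries(flaky, txn, FakeConflict)
```

With the real libraries, the transaction module and ZODB.POSException.ConflictError from the test case would presumably take the place of the two stand-ins.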


Chris

--
Simplistix - Content Management, Zope & Python Consulting
   - http://www.simplistix.co.uk



[ZODB-Dev] Re: POSKeyError in zodb-3.6.0

2006-11-14 Thread Chris Bainbridge

Hi Alan,

Thanks for the advice. I'm using multiple processes, one on each host
in a cluster. The extra thread is only used to run the asyncore loop,
which allows zodb to receive asynchronous notifications. I've been
playing around with your suggestions, and found that if I don't run
the extra asyncore thread, and replace conn.sync() with explicit
calls to transaction.begin and end, then the test case will run
without errors. However, if any process receives a SIGTERM signal,
then the bug will occur and the database becomes corrupt.
Unfortunately this doesn't solve the problem, since in my real app
removing the asyncore loop just makes the bug take longer to show up.
I've found a workaround, though: if instead of modifying the main list
I do list[i].__setstate__(y.__getstate__()) so that the code modifies
the objects rather than the PersistentList, then the bug doesn't
occur.
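
The difference between the two approaches can be sketched with a plain stand-in class. Item is hypothetical and is not a Persistent subclass; the sketch only shows which object gets modified, not why the bug disappears.

```python
class Item(object):
    """Stand-in for an element stored in the main list (hypothetical)."""

    def __init__(self, values):
        self.values = list(values)

    def __getstate__(self):
        return {"values": list(self.values)}

    def __setstate__(self, state):
        self.values = list(state["values"])


container = [Item([0]), Item([1])]
y = Item([42])
original_slot = container[0]

# Variant 1: rebinding the slot changes the container itself.
rebound = list(container)
rebound[0] = y

# Variant 2 (the workaround): copy y's state into the existing element,
# so the container keeps referring to the same object as before.
container[0].__setstate__(y.__getstate__())
```

After variant 2 the container still holds the original element, which now carries a copy of y's state, whereas variant 1 leaves the container pointing at y.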


>   - You can't just catch ConflictError and pass

I do conn.sync() at the top of the loop which is supposed to abort the
connection and re-sync the objects with the zeo server.

>   - I think you can catch a ReadConflictError and *retry* that is ok.
>
>   - But a ConflictError needs to be *retried* manually in your client code.
>
> If you catch a ConflictError you need to abort the transaction.
> You should be explicit about *beginning* transactions after ending previous
> transaction.

afaik, this may be better coding style, but isn't actually required,
since doesn't each commit implicitly begin a new transaction?


[ZODB-Dev] Re: POSKeyError in zodb-3.6.0

2006-11-08 Thread Chris Bainbridge

Hi,

This issue results in a corrupted database. Can anyone confirm that
they can reproduce this with the test case I provided, so that I can
eliminate any potential problems with my setup as being the cause?

Thanks,
Chris


[ZODB-Dev] Re: POSKeyError in zodb-3.6.0

2006-11-06 Thread Chris Bainbridge

The bug I'm getting on the client side is two or more clients
simultaneously reporting:

Traceback (most recent call last):
  File "/home/chrb/test_bad.py", line 45, in ?
    i = g.index(random.choice(g))
  File "/usr/lib/python2.4/UserList.py", line 78, in index
    def index(self, item, *args): return self.data.index(item, *args)
  File "/usr/lib/python2.4/UserList.py", line 17, in __eq__
    def __eq__(self, other): return self.data == self.__cast(other)
  File "/usr/lib/python2.4/site-packages/ZODB/Connection.py", line 732, in setstate
    self._setstate(obj)
  File "/usr/lib/python2.4/site-packages/ZODB/Connection.py", line 768, in _setstate
    p, serial = self._storage.load(obj._p_oid, self._version)
  File "/usr/lib/python2.4/site-packages/ZEO/ClientStorage.py", line 746, in load
    return self.loadEx(oid, version)[:2]
  File "/usr/lib/python2.4/site-packages/ZEO/ClientStorage.py", line 769, in loadEx
    data, tid, ver = self._server.loadEx(oid, version)
  File "/usr/lib/python2.4/site-packages/ZEO/ServerStub.py", line 192, in loadEx
    return self.rpc.call("loadEx", oid, version)
  File "/usr/lib/python2.4/site-packages/ZEO/zrpc/connection.py", line 536, in call
    raise inst # error raised by server
ZODB.POSException.POSKeyError: 0x01f5

This seems to be triggered by the call to PersistentList.index. If I
change the random select line to:

 i = random.randint(0, len(g)-1)

then I no longer see this error.

Presumably this just means that the access pattern for index() is
sufficient to trigger this bug, rather than index itself being the
problem.
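
The difference in access pattern can be sketched without ZODB. In the traceback, index() calls __eq__ on the elements, and that comparison is what triggers setstate. Tracked is a hypothetical stand-in that counts comparisons; it says nothing about the underlying bug itself.

```python
import random


class Tracked(object):
    """Stand-in element that counts how often it is compared (hypothetical)."""

    compared = 0

    def __init__(self, label):
        self.label = label

    def __eq__(self, other):
        Tracked.compared += 1
        return self is other

    def __hash__(self):
        return id(self)


g = [Tracked(n) for n in range(5)]

# Picking a position at random never looks inside the elements.
Tracked.compared = 0
i = random.randint(0, len(g) - 1)
picked = g[i]
comparisons_positional = Tracked.compared

# index() compares the target against earlier elements until it matches.
Tracked.compared = 0
j = g.index(g[-1])
comparisons_index = Tracked.compared
```

For persistent elements, each such comparison may need the element's state, which would mean a load from the storage whenever the object is not already in the cache.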


[ZODB-Dev] Re: POSKeyError

2005-04-15 Thread Bob Horvath
Dieter Maurer wrote:
> Tim Peters wrote at 2005-3-30 08:39 -0500:
> 
>>...
>>[Dieter Maurer]
>>
>>>The last packing bug was some time in the past. With current Zope
>>>versions, there is no known packing bug.
>>
>>It's not that packing introduces new problems, it's that packing isn't an
>>error recovery procedure:  if the POSKeyErrors persist after the pack, then
>>packing will have destroyed some amount of original evidence forever,
>>potentially making it that much harder for someone to figure out how to
>>repair the POSKeyErrors.
> 
> 
> Thus, the person who fears his storage gets too huge makes
> a backup copy (for analysis) and then packs the production storage.
> 
> I do not advocate packing as an error recovery procedure.
> It is just that isolated POSKeyErrors need not prevent
> packing.
> 


OK, so I lost a little fear of packing, did a backup, and packed my
database.  Lo and behold, the POSKeyErrors are gone.

I do still have a dozen "refers to invalid object:" coming out of
fsrefs.py.  Are these errors waiting to happen?  Or nothing to be
concerned about?

I was tempted to play around with the killthem script
(http://mindlace.net/src/zodb/killthem.py ), but that seems to rely on
zopectl which is a Zope 2.7 thing, right?  I am still stuck on 2.6.2
until I can get rid of these damn errors.

I do feel like my database is somewhat healthier.  Hope it isn't a false
sense of security.


RE: [ZODB-Dev] Re: POSKeyError

2005-03-30 Thread Dieter Maurer
Tim Peters wrote at 2005-3-30 08:39 -0500:
>...
>[Dieter Maurer]
>> The last packing bug was some time in the past. With current Zope
>> versions, there is no known packing bug.
>
>It's not that packing introduces new problems, it's that packing isn't an
>error recovery procedure:  if the POSKeyErrors persist after the pack, then
>packing will have destroyed some amount of original evidence forever,
>potentially making it that much harder for someone to figure out how to
>repair the POSKeyErrors.

Thus, the person who fears his storage gets too huge makes
a backup copy (for analysis) and then packs the production storage.

I do not advocate packing as an error recovery procedure.
It is just that isolated POSKeyErrors need not prevent
packing.
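
The backup step could look roughly like this. It is a sketch under assumptions: the storage is a single file (named Data.fs here only for illustration), nothing is writing to it during the copy, and backup_storage_file is a hypothetical helper. Packing itself is not shown.

```python
import os
import shutil
import tempfile
import time


def backup_storage_file(path, backup_dir):
    """Copy the storage file to backup_dir under a timestamped name."""
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = os.path.join(
        backup_dir, os.path.basename(path) + "." + stamp + ".bak")
    shutil.copy2(path, target)    # copy2 also preserves file timestamps
    return target


# Demonstration on a throwaway file instead of a real storage.
workdir = tempfile.mkdtemp()
source = os.path.join(workdir, "Data.fs")
with open(source, "wb") as handle:
    handle.write(b"example bytes")

copy_path = backup_storage_file(source, workdir)
```

The unpacked copy keeps the original evidence available for analysis, and the production storage is packed only afterwards.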

-- 
Dieter