Re: [Zope-dev] Conflict errors on startup, revisited
Tim Peters wrote: I'm not really happy with that, because I have no theory for how ZEO client cache files _get_ out of synch. The most common one for me is connecting a ZEO client to one storage server (say a dev one) and then changing the config to connect to another storage server (say a staging one) without making sure the cache files are gone. That seems ot do it most time, and I'm sure I've got POSKeyErrors and all manner of other weirdness too. My workaround is never to use persistent client caches on clients that I'm likely to connect to different servers. This seems a shame, I wish there was some kind of safety check that ZEO could do to go wait, these caches are insane, they don't even apply to the server you're connecting to cheers, Chris -- Simplistix - Content Management, Zope Python Consulting - http://www.simplistix.co.uk ___ Zope-Dev maillist - Zope-Dev@zope.org http://mail.zope.org/mailman/listinfo/zope-dev ** No cross posts or HTML encoding! ** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope )
Re: [Zope-dev] Conflict errors on startup, revisited
Paul Winkler wrote at 2005-2-24 15:16 -0500: ... Twiddling around with some old suggestions from Dieter Maurer (see http://aspn.activestate.com/ASPN/Mail/Message/Zope-CMF/1978842 ), I added a _p_jar.sync() where he suggested - right before the final except: in initialize_product(). No joy. Note that the database is initialized in a separate connection -- in case it is empty. This may cause ConflictErrors in the primary connection. -- Dieter ___ Zope-Dev maillist - Zope-Dev@zope.org http://mail.zope.org/mailman/listinfo/zope-dev ** No cross posts or HTML encoding! ** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope )
Re: [Zope-dev] Conflict errors on startup, revisited
Paul Winkler wrote at 2005-2-24 14:29 -0500: This is a topic that comes up occasionally, but since the symptom seems to be rare and hard to provoke, it never gets resolved. Well it's biting me again and I'm determined to fix it, but would like some input from the list. I know that ZopeProfiler can cause such ConflictErrors and (what is more) why it does this. ZopeProfiler (like some other products) wants to install itself in Control_Panel. Because, this requires a database connection and because App.ProductContext.ProductContest tries hard to hide its application object, ZopeProfiler opens a new one, modifies Control_Panel and commits the transaction and then closes its connection. Such modifications via a second ZODB connection can cause unexpected (and puzzling) ConflictError on startup. The next ZopeProfiler version will access ProductContext.__app instead (using the mangled name). -- Dieter ___ Zope-Dev maillist - Zope-Dev@zope.org http://mail.zope.org/mailman/listinfo/zope-dev ** No cross posts or HTML encoding! ** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope )
Re: [Zope-dev] Conflict errors on startup, revisited
ZODB.POSException.ConflictError: database conflict error ( serial this txn started with 0x035b25f36751f988 2005-02-10 18:59:24.215675, serial currently committed 0x035b743c6d186822 2005-02-24 17:00:25.569220) [Dieter Maurer] Where is the oid in this ConflictError report? It is vital information to analyse which object causes the problem... I don't know. I reformatted Paul's traceback to line up the timestamps for easier reading, but there was no oid in the original either. There wouldn't be, either, if no oid was passed to the ConflictError constructor. ConflictError.__str__ only includes info for the stuff it was told about: def __str__(self): extras = [] if self.oid: extras.append(oid %s % oid_repr(self.oid)) if self.class_name: extras.append(class %s % self.class_name) if self.serials: current, old = self.serials extras.append(serial this txn started with %s % readable_tid_repr(old)) extras.append(serial currently committed %s % readable_tid_repr(current)) if extras: return %s (%s) % (self.message, , .join(extras)) else: return self.message In Paul's case, the ConflictError occurred on the ZEO server, but the traceback came from the ZEO client (just raising the Exception object the ZEO server passed to it). There's no way then to guess what the traceback might have looked like on the ZEO server side (b!). Perhaps looking at the ZEO server log would have helped. ___ Zope-Dev maillist - Zope-Dev@zope.org http://mail.zope.org/mailman/listinfo/zope-dev ** No cross posts or HTML encoding! ** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope )
Re: [Zope-dev] Conflict errors on startup, revisited
On Thu, Feb 24, 2005 at 02:29:50PM -0500, Paul Winkler wrote: This is a topic that comes up occasionally, but since the symptom seems to be rare and hard to provoke, it never gets resolved. Well it's biting me again and I'm determined to fix it, ... And since my ConflictError turned out to be merely caused by a borked ZEO client cache, I've given up on the hunt ... again :-\ For all I know, ZEO cache may have been the cause of my previous run-ins with ConflictError on startup too. -PW -- Paul Winkler http://www.slinkp.com ___ Zope-Dev maillist - Zope-Dev@zope.org http://mail.zope.org/mailman/listinfo/zope-dev ** No cross posts or HTML encoding! ** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope )
[Zope-dev] Conflict errors on startup, revisited
This is a topic that comes up occasionally, but since the symptom seems to be rare and hard to provoke, it never gets resolved. Well it's biting me again and I'm determined to fix it, but would like some input from the list. Related collector issues: http://www.zope.org/Collectors/Zope/300 http://www.zope.org/Collectors/Zope/1201 The issue: On rare occasions I have seen ConflictErrors raised during product installation. Today, I can finally provoke them at will with a pretty minimal setup. The setup: Zope 2.7.3, using ZEO, with DirectoryStorage on the ZEO server. Hardware: dual-processor PIII 800, some kinda scsi disk, ext3, some old redhat flavor of linux. To provoke it I need merely do ./bin/zopectl run foo.py ... where foo.py is empty. Products/ now contains only one dummy product, which I've named Conflictor, which contains only an empty __init__.py. I get a ConflictError during registration of Conflictor. Interestingly I can only provoke this ConflictError in this particular INSTANCE_HOME. I have several other instances and am not seeing the symptom there. I started trivially hacking OFS/Application.py to add some debugging output (a diff against 2.7.3 is attached), and here's what I get: [EMAIL PROTECTED] PaulMiniInstanceHome]$ ls Products/ [EMAIL PROTECTED] PaulMiniInstanceHome]$ mkdir Products/Conflicter [EMAIL PROTECTED] PaulMiniInstanceHome]$ touch Products/Conflicter/__in it__.py [EMAIL PROTECTED] PaulMiniInstanceHome]$ touch foo.py [EMAIL PROTECTED] PaulMiniInstanceHome]$ ./bin/zopectl run foo.py installing product PluginIndexes...OK! installing product Conflicter... BZZT Traceback (most recent call last): File string, line 1, in ? File /usr/local/lib/python2.3/site-packages/PIL/__init__.py, line 51, in app File /usr/local/lib/python2.3/site-packages/PIL/__init__.py, line 47, in startup File /zope/Zope-2.7.3-SoftwareHome/lib/python/Zope/App/startup.py, line 96, in startup File /zope/ZopeSoftwareHome/lib/python/OFS/Application.py, line 279, in init ialize initializer.initialize() File /zope/ZopeSoftwareHome/lib/python/OFS/Application.py, line 306, in init ialize self.install_products() File /zope/ZopeSoftwareHome/lib/python/OFS/Application.py, line 561, in inst all_products return install_products(app) File /zope/ZopeSoftwareHome/lib/python/OFS/Application.py, line 597, in inst all_products folder_permissions, raise_exc=debug_mode) File /zope/ZopeSoftwareHome/lib/python/OFS/Application.py, line 803, in inst all_product get_transaction().commit() File /zope/Zope-2.7.3-SoftwareHome/lib/python/ZODB/Transaction.py, line 252, in commit File /zope/Zope-2.7.3-SoftwareHome/lib/python/ZODB/Connection.py, line 738, in tpc_vote File /zope/Zope-2.7.3-SoftwareHome/lib/python/ZEO/ClientStorage.py, line 841 , in tpc_vote File /zope/Zope-2.7.3-SoftwareHome/lib/python/ZEO/ClientStorage.py, line 825 , in _check_serials ZODB.POSException.ConflictError: database conflict error (serial this txn starte d with 0x035b25f36751f988 2005-02-10 18:59:24.215675, serial currently committed 0x035b743c6d186822 2005-02-24 17:00:25.569220) I guess what I'm asking, before I hack more on OFS/Application.py, is this: * any ideas re. what on earth could be causing this ConflictError? the product doesn't DO anything fer chrissakes. (Maybe registration is finishing too fast??) * Is the PIL stuff in the traceback a red herring or what? -- Paul Winkler http://www.slinkp.com *** Application.py.ORIG Thu Feb 24 14:27:23 2005 --- Application.py Thu Feb 24 14:32:24 2005 *** *** 34,39 --- 34,40 from Acquisition import aq_base from App.Product import doInstall from App.config import getConfiguration + from ZODB.POSException import ConflictError class Application(Globals.ApplicationDefaultPermissions, ZDOM.Root, Folder.Folder, *** *** 587,594 if done.has_key(product_name): continue done[product_name]=1 ! install_product(app, product_dir, product_name, meta_types, ! folder_permissions, raise_exc=debug_mode) Products.meta_types=Products.meta_types+tuple(meta_types) Globals.default__class_init__(Folder.Folder) --- 588,606 if done.has_key(product_name): continue done[product_name]=1 ! # collector issue 300: handle ConflictErrors during installation ! sys.stderr.write('installing product %s' % product_name) ! try: ! import time; time.sleep(0.1) ! app._p_jar.sync() # DM's suggestion ! install_product(app, product_dir, product_name, meta_types, ! folder_permissions, raise_exc=debug_mode) ! sys.stderr.write('...OK!\n') ! except ConflictError: ! # somehow we should retry it. XXX look at zpublisher ! sys.stderr.write('... BZZT\n') ! raise !
Re: [Zope-dev] Conflict errors on startup, revisited
On Thu, Feb 24, 2005 at 02:29:50PM -0500, Paul Winkler wrote: * Is the PIL stuff in the traceback a red herring or what? Note that PIL is not referred to by anything in my Products directory. (Heck, *nothing* is referred to by anything in Products.) That's what's so baffling about it appearing in the traceback. -- Paul Winkler http://www.slinkp.com ___ Zope-Dev maillist - Zope-Dev@zope.org http://mail.zope.org/mailman/listinfo/zope-dev ** No cross posts or HTML encoding! ** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope )
Re: [Zope-dev] Conflict errors on startup, revisited
On Thu, Feb 24, 2005 at 02:52:45PM -0500, Chris McDonough wrote: Two words: *hurt system*. Please, let me kill it. Interesting. How on earth did you determine that? You think it has to do with registering help for PluginIndexes? I'm gonna go test that theory - i'll hack registerHelp() to no-op. -- Paul Winkler http://www.slinkp.com ___ Zope-Dev maillist - Zope-Dev@zope.org http://mail.zope.org/mailman/listinfo/zope-dev ** No cross posts or HTML encoding! ** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope )
Re: [Zope-dev] Conflict errors on startup, revisited
On Thu, 2005-02-24 at 14:57, Paul Winkler wrote: On Thu, Feb 24, 2005 at 02:52:45PM -0500, Chris McDonough wrote: Two words: *hurt system*. Please, let me kill it. Interesting. How on earth did you determine that? Womanly intuition. You think it has to do with registering help for PluginIndexes? I'm gonna go test that theory - i'll hack registerHelp() to no-op. That's a good verifier. - C ___ Zope-Dev maillist - Zope-Dev@zope.org http://mail.zope.org/mailman/listinfo/zope-dev ** No cross posts or HTML encoding! ** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope )
Re: [Zope-dev] Conflict errors on startup, revisited
On Thu, Feb 24, 2005 at 02:58:01PM -0500, Chris McDonough wrote: On Thu, 2005-02-24 at 14:57, Paul Winkler wrote: On Thu, Feb 24, 2005 at 02:52:45PM -0500, Chris McDonough wrote: Two words: *hurt system*. Please, let me kill it. Interesting. How on earth did you determine that? Womanly intuition. You think it has to do with registering help for PluginIndexes? I'm gonna go test that theory - i'll hack registerHelp() to no-op. That's a good verifier. Well, no joy. I hacked ProductContext.register* to return from the first line, so even registerClass() now does nothing. Barf, I still get the ConflictError. Twiddling around with some old suggestions from Dieter Maurer (see http://aspn.activestate.com/ASPN/Mail/Message/Zope-CMF/1978842 ), I added a _p_jar.sync() where he suggested - right before the final except: in initialize_product(). No joy. But - duh - my traceback clearly shows the ConflictError happens before that, in the call to get_transaction().commit(). SO I moved the call to app._p_jar.sync() to right before hte call to get_transaction().commit(). Aaaand... no more ConflictError. Weird weird weird. I'd be happy to check this in to cvs / svc, except that A) I have no clue how to write a test that exercises the bug, and B) so far nobody seems to understand what's really happening. If nobody objects, I wouldnt' mind checking the fix in anyway :-) -- Paul Winkler http://www.slinkp.com ___ Zope-Dev maillist - Zope-Dev@zope.org http://mail.zope.org/mailman/listinfo/zope-dev ** No cross posts or HTML encoding! ** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope )
Re: [Zope-dev] Conflict errors on startup, revisited
[Paul Winkler] ... ZODB.POSException.ConflictError: database conflict error ( serial this txn started with 0x035b25f36751f988 2005-02-10 18:59:24.215675, serial currently committed 0x035b743c6d186822 2005-02-24 17:00:25.569220) No time here for thought, but those timestamps are two weeks apart. Yikes! Just thought I'd mention that wink. A write conflict (which this is) happens when (and in this time order): 1. A transaction reads an object O. 2. A different transaction modifies O and commits its new state. 3. The original transaction also modifies O, and tries to commit its new state. #3 fails, because the original transaction has, in general, no way to know whether the changes it tried to make to O's state are consistent with the changes the other transaction made to its state in step #2. For your specific exception above, step #1 started by reading the state of the object as it existed on 2005-02-10. In step #2, someone else modified the same object, and committed its state around 2005-02-24 17:00:25. The original transaction then failed in step #3. Do you use a persistent ZEO client cache? If so, one thing to try is to physically remove all the cache files -- maybe they've gotten into an insane state. It's not impossible that you should have a 2-week difference in timestamps here, but it does look unusual. It's certainly possible (even expected) if two transactions try to modify the same object, and the last time that object was changed was in fact two weeks ago. ___ Zope-Dev maillist - Zope-Dev@zope.org http://mail.zope.org/mailman/listinfo/zope-dev ** No cross posts or HTML encoding! ** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope )
Re: [Zope-dev] Conflict errors on startup, revisited
[Paul Winkler] ... But - duh - my traceback clearly shows the ConflictError happens before that, in the call to get_transaction().commit(). SO I moved the call to app._p_jar.sync() to right before hte call to get_transaction().commit(). Aaaand... no more ConflictError. Well, that's certain: the first thing sync() does is abort the current transaction. Then of course commit() won't raise an exception: sync() threw away all the changes. ... If nobody objects, I wouldnt' mind checking the fix in anyway :-) It would be much clearer instead to change the get_transaction().commit(). to get_transaction().abort() That would get rid of the ConflictError too, and for the same reason. It has the advantage of being obviously insane, though wink. ___ Zope-Dev maillist - Zope-Dev@zope.org http://mail.zope.org/mailman/listinfo/zope-dev ** No cross posts or HTML encoding! ** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope )
Re: [Zope-dev] Conflict errors on startup, revisited
On Thu, Feb 24, 2005 at 03:16:15PM -0500, Paul Winkler wrote: But - duh - my traceback clearly shows the ConflictError happens before that, in the call to get_transaction().commit(). SO I moved the call to app._p_jar.sync() to right before hte call to get_transaction().commit(). Aaaand... no more ConflictError. Never mind, this breaks existing unit tests. Duh, it prevents products from getting initialized at all :-\ now checking out tim's suggestion... -- Paul Winkler http://www.slinkp.com ___ Zope-Dev maillist - Zope-Dev@zope.org http://mail.zope.org/mailman/listinfo/zope-dev ** No cross posts or HTML encoding! ** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope )
Re: [Zope-dev] Conflict errors on startup, revisited
[Tim Peters] ... Do you use a persistent ZEO client cache? If so, one thing to try is to physically remove all the cache files -- maybe they've gotten into an insane state. Off list, Paul said bingo! to that one -- deleting his ZEO client cache files apparently made the problem go away. I'm not really happy with that, because I have no theory for how ZEO client cache files _get_ out of synch. I suspect they do, though, more often than stray gamma rays and pilot error could account for. Alas, I've got no time to give to that these days. If someone else wants to solve that, I'll pass on the cigar Paul offered to me wink. ___ Zope-Dev maillist - Zope-Dev@zope.org http://mail.zope.org/mailman/listinfo/zope-dev ** No cross posts or HTML encoding! ** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope )
Re: [Zope-dev] Conflict errors on startup, revisited
On Thu, Feb 24, 2005 at 04:11:35PM -0500, Tim Peters wrote: [Tim Peters] ... Do you use a persistent ZEO client cache? If so, one thing to try is to physically remove all the cache files -- maybe they've gotten into an insane state. Off list, Paul said bingo! to that one -- deleting his ZEO client cache files apparently made the problem go away. Yep, meant to mail the list, hit the wrong button. Thanks again Tim! I'm not really happy with that, because I have no theory for how ZEO client cache files _get_ out of synch. I suspect they do, though, more often than stray gamma rays and pilot error could account for. Alas, I've got no time to give to that these days. If someone else wants to solve that, I'll pass on the cigar Paul offered to me wink. Oh-sure-pretend-you-didn't-want-the-bag-of-ham-ly y'rs, -- Paul Winkler http://www.slinkp.com ___ Zope-Dev maillist - Zope-Dev@zope.org http://mail.zope.org/mailman/listinfo/zope-dev ** No cross posts or HTML encoding! ** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope )