On 01/24/2011 02:02 PM, Anton Stonor wrote:
> Now, I wonder why these pointers were deleted from the current_object
> table in the first place. My money is on packing -- and it might fit
> with the fact that we recently ran a pack that removed an unusual large
> amount of transactions in a single pack (100.000+ transactions).
>
> But I don't know how to investigate the root cause further. Ideas?

I have meditated on this for some time now. I mentioned I had an idea
about packing, but I studied the design and I don't see any way my idea
could work. The design is such that it seems impossible that the pack
code could produce an inconsistency between the object_state and
current_object tables.

I have lots of other ideas now, but I don't know which to pursue. I
need a lot more information. It would be helpful if you sent me your
database to analyze. Some possible causes:

- Have you looked for filesystem-level corruption yet? I asked this
  before and I am waiting for an answer.

- Although there is a pack lock, that lock unfortunately gets released
  automatically if MySQL disconnects prematurely. Therefore, it is
  possible to force RelStorage to run multiple pack operations in
  parallel, which would have unpredictable effects. Is there any
  possibility that you accidentally ran multiple pack operations in
  parallel? For example, maybe you have a cron job, or you were setting
  up a cron job at the time, and you started a pack while the cron job
  was running. (Normally, any attempt to start parallel pack operations
  will just generate an error, but if MySQL disconnects in just the
  right way, you'll get a mess.)

- Every SQL database has nasty surprises. Oracle, for example, has a
  nice "read only" mode, but it turns out that mode works differently
  in RAC environments, leading to silent corruption. As a result, we
  never use that feature of Oracle anymore. Maybe MySQL has some nasty
  surprises I haven't yet discovered; maybe the MySQL-specific "delete
  using" statement doesn't work as expected.

- Applications can accidentally cause POSKeyErrors in a variety of
  ways. For example, persistent objects cached globally can cause
  POSKeyErrors. Maybe Plone 4 or some add-on uses ZODB incorrectly.

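As a starting point for narrowing things down, the kind of
inconsistency between object_state and current_object described above
can be checked for with two LEFT JOIN queries. The following is only a
minimal sketch in Python against an in-memory SQLite database with a
simplified schema. The column names (zoid, tid, state), the helper
names, and the sample rows are assumptions made for illustration, not
RelStorage's actual schema or API; the queries would need adapting
before being run against a real MySQL database.

```python
import sqlite3


def find_missing_current(conn):
    """Return zoids that have state rows but no current_object row."""
    rows = conn.execute(
        """
        SELECT DISTINCT os.zoid
        FROM object_state AS os
        LEFT JOIN current_object AS co ON co.zoid = os.zoid
        WHERE co.zoid IS NULL
        ORDER BY os.zoid
        """
    ).fetchall()
    return [zoid for (zoid,) in rows]


def find_dangling_current(conn):
    """Return (zoid, tid) pointers that match no row in object_state."""
    return conn.execute(
        """
        SELECT co.zoid, co.tid
        FROM current_object AS co
        LEFT JOIN object_state AS os
               ON os.zoid = co.zoid AND os.tid = co.tid
        WHERE os.zoid IS NULL
        ORDER BY co.zoid
        """
    ).fetchall()


def demo():
    # Hypothetical, simplified schema and data, for illustration only.
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE object_state (
            zoid INTEGER, tid INTEGER, state BLOB,
            PRIMARY KEY (zoid, tid)
        );
        CREATE TABLE current_object (zoid INTEGER PRIMARY KEY, tid INTEGER);
        INSERT INTO object_state VALUES
            (1, 10, x'00'), (2, 10, x'00'), (2, 20, x'00'), (3, 20, x'00');
        -- zoid 3 has state but no pointer; zoid 4 points at nothing.
        INSERT INTO current_object VALUES (1, 10), (2, 20), (4, 30);
        """
    )
    return find_missing_current(conn), find_dangling_current(conn)


if __name__ == "__main__":
    missing, dangling = demo()
    print("state without current pointer:", missing)
    print("current pointer without state:", dangling)
```

The first query finds objects whose pointer was deleted while their
state survived, which is the symptom reported in this thread. The
second finds the opposite case, a pointer whose state row is gone.
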
Shane