Re: [ZODB-Dev] RelStorage and PosKey errors - is this a risky hotfix?
On 02/02/2011 09:14, Shane Hathaway wrote: > On 02/02/2011 10:57 AM, Chris Withers wrote: >> Er, since when? If that were the case, I'm sure Shane would place >> explicit instructions that it should not be used... > > Safe is relative. MySQL is a good choice for Facebook, but if I knew my > bank was storing my account balance in MySQL, I would close my accounts > immediately. I'll have to respectfully disagree... MySQL has come a long way over the last 10 years, and I'm as confident in it as I am in any relational database, particularly when used with the InnoDB engine. cheers, Chris -- Simplistix - Content Management, Batch Processing & Python Consulting - http://www.simplistix.co.uk ___ For more information about ZODB, see the ZODB Wiki: http://www.zope.org/Wikis/ZODB/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] RelStorage and PosKey errors - is this a risky hotfix?
On 02/02/2011 10:57 AM, Chris Withers wrote: > Er, since when? If that were the case, I'm sure Shane would place > explicit instructions that it should not be used... Safe is relative. MySQL is a good choice for Facebook, but if I knew my bank was storing my account balance in MySQL, I would close my accounts immediately. Shane ___ For more information about ZODB, see the ZODB Wiki: http://www.zope.org/Wikis/ZODB/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] RelStorage and PosKey errors - is this a risky hotfix?
On 01/02/2011 23:10, Ruda Porto Filgueiras wrote: > I had similar issues on past and now I follow some rules: > > - pack only with zodbpack > - pack-gc = false ...which means your database will still slowly (or quickly in my case!) grow over time with unreferenced objects. > - pack when the database is not been active update or not update at all. I'm glad you have an app with quiet spells! ;-) This isn't realistic for any active, public web app. > But it does not imply RelStorage bug, since MySQL is know to not be so "safe". Er, since when? If that were the case, I'm sure Shane would place explicit instructions that it should not be used... cheers, Chris -- Simplistix - Content Management, Batch Processing & Python Consulting - http://www.simplistix.co.uk ___ For more information about ZODB, see the ZODB Wiki: http://www.zope.org/Wikis/ZODB/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] RelStorage and PosKey errors - is this a risky hotfix?
Only to give some feedback: I had similar issues on past and now I follow some rules: - pack only with zodbpack - pack-gc = false - pack when the database is not been active update or not update at all. But it does not imply RelStorage bug, since MySQL is know to not be so "safe". I'm happy that Chris Withers has been some investigation and test to discover possible issues. I have a large database (15G) and I can help running tests to give more feedback. Cheers, On Mon, Jan 24, 2011 at 6:02 PM, Anton Stonor wrote: > Hi there, > > We have recently experienced a couple of PosKey errors with a Plone 4 site > running RelStorage 1.4.1 and Mysql 5.1. > > After digging down we found that the objects that were throwing > PosKeyErrors actually existed in the object_state table with pickles etc, > however not in the current_object table. > > After inserting the missing pointers into the current_object table, > everything worked fine: > > mysql> SELECT zoid, tid FROM object_state WHERE zoid="561701"; > > +++ > | zoid | tid | > +++ > | 561701 | 255267099158685832 | > +++ > > mysql> INSERT INTO current_object(zoid, tid) VALUES('561701', > '255267099158685832'); > > Looks like it works -- but is this a safe way to fix PosKeyErrors? > > Now, I wonder why these pointers were deleted from the current_object table > in the first place. My money is on packing -- and it might fit with the fact > that we recently ran a pack that removed an unusual large amount of > transactions in a single pack (100.000+ transactions). > > But I don't know how to investigate the root cause further. Ideas? > > > /Anton > > ___ > For more information about ZODB, see the ZODB Wiki: > http://www.zope.org/Wikis/ZODB/ > > ZODB-Dev mailing list - ZODB-Dev@zope.org > https://mail.zope.org/mailman/listinfo/zodb-dev > > -- Rudá Porto Filgueiras http://python-blog.blogspot.com http://twitter.com/rudaporto ___ For more information about ZODB, see the ZODB Wiki: http://www.zope.org/Wikis/ZODB/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] RelStorage and PosKey errors - is this a risky hotfix?
Hi Shane, Thanks for pursuing this. I have lots of other ideas now, but I don't know which to pursue. I need a > lot more information. It would be helpful if you sent me your database to > analyze. Some possible causes: > > - Have you looked for filesystem-level corruption yet? I asked this before > and I am waiting for an answer. > > Yep, I've checked for file system consistency and Mysql consistency without any error reported. > - Although there is a pack lock, that lock unfortunately gets released > automatically if MySQL disconnects prematurely. Therefore, it is possible > to force RelStorage to run multiple pack operations in parallel, which would > have unpredictable effects. Is there any possibility that you accidentally > ran multiple pack operations in parallel? For example, maybe you have a > cron job, or you were setting up a cron job at the time, and you started a > pack while the cron job was running. (Normally, any attempt to start > parallel pack operations will just generate an error, but if MySQL > disconnects in just the right way, you'll get a mess.) > > That's not unlikely! I've actually seen traces of packing invoked TTW, however the cron job uses zodbpack. I will try to figure out if the PosKeys starts to surface right after that. > - Every SQL database has nasty surprises. Oracle, for example, has a nice > "read only" mode, but it turns out that mode works differently in RAC > environments, leading to silent corruption. As a result, we never use that > feature of Oracle anymore. Maybe MySQL has some nasty surprises I haven't > yet discovered; maybe the MySQL-specific "delete using" statement doesn't > work as expected. > That could also be the case. In fact we have also seen Mysql locking up longer than expected, but that's another story. > - Applications can accidentally cause POSKeyErrors in a variety of ways. > For example, persistent objects cached globally can cause POSKeyErrors. > Maybe Plone 4 or some add-on uses ZODB incorrectly. > I was not aware of that. Next step here would probably be to inspect log files further and grab a copy of the dabase before PosKeys started to appear and see if it is possible to recreate the incident. Again, thanks. Anton ___ For more information about ZODB, see the ZODB Wiki: http://www.zope.org/Wikis/ZODB/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] RelStorage and PosKey errors - is this a risky hotfix?
On 01/24/2011 02:02 PM, Anton Stonor wrote: > Now, I wonder why these pointers were deleted from the current_object > table in the first place. My money is on packing -- and it might fit > with the fact that we recently ran a pack that removed an unusual large > amount of transactions in a single pack (100.000+ transactions). > > But I don't know how to investigate the root cause further. Ideas? I have meditated on this for some time now. I mentioned I had an idea about packing, but I studied the design and I don't see any way my idea could work. The design is such that it seems impossible that the pack code could produce an inconsistency between the object_state and current_object tables. I have lots of other ideas now, but I don't know which to pursue. I need a lot more information. It would be helpful if you sent me your database to analyze. Some possible causes: - Have you looked for filesystem-level corruption yet? I asked this before and I am waiting for an answer. - Although there is a pack lock, that lock unfortunately gets released automatically if MySQL disconnects prematurely. Therefore, it is possible to force RelStorage to run multiple pack operations in parallel, which would have unpredictable effects. Is there any possibility that you accidentally ran multiple pack operations in parallel? For example, maybe you have a cron job, or you were setting up a cron job at the time, and you started a pack while the cron job was running. (Normally, any attempt to start parallel pack operations will just generate an error, but if MySQL disconnects in just the right way, you'll get a mess.) - Every SQL database has nasty surprises. Oracle, for example, has a nice "read only" mode, but it turns out that mode works differently in RAC environments, leading to silent corruption. As a result, we never use that feature of Oracle anymore. Maybe MySQL has some nasty surprises I haven't yet discovered; maybe the MySQL-specific "delete using" statement doesn't work as expected. - Applications can accidentally cause POSKeyErrors in a variety of ways. For example, persistent objects cached globally can cause POSKeyErrors. Maybe Plone 4 or some add-on uses ZODB incorrectly. Shane ___ For more information about ZODB, see the ZODB Wiki: http://www.zope.org/Wikis/ZODB/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] RelStorage and PosKey errors - is this a risky hotfix?
On 01/24/2011 06:32 PM, Laurence Rowe wrote: > Must this imply a failure to maintain a foreign key constraint? While > there are FK constraints on current_object (zoid, tid) -> object_state > (zoid, tid) there is no foreign key that might prevent a > current_object row from being incorrectly deleted. That's true. Thanks for pitching in. > 2. Something goes wrong during pack gc (either in the pack logic or on > the database). Looking at the code, I think I might see a way that current_object rows could be incorrectly removed by a pack operation, but I'll have to write tests to find out. Shane ___ For more information about ZODB, see the ZODB Wiki: http://www.zope.org/Wikis/ZODB/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] RelStorage and PosKey errors - is this a risky hotfix?
On 24 January 2011 21:28, Shane Hathaway wrote: > On 01/24/2011 02:02 PM, Anton Stonor wrote: >> Hi there, >> >> We have recently experienced a couple of PosKey errors with a Plone 4 >> site running RelStorage 1.4.1 and Mysql 5.1. >> >> After digging down we found that the objects that were throwing >> PosKeyErrors actually existed in the object_state table with pickles >> etc, however not in the current_object table. >> >> After inserting the missing pointers into the current_object table, >> everything worked fine: >> >> mysql> SELECT zoid, tid FROM object_state WHERE zoid="561701"; >> >> +++ >> | zoid | tid | >> +++ >> | 561701 | 255267099158685832 | >> +++ >> >> mysql> INSERT INTO current_object(zoid, tid) VALUES('561701', >> '255267099158685832'); >> >> Looks like it works -- but is this a safe way to fix PosKeyErrors? >> >> Now, I wonder why these pointers were deleted from the current_object >> table in the first place. My money is on packing -- and it might fit >> with the fact that we recently ran a pack that removed an unusual large >> amount of transactions in a single pack (100.000+ transactions). >> >> But I don't know how to investigate the root cause further. Ideas? > > This suggests MySQL not only lost some data (due to a MySQL bug or a > filesystem-level error), but it failed to enforce a foreign key that is > supposed to ensure this never happens. I think you need to check the > integrity of your filesystem (e2fsck -f) and database (mysqlcheck -c). > You might also reconsider the choice to use MySQL. Must this imply a failure to maintain a foreign key constraint? While there are FK constraints on current_object (zoid, tid) -> object_state (zoid, tid) there is no foreign key that might prevent a current_object row from being incorrectly deleted. I think that means the possibilities are: 1. The current_object table was not updated properly during a commit or corrupted so that some rows were lost. 2. Something goes wrong during pack gc (either in the pack logic or on the database). 3. Database corruption. Laurence ___ For more information about ZODB, see the ZODB Wiki: http://www.zope.org/Wikis/ZODB/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] RelStorage and PosKey errors - is this a risky hotfix?
On 01/24/2011 02:02 PM, Anton Stonor wrote: > Hi there, > > We have recently experienced a couple of PosKey errors with a Plone 4 > site running RelStorage 1.4.1 and Mysql 5.1. > > After digging down we found that the objects that were throwing > PosKeyErrors actually existed in the object_state table with pickles > etc, however not in the current_object table. > > After inserting the missing pointers into the current_object table, > everything worked fine: > >mysql> SELECT zoid, tid FROM object_state WHERE zoid="561701"; > >+++ >| zoid | tid| >+++ >| 561701 | 255267099158685832 | >+++ > >mysql> INSERT INTO current_object(zoid, tid) VALUES('561701', > '255267099158685832'); > > Looks like it works -- but is this a safe way to fix PosKeyErrors? > > Now, I wonder why these pointers were deleted from the current_object > table in the first place. My money is on packing -- and it might fit > with the fact that we recently ran a pack that removed an unusual large > amount of transactions in a single pack (100.000+ transactions). > > But I don't know how to investigate the root cause further. Ideas? This suggests MySQL not only lost some data (due to a MySQL bug or a filesystem-level error), but it failed to enforce a foreign key that is supposed to ensure this never happens. I think you need to check the integrity of your filesystem (e2fsck -f) and database (mysqlcheck -c). You might also reconsider the choice to use MySQL. If you're sure you have found all of the corruption, you can do this (with all app servers shut down) to re-create the current_object table: delete from current_object; insert into current_object (zoid, tid) select zoid, max(tid) from object_state; Shane ___ For more information about ZODB, see the ZODB Wiki: http://www.zope.org/Wikis/ZODB/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
[ZODB-Dev] RelStorage and PosKey errors - is this a risky hotfix?
Hi there, We have recently experienced a couple of PosKey errors with a Plone 4 site running RelStorage 1.4.1 and Mysql 5.1. After digging down we found that the objects that were throwing PosKeyErrors actually existed in the object_state table with pickles etc, however not in the current_object table. After inserting the missing pointers into the current_object table, everything worked fine: mysql> SELECT zoid, tid FROM object_state WHERE zoid="561701"; +++ | zoid | tid| +++ | 561701 | 255267099158685832 | +++ mysql> INSERT INTO current_object(zoid, tid) VALUES('561701', '255267099158685832'); Looks like it works -- but is this a safe way to fix PosKeyErrors? Now, I wonder why these pointers were deleted from the current_object table in the first place. My money is on packing -- and it might fit with the fact that we recently ran a pack that removed an unusual large amount of transactions in a single pack (100.000+ transactions). But I don't know how to investigate the root cause further. Ideas? /Anton ___ For more information about ZODB, see the ZODB Wiki: http://www.zope.org/Wikis/ZODB/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev