Re: [ZODB-Dev] RelStorage and PosKey errors - is this a risky hotfix?

2011-02-02 Thread Chris Withers
On 02/02/2011 09:14, Shane Hathaway wrote:
> On 02/02/2011 10:57 AM, Chris Withers wrote:
>> Er, since when? If that were the case, I'm sure Shane would place
>> explicit instructions that it should not be used...
>
> Safe is relative. MySQL is a good choice for Facebook, but if I knew my
> bank was storing my account balance in MySQL, I would close my accounts
> immediately.

I'll have to respectfully disagree...

MySQL has come a long way over the last 10 years, and I'm as confident 
in it as I am in any relational database, particularly when used with 
the InnoDB engine.

cheers,

Chris

-- 
Simplistix - Content Management, Batch Processing & Python Consulting
- http://www.simplistix.co.uk
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] RelStorage and PosKey errors - is this a risky hotfix?

2011-02-02 Thread Shane Hathaway
On 02/02/2011 10:57 AM, Chris Withers wrote:
> Er, since when? If that were the case, I'm sure Shane would place
> explicit instructions that it should not be used...

Safe is relative.  MySQL is a good choice for Facebook, but if I knew my 
bank was storing my account balance in MySQL, I would close my accounts 
immediately.

Shane
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] RelStorage and PosKey errors - is this a risky hotfix?

2011-02-02 Thread Chris Withers
On 01/02/2011 23:10, Ruda Porto Filgueiras wrote:
> I had similar issues on past and now I follow some rules:
>
>   - pack only with zodbpack
>   - pack-gc = false

...which means your database will still slowly (or quickly in my case!) 
grow over time with unreferenced objects.

>   - pack when the database is not been active update or not update at all.

I'm glad you have an app with quiet spells! ;-)
This isn't realistic for any active, public web app.

> But it does not imply RelStorage bug, since MySQL is know to not be so "safe".

Er, since when? If that were the case, I'm sure Shane would place 
explicit instructions that it should not be used...

cheers,

Chris

-- 
Simplistix - Content Management, Batch Processing & Python Consulting
- http://www.simplistix.co.uk
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] RelStorage and PosKey errors - is this a risky hotfix?

2011-02-01 Thread Ruda Porto Filgueiras
Only to give some feedback:

I had similar issues on past and now I follow some rules:

 - pack only with zodbpack
 - pack-gc = false
 - pack when the database is not been active update or not update at all.

But it does not imply RelStorage bug, since MySQL is know to not be so "safe".
I'm happy that Chris Withers has been some investigation and test to
discover possible issues.

I have a large database (15G) and I can help running tests to give
more feedback.

Cheers,

On Mon, Jan 24, 2011 at 6:02 PM, Anton Stonor  wrote:
> Hi there,
>
> We have recently experienced a couple of PosKey errors with a Plone 4 site
> running RelStorage 1.4.1 and Mysql 5.1.
>
> After digging down we found that the objects that were throwing
> PosKeyErrors  actually existed in the object_state table with pickles etc,
> however not in the current_object table.
>
> After inserting the missing pointers into the current_object  table,
> everything worked fine:
>
>   mysql> SELECT zoid, tid FROM object_state WHERE zoid="561701";
>
>   +++
>   | zoid   | tid    |
>   +++
>   | 561701 | 255267099158685832 |
>   +++
>
>   mysql> INSERT INTO current_object(zoid, tid) VALUES('561701',
> '255267099158685832');
>
> Looks like it works -- but is this a safe way to fix PosKeyErrors?
>
> Now, I wonder why these pointers were deleted from the current_object table
> in the first place. My money is on packing -- and it might fit with the fact
> that we recently ran a pack that removed an unusual large amount of
> transactions in a single pack (100.000+ transactions).
>
> But I don't know how to investigate the root cause further. Ideas?
>
>
> /Anton
>
> ___
> For more information about ZODB, see the ZODB Wiki:
> http://www.zope.org/Wikis/ZODB/
>
> ZODB-Dev mailing list  -  ZODB-Dev@zope.org
> https://mail.zope.org/mailman/listinfo/zodb-dev
>
>



-- 
Rudá Porto Filgueiras
http://python-blog.blogspot.com
http://twitter.com/rudaporto
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] RelStorage and PosKey errors - is this a risky hotfix?

2011-01-27 Thread Anton Stonor
Hi Shane,

Thanks for pursuing this.

I have lots of other ideas now, but I don't know which to pursue.  I need a
> lot more information.  It would be helpful if you sent me your database to
> analyze.  Some possible causes:
>
> - Have you looked for filesystem-level corruption yet?  I asked this before
> and I am waiting for an answer.
>
>
Yep, I've checked for file system consistency and Mysql consistency without
any error reported.



> - Although there is a pack lock, that lock unfortunately gets released
> automatically if MySQL disconnects prematurely.  Therefore, it is possible
> to force RelStorage to run multiple pack operations in parallel, which would
> have unpredictable effects.  Is there any possibility that you accidentally
> ran multiple pack operations in parallel?  For example, maybe you have a
> cron job, or you were setting up a cron job at the time, and you started a
> pack while the cron job was running.  (Normally, any attempt to start
> parallel pack operations will just generate an error, but if MySQL
> disconnects in just the right way, you'll get a mess.)
>
>
That's not unlikely! I've actually seen traces of packing invoked TTW,
however the cron job uses zodbpack. I will try to figure out if the PosKeys
starts to surface right after that.


> - Every SQL database has nasty surprises.  Oracle, for example, has a nice
> "read only" mode, but it turns out that mode works differently in RAC
> environments, leading to silent corruption.  As a result, we never use that
> feature of Oracle anymore.  Maybe MySQL has some nasty surprises I haven't
> yet discovered; maybe the MySQL-specific "delete using" statement doesn't
> work as expected.
>

That could also be the case. In fact we have also seen Mysql locking up
longer than expected, but that's another story.



> - Applications can accidentally cause POSKeyErrors in a variety of ways.
>  For example, persistent objects cached globally can cause POSKeyErrors.
>  Maybe Plone 4 or some add-on uses ZODB incorrectly.
>

I was not aware of that.

Next step here would probably be to inspect log files further and  grab a
copy of the dabase before PosKeys started to appear and see if it is
possible to recreate the incident.

Again, thanks.

Anton
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] RelStorage and PosKey errors - is this a risky hotfix?

2011-01-27 Thread Shane Hathaway
On 01/24/2011 02:02 PM, Anton Stonor wrote:
> Now, I wonder why these pointers were deleted from the current_object
> table in the first place. My money is on packing -- and it might fit
> with the fact that we recently ran a pack that removed an unusual large
> amount of transactions in a single pack (100.000+ transactions).
>
> But I don't know how to investigate the root cause further. Ideas?

I have meditated on this for some time now.  I mentioned I had an idea 
about packing, but I studied the design and I don't see any way my idea 
could work.  The design is such that it seems impossible that the pack 
code could produce an inconsistency between the object_state and 
current_object tables.

I have lots of other ideas now, but I don't know which to pursue.  I 
need a lot more information.  It would be helpful if you sent me your 
database to analyze.  Some possible causes:

- Have you looked for filesystem-level corruption yet?  I asked this 
before and I am waiting for an answer.

- Although there is a pack lock, that lock unfortunately gets released 
automatically if MySQL disconnects prematurely.  Therefore, it is 
possible to force RelStorage to run multiple pack operations in 
parallel, which would have unpredictable effects.  Is there any 
possibility that you accidentally ran multiple pack operations in 
parallel?  For example, maybe you have a cron job, or you were setting 
up a cron job at the time, and you started a pack while the cron job was 
running.  (Normally, any attempt to start parallel pack operations will 
just generate an error, but if MySQL disconnects in just the right way, 
you'll get a mess.)

- Every SQL database has nasty surprises.  Oracle, for example, has a 
nice "read only" mode, but it turns out that mode works differently in 
RAC environments, leading to silent corruption.  As a result, we never 
use that feature of Oracle anymore.  Maybe MySQL has some nasty 
surprises I haven't yet discovered; maybe the MySQL-specific "delete 
using" statement doesn't work as expected.

- Applications can accidentally cause POSKeyErrors in a variety of ways. 
  For example, persistent objects cached globally can cause 
POSKeyErrors.  Maybe Plone 4 or some add-on uses ZODB incorrectly.

Shane
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] RelStorage and PosKey errors - is this a risky hotfix?

2011-01-24 Thread Shane Hathaway
On 01/24/2011 06:32 PM, Laurence Rowe wrote:
> Must this imply a failure to maintain a foreign key constraint? While
> there are FK constraints on current_object (zoid, tid) ->  object_state
> (zoid, tid) there is no foreign key that might prevent a
> current_object row from being incorrectly deleted.

That's true.  Thanks for pitching in.

> 2. Something goes wrong during pack gc (either in the pack logic or on
> the database).

Looking at the code, I think I might see a way that current_object rows 
could be incorrectly removed by a pack operation, but I'll have to write 
tests to find out.

Shane
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] RelStorage and PosKey errors - is this a risky hotfix?

2011-01-24 Thread Laurence Rowe
On 24 January 2011 21:28, Shane Hathaway  wrote:
> On 01/24/2011 02:02 PM, Anton Stonor wrote:
>> Hi there,
>>
>> We have recently experienced a couple of PosKey errors with a Plone 4
>> site running RelStorage 1.4.1 and Mysql 5.1.
>>
>> After digging down we found that the objects that were throwing
>> PosKeyErrors  actually existed in the object_state table with pickles
>> etc, however not in the current_object table.
>>
>> After inserting the missing pointers into the current_object  table,
>> everything worked fine:
>>
>>    mysql> SELECT zoid, tid FROM object_state WHERE zoid="561701";
>>
>>    +++
>>    | zoid   | tid                |
>>    +++
>>    | 561701 | 255267099158685832 |
>>    +++
>>
>>    mysql> INSERT INTO current_object(zoid, tid) VALUES('561701',
>> '255267099158685832');
>>
>> Looks like it works -- but is this a safe way to fix PosKeyErrors?
>>
>> Now, I wonder why these pointers were deleted from the current_object
>> table in the first place. My money is on packing -- and it might fit
>> with the fact that we recently ran a pack that removed an unusual large
>> amount of transactions in a single pack (100.000+ transactions).
>>
>> But I don't know how to investigate the root cause further. Ideas?
>
> This suggests MySQL not only lost some data (due to a MySQL bug or a
> filesystem-level error), but it failed to enforce a foreign key that is
> supposed to ensure this never happens.  I think you need to check the
> integrity of your filesystem (e2fsck -f) and database (mysqlcheck -c).
> You might also reconsider the choice to use MySQL.

Must this imply a failure to maintain a foreign key constraint? While
there are FK constraints on current_object (zoid, tid) -> object_state
(zoid, tid) there is no foreign key that might prevent a
current_object row from being incorrectly deleted.

I think that means the possibilities are:

1. The current_object table was not updated properly during a commit
or corrupted so that some rows were lost.

2. Something goes wrong during pack gc (either in the pack logic or on
the database).

3. Database corruption.

Laurence
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] RelStorage and PosKey errors - is this a risky hotfix?

2011-01-24 Thread Shane Hathaway
On 01/24/2011 02:02 PM, Anton Stonor wrote:
> Hi there,
>
> We have recently experienced a couple of PosKey errors with a Plone 4
> site running RelStorage 1.4.1 and Mysql 5.1.
>
> After digging down we found that the objects that were throwing
> PosKeyErrors  actually existed in the object_state table with pickles
> etc, however not in the current_object table.
>
> After inserting the missing pointers into the current_object  table,
> everything worked fine:
>
>mysql> SELECT zoid, tid FROM object_state WHERE zoid="561701";
>
>+++
>| zoid   | tid|
>+++
>| 561701 | 255267099158685832 |
>+++
>
>mysql> INSERT INTO current_object(zoid, tid) VALUES('561701',
> '255267099158685832');
>
> Looks like it works -- but is this a safe way to fix PosKeyErrors?
>
> Now, I wonder why these pointers were deleted from the current_object
> table in the first place. My money is on packing -- and it might fit
> with the fact that we recently ran a pack that removed an unusual large
> amount of transactions in a single pack (100.000+ transactions).
>
> But I don't know how to investigate the root cause further. Ideas?

This suggests MySQL not only lost some data (due to a MySQL bug or a 
filesystem-level error), but it failed to enforce a foreign key that is 
supposed to ensure this never happens.  I think you need to check the 
integrity of your filesystem (e2fsck -f) and database (mysqlcheck -c). 
You might also reconsider the choice to use MySQL.

If you're sure you have found all of the corruption, you can do this 
(with all app servers shut down) to re-create the current_object table:

delete from current_object;
insert into current_object (zoid, tid)
 select zoid, max(tid) from object_state;

Shane
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


[ZODB-Dev] RelStorage and PosKey errors - is this a risky hotfix?

2011-01-24 Thread Anton Stonor
Hi there,

We have recently experienced a couple of PosKey errors with a Plone 4 site
running RelStorage 1.4.1 and Mysql 5.1.

After digging down we found that the objects that were throwing
PosKeyErrors  actually existed in the object_state table with pickles etc,
however not in the current_object table.

After inserting the missing pointers into the current_object  table,
everything worked fine:

  mysql> SELECT zoid, tid FROM object_state WHERE zoid="561701";

  +++
  | zoid   | tid|
  +++
  | 561701 | 255267099158685832 |
  +++

  mysql> INSERT INTO current_object(zoid, tid) VALUES('561701',
'255267099158685832');

Looks like it works -- but is this a safe way to fix PosKeyErrors?

Now, I wonder why these pointers were deleted from the current_object table
in the first place. My money is on packing -- and it might fit with the fact
that we recently ran a pack that removed an unusual large amount of
transactions in a single pack (100.000+ transactions).

But I don't know how to investigate the root cause further. Ideas?


/Anton
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev