Re: [ZODB-Dev] Relstorage pack problems

2009-01-26 Thread Santi Camps
On Fri, Jan 23, 2009 at 8:45 PM, Shane Hathaway sh...@hathawaymix.org wrote:
 Shane Hathaway wrote:

 Assuming your bad script caused your problem, it is likely that packing
 will still mess up your database, since you still probably have mixed-up
 object_state rows.  Don't pack until I've had a chance to look again.

 Here is some more analysis.  Now that I understand you accidentally merged
 two databases into one by forcing copyTransactionsFrom() to run when it
 shouldn't, I looked for the transactions you merged.  First I looked for the
 OIDs with a confused transaction ID.


 = select zoid from current_object where tid != (select max(tid) from
 object_state where object_state.zoid = current_object.zoid);
  zoid
 --
7
   10
   12
   11
9
8
 (6 rows)


 Then I listed all non-current transaction IDs for those objects.


 =  select zoid, tid from object_state where zoid in (7,8,9,10,11,12) and
 tid != (select tid from current_object where current_object.zoid =
 object_state.zoid);
  zoid |tid
 --+
8 | 250499913748614178
9 | 250499913748614178
   10 | 250499913748614178
   11 | 250499913748614178
   12 | 250499913748614178
7 | 250499913748614178
 (6 rows)


 Based on this information and the information in my last email, I can deduce
 that you fortunately merged only two transactions from another database and
 that while the merge caused conflicts, these objects haven't been otherwise
 modified.  Note that the bad database merge could have happened at any time,
 not necessarily November 17 when these transactions were created.  Anyone
 with access to your database and your broken script could cause this problem
 again.  Fix the script quickly.

 Here are the two bad transactions:

 250499913441768123 | initial database creation
 250499913748614178 | /manage_main\012\012Created Zope Application

 You need to delete all traces of these two transactions from your database.
  Before you do, please ensure nothing is actually using them.  The query
 below should not return any rows.


 select * from current_object where tid in (250499913441768123,
 250499913748614178);


 Assuming that query returns no rows, here is how you can remove the bad
 transactions:


 update object_state set prev_tid = 0 where prev_tid in (250499913441768123,
 250499913748614178);
 delete from object_state where tid in (250499913441768123,
 250499913748614178);
 delete from object_ref where tid in (250499913441768123,
 250499913748614178);
 delete from object_refs_added where tid in (250499913441768123,
 250499913748614178);
 delete from transaction where tid in (250499913441768123,
 250499913748614178);
 commit;


 Once you've done that, you should see no more anomalies in current_object:


 =  select zoid from current_object where tid != (select max(tid) from
 object_state where object_state.zoid = current_object.zoid);
  zoid
 --
 (0 rows)


 I used several shortcuts for this solution, particularly the statement that
 sets prev_tid to 0.  If you had merged a more complex database, I wouldn't
 have been able to use shortcuts.

 I'm glad to know RelStorage didn't do anything wrong after all.  Perhaps the
 copyTransactionsFrom() method could work harder to prevent a mishap like
 this, but that method is part of the ZODB API, not RelStorage, so I don't
 have as much control over it.

 However, I still don't want you to pack yet because my experiments with
 packing your database has revealed some unexpected behavior.  I'm going to
 look into it.


Thanks again, Shane.   We'll fix the script, try removing this 2
transactions and packing on a copy of the database, to see what
happens.   The last pack on a copy works, but then the application
raises a KeyError 8, probably becouse this zoid is one of the affected
by the wrong transactions.

-- 
Santi Camps (Earcon S.L.)
http://www.earcon.com
http://www.kmkey.com
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Relstorage pack problems

2009-01-23 Thread Santi Camps
On Fri, Jan 23, 2009 at 2:38 AM, Shane Hathaway sh...@hathawaymix.org wrote:
 Santi,

 I hope you don't mind me discussing your database in public.  I'm not going
 to talk about anything that looks like it could be private.  Other
 RelStorage users might benefit from the analysis.


Hi Shane

That's right, of course.  Furthermore, thanks a lot for this analysis.

 Looking at your database, I see that something bad happened just before
 transaction 250499913441768123.  That number is an encoded time stamp:

   from ZODB.TimeStamp import TimeStamp
   from ZODB.utils import p64
   str(TimeStamp(p64(250499913441768123)))
  '2008-11-17 19:36:04.913130'

 The transaction log entry says initial database creation, which means that
 the database had no root object (OID 0), so ZODB created one and started a
 brand new database.  Strange!  This happened about an hour after a
 transaction labeled:

  /asp_ekartek/kmkey_iso/portal_setup/manage_doUpgrades

 I'm guessing that an upgrade script did something horribly wrong that day.


I've been revising what happens this day, and I think we are near to
get the guilty.This day the database was migrated from
DirectoryStorage to Relstorage.AFAIK, the upgrade should be done
in DirectoryStorage, before the conversion to Relstorage.   I don't
think the upgrade can corrupt the database, all operations ara high
level ones, and we never change ZODB objects by hand.   So, the
problem should be in the conversion process.

I attach the script we use to do the conversion.   Be free to include
in Relstorage if you think it's useful and it is well done (as I said,
I really don't know much about ZODB, I just mix zodbconvert.py with
some DirectoryStorage code)

 Furthermore, the entry for OID 0 in the current_object table points to an
 old transaction rather than the most recent transaction that modified OID 0.
  That's not supposed to happen, even when you undo.  I hope RelStorage
 didn't do that!

 Did you or someone on your team change current_object by hand?  I can
 understand why you would, since a simple modification to current_object
 would be a nice quick fix for the broken upgrade.  The fix would not be
 complete, though, because now the object_state table and the current_object
 table disagree on the current state of OID 0.

 According to object_state, even now, the current state of OID 0 still points
 to the small object graph that was accidentally created on November 17.  The
 pack code relies more on object_state than on current_object, so the pack
 code sees only a handful of objects that are reachable.  Packing with
 garbage collection removes everything that is not reachable.

 The current_object table is really just a cache of object_state.  If the
 schema were fully normalized, there would not be a current_object table.  In
 theory, the current_object table makes it possible to load ZODB objects
 quickly.  But if the current_object table results in problems like this, I
 need to consider alternatives.


We've used the attached script to convert a lot of others databases,
that are packing successfully, so something special should occurs in
this case.The only explanation I can found is that the conversion
would be done without unmounting the DirectoryStorage database from
its Zope mount point, and that caused the problem.   The right way we
use to convert databases from DS to RS is, first of all, detach them
from Zope, then convert, and then mount the resultant RS, but It's
possible that were a human mistake that day.

 In any case, I believe you can get out of this mess pretty easily.  You need
 to delete the extra object states for OID 0 created on November 17.  I tried
 this in my copy of your database:

 delete from object_state where zoid = 0 and tid in (
  250499913441768123, 250499913748614178);

 After that, select count(1) from object_state where zoid = 0; should tell
 you there is only one state in the database for OID 0.  Packing should work
 fine then.  It seemed to do the right thing on my copy, but I don't have
 your application code to check it.


Thank you very much for that information.  I really could not be able
to found it myself.We will try it as soon as possible and let you
know the results

Regards
-- 
Santi Camps (Earcon S.L.)
http://www.earcon.com
http://www.kmkey.com
#!/usr/bin/env python
##
#
# Copyright (c) 2008 Zope Corporation and Contributors.
# All Rights Reserved.
#
# This software is subject to the provisions of the Zope Public License,
# Version 2.1 (ZPL).  A copy of the ZPL should accompany this distribution.
# THIS SOFTWARE IS PROVIDED AS IS AND ANY AND ALL EXPRESS OR IMPLIED
# WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
# WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS
# FOR A PARTICULAR PURPOSE.
#
##
ZODB storage 

Re: [ZODB-Dev] Relstorage pack problems

2009-01-23 Thread Shane Hathaway
Santi Camps wrote:
 I attach the script we use to do the conversion.   Be free to include
 in Relstorage if you think it's useful and it is well done (as I said,
 I really don't know much about ZODB, I just mix zodbconvert.py with
 some DirectoryStorage code)

Ah-ha, you removed from zodbconvert.py the code that checks whether the 
destination already has data.  That was unnecessary and a big mistake. 
By removing that error condition, you made the script blind to the mess 
it was making.

Assuming your bad script caused your problem, it is likely that packing 
will still mess up your database, since you still probably have mixed-up 
object_state rows.  Don't pack until I've had a chance to look again.

Put the error condition back.  Here is what it looks like:

 if storage_has_data(destination):
 msg = Error: the destination storage has data.  Try --clear.
 sys.exit(msg)

It comes just before the call to copyTransactionsFrom().  Also, your 
current storage_has_data() function simply returns True, which is 
insane.  Here is the correct method:

def storage_has_data(storage):
 i = storage.iterator()
 try:
 i[0]
 except IndexError:
 return False
 return True

It's possible that function has a bug due to recent changes in the 
storage iterator API.  If it does, I'll fix it.

Shane

___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Relstorage pack problems

2009-01-23 Thread Shane Hathaway
Alan Runyan wrote:
 
 
 On Fri, Jan 23, 2009 at 1:45 PM, Shane Hathaway sh...@hathawaymix.org 
 mailto:sh...@hathawaymix.org wrote:
 = select zoid from current_object where tid != (select max(tid) from
 object_state where object_state.zoid = current_object.zoid);
  zoid
 --
 7
10
12
11
 9
 8
 (6 rows)
 
 
 Wow.  All of that with some SQL.

Cool, huh?  One caveat: While PostgreSQL executes this query in seconds, 
MySQL 5.0 would likely mis-optimize the subquery and take hours. :-/ 
For MySQL I'd have to use join syntax instead.

 Out of curiosity what would be the process if you were using FileStorage 
 to diagnose the same problem?  Do the same thing by iterating over the 
 entire storage and filling in some structures and doing the comparison 
 manually?

Probably, though ZODB has some pretty good tools for low-level 
FileStorage debugging.

 Sounds like it would take a few hours, with SQL -- it seems 
 pretty straightforward.

I think so too, although with FileStorage it would have been possible to 
guess at the time when the databases were accidentally merged, by 
looking at the position of the bad transactions in the file. 
FileStorage also would have reacted differently to the merged databases, 
though I have no idea how.

 The sooner this gets into the mainline of ZODB -- the better.

Thanks.  Still, FileStorage should always be the default. It's simpler 
and easier to set up than RelStorage.

Shane
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Relstorage pack problems

2009-01-22 Thread Santi Camps

 Interesting.  RelStorage is already set up to do something like that.
 It splits packing into two phases.  In the first phase, it makes a
 complete list of every row for the second phase to delete.  The second
 phase could be delayed as long as we want.  Between the two phases, if
 any database connections use objects that have been marked for deletion,
 we could cancel the pack and flag a software bug.

 If you want we make any kind of test with our databases, It will be a 
 pleasure

 It would be very helpful to me if you could provide a copy of your
 database for me to debug.  I'm hoping for a compressed SQL dump.

 Shane


Hi again

I already know what happens:  either object_ref and object_refs_added
tables are completely empty in my database.   I can't understand why,
but it explains the data loose when packing (nothing references
nothing).   So, the mystery is not in Relstorage but in the data.

The only right ways to solve it is to disable pack-gc or try to fill
this tables correctly.   Do you know any simple way to do it ?
Perhaps an export / import should fill this tables ?

Thanks a lot for your help
-- 
Santi Camps (Earcon S.L.)
http://www.earcon.com
http://www.kmkey.com
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Relstorage pack problems

2009-01-22 Thread Shane Hathaway
Santi,

I hope you don't mind me discussing your database in public.  I'm not 
going to talk about anything that looks like it could be private.  Other 
RelStorage users might benefit from the analysis.

Looking at your database, I see that something bad happened just before 
transaction 250499913441768123.  That number is an encoded time stamp:

from ZODB.TimeStamp import TimeStamp
from ZODB.utils import p64
str(TimeStamp(p64(250499913441768123)))
   '2008-11-17 19:36:04.913130'

The transaction log entry says initial database creation, which means 
that the database had no root object (OID 0), so ZODB created one and 
started a brand new database.  Strange!  This happened about an hour 
after a transaction labeled:

   /asp_ekartek/kmkey_iso/portal_setup/manage_doUpgrades

I'm guessing that an upgrade script did something horribly wrong that day.

Furthermore, the entry for OID 0 in the current_object table points to 
an old transaction rather than the most recent transaction that modified 
OID 0.  That's not supposed to happen, even when you undo.  I hope 
RelStorage didn't do that!

Did you or someone on your team change current_object by hand?  I can 
understand why you would, since a simple modification to current_object 
would be a nice quick fix for the broken upgrade.  The fix would not be 
complete, though, because now the object_state table and the 
current_object table disagree on the current state of OID 0.

According to object_state, even now, the current state of OID 0 still 
points to the small object graph that was accidentally created on 
November 17.  The pack code relies more on object_state than on 
current_object, so the pack code sees only a handful of objects that are 
reachable.  Packing with garbage collection removes everything that is 
not reachable.

The current_object table is really just a cache of object_state.  If the 
schema were fully normalized, there would not be a current_object table. 
  In theory, the current_object table makes it possible to load ZODB 
objects quickly.  But if the current_object table results in problems 
like this, I need to consider alternatives.

In any case, I believe you can get out of this mess pretty easily.  You 
need to delete the extra object states for OID 0 created on November 17. 
  I tried this in my copy of your database:

delete from object_state where zoid = 0 and tid in (
   250499913441768123, 250499913748614178);

After that, select count(1) from object_state where zoid = 0; should 
tell you there is only one state in the database for OID 0.  Packing 
should work fine then.  It seemed to do the right thing on my copy, but 
I don't have your application code to check it.

Please let me know whether current_object was edited by hand.  If it 
was, then we have a documentation bug.  If RelStorage did that on its 
own, we may have a software bug.

Shane

___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Relstorage pack problems

2009-01-22 Thread Shane Hathaway
Laurence Rowe wrote:
 Shane Hathaway wrote:
 I should note that this KeyError occurs while trying to report on a
 KeyError.  I need to fix that.  Fortunately, the same error pops out anyway.
 
 There's a fix for this in the Jarn branch. Note that to collect more 
 interesting data it rolls back the load connection at this point, 
 relying on the KeyError to cause the transaction to fail.

I saw that, but I didn't feel good about the idea of rolling back the 
transaction.  I wanted the log message to show the state of the database 
as seen by that connection at that moment in time.  That's the kind of 
information you can't get with a SQL query executed at a later time.  I 
figured that if you really want to see all object states, you can just 
use SQL queries later.

Shane
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Relstorage pack problems

2009-01-21 Thread Jim Fulton

On Jan 19, 2009, at 2:22 PM, Shane Hathaway wrote:
  A proper fix would be
 to make the mounted object reachable from the mounted database  
 root.  I
 believe the Zope 2 support for mounted databases does this
 automatically, but it's possible that Zope 3 takes an unwise shortcut.

Zope 3 doesn't support mounting.  It simply supports multiple  
databases.  It is up to developers to decide how to use multiple  
databases.  This is a significant potential pitfall.

IMO, the best solution long term is to provide a distributed garbage  
collector that is independent of packing. ZODB 3.9 adds a hook to  
enable this and I intend to write a distributed garbage collector soon.

Jim

--
Jim Fulton
Zope Corporation


___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Relstorage pack problems

2009-01-20 Thread Santi Camps
On Mon, Jan 19, 2009 at 8:22 PM, Shane Hathaway sh...@hathawaymix.org wrote:
 Santi Camps wrote:
 We've been using RelStorage 1.1.c1 in production environments over
 postgresql 8.1 for some time.   It has been working really fine, but
 yesterday we have a big problem packing a relstorage zodb mounted as a
 zope mount point.

 I'm guessing that your mounted object is not attached to the root of the
 mounted database.  In that situation, both RelStorage and FileStorage
 will pack away practically everything in the mounted database, because
 from the viewpoint of the mounted database, the disconnected object is
 unreachable and thus garbage.

 A quick workaround is to disable garbage collection in zope.conf; then
 you can pack without losing any objects at all.  A proper fix would be
 to make the mounted object reachable from the mounted database root.  I
 believe the Zope 2 support for mounted databases does this
 automatically, but it's possible that Zope 3 takes an unwise shortcut.


Sorry, I miss to say we are using Zope 2.9.4, and that mounted object
is already reachable from the mounted database root (adding a ZODB
mount point to the root thought ZMI).

Although, the resultant database, after pack, has a big size.  So, not
all was deleted, but seems the root object not to be found.

We are using the same method on a lot of databases and has been
working fine for a long time, until this case.  I know it's very
difficult to say what happened, but I've reported the case because
perhaps it's possible to take some measures in order to avoid data
loss while packing.  I don't know a lot about ZODB, but
DirectoryStorage, for instance, don't delete objects immediately, but
uses a 'deleted' mark and don't remove them until X days (X configured
in the settings).  I love this caution of DirectoryStorage, but I like
all the other of RelStorage :-)

If you want we make any kind of test with our databases, It will be a pleasure

Thanks a lot for your help
-- 
Santi Camps (Earcon S.L.)
http://www.earcon.com
http://www.kmkey.com
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Relstorage pack problems

2009-01-20 Thread Shane Hathaway
Santi Camps wrote:
 Sorry, I miss to say we are using Zope 2.9.4, and that mounted object
 is already reachable from the mounted database root (adding a ZODB
 mount point to the root thought ZMI).

Ok.

 Although, the resultant database, after pack, has a big size.  So, not
 all was deleted, but seems the root object not to be found.

A side note: PostgreSQL does not actually reduce the size of a database
until you vacuum full.  Packing a ZODB in PostgreSQL actually
temporarily increases the size of the database even if it contains a lot
less data.  Still, vacuum full is not the right way to control the
database size; the right solution is to enable autovacuum and pack on a
regular basis.

 We are using the same method on a lot of databases and has been
 working fine for a long time, until this case.  I know it's very
 difficult to say what happened, but I've reported the case because
 perhaps it's possible to take some measures in order to avoid data
 loss while packing.  I don't know a lot about ZODB, but
 DirectoryStorage, for instance, don't delete objects immediately, but
 uses a 'deleted' mark and don't remove them until X days (X configured
 in the settings).  I love this caution of DirectoryStorage, but I like
 all the other of RelStorage :-)

Interesting.  RelStorage is already set up to do something like that.
It splits packing into two phases.  In the first phase, it makes a
complete list of every row for the second phase to delete.  The second
phase could be delayed as long as we want.  Between the two phases, if
any database connections use objects that have been marked for deletion,
we could cancel the pack and flag a software bug.

 If you want we make any kind of test with our databases, It will be a pleasure

It would be very helpful to me if you could provide a copy of your
database for me to debug.  I'm hoping for a compressed SQL dump.

Shane
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Relstorage pack problems

2009-01-20 Thread Laurence Rowe
Shane Hathaway wrote:
 
 I should note that this KeyError occurs while trying to report on a
 KeyError.  I need to fix that.  Fortunately, the same error pops out anyway.

There's a fix for this in the Jarn branch. Note that to collect more 
interesting data it rolls back the load connection at this point, 
relying on the KeyError to cause the transaction to fail.

Laurence

___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Relstorage pack problems

2009-01-19 Thread Shane Hathaway
Santi Camps wrote:
 We've been using RelStorage 1.1.c1 in production environments over
 postgresql 8.1 for some time.   It has been working really fine, but
 yesterday we have a big problem packing a relstorage zodb mounted as a
 zope mount point.

I'm guessing that your mounted object is not attached to the root of the
mounted database.  In that situation, both RelStorage and FileStorage
will pack away practically everything in the mounted database, because
from the viewpoint of the mounted database, the disconnected object is
unreachable and thus garbage.

A quick workaround is to disable garbage collection in zope.conf; then
you can pack without losing any objects at all.  A proper fix would be
to make the mounted object reachable from the mounted database root.  I
believe the Zope 2 support for mounted databases does this
automatically, but it's possible that Zope 3 takes an unwise shortcut.

 But just one second later (22:11:32) a KeyError appears in event.log,
 and the content of the database couldn't been seen in ZMI.   We try to
 remount it, restart zope, and so on, but nothing works, so we had to
 restore a backup.
 
 That's the traceback of event.log:
 Traceback (most recent call last):
 File /usr/local/zope/lib/python/ZODB/Connection.py, line 768, in setstate
 self._setstate(obj)
 File /usr/local/zope/lib/python/ZODB/Connection.py, line 804, in _setstate
 p, serial = self._storage.load(obj._p_oid, self._version)
 File /usr/lib/python2.4/site-packages/relstorage/relstorage.py, line
 311, in load
 self._log_keyerror(oid_int, no tid found(2))
 File /usr/lib/python2.4/site-packages/relstorage/relstorage.py, line
 255, in _log_keyerror
 rows = adapter.iter_object_history(cursor, oid_int)
 File /usr/lib/python2.4/site-packages/relstorage/adapters/common.py,
 line 229, in iter_object_history
 raise KeyError(oid)
 KeyError: 7L

I should note that this KeyError occurs while trying to report on a
KeyError.  I need to fix that.  Fortunately, the same error pops out anyway.

Shane

___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev