Re: [ZODB-Dev] Increasing MAX_BUCKET_SIZE for IISet, etc

2011-01-27 Thread Matt Hamilton
Jim Fulton jim at zope.com writes:

 
 On Wed, Jan 26, 2011 at 3:15 PM, Matt Hamilton matth at
 netsight.co.uk wrote:

  So, with up to 300,000 items in some of these IISets, iterating over
  the entire set (during a Catalog query) means loading around 5,000
  objects over ZEO from the ZODB, which adds up to quite a bit of
  latency. With quite a number of these data structures about, we can
  end up with on the order of 50,000 objects in the ZODB cache *just*
  for these IISets!
 
 Hopefully, you're not iterating over the entire tree, but still. :)

Alas we are. Or rather, alas, ZCatalog does ;) It would be great if it
didn't, but that's just the way it is. If I have 300,000 items in my
site, and every one of them is visible to someone with the 'Reader'
role, then the allowedRolesAndUsers index will have an IITreeSet
with 300,000 elements in it. Yes, we could try to optimize out that
specific case, but there are others like it too. If all of my
items have no effective or expires date, then the same happens with
the effective range index (the DateRangeIndex 'always' set).
 
  So... has anyone tried increasing the size of MAX_BUCKET_SIZE in real
  life?
 
 We have, mainly to reduce the number of conflicts.
 
  I understand that this will increase the potential for conflicts
  if the bucket/set size is larger (though in reality this probably
  can't get worse than it is, as currently the value inserted is 99%
  of the time greater than the current max value stored -- it is a
  timestamp -- so you always hit the last bucket/set in the tree).
 
 Actually, it reduces the number of unresolvable conflicts.
 Most conflicting bucket changes can be resolved, but bucket
 splits can't be and bigger buckets means fewer splits.
 
 The main tradeoff is record size.

Ahh interesting, that is good to know. I've not actually checked the 
conflict resolution code, but do bucket change conflicts actually get
resolved in some sane way, or does the transaction have to be 
retried?

Actually... that is a good point, and something I never thought
of... when you get a ConflictError in the logs (that was
resolved) does that mean that _p_resolveConflict was called and
succeeded, or does it mean that the transaction was retried
and that resolved the conflict?

  I was going to experiment with increasing the MAX_BUCKET_SIZE on an
  IISet from 120 to 1200. Doing a quick test, a pickle of an IISet of 60
  items is around 336 bytes, and one of 600 items is 1580 bytes... so
  still very much in the realms of a single disk read / network packet.
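
(As a rough cross-check of those numbers, a quick sketch like the one below
works -- exact byte counts vary with the pickle protocol, and a real ZODB
record also stores a class reference on top of the state:)

import pickle
from BTrees.IIBTree import IISet

for n in (60, 600):
    state = IISet(range(n)).__getstate__()  # the tuple of keys that gets pickled
    print n, len(pickle.dumps(state, 1))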
 
 And imagine if you use zc.zlibstorage to compress records! :)

This is Plone 3, which is Zope 2.10.11; does zc.zlibstorage work on
that, or does it need a newer ZODB? Also, unless I can sort out that
large number of small pickles being loaded, I'd imagine this would
actually slow things down.

  I'm not sure how the current MAX_BUCKET_SIZE values were determined,
  but it looks like they have been the same since the dawn of time, and
  I'm guessing they might be due a tune?
 
 Probably.
 
  It looks like I can change that constant and recompile the BTrees
  package, and it will work fine with existing IISets and just take
  effect on new sets created (i.e. clear and rebuild the catalog index).

  Anyone played with this before, or see any major flaws in my cunning plan?
 
 We have.  My long term goal is to arrange things so that you can
 specify/change limits by sub-classing the BTree classes.
 Unfortunately, that's been a long-term priority for too long.
 This could be a great narrow project for someone who's willing
 to grok the Python C APIs.

I remember you introduced me to the C API for things like this way
back in Reading at the first non-US Zope 3 sprint... I was trying to
create compressed list data structures for catalogs, but I never could
quite get rid of the memory leaks I was getting! ;) Maybe I'll be
brave and take another look.

 Changing the default sizes for the II and LL BTrees is pretty
 straightforward. We were more interested in LO (and similar) BTrees.
 For those, it's much harder to guess sizes because you generally don't
 know how big the objects will be, which is why I'd like to make it
 tunable at the application level.

Yeah, I guess that is the issue. I wonder if it would be easy for the
code to work out the total size of a bucket in bytes and then split
based upon that, or on something like 120 items or 500 kB, whichever
comes first.
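
Something like the following is what I have in mind (purely illustrative --
BTrees has no such hook today, and the names are made up):

MAX_ITEMS = 120
MAX_BYTES = 500 * 1024

def should_split(num_items, pickled_size):
    # split the bucket when either limit is reached, whichever comes first
    return num_items >= MAX_ITEMS or pickled_size >= MAX_BYTES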

Just looking at the cache on the site at the moment, we have a total of
978,355 objects in cache, of which:

312,523  IOBucket
274,025  IISet
116,136  OOBucket
114,626  IIBucket

So 83% of my cache is just those four object types.

-Matt




Re: [ZODB-Dev] Increasing MAX_BUCKET_SIZE for IISet, etc

2011-01-27 Thread Hanno Schlichting
Hi.

On Thu, Jan 27, 2011 at 9:00 AM, Matt Hamilton ma...@netsight.co.uk wrote:
 Alas we are. Or rather, alas, ZCatalog does ;) It would be great if it
 didn't, but that's just the way it is. If I have 300,000 items in my
 site, and every one of them is visible to someone with the 'Reader'
 role, then the allowedRolesAndUsers index will have an IITreeSet
 with 300,000 elements in it. Yes, we could try to optimize out that
 specific case, but there are others like it too. If all of my
 items have no effective or expires date, then the same happens with
 the effective range index (the DateRangeIndex 'always' set).

You are using queryplan in the site, right? The most typical catalog
query for Plone consists of something like ('allowedRolesAndUsers',
'effectiveRange', 'path', 'sort_on'). Without queryplan you indeed
load the entire tree (or trees inside allowedRolesAndUsers) for each
of these indexes.

With queryplan it knows from prior executions that the set returned by
the path index is the smallest, so it calculates that one first. Then it
uses this small set (usually 10-100 items per folder) to look inside
the other indexes: it only needs to do an intersection of the
small path set with each of the trees. If the path set has fewer than
1000 items, it won't even use the normal intersection function from
the BTrees module, but uses the optimized Cython-based version from
queryplan, which essentially does a for-in loop over the path set.
Depending on the size ratio between the sets this is up to 20 times
faster with in-memory data, and even more so if it avoids database
loads. In the worst case you would load a number of buckets equal to
the length of the path set; usually you load a lot less.
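
Roughly, the small-set path does something like this (a simplified Python
sketch of what the Cython code does; the names are illustrative):

def small_intersection(small_set, big_tree):
    # walk the small result (e.g. the path index hits) and probe the big
    # IITreeSet; each containment check only touches a few buckets
    result = []
    for docid in small_set:
        if docid in big_tree:
            result.append(docid)
    return result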

We have large Plone sites in the same range of several hundred thousand
items, and with queryplan and blobs we can run them with ZODB cache sizes
of less than 100,000 items and memory usage of 500 MB per single-threaded
process.

Of course it would still be really good to optimize the underlying
data structures, but queryplan should help make this less urgent.

 Ahh interesting, that is good to know. I've not actually checked the
 conflict resolution code, but do bucket change conflicts actually get
 resolved in some sane way, or does the transaction have to be
 retried?

Conflicts inside the same bucket can be resolved and you won't get to
see any log message for them. If you get a ConflictError in the logs,
it's one where the request is being retried.
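
For reference, the hook involved is _p_resolveConflict(); the buckets
implement it in C, but a toy Python version of the same idea looks roughly
like this (illustrative, not the BTrees code):

from persistent import Persistent

class Counter(Persistent):
    def __init__(self):
        self.value = 0

    def _p_resolveConflict(self, old_state, saved_state, new_state):
        # merge two concurrent increments relative to the common ancestor;
        # returning a state dict resolves the conflict instead of retrying
        merged = dict(saved_state)
        merged['value'] = (saved_state['value'] + new_state['value']
                           - old_state['value'])
        return merged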

 And imagine if you use zc.zlibstorage to compress records! :)

 This is Plone 3, which is Zope 2.10.11, does zc.zlibstorage work on
 that, or does it need newer ZODB?

zc.zlibstorage needs a newer ZODB version. 3.10 and up to be exact.

 Also, unless I can sort out that
 large number of small pickles being loaded, I'd imagine this would
 actually slow things down.

The Data.fs would be smaller, making it more likely to fit into the OS
disk cache. The overhead of uncompressing the data is small compared
to the cost of a disk read instead of a memory read. But it's hard to
say what exactly happens with the cache ratio in practice.
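
For whenever you do get onto ZODB 3.10+, wiring it up in Python is roughly
this (a sketch; see the zc.zlibstorage docs for the ZConfig form):

import zc.zlibstorage
from ZODB.FileStorage import FileStorage
from ZODB.DB import DB

# wrap the underlying storage; new records are compressed transparently
storage = zc.zlibstorage.ZlibStorage(FileStorage('Data.fs'))
db = DB(storage)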

Hanno


Re: [ZODB-Dev] Increasing MAX_BUCKET_SIZE for IISet, etc

2011-01-27 Thread Matt Hamilton
Hanno Schlichting hanno at hannosch.eu writes:

 You are using queryplan in the site, right? The most typical catalog
 query for Plone consists of something like ('allowedRolesAndUsers',
 'effectiveRange', 'path', 'sort_on'). Without queryplan you indeed
 load the entire tree (or trees inside allowedRolesAndUsers) for each
 of these indexes.

Yes we are using queryplan. Without it the site becomes pretty much 
unusable.
 
 With queryplan it knows from prior executions that the set returned by
 the path index is the smallest, so it calculates that one first. Then it
 uses this small set (usually 10-100 items per folder) to look inside
 the other indexes: it only needs to do an intersection of the
 small path set with each of the trees. If the path set has fewer than
 1000 items, it won't even use the normal intersection function from
 the BTrees module, but uses the optimized Cython-based version from
 queryplan, which essentially does a for-in loop over the path set.
 Depending on the size ratio between the sets this is up to 20 times
 faster with in-memory data, and even more so if it avoids database
 loads. In the worst case you would load a number of buckets equal to
 the length of the path set; usually you load a lot less.

There still seem to be instances in which the entire set is loaded. This
could be an artifact of the fact that I am clearing the ZODB cache before
each test, which seems to clear the query plan. Speaking of which, I saw
in the query plan code some hook to load a pre-defined query plan... but I
can't see exactly how you supply this plan or in what format it is. Do you
use this feature?

 We have large Plone sites in the same range of several hundred thousand
 items, and with queryplan and blobs we can run them with ZODB cache sizes
 of less than 100,000 items and memory usage of 500 MB per single-threaded
 process.
 
 Of course it would still be really good to optimize the underlying
 data structures, but queryplan should help make this less urgent.

Well, I think we are already at that point ;) There are also, I think,
other times at which the full set is loaded.

  Ahh interesting, that is good to know. I've not actually checked the
  conflict resolution code, but do bucket change conflicts actually get
  resolved in some sane way, or does the transaction have to be
  retried?
 
 Conflicts inside the same bucket can be resolved and you won't get to
 see any log message for them. If you get a ConflictError in the logs,
 it's one where the request is being retried.

Great. That was what I always thought, but I just wanted to check. So in
that case, what does it mean if I see a conflict error for an IISet? Can
they not resolve conflicts internally?

  And imagine if you use zc.zlibstorage to compress records! :)
 
  This is Plone 3, which is Zope 2.10.11, does zc.zlibstorage work on
  that, or does it need newer ZODB?
 
 zc.zlibstorage needs a newer ZODB version. 3.10 and up to be exact.
 
  Also, unless I can sort out that
  large number of small pickles being loaded, I'd imagine this would
  actually slow things down.
 
 The Data.fs would be smaller, making it more likely to fit into the OS
 disk cache. The overhead of uncompressing the data is small compared
 to the cost of a disk read instead of a memory read. But it's hard to
 say what exactly happens with the cache ratio in practice.

Yeah, if we could use it I certainly would :) I guess what I mean above is
that larger pickles would compress better, so with lots of small pickles
the compression would be less effective.

-Matt






Re: [ZODB-Dev] RelStorage pack with history-free storage results in POSKeyErrors

2011-01-27 Thread Chris Withers
On 27/01/2011 03:15, Shane Hathaway wrote:
 Okay, so I'll do:

 - e2fsck -f

Hmm, how do I e2fsck a mounted filesystem?

The MySQL filesystem could be unmounted (after I shut down MySQL!), so I 
ran e2fsck on it:

e2fsck -f /dev/sdb1
e2fsck 1.41.3 (12-Oct-2008)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/sdb1: 166/16777216 files (10.2% non-contiguous), 8244979/67107513 
blocks

 - mysqlcheck -c

mysqlcheck -c dev_packed -u root -p
Enter password:
dev_packed.new_oid             OK
dev_packed.object_ref          OK
dev_packed.object_refs_added   OK
dev_packed.object_state        OK
dev_packed.pack_object         OK

 What logs should I hunt through and what kind of things am I looking
 for?

 /var/log/messages and the like.
  Look for kernel-level errors such as
 block I/O errors, oops, and system panics.  Any such errors have a
 chance of corrupting files.  We need to rule out errors at the kernel or
 below.

None of the above, /var/log/messages is actually pretty empty for the 
last few days. If it helps, both the app server and database server are 
VMWare virtual machines...

 Next, follow the directions in relstorage/tests/README.txt to create the
 4 test databases.  Then run bin/test -p -m relstorage.  All tests
 should pass.

Okay, first problem: the tests only connect to localhost, which means I
can't exactly test as-is, since the app server is one machine and the
database server is another. However, the two machines are identical, so I
set up the buildout on the database server with the new test section added.

First up, I get the following failures:

IOError: [Errno 2] No such file or directory: 
'/var/buildout-eggs/RelStorage-1.4.0-py2.6.egg/relstorage/tests/blob/blob_connection.txt'

OSError: [Errno 2] No such file or directory: 
'/var/buildout-eggs/RelStorage-1.4.0-py2.6.egg/relstorage/tests/replicas.conf'

My guess is that these files aren't included by setuptools?

So, I checked out the 1.4.0 tag and added it as a develop egg in the 
buildout.

Now I get:

Running .HFMySQLBlobTests tests:
   Set up .HFMySQLBlobTests in 0.000 seconds.
   Running:

   Ran 70 tests with 0 failures and 0 errors in 9.903 seconds.
Running .HPMySQLBlobTests tests:
   Tear down .HFMySQLBlobTests in 0.000 seconds.
   Set up .HPMySQLBlobTests in 0.000 seconds.
   Running:

   Ran 81 tests with 0 failures and 0 errors in 10.511 seconds.
Running zope.testing.testrunner.layer.UnitTests tests:
   Tear down .HPMySQLBlobTests in 0.005 seconds.
   Set up zope.testing.testrunner.layer.UnitTests in 0.000 seconds.
   Running:
 78/269 (29.0%)

Error in test check16MObject (relstorage.tests.testmysql.HPMySQLTests)
Traceback (most recent call last):
  File "/usr/local/lib/python2.6/unittest.py", line 279, in run
    testMethod()
  File "/home/zope/relstorage_co/relstorage/tests/reltestbase.py", line 214, in check16MObject
    self._dostoreNP(oid, data=data)
  File "/var/buildout-eggs/ZODB3-3.9.6-py2.6-linux-i686.egg/ZODB/tests/StorageTestBase.py", line 202, in _dostoreNP
    return self._dostore(oid, revid, data, 1, user, description)
  File "/var/buildout-eggs/ZODB3-3.9.6-py2.6-linux-i686.egg/ZODB/tests/StorageTestBase.py", line 190, in _dostore
    r1 = self._storage.store(oid, revid, data, '', t)
  File "/home/zope/relstorage_co/relstorage/storage.py", line 565, in store
    cursor, self._batcher, oid_int, prev_tid_int, data)
  File "/home/zope/relstorage_co/relstorage/adapters/mover.py", line 453, in mysql_store_temp
    command='REPLACE',
  File "/home/zope/relstorage_co/relstorage/adapters/batch.py", line 67, in insert_into
    self.flush()
  File "/home/zope/relstorage_co/relstorage/adapters/batch.py", line 74, in flush
    self._do_inserts()
  File "/home/zope/relstorage_co/relstorage/adapters/batch.py", line 110, in _do_inserts
    self.cursor.execute(stmt, tuple(params))
  File "/var/buildout-eggs/MySQL_python-1.2.3-py2.6-linux-i686.egg/MySQLdb/cursors.py", line 174, in execute
    self.errorhandler(self, exc, value)
  File "/var/buildout-eggs/MySQL_python-1.2.3-py2.6-linux-i686.egg/MySQLdb/connections.py", line 36, in defaulterrorhandler
    raise errorclass, errorvalue
OperationalError: (1153, "Got a packet bigger than 'max_allowed_packet' bytes")

 196/269 (72.9%)

Error in test check16MObject (relstorage.tests.testmysql.HFMySQLTests)
Traceback (most recent call last):
  File "/usr/local/lib/python2.6/unittest.py", line 279, in run
    testMethod()
  File "/home/zope/relstorage_co/relstorage/tests/reltestbase.py", line 214, in check16MObject
    self._dostoreNP(oid, data=data)
  File "/var/buildout-eggs/ZODB3-3.9.6-py2.6-linux-i686.egg/ZODB/tests/StorageTestBase.py", line 202, in _dostoreNP
    return self._dostore(oid, revid, data, 1, user, 

Re: [ZODB-Dev] RelStorage and PosKey errors - is this a risky hotfix?

2011-01-27 Thread Shane Hathaway
On 01/24/2011 02:02 PM, Anton Stonor wrote:
 Now, I wonder why these pointers were deleted from the current_object
 table in the first place. My money is on packing -- and it might fit
 with the fact that we recently ran a pack that removed an unusually large
 number of transactions in a single pack (100,000+ transactions).

 But I don't know how to investigate the root cause further. Ideas?

I have meditated on this for some time now.  I mentioned I had an idea 
about packing, but I studied the design and I don't see any way my idea 
could work.  The design is such that it seems impossible that the pack 
code could produce an inconsistency between the object_state and 
current_object tables.

I have lots of other ideas now, but I don't know which to pursue.  I 
need a lot more information.  It would be helpful if you sent me your 
database to analyze.  Some possible causes:

- Have you looked for filesystem-level corruption yet?  I asked this 
before and I am waiting for an answer.

- Although there is a pack lock, that lock unfortunately gets released 
automatically if MySQL disconnects prematurely.  Therefore, it is 
possible to force RelStorage to run multiple pack operations in 
parallel, which would have unpredictable effects.  Is there any 
possibility that you accidentally ran multiple pack operations in 
parallel?  For example, maybe you have a cron job, or you were setting 
up a cron job at the time, and you started a pack while the cron job was 
running.  (Normally, any attempt to start parallel pack operations will 
just generate an error, but if MySQL disconnects in just the right way, 
you'll get a mess.)

- Every SQL database has nasty surprises.  Oracle, for example, has a 
nice 'read only' mode, but it turns out that mode works differently in 
RAC environments, leading to silent corruption.  As a result, we never 
use that feature of Oracle anymore.  Maybe MySQL has some nasty 
surprises I haven't yet discovered; maybe the MySQL-specific 
'DELETE ... USING' statement doesn't work as expected.

- Applications can accidentally cause POSKeyErrors in a variety of ways. 
  For example, persistent objects cached globally can cause 
POSKeyErrors.  Maybe Plone 4 or some add-on uses ZODB incorrectly.
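
A made-up sketch of the kind of misuse I mean (all names invented):

# Anti-pattern: caching a persistent object in a module-level global.
# The cached object stays tied to the connection that loaded it, so using
# it from other threads/requests is already wrong; and if the object is no
# longer reachable from the database root, a pack can remove its record and
# the next load of the stale reference raises a POSKeyError.
_settings_cache = None

def get_settings(portal):
    global _settings_cache
    if _settings_cache is None:
        _settings_cache = portal.portal_settings   # a Persistent object!
    return _settings_cache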

Shane


Re: [ZODB-Dev] RelStorage pack with history-free storage results in POSKeyErrors

2011-01-27 Thread Chris Withers
On 27/01/2011 10:40, Jürgen Herrmann wrote:

 i also had to up max_allowed_packet to 32M to make the tests work.

Indeed, I upped to 32M and now I get no failures:

Total: 420 tests, 0 failures, 0 errors in 59.173 seconds.

cheers,

Chris

-- 
Simplistix - Content Management, Batch Processing & Python Consulting
- http://www.simplistix.co.uk


Re: [ZODB-Dev] RelStorage pack with history-free storage results in POSKeyErrors

2011-01-27 Thread Shane Hathaway
On 01/27/2011 03:32 AM, Chris Withers wrote:
 On 27/01/2011 03:15, Shane Hathaway wrote:
 Okay, so I'll do:

 - e2fsck -f

 Hmm, how do I e2fsck a mounted filesystem?

You don't.  Don't even try. :-)

 The MySQL filesystem could be unmounted (after I shut down MySQL!), so I
 ran e2fsck on it:

 e2fsck -f /dev/sdb1
 e2fsck 1.41.3 (12-Oct-2008)
 Pass 1: Checking inodes, blocks, and sizes
 Pass 2: Checking directory structure
 Pass 3: Checking directory connectivity
 Pass 4: Checking reference counts
 Pass 5: Checking group summary information
 /dev/sdb1: 166/16777216 files (10.2% non-contiguous), 8244979/67107513
 blocks

 - mysqlcheck -c

 mysqlcheck -c dev_packed -u root -p
 Enter password:
 dev_packed.new_oid             OK
 dev_packed.object_ref          OK
 dev_packed.object_refs_added   OK
 dev_packed.object_state        OK
 dev_packed.pack_object         OK

Ok, thanks for checking.

 First up, I get the following failures:

 IOError: [Errno 2] No such file or directory:
 '/var/buildout-eggs/RelStorage-1.4.0-py2.6.egg/relstorage/tests/blob/blob_connection.txt'

 OSError: [Errno 2] No such file or directory:
 '/var/buildout-eggs/RelStorage-1.4.0-py2.6.egg/relstorage/tests/replicas.conf'

 My guess is that these files aren't included by setuptools?

No, they are included AFAICT.  Let's ignore this for now.

 This is a little weird, as I have max_allowed_packet set to 16M.

Increase max_allowed_packet to at least 32M.  16M is just a bit too low.
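
On the MySQL side that is a one-line change in the server config (file and
section may differ per setup), followed by a restart:

# e.g. /etc/mysql/my.cnf
[mysqld]
max_allowed_packet = 32M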

 Should these tests fail?

No.  I think they will all pass once you increase max_allowed_packet.

 That said, I don't think this has anything to do with the packing bug as

The point was to eliminate some of the most likely causes.  Now we'll 
move on.

 I didn't see any exceptions or, in fact, any logging or output at all
 from zodbpack, and the only other exceptions seen were the POSKeyErrors...

Hmm, you do bring up a good point: zodbpack doesn't configure the 
logging package.  It should.
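
Something as small as this at the top of the script would do (a sketch, not
the actual zodbpack code):

import logging

logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s %(levelname)s %(name)s %(message)s')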

Can you send me your database?

Shane


Re: [ZODB-Dev] RelStorage pack with history-free storage results in POSKeyErrors

2011-01-27 Thread Chris Withers
On 27/01/2011 10:47, Shane Hathaway wrote:
 - e2fsck -f

 Hmm, how do I e2fsck a mounted filesystem?

 You don't.  Don't even try. :-)

Yeah, I got that from the warning message it squealed when I tried ;-)
I was more curious about how you'd do this if you needed to (how do you 
unmount / so you can check it?) but I don't think it's the problem here...

 This is a little weird, as I have max_allowed_packet set to 16M.

 Increase max_allowed_packet to at least 32M.  16M is just a bit too low.

Done, and tests all pass now.

 I didn't see any exceptions or, in fact, any logging or output at all
 from zodbpack, and the only other exceptions seen were the POSKeyErrors...

 Hmm, you do bring up a good point: zodbpack doesn't configure the
 logging package.  It should.

Yep ;-) ZConfig would be great so I can plug in a mailinglogger...
It would also be *really* handy if zodbpack could run off a normal 
zope.conf for both logging and storage config (although, in my case, I'd 
then need to be able to specify which storage to pack, as I want to avoid 
packing one of them!)

 Can you send me your database?

In a word, no :-(
It's 4GB+ in size (down from 26GB-ish in FileStorage!) and contains 
loads of confidential customer data.

However, more than happy to poke, just tell me where and for what...

cheers,

Chris

-- 
Simplistix - Content Management, Batch Processing & Python Consulting
- http://www.simplistix.co.uk


[ZODB-Dev] question about object invalidation after ReadConflictErrors

2011-01-27 Thread Jürgen Herrmann

 hi there!

 i wrongly posted a bug to the zodb bugtracker on tuesday:
 https://bugs.launchpad.net/zodb/+bug/707332

 as it turns out, that bug report was moot. it didn't fix my
 problem. out of curiosity, why does the exception raised inside
 the loop effectively break the loop after the first object is
 invalidated? my debug code showed that sometimes ~20 objects
 were in _readCurrent. was this a side effect of the relstorage
 bug involved? or can this happen more frequently? if so, why
 not invalidate all objects in _readCurrent before re-raising
 the ConflictError?

 best regards,
 jürgen
-- 
 XLhost.de ® - Webspace von supersmall bis eXtra Large 

 XLhost.de GmbH
 Jürgen Herrmann, Geschäftsführer
 Boelckestrasse 21, 93051 Regensburg, Germany

 Geschäftsführer: Jürgen Herrmann
 Registriert unter: HRB9918
 Umsatzsteuer-Identifikationsnummer: DE245931218

 Fon:  +49 (0)800 XLHOSTDE [0800 95467833]
 Fax:  +49 (0)800 95467830
 Web:  http://www.XLhost.de


Re: [ZODB-Dev] RelStorage and PosKey errors - is this a risky hotfix?

2011-01-27 Thread Anton Stonor
Hi Shane,

Thanks for pursuing this.

I have lots of other ideas now, but I don't know which to pursue.  I need a
 lot more information.  It would be helpful if you sent me your database to
 analyze.  Some possible causes:

 - Have you looked for filesystem-level corruption yet?  I asked this before
 and I am waiting for an answer.


Yep, I've checked for filesystem consistency and MySQL consistency,
without any errors reported.



 - Although there is a pack lock, that lock unfortunately gets released
 automatically if MySQL disconnects prematurely.  Therefore, it is possible
 to force RelStorage to run multiple pack operations in parallel, which would
 have unpredictable effects.  Is there any possibility that you accidentally
 ran multiple pack operations in parallel?  For example, maybe you have a
 cron job, or you were setting up a cron job at the time, and you started a
 pack while the cron job was running.  (Normally, any attempt to start
 parallel pack operations will just generate an error, but if MySQL
 disconnects in just the right way, you'll get a mess.)


That's not unlikely! I've actually seen traces of packing invoked TTW;
however, the cron job uses zodbpack. I will try to figure out whether the
POSKeyErrors started to surface right after that.


 - Every SQL database has nasty surprises.  Oracle, for example, has a nice
 read only mode, but it turns out that mode works differently in RAC
 environments, leading to silent corruption.  As a result, we never use that
 feature of Oracle anymore.  Maybe MySQL has some nasty surprises I haven't
 yet discovered; maybe the MySQL-specific delete using statement doesn't
 work as expected.


That could also be the case. In fact, we have also seen MySQL locking up
for longer than expected, but that's another story.



 - Applications can accidentally cause POSKeyErrors in a variety of ways.
  For example, persistent objects cached globally can cause POSKeyErrors.
  Maybe Plone 4 or some add-on uses ZODB incorrectly.


I was not aware of that.

The next step here would probably be to inspect the log files further and
grab a copy of the database from before the POSKeyErrors started to appear
and see if it is possible to recreate the incident.

Again, thanks.

Anton


Re: [ZODB-Dev] question about object invalidation after ReadConflictErrors

2011-01-27 Thread Shane Hathaway
On 01/27/2011 03:59 AM, Jürgen Herrmann wrote:

   hi there!

   i wrongly posted a bug to the zodb bugtracker on tuesday:
   https://bugs.launchpad.net/zodb/+bug/707332

   as it turns out, that bug report was moot. it didn't fix my
   problem. out of curiosity, why does the exception raised inside
   the loop effectively break the loop after the first object is
   invalidated? my debug code showed that sometimes ~20 objects
   were in _readCurrent. was this a side effect of the relstorage
   bug involved? or can this happen more frequently? if so, why
   not invalidate all objects in _readCurrent before re-raising
   the ConflictError?

That one invalidation is probably some kind of optimization.  I don't 
think it's necessary.  The real invalidation happens later, at a 
transaction boundary, when _flush_invalidations() is called.

Shane


Re: [ZODB-Dev] Increasing MAX_BUCKET_SIZE for IISet, etc

2011-01-27 Thread Hanno Schlichting
On Thu, Jan 27, 2011 at 11:09 AM, Matt Hamilton ma...@netsight.co.uk wrote:
 Hanno Schlichting hanno at hannosch.eu writes:

 There still seem to be instances in which the entire set is loaded. This
 could be an artifact of the fact that I am clearing the ZODB cache before
 each test, which seems to clear the query plan.

Yes. The queryplan is stored in a volatile attribute, so clearing the
zodb cache will throw away the plan. The queryplan version integrated
into Zope 2.13 stores the plan in a module global with thread locks
around it.
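
That is just how volatile (_v_) attributes behave: they live only on the
in-memory object and vanish when it is deactivated or reloaded.
Schematically (illustrative code, not the actual queryplan implementation):

from persistent import Persistent

class IndexWrapper(Persistent):
    def _compute_plan(self):
        return {}  # stand-in for the real benchmarking logic

    def plan(self):
        # _v_ attributes are never written to the database; ghosting this
        # object or flushing the ZODB cache silently discards the cached plan
        cached = getattr(self, '_v_plan', None)
        if cached is None:
            cached = self._v_plan = self._compute_plan()
        return cached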

 Speaking of
 which I saw in the query plan code, some hook to load a pre-defined query
 plan... but I can't see exactly how you supply this plan or in what format
 it is. Do you use this feature?

You get a plan representation by calling:

http://localhost:8080/Plone/@@catalogqueryplan-prioritymap

Then add an environment variable pointing to a variable inside a module:

[instance]
recipe = plone.recipe.zope2instance
environment-vars =
    CATALOGQUERYPLAN my.customer.module.queryplan

Create that module and put the dump in it. It should start with something like:

# query plan dumped at 'Mon May 24 01:33:28 2010'

queryplan = {
    '/Plone/portal_catalog': {
        ...
    },
}

You can keep updating this plan with some new data from the dump once
in a while.

Ideally this plan should be persisted in the database at certain
intervals, but we haven't implemented that yet. You don't want to
persist the plan in every request doing a catalog query.

Hanno


Re: [ZODB-Dev] RelStorage pack with history-free storage results in POSKeyErrors

2011-01-27 Thread Shane Hathaway
On 01/27/2011 03:57 AM, Chris Withers wrote:
 It would also be *really* handy if zodbpack could run off a normal
 zope.conf for both logging and storage config (although, in my case, I'd
 then need to be able to specify which storage to pack, I want to avoid
 packing one of them!)

Please note that the zodbpack utility is simple, short, and has no extra 
dependencies.  I prefer to keep it that way.  If you want to do 
something more interesting, please fork zodbpack.

 However, more than happy to poke, just tell me where and for what...

Is the problem repeatable?  If you start with the same database twice 
and pack, do you end up with POSKeyErrors on the same OIDs?  I know 
that's probably a long test to run, but I'm not yet sure what else to 
suggest.

Shane