Re: [ZODB-Dev] ZODB 3.9

2009-04-12 Thread Dieter Maurer
Hanno Schlichting wrote at 2009-4-11 14:43 +0200:
 ...
ZODB 3.9 removed a bunch of deprecated APIs. Look at
http://pypi.python.org/pypi/ZODB3/3.9.0a12#change-history to see how
much changed in this version.

The main things were related to "Versions are no longer supported",
which changed some low-level API used in quite a number of places and
meant that some of the stuff in Products.OFSP couldn't possibly work
anymore.

Hopefully, a ZODB 3.9 ZEO server is still able to speak with pre-3.9
ClientStorage instances...



-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] 'PersistentReference' object has no attribute '_p_jar'

2009-04-01 Thread Dieter Maurer
Dominique Lederer wrote at 2009-3-30 11:15 +0200:
I am using ZODB 3.8.1 with Relstorage 1.1.3 on Postgres 8.1

Frequently i am getting messages like:

Unexpected error
Traceback (most recent call last):
  File
/home/zope/zope_script/eggs/ZODB3-3.8.1-py2.4-linux-x86_64.egg/ZODB/ConflictResolution.py,
line 207, in tryToResolveConflict
resolved = resolve(old, committed, newstate)
  File
/home/zope/zope_script/eggs/zope.app.keyreference-3.4.1-py2.4.egg/zope/app/keyreference/persistent.py,
line 55, in __cmp__
return cmp(
AttributeError: 'PersistentReference' object has no attribute '_p_jar'

Your traceback looks strange. It is missing a resolve line.

The bug is probably in this resolve function (missing in the traceback).
The state handed down to the resolve function does not contain
true persistent objects; instead, they are replaced by PersistentReferences
with almost no functionality. I think the only thing one can do
with them is compare them for equality.
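The failure mode can be illustrated with a small sketch (hypothetical class and function names, not the real ZODB implementation): a stand-in placeholder that supports equality and nothing else, and a resolver that respects this limitation.

```python
class PersistentReference:
    """Stand-in for ZODB's conflict-resolution placeholder (a sketch,
    not the real class): it supports equality comparison and nothing else."""

    def __init__(self, oid):
        self._oid = oid

    def __eq__(self, other):
        return (isinstance(other, PersistentReference)
                and self._oid == other._oid)

    def __getattr__(self, name):
        # Anything beyond equality (e.g. _p_jar) fails, as in the traceback.
        raise AttributeError("%r has no attribute %r"
                             % (type(self).__name__, name))


def safe_resolve(old, committed, new):
    """Hypothetical resolver: it may only *compare* references for
    equality, never dereference them or touch their attributes."""
    if committed == new:
        return committed
    raise ValueError("cannot resolve")
```

The zope.app.keyreference __cmp__ in the traceback instead accessed _p_jar on such a placeholder, which is exactly what a conflict resolver must not do.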



-- 
Dieter


Re: [ZODB-Dev] upload a file of 6 MB with the function manage_upload

2009-04-01 Thread Dieter Maurer
Sandra wrote at 2009-4-1 12:17 +:
 ...
def manage_upload(self,file='',REQUEST=None):
 ...
 in python/OFS/image.py. But my Programm run without end.

Zope is not very efficient at uploading large files.
Thus, it may take some time -- but it should work.

Am I making some mistake?

You should be able to upload even large files with manage_upload.

But as you did not show how you have used manage_upload,
there might be a problem in that usage (though this is not very likely).



-- 
Dieter


Re: [ZODB-Dev] problem with _p_mtime

2008-12-06 Thread Dieter Maurer
Miles Waller wrote at 2008-12-4 19:42 +:
fstest - no problems
checkbtrees - no problems

fsrefs - returns errors about invalid objects (and reports all objects 
as last updated: 5076-10-09 17:19:26.809896!), and finally fails with a 
KeyError

Traceback (most recent call last):
  File /usr/local/Zope-2.9.8/bin/fsrefs.py, line 157, in ?
main(path)
  File /usr/local/Zope-2.9.8/bin/fsrefs.py, line 130, in main
refs = get_refs(data)
  File /usr/local/Zope-2.9.8/lib/python/ZODB/serialize.py, line 687, 
in get_refs
data = oid_klass_loaders[reference_type](*args)
KeyError: 'n'

This indicates that fsrefs does not understand the data.
There are several possible causes:

  *  fsrefs does not have the correct version

  *  fsrefs has a bug

  *  your storage is damaged.

As you have reported that the storage content could be successfully
exported, damage is not that likely (the export should have had the
same problem in that case).


I think I can see some corruption in the oids of the referenced objects 
as they show as:
\x00\x00\x00\x00\x00\x11'@
\x00\x00\x00\x00\x00#\xd4\xa9
\x00\x00\x00\x00\x00\x11'*
etc... - I wasn't expecting to see [EMAIL PROTECTED] and friends.

This does not indicate any corruption: the oids are treated as
8-byte binary strings. If a byte has a printable representation,
that representation is used when printing; otherwise its hex escape is used.
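A quick illustration (plain Python, nothing ZODB-specific):

```python
# Each oid is an 8-byte binary string; repr() shows a byte with a
# printable representation as that character and everything else as
# a \xNN hex escape.
oid = b"\x00\x00\x00\x00\x00\x11\x27\x40"  # last two bytes: 0x27 ("'"), 0x40 ("@")
shown = repr(oid)
# The trailing '@ in the printed form is just the printable rendering
# of 0x27 and 0x40 -- not corruption.
```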

For example, fsrefs reports not being able to find 
'\x00\x00\x00\x00\x00#\xd4'.  However, I can load the database at the 
zopectl prompt and load objects, and get ob._p_oid to report 
'\x00\x00\x00\x00\x00#\xd4'.

Looks like an fsrefs bug.

If you can load an object from the storage, fsrefs should not report
it as missing.

I also wondered if the first few bytes of the database could have been 
cut off

This is unlikely.
The first bytes contain a magic number (identifying the storage format).
I think a FileStorage would not open at all if the magic number were
unrecognizable.



-- 
Dieter


Re: [ZODB-Dev] Broken instances after refactoring in ZODB

2008-10-06 Thread Dieter Maurer
Leonardo Santagada wrote at 2008-10-4 16:42 -0300:
 ...
Why doesn't ZODB have a table of some form for this info?

You can implement one -- if you think this is worth the effort.

The ZODB has a hook, classFactory(connection, modulename, globalname),
on the DB class. It is responsible for mapping the pair
(modulename, globalname) to a class.
Its default calls ZODB.broken.find_global, but Zope redefines it
to Zope2.App.ClassFactory.ClassFactory.
You can redefine it further -- if you like.
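A redefinition could look roughly like this (a sketch with a hypothetical rename table and module names; only the hook signature is taken from the text above):

```python
import importlib

# Hypothetical rename table: (old_module, old_name) -> (new_module, new_name).
# The example entries are illustrative, not real ZODB classes.
RENAMES = {
    ("myapp.oldmodule", "Document"): ("collections", "OrderedDict"),
}


def renaming_class_factory(connection, modulename, globalname):
    """Sketch of a classFactory replacement that follows recorded
    renames before falling back to a normal import."""
    modulename, globalname = RENAMES.get(
        (modulename, globalname), (modulename, globalname))
    module = importlib.import_module(modulename)
    return getattr(module, globalname)

# Hypothetical wiring onto an open database object:
# db.classFactory = renaming_class_factory
```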

I heard that  
sometimes for very small objects the string containing this  
information can use up to 30% of the whole space of the file (using  
FileStorage). How does RelStorage store this?

The same way.



-- 
Dieter


Re: [ZODB-Dev] What's best to do when there is a failure in the second phase of 2-phase commit on a storage server

2008-10-03 Thread Dieter Maurer
Jim Fulton wrote at 2008-10-1 13:40 -0400:
 ...
 It may well be that a restart *may* not lead into a fully functional
 state (though this would indicate a storage bug)

A failure in tpc_finish already indicates a storage bug.

Maybe -- although "file system is full" might not be so easy to avoid
in all cases



-- 
Dieter


Re: [ZODB-Dev] What's best to do when there is a failure in the second phase of 2-phase commit on a storage server

2008-10-03 Thread Dieter Maurer
Christian Theune wrote at 2008-10-3 10:32 +0200:
On Fri, 2008-10-03 at 09:55 +0200, Dieter Maurer wrote:
 Jim Fulton wrote at 2008-10-1 13:40 -0400:
  ...
  It may well be that a restart *may* not lead into a fully functional
  state (though this would indicate a storage bug)
 
 A failure in tpc_finish already indicates a storage bug.
 
 Maybe -- although "file system is full" might not be so easy to avoid
 in all cases

That should be easy to avoid by allocating the space you need in the
first phase and either release it on an abort or write your 'committed'
marker into it in the second phase.

That's true for a FileStorage -- but it may not be as easy for
other storages (e.g. the BSDDB storage).



-- 
Dieter


Re: [ZODB-Dev] What's best to do when there is a failure in the second phase of 2-phase commit on a storage server

2008-10-01 Thread Dieter Maurer
Jim Fulton wrote at 2008-9-30 18:30 -0400:
 ...
  c. Close the file storage, causing subsequent reads and writes to
 fail.

 Raise an easily recognizable exception.

I raise the original exception.

Sad.

The original exception may have many causes -- most probably
harmless. The special exception would express that the consequence was
very harmful.

 In our error handling we look out for some nasty exceptions and  
 enforce
 a restart in such cases. The exception above might be such a nasty
 exception.

The critical log entry should be easy enough to spot.

For humans, but I had in mind that software recognizes the exception
automatically and forces a restart.

Or do you have a logger customization in mind that intercepts the
log entry and then forces a restart?

It may not be trivial to get this right (in a way such that
the log entry appears in the logfile before the restart starts).

...
 - Have a storage server restart when a tpc_finish call fails.  This
 would work fine for FileStorage, but might be the wrong thing to do
 for another storage.  The server can't know.

 Why do you think that a failing tpc_finish is less critical
 for some other kind of storage?


It's not a question of criticality.  It's a question of whether a  
restart will fix the problem.  I happen to know that a file storage  
would be in a reasonable state after a restart.  I don't know this to  
be the case for some other storage.

But what should an administrator do when this is not the case?
Either a stop or a restart.

It may well be that a restart *may* not lead to a fully functional
state (though this would indicate a storage bug), but a definitely
non-working system is not much better than one that may potentially not
be fully functional but usually will be, apart from storage bugs.



-- 
Dieter


Re: [ZODB-Dev] 3.8.1b8 released and would like to release 3.8.1 soon

2008-09-30 Thread Dieter Maurer
Wichert Akkerman wrote at 2008-9-24 09:44 +0200:
Jim Fulton wrote:
 I'd appreciate it if people would try it out soon.


I can say that the combination of 3.8.1b8 and Dieter's 
zodb-cache-size-bytes patch does not seem to work. With 
zodb-cache-size-bytes set to 1 gigabyte on an instance with a single 
thread and using RelStorage, Zope capped its memory usage at 200 MB.

I can see two potential reasons (besides a bug in my implementation):

 *  you have not used a very large object count.

The tighter of the two restrictions (count or size) limits what can be
in the cache. With a small object count, the count limit will be tighter
than the byte-size limit.

 *  the size is only estimated -- not exact.

The pickle size is used as a size approximation.

I would be surprised, however, if the pickle size were five times
larger than the real size.



-- 
Dieter


Re: [ZODB-Dev] RFC: Reimplementing Pickle Cache in Python

2008-09-17 Thread Dieter Maurer
Tres Seaver wrote at 2008-9-12 06:35 -0400:
 ...
Reimplementing Pickle Cache in Python
=====================================
 ...
from zope.interface import Attribute
from zope.interface import Interface
class IPickleCache(Interface):
    """API of the cache for a ZODB connection."""

 ...

Which method moves an object to the front of the ring?
Or do you use an inline expansion for speed reasons?


-- 
Dieter


Re: [ZODB-Dev] Experiences with Relstorage performance for setups with heavy writes

2008-09-17 Thread Dieter Maurer
Andreas Jung wrote at 2008-9-12 10:31 +0200:
anyone having experience with the performance of RelStorage on Zope
installations with heavy parallel writes (which are often a bottleneck)?
Does RelStorage provide any significant advantages over ZEO?

As RelStorage emulates FileStorage behaviour for writes/commits,
using the same storage-global lock, you should not see a significant
change. Maybe writing the temporary commit log file is avoided.



-- 
Dieter


Re: [ZODB-Dev] Zope memory usage

2008-09-17 Thread Dieter Maurer
Izak Burger wrote at 2008-9-17 12:10 +0200:
I'm sure this question has been asked before, but it drives me nuts so I 
figured I'll ask again. This is a problem that has been bugging me for 
ages. Why does zope memory use never decrease? Okay, I've seen it 
decrease maybe by a couple megabyte, but never by much. It seems the 
general way to run zope is to put in some kind of monitoring, and 
restart it when memory goes out of bounds. In general it always uses 
more and more RAM until the host starts paging to disk. This sort of 
baby-sitting just seems wrong to me.

This is standard behaviour for long-running processes on
a system without memory compaction:

  It is almost a consequence of the entropy-increase theorem.
  Memory tends to fragment over time.
  Some memory requests cannot be satisfied by the existing fragments
  (because the individual fragments are not large enough and compaction
  is not available), and therefore a new large block is requested
  from the operating system.
  
It doesn't seem to make any difference if you set the cache-size to a 
smaller number of objects or use a different number of threads. Over 
time things always go from good to bad and then on to worse. I have only 
two theories: a memory leak, or an issue with garbage collection (python 
side).

The lack of compaction, together with weaknesses in *nix memory
management: *nix essentially provides mmap and brk; mmap is not adequate
for large numbers of small memory requests, and brk can only
allocate/release at the heap boundary.



-- 
Dieter


Re: [ZODB-Dev] BTree pickle size

2008-08-28 Thread Dieter Maurer
Roché Compaan wrote at 2008-8-25 17:36 +0200:
On Sun, 2008-08-24 at 08:55 +0200, Roché Compaan wrote:
 Thanks for the feedback. I'll re-run the tests without any text indexes,
 as well as run it with other implementations such as TextIndexNG3 and
 SimpleTextIndex and compare the results.
 

Some more tests show that text indexes aren't the worst offenders. Date
and DateRangeIndexes use IISet in cases where IITreeSet seems more
appropriate. To me there isn't much more value in investigating other text
index implementations. I'd rather spend the time comparing the overall
results with other indexing implementations altogether, like Solr or
indexing in an RDBMS.

Listed below are some stats (where I ran my original test in which I
create 1 documents) that compare an unmodified setup, a catalog
without text indexes, a catalog without date indexes, a catalog without
metadata and no catalog at all.

Total size of default setup:  2569.97 MB
Total size excluding text indexes:1963.89 MB

This means text indexes cost about 600 MB (25 %).

Total size excluding date range indexes:  2043.26 MB

This means range indexes cost about 500 MB.

You may consider a Managable RangeIndex instead of the standard
range indexes.

With Managable RangeIndex a DateRangeIndex is implemented
as a RangeIndex with data type DateInteger or DateTimeInteger.


If you also use dm.incrementalsearch with Products.AdvancedQuery,
then you can replace the (expensive, both in terms of storage
and runtime) range indexes by incremental filtering --
which may not only save lots of space but can also
give dramatic speed improvements.



-- 
Dieter


Re: [ZODB-Dev] BTree pickle size

2008-08-28 Thread Dieter Maurer
Roché Compaan wrote at 2008-8-24 14:00 +0200:
This is the fsdump output for a single IOBTree:

  data #00032 oid=1bac size=5435 class=BTrees._IOBTree.IOBTree

What is persisted as part of the 5435 bytes? References to containing
buckets? What else?

For optimization reasons,
an IOBTree can in fact essentially be an IOBucket (in the case of a small
tree consisting of a single bucket).

This means that the IOBTree above can in fact contain
up to 60 integers with corresponding values (Python objects).



-- 
Dieter


Re: [ZODB-Dev] BTree pickle size

2008-08-24 Thread Dieter Maurer
Roché Compaan wrote at 2008-8-23 19:31 +0200:
On Sat, 2008-08-23 at 14:09 +0200, Dieter Maurer wrote:
 Roché Compaan wrote at 2008-8-22 14:49 +0200:
 I've been doing some benchmarks on Plone and got some surprising stats
 on the pickle size of btrees and their buckets that are persisted with
 each transaction. Surprising in the sense that they are very big in
 relation to the actual data indexed. I would appreciate it if somebody
 can help me understand what is going on, or just take a look to see if
 the sizes look normal.
 
 In the benchmark I add and index 1 ATDocuments. I commit after each
 document to simulate a transaction per request environment. Each
document has a 100 byte long description and 100 bytes in its body. The
 total transaction size however is 40K in the beginning. The transaction
 sizes grow linearly to about 350K when reaching 1 documents.
 
 The Bucket nodes store usually between 22 (OOBucket) and 90 (IIBucket)
 objects in a single bucket.
 
 With any change, the transaction will contain unmodified data
 for several dozens other objects.

Are you saying *all* 22 OOBuckets and 90 IIBuckets will be persisted
again whether they are modified or not?

I did not speak of 22 OOBuckets but of typically 22 entries in an
OOBucket (similarly for IIBucket).

And indeed, when a single entry in an OOBucket is changed, all
entries are rewritten even if the other entries did not change.

That is because the ZODB load/store granularity is the persistent object
(excluding persistent subobjects). An OOBucket is a persistent object --
it is always loaded/stored as a whole (all entries together).


-- 
Dieter


Re: [ZODB-Dev] BTree pickle size

2008-08-24 Thread Dieter Maurer
Dieter Maurer wrote at 2008-8-23 14:09 +0200:
 ...
A typical IISet contains 90 value records and a persistent reference.

I expect that an integer is pickled in 5 bytes. Thus, about 0.5 kB
should be expected as typical size of an IISet.
Your IISet instances seem to be about 1.5 kB large.

That is significantly larger than I would expect but maybe not
yet something to worry about.

The larger-than-expected size probably results from using IISet
at a place where IITreeSet would have been better.



-- 
Dieter


Re: [ZODB-Dev] BTree pickle size

2008-08-23 Thread Dieter Maurer
Roché Compaan wrote at 2008-8-22 14:49 +0200:
I've been doing some benchmarks on Plone and got some surprising stats
on the pickle size of btrees and their buckets that are persisted with
each transaction. Surprising in the sense that they are very big in
relation to the actual data indexed. I would appreciate it if somebody
can help me understand what is going on, or just take a look to see if
the sizes look normal.

In the benchmark I add and index 1 ATDocuments. I commit after each
document to simulate a transaction per request environment. Each
document has a 100 byte long description and 100 bytes in its body. The
total transaction size however is 40K in the beginning. The transaction
sizes grow linearly to about 350K when reaching 1 documents.

The Bucket nodes store usually between 22 (OOBucket) and 90 (IIBucket)
objects in a single bucket.

With any change, the transaction will contain unmodified data
for several dozens other objects.

What concerns me is that the footprint of indexed data in terms of
BTrees, Buckets and Sets are huge! The total amount of data committed
that related directly to ATDocument is around 30 Mbyte. The total for
BTrees, Buckets and IISets is more than 2 Gbyte. Even taking into
account that Plone has a lot of catalog indexes and metadata columns (I
think 71 in total), this seems very high. 

This is a summary of total data committed per class:

Classname,Object Count,Total Size (Kbytes)
BTrees._IIBTree.IISet,640686,1024506

A typical IISet contains 90 value records and a persistent reference.

I expect that an integer is pickled in 5 bytes. Thus, about 0.5 kB
should be expected as typical size of an IISet.
Your IISet instances seem to be about 1.5 kB large.

That is significantly larger than I would expect but maybe not
yet something to worry about.
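The 5-bytes-per-integer estimate is easy to check with plain pickle (protocol 1; values chosen large enough that each needs a 4-byte BININT plus its opcode byte, as catalog document ids typically do):

```python
import pickle

# 90 "large" integers, matching the typical record count of an IISet.
values = tuple(range(10**6, 10**6 + 90))
size = len(pickle.dumps(values, protocol=1))
per_value = size / len(values)
# per_value comes out close to 5 bytes, plus a little tuple overhead --
# consistent with the ~0.5 kB estimate for a 90-entry IISet.
```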


 ...
BTrees._IIBTree.IIBucket,252121,163524

The same size reasoning applies to IIBuckets: 90 records, but
now consisting of key and value (about 10 bytes per record).

Your IIBuckets are smaller than one would expect.



-- 
Dieter


Re: [ZODB-Dev] Runaway cache size

2008-08-10 Thread Dieter Maurer
[EMAIL PROTECTED] wrote at 2008-7-31 15:09 -0400:
 ...
 I don't have experience with running the db in readonly mode in 
 production.

There is no difference in cache handling between readonly and readwrite
mode.

An old thread explains why this (no-difference) is necessary.



-- 
Dieter


Re: [ZODB-Dev] zodb does not save transaction

2008-05-29 Thread Dieter Maurer
tsmiller wrote at 2008-5-28 19:55 -0700:
 ...
I have a bookstore that uses the ZODB as its storage.  It uses qooxdoo as
the client and CherryPy for the server.  The server has a 'saveBookById'
routine that works 'most' of the time.  However, sometimes the
transaction.commit() does NOT commit the changes and when I restart my
server the changes are lost.  

This looks like a persistency bug.

Persistency bugs of this kind happen when a nonpersistent mutable
instance is modified in place without the containing
persistent object being told about the change.

Then the change is persisted only accidentally (together with
another change of the containing persistent object).
However, the change is seen inside the connection that made
it -- until the containing persistent object is flushed
from the ZODB cache (or a restart happens, of course).
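The rule can be sketched as follows (FakePersistent is a stand-in for persistent.Persistent; only the _p_changed convention is modeled, and the Book class is hypothetical):

```python
class FakePersistent:
    """Minimal stand-in for persistent.Persistent: it only models the
    _p_changed flag the real class uses to mark an object dirty."""
    _p_changed = False


class Book(FakePersistent):
    def __init__(self):
        self.tags = []  # plain list: NOT persistence-aware

    def add_tag_buggy(self, tag):
        self.tags.append(tag)  # in-place change; ZODB never notices

    def add_tag_correct(self, tag):
        self.tags.append(tag)
        self._p_changed = True  # tell ZODB this object must be re-stored
```

The alternative fix is to use a persistence-aware container (such as PersistentList from the persistent package) instead of a plain list.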



-- 
Dieter


Re: [ZODB-Dev] Shared/DC/ZRDB/TM.py:_register

2008-05-23 Thread Dieter Maurer
Vincent Pelletier wrote at 2008-5-22 11:21 +0200:
 ...
BTW, the usual error hook treats conflict error exceptions differently from 
others, and I guess it was done so because those can happen in TPC.

No, the reason is to repeat a transaction that failed due to
a ConflictError.



-- 
Dieter


Re: [ZODB-Dev] Shared/DC/ZRDB/TM.py:_register

2008-05-13 Thread Dieter Maurer
Andreas Jung wrote at 2008-5-13 20:19 +0200:
 ...
 Shared.DC.ZRDB.TM.TM is the standard Zope[2] way to implement a
 ZODB DataManager.

Nowadays you create a datamanager implementing IDataManager and join it 
with the current transaction. Shared.DC.ZRDB.TM.TM is pretty much 
old-old-old-style.

Time to change the Zope 2 code base ;-)

There, you still find the old way -- and it is used by other Zope 2
components.



-- 
Dieter


Re: [ZODB-Dev] Multiple databases / mount points : documentation?

2008-04-09 Thread Dieter Maurer
Vincent Rioux wrote at 2008-4-9 11:58 +0200:
I am using zodb FileStorage for a standalone application and looking for 
some advices, tutorials or descriptions for using a zodb made of an 
aggregation of smaller ones.
I have been told that the mount mechanism should make the trick. Any 
pointers are welcome...

Mounts are a Zope concept.

You find a very terse description in Zope's configuration
schema (Zope2/Startup/zopeschema.xml); look for mount-point.

Once you have configured a zodb_db section with one or more mount
points, you can create mount objects in your storage
via the ZMI.



-- 
Dieter


Re: [ZODB-Dev] Analyzing a ZODB.

2008-04-06 Thread Dieter Maurer
Manuel Vazquez Acosta wrote at 2008-4-5 11:49 -0400:
 ...
I wonder if there's a way to actually see what objects (or object types)
are modified by those transactions. So I can go directly to the source
of the (surely innecesary) transaction.

The ZODB utility fsdump generates a human-readable
view of your storage.

Among other things, you see which transactions have been committed
(together with all transaction metadata) and which objects were
modified (identified by their oid and, I think, their class).



-- 
Dieter


Re: [ZODB-Dev] Re: ZODB Benchmarks

2008-03-25 Thread Dieter Maurer
Benji York wrote at 2008-3-25 09:40 -0400:
Christian Theune wrote:
 I talked to Brian Aker (MySQL guy) two weeks ago and he proposed that we
 should look into a technique called `group commit` to get rid of the commit
 contention.
 ...
Summary: fsync is slow (and the cornerstone of most commit steps), so 
try to gather up a small batch of commits to do all at once (with only 
one call to fsync).

Our commit contention is definitely not caused by fsync.
Our fsync is quite fast. If only fsync needed to be considered,
we could easily process at least 1,000 transactions per second -- yet
even at 10 transactions per second we get contention a few times
per week.



We do not yet know precisely the cause of our commit contentions.
Almost surely there are several causes that all can lead to contention.

We already found:

  *  client side causes (while the client holds the commit lock)

- garbage collections (which can block a client on the order of
  10 to 20 s)

- NFS operations (which can take up to 27 s in our setup -- for
  still unknown reasons)

- invalidation processing, especially ZEO ClientCache processing

  *  server side causes

- commit lock held during the copy phase of pack

- IO thrashing during the reachability analysis in pack

- non-deterministic server side IO anomalies
  (IO suddenly takes several times longer than usual -- for still
  unknown reasons)
 Somewhat like Nagle's algorithm, but for fsync.

The kicker is that OSs and hardware often lie about fsync (and it's 
therefore fast) and good hardware (disk arrays with battery backed write 
cache) already make fsync pretty fast.

Not to suggest that group commit wouldn't speed things up, but it would 
seem that the technique will make the largest improvement for people 
that are using a non-lying fsync on inappropriate hardware.
-- 
Benji York
Senior Software Engineer
Zope Corporation

-- 
Dieter


Re: [ZODB-Dev] Re: ZODB Benchmarks

2008-03-25 Thread Dieter Maurer
Benji York wrote at 2008-3-25 14:24 -0400:
 ... commit contentions ...
 Almost surely there are several causes that all can lead to contention.
 
 We already found:
 
   *  client side causes (while the client holds the commit lock)
   
 - garbage collections (which can block a client in the order of
   10 to 20 s)

Interesting.  Perhaps someone might enjoy investigating turning off 
garbage collection during commits.

A reconfiguration of the garbage collector helped us with this one
(the standard configuration is not well tuned to processes with
large amounts of objects).

 
 - invalidation processing, espicially ZEO ClientCache processing

Interesting.  Not knowing much about how invalidations are handled, I'm 
curious where the slow-down is.  Do you have any more detail?

Not many:

We have a component called RequestMonitor which periodically
checks for long-running requests and logs the corresponding stack
traces.
This monitor very often sees requests (holding the commit lock)
which are in ZEO.cache.FileCache.settid.

As the monitor runs asynchronously with the observed threads,
the probability of an observation in a given function
depends on how long the thread is inside this function (total
time, i.e. visits times mean time per visit).
From this, we can conclude that significant time is spent in
settid.



-- 
Dieter


Re: [ZODB-Dev] Re: ZODB Benchmarks

2008-03-21 Thread Dieter Maurer
Chris Withers wrote at 2008-3-20 22:22 +:
Roché Compaan wrote:
 Not yet, they are very time consuming. I plan to do the same tests over
 ZEO next to determine what overhead ZEO introduces.

Remember to try introducing more app servers and see where the 
bottleneck comes ;-)

We have seen commit contention with lots (24) of ZEO clients
and a high-write-rate application (almost all requests write to
the ZODB).



-- 
Dieter


Re: [ZODB-Dev] ERROR ZODB.Connection Couldn't load state for 0x01

2008-03-10 Thread Dieter Maurer
Dylan Jay wrote at 2008-3-10 17:37 +1100:
 ...
I have a few databases being served out of a zeo. I restarted them in a 
routine operation and now I can't restart due to the following error

Any idea on how to fix this?


2008-03-10 06:29:12 ERROR ZODB.Connection Couldn't load state for 0x01
 
line 540, in load_multi_oid
 conn = self._conn.get_connection(database_name)
   File /home/zope/thebe/parts/zope2/lib/python/ZODB/Connection.py, 
line 328, in get_connection
 new_con = self._db.databases[database_name].open(
KeyError: 'edb'

Looks like a configuration problem.

Apparently, there is no database 'edb' configured (but for some
reason it is expected).



-- 
Dieter


Re: [ZODB-Dev] Re: Re: IStorageIteration

2008-02-27 Thread Dieter Maurer
Thomas Lotze wrote at 2008-2-26 09:30 +0100:
Dieter Maurer wrote:

 How often do you need it?
 Is it worth the additional index? Especially in view that a storage may
 contain a very large number of transactions?

We've done it differently now anyway, using real iterators which store
their state on the server and get garbage-collected when no longer needed.

Fine. In dm.historical, you can find an alternative:

  It uses exponentially increasing prefetching and
  loads about 2 * 2**n records to get the first 2**n records.
  This means it has amortized linear runtime complexity.
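The prefetching idea can be sketched in a few lines (this is an assumption about the approach, not dm.historical's actual code; the `fetch(offset, count)` callable is a hypothetical stand-in for loading historical records):

```python
# Batches double in size, so to yield the first 2**n records only about
# 2 * 2**n records are fetched in total: amortized linear runtime.
def prefetching_iterator(fetch, start=0):
    """fetch(offset, count) -> list of up to `count` records."""
    batch = 1
    offset = start
    while True:
        records = fetch(offset, batch)
        if not records:
            return
        for record in records:
            yield record
        offset += len(records)
        batch *= 2  # exponentially increasing prefetch
```

With a fake record source of 100 records, consuming the whole iterator issues batches of 1, 2, 4, 8, ... -- never fetching more than roughly twice the number of records actually consumed.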



-- 
Dieter


Re: [ZODB-Dev] ZEO+MultipleClients+ConflictErrors

2008-02-27 Thread Dieter Maurer
Alan Runyan wrote at 2008-2-26 13:07 -0600:
 ...
Most people come at ZODB with previous experience in RDBMS.

How do they map SQL INSERT/UPDATE activities to ZODB data structures
in a way that does not create hotspots?

I tend to view the objects in an application
as belonging to three types: 

  *  primary content objects (documents, files, images, ...)

  *  containers (folders for organisation)

  *  global auxiliary objects (internal objects used for global
 tasks, such as cataloguing)

For the primary content objects, workflow is usually appropriate
to prevent concurrent modifying access.

(Large) containers should be based on a scalable data structure
with conflict resolution (such as OOBTree). Moreover,
the ids should be chosen randomly (to ensure that concurrent insertions
are likely to be widely spread over the complete structure).
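The random-id advice can be illustrated without BTrees installed (a hedged stand-in sketch: `random_id` is a hypothetical id chooser, and a plain set replaces the OOBTree, since only the key distribution matters here):

```python
import random

# Uniformly random 64-bit hex strings land all over the key space, so
# concurrent inserters rarely touch the same bucket of an OOBTree --
# whereas sequential ids would all crowd into the rightmost bucket.
def random_id(rng):
    return "%016x" % rng.getrandbits(64)

rng = random.Random(42)          # seeded for reproducibility
ids = {random_id(rng) for _ in range(1000)}
first_chars = {i[0] for i in ids}  # spread over all 16 hex digits
```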

Most difficulties come from the global auxiliary
objects -- as they are in some way internal, not under the direct
control of the application. We are using a variant of QueueCatalog
to tackle hotspots caused by cataloging.



-- 
Dieter


Re: [ZODB-Dev] Re: IStorageIteration

2008-02-25 Thread Dieter Maurer
Thomas Lotze wrote at 2008-2-12 11:09 +0100:
 ...
 I don't think that's going to work here.  Iterating through the
 transactions in the database for each iteration is going to be totally
 non-scalable.

It seems to us that it would actually be the right thing to require that
storages have an efficient, scalable and stateless way of accessing their
transactions by ID. In the case of FileStorage, this might be achieved
using an index analogous to the one mapping object IDs to file positions.

How often do you need it?
Is the additional index worth it? Especially in view of the fact that a storage
may contain a very large number of transactions?



-- 
Dieter


Re: [ZODB-Dev] Re: ZODB Benchmarks

2008-02-07 Thread Dieter Maurer
Roché Compaan wrote at 2008-2-7 21:21 +0200:
 ...
So if I asked you to build a data structure for the ZODB that can do
insertions at a rate comparable to Postgres on high volumes, do you
think that it can be done?

If you need a high write rate, the ZODB is probably not optimal.
Ask yourself whether it is not better to put such high frequency write
data directly into a relational database.

Whenever you have large amounts of highly structured data,
a relational database is necessarily more efficient than the ZODB.



-- 
Dieter


Re: RE : [ZODB-Dev] Re: ZODB Benchmarks

2008-02-06 Thread Dieter Maurer
Mignon, Laurent wrote at 2008-2-6 08:06 +0100:
After a lot of tests and benchmarks, my feeling is that the ZODB does not seem
suitable for systems managing large amounts of data stored in a flat hierarchy.
The application that we currently develop is a business process management
system, as opposed to a content management system. In order to guarantee the
necessary performance, we decided to stop using the ZODB. All data are now
stored in a relational database.

Roché's corrected timings indicate:

  The ZODB is significantly slower than Postgres for insertions
  but comparatively fast (slightly faster) on lookups.



-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Re: ZODB Benchmarks

2008-02-06 Thread Dieter Maurer
Roché Compaan wrote at 2008-2-6 20:18 +0200:
On Tue, 2008-02-05 at 19:17 +0100, Dieter Maurer wrote:
 Roché Compaan wrote at 2008-2-4 20:54 +0200:
  ...
 I don't follow? There are 2 insertions and there are 1338046 calls
 to persistent_id. Doesn't this suggest that there are 66 objects
 persisted per insertion? This seems way too high?
 
 Jim told you that persistent_id is called for each object and not
 only persistent objects.
 
 An OOBucket contains up to 30 key value pairs, each of which
 are subjected to a call to persistent_id. In each of your pairs,
 there is an additional persistent object. This means, you
 should expect 3 calls to persistent_id for each pair in an OOBucket.

If I understand correctly, for each insertion 3 calls are made to
persistent_id? This is still very far from the 66 I mentioned above?

You did not understand correctly.

You insert an entry. The insertion modifies (at least) one OOBucket.
The OOBucket needs to be written back. For each of its entries
(one is your new one, but there may be up to 29 others) 3
persistent_id calls will happen.
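The counting can be reproduced with the standard library's pickle module, which offers the same persistent_id hook that ZODB's serializer uses (a simplified sketch: the `Value` class and the string pids are stand-ins for real persistent objects and their oids):

```python
import io
import pickle

class Value:
    """Stand-in for the persistent subobject stored as each pair's value."""

class CountingPickler(pickle.Pickler):
    # Like ZODB's serializer, persistent_id is consulted for *every*
    # object encountered; returning a (string) pid stops recursion into
    # the "persistent" subobject, much as a persistent reference would.
    calls = 0
    def persistent_id(self, obj):
        self.calls += 1
        if isinstance(obj, Value):
            return "oid-%d" % id(obj)
        return None

# A full bucket stand-in: 30 key/value pairs, each value "persistent".
bucket_state = {"key-%02d" % i: Value() for i in range(30)}
p = CountingPickler(io.BytesIO(), protocol=2)
p.dump(bucket_state)
# persistent_id ran once for the dict, once per key and once per value:
# 61 calls to write back one bucket, although only a single entry of
# the 30 may actually be new.
```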



-- 
Dieter


Re: [ZODB-Dev] RelStorage now in Subversion

2008-02-05 Thread Dieter Maurer
Hello Shane,

Shane Hathaway wrote at 2008-2-3 23:57 -0700:
 ...
Looking into this more, I believe I found the semantic we need in the 
PostgreSQL reference for the LOCK statement [1].  It says this about 
obtaining a share lock in read committed mode: once you obtain the 
lock, there are no uncommitted writes outstanding.  My understanding of 
that statement and the rest of the paragraph suggests the following 
guarantee: in read committed mode, once a reader obtains a share lock on 
a table, it sees the effect of all previous transactions on that table.

I have been too pessimistic with respect to Postgres.

While Postgres uses the freedom of the ANSI isolation level definitions
(they say that some things must not happen but do not prescribe that
other things must necessarily happen), Postgres has a precise
specification for the read committed mode -- it says: in read
committed mode, each query sees the state as it was when
the query started. This implies that it sees all transactions
for your conflict resolution to be correct -- as you hold the commit
lock during conflict resolution such that no new transaction can happen
during the query in question.



-- 
Dieter


Re: [ZODB-Dev] Re: ZODB Benchmarks

2008-02-05 Thread Dieter Maurer
Roché Compaan wrote at 2008-2-4 20:54 +0200:
 ...
I don't follow? There are 2 insertions and there are 1338046 calls
to persistent_id. Doesn't this suggest that there are 66 objects
persisted per insertion? This seems way too high?

Jim told you that persistent_id is called for each object and not
only persistent objects.

An OOBucket contains up to 30 key value pairs, each of which
are subjected to a call to persistent_id. In each of your pairs,
there is an additional persistent object. This means, you
should expect 3 calls to persistent_id for each pair in an OOBucket.



-- 
Dieter


Re: [ZODB-Dev] RelStorage now in Subversion

2008-02-03 Thread Dieter Maurer
Meanwhile I have carefully studied your implementation.

There is only a single point I am not certain about:

  As I understand isolation levels, they guarantee that some bad
  things will not happen, but not that everything which is not bad will happen.

  For read committed this means: it guarantees that I will
  only see committed transactions, but not necessarily that I will see
  the effect of a transaction as soon as it is committed.

  Your conflict resolution requires that it sees a transaction as
  soon as it is committed.

  The supported relational databases may have this property -- but
  I expect we do not have a written guarantee that this will definitely
  be the case.

I plan to make a test which tries to provoke a conflict resolution
failure -- and gives me confidence that the read committed mode of
Postgres really has the required property.



-- 
Dieter


Re: [ZODB-Dev] Re: ZODB Benchmarks

2008-02-03 Thread Dieter Maurer
Roché Compaan wrote at 2008-2-3 09:15 +0200:
 ...
I have tried different commit intervals. The published results are for a
commit interval of 100, iow 100 inserts per commit.

 Your profile looks very surprising:
 
   I would expect that for a single insertion, typically
   one persistent object (the bucket where the insertion takes place)
   is changed. About every 15 inserts, 3 objects are changed (the bucket
   is split) about every 15*125 inserts, 5 objects are changed
   (split of bucket and its container).
   But the mean value of objects changed in a transaction is 20
   in your profile.
   The changed objects typically have about 65 subobjects. This
   fits with OOBuckets.

It was very surprising to me too since the insertion is so basic. I
simply assign a Persistent object with 1 string attribute that is 1K in
size to a key in a OOBTree. I mentioned this earlier on the list and I
thought that Jim's explanation was sufficient when he said that the
persistent_id method is called for all objects including simple types
like strings, ints, etc. I don't know if it explains all the calls that
add up to a mean value of 20 though. I guess the calls are being made by
the cPickle module, but I don't have the experience to investigate this.

The number of persistent_id calls suggests that a written
persistent object has a mean value of 65 subobjects -- which
fits well with OOBuckets.

However, when the profile is for commits with 100 insertions each,
then the number of written persistent objects is far too small.
In fact, we would expect about 200 persistent object writes per transaction:
the 100 new persistent objects assigned plus about as many buckets
changed by these insertions.

 
The keys that I lookup are completely random so it is probably the case
that the lookup causes disk lookups all the time. If this is the case,
is 230 ms not still too slow?

Unreasonably slow in fact.

A tree of size 10**7 likely does not have a depth larger than 4
(internal nodes should typically have at least 125 entries, leaves should have
at least 15 -- a tree of depth 4 can thus hold about 125**3 * 15 = 29.3 * 10**6 entries).
Therefore, one would expect at most 4 disk accesses.

On my (6 year old) computer, a disk access can take up to 30 ms.
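The capacity estimate behind "at most 4 disk accesses" checks out arithmetically (using the minimum fill factors assumed above):

```python
# Capacity of an OOBTree of depth 4 under the fill assumptions above:
# internal nodes hold at least 125 children (half of the maximum 250),
# leaf buckets at least 15 entries (half of the maximum 30).
min_internal_fanout = 125
min_bucket_entries = 15
depth_4_capacity = min_internal_fanout ** 3 * min_bucket_entries
# depth_4_capacity == 29_296_875, comfortably above the 10**7 keys used
# in the benchmark -- so a cold lookup touches at most 4 nodes, i.e.
# at most 4 disk accesses.
```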



-- 
Dieter


Re: [ZODB-Dev] Re: ZODB Benchmarks

2008-02-02 Thread Dieter Maurer
Roché Compaan wrote at 2008-2-1 21:17 +0200:
I have completed my first round of benchmarks on the ZODB and welcome
any criticism and advise. I summarised our earlier discussion and
additional findings in this blog entry:
http://www.upfrontsystems.co.za/Members/roche/where-im-calling-from/zodb-benchmarks

In your insertion test: when do you do commits?
One per insertion? Or one per n insertions (for which n)?


Your profile looks very surprising:

  I would expect that for a single insertion, typically
  one persistent object (the bucket where the insertion takes place)
  is changed. About every 15 inserts, 3 objects are changed (the bucket
  is split) about every 15*125 inserts, 5 objects are changed
  (split of bucket and its container).
  But the mean value of objects changed in a transaction is 20
  in your profile.
  The changed objects typically have about 65 subobjects. This
  fits with OOBuckets.


Lookup times:

0.23 s would be 230 ms not 23 ms.

The reason for the dramatic drop from 10**6 to 10**7 cannot lie in the
BTree implementation itself. Lookup time is proportional to
the tree depth, which ideally would be O(log(n)). While BTrees
are not necessarily balanced (and therefore the depth may be larger
than logarithmic) it is not easy to obtain a severely unbalanced
tree by insertions only.
Other factors must have contributed to this drop: swapping, cache too small,
garbage collections...

Furthermore, the lookup times for your smaller BTrees are far too
good -- fetching any object from disk takes in the order of several
ms (2 to 20, depending on your disk).
This means that the lookups for your smaller BTrees have
typically been served directly from the cache (no disk lookups).
With your large BTree disk lookups probably became necessary.



-- 
Dieter


Re: [ZODB-Dev] RelStorage now in Subversion

2008-02-01 Thread Dieter Maurer
Hello Shane,

Shane Hathaway wrote at 2008-1-31 13:45 -0700:
 ...
No, RelStorage doesn't work like that either.  RelStorage opens a second
database connection when it needs to store data.  The store connection
will commit at the right time, regardless of the polling strategy.  The
load connection is already left open between connections; I'm only
talking about allowing the load connection to keep an idle transaction.
 I see nothing wrong with that, other than being a little surprising.

That looks very troublesome.

Unless you begin a new transaction on your load connection after
the write connection has committed,
your load connection will not see the data written over
your write connection.

  and you read older and older data
 which must increase serializability problems

I'm not sure what you're concerned about here.  If a storage instance
hasn't polled in a while, it should poll before loading anything.

Even if it has polled not too far in the past, it should
repoll when the storage is joined to Zope request processing
(in Connection._setDB):
if it does not, it may start work with an already outdated
state -- which can have adverse effects when the request bases modifications
on this outdated state.
If everything else works correctly, a ConflictError then results later
during the commit.

This implies that the read connection must start a new transaction
at least after a ConflictError has occurred. Otherwise, the
ConflictError cannot go away.

 
 (Postgres might
 not garantee serializability even when the so called isolation
 level is chosen; in this case, you may not see the problems
 directly but nevertheless they are there).

If that is true then RelStorage on PostgreSQL is already a failed
proposition.  If PostgreSQL ever breaks consistency by exposing later
updates to a load connection, even in the serializable isolation mode,
ZODB will lose consistency.  However, I think that fear is unfounded.
If PostgreSQL were a less stable database then I would be more concerned.

I do not expect that Postgres will expose later updates to the load
connection.

What I fear is described by the following scenario:

   You start a transaction on your load connection L.
   L will see the world as it has been at the start of this transaction.

   Another transaction M modifies object o.

   L reads o (in its pre-M state), modifies it, and commits.
   As L has used o's state from before M's modification,
   the commit will try to write stale data.
   Hopefully, something makes the commit fail -- otherwise,
   we have lost a modification.

If something causes a commit failure, then the probability of such
failures increases with the outdatedness of L's reads.

 ...
RelStorage only uses the serializable isolation level for loading, not
for storing.  A big commit lock prevents database-level conflicts while
storing.  RelStorage performs ZODB-level conflict resolution, but only
while the commit lock is held, so I don't yet see any opportunity for
consistency to be broken.  (Now I imagine you'll complain the commit
lock prevents scaling, but it uses the same design as ZEO, and that
seems to scale fine.)

Side note:

  We currently face problems with ZEO's commit lock: we have 24 clients
  that produce about 10 transactions per second. We observe
  occasional commit contentions lasting a few minutes.

  We already have found several things that contribute to this problem --
  slow operations on clients while the commit lock is held on ZEO:
  Python garbage collections, invalidation processing, stupid
  application code.
  But there are still some mysteries and we do not yet have
  a good solution.

 
I noticed another potential problem:

  When more than a single storage is involved, transactional
  consistency between these storages requires a true two phase
  commit.

  Only recently has Postgres started to support two-phase commit (2PC), but
  as far as I know Python access libraries do not yet support the
  extended API (a few days ago, there was a discussion on
  [EMAIL PROTECTED] about a DB-API extension for two-phase commit).

  Unless you use your own binding to the Postgres 2PC API, RelStorage
  seems safe only for single-storage use.
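Why a true two-phase commit matters can be sketched with toy in-memory participants (a hypothetical illustration of the protocol, not the DB-API extension itself): no participant may commit until every participant has successfully prepared.

```python
# Toy two-phase commit across several "storages".  Participant is a
# hypothetical stand-in for a resource manager, not a real database API.
class Participant:
    def __init__(self, name, fail_on_prepare=False):
        self.name = name
        self.fail_on_prepare = fail_on_prepare
        self.state = "active"

    def prepare(self):
        if self.fail_on_prepare:
            self.state = "aborted"
            raise RuntimeError("%s cannot prepare" % self.name)
        self.state = "prepared"

    def commit(self):
        assert self.state == "prepared"
        self.state = "committed"

    def abort(self):
        self.state = "aborted"

def two_phase_commit(participants):
    try:
        for p in participants:          # phase 1: collect prepare votes
            p.prepare()
    except Exception:
        for p in participants:          # someone voted no: abort all
            if p.state != "aborted":
                p.abort()
        return False
    for p in participants:              # phase 2: now committing is safe
        p.commit()
    return True
```

Without the prepare phase, one storage could commit while the other fails, leaving the two storages transactionally inconsistent.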


-- 
Dieter


Re: [ZODB-Dev] How to avoid ConflictErrors in BTrees ?

2008-02-01 Thread Dieter Maurer
Andreas Jung wrote at 2008-2-1 12:13 +0100:


--On 1. Februar 2008 03:03:53 -0800 Tarek Ziadé [EMAIL PROTECTED] 
wrote:

 Since BTrees are written in C, I couldn't add my own conflict manager to
 try to merge buckets. (and this is
 way over my head)


But you can inherit from the BTree classes and hook your 
_p_resolveConflict() handler into the Python class - or?

I very much doubt that this is a feasible approach:

  A BTree is a complex object, an object that creates new objects
  (partially other BTrees and partially Buckets) when it grows.

  Giving the BTree class used by the application a _p_resolveConflict
  will do little -- because the created subobjects (mainly Buckets)
  will not know about it.

  Note especially that the only effective conflict resolution
  is at the bucket level. As you can see, there is currently
  no way to tell a BTree which Bucket class it should use
  for its buckets -- this renders your advice ineffective.



-- 
Dieter


Re: [ZODB-Dev] The write skew issue

2008-01-31 Thread Dieter Maurer
Christian Theune wrote at 2008-1-30 21:21 +0100:
 ...
That would mean that the write skew phenomenon that you found would be 
valid behaviour, wouldn't it?

No.

 Am I missing something?

Yes. No matter how you order the two transactions in my example,
the result will be different from what the ZODB produces.



-- 
Dieter


Re: [ZODB-Dev] RelStorage now in Subversion

2008-01-31 Thread Dieter Maurer
Shane Hathaway wrote at 2008-1-31 01:08 -0700:
 ...
I admit that polling for invalidations probably limits scalability, but 
I have not yet found a better way to match ZODB with relational 
databases.  Polling in both PostgreSQL and Oracle appears to cause no 
delays right now, but if the polling becomes a problem, within 
RelStorage I can probably find ways to reduce the impact of polling, 
such as limiting the polling frequency.

I am surprised that you think you can play with the polling
frequency.

  Postgres will deliver objects as they have been when the
  transaction started.
  Therefore, when you start a Postgres transaction
  you must invalidate any object in your cache that
  has been modified between its load time and the beginning of this
  transaction. Otherwise, your cache can deliver stale state
  that does not fit the objects loaded directly from Postgres.

  I read this as: you do not have much room for maneuver.
  You must ask Postgres about invalidations when the transaction
  starts.

  Of course, you can in addition ask Postgres periodically
  in order to have a smaller and (hopefully) faster result
  when the transaction starts.



-- 
Dieter


Re: [ZODB-Dev] Speedy RelStorage/PostgreSQL

2008-01-31 Thread Dieter Maurer
Shane Hathaway wrote at 2008-1-31 00:12 -0700:
 ...
1. Download ZODB and patch it with poll-invalidation-1-zodb-3-8-0.patch

What does poll invalidation mean?

  RelStorage maintains a sequence of (object) invalidations ordered
  by transaction id, and the client can ask "give me all invalidations
  after this given transaction id"? It does so at the start of each
  transaction?

In this case, how does the storage know when it can forget
invalidations?
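The scheme as described, including one possible answer to the forgetting question, might look like this toy sketch (an assumption about the mechanism, not RelStorage's actual code): the log may only prune entries older than the oldest transaction id a client could still poll from, and a client polling from before that point must flush its whole cache.

```python
class InvalidationLog:
    """Toy poll-invalidation log, as sketched in the mail."""
    def __init__(self):
        self.entries = []        # list of (tid, oid), tid ascending
        self.oldest_tid = 0      # everything at or before this was pruned

    def record(self, tid, oid):
        self.entries.append((tid, oid))

    def poll(self, after_tid):
        # None means: the log no longer reaches back far enough --
        # the client must invalidate its entire cache.
        if after_tid < self.oldest_tid:
            return None
        return [oid for tid, oid in self.entries if tid > after_tid]

    def prune(self, before_tid):
        # Forget invalidations no client may legitimately request anymore.
        self.entries = [(t, o) for t, o in self.entries if t > before_tid]
        self.oldest_tid = max(self.oldest_tid, before_tid)
```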



-- 
Dieter


Re: [ZODB-Dev] RelStorage now in Subversion

2008-01-31 Thread Dieter Maurer
Shane Hathaway wrote at 2008-1-31 11:55 -0700:
 ...
Yes, quite right!

However, we don't necessarily have to roll back the Postgres transaction
on every ZODB.Connection close, as we're doing now.

That sounds very nasty!

In Zope, I definitely *WANT* to either commit or roll back the
transaction when the request finishes. I definitely do not
want to let the following completely unrelated request
decide about the fate of my modifications.

 If we leave the
Postgres transaction open even after the ZODB.Connection closes, then
when the ZODB.Connection reopens, we have the option of not polling,
since at that point ZODB's view of the database remains unchanged from
the last time the Connection was open.

Yes, but you leave the fate of your previous activities to
the future -- and you read older and older data,
which must increase serializability problems (Postgres might
not guarantee serializability even when the so-called serializable isolation
level is chosen; in this case, you may not see the problems
directly, but nevertheless they are there).

It's not usually good practice to leave sessions idle in a transaction,
but this case seems like a good exception since it should significantly
reduce the database traffic.

I agree that it can reduce traffic, but I am almost convinced that
the price will be high (either "cannot serialize concurrent updates"
errors or serializability violations that are not directly noticeable).



-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


[ZODB-Dev] Proposal process?

2008-01-26 Thread Dieter Maurer
Formerly, proposals lived on wiki.zope.org.
There, they could be commented and discussed.

Now proposals live somewhere. Usually, they can neither be commented on nor
discussed. But they are registered at Launchpad.

For me, it is completely unclear how Launchpad should be used
to guide the route from a proposal to an eventually implemented
feature. How/where do we discuss and comment on the proposals?
How/where do we decide whether a proposal should be implemented
and in what variant? 

-- 
Dieter


Re: [ZODB-Dev] PGStorage

2008-01-24 Thread Dieter Maurer
Zvezdan Petkovic wrote at 2008-1-23 17:15 -0500:
On Jan 23, 2008, at 4:05 PM, Flavio Coelho wrote:
 sorry, I never meant to email you personally

I have been wrong: Flavio has not forgotten the list, I had not looked
carefully enough. Sorry!



-- 
Dieter


Re: [ZODB-Dev] Strange File too large problem

2008-01-24 Thread Dieter Maurer
Izak Burger wrote at 2008-1-24 13:57 +0200:
 ...
I'm kind of breaking my normal rules of engagement here by immediately 
sending mail to a new list I just subscribed to, but then Andreas Jung 
did ask me to send a mail about this to the list.

This morning one of our clients suddenly got this error:

Traceback (innermost last):
   Module ZPublisher.Publish, line 121, in publish
   Module Zope2.App.startup, line 240, in commit
   Module transaction._manager, line 96, in commit
   Module transaction._transaction, line 380, in commit
   Module transaction._transaction, line 378, in commit
   Module transaction._transaction, line 436, in _commitResources
   Module ZODB.Connection, line 665, in tpc_vote
   Module ZODB.FileStorage.FileStorage, line 889, in tpc_vote
   Module ZODB.utils, line 96, in cp
IOError: [Errno 27] File too large

Apparently, you do not have large file support and your storage
file has reached the limit for small files.



-- 
Dieter


Re: [ZODB-Dev] Strange File too large problem

2008-01-24 Thread Dieter Maurer
Andreas Jung wrote at 2008-1-24 19:20 +0100:
 ...
   Module ZODB.utils, line 96, in cp
 IOError: [Errno 27] File too large

 Apparently, you do not have large file support and your storage
 file has reached the limit for small files.


LFS is usually required for files larger than 2 GB. According to the
information I got from the reporter, the file was 17 GB large.

Nevertheless, the operating system reported a "file too large"
error on write.

This suggests a route for further investigation:

  What causes the reporter's operating system to report "file too large"?

This is not a ZODB question but one for the respective operating system.



-- 
Dieter


Re: [ZODB-Dev] PGStorage

2008-01-23 Thread Dieter Maurer
Flavio Coelho wrote at 2008-1-22 17:43 -0200:
 ...
Actually what I am trying to run away from is the packing monster ;-)

Jim has optimized pack considerably (see zc.FileStorage).

I, too, have worked on pack optimization over the last few days (we
cannot yet use Jim's work because we are using ZODB 3.4 while
Jim's optimization is for ZODB 3.8) and obtained speedups of
more than 80 percent.

I want to be able to use an OO database without the inconvenience of having
it growing out of control and then having to spend hours packing the
database every once in a while. (I do a lot of writes in my DBs.) Does this
holy grail of databases exist? :-)

The Postgres equivalent of pack is called "vacuum full".
It is more disruptive than packing.


Maybe you should have a look at the old bsddbstorage.
It could be configured not to keep historical data.
Support was discontinued due to lack of interest --
but this is the second time within a week or so that I have
reported this. It may indicate renewed interest.


BTW: stay on the list. I do not like personal emails.



-- 
Dieter


[ZODB-Dev] [FileStorage] Potential data loss through packing

2008-01-21 Thread Dieter Maurer
While looking at the current (not Jim's new) pack algorithm in order to optimize
the reachability analysis, I recognized a behaviour that looks
like a potential data loss through packing.

The potential data loss can occur when an object unreachable at
pack time becomes reachable again after pack time.

The current pack supports a single use case which can cause such
an object resurrection: the use of backpointers (probably from undo).

However, resurrection is possible by other means as well -- e.g.
by reinstating a historical version which references objects
meanwhile deleted.
Packing can cause such objects to get lost (resulting in POSKeyErrors).


Reinstating a historical version which references meanwhile
deleted objects is probably quite a rare situation, such
that the potential data loss does not seem very critical.

But potential data loss is nasty, even when the probability is quite low.


-- 
Dieter


[ZODB-Dev] [fsIndex] surprising documentation -- inefficiency?

2008-01-21 Thread Dieter Maurer
ZODB.fsIndex tells us in its source code documentation that it splits
the 8-byte oid into a 6-byte prefix and a 2-byte suffix and
represents the index by an OOBTree(prefix -> fsBucket(suffix -> position)).

It explains that it uses an fsBucket (instead of a full tree) because
the suffix -> position mapping would contain at most 256 entries.

This explanation surprises me a bit: why should the bucket contain
only 256 rather than 256 * 256 (= 65,536) entries?


If the assumption is wrong (i.e. the fsBucket can contain up to
65,536 entries), is the implementation inefficient (because of that)?
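The split itself is easy to sketch, and the arithmetic supports the suspicion: a 2-byte suffix admits 2**16 distinct values, not 256.

```python
# fsIndex's oid split: the 6-byte prefix keys an OOBTree, the 2-byte
# suffix keys an fsBucket inside it.
def split_oid(oid8):
    assert len(oid8) == 8
    return oid8[:6], oid8[6:]

prefix, suffix = split_oid(b"\x00\x00\x00\x00\x00\x01\x02\x03")
# A 2-byte suffix can take 256 ** 2 = 65536 distinct values, so an
# fsBucket may hold up to 65536 entries -- not 256.
max_entries_per_bucket = 256 ** len(suffix)
```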


-- 
Dieter


Re: [ZODB-Dev] Writing Persistent Class

2008-01-21 Thread Dieter Maurer
Marius Gedminas wrote at 2008-1-21 00:08 +0200:
Personally, I'd be afraid to use deepcopy on a persistent object.

A deepcopy is likely to be no copy at all.

  As Python's deepcopy does not know about object ids, it is likely
  that the copy result uses the same oids as the original.
  When you store this copy, objects with the same oid are identified.

If you are lucky, the ZODB recognizes the problem (because it is
unable to store two different objects -- the original and the result
of the deepcopy -- with the same oid in its cache).
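A minimal stand-in (not the real persistent.Persistent) shows the mechanism: deepcopy blindly duplicates the `_p_oid` bookkeeping attribute, so the "copy" aliases the original in any oid-keyed storage.

```python
import copy

class FakePersistent:
    """Stand-in for persistent.Persistent -- just enough state to show
    that deepcopy duplicates the _p_oid bookkeeping attribute."""
    def __init__(self, oid, data):
        self._p_oid = oid
        self.data = data

original = FakePersistent(b"\x00" * 7 + b"\x01", ["payload"])
clone = copy.deepcopy(original)
# Two distinct Python objects -- but they carry the same oid, so a
# storage keyed by oid would identify the "copy" with the original.
```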



-- 
Dieter


Re: [ZODB-Dev] [FileStorage] Potential data loss through packing

2008-01-21 Thread Dieter Maurer
Jim Fulton wrote at 2008-1-21 09:41 -0500:
 ... resurrections after pack time may get lost ...
I'm sure the new pack algorithm is immune to this.  It would be  
helpful to design a test case to try to provoke this.

I fear we cannot obtain full immunity at all -- unless we perform
packing offline (after having shut down the storage) or use quite
tight synchronization between packing and normal operations.

Otherwise, resurrection can happen while we are packing -- depending
on how far packing has already proceeded, the resurrection would
need to copy the resurrected objects into its own transaction
rather than simply reference them.



-- 
Dieter


Re: [ZODB-Dev] Writing Persistent Class

2008-01-18 Thread Dieter Maurer
Kenneth Miller wrote at 2008-1-17 19:08 -0600:
 ...
Do I always  
need to subclass persistent?

When you assign an instance of your (non persistent derived) class
as an attribute to a persistent object,
then your instance will be persisted together with its persistent
container.
However, local modifications to your instance are not recognized
by the persistence mechanism. You need to explicitly inform the persistent
container about the change.

Moreover, persistent objects define the granularity with which
application and storage interact: load and store work on
the level of persistent objects excluding persistent subobjects.
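The "explicitly inform the container" rule can be sketched with a minimal stand-in for the persistence machinery (the `_p_changed` flag below mimics the real attribute; `persistent.Persistent` itself is not used here):

```python
class FakePersistent:
    # Minimal stand-in: tracks only the _p_changed flag that the real
    # persistence machinery inspects at commit time.
    def __init__(self):
        self.__dict__["tags"] = []          # plain, non-persistent subobject
        self.__dict__["_p_changed"] = False

    def __setattr__(self, name, value):
        # Direct attribute assignment is noticed by the machinery ...
        self.__dict__[name] = value
        if not name.startswith("_p_"):
            self.__dict__["_p_changed"] = True

obj = FakePersistent()
obj.tags.append("new")   # mutation inside the plain list: unnoticed
print(obj._p_changed)    # False -- this change would be lost on commit
obj._p_changed = True    # explicit notification, as described above
```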



-- 
Dieter


Re: [ZODB-Dev] Re: memory exhaustion problem

2008-01-17 Thread Dieter Maurer
Flavio Coelho wrote at 2008-1-17 14:57 -0200:
Some progress!

Apparently the combination of:
u._p_deactivate()

You do not need that when you use commit.

transaction.savepoint(True)
transaction.commit()

You can use u._p_jar.cacheGC() instead of the commit.

helped. Memory  consumption keeps growing but much more slowly (about 1/5 of
the original speed). Please correct me if I am wrong, but I believe that
ideally memory usage should stay constant throughout the loop, shouldn't it?

Are you sure that SQLite does not keep data in memory?

Moreover, I shouldn't need to commit either, since I am not modifying the
objects...

The commit calls cacheGC for you. You can instead call cacheGC yourself.



-- 
Dieter


Re: [ZODB-Dev] Re: Why does this useage of __setstate__ fail?

2008-01-17 Thread Dieter Maurer
Tres Seaver wrote at 2008-1-17 01:30 -0500:
 ...
Mika, David P (GE, Research) wrote:

 Can someone explain why the test below  (test_persistence) is failing?
 I am adding an attribute after object creation with __setstate__, but
 I can't get the new attribute to persist.

You are mutating the object *inside* your __setstate__:  the ZODB
persistence machinery clears the '_p_changed' flag after you *exit* from
'__setstate__':  the protocol is not intended to support a persistent
write-on-read.

If I remember right, newer ZODB versions allow the __setstate__
implementation to tell whether _p_changed should or should not be
cleared (default: cleared).

Tim Peters added this feature to support the frequent use case,
that __setstate__ is used for object migration.
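The migration use case looks roughly like this (a sketch with a plain class mimicking the protocol; on a real persistent object one would additionally set `_p_changed = True` inside `__setstate__`, which is what the feature described above makes effective):

```python
class Record:
    def __init__(self):
        self.name = "x"
        self.version = 2            # attribute added in a newer schema

    def __setstate__(self, state):
        self.__dict__.update(state)
        if "version" not in state:
            # Migrate old pickles on load.  On a real Persistent
            # subclass one would also set self._p_changed = True here
            # so the migrated state is written back.
            self.version = 2

# Simulate loading an "old" pickle from before 'version' existed:
old_state = {"name": "x"}
rec = Record.__new__(Record)
rec.__setstate__(old_state)
print(rec.version)  # 2
```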



-- 
Dieter


Re: [ZODB-Dev] minimizing the need for database packing

2007-12-29 Thread Dieter Maurer
Jim Fulton wrote at 2007-12-28 10:20 -0500:
 ...
The Berkeley Database Storage supported automatic incremental packing  
without garbage collection.  If someone were to revitalize that effort  
and if one was willing to do without cyclic garbage collection, then  
that storage would remove the need for the sort of disruptive pack we  
have with FileStorage now.

Why do you consider pack disruptive?

Note that I'm working on a new FileStorage packer that is 2-3 times  
faster and, I believe, much less disruptive than the current packing  
algorithm.

If you are at it: I think the lock which protects the finish test
is held too long. Currently, it is released only for a very short time
and then immediately reacquired. It should be safe to release it
immediately after the finish test has failed.



-- 
Dieter


Re: [ZODB-Dev] Running ZODB on x64 system

2007-12-02 Thread Dieter Maurer
Jim Fulton wrote at 2007-12-1 10:09 -0500:
 ...
AFAIK, there hasn't been a release that fixes this problem.  A  
contributor to the problem is that I don't think anyone working on  
ZODB has ready access to 64-bit systems. :(

We are using an old (ZODB 3.4) version on a 64 bit linux without
problems.



-- 
Dieter


Re: [ZODB-Dev] Running ZODB on x64 system

2007-12-02 Thread Dieter Maurer
Jim Fulton wrote at 2007-12-2 13:51 -0500:
 ...
With what version of Python?

2.4.x

I believe the problem is related to both Python 2.5 and 64-bit systems  
-- possibly specific 64-bit systems.

Okay. No experience with this.

As we use Zope (2), we do not use Python 2.5.



-- 
Dieter


Re: [ZODB-Dev] funky _p_mtime values

2007-09-27 Thread Dieter Maurer
Thomas Clement Mogensen wrote at 2007-9-27 12:43 +0200:
 ...
Within the last few days something very strange has happened: All  
newly created or modified objects get a _p_mtime that is clearly  
incorrect and too big for DateTime to consider it a valid timestamp  
(i.e. int(obj._p_mtime) returns a long).

Values I get for _p_mtime on these newly altered objects are  
something like:
8078347503.108635
with only the last few decimals differing among all affected objects.
Objects changed at the same time appear to get the same stamp.

Looks interesting

When I see such unexplainable things, I tend to speak of alpha rays.
Computers are quite reliable -- but not completely. Every now
and then a bit changes which should not change.
In my current life, I have seen things like this 3 times -- usually
in the form that the content of a file changed without any
application touching it.

If you accept such a wild explanation, then, maybe, I can
give one:

  FileStorage must ensure that all transaction ids (they are
  essentially timestamps) are strictly increasing.

  To this end, it maintains a current transaction id.
  When a new transaction id is needed, it tries to construct
  one from the current time. But if this is smaller than
  the current transaction id, then it increments that a little
  and uses it as new transaction id.

  Thus, if for some reasons, once a bit changed in the 
  current transaction id (or in the file that maintains it
  persistently), then you may no longer get away with it.

On Plone.org, someone asked today how to fix the
effects on the ZODB of an administrator changing the system time to 2008.
If he finds a solution, then your problem may be tackled the same way.



-- 
Dieter


Re: [ZODB-Dev] Getting started with ZODB

2007-09-18 Thread Dieter Maurer
Manuzhai wrote at 2007-9-18 12:46 +0200:
 ...
the Documentation link points to a page
that seems to mostly have papers and presentation from 2000-2002.

There is a good guide to the ZODB by Andrew Kuchling (or similar).

It may be old -- but everything in it is still valid.

On the internet, there is some talk about the different storage
providers, but it seems mostly very old. FileStorage seems to be the
only serious Storage provider delivered with ZODB. Are there any
other general-purpose Storage providers actively being used in the
wild?

FileStorage is much faster than all other storages. Therefore,
it dominates the scene.

DirectoryStorage, too, is somewhat widely used.

Internally, TemporaryStorage is used (a RAM based storage for
sessions). DemoStorage is used for unit tests.

How does their performance compare? FileStorage apparently needs
to keep some index in memory; when does this start to be a problem?

You need about 30 bytes per object. You can calculate when
this starts to become a problem for you.
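With the quoted figure of roughly 30 bytes per object, the calculation is simple arithmetic (the 30-byte figure is the one from the post above; the real per-entry cost varies by ZODB version):

```python
# Rough FileStorage in-memory index estimate, assuming ~30 bytes per
# object as quoted above.
def index_ram(n_objects, bytes_per_object=30):
    return n_objects * bytes_per_object

# Ten million objects -> about 300 MB of index held in RAM:
print(index_ram(10_000_000) / 1024 ** 2)  # ~286 MiB
```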

What's new in ZODB4?

I know nothing about ZODB4. The current version is near 3.8, maybe.

There is some talk about blobs, are they
described somewhere?

There should be a proposal at http://wiki.zope.org/ZODB/ListOfProposals.
But, I just checked, there is not :-(

But a search for "ZODB Blob" gives quite a few hits.



-- 
Dieter


Re: [ZODB-Dev] Recovering from BTree corruption

2007-09-11 Thread Dieter Maurer
Alan Runyan wrote at 2007-9-11 09:27 -0500:
 ...
oid 0xD87110L BTrees._OOBTree.OOBucket
last updated: 2007-09-04 14:43:37.687332, tid=0x37020D3A0CC9DCCL
refers to invalid objects:
oid ('\x00\x00\x00\x00\x00\xb0+f', None) missing: 'unknown'
oid ('\x00\x00\x00\x00\x00\xb0N\xbc', None) missing: 'unknown'
oid ('\x00\x00\x00\x00\x00\xb0N\xbd', None) missing: 'unknown'
oid ('\x00\x00\x00\x00\x00\xd7\xb1\xa0', None) missing: 'unknown'
oid ('\x00\x00\x00\x00\x00\xc5\xe8:', None) missing: 'unknown'
oid ('\x00\x00\x00\x00\x00\xc3\xc6l', None) missing: 'unknown'
oid ('\x00\x00\x00\x00\x00\xc3\xc6m', None) missing: 'unknown'
oid ('\x00\x00\x00\x00\x00\xcahC', None) missing: 'unknown'
oid ('\x00\x00\x00\x00\x00\xaf\x07\xc1', None) missing: 'unknown'

Looks as if the OOBucket has lost quite some value links (as
only a single one links to the next bucket).

My questions are:

 - I imagine if there are 'invalid' references this is considered corruption
   or inconsistency?

It depends on your preferences.

 ...
  - Having these invalid references, is this common to  ZODB applications?

No.

At least not for ZODB applications that do not use inter database
references.

 Possibly, there's a backup that has data records for the missing OIDs.

Going to ask hosting company to pull up backups for the past few weeks.
But how i'm going to find this other than seeing if the folder allows me
to iterate over the items is not throwing POSKeyError.  Does that sound
like a decent litmus test?

You can also run fsrefs on it. When you do not get missing ...,
then the backup does not have your POSKeyError (but may lack quite
a few newer modifications).



-- 
Dieter


Re: [ZODB-Dev] Recovering from BTree corruption

2007-09-10 Thread Dieter Maurer
Alan Runyan wrote at 2007-9-10 09:34 -0500:
 ...
While debugging this I had a conversation with sidnei about mounted
databases.  He recalled that if you're using a mounted database you
should not pack.  If for some reason your mounted database had a cross
reference to another database and somehow you had a dangling reference
to the other database it would cause POSKeyError.

BTrees are actually directed acyclic graphs (DAGs) with two node types
tree (internal node) and bucket (leaf).

Beside its children, a tree contains a link to its leftmost
bucket. Beside its keys/values, a bucket contains a link to
the next bucket.

When you iterate over keys or values, the leftmost bucket
is accessed via the root's leftmost bucket link and then
all buckets are visited via the next bucket links.
Your description seems to indicate that you have lost a
next bucket link.

If you are lucky, then the tree access structure (the children links
of the tree nodes) is still intact -- or if not, is at least
partially intact. Then, you will be able to recover large parts
of your tree.


You have two options:

  * reconstruct the tree from its pickles.

This is the way, the checking of BTrees works.

  * Determine the last key (LK) before you get the POSKeyError;
then use the tree structure to access the next available
key. You may need to try ever larger values above LK
to skip a potentially damaged part of the tree.


I would start with the second approach and switch to the first one
when it becomes too tedious.
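The second approach can be sketched like this (a pure-Python stand-in: a mapping whose damaged key range raises the same kind of error a broken bucket would; on a real BTree one would probe with `minKey(LK + 1)` inside the same try/except):

```python
class POSKeyError(KeyError):
    pass

class DamagedTree:
    # Stand-in for a BTree whose broken bucket covers keys 10..19.
    def __init__(self):
        self._data = {k: k * k for k in range(30)}

    def items_from(self, lo):
        for k in sorted(self._data):
            if k < lo:
                continue
            if 10 <= k < 20:
                raise POSKeyError(k)  # touching the damaged region
            yield k, self._data[k]

def recover(tree, max_key):
    # Iterate; on a POSKeyError, probe ever larger start keys above
    # the last good key, salvaging whatever remains reachable.
    out, lo = [], 0
    while lo <= max_key:
        try:
            for k, v in tree.items_from(lo):
                out.append((k, v))
            return out
        except POSKeyError:
            lo = max(lo, out[-1][0] + 1 if out else 0) + 1
    return out

salvaged = recover(DamagedTree(), 29)
print(len(salvaged))  # 20 -- everything outside the damaged bucket
```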



-- 
Dieter


Re: [ZODB-Dev] Serializability

2007-08-21 Thread Dieter Maurer
Jim Fulton wrote at 2007-8-20 10:32 -0400:
 ...
 Application specific conflict resolution
 would become a really difficult task.

I'm sure you realize that application specific conflict resolution  
violates serializability.

No, I do not realize this.

Assume a counter which is not read, only incremented/decremented.
Its application specific conflict resolution ensures
that the schedule is serializable restricted to the counter value.

Things are much more complex when the counter is read (and incremented).
Usually, serializability is lost, then.
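The write-only counter case is the one handled by BTrees.Length-style conflict resolution; the merge rule can be sketched as a standalone function (an illustrative sketch mirroring the _p_resolveConflict idea, not the actual Length code):

```python
def resolve_counter(old, committed, new):
    # Merge two concurrent updates relative to the common ancestor
    # 'old': keep the committed value and re-apply our own delta.
    # Valid only while transactions merely increment/decrement; once
    # a transaction *reads* the value to decide something, this merge
    # silently breaks serializability -- Dieter's point above.
    return committed + (new - old)

# Two transactions both started at 10; one committed +5, ours did +3:
print(resolve_counter(10, 15, 13))  # 18
```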



-- 
Dieter


[ZODB-Dev] Re: [Persistent] STICKY mechanism unsafe

2007-08-20 Thread Dieter Maurer
Tres Seaver wrote at 2007-8-20 10:00 -0400:
 ...
Zope works for this case because each application thread uses a
per-request connection, to which it has exclusive access while the
connection is checked out from the pool (i.e., for the duration of the
request).

At least unless one make persistency errors, such as storing persistent
objects outside the connection (e.g. on class level or in a global
cache).



-- 
Dieter


Re: [ZODB-Dev] [Persistent] STICKY mechanism unsafe

2007-08-20 Thread Dieter Maurer
Jim Fulton wrote at 2007-8-20 10:15 -0400:
Excellent analysis snipped

 1. and 3. (but obviously not 2.) could be handled by
 implementing STICKY not by a bit but by a counter.

This has been planned for some time. :/

I have (re)read this in your Different Cache Interaction proposal.

Thanks to the GIL, it will also work for concurrent access from
different threads -- if Used and Unused are notified while
the GIL is held.



-- 
Dieter


Re: [ZODB-Dev] [Persistent] STICKY mechanism unsafe

2007-08-20 Thread Dieter Maurer
Jim Fulton wrote at 2007-8-20 10:45 -0400:
 ...
Dieter appears to have been bitten by this and he is one of we. :)

We, and I presume he, can be bitten by a Python function called from  
BTree code calling back into the code on the same object.  This is  
possible, for example, in a __cmp__ or related method.  I assume that  
this is what happened to Dieter.  Obviously, this would be a fairly  
special comparison method.

I am not yet sure what really has bitten us -- I am not even sure
whether the object was really deactivated or some memory corruption
caused the object's tail to be overwritten by 0.

When the SIGSEGV had hit, usually a bucket in a TextIndexNG3.lexicon
was affected. This lexicon uses BTrees in a very innocent way.
Its keys are integers and strings -- no fancy __cmp__ method is
involved.

Moreover, we need two things for the deactivation to happen:
the STICKY mechanism must fail *AND* a deactivation must be
called for.

In our Zope/ZODB version, deactivation is done only at transaction
boundaries (it is an early ZODB 3.4 version where snapshops did not
yet call incrgc). Therefore, some commit would need to be
done during the BUCKET_SEARCH call.

The only conceivable cause appears to me that a different thread
modified the bucket and called abort. This would mean
a persistency bug (concurrent use of a persistent object by
several threads). I tried to find such a bug in TextIndexNG3, but
failed.


The problem appears only very rarely -- about 1 to 2 times in 1 to 2
months. When I analysed the problem in the past, I failed to look
at the object's persistent state (it would have told me whether
the object has been deactivated or overwritten). I just noticed
that the object's head was apparently intact while the object's true
data was 0. Only a few days ago, I recognized that this could
have been the effect of a deactivation.



-- 
Dieter


[ZODB-Dev] Serializability

2007-08-19 Thread Dieter Maurer
Analysing the STICKY behaviour of 'Persistent', I recognized
that 'Persistent' does not customize the '__getattr__' but in fact
the '__getattribute__' method. Therefore, 'Persistent' is informed
about any attribute access and not only attribute access on a
ghosted instance.


Together with the 'accessed' call in Jim's proposal
http://wiki.zope.org/ZODB/DecouplePersistenceDatabaseAndCache,
this could be used for a very crude check
of potential serializability conflicts along the following
lines.

  The DataManager maintains a set of objects accessed during a transaction.
  At transaction start, this set is empty and all cached objects
  are in state 'Ghost' or 'Saved'.
  Whenever an object is accessed for the first time, the DataManager's
  'accessed' or 'register' method is called. In both cases,
  the manager adds the object to its accessed set.
  At transaction end, the manager can check whether the state of any
  of its accessed objects has changed in the meantime. If not, no
  serializability conflict happened. Otherwise, a conflict would be
  possible (provided the transaction changed any objects).


The test is very crude, as it does not track whether the tracked
transaction's change really depends on one of the objects
changed by different transactions. We must expect lots
of ConflictErrors. Application specific conflict resolution
would become a really difficult task.


-- 
Dieter


[ZODB-Dev] [Persistent] STICKY mechanism unsafe

2007-08-18 Thread Dieter Maurer
We currently see occational SIGSEGVs in BTrees/BucketTemplate.c:_bucket_get.
I am not yet sure but it looks as if the object had been deactivated
during the BUCKET_SEARCH.

Trying to analyse the problem, I had a close look at the STICKY
mechanism of persistent.Persistent which should prevent
accidental deactivation -- and found it unsafe.


STICKY is one of the states on persistent objects -- beside
GHOST, UPTODATE and CHANGED. It is mostly equivalent to
UPTODATE but prevents deactivation (but not invalidation).

It is used by C extensions that may release the GIL or call back
to Python (which may indirectly release the GIL).
Their typical usage pattern is

  if (obj->state == GHOST) obj->unghostify();
  if (obj->state == UPTODATE) obj->state = STICKY;
  ... do whatever needs to be done with obj ...
  if (obj->state == STICKY) obj->state = UPTODATE;

This usage pattern obviously breaks when a similar
code sequence is executed for obj while ... do whatever ..
is executed as it resets STICKY too early (in the nested code
sequence rather than the original one).

This may happen in several ways:

 1. ... do whatever ... does it explicitly

 2. obj is accessed from a different thread

 3. obj is accessed from a Python callback

1. might be considered a bug in ... do whatever ... -- although one
that is not easily avoidable.

2. is a general problem. Not only STICKY is unsafe against concurrent
use -- the complete state model of Persistent is.
We might explicitly state that the concurrent use of persistent objects
is unsafe and check against it.

With respect to STICKY, all three cases can be detected
by prepending if (obj->state == STICKY) ERROR;.

1. and 3. (but obviously not 2.) could be handled by
implementing STICKY not by a bit but by a counter.
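The counter idea (for cases 1 and 3) can be sketched in Python, though the real fix would live in the C implementation of Persistent:

```python
class StickyCounter:
    # A nesting-safe replacement for the single STICKY bit: each code
    # sequence increments the count on entry and decrements on exit;
    # deactivation is allowed only once the count is back to zero, so
    # a nested sequence can no longer reset stickiness too early.
    def __init__(self):
        self.count = 0

    def __enter__(self):
        self.count += 1
        return self

    def __exit__(self, *exc):
        self.count -= 1
        return False

    def may_deactivate(self):
        return self.count == 0

pin = StickyCounter()
with pin:                        # outer "do whatever" sequence
    with pin:                    # nested sequence on the same object
        pass
    print(pin.may_deactivate())  # False -- outer sequence still active
print(pin.may_deactivate())      # True
```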



-- 
Dieter


Re: [ZODB-Dev] Re: Understanding conflicts

2007-08-13 Thread Dieter Maurer
Jim Carroll wrote at 2007-8-12 16:45 +:
 ...
Somehow, the code that adds the message to the persistent 
list is running more than once.  I have read that ZEO will 
re-run python code on a retry

You have read something wrong.

  The only thing, ZEO does in case of a conflict is trying to
  call _p_resolveConflict on the conflicting object.

Every redo is something the application may or may not do.



-- 
Dieter


Re: [ZODB-Dev] What's the deal with _p_independent?

2007-07-08 Thread Dieter Maurer
Stefan H. Holek wrote at 2007-7-7 12:42 +0200:
BTrees.Length is used in many places to maintain the length of  
BTrees. Just the other day it was added to zope.app.container.btree.  
While I am happy about the speed improvements, I am concerned about  
the fact that BTrees.Length declares itself _p_independent.

Your concern may or may not be justified -- depending on how
you use the Length object.

  _p_independent allows the object to deviate from snapshot
  isolation and to see changes which happened after the
  transaction has begun.

  *THUS*, the observed value of a Length object is *NOT* guaranteed
  to correspond to the length of the tree -- even when you
  update the Length object whenever you change the length
  of the tree.

  This may or may not be fatal for your application.
  If you use the length only for presentation, it will not matter.

  However, if you base decisions on the value of the Length
  object (in the form if len(obj) == 1: something; else: something_else),
  then this may be fatal.

  If you do not do that, then
  Length's conflict resolution will ensure that the deviation
  is not permanent -- you will not get long term inconsistencies
  between the Length and the number of objects in the tree.

Note however, that such decisions may be fatal even without _p_independent.

  Zope transaction isolation is snapshot, not serializable.

  Transaction conflicts (defined via serialized isolation)
  are recognized only when they affect the same object.
  Therefore, it is in general unsafe to base decisions about
  changes to one object on another object not changed
  in the same transaction (or when their changes are resolved
  by conflict resolution).

  Thus, you should never base logic (affecting the
  persistent state of another object) on the value of a Length object --
  whether or not it is _p_independent.



-- 
Dieter


Re: [ZODB-Dev] Re: Understanding conflicts

2007-06-23 Thread Dieter Maurer
Jim Carroll wrote at 2007-6-22 16:30 +:
 ...
I'll be checking the quixote mailing list, but quixote isn't going to have
anything zope-specific, and I do think that it's the interaction with zope
that's giving me trouble...

The other sendmail packages are Zope products and can use some
Zope infrastructure. However, eventually, this Zope infrastructure
uses only basic mechanisms of the ZODB. Therefore, non-Zope components
can use them as well.



-- 
Dieter


Re: [ZODB-Dev] Re: [Bug] ZODB invalidation processing

2007-05-31 Thread Dieter Maurer
Joachim Schmitz wrote at 2007-5-31 12:07 +0200:
 ...
2007-05-31 09:45:06 INFO Skins.create_level A923157 finished to create 
level 200
Now the conflict error, look at the transaction start-time, this is 
before the restart of zope !!

You are probably being tricked here: the serials are in fact timestamps.
I am not sure but it may well be that the times shown for the serials
are UTC (GMT +0) and not local times.



-- 
Dieter


Re: [ZODB-Dev] [Bug] ZODB invalidation processing

2007-05-29 Thread Dieter Maurer
Chris Withers wrote at 2007-5-29 16:02 +0100:
 ...
Once again, it would be nice, now that you have access, if you could 
feed back your changes in areas like these rather than keeping them in 
your own private source tree :-(

I would be busy for about 1 to 2 weeks -- and I do not have that time now.

Moreover, people may not like what I would change -- and an intense
discussion, maybe even a struggle, may be unleashed.

So, you will still hear from time to time a reference to our
private copy of Zope.



-- 
Dieter


Re: [ZODB-Dev] Re: [Bug] ZODB invalidation processing

2007-05-29 Thread Dieter Maurer
Joachim Schmitz wrote at 2007-5-28 17:45 +0200:
In ZODB.Connection.Connection.open I see:

    if self._reset_counter != global_reset_counter:
        # New code is in place.  Start a new cache.
        self._resetCache()
    else:
        self._flush_invalidations()

So self._flush_invalidations() is only called in the else-condition.
In your patch it is always called. I try your version and report back.

As almost always self._reset_counter == global_reset_counter
and _resetCache effectively flushes the cache, the change is *VERY* unlikely
to affect what you see with respect to conflict errors.


-- 
Dieter


[ZODB-Dev] [Bug] ZODB invalidation processing (was: [Zope-dev] many conflict errors)

2007-05-25 Thread Dieter Maurer
Perry wrote at 2007-5-25 13:16 +0200:
database conflict error (oid 0x7905e6, class BTrees._IOBTree.IOBucket,
serial this txn started with 0x036ddc2a44454dee 2007-05-25
09:14:16.000950, serial currently committed 0x036ddc2c21950377
2007-05-25 09:16:07.870801) (80 conflicts (10 unresolved) since startup
at Fri May 25 05:19:08 2007)
 ...
ConflictError: database conflict error (oid 0x7905e6, class
BTrees._IOBTree.IOBucket, serial this txn started with
0x036ddc2b3e989fdd 2007-05-25 09:15:14.670982, serial currently
committed 0x036ddc2dd48f4e33 2007-05-25 09:17:49.818700)

These log entries indicate a bug in ZODB's invalidation processing.

  The first entry tells us that the object was read at 9:14:16
  and the modification conflicts with a write from 9:16:07.

  The second entry tells us that the object was read at 9:15:14
  *BUT* at the time this transaction has started,
  the ZODB should already have known the modification from 9:16:07
  and the object read at 9:15:14 should have been invalidated.
  The new transaction should not have seen any state older than 9:16:07
  (as it begins after this time).


In older ZODB versions, there has been a bug in
ZODB.Connection.Connection._setDB. It has forgotten to flush invalidations
(which may lead to observations as the above).

In our private Zope version, I have still a note like this:

# DM 2005-08-22: always call '_flush_invalidations' as it does
#  more than cache handling only
self._flush_invalidations()
if self._reset_counter != global_reset_counter:
    # New code is in place.  Start a new cache.
    self._resetCache()
# DM 2005-08-22: always call '_flush_invalidations'
##else:
##    self._flush_invalidations()

The note indicates that the bug was fixed at least at 2005-08-22
(though the handling was not completely right in case the cache
was reset).


Maybe, the bug you see now affects only mounted connections?



-- 
Dieter


Re: [ZODB-Dev] Same transaction object (re)-used for subsequent requests?

2007-05-04 Thread Dieter Maurer
Andreas Jung wrote at 2007-5-1 11:23 +0200:
 ...
I think you are right (as always). Then let me rephrase the question: how 
can one distinguish if two transaction objects represent the same or
different transactions in such a case where the memory address is identical?

Why are you interested in such a distinction?

  While you must deliver the same connection in the same transaction,
  there is no harm to deliver a given connection to different transactions
  (provided their lifespans do not overlap).



-- 
Dieter


Re: [ZODB-Dev] Implementing Storage Decorators

2007-05-04 Thread Dieter Maurer
Jim Fulton wrote at 2007-5-4 14:40 -0400:

On May 4, 2007, at 2:33 PM, Dieter Maurer wrote:

 Jim Fulton wrote at 2007-5-2 11:52 -0400:
 ...
 I think I still rather like explicit, but I'm on the fence about
 which approach is best.  What do other people think?

 From your description, I would use a subclassing (and forget about
 proxy and copying).

That would be a nightmare, on multiple levels:

- All of the separate implementations would become tightly coupled,  
which is what happens with inheritance.

- Either someone would have to create classes for the various  
permutations of features, or consumers would have to mix and match  
multiple classes to get what they want and sort out the various  
internal implementation incompatibilities.

Your decorators would become mixin classes
and the final classes would list the features they like -- simpler
than ZCML binding together...

Of course, some features may not play well with one another.
But, that will make problems also with proxies or copying...


-- 
Dieter


Re: [ZODB-Dev] Same transaction object (re)-used for subsequent requests?

2007-05-04 Thread Dieter Maurer
Andreas Jung wrote at 2007-5-4 21:13 +0200:


--On 4. Mai 2007 21:05:00 +0200 Dieter Maurer [EMAIL PROTECTED] wrote:

 But, the transactions are not concurrent in your original description!
 Instead, one transaction has been committed and (only!) then you
 see a transaction with the same id again.

What are you trying to tell? The issue would also happen with a concurrent 
setup. That's why I presented a case where I would see the error in a 
non-concurrent environment. Got it?

No. Not at all.

Let's look at the subject, which fortunately remained:

  You assume that some transaction objects are reused for subsequent
  requests. Transactions from subsequent requests are *NEVER* concurrent.

Tim explained to you that it is natural that transactions in
subsequent requests can easily have the same id.
But, *CONCURRENT* transactions can *NEVER* have the same id.

Thus, you can easily use your current caching algorithm:
if you see the same transaction id, then either it is indeed the
same transaction and you must associate the same connection,
or it is a different transaction. In that case, it is definitely
non-concurrent and can therefore, too, get the same connection.

 And if you read carefully you see provided their lifespans do not
 overlap. Obviously, transactions with non overlapping lifespans are not
 concurrent...

The transactions were not overlapping. As Tim wrote: the transactions were 
distinct but they used the same address.

Yes, therefore, you do not need to distinguish the two
transactions -- it is sufficient to distinguish the transaction ids.



-- 
Dieter


Re: [ZODB-Dev] checking what refers to an object in zodb

2007-05-04 Thread Dieter Maurer
Chris Withers wrote at 2007-5-4 18:53 +0100:
To try and find out which objects were referencing all these workflow 
histories, we tried the following starting with one of the oid of these 
histories:

from ZODB.FileStorage import FileStorage
from ZODB.serialize import referencesf

fs = FileStorage(path, read_only=1)
data, serialno = fs.load(oid, '')
refs = referencesf(data)

To our surprise, all of the workflow histories returned an empty list 
for refs.

Isn't that very natural?

referencesf(data) determines which objects are referenced by
data -- not which objects refer to the object whose state is
given by data.

Usually, the workflow history does not reference persistent objects.
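Finding referrers therefore means inverting the whole reference graph: run referencesf over every data record and invert the resulting oid -> outgoing-references mapping. A library-independent sketch of the inversion step (the outgoing mapping is assumed to have been gathered beforehand):

```python
def invert_references(outgoing):
    """Given a mapping oid -> iterable of oids *referenced by* that
    object (as referencesf would report per record), build the
    reverse mapping oid -> set of oids that *refer to* it."""
    incoming = {}
    for oid, refs in outgoing.items():
        for ref in refs:
            incoming.setdefault(ref, set()).add(oid)
    return incoming
```

With that reverse mapping you can look up which objects hold a reference to a given workflow history's oid.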



-- 
Dieter


Re: [ZODB-Dev] ExportImport.py: failing on import of extension class

2007-04-26 Thread Dieter Maurer
Paul Winkler wrote at 2007-4-26 02:13 -0400:
In ExportImport._importDuringCommit() I found this little gem:

pfile = StringIO(data)
unpickler = Unpickler(pfile)
unpickler.persistent_load = persistent_load

newp = StringIO()
pickler = Pickler(newp, 1)
pickler.persistent_id = persistent_id

pickler.dump(unpickler.load())
pickler.dump(unpickler.load())
data = newp.getvalue()


What's with the two load-and-dump lines near the end?

They effectively copy the two pickles that make up a ZODB data
record (class metadata and object state) to newp, mapping the
persistent ids appropriately -- hence the two load-and-dump lines.

 ...
If I switch from cPickle to Pickle, I get an extra two lines of
traceback:

Traceback (innermost last):
  Module ZPublisher.Publish, line 101, in publish
  Module ZPublisher.mapply, line 88, in mapply
  Module ZPublisher.Publish, line 39, in call_object
  Module OFS.ObjectManager, line 543, in manage_importObject
  Module OFS.ObjectManager, line 560, in _importObjectFromFile
  Module ZODB.ExportImport, line 86, in importFile
  Module ZODB.Transaction, line 241, in commit
  Module ZODB.Transaction, line 356, in _commit_objects
  Module ZODB.Connection, line 344, in commit
  Module ZODB.ExportImport, line 153, in _importDuringCommit
  Module pickle, line 872, in load
  Module pickle, line 1153, in load_reduce
  Module copy_reg, line 95, in __newobj__
AttributeError: __new__

Execute this in an interactive interpreter and check what cls in
cls.__new__ is (using pdb.pm()).

 ...
Evidently, copy_reg.__newobj__() is for use with new-style classes.
But that's weird, because the guy that gave me this .zexp says that it
comes from a Zope 2.7 instance (Python 2.3, Plone 2.0) and I'm trying
to load it into an instance with the same versions (he gave me a
Products tarball too).

The Python versions might differ.



-- 
Dieter


Re: [ZODB-Dev] another reason to stop supporting versions

2007-04-25 Thread Dieter Maurer
Jim Fulton wrote at 2007-4-24 17:01 -0400:
I'm 99.9% sure that version commit and abort are broken in ZODB.DB.   
The commit methods in CommitVersion, and AbortVersion (and  
TransactionalUndo) call invalidate on the database too soon -- before  
the transaction has committed.  This can have a number of bad  
effects, including causing inconsistent data in connections.

An argument for keeping version in the past was that they worked.   
Well, I think they don't work and I'm not interested in writing the  
test to fix them.  Is anyone else?

The last time I used Versions was about 2 years ago, to
change the indexes of a catalog without downtime.
Versions were a great help then.

With ManagableIndex, I could achieve this today without
Versions. But not all indexes yet provide the necessary means
to determine the object values independently of the index id.

Thus, I might still miss Versions.

On the other hand, I currently do not have time to provide the
tests and fix the code...



-- 
Dieter


Re: [ZODB-Dev] more lockup information / zope2.9.6+zodb 3.6.2

2007-04-12 Thread Dieter Maurer
Alan Runyan wrote at 2007-4-11 11:31 -0500:
 ... ZEO lockups ...

PeterZ [EMAIL PROTECTED] reported today very similar problems
in [EMAIL PROTECTED]. He, too, gets:

  File /opt/zope/Python-2.4.3/lib/python2.4/asyncore.py, line 343, in
recv
data = self.socket.recv(buffer_size)
error: (113, 'No route to host')

Maybe you have something in common (the same software version or
hardware component) which causes these problems?


Apart from that, I have seen 2 reasons for ZEO lockups:

  *  a firewall between the ZEO clients and the ZEO server
 which dropped connections without informing the connection
 endpoints

  *  ZEO clients that access the same storage (in the same ZEO server)
 via two different connections (this leads to a commit deadlock).


-- 
Dieter


Re: [ZODB-Dev] ZEO client cache tempfile oddness

2007-04-10 Thread Dieter Maurer
Paul Winkler wrote at 2007-4-6 13:30 -0400:
 ...
If I understand this stuff correctly, the code in question on a
filesystem that *doesn't* have the sparse file optimization would
equate to write N null bytes to this file as fast as possible.
True?

POSIX defines the semantics.

I have not looked it up, but a possible interpretation would also be:
write the n-th byte and leave all other bytes undefined.

-- 
Dieter


Re: [ZODB-Dev] ZEO for GUI applications

2007-04-01 Thread Dieter Maurer
Robert Gravina wrote at 2007-4-1 00:31 +0900:
 ...
Woohoo! I realised Connection.sync() does exactly what I need, but  
this still doesn't work as expected.

class UpdatedDB(DB):
 def invalidate(self, tid, oids, connection=None, version=''):
 DB.invalidate(self, tid, oids, connection, version)
 if connection is not None:
 connection.sync()

Am I going about this the right way?

Recently, Chris Withers reported a strange AssertionError
after a call of loadBefore. The discussion that followed led to the
conclusion that the current ZODB code (its MVCC part) may not allow
calling invalidate when the object was not modified.

If, on the other hand, the object was modified, invalidate is
called automatically. No need to do it yourself.


Furthermore, be aware that Connection.sync aborts the current transaction.


-- 
Dieter


Re: [ZODB-Dev] Re: KeyError / POSKeyError

2007-03-29 Thread Dieter Maurer
Tim Tisdall wrote at 2007-3-29 12:02 -0400:
  Okay...  I've managed to create a persistent object called 'p' with
the OID of the missing object.  I have no idea how to determine the
database connection object to pass it to the
ZODB.Connection.Connection.add() .

If you have a persistent object (like the root object), then obj._p_jar
is this object's ZODB connection.



-- 
Dieter


Re: [ZODB-Dev] Re: [Zope3-dev] Re: Community opinion about search+filter

2007-03-29 Thread Dieter Maurer
Lennart Regebro wrote at 2007-3-28 18:25 +0200:
On 3/27/07, Dieter Maurer [EMAIL PROTECTED] wrote:
 However, this approach is only efficient when the sort index size
 is small compared to the result size.

Sure. But with incremental searching, the result size is always one, right? ;-)

No. You want to have one (the best) hit from a (potentially) large
set of hits. The size of this set determines whether or not
a sort index is efficient.

The principle is like this:

  Let S be the hit set.
  Assume your sort index consists of values i1 < i2 < ... and
  write D(i) for the set of documents indexed under i.
  Then D(i1) intersected with S precedes D(i2) intersected with S,
  which precedes D(i3) intersected with S, etc., in the result
  ordered by the index i.

  If the index size is small compared to S (and if we assume the
  hits are uniformly distributed over the indexed values),
  then each intersection determines (on average) a significant
  number of sorted hits. You can efficiently determine the first hits.

  Assume, on the other hand, that S contains a single element
  and that the index is large; then almost all intersections are
  a waste of time (as the result is the empty set).
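The principle above as a small executable sketch (toy code with plain Python sets; a real catalog would use BTrees):

```python
def first_sorted_hits(index, hits, limit):
    """Walk the index values i1 < i2 < ... and intersect each
    document set D(i) with the hit set; stop once `limit` sorted
    hits have been produced.  Efficient only while the number of
    distinct index values stays small relative to len(hits)."""
    result = []
    for value in sorted(index):            # i1 < i2 < ...
        for doc in sorted(index[value] & hits):
            result.append(doc)
            if len(result) == limit:
                return result
    return result
```

With few distinct values most intersections are non-empty and the first batch falls out quickly; with a one-element hit set and a large index, almost every intersection is empty, which is the waste described above.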


-- 
Dieter


Re: [ZODB-Dev] two level cache

2007-03-29 Thread Dieter Maurer
Atmasamarpan Novy wrote at 2007-3-28 11:02 +0200:
 ...
Problem:
Current ZODB design creates a separate cache for each ZODB connection 
(ie. a thread in zope). It means that the same object could be 
replicated in each connection cache. We cannot do much about it since we 
do not know in advance that a particular object will not be modified. 
But it is a kind of waste when a number of modified objects is 
relatively low to a number of read-only objects.

The idea of sharing read-only objects between different connections
came up some months ago.

Jim and Tim convinced me that it would not work (at least not without
lots of code inspection for the C extensions handling persistent
objects, such as BTrees):

  The problem: while an object may be read-only on the application
  level, it is not read-only below that level.

E.g. the ZODB stores the object's load state in _p_changed.

  This is not multi-thread safe when different threads can perform
  these updates concurrently (as would be possible if read-only
  objects were shared between connections).

  Of course, this could be fixed -- but at quite some price...



-- 
Dieter


Re: [ZODB-Dev] Re: KeyError / POSKeyError

2007-03-29 Thread Dieter Maurer
Tim Tisdall wrote at 2007-3-29 16:03 -0400:
   It took me all day, but I finally managed to figure out how to do
what you suggested.  Unfortunately, I still get the very same error:
POSKeyError, Error Value: 0x01edf2 .  Just to make sure I did it
right, 0x01edf2 is the OID I should use in your solution, right?

Yes, but you need to pack it with ZODB.utils.p64 to get
an 8-byte binary string.
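ZODB.utils.p64 is just big-endian 64-bit packing; the stdlib equivalent, applied to the OID in question:

```python
import struct

def p64(n):
    # same packing as ZODB.utils.p64: integer -> 8-byte
    # big-endian string, usable as a storage OID key
    return struct.pack(">Q", n)

oid = p64(0x01EDF2)
# oid == b"\x00\x00\x00\x00\x00\x01\xed\xf2"
```

That 8-byte string is what the storage's load() expects as the oid argument.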



-- 
Dieter


Re: [ZODB-Dev] Re: KeyError / POSKeyError

2007-03-27 Thread Dieter Maurer
Tim Tisdall wrote at 2007-3-27 09:17 -0400:
  The broken object is a 1gb plone instance.  Which is what I'm trying
to recover.

You may try to find the (non-broken) persistent subobjects of the
broken object and relink them to a new object.
Then you can delete the broken object.

Whether you have a chance to succeed depends on how badly
the content is destroyed.



-- 
Dieter


Re: [ZODB-Dev] Re: [Zope3-dev] Re: Community opinion about search+filter

2007-03-27 Thread Dieter Maurer
Jim Fulton wrote at 2007-3-26 15:55 -0400:
 ...

On Mar 26, 2007, at 3:28 PM, Dieter Maurer wrote:

 Jim Fulton wrote at 2007-3-25 09:53 -0400:

 On Mar 25, 2007, at 3:01 AM, Adam Groszer wrote:
 MF I think one of the main limitations of the current catalog (and
 MF hurry.query) is efficient support for sorting and batching the
 query
 MF results. The Zope 3 catalog returns all matching results, which
 can then
 MF be sorted and batched. This will stop being scalable for large
 MF collections. A relational database is able to do this
 internally, and is
 MF potentially able to use optimizations there.

 What evidence to you have to support this assertion?  We did some
 literature search on this a few years ago and found no special trick
 to avoid sorting costs.

 I know of 2 approaches to reducing sort cost:

 1. Sort your results based on the primary key and therefore, pick
 your primary key to match your sort results.  In terms of the Zope
 catalog framework, the primary keys are the document IDs, which are
 traditionally chosen randomly.  You can pick your primary keys based
 on a desired sort order instead. A variation on this theme is to use
 multiple sets of document ids,  storing multiple sets of ids in each
 index.  Of course, this approach doesn't help with something like
 relevance ranks.

 2. Use an N-best algorithm.  If N is the size of the batch and M is
 the corpus size, then this is O(M*ln(N)) rather than O(M*ln(M)), which
 is a significant improvement if N << M, but still quite expensive.

 The major costs in sorting are usually not the log(n) but
 the very high linear costs of fetching the sort keys (although for
 huge n, we will reach the asymptotic limits).

Right. The problem is the N not the log(N). :)


 Under normal conditions, a relational database can be far more  
 efficient
 to fetch values either from index structures or the data records
 than Zope -- as

   * its data representation is much more compact

   * it often supports direct access

   * the server itself can access and process all data.


 With the ZODB, the data is hidden in pickles (less compact), there is
 no direct access (instead the complete pickle needs to be decoded)

The catalog sort index mechanism uses the un-index data structure in  
the sort index to get sort keys. This is a pretty compact data  
structure.

The data usually is in IOBuckets, which contain 45 values on average.
In a corresponding relational index structure, you could have several
hundred values.

 and
 all operations are done in the client (rather than in the server).

Which is often fine if the desired data are in the client cache.  It  
avoids making the storage a bottleneck.

Yes, but the if is important. Quite often, some operations flush
almost all objects from the cache (partly because the cache is bounded
by the number of objects rather than by their size), and after that,
filling the cache again takes ages.

Moreover, a relational database can (and usually does) use caching as
well. It is not restricted to a client-side-only technique.

I know that most relational database backends have a different
architecture than ZEO: they use one process (or at least one thread)
per connection, so that several activities can interleave the high IO
wait times. The single-threaded ZEO architecture must take more care
not to become the bottleneck.
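The N-best algorithm from Jim's approach 2 maps directly onto the stdlib heapq module; a minimal sketch (names invented):

```python
import heapq

def best_batch(scored_hits, n):
    """Return the n highest-scored (score, doc) pairs in descending
    order.  Keeping only an n-element heap while scanning makes this
    O(M log N) instead of the O(M log M) full sort."""
    return heapq.nlargest(n, scored_hits)

hits = [(0.2, "a"), (0.9, "b"), (0.5, "c"), (0.7, "d")]
# best_batch(hits, 2) -> [(0.9, "b"), (0.7, "d")]
```

The M factor (fetching every sort key) remains, which is exactly the dominant linear cost discussed above.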



-- 
Dieter


Re: [ZODB-Dev] Re: KeyError / POSKeyError

2007-03-23 Thread Dieter Maurer
Tim Tisdall wrote at 2007-3-23 16:03 -0400:
  When I run the fsrefs.py on the database I get the following:
-
oid 0x0L persistent.mapping.PersistentMapping
last updated: 2007-01-02 18:59:32.016077, tid=0x36AA393889A1800L
refers to invalid object:
   oid ('\x00\x00\x00\x00\x00\x00\x00\x01', None) missing: 'unknown'

oid 0x1L OFS.Application.Application
last updated: 2007-01-02 18:59:32.016077, tid=0x36AA393889A1800L
refers to invalid objects:
   oid ('\x00\x00\x00\x00\x00\x00\x00\x06', None) missing: 'unknown'
   oid ('\x00\x00\x00\x00\x00\x00\x00\x02', None) missing: 'unknown'
   oid ('\x00\x00\x00\x00\x00\x00\x00\x02', None) missing: 'unknown'
   oid ('\x00\x00\x00\x00\x00\x00\x00\x03', None) missing: 'unknown'

Looks like a buggy fsrefs:

  It complains that oid 0x0L refers to missing oid 0x1L, and
  then it complains that oid 0x1L refers to other missing oids.

  Apparently, oid 0x1L is there. Otherwise, it would be difficult
  for it to refer to missing objects.

Do not trust this fsrefs version.



-- 
Dieter


Re: [ZODB-Dev] Re: roll back filestorage zodb to a certain date?

2007-03-22 Thread Dieter Maurer
Chris Withers wrote at 2007-3-22 08:43 +:
Dennis Allison wrote:
And that I'm lazy and really want to be able to do:

python rollback.py 2007-03-21 09:00

You have been told that you can specify a stop time
and the storage will stop at the given time.

Thus, look at the code to see how this is achieved and
derive your rollback.py from that analysis.



-- 
Dieter


Re: [ZODB-Dev] Re: roll back filestorage zodb to a certain date?

2007-03-21 Thread Dieter Maurer
Jim Fulton wrote at 2007-3-21 10:06 -0400:
 ...

On Mar 21, 2007, at 9:59 AM, Tres Seaver wrote:

 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1

 Jim Fulton wrote:
 On Mar 21, 2007, at 6:41 AM, Chris Withers wrote:

 Hi All,

 Is there any existing method or script for rolling back a ZODB
 (filestorage-backed in this case,

 Back end to what?


 if that makes it easier) to a certain point in time?

 Do you actually want to modify the file? Or do you simply want to
 access it as of a particular time.
 There is already a mechanism for opening a file storage as of a
 particular time.


 eg: Make this Data.fs as it was at 9am this morning

 If not, I'll be writing one, where should I add it to when I'm done?
[Jim]
 Before we talk about adding it anywhere, I'd like to see the
 semantics defined more clearly.

[Tres]
 Chris could:

   1. Open the existing file in time-travel mode (readonly, so that it
  loads no transactions later than the given point).  The 'stop'
  parameter to the FileStorage initializer is the one which
  triggers time-travel.

   2. Open a new, writable filestorage.

   3. Run 'copyTransactionsFrom' on the new storage, passing the old,
  and then close the storages.

   4. Restart his Zope with the new storage.

[Jim]
I wasn't asking about implementation.

[Dieter]
But it is nice to know one (for the rare cases when one has to recover
from a catastrophic human error).

As such a functionality is only necessary on rare occasions,
I think it need not be sophisticated.

-- 
Dieter


Re: [ZODB-Dev] Are Data.fs and Data.fs.index the same regardless of platform?

2007-03-19 Thread Dieter Maurer
Ray Liere wrote at 2007-3-15 08:32 -0700:
 ...

Yes -- to the question in your subject.



-- 
Dieter


Re: [ZODB-Dev] Re: Community opinion about search+filter

2007-03-19 Thread Dieter Maurer
Ross Patterson wrote at 2007-3-15 14:25 -0700:
 ...
I recently became obsessed with this problem and sketched out an
architecture for presorted indexes.  I thought I'd take this
opportunity to get some review of what I came to.

From my draft initial README:

 Presort provides intids which assure the corresponding documents will
 be presorted in any BTrees objects where the intid is used as a key.

 Presorted intids exist alongside normal intids.  Intids are
 distributed over the range of integers available on a platform so as
 to avoid moving presorted intids whenever possible, but eventually a
 given presorted intid may need to be placed in between two other
 consecutive presorted intids.  When this happens, one or more
 presorted intids will have to be moved.  Normal intids are unchanging
 as normal.

 Presort also provides a facility for updating objects who store
 presorted intids when a presorted intid is moved, such as in indexes.
 It also provides for indexes that map a given query to the appropriate
 presorted intid result set and for catalogs that use the appropriate
 presorted intid utility to lookup the real objects for the results.

Would this be a viable approach? 

It would be a special case where an object's rank depends only
on the object and not on the query (see my earlier response to Martijn).

IncrementalSearch[2] can very efficiently generate initial
sorted batches in such cases.

 Would it be generally useful?

Only when your application needs very few different orders (for
the largest result sets): you need different presorted ids and a
complete index set using them for each such order.

Thus, it is useful for special-case applications with large data sets,
but not for all applications, nor when only small data sets are
involved.



-- 
Dieter


Re: [ZODB-Dev] mvcc related error?

2007-03-14 Thread Dieter Maurer
Chris Withers wrote at 2007-3-14 10:18 +:
Dieter Maurer wrote:
 Yes, it looks like an error:
 
   Apparently, assert end is not None failed.
   Apparently storage.loadBefore returned a wrong value.

Unfortunately, neither of these means anything to me ;-)

That is because you did not look at the code :-)

I guess I should file a bug report?

Yes.

Why collector?

Formerly, the Zope collector was the right one -- with topic 'Database'.

Not sure whether this has changed recently.



-- 
Dieter


Re: [ZODB-Dev] mvcc related error?

2007-03-13 Thread Dieter Maurer
Chris Withers wrote at 2007-3-13 11:34 +:
One of the users on one of my projects saw this error under high load:

Module Products.QueueCatalog.QueueCatalog, line 458, in reindexObject
Module Products.QueueCatalog.QueueCatalog, line 341, in catalog_object
Module Products.QueueCatalog.QueueCatalog, line 284, in _update
Module ZODB.Connection, line 732, in setstate
Module ZODB.Connection, line 765, in _setstate
Module ZODB.Connection, line 791, in _load_before_or_conflict
Module ZODB.Connection, line 814, in _setstate_noncurrent
AssertionError

Yes, it looks like an error:

  Apparently, assert end is not None failed.
  Apparently storage.loadBefore returned a wrong value.

  I have no idea how this could happen.



-- 
Dieter


Re: Re[2]: [ZODB-Dev] ZODB load/save tracing

2007-03-06 Thread Dieter Maurer
Jim Fulton wrote at 2007-2-25 08:21 -0600:
It might also be nice to have this generate events.  That is, the  
tracing storage should call zope.event.notify.

I intend in 3.8 or 3.9 to start having ZODB depend on zope.event.  We  
really should have used events rather than adding the callback's  
we've added recently.

Callbacks are more efficient (as they are more local).

And for the ZODB, efficiency may be an issue...



-- 
Dieter


Re: [ZODB-Dev] Making use of the zodb transaction framework outside of zodb?

2007-02-14 Thread Dieter Maurer
Petra Chong wrote at 2007-2-13 18:27 -:
 ...
In the docs I have read that it is possible for non-zodb apps to plug
into the transaction framework. However, I am unable to find any
specifics as to how to do this. 

What I'd like to do is this:

1. Have my app import transaction
2. When transaction.commit() is called from my app, have other things be
notified by that.

Ages ago, I found documentation about the ZODB that details
its transaction handling. I have remembered the essential facts
but forgot where I found it.
With your favorite search engine, you might be able to find it again.

Alternatively, you can look at the source code for
transaction._transaction.Transaction.commit or
use Zope's Shared.DC.ZRDB.TM.TM (the TM stands for
Transaction Manager).

The TM class is used to interface
Zope's relational database adapters to the transaction system.
To see how TM might be used, you can look at the
implementation of a Zope database adapter. The TM use
is traditionally found in a file called db.py ((almost) all
adapters seem to have been copied from a common ancestor and then
modified), e.g. ZPsycopgDA/db.py.
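The pattern behind TM, reduced to a toy model (illustrative only; the real transaction package API and Shared.DC.ZRDB.TM differ in detail): a resource manager registers itself with the transaction on first use and is called back at commit or abort:

```python
class MiniTransaction:
    """Toy transaction: resource managers join it and receive
    commit/abort callbacks when it ends."""
    def __init__(self):
        self._resources = []
    def join(self, rm):
        if rm not in self._resources:
            self._resources.append(rm)
    def commit(self):
        for rm in self._resources:
            rm.commit(self)
    def abort(self):
        for rm in self._resources:
            rm.abort(self)

class LoggingRM:
    """Hypothetical resource manager that just records callbacks."""
    def __init__(self, txn):
        self.events = []
        txn.join(self)          # register, as TM._register() does
    def commit(self, txn):
        self.events.append("commit")
    def abort(self, txn):
        self.events.append("abort")

txn = MiniTransaction()
rm = LoggingRM(txn)
txn.commit()
# rm.events is now ["commit"]
```

The real framework adds two-phase commit (tpc_begin/tpc_vote/tpc_finish) on top of this registration scheme, but the join-then-callback shape is the same.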



-- 
Dieter

