Re: [ZODB-Dev] ZODB 3.9

2009-04-12 Thread Dieter Maurer
Hanno Schlichting wrote at 2009-4-11 14:43 +0200:
> ...
>ZODB 3.9 removed a bunch of deprecated API's. Look at
>http://pypi.python.org/pypi/ZODB3/3.9.0a12#change-history to see how
>much changed in this version.
>
>The main things were related to "Versions are no-longer supported."
>which changed some low level API used in quite a number of places and
>meant that some of the stuff in Products.OFSP couldn't possibly work
>anymore.

Hopefully, a ZODB 3.9 ZEO server is still able to speak with ZODB < 3.9
"ClientStorage" instances...



-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] upload a file of 6 MB with the function manage_upload

2009-04-01 Thread Dieter Maurer
Sandra wrote at 2009-4-1 12:17 +:
> ...
>def manage_upload(self,file='',REQUEST=None):
> ...
> in python/OFS/image.py. But my program runs without end.

Zope is not very efficient with uploading large files.
Thus, it may take some time -- but it should work.

>Am I making some mistake?

You should be able to upload even files of this size with "manage_upload".

But as you did not show how you have used "manage_upload",
there might be some problem in how you use it (though that is not very likely).
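
For illustration, a minimal sketch of driving "manage_upload" from trusted
code (the ids and the path are hypothetical; "app" is assumed to be the
Zope application root, e.g. inside "zopectl debug"):

  import transaction

  f = open('/tmp/manual.pdf', 'rb')        # the ~6 MB file to upload
  try:
      # 'somefolder.somefile' is assumed to be an existing OFS.Image.File
      app.somefolder.somefile.manage_upload(file=f)
  finally:
      f.close()
  transaction.commit()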



-- 
Dieter


Re: [ZODB-Dev] 'PersistentReference' object has no attribute '_p_jar'

2009-04-01 Thread Dieter Maurer
Dominique Lederer wrote at 2009-3-30 11:15 +0200:
>I am using ZODB 3.8.1 with Relstorage 1.1.3 on Postgres 8.1
>
>Frequently i am getting messages like:
>
>Unexpected error
>Traceback (most recent call last):
>  File
>"/home/zope/zope_script/eggs/ZODB3-3.8.1-py2.4-linux-x86_64.egg/ZODB/ConflictResolution.py",
>line 207, in tryToResolveConflict
>resolved = resolve(old, committed, newstate)
>  File
>"/home/zope/zope_script/eggs/zope.app.keyreference-3.4.1-py2.4.egg/zope/app/keyreference/persistent.py",
>line 55, in __cmp__
>return cmp(
>AttributeError: 'PersistentReference' object has no attribute '_p_jar'

Your traceback looks strange: a "resolve" line is missing.

The bug is probably in this "resolve" function (missing from the traceback).
The state handed down to the "resolve" function does not contain
true persistent objects; instead, they are replaced by "PersistentReference"s
with almost no functionality. I think the only thing one can do
with them is compare them for equality.
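
To illustrate (a hedged sketch, not the code from the traceback): a
class-level "_p_resolveConflict" receives plain state dictionaries, and any
persistent sub-object in those states appears as a "PersistentReference"
stub:

  import persistent

  class Counter(persistent.Persistent):
      """Toy counter that resolves conflicts by merging increments."""

      def __init__(self):
          self.value = 0

      def _p_resolveConflict(self, old, saved, new):
          # old/saved/new are state dicts, not live objects; persistent
          # sub-objects show up as PersistentReference stubs, which
          # support (roughly) only equality comparison -- do not touch
          # attributes such as _p_jar on them here.
          resolved = dict(new)
          resolved['value'] = saved['value'] + new['value'] - old['value']
          return resolved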



-- 
Dieter


Re: [ZODB-Dev] problem with _p_mtime

2008-12-06 Thread Dieter Maurer
Miles Waller wrote at 2008-12-4 19:42 +:
>fstest - no problems
>checkbtrees - no problems
>
>fsrefs - returns errors about invalid objects (and reports all objects 
>as last updated: 5076-10-09 17:19:26.809896!), and finally fails with a 
>KeyError
>
>Traceback (most recent call last):
>  File "/usr/local/Zope-2.9.8/bin/fsrefs.py", line 157, in ?
>main(path)
>  File "/usr/local/Zope-2.9.8/bin/fsrefs.py", line 130, in main
>refs = get_refs(data)
>  File "/usr/local/Zope-2.9.8/lib/python/ZODB/serialize.py", line 687, 
>in get_refs
>data = oid_klass_loaders[reference_type](*args)
>KeyError: 'n'

This indicates that "fsrefs" does not understand the data.
There are several possible causes:

  *  your "fsrefs" version does not match the storage format

  *  "fsrefs" has a bug

  *  your storage is damaged.

As you have reported that the storage content could be successfully
exported, damage is not that likely (the export should have hit the
same problem in that case).

>
>I think I can see some corruption in the oids of the referenced objects 
>as they show as:
>\x00\x00\x00\x00\x00\x11'@
>\x00\x00\x00\x00\x00#\xd4\xa9
>\x00\x00\x00\x00\x00\x11'*
>etc... - I wasn't expecting to see [EMAIL PROTECTED] and friends.

This does not indicate any corruption: oids are treated as
8-byte binary strings. If a byte has a printable representation,
that representation is used when printing; otherwise its hex escape is shown.
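
For example ("u64" and "p64" from "ZODB.utils" convert between the two
forms):

  from ZODB.utils import u64, p64

  oid = '\x00\x00\x00\x00\x00\x11\x27\x40'   # printed as ...\x11'@
  print u64(oid)                             # the oid as an integer: 1124160
  print repr(p64(1124160))                   # back to the 8-byte form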

>For example, fsrefs reports not being able to find 
>'\x00\x00\x00\x00\x00#\xd4"'.  However, I can load the database at the 
>zopectl prompt and load objects, and get ob._p_oid to report 
>'\x00\x00\x00\x00\x00#\xd4"'.

Looks like an "fsrefs" bug.

If you can load an object from the storage, "fsrefs" should not report
it as missing.

>I also wondered if the first few bytes of the database could have been 
>cut off

This is unlikely.
The first bytes contain a magic number (identifying the storage format).
I think a "FileStorage" would not open if the magic number were
unrecognizable.
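
A quick check (a sketch; "FS21" is, if I remember right, the magic number
of the FileStorage format of this era):

  f = open('Data.fs', 'rb')
  magic = f.read(4)
  f.close()
  print repr(magic)      # expected: 'FS21'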



-- 
Dieter


Re: [ZODB-Dev] problem with _p_mtime

2008-12-04 Thread Dieter Maurer
Miles wrote at 2008-12-4 13:39 +:
>I've moved a FileStorage from one (old) machine to another (new) 
>machine, but when I mount it on the new machine I get a lot of time errors:
>
>Traceback (innermost last):
>   Module ZPublisher.Publish, line 115, in publish
>   Module ZPublisher.mapply, line 88, in mapply
>   Module ZPublisher.Publish, line 41, in call_object
>   Module Shared.DC.Scripts.Bindings, line 311, in __call__
>   Module Shared.DC.Scripts.Bindings, line 348, in _bindAndExec
>   Module App.special_dtml, line 176, in _exec
>   Module DocumentTemplate.DT_Let, line 76, in render
>   Module DocumentTemplate.DT_In, line 703, in renderwob
>   Module DocumentTemplate.DT_With, line 76, in render
>   Module DocumentTemplate.DT_Var, line 214, in render
>   Module App.PersistentExtra, line 43, in bobobase_modification_time
>   Module DateTime.DateTime, line 509, in __init__
>   Module DateTime.DateTime, line 760, in _parse_args
>   Module DateTime.DateTime, line 437, in safelocaltime
>TimeError: The time 98040302366.810165 is beyond the range of this 
>Python implementation.

I expect that your storage is damaged and contains some garbage
at a place where a serial should be.

This could probably be fixed -- but other parts of your storage
might be damaged, too.
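
A sketch of how such garbage shows up: "_p_mtime" is decoded from the
8-byte serial, so a damaged serial yields an absurd time like the one in
the traceback:

  from persistent.TimeStamp import TimeStamp

  def serial_to_time(serial):
      # serial/tid is an 8-byte string; TimeStamp decodes it to
      # seconds since the epoch (this is what _p_mtime reports)
      return TimeStamp(serial).timeTime()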



-- 
Dieter


Re: [ZODB-Dev] Analyzing the committed data during one transaction

2008-11-17 Thread Dieter Maurer
Andreas Jung wrote at 2008-11-17 14:21 +0100:
> ...
>> On Nov 15, 2008, at 10:03 AM, Andreas Jung wrote:
>>> is there a way to analyze the data committed during one transaction?
>>> The current usecase is Plone. A simple change to a document causes a
>>> large transaction (between 30k and 100k even for a one-char change). I
>>> am interested to know how many of this data belongs to the
>>> portal_catalog/index and how many are actually changes to the content
>>> object itself (or subobjects).
> ...
>I am basically looking for "real-time monitoring" solution (hacked into 
>the sources).

Why should this be "real-time"?
What magic should intervene when it sees a large transaction?

I have approached analysis tasks like this in two ways:

  *  Use "fsdump" to dump a readable presentation of parts of
 the storage file and look at the relevant transactions
 (Jim has described a programmatic alternative for this).

  *  Instrument "ZEO.ServerStub.ServerStub" to log relevant information
 about ZEO command parameters.

In both cases, I have loaded interesting objects from the ZODB
and looked into them. Very often, the class and content can tell you what
an object is for; indexes and catalog metadata, for example, are quite easily
recognizable.
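
The "load and look" step can be as simple as this sketch (the path and the
oid are hypothetical; the oid would be taken from an "fsdump" line):

  from ZODB import DB
  from ZODB.FileStorage import FileStorage
  from ZODB.utils import p64

  db = DB(FileStorage('Data.fs', read_only=True))
  conn = db.open()
  obj = conn.get(p64(0x23d4a9))   # oid from an fsdump line
  print obj.__class__             # class (and content) identify the object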

This way, I found the horribly inefficient TextIndexNG2
search behaviour (before "StupidStorage") and the huge transaction sizes
due to oversized catalog metadata.

>Rephrasing my question in a different way: how can I get 
>hold of the "parent" object based on a persistent object with a _p_jar
>connection? E.g. when the ZODB commits a subobject (e.g. a BTree or a 
>bucket) of portal_catalog/Indexes/some_index then I would like to get 
>hold of the related parent "first-class" Zope 2 object (e.g. derived 
>from SimpleItem/Item or something like that).

As Jim pointed out, this is difficult.
But often, class and content can tell you what an object is part of.



-- 
Dieter


Re: [ZODB-Dev] Broken instances after refactoring in ZODB

2008-10-06 Thread Dieter Maurer
Leonardo Santagada wrote at 2008-10-4 16:42 -0300:
> ...
>Why doesn't zodb have a table of some form for this info?

You can implement one -- if you think this is worth the effort.

The ZODB has a hook "classFactory(connection, modulename, globalname)"
on the "DB" class. It is responsible for mapping the pair
"modulename, globalname" to a class.
Its default implementation calls "ZODB.broken.find_global", but Zope redefines it
to "Zope2.App.ClassFactory.ClassFactory".
You can redefine it further -- if you like.
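
A hedged sketch of such a further redefinition (the renamed-class table is
hypothetical):

  import ZODB.broken

  RENAMED = {('Products.Old.Thing', 'Thing'): ('Products.New.Thing', 'Thing')}

  def renaming_class_factory(connection, modulename, globalname):
      # map old dotted names to new ones before the normal lookup
      modulename, globalname = RENAMED.get(
          (modulename, globalname), (modulename, globalname))
      return ZODB.broken.find_global(modulename, globalname)

  # "db" is an opened ZODB.DB instance:
  # db.classFactory = renaming_class_factory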

>I heard that  
>sometimes for very small objects the string containing this  
>information can use up to 30% of the whole space of the file (using  
>FileStorage). How does RelStorage store this?

The same way.



-- 
Dieter


Re: [ZODB-Dev] What's best to do when there is a failure in the second phase of 2-phase commit on a storage server

2008-10-03 Thread Dieter Maurer
Christian Theune wrote at 2008-10-3 10:32 +0200:
>On Fri, 2008-10-03 at 09:55 +0200, Dieter Maurer wrote:
>> Jim Fulton wrote at 2008-10-1 13:40 -0400:
>> > ...
>> >> It may well be that a restart *may* not lead into a fully functional
>> >> state (though this would indicate a storage bug)
>> >
>> >A failure in tpc_finish already indicates a storage bug.
>> 
>> Maybe -- although "file system is full" might not be so easy to avoid
>> in all cases
>
>That should be easy to avoid by allocating the space you need in the
>first phase and either release it on an abort or write your 'committed'
>marker into it in the second phase.

That's true for a "FileStorage" -- but it may not be that easy for
other storages (e.g. "BSDDB" storage).



-- 
Dieter


Re: [ZODB-Dev] What's best to do when there is a failure in the second phase of 2-phase commit on a storage server

2008-10-03 Thread Dieter Maurer
Jim Fulton wrote at 2008-10-1 13:40 -0400:
> ...
>> It may well be that a restart *may* not lead into a fully functional
>> state (though this would indicate a storage bug)
>
>A failure in tpc_finish already indicates a storage bug.

Maybe -- although "file system is full" might not be so easy to avoid
in all cases.



-- 
Dieter


Re: [ZODB-Dev] What's best to do when there is a failure in the second phase of 2-phase commit on a storage server

2008-10-01 Thread Dieter Maurer
Jim Fulton wrote at 2008-9-30 18:30 -0400:
> ...
>>>  c. Close the file storage, causing subsequent reads and writes to
>>> fail.
>>
>> Raise an easily recognizable exception.
>
>I raise the original exception.

Sad.

The original exception may have many consequences -- most of them probably
harmless. A special exception would express that this consequence was
very harmful.

>> In our error handling we look out for some nasty exceptions and  
>> enforce
>> a restart in such cases. The exception above might be such a nasty
>> exception.
>
>The critical log entry should be easy enough to spot.

For humans, yes -- but I had in mind software that recognizes the exception
automatically and forces a restart.

Or do you have a logger customization in mind that intercepts the
log entry and then forces a restart?

It may not be trivial to get this right (in a way such that
the log entry appears in the logfile before the restart starts).

>...
>>> - Have a storage server restart when a tpc_finish call fails.  This
>>> would work fine for FileStorage, but might be the wrong thing to do
>>> for another storage.  The server can't know.
>>
>> Why do you think that a failing "tpc_finish" is less critical
>> for some other kind of storage?
>
>
>It's not a question of criticality.  It's a question of whether a  
>restart will fix the problem.  I happen to know that a file storage  
>would be in a reasonable state after a restart.  I don't know this to  
>be the case for some other storage.

But what should an administrator do when this is not the case?
Either stop or restart.

It may well be that a restart *may* not lead to a fully functional
state (though this would indicate a storage bug), but a definitely
non-working system is not better than one that may potentially not
be fully functional but usually will be, apart from storage bugs.



-- 
Dieter


Re: [ZODB-Dev] 3.8.1b8 released and would like to release 3.8.1 soon

2008-09-30 Thread Dieter Maurer
Wichert Akkerman wrote at 2008-9-24 09:44 +0200:
>Jim Fulton wrote:
>> I'd appreciate it if people would try it out soon.
>>
>
>I can say that the combination of 3.8.1b8 and Dieter's 
>zodb-cache-size-bytes patch does not seem to work. With 
>zodb-cache-size-bytes set to 1 gigabyte on an instance with a single 
>thread and using RelStorage Zope capped its memory usage at 200mb.

I can see two potential reasons (besides a bug in my implementation):

 *  you have not used a very large object count.

The tighter of the two restrictions (count or size) limits what can be
in the cache. With a small object count, the count limit will be tighter than
the byte-size limit.

 *  Size is only estimated -- not exact.

The pickle size is used as a size approximation.

I would be surprised, however, if the pickle size were five times
larger than the real size.
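
With the ZODB 3.9-style API, both limits can be stated together; a sketch:

  from ZODB import DB
  from ZODB.FileStorage import FileStorage

  # Both limits are active at once; the tighter one wins.  A small
  # cache_size (object count) can cap memory use long before the
  # byte limit is reached.
  db = DB(FileStorage('Data.fs'),
          cache_size=400000,                # object-count limit
          cache_size_bytes=1024*1024*1024)  # ~1 GB estimated-size limit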



-- 
Dieter


Re: [ZODB-Dev] What's best to do when there is a failure in the second phase of 2-phase commit on a storage server

2008-09-30 Thread Dieter Maurer
Jim Fulton wrote at 2008-9-19 13:45 -0400:
> ...
>2. We (ZC) are moving to 64-bit OSs.  I've resisted this for a while  
>due to the extra memory overhead of 64-bit pointers in Python  
>programs, but I've finally (too late) come around to realizing that  
>the benefit far outweighs the cost.  (In this case, the process was  
>around 900MB in size.

That is very strange.
On our Linux systems (Debian etch), processes can use 2.7 to 2.9 GB
of memory before the OS refuses to allocate more.

>It was probably trying to malloc a few hundred  
>MB.  The malloc failed despite the fact that there was more than 2GB  
>of available process address space and system memory.)
>
>3. I plan to add code to FileStorage's _finish that will, if there's  
>an error:
>
>   a. Log a critical message.
>
>   b. Try to roll back the disk commit.
>
>   c. Close the file storage, causing subsequent reads and writes to  
>fail.

Raise an easily recognizable exception.

In our error handling, we look out for some nasty exceptions and enforce
a restart in such cases. The exception above might be such a nasty
exception.

If possible, the exception should provide full information about
the original exception (in the way of Java's nested exceptions,
emulated by Tim in some places of the ZODB code).



>4. I plan to fix the client storage bug.
>
>I can see 3c being controversial. :) In particular, it means that your  
>application will be effectively down without human intervention.

That's why I would prefer an easily recognizable exception -- in order
to restart automatically.

>I considered some other ideas:
>
>- Try to get FileStorage to repair its meta data.  This is certainly  
>theoretically doable.  For example, it could re-build its in-memory  
>index. At this point, that's the only thing in question. OTOH,  
>updating it is the only thing left to fail at this point.  If updating  
>it fails, it seems likely that rebuilding it will fail as well.
>
>- Have a storage server restart when a tpc_finish call fails.  This  
>would work fine for FileStorage, but might be the wrong thing to do  
>for another storage.  The server can't know.

Why do you think that a failing "tpc_finish" is less critical
for some other kind of storage?



-- 
Dieter


Re: [ZODB-Dev] [Zope] Zope memory usage

2008-09-18 Thread Dieter Maurer
Manuel Vazquez Acosta wrote at 2008-9-17 20:05 -0400:
>Alan,
>
>I'm replying to the Zope list also, because this issue is perhaps
>related to other components there.
>
>I'm running into the same situation: The python process running my Plone
>site is steadily growing.
>
>I'm using Zope2.9.8-final (the one which works with Plone 2.5.5).
>
>What is the plan for Zope to include such a feature?

I expect the feature will land in ZODB 3.9 and then probably in Zope 2.12.
>
>Best regards,
>Manuel.
>
>Alan Runyan wrote:
>> There was a recent modification to limit the ZODB cache to a set size.  i.e.
>> Limit the size of memory usage to 128MB.
>> 
>> The original feature was implemented here:
>>   http://svn.zope.org/ZODB/branches/dm-memory_size_limited-cache/
>> 
>> You can get the feature+3.8 branch of the ZODB from:
>>   http://svn.zope.org/ZODB/branches/zcZODB-3.8/
>> 
>> The changes are also on trunk (will be ZODB 3.9).



-- 
Dieter


Re: [ZODB-Dev] Zope memory usage

2008-09-18 Thread Dieter Maurer
Roché Compaan wrote at 2008-9-17 19:52 +0200:
>Thanks for the notice. We'll give this a go and report back.
>
>Do you know how exactly it is decided what stays in the cache? 

The cache replacement strategy has not changed: "lru" ("least recently used").
Objects are removed from the cache starting with those used least recently.



-- 
Dieter


Re: [ZODB-Dev] Zope memory usage

2008-09-17 Thread Dieter Maurer
Izak Burger wrote at 2008-9-17 12:10 +0200:
>I'm sure this question has been asked before, but it drives me nuts so I 
>figured I'll ask again. This is a problem that has been bugging me for 
>ages. Why does zope memory use never decrease? Okay, I've seen it 
>decrease maybe by a couple megabyte, but never by much. It seems the 
>general way to run zope is to put in some kind of monitoring, and 
>restart it when memory goes out of bounds. In general it always uses 
>more and more RAM until the host starts paging to disk. This sort of 
>baby-sitting just seems wrong to me.

This is standard behaviour for long running processes on
a system without memory compaction:

  It is almost a consequence of the "increased entropy" theorem.
  Memory tends to fragment over time.
  Some memory requests cannot be satisfied from the fragments (because
  the individual fragments are not large enough and compaction is
  not available), and therefore a new large block is requested
  from the operating system.
  
>It doesn't seem to make any difference if you set the cache-size to a 
>smaller number of objects or use a different number of threads. Over 
>time things always go from good to bad and then on to worse. I have only 
>two theories: a memory leak, or an issue with garbage collection (python 
>side).

It is the lack of compaction, together with weaknesses in *nix memory
management (*nix essentially provides "mmap" and "brk"; "mmap" is not adequate
for large numbers of small memory requests, and "brk" can only allocate/release
at the heap boundary).



-- 
Dieter


Re: [ZODB-Dev] Experiences with Relstorage performance for setups with heavy writes

2008-09-17 Thread Dieter Maurer
Andreas Jung wrote at 2008-9-12 10:31 +0200:
>anyone having experience with the performance of Relstorage on Zope 
>installations with heavy parallel writes (which is often a bottleneck)? 
>Does Relstorage provide any significant advantages over ZEO?

As "Relstorage" emulates the "FileStorage" behaviour for writes/commits,
using the same storage-global lock, you should not see a significant
change. Maybe writing the temporary "log" file is avoided.



-- 
Dieter


Re: [ZODB-Dev] RFC: Reimplementing Pickle Cache in Python

2008-09-17 Thread Dieter Maurer
Tres Seaver wrote at 2008-9-12 06:35 -0400:
> ...
>Reimplementing Pickle Cache in Python
>=
> ...
>from zope.interface import Attribute
>from zope.interface import Interface
>class IPickleCache(Interface):
>""" API of the cache for a ZODB connection.
>"""
> ...

Which method moves an object to the front of the ring?
Or do you use an inline expansion for speed reasons?


-- 
Dieter


Re: [ZODB-Dev] wrap-up notes of sprint possible?

2008-08-30 Thread Dieter Maurer
Christian Theune wrote at 2008-8-30 15:41 +0200:
>if I got things right, here's what happened:
>
> 
>- Dieter worked on giving persistent objects a size hint based on the
>  pickle size which will allow us in the future to have cache strategies
>  based on size (instead/in addition to object counts).

Finished, result:

http://svn.zope.org/ZODB/branches/dm-memory_size_limited-cache/

> 
>- Dieter also started working on having ZEO clients drop their cache
>  files instead of verifying them, whenever they currently would start a
>  `full verification` as this would usually be a huge performance hit
>  for larger caches.

Finished, result:

http://svn.zope.org/ZODB/branches/dm-zeo_drop_cache_instead_verify/


Of course, the branches still need to be reviewed and merged into
the main development line.



-- 
Dieter


Re: [ZODB-Dev] Leaking file descriptors in ZEO tests

2008-08-28 Thread Dieter Maurer
Christian Theune wrote at 2008-8-27 16:40 +0200:
>I located an issue with leaking file descriptors in the ZEO tests and
>have a simple proposal how to fix it. (I can imagine a better way to
>exist but can't see one right now.)
>
>Here's what happens:
>
>- The (single) thread in the client process uses a
>  ManagedClientConnection.
>
>- The __init__ of ManagedClientConnection's super class causes a trigger
>  to be created which will then be replaced with a shared trigger by
>  ManagedClientConnection's __init__.
>
>- Unfortunately at this time the trigger that got temporarily created
>  won't be garbage collected (or at least the file descriptor won't be),
>  as asyncore holds a shared, module-global map of file descriptor
>  numbers to file descriptor objects.
>
>I fixed it for me by explicitly closing the trigger in
>ManagedClientConnection before replacing it with the shared one.

That sounds reasonable.



-- 
Dieter


Re: [ZODB-Dev] BTree pickle size

2008-08-28 Thread Dieter Maurer
Roché Compaan wrote at 2008-8-24 14:00 +0200:
>This is the fsdump output for a single IOBTree:
>
>  data #00032 oid=1bac size=5435 class=BTrees._IOBTree.IOBTree
>
>What is persisted as part of the 5435 bytes? References to containing
>buckets? What else?

For optimization reasons,
an "IOBTree" can in fact essentially be an "IOBucket" (in the case of a small
tree consisting of a single bucket).

This means that the "IOBTree" above can in fact contain
up to 60 integers with corresponding values (Python objects).
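
One can see this directly (a sketch; the exact state layout is an
implementation detail):

  from BTrees.IOBTree import IOBTree
  import cPickle

  t = IOBTree()
  for i in range(5):
      t[i] = 'value %d' % i
  # for a single-bucket tree, the state is just the flattened
  # (key, value, key, value, ...) sequence of the one bucket
  print t.__getstate__()
  print len(cPickle.dumps(t.__getstate__(), 1))   # rough pickle footprint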



-- 
Dieter


Re: [ZODB-Dev] BTree pickle size

2008-08-28 Thread Dieter Maurer
Roché Compaan wrote at 2008-8-25 17:36 +0200:
>On Sun, 2008-08-24 at 08:55 +0200, Roché Compaan wrote:
>> Thanks for the feedback. I'll re-run the tests without any text indexes,
>> as well as run it with other implementations such as TextIndexNG3 and
>> SimpleTextIndex and compare the results.
>> 
>
>Some more tests show that text indexes aren't the worst offenders. Date
>and DateRangeIndexes use IISet in cases where IITreeSet seem more
>appropriate. To me there isn't much more value to investigate other text
>index implementations. I'd rather spend the time to compare the overall
>results with other indexing implementations altogether, like solr or
>indexing in an RDBMS.
>
>Listed below are some stats (where I ran my original test in which I
>create 1 documents) that compare an unmodified setup, a catalog
>without text indexes, a catalog without date indexes, a catalog without
>metadata and no catalog at all.
>
>Total size of default setup:  2569.97 MB
>Total size excluding text indexes:1963.89 MB

This means text indexes cost about 600 MB (25 %).

>Total size excluding date range indexes:  2043.26 MB

This means range indexes cost about 500 MB.

You may consider a "Managable RangeIndex" instead of the standard
range indexes.

With "Managable RangeIndex", a "DateRangeIndex" is implemented
as a "RangeIndex" with data type "DateInteger" or "DateTimeInteger".


If you also use "dm.incrementalsearch" with "Products.AdvancedQuery",
then you can replace the (expensive, in terms of both storage
and runtime) range indexes by incremental filtering --
which may not only save lots of space but can also
give dramatic speed improvements.



-- 
Dieter


Re: [ZODB-Dev] BTree pickle size

2008-08-23 Thread Dieter Maurer
Dieter Maurer wrote at 2008-8-23 14:09 +0200:
> ...
>A typical "IISet" contains 90 value records and a persistent reference.
>
>I expect that an integer is pickled in 5 bytes. Thus, about 0.5 kB
>should be expected as typical size of an "IISet".
>Your "IISet" instances seem to be about 1.5 kB large.
>
>That is significantly larger than I would expect but maybe not
>yet something to worry about.

The larger-than-expected size probably results from using an "IISet"
at a place where an "IITreeSet" would have been better.
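
The difference matters for write traffic: an "IISet" is a single persistent
record that is rewritten on every change, while an "IITreeSet" splits into
many small bucket records. A rough sketch of measuring the one-record cost:

  from BTrees.IIBTree import IISet
  import cPickle

  big = IISet(range(10000))
  # one ZODB record holding all 10000 integers; every insertion
  # rewrites the whole thing
  print len(cPickle.dumps(big.__getstate__(), 1))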



-- 
Dieter


Re: [ZODB-Dev] BTree pickle size

2008-08-23 Thread Dieter Maurer
Jean Jordaan wrote at 2008-8-23 20:44 +0700:
>> That is significantly larger than I would expect but maybe not
>> yet something to worry about.
>
>[...]
>
>> Your "IIBuckets" are smaller than one would expect.
>
>These are plain ATDocuments, so either Plone's behaviour is unexpected
>or the measurement is off.

They are likely not yet as full as one would expect.



-- 
Dieter


Re: [ZODB-Dev] BTree pickle size

2008-08-23 Thread Dieter Maurer
Roché Compaan wrote at 2008-8-23 19:31 +0200:
>On Sat, 2008-08-23 at 14:09 +0200, Dieter Maurer wrote:
>> Roché Compaan wrote at 2008-8-22 14:49 +0200:
>> >I've been doing some benchmarks on Plone and got some surprising stats
>> >on the pickle size of btrees and their buckets that are persisted with
>> >each transaction. Surprising in the sense that they are very big in
>> >relation to the actual data indexed. I would appreciate it if somebody
>> >can help me understand what is going on, or just take a look to see if
>> >the sizes look normal.
>> >
>> >In the benchmark I add and index 1 ATDocuments. I commit after each
>> >document to simulate a transaction per request environment. Each
>> >document has a 100 byte long description and 100 bytes in it's body. The
>> >total transaction size however is 40K in the beginning. The transaction
>> >sizes grow linearly to about 350K when reaching 1 documents.
>> 
>> The "Bucket" nodes store usually between 22 ("OOBucket") and 90 ("IIBucket")
>> objects in a single bucket.
>> 
>> With any change, the transaction will contain unmodified data
>> for several dozens other objects.
>
>Are you saying *all* 22 OOBuckets and 90 IIBuckets will be persisted
>again whether they are modified or not?

I did not speak of "22 OOBuckets" but of typically 22 entries in an
"OOBucket" (similarly for "IIBucket").

And indeed, when a single entry in an "OOBucket" is changed, all
entries are rewritten even if the other entries did not change.

That is because the ZODB's load/store granularity is the persistent object
(excluding persistent subobjects). An "OOBucket" is a persistent object --
it is always loaded/stored as a whole (all entries together).
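
A small sketch of that granularity:

  from BTrees.OOBTree import OOBucket
  import cPickle

  b = OOBucket()
  for i in range(22):
      b['key%02d' % i] = 'value %02d' % i
  b['key00'] = 'changed'       # touch a single entry ...
  # ... yet the record that would be written back contains them all:
  print len(cPickle.dumps(b.__getstate__(), 1))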



-- 
Dieter


Re: [ZODB-Dev] BTree pickle size

2008-08-23 Thread Dieter Maurer
Roché Compaan wrote at 2008-8-22 14:49 +0200:
>I've been doing some benchmarks on Plone and got some surprising stats
>on the pickle size of btrees and their buckets that are persisted with
>each transaction. Surprising in the sense that they are very big in
>relation to the actual data indexed. I would appreciate it if somebody
>can help me understand what is going on, or just take a look to see if
>the sizes look normal.
>
>In the benchmark I add and index 1 ATDocuments. I commit after each
>document to simulate a transaction per request environment. Each
>document has a 100 byte long description and 100 bytes in it's body. The
>total transaction size however is 40K in the beginning. The transaction
>sizes grow linearly to about 350K when reaching 1 documents.

The "Bucket" nodes usually store between 22 ("OOBucket") and 90 ("IIBucket")
objects in a single bucket.

With any change, the transaction will therefore contain unmodified data
for several dozen other objects.

>What concerns me is that the footprint of indexed data in terms of
>BTrees, Buckets and Sets are huge! The total amount of data committed
>that related directly to ATDocument is around 30 Mbyte. The total for
>BTrees, Buckets and IISets is more than 2 Gbyte. Even taking into
>account that Plone has a lot of catalog indexes and metadata columns (I
>think 71 in total), this seems very high. 
>
>This is a summary of total data committed per class:
>
>Classname,Object Count,Total Size (Kbytes)
>BTrees._IIBTree.IISet,640686,1024506

A typical "IISet" contains 90 value records and a persistent reference.

I expect that an integer is pickled in 5 bytes. Thus, about 0.5 kB
should be expected as typical size of an "IISet".
Your "IISet" instances seem to be about 1.5 kB large.

That is significantly larger than I would expect but maybe not
yet something to worry about.


> ...
>BTrees._IIBTree.IIBucket,252121,163524

The same size reasoning applies to "IIBucket"s: 90 records, but
now consisting of key and value (about 10 bytes per record).

Your "IIBuckets" are smaller than one would expect.



-- 
Dieter


Re: [ZODB-Dev] Runaway cache size

2008-08-10 Thread Dieter Maurer
[EMAIL PROTECTED] wrote at 2008-7-31 15:09 -0400:
> ...
>> I don't have experience with running the db in readonly mode in 
>> production.

There is no difference in cache handling between readonly and readwrite
mode.

An old thread explains why this lack of difference is necessary.



-- 
Dieter


Re: [ZODB-Dev] Persistent class in python

2008-07-26 Thread Dieter Maurer
Микола Харечко wrote at 2008-7-19 19:50 +0300:
>Hi. I am trying to rewrite the Persistent class in Python. How can I write
>a Python class that passes these tests (persistent.txt):
>
>Basic type structure
>
>  >>> Persistent.__dictoffset__
>  0
>  >>> Persistent.__weakrefoffset__
>  0
>  >>> Persistent.__basicsize__ > object.__basicsize__
>  True
>
>and at the same time - it will be possible to create weakreferences to
>this class (in PickleCache). Also this class must not have __slots__
>(__getstate__ method).

I think you should change the tests.

The tests above verify that "Persistent" is implemented in C
and thus cannot pass when you implement "Persistent" in Python.



-- 
Dieter


Re: [ZODB-Dev] What the ZODB really does

2008-06-04 Thread Dieter Maurer
Martijn Faassen wrote at 2008-6-3 19:33 +0200:
> ...
>* if you don't inherit your class from Persistent, or use a python 
>builtin (which doesn't inherit from Persistent), things will be 
>serialized multiple times, as far as I'm aware. (I may be wrong)

In general, you are right.

There are a few exceptional cases: the ZODB uses Python's
serialization (pickle), and pickle has support for sharing.
This means that if the same persistent object contains the same
object twice, the two references are shared even if the contained
object is not an instance of "Persistent".

-- 
Dieter


Re: [ZODB-Dev] zodb does not save transaction

2008-05-29 Thread Dieter Maurer
tsmiller wrote at 2008-5-28 19:55 -0700:
> ...
>I have a bookstore that uses the ZODB as its storage.  It uses qooxdoo as
>the client and CherryPy for the server.  The server has a 'saveBookById'
>routine that works 'most' of the time.  However, sometimes the
>transaction.commit() does NOT commit the changes and when I restart my
>server the changes are lost.  

This looks like a persistency bug.

Persistency bugs of this kind happen when a non-persistent mutable
instance is modified in place without the containing
persistent object being told about the change.

The change is then persisted only accidentally (together with
another change of the containing persistent object).
However, the change is seen inside the connection that made it
-- until the containing persistent object is flushed
from the ZODB cache (or a restart happens, of course).
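
The classic pattern, as a sketch:

  from persistent import Persistent

  class Book(Persistent):
      def __init__(self):
          self.tags = []              # plain list: not persistence-aware

      def add_tag_buggy(self, tag):
          self.tags.append(tag)       # in-place change: ZODB never notices

      def add_tag_ok(self, tag):
          self.tags.append(tag)
          self._p_changed = True      # tell ZODB the Book changed

Alternatives are to reassign the attribute or to use a persistence-aware
container such as "PersistentList".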



-- 
Dieter


Re: [ZODB-Dev] Shared/DC/ZRDB/TM.py:_register

2008-05-23 Thread Dieter Maurer
Vincent Pelletier wrote at 2008-5-22 11:21 +0200:
> ...
>BTW, the usual error hook treats conflict error exceptions differently from 
>others, and I guess it was done so because those can happen in TPC.

No, the reason is to repeat a transaction that failed due to
a "ConflictError".



-- 
Dieter


Re: [ZODB-Dev] Shared/DC/ZRDB/TM.py:_register

2008-05-19 Thread Dieter Maurer
Vincent Pelletier wrote at 2008-5-14 02:49 +0200:
>On Tuesday 13 May 2008 20:02:42, Dieter Maurer wrote:
>> Someone convinced us that error handling should (of course)
>> see the state the error happened and not a new clean state -- in
>> order to be able to report about the errorneous state
>
>Then, in my opinion, it should not be executed "inside" what failed, but in a 
>clean environment with a "pointer" (in non-technical meaning) to the failed 
>transaction.

Quite difficult and complex.

>> Another reason was also: should your error template need to run
>> in a fresh transaction, then just abort the old one.
>
>How ? IIRC it's a bad coding practice to interact with transaction mechanism 
>from what's considered as "inside" a transaction (ZPublisher being the 
>borderline).

That's what you have now. The border line is the end of request
processing (which includes error handling).


Think a bit about your wish:

  As soon as one transaction ends, a new one starts.

  What should happen with your artificial error-handling transaction?
  Should it be aborted or committed? What should happen when the
  commit fails -- another round of error handling, in yet another
  error-handling transaction?


The current behaviour is good in most cases.
If you dislike it in some special cases, abort the transaction
(you will get a new one, aborted automatically at the end
of error handling, unless you commit it).

> 
>Maybe 2 cases should be handled differently:
> - exception happened when processing transaction: do not abort immediately
> - exception happened in transaction handling (hopefully only in "commit"):
>   abort to offer error handling a "usable" environment

The better alternative would be not to prevent "join"s to
a doomed transaction.



-- 
Dieter


Re: [ZODB-Dev] Shared/DC/ZRDB/TM.py:_register

2008-05-13 Thread Dieter Maurer
Andreas Jung wrote at 2008-5-13 20:19 +0200:
> ...
>> "Shared.DC.ZRDB.TM.TM" is the standard Zope[2] way to implement a
>> ZODB DataManager.
>
>Nowadays you create a datamanager implementing IDataManager and join it 
>with the current transaction. Shared.DC.ZRDB.TM.TM is pretty much 
>old-old-old-style.

Time to change the Zope 2 code base ;-)

There, you still find the old way -- and it is used by other Zope 2
components.



-- 
Dieter


Re: [ZODB-Dev] Shared/DC/ZRDB/TM.py:_register

2008-05-13 Thread Dieter Maurer
Andreas Jung wrote at 2008-5-13 10:44 +0200:
> ...
>> Is there any reason why TM._register is hiding exceptions ?
>
>Isn't the right approach for integrating a module with ZODB transactions 
>using a ZODB DataManager?

"Shared.DC.ZRDB.TM.TM" is the standard Zope[2] way to implement a
ZODB DataManager.


And, as usual, "_register" should not hide exceptions. I.e., this is a bug.



-- 
Dieter


Re: [ZODB-Dev] Shared/DC/ZRDB/TM.py:_register

2008-05-13 Thread Dieter Maurer
Vincent Pelletier wrote at 2008-5-13 10:36 +0200:
> ...
>To reproduce the problem:
> - create a class inheriting from TM, and define a method calling register.
>   Add logs to abort and commit method to track transaction end.
> - create a default error page which triggers a call to the method created
>   above. (this is equivalent to accessing some database from the error page,
>   for example)
>   Call it twice.
>   Most realistic case is using an instance surviving the transaction (a
>   global, or a persistent object)
> - trigger an error in TPC (a raise in vote is the most realistic case) and get
>   the error page to render. The most obvious breakage I could see is when
>   trying to undo a non-undoable transaction (modify a script twice, undo the
>   oldest without undoing the newest)
>
>Here is what happens:
> - first transaction (the "undo" in my example) raises in TPC, transaction is
>   marked as failed
> - error message gets rendered in the same transaction (that's a ZPublisher
>   bug, but I think the problem "root" is hiding the failure)

Formerly, the transaction was aborted before error handling.

Someone convinced us that error handling should (of course)
see the state in which the error happened, not a new clean state -- in
order to be able to report about the erroneous state.

Therefore, it was changed.
And I think it was correct to change it.


Another reason: should your error template need to run
in a fresh transaction, then just abort the old one.
If the transaction were aborted before error handling,
an error template with different requirements would not have a chance.



-- 
Dieter


Re: [ZODB-Dev] Multiple databases / mount points : documentation?

2008-04-09 Thread Dieter Maurer
Vincent Rioux wrote at 2008-4-9 11:58 +0200:
>I am using zodb FileStorage for a standalone application and looking for 
>some advice, tutorials or descriptions for using a zodb made of an 
>aggregation of smaller ones.
>I have been told that the "mount" mechanism should do the trick. Any 
>pointers are welcome...

"Mount"s are a Zope concept.

You find a very terse description in Zope's configuration
schema ("Zope2/Startup/zopeschema.xml"). Look for "mount-point".

Once you have configured a "zodb_db" with one or more mount
points, you can create "mount" objects in your storage
via the ZMI.
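
A hedged sketch of such a "zodb_db" section in "zope.conf" (name and path
are hypothetical):

  <zodb_db extra>
      # make the database mountable at /extra in the main database
      mount-point /extra
      <filestorage>
          path $INSTANCE/var/extra.fs
      </filestorage>
  </zodb_db>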



-- 
Dieter


Re: [ZODB-Dev] ZEO Client deadlocking in asyncore.poll - how to I debug

2008-04-07 Thread Dieter Maurer
Anton Stonor wrote at 2008-4-7 16:16 +0200:
>We have a setup with a ZEO server and 4 ZEO clients.
>
>During the last weeks we have seen almost daily deadlocks in some of the 
>ZEO clients. I've tried to wait for up to 30 minutes before restarting a 
>client.
>
>I could need an advice on how to debug this.
>
>With DeadlockDebugger I see the same pattern each time:
>
>One thread is hanging:
> ...
> r_flags, r_args = self.wait(msgid)
>   File "/usr/local/www/zope-2.9.6/lib/python/ZEO/zrpc/connection.py", 
>line 638, in wait

This means that the client is waiting for a reply -- which apparently
does not come.

Maybe some router or firewall sometimes drops packets or connections?



-- 
Dieter


Re: [ZODB-Dev] Analyzing a ZODB.

2008-04-06 Thread Dieter Maurer
Manuel Vazquez Acosta wrote at 2008-4-5 11:49 -0400:
> ...
>I wonder if there's a way to actually see what objects (or object types)
>are modified by those transactions. So I can go directly to the source
>of the (surely unnecessary) transaction.

The ZODB utility "fsdump" generates a human-readable
view of your storage.

Among other things, you see which transactions have been committed
(together with all transaction metadata) and which objects were
modified (identified by their oid and, I think, their class).
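
A minimal sketch (the module path has moved between ZODB versions --
"ZODB.fsdump" in older releases, "ZODB.FileStorage.fsdump" in newer ones):

  from ZODB.fsdump import fsdump
  fsdump('Data.fs')    # prints a transaction-by-transaction listing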



-- 
Dieter


Re: [ZODB-Dev] Problems installing ZODB under Windows

2008-03-27 Thread Dieter Maurer
Andreas Holtz wrote at 2008-3-26 14:04 +0100:
> ...
>error: Setup script exited with error: Python was built with Visual Studio 
>2003;
>extensions must be built with a compiler than can generate compatible binaries.
>Visual Studio 2003 was not found on this system. If you have Cygwin installed,
>you can try compiling with MingW32, by passing "-c mingw32" to setup.py.

The message tells you about the problem.
You need a C compiler, compatible with "Visual Studio 2003".


Alternatively, you may find binary installers for the components
you need. If I remember right, binary installers are
available for Windows for many ZODB versions.



-- 
Dieter


Re: [ZODB-Dev] Re: ZODB Benchmarks

2008-03-26 Thread Dieter Maurer
Sean Allen wrote at 2008-3-25 15:23 -0400:
> ...
>
>On Mar 25, 2008, at 2:54 PM, Dieter Maurer wrote:
>> Benji York wrote at 2008-3-25 14:24 -0400:
>>> ... commit contentions ...
>>>> Almost surely there are several causes that all can lead to  
>>>> contention.
>>>>
>>>> We already found:
>>>>
>>>>  *  client side causes (while the client holds the commit lock)
>>>>
>>>>- garbage collections (which can block a client in the order of
>>>>  10 to 20 s)
>...
>> A reconfiguration of the garbage collector helped us with this one
>> (the standard configuration is not well tuned to processes with
>> large amounts of objects).
>
>what'd you do?

# reconfigure the garbage collector
#  generation 0 GC at "(allocated - freed) == 7,000"; analyse 7,000 objects
#  generation 1 GC at "(allocated - freed) == 140,000"; analyse 140,000 objects
#  generation 2 GC at "(allocated - freed) == 1,400,000"; analyse all objects
import gc; gc.set_threshold(7000, 20, 10)



-- 
Dieter


Re: [ZODB-Dev] Re: ZODB Benchmarks

2008-03-25 Thread Dieter Maurer
Benji York wrote at 2008-3-25 14:24 -0400:
> ... commit contentions ...
>> Almost surely there are several causes that all can lead to contention.
>> 
>> We already found:
>> 
>>   *  client side causes (while the client holds the commit lock)
>>   
>> - garbage collections (which can block a client in the order of
>>   10 to 20 s)
>
>Interesting.  Perhaps someone might enjoy investigating turning off 
>garbage collection during commits.

A reconfiguration of the garbage collector helped us with this one
(the standard configuration is not well tuned for processes with
large numbers of objects).

> 
>> - invalidation processing, especially ZEO ClientCache processing
>
>Interesting.  Not knowing much about how invalidations are handled, I'm 
>curious where the slow-down is.  Do you have any more detail?

Not much:

We have a component called RequestMonitor which periodically
checks for long-running requests and logs the corresponding stack
traces.
This monitor very often sees requests (holding the commit lock)
that are in "ZEO.cache.FileCache.settid".

As the monitor runs asynchronously with the observed threads,
the probability of observing a thread in a given function
depends on how long the thread spends inside this function (total
time, i.e. number of visits times mean time per visit).
From this, we can conclude that significant time is spent in
"settid".



-- 
Dieter


Re: [ZODB-Dev] Re: ZODB Benchmarks

2008-03-25 Thread Dieter Maurer
Benji York wrote at 2008-3-25 09:40 -0400:
>Christian Theune wrote:
>> I talked to Brian Aker (MySQL guy) two weeks ago and he proposed that we
>> should look into a technique called `group commit` to get rid of the "commit
>> contention".
> ...
>Summary: fsync is slow (and the cornerstone of most commit steps), so 
>try to gather up a small batch of commits to do all at once (with only 
>one call to fsync).

Our commit contention is definitely not caused by "fsync";
our "fsync" is quite fast. If only "fsync" had to be considered,
we could easily process at least 1,000 transactions per second -- yet
with only 10 transactions per second we actually get contention a few times
per week.



We do not yet know precisely the cause of our commit contentions.
Almost surely there are several causes that all can lead to contention.

We already found:

  *  client side causes (while the client holds the commit lock)

- garbage collections (which can block a client in the order of
  10 to 20 s)

- NFS operations (which can take up to 27 s in our setup -- for
  still unknown reasons)

- invalidation processing, especially ZEO ClientCache processing

  *  server side causes

- commit lock held during the copy phase of a pack

- IO thrashing during the reachability analysis of a pack

- non-deterministic server-side IO anomalies
  (IO suddenly takes several times longer than usual -- for still
  unknown reasons)
> Somewhat like Nagle's algorithm, but for fsync.
>
>The kicker is that OSs and hardware often lie about fsync (and it's 
>therefore fast) and good hardware (disk arrays with battery backed write 
>cache) already make fsync pretty fast.
>
>Not to suggest that group commit wouldn't speed things up, but it would 
>seem that the technique will make the largest improvement for people 
>that are using a non-lying fsync on inappropriate hardware.
>-- 
>Benji York
>Senior Software Engineer
>Zope Corporation

-- 
Dieter


Re: [ZODB-Dev] Re: ZODB Benchmarks

2008-03-21 Thread Dieter Maurer
Chris Withers wrote at 2008-3-20 22:22 +:
>Roché Compaan wrote:
>> Not yet, they are very time consuming. I plan to do the same tests over
>> ZEO next to determine what overhead ZEO introduces.
>
>Remember to try introducing more app servers and see where the 
>bottleneck comes ;-)

We have seen "commit contention" with many (24) ZEO clients
and a high-write-rate application (almost all requests write to
the ZODB).



-- 
Dieter


Re: [ZODB-Dev] ERROR ZODB.Connection Couldn't load state for 0x01

2008-03-10 Thread Dieter Maurer
Dylan Jay wrote at 2008-3-10 17:37 +1100:
> ...
>I have a few databases being served out of a zeo. I restarted them in a 
>routine operation and now I can't restart due to the following error
>
>Any idea on how to fix this?
>
>
>2008-03-10 06:29:12 ERROR ZODB.Connection Couldn't load state for 0x01
> 
>line 540, in load_multi_oid
> conn = self._conn.get_connection(database_name)
>   File "/home/zope/thebe/parts/zope2/lib/python/ZODB/Connection.py", 
>line 328, in get_connection
> new_con = self._db.databases[database_name].open(
>KeyError: 'edb'

Looks like a configuration problem.

Apparently, no database "edb" is configured (though one is
expected for some reason).
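
For comparison, a minimal multi-database setup in plain Python (names
hypothetical); "get_connection" can only find databases registered in the
shared mapping:

  from ZODB import DB
  from ZODB.MappingStorage import MappingStorage

  databases = {}
  main = DB(MappingStorage(), databases=databases, database_name='main')
  edb = DB(MappingStorage(), databases=databases, database_name='edb')

  conn = main.open()
  edb_conn = conn.get_connection('edb')   # KeyError if 'edb' were missing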



-- 
Dieter


Re: [ZODB-Dev] ZEO+MultipleClients+ConflictErrors

2008-02-27 Thread Dieter Maurer
Alan Runyan wrote at 2008-2-26 13:07 -0600:
> ...
>Most people come at ZODB with previous experience in RDBMS.
>
>How do they map SQL INSERT/UPDATE activities to ZODB data structures?
>In a way that does not create hotspot.

I tend to view the objects in an application
as belonging to three types:

  *  primary content objects (documents, files, images, ...)

  *  containers (folders for organisation)

  *  global auxiliary objects (internal objects used for global
 tasks, such as cataloguing)

For the primary content objects, workflow is usually appropriate
to prevent concurrent modifying access.

(Large) containers should be based on a scalable data structure
with conflict resolution (such as "OOBTree"). Moreover,
the ids should be chosen randomly (to ensure that concurrent insertions
are likely to be widely spread over the complete structure).
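
A sketch of the random-id idea:

  import random
  from BTrees.OOBTree import OOBTree

  members = OOBTree()

  def add_member(member):
      # random ids spread concurrent insertions over many buckets,
      # so parallel transactions rarely write the same bucket
      while True:
          id = '%08x' % random.randrange(16 ** 8)
          if id not in members:
              members[id] = member
              return id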

The most difficulties can come from the global auxiliary
objects -- as these are in some way internal, not under the direct
control of the application. We are using a variant of "QueueCatalog"
to tackle hotspots caused by cataloging.



-- 
Dieter


Re: [ZODB-Dev] Re: Re: IStorageIteration

2008-02-27 Thread Dieter Maurer
Thomas Lotze wrote at 2008-2-26 09:30 +0100:
>Dieter Maurer wrote:
>
>> How often do you need it?
>> It is worse the additional index? Especially in view that a storage may
>> contain a very large number of transactions?
>
>We've done it differently now anyway, using real iterators which store
>their state on the server and get garbage-collected when no longer needed.

Fine. In "dm.historical", you can find an alternative:

  It uses exponentially increasing prefetching and
  loads about 2*2**n records to get the first 2**n records.
  This means it has amortized linear runtime complexity.
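
A generic sketch of the idea (not the actual "dm.historical" code): double
the batch size on each round, so the total number of records loaded stays
within a constant factor of the records actually consumed:

  def iter_history(fetch, first=8):
      # "fetch(n)" is assumed to return the n most recent records
      size, seen = first, 0
      while True:
          records = fetch(size)
          for record in records[seen:]:
              yield record
          if len(records) < size:
              return               # history exhausted
          seen = len(records)
          size *= 2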



-- 
Dieter


Re: [ZODB-Dev] Re: IStorageIteration

2008-02-25 Thread Dieter Maurer
Thomas Lotze wrote at 2008-2-12 11:09 +0100:
> ...
>> I don't think that's going to work here.  Iterating through the
>> transactions in the database for each iteration is going to be totally
>> non-scalable.
>
>It seems to us that it would actually be the right thing to require that
>storages have an efficient, scalable and stateless way of accessing their
>transactions by ID. In the case of FileStorage, this might be achieved
>using an index analogous to the one mapping object IDs to file positions.

How often do you need it?
Is it worth the additional index? Especially in view of the fact that a storage
may contain a very large number of transactions.



-- 
Dieter


Re: [ZODB-Dev] Re: ZODB Benchmarks

2008-02-08 Thread Dieter Maurer
Roché Compaan wrote at 2008-2-7 21:44 +0200:
> ...
>There are use cases where having a container in the ZODB that can handle
>large volumes and maintain a high insertion rate would be very
>convenient. An example of such a use case would be a site with millions
>of members where each member has their own folder containing different
>content types. The rate at which new members register is very high as
>well

I do not believe that the insertion rate the ZODB can handle now
would be insufficient for this use case.

I do not have your timings at hand, but from our installation
I know that the ZODB can handle 10 transactions per second.
This would mean about 36.000 per day (10 hour days)
and about 1 million in a month.

>so the folder needs to handle insertions quickly. In this use case
>you are not dealing with structured data. If members in a site with such
>large volumes start to generate content, indexes in the ZODB become
>problematic too because of the slow rate of insertion.

We have several write intensive applications with storages
in the order of 10 to 20 GB and 10 to 20 millions objects
-- and have not yet seen problems with the insertion rate.

We do see other problems (notably commit contention)
but these problems cannot be solved by an increased insertion
rate.

>And it this point
>you start to stuff everything in relational database and the whole
>experience becomes painful ...

We speak again when you observe a concrete problem in a real
installation caused by limited insertion rate.



-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Re: ZODB Benchmarks

2008-02-07 Thread Dieter Maurer
Roché Compaan wrote at 2008-2-7 21:21 +0200:
> ...
>So if I asked you to build a data structure for the ZODB that can do
>insertions at a rate comparable to Postgres on high volumes, do you
>think that it can be done?

If you need a high write rate, the ZODB is probably not optimal.
Ask yourself whether it is not better to put such high frequency write
data directly into a relational database.

Whenever you have large amounts of highly structured data,
a relational database is necessarily more efficient than the ZODB.



-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Re: ZODB Benchmarks

2008-02-06 Thread Dieter Maurer
Roché Compaan wrote at 2008-2-6 20:18 +0200:
>On Tue, 2008-02-05 at 19:17 +0100, Dieter Maurer wrote:
>> Roché Compaan wrote at 2008-2-4 20:54 +0200:
>> > ...
>> >I don't follow? There are 2 insertions and there are 1338046 calls
>> >to persistent_id. Doesn't this suggest that there are 66 objects
>> >persisted per insertion? This seems way to high?
>> 
>> Jim told you that "persistent_id" is called for each object and not
>> only persistent objects.
>> 
>> An OOBucket contains up to 30 key value pairs, each of which
>> are subjected to a call to "persistent_id". In each of your pairs,
>> there is an additional persistent object. This means, you
>> should expect 3 calls to "persistent_id" for each pair in an "OOBucket".
>
>If I understand correctly, for each insertion 3 calls are made to
>"persistent_id"? This is still very far from the 66 I mentioned above?

You did not understand correctly.

You insert an entry. The insertion modifies (at least) one OOBucket.
The "OOBucket" needs to be written back. For each of its entries
(one is your new one, but there may be up to 29 others) 3
"persistent_id" calls will happen.



-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: RE : [ZODB-Dev] Re: ZODB Benchmarks

2008-02-06 Thread Dieter Maurer
Mignon, Laurent wrote at 2008-2-6 08:06 +0100:
>After a lot of tests and benchmarks, my feeling is that the ZODB does not seem
>suitable for systems managing much data stored in a flat hierarchy.
>The application that we currently develop is a business process management
>system as opposed to a content management system. In order to guarantee the
>necessary performance, we decided to stop using the ZODB. All data are now
>stored in a relational database.

Roché's corrected timings indicate:

  The ZODB is significantly slower than Postgres for insertions
  but comparatively fast (slightly faster) on lookups.



-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Re: ZODB Benchmarks

2008-02-05 Thread Dieter Maurer
Roché Compaan wrote at 2008-2-4 20:54 +0200:
> ...
>I don't follow? There are 2 insertions and there are 1338046 calls
>to persistent_id. Doesn't this suggest that there are 66 objects
>persisted per insertion? This seems way to high?

Jim told you that "persistent_id" is called for each object and not
only persistent objects.

An OOBucket contains up to 30 key value pairs, each of which
are subjected to a call to "persistent_id". In each of your pairs,
there is an additional persistent object. This means, you
should expect 3 calls to "persistent_id" for each pair in an "OOBucket".



-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] RelStorage now in Subversion

2008-02-05 Thread Dieter Maurer
Hello Shane,

Shane Hathaway wrote at 2008-2-3 23:57 -0700:
> ...
>Looking into this more, I believe I found the semantic we need in the 
>PostgreSQL reference for the LOCK statement [1].  It says this about 
>obtaining a share lock in read committed mode: "once you obtain the 
>lock, there are no uncommitted writes outstanding".  My understanding of 
>that statement and the rest of the paragraph suggests the following 
>guarantee: in read committed mode, once a reader obtains a share lock on 
>a table, it sees the effect of all previous transactions on that table.

I have been too pessimistic with respect to Postgres.

While Postgres uses the freedom of the ANSI isolation level definitions
(they say that some things must not happen but do not prescribe that
other things must necessarily happen), Postgres has a precise
specification for the "read committed" mode -- it says: in read
committed mode, each query sees the state as it has been when
the query started. This implies that it sees all transactions
that have been committed before the query started. This is sufficient
for your conflict resolution to be correct -- as you hold the commit
lock during conflict resolution such that no new transaction can happen
during the query in question.



-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Re: ZODB Benchmarks

2008-02-03 Thread Dieter Maurer
Roché Compaan wrote at 2008-2-3 09:15 +0200:
> ...
>I have tried different commit intervals. The published results are for a
>commit interval of 100, iow 100 inserts per commit.
>
>> Your profile looks very surprising:
>> 
>>   I would expect that for a single insertion, typically
>>   one persistent object (the bucket where the insertion takes place)
>>   is changed. About every 15 inserts, 3 objects are changed (the bucket
>>   is split) about every 15*125 inserts, 5 objects are changed
>>   (split of bucket and its container).
>>   But the mean value of objects changed in a transaction is 20
>>   in your profile.
>>   The changed objects typically have about 65 subobjects. This
>>   fits with "OOBucket"s.
>
>It was very surprising to me too since the insertion is so basic. I
>simply assign a Persistent object with 1 string attribute that is 1K in
>size to a key in a OOBTree. I mentioned this earlier on the list and I
>thought that Jim's explanation was sufficient when he said that the
>persistent_id method is called for all objects including simple types
>like strings, ints, etc. I don't know if it explains all the calls that
>add up to a mean value of 20 though. I guess the calls are being made by
>the cPickle module, but I don't have the experience to investigate this.

The number of "persistent_id" calls suggests that a written
persistent object has a mean value of 65 subobjects -- which
fits well with OOBuckets.

However, when the profile is for commits with 100 insertions each,
then the number of written persistent objects is far too small.
In fact, we would expect about 200 persistent object writes per transaction:
the 100 new persistent objects assigned plus about as many buckets
changed by these insertions.

> 
>The keys that I lookup are completely random so it is probably the case
>that the lookup causes disk lookups all the time. If this is the case,
>is 230ms not still to slow?

Unreasonably slow in fact.

A tree with size 10**7 likely does not have a depth larger than 4
(internal nodes should typically have at least 125 entries, leaves should have
at least 15 -- a tree of depth 4 thus can have about 125**3*15 = 29.x * 10**6).
Therefore, one would expect at most 4 disk accesses.

On my (6 year old) computer, a disk access can take up to 30 ms.



-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] RelStorage now in Subversion

2008-02-03 Thread Dieter Maurer
Meanwhile I have carefully studied your implementation.

There is only a single point I am not certain about:

  As I understand isolation levels, they guarantee that certain bad
  things will not happen -- but not that everything good will happen.

  For "read committed" this means: it guarantees that I will
  only see committed transactions but not necessarily that I will see
  the effect of a transaction as soon as it is committed.

  Your conflict resolution requires that it sees a transaction as
  soon as it is committed.

  The supported relational databases may have this property -- but
  I expect we do not have a written guarantee that this will definitely
  be the case.

I plan to make a test which tries to provoke a conflict resolution
failure -- and gives me confidence that the "read committed" of
Postgres really has the required property.
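
A sketch of such a test (psycopg2 and a table "t(id int, v int)" are
assumed here; they are not part of RelStorage): commit on one
connection, then check that a fresh query on a second "read committed"
connection sees the write.

  import psycopg2

  a = psycopg2.connect("dbname=test")
  b = psycopg2.connect("dbname=test")   # read committed by default

  cur_b = b.cursor()
  cur_b.execute("SELECT v FROM t WHERE id = 1")  # opens b's transaction
  before = cur_b.fetchone()

  cur_a = a.cursor()
  cur_a.execute("UPDATE t SET v = v + 1 WHERE id = 1")
  a.commit()

  # still inside b's transaction: in read committed mode each new
  # query should see the state as of the start of that query,
  # i.e. a's committed update
  cur_b.execute("SELECT v FROM t WHERE id = 1")
  after = cur_b.fetchone()
  print before, after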



-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Re: ZODB Benchmarks

2008-02-02 Thread Dieter Maurer
Roché Compaan wrote at 2008-2-1 21:17 +0200:
>I have completed my first round of benchmarks on the ZODB and welcome
>any criticism and advise. I summarised our earlier discussion and
>additional findings in this blog entry:
>http://www.upfrontsystems.co.za/Members/roche/where-im-calling-from/zodb-benchmarks

In your insertion test: when do you do commits?
One per insertion? Or one per n insertions (for which "n")?


Your profile looks very surprising:

  I would expect that for a single insertion, typically
  one persistent object (the bucket where the insertion takes place)
  is changed. About every 15 inserts, 3 objects are changed (the bucket
  is split) about every 15*125 inserts, 5 objects are changed
  (split of bucket and its container).
  But the mean value of objects changed in a transaction is 20
  in your profile.
  The changed objects typically have about 65 subobjects. This
  fits with "OOBucket"s.


Lookup times:

0.23 s would be 230 ms not 23 ms.

The reason for the dramatic drop from 10**6 to 10**7 cannot lie in the
BTree implementation itself. Lookup time is proportional to
the tree depth, which ideally would be O(log(n)). While BTrees
are not necessarily balanced (and therefore the depth may be larger
than logarithmic) it is not easy to obtain a severely unbalanced
tree by insertions only.
Other factors must have contributed to this drop: swapping, cache too small,
garbage collections...

Furthermore, the lookup times for your smaller BTrees are far too
good -- fetching any object from disk takes in the order of several
ms (2 to 20, depending on your disk).
This means that the lookups for your smaller BTrees have
typically been served directly from the cache (no disk lookups).
With your large BTree disk lookups probably became necessary.



-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] How to avoid ConflictErrors in BTrees ?

2008-02-01 Thread Dieter Maurer
Andreas Jung wrote at 2008-2-1 12:13 +0100:
>
>
>--On 1. Februar 2008 03:03:53 -0800 Tarek Ziadé <[EMAIL PROTECTED]> 
>wrote:
>>
>> Since BTrees are written in C, I couldn't add my own conflict manager to
>> try to merge buckets. (and this is
>> way over my head)
>>
>
>But you can inherit from the BTree classes and hook your 
>_p_resolveConflict() handler into the Python class - or?

I very much doubt that this is a possible approach:

  A BTree is a complex object, an object that creates new objects
  (partially other BTrees and partially Buckets) when it grows.

  Giving the "BTree" class used by the application a "_p_resolveConflict"
  will do little -- because the created subobjects (Buckets mainly)
  will not know about it.

  Note especially, that the only effective conflict resolution
  is at the bucket level. As you can see, there is currently
  no way to tell a "BTree" which "Bucket" class it should use
  for its buckets -- this renders your advice ineffective.



-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] RelStorage now in Subversion

2008-02-01 Thread Dieter Maurer
Hallo Shane,

Shane Hathaway wrote at 2008-1-31 13:45 -0700:
> ...
>No, RelStorage doesn't work like that either.  RelStorage opens a second
>database connection when it needs to store data.  The store connection
>will commit at the right time, regardless of the polling strategy.  The
>load connection is already left open between connections; I'm only
>talking about allowing the load connection to keep an idle transaction.
> I see nothing wrong with that, other than being a little surprising.

That looks very troublesome.

Unless you begin a new transaction on your load connection after
the write connection was committed,
your load connection will not see the data written over
your write connection.

>>  and you read older and older data
>> which must increase serializability problems
>
>I'm not sure what you're concerned about here.  If a storage instance
>hasn't polled in a while, it should poll before loading anything.

Even if it has polled not too far in the past, it should
repoll when the storage is joined to a Zope request processing
(in "Connection._setDB"):
If it does not, then it may start work with an already outdated
state -- which can have adverse effects when the request bases modifications
on this outdated state.
If everything works fine, then a "ConflictError" results later
during the commit.

This implies, the read connection must start a new transaction
at least after a "ConflictError" has occurred. Otherwise, the
"ConflictError" cannot go away.

> 
>> (Postgres might
>> not garantee serializability even when the so called isolation
>> level is chosen; in this case, you may not see the problems
>> directly but nevertheless they are there).
>
>If that is true then RelStorage on PostgreSQL is already a failed
>proposition.  If PostgreSQL ever breaks consistency by exposing later
>updates to a load connection, even in the serializable isolation mode,
>ZODB will lose consistency.  However, I think that fear is unfounded.
>If PostgreSQL were a less stable database then I would be more concerned.

I do not expect that Postgres will expose later updates to the load
connection.

What I fear is described by the following scenario:

   You start a transaction on your load connection "L".
   "L" will see the world as it has been at the start of this transaction.

   Another transaction "M" modifies object "o".

   "L" reads "o", "o" is modified and committed.
   As "L" has used "o"'s state before "M"'s modification,
   the commit will try to write stale data.
   Hopefully, something lets the commit fail -- otherwise,
   we have lost a modification.

If something causes a commit failure, then the probability of such
failures increases with the outdatedness of "L"'s reads.

> ...
>RelStorage only uses the serializable isolation level for loading, not
>for storing.  A big commit lock prevents database-level conflicts while
>storing.  RelStorage performs ZODB-level conflict resolution, but only
>while the commit lock is held, so I don't yet see any opportunity for
>consistency to be broken.  (Now I imagine you'll complain the commit
>lock prevents scaling, but it uses the same design as ZEO, and that
>seems to scale fine.)

Side note:

  We currently face problems with ZEO's commit lock: we have 24 clients
  that produce about 10 transactions per second. We observe
  occasional commit contention lasting a few minutes.

  We already have found several things that contribute to this problem --
  slow operations on clients while the commit lock is held on ZEO:
  Python garbage collections, invalidation processing, stupid
  application code.
  But there are still some mysteries and we do not yet have
  a good solution.

> 
I noticed another potential problem:

  When more than a single storage is involved, transactional
  consistency between these storages requires a true two phase
  commit.

  Only recently, Postgres has started support for two phase commits ("2PC") but
  as far as I know Python access libraries do not yet support the
  extended API (a few days ago, there has been a discussion on
  "[EMAIL PROTECTED]" about a DB-API extension for two phase commit).

  Unless, you use your own binding to Postgres 2PC API, "RelStorage"
  seems only safe for single storage use.


-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] RelStorage now in Subversion

2008-01-31 Thread Dieter Maurer
Shane Hathaway wrote at 2008-1-31 11:55 -0700:
> ...
>Yes, quite right!
>
>However, we don't necessarily have to roll back the Postgres transaction
>on every ZODB.Connection close, as we're doing now.

That sounds very nasty!

In Zope, I definitely *WANT* to either commit or roll back the
transaction when the request finishes. I definitely do not
want to let the following completely unrelated request
decide about the fate of my modifications.

> If we leave the
>Postgres transaction open even after the ZODB.Connection closes, then
>when the ZODB.Connection reopens, we have the option of not polling,
>since at that point ZODB's view of the database remains unchanged from
>the last time the Connection was open.

Yes, but you leave the fate of your previous activities to
the future -- and you read older and older data
which must increase serializability problems (Postgres might
not guarantee serializability even when the so-called serializable isolation
level is chosen; in this case, you may not see the problems
directly but nevertheless they are there).

>It's not usually good practice to leave sessions idle in a transaction,
>but this case seems like a good exception since it should significantly
>reduce the database traffic.

I agree that it can reduce traffic but I am almost convinced that
the price will be high (in either "cannot serialize concurrent updates"
or not directly noticeable serializability violations).



-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Speedy RelStorage/PostgreSQL

2008-01-31 Thread Dieter Maurer
Shane Hathaway wrote at 2008-1-31 11:35 -0700:
>Dieter Maurer wrote:
>> Shane Hathaway wrote at 2008-1-31 00:12 -0700:
>>> ...
>>> 1. Download ZODB and patch it with poll-invalidation-1-zodb-3-8-0.patch
>> 
>> What does "poll invalidation" mean?
>> 
>>   The RelStorage maintains a sequence of (object) invalidations ordered
>>   by "transaction-id" and the client can ask "give me all invalidations
>>   above this given transaction id"? It does so at the start of each
>>   transaction?
>> 
>> In this case, how does the storage know when it can forget
>> invalidations?
>
>That's not quite right--RelStorage does not maintain any per-client
>notification list.  Instead, each client asks whether any transactions
>have committed since the last poll.  The transaction ID of the last poll
>is tracked by the storage instance bound to the connection.  If any
>transactions have committed, then the client gets a list of OIDs changed
>by those transactions and invalidates the corresponding objects.

Then the storage (backend) must maintain all object invalidations back to
the oldest transaction id that might be used in a poll.
If it does not, some invalidations may be missing when such a poll
asks.
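
A sketch of what such a poll amounts to on the backend (the table and
column names here are invented for illustration):

  POLL_SQL = """
  SELECT zoid FROM object_state
  WHERE tid > %(last_polled_tid)s
  """

  def poll(cursor, last_polled_tid):
      # returns the oids whose cached copies must be invalidated
      cursor.execute(POLL_SQL, {'last_polled_tid': last_polled_tid})
      return [oid for (oid,) in cursor.fetchall()]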



-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] RelStorage now in Subversion

2008-01-31 Thread Dieter Maurer
Shane Hathaway wrote at 2008-1-31 01:08 -0700:
> ...
>I admit that polling for invalidations probably limits scalability, but 
>I have not yet found a better way to match ZODB with relational 
>databases.  Polling in both PostgreSQL and Oracle appears to cause no 
>delays right now, but if the polling becomes a problem, within 
>RelStorage I can probably find ways to reduce the impact of polling, 
>such as limiting the polling frequency.

I am surprised that you think you can play with the polling
frequency.

  Postgres will deliver objects as they have been when the
  transaction started.
  Therefore, when you start a postgres transaction
  you must invalidate any object in your cache that
  has been modified between load time and the begin of this
  transaction. Otherwise, your cache can deliver stale state
  not fitting with the objects loaded directly from Postgres.

  I read this as: you do not have much room for maneuver.
  You must ask Postgres about invalidations when the transaction
  starts.

  Of course, you can in addition ask Postgres periodically
  in order to have a smaller and (hopefully) faster result
  when the transaction starts.



-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Speedy RelStorage/PostgreSQL

2008-01-31 Thread Dieter Maurer
Shane Hathaway wrote at 2008-1-31 00:12 -0700:
> ...
>1. Download ZODB and patch it with poll-invalidation-1-zodb-3-8-0.patch

What does "poll invalidation" mean?

  The RelStorage maintains a sequence of (object) invalidations ordered
  by "transaction-id" and the client can ask "give me all invalidations
  above this given transaction id"? It does so at the start of each
  transaction?

In this case, how does the storage know when it can forget
invalidations?



-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] The write skew issue

2008-01-31 Thread Dieter Maurer
Christian Theune wrote at 2008-1-30 21:21 +0100:
> ...
>That would mean that the write skew phenomenon that you found would be 
>valid behaviour, wouldn't it?

No.

> Am I missing something?

Yes. No matter how you order the two transactions in my example,
the result will be different from what the ZODB produces.



-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] The write skew issue

2008-01-30 Thread Dieter Maurer
Christian Theune wrote at 2008-1-29 16:32 +0100:
> ...
>When I looked up the definitions in Wikipedia about isolation and 
>serializability again I didn't find any hint about the conditions how to 
>decide which ordering is preferred.

From Wikipedia ("http://en.wikipedia.org/wiki/Serializable_%28databases%29"):

  A schedule is serializable, if its outcome ... is equal
  to the outcome of its transactions executed sequentially without
  overlapping.

I interpret this as "any ordering" of the transactions will do.



-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


[ZODB-Dev] Proposal process?

2008-01-26 Thread Dieter Maurer
Formerly, proposals lived on "wiki.zope.org".
There, they could be commented on and discussed.

Now proposals live somewhere. Usually, they can neither be commented on
nor discussed. But, they are registered at Launchpad.

For me, it is completely unclear how Launchpad should be used
to guide the route from a proposal to an eventually implemented
feature. How/where do we discuss and comment on the proposals?
How/where do we decide whether a proposal should be implemented
and in what "variant"? 

-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Strange "File too large" problem

2008-01-24 Thread Dieter Maurer
Andreas Jung wrote at 2008-1-24 19:20 +0100:
> ...
>>>   Module ZODB.utils, line 96, in cp
>>> IOError: [Errno 27] File too large
>>
>> Apparently, you do not have "large file support" and your storage
>> file has reached the limit for "small" files.
>>
>
>LFS is usually required for files larger than 2GB. According to my 
>information I got from the reporter: the file was 17GB large.

Nevertheless, the operating system reported a "file too large"
error on "write".

This suggests a route for further investigation:

  What causes the reporter's operating system to report "file too large"?

This is not a ZODB question but one for the respective operating system.



-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Strange "File too large" problem

2008-01-24 Thread Dieter Maurer
Izak Burger wrote at 2008-1-24 13:57 +0200:
> ...
>I'm kind of breaking my normal rules of engagement here by immediately 
>sending mail to a new list I just subscribed to, but then Andreas Jung 
>did ask me to send a mail about this to the list.
>
>This morning one of our clients suddenly got this error:
>
>Traceback (innermost last):
>   Module ZPublisher.Publish, line 121, in publish
>   Module Zope2.App.startup, line 240, in commit
>   Module transaction._manager, line 96, in commit
>   Module transaction._transaction, line 380, in commit
>   Module transaction._transaction, line 378, in commit
>   Module transaction._transaction, line 436, in _commitResources
>   Module ZODB.Connection, line 665, in tpc_vote
>   Module ZODB.FileStorage.FileStorage, line 889, in tpc_vote
>   Module ZODB.utils, line 96, in cp
>IOError: [Errno 27] File too large

Apparently, you do not have "large file support" and your storage
file has reached the limit for "small" files.



-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] PGStorage

2008-01-24 Thread Dieter Maurer
Zvezdan Petkovic wrote at 2008-1-23 17:15 -0500:
>On Jan 23, 2008, at 4:05 PM, Flavio Coelho wrote:
>> sorry, I never meant to email you personally

I have been wrong: Flavio has not forgotten the list, I had not looked
carefully enough. Sorry!



-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Writing Persistent Class

2008-01-24 Thread Dieter Maurer
Marius Gedminas wrote at 2008-1-23 23:19 +0200:
>On Mon, Jan 21, 2008 at 07:15:42PM +0100, Dieter Maurer wrote:
>> Marius Gedminas wrote at 2008-1-21 00:08 +0200:
>> >Personally, I'd be afraid to use deepcopy on a persistent object.
>> 
>> A deepcopy is likely to be no copy at all.
>> 
>>   As Python's "deepcopy" does not know about object ids, it is likely
>>   that the copy result uses the same oids as the original.
>>   When you store this copy, objects with the same oid are identified.
>
>This appears not to be the case.  The following script prints "Looks OK":

You are right -- and I do now understand why.

"deepcopy" uses "__getstate__" to get at the object's state.
"Persistent.__getstate__" does not include the special persistent
attributes. Most of the "_p_" and all "_v_" attributes are not included.
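
A small demonstration of that point (a sketch, not the script from the
quoted mail):

  import copy
  import persistent

  class P(persistent.Persistent):
      pass

  p = P()
  p.x = 1
  p._v_cache = 'volatile'    # not part of the pickled state
  print p.__getstate__()     # {'x': 1}
  q = copy.deepcopy(p)
  print q._p_oid, q._p_jar   # None None -- the copy is a fresh object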



-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] PGStorage

2008-01-23 Thread Dieter Maurer
Alan Runyan wrote at 2008-1-23 13:32 -0600:
> ...
>each record in a catalog may have an object for each index/metadata attribute
>you are capturing, and possibly a few others.  Each catalog entry contains
>lots of objects per object being indexed.  That is my understanding.

It does not completely fit reality.

There is no correspondence from catalogued objects to a single
or even many persistent objects in the catalog.

The catalog maintains a (non persistent) metadata record for
each catalogued object -- a single one, for all metadata fields.
These records (they are tuples) are maintained in "IOBucket"s.
An "IOBucket" can have up to 60 entries.

Each index maintains some information for a catalogued object --
but not in individual (object specific) objects. Instead
persistent objects (such as "IITreeSet|IOBTree|OIBTree"s) are
used to combine the information about many (about 40 to 120) objects.



-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] PGStorage

2008-01-23 Thread Dieter Maurer
Alan Runyan wrote at 2008-1-23 11:24 -0600:
> ...
>My understanding is
>the catalog is what makes storages a misery.

I do not think that this is true.

It is only that the catalog often contains lots of objects -- and maybe
some of them with not so good persistency design.

>CMF/portal_catalog mounted as Filestorage
>and CMF could be mounted as PGStorage.
>
>I presume you would see much more reasonable performance?

At the storage level, all objects look identical: a pair of pickles.
Differences are only in the pickle sizes and in the access frequency...



-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] PGStorage

2008-01-23 Thread Dieter Maurer
Flavio Coelho wrote at 2008-1-22 17:43 -0200:
> ...
>Actually what I am trying to run away from is the "packing monster" ;-)

Jim has optimized pack considerably (--> "zc.FileStorage").

I, too, have worked on pack optimization the last few days (we
cannot yet use Jim's work because we are using ZODB 3.4 while
Jim's optimization is for ZODB 3.8) and obtained speedups of
more than 80 percent.

>I want to be able to use an OO database without the inconvenience of having
>it growing out of control and then having to spend hours packing the
>database every once in a while. (I do a lot of writes in my DBs). Does this
>Holy grail of databases exist? :-)

The pack equivalent of Postgres is called "vacuum full".
It is more disruptive than packing.


Maybe, you have a look at the old "bsddbstorage".
It could be configured to not use historical data.
Support was discontinued due to lack of interest --
but I report this for the second time within a week
or so. This may indicate a renewed interest.


BTW: stay on the list. I do not like personal emails.



-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] PGStorage

2008-01-22 Thread Dieter Maurer
Flavio Coelho wrote at 2008-1-22 10:57 -0200:
> ...
>Can anyone tell me if PGstorage is stable enough for production use?

I expect that it will behave similar to "OracleStorage".

"OracleStorage" was abandoned because it was almost an order
or magnitude slower than "FileStorage".

Carefully think whether you really need pickle data in a
relational database (rather than in the file system).



-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Writing Persistent Class

2008-01-22 Thread Dieter Maurer
Kenneth Miller wrote at 2008-1-21 12:23 -0600:
> This is fine for me I believe. I only needed to have a copy of an  
>object for a short while after the ZODB connection has closed with no  
>intention of ever inserting it back into zodb. Out of curiosity, what  
>would be the proper way to accomplish this task?

You may try a "deepcopy" for this -- as long as you do not try
to store the result in the ZODB, it is likely to work.



-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] [FileStorage] Potential data loss through packing

2008-01-21 Thread Dieter Maurer
Jim Fulton wrote at 2008-1-21 09:41 -0500:
> ... resurrections after pack time may get lost ...
>I'm sure the new pack algorithm is immune to this.  It would be  
>helpful to design a test case to try to provoke this.

I fear, we can not obtain full immunity at all -- unless we perform
packing offline (after having shut down the storage) or use quite
tight synchronization between packing and normal operations.

Otherwise, resurrection can happen while we are packing -- depending
on how far packing has already proceeded, the resurrection would
need to copy the resurrected objects into its own transaction
rather than simply reference them.



-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Writing Persistent Class

2008-01-21 Thread Dieter Maurer
Marius Gedminas wrote at 2008-1-21 00:08 +0200:
>Personally, I'd be afraid to use deepcopy on a persistent object.

A deepcopy is likely to be no copy at all.

  As Python's "deepcopy" does not know about object ids, it is likely
  that the copy result uses the same oids as the original.
  When you store this copy, objects with the same oid are identified.

If you are lucky, then the ZODB recognizes the problem (because
it is unable to store two different objects (the result of the "deepcopy")
with the same oid in its cache).



-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


[ZODB-Dev] [fsIndex] surprizing documentation -- inefficiency?

2008-01-21 Thread Dieter Maurer
"ZODB.fsIndex" tells us in its source code documentation that it splits 
the 8 byte oid into a 6 byte prefix and a two byte suffix and
represents the index by an "OOBTree(prefix -> fsBucket(suffix -> position))"

It explains that it uses "fsBucket" (instead of a full tree) because
the "suffix -> position" would contain at most 256 entries.

This explanation surprises me a bit: why should the bucket contain
only 256 rather than 256 * 256 (= 65.536) entries?


If the assumption is wrong (i.e. the "fsBucket" can contain up to
65.536 entries), is the implementation inefficient (because of that)?
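
A sketch of the split in question, just to make the arithmetic
concrete:

  def split_oid(oid):
      assert len(oid) == 8
      prefix, suffix = oid[:6], oid[6:]
      # one prefix covers 256 * 256 = 65536 possible suffixes, so an
      # "fsBucket" could in principle hold up to 65536 entries
      return prefix, suffix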


-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


[ZODB-Dev] [FileStorage] Potential data loss through packing

2008-01-21 Thread Dieter Maurer
Looking at the current (not Jim's new) pack algorithm to optimize
the reachability analysis, I recognized a behaviour that looks
like a potential data loss through packing.

The potential data loss can occur when an object unreachable at
pack time becomes reachable again after pack time.

The current pack supports a single use case which can cause such
an object resurrection: the use of backpointers (probably from "undo").

However, resurrection is possible by other means as well -- e.g.
by reinstating a historical version which references objects
meanwhile deleted.
Packing can cause such objects to get lost (resulting in POSKeyErrors).


Reinstating a historical version which references meanwhile
deleted objects is probably quite a rare situation such
that the potential data loss seems not to be very critical.

But, potential data loss is nasty, even when the probability is quite low.


-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Writing Persistent Class

2008-01-18 Thread Dieter Maurer
Kenneth Miller wrote at 2008-1-17 19:08 -0600:
> ...
>Do I always  
>need to subclass persistent?

When you assign an instance of your (non "persistent" derived) class
as an attribute to a persistent object,
then your instance will be persisted together with its persistent
container.
However, local modifications to your instance are not recognized
by the persistency mechanism. You need to explicitly inform the persistent
container about the change.
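
A minimal sketch of this explicit notification (the class is assumed,
not from the question):

  import persistent

  class Document(persistent.Persistent):
      def __init__(self):
          self.tags = []          # an ordinary list, not Persistent

      def add_tag(self, tag):
          self.tags.append(tag)   # mutates a non-persistent subobject
          self._p_changed = True  # tell the persistency machinery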

Moreover, "persistent" objects define the granularity with which
application and storage interact: load and store work on
the level of persistent objects excluding persistent subobjects.



-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Mutlithreaded ZODB applications.

2008-01-17 Thread Dieter Maurer
Kenneth Miller wrote at 2008-1-17 15:03 -0600:
>I see in the user guide that it mentions that you need to have a connection
>instance per thread to allow multiple threads to access a particular FS.
>Does anyone have any simple example code on doing this?

New threads usually have a form like:

 import transaction
 from ZODB.POSException import ConflictError

 conn = db.open() # open a new ZODB connection
 root = conn.root()
 try:
   retry = True
   while retry:
     retry = False
     try:
       do_your_function(root)
       transaction.commit()
     except ConflictError:
       transaction.abort()
       retry = True # you may want to restrict the possible retry number
     except:
       transaction.abort()
       # you may want to log your exception in some way
       raise
 finally:
   conn.close()


-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Re: memory exhaustion problem

2008-01-17 Thread Dieter Maurer
Flavio Coelho wrote at 2008-1-17 14:57 -0200:
>Some progress!
>
>Apparently the combination of:
>u._p_deactivate()

You do not need that when you use "commit".

>transaction.savepoint(True)
>transaction.commit()

You can use "u._p_jar.cacheGC()" instead of the "commit".

>helped. Memory  consumption keeps growing but much more slowly (about 1/5 of
>the original speed). Please correct me if I am wrong, but I believe that
>ideally memory usage should stay constant throughout the loop, shouldn't it?

Are you sure that "SQLite" does not keep data in memory?

>Moreover, I shouldn't need to commit either, since I am not modifying the
>objects...

The commit calls "cacheGC" for you. You can instead call "cacheGC" yourself.
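
A sketch of the resulting pattern ("users" and "process" are assumed
names): walk many objects read-only and trim the connection cache
every now and then.

  for i, u in enumerate(users):
      process(u)                 # read-only work
      if i % 1000 == 0:
          u._p_jar.cacheGC()     # shrink the cache to its target size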



-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Re: Why does this useage of __setstate__ fail?

2008-01-17 Thread Dieter Maurer
Tres Seaver wrote at 2008-1-17 01:30 -0500:
> ...
>Mika, David P (GE, Research) wrote:
>
>>> Can someone explain why the test below  (test_persistence) is failing?
>>> I am adding an attribute after object creation with __setstate__, but
>>> I can't get the new attribute to persist.
>
>You are mutating the object *inside* your __setstate__:  the ZODB
>persistence machinery clears the '_p_changed' flag after you *exit* from
>'__setstate__':  the protocol is not intended to support a persistent
>"write-on-read".

When I remember right, newer ZODB versions allow the "__setstate__"
implementation to tell whether "_p_changed" should or should not be
cleared (default: cleared).

Tim Peters added this feature to support the frequent use case
that "__setstate__" is used for object migration.



-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] config files documentation

2008-01-16 Thread Dieter Maurer
Flavio Coelho wrote at 2008-1-15 15:49 -0200:
>is there any documentation of ZODB and ZEO config files? at least a good
>example with all possible tags?

ZConfig based configurations (such as those for ZODB and ZEO)
are well documented in the corresponding schema files.

For ZODB, the relevant part is described in "ZODB/component.xml";
for ZEO it is "ZEO/schema.xml".



-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] minimizing the need for database packing

2008-01-03 Thread Dieter Maurer
Jim Fulton wrote at 2008-1-2 13:32 -0500:
> ...
>> The old "pack" code acquires the "commit" lock at the start
>> of the "copyRest" (copy transactions after pack time) phase
>> and releases it only every 20 "copyOne" calls for the duration
>> of one "copyOne".
>
>
>OK. I don't know what you mean by the "finish test".

A lock needs to be held at the end of "copyRest" until
the state is changed (i.e. "Data.fs.pack" is renamed to "Data.fs").
This is the "finish test".

>As I mentioned, the new packing algorithm holds the commit lock much  
>less than the old one does.

Fine.

When we use it, I will have a look whether I still see optimization
potential :-)



-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] minimizing the need for database packing

2008-01-02 Thread Dieter Maurer
Jim Fulton wrote at 2007-12-29 16:06 -0500:
> ...
>> If you are at it: I think the lock which protects the "finish" test
>> is held too long. Currently, it is just released for a very short time
>> and then immediately reacquired. It should be safe to release it
>> immediately after the "finish" test has failed.
>
>
>I don't know what you are referring to. Could you be more specific?

The old "pack" code acquires the "commit" lock at the start
of the "copyRest" (copy transactions after pack time) phase
and releases it only every 20 "copyOne" calls for the duration
of one "copyOne".



-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] minimizing the need for database packing

2007-12-29 Thread Dieter Maurer
Jim Fulton wrote at 2007-12-28 10:20 -0500:
> ...
>The Berkeley Database Storage supported automatic incremental packing  
>without garbage collection.  If someone were to revitalize that effort  
>and if one was willing to do without cyclic garbage collection, then  
>that storage would remove the need for the sort of disruptive pack we  
>have with FileStorage now.

Why do you consider "pack" disruptive?

>Note that I'm working on a new FileStorage packer that is 2-3 times  
>faster and, I believe, much less disruptive than the current packing  
>algorithm.

If you are at it: I think the lock which protects the "finish" test
is held too long. Currently, it is just released for a very short time
and then immediately reacquired. It should be safe to release it
immediately after the "finish" test has failed.



-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Running ZODB on x64 system

2007-12-02 Thread Dieter Maurer
Jim Fulton wrote at 2007-12-2 13:51 -0500:
> ...
>With what version of Python?

2.4.x

>I believe the problem is related to both Python 2.5 and 64-bit systems  
>-- possibly specific 64-bit systems.

Okay. No experience with this.

As we use Zope (2), we do not use Python 2.5.



-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Upgrade vom ZODB3.3 to ZODB3.4.2

2007-12-02 Thread Dieter Maurer
Krieg, Alexander wrote at 2007-11-29 14:15 +0100:
>i run ZODB3.3 as standalone database for my indico-cms and i would like 
>to upgrade the versions, but info and docu is little or i cannot find it.
>What do i have to do for upgrading, do i have to remove all installed 
>directories under my python directory or can i just install it on top of 
>zodb3.3.

It is safer to remove (or move to a backup place) the old installation
before you install the new one.

This is *ALWAYS* the case -- thus, there is no need to explicitly
document it.

Installation over an existing installation *MAY* work -- but
of course, nobody tests for it (because whether or not it works
may depend on the precise version that is already installed).
Therefore, chances are high that it does not work.


*If* you want to take shortcuts, then you are on your own.
Thus, go for the safer option.



-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Running ZODB on x64 system

2007-12-02 Thread Dieter Maurer
Jim Fulton wrote at 2007-12-1 10:09 -0500:
> ...
>AFAIK, there hasn't been a release that fixes this problem.  A  
>contributor to the problem is that I don't think anyone working on  
>ZODB has ready access to 64-bit systems. :(

We are using an old (ZODB 3.4) version on a 64 bit linux without
problems.



-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Rename top-level svn dir to ZODB3?

2007-10-05 Thread Dieter Maurer
Philipp von Weitershausen wrote at 2007-10-4 23:46 +0200:
>Most other projects in subversion are named after their eggs (or the 
>other around, whichever way you look at it). This rule helps me and I 
>suspect others find stuff more easily. Unfortunately, the ZODB breaks 
>this rule in such a subtle manner that I usually always get it wrong.
>
>Since we can't easily rename the ZODB3 egg to ZODB (dependencies would 
>break), I suggest we rename the subversion directory to 'ZODB3'. Then at 
>least it'll be consistent.

I am only concerned about how things are when installed.
If you want to change the package name from "ZODB" to "ZODB3",
I would strongly object.

But, I assume you are not proposing such a change.



-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Re: Simple Zope Question

2007-10-05 Thread Dieter Maurer
Thusjanthan (Nathan) Kubendranathan wrote at 2007-10-4 11:10 -0700:
>To all my fellow zope experts,
>
> 
>
>I am not very proficient in zope. I need to extract all the data from a
>data.fs (zope filesystem) into a database. Basically all it contains is
>users and notes about users etc. So it's a note taking system. 
>
> 
>
>So basically I am trying to recover some data from a zope.fs file and I have
>no clue how to go about it. How do I begin to read the data from this file?
>So far I have the following: 

"FileStorage" is too low level:

   At the "FileStorage" level, an object is described
   by an oid, a serial (i.e. a timestamp, when the object was
   written) and some opaque data.

   Only the "DB" level above knows that the opaque data
   consists usually of two pickles: one describing the class/type
   and the other describing the object state.
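
A hedged sketch of looking at those pickles directly, using the
iteration support and pickle helpers of ZODB 3.x:

  from ZODB.FileStorage import FileStorage
  from ZODB.utils import oid_repr, get_pickle_metadata

  fs = FileStorage('Data.fs', read_only=True)
  for txn in fs.iterator():
      for record in txn:
          module, klass = get_pickle_metadata(record.data)
          print oid_repr(record.oid), '%s.%s' % (module, klass)
  fs.close()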


You may try to transfer your data via FTP/WebDAV from your Zope
to the filesystem. However, it will only work if the objects
have decent FTP-Support (which may not be the case).


BTW: This mailing list is not very adequate for your question.
Please use "[EMAIL PROTECTED]" for followups.



-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] funky _p_mtime values

2007-09-27 Thread Dieter Maurer
Thomas Clement Mogensen wrote at 2007-9-27 12:43 +0200:
> ...
>Within the last few days something very strange has happened: All  
>newly created or modified objects get a _p_mtime that is clearly  
>incorrect and too big for DateTime to consider it a valid timestamp.  
>(ie. int(obj._p_mtime) returns a long).
>
>Values I get for _p_mtime on these newly altered objects are  
>something like:
>8078347503.108635
>with only the last few decimals differing among all affected objects.
>Objects changed at the same time appear to get the same stamp.

Looks interesting.

When I see such unexplainable things, I tend to speak of alpha rays.
Computers are quite reliable -- but not completely. Every now
and then a bit changes which should not change.
In my current life, I have seen things like this 3 times -- usually
in the form that the content of a file changed without any
application touching it.

When you accept such a wild explanation, then, maybe, I can
give one:

  FileStorage must ensure that all transaction ids (they are
  essentially timestamps) are strictly increasing.

  To this end, it maintains a "current transaction id".
  When a new transaction id is needed, it tries to construct
  one from the current time. But if this is smaller than
  the "current transaction id", then it increments that a little
  and uses it as new transaction id.

  Thus, if for some reason a bit once changed in the
  current transaction id (or in the file that maintains it
  persistently), then you may no longer get away from it.
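
A sketch of that rule with ZODB's "TimeStamp" (mirroring the idea, not
quoting the actual storage code):

  import time
  from persistent.TimeStamp import TimeStamp

  def new_tid(current):
      now = time.time()
      t = TimeStamp(*time.gmtime(now)[:5] + (now % 60,))
      # if the clock is behind the current id, "laterThan" increments
      # the current id a little instead of using the clock value
      return t.laterThan(current)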

On Plone.org, someone asked today how to fix the
effects on the ZODB of an administrator changing the system time to 2008.
If he finds a solution, then your problem may be tackled the same way.



-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Getting started with ZODB

2007-09-18 Thread Dieter Maurer
Manuzhai wrote at 2007-9-18 12:46 +0200:
> ...
>the Documentation link points to a page
>that seems to mostly have papers and presentation from 2000-2002.

There is a good guide to the ZODB from Andrew Kuchling (or similar).

It may be old -- but everything is still valid.

>On the internet, there is some talk about the different storage
>providers, but it seems mostly very old. FileStorage seems to be the
>only "serious" Storage provider delivered with ZODB. Are there any
>other general-purpose Storage providers actively being used in the
>wild?

"FileStorage" is must faster than all other storages. Therefore,
it dominates the scene.

"DirectoryStorage", too, is used more widely.

Internally, "TemporaryStorage" is used (a RAM based storage for
sessions). DemoStorage is used for unit tests.

>How does their performance compare? FileStorage apparently needs
>to keep some index in memory; when does this start to be a problem?

You need about 30 bytes per object (10**7 objects thus need roughly
300 MB for the in-memory index). You can calculate when
this starts to make problems for you.

>What's new in ZODB4?

I know nothing about ZODB4. The current version is near 3.8, maybe.

>There is some talk about blobs, are they
>described somewhere?

There should be a proposal at "http://wiki.zope.org/ZODB/ListOfProposals".
But, I just checked, there is not :-(

But a search for "ZODB Blob" gives quite a few hits.



-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Recovering from BTree corruption

2007-09-11 Thread Dieter Maurer
Alan Runyan wrote at 2007-9-11 09:27 -0500:
> ...
>oid 0xD87110L BTrees._OOBTree.OOBucket
>last updated: 2007-09-04 14:43:37.687332, tid=0x37020D3A0CC9DCCL
>refers to invalid objects:
>oid ('\x00\x00\x00\x00\x00\xb0+f', None) missing: ''
>oid ('\x00\x00\x00\x00\x00\xb0N\xbc', None) missing: ''
>oid ('\x00\x00\x00\x00\x00\xb0N\xbd', None) missing: ''
>oid ('\x00\x00\x00\x00\x00\xd7\xb1\xa0', None) missing: ''
>oid ('\x00\x00\x00\x00\x00\xc5\xe8:', None) missing: ''
>oid ('\x00\x00\x00\x00\x00\xc3\xc6l', None) missing: ''
>oid ('\x00\x00\x00\x00\x00\xc3\xc6m', None) missing: ''
>oid ('\x00\x00\x00\x00\x00\xcahC', None) missing: ''
>oid ('\x00\x00\x00\x00\x00\xaf\x07\xc1', None) missing: ''

Looks as if the "OOBucket" has lost quite a few value links (as
only a single one links to the next bucket).

>My questions are:
>
> - I imagine if there are 'invalid' references this is considered "corruption"
>   or "inconsistency"?

It depends on your preferences.

> ...
>  - Having these invalid references, is this common to  ZODB applications?

No.

At least not for ZODB applications that do not use inter database
references.

>> Possibly, there's a backup that has data records for the missing OIDs.
>
>Going to ask hosting company to pull up backups for the past few weeks.
>But how i'm going to find this other than "seeing if the folder allows me
>to iterate over the items" is not throwing POSKeyError.  Does that sound
>like a decent litmus test?

You can also run "fsrefs" on it. When you do not get "missing ...",
then the backup does not have your POSKeyError (but may lack quite
a few newer modifications).



-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Recovering from BTree corruption

2007-09-10 Thread Dieter Maurer
Alan Runyan wrote at 2007-9-10 09:34 -0500:
> ...
>While debugging this I had a conversation with sidnei about mounted
>databases.  He recalled that if your using a mounted database you
>should not pack.  If for some reason your mounted database had a cross
>reference to another database and somehow you had a dangling reference
>to the other database it would cause POSKeyError.

BTrees are actually directed acyclic graphs (DAGs) with two node types
"tree" (internal node) and "bucket" (leaf).

Beside its children, a "tree" contains a link to its leftmost
bucket. Beside its keys/values, a "bucket" contains a link to
the next "bucket".

When you iterate over "keys" or "values", the leftmost bucket
is accessed via the root's leftmost bucket link and then
all buckets are visited via the "next bucket" links.
Your description seems to indicate that you have lost a
"next bucket" link.

If you are lucky, then the tree access structure (the children links
of the "tree" nodes) is still intact -- or if not, is at least
partially intact. Then, you will be able to recover large parts
of your tree.


You have two options:

  * reconstruct the tree from its pickles.

This is the way, the checking of BTrees works.

  * Determine the last key ("LK") before you get the "POSKeyError";
then use the tree structure to access the next available
key. You may need to try ever larger values above "LK"
to skip a potentially damaged part of the tree.


I would start with the second approach and switch to the first one
when it becomes too tedious.
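
For the second approach, a rough sketch (invented names; it assumes an
integer-keyed tree -- adapt the probing for other key types):

  from ZODB.POSException import POSKeyError

  def recover_keys(tree, start=0):
      # Probe upward via the tree structure ("minKey" descends from
      # the root), widening the skip whenever a damaged spot is hit.
      recovered, probe, step = [], start, 1
      while True:
          try:
              key = tree.minKey(probe)       # next available key >= probe
              recovered.append((key, tree[key]))
              probe, step = key + 1, 1
          except ValueError:                 # no key >= probe: done
              break
          except POSKeyError:                # damaged part: skip past it
              probe += step
              step *= 2
      return recovered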



-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Serializability

2007-08-21 Thread Dieter Maurer
Jim Fulton wrote at 2007-8-20 10:32 -0400:
> ...
>> Application specific conflict resolution
>> would become a really difficult task.
>
>I'm sure you realize that application specific conflict resolution  
>violates serializability.

No, I do not realize this.

Assume a counter which is never read, only incremented/decremented.
Its "application specific conflict resolution" ensures
that the schedule is serializable when restricted to the counter value.
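
A minimal sketch of such a counter, along the lines of "BTrees.Length"
(names are illustrative):

  import persistent

  class Counter(persistent.Persistent):

      def __init__(self):
          self.value = 0

      def change(self, delta):
          self.value += delta

      def _p_resolveConflict(self, old, committed, new):
          # merge two concurrent commits: apply both deltas to the
          # old state (the states are the unpickled instance dicts)
          old['value'] = committed['value'] + new['value'] - old['value']
          return old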

Things are much more complex when the counter is read (and incremented).
Serializability is then usually lost.



-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] [Persistent] STICKY mechanism unsafe

2007-08-20 Thread Dieter Maurer
Jim Fulton wrote at 2007-8-20 10:45 -0400:
> ...
>Dieter appears to have been bitten by this and he is one of we. :)
>
>We, and I presume he, can be bitten by a Python function called from  
>BTree code calling back into the code on the same object.  This is  
>possible, for example, in a __cmp__ or related method.  I assume that  
>this is what happened to Dieter.  Obviously, this would be a fairly  
>"special" comparison method.

I am not yet sure what really has bitten us -- I am not even sure
whether the object was really deactivated or some memory corruption
caused the object's tail to be overwritten by "0".

When the SIGSEGV hit, usually a bucket in a "TextIndexNG3.lexicon"
was affected. This lexicon uses "BTrees" in a very innocent way.
Its keys are integers and strings -- no fancy "__cmp__" method is
involved.

Moreover, we need two things for the deactivation to happen:
the STICKY mechanism must fail *AND* a deactivation must be
called for.

In our Zope/ZODB version, deactivation is done only at transaction
boundaries (it is an early ZODB 3.4 version where snapshots did not
yet call "incrgc"). Therefore, some "commit" would need to be
done during the "BUCKET_SEARCH" call.

The only conceivable cause appears to me to be that a different thread
modified the bucket and called "abort". This would mean
a persistency bug (concurrent use of a persistent object by
several threads). I tried to find such a bug in "TextIndexNG3", but
failed.


The problem appears only very rarely -- about 1 to 2 times in 1 to 2
months. When I analysed the problem in the past, I failed to look
at the object's persistent state (it would have told me whether
the object has been deactivated or overwritten). I just noticed
that the object's head was apparently intact while the object's true
data was 0. Only a few days ago, I recognized that this could
have been the effect of a deactivation.



-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] [Persistent] STICKY mechanism unsafe

2007-08-20 Thread Dieter Maurer
Jim Fulton wrote at 2007-8-20 10:15 -0400:
>Excellent analysis snipped
>
>> 1. and 3. (but obviously not 2.) could be handled by
>> implementing "STICKY" not by a bit but by a counter.
>
>This has been planned for some time. :/

I have (re)read this in your "Different Cache Interaction" proposal.

Thanks to the GIL, it will also work for concurrent access from
different threads -- if "Used" and "Unused" are notified while
the GIL is held.



-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


[ZODB-Dev] Re: [Persistent] STICKY mechanism unsafe

2007-08-20 Thread Dieter Maurer
Tres Seaver wrote at 2007-8-20 10:00 -0400:
> ...
>Zope works for this case because each application thread uses a
>per-request connection, to which it has exclusive access while the
>connection is checked out from the pool (i.e., for the duration of the
>request).

At least unless one makes persistency errors, such as storing persistent
objects outside the connection (e.g. on class level or in a global
cache).
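
For illustration, the typical shape of such an error (names invented):

  import persistent

  class Tool(persistent.Persistent):

      _cache = {}            # class level: shared by all threads!

      def lookup(self, key):
          obj = self._cache.get(key)
          if obj is None:
              obj = self._load(key)
              # BUG: a persistent object escapes its connection here;
              # another thread may later see it in an arbitrary state
              self._cache[key] = obj
          return obj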



-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


[ZODB-Dev] Serializability

2007-08-19 Thread Dieter Maurer
Analysing the STICKY behaviour of 'Persistent', I recognized
that 'Persistent' customizes not the '__getattr__' method but in fact
'__getattribute__'. Therefore, 'Persistent' is informed
about any attribute access and not only attribute access on a
ghosted instance.


Together with the 'accessed' call in Jim's proposal
"http://wiki.zope.org/ZODB/DecouplePersistenceDatabaseAndCache",
this could be used for a very crude check
of potential serializability conflicts along the following
lines.

  The DataManager maintains a set of objects accessed during a transaction.
  At transaction start, this set is empty and all cached objects
  are in state 'Ghost' or 'Saved'.
  Whenever an object is accessed for the first time, the DataManager's
  'accessed' or 'register' method is called. In both cases,
  the manager adds the object to its accessed set.
  At transaction end, the manager can check whether the state of any
  of its accessed objects has changed in the meantime. If not, no
  serializability conflict happened. Otherwise, a conflict would be
  possible (provided the transaction changed any objects).
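
A crude sketch of that check (invented names; it assumes invalidations
ghostify cached objects as they arrive, i.e. '_p_changed' turns to None):

  from ZODB.POSException import ConflictError

  class AccessTracker:

      def __init__(self):
          self._accessed = set()

      def accessed(self, obj):
          # called on first access, via 'accessed' or 'register'
          self._accessed.add(obj)

      def check(self, transaction_changed_objects):
          # at transaction end: an accessed object that is a ghost
          # again was invalidated by a concurrent transaction
          if transaction_changed_objects:
              for obj in self._accessed:
                  if obj._p_changed is None:
                      raise ConflictError(oid=obj._p_oid)
          self._accessed.clear()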


The test is very crude, as it does not track whether the tracked
transaction's change really depends on one of the objects
changed by other transactions. We must expect lots
of "ConflictError"s. Application specific conflict resolution
would become a really difficult task.


-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


[ZODB-Dev] [Persistent] STICKY mechanism unsafe

2007-08-18 Thread Dieter Maurer
We currently see occasional SIGSEGVs in "BTrees/BucketTemplate.c:_bucket_get".
I am not yet sure but it looks as if the object had been deactivated
during the "BUCKET_SEARCH".

Trying to analyse the problem, I had a close look at the STICKY
mechanism of "persistent.Persistent" which should prevent
accidental deactivation -- and found it unsafe.


"STICKY" is one of the states on persistent objects -- beside
"GHOST", "UPTODATE" and "CHANGED". It is mostly equivalent to
"UPTODATE" but prevents deactivation (but not invalidation).

It is used by C extensions that may release the GIL or call back
to Python (which may indirectly release the GIL).
Their typical usage pattern is

  if (obj->state == GHOST) obj->unghostify();       /* load the state */
  if (obj->state == UPTODATE) obj->state = STICKY;  /* pin against deactivation */
  ... do whatever needs to be done with "obj" ...
  if (obj->state == STICKY) obj->state = UPTODATE;  /* unpin */

This usage pattern obviously breaks when a similar code sequence
is executed for "obj" while "... do whatever ..." runs: the nested
sequence resets "STICKY" too early -- at its own end rather than at
the end of the original sequence.

This may happen in several ways:

 1. "... do whatever ..." does it explicitly

 2. "obj" is accessed from a different thread

 3. "obj" is accessed from a Python callback

1. might be considered a bug in "... do whatever ..." -- although one
that is not easily avoidable.

2. is a general problem. Not only "STICKY" is unsafe against concurrent
use -- the complete state model of "Persistent" is.
We might explicitly state that the concurrent use of persistent objects
is unsafe and check against it.

With respect to STICKY, all three cases can be detected
by prepending "if (obj->state == STICKY) ERROR;".

1. and 3. (but obviously not 2.) could be handled by
implementing "STICKY" not by a bit but by a counter.



-- 
Dieter
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev

