[ZODB-Dev] Re: What makes the ZODB slow?
Florent Guillaume wrote:
> Chris Withers wrote:
>> Florent Guillaume wrote:
>>> I can comment, I have a big brain too: the code in the catalog uses per-connection series of keys, so no conflicts arise.
>>
>> Really? I thought they were per-thread... wasn't aware that each thread was tied to one connection indefinitely... I thought the connections were pooled and assigned to threads on an ad-hoc basis?
>
> The series of keys are stored in a _v_ attribute which is per-connection. And a connection is never used by more than one thread at a time.

Yep, I think you're right, I'd be happier still if one of the authors of that code piped up in agreement ;-)

Chris

--
Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk
___
For more information about ZODB, see the ZODB Wiki: http://www.zope.org/Wikis/ZODB/
ZODB-Dev mailing list - ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev
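The per-connection key series discussed above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the catalog's actual code: `KeyAllocator`, `run_length`, the key range and the shared `taken` set are hypothetical names and values chosen for the example. Each allocator stands in for one connection's private `_v_` state: it picks a random starting key, hands out consecutive keys from there, and starts a new random series once the run is used up.

```python
import random


class KeyAllocator:
    """Stand-in for one connection's private (_v_-style) key series."""

    def __init__(self, run_length=4000, seed=None):
        self.run_length = run_length      # assumed series length, illustrative only
        self._rng = random.Random(seed)
        self._next = None                 # next key in the current series
        self._left = 0                    # keys remaining in the current series

    def _random_key(self):
        # Assumed key range; any wide integer range would do for the sketch.
        return self._rng.randint(-2_000_000_000, 2_000_000_000)

    def allocate(self, taken):
        """Return an unused key and record it in the shared set `taken`."""
        if self._left == 0:
            self._next = self._random_key()
            self._left = self.run_length
        while self._next in taken:
            # Collision with a key that is already in use: jump elsewhere.
            self._next = self._random_key()
        key = self._next
        taken.add(key)
        self._next += 1
        self._left -= 1
        return key


taken = set()                      # plays the role of the shared index
conn_a = KeyAllocator(seed=1)      # one allocator per "connection"
conn_b = KeyAllocator(seed=2)
keys_a = [conn_a.allocate(taken) for _ in range(100)]
keys_b = [conn_b.allocate(taken) for _ in range(100)]
```

In this sketch each connection's keys are consecutive among themselves, while different connections work in unrelated regions of the key space, so they would rarely touch the same part of a tree.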
Re: [ZODB-Dev] Re: What makes the ZODB slow?
Dieter Maurer wrote:
> Postgres does use locks, lots of them and for different purposes.

Could ZODB use locks to gain a similar performance boost?

> The only thing for which Postgres does not use locks is reading. For this it uses MVCC (which we meanwhile adapted for the ZODB to get rid of ReadConflictErrors).

Right...

> And even when locks are used, conflicts arise (they take on the form of deadlocks). I have seen several of them with Postgres -- not as deadlocks but as "concurrent update failed".

Ah good, it's not just us then ;-)

> Most of our ConflictErrors come from the session machinery -- because conflict resolution works there only in a very limited way (due to limited history availability).

Would having more history help?

>> Of the rest, 147 of the 177 are either Products.Transience.Transience.Increaser or Products.Transience.Transience.Length2
>
> Yes, these are our hits -- despite the fact that our increaser is much more intelligent (and increases only rarely and not on each access)

Hmmm, mind if I commit that increaser to the trunk?

There's a ZF board meeting some time soon after which you should get your official invite to become a committer member...

cheers,

Chris

--
Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk
Re: [ZODB-Dev] Re: What makes the ZODB slow?
Chris Withers wrote at 2006-6-27 09:56 +0100:
> Dieter Maurer wrote:
>> Postgres does use locks, lots of them and for different purposes.
>
> Could ZODB use locks to gain a similar performance boost?

Maybe, but it would be a really big change...

However, as I explained in an earlier message, the major speed difference does *not* come from optimistic versus pessimistic concurrency control (the optimistic approach is usually more efficient) but from:

1. More efficient storage for highly structured data.

2. Relational databases support a limited set of datatypes (tables, indexes) and know their behaviour. Operations can therefore be executed by the server. Object-oriented databases, on the other hand, usually support an unlimited number of datatypes where the behaviour lives in the applications and the server is stupid. This causes high volumes of data to be exchanged between the server and the clients.

3. (Unlike Andreas' feeling) typical ZODB operations modify many more objects than apparently similar Postgres operations. If, for example, 10 Zope objects are modified and this causes the full text indexes to be updated, then this can cause more modifications than the update of hundreds of Postgres rows (as such rows cannot contain mass data -- due to the restriction to simple types).

> ...
>> Most of our ConflictErrors come from the session machinery -- because conflict resolution works there only in a very limited way (due to limited history availability).
>
> Would having more history help?

Sure.

> ...
>> Yes, these are our hits -- despite the fact that our increaser is much more intelligent (and increases only rarely and not
>
> Hmmm, mind if I commit that increaser to the trunk?

It's part of a proprietary extension product. But I can ask whether I can move over the essence to the Zope core.

> There's a ZF board meeting some time soon after which you should get your official invite to become a committer member...

Very fine!
-- Dieter
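To make the remark about history concrete: ZODB conflict resolution is a three-way merge over the old state both transactions started from, the state already saved by the competing transaction, and the state the current transaction wants to write. The sketch below shows such a merge for a plain counter. `resolve_counter` and `resolve_or_fail` are hypothetical helpers written for this note, not the Transience or increaser code mentioned in the thread. The merge needs the old state, which is presumably why limited history availability limits what can be resolved for sessions.

```python
def resolve_counter(old, saved, new):
    """Three-way merge for a counter.

    old   -- value both transactions started from (must be fetched from history)
    saved -- value the competing transaction already committed
    new   -- value the current transaction is trying to commit
    """
    # Re-apply this transaction's delta on top of the committed value.
    return saved + (new - old)


def resolve_or_fail(old, saved, new):
    """Same merge, but give up when the old state is no longer available."""
    if old is None:  # assumption: None models a revision that was discarded
        raise LookupError("old state unavailable, conflict cannot be resolved")
    return resolve_counter(old, saved, new)
```

For example, if both transactions started from 10, one committed 13 and the other wants to write 12, the merged value is 15. Without the old value there is nothing to compute the delta against.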
Re: [ZODB-Dev] Re: What makes the ZODB slow?
On 26 Jun 2006, at 15:02, Chris Withers wrote:
> Florent Guillaume wrote:
>> BTrees perform best when keys' prefixes are randomly distributed. So if your application generates keys like 'foo001', 'foo002',... you'll get lots of conflicts. Same for consecutive integers in IOBTree.
>
> Tempted to call bullshit on this, since there's code in the catalog to specifically assign series of keys...
>
> ...of course, that code may be evil, and people with bigger brains (hi Tim/Jeremy/Jim!) would have to comment..

I can comment, I have a big brain too: the code in the catalog uses per-connection series of keys, so no conflicts arise.

Florent

--
Florent Guillaume, Nuxeo (Paris, France) Director of R&D
+33 1 40 33 71 59 http://nuxeo.com [EMAIL PROTECTED]
Re: [ZODB-Dev] Re: What makes the ZODB slow?
Andreas Jung wrote:
>>> BTrees perform best when keys' prefixes are randomly distributed. So if your application generates keys like 'foo001', 'foo002',... you'll get lots of conflicts. Same for consecutive integers in IOBTree.
>>
>> Tempted to call bullshit on this, since there's code in the catalog to specifically assign series of keys...
>
> Calm down

Oh, sorry, forgot some smilies ;-)

Don't worry, perfectly calm here *grinz*

Chris

--
Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk
Re: [ZODB-Dev] Re: What makes the ZODB slow?
Roché Compaan wrote at 2006-6-23 19:04 +0200:
> ...
> In a test where one commits an instance of a Persistent subclass that has only 2 string attributes, 300 objects per second are created on average. Writing the exact same strings to a two column table in an RDBMS yields more than 3000 records per second, including indexing of the data.

This largely is the fault of fsync. It tends to be extremely slow on many platforms. There was an interesting poll for fsync timings in this mailing list (about 1 year or so ago).

The ZODB uses fsync once per transaction. Apparently, many relational databases do it less often and therefore achieve a much higher transaction rate.

-- Dieter
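The cost of syncing once per transaction versus once per batch can be measured with a small script. This is a minimal sketch using only the Python standard library; it is not how FileStorage or any RDBMS writes its log. `append_records`, `timed` and the `fsync_every` parameter are hypothetical names for this note. It appends fixed-size records to a temporary file and calls `os.fsync` after every `fsync_every` records, so `fsync_every=1` models "one fsync per transaction".

```python
import os
import tempfile
import time


def append_records(path, records, fsync_every=1):
    """Append records to path, syncing after every `fsync_every` records.

    Returns the number of fsync calls made.
    """
    syncs = 0
    pending = 0
    with open(path, "ab") as f:
        for rec in records:
            f.write(rec)
            pending += 1
            if pending == fsync_every:
                f.flush()              # push Python's buffer to the OS
                os.fsync(f.fileno())   # ask the OS to push it to disk
                syncs += 1
                pending = 0
        if pending:
            # Sync the trailing partial batch so nothing is left unsynced.
            f.flush()
            os.fsync(f.fileno())
            syncs += 1
    return syncs


def timed(records, fsync_every):
    """Return (seconds, fsync_calls) for one run against a fresh temp file."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        start = time.perf_counter()
        syncs = append_records(path, records, fsync_every)
        return time.perf_counter() - start, syncs
    finally:
        os.remove(path)
```

Running `timed(records, 1)` and `timed(records, 100)` on the same machine gives a rough idea of how much of the per-record cost is fsync. The absolute numbers depend heavily on the platform and the disk.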
Re: [ZODB-Dev] Re: What makes the ZODB slow?
> The ZODB is actually very fast. [...] So you're probably observing slowness in the frameworks on top of it.

I'll believe this anytime :-]

In our case, a transaction may be a workflow state change on say 50 objects. Two or three people try a transaction like that within a couple of seconds of one another, and ConflictErrors crop up.

In a log with 402 ConflictErrors, 225 are on BTrees (_IIBTree.IITreeSet, _IOBTree.IOBucket, _OOBTree.OOBTree, _OOBTree.OOBucket all feature). We assume these all relate to catalog indexing. Of the rest, 147 of the 177 are either Products.Transience.Transience.Increaser or Products.Transience.Transience.Length2.

The role the framework (Plone, unsurprisingly) is playing in this case is that it leans hard on the catalog during a transaction lasting a number of seconds.

To mitigate this, we want to create a savepoint and then commit more often while iterating and changing workflow, rolling back to the savepoint if necessary.

-- jean
[ZODB-Dev] Re: What makes the ZODB slow?
Jean Jordaan wrote:
>> The ZODB is actually very fast. [...] So you're probably observing slowness in the frameworks on top of it.
>
> I'll believe this anytime :-]
>
> In our case, a transaction may be a workflow state change on say 50 objects. Two or three people try a transaction like that within a couple of seconds of one another, and ConflictErrors crop up.
>
> In a log with 402 ConflictErrors, 225 are on BTrees (_IIBTree.IITreeSet, _IOBTree.IOBucket, _OOBTree.OOBTree, _OOBTree.OOBucket all feature). We assume these all relate to catalog indexing. Of the rest, 147 of the 177 are either Products.Transience.Transience.Increaser or Products.Transience.Transience.Length2
>
> The role the framework (Plone, unsurprisingly) is playing in this case, is that it leans hard on the catalog during a transaction lasting a number of seconds.
>
> To mitigate this, we want to create a savepoint and then commit more often while iterating and changing workflow, rolling back to the savepoint if necessary.

BTrees perform best when keys' prefixes are randomly distributed. So if your application generates keys like 'foo001', 'foo002',... you'll get lots of conflicts. Same for consecutive integers in IOBTree.

Florent

--
Florent Guillaume, Nuxeo (Paris, France) Director of R&D
+33 1 40 33 71 59 http://nuxeo.com [EMAIL PROTECTED]
Re: [ZODB-Dev] Re: What makes the ZODB slow?
--On 23 June 2006 17:51:35 +0200 Florent Guillaume [EMAIL PROTECTED] wrote:
> BTrees perform best when keys' prefixes are randomly distributed. So if your application generates keys like 'foo001', 'foo002',... you'll get lots of conflicts. Same for consecutive integers in IOBTree.

hm..are you sure about that?

-aj
Re: [ZODB-Dev] Re: What makes the ZODB slow?
On Fri, 2006-06-23 at 15:11 +0200, Florent Guillaume wrote:
>> I often daydream of a ZODB that will one day have such great performance that it won't be necessary to adopt a hybrid backend. I know there is a huge difference between objects and records in an RDBMS, but in an attempt to understand more, I want to know what makes the ZODB so much slower than a relational database when writing a lot? Is it possible to speed it up in any way?
>>
>> Other questions that come to mind: What overhead does undo add to performance? Can state be serialised more economically to reduce disk IO? Is the ZODB really slow, or is it just Zope and Plone or grand object frameworks built on top of it that make it appear slow? (In all my benchmarks this is shown to be mostly true)
>
> The ZODB is actually very fast. It has one drawback, which is that concurrent writes are resolved only for classes designed for that (namely BTrees), otherwise it's left up to the application to deal with it when it receives a ConflictError. So you're probably observing slowness in the frameworks on top of it.

This is not really the fundamental explanation I was fishing for, and I don't think that you are entirely right. I don't think one can call the ZODB fast (I hope to some day). It might be fast in its handling of hierarchical data or reading lots of objects, but I won't exactly call it fast. Just compare the speed at which new objects are created in the ZODB with the speed of records being created in an RDBMS.

In a test where one commits an instance of a Persistent subclass that has only 2 string attributes, 300 objects per second are created on average. Writing the exact same strings to a two column table in an RDBMS yields more than 3000 records per second, including indexing of the data. In the ZODB I still have to index data, which will add additional overhead. Adding more columns to the SQL table and writing more data to it doesn't hurt performance either.
The above test most probably doesn't compare apples with apples, but maybe in pointing out why not, more fundamental differences become clear. Maybe the fundamental difference is that pickles of objects have a bigger footprint and lead to more disk IO, or that most of the ZODB is implemented in Python. I don't know, and I'm still curious.

--
Roché Compaan
Upfront Systems http://www.upfrontsystems.co.za
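The "bigger footprint" guess is easy to check in isolation. The sketch below uses only the standard `pickle` module and assumes, as a simplification, that a stored record is roughly a pickle of the class reference followed by a pickle of the object's state. `record_footprint`, the `("app.models", "Record")` class reference and the attribute names `title` and `body` are hypothetical. It compares that size with the raw bytes of the two strings, which is closer to what a two-column row has to carry.

```python
import pickle


def record_footprint(title, body, class_ref=("app.models", "Record")):
    """Compare raw string bytes with a two-part pickled record.

    class_ref is a hypothetical (module, class name) pair standing in for
    the class metadata stored alongside the object's state.
    """
    raw = len(title.encode("utf-8")) + len(body.encode("utf-8"))
    state = {"title": title, "body": body}   # attribute names travel with the data
    pickled = len(pickle.dumps(class_ref, protocol=2)) + len(
        pickle.dumps(state, protocol=2)
    )
    return raw, pickled


raw_bytes, pickled_bytes = record_footprint("hello", "world")
```

For short strings the per-object overhead (class reference, attribute names, pickle framing) dominates; for long strings it becomes a small fraction. This says nothing about fsync or transaction overhead, which the sketch does not model.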
Re: [ZODB-Dev] Re: What makes the ZODB slow?
On 23 Jun 2006, at 17:55, Andreas Jung wrote:
> --On 23 June 2006 17:51:35 +0200 Florent Guillaume [EMAIL PROTECTED] wrote:
>> BTrees perform best when keys' prefixes are randomly distributed. So if your application generates keys like 'foo001', 'foo002',... you'll get lots of conflicts. Same for consecutive integers in IOBTree.
>
> hm..are you sure about that?

It all depends on the concurrency for the use of these consecutive ids really.

The problem is bucket splits. A bucket split cannot be resolved by the conflict resolution code of BTrees.

Let's say B is the size of a bucket and you have N leaf buckets in the whole BTree. If you use consecutive ids, you'll get a bucket split every B/2 inserts (assuming buckets are half-filled on average). If you use random ids, you'll get a bucket split on average every N*B/2 inserts. All this roughly (I'm ignoring details like internal nodes).

If two processes concurrently use sequential ids from the same pool at the same time, I'd say there is a one in B chance of getting a conflict error. It's only one in (N*B)^2 if the ids are random.

All back-of-the-envelope calculations of course...

Florent

--
Florent Guillaume, Nuxeo (Paris, France) Director of R&D
+33 1 40 33 71 59 http://nuxeo.com [EMAIL PROTECTED]
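One part of this estimate can be illustrated without a real BTree. The sketch below is a toy model in plain Python with hypothetical names and illustrative sizes (`N_BUCKETS`, `KEYSPACE`, `PAIRS`). It fixes N leaf buckets over an existing key range and measures how often two concurrent writers insert into the same bucket, which is the precondition for both of them being involved in the same split. It does not simulate splits or the actual BTrees conflict resolution.

```python
import bisect
import random


def bucket_index(lower_bounds, key):
    """Index of the leaf bucket whose key range contains `key`."""
    return max(bisect.bisect_right(lower_bounds, key) - 1, 0)


def same_bucket_rate(lower_bounds, keys_a, keys_b):
    """Fraction of paired inserts by two writers that hit the same bucket."""
    hits = sum(
        bucket_index(lower_bounds, a) == bucket_index(lower_bounds, b)
        for a, b in zip(keys_a, keys_b)
    )
    return hits / len(keys_a)


N_BUCKETS = 100          # N in the message; the value is illustrative
KEYSPACE = 1_000_000     # existing keys are assumed to cover [0, KEYSPACE)
PAIRS = 10_000

lower_bounds = [i * (KEYSPACE // N_BUCKETS) for i in range(N_BUCKETS)]

# Sequential ids from one shared pool: both writers append past the largest key.
seq_a = [KEYSPACE + 2 * i for i in range(PAIRS)]
seq_b = [KEYSPACE + 2 * i + 1 for i in range(PAIRS)]

# Random ids: each writer draws independently from the whole key space.
rng = random.Random(0)
rnd_a = [rng.randrange(KEYSPACE) for _ in range(PAIRS)]
rnd_b = [rng.randrange(KEYSPACE) for _ in range(PAIRS)]

sequential_rate = same_bucket_rate(lower_bounds, seq_a, seq_b)
random_rate = same_bucket_rate(lower_bounds, rnd_a, rnd_b)
```

With sequential ids every paired insert lands in the rightmost bucket, so `sequential_rate` is 1.0 by construction. With random ids the rate should be close to 1/N. How often a shared bucket actually produces an unresolvable conflict depends on how often it splits, which this model leaves out.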
Re: [ZODB-Dev] Re: What makes the ZODB slow?
Jean Jordaan wrote at 2006-6-23 16:24 +0200:
> ... write conflicts by large transactions ...
> To mitigate this, we want to create a savepoint and then commit more often while iterating and changing workflow, rolling back to the savepoint if necessary.

I fear this will not work -- at least not when you mean the ZODB savepoints.

The ZODB savepoints are on the sub-transaction level. Write conflicts happen at the transaction boundary.

-- Dieter
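The distinction can be shown with a toy model of optimistic concurrency. Everything below is hypothetical and written for this note (`ToyStorage`, `ToyTransaction`, `ConflictError` as a local class); it is not the API of ZODB or of the `transaction` package. A savepoint here is a purely local snapshot of pending writes, and the check against other writers happens only in `commit`.

```python
class ConflictError(Exception):
    """Raised when another transaction committed the same object first."""


class ToyStorage:
    def __init__(self):
        self.data = {}      # oid -> committed value
        self.serials = {}   # oid -> counter bumped on every committed write

    def serial(self, oid):
        return self.serials.get(oid, 0)


class ToyTransaction:
    def __init__(self, storage):
        self.storage = storage
        self.seen = {}      # oid -> serial observed when first touched
        self.writes = {}    # oid -> pending value, private until commit

    def write(self, oid, value):
        self.seen.setdefault(oid, self.storage.serial(oid))
        self.writes[oid] = value

    def savepoint(self):
        # Purely local snapshot: nothing is published to the storage.
        return dict(self.writes)

    def rollback_to(self, snapshot):
        self.writes = dict(snapshot)

    def commit(self):
        # Conflicts are only detected here, at the transaction boundary.
        for oid in self.writes:
            if self.storage.serial(oid) != self.seen[oid]:
                raise ConflictError(oid)
        for oid, value in self.writes.items():
            self.storage.data[oid] = value
            self.storage.serials[oid] = self.storage.serial(oid) + 1
        self.writes.clear()
        self.seen.clear()
```

In this model, taking a savepoint after each object changes nothing about what other transactions can commit in the meantime, so the conflict still surfaces at `commit`. Shorter transactions narrow the window in which that can happen; savepoints alone do not.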