Re: [ZODB-Dev] Speeding up ZODB (was "redis cache for RelStorage")
On Thu, May 5, 2011 at 4:43 AM, Pedro Ferreira wrote:
...
> Since we are talking about speed, does anyone have any tips on making
> ZODB (in general) faster?

It's hard to give general advice. The basic thing to be aware of is
that loads from the ZEO cache have some cost, but are fairly cheap,
especially if your ZEO cache can fit in the local operating-system disk
cache. (You may want to add RAM to your clients.) Loads from the
storage server are quite a bit more expensive. If you have a large
database (100s of gigabytes), then disk access times on your storage
server can also be a major factor.

Effective use of ZEO cache can make a big difference. To decide how big
your ZEO cache should be, turn on ZEO cache tracing, starting from an
empty cache, and use ZEO/scripts/cache_simul.py to experiment with
different cache sizes. To enable cache tracing, run your application
with the ZEO_CACHE_TRACE environment variable set to a non-empty value.

> In our project, the DB is apparently the
> bottleneck, and we are considering implementing a memcache layer in
> order to avoid fetching so often from DB.

You should look at your ZEO cache configuration first. You may be able
to improve performance quite a bit by simply increasing your ZEO cache
sizes. If you have larger ZEO caches, you should probably use
persistent caches. You probably also want to:

- set drop-cache-rather-verify to true on the client

- set invalidation-age on the server to at least an hour or two so
  that you can deal with being disconnected from the storage server
  for a reasonable period of time without having to verify.

If you haven't, make sure binary data like photos, movies, whatever are
stored in blobs.

Consider compressing your database records:

http://pypi.python.org/pypi/zc.zlibstorage

Not only will this save disk space, but it will allow more of your
database to fit in the storage server's disk cache. It will also allow
your ZEO caches to store 2-3 times as many records for a given cache
size.
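As a rough illustration of the compression point above, the sketch below
pickles a made-up record, compresses it with zlib, and estimates how many
such records would fit in a cache of a given size. This is not
zc.zlibstorage's code: the sample record, the 200 MB cache size, and both
helper names are assumptions for illustration, and the estimate ignores any
per-record overhead in the cache. Actual ratios depend entirely on your data.

```python
import pickle
import zlib


def pickled_and_compressed_size(record):
    """Return (raw_bytes, compressed_bytes) for a pickled record."""
    raw = pickle.dumps(record, protocol=2)
    packed = zlib.compress(raw)
    return len(raw), len(packed)


def records_per_cache(cache_bytes, record_bytes):
    """Estimate how many records of record_bytes fit in cache_bytes."""
    return cache_bytes // record_bytes


# Hypothetical, fairly repetitive record standing in for application data.
sample = {
    "title": "Weekly meeting",
    "body": "agenda item " * 200,
    "tags": ["zodb", "zeo", "cache"] * 20,
}

raw_len, packed_len = pickled_and_compressed_size(sample)
cache_bytes = 200 * 1024 * 1024  # assumed cache size, illustration only

print("raw bytes:", raw_len, "compressed bytes:", packed_len)
print("records per cache, raw:", records_per_cache(cache_bytes, raw_len))
print("records per cache, compressed:",
      records_per_cache(cache_bytes, packed_len))
```

Running the same measurement over a sample of your own records gives a
better feel for whether compression is worth enabling.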
A disadvantage of ZEO caches is that they aren't shared between
processes, and I've been thinking of ways to leverage something like
memcached.

> However, we were also
> wondering if we could in some way take advantage of different computer
> hardware - since the ZEO server is mostly single-threaded we thought of
> getting a machine with higher clock freq and larger cache rather than a
> commodity 8-core server (which is what we are using now).

Is your storage server CPU bound? Starting with ZODB 3.10, ZEO storage
servers are multi-threaded. They have a thread for each client. We have
a storage server that has run at 120% CPU on a 4-core box.

Also, if you use zc.FileStorage, packing is mostly done in a separate
process.

> Any tips on the kind of hardware that performs best with ZODB/ZEO?

A major source of slowdown *can* be disk access times. How's IO wait on
your server? If IO wait is high, then consider adding RAM to get a
larger cache or moving the database to an SSD. We're running our
largest, most active database on an SSD. (Blobs are still on magnetic
disk.) Again, compression can help a lot here, allowing databases to
fit on SSD that otherwise wouldn't.

> Are
> there any adjustments that can be done at the OS or even application
> layer that might improve performance?

Look at how your application is using data. If you have requests that
have to load a lot of data, maybe you can refactor your application to
load less.

Jim

--
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list - ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] How to check for setting the same values on persistent objects?
On 04/05/2011 11:53, Hanno Schlichting wrote:
> Hi.
>
> I tried to analyze the overhead of changing content in Plone a bit. It
> turns out we write back a lot of persistent objects to the database,
> even though the actual values of these objects haven't changed.
>
> Digging deeper I tried to understand what happens here:
>
> 1. persistent.__setattr__ will always set _p_changed to True and thus
> cause the object to be written back
> 2. Some BTree buckets define the "VALUE_SAME" macro. If the macro is
> available and the new value is the same as the old, the change is
> ignored
> 3. The VALUE_SAME macro is only defined for the int, long and float
> value variants but not the object based ones
> 4. All code in Products.ZCatalog does explicit comparisons of the old
> and new value and ignores non-value-changes. I haven't seen any other
> code doing this.
>
> I'm assuming doing a general check for "old == new" is not safe, as it
> might not be implemented correctly for all objects and doing the
> comparison might be expensive.

I know very little about ZODB internals, but in Python "old == new"
does not mean "old is new".

I don't know exactly how ZODB retrieves a particular object, but I
assume it does this using _p_oid. So for persistent classes you could
check old._p_oid == new._p_oid. For strings and ints you can of course
use "old is new".

Sorry if I'm wrong, as I may be missing a lot of things!

Alex
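The explicit old/new comparison described in point 4 of the quoted message
can be sketched at the application level as below. This is only a sketch
under assumptions: `TrackedRecord` is a hypothetical stand-in that mimics
the "any assignment marks the object changed" behaviour from point 1 without
needing the persistent package, and `set_if_changed` is an illustrative
helper name, not an existing ZODB or ZCatalog API. It is only appropriate
where `==` is cheap and correctly implemented for the values involved, which
is exactly the caveat raised in the quoted message.

```python
class TrackedRecord:
    """Hypothetical stand-in for a persistent object.

    Any attribute assignment marks the object as changed, even when the
    new value equals the old one.
    """

    def __init__(self, **attrs):
        for name, value in attrs.items():
            object.__setattr__(self, name, value)
        object.__setattr__(self, "_p_changed", False)

    def __setattr__(self, name, value):
        object.__setattr__(self, name, value)
        if name != "_p_changed":
            object.__setattr__(self, "_p_changed", True)


_MISSING = object()  # sentinel for "attribute not set yet"


def set_if_changed(obj, name, value):
    """Assign obj.<name> = value only if it differs from the current value.

    Returns True when an assignment happened, False when it was skipped.
    """
    current = getattr(obj, name, _MISSING)
    if current is not _MISSING and current == value:
        return False
    setattr(obj, name, value)
    return True


doc = TrackedRecord(title="Agenda")
set_if_changed(doc, "title", "Agenda")    # skipped, object stays unchanged
print(doc._p_changed)
set_if_changed(doc, "title", "Minutes")   # real change, object marked changed
print(doc._p_changed)
```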
Re: [ZODB-Dev] Speeding up ZODB (was "redis cache for RelStorage")
On Thu, May 5, 2011 at 10:43 AM, Pedro Ferreira wrote:
> Since we are talking about speed, does anyone have any tips on making
> ZODB (in general) faster?

Query fewer objects from the database. Make sure you don't store lots
of tiny persistent objects in the database; I'd aim for storing data in
chunks of 8-32 KB, or use blobs for larger objects. Remember that ZODB
is a key/value storage for the most part. Model your data accordingly.

> In our project, the DB is apparently the
> bottleneck, and we are considering implementing a memcache layer in
> order to avoid fetching so often from DB.

Before you do that, you might consider switching to RelStorage, which
already has a memcached caching layer in addition to the connection
caches.

But remember that throwing more caches at the problem isn't a solution.
It's likely the way you store or query the data from the database
that's not optimal.

> However, we were also
> wondering if we could in some way take advantage of different computer
> hardware - since the ZEO server is mostly single-threaded we thought of
> getting a machine with higher clock freq and larger cache rather than a
> commodity 8-core server (which is what we are using now).

The ZEO server needs almost no CPU power, except for garbage collection
and packing. During normal operations the CPU speed should be
irrelevant.

> Any tips on the kind of hardware that performs best with ZODB/ZEO? Are
> there any adjustments that can be done at the OS or even application
> layer that might improve performance?

Faster disks. Whatever you can do to get faster disks will help
performance. But that's general advice that applies to all database
servers. You can also throw more memory at the db server, so the
operating system's disk cache will kick in and you'll actually read
data from memory instead of the disks.
Hanno
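One way to read the "chunks of 8-32 KB" suggestion above is to group many
small items into batches and store each batch as a single record, instead of
one record per item. The sketch below does that grouping greedily, using the
pickled size of each item as an estimate. The function name, the 16 KB
target, and the sample items are all assumptions for illustration; the summed
per-item sizes only approximate the size of the pickled batch, and how a
batch maps onto a persistent object is left to the application.

```python
import pickle

TARGET_BYTES = 16 * 1024  # assumed target inside the 8-32 KB range


def batch_by_pickled_size(items, target_bytes=TARGET_BYTES):
    """Greedily group items into batches of roughly target_bytes.

    Each batch's summed per-item pickle size stays at or below
    target_bytes, except that a single oversized item gets a batch
    of its own.
    """
    batches = []
    current = []
    current_size = 0
    for item in items:
        size = len(pickle.dumps(item, protocol=2))
        # Close the current batch if adding this item would overflow it.
        if current and current_size + size > target_bytes:
            batches.append(current)
            current = []
            current_size = 0
        current.append(item)
        current_size += size
    if current:
        batches.append(current)
    return batches


# Hypothetical small records, each far below the target size.
items = [{"id": i, "note": "x" * 100} for i in range(1000)]
batches = batch_by_pickled_size(items)
print(len(items), "items grouped into", len(batches), "batches")
```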
[ZODB-Dev] Speeding up ZODB (was "redis cache for RelStorage")
> I suspect either one is so fast that the speed of Redis or Memcached is
> irrelevant. If you want speed, minimize the latency of the *network*,
> and that means getting good network hardware.

Since we are talking about speed, does anyone have any tips on making
ZODB (in general) faster? In our project, the DB is apparently the
bottleneck, and we are considering implementing a memcache layer in
order to avoid fetching so often from DB. However, we were also
wondering if we could in some way take advantage of different computer
hardware - since the ZEO server is mostly single-threaded we thought of
getting a machine with higher clock freq and larger cache rather than a
commodity 8-core server (which is what we are using now).

Any tips on the kind of hardware that performs best with ZODB/ZEO? Are
there any adjustments that can be done at the OS or even application
layer that might improve performance?

Thanks,

Pedro

--
José Pedro Ferreira
Software Developer, Indico Project
http://indico-software.org

CERN - European Organization for Nuclear Research
1211 Geneve 23, Switzerland
IT-UDS-AVC
Office: 513-R-0042
Tel. +41227677159