Re: [ZODB-Dev] Iterate through all objects from ZODB
On Mon, Sep 22, 2014 at 11:12 PM, Carlos Sanchez carlos.sanc...@nextthought.com wrote: Hi, I was wondering if there is an official API and/or a way to iterate through all objects in a ZODB database. In general, official interfaces are found in ZODB.interfaces. IStorageCurrentRecordIteration lets you iterate over meta data about objects in the database, including oid, tid and pickle. Both FileStorage and ZEO implement this interface. You can pass the oid to a connection's get method to get the object. Iterating over the entire database requires some care to avoid exceeding RAM. After dealing with each object, you'll probably want to call cacheGC on the connection to free unneeded memory. ... We are using RelStorage (MySQL) and ZEO (4 Dev) I don't know if RelStorage implements IStorageCurrentRecordIteration. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] Race condition in RelStorage history-free pack?
You should be able to arrange to some time in the past. You may need a custom pack script. (I don't remember off hand if zeopack accepts fractional days.) But maybe packing some small time period back (say 1 hour) would work around the issue. Jim On Fri, Jul 18, 2014 at 10:09 AM, Ian McCracken i...@zenoss.com wrote: We have an area of our product that can, depending on certain conditions, produce lots of rapid transactions all at once. We’re getting many reports of POSKeyErrors at sites where the transaction volume is higher than others. They appear to coincide (in many cases, at least) with zodbpack runs. After some investigation, it appears that it’s a timing issue. If an object is unreferenced during the pre_pack() stage, it will be marked for GC. If it then becomes referenced before the actual DELETE FROM is executed, it will be deleted nonetheless and POSKeyErrors will result. Now, I don’t know how the object is unreferenced and referenced in separate transactions; as far as I can tell, there are no two-transaction moves or anything like that happening. Nonetheless, it’s entirely possible we can solve this by tracking down the code that sets up the race condition. But is there any lower-level way around the race condition? A good amount of our application is pluggable by the end user, and I’d like to avoid vague technical warnings about transaction boundaries if possible :) If we could verify that no POSKeyErrors will result from the DELETE before running it, that would be much simpler. I should also mention that running zodbpack with days=1 isn’t an across-the-board solution for us. We have some customers for whom rapid growth of the database is an issue, and they pack more frequently. —Ian ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] Pack problem
On Mon, Jun 30, 2014 at 4:32 AM, Frédéric Iterbeke frederic.iterb...@ugent.be wrote: Op 30/06/2014 9:30, Alessandro Pisa schreef: I have a ~70Gb Data.fs that does not pack anymore. When I pack it it creates a ~8GB Data.fs.pack, then it evaluates thi condition: -https://github.com/zopefoundation/ZODB/blob/3.9.5/src/ZODB/FileStorage/fspack.py#L410 as True, removes, the Data.fs.pack and returns. The same happens with ZODB-3.10.5 and using pack-gc = false. In every permutation I tried the produced Data.fs.pack files have the same checksum. Does anybody have some hints? What I am trying to do: - comment the Data.fs.pack removing. - use that file as a new Data.fs Well, 70 Gb Data.fs is pretty big. But not impossible afaik. We have much larger datavases. If you set pack-gc = false it's normal that nothing is removed. No, it's not. If you think the code is doing something wrong and you would like to try packing anyway, I would suggest you just comment the entire if in the fspack.py code and try running this version. If I read the code correctly, this would force (at least trying) a pack in any case, assuming pack-gc=true. Really? You're suggesting modifying the code. I'm not guaranteeing anything and I'm just a zodb user though ;) And yet you suggest modifying code you don't understand. Amazing. And remember to use a copy of your data when doing stuff like this ;) At least you suggested making a copy. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] Pack problem
On Mon, Jun 30, 2014 at 4:40 AM, Alessandro Pisa alessandro.p...@gmail.com wrote: On 30 June 2014 10:32, Frédéric Iterbeke frederic.iterb...@ugent.be wrote: Op 30/06/2014 9:30, Alessandro Pisa schreef: I have a ~70Gb Data.fs that does not pack anymore. When I pack it it creates a ~8GB Data.fs.pack, then it evaluates thi condition: -https://github.com/zopefoundation/ZODB/blob/3.9.5/src/ZODB/FileStorage/fspack.py#L410 as True, removes, the Data.fs.pack and returns. The same happens with ZODB-3.10.5 and using pack-gc = false. In every permutation I tried the produced Data.fs.pack files have the same checksum. Does anybody have some hints? What I am trying to do: - comment the Data.fs.pack removing. - use that file as a new Data.fs Well, 70 Gb Data.fs is pretty big. But not impossible afaik. If you set pack-gc = false it's normal that nothing is removed. Setting or unsetting it doesn't change the produced Data.fs.pack (it has the same md5sum). Anyway the pack time considerably reduces (from 4 hours to 25 minutes). That's because the GC (and even pack algorithm) built into FileStorage is very inefficient. When you pack a file storage with GC enabled, you are really doing 2 things: 1. Removing non-current database records (as of the pack time). This is properly called packing. 2. Removing objects that are no longer reachable (from any records, current or non-current) from the root object. If packing doesn't remove any objects, neither will packing. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] Pack problem
On Mon, Jun 30, 2014 at 7:21 AM, Simone Deponti simone.depo...@abstract.it wrote: Hi Jim, On Mon, Jun 30, 2014 at 12:43 PM, Jim Fulton j...@zope.com wrote: As the comment suggests, if you continued packing, the new file would be as large as the old one, because no records would be removed. This is likely either because a) you've already packed to that pack time before, or b) None of the objects written up to the pack time have been written after the pack time and this there are no old records to be removed. Therefore, if I get it right, what happens is: * All transactions prior to the packing time are scanned to see if they contain reachable data, Not right. - First, it does a scan to determine which records are current as of the pack time. This has nothing to do with reachability or or GC. If an object is modified, then a new record is written and becomes current. Packing (before or without GC simply removes (or more precicely copies) current records. - It copies current records to pack files. If it determines that all records have been copied, then it stops, as there's no purpose in proceeding. if they do, they are kept. Therefore the condition there checks that, if we have reached the pack time (after which, all transactions are copied over anyway) and none has been detected as deletable, then it doesn't make sense to go on packing. * If pack-gc is on, then all the transaction prior to pack time that have been kept are purged of unreachable objects GC doesn't have anything directly to do with pack time. GC removed objects that are no longer reachable from the root from any records surviving a pack. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] Pack problem
On Mon, Jun 30, 2014 at 7:51 AM, Frédéric Iterbeke frederic.iterb...@ugent.be wrote: ... If you think the code is doing something wrong and you would like to try packing anyway, I would suggest you just comment the entire if in the fspack.py code and try running this version. If I read the code correctly, this would force (at least trying) a pack in any case, assuming pack-gc=true. Really? You're suggesting modifying the code. Sometimes I modify code like that in isolated environments to try to patch things or debug problems. Yes, also with code I haven't written myself or reviewed from the first 'till the last line. I did not suggest committing any change to the current codebase, if that is what you were implying. No, I'm saying it's reckless to suggest modifications to code you don't understand. Suppose your changes seemed to work but caused data corruption that wasn't detected until much later. Frédéric might have tried your suggested, throught it worked and then applied it to his production database. I'm not guaranteeing anything and I'm just a zodb user though ;) And yet you suggest modifying code you don't understand. Amazing. I was just trying to help. Which is the purpose of this list, is it not? Only if you're qualified. Do you help with brain surgery too? And remember to use a copy of your data when doing stuff like this ;) At least you suggested making a copy. At least you tried giving a relevant answer later on in the thread. Ever thought of the fact that the information you just gave on this list on the workings of pack, which other people are trying to comprehend and interpret right, is nowhere to be found in documentation? So users are left to find out for themselves. Or should we all try to fully understand each letter of code in a product before using it? Sorry the documentation isn't thorough enough. It's posts like this that make me not want to try to help others anymore. If I can keep an unqualified helper from causing harm, I've accomplished something. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] Pack problem
On Mon, Jun 30, 2014 at 8:24 AM, Alessandro Pisa alessandro.p...@gmail.com wrote: On 30 June 2014 12:43, Jim Fulton j...@zope.com wrote: On Mon, Jun 30, 2014 at 3:30 AM, Alessandro Pisa alessandro.p...@gmail.com wrote: Hello everybody :) As the comment suggests, if you continued packing, the new file would be as large as the old one, because no records would be removed. This is likely either because a) you've already packed to that pack time before. b) None of the objects written up to the pack time have been written after the pack time and this there are no old records to be removed. Strange, I am making a 0 day pack. Perhaps you had a clock problem and the recent records have timestamps in the future. How can I convince zeopack of that? You pass a pack time of now. :) Is it possible to remove this previous pack memory and act as it would be the first pack? Theoretically. Would this be effective? At causing dangling references, possibly. Any suggestion for reducing the Data.fs size? I suggest using the file-store iterator to look at the transaction timestamps. Something like: from ZODB.FileStorage import FileIterator it = FileIterator('s.fs') last = None for t in it: if last is not None: if t.tid = last: print 'wtf', repr(t.tid) last = t.tid If you've said to pack to the present and you aren't writing to the database, then I would expect it to stop at the end of the file, unless you have a problem with your transaction ids. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] Pack problem
On Mon, Jun 30, 2014 at 10:44 AM, Alessandro Pisa alessandro.p...@gmail.com wrote: On 30 June 2014 16:02, Alessandro Pisa alessandro.p...@gmail.com wrote: On 30 June 2014 15:42, Jim Fulton j...@zope.com wrote: On Mon, Jun 30, 2014 at 8:24 AM, Alessandro Pisa I suggest using the file-store iterator to look at the transaction timestamps. This is what I got [root@zeoserver]# cat scripts/test_tids.py from ZODB.FileStorage import FileIterator it = FileIterator('var/filestorage/Data.fs') print 'Size %s' % it._file_size last = None counter = 0 for t in it: if last is not None: if t.tid = last: import pdb; pdb.set_trace() print 'wtf', repr(t.tid) counter += 1 last = t.tid print 'Last transaction: %s-%s' % (t._tpos, t._tend) print 'Transactions: %s' % counter [root@zeoserver]# ./bin/zopepy scripts/test_tids.py Size 67395884639 Last transaction: 67392229872-67395884631 Transactions: 1275366 It seems that tids are in order :/ Then I suggest seeing if any are in the future. You can create a tid for now with repr(ZODB.TimeStamp.TimeStamp(Y, M. D, h, m, s)) Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] Pack problem
On Mon, Jun 30, 2014 at 11:38 AM, Alessandro Pisa alessandro.p...@gmail.com wrote: ... I checked the last transaction tid and it is in the future... ... print 'Transactions: %s' % counter [root@zeoserver]# ./bin/zopepy scripts/test_tids.py Size 67395884639 Last transaction: 67392229872-67395884631 (2014-07-01 00:10:04.686741) Transactions: 1275366 So, you can pass a time.time in future, or just wait a few hours :) Thanks for the valuable help of everybody, I learned a lot. You're welcome. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] Massive LockError: Couldn't lock....check_size.lock messages
On Tue, Apr 15, 2014 at 9:28 AM, Andreas Jung li...@zopyx.com wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Hi there, we have a Plone 4.2 setup running ZEO with 3 application servers and a non-shared blob setup, ZODB3-3.10.5-py2.7-linux-i686 We see massive amounts of the following error message on every application server. The application servers are configured to use a local blob-cache directory. We tried to tune the blob cache size (even down to zero) but without success. Any idea about this issue? This is a missfeature in zc.lockfile. It's crying wolf. (Well, not really, but it has know way of knowing if the lock is important. Sometimes it is, and sometimes not.) These messages can be ignored. If someone wanted to fix this, they could add an argument to the constructor to suppress these log messages. In the mean time, as a work around, you could adjust your logger configuration to suppress these yourself. Jim Andreas 2014-04-15T15:24:34 ERROR zc.lockfile Error locking file /srv/gehirn/dasgehirn_buildout/var/blobstorage.cache/check_size.lock; pid=25381 Traceback (most recent call last): File /srv/gehirn/dasgehirn_buildout/eggs/zc.lockfile-1.0.2-py2.7.egg/zc/lockfile/__init__.py, line 84, in __init__ _lock_file(fp) File /srv/gehirn/dasgehirn_buildout/eggs/zc.lockfile-1.0.2-py2.7.egg/zc/lockfile/__init__.py, line 59, in _lock_file raise LockError(Couldn't lock %r % file.name) LockError: Couldn't lock '/srv/gehirn/dasgehirn_buildout/var/blobstorage.cache/check_size.lock' -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.11 (Darwin) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iQGUBAEBAgAGBQJTTTPkAAoJEADcfz7u4AZjY7oLvjOM2VyMng1g8QDnQheQ8MCY pEjxNGTfKskZKJaeYdbOBsSAGC46dYEQNee56CRbPO+G4QsjXDnMjPGIrbUF9/CD DeEA5wiCeTkb/Y7WRhfqvCDuLmIHEY0oYeFlpXH2bQ62uTGA4BD1EnvYi6wJKurH WsM4JU5X+KzHJhC867M5fFpL0r0kLXVHNovY9q3zCdrS4kqTPr/OEgKKPpI4ytIK BDpRtV7ZRPKE0Kgi6mJAPYr6F541BqMVQdCrK14LtbKCwcJDbrrVcsM+eZSjI9AN wqtVB30OwfRQiYFBRmRfm8ncHrALx4W4a6rvvoMAp9uf7F8cuIRRIvW4vymIxiR0 a2jfal2PR0QwM4XW7tb/42lhE1Gvwmd0VPk+bSg6MHYoMB78j6wOnBjlot3BJADS kXpTnYXAkHLe+PQX7JROX53C67f/pN1rXkr9VwEGWKWQaODS26n2IachD5fs8AC5 a38Jo3EcR3cTIYiZady62M7muJqkJO4= =UF30 -END PGP SIGNATURE- ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
[ZODB-Dev] New ZODB Mailing list
I've created a new ZODB mailing list via google groups: https://groups.google.com/forum/#!forum/zodb I've done this for 2 reasons: - We're having some difficulty managing mail.zope.org. Currently, there are some issues (certificate related?) that are causing mail to bounce. This is causing the membership in this list to slowly erode. This is fixable by someone with knowledge of email arcana or someone willing to spend the time to learn about it. Personally, I have better things to do, and no one else in the Zope Foundation has stepped forward, so ... - I also want to be more welcoming to ZODB users. A user-oriented list is long overdue. For now, we'll discuss user as well as developer issues on the new list. It's not like traffic has been heavy lately. ( I would like to know what's up with the zodb-dev google group. I'm 91% sure one of *us* must have set it up at some point.) In a week or so, I'll disable posts to this list, with a notice to move to the google group. In the mean time, I'll post to both places. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
[ZODB-Dev] Sprucing up zodb.org and pull request
This was posted to z...@googlegroups.com. If you want to reply, please do so there. I'd like to spruce up zodb.org, especially the documentation. I think the ZODB is under-appreciated and it's hard to promote it without a good docs and a reasonable web site. I plan to do this in small increments, to make it easier for me to squeeze in time and to make it easier for other people to contribute through their own small changes and review. I'd like to offer my sincere thanks to the folks who set it up initially, providing a basis for incremental improvements. The website is a sphinx project: https://github.com/zopefoundation/zodbdocs now hosted by Read the Docs. Thanks Read the Docs!!! Help would be much appreciated and can be offered in small parts. From offering edits, to reviewing pull requests, to pointing out problems. If you see something that needs to be fixed or want to suggest an improvement, please file an issue: https://github.com/zopefoundation/zodbdocs/issues If you want to help by reviewing edits, look for pull requests: https://github.com/zopefoundation/zodbdocs/pulls You don't need to be a Zope contributor to make a documentation pull request. BTW, I'd like non-trivial changes to be made via pull request. This last week, I finally got around to some overdue work on the tutorial. It would be great is someone would review my changes: https://github.com/zopefoundation/zodbdocs/pull/1 Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] Does ZODB pipeline load requests?
On Wed, Feb 19, 2014 at 7:40 PM, Dylan Jay d...@pretaweb.com wrote: Iterators certainly seem like a logical place to start. As an example I originally was doing a TTW zope reindex of a single index. Due to conflict problems I used a modified version of this https://github.com/plone/Products.PloneOrg/blob/master/scripts/catalog_rebuild.py (which I'd love to integrate something similar into zcatalog sometime). Both use iterators I believe. I think even if there was an explicit api where you can pass in an iterator, a max buffer length and you'd get passed back another iterator. Then asynchronously objects will load to try and keep ahead of the iterator consumption. e.g. for obj in async_load(myitr, 50): dox(obj) I like the idea of a wrapper. I think a) you're pushing the abstraction to far, and b) this doesn't have to be a ZODB API, at least not initially. In any case, if the lower-level API exists, it would be straightforward to implement one like above. I don't know how that would help with a loop like this however for obj in async_load(myitr, 50): dox(obj.getMainObject()) Well, this would simply be another custom iterator wrapper. for ob in main_iterator(myiter): ... Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
[ZODB-Dev] Did someone create a zodb-dev google group?
I just tried creating one, but it was already taken and is not public. :( Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] Does ZODB pipeline load requests?
On Tue, Feb 18, 2014 at 8:59 PM, Dylan Jay d...@pretaweb.com wrote: Hi, I'm seeing a a ZCatalog reindex of a large number of objects take a long time while only using 10% cpu. I'm not sure yet if this is due to the size of the objects and therefore the network is saturated, or the ZEO file reads aren't fast enough. How heavily loaded is your storage server, especially %CPU of the server process? Also, are the ZODB object or client caches big enough for the job? However looking at the protocol I didn't see a way for code such as the ZCatalog to give a hint to ZEO as to what they wanted to load next so the time is taken by network delays rather than either ZEO or app. Is that the case? It is the case that a ZEO client does one read at a time and that there's no easy way to pre-load objects. I'm guessing if it is, it's a fundamental design problem that can't be fixed :( I don't think there's a *fundamental* problem. There are three issues. The hardest to solve isn't at the storage level. I'll mention the 2 easiest problems first: 1. The ZEO client implementation only allows one outstanding request at a time, even on a client with multiple threads. This is merely a clumsy implementation. The protocol easily allows for multiple outstanding reads! 2. The storage API doesn't provide a way to read multiple objects at once, or to otherwise hint that additional objects will be loaded. Both of these are fairly straightforward to fix. It's just a matter of time. :) 3. You have to be able to predict what data are going to be needed. This IMO is rather hard, at least at a general level. It's what's left me somewhat under-motivated to address the first 2 problems. We really should address problems 1 and 2 to make it possible for people to experiment with approaches to problem 3. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] Does ZODB pipeline load requests?
On Wed, Feb 19, 2014 at 9:57 AM, Dylan Jay d...@pretaweb.com wrote: On 19 Feb 2014, at 10:44 pm, Jim Fulton j...@zope.com wrote: ... yeah I figured it might be the case thats its hard to predict. In this case it's catalog indexing so I was wondering if something could be done with __iter__ on a btree? It's a reasonably good guess that you could start preloading more of those objects if the first few are loaded? Iterators certainly seem like a logical place to start. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] [zopefoundation/ZODB] 49919d: test for POSKeyError during transaction commit
On Wed, Feb 5, 2014 at 1:57 AM, Marius Gedminas mar...@gedmin.as wrote: On Tue, Feb 04, 2014 at 12:44:09PM -0500, Tres Seaver wrote: On 02/04/2014 06:28 AM, Godefroid Chapelle wrote: Le 03/02/14 20:53, Tres Seaver a écrit : I wish you hadn't pushed that -- some of these changes are definitely inappropriate on the 3.10 branch (adding an Acquisition dependency is definitely wrong). Agreed. Note that non-trivial commits to a release branch, like master, should be via pull request. Acquisition is added as a test dependency. Any hint how to replicate the bug without acquisition is welcome. Define a subclass of Persistent which emulates what Acquisition does, e.g.: from persistent import Persistent class Foo(Persistent): @property def _p_jar(self): # or whatever attribute trggers return object() What if full replication requires a C extension module? (I hope that's not true and that it is possible to reproduce the bug using some fakes, but I haven't spent the time investigating this.) I'm going to dig into this. I'm baffled by the assertion that this has anything to do with readCurrent. Regardless of whether it should have been made to the 3.10 branch, I'm going to use Godefroid's test case to dig further. Which other change is inappropriate ? Adding MANIFEST.in on a release branch seems wrong to me (I don't like them anyway, and we *definitely* don't want to encourage instsall-from-a-github-generated-tarball on a release branch). That's like objecting if someone adds a .gitignore to a release branch. Or a .travis.yml. It's not code, it's metadata. Yup. (I never liked setuptool's magic let me query git to see what source files you have, but not by default, oh no, instead let's assume everybody has installed the non-standard plugin into their system Pythons and then let's silently produce broken tarballs if they haven't, because obviously implicit is better than explicit, and when there's temptation the right thing is to guess behavior anyway, and we *definitely* don't want broken sdists on PyPI.) I couldn't agree more. One of the advantages of moving to git was circumventing setuptool's misguided magic. I've no idea what Tres was referring to wrt instsall-from-a-github-generated-tarball, but I use Manifest.in files in all my modern projects. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] [zopefoundation/ZODB] 49919d: test for POSKeyError during transaction commit
On Wed, Feb 5, 2014 at 9:23 AM, Godefroid Chapelle got...@bubblenet.be wrote: Le 05/02/14 13:25, Jim Fulton a écrit : Acquisition is added as a test dependency. Any hint how to replicate the bug without acquisition is welcome. Define a subclass of Persistent which emulates what Acquisition does, e.g.: from persistent import Persistent class Foo(Persistent): @property def _p_jar(self): # or whatever attribute trggers return object() What if full replication requires a C extension module? (I hope that's not true and that it is possible to reproduce the bug using some fakes, but I haven't spent the time investigating this.) I'm going to dig into this. I'm baffled by the assertion that this has anything to do with readCurrent. For sure : the POSKeyError happens during connection.commit when checking oids stored in Connection._readCurrent mapping. (see traceback at http://rpatterson.net/blog/poskeyerror-during-commit) The _readCurrent mapping is populated only by calls to Connection.readCurrent method. In the Plone code base, the only way I found to get that Connection.readCurrent method to be called is by adding a key value pair to a BTree. _BTree_set C function is then called, which in turn calls readCurrent by inlining the PER_READCURRENT macro. This calls the cPersistence.c readCurrent function, which in turn calls readCurrent method on the ZODB connection. Wow. I had to dig a bit to remind myself (vaguely) why I added this. When setting a key value pair on a new (not already committed) instance of a standard BTree, readCurrent method is not called on the connection. This is with your change, right? My understanding is that it is due to the fact that _p_jar and _p_oid are only set during transaction commit. They can be set earlier by calling the connection add method. This is used often for frameworks that use object ids at the application level. However, with a new BTree instance that also inherits from Acquisition.Implicit, readCurrent method is called on ZODB connection when setting key value pair. The only explanation I found is that this instance _p_jar attribute has a value (acquired in a way or another ?). You could also simulate this by adding an object to a connection using a connection's add method. Wanna update the test to use this technique instead? In this case, when readCurrent is called on an object created during a savepoint and this savepoint is rolled back, the oid is leftover in the Connection._readCurrent mapping. This leads to the POSKeyError when committing later as checkCurrentSerialInTransaction cannot check the object since it went away at rollback. This brings us to the fix I propose: calls to readCurrent should not track objects with oid equal to z64. ... This was a very long explanation which I hope will help to confirm the fix or to come up with a better one. PS: keep in mind that english is not my mothertongue. :) You do very well. I think your fix is correct. As you point out, It doesn't make sense to guard against conflicts on new objects. I think a cleaner test could be written using the connection add method. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] Unpickler.noload, zc.zodbdgc and multi-database reference bug (IndexError)?
On Thu, Jan 30, 2014 at 9:58 AM, jason.mad...@nextthought.com wrote: Hello ZODB dev, I was recently trying to GC a large multi-database setup for the first time using zc.zodbdgc. The process wouldn't complete (or really even get started) because of an IndexError being thrown from `zc.zodbdgc.getrefs` (__init__.py line 287). As I traced through it, it began to look like the combination of `cPickle.Unpickler.noload` and multi-database persistent ids (which in ZODB are list objects) fails, generating an empty list instead of the expected [ref type, args] list documented in `ZODB.serialize`. This makes it impossible to correctly GC a multi-database. I was curious if anyone else had seen this I haven't. I'm the author of zodbdgc and I use it regularly, including on large (for some definition) databases. , or maybe I'm just doing something wrong? We solved our problem by using `load` instead of `noload`, but I wondered if there might be a better way? Details: I'm working under Python 2.7.6 and 2.7.3 with ZODB 4.0.0, zc.zodbdgc 0.6.1 and eventually zodbpickle 0.5.2. Most of my results were repeated on both Mac OS X and Linux. Why are you using zodbpickle? Perhaps that is behaving differently from cPickle in some fashion? After hitting the IndexError, I began debugging the problem. When it became clear that the persistent_load callback was simply getting the wrong persistent ids passed to it (empty lists instead of complete multi-db refs), I tried swapping in zodbpickle for the stock cPickle to the same effect. Here's some code demonstrating the problem: This pickle data came right out of ZODB, captured during a debug session of zc.zodbdgc. It has three persistent ids, two cross database and one in the same database: p = 'cBTrees.OOBTree\nOOBTree\nq\x01.X\x0c\x00\x00\x00Users_1_Prodq\x02]q\x03(U\x01m(U\x0cUsers_1_Prodq\x04U\x08\x00\x00\x00\x00\x00\x00\x00\x01q\x05czope.site.folder\nFolder\nq\x06tq\x07eQX\x0c\x00\x00\x00Users_2_Prodq\x08]q\t(U\x01m(U\x0cUsers_2_Prodq\nU\x08\x00\x00\x00\x00\x00\x00\x00\x01q\x0bh\x06tq\x0ceQX\x0b\x00\x00\x00dataserver2q\r(U\x08\x00\x00\x00\x00\x00\x00\x00\x10q\x0eh\x06tQq\x0f.' This code is copy-and-pasted out of zc.zodbgc getrefs. It's supposed to find all the persistent refs and put them inside the `refs` list: import cPickle import cStringIO refs = [] u = cPickle.Unpickler(cStringIO.StringIO(p)) u.persistent_load = refs u.noload() u.noload() But if we look at `refs`, we see that the first two cross-database refs are returned as empty lists, not the correct value: refs [[], [], ('\x00\x00\x00\x00\x00\x00\x00\x10', None)] If instead we use `load` to read the state, we get the correct references: refs = [] u = cPickle.Unpickler(cStringIO.StringIO(p)) u.persistent_load = refs u.noload() u.load() refs [['m', ('Users_1_Prod', '\x00\x00\x00\x00\x00\x00\x00\x01', class 'zope.site.folder.Folder')], ['m', ('Users_2_Prod', '\x00\x00\x00\x00\x00\x00\x00\x01', class 'zope.site.folder.Folder')], ('\x00\x00\x00\x00\x00\x00\x00\x10', class 'zope.site.folder.Folder')] The results are the same using zodbpickle or using an actual callback function instead of the append-directly-to-list shortcut. If we fix the IndexError by checking the size of the list first, we miss all the cross-db references, meaning that a GC is going to be too aggressive. But using `load` is slower and requires access to all of the classes referenced. If anyone has run into this before or has other suggestions, I'd appreciate hearing them. I'd try using ZODB 3.10. I suspect a ZODB 4 incompatibility of some sort. Unfortunately, I don't have time to dig into this now. This weekend, I'll at least see if I can make zodbdgc tests pass with ZODB 4. Perhaps that will shed light. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] Unpickler.noload, zc.zodbdgc and multi-database reference bug (IndexError)?
On Thu, Jan 30, 2014 at 12:40 PM, jason.mad...@nextthought.com wrote: On Jan 30, 2014, at 11:12, jason.mad...@nextthought.com wrote: So it seems that the behaviour of `noload` might have changed between 2.6.x and 2.7.x? Apologies for replying to myself, but I think I found the root cause. After some further investigation, I found issue 1101399 (http://bugs.python.org/issue1101399), complaining that noload is broken for subclasses of dict. The fix for this issue was applied to the cPython trunk in October of 2009 without any corresponding tests (http://hg.python.org/releasing/2.7.6/rev/d0f005e6fadd). In releases with this fix (if I'm reading the code correctly), Probably because the original code had no tests (because we weren't doing that then) and no documentation either (my bad) because this was added for the specific use case of efficiently scraping out references in ZODB. This fix means that multi-database references are always going to be returned as an empty list under noload (again, if I'm reading the code correctly). This means that multi-references and noload don't work under Python 2.7.x or 3.x with zodbpickle and so consequently neither does an unmodified zc.zodbdgc. Sigh. We're still using Python 2.6 for out database servers. :) I don't know what the best way forward is. Our solution to use `load` instead seems to work for us, but may not work for everyone. It will probably work, but be slower, which isn't a huge deal since zc.zodbdgc runs out of process. We run it on ZRS secondaries, which are otherwise idle. Maybe zodbpickle could revert the fix in its branch and zc.zodbgc could depend on that? I'm happy to help test other ideas. That may be the best way forward. I can't speak with much confidence until I've had a chance to wade in and refresh my memory on this stuff. Something I wish I'd done differently in ZODB and contemplated changing on a number of occasions is the handling of references. I wish I'd stored them outside the pickle so they could be analyzed without unpicking (or at least without unpickling the application data). Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] Optimizing RelStorage Packing for large DBs
On Fri, Nov 15, 2013 at 8:01 PM, Jens W. Klein j...@bluedynamics.com wrote: I started a new packing script for Relstorage (history free, postgresql). It is based on incoming reference counting. Did you look at zc.zodbdgc? I think it implements something very close to what you're proposing. It's been in production for a few years now at ZC. Not sure if it would need to be updated for relstorage. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] Optimizing RelStorage Packing for large DBs
On Mon, Nov 18, 2013 at 8:43 AM, Jan-Wijbrand Kolman janwijbr...@gmail.com wrote: On 11/18/13 12:19 PM, Jim Fulton wrote: On Fri, Nov 15, 2013 at 8:01 PM, Jens W. Klein j...@bluedynamics.com wrote: I started a new packing script for Relstorage (history free, postgresql). It is based on incoming reference counting. Did you look at zc.zodbdgc? I think it implements something very close to what you're proposing. It's been in production for a few years now at ZC. Not sure if it would need to be updated for relstorage. AFAICT it does not work against a relstorage backend. Or at least I think to understand that from: http://www.zodb.org/en/latest/documentation/articles/multi-zodb-gc.html [...This documentation does not apply to RelStorage which has the same features built-in, but accessible in different ways. Look at the options for the zodbpack script. The –prepack option creates a table containing the same information as we are creating in the reference database[...] I didn't write that. I think zodbdgz probably would work, possibly with some modifications. If nothing else, it should be consulted, but then again, writing software is fun. Note that the important aspect here isn't cross-database references, but the garbage collection algorithm, which is incremental and uses a linear scan of the database. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] Optimizing RelStorage Packing for large DBs
On Mon, Nov 18, 2013 at 12:00 PM, Jens W. Klein j...@bluedynamics.com wrote: Hi Jim, thanks for the hint (also in the other post). I looked at zc.zodbdgc and took some inspiration from it. As far as I understand it stores the incoming references in a separate filestorage backend. This is just a temporary file to avoid storing all the data in memory. So this works similar to my impelmentation but uses the ZODB infrastructure. I dont see how I make zc.zodbdgc play with Relstorage and since it works on the abstracted ZODB level using pickles I don't know what you're saying, since I don't know what it refers to. zodbdgc works with storages. relstorage conforms to the storage API. It's possible some changes would be needed, but they should be minor. I suspected it to be not fast enough for so many obejcts No idea why, or what fast enough is. We use it on a database with ~200 million objects. - so I skipped this alternative. Good luck. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] Relstorage and over growing database.
On Mon, Nov 11, 2013 at 4:24 PM, Daniel Widerin dan...@widerin.net wrote: Hi, just want to share our experience: My ZODB contains 300mio objects on relstorage/pgsql. The amount of objects is caused by btrees stored on plone dexterity contenttypes. It's size is 160GB. At that size it's impossible to pack because the pre-pack takes 100 days. jensens and me are searching for different packing algorithms and methods to achieve better packing performance. We're keeping you updated here! How i solved my problem for now: I converted into FileStorage which took about 40 hours and Data.fs was 55GB in size. Now i tried to run zeopack on that database - which succeeded and database was reduced to 7.8 GB - still containing 40mio objects. After that i migrated back to relstorage because of better performance and the result is a 11 GB db in pgsql. Hah. Nice. Have you measured an improvement in relstorage performance in practice? Is it enough to justify this hassle? WRT packaging algorithms: - You might look at zc.FileStorage which takes a slightly different approach than FileStorage: - Does most of the packing work in a separate process to avoid the GIL. - Doesn't do GC. - Has some other optimizations I don't recall. For our large databases, it's much faster than normal file-storage packing. - Consider separating garbage collection and packing. This allows garbage collection to be run mostly against a replica and to be spread out, if necessary. Look at zc.zodbdgc. Anyone experienced similar problems packing large relstorage databases? The graph traversal takes a really long time. maybe we can improve that by storing additional information in the relational database? Any hints or comments are welcome. Definately look at zodbdgc. It doesn't traverse the graph. It essentially does reference counting and is able to iterate over the database, which for FileStorage, is relatively quick. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] ZODB.FileStorage.format: TxnHeader cannot handle Unicode 'descr'
On Mon, Oct 7, 2013 at 11:58 AM, Tres Seaver tsea...@palladion.com wrote: ... transaction.note is defined to take a bytes string. Pyramid should encode the path before passing it to transaction.note. The interfaces says text. I realize that this is likely for hysterical raisins, but if we mean bytes, we should say so. Note that the implementation's use of an unadorned string literal to join the values means that in Py3k, it really *is* text, and not bytes. If we want the application to do the encoding, then we should change that literal as well. Agreed. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] ZODB.FileStorage.format: TxnHeader cannot handle Unicode 'descr'
On Sat, Oct 5, 2013 at 1:47 AM, Chao Meng bobom...@gmail.com wrote: Hi there, Here is an issue from zopefoundation/ZODB github three months ago. I am facing similar issue now and don't know how to resolve. github issue link: https://github.com/zopefoundation/ZODB/issues/12 When I use Pyramid+ZODB+traversal and use some Chinese characters in URL. Note that my resource tree saved in ZODB with unicode fine for the Chinese characters as object names. Basically when save transaction, ZODB.FileStorage.format TxnHeader uses request.path_info as it's descr, which is unicode, but TxnHeader cannot handle Unicode :( It would be great if anyone can help or give some pointers. This is a Pyramid bug. transaction.note is defined to take a bytes string. Pyramid should encode the path before passing it to transaction.note. Alternatively, Pyramid could store the path in transaction extended info, which accepts any picklable type. Of course, we could revisit this. If we did, I'd deprecate the transaction user and description attributes and only support meta data via the extended info mechanism, which I'd rename meta data. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
[ZODB-Dev] Backward-incompatible change in TimeStamp repr in ZODB 4.
TimeStamps are used mainly to convert between high-level date-time representations and 8-byte binary transaction IDs. Historically, a TimeStamp's repr was used to retrieve the binary representation. repr was used because, under the hood, slots are much faster to access than methods. In ZODB 3.3, a TimeStamp raw method was added to retrieve the binary data for a time stamp. I wasn't actually aware of this until recently. From an API point of view, using raw rather than repr is cleaner. I don't know if the performance implications are significant, though probably not. In Python 3, __repr__ returns Unicode, rather than binary data, so it's no-longer possible to use it to get the binary representation of a time stamp. TimeStamp's __repr__ was changed to return the repr of it's binary data. Python 3 was going to have to be broken, but this was also a breaking change for Python 2. I don't remember this issue being raised earlier. If so, I missed it. In any case, going forward, it's best to embrace raw() as the correct way to get the binary representation of time stamps. This is mainly a heads up for people porting to ZODB 4. I don't really see much value in returning the repr of the binary data. I'd at least wrap the string in a TimeStamp call, something like TimeStamp(b'...'). I'd hoped that ZODB 4.0 would be backward compatible with ZODB3. That's why ZODB3 3.11 is a meta package that requires the ZODB 4 packages. Unfortunately, this means that ZODB3 3.11 isn't backward compatible. Fortunately, ZODB 3.11.0 is still in alpha. :) I think the best option is to release what is currently ZODB3 3.11.0a3 as ZODB3 4.0.0. This will allow packages that depend on ZODB3 to be used with ZODB 4, but it will clearly label the ZODB4 4.0.0 release as not backward compatible. Another option is to leave things as they are. Since buildout 2, and now pip, prefer final releases, existing applications that use current buildout or pip aren't broken by ZODB3 3.11a3, even if they don't pin versions. If someone wants to mix ZODB3 and ZODB 4, they can explicitly require ZODB3 3.11a3. Thoughts? Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] https://github.com/zopefoundation/zc.beforestorage
On Fri, Sep 27, 2013 at 12:35 PM, Christian Tismer tis...@stackless.com wrote: Hi, I saw me subscribed to zc.beforestorage , today. Congratulations! ;) If I'm not mislead, versions are no longer supported in 4.0, or is this still a supported approach? versions were removed in 3.9. I think history should not depend on having pack()'ed or not, but an explicit snapshot feature that puts a set of objects into some history object. shrug That would be a new feature. One I've even contemplated, sort of, in a notional FileStorage2 design that stores data in a sequence of files, which would allow point-in-time snapshots. Has that been discussed, and can someone please point me at it? Undoubtably, but I can't think of an instance in particular. beforestorage takes advantage of the fact that most ZODB storage implementations keep a limited sequence of transactions to provide a limited form of time travel and, most importantly, to provide a temporary snapshot of a database that's being written, mainly for use with DemoStorage. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
[ZODB-Dev] ZODB 4.0.0 and ZEO 4.0.0 Released
Hopefully, we can increase our development tempo a bit now that we have this base to build on. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] polite advice request
On Sun, Aug 18, 2013 at 12:17 PM, Christian Tismer tis...@stackless.com wrote: ... We get a medication prescription database in a certain serialized format which is standard in Germany for all pharmacy support companies. This database comes in ~25 files == tables in a zip file every two weeks. The DB is actually a structured set of SQL tables with references et al. So you get an entire database snapshot every 2 weeks? I actually did not want to change the design and simply created the table structure that they have, using ZODB, with tables as btrees that contain tuples for the records, so this is basically the SQL model, mimicked in Zodb. OK. I don't see what advantage you hope to get from ZODB. What is boring is the fact, that the database gets incremental updates all the time, changed prices, packing info, etc. Are these just data updates? Or schema updates too? We need to cope with millions of recipes that come from certain dates and therefore need to inquire different versions of the database. I don't understand this. What's a recipe? Why do you need to consider old versions of the database? Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] polite advice request
On Sun, Aug 18, 2013 at 1:40 PM, [mabe] pub...@enkore.de wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 He meant prescription. In german Rezept is the word for both prescription and recipe (like in cooking). Easy to confuse for us germans in english :) Great. Now I don't know what he meant by prescription. :) Does it matter? Might it as easily be foos and bars? Christian, Are you saying that you might need to access items from an old database that aren't in the current snapshot? Jim On 08/18/2013 06:34 PM, Jim Fulton wrote: On Sun, Aug 18, 2013 at 12:17 PM, Christian Tismer tis...@stackless.com wrote: We need to cope with millions of recipes that come from certain dates and therefore need to inquire different versions of the database. I don't understand this. What's a recipe? Why do you need to consider old versions of the database? -BEGIN PGP SIGNATURE- Version: GnuPG v2.0.20 (GNU/Linux) iQIcBAEBAgAGBQJSEQb/AAoJEAOmTcUxK/swEXgP/Ry3x9Y98wp43e2F2cf2063O F2UGRNZfylMjG3kTBLfwW9eH5KWk7AmCXdzUw/fXggueyg0NrH9f8aScYVPYHSEp g3q9n/I93DrMdDakqLXcnpHlKuUrd1ZfBk+XSyavvnOdV4LWGJ6+Wd8yqAFmUUCl bn//STvajUqSpO1+nG0aQsSceeTCVTEuyzQ/O4nSujhERG2ED7XOwi/1WwgruZSY 2ZGZCeLmHHLgYg6G8zPDRX6q/Y0GYLGi2bCQ0aQWlHEkBJBtPgCWn3rG+9GBlNXv bSXu0yjbaHL3q8VvdwAh4Y7n8E9TV1KVojOJmCg6MOA+AusL475Lao2/yBtZG3s3 mg12/NSUY/hGGoqtnsvXkIV8+ggK7WVlZRDzAoiHymR/3kdNO4MWYxFcvjCrvu8x RB6gIsVLglWKu5cuCJDrK7eGmdVK/y0Tmtl2qGKNnn+PJrZqNB9rk2kfmPMVIBdy VkFjvBQICL3aFZjSEDeqOeLdis221V9y3ndgKer6K5OG2KBNsv8dUX2smb7Qx7RT dbhhXwhI3C9i7ifzDEcrUavUfJCDQNLQovo1F/sL5hChFJAFS6USeWALt7B41YBu lN5ThjgIhkuyWfhs+ZAPeze5rRcY5lt+3oWLcD9fav+jJsifGodBdLrJ2dbljtWw 4FJBrKq/+ULC03toajwM =A/VY -END PGP SIGNATURE- ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] when to use connection close/sync
On Fri, Aug 16, 2013 at 9:38 AM, Joerg Baach li...@baach.de wrote: Hi *, after having looked around on google and the zodb api documentation[1] I am still unsure how to handle connections properly (especially in the context of syncing zeo clients): When do I need to: - connection.close() When you're done using a connection. :) - connection.sync() Never. This is obsolete. I suspect you aren't asking the right question. I'm gonna guess you're asking: What do I need to do to see database updates (made by other clients/connections)? The answer to that question is: To see database changes, you need to start a new transaction. You always see data as of the time the current transaction started. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] RFC: database ids
On Wed, Aug 14, 2013 at 6:49 PM, Vincent Pelletier plr.vinc...@gmail.comwrote: Le 15 août 2013 00:09, Jim Fulton j...@zope.com a écrit : Comments? Please make database ID reachable where _p_oid is reachable (maybe on _p_jar, I don't mind a few attribute lookup levels/trivial calls). Good idea. ob._p_jar.db().id Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
[ZODB-Dev] RFC: database ids
When using a database server (ZEO, relstorage), you can make a configuration error that causes you to connect to the wrong database. This can be especially painful in a situation where you get disconnected from the server and reconnect to an incorrect server and end up with objects from separate databases in the same cache. This happened to us (ZC) once when we fat-fingered a ZRS database fail-over. ZEO currently defends against this by refusing to connect to a server if the server's last transaction ID is less than the last transaction ID the client has seen. This has a couple of problems: - The test is too weak. - It makes fail-over to a slightly out of date secondary storage quite painful. I propose to add a database identifier that clients can verify. - To minimize impact to storage implementations, the database identifier will be stored under the ZODB_DATABASE_ID key of object 0 (root object). The key will be added on database open if it is absent. The value will be a configured value, or a UUID. - If a ZEO client is configured with a database identifier, then it will refuse to connect to a database without a matching identifier. - If a ZEO client is *not* configured with a database identifier, it will configure itself with the identifier of the first server it connects to, saving the information in the ZEO cache. This will at least protect against reconnect to the wrong server. - A ZEO client can *optionally* be configured to discard cache if it (re)connects to a server with a last transaction lower than the last one the client has seen as long as the database ID matches. - ZRS secondaries will also check database ids when (re)connecting to primaries. Comments? Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] Ackward PersistentList read Performance
On Tue, Aug 13, 2013 at 9:40 AM, Joerg Baach li...@baach.de wrote: Hi *, I was trying to measure the impact of using different kind of objects to store data in ZODB (disk, ram, time). Whats really ackward is the measurement for reading content from PersistentLists (that are stored in an IOBTree): case a == g.edges=IOBTree() for j in range(1,100): edge =PersistentList([j,1,2,{}]) g.edges[j] = edge x = list(g.edges.values()) y = [e[3] for e in x] #this takes 30 seconds case b == g.edges=IOBTree() for j in range(1,100): edge =[j,1,2,{}] g.edges[j] = edge x = list(g.edges.values()) y = [e[3] for e in x] #this takes 0.09 seconds So, can it really be that using a PersistentList is 300 times slower? Yes. This would be true of *any* persistent object. In the first case, You're creating 100+B database objects, were B is ~20. In the second case, you're creating B persistent objects. Depending on what you do between cases A and B, you may also have to load 100+B vs B objects. Am I doing something completely wrong, It depends on your application. Generally, one uses a BTree to avoid loading a large collection into memory. Iterating over the whole thing defeats that. Deciding whether to use a few large database objects or many small ones is a tradeoff between efficiency of access and efficiency of update, depending on access patterns. or am I missing something? Possibly I am using ZODB3-3.10.5. The whole setup (incl. results) is at https://github.com/jhb/zodbtime tl;dr Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] BTrees package problems
On Tue, Jul 23, 2013 at 9:12 PM, Christian Tismer tis...@stackless.com wrote: ... What I'm after is a way to over-ride the implementation by user code. I did not yet check it this is implemented already, in the Python way of sub-classing built-ins. BTrees (somewhet to mysurprise, do support subclassing, although you'll need to write custom __getstate__ and __setstate__ methods to handle both the BTree data and instance data. It would be better if the methods provided by the BTree classes handled this automatically (by checking for an instance dictionary or slots). Also, the BTree implementation isn't informed by user provided attributes. It would be better if it was. For example. I'd like bucket and internal node sizes to be controllable via instance attributes (typically defined in a subclass. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] BTrees package problems
On Mon, Jul 22, 2013 at 9:06 PM, Christian Tismer tis...@stackless.com wrote: ... Actually, I would like to add a callable-check instead, to allow for more flexible derivatives. I don't understand this. Simple: I am writing BTree forests for versioned, read-only databases. For that, I need a way to create a version of Bucket that allows to override the _next field by maybe a callable. Otherwise all the buckets are chained together and I have no way to let frozen BTrees share buckets. In retrospect, it might make more sense to do the chaining a level up. Buckets themselves don't care about chaining. The tree wants buckets to be chained to support iteration. I'm not really sure if that helps your use case. When I played with the structure, I was happy/astonished to see the _next field being writable and thought it was intended to be so. It was not, in the end ;-) It's clearly a bug. The code has a comment right above the attribute definition stating that it's (supposed to be) read only, but the implementation makes them writable. There doesn't seem to be anything that depends on writing this attribute. I verified this by adding a fix and running the tests (in 3.10). For what you're trying to do, I suspect you want to fork BTrees, or start over. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] BTrees package problems
On Sat, Jul 20, 2013 at 11:27 PM, Christian Tismer tis...@stackless.com wrote: The BTrees package is an attempt to isolate certain things from ZODB. While I appreciate the general intent, I cannot see the advantage at this point: - BTrees can be imported alone, yes. But it has the extensions prepared with special ZODB slots, which makes this very questionable. - BTrees furthermore claims the BTrees global bame for it, all though it is not a general BTree package, but for ZODB BTrees, only. Yeah, I worried about this when we broke it out. OTOH, there isn't much concern with namespace pollution in the Python community. :/ - BTrees has a serious bug, see the following example: from BTrees import OOBTree as BT t = BT.BTree() for num in range(100): ... k = str(num) ... t[k] = k ... t._firstbucket._next = None len(t) Bus error: 10 (tmp)minimax:doc tismer$ Ouch. So there is either an omission to make t._next() read-only, or a check of its validity is missing. Yup. OTOH, you're the first person to encounter this after many years, so while this is bad, and needs to be fixed, I'm not sure how serious it is as a practical matter. Actually, I would like to add a callable-check instead, to allow for more flexible derivatives. I don't understand this. * this was my second little rant about ZODB. Not finished as it seems. please, see this again as my kraut way of showing interest in improving very good things. :) Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] BTrees and ZODB simplicity
On Sat, Jul 20, 2013 at 11:43 PM, Christian Tismer tis...@stackless.com wrote: Third rant, dear Zope-Friends (and I mean it as friends!). In an attempt to make the ZODB a small, independant package, ZODB has been split into many modules. Maybe not as many as you think: persistent, transaction, ZEO, ZODB and BTrees. 5 shrug I appreciate that, while I think it partially has the opposite effect: - splitting BTrees apart is a good idea per se. But the way as it is, it adds more Namespace-pollution than benefits: To make sense of BTrees, you need the ZODB, and only the ZODB! So, why should then BTrees be a top-level module at all? This does not feel natural, but eavesdropping, pretending as something that is untrue. I think: - BTrees should either be a ZODB sub-package in its current state, - or a real stand-alone package with some way of adding persistence as an option. I don't agree that because a package depends on ZODB it should be in ZODB. There are lots of packages that depend on ZODB. I agree with your sentiments about namespace pollution. You and I may be the only ones that care though .3 ;). Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] make ZODB as small and compact as expected
On Sun, Jul 21, 2013 at 12:12 AM, Christian Tismer tis...@stackless.com wrote: This is my last emission for tonight. I would be using ZODB as a nice little package if it was one. There should be nothing else but ZODB.some_package Instead, there is BTrees persistent transaction zc.lockfile zc.zlibstorage ZConfig zdaemon ZEO ZODB ZODB3 (zlibstorage) zope.interface and what I might have forgotton. Exception: There is also zodbpickle which I think is very usefull and general-purpose, and I wan to keep it, also I will try to push it into standard CPython. So, while all the packages are not really large, there are too many namespaces touched, and things like Zope Enterprize Objects are not meant to be here as open source pretending modules which the user never asked for. Despite it's tech-bubblishish acronym expansion, which few people are aware of, ZEO is the standard client-server component of ZODB, is widely used, and is certainly open source. I think these things could be re-packed into a common namespace and be made simpler. If ZODB had been born much later, it would certainly have used a namespace package. Now, it would be fairly disruptive to change it. Even zope.interface could be removed from this intended-to-be user-friendly simple package. I don't understand what you're saying. It's a dependency if ZODB. So while the amount of code is astonishingly small, the amount of abstraction layering tells the reader that this was never really meant to be small. And this makes average, simple-minded users like me shy away and go back to simpler modules like Durus. But the latter has serious other pitfalls, which made me want to re-package ZODB into something small, pretty, tool-ish, versatile thing for the pocket. Actually I'm trying to re-map ZOPE to the simplistic Durus interface, without its short-comings and lack of support. I think a successfully down-scaled, isolated package with ZODB's great implementation, but a more user-oriented interface would help ZODB a lot to get widely accepted and incorporated into very many projects. Right now people are just too much concerned of implicit complication which actually does not exist. I volunteer to start such a project. Proposing the name david, as opposed to goliath. ZODB is an old project that has accumulated some cruft over the years, however: - I've tried to simplify it and, with the exception of ZEO, I think it's pretty straightforward. - ZODB is used by a lot of people with varying needs and tastes. The fact that it is pretty modular has allowed a lot of useful customizations. - I'm pretty happy with the layered storage architecture. - With modern package installation tools like buildout and pip, having lots of dependencies shouldn't be a problem. ZODB uses lots of packages that have uses outside of ZODB. I consider this a strength, not a weakness. Honestly, I have no interest in catering to users who don't use buildout, or pip, or easy_install. - The biggest thing ZODB needs right now is documentation. Unfortunately, this isn't easy. There is zodb.org, but much better documentation is needed. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] zc.zlibstorage missing from zodb package
On Sat, Jul 20, 2013 at 11:09 PM, Christian Tismer tis...@stackless.com wrote: Hi friends, I'm trying to work with ZODB. (!) Cool. Coming from durus development since a couple of weeks, I am spoiled by simplicity. Actually, I'm annoyed by durus' incapability to accept patches, so I'm considering to put my efforts into ZODB. On the other hand, ZODB tries to become small and non-intrusive, but looking at its imports, this is still not a small package, and I'm annoyed of this package as well. - missing the zc.zlibstorage module is missing, IMHO. I don't understand this statement. besides that, zc.zlibstorage was not maintained since quite a while and imports ZOPE3. It's still maintained, but hasn't required maintenance in some time. This is nonsensical. It depends on ZODB and zope.interface (and zope.testing and manuel for tests). ... - discussion zc.zlibstorage requites a wrapper to add it to filestorage. I consider this an option, instead, and a simple boolean flag to switch it on and off. The module is way too simple to add all this config extra complication to even think of it. The layered storage architecture made it very easy and low risk to add this capability. Further, some have suggested that we should use different compression schemes. Making this pluggable makes it more flexible. Having said that though, I agree that compression is something people almost always want, and I can understand your desire to make it simpler. - proposal: let me integrate that with ZODB and add a config option, instead of a wrapper. I don't know what you mean by integrate. I suggest, if you want to make it simpler is to provide new ZConfig tags or Python factories that make configuration simpler the way you'd like, but that do so by assembling layers under the hood. ... Meant in a friendly, collaborative sense -- Chris Much appreciated. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] make ZODB as small and compact as expected
On Mon, Jul 22, 2013 at 9:15 AM, Stephan Richter stephan.rich...@gmail.com wrote: On Sunday, July 21, 2013 06:12:34 AM Christian Tismer wrote: ... ZConfig In my opinion this is a relic from the times before configparser existed. IMO, ZConfig is very useful in some specific cases, especially ZODB and logging configuration. It is also used by other projects outside of ZODB. ZEO This is separate for historical reasons. I agree it could be merged into the ZODB project these days. It was separate a long time ago. It's been part of the ZODB distribution for a long time until recently. It makes sense for it to be optional, as it's of no interest to people who use relstorage. More importantly, it's more complicated than any other part of ZODB and it makes a lot of sense for ZODB development to be unburdened of it. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] BTrees and ZODB simplicity
On Mon, Jul 22, 2013 at 8:11 AM, Christian Tismer tis...@stackless.com wrote: On 22.07.13 13:13, Jim Fulton wrote: On Sat, Jul 20, 2013 at 11:43 PM, Christian Tismer tis...@stackless.com wrote: Third rant, dear Zope-Friends (and I mean it as friends!). In an attempt to make the ZODB a small, independant package, ZODB has been split into many modules. Maybe not as many as you think: persistent, transaction, ZEO, ZODB and BTrees. 5 shrug I appreciate that, while I think it partially has the opposite effect: - splitting BTrees apart is a good idea per se. But the way as it is, it adds more Namespace-pollution than benefits: To make sense of BTrees, you need the ZODB, and only the ZODB! So, why should then BTrees be a top-level module at all? This does not feel natural, but eavesdropping, pretending as something that is untrue. I think: - BTrees should either be a ZODB sub-package in its current state, - or a real stand-alone package with some way of adding persistence as an option. I don't agree that because a package depends on ZODB it should be in ZODB. There are lots of packages that depend on ZODB. This is generally true. In the case of BTrees, I think the ZODB is nothing without BTrees, and BTrees make no sense without a storage and carry those _p_attributes which are not optional. This is true of every class that subclasses Persistent. BTrees would make more sense as a standalone package if the persistence model were pluggable. But that is also theoretical because I don't see right now how to split that further with all the C code. Well, it's definitely possible. Early in the evolution of BTrees, there we ifdefs that turned off dependence on persistence. But even with the dependence on Persistent, their still perfectly usable without storing them in a database. Their use is just a lot more compelling in the presence of a database. ... I agree with your sentiments about namespace pollution. You and I may be the only ones that care though .3 ;). Yay, actually I care mainly because just trying 'pip install ZODB' spreads out n folders in my site-packages, and 'pip uninstall ZODB' leaves n-1 to pick the names by hand. That's why I want things nicely grouped ;-) shrug Maybe you should use virtualenv or buildout so as to leave your site-packages alone. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] Replication for the Zope Object Database
On Mon, May 27, 2013 at 8:42 PM, Carlos de la Guardia carlos.delaguar...@gmail.com wrote: Hey Jim, great news! pypi link is wrong, should be: https://pypi.python.org/pypi/zc.zrs Right you are. Thanks. Year for changes in pypi page is shown as 2015. Oops. Fixed on PyPI and and future releases. JIm -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
[ZODB-Dev] Replication for the Zope Object Database
I'm happy to announce that we've released ZRS, a replication framework for ZODB, as open source. ZRS provided primary/secondary replication for ZODB File Storages, typically as part of ZEO (ZODB client-server) servers. To learn more, see: http://www.zope.com/products/x1752814276/Zope-Replication-Services and: https://pypi.python.org/zc.zrs Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] RFC: Python2 - Py3k database compatibility
On Fri, May 10, 2013 at 5:04 PM, Tres Seaver tsea...@palladion.com wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 05/08/2013 12:34 PM, Tres Seaver wrote: On 04/29/2013 08:37 PM, Stephan Richter wrote: Well, that's the py3 branch. As Tres mentioned, zodbpickle is ready for Py3 with noload() support. I totally agree that we do not need to solve any of the transition work now. So for ZODB Py3 support we need to: 1. Merge the py3 branch into trunk. 2. Simplify zodbpickle to just contain the cPickle code that is Py3 compatible. I do not care whether this happens for ZODB 4.0 or 4.1 as long as I get some commitment that 4.1 Chris and I chatted with Jim about this over beers last Friday. I explained that the current 'py3; branch does not require the 'zodbpickle everywhere' stuff (the Python2 side doesn't use 'zodbpickle'). Jim then agreed that we could merge that branch before releasing 4.0. We will need to add some caveats to the docs / changelog (Python3 support is only for new applications, no forward- / backward-compatibility for data, etc.) Given that ZODB won't import or use 'zodbpickle' under Python2, I don't think we need to remove the current Python2 support (as released in 0.4.1): the Python3 version (with noload()) has been there all along. I have merged the 'py3' branch to 'master': - - All tests pass under all four platforms using buildout. - - All unit tests pass on all four platforms using 'setup.py test'. I added the following note to the changelog: ZODB 4.0.x is supported on Python 3.x for *new* applications only. Due to changes in the standard library's pickle support, the Python3 support does **not** provide forward- or backward-compatibility at the data level with Python2. A future version of ZODB may add such support. Applications which need migrate data from Python2 to Python3 should plan to script this migration using separte databases, e.g. via a dump-and-reload approach, or by providing explicit fix-ups of the pickled values as transactions are copied between storages. I pushed out a ZODB 4.0.0b1 release after the merge. If the buildbots stay green over the weekend, I think we can release a 4.0.0 final early next week. Great, thanks! Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] RFC: Python2 - Py3k database compatibility
On Sun, Apr 28, 2013 at 8:34 PM, Stephan Richter stephan.rich...@gmail.com wrote: On Sunday, April 28, 2013 07:23:12 PM Jim Fulton wrote: Can ZODB 4 be used now without zodbpickle? No, unfortunately for Py2 we need the custom cPickle and for Py3 `noload()` support (as Tres mentioned). This is a problem. The only change in ZODB 4.0 was supposed to be the breakup. This was supposed to be a low-risk release. The separation into multiple packages was supposed to increase agility, but now it appears we're stuck. I'd like there to a stable 4.0 release **soon** that doesn't use zodbpickle for Python 2. For now, I suggest we focus on stability and the ability to make progress on non-Python-3-related work. After that is achieved, I suggest we get to the point where people can create new databases and use them with Python 3. We need to do this without hindering the ability to make new stable releases. As far as the grander vision for Python2/3 transition and interoperability, we need to make progress incrementally and not sacrifice stability of the master branch. I made the 3.11 release fully expecting a stable 4.0 release soon. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] RFC: Python2 - Py3k database compatibility
On Mon, Apr 29, 2013 at 10:24 AM, Stephan Richter stephan.rich...@gmail.com wrote: On Monday, April 29, 2013 09:48:05 AM Jim Fulton wrote: I'd like there to a stable 4.0 release **soon** that doesn't use zodbpickle for Python 2. I would like to agree. But on the other hand, the ZODB release cycles are very long and the prospect of waiting another 6-12 months before any Python 3 support lands, is really scary because it prohibits me to even write a new project in Python 3. As stated here: https://mail.zope.org/pipermail/zodb-dev/2012-October/014770.html I was hoping that the breakup of the ZODB packages would allow us to increase the tempo of releases. But increasing tempo is only possible of master is stable. (CH has just invested about 6 man-months into the porting effort and without ZODB we are basically stuck. But we do not need a transition plan, since we can recreate our ZODBs from configuration files.) Could we compromise and support Python 3 in ZODB 4.0 without necessarily solve all the migration strategy issues? I suggested that in the part fo my email that you snipped. In fact, by using zodbpickle, zodbpickle can have a separate, faster release cycle experimenting with some transition strategies. Maybe one way to install ZODB 4.0 would be to not use zodbpickle and use cPickle instead. We already have all that stuff separated into a _compat module, so that should not be too hard. Right. As I suggested, let's get to a point where we can get a stable ZODB 4.0 release for Python 2. As soon as we get that, let's get a ZODB 4.0.x or 4.1 release that works on Python 3, presumably via zodbpickle. While we want to make progress on Python 3, we can't hold ZODB hostage to the Python 3 porting effort. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] RFC: Python2 - Py3k database compatibility
On Mon, Apr 29, 2013 at 10:20 AM, Tres Seaver tsea...@palladion.com wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 04/29/2013 09:48 AM, Jim Fulton wrote: On Sun, Apr 28, 2013 at 8:34 PM, Stephan Richter stephan.rich...@gmail.com wrote: On Sunday, April 28, 2013 07:23:12 PM Jim Fulton wrote: Can ZODB 4 be used now without zodbpickle? No, unfortunately for Py2 we need the custom cPickle and for Py3 `noload()` support (as Tres mentioned). This is a problem. The only change in ZODB 4.0 was supposed to be the breakup. This was supposed to be a low-risk release. The separation into multiple packages was supposed to increase agility, but now it appears we're stuck. The only reason we had delayed the 4.0 release (in my mind, anyway) was that it was a good way to signal the Py3k compatibliity changes. That was a bad idea. Unless you want to reinforce the fact that Python 3 is an agility killer. .5 ;) There's release meta data to signal Python 3 compatibility. I'm not wedded to calling the Py3k-compatible release 4.0. Cool. I'd like there to a stable 4.0 release **soon** that doesn't use zodbpickle for Python 2. For now, I suggest we focus on stability and the ability to make progress on non-Python-3-related work. After that is achieved, I suggest we get to the point where people can create new databases and use them with Python 3. We need to do this without hindering the ability to make new stable releases. The trunk of the 'ZODB' package does not have any of the Py3k / zodbpickle changes yet. We could make a ZODB 4.0b1 release from the trunk today +1. and create a '4.0' stable branch prior to any merge of the 'py3' work. Let's keep master stable. Maybe someone will want to add features before the Python 3 support is stable. I don't want to hold 4.1 hostage either. I suggest breaking the Python 3 work into increments that can each be introduced without sacrificing stability. The first increment could provide Python 3 support without any conversion or compatibility support. This is something you could probably achieve pretty quickly and would allow you meet your immediate goals. As far as the grander vision for Python2/3 transition and interoperability, we need to make progress incrementally and not sacrifice stability of the master branch. I made the 3.11 release fully expecting a stable 4.0 release soon. That was of the 'ZODB3' meta-package, right? Yes. It was predicated on a stable 4.0 release that had very little in it beyond the split into separate packages. It was intended to help people start transitioning from ZODB3 to ZODB, but that can't happen until ZODB is stable. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] RFC: Python2 - Py3k database compatibility
On Mon, Apr 29, 2013 at 10:54 AM, Tres Seaver tsea...@palladion.com wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 04/29/2013 10:51 AM, Jim Fulton wrote: Right. As I suggested, let's get to a point where we can get a stable ZODB 4.0 release for Python 2. As soon as we get that, let's get a ZODB 4.0.x or 4.1 release that works on Python 3, presumably via zodbpickle. As I proposed earlier this morning, I can only reply to one email at a time. :) we can make a non-Py3k, non-zodbpickle 4.0b1 release today from the master branch, and a 4.0 final in a week. Once we get that release out, we can then merge the 'py3' branch, including adding the requirement for 'zodbpickle' under both Python2 and Py3k, and aim for a much expedited 4.1 release which supports Py3k. I'd rather keep this would on a branch until it's known to be stable. I suggest instead focusing on getting new Python 3 applications working without affecting Python 2 apps. IOW, only use zodbpickle for Python 3 *initially*. I want to be able to release from master at almost any time. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] RFC: Python2 - Py3k database compatibility
On Mon, Apr 29, 2013 at 12:25 PM, Tres Seaver tsea...@palladion.com wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 04/29/2013 11:00 AM, Jim Fulton wrote: Let's keep master stable. Maybe someone will want to add features before the Python 3 support is stable. I don't want to hold 4.1 hostage either. Given that the only folks (besides maybe you) invested in ZODB development want Py3k support ASAP. I don't see that. Do you have features in mind that you would imagine releasing before we land Py3k support? Yes. There are lots of features I'd like to add to ZODB. I tend to work on them when I have time (infrequently) or where we have a driving need at ZC. Long ZODB release cycles provide a lot of stop energy. The only way to get away from long release cycles is to have a stable master that's releasable at any time. OTOH, ZODB is pretty critical software, so we have to be very confident in what we merge to master. I suggest breaking the Python 3 work into increments that can each be introduced without sacrificing stability. The first increment could provide Python 3 support without any conversion or compatibility support. This is something you could probably achieve pretty quickly and would allow you meet your immediate goals. We are already there, AFAIK, on the 'py3' branch: the blocker is just getting out a release. All I ask is a stable releasable master. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] RFC: Python2 - Py3k database compatibility
On Mon, Apr 29, 2013 at 1:15 PM, Tres Seaver tsea...@palladion.com wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 04/29/2013 12:44 PM, Jim Fulton wrote: Yes. There are lots of features I'd like to add to ZODB. I tend to work on them when I have time (infrequently) or where we have a driving need at ZC. Long ZODB release cycles provide a lot of stop energy. We are already developing this way (the 'py3' branch has not been merged). However, if you do a lot of Python2-only feature work and merge to master, you will likely push back to horizon for merging that branch: we will have to port any work done to it. I'd be happy to say that anything pushed to master has to pass tests on Python 3. I have no interest in delaying Python 3 work. Using the 4.0 label to signal big changes ahead, evaluate carefully before upgrading was the primary reason I had been pushing to get the Py3k stuff landed Apparently. :) But, IIRC, you never discussed this with me. When I announced 4.0, the big change was splitting off ZEO, persistent, and later BTrees. In fact, as you may remember, I suggested splitting BTrees off in 5, because I didn't want to to delay 4. (the low-risk thing would have been more naturally labeled 3.11). Except I explicitly said that 4.0 was supposed to be a low risk release. That's why 3.11 was just a meta-release to aid people in the transition to 4. When I saw all your activity on porting to Python 3, I stepped back to give you room. But now, several months have gone by and we're more or less where we were in November wrt 4.0. I greatly appreciate and support you guys have done on Python 3 porting. I don't mean to criticize the work you've done. If anyone deserves criticism, it's me for not staying on top of this. We need to get to a point where we can release frequently, with confidence. That doesn't mean we will; it depends on people's time to contribute. But we need to be able and we need to plan our activities so we can release frequently. Whether 4.0 supports Python 3 or not, let's quickly get to the point where tests are run and pass on both Python 2 and 3. Once we get to that point, we won't accept pull requests that break Python 3 (or 2, of course). But let's get to the point soon where we can make Python 2 releases with confidence. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] RFC: Python2 - Py3k database compatibility
storages (replace_py2_cpickle_). See: https://github.com/zopefoundation/zodbpickle/tree/py2_explicit_bytes - - ``zodbpickle`` should provide a pickler/unpickler for use by Py3k clients who operate against unconverted storages (replace_py3k_pickle_). See: https://github.com/zopefoundation/zodbpickle - - ``zodbpickle`` might need to provide a wrapper storage supporting straddle_no_convert_. Comments? Thanks for taking the time to work all of this out. It sounds rather complex. :) Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] RFC: Python2 - Py3k database compatibility
On Wed, Apr 17, 2013 at 2:54 PM, Tres Seaver tsea...@palladion.com wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 04/16/2013 05:13 PM, Stephan Richter wrote: On Tuesday, April 16, 2013 04:38:06 PM Tres Seaver wrote: Comments? (I don't now why Stephan's e-mail didn't make it to the list). The big omission that I noticed while reading the text carefully is a note saying that you will never be able to use stock Py3k pickle, because it does not support noload(). Thus ``zodbpickle`` is needed for any Py3k code. (I think this is a correction to you your last bullet in _replace_py2_cPickle.) Hmm, I think you are correct. That reminds me, originally we forked pickle.py from Python 3.3. During PyCon I think you decided to start by using cPickle from Python 2.7 instead. If you are starting from Py2.7 cPickle, then supporting Protocol 3 is not easy. Already done (as you note in your follow-up). Given your writeup, I think you are implicitly saying to start from Py3.3 pickle and add the special support for Python 2 binary via the special new type. That sounds good to me. I would actually prefer to fork the Python 3.2 version: the one from 3.3 pulls in a bunch of grotty internal-only usage. I'm confused. I don't understand why we need a Python 3 pickler change to support the new Python 2 binary type. I thought we were going to pickle Python 2 binary objects using the standard Python 3 protocol 3 code? BTW, what are your motivations for all the different strategies? I wanted to document them all, because some of the strategies suit different cases better than others. _ignore_compat is obvious. If you can easily create the ZODB from other data sources, then you can do a one-time switch. In fact, at CipherHealth we have this case, since the ZODB only contains config (which is loaded from text files) and session data. Yup. Even for large CMS systems, I would still make dump-to-filesystem, then reload a requirement. Others disagree, of course (and may have legitimate reasons). Leo Rochael Almeida has clients with databases too big to convert, for instance (the downtime required to do the conversion would be prohibitive, I believe). But which strategy would be useful for a large Plone site for example? I think we should focus on that and provide one good way to do it. Plone has historically preferred in-place migration to dump-reload. Maybe jumping the Py3k curb is enough reason for them to reconsider. I'm hoping to be able to provide some help with in-place conversion in the near future. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] RFC: Python2 - Py3k database compatibility
On Fri, Apr 26, 2013 at 8:02 PM, Stephan Richter stephan.rich...@gmail.com wrote: On Friday, April 26, 2013 05:34:15 PM Tres Seaver wrote: I would like to merge this branch to master early next week and make a release, so that we can evaluate merging the 'py3' branch of ZODB. Thoughts? Note that I have not yet addressed the portions of my proposal which deal with analyzing / converting existing databases, or with the possibly-needed wrapper storage (for on-the-fly conversion). Let's do it. This way people can test ZODB 4 on their existing Py2 code bases and we at CH can test our uibuilder/demo on Python 3. This will give us at least some confidence that we are going in the right direction. We might consider not even tackling DB conversions for ZODB 4.0 and delay that to ZODB 4.1 or leave it even up to an add-on package. This way people can experiment with different approaches and we do not have to nail the conversion problem with ZODB 4.0. I've lost track of where we are with ZODB 4. Can ZODB 4 be used now without zodbpickle? Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] Towards ZODB on Python 3
=True) [7] [7] this is the status quo of the 'py3' branch in the ZODB repo [8] OTOH we could implement special support for REDUCE of codecs.decode() in our noload -- I almost got that working before Jim suggested a different approach, which is [6]. At least there's some nice symmetry: no matter if you pickle your stuff on Python 2 or Python 3, you get to deal with bytes becoming unicode when you unpickle. These kinds of guessing games are inevitable when you're migrating pickles from Python 2 to Python 3, but do we want to make them mandatory for day-to-day operation? Perhaps we ought to drop our original goal (3) and require an explicit one-time possibly-lossy conversion process for goal (2), then use pickle protocol 3 on Python 3 and have short pickles, perfect roundtripping of bytestrings? Then there's ZEO, which uses pickles for both payloads _and_ for marshalling in its RPC layer. That's also fun, but I think we can at least declare that ZEO server and client must be on the same Python version, perhaps by bumping the protocol version. So, this is where things stand right now. Plus a few relatively minor matters like adding missing noload() tests to zodbpickle and making zodbpickle work on Python 3.2 [9] [9] https://mail.zope.org/pipermail/checkins/2013-March/065813.html Other than that, the ZODB py3 branch works on Python 3.3 [10]. As long as you're prepared to deal with bytestrings magically transforming into unicodes. [10] Stephan reported running an actual small demo application with it. Where do we go from here? Is this an issue for anything but names (object attributes and global names)? I don't think there's a native strings issue. There *does* seem to be an name issue. In Python 2 and Python 3, (non-buggy) unicode aware applications use bytes and unicode the same way, unicode for text, bytes for data. AFAICT, Python 3 has (admirably) changed the way names are implemented to use unicode, rather than ASCII. Am I missing something? This is a somewhat thorny, but still fairly restricted problem. I would hazard to guess that 99.923% of persistent classes pickle their state using their instance dictionaries. 99.9968% for regular Python classes. We know when we're pickling and unpickling instances and we can apply transformations necessary for the target platforms. I think the fix is pretty straightforward. In the default __setstate__ provided by Persistent, and when loading non-persistent instances: - On Python 2, ASCII encode unicode attribute names. - On Python 3, ASCII decode byte attribute names. The same transformation is necessary when looking up global names. This will cover the vast majority of cases where the default __setstate__ is used. In rare cases where a custom setstate is used, or when Python 3 non-ASCII attribute names are used, then databases may not be sharable accross Python versions. There is also likely to be breakage in dictionaries or BTrees where applications are sloppy about mixing Unicode and byte keys. I don't think we should try to compensate for this. These applications need to be fixed. One could write a database analysis script to detect this kind of breakage (looking for mixed string and unicide keys). Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton Jerky is better than bacon! http://zo.pe/Kqm ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] Towards ZODB on Python 3
On Sun, Mar 10, 2013 at 11:25 AM, Tres Seaver tsea...@palladion.com wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 03/10/2013 09:19 AM, Jim Fulton wrote: ... I think the fix is pretty straightforward. In the default __setstate__ provided by Persistent, and when loading non-persistent instances: - On Python 2, ASCII encode unicode attribute names. - On Python 3, ASCII decode byte attribute names. The same transformation is necessary when looking up global names. Hmm, if zodbpickle has to handle the issue for non-persistent instances and global names, wouldn't it be simpler to make it handle persistent instances too? No. It can't know when a key is going to be used for a persistent attribute name. It can examine the stack inside 'load_dict' to figure out that the context is an instance, right? Ugh. What stack? It would be much simpler to handle this in __setstate__ (or the equivalent). This isn't exactly a lot of code. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton Jerky is better than bacon! http://zo.pe/Kqm ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] Towards ZODB on Python 3
On Sun, Mar 10, 2013 at 12:13 PM, Tres Seaver tsea...@palladion.com wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 03/10/2013 11:55 AM, Jim Fulton wrote: On Sun, Mar 10, 2013 at 11:25 AM, Tres Seaver tsea...@palladion.com wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 03/10/2013 09:19 AM, Jim Fulton wrote: ... I think the fix is pretty straightforward. In the default __setstate__ provided by Persistent, and when loading non-persistent instances: - On Python 2, ASCII encode unicode attribute names. - On Python 3, ASCII decode byte attribute names. The same transformation is necessary when looking up global names. Hmm, if zodbpickle has to handle the issue for non-persistent instances and global names, wouldn't it be simpler to make it handle persistent instances too? No. It can't know when a key is going to be used for a persistent attribute name. It can examine the stack inside 'load_dict' to figure out that the context is an instance, right? Ugh. What stack? The one where the unpickler keeps its work-in-progress? static int load_none(UnpicklerObject *self) { PDATA_APPEND(self-stack, Py_None, -1); return 0; } static int load_dict(UnpicklerObject *self) { PyObject *dict, *key, *value; Py_ssize_t i, j, k; if ((i = marker(self)) 0) return -1; j = Py_SIZE(self-stack); if ((dict = PyDict_New()) == NULL) return -1; for (k = i + 1; k j; k += 2) { key = self-stack-data[k - 1]; value = self-stack-data[k]; if (PyDict_SetItem(dict, key, value) 0) { Py_DECREF(dict); return -1; } } Pdata_clear(self-stack, i); PDATA_PUSH(self-stack, dict, -1); return 0; } That won't work for persistent objects. Persistent state is set by the deserializer, not by the unpickler. The deserializer calls the unpickler to load the state. It then calls __setstate__ on the persistent object to set the state. The serializer doesn't know how to interpret the state, only __setstate__ does. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton Jerky is better than bacon! http://zo.pe/Kqm ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] transaction: synchronizer newTransaction() behavior
On Fri, Mar 8, 2013 at 9:54 PM, Siddhartha Kasivajhula countvajh...@gmail.com wrote: Hi there, I've been discussing this issue with Laurence Rowe on the pylons-dev mailing list, and he suggested bringing it up here. I'm writing a MongoDB data manager for the python transaction package: https://github.com/countvajhula/mongomorphism I noticed that for a synchronizer, the beforeCompletion() and afterCompletion() methods are always called once the synch has been registered, but the newTransaction() method is only called when an explicit call to transaction.begin() is made. Since it's possible for transactions to be started without this explicit call, I was wondering if there was a good reason why these two cases (explicitly vs implicitly begun transactions) would be treated differently. Nope. This is a bug. That is, should the following two cases not be equivalent, and therefore should the newTransaction() method be called in both cases: (1) t = transaction.get() t.join(my_dm) ..some changes to the data.. transaction.commit() and: (2) transaction.begin() t = transaction.get() t.join(my_dm) ..some changes to the data.. transaction.commit() Only if (1) was was preceded by an ``abort``. The definition of ``get`` is to get the current transaction, creating one, if necessary. Really, ``begin` and ``abort`` are equivalent. It might be better if there wasn't a ``begin`` method, as it's missleading. One should be an alias for the other. I'd be for deprecating ``begin``. The call to ``_new_transaction`` should be moved to the point in ``get`` where a new transaction is created, and ``begin`` should be made an alias for ``abort``. In my mongo dm implementation, I am using the synchronizer to do some initialization before each transaction gets underway, and am currently requiring explicit calls to transaction.begin() at the start of each transaction. Unfortunately, it appears that other third party libraries using the transaction library may not be calling begin() explicitly, and in particular my data manager doesn't work when used with pyramid_tm. Another thing I noticed was that a synchronizer cannot be registered like so: transaction.manager.registerSynch(MySynch()) .. and can only be registered like this: synch = MySynch() transaction.manager.registerSynch(synch) ... which I'm told is due to MySynch() being stored in a WeakSet which means it gets garbage collected. Currently this means that I'm retaining a reference to the synch as a global that I never use. Just seems a bit contrived so thought I'd mention that as well, in case there's anything that can be done about that. This is to prevent memory leaks. Normally, the synchronizier is associated with a database. For example, the synchronizers are database connection methods. A stand-alone synchronizer seems weird to me. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton Jerky is better than bacon! http://zo.pe/Kqm ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] Cache warm up time
On Sat, Mar 9, 2013 at 5:50 AM, Vincent Pelletier plr.vinc...@gmail.com wrote: Le Friday 08 March 2013 18:50:09, Laurence Rowe a écrit : It would be great if there was a way to advise ZODB in advance that certain objects would be required so it could fetch multiple object states in a single request to the storage server. +1 I can see this used to process a large tree, objects being be processed as they are loaded (loadds being pipelined). Pseudo-code interface suggestion: class IPipelinedStorage: def loadMany(oid_list, callback, tid=None, before_tid=None): callback being along the lines of: def callback(oid, data_record, tid, next_tid): if stop_condition: raise ... (StopIteration ? just anything ?) return more_oids_to_queue_for_loading tid and before_tid (mutualy exclusive) specify the snapshot to use, to implement equivalent of loadSerial and loadBefore. class IPipelinedConnection: def walk(ob, callback): callback being along the lines of: def callback(just_loaded_object, referee_list): # do womething on just_loaded_object return filtered_referee_list referee_list would expose at least referee's class (name ?), and hold their oid for Connection.walk internal use (only ?). Or maybe just ghosts, but callback would have to take care of not unghostifying them - it would void the purpose of pipelining loads. Above ZODB (persistent containers with internal persistent objects, like BTree): Implement an iterator over subobjects ignoring intermediate internal structure (think BTree.*Bucket classes). Specific iteration order could probably be specified to be able to implement iterkeys and such in BTree for example, but storage may have to implement load reordering when they happen in parallel (like NEO, and as could probably be implemented for zeoraid and relStorage configured with multiple mirrored databases), limiting latency/processing parallelism and possibly leading to memory footprint explosion. So I think it should be possible to also request no special loading order to get lowest latency backend can provide and somewhat constant memory footprint. Any thought/comment ? I think this is more complicated than necessary. I think a simple method on a storage that gives a hint that a set of object ids will be loaded is enough. A network storage could then issue a pipelined request for those oids. The application can then proceed as usual. I think I've proposed such an API before, but am too lazy to look it up. Something like: load_hint(*oids) I'd like to see this functionality, but I don't have time to do it soon. I must say that I think this API is more likely to be abused than used effectively. Prefetching catalog indexes is a sort of anti-pattern than only makes sense for small catalogs. It would likely make more sense to have a dedicated catalog server that returned oids and possibly object records in response to queries (or whimper, use solr ). Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton Jerky is better than bacon! http://zo.pe/Kqm ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] Cache warm up time
On Sat, Mar 9, 2013 at 9:02 AM, Jim Fulton j...@zope.com wrote: ... I think a simple method on a storage that gives a hint that a set of object ids will be loaded is enough. A network storage could then issue a pipelined request for those oids. The application can then proceed as usual. I think I've proposed such an API before, but am too lazy to look it up. Something like: load_hint(*oids) I'd like to see this functionality, but I don't have time to do it soon. I must say that I think this API is more likely to be abused than used effectively. Prefetching catalog indexes is a sort of anti-pattern than only makes sense for small catalogs. It would likely make more sense to have a dedicated catalog server that returned oids and possibly object records in response to queries (or whimper, use solr ). I forgot to mention an even simpler way to reduce the number of round-trips for cataloging data structures is to increase the bucket and internal node sizes. You can do this now by patching the BTree header files. We've done this in the past to reduce the likelihood of database conflicts when buckets split. I suspect the default sizes are too low. I'd really like BTrees to be subclassible and for sizes to be read from instance data, falling back to class settings. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton Jerky is better than bacon! http://zo.pe/Kqm ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] zodb conversion questions
On Thu, Feb 7, 2013 at 10:48 AM, Jürgen Herrmann juergen.herrm...@xlhost.de wrote: Am 06.02.2013 15:05, schrieb Jürgen Herrmann: Hi there! I hav a relstorage with mysql backend that grew out of bounds and we're looking into different backend solutions now. Possibly also going back to FileStorage and using zeo... Anyway we'll have to convert the databases at some point. As these are live DBs we cannot shut them down for longer than the ususal maintenance interval during the night, so for maybe 2-3h. a full conversion process will never complete in this time so we're looking for a process that can split the conversion into two phases: 1. copy transactions from backup of the source db to the destination db. this can take a long time, we don't care. note the last timestamp/transaction_id converted. 2. shut down the source db 3. copy transactions from the source db to the destination db, starting at the last converted transaction_id. this should be fast, as only a few transactions need to be converted, say 1% . if i would reimplement copyTransactionsFrom() to accept a start transaction_id/timestamp, would this result in dest being an exact copy of source? source = open_my_source_storage() dest = open_my_destination_storage() dest.copyTransactionsFrom(source) last_txn_id = source.lastTransaction() source.close() dest.close() source = open_my_source_storage() # add some transactions source.close() source = open_my_source_storage() dest = open_my_destination_storage() dest.copyTransactionsFrom(source, last_txn_id=last_txn_id) source.close() dest.close() I will reply to myself here :) This actually works, tested with a modified version of FileStorage for now. I modified the signature of copyTransactionsFrom to look like this: def copyTransactionsFrom(self, source, verbose=0, not_before_tid=None): ``start`` would be better to be consistent with the iterator API. not_before_tid is a packed tid or None, None meaning copy all (the default, so no existing API usage would break). Is there public interest in modifying this API permamently? +.1 This API is a bit of an attractive nuisance. I'd rather people learn how to use iterators in their own scripts, as they are very useful and powerful. This API just hides that. The second part, replaying old transactions is a bit more subtle, but it's still worth it for people to be aware of it. If I were doing this today, I'd make this documentation rather than API. But then, documentation ... whimper. Anybody want to look at the actual code changes? Sure, if they have tests. Unfortunately, we can only accept pull requests from zope contributors. Are you one? Wanna be one? :) Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton Jerky is better than bacon! http://zo.pe/Kqm ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
[ZODB-Dev] #zodb on freenode
I hang there, fwiw :) Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton Jerky is better than bacon! http://zo.pe/Kqm ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] Moving ZEO, BTrees, persistent and transaction to github
On Wed, Jan 2, 2013 at 9:37 AM, Jim Fulton j...@zope.com wrote: On Fri, Dec 28, 2012 at 4:44 PM, Jim Fulton j...@zope.com wrote: I plan to do ZEO shortly. Well, that didn't go well. svn git fetch spent several days and didn't finish. It seemed to be really thrashing trying to follow tags. I'm probably going to have to convert just the trunk. Done. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton Jerky is better than bacon! http://zo.pe/Kqm ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] what's the latest on zodb/zeo+memcached?
On Fri, Jan 18, 2013 at 2:38 PM, Claudiu Saftoiu csaft...@gmail.com wrote: I wonder if disk latency is the problem?. As a test you could put the index.fs file into a tmpfs and see if that improves things, or cat index.fs /dev/null to try and force it into the fs cache. Hmm, it would seem not... the cat happens instantly: (env)tsa@sp2772c:~/sports$ time cat Data_IndexDB.fs /dev/null real0m0.065s user0m0.000s sys 0m0.064s The database isn't even very big: rw-r--r-- 1 tsa tsa 233M Jan 18 14:34 Data_IndexDB.fs Which makes me wonder why it takes so long to load it into memory it's just a bit frustrating that the server has 7gb of RAM and it's proving to be so difficult to get ZODB to keep ~300 megs of it up in there. Or, indeed, if linux already has the whole .fs file in a memory cache, where are these delays coming from? There's something I don't quite understand about this whole situation... Some high-level comments: - ZODB doesn't simply load your database into memory. It loads objects when you try to access their state. If you're using ZEO (or relstorage, or neo), each load requires a round-trip to the server. That's typically a millisecond or two, depending on your network setup. (Your database is small, so disk access shouldn't be an issue as it is, presumably in your disk cache. - You say it often takes you a couple of minutes to handle requests. This is obviously very long. It sounds like there's an issue with the way you're using the catalog. It's not that hard get this wrong. I suggest either hiring someone with experience in this area to help you or consider using another tool, like solr. (You could put more details of your application here, but I doubt people will be willing to put in the time to really analyze it and tell you how to fix it. I know I can't.) - solr is so fast it almost makes me want to cry. At ZC, we're increasingly using solr instead of the catalog. As the original author of the catalog, this makes me sad, but we just don't have the time to put in the effort to equal solr/lucene. - A common mistake when using ZODB is to use it like a relational database, puting most data in catalog-like data structures and querying to get most of your data. The strength of a OODB is that you don't have to query to get data from a well-designed object model. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton Jerky is better than bacon! http://zo.pe/Kqm ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] what's the latest on zodb/zeo+memcached?
On Thu, Jan 17, 2013 at 12:31 PM, Claudiu Saftoiu csaft...@gmail.com wrote: ... One potential thing is this: after a zeopack the index database .fs file is about 400 megabytes, so I figure a cache of 3000 megabytes should more than cover it. Before a zeopack, though - I do one every 3 hours - the file grows to 7.6 gigabytes. In scanning over this thread while writing my last message, I noticed this. This is a ridiculous amount of churn. There is likely something seriously out of whack with your application. Every application is different, but we typically see *weekly* packs reduce database size by at most 50%. Shouldn't the relevant objects - the entire set of latest versions of the objects - be the ones in the cache, thus it doesn't matter that the .fs file is 7.6gb as the actual used bits of it are only 400mb or so? Every object update invalidates cached versions of the obejct in all caches except the writer's. (Even the writer's cached value is invalidated of conflict-resolution was performed.) Another question is, does zeopacking destroy the cache? No, but lots of writing does. If so then that would make sense. I'll have to preload upon every zeopack. If it's not that, then I'm not sure what it could be. I think you have some basic application design problem(s). Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton Jerky is better than bacon! http://zo.pe/Kqm ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] what's the latest on zodb/zeo+memcached?
On Sat, Jan 19, 2013 at 10:06 AM, Leonardo Santagada santag...@gmail.com wrote: On Sat, Jan 19, 2013 at 12:51 PM, Jim Fulton j...@zope.com wrote: - solr is so fast it almost makes me want to cry. At ZC, we're increasingly using solr instead of the catalog. As the original author of the catalog, this makes me sad, but we just don't have the time to put in the effort to equal solr/lucene. We are using it on some projects also... But deploying java is as complicated as python to deploy so it increases 2x the deployment work needed for a project. Yup, integration is a downside, which is why we still use catalogs or related indexing structures too. Do you think this could be a good idea, to have an integrated solution of ZODB for object persistence and integrated indexing using Whoosh? I have no idea. I'm not familiar with whoosh. Probably what would be needed is a project to wrap them together, that does indexing during the commit and exposes an indexing api, like marking fields in persistent objects for indexing and having methods for searching the index. shrug I know people have come up with tools to index ZODB objects with lucene in the past. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton Jerky is better than bacon! http://zo.pe/Kqm ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] what's the latest on zodb/zeo+memcached?
On Fri, Jan 18, 2013 at 11:55 AM, Claudiu Saftoiu csaft...@gmail.com wrote: If you want to load the btree item into cache, you need to do item._p_activate() That's not going to work, since `item` is a tuple. I don't want to load the item itself into the cache, I just want the btree to be in the cache. Er, to be clearer: my goal is for the preload to load everything into the cache that the query mechanism might use. It seems the bucket approach only takes ~10 seconds on the 350k-sized index trees vs. ~60-90 seconds. This seems to indicate that less things end up being pre-loaded... I guess I was too subtle before. Preloading is a waste of time. Just use a persistent ZEO cache of adequate size and be done with it. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton Jerky is better than bacon! http://zo.pe/Kqm ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] API question
On Mon, Jan 14, 2013 at 1:32 PM, Tres Seaver tsea...@palladion.com wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 While working on preparation for a Py3k port, I've stumbled across a fundamental issue with how ZODB structures its API. Do we intend that client code do the following:: from ZDOB import DB, FileStorage db = DB(FileStorage('/path/to/Data.fs')) As Marius points out, this doesn't work. or use the module as a facade :: import ZODB db = ZODB.DB(ZODB.FileStorage.FileStorage('/path/to/Data.fs')) This doesn't work either. You haven't imported FileStorage. WRT ZODB.DB, ZODB.DB is an age-old convenience. It's unfortunate that ZODB.DB (the class) shadows the module, ZODB.DB, just like the class ZODB.FileStorage.FileStorage shadows the modules ZODB.FileStorage.FileStorage.FileStorage. (Of course, it's also unfortunate that there's a ZODB.FileStorage.FileStorage.FileStorage module. :) If we had a do-over, we'd use ZODB.db.DB and ZODB.filestorage.FileStorage, and ZODB.DB would be a convenience for ZODB.db.DB. I would actually prefer that clients explicitly import the intermediate modules:: from ZDOB import DB, FileStorage db = DB.DB(FileStorage.FileStorage('/path/to/Data.fs')) So you don't mind shadowing FileStorage.FileStorage.FileStorage. ;) or even better:: from ZDOB.DB import DB # This one can even be ambiguous now FTR, I don't like this style. Somewhat a matter of taste. from ZODB.FileStorage import FileStorage db = DB(FileStorage('/path/to/Data.fs')) The driver for the question is getting the tests to pass under both 'nosetests' and 'setup.py test', where the order of module imports etc. can make the ambiguous cases problematic. It would be a good time to do whatever BBB stuff we need to (I would guess figuring out how to emit deprecation warnings for whichever variants) before releasing 4.0.0. I'm pretty happy with the Zope test runner and I don't think using nosetests is a good reason to cause backward-incompatibility. The zope test runner works just fine with Python 3. Why do you feel compelled to introduce nose? I'm sort of in favor of moving to nose to follow the crowd, although otherwise, nose is far too implicit for my taste. It doesn't hande doctest well at all. Having said that, if I was going to do something like this, I'd rename the modules, ZODB.DB and ZODB.FileStorage to ZODB.db and ZODB.filestorage and add module aliases for backward compatibility. I don't know if that would be enough to satisfy nose. I'm not up for doing any of this for 4.0. I'm not alergic to a 5.0 in the not too distant future. I'm guessing that a switch to nose would also make you rewrite all of the doctests as unittests. As the prrimary maintainer of ZODB, I'm -0.8 on that. Back to APIs, I think 90% of users don't import the APIs but set up ZODB via ZConfig (or probably should, if they don't). For Python use, I think the ZODB.DB class short-cut us useful. Over the last few years, ZODB has grown some additional shortcuts that I think are also useful. Among them: ZODB.DB(filename) - DB with a file storage ZODB.DB(None) - DB with a mapping storage ZODB.connection(filename) - connection to DB with file storage ZODB.connection(None) - connection to DB with mapping storage More importantly: ZEO.client us a shortcut for ZEO.ClientStorage.ClientStorage ZEO.DB(addr or port) - DB with a ZEO client ZEO.connection(addr or port) - connection to DB with a ZEO client Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton Jerky is better than bacon! http://zo.pe/Kqm ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] API question
On Mon, Jan 14, 2013 at 7:20 PM, Tres Seaver tsea...@palladion.com wrote: ... I'm tempted to rename the 'DB.py' module 'db.py', and jam in a BBB entry in sys.modules for 'ZODB.DB'; likewise, I am tempted to rename the 'FileStorage.py' package 'filestorage', its same-named module '_filestorage.py', and jam in BBB entries for the old names. +.9 if done without backward-incompatiblke breakage. This would be a 4.1 thing. +1 if you used zodb.filestorage.filestorage rather than zodb.filestorage._filestorage. Those renames would make the preferred API: from ZODB import DB # convenience alias for the class from ZODB import db # the moodule from ZODB.db import DB # my preferred speling from ZDOB.filestorage imoprt FileStorage # conv. alias for class from ZODB import filestorage # the package from ZODB.filestorage import FileStorage # my preferred speling This is the same as one earlier. I suspect you meant: from ZODB.filestorage._filestorage import FileStorage but couldn't type the underware. I don't think the packagification of the FileStorage module was a win, but it's too hard to fix it now. Some day, I'd like to work on a filestorage2, but fear I won't ever find the time. :( from ZODB.filestorage import _filestorage # if needed We shouldn't design an API where we expected people to grab underware. Aside from not liking from imports and the _filestorage nit, +1 For extra bonus fun, we could rename 'ZODB' to 'zodb' :) In that case, we might switch to a namespace package, oodb, which I've already reserved: http://pypi.python.org/pypi/oodb But I doubt we're up for this much disruption. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton Jerky is better than bacon! http://zo.pe/Kqm ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] what's the latest on zodb/zeo+memcached?
On Tue, Jan 15, 2013 at 2:08 PM, Claudiu Saftoiu csaft...@gmail.com wrote: On Tue, Jan 15, 2013 at 2:07 PM, Leonardo Santagada santag...@gmail.com wrote: On Tue, Jan 15, 2013 at 3:10 PM, Jim Fulton j...@zope.com wrote: On Tue, Jan 15, 2013 at 12:00 PM, Claudiu Saftoiu csaft...@gmail.com wrote: Hello all, I'm looking to speed up my server and it seems memcached would be a good way to do it - at least for the `Catalog` (I've already put the catalog in a separate zodb with a separate zeoserver with persistent client caching enabled and it still doesn't run as nice as I like...) I've googled around a bit and found nothing definitive, though... what's the best way to combine zodb/zeo + memcached as of now? My opinion is that a distributed memcached isn't a big enough win, but this likely depends on your use cases. We (ZC) took a different approach. If there is a reasonable way to classify your corpus by URL (or other request parameter), then check out zc.resumelb. This fit our use cases well. Maybe I don't understand zodb correctly but if the catalog is small enough to fit in memory wouldn't it be much faster to just cache the whole catalog on the clients? Then at least for catalog searches it is all mostly as fast as running through python objects. Memcache will put an extra serialize/deserialize step into it (plus network io, plus context switches). That would be fine, actually. Is there a way to explicitly tell ZODB/ZEO to load an entire object and keep it in the cache? I also want it to remain in the cache on connection restart, but I think I've already accomplished that with persistent client-side caching. You can't cause a specific object (or collection of objects) to stay ion the cache, but if you're working set is small enough to fit in the memory or client cache, you can get the same effect. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton Jerky is better than bacon! http://zo.pe/Kqm ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] what's the latest on zodb/zeo+memcached?
So, first, a concise partial answer to a previous question: ZODB provides an in-memory object cache. This is non-persistent. If you restart, it is lost. There is a cache per connection and the cache size is limited by both object count and total object size (as estimated by database record size). ZEO also provides a disk-based cache of database records read from the server. This is normally much larger than the in-memory cache. It can be configured to be persistent. If you're using blobs, then there is a separate blob cache. On Tue, Jan 15, 2013 at 2:15 PM, Claudiu Saftoiu csaft...@gmail.com wrote: You can't cause a specific object (or collection of objects) to stay ion the cache, but if you're working set is small enough to fit in the memory or client cache, you can get the same effect. That makes sense. So, is there any way to give ZODB a Persistent and tell it load everything about the object now for this transaction so that the cache mechanism then gets triggered, or do I have to do a custom search through every aspect of the object, touching all Persistents it touches, etc, in order to get everything loaded? Essentially, when the server restarts, I'd like to pre-load all these objects (my cache is indeed big enough), so that if a few hours later someone makes a request that uses it, the objects will already be cached instead of starting to be cached right then. ZODB doesn't provide any pre-warming facility. This would be application dependent. You're probably better off using a persistent ZEO cache and letting the cache fill with objects you actually use. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton Jerky is better than bacon! http://zo.pe/Kqm ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] what's the latest on zodb/zeo+memcached?
On Tue, Jan 15, 2013 at 2:45 PM, Claudiu Saftoiu csaft...@gmail.com wrote: On Tue, Jan 15, 2013 at 2:40 PM, Jim Fulton j...@zope.com wrote: So, first, a concise partial answer to a previous question: ZODB provides an in-memory object cache. This is non-persistent. If you restart, it is lost. There is a cache per connection and the cache size is limited by both object count and total object size (as estimated by database record size). ZEO also provides a disk-based cache of database records read from the server. This is normally much larger than the in-memory cache. It can be configured to be persistent. If you're using blobs, then there is a separate blob cache. On Tue, Jan 15, 2013 at 2:15 PM, Claudiu Saftoiu csaft...@gmail.com wrote: You can't cause a specific object (or collection of objects) to stay ion the cache, but if you're working set is small enough to fit in the memory or client cache, you can get the same effect. That makes sense. So, is there any way to give ZODB a Persistent and tell it load everything about the object now for this transaction so that the cache mechanism then gets triggered, or do I have to do a custom search through every aspect of the object, touching all Persistents it touches, etc, in order to get everything loaded? Essentially, when the server restarts, I'd like to pre-load all these objects (my cache is indeed big enough), so that if a few hours later someone makes a request that uses it, the objects will already be cached instead of starting to be cached right then. ZODB doesn't provide any pre-warming facility. This would be application dependent. You're probably better off using a persistent ZEO cache and letting the cache fill with objects you actually use. Okay, that makes sense. Would that be a server-side cache, or a client-side cache? There are no server-side caches (other than the OS disk cache). I believe I've already succeeded in getting a client-side persistent disk-based cache to work (my zodb_indexdb_uri is zeo://%(here)s/zeo_indexdb.sock?cache_size=2000MBconnection_cache_size=50connection_pool_size=5var=zeocacheclient=index), This configuration syntax isn't part of ZODB. I'm not familiar with the options there. but this doesn't seem to be what you're referring to as that is exactly the same size as the in-memory cache. I doubt it, but who knows? Could you provide some pointers as to how to get a persistent disk-based cache on the ZEO server, if that is what you meant? ZODB is configured via ZConfig. The parameters are defined here: https://github.com/zopefoundation/ZODB/blob/master/src/ZODB/component.xml Not too readable, but at least precise. :/ Look at the parameters for zodb and zeoclient. Here's an example: zodb main cache-size 10 pool-size 7 zeoclient blob-cache-size 1GB blob-dir /home/zope/foo-classifieds/blob-cache cache-size 2GB server das-head1.foo.zope.net:11100 server das-head2.foo.zope.net:11100 /zeoclient /zodb If you want to use this syntax with paste, see: http://pypi.python.org/pypi/zc.zodbwsgi Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton Jerky is better than bacon! http://zo.pe/Kqm ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] zeopack error in zrpc.connection
On Mon, Jan 7, 2013 at 1:04 PM, Claudiu Saftoiu csaft...@gmail.com wrote: How do I go about fixing this? Let me know if I can provide any other information that would be helpful. I took the advice in this thread: https://mail.zope.org/pipermail/zodb-dev/2012-January/014526.html The exception that comes up, from the zeo server log, is: 2013-01-07T13:01:49 ERROR ZEO.zrpc (14891) Error raised in delayed method Traceback (most recent call last): File /home/tsa/env/lib/python2.6/site-packages/ZEO/StorageServer.py, line 1377, in run result = self._method(*self._args) File /home/tsa/env/lib/python2.6/site-packages/ZEO/StorageServer.py, line 343, in _pack_impl self.storage.pack(time, referencesf) File /home/tsa/env/lib/python2.6/site-packages/ZODB/blob.py, line 796, in pack result = unproxied.pack(packtime, referencesf) File /home/tsa/env/lib/python2.6/site-packages/ZODB/FileStorage/FileStorage.py, line 1078, in pack pack_result = self.packer(self, referencesf, stop, gc) File /home/tsa/env/lib/python2.6/site-packages/ZODB/FileStorage/FileStorage.py, line 1034, in packer opos = p.pack() File /home/tsa/env/lib/python2.6/site-packages/ZODB/FileStorage/fspack.py, line 397, in pack self.gc.findReachable() File /home/tsa/env/lib/python2.6/site-packages/ZODB/FileStorage/fspack.py, line 190, in findReachable self.findReachableAtPacktime([z64]) File /home/tsa/env/lib/python2.6/site-packages/ZODB/FileStorage/fspack.py, line 275, in findReachableAtPacktime for oid in self.findrefs(pos): File /home/tsa/env/lib/python2.6/site-packages/ZODB/FileStorage/fspack.py, line 328, in findrefs return self.referencesf(self._file.read(dh.plen)) File /home/tsa/env/lib/python2.6/site-packages/ZODB/serialize.py, line 630, in referencesf u.noload() TypeError: 'NoneType' object does not support item assignment I'm afraid this doesn't seem to help me figure out what's wrong... I suspect your database is corrupted. You'd probably want to look at the record in question to be sure. You could disable garbage collection, but if you have a damaged record, you might want to use the previous version of the record (if it exists) to recover it. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton Jerky is better than bacon! http://zo.pe/Kqm ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] Moving ZEO, BTrees, persistent and transaction to github
On Fri, Jan 4, 2013 at 1:19 PM, Jim Fulton j...@zope.com wrote: I'll do and BTrees, persistent and transaction in about a week. ... This weekend, I'll update svn for these packages and ZODB to indicate that development is taking place in github. So ZODB, persistent, BTrees and transaction are no in github, so noted in SVN and SVN projects now read only. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton Jerky is better than bacon! http://zo.pe/Kqm ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] repoze.catalog.query very slow
Y'all might have better luck with this on zope-dev. Jim On Thu, Jan 3, 2013 at 5:25 PM, Jeff Shell j...@bottlerocket.net wrote: Dear gods, I hope you get an answer to this question, as I've noticed the same thing with very large indexes (using zc.catalog). I believe that at the root layers repoze.catalog is built around the same concepts and structures. When I tried to trace down the problems with a profiler, they all revolved around loading the relevant portions of the indexes into memory. It had nothing to do with the final results of the query; had nothing to do with waking up the 'result' objects; all of the slowness seemed to be in loading the indexes themselves into memory. In one case, we were only using one index (a SetIndex), with about 15 document ids. This is from my own profiling. All I know is that this is very slow, and then very fast, and the object cache and the relevant indexes ability to keep all of their little BTrees or Buckets or Sets or whatever in that object cache seem to have a tremendous impact on query and response time - far more than is taken up by then waking up the content objects in your result set. When the indexes aren't in memory, in my case I found the slowness to be in BTrees's 'multiunion' function; the real slowness was in calling ZODB's setstate (which is loading into memory). This is just BTree (catalog index) data being loaded at this point: Profiling a fresh site (no object cache / memory population yet) winterhome-firstload.profile% callees multiunion Function called... ncalls tottime cumtime {BTrees._IFBTree.multiunion} - 659800.132 57.891 Connection.py:848(setstate) winterhome-firstload.profile% callers multiunion Function was called by... ncalls tottime cumtime {BTrees._IFBTree.multiunion} - 270.348 58.239 index.py:203(apply) (yep, 58 seconds; very slow ZEO network load in a demostorage setup where ZEO cannot update its client cache, which makes these setstate problems very exaggerated). 'multiunion' is called 27 times, but one of those times takes 58 seconds). Profiling the same page again with everything all loaded winterhome-secondload.profile% callees multiunion Function called... ncalls tottime cumtime {BTrees._IFBTree.multiunion} - winterhome-secondload.profile% callers multiunion Function was called by... ncalls tottime cumtime {BTrees._IFBTree.multiunion} - 270.1930.193 index.py:203(apply) (this time, multiunion doesn't require any ZODB loads, and its 27 calls internal time and cumulative time are relatively speedy) If there's a good strategy for getting and keeping these things in memory, I'd love to know it; but when the catalog indexes are competing with all of the content objects that make up a site, it's hard to know what to do or even how to configure the object cache counts well without running into serious memory problems. On Jan 3, 2013, at 2:50 PM, Claudiu Saftoiu csaft...@gmail.com wrote: Hello all, Am I doing something wrong with my queries, or is repoze.catalog.query very slow? I have a `Catalog` with ~320,000 objects and 17 `CatalogFieldIndex`es. All the objects are indexed and up to date. This is the query I ran (field names renamed): And(InRange('float_field', 0.01, 0.04), InRange('datetime_field', seven_days_ago, today), Eq('str1', str1), Eq('str2', str2), Eq('str3', str3), Eq('str4', str4)) It returned 15 results so it's not a large result set by any means. The strings are like labels - there are 20 things any one of the string fields can be. This query took a few minutes to run the first time. Re-running it again in the same session took 1 second each time. When I restarted the session it took only 30 seconds, and again 1 second each subsequent time. What makes it run so slow? Is it that the catalog isn't fully in memory? If so, is there any way I can guarantee the catalog will be in memory given that my entire database doesn't fit in memory all at once? Thanks, - Claudiu ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev Thanks, Jeff Shell j...@bottlerocket.net ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev -- Jim Fulton http://www.linkedin.com/in/jimfulton Jerky
Re: [ZODB-Dev] Moving ZEO, BTrees, persistent and transaction to github
On Wed, Jan 2, 2013 at 9:37 AM, Jim Fulton j...@zope.com wrote: On Fri, Dec 28, 2012 at 4:44 PM, Jim Fulton j...@zope.com wrote: I plan to do ZEO shortly. Well, that didn't go well. svn git fetch spent several days and didn't finish. It seemed to be really thrashing trying to follow tags. I'm probably going to have to convert just the trunk. I'm guessing that this is related to the project split. I'll do and BTrees, persistent and transaction in about a week. Or so. :) I suspect that these will have the same problem and that I'll have to convert just the trunk. persistent went smoothly enough. persistent is now in github. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton Jerky is better than bacon! http://zo.pe/Kqm ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] Moving ZEO, BTrees, persistent and transaction to github
On Fri, Jan 4, 2013 at 1:05 PM, Jim Fulton j...@zope.com wrote: On Wed, Jan 2, 2013 at 9:37 AM, Jim Fulton j...@zope.com wrote: On Fri, Dec 28, 2012 at 4:44 PM, Jim Fulton j...@zope.com wrote: I plan to do ZEO shortly. Well, that didn't go well. svn git fetch spent several days and didn't finish. It seemed to be really thrashing trying to follow tags. I'm probably going to have to convert just the trunk. I'm guessing that this is related to the project split. I'll do and BTrees, persistent and transaction in about a week. Or so. :) I suspect that these will have the same problem and that I'll have to convert just the trunk. persistent went smoothly enough. persistent is now in github. BTrees and transaction converted smoothly too. Go figure. This weekend, I'll update svn for these packages and ZODB to indicate that development is taking place in github. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton Jerky is better than bacon! http://zo.pe/Kqm ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] Moving ZEO, BTrees, persistent and transaction to github
On Fri, Dec 28, 2012 at 4:44 PM, Jim Fulton j...@zope.com wrote: I plan to do ZEO shortly. Well, that didn't go well. svn git fetch spent several days and didn't finish. It seemed to be really thrashing trying to follow tags. I'm probably going to have to convert just the trunk. I'm guessing that this is related to the project split. I'll do and BTrees, persistent and transaction in about a week. Or so. :) I suspect that these will have the same problem and that I'll have to convert just the trunk. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton Jerky is better than bacon! http://zo.pe/Kqm ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] Moving ZEO, BTrees, persistent and transaction to github
On Wed, Jan 2, 2013 at 1:04 PM, Leonardo Santagada santag...@gmail.com wrote: On Wed, Jan 2, 2013 at 12:37 PM, Jim Fulton j...@zope.com wrote: On Fri, Dec 28, 2012 at 4:44 PM, Jim Fulton j...@zope.com wrote: I plan to do ZEO shortly. Well, that didn't go well. svn git fetch spent several days and didn't finish. It seemed to be really thrashing trying to follow tags. I'm probably going to have to convert just the trunk. I'm guessing that this is related to the project split. Pypy had a very hard problem with that trying to convert from svn to mercurial. They end up using a home made tool for that, maybe they still have it. After converting to mercurial it is easy to convert to git. Thanks. I'll keep that in mind. Do you know where the tool can be found? Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton Jerky is better than bacon! http://zo.pe/Kqm ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] Blobstorage shrinks only after the second pack operation - bug or feature?
On Sun, Dec 30, 2012 at 3:50 AM, Andreas Jung li...@zopyx.com wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 I noticed a strange behavior with packing a storage having lots of data in a blob storage (Plone 4.2, Zope 2.13). I had a large Plone site (5 GB of data in blobstorage) in a dedicated storage. I removed the Plone Site object and packed the storage through the Zope 2 database management screen. The size of the Data.fs went down however the blobstorage size remained the same. Packing a second removed the obsolete data from the blob storage. So why do I have to pack two times in order to get a minimized blob storage? Without knowing more details, it's impossible to know. What did you have pack-keep-old set to? Bug or feature? I doubt either. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton Jerky is better than bacon! http://zo.pe/Kqm ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] Blobstorage shrinks only after the second pack operation - bug or feature?
On Sun, Dec 30, 2012 at 12:22 PM, Andreas Jung li...@zopyx.com wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Jim Fulton wrote: On Sun, Dec 30, 2012 at 3:50 AM, Andreas Jung li...@zopyx.com wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 I noticed a strange behavior with packing a storage having lots of data in a blob storage (Plone 4.2, Zope 2.13). I had a large Plone site (5 GB of data in blobstorage) in a dedicated storage. I removed the Plone Site object and packed the storage through the Zope 2 database management screen. The size of the Data.fs went down however the blobstorage size remained the same. Packing a second removed the obsolete data from the blob storage. So why do I have to pack two times in order to get a minimized blob storage? Without knowing more details, it's impossible to know. What did you have pack-keep-old set to? I used the default of the Zope 2 UI which is 0 (days). You didn't answer my question. I didn't ask how many days you packed to. I asked if you set pack-keep-old. I'm guessing from your response that you didn't. If you don't set pack-keep-old to false, then old blobs are kept around, just like the file-storage file, except that, rather than making copies, hard links are created. The second time you packed, the old links would be removed, freeing up the space taken by the old blobs. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton Jerky is better than bacon! http://zo.pe/Kqm ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] Moving ZODB projects to github
On Wed, Dec 26, 2012 at 1:42 PM, Jim Fulton j...@zope.com wrote: I'd like to move ZODB-related projects to github nowish. (I'm having problems converting ZODB though :() OK, I think I worked out the problems I was having with svn2git (It's broken. I wrote a simple Python script that did the same thing, except for the brokeness :) The new repo is at: https://github.com/zopefoundation/ZODB Please don't check into the svn project any more. It would be great if people would kick the tires on this to make sure I didn't miss anything in the conversion. In a few days, I'll mark the svn project as moved to github. Eventually, I'll set the tests up to run with travis. It would be just fine if someone beat me to it. :) Also, generally, I'd like to update trunk via pull requests, rather than direct checkins to trunk (exceptions being things like travis setup or checkins associated with releasing.) Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton Jerky is better than bacon! http://zo.pe/Kqm ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] Moving ZODB projects to github
On Fri, Dec 28, 2012 at 3:44 PM, Mikko Ohtamaa mi...@opensourcehacker.com wrote: Hi, OK, I think I worked out the problems I was having with svn2git (It's broken. I wrote a simple Python script that did the same thing, except for the brokeness :) I also run problems with svn2git. In the end I just ended up migrating trunk + limited history : Is your Python script available somewhere, just for the poor souls of the future who will come after us? It's a work in progress. I plan to publish it more widely when I've used it more and maybe/probably incorporated the github/oauth goodness you pointed me to. But here's a snapshot. Let me know if it works for you (or doesn't). Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton Jerky is better than bacon! http://zo.pe/Kqm svn2git.py Description: Binary data ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
[ZODB-Dev] Moving ZEO, BTrees, persistent and transaction to github
I plan to do ZEO shortly. I'll do and BTrees, persistent and transaction in about a week. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton Jerky is better than bacon! http://zo.pe/Kqm ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
[ZODB-Dev] Moving ZODB projects to github
I'd like to move ZODB-related projects to github nowish. (I'm having problems converting ZODB though :() If you're currently working on something, let me know so I don't leave your work behind. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton Jerky is better than bacon! http://zo.pe/Kqm ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
[ZODB-Dev] Please clean up your unneeded dev branches in the ZODB svn project
Some of the branches (especially some tseaver_...), are giving svn2git fits. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton Jerky is better than bacon! http://zo.pe/Kqm ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] Please clean up your unneeded dev branches in the ZODB svn project
On Wed, Dec 26, 2012 at 7:25 PM, Jim Fulton j...@zope.com wrote: Some of the branches (especially some tseaver_...), are giving svn2git fits. G. The branches that bothered svn2git were already deleted. Whimper. Sorry. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton Jerky is better than bacon! http://zo.pe/Kqm ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] shared cache when no write?
On Wed, Dec 12, 2012 at 6:31 PM, Dylan Jay d...@pretaweb.com wrote: Hi, I've been working with zope for over 12 years and something that keeps coming up is sacling IO bound operations in Zope. The typical example is where you build an app that calls external apis. While this is happening a zope thread isn't doing any other processing and because there is a 1 thread 1 zodb cache limit. You can run into scalability problems as you can only have as many threads your RAM / average cache size. The end result is low throughput while still having low CPU. I've consulted on some $$$ sites where others have made this mistake. It's an easy mistake to make as SQL/PHP systems don't tend to have this limitation so new developers to zope often don't to think of it. I was listening to a talk by a Java guy on Friday where he warned that a common newbie mistake was to have too large a database connection pool, causing lots of RAM usage. I expect though that ZODB caches, consisting of live Python objects exacerbate this effect. The possible workarounds aren't pretty. You can segregate your api calling requests to zeo clients with large numbers of threads with small caches using some fancy load balancing rules. You can rework that part of your application to not use zope, perhaps using edge side includes to make it seem p art of the same app. Feel free to shoot down the following makes no sense. What if two or more threads could share a zodb cache up until the point at which one wants to write. This is the point at which you can't share a cache in a consistent manner in my understanding. At that point the transaction could be blocked until other readonly transactions had finished and continue by itself? or perhaps the write transaction could be aborted and restarted with a special flag to ensure it was processed with the cache to itself. As long as requests which involve external access are readonly with regard to zope then this would improve throughput. This might seem an edge case but consider where you want to integrate an external app into a zope or Plone app. Often the external api is doing the writing not the zope part. For example clicking a button on a plone site to make plone send a tweet. It might also improve throughput on zope requests which involve zodb cache misses as they are also IO bound. A simpler approach might be to manage connections better at the application level so you don't need so many of them. If you're goinng to spend a lot of time blocked waiting on some external service, why not close the database connection and reopen it when you need it? Then you could have a lot more threads than database connections. It's possible that ZODB could help at the savepoint level. For example, maybe you could somehow allow savepoints to be used accross tranasctions and connections. This would be a lot saner that tring to share a cache accross threads. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton Jerky is better than bacon! http://zo.pe/Kqm ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] shared cache when no write?
On Thu, Dec 13, 2012 at 4:18 PM, Dylan Jay d...@pretaweb.com wrote: ... I'd never considered that the cache was attached to the db connection rather than the thread. I just reread http://docs.zope.org/zope2/zope2book/MaintainingZope.html and it says exactly that. So what your saying is I'd tune db connections down to memory size on an instance dedicated to io bound and then increase the threads. Whenever a thread requests a db connection and there isn't one available it will block. So I just optimize my app the release the db connection when not needed. In fact I could tune all my copes this way since a zone with 10 threads and 2 connections is going to end up queuing requests the same as 2 threads and 10 connections? Something like that. It's a little more complicated than that because Zope 2 is managing connections for you, it would be easy to run afoul of that. This is a case where something that usually makes your life easier, makes it harder. :) What I'd do is use a separate database other than the one Zope 2 is using. Then you can manage connections yourself without conflicting with the publisher is doing. Then, when you want to use the database, you just open the database, being careful to close it when you're going to block. The downside being that you'll have separate transactions. This should be easier to achieve and changes the application less than the erp5 background task solution mentioned. It would probably be a good idea to lean more bout how erp does this. The erp approach sounds like a variation on what I suggested. I can see from the previous post, as there is no checkout semantics in zodb, I don't know what checkout semantics means. you are free to write anytime so there is no sane way to block at the point someone wants to write to an object, so it wouldn't work. ZODB provides a very simple concurrency model by giving each connection (and in common practice, each thread) it's own view of the database. If you break that, then you're injecting concurrency issues into the app or in some pretty magical layer. You perhaps could have a single read only db connection which is shared? But even if the database data was only read, objects have other state that may be mutated. You'd have to inspect every class to make sure it's thread safe. That's too scary for me. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton Jerky is better than bacon! http://zo.pe/Kqm ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] ZEO invalidation message transaction-inbound or outbound
On Mon, Dec 3, 2012 at 2:35 PM, Shane Hathaway sh...@hathawaymix.org wrote: I've seen ZEO clients become stale due to network instability. The clients caught up the moment they changed something. This was years ago, though. ZEO clients now send keep-alive messages to the storage server to prevent this from happening. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton Jerky is better than bacon! http://zo.pe/Kqm ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
[ZODB-Dev] Need Windows Volunteers -- Losing interest in windows
For a while now, I've tried to make sure ZODB runs on windows. I'll keep doing so as long as it is fairly easy. I'm not a windows developer and life is short. I have a Windows XP VM with the free Microsoft compiler on it. It's getting rather long in the tooth and doesn't seem to work with Python 3. shrug. I'll keep using it as long as it works, but ... We need people who care about Windows to step up. It's necessary, but not sufficient to run tests on windows. We need people willing to debug and fix windows issues. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton Jerky is better than bacon! http://zo.pe/Kqm ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
[ZODB-Dev] ZODB 3.11.0a1 released -- the breakup!
I just released ZODB 3.11.0a1. ZODB3 is now a meta package that requires persistent, ZODB, BTrees and ZEO, all at versions =4dev. I expect this release to be boring. If I don't hear of any problems, I plan to move these releases to beta in a few days and to final a few days after that. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton Jerky is better than bacon! http://zo.pe/Kqm ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] ZEO invalidation message transaction-inbound or outbound
On Fri, Nov 30, 2012 at 1:37 AM, Andreas Jung li...@zopyx.com wrote: a customer made the observation that that ZEO clients became inconsistent after some time (large CMF-based application running on Zope 2.12 afaik). Customer made some investigation and noticed that the ZEO invalidations have been queued (in some cases for hours). I can't imagine invalidations being queued for many seconds, much less hours without some serious network pathology. How did they come to this conclusion? ..so the overall state of the ZEO clients became inconsistent. Aren't ZEO invalidation messages supposed to be transmitted within the current transaction (and outbound as it seems to happen here). No. Invalidations (and all other data) are queued for transmission over the network. Isn't a ZEO cluster to be completely consistent at any time? Each client has a consistent view of the database, but not necessarily an up-to-date one. Differene clients may have views of the database as it was at (usually slightly) different points time, depending on which data they've recieved. It's not practical for all clients to have precisely the same view of the database as each other, although they should differ by seconds, or less. The only way I can see clients having views of the database far out of sync with the server is if the clients are disconncted. A ZEO client can continue functioning normally from a server while disconnected as long as it doesn't write anything and has all the data it needs in its cache. It has a consistent view of the database, but not an up-to-date one. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton Jerky is better than bacon! http://zo.pe/Kqm ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] Call for volunteers: help w finishing Python BTrees
On Thu, Nov 8, 2012 at 10:59 PM, Tres Seaver tsea...@palladion.com wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 08/21/2012 06:50 PM, Tres Seaver wrote: On 10/04/2011 01:32 PM, Jim Fulton wrote: On Tue, Oct 4, 2011 at 11:36 AM, David Glick davidgl...@groundwire.org wrote: On 10/4/11 8:33 AM, Jim Fulton wrote: Someone recently told me I should be more agressive about asking for help. If someone is looking for an opportunity to help, finishing the Python version of BTrees would help a lot. I think I got this started pretty well, but ran out of time. This is needed for running ZODB on PyPy and jython, both of which I'd like to see. svn+ssh://svn.zope.org/repos/main/ZODB/branches/jim-python-btrees Jim P.S. Much thanks to Tres for his work on the Python version of persistence. What tasks remain to be done? (I assume running the tests will give a starting point, but perhaps there are other todo items you know of?) Really, just getting the tests to pass. I think there are a lot of legacy, but still supporte features that need to be fixed. (This is a really old package.) In a fresh checkout of the branch, I see what looks like an infinite loop in the tests: I left it running for an hour just now, and it hung inside the '_set_operation' helper function inside the 'test_difference' testcase for 'PureOO' testcase. Just a quick update: my 'pure_python' branch now passes all tests on Python 2.6, 2.7, and PyPy (no C extension1) I plan to do a lot of cleanup during the PyConCA sprints next week before merging the branch to the trunk. Awesome. Thanks. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton Jerky is better than bacon! http://zo.pe/Kqm ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] RFC: ZODB 4.0 (without persistent)
On Sat, Oct 20, 2012 at 3:37 PM, Tres Seaver tsea...@palladion.com wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 10/20/2012 01:47 PM, Jim Fulton wrote: I had the impression that Tres was proposing more. shrug I released BTrees 4.0.0, and created a ZODB branch for the (trivial) shift to depending on it: http://svn.zope.org/ZODB/branches/tseaver-btrees_as_egg/ That branch passes all tests, and should be ready for merging. Merged and released. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton Jerky is better than bacon! http://zo.pe/Kqm ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
[ZODB-Dev] ZODB 4.0.0a1 released
Not to be confused with ZODB3! :) This is the first ZODB 4 release. I'll be making a ZEO 4.0.0a1 release soon and then a ZODB3 3.11.0 release that simply requires ZEO 4. Although I wonder if that should be a ZODB 3 4.0.0a1 release. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton Jerky is better than bacon! http://zo.pe/Kqm ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] ZeoServer, multiple storages and open file handles
On Wed, Oct 24, 2012 at 12:33 AM, Tim Godfrey t...@obsidian.com.au wrote: Hi Jim Do you have any idea as to why people recommend against many storages under a single Zeo? Seriously? Go back and read the thread. Also can increasing the invalidation-queue-size help this if there is memory to spare on the machine? Invalidation queue size has nothing to do with this. Jim Tim On 23 October 2012 00:53, Jim Fulton j...@zope.com wrote: On Sun, Oct 21, 2012 at 8:48 PM, Tim Godfrey t...@obsidian.com.au wrote: Hi Jon Thanks for your response. Is that something that has been done in a later version of Zeoserver than mine (ZODB3-3.10.3)? No. Every time I've tried to switch to poll in a asyncore, I've had problems. It this your recommended action for the issue I'm having or are there still some configuration changes I can make? Many people have recommended not hosting many storages in a single process. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton Jerky is better than bacon! http://zo.pe/Kqm -- Tim Godfrey Obsidian Consulting Group P: +61 3 9355 7844 F: +61 3 9350 4097 E: t...@obsidian.com.au W: http://www.obsidian.com.au/ -- Jim Fulton http://www.linkedin.com/in/jimfulton Jerky is better than bacon! http://zo.pe/Kqm ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] ZeoServer, multiple storages and open file handles
On Sun, Oct 21, 2012 at 8:48 PM, Tim Godfrey t...@obsidian.com.au wrote: Hi Jon Thanks for your response. Is that something that has been done in a later version of Zeoserver than mine (ZODB3-3.10.3)? No. Every time I've tried to switch to poll in a asyncore, I've had problems. It this your recommended action for the issue I'm having or are there still some configuration changes I can make? Many people have recommended not hosting many storages in a single process. Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton Jerky is better than bacon! http://zo.pe/Kqm ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] ZeoServer, multiple storages and open file handles
On Thu, Oct 18, 2012 at 3:09 PM, Leonardo Rochael Almeida leoroch...@gmail.com wrote: Thanks for pitching in with an answer! Having a single ZEO server handling more than one or two storages is not usually a good idea. ZEO does not handle each storage in a separate thread, so you're underusing multiple CPUs if you have them. Nit pick: ZEO handles each client connection and storage in a separate thread. (So 30 storages and 16 clients means 480 threads :) It is Python's GIL that prevents effective use of multiple processors. ZEO goes out if it's way to let I/O C code run concurrently (since I/O isn't subject to the GIL) and I've seen ZEO storage servers use up to 200% CPU on 4-core boxes (2 cores worth, IOW). Jim -- Jim Fulton http://www.linkedin.com/in/jimfulton Jerky is better than bacon! http://zo.pe/Kqm ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev