Re: [ZODB-Dev] Iterate through all objects from ZODB

2014-09-23 Thread Jim Fulton
On Mon, Sep 22, 2014 at 11:12 PM, Carlos Sanchez
carlos.sanc...@nextthought.com wrote:
 Hi,

 I was wondering if there is an official API and/or a way to iterate through
 all objects in a ZODB database.

In general, official interfaces are found in ZODB.interfaces.

IStorageCurrentRecordIteration lets you iterate over metadata about
objects in the database, including oid, tid, and pickle. Both
FileStorage and ZEO implement this interface.

You can pass the oid to a connection's get method to get the object.
Iterating over the entire database requires some care to avoid
exceeding RAM.  After dealing with each object, you'll probably want
to call cacheGC on the connection to free unneeded memory.
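
A minimal sketch of that loop, assuming an already-open storage that
implements the interface (e.g. a FileStorage) and its DB:

  next_ = None
  conn = db.open()
  while True:
      oid, tid, data, next_ = storage.record_iternext(next_)
      obj = conn.get(oid)
      # ... examine obj here ...
      conn.cacheGC()  # free unneeded memory as we go
      if next_ is None:
          break
  conn.close()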

...

 We are using RelStorage (MySQL) and ZEO (4 Dev)

I don't know if RelStorage implements IStorageCurrentRecordIteration.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Race condition in RelStorage history-free pack?

2014-07-18 Thread Jim Fulton
You should be able to arrange to pack to some time in the past. You may need a
custom pack script. (I don't remember offhand if zeopack accepts fractional
days.)  But maybe packing to some small time period back (say 1 hour)
would work around the issue.
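
For example (a minimal sketch, assuming direct access to an open
storage object; with zeopack you'd pass a suitable days value instead):

  import time
  from ZODB.serialize import referencesf

  # pack to one hour ago rather than to now
  storage.pack(time.time() - 3600, referencesf)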

Jim

On Fri, Jul 18, 2014 at 10:09 AM, Ian McCracken i...@zenoss.com wrote:
 We have an area of our product that can, depending on certain conditions,
 produce lots of rapid transactions all at once. We’re getting many reports
 of POSKeyErrors at sites where the transaction volume is higher than others.
 They appear to coincide (in many cases, at least) with zodbpack runs.

 After some investigation, it appears that it’s a timing issue. If an object
 is unreferenced during the pre_pack() stage, it will be marked for GC. If it
 then becomes referenced before the actual DELETE FROM is executed, it will
 be deleted nonetheless and POSKeyErrors will result.

 Now, I don’t know how the object is unreferenced and referenced in separate
 transactions; as far as I can tell, there are no two-transaction moves or
 anything like that happening. Nonetheless, it’s entirely possible we can
 solve this by tracking down the code that sets up the race condition.

 But is there any lower-level way around the race condition? A good amount of
 our application is pluggable by the end user, and I’d like to avoid vague
 technical warnings about transaction boundaries if possible :) If we could
 verify that no POSKeyErrors will result from the DELETE before running it,
 that would be much simpler.

 I should also mention that running zodbpack with days=1 isn’t an
 across-the-board solution for us. We have some customers for whom rapid
 growth of the database is an issue, and they pack more frequently.

 —Ian

 ___
 For more information about ZODB, see http://zodb.org/

 ZODB-Dev mailing list  -  ZODB-Dev@zope.org
 https://mail.zope.org/mailman/listinfo/zodb-dev




-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Pack problem

2014-06-30 Thread Jim Fulton
On Mon, Jun 30, 2014 at 4:32 AM, Frédéric Iterbeke
frederic.iterb...@ugent.be wrote:
 Op 30/06/2014 9:30, Alessandro Pisa schreef:

 I have a ~70Gb Data.fs that does not pack anymore.
 When I pack it, it creates a ~8GB Data.fs.pack, then it evaluates this
 condition:

 -https://github.com/zopefoundation/ZODB/blob/3.9.5/src/ZODB/FileStorage/fspack.py#L410
 as True, removes the Data.fs.pack, and returns.

 The same happens with ZODB-3.10.5 and using pack-gc = false.
 In every permutation I tried the produced Data.fs.pack files have the
 same checksum.

 Does anybody have some hints?

 What I am trying to do:
   - comment out the Data.fs.pack removal.
   - use that file as a new Data.fs

 Well, 70 Gb Data.fs is pretty big. But not impossible afaik.

We have much larger databases.


 If you set pack-gc = false it's normal that nothing is removed.

No, it's not.

 If you think the code is doing something wrong and you would like to try
 packing anyway, I would suggest you just comment the entire if in the
 fspack.py code and try running this version. If I read the code correctly,
 this would force (at least trying) a pack in any case, assuming
 pack-gc=true.

Really? You're suggesting modifying the code.


 I'm not guaranteeing anything and I'm just a zodb user though ;)

And yet you suggest modifying code you don't understand.

Amazing.

 And
 remember to use a copy of your data when doing stuff like this ;)

At least you suggested making a copy.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Pack problem

2014-06-30 Thread Jim Fulton
On Mon, Jun 30, 2014 at 4:40 AM, Alessandro Pisa
alessandro.p...@gmail.com wrote:
 On 30 June 2014 10:32, Frédéric Iterbeke frederic.iterb...@ugent.be wrote:
 Op 30/06/2014 9:30, Alessandro Pisa schreef:

 I have a ~70Gb Data.fs that does not pack anymore.
 When I pack it, it creates a ~8GB Data.fs.pack, then it evaluates this
 condition:

 -https://github.com/zopefoundation/ZODB/blob/3.9.5/src/ZODB/FileStorage/fspack.py#L410
 as True, removes the Data.fs.pack, and returns.

 The same happens with ZODB-3.10.5 and using pack-gc = false.
 In every permutation I tried the produced Data.fs.pack files have the
 same checksum.

 Does anybody have some hints?

 What I am trying to do:
   - comment out the Data.fs.pack removal.
   - use that file as a new Data.fs

 Well, 70 Gb Data.fs is pretty big. But not impossible afaik.

 If you set pack-gc = false it's normal that nothing is removed.


 Setting or unsetting it doesn't change the produced Data.fs.pack (it
 has the same md5sum).
 Anyway, the pack time is considerably reduced (from 4 hours to 25 minutes).

That's because the GC (and even the pack algorithm) built into FileStorage
is very inefficient.

When you pack a file storage with GC enabled, you are really doing 2
things (see the sketch after this list):

1. Removing non-current database records (as of the pack time).
This is properly called packing.

2. Removing objects that are no longer reachable (from any records,
current or non-current) from the root object.  If packing doesn't
remove any records, GC won't remove any objects either.
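
A minimal sketch of driving a pack through the DB API (a FileStorage
is assumed; pack-gc is a FileStorage configuration option):

  import ZODB
  from ZODB.FileStorage import FileStorage

  db = ZODB.DB(FileStorage('Data.fs'))
  db.pack(days=1)  # equivalent to db.pack(time.time() - 86400)
  db.close()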

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Pack problem

2014-06-30 Thread Jim Fulton
On Mon, Jun 30, 2014 at 7:21 AM, Simone Deponti
simone.depo...@abstract.it wrote:
 Hi Jim,

 On Mon, Jun 30, 2014 at 12:43 PM, Jim Fulton j...@zope.com wrote:

 As the comment suggests, if you continued packing, the new file
 would be as large as the old one, because no records would be
 removed.  This is likely either because a) you've already packed to
 that pack time before, or b) none of the objects written up to the pack time
 have been written after the pack time and thus there are no old records
 to be removed.

 Therefore, if I get it right, what happens is:

 * All transactions prior to the packing time are scanned to see if
 they contain reachable data,

Not right.

- First, it does a scan to determine which records are current
  as of the pack time.  This has nothing to do with reachability or
  GC.  If an object is modified, then a new record is written and becomes
  current.  Packing (before or without GC) simply removes (or more precisely,
  copies) current records.

- It copies current records to pack files.

If it determines that all records have been copied, then
it stops, as there's no purpose in proceeding.

 if they do, they are kept. Therefore the
 condition there checks that, if we have reached the pack time (after
 which, all transactions are copied over anyway) and none has been
 detected as deletable, then it doesn't make sense to go on packing.

 * If pack-gc is on, then all the transaction prior to pack time that
 have been kept are purged of unreachable objects

GC doesn't have anything directly to do with pack time.
GC removes objects that are no longer reachable from
the root via any records surviving a pack.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Pack problem

2014-06-30 Thread Jim Fulton
On Mon, Jun 30, 2014 at 7:51 AM, Frédéric Iterbeke
frederic.iterb...@ugent.be wrote:
...
 If you think the code is doing something wrong and you would like to try
 packing anyway, I would suggest you just comment the entire if in the
 fspack.py code and try running this version. If I read the code
 correctly,
 this would force (at least trying) a pack in any case, assuming
 pack-gc=true.

 Really? You're suggesting modifying the code.

 Sometimes I modify code like that in isolated environments to try to patch
 things or debug problems. Yes, also with code I haven't written myself or
 reviewed from the first 'till the last line.

 I did not suggest committing any change to the current codebase, if that is
 what you were implying.

No, I'm saying it's reckless to suggest modifications to code you
don't understand.

Suppose your changes seemed to work but caused data corruption
that wasn't detected until much later.

Alessandro might have tried your suggestion, thought it worked, and then
applied it to his production database.


 I'm not guaranteeing anything and I'm just a zodb user though ;)

 And yet you suggest modifying code you don't understand.

 Amazing.

 I was just trying to help. Which is the purpose of this list, is it not?

Only if you're qualified.  Do you help with brain surgery too?


 And
 remember to use a copy of your data when doing stuff like this ;)

 At least you suggested making a copy.

 At least you tried giving a relevant answer later on in the thread.

 Ever thought of the fact that the information you just gave on this list on
 the workings of pack, which other people are trying to comprehend and
 interpret correctly, is nowhere to be found in the documentation? So users are left
 to find out for themselves.

 Or should we all try to fully understand each letter of code in a product
 before using it?

Sorry the documentation isn't thorough enough.

 It's posts like this that make me not want to try to help others anymore.

If I can keep an unqualified helper from causing harm, I've accomplished
something.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Pack problem

2014-06-30 Thread Jim Fulton
On Mon, Jun 30, 2014 at 8:24 AM, Alessandro Pisa
alessandro.p...@gmail.com wrote:
 On 30 June 2014 12:43, Jim Fulton j...@zope.com wrote:
 On Mon, Jun 30, 2014 at 3:30 AM, Alessandro Pisa
 alessandro.p...@gmail.com wrote:
 Hello everybody :)

 As the comment suggests, if you continued packing, the new file
 would be as large as the old one, because no records would be
 removed.  This is likely either because a) you've already packed to
 that pack time before.
 b) none of the objects written up to the pack time
 have been written after the pack time and thus there are no old records
 to be removed.

 Strange, I am making a 0 day pack.

Perhaps you had a clock problem and the recent records have
timestamps in the future.

 How can I convince zeopack of that?

You pass a pack time of now. :)

 Is it possible to remove this previous pack memory and act as it
 would be the first pack?

Theoretically.

 Would this be effective?

At causing dangling references, possibly.

 Any suggestion for reducing the Data.fs size?

I suggest using the file-store iterator to look at the transaction
timestamps.

Something like:

  from ZODB.FileStorage import FileIterator

  it = FileIterator('s.fs')

  last = None
  for t in it:
      if last is not None:
          if t.tid <= last:
              print 'wtf', repr(t.tid)
      last = t.tid


If you've said to pack to the present and you
aren't writing to the database, then I would expect it to stop at
the end of the file, unless you have a problem with your
transaction ids.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Pack problem

2014-06-30 Thread Jim Fulton
On Mon, Jun 30, 2014 at 10:44 AM, Alessandro Pisa
alessandro.p...@gmail.com wrote:
 On 30 June 2014 16:02, Alessandro Pisa alessandro.p...@gmail.com wrote:
 On 30 June 2014 15:42, Jim Fulton j...@zope.com wrote:
 On Mon, Jun 30, 2014 at 8:24 AM, Alessandro Pisa

 I suggest using the file-store iterator to look at the transaction
 timestamps.


 This is what I got

 [root@zeoserver]# cat scripts/test_tids.py
 from ZODB.FileStorage import FileIterator
 it = FileIterator('var/filestorage/Data.fs')
 print 'Size %s' % it._file_size
 last = None
 counter = 0
 for t in it:
     if last is not None:
         if t.tid <= last:
             import pdb; pdb.set_trace()
             print 'wtf', repr(t.tid)
     counter += 1
     last = t.tid
 print 'Last transaction: %s-%s' % (t._tpos, t._tend)
 print 'Transactions: %s' % counter
 [root@zeoserver]# ./bin/zopepy scripts/test_tids.py
 Size 67395884639
 Last transaction: 67392229872-67395884631
 Transactions: 1275366

 It seems that tids are in order :/

Then I suggest seeing if any are in the future.

You can create a tid for now with repr(ZODB.TimeStamp.TimeStamp(Y, M,
D, h, m, s)).
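
A minimal sketch of that check, in the Python 2 style of the script
above (`last` being the last tid seen by the iterator):

  import time
  from ZODB.TimeStamp import TimeStamp

  now = time.gmtime()  # tids are UTC-based
  now_tid = repr(TimeStamp(now.tm_year, now.tm_mon, now.tm_mday,
                           now.tm_hour, now.tm_min, now.tm_sec))
  if last > now_tid:
      print 'last transaction is in the future', repr(last)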


Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Pack problem

2014-06-30 Thread Jim Fulton
On Mon, Jun 30, 2014 at 11:38 AM, Alessandro Pisa
alessandro.p...@gmail.com wrote:
...
 I checked the last transaction tid and it is in the future...
...
 print 'Transactions: %s' % counter
 [root@zeoserver]# ./bin/zopepy scripts/test_tids.py
 Size 67395884639
 Last transaction: 67392229872-67395884631 (2014-07-01 00:10:04.686741)
 Transactions: 1275366

So, you can pass a time.time() in the future, or just wait a few hours :)

 Thanks everybody for the valuable help, I learned a lot.

You're welcome.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Massive LockError: Couldn't lock....check_size.lock messages

2014-04-15 Thread Jim Fulton
On Tue, Apr 15, 2014 at 9:28 AM, Andreas Jung li...@zopyx.com wrote:
 Hi there,

 we have a Plone 4.2 setup running ZEO with 3 application servers
 and a non-shared blob setup, ZODB3-3.10.5-py2.7-linux-i686

 We see massive amounts of the following error message on every
 application server. The application servers are configured
 to use a local blob-cache directory. We tried to tune the blob
 cache size (even down to zero) but without success.

 Any idea about this issue?

This is a misfeature in zc.lockfile.  It's crying wolf.
(Well, not really, but it has no way of knowing whether the
lock is important. Sometimes it is, and sometimes not.)
These messages can be ignored.

If someone wanted to fix this, they could add an argument
to the constructor to suppress these log messages.

In the meantime, as a workaround, you could adjust your logger
configuration to suppress these yourself.
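
A minimal sketch of that (the logger name comes from zc.lockfile itself):

  import logging
  logging.getLogger('zc.lockfile').setLevel(logging.CRITICAL)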

Jim


 Andreas

 2014-04-15T15:24:34 ERROR zc.lockfile Error locking file
 /srv/gehirn/dasgehirn_buildout/var/blobstorage.cache/check_size.lock;
 pid=25381
 Traceback (most recent call last):
   File
 /srv/gehirn/dasgehirn_buildout/eggs/zc.lockfile-1.0.2-py2.7.egg/zc/lockfile/__init__.py,
 line 84, in __init__
 _lock_file(fp)
   File
 /srv/gehirn/dasgehirn_buildout/eggs/zc.lockfile-1.0.2-py2.7.egg/zc/lockfile/__init__.py,
 line 59, in _lock_file
 raise LockError("Couldn't lock %r" % file.name)
 LockError: Couldn't lock
 '/srv/gehirn/dasgehirn_buildout/var/blobstorage.cache/check_size.lock'
 ___
 For more information about ZODB, see http://zodb.org/

 ZODB-Dev mailing list  -  ZODB-Dev@zope.org
 https://mail.zope.org/mailman/listinfo/zodb-dev



-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


[ZODB-Dev] New ZODB Mailing list

2014-02-22 Thread Jim Fulton
I've created a new ZODB mailing list via google groups:

https://groups.google.com/forum/#!forum/zodb

I've done this for 2 reasons:

- We're having some difficulty managing mail.zope.org.

  Currently, there are some issues (certificate related?) that
  are causing mail to bounce.  This is causing the membership
  in this list to slowly erode.

  This is fixable by someone with knowledge of email arcana
  or someone willing to spend the time to learn about it.

  Personally, I have better things to do, and no one else
  in the Zope Foundation has stepped forward, so ...

- I also want to be more welcoming to ZODB users.

  A user-oriented list is long overdue.

  For now, we'll discuss user as well as developer issues
  on the new list. It's not like traffic has been heavy lately.

  ( I would like to know what's up with the zodb-dev
   google group.  I'm 91% sure one of *us* must have
   set it up at some point.)

In a week or so, I'll disable posts to this list, with a
notice to move to the google group.  In the meantime,
I'll post to both places.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


[ZODB-Dev] Sprucing up zodb.org and pull request

2014-02-22 Thread Jim Fulton
This was posted to z...@googlegroups.com.  If you
want to reply, please do so there.

I'd like to spruce up zodb.org, especially the documentation.

I think the ZODB is under-appreciated and it's hard to promote
it without good docs and a reasonable web site.

I plan to do this in small increments, to make it easier for me
to squeeze in time and to make it easier for other people to
contribute through their own small changes and review.

I'd like to offer my sincere thanks to the folks who set it up
initially, providing a basis for incremental improvements.

The website is a sphinx project:

  https://github.com/zopefoundation/zodbdocs

now hosted by Read the Docs. Thanks Read the Docs!!!

Help would be much appreciated and can be offered in small
parts. From offering edits, to reviewing pull requests, to pointing
out problems.

If you see something that needs to be fixed or want to suggest
an improvement, please file an issue:

  https://github.com/zopefoundation/zodbdocs/issues

If you want to help by reviewing edits, look for pull requests:

  https://github.com/zopefoundation/zodbdocs/pulls

You don't need to be a Zope contributor to make a documentation
pull request.

BTW, I'd like non-trivial changes to be made via pull request.

This last week, I finally got around to some overdue work on the
tutorial. It would be great if someone would review my changes:

  https://github.com/zopefoundation/zodbdocs/pull/1

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Does ZODB pipeline load requests?

2014-02-20 Thread Jim Fulton
On Wed, Feb 19, 2014 at 7:40 PM, Dylan Jay d...@pretaweb.com wrote:
 Iterators certainly seem like a logical place to start.

 As an example I originally was doing a TTW zope reindex of a single index.
 Due to conflict problems I used a modified version of this 
 https://github.com/plone/Products.PloneOrg/blob/master/scripts/catalog_rebuild.py
  (which I'd love to integrate something similar into zcatalog sometime).
 Both use iterators I believe.

 I think even if there was an explicit api where you can pass in an iterator, 
 a max buffer length and you'd get passed back another iterator. Then 
 asynchronously objects will load to try and keep ahead of the iterator 
 consumption.
 e.g.
 for obj in async_load(myitr, 50):
     dox(obj)

I like the idea of a wrapper. I think a) you're pushing the abstraction
too far, and b) this doesn't have to be a ZODB API, at least not initially.

In any case, if the lower-level API exists, it would be straightforward
to implement one like above.
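
For illustration, a minimal sketch of such a wrapper (Python 2; the
name async_load is from Dylan's example, the rest is an assumption).
Note that a real version couldn't safely load ZODB objects from a
second thread sharing one connection, which is exactly why the
lower-level storage API is needed; this only shows the buffering shape:

  import Queue
  import threading

  def async_load(it, buffer_size):
      queue = Queue.Queue(buffer_size)
      done = object()  # sentinel marking the end of the source iterator

      def fill():
          for item in it:
              queue.put(item)  # blocks while the buffer is full
          queue.put(done)

      threading.Thread(target=fill).start()
      while True:
          item = queue.get()
          if item is done:
              break
          yield item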

 I don't know how that would help with a loop like this however

 for obj in async_load(myitr, 50):
     dox(obj.getMainObject())

Well, this would simply be another custom iterator wrapper.

  for ob in main_iterator(myiter):
      ...

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


[ZODB-Dev] Did someone create a zodb-dev google group?

2014-02-19 Thread Jim Fulton
I just tried creating one, but it was already taken and is not public. :(

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Does ZODB pipeline load requests?

2014-02-19 Thread Jim Fulton
On Tue, Feb 18, 2014 at 8:59 PM, Dylan Jay d...@pretaweb.com wrote:
 Hi,

 I'm seeing a ZCatalog reindex of a large number of objects take a long time
 while only using 10% cpu. I'm not sure yet if this is due to the size of the 
 objects and therefore the network is saturated, or the ZEO file reads aren't 
 fast enough.

How heavily loaded is your storage server, especially %CPU of the
server process?

Also, are the ZODB object or client caches big enough for the job?

 However looking at the protocol I didn't see a way for code such as the 
 ZCatalog to give a hint to ZEO as to what they wanted to load next so the 
 time is taken by network delays rather than either ZEO or app. Is that the 
 case?

It is the case that a ZEO client does one read at a time and that
there's no easy way to pre-load objects.

 I'm guessing if it is, it's a fundamental design problem that can't be fixed 
 :(

I don't think there's a *fundamental* problem.  There are three
issues. The hardest to solve
isn't at the storage level. I'll mention the 2 easiest problems first:

1. The ZEO client implementation only allows one outstanding request at a
   time, even on a client with multiple threads.  This is merely a clumsy
   implementation.

   The protocol easily allows for multiple outstanding reads!

2. The storage API doesn't provide a way to read multiple objects at once,
   or to otherwise hint that additional objects will be loaded.

Both of these are fairly straightforward to fix. It's just a matter of time. :)

3. You have to be able to predict what data are going to be needed.

   This IMO is rather hard, at least at a general level. It's what's left
   me somewhat under-motivated to address the first 2 problems.

We really should address problems 1 and 2 to make it possible
for people to experiment with approaches to problem 3.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Does ZODB pipeline load requests?

2014-02-19 Thread Jim Fulton
On Wed, Feb 19, 2014 at 9:57 AM, Dylan Jay d...@pretaweb.com wrote:
 On 19 Feb 2014, at 10:44 pm, Jim Fulton j...@zope.com wrote:
...
 Yeah, I figured it might be the case that it's hard to predict. In this case
 it's catalog indexing so I was wondering if something could be done with 
 __iter__ on a btree? It's a reasonably good guess that you could start 
 preloading more of those objects if the first few are loaded?

Iterators certainly seem like a logical place to start.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] [zopefoundation/ZODB] 49919d: test for POSKeyError during transaction commit

2014-02-05 Thread Jim Fulton
On Wed, Feb 5, 2014 at 1:57 AM, Marius Gedminas mar...@gedmin.as wrote:
 On Tue, Feb 04, 2014 at 12:44:09PM -0500, Tres Seaver wrote:
 On 02/04/2014 06:28 AM, Godefroid Chapelle wrote:
  Le 03/02/14 20:53, Tres Seaver a écrit :
  I wish you hadn't pushed that -- some of these changes are definitely
  inappropriate on the 3.10 branch (adding an Acquisition dependency
  is definitely wrong).

Agreed. Note that non-trivial commits to a release branch, like master,
should be via pull request.

 
  Acquisition is added as a test dependency. Any hint how to replicate
  the bug without acquisition is welcome.

 Define a subclass of Persistent which emulates what Acquisition does, e.g.:

   from persistent import Persistent

   class Foo(Persistent):
       @property
       def _p_jar(self):  # or whatever attribute triggers it
           return object()

 What if full replication requires a C extension module?

 (I hope that's not true and that it is possible to reproduce the bug
 using some fakes, but I haven't spent the time investigating this.)

I'm going to dig into this.  I'm baffled by the assertion that this has
anything to do with readCurrent.

Regardless of whether it should have been made to the 3.10 branch,
I'm going to use Godefroid's test case to dig further.



  Which other change is inappropriate?

 Adding MANIFEST.in on a release branch seems wrong to me (I don't like
 them anyway, and we *definitely* don't want to encourage
 install-from-a-github-generated-tarball on a release branch).

 That's like objecting if someone adds a .gitignore to a release branch.
 Or a .travis.yml.  It's not code, it's metadata.

Yup.

 (I never liked setuptool's magic let me query git to see what source
 files you have, but not by default, oh no, instead let's assume
 everybody has installed the non-standard plugin into their system
 Pythons and then let's silently produce broken tarballs if they haven't,
 because obviously implicit is better than explicit, and when there's
 temptation the right thing is to guess behavior anyway, and we
 *definitely* don't want broken sdists on PyPI.)

I couldn't agree more. One of the advantages of moving to
git was circumventing setuptool's misguided magic.

I've no idea what Tres was referring to wrt
install-from-a-github-generated-tarball, but I use MANIFEST.in
files in all my modern projects.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] [zopefoundation/ZODB] 49919d: test for POSKeyError during transaction commit

2014-02-05 Thread Jim Fulton
On Wed, Feb 5, 2014 at 9:23 AM, Godefroid Chapelle got...@bubblenet.be wrote:
 Le 05/02/14 13:25, Jim Fulton a écrit :

 Acquisition is added as a test dependency. Any hint how to replicate
  the bug without acquisition is welcome.

 
 Define a subclass of Persistent which emulates what Acquisition does,
  e.g.:
 
   from persistent import Persistent

   class Foo(Persistent):
       @property
       def _p_jar(self):  # or whatever attribute triggers it
           return object()

 
 What if full replication requires a C extension module?
 
 (I hope that's not true and that it is possible to reproduce the bug
 using some fakes, but I haven't spent the time investigating this.)

 I'm going to dig into this.  I'm baffled by the assertion that this has
 anything to do with readCurrent.


 For sure: the POSKeyError happens during connection.commit when checking
 oids stored in Connection._readCurrent mapping.

 (see traceback at http://rpatterson.net/blog/poskeyerror-during-commit)

 The _readCurrent mapping is populated only by calls to
 Connection.readCurrent method.

 In the Plone code base, the only way I found to get that
 Connection.readCurrent method to be called is by adding a key value pair to
 a BTree.

 _BTree_set C function is then called, which in turn calls readCurrent by
 inlining the PER_READCURRENT macro.

 This calls the cPersistence.c readCurrent function, which in turn calls
 readCurrent method on the ZODB connection.

Wow.  I had to dig a bit to remind myself (vaguely) why I added this.



 When setting a key value pair on a new (not already committed) instance of a
 standard BTree, readCurrent method is not called on the connection.

This is with your change, right?



 My understanding is that it is due to the fact that _p_jar and _p_oid are
 only set during transaction commit.

They can be set earlier by calling the connection add method.

This is used often for frameworks that use object ids at the application level.
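
A minimal sketch (an in-memory database is used for illustration):

  import ZODB
  from BTrees.OOBTree import OOBTree

  db = ZODB.DB(None)  # in-memory storage, for demonstration only
  conn = db.open()
  tree = OOBTree()
  conn.add(tree)  # assigns tree._p_jar and tree._p_oid immediately
  print repr(tree._p_oid)  # already set, before any commit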



 However, with a new BTree instance that also inherits from
 Acquisition.Implicit, readCurrent method is called on ZODB connection when
 setting key value pair. The only explanation I found is that this instance
 _p_jar attribute has a value (acquired in a way or another ?).

You could also simulate this by adding an object to a connection using
a connection's add method.

Wanna update the test to use this technique instead?


 In this case, when readCurrent is called on an object created during a
 savepoint and this savepoint is rolled back, the oid is leftover in the
 Connection._readCurrent mapping. This leads to the POSKeyError when
 committing later as checkCurrentSerialInTransaction cannot check the object
 since it went away at rollback.

 This brings us to the fix I propose: calls to readCurrent should not track
 objects with oid equal to z64.

...

 This was a very long explanation which I hope will help to confirm the fix
 or to come up with a better one.

 PS: keep in mind that english is not my mothertongue.

:) You do very well.

I think your fix is correct.  As you point out, it doesn't make sense to
guard against conflicts on new objects.

I think a cleaner test could be written using the connection add method.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Unpickler.noload, zc.zodbdgc and multi-database reference bug (IndexError)?

2014-01-30 Thread Jim Fulton
On Thu, Jan 30, 2014 at 9:58 AM,  jason.mad...@nextthought.com wrote:
 Hello ZODB dev,

 I was recently trying to GC a large multi-database setup for the first time 
 using zc.zodbdgc. The process wouldn't complete (or really even get started) 
 because of an IndexError being thrown from `zc.zodbdgc.getrefs` (__init__.py 
 line 287). As I traced through it, it began to look like the combination of 
 `cPickle.Unpickler.noload` and multi-database persistent ids (which in ZODB 
 are list objects) fails, generating an empty list instead of the expected 
 [ref type, args] list documented in `ZODB.serialize`. This makes it 
 impossible to correctly GC a multi-database.

 I was curious if anyone else had seen this

I haven't. I'm the author of zodbdgc and I use it regularly, including on
large (for some definition) databases.

 Or maybe I'm just doing something wrong? We solved our problem by using
 `load` instead of `noload`, but I wondered if there might be a better way?

 Details:

 I'm working under Python 2.7.6 and 2.7.3 with ZODB 4.0.0, zc.zodbdgc 0.6.1 
 and eventually zodbpickle 0.5.2. Most of my results were repeated on both Mac 
 OS X and Linux.

Why are you using zodbpickle?  Perhaps that is behaving differently
from cPickle in some fashion?


 After hitting the IndexError, I began debugging the problem. When it became 
 clear that the persistent_load callback was simply getting the wrong 
 persistent ids passed to it (empty lists instead of complete multi-db refs), 
 I tried swapping in zodbpickle for the stock cPickle to the same effect. 
 Here's some code demonstrating the problem:


 This pickle data came right out of ZODB, captured during a debug session of 
 zc.zodbdgc. It has three persistent ids, two cross database and one in the 
 same database:

  p = 
 'cBTrees.OOBTree\nOOBTree\nq\x01.X\x0c\x00\x00\x00Users_1_Prodq\x02]q\x03(U\x01m(U\x0cUsers_1_Prodq\x04U\x08\x00\x00\x00\x00\x00\x00\x00\x01q\x05czope.site.folder\nFolder\nq\x06tq\x07eQX\x0c\x00\x00\x00Users_2_Prodq\x08]q\t(U\x01m(U\x0cUsers_2_Prodq\nU\x08\x00\x00\x00\x00\x00\x00\x00\x01q\x0bh\x06tq\x0ceQX\x0b\x00\x00\x00dataserver2q\r(U\x08\x00\x00\x00\x00\x00\x00\x00\x10q\x0eh\x06tQq\x0f.'

 This code is copy-and-pasted out of zc.zodbdgc getrefs. It's supposed to find
 all the persistent refs and put them inside the `refs` list:

  import cPickle
  import cStringIO
  refs = []
  u = cPickle.Unpickler(cStringIO.StringIO(p))
  u.persistent_load = refs
  u.noload()
  u.noload()

 But if we look at `refs`, we see that the first two cross-database refs are 
 returned as empty lists, not the correct value:

  refs
 [[], [], ('\x00\x00\x00\x00\x00\x00\x00\x10', None)]

 If instead we use `load` to read the state, we get the correct references:

  refs = []
  u = cPickle.Unpickler(cStringIO.StringIO(p))
  u.persistent_load = refs
  u.noload()
  u.load()

  refs
 [['m', ('Users_1_Prod', '\x00\x00\x00\x00\x00\x00\x00\x01', <class
 'zope.site.folder.Folder'>)],
  ['m', ('Users_2_Prod', '\x00\x00\x00\x00\x00\x00\x00\x01', <class
 'zope.site.folder.Folder'>)],
  ('\x00\x00\x00\x00\x00\x00\x00\x10', <class 'zope.site.folder.Folder'>)]

 The results are the same using zodbpickle or using an actual callback 
 function instead of the append-directly-to-list shortcut.

 If we fix the IndexError by checking the size of the list first, we miss all 
 the cross-db references, meaning that a GC is going to be too aggressive. But 
 using `load` is slower and requires access to all of the classes referenced. 
 If anyone has run into this before or has other suggestions, I'd appreciate 
 hearing them.

I'd try using ZODB 3.10.  I suspect a ZODB 4 incompatibility of some sort.

Unfortunately, I don't have time to dig into this now.

This weekend, I'll at least see if I can make zodbdgc tests pass with ZODB 4.
Perhaps that will shed light.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Unpickler.noload, zc.zodbdgc and multi-database reference bug (IndexError)?

2014-01-30 Thread Jim Fulton
On Thu, Jan 30, 2014 at 12:40 PM,  jason.mad...@nextthought.com wrote:

 On Jan 30, 2014, at 11:12, jason.mad...@nextthought.com wrote:

 So it seems that the behaviour of `noload` might have changed between 2.6.x 
 and 2.7.x?

 Apologies for replying to myself, but I think I found the root cause.

 After some further investigation, I found issue 1101399 
 (http://bugs.python.org/issue1101399), complaining that noload is broken for 
 subclasses of dict. The fix for this issue was applied to the cPython trunk 
 in October of 2009 without any corresponding tests 
 (http://hg.python.org/releasing/2.7.6/rev/d0f005e6fadd). In releases with 
 this fix (if I'm reading the code correctly),

Probably because the original code had no tests (because we weren't
doing that then) and no documentation either (my bad), because this
was added for the specific use case of efficiently scraping out
references in ZODB.




 This fix means that multi-database references are always going to be returned 
 as an empty list under noload (again, if I'm reading the code correctly). 
 This means that multi-references and noload don't work under Python 2.7.x or 
 3.x with zodbpickle and so consequently neither does an unmodified zc.zodbdgc.

Sigh.  We're still using Python 2.6 for our database servers. :)

 I don't know what the best way forward is. Our solution to use `load` instead 
 seems to work for us, but may not work for everyone.

It will probably work, but be slower, which isn't a huge deal since
zc.zodbdgc runs out of process. We run it on ZRS secondaries, which are
otherwise idle.

 Maybe zodbpickle could revert the fix in its branch and zc.zodbdgc could
 depend on that? I'm happy to help test other ideas.

That may be the best way forward.

I can't speak with much confidence until I've had a chance to
wade in and refresh my memory on this stuff.

Something I wish I'd done differently in ZODB, and have contemplated
changing on a number of occasions, is the handling of references. I wish
I'd stored them outside the pickle so they could be analyzed without
unpickling (or at least without unpickling the application data).

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Optimizing RelStorage Packing for large DBs

2013-11-18 Thread Jim Fulton
On Fri, Nov 15, 2013 at 8:01 PM, Jens W. Klein j...@bluedynamics.com wrote:
 I started a new packing script for Relstorage (history free, postgresql). It
 is based on incoming reference counting.

Did you look at zc.zodbdgc?  I think it implements something very close to
what you're proposing.  It's been in production for a few years now at ZC.

Not sure if it would need to be updated for relstorage.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Optimizing RelStorage Packing for large DBs

2013-11-18 Thread Jim Fulton
On Mon, Nov 18, 2013 at 8:43 AM, Jan-Wijbrand Kolman
janwijbr...@gmail.com wrote:
 On 11/18/13 12:19 PM, Jim Fulton wrote:

 On Fri, Nov 15, 2013 at 8:01 PM, Jens W. Klein j...@bluedynamics.com
 wrote:

 I started a new packing script for Relstorage (history free, postgresql).
 It
 is based on incoming reference counting.


 Did you look at zc.zodbdgc?  I think it implements something very close to
 what you're proposing.  It's been in production for a few years now at ZC.

 Not sure if it would need to be updated for relstorage.


 AFAICT it does not work against a relstorage backend. Or at least that is
 what I understand from:

 http://www.zodb.org/en/latest/documentation/articles/multi-zodb-gc.html

 [...This documentation does not apply to RelStorage which has the same
 features built-in, but accessible in different ways. Look at the options for
 the zodbpack script. The –prepack option creates a table containing the same
 information as we are creating in the reference database[...]

I didn't write that.  I think zodbdgc probably would work, possibly
with some modifications.  If nothing else, it should be consulted, but
then again, writing software is fun.

Note that the important aspect here isn't cross-database references,
but the garbage collection algorithm, which is incremental and uses a
linear scan of the database.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Optimizing RelStorage Packing for large DBs

2013-11-18 Thread Jim Fulton
On Mon, Nov 18, 2013 at 12:00 PM, Jens W. Klein j...@bluedynamics.com wrote:
 Hi Jim,

 thanks for the hint (also in the other post). I looked at zc.zodbdgc and
 took some inspiration from it. As far as I understand it stores the incoming
 references in a separate filestorage backend.

This is just a temporary file to avoid storing all the data in memory.

 So this works similarly to my
 implementation but uses the ZODB infrastructure. I don't see how to make
 zc.zodbdgc play with Relstorage, and since it works on the abstracted ZODB
 level using pickles

I don't know what you're saying, since I don't know what it refers to.

zodbdgc works with storages.  relstorage conforms to the storage API.
It's possible some changes would be needed, but they should be minor.

 I suspected it would not be fast enough for so many objects

No idea why, or what "fast enough" is.  We use it on a database with
~200 million objects.

 - so I skipped this alternative.

Good luck.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Relstorage and over growing database.

2013-11-11 Thread Jim Fulton
On Mon, Nov 11, 2013 at 4:24 PM, Daniel Widerin dan...@widerin.net wrote:
 Hi, just want to share our experience:

 My ZODB contains 300 million objects on relstorage/pgsql. The amount of
 objects is caused by btrees stored on plone dexterity contenttypes. Its
 size is 160GB. At that size it's impossible to pack because the pre-pack
 takes 100 days.

 jensens and me are searching for different packing algorithms and
 methods to achieve better packing performance. We're keeping you updated
 here!

 How i solved my problem for now:

 I converted into FileStorage, which took about 40 hours, and the Data.fs was
 55GB in size. Now I tried to run zeopack on that database - which
 succeeded, and the database was reduced to 7.8 GB - still containing 40 million
 objects. After that I migrated back to relstorage because of better
 performance, and the result is an 11 GB db in pgsql.

Hah. Nice.  Have you measured an improvement in relstorage performance
in practice? Is it enough to justify this hassle?
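
The conversion itself can be done with the standard
copyTransactionsFrom API; a minimal sketch (the RelStorage setup is
elided):

  from ZODB.FileStorage import FileStorage

  def convert_to_filestorage(src, path):
      # src: an open source storage (e.g. RelStorage); path: new Data.fs
      dst = FileStorage(path)
      dst.copyTransactionsFrom(src)  # replays every transaction in order
      dst.close()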

WRT packing algorithms:

- You might look at zc.FileStorage which takes a slightly different approach
  than FileStorage:

  - Does most of the packing work in a separate process to avoid the GIL.

  - Doesn't do GC.

  - Has some other optimizations I don't recall.  For our large databases,
it's much faster than normal file-storage packing.

- Consider separating garbage collection and packing.  This allows
  garbage collection to be run mostly against a replica and to be spread
  out, if necessary.  Look at zc.zodbdgc.

 Anyone experienced similar problems packing large relstorage databases?
 The graph traversal takes a really long time. maybe we can improve that
 by storing additional information in the relational database?

 Any hints or comments are welcome.

Definitely look at zodbdgc.  It doesn't traverse the graph. It essentially
does reference counting and is able to iterate over the database, which for
FileStorage is relatively quick.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] ZODB.FileStorage.format: TxnHeader cannot handle Unicode 'descr'

2013-10-07 Thread Jim Fulton
On Mon, Oct 7, 2013 at 11:58 AM, Tres Seaver tsea...@palladion.com wrote:
...
 transaction.note is defined to take a bytes string.  Pyramid should
 encode the path before passing it to transaction.note.

 The interface says text.  I realize that this is likely for
 hysterical raisins, but if we mean bytes, we should say so.

 Note that the implementation's use of an unadorned string literal to join
 the values means that in Py3k, it really *is* text, and not bytes.  If we
 want the application to do the encoding, then we should change that
 literal as well.

Agreed.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] ZODB.FileStorage.format: TxnHeader cannot handle Unicode 'descr'

2013-10-05 Thread Jim Fulton
On Sat, Oct 5, 2013 at 1:47 AM, Chao Meng bobom...@gmail.com wrote:
 Hi there,

 Here is an issue from zopefoundation/ZODB github three months ago. I am
 facing a similar issue now and don't know how to resolve it.

 github issue link: https://github.com/zopefoundation/ZODB/issues/12

 When I use Pyramid+ZODB+traversal, I use some Chinese characters in the URL.
 Note that my resource tree is saved in ZODB fine, with unicode object names
 for the Chinese characters.

 Basically, when saving a transaction, ZODB.FileStorage.format TxnHeader uses
 request.path_info as its descr, which is unicode, but TxnHeader cannot
 handle Unicode :(

 It would be great if anyone can help or give some pointers.

This is a Pyramid bug.

transaction.note is defined to take a bytes string.  Pyramid
should encode the path before passing it to transaction.note.

Alternatively, Pyramid could store the path in transaction
extended info, which accepts any picklable type.
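
A minimal sketch of both options (the path value is illustrative):

  import transaction

  path_info = u'/some/unicode/path'
  t = transaction.get()
  t.note(path_info.encode('utf-8'))  # pass bytes, not unicode
  # or stash it in extended info, which takes any picklable value:
  t.setExtendedInfo('path_info', path_info)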

Of course, we could revisit this.  If we did, I'd deprecate the
transaction user and description attributes and only support
metadata via the extended info mechanism, which I'd rename
to "metadata".

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


[ZODB-Dev] Backward-incompatible change in TimeStamp repr in ZODB 4.

2013-09-29 Thread Jim Fulton
TimeStamps are used mainly to convert between high-level
date-time representations and 8-byte binary transaction IDs.
Historically, a TimeStamp's repr was used to retrieve the binary
representation.  repr was used because, under the hood, slots
are much faster to access than methods.

In ZODB 3.3, a TimeStamp raw method was added to retrieve
the binary data for a time stamp.  I wasn't actually aware of this
until recently.  From an API point of view, using raw rather than
repr is cleaner.  I don't know if the performance implications
are significant, though probably not.

In Python 3, __repr__ returns Unicode, rather than binary data,
so it's no longer possible to use it to get the binary representation
of a time stamp.  TimeStamp's __repr__ was changed to
return the repr of its binary data.  Python 3 was going to have
to be broken, but this was also a breaking change for Python 2.

I don't remember this issue being raised earlier. If so, I missed it.

In any case, going forward, it's best to embrace raw() as the correct
way to get the binary representation of time stamps.  This is
mainly a heads up for people porting to ZODB 4.
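
A minimal sketch of the two spellings:

  from ZODB.TimeStamp import TimeStamp

  ts = TimeStamp(2013, 9, 29, 12, 0, 0)
  raw = ts.raw()  # the 8-byte binary form, on ZODB 3.3 and later
  # repr(ts) returned the same bytes on ZODB 3 under Python 2,
  # but no longer does on ZODB 4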

I don't really see much value in returning the repr of the binary
data.  I'd at least wrap the string in a TimeStamp call, something
like TimeStamp(b'...').

I'd hoped that ZODB 4.0 would be backward compatible with
ZODB3.  That's why ZODB3 3.11 is a meta package that
requires the ZODB 4 packages.  Unfortunately, this means
that ZODB3 3.11 isn't backward compatible.  Fortunately,
ZODB 3.11.0 is still in alpha. :)

I think the best option is to release what is currently
ZODB3 3.11.0a3 as ZODB3 4.0.0.  This will allow
packages that depend on ZODB3 to be used with
ZODB 4, but it will clearly label the ZODB3 4.0.0 release
as not backward compatible.

Another option is to leave things as they are.
Since buildout 2, and now pip, prefer final releases,
existing applications that use current buildout or pip
aren't broken by ZODB3 3.11a3, even if they don't
pin versions.  If someone wants to mix ZODB3 and
ZODB 4, they can explicitly require ZODB3 3.11a3.

Thoughts?

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] https://github.com/zopefoundation/zc.beforestorage

2013-09-27 Thread Jim Fulton
On Fri, Sep 27, 2013 at 12:35 PM, Christian Tismer tis...@stackless.com wrote:
 Hi,

 I saw myself subscribed to zc.beforestorage today.

Congratulations! ;)

 If I'm not mistaken, versions are no longer supported in 4.0, or
 is this still a supported approach?

versions were removed in 3.9.

 I think history should not depend on having pack()'ed or not,
 but an explicit snapshot feature that puts a set of objects
 into some history object.

<shrug>  That would be a new feature.  One I've even contemplated,
sort of, in a notional FileStorage2 design that stores data in a
sequence of files, which would allow point-in-time snapshots.

 Has that been discussed, and can someone please point me at it?

Undoubtedly, but I can't think of an instance in particular.

beforestorage takes advantage of the fact that most ZODB storage
implementations keep a limited sequence of transactions to provide
a limited form of time travel and, most importantly, to provide a temporary
snapshot of a database that's being written, mainly for use with
DemoStorage.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


[ZODB-Dev] ZODB 4.0.0 and ZEO 4.0.0 Released

2013-09-18 Thread Jim Fulton
Hopefully, we can increase our development tempo a bit now that we
have this base to build on.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] polite advice request

2013-08-18 Thread Jim Fulton
On Sun, Aug 18, 2013 at 12:17 PM, Christian Tismer tis...@stackless.com wrote:
...
 We get a medication prescription database in a certain serialized format
 which is standard in Germany for all pharmacy support companies.

 This database comes in ~25 files == tables in a zip file every two weeks.
 The DB is actually a structured set of SQL tables with references et al.

So you get an entire database snapshot every 2 weeks?

 I actually did not want to change the design and simply created the table
 structure that they have, using ZODB, with tables as btrees that contain
 tuples for the records, so this is basically the SQL model, mimicked in
 Zodb.

OK.  I don't see what advantage you hope to get from ZODB.

 What is bothersome is the fact that the database gets incremental updates all
 the time:
 changed prices, packing info, etc.

Are these just data updates? Or schema updates too?

 We need to cope with millions of recipes that come from certain dates
 and therefore need to inquire different versions of the database.

I don't understand this. What's a recipe?  Why do you need to
consider old versions of the database?

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] polite advice request

2013-08-18 Thread Jim Fulton
On Sun, Aug 18, 2013 at 1:40 PM, [mabe] pub...@enkore.de wrote:
 He meant prescription.

 In German, Rezept is the word for both prescription and recipe (as in
 cooking). Easy to confuse for us Germans in English :)

Great.  Now I don't know what he meant by prescription. :) Does it
matter?  Might it as easily be foos and bars?

Christian,

Are you saying that you might need to access items
from an old database that aren't in the current snapshot?

Jim



 On 08/18/2013 06:34 PM, Jim Fulton wrote:
 On Sun, Aug 18, 2013 at 12:17 PM, Christian Tismer
 tis...@stackless.com wrote:
 We need to cope with millions of recipes that come from certain
 dates and therefore need to inquire different versions of the
 database.

 I don't understand this. What's a recipe?  Why do you need to
 consider old versions of the database?


 -BEGIN PGP SIGNATURE-
 Version: GnuPG v2.0.20 (GNU/Linux)

 iQIcBAEBAgAGBQJSEQb/AAoJEAOmTcUxK/swEXgP/Ry3x9Y98wp43e2F2cf2063O
 F2UGRNZfylMjG3kTBLfwW9eH5KWk7AmCXdzUw/fXggueyg0NrH9f8aScYVPYHSEp
 g3q9n/I93DrMdDakqLXcnpHlKuUrd1ZfBk+XSyavvnOdV4LWGJ6+Wd8yqAFmUUCl
 bn//STvajUqSpO1+nG0aQsSceeTCVTEuyzQ/O4nSujhERG2ED7XOwi/1WwgruZSY
 2ZGZCeLmHHLgYg6G8zPDRX6q/Y0GYLGi2bCQ0aQWlHEkBJBtPgCWn3rG+9GBlNXv
 bSXu0yjbaHL3q8VvdwAh4Y7n8E9TV1KVojOJmCg6MOA+AusL475Lao2/yBtZG3s3
 mg12/NSUY/hGGoqtnsvXkIV8+ggK7WVlZRDzAoiHymR/3kdNO4MWYxFcvjCrvu8x
 RB6gIsVLglWKu5cuCJDrK7eGmdVK/y0Tmtl2qGKNnn+PJrZqNB9rk2kfmPMVIBdy
 VkFjvBQICL3aFZjSEDeqOeLdis221V9y3ndgKer6K5OG2KBNsv8dUX2smb7Qx7RT
 dbhhXwhI3C9i7ifzDEcrUavUfJCDQNLQovo1F/sL5hChFJAFS6USeWALt7B41YBu
 lN5ThjgIhkuyWfhs+ZAPeze5rRcY5lt+3oWLcD9fav+jJsifGodBdLrJ2dbljtWw
 4FJBrKq/+ULC03toajwM
 =A/VY
 -END PGP SIGNATURE-
 ___
 For more information about ZODB, see http://zodb.org/

 ZODB-Dev mailing list  -  ZODB-Dev@zope.org
 https://mail.zope.org/mailman/listinfo/zodb-dev



-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] when to use connection close/sync

2013-08-16 Thread Jim Fulton
On Fri, Aug 16, 2013 at 9:38 AM, Joerg Baach li...@baach.de wrote:
 Hi *,

 after having looked around on google and the zodb api documentation[1] I
 am still unsure how to handle connections properly (especially in the
 context of syncing zeo clients):

 When do I need to:

 - connection.close()

When you're done using a connection. :)

 - connection.sync()

Never.  This is obsolete.

I suspect you aren't asking the right question.

I'm gonna guess you're asking:

What do I need to do to see database updates (made by other
clients/connections)?

The answer to that question is:

To see database changes, you need to start a new transaction.

You always see data as of the time the current transaction started.
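
A minimal sketch (db is assumed to be an open ZODB.DB shared with other
clients):

  import transaction

  conn = db.open()
  root = conn.root()
  # ... another client commits changes here ...
  transaction.begin()  # start a new transaction; new loads see them
  print list(root.keys())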

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] RFC: database ids

2013-08-15 Thread Jim Fulton
On Wed, Aug 14, 2013 at 6:49 PM, Vincent Pelletier plr.vinc...@gmail.com wrote:

 On 15 August 2013 at 00:09, Jim Fulton j...@zope.com wrote:
  Comments?

 Please make database ID reachable where _p_oid is reachable (maybe on
 _p_jar, I don't mind a few attribute lookup levels/trivial calls).

Good idea.  ob._p_jar.db().id

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


[ZODB-Dev] RFC: database ids

2013-08-14 Thread Jim Fulton
When using a database server (ZEO, relstorage), you can make a
configuration error that causes you to connect to the wrong database.
This can be especially painful in a situation where you get
disconnected from the server and reconnect to an incorrect server
and end up with objects from separate databases in the same cache.
This happened to us (ZC) once when we fat-fingered a ZRS database
fail-over.

ZEO currently defends against this by refusing to connect to a server
if the server's last transaction ID is less than the last transaction
ID the client has seen.  This has a couple of problems:

- The test is too weak.

- It makes fail-over to a slightly out of date secondary storage quite
  painful.

I propose to add a database identifier that clients can verify.

- To minimize impact to storage implementations, the database
  identifier will be stored under the ZODB_DATABASE_ID key of object 0
  (root object).  The key will be added on database open if it is
  absent. The value will be a configured value, or a UUID.

- If a ZEO client is configured with a database identifier, then it
  will refuse to connect to a database without a matching identifier.

- If a ZEO client is *not* configured with a database identifier, it
  will configure itself with the identifier of the first server it
  connects to, saving the information in the ZEO cache.  This will at
  least protect against reconnect to the wrong server.

- A ZEO client can *optionally* be configured to discard cache if it
  (re)connects to a server with a last transaction lower than the last
  one the client has seen as long as the database ID matches.

- ZRS secondaries will also check database ids when (re)connecting to
  primaries.
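
A rough sketch of the root-object bookkeeping (the ZODB_DATABASE_ID key
is from the proposal above; the helper itself is illustrative only):

  import uuid
  import transaction

  def ensure_database_id(db, configured_id=None):
      # store (or fetch) the identifier under the root object on open
      conn = db.open()
      try:
          root = conn.root()
          if 'ZODB_DATABASE_ID' not in root:
              root['ZODB_DATABASE_ID'] = configured_id or uuid.uuid4().hex
              transaction.commit()
          return root['ZODB_DATABASE_ID']
      finally:
          conn.close()

  # a client configured with an expected id could then verify:
  #     assert ensure_database_id(db) == expected_id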

Comments?

Jim

--
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Ackward PersistentList read Performance

2013-08-13 Thread Jim Fulton
On Tue, Aug 13, 2013 at 9:40 AM, Joerg Baach li...@baach.de wrote:
 Hi *,

 I was trying to measure the impact of using different kind of objects to
 store data in ZODB (disk, ram, time).

 What's really awkward is the measurement for reading content from
 PersistentLists (that are stored in an IOBTree):

 case a
 ==
 from BTrees.IOBTree import IOBTree
 from persistent.list import PersistentList

 g.edges = IOBTree()        # g is an existing persistent object
 for j in range(1, 100):
     edge = PersistentList([j, 1, 2, {}])
     g.edges[j] = edge

 x = list(g.edges.values())
 y = [e[3] for e in x]   # this takes 30 seconds

 case b
 ==
 g.edges = IOBTree()
 for j in range(1, 100):
     edge = [j, 1, 2, {}]
     g.edges[j] = edge

 x = list(g.edges.values())
 y = [e[3] for e in x]   # this takes 0.09 seconds

 So, can it really be that using a PersistentList is 300 times slower?

Yes.  This would be true of *any* persistent object. In the first
case, you're creating 100+B database objects, where B is ~20.
In the second case, you're creating B persistent objects.

Depending on what you do between cases A and B, you may also
have to load 100+B vs B objects.

 Am
 I doing something completely wrong,

It depends on your application.  Generally, one uses a BTree to avoid
loading a large collection into memory.  Iterating over the whole
thing defeats that.
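
For example, a range search over a BTree loads only the buckets it
touches (a sketch, reusing the ``g.edges`` IOBTree from the code above):

  # visit keys 100-199 without loading the rest of the tree
  for j, edge in g.edges.items(min=100, max=199):
      handle(edge)   # handle() is a placeholder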

Deciding whether to use a few large database objects or many small
ones is a tradeoff between efficiency of access and efficiency of
update, depending on access patterns.

 or am I missing something?

Possibly

 I am using ZODB3-3.10.5. The whole setup (incl. results) is at
 https://github.com/jhb/zodbtime

tl;dr

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] BTrees package problems

2013-07-24 Thread Jim Fulton
On Tue, Jul 23, 2013 at 9:12 PM, Christian Tismer tis...@stackless.com wrote:
...
 What I'm after is a way to over-ride the implementation by user code.
 I did not yet check if this is implemented already, in the Python way of
 sub-classing built-ins.

BTrees (somewhat to my surprise) do support subclassing, although
you'll need to write custom __getstate__ and __setstate__ methods to
handle both the BTree data and instance data.  It would be better if
the methods provided by the BTree classes handled this automatically
(by checking for an instance dictionary or slots).

Also, the BTree implementation isn't informed by user-provided
attributes.  It would be better if it was.  For example, I'd like
bucket and internal node sizes to be controllable via instance
attributes (typically defined in a subclass).
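
For illustration, here's roughly what such a subclass has to do today
(a sketch only; the base-class state format is opaque to the subclass):

  from BTrees.OOBTree import OOBTree

  class AnnotatedBTree(OOBTree):

      def __init__(self, label=''):
          OOBTree.__init__(self)
          self.label = label   # instance data alongside the BTree data

      def __getstate__(self):
          # combine the opaque base BTree state with our instance dict
          return OOBTree.__getstate__(self), self.__dict__.copy()

      def __setstate__(self, state):
          tree_state, attrs = state
          OOBTree.__setstate__(self, tree_state)
          self.__dict__.update(attrs)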

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] BTrees package problems

2013-07-23 Thread Jim Fulton
On Mon, Jul 22, 2013 at 9:06 PM, Christian Tismer tis...@stackless.com wrote:
...
 Actually, I would like to add a callable-check instead, to allow for more
 flexible derivatives.

 I don't understand this.


 Simple: I am writing BTree forests for versioned, read-only databases.

 For that, I need a way to create a version of Bucket that allows to
 override the _next field by maybe a callable.
 Otherwise all the buckets are chained together and I have no way
 to let frozen BTrees share buckets.

In retrospect, it might make more sense to do the chaining a level up.
Buckets themselves don't care about chaining. The tree wants buckets
to be chained to support iteration.  I'm not really sure if that helps your
use case.

 When I played with the structure, I was happy/astonished to see the _next
 field being writable and thought it was intended to be so.
 It was not, in the end ;-)

It's clearly a bug.  The code has a comment right above the attribute
definition stating that it's (supposed to be) read-only, but the
implementation makes it writable.

There doesn't seem to be anything that depends on writing this attribute.
I verified this by adding a fix and running the tests (in 3.10).

For what you're trying to do, I suspect you want to fork BTrees, or start
over.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] BTrees package problems

2013-07-22 Thread Jim Fulton
On Sat, Jul 20, 2013 at 11:27 PM, Christian Tismer tis...@stackless.com wrote:
 The BTrees package is an attempt to isolate certain things from ZODB.

 While I appreciate the general intent, I cannot see the advantage at
 this point:

 - BTrees can be imported alone, yes. But it has the extensions prepared
with special ZODB slots, which makes this very questionable.

 - BTrees furthermore claims the BTrees global name for itself, although it
   is not a general BTree package, but for ZODB BTrees only.

Yeah, I worried about this when we broke it out.

OTOH, there isn't much concern with namespace
pollution in the Python community. :/

 - BTrees has a serious bug, see the following example:

  >>> from BTrees import OOBTree as BT
  >>> t = BT.BTree()
  >>> for num in range(100):
  ...   k = str(num)
  ...   t[k] = k
  ...
  >>> t._firstbucket._next = None
  >>> len(t)
 Bus error: 10
 (tmp)minimax:doc tismer$

Ouch.


 So there is either an omission to make t._next read-only, or a check
 of its validity is missing.

Yup.  OTOH, you're the first person to encounter this
after many years, so while this is bad, and needs to be
fixed, I'm not sure how serious it is as a practical matter.

 Actually, I would like to add a callable-check instead, to allow for more
 flexible derivatives.

I don't understand this.


 * this was my second little rant about ZODB. Not finished as it seems.

 please, see this again as my kraut way of showing interest in improving
 very good things.

:)

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] BTrees and ZODB simplicity

2013-07-22 Thread Jim Fulton
On Sat, Jul 20, 2013 at 11:43 PM, Christian Tismer tis...@stackless.com wrote:
 Third rant, dear Zope-Friends (and I mean it as friends!).

 In an attempt to make the ZODB a small, independant package, ZODB
 has been split into many modules.

Maybe not as many as you think:
persistent, transaction, ZEO, ZODB and BTrees.

5 shrug


 I appreciate that, while I think it partially has the opposite effect:

 - splitting BTrees apart is a good idea per se.
   But the way it is, it adds more namespace pollution than benefit:

    To make sense of BTrees, you need the ZODB, and only the ZODB!
    So why, then, should BTrees be a top-level module at all?

This does not feel natural; it is pretending to be something it is not.

 I think:

  - BTrees should either be a ZODB sub-package in its current state,

  - or a real stand-alone package with some way of adding persistence as
an option.

I don't agree that because a package depends on ZODB
it should be in ZODB.  There are lots of packages that depend
on ZODB.

I agree with your sentiments about namespace pollution.
You and I may be the only ones that care though. ;)

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] make ZODB as small and compact as expected

2013-07-22 Thread Jim Fulton
On Sun, Jul 21, 2013 at 12:12 AM, Christian Tismer tis...@stackless.com wrote:
 This is my last emission for tonight.

 I would be using ZODB as a nice little package if it was one.

 There should be nothing else but

 ZODB.some_package

 Instead, there is

 BTrees
 persistent
 transaction
 zc.lockfile
 zc.zlibstorage
 ZConfig
 zdaemon
 ZEO
 ZODB
 ZODB3   (zlibstorage)
 zope.interface

 and what I might have forgotton.

 Exception:
 There is also
 zodbpickle
 which I think is very useful and general-purpose, and I want to keep it;
 I will also try to push it into standard CPython.

 So, while all the packages are not really large, there are too many
 namespaces touched, and things like Zope Enterprise Objects are not meant
 to be here as open-source-pretending modules which the user never asked for.

Despite its tech-bubblish acronym expansion, which
few people are aware of, ZEO is the standard client-server
component of ZODB, is widely used, and is certainly open source.


 I think these things could be re-packed into a common namespace
 and be made simpler.

If ZODB had been born much later, it would certainly have used
a namespace package.  Now, it would be fairly disruptive to change
it.

 Even zope.interface could be removed from
 this intended-to-be user-friendly simple package.

I don't understand what you're saying.  It's a dependency
of ZODB.

 So while the amount of code is astonishingly small, the amount of
 abstraction layering tells the reader that this was never really meant to
 be small.

 And this makes average, simple-minded users like me shy away and go
 back to simpler modules like Durus.

 But the latter has serious other pitfalls, which made me want to re-package
 ZODB into a small, pretty, tool-ish, versatile thing for the pocket.

 Actually I'm trying to re-map ZOPE to the simplistic Durus interface,
 without its shortcomings and lack of support.
 I think a successfully down-scaled, isolated package with ZODB's
 great implementation, but a more user-oriented interface, would
 help ZODB a lot to get widely accepted and incorporated into very
 many projects.
 Right now people are just too concerned about implicit complication which
 actually does not exist.

 I volunteer to start such a project. Proposing the name david, as opposed
 to goliath.

ZODB is an old project that has accumulated some cruft over the years,
however:

- I've tried to simplify it and, with the exception of ZEO,
  I think it's pretty straightforward.

- ZODB is used by a lot of people with varying
  needs and tastes.  The fact that it is pretty modular has
  allowed a lot of useful customizations.

- I'm pretty happy with the layered storage architecture.

- With modern package installation tools like buildout and pip,
  having lots of dependencies shouldn't be a problem.
  ZODB uses lots of packages that have uses outside of ZODB.
  I consider this a strength, not a weakness.

  Honestly, I have no interest in catering to users who don't use
  buildout, or pip, or easy_install.

- The biggest thing ZODB needs right now is documentation.
  Unfortunately, this isn't easy. There is zodb.org,
  but much better documentation is needed.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] zc.zlibstorage missing from zodb package

2013-07-22 Thread Jim Fulton
On Sat, Jul 20, 2013 at 11:09 PM, Christian Tismer tis...@stackless.com wrote:
 Hi friends,

 I'm trying to work with ZODB. (!)

Cool.


 Coming from durus development since a couple of weeks, I am
 spoiled by simplicity.

 Actually, I'm annoyed by durus' incapability to accept patches,
 so I'm considering to put my efforts into ZODB.

 On the other hand, ZODB tries to become small and non-intrusive,
 but looking at its imports, this is still not a small package, and I'm
 annoyed by this package as well.

 - missing

the zc.zlibstorage module is missing, IMHO.

I don't understand this statement.

besides that, zc.zlibstorage has not been maintained for quite a while
and imports ZOPE3.

It's still maintained, but hasn't required maintenance in some
time.

This is nonsensical.  It depends on ZODB and zope.interface
(and zope.testing and manuel for tests).

...

 - discussion

    zc.zlibstorage requires a wrapper to add it to filestorage.
I consider this an option, instead, and a simple boolean flag to switch
it on and off.
    The module is way too simple to warrant all this extra config
    complication.

The layered storage architecture made it very easy and low risk
to add this capability.  Further, some have suggested that we
should use different compression schemes.  Making this pluggable
makes it more flexible.

Having said that though, I agree that compression is something
people almost always want, and I can understand your desire to
make it simpler.
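
For reference, the wrapper layering is only a couple of lines (a sketch;
the ZlibStorage name is zc.zlibstorage's wrapper class, 'Data.fs' is
illustrative):

  import ZODB
  import ZODB.FileStorage
  import zc.zlibstorage

  # layer the compressing wrapper over an ordinary file storage
  base = ZODB.FileStorage.FileStorage('Data.fs')
  storage = zc.zlibstorage.ZlibStorage(base)
  db = ZODB.DB(storage)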

 - proposal:
let me integrate that with ZODB and add a config option, instead of
a wrapper.

I don't know what you mean by integrate.  I suggest, if you want
to make it simpler is to provide new ZConfig tags or Python factories
that make configuration simpler the way you'd like, but that do so
by assembling layers under the hood.

...

 Meant in a friendly, collaborative sense -- Chris

Much appreciated.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] make ZODB as small and compact as expected

2013-07-22 Thread Jim Fulton
On Mon, Jul 22, 2013 at 9:15 AM, Stephan Richter
stephan.rich...@gmail.com wrote:
 On Sunday, July 21, 2013 06:12:34 AM Christian Tismer wrote:
...
  ZConfig

 In my opinion this is a relic from the times before configparser existed.

IMO, ZConfig is very useful in some specific cases, especially ZODB and logging
configuration.

 It is
 also used by other projects outside of ZODB.

  ZEO

 This is separate for historical reasons. I agree it could be merged into the
 ZODB project these days.

It was separate a long time ago.  It's been part of the ZODB distribution
for a long time until recently.

It makes sense for it to be optional, as it's of no interest to people who
use relstorage.

More importantly, it's more complicated than any other part of ZODB and it
makes a lot of sense for ZODB development to be unburdened of it.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] BTrees and ZODB simplicity

2013-07-22 Thread Jim Fulton
On Mon, Jul 22, 2013 at 8:11 AM, Christian Tismer tis...@stackless.com wrote:
 On 22.07.13 13:13, Jim Fulton wrote:

 On Sat, Jul 20, 2013 at 11:43 PM, Christian Tismer tis...@stackless.com
 wrote:

 Third rant, dear Zope-Friends (and I mean it as friends!).

 In an attempt to make the ZODB a small, independant package, ZODB
 has been split into many modules.

 Maybe not as many as you think:
 persistent, transaction, ZEO, ZODB and BTrees.

 5 shrug

 I appreciate that, while I think it partially has the opposite effect:

  - splitting BTrees apart is a good idea per se.
  But the way it is, it adds more namespace pollution than benefit:

  To make sense of BTrees, you need the ZODB, and only the ZODB!
  So why, then, should BTrees be a top-level module at all?

  This does not feel natural; it is pretending to be something it is not.

 I think:

   - BTrees should either be a ZODB sub-package in its current state,

   - or a real stand-alone package with some way of adding persistence as
 an option.

 I don't agree that because a package depends on ZODB
 it should be in ZODB.  There are lots of packages that depend
 on ZODB.


 This is generally true. In the case of BTrees, I think the ZODB
 is nothing without BTrees, and BTrees make no sense without
 a storage and carry those _p_attributes which are not optional.

This is true of every class that subclasses Persistent.


 BTrees would make more sense as a standalone package if the persistence
 model were pluggable. But that is also theoretical because I don't see
 right now how to split that further with all the C code.

Well, it's definitely possible.  Early in the evolution of BTrees, there were
ifdefs that turned off dependence on persistence.

But even with the dependence on Persistent, they're still perfectly usable
without storing them in a database.  Their use is just a lot more compelling
in the presence of a database.

...

 I agree with your sentiments about namespace pollution.
 You and I may be the only ones that care though .3 ;).


 Yay, actually I care mainly because just trying 'pip install ZODB'
 spreads out n folders in my site-packages, and 'pip uninstall ZODB' leaves
 n-1 of them to pick out by hand. That's why I want things nicely grouped ;-)

shrug

Maybe you should use virtualenv or buildout so as to leave your site-packages
alone.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Replication for the Zope Object Database

2013-05-28 Thread Jim Fulton
On Mon, May 27, 2013 at 8:42 PM, Carlos de la Guardia
carlos.delaguar...@gmail.com wrote:
 Hey Jim,

 great news!

 pypi link is wrong, should be: https://pypi.python.org/pypi/zc.zrs

Right you are. Thanks.


 Year for changes in pypi page is shown as 2015.

Oops. Fixed on PyPI and in future releases.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


[ZODB-Dev] Replication for the Zope Object Database

2013-05-27 Thread Jim Fulton
I'm happy to announce that we've released ZRS, a replication framework
for ZODB, as open source.

ZRS provides primary/secondary replication for ZODB File Storages,
typically as part of ZEO (ZODB client-server) servers.

To learn more, see:

  http://www.zope.com/products/x1752814276/Zope-Replication-Services

and:

  https://pypi.python.org/zc.zrs

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] RFC: Python2 - Py3k database compatibility

2013-05-10 Thread Jim Fulton
On Fri, May 10, 2013 at 5:04 PM, Tres Seaver tsea...@palladion.com wrote:
 On 05/08/2013 12:34 PM, Tres Seaver wrote:
 On 04/29/2013 08:37 PM, Stephan Richter wrote:
 Well, that's the py3 branch. As Tres mentioned, zodbpickle is ready
 for Py3 with noload() support. I totally agree that we do not need
 to solve any of the transition work now.

 So for ZODB Py3 support we need to:

 1. Merge the py3 branch into trunk. 2. Simplify zodbpickle to just
 contain the cPickle code that is Py3 compatible.

 I do not care whether this happens for ZODB 4.0 or 4.1 as long as I
 get some commitment that 4.1

 Chris and I chatted with Jim about this over beers last Friday.  I
 explained that the current 'py3; branch does not require the
 'zodbpickle everywhere' stuff (the Python2 side doesn't use
 'zodbpickle').  Jim then agreed that we could merge that branch before
 releasing 4.0.  We will need to add some caveats to the docs /
 changelog (Python3 support is only for new applications, no forward- /
 backward-compatibility for data, etc.)

 Given that ZODB won't import or use 'zodbpickle' under Python2, I
 don't think we need to remove the current Python2 support (as released
 in 0.4.1):  the Python3 version (with noload()) has been there all
 along.


 I have merged the 'py3' branch to 'master':

 -  All tests pass under all four platforms using buildout.

 -  All unit tests pass on all four platforms using 'setup.py test'.

 I added the following note to the changelog:

ZODB 4.0.x is supported on Python 3.x for *new* applications only.
Due to changes in the standard library's pickle support, the Python3
support does **not** provide forward- or backward-compatibility
at the data level with Python2.  A future version of ZODB may add
such support.

    Applications which need to migrate data from Python2 to Python3 should
    plan to script this migration using separate databases, e.g. via a
    dump-and-reload approach, or by providing explicit fix-ups of the
    pickled values as transactions are copied between storages.

 I pushed out a ZODB 4.0.0b1 release after the merge.  If the buildbots
 stay green over the weekend, I think we can release a 4.0.0 final early
 next week.

Great, thanks!

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] RFC: Python2 - Py3k database compatibility

2013-04-29 Thread Jim Fulton
On Sun, Apr 28, 2013 at 8:34 PM, Stephan Richter
stephan.rich...@gmail.com wrote:
 On Sunday, April 28, 2013 07:23:12 PM Jim Fulton wrote:
 Can ZODB 4 be used now without zodbpickle?

 No, unfortunately for Py2 we need the custom cPickle and for Py3 `noload()`
 support (as Tres mentioned).

This is a problem.

The only change in ZODB 4.0 was supposed to be the breakup.

This was supposed to be a low-risk release.  The separation into
multiple packages was supposed to increase agility, but now it
appears we're stuck.

I'd like there to be a stable 4.0 release **soon**
that doesn't use zodbpickle for Python 2.

For now, I suggest we focus on stability and the ability to make progress
on non-Python-3-related work.

After that is achieved, I suggest we get to the point where people can
create new databases and use them with Python 3.  We need to do
this without hindering the ability to make new stable releases.

As far as the grander vision for Python2/3 transition and interoperability,
we need to make progress incrementally and not sacrifice stability
of the master branch.

I made the 3.11 release fully expecting a stable 4.0 release soon.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] RFC: Python2 - Py3k database compatibility

2013-04-29 Thread Jim Fulton
On Mon, Apr 29, 2013 at 10:24 AM, Stephan Richter
stephan.rich...@gmail.com wrote:
 On Monday, April 29, 2013 09:48:05 AM Jim Fulton wrote:
  I'd like there to be a stable 4.0 release **soon**
  that doesn't use zodbpickle for Python 2.

 I would like to agree. But on the other hand, the ZODB release cycles are very
 long and the prospect of waiting another 6-12 months before any Python 3
 support lands is really scary, because it prevents me from even writing a new
 project in Python 3.

As stated here:

https://mail.zope.org/pipermail/zodb-dev/2012-October/014770.html

I was hoping that the breakup of the ZODB packages would allow
us to increase the tempo of releases.

But increasing tempo is only possible if master is stable.


 (CH has just invested about 6 man-months into the porting
 effort and without ZODB we are basically stuck. But we do not need a
 transition plan, since we can recreate our ZODBs from configuration files.)

 Could we compromise and support Python 3 in ZODB 4.0 without necessarily
 solving all the migration strategy issues?

I suggested that in the part of my email that you snipped.

 In fact, by using zodbpickle, zodbpickle
 can have a separate, faster release cycle experimenting with some transition
 strategies. Maybe one way to install ZODB 4.0 would be to not use zodbpickle
 and use cPickle instead. We already have all that stuff separated into a
 _compat module, so that should not be too hard.

Right. As I suggested, let's get to a point where we can get a stable ZODB 4.0
release for Python 2.  As soon as we get that, let's get a ZODB 4.0.x or 4.1
release that works on Python 3, presumably via zodbpickle.

While we want to make progress on Python 3, we can't hold
ZODB hostage to the Python 3 porting effort.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] RFC: Python2 - Py3k database compatibility

2013-04-29 Thread Jim Fulton
On Mon, Apr 29, 2013 at 10:20 AM, Tres Seaver tsea...@palladion.com wrote:
 On 04/29/2013 09:48 AM, Jim Fulton wrote:
 On Sun, Apr 28, 2013 at 8:34 PM, Stephan Richter
 stephan.rich...@gmail.com wrote:
 On Sunday, April 28, 2013 07:23:12 PM Jim Fulton wrote:
 Can ZODB 4 be used now without zodbpickle?

 No, unfortunately for Py2 we need the custom cPickle and for Py3
 `noload()` support (as Tres mentioned).

 This is a problem.

 The only change in ZODB 4.0 was supposed to be the breakup.

 This was supposed to be a low-risk release.  The separation into
 multiple packages was supposed to increase agility, but now it appears
 we're stuck.

 The only reason we had delayed the 4.0 release (in my mind, anyway) was
  that it was a good way to signal the Py3k compatibility changes.

That was a bad idea. Unless you want to reinforce the fact that
Python 3 is an agility killer. ;)

There's release meta data to signal Python 3 compatibility.

 I'm not
 wedded to calling the Py3k-compatible release 4.0.

Cool.

  I'd like there to be a stable 4.0 release **soon** that doesn't use
 zodbpickle for Python 2.

 For now, I suggest we focus on stability and the ability to make
 progress on non-Python-3-related work.

 After that is achieved, I suggest we get to the point where people
 can create new databases and use them with Python 3.  We need to do
 this without hindering the ability to make new stable releases.

 The trunk of the 'ZODB' package does not have any of the Py3k /
 zodbpickle changes yet.  We could make a ZODB 4.0b1 release from the
 trunk today

+1.

 and create a '4.0' stable branch prior to any merge of the
 'py3' work.

Let's keep master stable.  Maybe someone will want to
add features before the Python 3 support is stable.
I don't want to hold 4.1 hostage either.

I suggest breaking the Python 3 work into increments
that can each be introduced without sacrificing stability.

The first increment could provide Python 3 support
without any conversion or compatibility support. This is
something you could probably achieve pretty quickly and
would allow you meet your immediate goals.

 As far as the grander vision for Python2/3 transition and
 interoperability, we need to make progress incrementally and not
 sacrifice stability of the master branch.

 I made the 3.11 release fully expecting a stable 4.0 release soon.

 That was of the 'ZODB3' meta-package, right?

Yes. It was predicated on a stable 4.0 release that had
very little in it beyond the split into separate packages.

It was intended to help people start transitioning
from ZODB3 to ZODB, but that can't happen until ZODB is
stable.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] RFC: Python2 - Py3k database compatibility

2013-04-29 Thread Jim Fulton
On Mon, Apr 29, 2013 at 10:54 AM, Tres Seaver tsea...@palladion.com wrote:
 On 04/29/2013 10:51 AM, Jim Fulton wrote:

 Right. As I suggested, let's get to a point where we can get a stable
 ZODB 4.0 release for Python 2.  As soon as we get that, let's get a
 ZODB 4.0.x or 4.1 release that works on Python 3, presumably via
 zodbpickle.

 As I proposed earlier this morning,

I can only reply to one email at a time. :)

 we can make a non-Py3k,
 non-zodbpickle 4.0b1 release today from the master branch, and a 4.0
 final in a week.

 Once we get that release out, we can then merge the 'py3' branch,
 including adding the requirement for 'zodbpickle' under both Python2 and
 Py3k, and aim for a much expedited 4.1 release which supports Py3k.

I'd rather keep this work on a branch until it's known to be stable.

I suggest instead focusing on getting new Python 3 applications working
without affecting Python 2 apps.  IOW, only use zodbpickle for Python 3
*initially*.

I want to be able to release from master at almost any time.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] RFC: Python2 - Py3k database compatibility

2013-04-29 Thread Jim Fulton
On Mon, Apr 29, 2013 at 12:25 PM, Tres Seaver tsea...@palladion.com wrote:
 On 04/29/2013 11:00 AM, Jim Fulton wrote:
 Let's keep master stable.  Maybe someone will want to add features
 before the Python 3 support is stable. I don't want to hold 4.1
 hostage either.

 Given that the only folks (besides maybe you) invested in ZODB
 development want Py3k support ASAP. I don't see that.  Do you have
 features in mind that you would imagine releasing before we land Py3k
 support?

Yes.  There are lots of features I'd like to add to ZODB.  I tend to work
on them when I have time (infrequently) or when we have a driving
need at ZC.  Long ZODB release cycles provide a lot of stop energy.

The only way to get away from long release cycles is to have a stable
master that's releasable at any time.  OTOH, ZODB is pretty critical
software, so
we have to be very confident in what we merge to master.



 I suggest breaking the Python 3 work into increments that can each be
 introduced without sacrificing stability.

 The first increment could provide Python 3 support without any
 conversion or compatibility support. This is something you could
 probably achieve pretty quickly and would allow you meet your
 immediate goals.

 We are already there, AFAIK, on the 'py3' branch:  the blocker is just
 getting out a release.

All I ask is a stable releasable master.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] RFC: Python2 - Py3k database compatibility

2013-04-29 Thread Jim Fulton
On Mon, Apr 29, 2013 at 1:15 PM, Tres Seaver tsea...@palladion.com wrote:
 On 04/29/2013 12:44 PM, Jim Fulton wrote:
 Yes.  There are lots of features I'd like to add to ZODB.  I tend to
 work on them when I have time (infrequently) or where we have a
 driving need at ZC.  Long ZODB release cycles provide a lot of stop
 energy.

 We are already developing this way (the 'py3' branch has not been
 merged).  However, if you do a lot of Python2-only feature work and merge
 to master, you will likely push back the horizon for merging that branch:
  we will have to port any work done to it.

I'd be happy to say that anything pushed to master has to pass tests on
Python 3. I have no interest in delaying Python 3 work.


 Using the 4.0 label to signal "big changes ahead, evaluate carefully
 before upgrading" was the primary reason I had been pushing to get the
 Py3k stuff landed

Apparently. :)  But, IIRC, you never discussed this with me.
When I announced 4.0, the big change was splitting off ZEO,
persistent, and later BTrees.  In fact, as you may remember,
I suggested splitting BTrees off in 5, because I didn't want
to delay 4.

 (the low-risk thing would have been more naturally
 labeled 3.11).

Except I explicitly said that 4.0 was supposed to be
a low-risk release. That's why 3.11 was just a meta-release
to aid people in the transition to 4.

When I saw all your activity on porting to Python 3, I
stepped back to give you room.  But now, several months
have gone by and we're more or less where we were in
November wrt 4.0.

I greatly appreciate and support the work you guys have done on Python 3 porting.

I don't mean to criticize the work you've done.  If anyone deserves
criticism, it's me for not staying on top of this.

We need to get to a point where we can release frequently, with confidence.
That doesn't mean we will; it depends on people's time to contribute.
But we need to be able and we need to plan our activities so we can
release frequently.

Whether 4.0 supports Python 3 or not, let's quickly get to the point where
tests are run and pass on both Python 2 and 3.  Once we get to that point,
we won't accept pull requests that break Python 3 (or 2, of course).
But let's get to the point soon where we can make Python 2 releases
with confidence.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] RFC: Python2 - Py3k database compatibility

2013-04-28 Thread Jim Fulton
 storages
   (replace_py2_cpickle_). See:
   https://github.com/zopefoundation/zodbpickle/tree/py2_explicit_bytes

 - ``zodbpickle`` should provide a pickler/unpickler for use by
   Py3k clients who operate against unconverted storages
   (replace_py3k_pickle_). See:
   https://github.com/zopefoundation/zodbpickle

 - ``zodbpickle`` might need to provide a wrapper storage supporting
   straddle_no_convert_.


 Comments?

Thanks for taking the time to work all of this out.

It sounds rather complex. :)

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] RFC: Python2 - Py3k database compatibility

2013-04-28 Thread Jim Fulton
On Wed, Apr 17, 2013 at 2:54 PM, Tres Seaver tsea...@palladion.com wrote:
 On 04/16/2013 05:13 PM, Stephan Richter wrote:
 On Tuesday, April 16, 2013 04:38:06 PM Tres Seaver wrote:
 Comments?

 (I don't now why Stephan's e-mail didn't make it to the list).


 The big omission that I noticed while reading the text carefully is a
 note saying that you will never be able to use stock Py3k pickle,
 because it does not support noload(). Thus ``zodbpickle`` is needed
 for any Py3k code. (I think this is a correction to you your last
 bullet in _replace_py2_cPickle.)

 Hmm, I think you are correct.

 That reminds me, originally we forked pickle.py from Python 3.3.
 During PyCon I think you decided to start by using cPickle from Python
 2.7 instead. If you are starting from Py2.7 cPickle, then supporting
 Protocol 3 is not easy.

 Already done (as you note in your follow-up).

 Given your writeup, I think you are implicitly saying to start from
 Py3.3 pickle and add the special support for Python 2 binary via the
 special new type. That sounds good to me.

 I would actually prefer to fork the Python 3.2 version:  the one from 3.3
 pulls in a bunch of grotty internal-only usage.

I'm confused.  I don't understand why we need a Python 3 pickler
change to support the new Python 2 binary type.  I thought we were
going to pickle
Python 2 binary objects using the standard Python 3 protocol 3 code?

 BTW, what are your motivations for all the different strategies?

 I wanted to document them all, because some of the strategies suit
 different cases better than others.

 _ignore_compat is obvious. If you can easily create the ZODB from
 other data sources, then you can do a one-time switch. In fact, at
 CipherHealth we have this case, since the ZODB only contains config
 (which is loaded from text files) and session data.

 Yup.  Even for large CMS systems, I would still make dump-to-filesystem,
 then reload, a requirement.  Others disagree, of course (and may have
 legitimate reasons).  Leo Rochael Almeida has clients with databases too
 big to convert, for instance (the downtime required to do the conversion
 would be prohibitive, I believe).

 But which strategy would be useful for a large Plone site for example?
 I think we should focus on that and provide one good way to do it.

 Plone has historically preferred in-place migration to dump-reload.  Maybe
 jumping the Py3k curb is enough reason for them to reconsider.

I'm hoping to be able to provide some help with in-place conversion in the
near future.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] RFC: Python2 - Py3k database compatibility

2013-04-28 Thread Jim Fulton
On Fri, Apr 26, 2013 at 8:02 PM, Stephan Richter
stephan.rich...@gmail.com wrote:
 On Friday, April 26, 2013 05:34:15 PM Tres Seaver wrote:
 I would like to merge this branch to master early next week and make a
 release, so that we can evaluate merging the 'py3' branch of ZODB.

 Thoughts?  Note that I have not yet addressed the portions of my proposal
 which deal with analyzing / converting existing databases, or with the
 possibly-needed wrapper storage (for on-the-fly conversion).

 Let's do it. This way people can test ZODB 4 on their existing Py2 code bases
 and we at CH can test our uibuilder/demo on Python 3. This will give us at
 least some confidence that we are going in the right direction.

 We might consider not even tackling DB conversions for ZODB 4.0 and delay that
 to ZODB 4.1 or leave it even up to an add-on package. This way people can
 experiment with different approaches and we do not have to nail the conversion
 problem with ZODB 4.0.

I've lost track of where we are with ZODB 4.

Can ZODB 4 be used now without zodbpickle?

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Towards ZODB on Python 3

2013-03-10 Thread Jim Fulton
=True) [7]
   [7] this is the status quo of the 'py3' branch in the ZODB repo
   [8] OTOH we could implement special support for REDUCE of
   codecs.decode() in our noload -- I almost got that working before
   Jim suggested a different approach, which is [6].

 At least there's some nice symmetry: no matter if you pickle your stuff
 on Python 2 or Python 3, you get to deal with bytes becoming unicode
 when you unpickle.  These kinds of guessing games are inevitable when
 you're migrating pickles from Python 2 to Python 3, but do we want to
 make them mandatory for day-to-day operation?

 Perhaps we ought to drop our original goal (3) and require an explicit
 one-time possibly-lossy conversion process for goal (2), then use pickle
 protocol 3 on Python 3 and have short pickles, perfect roundtripping of
 bytestrings?


 Then there's ZEO, which uses pickles for both payloads _and_ for
 marshalling in its RPC layer.  That's also fun, but I think we can at
 least declare that ZEO server and client must be on the same Python
 version, perhaps by bumping the protocol version.


 So, this is where things stand right now.  Plus a few relatively minor
 matters like adding missing noload() tests to zodbpickle and making
 zodbpickle work on Python 3.2 [9]

   [9] https://mail.zope.org/pipermail/checkins/2013-March/065813.html

 Other than that, the ZODB py3 branch works on Python 3.3 [10].  As long as
 you're prepared to deal with bytestrings magically transforming into
 unicodes.

   [10] Stephan reported running an actual small demo application with it.


 Where do we go from here?

Is this an issue for anything but names (object attributes and global
names)?

I don't think there's a native strings issue.  There *does* seem to
be a name issue.  In Python 2 and Python 3, (non-buggy) unicode aware
applications use bytes and unicode the same way, unicode for text,
bytes for data.

AFAICT, Python 3 has (admirably) changed the way names are implemented
to use unicode, rather than ASCII.

Am I missing something?

This is a somewhat thorny, but still fairly restricted problem.  I
would hazard to guess that 99.923% of persistent classes pickle their
state using their instance dictionaries.  99.9968% for regular Python
classes.  We know when we're pickling and unpickling instances and we
can apply transformations necessary for the target platforms.

I think the fix is pretty straightforward.

In the default __setstate__ provided by Persistent, and when loading
non-persistent instances:

- On Python 2, ASCII encode unicode attribute names.

- On Python 3, ASCII decode byte attribute names.

The same transformation is necessary when looking up global names.
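
On the Python 3 side, that might look roughly like this (a sketch, not
the actual Persistent implementation):

  def __setstate__(self, state):
      if isinstance(state, dict):
          # ASCII-decode byte attribute names from Python 2 pickles
          state = dict(
              (k.decode('ascii') if isinstance(k, bytes) else k, v)
              for k, v in state.items())
      self.__dict__.update(state)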

This will cover the vast majority of cases where the default
__setstate__ is used.  In rare cases where a custom setstate is used,
or when Python 3 non-ASCII attribute names are used, then databases
may not be sharable across Python versions.

There is also likely to be breakage in dictionaries or BTrees where
applications are sloppy about mixing Unicode and byte keys.  I don't
think we should try to compensate for this. These applications need to
be fixed.  One could write a database analysis script to detect this
kind of breakage (looking for mixed string and unicode keys).

Jim

--
Jim Fulton
http://www.linkedin.com/in/jimfulton
Jerky is better than bacon! http://zo.pe/Kqm
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Towards ZODB on Python 3

2013-03-10 Thread Jim Fulton
On Sun, Mar 10, 2013 at 11:25 AM, Tres Seaver tsea...@palladion.com wrote:
 On 03/10/2013 09:19 AM, Jim Fulton wrote:
...
 I think the fix is pretty straightforward.

 In the default __setstate__ provided by Persistent, and when loading
 non-persistent instances:

 - On Python 2, ASCII encode unicode attribute names.

 - On Python 3, ASCII decode byte attribute names.

 The same transformation is necessary when looking up global names.

 Hmm, if zodbpickle has to handle the issue for non-persistent instances
 and global names, wouldn't it be simpler to make it handle persistent
 instances too?

No.  It can't know when a key is going to be used for a
persistent attribute name.

  It can examine the stack inside 'load_dict' to figure out
 that the context is an instance, right?

Ugh.  What stack?

It would be much simpler to handle this in __setstate__ (or the equivalent).
This isn't exactly a lot of code.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
Jerky is better than bacon! http://zo.pe/Kqm
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Towards ZODB on Python 3

2013-03-10 Thread Jim Fulton
On Sun, Mar 10, 2013 at 12:13 PM, Tres Seaver tsea...@palladion.com wrote:
 On 03/10/2013 11:55 AM, Jim Fulton wrote:
 On Sun, Mar 10, 2013 at 11:25 AM, Tres Seaver tsea...@palladion.com
 wrote:
 -BEGIN PGP SIGNED MESSAGE- Hash: SHA1

 On 03/10/2013 09:19 AM, Jim Fulton wrote:
 ...
 I think the fix is pretty straightforward.

 In the default __setstate__ provided by Persistent, and when
 loading non-persistent instances:

 - On Python 2, ASCII encode unicode attribute names.

 - On Python 3, ASCII decode byte attribute names.

 The same transformation is necessary when looking up global
 names.

 Hmm, if zodbpickle has to handle the issue for non-persistent
 instances and global names, wouldn't it be simpler to make it handle
 persistent instances too?

 No.  It can't know when a key is going to be used for a persistent
 attribute name.

 It can examine the stack inside 'load_dict' to figure out that the
 context is an instance, right?

 Ugh.  What stack?

 The one where the unpickler keeps its work-in-progress?

  static int
  load_none(UnpicklerObject *self)
  {
      PDATA_APPEND(self->stack, Py_None, -1);
      return 0;
  }

  static int
  load_dict(UnpicklerObject *self)
  {
      PyObject *dict, *key, *value;
      Py_ssize_t i, j, k;

      if ((i = marker(self)) < 0)
          return -1;
      j = Py_SIZE(self->stack);

      if ((dict = PyDict_New()) == NULL)
          return -1;

      for (k = i + 1; k < j; k += 2) {
          key = self->stack->data[k - 1];
          value = self->stack->data[k];
          if (PyDict_SetItem(dict, key, value) < 0) {
              Py_DECREF(dict);
              return -1;
          }
      }
      Pdata_clear(self->stack, i);
      PDATA_PUSH(self->stack, dict, -1);
      return 0;
  }

That won't work for persistent objects.

Persistent state is set by the deserializer, not by the unpickler.

The deserializer calls the unpickler to load the state.  It then calls
__setstate__ on the persistent object to set the state.  The
deserializer doesn't know how to interpret the state, only __setstate__
does.

Jim

--
Jim Fulton
http://www.linkedin.com/in/jimfulton
Jerky is better than bacon! http://zo.pe/Kqm
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] transaction: synchronizer newTransaction() behavior

2013-03-09 Thread Jim Fulton
On Fri, Mar 8, 2013 at 9:54 PM, Siddhartha Kasivajhula
countvajh...@gmail.com wrote:
 Hi there,
 I've been discussing this issue with Laurence Rowe on the pylons-dev mailing
 list, and he suggested bringing it up here.

 I'm writing a MongoDB data manager for the python transaction package:
 https://github.com/countvajhula/mongomorphism
 I noticed that for a synchronizer, the beforeCompletion() and
 afterCompletion() methods are always called once the synch has been
 registered, but the newTransaction() method is only called when an explicit
 call to transaction.begin() is made. Since it's possible for transactions to
 be started without this explicit call, I was wondering if there was a good
 reason why these two cases (explicitly vs implicitly begun transactions)
 would be treated differently.

Nope. This is a bug.

 That is, should the following two cases not be
 equivalent, and therefore should the newTransaction() method be called in
 both cases:

 (1)
 t = transaction.get()
 t.join(my_dm)
 ..some changes to the data..
 transaction.commit()

 and:

 (2)
 transaction.begin()
 t = transaction.get()
 t.join(my_dm)
 ..some changes to the data..
 transaction.commit()

Only if (1) was preceded by an ``abort``. The definition of
``get`` is to get the current transaction, creating one, if necessary.

Really, ``begin`` and ``abort`` are equivalent.  It might be better if
there wasn't a ``begin`` method, as it's misleading. One should be an
alias for the other. I'd be for deprecating ``begin``. The call to
``_new_transaction`` should be moved to the point in ``get`` where a
new transaction is created, and ``begin`` should be made an alias for
``abort``.
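
To make the difference concrete (``MySynch`` is hypothetical; the rest
is the public transaction API):

  import transaction

  class MySynch(object):
      def newTransaction(self, txn):
          print('newTransaction')   # today: only fired by an explicit begin()
      def beforeCompletion(self, txn):
          pass
      def afterCompletion(self, txn):
          pass

  synch = MySynch()   # keep a strong reference; the registry is a WeakSet
  transaction.manager.registerSynch(synch)

  transaction.begin()   # newTransaction() is called here
  transaction.abort()
  transaction.get()     # implicitly creates a transaction; today this
                        # path does not call newTransaction()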


 In my mongo dm implementation, I am using the synchronizer to do some
 initialization before each transaction gets underway, and am currently
 requiring explicit calls to transaction.begin() at the start of each
 transaction. Unfortunately, it appears that other third party libraries
 using the transaction library may not be calling begin() explicitly, and in
 particular my data manager doesn't work when used with pyramid_tm.

 Another thing I noticed was that a synchronizer cannot be registered like
 so:
 transaction.manager.registerSynch(MySynch())
 .. and can only be registered like this:
 synch = MySynch()
 transaction.manager.registerSynch(synch)

 ... which I'm told is due to MySynch() being stored in a WeakSet which
 means it gets garbage collected. Currently this means that I'm retaining a
 reference to the synch as a global that I never use. Just seems a bit
 contrived so thought I'd mention that as well, in case there's anything that
 can be done about that.

This is to prevent memory leaks.  Normally, the synchronizer is
associated with a database. For example, the synchronizers are
database connection methods.  A stand-alone synchronizer seems weird
to me.

Jim

--
Jim Fulton
http://www.linkedin.com/in/jimfulton
Jerky is better than bacon! http://zo.pe/Kqm
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Cache warm up time

2013-03-09 Thread Jim Fulton
On Sat, Mar 9, 2013 at 5:50 AM, Vincent Pelletier plr.vinc...@gmail.com wrote:
 Le Friday 08 March 2013 18:50:09, Laurence Rowe a écrit :
 It would be great if there was a way to advise ZODB in advance that
 certain objects would be required so it could fetch multiple object
 states in a single request to the storage server.

 +1

 I can see this used to process a large tree, objects being processed as
 they are loaded (loads being pipelined).

 Pseudo-code interface suggestion:

 class IPipelinedStorage:
   def loadMany(oid_list, callback, tid=None, before_tid=None):
   callback being along the lines of:
 def callback(oid, data_record, tid, next_tid):
   if stop_condition:
 raise ... (StopIteration ? just anything ?)
   return more_oids_to_queue_for_loading
   tid and before_tid (mutually exclusive) specify the snapshot to use, to
   implement equivalent of loadSerial and loadBefore.

 class IPipelinedConnection:
   def walk(ob, callback):
   callback being along the lines of:
 def callback(just_loaded_object, referee_list):
   # do something on just_loaded_object
   return filtered_referee_list
   referee_list would expose at least referee's class (name ?), and hold their
   oid for Connection.walk internal use (only ?).
   Or maybe just ghosts, but callback would have to take care of not
   unghostifying them - it would void the purpose of pipelining loads.

 Above ZODB (persistent containers with internal persistent objects, like
 BTree):
   Implement an iterator over subobjects ignoring intermediate internal
   structure (think BTree.*Bucket classes).

 Specific iteration order could probably be specified to be able to implement
 iterkeys and such in BTree for example, but storage may have to implement load
 reordering when they happen in parallel (like NEO, and as could probably be
 implemented for zeoraid and relStorage configured with multiple mirrored
 databases), limiting latency/processing parallelism and possibly leading to
 memory footprint explosion.
 So I think it should be possible to also request no special loading order to
 get lowest latency backend can provide and somewhat constant memory footprint.

 Any thought/comment ?

I think this is more complicated than necessary.

I think a simple method on a storage that gives a hint that a set of
object ids will be loaded is enough.  A network storage could then
issue a pipelined request for those oids. The application can then
proceed as usual.  I think I've proposed such an API before, but am
too lazy to look it up. Something like:

load_hint(*oids)
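
Usage would be along these lines (``load_hint`` is only proposed here,
not an existing API; ``conn`` is an open connection):

  # hint that a batch of oids is about to be needed, so a network
  # storage can pipeline one request instead of N round-trips
  storage.load_hint(*oids)
  for oid in oids:
      obj = conn.get(oid)   # states arrive from the pipelined reply
      # ... work with obj ...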

I'd like to see this functionality, but I don't have time to do it soon.

I must say that I think this API is more likely to be abused
than used effectively.  Prefetching catalog indexes is a sort of
anti-pattern that only makes sense for small catalogs.  It
would likely make more sense to have a dedicated catalog
server that returned oids and possibly object records in
response to queries (or, whimper, use solr).

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
Jerky is better than bacon! http://zo.pe/Kqm
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Cache warm up time

2013-03-09 Thread Jim Fulton
On Sat, Mar 9, 2013 at 9:02 AM, Jim Fulton j...@zope.com wrote:
...
 I think a simple method on a storage that gives a hint that a set of
 object ids will be loaded is enough.  A network storage could then
 issue a pipelined request for those oids. The application can then
 proceed as usual.  I think I've proposed such an API before, but am
 too lazy to look it up. Something like:

 load_hint(*oids)

 I'd like to see this functionality, but I don't have time to do it soon.

 I must say that I think this API is more likely to be abused
 than used effectively.  Prefetching catalog indexes is a sort of
 anti-pattern that only makes sense for small catalogs.  It
 would likely make more sense to have a dedicated catalog
 server that returned oids and possibly object records in
 response to queries (or, whimper, use solr).

I forgot to mention an even simpler way to reduce the number of
round-trips for cataloging data structures is to increase the bucket
and internal node sizes.  You can do this now by patching the
BTree header files. We've done this in the past to reduce the
likelihood of database conflicts when buckets split.  I suspect the
default sizes are too low.

I'd really like BTrees to be subclassable and for sizes to
be read from instance data, falling back to class settings.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
Jerky is better than bacon! http://zo.pe/Kqm
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] zodb conversion questions

2013-02-07 Thread Jim Fulton
On Thu, Feb 7, 2013 at 10:48 AM, Jürgen Herrmann
juergen.herrm...@xlhost.de wrote:
 Am 06.02.2013 15:05, schrieb Jürgen Herrmann:

 Hi there!

 I have a relstorage with mysql backend that grew out of bounds
 and we're looking into different backend solutions now. Possibly
 also going back to FileStorage and using zeo...

 Anyway we'll have to convert the databases at some point. As these
 are live DBs we cannot shut them down for longer than the
 ususal maintenance interval during the night, so for maybe 2-3h.

 a full conversion process will never complete in this time so
 we're looking for a process that can split the conversion into
 two phases:

 1. copy transactions from backup of the source db to the destination
db. this can take a long time, we don't care. note the last
timestamp/transaction_id converted.
 2. shut down the source db
 3. copy transactions from the source db to the destination db, starting
at the last converted transaction_id. this should be fast, as only
a few transactions need to be converted, say  1% .


 if I were to reimplement copyTransactionsFrom() to accept a start
 transaction_id/timestamp, would this result in dest being an exact
 copy of source?

 source = open_my_source_storage()
 dest = open_my_destination_storage()
 dest.copyTransactionsFrom(source)
 last_txn_id = source.lastTransaction()
 source.close()
 dest.close()

 source = open_my_source_storage()
 # add some transactions
 source.close()

 source = open_my_source_storage()
 dest = open_my_destination_storage()
 dest.copyTransactionsFrom(source, last_txn_id=last_txn_id)
 source.close()
 dest.close()


 I will reply to myself here :) This actually works, tested with a
 modified version of FileStorage for now. I modified the signature
 of copyTransactionsFrom to look like this:

 def copyTransactionsFrom(self, source, verbose=0, not_before_tid=None):

``start`` would be better to be consistent with the iterator API.

 not_before_tid is a packed tid or None, None meaning copy all
 (the default, so no existing API usage would break).

 Is there public interest in modifying this API permanently?

+.1

This API is a bit of an attractive nuisance.  I'd rather people
learn how to use iterators in their own scripts, as they are very
useful and powerful.  This API just hides that.
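
For example, the incremental copy above can be written directly against
the iterator and restore APIs (a sketch of roughly what
copyTransactionsFrom() does internally; ``source``, ``dest`` and
``last_txn_id`` as in the script above):

  import ZODB.utils

  # copy only transactions committed after last_txn_id
  start = ZODB.utils.p64(ZODB.utils.u64(last_txn_id) + 1)
  for trans in source.iterator(start):
      dest.tpc_begin(trans, trans.tid, trans.status)
      for record in trans:
          dest.restore(record.oid, record.tid, record.data,
                       '', record.data_txn, trans)   # '' = no version
      dest.tpc_vote(trans)
      dest.tpc_finish(trans)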

The second part, replaying old transactions is a bit more subtle,
but it's still worth it for people to be aware of it.

If I were doing this today, I'd make this documentation
rather than API. But then, documentation ... whimper.
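
For reference, a minimal sketch of the iterator-based approach,
modeled on what copyTransactionsFrom does internally.  The record
attributes and the restore() signature below follow the ZODB 3-era
storage interfaces; treat the details as assumptions to check against
your version:

  def copy_transactions(source, dest, start=None):
      # start is a packed 8-byte tid, or None to copy everything
      for txn in source.iterator(start):
          dest.tpc_begin(txn, txn.tid, txn.status)
          for r in txn:
              # restore() preserves the original oids and tids
              dest.restore(r.oid, r.tid, r.data, '', r.data_txn, txn)
          dest.tpc_vote(txn)
          dest.tpc_finish(txn)

Run it once from the backup, then run it again with start set to the
first tid that hasn't been copied yet to catch up.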

 Anybody want to look at the actual code changes?

Sure, if they have tests.  Unfortunately, we can only accept
pull requests from zope contributors. Are you one?
Wanna be one? :)

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
Jerky is better than bacon! http://zo.pe/Kqm
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


[ZODB-Dev] #zodb on freenode

2013-02-01 Thread Jim Fulton
I hang there, fwiw :)

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
Jerky is better than bacon! http://zo.pe/Kqm
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Moving ZEO, BTrees, persistent and transaction to github

2013-01-28 Thread Jim Fulton
On Wed, Jan 2, 2013 at 9:37 AM, Jim Fulton j...@zope.com wrote:
 On Fri, Dec 28, 2012 at 4:44 PM, Jim Fulton j...@zope.com wrote:
 I plan to do ZEO shortly.

 Well, that didn't go well.  git svn fetch spent several days and didn't
 finish.  It seemed to be really thrashing trying to follow tags.

 I'm probably going to have to convert just the trunk.

Done.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
Jerky is better than bacon! http://zo.pe/Kqm
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] what's the latest on zodb/zeo+memcached?

2013-01-19 Thread Jim Fulton
On Fri, Jan 18, 2013 at 2:38 PM, Claudiu Saftoiu csaft...@gmail.com wrote:

 I wonder if disk latency is the problem? As a test you could put the
 index.fs file into a tmpfs and see if that improves things, or cat
 index.fs > /dev/null to try and force it into the fs cache.


 Hmm, it would seem not... the cat happens instantly:

 (env)tsa@sp2772c:~/sports$ time cat Data_IndexDB.fs > /dev/null

 real0m0.065s
 user0m0.000s
 sys 0m0.064s

 The database isn't even very big:

 -rw-r--r-- 1 tsa tsa 233M Jan 18 14:34 Data_IndexDB.fs

 Which makes me wonder why it takes so long to load it into memory. It's
 just a bit frustrating that the server has 7gb of RAM and it's proving to be
 so difficult to get ZODB to keep ~300 megs of it up in there. Or, indeed, if
 linux already has the whole .fs file in a memory cache, where are these
 delays coming from? There's something I don't quite understand about this
 whole situation...

Some high-level comments:

- ZODB doesn't simply load your database into memory.

  It loads objects when you try to access their state.

  If you're using ZEO (or relstorage, or neo), each load requires a
  round-trip to the server.  That's typically a millisecond or two,
  depending on your network setup.  (Your database is small, so disk
  access shouldn't be an issue, as it is presumably in your disk
  cache.)

- You say it often takes you a couple of minutes to handle requests.
  This is obviously very long.  It sounds like there's an issue
  with the way you're using the catalog.  It's not that hard to get this
  wrong.  I suggest either hiring someone with experience in this
  area to help you or consider using another tool, like solr.

  (You could put more details of your application here, but I doubt
  people will be willing to put in the time to really analyze it and
  tell you how to fix it.  I know I can't.)

- solr is so fast it almost makes me want to cry.  At ZC, we're
  increasingly using solr instead of the catalog.  As the original
  author of the catalog, this makes me sad, but we just don't have the
  time to put in the effort to equal solr/lucene.

- A common mistake when using ZODB is to use it like a relational
  database, putting most data in catalog-like data structures and
  querying to get most of your data.  The strength of an OODB is that
  you don't have to query to get data from a well-designed object
  model (tiny sketch below).
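
  (A purely illustrative sketch; the names below are a hypothetical
  application schema, not an API:

    alice = root['league'].teams['red'].players['alice']
    recent = alice.games[-10:]   # assuming games is a persistent list

  A few attribute/key traversals load just the objects touched; no
  index query is involved.)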

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
Jerky is better than bacon! http://zo.pe/Kqm
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] what's the latest on zodb/zeo+memcached?

2013-01-19 Thread Jim Fulton
On Thu, Jan 17, 2013 at 12:31 PM, Claudiu Saftoiu csaft...@gmail.com wrote:
...
 One potential thing is this: after a zeopack the index database .fs file is
 about 400 megabytes, so I figure a cache of 3000 megabytes should more than
 cover it. Before a zeopack, though - I do one every 3 hours - the file grows
 to 7.6 gigabytes.

In scanning over this thread while writing my last message, I noticed
this.

This is a ridiculous amount of churn. There is likely something
seriously out of whack with your application.  Every application is
different, but we typically see *weekly* packs reduce database size by
at most 50%.

 Shouldn't the relevant objects - the entire set of latest
 versions of the objects - be the ones in the cache, thus it doesn't matter
 that the .fs file is 7.6gb as the actual used bits of it are only 400mb or
 so?

Every object update invalidates cached versions of the object in all
caches except the writer's.  (Even the writer's cached value is
invalidated if conflict resolution was performed.)

 Another question is, does zeopacking destroy the cache?

No, but lots of writing does.

 If so then that
 would make sense. I'll have to preload upon every zeopack. If it's not that,
 then I'm not sure what it could be.

I think you have some basic application design problem(s).

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
Jerky is better than bacon! http://zo.pe/Kqm
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] what's the latest on zodb/zeo+memcached?

2013-01-19 Thread Jim Fulton
On Sat, Jan 19, 2013 at 10:06 AM, Leonardo Santagada
santag...@gmail.com wrote:

 On Sat, Jan 19, 2013 at 12:51 PM, Jim Fulton j...@zope.com wrote:

 - solr is so fast it almost makes me want to cry.  At ZC, we're
   increasingly using solr instead of the catalog.  As the original
   author of the catalog, this makes me sad, but we just don't have the
   time to put in the effort to equal solr/lucene.


 We are using it on some projects also... But deploying Java is as
 complicated as deploying Python, so it roughly doubles the deployment work
 needed for a project.

Yup, integration is a downside, which is why we still use
catalogs or related indexing structures too.

 Do you think this could be a good idea, to have an
 integrated solution of ZODB for object persistence and integrated indexing
 using Whoosh?

I have no idea. I'm not familiar with whoosh.

 Probably what would be needed is a project to wrap them
 together, that does indexing during the commit and exposes an indexing api,
 like marking fields in persistent objects for indexing and having methods
 for searching the index.

shrug

I know people have come up with tools to index ZODB objects
with lucene in the past.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
Jerky is better than bacon! http://zo.pe/Kqm
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] what's the latest on zodb/zeo+memcached?

2013-01-18 Thread Jim Fulton
On Fri, Jan 18, 2013 at 11:55 AM, Claudiu Saftoiu csaft...@gmail.com wrote:



 If you want to load the btree item into cache, you need to do

   item._p_activate()


 That's not going to work, since `item` is a tuple. I don't want to load
 the item itself into the cache, I just want the btree to be in the cache.


 Er, to be clearer: my goal is for the preload to load everything into the
 cache that the query mechanism might use.

 It seems the bucket approach only takes ~10 seconds on the 350k-sized index
 trees vs. ~60-90 seconds. This seems to indicate that fewer things end up
 being pre-loaded...

I guess I was too subtle before.

Preloading is a waste of time.  Just use a persistent ZEO cache
of adequate size and be done with it.
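
For example, something along these lines.  This is a sketch: the
client= and var= arguments are my recollection of the ClientStorage
options for naming a persistent cache and placing its file, so
double-check them against your ZEO version:

  import ZODB
  from ZEO.ClientStorage import ClientStorage

  # Naming the cache (client=) is what makes it persist across
  # restarts; var= is the directory for the cache file.
  storage = ClientStorage(('localhost', 8100),
                          client='index',
                          var='/var/zeo-cache',
                          cache_size=2000 * 2**20)  # ~2GB
  db = ZODB.DB(storage)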

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
Jerky is better than bacon! http://zo.pe/Kqm
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] API question

2013-01-15 Thread Jim Fulton
On Mon, Jan 14, 2013 at 1:32 PM, Tres Seaver tsea...@palladion.com wrote:
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1

 While working on preparation for a Py3k port, I've stumbled across a
 fundamental issue with how ZODB structures its API.  Do we intend that
 client code do the following::

   from ZODB import DB, FileStorage
   db = DB(FileStorage('/path/to/Data.fs'))

As Marius points out, this doesn't work.


 or use the module as a facade ::

   import ZODB
   db = ZODB.DB(ZODB.FileStorage.FileStorage('/path/to/Data.fs'))

This doesn't work either. You haven't imported FileStorage.

WRT ZODB.DB, ZODB.DB is an age-old convenience. It's unfortunate that
ZODB.DB (the class) shadows the module, ZODB.DB, just like the class
ZODB.FileStorage.FileStorage shadows the module
ZODB.FileStorage.FileStorage. (Of course, it's also
unfortunate that there's a ZODB.FileStorage.FileStorage
module at all. :)

If we had a do-over, we'd use ZODB.db.DB and
ZODB.filestorage.FileStorage, and ZODB.DB would be a convenience for
ZODB.db.DB.


 I would actually prefer that clients explicitly import the intermediate
 modules::

   from ZODB import DB, FileStorage
   db = DB.DB(FileStorage.FileStorage('/path/to/Data.fs'))

So you don't mind shadowing FileStorage.FileStorage.FileStorage. ;)

 or even better::

   from ZODB.DB import DB
   # This one can even be ambiguous now

FTR, I don't like this style.  Somewhat a matter of taste.


   from ZODB.FileStorage import FileStorage
   db = DB(FileStorage('/path/to/Data.fs'))

 The driver for the question is getting the tests to pass under both
 'nosetests' and 'setup.py test', where the order of module imports etc.
 can make the ambiguous cases problematic.  It would be a good time to do
 whatever BBB stuff we need to (I would guess figuring out how to emit
 deprecation warnings for whichever variants) before releasing 4.0.0.

I'm pretty happy with the Zope test runner and I don't think using
nosetests is a good reason to cause backward-incompatibility. The zope
test runner works just fine with Python 3. Why do you feel compelled
to introduce nose?

I'm sort of in favor of moving to nose to follow the crowd, although
otherwise, nose is far too implicit for my taste. It doesn't handle
doctest well at all.

Having said that, if I was going to do something like this, I'd
rename the modules, ZODB.DB and ZODB.FileStorage to ZODB.db and
ZODB.filestorage and add module aliases for backward compatibility. I
don't know if that would be enough to satisfy nose.

I'm not up for doing any of this for 4.0.  I'm not alergic to a 5.0 in
the not too distant future.  I'm guessing that a switch to nose would
also make you rewrite all of the doctests as unittests. As the
primary maintainer of ZODB, I'm -0.8 on that.

Back to APIs, I think 90% of users don't import the APIs but set up
ZODB via ZConfig (or probably should, if they don't).  For Python use,
I think the ZODB.DB class short-cut is useful.  Over the last few
years, ZODB has grown some additional shortcuts that I think are also
useful. Among them:

ZODB.DB(filename) - DB with a file storage
ZODB.DB(None) - DB with a mapping storage
ZODB.connection(filename) - connection to DB with file storage
ZODB.connection(None) - connection to DB with mapping storage

More importantly:

ZEO.client is a shortcut for ZEO.ClientStorage.ClientStorage
ZEO.DB(addr or port) - DB with a ZEO client
ZEO.connection(addr or port) - connection to DB with a ZEO client
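
For example, the connection shortcut in use (the file name is
illustrative, and the conn.root attribute convenience is from recent
releases):

  import ZODB, transaction

  conn = ZODB.connection('Data.fs')  # DB + FileStorage + connection in one call
  conn.root.greeting = 'hello'
  transaction.commit()
  conn.close()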

Jim

--
Jim Fulton
http://www.linkedin.com/in/jimfulton
Jerky is better than bacon! http://zo.pe/Kqm
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] API question

2013-01-15 Thread Jim Fulton
On Mon, Jan 14, 2013 at 7:20 PM, Tres Seaver tsea...@palladion.com wrote:
...
 I'm tempted to rename the 'DB.py' module 'db.py', and jam in a BBB entry
 in sys.modules for 'ZODB.DB';  likewise, I am tempted to rename the
 'FileStorage.py' package 'filestorage', its same-named module
 '_filestorage.py', and jam in BBB entries for the old names.

+.9 if done without backward-incompatible breakage. This would be a
4.1 thing.  +1 if you used zodb.filestorage.filestorage rather than
zodb.filestorage._filestorage.

 Those renames would make the preferred API:
from ZODB import DB # convenience alias for the class
from ZODB import db # the moodule
from ZODB.db import DB # my preferred speling
    from ZODB.filestorage import FileStorage # conv. alias for class
from ZODB import filestorage # the package
from ZODB.filestorage import FileStorage # my preferred speling

This is the same as one earlier.  I suspect you meant:

from ZODB.filestorage._filestorage import FileStorage

but couldn't type the underware.

I don't think the packagification of the FileStorage module was a win,
but it's too hard to fix it now.

Some day, I'd like to work on a filestorage2, but fear I won't ever
find the time. :(

from ZODB.filestorage import _filestorage # if needed

We shouldn't design an API where we expected people to grab underware.

Aside from not liking from imports and the _filestorage nit, +1

 For extra bonus fun, we could rename 'ZODB' to 'zodb' :)

In that case, we might switch to a namespace package, oodb, which I've
already reserved:

  http://pypi.python.org/pypi/oodb

But I doubt we're up for this much disruption.

Jim

--
Jim Fulton
http://www.linkedin.com/in/jimfulton
Jerky is better than bacon! http://zo.pe/Kqm
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] what's the latest on zodb/zeo+memcached?

2013-01-15 Thread Jim Fulton
On Tue, Jan 15, 2013 at 2:08 PM, Claudiu Saftoiu csaft...@gmail.com wrote:
 On Tue, Jan 15, 2013 at 2:07 PM, Leonardo Santagada santag...@gmail.com
 wrote:




 On Tue, Jan 15, 2013 at 3:10 PM, Jim Fulton j...@zope.com wrote:

 On Tue, Jan 15, 2013 at 12:00 PM, Claudiu Saftoiu csaft...@gmail.com
 wrote:
  Hello all,
 
  I'm looking to speed up my server and it seems memcached would be a
  good
  way to do it - at least for the `Catalog` (I've already put the catalog
  in a separate zodb with a separate zeoserver with persistent client
  caching enabled and it still doesn't run as nice as I like...)
 
  I've googled around a bit and found nothing definitive, though...
  what's the
  best way to combine zodb/zeo + memcached as of now?

 My opinion is that a distributed memcached isn't
 a big enough win, but this likely depends on your  use cases.

 We (ZC) took a different approach.  If there is a reasonable way
 to classify your corpus by URL (or other request parameter),
 then check out zc.resumelb.  This fit our use cases well.


 Maybe I don't understand zodb correctly but if the catalog is small enough
 to fit in memory wouldn't it be much faster to just cache the whole catalog
 on the clients? Then at least for catalog searches it is all mostly as fast
 as running through python objects. Memcache will put an extra
 serialize/deserialize step into it (plus network io, plus context switches).


 That would be fine, actually. Is there a way to explicitly tell ZODB/ZEO to
 load an entire object and keep it in the cache? I also want it to remain in
 the cache on connection restart, but I think I've already accomplished that
 with persistent client-side caching.

You can't cause a specific object (or collection of objects) to stay
in the cache, but if your working set is small enough to fit in
the memory or client cache, you can get the same effect.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
Jerky is better than bacon! http://zo.pe/Kqm
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] what's the latest on zodb/zeo+memcached?

2013-01-15 Thread Jim Fulton
So, first, a concise partial answer to a previous question:

ZODB provides an in-memory object cache.  This is non-persistent.
If you restart, it is lost.  There is a cache per connection and the
cache size is limited by both object count and total object size (as
estimated by database record size).

ZEO also provides a disk-based cache of database records read
from the server.  This is normally much larger than the in-memory cache.
It can be configured to be persistent.  If you're using blobs, then there
is a separate blob cache.

On Tue, Jan 15, 2013 at 2:15 PM, Claudiu Saftoiu csaft...@gmail.com wrote:
 You can't cause a specific object (or collection of objects) to stay
 in the cache, but if your working set is small enough to fit in
 the memory or client cache, you can get the same effect.


 That makes sense. So, is there any way to give ZODB a Persistent and tell it
 load everything about the object now for this transaction so  that the
 cache mechanism then gets triggered, or do I have to do a custom search
 through every aspect of the object, touching all Persistents it touches,
 etc, in order to get everything loaded? Essentially, when  the server
 restarts, I'd like to pre-load all these objects (my cache is indeed big
 enough), so that if a few hours later someone makes a request that uses it,
 the objects will already be cached instead of starting to be cached right
 then.

ZODB doesn't provide any pre-warming facility.  This would be
application dependent.

You're probably better off using a persistent ZEO cache
and letting the cache fill with objects you actually use.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
Jerky is better than bacon! http://zo.pe/Kqm
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] what's the latest on zodb/zeo+memcached?

2013-01-15 Thread Jim Fulton
On Tue, Jan 15, 2013 at 2:45 PM, Claudiu Saftoiu csaft...@gmail.com wrote:
 On Tue, Jan 15, 2013 at 2:40 PM, Jim Fulton j...@zope.com wrote:

 So, first, a concise partial answer to a previous question:

 ZODB provides an in-memory object cache.  This is non-persistent.
 If you restart, it is lost.  There is a cache per connection and the
 cache size is limited by both object count and total object size (as
 estimated by database record size).

 ZEO also provides a disk-based cache of database records read
 from the server.  This is normally much larger than the in-memory cache.
 It can be configured to be persistent.  If you're using blobs, then there
 is a separate blob cache.

 On Tue, Jan 15, 2013 at 2:15 PM, Claudiu Saftoiu csaft...@gmail.com
 wrote:
  You can't cause a specific object (or collection of objects) to stay
  in the cache, but if your working set is small enough to fit in
  the memory or client cache, you can get the same effect.
 
 
  That makes sense. So, is there any way to give ZODB a Persistent and
  tell it
  load everything about the object now for this transaction so  that the
  cache mechanism then gets triggered, or do I have to do a custom search
  through every aspect of the object, touching all Persistents it touches,
  etc, in order to get everything loaded? Essentially, when  the server
  restarts, I'd like to pre-load all these objects (my cache is indeed big
  enough), so that if a few hours later someone makes a request that uses
  it,
  the objects will already be cached instead of starting to be cached
  right
  then.

 ZODB doesn't provide any pre-warming facility.  This would be
 application dependent.

 You're probably better off using a persistent ZEO cache
 and letting the cache fill with objects you actually use.


 Okay, that makes sense. Would that be a server-side cache, or a client-side
 cache?

There are no server-side caches (other than the OS disk cache).

 I believe I've already succeeded in getting a client-side persistent
 disk-based cache to work (my zodb_indexdb_uri is
 zeo://%(here)s/zeo_indexdb.sock?cache_size=2000MB&connection_cache_size=50&connection_pool_size=5&var=zeocache&client=index),

This configuration syntax isn't part of ZODB.  I'm not familiar with
the options there.

 but this doesn't seem to be what you're referring to as that is exactly the
 same size as the in-memory cache.

I doubt it, but who knows?

 Could you provide some pointers as to how
 to get a persistent disk-based cache on the ZEO server, if that is what you
 meant?

ZODB is configured via ZConfig.  The parameters are defined here:

  https://github.com/zopefoundation/ZODB/blob/master/src/ZODB/component.xml

Not too readable, but at least precise. :/

Look at the parameters for zodb and zeoclient.

Here's an example:

<zodb main>
  cache-size 10
  pool-size 7

  <zeoclient>
    blob-cache-size 1GB
    blob-dir /home/zope/foo-classifieds/blob-cache
    cache-size 2GB
    server das-head1.foo.zope.net:11100
    server das-head2.foo.zope.net:11100
  </zeoclient>
</zodb>

If you want to use this syntax with paste, see:

  http://pypi.python.org/pypi/zc.zodbwsgi

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
Jerky is better than bacon! http://zo.pe/Kqm
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] zeopack error in zrpc.connection

2013-01-07 Thread Jim Fulton
On Mon, Jan 7, 2013 at 1:04 PM, Claudiu Saftoiu csaft...@gmail.com wrote:

 How do I go about fixing this? Let me know if I can provide any other
 information that would be helpful.


 I took the advice in this thread:
 https://mail.zope.org/pipermail/zodb-dev/2012-January/014526.html

 The exception that comes up, from the zeo server log, is:

 2013-01-07T13:01:49 ERROR ZEO.zrpc (14891) Error raised in delayed method
 Traceback (most recent call last):
   File /home/tsa/env/lib/python2.6/site-packages/ZEO/StorageServer.py,
 line 1377, in run
 result = self._method(*self._args)
   File /home/tsa/env/lib/python2.6/site-packages/ZEO/StorageServer.py,
 line 343, in _pack_impl
 self.storage.pack(time, referencesf)
   File /home/tsa/env/lib/python2.6/site-packages/ZODB/blob.py, line 796,
 in pack
 result = unproxied.pack(packtime, referencesf)
   File
 /home/tsa/env/lib/python2.6/site-packages/ZODB/FileStorage/FileStorage.py,
 line 1078, in pack
 pack_result = self.packer(self, referencesf, stop, gc)
   File
 /home/tsa/env/lib/python2.6/site-packages/ZODB/FileStorage/FileStorage.py,
 line 1034, in packer
 opos = p.pack()
   File
 /home/tsa/env/lib/python2.6/site-packages/ZODB/FileStorage/fspack.py, line
 397, in pack
 self.gc.findReachable()
   File
 /home/tsa/env/lib/python2.6/site-packages/ZODB/FileStorage/fspack.py, line
 190, in findReachable
 self.findReachableAtPacktime([z64])
   File
 /home/tsa/env/lib/python2.6/site-packages/ZODB/FileStorage/fspack.py, line
 275, in findReachableAtPacktime
 for oid in self.findrefs(pos):
   File
 /home/tsa/env/lib/python2.6/site-packages/ZODB/FileStorage/fspack.py, line
 328, in findrefs
 return self.referencesf(self._file.read(dh.plen))
   File /home/tsa/env/lib/python2.6/site-packages/ZODB/serialize.py, line
 630, in referencesf
 u.noload()
 TypeError: 'NoneType' object does not support item assignment


 I'm afraid this doesn't seem to help me figure out what's wrong...

I suspect your database is corrupted.  You'd probably want to look at
the record in question to be sure.

You could disable garbage collection, but if you have a damaged
record, you might want to use the previous version of the record
(if it exists) to recover it.
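
Disabling garbage collection for packs is a storage option.  A minimal
sketch, assuming the pack_gc FileStorage constructor argument (path
illustrative):

  from ZODB.FileStorage import FileStorage

  # pack_gc=False: pack away old revisions without computing
  # reachability, so the damaged record won't be traversed.
  storage = FileStorage('Data.fs', pack_gc=False)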

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
Jerky is better than bacon! http://zo.pe/Kqm
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Moving ZEO, BTrees, persistent and transaction to github

2013-01-05 Thread Jim Fulton
On Fri, Jan 4, 2013 at 1:19 PM, Jim Fulton j...@zope.com wrote:
 I'll do BTrees, persistent and transaction in about a week.

...

 This weekend, I'll update svn for these packages and ZODB to indicate
 that development is taking place in github.

So ZODB, persistent, BTrees and transaction are now in github;
this is noted in SVN, and the SVN projects are now read-only.

Jim


-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
Jerky is better than bacon! http://zo.pe/Kqm
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] repoze.catalog.query very slow

2013-01-04 Thread Jim Fulton
Y'all might have better luck with this on zope-dev.

Jim

On Thu, Jan 3, 2013 at 5:25 PM, Jeff Shell j...@bottlerocket.net wrote:
 Dear gods, I hope you get an answer to this question, as I've noticed the 
 same thing with very large indexes (using zc.catalog). I believe that at the 
 root layers repoze.catalog is built around the same concepts and structures.

 When I tried to trace down the problems with a profiler, they all revolved 
 around loading the relevant portions of the indexes into memory. It had 
 nothing to do with the final results of the query; had nothing to do with 
 waking up the 'result' objects; all of the slowness seemed to be in loading 
 the indexes themselves into memory. In one case, we were only using one index 
 (a SetIndex), with about 15 document ids.

 This is from my own profiling. All I know is that this is very slow, and then 
 very fast, and the object cache and the relevant indexes' ability to keep all 
 of their little BTrees or Buckets or Sets or whatever in that object cache 
 seem to have a tremendous impact on query and response time - far more than 
 is taken up by then waking up the content objects in your result set.

 When the indexes aren't in memory, in my case I found the slowness to be in 
 BTrees's 'multiunion' function; the real slowness was in calling ZODB's 
 setstate (which is loading into memory). This is just BTree (catalog index) 
 data being loaded at this point:


 Profiling a fresh site (no object cache / memory population yet)
 
 winterhome-firstload.profile% callees multiunion
 Function                          called...
                                       ncalls  tottime  cumtime
 {BTrees._IFBTree.multiunion}  ->      65980   0.132    57.891  Connection.py:848(setstate)

 winterhome-firstload.profile% callers multiunion
 Function                          was called by...
                                       ncalls  tottime  cumtime
 {BTrees._IFBTree.multiunion}  <-      27      0.348    58.239  index.py:203(apply)

 (yep, 58 seconds; very slow ZEO network load in a demostorage setup where ZEO 
 cannot update its client cache, which makes these setstate problems very 
 exaggerated). 'multiunion' is called 27 times, but one of those calls takes 
 58 seconds.


 Profiling the same page again with everything all loaded
 
 winterhome-secondload.profile% callees multiunion
 Function                          called...
                                       ncalls  tottime  cumtime
 {BTrees._IFBTree.multiunion}  ->

 winterhome-secondload.profile% callers multiunion
 Function                          was called by...
                                       ncalls  tottime  cumtime
 {BTrees._IFBTree.multiunion}  <-      27      0.193    0.193   index.py:203(apply)

 (this time, multiunion doesn't require any ZODB loads, and its 27 calls'
 internal and cumulative times are relatively speedy)

 If there's a good strategy for getting and keeping these things in memory, 
 I'd love to know it; but when the catalog indexes are competing with all of 
 the content objects that make up a site, it's hard to know what to do or even 
 how to configure the object cache counts well without running into serious 
 memory problems.

 On Jan 3, 2013, at 2:50 PM, Claudiu Saftoiu csaft...@gmail.com wrote:

 Hello all,

 Am I doing something wrong with my queries, or is repoze.catalog.query very 
 slow?

 I have a `Catalog` with ~320,000 objects and 17 `CatalogFieldIndex`es. All 
 the objects are indexed and up to date. This is the query I ran (field names 
 renamed):

 And(InRange('float_field', 0.01, 0.04),
     InRange('datetime_field', seven_days_ago, today),
     Eq('str1', str1),
     Eq('str2', str2),
     Eq('str3', str3),
     Eq('str4', str4))

 It returned 15 results so it's not a large result set by any means. The 
 strings are like labels - there are 20 things any one of the string fields 
 can be.

 This query took a few minutes to run the first time. Re-running it again in 
 the same session took 1 second each time. When I restarted the session it 
 took only 30 seconds, and again 1 second each subsequent time.

 What makes it run so slow? Is it that the catalog isn't fully in memory? If 
 so, is there any way I can guarantee the catalog will be in memory given 
 that my entire database doesn't fit in memory all at once?

 Thanks,
 - Claudiu
 ___
 For more information about ZODB, see http://zodb.org/

 ZODB-Dev mailing list  -  ZODB-Dev@zope.org
 https://mail.zope.org/mailman/listinfo/zodb-dev

 Thanks,
 Jeff Shell
 j...@bottlerocket.net



 ___
 For more information about ZODB, see http://zodb.org/

 ZODB-Dev mailing list  -  ZODB-Dev@zope.org
 https://mail.zope.org/mailman/listinfo/zodb-dev



-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
Jerky is better than bacon! http://zo.pe/Kqm

Re: [ZODB-Dev] Moving ZEO, BTrees, persistent and transaction to github

2013-01-04 Thread Jim Fulton
On Wed, Jan 2, 2013 at 9:37 AM, Jim Fulton j...@zope.com wrote:
 On Fri, Dec 28, 2012 at 4:44 PM, Jim Fulton j...@zope.com wrote:
 I plan to do ZEO shortly.

 Well, that didn't go well.  git svn fetch spent several days and didn't
 finish.  It seemed to be really thrashing trying to follow tags.

 I'm probably going to have to convert just the trunk.

 I'm guessing that this is related to the project split.

 I'll do BTrees, persistent and transaction in about a week.

 Or so. :)  I suspect that these will have the same problem and that I'll
 have to convert just the trunk.

persistent went smoothly enough.  persistent is now in github.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
Jerky is better than bacon! http://zo.pe/Kqm
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Moving ZEO, BTrees, persistent and transaction to github

2013-01-04 Thread Jim Fulton
On Fri, Jan 4, 2013 at 1:05 PM, Jim Fulton j...@zope.com wrote:
 On Wed, Jan 2, 2013 at 9:37 AM, Jim Fulton j...@zope.com wrote:
 On Fri, Dec 28, 2012 at 4:44 PM, Jim Fulton j...@zope.com wrote:
 I plan to do ZEO shortly.

  Well, that didn't go well.  git svn fetch spent several days and didn't
 finish.  It seemed to be really thrashing trying to follow tags.

 I'm probably going to have to convert just the trunk.

 I'm guessing that this is related to the project split.

  I'll do BTrees, persistent and transaction in about a week.

 Or so. :)  I suspect that these will have the same problem and that I'll
 have to convert just the trunk.

 persistent went smoothly enough.  persistent is now in github.

BTrees and transaction converted smoothly too.  Go figure.

This weekend, I'll update svn for these packages and ZODB to indicate
that development is taking place in github.

Jim


-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
Jerky is better than bacon! http://zo.pe/Kqm
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Moving ZEO, BTrees, persistent and transaction to github

2013-01-02 Thread Jim Fulton
On Fri, Dec 28, 2012 at 4:44 PM, Jim Fulton j...@zope.com wrote:
 I plan to do ZEO shortly.

Well, that didn't go well.  git svn fetch spent several days and didn't
finish.  It seemed to be really thrashing trying to follow tags.

I'm probably going to have to convert just the trunk.

I'm guessing that this is related to the project split.

 I'll do BTrees, persistent and transaction in about a week.

Or so. :)  I suspect that these will have the same problem and that I'll
have to convert just the trunk.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
Jerky is better than bacon! http://zo.pe/Kqm
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Moving ZEO, BTrees, persistent and transaction to github

2013-01-02 Thread Jim Fulton
On Wed, Jan 2, 2013 at 1:04 PM, Leonardo Santagada santag...@gmail.com wrote:


 On Wed, Jan 2, 2013 at 12:37 PM, Jim Fulton j...@zope.com wrote:

 On Fri, Dec 28, 2012 at 4:44 PM, Jim Fulton j...@zope.com wrote:
  I plan to do ZEO shortly.

  Well, that didn't go well.  git svn fetch spent several days and didn't
 finish.  It seemed to be really thrashing trying to follow tags.

 I'm probably going to have to convert just the trunk.

 I'm guessing that this is related to the project split.


 Pypy had a very hard problem with that, trying to convert from svn to
 mercurial. They ended up using a home-made tool for that; maybe they still
 have it. After converting to mercurial it is easy to convert to git.

Thanks.  I'll keep that in mind.  Do you know where the tool can be found?

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
Jerky is better than bacon! http://zo.pe/Kqm
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Blobstorage shrinks only after the second pack operation - bug or feature?

2012-12-30 Thread Jim Fulton
On Sun, Dec 30, 2012 at 3:50 AM, Andreas Jung li...@zopyx.com wrote:
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1

 I noticed a strange behavior with packing a storage having lots of data
 in a blob storage (Plone 4.2, Zope 2.13).

 I had a large Plone site (5 GB of data in blobstorage) in a dedicated
 storage. I removed the Plone Site object and packed the storage through
 the Zope 2 database management screen. The size of the Data.fs went down,
 however the blobstorage size remained the same. Packing a second time removed
 the obsolete data from the blob storage.

 So why do I have to pack two times in order to get a minimized blob
 storage?

Without knowing more details, it's impossible to know.

What did you have pack-keep-old set to?

 Bug or feature?

I doubt either.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
Jerky is better than bacon! http://zo.pe/Kqm
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Blobstorage shrinks only after the second pack operation - bug or feature?

2012-12-30 Thread Jim Fulton
On Sun, Dec 30, 2012 at 12:22 PM, Andreas Jung li...@zopyx.com wrote:
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1



 Jim Fulton wrote:
 On Sun, Dec 30, 2012 at 3:50 AM, Andreas Jung li...@zopyx.com
 wrote:
 -BEGIN PGP SIGNED MESSAGE- Hash: SHA1

 I noticed a strange behavior with packing a storage having lots of
 data in a blob storage (Plone 4.2, Zope 2.13).

 I had a large Plone site (5 GB of data in blobstorage) in a
 dedicated storage. I removed the Plone Site object and packed the
 storage through the Zope 2 database management screen. The size of
 the Data.fs went down however the blobstorage size remained the
 same. Packing a second removed the obsolete data from the blob
 storage.

 So why do I have to pack two times in order to get a minimized
 blob storage?

 Without knowing more details, it's impossible to know.

 What did you have pack-keep-old set to?

 I used the default of the Zope 2 UI which is 0 (days).

You didn't answer my question.  I didn't ask how many days you packed to.
I asked if you set pack-keep-old. I'm guessing from your response that you
didn't.

If you don't set pack-keep-old to false, then old blobs are kept around, just
like the file-storage file, except that, rather than making copies, hard links
are created.

The second time you packed, the old links would be removed, freeing up
the space taken by the old blobs.
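
A minimal sketch of turning that off, assuming the pack_keep_old
FileStorage constructor argument (also spelled pack-keep-old in
ZConfig; paths illustrative):

  from ZODB.FileStorage import FileStorage

  # With pack_keep_old=False, pack() doesn't keep the Data.fs.old copy
  # or the hard-linked old blob directory, so blob space is reclaimed
  # on the first pack.
  storage = FileStorage('Data.fs', blob_dir='blobs',
                        pack_keep_old=False)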

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
Jerky is better than bacon! http://zo.pe/Kqm
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Moving ZODB projects to github

2012-12-28 Thread Jim Fulton
On Wed, Dec 26, 2012 at 1:42 PM, Jim Fulton j...@zope.com wrote:
 I'd like to move ZODB-related projects to github nowish.
 (I'm having problems converting ZODB though :()

OK, I think I worked out the problems I was having with svn2git
(It's broken. I wrote a simple Python script that did the same thing,
except for the brokenness :)

The new repo is at:

https://github.com/zopefoundation/ZODB

Please don't check into the svn project any more.

It would be great if people would kick the tires on this
to make sure I didn't miss anything in the conversion.

In a few days, I'll mark the svn project as moved to github.

Eventually, I'll set the tests up to run with travis.  It would
be just fine if someone beat me to it. :)

Also, generally, I'd like to update trunk via pull requests,
rather than direct checkins to trunk (exceptions being things
like travis setup or checkins associated with releasing.)

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
Jerky is better than bacon! http://zo.pe/Kqm
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Moving ZODB projects to github

2012-12-28 Thread Jim Fulton
On Fri, Dec 28, 2012 at 3:44 PM, Mikko Ohtamaa
mi...@opensourcehacker.com wrote:
 Hi,

 OK, I think I worked out the problems I was having with svn2git

 (It's broken. I wrote a simple Python script that did the same thing,
 except for the brokeness :)


 I also run problems with svn2git. In the end I just ended up migrating trunk
 + limited history :

 Is your Python script available somewhere, just for the poor souls of the
 future who will come after us?

It's a work in progress. I plan to publish it more widely when I've used it more
and maybe/probably incorporated the github/oauth goodness you pointed me to.

But here's a snapshot. Let me know if it works for you (or doesn't).

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
Jerky is better than bacon! http://zo.pe/Kqm


svn2git.py
Description: Binary data
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


[ZODB-Dev] Moving ZEO, BTrees, persistent and transaction to github

2012-12-28 Thread Jim Fulton
I plan to do ZEO shortly.

I'll do BTrees, persistent and transaction in about a week.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
Jerky is better than bacon! http://zo.pe/Kqm
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


[ZODB-Dev] Moving ZODB projects to github

2012-12-26 Thread Jim Fulton
I'd like to move ZODB-related projects to github nowish.
(I'm having problems converting ZODB though :()

If you're currently working on something, let me know so
I don't leave your work behind.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
Jerky is better than bacon! http://zo.pe/Kqm
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


[ZODB-Dev] Please clean up your unneeded dev branches in the ZODB svn project

2012-12-26 Thread Jim Fulton
Some of the branches (especially some tseaver_...) are giving svn2git fits.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
Jerky is better than bacon! http://zo.pe/Kqm
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Please clean up your unneeded dev branches in the ZODB svn project

2012-12-26 Thread Jim Fulton
On Wed, Dec 26, 2012 at 7:25 PM, Jim Fulton j...@zope.com wrote:
 Some of the branches (especially some tseaver_...), are giving svn2git fits.

G. The branches that bothered svn2git were already deleted. Whimper.

Sorry.

Jim


-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
Jerky is better than bacon! http://zo.pe/Kqm
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] shared cache when no write?

2012-12-13 Thread Jim Fulton
On Wed, Dec 12, 2012 at 6:31 PM, Dylan Jay d...@pretaweb.com wrote:
 Hi,


 I've been working with zope for over 12 years and something that
  keeps coming up is scaling IO-bound operations in Zope. The typical
  example is where you build an app that calls external apis. While
  this is happening a zope thread isn't doing any other processing,
  and because there is a 1-thread-1-zodb-cache limit, you can run
  into scalability problems: you can only have as many threads as
  RAM / average cache size. The end result is low throughput while
  still having low CPU. I've consulted on some $$$ sites where others
  have made this mistake. It's an easy mistake to make as SQL/PHP
  systems don't tend to have this limitation, so new developers to
  zope often don't think of it.

I was listening to a talk by a Java guy on Friday where he warned that
a common newbie mistake was to have too large a database connection
pool, causing lots of RAM usage.  I expect, though, that ZODB caches,
consisting of live Python objects, exacerbate this effect.


  The possible workarounds aren't
  pretty. You can segregate your api calling requests to zeo clients
  with large numbers of threads with small caches using some fancy
  load balancing rules. You can rework that part of your application
  to not use zope, perhaps using edge side includes to make it seem
  part of the same app.

 Feel free to shoot down the following if it makes no sense.  What if two
 or more threads could share a zodb cache up until the point at which
 one wants to write. This is the point at which you can't share a
 cache in a consistent manner in my understanding. At that point the
 transaction could be blocked until other readonly transactions had
 finished and continue by itself? or perhaps the write transaction
 could be aborted and restarted with a special flag to ensure it was
 processed with the cache to itself. As long as requests which
 involve external access are readonly with regard to zope then this
 would improve throughput. This might seem an edge case but consider
 where you want to integrate an external app into a zope or Plone
 app. Often the external api is doing the writing not the zope
 part. For example clicking a button on a plone site to make plone
 send a tweet. It might also improve throughput on zope requests
 which involve zodb cache misses as they are also IO bound.

A simpler approach might be to manage connections better at the
application level so you don't need so many of them.  If you're going
to spend a lot of time blocked waiting on some external service, why
not close the database connection and reopen it when you need
it? Then you could have a lot more threads than database connections.
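
A minimal sketch of that pattern; fetch_remote and the root layout are
hypothetical, and db is an already-created ZODB.DB:

  import transaction

  def handle_request(db, key):
      conn = db.open()
      try:
          args = conn.root()['requests'][key]  # read what we need
      finally:
          transaction.abort()
          conn.close()   # back to the pool, cache and all

      result = fetch_remote(args)  # slow external call, no connection held

      conn = db.open()   # a pooled connection's cache is still warm
      try:
          conn.root()['results'][key] = result
          transaction.commit()
      finally:
          conn.close()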

It's possible that ZODB could help at the savepoint level.  For
example, maybe you could somehow allow savepoints to be used across
transactions and connections.  This would be a lot saner than trying to
share a cache across threads.

Jim

--
Jim Fulton
http://www.linkedin.com/in/jimfulton
Jerky is better than bacon! http://zo.pe/Kqm
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] shared cache when no write?

2012-12-13 Thread Jim Fulton
On Thu, Dec 13, 2012 at 4:18 PM, Dylan Jay d...@pretaweb.com wrote:
...
 I'd never considered that the cache was attached to the db connection rather
 than the thread. I just reread
 http://docs.zope.org/zope2/zope2book/MaintainingZope.html and it says
 exactly that.
 So what you're saying is I'd tune db connections down to memory size on an
 instance dedicated to io-bound work and then increase the threads. Whenever a
 thread requests a db connection and there isn't one available it will block.
 So I just optimize my app to release the db connection when not needed.
 In fact I could tune all my zopes this way, since a zope with 10 threads and
 2 connections is going to end up queuing requests the same as 2 threads and
 10 connections?

Something like that. It's a little more complicated than that: because
Zope 2 is managing connections for you, it would be easy to run afoul
of that.  This is a case where something that usually makes your life
easier makes it harder. :)

What I'd do is use a separate database other than the one Zope 2 is
using.  Then you can manage connections yourself without conflicting
with what the publisher is doing.  Then, when you want to use the database,
you just open the database, being careful to close it when you're
going to block.  The downside being that you'll have separate
transactions.

 This should be easier to achieve and changes the application less than the
 erp5 background task solution mentioned.

It would probably be a good idea to learn more about how erp5 does this.
The erp approach sounds like a variation on what I suggested.

 I can see from the previous post, as there is no checkout semantics
 in zodb,

I don't know what checkout semantics means.

 you are free to write anytime so there is no sane way to block at the point
 someone wants to write to an object, so it wouldn't work.

ZODB provides a very simple concurrency model by giving each
connection (and in common practice, each thread) it's own view of the
database. If you break that, then you're injecting concurrency issues
into the app or in some pretty magical layer.

 You perhaps could have a single read only db connection which is
 shared?

But even if the database data was only read, objects have other state
that may be mutated.  You'd have to inspect every class to make sure
it's thread safe. That's too scary for me.

Jim

--
Jim Fulton
http://www.linkedin.com/in/jimfulton
Jerky is better than bacon! http://zo.pe/Kqm
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] ZEO invalidation message transaction-inbound or outbound

2012-12-03 Thread Jim Fulton
On Mon, Dec 3, 2012 at 2:35 PM, Shane Hathaway sh...@hathawaymix.org wrote:
 I've seen ZEO clients become stale due to network
 instability.  The clients caught up the moment they changed something. This
 was years ago, though.

ZEO clients now send keep-alive messages to the storage server to
prevent this from happening.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
Jerky is better than bacon! http://zo.pe/Kqm
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


[ZODB-Dev] Need Windows Volunteers -- Losing interest in windows

2012-12-02 Thread Jim Fulton
For a while now, I've tried to make sure ZODB runs on windows.

I'll keep doing so as long as it is fairly easy. I'm not a windows
developer and life is short.

I have a Windows XP VM with the free Microsoft compiler on it.
It's getting rather long in the tooth and doesn't seem to work
with Python 3. shrug.  I'll keep using it as long as it works,
but ...

We need people who care about Windows to step up.
It's necessary, but not sufficient to run tests on windows.
We need people willing to debug and fix windows issues.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
Jerky is better than bacon! http://zo.pe/Kqm
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


[ZODB-Dev] ZODB 3.11.0a1 released -- the breakup!

2012-12-01 Thread Jim Fulton
I just released ZODB 3.11.0a1.  ZODB3 is now a meta package that
requires persistent, ZODB, BTrees and ZEO, all at versions >= 4dev.

I expect this release to be boring.  If I don't hear of any problems,
I plan to move these releases to beta in a few days and to final a few
days after that.

Jim

--
Jim Fulton
http://www.linkedin.com/in/jimfulton
Jerky is better than bacon! http://zo.pe/Kqm
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] ZEO invalidation message transaction-inbound or outbound

2012-11-30 Thread Jim Fulton
On Fri, Nov 30, 2012 at 1:37 AM, Andreas Jung li...@zopyx.com wrote:
 a customer made the observation that ZEO clients
 became inconsistent after some time (a large CMF-based application
 running on Zope 2.12 afaik). The customer did some investigation and
 noticed that the ZEO invalidations had been queued (in some cases for
 hours).

I can't imagine invalidations being queued for many seconds, much less
hours without some serious network pathology.

How did they come to this conclusion?

...so the overall state of the ZEO clients became inconsistent.
 Aren't ZEO invalidation messages supposed to be transmitted within
 the current transaction (and outbound as it seems to happen here).

No. Invalidations (and all other data) are queued for transmission
over the network.

 Isn't a ZEO cluster supposed to be completely consistent at any time?

Each client has a consistent view of the database, but not necessarily
an up-to-date one.

Different clients may have views of the database as it was at (usually
slightly) different points in time, depending on which data they've
received.

It's not practical for all clients to have precisely the same
view of the database as each other, although they should differ by
seconds, or less.

The only way I can see clients having views of the database far out of
sync with the server is if the clients are disconnected.  A ZEO client
can continue functioning normally from a server while disconnected as
long as it doesn't write anything and has all the data it needs in
its cache.  It has a consistent view of the database, but not an
up-to-date one.

Jim

--
Jim Fulton
http://www.linkedin.com/in/jimfulton
Jerky is better than bacon! http://zo.pe/Kqm
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Call for volunteers: help w finishing Python BTrees

2012-11-09 Thread Jim Fulton
On Thu, Nov 8, 2012 at 10:59 PM, Tres Seaver tsea...@palladion.com wrote:
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1

 On 08/21/2012 06:50 PM, Tres Seaver wrote:
 On 10/04/2011 01:32 PM, Jim Fulton wrote:
 On Tue, Oct 4, 2011 at 11:36 AM, David Glick
 davidgl...@groundwire.org wrote:
 On 10/4/11 8:33 AM, Jim Fulton wrote:

 Someone recently told me I should be more agressive about asking
 for help.

 If someone is looking for an opportunity to help, finishing the
  Python version of BTrees would help a lot.  I think I got this
  started pretty well, but ran out of time.  This is needed for
 running ZODB on PyPy and jython, both of which I'd like to see.

 svn+ssh://svn.zope.org/repos/main/ZODB/branches/jim-python-btrees




 Jim

 P.S. Much thanks to Tres for his work on the Python version of
 persistence.

 What tasks remain to be done? (I assume running the tests will
 give a starting point, but perhaps there are other todo items you
 know of?)

 Really, just getting the tests to pass.  I think there are a lot of
  legacy, but still supported, features that need to be fixed.  (This
 is a really old package.)

 In a fresh checkout of the branch, I see what looks like an infinite
 loop in the tests:  I left it running for an hour just now, and it
 hung inside the '_set_operation' helper function inside the
 'test_difference' testcase for 'PureOO' testcase.

 Just a quick update:  my 'pure_python' branch now passes all tests on
 Python 2.6, 2.7, and PyPy (no C extensions!).  I plan to do a lot of
 cleanup during the PyConCA sprints next week before merging the branch to
 the trunk.

Awesome. Thanks.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
Jerky is better than bacon! http://zo.pe/Kqm
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] RFC: ZODB 4.0 (without persistent)

2012-11-07 Thread Jim Fulton
On Sat, Oct 20, 2012 at 3:37 PM, Tres Seaver tsea...@palladion.com wrote:
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1

 On 10/20/2012 01:47 PM, Jim Fulton wrote:

 I

 had the impression that Tres was proposing more. shrug


 I released BTrees 4.0.0, and created a ZODB branch for the (trivial)
 shift to depending on it:

   http://svn.zope.org/ZODB/branches/tseaver-btrees_as_egg/

 That branch passes all tests, and should be ready for merging.

Merged and released.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
Jerky is better than bacon! http://zo.pe/Kqm
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


[ZODB-Dev] ZODB 4.0.0a1 released

2012-11-07 Thread Jim Fulton
Not to be confused with ZODB3! :)

This is the first ZODB 4 release.

I'll be making a ZEO 4.0.0a1 release soon and then a ZODB3 3.11.0 release
that simply requires ZEO 4. Although I wonder if that should be a
ZODB 3 4.0.0a1 release.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
Jerky is better than bacon! http://zo.pe/Kqm
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] ZeoServer, multiple storages and open file handles

2012-10-24 Thread Jim Fulton
On Wed, Oct 24, 2012 at 12:33 AM, Tim Godfrey t...@obsidian.com.au wrote:
 Hi Jim

 Do you have any idea as to why people recommend against many storages under
 a single Zeo?

Seriously?

Go back and read the thread.

 Also can increasing the invalidation-queue-size help this if
 there is memory to spare on the machine?

Invalidation queue size has nothing to do with this.

Jim



 Tim

 On 23 October 2012 00:53, Jim Fulton j...@zope.com wrote:

 On Sun, Oct 21, 2012 at 8:48 PM, Tim Godfrey t...@obsidian.com.au wrote:
  Hi Jon
 
  Thanks for your response. Is that something that has been done in a
  later
  version of Zeoserver than mine (ZODB3-3.10.3)?

 No. Every time I've tried to switch to poll in asyncore, I've had
 problems.

 
  It this your recommended action for the issue I'm having or are there
  still
  some configuration changes I can make?

 Many people have recommended not hosting many storages in a single
 process.

 Jim

 --
 Jim Fulton
 http://www.linkedin.com/in/jimfulton
 Jerky is better than bacon! http://zo.pe/Kqm




 --
 Tim Godfrey
 Obsidian Consulting Group

 P: +61 3 9355 7844
 F: +61 3 9350 4097
 E: t...@obsidian.com.au
 W: http://www.obsidian.com.au/



-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
Jerky is better than bacon! http://zo.pe/Kqm
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] ZeoServer, multiple storages and open file handles

2012-10-22 Thread Jim Fulton
On Sun, Oct 21, 2012 at 8:48 PM, Tim Godfrey t...@obsidian.com.au wrote:
 Hi Jon

 Thanks for your response. Is that something that has been done in a later
 version of Zeoserver than mine (ZODB3-3.10.3)?

No. Every time I've tried to switch to poll in asyncore, I've had problems.


 It this your recommended action for the issue I'm having or are there still
 some configuration changes I can make?

Many people have recommended not hosting many storages in a single process.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
Jerky is better than bacon! http://zo.pe/Kqm
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] ZeoServer, multiple storages and open file handles

2012-10-18 Thread Jim Fulton
On Thu, Oct 18, 2012 at 3:09 PM, Leonardo Rochael Almeida
leoroch...@gmail.com wrote:

Thanks for pitching in with an answer!

 Having a single ZEO server handling more than one or two storages is
 not usually a good idea. ZEO does not handle each storage in a
 separate thread, so you're underusing multiple CPUs if you have them.

Nit pick: ZEO handles each client connection and storage in a separate thread.
(So 30 storages and 16 clients means 480 threads :)

It is Python's GIL that prevents effective use of multiple processors.

ZEO goes out of its way to let I/O C code run concurrently (since I/O isn't
subject to the GIL) and I've seen ZEO storage servers use up to 200% CPU
on 4-core boxes (2 cores worth, IOW).

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
Jerky is better than bacon! http://zo.pe/Kqm
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


  1   2   3   4   5   6   7   8   >