Re: [ZODB-Dev] Re: ZODB Benchmarks

2008-02-01 Thread Roché Compaan
I have completed my first round of benchmarks on the ZODB and welcome
any criticism and advice. I summarised our earlier discussion and
additional findings in this blog entry:
http://www.upfrontsystems.co.za/Members/roche/where-im-calling-from/zodb-benchmarks

-- 
Roché Compaan
Upfront Systems   http://www.upfrontsystems.co.za

___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] How to avoid ConflictErrors in BTrees ?

2008-02-01 Thread Dieter Maurer
Andreas Jung wrote at 2008-2-1 12:13 +0100:
>
>
>--On 1. Februar 2008 03:03:53 -0800 Tarek Ziadé <[EMAIL PROTECTED]> 
>wrote:
>>
>> Since BTrees are written in C, I couldn't add my own conflict manager to
>> try to merge buckets. (and this is
>> way over my head)
>>
>
>But you can inherit from the BTree classes and hook your 
>_p_resolveConflict() handler into the Python class - or?

I very much doubt that this is a possible approach:

  A BTree is a complex object, an object that creates new objects
  (partially other BTrees and partially Buckets) when it grows.

  Giving the "BTree" class used by the application a "_p_resolveConflict"
  will do little -- because the created subobjects (Buckets mainly)
  will not know about it.

  Note especially, that the only effective conflict resolution
  is at the bucket level. As you can see, there is currently
  no way to tell a "BTree" which "Bucket" class it should use
  for its buckets -- this renders your advice ineffective.
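To illustrate what resolution "at the bucket level" means, here is a minimal sketch of a three-way merge over a single bucket. It is not the C implementation in BTrees: plain dicts stand in for bucket state, and the names merge_bucket and MergeConflict are made up for this example. The real code has further restrictions that are not modelled here.

```python
class MergeConflict(Exception):
    """Raised when both sides changed the same key (hypothetical name)."""


_MISSING = object()  # marks "key not present" in a state


def merge_bucket(old, committed, new):
    """Three-way merge of one bucket's key -> value mapping.

    old:       state both transactions started from
    committed: state already committed by the other transaction
    new:       state this transaction wants to commit
    """
    merged = dict(committed)
    for key in set(old) | set(new):
        before = old.get(key, _MISSING)
        mine = new.get(key, _MISSING)
        if mine is before or mine == before:
            continue  # this transaction did not touch the key
        theirs = committed.get(key, _MISSING)
        if theirs is not before and theirs != before:
            # Both transactions changed the same key: give up.
            raise MergeConflict(key)
        if mine is _MISSING:
            merged.pop(key, None)  # this transaction deleted the key
        else:
            merged[key] = mine  # this transaction added or changed it
    return merged
```

Two transactions that add different keys to the same bucket merge cleanly; two that add the same key do not. Because the merge only ever sees the three states of one object, a handler placed on the top-level tree cannot help with buckets it never receives.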



-- 
Dieter


Re: [ZODB-Dev] RelStorage now in Subversion

2008-02-01 Thread Dieter Maurer
Hello Shane,

Shane Hathaway wrote at 2008-1-31 13:45 -0700:
> ...
>No, RelStorage doesn't work like that either.  RelStorage opens a second
>database connection when it needs to store data.  The store connection
>will commit at the right time, regardless of the polling strategy.  The
>load connection is already left open between connections; I'm only
>talking about allowing the load connection to keep an idle transaction.
> I see nothing wrong with that, other than being a little surprising.

That looks very troublesome.

Unless you begin a new transaction on your load connection after
the write connection has committed,
your load connection will not see the data written over
your write connection.

>>  and you read older and older data
>> which must increase serializability problems
>
>I'm not sure what you're concerned about here.  If a storage instance
>hasn't polled in a while, it should poll before loading anything.

Even if it has polled not too far in the past, it should
repoll when the storage is joined to Zope request processing
(in "Connection._setDB"):
If it does not, then it may start work with an already outdated
state -- which can have adverse effects when the request bases modifications
on this outdated state.
If everything works fine, then a "ConflictError" results later
during the commit.

This implies that the read connection must start a new transaction
at least after a "ConflictError" has occurred. Otherwise, the
"ConflictError" cannot go away.
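The point about retries can be sketched with a toy model. Nothing here is RelStorage or ZODB code: ToyDatabase, ToyConnection and increment_with_retry are hypothetical names, and a single integer serial stands in for the database's notion of "current state".

```python
class ConflictError(Exception):
    """Toy stand-in for ZODB's ConflictError."""


class ToyDatabase:
    """Holds one counter plus a serial that changes on every commit."""

    def __init__(self):
        self.value = 0
        self.serial = 0

    def commit(self, new_value, serial_read):
        if serial_read != self.serial:
            raise ConflictError("state was read at an outdated serial")
        self.value = new_value
        self.serial += 1


class ToyConnection:
    """Works on a snapshot taken when begin() was last called."""

    def __init__(self, db):
        self.db = db
        self.begin()

    def begin(self):
        # Starting a new transaction refreshes the snapshot.
        self.value = self.db.value
        self.serial = self.db.serial

    def increment(self):
        self.db.commit(self.value + 1, self.serial)


def increment_with_retry(conn, attempts=3, refresh=True):
    """Return True once the increment commits, False if every attempt conflicts."""
    for _ in range(attempts):
        try:
            conn.increment()
            return True
        except ConflictError:
            if refresh:
                conn.begin()  # without this, every retry sees the same stale serial
    return False
```

With refresh=False a connection whose snapshot predates another commit fails on every attempt; with refresh=True the second attempt succeeds.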

> 
>> (Postgres might
>> not guarantee serializability even when the so called isolation
>> level is chosen; in this case, you may not see the problems
>> directly but nevertheless they are there).
>
>If that is true then RelStorage on PostgreSQL is already a failed
>proposition.  If PostgreSQL ever breaks consistency by exposing later
>updates to a load connection, even in the serializable isolation mode,
>ZODB will lose consistency.  However, I think that fear is unfounded.
>If PostgreSQL were a less stable database then I would be more concerned.

I do not expect that Postgres will expose later updates to the load
connection.

What I fear is described by the following scenario:

   You start a transaction on your load connection "L".
   "L" will see the world as it has been at the start of this transaction.

   Another transaction "M" modifies object "o".

   "L" reads "o", "o" is modified and committed.
   As "L" has used "o"'s state before "M"'s modification,
   the commit will try to write stale data.
   Hopefully, something lets the commit fail -- otherwise,
   we have lost a modification.

If something causes a commit failure, then the probability of such
failures increases with the outdatedness of "L"'s reads.
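The scenario can be replayed in a few lines. This is only an illustration under simplifying assumptions: a dict maps an object id to a (serial, state) pair, and the commit function and its check_serial flag are invented for the example to show what happens with and without "something" that lets the stale commit fail.

```python
class ConflictError(Exception):
    """Toy stand-in for a commit-time conflict."""


def commit(storage, oid, serial_read, new_state, check_serial=True):
    """Store new_state for oid; optionally refuse writes based on stale reads."""
    current_serial, _ = storage[oid]
    if check_serial and serial_read != current_serial:
        raise ConflictError(oid)
    storage[oid] = (current_serial + 1, new_state)


def run_scenario(check_serial):
    storage = {"o": (0, {"a": 0, "b": 0})}

    # "L" starts and sees the world as it is now.
    l_serial, l_state = storage["o"]

    # "M" modifies "o" and commits.
    m_serial, m_state = storage["o"]
    commit(storage, "o", m_serial, dict(m_state, a=1))

    # "L" modifies the state it read earlier and commits.
    commit(storage, "o", l_serial, dict(l_state, b=1), check_serial)
    return storage["o"][1]
```

run_scenario(False) returns {"a": 0, "b": 1}, so "M"'s change to "a" is silently lost; run_scenario(True) raises ConflictError instead.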

> ...
>RelStorage only uses the serializable isolation level for loading, not
>for storing.  A big commit lock prevents database-level conflicts while
>storing.  RelStorage performs ZODB-level conflict resolution, but only
>while the commit lock is held, so I don't yet see any opportunity for
>consistency to be broken.  (Now I imagine you'll complain the commit
>lock prevents scaling, but it uses the same design as ZEO, and that
>seems to scale fine.)

Side note:

  We currently face problems with ZEO's commit lock: we have 24 clients
  that produce about 10 transactions per second. We observe
  occasional commit contention lasting a few minutes.

  We already have found several things that contribute to this problem --
  slow operations on clients while the commit lock is held on ZEO:
  Python garbage collections, invalidation processing, stupid
  application code.
  But there are still some mysteries and we do not yet have
  a good solution.

> 
I noticed another potential problem:

  When more than a single storage is involved, transactional
  consistency between these storages requires a true two phase
  commit.

  Postgres has only recently started to support two phase commits ("2PC"), but
  as far as I know Python access libraries do not yet support the
  extended API (a few days ago, there was a discussion on
  "[EMAIL PROTECTED]" about a DB-API extension for two phase commit).

  Unless you use your own binding to the Postgres 2PC API, "RelStorage"
  seems only safe for single storage use.
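For readers unfamiliar with the term, a two phase commit across several storages can be sketched as follows. This is a toy coordinator with in-memory participants, not ZODB's or RelStorage's code; the class ToyStorage and the method names prepare, commit and abort are assumptions made for the example. At the SQL level, Postgres exposes the two phases as PREPARE TRANSACTION and COMMIT PREPARED.

```python
class ToyStorage:
    """In-memory stand-in for one storage taking part in a transaction."""

    def __init__(self, name, fail_on_prepare=False):
        self.name = name
        self.fail_on_prepare = fail_on_prepare
        self.committed = {}
        self.pending = None

    def prepare(self, changes):
        # Phase 1: promise that a later commit() will not fail.
        if self.fail_on_prepare:
            raise RuntimeError(f"{self.name} cannot prepare")
        self.pending = dict(changes)

    def commit(self):
        # Phase 2: make the prepared changes visible.
        self.committed.update(self.pending)
        self.pending = None

    def abort(self):
        self.pending = None


def two_phase_commit(work):
    """Commit all (storage, changes) pairs in work, or none of them."""
    prepared = []
    try:
        for storage, changes in work:
            storage.prepare(changes)
            prepared.append(storage)
    except Exception:
        # Any participant refusing in phase 1 aborts everyone.
        for storage in prepared:
            storage.abort()
        return False
    for storage in prepared:
        storage.commit()
    return True
```

Without a real prepare phase, the first storage could commit before the second one fails, which is exactly the inconsistency between storages described above.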


-- 
Dieter


Re: [ZODB-Dev] How to avoid ConflictErrors in BTrees ?

2008-02-01 Thread Jim Fulton


On Feb 1, 2008, at 9:49 AM, Tarek Ziadé wrote:

> Jim Fulton wrote:
>> ...
>>> Since BTrees are written in C, I couldn't add my own conflict
>>> manager to try
>>> to merge buckets. (and this is
>>> way over my head)
>> That doesn't really matter, because conflict-resolution can only
>> operate on one object at a time.
>
> Is the class I have shown to Andreas the way to go for conflict
> resolution
> (besides the fact that it shouldn't occur with a better design) ?

I doubt it.  I don't want to work that hard.

> Jim Fulton wrote:
>> ..
>> A similar and common mistake is
>> to allocate keys sequentially.  A better solution is to allocate keys
>> randomly (or sequentially within threads with random starting
>> points).
>
> Is it possible to have some kind of thread-safe next_id() function,
> like what some database systems provide?

There are numerous examples of this. It isn't provided by ZODB because
this is an application issue.  Look at the way ids are generated in
the Zope 3 intid utility and in the Zope 2 catalog.
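As a rough idea of what such an application-level id generator can look like, here is a minimal sketch that draws a random candidate and retries while it is already taken. It is not copied from the Zope 3 intid utility or the Zope 2 catalog; allocate_id and its parameters are hypothetical, and used can be any mapping or set of existing keys.

```python
import random


def allocate_id(used, randrange=random.randrange, upper=2**31):
    """Return an integer id that is not yet present in `used`."""
    while True:
        candidate = randrange(upper)
        if candidate not in used:
            return candidate
```

Because concurrent transactions draw from a large range, they rarely pick neighbouring keys, so their inserts tend to land in different buckets. Two transactions can still pick the same id, in which case the normal conflict handling applies.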


Jim

--
Jim Fulton
Zope Corporation




Re: [ZODB-Dev] How to avoid ConflictErrors in BTrees ?

2008-02-01 Thread Tarek Ziadé


Jim Fulton wrote:
> 
> ...
>> Since BTrees are written in C, I couldn't add my own conflict  
>> manager to try
>> to merge buckets. (and this is
>> way over my head)
> That doesn't really matter, because conflict-resolution can only  
> operate on one object at a time.
> 

Is the class I have shown to Andreas the way to go for conflict
resolution
(besides the fact that it shouldn't occur with a better design) ?


Jim Fulton wrote:
> 
> ..
> A similar and common mistake is  
> to allocate keys sequentially.  A better solution is to allocate keys  
> randomly (or sequentially within threads with random starting points).
> 

Is it possible to have some kind of thread-safe next_id() function,
like what some database systems provide?

++
Tarek
-- 
View this message in context: 
http://www.nabble.com/How-to-avoid-ConflictErrors-in-BTrees---tp15224628p15227536.html
Sent from the Zope - ZODB-Dev mailing list archive at Nabble.com.



Re: [ZODB-Dev] How to avoid ConflictErrors in BTrees ?

2008-02-01 Thread Jim Fulton


On Feb 1, 2008, at 6:03 AM, Tarek Ziadé wrote:
> ...
> Since BTrees are written in C, I couldn't add my own conflict
> manager to try
> to merge buckets. (and this is
> way over my head)

That doesn't really matter, because conflict-resolution can only
operate on one object at a time.

> Is there a way to avoid these conflicts in BTree ?


Your best bet is to look at your application to see if you can avoid  
the hot spot in the first place. For example, your test script creates  
threads that allocate overlapping ids, guaranteeing conflicts. This is  
probably a bug in your test script.  A similar and common mistake is  
to allocate keys sequentially.  A better solution is to allocate keys  
randomly (or sequentially within threads with random starting points).
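The parenthetical variant, sequential keys within a thread with a random starting point per thread, might look like the sketch below. The class ThreadLocalKeys is invented for this example and is not part of ZODB or BTrees; it only hands out candidate keys and does not check whether a key is already in use.

```python
import random
import threading


class ThreadLocalKeys:
    """Hand out keys that are sequential per thread, each thread starting at a random point."""

    def __init__(self, upper=2**31, rng=None):
        self._upper = upper
        self._rng = rng or random.Random()
        self._lock = threading.Lock()  # guards the shared random generator
        self._local = threading.local()  # holds each thread's next key

    def next_key(self):
        current = getattr(self._local, "next", None)
        if current is None:
            # First call in this thread: pick a random starting point.
            with self._lock:
                current = self._rng.randrange(self._upper)
        self._local.next = current + 1
        return current
```

Each thread then fills its own region of the key space, so concurrent inserts mostly touch different buckets, while keys created by one thread stay close together.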


I have plans to redo conflict resolution to:

- do resolution on the client, where the software is,
- make it more flexible,
- allow conflict-resolution on multiple objects

but I don't know when that will happen.  In any case, there are often  
better application-specific approaches to avoid the conflicts in the  
first place.



Jim

--
Jim Fulton
Zope Corporation




Re: [ZODB-Dev] How to avoid ConflictErrors in BTrees ?

2008-02-01 Thread Tarek Ziadé

I did, but not the right way I guess -> now it works.

Here's my prototype to avoid conflict errors (done quickly; I guess the
merging of the indexes can be done better):


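from BTrees.IOBTree import IOBTree


class NoConflictBtree(IOBTree):
    def _p_resolveConflict(self, oldState, savedState, newState):
        if oldState is None and savedState is None:
            # Nothing was stored before: take the new state as it is.
            return newState
        else:
            def _getBucket(el):
                # Dig the item tuple out of the nested state tuple.
                if el is None:
                    return tuple()
                return el[0][0][0]

            def _mergeBuckets(*buckets):
                # Concatenate the item tuples, skipping elements already seen.
                res = []
                for bucket in buckets:
                    for el in bucket:
                        if el not in res:
                            res.append(el)
                return tuple(res)

            res = _mergeBuckets(*[_getBucket(el) for el in (oldState,
                                                            savedState,
                                                            newState)])
            # Wrap the merged items in the same nesting as the input states.
            return (((tuple(res),),),)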

Is this the right way to do it ?



Andreas Jung-5 wrote:
> 
> 
> 
> --On 1. Februar 2008 03:03:53 -0800 Tarek Ziadé <[EMAIL PROTECTED]> 
> wrote:
>>
>> Since BTrees are written in C, I couldn't add my own conflict manager to
>> try to merge buckets. (and this is
>> way over my head)
>>
> 
> But you can inherit from the BTree classes and hook your 
> _p_resolveConflict() handler into the Python class - or?
> 
> Andreas
>  

-- 
View this message in context: 
http://www.nabble.com/How-to-avoid-ConflictErrors-in-BTrees---tp15224628p15225209.html
Sent from the Zope - ZODB-Dev mailing list archive at Nabble.com.



Re: [ZODB-Dev] How to avoid ConflictErrors in BTrees ?

2008-02-01 Thread Andreas Jung



--On 1. Februar 2008 03:03:53 -0800 Tarek Ziadé <[EMAIL PROTECTED]>
wrote:

> Since BTrees are written in C, I couldn't add my own conflict manager to
> try to merge buckets. (and this is
> way over my head)

But you can inherit from the BTree classes and hook your
_p_resolveConflict() handler into the Python class - or?


Andreas



[ZODB-Dev] How to avoid ConflictErrors in BTrees ?

2008-02-01 Thread Tarek Ziadé

Hello,

I have a simple use case in a Zope site:

We had to manage massive object creation, so we set up BTrees.
Users are creating a lot of objects in a single BTree. They are never
working on the same object, just adding new objects. This leads to conflict
errors when the pace of creation is high. It happens for instance in a
website where 200 users work all day long on the website, and are creating
objects in the BTree.

I have created a Python script to try to reproduce this problem with ZODB
3.7. Basically, it tries to reproduce the adding of objects by users, with
sleeps to reproduce a realistic transaction when these objects are added
through a Zope app.

It's here: http://paste.plone.org/19308

The stress values I have set are generating Conflict errors on my MacBook.

I have then tried the very same script, using RelStorage, because I thought
it would be better,
but I have the same unresolved conflicts.

Since BTrees are written in C, I couldn't add my own conflict manager to try
to merge buckets. (and this is
way over my head)

Is there a way to avoid these conflicts in BTree ?

++
Tarek

-- 
View this message in context: 
http://www.nabble.com/How-to-avoid-ConflictErrors-in-BTrees---tp15224628p15224628.html
Sent from the Zope - ZODB-Dev mailing list archive at Nabble.com.



Re: [ZODB-Dev] The write skew issue

2008-02-01 Thread Christian Theune

Hi,

Dieter Maurer wrote:
> Christian Theune wrote at 2008-1-30 21:21 +0100:
>> ...
>> That would mean that the write skew phenomenon that you found would be
>> valid behaviour, wouldn't it?
>
> No.
>
>> Am I missing something?
>
> Yes. No matter how you order the two transactions in my example,
> the result will be different from what the ZODB produces.

Yes, you're right. My intuitive understanding was wrong, I reproduced
both schedules and they both produce the wrong result indeed.
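Dieter's example is not quoted in this message, so the following is the generic textbook illustration of write skew rather than his case: two doctors may each go off call only while another doctor remains on call. All names are made up for the sketch. Each transaction reads both objects but writes only its own, so conflict detection that looks only at written objects sees nothing wrong.

```python
def go_off_call(snapshot, who):
    """Return the writes for `who`: leave only if another doctor stays on call."""
    if sum(snapshot.values()) >= 2:
        return {who: 0}
    return {}


def run_serial(order):
    """Run the transactions one after another, each seeing the previous result."""
    state = {"alice": 1, "bob": 1}
    for who in order:
        state.update(go_off_call(dict(state), who))
    return state


def run_concurrent():
    """Run both transactions against the same snapshot."""
    state = {"alice": 1, "bob": 1}
    snapshot = dict(state)
    writes_a = go_off_call(snapshot, "alice")
    writes_b = go_off_call(snapshot, "bob")
    # The write sets are disjoint, so a per-object check reports no conflict.
    assert not set(writes_a) & set(writes_b)
    state.update(writes_a)
    state.update(writes_b)
    return state
```

Either serial order leaves exactly one doctor on call, while the concurrent run leaves none. That is the sense in which no ordering of the two transactions reproduces the concurrent result.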


Christian

--
gocept gmbh & co. kg - forsterstrasse 29 - 06112 halle (saale) - germany
www.gocept.com - [EMAIL PROTECTED] - phone +49 345 122 9889 7 -
fax +49 345 122 9889 1 - zope and plone consulting and development