Re: [Zope-dev] [ZODB-Dev] [announce] NEO 1.0 - scalable and redundant storage for ZODB

2012-08-29 Thread Marius Gedminas
On Tue, Aug 28, 2012 at 06:31:05PM +0200, Vincent Pelletier wrote:
 On Tue, 28 Aug 2012 16:31:20 +0200,
 Martijn Pieters m...@zopatista.com wrote :
  Anything else different? Did you make any performance comparisons
  between RelStorage and NEO?
 
 I believe the main difference compared to all other ZODB Storage
 implementation is the finer-grained locking scheme: in all storage
 implementations I know, there is a database-level lock during the
 entire second phase of 2PC, whereas in NEO transactions are serialised
 only when they alter a common set of objects.

This could be a compelling point.  I've seen deadlocks in an app that
tried to use both ZEO and PostgreSQL via the Storm ORM.  (The thread
holding the ZEO commit lock was blocked waiting for the PostgreSQL
commit to finish, while the PostgreSQL server was waiting for some other
transaction to either commit or abort -- and that other transaction
couldn't proceed because it was waiting for the ZEO lock.)

Marius Gedminas
-- 
People who think, Oh this is a one-off, need to be offed, or perhaps politely
removed from the project.
-- George Neville-Neil


___
Zope-Dev maillist  -  Zope-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zope-dev
**  No cross posts or HTML encoding!  **
(Related lists -
 https://mail.zope.org/mailman/listinfo/zope-announce
 https://mail.zope.org/mailman/listinfo/zope )


Re: [Zope-dev] [ZODB-Dev] [announce] NEO 1.0 - scalable and redundant storage for ZODB

2012-08-29 Thread Jim Fulton
On Wed, Aug 29, 2012 at 2:29 AM, Marius Gedminas mar...@gedmin.as wrote:
 On Tue, Aug 28, 2012 at 06:31:05PM +0200, Vincent Pelletier wrote:
 On Tue, 28 Aug 2012 16:31:20 +0200,
 Martijn Pieters m...@zopatista.com wrote :
  Anything else different? Did you make any performance comparisons
  between RelStorage and NEO?

 I believe the main difference compared to all other ZODB Storage
 implementation is the finer-grained locking scheme: in all storage
 implementations I know, there is a database-level lock during the
 entire second phase of 2PC, whereas in NEO transactions are serialised
 only when they alter a common set of objects.

 This could be a compelling point.  I've seen deadlocks in an app that
 tried to use both ZEO and PostgreSQL via the Storm ORM.  (The thread
 holding the ZEO commit lock was blocked waiting for the PostgreSQL
 commit to finish, while the PostgreSQL server was waiting for some other
 transaction to either commit or abort -- and that other transaction
 couldn't proceed because it was waiting for the ZEO lock.)

This sounds like an application/transaction configuration problem.
To avoid this sort of deadlock, you need to always commit in a
consistent order.  You also need to configure ZEO (or NEO)
to time-out transactions that take too long to finish the second phase.
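
A minimal sketch of the consistent-order rule, using plain threading
locks as stand-ins for the ZEO commit lock and a PostgreSQL lock. The
names `commit_all`, `worker` and `LOCKS` are hypothetical, and the
timeout is only an assumption standing in for a server-side transaction
timeout; none of this is ZEO, NEO or Storm API.

```python
import threading

# Hypothetical stand-ins for the two resources that deadlocked.
LOCKS = {"postgres": threading.Lock(), "zeo": threading.Lock()}

def commit_all(names, timeout=5.0):
    """Acquire every named lock in sorted order, then release them all.

    Sorting gives every thread the same acquisition order, so two
    threads can never each hold a lock the other is waiting for.
    The timeout bounds how long a stuck commit can block the others.
    """
    acquired = []
    try:
        for name in sorted(names):
            if not LOCKS[name].acquire(timeout=timeout):
                return False  # give up; the caller should abort and retry
            acquired.append(name)
        return True  # the real commit work would happen here
    finally:
        for name in reversed(acquired):
            LOCKS[name].release()

results = []

def worker(names):
    results.append(commit_all(names))

# The two threads ask for the same resources in opposite orders.
threads = [
    threading.Thread(target=worker, args=(order,))
    for order in (["zeo", "postgres"], ["postgres", "zeo"])
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Both threads finish because the order in which they request the locks
no longer matters; without the `sorted()` call, the opposite request
orders could interleave into the deadlock described above.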

I don't think NEO's locking strategy mitigates the deadlock problem
much, if at all.

The strategy should provide greater transaction throughput and
reduce latency.  It's a strategy I'd like to implement for ZEO at some
point.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
Jerky is better than bacon! http://zo.pe/Kqm


Re: [Zope-dev] [ZODB-Dev] [announce] NEO 1.0 - scalable and redundant storage for ZODB

2012-08-29 Thread Jim Fulton
On Tue, Aug 28, 2012 at 12:31 PM, Vincent Pelletier vinc...@nexedi.com wrote:
...
 I forgot in the original mail to mention that NEO does all conflict
 resolutions on client side rather than server side. The same happens in
 relStorage, but this is different from ZEO.

That's good.  I'd like to move ZEO in this direction.  I'd also
like to stop hanging conflict-resolution on classes and have
some kind of registry, so that people can set CR policies
independent of class implementation.
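
A minimal sketch of what such a registry could look like. Everything
here (`register_resolver`, `resolve`, the `TagSet` class) is
hypothetical; the only thing borrowed from ZODB is the three-state
shape of `_p_resolveConflict(oldState, savedState, newState)`.

```python
# Hypothetical registry: maps a class to a conflict-resolution policy,
# so the policy lives outside the class implementation.
_RESOLVERS = {}

def register_resolver(cls, resolver):
    """Associate resolver(old, saved, new) -> merged state with cls."""
    _RESOLVERS[cls] = resolver

def resolve(cls, old, saved, new):
    """Look up the policy for cls (or one of its bases) and apply it."""
    for klass in cls.__mro__:
        resolver = _RESOLVERS.get(klass)
        if resolver is not None:
            return resolver(old, saved, new)
    raise LookupError("no conflict-resolution policy for %s" % cls.__name__)

class TagSet(object):
    """Plain class with no conflict-resolution code of its own."""
    def __init__(self):
        self.tags = set()

def union_policy(old, saved, new):
    # Keep every tag added by either of the two concurrent transactions.
    return {"tags": saved["tags"] | new["tags"]}

register_resolver(TagSet, union_policy)
merged = resolve(
    TagSet,
    {"tags": {"a"}},        # common ancestor
    {"tags": {"a", "b"}},   # already committed
    {"tags": {"a", "c"}},   # being committed
)
```

Swapping `union_policy` for another function changes the policy
without touching `TagSet`, which is the separation described above.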

I didn't realize that RelStorage did client-side CR, but thinking
about it, it has to work that way, since there's no RelStorage
server.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
Jerky is better than bacon! http://zo.pe/Kqm


Re: [Zope-dev] [ZODB-Dev] [announce] NEO 1.0 - scalable and redundant storage for ZODB

2012-08-28 Thread Vincent Pelletier
On Mon, 27 Aug 2012 14:37:37 +0200,
Vincent Pelletier vinc...@nexedi.com wrote :
 Under the hood, it relies on simple features of SQL databases

To make things maybe a bit clearer, from the feedback I get:
You can forget about SQL presence. NEO's usage of SQL is as relational
as a handful of Python dicts is. Except that with dicts there is no way
to load only part of a pickled dict, or to do range searches (ZODB's
BTrees are much better in this regard), or to write to disk atomically
without having to implement this level of atomicity ourselves.

Ideally, NEO would use something like libhail, or maybe something even
simpler like kyotocabinet (except that we need composed keys, and
kyotocabinet b-trees have AFAIK no such notion).
SQL as a data definition language was simply too convenient during
development (need a new column? Easy, even if you have a 40GB table),
and it stuck - and we have yet to find a drawback significant enough to
justify implementing a new storage backend.
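
To illustrate the composed-key point, here is a minimal sketch using
Python's `sqlite3` module. The table and column names (`obj`, `oid`,
`tid`, `data`, `compression`) are assumptions made for the example, not
NEO's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Composite primary key: one row per (object id, transaction id) revision.
conn.execute(
    "CREATE TABLE obj (oid INTEGER, tid INTEGER, data BLOB,"
    " PRIMARY KEY (oid, tid))"
)
conn.executemany(
    "INSERT INTO obj VALUES (?, ?, ?)",
    [(1, 10, b"a1"), (1, 20, b"a2"), (1, 30, b"a3"), (2, 15, b"b1")],
)

def load_before(oid, tid):
    """Return (tid, data) of the newest revision of oid older than tid."""
    return conn.execute(
        "SELECT tid, data FROM obj WHERE oid = ? AND tid < ?"
        " ORDER BY tid DESC LIMIT 1",
        (oid, tid),
    ).fetchone()

# Adding a column later is a single data-definition statement.
conn.execute("ALTER TABLE obj ADD COLUMN compression INTEGER")

row = load_before(1, 25)
```

The lookup is a range search on the second half of the key, which is
the kind of query a flat key/value store without composed keys cannot
answer directly.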

As a side effect, SQL allows gathering some statistics over the data
contained in a database very efficiently. Number of current objects,
number of revisions per object, number of transactions, when
transactions occurred in the database's history, average object size,
largest object, you name it.
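
A few of these statistics, sketched as `sqlite3` queries against a
hypothetical `obj (oid, tid, data)` table; the schema is an assumption
for illustration and not NEO's actual one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE obj (oid INTEGER, tid INTEGER, data BLOB,"
    " PRIMARY KEY (oid, tid))"
)
conn.executemany(
    "INSERT INTO obj VALUES (?, ?, ?)",
    [(1, 10, b"aaaa"), (1, 20, b"aa"), (2, 20, b"bbbbbb"), (3, 30, b"c")],
)

def scalar(sql):
    """Run a query that yields a single value and return it."""
    return conn.execute(sql).fetchone()[0]

stats = {
    "objects": scalar("SELECT COUNT(DISTINCT oid) FROM obj"),
    "transactions": scalar("SELECT COUNT(DISTINCT tid) FROM obj"),
    "average_size": scalar("SELECT AVG(LENGTH(data)) FROM obj"),
    "largest_size": scalar("SELECT MAX(LENGTH(data)) FROM obj"),
}
# Number of revisions stored for each object.
revisions = dict(conn.execute("SELECT oid, COUNT(*) FROM obj GROUP BY oid"))
```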

-- 
Vincent Pelletier
ERP5 - open source ERP/CRM for flexible enterprises


Re: [Zope-dev] [ZODB-Dev] [announce] NEO 1.0 - scalable and redundant storage for ZODB

2012-08-28 Thread Martijn Pieters
On Mon, Aug 27, 2012 at 2:37 PM, Vincent Pelletier vinc...@nexedi.com wrote:
 NEO aims at being a replacement for use-cases where ZEO is used, but
 with better scalability (by allowing data of a single database to be
 distributed over several machines, and by removing database-level
 locking), with failure resilience (by mirroring database content among
 machines). Under the hood, it relies on simple features of SQL
 databases (safe on-disk data structure, efficient memory usage,
 efficient indexes).

How does NEO compare to RelStorage? NEO appears to implement the
storage roughly in the same way; store pickles in tables in a SQL
database.

Some differences that I can see from reading your email:

* NEO takes care of replication itself; RelStorage pushes that
responsibility to the database used.
* NEO supports MySQL and SQLite; RelStorage supports MySQL, PostgreSQL
and Oracle.
* RelStorage can act as a BlobStorage; NEO cannot.

Anything else different? Did you make any performance comparisons
between RelStorage and NEO?

-- 
Martijn Pieters


Re: [Zope-dev] [ZODB-Dev] [announce] NEO 1.0 - scalable and redundant storage for ZODB

2012-08-28 Thread Vincent Pelletier
On Tue, 28 Aug 2012 16:31:20 +0200,
Martijn Pieters m...@zopatista.com wrote :
 Anything else different? Did you make any performance comparisons
 between RelStorage and NEO?

I believe the main difference compared to all other ZODB storage
implementations is the finer-grained locking scheme: in all storage
implementations I know, there is a database-level lock during the
entire second phase of 2PC, whereas in NEO transactions are serialised
only when they alter a common set of objects.
This removes an argument in favour of splitting databases (i.e., using
mountpoints): evading the tpc_vote..tpc_finish database-level locking.
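
The difference between the two schemes can be sketched as follows. This
is a toy model, not NEO's implementation: `ObjectLockTable` and its
methods are hypothetical names, and real storages deal with networking,
timeouts and failures that are ignored here.

```python
class ObjectLockTable(object):
    """Toy per-object lock table: oid -> id of the transaction holding it."""

    def __init__(self):
        self._owner = {}

    def try_lock(self, txn, oids):
        """Lock all oids for txn, or lock nothing if any is already taken."""
        if any(self._owner.get(oid, txn) != txn for oid in oids):
            return False  # overlaps another transaction: must wait
        for oid in oids:
            self._owner[oid] = txn
        return True

    def unlock(self, txn):
        """Release every oid held by txn (end of the second phase)."""
        for oid in [o for o, t in self._owner.items() if t == txn]:
            del self._owner[oid]

table = ObjectLockTable()
first = table.try_lock("T1", {1, 2})    # T1 enters its second phase
second = table.try_lock("T2", {3, 4})   # disjoint objects: runs in parallel
third = table.try_lock("T3", {2, 5})    # shares oid 2 with T1: serialised
table.unlock("T1")
retry = table.try_lock("T3", {2, 5})    # succeeds once T1 has finished
```

With a single database-level lock, `T2` would have had to wait for `T1`
as well, even though the two transactions touch unrelated objects.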

Also, NEO distributes objects over several servers (i.e., some or all
servers might not contain the whole database), for load balancing/
parallelism purposes. This is not possible if one relies on relational
database replication alone.
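
A minimal sketch of the idea, assuming objects are assigned to
partitions by oid and each partition is held by a fixed number of
storage nodes. The function names, the modulo rule and the round-robin
placement are assumptions for illustration; NEO's actual partition
table may work differently.

```python
def build_partition_table(nodes, partitions, replicas):
    """Assign each partition to replicas + 1 distinct nodes, round-robin."""
    copies = replicas + 1
    if copies > len(nodes):
        raise ValueError("not enough nodes for the requested redundancy")
    return {
        p: [nodes[(p + i) % len(nodes)] for i in range(copies)]
        for p in range(partitions)
    }

def nodes_for(table, oid):
    """Return the nodes holding the partition that oid belongs to."""
    return table[oid % len(table)]

table = build_partition_table(["s1", "s2", "s3"], partitions=6, replicas=1)
holders = nodes_for(table, 7)  # oid 7 falls in partition 7 % 6 == 1
```

Every partition exists on two nodes, yet no single node holds all six
partitions, so reads and writes for different objects can be served by
different machines.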

I forgot in the original mail to mention that NEO does all conflict
resolution on the client side rather than the server side. The same
happens in RelStorage, but this is different from ZEO. Resolving
conflicts on the client side makes it easier to get the setup right:
with ZEO you will get more conflicts than normal if the server cannot
load some class which implements conflict resolution, and this might go
unnoticed until someone worries about a performance drop or so. With
client-side resolution, if you don't see Broken Objects, conflict
resolution for those classes works.
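
For readers unfamiliar with the hook: ZODB conflict resolution is
implemented by a `_p_resolveConflict(oldState, savedState, newState)`
method on the class. The sketch below mimics it with a plain class so
that it runs without ZODB installed; `HitCounter` is a hypothetical
example, and in real code the class would subclass
`persistent.Persistent`.

```python
class HitCounter(object):
    """Counter whose concurrent increments can be merged."""

    def __init__(self):
        self.count = 0

    def _p_resolveConflict(self, oldState, savedState, newState):
        # oldState:   state both transactions started from
        # savedState: state already committed by the other transaction
        # newState:   state this transaction is trying to commit
        merged = dict(newState)
        merged["count"] = (
            savedState["count"] + newState["count"] - oldState["count"]
        )
        return merged

# Whoever resolves the conflict must be able to import the class;
# with client-side resolution that is the application process itself.
resolved = HitCounter()._p_resolveConflict(
    {"count": 5}, {"count": 6}, {"count": 8}
)
```

If the process doing the resolution cannot import `HitCounter`, the
method is never called and the commit simply fails with a conflict,
which is the silent degradation described above for a misconfigured
ZEO server.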

Some comments on some points you mentioned:
 * NEO supports MySQL and sqlite, RelStorage MySQL, PostgreSQL and
 Oracle.

It should be rather easy to adapt to more back-ends.
We (Nexedi) are not interested in proprietary software, so we will
probably not implement Oracle support ourselves. For PostgreSQL, it's
just that we do not have a setup at hand and the experience to
implement a client properly. I expect that it would not take more than a
week to get PostgreSQL implemented by someone used to it and knowing
Python, but new to NEO.

Just to demonstrate that NEO really does not rely on fancy features of
SQL servers, you may dig in older revisions in NEO's git repository. You
can find a btree.py[1] test storage, which is based on ZODB's BTrees
classes. It was just a toy, without persistence support (I initially
intended to provide it, but never finished it) and hence limited by
the available amount of RAM. But it was otherwise a fully functional NEO
storage backend. I think it took me a weekend to put it together,
while discovering the BTrees API and adapting NEO's storage backend
API along the way (this was the first non-MySQL backend ever
implemented, so the API was a bit too ad hoc at that time).

SQLite was chosen as a way to get rid of the need to set up a
stand-alone SQL server in addition to the NEO storage process. We are
not sure yet of how well our database schema holds up when there are
several (10+) GB of data in each storage node.

 * RelStorage can act as a BlobStorage, NEO can not.

I would like to stress that this has nothing to do with design, rather
it's just not implemented. We do not wish to rely on filesystem-level
sharing, so we are considering something along the lines of providing a
FUSE-based filesystem to share blob storage, which can then abstract the
blobs being distributed over several servers. This is just the general
idea; we don't have much experience with blob handling ourselves (which
is why we preferred to leave it aside rather than provide an
unrealistic - and hence unusable - implementation).

[1] http://git.erp5.org/gitweb/neoppod.git/blob/75d83690bd4a34cfe5ed83c949e4a32c7dec7c82:/neo/storage/database/btree.py

Regards,
-- 
Vincent Pelletier
ERP5 - open source ERP/CRM for flexible enterprises


Re: [Zope-dev] [ZODB-Dev] [announce] NEO 1.0 - scalable and redundant storage for ZODB

2012-08-27 Thread Lennart Regebro
On Mon, Aug 27, 2012 at 2:37 PM, Vincent Pelletier vinc...@nexedi.com wrote:
 Hi,

 We've just tagged the 1.0 NEO release.

 NEO aims at being a replacement for use-cases where ZEO is used, but
 with better scalability (by allowing data of a single database to be
 distributed over several machines, and by removing database-level
 locking), with failure resilience (by mirroring database content among
 machines). Under the hood, it relies on simple features of SQL
 databases (safe on-disk data structure, efficient memory usage,
 efficient indexes).

That sounds pretty cool!