Re: [ZODB-Dev] mvcc related error?

2007-03-15 Thread Chris Withers

Dieter wrote:

 Unfortunately, neither of these means anything to me ;-)

 That is because you did not look at the code :-)

Much as I wish I had time to read and learn the whole zodb code base, I
don't. It wasn't clear what that code did and what those assertions
really meant...


Jim wrote:
 I'm glad you brought that up.  I'd like to set up a project in Launchpad.


https://bugs.launchpad.net/zodb/+bug/92507

cheers,

Chris

--
Simplistix - Content Management, Zope & Python Consulting
   - http://www.simplistix.co.uk


Re: [ZODB-Dev] mvcc related error?

2007-03-15 Thread Jeremy Hylton

On 3/15/07, Chris Withers [EMAIL PROTECTED] wrote:

Dieter wrote:

 Unfortunately, neither of these means anything to me ;-)

 That is because you did not look at the code :-)

Much as I wish I had time to read and learn the whole zodb code base, I
don't. It wasn't clear what that code did and what those assertions
really meant...


The code in question has some docstrings that explain the basic idea.
You certainly don't need to read the whole codebase.

_setstate_noncurrent(obj) attempts to load the state of obj that was
current before the transaction started (technically, before
_txn_time).  loadBefore() returns a 3-tuple including the transaction
ids that delimit the lifetime of this particular revision of the
object: the revision was written by the transaction with id start and
was current until the transaction with id end committed.  If end is
None, it implies that the revision returned by loadBefore() is the
current revision.  There is an assert there because
_setstate_noncurrent() is only called if the object is in the
invalidated set, which implies that there is a non-current revision to
read.
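
For readers who don't want to open Connection.py, the logic being
described is roughly the following -- a simplified sketch, not the
exact ZODB code, with error handling left out:

    def _setstate_noncurrent(self, obj):
        # Sketch of the logic described above, not the exact Connection
        # code.  loadBefore() returns (data, start, end): the pickle,
        # the id of the transaction that wrote this revision, and the
        # id of the transaction that superseded it (None if the
        # revision is still current).
        t = self._storage.loadBefore(obj._p_oid, self._txn_time)
        assert t is not None
        data, start, end = t
        # The revision must have been written before this transaction
        # started ...
        assert start < self._txn_time
        # ... and, since the object is in the invalidated set, a newer
        # revision must exist, so end cannot be None -- presumably the
        # assertion that the reported error trips over.
        assert end is not None and self._txn_time <= end
        self._reader.setGhostState(obj, data)
        obj._p_serial = start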

If I had to guess, I'd say it was a bug in loadBefore().  It looks
like the only ways for loadBefore() to return None for end are
- The very first record for the object has a transaction id less than
the tid argument.  If so, end_tid is never set.  Not sure this is
compatible with the object being in the invalidated set.
- Something is happening with versions.  Are you using versions?  It
seems likely that there are bugs here.
- There's a bug in the code that reads the data record from the
storage where it reads None for a transaction id.  That seems very
unlikely.

Perhaps the reasoning about invalidated sets and transaction ids is
wrong in the presence of versions.  MVCC is not supposed to work with
versions, but I don't see code that will abort the loadBefore() call
if the Connection has a version.  You aren't using versions, are you?
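
If versions do turn out to be involved, the missing guard would
presumably look something like this -- purely hypothetical code, not
anything that exists in ZODB, and the helper name is made up:

    from ZODB.POSException import ReadConflictError

    # Purely hypothetical, not actual ZODB code: abort the MVCC
    # (loadBefore) path when the connection was opened with a version,
    # since MVCC is not supposed to work with versions.
    def _setstate_noncurrent(self, obj):
        if self._version:
            raise ReadConflictError(object=obj)
        return self._load_noncurrent_revision(obj)  # made-up helper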

Jeremy



Jim wrote:
 I'm glad you brought that up.  I'd like to set up a project in Launchpad.

https://bugs.launchpad.net/zodb/+bug/92507

cheers,

Chris

--
Simplistix - Content Management, Zope & Python Consulting
- http://www.simplistix.co.uk


[ZODB-Dev] Are Data.fs and Data.fs.index the same regardless of platform?

2007-03-15 Thread Ray Liere

We are running Zope 3.2 (and therefore ZODB 3.6), and are using
FileStorage for the ZODB storage mechanism.

We have several Zope installations on different platforms -- different
hardware word sizes, different CPUs, different (Linux) operating
systems, etc.

We routinely copy the Data.fs and Data.fs.index files between systems
(using scp) -- for example, to snag a copy of the live ZODB for
testing purposes. And the ZODB copied to a different platform has
always worked perfectly -- i.e., we are able to access objects in it
(via Zope), their values are correct, etc.

My question: is this because the Data.fs internal format (modulo
whatever scp does when it transfers files between different platforms)
is always the same, no matter on which platform the ZODB was created?
OR ... have we just been extremely lucky?

I looked through lots of web pages and articles on the ZODB (and on
Zope), and did of course learn that objects in the ZODB are pickled
(all object types?) and that pickling gives platform independence.
But it occurs to me that there is probably glue in the Data.fs holding
the pickles together, so to speak, and that the glue itself may not
ALWAYS be platform independent in the above sense. I also found
discussions of Zope and the ZODB being machine-independent, and of
products built on them that are also machine-independent, but I think
the generally understood meaning of that term is not exactly what I am
asking about.
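
To make the "glue" worry concrete: if the glue is written with
Python's struct module using a format that starts with ">" (big-endian
byte order, standard fixed field sizes), then the same values pack to
exactly the same bytes on every platform -- my question is essentially
whether FileStorage does something like that. The format string below
is only an illustration, not a claim about the real header layout:

    import struct

    # Illustrative record header only -- not necessarily FileStorage's
    # real layout.  The leading ">" requests big-endian byte order and
    # standard field sizes, so the packed bytes are identical on any
    # platform, word size or CPU.
    HDR = ">8s8sQQH"      # oid, tid, prev record pos, txn pos, flags
    packed = struct.pack(HDR, b"\0" * 8, b"\0" * 8, 42, 0, 0)
    assert struct.calcsize(HDR) == len(packed) == 34   # same everywhere

    # Native byte order and sizes (format "@", the default) could give
    # different sizes and byte orders per platform -- which is exactly
    # what a portable on-disk format has to avoid.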

Thanks for any answers or pointers you can provide.

Ray Liere


[ZODB-Dev] Re: Community opinion about search+filter

2007-03-15 Thread Martijn Faassen

Hello,

Adam Groszer wrote:

I'd like to ask for your opinions and experiences with searching and
filtering in quite large object databases.
We need to add search and filter functions to our current app, where
the user may be able to create quite _sophisticated_ filter criteria.
(The app is a pure Zope 3 app; the subject is document management.)

Currently we're looking at something based on catalog/indexes.
From what I've seen, the most comfortable solution would be based on
hurry.query.
Some questions arose:
- Is it necessary/worthwhile to add indexes on all attributes?
- How do the indexes perform on modification and retrieval?

The biggest problem is that this will be our first try, so we lack
experience and are a bit puzzled about the right solution.
What is certain is that moving to an RDB is not an option.


I think one of the main limitations of the current catalog (and
hurry.query) is the lack of efficient support for sorting and batching
query results. The Zope 3 catalog returns all matching results, which
can then be sorted and batched, but this stops being scalable for large
collections. A relational database can do this internally and can
potentially apply optimizations there.
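
To make the limitation concrete, this is roughly the pattern an
application is forced into today -- a sketch only, not hurry.query's
actual API; "results" stands for whatever the catalog search returned:

    # Client-side sort-and-batch over a full catalog result set.
    def batch(results, sort_key, start=0, size=20):
        # Sorting touches every hit, so the cost grows with the whole
        # result set even though only `size` items get displayed.
        ordered = sorted(results, key=sort_key)
        return ordered[start:start + size]

    # e.g. the first page of documents ordered by modification time:
    # page = batch(results, sort_key=lambda doc: doc.modified)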


It would be very nice if someone could look into expanding hurry.query 
and/or the catalog to support these cases. It would be interesting to 
look at what Dieter Maurer has done with AdvancedQuery in Zope 2 in this 
regard as well.


Regards,

Martijn




[ZODB-Dev] Re: Community opinion about search+filter

2007-03-15 Thread Ross Patterson
Martijn Faassen [EMAIL PROTECTED] writes:

 I think one of the main limitations of the current catalog (and
 hurry.query) is the lack of efficient support for sorting and
 batching query results. The Zope 3 catalog returns all matching
 results, which can then be sorted and batched, but this stops being
 scalable for large collections. A relational database can do this
 internally and can potentially apply optimizations there.

 It would be very nice if someone could look into expanding hurry.query
 and/or the catalog to support these cases. It would be interesting to
 look at what Dieter Maurer has done with AdvancedQuery in Zope 2 in
 this regard as well.

I recently became obsessed with this problem and sketched out an
architecture for presorted indexes.  I thought I'd take this
opportunity to get some review of what I came up with.

From my draft initial README:

 Presort provides intids that ensure the corresponding documents will
 be presorted in any BTrees objects where the intid is used as a key.

 Presorted intids exist alongside normal intids.  Intids are
 distributed over the range of integers available on a platform so as
 to avoid moving presorted intids whenever possible, but eventually a
 given presorted intid may need to be placed in between two other
 consecutive presorted intids.  When this happens, one or more
 presorted intids will have to be moved.  Normal intids are unchanging,
 as usual.

 Presort also provides a facility for updating objects, such as
 indexes, that store presorted intids when a presorted intid is moved.
 It also provides indexes that map a given query to the appropriate
 presorted intid result set, and catalogs that use the appropriate
 presorted intid utility to look up the real objects for the results.

Would this be a viable approach?  Would it be generally useful?

Also, the problem of distributing intids so that they are moved as
seldom as possible is a bit of a challenge.  I'm sure there's some
algorithm for such problems that I'm just unaware of.  Does anyone
have any pointers for this?
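
To make concrete what I mean by distributing intids so that they
rarely move, here's a toy sketch of the assignment step -- not actual
Presort code, just the gapped-labels idea:

    import bisect

    MIN, MAX = 0, 2 ** 31 - 1   # whatever the platform's intid range is

    def label_between(labels, index):
        """Pick an intid sorting between labels[index-1] and labels[index].

        `labels` is the sorted list of presorted intids already handed
        out; returns None when the gap is exhausted and neighbouring
        intids would have to be moved (the relabeling case from the
        README).
        """
        lo = labels[index - 1] if index > 0 else MIN
        hi = labels[index] if index < len(labels) else MAX
        if hi - lo < 2:
            return None
        return (lo + hi) // 2   # midpoint leaves room on both sides

    # Example: a new document that should sort between the 2nd and 3rd
    # existing ones gets the label 250.
    labels = [100, 200, 300]
    new = label_between(labels, 2)
    bisect.insort(labels, new)

The hard part, and the reason I'm asking for pointers, is deciding how
to spread and re-spread the labels when a gap runs out.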

Ross

___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev