[ZODB-Dev] Re: AW: diploma thesis: ZODB Indexing

2007-09-11 Thread Martijn Faassen

Jim Fulton wrote:


On Sep 5, 2007, at 9:39 AM, Christian Theune wrote:

[snip]
I also have the feeling that our goal for ad-hoc querying would be
incompatible with your envisioned framework for defining
collections and indexes.


I guess I have no idea what you are talking about.

I assumed you meant something along the lines of what people expect
of relational databases.  In the relational world, people define
tables and indexes in order to be able to do indexed ad-hoc queries.
Maybe you are talking about something else.


It is interesting to compare with XML databases. Some XML databases like 
MonetDB or eXist offer XPath queries into the database without anyone 
having to pre-define indexes. Basically these databases tend to index 
the entire tree structure. I'd suggest reading the eXist papers that are 
available. I'd also take a look at MonetDB, as at its core it's a general 
database system which has an RDB and an XML DB built on top.


http://monetdb.cwi.nl/

I'm not sure how many of these ideas can be translated to work for 
Python structures - XPath is rather specific to XML, after all. But if 
anyone wants to talk about my ideas on all this, let me know. :)
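
As a rough illustration of what "indexing the entire tree structure" could
mean for plain Python structures, here is a minimal sketch. It is not how
eXist or MonetDB work internally; the names iter_paths and build_path_index
and the "*" step used for list items are assumptions made up for this
example, and a dict key literally named "*" would collide with that step.

```python
from collections import defaultdict


def iter_paths(obj, path=()):
    """Yield (path, value) pairs for every node in a nested structure."""
    yield path, obj
    if isinstance(obj, dict):
        for key, value in obj.items():
            yield from iter_paths(value, path + (key,))
    elif isinstance(obj, (list, tuple)):
        for value in obj:
            # Like repeated XML child elements, list items share one step name.
            yield from iter_paths(value, path + ("*",))


def build_path_index(root):
    """Map every path in the tree to the values found at that path."""
    index = defaultdict(list)
    for path, value in iter_paths(root):
        index[path].append(value)
    return index


library = {
    "books": [
        {"title": "ZODB Book", "year": 2007},
        {"title": "XML Databases", "year": 2005},
    ]
}
index = build_path_index(library)

# Roughly what an XPath like /books/*/title would select, answered
# from the index without anyone declaring an index on "title" first.
titles = index[("books", "*", "title")]
```

Every path is indexed up front, so no index has to be declared per query;
the price is an index roughly as large as the data itself.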


Regards,

Martijn



___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


[ZODB-Dev] Re: AW: diploma thesis: ZODB Indexing

2007-09-05 Thread Laurence Rowe

Christian Theune wrote:
[snip]

We imagine we need two kinds of components to make this work:

1. A query processor that could look like:

    from zope.interface import Interface

    class IQueryProcessor(Interface):

        def query(*args, **kwargs):
            """Return a list of matching objects.

            The parameters are specific to the query processor in use.
            """


Alternatively, as the signature of the only method isn't specified
anyway, we could make each query processor define its own interface
instead.

2. An object collection that serves two purposes:

a) maintain indexes

b) provide a low-level query API that is rich enough to let different
query processors e.g. for SQL, xpath, ... work against them.

This is the one that needs most work to get the separation of concerns
right. One split we came up with are the responsibilities to define:

- which objects to index
- how to store the indexes
- how to derive the structural relations between objects

Those could be separated into individual components and make the object
collection a component that joins those together.
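
The split described above could be sketched roughly as follows. This is only
an illustration under stated assumptions, not the design from the thesis:
DictIndexStore, ObjectCollection, should_index, children_of, Folder and
Document are all hypothetical names, the index store is a plain in-memory
dict rather than anything persistent, and cycles in the object graph are not
handled.

```python
from collections import defaultdict


class DictIndexStore:
    """Hypothetical 'how to store the indexes' component (in memory)."""

    def __init__(self):
        self._entries = defaultdict(set)

    def add(self, key, obj_id):
        self._entries[key].add(obj_id)

    def lookup(self, key):
        return set(self._entries.get(key, ()))


class ObjectCollection:
    """Joins the three responsibilities into one component."""

    def __init__(self, should_index, store, children_of):
        self.should_index = should_index  # which objects to index
        self.store = store                # how to store the indexes
        self.children_of = children_of    # how to derive structural relations
        self._objects = {}
        self.parents = {}                 # child id -> parent id

    def add(self, obj, parent_id=None):
        obj_id = len(self._objects)
        self._objects[obj_id] = obj
        if parent_id is not None:
            self.parents[obj_id] = parent_id
        if self.should_index(obj):
            # Class index and attribute index, loosely after XISS.
            self.store.add(("class", type(obj).__name__), obj_id)
            for name, value in vars(obj).items():
                if isinstance(value, (str, int, float)):
                    self.store.add(("attr", name, value), obj_id)
        for child in self.children_of(obj):
            self.add(child, obj_id)
        return obj_id

    def find(self, key):
        """Low-level query API that query processors could build on."""
        return [self._objects[i] for i in sorted(self.store.lookup(key))]


class Folder:
    def __init__(self, name, items=()):
        self.name = name
        self.items = list(items)


class Document:
    def __init__(self, title):
        self.title = title


collection = ObjectCollection(
    should_index=lambda obj: isinstance(obj, (Folder, Document)),
    store=DictIndexStore(),
    children_of=lambda obj: getattr(obj, "items", ()),
)
root = Folder("root", [Document("thesis"), Document("notes")])
collection.add(root)
found = collection.find(("attr", "title", "thesis"))
```

Because each responsibility is passed in separately, any one of them can be
swapped without touching the other two.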

On the definition of indexes: we're not sure whether a generic set of
indexes will be sufficient (e.g. the three indexes from XISS: class
index, attribute index, structural index) or whether those need to be
exchangeable.


For our ad-hoc querying we certainly don't want to have to set up
specialised indexes to make things work, but maybe optional indexes
could be used when possible -- just like RDBMS.



Make sure you take a look at SQLAlchemy's implementation of this, 
sqlalchemy.orm.query.


RDBMS do not get fast querying for free... They just revert to a 
complete record scan when they do not have an index - analogous to the 
find tab in the ZMI. As anyone who has ever queried such a database can 
attest, it ain't quick. (RDBMSs tend to create implicit indexes on 
primary and foreign keys also.)
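
To make the point concrete, here is a toy sketch of a query that uses an
index when one exists and otherwise falls back to scanning every record.
It is not how any real RDBMS is implemented; the Table class and its
methods are invented for this example, and rows are assumed to be fixed
once the table has been created.

```python
from collections import defaultdict


class Table:
    """Toy table: a list of dict rows plus optional per-column indexes."""

    def __init__(self, rows):
        self.rows = list(rows)
        self.indexes = {}  # column name -> {value: [row positions]}

    def create_index(self, column):
        index = defaultdict(list)
        for pos, row in enumerate(self.rows):
            index[row[column]].append(pos)
        self.indexes[column] = index

    def select(self, column, value):
        """Return (matching rows, name of the strategy that was used)."""
        if column in self.indexes:
            positions = self.indexes[column].get(value, [])
            return [self.rows[p] for p in positions], "index lookup"
        # No index on this column: every row has to be examined.
        return [r for r in self.rows if r[column] == value], "full scan"


people = Table([
    {"id": 1, "name": "alice"},
    {"id": 2, "name": "bob"},
    {"id": 3, "name": "carol"},
])
people.create_index("id")  # comparable to an implicit primary-key index

by_id, how_id = people.select("id", 2)
by_name, how_name = people.select("name", "alice")
```

Both queries return the right rows; only the amount of work differs, which
is the sense in which the ad-hoc query still works without an index but is
not quick.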


Laurence



Re: [ZODB-Dev] Re: AW: diploma thesis: ZODB Indexing

2007-09-05 Thread Christian Theune
Hi,

On Wednesday, 05.09.2007, at 21:47 +0100, Laurence Rowe wrote:
 Make sure you take a look at SQLAlchemy's implementation of this, 
 sqlalchemy.orm.query.

Thanks for the tip.

 RDBMS do not get fast querying for free... They just revert to a 
 complete record scan when they do not have an index - analogous to the 
 find tab in the ZMI. As anyone who has ever queried such a database can 
 attest, it ain't quick. (RDBMSs tend to create implicit indexes on 
 primary and foreign keys also.)

Well, they do have some support on the storage side because of the
strongly typed, rectangular shape of the data. E.g. we discovered that
PostgreSQL seems to never do index lookups in tables with fewer than about
1000 rows -- for us it always did table scans even when indexes existed.
