Hi,
On Mon, 2008-08-18 at 20:17 +0300, Markus wrote:
I'm new here, so hi! ;-)
I'm looking to create a database of persons and events, later to
search persons by names, events by dates and locations (participants
of events are already in an attribute of the event and instances of
Person, which inherits from Persistent)
At first I made a PersistentList of all the events and a
PersistentMapping of all the people by an id, but later found out
that searching through a list with a for-loop is very slow (there are
about 200 000 people and 100 000 events). So I've looked around
here a bit (the docs and the wikis are mostly outdated or empty, and
there's also talk about the bad documentation on this mailing list) and
found that I should be using OOBTree for making the indexes.
Yes, the documentation situation is less than desirable for
beginners. :/
So what I'm asking is, is it reasonable to create the db like this:
persons in root['persons'],
which is an OOBTree, mapping names to Person-objects and events in
root['events'], which is also an OOBTree, mapping dates to Event-objects?
And if I want to map locations to events, I should do it at the same
time, when creating the events, so I don't have to loop through all of
them again?
Here's what I do:
Create a physical structure that models your data in a 'natural' way.
This can e.g. be:
- A root object representing the application, in case you may want to
hold multiple instances of your application within a single database.
- BTrees for storing large lists of objects, like you do. But mainly
with a single lookup direction, e.g. for you the name-to-person
mapping.
Sometimes those lists just work with arbitrary IDs for the objects,
much like primary keys in tables.
Alternatively, if you have a VFS-like structure, you might want to use
the folder/item metaphor for the main structure of your database.
- Add an indexing/searching framework for orthogonal queries. This is
called `cataloging` in the Zope/ZODB universe. Some (more or less)
standalone solutions are found in the proximity of `zope.catalog`.
Use those to create tabular views on your data (independent of the
physical structure) that are queryable by indexed arguments. Those are
fast.
If I have an OOBTree mapping dates to events, what should its values
be? PersistentLists? I've read something about Buckets and Sets,
but I'm not sure what they are good
for. Bucket seems to behave like the equivalent BTree (OO, IO, OI,
IF, ...), but Set seems to be a set... Is that true?
I'd go with a flat structure. See my note on 'arbitrary' IDs above.
What's the difference between a PersistentMapping and an OOBTree or
OOBucket? Only the back-end, because on the front they all seem like
dictionaries? Should I be using OOBTrees and OOBuckets for what I'm
doing, because strings and dates are Os and not Is or Fs or...?
A PersistentMapping (PM) is a persistent dictionary that loads all of its
data at once.
A bucket is a leaf node of a BTree; it holds a small, sorted run of the
tree's items.
A BTree is a (key-)sorted(!) data structure that provides a key/value
interface like dictionaries do. Due to that, the lookup of items in a
BTree is fast and also memory efficient, as only individual buckets of
the BTree need to be activated for a lookup (optimally only O(log n)
buckets).
Christian
--
Christian Theune · [EMAIL PROTECTED]
gocept gmbh & co. kg · forsterstraße 29 · 06112 halle (saale) · germany
http://gocept.com · tel +49 345 1229889 7 · fax +49 345 1229889 1
Zope and Plone consulting and development