Re: [elephant-devel] Representational Question

Ian Eslick Thu, 06 Mar 2008 07:11:16 -0800

I agree with Robert. The best way to start is to use lisp as a querylanguage and essential do a search/match over the object graph.

The rub comes when you start looking at performance. A linear scan ofalot of class instances can become prohibitive, even for small setsizes, especially since you are randomly accessing the elements ofrelated class instances. This is in large part what class indexing isfor, to reduce those scans to a reasonable range of instances andavoid too many random accesses of disk btree pages.

One of the big things that SQL engines exploit are the way relationsare stored on disk. A relational query using foreign key referencesfrom one table (class) to another allows SQL engines to filter rows oftable A using a list of rows of table B via join tables.


(select a
 :where (= (fkey a) (oid b))
 :where (and (< 2/5/2004 b.date) (< b.date 2/5/2006))).

A simple lisp implementation is forced to be O(N) where N is thenumber of instances of A - we simply scan instances of A and test(date (fkey a)). With an index on (fkey a) and (date b) we can makethis into O( n + m log N + m log m ) I think. We do a range scan overdate values from instances of B -- O(m log m) -- and sort them. Thenusing the ordered index of B oids -- the index for (fkey a) -- we cansimply skip along finding all instances of A that have one of thematching instances of B. This is m doctors * log n accesses.

This is necessarily rough since the BTree branching factor + keyscanning has an impact as does access patterns influcing memory opsvs. # of disk seeks, memory, simple caching and streaming cache effects.

Even for small set sizes (ignoring constant factors), the complexapproach is helpful.


A = 1000
B = 100
Selecting 10 A's using 3 B's

simple cost = 1000
complex cost = 19

A = 1000000 (person->doctor)
B = 200000 (doctor->degree date)
Selecting 10000 A's using 100 B's

simple cost = 1000000
complex cost = 10000 + 300 + 200 = 10600

My #'s may be off here, but you get the general idea. I'm hoping thatthe stuff we're writing now makes it reasonably easy for users toperform the optimizations mentioned above manually. As we get this towork, we may be able to get a query interpreter to do the above selectquery for us. There are further optimizations, such as deciding whatlinear scans, index scans and join ordering will best optimize a query(i.e. a query planner) - but one step at a time!


Ian



Ian

On Mar 6, 2008, at 9:13 AM, Robert L. Read wrote:

On Wed, 2008-03-05 at 21:50 -0800, Paul Legato wrote:

I think the thing is not that we need professional consulting, but we
are trying to become the professional consultants. :) Could you more
knowledgable ODBMS coders recommend some good books or papers on the
topic of ODBMS care and feeding that we should read to get a better
handle on all this?

Cheers,
Paul


Thanks for the input, Paul.

I'm afraid I can't recommend a book on ODBMS, but for me a starting
point is this:

http://www.ibm.com/developerworks/library/wa-objprev/

I have personally always used Elephant in "prevalence" style where(Ian

doesn't do this, for example, because his database is too big.)

I personally think this mad would be clarified if someone (like me or
this community) would write a short guide to "doing SQL like things"

with LISP analogs. This would have the advantage of showing howsimple

it is to do the SQL operations in LISP.

To me, the fact that Elepahnt allows you to use LISP as your query
language rather than creating yet another mismatch between the
programming language and the query language is a major advantage.

I will make a 30-second beginning for such a document:

*) We have a way to map over every instance of a class. A simplefilter

performs the equivalent of a "select" in SQL, but with much wider
computational power.

*) The "aggregate" operations in SQL (SUM, MAX, MIN, AVG) can beeasily

expressed in terms of the the "reduce" operation in LISP.

*)  Any sort of "graph" problem (topological sort, find-all-children,
find-all-descendants, etc.) is easier to program in LISP than in SQL.



_______________________________________________
elephant-devel site list
elephant-devel@common-lisp.net
http://common-lisp.net/mailman/listinfo/elephant-devel


_______________________________________________
elephant-devel site list
elephant-devel@common-lisp.net
http://common-lisp.net/mailman/listinfo/elephant-devel

Re: [elephant-devel] Representational Question

Reply via email to