Re: [elephant-devel] virtual subtree?

Ian Eslick Tue, 17 Apr 2007 09:21:13 -0700

Comments are inline below...

On Apr 17, 2007, at 11:11 AM, Joe Corneli wrote:

Thanks again for the detailed response... here are some follow-up
questions.  Not critical, but a few answers might satisfy my
curiosity...

   The easiest thing I can think of is to create a derived index on
   the classes of elements in your btree and then use map-index to map
   over the instances.

   ELE-TESTS> (setf my-things (make-indexed-btree))
   ELE-TESTS> (add-index my-things :index-name 'thing-type
                         :key-form '(lambda (index k v)
                                     (values t (type-of v))))
   ELE-TESTS> (map-index (lambda (sk v pk) (print v))
                         (get-index my-things 'thing-type)
                         :value 'symbol)


OK, this will enable me to grab all of the triples, which is good.
But if I understand correctly, it is not associated with *subsequent*
efficient search through the results, so I should use idioms like:


Yup, thus my second example...

   ELE-TESTS> (add-index my-things :index-name 'triples-first
                          :key-form '(lambda (index k v)
                                       (if (subtypep (type-of v) 'triple)
                                           (values t (triple-first v))
                                           (values nil nil)))
                          :populate t)

[Note that I have fiddled with the `lambda' form you supplied; the
original

(lambda (index k v)
  (if (subtypep (type-of v)
                'triple) t)
  (values t (triple-first v))
  (values nil nil))

did not make sense to me -- a typo?]


A typo!

You say:

   This will create an index 'triples-first which only indexes triples
   and does so by the value of the first element.  Thus you can easily
   retrieve all triples with the first element eq to 5.

This sounds good -- but -- I would also like to be able to look up
triples by the middle, end, and combinations (like, "match beginning
and end").  No problem from the coding point of view, but I want to
check that creating all of these indices isn't nuts from the storage
point of view.  Such additional indices don't take up too much space,
right?  (I certainly don't have any other better ideas in mind!  but
thought I should clarify that there may be as many as 7 indices for
all the different look-up combinations.)

Each index has one entry for every triple you store.  The entry
contains the 'value' returned by the lambda expression (as small as
5 bytes for fixnums) and the primary key (5 bytes if using a fixnum
id) of the triple in the main btree.  Add to this the overhead for
the BTree data structures.  Since it's all on disk, the storage
really isn't that obscene.
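
As a rough back-of-the-envelope check on those figures, here is a
minimal sketch in plain Common Lisp (no Elephant calls).  The function
name and the example count of one million triples are assumptions made
for illustration, and BTree overhead is left out because no number is
given for it above.

```lisp
;; Hypothetical helper: raw entry bytes for a set of secondary indices,
;; using the per-entry sizes quoted above (about 5 bytes for a fixnum
;; index value plus 5 bytes for a fixnum primary key).
;; BTree page overhead is NOT included.
(defun estimate-index-bytes (triple-count index-count
                             &key (value-bytes 5) (pkey-bytes 5))
  (* triple-count index-count (+ value-bytes pkey-bytes)))

;; Assumed workload: 1,000,000 triples and 7 indices.
(estimate-index-bytes 1000000 7) ; => 70000000, about 70 MB of raw entries
```

Larger index values (strings or symbols, say) would raise the
per-entry size, so treat the result as a lower bound.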

Combinations are usually dealt with by joins (elements where head = 1
AND tail = 2 AND type = 10).

You could do this manually, but it means deserializing each set
individually and then doing an intersection between them.  If your
sets are likely to be small, you could just do the intersection in
lisp.  If your sets are likely to be very large, then you need to be
more clever.
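
For the small-set case, a minimal sketch of doing the intersection in
Lisp might look like the following.  It assumes the primary keys from
each index have already been collected into ordinary lists (for
example with map-index); the helper name is hypothetical and not part
of Elephant.

```lisp
;; Hypothetical helper: intersect several lists of primary keys.
;; Each list stands for the keys already gathered from one index;
;; the gathering step itself is not shown here.
(defun intersect-key-lists (first-list &rest more-lists)
  (let ((result (remove-duplicates first-list :test #'eql)))
    (dolist (keys more-lists result)
      ;; Put the next list in a hash table so membership tests are cheap.
      (let ((seen (make-hash-table :test #'eql)))
        (dolist (k keys)
          (setf (gethash k seen) t))
        (setf result
              (remove-if-not (lambda (k) (gethash k seen)) result))))))

;; Keys matching head = 1, tail = 2 and type = 10 respectively (made up):
(intersect-key-lists '(1 4 7 9) '(4 9 12) '(9 4 3)) ; => (4 9)
```

The surviving keys can then be looked up in the main btree to fetch
the triples themselves.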

I'm working on a query system (for class instances only) that would
automatically exploit the three indexes and do cheap joins using
sorted instance oids, i.e.


(constrain (t thing)
   where (eq (type t) 'canDo)
         (eq (target t) 'bark))

This would get all instances of class triple that represent sources
that canDo bark (X canDo 'bark).  If the indexes are defined only
over triples, then no class-of or type-of constraint is needed.  The
user will still need to keep track of some of these constraints, as
not all optimizations can be automatically determined.

Computationally, this would involve getting all the oids from the
type index, all oids from the bark index, doing a sorted merge, then
using the oids to pull out the instances during a mapping operation,
i.e.
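
The sorted-merge step can be sketched in plain Common Lisp as below.
This is only an illustration of the technique, not the planned query
engine: the function name is hypothetical, and both inputs are assumed
to be lists of integer oids already in ascending order.

```lisp
;; Hypothetical helper: intersect two ascending lists of oids in a
;; single pass, advancing whichever list has the smaller head.
(defun merge-intersect-oids (a b)
  (let ((result '()))
    (loop while (and a b)
          do (let ((x (first a))
                   (y (first b)))
               (cond ((< x y) (pop a))
                     ((> x y) (pop b))
                     (t (push x result) ; oid present in both lists
                        (pop a)
                        (pop b)))))
    (nreverse result)))

;; Made-up oids from the type index and the bark index:
(merge-intersect-oids '(2 5 8 13 21) '(5 6 13 34)) ; => (5 13)
```

Because each list is walked once, the cost grows with the combined
length of the two lists rather than with their product.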


(map-query-result
  (lambda (triple) (print (source triple)))
  (constrain (t thing)
    ...))

The syntax will evolve, of course, but it will be something like
this.  The query engine will go into the 1.0 or 1.1 release.  I'll
try to write it so someone could write a more general version to do
this for their own btrees rather than relying on the automated class
support.  The query language uses slot/accessor semantics and
reflection, so a more general solution would require a different and
probably more verbose syntax.

   This does create a parallel 'index' which is its own btree.  There
   isn't a good way to do this using only a single btree as we don't
   expose the sort order constraints or allow the user to create partial
   keys so they can use cursors to iterate over a subset of the btree.
   Now if you indexed your things by typename, then you could do this
   with a single tree, although you'd have to go over indexes

Actually this would _only_ work for non-numeric types and is perhaps
too dependent on the current serializer.  I think I'd rather not
restrict the data store implementation that much given that we're due
to have 4 separate ones in the future.

Hm, this is an interesting suggestion, but I think it would eliminate
the "elegance" (perhaps questionable) of being able to easily and
cleanly reach any Thing by its unique ID number (my current strategy).
_______________________________________________
elephant-devel site list
[email protected]
http://common-lisp.net/mailman/listinfo/elephant-devel

