Comments are inline below...

On Apr 17, 2007, at 11:11 AM, Joe Corneli wrote:

Thanks again for the detailed response... here are some follow-up
questions.  Not critical, but a few answers might satisfy my
curiosity...

   The easiest thing I can think of is to create a derived index on
   the classes of elements in your btree and then use map-index to map
   over the instances.

   ELE-TESTS> (setf my-things (make-indexed-btree))
   ELE-TESTS> (add-index my-things :index-name 'thing-type
                         :key-form '(lambda (index k v)
                                     (values t (type-of v))))
   ELE-TESTS> (map-index (lambda (sk v pk) (print v))
                         (get-index my-things 'thing-type) :value 'symbol)

OK, this will enable me to grab all of the triples, which is good.
But if I understand correctly, it is not associated with *subsequent*
efficient search through the results, so I should use idioms like:

Yup, thus my second example...

   ELE-TESTS> (add-index my-things :index-name 'triples-first
                          :key-form '(lambda (index k v)
                                       (if (subtypep (type-of v) 'triple)
                                           (values t (triple-first v))
                                           (values nil nil)))
                          :populate t)

[Note that I have fiddled with the `lambda' form you supplied, the
original

(lambda (index k v)
  (if (subtypep (type-of v)
                'triple) t)
  (values t (triple-first v))
  (values nil nil))

did not make sense to me -- a typo?]

A typo!

You say:

   This will create an index 'triples-first which only indexes triples
   and does so by the value of the first element.  Thus you can easily
   retrieve all triples with the first element eq to 5.

This sounds good -- but -- I would also like to be able to look up
triples by the middle, end, and combinations (like, "match beginning
and end").  No problem from the coding point of view, but I want to
check that creating all of these indices isn't nuts from the storage
point of view.  Such additional indices don't take up too much space,
right?  (I certainly don't have any other better ideas in mind!  but
thought I should clarify that there may be as many as 7 indices for
all the different look-up combinations.)

Each index has one entry for every triple you store. The entry contains the 'value' returned by the lambda expression (as small as 5 bytes for fixnums) and the primary key (5 bytes if using a fixnum id) of the triple in the main btree. Add to this the overhead for the BTree data structures. Since it's all on disk, the storage really isn't that obscene.
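
As a rough back-of-the-envelope check of the figures above, the per-entry cost can be multiplied out. This is only a sketch: `estimate-index-bytes` is a hypothetical helper, not an Elephant function, it ignores the BTree overhead entirely, and the 5-byte defaults assume fixnum values and fixnum ids. Indexes over combinations of elements would likely store larger values.

```lisp
;; Rough lower bound on index storage, ignoring BTree overhead.
;; ESTIMATE-INDEX-BYTES is a hypothetical helper, not an Elephant function.
(defun estimate-index-bytes (n-triples n-indexes
                             &key (value-bytes 5) (key-bytes 5))
  "Bytes for N-INDEXES indexes over N-TRIPLES entries, excluding BTree overhead."
  (* n-triples n-indexes (+ value-bytes key-bytes)))

;; One million triples, seven indexes, fixnum values and fixnum ids:
(estimate-index-bytes 1000000 7)   ; => 70000000, about 70 MB before overhead
```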

Combinations are usually dealt with by joins (elements where head = 1 AND tail = 2 AND type = 10).

You could do this manually, but it means deserializing each set individually and then doing an intersection between them. If your sets are likely to be small, you could just do the intersection in lisp. If your sets are likely to be very large, then you need to be more clever.
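
For the small-set case, the manual intersection could be as simple as the following sketch. It assumes each index lookup has already been collected into a list of primary keys (oids); `intersect-oid-sets` is a hypothetical helper name, and only standard Common Lisp functions are used.

```lisp
;; Manual join over small result sets. Each argument is a list of primary
;; keys (oids), as might be collected from one index lookup.
;; INTERSECT-OID-SETS is a hypothetical helper, not an Elephant function.
(defun intersect-oid-sets (first-set &rest more-sets)
  "Return the oids present in FIRST-SET and in every one of MORE-SETS."
  (reduce (lambda (acc set) (intersection acc set :test #'eql))
          more-sets
          :initial-value first-set))

;; Suppose head = 1 matched oids (3 7 9 12), tail = 2 matched (7 12 40),
;; and type = 10 matched (1 7 12 13). INTERSECTION does not guarantee any
;; order, so the result is copied and sorted for display.
(sort (copy-list (intersect-oid-sets '(3 7 9 12) '(7 12 40) '(1 7 12 13)))
      #'<)
;; => (7 12)
```

Each call to `intersection` is quadratic in the worst case, which is why this only makes sense when the sets are small.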

I'm working on a query system (for class instances only) that would automatically exploit the three indexes and do cheap joins using sorted instance oids. i.e.

(constrain (t thing)
   where (eq (type t) 'canDo)
         (eq (target t) 'bark))

This would get all instances of class triple that represent sources that canDo bark (X canDo 'bark). If the indexes are defined only over triples, then no class-of or type-of constraint is needed. The user will still need to keep track of some of these constraints, as not all optimizations can be automatically determined.

Computationally, this would involve getting all the oids from the type index, all oids from the bark index, doing a sorted merge then using the oids to pull out the instances during a mapping operation. i.e.

(map-query-result
  (lambda (triple) (print (source triple)))
  (constrain (t thing)
    ...))

The syntax will evolve, of course, but it will be something like this. The query engine will go into the 1.0 or 1.1 release. I'll try to write it so someone could write a more general version to do this for their own btrees rather than relying on the automated class support. The query language uses slot/accessor semantics and reflection so a more general solution would require a different and probably more verbose syntax.
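
The sorted-merge step mentioned earlier (intersecting the oids that come out of two indexes) might look roughly like this in plain Lisp. It is a sketch under assumptions, not the query engine's code: `merge-join-oids` is a hypothetical name, and both inputs are assumed to be ascending lists of integer oids without duplicates.

```lisp
;; Merge-join two ascending, duplicate-free lists of oids in one pass.
;; MERGE-JOIN-OIDS is a hypothetical helper, not part of Elephant.
(defun merge-join-oids (xs ys)
  "Return the oids common to the sorted lists XS and YS, in ascending order."
  (let ((result '()))
    (loop while (and xs ys)
          do (let ((x (first xs))
                   (y (first ys)))
               (cond ((< x y) (pop xs))      ; x cannot appear in YS any more
                     ((> x y) (pop ys))      ; y cannot appear in XS any more
                     (t (push x result)      ; match: keep it, advance both
                        (pop xs)
                        (pop ys)))))
    (nreverse result)))

(merge-join-oids '(2 5 8 13 21) '(1 5 13 34))   ; => (5 13)
```

Because each list is walked once, the cost is linear in the combined length of the two lists, which is what makes the join cheap when the oids are already sorted.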

   This does create a parallel 'index' which is its own btree.  There
   isn't a good way to do this using only a single btree as we don't
   expose the sort order constraints or allow the user to create partial
   keys so they can use cursors to iterate over a subset of the btree.
   Now if you indexed your things by typename, then you could do this
   with a single tree, although you'd have to go over indexes

Actually this would _only_ work for non-numeric types and is perhaps too dependent on the current serializer. I think I'd rather not restrict the data store implementation that much given that we're due to have 4 separate ones in the future.

Hm, this is an interesting suggestion, but I think it would eliminate
the "elegance" (perhaps questionable) of being able to easily and
cleanly reach any Thing by its unique ID number (my current strategy).
_______________________________________________
elephant-devel site list
elephant-devel@common-lisp.net
http://common-lisp.net/mailman/listinfo/elephant-devel
