Hi Ian, I read the thread from a few days ago regarding OIDs. While I agree with your position, I think it's beneficial to be able to get a list of OIDs to be able to perform these kinds of operations. I think the premise from the other thread was that of "relying" on OIDs for relational purposes.
I also like the idea of where the query system may be going. But, independently of that, being able to specify something like :oid t in the get-instances-by-range function and then be able to "walk" through a set of oids, specifying the base class and the "walk" mechanism will instantiate those objects. Think of it as a primitive to the query system. Thanks, Daniel On Fri, June 1, 2007 2:08 am, Ian Eslick <[EMAIL PROTECTED]> said: > That's very insightful and quite correct! I was going to use this > mechanism as part of the query system I want to get working for 1.0. > This is pretty much the mechanism that relational databases use to do > join queries and Elephant will be no different. > > As a matter of discipline, I didn't want to encourage users to mess > around with oids because of the issues outlined in an e-mail thread a > few days ago. There will be an internal elephant api to access this > functionality for advanced users. That function will essentially be > a copy of map-inverted-index, but using cursor-get to get only the > oid instead of cursor-pget to get the oid and primary value. I'm > happy to be argued out of this position, but my experience is that > people will start to depend on this interface and then complain when > features are added later that leads to bugs in user code. > > The query system will perform these kinds of optimizations for you. > The interface will be something like: > > (map-query > (lambda (obj) > (render obj screen)) > (select-objects (geometry) > where > (between geometry.x 10 20) > (between geometry.y -5 15))) > > For a class named geometry with indexed slots x and y. If x and y > are indexed, then the query will be very efficient. If not, then it > will be a linear walk. I aim to add informative statements and > statistics to select-objects to help people optimize queries. This > is similar to the select clause in SQL, but will have features > oriented towards path queries (object references) and graph queries. > > The select-objects statement effectively returns a generator, which > will probably be a list of oids, either in memory or maintained on > disk for large query sets. > > I'm open to suggestions on syntax. I'm not happy with my current > sketches here. > > Ian > > PS - If you really want to get OIDs as the user, you can using > cursors over the inverted index to get the ranges yourself, then you > can do exactly the intersection operation you mentioned before. Of > course you need the classes of the oids so selected to recover the > objects. > > On May 31, 2007, at 11:43 PM, [EMAIL PROTECTED] wrote: > >> Please excuse if I don't make a direct reference to Elephant >> solving this in my comment below. However, I remembered reading >> something just like this in AllegroCache's Reference Manual, in >> which it said, and I quote: >> >> "In a database every object has a unique object identifier (oid). >> This value can be retrieved using db- object- oid. An oid is an >> integer. There is no way to determine the class of an object given >> its oid. >> >> Usually a program need not concern itself with oids. However in >> certain circumstances it may be convenient to work with oids. One >> such case is when combining the results of multiple indexes over >> the same class. You may want to ask for the set of objects whose X >> slot is greater than 10 and whose Y slot is greater than 20. It >> consumes fewer resources to ask for the oids of objects whose X >> slot is greater than 10 than to ask for the objects themselves. In >> the later case the objects retrieved have to be instantiated in the >> object cache and there's no point in doing that if you don't need >> all those objects. In this case you don't need all those objects >> since you only need those objects whose Y slot is also greater >> then 20. Thus the optimal way to do this query is to find the >> intersection of the oids corresponding to "X > 10" and those oids >> with "Y > 20" and then from that intersection find the objects >> corresponding to the oids." >> >> They later document the following function: >> >> "retrieve- from- index- range (class slot name >> initial value end value >> &key (db *allegrocache*) oid) >> >> returns all objects of the given class (a persistent class object >> or a symbol) whose slot slot name has a value in the range >> beginning with initial value up to but not including end value. If >> oid is true then the object id values are returned instead of the >> objects." >> >> It's interesting to note the "oid" key argument. I'm not sure if >> Elephant's get-instances-by-range supports something like that. >> But, a similar solution to the problem presented is: >> >> "We could find the oids of all the employees whose first name >> begins with “Jo” and whose last name begins with “F” >> using >> >> (intersection (retrieve-from-index-range >> ‘Employee ‘first-name “Jo” >> “Jp” :oid >> t) >> (retrieve-from-index-range >> ‘Employee ‘last-name “F” >> “G” :oid >> t))" >> >> given a persistent class Employee with indices on slots first-name >> and last-name. >> >> If Elephant supports something similar to the "oid" key argument, >> it would be very useful for these kind of queries, where you can >> get an initial result set and only create the instances of those >> objects if/when needed. >> >> Just my $0.02 :) >> >> - Daniel >> >> On Thu, May 31, 2007 11:07 pm, Ian Eslick <[EMAIL PROTECTED]> said: >> >>> Ignas, >>> >>> The easiest way to do this is to follow Robert's suggestion, and >>> declare slots x and y to be indexed (assuming your parameters are >>> slot values of a persistent objects) and then say: >>> >>> (remove-duplicates >>> (nconc (get-instances-by-range 'my-class 'x 10 20) >>> (get-instances-by-range 'my-class 'y -5 15))) >>> >>> Of course you might do a little better, performance-wise, if you >>> filter the y value as you traverse the x range (or vice-versa). >>> Since get-instances-by-range is just a wrapper around map-inverted- >>> index that collects visited objects, this could be up to twice as >>> fast, although in practice I'd expect the benefit to be marginal: >>> >>> (let ((matches nil)) >>> (map-inverted-index (lambda (obj) >>> (when (and (> y -5) (< y 15)) >>> (push obj matches))) >>> 'my-class >>> 'x >>> 10 >>> 20) >>> matches) >>> >>> This version avoids a second map-inverted-index operation (called by >>> get-instances-by-range) by only collecting objects if their y >>> coordinate is in the range. You can wrap all this in a function like >>> (get-objects-in-region x1 y1 x2 y2). Map over the x or y based on >>> which query is smaller, or which parameter is sparser. If you don't >>> know, then it doesn't really matter which you choose! >>> >>> If you get some performance comparisons of these two scenarios, >>> please post them to the list! >>> >>> Regards, >>> Ian >>> >>> PS - It would be interesting to see an R-tree (or one of the well- >>> known variants such as R* or Priority R-Trees) in Elephant. Instead >>> of using indexed classes, you would directly implement the R-tree >>> nodes as persistent instances and write the construction and >>> retrieval algorithms as if they were operating on in-memory objects. >>> Graphs are a very natural structure to implement in Elephant and the >>> only special provision I can think of might be maintaining a >>> persistent set of free nodes for the dynamic case. >>> >>> On May 31, 2007, at 10:17 PM, Robert L. Read wrote: >>> >>>> I'm assuming x and y are properites of a data object, which has >>>> some other component z which >>>> you with to retrieve, and you query is that you want to find all >>>> the records (x,y,z) such that >>>> (10 < x < 20) and ( -5 < y < 15). >>>> >>>> There is a spectrum of solutions to this problem. However, in the >>>> general case Elephant will >>>> not compute this with the best possible asymptotic complexity, >>>> although it may be better than >>>> a relational database at doing so. >>>> >>>> Here is the most naive solution: >>>> >>>> Examine every object in the database, and report those that meet >>>> that above condition. >>>> >>>> Although this may seem silly, don't knock it till you try it....it >>>> may be perfectly reasonable. >>>> >>>> A second solution is to make sure that you have specified :index on >>>> the x component >>>> (or, without loss of generality, the y component), and then using >>>> the "get-instance-by-range" >>>> feature to get all of the instances within the x range, and use a >>>> simple lisp function to discard >>>> those for which the y component does not match. >>>> >>>> This will be relatively fast if the x range excludes a lot of >>>> objects and the y range doesn't. >>>> >>>> It's not obvious to me that one can do better than that without >>>> some significant coding >>>> (For example, a "grid file" is a datastructure designed to answer >>>> such queries efficiently. >>>> An R-Tree is another geometric structure.) >>>> >>>> A relational database will not do better, in the worst case. A >>>> relational database will do >>>> a better job at statistical sampling---that is, a query optimizer >>>> should, in a huge database, >>>> be able to decide whether it should use the "x" index or the "y" >>>> index first based on the >>>> selectivity of those clauses in the boolean expression. However, >>>> fundamentally it can't >>>> do any better than to pick the best index, use that only examine >>>> some of the objects, and then >>>> look at the values of the other ones. >>>> >>>> On Fri, 2007-06-01 at 03:48 +0300, Ignas Mikalajunas wrote: >>>>> Hi, i have an interesting usecase and i would like to ask whether >>>>> i can solve it by using Elephant. I would like to index objects >>>>> placed on a discreet 2d grid of an arbitarry size, which would >>>>> require queries like (10 < x < 20) AND (-5 < y < 15). Is it >>>>> possible to perform such a query with elephant? I am not sure i >>>>> have looked in the right place, but i could not find such example >>>>> in the manual. Ignas Mikalajūnas >>>>> _______________________________________________ elephant-devel >>>>> site list elephant-devel@common-lisp.net http://common-lisp.net/ >>>>> mailman/listinfo/elephant-devel >>>> _______________________________________________ >>>> elephant-devel site list >>>> elephant-devel@common-lisp.net >>>> http://common-lisp.net/mailman/listinfo/elephant-devel >>> >>> _______________________________________________ >>> elephant-devel site list >>> elephant-devel@common-lisp.net >>> http://common-lisp.net/mailman/listinfo/elephant-devel >>> >> >> _______________________________________________ >> elephant-devel site list >> elephant-devel@common-lisp.net >> http://common-lisp.net/mailman/listinfo/elephant-devel > > _______________________________________________ elephant-devel site list elephant-devel@common-lisp.net http://common-lisp.net/mailman/listinfo/elephant-devel