I think this is the best policy that we can generate. I personally don't like it, as I've mentioned before. I don't think we should have an arbitrary order. If you don't have a specified order, then you don't really have a btree in meaningful sense. You can't do a range query over something that is arbitrary, and if you don't have a range query, why call it a btree? You should just call it a dictionary, (which happens to be implemented with btrees) equip it with an enumeration, and have something just as efficient and simpler than a btree, which furthermore decreases the chances of someone accidentally depending on the order, since you never provide a way to utilize it (although of course humans, being human, will accidentally depend on the ordering of objects returned by the enumeration, so the problem never goes away from a software engineering point of view.)
However, I don't have time to reorganize things to present a "dictionary" in addition to a "btree" abstraction; and even if someone volunteered to do this work, it would be worthy of a new release, rather than a discussion of the current release. And since Ian has done much more work than I have in the last year, I will defer to his judgment. Is there a reason that string sorting is case insensitive? This seems like asking for trouble --- that's something that people are likely to depend on and be surprised when they switch repositories and find it is different. On Mon, 2007-10-22 at 13:18 -0400, Ian Eslick wrote: > There have been a number of discussions about sorting issues and > guarantees among the various data store implementations. This is > difficult due to different backend serialized representations. This > is my understanding of the current guarantees. Let's commit to this > as part of locking down 0.9.1. > > Any ordered data structure, such as btree or indexed-btrees, has to > make some guarantees about the function that defines the ordering. > Currently, the BDB data store and Elephant core implements the > following policy (I believe this is also true for CL-SQL stores as > well): > > General policies: > > In a BTree with multiple types, each type is sorted as an independent > subsequence. Types are guaranteed to be separated, but the order of > the type sub-sequences is not guaranteed. > > In an Index with multiple values, there is no guarantee about the > ordering of duplicate values, the the index is treated as a sorted > sequence of unordered equivalence classes. Tests and users should > not depend on this. > > Specific types: > > Numeric types (fixnum, bignum, float, double, rational and complex): > Sorted properly over the continuous real numbers, smallest to biggest > > Characters: > Currently considered and sorted in with all the above numeric types > (this is probably wrong) > > Strings: > Sorted alphabetically, case-INSENSITIVE > > Symbols/Pathnames: > Sorted as strings above, but symbols and paths are sorted > independently as type classes. > > Persistent objects are sorted according to OID, but the OID ordering > may change at any time, so long as object equivalency is maintained > (for example, if we implement GC). Users should not depend on the > OID for anything other than uniqueness unless they are implemented a > new low-level persistent-collection data structure like (N:1 and N:N) > associations that is GC aware. Otherwise the order can only be > guaranteed within a given transaction. > > All other types have no user-visible ordering guarantees. They have > an arbitrary ordering within the data store, but there is no lisp > ordering that makes sense for hash tables, standard objects, etc. > (You can create derived indices of an indexed-btree that compute a > number, string, etc that can serve as an ordering for arbitrary > objects). > > Open issues: > - Characters as independent class, or as numeric codes? > - If/how to map over a type subset in an index or btree > > > Anything I missed? > > Ian > > > _______________________________________________ > elephant-devel site list > elephant-devel@common-lisp.net > http://common-lisp.net/mailman/listinfo/elephant-devel _______________________________________________ elephant-devel site list elephant-devel@common-lisp.net http://common-lisp.net/mailman/listinfo/elephant-devel