Lisp, by the way, only allows comparisons between characters in terms
of their code. So sorting them as one of numerous numeric values, or
independently by code, is adding to the lisp spec, not violating it.
Ian
On Oct 22, 2007, at 1:18 PM, Ian Eslick wrote:
There have been a number of discussions about sorting issues and
guarantees among the various data store implementations. This is
difficult due to different backend serialized representations.
This is my understanding of the current guarantees. Let's commit
to this as part of locking down 0.9.1.
Any ordered data structure, such as btree or indexed-btrees, has to
make some guarantees about the function that defines the ordering.
Currently, the BDB data store and Elephant core implements the
following policy (I believe this is also true for CL-SQL stores as
well):
General policies:
In a BTree with multiple types, each type is sorted as an
independent subsequence. Types are guaranteed to be separated, but
the order of the type sub-sequences is not guaranteed.
In an Index with multiple values, there is no guarantee about the
ordering of duplicate values, the the index is treated as a sorted
sequence of unordered equivalence classes. Tests and users should
not depend on this.
Specific types:
Numeric types (fixnum, bignum, float, double, rational and complex):
Sorted properly over the continuous real numbers, smallest to biggest
Characters:
Currently considered and sorted in with all the above numeric types
(this is probably wrong)
Strings:
Sorted alphabetically, case-INSENSITIVE
Symbols/Pathnames:
Sorted as strings above, but symbols and paths are sorted
independently as type classes.
Persistent objects are sorted according to OID, but the OID
ordering may change at any time, so long as object equivalency is
maintained (for example, if we implement GC). Users should not
depend on the OID for anything other than uniqueness unless they
are implemented a new low-level persistent-collection data
structure like (N:1 and N:N) associations that is GC aware.
Otherwise the order can only be guaranteed within a given transaction.
All other types have no user-visible ordering guarantees. They
have an arbitrary ordering within the data store, but there is no
lisp ordering that makes sense for hash tables, standard objects,
etc. (You can create derived indices of an indexed-btree that
compute a number, string, etc that can serve as an ordering for
arbitrary objects).
Open issues:
- Characters as independent class, or as numeric codes?
- If/how to map over a type subset in an index or btree
Anything I missed?
Ian
_______________________________________________
elephant-devel site list
elephant-devel@common-lisp.net
http://common-lisp.net/mailman/listinfo/elephant-devel
_______________________________________________
elephant-devel site list
elephant-devel@common-lisp.net
http://common-lisp.net/mailman/listinfo/elephant-devel