Ard Schrijvers wrote:
So, IMO, it would be better for the specification to allow configuration
that indicates whether ordering or equality is possible for a property. If this
is not possible, I think we might need to alter the current Jackrabbit
implementation to enable configuration of "how" equality and ordering are
implemented for properties. The reason for this is that if I have representative
data, with for example about 10 properties per document, of which one is
"body" (~10 kb), 1/3 of the index consists of the *never* used UN_TOKENIZED (=
a single Lucene term that is 99.9999% sure to be unique) *body* property. This
really is a waste. If the JSR is reluctant regarding configurable equality, we
could store in Lucene, for larger values, a term that is some checksum(), though
we would then have no 100% guaranteed equality, which is probably pretty
undesirable.

A checksum would work (with the restrictions you mentioned) for equality, but not for the other operators.
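
To illustrate the point, here is a minimal sketch in plain Java. It is not Jackrabbit or Lucene code: the class ChecksumTerm and its methods term() and matches() are hypothetical names, and CRC32 is just one possible checksum. It shows that a checksum term can only nominate candidates for equality, which then have to be confirmed against the stored value, and that comparing checksum terms says nothing about how the original values compare.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

public class ChecksumTerm {

    // Hypothetical helper: a fixed-width hex CRC32 of the value's UTF-8 bytes,
    // standing in for the short term that would be indexed instead of the
    // full UN_TOKENIZED value.
    public static String term(String value) {
        CRC32 crc = new CRC32();
        crc.update(value.getBytes(StandardCharsets.UTF_8));
        return String.format("%08x", crc.getValue());
    }

    // Equal checksum terms only make a candidate, because different values
    // can collide. The stored value must still be compared to be sure.
    public static boolean matches(String storedValue, String queryValue) {
        return term(storedValue).equals(term(queryValue))
                && storedValue.equals(queryValue);
    }

    public static void main(String[] args) {
        String a = "apple";
        String b = "banana";

        System.out.println("terms: " + term(a) + " " + term(b));

        // The sign of the value comparison and the sign of the term
        // comparison are unrelated, so range operators (<, >, <=, >=)
        // and ORDER BY cannot be answered from checksum terms.
        System.out.println("value order:    " + Integer.signum(a.compareTo(b)));
        System.out.println("checksum order: " + Integer.signum(term(a).compareTo(term(b))));
    }
}
```

The second comparison in matches() is what restores exact equality; without it, a collision would produce a false hit.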

My preference (easy to achieve because I have already implemented it
locally) is to allow equality/ordering to be set to false in the upcoming 1.4
IndexingConfiguration [1]. Then you can just configure, for example, the body
property not to be added to the index as UN_TOKENIZED.

WDOT?

I think that's a choice a Jackrabbit user can make. By excluding some properties in the indexing configuration, the repository becomes slightly less compliant, but if you never query for those properties you will not notice it.

regards
 marcel

[1] http://wiki.apache.org/jackrabbit/IndexingConfiguration
