Hello,

I have some questions and doubts about equality and ordering of nodes/properties in JSR 170 and JSR 283. IIUC, you can configure a node type with orderableChildNodes. I suppose this ordering is stored in the database or file system, depending on what you use to persist your data.
Now, AFAICS, even if you did not set orderableChildNodes, you can still query nodes and order them by some node/property (through the UN_TOKENIZED Lucene field). IIUC, equality in an XPath or SQL query is also evaluated against the Lucene index.

From JSR-283 4.6.2, whose last sentence reads "Support of equality and order comparison of BINARY values is not required", I understand that support for equality and order *is* required for non-binary values. The current JR implementation therefore indexes (UN_TOKENIZED) the string value of *every* property as one single Lucene term (see NodeIndexer addStringValue).

But, IMHO, who wants to order on the text body of a document, or do an exact string comparison on it? Ordering and equality are done on things like author and date, not on document contents. So, IMO, it would be better for the specification to allow configuration that indicates whether ordering or equality is possible for a property. If this is not possible, I think we might need to alter the current Jackrabbit implementation so that "how" equality and ordering are implemented can be configured per property.

The reason for this is that with representative data, for example about 10 properties per document, one of which is "body" (~10 kB), 1/3 of the index consists of the *never* used UN_TOKENIZED *body* property (a single Lucene term that is 99.9999% sure to be unique). This really is a waste.

If the JSR is reluctant regarding configurable equality, we could store in Lucene, for larger values, a term that is some checksum(). We would then no longer have 100% guaranteed equality, though, which is probably pretty undesirable.

My preference (easy to achieve, because I already implemented it locally) would be to allow equality/ordering to be set to false in the upcoming 1.4 IndexingConfiguration [1]. Then you can just configure, for example, the body property not to be added to the index as UN_TOKENIZED.

WDYT?
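To make the checksum alternative a bit more concrete, here is a minimal sketch in plain Java. It is not Jackrabbit code: the class ChecksumTerm, the method termFor and the 255-character cut-off are all made up for illustration, and SHA-256 is just one possible choice of checksum. Short values are kept as they are, longer ones are replaced by a fixed-size digest.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper, not part of Jackrabbit or Lucene. */
class ChecksumTerm {

    // Assumed cut-off: values longer than this are replaced by a digest.
    static final int MAX_EXACT_LENGTH = 255;

    /** Returns the single term that would be indexed for equality checks. */
    static String termFor(String value) {
        if (value.length() <= MAX_EXACT_LENGTH) {
            // Short values (author, date, ...) stay exact and keep their sort order.
            return value;
        }
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(value.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            // Always 64 hex characters, regardless of the size of the value.
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required to be present on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        StringBuilder body = new StringBuilder();
        for (int i = 0; i < 10 * 1024; i++) {
            body.append('x'); // stand-in for a ~10 kB "body" property
        }
        System.out.println(termFor("Ard"));
        System.out.println(termFor(body.toString()).length());
    }
}
```

A query value would have to go through the same termFor() before it is compared with the index. Two different long values can in principle end up with the same digest, so equality is no longer 100% guaranteed, and the digest terms do not sort in the order of the original values, so this only helps equality, not ordering.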
Regards,
Ard

[1] http://wiki.apache.org/jackrabbit/IndexingConfiguration

-- 
Hippo
Oosteinde 11
1017WT Amsterdam
The Netherlands
Tel +31 (0)20 5224466
[EMAIL PROTECTED] / [EMAIL PROTECTED] / http://www.hippo.nl
