Minimize use of fields in Lucene index
--------------------------------------

         Key: JCR-106
         URL: http://issues.apache.org/jira/browse/JCR-106
     Project: Jackrabbit
        Type: Improvement
  Components: query  
 Environment: svn revision: 161184
    Reporter: Marcel Reutegger
    Priority: Minor


Currently every property name creates a field in the Lucene index, which bloats 
the index because Lucene creates a norm file for each field.

When values are indexed as is (not tokenized for fulltext indexing), the 
property name can be made part of the term text. That way Lucene only has to 
maintain one field for all property names. With this approach the search terms 
are always a combination of property name and literal value. E.g. instead of 
using TermQuery(new Term("prop", "foo")) the query must be TermQuery(new 
Term("common-field", "prop:foo")). This works for the general comparison / 
value comparison operators and also for the like function. The contains 
function uses the fulltext index, which uses a different field anyway.
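
A minimal sketch of the combined term text, in plain Java without the Lucene 
dependency. The class name, the method names and the ':' separator are 
assumptions made for illustration (the separator is taken from the "prop:foo" 
example above); they are not existing Jackrabbit or Lucene API. The sorted set 
stands in for the term dictionary of the common field, to show why a 
like 'fo%' style lookup still works as a prefix scan.

```java
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical helper, not part of Jackrabbit or Lucene.
final class NamedValueTerms {
    // Field name from the example above; one field for all properties.
    static final String COMMON_FIELD = "common-field";
    // Assumed separator. A property name that itself contains ':'
    // (e.g. a prefixed name) would need a character that cannot occur in names.
    static final char SEPARATOR = ':';

    // Builds the term text that would be indexed and searched in COMMON_FIELD.
    static String termText(String propertyName, String value) {
        return propertyName + SEPARATOR + value;
    }

    // Returns all terms of one property whose value starts with valuePrefix.
    // Terms of the same property are adjacent in sort order, so this is a range scan.
    static SortedSet<String> termsWithPrefix(SortedSet<String> terms,
                                             String propertyName,
                                             String valuePrefix) {
        String from = termText(propertyName, valuePrefix);
        String to = from + Character.MAX_VALUE; // exclusive upper bound
        return terms.subSet(from, to);
    }
}
```

With Lucene, the string returned by termText("prop", "foo") would be used as 
the text of the Term on the common field, as in the query shown above.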

Using the property name as part of the indexed term text requires a custom 
SortComparator which is aware of the property name.
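
The property-name-aware part of such a comparator could look roughly like the 
following sketch. It is plain Java and does not extend Lucene's 
SortComparator; the class name, the method names and the ':' separator are 
assumptions for illustration only. The idea is to strip the property name from 
the term text before comparing, and to treat terms of other properties as 
having no value for the sort property.

```java
import java.util.Comparator;

// Hypothetical sketch, not part of Jackrabbit or Lucene.
final class PropertyValueExtractor {
    private final String prefix;

    PropertyValueExtractor(String propertyName) {
        // Assumed term layout: <property name> ':' <value>
        this.prefix = propertyName + ':';
    }

    // Returns the value part of the term text, or null if the term
    // belongs to a different property.
    String valuePart(String termText) {
        return termText.startsWith(prefix)
                ? termText.substring(prefix.length())
                : null;
    }

    // Orders term texts by the value of the sort property;
    // terms of other properties sort last.
    Comparator<String> comparator() {
        return Comparator.comparing(
                this::valuePart,
                Comparator.nullsLast(Comparator.<String>naturalOrder()));
    }
}
```

In a Lucene-based implementation the logic of valuePart would sit wherever the 
comparator turns a term text into a comparable value.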

This change will not be backward compatible with earlier indexes created by 
Jackrabbit.
