[jira] [Issue Comment Deleted] (OAK-2005) Use separate Lucene index for performing property related queries

Amit Jain (JIRA) Wed, 15 Oct 2014 23:23:57 -0700

     [ https://issues.apache.org/jira/browse/OAK-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Amit Jain updated OAK-2005:
---------------------------
    Comment: was deleted

(was: Indexing with the Lucene property index is faster than with the 
ordered index.
The test run was {{AggregateNodeSearcher}}.
|*Indexing (random data)*|*Time in seconds (2.6 M nodes)*|*Time in seconds (1.3 M nodes)*|
|OrderedIndex|3540|1497|
|Lucene|954|456|

On the downside, query time is considerably slower with the Lucene property index:
|*Query*|*Time in seconds (2.6 M nodes)*|*Time in seconds (1.3 M nodes)*|
|OrderedIndex|0.5|NA|
|Lucene|8|6.5|

Will investigate further why querying is slower.)

> Use separate Lucene index for performing property related queries 
> ------------------------------------------------------------------
>
>                 Key: OAK-2005
>                 URL: https://issues.apache.org/jira/browse/OAK-2005
>             Project: Jackrabbit Oak
>          Issue Type: New Feature
>          Components: oak-lucene
>            Reporter: Chetan Mehrotra
>            Assignee: Chetan Mehrotra
>             Fix For: 1.2
>
>         Attachments: OAK-2005-1.patch, OAK-2005-2.patch, 
> OAK-2005-fixes.patch, OAK-2005-sort.patch
>
>
> Oak Lucene has some support for working with multiple Lucene directories. 
> Currently Oak uses a single Lucene directory to store the full text index. It 
> would be worthwhile to investigate if we can use a separate Lucene index to 
> store specific properties only and use it to perform property related queries.
> * A separate Lucene directory would be used to store an explicitly configured 
> list of properties
> * The properties would be stored with their type
> ** JCR Long - long
> ** JCR Double - double
> ** JCR Date - long - The date value can be stored as a long but with lower 
> precision, say up to seconds or even minutes
> * The values would be stored "as is", i.e. without tokenization
> Possible benefits of such an index would be (of course, these need to be validated!)
> * Compact storage - Less memory would be used to store the index
> * Native support for Order By
> * Improved performance for like queries - specifically 'foo%', '%foo'
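
The reduced-precision date storage proposed above could be sketched roughly as follows. This is only an illustration of the idea, not code from any of the attached patches: the class name {{DatePrecision}} and the method {{truncate}} are hypothetical, and it assumes the JCR Date has already been converted to epoch milliseconds.

```java
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical helper (not part of Oak or Lucene): reduces a date,
 * given as epoch milliseconds, to a coarser precision before it is
 * stored as a long in the property index.
 */
final class DatePrecision {

    private DatePrecision() {
    }

    /**
     * Rounds the timestamp down to the start of the containing unit
     * (for example the start of the second or of the minute).
     * floorDiv keeps the rounding direction consistent for dates
     * before 1970, which have negative epoch values.
     */
    static long truncate(long epochMillis, TimeUnit precision) {
        long unitMillis = precision.toMillis(1);
        return Math.floorDiv(epochMillis, unitMillis) * unitMillis;
    }
}
```

Calling {{DatePrecision.truncate(millis, TimeUnit.MINUTES)}} maps every timestamp within the same minute to one long value, so the index would hold fewer distinct terms, at the cost of queries not being able to distinguish dates within that minute.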



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
