[jira] [Commented] (OAK-1702) Create a benchmark for Full text search

Chetan Mehrotra (JIRA) Wed, 09 Apr 2014 04:30:45 -0700

    [ https://issues.apache.org/jira/browse/OAK-1702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13964041#comment-13964041 ]


Chetan Mehrotra commented on OAK-1702:
--------------------------------------

The problem appears to be somewhere else: I modified the code to use a
shared IndexSearcher and a native FSDirectory, and the performance
improvement was still marginal.

The problem occurs because
org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndex#query [1]
currently initializes the cursor eagerly, while the test case only
fetches the first result. The JR2 version, by contrast, evaluates
lazily. If I put a break in the loop (exit after the first result), the
results are much better:

{noformat}
Oak-Tar(break.shared searcher,fs)  1       2       2       3       3  170   23204
Oak-Tar(break)                     1       5       5       5       6   90   10593
Jackrabbit                         1       4       4       5       6  231   11385
{noformat}

Now I am not sure whether this is a problem with the use case chosen, or
whether the Lucene index cursor management should be improved, as in
many cases a query has multiple results but the client code only uses
the first few.
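
To illustrate the eager versus lazy distinction discussed above, here is a
minimal, self-contained sketch. It is not the Oak LuceneIndex code: the
class CursorSketch, the CountingSource helper, and the path format are all
hypothetical names invented for this example, and the "index" is simulated
by a counter so the cost of each approach is visible.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical illustration only; none of these names exist in Oak.
final class CursorSketch {

    /** Simulated index that counts how many hits were actually loaded. */
    static final class CountingSource {
        final int totalHits;
        int loaded = 0;

        CountingSource(int totalHits) {
            this.totalHits = totalHits;
        }

        String load(int i) {
            loaded++;
            return "/content/hit-" + i;
        }
    }

    /** Eager: every hit is materialised before the iterator is returned. */
    static Iterator<String> eagerCursor(CountingSource source) {
        List<String> paths = new ArrayList<>();
        for (int i = 0; i < source.totalHits; i++) {
            paths.add(source.load(i));
        }
        return paths.iterator();
    }

    /** Lazy: a hit is loaded only when the caller asks for it. */
    static Iterator<String> lazyCursor(CountingSource source) {
        return new Iterator<String>() {
            private int position = 0;

            @Override
            public boolean hasNext() {
                return position < source.totalHits;
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return source.load(position++);
            }
        };
    }

    public static void main(String[] args) {
        CountingSource eager = new CountingSource(1000);
        eagerCursor(eager).next();   // client reads only the first result

        CountingSource lazy = new CountingSource(1000);
        lazyCursor(lazy).next();     // same client behaviour

        // prints "eager loaded 1000, lazy loaded 1"
        System.out.println("eager loaded " + eager.loaded
                + ", lazy loaded " + lazy.loaded);
    }
}
```

With a client that reads only the first result, the eager variant pays for
all hits while the lazy variant pays for one, which is the same effect the
break in the loop approximates in the numbers above.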

Further discussion is on the mail thread [2].

[1] https://github.com/apache/jackrabbit-oak/blob/trunk/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/LuceneIndex.java#L381-L409
[2] http://markmail.org/message/laj3anrm46q4kcnu

> Create a benchmark for Full text search
> ---------------------------------------
>
>                 Key: OAK-1702
>                 URL: https://issues.apache.org/jira/browse/OAK-1702
>             Project: Jackrabbit Oak
>          Issue Type: Task
>          Components: bench
>            Reporter: Chetan Mehrotra
>            Assignee: Chetan Mehrotra
>             Fix For: 1.1
>
>         Attachments: OAK-1702-shared-indexer.patch, OAK-1702.patch
>
>
> To compare the performance of full text search between Jackrabbit 2 and Oak, a 
> benchmark should be added.
> To start with, the benchmark would do the following:
> * Be based on the WikipediaImport benchmark, so it would import the 
> Wikipedia dump and perform full text queries on it
> * Be able to run on both JR2 and Oak. The Maven setup needs to 
> handle different Lucene versions, as JR2 uses 3.6.0 and Oak uses 4.x
> Later we can add a concurrent version



--
This message was sent by Atlassian JIRA
(v6.2#6252)
