[ 
https://issues.apache.org/jira/browse/OAK-1702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Mehrotra updated OAK-1702:
---------------------------------

    Attachment: OAK-1702-shared-indexer-2.patch

Updated [patch|^OAK-1702-shared-indexer-2.patch], which relies on NodeState. 
With this patch, the following performance numbers are observed:

* The test now fetches the first 100 rows
* Multi-threaded runs are executed

Legend
* Shared searcher - Benchmark run with just the attached patch applied. It 
includes support for a shared searcher and batch querying. Results are loaded in 
batches of 100
* disable compression - Compression disabled via a custom OakCodec
* mlt off - The text content is not stored as part of the index
* local dir - The Lucene index is first copied to the local file system and 
then an FSDirectory is opened on it. This feature is optional and can be enabled 
via configuration
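
The batch loading mentioned under "Shared searcher" can be illustrated with a minimal sketch. It is not the code from the patch: {{PageLoader}} and {{BatchedCursor}} are hypothetical names, and the loader stands in for whatever call fetches the next slice of hits from the index. The point is only that a caller reading the first 100 rows triggers a single load.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical page source: returns up to 'limit' hits starting at 'offset'. */
interface PageLoader<T> {
    List<T> load(int offset, int limit);
}

/** Iterates over all hits while fetching them lazily, one batch at a time. */
class BatchedCursor<T> implements Iterator<T> {
    private final PageLoader<T> loader;
    private final int batchSize;
    private List<T> batch = new ArrayList<>();
    private int posInBatch = 0;
    private int fetched = 0;           // total hits received from the loader so far
    private boolean exhausted = false; // set once a short (or empty) batch is seen
    int loadCalls = 0;                 // exposed only to make the batching observable

    BatchedCursor(PageLoader<T> loader, int batchSize) {
        this.loader = loader;
        this.batchSize = batchSize;
    }

    @Override
    public boolean hasNext() {
        if (posInBatch < batch.size()) {
            return true;               // still serving the current batch
        }
        if (exhausted) {
            return false;
        }
        batch = loader.load(fetched, batchSize);
        loadCalls++;
        fetched += batch.size();
        posInBatch = 0;
        if (batch.size() < batchSize) {
            exhausted = true;          // a short batch means there is nothing more
        }
        return !batch.isEmpty();
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return batch.get(posInBatch++);
    }
}
```

With a batch size of 100, reading 250 hits costs three loads, while a query that stops after the first 100 rows costs one.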

I still need to add a test case for the logic in SearcherManager. However, I would 
like a review of the approach taken.

Key observations
* Default Oak-Tar works fine with OakDirectory used for querying
* When FileDataStore (FDS) is used, a native FSDirectory performs 
better
* If MLT is disabled then we need not disable compression. I think the 
key issue was that we were storing all the content, which slows down reading 
the path value, as noted by Alex in a previous comment
* SearcherManager - The current approach relies on a time delay between subsequent 
calls to check for changes in the directory. If a change is detected (via 
comparison of Lucene segment versions) then a new searcher is opened
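
The refresh logic described for SearcherManager can be sketched roughly as follows. This is an illustration under assumptions, not the class from the patch: {{RefreshingHolder}} is a hypothetical name, the version reader stands in for reading the Lucene segment version, and the opener stands in for opening a new searcher. The clock is injected so the time delay can be exercised in a test.

```java
import java.util.function.LongFunction;
import java.util.function.LongSupplier;

/**
 * Hypothetical holder that reopens a shared resource only when the
 * underlying version has changed, and checks at most once per interval.
 */
class RefreshingHolder<T> {
    private final LongSupplier versionReader; // stand-in for reading the segment version
    private final LongFunction<T> opener;     // stand-in for opening a searcher
    private final LongSupplier clock;         // current time in milliseconds
    private final long checkIntervalMillis;

    private T current;
    private long currentVersion;
    private long lastCheck;

    RefreshingHolder(LongSupplier versionReader, LongFunction<T> opener,
                     LongSupplier clock, long checkIntervalMillis) {
        this.versionReader = versionReader;
        this.opener = opener;
        this.clock = clock;
        this.checkIntervalMillis = checkIntervalMillis;
        this.currentVersion = versionReader.getAsLong();
        this.current = opener.apply(currentVersion);
        this.lastCheck = clock.getAsLong();
    }

    /** Returns the shared instance, reopening it if a newer version is detected. */
    synchronized T acquire() {
        long now = clock.getAsLong();
        if (now - lastCheck >= checkIntervalMillis) {
            lastCheck = now;
            long latest = versionReader.getAsLong();
            if (latest != currentVersion) {
                currentVersion = latest;
                current = opener.apply(latest);
            }
        }
        return current;
    }
}
```

The sketch leaves out closing the previous searcher. A real implementation would have to release it only after in-flight queries have finished, for example through reference counting.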

{noformat}
//With read limit set to 100
# FullTextSearchTest               C     min     10%     50%     90%     max       N
Jackrabbit                         1       4       4       5       6      71   12224
Jackrabbit                         5       0       0       1       1     161  331510
Jackrabbit                        10       0       0       1      10     254  174780

//shared searcher
# FullTextSearchTest               C     min     10%     50%     90%     max       N
Oak-Tar                            1       6       6       7       7      42    8728
Oak-Tar                            5       1       1       2       7      68   90120
Total read 1593592

//shared searcher/disable compression
# FullTextSearchTest               C     min     10%     50%     90%     max       N
Oak-Tar                            1       3       4       5       6      48   12412
Oak-Tar                            5       0       1       2       5     172   99472

//shared searcher/mlt off
# FullTextSearchTest               C     min     10%     50%     90%     max       N
Oak-Tar                            1       3       3       4       4      15   16616
Oak-Tar                            5       1       1       2       5      55  106068
Total read 3498539

//shared searcher/mlt off/disable compression
# FullTextSearchTest               C     min     10%     50%     90%     max       N
Oak-Tar                            1       2       3       3       5      22   16287
Oak-Tar                            5       0       1       2       5      58  109836
Total read 3827996

//shared searcher/mlt off/disable compression/local dir
# FullTextSearchTest               C     min     10%     50%     90%     max       N
Oak-Tar                            1       1       2       2       3      61   27018
Oak-Tar                            5       0       0       1       1      82  304053
Total read 7142948
Oak-Tar-FDS                        1       1       2       2       4      90   24198
Oak-Tar-FDS                        5       0       0       1       2     133  229287
Total read 13134162

//shared searcher/local dir
# FullTextSearchTest               C     min     10%     50%     90%     max       N
Oak-Tar                            1       5       6       6      11      76    7656
Oak-Tar                            5       0       0       1       2     231  226866
Total read 1822340
Oak-Tar-FDS                        1       5       5       6       6      51   10163
Oak-Tar-FDS                        5       0       0       1       2     128  228108
Total read 3948533
{noformat}
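
For readers unfamiliar with the result layout, here is a minimal sketch of how one such row could be derived from raw per-query timings. It assumes that C is the number of concurrent threads, that the timing columns are milliseconds, and that N is the number of completed iterations. The {{LatencySummary}} class and its nearest-rank percentile are illustrative only, and the benchmark harness may compute its percentiles differently.

```java
import java.util.Arrays;

class LatencySummary {
    /** Nearest-rank percentile of an ascending-sorted array, with p in 1..100. */
    static long percentile(long[] sorted, int p) {
        // Integer ceiling of p * n / 100 avoids floating-point rounding issues.
        int rank = (int) ((p * (long) sorted.length + 99) / 100);
        return sorted[Math.max(rank, 1) - 1];
    }

    /** Formats min, 10%, 50%, 90%, max and N as right-aligned columns. */
    static String summarize(long[] latenciesMillis) {
        long[] s = latenciesMillis.clone();
        Arrays.sort(s);
        return String.format("%8d%8d%8d%8d%8d%8d",
                s[0], percentile(s, 10), percentile(s, 50), percentile(s, 90),
                s[s.length - 1], s.length);
    }
}
```

Read this way, a lower 50% or 90% value means faster queries, and a higher N means more queries completed within the benchmark run.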

[~tmueller], [~alexparvulescu], [~jukkaz], [~teofili] Could you please review the patch?

> Create a benchmark for Full text search
> ---------------------------------------
>
>                 Key: OAK-1702
>                 URL: https://issues.apache.org/jira/browse/OAK-1702
>             Project: Jackrabbit Oak
>          Issue Type: Task
>          Components: bench
>            Reporter: Chetan Mehrotra
>            Assignee: Chetan Mehrotra
>             Fix For: 1.1
>
>         Attachments: OAK-1702-hack.patch, OAK-1702-lazy-cursor.patch, 
> OAK-1702-shared-indexer-2.patch, OAK-1702-shared-indexer.patch, 
> OAK-1702.oakcodec.patch, OAK-1702.patch
>
>
> To compare the performance of full text search between Jackrabbit 2 and Oak, a 
> benchmark should be added.
> To start with, the benchmark would do the following:
> * Would be based on the WikipediaImport benchmark, so it would import the 
> Wikipedia dump and perform full text queries on it
> * Should be able to run on both JR2 and Oak. Need to account for the Maven setup 
> to handle different Lucene versions, as JR2 uses 3.6.0 and Oak uses 4.x
> Later we can add a concurrent version



--
This message was sent by Atlassian JIRA
(v6.2#6252)
