[ 
https://issues.apache.org/jira/browse/LUCENE-7639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yakov Sirotkin updated LUCENE-7639:
-----------------------------------
    Description: 
If a query term starts with an asterisk, the FST checks every word in the 
dictionary, so request processing slows down. This problem can be solved with 
a suffix array. Luckily, the suffix array can be built from the existing index 
after Lucene starts. Unfortunately, suffix arrays require a lot of RAM, so 
they are used only when a special flag is set:

-Dsolr.suffixArray.enable=true

Suffix array initialization can be sped up by using several threads; the 
number of threads is controlled with 

-Dsolr.suffixArray.initialization_treads_count=5

This system property can be omitted; the default value is 5.

The attached patch is the suggested implementation of SuffixArray support. It 
works for all terms that start with asterisks and contain at least 3 
consecutive non-wildcard characters. This patch does not change search results 
and affects only performance.
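
To illustrate the general idea (this is not the code from the attached patch), 
here is a minimal sketch of a suffix array over a small term dictionary. The 
class name TermSuffixArray and the method termsContaining are hypothetical. 
The sketch sorts suffixes naively by comparing substrings, which is fine for a 
demonstration but not for a large index, and it does not enforce the 
3-character minimum mentioned above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Illustrative suffix array over a term dictionary (hypothetical, not the patch's code). */
public class TermSuffixArray {
    private final String[] terms;
    // Each entry packs (term index, suffix offset) into one long.
    private final long[] suffixes;

    public TermSuffixArray(String[] terms) {
        this.terms = terms.clone();
        List<Long> all = new ArrayList<>();
        for (int t = 0; t < this.terms.length; t++) {
            for (int off = 0; off < this.terms[t].length(); off++) {
                all.add(((long) t << 32) | off);
            }
        }
        // Naive sort: compares the suffix strings directly.
        all.sort((a, b) -> suffix(a).compareTo(suffix(b)));
        suffixes = all.stream().mapToLong(Long::longValue).toArray();
    }

    private String suffix(long packed) {
        return terms[(int) (packed >>> 32)].substring((int) packed);
    }

    /** Returns every term that contains {@code infix}, e.g. for a query like *infix*. */
    public TreeSet<String> termsContaining(String infix) {
        // Binary search for the first suffix that is >= infix.
        int lo = 0, hi = suffixes.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (suffix(suffixes[mid]).compareTo(infix) < 0) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        // Suffixes that start with infix form one contiguous run beginning at lo.
        TreeSet<String> result = new TreeSet<>();
        for (int i = lo; i < suffixes.length && suffix(suffixes[i]).startsWith(infix); i++) {
            result.add(terms[(int) (suffixes[i] >>> 32)]);
        }
        return result;
    }

    public static void main(String[] args) {
        TermSuffixArray sa = new TermSuffixArray(
            new String[] {"search", "research", "arch", "lucene", "marching"});
        System.out.println(sa.termsContaining("arch")); // [arch, marching, research, search]
    }
}
```

The lookup costs one binary search plus a scan over the matching run, instead 
of a pass over the whole dictionary. The price is one array entry per 
character of every term, which is where the RAM cost comes from.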

*Update*
suffix-array-2.patch is an improved version of the patch. Its system 
properties are as follows:

{{lucene.suffixArray.enable}} - set to {{true}} to enable Suffix Array 
support. Default value - {{false}}.
{{lucene.suffixArray.initializationThreadsCount}} - number of threads for 
Suffix Array initialization; if set to {{0}}, no additional threads are used. 
Default value - {{5}}.
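
As a rough sketch of how these two properties could be consumed (the class 
SuffixArrayConfig and its methods are hypothetical and are not taken from the 
patch), the following reads them with the documented defaults and runs 
initialization tasks either on the calling thread or on a fixed-size pool:

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper; only the property names and defaults come from the issue text. */
public final class SuffixArrayConfig {
    private SuffixArrayConfig() {}

    /** True only if -Dlucene.suffixArray.enable=true is set; the default is false. */
    public static boolean isEnabled() {
        return Boolean.parseBoolean(System.getProperty("lucene.suffixArray.enable", "false"));
    }

    /** Thread count from -Dlucene.suffixArray.initializationThreadsCount; the default is 5. */
    public static int initializationThreads() {
        return Integer.getInteger("lucene.suffixArray.initializationThreadsCount", 5);
    }

    /** Runs all tasks and returns once they have finished. */
    public static void runInitialization(List<Runnable> tasks) {
        int threads = initializationThreads();
        if (threads <= 0) {
            // 0 means no additional threads: run everything on the calling thread.
            tasks.forEach(Runnable::run);
            return;
        }
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (Runnable task : tasks) {
            pool.execute(task);
        }
        pool.shutdown();
        try {
            pool.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
        } catch (InterruptedException e) {
            // Preserve the interrupt status for the caller.
            Thread.currentThread().interrupt();
        }
    }
}
```

With this sketch, starting the JVM with 
-Dlucene.suffixArray.enable=true -Dlucene.suffixArray.initializationThreadsCount=0 
would enable the feature and keep initialization single-threaded.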


> Use Suffix Arrays for fast search with leading asterisks
> --------------------------------------------------------
>
>                 Key: LUCENE-7639
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7639
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Yakov Sirotkin
>         Attachments: suffix-array-2.patch, suffix-array.patch
>


