[ 
https://issues.apache.org/jira/browse/LUCENE-8800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley closed LUCENE-8800.
--------------------------------

> FieldsReader#terms poor performance on a index with many field names sharing 
> common prefix
> ------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-8800
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8800
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/codecs
>    Affects Versions: 8.0
>            Reporter: Huy Le
>            Priority: Major
>         Attachments: Screen Shot 2019-05-15 at 5.08.26 pm.png
>
>
> We have experienced poor performance on an index with many fields whose 
> names share a common prefix. Stack sampling with JProfiler showed a hotspot 
> in the method FieldsReader#terms.
> !Screen Shot 2019-05-15 at 5.08.26 pm.png!
> Looking at the source code, I saw that a TreeMap is used to map field 
> names to FieldsProducer instances, which means a lookup incurs O(log N) 
> string comparisons.
> {code:java}
> private static class FieldsReader extends FieldsProducer {
>     ...    
>     private final Map<String,FieldsProducer> fields = new TreeMap<>();
>     ...
>     @Override
>     public Terms terms(String field) throws IOException {
>       FieldsProducer fieldsProducer = fields.get(field);
>       return fieldsProducer == null ? null : fieldsProducer.terms(field);
>     }
> {code}
> The problem becomes much worse when field names are long and share a common 
> prefix, because each comparison has to scan past the shared prefix before it 
> reaches a differing character.
> In our case, the index has around 6000 fields of the form customfield_*. I 
> wonder if we can change the TreeMap to a HashMap, or to a LinkedHashMap 
> populated in sorted order if we want to preserve that order, to improve the 
> situation.
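
For illustration only, here is a minimal Java sketch of the cost described above. It is not Lucene code: the class FieldLookupSketch, its methods, and the zero-padded customfield_NNNNN names are hypothetical stand-ins for the roughly 6000 fields in the report. It counts how many string comparisons a single TreeMap lookup performs, and shows that a LinkedHashMap copied from a sorted map keeps that order, since a LinkedHashMap iterates in insertion order.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicLong;

public class FieldLookupSketch {

    /** Builds hypothetical field names such as customfield_00042. */
    static List<String> fieldNames(int count) {
        List<String> names = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            names.add(String.format(Locale.ROOT, "customfield_%05d", i));
        }
        return names;
    }

    /** Counts the String comparisons performed by one TreeMap.get call. */
    static long comparisonsForTreeLookup(List<String> names, String key) {
        AtomicLong comparisons = new AtomicLong();
        Comparator<String> counting = (a, b) -> {
            comparisons.incrementAndGet();
            // Each call scans past the shared "customfield_" prefix first.
            return a.compareTo(b);
        };
        Map<String, Integer> tree = new TreeMap<>(counting);
        for (int i = 0; i < names.size(); i++) {
            tree.put(names.get(i), i);
        }
        comparisons.set(0); // ignore comparisons made while building the map
        tree.get(key);
        return comparisons.get();
    }

    /** Copies a sorted map into a LinkedHashMap, which keeps that insertion order. */
    static <V> Map<String, V> hashedButOrdered(TreeMap<String, V> sorted) {
        return new LinkedHashMap<>(sorted);
    }

    public static void main(String[] args) {
        List<String> names = fieldNames(6000);
        long count = comparisonsForTreeLookup(names, "customfield_04711");
        System.out.println("String comparisons for one TreeMap lookup: " + count);

        TreeMap<String, Integer> sorted = new TreeMap<>();
        for (int i = 0; i < names.size(); i++) {
            sorted.put(names.get(i), i);
        }
        Map<String, Integer> ordered = hashedButOrdered(sorted);
        System.out.println("First key after copy: " + ordered.keySet().iterator().next());
    }
}
```

A hash-based lookup instead hashes the key (String caches its hash code after the first computation) and typically needs only one equals call on a match, so it does not pay the shared-prefix scan once per tree level. Whether that trade-off suits FieldsReader is for the issue discussion to decide.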



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
