[ https://issues.apache.org/jira/browse/LUCENE-7391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15389623#comment-15389623 ]
Martijn van Groningen commented on LUCENE-7391:
-----------------------------------------------

+1 to count the number of fields with `numTerms > 0` and filter out fields with `numTerms <= 0`

> MemoryIndexReader.fields() performance regression
> -------------------------------------------------
>
>                 Key: LUCENE-7391
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7391
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Steve Mason
>         Attachments: LUCENE-7391.patch
>
>
> While upgrading our codebase from Lucene 4 to Lucene 6 we found a significant
> performance regression - a 5x slowdown.
> On profiling the code, the method MemoryIndexReader.fields() shows up as one
> of the hottest methods.
> Looking at the method, it just creates a copy of the inner {{fields}} Map
> before passing it to {{MemoryFields}}. It does this so that it can filter out
> fields with {{numTokens <= 0}}.
> The simplest "fix" would be to just remove the copying of the map completely,
> and pass {{fields}} directly to {{MemoryFields}}. It's simple and removes
> any slowdown caused by this method. It does potentially change behaviour,
> but none of the unit tests seem to test that behaviour, so I wonder
> whether it's necessary (I looked at the original ticket LUCENE-7091 that
> introduced this code, and I can't find much in the way of an explanation). I'm
> going to attach a patch to this effect anyway and we can take things from there.
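
For illustration, here is a minimal Java sketch of the two approaches discussed above: copying the map to drop empty fields (as the issue describes), versus counting and skipping empty fields without building a copy (one reading of the suggestion in the comment). It does not use Lucene. The class `FieldFilterSketch`, the `Info` type, and both method names are hypothetical stand-ins, and `numTokens` is borrowed from the issue text (the comment calls it `numTerms`). It should not be read as the actual patch.

```java
import java.util.Map;
import java.util.TreeMap;

public class FieldFilterSketch {

    /** Hypothetical stand-in for the per-field info held by the index. */
    static final class Info {
        final int numTokens;

        Info(int numTokens) {
            this.numTokens = numTokens;
        }
    }

    /** Copying approach: builds a new sorted map on every call, dropping empty fields. */
    static Map<String, Info> copyNonEmpty(Map<String, Info> fields) {
        Map<String, Info> filtered = new TreeMap<>();
        for (Map.Entry<String, Info> e : fields.entrySet()) {
            if (e.getValue().numTokens > 0) {
                filtered.put(e.getKey(), e.getValue());
            }
        }
        return filtered;
    }

    /** Counting approach: one pass over the values, no intermediate map allocated. */
    static int countNonEmpty(Map<String, Info> fields) {
        int count = 0;
        for (Info info : fields.values()) {
            if (info.numTokens > 0) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        Map<String, Info> fields = new TreeMap<>();
        fields.put("body", new Info(3));
        fields.put("empty", new Info(0));
        fields.put("title", new Info(1));

        System.out.println("non-empty (copy):  " + copyNonEmpty(fields).keySet());
        System.out.println("non-empty (count): " + countNonEmpty(fields));
    }
}
```

Both methods agree on which fields are non-empty. The difference is that the counting version allocates nothing per call, which matters if the method is invoked on a hot path as the profile in the issue suggests.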