[ 
https://issues.apache.org/jira/browse/LUCENE-6365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14613789#comment-14613789
 ] 

Markus Heiden commented on LUCENE-6365:
---------------------------------------

@Michael: The removal of @lucene.experimental was a mistake of mine during 
merging. Thanks for your rework and your patience.

@Uwe: I measured the CPU runtime in sampling mode, so (almost) no additional 
overhead should occur. I reuse the array because it is not allocated just once, 
but many times. At runtime the array is resized over and over again, because 
its initial size was rather small (4 entries). I changed that to 16 so that 
resizing occurs less frequently. My test case was building a dictionary of 
hundreds of thousands of words, so even small things accumulate.
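
To illustrate the effect described above, here is a minimal, self-contained 
sketch that counts how often a growable int array has to be reallocated. It is 
not the code from the patch: the class ReusableIntBuffer, its simulate() helper 
and the doubling growth policy are assumptions made only for this illustration.

```java
import java.util.Arrays;

/** Hypothetical growable int buffer for illustration; not a Lucene class. */
final class ReusableIntBuffer {
    private int[] ints;
    private int length;
    private int reallocations;

    ReusableIntBuffer(int initialCapacity) {
        // Guard against a zero capacity, which doubling could never grow.
        ints = new int[Math.max(1, initialCapacity)];
    }

    /** Resets the logical length but keeps the backing array for reuse. */
    void clear() {
        length = 0;
    }

    void append(int value) {
        if (length == ints.length) {
            // Assumed growth policy: double the capacity. Every growth step
            // allocates a new array and copies the old content.
            ints = Arrays.copyOf(ints, ints.length * 2);
            reallocations++;
        }
        ints[length++] = value;
    }

    int reallocations() {
        return reallocations;
    }

    /** Counts reallocations while building `words` strings of `wordLength` ints. */
    static int simulate(int initialCapacity, int words, int wordLength, boolean reuse) {
        int total = 0;
        ReusableIntBuffer buffer = new ReusableIntBuffer(initialCapacity);
        for (int w = 0; w < words; w++) {
            if (reuse) {
                buffer.clear();
            } else {
                // Without reuse, every word starts from a fresh small array.
                total += buffer.reallocations();
                buffer = new ReusableIntBuffer(initialCapacity);
            }
            for (int i = 0; i < wordLength; i++) {
                buffer.append('a' + i % 26);
            }
        }
        return total + buffer.reallocations();
    }

    public static void main(String[] args) {
        System.out.println("initial 4, no reuse:  " + simulate(4, 1000, 20, false));
        System.out.println("initial 16, no reuse: " + simulate(16, 1000, 20, false));
        System.out.println("initial 4, reuse:     " + simulate(4, 1000, 20, true));
    }
}
```

With 1000 words of length 20, the sketch needs 3000 reallocations for an 
initial size of 4 without reuse, 1000 for an initial size of 16 without reuse, 
and only 3 when one buffer is reused. The absolute numbers depend on the 
assumed growth policy; only the trend is the point here.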

A better solution to that problem would be for automatons to know the length 
of their longest word. In that case, the above-mentioned array could be sized 
correctly from the start. But I don't know whether that length is always known 
during construction of automatons.
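
If the length is not recorded during construction, it could in principle be 
computed afterwards for an acyclic automaton. The following is only a sketch 
under assumptions: the class LongestWordLength and the plain int[][] adjacency 
representation are hypothetical and not Lucene's Automaton API. It returns the 
longest path from the start state, which is an upper bound for the longest 
accepted word (and equal to it if every state can reach an accept state).

```java
import java.util.Arrays;

/** Hypothetical helper for illustration; not part of Lucene. */
final class LongestWordLength {

    /**
     * transitions[s] lists the target states of state s. Labels are ignored,
     * because only the number of transitions matters for the length.
     * Assumes the graph is acyclic, i.e. the automaton accepts a finite language.
     */
    static int longestPath(int[][] transitions, int start) {
        int[] memo = new int[transitions.length];
        Arrays.fill(memo, -1); // -1 marks "not computed yet"
        return longestFrom(transitions, start, memo);
    }

    private static int longestFrom(int[][] transitions, int state, int[] memo) {
        if (memo[state] >= 0) {
            return memo[state];
        }
        int best = 0;
        for (int target : transitions[state]) {
            best = Math.max(best, 1 + longestFrom(transitions, target, memo));
        }
        memo[state] = best; // memoize so each state is expanded only once
        return best;
    }

    public static void main(String[] args) {
        // 0 -> 1, 0 -> 2, 0 -> 4, 1 -> 3, 2 -> 3, 3 -> 4
        int[][] transitions = {{1, 2, 4}, {3}, {3}, {4}, {}};
        System.out.println(longestPath(transitions, 0));
    }
}
```

The result could then serve as the initial capacity of the array, so that no 
resizing is needed during iteration. The recursion depth equals the longest 
path, so very long words would call for an iterative variant.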

> Optimized iteration of finite strings
> -------------------------------------
>
>                 Key: LUCENE-6365
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6365
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/other
>    Affects Versions: 5.0
>            Reporter: Markus Heiden
>            Priority: Minor
>              Labels: patch, performance
>         Attachments: FiniteStrings_noreuse.patch, FiniteStrings_reuse.patch, 
> LUCENE-6365.patch
>
>
> Replaced Operations.getFiniteStrings() by an optimized FiniteStringIterator.
> Benefits:
> Avoid huge hash set of finite strings.
> Avoid massive object/array creation during processing.
> "Downside":
> Iteration order changed, so when iterating with a limit, the result may 
> differ slightly. Old: emit current node, if accept / recurse. New: recurse / 
> emit current node, if accept.
> The old method Operations.getFiniteStrings() still exists, because it eases 
> the tests. It is now implemented by use of the new FiniteStringIterator.



