[ 
https://issues.apache.org/jira/browse/LUCENE-6365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14534633#comment-14534633
 ] 

Michael McCandless commented on LUCENE-6365:
--------------------------------------------

Thanks [~markus_heiden], new patch looks great.

Can we remove the limit from FiniteStringsIterator.init?  It seems like this
("abort iteration after N items") should be the caller's job?

Can we just pass the automaton to FSI's ctor?  I don't think we need a reuse 
API here...
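
To make the two suggestions above concrete, here is a rough, self-contained sketch. It is not the patch and not Lucene code: StringsIteratorSketch and CallerSideLimit are hypothetical names, and a plain list stands in for the automaton. The source is bound once in the constructor (no init, no reuse), and the caller, not the iterator, stops after N items.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for an automaton-backed iterator. It only walks a
// fixed list of strings; it is NOT Lucene's FiniteStringsIterator.
final class StringsIteratorSketch implements Iterator<String> {
    private final List<String> accepted;
    private int next = 0;

    // The source is passed once to the constructor: no init(), no limit.
    StringsIteratorSketch(List<String> accepted) {
        this.accepted = accepted;
    }

    @Override
    public boolean hasNext() {
        return next < accepted.size();
    }

    @Override
    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return accepted.get(next++);
    }
}

final class CallerSideLimit {
    // The caller decides when to stop; the iterator knows nothing about it.
    static List<String> take(Iterator<String> it, int limit) {
        List<String> out = new ArrayList<>();
        while (out.size() < limit && it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }
}
```

With this shape, a caller that wants at most N strings calls something like take(new StringsIteratorSketch(strings), N), and callers that want everything simply iterate to the end.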

> I am not sure if the implementation change of CompletionTokenStream is OK,
> because I set the position attribute at the end of the iteration instead of at
> the start of the iteration. The tests run fine, but someone should review that.

It is weird that CompletionTokenStream hijacks PositionIncrementAttribute like
that, and I can't find anything that reads it (indeed, tests still pass if I
comment it out).  Maybe [~areek] knows?  I think we should just remove it.

> Optimized iteration of finite strings
> -------------------------------------
>
>                 Key: LUCENE-6365
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6365
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/other
>    Affects Versions: 5.0
>            Reporter: Markus Heiden
>            Priority: Minor
>              Labels: patch, performance
>         Attachments: FiniteStringsIterator.patch, FiniteStringsIterator2.patch
>
>
> Replaced Operations.getFiniteStrings() by an optimized FiniteStringsIterator.
> Benefits:
> Avoid huge hash set of finite strings.
> Avoid massive object/array creation during processing.
> "Downside":
> Iteration order changed, so when iterating with a limit, the result may 
> differ slightly. Old: emit current node, if accept / recurse. New: recurse / 
> emit current node, if accept.
> The old method Operations.getFiniteStrings() still exists, because it
> simplifies the tests. It is now implemented using the new FiniteStringsIterator.
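
As an illustration of the ordering change described in the issue, here is a small, self-contained sketch. It assumes a toy trie in place of a real automaton; TrieNode and OrderSketch are hypothetical names, not Lucene classes, and the limit handling is only there to show why results can differ when iteration is cut short.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical toy automaton node (a trie), not Lucene's Automaton.
final class TrieNode {
    final boolean accept;
    final Map<Character, TrieNode> next = new TreeMap<>();

    TrieNode(boolean accept) {
        this.accept = accept;
    }
}

final class OrderSketch {
    // Old order: emit the current node if it accepts, then recurse.
    static void emitThenRecurse(TrieNode n, String prefix, int limit, List<String> out) {
        if (out.size() >= limit) {
            return;
        }
        if (n.accept) {
            out.add(prefix);
        }
        for (Map.Entry<Character, TrieNode> e : n.next.entrySet()) {
            emitThenRecurse(e.getValue(), prefix + e.getKey(), limit, out);
        }
    }

    // New order: recurse first, then emit the current node if it accepts.
    static void recurseThenEmit(TrieNode n, String prefix, int limit, List<String> out) {
        for (Map.Entry<Character, TrieNode> e : n.next.entrySet()) {
            recurseThenEmit(e.getValue(), prefix + e.getKey(), limit, out);
        }
        if (n.accept && out.size() < limit) {
            out.add(prefix);
        }
    }

    // Builds a trie that accepts exactly "a" and "ab".
    static TrieNode example() {
        TrieNode root = new TrieNode(false);
        TrieNode a = new TrieNode(true);
        TrieNode ab = new TrieNode(true);
        root.next.put('a', a);
        a.next.put('b', ab);
        return root;
    }
}
```

On this toy input, a limit of 1 yields "a" with the old order and "ab" with the new one; without a limit both orders produce the same set of strings, only in a different sequence.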



