[ 
https://issues.apache.org/jira/browse/LUCENE-6365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14490888#comment-14490888
 ] 

Markus Heiden edited comment on LUCENE-6365 at 4/11/15 10:07 AM:
-----------------------------------------------------------------

Are you talking about this?
{code:java}
for (IntsRef finiteString; (finiteString = iterator.next()) != null;)
{code}
For me this is the standard iteration pattern for non-lookahead iteration, e.g. 
reading from an input stream (see FileCopyUtils in the Spring Framework).

Does this one look better to you?
{code:java}
for (IntsRef finiteString = iterator.next(); finiteString != null;
    finiteString = iterator.next())
{code}
I like my version better because it is shorter and iterator.next() is not 
duplicated, but I will use this one if you prefer it.

A simple while loop looks even more bloated to me. It unnecessarily widens the 
scope of finiteString and splits things that belong together, both of which are 
error-prone:
{code:java}
IntsRef finiteString = iterator.next();
while (finiteString != null) {
   // do something

   finiteString = iterator.next();
}
{code}
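
To make the comparison concrete, here is a minimal self-contained sketch of the two for-loop forms. NullTerminatedIterator is a hypothetical stand-in that returns null when exhausted; it is not the iterator from the patch, and it uses String instead of IntsRef to avoid any Lucene dependency.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class IterationPatterns {
  /** Hypothetical stand-in for the patch's iterator: next() returns null when exhausted. */
  static final class NullTerminatedIterator {
    private final Iterator<String> delegate;

    NullTerminatedIterator(List<String> values) {
      this.delegate = values.iterator();
    }

    String next() {
      return delegate.hasNext() ? delegate.next() : null;
    }
  }

  /** Compact form: the assignment and the null check share the loop condition. */
  static List<String> collectCompact(NullTerminatedIterator iterator) {
    List<String> result = new ArrayList<>();
    for (String s; (s = iterator.next()) != null;) {
      result.add(s);
    }
    return result;
  }

  /** Three-part form: iterator.next() has to be written twice. */
  static List<String> collectClassic(NullTerminatedIterator iterator) {
    List<String> result = new ArrayList<>();
    for (String s = iterator.next(); s != null; s = iterator.next()) {
      result.add(s);
    }
    return result;
  }

  public static void main(String[] args) {
    List<String> input = Arrays.asList("a", "ab", "abc");
    System.out.println(collectCompact(new NullTerminatedIterator(input)));
    System.out.println(collectClassic(new NullTerminatedIterator(input)));
  }
}
```

Both methods visit the same elements in the same order, so the choice between them is purely a matter of readability.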

Something different:
I marked Operations.getFiniteStrings() as deprecated in my patch, because it 
should be replaced by the new iterator. But I am considering removing the 
deprecation, because this method is easier to use for one-off iterations over 
small sets of finite strings and makes some test cases simpler. What do you think?

Again something different:
What about the initial stack size in the new iterator (the stack needs to be at 
least as large as the maximum length of the iterated finite strings)? May I raise it 
from 4 to e.g. 16? In my opinion this would be needed in roughly 90% of all 
cases.
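
To illustrate the trade-off, here is a rough sketch that counts how often a growable stack has to be reallocated for a given depth. GrowableStack and its doubling policy are assumptions made only for this example; the stack in the patch may grow differently.

```java
import java.util.Arrays;

public class GrowableStack {
  private int[] elements;
  private int size;
  private int growCount;

  GrowableStack(int initialCapacity) {
    elements = new int[initialCapacity];
  }

  void push(int value) {
    if (size == elements.length) {
      // Assumed growth policy: double the capacity (at least 1).
      // This is an illustration, not the policy used by the patch.
      elements = Arrays.copyOf(elements, Math.max(1, elements.length * 2));
      growCount++;
    }
    elements[size++] = value;
  }

  int growCount() {
    return growCount;
  }

  /** Pushes `depth` elements and reports how many reallocations were needed. */
  static int growsForDepth(int initialCapacity, int depth) {
    GrowableStack stack = new GrowableStack(initialCapacity);
    for (int i = 0; i < depth; i++) {
      stack.push(i);
    }
    return stack.growCount();
  }

  public static void main(String[] args) {
    System.out.println(growsForDepth(4, 10));  // prints 2
    System.out.println(growsForDepth(16, 10)); // prints 0
  }
}
```

Under these assumptions, a finite string of length 10 costs two reallocations with an initial size of 4 and none with 16, at the price of a slightly larger array allocated up front.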


> Optimized iteration of finite strings
> -------------------------------------
>
>                 Key: LUCENE-6365
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6365
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/other
>    Affects Versions: 5.0
>            Reporter: Markus Heiden
>            Priority: Minor
>              Labels: patch, performance
>         Attachments: FiniteStringsIterator.patch
>
>
> Replaced Operations.getFiniteStrings() by an optimized FiniteStringIterator.
> Benefits:
> Avoid huge hash set of finite strings.
> Avoid massive object/array creation during processing.
> "Downside":
> Iteration order changed, so when iterating with a limit, the result may 
> differ slightly. Old: emit current node, if accept / recurse. New: recurse / 
> emit current node, if accept.
> The old method Operations.getFiniteStrings() still exists, because it eases 
> the tests. It is now implemented by use of the new FiniteStringIterator.


