[jira] Commented: (LUCENE-2529) always apply position increment gap between values

Michael McCandless (JIRA) Sat, 10 Jul 2010 07:58:47 -0700

    [ 
https://issues.apache.org/jira/browse/LUCENE-2529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12887041#action_12887041
 ]


Michael McCandless commented on LUCENE-2529:
--------------------------------------------

Well, if an app has a multi-valued field today where the first N (>= 1) values 
analyze to 0 tokens, then this change will alter the positions of subsequent 
tokens.
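
To make the effect concrete, here is a minimal sketch of the position bookkeeping under both conditions. It is a simplified model, not Lucene's actual code: the class and method names are hypothetical, every token is assumed to have a position increment of 1, and `length` merely stands in for fieldState.length.

```java
import java.util.Arrays;

/** Simplified model of how the position counter advances across the values of a multi-valued field. */
class PositionGapSketch {

    /**
     * Returns the position of the first token of each value, or -1 for a value
     * that produced no tokens. Assumes every token has a position increment of 1.
     */
    static int[] firstPositions(int[] tokensPerValue, int gap, boolean alwaysApplyGap) {
        int[] first = new int[tokensPerValue.length];
        int position = 0; // next position to hand out
        int length = 0;   // tokens seen so far (stand-in for fieldState.length)
        for (int i = 0; i < tokensPerValue.length; i++) {
            // Current rule: gap only if tokens were already indexed.
            // Proposed rule: gap before every value except the first.
            boolean applyGap = alwaysApplyGap ? i > 0 : length > 0;
            if (applyGap) {
                position += gap;
            }
            first[i] = tokensPerValue[i] > 0 ? position : -1;
            position += tokensPerValue[i];
            length += tokensPerValue[i];
        }
        return first;
    }

    public static void main(String[] args) {
        int[] tokensPerValue = {0, 0, 2}; // the first two values analyze to 0 tokens
        System.out.println(Arrays.toString(firstPositions(tokensPerValue, 100, false)));
        System.out.println(Arrays.toString(firstPositions(tokensPerValue, 100, true)));
    }
}
```

In this model, with a gap of 100, the first token of the third value sits at position 0 under the current condition and at position 200 under the proposed one. When the first value already has tokens, both conditions give the same positions.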

Still, I agree, it seems unlikely that an app is relying on this... so I think 
we can just break it (and advertise that we did so, in CHANGES under back 
compat breaks).

Note that offsets also have logic that avoids adding the offset gap if there 
were no tokens, but it's slightly different: it skips the gap if the current 
value in the multi-valued field had no tokens, whereas the position gap is 
skipped only if we've seen net 0 tokens so far.  Seems like we should also fix 
offset to always add the gap?
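
The difference between the two rules can be sketched by counting how many gaps have accumulated before each value. This is an illustration of the rules as described in this comment, not a copy of Lucene's indexing code; the class and method names are hypothetical.

```java
/** Counts accumulated gaps before each value under the two rules described above. */
class GapRulesSketch {

    /** Position rule: a gap is added before a value only if net tokens seen so far > 0. */
    static int[] positionGapsBefore(int[] tokensPerValue) {
        int[] gaps = new int[tokensPerValue.length];
        int seen = 0;  // tokens seen in all earlier values
        int count = 0; // gaps applied so far
        for (int i = 0; i < tokensPerValue.length; i++) {
            if (seen > 0) {
                count++;
            }
            gaps[i] = count;
            seen += tokensPerValue[i];
        }
        return gaps;
    }

    /** Offset rule: a gap is added after a value only if that value itself had tokens. */
    static int[] offsetGapsBefore(int[] tokensPerValue) {
        int[] gaps = new int[tokensPerValue.length];
        int count = 0; // gaps applied so far
        for (int i = 0; i < tokensPerValue.length; i++) {
            gaps[i] = count;
            if (tokensPerValue[i] > 0) {
                count++;
            }
        }
        return gaps;
    }
}
```

For token counts {2, 0, 3}, the position rule gives gap counts [0, 1, 2] while the offset rule gives [0, 1, 1]: the empty middle value suppresses an offset gap but not a position gap. Always adding the gap between values would yield [0, 1, 2] for both, regardless of which values are empty.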

Wanna cons up a patch...?

> always apply position increment gap between values
> --------------------------------------------------
>
>                 Key: LUCENE-2529
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2529
>             Project: Lucene - Java
>          Issue Type: Improvement
>         Environment: (I don't know which version to say this affects since 
> it's some quasi trunk release and the new versioning scheme confuses me.)
>            Reporter: David Smiley
>             Fix For: 3.1, 4.0
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> I'm doing some fancy stuff with span queries that is very sensitive to term 
> positions.  I discovered that the position increment gap on indexing is only 
> applied between values when there are existing terms indexed for the 
> document.  I suspect this logic wasn't deliberate; it's just how it's always 
> been, for no particular reason.  I think it should always apply the gap 
> between values.  Reference DocInverterPerField.java line 82:
>
>     if (fieldState.length > 0)
>       fieldState.position += docState.analyzer.getPositionIncrementGap(fieldInfo.name);
>
> This is checking fieldState.length.  I think the condition should simply be:  
> if (i > 0).
> I don't think this change will affect anyone at all, but it will certainly 
> help me.  Presently, I can either change this line in Lucene or put in a 
> hack so that the first value for the document is some dummy value, which is 
> wasteful.
