On Mon, Dec 12, 2011 at 5:18 AM, Max <nas...@gmail.com> wrote:

> The end offset remains 11 even after folding and transforming "æ" to
> "ae", which seems wrong to me.

End offsets refer to the *original text* so this is correct.
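
To illustrate why the end offset stays put, here is a minimal sketch in plain Python (not the Lucene API). The input string and the helper names `tokenize` and `fold` are assumptions made up for this example; the original poster's text is not shown in this thread. Folding rewrites only the term, while the offsets keep pointing into the original text, so highlighting still selects the right span.

```python
# Hypothetical input: 11 characters long, last token contains "æ".
text = "foo bar bæz"

def tokenize(s):
    """Split on whitespace, recording (term, start, end) offsets into s."""
    tokens, pos = [], 0
    for word in s.split():
        start = s.index(word, pos)
        end = start + len(word)
        tokens.append((word, start, end))
        pos = end
    return tokens

def fold(token):
    """Rewrite the term only; offsets still refer to the original text."""
    term, start, end = token
    return (term.replace("æ", "ae"), start, end)

folded = [fold(t) for t in tokenize(text)]
last = folded[-1]  # term grew from 3 to 4 characters, offsets unchanged
```

Slicing the original text with the folded token's offsets (`text[last[1]:last[2]]`) still yields the original word, which is the point of keeping the offsets as they were.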

What is wrong is EdgeNGramsFilter. See how it turns that 11 into a 12?
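
One plausible way an 11 can become a 12 is sketched below in plain Python (not the Lucene API). This is an assumption about the cause, not something confirmed in this thread: if a gram filter recomputes offsets from the length of the already-folded term, the longest gram can end past the original text. The function names, the token `("baez", 8, 11)`, and the 11-character text length are all hypothetical.

```python
TEXT_LENGTH = 11  # hypothetical original text of 11 characters

def edge_ngrams_recomputed(term, start, end, min_gram=1, max_gram=4):
    """Derive each gram's end offset from the term length (can overrun)."""
    longest = min(max_gram, len(term))
    return [(term[:n], start, start + n) for n in range(min_gram, longest + 1)]

def edge_ngrams_preserved(term, start, end, min_gram=1, max_gram=4):
    """Give every gram the parent token's original offsets."""
    longest = min(max_gram, len(term))
    return [(term[:n], start, end) for n in range(min_gram, longest + 1)]

# A folded token: "æ" became "ae", so the term is one character longer
# than the span it came from (offsets 8..11 in the original text).
token = ("baez", 8, 11)

bad = edge_ngrams_recomputed(*token)
good = edge_ngrams_preserved(*token)

# Grams whose end offset points past the end of the original text.
overruns = [g for g in bad if g[2] > TEXT_LENGTH]
```

In this sketch the recomputed variant produces one gram ending at 12, while the preserving variant never exceeds 11.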

>
> I also stumbled upon https://issues.apache.org/jira/browse/LUCENE-1500
> which seems like a similar issue.
>
> Is there a workaround for that problem or is the field configuration wrong?

For now, don't use EdgeNGrams.

-- 
lucidimagination.com
