Hi Nathan,

That's a good idea for synonymy!
I think that independent offsets would be a good addition to core, too (if
this is not already possible). For example, this would also allow for compound
tokenization (like
https://lucene.apache.org/core/3_6_0/api/all/org/apache/lucene/analysis/compound/DictionaryCompoundWordTokenFilter.html).
So if you have the word "Donaudampfschiff", you could index "schiff" as well
as "Donaudampfschiff", and if you like, you could give "schiff" the complete
offset of "Donaudampfschiff" (as a "Donaudampfschiff" is just a special type
of "schiff"). This wouldn't be feasible with expanded queries, as there is no
limit to the possible types of "schiff".

Best,
Nils

On 04.07.2013 22:41, Nathan Kurz wrote:
> Hi Nils --
>
> I don't think this is directly supported, but it seems like a good addition.
>
> Another approach might be to expand to the synonyms in the query
> rather than in the index. That is, expand a search for
> [examplification] to [example OR examplification], which should
> already highlight correctly.
>
> You'd be trading a less efficient query for a smaller index.
>
> --nate
>
> On Thu, Jul 4, 2013 at 6:07 AM, Nils Diewald <*@b**n.de> wrote:
>> Hello,
>>
>> I'm working with Lucene as well as with Lucy and I'm wondering if there
>> is a possibility to store multiple terms with independent offset
>> information in Lucy, as is possible with Lucene.
>>
>> Example:
>> The string "This is an example" should be indexed with the
>> offset information:
>> * this,0-4
>> * is,5-7
>> * an,8-10
>> * example,11-18
>> * examplification,11-18
>> so that when the user searches for "examplification", the highlighter
>> highlights the synonym "example".
>>
>> I'd be glad of any hints in the right direction. Thank you all for this
>> awesome tool!
>>
>> Best,
>> Nils
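
To make the offset idea from this thread concrete, here is a minimal sketch in plain Python. It does not use the Lucy or Lucene API; every function name (tokenize_with_offsets, add_extra_terms, build_index, highlight) is hypothetical and only illustrates how an extra term can reuse the offsets of the token it was derived from, so that a highlighter marks the original word.

```python
from collections import defaultdict


def tokenize_with_offsets(text):
    """Split on whitespace and record (term, start, end) for each word."""
    tokens = []
    pos = 0
    for word in text.split():
        start = text.index(word, pos)
        end = start + len(word)
        tokens.append((word.lower(), start, end))
        pos = end
    return tokens


def add_extra_terms(tokens, extra):
    """Emit additional terms (synonyms or compound parts) for a token.

    Each extra term reuses the start/end offsets of the source token,
    which is the "independent offsets" idea: the indexed term does not
    have to match the text found at those offsets.
    """
    out = []
    for term, start, end in tokens:
        out.append((term, start, end))
        for alias in extra.get(term, []):
            out.append((alias, start, end))
    return out


def build_index(tokens):
    """Toy inverted index: term -> list of (start, end) offsets."""
    index = defaultdict(list)
    for term, start, end in tokens:
        index[term].append((start, end))
    return index


def highlight(text, index, query):
    """Wrap every span stored for the query term in square brackets."""
    # Work from the rightmost span so earlier offsets stay valid.
    for start, end in sorted(index.get(query.lower(), []), reverse=True):
        text = text[:start] + "[" + text[start:end] + "]" + text[end:]
    return text


text = "This is an example"
extra = {"example": ["examplification"]}
index = build_index(add_extra_terms(tokenize_with_offsets(text), extra))
print(highlight(text, index, "examplification"))  # This is an [example]
```

The compound case works the same way: with extra = {"donaudampfschiff": ["schiff"]}, a search for "schiff" highlights the whole word "Donaudampfschiff", without having to enumerate every possible compound at query time.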
