With index-time synonyms, "apfelsin" is added to every document that contains only "orang", so documents that previously contained only "orang" now match "apfelsin" and any term query that matches "apfelsin", including a wildcard such as "apfel*". At query time, Lucene cannot tell whether your original document contained "apfelsin" or whether "apfelsin" was added by an index-time synonym when the document was indexed.
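To make the effect concrete, here is a toy model in Python. It is not Solr or Lucene code, only a sketch of the idea: the synonym table comes from the example in the question, while the function names index_terms and wildcard_matches are made up for illustration.

```python
from fnmatch import fnmatchcase

# Stemmed synonym group from the question (apfelsin, orang).
SYNONYMS = {"apfelsin": {"orang"}, "orang": {"apfelsin"}}

def index_terms(stemmed_tokens, use_synonyms=True):
    """Return the set of terms stored in the toy index for one document."""
    terms = set(stemmed_tokens)
    if use_synonyms:
        # Index-time expansion: every synonym becomes a real indexed term.
        for token in stemmed_tokens:
            terms |= SYNONYMS.get(token, set())
    return terms

def wildcard_matches(pattern, terms):
    """True if any indexed term matches the wildcard pattern."""
    return any(fnmatchcase(term, pattern) for term in terms)

doc = ["orang"]  # a document that only ever mentioned "Orange"
indexed = index_terms(doc)
print(sorted(indexed))                       # ['apfelsin', 'orang']
print(wildcard_matches("apfel*", indexed))   # True
```

Once "apfelsin" is stored as a term, nothing in the index marks it as a synonym, so the wildcard matches it like any other term.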

Solution: Either disable index-time synonyms, or add a parallel field (via copyField) that does not apply the index-time synonyms, and run wildcard queries against that field.
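The parallel-field idea can be sketched with the same kind of toy model in Python. Again, this is not Solr code: the field names text_syn and text_plain, the helper functions, and the stemmed form "apfelsaft" are assumptions made up for illustration. In Solr the second field would be filled by copyField and the query would have to be sent to the appropriate field.

```python
from fnmatch import fnmatchcase

# Stemmed synonym group from the question (apfelsin, orang).
SYNONYMS = {"apfelsin": {"orang"}, "orang": {"apfelsin"}}

def build_doc(stemmed_tokens):
    """Index one document into two hypothetical fields."""
    plain = set(stemmed_tokens)
    expanded = set(plain)
    for token in plain:
        expanded |= SYNONYMS.get(token, set())
    # text_syn has index-time synonyms, text_plain does not.
    return {"text_syn": expanded, "text_plain": plain}

def search(docs, query):
    """Send wildcard queries to text_plain and exact terms to text_syn."""
    is_wildcard = "*" in query or "?" in query
    field = "text_plain" if is_wildcard else "text_syn"
    return [name for name, fields in docs.items()
            if any(fnmatchcase(term, query) for term in fields[field])]

docs = {
    "Apfelsaft": build_doc(["apfelsaft"]),  # assumed stem
    "Apfelsine": build_doc(["apfelsin"]),
    "Orange": build_doc(["orang"]),
}
print(search(docs, "apfel*"))    # ['Apfelsaft', 'Apfelsine']
print(search(docs, "apfelsin"))  # ['Apfelsine', 'Orange']
```

The wildcard query no longer returns "Orange", while an exact query for "apfelsin" still benefits from the synonym.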

But... perhaps you should clarify what you really intend to happen with these pseudo-synonyms.

-- Jack Krupansky

-----Original Message-----
From: Johannes Rodenwald
Sent: Wednesday, February 13, 2013 10:25 AM
To: solr-user@lucene.apache.org
Subject: Index-time synonyms and trailing wildcard issue

Hi,

I use Solr 3.6.0 with a synonym filter as the last filter at index time, using a list of stemmed terms. When I do a wildcard search that matches part of an entry on the synonym list, Solr uses the synonyms it finds to generate the search results. I am trying to disable that behaviour, but without success.

Example:

Stemmed synonyms:
apfelsin, orang

Search term:
apfel*

Matches:
Apfelkuchen, Apfelsaft, Apfelsine... (good, I want these matches)
Orange (bad, I don't want this match)

My questions are:
- Why does the synonym filter react to a wildcard query? After all, it is not a multiterm-aware component (see http://lucene.apache.org/solr/api-3_6_1/org/apache/solr/analysis/MultiTermAwareComponent.html).
- How can I disable this behaviour, so that "Orange" is no longer returned by the query for "apfel*"?

Regards,

Johannes
