Hi Uri,
Yes, I think that would make sense (word vs. synonym token types). Custom
boosting/weighting of original token vs. synonym token(s) also makes sense. Is
this something you can provide a patch for?
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
----- Original Message -----
From: Uri Boness [EMAIL PROTECTED]
To: solr-user@lucene.apache.org
Sent: Wednesday, June 11, 2008 8:56:02 PM
Subject: synonym token types and ranking
Hi,
I've noticed that currently the SynonymFilter replaces the original
token with the configured tokens list (which includes the original
matched token) and each one of these tokens is of type word. Wouldn't
it make more sense to only mark the original token as type word and
the other tokens as synonym types? In addition, once payloads are
integrated with Solr, it would be nice if it would be possible to
configure a payload for synonyms. One of the requirements we're
currently facing in our project is that matches on synonyms should weigh
less than exact matches.
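
To illustrate the idea, here is a rough sketch in plain Java. It is not
actual Solr or Lucene code: the class SynonymTypeSketch, the Token class,
the "synonym" type name and the synonymWeight parameter are all made up
for this example. It only shows the original token keeping type word, the
expanded tokens getting a separate synonym type, and a match on a synonym
scoring lower than an exact match.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the real SynonymFilter.
class SynonymTypeSketch {
    static final String WORD = "word";
    static final String SYNONYM = "synonym"; // assumed type name

    // Minimal stand-in for a token carrying a type.
    static final class Token {
        final String text;
        final String type;

        Token(String text, String type) {
            this.text = text;
            this.type = type;
        }
    }

    // Emit the original token as type "word" and every other configured
    // token as type "synonym". The original is not emitted twice even if
    // the configured list contains it.
    static List<Token> expand(String original, Map<String, List<String>> synonyms) {
        List<Token> out = new ArrayList<>();
        out.add(new Token(original, WORD));
        List<String> configured = synonyms.get(original);
        if (configured != null) {
            for (String s : configured) {
                if (!s.equals(original)) {
                    out.add(new Token(s, SYNONYM));
                }
            }
        }
        return out;
    }

    // Toy scoring: 1.0 for a match on the original token, synonymWeight
    // for a match on a synonym token, 0.0 for no match.
    static float score(String queryTerm, List<Token> tokens, float synonymWeight) {
        float best = 0f;
        for (Token t : tokens) {
            if (t.text.equals(queryTerm)) {
                float w = SYNONYM.equals(t.type) ? synonymWeight : 1.0f;
                best = Math.max(best, w);
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Map<String, List<String>> synonyms = new HashMap<>();
        synonyms.put("car", Arrays.asList("car", "automobile"));

        List<Token> tokens = expand("car", synonyms);
        for (Token t : tokens) {
            System.out.println(t.text + " (" + t.type + ")");
        }
        System.out.println(score("car", tokens, 0.5f));
        System.out.println(score("automobile", tokens, 0.5f));
    }
}
```

In a real filter the weight would presumably come from configuration and
be carried as a payload or applied at query time, rather than hard-coded
as in this toy scorer.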
cheers,
Uri