synonym token types and ranking

2008-06-11 Thread Uri Boness

Hi,

I've noticed that the SynonymFilter currently replaces the original 
token with the configured token list (which includes the original 
matched token), and each of these tokens is of type word. Wouldn't 
it make more sense to mark only the original token as type word and 
the other tokens as type synonym? In addition, once payloads are 
integrated with Solr, it would be nice to be able to 
configure a payload for synonyms. One of the requirements we're 
currently facing in our project is that matches on synonyms should weigh 
less than exact matches.
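
To make the idea concrete, here is a rough sketch of the behaviour I have in mind. It is plain Java and deliberately does not use the Lucene or Solr classes: SynonymTypeSketch, Token, expand and the weight field are all hypothetical names for illustration, and the weight is only a stand-in for whatever a payload would carry.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only: none of these names are Lucene or Solr APIs.
class SynonymTypeSketch {

    /** Simplified token: text, a type label, and a relative weight. */
    static final class Token {
        final String text;
        final String type;   // "word" for the original, "synonym" for expansions
        final float weight;  // stand-in for a payload-based boost

        Token(String text, String type, float weight) {
            this.text = text;
            this.type = type;
            this.weight = weight;
        }

        @Override
        public String toString() {
            return text + "/" + type + "/" + weight;
        }
    }

    /**
     * Expands a matched term into the original token (type "word", weight 1.0)
     * followed by its configured synonyms (type "synonym", reduced weight).
     */
    static List<Token> expand(String term,
                              Map<String, List<String>> synonyms,
                              float synonymWeight) {
        List<Token> out = new ArrayList<>();
        out.add(new Token(term, "word", 1.0f));
        List<String> configured = synonyms.get(term);
        if (configured != null) {
            for (String candidate : configured) {
                // The configured list may include the original term; skip it
                // so the original is emitted once, as type "word".
                if (!candidate.equals(term)) {
                    out.add(new Token(candidate, "synonym", synonymWeight));
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, List<String>> synonyms = new HashMap<>();
        synonyms.put("tv", Arrays.asList("tv", "television", "telly"));
        for (Token t : expand("tv", synonyms, 0.5f)) {
            System.out.println(t);
        }
    }
}
```

A scorer could then multiply a match's contribution by the token weight, so that a hit on a synonym counts for less than a hit on the original term. How that would actually hook into Solr scoring is left open here.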


cheers,
Uri


Re: synonym token types and ranking

2008-06-11 Thread Otis Gospodnetic
Hi Uri,

Yes, I think that would make sense (word vs. synonym token types).  Custom 
boosting/weighting of original token vs. synonym token(s) also makes sense.  Is 
this something you can provide a patch for?

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


----- Original Message -----
 From: Uri Boness [EMAIL PROTECTED]
 To: solr-user@lucene.apache.org
 Sent: Wednesday, June 11, 2008 8:56:02 PM
 Subject: synonym token types and ranking
 