: 
: NOTE: I definitely don't want to discourage you from tackling this
: issue, but I think it's fair to mention that there is a workaround:
: if you can preprocess your queries yourself (maybe you don't expose
: the full Lucene syntax to your users, or something like that), you
: can escape the whitespace yourself, as in rain\ coat, and I think
: your synonyms will work as expected.
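
A minimal sketch of that preprocessing step, assuming the application
controls the raw query string and does not pass any other Lucene syntax
through. The function name escape_whitespace is hypothetical, and
collapsing runs of whitespace into a single escaped space is a choice
made here, not something the query parser requires:

```python
import re

def escape_whitespace(query: str) -> str:
    """Backslash-escape whitespace so the query parser does not split on it.

    Leading/trailing whitespace is dropped and each internal run of
    whitespace becomes one escaped space (an assumption of this sketch).
    """
    # A function is used as the replacement so the backslash is inserted
    # literally, without re.sub interpreting it as an escape sequence.
    return re.sub(r"\s+", lambda match: "\\ ", query.strip())

escaped = escape_whitespace("rain coat")
```

With this, "rain coat" becomes rain\ coat before it is handed to the
query parser.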

Alternatively: use a QueryParser that doesn't know/care about any special 
markup and just analyzes the entire input against a single (configured) 
field and generates the appropriate query -- Solr's "FieldQParser" works 
this way for example.
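
As a rough illustration, the request parameters for that approach could
be built like this. It assumes Solr's local-params syntax ({!field ...})
with parameter dereferencing (v=$qq); the field name "title", the
parameter name "qq", and the helper field_query_params are made up for
the example:

```python
from urllib.parse import urlencode

def field_query_params(field: str, user_input: str) -> dict:
    """Build Solr request parameters that send the whole input to one field.

    The raw user input travels in a separate parameter (qq), so no query
    operators in it are interpreted; it is only analyzed for `field`.
    """
    return {
        "q": "{!field f=%s v=$qq}" % field,
        "qq": user_input,
    }

params = field_query_params("title", "rain coat")
query_string = urlencode(params)  # ready to append to a /select URL
```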

You have to pick a tradeoff between "I want to support query operators 
like ':', '+', '-', and ' ' that let me build up BooleanQuery objects and 
query specific fields" and "I want the entire query string analyzed as one 
chunk."

: > really tripping them up. A prime example is that a search for "dress shoes"
: > returns a list of dresses and random shoes (not necessarily dress shoes). I
: > wish that I was able to synonym compound words to single tokens (e.g. "dress
: > shoes => dress_shoes"), but with this whitespace tokenization issue, it's
: > impossible.

This is one of the main use cases of the DismaxQParser (and now the
EDismaxQParser as well) with the "pf" param in Solr: you can have it
query for "dress" and/or "shoes" in some set of fields (qf), but also
for the entire phrase "dress shoes" in a distinct set of fields (pf),
where a match gets a higher score.
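
A small sketch of what such a request might look like, assuming the
standard defType, qf, and pf request parameters. The field names
("name", "description"), the boost value, and the helper dismax_params
are placeholders, not anything from your schema:

```python
from urllib.parse import urlencode

def dismax_params(user_input, qf, pf):
    """Build request parameters for Solr's dismax parser.

    qf: fields searched term by term ("dress" and/or "shoes").
    pf: fields searched for the whole input as a phrase ("dress shoes"),
        used to boost documents where the terms appear together.
    """
    return {
        "defType": "dismax",
        "q": user_input,
        "qf": " ".join(qf),
        "pf": " ".join(pf),
    }

# Hypothetical fields and boost: phrase matches in "name" count for more.
params = dismax_params("dress shoes", qf=["name", "description"], pf=["name^10"])
query_string = urlencode(params)  # ready to append to a /select URL
```

Documents matching only "dress" or only "shoes" can still be returned
via qf, but the ones containing the phrase in a pf field should rank
higher.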

http://wiki.apache.org/solr/DisMax
http://wiki.apache.org/solr/DisMaxQParserPlugin
http://www.lucidimagination.com/blog/2010/05/23/whats-a-dismax/



-Hoss
