Re: Find results with or without whitespace

2011-08-31 Thread roySolr
Frankie, have you fixed this issue? I'm interested in your solution.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Find-results-with-or-without-whitespace-tp3117144p3298298.html
Sent from the Solr - User mailing list archive at Nabble.com.


Find results with or without whitespace

2011-06-28 Thread Frankie
I'm looking for a way to index/search on terms that may or may not contain
spaces.
An example explains it better:
- Looking for healthcare, I want to find both healthcare and health care.
- Looking for health care, I want to find both health care and healthcare.

My other constraints are
- I will index rather long strings (extracted from Office documents)
- I want to avoid synonym lists (as they may be incomplete)
- I want to avoid specific logic (i.e. query rewriting with as many OR
clauses as the combinations of search terms require)
- I don't want to rely on an uppercase/lowercase tokenizer (as users are...
creative)

I have already tried many tokenizer/filter combinations without success,
and I have not found any answer to this problem.




Re: Find results with or without whitespace

2011-06-28 Thread roySolr
I had the same problem:

http://lucene.472066.n3.nabble.com/Results-with-and-without-whitespace-soccer-club-and-soccerclub-td2934742.html#a2964942





Re: Find results with or without whitespace

2011-06-28 Thread Frankie
Thank you for your answer.

I agree, I can manage predictable values through synonyms.

However, most data in this index are company and product names, which
sometimes leads to rather strange syntax (a mix of upper/lower case,
misplaced dashes or spaces). One purpose of using Solr was to help find
potential duplicates before data insertion.

On the other hand, I could write a custom tokenizer/filter and a custom query
builder that would test many combinations. However, I have the feeling that
this is an inefficient approach.
That is:
Indexing: chelsea soccer club =
chelsea, soccer, club, chelseasoccer, soccerclub, chelseasoccerclub
Searching: chelsea soccerclub = (chelsea AND soccerclub) OR
chelseasoccerclub
While search expressions are generally short, indexing will be a
nightmare...
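
To make the scheme above concrete, here is a minimal sketch in plain Python,
not an actual Solr tokenizer/filter. The function names (index_terms,
query_alternatives, matches) and the max_words limit are assumptions made for
illustration only; they are not part of Solr or of any existing plugin.

```python
from typing import List, Set


def index_terms(text: str, max_words: int = 3) -> List[str]:
    """Return each word plus the concatenation of up to max_words adjacent words."""
    tokens = text.lower().split()
    terms: List[str] = []
    for size in range(1, max_words + 1):
        # Slide a window of `size` words over the text and glue it together.
        for start in range(len(tokens) - size + 1):
            terms.append("".join(tokens[start:start + size]))
    return terms


def query_alternatives(query: str) -> List[List[str]]:
    """Return every way of joining adjacent query words into longer terms."""
    words = query.lower().split()
    if not words:
        return []
    alternatives: List[List[str]] = []
    gaps = len(words) - 1
    # Each bit of `mask` decides whether one gap between words is removed.
    for mask in range(2 ** gaps):
        parts = [words[0]]
        for i in range(1, len(words)):
            if mask & (1 << (i - 1)):
                parts[-1] += words[i]   # join with the previous word
            else:
                parts.append(words[i])  # keep as a separate term
        alternatives.append(parts)
    return alternatives


def matches(query: str, indexed: Set[str]) -> bool:
    """True if at least one alternative has all of its terms in the index."""
    return any(all(term in indexed for term in alternative)
               for alternative in query_alternatives(query))


indexed = set(index_terms("chelsea soccer club"))
print(index_terms("chelsea soccer club"))
print(query_alternatives("chelsea soccerclub"))
print(matches("chelsea soccerclub", indexed))
```

For a text of n words this produces roughly n * max_words index terms, so the
limit on how many adjacent words are glued together is what keeps the index
size under control for long strings. The query side grows as 2^(words - 1)
alternatives, which stays small only because search expressions are short.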

