The WordDelimiterFilter is set to
generateWordParts=1, generateNumberParts=1, catenateWords=1, catenateNumbers=1
at both index and query time.
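
For reference, a sketch of how those settings would look in schema.xml. Only
the four filter attributes come from the setup described above; the fieldType
name "text" and the whitespace tokenizer are assumptions for illustration.

```xml
<!-- Sketch only: fieldType name and tokenizer are assumed, not taken from the real schema -->
<fieldType name="text" class="solr.TextField">
  <!-- No type attribute on <analyzer>, so the same chain runs at index and query time -->
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1"
            generateNumberParts="1"
            catenateWords="1"
            catenateNumbers="1"/>
  </analyzer>
</fieldType>
```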

Now I have this synonym mapping: k-1 => k1 visa

Here is the parsedquery_toString:
<str name="parsedquery_toString">
+(text:"k (1 k) 1 visa"^0.8 | name:"k (1 k) 1 visa"^2.0)~0.01 (text:"k (1 k)
1 visa"~25^0.8 | name:"k (1 k) 1 visa"~25^2.0)~0.01
</str>

Why is Solr grouping the terms this way: k (1 k) 1 visa? (I mean the "1 k"
within parentheses.)
Also, after k-1 gets split by the WordDelimiterFilter, does catenateWords=1
make k1 a single token?

As for the matching, take
(text:"k (1 k) 1 visa"^0.8
Documents that contain the exact phrase "k1 visa" would rank higher, and docs
with just k1 might rank next.
Since I have ps set to 25, would it also match docs that have 'k' and '1'
within 25 words of one another? Or k1 and visa within 25 words of one
another, because k1 is a single token? I get confused about how Solr
matches documents in cases like this.





Yonik Seeley wrote:
> 
> On Jan 5, 2008 2:28 PM, anuvenk <[EMAIL PROTECTED]> wrote:
>> Thats what i'm thinking too. If i remove solr.worddelimiter filter from
>> both
>> index and query, the word h1-b will remain as is in the index correct, so
>> if
>> someone searches for h1b (without hyphens) would it still return the h1-b
>> doc.
> 
> for "h1-b" to match "h1b", it will take either a synonym or something
> like WordDelimiterFilter.
> You can configure WordDelimiterFilter to only catenate too... so h1-b
> would become h1b at both index and query time.  The downside is that
> it might catenate things you want.
> 
> -Yonik
> 
> 
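
A sketch of the catenate-only configuration described in the quoted reply
might look like the following. The attribute values are an assumption about
what "only catenate" means here: the part-generation options are switched off
and catenateAll is switched on, so that h1-b should come out as the single
token h1b at both index and query time.

```xml
<!-- Sketch only: assumed reading of "only catenate"; verify with the analysis page -->
<filter class="solr.WordDelimiterFilterFactory"
        generateWordParts="0"
        generateNumberParts="0"
        catenateWords="0"
        catenateNumbers="0"
        catenateAll="1"/>
```

As noted in the reply, the downside is that this joins every delimited term,
including ones that might be better left split.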

-- 
View this message in context: 
http://www.nabble.com/solr-word-delimiter-tp14630435p14641602.html
Sent from the Solr - User mailing list archive at Nabble.com.
