Hello,
I've seen references to this on the list, but never a complete explanation. My
apologies if this is a FAQ (and for the length of this email).

I am using dismax across a number of fields on an index with data about music
albums and songs. The fields are quite full of stop words. I am trying to boost
'exact' matches, i.e. if you search for 'The Doors', documents containing 'The
Doors' should come first. I've created the following fieldType and use it for
the fields artist_exact and title_exact:


        <fieldType name="lowerCaseString" class="solr.TextField"
                   sortMissingLast="true" omitNorms="true">
                <analyzer>
                        <!-- KeywordTokenizer does no actual tokenizing, so the
                             entire input string is preserved as a single token -->
                        <tokenizer class="solr.KeywordTokenizerFactory" />
                        <!-- The LowerCase TokenFilter does what you expect, which
                             can be useful when you want your sorting to be case
                             insensitive -->
                        <filter class="solr.LowerCaseFilterFactory" />
                        <!-- The TrimFilter removes any leading or trailing
                             whitespace -->
                        <filter class="solr.TrimFilterFactory" />
                </analyzer>
        </fieldType>
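
To make the intent of this field type concrete, here is a rough Python model of the three analysis steps. It is only a sketch and not Solr code; the function name analyze_lowercase_string is made up for illustration.

```python
def analyze_lowercase_string(value):
    """Toy model of the lowerCaseString field type (not Solr code)."""
    # KeywordTokenizer: the entire input becomes a single token.
    tokens = [value]
    # LowerCaseFilter: lowercase each token.
    tokens = [t.lower() for t in tokens]
    # TrimFilter: strip leading and trailing whitespace.
    tokens = [t.strip() for t in tokens]
    return tokens


print(analyze_lowercase_string("  The Doors "))  # ['the doors']
```

Under this model a value only matches when the whole lowercased, trimmed string is identical, which is the 'exact' behaviour the _exact fields are meant to provide.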

I then give artist_exact and title_exact pretty high boosts
(title_exact^200.0 artist_exact^100.0).

Now, when I search with ?q=the doors, the terms in my q= aren't used together
to build the dismax query, so I never get a match on the _exact fields (there
are a few other fields involved, all pretty self-explanatory):

<str name="rawquerystring">the doors</str>
<str name="querystring">the doors</str>
<str name="parsedquery">
+((DisjunctionMaxQuery((title_ngram2:"th he"^0.1 | artist_ngram2:"th he"^0.1 |
title_ngram3:the^4.5 | artist_ngram3:the^3.5 | artist_exact:the^100.0 |
title_exact:the^200.0)~0.01) DisjunctionMaxQuery((genre:door^0.2 |
title_ngram2:"do oo or rs"^0.1 | artist_ngram2:"do oo or rs"^0.1 |
title_ngram3:"doo oor ors"^4.5 | title:door^6.0 | artist_ngram3:"doo oor
ors"^3.5 | artist:door^4.0 | artist_exact:doors^100.0 |
title_exact:doors^200.0)~0.01))~2) DisjunctionMaxQuery((title:door^2.0 |
artist:door^0.8)~0.01) FunctionQuery((ord(release_year))^0.5) </str>

<str name="parsedquery_toString"> +(((title_ngram2:"th he"^0.1 |
artist_ngram2:"th he"^0.1 | title_ngram3:the^4.5 | artist_ngram3:the^3.5 |
artist_exact:the^100.0 | title_exact:the^200.0)~0.01 (genre:door^0.2 |
title_ngram2:"do oo or rs"^0.1 | artist_ngram2:"do oo or rs"^0.1 |
title_ngram3:"doo oor ors"^4.5 | title:door^6.0 | artist_ngram3:"doo oor
ors"^3.5 | artist:door^4.0 | artist_exact:doors^100.0 |
title_exact:doors^200.0)~0.01)~2) (title:door^2.0 | artist:door^0.8)~0.01
(ord(release_year))^0.5 </str>


But if I build my search as ?q="the doors", I get:

<str name="parsedquery">
+DisjunctionMaxQuery((genre:door^0.2 | title_ngram2:"th he e   d do oo or
rs"^0.1 | artist_ngram2:"th he e   d do oo or rs"^0.1 | title_ngram3:"the he  e
d  do doo oor ors"^4.5 | title:door^6.0 | artist_ngram3:"the he  e d  do doo
oor ors"^3.5 | artist:door^4.0 | artist_exact:the doors^100.0 | title_exact:the
doors^200.0)~0.01) DisjunctionMaxQuery((title:door^2.0 | artist:door^0.8)~0.01)
FunctionQuery((ord(release_year))^0.5) </str>

<str name="parsedquery_toString"> +(genre:door^0.2 | title_ngram2:"th he e   d
do oo or rs"^0.1 | artist_ngram2:"th he e   d do oo or rs"^0.1 |
title_ngram3:"the he  e d  do doo oor ors"^4.5 | title:door^6.0 |
artist_ngram3:"the he  e d  do doo oor ors"^3.5 | artist:door^4.0 |
artist_exact:the doors^100.0 | title_exact:the doors^200.0)~0.01
(title:door^2.0 | artist:door^0.8)~0.01 (ord(release_year))^0.5 </str>
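
Going by the two outputs above, the unquoted query appears to be broken on whitespace before each piece is analyzed per field, while the quoted query reaches the _exact fields as one string. The following Python sketch only models that observation. It is an assumption about what is happening, not Solr's actual parsing code, and exact_field_terms is a hypothetical name.

```python
import shlex


def exact_field_terms(q):
    """Toy model: the terms a KeywordTokenizer field would see for query q."""
    # shlex.split breaks on whitespace but keeps a "quoted phrase" together,
    # which mirrors the difference between the two debug outputs.
    chunks = shlex.split(q)
    # Each chunk is analyzed on its own (lowercased, trimmed), never re-joined.
    return [chunk.lower().strip() for chunk in chunks]


print(exact_field_terms('the doors'))    # ['the', 'doors']
print(exact_field_terms('"the doors"'))  # ['the doors']
```

In this model only the quoted form can ever equal an indexed value such as 'the doors', which is consistent with artist_exact:the doors showing up only in the second parsed query.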

I've tried other queries that don't include stop words ('smashing pumpkins',
for example), and in all cases, if I don't use quotes, only the LAST word is
used with my _exact fields (tried with 1, 2 and 3 words, always the same
result against my _exact fields).

What is the reason for this behaviour? 

My full dismax config is:

<str name="mm">2<-1 5<-2 6<90%</str>
<str name="spellcheck">true</str>
<str name="spellcheck.extendedResults">true</str>
<str name="tie">0.01</str>
<str name="qf">
title_exact^200.0 artist_exact^100.0 title^6.0 title_ngram3^4.5 artist^4.0
artist_ngram3^3.5 title_ngram2^0.1 artist_ngram2^0.1 genre^0.2 </str>
<str name="q.alt">*:*</str>
<str name="spellcheck.collate">true</str>
<str name="defType">dismax</str>
<str name="spellcheck.onlyMorePopular">true</str>
<str name="rows">10</str>
<str name="pf">title^2.0 artist^0.8</str>
<str name="echoParams">all</str>
<str name="fl">*,score</str>
<str name="bf">ord(release_year)^0.5</str>
<str name="spellcheck.count">1</str>
<str name="ps">100</str>
</lst>
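
For anyone reading the mm value in the config above, this is how I understand the spec "2<-1 5<-2 6<90%", written as a small Python sketch. It is my reading of the documented syntax rather than Solr's implementation, and min_should_match is a made-up helper name.

```python
def min_should_match(clause_count, spec="2<-1 5<-2 6<90%"):
    """Toy reading of a dismax mm spec (an interpretation, not Solr code)."""
    result = clause_count  # default: every optional clause must match
    for part in spec.split():
        upper, rule = part.split("<")
        if clause_count <= int(upper):
            break  # this condition does not apply, keep the previous result
        if rule.endswith("%"):
            # Percentages are truncated toward zero.
            calc = int(clause_count * int(rule[:-1]) / 100)
        else:
            calc = int(rule)
        # Negative values mean "all but that many" clauses.
        result = clause_count + calc if calc < 0 else calc
    return max(0, min(result, clause_count))


for n in (1, 2, 3, 5, 6, 10):
    print(n, min_should_match(n))
```

With two query terms this gives 2, which agrees with the ~2 in the first parsed query above.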

TIA!
B
_________________________
{Beto|Norberto|Numard} Meijome

"Never offend people with style when you can offend them with substance."
  Sam Brown

I speak for myself, not my employer. Contents may be hot. Slippery when wet.
Reading disclaimers makes you go blind. Writing them is worse. You have been
Warned.
