Hi Zac,

The Field Analysis tool (analysis.jsp) does not perform actual query parsing; 
it only runs the field's analyzer over the text you enter.

One thing to be aware of when using the keyword tokenizer at query time: the 
query string (chicken stock) is pre-tokenized on whitespace by the query parser 
before it reaches the keyword tokenizer.

If you use quotes ("chicken stock"), the query parser does not pre-tokenize, 
though.
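
To make the difference concrete, here is a rough Python sketch of the idea. It 
is only a simplified model, not Solr's actual code: pre_tokenize, 
keyword_tokenizer and query_terms are hypothetical names made up for this 
illustration, and the real query parser handles far more syntax than 
double quotes and whitespace.

```python
import re

def pre_tokenize(query):
    """Model of the query parser: split on whitespace, but keep a
    double-quoted phrase together as a single chunk."""
    chunks = []
    for quoted, bare in re.findall(r'"([^"]*)"|(\S+)', query):
        chunk = quoted if quoted else bare
        if chunk:
            chunks.append(chunk)
    return chunks

def keyword_tokenizer(text):
    """Model of a keyword tokenizer: the whole input becomes one token."""
    return [text]

def query_terms(query):
    """Terms that reach the index lookup: each pre-tokenized chunk is
    analyzed separately."""
    terms = []
    for chunk in pre_tokenize(query):
        terms.extend(keyword_tokenizer(chunk))
    return terms

# The index holds the full field value as a single term.
indexed_terms = set(keyword_tokenizer("chicken stock"))

unquoted = query_terms('chicken stock')    # split before analysis
quoted = query_terms('"chicken stock"')    # reaches the tokenizer whole

print(unquoted, [t in indexed_terms for t in unquoted])
print(quoted, [t in indexed_terms for t in quoted])
```

In this model the unquoted query yields two separate terms, neither of which 
equals the single indexed term, while the quoted query yields one term that 
does.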

--- On Fri, 2/10/12, Zac Smith <z...@trinkit.com> wrote:

> From: Zac Smith <z...@trinkit.com>
> Subject: RE: Keyword Tokenizer Phrase Issue
> To: "solr-user@lucene.apache.org" <solr-user@lucene.apache.org>
> Date: Friday, February 10, 2012, 10:35 AM
> I have done some further analysis on
> this and I am now even more confused. When I use the Field
> Analysis tool with the text 'chicken stock' it highlights
> that text as a match.
> The dismax query looks ok to me:
> +(DisjunctionMaxQuery((ingredient_synonyms:chicken^0.6)~0.01) DisjunctionMaxQuery((ingredient_synonyms:stock^0.6)~0.01)) DisjunctionMaxQuery((ingredient_synonyms:chicken stock^0.6)~0.01)
> 
> Then I have done an explainOther and it shows a failure to
> meet condition. However there does seem to be some kind of
> match registered:
> 0.0 = (NON-MATCH) Failure to meet condition(s) of required/prohibited clause(s)
>   0.0 = no match on required clause (ingredient_synonyms:chicken^0.6 ingredient_synonyms:stock^0.6)
>   0.0650662 = (MATCH) weight(ingredient_synonyms:chicken stock^0.6 in 0), product of:
>     0.21204369 = queryWeight(ingredient_synonyms:chicken stock^0.6), product of:
>       0.6 = boost
>       0.30685282 = idf(docFreq=1, maxDocs=1)
>       1.1517122 = queryNorm
>     0.30685282 = (MATCH) fieldWeight(ingredient_synonyms:chicken stock in 0), product of:
>       1.0 = tf(termFreq(ingredient_synonyms:chicken stock)=1)
>       0.30685282 = idf(docFreq=1, maxDocs=1)
>       1.0 = fieldNorm(field=ingredient_synonyms, doc=0)
> 
> Any ideas?
> 
> My dismax handler is setup like this:
>   <requestHandler name="dismax" class="solr.SearchHandler" >
>     <lst name="defaults">
>      <str name="defType">dismax</str>
>      <str name="echoParams">explicit</str>
>      <float name="tie">0.01</float>
>      <str name="qf">ingredient_synonyms^0.6</str>
>      <str name="pf">ingredient_synonyms^0.6</str>
>     </lst>
>   </requestHandler>
> 
> Zac
> 
> From: Zac Smith
> Sent: Thursday, February 09, 2012 12:52 PM
> To: solr-user@lucene.apache.org
> Subject: Keyword Tokenizer Phrase Issue
> 
> Hi,
> 
> I have a simple field type that uses the
> KeywordTokenizerFactory. I would like to use this so that
> values in this field are only matched with the full text of
> the field.
> e.g. If I indexed the text 'chicken stock', searches on this
> field would only match when searching for 'chicken stock'.
> If searching for just 'chicken' or just 'stock' there should
> be no match.
> 
> This mostly works, except if there is more than one word in
> the text I only get a match when searching with quotes.
> e.g.
> "chicken stock" (matches)
> chicken stock (doesn't match)
> 
> Is there any way I can set this up so that I don't have to
> provide quotes? I am using dismax and if I put quotes in it
> will mess up the search for the rest of my fields. I had an
> idea that I could issue a separate search using the regular
> query parser, but couldn't work out how to do this:
> I thought I could do something like this:
> qt=dismax&q=fish OR _query_:ingredient:"chicken stock"
> 
> I am using solr 3.5.0. My field type is:
> <fieldType name="keyword_test" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="true">
>   <analyzer type="index">
>     <tokenizer class="solr.KeywordTokenizerFactory" />
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.KeywordTokenizerFactory" />
>   </analyzer>
> </fieldType>
> 
> Thanks
> Zac
>
