> So then i change type="string" to type="shingleString" along with
> > [snip]
> >       <analyzer type="query">
> >         <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >         <filter class="solr.ShingleFilterFactory" outputUnigrams="true" 
> > outputUnigramIfNoNgram="true" maxShingleSize="99" />
> >       </analyzer>

Debugging ShingleFilter, I see that without quotes the shingles
StringBuffer array contains just the current token.

When the query does have quotes, the shingles array fills up with the
expected shingles, and the Query (in fact a MultiPhraseQuery)
  returned from SolrQueryParser.getFieldQuery()
  looks like

list_entry_shingle:"(abcd abcd efgh abcd efgh ijkl) (efgh efgh ijkl) ijkl"

I'm struggling to make sense of this.
How can the shingles be matched if they aren't quoted?
Why add the parentheses () when the query has default operator OR?

I would instead expect a Query like:
abcd "abcd efgh" "abcd efgh ijkl" efgh "efgh ijkl" ijkl

(With the ShingleFilter disabled, this query does indeed work perfectly.)
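
To make the expected form concrete, here is a rough sketch in plain Java of
how that query string could be assembled from the whitespace-separated
tokens. It uses no Solr or Lucene classes; ShinglePhraseSketch and build()
are hypothetical names of mine, and the code only illustrates the output I
have in mind, not how ShingleFilter or SolrQueryParser work internally.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Solr or Lucene.
public class ShinglePhraseSketch {

    // Builds every shingle of 1..maxShingleSize tokens and quotes the
    // multi-token ones so a query parser would treat them as phrases.
    static String build(String input, int maxShingleSize) {
        String[] tokens = input.trim().split("\\s+");
        List<String> clauses = new ArrayList<>();
        for (int start = 0; start < tokens.length; start++) {
            StringBuilder shingle = new StringBuilder();
            for (int size = 1;
                 size <= maxShingleSize && start + size <= tokens.length;
                 size++) {
                if (size > 1) {
                    shingle.append(' ');
                }
                shingle.append(tokens[start + size - 1]);
                // Unigrams stay bare; longer shingles become quoted phrases.
                clauses.add(size == 1 ? shingle.toString()
                                      : "\"" + shingle + "\"");
            }
        }
        return String.join(" ", clauses);
    }

    public static void main(String[] args) {
        // Same limit as maxShingleSize="99" in the analyzer config above.
        System.out.println(build("abcd efgh ijkl", 99));
    }
}
```

Running main() on the three-token input should print the same string as the
query I wrote out by hand above.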

Am I barking up the wrong tree?
Is there a way to get the shingles phrased?

Otis, you mentioned this briefly in your reply on the dev list:
> Make sure you turn them into phrase queries

Did you mean something more here than just quoting the original query?

~mck

-- 
"Claiming Java is easier than C++ is like saying that K2 is shorter than
Everest." Larry O'Brien 
| semb.wever.org | sesat.no | sesam.no |
