On 09/09/2008 at 4:38 PM, Mck wrote:
> 
> > Looks to me like MultiPhraseQuery is getting in the way.  Shingles
> > that begin at the same word are given the same position by
> > ShingleFilter, and Solr's FieldQParserPlugin creates a
> > MultiPhraseQuery when it encounters tokens in a query with the same
> > position.  I think what you want is to convert queries into shingle
> > disjunctions (*any* matching shingle results in a hit),  right?
> 
> Yes you're right Steve. thank you.
> 
> One way, i see now, to get the behaviour i want is to set the unigrams'
> positionIncrement to zero instead of one.
> 
> For example in ShingleFilter.fillOutputBuffer(..) replacing the two
> ocurrances of
> > .setPositionIncrement(1); 
> with 
> > .setPositionIncrement(0);
> 
> Then i end up with a MultiPhraseQuery with
>         termArrays[0] = { list_entry_shingles:abcd
>                           list_entry_shingles:abcd efgh
>                           list_entry_shingles:abcd efgh ijkl
>                           list_entry_shingles:efgh
>                           list_entry_shingles:efgh ijkl
>                           list_entry_shingles:ijkl }
> 
> and it works perfectly :-)

It works because you've set all of the shingles to be at the same position - 
probably better to change the one instance of .setPositionIncrement(0) to 
.setPositionIncrement(1) - that way, MultiPhraseQuery will not be invoked, and 
the standard disjunction thing should happen.

> [W]ould a patch to ShingleFilter that offers an option
> "unigramPositionIncrement" (that defaults to 1) likely be
> accepted into trunk?

The issue is not directly related to whether a unigram is involved, but rather 
whether or not tokens that begin at the same word are given the same position.  
The option thus should be named something like "coterminalPositionIncrement".  
This seems like a reasonable addition, and a patch likely would be accepted, if 
it included unit tests.

Steve

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to