Just glancing over this.  I believe one of the recent shingle contributions 
over in Lucene contrib/ indeed has the option to add those begin/end marker 
characters, so if this will solve your exact matching needs, that's the thing 
to look at.  You'll have to write (and contribute?) a bit of glue to use it in 
Solr.
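
For illustration only, here is a minimal sketch in plain Java of the query
expansion shown in the table quoted below. It does not use the Lucene or Solr
API. The class name ExactShingles, the method expand, and the literal ^ and $
markers are assumptions made up for this example, not the contrib filter's
actual interface.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene or Solr.
final class ExactShingles {
    // Assumed marker characters, following the ^...$ notation in the quoted message.
    static final String BEGIN = "^";
    static final String END = "$";

    // Returns every contiguous run of words in the query, each wrapped in
    // the begin/end markers, ordered by start position and then by length.
    static List<String> expand(String query) {
        List<String> out = new ArrayList<>();
        String trimmed = query.trim();
        if (trimmed.isEmpty()) {
            return out; // nothing to shingle
        }
        String[] words = trimmed.split("\\s+");
        for (int start = 0; start < words.length; start++) {
            StringBuilder shingle = new StringBuilder();
            for (int end = start; end < words.length; end++) {
                if (end > start) {
                    shingle.append(' ');
                }
                shingle.append(words[end]);
                out.add(BEGIN + shingle + END);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Prints the six marked shingles for the three-word example.
        System.out.println(expand("abcd efgh ijkl"));
    }
}
```

For this to give exact matching, the index side would presumably need each
whole field value wrapped in the same markers and kept as a single token, so
that a marked shingle only hits an entry it matches completely.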

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



----- Original Message ----
> From: Mck <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
> Sent: Monday, September 8, 2008 4:43:50 AM
> Subject: Re: Replacing FAST functionality at sesam.no
> 
> > I'm not very familiar with shingles but it seems to me that you should
> > have ShingleFilter at index time and make the query as a phrase query?
> 
> Then the entry "abcd efgh ijkl" would be indexed as 
> (abcd "abcd efgh" "abcd efgh ijkl" efgh "efgh ijkl" ijkl)
> 
> and a subsequent query "abcd" would return this entry.
> If this is so then this is not exact matching and not what we are
> looking for.
> 
> The filter behaviour we are looking for is like:
>    (I've included ^$ to denote the exact matching)
> 
> Original Query   --> Filtered Query
> abcd             --> ^abcd$
> "abcd efgh"      --> (^abcd$ ^"abcd efgh"$ ^efgh$)
> "abcd efgh ijkl" --> (^abcd$ ^"abcd efgh"$ ^"abcd efgh ijkl"$ ^efgh$ ^"efgh ijkl"$ ^ijkl$)
> 
> 
> ~mck
> 
> -- 
> "All stable processes we shall predict. All unstable processes we shall
> control." John von Neumann 
> | semb.wever.org | sesat.no | sesam.no |
