Aha. I have no idea whether there actually is a better way of achieving that. Auto-completion with Solr is always tricky, and I personally have not been happy with any of the designs I've seen suggested for it. I'm also not entirely sure your design will actually work, but neither am I sure it won't!

I am thinking that for this auto-complete use you will actually need your field to be NOT tokenized, so you won't want to use the whitespace tokenizer after all (I think!) -- unless maybe there's another filter you can put at the end of the chain that will take all the tokens and join them back together, separated by a single space, as a single token. But I do think you'll need the whole multi-word string to be a single token in order to use terms.prefix the way you want.
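
To illustrate why I think the whole string has to be a single token, here is a rough sketch in plain Python. It only simulates a prefix lookup over a set of indexed terms; it is not Solr code, and the function name terms_with_prefix is made up for the example.

```python
# Plain-Python simulation of a prefix lookup over indexed terms.
# This is NOT Solr code; terms_with_prefix is a hypothetical helper.

def terms_with_prefix(indexed_terms, prefix):
    """Return the indexed terms that start with the prefix, sorted."""
    return sorted(term for term in indexed_terms if term.startswith(prefix))

# A whitespace-tokenized field: every word becomes its own term.
tokenized_terms = {"abc", "xyz", "foo"}

# An untokenized field: each multi-word string is one term.
whole_string_terms = {
    "abc", "abc xyz", "abc xyz foo", "abc foo", "xyz", "xyz foo", "foo",
}

# No single-word term starts with "abc fo", so nothing can be suggested.
no_match = terms_with_prefix(tokenized_terms, "abc fo")

# With whole strings as terms, the prefix finds the suggestion.
match = terms_with_prefix(whole_string_terms, "abc fo")

print(no_match)
print(match)
```

With single-word terms the prefix "abc fo" cannot match anything, because no term contains a space; with whole strings as terms it finds "abc foo".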

If you can't make ShingleFilter do it, though, I don't think there are any built-in analyzers that will do the transformation you want. You could write your own in Java, perhaps based on ShingleFilter -- or it might be easier to have your own software make the transformations you want and then simply send the pre-transformed strings to Solr when indexing. Then you could send them to a 'string' type field that won't tokenize.
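
The client-side transformation could look something like the following. This is only a sketch under my own assumptions: it is written in Python, the function name ordered_word_combinations is made up, and it assumes you want every combination of the input words that keeps their original order. The resulting strings would then be sent to the untokenized field by whatever indexing code you already have.

```python
from itertools import combinations

def ordered_word_combinations(text):
    """Return every non-empty combination of the words in text,
    keeping the original word order, each joined by a single space."""
    words = text.split()
    result = []
    for size in range(1, len(words) + 1):
        # combinations() preserves the input order within each tuple.
        for combo in combinations(words, size):
            result.append(" ".join(combo))
    return result

suggestions = ordered_word_combinations("abc xyz foo")
for suggestion in suggestions:
    print(suggestion)
```

Note that n words produce 2^n - 1 strings, so this grows quickly for long inputs, and repeated words in the input would produce duplicate strings that you may want to filter out before indexing.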

On 1/20/2011 4:40 PM, Martin Jansen wrote:
On 20.01.11 22:19, Jonathan Rochkind wrote:
On 1/20/2011 4:03 PM, Martin Jansen wrote:
I'm looking for an <analyzer> configuration for Solr 1.4 that
accomplishes the following:

Given the input "abc xyz foo" I would like to add at least the following
token combinations to the index:

     abc
     abc xyz
     abc xyz foo
     abc foo
     xyz
     xyz foo
     foo

Why do you want to do this, and what is it meant to accomplish? There might be a
better way to accomplish what you are trying to do; I can't think of anything
(which doesn't mean it doesn't exist) that would actually require the token
combinations you describe. What sorts of queries do you intend to serve with
this setup?
I'm in the process of setting up an index for term suggestion. In my use
case people should get the suggestion "abc foo" for the search query
"abc fo" and under the assumption that "abc xyz foo" has been submitted
to the index.

My current plan is to use TermsComponent with the terms.prefix=
parameter for this, because it seems to be pretty efficient and I get
things like correct sorting for free.

I assume there is a better way of achieving this, then?

- Martin
