I've been thinking about this too, and haven't come up with any GREAT way. But there are several possible ways that will do different things, good or bad, depending on the nature of your data and exactly what you want to do. So here are some ideas I've been thinking about, but not a ready-made solution for you.
One thing first: the statement about a "copy field to copy all dismax terms into one big field" doesn't exactly make sense. copyField is something that happens at index time, whereas dismax is only used at query time. Since it's only used at query time, just because you are using dismax for your main search doesn't mean you have to use dismax for your autocomplete query. The autocomplete query, the one that returns the things you're going to display in your auto-complete list, can be set up however you want. (We are talking about an auto-complete list, not a "Google Instant" style autocomplete, right? The latter would introduce even more issues.)

So, do you want the autocomplete to match only on the _entire query_ as entered, or do you want an autocomplete for each word? For instance, if I enter "dog walking", should the autocomplete be completing "dog walking" as a whole, or should it be completing just "walking" by the time I've typed in "dog walking"? It's easier to set up autocomplete on the whole phrase.

Next, though, you probably want autocomplete to complete on partial words, not just complete words. "Dog wal" should autocomplete to "dog walking". That introduces an extra kink too. But let's assume we want that.

So, one idea. At index time, populate a field that will be used exclusively for auto-completing. Make this field actually _non-tokenizing_: probably a Text type, but with the KeywordTokenizer (i.e., the non-tokenizing tokenizer, heh). So if you're indexing "dog walking", then the token in the field is actually "dog walking", not ["dog","walking"].

Next, normalize it by removing punctuation (because we probably don't want to consider punctuation for auto-completing), and maybe normalize whitespace by collapsing any adjacent whitespace to a single space and removing whitespace at the beginning and end. So " dog walking " will index as "dog walking".
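As a rough illustration of that normalization step, here is a small Python sketch. It is not Solr's analysis chain, just a stand-in for it; the function name normalize and the choice to treat "punctuation" as "anything that is not a word character or whitespace" are assumptions of the sketch. Case folding isn't part of what's described above, so it is left out.

```python
import re

def normalize(text):
    """Approximate the normalization described above (a sketch, not Solr's analyzer)."""
    # Drop punctuation. Assumption: "punctuation" means any character that is
    # neither a word character nor whitespace.
    no_punct = re.sub(r"[^\w\s]", "", text)
    # Collapse runs of whitespace to a single space, then trim both ends.
    return re.sub(r"\s+", " ", no_punct).strip()

print(repr(normalize("  dog   walking! ")))  # 'dog walking'
```

The same function would be applied to the indexed value and to whatever the user has typed so far, so that both sides end up in the same form.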
(This normalization matters more at query time than at index time, but it's less confusing to do the same normalization at both points.) That can be done with a pattern-replace char filter (presumably PatternReplaceCharFilterFactory).

But now we've also got to n-gram expand it. So if the term being indexed is "dog walking", we actually want to store ALL these terms in the index:

    "d" "do" "dog" "dog " "dog w" "dog wa" etc.

I.e., n-grams, but only expanded out from the front. I believe you can use the EdgeNGramFilterFactory for this (at index time only; this one you don't want in your query-time analyzers). Although I haven't actually tried the EdgeNGramFilterFactory with a non-tokenized field, I think it should work. This will expand the size of your index, hopefully not to a problematic degree.

Now, to actually do the auto-complete. At query time, take the whole thing the user has entered and issue a query, with whatever fq's you want too, but use the "field" type query parser, restricted to this autocomplete field you've created. (NOT "dismax" or "lucene", because we don't want the query parser to pre-tokenize on whitespace; and not "raw", because we DO want to go through the query-time field analyzers.) One way to do this is:

    q={!field f=my_autocomplete_field}the user's query

(URL-encoded, naturally.)

That's pretty much it. I think that should work, depending on the requirements of 'work', although I haven't tried it yet.

Now, if you want the user's query to auto-complete match in the middle of your terms, things get a lot more complicated. I.e., if you want "walk" to auto-complete to "dog walking" too, this won't do that. Also, if you want some kind of stemming to happen in auto-complete, this won't do that either. And if you want to auto-complete not the entire phrase the user has typed in, but each whitespace-separated word as they type it, this won't do THAT either.
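To make the idea above a bit more concrete, here is a Python sketch of the two halves: front-anchored n-gram expansion of an untokenized term, and building the URL-encoded {!field} query. This only imitates what the Solr analysis chain would do; the in-memory dict is a stand-in for the index and does not model fq at all. The helper names (edge_ngrams, build_index, autocomplete_params) and the filter field name "occupation" are hypothetical, made up for the illustration.

```python
from urllib.parse import urlencode

def edge_ngrams(term, min_gram=1):
    """All front-anchored prefixes of a single, untokenized term."""
    return [term[:n] for n in range(min_gram, len(term) + 1)]

def build_index(phrases):
    """Map every prefix to the phrases it came from (stand-in for the Solr index)."""
    index = {}
    for phrase in phrases:
        for gram in edge_ngrams(phrase):
            index.setdefault(gram, set()).add(phrase)
    return index

def autocomplete_params(user_input, field="my_autocomplete_field", fqs=()):
    """URL-encoded parameters for a {!field} query plus any filter queries."""
    params = [("q", "{!field f=%s}%s" % (field, user_input))]
    # fq values are passed through unchanged; "occupation" below is hypothetical.
    params += [("fq", fq) for fq in fqs]
    return urlencode(params)

index = build_index(["dog walking", "dog grooming"])
print(sorted(index.get("dog wal", set())))  # ['dog walking']
print(sorted(index.get("walk", set())))     # [] (no match in the middle of a term)
print(autocomplete_params("dog wal", fqs=["occupation:Assistant"]))
```

The query side does no n-gram expansion: the user's input is looked up as one whole term, which is why the edge n-gram filter belongs in the index-time analyzer only. The sketch has the same limitations just listed: no matching in the middle of a term, no stemming, and no per-word completion.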
Trying to get all those things to work becomes even more complicated -- especially with the requirement that you want to be able to apply the 'fq's from your current search context to the auto-complete. I haven't entirely thought through a possible way to do all that. But hopefully this gives you some clues to think about it.

Jonathan

________________________________________
From: David Yang [dy...@nextjump.com]
Sent: Friday, September 10, 2010 11:14 AM
To: solr-user@lucene.apache.org
Subject: Autocomplete with Filter Query

Hi,

Is there any way to provide autocomplete while filtering results? Suppose I had a bunch of people and each person has multiple occupations. When I select 'Assistant' in a filter box, it would be nice if autocomplete only provides assistant names, instead of all names.

The other issue is that I use DisMax to do my search (name, title, phone number etc) - so it might be more complex to do autocomplete. I could have a copy field to copy all dismax terms into one big field.

Cheers,
David