> >
> > > We do auto-complete through prefix searches on shingles.
> > >
> >
> > Just to confirm, do you mean using EdgeNgram filter to produce letter
> > ngrams of the tokens in the chosen field?
> >
> >
>
>> No, I'm talking about prefix search on tokens produced by a ShingleFilter.
>>
>
> I did not know about the Prefix query parser in Solr. Thanks a lot for
> pointing out the same.
>
> I find relatively little online material about the Solr/Lucene prefix query
> parser. Kindly point me to any useful resource that I might be missing.
>
>
I looked into the Solr/Lucene classes and found the required information.
I am summarizing it here for the benefit of anyone who might refer to this
thread in the future.

The change I had to make was very simple: call getPrefixQuery instead of
getWildcardQuery in my custom-modified Solr dismax query parser class.
This should nonetheless make a fairly significant difference in
efficiency. The key difference between Lucene's WildcardQuery and
PrefixQuery lies in their respective term enumerators, specifically in how
each compares terms. The termCompare method used for PrefixQuery is more
lightweight than the one used for WildcardQuery; it is essentially an
optimization, since a prefix query is just a special case of a wildcard
query. This is also why the Lucene query parser automatically creates a
PrefixQuery rather than a WildcardQuery for query terms of the form 'foo*'.
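
For anyone who wants to see the idea in isolation, here is a rough sketch in
plain Java. It does not use Lucene at all: the class and method names
(PrefixVsWildcardSketch, prefixMatch, wildcardMatch, prefixTerms) are made up
for illustration, and the real term enumerators may well differ in detail. It
only shows why a prefix comparison can be cheaper than a general wildcard
comparison, and how a walk over sorted terms can stop at the first term that
no longer carries the prefix.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;

// Hypothetical illustration only; this is NOT Lucene code.
class PrefixVsWildcardSketch {

    // Prefix comparison: a single startsWith check per term.
    static boolean prefixMatch(String term, String prefix) {
        return term.startsWith(prefix);
    }

    // General wildcard comparison: '*' matches any run of characters,
    // '?' matches exactly one. Needs backtracking state for the last '*'.
    static boolean wildcardMatch(String term, String pattern) {
        int t = 0, p = 0, starP = -1, starT = -1;
        while (t < term.length()) {
            if (p < pattern.length() && pattern.charAt(p) == '*') {
                starP = p++;   // remember the star and what it has consumed
                starT = t;
            } else if (p < pattern.length()
                    && (pattern.charAt(p) == '?' || pattern.charAt(p) == term.charAt(t))) {
                t++;
                p++;
            } else if (starP >= 0) {
                p = starP + 1; // let the last star absorb one more character
                t = ++starT;
            } else {
                return false;
            }
        }
        while (p < pattern.length() && pattern.charAt(p) == '*') {
            p++;               // trailing stars match the empty string
        }
        return p == pattern.length();
    }

    // Walk sorted terms (e.g. shingles) from the prefix onward and stop at
    // the first term that does not start with it.
    static List<String> prefixTerms(SortedSet<String> terms, String prefix) {
        List<String> out = new ArrayList<>();
        for (String term : terms.tailSet(prefix)) {
            if (!prefixMatch(term, prefix)) {
                break;
            }
            out.add(term);
        }
        return out;
    }
}
```

With a sorted set of shingles such as "new jersey", "new york", "new york
city" and "newark", calling prefixTerms with "new y" visits only the terms
from "new york" onward and stops as soon as it reaches "newark".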

A big thank you to Shalin for providing valuable guidance and insight.

One final question for Shalin on this topic: I am guessing you ensured
there were no duplicate terms in the field(s) used for autocompletion. For
our first version, I am thinking of eliminating the duplicates outside the
results handler that returns suggestions, since in our system duplicate
suggestions originate only from different document IDs, and we do want the
list of matching document IDs. Is there a better or different way of doing
this?
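
To make the de-duplication idea concrete, here is a minimal sketch in plain
Java of what I have in mind. Everything in it is hypothetical (the Hit class,
the groupByText method, the integer document IDs); it assumes the handler
hands back (suggestion text, document ID) pairs in ranked order, and it simply
collapses repeated texts while keeping every document ID that produced them.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not tied to any Solr handler API.
class SuggestionDedupSketch {

    // One raw result: the suggestion text and the document it came from.
    static final class Hit {
        final String text;
        final int docId;

        Hit(String text, int docId) {
            this.text = text;
            this.docId = docId;
        }
    }

    // Collapse duplicate suggestion texts. LinkedHashMap keeps the order in
    // which each text was first seen, so the original ranking is preserved.
    static Map<String, List<Integer>> groupByText(List<Hit> hits) {
        Map<String, List<Integer>> grouped = new LinkedHashMap<>();
        for (Hit hit : hits) {
            grouped.computeIfAbsent(hit.text, k -> new ArrayList<>()).add(hit.docId);
        }
        return grouped;
    }
}
```

The keys of the returned map are the unique suggestions to display, and each
value is the list of document IDs that matched that suggestion.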

Regards,

Prasanna.
