You might find this Perl implementation a helpful reference:

https://metacpan.org/pod/LucyX::Suggester
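
For what it's worth, here is a minimal sketch of the two ideas in the quoted thread below, written in plain Python rather than against the Lucy API. The first function builds the prefix tokens Marvin describes (starting at the third letter). The second scans a sorted term list for a prefix and ranks the matches by frequency, which is what Serkan is asking for. The names prefix_tokens and suggest, the min_len and limit parameters, and the sample titles are all made up for illustration; the sorted list only stands in for a term dictionary and says nothing about what Lucy's Lexicon actually returns.

```python
from bisect import bisect_left
from collections import Counter


def prefix_tokens(word, min_len=3):
    """Return the prefixes of `word`, from `min_len` characters up to the full word."""
    return [word[:n] for n in range(min_len, len(word) + 1)]


def suggest(term_freqs, prefix, limit=10):
    """Scan a sorted term list for `prefix`, then rank the matches by frequency."""
    # Hypothetical stand-in for a lexicographically ordered term dictionary.
    terms = sorted(term_freqs)
    # "Seek" to the first term that sorts at or after the prefix.
    start = bisect_left(terms, prefix)
    matches = []
    for term in terms[start:]:
        if not term.startswith(prefix):
            break  # sorted order: no later term can match
        matches.append(term)
    # Most frequent first; ties are broken alphabetically.
    matches.sort(key=lambda t: (-term_freqs[t], t))
    return matches[:limit]


# Made-up sample data: count how often each word occurs across some titles.
titles = ["hello world", "hell freezes over", "hello again"]
term_freqs = Counter(word for title in titles for word in title.split())

print(prefix_tokens("hello"))       # ['hel', 'hell', 'hello']
print(suggest(term_freqs, "hell"))  # ['hello', 'hell']
```

Words shorter than min_len produce no prefix tokens at all, which matches the "(no result)" rows for "h" and "he" in Marvin's example.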

On Wed, May 3, 2017 at 3:06 PM, Serkan Mulayim <[email protected]> wrote:

> Thank you very much Marvin,
>
> When I type hell, I would like to get tokens starting with hell, e.g.
> {"hell","hello","helix"}. I do not want to get documents which contain hell
> token in the title. So it seems like it should be working on the tokens.
>
> What I need is basically to be able to iterate over all tokens which are
> lexicographically ordered. Also I would need to sort them based on their
> frequencies when returning the results. I guess Lexicon class,
> https://lucy.apache.org/docs/c/Lucy/Index/Lexicon.html, is designed for
> this. Can you please confirm? I hope the returned results in the
> lucy_Lex_seek contains the frequency of the terms as well.
>
> Thanks again,
> Serkan
>
> On Tue, May 2, 2017 at 4:22 PM, Marvin Humphrey <[email protected]> wrote:
>
> > On Mon, May 1, 2017 at 3:55 PM, Serkan Mulayim <[email protected]> wrote:
> >
> > > I am using the C library. I would like to get the suggester or
> > > autocomplete functionality in my library. It needs to return
> > > {"hello", "hell", "hellx"} when your query is "hell". I feel like I
> > > need to be able to read all the tokens in the whole index, and return
> > > the results based on it. I looked at the indexReader for this, but I
> > > could not find any useful information. Do you think this is possible?
> >
> > Autosuggestion functionality will need tuning, just like search results.
> > In fact, autosuggestion is really a specialized form of search
> > application. It could be implemented with a separate index or separate
> > fields.
> >
> > Say that we only wanted to offer suggestions derived from the `title`
> > field. Split each title into an array of words. Then for each word,
> > index starting at some letter, say the third. For the title
> > `hello world`, you'd get the following tokens:
> >
> >     hello -> hel hell hello
> >     world -> wor worl world
> >
> > Then at search time, perform a search query with every keystroke.
> >
> >     h   -> (no result)
> >     he  -> (no result)
> >     hel -> "hello world"
> >
> > Once you've got basic functionality running, experiment with minimum
> > token length, adding Soundex/Metaphone, performing character
> > normalization, etc.
> >
> > Marvin Humphrey

-- 
Peter Karman . https://peknet.com/ . https://keybase.io/peterkarman
