I just realized that I can set an index directory when constructing the suggester, for example

Directory indexDir = FSDirectory.open(indexDirPath);

AnalyzingInfixSuggester suggester =
    new AnalyzingInfixSuggester(indexDir, analyzer, analyzer, 3, true);

and that I build the index using an ItemIterator when it does not exist yet, for example

// Build the index only if the index directory is missing or empty
if (!indexDirPath.toFile().isDirectory() || indexDirPath.toFile().list().length == 0) {
    List<Item> entities = new ArrayList<Item>();

    entities.add(new Item("traffic accident", "", asList("public", "a84581a3-302f-4b73-80d9-0e60da5238f9"), 3));
    entities.add(new Item("event", "", asList("public", "a84581a3-302f-4b73-80d9-0e60da5238f9"), 2));
    entities.add(new Item("person", "", asList("public", "a84581a3-302f-4b73-80d9-0e60da5238f9"), 4));
    entities.add(new Item("coverage check", "", asList("a84581a3-302f-4b73-80d9-0e60da5238f9"), 1));
    entities.add(new Item("coverage", "", asList("a84581a3-302f-4b73-80d9-0e60da5238f9"), 1));
    entities.add(new Item("contract search", "", asList("a84581a3-302f-4b73-80d9-0e60da5238f9"), 1));
    entities.add(new Item("claims management system", "", asList("a84581a3-302f-4b73-80d9-0e60da5238f9"), 1));

    suggester.build(new ItemIterator(entities.iterator()));
}
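
The "does the index exist yet" check above can also be written with java.nio only. This is just a minimal sketch: the class name SuggesterIndexCheck, the method name needsBuild and the default directory name "suggester-index" are made up for illustration and are not part of Lucene.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

// Hypothetical helper (not a Lucene class): decides whether the
// suggester index still has to be built.
final class SuggesterIndexCheck {

    private SuggesterIndexCheck() {}

    /** Returns true if indexDirPath is missing, is not a directory, or has no entries. */
    static boolean needsBuild(Path indexDirPath) {
        if (!Files.isDirectory(indexDirPath)) {
            return true;
        }
        // Files.list returns a stream that must be closed, hence try-with-resources.
        try (Stream<Path> entries = Files.list(indexDirPath)) {
            return !entries.findAny().isPresent();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // "suggester-index" is an assumed default, pass a real path as first argument.
        Path indexDirPath = Paths.get(args.length > 0 ? args[0] : "suggester-index");
        System.out.println("needs build: " + needsBuild(indexDirPath));
    }
}
```

With such a helper the condition would read `if (SuggesterIndexCheck.needsBuild(indexDirPath)) { ... }`. Note that it only checks whether the directory has any entries, not whether they form a valid index.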

I was a little confused, because all the implementation examples I found were using an in-memory directory.

My bad, everything good now, thank you :-)

Michael



On 18.11.21 at 09:47, Michael Wechner wrote:
Hi

I recently started to use the Autosuggest/Autocomplete package as suggested by Robert

https://www.mail-archive.com/java-user@lucene.apache.org/msg51403.html

which works very well, thanks again for your help :-)

But it is not clear to me what the best practices are for building a suggester using an InputIterator

https://lucene.apache.org/core/8_10_1/suggest/org/apache/lucene/search/suggest/Lookup.html#build-org.apache.lucene.search.suggest.InputIterator-

regarding

- scalability
- thousands of terms
- thousands of contexts (including personalized contexts)
- updating during runtime (singleton / thread safe)

So far I do something as follows

entities.add(new Item("traffic accident", "", asList("public", "a84581a3-302f-4b73-80d9-0e60da5238f9"), 3));
entities.add(new Item("event", "", asList("public", "a84581a3-302f-4b73-80d9-0e60da5238f9"), 2));
entities.add(new Item("person", "", asList("public", "a84581a3-302f-4b73-80d9-0e60da5238f9"), 4));
entities.add(new Item("coverage check", "", asList("a84581a3-302f-4b73-80d9-0e60da5238f9"), 1));
entities.add(new Item("coverage", "", asList("a84581a3-302f-4b73-80d9-0e60da5238f9"), 1));
entities.add(new Item("contract search", "", asList("a84581a3-302f-4b73-80d9-0e60da5238f9"), 1));
entities.add(new Item("claims management system", "", asList("a84581a3-302f-4b73-80d9-0e60da5238f9"), 1));

suggester.build(new ItemIterator(entities.iterator()));

where the terms associated with the context "public" are intended for all contexts and the terms associated only with the context "a84581a3-302f-4b73-80d9-0e60da5238f9" are meant for a private domain context, in this example an insurance company.
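
To make the intended visibility rule concrete, here is a plain-Java sketch without any Lucene code. It only models the rule described above (a term is visible if it is tagged "public" or tagged with the caller's own context) and says nothing about how the suggester filters contexts internally. The names ContextVisibility, Term, visibleTerms and the second domain id "some-other-domain" are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Plain-Java illustration of the intended rule only, not Lucene's implementation.
final class ContextVisibility {

    static final String PUBLIC = "public";
    static final String INSURANCE = "a84581a3-302f-4b73-80d9-0e60da5238f9";

    /** Hypothetical stand-in for the Item above: text, weight and contexts. */
    static final class Term {
        final String text;
        final long weight;
        final Set<String> contexts;

        Term(String text, long weight, String... contexts) {
            this.text = text;
            this.weight = weight;
            this.contexts = new HashSet<>(Arrays.asList(contexts));
        }
    }

    /** Terms a user of the given private context is supposed to see. */
    static List<String> visibleTerms(List<Term> terms, String privateContext) {
        List<String> visible = new ArrayList<>();
        for (Term term : terms) {
            if (term.contexts.contains(PUBLIC) || term.contexts.contains(privateContext)) {
                visible.add(term.text);
            }
        }
        return visible;
    }

    /** The same seven terms as in the example above. */
    static List<Term> sampleTerms() {
        return Arrays.asList(
            new Term("traffic accident", 3, PUBLIC, INSURANCE),
            new Term("event", 2, PUBLIC, INSURANCE),
            new Term("person", 4, PUBLIC, INSURANCE),
            new Term("coverage check", 1, INSURANCE),
            new Term("coverage", 1, INSURANCE),
            new Term("contract search", 1, INSURANCE),
            new Term("claims management system", 1, INSURANCE));
    }

    public static void main(String[] args) {
        System.out.println(visibleTerms(sampleTerms(), INSURANCE));
        System.out.println(visibleTerms(sampleTerms(), "some-other-domain"));
    }
}
```

A user of the insurance context would get all seven terms, a user of any other context only the three terms tagged "public".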

Let's assume we have thousands of private domain contexts and the terms keep changing continuously, because people upload new documents with new terms into these contexts.

Will the current implementation of building the suggester using InputIterator scale for such a situation?

I actually assumed/expected that the suggester is implemented like an IndexReader/DirectoryReader for searching, which would mean that for each context I could have a separate "SuggesterDirectory", which can be updated during runtime and scales easily.

Or do I misunderstand the current concept of how to build a suggester?

Thanks

Michael
