I just realized that I can set an index directory when constructing the
suggester, for example:
Directory indexDir = FSDirectory.open(indexDirPath);
AnalyzingInfixSuggester suggester =
    new AnalyzingInfixSuggester(indexDir, analyzer, analyzer, 3, true);
and that I can build the index using an ItemIterator only when it does not
exist yet, for example:
// Build the index only if the index directory is missing or empty
if (!indexDirPath.toFile().isDirectory()
        || indexDirPath.toFile().list().length == 0) {
    List<Item> entities = new ArrayList<Item>();
    entities.add(new Item("traffic accident", "",
        asList("public", "a84581a3-302f-4b73-80d9-0e60da5238f9"), 3));
    entities.add(new Item("event", "",
        asList("public", "a84581a3-302f-4b73-80d9-0e60da5238f9"), 2));
    entities.add(new Item("person", "",
        asList("public", "a84581a3-302f-4b73-80d9-0e60da5238f9"), 4));
    entities.add(new Item("coverage check", "",
        asList("a84581a3-302f-4b73-80d9-0e60da5238f9"), 1));
    entities.add(new Item("coverage", "",
        asList("a84581a3-302f-4b73-80d9-0e60da5238f9"), 1));
    entities.add(new Item("contract search", "",
        asList("a84581a3-302f-4b73-80d9-0e60da5238f9"), 1));
    entities.add(new Item("claims management system", "",
        asList("a84581a3-302f-4b73-80d9-0e60da5238f9"), 1));
    suggester.build(new ItemIterator(entities.iterator()));
}
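The "does the index exist yet" check above can also be written with java.nio
instead of java.io.File. The following is only a minimal sketch: the class
name IndexDirCheck and the method name needsBuild are made up for this
example, and it assumes that "index exists" simply means "the directory
exists and contains at least one entry", which is the same heuristic as in
the snippet above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Hypothetical helper, not part of Lucene.
final class IndexDirCheck {
    private IndexDirCheck() {}

    /** Returns true if indexDirPath is missing, not a directory, or empty. */
    static boolean needsBuild(Path indexDirPath) {
        if (!Files.isDirectory(indexDirPath)) {
            return true;
        }
        // Files.list returns a stream that must be closed to release the
        // underlying directory handle, hence try-with-resources.
        try (Stream<Path> entries = Files.list(indexDirPath)) {
            return !entries.findAny().isPresent();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this helper the condition becomes if (IndexDirCheck.needsBuild(indexDirPath)) { ... }.
Unlike File.list(), which returns null on an I/O error, this variant reports
such errors as an exception.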
I was a little confused, because all the implementation examples I found
were using an in-memory directory.
My bad, everything good now, thank you :-)
Michael
On 18.11.21 at 09:47, Michael Wechner wrote:
Hi
I recently started to use the Autosuggest/Autocomplete package as
suggested by Robert
https://www.mail-archive.com/java-user@lucene.apache.org/msg51403.html
which works very well, thanks again for your help :-)
But it is not clear to me what the best practices are for building a
suggester using an InputIterator
https://lucene.apache.org/core/8_10_1/suggest/org/apache/lucene/search/suggest/Lookup.html#build-org.apache.lucene.search.suggest.InputIterator-
regarding
- scalability
  - thousands of terms
  - thousands of contexts (including personalized contexts)
- updating during runtime (singleton / thread safe)
So far I do something like the following:
entities.add(new Item("traffic accident", "",
    asList("public", "a84581a3-302f-4b73-80d9-0e60da5238f9"), 3));
entities.add(new Item("event", "",
    asList("public", "a84581a3-302f-4b73-80d9-0e60da5238f9"), 2));
entities.add(new Item("person", "",
    asList("public", "a84581a3-302f-4b73-80d9-0e60da5238f9"), 4));
entities.add(new Item("coverage check", "",
    asList("a84581a3-302f-4b73-80d9-0e60da5238f9"), 1));
entities.add(new Item("coverage", "",
    asList("a84581a3-302f-4b73-80d9-0e60da5238f9"), 1));
entities.add(new Item("contract search", "",
    asList("a84581a3-302f-4b73-80d9-0e60da5238f9"), 1));
entities.add(new Item("claims management system", "",
    asList("a84581a3-302f-4b73-80d9-0e60da5238f9"), 1));
suggester.build(new ItemIterator(entities.iterator()));
where the terms associated with the context "public" are intended
for all contexts and the terms associated with the context
"a84581a3-302f-4b73-80d9-0e60da5238f9" are only for a private domain
context, in this example an insurance company.
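To make the context tagging concrete, here is a rough sketch of what such an
Item holder might look like. It is a guess at the shape implied by the
constructor calls above (text, payload, contexts, weight), not the actual
class and not a Lucene type. The helpers textsVisibleIn and exampleItems and
the constant PRIVATE_CONTEXT are invented for illustration. The sketch only
filters a plain list by context tag; it does not model how the suggester
itself filters by context.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical reconstruction of the Item holder; not a Lucene class.
final class Item {
    static final String PRIVATE_CONTEXT = "a84581a3-302f-4b73-80d9-0e60da5238f9";

    final String text;          // the suggestion text
    final String payload;       // extra data, empty in the example
    final Set<String> contexts; // context tags attached to the suggestion
    final long weight;          // ranking weight

    Item(String text, String payload, List<String> contexts, long weight) {
        this.text = text;
        this.payload = payload;
        this.contexts = Collections.unmodifiableSet(new LinkedHashSet<String>(contexts));
        this.weight = weight;
    }

    /** Returns the texts of all items tagged with the given context, in input order. */
    static List<String> textsVisibleIn(List<Item> items, String context) {
        List<String> result = new ArrayList<String>();
        for (Item item : items) {
            if (item.contexts.contains(context)) {
                result.add(item.text);
            }
        }
        return result;
    }

    /** The seven example entries from the message above. */
    static List<Item> exampleItems() {
        List<String> both = Arrays.asList("public", PRIVATE_CONTEXT);
        List<String> privateOnly = Arrays.asList(PRIVATE_CONTEXT);
        List<Item> items = new ArrayList<Item>();
        items.add(new Item("traffic accident", "", both, 3));
        items.add(new Item("event", "", both, 2));
        items.add(new Item("person", "", both, 4));
        items.add(new Item("coverage check", "", privateOnly, 1));
        items.add(new Item("coverage", "", privateOnly, 1));
        items.add(new Item("contract search", "", privateOnly, 1));
        items.add(new Item("claims management system", "", privateOnly, 1));
        return items;
    }
}
```

In this data, three entries carry the "public" tag, while all seven carry the
private context ID, so a lookup restricted to the private context would see
both the shared and the company-specific terms.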
Let's assume we have thousands of private domain contexts and the
terms keep changing continuously, because people upload new documents
with new terms into these contexts.
Will the current implementation of building the suggester using
InputIterator scale for such a situation?
I assumed/expected actually that the suggester is implemented like an
IndexReader/DirectoryReader for searching, which means for each
context I could have a separate "SuggesterDirectory", which can be
updated during runtime and scales easily.
Or do I misunderstand the current concept of how to build a suggester?
Thanks
Michael