Looking at the code briefly, I think WFSTCompletionLookup uses an on-heap
store for the FST. You'd have to load it with an off-heap FST store instead:

https://github.com/apache/lucene/blob/1b9d98d6ec079e950bdd37137082f81400d3bc2e/lucene/core/src/java/org/apache/lucene/util/fst/OffHeapFSTStore.java

but I don't think there is an API in WFSTCompletionLookup that would allow
you to do that.
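
To illustrate the on-heap vs. off-heap distinction only, here is a minimal
sketch in plain JDK code. It is not Lucene's FST loading, and the class and
method names (HeapVsMapped, loadOnHeap, mapOffHeap) are made up for the
example. The first method copies the file into a byte[] on the Java heap; the
second maps it, so the bytes live outside the heap and are paged in by the OS.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration class; nothing here is part of Lucene.
public class HeapVsMapped {

  /** On-heap: the whole file is copied into a byte[] that counts against -Xmx. */
  public static byte[] loadOnHeap(Path file) throws IOException {
    return Files.readAllBytes(file);
  }

  /** Off-heap: the file is mapped read-only into virtual memory, outside the Java heap. */
  public static MappedByteBuffer mapOffHeap(Path file) throws IOException {
    try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
      // Once established, the mapping does not depend on the channel staying open.
      return ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
    }
  }

  public static void main(String[] args) throws IOException {
    Path file = Files.createTempFile("demo", ".bin");
    file.toFile().deleteOnExit();
    Files.write(file, new byte[] {1, 2, 3, 4});

    byte[] heap = loadOnHeap(file);
    MappedByteBuffer mapped = mapOffHeap(file);

    System.out.println("on-heap bytes: " + heap.length);
    System.out.println("mapped bytes: " + mapped.capacity() + ", direct=" + mapped.isDirect());
  }
}
```

With the first approach, heap usage grows with the file size, which matches
what you are seeing; with the second, it should not.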

D.

On Fri, Dec 23, 2022 at 5:00 PM marcos rebelo <ole...@gmail.com> wrote:

> Hey all!
>
> I'm loading multiple WFSTs with ~1.1 GB and the JVM memory increases
> proportionally. It looks like the file is held in memory, meaning
> memory-mapped files are not being used at all.
>
> Example code:
>
> In the following code we set up Lucene to use /tmp/deleteme2 for the
> memory-mapped files and load the file from /tmp/deleteme/file.wfst via an
> InputStream.
>
> After loading the file, I list the files in /tmp/deleteme2 and nothing is
> found, but I'm able to query the WFST.
>
>   @Test
>   @SneakyThrows
>   void WFSTLoad() throws IOException {
>     Path wfstPath = Paths.get("/tmp/deleteme2");
>     Path wfstFilePath = Paths.get("/tmp/deleteme/file.wfst");
>
>     var directory = new MMapDirectory(wfstPath);
>
>     WFSTCompletionLookup wfst =
>             new WFSTCompletionLookup(directory, "temp");
>
>     try (var is = new FileInputStream(wfstFilePath.toFile())) {
>       wfst.load(is);
>       System.out.println("FILE LOADED");
>     }
>
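>     // Files.list returns a stream that holds an open directory handle.
>     try (var files = Files.list(wfstPath)) {
>       files.forEach(System.out::println);
>     }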
>     System.out.println("FILES LISTED");
>
>     assertThat(wfst.get("qwert123qwert")).isEqualTo(123);
>   }
>
> What am I doing wrong?
>
> Thanks for the support
>
> Best Regards
> Marcos Rebelo
>
> --
>
> *Marcos Bruno Gomes Rebelo Engineering Manager / Data Scientist / Software
> Engineer*
> Linkedin: https://www.linkedin.com/in/oleber/
> *Adding value to your data. Specialized in Search and Recommendation
> Systems*
> Technologies: Elastic, Spark, Scala, Jupyter Notebook, Python, ...
>
