I have the same impression: even though I'm using MMapDirectory, the data
ends up on the heap.

For my use case, that's a huge waste of memory :( 90% of my data could be
properly organised and kept on disk.
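
To illustrate the general difference being discussed, here is a minimal sketch
using only java.nio (it does not touch Lucene, and it is not how
WFSTCompletionLookup loads its data). The class name HeapVsMapped and its
helper methods are hypothetical, made up for this example. Reading a file
through a stream-style API copies the bytes into a heap array, while mapping
it leaves the bytes in OS-managed pages outside the Java heap:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper class, for illustration only.
final class HeapVsMapped {

  /** Creates a temporary file of the given size, filled with zero bytes. */
  static Path writeSample(int size) {
    try {
      Path file = Files.createTempFile("heap-vs-mapped", ".bin");
      file.toFile().deleteOnExit();
      Files.write(file, new byte[size]);
      return file;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Reads the whole file into a byte[]; every byte lives on the Java heap. */
  static ByteBuffer readOnHeap(Path file) {
    try {
      return ByteBuffer.wrap(Files.readAllBytes(file));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Maps the file read-only; the OS pages the data in lazily, outside the heap. */
  static MappedByteBuffer mapOffHeap(Path file) {
    try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
      // The mapping stays valid after the channel is closed.
      return channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    Path file = writeSample(1024);
    ByteBuffer onHeap = readOnHeap(file);
    ByteBuffer mapped = mapOffHeap(file);
    System.out.println("heap read is backed by a byte[]: " + onHeap.hasArray());
    System.out.println("mapped buffer is direct (off heap): " + mapped.isDirect());
  }
}
```

With a mapped buffer the JVM heap stays small regardless of file size, and the
OS decides which pages to keep resident. That is the behaviour I was hoping to
get for the WFST data.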

Thanks for the support

Best regards
Marcos Rebelo

On Tue, 27 Dec 2022, 09:11 Dawid Weiss, <dawid.we...@gmail.com> wrote:

> Looking at the code briefly, I think WFSTCompletionLookup uses an on-heap
> store for the FST. You'd have to load it with an off-heap FST store instead:
>
>
> https://github.com/apache/lucene/blob/1b9d98d6ec079e950bdd37137082f81400d3bc2e/lucene/core/src/java/org/apache/lucene/util/fst/OffHeapFSTStore.java
>
> but I don't think there is an API in WFSTCompletionLookup that would allow
> you to do that.
>
> D.
>
> On Fri, Dec 23, 2022 at 5:00 PM marcos rebelo <ole...@gmail.com> wrote:
>
> > Hey all!
> >
> > I'm loading multiple WFSTs with ~1.1 GB and the JVM memory increases
> > proportionally. It looks like the file is stored in memory, meaning it is
> > not using memory-mapped files at all.
> >
> > Example code:
> >
> > In the following code we set up Lucene to use /tmp/deleteme2 for the
> > memory-mapped file and load the file from /tmp/deleteme/file.wfst via an
> > InputStream.
> >
> > After loading the file I list the files in /tmp/deleteme2 and nothing is
> > found, but I'm still able to query the WFST.
> >
> >   @Test
> >   @SneakyThrows
> >   void WFSTLoad() throws IOException {
> >     Path wfstPath = Paths.get("/tmp/deleteme2");
> >     Path wfstFilePath = Paths.get("/tmp/deleteme/file.wfst");
> >
> >     var directory = new MMapDirectory(wfstPath);
> >
> >     WFSTCompletionLookup wfst =
> >             new WFSTCompletionLookup(directory, "temp");
> >
> >     try (var is = new FileInputStream(wfstFilePath.toFile())) {
> >       wfst.load(is);
> >       System.out.println("FILE LOADED");
> >     }
> >
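> >     // Files.list keeps a directory handle open, so close the stream.
> >     try (var files = Files.list(wfstPath)) {
> >       files.forEach(System.out::println);
> >     }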
> >     System.out.println("FILES LISTED");
> >
> >     assertThat(wfst.get("qwert123qwert")).isEqualTo(123);
> >   }
> >
> > What am I doing wrong?
> >
> > Thanks for the support
> >
> > Best Regards
> > Marcos Rebelo
> >
> > --
> >
> > *Marcos Bruno Gomes Rebelo Engineering Manager / Data Scientist /
> > Software Engineer*
> > Linkedin: https://www.linkedin.com/in/oleber/
> > *Adding value to your data. Specialized in Search and Recommendation
> > Systems*
> > Technologies: Elastic, Spark, Scala, Jupyter Notebook, Python, ...
> >
>
