[ https://issues.apache.org/jira/browse/LUCENE-9286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17074637#comment-17074637 ]
Dawid Weiss commented on LUCENE-9286:
-------------------------------------

Hi Bruno. Thanks for investigating, and I'm sorry I'm of so little help here -- I haven't been able to dig into the FST code in a while. I can devise a workaround that dodges the copyOf problem in our code, so this isn't an urgent issue, but the general remark holds: arcs have changed from being lightweight to potentially heavy.

I think my ideal state for the FST code would be a builder followed by immutable data structures you could cheaply throw around (node, arc pointer, iterator with cloneable state). This would be much easier to reason about and work with than what we have now -- the current traversal API is kind of awkward. :) It is a long-term goal and one that isn't going to be easy (kudos for being able to understand and refactor that code!).

> FST construction explodes memory in BitTable
> --------------------------------------------
>
>                 Key: LUCENE-9286
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9286
>             Project: Lucene - Core
>          Issue Type: Bug
>    Affects Versions: 8.5
>            Reporter: Dawid Weiss
>            Assignee: Bruno Roustant
>            Priority: Major
>         Attachments: screen-[1].png
>
>
> I see a dramatic increase in the amount of memory required for construction
> of (arguably large) automata. It currently OOMs with 8GB of memory consumed
> for bit tables. I am pretty sure this didn't require so much memory before
> (the automaton is ~50MB after construction).
> Something bad happened in between. Thoughts, [~broustant], [~sokolov]?
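
To make the "builder followed by immutable data structures" idea from the comment a bit more concrete, here is a minimal sketch in Java. It is not Lucene's FST API: every name in it (`ImmutableFstSketch`, `Builder`, `Fst`, `Cursor`, `findArc`, `accepts`) is hypothetical. It builds a plain, unminimized trie with no outputs, and only illustrates the shape being proposed: a mutable builder, then flat read-only arrays in which nodes and arc pointers are plain `int` values, plus a cursor whose state can be copied cheaply.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch only; none of these types exist in Lucene.
final class ImmutableFstSketch {

  /** Mutable side: collects words into a trie, then freezes it. */
  static final class Builder {
    private static final class Node {
      final TreeMap<Character, Node> children = new TreeMap<>();
      boolean accept;
      int id;
    }

    private final Node root = new Node();

    Builder add(String word) {
      Node n = root;
      for (char c : word.toCharArray()) {
        n = n.children.computeIfAbsent(c, k -> new Node());
      }
      n.accept = true;
      return this;
    }

    /** Flattens the trie into arrays; the result is never mutated afterwards. */
    Fst build() {
      List<Node> order = new ArrayList<>();
      order.add(root); // root gets id 0
      for (int i = 0; i < order.size(); i++) { // breadth-first numbering
        for (Node child : order.get(i).children.values()) {
          child.id = order.size();
          order.add(child);
        }
      }
      int nodeCount = order.size();
      int[] firstArc = new int[nodeCount + 1];
      char[] labels = new char[nodeCount - 1]; // a tree has one arc per non-root node
      int[] targets = new int[nodeCount - 1];
      boolean[] accept = new boolean[nodeCount];
      int arc = 0;
      for (int i = 0; i < nodeCount; i++) {
        Node n = order.get(i);
        firstArc[i] = arc;
        accept[i] = n.accept;
        for (Map.Entry<Character, Node> e : n.children.entrySet()) {
          labels[arc] = e.getKey();
          targets[arc] = e.getValue().id;
          arc++;
        }
      }
      firstArc[nodeCount] = arc;
      return new Fst(firstArc, labels, targets, accept);
    }
  }

  /** Immutable side: nodes and arc pointers are plain ints into shared arrays. */
  static final class Fst {
    private final int[] firstArc;
    private final char[] labels;
    private final int[] targets;
    private final boolean[] accept;

    private Fst(int[] firstArc, char[] labels, int[] targets, boolean[] accept) {
      this.firstArc = firstArc;
      this.labels = labels;
      this.targets = targets;
      this.accept = accept;
    }

    /** Returns an arc pointer for (node, label), or -1 if there is no such arc. */
    int findArc(int node, char label) {
      for (int a = firstArc[node]; a < firstArc[node + 1]; a++) {
        if (labels[a] == label) return a;
      }
      return -1;
    }

    int target(int arc) { return targets[arc]; }
    boolean isAccept(int node) { return accept[node]; }
    Cursor cursor() { return new Cursor(this, 0); }
  }

  /** Traversal state is a single int, so copying a cursor is cheap. */
  static final class Cursor {
    private final Fst fst;
    private int node;

    private Cursor(Fst fst, int node) { this.fst = fst; this.node = node; }

    boolean advance(char label) {
      int arc = fst.findArc(node, label);
      if (arc < 0) return false;
      node = fst.target(arc);
      return true;
    }

    boolean isAccept() { return fst.isAccept(node); }
    Cursor copy() { return new Cursor(fst, node); }
  }

  static boolean accepts(Fst fst, String word) {
    Cursor c = fst.cursor();
    for (char ch : word.toCharArray()) {
      if (!c.advance(ch)) return false;
    }
    return c.isAccept();
  }

  public static void main(String[] args) {
    Fst fst = new Builder().add("car").add("cat").add("do").build();
    System.out.println(accepts(fst, "cat"));
    System.out.println(accepts(fst, "ca"));
  }
}
```

Because the frozen arrays are never written after `build()`, an `Fst` can be shared freely, and a `Cursor` copy costs one small object regardless of automaton size. A real implementation would also need suffix sharing, outputs, and a compact byte encoding, all of which this sketch leaves out.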