[ 
https://issues.apache.org/jira/browse/LUCENE-9286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17064832#comment-17064832
 ] 

Dawid Weiss commented on LUCENE-9286:
-------------------------------------

I placed a repro snippet here, including the automaton:
https://github.com/dweiss/lucene9286

Here is an overview of construction and scan times for varying oversizing 
factors (0 to 1, in steps of 0.2).
{code}
[Task]                          [Time]    [%]  [+T₀]
Reading FST                      333ms   1.5%    0ms
FST construction (of=0.0)     3s 127ms  14.4%  353ms
 @ FST RAM: 56,055,936 bytes                        
 @ Oversizing factor: 0.00                          
TermEnum scan (of=0.0)           345ms   1.6%     3s
FST construction (of=0.2)     1s 997ms   9.2%     3s
 @ FST RAM: 56,055,936 bytes                        
 @ Oversizing factor: 0.20                          
TermEnum scan (of=0.2)           296ms   1.4%     5s
FST construction (of=0.4)     1s 914ms   8.8%     6s
 @ FST RAM: 56,055,936 bytes                        
 @ Oversizing factor: 0.40                          
TermEnum scan (of=0.4)           284ms   1.3%     8s
FST construction (of=0.6)     1s 908ms   8.8%     8s
 @ FST RAM: 56,055,936 bytes                        
 @ Oversizing factor: 0.60                          
TermEnum scan (of=0.6)           269ms   1.2%    10s
FST construction (of=0.8)      2s 52ms   9.4%    10s
 @ FST RAM: 56,055,056 bytes                        
 @ Oversizing factor: 0.80                          
TermEnum scan (of=0.8)           273ms   1.3%    12s
FST construction (of=1.0)           5s  24.4%    12s
 @ FST RAM: 54,945,816 bytes                        
 @ Oversizing factor: 1.00                          
TermEnum scan (of=1.0)        3s 670ms  16.8%    18s
{code}
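
For readers who want to reproduce a sweep like the one above, here is a minimal sketch of the outer loop only. It is not the code from the linked repository: the class name `OversizingSweep` and the helpers `factors()` and `timeMillis()` are hypothetical, and the actual FST construction and TermEnum scan are replaced by an empty placeholder because they need Lucene on the classpath. The one design point it illustrates is deriving the factors from an integer counter, so the steps come out as exactly 0.0, 0.2, ..., 1.0 rather than drifting through repeated floating-point addition.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class OversizingSweep {
    /** Returns 0.0, 0.2, ..., 1.0, computed from an integer counter to avoid drift. */
    static List<Double> factors() {
        List<Double> out = new ArrayList<>();
        for (int i = 0; i <= 5; i++) {
            out.add(i / 5.0);
        }
        return out;
    }

    /** Runs the task once and returns the elapsed wall-clock time in milliseconds. */
    static long timeMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        for (double of : factors()) {
            // Placeholder (assumption): a real harness would build the FST with
            // this oversizing factor here, then time a TermEnum scan over it.
            long ms = timeMillis(() -> { });
            System.out.println(String.format(Locale.ROOT,
                "FST construction (of=%.1f) %dms", of, ms));
        }
    }
}
```

A single timed run per factor, as here, is noisy; repeating each step and discarding warm-up runs would give steadier numbers.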

> FST construction explodes memory in BitTable
> --------------------------------------------
>
>                 Key: LUCENE-9286
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9286
>             Project: Lucene - Core
>          Issue Type: Bug
>    Affects Versions: 8.5
>            Reporter: Dawid Weiss
>            Assignee: Dawid Weiss
>            Priority: Major
>         Attachments: screen-[1].png
>
>
> I see a dramatic increase in the amount of memory required for construction 
> of (arguably large) automata. It currently OOMs with 8GB of memory consumed 
> for bit tables. I am pretty sure this didn't require so much memory before 
> (the automaton is ~50MB after construction).
> Something bad happened in between. Thoughts, [~broustant], [~sokolov]?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

