[ 
https://issues.apache.org/jira/browse/SOLR-14428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colvin Cowie updated SOLR-14428:
--------------------------------
    Description: 
I sent this to the mailing list

I'm moving from 8.3.1 to 8.5.1 and started getting out-of-memory errors while
running our normal tests. After profiling, it was clear that the majority of the
heap was allocated through FuzzyQuery.
LUCENE-9068 moved construction of the automata from the FuzzyTermsEnum to the
FuzzyQuery's constructor.

I created a small test ( [^FuzzyHammer.java] ) that fires off fuzzy queries
built from random UUID strings for 5 minutes:

{code}
FIELD_NAME + ":" + UUID.randomUUID().toString().replace("-", "") + "~2"
{code}
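
A minimal sketch of how such a generator could look, without any Solr client involved. This is not the attached FuzzyHammer.java; the class name, the method names and the sink callback are hypothetical, and a real run would hand each query string to a Solr client and use a 5 minute budget.

```java
import java.util.UUID;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical stand-in, NOT the attached FuzzyHammer.java.
final class FuzzyQueryStrings {

    // Builds one query string such as field_s:e41848af85d24ac197c71db6888e17bc~2
    static String randomFuzzyQuery(String fieldName) {
        String term = UUID.randomUUID().toString().replace("-", "");
        return fieldName + ":" + term + "~2";
    }

    // Passes fresh query strings to the sink until the time budget is spent
    // and returns how many were generated.
    static long hammer(String fieldName, long duration, TimeUnit unit,
                       Consumer<String> sink) {
        long deadline = System.nanoTime() + unit.toNanos(duration);
        long sent = 0;
        while (System.nanoTime() - deadline < 0) {
            sink.accept(randomFuzzyQuery(fieldName));
            sent++;
        }
        return sent;
    }

    public static void main(String[] args) {
        // The sink here discards the queries; an assumption for illustration only.
        long sent = hammer("field_s", 50, TimeUnit.MILLISECONDS, q -> { });
        System.out.println(randomFuzzyQuery("field_s"));
        System.out.println("generated " + sent + " queries");
    }
}
```

Because every term is a fresh UUID, no two queries are equal, so each one would occupy its own cache entry.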

When running against a vanilla Solr 8.3.1 and 8.4.1 there is no problem, while
the memory usage has increased drastically on 8.5.0 and 8.5.1.

Comparison of heap usage while running the attached test against Solr 8.3.1 and 
8.5.1 with a single (empty) shard and 4GB heap:
!image-2020-04-23-09-18-06-070.png! 
And with 4 shards on 8.4.1 and 8.5.0:
 !screenshot-2.png! 

I'm guessing that the memory is being retained because the FuzzyQuery objects,
which now hold the automata, are referenced from the cache, whereas the
FuzzyTermsEnum would not have been.
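
To illustrate that guess, here is a toy model in plain Java. It does not use Lucene or Solr classes; EagerQuery, LazyQuery and the byte array standing in for the automata are all hypothetical. It only shows that when a cache key builds its heavy state in the constructor, the cache retains that state for every entry.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Toy model only: none of these types are Lucene or Solr classes.
final class CacheRetentionSketch {

    interface Sized {
        long retainedBytes();
    }

    // Builds its expensive state in the constructor, so a cached key carries it.
    static final class EagerQuery implements Sized {
        final String term;
        final byte[] precomputed; // stands in for automata built up front

        EagerQuery(String term, int stateBytes) {
            this.term = term;
            this.precomputed = new byte[stateBytes];
        }

        public long retainedBytes() {
            return term.length() + precomputed.length;
        }
    }

    // Keeps only the term; heavy state would be built per search and discarded.
    static final class LazyQuery implements Sized {
        final String term;

        LazyQuery(String term) {
            this.term = term;
        }

        public long retainedBytes() {
            return term.length();
        }
    }

    // Fills a cache keyed by query objects and sums what the keys retain.
    static long cacheFootprint(int entries, Function<String, Sized> factory) {
        Map<Sized, Integer> cache = new LinkedHashMap<>();
        for (int i = 0; i < entries; i++) {
            cache.put(factory.apply("term" + i), i);
        }
        long total = 0;
        for (Sized key : cache.keySet()) {
            total += key.retainedBytes();
        }
        return total;
    }

    public static void main(String[] args) {
        long eager = cacheFootprint(10, t -> new EagerQuery(t, 1000));
        long lazy = cacheFootprint(10, t -> new LazyQuery(t));
        System.out.println("eager keys retain " + eager + " bytes");
        System.out.println("lazy keys retain " + lazy + " bytes");
    }
}
```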

Query Result Cache on 8.5.1:
 !screenshot-3.png! 
~316 MB in the cache

Query Result Cache on 8.3.1:
 !screenshot-4.png! 
<1 MB

With an empty cache, running this query
_field_s:e41848af85d24ac197c71db6888e17bc~2_ results in the following memory
allocation:

{noformat}
8.3.1: CACHE.searcher.queryResultCache.ramBytesUsed:        1520
8.5.1: CACHE.searcher.queryResultCache.ramBytesUsed:    648855
{noformat}





> FuzzyQuery has severe memory usage in 8.5
> -----------------------------------------
>
>                 Key: SOLR-14428
>                 URL: https://issues.apache.org/jira/browse/SOLR-14428
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>    Affects Versions: 8.5, 8.5.1
>            Reporter: Colvin Cowie
>            Priority: Major
>         Attachments: FuzzyHammer.java, image-2020-04-23-09-18-06-070.png, 
> screenshot-2.png, screenshot-3.png, screenshot-4.png
>
>
> I sent this to the mailing list
> I'm moving from 8.3.1 to 8.5.1 and started getting out-of-memory errors
> while running our normal tests. After profiling, it was clear that the
> majority of the heap was allocated through FuzzyQuery.
> LUCENE-9068 moved construction of the automata from the FuzzyTermsEnum to the 
> FuzzyQuery's constructor.
> I created a small test ( [^FuzzyHammer.java] ) that fires off fuzzy queries
> built from random UUID strings for 5 minutes:
> {code}
> FIELD_NAME + ":" + UUID.randomUUID().toString().replace("-", "") + "~2"
> {code}
> When running against a vanilla Solr 8.3.1 and 8.4.1 there is no problem, while
> the memory usage has increased drastically on 8.5.0 and 8.5.1.
> Comparison of heap usage while running the attached test against Solr 8.3.1 
> and 8.5.1 with a single (empty) shard and 4GB heap:
> !image-2020-04-23-09-18-06-070.png! 
> And with 4 shards on 8.4.1 and 8.5.0:
>  !screenshot-2.png! 
> I'm guessing that the memory is being retained because the FuzzyQuery objects,
> which now hold the automata, are referenced from the cache, whereas the
> FuzzyTermsEnum would not have been.
> Query Result Cache on 8.5.1:
>  !screenshot-3.png! 
> ~316 MB in the cache
> Query Result Cache on 8.3.1:
>  !screenshot-4.png! 
> <1 MB
> With an empty cache, running this query
> _field_s:e41848af85d24ac197c71db6888e17bc~2_ results in the following memory
> allocation:
> {noformat}
> 8.3.1: CACHE.searcher.queryResultCache.ramBytesUsed:        1520
> 8.5.1: CACHE.searcher.queryResultCache.ramBytesUsed:    648855
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

