[ 
https://issues.apache.org/jira/browse/SOLR-13315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yury Pakhomov updated SOLR-13315:
---------------------------------
    Description: 
There is a possible leak of SolrIndexSearcher that prevents unused searchers
from being reclaimed by GC.
The problem was found by analyzing a heap dump that was taken before a Full GC.

1) Where the unused reference to SolrIndexSearcher is stored

Log4j2Watcher extends LogWatcher<LogEvent> and inherits a
CircularList<LogEvent> history field from LogWatcher<E>.

The history can contain a Log4jLogEvent, which can hold a reference to a
ParameterizedMessage, and a ParameterizedMessage keeps references to all
arguments of the log call. As a result, the buffer can keep alive, directly or
indirectly, objects that are no longer in use.
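The retention described in this section can be illustrated with a minimal,
self-contained sketch. Event and History below are hypothetical stand-ins for
the log event and the circular history buffer; they are not the Log4j or Solr
classes. The sketch only shows that a bounded buffer keeps an argument
reachable until the event that carries it is evicted.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-ins; NOT the Log4j or Solr classes.
final class LogBufferRetentionSketch {

    /** Simplified log event that keeps its raw arguments. */
    static final class Event {
        final String pattern;
        final Object[] args;
        Event(String pattern, Object... args) {
            this.pattern = pattern;
            this.args = args;
        }
    }

    /** Bounded history: the oldest event is dropped once capacity is reached. */
    static final class History {
        private final int capacity;
        private final Deque<Event> events = new ArrayDeque<>();

        History(int capacity) {
            if (capacity < 1) throw new IllegalArgumentException("capacity must be >= 1");
            this.capacity = capacity;
        }

        void add(Event e) {
            if (events.size() == capacity) {
                events.removeFirst();
            }
            events.addLast(e);
        }

        /** True if some buffered event still references the object as an argument. */
        boolean retains(Object candidate) {
            for (Event e : events) {
                for (Object arg : e.args) {
                    if (arg == candidate) return true;
                }
            }
            return false;
        }
    }

    public static void main(String[] args) {
        History history = new History(2);
        Object query = new Object(); // stands in for a query that reaches a searcher
        history.add(new Event("slow query: {}", query));
        System.out.println(history.retains(query)); // true: still strongly reachable

        history.add(new Event("other: {}", "a"));
        history.add(new Event("other: {}", "b"));
        System.out.println(history.retains(query)); // false: the event was evicted
    }
}
```

On a quiet node that logs few events, eviction may take a long time, so the
retained object can stay reachable far longer than the request that logged it.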

2) How SolrIndexSearcher can be reached indirectly through this log buffer

If an ExitingReaderException("The request took too long to iterate over terms.
Timeout: " ..) is thrown while a FunctionScoreQuery is executing, the query is
logged at WARN level and a reference to it is stored in Log4j2Watcher.
 (Any other exception that causes the query to be logged to Log4j2Watcher has
the same effect.)

Normally this would be harmless, but in this case the FunctionScoreQuery
indirectly holds a reference to SolrIndexSearcher.

As a result, references to already closed searchers that are no longer in use
are retained.
 Each searcher references its caches (documents, filters, results ...), so they
cannot be reclaimed by GC either.

3) How SolrIndexSearcher can be reached through FunctionScoreQuery

A FunctionScoreQuery can hold a reference to a MultiplicativeBoostValuesSource,
which can hold a reference to a WrappedDoubleValuesSource, which in turn can
hold a reference to SolrIndexSearcher:

public final class FunctionScoreQuery extends Query
{ ... private final DoubleValuesSource source; ... }

private static class MultiplicativeBoostValuesSource extends DoubleValuesSource
{ private final DoubleValuesSource boost; ... }

private static class WrappedDoubleValuesSource extends DoubleValuesSource
{ private final ValueSource in; private IndexSearcher searcher; ... }

More generally, any DoubleValuesSource implementation that stores a reference
to the IndexSearcher passed to

public abstract DoubleValuesSource rewrite(IndexSearcher reader) throws
IOException;

can cause this problem if it is logged via Log4j2Watcher.
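The reference chain in this section can be sketched with simplified classes.
Everything below is hypothetical: only the field layout mirrors the class
fragments quoted above, and the rewrite() bodies are an assumption about how
such sources typically propagate the searcher, not the actual Lucene code.

```java
// Hypothetical, simplified stand-ins. Only the field layout mirrors the
// classes quoted above; none of this is the real Lucene or Solr code.
final class ReferenceChainSketch {

    /** Stands in for SolrIndexSearcher and the caches it owns. */
    static final class Searcher {
        final byte[] caches = new byte[1024];
    }

    interface Source {
        Source rewrite(Searcher searcher);
    }

    /** Like WrappedDoubleValuesSource: remembers the searcher given to rewrite(). */
    static final class WrappedSource implements Source {
        final Searcher searcher; // null until rewrite() is called
        WrappedSource(Searcher searcher) { this.searcher = searcher; }
        @Override public Source rewrite(Searcher s) { return new WrappedSource(s); }
    }

    /** Like MultiplicativeBoostValuesSource: wraps another source. */
    static final class BoostSource implements Source {
        final Source boost;
        BoostSource(Source boost) { this.boost = boost; }
        @Override public Source rewrite(Searcher s) { return new BoostSource(boost.rewrite(s)); }
    }

    /** Like FunctionScoreQuery: holds the outermost source. */
    static final class Query {
        final Source source;
        Query(Source source) { this.source = source; }
        Query rewrite(Searcher s) { return new Query(source.rewrite(s)); }
    }

    /** Walks query -> boost source -> wrapped source -> searcher. */
    static Searcher reachableSearcher(Query query) {
        BoostSource boost = (BoostSource) query.source;
        WrappedSource wrapped = (WrappedSource) boost.boost;
        return wrapped.searcher;
    }

    public static void main(String[] args) {
        Searcher searcher = new Searcher();
        Query original = new Query(new BoostSource(new WrappedSource(null)));
        Query rewritten = original.rewrite(searcher);
        System.out.println(reachableSearcher(original) == null);      // true
        System.out.println(reachableSearcher(rewritten) == searcher); // true
    }
}
```

In this sketch only the rewritten query reaches the searcher, so whoever holds
the rewritten query (for example a buffered log event) also keeps the searcher
and its caches alive.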

4) How to work around this problem temporarily
 Log4j2Watcher can be disabled in solr.xml.
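As a rough illustration of this workaround, the fragment below assumes the
<logging> section of solr.xml as described in the Solr Reference Guide, where
an "enabled" setting controls whether a log watcher is registered. Please
verify the element names against the reference guide for your Solr version;
the rest of solr.xml is omitted here.

```xml
<solr>
  <!-- Assumption: "enabled" = false prevents the log watcher from being registered -->
  <logging>
    <str name="enabled">false</str>
  </logging>
</solr>
```

With the watcher disabled, features that read from its buffer, such as the
logging view served by LoggingHandler, would presumably have no events to show.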

5) How to fix this issue in a more reliable way?
 I think it is very dangerous to buffer references to log message arguments.
 Perhaps Log4j2Watcher should be reworked to avoid buffering these references,
but LoggingHandler depends on Log4j2Watcher.

There may be better ways to solve this issue.
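One possible reading of "avoid buffering refs" is to format each message into a
plain string at the moment it is buffered and to keep only that string. The
sketch below is hypothetical and self-contained: FormattedHistorySketch, Entry,
and the simple "{}" substitution are illustrative names and logic, not the
Log4j2Watcher code and not necessarily how the issue should be fixed.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch; not the Log4j2Watcher implementation or its actual fix.
final class FormattedHistorySketch {

    /** Buffered entry that holds only strings, never the original arguments. */
    static final class Entry {
        final String level;
        final String text;
        Entry(String level, String text) {
            this.level = level;
            this.text = text;
        }
    }

    private final int capacity;
    private final Deque<Entry> entries = new ArrayDeque<>();

    FormattedHistorySketch(int capacity) {
        if (capacity < 1) throw new IllegalArgumentException("capacity must be >= 1");
        this.capacity = capacity;
    }

    /** Replaces each "{}" in the pattern, left to right, with the next argument. */
    static String format(String pattern, Object... args) {
        StringBuilder out = new StringBuilder();
        int from = 0;
        for (Object arg : args) {
            int at = pattern.indexOf("{}", from);
            if (at < 0) break; // more arguments than placeholders
            out.append(pattern, from, at).append(arg);
            from = at + 2;
        }
        return out.append(pattern, from, pattern.length()).toString();
    }

    /** Formats eagerly, so the arguments are not referenced after this call returns. */
    void record(String level, String pattern, Object... args) {
        if (entries.size() == capacity) {
            entries.removeFirst();
        }
        entries.addLast(new Entry(level, format(pattern, args)));
    }

    Entry latest() {
        return entries.peekLast();
    }

    public static void main(String[] args) {
        FormattedHistorySketch history = new FormattedHistorySketch(50);
        history.record("WARN", "slow query: {} after {} ms", "q1", 500);
        System.out.println(history.latest().text); // slow query: q1 after 500 ms
    }
}
```

The trade-off is that every buffered message pays the formatting cost up front,
even if nobody ever reads the history.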

The path to the GC root is attached.
 !path_to_gc_root_from_heap_dump.png|width=811,height=387!


> Possible SolrIndexSearcher leak through LogWatcher and FunctionScoreQuery
> -------------------------------------------------------------------------
>
>                 Key: SOLR-13315
>                 URL: https://issues.apache.org/jira/browse/SOLR-13315
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>    Affects Versions: 7.5
>            Reporter: Yury Pakhomov
>            Priority: Major
>         Attachments: path_to_gc_root_from_heap_dump.png


