[jira] [Commented] (SOLR-12743) Memory leak introduced in Solr 7.3.0
[ https://issues.apache.org/jira/browse/SOLR-12743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16792519#comment-16792519 ]

Alan Woodward commented on SOLR-12743:
--------------------------------------

This may be related to LUCENE-8726 / SOLR-13315

> Memory leak introduced in Solr 7.3.0
> ------------------------------------
>
>                 Key: SOLR-12743
>                 URL: https://issues.apache.org/jira/browse/SOLR-12743
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public)
>    Affects Versions: 7.3, 7.3.1, 7.4
>            Reporter: Tomás Fernández Löbbe
>            Priority: Critical
>         Attachments: SOLR-12743.patch
>
>
> Reported initially by [~markus17]([1], [2]), but other users have had the
> same issue [3]. Some of the key parts:
> {noformat}
> Some facts:
> * problem started after upgrading from 7.2.1 to 7.3.0;
> * it occurs only in our main text search collection, all other collections
> are unaffected;
> * despite what i said earlier, it is so far unreproducible outside
> production, even when mimicking production as good as we can;
> * SortedIntDocSet instances and ConcurrentLRUCache$CacheEntry instances are
> both leaked on commit;
> * filterCache is enabled using FastLRUCache;
> * filter queries are simple field:value using strings, and three filter query
> for time range using [NOW/DAY TO NOW+1DAY/DAY] syntax for 'today', 'last
> week' and 'last month', but rarely used;
> * reloading the core manually frees OldGen;
> * custom URP's don't cause the problem, disabling them doesn't solve it;
> * the collection uses custom extensions for QueryComponent and
> QueryElevationComponent, ExtendedDismaxQParser and MoreLikeThisQParser, a
> whole bunch of TokenFilters, and several DocTransformers and due it being
> only reproducible on production, i really cannot switch these back to
> Solr/Lucene versions;
> * useFilterForSortedQuery is/was not defined in schema so it was default
> (true?), SOLR-11769 could be the culprit, i disabled it just now only for the
> node running 7.4.0, rest of collection runs 7.2.1;
> {noformat}
> {noformat}
> You were right, it was leaking exactly one SolrIndexSearcher instance on each
> commit.
> {noformat}
> And from Björn Häuser ([3]):
> {noformat}
> Problem Suspect 1
> 91 instances of "org.apache.solr.search.SolrIndexSearcher", loaded by
> "org.eclipse.jetty.webapp.WebAppClassLoader @ 0x6807d1048" occupy
> 1.981.148.336 (38,26%) bytes.
> Biggest instances:
> • org.apache.solr.search.SolrIndexSearcher @ 0x6ffd47ea8 - 70.087.272
> (1,35%) bytes.
> • org.apache.solr.search.SolrIndexSearcher @ 0x79ea9c040 - 65.678.264
> (1,27%) bytes.
> • org.apache.solr.search.SolrIndexSearcher @ 0x6855ad680 - 63.050.600
> (1,22%) bytes.
> Problem Suspect 2
> 223 instances of "org.apache.solr.util.ConcurrentLRUCache", loaded by
> "org.eclipse.jetty.webapp.WebAppClassLoader @ 0x6807d1048" occupy
> 1.373.110.208 (26,52%) bytes.
> {noformat}
> More details in the email threads.
> [1]
> [http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201804.mbox/%3Czarafa.5ae201c6.2f85.218a781d795b07b1%40mail1.ams.nl.openindex.io%3E]
> [2]
> [http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201806.mbox/%3Czarafa.5b351537.7b8c.647ddc93059f68eb%40mail1.ams.nl.openindex.io%3E]
> [3]
> [http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201809.mbox/%3c7b5e78c6-8cf6-42ee-8d28-872230ded...@gmail.com%3E]

--
This message was sent by Atlassian JIRA (v7.6.3#76005)

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-12743) Memory leak introduced in Solr 7.3.0
[ https://issues.apache.org/jira/browse/SOLR-12743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16761841#comment-16761841 ]

Markus Jelsma commented on SOLR-12743:
--------------------------------------

Thanks [~dsmiley]!
[jira] [Commented] (SOLR-12743) Memory leak introduced in Solr 7.3.0
[ https://issues.apache.org/jira/browse/SOLR-12743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16761321#comment-16761321 ]

David Smiley commented on SOLR-12743:
-------------------------------------

Markus: configure Java to dump the heap on OOM and then analyze it afterwards (e.g. with Eclipse Memory Analyzer, MAT: https://www.eclipse.org/mat/ ). It takes time; give yourself at least a day if you are new to MAT. It's a powerful tool; I've used it to find an issue in Solr.
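For anyone following the heap-dump suggestion above: on HotSpot JVMs the relevant startup flags are -XX:+HeapDumpOnOutOfMemoryError and -XX:HeapDumpPath=<dir>. How they get passed to Solr depends on the installation, so that part is left out here. The sketch below only checks, from inside a JVM, whether the flags are actually in effect. It assumes a HotSpot/OpenJDK runtime where com.sun.management.HotSpotDiagnosticMXBean is available; the class name HeapDumpCheck is made up for illustration.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

public class HeapDumpCheck {
    /** Returns the current value of a HotSpot VM option as a string. */
    public static String vmOption(String name) {
        // Assumption: running on a HotSpot-based JVM that exposes this MXBean.
        HotSpotDiagnosticMXBean bean =
            ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return bean.getVMOption(name).getValue();
    }

    public static void main(String[] args) {
        // "true" means the JVM will write an .hprof file when it hits an OOM.
        System.out.println("HeapDumpOnOutOfMemoryError="
            + vmOption("HeapDumpOnOutOfMemoryError"));
        // Empty usually means the dump goes to the working directory.
        System.out.println("HeapDumpPath=" + vmOption("HeapDumpPath"));
    }
}
```

The resulting .hprof file is what gets opened in MAT.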
[jira] [Commented] (SOLR-12743) Memory leak introduced in Solr 7.3.0
[ https://issues.apache.org/jira/browse/SOLR-12743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16760968#comment-16760968 ]

Markus Jelsma commented on SOLR-12743:
--------------------------------------

Bad news. After two nodes had been running fine on 7.6 with LFUCache for just over 24 hours, both went nuts (41M Term instances, 2M PhraseQuery instances, etc.) and ran OOM at about the same time, while just a few hundred documents were being indexed.

It doesn't appear to be caused by LFUCache: we had two other 7.2.1 nodes also on LFUCache, and they are still running fine. So it seems that besides this issue we might have an even worse problem, one that I can reproduce neither locally nor consistently on production. Yesterday it happened immediately after start up, now after 24 hours. Reindexing the same Nutch segment that was being indexed when things went bad doesn't trigger a new OOM. The heap filled up fast; the nodes died within minutes, just as yesterday. There is nothing in the logs.

Is this something anyone else has had?

Thanks,
Markus
[jira] [Commented] (SOLR-12743) Memory leak introduced in Solr 7.3.0
[ https://issues.apache.org/jira/browse/SOLR-12743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16759790#comment-16759790 ]

Markus Jelsma commented on SOLR-12743:
--------------------------------------

Hello all,

Because I can only reproduce it on production, I have a limited number of tries per day; it takes over an hour to test a minor change, and more when I need to revert. Here are some new notes:

* it doesn't "appear" to be caused by the metrics part: I took out everything inside initializeMetrics(), but the leak persisted;
* I swapped FastLRU for LFU cache, otherwise same settings, and the node ran OOM within minutes, even before the commit got issued;
* no idea what happened, but because Solr can run OOM for no clear reason, I restarted and tried again, and this time the otherwise leaking reference is collected as it should be!

So I finally see a "stable" 7.6 with LFUCache instead of FastLRUCache. To be clear, FastLRU does work without leaking, but only with a zero autoWarmCount.

I have no idea what is going on with the warming. The warming code is almost identical, and I can't see how a SolrIndexSearcher instance would leak with FastLRU but not with LFU. The CacheRegenerator is not leaking the reference, nor does the calling code in SolrCore seem to be the problem.

I'll keep this single node on 7.6 for now and keep an eye on it.

Thanks!
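For readers unfamiliar with what autowarming does, here is a toy sketch. It is not Solr's FastLRUCache or LFUCache code, and it does not explain the leak discussed above. It only shows the general idea: when a new searcher is opened, up to autowarmCount of the most recently used keys from the old cache are regenerated for the new one, and with a count of zero nothing is carried over. The class name, method name, and the regenerate function are all hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class AutowarmSketch {
    /**
     * Builds a fresh cache from the most recently used keys of the old one.
     * Assumes 'old' is an access-ordered LinkedHashMap, so its key order runs
     * from least to most recently used.
     */
    public static <K, V> Map<K, V> warm(LinkedHashMap<K, V> old,
                                        int autowarmCount,
                                        Function<K, V> regenerate) {
        Map<K, V> fresh = new LinkedHashMap<>(16, 0.75f, true);
        List<K> keys = new ArrayList<>(old.keySet());
        int from = Math.max(0, keys.size() - autowarmCount);
        for (K key : keys.subList(from, keys.size())) {
            // The value is recomputed for the new searcher, never copied:
            // an old value would describe the old index view.
            fresh.put(key, regenerate.apply(key));
        }
        return fresh;
    }

    public static void main(String[] args) {
        LinkedHashMap<String, String> old = new LinkedHashMap<>(16, 0.75f, true);
        old.put("type:a", "docs@old");
        old.put("type:b", "docs@old");
        old.put("type:c", "docs@old");
        old.get("type:a"); // touch: "type:a" becomes the most recently used key
        System.out.println(warm(old, 2, k -> "docs@new").keySet());
        System.out.println(warm(old, 0, k -> "docs@new").keySet());
    }
}
```

In a design like this, the regeneration step is the only place where old-cache state and the new searcher meet, which is why the comments above look at the regenerator and its caller first.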
[jira] [Commented] (SOLR-12743) Memory leak introduced in Solr 7.3.0
[ https://issues.apache.org/jira/browse/SOLR-12743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16758193#comment-16758193 ]

Markus Jelsma commented on SOLR-12743:
--------------------------------------

Hello [~mgibney],

# There are no blocked threads, nothing peculiar, and nothing relating to autowarming or caches whatsoever. There are as many searcherExecutor threads as there are cores on the system, so it 'appears' it is not leaking threads, just object instances;
# the system is in general not under heavy load at all;
# this specific collection, the one having this problem, does not have autoCommit configured. It receives manual commits only, once every 15-20 minutes or so;
# there are never overlapping commits on this system; maxWarmingSearchers was set to 1 many years ago already. The instance is leaked during the first commit after start up;
# precisely: the instance count increments at each commit, and a forced GC doesn't clean it up. At a second commit 15-20 minutes later it increments again, and so on until the node dies horribly.

Thanks!
Markus
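The "one more instance per commit, not reclaimed by a forced GC" pattern described above is what one would expect if something keeps a strong reference to the old searcher after the swap. As far as I know, Solr hands out its searcher through a reference-counted holder, so one way to picture this is a borrower that increments the count and never gives the reference back. The following is a toy model under that assumption only. ToySearcher and simulate are invented names, and nothing here is taken from SolrCore.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class RefCountLeakDemo {
    /** Toy stand-in for a reference-counted searcher; not Solr's class. */
    static final class ToySearcher {
        static final AtomicInteger OPEN = new AtomicInteger();
        private final AtomicInteger refs = new AtomicInteger(1); // creator's reference

        ToySearcher() { OPEN.incrementAndGet(); }

        void incref() { refs.incrementAndGet(); }

        void decref() {
            // The searcher counts as closed once the last reference is returned.
            if (refs.decrementAndGet() == 0) OPEN.decrementAndGet();
        }
    }

    /** Simulates a number of commits; returns how many searchers stay open. */
    public static int simulate(int commits, boolean forgetDecref) {
        ToySearcher.OPEN.set(0);
        ToySearcher current = new ToySearcher();
        for (int i = 0; i < commits; i++) {
            ToySearcher next = new ToySearcher();
            current.incref();                     // some code borrows the old searcher
            if (!forgetDecref) current.decref();  // ...and should give it back
            current.decref();                     // the core drops the old searcher
            current = next;
        }
        return ToySearcher.OPEN.get();
    }

    public static void main(String[] args) {
        System.out.println(simulate(5, false)); // only the live searcher remains
        System.out.println(simulate(5, true));  // one extra searcher per commit
    }
}
```

With a balanced incref/decref only the live searcher stays open; with one missing decref per commit, every old searcher stays open, which matches the shape (not necessarily the cause) of what the heap dumps in this issue show.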
[jira] [Commented] (SOLR-12743) Memory leak introduced in Solr 7.3.0
[ https://issues.apache.org/jira/browse/SOLR-12743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16757467#comment-16757467 ]

Michael Gibney commented on SOLR-12743:
---------------------------------------

Ah, ok; so I guess looking for "overlapping onDeckSearcher" in logs is not productive. [~markus17], thanks for the extra information! A few more questions/thoughts:

# Does a thread dump provide any useful information? E.g., is an autowarm (or other) thread blocked somewhere?
# When the problem manifests, is the service running under load heavy enough that inserts/cleanup _could_ potentially monopolize a lock?
# What are your {{autoCommit}} (and {{autoSoftCommit}}, {{commitWithin}}, etc.) settings? Are you also running manual commits?
# Looking only at the code in {{SolrCore}}, it looks like the only way to get "PERFORMANCE WARNING: Overlapping onDeckSearchers" errors in your log is to have {{maxWarmingSearchers}} set to > 1. You could try setting this to "2" ... it's unlikely to hurt (in fact, unlikely to make a difference, per [~dsmiley]), but there's a remote chance it could provide useful feedback.
# I see you earlier noted that it's normal that two {{SolrIndexSearcher}}s should coexist immediately after a commit; so just to clarify, when you say it "immediately" leaks a {{SolrIndexSearcher}} instance, you mean it's hanging around longer than it should ...
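On the thread-dump question: for a running Solr node one would normally take the dump from outside the process (for example with jstack), but the same information is exposed through the standard java.lang.management API. The sketch below lists threads of the current JVM that are in state BLOCKED and whose names start with a given prefix. The class name and the prefix parameter are illustrative assumptions; "searcherExecutor" is simply the thread name mentioned elsewhere in this issue.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

public class BlockedThreads {
    /** Lists threads of this JVM that are BLOCKED and match the name prefix. */
    public static List<String> blocked(String namePrefix) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> out = new ArrayList<>();
        // false/false: skip monitor and synchronizer details, which not every
        // JVM supports and which are not needed for a simple state check.
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info != null
                    && info.getThreadState() == Thread.State.BLOCKED
                    && info.getThreadName().startsWith(namePrefix)) {
                out.add(info.getThreadName() + " blocked on " + info.getLockName());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // An empty list means no matching thread is waiting for a monitor.
        System.out.println(blocked("searcherExecutor"));
    }
}
```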
[jira] [Commented] (SOLR-12743) Memory leak introduced in Solr 7.3.0
[ https://issues.apache.org/jira/browse/SOLR-12743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16757294#comment-16757294 ]

David Smiley commented on SOLR-12743:
-------------------------------------

My understanding of "Overlapping onDeckSearcher" is that it became impossible as of Solr 6.something, in which commits block other commits instead of overlapping. That's configurable, but the default is good.
[jira] [Commented] (SOLR-12743) Memory leak introduced in Solr 7.3.0
[ https://issues.apache.org/jira/browse/SOLR-12743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16757133#comment-16757133 ]

Markus Jelsma commented on SOLR-12743:
--------------------------------------

Hello [~mgibney],

I tried to pin down the problem once more, and can confirm a few things:
* disabling the FilterCache solves the SolrIndexSearcher leaks on commit;
* enabling the FilterCache with a zero autoWarmCount also solves the leak;
* the FilterCache with a modest autoWarmCount (128), as opposed to the high count (>2k) we had before, immediately leaks a SolrIndexSearcher instance on commit.

Unfortunately, the patch doesn't prevent the leak.
[jira] [Commented] (SOLR-12743) Memory leak introduced in Solr 7.3.0
[ https://issues.apache.org/jira/browse/SOLR-12743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756284#comment-16756284 ]

Michael Gibney commented on SOLR-12743:
---------------------------------------

The patch just attached is a shot in the dark (I can't directly reproduce this problem). But I think it's probably a good patch either way, because I _was_ able to induce some weird behavior by artificially simulating a high-turnover cache environment (lots of inserts) while simultaneously trying to execute the {{get[Oldest|Latest]AccessedItems()}} methods on {{ConcurrentLRUCache}}. This would be akin to what happens to an old cache that remains under heavy load (lots of inserts) while a new cache/searcher is being warmed (queries for autowarm are retrieved via the {{get[Oldest|Latest]AccessedItems()}} methods).

The heart of the issue I observed is that {{get[Oldest|Latest]AccessedItems()}} use {{ReentrantLock.lock()}}, but {{markAndSweep()}} (for cleaning overflow entries) uses {{ReentrantLock.tryLock()}}. The latter is evidently much faster, and by design does not respect the {{fairness=true}} setting on {{markAndSweepLock}}. So I was able to create a situation where, with heavy enough turnover, {{markAndSweep()}} was called regularly enough that it monopolized the lock, starving {{get[Oldest|Latest]AccessedItems()}}.

FWIW, I noticed that the official solr docker image moved from using openjdk 8 to openjdk 11 in the version interval that seems to have triggered this issue.

I realize that this might fall short as an explanation for this issue, because the line of reasoning I'm following here would suggest that autowarming should block (not complete), which should (?) trigger "Overlapping onDeckSearcher" warnings. Also, it seems unlikely (though certainly not impossible) to consistently sustain a level of load sufficient to permanently monopolize the lock.

Re: autowarming ... earlier comments are ambiguous wrt autowarm counts. _If_ the underlying issue is lock contention, then the _exact_ autowarm count should not matter, but I would expect that _disabling_ autowarm (setting to 0) would in fact be an effective workaround.
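The two acquisition styles described in this comment can be put side by side in a small, standalone sketch. It does not use Solr code and does not reproduce the leak or the starvation; it only relies on the documented JDK behavior that the untimed {{ReentrantLock.tryLock()}} may acquire a free lock ahead of queued threads even when the lock was created as fair. The class name, the thread roles ("sweeper", "reader") and the iteration counts are made up for illustration, and the printed counts will differ between runs and JVMs.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch, not Solr code: one thread polls a fair lock with
// tryLock() (like the markAndSweep() path described above), another uses
// the blocking lock() (like the get[Oldest|Latest]AccessedItems() path).
class TryLockBargingSketch {

    /** Counts gathered from one run; field names are illustrative. */
    static final class Result {
        final long sweeperAcquired; // tryLock() calls that got the lock
        final long sweeperSkipped;  // tryLock() calls that found it held
        final int readerAcquired;   // blocking lock() acquisitions completed

        Result(long sweeperAcquired, long sweeperSkipped, int readerAcquired) {
            this.sweeperAcquired = sweeperAcquired;
            this.sweeperSkipped = sweeperSkipped;
            this.readerAcquired = readerAcquired;
        }
    }

    static Result run(final int sweeperAttempts, final int readerAcquisitions) {
        // fair=true only orders threads that queue up via lock();
        // the untimed tryLock() is allowed to barge past that queue.
        final ReentrantLock lock = new ReentrantLock(true);
        final long[] sweeper = new long[2]; // [0]=acquired, [1]=skipped
        final int[] reader = new int[1];

        Thread sweeperThread = new Thread(() -> {
            for (int i = 0; i < sweeperAttempts; i++) {
                if (lock.tryLock()) {
                    try {
                        sweeper[0]++;
                    } finally {
                        lock.unlock();
                    }
                } else {
                    sweeper[1]++;
                }
            }
        });

        Thread readerThread = new Thread(() -> {
            for (int i = 0; i < readerAcquisitions; i++) {
                lock.lock(); // queues fairly and waits for its turn
                try {
                    reader[0]++;
                } finally {
                    lock.unlock();
                }
            }
        });

        sweeperThread.start();
        readerThread.start();
        try {
            // join() makes the counters written by each thread visible here
            sweeperThread.join();
            readerThread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return new Result(sweeper[0], sweeper[1], reader[0]);
    }

    public static void main(String[] args) {
        Result r = run(1_000_000, 1_000);
        System.out.println("sweeper acquired=" + r.sweeperAcquired
                + " skipped=" + r.sweeperSkipped
                + " reader acquired=" + r.readerAcquired);
    }
}
```

If fairness towards queued threads were wanted on the polling side, the JDK documentation suggests the timed form {{tryLock(0, TimeUnit.SECONDS)}}, which honors the fair ordering; whether that is appropriate for {{markAndSweep()}} is a separate question that this sketch does not answer.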
[jira] [Commented] (SOLR-12743) Memory leak introduced in Solr 7.3.0
[ https://issues.apache.org/jira/browse/SOLR-12743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16755052#comment-16755052 ]

Markus Jelsma commented on SOLR-12743:
--------------------------------------

Hello [~bjoernhaeuser], thanks for confirming.

Sadly, I can confirm the problem persists with Solr 7.6.0. We still cannot reproduce it locally, not even with the index taken from production.
[jira] [Commented] (SOLR-12743) Memory leak introduced in Solr 7.3.0
[ https://issues.apache.org/jira/browse/SOLR-12743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16754660#comment-16754660 ]

Björn Häuser commented on SOLR-12743:
-------------------------------------

[~markus17] sorry for coming back to you this late, but we can confirm, no leaks anymore. Sorry for the caused inconvenience.
[jira] [Commented] (SOLR-12743) Memory leak introduced in Solr 7.3.0
[ https://issues.apache.org/jira/browse/SOLR-12743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16661981#comment-16661981 ]

Markus Jelsma commented on SOLR-12743:
--------------------------------------

Hello [~bjoernhaeuser], just pinging again.

Could you please verify whether your nodes are no longer leaking SolrIndexSearcher instances on commit? And can you confirm that this was your problem to begin with? I am still not sure you are seeing the same problem that I have.

Thanks,
Markus
[jira] [Commented] (SOLR-12743) Memory leak introduced in Solr 7.3.0
[ https://issues.apache.org/jira/browse/SOLR-12743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16636754#comment-16636754 ]

Markus Jelsma commented on SOLR-12743:
--------------------------------------

Using VisualVM's memory sampler tool, filtered on org.apache.solr.search.SolrIndexSearcher, you should see the number of instances increment by one for each commit. If you have multiple cores of the same collection on the same instance, the number of instances should grow accordingly.

Take great care: on Solr nodes where this bug is not present, the number of SolrIndexSearcher instances will increment too! That increase is temporary, though. It is caused by searcher warming and GC delay, and shortly after the commit the number drops again.
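The check described in this comment can also be scripted, assuming a class histogram is available as text, for example from {{jmap -histo:live <pid>}}. The {{:live}} option triggers a full GC before counting, which helps with the GC-delay caveat above, though it is best run once searcher warming has finished. The sketch below only parses such a histogram and compares two snapshots; the class name, the helper method and the sample rows are invented for illustration and are not measurements from any affected node.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extracts the instance count of one class from
// jmap-style histogram text ("num: #instances #bytes class name").
class HistogramInstanceCount {

    // Data rows look like "   2:      3     210000  some.Class"; newer JDKs
    // may append a module annotation in parentheses after the class name.
    private static final Pattern ROW = Pattern.compile(
            "^\\s*\\d+:\\s+(\\d+)\\s+(\\d+)\\s+(\\S+)(?:\\s+\\(.*\\))?\\s*$");

    /** Returns the instance count for className, or 0 if it is not listed. */
    static long countInstances(String histogram, String className) {
        for (String line : histogram.split("\\R")) {
            Matcher m = ROW.matcher(line);
            if (m.matches() && m.group(3).equals(className)) {
                return Long.parseLong(m.group(1));
            }
        }
        return 0L;
    }

    public static void main(String[] args) {
        String searcher = "org.apache.solr.search.SolrIndexSearcher";

        // Made-up snapshots taken before and some time after one commit.
        String before =
                " num     #instances         #bytes  class name\n"
              + "----------------------------------------------\n"
              + "   1:          1200       96000000  [B\n"
              + "   2:             3         210000  " + searcher + "\n";
        String after =
                " num     #instances         #bytes  class name\n"
              + "----------------------------------------------\n"
              + "   1:          1350       99000000  [B\n"
              + "   2:             4         280000  " + searcher + "\n";

        long growth = countInstances(after, searcher)
                - countInstances(before, searcher);
        System.out.println("SolrIndexSearcher instances gained: " + growth);
    }
}
```

A count that keeps growing by one per commit across several such snapshots would match the symptom reported here; a count that returns to its earlier value would not.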
[jira] [Commented] (SOLR-12743) Memory leak introduced in Solr 7.3.0
[ https://issues.apache.org/jira/browse/SOLR-12743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16634498#comment-16634498 ]

Björn Häuser commented on SOLR-12743:
-------------------------------------

[~markus17] I am not sure. How could I see this?
[jira] [Commented] (SOLR-12743) Memory leak introduced in Solr 7.3.0
[ https://issues.apache.org/jira/browse/SOLR-12743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627213#comment-16627213 ]

Markus Jelsma commented on SOLR-12743:
--------------------------------------

Hello [~bjoernhaeuser],

Are you absolutely sure you are no longer leaking a SolrIndexSearcher instance on commit after dialing back the auto-warm counts? Despite having flooded the filter cache with entries comparable to production while continuously indexing, I am still not able to reproduce the problem in local tests, and so cannot confirm this work-around.

On the other hand, how could this solve the problem? We do not have overlapping warm-up searchers: maxWarmingSearchers is set to 1, and we only commit roughly once every fifteen minutes, when our crawler is ready to index a batch.

Thanks,
Markus
[jira] [Commented] (SOLR-12743) Memory leak introduced in Solr 7.3.0
[ https://issues.apache.org/jira/browse/SOLR-12743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16625953#comment-16625953 ]

Erick Erickson commented on SOLR-12743:

Björn: Thanks for that info. I pretty much guarantee that people trying to reproduce this wouldn't have thought of using a high autowarm count right off the bat.

{quote}There was nothing in the logs telling me that there are multiple open searchers at the same time.{quote}

The only thing I'd expect is perhaps something like "PERFORMANCE WARNING: Overlapping onDeckSearcher" in the logs, if your autowarm count was high enough that a second commit came in while autowarming was still going on.

I know some of the metrics were tricky to get to release all their resources; perhaps there's a path where they don't get shut down properly in this situation.
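One way to check for the warning Erick describes is to scan the Solr log for that message. The following is a minimal sketch and not part of the original thread: the class name and the default log path are assumptions, and because the exact wording of the warning may differ between Solr versions, only the prefix quoted above is matched.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

/** Counts log lines that report overlapping on-deck searchers. */
final class OverlapWarningScan {
    // Prefix as quoted in the comment above; the real message may carry a suffix.
    static final String MARKER = "PERFORMANCE WARNING: Overlapping onDeckSearcher";

    /** Returns how many of the given lines contain the warning prefix. */
    static long countWarnings(Stream<String> lines) {
        return lines.filter(line -> line.contains(MARKER)).count();
    }

    public static void main(String[] args) throws IOException {
        // "solr.log" is a placeholder; pass the real log location as the first argument.
        Path log = Paths.get(args.length > 0 ? args[0] : "solr.log");
        try (Stream<String> lines = Files.lines(log)) {
            System.out.println(countWarnings(lines) + " overlapping-searcher warnings in " + log);
        }
    }
}
```

A count of zero would be consistent with Björn's observation that the logs showed no sign of multiple open searchers, although it would not by itself rule out a leaked searcher.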
[jira] [Commented] (SOLR-12743) Memory leak introduced in Solr 7.3.0
[ https://issues.apache.org/jira/browse/SOLR-12743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16625398#comment-16625398 ]

Björn Häuser commented on SOLR-12743:

Hello everyone,

sorry for the late answer. We figured it was the high autowarm counts for our most active collection. We turned them down (as suggested in the mail thread) and everything was fine again.

However, I could not find any evidence that the autowarming itself was the problem. There was nothing in the logs telling me that there were multiple open searchers at the same time.

We are running the official Solr Docker images from Docker Hub.

Thank you
Björn
[jira] [Commented] (SOLR-12743) Memory leak introduced in Solr 7.3.0
[ https://issues.apache.org/jira/browse/SOLR-12743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16606883#comment-16606883 ]

Markus Jelsma commented on SOLR-12743:

Erick wrote:
bq. It would be great if Markus and Björn could add some environment info on the JIRA, in particular the version of Java you're both using and the op system etc...

Alright. Debian GNU/Linux 9.3 Stretch, running OpenJDK 64bit 1.8.0_181-8u181-b13-1~deb9u1-b13.
[jira] [Commented] (SOLR-12743) Memory leak introduced in Solr 7.3.0
[ https://issues.apache.org/jira/browse/SOLR-12743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16605460#comment-16605460 ]

Markus Jelsma commented on SOLR-12743:

That is a most regrettable typo. I meant that I cannot reproduce it locally, even when I introduce continuous indexing and querying. That is the whole problem I have. Perhaps Björn can; I'll ask him on the list!
[jira] [Commented] (SOLR-12743) Memory leak introduced in Solr 7.3.0
[ https://issues.apache.org/jira/browse/SOLR-12743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16604714#comment-16604714 ]

Joel Bernstein commented on SOLR-12743:

[~markus.jel...@openindex.io], you mention in an email that you are able to reproduce this issue:

"But, i can reproduce it locally when i introduce queries, and filter queries while indexing pieces of data and committing it."

Can you add the steps you used to reproduce it (queries run, filter queries run, etc.)?
[jira] [Commented] (SOLR-12743) Memory leak introduced in Solr 7.3.0
[ https://issues.apache.org/jira/browse/SOLR-12743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16604708#comment-16604708 ]

Mark Miller commented on SOLR-12743:

Is the metrics registry holding onto these SolrIndexSearchers? Perhaps the same thing as SOLR-11882?
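The mechanism Mark asks about can be illustrated in isolation. The sketch below is not Solr code and makes no claim about the actual cause of this issue: `GaugeRegistry` and `DemoSearcher` are hypothetical stand-ins. It only shows how a long-lived registry that stores a gauge closure keeps the object captured by that closure reachable until the gauge is unregistered.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** Hypothetical stand-in for a long-lived metrics registry; not Solr's API. */
final class GaugeRegistry {
    private final Map<String, Supplier<Long>> gauges = new ConcurrentHashMap<>();

    void register(String name, Supplier<Long> gauge) { gauges.put(name, gauge); }
    void unregister(String name) { gauges.remove(name); }
    int size() { return gauges.size(); }
}

/** Hypothetical stand-in for a searcher that publishes a gauge about itself. */
final class DemoSearcher implements AutoCloseable {
    private final GaugeRegistry registry;
    private final String gaugeName;
    private final long[] cache = new long[1024]; // memory that lives as long as the searcher

    DemoSearcher(GaugeRegistry registry, String name) {
        this.registry = registry;
        this.gaugeName = "searcher." + name + ".cacheSize";
        // The lambda reads a field, so it captures `this`: the registry now
        // holds a strong reference to the whole searcher, cache included.
        registry.register(gaugeName, () -> (long) cache.length);
    }

    @Override
    public void close() {
        // If this call were skipped, the registry entry would keep the
        // closed searcher reachable and the garbage collector could not free it.
        registry.unregister(gaugeName);
    }
}
```

In this toy model, opening one `DemoSearcher` per commit without the `unregister` call in `close()` would leave one extra unreachable-in-practice searcher in the registry per commit, which is the shape of the symptom described in the issue ("exactly one SolrIndexSearcher instance on each commit").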