[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15247325#comment-15247325 ]

ASF subversion and git services commented on SOLR-8349:
---

Commit ffbbfbbe107cf42de6299c6c94b032bb21fe716f in lucene-solr's branch refs/heads/branch_6x from [~noble.paul]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=ffbbfbb ]

SOLR-8349: trying to address test failures

> Allow sharing of large in memory data structures across cores
> -
>
> Key: SOLR-8349
> URL: https://issues.apache.org/jira/browse/SOLR-8349
> Project: Solr
> Issue Type: Sub-task
> Components: Server
> Affects Versions: 5.3
> Reporter: Gus Heck
> Assignee: Noble Paul
> Fix For: master, 6.1
>
> Attachments: SOLR-8349.patch, SOLR-8349.patch, SOLR-8349.patch,
> SOLR-8349.patch, SOLR-8349.patch, SOLR-8349.patch, SOLR-8349.patch
>
>
> In some cases search components or analysis classes may utilize a large
> dictionary or other in-memory structure. When multiple cores are loaded with
> identical configurations utilizing this large in-memory structure, each core
> holds its own copy in memory. This has been noted in the past and a specific
> case reported in SOLR-3443. This patch provides a generalized capability, and
> if accepted, this capability will then be used to fix SOLR-3443.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15247323#comment-15247323 ]

ASF subversion and git services commented on SOLR-8349:

Commit 456d5c04c87744659a241f85d0fa04c683c81a2c in lucene-solr's branch refs/heads/master from [~noble.paul]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=456d5c0 ]

SOLR-8349: trying to address test failures
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15245450#comment-15245450 ]

ASF subversion and git services commented on SOLR-8349:

Commit 5b680de13a40021825b71d35c04a790dde2d7f6a in lucene-solr's branch refs/heads/branch_6x from [~noble.paul]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=5b680de ]

SOLR-8349: Allow sharing of large in memory data structures across cores
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15245447#comment-15245447 ]

ASF subversion and git services commented on SOLR-8349:

Commit 489acdb5092676965e1dda067d35b58aeee8cf7d in lucene-solr's branch refs/heads/branch_6x from [~noble.paul]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=489acdb ]

SOLR-8349: Allow sharing of large in memory data structures across cores
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15245428#comment-15245428 ]

ASF subversion and git services commented on SOLR-8349:

Commit 9a1880aee821d4e6e96a8ff2fb15062b1e4c9eb1 in lucene-solr's branch refs/heads/master from [~noble.paul]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=9a1880a ]

SOLR-8349: Allow sharing of large in memory data structures across cores
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15244256#comment-15244256 ]

Gus Heck commented on SOLR-8349:

Also deleted the misnamed patch, since it seems to sort to the top.
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242847#comment-15242847 ]

Noble Paul commented on SOLR-8349:

The patch is not up to date with master. Can you please update it?
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242841#comment-15242841 ]

Gus Heck commented on SOLR-8349:

Actually, looking closer, it seems to be happening when I try to add the blob, and not in the code that relies on the blob, so this might be a separate issue. Adding a 2 second sleep after startup of the cluster in the test does not alleviate it, so this test seems to be creating a condition where the blob store is not able to receive a blob. Whether that is an artifact of what's going on in SolrCloudTestCase or not, I don't know.
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242833#comment-15242833 ]

Gus Heck commented on SOLR-8349:

Re-applied the patch to trunk (no conflicts), ran the tests, and hit this... I had seen something like this before, but I hadn't noted the seed at the time and it hadn't happened in a while, so I assumed it had been fixed by something I did. Looks a little like it might be the blob store not being ready (SOLR-8772). This seed seems to reproduce it reliably, though the test passes most of the time for me. (Ran it 10 times, saw this twice.)

{code}
[junit4] 2> NOTE: reproduce with: ant test -Dtestcase=BlobRepositoryCloudTest -Dtests.seed=387A9881AC04D1FB -Dtests.slow=true -Dtests.locale=mk -Dtests.timezone=America/Lower_Princes -Dtests.asserts=true -Dtests.file.encoding=UTF-8
[junit4] ERROR 0.00s | BlobRepositoryCloudTest (suite) <<<
[junit4]    > Throwable #1: java.net.SocketException: Connection reset
[junit4]    > at __randomizedtesting.SeedInfo.seed([387A9881AC04D1FB]:0)
[junit4]    > at java.net.SocketInputStream.read(SocketInputStream.java:209)
[junit4]    > at java.net.SocketInputStream.read(SocketInputStream.java:141)
[junit4]    > at org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:139)
[junit4]    > at org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:155)
[junit4]    > at org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:284)
[junit4]    > at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:140)
[junit4]    > at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57)
[junit4]    > at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:261)
[junit4]    > at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:165)
[junit4]    > at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:167)
[junit4]    > at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:272)
[junit4]    > at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:124)
[junit4]    > at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:271)
[junit4]    > at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:184)
[junit4]    > at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:88)
[junit4]    > at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
[junit4]    > at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184)
[junit4]    > at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
[junit4]    > at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:107)
[junit4]    > at org.apache.solr.core.BlobRepositoryCloudTest.postBlob(BlobRepositoryCloudTest.java:104)
[junit4]    > at org.apache.solr.core.BlobRepositoryCloudTest.setupCluster(BlobRepositoryCloudTest.java:59)
[junit4]    > at java.lang.Thread.run(Thread.java:745)
[junit4] Completed [1/1 (1!)] in 8.94s, 0 tests, 1 error <<< FAILURES!
{code}
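If the failure above really is a readiness race, as the comment suspects, one way to probe that hypothesis is to retry the upload with a bounded, growing delay instead of a single fixed sleep. The following is only a sketch under that assumption: `Retry`, `withBackoff`, and `RetryDemo` are hypothetical names, not part of the patch or of Solr, and the thread does not establish that retrying would actually cure this test.

```java
import java.io.IOException;
import java.net.SocketException;
import java.util.concurrent.Callable;

/** Hypothetical helper (not from the patch): retries an action that fails with an I/O error. */
final class Retry {
  private Retry() {}

  static <T> T withBackoff(Callable<T> action, int maxAttempts, long initialDelayMs) throws Exception {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be >= 1");
    }
    long delay = initialDelayMs;
    Exception last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return action.call();
      } catch (IOException e) { // SocketException is a subclass of IOException
        last = e;
        if (attempt < maxAttempts) {
          Thread.sleep(delay); // wait before the next attempt
          delay *= 2;          // double the wait each time
        }
      }
    }
    throw last; // every attempt failed; surface the final error
  }
}

class RetryDemo {
  public static void main(String[] args) throws Exception {
    int[] calls = {0};
    // Simulated upload: resets the connection twice, then succeeds.
    String result = Retry.withBackoff(() -> {
      if (++calls[0] < 3) {
        throw new SocketException("Connection reset");
      }
      return "blob accepted";
    }, 5, 10L);
    System.out.println(result + " after " + calls[0] + " attempts");
  }
}
```

If the upload still fails after several spaced attempts, that would point away from a simple startup race and toward a separate problem, which is what the follow-up comment suspects.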
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242718#comment-15242718 ]

Noble Paul commented on SOLR-8349:

If you think it is good enough, I can review and go ahead.
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242710#comment-15242710 ]

Gus Heck commented on SOLR-8349:

Would be good to sort out what needs to be done for SOLR-8772 in time for 6.1.
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15175661#comment-15175661 ]

Gus Heck commented on SOLR-8349:

Right, I got around that by creating {{.system}} first and uploading before I created a collection with a config that uses the blob store. I think some configurations that NEED the blob to function properly will want to fail if the blob store (and/or the blob) is not available. Both using and not using {{BlobStoreAware}} might be reasonable? Or maybe the interface should have a flag indicating whether it is allowable to open the server without it?
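The flag idea floated in this comment could look roughly like the sketch below. Everything here is hypothetical: `BlobDependent`, `mayStartWithoutBlob`, and `BlobStartupPolicy` are illustrative names, not the `BlobStoreAware` interface or any Solr API, and the thread does not say how (or whether) such a flag was implemented.

```java
import java.util.Optional;

/** Hypothetical interface; the thread only floats the idea of such a flag. */
interface BlobDependent {
  /** Whether the component may start without its blob. Strict by default. */
  default boolean mayStartWithoutBlob() {
    return false;
  }

  /** Called with the blob bytes once they are available. */
  void blobLoaded(byte[] content);
}

final class BlobStartupPolicy {
  private BlobStartupPolicy() {}

  /**
   * Applies the flag: hands over the blob if present, otherwise fails fast
   * for strict components and returns false for lenient ones.
   */
  static boolean start(BlobDependent component, Optional<byte[]> blob) {
    if (blob.isPresent()) {
      component.blobLoaded(blob.get());
      return true;
    }
    if (!component.mayStartWithoutBlob()) {
      throw new IllegalStateException("required blob is not available");
    }
    return false; // started in degraded mode, without the blob
  }
}

class BlobStartupPolicyDemo {
  public static void main(String[] args) {
    BlobDependent strict = content -> System.out.println("strict loaded " + content.length + " bytes");
    BlobDependent lenient = new BlobDependent() {
      @Override public boolean mayStartWithoutBlob() { return true; }
      @Override public void blobLoaded(byte[] content) { }
    };
    System.out.println(BlobStartupPolicy.start(strict, Optional.of(new byte[4])));
    System.out.println(BlobStartupPolicy.start(lenient, Optional.empty()));
    try {
      BlobStartupPolicy.start(strict, Optional.empty());
    } catch (IllegalStateException e) {
      System.out.println("strict component refused to start: " + e.getMessage());
    }
  }
}
```

Defaulting the flag to strict matches the concern raised above: a configuration that needs the blob fails loudly unless its author explicitly opts into starting without it.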
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15174902#comment-15174902 ]

Noble Paul commented on SOLR-8349:

Thanks, the tests are fine, but for one thing: the blob store (the {{.system}} collection) is not guaranteed to be available at core load time. So your component should implement {{BlobStoreAware}}, and the class should try to load resources only in the callback for that interface.
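The advice above amounts to deferring resource loading from construction time to a later callback. A minimal sketch of that pattern follows. The signature of {{BlobStoreAware}} is not shown in this thread, so `BlobStoreReadyCallback`, `DictionaryComponent`, and `ToyContainer` are hypothetical stand-ins rather than Solr classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical stand-in for the BlobStoreAware callback; the real signature is not shown in this thread. */
interface BlobStoreReadyCallback {
  void onBlobStoreReady(Supplier<String> blobLoader);
}

/** A component that defers loading its large resource until the callback fires. */
class DictionaryComponent implements BlobStoreReadyCallback {
  private volatile String dictionary; // null until the blob store is ready

  DictionaryComponent() {
    // Deliberately no blob access here: the store may not exist yet at core load.
  }

  @Override
  public void onBlobStoreReady(Supplier<String> blobLoader) {
    dictionary = blobLoader.get();
  }

  boolean isReady() {
    return dictionary != null;
  }

  String lookup(String term) {
    if (dictionary == null) {
      throw new IllegalStateException("dictionary blob not loaded yet");
    }
    return dictionary.contains(term) ? "known" : "unknown";
  }
}

/** Toy container: components are constructed first, callbacks fire later. */
class ToyContainer {
  private final List<BlobStoreReadyCallback> waiting = new ArrayList<>();

  void register(BlobStoreReadyCallback component) {
    waiting.add(component);
  }

  void blobStoreBecameAvailable(Supplier<String> loader) {
    for (BlobStoreReadyCallback component : waiting) {
      component.onBlobStoreReady(loader);
    }
    waiting.clear();
  }
}

class DeferredLoadDemo {
  public static void main(String[] args) {
    ToyContainer container = new ToyContainer();
    DictionaryComponent component = new DictionaryComponent();
    container.register(component);
    System.out.println(component.isReady()); // constructed, but nothing loaded yet
    container.blobStoreBecameAvailable(() -> "apple banana cherry");
    System.out.println(component.isReady());
    System.out.println(component.lookup("banana"));
  }
}
```

The point of the split is that the constructor never touches the blob store, so core loading cannot fail merely because the store comes up later.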
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15172624#comment-15172624 ]

Gus Heck commented on SOLR-8349:

Ah, it seems a brand new class for cloud unit testing has magically appeared in SOLR-8758 just today.
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15172386#comment-15172386 ]

Gus Heck commented on SOLR-8349:

Hmm, I've been wondering how I might write a good test case for it. Are there some tests out there that fire up a cloud mode Solr and interact with multiple cores that I can use as an example? I can probably come up with mock-object-based tests that exercise some of the methods, but that rather misses the cross-core sharing aspect of this and the usage by components. I expect SolrTestCaseJ4 will probably hit the else block in this code:

{code}
if (this.coreContainer.isZooKeeperAware()) {
  // stuff we want to test is here!
} else {
  throw new SolrException(SolrException.ErrorCode.SERVER_ERROR, "Blob loading is not supported in non-cloud mode");
}
{code}
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15172260#comment-15172260 ]

Noble Paul commented on SOLR-8349:

The patch looks fine. If we have a test case for this, we could commit it right away.
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15171924#comment-15171924 ]

Gus Heck commented on SOLR-8349:

Oooops! sorry, it seems I uploaded the wrong file (and missed your response). Thx for following up!
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15171699#comment-15171699 ]

Noble Paul commented on SOLR-8349:

[~gus_heck] can you post the updated patch?
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15169453#comment-15169453 ]

Noble Paul commented on SOLR-8349:

It's fine. But the patch doesn't have this extra method for the decoder.
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15169187#comment-15169187 ] Gus Heck commented on SOLR-8349: Perhaps you were looking for a second hash map? We don't need one if we have more descriptive keys. I should have explained my latest patch. I was hasty. Implementations of the Decoder interface can now (optionally) give their decoders names. {code} public interface Decoder { /** * A name by which to distinguish this decoding. This only needs to be implemented if you want to support * decoding the same blob content with more than one decoder. * * @return The name of the decoding, defaults to empty string. */ default String getName() { return ""; } /** * A routine that knows how to convert the stream of bytes from the blob into a Java object. * * @param inputStream the bytes from a blob * @return A Java object of the specified type. */ T decode(InputStream inputStream); } {code} The internal hashmap that holds blob content objects (be they decoded or not) will append this name to the key for put and get operations in getIncrementRef and store the appropriate value in the key field of BlobContent {code} BlobContentRef getBlobIncRef(String key, Decoder decoder) { return getBlobIncRef(key.concat(decoder.getName()), () -> addBlob(key,decoder)); } in private BlobContentRef getBlobIncRef(String key, CallableblobCreator) ... aBlob = blobs.get(key); if (aBlob == null) { try { aBlob = blobCreator.call(); {code} Note that the unmodified key was supplied to the lambda invoking addBlob... 
{code}
// for use cases sharing java objects
private BlobContent addBlob(String key, Decoder decoder) {
  ByteBuffer b = fetchBlob(key);
  String keyPlusName = key + decoder.getName();
  BlobContent aBlob = new BlobContent<>(keyPlusName, b, decoder);
  blobs.put(keyPlusName, aBlob);
  return aBlob;
}
{code}

Thus the BlobContent object is holding the more descriptive key when we get to decrementBlobRefCount and do this:

{code}
if (ref.blob.references.isEmpty()) {
  blobs.remove(ref.blob.key);
}
{code}

Also note that the differing method signatures distinguish the case where we don't have a decoder, because we are interested in caching the raw bytes rather than the decoded form, and a different path is taken. In this case the key in the map matches the key in the blob store, ensuring that we can't cache the raw bytes twice.

{code}
public BlobContentRef getBlobIncRef(String key) {
  return getBlobIncRef(key, () -> addBlob(key));
}

// For use cases sharing raw bytes
private BlobContent addBlob(String key) {
  ByteBuffer b = fetchBlob(key);
  BlobContent aBlob = new BlobContent<>(key, b);
  blobs.put(key, aBlob);
  return aBlob;
}
{code}

Looking at it again today, it occurs to me that there is a small chance that if someone wants to share the raw bytes and also share a decoded form of the same blob, and they fail to name their Decoder, we would have a race condition. This should be avoided by defaulting to "!D!" (or some other "reserved" string) instead of the empty string in the above Decoder interface (and documenting this in the javadoc). It is intentionally left up to the implementor to decide whether or not they provide names that allow/prevent collisions.
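The keying scheme described in this comment can be sketched outside of Solr. The following is only an illustration under stated assumptions: NamedDecoder, KeyedBlobCache and readUtf8 are hypothetical names invented for this sketch and are not the classes in the patch, and a plain in-memory map stands in for the blob store. It shows raw bytes cached under the bare key and decoded objects cached under key + decoder name, with "!D!" as the reserved default name suggested above.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for the Decoder interface quoted in the comment. */
interface NamedDecoder<T> {
  /** Reserved default, so an unnamed decoder never shares a key with the raw bytes. */
  default String getName() { return "!D!"; }

  T decode(InputStream in);
}

/** Toy cache: raw bytes live under key, decoded objects under key + decoder name. */
final class KeyedBlobCache {
  private final Map<String, byte[]> store; // stands in for the blob store
  private final Map<String, Object> blobs = new ConcurrentHashMap<>();

  KeyedBlobCache(Map<String, byte[]> store) { this.store = store; }

  /** Raw-bytes path: the cache key equals the blob store key. */
  byte[] getRaw(String key) {
    return (byte[]) blobs.computeIfAbsent(key, k -> store.get(key));
  }

  /** Decoded path: fetch with the unmodified key, cache under the longer key. */
  @SuppressWarnings("unchecked")
  <T> T getDecoded(String key, NamedDecoder<T> decoder) {
    String cacheKey = key + decoder.getName();
    return (T) blobs.computeIfAbsent(cacheKey,
        k -> decoder.decode(new ByteArrayInputStream(store.get(key))));
  }

  int size() { return blobs.size(); }

  /** Small helper so decoders can be written as lambdas without checked exceptions. */
  static String readUtf8(InputStream in) {
    try {
      return new String(in.readAllBytes(), StandardCharsets.UTF_8);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    Map<String, byte[]> store = new HashMap<>();
    store.put("myCsvFile/1", "a,b".getBytes(StandardCharsets.UTF_8));
    KeyedBlobCache cache = new KeyedBlobCache(store);
    NamedDecoder<String> unnamed = KeyedBlobCache::readUtf8;
    System.out.println(cache.getRaw("myCsvFile/1").length + " raw bytes");
    System.out.println("decoded: " + cache.getDecoded("myCsvFile/1", unnamed));
    System.out.println("cache entries: " + cache.size());
  }
}
```

As in the comment, nothing here stops two differently behaving decoders from returning the same name; choosing names that allow or prevent collisions is left to the implementor.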
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15169057#comment-15169057 ]

Noble Paul commented on SOLR-8349:

Really? I still think it doesn't take care of multiple decoders.
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15166522#comment-15166522 ]

Noble Paul commented on SOLR-8349:

One problem I see with the patch is decoding the object in two different ways. What if core1 has decoder1 and core2 has decoder2? Then the second call gets the output of the first decoder. That's why I kept a map internally, so that it is possible to deal with that use case. It may be unusual to do so, but for the sake of correctness we have to do it.
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15163464#comment-15163464 ]

Gus Heck commented on SOLR-8349:

I had updated on the 22nd. Should have pulled before I created the patch, but usually a one day lag isn't a problem. I didn't know you were going to create another issue and check it in :). I'll recreate the patch.
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15163001#comment-15163001 ]

Noble Paul commented on SOLR-8349:

Are you working on the latest trunk? I have already committed my other patch in SOLR-8719.
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159601#comment-15159601 ]

Gus Heck commented on SOLR-8349:

Sure I could, but other folks who try to use this after us will likely stumble into that pitfall. I've played around with it and simplified BlobContent, moved the PluginBag-specific stuff to PluginBag, and provided some javadoc and a user-friendly method on SolrCore which will ensure that a close hook is also created. It seems all my goals except #3 are satisfied, and we've exceeded expectations for #2, making it possible to reliably update the content on the fly with no danger of memory leaks (yay). I definitely like this approach better.

Attaching patch #4. The patch contains a class named org.apache.solr.handler.component.XXCustomComponent demonstrating usage (a redacted and cleansed version of what I'm using for a client's server, adapted to this patch). This class obviously should not be included in the commit.
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158135#comment-15158135 ]

Noble Paul commented on SOLR-8349:

bq. but I think there might be some holes WRT decoders decoding the same content more than once when multiple cores are loaded,

You can easily solve it by synchronizing the {{decode()}} code at your end.
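One way to read the suggestion above is sketched below. This is an assumption about what "synchronizing at your end" could look like, not code from the patch: DecodeOnce is a hypothetical wrapper that guards an expensive decode with a monitor and remembers the result, so concurrent callers wait for the first decode and then share its output. It only helps if all cores go through the same wrapper instance.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;
import java.util.stream.IntStream;

/** Hypothetical wrapper: runs an expensive decode at most once, however many cores ask. */
final class DecodeOnce<T> {
  private final Function<InputStream, T> expensiveDecode;
  private final AtomicInteger runs = new AtomicInteger();
  private T cached; // guarded by this object's monitor

  DecodeOnce(Function<InputStream, T> expensiveDecode) {
    this.expensiveDecode = expensiveDecode;
  }

  /** Callers block while the first one decodes, then all get the same object. */
  synchronized T decode(InputStream in) {
    if (cached == null) {
      runs.incrementAndGet();
      cached = expensiveDecode.apply(in);
    }
    return cached;
  }

  /** Number of times the underlying decode actually ran. */
  int runs() { return runs.get(); }

  public static void main(String[] args) {
    DecodeOnce<Map<String, String>> decoder =
        new DecodeOnce<Map<String, String>>(in -> new HashMap<String, String>());
    // Simulate 40 cores asking for the decoded structure concurrently.
    IntStream.range(0, 40).parallel()
        .forEach(i -> decoder.decode(new ByteArrayInputStream(new byte[0])));
    System.out.println("decode ran " + decoder.runs() + " time(s)");
  }
}
```

The trade-off is that the decoded object is now held by the decoder itself, so releasing it requires dropping the wrapper; a repository that caches the decoded form centrally avoids pushing that burden onto every component author.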
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158113#comment-15158113 ]

Gus Heck commented on SOLR-8349:

Looked at the patch some today. Mostly it looks easy enough to adapt to my client's code, but I think there might be some holes WRT decoders decoding the same content more than once when multiple cores are loaded, and it seems we hold onto the ByteBuffer after the decoding, which doubles memory usage. Will comment more and provide suggestions (+ patch) tomorrow.

Since decoding our file takes significant time and pegs the CPUs, I really don't want that repeating itself for all 40 cores :). Loading code in my decoder will look something like this:

{code}
// Run the parallel stream inside a dedicated pool sized to the machine.
ForkJoinPool pool = new ForkJoinPool(Runtime.getRuntime().availableProcessors());
try (Stream<String> lines = new BufferedReader(new InputStreamReader(inputStream, Charset.forName("UTF-8"))).lines()) {
  try {
    pool.submit(() -> lines.parallel().forEach(this::processSimpleCsvRow)).get();
  } catch (InterruptedException | ExecutionException e) {
    throw new IOException(e);
  }
} catch (IOException e) {
  throw new RuntimeException("Cannot load csv ", e);
} finally {
  pool.shutdownNow();
}
{code}
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15156521#comment-15156521 ]

David Smiley commented on SOLR-8349:

This is a nice suggestion [~noble.paul]; it looks remarkably close. +1 to enhancing it to be more generic and subsume this use-case.
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15156323#comment-15156323 ]

Noble Paul commented on SOLR-8349:

bq. So are you proposing changing JarRepository, or adding a similar class called BlobRepository?

No, I would rename it to BlobRepository.

bq. Seems like fairly major surgery is required to make the JarRepository class fully generic.

I shall put up a patch which can do this.

bq. I need to better understand the lazy="true" bit you mentioned,

I understand the problem with {{startup=lazy}}. We probably should make a new interface called {{BlobStoreAware}} which loads the component when the BlobStore is available. But let's not keep it separate.
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15156319#comment-15156319 ]

Gus Heck commented on SOLR-8349:

Hmm, looks like I didn't realize my IDE had navigated me to another class; getByteBuffer is on MemClassLoader...
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15156315#comment-15156315 ]

Gus Heck commented on SOLR-8349:

I do see how what you suggest could be made to work if a decoder was supplied and the decoded version was what was cached. So are you proposing changing JarRepository, or adding a similar class called BlobRepository? JarRepository does seem very Jar-focused (returning JarContent, JarContentRef, etc.). It would, for example, no longer be possible to return a ByteBuffer type from getByteBuffer, since returning the bytes from which to decode a HashMap won't help. Seems like fairly major surgery is required to make the JarRepository class fully generic. I do like the versioned resource and config API bit though, that's nice.

I need to better understand the lazy="true" bit you mentioned, because I very much don't ever want a user query to be the thing that causes the resource to load (a process that could take minutes). The resource should be available and ready to go when the system starts accepting queries.
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15156295#comment-15156295 ]

Noble Paul commented on SOLR-8349:

bq. That may come in handy if/when a refresh mechanism is desired.

The refresh mechanism is built into the system. The name is a combination of name and version, for example {{myCsvFile/1}} (where '1' is the first version). Every component can be updated using the API. If you add a new version of the blob, you simply update your component to use {{myCsvFile/2}}. The rest is automatically taken care of.
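To make the name/version convention concrete, here is a small sketch. VersionedBlobName is a hypothetical helper invented for illustration, not part of Solr; it only assumes the {{name/version}} form shown in the comment, with an integer version after the last slash.

```java
/** Hypothetical helper for keys of the form name/version, e.g. myCsvFile/1. */
final class VersionedBlobName {
  final String name;
  final int version;

  VersionedBlobName(String name, int version) {
    this.name = name;
    this.version = version;
  }

  /** Splits at the last slash; rejects keys without both a name and a version. */
  static VersionedBlobName parse(String key) {
    int slash = key.lastIndexOf('/');
    if (slash <= 0 || slash == key.length() - 1) {
      throw new IllegalArgumentException("expected name/version but got: " + key);
    }
    return new VersionedBlobName(
        key.substring(0, slash), Integer.parseInt(key.substring(slash + 1)));
  }

  /** The key a component would be reconfigured to use after a new upload. */
  VersionedBlobName next() {
    return new VersionedBlobName(name, version + 1);
  }

  @Override
  public String toString() {
    return name + "/" + version;
  }

  public static void main(String[] args) {
    VersionedBlobName current = VersionedBlobName.parse("myCsvFile/1");
    System.out.println("current: " + current + ", after update: " + current.next());
  }
}
```

Because each version is a distinct key, an updated component asks a shared cache for a different entry, and the entry for the old version can be dropped once nothing references it.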
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15156293#comment-15156293 ]

Noble Paul commented on SOLR-8349:

OK, I got the point. The framework can be easily extended to make the codec pluggable, so the cache can just keep the decoded object in memory instead of the ByteBuffer.

So the API would look like:

{code}
MyCsvDecoder csvDecoder = null; // initialize your decoder that would convert your csv to MyCustomObject
ObjectRef ref = BlobRepository.getObjectIncRef(name, csvDecoder);
{code}
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15156290#comment-15156290 ]

Gus Heck commented on SOLR-8349:

That sounds like a very cool way to store the source data, but this issue isn't about access to the blob/file itself. It's about access to the in-memory Java object created from the file/blob/whatever, so each core does not have to hold its own identical copy in RAM. In our case we had a csv file that was translated into a HashMap (for fast access *during* query processing) that had >900MB of memory usage. When we fired up 40 cores (20 collections, each with a main core and one replica) that used this component, RAM usage was through the roof with no data in the index yet. A similar situation is faced by Analyzers that have dictionaries etc. (see SOLR-3443). The current patch no longer supports a solution to that issue, as support for Lucene layer stuff has been deferred per discussion above.

It is interesting to see that there is an implementation of the ref counting to use as a model, and your notes about hooks on the component will be useful too. That may come in handy if/when a refresh mechanism is desired.
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15156236#comment-15156236 ]

Noble Paul commented on SOLR-8349:

[~gus_heck] There is another easy existing solution for this problem.

How to do this:
1) Store your large file in the blob store
2) Use {{blobRef = JarRepository.getJarIncRef(name)}} to get the content (we will change the method names for it to make sense for you)
3) Make your component register as a closehook
4) In the {{postClose()}}, do a {{blobRef.decrementJarRefCount()}}

The advantages of this solution are:
1) You get a version managed store for your large files without screwing up your ZK
2) It is already refcounted etc.

Caveats are:
1) It does not work for standalone. We can extend it to do that
2) You will have to make your component {{startup=lazy}}
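The four steps above follow a familiar acquire-on-load, release-on-close pattern, which can be sketched independently of Solr. Everything in this block is hypothetical and for illustration only: RefCountedRepo stands in for the repository, FakeCore stands in for a core with close hooks, and the loader function stands in for fetching from the blob store. None of these are the actual Solr classes or method names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical ref-counted repository: loads on first use, evicts on last release. */
final class RefCountedRepo<T> {
  private static final class Entry<T> {
    final T value;
    int refs;
    Entry(T value) { this.value = value; }
  }

  private final Map<String, Entry<T>> entries = new HashMap<>();
  private final Function<String, T> loader; // stands in for fetching from the blob store

  RefCountedRepo(Function<String, T> loader) { this.loader = loader; }

  /** Step 2: get the shared content and count the new user. */
  synchronized T incRef(String name) {
    Entry<T> e = entries.get(name);
    if (e == null) {
      e = new Entry<>(loader.apply(name));
      entries.put(name, e);
    }
    e.refs++;
    return e.value;
  }

  /** Step 4: release; the entry is dropped when the last user is gone. */
  synchronized void decRef(String name) {
    Entry<T> e = entries.get(name);
    if (e != null && --e.refs == 0) {
      entries.remove(name);
    }
  }

  synchronized boolean isLoaded(String name) { return entries.containsKey(name); }
}

/** Hypothetical stand-in for a core that runs registered hooks when it closes. */
final class FakeCore {
  private final List<Runnable> closeHooks = new ArrayList<>();
  void addCloseHook(Runnable hook) { closeHooks.add(hook); } // step 3
  void close() { closeHooks.forEach(Runnable::run); }

  public static void main(String[] args) {
    RefCountedRepo<Map<String, String>> repo =
        new RefCountedRepo<Map<String, String>>(n -> new HashMap<String, String>());
    FakeCore core1 = new FakeCore();
    FakeCore core2 = new FakeCore();
    for (FakeCore core : List.of(core1, core2)) {
      repo.incRef("myCsvFile/1");
      core.addCloseHook(() -> repo.decRef("myCsvFile/1"));
    }
    core1.close();
    System.out.println("loaded after first close: " + repo.isLoaded("myCsvFile/1"));
    core2.close();
    System.out.println("loaded after second close: " + repo.isLoaded("myCsvFile/1"));
  }
}
```

Pairing every incRef with a close hook is what keeps the count honest; a component that forgets the hook would pin the content in memory for the life of the node.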
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15156203#comment-15156203 ]

Gus Heck commented on SOLR-8349:

*WRT #3/deterministic behavior*: Here's the use case:
# server is started, it loads a component that loads a file and creates resource A version 1 in memory
# some time later the file is updated, and these updates need to be deployed
# the new version 2 of the file is deployed to the server and the core is unloaded
# the core is then loaded again, brought online and made available to users.

We now cannot predict which version of the resource is available to the users. If GC occurred and the resource was collected between steps 3 and 4, the new resource will become available as the user would expect. If not, the old resource will show up on calls to getResource() until a GC occurs in which the JVM decides to clear the weak reference to it. If the component caches a (hard) reference to the resource, the new version of the resource will never get loaded. The previous system without weak references did not allow the old resource to ever be unloaded (and hence was deterministic). Now the behavior is a product of GC timing and the internal aspects of how the component was programmed. I would like to subsequently (in some later patch) make it possible to refresh the resource in a predictable manner without restarting the whole node.

*WRT hard references*: I want people to have success, not missteps and re-implementation, using my feature :). For this reason I really like the weak references suggestion you made, but I want to manage it for them and not burden them with handling it properly. The submitted approach was meant to not bite the user who writes a component that never holds a reference to the resource.
This would be a reasonable naive implementation for someone who knows nothing about the internals of Solr and assumed they shouldn't hold the reference, to ensure that the same resource was always seen everywhere.

*WRT the abstraction*: it's there to get the loading code added to the deferredCallables list. SolrResourceLoader has no knowledge of the SolrCore until the core calls inform(core) on it. Unfortunately inform(resourceLoader) gets called before that. So any attempt to cast and do ((SolrResourceLoader)loader).getCore().getContainer() in the implementation of ResourceLoader#inform(loader) will throw an NPE. That's why the deferredCallables list exists. I chose to add the abstraction to enable the loader/core to manage hard references and allow the processing to become uniform, with all loads being deferred. I wanted the folks attempting to use this to have a clear, intuitive path to do so, and the interfaces are meant to guide them into doing the right thing without needing to know all the details.

It's worth noting that if the goal is a simple patch, the way to eliminate the MOST complexity from the patch is to have the component author manage references, and change:

{code}
resourceLoader.inform(resourceLoader);
resourceLoader.inform(this); // last call before the latch is released.
{code}

to

{code}
resourceLoader.inform(this);
resourceLoader.inform(resourceLoader); // last call before the latch is released.
{code}

In that case, casting and navigating to the container in inform(ResourceLoader) will work and we can lose the abstractions, the deferred callables and associated latch/synchronization, and the object reference code goes away too... but I definitely don't feel qualified to change the order in which components are made aware of things. I have no idea if any code out there would be relying on this order of inform() calls in some way.
Lastly, Object keys are certainly possible, though this does reintroduce a vector for class loader memory leaks as previously discussed. I left this out because we were not supporting the lucene analyzers yet, and I wasn't yet adding "automatic" keys from configuration nodes. Automatic keys would be a nice improvement and would ensure implementors don't need to think so hard to use the feature. I'm amenable to trying to add that now if you like, though the option to supply one's own key should remain.
> Allow sharing of large in memory data structures across cores > - > > Key: SOLR-8349 > URL: https://issues.apache.org/jira/browse/SOLR-8349 > Project: Solr > Issue Type: Improvement > Components: Server >Affects Versions: 5.3 >Reporter: Gus Heck > Attachments: SOLR-8349.patch, SOLR-8349.patch > > > In some cases search components or analysis classes may utilize a large > dictionary or other in-memory structure. When multiple cores are loaded with > identical configurations utilizing this large in memory
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15150714#comment-15150714 ] Gus Heck commented on SOLR-8349: Actually in my particular case since all cores need it anyway I'm just loading using parallel streaming and enough threads to soak up all CPU on the box until processing finishes... That's probably not a good solution for the general use case though. Anywho, after punting Goal 3 guava works easy as pie, updating to merge with latest if any now. > Allow sharing of large in memory data structures across cores > - > > Key: SOLR-8349 > URL: https://issues.apache.org/jira/browse/SOLR-8349 > Project: Solr > Issue Type: Improvement > Components: Server >Affects Versions: 5.3 >Reporter: Gus Heck > Attachments: SOLR-8349.patch > > > In some cases search components or analysis classes may utilize a large > dictionary or other in-memory structure. When multiple cores are loaded with > identical configurations utilizing this large in memory structure, each core > holds it's own copy in memory. This has been noted in the past and a specific > case reported in SOLR-3443. This patch provides a generalized capability, and > if accepted, this capability will then be used to fix SOLR-3443. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149913#comment-15149913 ] David Smiley commented on SOLR-8349: bq. One of the reasons I had the goal of not blocking is the case that motivated this for a client of mine was one with 40 cores in a node all using this one resource, but I can accept that this might be an unusual case. I think you could work around that by having the Search Component lazily load the resource in a background thread, returning a Future. Then when the Search Component is invoked in a search, it grabs what's in the Future, blocking if necessary.
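The background-loading workaround suggested above could look roughly like the following. This is only a sketch under assumptions: `LazySharedResource` is a hypothetical helper, not a Solr class, and it uses the JDK's `CompletableFuture` on the common pool rather than anything Solr provides.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

/**
 * Hypothetical holder (not a Solr class): starts loading on a background
 * thread and lets callers block only when they actually need the value.
 */
final class LazySharedResource<T> {
    private final CompletableFuture<T> future;

    LazySharedResource(Supplier<T> loader) {
        // supplyAsync runs the loader on the common ForkJoinPool,
        // so this constructor returns immediately.
        this.future = CompletableFuture.supplyAsync(loader);
    }

    /** Blocks until loading finishes; loader failures are rethrown unchecked. */
    T get() {
        return future.join();
    }

    /** True once the loader has finished, successfully or not. */
    boolean isDone() {
        return future.isDone();
    }
}

class LazySharedResourceDemo {
    public static void main(String[] args) {
        LazySharedResource<String> dictionary =
            new LazySharedResource<>(() -> "large dictionary contents");
        // A search request would call get() and wait only if loading is still running.
        System.out.println(dictionary.get());
    }
}
```

The trade-off is the one discussed in this thread: core initialization is not blocked, but the first request that needs the resource may have to wait for it.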
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149420#comment-15149420 ] David Smiley commented on SOLR-8349: I see your point on CacheLoader needing to define its loading up-front; the caller of get() doesn't specify. I was confusing it with JDK's ConcurrentHashMap, which I think in JDK 8 has a Function arg overloaded version. One way around this is for the Key to be some new class, hypothetically named say LoadingCacheKey, that contains the real intended key (e.g. Node.java) plus a reference to a JDK 8 Function (or Supplier or something like that) that the CacheLoader will call. Then the CacheLoader would null out that function in the key. Admittedly that's a bit hacky, but it nonetheless puts most of the tricky concurrency code into a dependent library where others maintain it; not us. That's my intent with this. I'll comment more later; gotta go right now.
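A minimal sketch of the LoadingCacheKey idea follows. To keep it self-contained it does not use Guava: a `ConcurrentHashMap` with one fixed loader function stands in for a loading cache whose loader is defined up-front. The names `LoadingCacheKey`, `KeyDrivenCache` and `loadAndForget` are illustrative assumptions, not code from any patch.

```java
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Supplier;

/** Hypothetical key: equality uses only realKey; the supplier rides along. */
final class LoadingCacheKey {
    private final Object realKey;
    private volatile Supplier<?> supplier; // cleared after first use

    LoadingCacheKey(Object realKey, Supplier<?> supplier) {
        this.realKey = Objects.requireNonNull(realKey);
        this.supplier = supplier;
    }

    /** Runs the supplier once and drops it so the cache does not pin it. */
    Object loadAndForget() {
        Supplier<?> s = supplier;
        supplier = null;
        if (s == null) {
            throw new IllegalStateException("supplier already consumed");
        }
        return s.get();
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof LoadingCacheKey
            && realKey.equals(((LoadingCacheKey) o).realKey);
    }

    @Override
    public int hashCode() {
        return realKey.hashCode();
    }
}

/** Stand-in for a loading cache whose single loader is fixed at build time. */
final class KeyDrivenCache {
    // The one loader, defined up-front; it only ever sees the key.
    private static final Function<LoadingCacheKey, Object> THE_LOADER =
        LoadingCacheKey::loadAndForget;

    private final ConcurrentHashMap<LoadingCacheKey, Object> map =
        new ConcurrentHashMap<>();

    Object get(LoadingCacheKey key) {
        // computeIfAbsent invokes the loader at most once per distinct realKey.
        return map.computeIfAbsent(key, THE_LOADER);
    }
}
```

Because equality ignores the supplier, a second caller with the same real key gets the cached value and its own supplier is never invoked.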
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149286#comment-15149286 ] Gus Heck commented on SOLR-8349: I did see .build(CacheLoader). :) My writeup was an explanation of why that (and other tactics) don't work (for the defined goals). What would you pass into build(CacheLoader)? How will it know how to load the data for a custom component that hasn't even been written yet? Furthermore, that pattern is meant to load on the first get(String) for a key, which (without further code to cause a warming-get before core init completes) violates Goal 5, and could block a query or update for the time it takes to load (~15-20 sec if I employ all 16 CPU with parallel streams in my case). If you do use build(CacheLoader) you will have to make your implementation of CacheLoader a wrapper object that can accept arbitrary loaders from yet-to-be-written components, and then manage the acceptance and invocation of those individual loaders on its own. One of the reasons I had the goal of not blocking is the case that motivated this for a client of mine was one with 40 cores in a node all using this one resource, but I can accept that this might be an unusual case. I'll submit a patch backing off on the "don't block loading other cores" goal soon.
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148911#comment-15148911 ] David Smiley commented on SOLR-8349: bq. Change our goals, and use guava (allow cores loading the same resource to all block each other until loading is done) Seems totally fine to me provided this blocking is limited to the particular resource being loaded -- what Guava gives you via this: http://docs.guava-libraries.googlecode.com/git/javadoc/com/google/common/cache/CacheBuilder.html#build(com.google.common.cache.CacheLoader) It's a shame you spent so long on that looong write-up; as I read I wanted to say, _just stop please, stop already, just go see CacheBuilder.build(CacheLoader)_ :-)
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148797#comment-15148797 ] Gus Heck commented on SOLR-8349:

Executive summary of the last overly long comment:
# Guava's cache is indeed a cool cache,
# It doesn't support arbitrary loaders in a way that is consistent with my design,
# Either we can do one of these things (AFAICT):
## Use the working code I wrote (no guava cache)
## Change our goals, and use guava (allow cores loading the same resource to all block each other until loading is done)
## Use guava and wrap it in additional loader management code of similar complexity to my original code.
# Weak/soft values require someone to hold the strong reference.

New thought this morning: I could probably add methods and a list to the SolrCore object for the purpose of giving it a reference to the resource at load time, thus tying the life-cycle of the resource to the object we want it to live and die with. Then weak values would probably work fine.
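The "new thought" above, where the core pins the resource while the container-level cache holds it only weakly, might be sketched as follows. `WeakResourceCache` and `CoreLike` are hypothetical stand-ins (the latter for SolrCore); nothing here is taken from the actual patch.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical container-level cache that never keeps a value alive itself. */
final class WeakResourceCache {
    private final ConcurrentHashMap<String, WeakReference<Object>> map =
        new ConcurrentHashMap<>();

    void put(String key, Object value) {
        map.put(key, new WeakReference<>(value));
    }

    /** Returns null if the key is unknown or the value was collected. */
    Object get(String key) {
        WeakReference<Object> ref = map.get(key);
        return ref == null ? null : ref.get();
    }
}

/** Hypothetical stand-in for a core: holds the strong references. */
final class CoreLike {
    private final List<Object> pinned = new ArrayList<>();

    /** Pins the value for this core's lifetime and publishes it to the cache. */
    synchronized Object loadInto(WeakResourceCache cache, String key, Object value) {
        pinned.add(value);
        cache.put(key, value);
        return value;
    }

    /** After close, the resource is collectable once no other core pins it. */
    synchronized void close() {
        pinned.clear();
    }
}
```

With this arrangement the resource stays reachable exactly as long as at least one open core pins it, which closes the window between loading and first use that a bare weak-values cache leaves open.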
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15147835#comment-15147835 ] Gus Heck commented on SOLR-8349:

Guava sounded like a great idea. I love guava caches and have used them frequently in the past. When you mentioned it I thought to myself, why didn't I think of that... eventually I remembered that I had, but decided not to go there because I didn't want to give control of memory issues to a library. On reflection that may have been a bit too draconian, so I gave guava caches a whirl. For the first time in my experience, I think I've found a use case they don't cover well. My design for this feature desires the following behavior:
# get should return null if the resource is requested when no attempt has been made to load it.
# load a resource only once; no provision for update or replacement is presently required, so first one in wins is just fine.
# subsequent attempts to load the resource are a non-blocking no-op, allowing cores 2 through N that require the resource to continue to configure themselves while the resource is being loaded (possibly starting the loading of resources for other components).
# loading will be complete before the server completes startup and begins servicing requests. If a resource was supposed to be loaded and there was no error during loading, the component can rely on the existence of the resource at query (or update) time.
# never allow a query or update to solr to initiate the loading, for obvious latency reasons.
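For orientation, the goals listed above could be approximated with plain JDK classes as in the sketch below. This is not the code from the attached patch; `SharedResourceRegistry` and its method names are assumptions made for illustration, and `get` here waits when a load is in flight, which is one possible reading of goal 1.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical registry; names are illustrative, not from the patch. */
final class SharedResourceRegistry {
    private final ConcurrentHashMap<String, CompletableFuture<Object>> resources =
        new ConcurrentHashMap<>();

    /** Goals 2 and 3: the first loader wins; later calls return at once. */
    boolean load(String key, Callable<?> loader) {
        CompletableFuture<Object> mine = new CompletableFuture<>();
        if (resources.putIfAbsent(key, mine) != null) {
            return false; // another core is loading, or has loaded, this key
        }
        CompletableFuture.runAsync(() -> {
            try {
                mine.complete(loader.call());
            } catch (Throwable t) {
                mine.completeExceptionally(t);
            }
        });
        return true;
    }

    /** Goals 1 and 5: never triggers a load; null if none was requested. */
    Object get(String key) {
        CompletableFuture<Object> f = resources.get(key);
        return f == null ? null : f.join();
    }

    /** Goal 4: call at the end of startup; rethrows any loading failure. */
    void awaitAll() {
        resources.values().forEach(CompletableFuture::join);
    }
}
```

A caller would invoke `load` during core initialization and `awaitAll` before the server starts servicing requests; a failed load surfaces there as a `CompletionException`.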
I realized that the unit test I supplied doesn't fully test #3 above, so I modified it like this:
{code}
+final String[] lastThread = new String[1];
 // now add something to the map while it's under load
 try {
 Thread[] set1 = new Thread[100];
Then
 calls.incrementAndGet();
-});
+ lastThread[0] = Thread.currentThread().getName();
+}, "thread"+i); // give the threads names like thread0 thread1 thread2 etc
 }
 for (Thread thread : set1) {
 thread.start();
+Thread.sleep(10);
 }
 while (calls.get() < 100) {
 Thread.sleep(100);
 }
Then
 Thread.sleep(100);
 long count1b = counter[0];
 long count2b = counter2[0];
+
+ // make sure other loaders didn't wait and thread0 completed last
+ assertEquals("thread0", lastThread[0]);
{code}
This modified test still passes just fine with my original code. But so far I haven't made it pass with guava. The naive first attempt was to use get(key,loader), ignore the return value, and use getIfPresent(key) to service get requests (handling case #1), but this was not good:
# get(key,loader) ignoring the return value will block all cores that depend on this resource. The test fails with loader99 updating the array last.
# wrapping the get(key,loader) in a thread fails because now they all complete before loading completes, and the test gets null when it checks the assert that the value was set. (It would eventually become set, but this is just like completing server startup before loading is complete, and then failing queries, probably with an NPE.) Also, failing server startup when loading fails becomes problematic since we have to get the exception back out of the thread.

After those attempts I found myself getting into tracking what keys have already had loaders supplied and trying to coordinate blocking/pass through on my own, and of course the access to the collection recording what I've seen has to be synchronized... etc. This winds up negating most of the benefit of the guava cache.
This is too bad, because at first it looked like guava would replace a bunch of my code with one-liners. My guess is that the guava folks didn't supply put(k,loader) because most of the time you aren't accepting arbitrary loaders in the first place. If the loader is known in advance, then LoadingCache works nicely. Unfortunately, for us, the loader is not known in advance. I don't see any reasonable way to write "one loader to rule them all" to allow us to use a LoadingCache either; this would just move all my tracking code into TheOneLoader.java :). Possibly a good reorganization code-wise, but we are still writing threaded code and not getting much simplification out of using guava caches that way. If we are going to write synchronized code, we should just make it do the right thing directly, I think.

Finally, weakValues() (or some equivalent) sounds good, but the problem is that the component supplies a loader and the container loads it at time A, and then at time B the component asks for the result of the load. Even if the component caches the results of that first request, the time between A and B leaves the object just loaded vulnerable to GC, since the only thing referencing it is the weak values cache. One could require the loader to
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142878#comment-15142878 ] David Smiley commented on SOLR-8349: First, I propose the container level cache be committed by itself (no Lucene abstraction or anything explicitly using it). It will enable us Solr hackers to at least explicitly use it in our components without resorting to using a static field on a top-level class-loaded class. A positive step forward. Now to clear something up: I definitely do not propose that Lucene have any sort of dependency on Solr. And I further propose that no change of any kind is needed to any of Lucene. I'll be more specific now that I see the relevant part of Solr. I propose that Solr's FieldTypePluginLoader.readAnalyzer (called by IndexSchema) detect a flag attribute on a TokenizerFactory, TokenFilter, or CharFilterFactory that declares that it be global, and if so load it with a CoreContainer ResourceLoader (not a SolrCore one) into the shared cache (if it's not already there). To be specific, this doesn't affect the ResourceLoader abstraction, and components don't need to be written any differently.
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142799#comment-15142799 ] Gus Heck commented on SOLR-8349:

I am not sure I understand what you mean by loading the analysis component into the container cache. The analysis component in lucene needs some way to reference the container and ask for the shared resource. My assumption was that it was not ok to add dependencies on Solr to Lucene. Otherwise I could just cast to SolrResourceLoader in the analysis component in question. Did you look at SOLR-3443, where I use this patch to implement the feature for HunspellStemFilterFactory.java? There I do this:
{code}
if (loader instanceof ContainerResourceSharing) {
  resourceSharing = (ContainerResourceSharing) loader;
}
{code}
This then allows me to have a method like:
{code}
public Dictionary getDictionary() {
  return this.dictionary == null ? (Dictionary) resourceSharing.getContainerResource(resourceKey) : this.dictionary;
}
{code}
If I can use a Solr class, that could just as easily say
{code}
if (loader instanceof SolrResourceLoader) {
  resourceSharing = (SolrResourceLoader) loader;
}
{code}
One could also add methods to ResourceLoader itself, but then all forms of resource loader have to deal with implementing this method.
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143872#comment-15143872 ] Gus Heck commented on SOLR-8349:

Certainly we can punt enabling this for lucene level things to start with, and then tackle that in SOLR-3443 or in a separate Lucene ticket if appropriate. I'm totally fine with that. My motivating use case that got me started on this was a SearchComponent anyway. Now that I understand it, your idea is interesting. It generally sounds like it would allow a similar memory savings, though I had assumed promoting things to a higher class loader level would be undesirable. Here are some things that occurred to me as I pondered your suggestion...
# My (potentially erroneous) assumption is that the whole reason for the class loader separation is to allow long term stability with loading/unloading cores and not retain references to classes and objects that aren't needed anymore. In the extreme, if cores were loaded in the same class loader as the container this issue would be less complex.
# Putting the entire complexity of the custom Filter/Analyzer etc into the cache greatly enhances the chance of a class loader memory leak. Minimizing the complexity of what's cached puts the programmer in the best position to ensure core level classes don't become referenced by objects held in the container. Note that a follow-on to this (or perhaps something required for it?) might be to reference count and unload unused keys.
# That said, why would we do it one way for analysis classes and another for components? If your direction is selected for analysis classes perhaps we should do components that way too?
# If we do that, what's left to be loaded in the core level loader? The non-increasing set of classes never previously loaded as global and whatever is not referenced by any component/analysis class I guess...
# I'm a little concerned about how we will manage to automatically create appropriate keys in the cache. The same analysis class or component may be configured multiple times, and so we need a key that hashes the important configuration parameters to distinguish identical instances from variants. Automatic determination of "important" seems dicey. We could simply be pessimistic and use every configuration parameter we can find, but then we need to know which fields are representative of configuration parameters (annotation? but then we're modifying lucene again, drat) or intercept this information as we read the configuration, before we create the instance of the class. Do we have classes that configure complex sub components and hold those as fields?

In SOLR-3443 I do the following to generate a key for the resource cache:
{code}
md5.update(cs.encode(dictionaryFiles).array());
md5.update(cs.encode(affixFile).array());
md5.update(cs.encode(String.valueOf(ignoreCase)).array());
md5.update(cs.encode(String.valueOf(longestOnly)).array());
// ** SNIP ** //
resourceKey = "org.apache.lucene.analysis.hunspell.dictionary." + configHash;
{code}
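The key-generation idea from the comment above (hash the configuration parameters that distinguish one instance from another) could be written out in full roughly as below. This is a sketch, not the SOLR-3443 code: `ResourceKeys.configKey` is a hypothetical helper, and it uses `String.getBytes` with a length prefix instead of `Charset.encode(...).array()`, since a `ByteBuffer`'s backing array is not guaranteed to match the encoded length.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper for deriving a cache key from configuration values. */
final class ResourceKeys {
    private ResourceKeys() {}

    static String configKey(String prefix, String... params) {
        MessageDigest md5;
        try {
            md5 = MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException("MD5 unavailable", e);
        }
        for (String param : params) {
            byte[] bytes = String.valueOf(param).getBytes(StandardCharsets.UTF_8);
            // Length prefix so ("ab", "c") and ("a", "bc") hash differently.
            md5.update(ByteBuffer.allocate(4).putInt(bytes.length).array());
            md5.update(bytes);
        }
        StringBuilder key = new StringBuilder(prefix);
        for (byte b : md5.digest()) {
            key.append(String.format("%02x", b));
        }
        return key.toString();
    }
}
```

MD5 is used here only as a fingerprint for configuration equality, mirroring the snippet above; it is not serving any security purpose.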
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15144017#comment-15144017 ] David Smiley commented on SOLR-8349:

1. the classLoader separation is simply because there are libs and conf/ resources per-core, as cores might be for entirely separate purposes with different dependencies that might not even be compatible if it were all together.
2. RE memory-leak: I propose the weakValues() feature of Guava's cache. It's a nice cache :-) No reference counting needed; we're in a VM that GCs.
3. you're right; this isn't just about analysis components. But it would probably be incompatible with anything that implements CoreAware or SchemaAware as both those things are tied to a SolrCore.
4. RE "what's left to be loaded in the core level loader": I think this re-use feature should be explicitly opt-in. Most stuff will continue as-is -- core level loader. Docs for HunspellFactory on the ref guide could be augmented to suggest flagging global-reuse because we know it uses a lot of memory.
5. RE keys: I think just the W3C DOM Node.java instance is probably fine. We might want to double-check its equals() method makes sense and doesn't refer to the parent, or if it does find a way to clone/detach it. Actually I think it does; so we'd call cloneNode(true) and use the result of that. I'm not concerned on the efficiency of the keys; this cache isn't going to be hit hard.

Anyway, let's continue this discussion if need be on a follow-on issue for actually _using_ the cache that this issue will create. Cool?
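On point 5 above, the DOM API calls involved can be tried out with the JDK's built-in parser. The sketch below assumes a hypothetical helper class `ConfigNodeKeys` and a made-up configuration snippet; it shows that `cloneNode(true)` yields a deep copy with no parent and that `isEqualNode` compares structure. The `org.w3c.dom.Node` interface itself does not define `equals()`/`hashCode()` semantics, so a real cache key would probably need a wrapper built on calls like these.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

/** Hypothetical helper; the XML passed in is an illustrative config snippet. */
final class ConfigNodeKeys {
    private ConfigNodeKeys() {}

    /** Parses a small XML string with the JDK's default DOM parser. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse config snippet", e);
        }
    }

    /** Deep copy of the node; the copy has no parent link. */
    static Node detachedCopy(Node configNode) {
        return configNode.cloneNode(true);
    }

    /** Structural comparison: name, attributes, and children. */
    static boolean sameConfig(Node a, Node b) {
        return a.isEqualNode(b);
    }
}
```

Detaching the copy matters for the memory-leak concern raised earlier in the thread: a key that still pointed at its parent would keep the whole configuration document reachable from the container-level cache.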
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15140881#comment-15140881 ] Gus Heck commented on SOLR-8349: Can this make it into 6.0? I'm slightly surprised that a feature that can save memory, especially for nodes with large numbers of cores hasn't generated more interest.
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15140931#comment-15140931 ] David Smiley commented on SOLR-8349: I take back my claim that I'll commit it if only it used Guava. I forgot about the Lucene layer part. I think you should file a Lucene issue on introducing that interface. It'll get input from more core Lucene devs.
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15140924#comment-15140924 ] David Smiley commented on SOLR-8349: Instead of a hand-rolled cache with separate per-key locks I suggest Guava's cache which already does this. See: https://code.google.com/p/guava-libraries/wiki/CachesExplained I'll commit your patch to 6 if you use that. bq. I'm slightly surprised that a feature that can save memory, especially for nodes with large numbers of cores hasn't generated more interest. I think internal plumbing issues like this capture less interest since it's something another Solr hacker can work around. I've solved what you're solving here before for a client but without hacking Solr. It wasn't elegant... but I recall I put a lib on the top level classpath (not per-core) and I used a static member to hold the cache.
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15141171#comment-15141171 ] Gus Heck commented on SOLR-8349: Thanks for the feedback Dave, I generally like Guava caches, so this sounds like a good idea. I seem to recall that the Lucene part was motivated by the desire to have this fix SOLR-3443. I suspect it could be pulled out if needed. Should the Lucene ticket simply point to this one, or do I need to generate a separate patch for it that breaks things out?
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142190#comment-15142190 ] David Smiley commented on SOLR-8349: I put a little more thought into this issue now. The cache at the CoreContainer makes sense but I'm not convinced that the Lucene layer needs a new abstraction. Instead, I think Solr could be enhanced to load some analysis components into this CoreContainer cache. The question is which ones? Not all... some components will refer to resources that are local to a SolrCore. I'm not sure how easy it would be for Solr to detect that automatically; probably not easy. That leaves the possibility of a new attribute on the core to designate it as globally shared. What do you think?
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15071258#comment-15071258 ] Gus Heck commented on SOLR-8349: Has anyone had time to look at this?
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15045423#comment-15045423 ] Gus Heck commented on SOLR-8349: I'm sure folks have been busy getting stuff into 5.4, but with the RC nearly produced, perhaps someone can review this now? It's intended for 6.x, but I'd like feedback/review so that I have something to chew on (unless of course it's already perfect :) ). I would definitely like someone who's comfortable with {{CoreContainer}} and class loading in Solr to take a peek. I put the definition for {{ContainerResourceSharing}} in Lucene package space. It needed to be there so that the hunspell analyzer for SOLR-3443 (or any other Lucene class that might load a large file that doesn't change from core to core) could leverage it without depending on Solr classes. This patch adds no implementation in the Lucene world, only for {{SolrResourceLoader}}. Folks who are using Lucene in a container other than Solr can implement it for their container or not, as they desire. The intended pattern for usage within Solr (i.e. {{SearchComponent}} implementations) is to either navigate to {{CoreContainer}} from a core if {{CoreAware}}, or cast the {{ResourceLoader}} if only {{ResourceLoaderAware}}. Either path leads to the same place, as {{SolrResourceLoader}} merely delegates to its {{CoreContainer}} anyway. At the Lucene level, code would want to test {{instanceof ContainerResourceSharing}} and do the right thing based on the result, retaining existing behavior to support use of Lucene outside of containers or in containers without support. Let me know if there are better ideas here...
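The {{instanceof ContainerResourceSharing}} pattern described above might look roughly like the sketch below. Everything here is an assumption made for illustration: the method {{getContainerResource}}, its signature, and the classes {{PlainLoader}} and {{SharingLoader}} are hypothetical, and the nested {{ResourceLoader}} is an empty stand-in for Lucene's real interface, not a copy of it.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** Illustration only; none of these types are taken from the patch. */
final class SharingSketch {
    /** Empty stand-in for Lucene's ResourceLoader (the real one has methods). */
    interface ResourceLoader {}

    /** Assumed shape of the sharing interface; the real signature may differ. */
    interface ContainerResourceSharing {
        <T> T getContainerResource(String key, Supplier<T> loader);
    }

    /** A loader in an environment without container support. */
    static final class PlainLoader implements ResourceLoader {}

    /** A loader that delegates to one container-wide map, as Solr's might. */
    static final class SharingLoader implements ResourceLoader, ContainerResourceSharing {
        private final Map<String, Object> shared = new ConcurrentHashMap<>();

        @Override
        @SuppressWarnings("unchecked")
        public <T> T getContainerResource(String key, Supplier<T> loader) {
            return (T) shared.computeIfAbsent(key, k -> loader.get());
        }
    }

    /** Lucene-level code: share when the loader supports it, else build a private copy. */
    static <T> T loadDictionary(ResourceLoader loader, String key, Supplier<T> build) {
        if (loader instanceof ContainerResourceSharing) {
            return ((ContainerResourceSharing) loader).getContainerResource(key, build);
        }
        return build.get(); // existing behavior: one copy per caller
    }
}
```

With a {{SharingLoader}}, two calls to {{loadDictionary}} for the same key return the same object; with a {{PlainLoader}}, each call builds its own copy, which corresponds to retaining existing behavior outside a supporting container.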
[jira] [Commented] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15032007#comment-15032007 ] Gus Heck commented on SOLR-8349: Further background for this feature... a precursor version of this patch (which does not have the interface and thus can't fix SOLR-3443) is in use at a client where we had a ~900MB hash map for looking up lat/lon from custom query parameters. This map was needed for all our cores. This is anticipated to save >35GB of RAM since 40+ cores will live on a machine. The original implementation of this lat/lon lookup feature for the client attempted to use a static field, but the independent class loaders (each core gets its own class loader) loaded fresh copies of the class, each with its own static map. It's worth noting that analyzers such as the hunspell one in SOLR-3443 are not loaded by the core's class loaders, and the excess memory there is held in a member field per instance, so a static-variable-based solution would be possible there. I thought it was better to provide a uniform solution. Another possible follow-on feature (or perhaps an enhancement to this one) would be a means of reference counting the shared resources and removing them. In the present (initial) patch, a long-running Solr instance where lots of cores are added and removed would potentially have unused container resources hanging around (though they would become used again with no loading time if a core that required them were re-installed). I didn't go into the complexity of removal because I wasn't sure if it would be deemed necessary.
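As a rough idea of what the reference-counting follow-on mentioned above could involve, here is a minimal sketch. It is not part of the patch; the class {{RefCountedResources}} and its methods {{acquire}}, {{release}} and {{isLoaded}} are hypothetical names, and a single lock is used purely to keep the example short.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical reference-counted registry of shared resources; illustration only. */
final class RefCountedResources {
    private static final class Entry {
        final Object value;
        int refs; // number of cores currently holding this resource

        Entry(Object value) {
            this.value = value;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();

    /** Called when a core starts using a resource; loads it on first use. */
    @SuppressWarnings("unchecked")
    synchronized <T> T acquire(String key, Supplier<T> loader) {
        Entry e = entries.get(key);
        if (e == null) {
            e = new Entry(loader.get());
            entries.put(key, e);
        }
        e.refs++;
        return (T) e.value;
    }

    /** Called when a core is unloaded; drops the resource once nobody uses it. */
    synchronized void release(String key) {
        Entry e = entries.get(key);
        if (e != null && --e.refs == 0) {
            entries.remove(key);
        }
    }

    /** True while at least one core still holds the resource. */
    synchronized boolean isLoaded(String key) {
        return entries.containsKey(key);
    }
}
```

The trade-off noted in the comment shows up directly: once the last core releases a key, the resource becomes collectable, but a core re-installed later pays the full loading time again. Loading under one global lock would also serialize unrelated loads, so a real implementation would likely need finer-grained locking.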