[jira] [Commented] (SOLR-12590) Improve Solr resource loader coverage in the ref guide
[ https://issues.apache.org/jira/browse/SOLR-12590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592412#comment-16592412 ]

ASF subversion and git services commented on SOLR-12590:
--------------------------------------------------------

Commit 95cb7aa491f5659084852ec29f52cc90cd7ea35c in lucene-solr's branch refs/heads/jira/http2 from [~steve_rowe]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=95cb7aa ]

SOLR-12590: Improve Solr resource loader coverage in the ref guide


> Improve Solr resource loader coverage in the ref guide
> ------------------------------------------------------
>
>                 Key: SOLR-12590
>                 URL: https://issues.apache.org/jira/browse/SOLR-12590
>             Project: Solr
>          Issue Type: Task
>      Security Level: Public (Default Security Level. Issues are Public)
>          Components: documentation
>            Reporter: Steve Rowe
>            Assignee: Steve Rowe
>            Priority: Major
>             Fix For: master (8.0), 7.5
>
>         Attachments: SOLR-12590.patch, SOLR-12590.patch
>
> In SolrCloud, storing large resources (e.g. binary machine learned models) on
> the local filesystem should be a viable alternative to increasing ZooKeeper's
> max file size limit (1MB), but there are undocumented complications.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-12590) Improve Solr resource loader coverage in the ref guide
[ https://issues.apache.org/jira/browse/SOLR-12590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16590668#comment-16590668 ]

ASF subversion and git services commented on SOLR-12590:
--------------------------------------------------------

Commit 95cb7aa491f5659084852ec29f52cc90cd7ea35c in lucene-solr's branch refs/heads/master from [~steve_rowe]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=95cb7aa ]

SOLR-12590: Improve Solr resource loader coverage in the ref guide
[jira] [Commented] (SOLR-12590) Improve Solr resource loader coverage in the ref guide
[ https://issues.apache.org/jira/browse/SOLR-12590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16590667#comment-16590667 ]

ASF subversion and git services commented on SOLR-12590:
--------------------------------------------------------

Commit 523295666f4a7360f09a30cb006153f8b9c2f9bf in lucene-solr's branch refs/heads/branch_7x from [~steve_rowe]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=5232956 ]

SOLR-12590: Improve Solr resource loader coverage in the ref guide
[jira] [Commented] (SOLR-12590) Improve Solr resource loader coverage in the ref guide
[ https://issues.apache.org/jira/browse/SOLR-12590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16590208#comment-16590208 ]

Cassandra Targett commented on SOLR-12590:
------------------------------------------

I took a look at the new patch - +1 overall. IMO it's ready to commit whenever you're ready. Thanks.
[jira] [Commented] (SOLR-12590) Improve Solr resource loader coverage in the ref guide
[ https://issues.apache.org/jira/browse/SOLR-12590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576803#comment-16576803 ]

Steve Rowe commented on SOLR-12590:
-----------------------------------

bq. So from this exploration I think the wrapper model concept introduced in SOLR-11250 is currently the only way to support large models (without changing ZooKeeper's max file size limit).

Thanks for the analysis Christine, I'll leave the wrapper model docs intact then.
[jira] [Commented] (SOLR-12590) Improve Solr resource loader coverage in the ref guide
[ https://issues.apache.org/jira/browse/SOLR-12590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576762#comment-16576762 ]

Christine Poerschke commented on SOLR-12590:
--------------------------------------------

bq. ... Do you have the bandwidth to test this assertion? ...

Hmm, ok, so i've explored reaching the {{// delegate to the class loader (looking into $INSTANCE_DIR/lib jars)}} code path in [ZkSolrResourceLoader|https://github.com/apache/lucene-solr/blob/releases/lucene-solr/7.4.0/solr/core/src/java/org/apache/solr/cloud/ZkSolrResourceLoader.java#L122] for large learning-to-rank models, and, well, here's just some notes from that really:

* We have a [ManagedModelStore|https://github.com/apache/lucene-solr/blob/releases/lucene-solr/7.4.0/solr/contrib/ltr/src/java/org/apache/solr/ltr/store/rest/ManagedModelStore.java] using {{"/schema/model-store"}} as the REST endpoint. Conceptually, if the model store itself wasn't there (in ZooKeeper) then in principle looking elsewhere locally might be an option; having said that:
** if there is a (small) model store then perhaps one would wish to keep that, and any alternative additional (large) model store should be separate.
** {{SolrResourceLoader}} has a {{managedResourceRegistry}}, but it's not immediately obvious from a quick look whether {{ZkSolrResourceLoader}} (or something else) has an equivalent which would look locally if the resource is not there in ZooKeeper.
* Models use features, and we have a [ManagedFeatureStore|https://github.com/apache/lucene-solr/blob/releases/lucene-solr/7.4.0/solr/contrib/ltr/src/java/org/apache/solr/ltr/store/rest/ManagedFeatureStore.java] using {{"/schema/feature-store"}} as the REST endpoint.
** If there was a concept of a (small/regular) model store in ZooKeeper and an (additional/larger) model store locally, then similarly an additional large feature store locally might be logical.
** In such a hypothetical scenario, could models in the large model store use features from the small feature store, and vice versa? What if both places have models with the same name?
** Current code detail: features are conceptually organised into [feature stores|https://lucene.apache.org/solr/guide/7_4/learning-to-rank.html#feature-stores] akin to namespaces, but in terms of implementation they are all persisted in the same place, i.e. {{_schema_feature-store.json}}, matching the {{"/schema/feature-store"}} upload REST endpoint.

So from this exploration I think the wrapper model concept introduced in SOLR-11250 is currently the only way to support large models (without changing ZooKeeper's max file size limit).
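The ZooKeeper-first, filesystem-fallback lookup order being probed in the notes above can be sketched as a standalone toy. This is not Solr's actual `ZkSolrResourceLoader`; the class and method names here are invented for illustration, with a `Map` standing in for the configset znode and a local directory standing in for a node's conf/ directory:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;

// Toy model of the lookup order under discussion. Not Solr code; names
// are made up for illustration.
public class FallbackLookupSketch {
  private final Map<String, byte[]> zkConfigSet; // stands in for the znode
  private final Path localConfDir;               // stands in for local conf/

  public FallbackLookupSketch(Map<String, byte[]> zkConfigSet, Path localConfDir) {
    this.zkConfigSet = zkConfigSet;
    this.localConfDir = localConfDir;
  }

  public InputStream openResource(String resource) throws IOException {
    byte[] zkData = zkConfigSet.get(resource);   // 1. look in "ZooKeeper" first
    if (zkData != null) {
      return new ByteArrayInputStream(zkData);
    }
    Path local = localConfDir.resolve(resource); // 2. fall back to the filesystem
    if (Files.isReadable(local)) {
      return Files.newInputStream(local);
    }
    // 3. (the real loader would also delegate to the classloader here)
    throw new IOException("Can't find resource '" + resource + "'");
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("conf");
    Files.write(dir.resolve("large-model.bin"), new byte[] {42});
    FallbackLookupSketch loader =
        new FallbackLookupSketch(Map.of("small.txt", "zk".getBytes()), dir);
    System.out.println(loader.openResource("small.txt").read());       // served from the map
    System.out.println(loader.openResource("large-model.bin").read()); // served from disk
  }
}
```

Under this sketch a large model placed only on local disk is found by step 2, which is exactly the behaviour whose managed-resource equivalent the notes above could not locate.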
[jira] [Commented] (SOLR-12590) Improve Solr resource loader coverage in the ref guide
[ https://issues.apache.org/jira/browse/SOLR-12590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16567104#comment-16567104 ]

Cassandra Targett commented on SOLR-12590:
------------------------------------------

Oh, also, FYI, the patch doesn't apply anymore after the changes committed with SOLR-11870.
[jira] [Commented] (SOLR-12590) Improve Solr resource loader coverage in the ref guide
[ https://issues.apache.org/jira/browse/SOLR-12590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16564343#comment-16564343 ]

Cassandra Targett commented on SOLR-12590:
------------------------------------------

I finally got a chance (and remembered!) to review this.

My first reaction when I saw the patch was that I wasn't sure about changing the name of the page. The new name is more descriptive of the topic discussed, but there was a reason why all its sibling pages (those under "Configuring solrconfig.xml") include "SolrConfig" in their names, which was to make it clear they all referred to settings and parameters in solrconfig.xml (long, long ago everything there was in a single page). I don't know that that makes sense any more - frankly, I think I was just holding on to it as a historical artifact that probably means very little to anyone else anymore. So, no problems on the page name change.

A couple other things about the content specifically:

* I think we're missing a bit of intro into what we mean by resources here - custom query parser or other type of component jars? files needed by schema classes? LTR models? - as a paragraph before any of the headings start. Just to set expectations.
* The first section, "Resources in ConfigSets on ZooKeeper", feels empty to me. Is it worth mentioning the blob store here (and pointing to it) even though it can only be used for jars, and also mentioning that some resources could be uploaded to ZK (and pointing to that doc in setting-up-an-external-zookeeper-ensemble.adoc)? Upon reading it seems like the first section is supposed to lead into the second, but people sometimes read these things in a more piecemeal way - the first section doesn't answer the question and they're using ZK, so they presume there is no answer to the question.
* Essentially it feels like we're setting up two ways of dealing with "large files" in SolrCloud mode (which was the impetus here): upload them to ZK, or put them on every node. We should state that explicitly, even if one approach is only linked to instead of described on the new page.

The page is better, but I think we're missing a couple more ways we can tie all the options together.
[jira] [Commented] (SOLR-12590) Improve Solr resource loader coverage in the ref guide
[ https://issues.apache.org/jira/browse/SOLR-12590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16562204#comment-16562204 ]

Christine Poerschke commented on SOLR-12590:
--------------------------------------------

bq. ... I'd rather not remove docs unless we can verify the alternative.

+1 to that. A test case specifically for this would seem the ideal/repeatable way to verify it.

bq. ... Do you have the bandwidth to test this assertion? ...

Happy to give it a go.
[jira] [Commented] (SOLR-12590) Improve Solr resource loader coverage in the ref guide
[ https://issues.apache.org/jira/browse/SOLR-12590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560124#comment-16560124 ]

Steve Rowe commented on SOLR-12590:
-----------------------------------

{quote}
bq. Under SolrCloud, resources to be loaded are first looked up in ZooKeeper under the collection's configset znode. If the resource isn't found there, Solr will fall back to loading resources from the filesystem.

A nicely concise and clear lead paragraph, I like it. Side question: was this fallback logic 'always' there (and I just didn't know about it, oops) or is it something introduced in a recent-ish version?
{quote}

I think it's always been there. The comment below (positioned after failing to load a resource from ZooKeeper) dates to 2010, when SolrCloud was committed to Subversion trunk (SOLR-1873):

{code:java|title=ZkSolrResourceLoader.openResource() (branch_7x)}
121:    try {
122:      // delegate to the class loader (looking into $INSTANCE_DIR/lib jars)
123:      is = classLoader.getResourceAsStream(resource.replace(File.separatorChar, '/'));
{code}

bq. So yes, if the wrapper model concept is not or no longer needed then let's not mention it in the documentation.

Do you have the bandwidth to test this assertion? I'd rather not remove docs unless we can verify the alternative.
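The delegation step quoted here can be exercised in isolation. A tiny standalone demo (not Solr code; the class and resource name are invented for illustration) of why the separator replacement matters before asking the classloader:

```java
import java.io.File;
import java.io.InputStream;

// Classloader resource names always use '/', so platform separators are
// normalized first; the classloader then searches the classpath (in Solr's
// case, including jars under $INSTANCE_DIR/lib).
public class ClassLoaderDelegationDemo {
  static String normalize(String resource) {
    return resource.replace(File.separatorChar, '/');
  }

  public static void main(String[] args) {
    String resource = "lang" + File.separatorChar + "stopwords.txt";
    String name = normalize(resource);
    System.out.println(name); // "lang/stopwords.txt" on every platform

    // Delegate to the classloader, the last-resort step in the quoted code:
    InputStream is =
        ClassLoaderDelegationDemo.class.getClassLoader().getResourceAsStream(name);
    System.out.println(is == null ? "not on classpath" : "found on classpath");
  }
}
```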
[jira] [Commented] (SOLR-12590) Improve Solr resource loader coverage in the ref guide
[ https://issues.apache.org/jira/browse/SOLR-12590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560103#comment-16560103 ]

Christine Poerschke commented on SOLR-12590:
--------------------------------------------

bq. ... added info about how the resource loader works. ... the learning-to-rank large model discussion to point to local storage as an alternative, but I think it should be applicable; I haven't tried myself. ... is it possible that that indirection is not necessary?

Good question. In your patch the "Resources in ConfigSets on ZooKeeper" lead paragraph is:

bq. Under SolrCloud, resources to be loaded are first looked up in ZooKeeper under the collection's configset znode. If the resource isn't found there, Solr will fall back to loading resources from the filesystem.

A nicely concise and clear lead paragraph, I like it. Side question: was this fallback logic 'always' there (and I just didn't know about it, oops) or is it something introduced in a recent-ish version?

Either way, if the documentation provides one recommended way of doing things, that should be clearest from a user's point of view. So yes, if the wrapper model concept is not or no longer needed then let's not mention it in the documentation.
[jira] [Commented] (SOLR-12590) Improve Solr resource loader coverage in the ref guide
[ https://issues.apache.org/jira/browse/SOLR-12590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16555776#comment-16555776 ]

Steve Rowe commented on SOLR-12590:
-----------------------------------

test