[ https://issues.apache.org/jira/browse/SOLR-13661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16938830#comment-16938830 ]
Jan Høydahl commented on SOLR-13661: ------------------------------------ I have just looked at some of the code and will not have time for a more thorough review until week after next. Here is a list of my main concerns so far: # My main concern is that too many decisions seem to be made with too few eyes, combined with a goal of merging very soon. # One example of "too few eyes" is that the "package" concept seems to be designed for ONE use case only, customer's internal custom packages, with arbitrary local naming of repos and packages. I think before such a feature goes mainstream, the design should also include converting some of our contrib modules to packages that we release as separate binaries in the mirrors, and enable an "apache" Repo as default. That requires some more thought behind stable name-spacing, so that e.g. “bin/solr install ltr” will mean the same for all customers. Perhaps that would mean some name spacing or name collision resolution, so if you have a custom local repo with a package also called "ltr", then you get an error which can be resolved by qualifying the package name like e.g. "apache:ltr" or "mylocalrepo:ltr". # We need a plan for how 3rd party plugin developers can publish their plugins on their own web site or on GitHub in a well defined way. The use of pf4j-update lib takes care of much of this, and this is also something that can be added incrementally, but the design needs to allow for this. My POC has a RepositoryFactory class that parses the repo URL (e.g. "bin/solr plugin repo add myrepo [https://host.com/repo/name]") and selects the GitHubUpdateRepository if it is a GitHub URL, the ApacheMirrorsUpdateRepository if it is an apache.org address, and the default site/FS repo else. Each of these handle the download process and signature verification in a different way. # Hot/Cold deploy. I don't like systems where you, as part of the install need to spin up a server. We already have this with setting urlScheme in ZK for HTTPS. But ideally it should be possible to do a Solr install including plugins before you need to spin up Solr. Elasticsearch uses such a static plugin installer (but also don't support hot install). Having a "staging" folder where you can drop package ZIP files (or JARs) where the node can self-install packages during first startup could be one way to handle this. # Robustness during upgrades is another concern. I don't see mentioned in the design doc what happens during a Solr upgrade. We should think through the scenario for both minor and major version upgrade for Solr, and then I mean rolling upgrade. Having ZK as only master for what version of a plugin should be used is probably not sufficient, as during a rolling Solr upgrade, you could have one node on 8.3 and another node on, say, 9.0. And you could have packageA:1.0 installed but Solr9 requires v2.0 due to removal of some APIs or what not. In the cold scenario (as in POC) you'd shut down a Solr node, upgrade Solr, then run "bin/solr plugin upgrade outdated" before starting that node again, and that would make sure it has the correct plugin version. Since you cannot upgrade Solr while it is running, perhaps we need to hook in some validation on node startup that it does not have any packages that won't work with that Solr version, and refuse to start. And some way to have two versions of a package installed at the same time, and then instead of using the latest, the Solr node will select the newest version that is compatible. Then when that node is upgraded it will select the new version of the plugin automatically based on Version.java. # Package system deserves its own Znode in Zookeeper instead of abusing clusterProps # I don't like the concept of an admin needing to "deploy" a package to a collection using a command. Rather, the collections should require a set of packages (optionally with min version) and fail to start if it is not available in the system. If the package is available in the system, the collection should gain access to the package(s) it required without running a deploy command. # Simplicity should be front seat. Don't force users to have to add {{package="my-pkg"}} wherever they today can say {{class="com.example.MyPlugin"}}. This is what we have ResourceLoader and class loaders for. If we cannot find {{com.example.MyPlugin}} in main class loader, then hunt through every package class loader until you have a match, if no match then throw ClassNotFound. (I never liked the {{runtimeLib=true}} equivalent in the old blob store.) # The package design says that a manifest is not required for a package and that any plain jar can function as a package just by registering it manually. That is ok as an alternative workflow, but most packages (and all official Apache and from official public 3rd party repos) should be required to have a manifest with package name, version, she-hash, signature and version-compatibility. # Some packages (contribs) have a ton of jars, thus we should support both PF4J's JAR and ZIP plugins. Other plugin formats can be added later (such as jar + list of maven coordinates for dependencies) # The plugin initialization commands seem complex and unnecessary for a first version. An alternative solution to the install/upgrade problem that I have been thinking of is to design a well-defined {{Package.java}} base-class, and if Solr finds that class in a package/plugin, then it will execute pre-defined methods on that class, such as {{upgrade(Version from, Version to)}} and {{uninstall()}} what will be called in different phases. PF4J defines a Plugin.java class ([https://github.com/pf4j/pf4j/blob/master/pf4j/src/main/java/org/pf4j/Plugin.java)] with start(), stop() and delete() methods for the different life cycles. We could of course use and extend that, or implement something similar for the plugin to be able to do anything it likes as part of installing, upgrading, uninstalling... That's why I think we should take out the "setup-command" stuff from v1. # The FSBlobstore will be very user-unfriendly to browse with sha256 as file names. Appending file names has been suggested. Other solutions could be an additional <sha256>.properties file alongside the blob file that contains the real file name, an possibly other metadata as well, such as what package it is part of. Puuh. This is what I got so far on the architecture and design level. This aspect is more important than code review at this stage, so I'll leave it there for now. Please do not take this feedback as criticism Noble/Ishan. None of these bullets are absolutes, and I do not think that I have the only possible solution. But I put a lot of careful thought into the POC which I feel is largely lacking here. I'm pretty sure we as a community will arrive at a better design by considering and discussing more use cases than the one use case/need of your employer. > A package management system for Solr > ------------------------------------ > > Key: SOLR-13661 > URL: https://issues.apache.org/jira/browse/SOLR-13661 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Reporter: Noble Paul > Assignee: Ishan Chattopadhyaya > Priority: Major > Labels: package > Attachments: plugin-usage.png, repos.png > > Time Spent: 20m > Remaining Estimate: 0h > > Here's the design doc: > https://docs.google.com/document/d/15b3m3i3NFDKbhkhX_BN0MgvPGZaBj34TKNF2-UNC3U8/edit?usp=sharing -- This message was sent by Atlassian Jira (v8.3.4#803005) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org