On 5/28/2014 10:01 PM, Alexandre Rafalovitch wrote:
> I would like to (re-)initiate a discussion about Solr support for
> plugin life-cycle (publish, discover, download, dependency
> management). Triggered by a discussion on the Solr mailing list:
> http://search-lucene.com/m/QTPaIv50e1&subj=Re+Contribute+QParserPlugin
>
> My main points:
> 1) Plugins/Modules/packages seem to be a very core part of most of the
> modern projects
> 2) Solr is extremely modular on the implementation level.
> 3) The community has been slowly building various plugins for Solr,
> but without any way to announce/share them.
> 4) ElasticSearch has plugins and it's been consistently pointed out as
> a positive point (good on them)

Sounds like a good idea to me.

> 5) Solr, frankly, is getting rather pudgy. Or possibly beyond mere
> pudgy. This is becoming especially noticeable by comparison with
> ElasticSearch but also with the increasing frequency of releases. I
> mentioned this issue a couple of times in the past under different
> angles (bundling Javadoc, compressing files, easy onboarding, etc).

One of our PMC members has filed an issue regarding the extreme size of
the Lucene/Solr artifacts that have to be lugged around to do a release.
This issue is LUCENE-5589. That is a different (but related) topic to
what you are talking about, which is the pieces that a user must
download.

To address the problem on the committer side, Lucene and Solr may need
to be broken into multiple pieces from a *release* standpoint. This idea
is not without its problems.

Multiple sub-releases might be a way to address problems with a huge
upload for a release. Think about how we separately manage the reference
guide as its own release, and then imagine that we have the same thing
for Lucene, Solr, Solr contrib, etc. The divisions might get even
smaller. Separate release cycles would be possible, although the
potential complications involved in that might mean that we keep
everything together and make all the sub-releases part of a larger full
release. Separate release managers would be a possible way to spread the
bandwidth of both network and committer time. There's a possibility that
the total artifact size for all sub-releases would actually go up, but
each individual sub-release would be much smaller.

To address the size problem on the user side, we need a "core" package
that only includes what is needed to get a basic Solr started. I
wouldn't mind at all if *none* of the contrib modules are included.
Exactly how to organize the module downloads is something I'm not sure
about, but I do think that DIH and SolrJ should be available in their
own downloads separate from the rest, because those downloads would be
pretty small. If Apache's distribution rules will allow it, I think that
Javadocs should also be a separate download.

> 6) Solr has so many features now that nobody will explore them all;
> yet people still download them and - sometimes - get confused by all
> the directories, jars and locations. Polish support alone has a
> significant number of jars (not sure about file sizes). I am not even
> talking about map-reduce+morphline in the recent release.
> 7) Solr is already published as a set of Maven jars with dependencies
> expressed between components.
> 8) Apart from making initial downloads smaller, having proper module
> system (publish, discover, download, install) provides incentives for
> people to push the packages out, creates a stronger community
>
> Now, I know that some of the weight might be addressed in Solr 5 by
> not bundling the war file as well as the libs. And some of the
> ElasticSearch comparison is due to the different philosophical
> approach (kitchen sink vs not even bundling Admin UI). And that any
> individual person does not download Solr packages that often (I might
> be an exception). But I still think we need the discussion.

A plugin infrastructure that allows a user to start Solr and then click
to install a plugin would be, quite frankly, beyond awesome. If it can
be managed effectively, it would also be nice to have a third-party
plugin repository, like the ones Mozilla and WordPress have.

> I would especially love to see a discussion of the lowest hanging
> fruit. Even if we cannot decompose Solr itself right now, maybe we can
> introduce additional package handling mechanism and then retrofit Solr
> into that.

Let's figure out what the lowest hanging fruit is and get the work
started! I'd like to help too.

On a tangent: Part of the "out of box" experience that's missing is
installation scripts, which until we come up with something new for 5.0,
would be for the example jetty. I have an init script that works very
well, but only on Red Hat-derived Linux distributions. It requires a lot
of manual work to get started, and automating that would be a huge plus
for users.

My init script needs to be extended to cover Debian-derived
distributions at the very minimum. Support for Solaris and other popular
UNIX variants would be useful. As much as I don't like the platform for
anything but client systems, if we can come up with a way to install on
Windows as a service, we will truly open Solr up to the masses.

Installation scripts and the supporting documentation are not a trivial
undertaking, and I'm sure it will be more complicated than I'm imagining
right now ... but a first draft would not be an immense project.

Thanks,
Shawn
