Shawn:
* RE redundancies of stuff in /dist/, see https://issues.apache.org/jira/browse/SOLR-15916
* RE "contrib" vs "module" vs "package", see https://issues.apache.org/jira/browse/SOLR-15917
* RE not shipping these extras with the Solr distribution, see the "slim distro" mention in the document "Solr first party packages": https://docs.google.com/document/d/1n7gB2JAdZhlJKFrCd4Txcw4HDkdk7hlULyAZBS-wXrE/edit?usp=sharing
It could very well be worth shipping two Docker images in the meantime. Or maybe a zip of each module could be a separate artifact that is published? I'm not sure what freedoms we have to do this in the ASF.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley

On Wed, Jan 12, 2022 at 8:21 PM Shawn Heisey wrote:

> On 1/12/2022 8:31 AM, Jan Høydahl wrote:
> > I think there are lots of pieces of code in solr-core that can easily be extracted the same way.
> > Some perhaps even for 9.0.0, as it slims down the core and reduces attack surface for most users as well.
>
> I think it would be really awesome if we had a core download that only included basic functionality, and all the other fancy things that Solr does now out of the box (as well as those that are contrib) could be added after download via package scripting or just additional downloads.
>
> The size of solr-8.11.1.tgz is 207MiB, or 218076598 bytes. The .zip version is slightly larger. 8.0.0 was 163MiB, 7.0.0 was 142MiB, 6.0.0 was 131MiB, and 1.4.1 was 53.7MiB. I think it's insane that the download is so big ... and a lot of what makes it big are things that the vast majority of our users will never use.
>
> Large reductions in the overall size of the main download would be possible by putting hadoop, calcite, some of the really large lucene analysis components, and the contrib stuff into packages. The extraction contrib alone is 43.5MiB compressed in zip format.
>
> I would suggest moving zookeeper and its dependencies as well, but I think we probably want SolrCloud to be part of base functionality.
>
> Some of the large jars are included for what are probably insignificant usages, and I wonder if that functionality could be replaced by newer native functions available in Java 8 and later. I am eyeballing things like guava and the commons-* jars here, but I am sure there are other things in this category. I'd like to eliminate as many dependencies as we can.
>
> Extracting some things from the solr-core jar into other jars sounds like a really awesome idea.
>
> I don't think the solr-core jar should be in the dist directory. It's useless by itself, because it will still have a LOT of dependencies even if we shrink it. And there are likely other things in the dist directory that fall into that category. The test framework and its dependencies are a good candidate for removal.
>
> By removing some of the low-hanging fruit that I am SURE isn't needed for base binary functionality on the 8.11.1 download, I was able to end up with a .zip file sized in at 60.4MiB, and I am sure at least a little bit of further reduction is possible if we can fully map out dependencies. I think we can leverage gradle to provide some dependency info.
>
> Exactly how to organize the code repo to create divided artifacts is something that we would need to think about. My initial idea is changing "contrib" to "package" and then making some new directories under package.
>
> Thanks,
> Shawn
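
The size figures quoted above could be checked against an unpacked distribution with something like the following. This is only a rough sketch, not anything from the thread: it assumes the archive has already been extracted to a local directory, and the function names (`to_mib`, `jar_sizes_by_top_dir`, `report`) are hypothetical. It measures uncompressed jar sizes on disk, so the totals will not match the compressed .tgz or .zip numbers exactly.

```python
import os
from collections import defaultdict

MIB = 1024 * 1024  # bytes per mebibyte


def to_mib(num_bytes):
    """Convert a byte count to mebibytes."""
    return num_bytes / MIB


def jar_sizes_by_top_dir(root):
    """Sum the on-disk size of every .jar under root.

    Sizes are grouped by the first path component below root
    (for example "dist" or "contrib"). A jar that sits directly
    in root is grouped under its own file name.
    """
    totals = defaultdict(int)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".jar"):
                continue
            full_path = os.path.join(dirpath, name)
            rel_path = os.path.relpath(full_path, root)
            top = rel_path.split(os.sep)[0]
            totals[top] += os.path.getsize(full_path)
    return dict(totals)


def report(root):
    """Print top-level directories, largest jar total first."""
    ranked = sorted(jar_sizes_by_top_dir(root).items(),
                    key=lambda item: item[1], reverse=True)
    for top, size in ranked:
        print(f"{top:20s} {to_mib(size):8.1f} MiB")
```

For reference, `to_mib(218076598)` evaluates to roughly 207.97, which agrees with the 207MiB quoted for solr-8.11.1.tgz. Calling `report()` on an extracted distribution directory would rank the top-level directories by jar weight, which may help when deciding what to move into separate packages.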
