Perhaps a bit of a wildcard question or thought ... would any split-out top-level project necessarily be called "Apache Solr", or could it be called "Apache <add-name-here>", with "Apache Solr" as its initial sub-project and other sub-projects possibly added over time? No particular name in mind ("Apache Search" might be too obvious), just wondering in principle.
Christine

On 2020/05/04 09:10:35, Dawid Weiss <dawid.we...@gmail.com> wrote:
> Dear Lucene and Solr developers!
>
> A few days ago, I initiated a discussion among PMC members about
> potential pros and cons of splitting the project into separate Lucene
> and Solr entities by promoting Solr to its own top-level Apache
> project (TLP). Let me share with you the motivation for such an action
> and some follow-up thoughts I heard from other PMC members so far.
>
> Please read this e-mail carefully. Both the PMC and I look forward to
> hearing your opinion. This is a DISCUSS thread and it will be followed
> next week by a VOTE thread. This is our shared project and we should
> all shape its future responsibly.
>
> The big question is this: “Is this the right time to split Solr and
> Lucene into two independent projects?”.
>
> Here are several technical considerations that drove me to ask the
> question above (in no order of priorities):
>
> 1) Precommit/ test times. These are crazy high. If we split into two
> projects we can pretty much cut all of Lucene testing out of Solr (and
> likewise), making development a bit more fun again.
>
> 2) Build system itself and source release packaging. The current
> combined codebase is a *beast* to maintain. Working with gradle on
> both projects at once made me realise how little the two have in
> common. The code layout, the dependencies, even the workflow of people
> working on these projects... The build (both ant and gradle) is full
> of Solr and Lucene-specific exceptions and hooks that could be more
> elegantly solved if moved to each project independently.
>
> 3) Packaging. There is no single source distribution package for
> Solr+Lucene. They are already "independent" there. Why should Lucene
> and Solr always be released at the same pace? Does it always make
> sense?
>
> 4) Solr is essentially taking in Lucene and its dependencies as a
> whole (so is Elasticsearch and many other projects).
> In my opinion this makes Lucene eligible for refactoring and
> maintenance as a separate component. The learning curve for people
> coming to each project separately is going to be gentler than trying
> to dive into the combined codebase.
>
> 5) Mailing lists, build servers. Mailing lists for users are already
> separated. I think this is yet another indication that Solr is
> something more than a component within Lucene. It is perceived as an
> independent entity and used as an independent product. I would really
> like to have separate mailing lists for these two projects (this
> includes build and test results) as it would make life easier: if your
> focus is more on Lucene (or Solr), you would only need to track half
> of the current traffic.
>
> As I already mentioned, the discussion among PMC members highlighted
> some initial concerns and reasons why the project should perhaps
> remain glued together. These are outlined below with some of the
> counter-arguments presented under each concern to avoid repetition of
> the same content from the PMC mailing list (they’re copied from the
> private discussion list).
>
> 1) Both projects may gradually part ways after the separation
> and even develop “against” each other like it used to be before the
> merge.
>
> Whether this is a legitimate concern is hard to tell. If Solr goes TLP
> then all existing Lucene committers will automatically become Solr
> committers (unless they opt not to) so there will be both procedural
> ways to prevent this from happening (vetoes) as well as common-sense
> reasons to just cooperate.
>
> 2) Some people like parallel version numbering (concurrent Solr and
> Lucene releases) as it gives instant clarity which Solr version uses
> which version of Lucene.
>
> This can still be done on Solr side (it is Solr’s decision to adopt
> any versioning scheme the project feels comfortable with).
> I personally (DW) think this kind of versioning is actually more
> confusing than helpful; Solr should have its own cadence of releases
> driven by features, not sub-component changes. If the “backwards
> compatibility” is a factor then a solution might be to sync on major
> version releases only (e.g., this is how Elasticsearch is handling
> this).
>
> 3) Solr tests are the first “battlefield” test zone for Lucene changes
> - if it becomes TLP this part will be gone.
>
> Yes, true. But realistically Solr will have to adopt some kind of
> snapshot-based dependency on Lucene anyway (whether as a git submodule
> or a maven snapshot dependency). So if there are bugs in Lucene they
> will still be detected by Solr tests (and fairly early).
>
> 4) Why split now if we merged in the first place?
>
> Some of you may wonder why split the project that was initially
> *merged* from two independent codebases (around 10 years ago). In
> short, there was a lot of code duplication and interaction between
> Solr and Lucene back then, with patches flying back and forth.
> Integration into a single codebase seemed like a great idea to clean
> things up and make things easier. In many ways this is exactly what
> did happen: we have cleaned up code dependencies and reusable
> components (on Lucene side) consumed by not just Solr but also other
> projects (downstream from Lucene).
>
> The situation we find ourselves in now is different from what it was
> before: recent and ongoing development for the most part falls within
> Solr or Lucene exclusively.
>
> This e-mail is for discussing the idea and presenting arguments/
> counter-arguments for or against the split. It will be followed by a
> separate VOTE thread e-mail next Monday. If the vote passes then there
> are many questions about how this process should be arranged and
> orchestrated.
> There are past examples even within Lucene [1] that we
> can learn from, and there are people who know how to do it - the
> actual process is of lesser concern at the moment; what we mostly want
> to do is to reach out to you, signal the idea and ask for your
> opinion. Let us know what you think.
>
> [1]
> https://lists.apache.org/thread.html/15bf2dc6d6ccd25459f8a43f0122751eedd3834caa31705f790844d7%401270142638%40%3Cuser.nutch.apache.org%3E
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org