Perhaps a bit of a wildcard question or thought ... would any split-out
top-level project necessarily be called "Apache Solr", or could the split-out
project be called "Apache <add-name-here>" with "Apache Solr" as its initial
sub-project, with other sub-projects possibly added over time? No particular
name in mind, "Apache Search" might be too obvious, just wondering in principle.

Christine

On 2020/05/04 09:10:35, Dawid Weiss <dawid.we...@gmail.com> wrote: 
> Dear Lucene and Solr developers!
> 
> A few days ago, I initiated a discussion among PMC members about
> potential pros and cons of splitting the project into separate Lucene
> and Solr entities by promoting Solr to its own top-level Apache
> project (TLP). Let me share with you the motivation for such an action
> and some follow-up thoughts I heard from other PMC members so far.
> 
> Please read this e-mail carefully. Both the PMC and I look forward to
> hearing your opinion. This is a DISCUSS thread and it will be followed
> next week by a VOTE thread. This is our shared project and we should
> all shape its future responsibly.
> 
> The big question is this: “Is this the right time to split Solr and
> Lucene into two independent projects?”.
> 
> Here are several technical considerations that drove me to ask the
> question above (in no order of priorities):
> 
> 1) Precommit/test times. These are crazy high. If we split into two
> projects we can pretty much cut all of Lucene testing out of Solr (and
> vice versa), making development a bit more fun again.
> 
> 2) Build system itself and source release packaging. The current
> combined codebase is a *beast* to maintain. Working with gradle on
> both projects at once made me realise how little the two have in
> common. The code layout, the dependencies, even the workflow of people
> working on these projects... The build (both ant and gradle) is full
> of Solr and Lucene-specific exceptions and hooks that could be more
> elegantly solved if moved to each project independently.
> 
> 3) Packaging. There is no single source distribution package for
> Solr+Lucene. They are already "independent" there. Why should Lucene
> and Solr always be released at the same pace? Does it always make
> sense?
> 
> 4) Solr is essentially taking in Lucene and its dependencies as a
> whole (so is Elasticsearch and many other projects). In my opinion
> this makes Lucene eligible for refactoring and
> maintenance as a separate component. The learning curve for people
> coming to each project separately is going to be gentler than trying
> to dive into the combined codebase.
> 
> 5) Mailing lists, build servers. Mailing lists for users are already
> separated. I think this is yet another indication that Solr is
> something more than a component within Lucene. It is perceived as an
> independent entity and used as an independent product. I would really
> like to have separate mailing lists for these two projects (this
> includes build and test results) as it would make life easier: if your
> focus is more on Lucene (or Solr), you would only need to track half
> of the current traffic.
> 
> 
> As I already mentioned, the discussion among PMC members highlighted
> some initial concerns and reasons why the project should perhaps
> remain glued together. These are outlined below with some of the
> counter-arguments presented under each concern to avoid repetition of
> the same content from the PMC mailing list (they’re copied from the
> private discussion list).
> 
> 1) Both projects may gradually go their separate ways after the
> separation and even develop “against” each other, as they did before
> the merge.
> 
> Whether this is a legitimate concern is hard to tell. If Solr goes TLP
> then all existing Lucene committers will automatically become Solr
> committers (unless they opt not to) so there will be both procedural
> ways to prevent this from happening (vetoes) as well as common-sense
> reasons to just cooperate.
> 
> 2) Some people like parallel version numbering (concurrent Solr and
> Lucene releases) as it gives instant clarity which Solr version uses
> which version of Lucene.
> 
> This can still be done on the Solr side (it is Solr’s decision to adopt
> any versioning scheme the project feels comfortable with). I
> personally (DW) think this kind of versioning is actually more
> confusing than helpful; Solr should have its own cadence of releases
> driven by features, not sub-component changes. If the “backwards
> compatibility” is a factor then a solution might be to sync on major
> version releases only (e.g., this is how Elasticsearch is handling
> this).
> 
> 3) Solr tests are the first “battlefield” test zone for Lucene changes
> - if it becomes TLP this part will be gone.
> 
> Yes, true. But realistically Solr will have to adopt some kind of
> snapshot-based dependency on Lucene anyway (whether as a git submodule
> or a maven snapshot dependency). So if there are bugs in Lucene they
> will still be detected by Solr tests (and fairly early).
> 
> 4) Why split now if we merged in the first place?
> 
> Some of you may wonder why split the project that was initially
> *merged* from two independent codebases (around 10 years ago). In
> short, there was a lot of code duplication and interaction between
> Solr and Lucene back then, with patches flying back and forth.
> Integration into a single codebase seemed like a great idea to clean
> things up and make things easier. In many ways this is exactly what
> did happen: we have cleaned up code dependencies and reusable
> components (on Lucene side) consumed by not just Solr but also other
> projects (downstream from Lucene).
> 
> The situation we find ourselves in now is different from what it was
> before: recent and ongoing development for the most part falls within
> Solr or Lucene exclusively.
> 
> 
> This e-mail is for discussing the idea and presenting arguments/
> counter-arguments for or against the split. It will be followed by a
> separate VOTE thread e-mail next Monday. If the vote passes then there
> are many questions about how this process should be arranged and
> orchestrated. There are past examples even within Lucene [1] that we
> can learn from, and there are people who know how to do it - the
> actual process is of lesser concern at the moment. What we mostly want
> to do is to reach out to you, signal the idea and ask for your
> opinion. Let us know what you think.
> 
> [1] 
> https://lists.apache.org/thread.html/15bf2dc6d6ccd25459f8a43f0122751eedd3834caa31705f790844d7%401270142638%40%3Cuser.nutch.apache.org%3E
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
> 
> 
