Thank you for the replies so far. I think that each contrib would necessarily have to have its own release schedule and release vote. I suspect that there might be frequent releases at first, and then these will settle into roughly once per major release. I also think that contribs having their own releases could reduce the number of minor releases that we need to do, if a certain feature is well contained.
Compatibility breaks will happen, but I feel like we should try to avoid them. Sometimes they're inevitable though, and we'll need to clearly mark that version X of the contrib is only compatible with version X of Solr, and for newer versions of Solr you have to use Y. Maybe we'll be able to release contrib Y first, and have it bridge the Solr releases. I think we'll need to invest in CI tooling to catch these kinds of situations sooner.

> * More build files; copying the rules/setup/standards of the Solr mothership and will become divergent over time no doubt. Or just KISS principle; no sharing; simple Maven projects.

I wonder what the Gradle equivalent would be here. In Maven-land, we can define a parent pom, attach a bunch of configuration, rules, and plugins to it, and reuse it across repositories and projects. Maybe the Gradle build rules turn into an externally referenced project as well. I don't know what we'll need, but being able to apply all of our validation and precommit rules consistently to the contribs seems important.

> Q: Could & should many contribs live in one repo (no more internal contribs), yet each still have its own release cycle? This could make sharing build infrastructure easier, and detecting Solr compatibility with them easier. Although it would mean sharing GitHub project area, thus sharing issues/PRs.

I don't know. It would make source releases more complicated, and source releases are what the ASF releases provide. I think it would make testing a contrib against multiple versions of Solr more difficult as well.

> Q: Should we create a separate JIRA for these contribs... or ditch JIRA entirely for them, relying on GitHub alone?

I'd start with the same JIRA, with a separate component or label. I don't think GH issues would be good, because it becomes harder to link between core and contrib issues in case of compatibility problems or tandem feature development.
> Q: Would contribs be treated as first class citizens in the Solr Reference Guide (they are still in the ASF after all), or would they be banished like the DIH was?

Probably a link in the reference guide to a list of contribs, and then each contrib has its own documentation.

On Tue, Nov 17, 2020 at 10:00 AM Anshum Gupta <[email protected]> wrote:

> Thanks for sending this email, Mike and thanks for the follow up, David.
>
> The idea of having multiple repos under the project seems like the
> reasonable way to go for our project. This allows us to support more
> features/tooling/etc. without having to link them to Solr or Lucene
> releases.
>
> An important thing here is to understand that if it comes from under the
> same umbrella, it should be treated with the same care and respect - at
> least we should attempt to.
>
> Q: Is it "okay" to release new Solr versions that break any of these
>> external contribs? Knowingly or unknowingly -- does it matter?
>
> I think it's really important to understand that breaking compat here
> should be a well thought off thing, especially as that's the
> differentiating factor for code that resides under the project vs. external
> repos. It doesn't mean that compat breaks can't happen, it's just that
> there would be more responsibility to providing a smooth upgrade path for
> users in case of compat breaks.
>
> From my perspective, the code in the external repos here would be just
> like the code in the core repo, just with a different release cadence.
>
> Q: Would contribs be treated as first class citizens in the Solr Reference
>> Guide (they are still in the ASF after all), or would they be banished like
>> the DIH was?
>
> The repos are supposed to grow, and with that, adding more to the current
> ref guide would be just bad user experience. In addition, the different
> release cadence would make it difficult to support documentation for the
> code in these repos via the ref guide that would be released with the core.
> We should certainly aim for the same quality of documentation, but not make
> it to be a part of the ref guide.
>
> On Sat, Nov 14, 2020 at 8:54 PM David Smiley <[email protected]> wrote:
>
>> Thanks for shining a spotlight on this Mike.
>> I have some questions to consider. I'll call these additional repos,
>> "external contribs", or just contribs for short here; perhaps our internal
>> contribs would migrate.
>>
>> Q: Would each contrib be released at its own cadence unrelated to Solr?
>> I suppose so.
>> Q: Would each contrib have it's own release vote? I suppose so, as it
>> has its own artifact. I think the ASF requires this.
>> Q: Is it "okay" to release new Solr versions that break any of these
>> external contribs? Knowingly or unknowingly -- does it matter?
>> Q: What technical work is needed to extricate an internal contrib to an
>> external?
>> * source control history. (note: i've done this git history in a single
>> folder extraction before, with a popular Stackoverflow answer)
>> * mandatory ASF files, e.g. license, notice
>> * more files that we may want: CHANGES.txt
>> * More build files; copying the rules/setup/standards of the Solr
>> mothership and will become divergent over time no doubt. Or just KISS
>> principle; no sharing; simple Maven projects.
>> Q: Could & should many contribs live in one repo (no more internal
>> contribs), yet each still have its own release cycle? This could make
>> sharing build infrastructure easier, and detecting Solr compatibility with
>> them easier. Although it would mean sharing GitHub project area, thus
>> sharing issues/PRs.
>> Q: Should we create a separate JIRA for these contribs... or ditch JIRA
>> entirely for them, relying on GitHub alone?
>> Q: Would contribs be treated as first class citizens in the Solr
>> Reference Guide (they are still in the ASF after all), or would they be
>> banished like the DIH was?
>>
>> ~ David Smiley
>> Apache Lucene/Solr Search Developer
>> http://www.linkedin.com/in/davidwsmiley
>>
>>
>> On Thu, Nov 12, 2020 at 6:40 PM Mike Drob <[email protected]> wrote:
>>
>>> Solr Devs,
>>>
>>> We've slowly been moving into a multi-repository model, and I wanted to
>>> bring some more attention to it and have a more focused discussion. We've
>>> recently embarked upon the acceptance of solr-operator as a distinct
>>> repo[1] under the care of the Lucene (soon to be Solr) PMC. I expect that
>>> there will be more cases of this as we transition additional contribs out
>>> of core, or as more plugins, packages, and integrations mature. Some will
>>> make sense as externally maintained code bases, but I believe other
>>> contributions may benefit our community more as part of the Apache
>>> Foundation.
>>>
>>> I think there was a very insightful comment[2] made by GP regarding
>>> adopting a similar model to Apache Commons governance, bringing attention
>>> to it here because I fear it may have gotten lost deep in the thread. Based
>>> on observations of Commons and a few other Apache projects with multi-repo
>>> setups, there thankfully does not appear to be a limit on how many
>>> repositories a PMC can maintain. The size and scope of each individual
>>> repository can vary greatly. I see potential ideas for anything that could
>>> be standalone and not tied to a release cycle (Admin UI, DIH, etc...), or
>>> anything that bridges integrations between Solr and other systems (k8s,
>>> HDFS, etc...).
>>>
>>> The risks that new repos face are similar to the risks they would have
>>> encountered as contrib modules, but I don't think they should dissuade us.
>>> Each project would need to start with a champion or sponsor and a
>>> discussion on the mailing list. From there, we can vote to accept the code,
>>> or just the idea if there is no code yet, as a community and create the
>>> repo. As part of a natural lifecycle, if there's not enough momentum or
>>> adoption over time, then we can update the README and docs and "retire"
>>> certain projects. The exact mechanisms can be undetermined for now; maybe
>>> it's a repo rename, maybe it's marking the repo read-only, maybe it's
>>> something else.
>>>
>>> The Commons model is that everyone is a committer on everything. There
>>> are other governance models, like Hadoop, with "area committers" who are
>>> limited to the specific repositories they have contributed frequently to.
>>> I'm not sure which model ultimately suits us better, but I think that
>>> leveraging area committers would allow us to recognize and empower
>>> contributors sooner and more frequently. Releases would still need to be
>>> voted on and approved by the singular PMC.
>>>
>>> There's no real action items here, it's more of a discussion prompt. If
>>> it looks like we have general consensus to this approach, then I'll start
>>> putting together individual proposals for a few repos to exercise the
>>> process and get more contributions going. I'll probably put the proposals
>>> together even if there's no replies here, but I'd much rather have some
>>> acknowledgement from the community that I'm headed in a sustainable
>>> direction!
>>>
>>> Mike
>>>
>>> [1]:
>>> https://lists.apache.org/thread.html/rb90f530155dc6edc6f1ccd5f056db1618142fdfcbd32d83f539d984b%40%3Cdev.lucene.apache.org%3E
>>> [2]:
>>> https://lists.apache.org/thread.html/r9965cb693369d927a942f805c134bfeb45c5e80f447ad0fe2f663fae%40%3Cdev.lucene.apache.org%3E
>>>
>>
