NightOwl888 commented on issue #793:
URL: https://github.com/apache/lucenenet/issues/793#issuecomment-1780717782

   First of all, the versioning scheme had been decided some time ago and is in 
fact [documented](https://lucenenet.apache.org/contributing/versioning.html) 
and made part of the build. At this point I don't see any reason to go back and 
revisit this scheme which was part of the work that was done during the first 
4.8.0 beta.
   
   > By releasing yourself from this constraint you would have the flexibility 
to release stable versions of the functionality that you have implemented 
without waiting for 100% feature parity with a given upstream Java version.
   
   > This way you may opt to never be 100% feature complete with Java Lucene 
4.8 (for example), because the community is more in need for some 7.x features 
that can then be prioritized over the long tail of rarely used 4.8 features 
(just as a made up example). By following your own version scheme you can 
instead document version X as "compatible with Lucene 4.8 minus features Y and 
Z".
   
   This assumes usability and API are the entire issue, but they are not.
   
   Lucene.NET is the most difficult application I have ever had the pleasure of 
debugging in my 25 years as a developer. When we go off the map like this, we 
literally throw away our best debugging tool, which is to run the *same 
version* of Lucene and Lucene.NET side by side to see where the execution paths 
diverge. I don't have an answer for how we could debug if we combine different 
versions of Lucene. Do you?
   
   Furthermore, the binary structure of the index does change from one version 
to the next, making them incompatible and making it literally impossible to 
bring many Lucene 9.x features back to Lucene.NET 4.x. We had this issue with 
back-porting the [analyzers-nori](https://github.com/apache/lucenenet/pull/645) 
package.
   
   We have 100% compatibility with creating an index in Lucene and opening it 
in Lucene.NET with the same version and plan to keep it that way going forward 
(and it worked once the other way around, but hasn't been tested in quite a 
while). The index isn't the only binary format that is kept in sync 
between versions.
   
   There are other problems with disjointed versioning between Lucene and 
Lucene.NET. Case in point: Lucene.NET 3.0.3. There was no release of Lucene 
3.0.3. Despite trying to sleuth an answer I have no idea what commit Lucene.NET 
3.0.3 is a port of. I could *guess* that it is a port from 3.0.1 (which 
actually was released), but I can't be 100% sure. I didn't even know what 
commit in this repo corresponded to the 3.0.3 release until I found it on an 
obscure blog (they released 3.0.3 RC2 by renaming it, but didn't make a tag 
corresponding to the 3.0.3 release). These two issues are the primary 
reasons we have never done a maintenance release of Lucene.NET 3.0.3. While we 
could incorporate the *actual* version number as part of the 
`InformationalVersion` and make it disjointed, it would be very confusing for 
users who see numbers that overlap Lucene releases that don't correspond to 
them or their binary formats. Strict version compatibility avoids getting into 
this situation again.
   
   For usability, there are also issues. Existing Lucene blog posts may not be 
useful if the API is different than the major version of Lucene the post is 
about.
   
   The bottom line is there is no maintenance plan for making a Frankenstein 
version of Lucene that incorporates features from different versions. The best 
way is to try to sync the entire project to a single Git commit. The story goes 
way beyond keeping the API in sync. It also means keeping the execution paths, 
binary formats, tests, and documentation in sync.
   
   While we could simply abandon 4.8.0 and start working on the latest version 
of Lucene now, we would be stuck in a situation where we have all of the same 
work to finish that we have now **plus** an estimated 1800 hours of upgrading work. 
This upgrade estimate could be off if we run into any major gaps that mean more 
JDK features we need to find or build replacements for. Right now, we are in a 
situation where our remaining work still has an undefined scope because of gaps 
that we may not know about. The plan is to try to close all of the gaps so when 
we finally do start working on the upgrade we have a mostly well-defined scope 
of work instead of a fuzzy "research this and figure out what we need to do 
here" situation, where research is often most of the work (meaning to create an 
issue about it, we need to do most of the work first to define the scope of the 
issue).
   
   Also, it seems like a total waste to do that. Most of the work that is 
remaining is on ICU4N. I have almost convinced myself that we may be able to 
release ICU4N as stable earlier by not strictly following the [ICU versioning 
scheme](https://unicode-org.github.io/icu/userguide/icu/design.html#other-icu-design-principles)
 but instead allowing each major release to have breaking API changes until we 
stabilize it (we are 13 versions behind so we have some wiggle room, but it 
does mean we will have to do a full upgrade every time we make a breaking API 
change). But we should probably still conditionally compile out the "draft" 
APIs and other APIs that are considered unstable in the NuGet package or at 
least make them invisible to the IDE. There are still other issues to deal 
with, such as the fact that NuGet doesn't actually deploy resource files for 
cultures it doesn't recognize. There are many decisions to make like that in 
ICU4N where there are gaps between Java and .NET. Unfortunately, nobody here 
seems willing to talk about the *actual* work that remains. 
Most want to move on to the next version of Lucene and pretend that we don't 
need to do this work for the upgrade, anyway.
   
   We could alternatively move on to 4.8.0 release while keeping the 
`Lucene.Net.ICU` and components that depend on it unstable, but unfortunately 
that means either splitting up the `lucene-cli` component or releasing it as 
stable with unstable dependencies. I would argue we need to focus 100% on the 
remaining things that could break the API before we do such a thing (such as 
automated query parser generation), which could still be time-consuming. It 
also means we won't have a completely stable 4.8.0 release; the first fully 
stable release might be something like 4.8.0.17. Or else we would need to set up 
our build to make separate stable and unstable release packages to comply with 
the Apache [release 
procedure](https://lucenenet.apache.org/contributing/make-release.html). And we 
still wouldn't technically be able to start working on upgrading until we have 
a stable ICU4N, anyway. I don't see how this improves the situation; it only 
adds more work to do to make it stable and makes the versioning history more 
difficult to understand.
   
   It really sucks for us to have to reject what would ordinarily be good ideas 
from the community, but unfortunately, most of these ideas never take 
everything into consideration when providing such advice, only the "normal 
stuff" that most projects deal with.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@lucenenet.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
