NightOwl888 commented on issue #793: URL: https://github.com/apache/lucenenet/issues/793#issuecomment-2291774218

> > Furthermore, the binary structure of the index does change from one version to the next, making them incompatible and making it literally impossible to bring many Lucene 9.x features back to Lucene.NET 4.x. We had this issue with back-porting the [analyzers-nori](https://github.com/apache/lucenenet/pull/645) package.

> > We have 100% compatibility with creating an index in Lucene and opening it in Lucene.NET with the same version and plan to keep it that way going forward (and it worked once the other way around, but hasn't been tested in quite a while). The index isn't the only binary format that is also kept in sync between versions.

> > @NightOwl888 I am a Lucene Java programmer myself and am happy to help in any efforts to maintain two-way compatibility between Lucene and Lucene.NET.

@superkelvint - Sorry for the late reply. I didn't see your comment back in April.

Thanks for offering to help with compatibility. One way you might be able to help us is to add support (even if it is unofficial) to the latest version of Lucene for reading 4.8.0 codecs. The [backward-codecs package](https://github.com/apache/lucene/tree/releases/lucene-solr/8.8.1/lucene/backward-codecs/src/java/org/apache/lucene/codecs) only goes back to Lucene 5.x. Our plan is to jump ahead to the current version once Lucene.NET 4.8.0 is stable, so it would be beneficial if Lucene.NET users could upgrade the software first and upgrade their index at some later point. It would save us some time if we didn't have to grab the 4.x codecs from the last version that supported them and try to splice them into the backward-codecs package, as offhand I don't really know what is involved.

Another way you could help us (since you linked to the issue) is to provide some guidance on the [analysis-nori](https://github.com/apache/lucenenet/pull/645) module ([latest work here](https://github.com/NightOwl888/lucenenet/tree/feature/analysis-nori-2)).
We got most of it working, but there are 3 test failures that we have not been able to find an answer for. The tests are [`TestRandomHugeStringsMockGraphAfter`](https://github.com/NightOwl888/lucenenet/blob/feature/analysis-nori-2/src/Lucene.Net.Tests.Analysis.Nori/TestKoreanTokenizer.cs#L447), [`TestUserDict`](https://github.com/NightOwl888/lucenenet/blob/feature/analysis-nori-2/src/Lucene.Net.Tests.Analysis.Nori/TestKoreanTokenizerFactory.cs#L95), and [`TestLookup`](https://github.com/NightOwl888/lucenenet/blob/feature/analysis-nori-2/src/Lucene.Net.Tests.Analysis.Nori/Dict/UserDictionaryTest.cs#L29).

The biggest issue is that the module is ported from Lucene 8.2.0 and the FST implementation has completely changed since 4.8.0. I tried recreating the `UserDictionary` with our ported code, but `TestUserDict` still doesn't pass. I also tried porting over the earliest version, but the FST implementation had already changed before then. Now, since the kuromoji module is almost identical and it runs on 4.8.0, I suspect there is a solution.

I have already [asked the Lucene team](https://lists.apache.org/thread/b5nt4hwbkxo5s75z32kp1ocg87q2qoq8), but their advice was just to wait until we upgrade. However, if we have someone who is willing to help us find a solution, maybe we can make this available sooner.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@lucenenet.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org