NightOwl888 commented on code in PR #801: URL: https://github.com/apache/lucenenet/pull/801#discussion_r1160794004
########## src/Lucene.Net.Analysis.Common/Tartarus/Snowball/SnowballProgram.cs: ########## @@ -63,7 +63,9 @@ protected SnowballProgram() /// <summary> /// Set the current string. /// </summary> - public virtual void SetCurrent(string value) + // LUCENENET specific - S1699 - marked non-virtual because calling + // virtual members from the constructor is not a safe operation in .NET + public void SetCurrent(string value) Review Comment: This one has a C# implementation that we can get clues from: https://github.com/snowballstem/snowball/blob/master/csharp/Snowball/Stemmer.cs#L144-L191. Looks like they just made a private implementation that is called by both the constructor and the virtual Current property. Let's just follow their lead. Ditto for the other overload. Of course, our code is a port from Lucene, which is a port of the Java version of this, so there are several differences that have crept in. ########## src/Lucene.Net.Analysis.Common/Analysis/Util/OpenStringBuilder.cs: ########## @@ -56,7 +56,9 @@ public virtual int Length set => m_len = value; } - public virtual void Set(char[] arr, int end) + // LUCENENET specific - S1699 - marked non-virtual because calling + // virtual members from the constructor is not a safe operation in .NET + public void Set(char[] arr, int end) Review Comment: I doubt anyone will need to extend this class, but see my comment on the SnowballProgram for an idea on how to deal with this - I think having a shared private implementation here is fine. I recently modified it so we can pass strings into the J2N parsers using bounds instead of allocating substrings. Side note: Perhaps we could find a way to replace usages of this with [ValueStringBuilder](https://github.com/dotnet/runtime/blob/v7.0.4/src/libraries/Common/src/System/Text/ValueStringBuilder.cs), which is a ref struct and lives on the stack. The initial allocation can even be on the stack. It would need its own overloads to be passed around, though - ref structs cannot have interfaces. ########## src/Lucene.Net.Facet/Taxonomy/Directory/DirectoryTaxonomyReader.cs: ########## @@ -241,15 +241,19 @@ protected override TaxonomyReader DoOpenIfChanged() /// <summary> /// Open the <see cref="DirectoryReader"/> from this <see cref="Directory"/>. /// </summary> - protected virtual DirectoryReader OpenIndexReader(Directory directory) + // LUCENENET specific - S1699 - marked non-virtual because calling + // virtual members from the constructor is not a safe operation in .NET + protected DirectoryReader OpenIndexReader(Directory directory) Review Comment: This method is complementary to the [DirectoryTaxonomyWriter.OpenIndexWriter]() method, so we should aim for a common solution for them both. As you can see, that one has existing subclasses and those are broken when trying to pass any state in through the constructor. These were definitely meant for extension by end users. These virtual members exist so in Java they can quickly declare an anonymous class and override this implementation to return something else. This is the reason why factory method is preferred over abstract factory in Lucene. I have tried fixing `DirectoryTaxonomyWriter` with abstract factory and there were issues trying to get the state of the subclass separated from the factory. But there are a couple of things that might work I didn't try: 1. Moving all of the shared state from the subclass and into the factory and sharing the state as properties of that class. 2. Using `LazyInitializer` to wait until the first use to open the reader and some of the other state that isn't available in the constructor. That being said, I am hesitant to try fixing this class until the concurrency issues have been tracked down. There are 2 tests that have locking contention issues that don't exist in Java Lucene: 1. `TestTaxonomyCombined.TestTaxonomyReaderRefreshRaces()` 2. `TestDirectoryTaxonomyWriter.TestConcurrency()` You are welcome to take a stab at them if you like. @eladmarg gave me a tip that it could be the `LurchTable` that we are using as an LRU cache. There is an experimental implementation that doesn't use locking in [LRUCache](https://github.com/JKirk865/LRUCache) I would like to try. We unfortunately put too much stock in trying to make it conform to the `IDictionary<TKey, TValue>` interface, which basically makes using any existing c# implementation impossible. So, I think we should abandon that idea and create our own `ILruCache<TKey, TValue>` interface with a minimal set of members that Lucene.Net.Facet will need. We ought to also make it easy to plug in a custom LRU cache implementation, since Redis has a [distributed LRU implementation](http://cndoc.github.io/redis-doc-cn/cn/topics/lru-cache.html) that could get a lot of milage here. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@lucenenet.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org