NightOwl888 commented on code in PR #801:
URL: https://github.com/apache/lucenenet/pull/801#discussion_r1160794004


##########
src/Lucene.Net.Analysis.Common/Tartarus/Snowball/SnowballProgram.cs:
##########
@@ -63,7 +63,9 @@ protected SnowballProgram()
         /// <summary>
         /// Set the current string.
         /// </summary>
-        public virtual void SetCurrent(string value)
+        // LUCENENET specific - S1699 - marked non-virtual because calling
+        // virtual members from the constructor is not a safe operation in .NET
+        public void SetCurrent(string value)

Review Comment:
   This one has a C# implementation that we can get clues from: 
https://github.com/snowballstem/snowball/blob/master/csharp/Snowball/Stemmer.cs#L144-L191.
 Looks like they just made a private implementation that is called by both the 
constructor and the virtual Current property. Let's just follow their lead. 
Ditto for the other overload.
   
   Of course, our code is a port from Lucene, which is a port of the Java 
version of this, so there are several differences that have crept in.



##########
src/Lucene.Net.Analysis.Common/Analysis/Util/OpenStringBuilder.cs:
##########
@@ -56,7 +56,9 @@ public virtual int Length
             set => m_len = value;
         }
 
-        public virtual void Set(char[] arr, int end)
+        // LUCENENET specific - S1699 - marked non-virtual because calling
+        // virtual members from the constructor is not a safe operation in .NET
+        public void Set(char[] arr, int end)

Review Comment:
   I doubt anyone will need to extend this class, but see my comment on the 
SnowballProgram for an idea on how to deal with this - I think having a shared 
private implementation here is fine. I recently modified it so we can pass 
strings into the J2N parsers using bounds instead of allocating substrings.
   
   Side note: Perhaps we could find a way to replace usages of this with 
[ValueStringBuilder](https://github.com/dotnet/runtime/blob/v7.0.4/src/libraries/Common/src/System/Text/ValueStringBuilder.cs),
 which is a ref struct and lives on the stack. The initial allocation can even 
be on the stack. It would need its own overloads to be passed around, though - 
ref structs cannot have interfaces.



##########
src/Lucene.Net.Facet/Taxonomy/Directory/DirectoryTaxonomyReader.cs:
##########
@@ -241,15 +241,19 @@ protected override TaxonomyReader DoOpenIfChanged()
         /// <summary>
         /// Open the <see cref="DirectoryReader"/> from this <see 
cref="Directory"/>. 
         /// </summary>
-        protected virtual DirectoryReader OpenIndexReader(Directory directory)
+        // LUCENENET specific - S1699 - marked non-virtual because calling
+        // virtual members from the constructor is not a safe operation in .NET
+        protected DirectoryReader OpenIndexReader(Directory directory)

Review Comment:
   This method is complementary to the 
[DirectoryTaxonomyWriter.OpenIndexWriter]() method, so we should aim for a 
common solution for them both. As you can see, that one has existing subclasses 
and those are broken when trying to pass any state in through the constructor. 
These were definitely meant for extension by end users.
   
   These virtual members exist so in Java they can quickly declare an anonymous 
class and override this implementation to return something else. This is the 
reason why factory method is preferred over abstract factory in Lucene.
   
   I have tried fixing `DirectoryTaxonomyWriter` with abstract factory and 
there were issues trying to get the state of the subclass separated from the 
factory. But there are a couple of things that might work I didn't try:
   
   1. Moving all of the shared state from the subclass and into the factory and 
sharing the state as properties of that class.
   2. Using `LazyInitializer` to wait until the first use to open the reader 
and some of the other state that isn't available in the constructor.
   
   That being said, I am hesitant to try fixing this class until the 
concurrency issues have been tracked down. There are 2 tests that have locking 
contention issues that don't exist in Java Lucene:
   
   1. `TestTaxonomyCombined.TestTaxonomyReaderRefreshRaces()`
   2. `TestDirectoryTaxonomyWriter.TestConcurrency()`
   
   You are welcome to take a stab at them if you like.
   
   @eladmarg gave me a tip that it could be the `LurchTable` that we are using 
as an LRU cache. There is an experimental implementation that doesn't use 
locking in [LRUCache](https://github.com/JKirk865/LRUCache) I would like to 
try. We unfortunately put too much stock in trying to make it conform to the 
`IDictionary<TKey, TValue>` interface, which basically makes using any existing 
c# implementation impossible. So, I think we should abandon that idea and 
create our own `ILruCache<TKey, TValue>` interface with a minimal set of 
members that Lucene.Net.Facet will need.
   
   We ought to also make it easy to plug in a custom LRU cache implementation, 
since Redis has a [distributed LRU 
implementation](http://cndoc.github.io/redis-doc-cn/cn/topics/lru-cache.html) 
that could get a lot of milage here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@lucenenet.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to