NightOwl888 commented on issue #737: URL: https://github.com/apache/lucenenet/issues/737#issuecomment-1299969105
> > And related to this, would it make sense to enable the Nullable feature across the project to give a informative public API for the developers using Lucene.NET? Here we could mark variables and parameters that can be null at some point. This should be possible to do step by step by adding the #nullable enable that I can see you've used in some files. Well, you've come to a similar conclusion that I have - nullable warnings, nullable reference types, guard clauses, and exception documentation comments are all intimately linked together. It is best if these are all dealt with together. Recently `OfflineSorter` was updated to include these in https://github.com/apache/lucenenet/pull/733/commits/c0ba34db214c7ca48c0914d2d80bc3b09abe12b9. The whole `Lucene.Net.Spatial` module was also done in https://github.com/apache/lucenenet/pull/619. But it is a huge task because it requires careful analysis. If we average 30 minutes per file, it is about ~1500 hours of work. Also, although this can technically be enabled one file at a time, it doesn't end up working out that way because anything a file is dependent upon that is newly `#nullable enable` is suddenly going to show warnings that would need to be addressed after the job was already "done" for a that file. It works out best to start at the base of the dependency tree and then work out to all of the submodules that depend on them. Of course, we also have some classes that are higher in priority than others. Certainly, all of the main APIs used in the demos are the most important ones to enable nullable reference types on. There are several APIs where `null` is allowed as a parameter, so this would make it easier to understand that it is a valid value just by looking at Intellisense. All of that being said, we can break this up into chunks to make it easier to manage, for sure. > > Which is similar to the version we have which should therefore mean that the same issue is still present in the Java version too? Therefore I'm wondering what's the best approach for fixing these errors or if they should be fixed at all. ## Asserts vs Guard Clauses It is common in Java to simply let exceptions happen and then deal with them later. This is why most of the guard clauses are missing that would stop these null reference warnings showing up in the scan. Guard clauses are a thing in Java, but they are much less common than in .NET. In Lucene, there are a ton of things that I would call "guard clauses" that were made into asserts ([example](https://github.com/apache/lucenenet/pull/733/commits/c0ba34db214c7ca48c0914d2d80bc3b09abe12b9#diff-42faf9ca5c301c1a1bae397b5a115db9f8d41d524359db26cd5e1a74637a54a0L580-L585)). Since asserts can be enabled or disabled in Java, this sort of makes sense - the performance of checking whether values are valid is mitigated in production when asserts are disabled. However, in conversations I have had with @rclabo, we both agree that it doesn't make much sense to expect .NET users to have to *enable* guard clauses. They should always be there. IMO, we should also add additional guard clauses that check for `null` even if they didn't exist in Lucene, which is a better approach to getting rid of these warnings than suppressing them. An `ArgumentNullException` or `InvalidOperationException` is an intentional design decision. A `NullReferenceException` is a bug. > **NOTE:** At least having an analyzer already pre-built that identifies these is one less custom analyzer we need to create to locate these issues, so that part of the job is done. One more thing to consider is what impact changing the exception type might have on code that uses the API. We've done a review of exceptions in https://github.com/apache/lucenenet/pull/476/files, and I am fairly confident there won't be much impact with changing from `AssertionError` to `ArgumentException` or `ArgumentNullException`. `ArgumentOutOfRangeException` may take some additional care because in Java there are [3 different exceptions that map to 2 in .NET](https://github.com/apache/lucenenet/blob/c076e40b14d4c20e6fdfee4e28d0b3332cf6d0ce/src/Lucene.Net/Support/ExceptionHandling/ExceptionExtensions.cs#L220-L266). So, we will need pretty thorough testing on each section that is changed to ensure we aren't catching errors where we shouldn't be. The extension methods used in catch blocks take care of much of it, but they are not 100% thorough because the mapping isn't perfect, not every catch block has been converted, and sometimes we simply needed a design change to make it f unction. > **NOTE:** Some of the IO components are set up to always throw `ArgumentOutOfRangeException` instead of `ArgumentException` when checking `index` and `length` to ensure they are within the bounds of the array (as is typical in .NET) because in Java these would throw `IndexOutOfBoundsException`. ## Inheritance Another thing to consider is how heavily Lucene depends on inheritance and the fact that when you make a `#nullable` decision on the API of the base class, it is enforced for every class that inherits it. That can go against the grain of Lucene's design where a [base class](https://github.com/apache/lucenenet/blob/c076e40b14d4c20e6fdfee4e28d0b3332cf6d0ce/src/Lucene.Net.Queries/Function/ValueSource.cs#L61) has both [subclasses that do not accept `null`](https://github.com/apache/lucenenet/blob/c076e40b14d4c20e6fdfee4e28d0b3332cf6d0ce/src/Lucene.Net.Queries/Function/ValueSources/NormValueSource.cs#L52) and [subclasses that do accept `null`](https://github.com/apache/lucenenet/blob/c076e40b14d4c20e6fdfee4e28d0b3332cf6d0ce/src/Lucene.Net.Queries/Function/ValueSources/TermFreqValueSource.cs#L45). I haven't looked into this too thoroughly - there might be some way around it that I am unaware of. > An option under consideration for this particular case is simply to make them all allow `null`, but have them instantiate a new dictionary when `null` is passed to ensure it doesn't throw. This is an API design change, but would make the API easier to work with. > > This really ought to be a `IDictionary<TKey, TValue>` rather than `IDictionary`, also, but that is another project. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@lucenenet.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org