[GitHub] [lucenenet] NightOwl888 commented on issue #737: Null pointers/nullable

GitBox Wed, 02 Nov 2022 02:59:43 -0700


NightOwl888 commented on issue #737:
URL: https://github.com/apache/lucenenet/issues/737#issuecomment-1299969105

> > And related to this, would it make sense to enable the Nullable feature
across the project to give a informative public API for the developers using
Lucene.NET? Here we could mark variables and parameters that can be null at
some point. This should be possible to do step by step by adding the #nullable
enable that I can see you've used in some files.

Well, you've come to a similar conclusion that I have - nullable warnings,
nullable reference types, guard clauses, and exception documentation comments
are all intimately linked together. It is best if these are all dealt with
together. Recently `OfflineSorter` was updated to include these in
https://github.com/apache/lucenenet/pull/733/commits/c0ba34db214c7ca48c0914d2d80bc3b09abe12b9.
The whole `Lucene.Net.Spatial` module was also done in
https://github.com/apache/lucenenet/pull/619.

But it is a huge task because it requires careful analysis. If we average 30
minutes per file, it is about ~1500 hours of work.

Also, although this can technically be enabled one file at a time, it
doesn't end up working out that way because anything a file is dependent upon
that is newly `#nullable enable` is suddenly going to show warnings that would
need to be addressed after the job was already "done" for a that file. It works
out best to start at the base of the dependency tree and then work out to all
of the submodules that depend on them.

Of course, we also have some classes that are higher in priority than
others. Certainly, all of the main APIs used in the demos are the most
important ones to enable nullable reference types on. There are several APIs
where `null` is allowed as a parameter, so this would make it easier to
understand that it is a valid value just by looking at Intellisense.

All of that being said, we can break this up into chunks to make it easier
to manage, for sure.

> > Which is similar to the version we have which should therefore mean that
the same issue is still present in the Java version too? Therefore I'm
wondering what's the best approach for fixing these errors or if they should be
fixed at all.

## Asserts vs Guard Clauses

It is common in Java to simply let exceptions happen and then deal with them
later. This is why most of the guard clauses are missing that would stop these
null reference warnings showing up in the scan. Guard clauses are a thing in
Java, but they are much less common than in .NET.

In Lucene, there are a ton of things that I would call "guard clauses" that
were made into asserts
([example](https://github.com/apache/lucenenet/pull/733/commits/c0ba34db214c7ca48c0914d2d80bc3b09abe12b9#diff-42faf9ca5c301c1a1bae397b5a115db9f8d41d524359db26cd5e1a74637a54a0L580-L585)).
Since asserts can be enabled or disabled in Java, this sort of makes sense -
the performance of checking whether values are valid is mitigated in production
when asserts are disabled. However, in conversations I have had with @rclabo,
we both agree that it doesn't make much sense to expect .NET users to have to
*enable* guard clauses. They should always be there.

IMO, we should also add additional guard clauses that check for `null` even
if they didn't exist in Lucene, which is a better approach to getting rid of
these warnings than suppressing them. An `ArgumentNullException` or
`InvalidOperationException` is an intentional design decision. A
`NullReferenceException` is a bug.

> **NOTE:** At least having an analyzer already pre-built that identifies
these is one less custom analyzer we need to create to locate these issues, so
that part of the job is done.

One more thing to consider is what impact changing the exception type might
have on code that uses the API. We've done a review of exceptions in
https://github.com/apache/lucenenet/pull/476/files, and I am fairly confident
there won't be much impact with changing from `AssertionError` to
`ArgumentException` or `ArgumentNullException`. `ArgumentOutOfRangeException`
may take some additional care because in Java there are [3 different exceptions
that map to 2 in
.NET](https://github.com/apache/lucenenet/blob/c076e40b14d4c20e6fdfee4e28d0b3332cf6d0ce/src/Lucene.Net/Support/ExceptionHandling/ExceptionExtensions.cs#L220-L266).
So, we will need pretty thorough testing on each section that is changed to
ensure we aren't catching errors where we shouldn't be. The extension methods
used in catch blocks take care of much of it, but they are not 100% thorough
because the mapping isn't perfect, not every catch block has been converted,
and sometimes we simply needed a design change to make it f
unction.

> **NOTE:** Some of the IO components are set up to always throw
`ArgumentOutOfRangeException` instead of `ArgumentException` when checking
`index` and `length` to ensure they are within the bounds of the array (as is
typical in .NET) because in Java these would throw `IndexOutOfBoundsException`.

## Inheritance

Another thing to consider is how heavily Lucene depends on inheritance and
the fact that when you make a `#nullable` decision on the API of the base
class, it is enforced for every class that inherits it. That can go against the
grain of Lucene's design where a [base
class](https://github.com/apache/lucenenet/blob/c076e40b14d4c20e6fdfee4e28d0b3332cf6d0ce/src/Lucene.Net.Queries/Function/ValueSource.cs#L61)
has both [subclasses that do not accept
`null`](https://github.com/apache/lucenenet/blob/c076e40b14d4c20e6fdfee4e28d0b3332cf6d0ce/src/Lucene.Net.Queries/Function/ValueSources/NormValueSource.cs#L52)
and [subclasses that do accept
`null`](https://github.com/apache/lucenenet/blob/c076e40b14d4c20e6fdfee4e28d0b3332cf6d0ce/src/Lucene.Net.Queries/Function/ValueSources/TermFreqValueSource.cs#L45).
I haven't looked into this too thoroughly - there might be some way around it
that I am unaware of.

> An option under consideration for this particular case is simply to make
them all allow `null`, but have them instantiate a new dictionary when `null`
is passed to ensure it doesn't throw. This is an API design change, but would
make the API easier to work with.
>
> This really ought to be a `IDictionary<TKey, TValue>` rather than
`IDictionary`, also, but that is another project.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@lucenenet.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [lucenenet] NightOwl888 commented on issue #737: Null pointers/nullable

Reply via email to