Thoughts on null static analysis

Daniel Trebbien Fri, 11 May 2018 14:42:36 -0700

Hello all,

I have been thinking about a way to generalize the changes that I have
proposed in Pull Request 314
<https://github.com/apache/incubator-netbeans/pull/314> regarding NPE
static analysis and the NPECheck hint.


In the pull request, I propose adding a "nullability of fields" preference
that would allow the user to specify whether fields are definitely null,
definitely not-null, or possibly null.  This would allow NPECheck to be
informed that java.lang.Boolean#TRUE and #FALSE are definitely not null,
but this solution does not apply to, for example, "nullability of method
return values", which I allude to in one of the comments.

Instead of this approach, I started thinking:  wouldn't it be great if a
NetBeans module could assist NPECheck by providing its own null static
analysis using a custom set of checks and rules.  This would allow a
NetBeans module to inform NPECheck of the nullability of fields as well as
the nullability of method return values.

This idea would require some additions to the Java Hints SPI module.  In
particular, I was thinking that there could be a @NullAnalyzer annotation
which would be like the @Hint annotation in that it would mark points where
NPECheck could call into the module for null static analysis given
HintContext and a CompilationInfo.  The NPECheck.State enum would be moved
to SPI and renamed to something like NullAnalysisResultType.  Another
class, named something like NullAnalysisContext, would have two methods.
The first, getResult(), would take a Tree and return a NullAnalysisResult
(explained later).  This first method would be similar to getting a value
from the Map returned by NPECheck#computeExpressionsState().  The second
method, setResult(), would take a Tree and a NullAnalysisResult for the
tree.  This second method is what null analyzer plugins would use to inform
NPECheck about any null analysis results that the plugin makes.  The bulk
of NPECheck#VisitorImpl would then be made into a "DefaultNullAnalyzer".

I think that it would be good to clean up the NullAnalysisResultType enum
so that it has the following enum variants:

   - POSSIBLE_NULL (currently POSSIBLE_NULL_REPORT)
   - NULL (currently NULL and NULL_HYPOTHETICAL)
   - NOT_NULL (currently NOT_NULL, NOT_NULL_HYPOTHETICAL, and
   NOT_NULL_BE_NPE)

A null NullAnalysisResultType value would be the same as a null or
POSSIBLE_NULL NPECheck.State value.

I see that there are also currently INSTANCE_OF_FALSE and INSTANCE_OF_TRUE
State variants, but I am not sure what these are for.

NullAnalysisResult would be a new class that takes a NullAnalysisResultType
value and a second NullAnalysisResultType value for a "container of" result
type.  To explain this, consider the example of a variable of type
Object[].  An IdentifierTree referencing this variable can have its own
null analysis result type (e.g. NOT_NULL), and its "container of" result
type might be NOT_NULL if static analysis is able to determine that the
object references contained in the array are not null.

"Container of" result types would also apply to Streams, Iterables,
Collections, Iterators, and third-party library types that are similar to
collections.

An alternative to NullAnalysisResult is to add all of the combinations of
variants to NullAnalysisResultType; i.e.:

   - POSSIBLE_NULL
   - NULL
   - NOT_NULL
   - CONTAINER_OF_POSSIBLE_NULL
   - CONTAINER_OF_NULL
   - CONTAINER_OF_NOT_NULL
   - POSSIBLE_NULL_CONTAINER_OF_POSSIBLE_NULL
   - POSSIBLE_NULL_CONTAINER_OF_NULL
   - POSSIBLE_NULL_CONTAINER_OF_NOT_NULL
   -
   - NOT_NULL_CONTAINER_OF_POSSIBLE_NULL
   - NOT_NULL_CONTAINER_OF_NULL
   - NOT_NULL_CONTAINER_OF_NOT_NULL

NULL_CONTAINER_OF_* variants are not added because the container variable
is itself null, so it's sort of undefined what the null analysis result of
the contents would be.

There are also not CONTAINER, POSSIBLE_NULL_CONTAINER, NULL_CONTAINER, or
NOT_NULL_CONTAINER variants, as these can be mapped to null, POSSIBLE_NULL,
NULL, and NOT_NULL, respectively.

I think that I prefer the alternative for the reason that it would not be
necessary to create many objects of type NullAnalysisResult which would add
pressure on the GC.  I wanted to introduce
NullAnalysisResult/NullAnalysisResultType first so that you would know why
I am proposing so many similarly-named enum variants.

What are others' thoughts on this idea?

Daniel

Thoughts on null static analysis

Reply via email to