Re: Null checking in Beam

Jan Lukavský Tue, 16 Mar 2021 09:37:50 -0700

I believe it is not a problem of Idea. At least I didn't notice behaviorlike that with Guava, although Guava uses the framework as well. Maybethere is a way to tune which annotations should be generated? Or Guavahandles that somehow differently?


On 3/16/21 5:22 PM, Reuven Lax wrote:

I've also been annoyed at IntelliJ autogenenerating all theseannotations. I believe Kenn said that this was not the intention -maybe there's an IntelliJ setting that would stop this from happening?

On Tue, Mar 16, 2021 at 2:14 AM Jan Lukavský <[email protected]<mailto:[email protected]>> wrote:

I don't know the details of the checkerframework, but there seems
a contradiction between what code is (currently) generated and
what we therefore release and what actually the checkerframework
states [1]:

@UnknownKeyFor:

Used internally by the type system; should never be written by a
programmer.

If this annotation is generated for overwritten methods, then I'd
say, that it means we place a great burden to our users - either
not using autogenerated methods, or erase all the generated
annotations afterwards. Either way, that is not "polite" from Beam.

What we should judge is not only a formal purity of code, but what
stands on the other side is how usable APIs we provide. We should
not head for 100% pure code and sacrificing use comfort. Every
check that leaks to user code is at a price and we should not
ignore that. No free lunch.

From my point of view:

a) if a check does not modify the bytecode, it is fine and we can
use it - we are absolutely free to use any tooling we agree on, if
our users cannot be affected anyhow

b) if a tool needs to be leaked to user, it should be as small
leakage as possible

c) if a check significantly affects compile performance, it
should be possible to opt-out

I think that our current setup violates all these three points.

Moving the check to different CI is a possibility (a)), it would
then require opt-in flag to be able to run the check locally. It
would also stop the leakage (if we would release code without this
check).

If we want to keep some annotations for user's benefit (which
might be fine), it should be really limited to the bare minimum
(e.g. only @Nullable for method arguments and return values,
possibly more, I don't know if and how we can configure that).
Definitely not @UnknownKeyFor, that is simply internal to the
checker. We should then have opt-out flag for local development
before committing changes.

Jan

[1]

https://checkerframework.org/api/org/checkerframework/checker/nullness/qual/UnknownKeyFor.html

<https://checkerframework.org/api/org/checkerframework/checker/nullness/qual/UnknownKeyFor.html>

On 3/16/21 8:33 AM, Reuven Lax wrote:



    On Mon, Mar 15, 2021 at 11:42 PM Reuven Lax <[email protected]
    <mailto:[email protected]>> wrote:



        On Mon, Mar 15, 2021 at 9:12 PM Kenneth Knowles
        <[email protected] <mailto:[email protected]>> wrote:

            I will be blunt about my opinions about the general issue:

            - NullPointerExceptions (and similar) are a solved problem.
               * They have been since 2003 at the latest [1] (this is
            when the types were hacked into Java - the foundation
            dates back to the 70s or earlier)


        Huh - Fahndrich tried to hire me once to work on a project
        called Singularity. Small world. Also note that Sanjay
        Ghemawat is listed in the citations!


    Umm, I was confusing names. Fahndrich is actually a former
    coworker at Google :)

               * Checkerframework is a _pluggable_ system where that
            nullness type system is a "hello, world" level demo,
            since 2008 at the latest [2].
               * Our users should know this and judge us accordingly.

            - Checkerframework should be thought of and described as
            type checking, because it is. It is not heuristic nor
            approximate.
            - If your code was unclear about whether something could
            be null, it was probably unclear to a person reading it
            also, and very likely to have actual bugs.
            - APIs that accept a lot of nullable parameters are,
            generally speaking, bad APIs. They are hard to use
            correctly, less readable, and very likely to cause actual
            bugs. You are forcing your users to deal with accidental
            complexity you left behind.
              * Corollary to the above two points: Almost all the
            time, the changes to clearify nullness make your code
            better, more readable, easier for users or editors.
            - It is true that there is a learning curve to
            programming in this way.


        I agree with the above in a closed system. However as
        mentioned, some of the APIs we use suffer from this.

        In a previous life, I saw up close an effort to add such
        analysis to a large codebase. Two separate tools called
        Prefix and Prefast were used (the difference between the two
        is actually quite interesting, but not relevant here).
        However in order to make this analysis useful, there was a
        massive effort to properly annotate the entire codebase,
        including all libraries used. This isn't a perfect example -
        this was a C++ codebase which is much harder to analyze, and
        these tools identified far more than simply nullness errors
        (resource leaks, array indices, proper string null
        termination, exception behavior, etc.). However the closer we
        can get to properly annotating the transitive closure of our
        dependencies, the better this framework will work.

            - There are certainly common patterns in Java that do not
            work very well, and suppression is sometimes the best option.
               * Example: JUnit's @Setup and @Test conventions do not
            work very well and it is not worth the effort to make
            them work. This is actually because if it were "normal
            code" it would be bad code. There are complex
            inter-method dependencies enforced only by convention.
            This matters: sometimes a JUnit test class is called from
            another class, used as "normal code". This does go wrong
            in practice. Plain old JUnit test cases frequently go
            wrong as well.

            And here is my opinion when it comes to Beam:

            - "Community over code" is not an excuse for negligent
            practices that cause easily avoidable risk to our users.
            I will be very disappointed if we choose that.
            - I think having tooling that helps newcomers write
            better code by default is better for the community too.
            Just like having automatic formatting is better. Less to
            haggle about in review, etc.
            - A simple search reveals about 170 bugs that we know of [4].
            - So far in almost every module I have fixed I discovered
            actual new null errors. Many examples at [5].
            - It is extremely easy to suppress the type checking.
            Almost all of our classes have it suppressed already (I
            did this work, to allow existing errors while protecting
            new code).
            - Including the annotations in the shipped jars is an
            important feature. Without this, users cannot write
            null-safe code themselves.
               * Reuven highlighted this: when methods are not
            annotated, we have to use/implement workarounds. Actually
            Guava does use checkerframework annotations [6] and the
            problem is that it requires its *input* to already be
            non-null so actually you cannot even use it to convert
            nullable values to non-nullable values.
               * Beam has its own [7] that is annotated, actually for
            yet another reason: when Guava's checkNotNull fails, it
            throws NPE when it should throw IllegalArgumentException.
            Guava's checkNotNull should not be used for input validation!
            - It is unfortunate that IntelliJ inserts a bunch of
            annotations in user code. I wonder if there is something
            we can do about that. At the Java level, if they are not
            on the classpath they should be ignored and not affect
            users. Coincidentally, the JDK has had
            NullPointerExceptions in this area :-)  [8].

            I understand the pain of longer compile times slowing
            people down. That is actually a problem to be solved
            which does not require lowering our standards of quality.
            How about we try moving it to a separate CI job and see
            how it goes?

            In my experience stories like Reuven's are much more
            frustrating in a separate CI job because you find out
            quite late that your code has flaws. Like when spotless
            fails, but much more work to fix, and would have been
            prevented long ago if it were integrated into the compile.


        I agree with this. I prefer to be able to detect on my
        computer that there are failures, and not have to wait for
        submission. The complaint was that some people are
        experiencing trouble on their local machine however, so it
        seems reasonable to add an opt-out flag (though I would
        prefer opt out to opt in).


            Kenn

            [1]
            
https://web.eecs.umich.edu/~bchandra/courses/papers/Fahndrich_NonNull.pdf
            
<https://web.eecs.umich.edu/~bchandra/courses/papers/Fahndrich_NonNull.pdf>
            [2]
            
https://homes.cs.washington.edu/~mernst/pubs/pluggable-checkers-issta2008.pdf
            
<https://homes.cs.washington.edu/~mernst/pubs/pluggable-checkers-issta2008.pdf>
            [3]
            
https://github.com/google/guava/blob/fe3fda0ca54076a2268d060725e9a6e26f867a5e/pom.xml#L275
            
<https://github.com/google/guava/blob/fe3fda0ca54076a2268d060725e9a6e26f867a5e/pom.xml#L275>
            [4]
            
https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20(summary%20~%20%22NPE%22%20OR%20summary%20~%20%22NullPointerException%22%20OR%20description%20~%20%22NPE%22%20OR%20description%20~%20%22NullPointerException%22%20OR%20comment%20~%20%22NPE%22%20OR%20comment%20~%20%22NullPointerException%22)
            
<https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20(summary%20~%20%22NPE%22%20OR%20summary%20~%20%22NullPointerException%22%20OR%20description%20~%20%22NPE%22%20OR%20description%20~%20%22NullPointerException%22%20OR%20comment%20~%20%22NPE%22%20OR%20comment%20~%20%22NullPointerException%22)>
            [5] https://github.com/apache/beam/pull/12284
            <https://github.com/apache/beam/pull/12284> and
            https://github.com/apache/beam/pull/12162
            <https://github.com/apache/beam/pull/12162> and
            [6]
            
https://github.com/google/guava/blob/fe3fda0ca54076a2268d060725e9a6e26f867a5e/guava/src/com/google/common/base/Preconditions.java#L878
            
<https://github.com/google/guava/blob/fe3fda0ca54076a2268d060725e9a6e26f867a5e/guava/src/com/google/common/base/Preconditions.java#L878>
            [7]
            
https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/util/Preconditions.java
            
<https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/util/Preconditions.java>
            [8]
            https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8152174
            <https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8152174>


            On Mon, Mar 15, 2021 at 2:12 PM Reuven Lax
            <[email protected] <mailto:[email protected]>> wrote:

                I have some deeper concerns with the null checks. The
                fact that many libraries we use (including guava)
                don't always annotate their methods forces a lot of
                workarounds. As a very simple example, the return
                value from Preconditions.checkNotNull clearly can
                never be null, yet the nullability checks don't know
                this. This and other similar cases require constantly
                adding extra unnecessary null checks in the code just
                to make the checker happy. There have been other
                cases where I haven't been able to figure out a way
                to make the checker happy (often these seem to
                involve using lambdas), and after 10-15 minutes of
                investigation have given up and disabled the check.

                Now you might say that it's worth the extra pain and
                ugliness of writing "useless" code to ensure that we
                have null-safe code. However I think this ignores a
                sociological aspect of software development. I have a
                higher tolerance than many for this sort of pain, and
                I'm willing to spend some time figuring out how to
                rewrite my code such that it makes the checker happy
                (even though often it forced me to write much more
                awkward code). However even I have often found myself
                giving up and just disabling the check. Many others
                will have less tolerance than me, and will simply
                disable the checks. At that point we'll have a
                codebase littered with @SuppressWarnings("nullness"),
                which doesn't really get us where we want to be. I've
                seen similar struggles in other codebases, and
                generally having a static checker with too many false
                positives often ends up being worse than having no
                checker.

                On Mon, Mar 15, 2021 at 10:37 AM Ismaël Mejía
                <[email protected] <mailto:[email protected]>> wrote:

                    +1

                    Even if I like the strictness for Null checking,
                    I also think that
                    this is adding too much extra time for builds
                    (that I noticed locally
                    when enabled) and also I agree with Jan that the
                    annotations are
                    really an undesired side effect. For reference
                    when you try to auto
                    complete some method signatures on IntelliJ on
                    downstream projects
                    with C-A-v it generates some extra Checkers
                    annotations like @NonNull
                    and others even if the user isn't using them
                    which is not desirable.



                    On Mon, Mar 15, 2021 at 6:04 PM Kyle Weaver
                    <[email protected]
                    <mailto:[email protected]>> wrote:
                    >>
                    >> Big +1 for moving this to separate CI job. I
                    really don't like what annotations are currently
                    added to the code we ship. Tools like Idea add
                    these annotations to code they generate when
                    overriding classes and that's very annoying.
                    Users should not be exposed to internal tools
                    like nullability checking.
                    >
                    >
                    > I was only planning on moving this to a
                    separate CI job. The job would still be release
                    blocking, so the same annotations would still be
                    required.
                    >
                    > I'm not sure which annotations you are
                    concerned about. There are two annotations
                    involved with nullness checking,
                    @SuppressWarnings and @Nullable.
                    @SuppressWarnings has retention policy SOURCE, so
                    it shouldn't be exposed to users at all.
                    @Nullable is not just for internal tooling, it
                    also provides useful information about our APIs
                    to users. The user should not have to guess
                    whether a method argument etc. can be null or
                    not, and for better or worse, these annotations
                    are the standard way of expressing that in Java.

Re: Null checking in Beam

Reply via email to