I've been interested in having the D compiler take advantage of the flow analysis in the optimizer to do some more checking. Coverity and clang get a lot of positive press about doing this, but any details of exactly *what* they do have been either carefully hidden (in Coverity's case) or undocumented (clang's page on this is blank). All I can find is marketing hype and a lot of vague handwaving.

Here is what I've been able to glean about what they detect, from much time spent with Google and from my knowledge of how data flow analysis works:

1. dereference of NULL pointers (all reaching definitions of a pointer are NULL)

2. possible dereference of NULL pointers (some reaching definitions of a pointer are NULL)

3. use of uninitialized variables (no reaching definition)

4. dead assignments (assignment of a value to a variable that is never subsequently used)

5. dead code (code that can never be executed)

6. array overflows

7. proper pairing of allocate/deallocate function calls

8. improper use of signed integers (who knows what this actually is)
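To make items 1 to 3 concrete, here is a rough sketch of how reaching definitions classify a pointer dereference. It is written in Python purely for brevity and is not how dmd's optimizer or any of these tools is actually implemented. The statement encoding (`("def", var, kind)` and `("deref", var)`), the block names, and the function names are all made up for illustration.

```python
from collections import defaultdict

def transfer(name, stmts, defs):
    """Apply a block's definitions: each new def of a variable kills the old ones."""
    defs = set(defs)
    for i, stmt in enumerate(stmts):
        if stmt[0] == "def":
            _, var, kind = stmt
            defs = {d for d in defs if d[0] != var}
            defs.add((var, kind, name, i))
    return defs

def reaching_definitions(blocks, edges):
    """Iterate to a fixed point; returns the definitions reaching each block's entry."""
    preds = defaultdict(list)
    for src, dst in edges:
        preds[dst].append(src)
    in_sets = {b: set() for b in blocks}
    out_sets = {b: set() for b in blocks}
    changed = True
    while changed:
        changed = False
        for b, stmts in blocks.items():
            new_in = set()
            for p in preds[b]:
                new_in |= out_sets[p]
            new_out = transfer(b, stmts, new_in)
            if new_in != in_sets[b] or new_out != out_sets[b]:
                in_sets[b], out_sets[b] = new_in, new_out
                changed = True
    return in_sets

def check_derefs(blocks, edges):
    """Classify every dereference by the kinds of definitions that reach it."""
    in_sets = reaching_definitions(blocks, edges)
    findings = []
    for name, stmts in blocks.items():
        defs = set(in_sets[name])
        for i, stmt in enumerate(stmts):
            if stmt[0] == "def":
                _, var, kind = stmt
                defs = {d for d in defs if d[0] != var}
                defs.add((var, kind, name, i))
            else:
                _, var = stmt
                kinds = {d[1] for d in defs if d[0] == var}
                if not kinds:
                    findings.append((name, var, "uninitialized"))              # item 3
                elif kinds == {"null"}:
                    findings.append((name, var, "null dereference"))           # item 1
                elif "null" in kinds:
                    findings.append((name, var, "possible null dereference"))  # item 2
    return findings

# p is set to null, then reassigned on only one branch; q stays null; r is never set.
example_blocks = {
    "entry": [("def", "p", "null"), ("def", "q", "null")],
    "then":  [("def", "p", "nonnull")],
    "join":  [("deref", "p"), ("deref", "q"), ("deref", "r")],
}
example_edges = [("entry", "then"), ("entry", "join"), ("then", "join")]
print(check_derefs(example_blocks, example_edges))
```

In this toy model the difference between items 1 and 2 is only whether all or some of the reaching definitions are null, and item 3 is the case where the set is empty.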


Frankly, this is not an impressive list. These issues are discoverable using standard data flow analysis, and in fact are part of Digital Mars' optimizer. Here is the current state of it for dmd:

1. Optimizer discovers it, but ignores the information. Due to the recent thread on it, I added a report for it for D (still ignored for C). The downside is I can no longer use *cast(char*)0=0 to drop me into the debugger, but I can live with that as assert(0) will do the same thing.

2. Optimizer collects the info, but ignores it, because people are annoyed by false positives.

3. Optimizer detects and reports it. Irrelevant for D, though, because variables are always initialized. The =void case is rare enough to be irrelevant.

4. Dead assignments are automatically detected and removed. I'm not convinced this should be reported, as it can legitimately happen when generating source code. Generating false positives annoys the heck out of users.

5. Dead code is detected and silently removed by the optimizer. The dmd front end will complain about dead code.

6. Arrays are solidly covered by a runtime check. There is code in the optimizer to detect many cases of overflows at compile time, but the code is currently disabled because the runtime check covers 100% of the cases.

7. Not done because it requires the user to specify what the paired functions are. Given this info, it is rather simple to graft onto existing data flow analysis.

8. D2 has acquired some decent checking for this.
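As an illustration of item 7, here is a rough sketch of what grafting a pairing check onto a forward data flow pass could look like once the user has supplied the paired functions. It is in Python for brevity, it is not dmd code, and the pair table, the statement encoding `(func, handle)`, and all names are hypothetical. It only tracks which acquisitions *may* still be open, so it shares the false-positive problem of item 2.

```python
def check_pairing(blocks, edges, exit_block, pairs):
    """pairs maps an acquire function to its release function (user-supplied)."""
    release_names = set(pairs.values())
    preds = {b: [] for b in blocks}
    for src, dst in edges:
        preds[dst].append(src)
    # out[b]: (handle, acquire_func) pairs that may still be open at the end of b
    out = {b: set() for b in blocks}
    findings = []

    def run(b, report):
        open_ = set()
        for p in preds[b]:
            open_ |= out[p]
        for func, handle in blocks[b]:
            if func in pairs:
                open_.add((handle, func))
            elif func in release_names:
                matching = {o for o in open_
                            if o[0] == handle and pairs[o[1]] == func}
                if report and not matching:
                    findings.append((b, handle, func + " without matching acquire"))
                open_ -= {o for o in open_ if o[0] == handle}
        return open_

    # Iterate to a fixed point, then make one reporting pass.
    changed = True
    while changed:
        changed = False
        for b in blocks:
            new_out = run(b, False)
            if new_out != out[b]:
                out[b] = new_out
                changed = True
    for b in blocks:
        run(b, True)
    for handle, func in sorted(out[exit_block]):
        findings.append((exit_block, handle,
                         func + " may not be paired with " + pairs[func]))
    return findings

# buf is freed on only one branch; f is always closed.
pair_table = {"malloc": "free", "fopen": "fclose"}
pairing_blocks = {
    "entry": [("malloc", "buf"), ("fopen", "f")],
    "then":  [("free", "buf")],
    "exit":  [("fclose", "f")],
}
pairing_edges = [("entry", "then"), ("entry", "exit"), ("then", "exit")]
print(check_pairing(pairing_blocks, pairing_edges, "exit", pair_table))
```

The analysis itself is nothing special; the only input it cannot work out for itself is the pair table.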


There's a lot of hoopla about these static checkers, but I'm not impressed by them based on what I can find out about them. What do you know about what these checkers do that is not on this list? Any other kinds of checking that would be great to implement?

D's dead code checking has been an encouraging success, and I think people will like the null dereference checks. More along these lines will be interesting.
