Le 19/02/2012 21:19, bearophile a écrit :
A belated comment on the GoingNative 2012 talk "Defending C++ fom Murphy's Million 
Monkeys" by Chandler Carruth:
http://channel9.msdn.com/Events/GoingNative/GoingNative-2012/Clang-Defending-C-from-Murphy-s-Million-Monkeys

The slides:
http://ecn.channel9.msdn.com/events/GoingNative12/GN12Clang.pdf

This talk shows some ways used by Clang to help the programmer spot or fix some 
mistakes in C++ code (according to Chandler, all bugs shown in the talk are 
real bugs found in production software). I think some of those ideas are 
interesting or useful for D too.

-----------------------------

Slides page 31 - 32, 14.48 in the video:

I have adapted the Clang example to this D code:


class BaseType {}
static int basetype;
class DerivedType : Basetype {}
void main() {}


The latest DMD gives this error:
test.d(3): Error: undefined identifier Basetype, did you mean variable basetype?


The idea here is to restrict the list of names to search for similar ones only among the 
class names, because here "basetype" is an int so it can't be what the 
programmer meant. There are few other groups of names for other situations.

This also reduces the work of the routines that look for similar names, because 
they need to work on a shorter list of possibilities.

-----------------------------

Slides page 47 - 48, 21.24 in the video:

C++ code:

static const long long DiskCacheSize = 8<<  30; // 8 Gigs

Clang gives:

% clang++ -std=c++11 -fsyntax-only overflow.cpp
overflow.cpp:1:42: warning: signed shift result (0x200000000) requires 35
bits to represent, but 'int' only has 32 bits [-Wshift-overflow]
static const long long DiskCacheSize = 8<<  30; // 8 Gigs
                                        ~ ^  ~~

In D/DMD this compiles with no errors or warnings:

static const long diskCacheSize = 8<<  30;
static assert(diskCacheSize == 0);
void main() {}


This kind of overflow errors are important and common enough, so I think D too 
has to spot them, as Clang.


I have done a little test, and I've seen that with the latest Clang this C++ 
code gives no warnings:

const static unsigned int x = -1;

-----------------------------

Slides 57 - 58, 28.11 in the video:

C++ code:

void test(bool b, double x, double y) {
   if (b || x<  y&&  x>  0) {
     // ...
   }
}


Clang gives:


% clang++ -std=c++11 -fsyntax-only -Wparentheses parentheses3.cpp
parentheses3.cpp:2:18: warning: '&&' within '||' [-Wlogical-op-
parentheses]
   if (b || x<  y&&  x>  0) {
         ~~ ~~~~~~^~~~~~~~
parentheses3.cpp:2:18: note: place parentheses around the '&&' expression
to silence this warning
   if (b || x<  y&&  x>  0) {
                  ^
            (             )
1 warning generated.


DMD compiles that code with no warnings. I think a similar warning is useful in 
D/DMD too.

-----------------------------

Slides 59 - 60, 29.17 in the video:

int test(bool b, int x, int y) {
   return 42 + b ? x : y;
}


Clang gives:


% clang++ -std=c++11 -fsyntax-only -Wparentheses parentheses4.cpp
parentheses4.cpp:2:17: warning: operator '?:' has lower precedence than
'+'; '+' will be evaluated first [-Wparentheses]
   return 42 + b ? x : y;
          ~~~~~~ ^
parentheses4.cpp:2:17: note: place parentheses around the '+' expression
to silence this warning
   return 42 + b ? x : y;
                 ^
          (     )
parentheses4.cpp:2:17: note: place parentheses around the '?:' expression
to evaluate it first
   return 42 + b ? x : y;
                 ^
               (        )
1 warning generated.


DMD compiles that code with no warnings.
(Maybe here Don thinks that accepting int+bool in the language rules is bad in 
the first place.)

-----------------------------

A little comment on slides 63 - 64: in D when I switch on an enumeration, I 
always use a final switch. I think use cases for nonfinal switches on enums are 
rare in D code.

-----------------------------

Slides 65 - 66, 31.57 in the video:


constexpr int arr_size = 42;
constexpr int N = 44;
void f(int);
int test() {
   int arr[arr_size];
   // ...
   f(arr[N]);
   // ...
   if (N<  arr_size) return arr[N];
   return 0;
}


Clang gives:

% clang++ -std=c++11 -fsyntax-only deadcode.cpp
deadcode.cpp:7:5: warning: array index 44 is past the end of the array
(which contains 42 elements) [-Warray-bounds]
   f(arr[N]);
     ^   ~
deadcode.cpp:5:3: note: array 'arr' declared here
   int arr[arr_size];
   ^
1 warning generated.


Similar D code:

enum int arr_size = 42;
enum int N = 44;
void foo(int) {}
int test() {
     int[arr_size] arr;
     foo(arr[N]);
     if (N<  arr_size)
         return arr[N];
     return 0;
}
void main() {}


DMD is able to catch the array overflow bugs, but it shows three errors:

test.d(6): Error: array index 44 is out of bounds arr[0 .. 42]
test.d(6): Error: array index 44 is out of bounds arr[0 .. 42]
test.d(8): Error: array index 44 is out of bounds arr[0 .. 42]


Clang doesn't give an error at the "return arr[N];" line because it's inside the 
"if(N<arr_size)" branch. This is nice (despite this code looks silly, it's real code 
reduced from a meaningful example, the enums come from conditional compilation, etc).

As a related example, in D I have written code similar to:


void foo(ulong x) {
     if (x<  5_000)
         int y = x;
}
void main() {}


Here DMD asks for a cast(int) or to!int:
test.d(3): Error: cannot implicitly convert expression (x) of type ulong to int

But a bit smarter compiler/language is able to tell there's no need for casts 
or conversions there.

-----------------------------

The lock annotations in slides 70 - 74 are nice.

-----------------------------

Regarding slides 76 - 79, is it useful to catch simple bugs like this 
statically?


void main() {
     auto a = new int[10];
     a[10] = 2;
}

-----------------------------

Slides 80 - 81, 41.46 in the video:

Clang is able to catch this bug at runtime in C++ code:


#include<iostream>
int test(int n) {
   return 42<<  n;
}
int main() {
   std::cerr<<  "Bogus: "<<  test(42)
             <<  "\n";
}



% clang++ -g -std=c++0x undef.cpp -o undef -fcatch-undefined-behavior
% gdb -x =(echo -e "r\nbt") --args ./undef
...
Program received signal SIGILL, Illegal instruction.
0x00000000004007fe in test (n=42) at undef.cpp:3
3         return 42<<  n;
#0  0x00000000004007fe in test (n=42) at undef.cpp:3
#1  0x000000000040084b in main () at undef.cpp:6


Here Chandler says that this runtime instrumentation is light and fast. This is 
good compiler sugar for future D compilers (maybe for LDC2).

-----------------------------

Taken one at a time those Clang features are small, but all at the same time 
are impressive. Hopefully Clang will poke GCC developers to improve the 
diagnostic features of their compiler.

I think some of those ideas are good for D too, so I'd like to open few D 
enhancement requests based on this post, but I have to think more on them. 
Suggestions and comments are welcome.

Bye,
bearophile

+100

Especially for parenthesis. « If you think you can remember precendence rules beyond +-*/, you're only 99% right. Also, nobody else can remember them, so your code is now 1% buggy, and 100% unmaintainable. Use brackets. » : http://home.comcast.net/~tom_forsyth/blog.wiki.html#OffendOMatic

Reply via email to