Hi Jim,

I'm looking for a test case that shows that the analyzer clearly understands 
the semantics of that statement.  A good way to do that is two provide two test 
cases, one that generates an error and another that doesn't, with the one that 
doesn't utilizing some bit about the semantics of the statement.

Here would be a good example:

void test1() {
  int array[2] = { 1, 2 };
  int j = 0;
  for (auto i : array) { j += i; }
  
  int *p = 0;
  *p = 0xDEADBEEF;  // expected-warning {{null}}
}

void test2() {
  int array[2] = { 1, 2 };
  int j = 0;
  for (auto i : array) { j += i; }
  
  if (j == 3)
    return;

  int *p = 0;
  *p = 0xDEADBEEF;  // no-warning
}


Notice that the two tests are practically identical.  The two test cases are 
important because they serve cross-check each other.  The first test shows that 
the analyzer is basically working, even in the presence of the special for 
loop, but it doesn't rely on any specific semantics of that statement.

Note that I also made the array small, to avoid any issues with exhausting the 
number of loop traversals.  In this case, we should explore every path.

The second test shows that we properly reason about the for loop in 
path-sensitive manner, which causes the warning to disappear.

If we omitted the first test, the second test would not validate our logic.  We 
could instead be missing a warning because we silently dropped the analysis of 
the function on the floor.

The test file to modify would be test/Analysis/misc-ps-cxx0x.cpp.

After applying this patch, I noticed that these tests do not pass, so the next 
step is to understand why.

It's instructive to look how this statement is modeled in the CFG.  If I just 
put test2 into a file t.cpp, I can look at the CFG:

$ clang -std=c++0x --analyze -Xclang -analyzer-checker=debug.DumpCFG t.cpp

 [ B8 (ENTRY) ]
    Predecessors (0):
    Successors (1): B7

 [ B1 ]
      1: 0
      2: [B1.1]
      3: int *p = 0;
      4: 3735928559U
      5: [B1.4]
      6: p
      7: [B1.6]
      8: *[B1.7]
      9: [B1.8] = [B1.5]
    Predecessors (1): B3
    Successors (1): B0

 [ B2 ]
      1: return;
    Predecessors (1): B3
    Successors (1): B0

 [ B3 ]
      1: j
      2: [B3.1]
      3: 3
      4: [B3.2] == [B3.3]
      T: if [B3.4]
    Predecessors (1): B4
    Successors (2): B2 B1

 [ B4 ]
      1: [B6.1]
      2: [B6.1]
      3: __end
      4: [B4.3]
      5: [B4.2] != [B4.4]
      T: for (int i : [B7.7]) {
    {
        [B6.9];
    }
}

    Predecessors (2): B5 B7
    Successors (2): B6 B3

 [ B5 ]
      1: [B6.1]
      2: ++[B6.1]
    Predecessors (1): B6
    Successors (1): B4

 [ B6 ]
      1: __begin
      2: [B6.1]
      3: *[B6.2]
      4: [B6.3]
      5: int i = *__begin;
      6: i
      7: [B6.6]
      8: j
      9: [B6.8] += [B6.7]
    Predecessors (1): B4
    Successors (1): B5

 [ B7 ]
      1: 2
      2: 1
      3: { [B7.2], [B7.1] }
      4: int array[2] = { 1, 2 };
      5: 0
      6: int j = 0;
      7: array
      8: auto int (&&__range)[2] = array;
      9: [B7.12]
     10: [B7.12]
     11: auto int *__begin = __range;
     12: __range
     13: [B7.12]
     14: 2L
     15: [B7.13] + [B7.14]
     16: auto int *__end = __range + 2L;
    Predecessors (1): B8
    Successors (1): B4

 [ B0 (EXIT) ]
    Predecessors (2): B1 B2
    Successors (0):


Take a look at block B7, which is the meat of the CFG representation for 
CXXForRangeStmt.  I've highlighted the problems:

 [ B7 ]
      1: 2
      2: 1
      3: { [B7.2], [B7.1] }
      4: int array[2] = { 1, 2 };
      5: 0
      6: int j = 0;
      7: array
      8: auto int (&&__range)[2] = array;
-->  9: [B7.12]
-->  10: [B7.12]
     11: auto int *__begin = __range;
     12: __range
     13: [B7.12]
     14: 2L
     15: [B7.13] + [B7.14]
     16: auto int *__end = __range + 2L;
    Predecessors (1): B8
    Successors (1): B4

It appears that the 12th expression in the basic block is referred to by the 
9th and 10th (before it gets evaluated).  That's completely bogus.  From the 
looks of it, it appears the CFG support for CXXForRangeStmt is wrong.

The problem is clearer when you view the analysis exploded graph:

$ ~/clang-xcode/bin/Debug/clang -std=c++0x --analyze t.cpp -Xclang 
-analyzer-viz-egraph-graphviz

I won't show it here, but basically the analysis graph caches out the path when 
evaluating block B7 by creating an artificial cycle.  Thus it is completely 
busted.

I suspected the culprit is here:

CFGBlock *CFGBuilder::VisitCXXForRangeStmt(CXXForRangeStmt *S) {
  ...

  // Add the initialization statements.
  Block = createBlock();
  addStmt(S->getBeginEndStmt());
  return addStmt(S->getRangeStmt());
}


An AST dump shows the problem:

  (CXXForRangeStmt 0x7fb8db83dad8 <line:4:3, col:34>
    (DeclStmt 0x7fb8db83d5a8 <col:17>
      0x7fb8db83d3a0 "auto int (&&__range)[2] =
        (DeclRefExpr 0x7fb8db83d260 <col:17> 'int [2]' lvalue Var 
0x7fb8db83d030 'array' 'int [2]')")
    (DeclStmt 0x7fb8db83d8e8 <col:15>
      0x7fb8db83d600 "auto int *__begin =
        (ImplicitCastExpr 0x7fb8db83d7c0 <col:15> 'int *' <ArrayToPointerDecay>
-->          (DeclRefExpr 0x7fb8db83d5c0 <col:15> 'int [2]' lvalue Var 
0x7fb8db83d3a0 '__range' 'int (&&)[2]'))"
      0x7fb8db83d670 "auto int *__end =
        (BinaryOperator 0x7fb8db83d818 <col:15, col:17> 'int *' '+'
          (ImplicitCastExpr 0x7fb8db83d800 <col:15> 'int *' 
<ArrayToPointerDecay>
-->            (DeclRefExpr 0x7fb8db83d5c0 <col:15> 'int [2]' lvalue Var 
0x7fb8db83d3a0 '__range' 'int (&&)[2]'))
          (IntegerLiteral 0x7fb8db83d7d8 <col:17> 'long' 2))")
    (BinaryOperator 0x7fb8db83d980 <col:15> '_Bool' '!='
      (ImplicitCastExpr 0x7fb8db83d950 <col:15> 'int *':'int *' <LValueToRValue>
        (DeclRefExpr 0x7fb8db83d900 <col:15> 'int *':'int *' lvalue Var 
0x7fb8db83d600 '__begin' 'int *':'int *'))
      (ImplicitCastExpr 0x7fb8db83d968 <col:15> 'int *':'int *' <LValueToRValue>
        (DeclRefExpr 0x7fb8db83d928 <col:15> 'int *':'int *' lvalue Var 
0x7fb8db83d670 '__end' 'int *':'int *')))
    (UnaryOperator 0x7fb8db83d9a8 <col:15> 'int *':'int *' lvalue prefix '++'
      (DeclRefExpr 0x7fb8db83d900 <col:15> 'int *':'int *' lvalue Var 
0x7fb8db83d600 '__begin' 'int *':'int *'))

The same DeclRefExpr for __range appears TWICE (you can tell from the pointer 
address of the DeclRefExpr) as subexpressions.  This completely breaks an 
invariant in the CFG that all expressions appear only once, as a single 
subexpression.  This causes the analyzer to cache out the analysis of the for 
loop when evaluating the prologue of the loop.

I don't know of any other AST node that does this, because this essentially 
makes the AST a DAG instead of a tree.  Arguably the AST should be changed, but 
this may be considered a reasonable optimization.  Alternatively, the CFG 
builder can be taught to do the right thing here.  In either case, one of these 
changes needs to be made in order for us to try and model the semantics of 
CXXForRangeStmt correctly.

I'll look into fixing this problem in the next few days, but feel free to 
tackle it if you are interested.

Cheers,
Ted

On Oct 4, 2011, at 4:46 PM, Jim Goodnow II wrote:

> Or do you just want an example of the ForRange and show that it doesn't cause 
> the analyzer to generate an error? Is there a file for that kind of thing 
> already or should it be a new file? For example:
> 
> int array[ 10 ];
> for ( auto i : array ) ;
> 
> Thanks!
> 
> - jim
> 
> On 10/4/2011 3:36 PM, Jim Goodnow II wrote:
>> Hi Ted,
>> 
>> Um, I'm not sure how I would go about writing a test case for a statement 
>> class that shouldn't be showing up as an expression in the CFG. It's just 
>> like "for", "break", "continue", etc. Is there an example test for those 
>> cases I could look at? Thanks!
>> 
>> - jim
>> 
>> On 10/4/2011 3:20 PM, Ted Kremenek wrote:
>>> Test case please.  :)
>>> 
>>> On Oct 4, 2011, at 1:32 PM, Jim Goodnow II wrote:
>>> 
>>>> This is just a program control statement and should be handled the same as 
>>>> any other.
>>>> 
>>>> - jim
>>>> 
>>>> Index: lib/StaticAnalyzer/Core/ExprEngine.cpp
>>>> ===================================================================
>>>> --- lib/StaticAnalyzer/Core/ExprEngine.cpp    (revision 141095)
>>>> +++ lib/StaticAnalyzer/Core/ExprEngine.cpp    (working copy)
>>>> @@ -453,7 +453,6 @@
>>>>     case Stmt::CXXBindTemporaryExprClass:
>>>>     case Stmt::CXXCatchStmtClass:
>>>>     case Stmt::CXXDependentScopeMemberExprClass:
>>>> -    case Stmt::CXXForRangeStmtClass:
>>>>     case Stmt::CXXPseudoDestructorExprClass:
>>>>     case Stmt::CXXTemporaryObjectExprClass:
>>>>     case Stmt::CXXThrowExprClass:
>>>> @@ -501,6 +500,7 @@
>>>>     case Stmt::CaseStmtClass:
>>>>     case Stmt::CompoundStmtClass:
>>>>     case Stmt::ContinueStmtClass:
>>>> +    case Stmt::CXXForRangeStmtClass:
>>>>     case Stmt::DefaultStmtClass:
>>>>     case Stmt::DoStmtClass:
>>>>     case Stmt::ForStmtClass:
>>>> 
>>>> <ForRange.patch>
>>> 
>>> 

_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits

Reply via email to