Re: PR analyzer/94362 Partial Fix

2021-03-02 Thread brian.sobulefsky via Gcc-patches
Agreed too. Generic "error on overflow" is not an answer, and ignoring overflow
is not an answer either because flagging faulty memory allocations is an
important feature.

Brian


Sent with ProtonMail Secure Email.

‐‐‐ Original Message ‐‐‐
On Tuesday, March 2, 2021 6:09 PM, Jeff Law  wrote:

> On 3/2/21 6:40 PM, David Malcolm via Gcc-patches wrote:
>
> > > My devil's advocate position would be if the analyzer raises an
> > > exception on any possible overflow, will that overwhelm the user
> > > with false positives?
> > Presumably by "raise exception" you mean "issue a diagnostic and stop
> > analyzing this path", right?
> >
> > I think the point is to detect where numerical overflow can lead to
> > e.g. a buffer overflow, rather than complain about numerical overflow
> > in its own right, like in the make_arr example I gave earlier.
>
> WRT overflow, IMHO, the most valuable case to detect overflows is when
> they feed an allocation via malloc/alloca.  If an attacker can arrange
> to overflow the size computation, then they can cause an
> under-allocation which in turn opens the ability to over-write the stack
> or heap data structures, which in turn are great attack vectors.
>
> And in case you think that's contrived, it isn't :-)
>
> http://phrack.org/issues/67/9.html
>
> > > I am not sure of the answer here, because a piece of me feels that
> > > overflow is not something that production code should be relying
> > > on in any serious application, and so should be nonexistent, but I
> > > am not sure if that is reflective of reality.
> > My belief is that production code is full of overflows, but only some
> > of them are security-sensitive. Consider e.g. hashing algorithms that
> > sum some values and benignly assume overflow for wraparound as opposed
> > to the "calculate the size of the buffer to be allocated" example
> > (where the overflow is a classic security pitfall).
>
> Agreed.
>
> Jeff




Re: PR analyzer/94362 Partial Fix

2021-03-02 Thread brian.sobulefsky via Gcc-patches
Wow! I wasn't expecting that to work. Obviously we know that there is
currently no handler for binop_svalue in the constraints, so I would have
to watch it run with state merging disabled to see how it is managing the
unroll. The fact that merging breaks it is indicative of what we are
saying though, that the constraint and merge model is currently
insufficient.

Hash algorithms may provide a counterexample for legitimate use of
overflow. Anyway, I would prefer tracking the split too, but it is a big
change. One way is a state split, like you say, but that is a pretty
invasive change to the way the graph works. The other way is to handle
"or constraints" as I have said. This is an invasive change to the
constraint model, but arguably the concept of "or" cannot be ignored
forever.

Thinking about it, I guess currently the concept of "and" is handled by the
constraints (all constraints in a model exist as a big "and") and the concept of
"or" is handled by the graph. This could be acceptable but we cannot split the
graph arbitrarily, so there is currently no way to handle even a basic

if (i == 1 || i == 10)

what should be a very simple conditional. Handling a hash algorithm, like
you say, is good to keep in mind, because we don't want an explosion of
possibilities. If done right, the analyzer should understand for hashing that
anything + anything => anything, and we see no explosion of state.

By "raise an exception" I did mean issue an analyzer warning, yes. Perhaps
the simple answer is to just track the svalue as a possible overflow in
the state machine and report the warning for certain uses, like the alloc
family, as you said. Regardless, proper overflow handling renders my naive
binop handler unusable, because all it does is fold the condition and
recur. There is basically no logic to it, and once you reenter
eval_condition it is not possible to know how you got there.



Re: PR analyzer/94362 Partial Fix

2021-03-02 Thread Jeff Law via Gcc-patches



On 3/2/21 6:40 PM, David Malcolm via Gcc-patches wrote:
>
>> My devil's advocate position would be if the analyzer raises an
>> exception on any possible overflow, will that overwhelm the user
>> with false positives?
> Presumably by "raise exception" you mean "issue a diagnostic and stop
> analyzing this path", right?
>
> I think the point is to detect where numerical overflow can lead to
> e.g. a buffer overflow, rather than complain about numerical overflow
> in its own right, like in the make_arr example I gave earlier.
WRT overflow, IMHO, the most valuable case to detect overflows is when
they feed an allocation via malloc/alloca.  If an attacker can arrange
to overflow the size computation, then they can cause an
under-allocation which in turn opens the ability to over-write the stack
or heap data structures, which in turn are great attack vectors.

And in case you think that's contrived, it isn't :-)

http://phrack.org/issues/67/9.html

>> I am not sure of the answer here, because a piece of me feels that
>> overflow is not something that production code should be relying on
>> in any serious application, and so should be nonexistent, but I am
>> not sure if that is reflective of reality.
> My belief is that production code is full of overflows, but only some
> of them are security-sensitive.  Consider e.g. hashing algorithms that
> sum some values and benignly assume overflow for wraparound as opposed
> to the "calculate the size of the buffer to be allocated" example
> (where the overflow is a classic security pitfall).
Agreed.


Jeff


Re: PR analyzer/94362 Partial Fix

2021-03-02 Thread David Malcolm via Gcc-patches
On Tue, 2021-03-02 at 23:14 +, brian.sobulefsky wrote:
> I have been kicking these sorts of ideas around ever since I came to
> understand that the second "UNKNOWN" in the for loop that originally
> started this was due to the state merge as we loop. For now, and I
> don't mean this disrespectfully because it is very hard to get right,
> the whole issue of merging has basically been punted, given some of
> the simple examples we found that will merge as an unknown svalue. As
> you think about this issue, "scope creep" becomes a concern quickly.
> It quickly turns into a halting problem of sorts: you have to decide
> how much you want the analyzer to be able to "understand" a program.
> For example, any human can follow this:
> 
> sum = 0;
> for (idx = 1; idx <= 10; idx++) sum += idx;
> __analyzer_eval (sum == 55);
> 
> but from an analyzer perspective it opens up all sorts of questions
> and becomes a bit of a PhD thesis as to where you draw the line. 

Challenge accepted!  FWIW with suitable options, the analyzer can
actually "figure this out":

$ cat ../../src/t.c
extern void __analyzer_eval (int);

void test (void)
{
 int sum = 0;
 for (int idx = 1; idx <= 10; idx++)
   sum += idx;
 __analyzer_eval (sum == 55);
}

$ ./xgcc -B. -S -fanalyzer ../../src/t.c \
-Wanalyzer-too-complex \
-fno-analyzer-state-merge \
--param analyzer-max-enodes-per-program-point=11
../../src/t.c: In function ‘test’:
../../src/t.c:8:2: warning: TRUE
8 |  __analyzer_eval (sum == 55);
  |  ^~~

i.e. with -fno-analyzer-state-merge to disable state merging, 
and increasing the enode limit so that the analyzer effectively fully
unrolls the loop when exploring the exploded_graph.

But obviously this isn't particularly useful, except as a demo of the
internals of the analyzer.


> The biggest concern with the analyzer seems to be vulnerabilities, so
> I doubt it is useful to get the analyzer to produce the correct answer
> for the above code, although it might be interesting to do so from an
> academic perspective.

Indeed - security vulnerabilities are my highest priority (making it
easier to avoid them as code is written/patched, and to find them in
existing code).

> The example you provided gives a concrete reason that overflow should
> not be a complete "punt" and I like it. In the interest of fighting
> scope creep and keeping things manageable, I would question whether
> you want to actually track the overflow / no overflow cases separately
> or just raise any possible overflow as an error immediately. I am not
> disputing your idea, I would prefer to track the overflow and get a
> correct result (in this case, an under allocation of memory). I guess
> I would want to know how much work you think that will be. You still
> know the codebase a lot better than I do.

Brainstorming somewhat, another idea I have for handling overflow (as
opposed to the || idea you mentioned) might be to bifurcate state at
each point where overflow could occur, splitting the path into "didn't
overflow" and "did overflow" outcomes, adding conditions to each
successor state accordingly.

But maybe that would lead to a combinatorial explosion of nodes (unless
it can be tamed by merging?)

(Unfortunately, we can currently only split states at CFG splits, not
at arbitrary statements; see
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99260 )

> My devil's advocate position would be if the analyzer raises an
> exception on any possible overflow, will that overwhelm the user with
> false positives?

Presumably by "raise exception" you mean "issue a diagnostic and stop
analyzing this path", right?

I think the point is to detect where numerical overflow can lead to
e.g. a buffer overflow, rather than complain about numerical overflow
in its own right, like in the make_arr example I gave earlier.

> I am not sure of the answer here, because a piece of me feels that
> overflow is not something that production code should be relying on
> in any serious application, and so should be nonexistent, but I am
> not sure if that is reflective of reality.

My belief is that production code is full of overflows, but only some
of them are security-sensitive.  Consider e.g. hashing algorithms that
sum some values and benignly assume overflow for wraparound as opposed
to the "calculate the size of the buffer to be allocated" example
(where the overflow is a classic security pitfall).

> The simplest way to handle your example would be the following:
> 
> struct foo * make_arr (size_t n)
> {
>   if (n > SIZE_MAX / sizeof (struct foo))
>     return NULL;
>   //continue with what you wrote
> }

Ideally we'd emit a fix-it hint suggesting adding such a clause (I'm
kidding, but only partly).

> This should add a constraint that downstream of the initial guard, n is
> small enough to prevent overflow (I would have to check, but the
> current analyzer should be close to doing this if not 

Re: PR analyzer/94362 Partial Fix

2021-03-02 Thread brian.sobulefsky via Gcc-patches
I have been kicking these sorts of ideas around ever since I came to
understand that the second "UNKNOWN" in the for loop that originally
started this was due to the state merge as we loop. For now, and I don't
mean this disrespectfully because it is very hard to get right, the whole
issue of merging has basically been punted, given some of the simple
examples we found that will merge as an unknown svalue. As you think
about this issue, "scope creep" becomes a concern quickly. It quickly
turns into a halting problem of sorts: you have to decide how much you
want the analyzer to be able to "understand" a program. For example, any
human can follow this:

sum = 0;
for (idx = 1; idx <= 10; idx++) sum += idx;
__analyzer_eval (sum == 55);

but from an analyzer perspective it opens up all sorts of questions and
becomes a bit of a PhD thesis as to where you draw the line. The biggest
concern with the analyzer seems to be vulnerabilities, so I doubt it is
useful to get the analyzer to produce the correct answer for the above
code, although it might be interesting to do so from an academic
perspective.

The example you provided gives a concrete reason that overflow should not
be a complete "punt" and I like it. In the interest of fighting scope
creep and keeping things manageable, I would question whether you want to
actually track the overflow / no overflow cases separately or just raise
any possible overflow as an error immediately. I am not disputing your
idea, I would prefer to track the overflow and get a correct result (in
this case, an under allocation of memory). I guess I would want to know
how much work you think that will be. You still know the codebase a lot
better than I do.

My devil's advocate position would be if the analyzer raises an exception
on any possible overflow, will that overwhelm the user with false
positives? I am not sure of the answer here, because a piece of me feels
that overflow is not something that production code should be relying on
in any serious application, and so should be nonexistent, but I am not
sure if that is reflective of reality.

The simplest way to handle your example would be the following:

struct foo * make_arr (size_t n)
{
  if (n > SIZE_MAX / sizeof (struct foo))
    return NULL;
  //continue with what you wrote
}

This should add a constraint that downstream of the initial guard, n is
small enough to prevent overflow (I would have to check, but the current
analyzer should be close to doing this if not already correct).
Therefore, all we would need would be the check at the definition of sz
as to whether it overflows, and that check should come back negative in
my example, unknown in yours. Assuming we agree that the purpose of the
analyzer is to prevent vulnerabilities, and not to provide an academic
exercise in seeing how close we get to solving the halting problem, we
could just treat any possible overflow as an error.

As this was fundamentally your project, I don't want to tell you how to
do it, and the nerd in me wants to build an analyzer that can answer all
the silly puzzle code I can think to feed it, but from a utilitarian
perspective and given everyone's limited time, I thought I would offer
this as a path you can try.


Re: PR analyzer/94362 Partial Fix

2021-03-02 Thread David Malcolm via Gcc-patches
On Tue, 2021-03-02 at 07:09 +, brian.sobulefsky wrote:
> Hi,
> 
> It may not be worth altering at this point, but it seems like it
> would leave fewer bugs open if all the constraints go in as "closed"
> ranges and all evals are translated to closed intervals. So, if (idx > 0)
> and if (idx >= 1) are the same constraint. I know this won't be an
> option for eventual float support, but that is a different can of
> worms. For integers, you can fix it at those entry points and then all
> other subroutines can ignore the issue of open or closed ranges.
> 
> I fully understand the eye glaze and did not want to have to write it
> that way. I am thinking if there is a cleaner way to do it. Anyway,
> that is why I put a comment in each case to derive the result. This
> issue of "sides of the condition" and "inverted operator" as you call
> it in some places is a recurring theme. It is especially irritating
> when we lose commutativity, as we do with MINUS.
> 
> Adding logic in my subroutine for MULT or DIV is not hard, handling
> overflow is a bit more involved. At the very least, we would need to
> know what the max or min of a particular variable is, which might be
> in the type tree. We would also need to define how we want to handle
> the issue.
> 
> The problem is (and I have been thinking about this a lot in terms of
> constraint merging), there are currently no "or constraints," which
> would be helpful in merging too. So for overflow, when you have
> something like
> 
> if (idx > 0)
>  {
>   idx += num;
>   __analyzer_eval(idx > num);
>  }
> 
> you have gone from a single constraint (idx > 0), to an "or condition"
> (idx > num || idx < INT_MIN + num). The only solution now, other than
> ignoring overflow as a bug that is tolerated, is to just pass it off
> as unhandled (and therefore UNKNOWN). Perhaps you may want to add
> overflow as one of your analyzer warnings if it cannot be ruled out?
> 
> I did try to run a test with a simple || in the condition before just
> to see what would happen, and as you probably know it is not handled
> at all. I did not watch in gdb, but it is obvious from
> constraint-manager.cc that there is nothing to handle it. I think I
> actually did an __analyzer_eval() of the same || condition verbatim
> that was in the if() conditional and still got an UNKNOWN.
> 
> It is a pretty intrusive change to add logic for that, which is why I
> have not done any work on it yet. Without the concept of "or" I don't
> see how we could handle overflow, but maybe you don't really want to
> handle it anyway, but rather just emit a warning if it might be
> considered poor practice to rely on something that is technically
> machine dependent anyway.

I'm not sure how we should handle this.

One approach would be to generalize the constraint-handling so that we
store a set of "facts" about svalues, where the facts are themselves
svalues that are known to be true.

Hence we could build an svalue for the TRUTH_OR_EXPR
  (idx > num || idx < MIN_INT + num)
and store that as a fact within the constraint manager.

But that would be a big rewrite.

Somehow the constraint manager needs to be able to evaluate queries
w.r.t known facts, and probably canonicalize sets of facts, and handle
mergers.

IIRC, the clang analyzer works by exposing an "add fact" interface
(where the facts can be compound symbolic expressions), but
implementing things internally with a choice of either a set of ranges,
or a Z3-backed solver.


If we're considering overflow, I would like to -fanalyzer to eventually
support bounds checking, and e.g. detecting attacks due to buffer size
overflows (CWE-131 due to CWE-190) e.g. in this code that probably
should have used calloc:

struct foo * make_arr (size_t n)
{
  size_t sz = sizeof (struct foo) * n;
  struct foo *p = (struct foo *)malloc (sz);
  if (!p)
    return NULL;
  memset (p, 0, sz);
  return p;
}

void test (size_t n)
{
  struct foo *f = make_arr (n);
  if (!f)
    return;
  for (size_t i = 0; i < n; i++)
    {
      //... do stuff with f[i]
    }
}

it would be good to detect the case when sz overflows and thus the
array is smaller than expected.  I think this could work by recording
that the allocated size of *p (== *f) is
   (size_t)(sizeof (struct foo) * n)

In the loop, the access to f[i] could bifurcate the egraph into:

  outcome A: (sizeof (struct foo) * i) < allocated_size
(carry on, and have this recorded so no further checking needed on
this path)

  outcome B: (sizeof (struct foo) * i) >= allocated_size
(complain about buffer overflow and stop this path)

Outcome B thus occurs when:
  (sizeof (struct foo) * i) >= (size_t)(sizeof (struct foo) * n)
  && (i < n)
so say sizeof (struct foo) == 16
  (16 * i) >= (size_t) (16 * n)
  && (i < n)
and we could then (somehow) show that this can happen e.g. for
  n > (SIZE_MAX / 16).
(and I've probably messed up at least some of the logic in the above)

Obviously this would be 

Re: PR analyzer/94362 Partial Fix

2021-03-01 Thread brian.sobulefsky via Gcc-patches
Hi,

It may not be worth altering at this point, but it seems like it would
leave fewer bugs open if all the constraints go in as "closed" ranges and
all evals are translated to closed intervals. So, if (idx > 0) and
if (idx >= 1) are the same constraint. I know this won't be an option for
eventual float support, but that is a different can of worms. For
integers, you can fix it at those entry points and then all other
subroutines can ignore the issue of open or closed ranges.

I fully understand the eye glaze and did not want to have to write it
that way. I am thinking if there is a cleaner way to do it. Anyway, that
is why I put a comment in each case to derive the result. This issue of
"sides of the condition" and "inverted operator" as you call it in some
places is a recurring theme. It is especially irritating when we lose
commutativity, as we do with MINUS.

Adding logic in my subroutine for MULT or DIV is not hard, handling
overflow is a bit more involved. At the very least, we would need to
know what the max or min of a particular variable is, which might be in
the type tree. We would also need to define how we want to handle the
issue.

The problem is (and I have been thinking about this a lot in terms of
constraint merging), there are currently no "or constraints," which
would be helpful in merging too. So for overflow, when you have
something like

if (idx > 0)
 {
  idx += num;
  __analyzer_eval(idx > num);
 }

you have gone from a single constraint (idx > 0), to an "or condition"
(idx > num || idx < INT_MIN + num). The only solution now, other than
ignoring overflow as a bug that is tolerated, is to just pass it off as
unhandled (and therefore UNKNOWN). Perhaps you may want to add overflow
as one of your analyzer warnings if it cannot be ruled out?

I did try to run a test with a simple || in the condition before just to
see what would happen, and as you probably know it is not handled at
all. I did not watch in gdb, but it is obvious from constraint-manager.cc
that there is nothing to handle it. I think I actually did an
__analyzer_eval() of the same || condition verbatim that was in the if()
conditional and still got an UNKNOWN.

It is a pretty intrusive change to add logic for that, which is why I have not
done any work on it yet. Without the concept of "or" I don't see how we could
handle overflow, but maybe you don't really want to handle it anyway, but
rather just emit a warning if it might be considered poor practice to rely
on something that is technically machine dependent anyway.



Re: PR analyzer/94362 Partial Fix

2021-03-01 Thread David Malcolm via Gcc-patches
On Sat, 2021-02-27 at 10:04 +, brian.sobulefsky wrote:
> Hi,
> 
> Please find a patch to fix part of the bug PR analyzer/94362.

Thanks.  Various comments inline below.

>  This bug is a false positive for a null dereference found when
> compiling openssl. The cause is the constraint_manager not knowing
> that i >= 0 within the for block:
> 
> for ( ; i-- > 0; )
> 
> The bug can be further reduced to the constraint manager not knowing
> that i >= 0 within the if block:
> 
> if (i-- > 0)
> 
> which is not replicated for other operators, such as prefix decrement.
> The cause is that the constraint is applied to the initial_svalue of
> i, while it is a binop_svalue of i that enters the block (with op PLUS
> and arg1 -1). The constraint_manager does not have any constraints for
> this svalue and has no handler. A handler has been added that
> essentially recurs on the remaining arg if the other arg and other
> side of the condition are both constants and the op is PLUS_EXPR or
> MINUS_EXPR.
> 
> This in essence fixed the problem, except an off by one error had been
> hiding in range::eval_condition. This error is hard to notice,
> because, for example, the code
> 
> if(idx > 0)
>   __analyzer_eval(idx >= 1);
> 
> will compile as (check -fdump-ipa-analyzer to see)
> 
> void test (int idx)
> {
>   _Bool _1;
>   int _2;
> 
>   <bb 2> :
>   if (idx_4(D) > 0)
>     goto <bb 3>; [INV]
>   else
>     goto <bb 4>; [INV]
> 
>   <bb 3> :
>   _1 = idx_4(D) > 0;
>   _2 = (int) _1;
>   __analyzer_eval (_2);
> 
>   <bb 4> :
>   return;
> 
> }
> 
> and will print "TRUE" to the screen, but as you can see, it is for the
> wrong reason, because we are not really testing the condition we
> wanted to test.
> 
> You can force the failure (i.e. "UNKNOWN") for yourself with the
> following:
> 
> void test(int idx)
> {
>   int check = 1;
>   if(idx > 0)
>     __analyzer_eval(idx >= check);
> }
> 
> which the compiler will not "fix" on us. 

Thanks.  This looks like a good way to create DejaGnu tests that verify
the constraint_manager code, rather than accidentally testing the
optimizer.


> An examination of range::eval_condition
> should convince you that there is an off by one error. 

Yes, looking at the switch statement, the fact that LT_EXPR and LE_EXPR
share the same case suggests the boundaries aren't properly handled
(and the same for GT_EXPR and GE_EXPR)


> Incidentally, I might
> recommend doing away with "open intervals of integers" entirely.

What would the alternative be?

Note that in the range class a bound can have a NULL m_constant, in
which case that bound is a kind of null bound (the comments should
probably spell this out).

> When running the initial bug (the for loop), you will find that the
> analyzer prints "UNKNOWN" twice for the postfix operator, and "TRUE"
> "UNKNOWN" for other operators. This patch reduces postfix to the same
> state as the other operators. The second "UNKNOWN" seems to be due to
> a second "iterated" pass through the loop with a widening_svalue. A
> full fix of the bug will need a handler for the widening svalue, and,
> much more critically, a correct merge of the constraints at the loop
> entrance. 

Sounds correct to me.

> That, unfortunately, looks to be a hard problem.

I think it's worth cleaning up this patch and getting this into trunk,
and leave the second part as a followup.

> This patch fixes a few xfails as noted in the commit message. These
> were tests that were evidently devised to test whether the analyzer
> would understand arithmetic being done on constrained values. Addition
> and subtraction are now working as expected; a handler for
> multiplication and division can be added.
> 
> As was noted in those test files, consideration at some point should
> be given to overflow.

Indeed.  I think the patch needs to take that into account when
updating bounds in eval_condition.

Various comments inline below throughout.


> commit d4052e8c273ca267f6dcf782084d60acfc50a609
> Author: Brian Sobulefsky 
> Date:   Sat Feb 27 00:36:40 2021 -0800
> 
> Changes to support eventual solution to bug PR analyzer/94362. This bug
> originated with a false positive null dereference during compilation of
> openssl. The bug is in effect caused by an error in constraint handling,
> specifically that within the for block:
> 
> for ( ; i-- > 0; )
>   {
>   }
> 
> the constraint_manager should know i >= 0 but does not. A reduced form of
> this bug was found where the constraint manager did not know within the if
> block:
> 
> if (i-- > 0)
>   {
>   }
> 
> that i >= 0. This latter error was only present for the postfix
> operators, and not for other forms, like --i > 0. It was due to the
> constraint being set for the initial_svalue associated with i, but a
> binop_svalue being what entered the if block for which no constraint
> rules existed.
> 
> By adding handling logic for a 

PR analyzer/94362 Partial Fix

2021-02-27 Thread brian.sobulefsky via Gcc-patches
Hi,

Please find a patch to fix part of the bug PR analyzer/94362. This bug is a
false positive for a null dereference found when compiling openssl. The cause
is the constraint_manager not knowing that i >= 0 within the for block:

for ( ; i-- > 0; )

The bug can be further reduced to the constraint manager not knowing that i >= 0
within the if block:

if (i-- > 0)

which is not replicated for other operators, such as prefix decrement. The
cause is that the constraint is applied to the initial_svalue of i, while it
is a binop_svalue of i that enters the block (with op PLUS and arg1 -1). The
constraint_manager does not have any constraints for this svalue and has no
handler. A handler has been added that essentially recurs on the remaining arg
if the other arg and other side of the condition are both constants and the op
is PLUS_EXPR or MINUS_EXPR.

This in essence fixed the problem, except an off by one error had been hiding
in range::eval_condition. This error is hard to notice, because, for example,
the code

if(idx > 0)
  __analyzer_eval(idx >= 1);

will compile as (check -fdump-ipa-analyzer to see)

void test (int idx)
{
  _Bool _1;
  int _2;

  <bb 2> :
  if (idx_4(D) > 0)
    goto <bb 3>; [INV]
  else
    goto <bb 4>; [INV]

  <bb 3> :
  _1 = idx_4(D) > 0;
  _2 = (int) _1;
  __analyzer_eval (_2);

  <bb 4> :
  return;

}

and will print "TRUE" to the screen, but as you can see, it is for the wrong
reason, because we are not really testing the condition we wanted to test.

You can force the failure (i.e. "UNKNOWN") for yourself with the following:

void test(int idx)
{
  int check = 1;
  if(idx > 0)
__analyzer_eval(idx >= check);
}

which the compiler will not "fix" on us. An examination of range::eval_condition
should convince you that there is an off by one error. Incidentally, I might
recommend doing away with "open intervals of integers" entirely.

When running the initial bug (the for loop), you will find that the analyzer
prints "UNKNOWN" twice for the postfix operator, and "TRUE" "UNKNOWN" for other
operators. This patch reduces postfix to the same state as the other operators.
The second "UNKNOWN" seems to be due to a second "iterated" pass through the
loop with a widening_svalue. A full fix of the bug will need a handler for the
widening svalue, and much more critically, a correct merge of the constraints
at the loop entrance. That, unfortunately, looks to be a hard problem.

This patch fixes a few xfails as noted in the commit message. These were tests
that were evidently devised to test whether the analyzer would understand
arithmetic being done on constrained values. Addition and subtraction are
now working as expected; a handler for multiplication and division can be
added.

As was noted in those test files, consideration at some point should be given to
overflow.


Thank you,
Brian

Sent with ProtonMail Secure Email.

commit d4052e8c273ca267f6dcf782084d60acfc50a609
Author: Brian Sobulefsky 
Date:   Sat Feb 27 00:36:40 2021 -0800

Changes to support eventual solution to bug PR analyzer/94362. This bug
originated with a false positive null dereference during compilation of
openssl. The bug is in effect caused by an error in constraint handling,
specifically that within the for block:

for ( ; i-- > 0; )
  {
  }

the constraint_manager should know i >= 0 but does not. A reduced form of
this bug was found where the constraint manager did not know within the if
block:

if (i-- > 0)
  {
  }

that i >= 0. This latter error was only present for the postfix
operators, and not for other forms, like --i > 0. It was due to the
constraint being set for the initial_svalue associated with i, but a
binop_svalue being what entered the if block for which no constraint
rules existed.

By adding handling logic for a binop_svalue that adds or
subtracts a constant, this problem was solved. This logic was added to
a new method, constraint_manager::maybe_fold_condition, with the
intent of eventually adding more cases there (unary_svalue and
widening_svalue for example). Additionally, an off by one error was
found in range::eval_condition that needed to be corrected to get
the expected result. Correction of this error was done in that
subroutine, resulting in no more calls to below_lower_bound and
above_upper_bound. As such, these functions were commented out and may
be removed if not needed for anything else.

This change does not entirely fix the initial bug pr94362, but it
reduces the postfix operator to the same state as other operators. In the
case of the for loop, there appears to be an "initial pass" through the
loop, which the analyzer will now understand for postfix, and then an
"iterated pass" with a widening_svalue that the analyzer does not
understand for any condition found. This seems to be due to the merging
of constraints and is under investigation.