[PATCH] D50250: [clang][ubsan] Implicit Conversion Sanitizer - integer sign change - clang part

Richard Smith - zygoloid via Phabricator via cfe-commits Tue, 23 Oct 2018 15:14:52 -0700

rsmith accepted this revision.
rsmith added a comment.
This revision is now accepted and ready to land.


Just some minor nits.



================
Comment at: docs/ReleaseNotes.rst:186
+
+  Just like other ``-fsanitize=integer`` checks, these issues are **not**
+  undefined behaviour. But they are not *always* intentional, and are somewhat
----------------
This seems inaccurate: `-fsanitize=signed-integer-overflow` is part of 
`-fsanitize=integer` and catches UB. Maybe "Like some other [...]"


================
Comment at: docs/ReleaseNotes.rst:190-191
+  but the ``-fsanitize=implicit-integer-sign-change`` check
+  (as is ``-fsanitize=implicit-integer-truncation`` check)
+  is enabled by ``-fsanitize=integer``.
 
----------------
This is a bit hard to read. Maybe reverse the order of these two lines.


================
Comment at: lib/CodeGen/CGExprScalar.cpp:1036
+    return;
+  // That's it. We can't rule out any more cases with the data we have.
+
----------------
lebedev.ri wrote:
> rsmith wrote:
> > rsmith wrote:
> > > lebedev.ri wrote:
> > > > rsmith wrote:
> > > > > I don't like the overlap between the implicit truncation check and 
> > > > > this check. I think you should aim for exactly one of those checks to 
> > > > > fire for any given integer conversion. There are the following cases:
> > > > > 
> > > > >  * Dst is smaller than Src: if the value changes at all (with sign 
> > > > > change or without), then the truncation check already catches it, and 
> > > > > catching it here does not seem useful
> > > > >  * Dst is the same size as Src or larger: sign change is the only 
> > > > > problem, and is only possible if exactly one of Src and Dst is signed
> > > > > 
> > > > > So I think you should bail out of this function if either Src and Dst 
> > > > > are both unsigned or both are signed, and also if Src is larger than 
> > > > > Dst (because we treat that case as a lossy truncation rather than as 
> > > > > a sign change).
> > > > > 
> > > > > And when you do emit a check here, the only thing you need to check 
> > > > > is if the signed value is negative (if so, you definitely changed the 
> > > > > sign, and if not, you definitely didn't -- except in the truncation 
> > > > > cases that the truncation sanitizer catches).
> > > > To be clear: we want to skip emitting in those cases if the other check 
> > > > (truncation) is enabled, right?
> > > > It does seem to make sense, (and i did thought about that a bit), but i 
> > > > need to think about it more..
> > > I think we want to skip emitting those checks always (regardless of 
> > > whether the other sanitizer is enabled). One way to think of it: this 
> > > sanitizer checks for non-truncating implicit integer conversions that 
> > > change the value of the result. The other sanitizer checks for truncating 
> > > implicit integer conversions that change the value of the result.
> > > 
> > > I don't see any point in allowing the user to ask to sanitize 
> > > sign-changing truncation but not other value-changing truncations. That 
> > > would lead to this:
> > > ```
> > > int a = 0x17fffffff; // no sanitizer warning
> > > int b = 0x180000000; // sanitizer warning
> > > int c = 0x1ffffffff; // sanitizer warning
> > > int d = 0x200000000; // no sanitizer warning
> > > ```
> > > ... which I think makes no sense.
> > Hmm, wait, the "truncation" sanitizer doesn't catch this:
> > 
> > `int a = 0x80000000u;`
> > 
> > ... does it? (Because it looks for cases where the value doesn't 
> > round-trip, not for cases where the value was changed by the truncation.)
> > 
> > 
> > I've thought a bit more about the user model and use cases for these 
> > sanitizers, and I think what we want is:
> > 
> >  * a sanitizer that checks for implicit conversions with data loss (the 
> > existing truncation sanitizer)
> >  * a sanitizer that checks for implicit conversions that change the value, 
> > where either the source or destination was signed (approximately what this 
> > sanitizer is doing)
> > 
> > The difference between that and what you have here is that I think the new 
> > sanitizer should catch all of these cases:
> > 
> > ```
> > int a = 0x17fffffff;
> > int b = 0x180000000;
> > int c = 0x1ffffffff;
> > int d = 0x200000000;
> > ```
> > 
> > ... because while the initializations of `a` and `d` don't change the sign 
> > of the result, that's only because they wrap around *past* a sign change.
> > 
> > So, I think what you have here is fine for the SrcBits <= DstBits case, but 
> > for the SrcBits > DstBits case, you should also check whether the value is 
> > the same as the original (that is, perform the truncation check).
> > 
> > In order to avoid duplicating work when both sanitizers are enabled, it'd 
> > make sense to combine the two sanitizer functions into a single function 
> > and reuse the checks.
> Yep, makes sense. I don't think i have followed the recommendations to the 
> letter,
> but i think the end result is not worse than suggested. Added tests shows how 
> it works now.
OK, so to be clear I'm following:

 * Any implicit conversion that truncates and changes the value triggers the 
truncation sanitizer (unsigned if both source and destination are unsigned, 
signed otherwise)
 * Any implicit conversion that results in a sign change triggers the sign 
change sanitizer
 * Any implicit conversion that triggers both sanitizers produces a single 
warning classified as ICCK_SignedIntegerTruncationOrSignChange (eg, the 
truncation changed the value, and the sign changed -- possibly multiple times 
-- when dropping bits)

That seems fine to me.


================
Comment at: lib/CodeGen/CGExprScalar.cpp:1071
+    // So which 'lt' predicate for the comparison do we have to use?
+    // NOTE: if it is unsigned, then the comparison is naturally always 
'false'.
+    llvm::ICmpInst::Predicate Pred =
----------------
Just emit `i1 false` directly in this case. `IRBuilder` generally only 
constant-folds values for you if all the operands are constant, and sanitizers 
are often used at -O0, so there's no guarantee that anyone else will clean this 
up.


================
Comment at: lib/CodeGen/CGExprScalar.cpp:1091-1103
+  //    Truth table:
+  //      /-----------------|-----------------|-----------------\
+  //      | src is negative | dst is negative | no sign change? |
+  //      |-----------------|-----------------|-----------------|
+  //      | false           | false           | true            |
+  //      |-----------------|-----------------|-----------------|
+  //      | false           | true            | false (!)       |
----------------
Do we really need this comment? This seems kinda obvious.


Repository:
  rC Clang

https://reviews.llvm.org/D50250



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D50250: [clang][ubsan] Implicit Conversion Sanitizer - integer sign change - clang part

Reply via email to