Re: [PATCH] Add sanopt support for UBSAN_PTR.

Jakub Jelinek Fri, 06 Oct 2017 01:41:13 -0700

On Thu, Oct 05, 2017 at 03:52:45PM +0200, Martin Liška wrote:
> > Do you really need to handle offset != NULL_TREE?
> > If the bit offset is representable in shwi, then it will just be
> > in bitpos and offset will be NULL.
> 
> For this:
> UBSAN_PTR (&MEM[(void *)&b + 9223372036854775807B], 1);
> 
> I have offset:
> <integer_cst 0x155553ec19f0 type <integer_type 0x155553d81000 sizetype> 
> constant 9223372036854775807>
> 
> which is a valid offset.


Sure, it is valid, but this is an optimization code.
Objects larger than 1/8 of 64-bit address space (since we have
always 64-bit HWI) are non-existent and will not appear for some years.
So I think it isn't a problem if we just don't optimize those.

> >> +        && (!is_global_var (base) || decl_binds_to_current_def_p (base)))
> >> +      {
> >> +        HOST_WIDE_INT bytepos = bitpos / BITS_PER_UNIT;
> >> +        offset_int total_offset = bytepos;
> >> +        if (offset != NULL_TREE)
> >> +          total_offset += wi::sext (wi::to_offset (offset), POINTER_SIZE);
> >> +        total_offset += wi::sext (wi::to_offset (cur_offset),
> >> +                                  POINTER_SIZE);
> > 
> > Why are you sign extending it each time?  Can't it be sign extended after
> > all the additions?
> 
> I had problem with this:
> UBSAN_PTR (&MEM[(void *)&b + -9223372036854775807B], 9223372036854775805);
> 
> when not doing sign extension, I was said that 
> fits_shwi_p(to_offset(-9223372036854775807) + to_offset 
> (9223372036854775805)) is false.

What I meant is that you can extend only the final result, and actually not
go through a HWI at all, you compute an offset_int, extend it and pass it
to other functions.

> > On the other side, is it safe to add the (bytepos + offset) part with
> > the cur_offset part unconditionally?
> > If the former is "positive" and cur_offset is "negative", they cancel each
> > other and we IMHO shouldn't try to optimize it away.  bytepos could be some
> > very large integer, outside of bounds of the var, and cur_offset some
> > negative constant, where both var + bytepos and (var + bytepos) + cur_offset
> > overflow.  Or bytepos could be negative, outside of the bounds of the
> > variable.  I think you need to check separately that bytepos is >= 0 and
> > <= DECL_SIZE_UNIT and that bytepos + cur_offset is within that range too.
> > 
> >> +        /* New total offset can fit in pointer type.  */
> >> +        if (wi::fits_shwi_p (total_offset))
> > 
> > Why do you need to convert a wide_int to signed HOST_WIDE_INT and back?
> > Or do you want to check for some overflows above?
> 
> Do I understand it correctly that:
> 
>   p = b - 9223372036854775807LL; /* pointer overflow check is needed */
>   p2 = p + 9223372036854775805LL;
> 
> should not be handled because: &b + -9223372036854775807B is outsize of 
> address of b?
> Am I right?

Let's talk with smaller numbers.  If we have:
1)
UBSAN_PTR (&MEM_REF[ptr + 128], 256)
both offsets have the same "sign", so the check is redundant if there is say
UBSAN_PTR (ptr, 128+256); (or larger offset).

2)
UBSAN_PTR (&MEM_REF[ptr + 128], -64)
bitpos is positive, cur_offset "negative", here optimizing away the check if
UBSAN_PTR (ptr, 128-64); is wrong, there could be address space boundary in
between ptr+64 and ptr+128, and perhaps the only code to detect it would be
this UBSAN_PTR.  But we can optimize it away if there is
UBSAN_PTR (ptr, 128);
check dominating it, then we know there is no address space boundary in
between ptr and ptr+128, therefore none between ptr+64 and ptr+128 either.

3)
UBSAN_PTR (&MEM_REF[ptr + 128], -256)
similarly to above, but cur_offset larger than bytepos.  Here we want to
look for
UBSAN_PTR (ptr, 128-256);

4-6) invert all the signs above to get the rest of the cases.

Actually, there are other cases, e.g. when bytepos + cur_offset is larger
than sizetype max or smaller than sizetype min.  In that case, IMHO we just
shouldn't optimize.  But the check wouldn't be whether the offset_int
fits into shwi, but rather whether the resulting offset is representable
in signed sizetype (whether sign-extending it from sizetype prec is equal to
the non-extended value).

In any case, cur_offset's offset_int value needs to be sign-extended from
sizetype precision first (for any work with it, we always want to treat it
as signed in these routines), and bitpos being HOST_WIDE_INT divided by
BITS_PER_UNIT and then converted into offset_int is also naturally
sign-extended.

> > 
> >> +          {
> >> +            tree toffset = build_int_cst (TREE_TYPE (cur_offset),
> >> +                                          total_offset.to_shwi ());
> >> +            tree ptr2 = build1 (ADDR_EXPR,
> >> +                                build_pointer_type (TREE_TYPE (base)),
> >> +                                base);
> > 
> > Why do you need to create these trees here when you actually don't use them
> > (or shouldn't need)
> 
> Is it really impossible to catch something by has_dominating_ubsan_ptr_check 
> (ctx, ptr2, total_offset)?

See above, yes, it is, but if you work with offset_int values, you don't
need to build any tree for it, and ptr2, while it has to be built, can be
built after the comparison whether it is a decl with guaranteed lack of
address space boundary inside of the object.

        Jakub

Re: [PATCH] Add sanopt support for UBSAN_PTR.

Reply via email to