Status: Accepted
Owner: [email protected]
CC: [email protected]
Labels: Type-Enhancement Priority-Medium

New issue 269 by [email protected]: comparison and difference on unrelated pointers
http://code.google.com/p/address-sanitizer/issues/detail?id=269

From Dominique Pellé:
====================================================================
First of all, thank you for the address sanitizer.
This is one exciting new tool for C++ developers.

A long time ago, I was using another dynamic
analyzer called Insure++ which was good too
(but not free software) and which found things that
Valgrind could not find. I don't have access to it
anymore. Insure++ also works at compilation time,
like ASAN. I remember one kind of bug which Insure++
found and that ASAN does not find: comparisons and
differences on unrelated pointers. See some simple
examples of such bugs detected by Insure++ here:

http://vasc.ri.cmu.edu/media/manuals/insure.5.x/ref/err_expp.htm
http://vasc.ri.cmu.edu/media/manuals/insure.5.x/ref/err_expf.htm

The obvious question: could ASAN also detect such bugs?

I remember finding a few such bugs with this check. Here
is a Vim patch in which I fixed such a bug:

ftp://ftp.vim.org/pub/vim/patches/7.0/7.0.144

I don't remember the problem exactly, but I think
Vim compared a pointer in the heap with another
pointer that could sometimes be set to a constant
empty string "" (so in .text section, or .rodata?)
and so comparing pointers did not make any sense.


I thought afterwards: maybe that'd be more
relevant for UBSAN than for ASAN.
====================================================================

This is a good idea, although the performance worries me.
We could implement
  __sanitizer_pointer_ge(void *a, void *b);
  __sanitizer_pointer_gt(void *a, void *b);
  __sanitizer_pointer_le(void *a, void *b);
  __sanitizer_pointer_lt(void *a, void *b);
  __sanitizer_pointer_sub(void *a, void *b);
and instrument the IR instructions like this:

  tail call void @__sanitizer_pointer_lt(i8* %a, i8* %b)
  %cmp = icmp ult i8* %a, %b

We may want to add some extra information from the frontend (e.g. names of variables). There is also a risk that the operations on pointers will be transformed into operations on integers by the time the IR reaches asan instrumentation. If we insert the callbacks in the frontend, it may inhibit many optimizations downstream, so an alternative could be to attach metadata to the appropriate instructions.
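
At the source level, the instrumentation proposed above amounts to calling the callback right before the original comparison. A minimal sketch, assuming a stand-in named sketch_pointer_lt in place of the proposed __sanitizer_pointer_lt (the stand-in only counts calls; it is hypothetical and validates nothing):

```cpp
// Stand-in for the proposed __sanitizer_pointer_lt callback (hypothetical name).
// A real runtime would validate the operands; this one only counts calls.
static int g_pointer_lt_calls = 0;

void sketch_pointer_lt(const void *a, const void *b) {
  (void)a;
  (void)b;
  ++g_pointer_lt_calls;
}

// What "p < q" would conceptually become after instrumentation:
// the callback runs first, then the original comparison is done unchanged.
bool instrumented_less(const char *p, const char *q) {
  sketch_pointer_lt(p, q);
  return p < q;
}
```

The callback returns nothing and does not change the result of the comparison, which is why it could in principle be dropped or replaced by metadata without altering program behavior.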

The implementation of these functions could be along these lines:

void __sanitizer_pointer_ge(void *a, void *b) {
   uptr flag = common_flags()->detect_unrelated_pointers;
   if (!flag) return;
   if (flag) {   // cheap part
      // This part can be computed almost instantly with asan's allocator
      bool in_small_heap1 = asan_addr_is_in_small_heap(a); // i.e. heap allocation of <128K
      bool in_small_heap2 = asan_addr_is_in_small_heap(b); // i.e. heap allocation of <128K
      if (in_small_heap1 != in_small_heap2) BARK();
      if (in_small_heap1 && in_small_heap2 &&
         asan_addresses_belong_to_different_allocations(a, b)) BARK();
      // neither 'a' nor 'b' are in the small heap
   }
   if (flag < 2) return;
   // can't check cheaply, do the more expensive check
   // The pointers are either in large heap, stack or globals
   // We can tell which class 'a' and 'b' is with reasonable overhead,
   // but not as cheap as for the small heap.
   // Then, if the pointers are in the same class, we'll need even more work
   // to tell if the objects are different.
}
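
To make the cheap part concrete, here is a toy sketch of the same decision logic. It assumes a hypothetical ToyHeap registry (a std::map from chunk start to size) standing in for the allocator metadata; asan's allocator answers these questions differently and much faster, and none of the names below exist in the sanitizer runtime.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>

// Toy registry standing in for allocator metadata: chunk start -> size.
// Hypothetical; it assumes registered chunks are never adjacent
// (asan's redzones give the real allocator the same property).
class ToyHeap {
 public:
  void Register(const void *p, std::size_t size) {
    chunks_[reinterpret_cast<std::uintptr_t>(p)] = size;
  }

  // Returns the start of the registered chunk containing p, or 0 if none.
  std::uintptr_t ChunkStart(const void *p) const {
    auto addr = reinterpret_cast<std::uintptr_t>(p);
    auto it = chunks_.upper_bound(addr);  // first chunk starting after addr
    if (it == chunks_.begin()) return 0;
    --it;
    // One-past-the-end pointers are legal operands, hence <= rather than <.
    return addr - it->first <= it->second ? it->first : 0;
  }

 private:
  std::map<std::uintptr_t, std::size_t> chunks_;
};

enum class Verdict { kOk, kUnrelated, kUnknown };

// Mirrors the cheap part: report if exactly one pointer is in the (toy) heap,
// or if both are in the heap but in different chunks.
Verdict CheapCheck(const ToyHeap &heap, const void *a, const void *b) {
  std::uintptr_t ca = heap.ChunkStart(a);
  std::uintptr_t cb = heap.ChunkStart(b);
  if ((ca != 0) != (cb != 0)) return Verdict::kUnrelated;
  if (ca != 0) return ca == cb ? Verdict::kOk : Verdict::kUnrelated;
  return Verdict::kUnknown;  // neither is in the heap: needs the expensive path
}
```

In this sketch kUnrelated corresponds to BARK() above, and kUnknown is the case that would fall through to the expensive check when the flag is 2 or higher.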

The check sounds more ubsan-ish at first, but I don't see how it can be implemented w/o controlling the allocator. We can implement the cheap part in msan and tsan in addition to asan because they share the allocator, but the expensive part will require the tool to know the boundaries of stack and global objects, and that information is only available in asan.

The problem is performance. Even if we hide this under a run-time flag like we do with stack-use-after-return detection, the overhead will be several instructions per pointer comparison when the checking is off.
When the checking is on, the overhead will depend on where the pointers are.
If at least one is in the small heap (<128K, most common case) the overhead may be reasonable.
If not, the overhead may be monstrous.

This is worth experimenting with anyway.

