Status: Accepted
Owner: [email protected]
CC: [email protected]
Labels: Type-Enhancement Priority-Medium
New issue 269 by [email protected]: comparison and difference on
unrelated pointers
http://code.google.com/p/address-sanitizer/issues/detail?id=269
From Dominique Pellé:
====================================================================
First of all, thank you for the address sanitizer.
This is one exciting new tool for C++ developers.
A long time ago, I was using another dynamic
analyzer called Insure++ which was good too
(but not free software) and which found things that
Valgrind could not find. I don't have access to it
anymore. Insure++ also works at compilation time,
like ASAN. I remember one kind of bug that Insure++
found and that ASAN does not find: comparisons and
differences of unrelated pointers. See some simple
examples of such bugs detected by Insure++ here:
http://vasc.ri.cmu.edu/media/manuals/insure.5.x/ref/err_expp.htm
http://vasc.ri.cmu.edu/media/manuals/insure.5.x/ref/err_expf.htm
The obvious question: could ASAN also detect such bugs?
I remember finding a few such bugs with this check. Here
is a Vim patch that fixed one such bug:
ftp://ftp.vim.org/pub/vim/patches/7.0/7.0.144
I don't remember the problem exactly, but I think
Vim compared a pointer in the heap with another
pointer that could sometimes be set to a constant
empty string "" (so in .text section, or .rodata?)
and so comparing pointers did not make any sense.
I thought afterwards: maybe that'd be more
relevant for UBSAN than for ASAN.
====================================================================
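For readers unfamiliar with this bug class, here is a minimal sketch of the pattern described above. It is not the Vim code; dup_or_empty, cursor_after_start_buggy and cursor_in_line are hypothetical names made up for illustration. The buggy function compares a heap pointer against a pointer that may be the literal "", and the last function shows one possible way to avoid the problem with std::less, which is guaranteed to be a total order over pointers.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <functional>

// Hypothetical helper: returns a heap copy of 's', or the shared literal ""
// when there is nothing to copy (so the result may live in read-only data).
const char *dup_or_empty(const char *s) {
    if (s == nullptr || *s == '\0') return "";
    char *copy = static_cast<char *>(std::malloc(std::strlen(s) + 1));
    if (copy == nullptr) return "";
    std::strcpy(copy, s);
    return copy;
}

// Buggy pattern: if 'line' is the literal "" and 'cursor' points into a heap
// buffer, the two pointers refer to unrelated objects. The result of '>' is
// then unspecified in C++ (and the comparison is undefined behavior in C).
bool cursor_after_start_buggy(const char *line, const char *cursor) {
    return cursor > line;
}

// One possible fix (an assumption, not the Vim patch): carry the buffer
// length and compare with std::less, which is a total order over pointers.
bool cursor_in_line(const char *line, std::size_t len, const char *cursor) {
    std::less<const char *> lt;
    return !lt(cursor, line) && !lt(line + len, cursor);
}
```

Neither the compiler nor ASAN flags cursor_after_start_buggy today; it simply returns an arbitrary answer when the objects are unrelated.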
This is a good idea, although the performance worries me.
We could implement
__sanitizer_pointer_ge(void *a, void *b);
__sanitizer_pointer_gt(void *a, void *b);
__sanitizer_pointer_le(void *a, void *b);
__sanitizer_pointer_lt(void *a, void *b);
__sanitizer_pointer_sub(void *a, void *b);
and instrument the IR instructions like this:
tail call void @__sanitizer_pointer_lt(i8* %a, i8* %b)
%cmp = icmp ult i8* %a, %b
We may want to add some extra information from the frontend (e.g. names of
variables).
There is also a risk that the operations on pointers will be transformed
into operations on integers by the time IR reaches asan instrumentation.
If we insert the callbacks in the frontend, they may inhibit many
optimizations downstream,
so an alternative could be to attach metadata to the appropriate
instructions.
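At the source level, the instrumentation above would behave roughly like the sketch below. This is only an illustration: demo_sanitizer_pointer_lt is a hypothetical stand-in for the proposed callback (renamed to avoid clashing with reserved identifiers), and the counter exists only so the effect is observable.

```cpp
// Counts callback invocations; for illustration only.
int g_callback_count = 0;

// Hypothetical stand-in for the proposed runtime callback. A real
// implementation would inspect 'a' and 'b' and report unrelated pointers.
extern "C" void demo_sanitizer_pointer_lt(const void *a, const void *b) {
    (void)a;
    (void)b;
    ++g_callback_count;
}

// The user wrote:   return a < b;
// Instrumented code would conceptually run the check first and then
// perform the original comparison unchanged.
bool less_than_instrumented(const char *a, const char *b) {
    demo_sanitizer_pointer_lt(a, b);
    return a < b;
}
```

The callback returns nothing and does not alter the comparison result, which matches the IR snippet above: the call is inserted before the icmp and the icmp is left as it was.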
The implementation of these functions could be along these lines:
void __sanitizer_pointer_ge(void *a, void *b) {
  uptr flag = common_flags()->detect_unrelated_pointers;
  if (!flag) return;
  // Cheap part.
  // This part can be computed almost instantly with asan's allocator.
  // "Small heap" means a heap allocation of <128K.
  bool in_small_heap1 = asan_addr_is_in_small_heap(a);
  bool in_small_heap2 = asan_addr_is_in_small_heap(b);
  if (in_small_heap1 != in_small_heap2) { BARK(); return; }
  if (in_small_heap1 && in_small_heap2) {
    if (asan_addresses_belong_to_different_allocations(a, b)) BARK();
    return;  // both in the small heap: nothing more to check
  }
  // Neither 'a' nor 'b' is in the small heap.
  if (flag < 2) return;
  // Can't check cheaply, do the more expensive check.
  // The pointers are either in large heap, stack or globals.
  // We can tell which class 'a' and 'b' are in with reasonable overhead,
  // but not as cheaply as for the small heap.
  // Then, if the pointers are in the same class, we'll need even more work
  // to tell if the objects are different.
}
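To make the cheap part concrete, here is a toy model of it. It does not use asan's allocator: ToyHeap, Verdict and cheap_check are hypothetical names, and the std::map registry is only a stand-in for the allocator metadata that would answer "which allocation owns this address".

```cpp
#include <cstddef>
#include <cstdint>
#include <map>

// Toy stand-in for allocator metadata: chunk start address -> size in bytes.
class ToyHeap {
public:
    void add(const void *p, std::size_t size) {
        chunks_[reinterpret_cast<std::uintptr_t>(p)] = size;
    }
    // Returns the start of the chunk containing 'p', or 0 if no chunk does.
    // A one-past-the-end pointer counts as outside in this toy model.
    std::uintptr_t owner(const void *p) const {
        auto addr = reinterpret_cast<std::uintptr_t>(p);
        auto it = chunks_.upper_bound(addr);
        if (it == chunks_.begin()) return 0;
        --it;
        return (addr - it->first < it->second) ? it->first : 0;
    }
private:
    std::map<std::uintptr_t, std::size_t> chunks_;
};

enum class Verdict { Related, Unrelated, Unknown };

// Mirrors the cheap part: decide using only the tracked-heap lookup.
Verdict cheap_check(const ToyHeap &heap, const void *a, const void *b) {
    std::uintptr_t oa = heap.owner(a);
    std::uintptr_t ob = heap.owner(b);
    if (oa == 0 && ob == 0) return Verdict::Unknown;  // needs the expensive path
    if (oa != ob) return Verdict::Unrelated;          // would BARK()
    return Verdict::Related;                          // same allocation
}
```

Verdict::Unknown corresponds to the case where neither pointer is in the small heap, which is where the expensive check would take over.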
The check sounds more like ubsan-ish at first, but I don't see how it can
be implemented w/o controlling the allocator.
We can implement the cheap part in msan and tsan in addition to asan
because they share the allocator,
but the expensive part will require the tool to know the boundaries of
stack and global objects,
and that information is only available in asan.
The problem is performance. Even if we hide this under a run-time flag like
we do with stack-use-after-return detection,
the overhead will be several instructions per pointer comparison when the
checking is off.
When the checking is on, the overhead will depend on where the pointers are.
If at least one is in the small heap (<128K, most common case) the overhead
may be reasonable.
If not, the overhead may be monstrous.
This is worth experimenting with anyway.