Re: [PATCH] -fsanitize-recover=list

2014-12-18 Thread Alexey Samsonov
Hi Jakub,

On Tue, Nov 18, 2014 at 11:36 AM, Alexey Samsonov samso...@google.com wrote:

 On Mon, Nov 17, 2014 at 11:53 PM, Jakub Jelinek ja...@redhat.com wrote:
  On Mon, Nov 17, 2014 at 11:39:47PM -0800, Alexey Samsonov wrote:
   That is not what I think we've agreed on and what is implemented in GCC.
   -fsanitize-recover only enables recovery of the undefined plus
   undefined-like sanitizers, in particular it doesn't enable recover from
   kernel-address, because -fsanitize-recover should be a deprecated option
   and kernel-address never used it before.
 
  Hm, indeed, I messed it up. In the older thread we agree that plain
  -f(no-)sanitize-recover
  should be a deprecated synonym for the current meaning of these flags,
  which only specify recoverability for UBSan-related checks :-/
 
  Has GCC already shipped with existing implementation? I'm just curious
  if it's convenient to have flags that would enable/disable recovery for all
  sanitizers (analogous to -Werror flags which exist in a standalone form, 
  but
  may accept specific warnings, so that one can write
$(CC) foo.cc -Wno-error -Werror=sign-compare
 
  Well, no sanitizer is on by default, so you can just use the same
  list for -fno-sanitize-recover= or -fsanitize-recover= as for -fsanitize=
  if you want.

 Yeah, but it may look somewhat redundant. Also, re-using the exact
 same list may lead to diagnostic messages (if you write
 -fsanitize=unreachable,null -fsanitize-recover=unreachable,null).

Reviving this thread.

What do you think of the following idea:
1) we keep -fsanitize-recover and -fno-sanitize-recover as
deprecated synonyms
for -f(no-)?sanitize=ubsan-like checks
2) we introduce -fsanitize-recover=list and
-fno-sanitize-recover=list, where list may contain
specific sanitizers (address) or group of sanitizers (undefined).
3) we introduce group of sanitizers called all, which is, well, all
available sanitizers. It can *not* be
used in -fsanitize=all flag, but can be used to easily disable all
previously enabled sanitizers (-fno-sanitize=all),
or to enable/disable recovery for all enabled sanitizers
(-fsanitize-recover=all / -fno-sanitize-recover=all).

This would be a handy shortcut, and will save users from copying the
same list twice for
-fsanitize= and -fsanitize-recover= flags, and from potential
diagnostic problems I mention.



   So, in GCC -fsanitize-recover stands for
   -fsanitize-recover=undefined,float-divide-by-zero,float-cast-overflow
   and -fno-sanitize-recover stands for
   -fno-sanitize-recover=undefined,float-divide-by-zero,float-cast-overflow
  
* -fsanitize-recover=list: Enable recovery for all selected checks
or group of checks. It is forbidden to list unrecoverable sanitizers
here (e.g., -fsanitize-recover=address will produce an error).
  
   We only error on
   -fsanitize-recover=address
   -fsanitize-recover=thread
   -fsanitize-recover=leak
 
  Right, it's a good idea to error on sanitizer kinds we don't have
  recovery for. I will
  add the errors for TSan/MSan/LSan etc.
 
   but not say on
   -fsanitize-recover=unreachable
   which is part of undefined; unreachable isn't recoverable silently.
   Likewise -fsanitize-recover=return.  Otherwise one couldn't use
   -fsanitize-recover=undefined which is useful.
 
  Can't this be fixed by checking the actual values of -fsanitize-recover= 
  flag
  before expanding the sanitizer groups (i.e. turning undefined into
  null,unreachable,return,)?
 
  We should probably be able to distinguish between 
  -fsanitize-recover=undefined,
  and -fsanitize-recover=unreachable,another checks from undefined.
  In the first case we can enable recovery
  for all parts of undefined that support it. In the second, we can
  produce an error as user
  explicitly tried to enable recovery for sanitizer that doesn't support it.
 
  But in that case you would need to diagnose it already when parsing the
  option, or add code that would remember what bits have been set from other
  option sets.
  In the former case, you'd diagnose even
  -fsanitize-recover=unreachable -fno-sanitize-recover=undefined
  even when in that case unreachable in the end is not recoverable.
  We don't error out for
  -fsanitize-recover=address -fno-sanitize-recover=address
  because in the end address is not recovered.

 OK, that's a good question: should we diagnose -fsanitize-recover=address
 if it was later disabled by -fno-sanitize-recover=address? There is an 
 argument
 for doing this: -fsanitize-recover=address is unsupported, so this
 flag shouldn't
 have been provided in the first place. It makes much more sense to
 delete it rather
 than override it. It couldn't be passed down from some global
 project-wide configuration
 (like CFLAGS), because it wouldn't work anywhere.

 The only case where overriding might come in handy is re-using the same list 
 for
 -fsanitize= and -fsanitize-recover= flags that you mention:
 # SANITIZERS is a list of sanitizers we want to enable

Re: [PATCH] -fsanitize-recover=list

2014-11-18 Thread Alexey Samsonov
On Mon, Nov 17, 2014 at 11:53 PM, Jakub Jelinek ja...@redhat.com wrote:
 On Mon, Nov 17, 2014 at 11:39:47PM -0800, Alexey Samsonov wrote:
  That is not what I think we've agreed on and what is implemented in GCC.
  -fsanitize-recover only enables recovery of the undefined plus
  undefined-like sanitizers, in particular it doesn't enable recover from
  kernel-address, because -fsanitize-recover should be a deprecated option
  and kernel-address never used it before.

 Hm, indeed, I messed it up. In the older thread we agree that plain
 -f(no-)sanitize-recover
 should be a deprecated synonym for the current meaning of these flags,
 which only specify recoverability for UBSan-related checks :-/

 Has GCC already shipped with existing implementation? I'm just curious
 if it's convenient to have flags that would enable/disable recovery for all
 sanitizers (analogous to -Werror flags which exist in a standalone form, but
 may accept specific warnings, so that one can write
   $(CC) foo.cc -Wno-error -Werror=sign-compare

 Well, no sanitizer is on by default, so you can just use the same
 list for -fno-sanitize-recover= or -fsanitize-recover= as for -fsanitize=
 if you want.

Yeah, but it may look somewhat redundant. Also, re-using the exact
same list may lead to diagnostic messages (if you write
-fsanitize=unreachable,null -fsanitize-recover=unreachable,null).

  So, in GCC -fsanitize-recover stands for
  -fsanitize-recover=undefined,float-divide-by-zero,float-cast-overflow
  and -fno-sanitize-recover stands for
  -fno-sanitize-recover=undefined,float-divide-by-zero,float-cast-overflow
 
   * -fsanitize-recover=list: Enable recovery for all selected checks
   or group of checks. It is forbidden to list unrecoverable sanitizers
   here (e.g., -fsanitize-recover=address will produce an error).
 
  We only error on
  -fsanitize-recover=address
  -fsanitize-recover=thread
  -fsanitize-recover=leak

 Right, it's a good idea to error on sanitizer kinds we don't have
 recovery for. I will
 add the errors for TSan/MSan/LSan etc.

  but not say on
  -fsanitize-recover=unreachable
  which is part of undefined; unreachable isn't recoverable silently.
  Likewise -fsanitize-recover=return.  Otherwise one couldn't use
  -fsanitize-recover=undefined which is useful.

 Can't this be fixed by checking the actual values of -fsanitize-recover= flag
 before expanding the sanitizer groups (i.e. turning undefined into
 null,unreachable,return,)?

 We should probably be able to distinguish between 
 -fsanitize-recover=undefined,
 and -fsanitize-recover=unreachable,another checks from undefined.
 In the first case we can enable recovery
 for all parts of undefined that support it. In the second, we can
 produce an error as user
 explicitly tried to enable recovery for sanitizer that doesn't support it.

 But in that case you would need to diagnose it already when parsing the
 option, or add code that would remember what bits have been set from other
 option sets.
 In the former case, you'd diagnose even
 -fsanitize-recover=unreachable -fno-sanitize-recover=undefined
 even when in that case unreachable in the end is not recoverable.
 We don't error out for
 -fsanitize-recover=address -fno-sanitize-recover=address
 because in the end address is not recovered.

OK, that's a good question: should we diagnose -fsanitize-recover=address
if it was later disabled by -fno-sanitize-recover=address? There is an argument
for doing this: -fsanitize-recover=address is unsupported, so this
flag shouldn't
have been provided in the first place. It makes much more sense to
delete it rather
than override it. It couldn't be passed down from some global
project-wide configuration
(like CFLAGS), because it wouldn't work anywhere.

The only case where overriding might come in handy is re-using the same list for
-fsanitize= and -fsanitize-recover= flags that you mention:
# SANITIZERS is a list of sanitizers we want to enable.
CFLAGS = ${CFLAGS} -fsanitize=${SANITIZERS} -fsanitize-recover=${SANITIZERS}
# Oops - we produce an error if ${SANITIZERS} contains address.

However, even if we decide to *not* diagnose unrecoverable sanitizer
kinds disabled later
in the command line, we can still implement warnings for unreachable properly.
You can scan -fsanitize-recover flags from right to left and keep
track of all sanitizers that
would be disabled. When you see -fsanitize=unreachable (with
literal unreachable as a value),
you diagnose the error only if unreachable wouldn't be disabled
later by some -fno-sanitize-recover flag.

FWIW, we implement this logic in Clang for regular -fsanitize= flags.
-- 
Alexey Samsonov, Mountain View, CA


Re: [PATCH] -fsanitize-recover=list

2014-11-17 Thread Alexey Samsonov
  SANITIZE_NONNULL_ATTRIBUTE)
   ? BUILT_IN_UBSAN_HANDLE_NONNULL_ARG
   : BUILT_IN_UBSAN_HANDLE_NONNULL_ARG_ABORT;
   tree fn = builtin_decl_explicit (bcode);
 @@ -1396,7 +1398,7 @@ instrument_nonnull_return (gimple_stmt_i
  2, loc, NULL_TREE, NULL_TREE);
   data = build_fold_addr_expr_loc (loc[0], data);
   enum built_in_function bcode
 -   = flag_sanitize_recover
 +   = (flag_sanitize_recover  SANITIZE_RETURNS_NONNULL_ATTRIBUTE)
   ? BUILT_IN_UBSAN_HANDLE_NONNULL_RETURN
   : BUILT_IN_UBSAN_HANDLE_NONNULL_RETURN_ABORT;
   tree fn = builtin_decl_explicit (bcode);
 --- gcc/c-family/c-ubsan.c.jj   2014-09-10 11:20:49.0 +0200
 +++ gcc/c-family/c-ubsan.c  2014-10-17 17:13:14.393241619 +0200
 @@ -104,7 +104,7 @@ ubsan_instrument_division (location_t lo
  NULL_TREE);
data = build_fold_addr_expr_loc (loc, data);
enum built_in_function bcode
 -   = flag_sanitize_recover
 +   = (flag_sanitize_recover  SANITIZE_DIVIDE)
   ? BUILT_IN_UBSAN_HANDLE_DIVREM_OVERFLOW
   : BUILT_IN_UBSAN_HANDLE_DIVREM_OVERFLOW_ABORT;
tt = builtin_decl_explicit (bcode);
 @@ -199,7 +199,7 @@ ubsan_instrument_shift (location_t loc,
data = build_fold_addr_expr_loc (loc, data);

enum built_in_function bcode
 -   = flag_sanitize_recover
 +   = (flag_sanitize_recover  SANITIZE_SHIFT)
   ? BUILT_IN_UBSAN_HANDLE_SHIFT_OUT_OF_BOUNDS
   : BUILT_IN_UBSAN_HANDLE_SHIFT_OUT_OF_BOUNDS_ABORT;
tt = builtin_decl_explicit (bcode);
 @@ -229,7 +229,7 @@ ubsan_instrument_vla (location_t loc, tr
  NULL_TREE);
data = build_fold_addr_expr_loc (loc, data);
enum built_in_function bcode
 -   = flag_sanitize_recover
 +   = (flag_sanitize_recover  SANITIZE_VLA)
   ? BUILT_IN_UBSAN_HANDLE_VLA_BOUND_NOT_POSITIVE
   : BUILT_IN_UBSAN_HANDLE_VLA_BOUND_NOT_POSITIVE_ABORT;
tt = builtin_decl_explicit (bcode);


 Jakub



-- 
Alexey Samsonov, Mountain View, CA


Re: [PATCH] -fsanitize-recover=list

2014-11-17 Thread Alexey Samsonov
On Mon, Nov 17, 2014 at 10:47 PM, Jakub Jelinek ja...@redhat.com wrote:

 On Mon, Nov 17, 2014 at 06:40:00PM -0800, Alexey Samsonov wrote:
  I've just prepared a patch implementing -fsanitize-recover=list in
  Clang (http://reviews.llvm.org/D6302), writing here to make sure we're
  on
  the same page w.r.t. flag semantics:
 
  * -fsanitize-recover: Enable recovery for all checks enabled by
  -fsanitize= which support it.
  * -fno-sanitize-recover: Disable recovery for all checks.

 That is not what I think we've agreed on and what is implemented in GCC.
 -fsanitize-recover only enables recovery of the undefined plus
 undefined-like sanitizers, in particular it doesn't enable recover from
 kernel-address, because -fsanitize-recover should be a deprecated option
 and kernel-address never used it before.

Hm, indeed, I messed it up. In the older thread we agree that plain
-f(no-)sanitize-recover
should be a deprecated synonym for the current meaning of these flags,
which only specify recoverability for UBSan-related checks :-/

Has GCC already shipped with existing implementation? I'm just curious
if it's convenient to have flags that would enable/disable recovery for all
sanitizers (analogous to -Werror flags which exist in a standalone form, but
may accept specific warnings, so that one can write
  $(CC) foo.cc -Wno-error -Werror=sign-compare

 So, in GCC -fsanitize-recover stands for
 -fsanitize-recover=undefined,float-divide-by-zero,float-cast-overflow
 and -fno-sanitize-recover stands for
 -fno-sanitize-recover=undefined,float-divide-by-zero,float-cast-overflow

  * -fsanitize-recover=list: Enable recovery for all selected checks
  or group of checks. It is forbidden to list unrecoverable sanitizers
  here (e.g., -fsanitize-recover=address will produce an error).

 We only error on
 -fsanitize-recover=address
 -fsanitize-recover=thread
 -fsanitize-recover=leak

Right, it's a good idea to error on sanitizer kinds we don't have
recovery for. I will
add the errors for TSan/MSan/LSan etc.

 but not say on
 -fsanitize-recover=unreachable
 which is part of undefined; unreachable isn't recoverable silently.
 Likewise -fsanitize-recover=return.  Otherwise one couldn't use
 -fsanitize-recover=undefined which is useful.

Can't this be fixed by checking the actual values of -fsanitize-recover= flag
before expanding the sanitizer groups (i.e. turning undefined into
null,unreachable,return,)?

We should probably be able to distinguish between -fsanitize-recover=undefined,
and -fsanitize-recover=unreachable,another checks from undefined.
In the first case we can enable recovery
for all parts of undefined that support it. In the second, we can
produce an error as user
explicitly tried to enable recovery for sanitizer that doesn't support it.

This is only a diagnostic quality issue, though.


  * -fno-sanitize-recover=list: Disable recovery for selected checks
  or group of checks.

 Jakub

-- 
Alexey Samsonov, Mountain View, CA


Re: [PATCHv3][PING] Enable -fsanitize-recover for KASan

2014-10-03 Thread Alexey Samsonov
On Wed, Oct 1, 2014 at 10:58 PM, Jakub Jelinek ja...@redhat.com wrote:
 On Wed, Oct 01, 2014 at 04:21:29PM -0700, Alexey Samsonov wrote:
 Speaking of plain -f(no-)sanitize-recover - it would probably be
 better to change the semantics of this flag,
 so that -f(no-)?sanitize-recover means enable(disable) recovery for
 all sanitizers enabled at this point.
 That is, it would be pretty much like -Werror flag.

 For example,
 -fsanitize=undefined -fsanitize=address -fno-sanitize-recover
 would mean run UBSan and ASan and don't recover from errors.

 That would change behavior, e.g. for
 -fsanitize=undefined,address -fsanitize-recover
 would suddenly enable recovery from asan errors while previously
 they wouldn't be recovering.

 GCC has not shipped with the -fsanitize-recover flag yet (we have just
 vendor backport of it), so if you don't mind changing behavior for clang
 users, I can live with that.

Yes, I think so. -fsanitize-recover was not documented in Clang user manual.

  Would the default still be
 -fsanitize-recover=undefined,kernel-address -fno-sanitize-recover=address ?

Yes.

-- 
Alexey Samsonov, Mountain View, CA


Re: [PATCHv3][PING] Enable -fsanitize-recover for KASan

2014-10-01 Thread Alexey Samsonov
Speaking of plain -f(no-)sanitize-recover - it would probably be
better to change the semantics of this flag,
so that -f(no-)?sanitize-recover means enable(disable) recovery for
all sanitizers enabled at this point.
That is, it would be pretty much like -Werror flag.

For example,
-fsanitize=undefined -fsanitize=address -fno-sanitize-recover
would mean run UBSan and ASan and don't recover from errors.

On Tue, Sep 30, 2014 at 10:39 AM, Jakub Jelinek ja...@redhat.com wrote:
 On Tue, Sep 30, 2014 at 10:36:34AM -0700, Alexey Samsonov wrote:
  Would we accept -fsanitize-recover=undefined 
  -fno-sanitize-recover=signed-integer-overflow
  as recovering everything but signed integer overflows, i.e. the decision
  whether to recover a particular call would check similar bitmask as
  is checked whether to sanitize something at all?

 Yes, the logic for creating a set of recoverable sanitizers should be
 the same as the logic for creating a set of enabled sanitizers.

 LGTM, will hack it up soon in GCC then.

 Jakub



-- 
Alexey Samsonov, Mountain View, CA


Re: [PATCHv3][PING] Enable -fsanitize-recover for KASan

2014-09-30 Thread Alexey Samsonov
On Tue, Sep 30, 2014 at 12:07 AM, Yury Gribov y.gri...@samsung.com wrote:
 On 09/30/2014 09:40 AM, Jakub Jelinek wrote:

 On Mon, Sep 29, 2014 at 05:24:02PM -0700, Konstantin Serebryany wrote:

 I don't think we ever going to support recovery for regular ASan
 (Kostya, correct me if I'm wrong).


 I hope so too.
 Another point is that with asan-instrumentation-with-call-threshold=0
 (instrumentation with callbacks)


 The normal (non-recovery) callbacks are __attribute__((noreturn)) for
 performance reasons, and you do need different callbacks and different
 generated code if you want to recover (after the callback you need jump
 back to a basic block after the conditional jump).
 So, in that case you would need -fsanitize-recover=address.

 I see no problem in enabling -fsanitize-recover by default for
 -fsanitize=undefined and


 This becomes more interesting when we use asan and ubsan together.


 That is fairly common case.


 I think we can summarize:
 * the current option -fsanitize-recover is misleading; it's really
 -fubsan-recover
 * we need a way to selectively enable/disable recovery for different
 sanitizers

 The most promininet solution seems to be
 * allow -fsanitize-recover=tgt1,tgt2 syntax
 * -fsanitize-recover wo options would still mean UBSan recovery

 The question is what to do with -fno-sanitize-recover then.

We can make -f(no-)?sanitize-recover= flags accept the same values as
-f(no-)?sanitize= flags. In this case,

-fsanitize-recover will be a deprecated alias of
-fsanitize-recover=undefined, and
-fno-sanitize-recover will be a deprecated alias of
-fno-sanitize-recover=undefined.
If a user provides -fsanitize-recover=address, we can instruct the
instrumentation pass to
use recoverable instrumentation.


 -Y




-- 
Alexey Samsonov, Mountain View, CA


Re: [PATCHv3][PING] Enable -fsanitize-recover for KASan

2014-09-30 Thread Alexey Samsonov
On Tue, Sep 30, 2014 at 10:33 AM, Jakub Jelinek ja...@redhat.com wrote:
 On Tue, Sep 30, 2014 at 10:26:39AM -0700, Alexey Samsonov wrote:
  I think we can summarize:
  * the current option -fsanitize-recover is misleading; it's really
  -fubsan-recover
  * we need a way to selectively enable/disable recovery for different
  sanitizers
 
  The most promininet solution seems to be
  * allow -fsanitize-recover=tgt1,tgt2 syntax
  * -fsanitize-recover wo options would still mean UBSan recovery
 
  The question is what to do with -fno-sanitize-recover then.

 We can make -f(no-)?sanitize-recover= flags accept the same values as
 -f(no-)?sanitize= flags. In this case,

 -fsanitize-recover will be a deprecated alias of
 -fsanitize-recover=undefined, and
 -fno-sanitize-recover will be a deprecated alias of
 -fno-sanitize-recover=undefined.
 If a user provides -fsanitize-recover=address, we can instruct the
 instrumentation pass to
 use recoverable instrumentation.

 Would we accept -fsanitize-recover=undefined 
 -fno-sanitize-recover=signed-integer-overflow
 as recovering everything but signed integer overflows, i.e. the decision
 whether to recover a particular call would check similar bitmask as
 is checked whether to sanitize something at all?

Yes, the logic for creating a set of recoverable sanitizers should be
the same as the logic for creating a set of enabled sanitizers.

-- 
Alexey Samsonov, Mountain View, CA


Re: [PATCHv3][PING] Enable -fsanitize-recover for KASan

2014-09-29 Thread Alexey Samsonov
(resending in plain-text mode)

-fasan-recover doesn't look like a good idea - for instance, in Clang,
we never use ?san
in flag names, preferring -fsanitize-whatever. What's the rationale
behind splitting
-fsanitize-recover in two flags (ASan- and UBSan- specific)?
Is there no way to keep a single -f(no-)sanitize-recover for that
purpose? Now it works
only for UBSan checks, but we may extend it to another sanitizers as well.

On Mon, Sep 29, 2014 at 2:20 PM, Konstantin Serebryany
konstantin.s.serebry...@gmail.com wrote:
 +Alexey Samsonov

 On Mon, Sep 29, 2014 at 10:43 AM, Jakub Jelinek ja...@redhat.com wrote:
 On Mon, Sep 29, 2014 at 09:21:11PM +0400, Yury Gribov wrote:
 This patch enables -fsanitize-recover for KASan by default. This causes
 KASan to continue execution after error in case of inline
 instrumentation. This feature is needed because
 - reports during early bootstrap won't even be printed
 - needed to run all tests w/o rebooting machine for every test
 - needed for interactive work on desktop

 This is the third version of patch which renames -fsanitize-recover to
 -fubsan-recover and introduces -fasan-recover (enabled by default for
 KASan). It also moves flag handling to finish_options per Jakub's request.

 As the -fsanitize-recover option comes from clang originally, I think
 this needs coordination with them (whether clang will also rename the
 option), and certainly keep -fsanitize-recover as a non-documented
 compat option alias for -fubsan-recover.
 So, can you please talk to the clang folks about it?

 Jakub



-- 
Alexey Samsonov, Mountain View, CA


Re: [PATCHv3][PING] Enable -fsanitize-recover for KASan

2014-09-29 Thread Alexey Samsonov
On Mon, Sep 29, 2014 at 4:17 PM, Jakub Jelinek ja...@redhat.com wrote:
 On Mon, Sep 29, 2014 at 03:36:20PM -0700, Alexey Samsonov wrote:
 -fasan-recover doesn't look like a good idea - for instance, in Clang, we
 never use ?san
 in flag names, preferring -fsanitize-whatever. What's the rationale behind
 splitting
 -fsanitize-recover in two flags (ASan- and UBSan- specific)?
 Is there no way to keep a single -f(no-)sanitize-recover for that purpose?
 Now it works
 only for UBSan checks, but we may extend it to another sanitizers as well.

 The problem is that if we start using it for ASan, it needs to have a
 different default, because ASan wants to abort by default, while UBSan
 recover by default.  -fsanitize=kernel-address w (KASan) wants to recover
 by default.  So, the option is either to never support recover for
 -fsanitize=address, for ubsan keep -fsanitize-recover (by default) as is
 and for kasan use that same switch, or have separate flags.

 Jakub

I don't think we ever going to support recovery for regular ASan
(Kostya, correct me if I'm wrong).
I see no problem in enabling -fsanitize-recover by default for
-fsanitize=undefined and
-fsanitize=kernel-address. We can, potentially, extend
-fsanitize-recover flag to take the same values as -fsanitize= one,
so that one can specify which sanitizers are recoverable, and which
are not, but I'd try to make this a last resort - this is too complex.

-- 
Alexey Samsonov, Mountain View, CA


Re: [PATCHv3][PING] Enable -fsanitize-recover for KASan

2014-09-29 Thread Alexey Samsonov
On Mon, Sep 29, 2014 at 5:24 PM, Konstantin Serebryany
konstantin.s.serebry...@gmail.com wrote:

 On Mon, Sep 29, 2014 at 4:26 PM, Alexey Samsonov samso...@google.com wrote:
  On Mon, Sep 29, 2014 at 4:17 PM, Jakub Jelinek ja...@redhat.com wrote:
  On Mon, Sep 29, 2014 at 03:36:20PM -0700, Alexey Samsonov wrote:
  -fasan-recover doesn't look like a good idea - for instance, in Clang, we
  never use ?san
  in flag names, preferring -fsanitize-whatever. What's the rationale behind
  splitting
  -fsanitize-recover in two flags (ASan- and UBSan- specific)?
  Is there no way to keep a single -f(no-)sanitize-recover for that purpose?
  Now it works
  only for UBSan checks, but we may extend it to another sanitizers as well.
 
  The problem is that if we start using it for ASan, it needs to have a
  different default, because ASan wants to abort by default, while UBSan
  recover by default.  -fsanitize=kernel-address w (KASan) wants to recover
  by default.  So, the option is either to never support recover for
  -fsanitize=address, for ubsan keep -fsanitize-recover (by default) as is
  and for kasan use that same switch, or have separate flags.
 
  Jakub
 
  I don't think we ever going to support recovery for regular ASan
  (Kostya, correct me if I'm wrong).

 I hope so too.
 Another point is that with asan-instrumentation-with-call-threshold=0
 (instrumentation with callbacks)
 we can and probably will allow to recover from errors (glibc demands that),
 but that does not require any compile-time flag.


So, if we switch to instrumentation to callbacks (which are not
necessarily noreturn),
-fsanitize-recover is the thing we plan to support?




  I see no problem in enabling -fsanitize-recover by default for
  -fsanitize=undefined and

 This becomes more interesting when we use asan and ubsan together.
 Which default setting is stronger? :)

We can preserve the default behavior for each tool (no recovery for
ASan, recovery for UBSan)
unless recovery mode is explicitly specified.


  -fsanitize=kernel-address. We can, potentially, extend
  -fsanitize-recover flag to take the same values as -fsanitize= one,
  so that one can specify which sanitizers are recoverable, and which
  are not, but I'd try to make this a last resort - this is too complex.
 
  --
  Alexey Samsonov, Mountain View, CA




-- 
Alexey Samsonov, Mountain View, CA


Re: [PATCH] libsanitizer demangling using cp-demangle.c

2014-01-16 Thread Alexey Samsonov
I've landed patches for libbacktrace and cp-demangle support in LLVM.
However, they required some changes (e.g. some files LLVM trunk were
modified after the last merge). This means that the next merge to GCC
(IIUC it won't happen anytime soon before GCC 4.9 release) will not be
clean. Sorry for delay.

On Fri, Jan 10, 2014 at 7:56 AM, Konstantin Serebryany
konstantin.s.serebry...@gmail.com wrote:
 On Thu, Jan 9, 2014 at 5:57 PM, Jakub Jelinek ja...@redhat.com wrote:
 On Thu, Jan 09, 2014 at 05:51:05PM +0400, Konstantin Serebryany wrote:
 On Tue, Dec 10, 2013 at 3:38 PM, Jakub Jelinek ja...@redhat.com wrote:
  On Fri, Dec 06, 2013 at 06:40:52AM -0800, Ian Lance Taylor wrote:
  There was a recent buggy patch to the demangler that added calls to
  malloc and realloc (2013-10-25 Gary Benson gben...@redhat.com).
  That patch must be fixed or reverted before the 4.9 release.  The main
  code in the demangler must not call malloc/realloc.
 
  When that patch is fixed, you can use the cplus_demangle_v3_callback
  function to get a demangler that never calls malloc.
 
  AFAIK Gary is working on a fix, when that is fixed, with the following
  patch libsanitizer (when using libbacktrace for symbolization) will not
  use system malloc/realloc/free for the demangling at all.
 
  Tested on x86_64-linux (-m64/-m32).  Note that the changes for the 3 files
  unfortunately will need to be applied upstream to compiler-rt, is that
  possible?
 
  2013-12-10  Jakub Jelinek  ja...@redhat.com
 
  * sanitizer_common/sanitizer_symbolizer_libbacktrace.h
  (LibbacktraceSymbolizer::Demangle): New declaration.
  * sanitizer_common/sanitizer_symbolizer_posix_libcdep.cc

 sanitizer_symbolizer_posix_libcdep.cc is the file from upstream.
 If it gets any change in the GCC variant, I will not be able to do
 merges from upstream until the same code is applied upstream.

 Sure, but we are nearing GCC 4.9 stage3 finish and really need to demangle
 the libbacktrace provided output.  Has the compiler-rt situation been
 cleared up?

 I hope it just did (see the fresh Chandler's reply).

 --kcc

 Haven't seen any follow-ups after Chandler's reversion.
 So, this change is meant to be temporary, with hope that in upstream this
 will be resolved, either with the same patch or something similar.

 Jakub



-- 
Alexey Samsonov, MSK


Re: [PATCH] Handle PIEs in libbacktrace

2013-12-06 Thread Alexey Samsonov
(now in plain-text mode).

ASan calls this function only for global variables (not functions). It
is needed because
we use a different machinery to pass global names to ASan runtime - we
register globals
by a call from instrumented code, which also tells runtime the
(mangled) global name.

OTOH, in TSan we rely on symbolization to demangle names of global variables.


  From what I can see, even when
 you are using llvm-symbolizer, sanitizer_common doesn't pass -demangle
 option to it.


-demangle defaults to on in llvm-symbolizer.


On Fri, Dec 6, 2013 at 12:51 PM, Dmitry Vyukov dvyu...@google.com wrote:
 On Fri, Dec 6, 2013 at 12:25 PM, Jakub Jelinek ja...@redhat.com wrote:
 On Fri, Dec 06, 2013 at 12:19:45PM +0400, Dmitry Vyukov wrote:
  And the reason why check-g++ tsan.exp fails even with this patch is
  that apparently tsan doesn't try to demangle the symbol names, so we get
  e.g.:

 Demangling must be done by the symbolizer.
 +samsonov for this

 So why does asan_report.cc have then:
 static const char *MaybeDemangleGlobalName(const char *name) {
   // We can spoil names of globals with C linkage, so use an heuristic
   // approach to check if the name should be demangled.
   return (name[0] == '_'  name[1] == 'Z')
  ? Symbolizer::Get()-Demangle(name)
  : name;
 }
 and uses it where it wants to demangle?

 I would say this is a defect. It's symbolizer responsibility.

 From what I can see, even when
 you are using llvm-symbolizer, sanitizer_common doesn't pass -demangle
 option to it.


 It defaults to true:
 static cl::optbool
 ClDemangle(demangle, cl::init(true), cl::desc(Demangle function names));



-- 
Alexey Samsonov, MSK


Re: [PATCH] Use libbacktrace as libsanitizer's symbolizer

2013-11-27 Thread Alexey Samsonov
On Mon, Nov 25, 2013 at 7:02 PM, Jakub Jelinek ja...@redhat.com wrote:
 On Mon, Nov 25, 2013 at 06:53:59PM +0400, Alexey Samsonov wrote:
  In GCC, libbacktrace is built as a libtool convenience library only and
  then linked into whatever libraries want to use it.  So indeed, the plan
  is to link libbacktrace.la into libasan.so.1.0.0 and libasan.a
  (and the earlier GCC patch I've posted included corresponding toplevel
  configure/Makefile bits and libsanitizer Makefile/configure changes to make
  it happen).  E.g. libgo.so.* links libbacktrace.la into itself too.
  So, for GCC user experience, things will work automatically.

 Good. Note, however, that you'd need at least two versions of
 libbacktrace - 32-bit and 64-bit.

 In GCC you get that automatically (unless --disable-multilib, but then you
 don't get 32-bit libsanitizer either say on x86_64-linux).

  it means finding some way in the build system/configure to detect
  or request libbacktrace availability (for libraries not included in
  GCC tree GCC's configure typically uses --with-package=path
  or --with-package-lib=path/--with-package-include=path) and just
  link against it and you'd just unpack/configure/build libbacktrace
  on one of your test bots.

 We indeed can add support for detecting presence of libbacktrace
 binaries and headers
 when we configure a build tree. But to just link against it we have
 to either link against it in Clang
 driver (I don't think we should add support for it), or link against
 it in all the places where we build something
 with -fsanitize=foo, and there are quite a few of them.

 Or change the rules for creation of *sanitizer_common.*a, so that you
 link the libbacktrace.la convenience library into it (if configure found
 it).  If you build it using libtool/automake, you get it automatically,
 otherwise you can e.g. using ar unpack libbacktrace.a and note the filenames
 of the object files from it, then add those to the ar command line for
 creation of *sanitizer_common.*a.

This would be messy, especially assuming that we have two (both pretty ugly)
build systems in compiler-rt repository. And the argument about two
versions (32-bit and 64-bit) of
libbacktrace is more serious in this setting. Also, see note below.

 As for redirecting libc calls from libbacktrace to our internal
 versions (or interceptors) -
 if it works fine for our toy tests (which is the case), I suggest to
 wait for the actual problems gcc
 users may discover. If there will be need in compiling a separate
 sanitizer-specific version of libbacktrace,
 or providing an additional layer between libbacktrace and sanitizer
 runtimes, or both, this could probably
 be done in gcc version of the runtime only (w/o modifying any existing
 source files, only adding standalone gcc-specific stuff).

 Sure, worst case you keep it untested in your LLVM copy of libsanitizer
 and we'll just need to fix it up during merges if something breaks.
 If it will be used for GCC (and we have a P1 bug so it is a release blocker
 if llvm-symbolizer is still used), then it will be tested there.

Yes, I think that's what we should do in the short term.
I've submitted (slightly edited) version of your patch to compiler-rt
as r195837, I think
it should work after the next merge to gcc. I've done a minimal
testing of this change
(built the runtime with -DSANITIZER_LIBBACKTRACE, headers provided,
libbacktrace.a linked),
and it kind of worked - libbacktrace functions were properly picked
up, error callbacks was not called,
but the symbolization didn't work either:

Turns out libbacktrace doesn't work with executables produced by LLVM:
emitted compile unit DIEs don't have neither
DW_AT_low_pc/DW_AT_high_pc pair, nor DW_AT_ranges reference.
It means that to build a set of address ranges corresponding to each
compile unit, one actually
has to extract all the subprogram DIEs. I treat this as a bug in
libbacktrace (unless it's goal is to work only
for GCC-produced binaries). That's what I was afraid when Ian
suggested us to fork the sources.

-- 
Alexey Samsonov, MSK


Re: [PATCH] Use libbacktrace as libsanitizer's symbolizer

2013-11-27 Thread Alexey Samsonov
On Wed, Nov 27, 2013 at 4:20 PM, Jakub Jelinek ja...@redhat.com wrote:
 On Wed, Nov 27, 2013 at 04:12:33PM +0400, Alexey Samsonov wrote:
  Sure, worst case you keep it untested in your LLVM copy of libsanitizer
  and we'll just need to fix it up during merges if something breaks.
  If it will be used for GCC (and we have a P1 bug so it is a release blocker
  if llvm-symbolizer is still used), then it will be tested there.

 Yes, I think that's what we should do in the short term.
 I've submitted (slightly edited) version of your patch to compiler-rt
 as r195837, I think

 Thanks, once Kostya merges it in, I'll rework a patch for it (which
 hopefully will not need to change anything in the sanitizer_common/
 stuff, just Makefiles/configure and other stuff owned by GCC).

 it should work after the next merge to gcc. I've done a minimal
 testing of this change
 (built the runtime with -DSANITIZER_LIBBACKTRACE, headers provided,
 libbacktrace.a linked),
 and it kind of worked - libbacktrace functions were properly picked
 up, error callbacks was not called,
 but the symbolization didn't work either:

 Turns out libbacktrace doesn't work with executables produced by LLVM:
 emitted compile unit DIEs don't have neither
 DW_AT_low_pc/DW_AT_high_pc pair, nor DW_AT_ranges reference.

 I'd call that a LLVM bug, not emitting those attributes on a CU sounds like
 a flaw in it.

LLVM emits just a DW_AT_low_pc (base address of a compilation unit). The
standard tells that compilation unit entries may have attributes
specifying the
address range, but doesn't tell they are obligatory.
DWARF consumers probably shouldn't rely on their presence.

-- 
Alexey Samsonov, MSK


Re: [PATCH] Use libbacktrace as libsanitizer's symbolizer

2013-11-27 Thread Alexey Samsonov
On Wed, Nov 27, 2013 at 4:51 PM, Jakub Jelinek ja...@redhat.com wrote:
 On Wed, Nov 27, 2013 at 04:31:48PM +0400, Alexey Samsonov wrote:
 LLVM emits just a DW_AT_low_pc (base address of a compilation unit). The
 standard tells that compilation unit entries may have attributes
 specifying the
 address range, but doesn't tell they are obligatory.
 DWARF consumers probably shouldn't rely on their presence.

 DWARF is generally full of may have and almost no must have.  If the
 DWARF consumer crashes on absence of some attribute/die etc., that would
 be of course a consumer's fault.  But not being able to symbolize something
 because the provided DWARF info omits important stuff is not wrong.

Ok, I agree to call it missing feature instead of bug =)

 BTW, libbacktrace also uses symbol table (and for backtrace_syminfo
 uses that exclusively), so even if it can't symbolize using DWARF info,
 it should at least symbolize using symbol table (of course inlining info
 isn't available in that case, nor tail call frames (but libbacktrace doesn't
 support those yet, only GDB does so far)).

Yes, symbol table symbolization works for me (it has no file/line
info, of course).

-- 
Alexey Samsonov, MSK


Re: [PATCH] Use libbacktrace as libsanitizer's symbolizer

2013-11-27 Thread Alexey Samsonov
On Wed, Nov 27, 2013 at 9:44 PM, Ian Lance Taylor i...@google.com wrote:
 On Wed, Nov 27, 2013 at 4:31 AM, Alexey Samsonov samso...@google.com wrote:

 LLVM emits just a DW_AT_low_pc (base address of a compilation unit). The
 standard tells that compilation unit entries may have attributes
 specifying the
 address range, but doesn't tell they are obligatory.
 DWARF consumers probably shouldn't rely on their presence.

 How are consumers of LLVM debug info expected to determine the address
 range of the function?

Sorry, I wasn't clear enough. GCC emits DW_AT_low_pc and DW_AT_high_pc for
*compile unit* DIE, so the consumer doesn't have to iterate over all
the functions DIEs.
LLVM doesn't emit DW_AT_low_pc/DW_AT_high_pc for compile unit,
but emits them for function DIEs.

I've added some debugging output to libbacktrace and noticed it was
able to build
[address range]-[compile unit] mapping only for sources compiled with GCC, so I
suggested that it doesn't examine all the descendants of compile unit
DIEs to build the ranges.

 Is the intent that the consumer should assume
 that the function continues from the DW_AT_low_pc address to the next
 DW_AT_low_pc address from the debug info?  Or should the consumer look
 at the function size from the symbol table?  Or should the consumer
 examine all the DIEs looking for address ranges?

 Does LLVM support anything like GCC's -freorder-blocks-and-partition
 option, and, if so, how does it represent the address ranges in the
 debug info?

 Ian

-- 
Alexey Samsonov, MSK


Re: [PATCH] Use libbacktrace as libsanitizer's symbolizer

2013-11-25 Thread Alexey Samsonov
On Fri, Nov 22, 2013 at 10:34 PM, Jakub Jelinek ja...@redhat.com wrote:
 On Fri, Nov 22, 2013 at 10:19:02PM +0400, Alexey Samsonov wrote:
 On Tue, Nov 19, 2013 at 8:42 PM, Jakub Jelinek ja...@redhat.com wrote:
  Ok, here it is (untested though, because libsanitizer in gcc is older and I
  don't have spare cycles to play with llvm).
  Guess you need to write it up into compiler-rt buildsystem somehow,
  and for testing add
  -DSANITIZER_LIBBACKTRACE -I/srcdir_of_libbacktrace 
  -I/builddir_of_libbacktrace -L/builddir_of_libbacktrace/.libs 
  -lbacktrace
  This is against current libbacktrace (r205028 or later, before that
  the syminfo callbacks had one less argument).


 Do you have an idea of how the libbacktrace symbolizer should look to
 the end user (if the static runtime was built with
 -DSANITIZER_LIBBACKTRACE)? I see the following options, none of which
 looks nice:
 1) Force the user to provide -L/path/to/libbacktrace -lbacktrace at
 link time (otherwise one will see link errors like undefined symbol:
 backtrace_create_state). Certainly this is a bad user experience.
 2) Make -fsanitize=address link flag imply -lbacktrace as we do for
 libpthread, librt etc. But libbacktrace may not be installed in the
 system.
 3) Pull libbacktrace.a inside libasan.a. This means we should teach
 the build system to merge two archives (or have access to backtrace
 sources at build time and recompile them ourselves). This is doable,
 but kind of ugly and leads to funny consequences - we'll fail to build
 libbacktrace itself with ASan (or, arguably, any program which uses
 it).

 I ask this because I'm slightly afraid libbacktrace symbolizer will be
 pretty undertested in the llvm tree and we may break it without
 noticing. Building it is not a problem, but making all the tests just
 work with it is harder.

 In GCC, libbacktrace is built as a libtool convenience library only and
 then linked into whatever libraries want to use it.  So indeed, the plan
 is to link libbacktrace.la into libasan.so.1.0.0 and libasan.a
 (and the earlier GCC patch I've posted included corresponding toplevel
 configure/Makefile bits and libsanitizer Makefile/configure changes to make
 it happen).  E.g. libgo.so.* links libbacktrace.la into itself too.
 So, for GCC user experience, things will work automatically.

Good. Note, however, that you'd need at least two versions of
libbacktrace - 32-bit and 64-bit.

 As Ian said, libbacktrace right now is not maintained as it's own separate
 entity that would be individually installable etc.  So, for testing
 in LLVM, if you don't want to copy over libbacktrace sources in there,

I don't think we want to copy libbacktrace sources into LLVM and maintain it :(

 it means finding some way in the build system/configure to detect
 or request libbacktrace availability (for libraries not included in
 GCC tree GCC's configure typically uses --with-package=path
 or --with-package-lib=path/--with-package-include=path) and just
 link against it and you'd just unpack/configure/build libbacktrace
 on one of your test bots.

We indeed can add support for detecting presence of libbacktrace
binaries and headers
when we configure a build tree. But to just link against it we have
to either link against it in Clang
driver (I don't think we should add support for it), or link against
it in all the places where we build something
with -fsanitize=foo, and there are quite a few of them.

As for redirecting libc calls from libbacktrace to our internal
versions (or interceptors) -
if it works fine for our toy tests (which is the case), I suggest to
wait for the actual problems gcc
users may discover. If there will be need in compiling a separate
sanitizer-specific version of libbacktrace,
or providing an additional layer between libbacktrace and sanitizer
runtimes, or both, this could probably
be done in gcc version of the runtime only (w/o modifying any existing
source files, only adding standalone gcc-specific stuff).

That said, I'm going to do a basic sanity test of Jakub's patch
applied to LLVM trunk (test that it build fine with and w/o
-DSANITIZER_LIBBACKTRACE
and works if we add -lbacktrace in the latter case), and commit it, so
that we can merge this into gcc. Once again, sorry for the delays.

-- 
Alexey Samsonov, MSK


Re: [PATCH] Use libbacktrace as libsanitizer's symbolizer

2013-11-22 Thread Alexey Samsonov
;
 +if (libbacktrace_symbolizer_) {
 +  mu_.CheckLocked();
 +  libbacktrace_symbolizer_-SymbolizeData(info);
 +  return true;
 +}
  const char *str = SendCommand(true, module_name, module_offset);
  if (str == 0)
return true;
 @@ -455,7 +477,9 @@ class POSIXSymbolizer : public Symbolize
}

bool IsAvailable() {
 -return internal_symbolizer_ != 0 || external_symbolizer_ != 0;
 +return internal_symbolizer_ != 0 ||
 +  libbacktrace_symbolizer_ != 0 ||
 +  external_symbolizer_ != 0;
}

bool IsExternalAvailable() {
 @@ -573,14 +597,19 @@ class POSIXSymbolizer : public Symbolize

ExternalSymbolizer *external_symbolizer_;// Leaked.
InternalSymbolizer *const internal_symbolizer_;  // Leaked.
 +  LibbacktraceSymbolizer *libbacktrace_symbolizer_;  // Leaked.
  };

  Symbolizer *Symbolizer::PlatformInit(const char *path_to_external) {
InternalSymbolizer* internal_symbolizer =
InternalSymbolizer::get(symbolizer_allocator_);
 +  LibbacktraceSymbolizer* libbacktrace_symbolizer = 0;
ExternalSymbolizer *external_symbolizer = 0;

 -  if (!internal_symbolizer) {
 +  if (!internal_symbolizer)
 +libbacktrace_symbolizer =
 +  LibbacktraceSymbolizer::get(symbolizer_allocator_);
 +  if (!internal_symbolizer  !libbacktrace_symbolizer) {
  if (!path_to_external || path_to_external[0] == '\0')
path_to_external = FindPathToBinary(llvm-symbolizer);

 @@ -593,7 +622,8 @@ Symbolizer *Symbolizer::PlatformInit(con
}

return new(symbolizer_allocator_)
 -  POSIXSymbolizer(external_symbolizer, internal_symbolizer);
 +  POSIXSymbolizer(external_symbolizer, libbacktrace_symbolizer,
 + internal_symbolizer);
  }

  }  // namespace __sanitizer


 Jakub




-- 
Alexey Samsonov, MSK


Re: [PATCH] Use libbacktrace as libsanitizer's symbolizer

2013-11-19 Thread Alexey Samsonov
Hi Jakub,

Can you please an updated patch in the attachment?

On Mon, Nov 18, 2013 at 8:23 PM, Jakub Jelinek ja...@redhat.com wrote:
 On Mon, Nov 18, 2013 at 07:50:33PM +0400, Alexey Samsonov wrote:
 On Mon, Nov 18, 2013 at 7:49 PM, Alexey Samsonov samso...@google.com wrote:
  Unfortunately, recently there were a few enhancements to
  sanitizer_symbolizer_posix_libcdep.cc (and friends),
  in LLVM trunk, and now it looks different from gcc version (apparently, the
  changes were committed
  after the merge to gcc happened, I should have pointed this out earlier,
  sorry).

  Kostya (or Jakub), is it possible to somehow pick up the changes? Otherwise
  this patch can't go in ASan runtime
  in gcc - the code will significantly diverge.

 Here is an (untested) forward port of the patch to what is in llvm svn right
 now.  The unpatched code always had either internal_symbolizer_ != NULL, or
 external_symbolizer_ != NULL, or both NULL, but not both set.  The patch
 just adds a third variant, libbacktrace_symbolizer_, again, at most one
 of the 3 will be non-NULL, and the priorities are that internal_symbolizer_
 has highest priority, then libbacktrace (if available/usable), then the
 external one.

 --- sanitizer_symbolizer_posix_libcdep.cc.jj2013-11-12 19:35:30.0 
 +0100
 +++ sanitizer_symbolizer_posix_libcdep.cc   2013-11-18 17:16:03.202643957 
 +0100
 @@ -27,6 +27,16 @@
  #include sys/wait.h
  #include unistd.h

 +#ifdef SANITIZER_LIBBACKTRACE
 +#include backtrace-supported.h
 +#if SANITIZER_LINUX  BACKTRACE_SUPPORTED \
 + !BACKTRACE_USES_MALLOC  BACKTRACE_SUPPORTS_THREADS
 +#include backtrace.h
 +#else
 +#undef SANITIZER_LIBBACKTRACE
 +#endif
 +#endif
 +
  // C++ demangling function, as required by Itanium C++ ABI. This is weak,
  // because we do not require a C++ ABI library to be linked to a program
  // using sanitizers; if it's not present, we'll just use the mangled name.
 @@ -364,12 +374,124 @@ class InternalSymbolizer {

  #endif  // SANITIZER_SUPPORTS_WEAK_HOOKS

 +#if SANITIZER_LIBBACKTRACE
 +namespace {
 +
 +struct SymbolizeCodeData {
 +  AddressInfo *frames;
 +  uptr n_frames;
 +  uptr max_frames;

max_frames is not used anywhere.

 +  const char *module_name;
 +  uptr module_offset;
 +};
 +
 +extern C {

why do you need extern C here?

 +
 +static int SymbolizeCodePCInfoCallback(void *vdata, uintptr_t addr,
 +  const char *filename, int lineno,
 +  const char *function) {
 +  SymbolizeCodeData *cdata = (SymbolizeCodeData *)vdata;
 +  if (function) {
 +AddressInfo *info = cdata-frames[cdata-n_frames++];
 +info-Clear();
 +info-FillAddressAndModuleInfo(addr, cdata-module_name, 
 cdata-module_offset);
 +info-function = internal_strdup(function);
 +if (filename)
 +  info-file = internal_strdup(filename);
 +info-line = lineno;
 +  }
 +  return 0;
 +}
 +
 +static void SymbolizeCodeCallback(void *vdata, uintptr_t addr,
 + const char *symname, uintptr_t) {
 +  SymbolizeCodeData *cdata = (SymbolizeCodeData *)vdata;
 +  if (symname) {
 +AddressInfo *info = cdata-frames[0];
 +info-Clear();
 +info-FillAddressAndModuleInfo(addr, cdata-module_name, 
 cdata-module_offset);
 +info-function = internal_strdup(symname);
 +cdata-n_frames = 1;
 +  }
 +}
 +
 +static void SymbolizeDataCallback(void *vdata, uintptr_t,
 + const char *symname, uintptr_t symval) {
 +  DataInfo *info = (DataInfo *)vdata;
 +  if (symname  symval) {
 +info-name = internal_strdup(symname);
 +info-start = symval;
 +  }
 +}
 +
 +static void ErrorCallback(void *, const char *, int) {
 +}
 +
 +}
 +
 +}
 +
 +class LibbacktraceSymbolizer {
 + public:
 +  static LibbacktraceSymbolizer *get(LowLevelAllocator *alloc) {
 +backtrace_state *state
 +  = backtrace_create_state(/proc/self/exe, 1, ErrorCallback, NULL);

You may want to use ReadBinaryName() here (in case it was cached
earlier, e.g. before
we create a sandbox).

 +if (!state)
 +  return 0;
 +return new(*alloc) LibbacktraceSymbolizer(state);
 +  }
 +
 +  uptr SymbolizeCode(uptr addr, AddressInfo *frames, uptr max_frames,
 +const char *module_name, uptr module_offset) {
 +SymbolizeCodeData data;
 +data.frames = frames;
 +data.n_frames = 0;
 +data.max_frames = max_frames;
 +data.module_name = module_name;
 +data.module_offset = module_offset;
 +backtrace_pcinfo(state_, addr, SymbolizeCodePCInfoCallback, 
 ErrorCallback,
 +data);
 +if (data.n_frames)
 +  return data.n_frames;
 +backtrace_syminfo(state_, addr, SymbolizeCodeCallback, ErrorCallback, 
 data);
 +return data.n_frames;
 +  }
 +
 +  void SymbolizeData(DataInfo *info) {
 +backtrace_syminfo(state_, info-address, SymbolizeDataCallback,
 + ErrorCallback, info);
 +  }
 +
 + private:
 +  LibbacktraceSymbolizer

Re: [PATCH] Use libbacktrace as libsanitizer's symbolizer

2013-11-18 Thread Alexey Samsonov
(resending in plain text)

On Mon, Nov 18, 2013 at 7:49 PM, Alexey Samsonov samso...@google.com wrote:
 Hi Jakub,

 Unfortunately, recently there were a few enhancements to
 sanitizer_symbolizer_posix_libcdep.cc (and friends),
 in LLVM trunk, and now it looks different from gcc version (apparently, the
 changes were committed
 after the merge to gcc happened, I should have pointed this out earlier,
 sorry).

 Kostya (or Jakub), is it possible to somehow pick up the changes? Otherwise
 this patch can't go in ASan runtime
 in gcc - the code will significantly diverge.

 On Mon, Nov 18, 2013 at 5:39 PM, Jakub Jelinek ja...@redhat.com wrote:

 Hi!

 This patch adds libbacktrace symbolizer to libsanitizer, with which we
 can avoid spawning and running external an external program (especially
 when
 it is not even provided by gcc itself).

 I've kept the possibility to override the symbolizer by magic symbols
 (InternalSymbolizer), and as I have no idea how the llvm buildsystem etc.
 works and what is the possibility there to add libbacktrace, this just
 requires users to compile with SANITIZE_LIBBACKTRACE defined to signal
 that backtrace-supported.h and backtrace.h is available and the
 sanitizer_symbolizer_posix_libcdep.cc source then decides based on
 backtrace-supported.h etc. whether it is usable.

 make check RUNTESTFLAGS='asan.exp ubsan.exp' passes with this.

 Some pending issues on the libbacktrace side:
 1) right now libbacktrace performs dl_iterate_phdr only the first time
backtrace_pcinfo or backtrace_syminfo is called, if there are some
dlopens/dlcloses in between that and another querying of the
 symbolizer,
it won't notice that.  Perhaps it can be done only when we don't
find a symbol and/or have some function that tries to dl_iterare_phdr
again, look at cached st_ino/st_mtime or similar, and for threaded
version likely just keep old records, just add a flag to them that they
should be ignored (or say atomically decrease symbol count to zero
and something similar for debug info).
 2) for tsan querying of data symbols, apparently the classes want to see
not just the symbol name and start value, but also size.  libbacktrace
has all this info available, just doesn't pass it down to the callback.
I wonder if we'd need to create yet another libbacktrace entrypoint, or
if it would be acceptable to do source code incompatible, ABI (at least
on all sane targets) compatible version of just adding another
uintptr_t symsize argument to backtrace_syminfo_callback.
 3) I wonder if libbacktrace couldn't be updated to use __atomic_*
 builtins,
then it could avoid the ugliness to emulate atomic loads and stores.

 As for sanitizer, the reason I haven't implemented SendCommand method for
 the libbacktrace symbolizer is that the library doesn't provide the info
 as text, but as individual values passed to the callback, so printing
 that to text and then parsing the text would be very ugly.  libbacktrace
 doesn't even need the module names and module offsets,


If libbacktrace does its own call of dl_iterate_phdr, then it doesn't need
many
pieces of existing Symbolizer (or POSIXSymbolizer, in new version of source
code)
complexity - like FindModuleForAddress() function. Consider creating a
separate
wrapper class for libbacktrace functionality and returning it from
Symbolizer::PlatformInit
factory (in the new version of code) instead of POSIXSymbolizer.


 so supposedly
 we would need that only if libbacktrace failed to get accurate inline/call
 or symbol info.  While the classes have both symbol name/file/line etc.
 and module name/offset fields, apparently the latter are unused if the
 former is filled in.

 2013-11-18  Jakub Jelinek  ja...@redhat.com

 PR sanitizer/59136
 * configure.ac: Don't add target-libbacktrace to noconfigdirs
 just because go hasn't been enabled if target-libsanitizer isn't
 in
 noconfigdirs.
 * Makefile.def: Add configure-target-libsanitizer dependency on
 configure-target-libbacktrace and all-target-libsanitizer
 dependency
 on configure-target-libsanitizer.
 * configure: Regenerated.
 libsanitizer/
 * sanitizer_common/Makefile.am (AM_CPPFLAGS): If
 SANITIZER_LIBBACKTRACE, append -I for libbacktrace headers and
 -DSANITIZER_LIBBACKTRACE.
 * sanitizer_common/sanitizer_symbolizer_posix_libcdep.cc: Add
 libbacktrace symbolizer.
 * tsan/Makefile.am (libtsan_la_LIBADD): Add libbacktrace.la if
 SANITIZER_LIBBACKTRACE.
 * asan/Makefile.am (libasan_la_LIBADD): Likewise.
 * ubsan/Makefile.am (libubsan_la_LIBADD): Likewise.
 * configure.ac (SANITIZER_LIBBACKTRACE): New AM_CONDITIONAL.
 * sanitizer_common/Makefile.in: Regenerated.
 * tsan/Makefile.in: Regenrated.
 * asan/Makefile.in: Regenerated.
 * ubsan/Makefile.in: Regenerated.
 * configure: Regenerated