Re: [PATCH 7/12] ubsan: _BitInt -fsanitize=undefined support [PR102989]

2023-08-22 Thread Richard Biener via Gcc-patches
On Wed, 9 Aug 2023, Jakub Jelinek wrote:

> Hi!
> 
> The following patch introduces some -fsanitize=undefined support for _BitInt,
> but some of the diagnostics are limited by the lack of proper support in the
> library.
> I've filed https://github.com/llvm/llvm-project/issues/64100 to request
> proper support.  For now, some of the diagnostics might have more or less
> confusing or inaccurate wording, but UB should still be diagnosed when it
> happens.

OK, you're the expert here.

Richard.

> 2023-08-09  Jakub Jelinek  
> 
>   PR c/102989
> gcc/
>   * internal-fn.cc (expand_ubsan_result_store): Add LHS, MODE and
>   DO_ERROR arguments.  For non-mode precision BITINT_TYPE results
>   check if all padding bits up to mode precision are zeros or sign
>   bit copies and if not, jump to DO_ERROR.
>   (expand_addsub_overflow, expand_neg_overflow, expand_mul_overflow):
>   Adjust expand_ubsan_result_store callers.
>   * ubsan.cc: Include target.h and langhooks.h.
>   (ubsan_encode_value): Pass BITINT_TYPE values which fit into pointer
>   size converted to pointer sized integer, pass BITINT_TYPE values
>   which fit into TImode (if supported) or DImode as those integer types
>   or otherwise for now punt (pass 0).
>   (ubsan_type_descriptor): Handle BITINT_TYPE.  For pstyle of
>   UBSAN_PRINT_FORCE_INT use TK_Integer (0x) mode with a
>   TImode/DImode precision rather than TK_Unknown used otherwise for
>   large/huge BITINT_TYPEs.
>   (instrument_si_overflow): Instrument BITINT_TYPE operations even when
>   they don't have mode precision.
>   * ubsan.h (enum ubsan_print_style): New enumerator.
> gcc/c-family/
>   * c-ubsan.cc (ubsan_instrument_shift): Use UBSAN_PRINT_FORCE_INT
>   for type0 type descriptor.

[PATCH 7/12] ubsan: _BitInt -fsanitize=undefined support [PR102989]

2023-08-09 Thread Jakub Jelinek via Gcc-patches
Hi!

The following patch introduces some -fsanitize=undefined support for _BitInt,
but some of the diagnostics are limited by the lack of proper support in the
library.
I've filed https://github.com/llvm/llvm-project/issues/64100 to request
proper support.  For now, some of the diagnostics might have more or less
confusing or inaccurate wording, but UB should still be diagnosed when it
happens.

2023-08-09  Jakub Jelinek  

PR c/102989
gcc/
* internal-fn.cc (expand_ubsan_result_store): Add LHS, MODE and
DO_ERROR arguments.  For non-mode precision BITINT_TYPE results
check if all padding bits up to mode precision are zeros or sign
bit copies and if not, jump to DO_ERROR.
(expand_addsub_overflow, expand_neg_overflow, expand_mul_overflow):
Adjust expand_ubsan_result_store callers.
* ubsan.cc: Include target.h and langhooks.h.
(ubsan_encode_value): Pass BITINT_TYPE values which fit into pointer
size converted to pointer sized integer, pass BITINT_TYPE values
which fit into TImode (if supported) or DImode as those integer types
or otherwise for now punt (pass 0).
(ubsan_type_descriptor): Handle BITINT_TYPE.  For pstyle of
UBSAN_PRINT_FORCE_INT use TK_Integer (0x) mode with a
TImode/DImode precision rather than TK_Unknown used otherwise for
large/huge BITINT_TYPEs.
(instrument_si_overflow): Instrument BITINT_TYPE operations even when
they don't have mode precision.
* ubsan.h (enum ubsan_print_style): New enumerator.
gcc/c-family/
* c-ubsan.cc (ubsan_instrument_shift): Use UBSAN_PRINT_FORCE_INT
for type0 type descriptor.
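
The padding check described for expand_ubsan_result_store can be illustrated outside the compiler. What follows is only a minimal sketch, not the RTL expansion from the patch: it assumes a 64-bit container standing in for the mode, and the helper name padding_bits_ok is hypothetical.

```cpp
#include <cstdint>

// Hypothetical helper, not part of the patch.  Returns true when the bits
// of RAW above PREC are a valid extension of the low PREC bits: all zeros
// for an unsigned type, copies of bit PREC-1 for a signed type.
// Assumes 1 <= prec <= 64 and a 64-bit container in place of the mode.
bool
padding_bits_ok (uint64_t raw, unsigned prec, bool is_unsigned)
{
  if (prec >= 64)
    return true;                          // no padding bits in the container
  uint64_t padding = raw >> prec;         // the 64 - prec upper bits
  uint64_t all_ones = ~uint64_t{0} >> prec;
  if (is_unsigned)
    return padding == 0;
  bool sign = (raw >> (prec - 1)) & 1;
  return padding == (sign ? all_ones : 0);
}
```

In the patch the failing case jumps to DO_ERROR; in this sketch it simply corresponds to a false return value.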

--- gcc/ubsan.cc.jj 2023-08-08 15:54:35.443599459 +0200
+++ gcc/ubsan.cc 2023-08-08 16:12:02.329939798 +0200
@@ -50,6 +50,8 @@ along with GCC; see the file COPYING3.
 #include "gimple-fold.h"
 #include "varasm.h"
 #include "realmpfr.h"
+#include "target.h"
+#include "langhooks.h"
 
 /* Map from a tree to a VAR_DECL tree.  */
 
@@ -125,6 +127,25 @@ tree
 ubsan_encode_value (tree t, enum ubsan_encode_value_phase phase)
 {
   tree type = TREE_TYPE (t);
+  if (TREE_CODE (type) == BITINT_TYPE)
+    {
+      if (TYPE_PRECISION (type) <= POINTER_SIZE)
+        {
+          type = pointer_sized_int_node;
+          t = fold_build1 (NOP_EXPR, type, t);
+        }
+      else
+        {
+          scalar_int_mode arith_mode
+            = (targetm.scalar_mode_supported_p (TImode) ? TImode : DImode);
+          if (TYPE_PRECISION (type) > GET_MODE_PRECISION (arith_mode))
+            return build_zero_cst (pointer_sized_int_node);
+          type
+            = build_nonstandard_integer_type (GET_MODE_PRECISION (arith_mode),
+                                              TYPE_UNSIGNED (type));
+          t = fold_build1 (NOP_EXPR, type, t);
+        }
+    }
   scalar_mode mode = SCALAR_TYPE_MODE (type);
   const unsigned int bitsize = GET_MODE_BITSIZE (mode);
   if (bitsize <= POINTER_SIZE)
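
The three cases in the ubsan_encode_value hunk above can be modelled as a small stand-alone function. This is a sketch under stated assumptions: the enum and the function classify_bitint are hypothetical names, and the target queries are replaced by plain parameters (pointer size in bits, and whether a 128-bit TImode is available, with 64-bit DImode as the fallback).

```cpp
// Hypothetical model of the hunk above; none of these names exist in GCC.
enum class bitint_encoding
{
  pointer_sized,  // value converted to a pointer-sized integer
  wide_integer,   // value converted to a TImode/DImode-wide integer
  punt_zero       // value too wide to pass: 0 is passed instead
};

bitint_encoding
classify_bitint (unsigned precision, unsigned pointer_size, bool has_timode)
{
  if (precision <= pointer_size)
    return bitint_encoding::pointer_sized;
  // TImode is 128 bits wide, DImode 64 bits.
  unsigned arith_precision = has_timode ? 128 : 64;
  if (precision > arith_precision)
    return bitint_encoding::punt_zero;
  return bitint_encoding::wide_integer;
}
```

For example, on a 64-bit target with TImode support, _BitInt(100) falls into the wide_integer case, while _BitInt(129) is punted.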
@@ -355,14 +376,32 @@ ubsan_type_descriptor (tree type, enum u
 {
   /* See through any typedefs.  */
   type = TYPE_MAIN_VARIANT (type);
+  tree type3 = type;
+  if (pstyle == UBSAN_PRINT_FORCE_INT)
+    {
+      /* Temporary hack for -fsanitize=shift with _BitInt(129) and more.
+         libubsan crashes if it is not TK_Integer type.  */
+      if (TREE_CODE (type) == BITINT_TYPE)
+        {
+          scalar_int_mode arith_mode
+            = (targetm.scalar_mode_supported_p (TImode)
+               ? TImode : DImode);
+          if (TYPE_PRECISION (type) > GET_MODE_PRECISION (arith_mode))
+            type3 = build_qualified_type (type, TYPE_QUAL_CONST);
+        }
+      if (type3 == type)
+        pstyle = UBSAN_PRINT_NORMAL;
+    }
 
-  tree decl = decl_for_type_lookup (type);
+  tree decl = decl_for_type_lookup (type3);
   /* It is possible that some of the earlier created DECLs were found
      unused, in that case they weren't emitted and varpool_node::get
      returns NULL node on them.  But now we really need them.  Thus,
      renew them here.  */
   if (decl != NULL_TREE && varpool_node::get (decl))
-    return build_fold_addr_expr (decl);
+    {
+      return build_fold_addr_expr (decl);
+    }
 
   tree dtype = ubsan_get_type_descriptor_type ();
   tree type2 = type;
@@ -370,6 +409,7 @@ ubsan_type_descriptor (tree type, enum u
   pretty_printer pretty_name;
   unsigned char deref_depth = 0;
   unsigned short tkind, tinfo;
+  char tname_bitint[sizeof ("unsigned _BitInt(2147483647)")];
 
   /* Get the name of the type, or the name of the pointer type.  */
   if (pstyle == UBSAN_PRINT_POINTER)
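
The tname_bitint declaration above sizes the buffer with sizeof on the longest name the formatting can produce, so the terminating NUL is included and no length constant has to be maintained by hand. A minimal stand-alone sketch of that idiom follows. The function bitint_type_name is hypothetical, and passing the precision as the %d argument is an assumption, since the real code takes its values from the tree type.

```cpp
#include <cstdio>
#include <string>

// Hypothetical stand-alone version of the name formatting; not GCC code.
std::string
bitint_type_name (bool is_unsigned, int precision)
{
  // sizeof on the string literal counts the terminating NUL, so the buffer
  // fits the "unsigned " prefix plus the largest 32-bit int precision.
  char buf[sizeof ("unsigned _BitInt(2147483647)")];
  std::snprintf (buf, sizeof (buf), "%s_BitInt(%d)",
                 is_unsigned ? "unsigned " : "", precision);
  return buf;
}
```
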
@@ -403,8 +443,18 @@ ubsan_type_descriptor (tree type, enum u
 }
 
   if (tname == NULL)
-    /* We weren't able to determine the type name.  */
-    tname = "<unknown>";
+    {
+      if (TREE_CODE (type2) == BITINT_TYPE)
+        {
+          snprintf (tname_bitint, sizeof (tname_bitint),
+                    "%s_BitInt(%d)", TYPE_UNSIGNED (type2) ? "unsigned " : "",
+