1. Introduction

The Summer of Code project "Better Uninitialized Warnings" had
ambitious goals. The option "Wuninitialized" has always been a source
of controversy and frustration. One of the reasons is the conflictive
views as to what should be warned. Initially, we identified two groups
of users:

* -Wuninitialized=verbose
  "Is there a code path through this function, when considered in
  isolation, and without being too clever, under which an uninitialized
  value is used?" [MM05]
  Produce consistent warnings across architectures and optimization
  levels, (and ideally releases). Warn about any potential case, even
  for unreachable code.

An example would be:

int i;
int j=5;
if (0)
  j = i; /* 'i' may be used uninitialized */
return j;

* -Wuninitialized=precise
  "Is there a code path through this function, when compiled on this
  architecture with these flags, etc., for which we might actually use an
  uninitialized value?" [MM05]
  Produce the most precise warnings possible. Ideally, when more
  optimisations are used, more false positives are detected and not
  warned.


In addition to specializing Wuninitialized into these two separate
flags, another objective of the project was to enable Wuninitialized
when not using optimisation, that is at -O0. Finally, open PRs related
to uninitialized warnings would be reviewed, analysed and, hopefully,
fixed.


2. Enabling -Wuninitialized at -O0

Since the current implementation of Wuninitialized relies on the SSA
representation, we would need to build the SSA tree when no
optimisations are enabled. The attached patch "enable-ssa-at-O0.diff"
implements this and also removes any warnings about enabling
Wuninitialized without optimisation. This effectively allows to use
-Wuninitialized with -O0. However, with just this patch, only "is used
uninitialized" warnings would be enabled. The patch allows to
bootstrap GCC with BOOT_CFLAGS="-O0". Yet, running the testsuite
resulted in numerous failures. When bootstrapped with default flags,
no new regressions are noticed.

There is plenty room for improvement on this. For example, some
fine-tuning will discard anything that is enabled by this patch and
that is not required to detect uninitialized variables. Also, alias
information is enabled too late to improve the precision of the
warnings.


3. New flags -Wuninitialized=verbose and -Wuninitialized=precise

The attached patch "new-wuninitialized.diff" changes Wuninitialized to
accept one argument (either "verbose" or "precise").
Wuninitialized=verbose will generate all uninitialized warnings ("is
used" and "may be used") before optimisation. This means that the
results do not depend on the optimisation flags used (-O0, -O1,
-ftree-vrp, etc). On the other hand, Wuninitialized=precise is
equivalent to the current Wuninitialized.

The implementation was originally proposed by Jeffrey A. Law and the
current patch is based on the one attached to PR24639
[http://gcc.gnu.org/bugzilla/attachment.cgi?id=10181&action=view].

However, after reviewing the open PRs related to uninitialized
warnings, my conclusion is that we might wish to have 3 levels:

Wuninitialized=verbose : As explained and implemented above, look into
statements and phi nodes before any optimization. This results in zero
number of false negatives, maximum number of false positives and
maximum consistency.

Wuninitialized=terse : Almost identical to current Wuninitialized,
that is, look into statements before any optimization and statements
and phi nodes after all optimization passes. However, if unsure, don't
warn. This results in a high number of false negatives but a minimum
(perhaps zero) number of false positives.

Wuninitialized=speculative :  The same as Wuninitialized, but if
unsure, do warn. For example, warn when any pass assumes a value for
an uninitialized variable. This results in low (perhaps zero) number
of false negatives but high number of false positives.

Nonetheless, this is just an alternative proposal that requires more
consideration and could be implemented in the far future as an
evolution of the two new flags -Wuninitialized=verbose and
-Wuninitialized=precise


4. Review of open PRs related to Wuninitialized.

As the final phase of the project within the schedule of the Google's
Summer of Code, we reviewed open (and some closed) PRs about
spurious/missing uninitialized warnings. In total we reviewed 14 PRs:
PR179 (was reopened as a result), PR5035 (was discovered to be fixed
in mainline), PR10138, PR18501, PR19430, PR19808, PR20644, PR20968,
PR22197, PR22456,  PR27120, PR2972, PR32759, PR33086.

Some of them were identified as feature requests:

* Warn for uninitialized array elements (PR27120). This doesn't seem
trivial to implement unless the array is scalarized. Currently, we
don't even warn for references or pointers.

* Warn for uninitialized variables passed by reference as pointers to
constant (PR33086). There is a controversy here. On the one hand,
without the body available, we cannot be sure that the "const" is not
cast away and the variable is initialized. On the other hand, this is
a request for a warning message, not an error, and code like this is
likely to be wrong. In any case, the warning could be interesting for
functions marked as "pure". Yet, technically, without support from the
optimisers or the SSA representation, it seems difficult to implement
this at the present moment, even if we wanted to.

* Warn for uninitialized arrays passed as pointers to constant
(PR10138). This seems like a combination of the two above (PR27120 and
PR33086).

* Warn for uninitialized use in constructor (PR19808) and Warn for
uninitialized member variable usage in constructors (PR2972). These
two cases are very similar and, for simple scenarios, they are likely
not to be very difficult to solve as soon as we are able to warn about
variables that live in memory.

* Don't warn for copying of a structure partly initialized (PR22197).
The reporter argues that a copy is not an use as long as the
uninitialized part is never accessed.

One of the most important issues we have identified is that the
current implementation of Wuninitialized does not handle variables
that live in memory or virtual operands. Therefore, taking the address
of a variable confuses the uninitialized passes (PR179 and PR19430).
In particular, the simple testcase gcc.dg/uninit-B.c is currently
XFAILed in trunk because of this reason. A preliminary patch has been
developed to address this issue (fix-uninit-B.diff). It fixes
gcc.dg/uninit-B.c and other simple testcases. However, it results in
new false positives. In particular, the current patch seems unable to
distinguish between the two following examples when i.0D.1284_2 =
iD.1283; is analyzed by warn_uninit.

f ()
{
 intD.0 iD.1283;
 intD.0 i.0D.1284;

 # BLOCK 0, starting at line 8
 # PRED: ENTRY (fallthru)
 [doublereference-2.c : 8] g (&iD.1283);
 #   VUSE <iD.1283_1>;
 [doublereference-2.c : 9] i.0D.1284_2 = iD.1283;
 [doublereference-2.c : 9] h (i.0D.1284_2);
 [doublereference-2.c : 10] return;
 # SUCC: EXIT

}

f ()
{
 intD.0 iD.1283;
 intD.0 i.0D.1284;

 # BLOCK 0, starting at line 7
 # PRED: ENTRY (fallthru)
 #   VUSE <iD.1283_1>;
 [doublereference-3.c : 7] i.0D.1284_2 = iD.1283;
 [doublereference-3.c : 7] h (i.0D.1284_2);
 [doublereference-3.c : 8] g (&iD.1283);
 [doublereference-3.c : 9] return;
 # SUCC: EXIT
}


Another source of problems identified during the review is that some
optimisation passes assume particular values for uninitialized
variables (PR1850, PR22456). Undoubtedly, this helps optimisation.
However, it also tends to hide real bugs. CCP merging uninitialized
nodes with constant values or DCE removing PHI operators with empty
definitions are examples of this. At the point of the merge, not
warning results in false negatives while warning results in false
positives. CCP by itself cannot distinguish the two following cases:

/* PR1850 */
unsigned bmp_iter_set ();
int something (void);

void bitmap_print_value_set (void)
{
  unsigned first;

  for (; bmp_iter_set (); )
    {
      if (!first)
        something ();
      first = 0;
    }
}


/* Testcase from J. Law. */
sub()
{
  int i = 0;
  int j = 0;
  int k;

  while ((i | j) == 0)
    {
      k = 10;
      i = sub ();
    }

  return k;
}

During the review, we also identified a potential problem that
surprisingly does not seem to arise so often in practice (perhaps
because users do not care so much about the distinction betwen "is
used" and "may be used"). We generate "is used uninitialized" warnings
for the use of a SSA name with an empty definition statement, even
when the basic block (BB) that contains the use of the SSA name is
executed conditionally. This is the cause of PR20644 and similar
spurious "is used" warnings. A preliminary fix could be to only
consider BBs that have an incoming fallthru edge:

--- gcc/tree-ssa.c      (revision 126606)
+++ gcc/tree-ssa.c      (working copy)
@@ -1302,8 +1334,11 @@
     }
 }

static unsigned int
execute_early_warn_uninitialized (void)
 {
   block_stmt_iterator bsi;
   basic_block bb;

   FOR_EACH_BB (bb)
-    for (bsi = bsi_start (bb); !bsi_end_p (bsi); bsi_next (&bsi))
-      {
-       tree context = bsi_stmt (bsi);
-       walk_tree (bsi_stmt_ptr (bsi), warn_uninitialized_var,
-                  context, NULL);
-      }
+    {
+      edge e;
+      edge_iterator ei;
+      FOR_EACH_EDGE (e, ei, bb->preds)
+        if (e->flags & EDGE_FALLTHRU)
+          {
+            for (bsi = bsi_start (bb); !bsi_end_p (bsi); bsi_next (&bsi))
+              {
+                tree context = bsi_stmt (bsi);
+                walk_tree (bsi_stmt_ptr (bsi), warn_uninitialized_var,
+                           context, NULL);
+              }
+            break;
+          }
+    }


This fixes PR20644, however it may hide previously detected uninitialized cases.

Finally, the review of the rest of open PRs leads us to conclude that
some mechanism that takes into account the conditions to reach a
statement would noticeably improve the current uninitialized warnings.
This, perhaps, could be implemented by propagating the predicates
themselves to the edges of the CFG or by using a representation like
Gated SSA.


5. Conclusion.

I must admit that I am personally not very satisfied with the outcome
of the project. The amount of code produced was scarce and of a very
experimental nature. In particular, the code for enabling SSA at O0
still requires a lot of fine-tuning and testing.

Unfortunately, I cannot provide a clear plan to fix the current
shortcomings in Wuninitialized and even my original proposal of
dividing it into two separate flags will require more careful
consideration.

On the other hand, from the work so far, many real problems have been
identified and it is clear that the current implementation is very
fragile. Most of the time, the right answer is obtained because the
optimisers inadvertently remove the potential false positive or expose
a previously missed uninitialized use. That is, the right answer is
obtained by sheer luck in many cases. However, since precision depends
on the thoroughness of the analysis of the code, and that, in turn,
depends on optimisation, this problem cannot be completely solved.

Nonetheless, the current situation could improve greatly by, firstly,
making the uninitialized passes smarter (for example, with the patch
above to fix PR20644, by handling memory variables and virtual SSA and
by using some sort of predicate analysis). And secondly, by teaching
optimisation passes to report when they touch an uninitialized
variable and letting the uninitialized passes to decide what to do
with that information.


6. Addendum.

Although Google's Summer of Code is over, I plan to keep working on
this project in my spare time. The evolution of the project is being
documented in the wiki:
http://gcc.gnu.org/wiki/Better_Uninitialized_Warnings
It contains more details about everything that has been discussed in
this report. Any comments and suggestions are welcome there or by
writing to me.

I would like to thank, Diego Novillo, my mentor for this project, for
his infinite patience and for not giving up on me when I have almost
done it myself.


Manuel López-Ibáñez.
Index: gcc/tree-ssa.c
===================================================================
--- gcc/tree-ssa.c	(revision 126606)
+++ gcc/tree-ssa.c	(working copy)
@@ -1205,7 +1205,7 @@
 /* Emit a warning for T, an SSA_NAME, being uninitialized.  The exact
    warning text is in MSGID and LOCUS may contain a location or be null.  */
 
-static void
+void
 warn_uninit (tree t, const char *gmsgid, void *data)
 {
   tree var = SSA_NAME_VAR (t);
@@ -1217,8 +1217,40 @@
   /* Default uses (indicated by an empty definition statement),
      are uninitialized.  */
   if (!IS_EMPTY_STMT (def))
+    {
+        /* Otherwise, it may be an initial assignment that has a VUSE
+           to a variable's default definition.  */
+        if (TREE_CODE (def) == GIMPLE_MODIFY_STMT
+            && has_stmt_ann (def)
+            && ssa_operands_active () 
+            && stmt_references_memory_p (def))
+        {
+          struct voptype_d *vuses = VUSE_OPS (def);
+          while (vuses)
+            {
+              int i;
+              int n = VUSE_NUM (vuses);
+              for (i = 0; i < n; i++)
+                {
+                  if (TREE_CODE (VUSE_OP (vuses, i)) == SSA_NAME) 
+                    warn_uninit (VUSE_OP (vuses, i), 
+                             "%H%qD is used uninitialized in this function", 
+                                 def);
+                }
+              vuses = vuses->next;
+            }
+        }
+      return;
+    }
+
+  /* We don't warn for global variables or memory tags. */
+  if (MTAG_P (var) || !is_gimple_variable (var) 
+      || !is_gimple_reg_type (TREE_TYPE (var))
+      || TREE_THIS_VOLATILE (var)
+      || is_global_var (var))
     return;
 
+
   /* Except for PARMs of course, which are always initialized.  */
   if (TREE_CODE (var) == PARM_DECL)
     return;
Index: gcc/opts.c
===================================================================
--- gcc/opts.c	(revision 126606)
+++ gcc/opts.c	(working copy)
@@ -847,13 +847,6 @@
 	 so force it not to be done.  */
       flag_no_inline = 1;
       warn_inline = 0;
-
-      /* The c_decode_option function and decode_option hook set
-	 this to `2' if -Wall is used, so we can avoid giving out
-	 lots of errors for people who don't realize what -Wall does.  */
-      if (warn_uninitialized == 1)
-	warning (OPT_Wuninitialized,
-		 "-Wuninitialized is not supported without -O");
     }
 
   if (flag_really_no_inline == 2)
@@ -1705,14 +1711,6 @@
 {
   extra_warnings = setting;
   warn_unused_parameter = (setting && maybe_warn_unused_parameter);
-
-  /* We save the value of warn_uninitialized, since if they put
-     -Wuninitialized on the command line, we need to generate a
-     warning about not using it without also specifying -O.  */
-  if (setting == 0)
-    warn_uninitialized = 0;
-  else if (warn_uninitialized != 1)
-    warn_uninitialized = 2;
 }
 
 /* Initialize unused warning flags.  */
Index: gcc/c-opts.c
===================================================================
--- gcc/c-opts.c	(revision 126606)
+++ gcc/c-opts.c	(working copy)
@@ -412,11 +412,7 @@
 	 headers.  */
       warn_unknown_pragmas = value;
 
-      /* We save the value of warn_uninitialized, since if they put
-	 -Wuninitialized on the command line, we need to generate a
-	 warning about not using it without also specifying -O.  */
-      if (warn_uninitialized != 1)
-	warn_uninitialized = (value ? 2 : 0);
+      warn_uninitialized = value;
 
       if (!c_dialect_cxx ())
 	/* We set this to 2 here, but 1 in -Wmain, so -ffreestanding
Index: gcc/tree-optimize.c
===================================================================
--- gcc/tree-optimize.c	(revision 126606)
+++ gcc/tree-optimize.c	(working copy)
@@ -219,9 +219,9 @@
 
 struct tree_opt_pass pass_free_datastructures =
 {
-  NULL,					/* name */
+  "free_datastructs",			/* name */
   NULL,					/* gate */
-  execute_free_datastructures,			/* execute */
+  execute_free_datastructures,		/* execute */
   NULL,					/* sub */
   NULL,					/* next */
   0,					/* static_pass_number */
@@ -328,7 +328,7 @@
 static bool
 gate_init_datastructures (void)
 {
-  return (optimize >= 1);
+  return (optimize >= 0);
 }
 
 struct tree_opt_pass pass_init_datastructures =
Index: gcc/tree-outof-ssa.c
===================================================================
--- gcc/tree-outof-ssa.c	(revision 126606)
+++ gcc/tree-outof-ssa.c	(working copy)
@@ -1301,13 +1301,19 @@
   return 0;
 }
 
+static bool
+gate_out_of_ssa (void)
+{
+  return (cfun->gimple_df->in_ssa_p);
+}
 
+
 /* Define the parameters of the out of SSA pass.  */
 
 struct tree_opt_pass pass_del_ssa = 
 {
   "optimized",				/* name */
-  NULL,					/* gate */
+  gate_out_of_ssa,			/* gate */
   rewrite_out_of_ssa,			/* execute */
   NULL,					/* sub */
   NULL,					/* next */
Index: gcc/passes.c
===================================================================
--- gcc/passes.c	(revision 126606)
+++ gcc/passes.c	(working copy)
@@ -510,13 +510,13 @@
       NEXT_PASS (pass_cleanup_cfg);
       NEXT_PASS (pass_init_datastructures);
       NEXT_PASS (pass_expand_omp);
+      NEXT_PASS (pass_referenced_vars);
+      NEXT_PASS (pass_reset_cc_flags);
+      NEXT_PASS (pass_build_ssa);
+      NEXT_PASS (pass_early_warn_uninitialized);
       NEXT_PASS (pass_all_early_optimizations);
 	{
 	  struct tree_opt_pass **p = &pass_all_early_optimizations.sub;
-	  NEXT_PASS (pass_referenced_vars);
-	  NEXT_PASS (pass_reset_cc_flags);
-	  NEXT_PASS (pass_build_ssa);
-	  NEXT_PASS (pass_early_warn_uninitialized);
 	  NEXT_PASS (pass_rebuild_cgraph_edges);
 	  NEXT_PASS (pass_early_inline);
 	  NEXT_PASS (pass_cleanup_cfg);
@@ -548,11 +548,11 @@
      output to the assembler file.  */
   p = &all_passes;
   NEXT_PASS (pass_apply_inline);
+  NEXT_PASS (pass_create_structure_vars);
+  NEXT_PASS (pass_may_alias);
   NEXT_PASS (pass_all_optimizations);
     {
       struct tree_opt_pass **p = &pass_all_optimizations.sub;
-      NEXT_PASS (pass_create_structure_vars);
-      NEXT_PASS (pass_may_alias);
       NEXT_PASS (pass_return_slot);
       NEXT_PASS (pass_rename_ssa_copies);
 
@@ -682,6 +682,7 @@
       NEXT_PASS (pass_mark_used_blocks);
       NEXT_PASS (pass_cleanup_cfg_post_optimizing);
     }
+  NEXT_PASS (pass_del_ssa);
   NEXT_PASS (pass_warn_function_noreturn);
   NEXT_PASS (pass_free_datastructures);
   NEXT_PASS (pass_mudflap_2);
@@ -1070,6 +1071,8 @@
   unsigned int todo_after = 0;
 
   current_pass = pass;
+  /*fprintf (stderr, "%s\n", (pass->name) ? pass->name : "NULL"); */
+
   /* See if we're supposed to run this pass.  */
   if (pass->gate && !pass->gate ())
     return false;
Index: gcc/flags.h
===================================================================
--- gcc/flags.h	(revision 126606)
+++ gcc/flags.h	(working copy)
@@ -354,6 +354,15 @@
   WARN_STRICT_OVERFLOW_MAGNITUDE = 5
 };

+/* Names for the different levels of -Wuninitialized. */
+
+enum warn_uninitialized_code
+{
+  WARN_UNINITIALIZED_PRECISE = 1,
+
+  WARN_UNINITIALIZED_VERBOSE = 2
+};
+
 /* Whether to emit an overflow warning whose code is C.  */
 #define issue_strict_overflow_warning(c) (warn_strict_overflow >= (int) (c))

Index: gcc/opts.c
===================================================================
--- gcc/opts.c	(revision 126606)
+++ gcc/opts.c	(working copy)
@@ -1379,2 +1372,2 @@
       set_Wunused (value);
       break;

+    case OPT_Wuninitialized:
+      warn_uninitialized = value ? WARN_UNINITIALIZED_PRECISE : 0;
+      break;
+
+    case OPT_Wuninitialized_:
+      if (!strcmp (arg, "precise"))
+        warn_uninitialized = WARN_UNINITIALIZED_PRECISE;
+      else if (!strcmp (arg, "verbose"))
+        warn_uninitialized = WARN_UNINITIALIZED_VERBOSE;
+      else
+	return 0;
+      break;
+
     case OPT_aux_info:
     case OPT_aux_info_:
       aux_info_file_name = arg;
Index: gcc/common.opt
===================================================================
--- gcc/common.opt	(revision 126606)
+++ gcc/common.opt	(working copy)
@@ -186,2 +186,2 @@
 Common Var(warn_uninitialized) Warning
 Warn about uninitialized automatic variables

+Wuninitialized=
+Common Joined RejectNegative Warning
+-Wuninitialized=[precise|verbose] Warn about uninitialized automatic variables
+
 Wunreachable-code
 Common Var(warn_notreached) Warning
 Warn about code that will never be executed
Index: gcc/tree-ssa.c
===================================================================
--- gcc/tree-ssa.c	(revision 126606)
+++ gcc/tree-ssa.c	(working copy)
@@ -1205,7 +1205,7 @@
 /* Emit a warning for T, an SSA_NAME, being uninitialized.  The exact
    warning text is in MSGID and LOCUS may contain a location or be null.  */
 
-static void
+void
 warn_uninit (tree t, const char *gmsgid, void *data)
 {
   tree var = SSA_NAME_VAR (t);
@@ -1302,8 +1311,11 @@
     }
 }
 
-static unsigned int
-execute_early_warn_uninitialized (void)
+/* Search through all the statements and warn if any use of an
+   uninitialized variable is found.  */
+
+static void
+warn_uninitialized_stmts (void)
 {
   block_stmt_iterator bsi;
   basic_block bb;
@@ -1315,23 +1327,72 @@
 	walk_tree (bsi_stmt_ptr (bsi), warn_uninitialized_var,
 		   context, NULL);
       }
+}
+
+/* Search through all the PHIs any which may reference an 
+   uninitialized variable.  Warn if any such PHI is found.  */
+static void
+warn_uninitialized_phis (void)
+{
+  basic_block bb;
+  tree phi;
+
+  FOR_EACH_BB (bb)
+    for (phi = phi_nodes (bb); phi; phi = PHI_CHAIN (phi))
+      warn_uninitialized_phi (phi);
+}
+
+/* This pass emits warnings for uninitialized variables before any
+   optimization occurs.
+
+   We always identify and warn for variables which can be trivially
+   determined are uninitialized at one or more uses.
+
+   We conditionally warn for "may be" uninitialized variables at this
+   time as well (dependent on user flags). 
+
+   By emitting "may be" warnings before optimization we can get warnings
+   for uninitialized variables in which uses are optimized away or 
+   unreachable.  However, we get more false positive "may be" uninitialized
+   warnings and some "may be" uninitialized warnings which really should
+   be "is" uninitialized warnings.  */
+
+static unsigned int
+execute_early_warn_uninitialized (void)
+{
+  warn_uninitialized_stmts ();
+
+  /* If explicitly requested, look for "may be" uninitialized variables
+     before optimizations as well.  */
+  if (warn_uninitialized == WARN_UNINITIALIZED_VERBOSE)
+    warn_uninitialized_phis ();
+
   return 0;
 }
 
+/* This pass emits warnings for uninitialized variables after 
+   optimization has been performed.
+
+   By delaying warnings until after optimization we can turn more
+   "may be" uninitialized warnings into "is" uninitialized warnings.
+   We can also eliminate some false positive "may be" uninitialized
+   warnings. 
+
+   This pass also emits warnings for any variables exposed by the
+   optimizers.  For example, if optimization removed the need to
+   take the address of a variable, then this pass will be the only
+   chance we have to get uninitialized warnings for such variables.  */
+
 static unsigned int
 execute_late_warn_uninitialized (void)
 {
-  basic_block bb;
-  tree phi;
-
   /* Re-do the plain uninitialized variable check, as optimization may have
      straightened control flow.  Do this first so that we don't accidentally
      get a "may be" warning when we'd have seen an "is" warning later.  */
-  execute_early_warn_uninitialized ();
+  warn_uninitialized_stmts ();
 
-  FOR_EACH_BB (bb)
-    for (phi = phi_nodes (bb); phi; phi = PHI_CHAIN (phi))
-      warn_uninitialized_phi (phi);
+  warn_uninitialized_phis ();
+
   return 0;
 }
 

Reply via email to