On Mon, Feb 05, 2024 at 12:18:19PM -0500, Stefan Hajnoczi wrote:
> C99 mixed declarations support interleaving of local variable
> declarations and code.
> 
> The coding style "generally" forbids C99 mixed declarations with some
> exceptions to the rule. This rule is not checked by checkpatch.pl and
> naturally there are violations in the source tree.
> 
> While contemplating adding another exception, I came to the conclusion
> that the best location for declarations depends on context. Let the
> programmer declare variables where it is best for legibility. Don't try
> to define all possible scenarios/exceptions.

IIRC, we had a discussion on this topic sometime last year, but can't
remember what the $SUBJECT was, so I'll just repeat what I said then.

Combining C99 mixed declarations with 'goto' is dangerous.

Consider this program:

$ cat jump.c
#include <stdlib.h>

int main(int argc, char**argv) {

  if (getenv("SKIP"))
    goto target;

  char *foo = malloc(30);

 target:
  free(foo);
}

$ gcc -Wall -Wuninitialized -o jump jump.c

$ SKIP=1 ./jump 
free(): invalid pointer
Aborted (core dumped)


 -> The programmer thinks they have initialized 'foo'
 -> GCC thinks the programmer has initialized 'foo'
 -> Yet 'foo' is not guaranteed to be initialized at 'target:'

Given that QEMU makes heavy use of 'goto', allowing C99 mixed
declarations exposes us to significant danger.

Full disclosure, GCC fails to diagnmose this mistake, even
with a decl at start of 'main', but at least the mistake is
now more visible to the programmer.

Fortunately with -fanalyzer GCC can diagnose this:

$ gcc -fanalyzer -Wall -o jump jump.c
jump.c: In function ‘main’:
jump.c:12:3: warning: use of uninitialized value ‘foo’ [CWE-457] 
[-Wanalyzer-use-of-uninitialized-value]
   12 |   free(foo);
      |   ^~~~~~~~~
  ‘main’: events 1-5
    |
    |    6 |   if (getenv("SKIP"))
    |      |      ~  
    |      |      |
    |      |      (3) following ‘true’ branch...
    |    7 |     goto target;
    |      |     ~~~~
    |      |     |
    |      |     (4) ...to here
    |    8 | 
    |    9 |  char *foo = malloc(30);
    |      |        ^~~
    |      |        |
    |      |        (1) region created on stack here
    |      |        (2) capacity: 8 bytes
    |......
    |   12 |   free(foo);
    |      |   ~~~~~~~~~
    |      |   |
    |      |   (5) use of uninitialized value ‘foo’ here


...but -fanalyzer isn't something we have enabled by default, it
is opt-in. I'm also not sure how comprehensive the flow control
analysis of -fanalyzer is ?  Can we be sure it'll catch these
mistakes in large complex functions with many code paths ?

Even if the compiler does reliably warn, I think the code pattern
remains misleading to contributors, as the flow control flaw is
very non-obvious.

Rather than accept the status quo and remove the coding guideline,
I think we should strengthen the guidelines, such that it is
explicitly forbidden in any method that uses 'goto'. Personally
I'd go all the way to -Werror=declaration-after-statement, as
while C99 mixed decl is appealing, it isn't exactly a game
changer in improving code maintainability.

> Signed-off-by: Stefan Hajnoczi <stefa...@redhat.com>
> ---
>  docs/devel/style.rst | 20 --------------------
>  1 file changed, 20 deletions(-)
> 
> diff --git a/docs/devel/style.rst b/docs/devel/style.rst
> index 2f68b50079..80c4e4df52 100644
> --- a/docs/devel/style.rst
> +++ b/docs/devel/style.rst
> @@ -199,26 +199,6 @@ Rationale: a consistent (except for functions...) 
> bracing style reduces
>  ambiguity and avoids needless churn when lines are added or removed.
>  Furthermore, it is the QEMU coding style.
>  
> -Declarations
> -============
> -
> -Mixed declarations (interleaving statements and declarations within
> -blocks) are generally not allowed; declarations should be at the beginning
> -of blocks. To avoid accidental re-use it is permissible to declare
> -loop variables inside for loops:
> -
> -.. code-block:: c
> -
> -    for (int i = 0; i < ARRAY_SIZE(thing); i++) {
> -        /* do something loopy */
> -    }
> -
> -Every now and then, an exception is made for declarations inside a
> -#ifdef or #ifndef block: if the code looks nicer, such declarations can
> -be placed at the top of the block even if there are statements above.
> -On the other hand, however, it's often best to move that #ifdef/#ifndef
> -block to a separate function altogether.
> -
>  Conditional statements
>  ======================
>  
> -- 
> 2.43.0
> 

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


Reply via email to