On Friday, December 04, 2015, Laszlo Ersek wrote:
> Now, after reading the -fno-common documentation in the gcc manual, I'm
> actually *convinced* this is a gcc bug. The manual says on my RHEL-7.2
> system (excerpt):
> 
>        -fno-common
>            In C code, controls the placement of uninitialized global
>            variables.  Unix C compilers have traditionally permitted
>            multiple definitions of such variables in different
>            compilation units by placing the variables in a common
>            block.  This is the behavior specified by -fcommon, and is
>            the default for GCC on most targets.  On the other hand,
>            this behavior is not required by ISO C, [...]
> 

I think this part of the GCC documentation was written a bit sloppily and used
"definition" instead of "declaration".  Since it is talking about
"uninitialized" global variables, then it is a "tentative definition" and
"the behavior is exactly as if the translation unit contains a
file scope |declaration| of that identifier...".

> Let us stop right there -- this  behavior is *forbidden* by ISO C,
> which can be proved if someone takes the time to decipher the nearly
> impenetrable passages on declarations and definitions.

I disagree, unless you are talking about the behavior of the -fno-common flag
or where the GCC doc. says "this behavior is not required by ISO C".
I think use of the -fno-common flag produces non-conforming behavior.

> This means that the above program (of two files) is equivalent to:
> 
>     f1-b.c:
>     ----
>     int x = 0;
>     ----
> 
>     f2-b.c:
>     ----
>     int x = 0;
>     int main(void) { return x; }
>     ----
> 
> Therefore *each* file is ultimately required to behave "as if" it
> contains one external definition for "x".

I strongly disagree.  I believe the misunderstanding occurs from the very sloppy
use of the term "translation unit" in the C specs; from ANSII (C89) through
ISO/IEC 9899:TC3.  The definition of a "translation unit" (TU) is fairly well
defined as a single preprocessed C file, but the specs are inconsistent in
describing whether something applies to a SINGLE TU or a SET of TUs.

Also poorly described is the difference between "translation" and "linking".
While "linking" is beyond the scope of the spec., and since the process of
"linking", and the use of "libraries", is mentioned many times, the spec could
have been clearer about how this interacts with "translation units".

>From the spec as a whole, one can infer that a library is treated as a TU and
when linked with other TUs to produce an executable program the entire set is
treated as a single unit for linkage purposes.

The "implicit initialization" described in item 2 of section
"External object definitions" seems to contradict other parts of the C spec.
if one maintains that the TUs must be thought of as individual files in 
isolation.

Here are the specification sections related to this.

5.1.1.1 Program structure
  1 A C program need not all be translated at the same time. The text of the
    program is kept in units called source files, (or preprocessing files) in
    this International Standard. A source file together with all the headers
    and source files included via the preprocessing directive #include is known
    as a preprocessing translation unit. After preprocessing, a preprocessing
    translation unit is called a translation unit. Previously translated
    translation units may be preserved individually or in libraries. The
    separate translation units of a program communicate by (for example) calls
    to functions whose identifiers have external linkage, manipulation of
    objects whose identifiers have external linkage, or manipulation of data
    files. Translation units may be separately translated and then later linked
    to produce an executable program.

6.2.2 Linkages of identifiers
  1  An identifier declared in different scopes or in the same scope more than
     once can be made to refer to the same object or function by a process
     called linkage. [...]
     
  2 In the set of translation units and libraries that constitutes an entire
    program, each declaration of a particular identifier with external linkage
    denotes the same object or function. Within one translation unit, each
    declaration of an identifier with internal linkage denotes the same object
    or function. [...]

6.2.4 Storage durations of objects
  3 An object whose identifier is declared with external or internal linkage,
    or with the storage-class specifier static has static storage duration.
    Its lifetime is the entire execution of the program and its stored value
    is initialized only once, prior to program startup.

6.9 External definitions
  Syntax
    1 translation-unit:
        external-declaration
        translation-unit external-declaration
      external-declaration:
        function-definition
        declaration
        
  Semantics
    4 As discussed in 5.1.1.1, the unit of program text after preprocessing is
      a translation unit, which consists of a sequence of external
      declarations. These are described as ''external'' because they appear
      outside any function (and hence have file scope). As discussed in 6.7,
      a declaration that also causes storage to be reserved for an object or a
      function named by the identifier is a definition.

    5 An external definition is an external declaration that is also a
      definition of a function (other than an inline definition) or an object.
      If an identifier declared with external linkage is used in an expression
      (other than as part of the operand of a sizeof operator whose result is
      an integer constant), somewhere in the entire program there shall be
      exactly one external definition for the identifier; otherwise, there
      shall be no more than one.(140)

      (140) Thus, if an identifier declared with external linkage is not used
            in an expression, there need be no external definition for it.

6.9.2 External object definitions
  Semantics
    2 A declaration of an identifier for an object that has file scope without
      an initializer, and without a storage-class specifier or with the
      storage-class specifier static, constitutes a tentative definition. If a
      translation unit contains one or more tentative definitions for an
      identifier, and the translation unit contains no external definition for
      that identifier, then the behavior is exactly as if the translation unit
      contains a file scope declaration of that identifier, with the composite
      type as of the end of the translation unit, with an initializer
      equal to 0.

By my analysis, key items are:
  1 Within the SET of TUs, each identifier with external linkage is the same
    object.
  2 If an identifier with external linkage is not initialized, then it will be
    implicitly initialized to 0.
  3 Within the entire program, there may be 0 or 1 initializer for an identifier
    with external linkage.
  4 The entire SET of TUs comprising a program must be examined in order to
    resolve identifiers with external linkage.
  5 The use, and definition, of the term "translation unit" within the
    C specifications is inconsistent and misleading in the presence of
    "libraries" and multiple source files making up a single executable
    program.

> This patch will enforce the right behavior.
> 
> Reviewed-by: Laszlo Ersek <[email protected]>
> 
> Thanks!
> Laszlo

I don't agree, use of -fno-common is a bad idea and does not really fix the
problem. The code needs to be re-examined and fixed.

> On 12/03/15 13:54, Ard Biesheuvel wrote:
>> The default behavior of the GCC compiler is to emit uninitialized
>> globals into a COMMON section, where duplicate definitions are merged.
>> This may result in unexpected behavior, since global variables appearing
>> by the same name in different C files typically do not refer to the same
>> conceptual data item.

If the files are being linked together, then this is exactly the correct
behavior.  If the variables do not refer to the "same conceptual data item"
then they either have to be renamed or made static.

This is a common problem when combining code that comes from multiple
unrelated sources.  Checking for this type of name collision should be part
of the porting process.

Sincerely,
Daryl McDaniel


_______________________________________________
edk2-devel mailing list
[email protected]
https://lists.01.org/mailman/listinfo/edk2-devel

Reply via email to