Hey all--

Niels, anthraxx and i were just commiserating about the fact that we're
punting on reproducibility of the build path.  We think we might have
found a way to make progress on this.

Problem Statement
-----------------

One of the main concerns about the build path is that gcc includes it
in any generated DWARF [0] debugging symbols, specifically in the
DWARF attribute named DW_AT_comp_dir.

Background
----------

gcc already allows the user to tweak this attribute directly:

      -fdebug-prefix-map=old=new
           When compiling files in directory old, record debugging information
           describing them as in new instead.

So, for example, i can do:

   gcc -fdebug-prefix-map=$(pwd)=. -o test test.c

gdb still works for me when debugging code that is built this way.

However, gcc also stores all the switches used during the build in
DW_AT_producer, so doing this just moves the build path to a different
DWARF attribute: it is still encoded in the output.  This doesn't solve
the reproducibility problem, but it does give us a way to demonstrate
that removing the data from DW_AT_comp_dir doesn't cripple our ability
to debug.

We also observed that DW_AT_name already stores the name of the compiled
file relative to the DW_AT_comp_dir -- this poses no reproducibility
problems on its own.

Proposed Solution
-----------------

A minor change to gcc:

 * if the "old" parameter for -fdebug-prefix-map starts with a literal $
   character, make gcc treat the rest of it as an environment variable
   name.  So (note the shell escaping):

    export SOURCE_BUILD_DIR=$(pwd)
    gcc -g -fdebug-prefix-map=\$SOURCE_BUILD_DIR=. -o test.o -c test.c

   should do what we need: the gcc flags are static, and the build path
   is stripped.

   - What to do if the chosen env var isn't set?  Probably just skip the
     match entirely, and maybe raise a warning.

   - What about bizarre theoretical filesystems that might have $ as a
     leading character?  We don't know of any.  We're willing to
     sacrifice them for this feature. :)

I've patched GCC to work this way successfully:

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 5256031..234432f 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -6440,7 +6440,9 @@ link processing time.  Merging is enabled by default.
 @item -fdebug-prefix-map=@var{old}=@var{new}
 @opindex fdebug-prefix-map
 When compiling files in directory @file{@var{old}}, record debugging
-information describing them as in @file{@var{new}} instead.
+information describing them as in @file{@var{new}} instead.  If
+@file{@var{old}} starts with a @samp{$}, the corresponding environment
+variable will be dereferenced, and its value will be used instead.
 
 @item -fno-dwarf2-cfi-asm
 @opindex fdwarf2-cfi-asm
diff --git a/gcc/final.c b/gcc/final.c
index 8cb5533..bc43b61 100644
--- a/gcc/final.c
+++ b/gcc/final.c
@@ -1525,6 +1525,9 @@ add_debug_prefix_map (const char *arg)
 {
   debug_prefix_map *map;
   const char *p;
+  char *env;
+  const char *old;
+  size_t oldlen;
 
   p = strchr (arg, '=');
   if (!p)
@@ -1532,9 +1535,29 @@ add_debug_prefix_map (const char *arg)
       error ("invalid argument %qs to -fdebug-prefix-map", arg);
       return;
     }
+  if (*arg == '$')
+    {
+      env = xstrndup (arg+1, p - (arg+1));
+      old = getenv(env);
+      if (!old)
+	{
+	  warning (0, "environment variable %qs not set in argument to "
+		   "-fdebug-prefix-map", env);
+	  free(env);
+	  return;
+	}
+      oldlen = strlen(old);
+      free(env);
+    }
+  else
+    {
+      old = xstrndup (arg, p - arg);
+      oldlen = p - arg;
+    }
+
   map = XNEW (debug_prefix_map);
-  map->old_prefix = xstrndup (arg, p - arg);
-  map->old_len = p - arg;
+  map->old_prefix = old;
+  map->old_len = oldlen;
   p++;
   map->new_prefix = xstrdup (p);
   map->new_len = strlen (p);

What do r-b people think about this?  I'm happy to try to push this
patch to the gcc upstream if folks here think this sounds reasonable and
would address a real future r-b issue.


Alternate Solutions
-------------------

We considered and discarded several other possible solutions, which i'm
noting below, along with the downsides that led us to select the one we
chose:


 * ask gcc to not record -fdebug-prefix-map options in DW_AT_producer

    - it's weird that some options wouldn't be recorded and some
      would.
      
    - build systems would need to set dynamic CFLAGS to be able to
      use this approach.  Debian can do this in dpkg-buildpackage, but
      apparently it's tougher on Arch (though Arch can more easily
      set dynamic environment variables).

We also had three different ideas for new gcc flags, all of which share
the problem that adding a new gcc option would mean that attempts to
apply this prefix map would fail hard when using a non-updated gcc:

 * gcc -fdebug-prefix-map-from-env=NEW

  This evaluates a specific, fixed environment variable like
  SOURCE_BUILD_ROOT as the "old" part of the prefix map.

    - asking gcc to adopt a new magic environment variable seems
      sketchy.

 * gcc -fdebug-prefix-map-from-env=ENVVAR=NEW

   This does the same thing as the main proposal, but it uses a
   distinct flag and doesn't expect the leading $ prefix.  e.g.

       gcc -fdebug-prefix-map-from-env=SOURCE_BUILD_ROOT=.

 * gcc -fdebug-force-path-to=NEW

   This approach just forces the value of all generated DW_AT_comp_dir
   attributes, which might be overkill.

    - this fails to record the paths relative to the build directory in
      the event that a recursive descent build pattern (a tree of
      Makefiles) is used.  That is, if the top level Makefile does both
      "make -C src1" and "make -C src2", then debug info from files
      named foo.c in each directory will be indistinguishable, even
      within the same project.

feedback welcome,

     --dkg


[0] http://dwarfstd.org/doc/DWARF4.pdf
