Package: tar
Version: 1.27.1-2
Severity: wishlist
Tags: patch
User: reproducible-builds@lists.alioth.debian.org
Usertags: toolchain timestamps

Hi!

As part of the “reproducible builds” effort [1], we are always looking
for better ways to make it easier to create deterministic build
systems.

One issue we face regularly (and `dpkg` is actually affected) is
that timestamps of files created during the build get embedded in
tarballs. This makes the tarballs impossible to reproduce at another time.

Our generic solution [2] is to pick a reference time (e.g. the time of the
latest entry in `debian/changelog`) and use it, instead of the actual
modification date, for every file modified after that time.

We currently implement this with a long find+xargs+touch invocation before
calling tar. On top of the extra complexity, this has the downside of
modifying the filesystem when we actually only care about the archive
content.
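
To give an idea, here is a minimal sketch of that workaround. It assumes
GNU findutils and coreutils; the function name `clamp_tree_mtime` and its
arguments are made up for this example and are not our actual tooling:

```shell
#!/bin/sh
# Hypothetical helper: clamp the mtime of everything under DIR to REF.
# REF is a date string understood by GNU date, e.g. "@1435570949" or
# "2015-06-29 09:42:29 UTC".
# Assumes GNU find (-newermt, -print0), xargs (-0, -r) and touch.
clamp_tree_mtime() {
    dir=$1
    ref=$2
    # Select only entries modified after the reference time and reset
    # their mtime to it. Older entries keep their original mtime.
    find "$dir" -newermt "$ref" -print0 |
        xargs -0r touch --no-dereference --date="$ref"
}

# Example (hypothetical paths): clamp_tree_mtime debian/tmp "@1435570949"
```

Note that this rewrites the timestamps on disk, which is exactly the side
effect we would like to avoid.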

The attached patch adds a `--clamp-mtime` option to tar. When specified
together with `--mtime`, only files newer than the given date get that
date in the archive, instead of all mtimes being set to the same value.
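
With a tar built with the attached patch, the call could look like the
sketch below. The helper name `create_clamped_tarball` and its arguments
are placeholders for this example; only `--mtime` and `--clamp-mtime`
come from tar and the patch:

```shell
#!/bin/sh
# Hypothetical helper: archive DIR into ARCHIVE, clamping mtimes to REF.
# Requires a tar that understands --clamp-mtime, i.e. one built with the
# attached patch. REF is any date accepted by --mtime.
create_clamped_tarball() {
    archive=$1
    dir=$2
    ref=$3
    # Entries newer than REF are recorded with REF as their mtime;
    # older entries keep their own mtime.
    tar --create --file="$archive" --mtime="$ref" --clamp-mtime "$dir"
}

# Example (hypothetical names):
#   create_clamped_tarball data.tar debian/tmp "@1435570949"
```

Only the timestamps recorded in the archive are clamped; the files on
disk are left as they are.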

Sadly, when the patch was submitted upstream [3], Paul Eggert did not see
the benefits of this addition. Help in moving the matter forward, in Debian
or upstream, would be most welcome.

 [1]: https://wiki.debian.org/ReproducibleBuilds
 [2]: https://wiki.debian.org/ReproducibleBuilds/TimestampsInTarball
 [3]: https://lists.gnu.org/archive/html/help-tar/2015-06/msg00000.html

Thanks,
-- 
Lunar                                .''`. 
lu...@debian.org                    : :Ⓐ  :  # apt-get install anarchism
                                    `. `'` 
                                      `-   
From de55fd95593aa880329879b42779cddef48e8ee0 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?J=C3=A9r=C3=A9my=20Bobbio?= <lu...@debian.org>
Date: Mon, 29 Jun 2015 11:42:29 +0200
Subject: [PATCH] add a `--clamp-mtime` option to make reproducible builds
 easier

---
 debian/patches/add-clamp-mtime.diff | 185 ++++++++++++++++++++++++++++++++++++
 debian/patches/series               |   1 +
 2 files changed, 186 insertions(+)
 create mode 100644 debian/patches/add-clamp-mtime.diff

diff --git a/debian/patches/add-clamp-mtime.diff b/debian/patches/add-clamp-mtime.diff
new file mode 100644
index 0000000..c9d6015
--- /dev/null
+++ b/debian/patches/add-clamp-mtime.diff
@@ -0,0 +1,185 @@
+From d9bea5154e28817f7c42e7fb7798df17eca483ff Mon Sep 17 00:00:00 2001
+From: =?UTF-8?q?J=C3=A9r=C3=A9my=20Bobbio?= <lu...@debian.org>
+Date: Thu, 4 Jun 2015 11:05:20 +0000
+Subject: [PATCH] Add --clamp-mtime option
+
+The new `--clamp-mtime` option will change the behavior of `--mtime` to
+only use the time specified if the file mtime is newer than the given time.
+The `--clamp-mtime` option can only be used together with `--mtime`.
+
+Typical use case is to make builds reproducible: to lose less
+information, it's better to keep the original date of an archive, except for
+files modified during the build process. In that case, using a reference (and
+thus reproducible) timestamp for the latter is good enough. See
+<https://wiki.debian.org/ReproducibleBuilds> for more information.
+
+In order to implement the option, we transform `set_mtime_option` from
+a bool to an enum with three values: use original file mtime, force all mtimes
+to be of the same value, and clamp mtimes (as explained above).
+
+To verify that `--clamp-mtime` is used together with `--mtime`, `mtime_option`
+is now initialized to a minimal value as done for `newer_mtime_option`. As
+the same macro can now be used for both options, NEWER_OPTION_INITIALIZED
+has been renamed to TIME_OPTION_INITIALIZED.
+---
+ src/common.h | 17 ++++++++++++-----
+ src/create.c | 15 ++++++++++++++-
+ src/list.c   |  2 +-
+ src/tar.c    | 23 ++++++++++++++++++++---
+ 4 files changed, 47 insertions(+), 10 deletions(-)
+
+diff --git a/src/common.h b/src/common.h
+index 42fd539..962ce1d 100644
+--- a/src/common.h
++++ b/src/common.h
+@@ -211,13 +211,20 @@ GLOBAL bool multi_volume_option;
+    do not get archived (also see after_date_option above).  */
+ GLOBAL struct timespec newer_mtime_option;
+ 
+-/* If true, override actual mtime (see below) */
+-GLOBAL bool set_mtime_option;
+-/* Value to be put in mtime header field instead of the actual mtime */
++enum set_mtime_option_mode
++{
++  USE_FILE_MTIME,
++  FORCE_MTIME,
++  CLAMP_MTIME,
++};
++
++/* Override actual mtime if set to FORCE_MTIME or CLAMP_MTIME */
++GLOBAL enum set_mtime_option_mode set_mtime_option;
++/* Value to use when forcing or clamping the mtime header field. */
+ GLOBAL struct timespec mtime_option;
+ 
+-/* Return true if newer_mtime_option is initialized.  */
+-#define NEWER_OPTION_INITIALIZED(opt) (0 <= (opt).tv_nsec)
++/* Return true if mtime_option or newer_mtime_option is initialized.  */
++#define TIME_OPTION_INITIALIZED(opt) (0 <= (opt).tv_nsec)
+ 
+ /* Return true if the struct stat ST's M time is less than
+    newer_mtime_option.  */
+diff --git a/src/create.c b/src/create.c
+index 4344a24..63585a1 100644
+--- a/src/create.c
++++ b/src/create.c
+@@ -822,7 +822,20 @@ start_header (struct tar_stat_info *st)
+   }
+ 
+   {
+-    struct timespec mtime = set_mtime_option ? mtime_option : st->mtime;
++    struct timespec mtime;
++    switch (set_mtime_option)
++      {
++        case FORCE_MTIME:
++          mtime = mtime_option;
++          break;
++        case CLAMP_MTIME:
++          mtime = timespec_cmp (st->mtime, mtime_option) > 0 ? mtime_option : st->mtime;
++          break;
++        default:
++          mtime = st->mtime;
++          break;
++      }
++
+     if (archive_format == POSIX_FORMAT)
+       {
+ 	if (MAX_OCTAL_VAL (header->header.mtime) < mtime.tv_sec
+diff --git a/src/list.c b/src/list.c
+index 858aa73..ce2d304 100644
+--- a/src/list.c
++++ b/src/list.c
+@@ -166,7 +166,7 @@ read_and (void (*do_something) (void))
+ 	  decode_header (current_header, &current_stat_info,
+ 			 &current_format, 1);
+ 	  if (! name_match (current_stat_info.file_name)
+-	      || (NEWER_OPTION_INITIALIZED (newer_mtime_option)
++	      || (TIME_OPTION_INITIALIZED (newer_mtime_option)
+ 		  /* FIXME: We get mtime now, and again later; this causes
+ 		     duplicate diagnostics if header.mtime is bogus.  */
+ 		  && ((mtime.tv_sec
+diff --git a/src/tar.c b/src/tar.c
+index 4f5017d..cbaa9df 100644
+--- a/src/tar.c
++++ b/src/tar.c
+@@ -267,6 +267,7 @@ enum
+   CHECK_DEVICE_OPTION,
+   CHECKPOINT_OPTION,
+   CHECKPOINT_ACTION_OPTION,
++  CLAMP_MTIME_OPTION,
+   DELAY_DIRECTORY_RESTORE_OPTION,
+   HARD_DEREFERENCE_OPTION,
+   DELETE_OPTION,
+@@ -515,6 +516,8 @@ static struct argp_option options[] = {
+    N_("force NAME as group for added files"), GRID+1 },
+   {"mtime", MTIME_OPTION, N_("DATE-OR-FILE"), 0,
+    N_("set mtime for added files from DATE-OR-FILE"), GRID+1 },
++  {"clamp-mtime", CLAMP_MTIME_OPTION, 0, 0,
++   N_("only set time when the file is more recent than what was given with --mtime"), GRID+1 },
+   {"mode", MODE_OPTION, N_("CHANGES"), 0,
+    N_("force (symbolic) mode CHANGES for added files"), GRID+1 },
+   {"atime-preserve", ATIME_PRESERVE_OPTION,
+@@ -1355,6 +1358,10 @@ parse_opt (int key, char *arg, struct argp_state *state)
+       set_subcommand_option (CREATE_SUBCOMMAND);
+       break;
+ 
++    case CLAMP_MTIME_OPTION:
++      set_mtime_option = CLAMP_MTIME;
++      break;
++
+     case 'C':
+       name_add_dir (arg);
+       break;
+@@ -1492,7 +1499,8 @@ parse_opt (int key, char *arg, struct argp_state *state)
+ 
+     case MTIME_OPTION:
+       get_date_or_file (args, "--mtime", arg, &mtime_option);
+-      set_mtime_option = true;
++      if (set_mtime_option == USE_FILE_MTIME)
++        set_mtime_option = FORCE_MTIME;
+       break;
+ 
+     case 'n':
+@@ -1508,7 +1516,7 @@ parse_opt (int key, char *arg, struct argp_state *state)
+       /* Fall through.  */
+ 
+     case NEWER_MTIME_OPTION:
+-      if (NEWER_OPTION_INITIALIZED (newer_mtime_option))
++      if (TIME_OPTION_INITIALIZED (newer_mtime_option))
+ 	USAGE_ERROR ((0, 0, _("More than one threshold date")));
+       get_date_or_file (args,
+ 			key == NEWER_MTIME_OPTION ? "--newer-mtime"
+@@ -2249,6 +2257,8 @@ decode_options (int argc, char **argv)
+   excluded = new_exclude ();
+   newer_mtime_option.tv_sec = TYPE_MINIMUM (time_t);
+   newer_mtime_option.tv_nsec = -1;
++  mtime_option.tv_sec = TYPE_MINIMUM (time_t);
++  mtime_option.tv_nsec = -1;
+   recursion_option = FNM_LEADING_DIR;
+   unquote_option = true;
+   tar_sparse_major = 1;
+@@ -2408,7 +2418,7 @@ decode_options (int argc, char **argv)
+ 		  _("Multiple archive files require '-M' option")));
+ 
+   if (listed_incremental_option
+-      && NEWER_OPTION_INITIALIZED (newer_mtime_option))
++      && TIME_OPTION_INITIALIZED (newer_mtime_option))
+     USAGE_ERROR ((0, 0,
+ 		  _("Cannot combine --listed-incremental with --newer")));
+   if (incremental_level != -1 && !listed_incremental_option)
+@@ -2461,6 +2471,13 @@ decode_options (int argc, char **argv)
+ 	USAGE_ERROR ((0, 0, _("Cannot concatenate compressed archives")));
+     }
+ 
++  if (set_mtime_option == CLAMP_MTIME)
++    {
++      if (!TIME_OPTION_INITIALIZED (mtime_option))
++	USAGE_ERROR ((0, 0,
++		      _("--clamp-mtime needs a date specified using --mtime")));
++    }
++
+   /* It is no harm to use --pax-option on non-pax archives in archive
+      reading mode. It may even be useful, since it allows to override
+      file attributes from tar headers. Therefore I allow such usage.
+-- 
+2.1.4
+
diff --git a/debian/patches/series b/debian/patches/series
index 5974cbb..8db6ac0 100644
--- a/debian/patches/series
+++ b/debian/patches/series
@@ -1,2 +1,3 @@
 pristine-tar.diff
 listed03-linux-only
+add-clamp-mtime.diff
-- 
1.9.1
