On 2025-09-05, Sergio Durigan Junior wrote:
> On Friday, September 05 2025, John Scott wrote:
>> I'm trying to update the gcc-sh-elf package which uses a shared tree
>> to build a whole toolchain, similar to what's described at
>> https://gcc.gnu.org/simtest-howto.html but with some added complexity
>> due to drift between versions of GCC, GDB, and Newlib. In my specific
>> instance, the top-level include/dwarf2.h is a header shared by both
>> GCC and GDB. GDB's copy appears to not have symbols GCC 15 needs.
>>
>> In general GCC keeps the "master copies", but since we use tarballs
>> on different release schedules in Debian, letting GCC's version
>> always win in a conflict isn't the right way to solve this. What
>> would be helpful, instead, is for me to use file timestamps to figure
>> out which project has the newest version and use that when merging
>> the trees together. However gdb-source doesn't preserve this: with
>> your experimental upload from earlier today I get

A timestamp does not necessarily give you the granularity you are
looking for. It could easily end up being the timestamp of whenever it
was built (e.g. a binary NMU with no source code changes), not when the
file was last updated. A git clone vs. updating an existing git
repository may end up with different timestamps, even with the exact
same top-level commit.  If a security update for an old version is
published after the newest upstream release, you may end up with a newer
date on an older copy of the file... etc.

There are so many ways timestamps can be modified; I do not think it
will reliably detect the "newest" copy for you...
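
To illustrate, here is a minimal sketch with two byte-identical stand-ins
for a shared header. The paths, contents and dates are made up for the
demonstration; the point is only that the mtime comparison picks a
"winner" even though the files do not differ at all.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch layout; not the real source trees.
work=$(mktemp -d)
mkdir -p "$work/gcc/include" "$work/gdb/include"

# Two byte-identical stand-ins for a shared header (contents are made up).
echo '#define EXAMPLE 1' > "$work/gcc/include/dwarf2.h"
cp "$work/gcc/include/dwarf2.h" "$work/gdb/include/dwarf2.h"

# Give them different mtimes, as a fresh checkout or a rebuild might.
touch -d '2025-04-01 00:00:00 UTC' "$work/gcc/include/dwarf2.h"
touch -d '2025-09-05 01:47:00 UTC' "$work/gdb/include/dwarf2.h"

# Compare by content.
if cmp -s "$work/gcc/include/dwarf2.h" "$work/gdb/include/dwarf2.h"; then
    same_content=yes
else
    same_content=no
fi

# Compare by modification time.
if [ "$work/gdb/include/dwarf2.h" -nt "$work/gcc/include/dwarf2.h" ]; then
    newer_by_mtime=gdb
else
    newer_by_mtime=gcc
fi

echo "same content: $same_content; newer by mtime: $newer_by_mtime"
```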


>>      $ tar --utc -tf /usr/src/gdb.tar.xz gdb/include/dwarf2.h
>>      -rwxr-xr-x 0/0           16958 2025-09-05 01:47 gdb/include/dwarf2.h
>> so it looks like the timestamp gets mangled in the process of packaging. 
>> This seems to be done at:
>>
>> debian/rules:396
>> tar cfJ $(CURDIR)/debian/gdb-source/usr/src/gdb.tar.xz \
>> --format=gnu \
>> --mode=755 \
>> --mtime="@$(DEB_TIMESTAMP)" --clamp-mtime \
>> --numeric-owner --owner=0 --group=0 \
>> --sort=name \
>> $(notdir $(builddir_source))
>
> Yeah, I can see how these options can be a problem for your use case.
>
>> These parameters look overzealous and neither necessary nor appropriate.
>
> They are actually necessary; more below.
>
>>  • Why enforce use of the GNU-specific tar format as opposed to
>>  leaving it at the default? The ustar and pax formats are
>>  standardized in POSIX and the former ought to be adequate, but the
>>  default should always be fine.

With pax it is possible to set the necessary attributes to get a
reproducible result, but it is considerably more fiddly work than with
the gnu tar format. I am not sure about ustar; the Reproducible Builds
project documentation suggests ustar might be a viable option (if indeed
its limitations are acceptable):

  https://reproducible-builds.org/docs/archives/


>>  • All files get the executable bit set with 'mode=755', even plain
>>  text ones.

Setting the mode was to fix differences in umask in the build
environments, but indeed, it is arguably an overly broad hammer. There
may be ways to do that more elegantly, without specifying each and every
file's mode appropriately.
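
One possible approach, sketched below and untested against the actual
gdb packaging: GNU tar's --mode accepts symbolic chmod-style modes, so
something like 'go=rX,u+rw,a-s' can smooth over umask differences while
leaving the executable bit only on directories and on files that already
have it. The scratch tree and file names are hypothetical. Note that this
does not help if the executable bit itself differs between build
environments.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch tree standing in for the unpacked source.
work=$(mktemp -d)
mkdir -p "$work/src"
echo 'int x;' > "$work/src/plain.c"
printf '#!/bin/sh\n' > "$work/src/configure"

# Modes as they might come out of a build with a restrictive umask.
chmod 700 "$work/src"
chmod 600 "$work/src/plain.c"
chmod 700 "$work/src/configure"

# Symbolic mode instead of a blanket --mode=755:
#   go=rX  group/other get read, plus execute only for directories
#          and files that are already executable
#   u+rw   owner can always read and write
#   a-s    drop setuid/setgid bits
tar -C "$work" --format=gnu --sort=name \
    --mode='go=rX,u+rw,a-s' \
    --numeric-owner --owner=0 --group=0 \
    -cf "$work/out.tar" src

# Show only the permission string and the member name.
modes=$(tar -tvf "$work/out.tar" | awk '{ print $1, $NF }')
printf '%s\n' "$modes"
```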

Does it actually cause ... problems with the source code?


>>  • It's not clear what the modification time-related options are
>>  supposed to accomplish here, but it's hurting my use case and I
>>  wonder how this affects reproducible builds.

Without the timestamp normalization, if you build the package now, and
you build it a week from now, or possibly even seconds from now, you get
a different result... because the build time of some files gets embedded
in the archive, resulting in different .tar archives.
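
A small sketch of the effect, using a made-up scratch tree and a fixed
value standing in for $(DEB_TIMESTAMP). With --clamp-mtime, only files
newer than the given time are rewritten to it; older files keep their
own mtime. A file that is regenerated at build time therefore no longer
changes the archive, while a file untouched since 2020 still shows 2020.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch tree standing in for the unpacked source.
work=$(mktemp -d)
mkdir -p "$work/src"
echo 'old file' > "$work/src/old.h"
echo 'generated file' > "$work/src/new.h"

# Pretend old.h is unchanged since 2020 and new.h was written during
# the build, i.e. later than the clamp value.
touch -d '2020-01-01 00:00:00 UTC' "$work/src/old.h"
touch -d '2030-01-01 00:00:00 UTC' "$work/src/new.h"

# Stand-in for $(DEB_TIMESTAMP): 2025-01-01 00:00:00 UTC.
stamp=1735689600

pack() {
    tar -C "$work" --format=gnu --sort=name \
        --mtime="@$stamp" --clamp-mtime \
        --numeric-owner --owner=0 --group=0 \
        -cf "$1" src
}

pack "$work/a.tar"
# Simulate a later rebuild that rewrites the generated file.
touch -d '2031-06-01 00:00:00 UTC' "$work/src/new.h"
pack "$work/b.tar"

sum_a=$(sha256sum < "$work/a.tar")
sum_b=$(sha256sum < "$work/b.tar")
listing=$(tar --utc -tvf "$work/b.tar")
printf '%s\n' "$listing"
```

Dropping the --mtime and --clamp-mtime options from pack() should make
the two checksums differ, since the second archive would then record
the later mtime of new.h.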


>> The actual Debian source package keeps the timestamps and permissions
>> intact—it's just when bundling the files up for gdb-source they get
>> lost.

If that is actually true, I welcome better fixes to the issue...


>> Once you understand how this got to be the way it is, can you restore
>> the information in subsequent uploads? That will help a lot. Thanks
>
> So, the reason why GDB generates its source package this way is
> explained at https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=950606 .
> Long story short, these extra parameters are there to ensure that the
> final tarball is reproducible.
>
> I'm Cc'ing Vagrant, who requested this change and provided the patch, so
> that we can have a more informed conversation about what's happening
> here and hopefully reach a solution.

Admittedly, enough has happened in the last five or so years that
the details are not fresh in my mind... this was not the only issue
making GDB unreproducible in Debian at the time, but it seems the other
issues have been fixed recently! Yay!


> On a side note, I would personally like to get rid of the gdb-source
> binary package.  I understand why it exists and all, but it's just a
> hack that we do in order to save megabytes of space in the archive.
> Either way, we're keeping it for the time being of course.

Well, with my reproducible builds hat on, that would be another way to
solve the problem! :)


live well,
  vagrant
