Bug#1069268: gnulib: package version is long

2024-05-01 Thread Vincent Lefevre
Hi Simon,

On 2024-05-01 11:36:56 +0200, Simon Josefsson wrote:
> Vincent Lefevre  writes:
> > IMHO, for additional version information, the Debian changelog is a
> > good place for the user + output from a standard command providing
> > the version, e.g. "gnulib-tool --version". Scripts can use such a
> > command to record the necessary information about software in log
> > files. By "standard", I mean that it needs to exist upstream.
> >
> > For instance, for GCC, there is "gcc --version", where Debian adds
> > some information. In particular for gcc-snapshot:
> >
> > $ gcc-snapshot --version
> > gcc (Debian 20240117-1) 14.0.1 20240117 (experimental) [master r14-8187-gb00be6f1576]
> > [...]
> >
> > One has all the details... When compiling software, this can be
> > found in the generated config.log file, which is really nice for
> > bug reports and debugging.
> >
> > And the gcc-snapshot version string is basically just a date.
> 
> Right.  There is one implementation problem: gnulib-tool is patched to
> read the version from /usr/share/doc/gnulib/changelog.Debian.gz now, but
> if we change changelog to only be a date, we would want to change this
> logic to actually print the useful git commit version information, and
> I'm not yet sure how to do that.

IMHO, changelog.Debian.gz should contain complete information about
the version, not in the Debian package version, but in the log. It
already has lines like:

  * New upstream snapshot from stable branch stable-202401.

You may choose some fixed format such that it is parsable by both
humans and the machine, i.e. something that a human can understand,
but simple enough that a script can produce a more compact version
for gnulib-tool.
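
To make this concrete, here is a minimal sketch in Python. The extended
entry format (branch name followed by a commit id and a date in
parentheses) and the function name are assumptions made for
illustration; the current changelog only contains the shorter line
shown above.

```python
import re

# Hypothetical fixed format for the changelog entry. Only the part up to
# the branch name exists in the current changelog; the parenthesised
# commit id and date are an assumed extension.
ENTRY_RE = re.compile(
    r"^\s*\* New upstream snapshot from stable branch "
    r"(?P<branch>stable-\d{6}) "
    r"\(commit (?P<commit>[0-9a-f]{7,40}), (?P<date>\d{4}-\d{2}-\d{2})\)\.$"
)


def compact_version(changelog_text):
    """Return 'branch commit date' from the first matching entry, or None."""
    for line in changelog_text.splitlines():
        m = ENTRY_RE.match(line)
        if m:
            return "{branch} {commit} {date}".format(**m.groupdict())
    return None


# Illustrative changelog excerpt (not the real file contents).
example = """\
gnulib (20240501-1) unstable; urgency=medium

  * New upstream snapshot from stable branch stable-202401 (commit aa0aa87, 2024-04-08).
"""
print(compact_version(example))
```

The entry stays readable for a human, and a build-time script could
store the compact string somewhere gnulib-tool is allowed to read.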

However, the Debian policy

  https://www.debian.org/doc/debian-policy/ch-docs.html

says

  Packages must not require the existence of any files in
  /usr/share/doc/ in order to function.[6] Any files that are used or
  read by programs but are also useful as stand alone documentation
  should be installed elsewhere, such as under /usr/share/package/,
  and then included via symbolic links in /usr/share/doc/package.

  [6] The system administrator should be able to delete files in
  /usr/share/doc/ without causing any programs to break.

BTW, since gnulib-tool is already in /usr/share/gnulib, it would
make sense to have version information there too (perhaps just in
the compact form, for gnulib-tool).
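
A minimal sketch of that idea in Python, assuming a one-line version
file under /usr/share/gnulib. The file name "VERSION" and the function
name are hypothetical; only the directory is mentioned in this thread.

```python
from pathlib import Path

# Hypothetical location: the file name "VERSION" is an assumption, only
# the /usr/share/gnulib directory comes from the discussion.
VERSION_FILE = Path("/usr/share/gnulib/VERSION")


def read_version(path=VERSION_FILE, fallback="unknown"):
    """Return the first line of the version file, or a fallback if absent."""
    try:
        with open(path, encoding="utf-8") as f:
            return f.readline().strip() or fallback
    except OSError:
        # A missing or unreadable file must not break the tool.
        return fallback


print(read_version())
```

Nothing under /usr/share/doc is read, so deleting the documentation
would not change the output.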

> There is also the "risk" that upstream gnulib eventually releases
> versioned archives.  There is a recently added v1.0 tag in git for
> example, suggesting things may change.  Since we haven't used the
> 0~20240501* pattern for gnulib version historically, to move to a

(Anyway, the 0~20240501* pattern would have been a bad idea, IMHO,
because AFAIK, there is no version 0 in gnulib, and that would have
confused the user.)

> version based approach we would need an epoch like 1:1.3-1 or (more
> likely) 1:1.3+20250314-2 since I think we need to package more recent
> git commits than what's in any most recent gnulib git tags.  However I
> don't think the upstream version number is relevant either, for the same
> reasons we realized the upstream commit id or branches weren't.  So we
> can continue to use dates for package versions even if upstream starts to
> release packaged archives.
> 
> I'm now inclined to use a pure date-based version string like
> 20240501-1 going forward.  Any objections?

This is fine with me. Unless there is an obvious advantage to
changing, I'd say it is better to keep using just the date.

> We then also have to fix Debian's gnulib-tool --version output to embed
> the latest git commit information from upstream that we package --
> possibly using the new git bundle as a source?  Instead of parsing
> /usr/share/doc/gnulib/changelog.Debian.gz, which may not be present in
> stripped down images anyway.

Well, you must not use files under /usr/share/doc (see above).

I saw that you now ship gnulib.bundle, but I don't know whether
this is a good thing (usefulness in practice vs the fact that
it makes the package significantly larger).

> Since gnulib is a fairly large package, I would prefer to not spam the
> archive with new *.orig.tar.gz uploads too often.  So I prefer to not
> fix this bug until we have to upload a more recent gnulib into the
> archive for other reasons.  I don't expect that to take long: I'm
> planning to do new releases of several projects (oath-toolkit, libidn2,
> inetutils, etc) that use gnulib, and work on making those Debian
> packages use the Debian gnulib package instead of vendored code would
> require a new gnulib Debian package upload.

OK. I reverted to 20240117+stable-1 on my machines. IIRC, I had
installed gnulib to be able to rebuild the libtool package, but
a rebuild won't happen until I need to patch libtool.

-- 
Vincent Lefèvre  - Web: 
100% accessible validated (X)HTML - Blog: 
Work: CR INRIA - 

2024-05-01 Thread Simon Josefsson
Hi Vincent,

Vincent Lefevre  writes:

> Thanks for the explanations. However, I think that it is not the
> goal of the Debian version string to entirely reflect all the
> Debian-side and upstream-side versions involved. This may be a
> good idea when the obtained version string is short enough and
> easy to understand, but here, this is not the case. There are
> probably better places to give such information, in a clearer
> way. This could be in well-chosen files and/or output from
> commands like "gnulib-tool --version" (see below).

I agree now.

> BTW, is the "+stable" really useful in the version string?

I'm no longer convinced it adds anything useful, so would prefer to
remove it.

>> To be honest, after writing all this, I'm no longer certain why anyone
>> would really look at the version number at all for the gnulib Debian
>> package.  A sequentially increasing number is sufficient for packaging
>> reasons: if anyone really wants to know what git commits are inside the
>> package, just read the source code in the package to find out.
>
> IMHO, for additional version information, the Debian changelog is a
> good place for the user + output from a standard command providing
> the version, e.g. "gnulib-tool --version". Scripts can use such a
> command to record the necessary information about software in log
> files. By "standard", I mean that it needs to exist upstream.
>
> For instance, for GCC, there is "gcc --version", where Debian adds
> some information. In particular for gcc-snapshot:
>
> $ gcc-snapshot --version
> gcc (Debian 20240117-1) 14.0.1 20240117 (experimental) [master r14-8187-gb00be6f1576]
> [...]
>
> One has all the details... When compiling software, this can be
> found in the generated config.log file, which is really nice for
> bug reports and debugging.
>
> And the gcc-snapshot version string is basically just a date.

Right.  There is one implementation problem: gnulib-tool is patched to
read the version from /usr/share/doc/gnulib/changelog.Debian.gz now, but
if we change changelog to only be a date, we would want to change this
logic to actually print the useful git commit version information, and
I'm not yet sure how to do that.

There is also the "risk" that upstream gnulib eventually releases
versioned archives.  There is a recently added v1.0 tag in git for
example, suggesting things may change.  Since we haven't used the
0~20240501* pattern for gnulib version historically, to move to a
version based approach we would need an epoch like 1:1.3-1 or (more
likely) 1:1.3+20250314-2 since I think we need to package more recent
git commits than what's in any most recent gnulib git tags.  However I
don't think the upstream version number is relevant either, for the same
reasons we realized the upstream commit id or branches weren't.  So we
can continue to use dates for package versions even if upstream starts to
release packaged archives.

I'm now inclined to use a pure date-based version string like
20240501-1 going forward.  Any objections?

We then also have to fix Debian's gnulib-tool --version output to embed
the latest git commit information from upstream that we package --
possibly using the new git bundle as a source?  Instead of parsing
/usr/share/doc/gnulib/changelog.Debian.gz, which may not be present in
stripped down images anyway.
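
A rough sketch of the git-bundle idea in Python, untested against the
actual gnulib.bundle. It relies on "git bundle list-heads", which
prints one "<commit> <refname>" pair per line. The function names and
the sample commit ids are made up. Since git is only a suggested
package, this would more naturally run at package build time, with the
result written to a file that gnulib-tool reads.

```python
import subprocess


def parse_list_heads(output):
    """Map ref names to commit ids from 'git bundle list-heads' output."""
    heads = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) == 2:
            commit, ref = parts
            heads[ref] = commit
    return heads


def bundle_heads(bundle_path):
    """Ask git for the heads stored in a bundle; requires git."""
    result = subprocess.run(
        ["git", "bundle", "list-heads", bundle_path],
        check=True, capture_output=True, text=True,
    )
    return parse_list_heads(result.stdout)


# Made-up sample output, for illustration only.
sample = (
    "1111111111111111111111111111111111111111 refs/heads/master\n"
    "2222222222222222222222222222222222222222 refs/heads/stable-202401\n"
)
heads = parse_list_heads(sample)
print(heads["refs/heads/stable-202401"][:7])
```

The abbreviated ids obtained this way could then be embedded in the
"gnulib-tool --version" output.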

Since gnulib is a fairly large package, I would prefer to not spam the
archive with new *.orig.tar.gz uploads too often.  So I prefer to not
fix this bug until we have to upload a more recent gnulib into the
archive for other reasons.  I don't expect that to take long: I'm
planning to do new releases of several projects (oath-toolkit, libidn2,
inetutils, etc) that use gnulib, and work on making those Debian
packages use the Debian gnulib package instead of vendored code would
require a new gnulib Debian package upload.

/Simon



2024-04-23 Thread Vincent Lefevre
Hi Simon,

On 2024-04-19 09:13:01 +0200, Simon Josefsson wrote:
> Vincent Lefevre  writes:
> 
> > Package: gnulib
> > Version: 20240412~dfb7117+stable202401.20240408~aa0aa87-2
> > Severity: normal
> >
> > A long package version is annoying for the user (for the "dpkg -l"
> > output and other reasons). I doubt that such a long version is
> > necessary;

I would also add that long version strings are truncated in the
aptitude TUI due to the limited terminal width.

> Hi Vincent and thanks for the report.
> 
> Yeah, I can sympathise with this concern, and deciding on the version
> scheme probably took me the most time in the last update.  Some
> discussion would help.  Quoting README.source:
[...]

Thanks for the explanations. However, I think that it is not the
goal of the Debian version string to entirely reflect all the
Debian-side and upstream-side versions involved. This may be a
good idea when the obtained version string is short enough and
easy to understand, but here, this is not the case. There are
probably better places to give such information, in a clearer
way. This could be in well-chosen files and/or output from
commands like "gnulib-tool --version" (see below).

> The only superfluous information in the version strings are the dates,
> but removing them does not seem like it would improve on your concern,
> rather the opposite.  Maybe the dates are what makes sense for users.
> 
> And something like dates is needed for dpkg version ordering.

Yes, for version strings, dates are much more important than things
like commit ids (which just look like random characters).

> Some ideas for improvement:
> 
> 20240411+stable-1 - revert to earlier pattern, this loses the commit id
> information and which stable branch was used, potentially making it
> impossible to use the same pattern in the future if there is one Debian
> upload of 20240411+stable-1 and somehow upstream gnulib commits an
> important patch on the same date that we need to package.

In the *rare* case of an upstream commit on the same date, a suffix
can be used, something like 20240411-2+stable-1.

BTW, is the "+stable" really useful in the version string?
Such information could be given in the Debian changelog
(which is already the case) and at some other places.

If it is dropped, there's also:
  20240411-1 (first version)
  20240411a-1 (additional commit on the same date)

[...]
> To be honest, after writing all this, I'm no longer certain why anyone
> would really look at the version number at all for the gnulib Debian
> package.  A sequentially increasing number is sufficient for packaging
> reasons: if anyone really wants to know what git commits are inside the
> package, just read the source code in the package to find out.

IMHO, for additional version information, the Debian changelog is a
good place for the user + output from a standard command providing
the version, e.g. "gnulib-tool --version". Scripts can use such a
command to record the necessary information about software in log
files. By "standard", I mean that it needs to exist upstream.

For instance, for GCC, there is "gcc --version", where Debian adds
some information. In particular for gcc-snapshot:

$ gcc-snapshot --version
gcc (Debian 20240117-1) 14.0.1 20240117 (experimental) [master r14-8187-gb00be6f1576]
[...]

One has all the details... When compiling software, this can be
found in the generated config.log file, which is really nice for
bug reports and debugging.

And the gcc-snapshot version string is basically just a date.
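
As an illustration of a script recording such information, here is a
minimal Python sketch. The helper name is made up, and the Python
interpreter stands in for gnulib-tool so that the example runs on any
machine.

```python
import subprocess
import sys


def first_version_line(command):
    """Run 'command --version' and return the first line of its output."""
    result = subprocess.run(
        list(command) + ["--version"],
        stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True,
    )
    lines = result.stdout.splitlines()
    return lines[0] if lines else ""


# The Python interpreter is used here only as a stand-in; a build
# script would pass ["gnulib-tool"] or ["gcc"] instead.
print("tool version:", first_version_line([sys.executable]))
```

A line like this in a build log is usually enough to identify the
exact tool that was used when looking at a bug report later.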

Regards,

-- 
Vincent Lefèvre  - Web: 
100% accessible validated (X)HTML - Blog: 
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)



2024-04-19 Thread Simon Josefsson
Vincent Lefevre  writes:

> Package: gnulib
> Version: 20240412~dfb7117+stable202401.20240408~aa0aa87-2
> Severity: normal
>
> A long package version is annoying for the user (for the "dpkg -l"
> output and other reasons). I doubt that such a long version is
> necessary;

Hi Vincent and thanks for the report.

Yeah, I can sympathise with this concern, and deciding on the version
scheme probably took me the most time in the last update.  Some
discussion would help.  Quoting README.source:

 A package version "20240411~b57dd0d+stable202401.20240408~aa0aa87"
 means that the gnulib git clone had the latest commit on "master"
 dated "20240411" with revision "b57dd0d" and the "stable-202401"
 branch had a commit dated "20240408" with revision "aa0aa87".  The
 files in /usr/share/gnulib correspond to files on the "stable-202401"
 branch with revision "aa0aa87", except for
 /usr/share/gnulib/gnulib.bundle which is a git bundle containing both
 master and stable branches up until "b57dd0d" (master) and "aa0aa87"
 (stable-202401) respectively.  This approach enable /usr/share/gnulib
 to be an unpacked stable gnulib release, but the
 /usr/share/gnulib/gnulib.bundle (which is more likely to be used by
 other packages for bootstrapping) can contain newer commits too.
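
For illustration, the scheme described above can be split into its
components with a short Python sketch. The regular expression, the
field names and the function name are assumptions based only on the
README.source text, not code from the package.

```python
import re

# Pattern derived from the README.source description; the field names
# are made up for this example.
VERSION_RE = re.compile(
    r"^(?P<master_date>\d{8})~(?P<master_rev>[0-9a-f]+)"
    r"\+stable(?P<branch>\d{6})"
    r"\.(?P<stable_date>\d{8})~(?P<stable_rev>[0-9a-f]+)"
    r"(?:-(?P<debian_rev>[^-]+))?$"
)


def parse_version(version):
    """Split a long gnulib package version into named components."""
    m = VERSION_RE.match(version)
    if m is None:
        raise ValueError("unrecognised version: " + version)
    fields = m.groupdict()
    # The branch is written without the hyphen inside the version string.
    fields["branch"] = "stable-" + fields["branch"]
    return fields


info = parse_version("20240411~b57dd0d+stable202401.20240408~aa0aa87")
for name, value in info.items():
    print(name, value)
```

The optional Debian revision (such as the "-2" in the reported
version) is returned separately when present.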

So there really are two upstream versions involved, and upstream
neither tags nor releases tarballs, so timestamps, branch names and git
commits are the only identifiers that we have available:

0~a57dd0d~stable-202401~ba0aa87

It could be that we will have these two versions at some point, maybe
one in unstable and one in experimental:

20250411~a57dd0d+stable202501.20250408~ba0aa87
20250411~a57dd0d+stable202407.20250308~c928adb

These correspond to the same master branch but different stable branches.

We could also have differences like this, showing updated commits on the
same stable branch:

20250411~a57dd0d+stable202501.20250408~ba0aa87
20250411~a57dd0d+stable202501.20250308~c928adb

Upstream maintains multiple stable branches in parallel.

We can have uploads for different master commits but same stable branch,
consider if a security fix was applied to a stable branch at some point:

20250411~8585cab+stable202501.20250408~ba0aa87
20250411~abc4818+stable202501.20250408~abc4874

or even:

20250411~8585cab+stable202501.20250408~ba0aa87
20250311~abc4818+stable202501.20250408~ba0aa87

To be able to identify the upstream version, we need the information of
the commit id on the master branch together with which stable branch was
used and which commit on the stable branch.

The only superfluous information in the version strings are the dates,
but removing them does not seem like it would improve on your concern,
rather the opposite.  Maybe the dates are what makes sense for users.
And something like dates is needed for dpkg version ordering.

Some ideas for improvement:

20240411+stable-1 - revert to earlier pattern, this loses the commit id
information and which stable branch was used, potentially making it
impossible to use the same pattern in the future if there is one Debian
upload of 20240411+stable-1 and somehow upstream gnulib commits an
important patch on the same date that we need to package.

20240411~a57dd0d-1 - this makes the git commit identifiable, but loses
information about which stable branch was used.  Maybe the stable branch
version is more important than the master branch release date?

20240411+stable2401~a47dd0d-1 - this adds the release branch used, and
has git commit id corresponding to git master.  May lead to version
conflicts if multiple releases are needed on the same date.

I guess you can come up with some more variations that for better or
worse make some sense.

To be honest, after writing all this, I'm no longer certain why anyone
would really look at the version number at all for the gnulib Debian
package.  A sequentially increasing number is sufficient for packaging
reasons: if anyone really wants to know what git commits are inside the
package, just read the source code in the package to find out.

So maybe just fall back to what 'uscan' suggests: 20240418~0a85f70-1.
Or 20240418+stable-1 like we had before, although I'm not really sure if
the '+stable' helps anyone, and it is a bit confusing if the date refers
to the timestamp on the stable or master branch.

> I'm wondering whether it is intended or is due to a bug in a script
> (the "Fix gnulib-tool --version" seems to have done nothing
> significant).

The current version scheme is intentional, but I'm open to changing it.
That was an unrelated fix: previously, 'gnulib-tool --version' tried to parse
/usr/share/doc/gnulib/NEWS.stable.gz, which doesn't exist, so you got an
error message and the printed version string became garbled.

/Simon



2024-04-18 Thread Vincent Lefevre
Package: gnulib
Version: 20240412~dfb7117+stable202401.20240408~aa0aa87-2
Severity: normal

A long package version is annoying for the user (for the "dpkg -l"
output and other reasons). I doubt that such a long version is
necessary; I'm wondering whether it is intended or is due to a
bug in a script (the "Fix gnulib-tool --version" seems to have
done nothing significant).

-- System Information:
Debian Release: trixie/sid
  APT prefers unstable-debug
  APT policy: (500, 'unstable-debug'), (500, 'stable-updates'), (500, 'stable-security'), (500, 'stable-debug'), (500, 'proposed-updates-debug'), (500, 'unstable'), (500, 'testing'), (500, 'stable'), (1, 'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 6.6.15-amd64 (SMP w/12 CPU threads; PREEMPT)
Kernel taint flags: TAINT_PROPRIETARY_MODULE, TAINT_OOT_MODULE, TAINT_UNSIGNED_MODULE
Locale: LANG=C.UTF-8, LC_CTYPE=C.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

gnulib depends on no packages.

Versions of packages gnulib recommends:
ii  autoconf         2.71-3+local2
ii  automake         1:1.16.5-1.3
ii  autopoint        0.21-14
ii  bison            2:3.8.2+dfsg-1+b1
ii  build-essential  12.10
ii  gettext          0.21-14+b1
ii  gperf            3.1-1
ii  libtool          2.4.7-7
ii  m4               1.4.19-4
ii  texinfo          7.1-3

Versions of packages gnulib suggests:
pn  clisp
ii  git      1:2.43.0-1+b1
ii  perl     5.38.2-3.2+b2
ii  python3  3.11.8-1

-- no debconf information

-- 
Vincent Lefèvre  - Web: 
100% accessible validated (X)HTML - Blog: 
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)