Re: Versioned releases

2020-06-05 Thread Paul Smith

On Thu, 2020-06-04 at 23:29 +0200, Bruno Haible wrote:
> I disagree on this one. It would make people think that the Nth
> commit, or the Monday commit, or whatever, is preferred over the
> other commits.  Which it really isn't - there may be a regression fix
> coming in just the next day.

I'm not sure about that.  A tag which is just the date is pretty
clearly nothing more than that: a tag which is the date.  I don't think
anyone will somehow believe that it means more than it is.

Plenty of systems out there do similar things.

> In summary:
>   * The date (first line of ChangeLog) is a good version indicator.
>   * If someone doesn't like dates, for whatever reason, they can use
>     'git describe'.

IMO 'git describe' is less useful without some type of tagging regimen.

Even tagging once a year would be helpful.

This is today:

  $ git describe
  v0.1-3536-gd50852525

??  If we added a tag "2020" at the beginning of the year we'd get:

  $ git describe
  2020-427-gd50852525

A tag like "202006" at the beginning of the month would give:

  $ git describe
  202006-11-gd50852525

However it's done, my main hope is that gnulib provide some kind of
module which does this version detection and generation for you, and
builds that into its scripting so it's automatic, rather than everyone
reinventing it (possibly slightly differently) for themselves.

For example when I run bootstrap against a Git repo, it would run "git
describe" and put the results into some gnulib version string in the
files copied into my workspace.

And there could be an "extract a static workspace" script that would do
the same type of thing for an entire gnulib copy, that distributions
(if they really wanted to ship a "gnulib" package) could run.
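
A minimal sketch of what such a helper might look like, assuming a
hypothetical function name gnulib_version_string (this is not an existing
gnulib facility). It prefers 'git describe' and falls back to the date on
the first line of ChangeLog, the identifier suggested elsewhere in this
thread:

```shell
# Hypothetical helper: print a version string for the gnulib tree in $1.
# Uses 'git describe' when the tree is a Git checkout with at least one
# annotated tag; otherwise falls back to the first-line date of ChangeLog.
gnulib_version_string () {
  dir=$1
  # '.git' is a directory in a normal clone and a file in a submodule,
  # so test for existence rather than for a directory.
  if test -e "$dir/.git" && v=$(git -C "$dir" describe 2>/dev/null); then
    printf '%s\n' "$v"
  elif test -f "$dir/ChangeLog"; then
    sed -n -e 's/^\([0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]\).*/\1/p' \
        -e 1q "$dir/ChangeLog"
  else
    echo unknown
  fi
}
```

A bootstrap script could then call something like
gnulib_version_string "$GNULIB_SRCDIR" (variable name assumed) and record
the result wherever the package wants it.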

Re: Versioned releases

2020-06-04 Thread Bruno Haible

Bernhard Voelker wrote:
> Well, the projects using gnulib (via git submodule) could at least generate
> the 'git describe' value into their NEWS file or other documentation.

Some packages do this already. The latest GNU Bison release announcement [1],
for example:

  "This release was bootstrapped with the following tools:
     Autoconf 2.69
     Automake 1.16.2
     Flex 2.6.4
     Gettext 0.19.8.1
     Gnulib v0.1-3420-gffbb0ced8
  "

> And gnulib could provide helpers for that.

The build-aux/announce-gen script already has support for it:

   --gnulib-version=VERSION report VERSION as the gnulib version, where
                            VERSION is the result of running git describe
                            in the gnulib source directory.

Bruno

[1] https://lists.gnu.org/archive/html/bug-bison/2020-05/msg00097.html

Re: Versioned releases

2020-06-04 Thread Bernhard Voelker

On 2020-06-04 20:19, Bruno Haible wrote:
> Indeed e.g. Debian has a gnulib "package":
>   https://packages.debian.org/sid/all/gnulib/filelist
> 
> But I think it's a red herring, since basically no one is using gnulib
> this way.

I agree, it's not really useful: e.g. I'm using openSUSE:Tumbleweed,
a rolling release with almost the latest and greatest.  But still,
the version of it is already >260 commits behind (from 2020-02-16).

Well, the 'gnulib-docs' package integrates well, but some content
is already outdated - especially the newer, interesting parts.

I guess that other, non-rolling distros come with much older and
therefore even more useless versions.

> You mean, a distributor wants to determine which of the coreutils,
> findutils, gawk, gettext, etc. packages use a Gnulib from before 2018-09-23?
> This is nontrivial, but not because Gnulib does not have a version
> number, but because it's shipped as a source-code library - something that
> we don't want to change.

Well, the projects using gnulib (via git submodule) could at least generate
the 'git describe' value into their NEWS file or other documentation.
And gnulib could provide helpers for that.

Still, that wouldn't help in the case where the packager adds a downstream
patch for gnulib files.  Well, that same patch could add a note
about it to the docs as well.

Have a nice day,
Berny

Re: Versioned releases

2020-06-04 Thread Bruno Haible

Hi Dmitry,

> My claim only covers standalone distribution of gnulib. I don't want
> to dig into the reasons why upstream forces bundling and why
> downstream doesn't follow it anyway, but the sole fact that it's packaged
> standalone in so many distributions speaks for itself: this way of
> distribution is a necessity.

I don't think so. This way of distribution is a misunderstanding.

Every developer nowadays is used to doing 'git clone' here and there;
there are even more and more people who prefer the hassles of building a
package from a git checkout to the sailing trip of building a tarball.

> With standalone distribution there's no way to peek into git history
> or some source files, but there's a clear identifier of which specific
> version is packaged.

Yes. As I said, the first line of the ChangeLog is the best identifier.

> > Or are you suggesting that the Gnulib developers pick, say, every 100th
> > Gnulib commit and assign it a version number? And how would that be useful,
> > since the consumers upgrade when they like to?
> 
> I would suggest using proper semver.

semver is not a good philosophy for gnulib, because different packages
use different gnulib modules. This week we made an incompatible change
to the 'read-file' module; but the vast majority of the packages will
not be impacted because they don't use this module. Therefore bumping
a version number is not really meaningful.

> But dumb tagging every nth
> commit, or weekly or so would definitely be better than nothing

I disagree on this one. It would make people think that the Nth commit,
or the Monday commit, or whatever, is preferred over the other commits.
Which it really isn't - there may be a regression fix coming in just the
next day.

In summary:
  * The date (first line of ChangeLog) is a good version indicator.
  * If someone doesn't like dates, for whatever reason, they can use
    'git describe'.

Bruno

Re: Versioned releases

2020-06-04 Thread Paul Smith

On Thu, 2020-06-04 at 23:11 +0300, Dmitry Marakasov wrote:
> * Paul Smith (psm...@gnu.org) wrote:
> 
> > Regarding the format of the version:
> > 
> > First, semver is not right for gnulib.  The entire concept behind
> > semver and similar versioning schemes is to use a version string to
> > describe compatibility guarantees between different versions. 
> > That's (IMO) completely inappropriate for a source-only package
> > like gnulib.
> 
> Why, that's precisely what semver is useful and was designed for.
> It's MAJOR.MINOR.PATCH - if you break API, bump MAJOR, if you
> introduce new feature, bump MINOR, otherwise bump PATCH.

I'm not a gnulib developer, so I don't want to speak for them: maybe
they would like to make this attempt.  But IMO it's not appropriate for
gnulib.

During the development of gnulib there aren't discrete release points
where someone will stop and consider all the changes since the last
release, and assign some version to it as a whole.  To the extent that
such discrete points exist they are invented by distributions that
include gnulib as a package... not by the gnulib developers.

To follow semver, or a similar versioning scheme, would mean that EVERY
SINGLE COMMIT would have to change the version, because EVERY SINGLE
COMMIT makes some change, and anyone could do a Git pull of gnulib at
any instant and include it in their program, or in their distribution.

The only possibly workable option would be to have the first two
numbers in a semver be bumped by developers when they pushed changes
which they knew to change the API or add a feature, and leave the last
number to be automatically generated based on the number of Git commits
since the last version bump (since those commits can be assumed to be
bugfixes only).
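
As a rough sketch of that last step only, assuming tags of the form
vMAJOR.MINOR (like the existing v0.1 tag) and a hypothetical helper name,
the 'git describe' output could be mapped to a three-part version like
this:

```shell
# Hypothetical helper: map 'git describe' output of the form
# vMAJOR.MINOR-COUNT-gSHA to MAJOR.MINOR.COUNT, and a bare tag
# vMAJOR.MINOR to MAJOR.MINOR.0.  Prints nothing for any other input.
describe_to_semver () {
  printf '%s\n' "$1" |
    sed -n \
      -e 's/^v\{0,1\}\([0-9][0-9]*\)\.\([0-9][0-9]*\)-\([0-9][0-9]*\)-g[0-9a-f][0-9a-f]*$/\1.\2.\3/p' \
      -e 's/^v\{0,1\}\([0-9][0-9]*\)\.\([0-9][0-9]*\)$/\1.\2.0/p'
}
```

With today's repository, describe_to_semver "$(git describe)" would turn
v0.1-3536-gd50852525 into 0.1.3536. The SHA is dropped, so the result no
longer identifies a commit by itself.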

However, I doubt this is reasonable either.  First, even only
considering the first two semver values it would add a lot of overhead
and effort to the development process to consider and get right these
version bumps with every push to the repository.

Second, remember gnulib is not a monolithic entity: it's a collection
of 1,200 or so discrete "utilities" (and counting...), most of which
are just one or two files.  Do we say that the version of gnulib should
change every time ANY ONE of those hundreds of utilities had a new
feature or a change to their API?

Suppose Bruno pushes a new module (second number bump), then the next
day realizes it has a problem that needs the API to change (first
number bump).  Then an hour later he realizes there's another problem
with the API (another first number bump).  Etc.  Just because the API
version bumped doesn't tell you anything very interesting when it could
be any one of >1000 different utilities whose API was changed, for any
number of reasons.

> So as a consumer I may just require e.g. version >=1.2.3 <2, and
> expect it to be API-compatible and have all the features my code
> requires.

That isn't how gnulib is intended to be used.

> > My recommendation would be to automatically add a tag once a month
> > (say) to the gnulib Git repo with the date, and then use the "git
> > describe" output as the version.  This gives an easily-comparable
> > version string with all the info needed.
> 
> This complicates the format, as SHAs are never appropriate in
> versions, for they are not monotonic and alphabetic characters are not
> compatible with all package managers. Someone may include them, some
> may omit them, and we'll end up with incompatible versioning schemes
> again.

IMO the idea of being able to learn anything from a gnulib version that
is more informative than, "this one contains more recent commits than
that one" is not feasible.

But I think that "this one contains more recent commits than that one"
_IS_ a very useful and desirable metric, and speaking as a gnulib user I
hope we can find a relatively painless way to incorporate it.

Re: Versioned releases

2020-06-04 Thread Dmitry Marakasov

* Paul Smith (psm...@gnu.org) wrote:

> Regarding the format of the version:
> 
> First, semver is not right for gnulib.  The entire concept behind
> semver and similar versioning schemes is to use a version string to
> describe compatibility guarantees between different versions.  That's
> (IMO) completely inappropriate for a source-only package like gnulib.

Why, that's precisely what semver is useful and was designed for.
It's MAJOR.MINOR.PATCH - if you break API, bump MAJOR, if you introduce
new feature, bump MINOR, otherwise bump PATCH.

So as a consumer I may just require e.g. version >=1.2.3 <2, and expect
it to be API-compatible and have all the features my code requires. With
that, library code may be safely (even automatically) updated to the
latest 1.x version, be it a systemwide package maintained by someone
else, or a bundled code/subrepository, and consumer code will not break,
yet having all the latest features/fixes from the library.
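
To make the ">=1.2.3 <2" requirement concrete, here is a minimal sketch of
such a check. The function name semver_satisfies and its argument order are
assumptions made for illustration; real package managers have their own
constraint syntax and comparison rules:

```shell
# Hypothetical helper: succeed if version $1 (MAJOR.MINOR.PATCH) is
# at least $2 and its major number is below $3.  Fields are compared
# numerically, so 1.10.0 counts as newer than 1.2.3.
semver_satisfies () {
  awk -v v="$1" -v min="$2" -v maxmajor="$3" 'BEGIN {
    split(v, a, "."); split(min, b, ".")
    # Upper bound: the major number must stay below maxmajor.
    if (a[1] + 0 >= maxmajor + 0) exit 1
    # Lower bound: the first differing field decides.
    for (i = 1; i <= 3; i++) {
      if (a[i] + 0 > b[i] + 0) exit 0
      if (a[i] + 0 < b[i] + 0) exit 1
    }
    exit 0
  }'
}
```

For example, semver_satisfies 1.10.0 1.2.3 2 succeeds, while
semver_satisfies 2.0.0 1.2.3 2 fails.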

> I think the Git SHA is the single most critical element and must be
> included.  However, it's not too informative unless the user has the
> Git repo.
>
> My recommendation would be to automatically add a tag once a month
> (say) to the gnulib Git repo with the date, and then use the "git
> describe" output as the version.  This gives an easily-comparable
> version string with all the info needed.

This complicates the format, as SHAs are never appropriate in
versions, for they are not monotonic and alphabetic characters are not
compatible with all package managers. Someone may include them, some may
omit them, and we'll end up with incompatible versioning schemes again.

If you're going to introduce a version, please be sure it's the same
whether it appears as a tag or embedded in the source. You may as well embed
the git commit or `git describe` output, but it should be clearly separated
from the version.

-- 
Dmitry Marakasov   .   55B5 0596 FF1E 8D84 5F56  9510 D35A 80DD F9D2 F77D
amd...@amdmi3.ru  ..:  https://github.com/AMDmi3

Re: Versioned releases

2020-06-04 Thread Paul Smith

On Thu, 2020-06-04 at 20:19 +0200, Bruno Haible wrote:
> Are you suggesting that every gnulib commit can be translated to a
> version number? There's 'git describe' which does that.
> 
> Or are you suggesting that the Gnulib developers pick, say, every
> 100th Gnulib commit and assign it a version number? And how would
> that be useful, since the consumers upgrade when they like to?

What would be useful is if there were a "gnulib-version" module or
similar that was constructed when bootstrap was run and pulled in a new
suite of gnulib content, for example, based on the Git version perhaps.

Then applications could call a C function to return the gnulib version
as a string and include it in their --version output (if they wanted
to) and users could judge the "freshness" of the gnulib content.
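
One way the string could reach C code is a generated header. The following
is only a sketch: the macro name GNULIB_VERSION_STRING and the function
name are hypothetical, and nothing in this thread says gnulib provides
them:

```shell
# Hypothetical generator: write a C header that defines the gnulib
# version as a string literal.  $1 = version string, $2 = output file.
# Assumes the version contains no '"' or '\' characters.
write_gnulib_version_header () {
  {
    echo '/* Generated at bootstrap time; do not edit.  */'
    printf '#define GNULIB_VERSION_STRING "%s"\n' "$1"
  } > "$2"
}
```

An application could then include the generated file and print the macro
as part of its --version output.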

For the distro packages that take a snapshot of the Git repo: it would
be good if there were some way to have that snapshot contain hardcoded
version details from Git, so that if apps bootstrapped from the
distro snapshot of gnulib they would get the correct hardcoded version.

I don't pretend to know too much about how all this works, including
how distros create gnulib packages, but this seems like something that
would be do-able and useful, and wouldn't need to involve any type of
"automatic versioning" of gnulib in the Git repo.

Regarding the format of the version:

First, semver is not right for gnulib.  The entire concept behind
semver and similar versioning schemes is to use a version string to
describe compatibility guarantees between different versions.  That's
(IMO) completely inappropriate for a source-only package like gnulib.

I think the Git SHA is the single most critical element and must be
included.  However, it's not too informative unless the user has the
Git repo.

My recommendation would be to automatically add a tag once a month
(say) to the gnulib Git repo with the date, and then use the "git
describe" output as the version.  This gives an easily-comparable
version string with all the info needed.

Re: Versioned releases

2020-06-04 Thread Dmitry Marakasov

* Bruno Haible (br...@clisp.org) wrote:

> > Despite that gnulib homepage says "Gnulib does not make releases.
> > It is intended to be used at the source level." gnulib is in fact
> > packaged in quite a lot of distributions:
> > 
> > https://repology.org/project/gnulib/versions
> 
> Indeed e.g. Debian has a gnulib "package":
>   https://packages.debian.org/sid/all/gnulib/filelist
> 
> But I think it's a red herring, since basically no one is using gnulib
> this way.
> 
> > Note that since there are no official versions maintainers have to
> > invent versioning schemes which include "0", multiple date based and
> > commit number based formats.
> 
> There is nothing wrong with that. As long as the date can be retrieved from
> the checkout, there is no problem:
> 
> git_checkout_date=`if test -d .git; then
>     git log -n 1 --date=iso --format=fuller | sed -n -e 's/^CommitDate: //p';
>   else
>     sed -n -e 's/^\([0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]\).*/\1/p' -e 1q ChangeLog;
>   fi`
> pretty_date=`LC_ALL=C date +"%e %B %Y" --date="$git_checkout_date"`
> 
> > There are known vulnerabilities for gnulib which also have to use
> > something version-like to describe which gnulib versions are affected
> > (these use dates in YYYY-MM-DD format):
> > 
> > https://nvd.nist.gov/vuln/detail/CVE-2017-7476
> > https://nvd.nist.gov/vuln/detail/CVE-2018-17942
> 
> It says e.g. "in Gnulib before 2018-09-23 has a heap-based buffer overflow".
> It is easy for every user of Gnulib to determine whether their version
> is before or after 2018-09-23. Just peek at the ChangeLog or 'gitk'.
> 
> It is not harder than when a CVE is about "OpenSSL through 1.0.1i".
> 
> > Note that it's impossible to match these against package versions due
> > to inconsistent versioning scheme.
> 
> You mean, a distributor wants to determine which of the coreutils,
> findutils, gawk, gettext, etc. packages use a Gnulib from before 2018-09-23?
> This is nontrivial, but not because Gnulib does not have a version
> number, but because it's shipped as a source-code library - something that
> we don't want to change.
> 
> Such a distributor would
>   - for packages for which they used tarballs, look at the particular file
>     in the tarball (e.g. lib/vasnprintf.c); I admit it is tedious;
>   - for packages for which they use the git checkout, look at the git
>     submodule version (e.g. [1][2]); this is tedious as well.
> 
> But I don't see how a versioning scheme would significantly help.

My claim only covers standalone distribution of gnulib. I don't want
to dig into the reasons why upstream forces bundling and why
downstream doesn't follow it anyway, but the sole fact that it's packaged
standalone in so many distributions speaks for itself: this way of
distribution is a necessity.

With standalone distribution there's no way to peek into git history
or some source files, but there's a clear identifier of which specific
version is packaged. And it can be used to estimate how up to date
the packaged version is, and to reliably check whether it has known
vulnerabilities and (when semver is used) whether it's compatible with
particular consumers.

> > So as you can see, even though there are no official versioned releases,
> > people have to invent and use these to refer to specific gnulib commit
> > ranges, and not having any consistency in these schemes results in e.g.
> > inability to report vulnerable packages.
> 
> I don't see noticeable problems caused by this inconsistency.
> 
> > So I suggest to fix this by introducing any kind of upstream versioning.
> 
> Are you suggesting that every gnulib commit can be translated to a
> version number? There's 'git describe' which does that.
> 
> Or are you suggesting that the Gnulib developers pick, say, every 100th
> Gnulib commit and assign it a version number? And how would that be useful,
> since the consumers upgrade when they like to?

I would suggest using proper semver. But dumb tagging every nth
commit, or weekly or so would definitely be better than nothing,
as long as the tags use consistent scheme. There's no need for
exact commit:version mapping, to say that "versions below x.y.z"
contain a bug or vulnerability. Just enough precision to not have
to wait months for a fixed version to be released.

> [1] https://git.savannah.gnu.org/cgit/poke.git/log/gnulib
> [2] https://git.savannah.gnu.org/cgit/gettext.git/log/gnulib

-- 
Dmitry Marakasov   .   55B5 0596 FF1E 8D84 5F56  9510 D35A 80DD F9D2 F77D
amd...@amdmi3.ru  ..:  https://github.com/AMDmi3

Re: Versioned releases

2020-06-04 Thread Bruno Haible

Hi Dmitry,

> Despite that gnulib homepage says "Gnulib does not make releases.
> It is intended to be used at the source level." gnulib is in fact
> packaged in quite a lot of distributions:
> 
> https://repology.org/project/gnulib/versions

Indeed e.g. Debian has a gnulib "package":
  https://packages.debian.org/sid/all/gnulib/filelist

But I think it's a red herring, since basically no one is using gnulib
this way.

> Note that since there are no official versions maintainers have to
> invent versioning schemes which include "0", multiple date based and
> commit number based formats.

There is nothing wrong with that. As long as the date can be retrieved from
the checkout, there is no problem:

git_checkout_date=`if test -d .git; then
    git log -n 1 --date=iso --format=fuller | sed -n -e 's/^CommitDate: //p';
  else
    sed -n -e 's/^\([0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]\).*/\1/p' -e 1q ChangeLog;
  fi`
pretty_date=`LC_ALL=C date +"%e %B %Y" --date="$git_checkout_date"`

> There are known vulnerabilities for gnulib which also have to use
> something version-like to describe which gnulib versions are affected
> (these use dates in YYYY-MM-DD format):
> 
> https://nvd.nist.gov/vuln/detail/CVE-2017-7476
> https://nvd.nist.gov/vuln/detail/CVE-2018-17942

It says e.g. "in Gnulib before 2018-09-23 has a heap-based buffer overflow".
It is easy for every user of Gnulib to determine whether their version
is before or after 2018-09-23. Just peek at the ChangeLog or 'gitk'.

It is not harder than when a CVE is about "OpenSSL through 1.0.1i".
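
The ChangeLog check can be scripted. A minimal sketch, assuming a
hypothetical helper name and a ChangeLog whose first line starts with a
YYYY-MM-DD date; it compares at day granularity only, so a tree dated
exactly on the cut-off day is reported as "not before":

```shell
# Hypothetical helper: succeed if the ChangeLog in directory $1 is dated
# strictly before the ISO date $2 (YYYY-MM-DD).  Returns 2 if no date
# can be read from the first line of the ChangeLog.
changelog_before () {
  d=$(sed -n -e 's/^\([0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]\).*/\1/p' \
          -e 1q "$1/ChangeLog")
  test -n "$d" || return 2
  # Drop the dashes so the dates compare as plain integers (20180922 etc.).
  test "$(printf '%s\n' "$d" | sed 's/-//g')" -lt \
       "$(printf '%s\n' "$2" | sed 's/-//g')"
}
```

For the CVE quoted above, changelog_before path/to/gnulib 2018-09-23
would succeed for a tree whose latest ChangeLog entry is from 2018-09-22
or earlier.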

> Note that it's impossible to match these against package versions due
> to inconsistent versioning scheme.

You mean, a distributor wants to determine which of the coreutils,
findutils, gawk, gettext, etc. packages use a Gnulib from before 2018-09-23?
This is nontrivial, but not because Gnulib does not have a version
number, but because it's shipped as a source-code library - something that
we don't want to change.

Such a distributor would
  - for packages for which they used tarballs, look at the particular file
    in the tarball (e.g. lib/vasnprintf.c); I admit it is tedious;
  - for packages for which they use the git checkout, look at the git
    submodule version (e.g. [1][2]); this is tedious as well.

But I don't see how a versioning scheme would significantly help.

> So as you can see, even though there are no official versioned releases,
> people have to invent and use these to refer to specific gnulib commit
> ranges, and not having any consistency in these schemes results in e.g.
> inability to report vulnerable packages.

I don't see noticeable problems caused by this inconsistency.

> So I suggest to fix this by introducing any kind of upstream versioning.

Are you suggesting that every gnulib commit can be translated to a
version number? There's 'git describe' which does that.

Or are you suggesting that the Gnulib developers pick, say, every 100th
Gnulib commit and assign it a version number? And how would that be useful,
since the consumers upgrade when they like to?

Bruno

[1] https://git.savannah.gnu.org/cgit/poke.git/log/gnulib
[2] https://git.savannah.gnu.org/cgit/gettext.git/log/gnulib
