Re: DEP14 policy for two dots

2016-11-15 Thread Robie Basak
FTR, I answered most questions about "why not dgit?" in the thread I
just moved to vcs-pkg-discuss only[1].

For some specific questions here:

On Thu, Nov 10, 2016 at 12:31:31AM +, Ian Jackson wrote:
> dgit can work on Ubuntu too, in a readonly mode.  (It would be nice to
> make `dgit push' work on Ubuntu but that requires a suitable git
> server, and some configuration on the dgit client side.)

A key difference with our system is that no infrastructure is required.
It's distributed (-able), with no special git remote.

> The --skip-patches is a vital difference.
> Has the decision to use --skip-patches been definitively taken ?

For now, to fulfill our use case, yes. In the general case, no, not at
all. Probably the best place to enter into this would be in a reply to
my fuller explanation of the reasons in [1].

> I would like to beg you to reconsider this, in the strongest possible
> terms.  --skip-patches generates a patches-unapplied tree.
> 
> A patches-unapplied tree:

Sorry, I missed the reference to this list when I wrote [1]. I'll study
these and consider how they relate to my reasons later.

> If you are making patches-unapplied trees I think you cannot possibly
> be representing the quilt patch stack of a `3.0 (quilt)' source
> package as a series of git commits.

Correct. This has not been our goal.

> Representing a `3.0 (quilt)' package that way is desirable, as it
> means that `git blame' and `git log' can be used to see which patches
> do what.

In contrast, our Ubuntu development workflow does not require what you
describe. What it does require is being able to use "git blame" and "git
log" to see when the applied quilt patches were altered. We don't want
to drill down as far as you are suggesting; instead, we want the
information at one level removed.

Robie


___
vcs-pkg-discuss mailing list
vcs-pkg-discuss@lists.alioth.debian.org
http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/vcs-pkg-discuss

Re: DEP14 policy for two dots

2016-11-10 Thread Raphael Hertzog
On Wed, 09 Nov 2016, Ian Jackson wrote:
> Raphael Hertzog writes ("Re: DEP14 policy for two dots"):
> > Ok, can you prepare a patch for DEP-14 then? I'll apply it as it looks like
> > a reasonable extension.
> 
> Attached.  FYI I intend to implement this in dgit.

Thanks, committed to the dep svn repository.

Cheers,
-- 
Raphaël Hertzog ◈ Debian Developer

Support Debian LTS: http://www.freexian.com/services/debian-lts.html
Learn to master Debian: http://debian-handbook.info/get/


Re: DEP14 policy for two dots

2016-11-09 Thread Ian Jackson
I forgot one:

Ian Jackson writes ("Re: DEP14 policy for two dots"):
> A patches-unapplied tree:
> 
>  * produces confusing and sometimes misleading output from
>    git grep, or (even if appropriate history is available)
>    with git blame;
> 
>  * cannot be used with `git cherry-pick ';
> 
>  * cannot be used as a basis for `git merge upstream/';
> 
>  * requires that the user not say `git diff upstream/master'
>    but rather that they read patches in debian/patches;
> 
>  * cannot be directly edited by the user;
> 
>  * leaves the git tree dirty after every build with dpkg-buildpackage
>    no matter how careful or tidy the package's build system.

   * when built with the upstream build system (eg, for a GNU package,
     ./configure && make), silently and successfully produces wrong
     output - perhaps dangerously wrong output, such as binaries
     lacking important security patches.

-- 
Ian Jackson <ijack...@chiark.greenend.org.uk>   These opinions are my own.

If I emailed you from an address @fyvzl.net or @evade.org.uk, that is
a private address which bypasses my fierce spamfilter.



Re: DEP14 policy for two dots

2016-11-09 Thread Nish Aravamudan
On 09.11.2016 [21:27:14 +], Ian Jackson wrote:
> Raphael Hertzog writes ("Re: DEP14 policy for two dots"):
> > Ok, can you prepare a patch for DEP-14 then? I'll apply it as it looks like
> > a reasonable extension.
> 
> Attached.  FYI I intend to implement this in dgit.

Thank you! We will follow the same in the Ubuntu tooling used by the Server
Team.

-- 
Nishanth Aravamudan
Ubuntu Server
Canonical Ltd



Re: DEP14 policy for two dots

2016-11-09 Thread Ian Jackson
Raphael Hertzog writes ("Re: DEP14 policy for two dots"):
> Ok, can you prepare a patch for DEP-14 then? I'll apply it as it looks like
> a reasonable extension.

Attached.  FYI I intend to implement this in dgit.

Thanks,
Ian.

From 5c63400e9be8cb1532515764a1179730aed550fb Mon Sep 17 00:00:00 2001
From: Ian Jackson <ijack...@chiark.greenend.org.uk>
Date: Wed, 9 Nov 2016 18:36:23 +
Subject: [PATCH] DEP-14: Version -> refname mangling: Escape dots

Signed-off-by: Ian Jackson <ijack...@chiark.greenend.org.uk>
---
 deps/dep14.mdwn | 25 ++++++++++++++++++++++---
 1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/deps/dep14.mdwn b/deps/dep14.mdwn
index 4c7ce63..a7328a4 100644
--- a/deps/dep14.mdwn
+++ b/deps/dep14.mdwn
@@ -3,7 +3,7 @@
 Title: Recommended layout for Git packaging repositories
 DEP: 14
 State: DRAFT
-Date: 2014-11-04
+Date: 2016-11-09
 Drivers: Raphael Hertzog <hert...@debian.org>
 URL: http://dep.debian.net/deps/dep14
 Source: http://anonscm.debian.org/viewvc/dep/web/deps/dep14.mdwn
@@ -60,8 +60,26 @@ Version mangling
 
 When a Git tag needs to refer to a specific version of a Debian package,
 the Debian version needs to be mangled to cope with Git's restrictions.
-The colon (`:`) needs to be replaced with a percent (`%`), and the tilde
-(`~`) needs to be replaced with an underscore (`_`).
+This mangling is deterministic and reversible:
+
+ * Each colon (`:`) is replaced with a percent (`%`)
+ * Each tilde (`~`) is replaced with an underscore (`_`)
+ * A hash (`#`) is inserted between each pair of adjacent dots (`..`)
+ * A hash (`#`) is appended if the last character is a dot (`.`)
+ * If the version ends in precisely `.lock`
+   (dot `l` `o` `c` `k`, lowercase, at the end of the version),
+   a hash (`#`) is inserted after the dot, giving `.#lock`.
+
+This can be expressed concisely in the following Perl5 statements:
+
+ y/:~/%_/;
+ s/\.(?=\.|$|lock$)/.#/g;
+
+The reverse transformation is:
+
+ * Each percent (`%`) is replaced with a colon (`:`)
+ * Each underscore (`_`) is replaced with a tilde (`~`)
+ * Each hash (`#`) is deleted
 
 Packaging branches and tags
 ===
@@ -274,3 +292,4 @@ Changes
 ===
 
 * 2014-11-05: Initial draft by Raphaël Hertzog.
+* 2016-11-09: Extended version mangling to troublesome dots - Ian Jackson.
-- 
2.10.2
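
For readers who would rather not parse the Perl, both directions of the mangling in the patch above can be sketched in Python. This is only an illustrative transliteration of the rules in the patch: the names `mangle` and `unmangle` are made up here and are not part of DEP-14 or dgit.

```python
import re

# Hypothetical helper names; DEP-14 specifies the rules, not an API.
_FORWARD = str.maketrans(":~", "%_")
_REVERSE = str.maketrans("%_", ":~", "#")  # third argument: characters to delete

def mangle(version: str) -> str:
    """Debian version -> git tag fragment, following the rules in the patch."""
    s = version.translate(_FORWARD)
    # Add '#' after any dot that is followed by another dot, by the end of
    # the string, or by a final "lock".  \Z anchors at the very end.
    return re.sub(r"\.(?=\.|\Z|lock\Z)", ".#", s)

def unmangle(tag: str) -> str:
    """Reverse transformation: map % and _ back, delete every '#'."""
    return tag.translate(_REVERSE)

for v in ["1:2.3~rc1", "8.30..2", "8.30.", "1.lock", "1...2"]:
    t = mangle(v)
    assert unmangle(t) == v  # round trip must give back the version
    print(f"{v:12} -> {t}")
```

With these definitions `8.30..2` becomes `8.30.#.2` and `1.lock` becomes `1.#lock`. The round trip relies on `%`, `_` and `#` never appearing in a Debian version.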



-- 
Ian Jackson <ijack...@chiark.greenend.org.uk>   These opinions are my own.

If I emailed you from an address @fyvzl.net or @evade.org.uk, that is
a private address which bypasses my fierce spamfilter.



Re: DEP14 policy for two dots

2016-11-09 Thread Raphael Hertzog
On Tue, 08 Nov 2016, Ian Jackson wrote:
> > The reverse rule is to convert _ and % and delete all #.
> 
> Quoted for completeness.

Ok, can you prepare a patch for DEP-14 then? I'll apply it as it looks like
a reasonable extension.

Cheers,
-- 
Raphaël Hertzog ◈ Debian Developer

Support Debian LTS: http://www.freexian.com/services/debian-lts.html
Learn to master Debian: http://debian-handbook.info/get/


Re: DEP14 policy for two dots

2016-11-08 Thread Ian Jackson
Ian Jackson writes ("Re: DEP14 policy for two dots"):
> Raphael Hertzog writes ("Re: DEP14 policy for two dots"):
> > On Fri, 04 Nov 2016, Ian Jackson wrote:
> > > My proposal is reversible.  It does not need to be extensible.
> > 
> > So what about "..."? Would it give ".#.#."?
> 
> Yes.  I said (fixing my bug):
> 
>  > Insert "#":
>  >    - between each pair of adjacent dots
>  >    - after any trailing dot
>  >    - before any leading dot
>       - after the `.' of a trailing `.lock'
> 
> The latter is necessary because git reserves .lock.  (!)
> The summary is `add # after any troublesome dot' (discounting leading
> dots which you say are now illegal in Debian).
> 
> I'm running some exhaustive tests to check that this rule is
> sufficient, because I'm not sure I trust the git docs.

I have now:

 * Read the code in git upstream master.  It's not particularly easy
   to analyse conclusively, but I'm pretty sure it doesn't have any
   special cases which involve longer strings than ".lock".  I felt I
   was able to identify the manpage rule corresponding to each element
   of the logic.

 * Conducted an exhaustive search of all strings of length 6
   or less.  Specifically, I generated all strings of between
   zero and 6 characters from this set (in C notation):

   "0123456789"
   "abcdefghijklmnopqrstuvwxyz"
   "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
   ".-+:~"

   filtered them by whether the `parseversion' function in
   dpkg likes them, escaped them with the following perl
   script

 #!/usr/bin/perl -pw
 use strict;

 y/:~/%_/;
 s/\.(?=\.|$|lock$)/.#/g;

   prepended "refs/tags/" to each one and fed them to
   git-check-ref-format (modified to run in a pipe mode).

   I also verified that when I don't escape ".lock", my exhaustive
   search correctly detects the illegal ref name "refs/tags/1.lock"
   generated by version "1.lock" (and similar).
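
A scaled-down version of that search can be sketched in Python. Everything here is an approximation and should be read as such: the alphabet is reduced to keep the run short, `plausible_version` is a crude stand-in for dpkg's `parseversion`, and `ref_ok` models only the rules listed in git-check-ref-format(1) rather than calling git itself. All function names are hypothetical.

```python
import itertools
import re

ALPHABET = "1.lock:~"   # reduced alphabet, an assumption to keep the run short
MAX_LEN = 6

def mangle(version, escape_lock=True):
    s = version.translate(str.maketrans(":~", "%_"))
    pattern = r"\.(?=\.|\Z|lock\Z)" if escape_lock else r"\.(?=\.|\Z)"
    return re.sub(pattern, ".#", s)

def plausible_version(v):
    # Crude stand-in for dpkg's parseversion: only require a leading digit.
    return v[:1].isdigit()

def ref_ok(ref):
    # Partial model of the rules in git-check-ref-format(1), not git itself.
    if ".." in ref or "@{" in ref or "//" in ref or ref == "@":
        return False
    if ref.startswith("/") or ref.endswith("/") or ref.endswith("."):
        return False
    if any(c in " ~^:?*[\\" or ord(c) < 32 or ord(c) == 127 for c in ref):
        return False
    return all(not p.startswith(".") and not p.endswith(".lock")
               for p in ref.split("/"))

def bad_refs(escape_lock=True):
    """Return every generated ref that the model above rejects."""
    bad = []
    for n in range(1, MAX_LEN + 1):
        for chars in itertools.product(ALPHABET, repeat=n):
            v = "".join(chars)
            if plausible_version(v):
                ref = "refs/tags/" + mangle(v, escape_lock)
                if not ref_ok(ref):
                    bad.append(ref)
    return bad

print(bad_refs())        # with the full rule
print(bad_refs(False))   # control run without the .lock escape
```

The second call plays the role of the control run described above: with the `.lock` escape switched off, the sketch flags `refs/tags/1.lock`.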

> The reverse rule is to convert _ and % and delete all #.

Quoted for completeness.

Ian.

-- 
Ian Jackson <ijack...@chiark.greenend.org.uk>   These opinions are my own.

If I emailed you from an address @fyvzl.net or @evade.org.uk, that is
a private address which bypasses my fierce spamfilter.



Re: DEP14 policy for two dots

2016-11-04 Thread Ian Jackson
Raphael Hertzog writes ("Re: DEP14 policy for two dots"):
> On Fri, 04 Nov 2016, Ian Jackson wrote:
> > My proposal is reversible.  It does not need to be extensible.
> 
> So what about "..."? Would it give ".#.#."?

Yes.  I said (fixing my bug):

 > Insert "#":
 >    - between each pair of adjacent dots
 >    - after any trailing dot
 >    - before any leading dot
      - after the `.' of a trailing `.lock'

The latter is necessary because git reserves .lock.  (!)
The summary is `add # after any troublesome dot' (discounting leading
dots which you say are now illegal in Debian).

I'm running some exhaustive tests to check that this rule is
sufficient, because I'm not sure I trust the git docs.

> What's the rule to apply? if it's just to drop the "#", then yes
> it's reversible in a single step. If it's "s/\.#\./../g" then you need
> to do it multiple times until you no longer find ".#.".

The reverse rule is to convert _ and % and delete all #.
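
The difference between the two candidate reverse rules can be seen on a small example. The Python below is only an illustration; the function names are invented for it, and it covers just the `#` part of the reversal.

```python
import re

def delete_hashes(tag):
    # The reverse rule stated above, restricted to the '#' part.
    return tag.replace("#", "")

def one_regex_pass(tag):
    # The alternative raised in the quoted question: s/\.#\./../g, applied once.
    return re.sub(r"\.#\.", "..", tag)

tag = "1.#.#.2"               # encoding of the version 1...2
print(delete_hashes(tag))     # single step
print(one_regex_pass(tag))    # still contains a '#', so another pass is needed
```

The regex substitution consumes the middle dot as part of its first match, so the second `.#.` is not seen until the substitution is run again. Deleting every `#` has no such overlap problem.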

> > > My suggestion would be to allow "##". 
> > > Thus my personal preference would be to replace ".." with ".#2e#".
> > 
> > This is a bad idea because it (implicitly) makes the conversion
> > nondeterministic.
> 
> We define the conversion rule in DEP-14. We can define it in a
> deterministic way.

If you define it in a deterministic way then it is by definition not
extensible, because all valid version strings have a definitive git
tag representation.

Unless by `extensible' you mean `we can update the rule if we discover
that some of the specified git tag representations are not accepted by
git', or `we can update the rule if the set of valid Debian version
strings is extended'.  But this is true of any proposal, no matter
what the syntax is.

> I wanted something extensible because what's allowed in git ref names
> might evolve. It would not be the first time that a special syntax
> is introduced with a new feature.

I think the git folks are going to try not to further restrict the git
ref name syntax.  After all, if they do restrict it, what about
existing tags with the now-forbidden names ?

> Which of # or = is more likely to be used for a new syntax/feature in git?
> My bet would go for "#" so that "=" is an even better choice.

I think = is more likely to be used for other things (both by git and
by others).

Ian.

-- 
Ian Jackson <ijack...@chiark.greenend.org.uk>   These opinions are my own.

If I emailed you from an address @fyvzl.net or @evade.org.uk, that is
a private address which bypasses my fierce spamfilter.



Re: DEP14 policy for two dots

2016-11-04 Thread Raphael Hertzog
On Fri, 04 Nov 2016, Ian Jackson wrote:
> Raphael Hertzog writes ("Re: DEP14 policy for two dots"):
> > We have defined simple "readable" mappings for the common cases that
> > we encounter frequently. Now if we need mappings for silly things
> > that we don't encounter, I would suggest to use something easily
> > reversible and extendable.
> 
> My proposal is reversible.  It does not need to be extensible.

So what about "..."? Would it give ".#.#."?

What's the rule to apply? if it's just to drop the "#", then yes
it's reversible in a single step. If it's "s/\.#\./../g" then you need
to do it multiple times until you no longer find ".#.".

I wanted something extensible because what's allowed in git ref names
might evolve. It would not be the first time that a special syntax
is introduced with a new feature.

> > My suggestion would be to allow "##". 
> > Thus my personal preference would be to replace ".." with ".#2e#".
> 
> This is a bad idea because it (implicitly) makes the conversion
> nondeterministic.

We define the conversion rule in DEP-14. We can define it in a
deterministic way.

> You might write some rule about which . should be replaced by #2e#
> but it would be easy to misimplement.

It's possible to misimplement almost any rule.

> Also if we are going to introduce an arbitrary codepoint quoting
> system like this it should be identical to quoted-printable (bad as
> that is).

It's limited to ASCII but I guess it's highly unlikely that we will
one day allow unicode in version strings. So yes, that would be an option
as well. Given that "=" is forbidden in version numbers and allowed in git
refnames, it should work.

Which of # or = is more likely to be used for a new syntax/feature in git?
My bet would go for "#" so that "=" is an even better choice.

Cheers,
-- 
Raphaël Hertzog ◈ Debian Developer

Support Debian LTS: http://www.freexian.com/services/debian-lts.html
Learn to master Debian: http://debian-handbook.info/get/


Re: DEP14 policy for two dots

2016-11-03 Thread Ian Jackson
Nish Aravamudan writes ("DEP14 policy for two dots"):
> [ Raphael, apologies for sending twice, had an error in the headers in
> the prior one ]
> 
> Not sure exactly where to ask this better than debian-devel, but I am
> working on an importer for the Ubuntu Server team which parses published
> versions of source packages in Debian and Ubuntu. I ran into an issue
> today where there is a published version of src:pcre3 with version
> '8.30..2'. `man git-check-ref-format` says that reference names "cannot
> have two consecutive dots ..  anywhere." DEP14 specifies appropriate
> substitutions for : and ~, but it seems like .. should also be accounted
> for so I can correctly tag historic versions?

Urk.  How exciting.  I think we may need a more general escaping
scheme for these and other weirdnesses.

I have an interest as dgit uses DEP-14 tag escaping.  I have CC'd the
vcs-pkg list.


tl;dr: I think we should insert `#' characters as needed.


Looking at git-check-ref-format(1) and
https://wiki.debian.org/Punctuation:

  .    special to git, generally permitted in versions,
       and we want it usually to be literal - this is our problem

  ~    special to git, permitted in versions, handled by DEP-14 as _
  :    special to git, epoch in versions, handled by DEP-14 as %

  @    special to git (although sometimes allowed), forbidden in versions

  % _  not special to git but already used by DEP-14

  # , =
       not mentioned in the git manual as special, forbidden in versions

  ]    not special to git, although [ is so let's not, eh ?

  + -  not special to git, permitted in versions

  " ' $ & ( ) * ; < > ? `
       not mentioned in the git manual but troublesome shell
       metacharacters which we would be insane to use here

  [ / { }
       interpreted specially by git some of the time,
       forbidden in versions - not really useful

  ^ ? * \
       all of these are forbidden by git, not permitted in versions

So I think in fact the only thing we have a problem with is multiple
dots.  Looking at the summary above, we have the choice of one of
these:

  #   Its use as a shell comment character is fine, because when inside
      a version tag it is always preceded by some string like
      "debian/" or "upstream/".  We would almost never need to put it
      at the start of the encoded version string anyway, and we have
      already tolerated a similar situation with ~.

      There is possible confusion with HTML fragment identifiers, and
      possibly in languages other than shell which use # for
      comments (although hopefully they aren't dealing with our versions
      as literals anyway).

      Proposed rule:

        Insert "#":
          - between each pair of adjacent dots
          - after any trailing dot
          - before any leading dot

      Examples:
        8.30..2 => 8.30.#.2
        8.30.   => 8.30.#
        .42     => #.42

  ,   I would like to avoid this because lots of people are probably
      using it as a list separator in ways that are difficult for us
      to predict.  If we used it, I would suggest the same as for #.

  =   In principle we could use this.  I don't like it for a similar
      reason to above.  If we did use it it might look a bit like
      Q-P encoding in some contexts.

  @   We could use this although I wouldn't like to rely on the fact
      that git dislikes `@{' and `@' but not @ followed by other
      things.

  %   Reusing this is tempting because an epoch separator can never
      follow `.', so any `%' after any `.' would unambiguously mean
      `escape for dot rather than colon'.  But in principle `.' can
      occur at the start of the version, so `:3' and `.3' both =>
      `%3'.  There would have to be some horror of an exception rule.
      (Although `:3' and `3' compare equal as Debian versions, they
      are different textual strings and the tag needs to convey the
      whole string.)
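
The rule proposed for `#` in this message can be written down as a few lines of Python, reproducing the three examples given with it. This is only a sketch of the rule as first proposed: the function name is invented here, and the version that went into the DEP-14 patch drops the leading-dot case and adds a case for a trailing `.lock`.

```python
import re

def proposed_escape(version):
    # Hypothetical helper implementing the rule as first proposed:
    # '#' between each pair of adjacent dots and after a trailing dot ...
    s = re.sub(r"\.(?=\.|\Z)", ".#", version)
    # ... and '#' before a leading dot.
    return "#" + s if s.startswith(".") else s

for v in ["8.30..2", "8.30.", ".42"]:
    print(f"{v:8} => {proposed_escape(v)}")
```

Every dot in the original string survives, so deleting all `#` characters recovers the input.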

Ian.

-- 
Ian Jackson    These opinions are my own.

If I emailed you from an address @fyvzl.net or @evade.org.uk, that is
a private address which bypasses my fierce spamfilter.
