Re: DEP14 policy for two dots

2016-11-15 Thread Michael Hudson-Doyle
Robie forgot this bit (I think):

[1]
http://lists.alioth.debian.org/pipermail/vcs-pkg-discuss/2016-November/000909.html

Cheers,
mwh

On 16 November 2016 at 05:38, Robie Basak wrote:

> FTR, I answered most questions about "why not dgit?" in the thread I
> just moved to vcs-pkg-discuss only[1].
>
> For some specific questions here:
>
> On Thu, Nov 10, 2016 at 12:31:31AM +, Ian Jackson wrote:
> > dgit can work on Ubuntu too, in a readonly mode.  (It would be nice to
> > make `dgit push' work on Ubuntu but that requires a suitable git
> > server, and some configuration on the dgit client side.)
>
> A key difference with our system is that no infrastructure is required.
> It's distributed (-able), with no special git remote.
>
> > The --skip-patches is a vital difference.
> > Has the decision to use --skip-patches been definitively taken ?
>
> For now, to fulfill our use case, yes. In the general case, no, not at
> all. Probably the best place to enter into this would be in a reply to
> my fuller explanation of the reasons in [1].
>
> > I would like to beg you to reconsider this, in the strongest possible
> > terms.  --skip-patches generates a patches-unapplied tree.
> >
> > A patches-unapplied tree:
>
> Sorry, I missed reference to this list when I wrote [1]. I'll study
> these and consider how they relate to my reasons later.
>
> > If you are making patches-unapplied trees I think you cannot possibly
> > be representing the quilt patch stack of a `3.0 (quilt)' source
> > package as a series of git commits.
>
> Correct. This has not been our goal.
>
> > Representing a `3.0 (quilt)' package that way is desirable, as it
> > means that `git blame' and `git log' can be used to see which patches
> > do what.
>
> In contrast, in our Ubuntu development workflow, what you mention is not
> a requirement, but what is a requirement is to use "git blame" and "git
> log" to see when the quilt patches applied were altered. We don't want
> to drill down as far as you are suggesting; instead, we want the
> information at one level removed.
>
> Robie
>


Re: DEP14 policy for two dots

2016-11-15 Thread Robie Basak
FTR, I answered most questions about "why not dgit?" in the thread I
just moved to vcs-pkg-discuss only[1].

For some specific questions here:

On Thu, Nov 10, 2016 at 12:31:31AM +, Ian Jackson wrote:
> dgit can work on Ubuntu too, in a readonly mode.  (It would be nice to
> make `dgit push' work on Ubuntu but that requires a suitable git
> server, and some configuration on the dgit client side.)

A key difference with our system is that no infrastructure is required.
It's distributed (-able), with no special git remote.

> The --skip-patches is a vital difference.
> Has the decision to use --skip-patches been definitively taken ?

For now, to fulfill our use case, yes. In the general case, no, not at
all. Probably the best place to enter into this would be in a reply to
my fuller explanation of the reasons in [1].

> I would like to beg you to reconsider this, in the strongest possible
> terms.  --skip-patches generates a patches-unapplied tree.
> 
> A patches-unapplied tree:

Sorry, I missed reference to this list when I wrote [1]. I'll study
these and consider how they relate to my reasons later.

> If you are making patches-unapplied trees I think you cannot possibly
> be representing the quilt patch stack of a `3.0 (quilt)' source
> package as a series of git commits.

Correct. This has not been our goal.

> Representing a `3.0 (quilt)' package that way is desirable, as it
> means that `git blame' and `git log' can be used to see which patches
> do what.

In contrast, in our Ubuntu development workflow, what you mention is not
a requirement, but what is a requirement is to use "git blame" and "git
log" to see when the quilt patches applied were altered. We don't want
to drill down as far as you are suggesting; instead, we want the
information at one level removed.

Robie




Re: DEP14 policy for two dots

2016-11-10 Thread Raphael Hertzog
On Wed, 09 Nov 2016, Ian Jackson wrote:
> Raphael Hertzog writes ("Re: DEP14 policy for two dots"):
> > Ok, can you prepare a patch for DEP-14 then? I'll apply it as it looks like
> > a reasonable extension.
> 
> Attached.  FYI I intend to implement this in dgit.

Thanks, committed to the dep svn repository.

Cheers,
-- 
Raphaël Hertzog ◈ Debian Developer

Support Debian LTS: http://www.freexian.com/services/debian-lts.html
Learn to master Debian: http://debian-handbook.info/get/



Re: DEP14 policy for two dots

2016-11-09 Thread Ian Jackson
I forgot one:

Ian Jackson writes ("Re: DEP14 policy for two dots"):
> A patches-unapplied tree:
> 
>  * produces confusing and sometimes misleading output from
>    git grep, or (even if appropriate history is available)
>    with git blame;
> 
>  * cannot be used with `git cherry-pick ';
> 
>  * cannot be used as a basis for `git merge upstream/';
> 
>  * requires that the user not say `git diff upstream/master'
>    but rather that they read patches in debian/patches;
> 
>  * cannot be directly edited by the user;
> 
>  * leaves the git tree dirty after every build with dpkg-buildpackage
>    no matter how careful or tidy the package's build system.

 * when built with the upstream build system (eg, for a GNU package,
   ./configure && make), silently and successfully produces wrong
   output - perhaps dangerously wrong output, such as binaries
   lacking important security patches.

-- 
Ian Jackson <ijack...@chiark.greenend.org.uk>   These opinions are my own.

If I emailed you from an address @fyvzl.net or @evade.org.uk, that is
a private address which bypasses my fierce spamfilter.



Re: DEP14 policy for two dots

2016-11-09 Thread Ian Jackson
Nish Aravamudan writes ("Re: DEP14 policy for two dots"):
> On 09.11.2016 [23:38:30 +], Ian Jackson wrote:
> > Can you confirm what approach you have taken to the representation of
> > Debian source packages as git trees ?  I would like to encourage you to
> > use a representation which is compatible with dgit.
> 
> I think we are fairly compatible. The only difference would be the
> actual git commits themselves, I think, because we treat Launchpad as
> our canonical source of information, while dgit uses the Debian archive
> (aiui).

dgit can work on Ubuntu too, in a readonly mode.  (It would be nice to
make `dgit push' work on Ubuntu but that requires a suitable git
server, and some configuration on the dgit client side.)

> > That is, the git tree object should look exactly like the results of
> > `dpkg-source -x', except that the .pc directory which dpkg-source
> > creates for `3.0 (quilt)' packages is deleted.
> 
> Yes, we use `dpkg-source -x --skip-patches` on an appropriate DSC file
> for both Debian and Ubuntu publications. That is the tree we commit with
> appropriate parents (as determined by Launchpad's publication history
> and the d/changelog file).

The --skip-patches is a vital difference.
Has the decision to use --skip-patches been definitively taken ?

I would like to beg you to reconsider this, in the strongest possible
terms.  --skip-patches generates a patches-unapplied tree.

A patches-unapplied tree:

 * produces confusing and sometimes misleading output from
   git grep, or (even if appropriate history is available)
   with git blame;

 * cannot be used with `git cherry-pick ';

 * cannot be used as a basis for `git merge upstream/';

 * requires that the user not say `git diff upstream/master'
   but rather that they read patches in debian/patches;

 * cannot be directly edited by the user;

 * leaves the git tree dirty after every build with dpkg-buildpackage
   no matter how careful or tidy the package's build system.

For these reasons, dgit's interchange format is patches-applied
trees.  (dgit 2.x supports a maintainer using a patches-unapplied branch
and converts it for publication during dgit push.)

I understand why many Debian maintainers like to use patches-unapplied
trees.  They make a reasonable archival format for a maintainer who:

 - understands the `3.0 (quilt)' format very well;

 - understands their own quilt/git workflow;

 - has chosen to use (and to learn) a specific Debian git workflow
   tool such as git-buildpackage;

 - is willing to read interdiffs occasionally.

These are not properties we should expect of all our users and
downstreams.  They are not even properties we should expect of all our
future developers.

dgit (and git-dpm) show that it is possible to work with, and
interchange, patches-applied git trees.

> > It would probably be nice if the commit history structure of imported
> > source packages was a bit like the dgit imports.  Or better if it were
> > identical, but that's probably too much to ask for because you
> > probably do not want to make dgit 2.x a dependency for your project.
> 
> I will spend some time next week looking at what dgit imports look like,
> but I'm guessing they are not the same currently.

If you are making patches-unapplied trees I think you cannot possibly
be representing the quilt patch stack of a `3.0 (quilt)' source
package as a series of git commits.

Representing a `3.0 (quilt)' package that way is desirable, as it
means that `git blame' and `git log' can be used to see which patches
do what.

> > I encourage you to try out dgit 2.x and see what you think of its
> > efforts for some existing source packages.  Eg `dgit clone libvirt
> > stretch'.
> 
> I will do this soon, to compare, thanks!

The dgit_2.11_all.deb binary package in Debian unstable is very likely
to be directly installable and useable on all supported Ubuntu
versions.  So I think you can just `dpkg -i && apt-get -f install'
(or apt install ./dgit.deb, if your apt can do that).

Anyway, thanks for your consideration of my points, above.

Regards,
Ian.

-- 
Ian Jackson <ijack...@chiark.greenend.org.uk>   These opinions are my own.

If I emailed you from an address @fyvzl.net or @evade.org.uk, that is
a private address which bypasses my fierce spamfilter.



Re: DEP14 policy for two dots

2016-11-09 Thread Nish Aravamudan
On 09.11.2016 [23:38:30 +], Ian Jackson wrote:
> Nish Aravamudan writes ("Re: DEP14 policy for two dots"):
> > Thank you! We will follow the same in the Ubuntu tooling used by the Server
> > Team.
> 
> Great, thanks.
> 
> Can I ask you a rather unrelated question ?  AIUI you are working on
> importing Ubuntu's history into git.  That's great.

Yep, our original use-case was 'Ubuntu merges' where there is some
Ubuntu delta from the Debian package that has to be maintained and
reapplied to new Debian publications.

> Can you confirm what approach you have taken to the representation of
> Debian source packages as git trees ?  I would like to encourage you to
> use a representation which is compatible with dgit.

I think we are fairly compatible. The only difference would be the
actual git commits themselves, I think, because we treat Launchpad as
our canonical source of information, while dgit uses the Debian archive
(aiui).

> That is, the git tree object should look exactly like the results of
> `dpkg-source -x', except that the .pc directory which dpkg-source
> creates for `3.0 (quilt)' packages is deleted.

Yes, we use `dpkg-source -x --skip-patches` on an appropriate DSC file
for both Debian and Ubuntu publications. That is the tree we commit with
appropriate parents (as determined by Launchpad's publication history
and the d/changelog file).

> It would probably be nice if the commit history structure of imported
> source packages was a bit like the dgit imports.  Or better if it were
> identical, but that's probably too much to ask for because you
> probably do not want to make dgit 2.x a dependency for your project.

I will spend some time next week looking at what dgit imports look like,
but I'm guessing they are not the same currently.

> I encourage you to try out dgit 2.x and see what you think of its
> efforts for some existing source packages.  Eg `dgit clone libvirt
> stretch'.

I will do this soon, to compare, thanks!

-Nish

P.S. Totally a plug, but I will be hosting a UOS talk about the importer
and git trees next week:
http://summit.ubuntu.com/uos-1611/meeting/22710/git-based-merge-workflow/

-- 
Nishanth Aravamudan
Ubuntu Server
Canonical Ltd



Re: DEP14 policy for two dots

2016-11-09 Thread Ian Jackson
Nish Aravamudan writes ("Re: DEP14 policy for two dots"):
> Thank you! We will follow the same in the Ubuntu tooling used by the Server
> Team.

Great, thanks.

Can I ask you a rather unrelated question ?  AIUI you are working on
importing Ubuntu's history into git.  That's great.

Can you confirm what approach you have taken to the representation of
Debian source packages as git trees ?  I would like to encourage you to
use a representation which is compatible with dgit.

That is, the git tree object should look exactly like the results of
`dpkg-source -x', except that the .pc directory which dpkg-source
creates for `3.0 (quilt)' packages is deleted.

It would probably be nice if the commit history structure of imported
source packages was a bit like the dgit imports.  Or better if it were
identical, but that's probably too much to ask for because you
probably do not want to make dgit 2.x a dependency for your project.

I encourage you to try out dgit 2.x and see what you think of its
efforts for some existing source packages.  Eg `dgit clone libvirt
stretch'.

Thanks,
Ian.

-- 
Ian Jackson <ijack...@chiark.greenend.org.uk>   These opinions are my own.

If I emailed you from an address @fyvzl.net or @evade.org.uk, that is
a private address which bypasses my fierce spamfilter.



Re: DEP14 policy for two dots

2016-11-09 Thread Nish Aravamudan
On 09.11.2016 [21:27:14 +], Ian Jackson wrote:
> Raphael Hertzog writes ("Re: DEP14 policy for two dots"):
> > Ok, can you prepare a patch for DEP-14 then? I'll apply it as it looks like
> > a reasonable extension.
> 
> Attached.  FYI I intend to implement this in dgit.

Thank you! We will follow the same in the Ubuntu tooling used by the Server
Team.

-- 
Nishanth Aravamudan
Ubuntu Server
Canonical Ltd



Re: DEP14 policy for two dots

2016-11-09 Thread Ian Jackson
Raphael Hertzog writes ("Re: DEP14 policy for two dots"):
> Ok, can you prepare a patch for DEP-14 then? I'll apply it as it looks like
> a reasonable extension.

Attached.  FYI I intend to implement this in dgit.

Thanks,
Ian.

From 5c63400e9be8cb1532515764a1179730aed550fb Mon Sep 17 00:00:00 2001
From: Ian Jackson <ijack...@chiark.greenend.org.uk>
Date: Wed, 9 Nov 2016 18:36:23 +
Subject: [PATCH] DEP-14: Version -> refname mangling: Escape dots

Signed-off-by: Ian Jackson <ijack...@chiark.greenend.org.uk>
---
 deps/dep14.mdwn | 25 ++++++++++++++++++++++---
 1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/deps/dep14.mdwn b/deps/dep14.mdwn
index 4c7ce63..a7328a4 100644
--- a/deps/dep14.mdwn
+++ b/deps/dep14.mdwn
@@ -3,7 +3,7 @@
 Title: Recommended layout for Git packaging repositories
 DEP: 14
 State: DRAFT
-Date: 2014-11-04
+Date: 2016-11-09
 Drivers: Raphael Hertzog <hert...@debian.org>
 URL: http://dep.debian.net/deps/dep14
 Source: http://anonscm.debian.org/viewvc/dep/web/deps/dep14.mdwn
@@ -60,8 +60,26 @@ Version mangling
 
 When a Git tag needs to refer to a specific version of a Debian package,
 the Debian version needs to be mangled to cope with Git's restrictions.
-The colon (`:`) needs to be replaced with a percent (`%`), and the tilde
-(`~`) needs to be replaced with an underscore (`_`).
+This mangling is deterministic and reversible:
+
+ * Each colon (`:`) is replaced with a percent (`%`)
+ * Each tilde (`~`) is replaced with an underscore (`_`)
+ * A hash (`#`) is inserted between each pair of adjacent dots (`..`)
+ * A hash (`#`) is appended if the last character is a dot (`.`)
+ * If the version ends in precisely `.lock`
+   (dot `l` `o` `c` `k`, lowercase, at the end of the version),
+   a hash (`#`) is inserted after the dot, giving `.#lock`.
+
+This can be expressed concisely in the following Perl5 statements:
+
+ y/:~/%_/;
+ s/\.(?=\.|$|lock$)/.#/g;
+
+The reverse transformation is:
+
+ * Each percent (`%`) is replaced with a colon (`:`)
+ * Each underscore (`_`) is replaced with a tilde (`~`)
+ * Each hash (`#`) is deleted
 
 Packaging branches and tags
 ===
@@ -274,3 +292,4 @@ Changes
 ===
 
 * 2014-11-05: Initial draft by Raphaël Hertzog.
+* 2016-11-09: Extended version mangling to troublesome dots - Ian Jackson.
-- 
2.10.2
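
The mangling and its reverse, as specified in the patch above, can be sketched in Python. This is only an illustrative rendering of the two Perl statements, not code from DEP-14 or dgit; the function names `mangle` and `unmangle` are assumptions made for the example.

```python
import re

def mangle(version: str) -> str:
    """Debian version -> git tag fragment, following the patch's rules."""
    # y/:~/%_/  : colon becomes percent, tilde becomes underscore
    out = version.translate(str.maketrans(":~", "%_"))
    # s/\.(?=\.|$|lock$)/.#/g  : add '#' after each troublesome dot.
    # \Z is used instead of $ so that only the true end of string matches.
    return re.sub(r"\.(?=\.|\Z|lock\Z)", ".#", out)

def unmangle(tag: str) -> str:
    """Reverse transformation: delete every '#', then map % and _ back."""
    return tag.replace("#", "").translate(str.maketrans("%_", ":~"))

for version in ("1:8.30..2~rc1", "1.lock", "8.30."):
    tag = mangle(version)
    print(version, "=>", tag, "=>", unmangle(tag))
```

Because `#`, `%` and `_` cannot occur in a Debian version, deleting every `#` and mapping the other two back recovers the original string in one pass.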



-- 
Ian Jackson <ijack...@chiark.greenend.org.uk>   These opinions are my own.

If I emailed you from an address @fyvzl.net or @evade.org.uk, that is
a private address which bypasses my fierce spamfilter.



Re: DEP14 policy for two dots

2016-11-09 Thread Ian Jackson
Raphael Hertzog writes ("Re: DEP14 policy for two dots"):
> On Tue, 08 Nov 2016, Ian Jackson wrote:
> > > The reverse rule is to convert _ and % and delete all #.
> > 
> > Quoted for completeness.
> 
> Ok, can you prepare a patch for DEP-14 then? I'll apply it as it looks like
> a reasonable extension.

Will do, thanks.

Ian.

-- 
Ian Jackson <ijack...@chiark.greenend.org.uk>   These opinions are my own.

If I emailed you from an address @fyvzl.net or @evade.org.uk, that is
a private address which bypasses my fierce spamfilter.



Re: DEP14 policy for two dots

2016-11-08 Thread Ian Jackson
Ian Jackson writes ("Re: DEP14 policy for two dots"):
> Raphael Hertzog writes ("Re: DEP14 policy for two dots"):
> > On Fri, 04 Nov 2016, Ian Jackson wrote:
> > > My proposal is reversible.  It does not need to be extensible.
> > 
> > So what about "..."? Would it give ".#.#."?
> 
> Yes.  I said (fixing my bug):
> 
>  > Insert "#":
>  >   - between each pair of adjacent dots
>  >   - after any trailing dot
>  >   - before any leading dot
>      - after the `.' of a trailing `.lock'
> 
> The latter is necessary because git reserves .lock.  (!)
> The summary is `add # after any troublesome dot' (discounting leading
> dots which you say are now illegal in Debian).
> 
> I'm running some exhaustive tests to check that this rule is
> sufficient, because I'm not sure I trust the git docs.

I have now:

 * Read the code in git upstream master.  It's not particularly easy
   to analyse conclusively, but I'm pretty sure it doesn't have any
   special cases which involve longer strings than ".lock".  I felt I
   was able to identify the manpage rule corresponding to each element
   of the logic.

 * Conducted an exhaustive search of all strings of length 6
   or less.  Specifically, I generated all strings of between
   zero and 6 characters from this set (in C notation):

   "0123456789"
   "abcdefghijklmnopqrstuvwxyz"
   "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
   ".-+:~"

   filtered them by whether the `parseversion' function in
   dpkg likes them, escaped them with the following perl
   script

     #!/usr/bin/perl -pw
     use strict;

     y/:~/%_/;
     s/\.(?=\.|$|lock$)/.#/g;

   prepended "refs/tags/" to each one and fed them to
   git-check-ref-format (modified to run in a pipe mode).

   I also verified that when I don't escape ".lock", my exhaustive
   search correctly detects the illegal ref name "refs/tags/1.lock"
   generated by version "1.lock" (and similar).
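
The shape of such a search can be sketched in Python. This is a scaled-down illustration under loud assumptions, not the test that was actually run: the alphabet is reduced to keep the run short, `plausible_version` is a crude hypothetical stand-in for dpkg's `parseversion`, and `ref_ok` checks only the dot, colon and tilde rules rather than calling git-check-ref-format.

```python
import itertools
import re

# Assumption: a reduced alphabet, just large enough to spell ".lock".
ALPHABET = "0lock.:~"
MAX_LEN = 6

def mangle(version, escape_lock=True):
    """Python rendering of the Perl escaping script quoted above."""
    out = version.translate(str.maketrans(":~", "%_"))
    pattern = r"\.(?=\.|\Z|lock\Z)" if escape_lock else r"\.(?=\.|\Z)"
    return re.sub(pattern, ".#", out)

def plausible_version(s):
    """Hypothetical stand-in for parseversion: must start with a digit."""
    return s[:1].isdigit()

def ref_ok(refname):
    """Simplified stand-in for git-check-ref-format (subset of its rules)."""
    if ".." in refname or refname.endswith(".") or re.search(r"[:~]", refname):
        return False
    return not any(part.startswith(".") or part.endswith(".lock")
                   for part in refname.split("/"))

def search(escape_lock):
    """Return every plausible version whose mangled tag is rejected."""
    bad = []
    for length in range(MAX_LEN + 1):
        for chars in itertools.product(ALPHABET, repeat=length):
            version = "".join(chars)
            if plausible_version(version) and not ref_ok(
                    "refs/tags/" + mangle(version, escape_lock)):
                bad.append(version)
    return bad

failures_full = search(escape_lock=True)
failures_without_lock = search(escape_lock=False)
print("full rule:", failures_full)
print("without .lock escaping:", failures_without_lock)
```

Under these simplified checks the full rule yields no rejected names, while dropping the `.lock` escape lets `0.lock` through as a bad ref, which mirrors the control experiment described above.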

> The reverse rule is to convert _ and % and delete all #.

Quoted for completeness.

Ian.

-- 
Ian Jackson <ijack...@chiark.greenend.org.uk>   These opinions are my own.

If I emailed you from an address @fyvzl.net or @evade.org.uk, that is
a private address which bypasses my fierce spamfilter.



Re: DEP14 policy for two dots

2016-11-04 Thread Ian Jackson
Raphael Hertzog writes ("Re: DEP14 policy for two dots"):
> On Fri, 04 Nov 2016, Ian Jackson wrote:
> > My proposal is reversible.  It does not need to be extensible.
> 
> So what about "..."? Would it give ".#.#."?

Yes.  I said (fixing my bug):

 > Insert "#":
 >   - between each pair of adjacent dots
 >   - after any trailing dot
 >   - before any leading dot
     - after the `.' of a trailing `.lock'

The latter is necessary because git reserves .lock.  (!)
The summary is `add # after any troublesome dot' (discounting leading
dots which you say are now illegal in Debian).

I'm running some exhaustive tests to check that this rule is
sufficient, because I'm not sure I trust the git docs.

> What's the rule to apply? if it's just to drop the "#", then yes
> it's reversible in a single step. If it's "s/\.#\./../g" then you need
> to do it multiple times until you no longer find ".#.".

The reverse rule is to convert _ and % and delete all #.
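
The difference between the two candidate reverse rules can be seen on the mangled form of a version with three dots. A minimal Python sketch; the version `1...2` is just an illustrative input:

```python
import re

mangled = "1.#.#.2"  # mangled form of the (illustrative) version "1...2"

# Candidate 1: s/\.#\./../g applied once. Matches cannot overlap, so the
# second '#' survives the first pass and a second pass is needed.
first_pass = re.sub(r"\.#\.", "..", mangled)
second_pass = re.sub(r"\.#\.", "..", first_pass)

# Candidate 2: delete every '#'. One pass is always enough.
deleted = mangled.replace("#", "")

print(first_pass)   # 1..#.2
print(second_pass)  # 1...2
print(deleted)      # 1...2
```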

> > > My suggestion would be to allow "##". 
> > > Thus my personal preference would be to replace ".." with ".#2e#".
> > 
> > This is a bad idea because it (implicitly) makes the conversion
> > nondeterministic.
> 
> We define the conversion rule in DEP-14. We can define it in a
> deterministic way.

If you define it in a deterministic way then it is by definition not
extensible, because all valid version strings have a definitive git
tag representation.

Unless by `extensible' you mean `we can update the rule if we discover
that some of the specified git tag representations are not accepted by
git', or `we can update the rule if the set of valid Debian version
strings is extended'.  But this is true of any proposal, no matter
what the syntax is.

> I wanted something extensible because what's allowed in git ref names
> might evolve. It would not be the first time that a special syntax
> is introduced with a new feature.

I think the git folks are going to try not to further restrict the git
ref name syntax.  After all, if they do restrict it, what about
existing tags with the now-forbidden names ?

> Which of # or = is more likely to be used for a new syntax/feature in git?
> My bet would go for "#" so that "=" is an even better choice.

I think = is more likely to be used for other things (both by git and
by others).

Ian.

-- 
Ian Jackson <ijack...@chiark.greenend.org.uk>   These opinions are my own.

If I emailed you from an address @fyvzl.net or @evade.org.uk, that is
a private address which bypasses my fierce spamfilter.



Re: DEP14 policy for two dots

2016-11-04 Thread Raphael Hertzog
On Fri, 04 Nov 2016, Ian Jackson wrote:
> Raphael Hertzog writes ("Re: DEP14 policy for two dots"):
> > We have defined simple "readable" mappings for the common cases that
> > we encounter frequently. Now if we need mappings for silly things
> > that we don't encounter, I would suggest to use something easily
> > reversible and extendable.
> 
> My proposal is reversible.  It does not need to be extensible.

So what about "..."? Would it give ".#.#."?

What's the rule to apply? if it's just to drop the "#", then yes
it's reversible in a single step. If it's "s/\.#\./../g" then you need
to do it multiple times until you no longer find ".#.".

I wanted something extensible because what's allowed in git ref names
might evolve. It would not be the first time that a special syntax
is introduced with a new feature.

> > My suggestion would be to allow "##". 
> > Thus my personal preference would be to replace ".." with ".#2e#".
> 
> This is a bad idea because it (implicitly) makes the conversion
> nondeterministic.

We define the conversion rule in DEP-14. We can define it in a
deterministic way.

> You might write some rule about which . should be replaced by #2e#
> but it would be easy to misimplement.

It's possible to misimplement almost any rule.

> Also if we are going to introduce an arbitrary codepoint quoting
> system like this it should be identical to quoted-printable (bad as
> that is).

It's limited to ASCII but I guess it's highly unlikely that we will
one day allow unicode in version strings. So yes, that would be an option
as well. Given that "=" is forbidden in version numbers and allowed in git
refnames, it should work.

Which of # or = is more likely to be used for a new syntax/feature in git?
My bet would go for "#" so that "=" is an even better choice.

Cheers,
-- 
Raphaël Hertzog ◈ Debian Developer

Support Debian LTS: http://www.freexian.com/services/debian-lts.html
Learn to master Debian: http://debian-handbook.info/get/



Re: DEP14 policy for two dots

2016-11-04 Thread Ian Jackson
Raphael Hertzog writes ("Re: DEP14 policy for two dots"):
> We have defined simple "readable" mappings for the common cases that
> we encounter frequently. Now if we need mappings for silly things
> that we don't encounter, I would suggest to use something easily
> reversible and extendable.

My proposal is reversible.  It does not need to be extensible.
(Although before we adopt it it does need a review to make sure that I
have read the manuals correctly.  I see that I have failed to specify
s/\.lock$/.#lock/.)

Debian version numbers can contain only:

 ASCII alphanumerics   Permitted freely in git ref names

 + -                   Permitted freely in git ref names

 : ~                   Forbidden in git ref names; replaced by
                       % and _ which are permitted freely by git

 .                     Permitted in git ref names subject to
                       restrictions

> My suggestion would be to allow "##". 
> Thus my personal preference would be to replace ".." with ".#2e#".

This is a bad idea because it (implicitly) makes the conversion
nondeterministic.  It is also unnecessary to consider unicode code
points other than 7-bit ASCII because Debian version numbers may
contain only 7-bit ASCII.

You might write some rule about which . should be replaced by #2e#
but it would be easy to misimplement.

Also if we are going to introduce an arbitrary codepoint quoting
system like this it should be identical to quoted-printable (bad as
that is).

> No, a version can't start with a dot, at least dpkg has been ensuring
> this for a few years now.

Then that part of my proposed rule is a harmless nullity.

Ian.

-- 
Ian Jackson <ijack...@chiark.greenend.org.uk>   These opinions are my own.

If I emailed you from an address @fyvzl.net or @evade.org.uk, that is
a private address which bypasses my fierce spamfilter.



Re: DEP14 policy for two dots

2016-11-04 Thread Raphael Hertzog
Hello,

On Thu, 03 Nov 2016, Ian Jackson wrote:
>   %   Reusing this is tempting because an epoch separator can never
>       follow `.', so any `%' after any `.' would unambiguously mean
>       `escape for dot rather than colon'.  But in principle `.' can
>       occur at the start of the version, so `:3' and `.3' both =>
>       `%3'.  There would have to be some horror of an exception rule.
>       (Although `:3' and `3' compare equal as Debian versions, they
>       are different textual strings and the tag needs to convey the
>       whole string.)

No, a version can't start with a dot, at least dpkg has been ensuring
this for a few years now.

$ dpkg --compare-versions .1 eq 0
dpkg: warning: version '.1' has bad syntax: version number does not start with digit

Here it's only a warning, but when building a package or trying to
install a package, it's a failure.

So the problem is not with the colon of the epoch, but there can be
confusion with "0:1..1" and "0:1.:1".

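The collision can be shown with a small Python sketch. The function `mangle_percent_reuse` is a hypothetical reading of the "reuse %" option (a dot that follows a dot becomes `%`), written only for this example; `mangle_hash` follows the "insert #" rule proposed earlier in the thread.

```python
import re

def mangle_percent_reuse(version):
    """Hypothetical variant: '%' stands for ':' and for a dot after a dot."""
    return version.replace("..", ".%").replace(":", "%")

def mangle_hash(version):
    """The '#' proposal: ':' -> '%', '~' -> '_', '#' after troublesome dots."""
    out = version.translate(str.maketrans(":~", "%_"))
    return re.sub(r"\.(?=\.|\Z)", ".#", out)

a, b = "0:1..1", "0:1.:1"
print(mangle_percent_reuse(a), mangle_percent_reuse(b))  # same tag twice
print(mangle_hash(a), mangle_hash(b))                    # two distinct tags
```

With `%` reused, both versions map to the same tag, so the mapping cannot be reversed; with `#` they stay distinct.
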
We have defined simple "readable" mappings for the common cases that
we encounter frequently. Now if we need mappings for silly things
that we don't encounter, I would suggest to use something easily
reversible and extendable.

My suggestion would be to allow "##". 

Thus my personal preference would be to replace ".." with ".#2e#".

Cheers,
-- 
Raphaël Hertzog ◈ Debian Developer

Support Debian LTS: http://www.freexian.com/services/debian-lts.html
Learn to master Debian: http://debian-handbook.info/get/



Re: DEP14 policy for two dots

2016-11-03 Thread Nish Aravamudan
On 03.11.2016 [19:37:41 +], Ian Jackson wrote:
> Nish Aravamudan writes ("DEP14 policy for two dots"):
> > [ Raphael, apologies for sending twice, had an error in the headers in
> > the prior one ]
> > 
> > Not sure exactly where to ask this better than debian-devel, but I am
> > working on an importer for the Ubuntu Server team which parses published
> > versions of source packages in Debian and Ubuntu. I ran into an issue
> > today where there is a published version of src:pcre3 with version
> > '8.30..2'. `man git-check-ref-format` says that reference names "cannot
> > have two consecutive dots ..  anywhere." DEP14 specifies appropriate
> > substitutions for : and ~, but it seems like .. should also be accounted
> > for so I can correctly tag historic versions?
> 
> Urk.  How exciting.  I think we may need a more general escaping
> scheme for these and other weirdnesses.
> 
> I have an interest as dgit uses DEP-14 tag escaping.  I have CC'd the
> vcs-pkg list.

Thank you, I should have thought of that!

> tl;dr: I think we should insert `#' characters as needed.

Thank you as well for your excellent analysis of the options. I think
the proposal 

>    Proposed rule:
> 
>    Insert "#":
>       - between each pair of adjacent dots
>       - after any trailing dot
>       - before any leading dot
> 
>    Examples:
>      8.30..2 => 8.30.#.2
>      8.30.   => 8.30.#
>      .42     => #.42

is very reasonable and I assume could be added to DEP14 itself?
Presuming broader agreement, of course. For the purpose of 'ubuntu/'
tagging in our tool, we would adopt the same rule.

Thanks,
Nish

-- 
Nishanth Aravamudan
Ubuntu Server
Canonical Ltd



Re: DEP14 policy for two dots

2016-11-03 Thread Ian Jackson
Nish Aravamudan writes ("DEP14 policy for two dots"):
> [ Raphael, apologies for sending twice, had an error in the headers in
> the prior one ]
> 
> Not sure exactly where to ask this better than debian-devel, but I am
> working on an importer for the Ubuntu Server team which parses published
> versions of source packages in Debian and Ubuntu. I ran into an issue
> today where there is a published version of src:pcre3 with version
> '8.30..2'. `man git-check-ref-format` says that reference names "cannot
> have two consecutive dots ..  anywhere." DEP14 specifies appropriate
> substitutions for : and ~, but it seems like .. should also be accounted
> for so I can correctly tag historic versions?

Urk.  How exciting.  I think we may need a more general escaping
scheme for these and other weirdnesses.

I have an interest as dgit uses DEP-14 tag escaping.  I have CC'd the
vcs-pkg list.


tl;dr: I think we should insert `#' characters as needed.


Looking at git-check-ref-format(1) and
https://wiki.debian.org/Punctuation:

  .    special to git, generally permitted in versions,
       and we want it usually to be literal - this is our problem

  ~    special to git, permitted in versions, handled by DEP-14 as _
  :    special to git, epoch in versions, handled by DEP-14 as %

  @    special to git (although sometimes allowed), forbidden in versions

  % _  not special to git but already used by DEP-14

  # , =
       not mentioned in the git manual as special, forbidden in versions

  ]    not special to git, although [ is so let's not, eh ?

  + -  not special to git, permitted in versions

  " ' $ & ( ) * ; < > ? `
       not mentioned in the git manual but troublesome shell
       metacharacters which we would be insane to use here

  [ / { }
       interpreted specially by git some of the time,
       forbidden in versions - not really useful

  ^ ? * \
       all of these are forbidden by git, not permitted in versions

So I think in fact the only thing we have a problem with is multiple
dots.  Looking at the summary above, we have the choice of one of
these:

  #   Its use as a shell comment character is fine, because when inside
      a version tag it is always preceded by some string like
      "debian/" or "upstream/".  We would almost never need to put it
      at the start of the encoded version string anyway, and we have
      already tolerated a similar situation with ~.

      There is possible confusion with HTML fragment identifiers, and
      possibly in languages other than shell which use # for
      comments (although hopefully they aren't dealing with our versions
      as literals anyway).

      Proposed rule:

      Insert "#":
        - between each pair of adjacent dots
        - after any trailing dot
        - before any leading dot

      Examples:
        8.30..2 => 8.30.#.2
        8.30.   => 8.30.#
        .42     => #.42

  ,   I would like to avoid this because lots of people are probably
      using it as a list separator in ways that are difficult for us
      to predict.  If we used it, I would suggest the same as for #.

  =   In principle we could use this.  I don't like it for a similar
      reason to above.  If we did use it it might look a bit like
      Q-P encoding in some contexts.

  @   We could use this although I wouldn't like to rely on the fact
      that git dislikes `@{' and `@' but not @ followed by other
      things.

  %   Reusing this is tempting because an epoch separator can never
      follow `.', so any `%' after any `.' would unambiguously mean
      `escape for dot rather than colon'.  But in principle `.' can
      occur at the start of the version, so `:3' and `.3' both =>
      `%3'.  There would have to be some horror of an exception rule.
      (Although `:3' and `3' compare equal as Debian versions, they
      are different textual strings and the tag needs to convey the
      whole string.)
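
For concreteness, the proposed `#` rule can be sketched in a few lines of Python. This covers only the three insertions listed under the `#` option; the function name `escape_dots_draft` is an assumption made for the example.

```python
import re

def escape_dots_draft(version: str) -> str:
    """Insert '#' between adjacent dots, after a trailing dot,
    and before a leading dot."""
    # A dot followed by another dot, or by the end of the string, gets
    # a '#' after it.
    escaped = re.sub(r"\.(?=\.|\Z)", ".#", version)
    # A leading dot gets a '#' before it.
    if escaped.startswith("."):
        escaped = "#" + escaped
    return escaped

for version in ("8.30..2", "8.30.", ".42"):
    print(version, "=>", escape_dots_draft(version))
```

Run on the three example versions, this prints the same mappings as the Examples list.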

Ian.

-- 
Ian Jackson <ijack...@chiark.greenend.org.uk>   These opinions are my own.

If I emailed you from an address @fyvzl.net or @evade.org.uk, that is
a private address which bypasses my fierce spamfilter.



DEP14 policy for two dots

2016-11-03 Thread Nish Aravamudan
[ Raphael, apologies for sending twice, had an error in the headers in
the prior one ]

Not sure exactly where to ask this better than debian-devel, but I am
working on an importer for the Ubuntu Server team which parses published
versions of source packages in Debian and Ubuntu. I ran into an issue
today where there is a published version of src:pcre3 with version
'8.30..2'. `man git-check-ref-format` says that reference names "cannot
have two consecutive dots ..  anywhere." DEP14 specifies appropriate
substitutions for : and ~, but it seems like .. should also be accounted
for so I can correctly tag historic versions?

Thanks,
Nish

-- 
Nishanth Aravamudan
Ubuntu Server
Canonical Ltd