Bug#1020248: debian-policy: Clarifying nomenclature for control file names

2023-09-10 Thread Russ Allbery
Guillem Jover writes:

> Seems I missed another file:

>   * .changes:
>       policy → «upload control file» / «Debian changes file»
>       dpkg   → «upload control file» / «.changes control file» /
>                «Debian .changes file» / «Debian changes file»

[...]

> For changes I think something like the following might be a more clear
> option (and has the minor bonus of aligning perfectly on the first
> words! :), with it mentioning explicitly this is about changes being
> uploaded, and that it is a control file (but I'm not sure I'm entirely
> convinced about it):

> * .changes:   «Debian upload changes control files»

[...]

> I've also found instances of «record» and «section» referring to fields
> or stanzas.

[...]

> I also recalled another term that has always seemed very confusing in
> context: «control information files» or «control information area». For
> example in a sentence such as “the control file is a control information
> file in the control information area in a .deb archive”. :) This also
> seems confusing when some of the files in the .deb control member are
> not really “control files” with a deb822(5) format.

> My thinking has been going toward calling these the «metadata files»,
> located in either the «metadata part of the .deb archive» or
> explicitly the «control member of the .deb archive», in contrast to the
> filesystem part. In dpkg I'd be eventually switching to meta/metadata
> and fsys/filesystem, from control or info and data. I've added a patch
> with the proposed change, but again nothing set in stone, and I'm again
> open to discussing pros/cons of this.

> Attached the proposals for discussion/review, and I might again have
> perhaps missed instances or similar.

All of these changes seem straightforward and uncontroversial to me, and
there are huge advantages to using consistent terminology between Policy
and dpkg.  I have applied all of them for the next Policy release.  Thank
you!
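
As an illustration of the «metadata part» / «filesystem part» split quoted above, here is a minimal Python sketch (not dpkg code) that lists the members of a .deb ar archive and labels them with the proposed terms. The helper names (ar_member_names, classify, make_ar) are hypothetical, and the archive built here is a fake with placeholder payloads rather than real tarballs; only the member names follow deb(5).

```python
AR_MAGIC = b"!<arch>\n"


def ar_member_names(data):
    """Return the member names of an ar archive, in order."""
    if not data.startswith(AR_MAGIC):
        raise ValueError("not an ar archive")
    names = []
    offset = len(AR_MAGIC)
    while offset + 60 <= len(data):
        header = data[offset:offset + 60]
        # Name is the first 16 bytes; some ar variants append a "/".
        name = header[0:16].decode("ascii").rstrip(" ").rstrip("/")
        # Size is a 10-byte decimal field starting at byte 48.
        size = int(header[48:58].decode("ascii").strip())
        names.append(name)
        # Member data is padded to an even length.
        offset += 60 + size + (size % 2)
    return names


def classify(name):
    """Map a .deb member name to the terminology proposed in this thread."""
    if name == "debian-binary":
        return "format version"
    if name.startswith("control.tar"):
        return "metadata part (control member)"
    if name.startswith("data.tar"):
        return "filesystem part (data member)"
    return "other"


def make_ar(members):
    """Build a tiny ar archive in memory (illustration only)."""
    out = bytearray(AR_MAGIC)
    for name, payload in members:
        header = "{:<16}{:<12}{:<6}{:<6}{:<8}{:<10}`\n".format(
            name, 0, 0, 0, "100644", len(payload))
        out += header.encode("ascii") + payload
        if len(payload) % 2:
            out += b"\n"
    return bytes(out)


# Placeholder payloads: these are NOT real tarballs.
FAKE_DEB = make_ar([
    ("debian-binary", b"2.0\n"),
    ("control.tar.xz", b"fake"),
    ("data.tar.xz", b"fake!"),
])

for member in ar_member_names(FAKE_DEB):
    print(member, "->", classify(member))
```

The point of the sketch is only that the two tar members map cleanly onto the metadata/filesystem naming; a real tool would go on to unpack control.tar.* to reach the metadata files.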

-- 
Russ Allbery (r...@debian.org)  



Bug#1020248: debian-policy: Clarifying nomenclature for control file names

2022-09-20 Thread Russ Allbery
Guillem Jover writes:

> Ok, I've prepared the attached incremental patch, which only switches
> from paragraph(s) to stanza(s) all over the place.

Thanks, applied.

> I've updated all the specs for consistency. I've updated the footnote to
> swap the preference and to mention paragraph is now discouraged
> nomenclature. I've also updated all «id»s out of consistency, which
> might break links, so I can revert that if you'd prefer.

It looks like it was primarily in the copyright-format specification.  I
think that's fine; we haven't historically tried hard to preserve anchors,
and if we ever did, we should probably use some scheme to assign stable
anchors rather than using the text of the heading.

> And I've preserved the (upper) casing for one of the titles
> (“Stand-alone License Stanza”, although that was not consistent with the
> other titles, such as “Files stanza”, I'm happy to lower case that one).

I personally have been convinced by a co-worker who did the research that
one should stop using title-casing in technical documents, since it's
mostly a US convention, US readers don't mind lowercase, and title-casing
can look weird to European readers.  But that's a fix for another day.

> I've gone one by one, but please review carefully as I might have
> perhaps switched in excess!

Reviewed, and also checked for remaining uses of "paragraph."  Everything
looked good.

-- 
Russ Allbery (r...@debian.org)  



Bug#1020248: debian-policy: Clarifying nomenclature for control file names

2022-09-19 Thread Guillem Jover
Hi!

On Sun, 2022-09-18 at 17:34:57 -0700, Russ Allbery wrote:
> Sean Whitton writes:
> > On Mon 19 Sep 2022 at 12:45AM +02, Guillem Jover wrote:
> >> So, personally, I'd be happy to fully switch to stanza TBH, because it
> >> seems more specific to our use, probably easier to search for, and
> >> it's shorter.
> 
> > I think this is fine for Policy to do.
> 
> I vote for switching to stanza.  Paragraph is going to be confusing when
> talking about package descriptions, which often have multiple paragraphs
> in the normal English meaning of the term.

Ok, I've prepared the attached incremental patch, which only switches
from paragraph(s) to stanza(s) all over the place.

I've updated all the specs for consistency. I've updated the footnote
to swap the preference and to mention paragraph is now discouraged
nomenclature. I've also updated all «id»s out of consistency, which
might break links, so I can revert that if you'd prefer. And I've
preserved the (upper) casing for one of the titles (“Stand-alone
License Stanza”, although that was not consistent with the other
titles, such as “Files stanza”, I'm happy to lower case that one).

I've gone one by one, but please review carefully as I might have
perhaps switched in excess!

Thanks,
Guillem
From 6d02f28eb1f0cd2f7afa75b04691265425122366 Mon Sep 17 00:00:00 2001
From: Guillem Jover 
Date: Mon, 19 Sep 2022 22:33:40 +0200
Subject: [PATCH] Use stanza to refer to deb822 parts instead of paragraph
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The «stanza» name is a commonly used and understood term when referring
to deb822 blocks. Although «paragraph» is commonly used it has the
problem of being confusing as it then makes it hard to distinguish
actual text paragraphs in prose, while «stanza» is a very specific
term that is not applied anywhere else in the deb822 context, so it's
always more clear and specific.

In addition «stanza» is shorter, which is always a nice attribute
on code for example.

The references in dpkg documentation and code, will be updated shortly,
so that there is uniform nomenclature used.

Fixes: #1020248
---
 autopkgtest.md |   2 +-
 copyright-format-1.0.xml   | 116 -
 policy/ch-controlfields.rst|  46 ++---
 policy/upgrading-checklist.rst |   8 +--
 4 files changed, 87 insertions(+), 85 deletions(-)

diff --git a/autopkgtest.md b/autopkgtest.md
index bc7bdaf..74d6885 100644
--- a/autopkgtest.md
+++ b/autopkgtest.md
@@ -219,7 +219,7 @@ debian/control by adding
 
 XS-Testsuite: autopkgtest
 
-in the `Source:` paragraph.
+in the `Source:` stanza.
 
 Implicit test control file for known package types
 --
diff --git a/copyright-format-1.0.xml b/copyright-format-1.0.xml
index d5d2bbe..954a65b 100644
--- a/copyright-format-1.0.xml
+++ b/copyright-format-1.0.xml
@@ -115,17 +115,17 @@
   The syntax of the file is the same as for other Debian control files, as
   specified in the Debian Policy Manual.  See its https://www.debian.org/doc/debian-policy/ch-controlfields#s-controlsyntax;>section
-  5.1 for details. Extra fields can be added to any paragraph.  No
+  5.1 for details. Extra fields can be added to any stanza.  No
   prefixing is necessary or desired, but please avoid names similar to
   standard ones so that mistakes are easier to catch.  Future versions of
   the debian/copyright specification will attempt to
   avoid conflicting specifications for widely used extra fields.
 
 
-  The file consists of two or more paragraphs.  At minimum, the file
-  must include one header
-  paragraph and one Files
-  paragraph.
+  The file consists of two or more stanzas.  At minimum, the file
+  must include one header
+  stanza and one Files
+  stanza.
 
 
   There are four types of fields.  The definition for each field in this
@@ -184,22 +184,22 @@
 
   
 
-  
-Paragraphs
+  
+Stanzas
 
-  There are three kinds of paragraphs.  The first paragraph in the file
-  is called the header paragraph.
-  Every other paragraph is either a Files paragraph or a stand-alone License
-  paragraph.  This is similar to source and binary package
-  paragraphs in debian/control files.
+  There are three kinds of stanzas.  The first stanza in the file
+  is called the header stanza.
+  Every other stanza is either a Files stanza or a stand-alone License
+  stanza.  This is similar to source and binary package
+  stanzas in debian/control files.
 
 
-
-  Header paragraph (once)
+
+  Header stanza (once)
   
-The following fields may be present in a header paragraph.
+The following fields may be present in a header stanza.
   
   
 
@@ -249,9 +249,9 @@
   
   
 The Copyright and License
-fields in 

Bug#1020248: debian-policy: Clarifying nomenclature for control file names

2022-09-18 Thread Sean Whitton
Hello,

On Sun 18 Sep 2022 at 05:34PM -07, Russ Allbery wrote:

> Sean Whitton writes:
>> On Mon 19 Sep 2022 at 12:45AM +02, Guillem Jover wrote:
>
>>> So, personally, I'd be happy to fully switch to stanza TBH, because it
>>> seems more specific to our use, probably easier to search for, and
>>> it's shorter.
>
>> I think this is fine for Policy to do.
>
> I vote for switching to stanza.  Paragraph is going to be confusing when
> talking about package descriptions, which often have multiple paragraphs
> in the normal English meaning of the term.

Yes, I had this example in mind too.

-- 
Sean Whitton




Bug#1020248: debian-policy: Clarifying nomenclature for control file names

2022-09-18 Thread Russ Allbery
Guillem Jover writes:

> I've been considering naming debian/control something like
> «Debian template source package control file», as that is used to
> generate both the source and binary control files. And always
> prefixing with Debian, so that would end up as:

>   * debian/control: «Debian source package template control file»
>   * .dsc:   «Debian source package control file»
>   * DEBIAN/control: «Debian binary package control file»

> This also removes the «master» usage in dpkg, for me for the same
> reasons as I covered at
> .

I like this.  It took a bit for my brain to adjust to it because
"template" felt wrong, but the more I thought about it, the more I think
that's correct and it's pointing out an error in my default way of
thinking about packages.

> File contents
> -------------

> We have references to the various parts being called as «paragraphs»,
> «stanza», «blocks», but this seems to be more of an issue with dpkg, as
> the usage in the Debian policy is quite clear and uniform now, so I'll
> at least try to remove the «block» usage there, stanza has the nice
> property of being shorter and policy already mentions that this is
> currently a common alias, so I might keep paragraph and stanza for now
> in dpkg.

> The other thing affecting dpkg and debian-policy is how the parts
> within the control files are referred to. We have for example:

>   dpkg   → «general section of control info file»
>            «source stanza»
>   policy → «general paragraph»

>   dpkg   → «package's section of control info file»
>   policy → «binary package paragraphs»

> So, how does «source package paragraph» and «binary package paragraph»
> (of the «template control file») sound instead?

As mentioned in the other thread, I think source package stanza and binary
package stanza (of the template control file) sound great.

Obviously a patch to Policy would be delightful, but it's not blocking.
Just let us know if that's more than you have time for.

-- 
Russ Allbery (r...@debian.org)  



Bug#1020248: debian-policy: Clarifying nomenclature for control file names

2022-09-18 Thread Russ Allbery
Sean Whitton writes:
> On Mon 19 Sep 2022 at 12:45AM +02, Guillem Jover wrote:

>> So, personally, I'd be happy to fully switch to stanza TBH, because it
>> seems more specific to our use, probably easier to search for, and
>> it's shorter.

> I think this is fine for Policy to do.

I vote for switching to stanza.  Paragraph is going to be confusing when
talking about package descriptions, which often have multiple paragraphs
in the normal English meaning of the term.
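
To make the ambiguity concrete, here is a minimal Python sketch, assuming the usual Description layout from Policy (a synopsis line, then continuation lines, with a line containing only " ." separating prose paragraphs). The function name description_paragraphs and the sample text are hypothetical.

```python
def description_paragraphs(value):
    """Split a Description field value into (synopsis, prose paragraphs)."""
    lines = value.split("\n")
    synopsis = lines[0].strip()
    paragraphs = []
    current = []
    for line in lines[1:]:
        # Continuation lines start with one space; drop it.
        text = line[1:] if line.startswith(" ") else line
        if text.strip() == ".":
            # A lone "." marks a blank line between prose paragraphs.
            if current:
                paragraphs.append(" ".join(current))
                current = []
        else:
            current.append(text.strip())
    if current:
        paragraphs.append(" ".join(current))
    return synopsis, paragraphs


# One field of ONE stanza, yet two paragraphs in the English sense.
DESCRIPTION = (
    "example greeting program\n"
    " This first prose paragraph describes the program.\n"
    " .\n"
    " This second prose paragraph adds more detail."
)

synopsis, paragraphs = description_paragraphs(DESCRIPTION)
print(synopsis)
print(len(paragraphs), "prose paragraphs inside a single stanza")
```

With "paragraph" reserved for prose like this and "stanza" for the deb822 block, the two never collide.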

-- 
Russ Allbery (r...@debian.org)  



Bug#1020248: debian-policy: Clarifying nomenclature for control file names

2022-09-18 Thread Sean Whitton
Hello,

On Mon 19 Sep 2022 at 12:45AM +02, Guillem Jover wrote:

> I went for paragraph, because dpkg has some instances of it already in
> docs and code (and stanza only in code), and mainly because the Debian
> policy uses almost exclusively paragraph for this with a single
> mention of "stanza" in a footnote to mention it's a common alias or
> similar.

Hmm, I see.

> So, personally, I'd be happy to fully switch to stanza TBH, because it
> seems more specific to our use, probably easier to search for, and
> it's shorter.

I think this is fine for Policy to do.

-- 
Sean Whitton




Bug#1020248: debian-policy: Clarifying nomenclature for control file names

2022-09-18 Thread Guillem Jover
On Sun, 2022-09-18 at 14:53:30 -0700, Sean Whitton wrote:
> On Sun 18 Sep 2022 at 10:28PM +02, Guillem Jover wrote:
> 
> > So, how does «source package paragraph» and «binary package paragraph»
> > (of the «template control file») sound instead?
> 
> Can we standardise on 'stanza', please?
>
> I thought that was already standard, and "paragraph" is for prose.

I was also thinking about whether I'd prefer paragraph or stanza, and
the latter seems more specific to deb822 "blocks", and as you say
paragraph seems more for prose.

I went for paragraph, because dpkg has some instances of it already in
docs and code (and stanza only in code), and mainly because the Debian
policy uses almost exclusively paragraph for this with a single
mention of "stanza" in a footnote to mention it's a common alias or
similar.

So, personally, I'd be happy to fully switch to stanza TBH, because
it seems more specific to our use, probably easier to search for, and
it's shorter.

Thanks,
Guillem



Bug#1020248: debian-policy: Clarifying nomenclature for control file names

2022-09-18 Thread Sean Whitton
Hello,

On Sun 18 Sep 2022 at 10:28PM +02, Guillem Jover wrote:

> So, how does «source package paragraph» and «binary package paragraph»
> (of the «template control file») sound instead?

Can we standardise on 'stanza', please?

I thought that was already standard, and "paragraph" is for prose.

-- 
Sean Whitton




Bug#1020248: debian-policy: Clarifying nomenclature for control file names

2022-09-18 Thread Guillem Jover
Package: debian-policy
Version: 4.6.1.1
Severity: wishlist

Hi!

This is a followup from my comment at:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=998165#43

To summarize, we have IMO confusing naming and nomenclature for the
various control files and paragraphs/stanzas, and this is even
confusing me when having to deal with dpkg code, so I'd like to give
these more clear and unambiguous new names, and I'd very strongly
prefer to agree on the same naming for Debian policy and dpkg, to
avoid further and worse confusion (even though they currently do not
match exactly anyway, but I'd prefer to not make it worse…).

Just for reference and to give some context, I've got the following
WIP branches, which try to clarify the names in the documentation and
in the API. I'll probably rework (split/merge) and reword them as
needed, so do not take them as anything set in stone:

  https://git.hadrons.org/git/debian/dpkg/dpkg.git/log/?h=next/clarify-control-filenames
  https://git.hadrons.org/git/debian/dpkg/dpkg.git/log/?h=next/deb822-field-types


File descriptions
-----------------

For example we have:

  * debian/control:
      policy → «Source package control file»
      dpkg   → «Debian source packages' master control file»

  * .dsc:
      policy → «Debian source control file»
      dpkg   → «Debian source packages' control file»

  * DEBIAN/control:
      policy → «Binary package control files»
      dpkg   → «Debian binary packages' master control file»

These are quite confusingly close.

I've been considering naming debian/control something like
«Debian template source package control file», as that is used to
generate both the source and binary control files. And always
prefixing with Debian, so that would end up as:

  * debian/control: «Debian source package template control file»
  * .dsc:   «Debian source package control file»
  * DEBIAN/control: «Debian binary package control file»

This also removes the «master» usage in dpkg, for me for the same
reasons as I covered at
.


File contents
-------------

We have references to the various parts being called as «paragraphs»,
«stanza», «blocks», but this seems to be more of an issue with dpkg, as
the usage in the Debian policy is quite clear and uniform now, so I'll
at least try to remove the «block» usage there, stanza has the nice
property of being shorter and policy already mentions that this is
currently a common alias, so I might keep paragraph and stanza for now
in dpkg.

The other thing affecting dpkg and debian-policy is how the parts
within the control files are referred to. We have for example:

  dpkg   → «general section of control info file»
           «source stanza»
  policy → «general paragraph»

  dpkg   → «package's section of control info file»
  policy → «binary package paragraphs»


So, how does «source package paragraph» and «binary package paragraph»
(of the «template control file») sound instead?
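
For illustration, a minimal Python sketch of how these names map onto a debian/control file: blocks separated by blank lines are split apart, the first is labelled as the source package block and the rest as binary package blocks. This is a simplified reading of the deb822 syntax, not dpkg's parser; the function names and the sample package are hypothetical, and the labels use «stanza», the term the thread later settles on.

```python
def split_stanzas(text):
    """Split deb822-style text into blocks separated by blank lines."""
    stanzas = []
    current = []
    for line in text.splitlines():
        if line.startswith("#"):
            continue  # comment lines, as allowed in debian/control
        if line.strip() == "":
            if current:
                stanzas.append(current)
                current = []
        else:
            current.append(line)
    if current:
        stanzas.append(current)
    return stanzas


def parse_stanza(lines):
    """Turn one block into a dict of field name -> value."""
    fields = {}
    name = None
    for line in lines:
        if line[0] in " \t":
            # Continuation line: belongs to the previous field.
            if name is None:
                raise ValueError("continuation line before any field")
            fields[name] += "\n" + line.strip()
        else:
            name, _, value = line.partition(":")
            fields[name] = value.strip()
    return fields


def label_stanzas(text):
    """Label the first block as source, the others as binary package."""
    labelled = []
    for index, lines in enumerate(split_stanzas(text)):
        kind = "source package stanza" if index == 0 else "binary package stanza"
        labelled.append((kind, parse_stanza(lines)))
    return labelled


# Hypothetical template control file (debian/control).
SAMPLE = """\
Source: hello
Maintainer: Jane Doe <jane@example.org>
Build-Depends: debhelper-compat (= 13)

Package: hello
Architecture: any
Description: example greeting program

Package: hello-doc
Architecture: all
Description: documentation for hello
"""

for kind, fields in label_stanzas(SAMPLE):
    print(kind, "->", fields.get("Source") or fields.get("Package"))
```
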


If I've missed any other problematic nomenclature, I'm happy to
discuss and update those on the dpkg side.

Thanks,
Guillem