Re: [gentoo-dev] [RFC] GLEP 74 post-Council review update [v5]

2017-12-01 Thread Michał Górny
On Fri, 01.12.2017 at 12:30 +0100, Fabian Groffen wrote:
> Hi,
> 
> While trying to implement full tree Manifests for the Prefix tree, I ran
> into the following:
> 
> Would it be possible to add a section to define what directories receive
> what kind of Manifest?
> 
> I mean in particular what is encoded in gemato/profile.py, the metadata
> directory is an interesting mix and match of subdirectories that have a
> Manifest of their own, and subdirectories whose content is included in
> the Manifest at the metadata level.
> 
> More specifically, it seems like in the current GLEP it doesn't mention
> what directories should have their own Manifest or not.  It would be
> good to know if for instance adding Manifest(.gz) to
> metadata/install-qa-check.d is ok as per GLEP or not (and if so, the
> consumer of that directory should be fixed to ignore the Manifest*
> files, instead of barking it can't source the gz file or doesn't get
> it).

It's on purpose, to allow us to create Manifests as we see a need for
them. The GLEP permits every directory to have its own Manifest file.
If some directory can't receive one, it's a limitation enforced
by something else and tracking all of those does not really fit
the purpose of this GLEP.

>   Also, what if someone would want to include all entries in the
> top-level Manifest, would that be OK (albeit stupid I guess)?

It is ok, albeit it will probably be quite slow and memory consuming
when doing partial validation only.

> I think it would be a good addition to specify (for a Gentoo tree) what
> directories receive a Manifest file and what their content is.

No, it wouldn't. It would add a lot of complexity and prevent us from
doing minor modifications without having to update the GLEP. It is
flexible by design, and it should stay that way.

Furthermore, if we specified that then I'm pretty sure some people will
decide it's written in stone and start writing stupid implementations
that rely on presence of Manifest files in some directory and their
absence in other.

> In addition to this, because it is related, it would be nice to also
> document the IGNORE entries that seem present at the top-level and
> metadata-level, or specify where they would come from for the Gentoo
> case.

-- 
Best regards,
Michał Górny




Re: [gentoo-dev] [RFC] GLEP 74 post-Council review update [v5]

2017-12-01 Thread Fabian Groffen
Hi,

While trying to implement full tree Manifests for the Prefix tree, I ran
into the following:

Would it be possible to add a section to define what directories receive
what kind of Manifest?

I mean in particular what is encoded in gemato/profile.py, the metadata
directory is an interesting mix and match of subdirectories that have a
Manifest of their own, and subdirectories whose content is included in
the Manifest at the metadata level.

More specifically, it seems like in the current GLEP it doesn't mention
what directories should have their own Manifest or not.  It would be
good to know if for instance adding Manifest(.gz) to
metadata/install-qa-check.d is ok as per GLEP or not (and if so, the
consumer of that directory should be fixed to ignore the Manifest*
files, instead of barking it can't source the gz file or doesn't get
it).  Also, what if someone would want to include all entries in the
top-level Manifest, would that be OK (albeit stupid I guess)?

I think it would be a good addition to specify (for a Gentoo tree) what
directories receive a Manifest file and what their content is.

In addition to this, because it is related, it would be nice to also
document the IGNORE entries that seem present at the top-level and
metadata-level, or specify where they would come from for the Gentoo
case.

Thanks!
Fabian

On 23-11-2017 21:53:57 +0100, Michał Górny wrote:
> On Thu, 16.11.2017 at 11:19 +0100, Michał Górny wrote:
> > Hi, everyone.
> > 
> > Here's the updated version of GLEP 74 taking into consideration
> > the points made during the Council pre-review.
> > 
> > ReST: https://dev.gentoo.org/~mgorny/tmp/glep-0074.rst
> > HTML: https://dev.gentoo.org/~mgorny/tmp/glep-0074.html
> > 
> > Changes:
> 
> 27c2a9e glep-0074: Grammar corrections from Ulrich Müller
> d39f865 glep-0074: Make extended filename encoding optional
> ed111f8 glep-0074: Always exclude control characters
> 
> ---
> GLEP: 74
> Title: Full-tree verification using Manifest files
> Author: Michał Górny,
>         Robin Hugh Johnson,
>         Ulrich Müller
> Type: Standards Track
> Status: Draft
> Version: 1
> Created: 2017-10-21
> Last-Modified: 2017-11-23
> Post-History: 2017-10-26, 2017-11-16
> Content-Type: text/x-rst
> Requires: 59, 61
> Replaces: 44, 58, 60
> ---
> 
> Abstract
> ========
> 
> This GLEP extends the Manifest file format to cover full-tree file
> integrity and authenticity checks. The format aims to be future-proof,
> efficient and provide means of backwards compatibility.
> 
> 
> Motivation
> ==========
> 
> The Manifest files as defined by GLEP 44 [#GLEP44]_ provide the current
> means of verifying the integrity of distfiles and package files
> in Gentoo. Combined with OpenPGP signatures, they provide means to
> ensure the authenticity of the covered files. However, as noted
> in GLEP 57 [#GLEP57]_ they lack the ability to provide full-tree
> authenticity verification as they do not cover any files outside
> the package directory. In particular, they provide multiple ways
> for a third party to inject malicious code into the ebuild environment.
> 
> Historically, the topic of providing authenticity coverage for the whole
> repository has been mentioned multiple times. The most noteworthy effort
> are GLEPs 58 [#GLEP58]_ and 60 [#GLEP60]_ by Robin H. Johnson from 2008.
> They were accepted by the Council in 2010 but have never been
> implemented. When potential implementation work started in 2017, a new
> discussion about the specification arose. It prompted the creation
> of a competing GLEP that would provide a redesigned alternative to
> the old GLEPs.
> 
> This specification is designed with the following goals in mind:
> 
> 1. It should provide means to ensure the authenticity of the complete
>    repository, including preventing the injection of additional files.
> 
> 2. The format should be universal enough to work both for the Gentoo
>    repository and third-party repositories of different characteristics.
> 
> 3. The Manifest files should be verifiable stand-alone, that is without
>    knowing any details about the underlying repository format.
> 
> 
> Specification
> =============
> 
> Manifest file format
> --------------------
> 
> This specification reuses and extends the Manifest file format defined
> in GLEP 44 [#GLEP44]_. For the purpose of it, the *file type* field is
> repurposed as a generic *tag* that could also indicate additional
> (non-checksum) metadata. Appropriately, those tags can be followed by
> other space-separated values.
> 
> Unless specified otherwise, the paths used in the Manifest files
> are relative to the directory containing the Manifest file. The paths
> must not reference the parent directory (``..``). Forward slash (``/``)
> is used as path component separator.
> 
> The Manifest files use UTF-8 encoding.
> 
> 
> Manifest file locations and nesting
> -----------------------------------

Re: [gentoo-dev] [RFC] GLEP 74 post-Council review update [v5]

2017-11-23 Thread Michał Górny
On Thu, 16.11.2017 at 11:19 +0100, Michał Górny wrote:
> Hi, everyone.
> 
> Here's the updated version of GLEP 74 taking into consideration
> the points made during the Council pre-review.
> 
> ReST: https://dev.gentoo.org/~mgorny/tmp/glep-0074.rst
> HTML: https://dev.gentoo.org/~mgorny/tmp/glep-0074.html
> 
> Changes:

27c2a9e glep-0074: Grammar corrections from Ulrich Müller
d39f865 glep-0074: Make extended filename encoding optional
ed111f8 glep-0074: Always exclude control characters

---
GLEP: 74
Title: Full-tree verification using Manifest files
Author: Michał Górny,
        Robin Hugh Johnson,
        Ulrich Müller
Type: Standards Track
Status: Draft
Version: 1
Created: 2017-10-21
Last-Modified: 2017-11-23
Post-History: 2017-10-26, 2017-11-16
Content-Type: text/x-rst
Requires: 59, 61
Replaces: 44, 58, 60
---

Abstract
========

This GLEP extends the Manifest file format to cover full-tree file
integrity and authenticity checks. The format aims to be future-proof,
efficient and provide means of backwards compatibility.


Motivation
==========

The Manifest files as defined by GLEP 44 [#GLEP44]_ provide the current
means of verifying the integrity of distfiles and package files
in Gentoo. Combined with OpenPGP signatures, they provide means to
ensure the authenticity of the covered files. However, as noted
in GLEP 57 [#GLEP57]_ they lack the ability to provide full-tree
authenticity verification as they do not cover any files outside
the package directory. In particular, they provide multiple ways
for a third party to inject malicious code into the ebuild environment.

Historically, the topic of providing authenticity coverage for the whole
repository has been mentioned multiple times. The most noteworthy effort
are GLEPs 58 [#GLEP58]_ and 60 [#GLEP60]_ by Robin H. Johnson from 2008.
They were accepted by the Council in 2010 but have never been
implemented. When potential implementation work started in 2017, a new
discussion about the specification arose. It prompted the creation
of a competing GLEP that would provide a redesigned alternative to
the old GLEPs.

This specification is designed with the following goals in mind:

1. It should provide means to ensure the authenticity of the complete
   repository, including preventing the injection of additional files.

2. The format should be universal enough to work both for the Gentoo
   repository and third-party repositories of different characteristics.

3. The Manifest files should be verifiable stand-alone, that is without
   knowing any details about the underlying repository format.


Specification
=============

Manifest file format
--------------------

This specification reuses and extends the Manifest file format defined
in GLEP 44 [#GLEP44]_. For the purpose of it, the *file type* field is
repurposed as a generic *tag* that could also indicate additional
(non-checksum) metadata. Appropriately, those tags can be followed by
other space-separated values.

Unless specified otherwise, the paths used in the Manifest files
are relative to the directory containing the Manifest file. The paths
must not reference the parent directory (``..``). Forward slash (``/``)
is used as path component separator.
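As a rough illustration of the path rules above, a verifier might resolve entry paths along these lines. This is only a sketch: the function name ``resolve_entry_path`` is hypothetical, and rejecting absolute paths is an assumption drawn from the "relative to the directory containing the Manifest file" wording rather than an explicit rule.

```python
import posixpath


def resolve_entry_path(manifest_dir: str, entry_path: str) -> str:
    """Resolve a Manifest entry path against the Manifest's own directory.

    Both arguments use forward slashes; manifest_dir is relative to the
    repository root ("" for the top-level Manifest).
    """
    # The paths must not reference the parent directory.
    if ".." in entry_path.split("/"):
        raise ValueError(f"path references parent directory: {entry_path!r}")
    # Assumption: absolute paths are rejected because entries are relative.
    if entry_path.startswith("/"):
        raise ValueError(f"path is not relative: {entry_path!r}")
    return posixpath.normpath(posixpath.join(manifest_dir, entry_path))
```

For example, an entry ``files/fix.patch`` in ``dev-libs/foo/Manifest`` would resolve to ``dev-libs/foo/files/fix.patch``.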

The Manifest files use UTF-8 encoding.


Manifest file locations and nesting
-----------------------------------

The ``Manifest`` file located in the root directory of the repository
is called top-level Manifest, and it is used to perform the full-tree
verification. In order to verify the authenticity, it must be signed
using OpenPGP, using the armored cleartext format.

The top-level Manifest may reference sub-Manifests contained
in subdirectories of the repository. The sub-Manifests are traditionally
named ``Manifest``; however, the implementation must support arbitrary
names, including the possibility of multiple (split) Manifests
for a single directory. The sub-Manifest can only cover the files inside
the directory tree where it resides.

The sub-Manifest can also be signed using OpenPGP armored cleartext
format. However, the signature verification can be omitted since it
already is covered by the signed top-level Manifest.


Directory tree coverage
-----------------------

The specification provides three ways of skipping Manifest verification
of specific files and directories (recursively):

1. explicit ``IGNORE`` entries in Manifest files,

2. injected ignore paths via package manager configuration,

3. using names starting with a dot (``.``) which are always skipped.

All files that are not ignored must be covered by at least one
of the Manifests.
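The three skip rules listed above could be combined into a single check roughly as follows. This is a minimal sketch, not how any existing tool does it: ``is_ignored`` and its parameters are hypothetical names, and the paths are assumed to be slash-separated and relative to the same directory as the ignore entries.

```python
def is_ignored(path, ignore_entries, injected_ignores=()):
    """Return True if verification of `path` should be skipped.

    ignore_entries: paths taken from IGNORE entries in Manifest files.
    injected_ignores: paths injected via package manager configuration.
    """
    # Rule 3: names starting with a dot are always skipped; a dot
    # directory hides everything below it (recursive skipping).
    if any(component.startswith(".") for component in path.split("/")):
        return True
    # Rules 1 and 2: an ignored path covers itself and, if it is a
    # directory, everything inside it.
    for entry in list(ignore_entries) + list(injected_ignores):
        prefix = entry.rstrip("/")
        if path == prefix or path.startswith(prefix + "/"):
            return True
    return False
```

Any file for which such a check returns False would then have to be covered by at least one Manifest.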

A single file may be matched by multiple identical or equivalent
Manifest entries, if and only if the entries have the same semantics,
specify the same size and the checksums common to both entries match.
It is an error for a single file to be matched by multiple entries
of different semantics, file size or checksum 

Re: [gentoo-dev] [RFC] GLEP 74 post-Council review update [v4]

2017-11-22 Thread Ulrich Mueller
> On Wed, 22 Nov 2017, Michał Górny wrote:

> Path and filename encoding
> --------------------------

> The path fields in the Manifest file must consist of characters
> corresponding to valid UTF-8 code points excluding the NULL character
> (``U+0000``), the backwards slash (``\``) and characters classified
> as whitespace in the current version of the Unicode standard
> [#UNICODE]_.

As I said before, all C0 and C1 control characters and DEL should be
excluded as well, i.e. 0x00 to 0x1f, 0x7f, and 0x80 to 0x9f. Allowing
such characters in what is basically a text file is only asking for
trouble.
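To make the proposed exclusion set concrete, a check for characters that would not be allowed literally in a path field might look like the sketch below. The function name is hypothetical, and Python's ``str.isspace()`` is used only as an approximation of the Unicode whitespace classification.

```python
def find_forbidden_chars(path: str) -> list:
    """Return the characters in `path` that may not appear literally."""
    forbidden = []
    for ch in path:
        cp = ord(ch)
        # C0 controls (0x00-0x1f), DEL (0x7f) and C1 controls (0x80-0x9f).
        is_control = cp <= 0x1F or cp == 0x7F or 0x80 <= cp <= 0x9F
        # Backslash and whitespace are excluded by the quoted draft text.
        if is_control or ch == "\\" or ch.isspace():
            forbidden.append(ch)
    return forbidden
```

An empty result means the path contains none of the excluded characters.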

> Any of the excluded characters that are present in path must be encoded
> using one of the following escape sequences:

> - characters in the ``U+0000`` to ``U+007F`` range can be encoded
>   as ``\xHH`` where ``HH`` specifies the zero-padded, hexadecimal
>   character code,

> - characters in the ``U+0000`` to ``U+FFFF`` range can be encoded
>   as ``\uHHHH`` where ``HHHH`` specifies the zero-padded, hexadecimal
>   character code,

> - characters in the UCS-4 range can be encoded as ``\UHHHHHHHH``
>   where ``HHHHHHHH`` specifies the zero-padded, hexadecimal character
>   code.

> It is invalid for backwards slash to be used in any other context,
> and a backwards slash present in filename must be encoded. Backwards
> slash used as path component separator should be replaced by forward
> slash instead.

This entire section about the escape mechanism should be clearly
labelled as being purely optional, as it is not relevant for Gentoo
(and would break backwards compatibility with existing package
manager implementations). Maybe add a reference to GLEP 31 too?
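For implementations that do choose to support the escape mechanism quoted above, encoding and decoding could be sketched as follows. The function names are hypothetical, the escape widths are assumed to be two hex digits for ``\x``, four for ``\u`` and eight for ``\U``, and ``str.isspace()`` stands in for the Unicode whitespace classification.

```python
import re

# One escape: \xHH, \uHHHH or \UHHHHHHHH (assumed digit widths).
_ESCAPE_RE = re.compile(
    r"\\(?:x([0-9A-Fa-f]{2})|u([0-9A-Fa-f]{4})|U([0-9A-Fa-f]{8}))"
)


def encode_path(path: str) -> str:
    """Escape backslash, whitespace and control characters in a path."""
    out = []
    for ch in path:
        cp = ord(ch)
        if ch == "\\" or ch.isspace() or cp <= 0x1F or 0x7F <= cp <= 0x9F:
            if cp <= 0x7F:
                out.append(f"\\x{cp:02X}")
            elif cp <= 0xFFFF:
                out.append(f"\\u{cp:04X}")
            else:
                out.append(f"\\U{cp:08X}")
        else:
            out.append(ch)
    return "".join(out)


def decode_path(field: str) -> str:
    """Decode escapes; a backslash in any other context is an error."""
    out = []
    pos = 0
    while pos < len(field):
        if field[pos] != "\\":
            out.append(field[pos])
            pos += 1
            continue
        match = _ESCAPE_RE.match(field, pos)
        if match is None:
            raise ValueError(f"invalid backslash sequence at offset {pos}")
        digits = match.group(1) or match.group(2) or match.group(3)
        value = int(digits, 16)
        # \xHH is only defined for the U+0000 to U+007F range.
        if match.group(1) is not None and value > 0x7F:
            raise ValueError(f"\\x escape out of range at offset {pos}")
        out.append(chr(value))
        pos = match.end()
    return "".join(out)
```

A parser without escape support would simply reject any path field containing a backslash, which is consistent with treating the mechanism as optional.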

> The encoding can be used for other characters as well. In particular,
> escaping control characters is recommended to ensure that the file
> works correctly in text editors.

See above, this should not be "recommended", but literal control chars
should be strictly forbidden.

Ulrich




Re: [gentoo-dev] [RFC] GLEP 74 post-Council review update [v4]

2017-11-22 Thread Michał Górny
On Thu, 16.11.2017 at 11:19 +0100, Michał Górny wrote:
> Hi, everyone.
> 
> Here's the updated version of GLEP 74 taking into consideration
> the points made during the Council pre-review.
> 
> ReST: https://dev.gentoo.org/~mgorny/tmp/glep-0074.rst
> HTML: https://dev.gentoo.org/~mgorny/tmp/glep-0074.html
> 
> Changes:
> 

b3964b6 glep-0074: Recommend escaping control characters, suggested by
ulm
11f19f9 glep-0074: Provide encoding for disallowed characters
da2aace glep-0074: Clarify ignoring directories


---
GLEP: 74
Title: Full-tree verification using Manifest files
Author: Michał Górny,
        Robin Hugh Johnson,
        Ulrich Müller
Type: Standards Track
Status: Draft
Version: 1
Created: 2017-10-21
Last-Modified: 2017-11-16
Post-History: 2017-10-26, 2017-11-16
Content-Type: text/x-rst
Requires: 59, 61
Replaces: 44, 58, 60
---

Abstract
========

This GLEP extends the Manifest file format to cover full-tree file
integrity and authenticity checks. The format aims to be future-proof,
efficient and provide means of backwards compatibility.


Motivation
==========

The Manifest files as defined by GLEP 44 [#GLEP44]_ provide the current
means of verifying the integrity of distfiles and package files
in Gentoo. Combined with OpenPGP signatures, they provide means to
ensure the authenticity of the covered files. However, as noted
in GLEP 57 [#GLEP57]_ they lack the ability to provide full-tree
authenticity verification as they do not cover any files outside
the package directory. In particular, they provide multiple ways
for a third party to inject malicious code into the ebuild environment.

Historically, the topic of providing authenticity coverage for the whole
repository has been mentioned multiple times. The most noteworthy effort
are GLEPs 58 [#GLEP58]_ and 60 [#GLEP60]_ by Robin H. Johnson from 2008.
They were accepted by the Council in 2010 but have never been
implemented. When potential implementation work started in 2017, a new
discussion about the specification arose. It prompted the creation
of a competing GLEP that would provide a redesigned alternative to
the old GLEPs.

This specification is designed with the following goals in mind:

1. It should provide means to ensure the authenticity of the complete
   repository, including preventing the injection of additional files.

2. The format should be universal enough to work both for the Gentoo
   repository and third-party repositories of different characteristics.

3. The Manifest files should be verifiable stand-alone, that is without
   knowing any details about the underlying repository format.


Specification
=============

Manifest file format
--------------------

This specification reuses and extends the Manifest file format defined
in GLEP 44 [#GLEP44]_. For the purpose of it, the *file type* field is
repurposed as a generic *tag* that could also indicate additional
(non-checksum) metadata. Appropriately, those tags can be followed by
other space-separated values.

Unless specified otherwise, the paths used in the Manifest files
are relative to the directory containing the Manifest file. The paths
must not reference the parent directory (``..``). Forward slash (``/``)
is used as path component separator.

The Manifest files use UTF-8 encoding.


Manifest file locations and nesting
-----------------------------------

The ``Manifest`` file located in the root directory of the repository
is called top-level Manifest, and it is used to perform the full-tree
verification. In order to verify the authenticity, it must be signed
using OpenPGP, using the armored cleartext format.

The top-level Manifest may reference sub-Manifests contained
in subdirectories of the repository. The sub-Manifests are traditionally
named ``Manifest``; however, the implementation must support arbitrary
names, including the possibility of multiple (split) Manifests
for a single directory. The sub-Manifest can only cover the files inside
the directory tree where it resides.

The sub-Manifest can also be signed using OpenPGP armored cleartext
format. However, the signature verification can be omitted since it
already is covered by the signed top-level Manifest.


Directory tree coverage
-----------------------

The specification provides three ways of skipping Manifest verification
of specific files and directories (recursively):

1. explicit ``IGNORE`` entries in Manifest files,

2. injected ignore paths via package manager configuration,

3. using names starting with a dot (``.``) which are always skipped.

All files that are not ignored must be covered by at least one
of the Manifests.

A single file may be matched by multiple identical or equivalent
Manifest entries, if and only if the entries have the same semantics,
specify the same size and the checksums common to both entries match.
It is an error for a single file to be matched by multiple entries
of different semantics, 

Re: [gentoo-dev] [RFC] GLEP 74 post-Council review update [v2]

2017-11-22 Thread R0b0t1
On Wed, Nov 22, 2017 at 2:02 AM, Michał Górny wrote:
> On Tue, 21.11.2017 at 20:59 -0600, R0b0t1 wrote:
>> On Mon, Nov 20, 2017 at 12:42 PM, Michał Górny wrote:
>> > On Thu, 16.11.2017 at 11:19 +0100, Michał Górny wrote:
>> > > Hi, everyone.
>> > >
>> > > Here's the updated version of GLEP 74 taking into consideration
>> > > the points made during the Council pre-review.
>> > >
>> > > ReST: https://dev.gentoo.org/~mgorny/tmp/glep-0074.rst
>> > > HTML: https://dev.gentoo.org/~mgorny/tmp/glep-0074.html
>> > >
>> >
>> > New changes:
>> >
>> > 9d819c9 glep-0074: Disallow filenames containing whitespace
>>
>> This seems like a bad idea. I apologize if this is covered in more
>> detail somewhere, but the only justification I can see is that the
>> current grammar does not permit quoting or some other method of
>> specifying whitespace as part of a field value.
>>
>> Is there any way to assure that this won't break things in a
>> non-obvious way? I'm having a hard time imagining how it would be an
>> inflexible requirement to use a space in a filename, but it could come
>> up if it was necessary to use Portage on a non-Gentoo distribution.
>
> Having a whitespace there *will* break the parser. Until a better parser
> is provided, we need to reject it to prevent tools from accidentally
> generating broken files. It's better to tell straight away 'sorry, you
> can't use Manifest here' than cause completely unexpected behavior
> in the parser.
>
> Using whitespace in filenames is going to break Portage in horrible
> ways. Half of shell script in it is based on whitespace-separated lists.
> PMS doesn't provide any means to replace some of them. It's not going to
> happen.
>

Yes, I was talking about providing a better parser. I understand it is
as it is now because whitespace is a delimiter.

If it's not possible to know where all code that has this as a
requirement is, that's fairly bad.

http://langsec.org/occupy/

>> It seems very arbitrary. I think the better solution is to use a better 
>> parser.
>>
>
> The parser is already there for 15 years or more. We can't just replace
> it without breaking all old Portage versions.
>

It sounds like portage is already broken.

Cheers,
 R0b0t1



Re: [gentoo-dev] [RFC] GLEP 74 post-Council review update [v2]

2017-11-22 Thread Michał Górny
On Tue, 21.11.2017 at 20:59 -0600, R0b0t1 wrote:
> On Mon, Nov 20, 2017 at 12:42 PM, Michał Górny wrote:
> > On Thu, 16.11.2017 at 11:19 +0100, Michał Górny wrote:
> > > Hi, everyone.
> > > 
> > > Here's the updated version of GLEP 74 taking into consideration
> > > the points made during the Council pre-review.
> > > 
> > > ReST: https://dev.gentoo.org/~mgorny/tmp/glep-0074.rst
> > > HTML: https://dev.gentoo.org/~mgorny/tmp/glep-0074.html
> > > 
> > 
> > New changes:
> > 
> > 9d819c9 glep-0074: Disallow filenames containing whitespace
> 
> This seems like a bad idea. I apologize if this is covered in more
> detail somewhere, but the only justification I can see is that the
> current grammar does not permit quoting or some other method of
> specifying whitespace as part of a field value.
> 
> Is there any way to assure that this won't break things in a
> non-obvious way? I'm having a hard time imagining how it would be an
> inflexible requirement to use a space in a filename, but it could come
> up if it was necessary to use Portage on a non-Gentoo distribution.

Having a whitespace there *will* break the parser. Until a better parser
is provided, we need to reject it to prevent tools from accidentally
generating broken files. It's better to tell straight away 'sorry, you
can't use Manifest here' than cause completely unexpected behavior
in the parser.
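To illustrate the failure mode, here is a sketch of a whitespace-splitting parser. It is not Portage's actual code; it only assumes the ``<tag> <path> <size> <hash name> <hash value> ...`` entry layout, and ``parse_entry`` is a hypothetical name.

```python
def parse_entry(line: str):
    """Naively parse one Manifest entry by splitting on whitespace."""
    fields = line.split()
    tag, path, size = fields[0], fields[1], fields[2]
    # Remaining fields are treated as (hash name, hash value) pairs.
    checksums = dict(zip(fields[3::2], fields[4::2]))
    return tag, path, int(size), checksums
```

With a well-formed line such as ``DATA fix.patch 1234 SHA512 abcd`` this works. For a file named ``my patch.diff`` the second word lands in the size field and the parser raises an error, and for a name like ``my 12 x`` it silently returns the wrong path and size.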

Using whitespace in filenames is going to break Portage in horrible
ways. Half of shell script in it is based on whitespace-separated lists.
PMS doesn't provide any means to replace some of them. It's not going to
happen.

> It seems very arbitrary. I think the better solution is to use a better 
> parser.
> 

The parser is already there for 15 years or more. We can't just replace
it without breaking all old Portage versions.

-- 
Best regards,
Michał Górny




Re: [gentoo-dev] [RFC] GLEP 74 post-Council review update [v2]

2017-11-21 Thread Ulrich Mueller
> On Wed, 22 Nov 2017, Michał Górny wrote:

> On Tue, 21.11.2017 at 22:48 +0100, Ulrich Mueller wrote:
>> > > > > > On Tue, 21 Nov 2017, Michał Górny wrote:
>> > > > > > It is an error for a single file to be matched by multiple
>> > > > > > entries of different semantics, file size or checksum values.
>> > > > > > It is an error to specify another entry for a file matching
>> > > > > > ``IGNORE``, or one of its subdirectories.
>> > > [...]
>> > Indeed, the second part of that sentence needs to change. Do you
>> > have a suggestion how to word it best?
>>
>> "It is an error to specify another entry for a file that matches
>> ``IGNORE`` or that is covered by an ignored directory."

> I'm not sure if 'covered' wouldn't be confusing here.

Indeed, that verb can have many meanings.

> Maybe:

> | It is an error to specify another entry for a file that matches
> | ``IGNORE``, or that is located inside an ignored directory.

> ?

Works for me.

Ulrich




Re: [gentoo-dev] [RFC] GLEP 74 post-Council review update [v2]

2017-11-21 Thread R0b0t1
On Mon, Nov 20, 2017 at 12:42 PM, Michał Górny wrote:
> On Thu, 16.11.2017 at 11:19 +0100, Michał Górny wrote:
>> Hi, everyone.
>>
>> Here's the updated version of GLEP 74 taking into consideration
>> the points made during the Council pre-review.
>>
>> ReST: https://dev.gentoo.org/~mgorny/tmp/glep-0074.rst
>> HTML: https://dev.gentoo.org/~mgorny/tmp/glep-0074.html
>>
>
> New changes:
>
> 9d819c9 glep-0074: Disallow filenames containing whitespace

This seems like a bad idea. I apologize if this is covered in more
detail somewhere, but the only justification I can see is that the
current grammar does not permit quoting or some other method of
specifying whitespace as part of a field value.

Is there any way to assure that this won't break things in a
non-obvious way? I'm having a hard time imagining how it would be an
inflexible requirement to use a space in a filename, but it could come
up if it was necessary to use Portage on a non-Gentoo distribution.

It seems very arbitrary. I think the better solution is to use a better parser.

Cheers,
  R0b0t1



Re: [gentoo-dev] [RFC] GLEP 74 post-Council review update [v2]

2017-11-21 Thread Michał Górny
On Tue, 21.11.2017 at 22:48 +0100, Ulrich Mueller wrote:
> > > > > > On Tue, 21 Nov 2017, Michał Górny wrote:
> > > > > > It is an error for a single file to be matched by multiple
> > > > > > entries of different semantics, file size or checksum values.
> > > > > > It is an error to specify another entry for a file matching
> > > > > > ``IGNORE``, or one of its subdirectories.
> > > [...]
> > Indeed, the second part of that sentence needs to change. Do you
> > have a suggestion how to word it best?
> 
> "It is an error to specify another entry for a file that matches
> ``IGNORE`` or that is covered by an ignored directory."
> 

I'm not sure if 'covered' wouldn't be confusing here. Maybe:

| It is an error to specify another entry for a file that matches
| ``IGNORE``, or that is located inside an ignored directory.

?

-- 
Best regards,
Michał Górny




Re: [gentoo-dev] [RFC] GLEP 74 post-Council review update [v2]

2017-11-21 Thread Ulrich Mueller
> On Tue, 21 Nov 2017, Michał Górny wrote:

>> > > > It is an error for a single file to be matched by multiple
>> > > > entries of different semantics, file size or checksum values.
>> > > > It is an error to specify another entry for a file matching
>> > > > ``IGNORE``, or one of its subdirectories.

>> [...]

> Indeed, the second part of that sentence needs to change. Do you
> have a suggestion how to word it best?

"It is an error to specify another entry for a file that matches
``IGNORE`` or that is covered by an ignored directory."

Ulrich




Re: [gentoo-dev] [RFC] GLEP 74 post-Council review update [v2]

2017-11-21 Thread Michał Górny
On Tue, 21.11.2017 at 21:28 +0100, Ulrich Mueller wrote:
> > > > > > On Tue, 21 Nov 2017, Michał Górny wrote:
> > > > It is an error for a single file to be matched by multiple entries
> > > > of different semantics, file size or checksum values. It is an error
> > > > to specify another entry for a file matching ``IGNORE``, or one of its
> > > > subdirectories.
> > > 
> > > What about regular files in a directory (or subdirectory) matched
> > > by IGNORE? Looks like this case is not covered (?).
> > Ignored regular files must not have any other (e.g. DATA) entries.
> > Otherwise the expected behavior is unclear -- are we supposed to
> > verify the file or ignore it?
> 
> I still believe that the wording doesn't convey that. Maybe an example
> will clarify what I mean.
> 
> There is a directory foo/bar and a regular file foo/bar/quux in it.
> Now in Manifest there are these entries:
> 
>    IGNORE foo/bar
>    DATA foo/bar/quux
> 
> The spec says: "It is an error to specify another entry for a file
> matching ``IGNORE``, or one of its subdirectories." However, file
> foo/bar/quux neither matches IGNORE nor is a subdirectory of it.

Indeed, the second part of that sentence needs to change. Do you have
a suggestion how to word it best?

-- 
Best regards,
Michał Górny




Re: [gentoo-dev] [RFC] GLEP 74 post-Council review update [v2]

2017-11-21 Thread Ulrich Mueller
> On Tue, 21 Nov 2017, Michał Górny wrote:

>> > It is an error for a single file to be matched by multiple entries
>> > of different semantics, file size or checksum values. It is an error
>> > to specify another entry for a file matching ``IGNORE``, or one of its
>> > subdirectories.
>> 
>> What about regular files in a directory (or subdirectory) matched
>> by IGNORE? Looks like this case is not covered (?).

> Ignored regular files must not have any other (e.g. DATA) entries.
> Otherwise the expected behavior is unclear -- are we supposed to
> verify the file or ignore it?

I still believe that the wording doesn't convey that. Maybe an example
will clarify what I mean.

There is a directory foo/bar and a regular file foo/bar/quux in it.
Now in Manifest there are these entries:

   IGNORE foo/bar
   DATA foo/bar/quux  

The spec says: "It is an error to specify another entry for a file
matching ``IGNORE``, or one of its subdirectories." However, file
foo/bar/quux neither matches IGNORE nor is a subdirectory of it.
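The reading I would expect, where an entry inside an ignored directory is also an error, could be checked roughly like this. It is a sketch only: ``find_ignore_conflicts`` is a hypothetical name and entries are reduced to ``(tag, path)`` pairs.

```python
def find_ignore_conflicts(entries):
    """Return (tag, path, ignored_path) for entries that conflict with IGNORE.

    entries: iterable of (tag, path) pairs from a single Manifest.
    """
    entries = list(entries)
    ignored = [path.rstrip("/") for tag, path in entries if tag == "IGNORE"]
    conflicts = []
    for tag, path in entries:
        if tag == "IGNORE":
            continue
        for prefix in ignored:
            # Conflict if the entry is the ignored path itself or is
            # located anywhere inside an ignored directory.
            if path == prefix or path.startswith(prefix + "/"):
                conflicts.append((tag, path, prefix))
    return conflicts
```

With the two entries from the example, this reports ``DATA foo/bar/quux`` as conflicting with ``IGNORE foo/bar``.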

Ulrich




Re: [gentoo-dev] [RFC] GLEP 74 post-Council review update [v3]

2017-11-21 Thread Michał Górny
On Tue, 21.11.2017 at 19:20 +0100, Ulrich Mueller wrote:
> > > > > > On Tue, 21 Nov 2017, Michał Górny wrote:
> > All paths specified in the Manifest file must consist of characters
> > corresponding to valid UTF-8 code points excluding the NULL character
> > (``U+0000``), the backwards slash (``\``) and characters classified
> > as whitespace in the current version of the Unicode standard
> > [#UNICODE]_. It is an error to use Manifest files in directories
> > containing files whose names contain the disallowed characters.
> > The forward slash (``/``) must be used as path separator.
> 
> In addition to whitespace, you should also exclude C0 controls (U+0000
> to U+001F), DEL (U+007F), and C1 controls (U+0080 to U+009F).
> 
> Rationale, these control characters can leave the user's terminal
> in an unusable state when a package manager tries to output such a
> filename in a message. As you reserve the backslash for a future
> escape mechanism, this shouldn't be a too severe restriction.
> 

Works for me. I'll update the spec later. Can you think of any other
sequences that should be explicitly forbidden?

-- 
Best regards,
Michał Górny




Re: [gentoo-dev] [RFC] GLEP 74 post-Council review update [v3]

2017-11-21 Thread Ulrich Mueller
> On Tue, 21 Nov 2017, Michał Górny wrote:

> All paths specified in the Manifest file must consist of characters
> corresponding to valid UTF-8 code points excluding the NULL character
> (``U+0000``), the backwards slash (``\``) and characters classified
> as whitespace in the current version of the Unicode standard
> [#UNICODE]_. It is an error to use Manifest files in directories
> containing files whose names contain the disallowed characters.
> The forward slash (``/``) must be used as path separator.

In addition to whitespace, you should also exclude C0 controls (U+0000
to U+001F), DEL (U+007F), and C1 controls (U+0080 to U+009F).

Rationale: these control characters can leave the user's terminal
in an unusable state when a package manager tries to output such a
filename in a message. As you reserve the backslash for a future
escape mechanism, this shouldn't be too severe a restriction.
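
As an illustration only, the combined restriction (the draft's NULL,
backslash and whitespace rules plus the control ranges suggested above)
could be checked roughly as follows. The function name is hypothetical,
and str.isspace() is used as an approximation of "whitespace according
to the Unicode standard"; it is not guaranteed to match the Unicode
White_Space property exactly.

```python
def disallowed_chars(path):
    """Return the characters in `path` that the proposed rules would reject.

    Hypothetical helper, not part of any existing tool.
    """
    bad = []
    for ch in path:
        cp = ord(ch)
        if (
            cp <= 0x1F               # C0 controls, including NULL (U+0000)
            or cp == 0x7F            # DEL
            or 0x80 <= cp <= 0x9F    # C1 controls
            or ch == "\\"            # backslash, reserved for future escaping
            or ch.isspace()          # approximation of Unicode whitespace
        ):
            bad.append(ch)
    return bad


# A plain path passes; an embedded escape character is reported.
print(disallowed_chars("files/gcc.patch"))
print(disallowed_chars("bad\x1bname"))
```

The forward slash is deliberately not rejected, since it is the path
separator.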

Ulrich




Re: [gentoo-dev] [RFC] GLEP 74 post-Council review update [v3]

2017-11-21 Thread Michał Górny
On Thu, 16 Nov 2017 at 11:19 +0100, Michał Górny wrote:
> Hi, everyone.
> 
> Here's the updated version of GLEP 74 taking into consideration
> the points made during the Council pre-review.
> 
> ReST: https://dev.gentoo.org/~mgorny/tmp/glep-0074.rst
> HTML: https://dev.gentoo.org/~mgorny/tmp/glep-0074.html
> 
> Changes:
> 

5ba0654 glep-0074: Specify slash as path separator, disallow backwards
slash
d3b65ba glep-0074: Mention that newline needs to be restricted too in
rationale
54cc3ef glep-0074: Apply suggestions from Ulrich Müller


---
GLEP: 74
Title: Full-tree verification using Manifest files
Author: Michał Górny,
        Robin Hugh Johnson,
        Ulrich Müller
Type: Standards Track
Status: Draft
Version: 1
Created: 2017-10-21
Last-Modified: 2017-11-16
Post-History: 2017-10-26, 2017-11-16
Content-Type: text/x-rst
Requires: 59, 61
Replaces: 44, 58, 60
---

Abstract
========

This GLEP extends the Manifest file format to cover full-tree file
integrity and authenticity checks. The format aims to be future-proof,
efficient and provide means of backwards compatibility.


Motivation
==========

The Manifest files as defined by GLEP 44 [#GLEP44]_ provide the current
means of verifying the integrity of distfiles and package files
in Gentoo. Combined with OpenPGP signatures, they provide means to
ensure the authenticity of the covered files. However, as noted
in GLEP 57 [#GLEP57]_ they lack the ability to provide full-tree
authenticity verification as they do not cover any files outside
the package directory. In particular, they provide multiple ways
for a third party to inject malicious code into the ebuild environment.

Historically, the topic of providing authenticity coverage for the whole
repository has been mentioned multiple times. The most noteworthy efforts
are GLEPs 58 [#GLEP58]_ and 60 [#GLEP60]_ by Robin H. Johnson from 2008.
They were accepted by the Council in 2010 but have never been
implemented. When potential implementation work started in 2017, a new
discussion about the specification arose. It prompted the creation
of a competing GLEP that would provide a redesigned alternative to
the old GLEPs.

This specification is designed with the following goals in mind:

1. It should provide means to ensure the authenticity of the complete
   repository, including preventing the injection of additional files.

2. The format should be universal enough to work both for the Gentoo
   repository and third-party repositories of different characteristics.

3. The Manifest files should be verifiable stand-alone, that is without
   knowing any details about the underlying repository format.


Specification
=============

Manifest file format
--------------------

This specification reuses and extends the Manifest file format defined
in GLEP 44 [#GLEP44]_. For this purpose, the *file type* field is
repurposed as a generic *tag* that could also indicate additional
(non-checksum) metadata. Appropriately, those tags can be followed by
other space-separated values.

Unless specified otherwise, the paths used in the Manifest files
are relative to the directory containing the Manifest file. The paths
must not reference the parent directory (``..``).

The Manifest files use UTF-8 encoding.


Manifest file locations and nesting
-----------------------------------

The ``Manifest`` file located in the root directory of the repository
is called the top-level Manifest, and it is used to perform the full-tree
verification. In order to verify the authenticity, it must be signed
using OpenPGP, using the armored cleartext format.

The top-level Manifest may reference sub-Manifests contained
in subdirectories of the repository. The sub-Manifests are traditionally
named ``Manifest``; however, the implementation must support arbitrary
names, including the possibility of multiple (split) Manifests
for a single directory. The sub-Manifest can only cover the files inside
the directory tree where it resides.

The sub-Manifest can also be signed using OpenPGP armored cleartext
format. However, the signature verification can be omitted since it
already is covered by the signed top-level Manifest.


Directory tree coverage
-----------------------

The specification provides three ways of skipping Manifest verification
of specific files and directories (recursively):

1. explicit ``IGNORE`` entries in Manifest files,

2. injected ignore paths via package manager configuration,

3. using names starting with a dot (``.``) which are always skipped.

All files that are not ignored must be covered by at least one
of the Manifests.

A single file may be matched by multiple identical or equivalent
Manifest entries, if and only if the entries have the same semantics,
specify the same size and the checksums common to both entries match.
It is an error for a single file to be matched by multiple entries
of different semantics, file size or checksum values. It 

Re: [gentoo-dev] [RFC] GLEP 74 post-Council review update [v2]

2017-11-21 Thread Michał Górny
On Mon, 20 Nov 2017 at 22:37 +0100, Ulrich Mueller wrote:
> On Mon, 20 Nov 2017, Michał Górny wrote:
> > New changes:
> > 9d819c9 glep-0074: Disallow filenames containing whitespace
> > 4124b2f glep-0074: Explicitly specify UTF-8 encoding
> > 7f9bd9f glep-0074: Include suggestions from Daniel Campbell
> 
> Here are a few comments (quoting below only the parts of the text
> referenced by them):
> 
> > The Manifest files use UTF-8 encoding.
> 
> I don't understand the purpose of that requirement. The only place
> where bytes outside of the ASCII range can occur are names of
> distfiles, and these should simply be passed transparently. Otherwise,
> you would have to reject any sequence of non-ASCII bytes that doesn't
> form a valid UTF-8 sequence, which looks like an arbitrary restriction
> to me.

Let me reply in parts.

Why not plain ASCII? Because the spec tries to avoid entirely arbitrary
restrictions, and forcing everyone to use just ASCII entirely counts
as such.

Why not plain bytestring? Mostly because it's really PITA to work
on them in Python. Besides, you can't allow arbitrary bytestring since
you still need to apply restrictions making it safe to parse in text
context, i.e. forbid 0x20, 0x0A, possibly more. Which makes
the definition kinda silly in the end. Not to mention transferring files
over systems which can recode filenames but will not recode Manifest
contents.

Why UTF-8 then? Because it's quite reliable and widely established.
It works for most of the people out of the box. Those who use other
encodings can usually transcode reliably. It's what we're using
in ebuilds and everywhere else wrt GLEP 31, so I don't think we should
make Manifests any different.

> > It is an error for a single file to be matched by multiple entries
> > of different semantics, file size or checksum values. It is an error
> > to specify another entry for a file matching ``IGNORE``, or one of its
> > subdirectories.
> 
> What about regular files in a directory (or subdirectory) matched by
> IGNORE? Looks like this case is not covered (?).

Ignored regular files must not have any other (e.g. DATA) entries.
Otherwise the expected behavior is unclear -- are we supposed to verify
the file or ignore it?

> > All paths specified in the Manifest file must consist of characters
> > corresponding to valid UTF-8 code points excluding the NULL character
> > (``U+0000``) and characters classified as whitespace in the current
> > version of the Unicode standard [#UNICODE]_. It is an error to use
> > Manifest files in directories containing files whose names contain
> > the disallowed characters.
> 
> See above. I believe that NUL and ASCII whitespace (i.e. characters 09
> 0a 0b 0c 0d 20) should be excluded, but excluding byte sequences like
> "e1 9a 80" (which is the UTF-8 encoding for U+1680 "OGHAM SPACE MARK")
> doesn't make sense.

The restriction is meant to be intentionally wider to prevent problems
with implementations which e.g. use Python's str.split() or '\S' regular
expression character (Portage). When working in Unicode-compliant mode,
those can match additional whitespace characters, and I'm rejecting them
to be on the safe side.
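
A small sketch of the pitfall, assuming Python 3 and a made-up Manifest
line whose filename contains U+1680 (the path, size and checksum value
here are hypothetical, chosen only for illustration):

```python
# Hypothetical entry; the filename "foo<U+1680>bar" contains OGHAM SPACE MARK.
line = "DATA foo\u1680bar 12 SHA256 abcd"

# Text mode: str.split() without arguments splits on Unicode whitespace,
# so the filename is silently broken into two fields.
text_fields = line.split()

# Byte mode: bytes.split() without arguments only recognises ASCII
# whitespace, so the same filename survives as a single field.
byte_fields = line.encode("utf-8").split()

print(len(text_fields))  # 6
print(len(byte_fields))  # 5
```

An implementation working on decoded text would therefore misparse the
entry unless such characters are rejected up front.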

> > During the verification process, the client should compare the timestamp
> > against the update time obtained from a local clock or a trusted time
> > source. If the comparison result indicates that the Manifest at the time
> > of receiving was already significantly outdated, the client should
> > either fail the verification or require manual confirmation from user.
> 
> s/from user./from the user./
> 
> > ``TIMESTAMP <ISO8601>``
> >   Specifies a timestamp of when the Manifest file was last updated.
> >   The timestamp must be a valid second-precision ISO8601 extended format
> 
> s/ISO8601/ISO 8601/

Both done.
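
For illustration, the timestamp check quoted above might look roughly
like this. It assumes the second-precision UTC form used in the draft's
own example (``2017-10-30T10:11:12Z``); the function names and the
seven-day threshold are hypothetical, since the draft does not define
what "significantly outdated" means.

```python
from datetime import datetime, timedelta, timezone


def parse_timestamp(value):
    """Parse a second-precision timestamp such as 2017-10-30T10:11:12Z."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def is_outdated(manifest_time, received_at, max_age=timedelta(days=7)):
    """Return True if the Manifest was older than max_age when received.

    max_age is an assumed, arbitrary threshold for this sketch.
    """
    return received_at - manifest_time > max_age


stamp = parse_timestamp("2017-10-30T10:11:12Z")
print(is_outdated(stamp, datetime.now(timezone.utc)))
```

A client would then either fail verification or ask the user for
confirmation when is_outdated() returns True.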

> 
> > ``IGNORE <path>``
> >   Ignores a subdirectory or file from Manifest checks. If the specified
> >   path is present, it and its contents are omitted from the Manifest
> >   verification (always pass). *Path* must be a plain file or directory
> >   path without a trailing slash, and must not contain wildcards.
> 
> What does that mean? Wildcards are not special (so "foo*" will match
> literally), or wildcard characters like "*" are not allowed at all?

Not special. Will reword to:

| Wildcards are not supported and wildcard characters are interpreted
| literally.
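
A minimal sketch of what literal matching implies, assuming paths are
already normalized, relative to the Manifest's directory, and use the
forward slash as separator (the function name is hypothetical):

```python
def is_ignored(path, ignore_paths):
    """Return True if `path` equals an IGNORE path or lies beneath one.

    Comparison is purely literal: '*' and '?' have no special meaning.
    """
    for ignored in ignore_paths:
        if path == ignored or path.startswith(ignored + "/"):
            return True
    return False


print(is_ignored("distfiles/foo.tar.gz", ["distfiles"]))  # True
print(is_ignored("foobar", ["foo*"]))                     # False
```

With this reading, ``IGNORE foo*`` only matches a file or directory
that is literally named ``foo*``.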

> 
> > ``AUX <filename> <size> <checksums>...``
> >   Equivalent to the ``DATA`` type, except that the filename is relative
> >   to ``files/`` subdirectory.
> 
> s/to/to the/
> 
> > 3. Process all ``MANIFEST`` entries, recursively. Verify the Manifest
> >    files according to `file verification`_ section, and include their
> 
> s/according to/according to the/
> 
> > 6. Verify the entries in *covered* set for incompatible duplicates
> 
> s/in *covered* set/in the *covered* set/
> 
> > 7. Verify all the files in the union of the *present* and *covered*
> >    sets, according to `file verification`_ section.
> 
> 

Re: [gentoo-dev] [RFC] GLEP 74 post-Council review update [v2]

2017-11-20 Thread Ulrich Mueller
> On Mon, 20 Nov 2017, Ulrich Mueller wrote:

> On Mon, 20 Nov 2017, Michał Górny wrote:
>> All paths specified in the Manifest file must consist of characters
>> corresponding to valid UTF-8 code points excluding the NULL character
>> (``U+0000``) and characters classified as whitespace in the current
>> version of the Unicode standard [#UNICODE]_. It is an error to use
>> Manifest files in directories containing files whose names contain
>> the disallowed characters.

> See above. I believe that NUL and ASCII whitespace (i.e. characters
> 09 0a 0b 0c 0d 20) should be excluded, but excluding byte sequences
> like "e1 9a 80" (which is the UTF-8 encoding for U+1680 "OGHAM SPACE
> MARK") doesn't make sense.

Thinking about it, this still looks too complicated. So, exclude only
SPACE (0x20) which is used as separator between fields. (NUL can be
excluded too, but it won't occur anyway.)

In fact, all Manifest files in the tree are ASCII only.
So alternatively, filenames could be restricted to printable ASCII.
This is also what GLEP 31 [1] says:

| Suitable Characters for File and Directory Names
|
| Characters outside the ASCII 0..127 range cannot safely be used for
| file or directory names. (Of course, not all characters inside the
| ASCII 0..127 range can be used safely either.)

Ulrich


[1] Character Sets for Portage Tree Items
https://www.gentoo.org/glep/glep-0031.html




Re: [gentoo-dev] [RFC] GLEP 74 post-Council review update [v2]

2017-11-20 Thread Ulrich Mueller
> On Mon, 20 Nov 2017, Michał Górny wrote:

> New changes:

> 9d819c9 glep-0074: Disallow filenames containing whitespace
> 4124b2f glep-0074: Explicitly specify UTF-8 encoding
> 7f9bd9f glep-0074: Include suggestions from Daniel Campbell

Here are a few comments (quoting below only the parts of the text
referenced by them):

> The Manifest files use UTF-8 encoding.

I don't understand the purpose of that requirement. The only place
where bytes outside of the ASCII range can occur are names of
distfiles, and these should simply be passed transparently. Otherwise,
you would have to reject any sequence of non-ASCII bytes that doesn't
form a valid UTF-8 sequence, which looks like an arbitrary restriction
to me.

> It is an error for a single file to be matched by multiple entries
> of different semantics, file size or checksum values. It is an error
> to specify another entry for a file matching ``IGNORE``, or one of its
> subdirectories.

What about regular files in a directory (or subdirectory) matched by
IGNORE? Looks like this case is not covered (?).

> All paths specified in the Manifest file must consist of characters
> corresponding to valid UTF-8 code points excluding the NULL character
> (``U+0000``) and characters classified as whitespace in the current
> version of the Unicode standard [#UNICODE]_. It is an error to use
> Manifest files in directories containing files whose names contain
> the disallowed characters.

See above. I believe that NUL and ASCII whitespace (i.e. characters 09
0a 0b 0c 0d 20) should be excluded, but excluding byte sequences like
"e1 9a 80" (which is the UTF-8 encoding for U+1680 "OGHAM SPACE MARK")
doesn't make sense.

> During the verification process, the client should compare the timestamp
> against the update time obtained from a local clock or a trusted time
> source. If the comparison result indicates that the Manifest at the time
> of receiving was already significantly outdated, the client should
> either fail the verification or require manual confirmation from user.

s/from user./from the user./

> ``TIMESTAMP <ISO8601>``
>   Specifies a timestamp of when the Manifest file was last updated.
>   The timestamp must be a valid second-precision ISO8601 extended format

s/ISO8601/ISO 8601/

> ``IGNORE <path>``
>   Ignores a subdirectory or file from Manifest checks. If the specified
>   path is present, it and its contents are omitted from the Manifest
>   verification (always pass). *Path* must be a plain file or directory
>   path without a trailing slash, and must not contain wildcards.

What does that mean? Wildcards are not special (so "foo*" will match
literally), or wildcard characters like "*" are not allowed at all?

> ``AUX <filename> <size> <checksums>...``
>   Equivalent to the ``DATA`` type, except that the filename is relative
>   to ``files/`` subdirectory.

s/to/to the/

> 3. Process all ``MANIFEST`` entries, recursively. Verify the Manifest
>    files according to `file verification`_ section, and include their

s/according to/according to the/

> 6. Verify the entries in *covered* set for incompatible duplicates

s/in *covered* set/in the *covered* set/

> 7. Verify all the files in the union of the *present* and *covered*
>    sets, according to `file verification`_ section.

s/to/to the/

>    a. If a ``IGNORE`` entry in the ``Manifest`` file covers
>       the *original* directory (or one of the parent directories), stop.

s/a ``IGNORE`` entry/an ``IGNORE`` entry/

> An example top-level Manifest file for the Gentoo repository would have
> the following content::

> TIMESTAMP 2017-10-30T10:11:12Z
> IGNORE distfiles
> IGNORE local
> IGNORE lost+found
> IGNORE packages
> MANIFEST app-accessibility/Manifest 14821 SHA256 1b5f.. SHA512 f7eb..
> ...
> MANIFEST eclass/Manifest.gz 50812 SHA256 8c55.. SHA512 2915..
> ...

> An example modern Manifest (disregarding backwards compatibility)
> for a package directory would have the following content::

> DATA SphinxTrain-0.9.1-r1.ebuild 932 SHA256 3d3b.. SHA512 be4d..
> DATA SphinxTrain-1.0.8.ebuild 912 SHA256 f681.. SHA512 0749..
> DATA metadata.xml 664 SHA256 97c6.. SHA512 1175..
> DATA files/gcc.patch 816 SHA256 b56e.. SHA512 2468..
> DATA files/gcc34.patch 333 SHA256 c107.. SHA512 9919..
> DIST SphinxTrain-0.9.1-beta.tar.gz 469617 SHA256 c1a4.. SHA512 1b33..
> DIST sphinxtrain-1.0.8.tar.gz 8925803 SHA256 548e.. SHA512 465d..

Update hashes to BLAKE2B SHA512?
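
For reference, the entry layout in the quoted examples (tag, path,
size, then alternating hash name and value) could be parsed along these
lines. This is only a sketch with a hypothetical function name; it
covers checksum-bearing entries such as ``DATA``, ``DIST`` and
``MANIFEST``, not ``TIMESTAMP`` or ``IGNORE``, and the checksum values
in the examples are abbreviated.

```python
def parse_entry(line):
    """Split a checksum-bearing Manifest line into its components.

    Returns (tag, path, size, checksums), where checksums maps hash
    names to their values as written in the file.
    """
    fields = line.split(" ")
    if len(fields) < 3:
        raise ValueError("too few fields: %r" % line)
    tag, path, size = fields[0], fields[1], int(fields[2])
    rest = fields[3:]
    if len(rest) % 2 != 0:
        raise ValueError("hash name without a value: %r" % line)
    # Pair each hash name with the value that follows it.
    checksums = dict(zip(rest[0::2], rest[1::2]))
    return tag, path, size, checksums


print(parse_entry("DATA metadata.xml 664 SHA256 97c6.. SHA512 1175.."))
```

Keeping the checksums in a mapping means a change of hash set (for
example to BLAKE2B and SHA512) needs no change to the parser itself.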

> This specification aims to avoid arbitrary restrictions. For this
> reason, the filename characters are only restricted by excluding two

s/the filename characters/filename characters/

> technically problematic groups:

> 1. The NULL character (``U+0000``) is normally used to indicate the end
>    of a null-terminated string. Its use could therefore break programs
>    written using C. Furthermore, it is not allowed in any known
>    filesystem.

> 2. The whitespace characters are used to separate Manifest fields. 

Re: [gentoo-dev] [RFC] GLEP 74 post-Council review update [v2]

2017-11-20 Thread Michał Górny
On Thu, 16 Nov 2017 at 11:19 +0100, Michał Górny wrote:
> Hi, everyone.
> 
> Here's the updated version of GLEP 74 taking into consideration
> the points made during the Council pre-review.
> 
> ReST: https://dev.gentoo.org/~mgorny/tmp/glep-0074.rst
> HTML: https://dev.gentoo.org/~mgorny/tmp/glep-0074.html
> 

New changes:

9d819c9 glep-0074: Disallow filenames containing whitespace
4124b2f glep-0074: Explicitly specify UTF-8 encoding
7f9bd9f glep-0074: Include suggestions from Daniel Campbell


---
GLEP: 74
Title: Full-tree verification using Manifest files
Author: Michał Górny,
        Robin Hugh Johnson,
        Ulrich Müller
Type: Standards Track
Status: Draft
Version: 1
Created: 2017-10-21
Last-Modified: 2017-11-16
Post-History: 2017-10-26, 2017-11-16
Content-Type: text/x-rst
Requires: 59, 61
Replaces: 44, 58, 60
---

Abstract
========

This GLEP extends the Manifest file format to cover full-tree file
integrity and authenticity checks. The format aims to be future-proof,
efficient and provide means of backwards compatibility.


Motivation
==========

The Manifest files as defined by GLEP 44 [#GLEP44]_ provide the current
means of verifying the integrity of distfiles and package files
in Gentoo. Combined with OpenPGP signatures, they provide means to
ensure the authenticity of the covered files. However, as noted
in GLEP 57 [#GLEP57]_ they lack the ability to provide full-tree
authenticity verification as they do not cover any files outside
the package directory. In particular, they provide multiple ways
for a third party to inject malicious code into the ebuild environment.

Historically, the topic of providing authenticity coverage for the whole
repository has been mentioned multiple times. The most noteworthy efforts
are GLEPs 58 [#GLEP58]_ and 60 [#GLEP60]_ by Robin H. Johnson from 2008.
They were accepted by the Council in 2010 but have never been
implemented. When potential implementation work started in 2017, a new
discussion about the specification arose. It prompted the creation
of a competing GLEP that would provide a redesigned alternative to
the old GLEPs.

This specification is designed with the following goals in mind:

1. It should provide means to ensure the authenticity of the complete
   repository, including preventing the injection of additional files.

2. The format should be universal enough to work both for the Gentoo
   repository and third-party repositories of different characteristics.

3. The Manifest files should be verifiable stand-alone, that is without
   knowing any details about the underlying repository format.


Specification
=============

Manifest file format
--------------------

This specification reuses and extends the Manifest file format defined
in GLEP 44 [#GLEP44]_. For this purpose, the *file type* field is
repurposed as a generic *tag* that could also indicate additional
(non-checksum) metadata. Appropriately, those tags can be followed by
other space-separated values.

Unless specified otherwise, the paths used in the Manifest files
are relative to the directory containing the Manifest file. The paths
must not reference the parent directory (``..``).

The Manifest files use UTF-8 encoding.


Manifest file locations and nesting
-----------------------------------

The ``Manifest`` file located in the root directory of the repository
is called the top-level Manifest, and it is used to perform the full-tree
verification. In order to verify the authenticity, it must be signed
using OpenPGP, using the armored cleartext format.

The top-level Manifest may reference sub-Manifests contained
in subdirectories of the repository. The sub-Manifests are traditionally
named ``Manifest``; however, the implementation must support arbitrary
names, including the possibility of multiple (split) Manifests
for a single directory. The sub-Manifest can only cover the files inside
the directory tree where it resides.

The sub-Manifest can also be signed using OpenPGP armored cleartext
format. However, the signature verification can be omitted since it
already is covered by the signed top-level Manifest.


Directory tree coverage
-----------------------

The specification provides three ways of skipping Manifest verification
of specific files and directories (recursively):

1. explicit ``IGNORE`` entries in Manifest files,

2. injected ignore paths via package manager configuration,

3. using names starting with a dot (``.``) which are always skipped.

All files that are not ignored must be covered by at least one
of the Manifests.

A single file may be matched by multiple identical or equivalent
Manifest entries, if and only if the entries have the same semantics,
specify the same size and the checksums common to both entries match.
It is an error for a single file to be matched by multiple entries
of different semantics, file size or checksum values. It is an error
to specify another entry for 

Re: [gentoo-dev] [RFC] GLEP 74 post-Council review update

2017-11-20 Thread Michał Górny
On Fri, 17 Nov 2017 at 12:37 -0800, Daniel Campbell wrote:
> > 
> > [snip]
> > Non-strict Manifest verification
> > --------------------------------
> > 
> > Originally the Manifest2 format provided a special ``MISC`` tag that
> > was used for ``metadata.xml`` and ``ChangeLog`` files. This tag
> > indicated that the Manifest verification failures could be ignored for
> > those files unless the package manager was working in strict mode.
> > 
> > The first versions of this specification continued the use of this tag.
> > However, after a long debate it was decided to deprecate it along with
> > the non-strict behavior, and require all files to strictly match.
> 
> It may be outside the scope of the GLEP, but a link to said long debate
> might be relevant to the reader, especially if they have suggestions or
> points that have already been discussed in the debate.

It's been on IRC.

> Aside from the few nitpicks this looks good. Hope this helps.

I think I've fixed every single one of them. I'm going to fix one issue
I've noticed (lack of filename whitespace restriction) and resubmit.

-- 
Best regards,
Michał Górny




Re: [gentoo-dev] [RFC] GLEP 74 post-Council review update

2017-11-17 Thread Daniel Campbell
On Thu, Nov 16, 2017 at 11:19:54AM +0100, Michał Górny wrote:
> [snip]
> Abstract
> ========
> 
> This GLEP extends the Manifest file format to cover full-tree file
> integrity and authenticity checks.The format aims to be future-proof,

Missing a space after the first sentence, between "checks." and "The".

> efficient and provide means of backwards compatibility.

Could use an Oxford comma after "efficient", but it's a style choice. Up
to you.

> 
> 
> Motivation
> ==========
> 
> The Manifest files as defined by GLEP 44 [#GLEP44]_ provide the current
> means of verifying the integrity of distfiles and package files
> in Gentoo. Combined with OpenPGP signatures, they provide means to
> ensure the authenticity of the covered files. However, as noted
> in GLEP 57 [#GLEP57]_ they lack the ability to provide full-tree
> authenticity verification as they do not cover any files outside
> the package directory. In particular, they provide multiple ways
> for a third party to inject malicious code into the ebuild environment.
> 
> Historically, the topic of providing authenticity coverage for the whole
> repository has been mentioned multiple times. The most noteworthy effort
> are GLEPs 58 [#GLEP58]_ and 60 [#GLEP60]_ by Robin H. Johnson from 2008.
> They were accepted by the Council in 2010 but have never been
> implemented. When potential implementation work started in 2017, a new
> discussion about the specification arose. It prompted the creation
> of a competing GLEP that would provide a redesigned alternative to
> the old GLEPs.

No correction, but I really like the inclusion of history here. It gives
the reader more context, should they have questions about prior
discussions.

> [snip]
> 1. It is more future-proof. If an incompatible change to the repository
>    format is introduced, only developers need to be upgrade the tools
>    they use to generate the Manifests. The tools used to verify
>    the updated Manifests will continue to work.

"be upgrade" -> "upgrade"

> 
> [snip]
> While both models have their advantages, the hierarchical model was
> selected because it reduces the number of OpenPGP operations
> which are comparatively costly to the minimum.

It seems like "which are comparatively costly" should be in parentheses,
or separated by some other punctuation like en- or em-dashes. e.g.

"... because it reduces the number of OpenPGP operations (which are
comparatively costly) to the minimum."

or

"... because it reduces the number of OpenPGP operations – which are
comparatively costly – to the minimum."

(Note en-dash was used (0x2013), not a regular hyphen (0x2D).)

Or something like that.

> 
> [snip]
> Non-strict Manifest verification
> --------------------------------
> 
> Originally the Manifest2 format provided a special ``MISC`` tag that
> was used for ``metadata.xml`` and ``ChangeLog`` files. This tag
> indicated that the Manifest verification failures could be ignored for
> those files unless the package manager was working in strict mode.
> 
> The first versions of this specification continued the use of this tag.
> However, after a long debate it was decided to deprecate it along with
> the non-strict behavior, and require all files to strictly match.

It may be outside the scope of the GLEP, but a link to said long debate
might be relevant to the reader, especially if they have suggestions or
points that have already been discussed in the debate.

> [snip]
> Finally, the non-strict mode could be used as means to an attack.
> The allowance of missing or modified documentation file could be used
> to spread misinformation, resulting in bad decisions made by the user.
> A modified file could also be used e.g. to exploit vulnerabilities
> of an XML parser.

"used e.g." -> "used, e.g."

Helps it reflect the way it would be spoken.

> 
> 
> Timestamp field
> ---------------
> 
> The top-level Manifests optionally allows using a ``TIMESTAMP`` tag
> to include a generation timestamp in the Manifest. A similar feature
> was originally proposed in GLEP 58 [#GLEP58]_.

"Manifests" and "allows" disagree grammatically -- one of them needs to
drop the "s". Context clues indicate a singular top-level Manifest.

> 
> A malicious third-party may use the principles of exclusion or replay
> [#C08]_ to deny an update to clients, while at the same time recording
> the identity of clients to attack. The timestamp field can be used to
> detect that.
> 
> In order to provide a more complete protection, the Gentoo
> Infrastructure should provide an ability to obtain the timestamps
> of all Manifests from a recent timeframe over a secure channel
> from a trusted source for comparison.

"a more complete protection"; should probably drop the "a".

> 
> Strictly speaking, this information is already provided by the various
> ``metadata/timestamp*`` files that are already present. However,
> including the value in the Manifest itself has a little cost
> and provides the ability to perform the verification