Re: tag2upload (git-debpush) service architecture - draft

2019-09-14 Thread Guillem Jover
On Thu, 2019-08-01 at 16:22:19 +0100, Sean Whitton wrote:
> On Thu 01 Aug 2019 at 02:35PM +02, Guillem Jover wrote:
> > This argument seems very counter-intuitive. This assumes a
> > Debian-archive-centric world-view, where most of the heavy lifting is
> > delegated to some external service. But the reality is that most
> > maintainers, and other external parties handling Debian sources do
> > need to handle actual source packages. Requiring people to setup an
> > equivalent to the Debian archive + dgit service + tag2upload "locally"
> > just so that they can reduce their cognitive load, seems to be pretty
> > much the opposite to the above stated goal TBH.
> 
> Hmm, I'm not sure all that local setup would be needed.  You can start
> with `dgit clone` and work from there.

That was my point. This only works as long as there's such a package in
the Debian archive. A Debian source package does not imply it will be
in Debian. There are many parties that use and work with these too:
local packages of unpublished stuff, packages for public things not
intended for Debian, packaging overlays from third parties, etc.

> More generally, I think we should aim to replace the use of source
> packages in all those places too, but there are lots of unsolved
> problems there.  I don't think anyone knows what things are going to
> look like.

The more I've been considering this, the more I think this would be a
monumental mistake. I'm pondering a new workflow proposal in which
I'll expand on this.

Thanks,
Guillem



Re: tag2upload (git-debpush) service architecture - draft

2019-08-03 Thread Russ Allbery
Charles Plessy  writes:

> if creating a source package is fast and reproducible, could the dgit
> user commit the signed .dsc file somewhere, and the dgit infrastructure
> use it and throw an error if the hash sums do not match ?

A difficulty with using the .dsc file as a signed artifact if you want to
base the upload on a Git repository is that a .dsc file points to
compressed tarballs, which means now you have to solve the problem of
recreating a compressed tarball from a Git repository in a byte-for-byte
identical way.  Past experience with pristine-tar says that this is more
fragile than we would like, and is prone to trouble if there are differing
versions of tar or the compression utility in play.

Admittedly, the tag2upload problem is much easier than the pristine-tar
problem because we're not trying to cope with arbitrary upstream tar
creation, but I suspect the ongoing maintenance burden (and random failure
rate) would be higher than the current proposal.
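
The byte-for-byte problem can be illustrated with a minimal sketch. It
uses Python's gzip module as a stand-in for the compression utility; the
payload and the helper name gz_sha256 are made up for illustration and
are not part of dgit or tag2upload. The same input compressed with a
different header timestamp yields different compressed bytes, so a
checksum recorded in a .dsc would no longer match even though the
decompressed content is unchanged.

```python
import gzip
import hashlib

# Stand-in for an uncompressed tar stream; the content is hypothetical.
payload = b"identical source tree contents\n" * 100


def gz_sha256(data: bytes, mtime: int, level: int = 9) -> str:
    """Compress data with gzip and return the SHA-256 of the compressed bytes."""
    compressed = gzip.compress(data, compresslevel=level, mtime=mtime)
    return hashlib.sha256(compressed).hexdigest()


# Same input, same settings: the compressed bytes are reproducible.
first = gz_sha256(payload, mtime=0)
again = gz_sha256(payload, mtime=0)

# Same input, different header timestamp: the compressed bytes differ,
# so a checksum taken over the compressed file changes as well.
other = gz_sha256(payload, mtime=1)

# The decompressed content is identical in every case.
roundtrip = gzip.decompress(gzip.compress(payload, mtime=1))
```

Differences between versions of tar or of the compressor are harder to
pin down than a timestamp, which is the fragility described above.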

-- 
Russ Allbery (r...@debian.org)   



Re: tag2upload (git-debpush) service architecture - draft

2019-08-03 Thread Charles Plessy
Hello everybody,

sorry if it is too naive, and sorry if I missed part of the discussion,
but,

if creating a source package is fast and reproducible, could the
dgit user commit the signed .dsc file somewhere, and the dgit
infrastructure use it and throw an error if the hash sums do not match ?

Have a nice day,

Charles

-- 
Charles Plessy  Akano, Uruma, Okinawa, Japan
Debian Med packaging team http://www.debian.org/devel/debian-med
Tooting from work,   https://mastodon.technology/@charles_plessy
Tooting from home, https://framapiaf.org/@charles_plessy



Re: tag2upload (git-debpush) service architecture - draft

2019-08-03 Thread Ian Jackson
Marco d'Itri writes ("Re: tag2upload (git-debpush) service architecture - 
draft"):
> On Aug 02, Sam Hartman  wrote:
> > In effect, ftpmaster is saying they are uncomfortable trusting
> > tag2upload  very much.
> 
> A simple solution to this concern would be for ftpmaster to take over 
> the operations of tag2upload once it will be ready.

Indeed.

I still think it wants to run on a separate host to dak.  It doesn't
seem to me to be a good idea for security reasons for the dak host to
grow an internet facing webhook service, nor for it to perform complex
git-to-source-package transformations.

Ian.

-- 
Ian Jackson   These opinions are my own.

If I emailed you from an address @fyvzl.net or @evade.org.uk, that is
a private address which bypasses my fierce spamfilter.



Re: tag2upload (git-debpush) service architecture - draft

2019-08-02 Thread Marco d'Itri
On Aug 02, Sam Hartman  wrote:

> In effect, ftpmaster is saying they are uncomfortable trusting
> tag2upload  very much.
A simple solution to this concern would be for ftpmaster to take over 
the operations of tag2upload once it will be ready.

-- 
ciao,
Marco



Re: tag2upload (git-debpush) service architecture - draft

2019-08-02 Thread Rebecca N. Palmer

On 02/08/2019 19:09, Ian Jackson wrote:
> Sam Hartman writes ("Re: tag2upload (git-debpush) service architecture -
> draft"):
>> Can you outline how to get from the dsc to a verification of the tag
>> signature without contacting the dgit server?
>
> Sure.
>
> Split the tag object data at the relevant ----- boundary.  This gives
> you 1. an unsigned tag data file (first half) 2. a detached armoured
> PGP signature (second half).  Feed that pair to gpgv (with appropriate
> keyrings etc.).  That's it.



That only verifies that that user signed *something*, not what contents 
they signed.  To do that, you need to include both the tag object and 
the commit object: the tree objects (i.e. lists of file/subtree hashes) 
can be reconstructed from the files.


Also, in current git this package contents check is relying on SHA-1, 
unless you put an extra hash in the tag message.


You can think of Git's hashes as like Debian's .dsc/.changes but with 
more levels:

Git [0]: file -> tree ( -> tree...) -> commit -> tag
(One file hash per file ("blob"), one tree object per directory.  Not 
shown and not important here: each commit also has a hash of its parent 
commit(s))

Debian: tarball -> .dsc -> .changes
(Tarball hashed as a whole.  Not shown: tarball also directly linked to 
.changes.)


Verification would then be:
- check PGP signature on tag object
- check that the commit object has the hash listed in the tag object
- unpack source package tarball(s) (since you don't yet know they're the 
right ones, you need to trust the tool you do this with to not be 
vulnerable to malicious content, and be prepared to reject an 
overly-large tarball as a DoS attack)
- create tree hashes ('git init' in the top level of the source package 
is probably the easiest way to do this)
- check that the top-level tree object has the hash listed in the commit 
object


[0] https://git-scm.com/book/en/v2/Git-Internals-Git-References
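
The hash chain in the verification steps above can be sketched as
follows. This is a minimal illustration of how git derives SHA-1 object
ids for blobs and trees, not the code any Debian service runs; the
function names are invented here, and a real check would also need to
handle symlinks, executable bits, nested directories and the
tag-to-commit step.

```python
import hashlib


def git_object_id(kind: str, body: bytes) -> str:
    """SHA-1 object id as git computes it: "<kind> <size>", NUL, then the body."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def blob_id(content: bytes) -> str:
    """Object id of a file's content."""
    return git_object_id("blob", content)


def tree_id(entries: dict) -> str:
    """Object id of one directory.

    entries maps name -> (mode, hex object id); mode is "100644" for a
    regular file or "40000" for a subdirectory.
    """
    def sort_key(item):
        name, (mode, _) = item
        # git sorts directory names as if they ended in "/"
        return (name + "/" if mode == "40000" else name).encode()

    body = b""
    for name, (mode, hex_id) in sorted(entries.items(), key=sort_key):
        body += mode.encode() + b" " + name.encode() + b"\0" + bytes.fromhex(hex_id)
    return git_object_id("tree", body)


def commit_tree(commit_body: bytes) -> str:
    """Return the tree id named on the first line of a raw commit object."""
    first_line = commit_body.split(b"\n", 1)[0]
    keyword, _, value = first_line.partition(b" ")
    if keyword != b"tree":
        raise ValueError("commit object does not start with a tree line")
    return value.decode()


# Hypothetical one-file source tree, rebuilt from unpacked files.
files = {"README": b"hello\n"}
top = tree_id({name: ("100644", blob_id(data)) for name, data in files.items()})

# Hypothetical commit body naming that tree; the final check compares the two.
commit = b"tree " + top.encode() + b"\nauthor A U Thor <a@example.org> 0 +0000\n\nmsg\n"
tree_matches_commit = commit_tree(commit) == top
```

Every id here is SHA-1, which is the limitation noted above unless an
extra hash is placed in the tag message.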



Re: tag2upload (git-debpush) service architecture - draft

2019-08-02 Thread Sam Hartman
> "Ian" == Ian Jackson  writes:

>> Can you outline how to get from the dsc to a verification of the
>> tag signature without contacting the dgit server?

Ian> Sure.

Ian> Split the tag object data at the relevant ----- boundary.  This
Ian> gives you 1. an unsigned tag data file (first half) 2. a
Ian> detached armoured PGP signature (second half).  Feed that pair
Ian> to gpgv (with appropriate keyrings etc.).  That's it.

Ah, thanks.
I think this helps me understand where the confusion is.

My understanding of ftpmaster's requirement, confirmed by Bastian, is
that without data external to the dsc, someone needs to be able to
confirm the contents of the source package are certified by a user in
the Debian keyring.

That is, anyone needs to be able to prove only from the dsc (and
keyrings of course) that the dsc is created from the git objects
intended by the signer.
The output of git cat-file tag is insufficient to do that.
All it includes is the object hash of the commit object.
However, we don't have that commit object or the tree objects in the
dsc.

We could perform that verification given the dgit repository, but that
would violate the no external data requirement from ftpmaster as I have
explained to Sean.


In effect, ftpmaster is saying they are uncomfortable trusting
tag2upload  very much.

I think we may see this same issue come up again when we discuss
automated sourceful NMUs as requested by the reproducible builds community.



Re: tag2upload (git-debpush) service architecture - draft

2019-08-02 Thread Ian Jackson
Sam Hartman writes ("Re: tag2upload (git-debpush) service architecture - 
draft"):
> Ian Jackson  writes:
> > This requirement can be met (as I mentioned before) by
> > including the tag object data as a file in the upload (listed
> > in .changes).  The signature can be verified without any
> > further data.  A git bundle is not needed.
> 
> What do you mean by tag object data?

I mean the output of "git cat-file tag refs/tags/debian/".

Included as, say,   _-.git-tag
and referenced in .changes.

> Can you outline how to get from the dsc to a verification of the tag
> signature without contacting the dgit server?

Sure.

Split the tag object data at the relevant ----- boundary.  This gives
you 1. an unsigned tag data file (first half) 2. a detached armoured
PGP signature (second half).  Feed that pair to gpgv (with appropriate
keyrings etc.).  That's it.

If information in the tag should be checked (eg, the intended source
package name or version, or the destination for the upload), this
should be parsed out of the "unsigned tag data file" half after
splitting, to avoid any possible attacks based on differences in
disassembly/parsing algorithms.

(Sample for this code can be found in dgit-repos-server and is already
deployed on the dgit git server; but it's not very hard.)
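
For illustration only, the split-and-parse step might look like the
sketch below. It is not the dgit-repos-server code. It assumes the
signature is the last block that starts on a line beginning with
"-----BEGIN PGP SIGNATURE-----"; the function names and temporary file
names are hypothetical, and the keyring path is whatever the verifier
chooses to trust.

```python
import subprocess
import tempfile
from pathlib import Path

PGP_MARKER = b"-----BEGIN PGP SIGNATURE-----"


def split_signed_tag(raw: bytes):
    """Split raw tag object data into (unsigned tag data, armoured signature)."""
    # Assumption: the signature is the last marker that starts a line.
    index = raw.rfind(b"\n" + PGP_MARKER)
    if index < 0:
        raise ValueError("tag object carries no PGP signature")
    return raw[: index + 1], raw[index + 1 :]


def parse_tag_headers(unsigned: bytes) -> dict:
    """Parse the header lines (object, type, tag, tagger) before the message.

    Only the unsigned half is parsed, so the values checked are the
    values that were signed.
    """
    header_block = unsigned.split(b"\n\n", 1)[0]
    headers = {}
    for line in header_block.decode().splitlines():
        key, _, value = line.partition(" ")
        headers[key] = value
    return headers


def verify_with_gpgv(unsigned: bytes, signature: bytes, keyring: str) -> bool:
    """Write the pair to temporary files and feed them to gpgv."""
    with tempfile.TemporaryDirectory() as tmp:
        data_path = Path(tmp, "tag.data")
        sig_path = Path(tmp, "tag.sig")
        data_path.write_bytes(unsigned)
        sig_path.write_bytes(signature)
        result = subprocess.run(
            ["gpgv", "--keyring", keyring, str(sig_path), str(data_path)],
            capture_output=True,
        )
        return result.returncode == 0
```

A caller would pass the output of "git cat-file tag" to
split_signed_tag, check the parsed headers against the upload, and only
then trust them if verify_with_gpgv succeeds.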

BTW, I thought the requirement was to be able to start with the upload
including the .changes, rather than necessarily with the .dsc, to do
this verification.  We could put the tag data file in the .dsc but it
seems to me that it is not really helpful to consumers of the .dsc and
that really we are putting it in the upload for the benefit of the
archive.  So the .changes is probably better.  Putting it into the
.dsc would involve changing more things to tolerate it (ie, ignore
it) and seems like a layering violation - the .dsc is firmly
dpkg's territory.

But the approach I sketch above would work for the .dsc too of course.

HTH.

Ian.

-- 
Ian Jackson   These opinions are my own.

If I emailed you from an address @fyvzl.net or @evade.org.uk, that is
a private address which bypasses my fierce spamfilter.



Re: tag2upload (git-debpush) service architecture - draft

2019-08-02 Thread Sam Hartman
> "Ian" == Ian Jackson  writes:

Ian> Sam Hartman writes ("Re: tag2upload (git-debpush) service
Ian> architecture - draft"):
>> Sean Whitton writes:
>> > Okay, thanks.
>>
>> > I think that the Git-Tag-Info field solves this.  With that
>> > field available, anyone can do the following to perform an
>> > equivalent verification:
>>
>> > 1. fetch the .dsc from the archive
>>
>> > 2. fetch, from dgit-repos, the tag given in the Git-Tag-Info
>> > field of the .dsc
>>
>> This violates the "no external data" requirement above.

Ian> This requirement can be met (as I mentioned before) by
Ian> including the tag object data as a file in the upload (listed
Ian> in .changes).  The signature can be verified without any
Ian> further data.  A git bundle is not needed.

What do you mean by tag object data?
Can you outline how to get from the dsc to a verification of the tag
signature without contacting the dgit server?



Re: tag2upload (git-debpush) service architecture - draft

2019-08-02 Thread Ian Jackson
Sam Hartman writes ("Re: tag2upload (git-debpush) service architecture - 
draft"):
> Sean Whitton  writes:
> > Okay, thanks.
> 
> > I think that the Git-Tag-Info field solves this.  With that
> > field available, anyone can do the following to perform an
> > equivalent verification:
> 
> > 1. fetch the .dsc from the archive
> 
> > 2. fetch, from dgit-repos, the tag given in the Git-Tag-Info
> > field of the .dsc
> 
> This violates the "no external data" requirement above.

This requirement can be met (as I mentioned before) by including the
tag object data as a file in the upload (listed in .changes).  The
signature can be verified without any further data.  A git bundle is
not needed.

I just need to know what filename I should give it.

-- 
Ian Jackson   These opinions are my own.

If I emailed you from an address @fyvzl.net or @evade.org.uk, that is
a private address which bypasses my fierce spamfilter.



Re: tag2upload (git-debpush) service architecture - draft

2019-08-01 Thread Sean Whitton
Hello,

On Thu 01 Aug 2019 at 02:35PM +02, Guillem Jover wrote:

> On Thu, 2019-08-01 at 04:37:41 -0700, Sean Whitton wrote:
>> On Wed 31 Jul 2019 at 10:53PM +01, Rebecca N. Palmer wrote:
>> > Do "complicated and inconvenient" mean "harder to remember than 'git
>> > debpush'" (which could equally well be fixed by a local-only script),
>> > the confusing errors mentioned below, or something else?
>>
>> It's a qualitative claim about what it is like to use the tools.  We
>> think that use of git-debpush imposes much less Debian-specific
>> cognitive load on package maintainers.  They are just signing and
>> pushing a git tag.
>
> But Debian-specific work requires Debian-specific knowledge.

I should have said: Debian-specific source code handling knowledge.

I'd like there to be more room in people's minds for Debian-specific
knowledge that isn't a matter of moving source code around.

>> The sense in which git-debpush & tag2upload have better error handling
>> is just that the user's computer is not responsible for any .dsc
>> manipulation.  This makes it easier to have the correct mental model of
>> what's going on, and thus what went wrong.
>>
>> As soon as .dsc generation is happening on the user's machine, you
>> introduce a whole load of stuff which the user has to incorporate into
>> their mental model of what's going on with the command they just typed.
>>
>> You and I basically already have all that stuff in our heads because
>> we've been doing Debian stuff for a while.  We want experienced
>> contributors to be able to discard it, and new contributors not have to
>> learn it.
>
> This argument seems very counter-intuitive. This assumes a
> Debian-archive-centric world-view, where most of the heavy lifting is
> delegated to some external service. But the reality is that most
> maintainers, and other external parties handling Debian sources do
> need to handle actual source packages. Requiring people to setup an
> equivalent to the Debian archive + dgit service + tag2upload "locally"
> just so that they can reduce their cognitive load, seems to be pretty
> much the opposite to the above stated goal TBH.

Hmm, I'm not sure all that local setup would be needed.  You can start
with `dgit clone` and work from there.

More generally, I think we should aim to replace the use of source
packages in all those places too, but there are lots of unsolved
problems there.  I don't think anyone knows what things are going to
look like.

tag2upload is meant to encapsulate the use of source packages in one
place, as a modest first step towards replacing them elsewhere too.

> Personally I'm more interested in simplifying our toolchain and
> concepts so that this cognitive load is reduced for everyone/everything.

We should do this too, of course :)

-- 
Sean Whitton



Re: tag2upload (git-debpush) service architecture - draft

2019-08-01 Thread Sean Whitton
Hello,

On Thu 01 Aug 2019 at 09:35AM -04, Sam Hartman wrote:

> This violates the "no external data" requirement above.
>
> You could technically meet this requirement by including a git bundle in
> the dsc, although that would be an unacceptable design for a variety of
> other reasons that seem obvious enough not to require enumeration.

Hmm, okay.  I think of dgit-repos as external to ftp-master only by
accident -- it's more like ftp-master than it is like salsa.  But
there's no doubting that dgit-repos is not ftp-master at the present
time.

-- 
Sean Whitton



Re: tag2upload (git-debpush) service architecture - draft

2019-08-01 Thread Sam Hartman
> "Sean" == Sean Whitton  writes:

Sean> Hello Bastian,
Sean> On Wed 31 Jul 2019 at 10:37PM +02, Bastian Blank wrote:

>> On Wed, Jul 31, 2019 at 03:21:32PM -0400, Sam Hartman wrote:
Bastian> One last time: The user has to certify his upload in a way
Bastian> the archive can verify.
>>> Let me see if I'm correctly understanding this requirement.
>>> You're saying that given the dsc presented to dak by the
>>> tag2upload service, dak needs to be able to verify the contents
>>> of the DSC based on the user's signature and no external data.
>> 
>> Yes.
>> 
>> dak will push the signed .dsc into the pool.  This file and the
>> complete source package can then be verified independently by
>> everyone.  We don't need to trust ftp-master's verification of
>> the signature.
>> 
>> Not only dak, but everyone who downloads the source package needs
>> to be able to verify the user signature.
>> 

Sean> Okay, thanks.

Sean> I think that the Git-Tag-Info field solves this.  With that
Sean> field available, anyone can do the following to perform an
Sean> equivalent verification:

Sean> 1. fetch the .dsc from the archive

Sean> 2. fetch, from dgit-repos, the tag given in the Git-Tag-Info
Sean> field of the .dsc

This violates the "no external data" requirement above.

You could technically meet this requirement by including a git bundle in
the dsc, although that would be an unacceptable design for a variety of
other reasons that seem obvious enough not to require enumeration.

--Sam



Re: tag2upload (git-debpush) service architecture - draft

2019-08-01 Thread Guillem Jover
On Thu, 2019-08-01 at 04:37:41 -0700, Sean Whitton wrote:
> On Wed 31 Jul 2019 at 10:53PM +01, Rebecca N. Palmer wrote:
> > Do "complicated and inconvenient" mean "harder to remember than 'git
> > debpush'" (which could equally well be fixed by a local-only script),
> > the confusing errors mentioned below, or something else?
> 
> It's a qualitative claim about what it is like to use the tools.  We
> think that use of git-debpush imposes much less Debian-specific
> cognitive load on package maintainers.  They are just signing and
> pushing a git tag.

But Debian-specific work requires Debian-specific knowledge.

> > On 31/07/2019 20:21, Sean Whitton wrote:
> > > Just fyi, it is indeed as simple as [dgit push-source && git push --all 
> > > --follow-tags].  However, when
> > > there are errors, it is quite a bit harder to understand what's going on
> > > than it is with git-debpush/tag2upload, basically because there are
> > > .dscs involved.
> >
> > If git debpush / tag2upload have better error handling, could that code
> > be used to improve dgit push-source?
> 
> The sense in which git-debpush & tag2upload have better error handling
> is just that the user's computer is not responsible for any .dsc
> manipulation.  This makes it easier to have the correct mental model of
> what's going on, and thus what went wrong.
> 
> As soon as .dsc generation is happening on the user's machine, you
> introduce a whole load of stuff which the user has to incorporate into
> their mental model of what's going on with the command they just typed.
> 
> You and I basically already have all that stuff in our heads because
> we've been doing Debian stuff for a while.  We want experienced
> contributors to be able to discard it, and new contributors not have to
> learn it.

This argument seems very counter-intuitive. This assumes a
Debian-archive-centric world-view, where most of the heavy lifting is
delegated to some external service. But the reality is that most
maintainers, and other external parties handling Debian sources do
need to handle actual source packages. Requiring people to set up an
equivalent to the Debian archive + dgit service + tag2upload "locally"
just so that they can reduce their cognitive load, seems to be pretty
much the opposite to the above stated goal TBH.

Personally I'm more interested in simplifying our toolchain and
concepts so that this cognitive load is reduced for everyone/everything.

Thanks,
Guillem



Re: tag2upload (git-debpush) service architecture - draft

2019-08-01 Thread Sean Whitton
Hello Bastian,

On Wed 31 Jul 2019 at 10:37PM +02, Bastian Blank wrote:

> On Wed, Jul 31, 2019 at 03:21:32PM -0400, Sam Hartman wrote:
>> Bastian> One last time: The user has to certify his upload in a way
>> Bastian> the archive can verify.
>> Let me see if I'm correctly understanding this requirement.  You're
>> saying that given the dsc presented to dak by the tag2upload service,
>> dak needs to be able to verify the contents  of the DSC based on the
>> user's signature and no external data.
>
> Yes.
>
> dak will push the signed .dsc into the pool.  This file and the complete
> source package can then be verified independently by everyone.  We don't
> need to trust ftp-master's verification of the signature.
>
>> So, if the tag2upload service does some transformation to produce the
>> dsc:
>> 1) dak needs to be able to verify the inputs to that transformation
>> and
>> 2) confirm those inputs are certified back to a user signature.
>
> Not only dak, but everyone who downloads the source package needs to be
> able to verify the user signature.
>
> Ian's tag2upload tool wants to replace the user signature with a tool
> signature.  The user signature used as input for the tool would be not
> longer verifyable, as the input is not provided.  So everything after
> that would need to trust the tool and the instrastructure it runs on.
> This means we would need to trust it more than we need to trust
> ftp-master for source package verification.

Okay, thanks.

I think that the Git-Tag-Info field solves this.  With that field
available, anyone can do the following to perform an equivalent
verification:

1. fetch the .dsc from the archive

2. fetch, from dgit-repos, the tag given in the Git-Tag-Info field of
   the .dsc

3. check the uploader's signature on that tag against the Debian
   keyring/the Debian maintainers keyring/whatever it is the user wants
   to trust

4. produce a .dsc from the tag by running `dgit --quilt=foo
   build-source`, where 'foo' is a value from the signed metadata in the
   tag

5. unpack the .dscs from steps (1) and (4) with `dpkg-source -x`

6. the verification succeeds if the two unpacked trees are the same.

This process does not require trusting either ftp-master or dgit-repos.
Also, it should be noted that the tag cannot be deleted from dgit-repos
(except by a service administrator).  So we don't have to rely on salsa
either.
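
Step 6 above, comparing the two unpacked trees, could be sketched like
this. It is an illustration under stated assumptions rather than
anything dgit ships: tree_manifest and trees_match are hypothetical
names, and the comparison covers file contents, executable bits,
directories and symlink targets but not timestamps or ownership.

```python
import hashlib
import os


def tree_manifest(root: str) -> dict:
    """Map each relative path under root to a description of what is there."""
    manifest = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            if os.path.islink(path):
                # Record where the link points, do not follow it.
                manifest[rel] = ("link", os.readlink(path))
            elif os.path.isdir(path):
                manifest[rel] = ("dir",)
            else:
                with open(path, "rb") as handle:
                    digest = hashlib.sha256(handle.read()).hexdigest()
                executable = bool(os.lstat(path).st_mode & 0o111)
                manifest[rel] = ("file", digest, executable)
    return manifest


def trees_match(left: str, right: str) -> bool:
    """True if both unpacked trees contain the same paths with the same content."""
    return tree_manifest(left) == tree_manifest(right)
```

The two arguments would be the directories produced by running
`dpkg-source -x` on the .dscs from steps (1) and (4).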

Given the above, I believe your requirement is satisfied by tag2upload,
with only the addition of the new Git-Tag-Info field.  Perhaps you could
confirm my reasoning here.

-- 
Sean Whitton



Re: tag2upload (git-debpush) service architecture - draft

2019-08-01 Thread Sean Whitton
Hello,

On Wed 31 Jul 2019 at 10:53PM +01, Rebecca N. Palmer wrote:

> Do "complicated and inconvenient" mean "harder to remember than 'git
> debpush'" (which could equally well be fixed by a local-only script),
> the confusing errors mentioned below, or something else?

It's a qualitative claim about what it is like to use the tools.  We
think that use of git-debpush imposes much less Debian-specific
cognitive load on package maintainers.  They are just signing and
pushing a git tag.

I do not believe that a script which just runs the two commands will
achieve that.

> On 31/07/2019 20:21, Sean Whitton wrote:
>
>> Just fyi, it is indeed as simple as [dgit push-source && git push --all 
>> --follow-tags].  However, when
>> there are errors, it is quite a bit harder to understand what's going on
>> than it is with git-debpush/tag2upload, basically because there are
>> .dscs involved.
>
> If git debpush / tag2upload have better error handling, could that code
> be used to improve dgit push-source?

The sense in which git-debpush & tag2upload have better error handling
is just that the user's computer is not responsible for any .dsc
manipulation.  This makes it easier to have the correct mental model of
what's going on, and thus what went wrong.

As soon as .dsc generation is happening on the user's machine, you
introduce a whole load of stuff which the user has to incorporate into
their mental model of what's going on with the command they just typed.

You and I basically already have all that stuff in our heads because
we've been doing Debian stuff for a while.  We want experienced
contributors to be able to discard it, and new contributors not have to
learn it.

>> (I don't think we'd want to make git-debpush a wrapper for that because
>> it is not a pure git command, so shouldn't be in the git-* namespace.)
>
> I'm fine with calling it something else (though I find it weird that 'do
> X' isn't OK but 'ask a server to do X' is): suggestions?

I'm not really convinced that we should upload to the archive a shell
script which simply executes one dgit command followed by one git
command.  A shell alias in .bashrc seems appropriate.  I've one of
those, called 'debrel', since you asked :)
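
For readers who want the behaviour of such an alias spelled out, here is
a minimal sketch in Python. The two commands are the ones quoted in this
thread; the function name release and its injectable run parameter are
hypothetical, and this is not the 'debrel' alias itself.

```python
import subprocess

# The two commands from the thread, run in order.
COMMANDS = [
    ["dgit", "push-source"],
    ["git", "push", "--all", "--follow-tags"],
]


def release(run=subprocess.run) -> int:
    """Run each command in turn, stopping at the first failure like `&&`."""
    for argv in COMMANDS:
        code = run(argv).returncode
        if code != 0:
            # Do not push to git if the source upload failed.
            return code
    return 0
```

Calling release() in a packaging checkout would return 0 only if both
commands succeed.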

-- 
Sean Whitton



Re: tag2upload (git-debpush) service architecture - draft

2019-08-01 Thread Rebecca N. Palmer

Bastian Blank wrote:
> The git object checksums don't suffice anymore due to SHA1.  And as the
> world moves towards SHA3, it will need to have the ability to follow.

Ian Jackson wrote:
> The git signed tag object has a signature which is verifiable without
> relying on the git object hash system.  The tag text directly contains
> the source package name, and version, and intended upload target.


A git tag is internally similar to an SHA1-only .dsc or .changes, in 
that it uses a hash to specify what the actual repository contents 
should be: verifying the tag signature without using the hash only tells 
you that an authorized person tried to upload *something*, not whether 
it was the same content as is currently in Salsa.


Do you now intend to add an SHA-256 hash, or is one of us mistaken?

$ git cat-file tag debian/1.3.2-6
object 6a899bec4829cd941b65f9ddc2d4f6ef5468b972
type commit
tag debian/1.3.2-6
tagger Rebecca N. Palmer  1549574096 +

beignet Debian release 1.3.2-6
[signature deleted]

Bastian Blank wrote:
> The output of all operations obviously needs to be reproducible to be
> signed.


Other parties could re-run the tag2upload transformation to verify it, 
but this would require reading from Salsa as well as the archive.


I agree that any re-signing form of tag2upload is highly 
security-critical code, and should be held to our standards for such. 
(I don't know what those standards are.)




Re: tag2upload (git-debpush) service architecture - draft

2019-07-31 Thread Rebecca N. Palmer

On 31/07/2019 17:08, Ian Jackson wrote:
> .dsc generation is complicated, slow, and inconvenient.


In what circumstances is it slow enough to matter?

My measurements, in a sid chroot:
source         .orig   .debian  origcreate  dpkg-b.
               size    size     time        time
dgit (native)  4M               0.3sec      1.4sec
beignet        1M      <0.1M    0.5sec      1.7sec
theano         13M     <0.1M    1.4sec      1.8sec
libgpuarray    0.3M    <0.1M    1.6sec      2.1sec
statsmodels    10M     0.2M     1.7sec      3.3sec
pandas         8M      4M       1.6sec      31.4sec

origcreate = create the *.orig.tar.* from the git repo (time gbp 
buildpackage --git-builder=/bin/true --git-cleaner=/bin/true)
dpkg-b. = build the source package (.debian.tar.* + .dsc + .changes) 
from that .orig and the repo (time dpkg-buildpackage -S -d -nc -us -uc)


The bad case (pandas) is a non-native package with a big /debian, which 
is rare (~30 this big in sid), and even it's quick compared to 
building+testing the binaries.
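
Anyone wanting to repeat such measurements on their own packages could
use something like the following sketch. It records wall-clock time
only, which is an assumption about what the figures above represent;
time_command is a hypothetical helper, and the two command lines are
copied from the definitions of origcreate and dpkg-b. above. They are
listed but not run here, since they need a packaging git checkout.

```python
import subprocess
import sys
import time


def time_command(argv, cwd=None) -> float:
    """Run a command to completion and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(
        argv,
        cwd=cwd,
        check=True,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return time.perf_counter() - start


# Commands quoted in the message; run them from inside the package's repo.
ORIGCREATE = ["gbp", "buildpackage", "--git-builder=/bin/true", "--git-cleaner=/bin/true"]
DPKG_B = ["dpkg-buildpackage", "-S", "-d", "-nc", "-us", "-uc"]

if __name__ == "__main__":
    # Harmless demonstration: time a no-op Python process.
    elapsed = time_command([sys.executable, "-c", "pass"])
    print(f"{elapsed:.1f}sec")
```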


---

Do "complicated and inconvenient" mean "harder to remember than 'git 
debpush'" (which could equally well be fixed by a local-only script), 
the confusing errors mentioned below, or something else?


On 31/07/2019 20:21, Sean Whitton wrote:
> Just fyi, it is indeed as simple as [dgit push-source && git push --all
> --follow-tags].  However, when there are errors, it is quite a bit
> harder to understand what's going on than it is with
> git-debpush/tag2upload, basically because there are .dscs involved.

If git debpush / tag2upload have better error handling, could that code
be used to improve dgit push-source?

> (I don't think we'd want to make git-debpush a wrapper for that because
> it is not a pure git command, so shouldn't be in the git-* namespace.)

I'm fine with calling it something else (though I find it weird that 'do
X' isn't OK but 'ask a server to do X' is): suggestions?




Re: tag2upload (git-debpush) service architecture - draft

2019-07-31 Thread Bastian Blank
Hi Sam

On Wed, Jul 31, 2019 at 03:21:32PM -0400, Sam Hartman wrote:
> Bastian> One last time: The user has to certify his upload in a way
> Bastian> the archive can verify.
> Let me see if I'm correctly understanding this requirement.  You're
> saying that given the dsc presented to dak by the tag2upload service,
> dak needs to be able to verify the contents  of the DSC based on the
> user's signature and no external data.

Yes.

dak will push the signed .dsc into the pool.  This file and the complete
source package can then be verified independently by everyone.  We don't
need to trust ftp-master's verification of the signature.

> So, if the tag2upload service does some transformation to produce the
> dsc:
> 1) dak needs to be able to verify the inputs to that transformation
> and
> 2) confirm those inputs are certified back to a user signature.

Not only dak, but everyone who downloads the source package needs to be
able to verify the user signature.

Ian's tag2upload tool wants to replace the user signature with a tool
signature.  The user signature used as input for the tool would no
longer be verifiable, as the input is not provided.  So everything after
that would need to trust the tool and the infrastructure it runs on.
This means we would need to trust it more than we need to trust
ftp-master for source package verification.

> Have I understood your requirement?

Yes.

Regards,
Bastian

-- 
Without followers, evil cannot spread.
-- Spock, "And The Children Shall Lead", stardate 5029.5



Re: tag2upload (git-debpush) service architecture - draft

2019-07-31 Thread Sam Hartman
>>>>> "Bastian" == Bastian Blank  writes:

Bastian> Hi Ian
Bastian> On Wed, Jul 31, 2019 at 05:08:51PM +0100, Ian Jackson wrote:
>> Bastian Blank writes ("Re: tag2upload (git-debpush) service
>> architecture - draft"):
>> > The hypothetical tool creates a complete .dsc file with the names
>> > and checksums of the uncompressed files.  The user signed .dsc is
>> > put into the tag.
>> The point of the tag2upload exercise is to move the .dsc
>> generation from the uploader's computer to a central service,
>> because .dsc generation is complicated, slow, and inconvenient.
>> So generating the .dsc on the user's system defeats the object of
>> the exercise.

Bastian> One last time: The user has to certify his upload in a way
Bastian> the archive can verify.

Let me see if I'm correctly understanding this requirement.  You're
saying that given the dsc presented to dak by the tag2upload service,
dak needs to be able to verify the contents  of the DSC based on the
user's signature and no external data.

So, if the tag2upload service does some transformation to produce the
dsc:

1) dak needs to be able to verify the inputs to that transformation

and
2) confirm those inputs are certified back to a user signature.

Presumably this all needs to be doable using software we'd be
comfortable running as part of dak.

Have I understood your requirement?

--Sam



Re: tag2upload (git-debpush) service architecture - draft

2019-07-31 Thread Sean Whitton
Hello,

On Wed 31 Jul 2019 at 07:53AM +01, Rebecca N. Palmer wrote:

> (c-scriptedstatusquo) git debpush becomes an automated way to do what is
> currently recommended, i.e. it creates and pushes a signed git tag (to
> salsa and to dgit), creates tarballs, creates and signs .dsc+.changes,
> dputs .dsc+.changes+tarball(s).  (This might be as simple as "dgit
> push-source && git push --all --follow-tags" [C], but I haven't tested
> that.)  tag2upload doesn't need to exist.

Just fyi, it is indeed as simple as those two commands.  However, when
there are errors, it is quite a bit harder to understand what's going on
than it is with git-debpush/tag2upload, basically because there are
.dscs involved.

(I don't think we'd want to make git-debpush a wrapper for that because
it is not a pure git command, so shouldn't be in the git-* namespace.)

-- 
Sean Whitton




Re: tag2upload (git-debpush) service architecture - draft

2019-07-31 Thread Bastian Blank
Hi Ian

On Wed, Jul 31, 2019 at 05:08:51PM +0100, Ian Jackson wrote:
> Bastian Blank writes ("Re: tag2upload (git-debpush) service architecture - 
> draft"):
> > The hypothetical tool creates a complete .dsc file with the names and
> > checksums of the uncompressed files.  The user signed .dsc is put into
> > the tag.
> The point of the tag2upload exercise is to move the .dsc generation
> from the uploader's computer to a central service, because .dsc
> generation is complicated, slow, and inconvenient.  So generating the
> .dsc on the user's system defeats the object of the exercise.

One last time:  The user has to certify his upload in a way the archive
can verify.

Now it is EOD from me.

Regards,
Bastian

-- 
All your people must learn before you can reach for the stars.
-- Kirk, "The Gamesters of Triskelion", stardate 3259.2



Re: tag2upload (git-debpush) service architecture - draft

2019-07-31 Thread Ian Jackson
Ansgar writes ("Re: tag2upload (git-debpush) service architecture - draft"):
> There are also other issues, for example:
> 
>  - Such a service would bypass various sanity checks on the archive
>    side, including various permission checks.

What permission checks are bypassed ?  The current service does expect
to perform the DD/DM check on behalf of the archive.  But that is
straightforward.

>  - Such a service would need to properly validate the PGP signature.
>    The archive really shouldn't rely on a third-party service for this.
>    (In particular the service in question here doesn't do that as far as
>    I can tell.)

My prototype already validates the PGP signature on the signed tag it
uses as its input and instructions.  That seemed obviously essential
to me even for a demo.  (Particularly as even in the demo in theory
the machinery could be subverted by a malicious salsa, otherwise.)

I had the code for that and the DM/DD permission check already,
because they were needed for the dgit git server, which already has
a permissions implementation equivalent to that of the archive (and
using the DAM-supplied data files for that purpose).

Perhaps I have misunderstood what you mean by "validate the PGP
signature".

Ian.

-- 
Ian Jackson   These opinions are my own.

If I emailed you from an address @fyvzl.net or @evade.org.uk, that is
a private address which bypasses my fierce spamfilter.



Re: tag2upload (git-debpush) service architecture - draft

2019-07-31 Thread Ian Jackson
Bastian Blank writes ("Re: tag2upload (git-debpush) service architecture - 
draft"):
> The hypothetical tool creates a complete .dsc file with the names and
> checksums of the uncompressed files.  The user signed .dsc is put into
> the tag.

This tool is almost exactly "dgit" and therefore already exists.  It
does parallel publication in the archive (.dsc) and git (signed tags).

The point of the tag2upload exercise is to move the .dsc generation
from the uploader's computer to a central service, because .dsc
generation is complicated, slow, and inconvenient.  So generating the
.dsc on the user's system defeats the object of the exercise.

Ian.

-- 
Ian Jackson   These opinions are my own.

If I emailed you from an address @fyvzl.net or @evade.org.uk, that is
a private address which bypasses my fierce spamfilter.



Re: tag2upload (git-debpush) service architecture - draft

2019-07-31 Thread Ian Jackson
Bastian Blank writes ("Re: tag2upload (git-debpush) service architecture - 
draft"):
> We discussed a bit within the ftp team and several points came up.  The
> following describes my interpretation of it:
> 
> The archive will need to do the final validation to check if an upload
> is accepted.  The uploader's signature would need to be added to the
> source package to allow checking the validity also in the future.  We
> already retain all user signatures of source packages in the archive and
> such a proposed service must provide the same level of possible
> verification.

I can certainly include a copy of the git signed tag object.  This
would require a modest change to dak to accept the new filename.  Can
you please tell me what filename would be good ?

> The signature needs to be collision resistant and needs to be verifiable
> with only the stuff included into the source package.  The git object
> checksums don't suffice anymore due to SHA1.  And as the world moves
> towards SHA3, it will need to have the ability to follow.  The output of
> all operations obviously needs to be reproducible to be signed.

The git signed tag object has a signature which is verifiable without
relying on the git object hash system.  The tag text directly contains
the source package name, and version, and intended upload target.
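
To illustrate the point about the signature covering the tag text: a signed git tag object is plain text with an ASCII-armoured OpenPGP signature appended, so the signed payload can be separated out and handed to any OpenPGP implementation without recomputing git object hashes. Below is a minimal Python sketch of that split. The sample tag, including the bracketed metadata line, is invented for illustration and is not the real git-debpush tag format. Note also that the payload still names the tagged commit by its git object id.

```python
PGP_BEGIN = "-----BEGIN PGP SIGNATURE-----"

# Made-up example of the text of a signed tag object (not a real upload tag).
SAMPLE_TAG = (
    "object 0123456789abcdef0123456789abcdef01234567\n"
    "type commit\n"
    "tag debian/1.0-1\n"
    "tagger A. Uploader <uploader@example.org> 1564500000 +0000\n"
    "\n"
    "hello version 1.0-1 for unstable\n"
    "\n"
    "[hypothetical-metadata please-upload]\n"
    "-----BEGIN PGP SIGNATURE-----\n"
    "\n"
    "(base64 signature data would go here)\n"
    "-----END PGP SIGNATURE-----\n"
)


def split_signed_tag(raw):
    """Return (payload, signature) for the text of a signed tag object."""
    idx = raw.find("\n" + PGP_BEGIN)
    if idx < 0:
        raise ValueError("tag is not signed")
    payload = raw[: idx + 1]    # the signed text, up to and including its newline
    signature = raw[idx + 1:]   # the armoured signature block
    return payload, signature


def parse_payload(payload):
    """Split the signed text into its header fields and the tag message."""
    head, _, message = payload.partition("\n\n")
    headers = dict(line.split(" ", 1) for line in head.splitlines())
    return headers, message


payload, signature = split_signed_tag(SAMPLE_TAG)
headers, message = parse_payload(payload)
print(headers["tag"], "signed payload is", len(payload), "characters")
```

Whether the archive would store exactly this payload, and under what filename, is the open question raised above.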

> I don't know if any of this requires a new dpkg source format to
> implement properly.

I don't think so.

Ian.

-- 
Ian Jackson   These opinions are my own.

If I emailed you from an address @fyvzl.net or @evade.org.uk, that is
a private address which bypasses my fierce spamfilter.



Re: tag2upload (git-debpush) service architecture - draft

2019-07-31 Thread Jonathan McDowell
On Mon, Jul 29, 2019 at 09:46:51 +0200, Ansgar wrote:
> There are also other issues, for example:
>
>  - Such a service would bypass various sanity checks on the archive
>    side, including various permission checks.

tag2upload checks the Debian Keyring and the DM ACL (from dak)/DM
keyring. What other checks are done per key that are avoided by this?

The proposed Git-Tag-Info field has the fingerprint of the original key
that signed the tag, if there are further keyring related checks that
ftp-master wish to perform themselves.

>  - Such a service would need to properly validate the PGP signature.
>    The archive really shouldn't rely on a third-party service for this.
>    (In particular the service in question here doesn't do that as far as
>    I can tell.)

I'm not clear on what the issue is here; perhaps you can expand? A DD/DM
has to sign the tag in Salsa[0], tag2upload does the appropriate check
that the tag is signed by a key that has the appropriate permissions to
do a source upload of the package in question.
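
The permission decision described here can be sketched roughly as follows. This is a simplified Python illustration, not the tag2upload or dak code: the keyrings are modelled as plain sets of fingerprints and the DM ACL as a mapping from fingerprint to allowed source packages, all of which are assumptions made for the example.

```python
def may_upload(fingerprint, source, dd_keyring, dm_keyring, dm_acl):
    """Decide whether the key that signed the tag may upload `source`.

    dd_keyring, dm_keyring: sets of key fingerprints (hypothetical model)
    dm_acl: dict mapping a DM fingerprint to the set of sources it may upload
    """
    if fingerprint in dd_keyring:
        # In this model a Debian Developer key may upload any source package.
        return True
    if fingerprint in dm_keyring:
        # A Debian Maintainer key needs an explicit per-package grant.
        return source in dm_acl.get(fingerprint, set())
    # Unknown keys are rejected outright.
    return False


# Invented fingerprints, for illustration only.
DD_KEYS = {"AAAA1111"}
DM_KEYS = {"BBBB2222"}
DM_ACL = {"BBBB2222": {"hello"}}

print(may_upload("BBBB2222", "hello", DD_KEYS, DM_KEYS, DM_ACL))
```

The signature on the tag has to be verified against the same keyrings before this lookup means anything; the sketch covers only the authorisation step.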

There's no third party service being trusted here. The keyrings are from
Debian and the intent is to have tag2upload being run on Debian
infrastructure - indeed that's my understanding of part of the main
reason Ian has brought up his draft architecture here, to work out what
needs changed/included to make that possible.

It's not clear to me why this is significantly different from a security
perspective than the buildds; in fact as this service does not build
anything other than a source package it's much more auditable than the
binary uploads they perform[1].

J.

[0] Other Git Hosting Services Are Available.
[1] Until Reproducible Builds are at 100%.

-- 
101 things you can't have too much of : 40 - Star Wars toys.



Re: tag2upload (git-debpush) service architecture - draft

2019-07-31 Thread Rebecca N. Palmer
There are at least 2 questions being debated here, and at least 5 
proposed solutions, and they are frequently being confused.


The questions:

(1-trust) Is it acceptable in principle for the archive to trust a 
tag2upload service?  (i.e. have tag2upload rather than dak be 
responsible for checking the tag signature)


(2-hash) If yes, is it acceptable for tag2upload to rely on SHA-1?

The solutions ('git debpush' is the part running on the uploader's 
system, 'tag2upload' is the part running on a server):


(a-sha1resign) git debpush pushes a special signed tag "please upload 
this commit" (i.e. identified by sha1).  tag2upload creates a source 
package from this, signs it with its own key, and dputs it.  (Ian's 
original, [A])


(b-sha256resign)  As (a) except the tag also includes a sha256. [B+D]

(c-scriptedstatusquo) git debpush becomes an automated way to do what is 
currently recommended, i.e. it creates and pushes a signed git tag (to 
salsa and to dgit), creates tarballs, creates and signs .dsc+.changes, 
dputs .dsc+.changes+tarball(s).  (This might be as simple as "dgit 
push-source && git push --all --follow-tags" [C], but I haven't tested 
that.)  tag2upload doesn't need to exist.


(d-tarballrecreator) git debpush creates and pushes a signed git tag, 
creates and signs .dsc+.changes, and sends them (but _not_ the tarballs 
they refer to) to tag2upload.  tag2upload creates the tarball(s) from 
the git repo, and dputs the .dsc+.changes+tarball(s). [B+D]


(e-modifydak) Add at least some git-upload-related functionality to dak 
itself, instead of a separate tag2upload service.  (This is more of a 
family of solutions than a single option: the specific variant [E] 
proposed by Bastian is close to (d), but the equivalent of (b) could 
also be done this way.)


Table of advantages and disadvantages (+=better, -=worse, .=slightly 
worse, compared to doing nothing):


abcde
Uploader's convenience:
+ Only need to know/type 'git debpush'
++ ++ Doesn't waste bandwidth on tarballs
Security:
--  - (1-trust) Requires trusting the new code
- (2-hash) Relies on SHA-1
Implementation difficulty:
 -.-- Code doesn't already exist
 . -. Needs reproducible tarballs (d) or equivalent (b+e)
-- -  Requires (somewhere to run a) new service
- Requires changes to dak
--  ? Breaks "get sponsor name from .dsc" tools
abcde

On 30/07/2019 16:54, Bastian Blank wrote:
> On Sun, Jul 28, 2019 at 07:05:49PM +0100, Rebecca N. Palmer wrote:
>> That suggests that working towards requiring the SHA-256 mode of git (which
>> at least sort of exists since 2.21 [2], but I don't know if it's usable yet)
>> might be a better use of effort.
>
> Please keep in mind that the archive needs to verify this.  How do you
> intend to provide the required information within the existing source
> package structure?


We don't: this is only trying to fix (2-hash), while you evidently 
object to (1-trust).


Also, as hinted at by Marco, the SHA-256 mode of git doesn't work yet:

(with git 1:2.23.0~rc0-1; the config lines are from [0])
$ cat .git/config
[core]
    repositoryFormatVersion = 1
[extensions]
    objectFormat = sha256
    compatObjectFormat = sha1
[core]
    filemode = true
    bare = false
    logallrefupdates = true
$ git log
fatal: unknown repository extensions found:
    objectformat
    compatobjectformat



> Another idea, [...]  I would say
> this is a new source format.


I agree that implementing the whole of your proposal would require 
modifying dak.  (I see it as "implement (some of) tag2upload inside 
dak".)  This potentially has similar security implications to having dak 
trust tag2upload: lower risk as it would be under the established 
package/maintainers/sysadmins for such sensitive code, but higher impact 
if gaining control of dak is worse/easier to hide than just being able 
to upload.


However, it has two elements that could be useful for a 
(d-tarballrecreator) scheme with current dak.  (They would then need to 
be .dsc+.changes not just .dsc, as .dsc and .changes must be signed by 
the same key [1].)



> a complete .dsc file with the names and
> checksums of the uncompressed files.


Not compressing the tarballs may make reproducibility easier.


> The user signed .dsc is put into
> the tag.


This would allow the git repo to be the only communication channel from 
git debpush to tag2upload.  (As in (a/b-sha*resign), but I don't know if 
this matters.)


[A] https://lists.debian.org/debian-devel/2019/07/msg00501.html
[B+D] https://lists.debian.org/debian-devel/2019/07/msg00596.html
[C] https://lists.debian.org/debian-devel/2019/07/msg00601.html
[E] https://lists.debian.org/debian-devel/2019/07/msg00641.html
[0] 
https://sources.debian.org/src/git/1:2.22.0-1/Documentation/technical/hash-function-transition.txt/#L125

[1] https://salsa.debian.org/ftp-team/dak/blob/master/daklib/checks.py#L157



Re: tag2upload (git-debpush) service architecture - draft

2019-07-30 Thread Bastian Blank
On Sun, Jul 28, 2019 at 07:05:49PM +0100, Rebecca N. Palmer wrote:
> That suggests that working towards requiring the SHA-256 mode of git (which
> at least sort of exists since 2.21 [2], but I don't know if it's usable yet)
> might be a better use of effort.

Please keep in mind that the archive needs to verify this.  How do you
intend to provide the required information within the existing source
package structure?

> [1] needs reproducibility, but simpler than pristine-tar in that we're only
> trying to create _a_ reproducible tarball (not match one created by
> upstream) and don't need to compress it (as it can be deleted after hashing
> - unfortunately tar doesn't obviously have a write-to-stdout option to allow
> tar | sha256).  Reproducible builds suggests tar --sort=name --owner=0
> --group=0 --numeric-owner.

For now "git archive" with tar output seems to be reproducible from jessie
(2.1.4) to sid (2.23 rc).

Another idea, however we would need to trust some decompressors:

The hypothetical tool creates a complete .dsc file with the names and
checksums of the uncompressed files.  The user signed .dsc is put into
the tag.

The tag2upload service creates the .changes files with the names and
checksums of the compressed files.  It is then signed by the upload
tool.

Accepting a package with dak would look more like this:
- Verify signature on .changes.
- Check for source-only (forced by the upload tool flag).
- Check checksums of included files.
- Verify signature of .dsc.
- Check ACL against user signature on .dsc.
- Decompress (this poses a DoS threat!).
- Check checksums of included decompressed files.
- Either:
  - accept compressed files as is.
  - re-compress (also DoS, due to large files), calculate new checksums,
    accept.

Due to the implicit compression of files listed in .dsc, I would say
this is a new source format.
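
One common way to limit the decompression DoS risk mentioned in the list above is to cap how much output the decompressor may produce and to hash the output as it streams, so nothing is written to disk. The following is a minimal Python sketch of that idea. It assumes xz-compressed input and an arbitrary caller-chosen limit; it is not how dak works today.

```python
import hashlib
import lzma


class OutputTooLarge(Exception):
    """Raised when decompression would exceed the configured cap."""


def sha256_of_xz_stream(fileobj, max_bytes, chunk_size=64 * 1024):
    """Hash the decompressed content of an xz stream, refusing oversized output."""
    decomp = lzma.LZMADecompressor()
    digest = hashlib.sha256()
    total = 0
    while not decomp.eof:
        if decomp.needs_input:
            chunk = fileobj.read(chunk_size)
            if not chunk:
                raise ValueError("truncated xz stream")
        else:
            chunk = b""  # drain output the decompressor is still holding
        # max_length bounds how much output a single call may return.
        out = decomp.decompress(chunk, max_length=chunk_size)
        total += len(out)
        if total > max_bytes:
            raise OutputTooLarge(f"more than {max_bytes} bytes of output")
        digest.update(out)
    return digest.hexdigest()
```

The cap bounds memory and disk use but not CPU time entirely, so a real service would presumably also want a time limit around the call.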

Regards,
Bastian

-- 
A little suffering is good for the soul.
-- Kirk, "The Corbomite Maneuver", stardate 1514.0



Re: tag2upload (git-debpush) service architecture - draft

2019-07-29 Thread Sean Whitton
Hello,

On Sun 28 Jul 2019 at 09:55PM +01, Rebecca N. Palmer wrote:

> On 28/07/2019 20:01, Sean Whitton wrote:
>> When I read your first e-mail what I thought you had in mind was just
>> this -- having git-debpush compute a stronger hash of the tree object
>> and add that to the tag metadata, ignoring commit objects.
>
> Of the files in the signer's repository, not of an actual tree object
> (since the second is a list of file/subtree SHA-1 hashes).

Ah, right.

>> But now I'm struggling to understand the relevance of your discussion of
>> having git-debpush create a .dsc in your second e-mail, if what you're
>> actually talking about is hashing a git tree object.
>
> "Tag with sha256" and "hidden .dsc" are two alternative options: the
> first is a narrowly targeted fix for the SHA-1 issue, the second a
> bigger redesign.

Okay.

-- 
Sean Whitton




Re: tag2upload (git-debpush) service architecture - draft

2019-07-29 Thread Ansgar
Bernd Zeimetz writes:
> On 7/27/19 8:16 PM, Rebecca N. Palmer wrote:
>> As a way to avoid relying on SHA-1, would it work to have git-debpush
>> include a longer hash in the tag message, and tag2upload also verify
>> that hash?
>>
> The other idea would be to convince git upstream to use something
> better than sha1 - and after a bit of searching, I found
[...]
> So I think the best thing to do is to get sha256 working in git and
> force the usage of sha256 if you want to sign a tag for upload.

That will take quite a while; we would probably need a version of git
supporting that in stable.

There are also other issues, for example:

 - Such a service would bypass various sanity checks on the archive
   side, including various permission checks.

 - Such a service would need to properly validate the PGP signature.
   The archive really shouldn't rely on a third-party service for this.
   (In particular the service in question here doesn't do that as far as
   I can tell.)

Ansgar



Re: tag2upload (git-debpush) service architecture - draft

2019-07-28 Thread Marco d'Itri
On Jul 28, Bernd Zeimetz  wrote:

> So I think the best thing to do is to get sha256 working in git and
> force the usage of sha256 if you want to sign a tag for upload.
This cannot be a goal for this project, since git upstream will
apparently need a few more years for the transition to sha-256 to happen.

-- 
ciao,
Marco




Re: tag2upload (git-debpush) service architecture - draft

2019-07-28 Thread Bernd Zeimetz



On 7/27/19 8:16 PM, Rebecca N. Palmer wrote:
> As a way to avoid relying on SHA-1, would it work to have git-debpush
> include a longer hash in the tag message, and tag2upload also verify
> that hash?
>
The other idea would be to convince git upstream to use something
better than sha1 - and after a bit of searching, I found

https://github.com/git/git/blob/master/Documentation/technical/hash-function-transition.txt

- Git v2.13.0 and later use a hardened sha-1 implementation by
default, which isn't vulnerable to the SHAttered attack.
Still sha-1, though.

- there is a plan to support sha256.

Googling a bit more found

https://stackoverflow.com/questions/28159071/why-doesnt-git-use-more-modern-sha

which gives some insight on the (plans for) implementation.


So I think the best thing to do is to get sha256 working in git and
force the usage of sha256 if you want to sign a tag for upload.



-- 
 Bernd Zeimetz                            Debian GNU/Linux Developer
 http://bzed.de                                http://www.debian.org
 GPG Fingerprint: ECA1 E3F2 8E11 2432 D485  DD95 EB36 171A 6FF9 435F



Re: tag2upload (git-debpush) service architecture - draft

2019-07-28 Thread Rebecca N. Palmer

On 28/07/2019 20:01, Sean Whitton wrote:
> When I read your first e-mail what I thought you had in mind was just
> this -- having git-debpush compute a stronger hash of the tree object
> and add that to the tag metadata, ignoring commit objects.


Of the files in the signer's repository, not of an actual tree object 
(since the second is a list of file/subtree SHA-1 hashes).



> But now I'm struggling to understand the relevance of your discussion of
> having git-debpush create a .dsc in your second e-mail, if what you're
> actually talking about is hashing a git tree object.


"Tag with sha256" and "hidden .dsc" are two alternative options: the 
first is a narrowly targeted fix for the SHA-1 issue, the second a 
bigger redesign.



> (As an aside, if what you want is to hide .dsc creation from the user
> but still do it on their machine and upload it, `dgit push-source` is
> already available.)


That doesn't push to salsa [0].  However, I agree that it otherwise does 
solve the problem of "not making the uploader think about how Debian 
source packages work", without requiring a server-side component.


This does still "waste" the uploader's bandwidth on tarballs, but I 
don't know if that's an issue in practice.  For most packages [1] it is 
a much smaller data volume than the downloads needed to keep an 
up-to-date sid for building/testing the package.


[0] https://sources.debian.org/src/dgit/9.6/dgit-maint-gbp.7.pod/#L117
[1] Rough numbers: ~80% of .orig.tar.*z are <1MB, ~97% <10MB; a gcc 
update is a ~30MB download.




Re: tag2upload (git-debpush) service architecture - draft

2019-07-28 Thread Sean Whitton
Hello,

On Sun 28 Jul 2019 at 07:05pm +01, Rebecca N. Palmer wrote:

> On 28/07/2019 16:18, Ian Jackson wrote:
>> What it amounts to is a parallel Merkle tree to the
>> git one, just with a different data format and a better hash.
>
> Not really: it wouldn't need the history tree structure (in Git terms
> [0], it would be a tree object not a commit object), and if we use
> tar+sha256 [1], it wouldn't need the hash-per-file directory tree
> structure either.

When I read your first e-mail what I thought you had in mind was just
this -- having git-debpush compute a stronger hash of the tree object
and add that to the tag metadata, ignoring commit objects.

But now I'm struggling to understand the relevance of your discussion of
having git-debpush create a .dsc in your second e-mail, if what you're
actually talking about is hashing a git tree object.

(As an aside, if what you want is to hide .dsc creation from the user
but still do it on their machine and upload it, `dgit push-source` is
already available.)

On Sun 28 Jul 2019 at 04:18pm +01, Ian Jackson wrote:

> The downside is that the tag is no longer just a normal signed git tag
> with some easy to construct and easy to understand metadata.  It will
> in practice then not be practical to make this tag other than with
> git-debpush (or some other special utility with the same code).

This is a downside, but it's not a permanent one -- it goes away if git
switches away from SHA-1, which perhaps it is reasonable to expect
eventually.

It would be good to hear responses to Rebecca's suggestion from those
who disagree that it is okay to rely on SHA-1 in the particular way that
git-debpush/tag2upload does.

-- 
Sean Whitton



Re: tag2upload (git-debpush) service architecture - draft

2019-07-28 Thread Rebecca N. Palmer

On 28/07/2019 16:18, Ian Jackson wrote:
> What it amounts to is a parallel Merkle tree to the
> git one, just with a different data format and a better hash.


Not really: it wouldn't need the history tree structure (in Git terms 
[0], it would be a tree object not a commit object), and if we use 
tar+sha256 [1], it wouldn't need the hash-per-file directory tree 
structure either.



> The upside is the better hash, but I think our overall risk from the
> git SHA-1 problem is (i) still in practice quite low


For attacks happening now, I agree (but am not an expert): my intent in 
suggesting this was "this is an easy way to have a better hash if we 
want it", not to take a side on the question of whether we need it.


This may change, but we have the option of implementing this fix then 
(and if it happens suddenly, temporarily disabling tag2upload to give us 
time to do so).



> (ii) exists in
> all the other places we rely on git already.


That suggests that working towards requiring the SHA-256 mode of git 
(which at least sort of exists since 2.21 [2], but I don't know if it's 
usable yet) might be a better use of effort.


[0] https://git-scm.com/book/en/v2/Git-Internals-Git-Objects
[1] needs reproducibility, but simpler than pristine-tar in that we're 
only trying to create _a_ reproducible tarball (not match one created by 
upstream) and don't need to compress it (as it can be deleted after 
hashing - unfortunately tar doesn't obviously have a write-to-stdout 
option to allow tar | sha256).  Reproducible builds suggests tar 
--sort=name --owner=0 --group=0 --numeric-owner.
[2] 
https://github.com/git/git/blob/master/Documentation/technical/hash-function-transition.txt
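
As an aside on footnote [1]: GNU tar can write the archive to standard output with `-f -`, so a `tar ... | sha256sum` pipeline is possible. The same "hash a normalised, uncompressed tar stream without storing it" idea can also be sketched in Python with the standard `tarfile` module. This is an illustration under assumptions, not a proposed canonical format: owner, group and mtime are zeroed and entries are sorted by name, roughly mirroring the Reproducible Builds options quoted above. The resulting digest is specific to this sketch and will not match the output of GNU tar or `git archive`.

```python
import hashlib
import os
import tarfile


class HashSink:
    """Write-only file object that hashes its input and stores nothing."""

    def __init__(self):
        self.digest = hashlib.sha256()

    def write(self, data):
        self.digest.update(data)
        return len(data)


def _normalise(info):
    # Drop metadata that differs between machines (cf. --owner=0 --group=0).
    info.uid = info.gid = 0
    info.uname = info.gname = ""
    info.mtime = 0
    return info


def tree_sha256(root):
    """SHA-256 of a normalised tar stream of everything below `root`."""
    names = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            names.append(os.path.relpath(os.path.join(dirpath, name), root))
    sink = HashSink()
    # "w|" is tarfile's streaming mode: it only ever calls sink.write().
    with tarfile.open(fileobj=sink, mode="w|", format=tarfile.GNU_FORMAT) as tar:
        for name in sorted(names):  # fixed order, cf. --sort=name
            tar.add(os.path.join(root, name), arcname=name,
                    recursive=False, filter=_normalise)
    return sink.digest.hexdigest()
```

File modes are deliberately left in the stream, since the executable bit is part of the source; whether that is the right choice for a signed hash is a design decision this sketch does not settle.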




Re: tag2upload (git-debpush) service architecture - draft

2019-07-28 Thread Rebecca N. Palmer

On 28/07/2019 10:58, Bernd Zeimetz wrote:
> On 7/27/19 8:16 PM, Rebecca N. Palmer wrote:
>> As a way to avoid relying on SHA-1, would it work to have git-debpush
>> include a longer hash in the tag message, and tag2upload also verify
>> that hash?
>
> what exactly would you create that long hash of?


The signer's local files when they run git-debpush.  (To be decided: how 
to define the hash of a directory tree (as opposed to a single file), 
i.e. "tar | sha256 like a .dsc" or "what git uses but sha256".)


The hash security is for ensuring that tag2upload is seeing the same 
content as the signer did, and not something different an attacker 
placed on Salsa.  (If the attacker can get their changes into the 
signer's local copy without the signer noticing, we'd have a problem 
whatever method the signer uses to upload it.)


This does sort of raise the question of why not prefer "keep .dscs, but 
hide them from the user and regenerate tarballs", but this might be 
inappropriately reopening an already decided issue.  (I remember it 
being suggested before, but not what (if any) response this got.)


(+/=/- are relative to the existing proposal)
+ Security: dak doesn't have to trust dgit-repos-server
  (avoids both weak hashes and potential bugs)
+ Compatibility: finding the signer's name from the .dsc still works
= Uploader only needs to do 'git debpush'
= Doesn't spend uploader's (possibly low/expensive) bandwidth on
  uploading what Salsa already has
- Someone would have to implement it
  (if that's me - not in Perl and I'm not a DD or a security specialist)

git-debpush:
    create .dsc # as normal
    create tag # as normal, only needs version number
    sign tag # not strictly required, but since the next step
             # needs a key anyway, good to automate best practice
    sign .dsc
    push tag to Salsa
    upload .dsc to dgit-repos-server # but not its tarballs

dgit-repos-server --tag2upload:
    receive .dsc
    check .dsc signature # do this first to prevent DoS
        # maybe also check the version number to prevent DoS by
        # re-submitting old/non-Debian .dscs
    fetch source from Salsa
    create source package tarballs
    check if these match the .dsc hashes # not strictly required as
                                         # dak will do it again anyway, but easy
    dput the .dsc+tarballs # as normal

# not sure where .changes fits into this:
# replace ".dsc" by ".dsc+.changes" throughout?
# or have dgit-repos-server create .changes as if it were a buildd?
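
The "check if these match the .dsc hashes" step in the outline above could look roughly like this. It is a minimal Python sketch that reads the `Checksums-Sha256` field of a .dsc by hand and compares it with regenerated files held in memory. The function names and the in-memory dict are assumptions for illustration; a real implementation would first verify the OpenPGP signature on the .dsc and would probably use an existing control-file parser.

```python
import hashlib


def parse_checksums_sha256(dsc_text):
    """Return {filename: (sha256_hex, size)} from a .dsc's Checksums-Sha256 field."""
    entries = {}
    in_field = False
    for line in dsc_text.splitlines():
        if line.startswith("Checksums-Sha256:"):
            in_field = True
        elif in_field and line.startswith(" "):
            # Continuation lines have the form " <sha256> <size> <filename>".
            digest, size, name = line.split()
            entries[name] = (digest, int(size))
        else:
            in_field = False
    return entries


def mismatches(dsc_text, rebuilt):
    """List the files in the .dsc that the regenerated tarballs do not match.

    rebuilt: dict mapping filename to the regenerated file content (bytes).
    """
    problems = []
    for name, (digest, size) in parse_checksums_sha256(dsc_text).items():
        data = rebuilt.get(name)
        if data is None:
            problems.append(f"{name}: was not regenerated")
        elif len(data) != size or hashlib.sha256(data).hexdigest() != digest:
            problems.append(f"{name}: does not match the signed .dsc")
    return problems
```

An empty list means every file named in the signed .dsc was reproduced exactly, which is the condition under which the server could go on to dput.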



Re: tag2upload (git-debpush) service architecture - draft

2019-07-28 Thread Ian Jackson
Rebecca N. Palmer writes ("Re: tag2upload (git-debpush) service architecture - 
draft"):
> The signer's local files when they run git-debpush.  (To be decided: how 
> to define the hash of a directory tree (as opposed to a single file), 
> i.e. "tar | sha256 like a .dsc" or "what git uses but sha256".)

This would of course be possible.  I don't think it's a particularly
good idea though.  What it amounts to is a parallel Merkle tree to the
git one, just with a different data format and a better hash.

The upside is the better hash, but I think our overall risk from the
git SHA-1 problem is (i) still in practice quite low (ii) exists in
all the other places we rely on git already.

The downside is that the tag is no longer just a normal signed git tag
with some easy to construct and easy to understand metadata.  It will
in practice then not be practical to make this tag other than with
git-debpush (or some other special utility with the same code).

Ian.

-- 
Ian Jackson   These opinions are my own.

If I emailed you from an address @fyvzl.net or @evade.org.uk, that is
a private address which bypasses my fierce spamfilter.



Re: tag2upload (git-debpush) service architecture - draft

2019-07-28 Thread Bernd Zeimetz
On 7/27/19 8:16 PM, Rebecca N. Palmer wrote:
> As a way to avoid relying on SHA-1, would it work to have git-debpush
> include a longer hash in the tag message, and tag2upload also verify
> that hash?

what exactly would you create that long hash of?

If we don't trust sha-1, then we might also not be able to trust the
linked list of commits a git tag is pointing to.


-- 
 Bernd Zeimetz                            Debian GNU/Linux Developer
 http://bzed.de                                http://www.debian.org
 GPG Fingerprint: ECA1 E3F2 8E11 2432 D485  DD95 EB36 171A 6FF9 435F



Re: tag2upload (git-debpush) service architecture - draft

2019-07-27 Thread Rebecca N. Palmer
As a way to avoid relying on SHA-1, would it work to have git-debpush 
include a longer hash in the tag message, and tag2upload also verify 
that hash?
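
A rough Python sketch of what this suggestion could mean in practice: the client appends a SHA-256 line to the tag message before signing, and the server recomputes the digest from what it fetched and compares. The field name `Tree-SHA256` is hypothetical, and how the digest of the source tree is defined is left open here.

```python
import hmac
import re

FIELD = "Tree-SHA256"  # hypothetical field name, not part of any real tag format


def add_hash_to_message(message, hexdigest):
    """Client side: append the longer hash to the tag message before signing."""
    return f"{message.rstrip()}\n\n{FIELD}: {hexdigest}\n"


def extract_hash(message):
    """Server side: pull the claimed digest out of the (already verified) tag."""
    match = re.search(rf"^{FIELD}: ([0-9a-f]{{64}})$", message, re.MULTILINE)
    return match.group(1) if match else None


def tag_matches_tree(message, recomputed_hexdigest):
    """True only if the tag carries a digest equal to the recomputed one."""
    claimed = extract_hash(message)
    return claimed is not None and hmac.compare_digest(claimed, recomputed_hexdigest)
```

Because the digest sits inside the signed tag text, an attacker who can substitute objects on the git server would also need a SHA-256 collision for the check to pass.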




Re: tag2upload (git-debpush) service architecture - draft

2019-07-27 Thread Ian Jackson
Jonathan McDowell writes ("Re: tag2upload (git-debpush) service architecture - 
draft"):
> For the record I am in favour of this as a service. I'm not a dgit user,
> but I am a salsa user who pushes release tags there and then uploads to
> the archive. Reducing this to a single action sounds like less work for
> me and result in less likelihood of me forgetting a step (either the
> push to salsa, or sometimes an upload).

Right.  Thanks for your support.

> On Wed, Jul 24, 2019 at 02:56:22AM +0100, Ian Jackson wrote:
> >   Please see this blog post to learn about how it works:
> >   https://spwhitton.name/blog/entry/tag2upload/

Thanks for the review.

> I've clarified with Ian that despite Sean's blog talking about the
> debian-keyring package the dgit infrastructure correctly uses the
> keyring in /srv/keyring.debian.org/ as deployed by DSA on the Debian
> infrastructure.

Right.  This is the way the dgit git server already verifies DM push
permission.

> >  * tag2upload service
> > [stuff]
> 
> The piece of information that I think is missing here (and I've been
> able to discover in person) is that the "trusted" piece (all the !s) is
> keeping state during the processing of a particular tag/upload. That is,
> the trusted component gets handed the tag info, verifies it is sane,
> hands it off to the untrusted component to fetch + build a source
> package for, then does as much verification as it can that what it gets
> back from the untrusted component is the same package/version as
> expected.

Indeed.
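
The final step of that split, where the trusted side checks what came back from the untrusted builder against the state it kept from the tag, might be sketched as follows in Python. The class and function names are invented for illustration, and the control fields are modelled as plain dicts. Only the field names `Source`, `Version` and `Distribution` come from the .dsc and .changes formats.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TagRequest:
    """What the trusted component recorded from the verified tag."""
    source: str
    version: str
    target: str


def check_builder_output(request, dsc_fields, changes_fields):
    """Return a list of discrepancies; an empty list means the output is as expected."""
    problems = []
    for label, fields in (("dsc", dsc_fields), ("changes", changes_fields)):
        if fields.get("Source") != request.source:
            problems.append(f"{label}: unexpected Source {fields.get('Source')!r}")
        if fields.get("Version") != request.version:
            problems.append(f"{label}: unexpected Version {fields.get('Version')!r}")
    if changes_fields.get("Distribution") != request.target:
        problems.append(
            f"changes: unexpected Distribution {changes_fields.get('Distribution')!r}"
        )
    return problems


req = TagRequest(source="hello", version="1.0-1", target="unstable")
dsc = {"Source": "hello", "Version": "1.0-1"}
changes = {"Source": "hello", "Version": "1.0-1", "Distribution": "unstable"}
print(check_builder_output(req, dsc, changes))
```

Keeping the request immutable on the trusted side means the builder cannot change what it is later compared against.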

> > [1] In principle other git servers would be possible but it would have
> > to be restricted to ones where we can either avoid, or stop, them
> > being used as a channel for a DoS attack against the tag2upload
> > service.
> 
> If we're hoping to pitch salsa as being the default place for Debian
> packages to live is limiting this service to salsa not a decent carrot?

I know some people have qualms about salsa.  (I have some slight
qualms myself).  So I think at the very least we (I, in this case)
should leave the door open to competing git hosting service(s).

The technical requirement here is merely basic sanity, and a certain
level of defence against DoS.  I think it is not a requirement that a
supported server is operated by Debian.  Or even that it continues to
exist - as part of the upload process the git history is transferred
to the *.dgit.d.o git servers, so if the original server vanishes we
still have everything.

Ian.

-- 
Ian Jackson   These opinions are my own.

If I emailed you from an address @fyvzl.net or @evade.org.uk, that is
a private address which bypasses my fierce spamfilter.



Re: tag2upload (git-debpush) service architecture - draft

2019-07-27 Thread Bastian Blank
Hi Ian

On Wed, Jul 24, 2019 at 02:56:22AM +0100, Ian Jackson wrote:
> We've had a number of peripheral conversations, and informal
> internal reviews, but I think it's the stage now to have a public
> design review etc.  I'm CCing this to -devel because I just did a
> lightning talk demo of the prototype and IME many people are
> interested in these kinds of questions.

We discussed a bit within the ftp team and several points came up.  The
following describes my interpretation of it:

The archive will need to do the final validation to check if an upload
is accepted.  The uploader's signature would need to be added to the
source package to allow checking the validity also in the future.  We
already retain all user signatures of source packages in the archive and
such a proposed service must provide the same level of possible
verification.

The signature needs to be collision resistant and needs to be verifiable
with only the stuff included into the source package.  The git object
checksums don't suffice anymore due to SHA1.  And as the world moves
towards SHA3, it will need to have the ability to follow.  The output of
all operations obviously needs to be reproducible to be signed.

I don't know if any of this requires a new dpkg source format to
implement properly.

The service still might need credentials of its own, but no permissions
will be attached to it.  And whatever you do, don't use Perl as
implementation language.

I would like to have such a service.  However it would have been nice
for you to talk about the verification requirements before you ask for a
key and a way to circumvent the archive upload checks and restrictions.

Regards,
Bastian

-- 
"We have the right to survive!"
"Not by killing others."
-- Deela and Kirk, "Wink of An Eye", stardate 5710.5



Re: tag2upload (git-debpush) service architecture - draft

2019-07-27 Thread Sean Whitton
Hello,

On Fri 26 Jul 2019 at 08:50PM +01, Jonathan McDowell wrote:

> I've clarified with Ian that despite Sean's blog talking about the
> debian-keyring package the dgit infrastructure correctly uses the
> keyring in /srv/keyring.debian.org/ as deployed by DSA on the Debian
> infrastructure.

Right, thanks.  Use of that package is just for try-it-on-your-laptop.

> The piece of information that I think is missing here (and I've been
> able to discover in person) is that the "trusted" piece (all the !s) is
> keeping state during the processing of a particular tag/upload. That is,
> the trusted component gets handed the tag info, verifies it is sane,
> hands it off to the untrusted component to fetch + build a source
> package for, then does as much verification as it can that what it gets
> back from the untrusted component is the same package/version as
> expected.

Thanks for this.

-- 
Sean Whitton



Re: tag2upload (git-debpush) service architecture - draft

2019-07-26 Thread Jonathan McDowell
On Wed, Jul 24, 2019 at 02:56:22AM +0100, Ian Jackson wrote:

> I wrote this draft design doc / deployment plan for the tag-to-upload
> service, perhaps best summarised by Sean like this:
> 
>   We designed and implemented a system to make it possible for DDs to
>   upload new versions of packages by simply pushing a specially
>   formatted git tag to salsa.

For the record I am in favour of this as a service. I'm not a dgit user,
but I am a salsa user who pushes release tags there and then uploads to
the archive. Reducing this to a single action sounds like less work for
me and should make me less likely to forget a step (either the
push to salsa, or sometimes an upload).

>   Please see this blog post to learn about how it works:
>   https://spwhitton.name/blog/entry/tag2upload/

I've clarified with Ian that despite Sean's blog talking about the
debian-keyring package the dgit infrastructure correctly uses the
keyring in /srv/keyring.debian.org/ as deployed by DSA on the Debian
infrastructure.

> TAG-TO-UPLOAD - DEBIAN - DRAFT DESIGN / DEPLOYMENT PLAN
> =======================================================
> 
> Overall structure and dataflow
> ------------------------------
> 
>  * Uploader (DD or DM) makes signed git tag (containing metadata
>    forming instructions to tag2upload service)
> 
>  * Uploader pushes said tag to salsa. [1]
> 
>  * salsa sends webhook to tag2upload service.
> 
>  * tag2upload service
> : provides an HTTPS service accessible to salsa's IP addrs
> : fishes url and tag name out of webhook json
> ! checks that url is basically sane
> - retrieves tag data (git shallow clone)
> ! parses the tag metadata
> ! checks to see if it is relevant
> ! verifies signature
> ! checks to see if signed by DD, or DM for appropriate package
> - obtains relevant git history
> - obtains, if applicable, orig tarball from archive
> - makes source package
> # signs source package and "dgit view" git tag
> - pushes history and both tags to dgit git server
> - uploads source package to archive
> 
>  * archive publishes package as normal

The piece of information that I think is missing here (and I've been
able to discover in person) is that the "trusted" piece (all the !s) is
keeping state during the processing of a particular tag/upload. That is,
the trusted component gets handed the tag info, verifies it is sane,
hands it off to the untrusted component to fetch + build a source
package for, then does as much verification as it can that what it gets
back from the untrusted component is the same package/version as
expected.

Looking at risk factors I think the major ones are dealt with:

 * The package build is still performed by the buildd, not by this new
   service, so there shouldn't be exposure to build issues for
   tag2upload.
 * tag2upload is making the appropriate checks that the signer of the
   tag has the right to upload the package to the archive: the signer is
   either a full DD or a DM with appropriate DAK ACL rights.
 * Automated signers for uploads are not new; buildds are already doing
   this for binary packages.
 * The complexity is in creating the source package; figuring out the
   source format type, potentially applying patches etc. This is pushed
   out to the untrusted component.
 * Given that the tag signer is independently able to do an upload this
   does not provide any additional avenue for them to push a nefarious
   package into the archive.

> [1] In principle other git servers would be possible but it would have
> to be restricted to ones where we can either avoid, or stop, them
> being used as a channel for a DoS attack against the tag2upload
> service.

If we're hoping to pitch salsa as being the default place for Debian
packages to live is limiting this service to salsa not a decent carrot?

J.

-- 
"For the effect of psychedelics on the development community, well,
there's Enlightenment, isn't there?" -- Adam J. Thornton, asr.



tag2upload (git-debpush) service architecture - draft

2019-07-23 Thread Ian Jackson
Hi all.

I wrote this draft design doc / deployment plan for the tag-to-upload
service, perhaps best summarised by Sean like this:

  We designed and implemented a system to make it possible for DDs to
  upload new versions of packages by simply pushing a specially
  formatted git tag to salsa.

  Please see this blog post to learn about how it works:
  https://spwhitton.name/blog/entry/tag2upload/

The server side of this is not running yet and there is some work to
do for that.

We've had a number of peripheral conversations, and informal
internal reviews, but I think it's the stage now to have a public
design review etc.  I'm CCing this to -devel because I just did a
lightning talk demo of the prototype and IME many people are
interested in these kinds of questions.

Right now this document is maintained here:
   https://salsa.debian.org/dgit-team/dgit/tree/wip.tag2upl-draft
but NB that that is a potentially rewinding branch.  (I probably won't
rewind it until it's time to fold it into master at which point I may
just delete it.)

Ian.


TAG-TO-UPLOAD - DEBIAN - DRAFT DESIGN / DEPLOYMENT PLAN
=======================================================

Overall structure and dataflow
------------------------------

 * Uploader (DD or DM) makes signed git tag (containing metadata
   forming instructions to tag2upload service)

 * Uploader pushes said tag to salsa. [1]

 * salsa sends webhook to tag2upload service.

 * tag2upload service
: provides an HTTPS service accessible to salsa's IP addrs
: fishes url and tag name out of webhook json
! checks that url is basically sane
- retrieves tag data (git shallow clone)
! parses the tag metadata
! checks to see if it is relevant
! verifies signature
! checks to see if signed by DD, or DM for appropriate package
- obtains relevant git history
- obtains, if applicable, orig tarball from archive
- makes source package
# signs source package and "dgit view" git tag
- pushes history and both tags to dgit git server
- uploads source package to archive

 * archive publishes package as normal

[1] In principle other git servers would be possible but it would have
to be restricted to ones where we can either avoid, or stop, them
being used as a channel for a DoS attack against the tag2upload
service.
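
As a rough illustration of the two earliest steps above ("fishes url and
tag name out of webhook json" and "checks that url is basically sane"),
here is a minimal Python sketch.  It is not the actual implementation.
The JSON field names (`ref`, `repository.git_http_url`) are assumptions
based on the usual shape of a GitLab tag push event, and the function
names and the allowed URL pattern are hypothetical.

```python
import json
import re

# Assumption: only https URLs on salsa are accepted, with a conservative
# character set in the path.  The real service may use different rules.
SANE_URL = re.compile(r"https://salsa\.debian\.org/[A-Za-z0-9._/-]+\.git")


def extract_request(body):
    """Return (url, tag) from a webhook body, or raise ValueError."""
    payload = json.loads(body)  # malformed JSON raises a ValueError subclass
    if not isinstance(payload, dict):
        raise ValueError("payload is not a JSON object")
    ref = payload.get("ref")
    repo = payload.get("repository")
    url = repo.get("git_http_url") if isinstance(repo, dict) else None
    if not isinstance(ref, str) or not isinstance(url, str):
        raise ValueError("missing ref or repository url")
    if not ref.startswith("refs/tags/"):
        raise ValueError("not a tag ref")
    return url, ref[len("refs/tags/"):]


def url_is_sane(url):
    """Cheap syntactic check done by the trusted part before any fetch."""
    return SANE_URL.fullmatch(url) is not None and ".." not in url
```

Nothing here is trusted for authorisation.  The webhook only says where
to look; the decision to upload rests on the signed tag, which is
verified in a later step.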

Service architecture
--------------------

I propose the following architecture for the tag2upload service.

 * Packet filter limiting the incoming connections to salsa.

 * Conventional webserver offering TLS and using Let's Encrypt.
   (Alternatively, HTTP could be used, but in the future we
   might want to handle embargoed security uploads so let's not.)

 * Web-service-style "application server" written in some scripting
   language listens on a local TCP port, handles HTTP connections
   proxied by the webserver, parses the JSON, and connects to:

 * Trusted service daemon.  Listens on a TCP connection and accepts a
   simple line-based "url tag" protocol.  Checks urls and tags for
   basic syntax and sanity (eg that it has the right protocol and
   host).  Keeps track of incoming requests in a sqlite3 database so
   that execution can be deferred and retried as applicable.  Spawns
   per-request worker children.

 * Request processor.  Trusted.  Does the trusted parts above.

 * Some VM or container or maybe chroot.  Instantiated by request
   processor via adt-virt protocol.  Request processor controls this
   by sending it commands (via the adt-virt facility for this).

 * In the VM, git is used to fetch all the bits and dgit does the
   actual source package generation work.

 * Trusted service daemon needs access to its GPG key which should be
   on a hardware token and not accessible to the VM instances.
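
To make the "trusted service daemon" item more concrete, here is a
minimal Python sketch of a line-based "url tag" request being recorded
in an sqlite3 table so that it can be deferred and retried.  The exact
wire format, the table layout, the state names and the retry policy are
all assumptions made for illustration; none of them are specified
above.

```python
import sqlite3

# Hypothetical schema: one row per (url, tag) request.
SCHEMA = """
CREATE TABLE IF NOT EXISTS requests (
    id         INTEGER PRIMARY KEY,
    url        TEXT NOT NULL,
    tag        TEXT NOT NULL,
    state      TEXT NOT NULL DEFAULT 'queued',
    attempts   INTEGER NOT NULL DEFAULT 0,
    not_before REAL NOT NULL DEFAULT 0,
    UNIQUE (url, tag)
)
"""


def parse_line(line):
    """Parse one 'url tag' line (assumed: exactly two space-separated fields)."""
    fields = line.rstrip("\n").split(" ")
    if len(fields) != 2 or not all(fields):
        raise ValueError("expected exactly: url tag")
    return fields[0], fields[1]


def enqueue(db, url, tag):
    """Record a request; a repeated webhook for the same tag is ignored."""
    with db:
        db.execute(
            "INSERT OR IGNORE INTO requests (url, tag) VALUES (?, ?)", (url, tag)
        )


def claim_next(db, now):
    """Pick the oldest runnable request and mark it running, or return None."""
    with db:
        row = db.execute(
            "SELECT id, url, tag FROM requests "
            "WHERE state = 'queued' AND not_before <= ? ORDER BY id LIMIT 1",
            (now,),
        ).fetchone()
        if row is None:
            return None
        db.execute(
            "UPDATE requests SET state = 'running', attempts = attempts + 1 "
            "WHERE id = ?",
            (row[0],),
        )
    return row


def finish(db, request_id, ok, now, retry_delay=300.0, max_attempts=5):
    """Mark a request done, or requeue it for a later retry."""
    with db:
        if ok:
            db.execute("UPDATE requests SET state = 'done' WHERE id = ?",
                       (request_id,))
        else:
            db.execute(
                "UPDATE requests SET "
                "state = CASE WHEN attempts >= ? THEN 'failed' ELSE 'queued' END, "
                "not_before = ? WHERE id = ?",
                (max_attempts, now + retry_delay, request_id),
            )
```

The daemon would call claim_next() in a loop and spawn one worker child
per claimed row.  Passing the clock in as `now` is only there to keep
the sketch easy to test.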

Privsep
-------

The tag2upload service will have to have a signing key that can upload
source packages to the archive.

We do not want that signing key to be abused.  In particular, even
though it will be in a hardware token we want to avoid giving
unrestricted access to that key to code which also has a large attack
surface.  In particular, source package construction is very complex.

So there will be a privilege separation arrangement, as described
above.  Different tasks run in a different security context:

! is fully trusted and has access to the signing key

- runs in the discardable VM or container, controlled by `!'

# is achieved by the `dgit rpush' protocol, where the trusted
  (invoking, signing) part offers a restricted signing oracle to
  the less-trusted (building) part.  The signing oracle will check
  that the files to be signed are roughly in the right form and
  that they name the right source package.  It will construct the
  "dgit view" git tag itself from metadata provided by the
  building part.

: can run as different unix users or even different VMs or
  something, if desirable
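
The kind of check the restricted signing oracle makes ("that they name
the right source package") could look roughly like the Python sketch
below.  This is an illustration only and not how `dgit rpush' is
implemented.  It assumes the trusted part already knows the expected
source name and version from the verified tag, and the function names
are hypothetical.

```python
def parse_fields(text):
    """Parse a single control-file paragraph into a dict of lowercased names.

    Deliberately strict: duplicate fields and extra paragraphs are errors,
    because this input comes from the less-trusted building part.
    """
    fields = {}
    current = None
    for line in text.strip("\n").split("\n"):
        if line.strip() == "":
            raise ValueError("expected a single paragraph")
        if line[0] in " \t":
            if current is None:
                raise ValueError("continuation line before any field")
            fields[current] += "\n" + line.strip()
            continue
        name, sep, value = line.partition(":")
        if not sep or not name:
            raise ValueError("malformed field line")
        current = name.lower()
        if current in fields:
            raise ValueError("duplicate field")
        fields[current] = value.strip()
    return fields


def may_sign(dsc_text, expected_source, expected_version):
    """Return True only if the unsigned .dsc names the expected package."""
    try:
        fields = parse_fields(dsc_text)
    except ValueError:
        return False
    return (
        fields.get("source") == expected_source
        and fields.get("version") == expected_version
    )
```

The point of the arrangement is that the building part never sees the
key.  It can only ask for a signature, and the request is refused unless
the file matches what the trusted part expected from the tag it already
verified.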

Reproducibility, metadata and auditing