Re: Building Debian source packages reproducibly (was: Re: [RFC] Proposal for new source format)

2019-10-30 Thread Sean Whitton
Hello,

On Tue 29 Oct 2019 at 08:32AM +01, Tobias Frost wrote:

>> For example, you would not be able to do this:
>>    git clone salsa:something
>>    cd something
>>    make some straightforward change
>>    git tag    # } [1]
>>    git push   # }
>> Instead you would have to download the .origs and so on, and wait
>> while your machine crunched about unpacking and repacking tarballs,
>> applying patches, etc.
>
> 
> I'm missing a "and then I test my package to ensure it still works before
> upload" step…
>
> I wonder how someone should test their packages when they do
> not build it locally.
> And if they do (as they should), the advantages you outline
> are simply not there.
> 

If you use `dpkg-buildpackage -b` to do your local tests, then the
advantage of not having to go near any source packages remains.

-- 
Sean Whitton




Re: Building Debian source packages reproducibly (was: Re: [RFC] Proposal for new source format)

2019-10-29 Thread Ian Jackson
Helmut Grohne writes ("Re: Building Debian source packages reproducibly (was: 
Re: [RFC] Proposal for new source format)"):
> I think I'd trust the tag2upload service given the documentation you
> presented about it. I have less faith in all dgit installations being
> sane, sorry. We've run into too many builds in dirty chroots already.

That does make sense.  This is one of the ways that tag2upload is
better than dgit push.  (It is a shame that "integrity" concerns are
blocking integrity improvements.)

It would be possible to write a QA service which would verify Dgit
fields and automatically file RC bugs.  So far that hasn't seemed
necessary.

It would also be possible for dgit clone to verify the correspondence
itself, at the point where it honours the Dgit field.  Would that be a
useful feature for you ?  Of course it does mean downloading the
elements of the source package, which it currently doesn't need to do
if it finds a Dgit field, but there's no real difficulty.  (I wouldn't
make this the default!)
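
To make "verify the correspondence" concrete, here is a minimal sketch in
Python.  It is not dgit's code; it only assumes that the check boils down to
comparing an unpacked source package with a checkout of the commit named in
the Dgit field, file by file.  The names tree_manifest and trees_correspond
are made up for this illustration.

```python
import hashlib
import os


def tree_manifest(root):
    """Map each relative path under root to a description of its content."""
    manifest = {}
    for dirpath, dirnames, filenames in os.walk(root):
        if dirpath == root and ".git" in dirnames:
            # Top-level git metadata is not part of the source tree.
            dirnames.remove(".git")
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            if os.path.islink(path):
                # Record the link target instead of following it.
                manifest[rel] = ("link", os.readlink(path))
            elif name in filenames:
                with open(path, "rb") as fh:
                    digest = hashlib.sha256(fh.read()).hexdigest()
                executable = bool(os.stat(path).st_mode & 0o100)
                manifest[rel] = ("file", digest, executable)
    return manifest


def trees_correspond(unpacked_source, git_checkout):
    """True if both directories hold the same files with the same content."""
    return tree_manifest(unpacked_source) == tree_manifest(git_checkout)
```

Like git, the sketch does not track empty directories, and it only looks at
the executable bit rather than the full permissions.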

> > You do not need to talk to any random git servers.  The git tree is
> > available on a single official Debian server, the dgit git server.
> > The Dgit: field in the .dsc identifies the commitid.  The .dsc is of
> > course available via the signed apt repositories, as well as being
> > available from the ftpmaster data API.
> 
> I was not trying to imply dgit to be a random git server. Given that
> dgit (currently) only contains history for a fraction of packages, we
> still need to compare with Vcs-Git. Given enough time, dgit will have
> useful histories eventually.

Yes.  If tag2upload is deployed, I expect it to be very popular.

Until then Vcs-Git has all the problems you mention and many others
too: it is hard to reliably find the right tag (even the tag name is
not formally standardised!) and certainly nothing checks that the tag
corresponds in any particular way.  How it might correspond is
generally not even documented anywhere - at least, not anywhere
machine-readable.

> Hmm. I'm not sure whether I actually need the tag object. The commit id
> is what I really need. dak might need the tag object. I'll defer to
> others.

I think ftpmaster's concerns mean that dak would want the tag object
to redo the uploader identity verification, even though from my point
of view that would be a redundant check.  But it's simple to provide
the tag and there is some integrity improvement from doing so, so that
is what I am proposing.

Ian.

-- 
Ian Jackson   These opinions are my own.

If I emailed you from an address @fyvzl.net or @evade.org.uk, that is
a private address which bypasses my fierce spamfilter.



Re: Building Debian source packages reproducibly (was: Re: [RFC] Proposal for new source format)

2019-10-29 Thread Bastian Blank
Hi Didier

On Mon, Oct 28, 2019 at 10:05:11AM +0100, Didier 'OdyX' Raboud wrote:
> Of course, all of this can only work if we can have, or make the ".git to 
> .dsc" conversion reproducible; hence my query.

Now, please read the first mail of this thread again.  Yes, maybe parts
of it are unclear, but we are way past the "we need this conversion"
stage.

Maybe we can stop running in circles around this concept and design
solutions.

Bastian

-- 
Prepare for tomorrow -- get ready.
-- Edith Keeler, "The City On the Edge of Forever",
   stardate unknown



Re: Building Debian source packages reproducibly (was: Re: [RFC] Proposal for new source format)

2019-10-29 Thread Helmut Grohne
Hi Ian,

On Tue, Oct 29, 2019 at 12:54:57PM +0000, Ian Jackson wrote:
> I wonder if I have misunderstood you, because:
> 
> The tag2upload proposal is based on dgit, which already provides this.
> dgit indeed defines an isomorphism between source packages and git
> trees, and dgit clone gives a git branch that is thus-isomorphic to
> the .dsc.  This is fundamental to dgit's design.

I get that this is the intention, but I don't see that this property can
be safely assumed. I see the Dgit field as a hint. It says "this source
package should be equivalent to this commit" without any guarantees of
this actually being the case. I guess that for all uploads performed
thus far, this is indeed the case, but it is not a requirement validated
by dak or any other trusted (by me) entity. We could easily end up with
an upload where the commit id is accidentally different. Everything that
can be gotten wrong, we will eventually get wrong.

> With `dgit push', the isomorphism is checked on the maintainer's
> machine during `dgit push'.  With tag2upload it is ensured by the
> tag2upload service.  (When the uploader didn't use dgit, dgit clone
> does a .dsc import, thus ensuring the isomorphism.)

I think I'd trust the tag2upload service given the documentation you
presented about it. I have less faith in all dgit installations being
sane, sorry. We've run into too many builds in dirty chroots already.

> > This property allows me to start from a git tree that is
> > authenticated by dak rather than a random git tree on a random git
> > server of questionable origin.
> 
> You do not need to talk to any random git servers.  The git tree is
> available on a single official Debian server, the dgit git server.
> The Dgit: field in the .dsc identifies the commitid.  The .dsc is of
> course available via the signed apt repositories, as well as being
> available from the ftpmaster data API.

I was not trying to imply dgit to be a random git server. Given that
dgit (currently) only contains history for a fraction of packages, we
still need to compare with Vcs-Git. Given enough time, dgit will have
useful histories eventually.

> It is true that this doesn't give you precisely the *tag* object -
> just the commit.  Adding the objectid of the tag object to the .dsc
> Dgit: field would be easy, if that would be helpful to you.  (Please
> file a wishlist bug against dgit if so.)  Alternatively, dak could
> publish the tag object (in a similar way to how it publishes binary
> buildinfos).

Hmm. I'm not sure whether I actually need the tag object. The commit id
is what I really need. dak might need the tag object. I'll defer to
others.

Helmut



Re: Building Debian source packages reproducibly (was: Re: [RFC] Proposal for new source format)

2019-10-29 Thread Ian Jackson
Helmut Grohne writes ("Re: Building Debian source packages reproducibly (was: 
Re: [RFC] Proposal for new source format)"):
> In other words, I want these formats (source package and tagged git
> tree) to be isomorphic (minus history). This requirement is too strong
> since not every source package will have a corresponding tag, but when
> there is a tag, I want to safely go from source package to tag and back
> again and arrive where I started from.

I wonder if I have misunderstood you, because:

The tag2upload proposal is based on dgit, which already provides this.
dgit indeed defines an isomorphism between source packages and git
trees, and dgit clone gives a git branch that is thus-isomorphic to
the .dsc.  This is fundamental to dgit's design.

With `dgit push', the isomorphism is checked on the maintainer's
machine during `dgit push'.  With tag2upload it is ensured by the
tag2upload service.  (When the uploader didn't use dgit, dgit clone
does a .dsc import, thus ensuring the isomorphism.)

> This property allows me to start from a git tree that is
> authenticated by dak rather than a random git tree on a random git
> server of questionable origin.

You do not need to talk to any random git servers.  The git tree is
available on a single official Debian server, the dgit git server.
The Dgit: field in the .dsc identifies the commitid.  The .dsc is of
course available via the signed apt repositories, as well as being
available from the ftpmaster data API.
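
As a rough illustration of going from a .dsc to the commit, here is a small
Python sketch.  It assumes that the commit id is the first
whitespace-separated token of the Dgit field and is a 40-character hex
string; anything after that token is ignored.  It does not check the
signature on the .dsc, which a real consumer would have to do first.  The
function name dgit_commit is made up for this example.

```python
import re


def dgit_commit(dsc_text):
    """Return the commit id from the Dgit field of a .dsc, or None."""
    for line in dsc_text.splitlines():
        # Control-file field names are case-insensitive.
        if line.lower().startswith("dgit:"):
            tokens = line.split(":", 1)[1].split()
            if tokens and re.fullmatch(r"[0-9a-f]{40}", tokens[0]):
                return tokens[0]
            return None
    return None
```

The result could then be handed to something like `git fetch` against the
dgit git server; that step is left out here.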

It is true that this doesn't give you precisely the *tag* object -
just the commit.  Adding the objectid of the tag object to the .dsc
Dgit: field would be easy, if that would be helpful to you.  (Please
file a wishlist bug against dgit if so.)  Alternatively, dak could
publish the tag object (in a similar way to how it publishes binary
buildinfos).

Note that there are *two* tag objects: 1. the canonical view:
the dgit view tag, which is simply-isomorphic to the source package.
2. the maintainer tag, which is on the maintainer's branch and refers
to a commit in maintainer branch format.

With dgit push these are both made during dgit push with the
maintainer's key.  With tag2upload the canonical view tag is made by
the tag2upload service, because it is that service which performs the
maintainer->canonical conversion.

Each maintainer workflow defines a different mapping between
maintainer views and canonical views.  The (currently supported[1])
workflows are all isomorphisms.  So it is possible in principle to
reverse the maintainer->canonical transformation (if you know the
workflow, which can be found in the tags) but there is not currently
code to do that.  I don't get the impression, however, that this is a
thing you feel you need ?  (Some form of reverse transformation would
be needed to automatically and workflow-agnostically handle MRs whose
submitter is using the canonical view.)

> This backwards-connection seems to be missing thus far, but I do find it
> important for the reasons above. Adding it would easily allow dak to
> validate the signature on the tag.

So, I'm not sure I understand what you think is missing.

Ian.

[1] I think with monorepo workflows the maintainer->canonical
conversion is generally irreversible, because it discards information
about source packages other than the one in question.  This wouldn't
block MR processing because MRs are deltas and by definition the other
parts of the monorepo aren't edited in the MR.  It does mean you
couldn't reconstruct the whole monorepo given just the canonical view.

(Arguably this means that the .dsc representation of a source package
from a git monorepo is not a PFM.  See arguments on -legal and
-project, passim.  But the canonical view dgit branch does contain the
whole of the monorepo in its history, in a discoverable way, so
doesn't have this issue.)

-- 
Ian Jackson   These opinions are my own.

If I emailed you from an address @fyvzl.net or @evade.org.uk, that is
a private address which bypasses my fierce spamfilter.



Re: Building Debian source packages reproducibly

2019-10-29 Thread Philipp Kern

On 2019-10-29 08:32, Tobias Frost wrote:
> On Mon, Oct 28, 2019 at 05:53:00PM +0000, Ian Jackson wrote:
>
> (...)
>
>> For example, you would not be able to do this:
>>    git clone salsa:something
>>    cd something
>>    make some straightforward change
>>    git tag    # } [1]
>>    git push   # }
>> Instead you would have to download the .origs and so on, and wait
>> while your machine crunched about unpacking and repacking tarballs,
>> applying patches, etc.
>
> I'm missing a "and then I test my package to ensure it still works before
> upload" step…
>
> I wonder how someone should test their packages when they do
> not build it locally.
> And if they do (as they should), the advantages you outline
> are simply not there.

More abstractly we do not do that for binNMUs either. My main worry here 
is that we are designing a solution which still precludes sourceful 
no-change NMUs, which would actually be the correct solution for 
consistent versioning across all architectures. Ubuntu exclusively does 
those and I still struggle to see how we would build such a service in Debian 
without facing exactly the same concerns as tag2upload. Maybe if dak 
itself would do it?


Kind regards
Philipp Kern



Re: Building Debian source packages reproducibly (was: Re: [RFC] Proposal for new source format)

2019-10-29 Thread Tobias Frost
Hi Ian,

On Mon, Oct 28, 2019 at 05:53:00PM +0000, Ian Jackson wrote:

(...)
 
> For example, you would not be able to do this:
>    git clone salsa:something
>    cd something
>    make some straightforward change
>    git tag    # } [1]
>    git push   # }
> Instead you would have to download the .origs and so on, and wait
> while your machine crunched about unpacking and repacking tarballs,
> applying patches, etc.


I'm missing a "and then I test my package to ensure it still works before
upload" step…

I wonder how someone should test their packages when they do
not build it locally.
And if they do (as they should), the advantages you outline
are simply not there.


-- 
 tobi
 




Re: Building Debian source packages reproducibly (was: Re: [RFC] Proposal for new source format)

2019-10-28 Thread Helmut Grohne
Hi Ian,

On Mon, Oct 28, 2019 at 05:53:00PM +0000, Ian Jackson wrote:
> The sticking point, as I understand it, is that this still does not
> allow dak to verify that the *contents* of the .dsc were as intended
> by the uploading human. [0]
> 
> In the tag2upload proposal, the conversion from git tag to dsc is
> `merely' done by an official Debian service on an official Debian
> machine, and is `merely' fully reproducible and auditable.
> 
> But this is not good enough for some ftpmasters, who want that
> verification to be done *by dak*.  Various people attempted in the
> previous thread on this topic to find out *why* this is thought
> important, without apparent success.

I fear I'll have to side with "some ftpmasters" here. As a user, I also
want this verification work in both ways. Going from tag to upload is
insufficient in my view. What I want is "apt source" with history. This
is not debcheckout. I want the exact tree (tag) that matches unstable
including its git history in a way that exactly reproduces the build
failure seen on the source package.

In other words, I want these formats (source package and tagged git
tree) to be isomorphic (minus history). This requirement is too strong
since not every source package will have a corresponding tag, but when
there is a tag, I want to safely go from source package to tag and back
again and arrive where I started from. This property allows me to start
from a git tree that is authenticated by dak rather than a random git
tree on a random git server of questionable origin.

This backwards-connection seems to be missing thus far, but I do find it
important for the reasons above. Adding it would easily allow dak to
validate the signature on the tag.

Helmut



Re: Building Debian source packages reproducibly (was: Re: [RFC] Proposal for new source format)

2019-10-28 Thread Scott Kitterman



On October 28, 2019 5:53:00 PM UTC, Ian Jackson wrote:
>Scott Kitterman writes ("Re: Building Debian source packages
>reproducibly (was: Re: [RFC] Proposal for new source format)"):
>> Effectively tag2upload would replace DAK as the entry point for
>> packages into the archive (the equivalent to the current source
>> package signature verification being the git tag signature
>> verification).  I don't think the discussion got to a point where a
>> path forward that was considered reasonable by both the tag2upload
>> developers and the FTP Masters was reached.
>
>The current tag2upload proposal includes providing dak with the signed
>git tag object so that it can re-verify the identity of the human DD
>who authorised the upload.
>
>The sticking point, as I understand it, is that this still does not
>allow dak to verify that the *contents* of the .dsc were as intended
>by the uploading human. [0]
>
>In the tag2upload proposal, the conversion from git tag to dsc is
>`merely' done by an official Debian service on an official Debian
>machine, and is `merely' fully reproducible and auditable.
>
>But this is not good enough for some ftpmasters, who want that
>verification to be done *by dak*.  Various people attempted in the
>previous thread on this topic to find out *why* this is thought
>important, without apparent success.
>
>It would be nice to be able to work around this objection somehow by
>writing more code.  Unfortunately, any alternative - such as that
>described by Didier earlier in this thread - has undesirable
>properties.  In particular, it does not seem that it would be possible
>to do anything along these lines without continuing to burden the
>developer's working system with a whole pile of messing about with
>tarballs and quilt and so on.
>
>For example, you would not be able to do this:
>   git clone salsa:something
>   cd something
>   make some straightforward change
>   git tag    # } [1]
>   git push   # }
>Instead you would have to download the .origs and so on, and wait
>while your machine crunched about unpacking and repacking tarballs,
>applying patches, etc.
>
>With tag2upload all that work is done for you on the tag2upload
>service, which of course has a fast network connection - and you don't
>need to wait for it.
>
>Ian.
>
>[0] Currently, of course, this requirement is not achieved for
>existing git based uploads.  In practice, DDs use ad-hoc
>git-buildpackage runes on their local machine to convert git data into
>.dscs.  That conversion is not controlled, not reproducible, and not
>practically auditable.  I guess maybe those blocking tag2upload think
>it is sufficient that we can blame the DD if it is messed up or
>suborned ?
>
>[1] In practice with tag2upload you would use `git-debpush' instead of
>`git tag' + `git push' but this is a thin wrapper around `git tag' and
>does not involve downloads or indeed any network activity, etc.

And the talking past each other surely continues because I don't think that in 
any way answers the objections.  Repeating the same thing you've said before 
isn't going to close the communication gap.  I don't think it's possible to do 
so right now.  Also, I'm a mere FTP Assistant, so I'm not one of the ones you 
have to convince.

Scott K



Building Debian source packages reproducibly (was: Re: [RFC] Proposal for new source format)

2019-10-28 Thread Ian Jackson
Didier 'OdyX' Raboud writes ("Building Debian source packages reproducibly 
(was: Re: [RFC] Proposal for new source format)"):
> Where I'm coming from is that we were discussing the tag2upload
> problem at miniDebConf Vaumarcus.  [...]

I appreciate your efforts to try to unstick all this.

> The hard part is not the packing and unpacking of the special tag; that's 
> mostly just strings massaging. But building the exact same source package in 
> different environments is harder than I expected.

Yes.  I have code to do it for tag2upload, though.  It's not released
yet because I stopped putting effort into this whole area after
getting discouraged.

> Of course, all of this can only work if we can have, or make the ".git to 
> .dsc" conversion reproducible; hence my query.
> 
> All-in-all; would this be a welcome mechanism?

I think by requiring the user to always have the tarballs on hand and
wait for them to be manipulated and maybe transferred, you are losing
a fair amount of the benefit of tag2upload.

But if you did want to do something along these lines, maybe you
should do it by adding code to git-debpush and the tag2upload service
rather than by reinventing the rest of the machinery ?

Regards,
Ian.

-- 
Ian Jackson   These opinions are my own.

If I emailed you from an address @fyvzl.net or @evade.org.uk, that is
a private address which bypasses my fierce spamfilter.



Re: Building Debian source packages reproducibly (was: Re: [RFC] Proposal for new source format)

2019-10-28 Thread Ian Jackson
Scott Kitterman writes ("Re: Building Debian source packages reproducibly (was: 
Re: [RFC] Proposal for new source format)"):
> Effectively tag2upload would replace DAK as the entry point for
> packages into the archive (the equivalent to the current source
> package signature verification being the git tag signature
> verification).  I don't think the discussion got to a point where a
> path forward that was considered reasonable by both the tag2upload
> developers and the FTP Masters was reached.

The current tag2upload proposal includes providing dak with the signed
git tag object so that it can re-verify the identity of the human DD
who authorised the upload.

The sticking point, as I understand it, is that this still does not
allow dak to verify that the *contents* of the .dsc were as intended
by the uploading human. [0]

In the tag2upload proposal, the conversion from git tag to dsc is
`merely' done by an official Debian service on an official Debian
machine, and is `merely' fully reproducible and auditable.

But this is not good enough for some ftpmasters, who want that
verification to be done *by dak*.  Various people attempted in the
previous thread on this topic to find out *why* this is thought
important, without apparent success.

It would be nice to be able to work around this objection somehow by
writing more code.  Unfortunately, any alternative - such as that
described by Didier earlier in this thread - has undesirable
properties.  In particular, it does not seem that it would be possible
to do anything along these lines without continuing to burden the
developer's working system with a whole pile of messing about with
tarballs and quilt and so on.

For example, you would not be able to do this:
   git clone salsa:something
   cd something
   make some straightforward change
   git tag    # } [1]
   git push   # }
Instead you would have to download the .origs and so on, and wait
while your machine crunched about unpacking and repacking tarballs,
applying patches, etc.

With tag2upload all that work is done for you on the tag2upload
service, which of course has a fast network connection - and you don't
need to wait for it.

Ian.

[0] Currently, of course, this requirement is not achieved for
existing git based uploads.  In practice, DDs use ad-hoc
git-buildpackage runes on their local machine to convert git data into
.dscs.  That conversion is not controlled, not reproducible, and not
practically auditable.  I guess maybe those blocking tag2upload think
it is sufficient that we can blame the DD if it is messed up or
suborned ?

[1] In practice with tag2upload you would use `git-debpush' instead of
`git tag' + `git push' but this is a thin wrapper around `git tag' and
does not involve downloads or indeed any network activity, etc.

-- 
Ian Jackson   These opinions are my own.

If I emailed you from an address @fyvzl.net or @evade.org.uk, that is
a private address which bypasses my fierce spamfilter.



Re: Building Debian source packages reproducibly

2019-10-28 Thread Sven Joachim
On 2019-10-28 10:05 +0100, Didier 'OdyX' Raboud wrote:

> On Wednesday, 23 October 2019, 15:49:11 CET, Theodore Y. Ts'o wrote:
>> On Wed, Oct 23, 2019 at 11:18:24AM +1000, Russell Stuart wrote:
>> > On Tue, 2019-10-22 at 16:52 -0700, Russ Allbery wrote:
>> > > That seems excessively pessimistic.  What about Git makes you think
>> > > it's impossible to create a reproducible source package?
>> > 
>> > Has it been done?  Given this point has been raised several times
>> > before if it hasn't been done by now I think it's reasonable to assume
>> > it's difficult, and thinking that it's so is not excessively
>> > pessimistic.
>> 
>> Generating a reproducible source package given a particular git commit
>> is trivial.  All you have to do is use "git archive".  For example:
>
> When talking about upstream projects, sure.
>
> But generating Debian source packages (.dsc and friends) from a
> `debian/master` (+ `pristine-tar`) reproducibly is not really trivial, right?
>
> As far as I understand, `gbp buildpackage -S` is the closest we have, but so 
> far, I fail to get it to give me the bit-by-bit identical unsigned .dsc that 
> I'd like to get. What am I missing?

Assuming format 3.0 (quilt): timestamps and permissions of files under
the debian/ directory.  The permissions of files in the git repository
are different from user to user (mostly depending on their umask), and
are propagated to the debian.tar.xz.

When building from a fresh clone, timestamps of files in the
debian.tar.xz should be set to the date of the latest debian/changelog
entry, as dpkg-source will clamp their mtimes to that value.  But in an
existing git repository there will likely be files older than that, and
their random mtime also propagates to the debian.tar.xz.
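
To illustrate the two effects, here is a small Python sketch.  It is not what
dpkg-source does; it is a hypothetical packer (pack_debian_tar is a made-up
name) that contrasts clamping mtimes with forcing every mtime to the
changelog date, and that maps permissions to 0755 or 0644 so that the umask
no longer matters.

```python
import io
import tarfile


def pack_debian_tar(files, changelog_time, clamp_only=False):
    """Pack files into an uncompressed tar archive and return its bytes.

    files maps an archive path to (content bytes, mtime, permission bits).
    With clamp_only=True, mtimes newer than changelog_time are clamped and
    older ones are kept.  Otherwise every mtime is set to changelog_time.
    """
    buf = io.BytesIO()
    # Pin the tar format so the output does not depend on library defaults.
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT) as tar:
        for name in sorted(files):  # stable member order
            data, mtime, mode = files[name]
            info = tarfile.TarInfo(name)
            info.size = len(data)
            if clamp_only:
                info.mtime = min(mtime, changelog_time)
            else:
                info.mtime = changelog_time
            # Keep only the owner's executable bit; drop umask differences.
            info.mode = 0o755 if mode & 0o100 else 0o644
            info.uid = info.gid = 0
            info.uname = info.gname = "root"
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()
```

Two checkouts with the same content but different mtimes and umasks give
identical bytes in the default mode.  With clamp_only=True they can still
differ, because files older than the changelog entry keep their own mtime.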

Cheers,
   Sven



Re: Building Debian source packages reproducibly (was: Re: [RFC] Proposal for new source format)

2019-10-28 Thread Scott Kitterman
On Monday, October 28, 2019 9:45:36 AM EDT Theodore Y. Ts'o wrote:
> On Mon, Oct 28, 2019 at 10:05:11AM +0100, Didier 'OdyX' Raboud wrote:
> > Where I'm coming from is that we were discussing the tag2upload problem at
> > miniDebConf Vaumarcus. The heart of the problem is that FTP-Master are
> > (currently) not going to accept .dscs built reproducibly by a (even
> > trusted) service. tag2upload is built on the idea that a signed git tag
> > is the only needed thing (`git tag -s`) to trigger an upload, and that is
> > not going to fly currently.
> 
> Ah, now I understand the problem you're trying to solve; thanks for
> the context.
> 
> What are FTP Master's objections?  Given that they *do* accept a
> source-only upload, which is just a signed dsc plus the source/debian
> tarballs, I would presume all that would be necessary is to (a)
> demonstrate that we have tools which can reliably translate between a
> git commit and the dsc plus source tarball, and (b) that the git tree
> is stored in Debian project infrastructure so we can be assured that
> it can be stored with the same level of assurance as where we store
> the source tar files.
> 
> Do they have other concerns?  If so, what are they?  I would be
> surprised that it has anything at all to do with reliable builds,
> given the acceptance of source-only uploads today.

My recollection of the discussion is that the key (pun intended) factor is who 
signs the upload.  Currently all uploads are signed by an individual authorized 
to upload the package to the archive.  The tag2upload proposal is premised on 
such keys being replaced by a single service based signing key.

Effectively tag2upload would replace DAK as the entry point for packages into 
the archive (the equivalent to the current source package signature 
verification being the git tag signature verification).  I don't think the 
discussion got to a point where a path forward that was considered reasonable 
by both the tag2upload developers and the FTP Masters was reached.

There was a fair amount of discussion on this point in the tag2upload threads.

Scott K




Re: Building Debian source packages reproducibly (was: Re: [RFC] Proposal for new source format)

2019-10-28 Thread Theodore Y. Ts'o
On Mon, Oct 28, 2019 at 10:05:11AM +0100, Didier 'OdyX' Raboud wrote:
> Where I'm coming from is that we were discussing the tag2upload problem at 
> miniDebConf Vaumarcus. The heart of the problem is that FTP-Master are 
> (currently) not going to accept .dscs built reproducibly by a (even trusted) 
> service. tag2upload is built on the idea that a signed git tag is the only 
> needed thing (`git tag -s`) to trigger an upload, and that is not going to 
> fly currently.

Ah, now I understand the problem you're trying to solve; thanks for
the context.

What are FTP Master's objections?  Given that they *do* accept a
source-only upload, which is just a signed dsc plus the source/debian
tarballs, I would presume all that would be necessary is to (a)
demonstrate that we have tools which can reliably translate between a
git commit and the dsc plus source tarball, and (b) that the git tree
is stored in Debian project infrastructure so we can be assured that
it can be stored with the same level of assurance as where we store
the source tar files.

Do they have other concerns?  If so, what are they?  I would be
surprised that it has anything at all to do with reliable builds,
given the acceptance of source-only uploads today.

> The hard part is not the packing and unpacking of the special tag; that's 
> mostly just strings massaging. But building the exact same source package in 
> different environments is harder than I expected.

Is there more than just (a) making sure the package can be built
reproducibly in the first place, and (b) the information in the
buildinfo file?

Of course, the big problem is that not all packages are currently set
up to be reproducibly built; for example if you try to compile using
Link-Time Optimization (LTO), you're completely out of luck.  (I've since
dropped use of LTO to deal with this issue.)

[1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=932098

But if it *is* reproducibly buildable, are there cases where setting up
a build environment using the information in buildinfo is not enough?

Cheers.

- Ted



Re: Building Debian source packages reproducibly (was: Re: [RFC] Proposal for new source format)

2019-10-28 Thread Marek Mosiewicz
Hello,

In fact, what can be important is the problem of downloading artifacts
during the build. At least in the Java world, a given application can be
small but depend on many libraries which are downloaded during the build.
The program works and the build is reproducible, but we can have no idea
what it consists of.

Best regards,
Marek Mosiewicz

On Mon, 28.10.2019 at 10:05 +0100, Didier 'OdyX' Raboud wrote:
> On Wednesday, 23 October 2019, 15:49:11 CET, Theodore Y. Ts'o wrote:
> > On Wed, Oct 23, 2019 at 11:18:24AM +1000, Russell Stuart wrote:
> > > On Tue, 2019-10-22 at 16:52 -0700, Russ Allbery wrote:
> > > > That seems excessively pessimistic.  What about Git makes you think
> > > > it's impossible to create a reproducible source package?
> > > 
> > > Has it been done?  Given this point has been raised several times
> > > before if it hasn't been done by now I think it's reasonable to assume
> > > it's difficult, and thinking that it's so is not excessively
> > > pessimistic.
> > 
> > Generating a reproducible source package given a particular git commit
> > is trivial.  All you have to do is use "git archive".  For example:
> 
> When talking about upstream projects, sure.
> 
> But generating Debian source packages (.dsc and friends) from a
> `debian/master` (+ `pristine-tar`) reproducibly is not really trivial, right?
> 
> As far as I understand, `gbp buildpackage -S` is the closest we have, but so 
> far, I fail to get it to give me the bit-by-bit identical unsigned .dsc that 
> I'd like to get. What am I missing?
> 
> (A little digression…)
> 
> Where I'm coming from is that we were discussing the tag2upload problem at 
> miniDebConf Vaumarcus. The heart of the problem is that FTP-Master are 
> (currently) not going to accept .dscs built reproducibly by a (even trusted) 
> service. tag2upload is built on the idea that a signed git tag is the only 
> needed thing (`git tag -s`) to trigger an upload, and that is not going to 
> fly currently.
> 
> The solution that seemed obvious during the discussion [0] is to instead rely 
> on a local tool to produce a git tag with significantly more metadata (such as 
> .dsc signature, _source.changes signature); and reconstruct a signed set 
> of .dsc and _source.changes automatically (as last pipeline step in Gitlab 
> CI), which are then acceptable by the archive.
> 
> In other words, it's "tag2upload", but with a reproducible way to:
> - build a source package on the developer's machine;
> - sign it locally;
> - create and push a special git tag.
> Then, in a different environment (such as a GitLab CI pipeline step),
> given a special git tag and a repository:
> - build the exact same unsigned source package;
> - unpack the special git tag;
> - apply the signatures to get the exact same signed source package;
> - dput to the archive.
> 
> The hard part is not the packing and unpacking of the special tag;
> that's mostly just string massaging. But building the exact same source
> package in different environments is harder than I expected.
> 
> Some caveats:
> - Q: If you built and signed the source package locally, why not
>      upload it directly?
>   A: Because you might want to upload only _after_ automated tests,
>      and in an unsupervised manner.
> - Q: If one can fit PGP signatures in a git tag, why not inline the
>      complete .dsc and _source.changes?
>   A: Indeed. You still need the debian.tar though.
> - Q: What about Dgit: in the .dsc, or buildinfo files?
>   A: Currently optional; could just be left out for a prototype.
> 
> Of course, all of this can only work if the ".git to .dsc" conversion
> is, or can be made, reproducible; hence my query.
> 
> All in all, would this be a welcome mechanism?
> 
> 
> OdyX
> 
> [0] It probably was already considered.



Building Debian source packages reproducibly (was: Re: [RFC] Proposal for new source format)

2019-10-28 Thread Didier 'OdyX' Raboud
On Wednesday, 23 October 2019, 15:49:11 CET, Theodore Y. Ts'o wrote:
> On Wed, Oct 23, 2019 at 11:18:24AM +1000, Russell Stuart wrote:
> > On Tue, 2019-10-22 at 16:52 -0700, Russ Allbery wrote:
> > > That seems excessively pessimistic.  What about Git makes you think
> > > it's impossible to create a reproducible source package?
> > 
> > Has it been done?  Given this point has been raised several times
> > before if it hasn't been done by now I think it's reasonable to assume
> > it's difficult, and thinking that it's so is not excessively
> > pessimistic.
> 
> Generating a reproducible source package given a particular git commit
> is trivial.  All you have to do is use "git archive".  For example:

When talking about upstream projects, sure.

But generating Debian source packages (.dsc and friends) reproducibly from a
`debian/master` (+ `pristine-tar`) is not really trivial, right?

As far as I understand, `gbp buildpackage -S` is the closest we have, but so
far I have failed to get it to give me the bit-for-bit identical unsigned .dsc
that I'd like to get. What am I missing?
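
One way to narrow down such a mismatch is to compare the `Checksums-Sha256`
fields of the two unsigned .dsc files and see which component differs. The
following is a minimal Python sketch, not an existing tool: the function
names, the package name `hello` and the digests are made up for illustration,
and only the multi-line `Checksums-Sha256` layout (digest, size, filename) is
taken from the .dsc format.

```python
def sha256_entries(dsc_text):
    """Return {filename: sha256} parsed from a .dsc Checksums-Sha256 field."""
    entries = {}
    in_field = False
    for line in dsc_text.splitlines():
        if line.startswith("Checksums-Sha256:"):
            in_field = True
            continue
        if in_field:
            # Continuation lines of a multi-line field start with whitespace.
            if line[:1] in (" ", "\t"):
                parts = line.split()
                if len(parts) == 3:
                    digest, _size, name = parts
                    entries[name] = digest
            else:
                in_field = False
    return entries


def differing_files(dsc_a, dsc_b):
    """List files whose SHA-256 differs, or that appear in only one .dsc."""
    a, b = sha256_entries(dsc_a), sha256_entries(dsc_b)
    return sorted(n for n in a.keys() | b.keys() if a.get(n) != b.get(n))


# Hypothetical inputs: shortened fake digests, invented package name.
LOCAL_DSC = """Source: hello
Checksums-Sha256:
 1111 1000 hello_1.0.orig.tar.gz
 2222 500 hello_1.0-1.debian.tar.xz
Files:
 aaaa 1000 hello_1.0.orig.tar.gz
"""
CI_DSC = LOCAL_DSC.replace(" 2222 500 ", " 3333 504 ")

print(differing_files(LOCAL_DSC, CI_DSC))
```

If only the debian.tar differs, the problem is in how that tarball is
generated rather than in the upstream tarball handling.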

(A little digression…)

Where I'm coming from is that we were discussing the tag2upload problem at
miniDebConf Vaumarcus. The heart of the problem is that FTP-Master are
(currently) not going to accept .dscs built reproducibly by an (even trusted)
service. tag2upload is built on the idea that a signed git tag (`git tag -s`)
is the only thing needed to trigger an upload, and that is not going to fly
currently.

The solution that seemed obvious during the discussion [0] is to instead rely
on a local tool to produce a git tag with significantly more metadata (such as
the .dsc signature and the _source.changes signature), and to reconstruct a
signed set of .dsc and _source.changes automatically (as the last pipeline
step in GitLab CI), which are then acceptable to the archive.

In other words, it's "tag2upload", but with a reproducible way to:
- build a source package on the developer's machine;
- sign it locally;
- create and push a special git tag.
Then, in a different environment (such as a GitLab CI pipeline step), given a
special git tag and a repository:
- build the exact same unsigned source package;
- unpack the special git tag;
- apply the signatures to get the exact same signed source package;
- dput to the archive.
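
The "special git tag" part of these steps could be sketched as follows. This
is only an illustration under assumptions: the `=== filename ===` section
markers, the function names and the sample signature are invented here and are
not an existing tag2upload or dgit format. It packs one detached signature per
file into a tag message and recovers them unchanged on the other side.

```python
MARKER = "=== "
SUFFIX = " ==="


def pack_tag_message(summary, signatures):
    """Build a tag message from a summary line and {filename: armored_sig}."""
    lines = [summary, ""]
    for filename in sorted(signatures):
        lines.append(f"{MARKER}{filename}{SUFFIX}")
        lines.append(signatures[filename].strip("\n"))
    return "\n".join(lines) + "\n"


def unpack_tag_message(message):
    """Recover {filename: armored_sig} from a message made by pack_tag_message."""
    sections = {}
    current = None
    for line in message.splitlines():
        if line.startswith(MARKER) and line.endswith(SUFFIX):
            current = line[len(MARKER):-len(SUFFIX)]
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {name: "\n".join(body) + "\n" for name, body in sections.items()}


# Hypothetical detached signature; a real one would come from the local signer.
FAKE_SIG = (
    "-----BEGIN PGP SIGNATURE-----\n"
    "\n"
    "AAAA\n"
    "=abcd\n"
    "-----END PGP SIGNATURE-----\n"
)
message = pack_tag_message(
    "hello release 1.0-1",
    {"hello_1.0-1.dsc": FAKE_SIG, "hello_1.0-1_source.changes": FAKE_SIG},
)
print(message)
```

The resulting text would be what goes into the annotated tag's message; the
pipeline step would read it back and attach each signature to the file it
rebuilt.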

The hard part is not the packing and unpacking of the special tag; that's
mostly just string massaging. But building the exact same source package in
different environments is harder than I expected.
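
To illustrate where the variation usually comes from, here is a minimal Python
sketch that builds a .tar.gz with member order, timestamps, ownership and
permissions all pinned. It is not what `dpkg-source` or `gbp` do internally;
the function name and the fixed values are assumptions chosen for the example.
Even with all metadata pinned, the compressed bytes can still differ between
compressor versions, so this only shows identity within a single environment.

```python
import gzip
import io
import tarfile


def deterministic_tar_gz(files, mtime=0):
    """Build a .tar.gz from {path: bytes} with all variable metadata pinned."""
    raw = io.BytesIO()
    with tarfile.open(fileobj=raw, mode="w", format=tarfile.GNU_FORMAT) as tar:
        # Sort the paths so the member order does not depend on input order.
        for path in sorted(files):
            data = files[path]
            info = tarfile.TarInfo(name=path)
            info.size = len(data)
            info.mtime = mtime            # fixed timestamp
            info.mode = 0o644             # fixed permissions
            info.uid = info.gid = 0       # fixed numeric owner
            info.uname = info.gname = "root"
            tar.addfile(info, io.BytesIO(data))
    out = io.BytesIO()
    # An empty filename and a fixed mtime keep the gzip header constant.
    with gzip.GzipFile(filename="", fileobj=out, mode="wb", mtime=mtime) as gz:
        gz.write(raw.getvalue())
    return out.getvalue()


first = deterministic_tar_gz({"debian/rules": b"#!/usr/bin/make -f\n",
                              "debian/compat": b"12\n"})
second = deterministic_tar_gz({"debian/compat": b"12\n",
                               "debian/rules": b"#!/usr/bin/make -f\n"})
print(first == second)
```

Any of these fields left at its default (current time, local user, filesystem
order) is enough to change the checksum recorded in the .dsc.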

Some caveats:
- Q: If you built and signed the source package locally, why not upload it
     directly?
  A: Because you might want to upload only _after_ automated tests, and in an
     unsupervised manner.
- Q: If one can fit PGP signatures in a git tag, why not inline the complete
     .dsc and _source.changes?
  A: Indeed. You still need the debian.tar though.
- Q: What about Dgit: in the .dsc, or buildinfo files?
  A: Currently optional; could just be left out for a prototype.

Of course, all of this can only work if the ".git to .dsc" conversion is, or
can be made, reproducible; hence my query.

All in all, would this be a welcome mechanism?


OdyX

[0] It probably was already considered.

signature.asc
Description: This is a digitally signed message part.