Re: distributing .buildinfo files (Re: Bad interaction between pbuilder/debhelper/dpkg-buildinfo/dpkg-genchanges and dak on security-master)

2018-04-05 Thread Philipp Kern
On 9/3/17 11:40 AM, Philipp Kern wrote:
> On 2017-09-02 23:48, Holger Levsen wrote:
>> On Mon, Jul 03, 2017 at 07:23:29PM +0200, Philipp Kern wrote:
>>> > Not yet.  We people from the reproducible team couldn't find a way to
>>> > usefully talk to ftp-masters people, whom never replied to any of the
>>> > questions in the thread at #763822 (they only did some quick
>>> > comments on IRC, and we have been left on guessing what they would
>>> > like…).
>>> >
>>> > Anyhow, .buildinfo files are stored in ftp-master, just not exported
>>> > to the mirrors, you can find them in
>>> > coccia.debian.org:/srv/ftp-master.debian.org/.
>>>
>>> So I suppose we talk about 13 GB[1] of static content in about 1.7M
>>> files. Is that something that could be distributed through
>>> static.debian.org if there are concerns around inodes for the main
>>> mirrors? Given that they would be accessed mostly rarely[2]?
>>>
>>> [1] 7.7kB (75%ile as mentioned in the referenced bug) * 55000 binary
>>> packages * 10 architectures * 3 versions - so quite conservatively
>>> [2] So supposedly a CDN wouldn't bring a lot of benefit as individual
>>> files aren't likely to be hit frequently.
>>
>> using static.debian.org seems to be a good idea to me, what would be
>> needed to make this happen?
>>
>> or, we could put them in a git repo instead, and use git.debian.org…
> 
> Git is an interesting thought for incremental mirroring. But then it
> also seems to be a poor choice for something that is an only growing
> repository of data.
> 
> What I think should be a requirement is that the data is pushed out
> before the mirror pulse. Otherwise you end up with a race where you try
> to mirror the data including the buildinfo but can't access it. (It's a
> little unfortunate that we don't simply put them onto the mirrors.)

So what would be needed to make at least a simple export of the data
happen? I think the requirements I'd have are these:

* Data is sufficiently fresh and ideally accessible before the mirror
pulse happens, so that you can always fetch the corresponding buildinfo
for a newly pushed package.
* Some way of actually deducing the path to the buildinfo file, either
through some sort of redirector or by naming the files in a consistent
fashion.

Right now the second point does not work with the date-based farm that
is used to archive the buildinfo files. It would work if we were to just
apply the same splitting as in the regular pool. For the first point,
just pushing the content through static.d.o should work, and dak could
push the content before pushing the mirrors?
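
To illustrate the second requirement, here is a minimal sketch of how a
path could be deduced if the buildinfo files were split the same way as
the regular pool. The root directory name "buildinfo-pool", the function
names and the default component are assumptions made up for this
example; nothing like this exists in dak as far as this thread says. The
file name pattern source_version_arch.buildinfo (with the epoch dropped)
is assumed to match what dpkg emits.

```python
def pool_prefix(source: str) -> str:
    """Return the pool-style subdirectory for a source package name."""
    # The regular pool splits "lib*" packages on their first four
    # characters and everything else on the first character.
    if source.startswith("lib") and len(source) > 3:
        return source[:4]
    return source[:1]


def buildinfo_path(source: str, version: str, arch: str,
                   component: str = "main") -> str:
    """Build a predictable path for a .buildinfo file.

    "buildinfo-pool" is a hypothetical root directory for this sketch.
    """
    # File names do not carry the epoch ("1:5.4.1-1" -> "5.4.1-1").
    filename_version = version.split(":", 1)[-1]
    filename = f"{source}_{filename_version}_{arch}.buildinfo"
    return (f"buildinfo-pool/{component}/{pool_prefix(source)}/"
            f"{source}/{filename}")


if __name__ == "__main__":
    print(buildinfo_path("hello", "2.10-1", "amd64"))
    print(buildinfo_path("libreoffice", "1:5.4.1-1", "arm64"))
```

With a layout like this a client only needs the source name, version and
architecture to find the file, so no redirector would be required.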

Intuitively I would not care about cryptographic authentication of the
data. After all it can be verified by rebuilding if the package is
reproducible.

Kind regards and thanks
Philipp Kern



___
Reproducible-builds mailing list
Reproducible-builds@lists.alioth.debian.org
http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/reproducible-builds

Re: distributing .buildinfo files (Re: Bad interaction between pbuilder/debhelper/dpkg-buildinfo/dpkg-genchanges and dak on security-master)

2017-09-03 Thread Holger Levsen
On Sun, Sep 03, 2017 at 11:40:53AM +0200, Philipp Kern wrote:
> Git is an interesting thought for incremental mirroring. But then it also
> seems to be a poor choice for something that is an only growing repository
> of data.

the nice thing with git is that you get a signed tree for free (or rather, very
easily with tools almost everybody understands), even though it atm only uses
sha1 hashes. IOW: it's a very simple blockchain, which has better properties
than a simple file based mirror.
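
a rough sketch of why a hash chain helps here. this is a simplified
model, not git's actual object format: it uses sha256 instead of sha1,
and the names commit_id and chain_tip are made up for the example. each
id covers the previous id plus the new content, so signing only the
latest id vouches for everything before it.

```python
import hashlib


def commit_id(parent_id: str, content: bytes) -> str:
    """Hash the parent id together with the hash of the new content."""
    content_hash = hashlib.sha256(content).hexdigest()
    return hashlib.sha256(f"{parent_id}\n{content_hash}".encode()).hexdigest()


def chain_tip(contents, genesis: str = "0" * 64) -> str:
    """Fold a sequence of file contents into a single tip id."""
    tip = genesis
    for content in contents:
        tip = commit_id(tip, content)
    return tip


if __name__ == "__main__":
    # Placeholder contents standing in for two archived buildinfo files.
    history = [b"first buildinfo", b"second buildinfo"]
    tip = chain_tip(history)

    # Changing an old entry changes the tip, so a signature over the
    # tip would no longer match.
    tampered = [b"modified buildinfo", history[1]]
    print(tip != chain_tip(tampered))
```

a plain file mirror has no such link between old and new entries, so
each file would need to be checked on its own.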
 
> What I think should be a requirement is that the data is pushed out before
> the mirror pulse. Otherwise you end up with a race where you try to mirror
> the data including the buildinfo but can't access it. (It's a little
> unfortunate that we don't simply put them onto the mirrors.)

agreed.


-- 
cheers,
Holger



Re: distributing .buildinfo files (Re: Bad interaction between pbuilder/debhelper/dpkg-buildinfo/dpkg-genchanges and dak on security-master)

2017-09-03 Thread Philipp Kern

On 2017-09-02 23:48, Holger Levsen wrote:
> On Mon, Jul 03, 2017 at 07:23:29PM +0200, Philipp Kern wrote:
>> > Not yet.  We people from the reproducible team couldn't find a way to
>> > usefully talk to ftp-masters people, whom never replied to any of the
>> > questions in the thread at #763822 (they only did some quick comments on
>> > IRC, and we have been left on guessing what they would like…).
>> >
>> > Anyhow, .buildinfo files are stored in ftp-master, just not exported to
>> > the mirrors, you can find them in
>> > coccia.debian.org:/srv/ftp-master.debian.org/.
>>
>> So I suppose we talk about 13 GB[1] of static content in about 1.7M
>> files. Is that something that could be distributed through
>> static.debian.org if there are concerns around inodes for the main
>> mirrors? Given that they would be accessed mostly rarely[2]?
>>
>> [1] 7.7kB (75%ile as mentioned in the referenced bug) * 55000 binary
>> packages * 10 architectures * 3 versions - so quite conservatively
>> [2] So supposedly a CDN wouldn't bring a lot of benefit as individual
>> files aren't likely to be hit frequently.
>
> using static.debian.org seems to be a good idea to me, what would be
> needed to make this happen?
>
> or, we could put them in a git repo instead, and use git.debian.org…


Git is an interesting thought for incremental mirroring. But then it 
also seems to be a poor choice for something that is an only growing 
repository of data.


What I think should be a requirement is that the data is pushed out 
before the mirror pulse. Otherwise you end up with a race where you try 
to mirror the data including the buildinfo but can't access it. (It's a
little unfortunate that we don't simply put them onto the mirrors.)


Kind regards
Philipp Kern


Re: distributing .buildinfo files (Re: Bad interaction between pbuilder/debhelper/dpkg-buildinfo/dpkg-genchanges and dak on security-master)

2017-09-02 Thread Paul Wise
On Sat, 2017-09-02 at 21:48 +0000, Holger Levsen wrote:

> > So I suppose we talk about 13 GB[1] of static content in about 1.7M
> > files. Is that something that could be distributed through
> > static.debian.org if there are concerns around inodes for the main
> > mirrors? Given that they would be accessed mostly rarely[2]?
> > 
> > [1] 7.7kB (75%ile as mentioned in the referenced bug) * 55000 binary
> > packages * 10 architectures * 3 versions - so quite conservatively

I had a quick look at the (currently) 4 systems behind static.d.o and
it looks like they can all take the extra space and inodes. senfter
only has 48GB space left but we can expand the storage there.
mirror-csail only has 64M inodes available, but should be fine.
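
For reference, the quoted estimate can be reproduced with a few lines of
arithmetic. This is only a sketch of footnote [1]: the inputs are the
figures given there, decimal units (1 GB = 1,000,000 kB) are assumed,
and the function name is made up for the example.

```python
# Figures taken from footnote [1] in the quoted message.
P75_SIZE_KB = 7.7          # 75th percentile size of one .buildinfo file
BINARY_PACKAGES = 55_000
ARCHITECTURES = 10
VERSIONS = 3


def estimate(size_kb: float = P75_SIZE_KB,
             packages: int = BINARY_PACKAGES,
             architectures: int = ARCHITECTURES,
             versions: int = VERSIONS):
    """Return (number of files, total size in decimal GB)."""
    files = packages * architectures * versions
    total_gb = files * size_kb / 1_000_000  # kB -> GB, decimal units
    return files, total_gb


if __name__ == "__main__":
    files, total_gb = estimate()
    print(f"{files:,} files, about {total_gb:.1f} GB")
```

That gives 1.65M files and about 12.7 GB, which matches the rounded
1.7M files and 13 GB quoted above.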

One concern might be the rsync time for 1.7M inodes; I'm not sure whether
our static setup syncs sites in parallel.

There might be other factors here that I'm not aware of, hopefully
other DSA folks can fill them in.

Are these files going to only be available for the versions of packages
that exist in the archive right now, or is it going to be a historical
archive of all Debian build information forever?

What kind of growth per year are we expecting?

> using static.debian.org seems to be a good idea to me, what would be needed
> to make this happen?

Some patches to files in dsa-puppet to define the service:

modules/roles/manifests/static_mirror.pp
modules/roles/misc/static-components.yaml
modules/roles/templates/static-mirroring/vhost/static-vhosts-simple.erb
modules/sudo/files/sudoers

https://anonscm.debian.org/cgit/mirror/dsa-puppet.git/

> or, we could put them in a git repo instead, and use git.debian.org…

It strikes me as quite a lot of data for one git repo :)

-- 
bye,
pabs

https://wiki.debian.org/PaulWise



distributing .buildinfo files (Re: Bad interaction between pbuilder/debhelper/dpkg-buildinfo/dpkg-genchanges and dak on security-master)

2017-09-02 Thread Holger Levsen
On Mon, Jul 03, 2017 at 07:23:29PM +0200, Philipp Kern wrote:
> > Not yet.  We people from the reproducible team couldn't find a way to
> > usefully talk to ftp-masters people, whom never replied to any of the
> > questions in the thread at #763822 (they only did some quick comments on
> > IRC, and we have been left on guessing what they would like…).
> > 
> > Anyhow, .buildinfo files are stored in ftp-master, just not exported to
> > the mirrors, you can find them in
> > coccia.debian.org:/srv/ftp-master.debian.org/.
> 
> So I suppose we talk about 13 GB[1] of static content in about 1.7M
> files. Is that something that could be distributed through
> static.debian.org if there are concerns around inodes for the main
> mirrors? Given that they would be accessed mostly rarely[2]?
> 
> [1] 7.7kB (75%ile as mentioned in the referenced bug) * 55000 binary
> packages * 10 architectures * 3 versions - so quite conservatively
> [2] So supposedly a CDN wouldn't bring a lot of benefit as individual
> files aren't likely to be hit frequently.

using static.debian.org seems to be a good idea to me, what would be needed
to make this happen?

or, we could put them in a git repo instead, and use git.debian.org…

feedback welcome.


-- 
cheers,
Holger

