Re: new kubernetes packaging

2020-03-25 Thread Moritz Mühlenhoff
Sean Whitton wrote:
> I am not sure, however, that your argument applies to security updates
> to our stable releases.  These updates are almost always a matter of
> backporting small fixes, rather than updating to new upstream releases.
> And for backported fixes, vendoring makes things much harder.

In the case of kubernetes it will most certainly make security updates
easier, not more complex. For an application like kubernetes there'll
be a steady stream of security releases, and if some of these also rebase
to a fixed, vendored Go "library", that doesn't take any extra effort.

It's very similar to e.g. Chromium (and to some extent Firefox) which
also frequently fix issues in bundled libraries, but it's always just
one more bug in a bigger update pile.

I have some concerns about whether the fast-paced kubernetes release cadence
will be workable for Debian's release cycles, but I think Janos' tradeoffs
seem fair for packaging kubernetes.

Cheers,
Moritz



Re: new kubernetes packaging

2020-03-25 Thread Alastair McKinstry


On 24/03/2020 23:05, Simon McVittie wrote:
> On Tue, 24 Mar 2020 at 15:14:02 -0700, Russ Allbery wrote:
>> I think this calculus is not entirely obvious.
> Thank you for applying some much-needed nuance to this issue. I suspect
> the ideal policy is neither "never use vendored dependencies" nor
> "always use vendored dependencies".
>
> Many of our packaging policies were designed for medium-sized C/C++
> libraries - not much smaller than, say, zlib, but not much bigger than
> something like GTK either - with a sufficiently stable API and ABI that
> versions are somewhat interchangeable, and I think the further away
> from that we get, the less well those policies will fit. We have a lot
> of trouble with micropackages (as exemplified by the nodejs ecosystem
> in which many packages provide a single function), but at the other end
> of the scale, monoliths like LibreOffice, TeX Live and Firefox don't fit
> a lot of our usual policies and practices so well either.
>
We should probably look at formalizing a concept of "bundle" packages,
shipping either a number of static-library-like files (Go) in a -dev
package or a bundle of microlibs in a libjs-* package.

Dependent packages would use Built-Using: for package construction
but also ship a description of the exact content versions.

Note that we also have bunches of lib* packages shipping multiple small
C/C++ shared libs, as upstream does it this way. We need to put policies
in place for handling incompatible upgrades in these cases: what do we
need to do to ship multiple incompatible copies of a Go static lib? (Do
we ever do that?)

One example: openmpi ships multiple shared libs in a libopenmpi3
package. I'm arguing with upstream that the SOVERSIONs of the shared
libs need to be kept in lockstep, so that if, for example, package
'libopenmpi3' ships 'libopen-pal.so.40', then 'libopenmpi4' does not;
the SOVERSIONs get bumped to '*.so.50*' for them all, even if no code
changes in this particular library.
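
A rough sketch of why the lockstep bump matters, in Python. The function
names and the second library name are illustrative assumptions; only
'libopen-pal.so.40' comes from the example above. The check simply asks
whether two package versions would ship the same SONAME, which is what
would make them conflict on disk.

```python
import re

# Matches names such as "libopen-pal.so.40": a stem ending in ".so"
# followed by a numeric major SOVERSION.
SONAME_RE = re.compile(r"^(?P<stem>.+\.so)\.(?P<major>\d+)$")


def bump_in_lockstep(sonames, new_major):
    """Return the SONAMEs with every library moved to the same new major version."""
    bumped = []
    for name in sonames:
        match = SONAME_RE.match(name)
        if match is None:
            raise ValueError(f"not a versioned SONAME: {name!r}")
        bumped.append(f"{match['stem']}.{new_major}")
    return bumped


def shared_sonames(old_sonames, new_sonames):
    """Return the SONAMEs that both packages would ship (file conflicts)."""
    return sorted(set(old_sonames) & set(new_sonames))


# Illustrative package contents (assumed for this sketch).
libopenmpi3 = ["libmpi.so.40", "libopen-pal.so.40"]

# Bumping only the library whose code changed leaves a shared file behind.
partial_bump = ["libmpi.so.50", "libopen-pal.so.40"]

# Bumping everything keeps the two packages disjoint.
libopenmpi4 = bump_in_lockstep(libopenmpi3, 50)
```

With the partial bump, shared_sonames(libopenmpi3, partial_bump) is not
empty, so the two packages could not be installed side by side; with the
lockstep bump it is empty.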

> smcv
>
-- 
Alastair McKinstry, email: alast...@sceal.ie, matrix: @alastair:sceal.ie, 
phone: 087-6847928
Green Party Councillor, Galway County Council 



Re: new kubernetes packaging

2020-03-25 Thread Simon McVittie
On Wed, 25 Mar 2020 at 08:41:45 +0100, Florian Weimer wrote:
> De-vendoring sources might still be an advantage because applications
> can be fixed with a bin-NMU, but it's a lot of work.  The resulting
> divergence from upstream can result in additional bugs.  On the other
> hand, there are projects which bundle sources only for developer
> convenience, but expect production binaries to use different library
> sources for the dependencies.

I think it's important to distinguish between those two sets of
expectations, yes.  Indeed, the same is true in C/C++. We de-vendor
dependencies that are a stable library or CLI tool in their own right
("convenience copies"), but we don't de-vendor dependencies that are
designed to be vendored or that are tightly coupled to the parent package:

- flatpak's vendored bubblewrap and xdg-dbus-proxy: convenience copy,
  use system copy instead
- flatpak's vendored libglnx: unstable "copylib" designed to be vendored,
  keep
- ioquake3's vendored libjpeg: convenience copy, use system copy instead
  (and indeed it's been excluded from the .orig tarball to make d/copyright
  less onerous to maintain, since unrelated files had to be excluded for
  DFSG reasons anyway)
- mutter's fork of code that used to be cogl/clutter: unstable and tightly
  coupled, keep

The meson build system somewhat formalizes this with its concept of
subprojects, which can either be embedded in the upstream git repository
(by direct copying or git subtree), referenced by URL in the upstream
git repository but copied into dist tarballs (git submodule),
or an external reference by URL (.wrap file), and can either be used
unconditionally, used only as a fallback if there is no suitable system
copy, or have a configure-time choice.

smcv



Re: new kubernetes packaging

2020-03-25 Thread Florian Weimer
* Vincent Bernat:

>  ❦ 24 mars 2020 16:30 -07, Russ Allbery:
>
>> On the other hand (and I don't follow this community closely, so apologies
>> if I have the details wrong here), my impression is that the Go community
>> is not planning to support shared libraries, loves its statically linked
>> binaries, and makes extensive use of the fact that different packages can
>> pin to different versions and this doesn't matter for their intended
>> output format (a static binary).
>
> Go has supported shared libraries for quite some time but I don't think
> they're widely used. Notably, the tooling around them is quite primitive.
> Even the plugin system (which is mostly like dlopen() and could be
> useful in many cases) is seldom used.

That's true, but also somewhat beside the point because in order to
use dynamic shared objects to avoid recompilation of applications, you
also need practical ABI stability, both between compiler versions and
versions of the library.  Go does not have a low-level ABI that
remains unchanged across compiler versions, and (like C and C++) it
encodes struct offsets and sizes directly in the machine code,
sometimes in unexpected places due to inlining.  So even if the Go
standard library was linked as a shared object, you would still have
to rebuild all applications using it.

I believe GHC is similar in this regard.
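
To illustrate the struct-offset point, here is a small sketch using
Python's ctypes module as a stand-in for C-style layout (the comparison
to C is from the paragraph above; the struct and field names are
hypothetical, and this is not Go's actual ABI). A caller that has the
version 1 offset baked in silently reads the wrong field once the
library's struct grows.

```python
import ctypes


class ConfigV1(ctypes.Structure):
    """Hypothetical layout of a library struct in version 1."""
    _fields_ = [
        ("flags", ctypes.c_uint32),
        ("timeout", ctypes.c_uint32),
    ]


class ConfigV2(ctypes.Structure):
    """Version 2 inserts a new field in front of `timeout`."""
    _fields_ = [
        ("flags", ctypes.c_uint32),
        ("retries", ctypes.c_uint32),
        ("timeout", ctypes.c_uint32),
    ]


def read_timeout_with_v1_offset(raw):
    """Mimic a caller compiled against V1: the offset of `timeout` is fixed."""
    baked_in_offset = ConfigV1.timeout.offset
    return ctypes.c_uint32.from_buffer_copy(raw, baked_in_offset).value


# The "library" is now at version 2, but the caller was never rebuilt.
v2 = ConfigV2(flags=1, retries=7, timeout=30)
stale_read = read_timeout_with_v1_offset(bytes(v2))
```

Here stale_read picks up the value of `retries` rather than `timeout`,
which is the kind of breakage that forces a rebuild of every user.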

Using shared objects under such circumstances only makes updates
harder for end users because live systems end up in inconsistent
states (ideally only for a brief time).

De-vendoring sources might still be an advantage because applications
can be fixed with a bin-NMU, but it's a lot of work.  The resulting
divergence from upstream can result in additional bugs.  On the other
hand, there are projects which bundle sources only for developer
convenience, but expect production binaries to use different library
sources for the dependencies.  I don't know if Kubernetes is one of
those projects.



Re: new kubernetes packaging

2020-03-25 Thread Vincent Bernat
 ❦ 24 mars 2020 16:30 -07, Russ Allbery:

> On the other hand (and I don't follow this community closely, so apologies
> if I have the details wrong here), my impression is that the Go community
> is not planning to support shared libraries, loves its statically linked
> binaries, and makes extensive use of the fact that different packages can
> pin to different versions and this doesn't matter for their intended
> output format (a static binary).

Go has supported shared libraries for quite some time but I don't think
they're widely used. Notably, the tooling around them is quite primitive.
Even the plugin system (which is mostly like dlopen() and could be
useful in many cases) is seldom used.
-- 
This night methinks is but the daylight sick.
-- William Shakespeare, "The Merchant of Venice"





Re: new kubernetes packaging

2020-03-24 Thread Shengjing Zhu
Another question for the current kubernetes maintainer.

What's your plan for the k8s.io/* libraries, e.g. k8s.io/api and
k8s.io/client-go? They are supposed to be built from src:kubernetes,
but currently they aren't.

Some existing packages already embed them, like
https://codesearch.debian.net/search?q=filetype%3Ago+k8s.io%2Fapi=1
because we thought it would be hard to maintain the kubernetes package
in Debian while following the common practice in the pkg-go team.

Some new packages are blocked by not having the kubernetes libraries. I
haven't checked the wnpp list, but there was recently a thread on
debian-go@.

If you provide the k8s.io/* libraries, what about the libraries that
k8s.io/* depends on?

-- 
Shengjing Zhu



Re: new kubernetes packaging

2020-03-24 Thread Russ Allbery
Simon McVittie writes:

> I think the API stability of the libraries is also relevant (and ABI
> would be relevant too, if we had dynamically-linked Go libraries), both
> in terms of intended API/ABI breaks and unintended behaviour changes and
> regressions. The more stable they are, the more appealing it is to have
> them in a shared library; the more unstable they are, the more appealing
> it is to vendor them into a larger project.

I think this is also where upstream intentions are important.  For
example, the Rust community does care (intensely) about API stability and
even ABI stability, and is at least thinking about a future of Rust shared
libraries, although that's not currently the normal mechanism of
development of pure Rust packages.  They're sympathetic.  This is part of
what makes our packaging approach viable, I think.

On the other hand (and I don't follow this community closely, so apologies
if I have the details wrong here), my impression is that the Go community
is not planning to support shared libraries, loves its statically linked
binaries, and makes extensive use of the fact that different packages can
pin to different versions and this doesn't matter for their intended
output format (a static binary).

Trying to shoehorn the latter into a shared library update model is almost
certain to fail because it's working at intense cross-purposes to
upstream.

> This isn't necessarily such a new thing - the scale is new, but the
> practice isn't. There are several C/C++ libraries in Debian that are
> specifically designed to be vendored into dependent projects (either
> because they are not API-stable or to simplify dependency management),
> like gnulib (which exists as a package, but I think it's only there to
> facilitate vendoring bits of it?), libstb (which does exist as a
> separate package with a shared library, but I don't have a good picture
> of how API- and ABI-stable it is), and libglnx.

Indeed, I have a package, rra-c-util, which is vendored into every C
package that I personally maintain and package, because it's my version of
gnulib plus some other utility functions.  I recognize the potential
concern should a security vulnerability be found in any of its functions,
and accept the cost of providing security updates for every one of my
packages that use it.  This still is, in my opinion, a better maintenance
choice, not so much for Debian but for many non-Debian users of those C
packages who do not want to (and often get confused by trying to) install
a shared library as a prerequisite to installing the thing they actually
care about.  (Also because, like gnulib, rra-c-util consists of a lot of
different pieces, most of which are not needed for any given package, and
includes pieces like Autoconf machinery that are tricky to maintain
separately.)

-- 
Russ Allbery (r...@debian.org)  



Re: new kubernetes packaging

2020-03-24 Thread Simon McVittie
On Tue, 24 Mar 2020 at 15:14:02 -0700, Russ Allbery wrote:
> I think this calculus is not entirely obvious.

Thank you for applying some much-needed nuance to this issue. I suspect
the ideal policy is neither "never use vendored dependencies" nor
"always use vendored dependencies".

Many of our packaging policies were designed for medium-sized C/C++
libraries - not much smaller than, say, zlib, but not much bigger than
something like GTK either - with a sufficiently stable API and ABI that
versions are somewhat interchangeable, and I think the further away
from that we get, the less well those policies will fit. We have a lot
of trouble with micropackages (as exemplified by the nodejs ecosystem
in which many packages provide a single function), but at the other end
of the scale, monoliths like LibreOffice, TeX Live and Firefox don't fit
a lot of our usual policies and practices so well either.

> Possibly more significant is how the flow of security advisories works.
> If the advisory is likely to come from Kubernetes and their security fix
> release is a point release update to their package including the vendored
> modules, we can potentially adopt the "sane upstream stable point release"
> policy and just update stable to their point release.
...
> This of course doesn't apply if the individual libraries are releasing
> their own security advisories.

I think the API stability of the libraries is also relevant (and ABI
would be relevant too, if we had dynamically-linked Go libraries), both
in terms of intended API/ABI breaks and unintended behaviour changes
and regressions. The more stable they are, the more appealing it is to
have them in a shared library; the more unstable they are, the more
appealing it is to vendor them into a larger project.

This isn't necessarily such a new thing - the scale is new, but the
practice isn't. There are several C/C++ libraries in Debian that are
specifically designed to be vendored into dependent projects (either
because they are not API-stable or to simplify dependency management),
like gnulib (which exists as a package, but I think it's only there to
facilitate vendoring bits of it?), libstb (which does exist as a separate
package with a shared library, but I don't have a good picture of how API-
and ABI-stable it is), and libglnx.

smcv



Re: new kubernetes packaging

2020-03-24 Thread Sean Whitton
Hello,

On Tue 24 Mar 2020 at 03:14PM -07, Russ Allbery wrote:

> What you say is true if the library is used by multiple applications in
> Debian (although it's still not as good of a story with Go as it is for
> C).  We can backport a patch to that one library, and then rebuild the
> applications that incorporate it.
>
> However, if a library exists in Debian solely because it is a dependency
> of some sprawling application and isn't used by other things, it may be
> easier to do a security update if it's vendored.  There are, at the least,
> fewer packages to rebuild, and the testing story is slightly more
> straightforward.

Right, if something is technically an independent language module but is
used only by kubernetes, and we don't expect that to change, there's no
need for it to be in its own source package just because it might seem
tidier to us.

I doubt that this is true of all the hundreds of dependencies presently
bundled with kubernetes, however.

> Possibly more significant is how the flow of security advisories works.
> If the advisory is likely to come from Kubernetes and their security fix
> release is a point release update to their package including the vendored
> modules, we can potentially adopt the "sane upstream stable point release"
> policy and just update stable to their point release.  (Kubernetes does
> maintain long-lived stable branches, although I don't know how stringent
> they are about what changes they're willing to take in the stable
> branches.)  In this case, we create a bit more security work by separately
> packaging the dependencies, since we now have to trace down the package
> that corresponds to a Kubernetes security advisory and update it.

This is certainly a reasonable approach, if such releases aren't going
to violate the expectations of users of Debian stable.

I guess we'd have to see whether the Security Team are up for another
package which gets updated in this way.

-- 
Sean Whitton




Re: new kubernetes packaging

2020-03-24 Thread Russ Allbery
Sean Whitton writes:

> Thank you for your e-mail.  I agree with you that security support is
> the most pressing reason to avoid piles of vendored code, and you make
> an interesting argument regarding how it can be difficult to provide
> security fixes if our refusal to use vendored code means we lag too far
> behind upstream.

> I am not sure, however, that your argument applies to security updates
> to our stable releases.  These updates are almost always a matter of
> backporting small fixes, rather than updating to new upstream releases.
> And for backported fixes, vendoring makes things much harder.

I think this calculus is not entirely obvious.

What you say is true if the library is used by multiple applications in
Debian (although it's still not as good of a story with Go as it is for
C).  We can backport a patch to that one library, and then rebuild the
applications that incorporate it.

However, if a library exists in Debian solely because it is a dependency
of some sprawling application and isn't used by other things, it may be
easier to do a security update if it's vendored.  There are, at the least,
fewer packages to rebuild, and the testing story is slightly more
straightforward.
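
As a back-of-the-envelope sketch of the two cases above (in Python, under
a deliberately simple model that is my assumption, not a description of
how the security team counts work): every user of a separately packaged
library is statically linked and needs a rebuild, and every vendored
copy needs its own patch.

```python
def security_update_work(users, vendored):
    """Estimate the work for one library fix under a deliberately simple model.

    users: names of the applications in the archive that use the library.
    vendored: True if each application carries its own copy of the library.
    Returns (sources_to_patch, packages_to_build).
    """
    n = len(users)
    if vendored:
        # Each application's own copy has to be patched and rebuilt.
        return n, n
    # One patch to the library package, then a rebuild of each static user.
    return 1, 1 + n


# A library used only by one sprawling application (hypothetical name).
sole_user = ["kubernetes"]
# A library shared by several applications (hypothetical names).
many_users = ["app-a", "app-b", "app-c", "app-d", "app-e"]
```

For sole_user the vendored case gives one patch and one build against one
patch and two builds when packaged separately; for many_users it is five
patches vendored against a single patch plus rebuilds otherwise.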

Possibly more significant is how the flow of security advisories works.
If the advisory is likely to come from Kubernetes and their security fix
release is a point release update to their package including the vendored
modules, we can potentially adopt the "sane upstream stable point release"
policy and just update stable to their point release.  (Kubernetes does
maintain long-lived stable branches, although I don't know how stringent
they are about what changes they're willing to take in the stable
branches.)  In this case, we create a bit more security work by separately
packaging the dependencies, since we now have to trace down the package
that corresponds to a Kubernetes security advisory and update it.

This of course doesn't apply if the individual libraries are releasing
their own security advisories.

-- 
Russ Allbery (r...@debian.org)  



Re: new kubernetes packaging

2020-03-24 Thread Sean Whitton
Hello Janos,

Thank you for your e-mail.  I agree with you that security support is
the most pressing reason to avoid piles of vendored code, and you make
an interesting argument regarding how it can be difficult to provide
security fixes if our refusal to use vendored code means we lag too far
behind upstream.

I am not sure, however, that your argument applies to security updates
to our stable releases.  These updates are almost always a matter of
backporting small fixes, rather than updating to new upstream releases.
And for backported fixes, vendoring makes things much harder.

You also write:

On Tue 24 Mar 2020 at 07:08PM +00, Janos LENART wrote:

> 1. OTHER EXAMPLES. If we take this paragraph completely literally and to
> the extreme then other packages are also in violation of it. True, the
> current packaging of kubernetes does this to a greater extent than its
> predecessor for example, but perhaps this shows that this section was
> always open for interpretation. Examples of some prominent packages in
> Debian that bundle and use the vendored code (in parentheses is the number
> of go packages bundled, estimate):
> - docker.io (58, including some that are vendored more than once within the
> same source package, but not including the fact that docker.io itself is
> made up of 7 tarballs)
> - kubernetes (20 for the previous version, 200 now)
> - prometheus (4)
> - golang (4)

but I am not sure this is relevant because the number of vendored copies
in the new kubernetes package (about 200) is several times larger than in
any of these examples, the largest being docker.io with 58.

Finally, I would like to hear why you think it is valuable for us to
have a package like this in Debian as opposed to expecting people to
install it from upstream:

On Tue 24 Mar 2020 at 08:37PM +00, Jeremy Stanley wrote:

> If this represents the actual state of building Kubernetes, it's
> unclear to me why Debian would package it at all. I don't see the
> value to users in consuming Kubernetes from a Debian package if the
> result is compromising on Debian's vision and values so that they
> can get the exact same thing they'd have if they just used the
> Kubernetes community's recommended tooling to install it instead.
> I'm all for using the best tool for the job, and while I've been a
> die-hard Debian user for more than two decades I also don't install
> every last bit of software from its packages. Some software
> ecosystems have chosen to focus on tools and workflows which are
> incompatible with Debian's, but that doesn't mean that either one is
> inherently bad nor that they need to be integrated at all costs.

I find this persuasive.

-- 
Sean Whitton

