Hi Steve,

First of all, thanks a lot for your valuable feedback and the time you took to
investigate this.

Quoting Steve Langasek (2015-09-12 04:12:56)
> In bug #752270, Peter refers to the "Feedback arc set" section of
> http://bootstrap.debian.net/amd64/ .  Reviewing the information at
> <http://bootstrap.debian.net/amd64/#fas>, I find no references to freetype at
> present.

That means that Debian can (in theory) be bootstrapped without freetype
dropping its build dependencies on the x11 libraries.

The interesting bit is the "in theory" part. Botch creates the dependency
graph for Debian Sid and then tries to make that graph acyclic by using some
manually collected information about which source packages can drop which build
dependencies in practice. See here for the lists it uses:

https://gitlab.mister-muffin.de/debian-bootstrap/botch/tree/master/droppable

But even with that information, it often cannot make the whole graph acyclic.
Right now there are about 10 source packages for which botch removes build
dependencies without knowing whether these dependencies are actually droppable,
just so that it can present a feedback arc set.

But even if the files mentioned above contained enough information to make the
whole graph acyclic, they would still be missing the information about which
binary packages those source packages would *not* build or would *additionally
build* without certain build dependencies present. So this heuristic is a very
crude method, and the only way to get a more reliable result is to use actual
build profiles.
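
To make the idea concrete, here is a minimal Python sketch of that procedure on
a toy graph: drop the edges that are known to be droppable and check whether a
cycle remains. This is not botch's actual code. The package names, the
`droppable` set and the helpers `drop_edges` and `find_cycle` are made up for
illustration, and the toy graph ignores installation sets entirely.

```python
# Toy build-dependency graph: an edge A -> B means "A needs B to be built".
# All names and the droppable list are hypothetical.
graph = {
    "src:a": {"src:b"},
    "src:b": {"src:c"},
    "src:c": {"src:a", "src:d"},
    "src:d": {"src:c"},
}
# Edges assumed (for this example only) to be droppable in practice.
droppable = {("src:c", "src:a")}


def drop_edges(graph, edges):
    """Return a copy of graph without the given (source, target) edges."""
    return {v: {w for w in succ if (v, w) not in edges}
            for v, succ in graph.items()}


def find_cycle(graph):
    """Return one cycle as a list of vertices, or None if graph is acyclic."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {v: WHITE for v in graph}
    path = []

    def visit(v):
        colour[v] = GREY
        path.append(v)
        for w in sorted(graph.get(v, ())):
            if colour.get(w, WHITE) == GREY:
                # w is on the current DFS path, so we closed a cycle.
                return path[path.index(w):] + [w]
            if colour.get(w, WHITE) == WHITE:
                found = visit(w)
                if found:
                    return found
        colour[v] = BLACK
        path.pop()
        return None

    for v in sorted(graph):
        if colour[v] == WHITE:
            found = visit(v)
            if found:
                return found
    return None


remaining = drop_edges(graph, droppable)
print(find_cycle(remaining))  # ['src:c', 'src:d', 'src:c']
```

In this toy case one cycle survives the known-droppable edges, so one more edge
(say src:d -> src:c) would have to be removed on a guess to obtain an acyclic
graph.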

The feedback arc set is *only* a heuristic. That's why, as pointed out in the
talk, I don't think it's a good idea to directly push it to package
maintainers.

In fact, the only information on the whole page that is not a heuristic is in
the tables for "Self-Cycles". That's why they are at the top. This information
should show up in the tracker at some point.

> I do find freetype included in the full listing for SCC #1 under "Amount of
> Cycles through Edges", but this only shows that freetype has
> build-dependencies on docbook-to-man and libx11-dev (which I know), it
> doesn't show me how these dependencies are involved in a cycle.

It does not show you the full cycle, yes. It only shows you how many cycles
would at minimum be broken if that build dependency were dropped. The value is
"7" for both docbook-to-man and libx11-dev, which is very low. You can see that
other edges further up break more than 50 cycles. This is part of the reason
why freetype was not chosen to be part of the feedback arc set.

To get to the list of cycles for freetype from the table "Amount of Cycles
through Edges" you can click on the source package name.

> The docbook-to-man cycle, I'm able to work out for myself manually: freetype
> build-depends docbook-to-man depends opensp build-depends poppler-utils
> build-depends freetype.  This seems to be a false cycle because
> docbook-to-man depends on sp | opensp, and sp does not build-depend on
> freetype.
> 
> The libx11-dev cycle, I do not see where the cycle is when I eyeball the
> dependencies.  So I was hoping to easily be able to find this information in
> the linked bootstrap reports.

From the table "Amount of Cycles through Edges" you can quickly go to the
cycles the source package you are interested in is involved in, by clicking on
the source package name. It is highlighted in blue on the page and I thought
that would indicate a hyperlink:

http://bootstrap.debian.net/source/freetype_2.5.2-4.html

Can you point me at the exact cycle that you think is false?

> Peter then links to <http://bootstrap.debian.net/source/freetype.html>.  This
> page doesn't provide much information at all - just a table of versions and
> architectures (is an x good or bad or does it just mean information is
> available?) and a link for version 2.5.2-4.

Thanks, fixed in git.

> If I follow that link, I see it takes me to
> <http://bootstrap.debian.net/source/%5B%5Bu'gnat-4.6',%20u's390x',%20u'4.6.4-4'%5D%5D_2.5.2-4.html>.
> This is obviously incorrect.

Thanks, fixed in git.

> If I manually correct the URL to
> <http://bootstrap.debian.net/source/freetype_2.5.2-4.html>, I get a page that
> includes a list of "cycles", but only shows 10 of these by default (out of
> 236).  And the first 10 cycles it lists are all about opensp again.

Are you suggesting a different way of presenting the information? If yes, then
I'd like to hear about it.

> Finally, after asking for 100 rows at a time, I start to see some relevant
> information which starts to make sense:
> 
>   src:freetype ⇢ libx11-dev → src:libxdmcp ⇢ w3m → src:w3m ⇢ libimlib2-dev
>   src:freetype ⇢ libx11-dev → src:libx11 ⇢ w3m → src:gpm ⇢ texlive-base
>   src:freetype ⇢ libx11-dev → src:libx11 ⇢ w3m → src:w3m ⇢ libimlib2-dev
>   src:freetype ⇢ libx11-dev → src:libxcb ⇢ python-xcbgen → src:python2.7 ⇢ blt-dev
>   src:freetype ⇢ libx11-dev → src:libx11 ⇢ w3m → src:w3m ⇢ libimlib2-dev
>   src:freetype ⇢ libx11-dev → src:libxcb ⇢ python-xcbgen → src:python2.7 ⇢ tk-dev
>   src:freetype ⇢ libx11-dev → src:libxcb ⇢ python → src:python2.7 ⇢ tk-dev
>   src:freetype ⇢ libx11-dev → src:libx11 ⇢ w3m → src:gpm ⇢ texlive-base
>   src:freetype ⇢ libx11-dev → src:libxcb ⇢ python → src:python2.7 ⇢ xvfb
> 
> The report then starts to show some repetition, with cycles that have been
> listed previously shown again but listed separately for different
> architectures for no visible reason.

The reason is that the packages involved are different ones. You will see that
the versions of the packages and their installation sets differ.

I also don't think it's a good idea to look at the list of cycles. They give
you a very limited view of the problem, which is not sufficient to solve it in
the most efficient manner (most efficient meaning here: modify the fewest
source packages to make the graph acyclic).

The strongest indicator on that page is the table "Strong Bridges". It says
that removing the build dependency of freetype on libx11-dev splits the graph
into individual strongly connected components only on mips, mipsel, powerpc,
ppc64el and s390x. That is not a very strong signal.
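
For readers unfamiliar with the term: a strong bridge is an edge whose removal
increases the number of strongly connected components. The Python sketch below
checks this by brute force on a tiny made-up graph. The vertex names and the
functions `reachable`, `count_sccs` and `strong_bridges` are hypothetical, and
the quadratic approach is only meant for toy inputs, not for the real
dependency graph.

```python
# Toy graph with made-up vertex names; an edge A -> B means "A needs B".
graph = {
    "a": ["b"],
    "b": ["c"],
    "c": ["a", "b"],
}


def reachable(graph, start):
    """Return the set of vertices reachable from start (including start)."""
    seen = {start}
    stack = [start]
    while stack:
        v = stack.pop()
        for w in graph.get(v, ()):
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen


def count_sccs(graph):
    """Count strongly connected components via mutual reachability."""
    vertices = set(graph) | {w for succ in graph.values() for w in succ}
    reach = {v: reachable(graph, v) for v in vertices}
    components = {frozenset(w for w in reach[v] if v in reach[w])
                  for v in vertices}
    return len(components)


def strong_bridges(graph):
    """Return all edges whose removal increases the number of SCCs."""
    base = count_sccs(graph)
    bridges = []
    for v in sorted(graph):
        for w in sorted(graph[v]):
            without = {x: [y for y in succ if (x, y) != (v, w)]
                       for x, succ in graph.items()}
            if count_sccs(without) > base:
                bridges.append((v, w))
    return bridges


print(strong_bridges(graph))  # [('a', 'b'), ('b', 'c'), ('c', 'a')]
```

The edge c -> b is not reported because the cycle a -> b -> c -> a keeps all
three vertices in one component even without it.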

The same goes for the table "Amount of Cycles through Edges". It tells you that
dropping libx11-dev does not have a very big effect on any architecture.
Dropping docbook-to-man only helps on kfreebsd.

> Some of these cycles make sense.  libimlib2-dev, I recognize as a
> reverse-dependency of libfreetype6-dev.  But many of the others do not.  It
> appears that even here, the information about the cycle has been *truncated*
> to only show the first 6 packages in the cycle.

That should not happen. If you find such a cycle, then it is a bug that I'd
like to know about.

> texlive-base, blt-dev, tk-dev, xvfb do not depend on freetype.

Yes, they do. Here is how you can convince yourself:

# For each package, compute its dependency cone (-c) and look for freetype in it.
for pkg in texlive-base blt-dev tk-dev xvfb; do
        echo "finding dependencies of $pkg..."
        dose-ceve -Tdeb --deb-native-arch=amd64 -c "$pkg" \
                deb:///var/lib/apt/lists/http.debian.net_debian_dists_sid_main_binary-amd64_Packages \
                | grep-dctrl -s Package -n -F Package freetype \
                || echo "no dependency on freetype found"
done

If this does not convince you, I can tell you how to use ceve together with
botch to render a graph of the individual dependency situations.

> So it's very difficult to interpret the provided information to understand
> where the cycles are, to understand why freetype is the right place to make
> the change.
> 
> In the end, I have satisfied myself (by manually testing with apt) that
> enough of these package cycles are real that yes, freetype should have a
> build profile, and I will therefore be applying these patches in my upload.
> However, as someone who does care about bootstrappability, I wanted to give
> you this feedback about the difficulty I had finding the pertinent
> information.
> 
> As I commented from the audience during the bootstrapping talk at this
> year's DebConf, I think success in making Debian bootstrappable will
> heavily depend on being able to surface the required work in a way that is
> parallelizable across all Debian maintainers.  I think this experience shows
> just how far we are from that currently, if it takes me this much effort to
> confirm the cycles when someone else has /already/ done the work of
> identifying it and supplying a patch.

If you have any ideas on how to improve the situation, then I'd be very much
interested in hearing them.

I think right now the problems we are talking about here are (mainly) twofold:

 1. the problem size

     The dependency graph is *huge*:
            http://bootstrap.debian.net/history.html
            http://mister-muffin.de/bootstrap/hairball.png (status 2013)
     And both graphs above are *already* a simplification. Half the vertices in
     these graphs are installation set vertices, each of which groups together
     100 or more packages in a single vertex. If these vertices were actually
     expanded, we would no longer be talking about a graph of 800-1000 vertices
     but probably one on the order of 50k vertices.

     The problem size creates two problems. Firstly, it becomes pointless to
     just look at simple cycles because they ignore the rest of the problem and
     thus can easily lead to a solution with little or no impact. Secondly, it
     is hard to display the information. Just as an example: you did not
     understand how texlive-base would depend on libfreetype6, and I totally
     see how, but I do not see how to display such information in an easily
     accessible way.

 2. missing metadata

     The way we find source packages to modify with build profiles has changed
     since this bug was filed more than a year ago, mainly due to Helmut's
     (CC-ed) work on rebootstrap.  The heuristics we currently have indicate
     that the src:freetype->libx11-dev build dependency does have some effect,
     but not much. I don't know which analysis Peter did prior to filing this
     bug, but right now rebootstrap is AFAIK the most reliable source for build
     profile bugs. Many pages at bootstrap.debian.net are not directly relevant
     precisely because metadata is missing.  Rebootstrap *actually* builds
     packages, so it finds which pieces of metadata are missing or wrong, and
     bugs can be filed accordingly.

     One of the biggest pieces of missing metadata is actually which packages
     are available during the native bootstrap. Most pages on
     bootstrap.debian.net assume that build-essential just *magically* exists
     which of course is not true in reality. But it has to make this assumption
     because there exists no better estimate. In practice, during the cross
     building of build-essential, many additional packages will be built which
     will then directly influence the native bootstrap.

     But again, due to missing metadata, we cannot automatically analyze the
     cross bootstrap phase. Most source packages simply fail to satisfy their
     cross build dependencies because of missing multiarch annotations, because
     of missing translation of compiler dependencies or because of perl. Again,
     there is a heuristic ( http://bootstrap.debian.net/cross.html ), but it
     also does *not* point at specific packages where the fix is to be made.
     So I do not see a way to tell maintainers "your package has this problem,
     please fix it the way you see fit".


I like your idea of parallelizing work across all Debian maintainers but I do
not know how this can be accomplished (patches welcome) as maintainers usually
do not have the view of a bootstrapper, beyond their own packages.

I think right now, extending rebootstrap and making it more useful and
accessible is the way that promises the biggest reward.

> Please let me know if there is some other forum where you would like me to
> raise these concerns.  I would very much like to see this cycle information
> exposed in a way that I can dive into it.

There is no mailing list dedicated to bootstrapping. The problem is that most
bootstrapping questions also fit perfectly into the mailing lists for
toolchains, multiarch, cross or embedded. So we usually refer people to the
respective lists. But maybe there should be a distinct list for bootstrapping
nevertheless?

Most discussion usually happens via IRC at #debian-bootstrap.

Thanks again for spending your time on investigating this problem and composing
your detailed reply!

cheers, josch
