Hi Steve, firstly, thanks a lot for your valuable feedback and the time you took to investigate this.
Quoting Steve Langasek (2015-09-12 04:12:56)
> In bug #752270, Peter refers to the "Feedback arc set" section of
> http://bootstrap.debian.net/amd64/ . Reviewing the information at
> <http://bootstrap.debian.net/amd64/#fas>, I find no references to freetype at
> present.

That means that Debian can (in theory) be bootstrapped without freetype
dropping its build dependencies on the x11 libraries.

The interesting bit is the "in theory" part. Botch creates the dependency
graph for Debian Sid and then tries to make that graph acyclic by using some
manually collected information about which source packages can drop which
build dependencies in practice. See here for the lists it uses:

https://gitlab.mister-muffin.de/debian-bootstrap/botch/tree/master/droppable

But even with that information, it often cannot make the whole graph acyclic.
Right now there are about 10 source packages where botch will remove build
dependencies even though it cannot know whether these dependencies are
actually droppable, just so that it can present a feedback arc set.

But even if the files mentioned above contained enough heuristics to make the
whole graph acyclic, they would still be missing the information about which
binary packages those source packages would *not* build or would
*additionally* build without certain build dependencies present. So this
heuristic is a very trivial method and the only way to get a more reliable
result is to use actual build profiles.

The feedback arc set is *only* a heuristic. That's why, as pointed out in the
talk, I don't think it's a good idea to directly push it to package
maintainers. In fact, the only information on the whole page that is not a
heuristic is the tables for "Self-Cycles". That's why they are at the top.
This information should show up in the tracker at some point.

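To make the "droppable first, guess afterwards" idea above a bit more
concrete, here is a minimal Python sketch. It is *not* botch's actual
algorithm. The toy graph, the node names and the helpers find_cycle() and
feedback_arc_set() are all made up for illustration, and the "guess" simply
takes the first edge of whatever cycle is found.

```python
from collections import defaultdict


def find_cycle(edges):
    """Return one cycle as a list of edges, or None if the graph is acyclic."""
    graph = defaultdict(list)
    for src, dst in edges:
        graph[src].append(dst)
    state = {}  # node -> "active" (on the DFS stack) or "done"

    def visit(node, path):
        state[node] = "active"
        path.append(node)
        for nxt in graph[node]:
            if state.get(nxt) == "active":
                # Back edge found: cut the cycle out of the current DFS path.
                cyc = path[path.index(nxt):] + [nxt]
                return list(zip(cyc, cyc[1:]))
            if nxt not in state:
                found = visit(nxt, path)
                if found:
                    return found
        path.pop()
        state[node] = "done"
        return None

    for start in list(graph):
        if start not in state:
            found = visit(start, [])
            if found:
                return found
    return None


def feedback_arc_set(edges, droppable):
    """Break cycles with known-droppable edges first, then by guessing.

    Returns (known, guessed): edges removed because they are listed as
    droppable, and edges removed without any such evidence.
    """
    remaining = set(edges)
    known, guessed = [], []
    while True:
        cycle = find_cycle(sorted(remaining))
        if cycle is None:
            return known, guessed
        candidates = [edge for edge in cycle if edge in droppable]
        if candidates:
            edge = candidates[0]
            known.append(edge)
        else:
            edge = cycle[0]  # pure guess: nothing says this edge is droppable
            guessed.append(edge)
        remaining.remove(edge)


# Hypothetical toy graph: an edge (x, y) means "x build-depends on y".
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d"), ("d", "c")]
droppable = {("c", "a")}  # assumed to be known droppable in practice

known, guessed = feedback_arc_set(edges, droppable)
print("removed (known droppable):", known)
print("removed (guessed):", guessed)
```

Everything that ends up in the "guessed" list is exactly the kind of edge
that makes the result only a heuristic.
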
> I do find freetype included in the full listing for SCC #1 under "Amount of
> Cycles through Edges", but this only shows that freetype has
> build-dependencies on docbook-to-man and libx11-dev (which I know), it
> doesn't show me how these dependencies are involved in a cycle.

It does not show you the full cycle, yes. It only shows you how many cycles
would at minimum be broken if that build dependency were dropped. The value
is "7" for both docbook-to-man and libx11-dev, which is a very low value. You
can see that other edges further up break more than 50 cycles. This is part
of the reason why freetype was not chosen to be part of the feedback arc set.

To get to the list of cycles for freetype from the table "Amount of Cycles
through Edges" you can click on the source package name.

> The docbook-to-man cycle, I'm able to work out for myself manually: freetype
> build-depends docbook-to-man depends opensp build-depends poppler-utils
> build-depends freetype. This seems to be a false cycle because
> docbook-to-man depends on sp | opensp, and sp does not build-depend on
> freetype.
>
> The libx11-dev cycle, I do not see where the cycle is when I eyeball the
> dependencies. So I was hoping to easily be able to find this information in
> the linked bootstrap reports.

From the table "Amount of Cycles through Edges" you can quickly get to the
cycles that involve the source package you are interested in by clicking on
the source package name. It is highlighted in blue on the page and I thought
that would indicate a hyperlink:

http://bootstrap.debian.net/source/freetype_2.5.2-4.html

Can you point me at the exact cycle that you think is false?

> Peter then links to <http://bootstrap.debian.net/source/freetype.html>. This
> page doesn't provide much information at all - just a table of versions and
> architectures (is an x good or bad or does it just mean information is
> available?) and a link for version 2.5.2-4.

Thanks, fixed in git.

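As a rough illustration of what a number in the "Amount of Cycles through
Edges" table means, here is a small Python sketch that enumerates the simple
cycles of a toy graph by brute force and counts how many of them pass through
each edge. The graph and the function names simple_cycles() and
cycles_through_edges() are invented for this example. Brute-force
enumeration like this only works on tiny graphs and is not how the numbers on
the page are computed.

```python
from collections import Counter, defaultdict


def simple_cycles(edges):
    """Enumerate each simple cycle once, as a tuple starting at its smallest node."""
    graph = defaultdict(list)
    for src, dst in edges:
        graph[src].append(dst)
    cycles = []

    def extend(start, node, path, seen):
        for nxt in graph.get(node, []):
            if nxt == start:
                cycles.append(tuple(path))
            elif nxt > start and nxt not in seen:
                # Only walk through nodes larger than the start node so that
                # every cycle is reported exactly once.
                extend(start, nxt, path + [nxt], seen | {nxt})

    for start in sorted(graph):
        extend(start, start, [start], {start})
    return cycles


def cycles_through_edges(edges):
    """Count, for every edge, the number of simple cycles that use it."""
    counts = Counter()
    for cycle in simple_cycles(edges):
        closed = cycle + (cycle[0],)
        for edge in zip(closed, closed[1:]):
            counts[edge] += 1
    return counts


# Hypothetical toy graph, not real Debian dependency data.
edges = [
    ("a", "b"), ("b", "a"),
    ("b", "c"), ("c", "a"),
    ("c", "d"), ("d", "a"),
]

counts = cycles_through_edges(edges)
for edge, number in counts.most_common():
    print(edge, number)
```

In this toy graph the edge ("a", "b") lies on all three cycles, so dropping
it breaks the most cycles, while an edge like ("d", "a") lies on only one and
would be a poor candidate.
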
> If I follow that link, I see it takes me to
> <http://bootstrap.debian.net/source/%5B%5Bu'gnat-4.6',%20u's390x',%20u'4.6.4-4'%5D%5D_2.5.2-4.html>.
> This is obviously incorrect.

Thanks, fixed in git.

> If I manually correct the URL to
> <http://bootstrap.debian.net/source/freetype_2.5.2-4.html>, I get a page that
> includes a list of "cycles", but only shows 10 of these by default (out of
> 236). And the first 10 cycles it lists are all about opensp again.

Are you suggesting a different way of presenting the information? If yes,
then I'd like to hear about it.

> Finally, after asking for 100 rows at a time, I start to see some relevant
> information which starts to make sense:
>
> src:freetype ⇢ libx11-dev → src:libxdmcp ⇢ w3m → src:w3m ⇢ libimlib2-dev
> src:freetype ⇢ libx11-dev → src:libx11 ⇢ w3m → src:gpm ⇢ texlive-base
> src:freetype ⇢ libx11-dev → src:libx11 ⇢ w3m → src:w3m ⇢ libimlib2-dev
> src:freetype ⇢ libx11-dev → src:libxcb ⇢ python-xcbgen → src:python2.7 ⇢
> blt-dev
> src:freetype ⇢ libx11-dev → src:libx11 ⇢ w3m → src:w3m ⇢ libimlib2-dev
> src:freetype ⇢ libx11-dev → src:libxcb ⇢ python-xcbgen → src:python2.7 ⇢
> tk-dev
> src:freetype ⇢ libx11-dev → src:libxcb ⇢ python → src:python2.7 ⇢ tk-dev
> src:freetype ⇢ libx11-dev → src:libx11 ⇢ w3m → src:gpm ⇢ texlive-base
> src:freetype ⇢ libx11-dev → src:libxcb ⇢ python → src:python2.7 ⇢ xvfb
>
> The report then starts to show some repetition, with cycles that have been
> listed previously shown again but listed separately for different
> architectures for no visible reason.

The reason is that the involved packages are different ones. You will see
that the versions of the packages and their installation sets differ.

I also don't think it's a good idea to look at the list of cycles. They give
you a very limited view of the problem, which is not sufficient to solve it
in the most efficient manner (most efficient meaning here: modify the fewest
source packages to make the graph acyclic).

The strongest indicator on that page is the table "Strong Bridges". It says
that removing the build dependency of freetype on libx11-dev only splits the
graph into individual strongly connected components on mips, mipsel, powerpc,
ppc64el and s390x. Not a very good indicator. Same for the table "Amount of
Cycles through Edges". It tells you that dropping libx11-dev does not have a
very big effect on any architecture. Dropping docbook-to-man only helps on
kfreebsd.

> Some of these cycles make sense. libimlib2-dev, I recognize as a
> reverse-dependency of libfreetype6-dev. But many of the others do not. It
> appears that even here, the information about the cycle has been *truncated*
> to only show the first 6 packages in the cycle.

That should not be. If you find such a cycle, then it is a bug that I'd like
to know of.

> texlive-base, blt-dev, tk-dev, xvfb do not depend on freetype.

Yes they do. Here is a way to convince yourself:

    for pkg in texlive-base blt-dev tk-dev xvfb; do
        echo "finding dependencies of $pkg..."
        dose-ceve -Tdeb --deb-native-arch=amd64 -c "$pkg" \
            deb:///var/lib/apt/lists/http.debian.net_debian_dists_sid_main_binary-amd64_Packages \
            | grep-dctrl -s Package -n -F Package freetype \
            || echo "no dependency on freetype found"
    done

If this does not convince you then I can tell you how to use ceve together
with botch to render a graph of the individual dependency situations.

> So it's very difficult to interpret the provided information to understand
> where the cycles are, to understand why freetype is the right place to make
> the change.
>
> In the end, I have satisfied myself (by manually testing with apt) that
> enough of these package cycles are real that yes, freetype should have a
> build profile, and I will therefore be applying these patches in my upload.
> However, as someone who does care about bootstrappability, I wanted to give
> you this feedback about the difficulty I had finding the pertinent
> information.
>
> As I commented from the audience during the bootstrapping talk at this
> year's DebConf, I think success in making Debian bootstrappable will
> heavily depend on being able to surface the required work in a way that is
> parallelizable across all Debian maintainers. I think this experience shows
> just how far we are from that currently, if it takes me this much effort to
> confirm the cycles when someone else has /already/ done the work of
> identifying it and supplying a patch.

If you have any ideas on how to improve the situation, then I'd be very much
interested in them. I think right now the problems we are talking about here
are (mainly) twofold:

1. the problem size

The dependency graph is *huge*:

http://bootstrap.debian.net/history.html
http://mister-muffin.de/bootstrap/hairball.png (status 2013)

And both graphs above are *already* a simplification. Half the vertices in
these graphs are installation set vertices, so they group together 100 or
more packages in a single vertex. If these vertices were actually expanded
we'd not be talking about a graph of 800-1000 vertices anymore but probably
one in the order of magnitude of 50k vertices.

The problem size creates two problems. Firstly, it becomes irrelevant to just
look at simple cycles because they ignore the rest of the problem and thus
can easily lead to a solution with little or no impact. Secondly, it is hard
to display the information. Just as an example: you did not understand how
texlive-base would depend on libfreetype6, and I totally see how it does, but
I do not see how to display such information in an easily accessible way.

2. missing metadata

The way we find source packages to modify with build profiles has changed
since this bug was filed more than a year ago, mainly due to Helmut's (CC-ed)
work on rebootstrap. The heuristics we currently have point out that the
src:freetype->libx11-dev build dependency does affect something, but also not
that much.
I don't know which analysis Peter did prior to filing this bug but right now
rebootstrap is AFAIK the most reliable source for build profile bugs. Many
pages at bootstrap.debian.net are not directly relevant precisely because
metadata is missing. Rebootstrap *actually* builds packages, so it finds
which pieces of metadata are actually missing or wrong, and bugs can be
filed.

One of the biggest pieces of missing metadata is actually which packages are
available during the native bootstrap. Most pages on bootstrap.debian.net
assume that build-essential just *magically* exists, which of course is not
true in reality. But they have to make this assumption because there exists
no better estimate. In practice, during the cross building of
build-essential, many additional packages will be built which will then
directly influence the native bootstrap.

But again, due to missing metadata, we cannot automatically analyze the cross
bootstrap phase. Most source packages simply fail to satisfy their cross
build dependencies because of missing multiarch annotations, because of
missing translation of compiler dependencies or because of perl. There again
exists a heuristic ( http://bootstrap.debian.net/cross.html ) but that also
does *not* point at specific packages where the fix is to be made. So I do
not see a way to tell maintainers "your package has this problem, please fix
it the way you see best fit".

I like your idea of parallelizing work across all Debian maintainers but I do
not know how this can be accomplished (patches welcome), as maintainers
usually do not have the view of a bootstrapper beyond their own packages.

I think right now, extending rebootstrap and making it more useful and
accessible is the way that promises the biggest reward.

> Please let me know if there is some other forum where you would like me to
> raise these concerns. I would very much like to see this cycle information
> exposed in a way that I can dive into it.

There is no mailing list dedicated to bootstrapping. The problem is that most
bootstrapping questions also fit perfectly into the mailing lists for
toolchains, multiarch, cross or embedded. So we usually refer people to the
respective lists. But maybe there should be a distinct list for bootstrapping
nevertheless? Most discussion usually happens via IRC at #debian-bootstrap.

Thanks again for spending your time on investigating this problem and
composing your detailed reply!

cheers, josch

