Simon McVittie writes ("Bug#1050001: Unwinding directory aliasing"):
> What do you consider to be the end goal of this proposal?
Desired end state
=================

This is a very good question.  I had a very constructive conversation
with Helmut via video chat.  It seems that there's a misunderstanding
about the desired end state.

My idea of a desired end state is as follows:

 * /bin and /lib etc. remain directories (so there is no aliasing).

 * All actual files are shipped in /usr.

 * / contains compatibility symlinks pointing into /usr, for those
   files/APIs/programs where this is needed (which is far from all of
   them).

 * Eventually, over time, the set of compatibility links is reduced to
   a mere handful.

I think this is a more desirable situation than the currently planned
end state, which is that /bin and /lib are symlinks.

Aliasing is EBW, and "Only use canonical names" is not good enough
==================================================================

There is basically one underlying technical reason for preferring the
un-aliased usrmerge approach: aliasing directories in this way leads
to great complication in file management, especially in package
management software and in individual packages.

The DEP-17 problem list is a survey of the aliasing-induced problems
which have been discovered so far.  But we (still!) keep discovering
new ones.

The current plan, as I understand it, is that we will fix these
problems by arranging to *always* name files by their canonical paths,
i.e. the ones in /usr.

Naming files by their canonical names will have to be done everywhere.
This is because any time a file is named by a non-canonical path, a
program that tries to manipulate that file might malfunction.
(Whether it malfunctions in practice depends on the precise details
and gets very complicated.)

Spotting and mitigating violations is hard
------------------------------------------

We do not currently have good tooling that will spot violations of this
rule.
It's not clear precisely what the right behaviour of our tools would
be; we need to alert *the right set of users* to the mistakes, and
*with the right level of severity*.  Many of our key tools don't have
a good way to produce "critical warnings".

The consequences of violations are unpredictable and can depend on
event ordering.  But they can be very severe.  So we are creating a
source of bad heisenbugs.

Also, we only have direct control over the behaviour of our own
packages, images, etc. that we (Debian) ship.

Any time anyone in the field (perhaps an individual sysadmin or user;
perhaps a 3rd party software supplier; perhaps a downstream distro)
violates this rule (whether through ignorance, or choice), affected
systems will malfunction.  (I think this means that relying on
lintian, for example, as a defence against these mistakes, is not good
enough.)

The answer implied by the current plan seems to be that these people
are just doing the wrong thing and will have to learn not to?

But the very existence of the directory symlinks implies a recognition
that confusion over whether to name files in / or in /usr is expected
to continue for a long time.  If it weren't, then there would be no
need for these symlinks.

Violations of the "only use canonical names" rule are required
--------------------------------------------------------------

Worse, violations of the "use only canonical names" rule are not only
expected, they are *necessary*:

There are quite a few places where we will have to keep naming files
by their names in /, because those things appear in highly stable
public APIs/ABIs.  For example, we must ship binaries that refer to
the dynamic loader in /lib; shell scripts must start #!/bin/sh.

Now, those references are almost all in "immutable" contexts, where it
doesn't actually matter, since the file is in fact available by the
non-canonical name.
However, this introduces a new implied rule: it becomes a bug to take
a filename you see in a place where the file is being *read*, and
apply it in a context where the file is going to be *updated*.

This reuse of a filename is a very natural approach.  It is something
that is frequently done by humans, but it is also sometimes done by
automatic software of many kinds.  It's not something we've even had
to consider before as a thing.  But now it is (sometimes) wrong.
Usually it will work, but sometimes it will make a (perhaps latent or
unpredictable) bug.

Looking towards the future
--------------------------

It seems to me that directory aliasing will continue to be a source of
very annoying bugs indefinitely, well after the transition is fully
complete.  In another 20 years we'll still be debugging strange
installation breakage that will turn out to be due to directory
aliasing.

I don't doubt that the bug rate will be kept "tolerably low" by QA
efforts.  However, we all know what a "tolerably low" bug rate looks
like - systems that are in practice just not quite unreliable enough
to be worth fixing.  And we have much better things to spend our time
and effort (and tolerance for bugs) on.

As I understand it the focus of the current technical work is to try
to figure out how we can get to only-canonical-paths from here, while
working around all of the (potential) bugs which arise during the
transition period, when necessarily we will be naming files sometimes
by their names in / and sometimes by their names in /usr.

This technical work seems really quite difficult.  It's certainly
clear that without funding from Freexian we wouldn't be in a position
to undertake it.

Nevertheless I think it is entirely possible that this technical work
will succeed on its own terms, in the sense that the upgrades for
systems running Debian itself will go reasonably smoothly with only a
tolerable failure rate.
But as I say I think the end state being worked towards here is far
from the best end state.

If I'm ever entitled to play the "I wrote dpkg" card, I think it's
now.  As the author of dpkg, which I intended to be highly reliable
software (and, I like to think, I succeeded), I think this is a very
poor system design.

And, the approach being taken very seriously privileges Debian itself,
and those well-staffed derivatives able to do the necessary transition
auditing (albeit, indeed, with tooling from Debian).  I am firmly
ideologically opposed to such a tradeoff.

The non-aliased approach
========================

Simon comments, on the non-aliased approach:

> This does some but not all of what merged-/usr does: calling /usr/bin/sh
> would become a non-bug, but calling /bin/env would still be an error,
> /bin would still represent non-trivial on-disk and/or in-dpkg-database
> state,

I think that in the long term, /bin *will* become trivial enough.  One
of the advantages of the non-aliased approach is that we can
continually improve it to get closer to the desired ideal.

> and we would still potentially have other issues triggered by
> the directories being distinct from one another (like the one discussed
> by the tech committee in #911225, which was exactly a regression caused
> by having moved a library in the traditional Debian way).

From my conversation with Helmut, it seems that we are envisaging, as
part of the aliased-usrmerge approach, that there will be tools to
detect violations of the "refer only by canonical path" rule.

But detecting violations of the "these directories only ought to
contain compat symlinks into /usr" rule is a *lot* simpler.  It can be
done, quite reliably, on end-user systems.

If we had done usrmerge the non-aliased way, then such a checking
program would be able to detect a /-vs-/usr bug analogous to #911225.
So I think a non-directory-aliases variant of this bug is more
tractable than a directory-aliases variant.
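
To illustrate how simple such a checking program could be, here is a
minimal sketch in Python.  It is only an illustration under stated
assumptions: the function name find_violations, the COMPAT_DIRS list
and the reason strings are invented for this example, and it does not
describe any existing Debian tool.

```python
import os
from pathlib import Path

# Hypothetical list of top-level directories that should contain only
# compatibility symlinks; the real set would be a policy decision.
COMPAT_DIRS = ("bin", "sbin", "lib")


def find_violations(root, compat_dirs=COMPAT_DIRS):
    """Return (path, reason) pairs for entries that break the rule
    "these directories contain only compat symlinks into /usr".

    Symlink targets are interpreted relative to `root`, so the check
    can be run on an unpacked chroot as well as on "/".  Targets are
    compared textually; dangling links are not reported.
    """
    root = Path(root)
    usr = Path(os.path.normpath(root / "usr"))
    violations = []
    for name in compat_dirs:
        top = root / name
        if top.is_symlink():
            # The aliased layout: the directory itself is a symlink.
            violations.append((str(top), "aliased-directory"))
            continue
        if not top.is_dir():
            continue  # directory absent: nothing to check
        # followlinks=False: symlinks to directories are listed in
        # dirnames but not descended into.
        for dirpath, dirnames, filenames in os.walk(top, followlinks=False):
            for entry_name in sorted(dirnames + filenames):
                entry = Path(dirpath) / entry_name
                if not entry.is_symlink():
                    if not entry.is_dir():
                        violations.append((str(entry), "not-a-symlink"))
                    continue  # real subdirectories are walked later
                target = os.readlink(entry)
                if os.path.isabs(target):
                    resolved = root / target.lstrip("/")
                else:
                    resolved = entry.parent / target
                resolved = Path(os.path.normpath(resolved))
                if usr not in resolved.parents:
                    violations.append((str(entry), "points-outside-usr"))
    return violations
```

Calling find_violations("/") on a system laid out as described above
would be expected to return an empty list; on a system with aliased
directories it would report /bin, /sbin and /lib themselves.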
> If I remember correctly, openSUSE tried to get from unmerged /usr to
> merged /usr by essentially the route you propose, successfully reached
> the symlink-farm state, but then got stuck without a way to get from the
> symlink farm to the single symbolic link. Do you have a plan for how that
> would be achieved without breaking upgrades or going behind dpkg's back?

As I say above, I don't think we should ever go to the state with a
single symbolic link.  The end state ought to be /lib and /bin with
about six symlinks in.

I hope this helps clarify my thinking.

Ian.

--
Ian Jackson <ijack...@chiark.greenend.org.uk>   These opinions are my own.

Pronouns: they/he.  If I emailed you from @fyvzl.net or @evade.org.uk,
that is a private address which bypasses my fierce spamfilter.