Should /usr be merged with /? (Was: Re: [gentoo-user] Re: Anyone switched to eudev yet?)

Mark David Dumlao Fri, 28 Dec 2012 09:16:48 -0800

TLDR: FHS is unrealistic about its promises. If we move our binaries and
libraries to /usr and do the work to make sure /usr is mounted, we will
better serve FHS goals and also happen to fix some systemic but
silent bugs.

On Fri, Dec 28, 2012 at 12:20 PM, Michael Mol <mike...@gmail.com> wrote:
> On Thu, Dec 27, 2012 at 5:37 PM, Mark David Dumlao <madum...@gmail.com>
> wrote:
>> Second, going back to problem solving in general - just because you
>> can put down in words what you think the problem is, doesn't mean
>> you've mapped out an accurate or even consistent statement of the
>> problem. There really are cases where it's not enough to just give
>> general airy abstractions and rules of thumb to map out a problem,
>> where it isn't obvious that you're running into edge cases until you
>> really look at it deeply, and yes, the / and /usr split is one of
>> them.
>
> So let's look at it deeply, since nobody else will. I'm game. This is the
> most detailed technical discussion of the problem anyone's cared to
> actually have, as far as I've been able to observe.

For the purposes of clarity I'm going to compare two competing
standards, which I will be identifying as follows:

s1) "FHS", or "plain FHS", based on FHS2.3, as identified in
http://www.pathname.com/fhs/pub/fhs-2.3.html
s2) "merged FHS", or "merged standard", based on FHS2.3 as above, but
with the caveat that all binaries and libraries are placed in /usr
instead of being split between /usr and /, as described by
http://www.freedesktop.org/wiki/Software/systemd/TheCaseForTheUsrMerge

It will be helpful to examine how each system reacts to strange cases
that challenge FHS.

I think some of the following considerations are helpful in
determining which one works better. Which of them you weigh most
heavily depends on which systems you're interested in
maintaining, how many people are using them, your personal taste,
sense of justice, etc. Perhaps you could add some of your own.
g1) Primary FHS purpose: software/users can predict location of
installed files and directories
g2) make distro maintainers' job easier
g3) make sysads' job easier
g4) it does not directly conflict with general practice

It is my contention that merged FHS is better than plain FHS on every
one of these goals. Secondly, it is also my contention that plain FHS
with a separate /usr does not give enough information to reliably
satisfy its own primary goal (g1). Back to the cases below.

=========================================
=== FUNDAMENTAL PROBLEM: / and /usr desync ===
=========================================
Thesis: The FHS promise of /usr being shareable cannot really be
delivered unless /usr also contains the libraries that now live in /.

>>>> And the "we have a standard" part is effectively not true anymore, on
>>>> the matter of the / and /usr split. That is - what the specification
>>>> says should happen is not happening, on a massive scale, because it
>>>> turns out that it's not that trivial to determine which binaries go in
>>>> / and which go in /usr.
>>>
>>> Give me an example, and I'll describe a reasonably detailed solution.
>>> It would be my pleasure.
>>
>> The most fundamental and relevant one for us Gentoo users is this:
>> - how can /usr be sharable among different hosts if it depends on
>> libraries in /?
>>
>> """
>> http://www.pathname.com/fhs/pub/fhs-2.3.html#THEUSRHIERARCHY
>>
>> Purpose
>>
>> /usr is the second major section of the filesystem. /usr is shareable,
>> read-only data. That means that /usr should be shareable between
>> various FHS-compliant hosts and must not be written to. Any
>> information that is host-specific or varies with time is stored
>> elsewhere.
>> """
>>
>> Many distros place fundamental libraries that many programs in /usr
>> depend on in /lib. This is especially bad for Gentoo: libraries in
>> /lib may be recompiled as same-version variants if you change the USE
>> flags, so clients that don't recompile their own libraries in /lib
>> in sync will fail, sometimes silently and sometimes noisily.
>>
>> In other words, many programs in /usr in practice are functionally
>> inseparable from the libraries in /, conflicting with the notion that
>> they were properly shared in the first place.
>
> There are certain implicit assumptions made in the spec that are important.
>
> First, it's assumed the binaries are compatible with all the hosts. It's
> assumed you're not sharing s360 binaries with x86 hosts, or sparc binaries
> with ppc hosts.
>
> From there, it's reasonable to assume that the authors of the spec assume
> the administrator to be smart enough to not do things like:
> * Mix compiler versions
> * Mix program compile options
> * Place a dependency on a binary that's going to be missing.
>
> The spec is very, very much a "do what you want within these guidelines;
> don't shoot yourself in the foot" thing, it's very much not a declarative
> bikeshedding of everything related to it.

Unfortunately, FHS actually does explicitly specify the meaning of shareable.

http://www.pathname.com/fhs/pub/fhs-2.3.html#THEFILESYSTEM

""""Shareable" files are those that can be stored on one host and used
on others. "Unshareable" files are those that are not shareable. For
example, the files in user home directories are shareable whereas
device lock files are not."""

Here I take the meaning "stored on one host and used on others", in
the case of an executable, to mean "stored on one host and
successfully executed on others".

I think it is plainly obvious that binaries are not usable on other
hosts unless they come with the exact libraries they were compiled
against. As far as the user is concerned, the library and the binary
are two parts of the same thing, except that the library is stored
separately so that memory and disk space are saved when several
programs load it.
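
To make that coupling concrete, here is a rough sketch (mine, not part
of any distro tooling) that reports which shared libraries of a binary
resolve outside /usr. It only parses the text that ldd prints; the
helper names parse_ldd, outside_usr and root_deps, and the sample
output, are illustrative assumptions.

```python
import subprocess

def parse_ldd(output):
    """Return the resolved library paths found in ldd-style output."""
    paths = []
    for line in output.splitlines():
        parts = line.split()
        # Typical line: "libfoo.so.1 => /lib/libfoo.so.1 (0x...)"
        if "=>" in parts:
            idx = parts.index("=>")
            if idx + 1 < len(parts) and parts[idx + 1].startswith("/"):
                paths.append(parts[idx + 1])
        # The dynamic loader is printed as a bare absolute path.
        elif parts and parts[0].startswith("/"):
            paths.append(parts[0])
    return paths

def outside_usr(paths):
    """Keep only the libraries that do not live under /usr."""
    return [p for p in paths if not p.startswith("/usr/")]

def root_deps(binary):
    """Run ldd on a binary and report its dependencies outside /usr."""
    result = subprocess.run(["ldd", binary], capture_output=True,
                            text=True, check=True)
    return outside_usr(parse_ldd(result.stdout))

# Hypothetical ldd output, used only to exercise the parser.
SAMPLE = """\
\tlinux-vdso.so.1 (0x00007fff0000)
\tlibreadline.so.6 => /lib64/libreadline.so.6 (0x00007f000000)
\tlibfoo.so.1 => /usr/lib64/libfoo.so.1 (0x00007f100000)
\tlibbar.so.2 => not found
\t/lib64/ld-linux-x86-64.so.2 (0x00007f200000)
"""
```

Anything this reports for a program under /usr/bin is a file that a
client mounting only the /usr share would have to supply itself. On an
already merged system the /lib paths are symlinks into /usr, so the
paths would have to be resolved first before the check means anything.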

Following the above reasoning, to properly satisfy plain FHS, basic
libraries such as readline, ncurses, gz, that various programs depend
on should at minimum have _copies_ in /usr. But even with that
satisfied, the client of the /usr share would still need to be
configured to use those copies for the programs in /usr, possibly
independently from its own programs in /.

You say that the spec assumes that the sysad / distro will not "place
a dependency on a binary that's going to be missing". Placing the
library in /usr as well is the only way to guarantee that for arbitrary
/usr clients.

So compare the following shared NFS /usr setups:

plain FHS
- some libraries are in /lib, with copies in /usr.
- clients using the /usr share need to be configured to use the
libraries in /usr over the ones in / (specifically only for the
libraries in /usr)
- distro maintainer has to determine whether library belongs in / or
/usr. If it belongs in /, a copy will also be placed in /usr.
- chance that sysad can accidentally unsync a client's / with /usr,
resulting in arbitrary unpredictable failures (hard to tell which
programs will silently segfault!)

merged FHS
- only one copy of the libraries, all in /usr
- clients using the /usr share automatically use the libraries in /usr
without reconfiguration
- distro maintainer installs all libraries to one location only
- sysad cannot desync / and /usr.
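
The desync risk in the plain FHS list can be checked mechanically. A
minimal sketch, assuming the copies in / and /usr carry the same file
names: it flags every library that exists in both directories with
different contents. The function names are hypothetical and the
directories are passed in, so it can be pointed at a client image
rather than a live system.

```python
import hashlib
import os

def file_digest(path):
    """SHA-256 of a file's contents."""
    with open(path, "rb") as fh:
        return hashlib.sha256(fh.read()).hexdigest()

def desynced_libs(root_lib, usr_lib):
    """Return names present in both directories whose contents differ."""
    mismatched = []
    for name in sorted(os.listdir(root_lib)):
        in_root = os.path.join(root_lib, name)
        in_usr = os.path.join(usr_lib, name)
        if os.path.isfile(in_root) and os.path.isfile(in_usr):
            if file_digest(in_root) != file_digest(in_usr):
                mismatched.append(name)
    return mismatched
```

With a merged layout there is only one copy, so there is nothing for
such a check to find.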

> If I were doing a shared-/usr-over-NFS setup, here's how I'd do it:
>
> 1) Have two NFS shares for /usr, A and B.
> 2) Have all my clients configured to draw from A.
> 3) Perform my upgrade on B.
> 4) Prepare my upgraded client image matching the data on B. Upgraded client
> image would mount B for /usr.
> 5) Replace my clients targeting A with clients targeting B.
>
> And tick-tock between A and B. Or proceed A->B->C, removing old shares once
> they're no longer needed.

First, the reason you need this update procedure is that you're
compensating for the point of failure you're introducing with plain
FHS: manually having to sync / and /usr. Which is functionally just
one step of sharing /usr anyways. In other words, the fact of / and
/usr being separated is what's giving you all this extra work.

Second, since you're doing an image upgrade instead of a system
upgrade, you're also complicating your process by having to either
redo your host-specific files in /etc or being dependent on additional
programs for reconfiguring each client based on mac address. Further
opportunities for accidentally shooting yourself in the foot.

>
> And in any case, I wouldn't share /usr between images with different roles.
> That way lies madness.

It might be a weird setup, but it isn't any more invalid than wanting
special performance characteristics out of having /usr separated from
/.

Consider, for instance, a classroom of thin clients that all share
the same architecture and compile farm, with some clients wanting to
boot to a "photoshop" class and other clients wanting to boot to an
"office productivity" class. Students of each class would have
effectively different roles that might be set in the / image, but
there's no good reason for their software to have to be compiled
twice.

==============================================
=== Where do udev rules belong? And the direction of udev ===
==============================================
Thesis: udev/systemd recommends the /usr merge because it fixes
subtle, silent bugs that nobody else wants to fix. We can have udev /
systemd try to work around these bugs, but right now merging /usr is
an available option.

>>>> * some teasers:
>>>> [1] udev rules themselves being a case in point. I mean, do the
>>>> requisite binaries belong in /?
>>>
>>> Udev is a dispatcher. Actually, in substance, it's a piece of the
>>> kernel that resides in userland; it exists because it was decided back
>>> around the time of devfs that what devfs was doing is something that
>>> ought to be outside the kernel. In reality, it's effectively been a
>>> userland kernel-support process its entire life.
>>>
>>> What should probably happen is that udev should be fixed to defer
>>> hotplug events until a rules file is able to successfully handle it.
>>
>> This is an interesting idea for a fix for the udev subproblem. However
>> again note that the / and /usr split/merge is a bigger thing than the
>> issues that just happen to manifest in udev. Perhaps right now they're
>> giving up on it because they'd rather not waste time fixing a sub
>> problem when fixing the / and /usr split fixes udev with less effort.
>
> When we went from x86 to x86_64, we didn't consider it the processor's fault
> for badly-written packages misbehaving. As we went from 8-bit codepages to
> UTF-8, we didn't consider it UTF-8's fault for badly-written packages
> misbehaving. As we go from IPv4 to IPv6, most don't consider it IPv6's fault
> when packages start misbehaving.
>
> We say the package needs to be fixed.

"""...when fixing the / and /usr split fixes udev with less effort."""

>
> So what on earth makes this different?

I think an analogy with programming languages makes this clear. There
are a lot of people that use terrible PHP security practices, and I'm
probably one of them on bad days. That doesn't mean that PHP is buggy.

Yes, the delivered software, in the end, still needs to be fixed.

Now the PHP devs have the ability to "fix" those bugs by twiddling the
default settings / behavior of PHP. Those things "fix" the bug by
changing PHP or its settings. OR they can add language features that,
if used correctly, will fix the bugs. Ditto udev. The udev guys can
twiddle the default settings / behavior of it to "fix" the bugs in the
udev rules. Or they can add udev logic that makes the bugs go away.

Something _is_ special when you're talking about language features.
How much software out there is still using deprecated features in PHP
4.x / 3.x? Heck, deprecated features in HTML4?

It is a fact of life that "waiting for other people to implement a fix
on their broken software", in general, doesn't work. If there's a big
enough majority following some behavior, it's better to change the
default behavior to fix the bug.

Which the merged /usr already does. Yes, it'd be nice to have extra
udev logic for detecting that rules can't execute yet. It isn't
needed, though, if you merge / and /usr. So hats off to eudev for
perhaps their most obvious coding target.
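
For what it's worth, the deferral idea can be sketched in a few lines.
This is only an illustration of the concept, not how udev or eudev
actually work: the class name, the event strings and the "handled"
list (which records events instead of running anything) are all
assumptions of mine.

```python
import os
from collections import deque

class DeferringDispatcher:
    """Queue events whose handler program is not reachable yet."""

    def __init__(self, exists=os.path.exists):
        self.exists = exists      # injectable so it can be tested
        self.pending = deque()    # events waiting for their program
        self.handled = []         # events whose program was available

    def submit(self, event, program):
        """Handle the event now if its program exists, else queue it."""
        if self.exists(program):
            self.handled.append((event, program))
        else:
            self.pending.append((event, program))

    def retry(self):
        """Call after a mount: replay queued events that can now run."""
        still_waiting = deque()
        while self.pending:
            event, program = self.pending.popleft()
            if self.exists(program):
                self.handled.append((event, program))
            else:
                still_waiting.append((event, program))
        self.pending = still_waiting
```

Even this toy version needs a queue, a retry trigger and a policy for
events that never become runnable, none of which is needed if the
programs are guaranteed to be there in the first place.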

> While I understand why *systemd*
> should care about badly-written packages with cross-mount-point
> dependencies, I don't understand why udev should, or why udev's direction
> should be so *completely* subsumed by systemd's direction.

Because it fixes the same bugs.

>>
>> On the other hand you can just edit the system so that the _default
>> setting_ makes it unnecessary to specify dependencies for like 99.xx%
>> of programs out there.
>
> And why is this a thing that needs to be done except on an as-needed basis?

BECAUSE IT FIXES BUGS.

>>>  systemd does
>>> something interesting with its "After" clause; that makes perfect
>>> sense. And that's why I asked Canek why one couldn't write a systemd
>>> service file to treat /usr as a service
>>
>> Again, it's not enough to have just a passing understanding of the
>> problem and talk airy abstractions about solutions. You really have to
>> have an understanding of the problem space.
>>
>> systemd, for similar reasons to udev, is installed to /usr by default.
>
> Please, explain these reasons.

systemd (or some of the stuff it starts) depends on some stuff in
/usr. This makes their placement in /usr technically FHS bugs (for
systemd systems) that the upstreams won't fix.
- locales
- some udev rules
- udisks
- SMART
- PCI database
- USB database

(Actually some of those seem to kinda _really_ need to be in /
following plain FHS)

That it happens to work is just either systemd or the relevant
subsystems happening to fail silently and gracefully.

Placing systemd with all the other stuff fixes those silent failures
in the same way that placing udev with all the other stuff fixes bugs
in udev rules.

Reference: (Warning: Lennartspeak)
http://thread.gmane.org/gmane.comp.sysutils.systemd.devel/1337
http://article.gmane.org/gmane.comp.sysutils.systemd.devel/1350
http://article.gmane.org/gmane.comp.sysutils.systemd.devel/1426
http://freedesktop.org/wiki/Software/systemd/separate-usr-is-broken

>
> Almost everything useful the kernel does depends on the presence of an init.
> Does that seriously challenge the assumption that init should be separate
> from the kernel?

Only if the kernel's design goal was to do those useful things itself
(it isn't). One of systemd's design goals is to minimise boilerplate
code in its unit files.

=============================================
=== Hypothetical: does client software belong in / or /usr? ===
=============================================
Thesis: we cannot predict which programs are needed for mounting
arbitrary filesystems, even though mounting "other filesystems" is
required in the root filesystem in plain FHS.

>>>> [2] fuse-based filesystems allow an administrator the crazy
>>>> possibility of, for example, demanding that /home be an ssh
>>>> connection. Should the ssh client belong in /? ftp? substitute any
>>>> arbitrary client program.
>>>
>>> System dependent binaries and libraries aren't commonly placed in
>>> /home. Your better argument would have been fuse-mounted /usr...in
>>> which case it would be the administrator's responsibility to ensure
>>> said arbitrary client program is present in /bin, and its libraries in
>>> /lib.
>>
>> It's misleading to think that /home being in ssh is an issue, because
>> the point is that the purpose of the root filesystem is:
>
> Wha? *You're* the one who brought up /home in the first place.
>
>>
>> "To boot a system, enough must be present on the root partition to
>> mount other filesystems. This includes utilities, configuration, boot
>> loader information, and other essential start-up data."
>>
>> In other words, if /home is one of the "other filesystems" tools for
>> mounting it should, according to FHS, be in the root filesystem.
>
> Why would /home be "one of the 'other filesystems' tools for mounting"? You
> made a jump here I seriously don't follow. Are we talking credentials? That
> kind of thing that's usually kept under /etc.
>

You are misreading the spec. FHS says that the root filesystem needs
to be able to check and mount "other filesystems". This doesn't mean
"the /usr filesystem or any place where executables lie", but "all
other filesystems", including /home. If /home were mounted as a
separate filesystem, then the root filesystem needs to contain
whatever binaries are necessary to mount it, whatever filesystem type
it is.

It is following this logic that mount.fuse is expected to be found in /sbin.

It is _contrary_ to the spec for any file that mount.fuse depends on to
be found outside of the root filesystem.
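
To see how far that requirement reaches on a given box, one can list
every fstab entry other than / together with its filesystem type. A
minimal sketch, assuming the usual whitespace-separated fstab fields
(device, mount point, type, ...); the function name and the sample
fstab are hypothetical.

```python
def other_filesystems(fstab_text):
    """Return (mount point, fs type) for every fstab entry except /."""
    entries = []
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        fields = line.split()
        if len(fields) < 3:
            continue  # malformed line, ignored in this sketch
        mount_point, fs_type = fields[1], fields[2]
        if mount_point != "/":
            entries.append((mount_point, fs_type))
    return entries

# Hypothetical fstab with a FUSE-mounted /home.
SAMPLE_FSTAB = """\
# device               mountpoint  type  options   dump pass
/dev/sda1              /           ext4  noatime   0 1
/dev/sda2              /usr        ext4  ro        0 2
sshfs#user@host:/home  /home       fuse  defaults  0 0
"""
```

Every type this returns is one whose mount tooling, and whatever that
tooling links against or executes, plain FHS expects on the root
filesystem.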

As you've misunderstood the spec, I'll be dropping your comments on
this scenario being invalid.

> I'm frankly unfamiliar with emacs-daemon, but (from what you say), it's
> either broken, or was never, ever intended to be run as an init service.
>

I'll just drop this as it's not essential to the argument
http://www.emacswiki.org/emacs/EmacsAsDaemon

>>
>> Now you say that it's alright, in this case, the sysad makes ssh
>> available at /bin. But this undermines the primary FHS rationale: for
>> binaries to be in _predictable_ locations for both the sysad AND the
>> software packager.
>
> The sysad controls the package manager. Regardless of the distribution,
> package placement should _always_ be done via the package manager. This is
> why every package management system that exists is accompanied with a tool
> for importing other package management systems' packages. Including binary
> and source tarballs.
>

Following this, for any distro to correctly follow FHS, there needs to
be a package manager switch to copy arbitrary packages (and dependent
libraries) from /usr to /. As far as I know, this is not yet implemented.

>> Compare: if all executables were in /usr FHS would have no problem
>> locating the client binaries, because they're all in the same place.
>
> And defeats existing functional systems without valid purpose.

What does it mean to "defeat existing functional systems"? Creating a
single merged FHS system on your network will make your other machines
explode?

There is an upgrade path from plain FHS to merged FHS (copy and
symlink). I'd know, because I've done it on my box.
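
The copy-and-symlink step looks roughly like the sketch below. It is a
simplification that assumes it is run against a staging copy of the
root tree, not a live system (moving /lib out from under running
programs would break them), and it stops on the first name that exists
in both places instead of trying to reconcile it. The function name
and the default directory list are my own choices.

```python
import os
import shutil

def merge_into_usr(root, names=("bin", "sbin", "lib")):
    """Move root/<name>/* into root/usr/<name>, then symlink back."""
    for name in names:
        top = os.path.join(root, name)
        target = os.path.join(root, "usr", name)
        if os.path.islink(top) or not os.path.isdir(top):
            continue  # already merged, or nothing to move
        os.makedirs(target, exist_ok=True)
        for entry in os.listdir(top):
            dest = os.path.join(target, entry)
            if os.path.lexists(dest):
                # Same name in both trees: needs a human decision.
                raise FileExistsError(f"conflict: {dest}")
            shutil.move(os.path.join(top, entry), dest)
        os.rmdir(top)
        # Relative link, so the tree still works when mounted elsewhere.
        os.symlink(os.path.join("usr", name), top)
```

After this, old paths such as /bin/sh keep working through the
symlinks, which is why scripts that hardcode them are not broken by
the merge.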

So compare the following setups with strange and unpredictable new
FUSE filesystems:

plain FHS
- without knowledge of all fuse filesystems installed on the machine,
the sysad cannot determine if a binary should be in / or /usr
- binaries moved to / have the chance of confusing arbitrary sysads /
maintenance scripts that hardcode path /usr/bin
- package managers need a new unimplemented feature that tells
programs to be moved to /bin
- programs moved to / will also need _copies_ of requisite libraries
in /, independent of libraries in /usr

merged FHS
- only one copy of the programs and libraries, all in /usr
- sysads / maintenance scripts will always guess correctly by
guessing /usr/bin/foo.
- no need to change any package managers

=============================================
=== Hypothetical: does server software belong in / or /usr? ===
=============================================
Thesis: Just as you cannot determine which client software is needed
for mounting arbitrary filesystems, neither can you determine which
server software will be the dependency of some filesystem.

>>>> [3] a fuse-based filesystem depends on a local network service being
>>>> started. For example, someone writes a crazy fuse mysql browser that
>>>> also is coincidentally mounted at boot time. Should the mysqld service
>>>> belong in /? ldap? substitute any arbitrary server program.
>>>
>>> And if an administrator decides to do this, it's his responsibility to
>>> make sure mysqld is located in /bin, its libraries are in /lib...and
>>> he's got to find some place other than /var for his database! By this
>>> point, you've gone so far into reductio ad absurdum I honestly can't
>>> imagine anyone apart from someone who has absolutely _no_ idea what
>>> they're doing landing themselves in that situation.
>>>
>>
>> More or less same as above. Since the admin is now manually moving
>> things around, the spec _fails to achieve its goal_.
>
> The presumption is that the admin would use the tools available to him to
> achieve his goals in a consistent fashion. We consider it an error for
> packages to be manually installed into /usr/ as opposed to /usr/local. Why
> would you think I wouldn't consider it an error for a package to be manually
> installed into /?
>
> *Anything* an admin does without the knowledge of his package manager is his
> own fault if it blows up in his face. This is why he is provided with a
> package manager in the first place.
>

As above, the necessary action is unimplemented in any package
manager I know of.

As you've gone off on a tangent and avoided the fundamental
question, this one goes more or less the same as the one with client
software. But I find that you're very myopic about what you are going
to allow in your Unix. Why, in a Plan9-like system it wouldn't be
uncommon for daemons to serve up a number of filesystems, and if
someone crazy enough makes a toy environment with FUSE, it's not our
place to stop them from following the standard.

=====================================
=== Hypothetical: Should sysad tools be in /? ===
=====================================
Thesis: Because the root user's home directory is intended for the
root user to place arbitrary emergency restore tools and information
in, it is possible for the root user to depend on tools that depend on
/usr being mounted. Under plain FHS every such tool is an FHS
violation; under merged FHS the violation does not exist.

>>>> [4] /root (which is why it's separated from /home) contains docs and
>>>> custom utilities used by root user for recovery. Unfortunately,
>>>> there's a lot of perl scripts there specifically for doing filesystem
>>>> checks / reports. Should perl be in /? substitute any scripting
>>>> language.
>>>
>>> You quoted FHS. I'll quote it back to you:
>>>
>>> "/usr, /opt, and /var are designed such that they may be located on
>>> other partitions or filesystems."
>>>
>>> /root is ridiculously out of the question. /home isn't defended by the
>>> spec, but it's commonly enough separated that it's difficult to
>>> imagine someone making that error twice.
>>
>> /root is the home directory of the root user. If it's not available
>> there, it defaults to the root directory (/). The point being the root
>> user has its own storage that defaults to the root directory,
>> independent of /home or whatnot.
>>
>> http://www.pathname.com/fhs/pub/fhs-2.3.html#FTN.AEN1037
>>
>> One good reason why it's separated from /home is because the root user
>> may download or store his own stuff there relevant to fixing that
>> machine. For example, post-install notes are often placed in /root,
>> which detail which packages were selected upon installation. Or while
>> performing lengthy system maintenance, the sysad may write down their
>> upgrade notes, or snip relevant tcpdumps, or what-have-you and store
>> them in the /root directory while the /home directory is unavailable.
>
> Of course. This is why /root as a separate filesystem is ridiculously out of
> the question.

I _didn't_ say /root was going to be a separate filesystem. I was
_only_ describing /root for the purposes of clarifying why I expect it
to have the following contents:

>> It is very conceivable for the /root directory to contain perl scripts
>> or whatever the sysad has downloaded in his adventures to grep through
>> his logs and find out what the heck is going on in his system and/or
>> repair it.
>>
>> Should perl be in / or /usr?
>
> Now that is a good question, if only because Perl traditionally _loathes_
> being in /bin, for its own philosophical reasons.
>
> Now, as a practical matter? WTF are the scripts written in Perl? Or in
> anything other than sh? If they're intended for emergency use, they've got
> some pretty fat dependencies, and should probably be launched from a full
> rescue environment instead.

The fact that Perl has fat dependencies doesn't mean that it
shouldn't, according to the spec, be placed in /. The fact is, in this
case plain FHS suggests that Perl be in /. It's that ambiguous about
it. This strongly conflicts with the history of Perl and general
practice, to the point that we'd rather break plain FHS than
break with general practice (suggesting that plain FHS is in the
wrong).

> If you need a Cadillac for bootstrapping and emergency maintenance, sure.
> But this is one of those scenarios that experience and good sense teaches
> you to avoid.

Doesn't help the fact that it still violates plain FHS.

====================================================
=== Hypothetical: Can we tell portage to install to / instead of /usr? ===
====================================================
>>> With a system such as portage, it should be entirely possible (with
>>> few code changes) to configure installation targets (/ vs /usr) on a
>>> per-package basis, and have that trickle down the dependency chain.
>>
>> Yes, that should be. In fact I think that's the cleanest way to push
>> through so far. Just add a USE=prefix-root to udev.
>
> Make it default. The change of default to /usr is what's ridiculously stupid
> about the current scenario.
>
> And, frankly, if this is as much an issue as it's described to be, then
> something beyond USE flags is necessary. A different per-package tunable is
> needed.

I don't think this is going to be trivial to implement, though. Any
package moved to / is also going to need copies of all its library
dependencies in /. I can only wonder how linking and revdep-rebuild are
going to work.
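
The "trickle down" part is at least easy to state: moving one package
to / means moving its whole transitive dependency set. A minimal
sketch, assuming the dependency graph is available as a plain mapping;
the package names below are made up and do not describe any real
ebuild.

```python
def closure(package, depends):
    """Return the package plus everything it needs, directly or not."""
    seen = set()
    stack = [package]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already visited; also guards against cycles
        seen.add(current)
        stack.extend(depends.get(current, ()))
    return seen

# Hypothetical dependency graph: package -> direct dependencies.
DEPENDS = {
    "app": ["libfoo", "libbar"],
    "libfoo": ["libbase"],
    "libbar": ["libbase", "libfoo"],
    "other": ["libbase"],
}
```

Here closure("app", DEPENDS) pulls in libfoo, libbar and libbase, while
"other" stays in /usr and now depends on a libbase that has to exist in
both places or be found across the split, which is the desync problem
all over again.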

--
This email is:    [ ] actionable   [ ] fyi        [x] social
Response needed:  [ ] yes          [x] up to you  [ ] no
Time-sensitive:   [ ] immediate    [ ] soon       [x] none

Re: Should /usr be merged with /? (Was: Re: [gentoo-user] Re: Anyone switched to eudev yet?)
