Hello,

On Wed, 3 Dec 2014 12:39:12 -0600 William Hubbs wrote:
> All,
> 
> we have a pull request on OpenRC for a dependency checker [1].
> 
> The author of this patch believes that we should not only scan for
> circular deps, but break some of them automatically.

The situation is more complicated than that. There are two loop
solvers; see below.

> I, and several other team members I have spoken with on IRC, disagree
> with this and think that we should just warn about the circular deps
> since users can break them by modifying files in /etc/conf.d, and the
> service script writers should be told about these kinds of issues so
> they can determine whether they neeed to adjust the dependencies in
> their scripts.
> 
> I wanted to post a question here to see what people think, so feel free
> to comment.
> 
> My opinion is the less automatic adjustment we do the better.

A short opinion for those not interested in reading the long text below.

1. Warn users about loops.
2. Break loops.
3. Make both options above configurable (on/off).
Gentoo is all about choice, right?

Now the long story follows.

Why do loops occur? This depends on the distribution being used.
While in Gentoo loops in deps are usually errors, in Debian loops are
unavoidable and must be handled, otherwise Debian will never accept
OpenRC as an alternative to systemd and other init systems.

Why are there loops in Debian? Here are a few cases (the full list
of reasons is not limited to them):

1. Multiple services have an "after $all" statement (the Gentoo
analog is "after *", which is currently used only by the
local init.d script).

2. LSB dependencies are allowed to be asymmetrical between start
and stop, while in OpenRC they are symmetrical. This leads to
loops in OpenRC while the same services work fine under LSB. An
example follows:

cryptdisks <-a umountfs <-u hwclock.sh <-a checkroot <-n cryptdisks
where X <-a Y means Y after X;
X <-u Y is for Y uses X; and
X <-n Y is for Y needs X.

Actually cryptdisks needs checkroot only on start and hwclock.sh
uses umountfs only on stop (shutdown), so there are no issues for
LSB, but OpenRC has a loop here, which can be broken at
umountfs <-u hwclock.sh, because "use" is the weakest type of
dependency.
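
To make the example concrete, here is a minimal Python sketch (not
OpenRC code) that writes the loop down as edges and picks the weakest
one. The tuple layout and the strength ranking ("use" < "after" <
"need") are assumptions made for illustration; the text above only
states that "use" is the weakest type.

```python
# Assumed ranking for illustration: "use" weakest, "need" strongest.
STRENGTH = {"use": 0, "after": 1, "need": 2}

# Each edge is (dependent, kind, dependency), read "dependent <kind> dependency".
LOOP = [
    ("umountfs", "after", "cryptdisks"),
    ("hwclock.sh", "use", "umountfs"),
    ("checkroot", "after", "hwclock.sh"),
    ("cryptdisks", "need", "checkroot"),
]


def is_closed_loop(loop):
    """True if every edge's dependency is the dependent of the previous edge."""
    return all(loop[i][2] == loop[i - 1][0] for i in range(len(loop)))


def weakest_edge(loop):
    """Return the edge with the lowest strength, i.e. the one to break."""
    return min(loop, key=lambda edge: STRENGTH[edge[1]])


if __name__ == "__main__":
    print(is_closed_loop(LOOP))
    print(weakest_edge(LOOP))
```

With this ranking the selected edge is the "use" dependency of
hwclock.sh on umountfs, matching the break point named above.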

While it was suggested on the #openrc IRC channel that Debian may
switch to runscript-format dependencies, this may be possible only in
the distant future (and I doubt even that), and this step is not
acceptable right now. So it is a statement of fact that OpenRC
should be compatible with LSB dependencies. Probably zigo and
heroxbd can give you more insight on this issue.

Warning users about loops is a good idea for Gentoo, but it will
produce a lot of not always wanted output on Debian; that's why
this option should be configurable.

As for the loop breaker, whether it is needed depends on the setup
and user needs. It is definitely needed in Debian for the reasons
described above.

As for Gentoo, it is desirable too, because it is better to boot the
system somehow than not to boot it at all (or to boot with long
delays due to the 60-second timeout on service startup). This is
crucial for remote servers, e.g. an admin needs to reboot a machine
ASAP due to a critical kernel security update, and having it hang
during boot is a really bad idea. Another example from my experience
is emergency shutdown due to a power failure and a low battery signal
from sys-power/nut. I had several nasty cases when the system failed
to shut down properly due to 60-second timeouts for services that
failed to stop: the battery just ran out of charge while OpenRC was
trying to do things the "right way".

Of course your mileage may vary; that's why the loop breaker option
should be configurable.

But I see no reason why we should have the right to force users to do
what we believe is right, instead of letting them choose what they
need based on their profile, preferences, setups, workflow and so on.

For those interested in more details:

There are two loop solvers. The first one is the early-loop-solver,
which is currently under discussion:
https://github.com/openrc/openrc/pull/12
https://github.com/xaionaro/openrc/tree/earlyloopdetector

It solves all loops where there is at least one weak dependency
("after" or "use"). This is done during dependency cache
generation, so there is no run-time penalty for system
startup or shutdown.
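
The idea can be sketched as follows. This is a hypothetical Python
illustration, not the algorithm from the pull request: the function
names, the edge tuples and the strength ranking ("use" < "after" <
"need") are all assumptions. It finds a cycle with a depth-first
search, drops the weakest non-"need" edge, and repeats. As a
simplification it stops at the first all-"need" cycle it meets.

```python
# Assumed ranking for illustration: "use" weakest, "need" strongest.
STRENGTH = {"use": 0, "after": 1, "need": 2}


def find_cycle(edges):
    """Return a list of edges forming one cycle, or None if there is none.

    edges: iterable of (dependent, kind, dependency) tuples.
    """
    graph = {}
    for edge in sorted(edges):  # sorted only to make the result deterministic
        graph.setdefault(edge[0], []).append(edge)

    state = {}  # node -> "active" (on the DFS stack) or "done"
    path = []   # edges on the current DFS path

    def visit(node):
        state[node] = "active"
        for edge in graph.get(node, []):
            target = edge[2]
            if state.get(target) == "active":
                # Back edge: the cycle is the path suffix starting at target.
                start = next(
                    (i for i, e in enumerate(path) if e[0] == target), len(path)
                )
                return path[start:] + [edge]
            if target not in state:
                path.append(edge)
                found = visit(target)
                if found:
                    return found
                path.pop()
        state[node] = "done"
        return None

    for node in sorted(graph):
        if node not in state:
            found = visit(node)
            if found:
                return found
    return None


def break_weak_loops(edges):
    """Drop the weakest edge of each cycle until no breakable cycle is found.

    Returns (remaining_edges, removed_edges, unbreakable_cycle_or_None).
    """
    remaining = set(edges)
    removed = []
    while True:
        cycle = find_cycle(remaining)
        if cycle is None:
            return remaining, removed, None
        weakest = min(cycle, key=lambda e: STRENGTH[e[1]])
        if weakest[1] == "need":
            # Only "need" edges in this cycle: nothing to drop at this stage.
            return remaining, removed, cycle
        remaining.discard(weakest)
        removed.append(weakest)
```

Because all of this would run while the cache is built, the graph
used at boot or shutdown is already free of the loops that could be
broken this way.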

But if one has a "hard" loop where all graph edges are
"need", e.g.:
A <-need B <-need C <-need A
then there is no way to break this loop during cache
generation and it must be broken at run time; that's why
later-loop-detector exists:
https://github.com/xaionaro/openrc/tree/laterloopdetector
And it does its dirty job :)
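
The division of labour between the two solvers can be illustrated
with a small Python sketch. The function name and the edge tuples
are hypothetical and only mirror the description above: a loop with
at least one "after" or "use" edge is handled at cache generation,
and an all-"need" loop is left for run time.

```python
def solver_stage(loop):
    """Return "early" if the loop has a weak edge, otherwise "late".

    loop: list of (dependent, kind, dependency) tuples.
    """
    kinds = {kind for _dependent, kind, _dependency in loop}
    return "early" if kinds & {"after", "use"} else "late"


# A <-need B <-need C <-need A: B needs A, C needs B, A needs C.
HARD_LOOP = [("B", "need", "A"), ("C", "need", "B"), ("A", "need", "C")]

# The same loop with one edge weakened to "use".
MIXED_LOOP = [("B", "need", "A"), ("C", "use", "B"), ("A", "need", "C")]

if __name__ == "__main__":
    print(solver_stage(HARD_LOOP))
    print(solver_stage(MIXED_LOOP))
```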

If someone is interested in the loop detection and solver algorithms,
they are well described in the following presentation:
https://github.com/xaionaro/documentation/blob/master/openrc/earlyloopdetector/early-loop-detection.pdf

I have been testing both loop detectors on several Gentoo hosts for
about 9 months now and they work fine for me. People have also tested
them in Debian for a while; one bug was found and has since been fixed.

Best regards,
Andrew Savchenko
