[gentoo-dev] The meaning of || ( a:= b:= ) dependencies

Michał Górny Sun, 03 Aug 2014 15:45:07 -0700

Hello, everyone.

I would like to hear your opinion on what should be the meaning and use
of '|| ( A:= B:= )' dependencies.



By the PMS-y definition, an any-of dependency can be satisfied by either
branch of it, and the provider can be safely switched at runtime. That
is:

  || ( A B )

means that either A or B has to be installed. If you built the package
against A, you can install B, uninstall A and everything is supposed to
work without rebuilding. That doesn't really happen when linking is
involved.

With the help of subslots and virtuals we were able to partially solve
the issue. For example, look at virtual/libudev. It binds the subslot of
the virtual to the matching subslots of the provider libraries. This way,
you can safely switch providers as long as they have matching ABIs; and
if you want to upgrade the provider to another ABI version, you need to
upgrade the virtual as well, and therefore rebuild the revdeps.

Sadly, virtuals like this can only work when you can expect providers
to have matching ABIs. That is not the case e.g. for krb5 providers
(the two have incompatible ABIs) or libav* providers (ABIs of some of
the libraries differ from version to version).


At the moment, some developers have already started mixing subslot
and any-of operator syntax:

  || ( A:= B:= )

However, this breeds really weird behavior in Portage.

With the static dependency model, it's partially understandable. ':='
atoms are expanded into specific subslots when a matching package is
installed, and otherwise left unspecified. 'Unspecified' here means that
any subslot satisfies the dependency -- as if it were plain 'A' or 'B'.

So, if during the build only A was installed, further upgrades to A can
cause subslot rebuilds. If only B was installed, rebuilds are caused by
B likewise.

However, if the other package is installed afterwards, it satisfies
the other branch of the dependency without a subslot, so the package
doesn't get rebuilt on any upgrades of A or B (since the unspecific dep
always matches). This goes on until the package is manually rebuilt and
gets the other subslot written.

Even more curious behavior occurs if both A and B are installed at
build time. In this case, subslots for both packages are expanded.
And since || means that either of the branches must match, the subslots
of both packages must change for the package to trigger a subslot
rebuild.

In other words, || ( A:= B:= ) means that subslot rebuilds happen only
if you consistently use a single provider. Provider switching or having
both providers installed breaks it.
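
The static-dep cases above can be sketched as a toy model in Python.
This is only an illustration of the matching logic as described in this
mail, not Portage's actual code; the function names and the dict-based
representation (package name -> subslot, None for an unexpanded ':=')
are assumptions made for the example.

```python
def record_at_build(providers, installed):
    """Expand each ':=' atom to the subslot installed at build time.

    A provider that is not installed at build time is recorded as None,
    i.e. the atom stays unexpanded and later matches any subslot.
    """
    return {pkg: installed.get(pkg) for pkg in providers}


def any_of_satisfied(recorded, installed):
    """'|| ( ... )' holds if at least one branch matches what is installed."""
    for pkg, subslot in recorded.items():
        if pkg not in installed:
            continue
        # Unexpanded atom (None) matches any subslot of the package.
        if subslot is None or installed[pkg] == subslot:
            return True
    return False


def triggers_rebuild(recorded, installed):
    """In this toy model, a rebuild is triggered only when no branch matches."""
    return not any_of_satisfied(recorded, installed)
```

For example, with recorded = record_at_build(["A", "B"], {"A": "1"}),
upgrading A alone gives triggers_rebuild(recorded, {"A": "2"}) == True,
but once B is also installed the unexpanded B branch matches and the
rebuild is no longer triggered.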


Dynamic deps partially fix it. Since the current := support code is
very dumb, it doesn't notice the '||' and respects all expanded
subslots found in vardb.

The main difference is that installing the other dependency doesn't
prevent subslot rebuilds from the first one from happening. For
example, if you built the package against A and install B afterwards,
an upgrade of A will still force a rebuild of the package (because
dynamic-deps code accidentally moves the A:0/1= dep out of || ()).

The code also makes the behavior with both providers installed saner.
Since both subslots are expanded, both are copied and a rebuild of
either would cause a rebuild of the package. However, in practice it
usually causes emerge to fail with a slot conflict :).

It should also be noted that the dyndeps behavior makes it impossible
to uninstall either A or B when both were installed at the reverse
dependency's build time (since both are added to the depgraph).
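
For comparison, the dyndeps behavior described above could be sketched
like this. Again, this is a toy model under the same assumed
representation (package name -> recorded subslot, None when the ':='
was never expanded), with hypothetical function names; it does not
reproduce the real Portage code paths or the slot conflict case.

```python
def dyndeps_triggers_rebuild(recorded, installed):
    """Every expanded subslot is enforced on its own, ignoring '||'.

    A rebuild is triggered as soon as any installed provider no longer
    matches the subslot that was recorded for it.
    """
    return any(
        subslot is not None
        and pkg in installed
        and installed[pkg] != subslot
        for pkg, subslot in recorded.items()
    )


def dyndeps_blocks_uninstall(recorded, pkg):
    """A provider with an expanded subslot is treated as a hard dependency."""
    return recorded.get(pkg) is not None
```

With recorded = {"A": "1", "B": None} (built against A only), installing
B later does not hide an upgrade of A:
dyndeps_triggers_rebuild(recorded, {"A": "2", "B": "1"}) is True.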


The question would be -- which behavior is desired? I'm pretty sure
Ciaran will say that the static dep behavior is correct per the
definitions, but I don't think it's really useful to have slot operator
dependencies which work only randomly. Instead, we may decide to
redefine it into something useful in a future EAPI.

In particular, I was thinking we could reuse this syntax:

  || ( A:= B:= )

to express any-of dependencies that do not support runtime switching of
providers -- since that is pretty much what := does to slots. This
would save us from creating a new syntax like '||= ()' [1].

[1]:https://bugs.gentoo.org/show_bug.cgi?id=489458


If we go this way, we also need to decide whether the order in such
a block would matter or not. In other words, whether the application can
be expected to link to the first installed package in the list, or can
link to any of them.

If the order would matter, the package would need to be rebuilt when:

1. first satisfied dependency changes subslot,

2. [optionally] package preceding the first currently satisfied
dependency is installed,

3. first currently satisfied dependency is uninstalled (but another is
installed).

If the order wouldn't matter, the package would need to be rebuilt when:

1. any of satisfied dependencies changes subslot (since we don't know
which one package links to),

2. [optionally] any of the remaining packages is installed,

3. any of satisfied dependencies is uninstalled.

The first option seems more refined, and causes fewer rebuilds. However,
it diverges further from the basic || () definition. The second tries
to fit || () and := with minimal changes.
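
The two rule sets above could be sketched as follows. This is only one
possible reading of the proposal, not an existing implementation: the
function names, the before/after dicts (package name -> subslot) and the
apply_optional_rule flag for the [optionally] items are all assumptions
made for the example.

```python
def first_satisfied(order, installed):
    """Return the first provider in the list that is installed, or None."""
    for pkg in order:
        if pkg in installed:
            return pkg
    return None


def rebuild_ordered(order, before, after, apply_optional_rule=True):
    """Order matters: assume the package links to the first installed provider."""
    old = first_satisfied(order, before)
    new = first_satisfied(order, after)
    if old is None:
        return False  # nothing was linked against; outside the listed rules
    if old not in after:
        # Rule 3: first satisfied dep uninstalled, but another is installed.
        return new is not None
    if new != old:
        # Rule 2 (optional): a provider preceding 'old' was installed.
        return apply_optional_rule
    # Rule 1: the first satisfied dependency changed subslot.
    return before[old] != after[old]


def rebuild_unordered(providers, before, after, apply_optional_rule=True):
    """Order does not matter: any satisfied provider may be the linked one."""
    was = {p for p in providers if p in before}
    now = {p for p in providers if p in after}
    # Rule 1: any satisfied dependency changed subslot.
    if any(p in now and before[p] != after[p] for p in was):
        return True
    # Rule 3: any satisfied dependency was uninstalled.
    if was - now:
        return True
    # Rule 2 (optional): any of the remaining providers was installed.
    return apply_optional_rule and bool(now - was)
```

For instance, with both A and B installed and only B changing subslot,
rebuild_ordered(["A", "B"], ...) returns False while rebuild_unordered
returns True, which is the "fewer rebuilds" difference between the two
options.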

Remaining issues:

a. behavior of || ( A:= B:= C ) -- should C cause complete provider
switching rebuilds?

b. do we need ||= ( A B C ) -- i.e. provider switching rebuilds
without subslot rebuilds?


What do you think?

-- 
Best regards,
Michał Górny
