Hi folks,
in recent months, the problems with Fink's dependency engine (and
dpkg's, and the way they interact) have become more and more
apparent. Various problems are basically impossible to overcome with
the current design, hence it seems we need a new full-fledged
dependency engine.
For some time, I hoped we might be able to just reuse the apt engine
(and the hope is still not completely gone), but as far as I can tell
it can't cope with build-time-only dependencies (but there is the
possibility to work around that, see below). If somebody is
interested in looking into this (that means you have to know C++, and
ideally also the apt sources, or be willing to dig into them), feel
free to contact me.
So, for now, instead of charging ahead and trying to write a new
dependency engine from scratch or trying to retrofit an existing one,
I tried to write down what our needs are. Then, based on this, I
started to develop ideas on how to realize these needs in actual
code. I try to present all my ideas and findings in this email. That
includes a list of problematic cases the engine needs to handle, as
well as fundamental problems, and problems that are also affecting
our current system. It'll be a long email, and maybe I should put it
on a web page later, too.
Basic needs and issues
======================
First off: the engine needs to deal with 4 basic types of dependency:
* build conflicts
* build dependencies
* install conflicts
* install dependencies
Note that the current "Depends" field in fink in fact represents both
a build time *and* an install time dependency; in a future fink
package manager release, we will hence add new fields InstallDepends
and InstallConflicts (names subject to change); this will increase
the flexibility without adding any complexity.
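
To make the four kinds concrete, here is a minimal Python sketch of a package record that keeps them apart. The class, field, and function names are hypothetical (illustration only, not Fink's actual data model), and the `build_depends` argument stands for whatever build-only list a package may declare, which is an assumption here. The point it shows: a legacy "Depends" entry expands into both a build and an install dependency.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PackageInfo:
    """Hypothetical package record with the four dependency kinds kept apart."""
    name: str
    build_depends: List[str] = field(default_factory=list)
    build_conflicts: List[str] = field(default_factory=list)
    install_depends: List[str] = field(default_factory=list)
    install_conflicts: List[str] = field(default_factory=list)


def from_legacy_fields(name: str, depends: List[str],
                       build_depends: List[str]) -> PackageInfo:
    """Expand the old 'Depends' (build *and* install time) into explicit lists."""
    return PackageInfo(
        name=name,
        # Everything in the old Depends field is needed at build time too.
        build_depends=sorted(set(depends) | set(build_depends)),
        install_depends=sorted(set(depends)),
    )


pkg = from_legacy_fields("openssh", depends=["openssl"], build_depends=["gcc"])
print(pkg.build_depends)    # ['gcc', 'openssl']
print(pkg.install_depends)  # ['openssl']
```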
Due to the way fink works, it's hard to use dpkg's engine properly;
e.g. fink "knows" it'll install all requirements for a package, but
dpkg will not know that unless we install them all at the same time.
There are many more scenarios where install/remove fails because fink
"knows more" than dpkg. The only way to overcome this is to use
"--force-depends" at least for some operations. But if we do that, we
*have* to handle all dependency issues manually (be it with a
completely new engine or with an existing engine, it doesn't matter).
Right now fink gets off cheap as it relies on the dpkg engine to
remove conflicting packages automatically, or to refuse to remove a
package which is still needed. If we override this, we have to do it
ourselves, increasing the complexity.
Another trouble area is build time dependencies, and build time in
general. Right now, fink installs build time dependencies if needed,
but doesn't remove them later (which might or might not be the right
approach, you can argue either way). We don't handle build conflicts
at all. Also, there are issues when users run multiple fink instances
at once (I do that frequently - no need to interrupt KDE building if I
just want to quickly install figlet). Right now, this can easily lead
to a screw-up. Just imagine openssh is building and the user removes
the openssl package. Ouch: either the build fails, or it gets messed
up because the openssh build now starts using Apple's openssl (just
imagine if the versions of openssl differ: then half of openssh is
linked against openssl 0.9.7, the other half against 0.9.6 - ouch).
dpkg doesn't (and can't) handle build time dependencies at all. Fink
should do that, but right now it does a very poor job at it.
apt only handles them in a very limited way (for the apt-get source
command), not sufficient for our needs.
Ideas & Solutions
=================
I developed various ideas on how to tackle the problems above and
other problems I encountered while researching this. Note that I am
still not finished yet with all this (one of the reasons for me to
finally write this down was to get my thoughts ordered, it's easy to
get lost :-).
And basically, this assumes we write our own engine...
Note: when I say deps (=dependencies) in the following, I usually
also mean conflicts, which are also a kind of (anti)dependencies. You
can view the net of packages as a graph, the packages are nodes, the
(anti)dependencies are directed edges. It's actually a bit more
complicated, a package can consist of many versions (PkgVersion in
fink), and a dependency can specify version ranges. But the idea
should be clear.
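
As a rough illustration of that graph view, the following Python sketch stores packages as nodes and labels each directed edge as either a dependency or a conflict. The `DepGraph` class and its methods are hypothetical, and versions and version ranges are deliberately left out to keep the idea visible.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# An edge is (kind, target); kind is "depends" or "conflicts".
Edge = Tuple[str, str]


class DepGraph:
    """Hypothetical package graph: nodes are names, edges are (anti)dependencies."""

    def __init__(self) -> None:
        self.edges: Dict[str, List[Edge]] = defaultdict(list)

    def add(self, source: str, kind: str, target: str) -> None:
        self.edges[source].append((kind, target))

    def closure(self, root: str) -> List[str]:
        """All packages reachable from root via 'depends' edges, in discovery order."""
        seen, order, stack = {root}, [], [root]
        while stack:
            node = stack.pop()
            for kind, target in self.edges.get(node, []):
                if kind == "depends" and target not in seen:
                    seen.add(target)
                    order.append(target)
                    stack.append(target)
        return order


g = DepGraph()
g.add("foo", "depends", "bar")
g.add("bar", "depends", "qux")
g.add("foo", "conflicts", "baz")
print(g.closure("foo"))  # ['bar', 'qux']
```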
First, I tried to break the problems down into small units. This makes
reasoning easier. For example, some of the problems we used to have
in the (limited) fink dep code were caused by the fact that the
"Depends" field actually has a blurred meaning and is *not* the same
as the dpkg "Depends" - because it also covers build time. Realizing
this and getting it straight helped a lot.
Continuing this, consider the difference between build and install
time deps.
Fink can either "install" or "build" or "build & install" a package.
So it's natural to break that down, and if you do that, you end up
with "build" objectives which have their dependencies, and almost
completely independent "install" objectives, which have their own
separate (but usually related) sets of dependencies.
That's a bit like the current approach the fink package manager uses
to decide what to build: it has a queue of packages to build. If one
runs "fink install foo", then initially it only contains "foo" (I am
lying here, but it boils down to that). Then fink goes on to iterate
over the dependencies of foo; any of them which are not installed get
prepended to the queue. Then fink iterates over the queue, and
builds/installs anything that has its dependencies fulfilled; any
missing dependencies are again prepended to the queue, and so on.
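
The queue behaviour described above can be sketched in a few lines of Python. This is a simplified model, not Fink's real code: `build_order` and its arguments are hypothetical, "build and install" is collapsed into one step, and a cycle check is added so the loop cannot run forever.

```python
from typing import Dict, List, Set


def build_order(target: str, deps: Dict[str, List[str]],
                installed: Set[str]) -> List[str]:
    """Return the order in which packages get built and installed."""
    queue = [target]
    have = set(installed)
    expanding: Set[str] = set()  # packages whose deps were already prepended
    done: List[str] = []
    while queue:
        pkg = queue[0]
        # Unmet dependencies, de-duplicated but in declaration order.
        missing = list(dict.fromkeys(d for d in deps.get(pkg, []) if d not in have))
        if not missing:
            queue.pop(0)
            have.add(pkg)      # built and immediately installed
            done.append(pkg)
            continue
        expanding.add(pkg)
        if any(d in expanding for d in missing):
            raise ValueError(f"dependency cycle involving {pkg}")
        # Prepend the missing deps, dropping copies further back in the queue.
        queue = missing + [p for p in queue if p not in missing]
    return done


print(build_order("foo", {"foo": ["bar", "baz"], "bar": ["baz"]}, set()))
# ['baz', 'bar', 'foo']
```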
While this works well in many cases, it has its limitations. The code
is pretty complicated; a package is *always* installed immediately
after it was built (with the notable exception of splitoffs, an
exception which was made to fix certain problems, but which introduced
a bunch of different problems).
So the idea is to change the queue to instead contain "objectives",
three kinds of them in fact:
* build foo-1.0-1
* install bar-2.1-2
* remove qux
The "remove" objective is new, and the operations are split. This is
needed because we may need to install db3 as a build dependency, then
remove it to be able to install db4 which another package may have as
a build dependency (a case fink currently can't handle, and which is
partly responsible for our libpng/libpng3 problem).
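
A minimal Python sketch of such an objectives list, using the db3/db4 situation as the example. The `Action` and `Objective` names are hypothetical, and the version strings are the placeholders from the list above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Action(Enum):
    BUILD = "build"
    INSTALL = "install"
    REMOVE = "remove"


@dataclass(frozen=True)
class Objective:
    action: Action
    package: str  # e.g. "foo-1.0-1"; a bare name is enough for removals

    def __str__(self) -> str:
        return f"{self.action.value} {self.package}"


# db3 is only needed while foo builds, so it can make room for db4 afterwards.
plan: List[Objective] = [
    Objective(Action.INSTALL, "db3"),
    Objective(Action.BUILD, "foo-1.0-1"),
    Objective(Action.REMOVE, "db3"),
    Objective(Action.INSTALL, "db4"),
    Objective(Action.BUILD, "bar-2.1-2"),
]
print("\n".join(str(o) for o in plan))
```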
Now the idea is to have one component which generates a batch of
these commands.
By doing that in advance, we can detect problems *before* they occur,
and immediately prompt the user on what to do (or just bail out).
E.g. we could ask them: "To install foo, the following packages have
to be removed: bar, ... Do you want to continue?"
Furthermore, we can then insert an optimizer: it can group together
installs / removes, or it can re-arrange the order of things; for
example, it might first build 3 packages, and then install them all
at once, instead of build-install-build-install-build-install. This
fixes a big class of problems, and also causes some speedup. Of
course the optimizer has to honor dependencies and all for this, but
I think it's pretty much doable.
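
One possible shape for such an optimizer, as a hedged Python sketch: it delays installs and merges them into batches, but flushes the pending batch whenever an upcoming build needs one of the waiting packages, or when a remove comes up. The function name and the plain `(action, package)` tuples are assumptions made for this example; install-time ordering inside a batch is left to the installer.

```python
from typing import Dict, List, Tuple

Step = Tuple[str, List[str]]  # (action, packages handled in that step)


def batch_installs(plan: List[Tuple[str, str]],
                   build_deps: Dict[str, List[str]]) -> List[Step]:
    """Group installs together without breaking build-time dependencies."""
    out: List[Step] = []
    pending: List[str] = []  # built, waiting to be installed

    def flush() -> None:
        if pending:
            out.append(("install", list(pending)))
            pending.clear()

    for action, pkg in plan:
        if action == "install":
            pending.append(pkg)
        elif action == "build":
            # A build that needs a waiting package forces the install first.
            if any(dep in pending for dep in build_deps.get(pkg, [])):
                flush()
            out.append(("build", [pkg]))
        else:
            # Be conservative: never move an install across a remove.
            flush()
            out.append((action, [pkg]))
    flush()
    return out


plan = [("build", "a"), ("install", "a"), ("build", "b"),
        ("install", "b"), ("build", "c"), ("install", "c")]
print(batch_installs(plan, {}))
# [('build', ['a']), ('build', ['b']), ('build', ['c']), ('install', ['a', 'b', 'c'])]
```

If `c` needed `a` at build time (`{"c": ["a"]}`), the same plan would come out as build a, build b, install a and b, build c, install c.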
The last stage is to execute the commands necessary to fulfill the
objectives list. No dep checking is needed at this point, though one
would probably still do it, a) for sanity checking and b) because
some other process/the user may have messed with the installation in
a way that breaks operations (e.g. openssl was removed). In the
latter case, fink could bail out, offer to fix the issue for the
user, or whatever.
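
A small Python sketch of that execution stage with the sanity check in place. Everything here is hypothetical scaffolding: `installed` is a callback that queries the live package state each time it is called, and `run` stands in for whatever actually builds, installs, or removes a package.

```python
from typing import Callable, Dict, List, Set, Tuple


class PlanError(RuntimeError):
    """Raised when the system no longer matches what the plan assumed."""


def execute(plan: List[Tuple[str, str]],
            build_deps: Dict[str, List[str]],
            installed: Callable[[], Set[str]],
            run: Callable[[str, str], None]) -> None:
    for action, pkg in plan:
        if action == "build":
            # Re-query the live state: another process may have changed it.
            missing = sorted(set(build_deps.get(pkg, [])) - installed())
            if missing:
                raise PlanError(f"cannot build {pkg}: missing {', '.join(missing)}")
        run(action, pkg)


# Simulated system state and runner, for demonstration only.
state = {"openssl"}
log: List[Tuple[str, str]] = []


def fake_run(action: str, pkg: str) -> None:
    log.append((action, pkg))
    if action == "install":
        state.add(pkg)
    elif action == "remove":
        state.discard(pkg)


execute([("build", "openssh"), ("install", "openssh")],
        {"openssh": ["openssl"]}, lambda: set(state), fake_run)
print(log)  # [('build', 'openssh'), ('install', 'openssh')]
```

Had openssl been removed before the build step, `execute` would raise `PlanError` instead of building against the wrong library.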
[of course we could also make fink use the same locking mechanism as
dpkg, thus avoiding any concurrent apt-get/dpkg/fink runs, but at
least for me that would be an annoying limitation]
IMHO the above approach is much cleaner than the current one in Fink,
and hopefully easier to extend/debug.
So far, I haven't actually mentioned how to deal with the
dependencies. That's because I wanted to first present the framework
for the whole thing, before getting to the dirty and important
details.
Why dependency deciding is difficult
====================================
Life would be easy if a dependency just said "install foo", and
there was exactly one foo. However, foo may exist in 5 different
versions; there might be packages that "Provides: foo". Furthermore,
dependencies can be versioned ("Depends: foo (>= 1.0-1)") or can have
alternatives ("foo | foo-ssl") or even combinations of this.
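
To show what the engine has to digest, here is a hedged Python sketch that parses such a field into AND-ed clauses of OR-ed alternatives. It assumes the dpkg-style syntax (commas between clauses, `|` between alternatives, and the operators `<<`, `<=`, `=`, `>=`, `>>`); the `Atom` type and `parse_depends` function are hypothetical, and version comparison itself is not attempted.

```python
import re
from typing import List, NamedTuple, Optional


class Atom(NamedTuple):
    name: str
    op: Optional[str]       # <<, <=, =, >=, >> or None if unversioned
    version: Optional[str]


_ATOM = re.compile(
    r"^\s*([A-Za-z0-9][A-Za-z0-9+.\-]*)\s*"
    r"(?:\(\s*(<<|<=|=|>=|>>)\s*([^\s)]+)\s*\))?\s*$"
)


def parse_depends(spec: str) -> List[List[Atom]]:
    """Return AND-ed clauses; each clause is a list of OR-ed alternatives."""
    clauses = []
    for clause in spec.split(","):
        alternatives = []
        for alt in clause.split("|"):
            m = _ATOM.match(alt)
            if not m:
                raise ValueError(f"cannot parse dependency: {alt!r}")
            alternatives.append(Atom(m.group(1), m.group(2), m.group(3)))
        clauses.append(alternatives)
    return clauses


for clause in parse_depends("foo (>= 1.0-1) | foo-ssl, bar"):
    print(clause)
```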
Also, existing packages may conflict with foo, so we may have to
remove those, which may not be possible because other installed
packages depend on them, which we then could remove, which may not be
possible, .... etc. you get the idea :-) And that doesn't even take
into account that we should ask the user for permission first.
Then, foo can conflict with installed stuff (same problem as in the
previous paragraph).
Next, foo of course also has its own long trail of dependencies
(which may be versioned, too), etc.
This can lead to three cases:
1) we can find one version of "foo" that leads to no issues
2) we can find more than one possibility
3) we can find none
Case 1 is nice of course. In case 2, we can just use one at random
like apt-get does. But of course our current approach is nice: ask
the user which to use (system-xfree86 or xfree86-rootless?
system-tetex or tetex-texmf... you get the idea). Of course, that
decision should then immediately be taken into account before asking
the user the next question (a well known problem is that you have to tell
fink several times whether to use system-tetex or not, even though it
could answer all of the following questions based on the first one -
in fact it's wrong to offer the choice again).
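
Remembering the answer is simple to sketch. In the following Python example, `choose` is a hypothetical helper: an alternative that is already installed or was already picked settles the question silently, and the user is only asked when nothing has been decided yet.

```python
from typing import Callable, List, Set


def choose(alternatives: List[str], installed: Set[str], chosen: Set[str],
           ask: Callable[[List[str]], str]) -> str:
    """Pick one alternative, asking the user at most once per real decision."""
    # An earlier decision (or an installed package) answers the question.
    for alt in alternatives:
        if alt in installed or alt in chosen:
            return alt
    pick = alternatives[0] if len(alternatives) == 1 else ask(alternatives)
    chosen.add(pick)
    return pick


asked: List[List[str]] = []


def ask(options: List[str]) -> str:
    asked.append(list(options))
    return options[0]  # stand-in for an interactive prompt


installed: Set[str] = set()
chosen: Set[str] = set()
first = choose(["system-tetex", "tetex-texmf"], installed, chosen, ask)
second = choose(["system-tetex", "tetex-texmf"], installed, chosen, ask)
print(first, second, len(asked))  # system-tetex system-tetex 1
```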
Case 3 is the nasty one. But it can occur very easily, as the
libpng/libpng3 case shows (or in the past, evolution building was
complicated by this issue due to db31 and db3 conflicting).
Anyway, should this happen, we may still be able to operate. That's
so because some things don't have to be installed permanently: build
dependencies. We can (and should!) remove those again after a package
was built. However, handling this cleverly isn't trivial either, but
at least it's possible.
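
The basic version of that idea, as a Python sketch under simplifying assumptions: `plan_build` is a hypothetical helper, the build dependencies are assumed to be available for installation already, and only packages that were not installed beforehand are removed again.

```python
from typing import List, Set, Tuple


def plan_build(pkg: str, build_deps: List[str],
               installed: Set[str]) -> List[Tuple[str, str]]:
    """Install missing build deps, build, then remove what was only temporary."""
    temporary = [d for d in build_deps if d not in installed]
    steps = [("install", d) for d in temporary]
    steps.append(("build", pkg))
    # Remove in reverse order so later installs go away first.
    steps += [("remove", d) for d in reversed(temporary)]
    return steps


installed = {"gettext"}
print(plan_build("foo", ["gettext", "db3"], installed))
# [('install', 'db3'), ('build', 'foo'), ('remove', 'db3')]
print(plan_build("bar", ["db4"], installed))
# [('install', 'db4'), ('build', 'bar'), ('remove', 'db4')]
```

Because db3 is gone again once foo is built, the conflicting db4 can be installed for the next build without any clash.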
If even that doesn't help (because the user chose to install
ghostscript while system-ghostscript is installed), fink could
determine a set of packages to remove to make the install possible
(in the example, system-ghostscript). In the past, we left that to
dpkg, but we have to know about this, too, to avoid certain problems
(and because it seems we have to call dpkg with --force-depends,
though I'll try as hard as possible to avoid that in most cases. BTW
note that apt-get does it, too, for similar reasons as I think we
have to do it).
Now, this all still doesn't tell much about how to go about the
actual dependency handling. But since this email is already very
long, I will write my thoughts on that in a second mail. I'll also
explain the chroot/fakeroot approach for package building and how it
would help us in many many ways (at the cost of more time/disk space,
though).
That said, the second mail might come tomorrow, as it's late (1 AM)
and I need sleep. Feel free to rip this one apart in the meantime (but
keep in mind, a second part will follow, too).
Cheers,
Max