-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On Sat, 7 Jul 2001, Stefan van der Eijk wrote:
>>>>What I'd love to have is the most minimal
>>>>set of packages installed, the install would then download and compile
>>>>the SRPMS according to your specified arch and optflags.
>>>>
>>>I'm actually working on a mechanism like this.
>>To be really slick: if a package fails, your script could randomly
>>install some extra packages on the machine (perhaps 50% at random of the
>>total packages in Cooker) and try again to build. If that succeeds,
>>remove some packages and try again, etc. By a kind of binary search you
>>would find the minimal set of packages needed,
>Hmmm... Would that actually work?
For sure. Brute force methods always work; it's just that they can be a
bit slow.
Let me illustrate what I mean with an example. Suppose that you install
your minimal+BuildRequires system and a package P fails to rpm --rebuild.
Suppose for the sake of argument that there are only ten
more packages included in Cooker (!), numbered 0 to 9 inclusive.
This is what happens:
Installed: (none), not installed: 0123456789.
Try to build P - fails.
Pick randomly half of the remaining packages and install.
Installed: 02469, not installed: 13578.
Try to build P - fails.
Pick randomly half (three) of the remaining packages and install.
Installed: 01234679, not installed: 58.
Try to build P - succeeds.
At this stage we know that 5 and 8 are not needed. The 'binary
search' goes in the other direction - pick half of the installed
packages and uninstall them.
Installed: 0179, not installed: 2346, not needed: 58.
Try to build P - succeeds. We know that 2 3 4 and 6 are not needed.
Uninstall half of the installed packages.
Installed: 17, not installed: 09, not needed: 234568.
Try to build P - fails.
Install half of the not installed packages - not counting those we
already know are not needed.
Installed: 179, not installed: 0, not needed: 234568.
Try to build P - succeeds. So 0 is not needed.
Uninstall half (one) of the installed packages.
Installed: 79, not installed: 1, not needed: 0234568.
Try to build P - fails.
Right, we're coming to the end now. Uninstalling a single package -
package number 1 - causes P to break. So we know that 1 is needed,
and we can install it always for the tests further on.
Needed: 1, installed: 79, not installed: (none), not needed: 0234568.
So packages 1, 7 and 9 are installed - 1 is definitely needed and
we're testing whether we need 7 and/or 9.
As an optimization, remember that we previously tried building with 1
7 and 9 installed, and it worked. So uninstall half of the installed
packages at random, as before. (1 is known to be needed, so we won't
uninstall that.)
Needed: 1, installed: 9, not installed: 7, not needed: 0234568.
Try to build P - succeeds. We know that 7 is not needed.
Uninstall half (one) of the installed packages.
Needed: 1, installed: (none), not installed: 9, not needed: 02345678.
Try to build P - fails. We know that 9 is needed.
We've now assigned every package to either 'needed' or 'not needed', so
the final result is:
Needed: 19, not needed: 02345678.
So packages 1 and 9 should be added as BuildRequires lines.
I hope you can figure out the algorithm from the example above; if not I
could write some code to implement and demo it. But I'm sure you do get
the idea.
I've taken some liberties with the meaning of 'half' - sometimes it
rounds up and sometimes down. I think rounding up is needed because
when you have only one package left in the 'installed' or 'not
installed' pools, then it's no good removing zero packages from that
pool. Although actually having just one package left is a special case -
it's this that lets you finally decide whether a package is needed.
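Here is a minimal sketch of the procedure in Python, to make the
bookkeeping concrete. It is an illustration under assumptions, not a
tested tool: the names find_build_requires, builds and rng are made up
for this example, and builds(installed) stands in for "install exactly
these extra packages and run rpm --rebuild". It also assumes that
installing extra packages never breaks a build that would otherwise
succeed. 'Half' always rounds up here.

```python
import random


def find_build_requires(candidates, builds, rng=None):
    """Return a minimal set of candidates that lets the build succeed.

    builds(installed) is a caller-supplied test (hypothetical here) that
    returns True if package P builds with that set of extras installed.
    """
    rng = rng or random.Random()
    needed = set()            # proven necessary
    installed = set()         # undecided, currently installed
    absent = set(candidates)  # undecided, currently not installed
    seen_success = False      # once True, needed|installed|absent is known to build

    def half(pool):
        # Round up, so a pool of one package still yields one package.
        return set(rng.sample(sorted(pool), (len(pool) + 1) // 2))

    ok = builds(frozenset(needed | installed))
    while True:
        if ok:
            seen_success = True
            absent.clear()            # it built without them: not needed
            if not installed:
                return needed         # every package has been classified
            moved = half(installed)   # search downwards: uninstall half
            installed -= moved
            absent = moved
        elif seen_success and len(absent) == 1:
            # Removing this single package broke a working set: it is needed.
            needed |= absent
            absent = set()
            ok = True                 # needed|installed was already seen to build
            continue
        elif absent:
            moved = half(absent)      # search upwards: install half
            absent -= moved
            installed |= moved
        else:
            raise RuntimeError("build fails even with every candidate installed")
        ok = builds(frozenset(needed | installed))
```

With the toy setup from the example, calling
find_build_requires(range(10), lambda s: {1, 9} <= s) returns {1, 9}
whatever random choices are made, because the build there depends on
exactly those two packages. If a build could be satisfied by either of
two alternative packages, the sketch would still stop at a minimal set,
but which one it finds would depend on the random picks.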
>I mean, there are _so_ many packages in cooker,
This process could take a long time to figure out the dependencies for a
package, just grinding through the set of other packages. But of course
it is nice and quick in the case of a package which lists its
dependencies properly: it builds first time, all the not-installed
packages are marked as 'not needed', and that's that.
Still, it would probably be quicker than doing the job by hand - if,
like me, you have a fast computer and slow fingers :-).
>and if you take along the dependencies it would be hard to
>pinpoint what is really required.
As you can see the method I propose doesn't bother to look at
dependencies. It just goes by what works.
>For instance, a package fails, the
>script decides to install gnome-core-devel, how would we then know if
>gnome-libs-devel or gtk+-devel would have been enough?
I think it would try gnome-core-devel along with lots of other stuff; if
that happened to work then it would try uninstalling things; eventually
it would be able to move packages into 'not needed' if it found a combo
which worked without them. So in the end, it would figure out the
minimal set of packages.
There are useful optimizations with dependencies though - the script
could know that installing KDE without Qt is 'obviously' not going to
work, and thus if it randomly chose to move KDE from the 'not installed'
set to the 'installed' set, it would move Qt too - and if moving Qt into
'not installed', all the things which depend on it would move with it.
Things would still work without knowing anything about dependencies, but
I think the brute force search could become a bit less brutish if it had
some knowledge of these things.
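A small sketch of that bookkeeping, assuming the dependency information
is already available as a plain mapping from each package to the
packages it requires. The function names and the package names in the
usage note are made up for illustration; nothing here talks to RPM.

```python
def closure(start, edges):
    """Return start plus everything reachable from it through edges."""
    seen, stack = set(), [start]
    while stack:
        pkg = stack.pop()
        if pkg not in seen:
            seen.add(pkg)
            stack.extend(edges.get(pkg, ()))
    return seen


def invert(requires):
    """Turn a 'package -> what it requires' map into 'package -> what requires it'."""
    required_by = {}
    for pkg, deps in requires.items():
        for dep in deps:
            required_by.setdefault(dep, set()).add(pkg)
    return required_by


def install_group(pkg, requires):
    # Moving pkg into 'installed' drags in everything it depends on.
    return closure(pkg, requires)


def uninstall_group(pkg, requires):
    # Moving pkg into 'not installed' drags out everything that depends on it.
    return closure(pkg, invert(requires))
```

For instance, with requires = {"kdebase": {"kdelibs"}, "kdelibs": {"qt"}},
install_group("kdebase", requires) gives kdebase, kdelibs and qt, and
uninstall_group("qt", requires) gives the same three packages from the
other direction.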
>>Once all the dependencies are correct, it wouldn't take long to just
>>verify that things work whenever a new package is released, just as a
>>kind of sanity check. It might be good for Mandrakesoft to dedicate a
>>fairly old, slow box
>Not a slow box... I've been doing this for some time now, my experience
>is that a 500MHz 256MB box is the minimum.
Depends. How often are new packages released? How long do they take to
compile? A big package might take an hour to compile on an old Pentium;
do big packages get released 24 times a day?
I don't mean recompile all of Cooker all the time, I just mean whenever
a new SRPM is released test that. I don't think Cooker changes
frequently enough to keep a machine busy all day long.
>I think it's important that the distro is capable of rebuilding itself, all
>the time.
Agreed.
>>Ideally you'd start
>>from a completely minimal setup - what you'd get if you installed
>>Mandrake and chose the smallest possible selection of packages.
>>
>True. But then the dependency structure of the distro also needs to
>cater for this --> somehow gcc also needs to be installed by the
>dependencies.
Hmm. Obviously you don't want 'BuildRequires: bash' and the like,
that's why I suggested having a minimal set of packages which is always
installed. GCC doesn't fall into that set. But you never see packages -
from MandrakeSoft, Red Hat, or whoever - with an explicit dependency
on gcc. Although that might start to happen more if people adopt more
cutting-edge C++ or C features.
It's probably a bit excessive to start adding gcc, make and so on as
BuildRequires lines - even if this is technically correct (and it may be
what Debian does).
How about this as a baseline: the setup you get when you buy a Mandrake
boxed set and choose the 'newbie install' hitting Enter for every
option. This includes make and gcc, but not that many -devel packages.
>What I'd really like to see is a visual (2D, 3D?) map of package
>dependencies.... anybody care to make an app that can do that?
I think any graph-drawing tool could do it. You'd need to write a bit
of Perl to munge the output from RPM into input for some obscure LaTeX
package.
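As a rough sketch of the munging step, here it is in Python rather than
Perl, and targeting the Graphviz DOT format rather than a LaTeX package;
both of those are substitutions made for this example. It assumes the
dependencies have already been extracted from RPM into a mapping from
each package to the packages it requires, and leaves that extraction
out. The name to_dot is made up.

```python
def to_dot(requires):
    """Render a 'package -> required packages' map as a Graphviz digraph."""
    lines = ["digraph deps {"]
    for pkg in sorted(requires):
        for dep in sorted(requires[pkg]):
            # One edge per dependency, pointing from the package to what it needs.
            lines.append(f'    "{pkg}" -> "{dep}";')
    lines.append("}")
    return "\n".join(lines)
```

The resulting text can be saved to a file and handed to a graph layout
tool such as Graphviz's dot to get a 2D picture.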
- --
Ed Avis <[EMAIL PROTECTED]>
Finger for PGP key
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org
iD8DBQE7R3FRIMp73jhGogoRAqJ+AJ9POrdKGPJg7dqukbI8aIG97CKooQCeKsY2
BGENxPzAtZP2Rq1aHuFdEos=
=gIlw
-----END PGP SIGNATURE-----