Sorry for not replying to this earlier. I think after our meeting the other day we agreed to keep the existing algorithm but document it better. Here's a stab at describing the spec:

1. compute the set of valid packages, where a package is valid if
  - all its dependencies are valid
  - it is not named in an -ignore-package flag,
  - it is not shadowed by a package with the same PackageId
    in another Package DB, unless it is in the transitive
    closure of a package named in a -package-id flag

2. for each flag:
   -package P:  expose P, and hide all other versions of P
   -package-id P: expose P, and hide all other versions of P
   -hide-package P: hide P

3. if there are multiple exposed packages with the same name, hide
   all but the most recent version.

We need to rethink the shadowing behaviour. It is designed to handle the case where we have the same PackageId (name + version) in two different DBs (e.g. global and local). Shadowing takes the topmost one of these (e.g. local, or rightmost -package-db flag). We can relax this requirement so long as the InstalledPackageIds are different, but what if the InstalledPackageIds are the same? Right now that's OK, because identical InstalledPackageIds implies identical ABIs, but if we change that so that InstalledPackageId is derived from the source and not the ABI, we would not be able to assume that two identical InstalledPackageIds are compatible.

Cheers,
Simon

On 24/07/2014 15:57, Edward Z. Yang wrote:
Right now, GHC has a very complex and hard to explain algorithm for
picking packages from the package database when you give it a pile of
-package/-package-id/-{hide,ignore,trust,distrust}-package flags.
Roughly, it currently does something like this.

1. Concatenate all of the package databases into a giant list, with
system packages first and then user packages later, removing duplicate
entries with the same installed package ID (preferring later packages).
Call this PACKAGES.

2. Take all the -package-id flags and compute their transitive closure
(call this set P).

3. Calculate the set of shadowed packages--the set of installed packages
for which there exists another package with the same package ID
(foo-0.1) which is in P or is later in the package stack.

4. Calculate the set of ignored packages---the set of packages which
match all -ignore-package flags

5. Filter out shadowed and ignored packages from the list of packages,
calling the result ALL_PACKAGES.

6. Calculate the set of broken packages---the set of packages not
contained in the least fixed point of the relation that takes
a set of packages, and adds all packages whose dependencies are
satisfied, from ALL_PACKAGES.

7. Process the package flags in order, operating on PACKAGES (not
ALL_PACKAGES).  For -package/-hide-package, take the package with the
*latest* version that matches the flag and are not broken (since this
includes shadowed packages, the result is unique) and toggle it to be
exposed/unexposed, and hide all the other packages.  For
-trust-package/-distrust-package, toggle the trusted bit for all
instances in the database.

8. If we have exposed multiple versions of the same package name,
hide all the older versions

What a mouthful!  I have no idea, given a set of flags, how this works.
So here is an alternate proposal for an alternate way of handling these
flags *when starting from an empty database* (e.g. -hide-all-packages)

Suppose we maintain a set of selected packages, which starts off empty.
Process each package flag in turn.

For -package and -package-id, get the set of installed packages which
match the flag. (If multiple package names apply, process each in turn
as if it were a separate flag.)  Compute the transitive closure of
dependencies for all of them, and filter out all choices which have
dependencies which are inconsistent with the current set of selected
packages.  Consistency without multi-instances is a mapping of a package
name to an installed package.  If there is still more than one choice,
tiebreak by version, which database and time of install. (The latter
tiebreak should not be necessary until we allow multiple instances of a
package with the same package ID.)

For -hide-package, get the set of packages which match and hide them
all; for -ignore-package, hide the transitive closure of dependencies of it.
For -trust,distrust-package, toggle for all matching packages as before.

Here are some differences in behavior between this and the previous
scheme:

- It's no longer valid to indirectly depend on two different versions
   of the same package.  Most of the time, users didn't want this anyway.
   Note that the current scheme prevents directly depending on two
   different versions by shadowing the old ones.

- We can easily extend it to handle multi-instances by relaxing the
   consistency check.

- It assumes *-hide-all-packages* at the beginning.  This scheme
   probably works less well without that: now we need some consistent
   view of the database to start with.

What do people think?

Cheers,
Edward
_______________________________________________
ghc-devs mailing list
ghc-devs@haskell.org
http://www.haskell.org/mailman/listinfo/ghc-devs

_______________________________________________
ghc-devs mailing list
ghc-devs@haskell.org
http://www.haskell.org/mailman/listinfo/ghc-devs

Reply via email to