[gentoo-dev] Re: SAT-based dependency solver: request for test cases

2018-02-13 Thread Martin Vaeth
Michael Lienhardt  wrote:
> the criteria list you gave (maybe it's in the PMS)

I doubt that it is in PMS, and IMHO it also does not belong there:
As long as the result configuration is valid (no collisions
or unresolvable loops) all should be equally fine from the
viewpoint of PMS: It is in the responsibility of the package
manager and its configuration to determine correctly what the
_user_ actually wants.

>   - in criteria a., there could be an ambiguity between what I call a package
> (e.g., 'app-editors/nano-2.9.3') and a package group (e.g.,
> 'app-editors/nano')

This is what already Michał has oberved: In my list, I had completely
forgotten to refer to versions ('app-editors/nano-2.9.3') and had only
packages ('app-editors/nano') in mind.
(I think "version" vs. "package" is the usual terminology in gentoo;
there are also "slots" which in a sense might be considered as
different packages, although perhaps with a different "penalty")

> is the criteria about package group (i.e., are version changes allowed)?

Version upgrades should even be preferred over remaining with the version.
OTOH, version downgrades are perhaps even worse than using a different package
(opinions might differ for the latter).

>   - similarly in criteria b.: is this criteria valid across versions
> (i.e., when changing version, the USE-flag change should be minimal)

Cf. the next point: The USE-flag change compared to user configuration
(package.use etc.) should be minimal.

>   - just to be sure, the "less keyword and mask change" criteria is at the
> top of the list?

Yes. Only "illegal" configurations (colliding packages or unresolvable
dependency loops of uninstalled packages) should weight stronger:
If possible at all, the user's choice should always be preferred,
and changes to the configuration should only be suggested if no other
solution exists (and probably the number of suggested changes should
be minimal in a sense).




Re: [gentoo-dev] Re: SAT-based dependency solver: request for test cases

2018-02-13 Thread Michael Lienhardt

Thanks a lot for this list!

You are totally right, simply translating the dependencies into SAT constraints 
and feeding them to a solver returns in most cases a very bad, totally useless 
solution.
However, nowadays many solvers support solution optimization, i.e., you can 
specify an ordered list of criteria you want to maximize/minimize, exactly like 
you did.
I already implemented the minimal status difference and the minimal 
installation size criteria, and the solutions I get are already acceptable/good.

I wasn't aware of the criteria list you gave (maybe it's in the PMS), so thank 
you very much, this is a big help :)
I have few questions about these criteria:
 - in criteria a., there could be an ambiguity between what I call a package 
(e.g., 'app-editors/nano-2.9.3') and a package group (e.g., 
'app-editors/nano'): is the criteria about package group (i.e., are version 
changes allowed)?
 - similarly in criteria b.: is this criteria valid across versions (i.e., when 
changing version, the USE-flag change should be minimal), across slots (i.e., 
two installed versions of the same package group should have a minimal USE flag 
difference)?
   If yes, what if package.use specifies very different USE flags for two 
versions of the same package group?
 - does the "prefer new version" criteria go between a. and b.?


b. Similarly, the number of USE-flag changes necessary to achieve this
aim should be minimized.
(You didn't tell whether your solver already supports such changes,
but when it is finished, it definitely should.)


Yes, my solver supports USE-flag, keyword and mask changes (it is currently 
oblivious of licenses, but supporting them should just be a technical/time 
consuming effort).
Due to keywords and mask changes, the "prefer new version" criteria needs to be after 
criteria "less keyword and mask change".
 - just to be sure, the "less keyword and mask change" criteria is at the top 
of the list?

Best regards,
Michael Lienhardt



[gentoo-dev] Re: SAT-based dependency solver: request for test cases

2018-02-13 Thread Martin Vaeth
Michał Górny  wrote:
>> 
>> d. In || ( ... ) clauses the left-most packages should be preserved.

s/preserved/preferred/

> you've missed the most important point: we want to prefer
> the newest version, whenever possible
> ;-).

Yes, you are right: I had thought only about packages, not about versions.
Of course, a version upgrade within the same slot should have a
negative penalty. I am already not sure how upgrades with slot
changes should be treated. And then there are subslots...
The list is probably still not exhaustive and - as already mentioned -
the penalties are certainly a topic of discussion (and even more of
trial and error: which works best in practice).

Anyway, I think that it would be a huge improvement over the current
(portage's) solver if one could formulate such a list explicitly in
some specified format, and then one abstract algorithm takes care to
find the best solution according to the specified penalties:
If I understand it correctly, all these rules are currently hard-coded
into the algorithm by using various hacks in backtracking steps,
finding of a solution is convoluted with determining the order, etc.
One would obtain a solver which not only is "provably" correct,
but also much easier to maintain and to tweak (e.g. if one realizes
that certain penalties were not cleverly chosen).

This would also provide the possibility for much richer user configuration.
An example which immediately come to mind is "weak-masking" of packages
or versions ("use it or upgrade only if no alternative is available").
Theoretically, one could then even endow the entries of package.mask
with a penalty number.




Re: [gentoo-dev] Re: SAT-based dependency solver: request for test cases

2018-02-13 Thread Michał Górny
W dniu wto, 13.02.2018 o godzinie 07∶49 +, użytkownik Martin Vaeth
napisał:
> Michael Lienhardt  wrote:
> > 
> > ad-hoc fixes and tweaks that can hardly be encoded into SAT constraints.
> 
> The main difficulty which I see is that one does not want only _some_
> solution, but among all solutions one which optimizes certain quantities.
> So it seems to me that a discrete optimization under constraints
> is required instead of a pure SAT solver (although formally, of course,
> such an optimization problem can be reduced to SAT, I do not know whether
> you went the road):
> 
> a. The number of packages which change their status (from installed to
> uninstalled or vice versa) should be minimal.
> 
> b. Similarly, the number of USE-flag changes necessary to achieve this
> aim should be minimized.
> (You didn't tell whether your solver already supports such changes,
> but when it is finished, it definitely should.)
> 
> Hopefully in near future, there will be a second class of USE-flags
> whose change is "cheap" which should be excluded from this minimization
> restriction:
> https://bugs.gentoo.org/show_bug.cgi?id=424283
> I think the main reason why nobody dared to implement them yet
> (although almost everybody wants them) are exactly these implications
> on the current solver in portage which nobody dares to change anymore
> seriously.
> 
> c. A solution with dependency loops should be avoided if possible
> (which is why currently the PDEPEND hacks do exist: To tell the solver
> which loops are not a problem.)
> 
> d. In || ( ... ) clauses the left-most packages should be preserved.
> There are similar (and more difficult) rules for REQUIRED_USE.
> 
> e. Last but not least: The preferences are ordered a > b > c > d
> (A dependency loop of uninstalled packages should probably have even
> higher priority than a).
> Additionally the change installed -> uninstalled should be less
> "expensive" than the change uninstalled -> installed.
> The correct weighting of the quantities should probably be a matter
> of discussion (or preferrably even user-customizable), and preferrably
> should not depend only on the number of packages but also on other
> customizable quantities (e.g. the package size).
> 

Thank you for listing this. However, I think you've missed the most
important point: we want to prefer the newest version, whenever possible
;-).

-- 
Best regards,
Michał Górny




[gentoo-dev] Re: SAT-based dependency solver: request for test cases

2018-02-12 Thread Martin Vaeth
Michael Lienhardt  wrote:
>
> ad-hoc fixes and tweaks that can hardly be encoded into SAT constraints.

The main difficulty which I see is that one does not want only _some_
solution, but among all solutions one which optimizes certain quantities.
So it seems to me that a discrete optimization under constraints
is required instead of a pure SAT solver (although formally, of course,
such an optimization problem can be reduced to SAT, I do not know whether
you went the road):

a. The number of packages which change their status (from installed to
uninstalled or vice versa) should be minimal.

b. Similarly, the number of USE-flag changes necessary to achieve this
aim should be minimized.
(You didn't tell whether your solver already supports such changes,
but when it is finished, it definitely should.)

Hopefully in near future, there will be a second class of USE-flags
whose change is "cheap" which should be excluded from this minimization
restriction:
https://bugs.gentoo.org/show_bug.cgi?id=424283
I think the main reason why nobody dared to implement them yet
(although almost everybody wants them) are exactly these implications
on the current solver in portage which nobody dares to change anymore
seriously.

c. A solution with dependency loops should be avoided if possible
(which is why currently the PDEPEND hacks do exist: To tell the solver
which loops are not a problem.)

d. In || ( ... ) clauses the left-most packages should be preserved.
There are similar (and more difficult) rules for REQUIRED_USE.

e. Last but not least: The preferences are ordered a > b > c > d
(A dependency loop of uninstalled packages should probably have even
higher priority than a).
Additionally the change installed -> uninstalled should be less
"expensive" than the change uninstalled -> installed.
The correct weighting of the quantities should probably be a matter
of discussion (or preferrably even user-customizable), and preferrably
should not depend only on the number of packages but also on other
customizable quantities (e.g. the package size).

Perhaps - if one can rely on some solver - in a future EAPI instead
of the size heuristic, one could give a hint for how "expensive" it
is to recompile a certain package, so that dependency results can
be better optimized for that.

IIRC, this is already built in rudimentarily in portage's current
solver by defining the recompilation costs of virtual packages as 0.

> I don't deal with installation order [...] circular dependency problem

++
Once the packages are known, the installation order and breaking of
unavoidable circles is a matter of a single graph traversal in the
resulting subgraph which needs neglectable linear time.
However, if there is a solution without a circle this should have
been found instead in the first place (cf. c).




Re: [gentoo-dev] Re: SAT-based dependency solver: request for test cases

2018-02-07 Thread Michael Lienhardt



Il 07/02/2018 05:11, Duncan ha scritto:

AFAIK, (plain) etc-portage semantics are the same for both files -- that
is, /etc/portage/package.keywords and the newer package.accept_keywords
are identical.

The reason for the new name and deprecation of the old one was that
package.keywords also exists in the /profile/, where the semantics are
different, which created confusion for devs and others attempting to edit
the profile version as well as the more commonly user-edited (plain)
/etc/portage version.

(I add the parenthesized "(plain)" because there's also the deeper
/etc/portage/profile path, which takes profile changes and therefore the
profile format.  Actually, I suspect it was someone editing that using
the wrong format and then filing a bug when things didn't work as
expected that likely prompted the new name and deprecation of the old
one.)


Ok, thank you for the information, it's very interesting.
I may need to update my solver then :) (I thought that *.keywords always 
manipulated the KEYWORDS variable, while *.accept_keywords manipulated the 
ACCEPT_KEYWORDS).


I've a rather unusual portage config here:

* /etc/portage/profile/packages contains a -* entry, negating the entire
normal @system set.  Many normally @system packages I really need are
dependencies of something or other I already have in @world anyway, and
I've added @world entries where necessary to keep the few exceptions
installed.

* My USE starts with a -* entry as well, negating profile and package USE
defaults so I have direct control of all USE flag settings and don't have
to worry about USE flag changes on profile updates or tracking down why a
flag is changing when I didn't change anything, both previous problems I
had to deal with until I set that initial -*, so the only flags set are
the ones I deliberately chose, myself.


Does the -* also remove profile USE defaults for USE flags in 
PROFILE_ONLY_VARIABLES?
My understanding is that only files in a profile (either 
/etc/portage/make.profile or /etc/portage/profile) can configure these USE 
flags.


* My world file (/var/lib/portage/world) is empty.  I've categorized
everything into individual sets found in /etc/portage/sets, with those in
turn listed in the world_sets file (/var/lib/portage/world_sets).


Interesting, I didn't know about the /var/lib/portage/world_sets file (it 
escaped me somehow).
I currently include in all the packages listed in the /etc/portage/sets/* files.
It can be fixed easily.


* Overlays... (Less unusual, but still not mainline...) I run nearly all
the kde I have installed (frameworks/plasma/apps), as well as a few other
packages, from the live-git *- packages found in the gentoo/kde
overlay (and others).  Package keywording/masking is adjusted
accordingly.  (Everything else is mainline ~amd64.)

I expect  one or more of these you won't have anticipated so they'll
present a challenge for your current script, but because it /is/ a rather
unusual setup, I'm not sure it's worth your trouble to bother with.


Out of all your points, only the overlays are not managed by our solver.
As your setup is unusual, apar from the overlay, it could be a good test case 
for our tool to check the configuration loading.
Right now in my tests, I deal with overlay by adding manually all the installed 
packages that is missing from the base portage tree...
If there is not too much packages to add (I guess 50 is still ok, 100 is a bit 
too much).


OTOH, if you want unusual corner-cases to test...


I'm from formal methods, focused on correction and completeness...
So yes, corner cases are important and interesting, and are useful to detect 
early bug or design errors.


So bother sending the results in (you're ready for it already), or you
want them, but wait until you've adjusted the script to deal with it, or
don't bother, you're not going to try supporting anything that unusual
anyway?


Send it in :).
If I have problems with it, I may not fix the bugs right away, but at least I 
would know about it.

Thanks,
Michael Lienhardt




[gentoo-dev] Re: SAT-based dependency solver: request for test cases

2018-02-06 Thread Duncan
Michael Lienhardt posted on Tue, 06 Feb 2018 23:53:05 +0100 as excerpted:

> Il 06/02/2018 15:00, Roy Bamford ha scritto:
>> Posting here to alert other potential helpers.
>> Your script uses
>> 
>> PACKAGE_KEYWORDS="/etc/portage/package.accept_keywords"
>> 
>> but thats a recent name change.
>> 
>> PACKAGE_KEYWORDS="/etc/portage/package.keywords"
>> 
>> is the old name and my older systems still use that.
>> You probably need to harvest both and process both as portage does.
> 
> You are right, I was lazy (and hoped that everyone already made the
> switch because, as I understand it, package.keywords and
> package.accept_keywords do not have the same semantics).
> I added the package.keywords file/folder in the script.

AFAIK, (plain) etc-portage semantics are the same for both files -- that 
is, /etc/portage/package.keywords and the newer package.accept_keywords
are identical.

The reason for the new name and deprecation of the old one was that 
package.keywords also exists in the /profile/, where the semantics are 
different, which created confusion for devs and others attempting to edit 
the profile version as well as the more commonly user-edited (plain)
/etc/portage version.

(I add the parenthesized "(plain)" because there's also the deeper 
/etc/portage/profile path, which takes profile changes and therefore the 
profile format.  Actually, I suspect it was someone editing that using 
the wrong format and then filing a bug when things didn't work as 
expected that likely prompted the new name and deprecation of the old 
one.)


Meanwhile...

I've a rather unusual portage config here:

* /etc/portage/profile/packages contains a -* entry, negating the entire 
normal @system set.  Many normally @system packages I really need are 
dependencies of something or other I already have in @world anyway, and 
I've added @world entries where necessary to keep the few exceptions 
installed.

* My USE starts with a -* entry as well, negating profile and package USE 
defaults so I have direct control of all USE flag settings and don't have 
to worry about USE flag changes on profile updates or tracking down why a 
flag is changing when I didn't change anything, both previous problems I 
had to deal with until I set that initial -*, so the only flags set are 
the ones I deliberately chose, myself.

* My world file (/var/lib/portage/world) is empty.  I've categorized 
everything into individual sets found in /etc/portage/sets, with those in 
turn listed in the world_sets file (/var/lib/portage/world_sets).

* Overlays... (Less unusual, but still not mainline...) I run nearly all 
the kde I have installed (frameworks/plasma/apps), as well as a few other 
packages, from the live-git *- packages found in the gentoo/kde 
overlay (and others).  Package keywording/masking is adjusted 
accordingly.  (Everything else is mainline ~amd64.)

I expect  one or more of these you won't have anticipated so they'll 
present a challenge for your current script, but because it /is/ a rather 
unusual setup, I'm not sure it's worth your trouble to bother with.

OTOH, if you want unusual corner-cases to test...

So bother sending the results in (you're ready for it already), or you 
want them, but wait until you've adjusted the script to deal with it, or 
don't bother, you're not going to try supporting anything that unusual 
anyway?

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman