Re: [gentoo-dev] [RFC] Forced/automatic USE flag constraints (codename: ENFORCED_USE)

Alexis Ballier Sun, 11 Jun 2017 09:05:42 -0700

On Fri, 09 Jun 2017 18:21:50 +0200
Michał Górny <[email protected]> wrote:

> (cut off the parts where I agree and there's nothing to add)
> 
> On pią, 2017-06-09 at 16:16 +0200, Alexis Ballier wrote:
> > [...]  
> > > > In your example above, we'd call 'nsolve("|| ( X )")' and
> > > > 'nsolve("|| ( Y )")' (or even simpler, depending on how
> > > > simplify() is defined). If both X and Y are masked on a
> > > > profile, then that'd reduce to 'nsolve("False")' which would
> > > > rant.    
> > > 
> > > So you're talking about reducing prior to transforming? Yes,
> > > that'd work. As I mentioned in one of the first mails wrt my
> > > reference implementation, I've used reordering (stable sort)
> > > instead of reducing since it was simpler.
> > > 
> > > If you reduce (simplify), you need to account for special cases
> > > like getting '|| ()' etc. If you reorder only, things just fail
> > > the normal way.  
> > 
> > While the reordering idea seems nice as it factors both user
> > preference and masks, the problem with reordering is that nothing
> > guarantees that the solver won't try to enable a masked flag. We'd
> > have to deal with that somehow.  
> 
> Well, yes and no.
> 
> The algorithm always needs to account for the possibility of
> constraints altering immutable flags. Of course, there's more than
> one way of doing it.
> 
> AFAIU you are aiming for separate processing of immutable flags
> and explicit failure if the constraints would attempt to force value
> of those flags. That surely makes sense for verification.

The semi-hidden goal here for me is to have purely ast rewriting rules
giving a list of implications. This makes the solver trivial as those
are read as "if condition then constraint" and can be used as input for
the checker. Failing that, this would need to be done on the checker
side anyway and then we might run into problems like the checker not
really checking reality since the solver behaves a little bit
differently.

> My approach was simpler -- marking the flags immutable, and failing if
> something actually tries to alter their value. I think it's a simpler
> solution for the plain solver and it works as well. After all, we do
> not want the solver to attempt to find workarounds for the problem
> but just fail.

This should be equivalent: masked flags will be toggled as last resort
and fail; eliminated flags will not be toggled at all and fail if
having them as immutable causes a contradiction

> The above applies clearly to the plain conflicts such as:
> 
>   foo? ( bar )
> 
> where bar is immutable. The n-ary operators can be flexed a little.
> That's what reordering achieves -- it makes sure they come as the most
> or the least preferred as appropriate. Which means that the same
> algorithm either succeeds (by not having to touch them) or fails at
> attempting to flip them.
> 
> Think of:
> 
>   ?? ( a b c )
> 
> with both b&c in use.force. This gets reordered to:
> 
>   ?? ( b c a )
> 
> The order between b&c doesn't matter. Since b comes first now, it
> forces c being disabled. Since c is immutable, the solver fails with
> ImmutabilityError. We solve the problem with minimal code redundancy.

Considering that code should ideally be checked, that'd be '?? ( a true
true )' reducing to 'false' and a repoman error.

> > I think reordering should be kept for user preferences
> > (soft-enable/soft-disable) while masks for hard-no's or hard-yes'es.
> > 
> > 
> > Be careful with reordering though:
> > '^^ ( a b ) b? ( a )' can be solved in one pass.
> > (it disables b if both are set and enables a if none are set)
> > 
> > while:
> > '^^ ( b a ) b? ( a )' loops
> > (if both are set it disables 'a' for the 1st clause but then
> > enables it for the 2nd)
> > 
> > This is not checked by nsolve().  
> 
> Yes, this is a problem I was considering. I planned to think a bit if
> we could possibly generate some more complex transformations to
> trigger nsolve to notice this kind of issues.

Except feeding the n! possible reorderings to nsolve() and checking them
all I don't see many other possibilities.

We could think about a transformation that would be order agnostic,
like '|| ( a b c )' giving the same output as '|| ( b c a )' but then
this would not express any preference anymore. Remember: The whole
point of the solver is to break the symmetry of SAT formulas so that
there is a natural way to solve them instead of just "figure out some
useflags that make it work". In other words, order does actually
matter a lot, otherwise you fall into the "solve SAT" trap again.

> And now two updates on other matters.
> 
> Firstly, all-of inside ??. Per the construct:
> 
>   ?? ( ( a b ) c )
> 
> the expansion would be:
> 
>   [a b]? ( !c ) c? ( ![a b] )
> 
> which means we should be able to easily affect the effective behavior
> of both implementations by defining how to handle/expand negations of
> all- of groups.
> 
> It's then a matter of replacing it with:
> 
> a. !a, or
> 
> b. !a !b.
> 
> As you pointed out, a. has the advantage that we alter less flags. b.
> has the advantage that we alter more flags -- so it's less likely
> we'll actually leave some unused flag enabled. Whichever we choose, it
> probably doesn't matter as I can't think of a valid use case for this
> constraint that would clearly define the result.

I think this can result in (otherwise solvable) errors too:
"c? ( ![a b] ) b" is ok if expanded to "c? ( !a ) b" but not if
expanded to "c? ( !a !b ) b" with USE="-* c".

This is quite a stupid and extreme example but I can't think of any
case that solve() would work when negating all of them but not when
negating only the first.

> Secondly, nested n-ary operators. I have taken the following snippet
> as a simple example:
> 
>   || ( a || ( b c ) )
> 
> Logically (and per constraint checking algo), this should be
> equivalent to:
> 
>   || ( a b c )
> 
> However, if we expand it to implication form, we get:
> 
>   ![ || ( b c ) ] => a
> 
>   ![ !c => b ] => a
> 
> At this point, we already see some contract problem/ambiguity. Per
> contract, we are supposed not to alter any flags on LHS of
> implication. However, we have another implication there, so it is
> unclear if RHS of that nested implication should be mutable or not.
> 
> Let's consider the nested implication first:
> 
>   !c => b
> 
> Per the constraint checking rules, this constraint is met (evaluates
> to true) either if c is enabled, or both c is disabled and b is
> enabled. In other words, it fails (evaluates to false) only if both b
> and c are disabled. Putting that into a table we get:
> 
>   b c | ?
>   0 0 | 0 (fail -- LHS matches, RHS does not)
>   0 1 | 1 (LHS does not match)
>   1 0 | 1 (LHS & RHS matches)
>   1 1 | 1 (LHS does not match)
> 
> Per the solving rules, in the only failing case we should enforce RHS
> -- i.e. enable b.
> 
> Now, let's consider its negation:
> 
>   ![ !c => b ]
> 
> Per the rules of logic, it is true (= matches the constraint) only if
> both b and c are disabled. While it is unclear if we should be
> enforcing RHS inside negation, logically saying I don't think it can
> actually happen. Because:
> 
> 1. if b=c=0, the whole negated constraint is satisfied, so we
> shouldn't apply any solving to it (attempting to solve it would be
> inconsistent with the case where whole REQUIRED_USE is satisfied
> immediately),
> 
> 2. in any other case, the inner constraint is satisfied, so there's no
> change to be applied.
> 
> That considered, I think we could reasonably replace the negation of
> implication with plain conjunction of LHS and negation of RHS:
> 
>   [ !c !b ]
> 
> This would render the outer expression as:
> 
>   [ !c !b ] => a
> 
> which is equivalent to || ( a b c ).
> 
> The question is whether applying that rule for implications nested on
> LHS of another implication is going to work fine for all our
> expansions.
> 

We have two issues here I think: We want to preserve the semantics of
the whole thing, that is: solve(rewrite(x)) always does the exact same
thing as solve(x) (and its definition) and rewrite(x) should not grow x
exponentially (or at least, not too much). To add to the fun, we want
solve(rewrite(x)) to always provide a valid combination and rewrite(x)
to play nice with nsolve() (or alter a bit nsolve to play nice with it).

|| ( foo? ( bar ) baz ) would read as "if not baz and foo then bar else
baz"; this is equivalent to "foo? ( || ( bar baz ) ) !foo? ( baz )".
However, this roughly doubles the size of the formula for each nested
implication.

If we blindly apply the || expansion, then we get:
 '!baz? ( foo? ( bar ) )' so that the solver does not give a valid
 answer for USE='-*' as it won't change anything and this does not
 match the original constraint.

foo -> bar is logically equivalent to || ( !foo bar ); semantically
this is wrong since we prefer leftmost and this would mean 'if not bar
then disable foo' while we want 'if foo then bar'. So we should write
it '|| ( bar !foo )'. This is why, while truth tables are necessary,
you should not rely on them either since they fail to capture the non
commutativity of the ||, && & co.

After replacing, we get:
|| ( || ( bar !foo ) baz )
and we're back to your original problem.

I'm starting to think the nested implications are not implications at
all.

If we want '|| ( foo? ( bar ) baz )' to be equivalent to 'foo? ( ||
( bar baz ) ) !foo? ( baz )' then 'foo? ( bar )' should evaluate to
false when foo is disabled so that the algorithm chooses 'baz'.
That'd be: 'foo? ( bar )' is replaced by '( foo bar )', giving:

'|| ( ( foo bar ) baz )' being rewritten as '!baz? ( foo bar )'. This
starts to make more sense: Now the solver will always provide a valid
answer. We can also see that the leftmost preference is now very
strong as it will seriously force the leftmost clause to be true.
There is not much better we can do here I think.

Now, consider its AllOf equivalent '( foo? ( bar ) baz )'; replacing as
above gives '( foo bar baz )' enabling them all. While this works, this
is likely not what we want. Better replace it by '( || ( bar !foo ) baz
)' as a real implication, giving two toplevel implications: 'foo? ( bar
)' and 'baz'.

And finally '?? ( foo? ( bar ) baz )': "if baz and foo then not bar".
Here it is equivalent to '?? ( ( foo bar ) baz )'.

While it is a bit annoying to have to look at a clause's parent in the
AST in order to rewrite it, we have now eliminated nested implications
without exponential grow.

Note also that we need to merge nested implications too:
|| ( a? ( b? ( c ) ) d )
should read as:
|| ( [a,b]?[c] d ) and be translated to:
|| ( ( a b c ) d )

This reads as: 'if not d then a, b and c'

If we do not merge them and apply the algorithm recursively in a
breadth-first way we get:
|| ( ( a b? ( c ) ) d )
And then, since we have an AllOf clause:
|| ( ( a || ( c !b ) ) d )

Reading as: 'if not d then ( if b and not c then a )' where the solver
will not change anything with e.g. USE='-* c' leaving the initial
constraint unsatisfied. We can force this behavior with the following
constraint: '|| ( a? ( b? ( c ) e ) d )' (so that a? ( AllOf(['b?c',
'e'] )' does not allow easy merging of the implications).

So we have to forget about merging nested implications and have a
problem here. Even for the solver. I think the safest way is to always
replace nested implications 'foo? ( ... )' by AllOf([foo, ...]).
While it would give some unexpected results, this would make the solver
work in every case.

|| ( a? ( b? ( c ) ) d ) will be '|| ( ( a b c ) d )' in every case.
|| ( a? ( b? ( c ) e ) d ) will be '|| ( ( a b c e ) d )'.

Then comes '??':
?? ( c1 c2 ... ck ) is expanded to:
c1? ( !c2 ... !ck )
c2? ( !c3 ... !ck )
etc.

One problem we've seen is computing things like !c1 when c1 is, for
example, an implication. The above replacement makes that easy since
we're now in a boolean algebra and can bubble down the negations to
useflags.

?? ( a? ( b ) c d )
becomes:
?? ( ( a b ) c d )
then:
(( a b ) ? (!c !d))
c? ( !d )
which is good.

?? ( c a? ( b ) d )
becomes:
?? ( c ( a b ) d )
then:
c? ( !(a b) !d ) -> c? ( || ( !a !b ) !d )
(a b)? ( !d )
then, expanding everything:
c? !d
c? b? !a
a? b? !d

which is good too.

We can define how to bubble down the negations:
 Not(AnyOf(l)) -> AllOf([Not(x) for x in l])
 Not(AllOf(l)) -> AnyOf([Not(x) for x in l])

Another option you suggested is Not(AllOf(l)) -> AllOf([Not(x) for x in
l]). This preserves solving but breaks Not(Not(x)) == x here. In the
above this would change 'c? ( !(a b) !d )' as 'c? ( !a !b !d )' which
would work too but be more constraining.

After eliminating nested implications, ^^, ?? and bubbling down the
negations, the remaining AST for the constraints is much simpler:
rclause =
        AllOf(list of rclauses)
      | AnyOf(list of rclauses)
      | Not(useflag)
      | useflag

And a REQUIRED_USE is a list of Implication([list of flags or !flags],
rclause), where the list is read as 'AllOf' which we want to transform
as a list of 'Implication([list of flags or !flags], [list of flags
or !flags])'.

We can do that recursively:
transform (expr: rclause) -> list(Implication[list of flags or !flags],
[list of flags or !flags])

Here I assume Implication([],[b]) is 'b') and Implication([...],[]) is
nothing.

transform (expr:rclause):
  match expr with:
     useflag | !useflag -> [ Implication([], [expr]) ]
   | AnyOf(l) ->
        f = l.pop(0)
        if len(l)>0: return merge(Not(AnyOf(l)), transform(f))
        else return transform(f)
   | AllOf(l) ->
        r = []
        for i in l: r+=transform(i)
        return r

And the merge function:

merge (condition:rclause, consequences:list of implications):
   match reduce(rclause) with:
  useflag | !useflag as x -> [ Implication([x]+c.conditions,
  c.consequences) for c in consequences ]
 | AnyOf(l) ->
        r = []
        for c in l:
                r+=merge(c,consequences)
        return r
 | AllOf(l) ->
        if len(l) <= 0: return consequences
        return merge(l[0], merge(AllOf(l[1:]),consequences))

Note that this is exponential (we're basically computing the DNF form
of the formula here), but the exponent is roughly the number of
alternating nested || and &&, so that should still be ok.

Let's apply that to your examples:
transform(|| ( a || ( b c ) ))
-> merge( Not(AnyOf('|| ( b c )'), transform('a'))
-> merge( AllOf(AllOf(['!b', '!c']), [Implication([],['a'])]) 
(bubbling down the Not())
-> merge( AllOf(['!b', '!c']), merge(AllOf([]), [Implication([],['a'])])
-> merge( AllOf(['!b', '!c']), [Implication([],['a'])])
-> merge( '!b', merge(AllOf(['!c'], [Implication([],['a'])]))
-> merge( '!b', merge('!c', [Implication([],['a'])]))
-> merge( '!b', [Implication([!c], ['a'])])
-> Implication(['!b','!c'],['a'])

transform(|| ( a b c ))
-> merge( Not(AnyOf(['b', 'c']), [Implication([],['a'])])
-> merge( AllOf(['!b', '!c']),  [Implication([],['a'])])
And we're at the same point as above.

Suppose we write it: '|| ( a !c? ( b ) )'. This gets rewritten by the
nested implication remover as '|| ( a ( !c b ) )'.

transform(|| ( a ( !c b ) ))
-> merge(Not(AnyOf('(!c b)')), [Implication([],[a]))
-> merge(AllOf([AnyOf([c, !b])])), [Implication([],[a]))
-> merge(AnyOf([c,!b]), [Implication([],[a]))
-> merge(c,  [Implication([],[a])) + merge(!b, [Implication([],[a]))
-> [ Implication([c],[a]), Implication([!b],[a]) ]

aka 'c? ( a ) !b? ( a )'
While it works, we can see a serious bias towards the left-most in a ||.

I think this handles all the cases. I'll try to update the repo with
that algo.

Bests,

Alexis.

Re: [gentoo-dev] [RFC] Forced/automatic USE flag constraints (codename: ENFORCED_USE)

Reply via email to