Thanks Michael! Other people might have different opinions, but knowing that a 
"definitive" fix is on the way is plenty for me and I'm not looking to make a 
fuss.

If you still have some time, I have another question or two. You said that only 
self is passed at +0. Does this mean, for instance, that custom operators would 
still get +1 Expr values? It seems to me that if you receive any boxed value at 
+0, then it's always safe to also pass it further at +0 since it is immutable. 
That would have a huge impact on this program. Are there downsides to more 
pervasive +0 parameters?

Félix

> Le 29 déc. 2017 à 15:42, Michael Gottesman <mgottes...@apple.com> a écrit :
> 
> 
> 
>> On Dec 28, 2017, at 4:27 PM, Chris Lattner via swift-dev 
>> <swift-dev@swift.org> wrote:
>> 
>> <moving to swift-dev, where most of the perf optimization people lurk>
>> 
>> This is a great question, I’m not sure what the answer is: maybe it is a 
>> simple case missed by the arc optimizer?
> 
> It is a bit more complicated than that. There are codegen aspects and 
> optimization aspects. In both cases there are "structural issues" that I am 
> currently in the process of fixing as part of the "semantic sil" effort. 
> There are some retain/release pairs that we could eliminate by teaching the 
> optimizer about enums/indirect boxes. I go over all of this below.
> 
> TLDR: There are structural issues here that make this difficult/unsafe in the 
> general case. Semantic SIL eliminates all of the structural issues with ARC 
> and makes this problem (among other problems) "trivial", so we have been 
> focusing our engineering time on implementing that. That being said, we could 
> use the effects analysis to handle this trivial case. But it would have to be 
> very conservative. Chris, feel like doing some ARC Optimization? = p.
> 
> # Codegen Aspects
> 
> ## Background Info: Local Retain/Release Pairs
> 
> If one does not know about local vs non-local retain/release pairs, see the 
> appendix for an explanation.
> 
> ## +0 Self
> 
> In Swift today all parameters except for self are passed at +1. Self is 
> passed at +0. Even though this is true, often times SILGen will be 
> conservative around self and will emit a copy/destroy around the call site. 
> This is actually better since it eliminates a non-local retain/release pair 
> in the callee/caller favor of a local retain/release pair in the caller that 
> can be eliminated by the optimizer  without any form of inter-procedural 
> optimization. That is why there is a retain/release at the count call. I will 
> go into why we are not optimizing it below in the optimization section.
> 
> ## Pattern Matching at +1
> 
> All pattern matching in Swift today is done at +1. That means that in the 
> following Swift:
> 
> ----
> enum Expr: CustomStringConvertible {
>       case Int(n: Int)
>       indirect case Var(x: String)
>       indirect case Add(f: Expr, g: Expr)
>       indirect case Mul(f: Expr, g: Expr)
>       indirect case Pow(f: Expr, g: Expr)
>       indirect case Ln(f: Expr)
> 
>       ...
> 
>       var count: Int {
>               switch self {
>               case .Int, .Var: return 1
>               case let .Ln(f): return f.count
>               case let .Add(f, g), let .Mul(f, g), let .Pow(f, g):
>                       return f.count + g.count
>               }
>       }
> }
> ----
> 
> even though self is passed at +0 to count, SILGen will emit a retain before 
> the switch. This is unfortunate and adds unnecessary ARC traffic in read only 
> methods on enums. There is no reason not to emit switches that way, so as 
> part of the work to make SILGenPattern able to pass the ownership verifier, I 
> plan on changing all pattern emission to be done at +0. As an added benefit 
> once that work is complete, refactorings in SILGenPattern will be exposed 
> that will enable us to significantly simplify SILGenPattern's handling of 
> cleanups. This complex code has lead to obscure bugs in the past and is IMO 
> overly complex.
> 
> # Optimizer Aspects
> 
> Lets start by looking at count.getter. Here is the SIL in CFG form:
> 
> <count.getter.pdf>
> 
> Lets analyze bb4 (I think bb5 suffers from similar issues).
> 
> The retain, release pair on %0 here suffers from the turning off of the 
> guaranteed optimization. The guaranteed optimization says that any 
> retain/release pairs on a guaranteed value can be eliminated if they are 
> "semantic pairings". We had to turn off this optimization since without 
> semantic-sil we were guessing what retain/release pairings were "semantic 
> pairings". In certain cases this would result in mispairings so we had to 
> turn it off in the general case. Specifically, consider the following straw 
> man:
> 
> ----
> strong_retain %x
> ...
> strong_release obfuscated(%x)
> ...
> strong_retain obfuscated'(%x)
> ...
> strong_release %x
> ----
> 
> in this case we would pair/remove the outer retain, release pair yielding:
> 
> ----
> strong_release obfuscated(%x)
> ...
> strong_retain obfuscated'(%x)
> ----
> 
> While the reference count on %x is still balanced, since the retain count 
> balancing is in inverse order this can result in pre-mature releases and 
> use-after-frees. Once semantic-sil is complete, all pairings are explicit in 
> the IR so such a bug can not happen. To work around that problem, today we 
> are conservative and require any retain/release pair around a guaranteed 
> parameter to be +1 before and have a use afterwards.
> 
> The second retain, release pair (i.e. %13) is slightly different since this 
> suffers from the optimizer needing to make a connection in between the box in 
> the enum and the value loaded from the box. Today we have enough knowledge in 
> the optimizer to make the first connection without issue via the RCIdentity 
> analysis. We do not make the connection though in between the box being 
> guaranteed and the value loaded from the box being effectively guaranteed as 
> a result. Even if we made that connection today, I am not sure if in the 
> general case it would be safe. With semantic-sil all of those problems go 
> away.
> 
> For both of these, I think that by using appropriate inter procedural effects 
> analysis (which we have), we could teach the ARC optimizer to be more 
> aggressive around guaranteed. The specific properties would be that:
> 
> 1. No side-effects (ignoring retain/releases).
> 2. No retains/releases that are unbalanced.
> 3. No escapes.
> 
> I would have to think in more detail in order to be sure though.
> 
> # Appendix
> 
> ## Local Retain/Release Pairs
> 
> There are two types of retain/release pairs in Swift, function local and 
> non-local. An example of a local retain/release is when one creates a new 
> binding, i.e.:
> 
> ----
> func f(x: Klass) {
>  let xhat = x
>  ...
> }
> ----
> 
> When the assignment to xhat occurs we retain x and at the end of the function 
> (assuming that xhat is not used), we release xhat. Since the retain/release 
> pair are both in f, we are able to potentially pair and eliminate them 
> without needing to modify the call graph. In contrast, a function non-local 
> pairing is created when a parameter is passed at +1, e.g.:
> 
> ----
> func g(x: Klass) { ... x is used but not consumed ... }
> func f(x: Klass) {
>  g(x)
> }
> ----
> 
> In this case when we call g (ignoring any peepholes) we retain x first in f 
> (the callee) and then release x in g (the caller). This is bad for 
> optimization purposes since it means that we can not eliminate the 
> retain/release pair without modifying the call graph or inlining. For 
> non-resilient functions, we of course can/do inline and modify the callgraph 
> via function signature optimization. In the case of resilient functions this 
> is impossible by the nature of resilience.
> 
>> 
>> -Chris
>> 
>> 
>> 
>>> On Dec 27, 2017, at 9:19 PM, Félix Cloutier via swift-evolution 
>>> <swift-evolut...@swift.org> wrote:
>>> 
>>> Running this on my MBP with 10 as the parameter: 
>>> https://gist.github.com/zneak/ae33bb970a08632cfb2925e2049f9e7a 
>>> 
>>> I get a runtime of about 10 seconds, 45% of which is spent in 
>>> retain/release calls according to Instruments (!!), and at least half of 
>>> that from Expr.count. Looking at the IR, Swift generously sprinkles 
>>> retain/release calls through the outlined copy method:
>>> 
>>>     • `self` is retained at the beginning of `count`
>>>     • The values that you get out of destructuring are retained
>>>             • Of course, when you get `count` on these, they are retained 
>>> again
>>> 
>>> Of course, Expr.count cannot modify itself or its subexpressions because 
>>> the functions are not mutating; in fact, no function in that program can 
>>> mutate an enum case. Why, then, is Swift retaining/releasing `self` and the 
>>> values obtained from destructured patterns? They can't go away. Shouldn't 
>>> we be getting guaranteed references instead of owning references?
>>> 
>>> That seems to hit pattern-matching-heavy programs with indirect cases 
>>> pretty hard, and it's pretty frustrating because that seems to be about the 
>>> nicest way to write this program, and there's no workaround from the 
>>> developer's perspective. I don't think that this is a fatal flaw of 
>>> refcounting, but unless I'm missing something, that's sub-par behavior.
>>> 
>>> Félix
>>> _______________________________________________
>>> swift-evolution mailing list
>>> swift-evolut...@swift.org
>>> https://lists.swift.org/mailman/listinfo/swift-evolution
>> 
>> _______________________________________________
>> swift-dev mailing list
>> swift-dev@swift.org
>> https://lists.swift.org/mailman/listinfo/swift-dev
> 

_______________________________________________
swift-dev mailing list
swift-dev@swift.org
https://lists.swift.org/mailman/listinfo/swift-dev

Reply via email to