Thanks Michael! Other people might have different opinions, but knowing that a "definitive" fix is on the way is plenty for me and I'm not looking to make a fuss.
If you still have some time, I have another question or two. You said that only self is passed at +0. Does this mean, for instance, that custom operators would still get +1 Expr values? It seems to me that if you receive any boxed value at +0, then it's always safe to also pass it further at +0 since it is immutable. That would have a huge impact on this program. Are there downsides to more pervasive +0 parameters? Félix > Le 29 déc. 2017 à 15:42, Michael Gottesman <mgottes...@apple.com> a écrit : > > > >> On Dec 28, 2017, at 4:27 PM, Chris Lattner via swift-dev >> <swift-dev@swift.org> wrote: >> >> <moving to swift-dev, where most of the perf optimization people lurk> >> >> This is a great question, I’m not sure what the answer is: maybe it is a >> simple case missed by the arc optimizer? > > It is a bit more complicated than that. There are codegen aspects and > optimization aspects. In both cases there are "structural issues" that I am > currently in the process of fixing as part of the "semantic sil" effort. > There are some retain/release pairs that we could eliminate by teaching the > optimizer about enums/indirect boxes. I go over all of this below. > > TLDR: There are structural issues here that make this difficult/unsafe in the > general case. Semantic SIL eliminates all of the structural issues with ARC > and makes this problem (among other problems) "trivial", so we have been > focusing our engineering time on implementing that. That being said, we could > use the effects analysis to handle this trivial case. But it would have to be > very conservative. Chris, feel like doing some ARC Optimization? = p. > > # Codegen Aspects > > ## Background Info: Local Retain/Release Pairs > > If one does not know about local vs non-local retain/release pairs, see the > appendix for an explanation. > > ## +0 Self > > In Swift today all parameters except for self are passed at +1. Self is > passed at +0. Even though this is true, often times SILGen will be > conservative around self and will emit a copy/destroy around the call site. > This is actually better since it eliminates a non-local retain/release pair > in the callee/caller favor of a local retain/release pair in the caller that > can be eliminated by the optimizer without any form of inter-procedural > optimization. That is why there is a retain/release at the count call. I will > go into why we are not optimizing it below in the optimization section. > > ## Pattern Matching at +1 > > All pattern matching in Swift today is done at +1. That means that in the > following Swift: > > ---- > enum Expr: CustomStringConvertible { > case Int(n: Int) > indirect case Var(x: String) > indirect case Add(f: Expr, g: Expr) > indirect case Mul(f: Expr, g: Expr) > indirect case Pow(f: Expr, g: Expr) > indirect case Ln(f: Expr) > > ... > > var count: Int { > switch self { > case .Int, .Var: return 1 > case let .Ln(f): return f.count > case let .Add(f, g), let .Mul(f, g), let .Pow(f, g): > return f.count + g.count > } > } > } > ---- > > even though self is passed at +0 to count, SILGen will emit a retain before > the switch. This is unfortunate and adds unnecessary ARC traffic in read only > methods on enums. There is no reason not to emit switches that way, so as > part of the work to make SILGenPattern able to pass the ownership verifier, I > plan on changing all pattern emission to be done at +0. As an added benefit > once that work is complete, refactorings in SILGenPattern will be exposed > that will enable us to significantly simplify SILGenPattern's handling of > cleanups. This complex code has lead to obscure bugs in the past and is IMO > overly complex. > > # Optimizer Aspects > > Lets start by looking at count.getter. Here is the SIL in CFG form: > > <count.getter.pdf> > > Lets analyze bb4 (I think bb5 suffers from similar issues). > > The retain, release pair on %0 here suffers from the turning off of the > guaranteed optimization. The guaranteed optimization says that any > retain/release pairs on a guaranteed value can be eliminated if they are > "semantic pairings". We had to turn off this optimization since without > semantic-sil we were guessing what retain/release pairings were "semantic > pairings". In certain cases this would result in mispairings so we had to > turn it off in the general case. Specifically, consider the following straw > man: > > ---- > strong_retain %x > ... > strong_release obfuscated(%x) > ... > strong_retain obfuscated'(%x) > ... > strong_release %x > ---- > > in this case we would pair/remove the outer retain, release pair yielding: > > ---- > strong_release obfuscated(%x) > ... > strong_retain obfuscated'(%x) > ---- > > While the reference count on %x is still balanced, since the retain count > balancing is in inverse order this can result in pre-mature releases and > use-after-frees. Once semantic-sil is complete, all pairings are explicit in > the IR so such a bug can not happen. To work around that problem, today we > are conservative and require any retain/release pair around a guaranteed > parameter to be +1 before and have a use afterwards. > > The second retain, release pair (i.e. %13) is slightly different since this > suffers from the optimizer needing to make a connection in between the box in > the enum and the value loaded from the box. Today we have enough knowledge in > the optimizer to make the first connection without issue via the RCIdentity > analysis. We do not make the connection though in between the box being > guaranteed and the value loaded from the box being effectively guaranteed as > a result. Even if we made that connection today, I am not sure if in the > general case it would be safe. With semantic-sil all of those problems go > away. > > For both of these, I think that by using appropriate inter procedural effects > analysis (which we have), we could teach the ARC optimizer to be more > aggressive around guaranteed. The specific properties would be that: > > 1. No side-effects (ignoring retain/releases). > 2. No retains/releases that are unbalanced. > 3. No escapes. > > I would have to think in more detail in order to be sure though. > > # Appendix > > ## Local Retain/Release Pairs > > There are two types of retain/release pairs in Swift, function local and > non-local. An example of a local retain/release is when one creates a new > binding, i.e.: > > ---- > func f(x: Klass) { > let xhat = x > ... > } > ---- > > When the assignment to xhat occurs we retain x and at the end of the function > (assuming that xhat is not used), we release xhat. Since the retain/release > pair are both in f, we are able to potentially pair and eliminate them > without needing to modify the call graph. In contrast, a function non-local > pairing is created when a parameter is passed at +1, e.g.: > > ---- > func g(x: Klass) { ... x is used but not consumed ... } > func f(x: Klass) { > g(x) > } > ---- > > In this case when we call g (ignoring any peepholes) we retain x first in f > (the callee) and then release x in g (the caller). This is bad for > optimization purposes since it means that we can not eliminate the > retain/release pair without modifying the call graph or inlining. For > non-resilient functions, we of course can/do inline and modify the callgraph > via function signature optimization. In the case of resilient functions this > is impossible by the nature of resilience. > >> >> -Chris >> >> >> >>> On Dec 27, 2017, at 9:19 PM, Félix Cloutier via swift-evolution >>> <swift-evolut...@swift.org> wrote: >>> >>> Running this on my MBP with 10 as the parameter: >>> https://gist.github.com/zneak/ae33bb970a08632cfb2925e2049f9e7a >>> >>> I get a runtime of about 10 seconds, 45% of which is spent in >>> retain/release calls according to Instruments (!!), and at least half of >>> that from Expr.count. Looking at the IR, Swift generously sprinkles >>> retain/release calls through the outlined copy method: >>> >>> • `self` is retained at the beginning of `count` >>> • The values that you get out of destructuring are retained >>> • Of course, when you get `count` on these, they are retained >>> again >>> >>> Of course, Expr.count cannot modify itself or its subexpressions because >>> the functions are not mutating; in fact, no function in that program can >>> mutate an enum case. Why, then, is Swift retaining/releasing `self` and the >>> values obtained from destructured patterns? They can't go away. Shouldn't >>> we be getting guaranteed references instead of owning references? >>> >>> That seems to hit pattern-matching-heavy programs with indirect cases >>> pretty hard, and it's pretty frustrating because that seems to be about the >>> nicest way to write this program, and there's no workaround from the >>> developer's perspective. I don't think that this is a fatal flaw of >>> refcounting, but unless I'm missing something, that's sub-par behavior. >>> >>> Félix >>> _______________________________________________ >>> swift-evolution mailing list >>> swift-evolut...@swift.org >>> https://lists.swift.org/mailman/listinfo/swift-evolution >> >> _______________________________________________ >> swift-dev mailing list >> swift-dev@swift.org >> https://lists.swift.org/mailman/listinfo/swift-dev > _______________________________________________ swift-dev mailing list swift-dev@swift.org https://lists.swift.org/mailman/listinfo/swift-dev