Thanks for putting these thoughts together, Jordan! Some additional comments
inline.
> On Aug 2, 2017, at 5:08 PM, Jordan Rose <jordan_r...@apple.com> wrote:
>
> David Hart recently asked on Twitter
> <https://twitter.com/dhartbit/status/891766239340748800> if there was a good
> way to add Decodable support to somebody else's class. The short answer is
> "no, because you don't control all the subclasses", but David already
> understood that and wanted to know if there was anything working to mitigate
> the problem. So I decided to write up a long email about it instead. (Well,
> actually I decided to write a short email and then failed at doing so.)
>
> The Problem
>
> You can add Decodable to someone else's struct today with no problems:
>
> extension Point: Decodable {
>     enum CodingKeys: String, CodingKey {
>         case x
>         case y
>     }
>     public init(from decoder: Decoder) throws {
>         let container = try decoder.container(keyedBy: CodingKeys.self)
>         let x = try container.decode(Double.self, forKey: .x)
>         let y = try container.decode(Double.self, forKey: .y)
>         self.init(x: x, y: y)
>     }
> }
>
> But if Point is a (non-final) class, then this gives you a pile of errors:
>
> - init(from:) needs to be 'required' to satisfy a protocol requirement.
> 'required' means the initializer can be invoked dynamically on subclasses.
> Why is this important? Because someone might write code like this:
>
> func decodeMe<Result: Decodable>() throws -> Result {
>     let decoder = getDecoderFromSomewhere()
>     return try Result(from: decoder)
> }
> let specialPoint: VerySpecialSubclassOfPoint = try decodeMe()
>
> …and the compiler can't stop them, because VerySpecialSubclassOfPoint is a
> Point, and Point is Decodable, and therefore VerySpecialSubclassOfPoint is
> Decodable. A bit more on this later, but for now let's say that's a sensible
> requirement.
>
> - init(from:) also has to be a 'convenience' initializer. That one makes
> sense too—if you're outside the module, you can't necessarily see private
> properties, and so of course you'll have to call another initializer that can.
>
> But once it's marked 'convenience' and 'required' we get "'required'
> initializer must be declared directly in class 'Point' (not in an
> extension)", and that defeats the whole purpose. Why this restriction?
>
>
> The Semantic Reason
>
> The initializer is 'required', right? So all subclasses need to have access
> to it. But the implementation we provided here might not make sense for all
> subclasses—what if VerySpecialSubclassOfPoint doesn't have an 'init(x:y:)'
> initializer? Normally, the compiler checks for this situation and makes the
> subclass reimplement the 'required' initializer…but that only works if the
> 'required' initializers are all known up front. So it can't allow this new
> 'required' initializer to go by, because someone might try to call it
> dynamically on a subclass. Here's a dynamic version of the code from above:
>
> func decodeDynamic(_ pointType: Point.Type) throws -> Point {
>     let decoder = getDecoderFromSomewhere()
>     return try pointType.init(from: decoder)
> }
> let specialPoint = try decodeDynamic(VerySpecialSubclassOfPoint.self)
>
>
> The Implementation Reason
>
> 'required' initializers are like methods: they may require dynamic dispatch.
> That means that they get an entry in the class's dynamic dispatch table,
> commonly known as its vtable. Unlike Objective-C method tables, vtables
> aren't set up to have entries arbitrarily added at run time.
>
> (Aside: This is one of the reasons why non-@objc methods in Swift extensions
> can't be overridden; if we ever lift that restriction, it'll be by using a
> separate table and a form of dispatch similar to objc_msgSend. I sent a
> proposal to swift-evolution about this last year but there wasn't much
> interest.)
>
>
> The Workaround
>
> Today's answer isn't wonderful, but it does work: write a wrapper struct that
> conforms to Decodable instead:
>
> struct DecodedPoint: Decodable {
>     var value: Point
>     enum CodingKeys: String, CodingKey {
>         case x
>         case y
>     }
>     public init(from decoder: Decoder) throws {
>         let container = try decoder.container(keyedBy: CodingKeys.self)
>         let x = try container.decode(Double.self, forKey: .x)
>         let y = try container.decode(Double.self, forKey: .y)
>         self.value = Point(x: x, y: y)
>     }
> }
>
> This doesn't have any of the problems with inheritance, because it only
> handles the base class, Point. But it makes everywhere else a little less
> convenient—instead of directly encoding or decoding Point, you have to use
> the wrapper, and that means no implicitly-generated Codable implementations
> either.
>
> I'm not going to spend more time talking about this, but it is the officially
> recommended answer at the moment. You can also just have all your own types
> that contain points manually decode the 'x' and 'y' values and then construct
> a Point from that.
I would actually take this a step further and recommend that any time you
intend to extend someone else’s type with Encodable or Decodable, you should
almost certainly write a wrapper struct for it instead, unless you have
reasonable guarantees that the type will never attempt to conform to these
protocols on its own.
This might sound extreme (and inconvenient), but Jordan mentions the issue
below in The Dangers of Retroactive Modeling. Any time you conform a type which
does not belong to you to a protocol, you make a decision about its behavior
where you might not necessarily have the "right" to — if the type later adds
conformance to the protocol itself (e.g. in a library update), your code will
no longer compile, and you’ll have to remove your own conformance. In most
cases, that’s fine, e.g., there’s not much harm done in dropping your custom
Equatable conformance on some type if it starts adopting it on its own. The
real risk with Encodable and Decodable is that unless you don’t care about
backwards/forwards compatibility, the implementations of these conformances are
forever.
Using Point here as an example, it’s not unreasonable for Point to eventually
get updated to conform to Codable. It’s also not unreasonable for the
implementation of Point to adopt the default conformance, i.e., get encoded as
{"x": …, "y": …}. This form might not be the most compact, but it leaves room
for expansion (e.g. if Point adds a z field, which might also be reasonable,
considering the type doesn’t belong to you). If you update your library
dependency with the new Point class and have to drop the conformance you added
to it directly, you’ve introduced a backwards and forwards compatibility
concern: all new versions of your app now encode and decode a new archive
format, which now requires migration. Unless you don’t care about other
versions of your app, you’ll have to deal with this:
- Old versions of your app which users may have on their devices cannot read
archives with this new format.
- New versions of your app cannot read archives with the old format.
Unless you don’t care for some reason, you will now have to write the wrapper
struct, to either:
- have new versions of your app attempt to read old archive versions and
migrate them forward (leaving old app versions in the dust), or
- write all new archives with the old format so old app versions can still
read archives written with newer app versions, and vice versa.
Either way, you’ll need to write some wrapper to handle this; it’s
significantly safer to do that work up front on a type which you do control
(and safely allow Point to change out underneath you transparently), rather
than potentially end up between a rock and a hard place later on because a type
you don’t own changes out from under you.
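As a rough sketch of that up-front work (the wrapper name here is mine, and it
assumes Point exposes x, y, and init(x:y:); none of this is a real API), a
wrapper that pins the archive format might look like:

```swift
// A type you control: the archive format is pinned to "x"/"y" keys here,
// and stays stable even if Point later gains its own Codable conformance
// with a different encoded form.
struct StablePoint: Codable {
    var value: Point

    private enum CodingKeys: String, CodingKey {
        case x
        case y
    }

    init(_ value: Point) {
        self.value = value
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        let x = try container.decode(Double.self, forKey: .x)
        let y = try container.decode(Double.self, forKey: .y)
        self.value = Point(x: x, y: y)
    }

    func encode(to encoder: Encoder) throws {
        var container = encoder.container(keyedBy: CodingKeys.self)
        try container.encode(value.x, forKey: .x)
        try container.encode(value.y, forKey: .y)
    }
}
```

If Point later adopts Codable with some other format, StablePoint keeps
reading and writing the archives your shipped app versions understand.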
> Future Direction: 'required' + 'final'
>
> One language feature we could add to make this work is a 'required'
> initializer that is also 'final'. Because it's 'final', it wouldn't have to
> go into the dynamic dispatch table. But because it's 'final', we have to make
> sure its implementation works on all subclasses. For that to work, it would
> only be allowed to call other 'required' initializers…which means you're
> still stuck if the original author didn't mark anything 'required'. Still,
> it's a safe, reasonable, and contained extension to our initializer model.
>
>
> Future Direction: runtime-checked convenience initializers
>
> In most cases you don't care about hypothetical subclasses or invoking
> init(from:) on some dynamic Point type. If there was a way to mark
> init(from:) as something that was always available on subclasses, but
> dynamically checked to see if it was okay, we'd be good. That could take one
> of two forms:
>
> - If 'self' is not Point itself, trap.
> - If 'self' did not inherit or override all of Point's designated
> initializers, trap.
>
> The former is pretty easy to implement but not very extensible. The latter
> seems more expensive: it's information we already check in the compiler, but
> we don't put it into the runtime metadata for a class, and checking it at run
> time requires walking up the class hierarchy until we get to the class we
> want. This is all predicated on the idea that this is rare, though.
>
> This is a much more intrusive change to the initializer model, and it's
> turning a compile-time check into a run-time check, so I think we're less
> likely to want to take this any time soon.
>
>
> Future Direction: Non-inherited conformances
>
> All of this is only a problem because people might try to call init(from:) on
> a subclass of Point. If we said that subclasses of Point weren't
> automatically Decodable themselves, we'd avoid this problem. This sounds like
> a terrible idea but it actually doesn't change very much in practice.
> Unfortunately, it's also a very complicated and intrusive change to the Swift
> protocol system, and so I don't want to spend more time on it here.
>
>
> The Dangers of Retroactive Modeling
>
> Even if we magically make this all work, however, there's still one last
> problem: what if two frameworks do this? Point can't conform to Decodable in
> two different ways, but neither can it just pick one. (Maybe one of the
> encoded formats uses "dx" and "dy" for the key names, or maybe it's encoded
> with polar coordinates.) There aren't great answers to this, and it calls
> into question whether the struct "solution" at the start of this message is
> even sensible.
>
> I'm going to bring this up on swift-evolution soon as part of the Library
> Evolution discussions (there's a very similar problem if the library that
> owns Point decides to make it Decodable too), but it's worth noting that the
> wrapper struct solution doesn't have this problem.
>
>
> Whew! So, that's why you can't do it. It's not a very satisfying answer, but
> it's one that falls out of our compile-time safety rules for initializers.
> For more information on this I suggest checking out my write-up of some of
> our initialization model problems
> <https://github.com/apple/swift/blob/master/docs/InitializerProblems.rst>.
> And I plan to write another email like this to discuss some solutions that
> are actually doable.
>
> Jordan
>
> P.S. There's a reason why Decodable uses an initializer instead of a
> factory-like method on the type but I can't remember what it is right now. I
> think it's something to do with having the right result type, which would
> have to be either 'Any' or an associated type if it wasn't just 'Self'. (And
> if it is 'Self' then it has all the same problems as an initializer and would
> require extra syntax.) Itai would know for sure.
To give background on this — the protocols originally had factory initializers
in mind for this (to allow for object replacement and avoid some of these
issues), but without a "real" factory initializer pattern like we’re discussing
here, the problems with this approach were intractable (all due to subclassing
issues).
A factory pattern like static func decode(from: Decoder) throws -> ??? has a
few problems:
The return type is one consideration. If we allow for an associated type
representing the return type, subclasses cannot override the associated type
to return something different, which makes object replacement impossible in
situations that use subclassing. The only reasonable option is to return Self
(which allows returning instances of the class itself or of its subclasses).
(We could return Any, but that defeats the entire purpose of having a
type-safe API to begin with; we want to avoid dynamic casting altogether.)
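To make the Any drawback concrete, here is a sketch; the protocol and helper
names are hypothetical, purely for illustration:

```swift
// A hypothetical factory-style protocol that returns Any.
protocol AnyDecodable {
    static func decode(from decoder: Decoder) throws -> Any
}

struct UnexpectedType: Error {}

// Every generic use site is now forced into a run-time cast that can fail —
// exactly the dynamic checking a type-safe API is supposed to avoid.
func decodeValue<T: AnyDecodable>(_ type: T.Type, from decoder: Decoder) throws -> T {
    guard let value = try type.decode(from: decoder) as? T else {
        throw UnexpectedType()
    }
    return value
}
```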
Even if we return Self, this method cannot be overridden by subclasses:
- If implemented as static func decode(from: Decoder) throws -> Self, the
method clearly cannot be overridden in a subclass, as it is a static method.
- The method cannot be implemented as class func decode(from: Decoder) throws
-> Self on a non-final class:
protocol Foo {
    static func create() -> Self
}
class Bar : Foo {
    // error: method 'create()' in non-final class 'Bar' must return 'Self'
    // to conform to protocol 'Foo'
    class func create() -> Bar {
        return Bar()
    }
}

protocol Foo {
    static func create() -> Self
}
class Bar : Foo {
    class func create() -> Self {
        // error: cannot convert return expression of type 'Bar' to return
        // type 'Self'
        return Bar()
    }
}

protocol Foo {
    static func create() -> Self
}
class Bar : Foo {
    class func create() -> Self {
        // error: 'Self' is only available in a protocol or as the result of
        // a method in a class; did you mean 'Bar'?
        // warning: forced cast of 'Bar' to same type has no effect
        return Bar() as! Self
    }
}

final class Bar : Foo {
    class func create() -> Bar { // no problems
        return Bar()
    }
}
This means that we either allow adoption of these protocols on final classes
only (which, again, defeats the whole purpose!), or that every class which
implements these protocols has to have knowledge of all of its potential
subclasses and their implementations of these protocols. This is prohibitive
as well.
Even if it were possible to override these kinds of methods, they don’t follow
the regular initializer pattern. In order to construct an instance of a
subclass, you need to be able to call a superclass initializer. But these
methods are not initializers: even if you call super’s factory method, there’s
nothing you can do with the returned instance of the superclass. Unlike in
Objective-C, there’s no super- or self-reassignment (in general), so classes
would have to follow a completely different (and awkward) pattern: create an
instance of the superclass, initialize from that instance in a separate
initializer (e.g. self.init(superInstance)), and then separately set the
subclass’s decoded properties.
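Spelled out in code, the awkward pattern would look something like the sketch
below; every name here is hypothetical, and none of it is a real API:

```swift
class Base {
    var x: Int
    init(x: Int) { self.x = x }

    // Hypothetical factory requirement.
    class func decode(from decoder: Decoder) throws -> Base {
        return Base(x: /* decode "x" here */ 0)
    }
}

class Sub: Base {
    var y: Int

    // A separate initializer exists only to consume the superclass instance.
    init(_ superInstance: Base, y: Int) {
        self.y = y
        super.init(x: superInstance.x)
    }

    override class func decode(from decoder: Decoder) throws -> Base {
        // 1. Create a superclass instance we can't "become"…
        let superInstance = try super.decode(from: decoder)
        // 2. …copy its state out via the separate initializer…
        // 3. …and separately set the subclass's own decoded properties.
        return Sub(superInstance, y: /* decode "y" here */ 0)
    }
}
```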
Overall, the lack of a true factory initializer pattern prevented us from doing
something like this, and we took the regular initializer approach.
_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution