> On Mar 17, 2017, at 1:15 PM, Itai Ferber via swift-evolution <[email protected]> wrote:
>
> On 15 Mar 2017, at 22:58, Zach Waldowski wrote:
>
> > Another issue of scale - I had to switch to a native mail client, as replying inline severely broke my webmail client. ;-)
>
> Again, lots of love here. Responses inline.
>
> > On Mar 15, 2017, at 6:40 PM, Itai Ferber via swift-evolution <[email protected]> wrote:
> >
> > > Proposed solution
> > > We will be introducing the following new types:
> > >
> > > protocol Codable: Adopted by types to opt into archival. Conformance may be automatically derived in cases where all properties are also Codable.
> >
> > FWIW, I think this is an acceptable compromise. If the happy path is derived conformances, only-decodable or only-encodable types feel like a lazy way out on the part of a user of the API, and build a barrier to proper testing.
> >
> > [snip]
> >
> > > Structured types (i.e. types which encode as a collection of properties) encode and decode their properties in a keyed manner. Keys may be String-convertible or Int-convertible (or both), and user types which have properties should declare semantic key enums which map keys to their properties. Keys must conform to the CodingKey protocol:
> > >
> > > public protocol CodingKey { <##snip##> }
> >
> > A few things here:
> >
> > The protocol leaves open the possibility of having both a String or Int representation, or neither. What should a coder do in either case? Are the representations intended to be mutually exclusive, or not? The protocol design doesn’t seem to particularly match the flavor of Swift; I’d expect something along the lines of a CodingKey enum and the protocol CodingKeyRepresentable. It’s also possible that the concerns of the two are orthogonal enough that they deserve separate container(keyedBy:) requirements.
>
> The general answer to "what should a coder do" is "what is appropriate for its format".
> For a format that uses exclusively string keys (like JSON), the string representation (if present on a key) will always be used. If the key has no string representation but does have an integer representation, the encoder may choose to stringify the integer. If the key has neither, it is appropriate for the Encoder to fail in some way.
>
> On the flip side, for totally flat formats, an Encoder may choose to ignore keys altogether, in which case it doesn’t really matter. The choice is up to the Encoder and its format.
>
> The string and integer representations are not meant to be mutually exclusive at all, and in fact, where relevant, we encourage providing both types of representations for flexibility.
>
> As for the possibility of having neither representation, this question comes up often. I’d like to summarize the thought process here by quoting some earlier review (apologies for the poor formatting from my mail client):
>
> > If there are two options, each of which is itself optional, we have 4 possible combinations. But! At the same time we prohibit one combination by what? Runtime error? Why not use a 3-case enum for it? Even further down the rabbit hole there might be a CodingKey<> specialized for a concrete combination, like CodingKey<StringAndIntKey> or just CodingKey<StringKey>, but I’m not sure whether our type system will make it useful or possible…
> >
> > public enum CodingKeyValue {
> >     case integer(value: Int)
> >     case string(value: String)
> >     case both(intValue: Int, stringValue: String)
> > }
> > public protocol CodingKey {
> >     init?(value: CodingKeyValue)
> >     var value: CodingKeyValue { get }
> > }
>
> I agree that this certainly feels suboptimal. We explored other possibilities before sticking to this one, so let me try to summarize here:
>
> * Having a concrete 3-case CodingKey enum would preclude the possibility of having neither a stringValue nor an intValue.
> However, there is a lot of value in having the key types belong to the type being encoded (more safety, impossible to accidentally mix key types, private keys, etc.); if the CodingKey type itself is an enum (which cannot be inherited from), then this prevents differing key types.
>
> * Your solution as presented is better: CodingKey itself is still a protocol, and the value itself is the 3-case enum. However, since CodingKeyValue is not literal-representable, user keys cannot be enums RawRepresentable by CodingKeyValue. That means that the values must either be dynamically returned, or (for attaining the benefits that we want to give users — easy representation, autocompletion, etc.) the type has to be a struct with static lets on it giving the CodingKeyValues. This certainly works, but is likely not what a developer would have in mind when working with the API; the power of enums in Swift makes them very easy to reach for, and I’m thinking most users would expect their keys to be enums. We’d like to leverage that where we can, especially since RawRepresentable enums are appropriate in the vast majority of use cases.
>
> * Three separate CodingKey protocols (one for Strings, one for Ints, and one for both). You could argue that this is the most correct version, since it most clearly represents what we’re looking for. However, this means that every method now accepting a CodingKey must be converted into 3 overloads, each accepting a different type. This explodes the API surface, is confusing for users, and also makes it impossible to use CodingKey as an existential (unless it’s an empty 4th protocol which makes no static guarantees and from which the others inherit).
>
> * [The current] approach. On the one hand, this allows for the accidental representation of a key with neither a stringValue nor an intValue.
> On the other, we want to make it really easy to use autogenerated keys, or autogenerated key implementations if you provide the cases and values yourself. The nil value possibility is only a concern when writing stringValue and intValue yourself, which the vast majority of users should not have to do.
>
> * Additionally, a key word in that sentence bolded above is “generally”. As part of making this API more generalized, we push a lot of decisions to Encoders and Decoders. For many formats, it’s true that having a key with no value is an error, but this is not necessarily true for all formats; for a linear, non-keyed format, it is entirely reasonable to ignore the keys in the first place, or replace them with fixed-format values. The decision of how to handle this case is left up to Encoders and Decoders; for most formats (and for our implementations), this is certainly an error, and we would likely document this and either throw or preconditionFailure. But this is not always the case.
>
> * In terms of syntax, there’s another approach that would be really nice (but is currently not feasible) — if enums were RawRepresentable in terms of tuples, it would be possible to give implementations for String, Int, (Int, String), (String, Int), etc., making this condition harder to represent by default unless you really mean to.
>
> Hope that gives some helpful background on this decision. FWIW, the only way to end up with a key having neither an intValue nor a stringValue is manually implementing the CodingKey protocol (which should be exceedingly rare) and implementing the methods by not switching on self, or in some other way that would allow you to accidentally give a key neither value.
>
> > Speaking of the mutually exclusive representations - what about serializations that don’t code as one of those two things? YAML can have anything be a “key”, and despite that being not particularly sane, it is a use case.
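The hand-written CodingKey implementation mentioned a few paragraphs up can be sketched as follows. Note that the real protocol body was snipped earlier in the thread, so the shape used here (both representations optional) is my reading of the draft, not the confirmed API; `DraftCodingKey` and `FruitKeys` are hypothetical names for illustration.

```swift
// Sketch of the draft CodingKey shape as I read it (the real protocol body
// was snipped above); defined locally so the example is self-contained.
protocol DraftCodingKey {
    var stringValue: String? { get }
    var intValue: Int? { get }
    init?(stringValue: String)
    init?(intValue: Int)
}

// A hand-written key type providing *both* representations. Switching on
// self exhaustively guarantees no case is left with neither value; only a
// manual implementation that skips this can produce a valueless key.
enum FruitKeys: Int, DraftCodingKey {
    case name = 1
    case color = 2

    var stringValue: String? {
        switch self {
        case .name:  return "name"
        case .color: return "color"
        }
    }
    var intValue: Int? { return rawValue }

    init?(stringValue: String) {
        switch stringValue {
        case "name":  self = .name
        case "color": self = .color
        default:      return nil
        }
    }
    init?(intValue: Int) { self.init(rawValue: intValue) }
}
```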
> We’ve explored this, but at the end of the day, it’s not possible to generalize this to the point where we could represent all possible options on all possible formats, because you cannot make any promises as to what’s possible and what’s not statically.
>
> We’d like to strike a balance here between strong static guarantees on one end (the extreme end of which introduces a new API for every single format, since you can almost perfectly statically express what’s possible and what isn’t) and generalization on the other (the extreme end of which is an empty protocol, because there really are encoding formats which are mutually exclusive). So in this case, this API would support producing and consuming YAML with string or integer keys, but not arbitrary YAML.
>
> > > For most types, String-convertible keys are a reasonable default; for performance, however, Int-convertible keys are preferred, and Encoders may choose to make use of Ints over Strings. Framework types should provide keys which have both for flexibility and performance across different types of Encoders. It is generally an error to provide a key which has neither a stringValue nor an intValue.
>
> > Could you speak a little more to using Int-convertible keys for performance? I get the feeling int-based keys parallel the legacy of NSCoder’s older design, and I don’t really see anyone these days supporting non-keyed archivers. They strike me as fragile. What other use cases are envisioned for ordered archiving than that?
>
> We agree that integer keys are fragile, and from years (decades) of experience with NSArchiver, we are aware of the limitations that such encoding offers. For this reason, we will never synthesize integer keys on your behalf. This is something you must put thought into, if using an integer key for archival.
>
> However, there are use cases (both in archival and in serialization, but especially so in serialization) where integer keys are useful.
> Ordered encoding is one such possibility (when the format supports it, integer keys are sequential, etc.), and is helpful for, say, marshaling objects in an XPC context (where both sides are aware of the format, are running the same version of the same code, on the same device) — keys waste time and bandwidth unnecessarily in some cases.
>
> Integer keys don’t necessarily imply ordered encoding, however. There are binary encoding formats which support integer-keyed dictionaries (read: serialized hash maps) which are more efficient to encode and decode than similar string-keyed ones. In that case, as long as integer keys are chosen with care, the end result is more performant.
>
> But again, this depends on the application and use case. Defining integer keys requires manual effort because we want thought put into defining them; they are indeed fragile when used carelessly.
>
> [snip]
>
> > > Keyed Encoding Containers
> > >
> > > Keyed encoding containers are the primary interface that most Codable types interact with for encoding and decoding. Through these, Codable types have strongly-keyed access to encoded data by using keys that are semantically correct for the operations they want to express.
> > >
> > > Since semantically incompatible keys will rarely (if ever) share the same key type, it is impossible to mix up key types within the same container (as is possible with String keys), and since the type is known statically, keys get autocompletion by the compiler.
> > >
> > > open class KeyedEncodingContainer<Key : CodingKey> {
>
> > Like others, I’m a little bummed about this part of the design. Your reasoning up-thread is sound, but I chafe a bit on having to reabstract, and a little more on this having to be a reference type. Particularly knowing that it’s got a bit more overhead involved… I /like/ that NSKeyedArchiver can simply push some state and pass itself as the next encoding container down the stack.
> There’s not much more to be said about why this is a class that I haven’t covered; if it were possible to do otherwise at the moment, then we would.

It is possible using a manually written type-erased wrapper along the lines of AnySequence and AnyCollection. I don’t recall seeing a rationale for why you don’t want to go this route. I would still like to hear more on this topic.
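For reference, the AnySequence-style erasure alluded to here looks roughly like this. It is a minimal sketch with hypothetical names, reduced to a single requirement standing in for the container's full API:

```swift
// Minimal sketch of AnySequence-style type erasure applied to a keyed
// container. One requirement stands in for the container's full surface.
protocol KeyedEncoding {
    associatedtype Key
    func encode(_ value: String, forKey key: Key) throws
}

// The erasing struct: generic over Key only, forwarding every call to a
// captured concrete base.
struct AnyKeyedEncoding<Key>: KeyedEncoding {
    private let _encode: (String, Key) throws -> Void

    init<Base: KeyedEncoding>(_ base: Base) where Base.Key == Key {
        _encode = { value, key in try base.encode(value, forKey: key) }
    }

    func encode(_ value: String, forKey key: Key) throws {
        try _encode(value, key)
    }
}
```

A concrete container would then wrap itself in `AnyKeyedEncoding` before handing itself down the stack, instead of being forced into a class hierarchy.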
> As for why we do this — this is the crux of the whole API. We not only want to make it easy to use a custom key type that is semantically correct for your type, we want to make it difficult to do the easy but incorrect thing. From experience with NSKeyedArchiver, we’d like to move away from unadorned string (and integer) keys, where typos and accidentally reused keys are common and impossible to catch statically.
>
> encode<T : Codable>(_: T?, forKey: String) unfortunately not only encourages code like encode(foo, forKey: "foi") // whoops, typo; it also makes it more difficult to use a semantic key type: encode(foo, forKey: CodingKeys.foo.stringValue). The additional typing and lack of autocompletion make it an active disincentive. encode<T : Codable>(_: T?, forKey: Key) reverses both of these — it makes it impossible to use unadorned strings or accidentally use keys from another type, and nets shorter code with autocompletion: encode(foo, forKey: .foo)
>
> The side effect of this (keyed containers being classes) is suboptimal, I agree, but necessary.
>
> > > open func encode<Value : Codable>(_ value: Value?, forKey key: Key) throws
>
> > Does this win anything over taking a Codable?
>
> Taking the concrete type over an existential allows for static dispatch on the type within the implementation, and is a performance win in some cases.
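For illustration, the semantic-key pattern described above, written against the Codable API as it eventually shipped in Swift 4 (the `Person` type is hypothetical):

```swift
import Foundation

// Hypothetical type demonstrating the semantic-key pattern: keys are a
// private enum belonging to the type, so unadorned strings cannot be used.
struct Person: Codable {
    var name: String
    var age: Int

    private enum CodingKeys: String, CodingKey {
        case name
        case age
    }

    func encode(to encoder: Encoder) throws {
        var container = encoder.container(keyedBy: CodingKeys.self)
        try container.encode(name, forKey: .name) // autocompleted, typo-proof
        try container.encode(age, forKey: .age)
        // try container.encode(name, forKey: "nmae") // does not compile:
        // a String is not a CodingKeys value, so the typo is caught statically
    }
}
```

Because the struct declares Codable conformance and only `encode(to:)` is written out, the compiler still synthesizes `init(from:)` using the same key enum.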
> > > open func encode(_ value: Bool?, forKey key: Key) throws
> > > open func encode(_ value: Int?, forKey key: Key) throws
> > > open func encode(_ value: Int8?, forKey key: Key) throws
> > > open func encode(_ value: Int16?, forKey key: Key) throws
> > > open func encode(_ value: Int32?, forKey key: Key) throws
> > > open func encode(_ value: Int64?, forKey key: Key) throws
> > > open func encode(_ value: UInt?, forKey key: Key) throws
> > > open func encode(_ value: UInt8?, forKey key: Key) throws
> > > open func encode(_ value: UInt16?, forKey key: Key) throws
> > > open func encode(_ value: UInt32?, forKey key: Key) throws
> > > open func encode(_ value: UInt64?, forKey key: Key) throws
> > > open func encode(_ value: Float?, forKey key: Key) throws
> > > open func encode(_ value: Double?, forKey key: Key) throws
> > > open func encode(_ value: String?, forKey key: Key) throws
> > > open func encode(_ value: Data?, forKey key: Key) throws
>
> > What is the motivation behind abandoning the idea of “primitives” from the Alternatives Considered? Performance? Being unable to close the protocol?
>
> Being unable to close the protocol is the primary reason. Not being able to tell at a glance what the concrete types belonging to this set are is related, and also a top reason.

Looks like we have another strong motivating use case for closed protocols. I hope that will be in scope for Swift 5. It would be great for the auto-generated documentation and “headers” to provide a list of all public or open types inheriting from a closed class or conforming to a closed protocol (when we get them). This would go a long way towards addressing your second reason.

> > What ways is encoding a value envisioned to fail? I understand wanting to allow maximum flexibility, and being symmetric to `decode` throwing, but there are plenty of “conversion” patterns that are asymmetric in the ways they can fail (Date formatters, RawRepresentable, LosslessStringConvertible, etc.).
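One concrete failure mode, sketched against JSONEncoder as it eventually shipped (the `.convertToString` strategy shown is the shipped Foundation API for opting out of this error):

```swift
import Foundation

// JSON has no native representation for NaN or infinity, so encoding
// Double.nan throws EncodingError.invalidValue by default.
let strict = JSONEncoder()
do {
    _ = try strict.encode([Double.nan])
    print("unexpectedly succeeded")
} catch {
    print("threw as expected: \(error)")
}

// Opting in to a string representation makes the same value encodable.
let lenient = JSONEncoder()
lenient.nonConformingFloatEncodingStrategy =
    .convertToString(positiveInfinity: "inf", negativeInfinity: "-inf", nan: "nan")
let data = try! lenient.encode([Double.nan])
print(String(data: data, encoding: .utf8)!) // ["nan"]
```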
> Different formats support different concrete values, even of primitive types. For instance, you cannot natively encode Double.nan in JSON, but you can in plist. Without additional options on JSONEncoder, encode(Double.nan, forKey: …) will throw.
>
> > > /// For `Encoder`s that implement this functionality, this will only encode the given object and associate it with the given key if it is encoded unconditionally elsewhere in the archive (either previously or in the future).
> > > open func encodeWeak<Object : AnyObject & Codable>(_ object: Object?, forKey key: Key) throws
>
> > Is this correct that if I send a Cocoa-style object graph (with weak backrefs), an encoder could infinitely recurse? Or is a coder supposed to detect that?
>
> encodeWeak has a default implementation that calls the regular encode<T : Codable>(_: T, forKey: Key); only formats which actually support weak backreferencing should override this implementation, so it should always be safe to call (it will simply unconditionally encode the object by default).
>
> > > open var codingKeyContext: [CodingKey]
> > > }
>
> > [snippity snip]
> >
> > Alright, those are just my first thoughts. I want to spend a little time marinating in the code from PR #8124 before I comment further. Cheers! I owe you, Michael, and Tony a few drinks for sure.
>
> Hehe, thanks :)
>
> > Zach Waldowski
> > [email protected]
>
> _______________________________________________
> swift-evolution mailing list
> [email protected]
> https://lists.swift.org/mailman/listinfo/swift-evolution
