> On 5 Apr 2017, at 19:04, Tony Parker <[email protected]> wrote:
>
> Hi David,
>
>> On Apr 4, 2017, at 10:33 PM, David Hart via swift-evolution
>> <[email protected]> wrote:
>>
>> Very interesting discussion below. Here are a few more points:
>>
>> Sent from my iPhone
>> On 4 Apr 2017, at 23:43, Itai Ferber via swift-evolution
>> <[email protected]> wrote:
>>
>>> Hi Brent,
>>>
>>> Thanks for your comments and thorough review! :)
>>> Responses inline.
>>>
>>> On 4 Apr 2017, at 1:57, Brent Royal-Gordon wrote:
>>>
>>>
>>> On Apr 3, 2017, at 1:31 PM, Itai Ferber via swift-evolution
>>> <[email protected]> wrote:
>>> Hi everyone,
>>>
>>> With feedback from swift-evolution and additional internal review, we've
>>> pushed updates to this proposal, and to the Swift Archival & Serialization
>>> proposal.
>>> Changes to here mostly mirror the ones made to Swift Archival &
>>> Serialization, but you can see a specific diff of what's changed here. Full
>>> content below.
>>>
>>> We'll be looking to start the official review process very soon, so we're
>>> interested in any additional feedback.
>>>
>>> Thanks!
>>>
>>> — Itai
>>>
>>> This is a good revision to a good proposal.
>>>
>>> I'm glad `CodingKey`s now require `stringValue`s; I think the intended
>>> semantics are now a lot clearer, and key behavior will be much more
>>> reliable.
>>>
>>> Agreed.
>>>
>>>
>>> I like the separation between keyed and unkeyed containers (and I think
>>> "unkeyed" is a good name, though not perfect), but I'm not quite happy with
>>> the unkeyed container API. Encoding a value into an unkeyed container
>>> appends it to the container's end; decoding a value from an unkeyed
>>> container removes it from the container's front. These are very important
>>> semantics that the method names in question do not imply at all.
>>>
>>> I think that consistency of phrasing is really important here, and the
>>> action words "encode" and "decode" are even more important to connote than
>>> the semantics of pushing and popping.
>>> (Note that there need not be specific directionality to an unkeyed
>>> container as long as the ordering of encoded items is eventually maintained
>>> on decode.) But on a practical note, names like encodeAtEnd and
>>> decodeFromFront (or similar) don't feel like they communicate anything much
>>> more useful than the current encode/decode.
>>>
>>>
>>> Certain aspects of `UnkeyedDecodingContainer` also feel like they do the
>>> same things as `Sequence` and `IteratorProtocol`, but in different and
>>> incompatible ways. And I certainly think that the `encode(contentsOf:)`
>>> methods on `UnkeyedEncodingContainer` could use equivalents on the
>>> `UnkeyedDecodingContainer`. Still, the design in this area is much improved
>>> compared to the previous iteration.
>>>
>>> Which aspects of Sequence and IteratorProtocol do you feel like you're
>>> missing on UnkeyedDecodingContainer? Keep in mind that methods on
>>> UnkeyedDecodingContainer must be able to throw, and an
>>> UnkeyedDecodingContainer can hold heterogeneous items whose type is not
>>> known, two things that Sequence and IteratorProtocol do not do.
>>>
>>> In terms of an equivalent to encode(contentsOf:), keep in mind that this
>>> would only work if the collection you're decoding is homogeneous, in which
>>> case, you would likely prefer to decode an Array over getting an unkeyed
>>> container, no? (As soon as conditional conformance arrives in Swift, we
>>> will be able to express extension Array : Decodable where Element :
>>> Decodable { ... } making decoding homogeneous arrays trivial.)
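The homogeneous case described here can be illustrated concretely with the JSONDecoder prototype linked later in this thread (a minimal sketch, assuming the API as proposed):

```swift
import Foundation

// Decoding a homogeneous array in a single call, rather than walking an
// unkeyed container by hand. (Assumes the proposal's JSONDecoder and the
// eventual conditional conformance of Array to Decodable.)
let json = "[1, 2, 3]".data(using: .utf8)!
let numbers = try JSONDecoder().decode([Int].self, from: json)
print(numbers) // [1, 2, 3]
```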
>>>
>>>
>>> (Tiny nitpick: I keep finding myself saying "encode into", not "encode to"
>>> as the API name suggests. Would that be a better parameter label?)
>>>
>>> On a personal note here — I agree with you, and had originally used "into".
>>> However, we've reviewed our APIs and more often have balanced from:/to:
>>> rather than from:/into: on read/write/streaming calls. We'd like to rein
>>> these in a bit and keep them consistent within our naming guidelines, as
>>> much as possible.
>>>
>>>
>>> I like the functionality of the `userInfo` dictionary, but I'm still not
>>> totally satisfied casting out of `Any` all the time. I might just have to
>>> get over that, though.
>>>
>>> I think this is the closest we can get to a pragmatic balance between
>>> dynamic needs and static guarantees. :)
>>>
>>>
>>> I wonder if `CodingKey` implementations might ever need access to the
>>> `userInfo`. I suppose you can just switch to a different set of
>>> `CodingKeys` if you do.
>>>
>>> I don't think CodingKey should ever know about userInfo — CodingKeys should
>>> be inert data. If you need to, use the userInfo to switch to a different
>>> set of keys, as you mention.
>>>
>>>
>>> Should there be a way for an `init(from:)` implementation to determine the
>>> type of container in the encoder it's just been handed? Or perhaps the
>>> better question is, do we want to promise users that all decoders can tell
>>> the difference?
>>>
>>> I think it would be very rare to need this type of information. If a type
>>> wants to encode as an array or as a dictionary conditionally, the context
>>> for that would likely be present in userInfo.
>>> If you really must try to decode regardless, you can always try to grab one
>>> container type from the decoder, and if it fails, attempt to grab the other
>>> container type.
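That fallback can be sketched as follows (assuming the proposal's API; the Pet type and its array form are made up for illustration):

```swift
import Foundation

// A type that accepts either a keyed (dictionary) or an unkeyed (array)
// representation by trying one container type, then the other.
struct Pet: Decodable {
    var name: String
    private enum CodingKeys: String, CodingKey { case name }

    init(from decoder: Decoder) throws {
        if let keyed = try? decoder.container(keyedBy: CodingKeys.self) {
            // Dictionary representation: {"name": "Rex"}
            name = try keyed.decode(String.self, forKey: .name)
        } else {
            // Array representation: ["Rex"]
            var unkeyed = try decoder.unkeyedContainer()
            name = try unkeyed.decode(String.self)
        }
    }
}
```

With a JSON decoder, both `{"name": "Rex"}` and `["Rex"]` would then decode successfully.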
>>>
>>>
>>> * * *
>>>
>>> I went ahead and implemented a basic version of `Encoder` and `Encodable`
>>> in a Swift 3 playground, just to get a feel for this system in action and
>>> experiment with a few things. A few observations:
>>>
>>> Lots to unpack here; let's go one by one. :)
>>>
>>>
>>> * I think it may make sense to class-constrain some of these protocols.
>>> `Encodable` and its containers seem to inherently have reference
>>> semantics—otherwise data could never be communicated from all those
>>> `encode` calls out to the ultimate caller of the API. Class-constraining
>>> would clearly communicate this to both the implementer and the compiler.
>>> `Decoder` and its containers don't *inherently* have reference semantics,
>>> but I'm not sure it's a good idea to potentially copy around a lot of state
>>> in a value type.
>>>
>>> I don't think class constraints are necessary. You can take a look at the
>>> current implementation of JSONEncoder and JSONDecoder here
>>> <https://github.com/itaiferber/swift/blob/3c59bfa749adad2575975e47130b28b731f763e0/stdlib/public/SDK/Foundation/JSONEncoder.swift>
>>> (note that this is still a rough implementation and will be updated soon).
>>> The model I've followed there is that the encoder itself (_JSONEncoder) has
>>> reference semantics, but the containers (_JSONKeyedEncodingContainer,
>>> _JSONUnkeyedEncodingContainer) are value-type views into the encoder itself.
>>>
>>> Keep in mind that during the encoding process, the entities created most
>>> often will be containers. Without some additional optimizations in place,
>>> you end up with a lot of small, short-lived class allocations as containers
>>> are brought into and out of scope.
>>> By not requiring the class constraints, it's at least possible to make all
>>> those containers value types with references to the shared encoder.
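The shape being described reduces to a small sketch (names here are hypothetical, not the proposal's): a class-based encoder owns the storage, while containers are cheap structs holding a reference to it.

```swift
// Reference-semantics encoder, value-semantics container "views".
final class SketchEncoder {
    var storage: [String: String] = [:]
}

struct SketchKeyedContainer {
    // The struct is cheap to create and copy; all copies share the same
    // underlying encoder, so every write lands in one place.
    let encoder: SketchEncoder

    func encode(_ value: String, forKey key: String) {
        encoder.storage[key] = value
    }
}

let encoder = SketchEncoder()
let container = SketchKeyedContainer(encoder: encoder)
container.encode("value", forKey: "key")
print(encoder.storage) // ["key": "value"]
```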
>>>
>>>
>>> * I really think that including overloads for every primitive type in all
>>> three container types is serious overkill. In my implementation, the
>>> primitive types' `Encodable` conformances simply request a
>>> `SingleValueEncodingContainer` and write themselves into it. I can't
>>> imagine any coder doing anything in their overloads that wouldn't be
>>> compatible with that, especially since they can never be sure when someone
>>> will end up using the `Encodable` conformance directly instead of the
>>> primitive. So what are all these overloads buying us? Are they just
>>> avoiding a generic dispatch and the creation of a new `Encoder` and perhaps
>>> a `SingleValueEncodingContainer`? I don't think that's worth the increased
>>> API surface, the larger overload sets, or the danger that an encoder might
>>> accidentally implement one of the duplicative primitive encoding calls
>>> inconsistently with the others.
>>>
>>> To be clear: In my previous comments, I suggested that we should radically
>>> reduce the number of primitive types. That is not what I'm saying here. I'm
>>> saying that we should always use a single value container to encode and
>>> decode primitives, and the other container types should always use
>>> `Encodable` or `Decodable`. This doesn't reduce the capabilities of the
>>> system at all; it just means you only have to write the code to handle a
>>> given primitive type one time instead of three.
>>>
>>> Having implemented these myself multiple times, I agree — it can be a pain
>>> to repeat these implementations, and if you look at the linked
>>> implementations above, funneling to one method from all of those is exactly
>>> what I do (and in fact, this can be shortened significantly, which I plan
>>> on doing soon).
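The funneling pattern mentioned here can be sketched like so (hypothetical names, not the proposal's): every typed overload is a one-line trampoline into a single private method, so the overload set costs the implementer very little.

```swift
// Each overload forwards to one internal boxing method; only the
// public surface repeats, not the implementation.
struct SketchUnkeyedContainer {
    private(set) var boxed: [String] = []

    private mutating func box(_ value: CustomStringConvertible) {
        boxed.append(value.description)
    }

    mutating func encode(_ value: Bool)   { box(value) }
    mutating func encode(_ value: Int)    { box(value) }
    mutating func encode(_ value: String) { box(value) }
}

var container = SketchUnkeyedContainer()
container.encode(42)
container.encode("hi")
print(container.boxed) // ["42", "hi"]
```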
>>>
>>> There is a tradeoff here between ease of use for the end consumer of the
>>> API, and ease of coding for the writer of a new Encoder/Decoder, and my
>>> argument will always be for the benefit of the end consumer. (There will be
>>> orders of magnitude more end consumers of this API than those writing new
>>> Encoders and Decoders 😉)
>>> Think of the experience for the consumer of this API, especially someone
>>> learning it for the first time. It can already be somewhat of a hurdle to
>>> figure out what kind of container you need, but even once you get a keyed
>>> container (which is what we want to encourage), then what? You start typing
>>> container.enc... and in Xcode's autocomplete list, the only thing that
>>> shows up is a single result: encode(value: Encodable, forKey: ...)
>>> Consider the clarity (or lack thereof) of that, as opposed to seeing
>>> encode(value: Int, forKey: ...), encode(value: String, forKey: ...), etc.
>>> Giving users a list of types they are already familiar with helps immensely
>>> in pushing them in the right direction and reducing cognitive load. When
>>> you see String in that list, you don't have to question whether it's
>>> possible to encode a string or not, you just pick it. I have an Int8, can I
>>> encode it? Ah, it's in the list, so I can.
>>>
>>> Even for advanced users of the API, though, there's something to be said
>>> for static guarantees in the overloading. As someone familiar with the API
>>> (who might even know all the primitives by heart), I might wonder if the
>>> Encoder I'm using has correctly switched on the generic type. (It would
>>> have to be a dynamic switch.) Did the implementer remember to switch on
>>> Int16 correctly? Or did they forget it, leaving me to fall into a generic
>>> case that is not appropriate here?
>>>
>>> When it comes to overloads vs. dynamically switching on a generic type, I
>>> think we would generally prefer the static type safety. As a consumer of
>>> the API, I want to be sure that the implementer of the Encoder I'm using
>>> was aware of these primitive types in some way, and that the compiler
>>> helped make sure they didn't, say, forget to switch on Data.self. As a
>>> writer of Encoders, yes, this is a pain, but a sacrifice I'm willing to
>>> make for the sake of the consumer.
>>>
>>> Let's take a step back, though. This is mostly annoying to implement
>>> because of the repetition, right? If someone were to put together a
>>> proposal for a proper macro system in Swift, which is really what we want
>>> here, I wouldn't be sad. 😉
>>>
>> There's also an argument of API surface area. As a user or implementer of
>> the API, it's much less intimidating to load the documentation for a
>> protocol and see one central function than many overloads.
>>
>> I've used many serialization third-party frameworks in Swift. None of them
>> defined all those overloads, and more importantly, I never saw any user of
>> those APIs post an issue to GitHub where the cause could be traced back to
>> the lack of those overloads.
>>
>> These overloads look to me like remnants of Codable's NSCoding influences
>> instead of an API reimagined for Swift.
>>
>> For the same reasons, I continue to believe that decode functions should
>> overload on the return type. If we follow the arguments in favor of
>> providing a type argument, then why don't we also have type arguments for
>> encoders: encode(_ value: T?, forKey key: Key, as type: T.Type)? I'm not
>> advocating that: I'm just pushing the argument to its logical conclusion to
>> explain why I don't understand it.
>
> I don’t see a way for a call to encode to become ambiguous by omitting the
> type argument, whereas the same is not true for a return value from decode.
> The two seem fundamentally different.
When decoding to a property, there will be no ambiguity. And for other cases,
Swift developers are already quite used to handling that kind of ambiguity,
like for literals:
let x: UInt = 10
let y = 20 as CGFloat
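A sketch of the return-type-overloaded decode being argued for (hypothetical API, not the proposal's):

```swift
struct SketchError: Error {}

// decode is overloaded purely on return type; the call site's declared
// type picks the overload, just as with the literal examples above.
struct SketchContainer {
    let storage: [String: Any]

    func decode(forKey key: String) throws -> Int {
        guard let value = storage[key] as? Int else { throw SketchError() }
        return value
    }

    func decode(forKey key: String) throws -> String {
        guard let value = storage[key] as? String else { throw SketchError() }
        return value
    }
}

let container = SketchContainer(storage: ["age": 3, "name": "Rex"])
let age: Int = try container.decode(forKey: "age")
let name: String = try container.decode(forKey: "name")
print(age, name) // 3 Rex
```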
> - Tony
>
>>>
>>> * And then there's the big idea: Changing the type of the parameter to
>>> `encode(to:)` and `init(from:)`.
>>>
>>> ***
>>>
>>> While working with the prototype, I realized that the vast majority of
>>> conformances will immediately make a container and then never use the
>>> `encoder` or `decoder` again. I also noticed that it's illegal to create
>>> more than one container from the same coder, and there are unenforceable
>>> preconditions to that effect. So I'm wondering if it would make sense to
>>> not pass the coder at all, but instead have the conforming type declare
>>> what kind of container it wants:
>>>
>>> extension Pet: Codable {
>>>     init(from container: KeyedDecodingContainer<CodingKeys>) throws {
>>>         name = try container.decode(String.self, forKey: .name)
>>>         age = try container.decode(Int.self, forKey: .age)
>>>     }
>>>
>>>     func encode(to container: KeyedEncodingContainer<CodingKeys>) throws {
>>>         try container.encode(name, forKey: .name)
>>>         try container.encode(age, forKey: .age)
>>>     }
>>> }
>>>
>>> extension Array: Codable where Element: Codable {
>>>     init(from container: UnkeyedDecodingContainer) throws {
>>>         self.init()
>>>         while !container.isAtEnd {
>>>             append(try container.decode(Element.self))
>>>         }
>>>     }
>>>
>>>     func encode(to container: UnkeyedEncodingContainer) throws {
>>>         try container.encode(contentsOf: self)
>>>     }
>>> }
>>>
>>> I think this could be implemented by doing the following:
>>>
>>> 1. Adding an associated type to `Encodable` and `Decodable` for the type
>>> passed to `encode(to:)`/`init(from:)`.
>>>
>>> This is already unfortunately a no-go. As mentioned in other emails, you
>>> cannot override an associatedtype in a subclass of a class, which means that
>>> you cannot require a different container type than your superclass. This is
>>> especially problematic in the default case where we'd want to encourage
>>> types to use keyed containers — every type should have its own keys, and
>>> you'd need to have a different keyed container than your parent, keyed on
>>> your keys.
>>>
>>> Along with that, since the typealias would have to be at least as visible
>>> as your type (potentially public), it would necessitate that your key type
>>> would be at least as public as your type as well. This would expose your
>>> type's coding keys, which is prohibitive. (Consider what this would mean
>>> for frameworks, for instance.)
>>>
>>> Finally, this also means that you could not request different container
>>> types based on context — a type could not offer both a dictionary
>>> representation and a more efficient array representation, since it can only
>>> statically request one container type.
>>>
>>>
>>> 2. Creating protocols for the types that are permitted there. Call them
>>> `EncodingSink` and `DecodingSource` for now.
>>>
>>> 3. Creating *simple* type-erased wrappers for the `Unkeyed*Container` and
>>> `SingleValue*Container` protocols and conforming them to `EncodingSink` and
>>> `DecodingSource`. These wouldn't need the full generic-subclass dance
>>> usually used for type-erased wrappers; they just exist so you can strap
>>> initializers to them. In a future version of Swift which allowed
>>> initializers on existentials, we could probably get rid of them.
>>>
>>> (Incidentally, if our APIs always return a type-erased wrapper around the
>>> `Keyed*ContainerProtocol` types, there's no actual need for the underlying
>>> protocols to have a `Key` associated type; they can use `CodingKey`
>>> existentials and depend on the wrapper to enforce the strong key typing.
>>> That would allow us to use a simple type-erased wrapper for
>>> `Keyed*Container`, too.)
>>>
>>> 4. For advanced use cases where you really *do* need to access the encoder
>>> in order to decide which container type to use, we would also need to
>>> create a simple type-erased wrapper around `Encoder` and `Decoder`
>>> themselves, conforming them to the `Sink`/`Source` protocols.
>>>
>>> This might address my last point above, but then what useful interface
>>> would EncodingSink and DecodingSource have if a type conforming to
>>> EncodingSink could be any one of the containers or even a whole encoder
>>> itself?
>>>
>>>
>>> 5. The Source/Sink parameter would need to be `inout`, unless we *do* end
>>> up class-constraining things. (My prototype didn't.)
>>>
>>> There are lots of little details that change too, but these are the broad
>>> strokes.
>>>
>>> Although this technically introduces more types, I think it actually
>>> simplifies the design for people who are just using the `Codable` protocol.
>>> All they have to know about is the `Codable` protocol, the magic
>>> `CodingKeys` type, the three container types (realistically, probably just
>>> the `KeyedEncoding/DecodingContainer`), and the top-level encoders they
>>> want to use. Most users should never need to know about the members of the
>>> `Encoder` protocol; few even need to know about the other two container
>>> types. They don't need to do the "create a container" dance. The thing
>>> would just work with a minimum of fuss.
>>>
>>> Meanwhile, folks who write encoders *do* deal with a bit more complexity,
>>> but only because they have to be aware of more type-erased wrappers. In
>>> other respects, it's simpler for them, too. Keyed containers don't need to
>>> be generic, and they have a layer of Foundation-provided wrappers above
>>> them that can help enforce good behavior and (probably) hide the
>>> implementation a little bit more. I think that overall, it's probably
>>> better for them, too.
>>>
>>> Thoughts?
>>>
>>> For what it's worth, the way to introduce these three different types of
>>> encoding without the use of associated types is to split the Codable
>>> protocol up into three protocols, which we've tried in the past
>>> <https://github.com/itaiferber/swift-evolution/blob/swift-archival-serialization/proposals/XXXX-swift-archival-serialization.md#alternatives-considered>
>>> (bullet #4). Unfortunately, the results are not great — an even bigger
>>> explosion of types, overloads, etc.
>>>
>>> While I agree that the current approach of dynamically requesting
>>> containers is, well, dynamic, the benefit of not exposing your keys
>>> publicly and allowing encoding of classes is a big win in comparison, I
>>> think.
>>>
>>> I am curious, though, about your comment above on preconditions being
>>> unenforceable, because this is certainly something we would want to hammer
>>> out before releasing. What cases are you thinking of that are unenforceable?
>>>
>>>
>>> --
>>> Brent Royal-Gordon
>>> Architechies
>>>
>>> Again, thanks for your thorough review! Looking forward to further
>>> comments. :)
>>>
_______________________________________________
swift-evolution mailing list
[email protected]
https://lists.swift.org/mailman/listinfo/swift-evolution