Is there any reason the API couldn’t be expressed as something along these
lines?
func hash() -> [Hashable] {
return [x, y]
}
l8r
Sean
> On Mar 13, 2017, at 2:15 PM, David Hart <[email protected]> wrote:
>
>>
>> On 13 Mar 2017, at 18:54, Sean Heber via swift-evolution
>> <[email protected]> wrote:
>>
>> I’m dumb when it comes to proper hashing, but it’s such a tediously common
>> thing in my experience to need to add Hashable to some kind of a struct so I
>> can stash it in a set or use it as a dictionary key. Is there really no way
>> to make this all more automatic? I have to be honest - this proposal seems
>> *harder* to understand than the way it works now.
>
> It's not really harder: just call hash on each of your type's significant
> values:
>
> x.hash(&hasher)
> y.hash(&hasher)
>
> How would you implement hashValue in a simpler way, remembering that 'x ^ y'
> is an incorrect implementation?
>
>> Of course the easiest would be if the language could just do this “good
>> enough" for me using reflection or whatever and if I really did run into a
>> problem where I wanted to do this myself, I could override something.
>>
>> Perfect is the enemy of good.
>>
>> l8r
>> Sean
>>
>>
>>> On Mar 13, 2017, at 10:38 AM, Vincent Esche via swift-evolution
>>> <[email protected]> wrote:
>>>
>>> Hi there,
>>>
>>> I've written up a proposal for an overhaul/replacement of the Hashable
>>> protocol and would love to hear your feedback on it!
>>>
>>> Rendered | Blog Post
>>>
>>> Cheers,
>>> Vincent
>>>
>>> Ps: I'd like to thank David Hart (@hartbit) for his great editorial
>>> feedback on this proposal. 👍
>>>
>>> HashVisitable
>>>
>>> • Proposal: SE-NNNN
>>> • Authors: Vincent Esche
>>> • Review Manager: TBD
>>> • Status: Awaiting review
>>> Introduction
>>>
>>> Replace the Hashable protocol by two new procotols (Hasher and
>>> HashVisitable) to improve safety, versatility and learnability.
>>>
>>> Motivation
>>>
>>> Implementing Hashable is difficult and the consequences if not done well
>>> have dire performance and safety repercussions.
>>>
>>> The documentation of Hashable lists a sample implementation of var
>>> hashValue:
>>>
>>> /// A point in an x-y coordinate system.
>>> struct GridPoint {
>>> var x: Int
>>> var y: Int
>>> }
>>>
>>> extension GridPoint: Hashable {
>>> var hashValue: Int {
>>> return x.hashValue ^ y.hashValue
>>> }
>>>
>>> static func == (lhs: GridPoint, rhs: GridPoint) -> Bool {
>>> return lhs.x == rhs.x && lhs.y == rhs.y
>>> }
>>> }
>>>
>>> Calculating the hashes of all GridPoints (given the above implementation)
>>> on a 1000 × 1000 grid …
>>>
>>> let (width, height) = (1000, 1000)
>>> let total = width * height
>>> var hashes = Set<Int>()
>>> for x in 0..<width {
>>> for y in 0..<height {
>>> hashes.insert(GridPoint(x: x, y: y).hashValue)
>>> }
>>> }
>>> print("\(hashes.count) unique hashes out of a total of \(total).")
>>>
>>> … results in just 1024 unique hash values for 1_000_000 unique values.
>>>
>>> In other words: The recommended implementation causes 99.9% of values to
>>> trigger a hash collision.
>>>
>>> Out of those 1_000_000 values the median collision count was 976 with min
>>> and max being 976 and 1000respectively.
>>>
>>> The collision rate will have negative impact in algorithms which heavily
>>> use hashValue like the ones in Dictionaryand Set. Furthermore, it increases
>>> the vulnerability to DDOS attacks when exposed to the web.
>>>
>>> If even the official Swift documentation gets the implementation of
>>> hashValue wrong, then who is to expect the average Swift programmer to do
>>> any better?
>>>
>>> In contrast, running the same snippet using HashVisitable and the
>>> semi-secure Fnv1aHash (see below) results in zero collisions!
>>>
>>> Finally, the design of the Hashable protocol forces the use of one
>>> implementation without the possibility of switching between multiple
>>> hashing algorithms.
>>>
>>> Proposed solution
>>>
>>> Instead of coupling the hashing algorithm with each and every Swift type,
>>> we should provide a hashing API based on the visitor-pattern. By freeing
>>> application developers from the burden of having to implement hashing
>>> algorithms, the Standard Library can provide default ones which fulfill
>>> collision, performance and security goals. Furthermore, it would allow
>>> developers to swap to different algorithms based on the use case.
>>>
>>> Detailed design
>>>
>>> The proposal deprecates the Hashable protocol and introduces the following
>>> two:
>>>
>>> protocol Hasher
>>> {
>>>
>>> mutating func finish() -> Int
>>>
>>>
>>> mutating func write(bytes
>>> : UnsafeRawBufferPointer)
>>> }
>>>
>>>
>>> protocol HashVisitable
>>> {
>>>
>>> func hash<H: Hasher>(_ hasher: inout
>>> H)
>>> }
>>>
>>> Hasher is the protocol which represents a hashing algorithm, and
>>> HashVisitable replaces Hashable. For types entirely represented by their
>>> memory layout, the following protocol would provide a default
>>> implementation:
>>>
>>> protocol ContiguouslyHashable: HashVisitable {}
>>>
>>> extension ContiguouslyHashable {
>>> func hash<H: Hasher>(_ hasher: inout H) {
>>> var mutableSelf = self
>>> try! Swift.withUnsafeBytes(of: &mutableSelf) {
>>> hasher.write(bytes: $0)
>>> }
>>> }
>>> }
>>>
>>> extension Bool : ContiguouslyHashable {}
>>> extension UInt8 : ContiguouslyHashable {}
>>> extension UInt16 : ContiguouslyHashable {}
>>> extension UInt32 : ContiguouslyHashable {}
>>> extension UInt64 : ContiguouslyHashable {}
>>> extension UInt : ContiguouslyHashable {}
>>> extension Int8 : ContiguouslyHashable {}
>>> extension Int16 : ContiguouslyHashable {}
>>> extension Int32 : ContiguouslyHashable {}
>>> extension Int64 : ContiguouslyHashable {}
>>> extension Int : ContiguouslyHashable {}
>>>
>>> The Standard-Library would then provide a set of hashing implementations
>>> specific to each purpose. A possible choice for hashing algorithms would be
>>> the reasonably fast SipHash-2-4, and the reasonably secure SipHash-4-8.
>>>
>>> FNV-1A is another popular semi-secure but blazingly fast hash algorithm,
>>> which – for the sake of demonstration – could be implemented as follows:
>>>
>>> struct Fnv1aHash
>>> {
>>>
>>> fileprivate var state: UInt
>>>
>>>
>>> init(seed: UInt
>>> ) {
>>>
>>> self.state = seed &+ 14695981039346656037
>>>
>>> }
>>> }
>>>
>>>
>>> extension Fnv1aHash: Hasher
>>> {
>>>
>>> mutating func write(bytes
>>> : UnsafeRawBufferPointer) {
>>>
>>> for byte in
>>> bytes {
>>>
>>> self.state = (self.state ^ UInt(byte)) &* 1099511628211
>>>
>>> }
>>> }
>>>
>>> mutating func finish() -> Int
>>> {
>>>
>>> return unsafeBitCast(self.state, to: Int.self
>>> )
>>> }
>>> }
>>>
>>> Coming back to the sample code present in the Hashable documentation, the
>>> new implementation would look like:
>>>
>>> extension GridPoint: HashVisitable
>>> {
>>>
>>> func hash<H: Hasher>(_ hasher: inout
>>> H) {
>>>
>>> self.x.hash(&
>>> hasher)
>>>
>>> self.y.hash(&
>>> hasher)
>>> }
>>> }
>>>
>>> Source compatibility
>>>
>>> Making use of "extending protocols to conform to protocols":
>>>
>>> extension Hashable: HashVisitable
>>> {
>>>
>>> func hash<H: Hasher>(_ hasher: inout
>>> H) {
>>>
>>> self.hashValue.hash(&
>>> hasher)
>>> }
>>> }
>>>
>>> Effect on ABI stability
>>>
>>> n/a
>>>
>>> Effect on API resilience
>>>
>>> This feature should be possible to add/remove without breaking ABI.
>>>
>>> Alternatives considered
>>>
>>> n/a
>>> _______________________________________________
>>> swift-evolution mailing list
>>> [email protected]
>>> https://lists.swift.org/mailman/listinfo/swift-evolution
>>
>> _______________________________________________
>> swift-evolution mailing list
>> [email protected]
>> https://lists.swift.org/mailman/listinfo/swift-evolution
_______________________________________________
swift-evolution mailing list
[email protected]
https://lists.swift.org/mailman/listinfo/swift-evolution