That sure looks like a nice addition. 👍 I tried to keep the proposal as small as possible, as it already looks much more complicated than it is.
On Mon, Mar 13, 2017 at 6:25 PM, Haravikk <[email protected]> wrote: > I really like the proposal! > > Only thing I'm wondering is whether a convenience function on Hasher might > be nice to have, like so: > > extension Hasher { > mutating func hash<H:Hashable>(_ hashable:H) -> Int { > hashable.hash(self) > return self.finish() > } > } > > This just feels like it would be easier for the most common case where you > just need to hash a single value right away, as you'd just do: > > let hashValue = hasher.hash(foo) > > While leaving the inout style available for when you do need to combine > multiple values together or otherwise feed in data gradually. This way > we're not losing as much of the convenience of the .hashValue that's being > deprecated. On that note, should Hashable be updated to do something like > the above as well, so it still immediately returns a value from a default > hasher? > > Anyway, definitely in favour of this change, as you're right; .hashValue > is a hard thing to do right, and is often an after-thought, which can make > it of limited usefulness to types that depend upon it! > > On 13 Mar 2017, at 15:38, Vincent Esche via swift-evolution < > [email protected]> wrote: > > Hi there, > > I've written up a proposal for an overhaul/replacement of the Hashable > protocol and would love to hear your feedback on it! > > Rendered > <https://gist.github.com/regexident/1b8e84974da2243e5199e760508d2d25> | Blog > Post <https://blog.definiteloops.com/ha-r-sh-visitors-8c0c3686a46f> > > Cheers, > Vincent > > Ps: I'd like to thank David Hart (@hartbit) for his great editorial > feedback on this proposal. 👍 > > HashVisitable > > - Proposal: SE-NNNN > <https://gist.github.com/regexident/NNNN-filename.md> > - Authors: Vincent Esche <https://github.com/regexident> > - Review Manager: TBD > - Status: Awaiting review > > > <https://gist.github.com/regexident/1b8e84974da2243e5199e760508d2d25#introduction> > Introduction > > Replace the Hashable protocol by two new procotols (Hasher and > HashVisitable) to improve safety, versatility and learnability. > > <https://gist.github.com/regexident/1b8e84974da2243e5199e760508d2d25#motivation> > Motivation > > Implementing Hashable is difficult and the consequences if not done well > have dire performance and safety repercussions. > > The documentation of Hashable lists a sample implementation > <https://developer.apple.com/reference/swift/hashable> of var hashValue: > > /// A point in an x-y coordinate system. > struct GridPoint { > var x: Int > var y: Int > } > > extension GridPoint: Hashable { > var hashValue: Int { > return x.hashValue ^ y.hashValue > } > > static func == (lhs: GridPoint, rhs: GridPoint) -> Bool { > return lhs.x == rhs.x && lhs.y == rhs.y > } > } > > Calculating the hashes of all GridPoints (given the above implementation) > on a 1000 × 1000 grid … > > let (width, height) = (1000, 1000) > let total = width * height > var hashes = Set<Int>() > for x in 0..<width { > for y in 0..<height { > hashes.insert(GridPoint(x: x, y: y).hashValue) > } > } > print("\(hashes.count) unique hashes out of a total of \(total).") > > … results in just 1024 unique hash values for 1_000_000 unique values. > > In other words: The recommended implementation causes 99.9% of values to > trigger a hash collision. > > Out of those 1_000_000 values the median collision count was 976 with min > and max being 976 and 1000respectively. > > The collision rate will have negative impact in algorithms which heavily > use hashValue like the ones in Dictionaryand Set. Furthermore, it > increases the vulnerability to DDOS attacks when exposed to the web > <https://arstechnica.com/business/2011/12/huge-portions-of-web-vulnerable-to-hashing-denial-of-service-attack/> > . > > If even the official Swift documentation gets the implementation of > hashValue wrong, then who is to expect the average Swift programmer to do > any better? > > In contrast, running the same snippet using HashVisitable and the > semi-secure Fnv1aHash (see below) results in zero collisions! > > Finally, the design of the Hashable protocol forces the use of one > implementation without the possibility of switching between multiple > hashing algorithms. > > <https://gist.github.com/regexident/1b8e84974da2243e5199e760508d2d25#proposed-solution>Proposed > solution > > Instead of coupling the hashing algorithm with each and every Swift type, > we should provide a hashing API based on the visitor-pattern. By freeing > application developers from the burden of having to implement hashing > algorithms, the Standard Library can provide default ones which fulfill > collision, performance and security goals. Furthermore, it would allow > developers to swap to different algorithms based on the use case. > > <https://gist.github.com/regexident/1b8e84974da2243e5199e760508d2d25#detailed-design>Detailed > design > > The proposal deprecates the Hashable protocol and introduces the > following two: > > protocol Hasher { > mutating func finish() -> Int > mutating func write(bytes: UnsafeRawBufferPointer) > } > protocol HashVisitable { > func hash<H: Hasher>(_ hasher: inout H) > } > > Hasher is the protocol which represents a hashing algorithm, and > HashVisitable replaces Hashable. For types entirely represented by their > memory layout, the following protocol would provide a default > implementation: > > protocol ContiguouslyHashable: HashVisitable {} > > extension ContiguouslyHashable { > func hash<H: Hasher>(_ hasher: inout H) { > var mutableSelf = self > try! Swift.withUnsafeBytes(of: &mutableSelf) { > hasher.write(bytes: $0) > } > } > } > > extension Bool : ContiguouslyHashable {} > extension UInt8 : ContiguouslyHashable {} > extension UInt16 : ContiguouslyHashable {} > extension UInt32 : ContiguouslyHashable {} > extension UInt64 : ContiguouslyHashable {} > extension UInt : ContiguouslyHashable {} > extension Int8 : ContiguouslyHashable {} > extension Int16 : ContiguouslyHashable {} > extension Int32 : ContiguouslyHashable {} > extension Int64 : ContiguouslyHashable {} > extension Int : ContiguouslyHashable {} > > The Standard-Library would then provide a set of hashing implementations > specific to each purpose. A possible choice for hashing algorithms would be > the reasonably fast SipHash-2-4 <https://en.wikipedia.org/wiki/SipHash>, > and the reasonably secure SipHash-4-8 > <https://en.wikipedia.org/wiki/SipHash>. > > FNV-1A is another popular semi-secure but blazingly fast hash algorithm, > which – for the sake of demonstration – could be implemented as follows: > > struct Fnv1aHash { > fileprivate var state: UInt > init(seed: UInt) { > self.state = seed &+ 14695981039346656037 > } > } > extension Fnv1aHash: Hasher { > mutating func write(bytes: UnsafeRawBufferPointer) { > for byte in bytes { > self.state = (self.state ^ UInt(byte)) &* 1099511628211 > } > } > mutating func finish() -> Int { > return unsafeBitCast(self.state, to: Int.self) > } > } > > Coming back to the sample code present in the Hashable documentation, the > new implementation would look like: > > extension GridPoint: HashVisitable { > func hash<H: Hasher>(_ hasher: inout H) { > self.x.hash(&hasher) > self.y.hash(&hasher) > } > } > > > <https://gist.github.com/regexident/1b8e84974da2243e5199e760508d2d25#source-compatibility>Source > compatibility > > Making use of "extending protocols to conform to protocols > <https://github.com/apple/swift-evolution/blob/d33c129f0920af0404f42219db56981411b20e76/proposals/0143-conditional-conformances.md#extending-protocols-to-conform-to-protocols> > ": > > extension Hashable: HashVisitable { > func hash<H: Hasher>(_ hasher: inout H) { > self.hashValue.hash(&hasher) > } > } > > > <https://gist.github.com/regexident/1b8e84974da2243e5199e760508d2d25#effect-on-abi-stability>Effect > on ABI stability > > n/a > > <https://gist.github.com/regexident/1b8e84974da2243e5199e760508d2d25#effect-on-api-resilience>Effect > on API resilience > > This feature should be possible to add/remove without breaking ABI. > > <https://gist.github.com/regexident/1b8e84974da2243e5199e760508d2d25#alternatives-considered>Alternatives > considered > n/a > _______________________________________________ > swift-evolution mailing list > [email protected] > https://lists.swift.org/mailman/listinfo/swift-evolution > > >
_______________________________________________ swift-evolution mailing list [email protected] https://lists.swift.org/mailman/listinfo/swift-evolution
