Fair point! I’m all in favor of a simple(r) API. Noted, and thanks for the comment!
> On 05 Jan 2016, at 03:07, Donnacha Oisín Kidney <[email protected]> > wrote: > > Unless I’m confused, all of the “ranged” overloads can already be achieved by > composing ranged subscripts with the indexOf, etc., methods: > > let ar = [0, 1, 2, 3, 4, 5] > let ind = ar[3...5].indexOf { n in n % 2 == 0 } > ar[ind!] // 4 > >> On 5 Jan 2016, at 01:00, Vincent Esche via swift-evolution >> <[email protected] <mailto:[email protected]>> wrote: >> >> There are a couple of things I’m not 100% happy with/sure about: >> >> 1. >> I don’t like how it’s >> “<index/range/count>Of(element:range:)” but >> “<index/range/count>Of(range:predicate:)”. >> The reason I went for “<index/range/count>Of(range:predicate:)” was to go >> with the standard pattern of putting closures last and thus allowing for >> trailing closure syntax. >> >> The current order allows for this: >> “<index/range/count>Of(0..10) { $0.name == “Foobar" }” >> which I like syntax-wise. >> It however makes it look like one was looking for a range ("indexOf(0..10, >> …)”), not the predicate. >> >> While the alternative requires this: >> “<index/range/count>Of(predicate: { $0.name == “Foobar" }, range: 0..10)” >> >> I’m actually leaning towards the latter now. Dang! >> >> 2. >> I’m unsure about the name of “lensView”. I first went with “transform”, then >> with “mappingFunction”, but found that neither of those made it clear that >> the closure should provide a specific view into the element. Both >> “transform" and “mappingFunction” imply the transformation from one data >> type to another. It’s not about transforming. It’s about accessing. >> Thus looked for names of similar patterns and found “keypaths” (kind of) in >> ObjC and lenses in Haskell. To people familiar with functional programming >> the name should be clear. The closure passed to “lensView” is basically the >> getter part of a functional lens. >> Anyway, I’m not too attached to the name and more than open to suggestions. >> >> 3. >> “BinarySearchView.init(base:validateAgainst:)” currently asserts. This >> should probably be split into two init functions. One assuming the base to >> be sorted (“BinarySearchView.init(base:)”, O(1)). And one allowing for a >> check, returning nil on failure >> (“BinarySearchView.init?(base:validateAgainst:)”, O(self.count)). Or should >> at least throw an error, instead of panicking. >> >> Note: I made some minor documentation changes/fixes. >> See https://gist.github.com/regexident/2b7531bd748f57679671 >> <https://gist.github.com/regexident/2b7531bd748f57679671> for up-to-date >> RFC/source code (vs. Markdown-dump at bottom of OP). >> >> - Vincent >> >>> On 04 Jan 2016, at 16:13, Vincent Esche <[email protected] >>> <mailto:[email protected]>> wrote: >>> >>> After having had this code laying around on my Mac since 6/26/15 (waiting >>> for Swift to go open source) >>> I figured it’s about damn time I submit the actual RFC for it. So without >>> further ado… >>> >>> One of the areas where Swift’s stdlib is still quite lacking is searching. >>> Which is a shame as searching (along with indexing) might be among >>> the most important tasks of computer science/software development. >>> One might not need fast collection searches for writing a banal fart or >>> flashlight app, >>> but almost every other app is likely to benefit from having a proper >>> searching API. >>> >>> So I’d like to fix that. >>> >>> Attached you find a full RFC along with the full and functional source code >>> to make it happen. >>> >>> I’d love to hear your opinion on this and will be more than happy to answer >>> questions. >>> >>> Rendered Version + Full Implementation: >>> https://gist.github.com/regexident/2b7531bd748f57679671 >>> <https://gist.github.com/regexident/2b7531bd748f57679671> >>> (The code is tested using Quick/Nimble. Tests would thus have to be ported >>> to Swift’s XCTest.) >>> >>> - Vincent >>> >>>> Markdown-dump for >>>> >>>> # Improved Collection Search >>>> >>>> * Proposal: >>>> [SE-NNNN](https://github.com/apple/swift-evolution/blob/master/proposals/NNNN-name.md >>>> >>>> <https://github.com/apple/swift-evolution/blob/master/proposals/NNNN-name.md>) >>>> * Author(s): [Vincent Esche](https://github.com/regexident >>>> <https://github.com/regexident>) >>>> * Status: **Review** >>>> * Review manager: TBD >>>> >>>> ## Introduction >>>> >>>> This RFC proposes an extension to the currently rather limited and linear >>>> searching API of `CollectionType`, that is `indexOf(element:)` and >>>> `indexOf(predicate:)`. >>>> It proposes the backwards-compatible refactoring of those existing methods >>>> as well as the introduction of a view-based ensemble of methods for >>>> efficient binary-search. >>>> >>>> Swift-evolution thread: [link to the discussion thread for that >>>> proposal](https://lists.swift.org/pipermail/swift-evolution >>>> <https://lists.swift.org/pipermail/swift-evolution>) >>>> >>>> ## Motivation >>>> >>>> Indexing of and searching in data is a big deal in software development. >>>> As such it is crucial to have capable means of doing so in a language that >>>> aspires to be a highly performant programming language. >>>> >>>> Swift's current API for searching is basically limited to these two >>>> methods on `CollectionType`: >>>> >>>> > ```swift >>>> > func indexOf(element: Self.Generator.Element) -> Self.Index? >>>> > ``` >>>> > Returns the first index where value appears in self or nil if value is >>>> > not found. >>>> > >>>> > **Complexity**: `O(self.count)`. >>>> >>>> and: >>>> >>>> > ```swift >>>> > func indexOf(@noescape predicate: (Self.Generator.Element) throws -> >>>> > Bool) rethrows -> Self.Index? >>>> > ``` >>>> > Returns the first index where predicate returns true for the >>>> > corresponding value, or nil if such value is not found. >>>> > >>>> > **Complexity**: `O(self.count)`. >>>> >>>> The author sees a couple of issue with these two methods: >>>> >>>> 1. They do not provide an option for restricting the "haystack" to a >>>> certain `range` of `self`. >>>> 2. They do not provide a fast (`O(log2(self.count))`) path for sorted >>>> collections. >>>> 3. They should not be limited to `CollectionType` but instead be moved to >>>> `Indexable`. >>>> >>>> In many situations it is desirable to perform fast searches on sorted >>>> collections instead of having to fall back to naïve linear searches. >>>> >>>> Further more while a single `index` might be the most common subject of >>>> interest, there are many scenarios where a `range` or `count` of matches >>>> are of more use. >>>> >>>> And anyone who is writing a custom ordered collection type will eventually >>>> want to find the insertion index for a given element and will thus end up >>>> writing some variant of `lowerBoundOf(…)` or `upperBoundOf(…)` (upon which >>>> all the other methods can be implemented, actually). >>>> >>>> ## Proposed solution >>>> >>>> The author proposes: >>>> >>>> 1. A backwards-compatible refactoring of `CollectionType.indices`, moving >>>> it to `Indexable`. >>>> >>>> 2. A backwards-compatible refactoring of `indexOf(…)` (adding optional >>>> `range:` and moving it to `Indexable`). >>>> >>>> 3. The addition of `rangeOf(…)`, `countOf(…)` and `isSorted(…)` to >>>> `Indexable` with a TIME complexity of `O(self.count)`. >>>> >>>> 4. The introduction of a `BinarySearchView` on `Indexable`, allowing for >>>> fast (`O(log2(self.count))`) searches on `Indexable` via `indexOf(…)`, >>>> `rangeOf(…)`, `countOf(…)`, `lowerBoundOf(…)`, `upperBoundOf(…)` without >>>> cluttering `Indexable`'s interface. >>>> >>>> ## Detailed design >>>> >>>> ### `CollectionType.indices`: >>>> >>>> The author proposes the relocation of `.indices` from `CollectionType` to >>>> `Indexable`: >>>> >>>> ```swift >>>> extension Indexable { >>>> /// Return the range of valid index values. >>>> /// >>>> /// The result's `endIndex` is the same as that of `self`. Because >>>> /// `Range` is half-open, iterating the values of the result produces >>>> /// all valid subscript arguments for `self`, omitting its `endIndex`. >>>> public var indices: Range<Self.Index> { get } >>>> } >>>> ``` >>>> >>>> After all all it does is provide a convenience method for generating a >>>> Range from its `startIndex` and `endIndex`. There is no need for `self` to >>>> be a `CollectionType` for this. >>>> >>>> This change should not break any existing code as it generalizes the >>>> property. >>>> >>>> ### `Indexable` of `Comparable` elements: >>>> >>>> The author proposes the addition of the following method on `Indexable >>>> where Element : Comparable`: >>>> >>>> ```swift >>>> extension Indexable where Index: RandomAccessIndexType, _Element: >>>> Comparable { >>>> /// Return true iff all elements in `self` within `range` are sorted >>>> according to `isOrderedBefore`. >>>> /// >>>> /// - Complexity: O(`self.count`) >>>> public func isSorted(range: Range<Index>? = nil) -> Bool >>>> } >>>> ``` >>>> >>>> This method would perform a linear scan of the `Indexable` with a TIME >>>> complexity of `O(self.count)`. >>>> >>>> An actual implementation of these methods can be found in >>>> `Indexable.swift`. >>>> >>>> ### `Indexable` of `Equatable ` elements: >>>> >>>> The author proposes the addition of the following method on `Indexable >>>> where Element : Equatable`: >>>> >>>> ```swift >>>> extension Indexable where Index : RandomAccessIndexType, _Element : >>>> Equatable { >>>> >>>> /// Returns the first index where `element` appears in `self` >>>> /// within range `range` or `nil` if `element` is not found. >>>> /// >>>> /// - Complexity: O(`self.count`) >>>> public func indexOf(element: _Element, range: Range<Index>? = default) >>>> -> Self.Index? >>>> >>>> /// Returns the first range where `element` appears in `self` >>>> /// within range `range` or `nil` if `element` is not found. >>>> /// >>>> /// - Complexity: O(`self.count`) >>>> public func rangeOf(element: _Element, range: Range<Index>? = default) >>>> -> Range<Self.Index>? >>>> >>>> /// Returns the number of `element`s that appear in `self` >>>> /// within range `range` or `nil` if such an element is not found. >>>> /// >>>> /// - Complexity: O(`self.count`) >>>> public func countOf(element: _Element, range: Range<Index>? = default) >>>> -> Int >>>> } >>>> ``` >>>> >>>> These methods would perform a linear scan of the `Indexable` with a TIME >>>> complexity of `O(self.count)`. >>>> >>>> An actual implementation of these methods can be found in >>>> `Indexable.swift`. >>>> >>>> ### `Indexable` of any elements: >>>> >>>> The author proposes the addition of the following methods on `Indexable`: >>>> >>>> ```swift >>>> extension Indexable where Index : RandomAccessIndexType { >>>> >>>> /// Returns the first index where an element appears in `self` >>>> /// within range `range` for which `lensView(element) == value` or >>>> `nil` if such an element is not found. >>>> /// >>>> /// - Complexity: O(`self.count`) >>>> public func indexOf(range: Range<Index>? = default, @noescape >>>> predicate: (_Element) throws -> Bool) rethrows -> Self.Index? >>>> >>>> /// Returns the first range where an element appears in `self` >>>> /// within range `range` for which `lensView(element) == value` or >>>> `nil` if such an element is not found. >>>> /// >>>> /// - Complexity: O(`self.count`) >>>> public func rangeOf(range: Range<Index>? = default, @noescape >>>> predicate: (_Element) throws -> Bool) rethrows -> Range<Self.Index>? >>>> >>>> /// Returns the number of `element`s that appear in `self` >>>> /// within range `range` or `nil` if `element` is not found. >>>> /// >>>> /// - Complexity: O(`self.count`) >>>> public func countOf(range: Range<Index>? = default, @noescape >>>> predicate: (_Element) throws -> Bool) rethrows -> Int >>>> >>>> /// Return true iff all elements in `self` within `range` are sorted >>>> according to `isOrderedBefore`. >>>> /// >>>> /// - Complexity: O(`self.count`) >>>> public func isSorted(range: Range<Index>? = default, @noescape >>>> isOrderedBefore: (_Element, _Element) throws -> Bool) rethrows -> Bool >>>> } >>>> ``` >>>> >>>> These methods would perform a linear scan of the `Indexable` with a TIME >>>> complexity of `O(self.count)`. >>>> >>>> An actual implementation of these methods can be found in >>>> `Indexable.swift`. >>>> >>>> **Note**: For explanation of and reasoning behind the `lensView:` argument >>>> as well as the distinction between `T` and `Base._Element`, see below. >>>> >>>> ### `Indexable.binarySearchView(validateAgainst:?)`: >>>> >>>> The author proposes the addition of a >>>> `binarySearchView(validateAgainst:?)` method on `Indexable` for obtaining >>>> a specialized search view for performing searches with >>>> O(`log2(self.count)`) complexity: >>>> >>>> ```swift >>>> extension Indexable where Index: RandomAccessIndexType { >>>> /// Create a view into a sorted indexable that allows access within >>>> `bounds`. >>>> /// >>>> /// - Complexity: O(`self.count`) if a validation closure is provided, >>>> otherwise O(`1`). >>>> public func binarySearchView(validateAgainst isOrderedBefore: >>>> ((_Element, _Element) -> Bool)? = nil) -> BinarySearchView<Self> >>>> } >>>> ``` >>>> >>>> ### `BinarySearchView<T: Indexable>`: >>>> >>>> The author proposes the introduction of a `BinarySearchView` struct with >>>> the following interface: >>>> >>>> ```swift >>>> public struct BinarySearchView<Base : Indexable where Base.Index : >>>> RandomAccessIndexType> : Indexable { >>>> public typealias Index = Base.Index >>>> >>>> public var startIndex: Base.Index { get } >>>> >>>> public var endIndex: Base.Index { get } >>>> >>>> public subscript (position: Index) -> Base._Element { get } >>>> >>>> /// Create a view into a sorted collection `base` that allows access >>>> within `bounds`. >>>> /// >>>> /// - Complexity: O(1). >>>> public init(base: Base, validateAgainst isOrderedBefore: >>>> ((Base._Element, Base._Element) -> Bool)? = default) >>>> >>>> /// Returns first index of the first `element` in `self` within range >>>> `range` for which >>>> /// `lensView(element) < value` evaluates to false or `nil` if >>>> `element` is not found. >>>> /// >>>> /// - Complexity: O(`log2(self.count)`). >>>> public func lowerBoundOf<T : Comparable>(value: T, range: >>>> Range<Index>? = default, @noescape isOrderedBefore: (T, T) -> Bool = >>>> default, @noescape lensView: Base._Element -> T) -> Base.Index >>>> >>>> /// Returns first index of the first `element` in `self` within range >>>> `range` for which >>>> /// `value < lensView(element)` evaluates to true or `nil` if >>>> `element` is not found. >>>> /// >>>> /// - Complexity: O(`log2(self.count)`). >>>> public func upperBoundOf<T : Comparable>(value: T, range: >>>> Range<Index>? = default, @noescape isOrderedBefore: ((T, T) -> Bool) = >>>> default, @noescape lensView: Base._Element -> T) -> Base.Index >>>> >>>> /// Returns the first index where an element appears in `self` >>>> /// within range `range` for which `lensView(element) == value` or >>>> `nil` if such an element is not found. >>>> /// >>>> /// - Complexity: O(`log2(self.count)`). >>>> public func indexOf<T : Comparable>(value: T, range: Range<Index>? = >>>> default, @noescape isOrderedBefore: (T, T) -> Bool = default, @noescape >>>> lensView: Base._Element -> T) -> Base.Index? >>>> >>>> /// Returns the range where an element appears in `self` >>>> /// within range `range` for which `lensView(element) == value` or >>>> `nil` if such an element is not found. >>>> /// >>>> /// - Complexity: O(`log2(self.count)`). >>>> public func rangeOf<T : Comparable>(value: T, range: Range<Index>? = >>>> default, @noescape isOrderedBefore: (T, T) -> Bool = default, @noescape >>>> lensView: Base._Element -> T) -> Range<Base.Index>? >>>> >>>> /// Returns the number of `element`s that appear in `self` >>>> /// within range `range` for which `lensView(element) == value` or >>>> `nil` if such an element is not found. >>>> /// >>>> /// - Complexity: O(`log2(self.count)`). >>>> public func countOf<T : Comparable>(value: T, range: Range<Index>? = >>>> default, @noescape isOrderedBefore: (T, T) -> Bool = default, @noescape >>>> lensView: Base._Element -> T) -> Int >>>> >>>> internal let _base: Base >>>> } >>>> ``` >>>> >>>> As well as the following extension for `Indexable where Element: >>>> Comparable`: >>>> >>>> ```swift >>>> extension BinarySearchView where Base._Element : Comparable { >>>> /// Returns first index of the first `element` in `self` within range >>>> `range` for which >>>> /// `… < element` evaluates to false or `nil` if `element` is not >>>> found. >>>> /// >>>> /// - Complexity: O(`log2(self.count)`). >>>> public func lowerBoundOf(element: Base._Element, range: Range<Index>? >>>> = default, @noescape isOrderedBefore: (Base._Element, Base._Element) -> >>>> Bool = default) -> Base.Index >>>> >>>> /// Returns first index of the first `element` in `self` within range >>>> `range` for which >>>> /// `element < …` evaluates to true or `nil` if `element` is not found. >>>> /// >>>> /// - Complexity: O(`log2(self.count)`). >>>> public func upperBoundOf(element: Base._Element, range: Range<Index>? >>>> = default, @noescape isOrderedBefore: (Base._Element, Base._Element) -> >>>> Bool = default) -> Base.Index >>>> >>>> /// Returns the first index where `element` appears in `self` >>>> /// within range `range` or `nil` if `element` is not found. >>>> /// >>>> /// - Complexity: O(`log2(self.count)`). >>>> public func indexOf(element: Base._Element, range: Range<Index>? = >>>> default, @noescape isOrderedBefore: (Base._Element, Base._Element) -> Bool >>>> = default) -> Base.Index? >>>> >>>> /// Returns the range where `element` appears in `self` >>>> /// within range `range` or `nil` if `element` is not found. >>>> /// >>>> /// - Complexity: O(`log2(self.count)`). >>>> public func rangeOf(element: Base._Element, range: Range<Index>? = >>>> default, @noescape isOrderedBefore: (Base._Element, Base._Element) -> Bool >>>> = default) -> Range<Base.Index>? >>>> >>>> /// Returns the number of `element`s that appear in `self` >>>> /// within range `range` or `nil` if `element` is not found. >>>> /// >>>> /// - Complexity: O(`log2(self.count)`). >>>> public func countOf(element: Base._Element, range: Range<Index>? = >>>> default, @noescape isOrderedBefore: (Base._Element, Base._Element) -> Bool >>>> = default) -> Int >>>> } >>>> ``` >>>> >>>> These methods would perform a linear scan of the `Indexable` with a TIME >>>> complexity of `O(log2(self.count))`. >>>> >>>> An actual implementation of these methods can be found in >>>> `BinarySearchView.swift`. >>>> >>>> ### Why `lensView`? >>>> >>>> Let's assume one had a collection of Persons like this: >>>> >>>> ```swift >>>> struct Person { >>>> let name: String >>>> let age: Int >>>> } >>>> >>>> let persons: [Person] = [ … ] >>>> ``` >>>> >>>> Now let's assume one wanted to find the first person that's called `John >>>> Doe`. One could either change `Person` to conform to `Equatable`, allowing >>>> `indexOf(element:)` to be called on `persons`. Now that was easy, wasn't >>>> it? >>>> >>>> Fine, but what if one wanted to search `Person`s by both their `name` or >>>> their `age` depending on the given use case? You're out of luck now. >>>> >>>> Luckily there is a variant of `indexOf(…)` called `indexOf(predicate:)` >>>> that is available for any type of `Element` regardless of its conformance >>>> to `Equatable`: >>>> >>>> ```swift >>>> let searchView = persons.binarySearchView() >>>> let index = persons.indexOf { $0.name == "John Doe" } >>>> ``` >>>> >>>> Unfortunately however such a `predicate` cannot be used when searching for >>>> an element's property in a sorted(!) collection or a collection of >>>> non-`Comparable` elements within a TIME complexity of >>>> `O(log2(self.count))` as binary search uses `<` instead of `==` internally >>>> and as the "needle" gets passed as both the first (in `indexOf(…)` itself) >>>> and second argument (in `lowerBoundOf(…)` which gets called by >>>> `indexOf(…)`) to the operator, one cannot simply change `predicate: (E) -> >>>> Bool` to `isOrderedBefore: (V, E) -> Bool` as that would result in a type >>>> mismatch (`(V, E)` vs. `(E, V)`). >>>> >>>> The use of `lensView` however allows one to do just that without making a >>>> mess. >>>> >>>> ```swift >>>> let searchView = persons.binarySearchView() // sorted! >>>> let index = searchView.indexOf("John Doe") { $0.name } >>>> ``` >>>> >>>> The `lensView` works similarly to how keypaths work in Objective-C, just >>>> type-safe. >>>> >>>> ## Impact on existing code >>>> >>>> The author proposes: >>>> >>>> 1. Moving `CollectionType.indices` to `Indexable` should not cause any >>>> issues with existing code. >>>> >>>> 2. Changing `indexOf(element:)` to `indexOf(element:range:?)` should not >>>> cause any issues with existing code as the `range` argument is optional >>>> and the default behaves like it used to. >>>> >>>> 2. Changing `indexOf(predicate:)` to `indexOf(range:?predicate:)` should >>>> not cause any issues with existing code either, as the `range` argument is >>>> optional and the default behaves like it used to. Swift's trailing-closure >>>> magic allows `indexOf(predicate: …)` to still properly map to >>>> `indexOf(range: nil, predicate: …)`. >>>> >>>> 3. The addition of `rangeOf(…)`, `countOf(…)` and `isSorted(…)` to >>>> `Indexable` do not conflict with any existing API. >>>> >>>> 4. The introduction of a `BinarySearchView` on `Indexable` does not >>>> conflict with any existing API. >>>> >>>> ## Alternatives considered >>>> >>>> The author's initial approach was to implement all methods on Indexable. >>>> This however required all methods to either be provided in two variants >>>> (one for unsorted, one for sorted `Indexable`s): >>>> >>>> ```swift >>>> final public func indexOf(element: _Element, range: Range<Index>? = nil) >>>> -> Index? { >>>> return … >>>> } >>>> >>>> final public func sortedIndexOf(element: _Element, range: Range<Index>? = >>>> nil) -> Index? { >>>> return … >>>> } >>>> ``` >>>> >>>> Or require the introduction of an additional argument: >>>> >>>> ```swift >>>> final public func indexOf(element: _Element, range: Range<Index>? = nil, >>>> isSorted: Bool = true) -> Index? { >>>> if isSorted { >>>> return … >>>> } else { >>>> return … >>>> } >>>> } >>>> ``` >>>> >>>> It also would have cluttered `Indexable` with methods such as >>>> `lowerBoundOf` and `upperBoundOf`, which are only ever applicable when >>>> `self` is sorted accordingly. The introduction of a dedicated >>>> BinarySearchView fixes these issues. >>>> >>>> One also would have had to switch from `predicate: (T) -> Bool` to >>>> `isOrderedBefore: (T, T) -> Bool` for all methods for the second unified >>>> API alternative. >>> >> >> _______________________________________________ >> swift-evolution mailing list >> [email protected] <mailto:[email protected]> >> https://lists.swift.org/mailman/listinfo/swift-evolution >> <https://lists.swift.org/mailman/listinfo/swift-evolution>
_______________________________________________ swift-evolution mailing list [email protected] https://lists.swift.org/mailman/listinfo/swift-evolution
