Hi Dave Thanks for your time to go in to this an explain. This optimising goes much further then I thought.
> That's fine. Please don't be offended that I don't wish to argue it further. > It's been an interesting exercise while I'm on vacation and I hoped it would > lay out some general principles that would be useful to others in future even > if you are not convinced, but when I get back to work next week I'll have to > focus on other things. yes, I understand, it would become too iterative and time consuming I guess. ( how can you become work-detached if you keep doing things like this during vacation? ) Enjoy your vacation! TedvG > On 24 Feb 2017, at 22:40, Dave Abrahams <[email protected]> wrote: > > > > Sent from my moss-covered three-handled family gradunza > > On Feb 23, 2017, at 2:04 PM, Ted F.A. van Gaalen <[email protected] > <mailto:[email protected]>> wrote: > >> >>> On 23 Feb 2017, at 02:24, Dave Abrahams <[email protected] >>> <mailto:[email protected]>> wrote: >>> >>> Equally a non-starter. All known threadsafe schemes that require caches to >>> be updated upon non-mutating operations have horrible performance issues, >>> and further this would penalize all string code by reserving space for the >>> cache and filling it even for the vast majority of operations that don't >>> require random access. >> Well, maybe “caching” is not the right description for what I've suggested. >> It is more like: >> let all strings be stored as they are now, but as soon as you want to work >> with >> random accessing parts of a string just “lift the string out of normal >> optimised string storage” >> and then add (temporarily) a Character array so one can work with this >> array directly ” > > That's a cache. > >> which implies that all other strings remain as they are. ergo: efficiency >> is only reduced for the “elevated” strings, > > You have to add that temporary array somewhere. The performance of every > string is penalized for that storage, and also for the cost of throwing it > out upon mutation. Every branch counts. > >> Using e.g. str.freeSpace(), if necessary, would then place the String back >> in its normal storage domain, thereby disposing the Character array >> associated with it. > > Avoiding hidden dynamic storage overhead that needs to be freed is an > explicit goal of the design (see the section on String and Substring). > >>> Trust me, we've gotten lots of such suggestions and thought through the >>> implications of each one very carefully. >> That’s good, because it means, that a lot of people are interested in this >> subject and wish to help. >> Of course you’ll get many of suggestions that might not be very useful, >> perhaps like this one... but sometimes suddenly someone >> comes along with things that might never have occurred to you. >> That is the beautiful nature of ideas… > > But at some point, I hope you'll understand, I also have to say that I think > all the simple schemes have been adequately explored and the complex ones all > seem to have this basic property of relying on caches, which has unacceptable > performance, complexity, and, yes, usability costs. Analyzing and refuting > each one in detail begins to be a waste of time after that. I'm not really > willing to go further down this road unless someone has an implementation and > experimental evidence that demonstrates it as non-problematic. > > >>> I'm afraid you will have to accept being disappointed about this. >> Well, like most developers, I am a stubborn kind of guy.. >> Luckily Swift is very flexible like Lego, so I rolled my own convenience >> struct. >> If I need direct access on a string I simply copy the string to it. >> it permits things like this: (and growing) >> >> let strabc = "abcdefghjiklmnopqrstuvwxyz" >> let strABC = "ABCDEFGHIJKLMNOPQRSTUVWXYZ" >> var abc = TGString(strabc) >> var ABC = TGString(strABC) >> >> func test() >> { >> // as in Basic: left$, mid$, right$ >> print(abc.left(5)) >> print(abc.mid(5,10)) >> print(ABC.mid(5)) >> print(ABC.right(5)) >> >> // ranges and concatenation: >> print(abc[12..<23]) >> print(abc.left(5) + ABC.mid(6,6) + abc[10...25]) >> >> // eat anything: >> let d:Double = -3.14159 >> print(TGString(d)) >> >> let n:Int = 1234 >> print(TGString(n)) >> >> print(TGString(1234.56789)) >> >> let str = abc[15..<17].asString // Copy to to normal Swift String >> print(str) >> >> let s = "\(abc[12..<20])" // interpolate to normal Swift String. >> print(s) >> >> abc[3..<5] = TGString("34") // if lengths don't match: >> abc[8...9] = ABC[24...25] // length of dest. string is altered. >> abc[12] = TGString("$$$$") // if src l > 1 will insert remainder after >> dest.12 here >> abc[14] = TGString("") // empty removes character at pos. >> print(abc) >> abc.insert(at: 3, string: ABC[0..<3]) >> print(abc) >> } >> >> test() >> . >> outputs: >> abcde >> fghjiklmno >> FGHIJKLMNOPQRSTUVWXYZ >> VWXYZ >> mnopqrstuvw >> abcdeGHIJKLklmnopqrstuvwxyz >> -3.14159 >> 1234 >> 1234.56789 >> abcdefghjiklmnopqrstuvwxyz >> mnopqrst >> abc34fghYZkl$$$$nopqrstuvwxyz >> abcABC34fghYZkl$$$$nopqrstuvwxyz >> >> kinda hoped that this could be builtin in Swift strings >> Anyway, I’ve made myself what I wanted, which happily co-exists >> alongside normal Swift strings. Performance and storage >> aspects of my struct TGString are not very important, because >> I am not using this on thousands of strings. >> Simply want to use a string as a plain array, that’s all, >> which is implemented in almost every PL on this planet. >> >> >>> More generally, there's a reason that the collection model has >>> bidirectional and random access distinctions: important data structures are >>> inherently not random access. >> I don’t understand the above line: definition of “important data structures” >> <> “inherently” > > Important data structures are those from the classical CS literature upon > which every practical programming language (and even modern CPU hardware) is > based, e.g. hash tables. Based on the properties of modern string processing, > strings fall into the same category. "Inherent" means that performance > characteristics are tied to the structure of the data or problem being > solved. You can't sort in better than O(N log N) worst case (mythical quantum > computers don't count here), and that's been proven mathematically. Similarly > it's easy to prove that the constraints of our problem mean that counting the > characters in a string will always be O(N) worst case where N is the length > of the representation. That means strings are inherently not random access. > >>> Heroic attempts to present the illusion that they are randomly-accessible >>> are not going to fly. >> ?? Accessing discrete elements directly > > All collections have direct access via indices. You mean randomly, via > arbitrary integers. > >> in an array is not an illusion to me. >> (e.g. I took the 4th and 7th eggs from the container) > > It's not an illusion when they're stored in an array. > > If you have to walk down an aisle of differently sized cereal boxes to pick > the 5th box of SuperBoomCrisp Flakes off the shelf in the grocery store, > that's not random access (even if you're willing to drop the boxes into an > array for later lookups as you go, as you're proposing). That's what Strings > are like. > >>> These abstractions always break down, leaking the true non-random-access >>> nature in often unpredictable ways, penalizing lots of code for the sake of >>> a very few use-cases, and introducing complexity that is hard for the >>> optimizer to digest and makes it painful (sometimes impossible) to grow and >>> evolve the library. >>> >> Is an Array an abstraction? of what? > > A randomly-accessible homogeneous tail growable collection. But the > abstraction in question here isn't Array; it's RandomAccessCollection. > >> I don’t get this either. most components in the real world can be accessed >> randomly. > > Actually that's far from being true in the real world. See the grocery store > above. Computer memory is very unlike most things in the real world. Ask any > roboticist. > >> >>> This should be seen as a general design philosophy: Swift presents >>> abstractions that harmonize with, rather than hide, the true nature of >>> things. >> The true nature of things is a very vague and subjective criterium, > > Not at all; see my explanation above. > >> how can you harmonise with that, let alone with abstractions? >> e.g. for me: “the true nature of things” for an array is that it has direct >> accessible discrete elements… > > Arrays definitely support random access. Strings are not arrays. > >> Sorry, with respect, we have a difference of opinion here. > > That's fine. Please don't be offended that I don't wish to argue it further. > It's been an interesting exercise while I'm on vacation and I hoped it would > lay out some general principles that would be useful to others in future even > if you are not convinced, but when I get back to work next week I'll have to > focus on other things. > >> Thanks btw for the link to this article about tagged pointers, very >> interesting. >> it inspired me to (have) read other things in this domain as well. >> >> TedvG
_______________________________________________ swift-evolution mailing list [email protected] https://lists.swift.org/mailman/listinfo/swift-evolution
