I've heard several times from members of the community (on Matrix and possibly on answers) that a simple iteration like

const mixed = "\b5Ὂg̀9! ℃ᾭG"
for _, c := range mixed {
        // ... do something with c (but not write to it)
}

will actually silently allocate a slice of runes and decode the whole string into it before iterating. I've heard this is done to prevent problems when a programmer overwrites the data being iterated, which any programmer should know to avoid, but sure, whatever. So is it true for constants? Is it always true, or only when the loop writes to the source string or to `c`?

And if it always occurs, wouldn't it be a great optimization to decode each rune only when we get to it?

Since hearing this, I have been writing all of my UTF-8 rune iteration over strings like this:

import "unicode/utf8"

str := "Some UTF-8 string."
var i int
for i < len(str) {
        // decode one rune starting at byte offset i
        r, size := utf8.DecodeRuneInString(str[i:])
        // do something with r...
        i += size
}

Which admittedly is a bit of a burden, and possibly premature optimization if I'm wrong. But I just think it shouldn't silently allocate a ginormous slice in the background to iterate the runes of a string I might not read all of, especially considering that `for i, v := range` is supposed to be the idiomatic form.
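To convince myself that the manual loop and the range loop really visit the same runes, here is a minimal sketch that records (byte offset, rune) pairs from both and prints them side by side. The names runeAt, viaRange and viaDecode are made up for this example.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// runeAt records a decoded rune and the byte offset where it starts.
// (Hypothetical helper type, only for this comparison.)
type runeAt struct {
	offset int
	r      rune
}

// viaRange collects the runes of s using the built-in range loop.
func viaRange(s string) []runeAt {
	var out []runeAt
	for i, r := range s {
		out = append(out, runeAt{i, r})
	}
	return out
}

// viaDecode collects the runes of s by decoding manually, one rune per step.
func viaDecode(s string) []runeAt {
	var out []runeAt
	for i := 0; i < len(s); {
		r, size := utf8.DecodeRuneInString(s[i:])
		out = append(out, runeAt{i, r})
		i += size
	}
	return out
}

func main() {
	s := "a℃G"
	fmt.Println(viaRange(s))
	fmt.Println(viaDecode(s))
}
```

If the two lines printed are identical, the manual loop buys nothing in terms of results; any difference would have to be in performance only.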

I tried to make a test on Compiler Explorer <https://godbolt.org/#z:OYLghAFBqd5TKALEBjA9gEwKYFFMCWALugE4A0BIEAZugHZEDKqAhgDbYgCMALOXUYBVAM7YACgA8QAcgAMM8gCse5dq3qhg6cmM6oiBBtWz1MAYXTsArgFt6IAEwA2cqcwAZAvWwA5OwBG2KQgABzkAA7oIsRG9JY29k6uUTGGDF4%2B/rZBIeF62AZxTESspEQJdg4uutj66fQlZUSZfoHBYbql5ZVJNSLdLd5tOR2hAJS66NakqFwyEayoANaswNgA1Las3gCkcgCC%2BwcEtlHlG7uOjjS2RFeOx8c01vSoWzv0EOOXAOwAQscNsCtgRJNhMBsQLsAMwAEUu1wAEnV2OgwGBLrhHqFeLtQjDdrhuLt/rhdgc4QBxB5AkF0UgbAjkDbvaHwjakDTrUHgyG7AF0kH0u4AOnEpG8RHYX2ZrPGQr%2BcKev2Vhxkk3YsgArIoHPJFOhZFT0BsRNNZpsrjDuIoiLIFONJssQNq5GpZLxFLYQLxtaLcS5uDDHHxtW7QtryPqFOQjTJFCIQO77QbJnBYCgMGcCJwKFQINmIrmOsB2IYIgJc0RgkmIAEHYoAt4ygBPWS28jZ2ymIgAeXo7HbBvIOG2mk4jdHBFIhUMADdsEmR9hwahrDWO4opXUp%2BwCAEuaRW5YcFOiJKfTJbZNBMw2JO%2BAIGERRBJpCOVNw1NyQNo1AeSaQJM6ARA0y6JnUc5xCYZi9NUrjuK02S5CAvyRNEsQMPByQYWkcTIe0IToQURQME0PRWFUuGkQ0FFDFkRFoV0zQ4f0gyEaMxGTOaMxzDwmo6nqU7xqwtiYM4vAbMA7wQBWPwQPgxBkIiNospYOZ5qp3A/CadqNk65BINgrA4CE3zkC6boejIXrRiJsiJsm5Cpo6gkyI43pOKEoqOLwziOBGACcfDcI4vxyPwMaGo5LkGeQGYIPAEBZlgeCECQ%2BbUHeLAcFwT53m%2BUhTiojg/pof46LRMEQO4bHoUhwwoR0QV4Vh8RUX06GpO1nGoa11XkYM9W1PUxQcU1TEDcNnXVCRE2MVxIBBTxFr8QsSyrDy2x7IcxynOcRCIjcdy0nthwvG8HzeN8fyAocwq2GCEJQrCCIPCi7BohiWI4niBJEiSZIUtSZ0HMKDJMiybJvZy3KbE9fJ3YqwoAPSoxsACSGxKNYAwbAA7hoR1EMZrLoJpwQbCQGzLPQ6AE9TSCsEd2NiYTkqGJo1OmojEKiijIJ8/yjiAhyAySpohYKg9woikQ4qS9KsrQzL4MggKapHKqxwajZur2SO8Ymmaa1Wo4Ib6WmkzGaZHQWVZ7parZXnarworcNqvwwqEzi/JGvC8CFUbRXGsVJim8WJSlaAU8WeaUNQRYliEZYVlW5a1tQDYjs29BtluXYUz2jADkOU5jtyk4joQs5FIuy6xquhQbvMnY7s7sb7oebanvMsYXqcW63i%2BuWPvwhViMVn5OOVWg6N3QEWaB4Hh1BZEOLVcGzU4UaNYtqHcNwKSYQ0bFRj1DR9R0R%2BuINjQzYk1RRvf9HXyEt8sZRT%2B71/DEjIfY%2Bq0%2BJcEcO5A2odRLiUktJWS8kNiKQyipa0ZUNgaXjlTFBul0BWzckZEyZlqDOldE7T0wkjbh2cq5DU7lPLkB9DCd23AgpBRhN7ZwIZA6hADobWM8ZcF62jigQgNAaCJ1oKPB8%2BUJ4viKh%2BLuSAkzfnYIokRNAiCtgiFwd0pBFGz10SINRGitHJnAeQvhsg4QEFERsMSEkpIyQQfA%2BcIhbHQIcXAggERxgCMMs7OyDDfiinDLwNhbtfjOGDLifyvCYoJl0FQgyxDuByFITIGE5i4m%2BMmIuUgMRjC8CAA%3D> modifying the string being iterated, and not. 
I learned that I can't read the assembly gc produces. But what I gathered is that it takes a pointer to the original string and actually calls runtime.decoderune on each iteration. Modifications to mixed change the variable mixed (so that it points to the newly allocated string), while the original string pointer is kept on the stack for the duration of the loop. I'd like somebody who can read the assembly to confirm.

If that's the case, it completely voids my argument and concerns, and many people, myself included, will be very happy to learn that it is in fact optimal.

In addition, I wonder if it's the same for other types being iterated. What if a byte slice, for example, has some bytes modified during iteration? I would expect that case to need a copy, but loops that do not write to the iterated variable shouldn't copy.
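As I read the Go spec, the range expression is evaluated once before the loop starts. For a slice, that means only the slice header is captured, so writes to later elements are visible to the loop. For an array value, the whole array is copied. For a string, reassigning the variable does not affect the loop. A small sketch to probe this (the function names are made up for the example):

```go
package main

import "fmt"

// overSlice ranges over a byte slice and overwrites the last element
// during the first iteration.
func overSlice() string {
	s := []byte("abc")
	var seen []byte
	for i, b := range s {
		if i == 0 {
			s[2] = 'z'
		}
		seen = append(seen, b)
	}
	return string(seen)
}

// overArray does the same with an array value instead of a slice.
func overArray() string {
	a := [3]byte{'a', 'b', 'c'}
	var seen []byte
	for i, b := range a {
		if i == 0 {
			a[2] = 'z'
		}
		seen = append(seen, b)
	}
	return string(seen)
}

// overString reassigns the string variable during the first iteration.
func overString() string {
	s := "abc"
	var seen []rune
	for i, r := range s {
		if i == 0 {
			s = "xyz"
		}
		seen = append(seen, r)
	}
	return string(seen)
}

func main() {
	fmt.Println(overSlice())  // abz: the loop sees the write, no copy of the elements
	fmt.Println(overArray())  // abc: the loop iterates over a copy of the array
	fmt.Println(overString()) // abc: the loop keeps the original string
}
```

So a slice loop does not seem to copy the data at all, whether or not the body writes to it; only the array case copies.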

Sorry for the rant, I'm passionate about it.

Luke
