On Thursday, August 18, 2011 13:21:29 unDEFER wrote:
> Hello!
>
> D language specification says that it supports UTF-8 strings, but I can't
> find how to slice UTF-8 string by character index, not by bytes numbers.
> Why there is no simple slice function in std.utf like attached code?
>
> Thank you in advance.

Hmmm. Such a function isn't entirely a bad idea, but it also makes me a bit
nervous. Slicing is efficient. The slice function that you suggest is not. I
mean, it's efficient enough for what it's doing, but it has to walk the
string to find the character boundaries, so it's not O(1) like slicing is.
Calling it slice could therefore be a bit misleading. Once drop has been
merged in, you'll be able to do this

    auto s = takeExactly(drop(str, firstIndex), lastIndex - firstIndex);

to get the same effect. It may be worth adding such a function though.
Certainly

    auto s = slice(str, firstIndex, lastIndex);

is cleaner. If we add it, then we should probably give it a different name.
Maybe sliceByElementType? That's accurate, but it does seem a bit long. We'd
probably put it in std.range rather than std.utf, since it could be useful
for any range which isn't actually sliceable.

And then there's the question of whether it would be better to make it lazy.
The result wouldn't actually be a string anymore, but it would be more
efficient in all of the cases where you don't end up using the whole slice.

You can make a pull request for it if you want to, and the best way to
handle it (as well as whether we actually want such a function) can be
discussed in the pull request. I do think that some thought is going to have
to go into what behavior we really want such a function to have though (as
well as the best name for it).

- Jonathan M Davis
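
For illustration, here is a minimal sketch of the two options discussed
above: a lazy version built on drop and takeExactly, and an eager version
that returns a real string. The names sliceByCodePoint and
sliceByCodePointEager are hypothetical and are not Phobos functions. The
sketch assumes a Phobos version in which std.range.drop and
std.utf.toUTFindex are available, and it indexes by code point, which is
what iterating a string as a range yields.

```d
import std.algorithm : equal;
import std.range : drop, takeExactly;
import std.utf : toUTFindex;

// Hypothetical helper, not part of Phobos.
// drop pops firstIndex code points (O(firstIndex)) without allocating;
// takeExactly then lazily yields the next lastIndex - firstIndex code
// points. The result is a range of dchar, not a string.
auto sliceByCodePoint(string str, size_t firstIndex, size_t lastIndex)
{
    assert(firstIndex <= lastIndex);
    return str.drop(firstIndex).takeExactly(lastIndex - firstIndex);
}

// Hypothetical eager variant, not part of Phobos.
// Converts both code point indices to code unit (byte) offsets and then
// uses ordinary slicing, so the result is still a string.
string sliceByCodePointEager(string str, size_t firstIndex, size_t lastIndex)
{
    assert(firstIndex <= lastIndex);
    immutable begin = str.toUTFindex(firstIndex);
    immutable end = str.toUTFindex(lastIndex);
    return str[begin .. end];
}

unittest
{
    // 'é' and 'ö' each take two bytes in UTF-8, so byte offsets and
    // code point indices differ here.
    string s = "héllo wörld";
    assert(equal(sliceByCodePoint(s, 1, 4), "éll"));
    assert(sliceByCodePointEager(s, 1, 4) == "éll");
}
```

Both versions still have to decode from the start of the string, so neither
is O(1). The lazy one avoids decoding the tail of the slice until it is
actually consumed, at the cost of no longer being a string.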
