On Saturday, October 26, 2013 15:17:33 Ali Çehreli wrote:
> On 10/26/2013 02:25 PM, Namespace wrote:
> > On Saturday, 26 October 2013 at 21:23:13 UTC, Gautam Goel wrote:
> >> Dumb Newbie Question: I've searched through the library reference, but
> >> I haven't figured out how to extract a substring from a string. I'd
> >> like something like string.substring("Hello", 0, 2) to return "Hel",
> >> for example. What method am I looking for? Thanks!
> >
> > Use slices:
> >
> > string msg = "Hello";
> > string sub = msg[0 .. 2];
>
> Yes but that works only if the string is known to contain only ASCII
> codes. (Otherwise, a string is a collection of UTF-8 code units.)
>
> I could not find a subString() function either but it turns out to be
> trivial to implement with Phobos:
>
> import std.range;
> import std.algorithm;
>
> auto subRange(R)(R s, size_t beg, size_t end)
> {
>     return s.dropExactly(beg).take(end - beg);
> }
>
> unittest
> {
>     assert("abcçdef".subRange(2, 4).equal("cç"));
> }
>
> void main()
> {}
>
> That function produces a lazy range. To convert it eagerly to a string:
>
> import std.conv;
>
> string subString(string s, size_t beg, size_t end)
> {
>     return s.subRange(beg, end).text;
> }
>
> unittest
> {
>     assert("Hello".subString(0, 2) == "He");
> }

There's also std.utf.toUTFindex, which allows you to do

    auto str = "Hello";
    assert(str[0 .. str.toUTFindex(2)] == "He");

but you have to be careful with it when using anything other than 0 for the
first index, because you don't want it to have to traverse the string multiple
times. With your Unicode example, you're forced to do something like

    auto str = "abcçdef";
    immutable first = str.toUTFindex(2);
    immutable second = str[first .. $].toUTFindex(2) + first;
    assert(str[first .. second] == "cç");

This approach also has the advantage that the final result is a string, with no
conversion needed. So, subString should probably be defined as

    import std.traits : isSomeChar;

    inout(C)[] subString(C)(inout(C)[] str, size_t i, size_t j)
        if(isSomeChar!C)
    {
        import std.utf;
        immutable first = str.toUTFindex(i);
        // Only the j - i code points after first still need to be walked.
        immutable second = str[first .. $].toUTFindex(j - i) + first;
        return str[first .. second];
    }

Using drop/dropExactly with take/takeExactly makes more sense when you want to
iterate over the characters but don't need a string (especially if you're not
necessarily going to iterate over them all), but if you really want a string,
then finding the right index for the slice and then slicing is arguably better.

- Jonathan M Davis
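
P.S. For anyone who wants to try the index-then-slice approach, here is a minimal, self-contained sketch. It assumes that i and j are code point indices with i <= j, that both lie within the string, and that j is exclusive (matching the subRange example quoted above). The initial assert and the wstring check are additions for illustration only, and the sketch has not been tried against any particular compiler release.

```d
import std.traits : isSomeChar;
import std.utf : toUTFindex;

// Returns the slice of str covering code points [i, j).
// Assumes i <= j and that both are valid code point indices for str.
inout(C)[] subString(C)(inout(C)[] str, size_t i, size_t j)
    if (isSomeChar!C)
{
    assert(i <= j, "i must not exceed j");

    // Code unit index at which code point i starts.
    immutable first = str.toUTFindex(i);

    // Walk only the remaining j - i code points rather than
    // traversing from the start of the string a second time.
    immutable second = str[first .. $].toUTFindex(j - i) + first;

    // Slicing keeps the original string type; no conversion is needed.
    return str[first .. second];
}

void main()
{
    // ASCII only: code point and code unit indices coincide.
    assert("Hello".subString(0, 2) == "He");

    // 'ç' occupies two UTF-8 code units, so plain slicing by
    // code point index would cut through the middle of it.
    assert("abcçdef".subString(2, 4) == "cç");

    // The same function should also work on UTF-16 strings.
    assert("abcçdef"w.subString(2, 4) == "cç"w);
}
```

Because the result is a slice of the input, no memory is allocated, which is the main practical difference from the drop/take version that has to build a new string with text.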