On 2011-06-25 16:07, kenji hara wrote:
> 2011/6/26 Jonathan M Davis <[email protected]>:
> > On 2011-06-25 15:15, kenji hara wrote:
> >> > 1. Keep toStringz as it is (as well as toUTF16z) and either consider
> >> > stringz to be some sort of word unique to the D community or just
> >> > admit that we're not going to camelcase it because it would break too
> >> > much code to do so.
> >>
> >> ++vote, but not all.
> >>
> >> Currently, the return type of toStringz is "zero-terminated UTF-8",
> >> not "C-string".
> >>
> >> The word 'C-string' has multiple meanings (encodings): ASCII, Latin-1,
> >> EUC, Shift-JIS (in Japan), UTF-8 (Linux?), UTF-16 (in Windows) ...
> >> It depends on context.
> >>
> >> But, maybe, in many cases 'C-string' means 'zero-terminated UTF-8' or
> >> 'zero-terminated UTF-16'.
> >> Other encodings should be supported by another module (std.encoding?
> >> Is it still alive?).
> >>
> >> My proposal:
> >> 1. Add three aliased types.
> >>      alias immutable(char)* stringz;    // useful in Linux
> >>      alias immutable(wchar)* wstringz;  // useful in Windows
> >>      alias immutable(dchar)* dstringz;  //
> >> 2. Rename the current toStringz to toUTF8z, and add a deprecated alias
> >>    'toStringz' to keep compatibility.
> >>    (Adding toUTF32z to the std.string module would increase consistency.
> >>    A templated toUTFXXz family would be even better.)
> >> 3. Make std.conv.to support conversion from 'any string type' to a
> >>    (|w|d)stringz type (by using the toUTFXXz family).
> >>
> >> The main point is that we should make the aliased type names 'de facto'
> >> type names, like string, wstring, dstring. (Remember that the three
> >> string types are in fact aliased types.)
> >>
> >> We can treat the type name uint as 'unsigned int', because it is just
> >> a built-in type name!
> >>
> >> User-defined type names should usually be camel cased in D.
> >> So let's make them built-in! Then we can remove camel-cased
> >> names from our choices.
> >>
> >> I think this proposal is useful, keeps compatibility, and is
> >> consistent.
> >
> > From this and related discussions, it seems that the current plan is to
> > create a toUTFz function which is templated on the pointer type that you
> > want returned (char*, const(char)*, immutable(char)*, wchar*, etc.) and
> > which takes any string type. Then you can get a zero-terminated string
> > with whatever level of constness you want from any string. std.conv.to
> > would then be updated such that converting from any string to any
> > character pointer would call toUTFz. We may or may not have toStringz,
> > toWstringz, and toDstringz which use toUTFz.
> >
> > Regardless, I don't see much point in creating the types stringz,
> > wstringz, and dstringz. There's nothing which guarantees that they're
> > going to be zero-terminated, so they could be complete misnomers,
> > depending on how they're used, and they're specifically immutable
> > whereas you often need mutable zero-terminated strings. So, ultimately,
> > I don't think that they'd add much. We _do_ need better conversion
> > functions though.
> >
> > - Jonathan M Davis
>
> > There's nothing which guarantees that they're going to be
> > zero-terminated, so they could be complete misnomers, depending on how
> > they're used,
>
> Ah, you are right. I didn't think about that. I agree with you.
>
> > to create
> > a toUTFz function which is templated on the pointer type that you want
> > returned (char*, const(char)*, immutable(char)*, wchar*, etc.)
>
> I think the templated function toUTFz needs a default type inference
> feature, as follows:
> ----
> string s = "...";
> auto sz = toUTFz(s);
> static assert(is(typeof(sz) == immutable(char)*));
> ----
>
> Thanks for your explanation.

Oh, it may end up with a default template parameter based on what it's given. It hasn't been written yet. But the idea is to allow for creating any type of zero-terminated string (well, character pointer) from any type of string, including allowing for defining constness. Defining a default for the template parameter is definitely a good idea though.

- Jonathan M Davis
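
For illustration only, here is a minimal sketch of the default being discussed. It assumes a Phobos version in which std.utf.toUTFz exists and takes the target pointer type as an explicit template argument; the wrapper name toUTFzInferred is hypothetical and is not part of Phobos. It picks immutable(C)* from the character type C of whatever string it is given, which matches the static assert in the quoted example.

```d
import std.range.primitives : ElementEncodingType;
import std.traits : Unqual, isSomeString;
import std.utf : toUTFz;

// Hypothetical helper (not in Phobos): infer the pointer type from the
// input string's character type and default to immutable(C)*.
auto toUTFzInferred(S)(S s)
    if (isSomeString!S)
{
    alias C = Unqual!(ElementEncodingType!S); // char, wchar, or dchar
    return toUTFz!(immutable(C)*)(s);
}

void main()
{
    string s = "hello";
    auto sz = toUTFzInferred(s);
    static assert(is(typeof(sz) == immutable(char)*));
    assert(sz[5] == '\0'); // terminator follows the five characters

    wstring w = "hello"w;
    auto wz = toUTFzInferred(w);
    static assert(is(typeof(wz) == immutable(wchar)*));

    // Constness is still selectable explicitly when a mutable
    // zero-terminated copy is needed.
    char* mz = toUTFz!(char*)(s);
    assert(mz[0] == 'h' && mz[5] == '\0');
}
```

Keeping the explicit form available alongside the inferred default covers both points raised above: a convenient call for the common case and a mutable pointer when a C API requires one.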
