On Mon, 10 Mar 2014 14:05:03 +0000, "Andrea Fontana" <[email protected]> wrote:
> In Italian we need Unicode too. We have several accented letters
> and often programming languages don't handle UTF-8 and other
> encodings so well...
>
> In D I never had any problem with this, and I work a lot on text
> processing.
>
> So my question: is there any problem I'm missing in D with
> Unicode support, or is it just a performance problem on algorithms?

The only real problem I've seen mentioned in this thread, apart from potential performance issues, is that indexing and slicing are done in code units, while range algorithms such as countUntil count code points. I think this:

    auto index = countUntil(...);
    auto slice = str[0 .. index];

is really the only problem with the current implementation: the index is a code point count, but the slice interprets it as a code unit offset.

If we could start from scratch, I'd say we keep operating on code points by default but don't make strings arrays of char/wchar/dchar. Instead they should be special types which do all operations (especially indexing and slicing) on code points. This would be as safe as the current implementation and always consistent, but probably even slower in some cases. Then offer some nice way to get at the raw data for algorithms which can deal with it.

However, I think it's too late to make these changes.
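
To make the countUntil pitfall concrete, here is a minimal sketch. It assumes a current Phobos (std.algorithm.searching, std.utf.toUTFindex, std.string.indexOf); the sample string and the helper name prefixBefore are mine, purely for illustration, and not something proposed in this thread.

```d
import std.algorithm.searching : countUntil;
import std.exception : assertThrown;
import std.string : indexOf;
import std.utf : UTFException, toUTFindex, validate;

// Hypothetical helper (not from the thread): the part of `s` before `needle`,
// or all of `s` when `needle` does not occur.
string prefixBefore(string s, string needle)
{
    immutable n = countUntil(s, needle);   // counted in code points
    if (n < 0)
        return s;
    // Translate the code point count into a code unit offset before slicing.
    return s[0 .. toUTFindex(s, n)];
}

void main()
{
    string str = "perch\u00E9 no";  // U+00E9 occupies two UTF-8 code units

    auto index = countUntil(str, " ");
    assert(index == 6);             // six code points precede the space
    assert(str.indexOf(" ") == 7);  // but seven code units do

    auto broken = str[0 .. index];  // ends in the middle of U+00E9
    assertThrown!UTFException(validate(broken));

    // Two ways to get a valid slice: convert the index, or search in code units.
    assert(prefixBefore(str, " ") == "perch\u00E9");
    assert(str[0 .. str.indexOf(" ")] == "perch\u00E9");
}
```

With pure ASCII input both counts agree, which is probably why this goes unnoticed so often. Whether converting with toUTFindex or searching with indexOf is preferable depends on what the index is used for afterwards.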
