On Sunday 12 September 2010 19:22:02 bearophile wrote:
> Andrei Alexandrescu:
> > No, you end up having string-processing code dealing with ranges of
> > dchar.
>
> Well, in several situations it's better to produce a real string/dstring.
> Even in Haskell, that is designed to manage lazy computation well, you
> sometimes create eager lists/arrays to simplify the types or the code or
> to make the code more deterministic.

Personally, I've had to use strict functions rather than lazy ones in Haskell, primarily to save memory by forcing the program to actually do the computations rather than deferring them and piling up the whole list of pending operations in memory. When working on my thesis, I had a program that ran out of memory (all 4 GB of RAM and 6 GB of swap) because it didn't process _any_ of the files I gave it until it had gotten the last one. I had to make it process each file and save the result before moving on to the next file, rather than processing them all and then saving the results.

> > If you want to keep the
> > comparison with Python complete, Python's support for Unicode also needs
> > to be part of the discussion.
>
> Right. My code was written in Python 2.x. In Python 3.x the situation is
> different, all strings are Unicode on default (they are all UTF 16 or UTF
> 32 according to the way you have compiled CPython) (and there is a
> built-in bytearray, that is an array of bytes that in some situations is
> seen as an ASCII string). So in Python it's like using dstrings everywhere
> (in Python there's no char type, it's a string of length 1) or using lazy
> generators of them.

Well, then in comparing Python 3 with D, it seems you wouldn't really lose anything by using dstrings everywhere. Sure, it's nice to be able to save space by using string, but if the comparison is between Python and D and you end up using UTF-32 in both, then it doesn't seem to me to be that big a deal when porting code. Now, comparing Python 2 and D may be a different issue, but it sounds like Python 2 strings aren't Unicode, which could be problematic.

The issues with UTF-8 vs UTF-32 and random access are just a natural side effect of having all strings be Unicode. And honestly, I _really_ don't want non-Unicode strings to be at all normal in D. The fact that D forces Unicode is a _good_ thing.

- Jonathan M Davis
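
As an illustration of the UTF-8 vs UTF-32 random-access trade-off mentioned above, here is a minimal Python 3 sketch. It is not from the original thread; the sample text and the helper names code_point_at_utf32 and code_point_at_utf8 are made up for the example. It encodes one string both ways and looks up the n-th code point in each: with UTF-32 the byte offset is simply 4 * n, while with UTF-8 the code has to scan for code-point boundaries first.

```python
# Illustrative sketch only: sample text and helper names are hypothetical.
text = "na\u00efve caf\u00e9"      # 10 code points, two of them non-ASCII
utf8 = text.encode("utf-8")        # variable width: 1 to 4 bytes per code point
utf32 = text.encode("utf-32-le")   # fixed width: always 4 bytes per code point


def code_point_at_utf32(data: bytes, index: int) -> str:
    """Constant-time lookup: the i-th code point starts at byte 4 * i."""
    return data[4 * index:4 * index + 4].decode("utf-32-le")


def code_point_at_utf8(data: bytes, index: int) -> str:
    """Linear-time lookup: scan the bytes to find where code points start."""
    # A byte of the form 10xxxxxx is a continuation byte, not a start byte.
    starts = [i for i, b in enumerate(data) if (b & 0xC0) != 0x80]
    begin = starts[index]
    end = starts[index + 1] if index + 1 < len(starts) else len(data)
    return data[begin:end].decode("utf-8")


# UTF-8 is more compact here, but only UTF-32 allows direct indexing.
print(len(text), len(utf8), len(utf32))
print(code_point_at_utf32(utf32, 2), code_point_at_utf8(utf8, 2))
```

In D terms, this is roughly the difference between indexing a dstring directly and having to decode a string from the front to reach the n-th character, which is the space-versus-random-access trade-off discussed above.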
