"Nick Sabalausky" <a...@a.a> wrote in message news:humfrk$2g...@digitalmars.com... > "Rainer Deyke" <rain...@eldwood.com> wrote in message > news:humes8$s...@digitalmars.com... >> On 6/8/2010 13:57, bearophile wrote: >>> I hope we'll soon have computers with 200+ GB of RAM where using >>> strings that use less than 32-bit chars is in most cases a premature >>> optimization (like today is often a silly optimization to use arrays >>> of 16-bit ints instead of 32-bit or 64-bit ints. Only special >>> situations found with the profiler can justify the use of arrays of >>> shorts in a low level language). >> >> Off-topic, but I don't need a profiler to tell me that my 1024x1024x1024 >> arrays should use shorts instead of ints. And even when 200GB becomes >> common, I'd still rather not waste that memory by using twice as much >> space as I have to just because I can. >> >> > > I think he was just musing that it would be nice to be able to ignore > multiple encodings and multiple-code-units, and get back to something much > closer to the blissful simplicity of ASCII. On that particular point, I > concur ;) >
Keep in mind, too, that for an English-language app (and there are plenty), even plain ASCII still wastes space, since you usually only need the 26 letters, 10 digits, a few whitespace characters, and a handful of punctuation. You could probably fit all of that in 6 bits per character (that's 64 possible values), or less if you're ballsy enough to use Huffman encoding internally. Yeah, there are twice as many letters if you count uppercase/lowercase, but random casing is rare, so there are tricks you can use to just stick with 26 plus maybe a few special control characters.

But, of course, nobody actually does any of that, because with the amount of memory we have, and the amount of memory already used by other parts of a program, the savings wouldn't be worth the bother. But I agree with your point too. Just saying.
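Just for kicks, here's a rough D sketch of what I mean by 6-bit packing (off the top of my head and untested; the 64-symbol alphabet and the names here are just made up for illustration, not anything anyone actually uses):

import std.string : indexOf;

// Hypothetical 64-symbol alphabet: space, a-z, A-Z, digits, and a period.
enum alphabet = " abcdefghijklmnopqrstuvwxyz"
              ~ "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
              ~ "0123456789.";

// Pack a string over that alphabet into 6 bits per character.
ubyte[] pack6(string s)
{
    ubyte[] result;
    uint bitBuffer = 0;   // accumulated bits, low bitCount bits are valid
    int bitCount = 0;
    foreach (c; s)
    {
        auto idx = alphabet.indexOf(c);
        assert(idx >= 0, "character not in the 64-symbol alphabet");
        bitBuffer = (bitBuffer << 6) | cast(uint) idx;
        bitCount += 6;
        while (bitCount >= 8)   // emit full bytes as they become available
        {
            bitCount -= 8;
            result ~= cast(ubyte)(bitBuffer >> bitCount);
        }
    }
    if (bitCount > 0)   // flush the leftover bits, zero-padded at the bottom
        result ~= cast(ubyte)(bitBuffer << (8 - bitCount));
    return result;
}

void main()
{
    import std.stdio : writefln;
    auto packed = pack6("Hello world");
    writefln("%s chars -> %s bytes", "Hello world".length, packed.length);
}

And that's kind of the point: "Hello world" is 11 characters, 66 bits, so you end up with 9 bytes instead of 11. The win is maybe 25% over ASCII, and you pay for it with bit-twiddling on every access, which is why nobody bothers unless they're storing mountains of text.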