It seems like the grand plan for Unicode is to go to UTF-32 (4 bytes), and UTF-16 (2 bytes) is already the current standard for most OSes:
http://en.wikipedia.org/wiki/UTF-16#Use_in_major_operating_systems_and_environments

.hc

On Nov 14, 2007, at 3:18 AM, Bryan Jurish wrote:

> morning all,
>
> One (potential) problem with strings-as-arrays (as has been pointed out
> before) is a gross waste of space: for IEEE-754 floats, sizeof(float)=4,
> which is 3 bytes wasted for a single 8-bit character. We could do some
> devious byte-packing and pointer-casting in one or more externals to
> attenuate the problems, but that gets very cryptic very quickly (I
> actually tried this with [pdstring] (which wastes even *more* memory by
> using one atom per character), but decided against it in the interests
> of clarity and flexibility)... I think it's definitely worth a go
> however...
>
> marmosets,
> Bryan
>
> On 2007-11-14 08:13:03, Hans-Christoph Steiner <[EMAIL PROTECTED]>
> appears to have written:
>> Using arrays as strings is an interesting idea. I don't think
>> non-ascii charsets should be too big a deal, they are decently
>> supported right now, without even trying :). The Pd floats should
>> store UTF-16 fine, which really covers basically everything. By the
>> time UTF-32 is used much, Pd will be using 64-bit floats.
>>
>> .hc
>>
>> On Nov 14, 2007, at 12:43 AM, Miller Puckette wrote:
>>
>>> Hi all,
>>>
>>> I don't have answers to all these, but I'm sure that adding a
>>> string type to Pd isn't the right way to handle these problems.
>>> But specifically:
>>>
>>> 1. spaces in symbols are a parsing/formatting problem, not a data
>>> type problem; 2. use arrays as strings as I proposed; 3. I have to
>>> think about that one some more (!) and 4. one thing is dealing with
>>> non-ascii character sets, although there are likely to be many more
>>> problems to address.
>>>
>>> On Tue, Nov 13, 2007 at 11:57:06PM -0500, Chris McCormick wrote:
>>>> Hi,
>>>>
>>>> I have deleted Miller's reply where he said that he's not that
>>>> interested in adding a string type to Pd, but I'd like to ask him
>>>> a couple of questions regarding that response, if that's ok.
>>>>
>>>> 1. How do you propose to solve the 'spaces in file path' issue
>>>> without a string type? Or are you content with that restriction?
>>>>
>>>> 2. How do you suggest that people deal with the symbol table
>>>> pollution issue mentioned before on this list, when they are doing
>>>> operations processing lots and lots of symbol-strings in Pd? Let
>>>> me know if you want more information about this issue.
>>>>
>>>> 3. Will a [symbol2list] ever make it into Pd canonical so that
>>>> people can split long symbols on a character, like the zexy
>>>> external that does this? It seems strange that you can concatenate
>>>> symbols, but not split them apart again.
>>>>
>>>> 4. Can anyone else help me with a concise summary of other
>>>> string/Pd issues I haven't thought of?
>>>>
>>>> Thanks for taking the time to read and reply.
>>>>
>>>> Best,
>>>>
>>>> Chris.
>
> --
> Bryan Jurish          "There is *always* one more bug."
> [EMAIL PROTECTED]     -Lubarsky's Law of Cybernetic Entomology

----------------------------------------------------------------------------

Mistrust authority - promote decentralization.  - the hacker ethic

_______________________________________________
PD-dev mailing list
[email protected]
http://lists.puredata.info/listinfo/pd-dev
