Re: [PD-dev] strings

Bryan Jurish Sat, 16 Dec 2006 06:09:30 -0800

morning,

On 2006-12-16 01:40:03, Mathieu Bouchard <[EMAIL PROTECTED]> appears to have written:

On Fri, 15 Dec 2006, Hans-Christoph Steiner wrote:
An advantage using the list-of-bytes approach is that because each character can be represented by a rather large integer, it can be extended to work on lists-of-characters meaning quickly, if there is a [utf8decode] and [utf8encode] to turn bytes into characters and back; also it's a method that is available now and reuses the existing list objects; and it's a method that supports \0 (NUL) characters.
Disadvantages are that it takes more time to convert to C strings and back, it takes more space in .pd files, it isn't readable as text in .pd files, it takes up to 4 times more space to represent in .pd files, and exactly 4 times more space in RAM (in the case that just iso-latin-1 is used), and also that you can't make lists of strings like that.
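To make the bytes-to-characters step concrete, here is a minimal sketch in C of what a [utf8decode]-style conversion might do. The function name `utf8_decode` and its signature are hypothetical and are not taken from any existing Pd object; the sketch is also simplified in that it does not reject overlong encodings or surrogate code points.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: decode n UTF-8 bytes into code points, one number
 * per character. Returns the number of code points written to out, or
 * (size_t)-1 if the input is malformed. out must have room for n entries.
 * Simplified: overlong encodings and surrogates are not rejected. */
size_t utf8_decode(const unsigned char *in, size_t n, unsigned long *out)
{
    size_t i = 0, count = 0;

    assert(in != NULL || n == 0);
    while (i < n) {
        unsigned char b = in[i++];
        unsigned long cp;
        int extra; /* number of continuation bytes still expected */

        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; }
        else return (size_t)-1;      /* stray continuation or invalid lead byte */

        while (extra-- > 0) {
            /* every continuation byte must look like 10xxxxxx */
            if (i >= n || (in[i] & 0xC0) != 0x80)
                return (size_t)-1;
            cp = (cp << 6) | (unsigned long)(in[i++] & 0x3F);
        }
        out[count++] = cp;
    }
    return count;
}
```

Because the input is a counted byte array rather than a C string, a NUL byte is decoded like any other character, which matches the \0 point made above.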

i count (sizeof(int)+sizeof(float)-1)*strlen(message) wasted bytes per string object, not counting the selector. as i think we've discussed before, using ieee floats, which should be able to losslessly encode a 24 bit integer, that can be tweaked down to (sizeof(int)+sizeof(float)-1)*strlen(message)/3 on average, but on my system (32 bit floats), that still amounts to one wasted byte per character for the representation, and it's hellishly cryptic to boot.
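As a rough illustration of the arithmetic above, the following C sketch computes the quoted overhead formula and shows how three bytes could be packed into one 32-bit IEEE float without loss, since every integer below 2^24 is exactly representable. The names `wasted_bytes`, `pack3` and `unpack3` are hypothetical, and the big-endian byte order inside the packed value is an arbitrary choice for the example, not something specified in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Overhead formula from the message: each character costs one int
 * (type tag) plus one float, of which only one byte is payload. */
size_t wasted_bytes(const char *message)
{
    return (sizeof(int) + sizeof(float) - 1) * strlen(message);
}

/* Hypothetical packing: store up to 3 bytes in one float.
 * The resulting integer is below 2^24, so the float holds it exactly. */
float pack3(const unsigned char *bytes, size_t n)
{
    uint32_t v = 0;

    assert(n <= 3);
    for (size_t i = 0; i < n; i++)
        v = (v << 8) | bytes[i];   /* first byte ends up most significant */
    return (float)v;
}

/* Inverse of pack3: recover the n bytes that were packed into f. */
void unpack3(float f, unsigned char *out, size_t n)
{
    uint32_t v = (uint32_t)f;

    assert(n <= 3);
    for (size_t i = n; i > 0; i--) {
        out[i - 1] = (unsigned char)(v & 0xFF);
        v >>= 8;
    }
}
```

Note that the unpacking side has to know how many bytes the last float of a message carries, which is one more piece of bookkeeping that makes the packed form harder to read.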

(By the time we can have real strings, we can have nested-lists, and the other way around, because they'd use the same mechanisms. whether it's better to make them two types or one type, is a good question.)

... but then again, what else are ascii 0x1c-0x1f (28-31 = {fs,gs,rs,us}) for? it's another ugly hack, would reserve some of the ascii range, and would require additional parsing objects (potentially constructable with [list]), but it's a possibility, should anyone actually need nested lists as strings...
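A minimal sketch of the separator idea, assuming the unit separator (0x1f) is reserved to delimit sub-lists inside a flat list of bytes. The function `split_on_sep` and the macro `ASCII_US` are hypothetical names for illustration; the "additional parsing objects" mentioned above would presumably do something comparable on Pd lists.

```c
#include <assert.h>
#include <stddef.h>

#define ASCII_US 0x1f /* ascii unit separator */

/* Hypothetical helper: find the fields of buf delimited by sep.
 * Stores the start offset and length of up to max_fields fields and
 * returns the total number of fields found, which may exceed max_fields.
 * An empty buffer counts as a single empty field. */
size_t split_on_sep(const unsigned char *buf, size_t n, unsigned char sep,
                    size_t *starts, size_t *lens, size_t max_fields)
{
    size_t count = 0, start = 0;

    assert(starts != NULL && lens != NULL);
    for (size_t i = 0; i <= n; i++) {
        /* a field ends at each separator and at the end of the buffer */
        if (i == n || buf[i] == sep) {
            if (count < max_fields) {
                starts[count] = start;
                lens[count] = i - start;
            }
            count++;
            start = i + 1;
        }
    }
    return count;
}
```

Nesting deeper than one level would need the other reserved codes (fs, gs, rs) as outer delimiters, applied in the same way from the outermost level inwards.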

please don't get me wrong: i'm all in favor of "real" strings, nested lists, and associative arrays - i wrote [pdstring] because i needed to send some generated text over OSC to someone who could only interpret ascii values: i'm glad if it's helpful to anyone besides myself, and i don't see much difficulty in adding support for low-level c-type string operations ([toupper], [tolower], at some later point maybe even regexes), but i can't bring myself to believe that the list-of-bytes approach is really the "right" way to do it, although i don't have a better idea at the moment...


marmosets,
        Bryan

--
Bryan Jurish                           "There is *always* one more bug."
[EMAIL PROTECTED]      -Lubarsky's Law of Cybernetic Entomology

_______________________________________________
PD-dev mailing list
[email protected]
http://lists.puredata.info/listinfo/pd-dev
