On Dec 16, 2006, at 4:55 AM, Bryan Jurish wrote:

morning,

On 2006-12-16 01:40:03, Mathieu Bouchard <[EMAIL PROTECTED]> appears to have written:
On Fri, 15 Dec 2006, Hans-Christoph Steiner wrote:
An advantage of the list-of-bytes approach is that, because each list element can hold a rather large integer, it can quickly be extended to work on lists of characters, given a [utf8decode] and a [utf8encode] to turn bytes into characters and back; it is also a method that is available now and reuses the existing list objects, and it supports \0 (NUL) characters. Disadvantages are that it takes more time to convert to C strings and back, it isn't readable as text in .pd files, it takes up to 4 times more space in .pd files and exactly 4 times more space in RAM (in the case that just iso-latin-1 is used), and you can't make lists of strings like that.
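
To make the bytes-to-characters step concrete, here is a rough sketch in C of what a [utf8decode] object might do internally. The thread does not specify such an object, so the function name utf8_decode_floats and its interface are hypothetical. It assumes every input float holds an integer in 0..255, as in a list of bytes, and it does not reject overlong forms or surrogates.

```c
#include <stddef.h>

/* Hypothetical helper: decode UTF-8 bytes (one byte per float, as in a
 * list of bytes) into Unicode code points (one code point per float).
 * Assumes every input value is an integer in 0..255.
 * out must have room for n entries; returns the number written.
 * Malformed or truncated sequences produce U+FFFD. */
size_t utf8_decode_floats(const float *in, size_t n, float *out)
{
    size_t i = 0, k = 0;

    while (i < n) {
        unsigned b = (unsigned)in[i++] & 0xFFu;
        unsigned long cp;
        int extra;

        if (b < 0x80u)                 { cp = b;         extra = 0; }
        else if ((b & 0xE0u) == 0xC0u) { cp = b & 0x1Fu; extra = 1; }
        else if ((b & 0xF0u) == 0xE0u) { cp = b & 0x0Fu; extra = 2; }
        else if ((b & 0xF8u) == 0xF0u) { cp = b & 0x07u; extra = 3; }
        else                           { cp = 0xFFFDul;  extra = 0; }

        /* Consume the continuation bytes (10xxxxxx), 6 bits each. */
        while (extra > 0) {
            unsigned c;
            if (i >= n) { cp = 0xFFFDul; break; }
            c = (unsigned)in[i] & 0xFFu;
            if ((c & 0xC0u) != 0x80u) { cp = 0xFFFDul; break; }
            cp = (cp << 6) | (c & 0x3Fu);
            i++;
            extra--;
        }
        out[k++] = (float)cp;
    }
    return k;
}
```

Every Unicode code point is below 2^24, so each one fits exactly in a 32-bit IEEE float; that is what lets a list element stand for a whole character rather than a single byte.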

i count (sizeof(int)+sizeof(float)-1)*strlen(message) wasted bytes per string object, not counting the selector. as i think we've discussed before, using ieee floats, which should be able to losslessly encode a 24 bit integer, that can be tweaked down to (sizeof(int)+sizeof(float)-1)*strlen(message)/3 on average, but on my system (32 bit floats), that still amounts to one wasted byte per character for the representation, and it's hellishly cryptic to boot.
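
The overhead count and the 24-bit packing idea above can be sketched in C as follows. This is only an illustration: it takes the message's model of a list element as an int tag plus a float value (a simplification, not the actual Pd atom layout), it assumes 32-bit IEEE 754 floats, and the names wasted_one_per_atom, pack3 and unpack3 are made up for the example.

```c
#include <stddef.h>

/* Wasted bytes when each character occupies one (int tag + float) element,
 * using the formula given in the message. */
size_t wasted_one_per_atom(size_t len)
{
    return (sizeof(int) + sizeof(float) - 1) * len;
}

/* Pack three bytes into one float. The result is below 2^24, so a 32-bit
 * IEEE 754 float (24-bit significand) represents it exactly. */
float pack3(unsigned char a, unsigned char b, unsigned char c)
{
    unsigned long v = ((unsigned long)a << 16)
                    | ((unsigned long)b << 8)
                    | (unsigned long)c;
    return (float)v;
}

/* Recover the three bytes from a float produced by pack3. */
void unpack3(float f, unsigned char out[3])
{
    unsigned long v = (unsigned long)f;
    out[0] = (unsigned char)((v >> 16) & 0xFFu);
    out[1] = (unsigned char)((v >> 8) & 0xFFu);
    out[2] = (unsigned char)(v & 0xFFu);
}
```

For example, pack3('a', 'b', 'c') yields 6382179, which is not something anyone would recognise as "abc" in a patch; that is the "cryptic" part of the trade-off.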

(By the time we can have real strings, we can have nested-lists, and the other way around, because they'd use the same mechanisms. whether it's better to make them two types or one type, is a good question.)

... but then again, what else are ascii 0x1c-0x1f (28-31 = {fs,gs,rs,us}) for? it's another ugly hack, would reserve some of the ascii range, and would require additional parsing objects (potentially constructable with [list]), but it's a possibility, should anyone actually need nested lists as strings...
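
As a sketch of the separator idea, a parsing object could split a list of bytes into fields at one of the reserved control characters, for instance US (0x1f = 31). The C function below is a hypothetical illustration of that splitting step only; the name split_fields and its interface are assumptions, not an existing object.

```c
#include <stddef.h>

/* Hypothetical helper: split a list of bytes (one per float) at each
 * occurrence of sep, e.g. 31 for ASCII US. Records the start index and
 * length of up to max_fields fields, and returns the total number of
 * fields found (which may exceed max_fields). An empty input counts as
 * one empty field. */
size_t split_fields(const float *in, size_t n, float sep,
                    size_t *starts, size_t *lens, size_t max_fields)
{
    size_t count = 0, start = 0, i;

    for (i = 0; i <= n; i++) {
        if (i == n || in[i] == sep) {
            if (count < max_fields) {
                starts[count] = start;
                lens[count] = i - start;
            }
            count++;
            start = i + 1;
        }
    }
    return count;
}
```

Nesting deeper than one level would need the other separators (FS, GS, RS) applied in turn, and any payload byte equal to a separator would have to be escaped or forbidden, which is the reserved-range cost mentioned above.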

please don't get me wrong: i'm all in favor of "real" strings, nested lists, and associative arrays - i wrote [pdstring] because i needed to send some generated text over OSC to someone who could only interpret ascii values: i'm glad if it's helpful to anyone besides myself, and i don't see much difficulty in adding support for low-level c-type string operations ([toupper], [tolower], at some later point maybe even regexes), but i can't bring myself to believe that the list-of-bytes approach is really the "right" way to do it, although i don't have a better idea at the moment...

One advantage of this approach is that many C string functions like toupper, tolower, strcat, strcmp, etc. would be pretty easy to implement in Pd, rather than in C. A regexp object in C would be pretty straightforward.
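
For comparison, here is roughly what the per-element logic of two of those operations looks like when written in C over a list of bytes stored as floats. The names bytes_toupper and bytes_compare are hypothetical, and the sketch assumes the default "C" locale, so only ASCII letters change case.

```c
#include <ctype.h>
#include <stddef.h>

/* Upper-case a list of bytes in place (one byte per float).
 * Values outside 0..255 are left untouched. */
void bytes_toupper(float *bytes, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        int c = (int)bytes[i];
        if (c >= 0 && c <= 255)
            bytes[i] = (float)toupper(c);
    }
}

/* strcmp-like comparison of two byte lists: returns -1, 0 or 1.
 * Lengths are explicit, so embedded NUL bytes are compared like any
 * other value instead of ending the string. */
int bytes_compare(const float *a, size_t na, const float *b, size_t nb)
{
    size_t i, n = na < nb ? na : nb;

    for (i = 0; i < n; i++) {
        if (a[i] != b[i])
            return a[i] < b[i] ? -1 : 1;
    }
    if (na == nb)
        return 0;
    return na < nb ? -1 : 1;
}
```

Both are simple element-wise loops, which is why the same behaviour should be easy to patch together from existing list objects.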

How about using a selector "string" for these lists? I suppose that could cause mayhem since it would make the list into a selector series and run into all the vagaries of handling them.

.hc
------------------------------------------------------------------------

Man has survived hitherto because he was too ignorant to know how to realize his wishes. Now that he can realize them, he must either change them, or perish. -William Carlos Williams



_______________________________________________
PD-dev mailing list
PD-dev@iem.at
http://lists.puredata.info/listinfo/pd-dev
