morning,
On 2006-12-16 01:40:03, Mathieu Bouchard <[EMAIL PROTECTED]> appears to
have written:
On Fri, 15 Dec 2006, Hans-Christoph Steiner wrote:
An advantage using the list-of-bytes approach is that because each
character can be represented by a rather large integer, it can be
extended to work on lists of characters quickly, if there is a
[utf8decode] and [utf8encode] to turn bytes into characters and back;
also it's a method that is available now and reuses the existing list
objects; and it's a method that supports \0 (NUL) characters.
Disadvantages are that it takes more time to convert to C strings and
back, it isn't readable as text in .pd files, it takes up to 4 times
more space to represent in .pd files and exactly 4 times more space in
RAM (in the case that just iso-latin-1 is used), and also that you
can't make lists of strings like that.
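
To make the bytes-to-characters step concrete, here is a minimal Python
sketch (not Pd code). The two function names are only hypothetical
stand-ins for the proposed [utf8decode] and [utf8encode] objects, and
plain Python lists stand in for Pd lists. It also shows a NUL byte
surviving the round trip.

```python
def utf8decode(byte_list):
    """Turn a list of byte values (0-255) into a list of Unicode code points."""
    text = bytes(byte_list).decode("utf-8")
    return [ord(ch) for ch in text]


def utf8encode(codepoints):
    """Turn a list of Unicode code points back into a list of UTF-8 byte values."""
    text = "".join(chr(cp) for cp in codepoints)
    return list(text.encode("utf-8"))


# "h", "é" (two bytes in UTF-8), NUL, "!"
byte_list = [104, 195, 169, 0, 33]
chars = utf8decode(byte_list)
print(chars)                           # [104, 233, 0, 33]
print(utf8encode(chars) == byte_list)  # True
```

Five bytes become four characters here, since the two-byte sequence
195 169 collapses into the single code point 233.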
i count (sizeof(int)+sizeof(float)-1)*strlen(message) wasted bytes per
string object, not counting the selector. as i think we've discussed
before, using ieee floats, which should be able to losslessly encode a
24 bit integer, that can be tweaked down to
(sizeof(int)+sizeof(float)-1)*strlen(message)/3 on average, but on my
system (32 bit floats), that still amounts to one wasted byte per
character for the representation, and it's hellishly cryptic to boot.
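
As a rough check on the arithmetic above, here is a small Python sketch.
It assumes each list element costs one C int (a type tag) plus one C
float (the value), which is what the sizeof(int)+sizeof(float) term
suggests; the helper names are made up for illustration. It counts the
overhead directly, so the packed figure need not match the
per-character estimate given in the text, and it checks that three
packed bytes do survive a 32-bit float.

```python
import struct

INT_SIZE = struct.calcsize("i")    # assumed type tag: one C int
FLOAT_SIZE = struct.calcsize("f")  # assumed value: one C float


def wasted_bytes(n_chars, chars_per_atom):
    """Bytes of list storage beyond the n_chars payload bytes themselves."""
    n_atoms = -(-n_chars // chars_per_atom)  # ceiling division
    return n_atoms * (INT_SIZE + FLOAT_SIZE) - n_chars


def pack3(b0, b1, b2):
    """Pack three byte values into one integer below 2**24."""
    return (b0 << 16) | (b1 << 8) | b2


def survives_float32(n):
    """True if integer n round-trips through a 32-bit IEEE float unchanged."""
    return struct.unpack("f", struct.pack("f", float(n)))[0] == n


message = b"hello!"
print(wasted_bytes(len(message), 1))  # 42 if int and float are 4 bytes each
print(wasted_bytes(len(message), 3))  # 10 if int and float are 4 bytes each
print(survives_float32(pack3(255, 255, 255)))  # True
print(survives_float32(2**24 + 1))             # False
```

With one character per element the count is exactly
(sizeof(int)+sizeof(float)-1)*strlen(message); the last line shows why
24 bits is the limit for lossless packing into a 32-bit float.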
(By the time we can have real strings, we can have nested-lists, and the
other way around, because they'd use the same mechanisms. whether it's
better to make them two types or one type, is a good question.)
... but then again, what else are ascii 0x1c-0x1f (28-31 =
{fs,gs,rs,us}) for? it's another ugly hack, would reserve some of the
ascii range, and would require additional parsing objects (potentially
constructable with [list]), but it's a possibility, should anyone
actually need nested lists as strings...
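
A minimal Python sketch of what such a separator hack might look like,
purely as an illustration: it assumes us (0x1f) separates strings within
a sublist and rs (0x1e) separates sublists, and that the strings
themselves never contain those bytes. The function names and this
particular assignment of separators are hypothetical, not anything
[pdstring] does.

```python
US = 0x1F  # unit separator: between strings inside one sublist (assumed role)
RS = 0x1E  # record separator: between sublists (assumed role)


def split_on(values, sep):
    """Split a flat list of byte values at every occurrence of sep."""
    parts, current = [], []
    for v in values:
        if v == sep:
            parts.append(current)
            current = []
        else:
            current.append(v)
    parts.append(current)
    return parts


def encode(nested):
    """Flatten a list of lists of ASCII strings into one list of byte values."""
    flat = []
    for i, record in enumerate(nested):
        if i:
            flat.append(RS)
        for j, s in enumerate(record):
            if j:
                flat.append(US)
            flat.extend(s.encode("ascii"))
    return flat


def decode(flat):
    """Rebuild the list of lists of strings from the flat byte list."""
    return [[bytes(unit).decode("ascii") for unit in split_on(record, US)]
            for record in split_on(flat, RS)]


nested = [["foo", "bar"], ["baz"]]
flat = encode(nested)
print(flat)
print(decode(flat) == nested)  # True
```

The cost is as described: the four control codes become reserved, and
every consumer has to run the extra parsing step.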
please don't get me wrong: i'm all in favor of "real" strings, nested
lists, and associative arrays - i wrote [pdstring] because i needed to
send some generated text over OSC to someone who could only interpret
ascii values: i'm glad if it's helpful to anyone besides myself, and i
don't see much difficulty in adding support for low-level c-type string
operations ([toupper], [tolower], at some later point maybe even
regexes), but i can't bring myself to believe that the list-of-bytes
approach is really the "right" way to do it, although i don't have a
better idea at the moment...
marmosets,
Bryan
--
Bryan Jurish "There is *always* one more bug."
[EMAIL PROTECTED] -Lubarsky's Law of Cybernetic Entomology
_______________________________________________
PD-dev mailing list
http://lists.puredata.info/listinfo/pd-dev