Re: [PD-dev] strings

2006-12-19 Thread Tim Blechmann
On Mon, 2006-12-18 at 12:46 -0500, Mathieu Bouchard wrote:
  of course the only real way to vote for this would be to write the code -
  i think i'll wait for PNPD instead.. :)
 
  pnpd is currently supporting both hashed symbols and full-featured 
  string ;) however, there are no objects for handling strings, yet
 
 Are there any implicit casts between strings and symbols? 

i haven't decided, yet, but i guess no ... 

--
[EMAIL PROTECTED] ICQ: 96771783
http://www.mokabar.tk

The only people for me are the mad ones, the ones who are mad to live,
mad to talk, mad to be saved, desirous of everything at the same time,
the ones who never yawn or say a commonplace thing, but burn, burn,
burn, like fabulous yellow roman candles exploding like spiders across
the stars and in the middle you see the blue centerlight pop and
everybody goes Awww!
  Jack Kerouac


___
PD-dev mailing list
PD-dev@iem.at
http://lists.puredata.info/listinfo/pd-dev


Re: [PD-dev] strings

2006-12-18 Thread Tim Blechmann
 of course the only real way to vote for this would be to write the code -
 i think i'll wait for PNPD instead.. :) 

pnpd is currently supporting both hashed symbols and full-featured
string ;)
however, there are no objects for handling strings, yet 

t

--
[EMAIL PROTECTED] ICQ: 96771783
http://www.mokabar.tk

All we composers really have to work with is time and sound - and
sometimes I'm not even sure about sound.
  Morton Feldman




Re: [PD-dev] strings

2006-12-18 Thread Mathieu Bouchard

On Sun, 17 Dec 2006, Martin Peach wrote:

but in the long term it would be best to just use long lengths for when 
we all have teraflop laptops:


People have been using strings bigger than 64k for many years in almost 
any other language. It doesn't have anything to do with the teraflops, 
really.


 _ _ __ ___ _  _ _ ...
| Mathieu Bouchard - tél:+1.514.383.3801 - http://artengine.ca/matju
| Freelance Digital Arts Engineer, Montréal QC Canada


Re: [PD-dev] strings

2006-12-18 Thread Mathieu Bouchard

On Mon, 18 Dec 2006, Hans-Christoph Steiner wrote:

On Dec 17, 2006, at 1:36 AM, Mathieu Bouchard wrote:
That's aiming low. Why shouldn't there be any automatic casts between the 
two?


Automatic type conversion sounds like a really bad idea if the language only 
partially supports it.


If that's the case then pd is a really bad idea.

It's not possible to typecast any value of a type to any value of another 
type, all of the time. That's even true in the most typecast-frenzy 
languages ever.


There's no such requirement that implicit casts have to be impossibly well 
supported in order to be a good idea. You're just dismissing all forms of 
implicit casts as being bad ideas.



Pd is strongly typed, so what Martin says is definitely appropriate.


Non-sequitur, there are languages that have quite strict and elaborate 
type checking and yet which support implicit casts. For example, C++.


And then, in many cases, languages can become less strictly checked 
without any problems, as long as nothing actually relies on type 
violations. Because pd doesn't have any error handling (in the sense of pd 
patches being able to figure out their own problems), if any patch doesn't 
spit out any error messages, it would run the same if there were more 
implicit casts made, because the patches would never trigger those 
implicit casts anyway.


Perl is the opposite, everything can be automatically cast, so there it 
makes sense.


No, in Perl there's no way that you can cast something to a pointer. You 
can use the backslash operator to make a pointer to anything, but 
that's not a cast-to-pointer, because if you use it on a pointer, you 
don't get the same pointer, you get a new pointer.


So... anyone want to code up some of these ideas?  We could try them out 
in the next Pd-extended.


Martin's solution, my solution, or your solution?

 _ _ __ ___ _  _ _ ...
| Mathieu Bouchard - tél:+1.514.383.3801 - http://artengine.ca/matju
| Freelance Digital Arts Engineer, Montréal QC Canada


Re: [PD-dev] strings

2006-12-18 Thread Hans-Christoph Steiner


On Dec 18, 2006, at 1:23 AM, carmen wrote:

Automatic type conversion sounds like a really bad idea if the  
language only partially supports it.  Pd is strongly typed


is it? it mainly has numbers that occasionally look like symbols,  
and symbols that more than occasionally look like lists and/or  
strings..


There are set rules which define what is a float, symbol, or  
pointer.  You cannot change that type, often even with a special  
method.  Ever tried to turn a float into a symbol?  Doesn't really  
work, only partially.


.hc


, so what Martin says is definitely appropriate.
 Perl is the opposite, everything can be automatically cast, so  
there it makes sense.


it is definitely a design decision which way to go. could PD  
flexibly support both at once? or does there need to be an OCaml  
edition, and a Perl edition?








Mistrust authority - promote decentralization.  - the hacker ethic





Re: Re: [PD-dev] strings

2006-12-18 Thread Mathieu Bouchard

On Mon, 18 Dec 2006, [EMAIL PROTECTED] wrote:

Mathieu Bouchard [EMAIL PROTECTED] wrote:

I have no clue what you're talking about: how mangled would they be? i
don't plan any mangling to happen, except for the presence of \0
characters.

Maybe you don't understand what is being proposed. How would you make a symbol 
containing ASCII NUL and CR LF characters for instance?


Do you realise that the quoting problem can be solved independently of the 
allocation problem? In that case, you would be able to save any symbol and 
read it back. This would solve the problem about CR LF and spaces; only 
the problem with \0 (NUL) would remain.


Do you also realise that symbols can be made to support NUL while being 
backwards-compatible? Then what happens when a non-NUL-supporting external 
tries to read a symbol that contains a NUL, it will appear truncated at 
that point and that's all.
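
The truncation described here can be illustrated with a small sketch. It assumes a hypothetical length-carrying symbol type, `counted_symbol`, which is not Pd's real `t_symbol` (that struct has no length field); `legacy_visible_length` is likewise an invented name standing in for any external that reads the name with C string functions.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical length-carrying symbol; Pd's real t_symbol has no length field. */
typedef struct {
    const char *name;   /* bytes, always followed by a terminating NUL */
    size_t length;      /* number of bytes, embedded NULs included */
} counted_symbol;

/* What NUL-unaware code would see: only the bytes before the first NUL. */
size_t legacy_visible_length(const counted_symbol *s)
{
    assert(s != NULL && s->name != NULL);
    return strlen(s->name);
}
```

For a symbol holding the five bytes `a b NUL c d`, NUL-aware code would read `length` and see all five, while older code would see only `ab`. Nothing crashes, because the buffer still ends in a NUL.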


So basically there are three problems that can be dealt with 
independently. I'd rather not suppose that all three have to come 
together, monolithically.


For this purpose symbols are not usable because they can't contain every 
possible character and lists have too much overhead since each element 
of a list is an atom.


Symbols could be usable, if the problems that can be fixed in symbol 
without changing the nature of symbols, are fixed. You don't need strings 
for that.



I'm suggesting that a [string] be like any other object and be
deallocated when the patcher is closed.

Ok, that's certainly the string feature that I want. It's too much trouble
for the benefit.

Whatever.


Wouldn't you want objects to be able to emit strings in a way as carefree 
as they are with symbols? I'm talking about not putting the burden of 
memory management on the emitter of strings.



Man, that's not n atom type.

No it's not n atoms,


It's a typo. It was supposed to be "that's not an atom type", but that 
isn't any more true. I would've liked to say something more like: it would 
be easier if strings were more similar to symbols and floats than to 
(g)pointers.



Symbols are difficult to work with because their content gets interpreted,

You say that in answer to my questions on allocation? (That's not an
allocation issue and not even any kind of memory layout issue.)
I don't know, did I? It looks to me like an answer to a question about 
why symbols can't be used to encode arbitrary strings. Maybe I was 
tired.


It was just below two paragraphs that I had written about allocation.


for example if I write a comment MP 20061214 it gets converted into MP
2.00612e+007


the contents of a comment box is not a symbol. It's a list of atoms.
However, Pd has the same problem you describe when trying to save some


I wonder how the list of atoms in a comment box gets by without some of 
those atoms being symbols...


That some of the components are symbols, has nothing to do with the reason 
20061214 gets converted to float32. There is never a big symbol containing 
all of the contents of a comment: it's broken down into atoms (into a 
t_binbuf) as soon as you click outside of the box, it's just the t_rtext 
that keeps holding the original string; but the t_savefn only saves the 
t_binbuf, it doesn't look at the t_rtext, which is why the floats get 
mangled at save time only.
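
The rounding itself can be reproduced outside Pd. This is a minimal sketch, assuming the number is held in a 32-bit float and written back with a printf-style `%g` format; `format_like_saved_float` is a hypothetical helper, not a Pd function.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not part of Pd): store a number in a 32-bit float
 * and print it with "%g", which keeps six significant digits. */
int format_like_saved_float(double typed, char *out, size_t outsize)
{
    float stored;
    assert(out != NULL && outsize > 0);
    stored = (float)typed;               /* a float32 holds about 7 digits */
    return snprintf(out, outsize, "%g", stored);
}
```

Feeding it 20061214 gives a result that begins `2.00612e+`; the width of the exponent depends on the C library, which would explain the `e+007` reported earlier in the thread.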


 _ _ __ ___ _  _ _ ...
| Mathieu Bouchard - tél:+1.514.383.3801 - http://artengine.ca/matju
| Freelance Digital Arts Engineer, Montréal QC Canada


Re: Re: [PD-dev] strings

2006-12-18 Thread Mathieu Bouchard

On Mon, 18 Dec 2006, [EMAIL PROTECTED] wrote:
If ascii values from 0 - 31 can be part of symbols that would be nice. 
How do you specify a symbol containing ascii values 1 2 and 3? Do they 
have names?


Do it the way most languages have borrowed from C : use backslash followed 
by an octal or hex code, like \033 or \x1b. It's easy to make it 
compatible with C/C++/Java/Perl/Python/Tcl/Ruby/PHP, all at the same time.
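
A decoder for that kind of quoting could look roughly like the sketch below. It is only an illustration: `decode_escapes` and `hex_value` are hypothetical names, and the rules are simplified (octal escapes take one to three digits, hex escapes take one or two digits after `\x`, whereas C itself accepts longer hex runs).

```c
#include <assert.h>
#include <stddef.h>

/* Value of a hex digit, or -1 if c is not one. */
static int hex_value(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Hypothetical helper: decode \ooo (1-3 octal digits), \xHH (1-2 hex digits)
 * and \\ from the NUL-terminated text `in` into `out`.
 * Returns the number of bytes written, which may include zero bytes. */
size_t decode_escapes(const char *in, unsigned char *out, size_t outsize)
{
    size_t n = 0;
    assert(in != NULL && out != NULL);
    while (*in != '\0' && n < outsize) {
        if (*in != '\\') { out[n++] = (unsigned char)*in++; continue; }
        in++;                                   /* skip the backslash */
        if (*in == 'x' && hex_value(in[1]) >= 0) {
            int v = 0, digits = 0;
            in++;                               /* skip the 'x' */
            while (digits < 2 && hex_value(*in) >= 0) {
                v = v * 16 + hex_value(*in++);
                digits++;
            }
            out[n++] = (unsigned char)v;
        } else if (*in >= '0' && *in <= '7') {
            int v = 0, digits = 0;
            while (digits < 3 && *in >= '0' && *in <= '7') {
                v = v * 8 + (*in++ - '0');
                digits++;
            }
            out[n++] = (unsigned char)v;
        } else if (*in != '\0') {
            out[n++] = (unsigned char)*in++;    /* \\ and unknown escapes */
        }
    }
    return n;
}
```

Because the function returns a byte count instead of relying on a terminator, ASCII values 1, 2 and 3 (and 0) come through as ordinary bytes.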



Symbols could be usable, if the problems that can be fixed in symbol
without changing the nature of symbols, are fixed. You don't need strings
for that.

You still have the problem of the symbol table that grows by one each time the 
symbol changes.


Well, you can solve that problem separately. There's no point to clump all 
issues into one ball.


Wouldn't you want objects to be able to emit strings in a way as 
carefree as they are with symbols? I'm talking about not putting the 
burden of memory management on the emitter of strings.
A string library could have functions similar to getbytes(), 
resizebytes() and freebytes()


Yes, that's putting the burden of memory management on the emitter. I 
mean, what I call the burden of memory management isn't about writing 
your own malloc() from scratch, no, I'm not talking about that, I'm 
talking about having to decide when to copy and when to deallocate.
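
To make "deciding when to copy and when to deallocate" concrete, here is a minimal reference-counting sketch. It is not a proposal from this thread and nothing like it exists in Pd's API; `shared_string` and its three functions are hypothetical names, and plain `malloc()`/`free()` stand in for whatever allocator would really be used.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical shared string buffer; nothing like this exists in Pd's API. */
typedef struct {
    size_t refcount;        /* number of current holders */
    size_t length;          /* number of bytes in data */
    unsigned char data[];   /* the bytes themselves (C99 flexible array member) */
} shared_string;

/* Create a string owned by one holder. Returns NULL if out of memory. */
shared_string *shared_string_new(const void *bytes, size_t length)
{
    shared_string *s = malloc(sizeof *s + length);
    if (s == NULL) return NULL;
    s->refcount = 1;
    s->length = length;
    if (length > 0) memcpy(s->data, bytes, length);
    return s;
}

/* A receiver that wants to keep the string calls this instead of copying. */
void shared_string_retain(shared_string *s)
{
    assert(s != NULL);
    s->refcount++;
}

/* Returns 1 if this call freed the buffer, 0 if other holders remain. */
int shared_string_release(shared_string *s)
{
    assert(s != NULL && s->refcount > 0);
    if (--s->refcount > 0) return 0;
    free(s);
    return 1;
}
```

Every object that emits or stores such a string has to pair each retain with a release, and that bookkeeping is exactly what symbols currently spare external authors.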


 _ _ __ ___ _  _ _ ...
| Mathieu Bouchard - tél:+1.514.383.3801 - http://artengine.ca/matju
| Freelance Digital Arts Engineer, Montréal QC Canada


Re: Re: [PD-dev] strings

2006-12-18 Thread martin.peach
 Mathieu Bouchard [EMAIL PROTECTED] wrote: 
 
 Do you realise that the quoting problem can be solved independently of the 
 allocation problem? In that case, you would be able to save any symbol and 
 read it back. This would solve the problem about CR LF and spaces; only 
 the problem with \0 (NUL) would remain.

If ascii values from 0 - 31 can be part of symbols that would be nice. How do 
you specify a symbol containing ascii values 1 2 and 3? Do they have names?

 Symbols could be usable, if the problems that can be fixed in symbol 
 without changing the nature of symbols, are fixed. You don't need strings 
 for that.

You still have the problem of the symbol table that grows by one each time the 
symbol changes. If I want to parse a book one word at a time, for example, it 
would only take one string for the input buffer, but it would take as many 
symbols as there are different words in the book.
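
The growth can be seen in a toy intern table. This sketch is only loosely modelled on a symbol table and is not Pd's implementation; `intern_table` and `intern` are hypothetical names. Entries are added on first sight and never removed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy intern table: entries are added on first sight and never removed. */
typedef struct entry { char *name; struct entry *next; } entry;
typedef struct { entry *head; size_t count; } intern_table;

/* Return the unique stored copy of `name`, adding it if unseen.
 * Returns NULL if out of memory. */
const char *intern(intern_table *t, const char *name)
{
    entry *e;
    size_t len;
    assert(t != NULL && name != NULL);
    for (e = t->head; e != NULL; e = e->next)
        if (strcmp(e->name, name) == 0) return e->name;   /* already known */
    e = malloc(sizeof *e);
    if (e == NULL) return NULL;
    len = strlen(name) + 1;
    e->name = malloc(len);
    if (e->name == NULL) { free(e); return NULL; }
    memcpy(e->name, name, len);
    e->next = t->head;
    t->head = e;
    t->count++;                                           /* table only grows */
    return e->name;
}
```

Interning "t", "th", "the" while building a word one character at a time leaves three entries behind, whereas a single mutable buffer would have been reused.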

 Wouldn't you want objects to be able to emit strings in a way as carefree 
 as they are with symbols? I'm talking about not putting the burden of 
 memory management on the emitter of strings.


A string library could have functions similar to getbytes(), resizebytes() and 
freebytes() for changing the length of strings that could be called by any 
other external in the library. Or pd could have the same functions that could 
be called by any external. Either way...
 
Martin





Re: Re: [PD-dev] strings

2006-12-18 Thread martin.peach
 
 De: Hans-Christoph Steiner [EMAIL PROTECTED]
 Date: 2006/12/18 lun. AM 09:45:26 GMT-05:00
 À: carmen [EMAIL PROTECTED]
 Cc: pd-dev@iem.at
 Objet: Re: [PD-dev] strings
 
 
 On Dec 18, 2006, at 1:23 AM, carmen wrote:
 
  Automatic type conversion sounds like a really bad idea if the  
  language only partially supports it.  Pd is strongly typed
 
  is it? it mainly has numbers that occasionally look like symbols,  
  and symbols that more than occasionally look like lists and/or  
  strings..
 
 There are set rules which defined what is a float, symbol, or  
 pointer.  You cannot change that type, often even with a special  
 method.  Ever tried to turn a float into a symbol?  Doesn't really  
 work, only partially.

Along the lines of pd_defaultlist() in m_class.c, which handles list messages 
for objects that don't have list methods, one could add a pd_defaultstring(), 
which attempts to convert strings into symbols/floats/lists, instead of calling 
pd_defaultanything(), which would print no method for string. But it needs to 
be understood that it might not do it correctly, which is Not A Good Thing, but 
no worse than comments getting mangled. Maybe a [string unpack] object would be 
better: it could attempt to unpack a string into specified types, so the user 
could decide if a string like 123 is meant to represent a float or a symbol.
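
The float-or-symbol decision could be sketched with the standard `strtod()`. `parse_as_float` is a hypothetical helper for a [string unpack]-style object, not existing Pd code, and it inherits `strtod()`'s leniency (leading whitespace, "inf", "nan" and hex forms are accepted).

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 and stores the value if the whole text
 * parses as a number; otherwise returns 0 so the caller can treat the
 * text as a symbol. */
int parse_as_float(const char *text, float *value)
{
    char *end;
    double d;
    assert(text != NULL && value != NULL);
    if (*text == '\0') return 0;                  /* empty text is not a number */
    d = strtod(text, &end);
    if (end == text || *end != '\0') return 0;    /* nothing, or not everything, consumed */
    *value = (float)d;
    return 1;
}
```

With this rule "123" would become a float and "123abc" would stay a symbol; a user who wants "123" kept as a symbol still needs a way to say so, which is the argument for making the target types explicit.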

Martin

 
 .hc
 
  , so what Martin says is definitely appropriate.
   Perl is the opposite, everything can be automatically cast, so  
  there it makes sense.
 
  it is definitely a design decision which way to go. could PD  
  flexibly support both at once? or does there need to be an OCaml  
  edition, and a Perl edition?
 
 
 
 
 
 
 Mistrust authority - promote decentralization.  - the hacker ethic
 
 
 


Re: Re: [PD-dev] strings

2006-12-18 Thread martin.peach

 
 De: Mathieu Bouchard [EMAIL PROTECTED]
 Date: 2006/12/18 lun. PM 12:11:18 GMT-05:00
 À: Martin Peach [EMAIL PROTECTED]
 Cc: pd-dev@iem.at
 Objet: Re: [PD-dev] strings
 
 On Sun, 17 Dec 2006, Martin Peach wrote:
 
  You make them work as strings when they can, and
  You make them work as symbols when they must.
  There would be two objects, [stringtosymbol] and [symboltostring] that you 
  could put between string and symbol objects. Of course some strings would 
  get 
  impossibly mangled this way but that's because of the way symbols work.
 
 I have no clue what you're talking about: how mangled would they be? i 
 don't plan any mangling to happen, except for the presence of \0 
 characters.

Maybe you don't understand what is being proposed. How would you make a symbol 
containing ASCII NUL and CR LF characters for instance?

 
  Yes, there's no reason not to have 0-length strings. And no reason to trash 
  them when they are unused either, since they don't take up more space than 
  any other object.
 
 They take the space it takes to tell their size and the pointer to the 
 buffer. That's significant, and nearly as much as in the case of a 
 t_symbol, supposing that those t_strings can live independently of the 
 objects that produce them.

Like any other object strings have that overhead, but unlike lists they only 
have one atom per string. They would be created by string objects and last as 
long as the string objects. One string per string object. String messages are 
passed between string manipulator objects. For this purpose symbols are not 
usable because they can't contain every possible character and lists have too 
much overhead since each element of a list is an atom.

 
  I'm suggesting that a [string] be like any other object and be 
  deallocated when the patcher is closed.
 
 Ok, that's certainly the string feature that I want. It's too much trouble 
 for the benefit.

Whatever.

 
 Man, that's not n atom type.
 

No it's not n atoms, it's a single atom that contains a pointer to a list of 
bytes. That's the main advantage of string over list.

  Symbols are difficult to work with because their content gets interpreted,
 
 You say that in answer to my questions on allocation? (That's not an 
 allocation issue and not even any kind of memory layout issue.)

I don't know, did I? It looks to me like an answer to a question about why 
symbols can't be used to encode arbitrary strings. Maybe I was tired.

 
  for example if I write a comment MP 20061214 it gets converted into MP 
  2.00612e+007
 
 the contents of a comment box is not a symbol. It's a list of atoms. 
 However, Pd has the same problem you describe when trying to save some 
 symbols. e.g. say you have a symbol with a space in it and you pass it to 
 a messagebox set $1 which passes it to an empty messagebox, and then you 
 save the patch: then you have that problem with symbols. But the contents 
 of the comment box has that problem while never storing its contents as a 
 symbol.

I wonder how the list of atoms in a comment box gets by without some of those 
atoms being symbols...

Martin





Re: [PD-dev] strings

2006-12-18 Thread Hans-Christoph Steiner


On Dec 18, 2006, at 12:42 PM, Mathieu Bouchard wrote:


On Mon, 18 Dec 2006, Hans-Christoph Steiner wrote:

On Dec 17, 2006, at 1:36 AM, Mathieu Bouchard wrote:
That's aiming low. Why shouldn't there be any automatic casts  
between the two?


Automatic type conversion sounds like a really bad idea if the  
language only partially supports it.


If that's the case then pd is a really bad idea.

It's not possible to typecast any value of a type to any value of  
another type, all of the time. That's even true in the most  
typecast-frenzy languages ever.


There's no such requirement that implicit casts have to be  
impossibly well supported in order to be a good idea. You're just  
dismissing all forms of implicit casts as being bad ideas.



Pd is strongly typed, so what Martin says is definitely appropriate.


Non-sequitur, there are languages that have quite strict and  
elaborate type checking and yet which support implicit casts. For  
example, C++.


And then, in many cases, languages can become less strictly checked  
without any problems, as long as nothing actually relies on type  
violations. Because pd doesn't have any error handling (in the  
sense of pd patches being able to figure out their own problems),  
if any patch doesn't spit out any error messages, it would run the  
same if there were more implicit casts made, because the patches  
would never trigger those implicit casts anyway.


C/C++ is not very strict.  It allows you to just change what you call  
a chunk of memory without complaint.  Try Pascal, that is strict.  Or  
Pd's floats and symbols.  There is no way to make a float into symbol  
or vice versa without trickery.  For example, these don't work:


[123(      [symbol 123(               [symbol 123(
|          |                          |
[symbol]   [     \ <-- (symbol box)   [float]

.hc





Perl is the opposite, everything can be automatically cast, so  
there it makes sense.


No, in Perl there's no way that you can cast something to a  
pointer. You can use the backslash operator to make a pointer to  
anything, but that's not a cast-to-pointer, because if you use it  
on a pointer, you don't get the same pointer, you get a new pointer.


So... anyone want to code up some of these ideas?  We could try  
them out in the next Pd-extended.


Martin's solution, my solution, or your solution?

 _ _ __ ___ _  _ _ ...
| Mathieu Bouchard - tél:+1.514.383.3801 - http://artengine.ca/matju
| Freelance Digital Arts Engineer, Montréal QC Canada





Terrorism is not an enemy.  It cannot be defeated.  It's a tactic.   
It's about as sensible to say we declare war on night attacks and  
expect we're going to win that war.  We're not going to win the war  
on terrorism.- retired U.S. Army general, William Odom






Re: [PD-dev] strings

2006-12-18 Thread Mathieu Bouchard

On Mon, 18 Dec 2006, Hans-Christoph Steiner wrote:

On Dec 18, 2006, at 12:42 PM, Mathieu Bouchard wrote:
Non-sequitur, there are languages that have quite strict and elaborate type 
checking and yet which support implicit casts. For example, C++.


C/C++ is not very strict.  It allows you to just change what you call a chunk 
of memory without complaint.


Ok, I spoke too quick. I didn't want to say strict. I shouldn't have said 
strict. Instead of strict I wanted to say that the type checking happens 
all of the time.


I thought up some kind of classification of type systems, without calling 
them strong/weak or static/dynamic because those words are confusing.


1. Typed expressions: each piece of code that can give a value, has a 
type that can be figured out at compile-time.


2. Typed variables/parameters: declarations allow runtime checks but not 
compile-time checks.


3. Typed values: variables don't have types, they can contain any value, 
but every value has a type.


4. Typed uses: values don't have types, a type is a way of using a value.

Strictness, in the sense of forbidding things to the user, is not on that 
scale, it's another aspect. A well-balanced strictness allows one to 
bypass the system whenever needed, but without being too error-prone. 
However it's difficult to say what it means to bypass the system for all 
four typing categories at once, or even within one category.


 _ _ __ ___ _  _ _ ...
| Mathieu Bouchard - tél:+1.514.383.3801 - http://artengine.ca/matju
| Freelance Digital Arts Engineer, Montréal QC Canada


Re: [PD-dev] strings

2006-12-17 Thread Bryan Jurish
On 2006-12-17 03:09:19, Martin Peach [EMAIL PROTECTED] appears 
to have written:
A string could be considered unused when its length is set to 0. Memory 
would need to be dynamically allocated in small blocks. The API should 
return no method for string if the external doesn't implement strings.


... which wouldn't get us true strings in the mathematical sense of a 
free monoid <Alphabet,concat()>, since the empty string is the identity 
element for concat()...


marmosets,
Bryan

--
Bryan Jurish   There is *always* one more bug.
[EMAIL PROTECTED]  -Lubarsky's Law of Cybernetic Entomology



Re: [PD-dev] strings

2006-12-17 Thread Mathieu Bouchard

On Sun, 17 Dec 2006, Bryan Jurish wrote:

... which wouldn't get us true strings in the mathematical sense of a 
free monoid <Alphabet,concat()>, since the empty string is the identity 
element for concat()...


Right, and it may seem like not much, but if one is going to make a lot of 
abstractions for basic string processing, i'd rather have them use monoid 
algorithms rather than semigroup algorithms. The monoid algorithms are 
often nicer... semigroup algorithms can't start with an empty string, so 
they start with the first character of a string, and then do a 
foreach-loop that starts on the second character so that the first 
character isn't counted twice, so you have to decide a way to skip that 
character... ugly.
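
The difference shows up even in plain C. The sketch below is a monoid-style fold: the accumulator starts as the empty string, so zero inputs need no special case and every element is handled the same way. `concat_all` is a hypothetical name used only for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Concatenate `count` C strings. The accumulator starts as the empty string
 * (the identity for concatenation), so count == 0 needs no special case.
 * The caller frees the result. Returns NULL if out of memory. */
char *concat_all(const char *const *parts, size_t count)
{
    size_t total = 0, used = 0, i;
    char *result;
    assert(parts != NULL || count == 0);
    for (i = 0; i < count; i++) total += strlen(parts[i]);
    result = malloc(total + 1);
    if (result == NULL) return NULL;
    result[0] = '\0';                       /* start from the empty string */
    for (i = 0; i < count; i++) {           /* no "skip the first one" logic */
        size_t len = strlen(parts[i]);
        memcpy(result + used, parts[i], len);
        used += len;
    }
    result[used] = '\0';
    return result;
}
```

If empty strings were forbidden, the function would have to seed the accumulator with `parts[0]`, start the loop at index 1, and reject `count == 0` as an error.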


 _ _ __ ___ _  _ _ ...
| Mathieu Bouchard - tél:+1.514.383.3801 - http://artengine.ca/matju
| Freelance Digital Arts Engineer, Montréal QC Canada


Re: [PD-dev] strings

2006-12-17 Thread Martin Peach

Mathieu Bouchard wrote:

On Sat, 16 Dec 2006, Martin Peach wrote:

What if strings could be automatically cast to symbols for externals 
that would rather have symbols, and vice-versa?
I have written an external asc2sym that takes lists of bytes and 
splits them into symbols based on the argument(s) which are characters.
But it seems important to avoid symbols as much as possible to avoid 
filling up the symbol table with symbols that are referenced only once..


Yes, but my reason for wanting this, is that all externals currently 
available understand symbols but not strings. So, what if you want to 
make strings as widely used as possible, as easily as possible, and 
working with all externals currently available in Pd?


You make them work as strings when they can, and
You make them work as symbols when they must.
There would be two objects, [stringtosymbol] and [symboltostring] that 
you could put between string and symbol objects. Of course some strings 
would get impossibly mangled this way but that's because of the way 
symbols work.



A string could be considered unused when its length is set to 0.


If you want to use a string as a mutable buffer, then you want to be 
able to have 0-length strings, as a boundary condition: you start with 
nothing and then add to it. You don't want to have to start with 
something just because setting the length to 0 would delete it.


Yes, there's no reason not to have 0-length strings. And no reason to 
trash them when they are unused either, since they don't take up more 
space than any other object.
It seems that you are suggesting that the deallocation would be 
user-controlled? Then how do you prevent the user from crashing pd?
I'm suggesting that a [string] be like any other object and be 
deallocated when the patcher is closed. It's basically a variable-length 
list of bytes. It would contain methods to allocate and deallocate 
memory via malloc() or pd's getbytes(), which uses calloc().
If you use a weak-pointer as an intermediate (like t_gpointer or 
t_gfxstub), then you still have to manage reference counts. Whatever 
you do for the user, you have to know more about externals' behaviour 
than what they tell you now, because right now they don't deallocate 
atoms explicitly.


But if strings are going to be deallocated explicitly and there is not 
going to be any checks, why not instead make something that will allow 
users to deallocate symbols. It's about as safe as that and you don't 
need to introduce a string type.
Symbols are difficult to work with because their content gets 
interpreted, for example if I write a comment MP 20061214 it gets 
converted into MP 2.00612e+007, or if I want a symbol to have spaces 
or carriage returns in it, it won't get created, which is very annoying 
when a lot of serial hardware wants to see a CR before it processes a 
message.
Also every time I change a symbol, it gets added to the global symbol 
table. So adding one character at a time to a string would result in 
that many symbols being created.
A string as I see it is closer to a list, and could be operated on with 
objects like the list objects -- append, split, etc.





Memory would need to be dynamically allocated in small blocks.


What do you mean in small blocks ?
Whatever is most efficient. If malloc is better at allocating blocks of 
256 bytes than blocks of 1 then it's better to work with multiples of 
256. It seems inefficient to allocate 65536 bytes for every string at 
creation time.
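
Rounding a requested length up to whole blocks is a one-liner. This sketch assumes the 256-byte figure used as an example above; whether that size is actually the most efficient would have to be measured. `STRING_BLOCK_SIZE` and `block_capacity` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define STRING_BLOCK_SIZE 256   /* assumed block size, taken from the example figure */

/* Smallest multiple of STRING_BLOCK_SIZE that holds `length` bytes
 * (at least one block, so an empty string still has a buffer). */
size_t block_capacity(size_t length)
{
    size_t blocks;
    assert(length <= (size_t)-1 - STRING_BLOCK_SIZE);   /* guard against overflow */
    blocks = (length + STRING_BLOCK_SIZE - 1) / STRING_BLOCK_SIZE;
    if (blocks == 0) blocks = 1;
    return blocks * STRING_BLOCK_SIZE;
}
```

A string would then only be reallocated when its length crosses a block boundary, instead of on every appended byte.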


The API should return no method for string if the external doesn't 
implement strings.


That's aiming low. Why shouldn't there be any automatic casts between 
the two?
Because it would require rewriting more of the pd core, and because a 
lot of strings can't be made into symbols (strings can contain any 
integer on [0...255] but symbols cannot). Having the two converter 
objects [stringtosymbol] and [symboltostring] is easier. The no method 
for string message would come from pd, not the external, so the 
external doesn't need to implement any string methods.


Martin




Re: [PD-dev] strings

2006-12-17 Thread Martin Peach

Mathieu Bouchard wrote:

On Sat, 16 Dec 2006, Martin Peach wrote:

Yes, and it's also easier to limit strings to word (16-bit) lengths, 
while 8-bit is too short. So a t_string would look like:

typedef struct _string /* pointer to a string */
{
 unsigned short s_length; /* length of string in bytes */
 unsigned char *s_data; /* pointer to 1st byte of string */
} t_string;


If you're not compiling in 16-bit mode, then there will be 2 or 6 
bytes between the first and second field, so that the second field can 
be aligned to a word boundary, supposing that the struct as a whole is 
itself aligned to a word boundary. (By word, I strictly mean something 
that is the same size as a pointer.)


What I mean is that it's useless to have a length field that is not the 
same size as the pointer field, if you have only those two fields. If you 
have more than two fields, then you can put several short fields in the 
space of a word (2 or 4).

I suppose we could do like Apple or Microsoft and have something like:

typedef struct _string /* pointer to a string */
{
   unsigned short s_length; /* length of string in bytes */
   unsigned short s_reserved; /* filler */
   unsigned char *s_data; /* pointer to 1st byte of string */
} t_string;

but in the long term it would be best to just use long lengths for when 
we all have teraflop laptops:


typedef struct _string /* pointer to a string */
{
   unsigned long s_length; /* length of string in bytes */
   unsigned char *s_data; /* pointer to 1st byte of string */
} t_string;

...but restrict the maximum string length using a #define 
MAX_STRING_LENGTH so that pd doesn't bite off more than it can chew...
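
A rough sketch of how such a cap could be enforced at creation time. The value 65535 and the helpers string_new() and string_free() are assumptions for illustration; the thread names neither a limit nor an allocation API:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_STRING_LENGTH 65535UL /* hypothetical cap; no value was proposed */

typedef struct _string
{
    unsigned long s_length; /* length of string in bytes */
    unsigned char *s_data;  /* pointer to 1st byte of string */
} t_string;

/* copy n bytes into a new t_string; returns NULL if n exceeds the cap
   or if allocation fails. Zero-length strings are allowed. */
static t_string *string_new(const unsigned char *bytes, unsigned long n)
{
    t_string *s;
    assert(bytes != NULL || n == 0);
    if (n > MAX_STRING_LENGTH) return NULL;
    s = malloc(sizeof(*s));
    if (!s) return NULL;
    s->s_length = n;
    s->s_data = n ? malloc(n) : NULL;
    if (n && !s->s_data) { free(s); return NULL; }
    if (n) memcpy(s->s_data, bytes, n);
    return s;
}

/* release a string made by string_new(); NULL is accepted */
static void string_free(t_string *s)
{
    if (!s) return;
    free(s->s_data);
    free(s);
}
```

With the length field already an unsigned long, raising the limit later only means changing the #define, not the struct layout.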


Martin



Re: [PD-dev] strings

2006-12-17 Thread Bryan Jurish

moin Martin, moin list,

On 2006-12-17 21:46:50, Martin Peach [EMAIL PROTECTED] appears 
to have written:

Bryan Jurish wrote:
On 2006-12-17 03:09:19, Martin Peach [EMAIL PROTECTED] 
appears to have written:
A string could be considered unused when its length is set to 0. 
Memory would need to be dynamically allocated in small blocks. The 
API should return no method for string if the external doesn't 
implement strings.


... which wouldn't get us true strings in the mathematical sense of a 
free monoid <Alphabet, concat()>, since the empty string is the 
identity element for concat()...


Yes, I agree there should be no restriction on empty strings. I also 
think there is no need to destroy strings except when the patcher is 
closed, so it's not really an issue.


if by destroy you mean de-allocation of the string struct itself (i 
assume you do; your suggestion looks a lot like a glib GString btw, 
which is im(ns)ho a good general purpose c string struct), and if a 
string therefore winds up being just something like a symbol with a 
volatile value (i.e. doesn't get written to the symbol table), then i agree.


what i think we need to avoid with strings (i don't think anyone has 
suggested otherwise, i'm just stating the obvious) is symbol-style 
permanent allocation for every string *value*.  string variables 
could/should be handled like any other pd atom: the external which 
creates them is responsible for (de-)allocation, which would wind up 
doing what you suggest and freeing any allocated memory when the 
responsible object is destroyed (provided the object doesn't leak 
memory, but i think we can assume c programmers are used to keeping 
track of such things -- ymmv).


in fact, this is how [any2string] handles things, in its ugly 
list-of-floats way...


marmosets,
Bryan

--
Bryan Jurish   There is *always* one more bug.
[EMAIL PROTECTED]  -Lubarsky's Law of Cybernetic Entomology



Re: [PD-dev] strings

2006-12-17 Thread Hans-Christoph Steiner


On Dec 17, 2006, at 1:36 AM, Mathieu Bouchard wrote:


On Sat, 16 Dec 2006, Martin Peach wrote:

What if strings could be automatically cast to symbols for  
externals that would rather have symbols, and vice-versa?
I have written an external asc2sym that takes lists of bytes and  
splits them into symbols based on the argument(s) which are  
characters.
But it seems important to avoid symbols as much as possible to  
avoid filling up the symbol table with symbols that are referenced  
only once..


Yes, but my reason for wanting this, is that all externals  
currently available understand symbols but not strings. So, what if  
you want to make strings as widely used as possible, as easily as  
possible, and working with all externals currently available in Pd?


You make them work as strings when they can, and
You make them work as symbols when they must.


A string could be considered unused when its length is set to 0.


If you want to use a string as a mutable buffer, then you want to  
be able to have 0-length strings, as a boundary condition: you  
start with nothing and then add to it. You don't want to have to  
start with something just because setting the length to 0 would  
delete it.


It seems that you are suggesting that the deallocation would be  
user-controlled? Then how do you prevent the user from crashing pd?
If you use a weak-pointer as an intermediate (like t_gpointer or  
t_gfxstub), then you still have to manage reference counts.  
Whatever you do for the user, you have to know more about  
externals' behaviour than what they tell you now, because right now  
they don't deallocate atoms explicitly.


But if strings are going to be deallocated explicitly and there is  
not going to be any checks, why not instead make something that  
will allow users to deallocate symbols. It's about as safe as that  
and you don't need to introduce a string type.



Memory would need to be dynamically allocated in small blocks.


What do you mean in small blocks ?

The API should return no method for string if the external  
doesn't implement strings.


That's aiming low. Why shouldn't there be any automatic casts  
between the two?


Automatic type conversion sounds like a really bad idea if the  
language only partially supports it.  Pd is strongly typed, so what  
Martin says is definitely appropriate.  Perl is the opposite,  
everything can be automatically cast, so there it makes sense.


So... anyone want to code up some of these ideas?  We could try them  
out in the next Pd-extended.


.hc




There is no way to peace, peace is the way.   -A.J. Muste





Re: [PD-dev] strings

2006-12-17 Thread carmen
 Automatic type conversion sounds like a really bad idea if the language only 
 partially supports it.  Pd is strongly typed

is it? it mainly has numbers that occasionally look like symbols, and symbols 
that more than occasionally look like lists and/or strings..

 , so what Martin says is definitely appropriate. 
  Perl is the opposite, everything can be automatically cast, so there it 
 makes sense.

it is definitely a design decision which way to go. could PD flexibly support 
both at once? or does there need to be an OCaml edition, and a Perl edition?



Re: [PD-dev] strings

2006-12-17 Thread carmen
 Automatic type conversion sounds like a really bad idea if the language only 
 partially supports it.  Pd is strongly typed

do you think the target user base wants to think in terms of casting types? i 
don't. i have a feeling that was why there are so few types. i think most users 
want to be able to plug anything into anything and at least get some sort of 
result, rather than "expected bang, got ''" scrolling 1000 times a second in 
stderr... 

my vote would be a nice selection of types, and autocasting (maybe warnings at 
most for int vs float, string to symbol, etc)

of course the only real way to vote for this would be to write the code - i think 
i'll wait for PNPD instead.. :)



Re: [PD-dev] strings

2006-12-16 Thread Mathieu Bouchard

On Sat, 16 Dec 2006, Bryan Jurish wrote:
On 2006-12-16 01:40:03, Mathieu Bouchard [EMAIL PROTECTED] appears to have 
written:


i count (sizeof(int)+sizeof(float)-1)*strlen(message) wasted bytes per string 
object, not counting the selector.


Oh yeah, sorry, the occupied space is up to 4 times as text but it's 8 
times in 32-bit mode and 16 times in 64-bit mode.


as i think we've discussed before, using ieee floats, which should be 
able to losslessly encode a 24 bit integer,


if you want something saveable to a file, pd can only losslessly convert 
19.93 bits to decimal.
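
The 19.93 figure is log2(10^6): six significant decimal digits distinguish a million values. A small check of that, assuming floats are written out with six significant digits the way printf's default %g does (the helper name roundtrip_g is made up for this sketch):

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* print x with %g (6 significant digits by default), then parse the
   text back, as a save-and-reload cycle would */
static float roundtrip_g(float x)
{
    char buf[32];
    int n = snprintf(buf, sizeof buf, "%g", x);
    assert(n > 0 && (size_t)n < sizeof buf);
    return strtof(buf, NULL);
}
```

Under that assumption roundtrip_g(999999.0f) comes back unchanged, while 16777215.0f (the largest 24-bit integer, exact in RAM) is written as 1.67772e+07 and reloads as a different number.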


... but then again, what else are ascii 0x1c-0x1f (28-31 = {fs,gs,rs,us}) 
for?


When I was a small kid, my parents bought a CGP-115 plotter, and the code 
for changing the colour of the stylus was 29. It was in 1983.



 it's another ugly hack, would reserve some of the ascii range,


0 is enough to do lists-of-strings because in many ASCII-based systems 
it's only ever used to mean end-of-string. It's faster than my nested-list 
hack. However, my hack looks more like what the syntax for nested lists 
could become if it were not a hack. Essentially my hack is a post-parser 
that reinterprets symbol-atoms depending on their parens-content, and 
makes it feel like pd has a LISP syntax... sometimes. (It's a 
GridFlow-only feature though).
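
For the 0-as-separator idea, a sketch of pulling the k-th string out of a flat byte buffer. The function name packed_get is hypothetical, and the scheme obviously gives up the ability to store 0 inside a string:

```c
#include <assert.h>
#include <stddef.h>

/* Return a pointer to the k-th (0-based) string packed into buf[0..n-1],
   where strings are separated by 0 bytes, and store its length in *len.
   Returns NULL if there are fewer than k+1 strings. A final run without
   a terminating 0 still counts as a string. */
static const unsigned char *packed_get(const unsigned char *buf, size_t n,
                                       size_t k, size_t *len)
{
    size_t start = 0, i;
    assert(buf != NULL && len != NULL);
    for (i = 0; i <= n; i++)
    {
        if (i == n || buf[i] == 0)
        {
            if (i == n && start == n) break; /* nothing after the last 0 */
            if (k == 0) { *len = i - start; return buf + start; }
            k--;
            start = i + 1;
        }
    }
    return NULL;
}
```

On the seven bytes 'a','b',0,'c','d','e',0 this finds "ab" at index 0 and "cde" at index 1, and reports no string at index 2.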


 _ _ __ ___ _  _ _ ...
| Mathieu Bouchard - tél:+1.514.383.3801 - http://artengine.ca/matju
| Freelance Digital Arts Engineer, Montréal QC Canada


Re: [PD-dev] strings

2006-12-16 Thread Hans-Christoph Steiner


On Dec 16, 2006, at 4:55 AM, Bryan Jurish wrote:


morning,

On 2006-12-16 01:40:03, Mathieu Bouchard [EMAIL PROTECTED]  
appears to have written:

On Fri, 15 Dec 2006, Hans-Christoph Steiner wrote:
An advantage using the list-of-bytes approach is that because each  
character can be represented by a rather large integer, it can be  
extended to work on lists-of-characters meaning quickly, if there  
is a [utf8decode] and [utf8encode] to turn bytes into characters  
and back; also it's a method that is available now and reuses the  
existing list objects; and it's a method that supports \0 (NUL)  
characters.
Disadvantages are that it takes more time to convert to C strings  
and back, it takes more space in .pd files, it isn't readable as  
text in .pd files, it takes up to 4 times more space to represent  
in .pd files, and exactly 4 times more space in RAM (in the case  
that just iso-latin-1 is used), and also that you can't make lists  
of strings like that.


i count (sizeof(int)+sizeof(float)-1)*strlen(message) wasted bytes  
per string object, not counting the selector.  as i think we've  
discussed before, using ieee floats, which should be able to  
losslessly encode a 24 bit integer, that can be tweaked down to  
(sizeof(int)+sizeof(float)-1)*strlen(message)/3 on average, but on  
my system (32 bit floats), that still amounts to one wasted byte  
per character for the representation, and it's hellishly cryptic to  
boot.
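
To make the three-bytes-per-float packing concrete, here is a sketch with hypothetical helpers pack3() and unpack3(). It relies only on a 32-bit IEEE float having a 24-bit significand, so every integer below 2^24 is held exactly in memory:

```c
#include <assert.h>

/* pack three bytes into one float; the value is below 2^24, so a
   32-bit IEEE float represents it exactly */
static float pack3(unsigned char b0, unsigned char b1, unsigned char b2)
{
    unsigned long v = ((unsigned long)b0 << 16)
                    | ((unsigned long)b1 << 8)
                    | (unsigned long)b2;
    return (float)v;
}

/* recover the three bytes from a float produced by pack3() */
static void unpack3(float f, unsigned char out[3])
{
    unsigned long v;
    assert(f >= 0.0f && f < 16777216.0f); /* must fit in 24 bits */
    v = (unsigned long)f;
    out[0] = (unsigned char)((v >> 16) & 0xff);
    out[1] = (unsigned char)((v >> 8) & 0xff);
    out[2] = (unsigned char)(v & 0xff);
}
```

This shows why the result is cryptic: the text "Pd" plus one more byte becomes a single seven-digit number in the list.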


(By the time we can have real strings, we can have nested-lists,  
and the other way around, because they'd use the same mechanisms.  
whether it's better to make them two types or one type, is a good  
question.)


... but then again, what else are ascii 0x1c-0x1f (28-31 =  
{fs,gs,rs,us}) for?  it's another ugly hack, would reserve some of  
the ascii range, and would require additional parsing objects  
(potentially constructable with [list]), but it's a possibility,  
should anyone actually need nested lists as strings...


please don't get me wrong: i'm all in favor of real strings,  
nested lists, and associative arrays - i wrote [pdstring] because i  
needed to send some generated text over OSC to someone who could  
only interpret ascii values: i'm glad if it's helpful to anyone  
besides myself, and i don't see much difficulty in adding support  
for low-level c-type string operations ([toupper], [tolower], at  
some later point maybe even regexes), but i can't bring myself to  
believe that the list-of-bytes approach is really the right way  
to do it, although i don't have a better idea at the moment...


One advantage of this approach is that many C string functions like  
toupper, tolower, strcat, strcmp, etc. would be pretty easy to  
implement in Pd, rather than C. A regexp object in C would be pretty  
straightforward.


How about using a selector string for these lists?  I suppose that  
could cause mayhem since it would make the list into a selector  
series and run into all the vagaries of handling them.


.hc


Man has survived hitherto because he was too ignorant to know how to  
realize his wishes.  Now that he can realize them, he must either  
change them, or perish.-William Carlos Williams






Re: [PD-dev] strings

2006-12-16 Thread Martin Peach
I think it would be most efficient to have a string type be a length 
followed by that many unsigned chars, similar to a Pascal string but 
with the length being something like a 32-bit integer. It would not be 
added to pd's symbol list. The atom whose type was string would have to 
contain a pointer to the first byte of the string, and a length. 
Multibyte characters would just be counted as multiple characters when 
calculating the length, so the length would be the number of bytes in 
the string, not the number of characters.
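
To illustrate the bytes-versus-characters distinction, a small sketch (the function name utf8_chars is made up here) that counts UTF-8 characters in a byte string by skipping continuation bytes. It assumes the input is valid UTF-8:

```c
#include <assert.h>
#include <stddef.h>

/* count UTF-8 characters by counting every byte that is not a
   continuation byte (10xxxxxx); assumes valid UTF-8 input */
static size_t utf8_chars(const unsigned char *data, size_t nbytes)
{
    size_t i, count = 0;
    assert(data != NULL || nbytes == 0);
    for (i = 0; i < nbytes; i++)
        if ((data[i] & 0xC0) != 0x80) count++;
    return count;
}
```

For the bytes 't', 0xC3, 0xA9, 'l' (the word "tél" in UTF-8) the stored length would be 4 while the character count is 3.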

It looks too easy to me...In m_pd.h, add:
A_STRING
to t_atomtype.
Add
t_string * w_string;
to  t_word.
Add the typedef:
typedef struct _string /* pointer to a string */
{
   unsigned long s_length; /* length of string in bytes */
   unsigned char *s_data; /* pointer to 1st byte of string */
} t_string;

...so a string atom would have a_type = A_STRING and a_w = a_w.w_string, 
which points to a t_string containing the length and a pointer to the 
string.
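
Put together, the proposal might look roughly like the following. This is a standalone sketch, not a patch: the enum and union are trimmed stand-ins for the m_pd.h declarations (the real ones have more members, and where A_STRING would be numbered is not settled here), and atom_setstring() and atom_getstring() are hypothetical helpers:

```c
#include <assert.h>
#include <stddef.h>

typedef struct _string
{
    unsigned long s_length; /* length of string in bytes */
    unsigned char *s_data;  /* pointer to 1st byte of string */
} t_string;

/* trimmed stand-ins for the m_pd.h declarations, for illustration only */
typedef enum { A_NULL, A_FLOAT, A_SYMBOL, A_STRING /* proposed */ } t_atomtype;

typedef union word
{
    float w_float;
    t_string *w_string; /* proposed */
} t_word;

typedef struct _atom
{
    t_atomtype a_type;
    union word a_w;
} t_atom;

/* hypothetical helper: make an atom refer to a string (no copy is made) */
static void atom_setstring(t_atom *a, t_string *s)
{
    assert(a != NULL && s != NULL);
    a->a_type = A_STRING;
    a->a_w.w_string = s;
}

/* hypothetical helper: the string an atom carries, or NULL if the atom
   is not a string atom */
static t_string *atom_getstring(const t_atom *a)
{
    assert(a != NULL);
    return a->a_type == A_STRING ? a->a_w.w_string : NULL;
}
```

Note that the atom only borrows the t_string; nothing in this sketch says who frees it.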


If pd is otherwise able to handle atom types it doesn't know about (?), 
all the string manipulation objects could be built as externals.


Martin





Re: [PD-dev] strings

2006-12-16 Thread Mathieu Bouchard

On Sat, 16 Dec 2006, Martin Peach wrote:

...so a string atom would have a_type = A_STRING and a_w = a_w.w_string, 
which points to a t_string containing the length and a pointer to the 
string.



If pd is otherwise able to handle atom types it doesn't know about (?),


It's not. There are no provisions for adding any extra atom types. There's 
no table for registering atom types. Out of 12 assigned numbers for atom 
types, 5 aren't actually atom types, 4 are radioactive types 
(SEMI,COMMA,DOLLAR,DOLLSYM), the remaining three have reserved selectors 
and hardcoded entries in t_class. What's the right way to add a fourth one 
like that?



all the string manipulation objects could be built as externals.


What if strings could be automatically cast to symbols for externals that 
would rather have symbols, and vice-versa?



It looks too easy to me...


It's because you've only thought about the easy part of the problem. How 
do you know when a string becomes unused? When do you deallocate the 
memory? What does this mean for the API used by externals? (including the 
things that are assumed but not written in m_pd.h)


 _ _ __ ___ _  _ _ ...
| Mathieu Bouchard - tél:+1.514.383.3801 - http://artengine.ca/matju
| Freelance Digital Arts Engineer, Montréal QC Canada


Re: [PD-dev] strings

2006-12-16 Thread Martin Peach

Hans-Christoph Steiner wrote:


The one problem I can think of here is that you can only have 19 bits 
of precision in Pd's 32-bit t_float.  So having a length of 32 bits 
would cause problems if trying to deal with string length using 
t_floats. I could see this happening in a loop in Pd space, for example.

Yes, and it's also easier to limit strings to word (16-bit) lengths, 
while 8-bit is too short.

So a t_string would look like:
typedef struct _string /* pointer to a string */
{
  unsigned short s_length; /* length of string in bytes */
  unsigned char *s_data; /* pointer to 1st byte of string */
} t_string;

Martin




.hc

On Dec 16, 2006, at 5:12 PM, Martin Peach wrote:

I think it would be most efficient to have a string type be a length 
followed by that many unsigned chars, similar to a Pascal string but 
with the length being something like a 32-bit integer. It would not 
be added to pd's symbol list. The atom whose type was string would 
have to contain a pointer to the first byte of the string, and a 
length. Multibyte characters would just be counted as multiple 
characters when calculating the length, so the length would be the 
number of bytes in the string, not the number of characters.

It looks too easy to me...In m_pd.h, add:
A_STRING
to t_atomtype.
Add
t_string * w_string;
to  t_word.
Add the typedef:
typedef struct _string /* pointer to a string */
{
   unsigned long s_length; /* length of string in bytes */
   unsigned char *s_data; /* pointer to 1st byte of string */
} t_string;

...so a string atom would have a_type = A_STRING and a_w = 
a_w.w_string, which points to a t_string containing the length and a 
pointer to the string.


If pd is otherwise able to handle atom types it doesn't know about 
(?), all the string manipulation objects could be built as externals.


Martin








News is what people want to keep hidden and everything else is 
publicity.  - Bill Moyers







Re: [PD-dev] strings

2006-12-16 Thread Mathieu Bouchard

On Sat, 16 Dec 2006, Martin Peach wrote:

Yes, and it's also easier to limit strings to word (16-bit) lengths, 
while 8-bit is too short. So a t_string would look like:

typedef struct _string /* pointer to a string */
{
 unsigned short s_length; /* length of string in bytes */
 unsigned char *s_data; /* pointer to 1st byte of string */
} t_string;


If you're not compiling in 16-bit mode, then there will be 2 or 6 bytes 
between the first and second field, so that the second field can be 
aligned to a word boundary, supposing that the struct as a whole is itself 
aligned to a word boundary. (By word, I strictly mean something that is 
the same size as a pointer.)


What I mean is that it's useless to have a length field that is not the 
same size as the pointer field, if you have only those two fields. If you 
have more than two fields, then you can put several short fields in the 
space of a word (2 or 4).
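
The padding can be measured directly with offsetof. In this sketch the struct names t_string16 and t_stringword and the helper string16_padding() are invented for the comparison. The comments give the typical case, not something the C standard guarantees:

```c
#include <assert.h>
#include <stddef.h>

typedef struct _string16 /* the 16-bit-length variant */
{
    unsigned short s_length;
    unsigned char *s_data;
} t_string16;

typedef struct _stringword /* length as wide as size_t */
{
    size_t s_length; /* pointer-sized on common platforms */
    unsigned char *s_data;
} t_stringword;

/* bytes of padding the compiler put between s_length and s_data;
   typically 2 with 4-byte pointers and 6 with 8-byte pointers */
static size_t string16_padding(void)
{
    assert(offsetof(t_string16, s_data) >= sizeof(unsigned short));
    return offsetof(t_string16, s_data) - sizeof(unsigned short);
}
```

On the usual 32-bit and 64-bit ABIs sizeof(t_string16) equals sizeof(t_stringword), so the short length field saves no memory.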


 _ _ __ ___ _  _ _ ...
| Mathieu Bouchard - tél:+1.514.383.3801 - http://artengine.ca/matju
| Freelance Digital Arts Engineer, Montréal QC Canada


Re: [PD-dev] strings

2006-12-16 Thread Mathieu Bouchard

On Sat, 16 Dec 2006, Martin Peach wrote:

What if strings could be automatically cast to symbols for externals that 
would rather have symbols, and vice-versa?
I have written an external asc2sym that takes lists of bytes and splits them 
into symbols based on the argument(s) which are characters.
But it seems important to avoid symbols as much as possible to avoid filling 
up the symbol table with symbols that are referenced only once..


Yes, but my reason for wanting this, is that all externals currently 
available understand symbols but not strings. So, what if you want to make 
strings as widely used as possible, as easily as possible, and working 
with all externals currently available in Pd?


You make them work as strings when they can, and
You make them work as symbols when they must.


A string could be considered unused when its length is set to 0.


If you want to use a string as a mutable buffer, then you want to be 
able to have 0-length strings, as a boundary condition: you start with 
nothing and then add to it. You don't want to have to start with 
something just because setting the length to 0 would delete it.


It seems that you are suggesting that the deallocation would be 
user-controlled? Then how do you prevent the user from crashing pd?
If you use a weak-pointer as an intermediate (like t_gpointer or 
t_gfxstub), then you still have to manage reference counts. Whatever you 
do for the user, you have to know more about externals' behaviour than 
what they tell you now, because right now they don't deallocate atoms 
explicitly.
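
For reference, the reference-counting bookkeeping itself is small. In this sketch the type t_rcstring and its three functions are hypothetical. The hard part raised above, which is deciding who calls retain and release when externals never deallocate atoms today, is exactly what the sketch does not solve:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct _rcstring /* hypothetical reference-counted string */
{
    unsigned long s_refcount; /* number of current holders */
    unsigned long s_length;   /* length in bytes; 0 is a valid string */
    unsigned char *s_data;
} t_rcstring;

/* create a string holding a copy of n bytes, with one reference */
static t_rcstring *rcstring_new(const unsigned char *bytes, unsigned long n)
{
    t_rcstring *s;
    assert(bytes != NULL || n == 0);
    s = malloc(sizeof(*s));
    if (!s) return NULL;
    s->s_data = malloc(n ? n : 1); /* never ask malloc for 0 bytes */
    if (!s->s_data) { free(s); return NULL; }
    if (n) memcpy(s->s_data, bytes, n);
    s->s_length = n;
    s->s_refcount = 1;
    return s;
}

/* take an additional reference */
static t_rcstring *rcstring_retain(t_rcstring *s)
{
    assert(s != NULL);
    s->s_refcount++;
    return s;
}

/* drop one reference; frees the string and returns 1 when it was the
   last one, otherwise returns 0 */
static int rcstring_release(t_rcstring *s)
{
    assert(s != NULL && s->s_refcount > 0);
    if (--s->s_refcount > 0) return 0;
    free(s->s_data);
    free(s);
    return 1;
}
```

Lifetime here depends only on the count, so a zero-length string stays alive as long as someone holds it and can serve as an empty buffer to append to.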


But if strings are going to be deallocated explicitly and there is not 
going to be any checks, why not instead make something that will allow 
users to deallocate symbols. It's about as safe as that and you don't need 
to introduce a string type.



Memory would need to be dynamically allocated in small blocks.


What do you mean in small blocks ?

The API should return no method for string if the external doesn't 
implement strings.


That's aiming low. Why shouldn't there be any automatic casts between the 
two?


 _ _ __ ___ _  _ _ ...
| Mathieu Bouchard - tél:+1.514.383.3801 - http://artengine.ca/matju
| Freelance Digital Arts Engineer, Montréal QC Canada


Re: [PD-dev] strings

2006-12-15 Thread padawan12

Thanks Hans and IOhan, I think Bryan's offering covers most 
of what is needed, adequate to muddle by until such time when we
have real strings.

Andy


On Fri, 15 Dec 2006 17:41:03 -0500
Hans-Christoph Steiner [EMAIL PROTECTED] wrote:

 
 You can do a fair amount of string handling with [list2symbol] and  
 things like that.  But yes, it leaves a lot to be desired.  Bryan  
 Jurish has taken a different approach, which is to use lists of bytes  
 to represent strings.  Might be worth checking out.
 
 .hc
 
 On Dec 15, 2006, at 2:06 AM, padawan12 wrote:
 
  A new and keen developer on the forums has asked "What about text
  processing in Pd?", to which I replied "Pd doesn't do strings."
  I tie myself in knots trying string-like operations sometimes :),
  so I know it's a can of worms, but what are the fundamental
  limitations surrounding symbol? How do we deal with EOL or NULL and
  so on, and what about encoding? Did I hear a rumour that better
  string handling is chalked in for Pd soon? An alphanumeric sort,
  maybe even a [grep] or [sed]? What would be the best way to
  introduce the concept of strings to Pd in a consistent and robust
  way? I see them as lists of symbols without any need for a new type,
  but right now there are pieces of the jigsaw missing. Sorry so many
  questions, but it's bugging me today.
  a.
 
 
 
 
 
 
 
 All information should be free.  - the hacker ethic
 
 
 
 



Re: [PD-dev] strings

2006-12-15 Thread Hans-Christoph Steiner


Plus you can use that string format directly with Martin Peach's  
network objects, AFAIK.


.hc

On Dec 16, 2006, at 7:16 AM, padawan12 wrote:



Thanks Hans and IOhan, I think Bryan's offering covers most
of what is needed, adequate to muddle by until such time when we
have real strings.

Andy


On Fri, 15 Dec 2006 17:41:03 -0500
Hans-Christoph Steiner [EMAIL PROTECTED] wrote:



You can do a fair amount of string handling with [list2symbol] and
things like that.  But yes, it leaves a lot to be desired.  Bryan
Jurish has taken a different approach, which is to use lists of bytes
to represent strings.  Might be worth checking out.

.hc

On Dec 15, 2006, at 2:06 AM, padawan12 wrote:


A new and keen developer on the forums has asked "What about text
processing in Pd?", to which I replied "Pd doesn't do strings."
I tie myself in knots trying string-like operations sometimes :),
so I know it's a can of worms, but what are the fundamental
limitations surrounding symbol? How do we deal with EOL or NULL and
so on, and what about encoding? Did I hear a rumour that better
string handling is chalked in for Pd soon? An alphanumeric sort,
maybe even a [grep] or [sed]? What would be the best way to
introduce the concept of strings to Pd in a consistent and robust
way? I see them as lists of symbols without any need for a new type,
but right now there are pieces of the jigsaw missing. Sorry so many
questions, but it's bugging me today.
a.






All information should be free.  - the hacker ethic







Re: [PD-dev] strings

2006-12-15 Thread Mathieu Bouchard

On Fri, 15 Dec 2006, Hans-Christoph Steiner wrote:

But yes, it leaves a lot to be desired.  Bryan Jurish has taken a 
different approach, which is to use lists of bytes to represent strings. 
Might be worth checking out.


An advantage using the list-of-bytes approach is that because each 
character can be represented by a rather large integer, it can be extended 
to work on lists-of-characters meaning quickly, if there is a [utf8decode] 
and [utf8encode] to turn bytes into characters and back; also it's a 
method that is available now and reuses the existing list objects; and 
it's a method that supports \0 (NUL) characters.
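
As an idea of what a [utf8decode] would have to do internally, here is a sketch of a decoder from bytes to code points. The function name and signature are invented for illustration and are not taken from any existing external. Every Unicode code point is at most 0x10FFFF, which is below 2^24, so each one fits exactly in a float in memory:

```c
#include <assert.h>
#include <stddef.h>

/* Decode UTF-8 bytes into code points. Returns the number of code
   points written to out (at most max), or (size_t)-1 on a malformed or
   truncated sequence. Overlong forms and surrogates are not rejected
   in this sketch. */
static size_t utf8_decode(const unsigned char *in, size_t n,
                          unsigned long *out, size_t max)
{
    size_t i = 0, count = 0;
    assert(in != NULL && out != NULL);
    while (i < n && count < max)
    {
        unsigned long cp;
        size_t extra, k;
        unsigned char b = in[i++];
        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; }
        else return (size_t)-1; /* stray continuation or invalid lead */
        if (extra > n - i) return (size_t)-1; /* sequence cut short */
        for (k = 0; k < extra; k++)
        {
            unsigned char c = in[i++];
            if ((c & 0xC0) != 0x80) return (size_t)-1;
            cp = (cp << 6) | (unsigned long)(c & 0x3F);
        }
        out[count++] = cp;
    }
    return count;
}
```

The output list is shorter than the input whenever multibyte characters are present, which is what lets the existing list objects then operate per character rather than per byte.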


Disadvantages are that it takes more time to convert to C strings and 
back, it takes more space in .pd files, it isn't readable as text in .pd 
files, it takes up to 4 times more space to represent in .pd files, and 
exactly 4 times more space in RAM (in the case that just iso-latin-1 is 
used), and also that you can't make lists of strings like that.


(By the time we can have real strings, we can have nested-lists, and the 
other way around, because they'd use the same mechanisms. whether it's 
better to make them two types or one type, is a good question.)


 _ _ __ ___ _  _ _ ...
| Mathieu Bouchard - tél:+1.514.383.3801 - http://artengine.ca/matju
| Freelance Digital Arts Engineer, Montréal QC Canada