Re: [PD] pd, openCV, pointers and indirection.

2009-10-05 Thread Loic Kessous
Thanks Mathieu, it is still not clear to me what makes things faster
in one case or another, but it helps.

loic
PS: what do you call Martin's strings ?

On 3 oct. 09, at 23:22, Mathieu Bouchard wrote:


On Sat, 3 Oct 2009, Loic Kessous wrote:

I understand your point of view, but I am more interested in the
approach than in the implementation itself. I mean passing a pointer
and not the image itself.


Passing the image itself is largely a myth anyway.

At a first level, Pd doesn't always pass $1, $2, $3, etc., as
separate arguments in C: it often passes the pointer to the list
(under the name argv). This is what always happens for running
list-methods and anything-methods, as well as when sending
list-messages and anything-messages (pd automatically converts argv
to non-argv and non-argv to argv whenever needed).


At a second level, not much large data is passed as pd arglists:
some notable exceptions can happen in [pix_data], [pix_set],
[#to_list], [#import], [pix_convolve]'s config, Martin's strings,
etc.; plugins such as Gem and GridFlow use a second level of
pointers to avoid Pd's argv. The main reason is that Pd's argv is
limited to being a t_atom array, which is usually too big and
inefficient for tightly-formatted data: it spends 8 or 16 bytes on
storing a 4-byte float when you just want to store a single-byte
int, for example.


But then, with either level, the way of specifying the pointer to  
the list allows basically anything to happen, as the pointer doesn't  
have to be stack allocated. With argv, methods aren't allowed to  
rely on a past argv after the return is done, but still, the sender  
of the message can decide the argv to be anything, not necessarily  
on the stack; this can happen to be fairly permanent data.


Beyond that, there is a distinction between systems that let the
user deal with the pointerness aspect, and those that try to hide
it (to make it more automatic and easier to think about, they
pretend to pass the image but don't really). Outside of Pd, both
strategies are widely used. Perl and Tcl are very good examples:
their strings never look like they use pointers, but always do. In
Pd, ... only GridFlow uses something that looks like "pass the
image" semantics, though it has a few gotchas, and it's also the
only one that can pass an image without allocating a buffer of the
same size as the image. In the end, all the video frameworks make
the user mess with pointers in some way:


 * Gem's [pix_separator]
 * PDP's [pdp_trigger]
 * GridFlow's [#t]
 * MaPoD even required the user to free() image buffers using a
   special object-class.
 * FrameStein: i don't know (sorry).


That's why it's compiled as a dll library I suppose


I don't see any link between any of the above notions, and the kind  
of linkage (dll, etc) it uses.


and I wonder how another solution, such as shared memory for
example, could be used toward the same goal... loic


No idea what you are referring to. I know what shared memory is, I
know what indirection is, but I don't know what problem that
solution is meant to solve; you didn't say. (And if anything, shared
memory introduces new portability concerns.)


_ _ __ ___ _  _ _ ...
| Mathieu Bouchard, Montréal, Québec. téléphone: +1.514.383.3801



___
Pd-list@iem.at mailing list
UNSUBSCRIBE and account-management - 
http://lists.puredata.info/listinfo/pd-list


Re: [PD] pd, openCV, pointers and indirection.

2009-10-05 Thread Mathieu Bouchard

On Mon, 5 Oct 2009, Loic Kessous wrote:

Thanks Mathieu, it is still not clear to me what makes things faster in
one case or another, but it helps.


1. data spacing: the more your data is spread out in memory, the more data 
the cache has to load, because it loads whole cache lines on the 
assumption that the data is not very fragmented. Pd in 64-bit mode spends 
half of the argv space on padding. Here by spacing I mean the difference 
between the starting positions of two adjacent elements (e.g. where $1 and 
$2 are in RAM).


2. data element size: when the data doesn't have padding, this is the same 
as data spacing. Pd in any mode spends half of the nonpadding argv space 
on type information.


3. type checking: if you have to check that every element of an argv is 
indeed a float, you need to read twice as much data, with twice the 
spacing, and on top of that you need one conditional per element, just in 
case it isn't a float. Conditionals are getting comparatively slow on 
modern CPUs because they're harder to accelerate than the rest.


4. time fragmentation: a small block size may mean the CPU has to reload 
things in the cache more often, if the CPU's other tasks need the same 
cache for other purposes between the processing of two blocks. Bigger 
blocks mean that the CPU can concentrate. Having to repeatedly call, 
init, deinit and return is also something that can take time.


5. cache fitting: repeatedly making long sweeps on very long arrays can 
make the cache completely useless. It's better to do as many things as 
possible on a small area of RAM at a time.
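
A minimal sketch of criterion 5, assuming three hypothetical per-sample
operations (scale, offset, clip) and an arbitrary chunk size; none of
these names come from Pd. Both functions compute the same result, but
the second one finishes all the work on one small region before moving
on to the next:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-sample operations, for illustration only. */
static float scale(float v)  { return v * 0.5f; }
static float offset(float v) { return v + 1.0f; }
static float clip(float v)   { return v > 1.25f ? 1.25f : v; }

/* Three long sweeps: each pass walks the whole array, so on a very
 * long array the beginning may already have been evicted from the
 * cache by the time the next pass starts. */
void process_sweeps(float *data, size_t n) {
    assert(data != NULL || n == 0);
    for (size_t i = 0; i < n; i++) data[i] = scale(data[i]);
    for (size_t i = 0; i < n; i++) data[i] = offset(data[i]);
    for (size_t i = 0; i < n; i++) data[i] = clip(data[i]);
}

/* Assumed chunk size; a real value would be tuned to the cache. */
enum { CHUNK = 1024 };

/* Same operations in the same order, but applied chunk by chunk, so
 * each chunk can stay in the cache while all three passes run on it. */
void process_chunked(float *data, size_t n) {
    assert(data != NULL || n == 0);
    for (size_t start = 0; start < n; start += CHUNK) {
        size_t end = (start + CHUNK < n) ? start + CHUNK : n;
        for (size_t i = start; i < end; i++) data[i] = scale(data[i]);
        for (size_t i = start; i < end; i++) data[i] = offset(data[i]);
        for (size_t i = start; i < end; i++) data[i] = clip(data[i]);
    }
}
```

Whether the chunked version is actually faster depends on the array
size relative to the cache, so it would have to be measured.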


Based on those five criteria, we could compare various storage and 
computation strategies of various internals and externals of pd, provided 
that we get a bit more precise on some things. There may also be 
additional criteria.
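
As a starting point for such a comparison, here is a rough sketch of
criteria 1 to 3. It assumes a simplified tagged element called
mock_atom, which is only a stand-in and not Pd's actual t_atom from
m_pd.h. It contrasts summing a tagged list (one type check per element)
with summing tightly packed bytes:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for a tagged atom; NOT Pd's real t_atom. */
typedef enum { MOCK_FLOAT, MOCK_SYMBOL } mock_type;

typedef struct {
    mock_type type;      /* type information stored with every element */
    union {
        float       f;
        const char *s;   /* pointer member: widens the union on 64-bit */
    } w;
} mock_atom;

/* Sum a tagged list. Every element needs a conditional, just in case
 * it is not a float. Returns 0 on success, -1 on a non-float element. */
int sum_atoms(const mock_atom *argv, int argc, float *out) {
    float total = 0.0f;
    assert(out != NULL);
    for (int i = 0; i < argc; i++) {
        if (argv[i].type != MOCK_FLOAT) return -1;
        total += argv[i].w.f;
    }
    *out = total;
    return 0;
}

/* Sum tightly packed single-byte values: spacing is 1 byte and no
 * per-element type check is needed. */
unsigned sum_bytes(const uint8_t *data, size_t n) {
    unsigned total = 0;
    for (size_t i = 0; i < n; i++) total += data[i];
    return total;
}
```

On a typical 64-bit build sizeof(mock_atom) comes out as 16 and on a
typical 32-bit build as 8, against 1 byte per element for the packed
array.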



loic PS: what do you call Martin's strings ?


I thought I knew, but I borked that. Martin's strings are [mrpeach/str], 
but they don't use pd lists of floats; they use a custom atom type called 
BLOB, which is essentially a form of double-indirection. (POINTER is also 
a double-indirection, but it was meant for data structures (DS), though 
it's often hijacked to be used in other ways.)
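
To illustrate what double-indirection means here, a minimal sketch
follows. The types mock_blob and mock_blob_atom are hypothetical and
are not the actual layout used by mrpeach or by Pd; the only point is
that the atom holds a pointer to a descriptor, which in turn holds a
pointer to the bytes:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical descriptor: length plus a pointer to the raw bytes. */
typedef struct {
    size_t               length;
    const unsigned char *data;    /* second pointer: descriptor -> bytes */
} mock_blob;

/* Hypothetical atom carrying a blob: it stores only a pointer. */
typedef struct {
    int              tag;         /* would identify the atom as a blob */
    const mock_blob *blob;        /* first pointer: atom -> descriptor */
} mock_blob_atom;

/* Reading one byte follows both pointers. */
unsigned char blob_byte_at(const mock_blob_atom *a, size_t i) {
    assert(a != NULL && a->blob != NULL && i < a->blob->length);
    return a->blob->data[i];
}
```

Passing such an atom around copies only the small atom itself, however
long the byte string is.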


 _ _ __ ___ _  _ _ ...
| Mathieu Bouchard, Montréal, Québec. téléphone: +1.514.383.3801


Re: [PD] pd, openCV, pointers and indirection.

2009-10-05 Thread Mathieu Bouchard

On Mon, 5 Oct 2009, Mathieu Bouchard wrote:

On Mon, 5 Oct 2009, Loic Kessous wrote:

loic PS: what do you call Martin's strings ?
I thought I knew, but I borked that. Martin's strings are [mrpeach/str], but 
they don't use pd lists of floats, they use a custom atom type called BLOB,


And the weird thing is that actually I knew that very well, but I still 
wrote «Martin's strings» in the list without thinking, I don't know why. 
Sleep deprivation, drugs, distractions, name it, blame it.


 _ _ __ ___ _  _ _ ...
| Mathieu Bouchard, Montréal, Québec. téléphone: +1.514.383.3801


[PD] pd, openCV, pointers and indirection.

2009-10-03 Thread Mathieu Bouchard

On Sat, 3 Oct 2009, Loic Kessous wrote:

I understand your point of view, but I am more interested in the 
approach than in the implementation itself. I mean passing a pointer and 
not the image itself.


Passing the image itself is largely a myth anyway.

At a first level, Pd doesn't always pass $1, $2, $3, etc., as separate 
arguments in C: it often passes the pointer to the list (under the name 
argv). This is what always happens for running list-methods and 
anything-methods, as well as when sending list-messages and 
anything-messages (pd automatically converts argv to non-argv and non-argv 
to argv whenever needed).


At a second level, not much large data is passed as pd arglists: some 
notable exceptions can happen in [pix_data], [pix_set], [#to_list], 
[#import], [pix_convolve]'s config, Martin's strings, etc.; plugins such 
as Gem and GridFlow use a second level of pointers to avoid Pd's argv. 
The main reason is that Pd's argv is limited to being a t_atom array, 
which is usually too big and inefficient for tightly-formatted data: it 
spends 8 or 16 bytes on storing a 4-byte float when you just want to store 
a single-byte int, for example.


But then, with either level, the way of specifying the pointer to the list 
allows basically anything to happen, as the pointer doesn't have to be 
stack allocated. With argv, methods aren't allowed to rely on a past 
argv after the return is done, but still, the sender of the message can 
decide the argv to be anything, not necessarily on the stack; this can 
happen to be fairly permanent data.
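
A small sketch of that lifetime rule, using hypothetical stand-ins
(mock_atom, mock_receiver, receiver_list) rather than Pd's real types
and method signature: the receiver may not keep the argv pointer after
returning, so it copies what it needs, while the sender is free to
point argv at static or heap data instead of the stack.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for an atom; NOT Pd's real t_atom. */
typedef struct { int type; float f; } mock_atom;

/* Hypothetical receiving object that wants to remember the last list. */
typedef struct {
    int        argc;
    mock_atom *kept;   /* private copy, owned by the receiver */
} mock_receiver;

/* List-method shape: the list arrives as a count plus a pointer.
 * The receiver copies the atoms instead of storing argv itself,
 * because argv is only guaranteed to be valid until the return.
 * Returns 0 on success, -1 if the allocation fails. */
int receiver_list(mock_receiver *x, int argc, const mock_atom *argv) {
    mock_atom *copy = NULL;
    assert(x != NULL && argc >= 0);
    if (argc > 0) {
        copy = malloc((size_t)argc * sizeof *copy);
        if (copy == NULL) return -1;
        memcpy(copy, argv, (size_t)argc * sizeof *copy);
    }
    free(x->kept);
    x->kept = copy;
    x->argc = argc;
    return 0;
}

/* Release the receiver's private copy. */
void receiver_free(mock_receiver *x) {
    free(x->kept);
    x->kept = NULL;
    x->argc = 0;
}
```

A sender could pass a static array here, and later changes to that
array would not affect the receiver's copy.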


Beyond that, there is a distinction between systems that let the user deal 
with the pointerness aspect, and those that try to hide it (to make it 
more automatic and easier to think about, they pretend to pass the image 
but don't really). Outside of Pd, both strategies are widely used. Perl 
and Tcl are very good examples: their strings never look like they use 
pointers, but always do. In Pd, ... only GridFlow uses something that looks 
like "pass the image" semantics, though it has a few gotchas, and it's also 
the only one that can pass an image without allocating a buffer of the same 
size as the image. In the end, all the video frameworks make the user mess 
with pointers in some way:


  * Gem's [pix_separator]
  * PDP's [pdp_trigger]
  * GridFlow's [#t]
  * MaPoD even required the user to free() image buffers using a special
    object-class.
  * FrameStein: i don't know (sorry).


That's why it's compiled as a dll library I suppose


I don't see any link between any of the above notions, and the kind of 
linkage (dll, etc) it uses.


and I wonder how another solution, such as shared memory for example, 
could be used toward the same goal... loic


No idea what you are referring to. I know what shared memory is, I know 
what indirection is, but I don't know what problem that solution is meant 
to solve; you didn't say. (And if anything, shared memory introduces new 
portability concerns.)


 _ _ __ ___ _  _ _ ...
| Mathieu Bouchard, Montréal, Québec. téléphone: +1.514.383.3801