Re: [PD] how to iterate over left and right channel separately in one Pd class?

2013-01-12 Thread Hans-Christoph Steiner

Yeah, that makes sense.  With all the auto-vectorization and SIMD support in
recent versions of gcc, it seems a better approach is to tailor the C code to
work well with SIMD-aware compilers.

.hc


Re: [PD] how to iterate over left and right channel separately in one Pd class?

2013-01-12 Thread katja
It's interesting, but rather compiler-and-processor-specific. Such
code is maintenance-intensive. At the moment, ARM processors are
screaming loudest for optimization. Best thing for a community project
is probably plain C code written with parallel processing in mind,
because that won't go away for the next few decades. Functions like
copy_perform8(), times_perform8() etc. can profit from SIMD
instructions without a need for compiler intrinsics and asm code.
Well-structured data storage and access can yield a performance gain
of 50 % or more, in my experience.
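
As an illustration only (this is not Pd's source), here is a minimal sketch
of a plain C routine in the style of the *_perform8() functions. The name
times8() and the t_float typedef are stand-ins, and the block size is
assumed to be a multiple of 8. Whether a given compiler turns this into
SIMD instructions depends on the compiler, flags and target.

```c
#include <assert.h>

typedef float t_float;  /* stand-in for Pd's t_float (normally from m_pd.h) */

/* Multiply two signal vectors, 8 samples per iteration.
   All loads of a group happen before any store, so the routine stays
   correct when out is the same buffer as in1 or in2 (in-place use). */
void times8(const t_float *in1, const t_float *in2, t_float *out, int n)
{
    assert(n >= 0 && n % 8 == 0);  /* assumption: block size is a multiple of 8 */
    for (; n > 0; n -= 8, in1 += 8, in2 += 8, out += 8)
    {
        t_float a0 = in1[0], a1 = in1[1], a2 = in1[2], a3 = in1[3];
        t_float a4 = in1[4], a5 = in1[5], a6 = in1[6], a7 = in1[7];
        t_float b0 = in2[0], b1 = in2[1], b2 = in2[2], b3 = in2[3];
        t_float b4 = in2[4], b5 = in2[5], b6 = in2[6], b7 = in2[7];
        out[0] = a0 * b0; out[1] = a1 * b1;
        out[2] = a2 * b2; out[3] = a3 * b3;
        out[4] = a4 * b4; out[5] = a5 * b5;
        out[6] = a6 * b6; out[7] = a7 * b7;
    }
}
```

The straight-line body with no branches and no dependencies between the
eight lanes is what gives a vectorizing compiler a chance, without any
intrinsics or asm in the source.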

Another important thing: avoid float precision conversions. Throughout
Pd there are many untyped float defines and literal constants which
default to double, and I have introduced more when making libs
double-ready. Not good. I'll come back to this in another thread.
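
A small sketch of the conversion issue, with made-up names (GAIN_UNTYPED,
GAIN_TYPED, scale_untyped(), scale_typed()) and t_float assumed to be float
here. With an untyped literal the sample is promoted to double, multiplied,
and converted back; with a typed constant the arithmetic stays in t_float.

```c
typedef float t_float;  /* stand-in; Pd can also be built with a double t_float */

#define GAIN_UNTYPED 0.5             /* plain literal: type double */
#define GAIN_TYPED   ((t_float)0.5)  /* follows whatever t_float is */

/* x is converted to double, multiplied, then converted back to t_float. */
t_float scale_untyped(t_float x)
{
    return x * GAIN_UNTYPED;
}

/* The multiplication is done in t_float; no precision conversion needed. */
t_float scale_typed(t_float x)
{
    return x * GAIN_TYPED;
}
```

The cast form has the advantage that it keeps working unchanged when
t_float is double, whereas a hard-coded 0.5f would then force a conversion
in the other direction.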

Katja


 ___
 Pd-list@iem.at mailing list
 UNSUBSCRIBE and account-management -> 
 http://lists.puredata.info/listinfo/pd-list

Re: [PD] Echo detection and autocepstrum

2013-01-12 Thread Hans-Christoph Steiner

Sounds like you're looking for William Brent's work:

http://williambrent.conflations.com/pages/research.html

.hc




Re: [PD] how to iterate over left and right channel separately in one Pd class?

2013-01-12 Thread Hans-Christoph Steiner

If you are interested, there is still the hand-coded SIMD stuff from pd-devel:
https://pure-data.svn.sourceforge.net/svnroot/pure-data/branches/pd-devel/v0-39

.hc



Re: [PD] Automating startup

2013-01-12 Thread Pierre Massat
Hi Rick,

You can start by reading this page in the FLOSS manual, especially the last
paragraphs.
Once you get how it works, it's no big deal to start Pd from a command line
with all the settings you need.

Cheers,

Pierre.



[PD] Automating startup

2013-01-12 Thread Rick Bragg
Hi,

I would like to set up my system to set up my patch automatically with jack
and all the right connections when the system boots. I am using Ubuntu Studio
at the moment.

I currently have it set so that qjackctl starts when I log in, and I set
qjackctl to automatically start up the jack server when it opens, and then to
open my pd file after it starts.

I have a few problems to overcome.

First, every time pd starts, I need to go into the "Media" menu and change
from "Default MIDI" to "Alsa MIDI". Why do I have to change this every time?
Shouldn't this setting be saved?

Second problem:
After pd opens, I always need to go back to qjackctl, open up the
"connections" and change the MIDI connections. Can't I save this as a patchbay
setting somehow? I tried that, but it doesn't work because the patchbay
settings need to load AFTER pd starts; otherwise the pd connections are not
available.

Is there any good documentation somewhere that discusses automating all this
kind of stuff?

Thanks!
Rick






Re: [PD] how to iterate over left and right channel separately in one Pd class?

2013-01-12 Thread katja
Function copy_perform8() is also eligible for SIMD processing. I used
memcpy() because it is straightforward to use, while Pd's functions
pointed to the wrong locations for this case. On the reverb's total
load there is no significant performance difference.

Katja


On Sat, Jan 12, 2013 at 1:00 AM, Hans-Christoph Steiner  wrote:
>
> I recently learned that libc's memcpy actually uses things like SSE2 or SSSE3
> so it can be quite fast on CPUs from the past 10 years, especially those from
> the last 5 years.
>
> It would be worth profiling to see if that's noticeable.
>
> .hc
>
> On 01/11/2013 05:12 PM, katja wrote:
>> Ok so I did the ugly thing with the right channel input and output pointers:
>>
>> memcpy(outR, inR, vectorsize * sizeof(t_float));
>> inR = outR;
>>
>> Works like a charm, thanks again.
>>
>> Katja
>>
>>
>>
>> On Fri, Jan 11, 2013 at 10:05 PM, Miller Puckette  wrote:
>>> copy_perform assumes the data is 4-byte aligned so might save a test
>>> or two compared to memcpy() - but I really don't know.  I never
>>> benchmarked the two against each other :)
>>>
>>> M
>>>
>>> On Fri, Jan 11, 2013 at 09:36:41PM +0100, katja wrote:
>>>> Hi Miller,
>>>>
>>>> Thanks for the solution. The routines are in place so copying the
>>>> right channel input to output should do it. Is there any reason to
>>>> prefer copy_perform() over memcpy()? I'm trying to make the most
>>>> efficient reverb for RPi & Co.
>>>>
>>>> Katja
>>>>
>>>>
>>>>
>>>> On Fri, Jan 11, 2013 at 7:57 PM, Miller Puckette  wrote:
>>>>> Hi Katja -
>>>>>
>>>>> There's one example of this in sigfft_dspx() - a complex FFT that
>>>>> 'natively' works on 2 signals in-place but has to deal with various
>>>>> cases in which buffers get re-used.  It's ugly but the basic idea is
>>>>> first to get the inputs copied to the outputs (unless they're already
>>>>> there in the correct order in which case nothing needs to be done) and
>>>>> then run the in-place algorithm.
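
A minimal sketch of the copy-then-run-in-place idea described above, in
plain C so it compiles without m_pd.h. It is an illustration, not the code
of sigfft_dspx(): stage_inputs(), process_inplace() and stereo_perform()
are made-up names, the gain stage stands in for a real per-channel routine,
and it is assumed that any two vectors are either the same buffer or do not
overlap at all, and that the two outputs are distinct.

```c
#include <string.h>

typedef float t_float;  /* stand-in for Pd's t_float (normally from m_pd.h) */

/* Exchange the contents of two distinct vectors. */
static void swap_vectors(t_float *a, t_float *b, int n)
{
    for (int i = 0; i < n; i++)
    {
        t_float t = a[i];
        a[i] = b[i];
        b[i] = t;
    }
}

/* Afterwards outL holds the left input and outR holds the right input. */
static void stage_inputs(const t_float *inL, const t_float *inR,
                         t_float *outL, t_float *outR, int n)
{
    size_t bytes = (size_t)n * sizeof(t_float);
    if (outL == inR && outR == inL)
        swap_vectors(outL, outR, n);      /* buffers are fully crossed */
    else if (outL == inR)
    {
        memcpy(outR, inR, bytes);         /* save right input first */
        if (outL != inL) memcpy(outL, inL, bytes);
    }
    else
    {
        if (outL != inL) memcpy(outL, inL, bytes);
        if (outR != inR) memcpy(outR, inR, bytes);
    }
}

/* Hypothetical per-channel in-place routine: a plain gain. */
static void process_inplace(t_float *buf, int n, t_float gain)
{
    for (int i = 0; i < n; i++)
        buf[i] *= gain;
}

/* One loop per channel, each running in place on its own output vector. */
void stereo_perform(const t_float *inL, const t_float *inR,
                    t_float *outL, t_float *outR, int n, t_float gain)
{
    stage_inputs(inL, inR, outL, outR, n);
    process_inplace(outL, n, gain);
    process_inplace(outR, n, gain);
}
```

When the inputs already sit in the matching outputs, stage_inputs() copies
nothing, so the common case costs only a few pointer comparisons.
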
>>>>>
>>>>> If the algo only works out-of-place (i.e. you need 4 distinct buffers, 2
>>>>> in and 2 out) the only way out is to (at least conditionally) allocate
>>>>> temporary copies of the inputs before writing to any outputs.
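
For the out-of-place case, a minimal sketch under stated assumptions:
crossfeed() is a made-up kernel that needs distinct input and output
buffers, and tmpL/tmpR are hypothetical scratch vectors of at least n
samples that the object would allocate ahead of time rather than inside
the perform routine. As before, vectors are assumed to be either identical
or non-overlapping.

```c
#include <string.h>

typedef float t_float;  /* stand-in for Pd's t_float (normally from m_pd.h) */

/* Hypothetical out-of-place kernel with one loop per channel.
   It reads both inputs in both loops, so it breaks if an output
   vector is also one of the inputs. */
static void crossfeed(const t_float *inL, const t_float *inR,
                      t_float *outL, t_float *outR, int n, t_float g)
{
    for (int i = 0; i < n; i++)
        outL[i] = inL[i] + g * inR[i];
    for (int i = 0; i < n; i++)
        outR[i] = inR[i] + g * inL[i];
}

/* Copy an input to scratch memory only if it shares a buffer with an output. */
void crossfeed_safe(const t_float *inL, const t_float *inR,
                    t_float *outL, t_float *outR, int n, t_float g,
                    t_float *tmpL, t_float *tmpR)
{
    size_t bytes = (size_t)n * sizeof(t_float);
    if (inL == outL || inL == outR)
    {
        memcpy(tmpL, inL, bytes);
        inL = tmpL;
    }
    if (inR == outL || inR == outR)
    {
        memcpy(tmpR, inR, bytes);
        inR = tmpR;
    }
    crossfeed(inL, inR, outL, outR, n, g);
}
```

The conditional copy keeps the extra work to the cases where Pd actually
hands the object shared buffers.
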
>>>>>
>>>>> I may be able to add an optional way tilde objects can request that output
>>>>> buffers be distinct from input ones sometime in the future - but this is a
>>>>> couple of steps away for me right now :)
>>>>>
>>>>> M
>>>>>
>>>>> On Fri, Jan 11, 2013 at 03:32:09PM +0100, katja wrote:
>>>>>> Hello,
>>>>>>
>>>>>> I'm working on a Pd class with stereo channels (reverb), and the
>>>>>> routine happens to be most efficient when iterating over the samples
>>>>>> per channel, instead of left and right together in the perform loop.
>>>>>> However, when doing two while loops in one object, one for left and
>>>>>> one for right, the right channel samples get overwritten because of
>>>>>> sample-wise in-place computation. Is this an inescapable truth? I
>>>>>> mean, I could write a left channel class and a right channel class
>>>>>> (actually did that to verify that it works), but it's inconvenient to
>>>>>> use. What could be an efficient way to get them in one object?
>>>>>>
>>>>>> Thanks,
>>>>>> Katja


[PD] Echo detection and autocepstrum

2013-01-12 Thread oguz gurler
Hi,

I'm working on echo detection and looking for an autocepstrum in Pd. Is
there any detailed information about the autocepstrum/cepstrum and how to
interpret its output data to find an echo?