Re: [PD-dev] gcc 4.1 and auto-vectorization

2006-11-19 Thread Thomas Grill


On 19.11.2006 at 05:00, Hans-Christoph Steiner wrote:



On Nov 18, 2006, at 8:07 PM, Thomas Grill wrote:



On 18.11.2006 at 22:16, Mathieu Bouchard wrote:


On Sat, 18 Nov 2006, Hans-Christoph Steiner wrote:

I really doubt that the gcc devs put a lot of effort into  
something that has no effect. Perhaps not for Pd, that may be  
true.  But they are talking about vectorizing loops, it may not  
be the best thing to vectorize, but there are definitely  
vectorizable loops in Pd.


perhaps it would be a good start to reimplement newbytes(n) using  
memalign(16,n) instead of malloc(n).


A few years ago i introduced aligned memory allocation in the pd- 
devel branch.


Have you tried submitting a patch?  It would be at least useful in  
Pd-extended.  How big a difference did it make?


I have a better idea. People interested in improvements can easily  
make a diff from the devel branch.
The aligned memory allocation is part of the SIMD codelets which have  
been part of pd-devel for a long time.


best greetings,
Thomas


___
PD-dev mailing list
PD-dev@iem.at
http://lists.puredata.info/listinfo/pd-dev


Re: [PD-dev] gcc 4.1 and auto-vectorization

2006-11-19 Thread Hans-Christoph Steiner


On Nov 19, 2006, at 5:13 AM, Thomas Grill wrote:



On 19.11.2006 at 05:00, Hans-Christoph Steiner wrote:



On Nov 18, 2006, at 8:07 PM, Thomas Grill wrote:



On 18.11.2006 at 22:16, Mathieu Bouchard wrote:


On Sat, 18 Nov 2006, Hans-Christoph Steiner wrote:

I really doubt that the gcc devs put a lot of effort into  
something that has no effect. Perhaps not for Pd, that may be  
true.  But they are talking about vectorizing loops, it may not  
be the best thing to vectorize, but there are definitely  
vectorizable loops in Pd.


perhaps it would be a good start to reimplement newbytes(n)  
using memalign(16,n) instead of malloc(n).


A few years ago i introduced aligned memory allocation in the pd- 
devel branch.


Have you tried submitting a patch?  It would be at least useful in  
Pd-extended.  How big a difference did it make?


I have a better idea. People interested in improvements can easily  
make a diff from the devel branch.
The aligned memory allocation is part of the SIMD codelets which  
have been part of pd-devel for a long time.


It's generally accepted procedure in the projects that I've seen that
people guide their own code through the process of submitting patches
and getting them accepted.  I think that makes sense here too.


It's becoming quite clear that devel/dd is a fork, since the devel/dd devs
are resistant or unwilling to try to get code into pd-MAIN.  That's
too bad; I think we will all be the worse for it, but it's your choice
to do so.  I think it would be helpful to make it clear that it's a
fork instead of continuing to skirt the issue.


.hc




If you are not part of the solution, you are part of the problem.





Re: [PD-dev] gcc 4.1 and auto-vectorization

2006-11-19 Thread Mathieu Bouchard

On Sun, 19 Nov 2006, Hans-Christoph Steiner wrote:

It's becoming quite clear that devel/dd is a fork, since the devel/dd devs are
resistant or unwilling to try to get code into pd-MAIN.


Please. See it more clearly. Welcome to 2006.

Or is it 2005.

I think it would be helpful to make it clear that it's a fork instead of
continuing to skirt the issue.


Yes, please don't skirt the issue.

 _ _ __ ___ _  _ _ ...
| Mathieu Bouchard - tél:+1.514.383.3801 - http://artengine.ca/matju
| Freelance Digital Arts Engineer, Montréal QC Canada


Re: [PD-dev] gcc 4.1 and auto-vectorization

2006-11-19 Thread Mathieu Bouchard

On Sun, 19 Nov 2006, Tim Blechmann wrote:

what makes you think, that just aligning memory regions introduces a 
performance boost? how can a compiler generate code for aligned memory, 
if the memory is aligned, but the compiler isn't aware of that?


The machine code can be code that works regardless of memory alignment, 
but runs faster when the memory is aligned.
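
One way this can happen, as a hedged sketch: the function name scale_block() is made up for illustration, and the peeling is written by hand in plain C where a vectorizer would emit SIMD instructions. A short scalar prologue runs until the pointer reaches a 16-byte boundary, so the code is correct for any input, while a block that is already aligned goes straight to the main loop.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical kernel: multiplies n floats in place by g.
   Correct for any alignment of x; the main loop only ever starts
   on a 16-byte boundary. */
static void scale_block(float *x, float g, size_t n)
{
    size_t i = 0;

    /* Prologue: scalar steps until x + i is 16-byte aligned
       (at most 3 iterations for 4-byte floats). */
    while (i < n && ((uintptr_t)(x + i) & 15u) != 0) {
        x[i] *= g;
        i++;
    }

    /* Main loop: four floats (16 bytes) per iteration, aligned start. */
    for (; i + 4 <= n; i += 4) {
        x[i]     *= g;
        x[i + 1] *= g;
        x[i + 2] *= g;
        x[i + 3] *= g;
    }

    /* Epilogue: whatever is left over. */
    for (; i < n; i++)
        x[i] *= g;
}
```

If the block is allocated on a 16-byte boundary, the prologue does nothing; otherwise it costs a few extra scalar iterations per call.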


 _ _ __ ___ _  _ _ ...
| Mathieu Bouchard - tél:+1.514.383.3801 - http://artengine.ca/matju
| Freelance Digital Arts Engineer, Montréal QC Canada


Re: [PD-dev] gcc 4.1 and auto-vectorization

2006-11-19 Thread Thomas Grill

Hi H.C.,

It's generally accepted procedure in the projects that I've seen that
people guide their own code through the process of submitting
patches and getting them accepted.  I think that makes sense here too.


It's becoming quite clear that devel/dd is a fork, since the devel/dd
devs are resistant or unwilling to try to get code into pd-MAIN.


well, you know perfectly well that this isn't true. I have been submitting
many patches and bug reports over the last years, but i don't see why
i should invest my really scarce time in something that's senseless.
If i had submitted the SIMD patch two or three years ago
(at the time i made the relevant changes), it would have been
automatically discarded in the meantime, simply because Miller has
shown zero interest in using it. The SIMD patch would have been a lot
of work, and wasted time that i could have used for other developments
or even for composing music.


That's too bad; I think we will all be the worse for it, but it's
your choice to do so.  I think it would be helpful to make it clear
that it's a fork instead of continuing to skirt the issue.


I'll continue submitting patches that aren't a lot of work, like
bug fixes. For other stuff (like the existing audio and midi fixes,
idle callbacks, SIMD, or other features i have in mind) interested
people are more than welcome to keep track of the changes and submit
them. I don't see why it necessarily has to be me who does it.
If it helps, i'll also announce to the pd-devel list when new features
are introduced.


best greetings,
Thomas




Re: [PD-dev] gcc 4.1 and auto-vectorization

2006-11-19 Thread Thomas Grill


On 19.11.2006 at 22:57, Mathieu Bouchard wrote:


On Sun, 19 Nov 2006, Thomas Grill wrote:

On 18.11.2006 at 22:16, Mathieu Bouchard wrote:
perhaps it would be a good start to reimplement newbytes(n) using  
memalign(16,n) instead of malloc(n).
A few years ago i introduced aligned memory allocation in the pd- 
devel branch.


I see how you did it. Is it because posix_memalign() isn't as  
portable as we'd like it to be? (I wrote memalign by mistake,  
which is the name of a deprecated function that does a similar job)


It seems like a lot of memory is allocated unaligned. Is that
normal? If the memory allocations you've aligned cover the most
speed-critical memory, then why did Tim say that about memory alignment?


The point is that i only introduced and used the aligned memory
functions for the SIMD codelets, which are used for DSP and array
processing. I'm sure that there are aligned memory allocation
functions for each platform (though not necessarily
posix_memalign...), but i wanted to stay as close as possible to the
original PD memory functions.
I don't think it makes much sense to use aligned memory for anything
other than DSP and tables. If one wanted to use it with
auto-vectorization, the header code would be much the same as the one in
the DSP perform functions, with some casting to aligned pointers so
that the compiler knows about the alignment. Aliasing is another thing, though.
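
A rough sketch of what such a header could look like, under assumptions: add_perform() is a hypothetical stand-in rather than one of Pd's perform routines, t_sample is taken to be float, and __builtin_assume_aligned() is a GCC/Clang builtin from releases newer than the gcc 4.1 discussed in this thread. The caller must really pass 16-byte aligned buffers, otherwise the promise is false and the behavior is undefined.

```c
typedef float t_sample;   /* assumption: samples are 32-bit floats */

/* Hypothetical perform-style routine: out[i] = in1[i] + in2[i]. */
static void add_perform(const t_sample *in1, const t_sample *in2,
                        t_sample *out, int n)
{
    /* "Header": tell the compiler that all three blocks start on a
       16-byte boundary. This only states alignment; it says nothing
       about whether the blocks overlap (aliasing). */
    const t_sample *a = __builtin_assume_aligned(in1, 16);
    const t_sample *b = __builtin_assume_aligned(in2, 16);
    t_sample *o = __builtin_assume_aligned(out, 16);
    int i;

    for (i = 0; i < n; i++)
        o[i] = a[i] + b[i];
}
```

The loop body is unchanged; only the pointer declarations at the top carry the alignment information.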


greetings,
Thomas




Re: [PD-dev] gcc 4.1 and auto-vectorization

2006-11-18 Thread Tim Blechmann
 I really doubt that the gcc devs put a lot of effort into something  
 that has no effect. Perhaps not for Pd, that may be true.  But they  
 are talking about vectorizing loops, it may not be the best thing to  
 vectorize, but there are definitely vectorizable loops in Pd.

the problem is not vectorizing, but auto-vectorizing. the best thing
that gcc (or icc) can do is to generate vectorized code for non-aligned
(read: non-optimal) memory, e.g. for setting audio blocks ...
loops that access two or more blocks will face the aliasing problem

 I'd say it's worth trying.

just try it, i'm curious about your oprofile dumps 

 Compilation optimization is not an  
 exercise in pure deduction. 

no, but you can figure out a lot by examining the machine code (if you
can read machine code) and read the debugging output of the vectorizer.

t

--
[EMAIL PROTECTED]  ICQ: 96771783
http://www.mokabar.tk

You can play a shoestring if you're sincere
  John Coltrane




Re: [PD-dev] gcc 4.1 and auto-vectorization

2006-11-18 Thread Mathieu Bouchard

On Sat, 18 Nov 2006, Hans-Christoph Steiner wrote:

I really doubt that the gcc devs put a lot of effort into something that 
has no effect. Perhaps not for Pd, that may be true.  But they are 
talking about vectorizing loops, it may not be the best thing to 
vectorize, but there are definitely vectorizable loops in Pd.


perhaps it would be a good start to reimplement newbytes(n) using 
memalign(16,n) instead of malloc(n).
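
A minimal sketch of that suggestion, not Pd's actual code: the helper names aligned_getbytes() and aligned_freebytes() are hypothetical, and the sketch uses posix_memalign(), the POSIX call for this job, rather than memalign().

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical replacement for an allocation helper: returns nbytes
   of memory starting on a 16-byte boundary, or NULL on failure. */
static void *aligned_getbytes(size_t nbytes)
{
    void *p = NULL;

    if (nbytes == 0)
        nbytes = 1;                 /* always hand back a usable pointer */
    if (posix_memalign(&p, 16, nbytes) != 0)
        return NULL;                /* allocation failed */
    return p;
}

/* Memory from posix_memalign() is released with plain free(). */
static void aligned_freebytes(void *p)
{
    free(p);
}

/* Returns 1 if p sits on a 16-byte boundary, 0 otherwise. */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}
```

A caller would use these in place of malloc()/free() for signal buffers and tables; whether the alignment pays off still has to be measured.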


 _ _ __ ___ _  _ _ ...
| Mathieu Bouchard - tél:+1.514.383.3801 - http://artengine.ca/matju
| Freelance Digital Arts Engineer, Montréal QC Canada


Re: [PD-dev] gcc 4.1 and auto-vectorization

2006-11-18 Thread chris clepper

On 11/18/06, Mathieu Bouchard [EMAIL PROTECTED] wrote:


On Sat, 18 Nov 2006, Hans-Christoph Steiner wrote:

 I really doubt that the gcc devs put a lot of effort into something that
 has no effect. Perhaps not for Pd, that may be true.  But they are
 talking about vectorizing loops, it may not be the best thing to
 vectorize, but there are definitely vectorizable loops in Pd.

perhaps it would be a good start to reimplement newbytes(n) using
memalign(16,n) instead of malloc(n).



Fix the loop sizes to a literal so the compiler has some clue as to how the
loop is structured.  The compiler will not figure out the structure when the
loop size is passed in as a runtime parameter.
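
To make the two variants concrete, here is a small sketch. The function names are invented for illustration, and BLOCKSIZE = 64 assumes Pd's default block size. Both can be built with something like gcc -O2 -ftree-vectorize -msse2; how differently the two are treated depends on the compiler version, so this is a starting point for comparison rather than a guarantee.

```c
typedef float t_sample;   /* assumption: samples are 32-bit floats */

#define BLOCKSIZE 64      /* assumption: Pd's default block size */

/* Trip count is only known at run time. */
static void gain_runtime(t_sample *x, t_sample g, int n)
{
    int i;
    for (i = 0; i < n; i++)
        x[i] *= g;
}

/* Trip count is a compile-time literal, so the compiler knows the
   loop runs exactly BLOCKSIZE times. */
static void gain_fixed(t_sample *x, t_sample g)
{
    int i;
    for (i = 0; i < BLOCKSIZE; i++)
        x[i] *= g;
}
```

The price of the fixed version is that it only works for one block size, which matters if blocks can be resized at run time.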

It should be noted that most 'benchmarks' for the auto-vector features are
heavily rigged and not like anything used in the average application.  See
further http://www.spec.org/


Re: [PD-dev] gcc 4.1 and auto-vectorization

2006-11-18 Thread Hans-Christoph Steiner


On Nov 18, 2006, at 8:07 PM, Thomas Grill wrote:



On 18.11.2006 at 22:16, Mathieu Bouchard wrote:


On Sat, 18 Nov 2006, Hans-Christoph Steiner wrote:

I really doubt that the gcc devs put a lot of effort into  
something that has no effect. Perhaps not for Pd, that may be  
true.  But they are talking about vectorizing loops, it may not  
be the best thing to vectorize, but there are definitely  
vectorizable loops in Pd.


perhaps it would be a good start to reimplement newbytes(n) using  
memalign(16,n) instead of malloc(n).


A few years ago i introduced aligned memory allocation in the pd- 
devel branch.


Have you tried submitting a patch?  It would be at least useful in
Pd-extended.  How big a difference did it make?


.hc



Using ReBirth is like trying to play an 808 with a long stick.
  - David Zicarelli






Re: [PD-dev] gcc 4.1 and auto-vectorization

2006-11-17 Thread Tim Blechmann
On Thu, 2006-11-16 at 16:28 -0500, Hans-Christoph Steiner wrote:
 Debian/testing now uses gcc 4.1 as its default compiler.  I just  
 noticed when doing the apt-get upgrades.  Has anyone tried the auto- 
 vectorization stuff?  Is it worthwhile with Pd?

you might want to check the archives:
http://lists.puredata.info/pipermail/pd-dev/2006-08/007324.html

to explain the terms 'alignment' and 'aliasing':

alignment: 
audio blocks are not known to be aligned to 16-byte boundaries

aliasing: 
for functions in the form foo(t_sample * a, t_sample * b, int n), the
compiler is unable to know if the memory regions of a and b are
overlapping (b may be a+1)
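
The aliasing case can be shown with a small sketch. The function names are made up for illustration, t_sample is assumed to be float, and 'restrict' is the C99 qualifier with which the programmer promises that the regions do not overlap.

```c
typedef float t_sample;   /* assumption: samples are 32-bit floats */

/* No promise about overlap: the compiler must assume b may point
   into a, so each store to b[i] may change a later a[i]. */
static void copy_plain(t_sample *a, t_sample *b, int n)
{
    int i;
    for (i = 0; i < n; i++)
        b[i] = a[i];
}

/* With 'restrict' the caller promises that a and b do not overlap,
   so loads and stores may be grouped into vectors. Calling this with
   b == a + 1 would break the promise. */
static void copy_restrict(const t_sample *restrict a,
                          t_sample *restrict b, int n)
{
    int i;
    for (i = 0; i < n; i++)
        b[i] = a[i];
}
```

With b == a + 1, copy_plain() has to behave as written: every element it writes ends up equal to a[0], because each iteration reads the value stored by the previous one. That dependency is exactly what prevents the compiler from processing several elements at once unless it is told the regions are distinct.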

cheers ... tim

--
[EMAIL PROTECTED]  ICQ: 96771783
http://www.mokabar.tk

The price an artist pays for doing what he wants is that he has to do
it.
  William S. Burroughs




Re: [PD-dev] gcc 4.1 and auto-vectorization

2006-11-17 Thread Hans-Christoph Steiner


On Nov 17, 2006, at 7:01 AM, Tim Blechmann wrote:


On Thu, 2006-11-16 at 16:28 -0500, Hans-Christoph Steiner wrote:

Debian/testing now uses gcc 4.1 as its default compiler.  I just
noticed when doing the apt-get upgrades.  Has anyone tried the auto-
vectorization stuff?  Is it worthwhile with Pd?


you might want to check the archives:
http://lists.puredata.info/pipermail/pd-dev/2006-08/007324.html

to explain the terms 'alignment' and 'aliasing':

alignment:
audio blocks are not known to be aligned to 16-byte boundaries

aliasing:
for functions in the form foo(t_sample * a, t_sample * b, int n), the
compiler is unable to know if the memory regions of a and b are
overlapping (b may be a+1)


Right, I remember that.  What I meant was: has anyone tried any
benchmarks?


.hc



Using ReBirth is like trying to play an 808 with a long stick.
  - David Zicarelli






Re: [PD-dev] gcc 4.1 and auto-vectorization

2006-11-17 Thread Tim Blechmann
On Fri, 2006-11-17 at 09:10 -0500, Hans-Christoph Steiner wrote: 
 On Nov 17, 2006, at 7:01 AM, Tim Blechmann wrote:
 
  On Thu, 2006-11-16 at 16:28 -0500, Hans-Christoph Steiner wrote:
  Debian/testing now uses gcc 4.1 as its default compiler.  I just
  noticed when doing the apt-get upgrades.  Has anyone tried the auto-
  vectorization stuff?  Is it worthwhile with Pd?
 
  you might want to check the archives:
  http://lists.puredata.info/pipermail/pd-dev/2006-08/007324.html
 
  to explain the terms 'alignment' and 'aliasing':
 
  alignment:
  audio blocks are not known to be aligned to 16-byte boundaries
 
  aliasing:
  for functions in the form foo(t_sample * a, t_sample * b, int n), the
  compiler is unable to know if the memory regions of a and b are
  overlapping (b may be a+1)
 
 Right, I remember that, I was meaning more has anyone tried any  
 benchmarks.

i must admit, i'm a bit confused ...
how can an auto-vectorizer that's known not to have any effect on a
piece of code improve its performance?

tim

--
[EMAIL PROTECTED]  ICQ: 96771783
http://www.mokabar.tk

Beware of bugs in the above code; I have only proved it correct, not tried it.
  Donald Knuth




Re: [PD-dev] gcc 4.1 and auto-vectorization

2006-11-17 Thread Hans-Christoph Steiner


On Nov 17, 2006, at 5:25 PM, Tim Blechmann wrote:


On Fri, 2006-11-17 at 09:10 -0500, Hans-Christoph Steiner wrote:

On Nov 17, 2006, at 7:01 AM, Tim Blechmann wrote:


On Thu, 2006-11-16 at 16:28 -0500, Hans-Christoph Steiner wrote:

Debian/testing now uses gcc 4.1 as its default compiler.  I just
noticed when doing the apt-get upgrades.  Has anyone tried the
auto-vectorization stuff?  Is it worthwhile with Pd?


you might want to check the archives:
http://lists.puredata.info/pipermail/pd-dev/2006-08/007324.html

to explain the terms 'alignment' and 'aliasing':

alignment:
audio blocks are not known to be aligned to 16-byte boundaries

aliasing:
for functions in the form foo(t_sample * a, t_sample * b, int n), the
compiler is unable to know if the memory regions of a and b are
overlapping (b may be a+1)


Right, I remember that, I was meaning more has anyone tried any
benchmarks.


i must admit, i'm a bit confused ...
how can an auto-vectorizer that's known not to have any effect on a
piece of code improve its performance?


I really doubt that the gcc devs put a lot of effort into something
that has no effect.  Perhaps it has no effect for Pd; that may be true.
But they are talking about vectorizing loops.  Pd may not be the best
thing to vectorize, but there are definitely vectorizable loops in Pd.


I'd say it's worth trying.  Compilation optimization is not an
exercise in pure deduction.  There are too many variables in
real-world performance for humans to know how to program
for the best performance without testing and profiling.
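
A very small timing harness along those lines, as a sketch only: gain() and time_kernel() are hypothetical names, BLOCKSIZE = 64 assumes Pd's default block size, and clock() is used for simplicity where a real comparison would rather use a profiler. The same file would be built once with and once without the vectorizer and the timings compared.

```c
#include <time.h>

typedef float t_sample;   /* assumption: samples are 32-bit floats */

#define BLOCKSIZE 64      /* assumption: Pd's default block size */

/* Kernel under test: scales a block in place. */
static void gain(t_sample *x, t_sample g, int n)
{
    int i;
    for (i = 0; i < n; i++)
        x[i] *= g;
}

/* Runs the kernel reps times over one block and returns the CPU
   time spent, in seconds. */
static double time_kernel(void (*kernel)(t_sample *, t_sample, int),
                          int reps)
{
    static t_sample block[BLOCKSIZE];
    volatile t_sample sink;   /* keeps the result observable */
    clock_t t0, t1;
    int i, r;

    for (i = 0; i < BLOCKSIZE; i++)
        block[i] = 1.0f;

    t0 = clock();
    for (r = 0; r < reps; r++) {
        /* Alternate gains so the values stay bounded (1 or 2). */
        kernel(block, (r & 1) ? 0.5f : 2.0f, BLOCKSIZE);
        sink = block[0];
    }
    t1 = clock();

    (void)sink;
    return (double)(t1 - t0) / CLOCKS_PER_SEC;
}
```

A call such as time_kernel(gain, 10000000) gives one number per build; several runs are needed before small differences mean anything.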


.hc



Using ReBirth is like trying to play an 808 with a long stick.
  - David Zicarelli



