Re: [PyCUDA] Histograms with PyCUDA

2012-04-06 Thread Francisco Villaescusa Navarro
Thanks a lot! You are completely right. With these changes the code is ~20% faster. Thanks, Fran. On 05/04/2012, at 23:19, pierre castellani wrote: Hi Francisco, Just my 2 cents on your kernel: I've learned (in my old days ;-) ) that pow should be avoided. Just try:

Re: [PyCUDA] PyCuda 3x slower than nvcc

2012-04-06 Thread Michiel Bruinink
That is for a Windows system. I have Linux. From the PyCUDA documentation, section "Just-in-time Compilation": "If keep is True, the compiler output directory is kept, and a line indicating its location in the file system is printed for debugging purposes." There is nothing printed when I set

Re: [PyCUDA] PyCuda 3x slower than nvcc

2012-04-06 Thread Tomi Pieviläinen
> That is for a Windows system. I have Linux. This doesn't make a difference; the paths probably just point to /tmp/, as on my system. > From the PyCUDA documentation, section "Just-in-time Compilation": If keep is True, the compiler output directory is kept, and a line indicating its location in

Re: [PyCUDA] PyCuda 3x slower than nvcc

2012-04-06 Thread Michiel Bruinink
OK, the location is only printed when you compile code that has not been compiled before. I have the file now. Michiel. Tomi Pieviläinen tomi.pievilai...@iki.fi 4/6/2012 11:51 AM: That is for a Windows system. I have Linux. This doesn't make a difference, just the paths probably point to
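For reference, a minimal sketch of the setup discussed in this thread, assuming PyCUDA is installed and a CUDA device is available. The kernel `scale` and the helper `compile_keeping_output` are hypothetical names used only for illustration; `keep` and `cache_dir` are parameters of `pycuda.compiler.SourceModule`. The PyCUDA documentation says that `cache_dir=False` disables the compiler cache. That this makes the location line appear on every run, rather than only on the first compile, is an assumption here.

```python
# Hypothetical example kernel; not the code from the thread.
KERNEL_SOURCE = """
__global__ void scale(float *x, float a, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        x[i] = a * x[i];
}
"""


def compile_keeping_output(source=KERNEL_SOURCE):
    """Compile `source` and keep the nvcc output directory for inspection."""
    # Imported lazily so this file can be loaded on a machine without a GPU.
    import pycuda.autoinit  # noqa: F401  (creates a CUDA context)
    from pycuda.compiler import SourceModule

    # keep=True: the compiler output directory is kept and its location printed.
    # cache_dir=False: bypass the compiler cache (assumed to force a recompile,
    # so the location is printed even for previously compiled code).
    return SourceModule(source, keep=True, cache_dir=False)
```

Calling `compile_keeping_output()` on a machine with a GPU should print the directory that holds the intermediate files from nvcc.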

Re: [PyCUDA] Histograms with PyCUDA

2012-04-06 Thread Thomas Wiecki
Do you mind posting the final code here for future reference (as a gist perhaps)? Also, another optimization might be to remove the (slow) sqrt() in each distance calculation and then do sqrt() of the bin labels in the reduction step. On Fri, Apr 6, 2012 at 3:56 AM, Francisco Villaescusa Navarro
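A minimal pure-Python sketch of the sqrt-free idea, not the kernel from this thread. It uses a variation of the suggestion above: instead of taking sqrt() of the bin labels afterwards, it squares the (linear) bin edges once and compares squared distances against them. The function names, the grid of test points, and the edges are all hypothetical.

```python
import bisect
import math


def pair_histogram_sqrt(points, edges):
    """Reference version: one sqrt per pair, binned against linear edges."""
    counts = [0] * (len(edges) - 1)
    for i, (x1, y1, z1) in enumerate(points):
        for x2, y2, z2 in points[i + 1:]:
            dx, dy, dz = x1 - x2, y1 - y2, z1 - z2
            d = math.sqrt(dx * dx + dy * dy + dz * dz)
            k = bisect.bisect_right(edges, d) - 1  # bin k covers [edges[k], edges[k+1])
            if 0 <= k < len(counts):
                counts[k] += 1
    return counts


def pair_histogram_squared(points, edges):
    """No sqrt per pair: squared distances are compared with squared edges."""
    # Valid only for non-negative, increasing edges (squaring keeps the order).
    edges_sq = [e * e for e in edges]
    counts = [0] * (len(edges) - 1)
    for i, (x1, y1, z1) in enumerate(points):
        for x2, y2, z2 in points[i + 1:]:
            dx, dy, dz = x1 - x2, y1 - y2, z1 - z2
            d2 = dx * dx + dy * dy + dz * dz
            k = bisect.bisect_right(edges_sq, d2) - 1
            if 0 <= k < len(counts):
                counts[k] += 1
    return counts


# Hypothetical test data: a 3x3x3 integer grid and four unit-width linear bins.
POINTS = [(x, y, z) for x in range(3) for y in range(3) for z in range(3)]
EDGES = [0.0, 1.0, 2.0, 3.0, 4.0]

if __name__ == "__main__":
    print(pair_histogram_sqrt(POINTS, EDGES))
    print(pair_histogram_squared(POINTS, EDGES))
```

With integer coordinates the two versions agree exactly. With arbitrary floating-point data, a distance that falls almost exactly on a bin edge could in principle land in a neighbouring bin.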

Re: [PyCUDA] Histograms with PyCUDA

2012-04-06 Thread pierre castellani
Hi Francisco, Good to see that it is useful. I was thinking about other ways to speed it up. Do you really need the L2 norm? You could use some other distance calculation that could be faster. Did you look at CUDA-specific functions (for example sqrtf)? Thanks, Pierre. On Friday, 06 April 2012 at

Re: [PyCUDA] Histograms with PyCUDA

2012-04-06 Thread Francisco Villaescusa Navarro
Thanks for all the suggestions! Regarding removing sqrt: it seems that the code only gains ~1%, and you lose the ability to easily define linear intervals... I have tried with sqrt and sqrtf, but there is no difference in the total time (or it is very small). The code to find the