Again, I am sorry for introducing the mistakes. I've attached a patch that fixes the spacing issues.
It also adds an implementation of __idiv__ and makes __abs__ return a real vector when acting on a complex vector. So for example, if "a" is of type complex64 and "b = abs(a)", "b" is of type float32. I've run the current unit tests and they don't report errors on my machines (gtx580's). python test_gpuarray.py ================================================================ test session starts ================================================================= platform linux2 -- Python 2.6.6 -- pytest-2.1.3 collected 45 items test_gpuarray.py ............................................. ============================================================= 45 passed in 34.02 seconds ============================================================= Thanks, Alex On Tue, Aug 21, 2012 at 2:54 PM, Alex Nitz <[email protected]> wrote: > Thanks for applying the patches. I apologize for the mistakes. I hope it > didn't cause you > too much inconvenience. I'll endeavor not to make them in the future. > > > On Tue, Aug 21, 2012 at 2:42 PM, Andreas Kloeckner < > [email protected]> wrote: > >> Alex Nitz <[email protected]> writes: >> > I have attached a patch that addresses using pow with a complex vector. >> The >> > issue I found was that it was using the wrong >> > function name in the kernel. There is a if statement that sets the >> function >> > name to "pow" for float64, and "powf" for everything else. >> > This problems is that complex types also use "pow" for the function >> name. >> > >> > I've also attached several patches that address a few issues related to >> > using a real GPUArray with a complex scalar. The main issue is that >> > the get_axbz_kernel set the output (z) vector to the same dtype as the >> > input one (x), and assumes the constant factors are the same dtype as >> well. >> > >> > So for real types the following operation makes sense. >> > z[i] = a * x[i] + b >> > >> > If "a" or "b" is complex, however, the code will complain that it has >> been >> > given the wrong type. My patch changes the behavior so that "a","b", and >> > "z" have >> > the same dtype, but can be set separately from "x". For the various >> > operations in GPUArray that call this function, I use the >> _get_common_dtype >> > function >> > even when the "other" is a scalar. This applies to subtraction, >> addition, >> > and multiplication. Division calls a different kernel so I made a >> similar >> > modification there >> > as well. >> > >> > Finally, I modified the "dot" function to work when one argument is >> complex >> > and the other is real. Using get_common_dtype worked to fix this issue >> as >> > well. >> >> Thanks for your patches, I've applied them. >> >> In the future, please stick to PEP 8. (wrt commas and spaces, especially) >> >> I.e. >> >> BAD: f(x,y) >> GOOD: f(x, y) >> >> Also, please next time run the tests to see if they pass: >> >> File "/mnt/nfs-main/home/andreas/src/pycuda/pycuda/gpuarray.py", line >> 450, in __rmul__ >> result = self._new_like_me(_get_common_dtype(self, other)) >> NameError: global name 'other' is not defined >> >> (FTFY) >> >> > Also, I wonder why in compyte/array.py the get_common_dtype function >> does >> > not simply call numpy.find_common_dtype(vectors,scalars)? >> >> I didn't know about numpy.find_common_dtype. Thanks for pointing it >> out. But in any case, obj2 is allowed to be a plain Python scalar, for >> which I'd rather let numpy do the special case handling... >> >> Andreas >> > >
0001-fixed-mistakes-in-argument-spacing-that-I-introduced.patch
Description: Binary data
_______________________________________________ PyCUDA mailing list [email protected] http://lists.tiker.net/listinfo/pycuda
