Re: [PyCUDA] various GPUArray fixes

Alex Nitz Tue, 21 Aug 2012 19:01:10 -0700

Again, I am sorry for introducing the mistakes. I've attached a patch that
fixes the spacing issues.


It also adds an implementation of __idiv__ and makes __abs__ return a real
vector when acting on a complex vector. For example, if "a" is of type
complex64 and "b = abs(a)", then "b" is of type float32.
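
As a rough illustration of the intended dtype behaviour, here is a small
sketch that uses NumPy arrays as a stand-in for GPUArray (an assumption made
so the snippet runs without a GPU; it is not the patched PyCUDA code). It also
shows in-place division, which Python 2 dispatches to __idiv__ and Python 3
dispatches to __itruediv__.

```python
import numpy as np

# NumPy stand-in for the GPUArray behaviour described above (assumption:
# the patched GPUArray follows the same dtype rule as NumPy here).
a = np.array([3 + 4j, 1 - 1j], dtype=np.complex64)
b = abs(a)            # magnitude of each complex element
abs_dtype = b.dtype   # real dtype matching the complex precision

# In-place division: the array object is modified rather than replaced.
c = np.array([2.0, 4.0], dtype=np.float32)
c_before = c          # second reference to the same array
c /= 2                # __idiv__ on Python 2, __itruediv__ on Python 3
same_object = c is c_before

print(abs_dtype, same_object)
```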

I've run the current unit tests, and they report no errors on my machines
(GTX 580s).

python test_gpuarray.py
============================= test session starts ==============================
platform linux2 -- Python 2.6.6 -- pytest-2.1.3
collected 45 items

test_gpuarray.py .............................................

========================== 45 passed in 34.02 seconds ==========================


Thanks,

Alex

On Tue, Aug 21, 2012 at 2:54 PM, Alex Nitz <[email protected]> wrote:

> Thanks for applying the patches. I apologize for the mistakes. I hope they
> didn't cause you too much inconvenience. I'll endeavor not to make them in
> the future.
>
>
> On Tue, Aug 21, 2012 at 2:42 PM, Andreas Kloeckner <
> [email protected]> wrote:
>
>> Alex Nitz <[email protected]> writes:
>> > I have attached a patch that addresses using pow with a complex vector.
>> > The issue I found was that it was using the wrong function name in the
>> > kernel. There is an if statement that sets the function name to "pow"
>> > for float64, and "powf" for everything else. The problem is that complex
>> > types also use "pow" for the function name.
>> >
>> > I've also attached several patches that address a few issues related to
>> > using a real GPUArray with a complex scalar. The main issue is that
>> > the get_axbz_kernel set the output (z) vector to the same dtype as the
>> > input one (x), and assumes the constant factors are the same dtype as
>> > well.
>> >
>> > So for real types the following operation makes sense.
>> > z[i] = a * x[i] + b
>> >
>> > If "a" or "b" is complex, however, the code will complain that it has
>> > been given the wrong type. My patch changes the behavior so that "a",
>> > "b", and "z" have the same dtype, but can be set separately from "x".
>> > For the various operations in GPUArray that call this function, I use
>> > the _get_common_dtype function even when the "other" is a scalar. This
>> > applies to subtraction, addition, and multiplication. Division calls a
>> > different kernel, so I made a similar modification there as well.
>> >
>> > Finally, I modified the "dot" function to work when one argument is
>> > complex and the other is real. Using get_common_dtype worked to fix this
>> > issue as well.
>>
>> Thanks for your patches, I've applied them.
>>
>> In the future, please stick to PEP 8 (especially wrt commas and spaces).
>>
>> I.e.
>>
>> BAD: f(x,y)
>> GOOD: f(x, y)
>>
>> Also, please next time run the tests to see if they pass:
>>
>> File "/mnt/nfs-main/home/andreas/src/pycuda/pycuda/gpuarray.py", line 450, in __rmul__
>>   result = self._new_like_me(_get_common_dtype(self, other))
>> NameError: global name 'other' is not defined
>>
>> (FTFY)
>>
>> > Also, I wonder why in compyte/array.py the get_common_dtype function
>> > does not simply call numpy.find_common_dtype(vectors,scalars)?
>>
>> I didn't know about numpy.find_common_dtype. Thanks for pointing it
>> out. But in any case, obj2 is allowed to be a plain Python scalar, for
>> which I'd rather let numpy do the special case handling...
>>
>> Andreas
>>
>
>
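
For readers following the quoted discussion, the two fixes it describes can be
sketched roughly as below. This is only an illustration under stated
assumptions: pow_func_name and axpb are hypothetical helper names invented for
this sketch, NumPy arrays stand in for GPUArray, and numpy.result_type is used
for dtype promotion. None of this is the actual PyCUDA implementation.

```python
import numpy as np


def pow_func_name(dtype):
    """Hypothetical helper: choose the C function name for a pow kernel.

    Per the quoted message, float64 and the complex types use "pow";
    everything else uses "powf".
    """
    dtype = np.dtype(dtype)
    if dtype == np.float64 or dtype.kind == "c":
        return "pow"
    return "powf"


def axpb(a, x, b):
    """Hypothetical helper computing z[i] = a * x[i] + b.

    The output dtype is promoted from a, x and b together, so a real x
    combined with complex constants yields a complex z. The constants
    are expected to be NumPy scalars (they need a .dtype attribute).
    """
    out_dtype = np.result_type(x.dtype, a.dtype, b.dtype)
    z = np.empty(x.shape, dtype=out_dtype)
    z[:] = a * x + b
    return z


x = np.arange(4, dtype=np.float32)               # real input vector
z = axpb(np.complex64(2j), x, np.complex64(1))   # complex constants
print(pow_func_name(np.complex64), z.dtype)
```

The point of the sketch is that the output dtype comes from all three
operands rather than from "x" alone, which is the behaviour change the quoted
patch description asks for.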

0001-fixed-mistakes-in-argument-spacing-that-I-introduced.patch
Description: Binary data

_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda
