Re: [PyCuda] How to use float4 textures?

Ahmed Fasih Sun, 15 Mar 2009 18:43:14 -0700

Andreas, thanks for the email. I've been playing with this for a
couple of days, and I am getting some strange results that may
indicate a bug in make_multichannel_2d_array().


First, I run test/test_driver.py and make sure everything works.

Second, I run the following code snippet (inspired by test_driver.py):

### Start code snippet
import pycuda.driver as drv
import pycuda.autoinit
import numpy

mod = drv.SourceModule("""
    texture<float4, 2, cudaReadModeElementType> mtx_tex;

    __global__ void interp_texture(float * destx, float * desty,
                                   float * destz, float * destw) {
        int tx = threadIdx.x;
        int ty = threadIdx.y;
        int w  = blockDim.x;
        float4 texval = tex2D(mtx_tex, tx, ty);
        destx[ty*w + tx] = texval.x;
        desty[ty*w + tx] = texval.y;
        destz[ty*w + tx] = texval.z;
        destw[ty*w + tx] = texval.w;
    }
""")

interp_texture = mod.get_function("interp_texture")
mtx_tex = mod.get_texref("mtx_tex")

shape = (2,3)
destx = numpy.zeros(shape, dtype=numpy.float32)
desty = numpy.zeros_like(destx)
destz = numpy.zeros_like(destx)
destw = numpy.zeros_like(destx)

v = numpy.arange(0, numpy.prod(shape), dtype=numpy.float32).reshape(shape)

# Use C or Fortran-style?
use_c_style = True
my_a = numpy.asarray(numpy.dstack((v, 10*v, 100*v,
    1000*v)), dtype=numpy.float32)
if use_c_style:
    my_order = "C"
else:
    my_a = numpy.asarray(my_a.transpose((2,1,0)), dtype=numpy.float32)
    my_order = "F"

drv.bind_array_to_texref(
    drv.make_multichannel_2d_array(my_a, order=my_order), mtx_tex)

interp_texture(drv.Out(destx), drv.Out(desty), drv.Out(destz), drv.Out(destw),
        block= (shape[1],shape[0], 1), texrefs=[mtx_tex])

print "x:"
print destx
print "y:"
print desty
print "z:"
print destz
print "w:"
print destw
### End code snippet

So I run this using C-style ordering in the 4-channel array. The
outputs are *garbage*.

Third, I change the relevant line to "use_c_style = False", so it uses
Fortran-style ordering and rerun the script, and surprisingly, *all is
well*: exactly what I expect is printed out.
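
For reference, here is a minimal NumPy-only sketch of what "exactly what I expect" means for the snippet above. It assumes channel k of each texel should come back as 10**k times the corresponding entry of v; the helper name outputs_match is hypothetical and is not part of PyCUDA.

```python
import numpy

shape = (2, 3)
v = numpy.arange(0, numpy.prod(shape), dtype=numpy.float32).reshape(shape)

# Channel k of the texel at (row, col) was filled with 10**k * v[row, col],
# so the kernel should return x == v, y == 10*v, z == 100*v, w == 1000*v.
expected = [v * 10.0 ** k for k in range(4)]


def outputs_match(destx, desty, destz, destw):
    """Return True if the four output arrays equal the expected channels."""
    outputs = (destx, desty, destz, destw)
    return all(numpy.array_equal(d, e) for d, e in zip(outputs, expected))
```

Calling outputs_match(destx, desty, destz, destw) after the kernel launch gives a yes/no answer instead of relying on eyeballing the printed arrays.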

Fourth: just changing it *back* to C-style ordering (and not
changing/doing anything else) and re-running the script, it works
fine!?!

So it appears that whenever I run the C-style ordering *after* the
Fortran-style ordering, the former works fine. If I run the C-ordered
snippet first, garbage is printed out. All of this is repeatable on
my Quadro FX 5600 if I run test/test_driver.py between the C-style
and Fortran-style runs.
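
To make the two orderings concrete, the following NumPy-only sketch inspects the host arrays that the snippet hands to make_multichannel_2d_array() in each case. It rests on my reading of the PyCUDA documentation, which I take to say that order="C" interprets the shape as (height, width, channels) and order="F" as (channels, width, height); treat that as an assumption. The names c_arr and f_arr are illustrative only.

```python
import numpy

shape = (2, 3)  # (height, width)
v = numpy.arange(0, numpy.prod(shape), dtype=numpy.float32).reshape(shape)

# "C" case: channels along the last axis, shape (height, width, 4).
c_arr = numpy.dstack((v, 10 * v, 100 * v, 1000 * v))

# "F" case: reversing the axes gives shape (4, width, height).
# transpose() returns a view, so no data is copied or reordered.
f_arr = c_arr.transpose((2, 1, 0))

print(c_arr.shape, c_arr.strides, c_arr.flags["C_CONTIGUOUS"])
print(f_arr.shape, f_arr.strides, f_arr.flags["F_CONTIGUOUS"])
print(numpy.shares_memory(c_arr, f_arr))
```

Since the transposed array is only a view of the same buffer, both runs present identical bytes in host memory; only the shape and strides reported to PyCUDA differ, which may help narrow down where the two code paths diverge.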

Incidentally, if I take care to use Fortran-style orderings (or
C-style orderings after running with Fortran-style orderings),
make_multichannel_2d_array() handles 1D arrays just fine :)

Any insights are as always greatly appreciated!

Thanks,

Ahmed

(PS. I'll get you a matrix_to_texref() patch to handle complex64
arrays shortly.)

On Sat, Mar 14, 2009 at 12:18 AM, Andreas Klöckner
<[email protected]> wrote:
> On Freitag 13 März 2009, Ahmed Fasih wrote:
>> So I recently discovered the very cool
>> pycuda.driver.make_multichannel_2d_array() and am having some trouble
>> with it. As always, I appreciate any inputs :) I ordinarily would poke
>> around on my own but I'm hoping to do a final class project as much in
>> PyCUDA as possible, so what the heck :D
>
> multichannel_2d_array builds 2D arrays. Binding 2D arrays to 1D texrefs is
> bound to end in tears.
>
> Andreas
>

_______________________________________________
PyCuda mailing list
[email protected]
http://tiker.net/mailman/listinfo/pycuda_tiker.net
