Re: [PyCUDA] Binding of GPUArray to Texture

Andrew Byrd Sat, 07 Nov 2009 11:12:29 -0800

>> Is the binding of a GPUArray to a Texture not supported, or did I miss something ??

I was having the same problem, trying to pass GPUArrays' gpudata into my kernels to do other processing. I think I've figured out what's going on.


A quick search through the PyCUDA source code finds bind_to_texref in:
test/test_driver.py
test/test_texture_nan.py
pycuda/gpuarray.py

In test/test_driver.py, function test_fp_textures, texture fetching is done using the tex1Dfetch function in kernel code, and operates on a 1D texture. The script test/test_texture_nan.py does the same.

GPUArray uses textures internally, when calling elementwise.get_take_kernel() in GPUArray.take() etc. elementwise.get_take_kernel() declares its texture references as 1D, and again fetches them with tex1Dfetch() in device code.

If I do likewise and set up my GPUArrays/textures in 1D, I can indeed bind the GPUArray to a texture and fetch its contents inside my device kernels - but only using tex1Dfetch(). tex1D() seems to always return zeros, or rather sometimes I can fetch only element 0 of my array.

What is the difference between tex1Dfetch() and tex1D()? From the CUDA Programming Guide:

tex1Dfetch(): fetch the region of linear memory bound to texture reference texRef using integer texture coordinate x. No texture filtering and addressing modes are supported.

tex1D/2D/3D(): fetches the CUDA array bound to texture reference using floating-point texture coordinates.

I'm piecing this together from various sources, but it looks like there's a difference between 'linear memory allocation' and 'pitch linear memory allocation' on the device. Pitch linear memory is nicely arranged for texture processing, multi-D addressing, etc. Linear memory is simply global device memory, accessed byte after byte.
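To make the addressing difference concrete, here is a minimal sketch in plain Python (no GPU needed) of how a byte offset would be computed for a densely packed row-major buffer versus a buffer whose rows are padded out to a pitch. The function names and the 64-byte pitch are made up for illustration; a real pitch is whatever the pitched allocation call hands back.

```python
def linear_offset(row, col, width, itemsize):
    """Byte offset of element (row, col) in densely packed row-major memory."""
    return (row * width + col) * itemsize


def pitched_offset(row, col, pitch, itemsize):
    """Byte offset when every row starts at a multiple of `pitch` bytes."""
    return row * pitch + col * itemsize


# Hypothetical 3 x 5 float32 array. The 64-byte pitch is an assumption
# for illustration only; a real one comes from the allocation call.
width, itemsize, pitch = 5, 4, 64

print(linear_offset(2, 1, width, itemsize))   # (2*5 + 1) * 4 = 44
print(pitched_offset(2, 1, pitch, itemsize))  # 2*64 + 1*4   = 132
```

When the pitch happens to equal width * itemsize the two formulas agree, which is why a flat buffer can be treated as the special case with no row padding.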

It's hard to keep things straight since we have numpy ndarrays, GPUArrays, and CUDA arrays at work here, and they are all often referred to as just 'an array'.

CUDA arrays (see pycuda.driver.Array) are '2D or 3D memory block that can only be accessed via texture references'. That is, they are opaque, we can't directly read/write their bytes. They live in pitch linear memory.

PyCUDA GPUArrays, on the other hand, use device 'linear memory' to store their data - it is allocated using a normal cudaMalloc, not a cudaMallocPitch. This can be seen by looking at their 'allocator', which is set to drv.mem_alloc in the constructor - it would be mem_alloc_pitched if they were using pitched memory. This allows accessing them directly in device kernels.

numpy ndarrays, of course, just keep their data in host memory. GPUArrays are designed to work like them syntactically for ease of use.


With these pieces of information, I tried the following:
Make a multidimensional GPUArray in Python on the host.
Bind it to a texture reference from a module that is declared as 1D.
Access it in a device kernel using tex1Dfetch(), not tex1D/2D/3D(), by building up a 1D 'flat' offset for the element I want.


It works!
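The three steps above can be sketched roughly as follows. This is only an illustration, not code from the PyCUDA tree: the kernel name copy_from_texture, the texture name tex and the helper flat_offset are invented here, and it assumes a CUDA toolkit that still supports the texture reference API. It uses bind_to_texref_ext rather than bind_to_texref because, as far as I can tell from gpuarray.py, the _ext variant also sets the texel format from the array's dtype.

```python
KERNEL_SOURCE = """
texture<float, 1, cudaReadModeElementType> tex;

__global__ void copy_from_texture(float *dest, int width)
{
    int col = threadIdx.x;
    int row = threadIdx.y;
    int flat = row * width + col;        // 1D offset into the flat buffer
    dest[flat] = tex1Dfetch(tex, flat);  // integer fetch from linear memory
}
"""


def flat_offset(index, shape):
    """Row-major (C order) flat offset of a multi-dimensional index."""
    offset = 0
    for i, n in zip(index, shape):
        offset = offset * n + i
    return offset


def run_on_gpu(rows=3, cols=4):
    """Copy a 2D GPUArray through a 1D texture; needs PyCUDA and a GPU."""
    # Imported here so flat_offset() stays usable on a machine without a GPU.
    import numpy as np
    import pycuda.autoinit  # noqa: F401  (creates a CUDA context)
    import pycuda.driver as drv
    import pycuda.gpuarray as gpuarray
    from pycuda.compiler import SourceModule

    mod = SourceModule(KERNEL_SOURCE)
    copy_from_texture = mod.get_function("copy_from_texture")
    texref = mod.get_texref("tex")

    # Step 1: a multidimensional GPUArray, stored flat in linear memory.
    a = np.arange(rows * cols, dtype=np.float32).reshape(rows, cols)
    a_gpu = gpuarray.to_gpu(a)

    # Step 2: bind it to the texture reference declared as 1D.
    a_gpu.bind_to_texref_ext(texref)

    # Step 3: the kernel builds the flat offset and calls tex1Dfetch().
    dest = np.zeros_like(a)
    copy_from_texture(drv.Out(dest), np.int32(cols),
                      block=(cols, rows, 1), texrefs=[texref])
    return a, dest
```

If everything is set up as assumed, the two arrays returned by run_on_gpu() should be equal. The host-side flat_offset() mirrors the arithmetic the kernel does for the 2D case and extends it to more dimensions.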

The other approach is what you see in matrix_to_texref(matrix, texref, order), which takes a numpy array on the host and attaches it to a texref, using matrix_to_array() and bind_array_to_texref(). It does so by copying the data into device linear pitched memory using CUDA's memcpy2D function, and so is limited to 2D arrays for the moment. It looks like you have to do a memcpy from (host / device linear) -> (device linear pitched memory), in order to arrange the data nicely for multi-D texture fetching to work.
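For comparison, a rough sketch of this second route, loosely modelled on the 2D texture test in test/test_driver.py. The names KERNEL_2D and copy_via_2d_texture are invented for illustration, and the coordinate order passed to tex2D() (column first, then row, for order="C") is my assumption about how matrix_to_texref lays out the data, so treat it as something to verify rather than a reference.

```python
KERNEL_2D = """
texture<float, 2, cudaReadModeElementType> mtx_tex;

__global__ void copy_texture(float *dest)
{
    int col = threadIdx.x;
    int row = threadIdx.y;
    int width = blockDim.x;
    // Assumption: with order="C" the texture x coordinate is the column.
    dest[row * width + col] = tex2D(mtx_tex, col, row);
}
"""


def copy_via_2d_texture(rows=3, cols=4):
    """Read a host matrix back through a 2D texture; needs PyCUDA and a GPU."""
    # Imported here so the module can be loaded on a machine without a GPU.
    import numpy as np
    import pycuda.autoinit  # noqa: F401  (creates a CUDA context)
    import pycuda.driver as drv
    from pycuda.compiler import SourceModule

    mod = SourceModule(KERNEL_2D)
    copy_texture = mod.get_function("copy_texture")
    mtx_tex = mod.get_texref("mtx_tex")

    # The host ndarray is copied into a CUDA array and bound to the texref.
    a = np.random.randn(rows, cols).astype(np.float32)
    drv.matrix_to_texref(a, mtx_tex, order="C")

    dest = np.zeros_like(a)
    copy_texture(drv.Out(dest), block=(cols, rows, 1), texrefs=[mtx_tex])
    return a, dest
```

Note that the input here is a host ndarray rather than a GPUArray: the data is copied into a separate CUDA array, so later writes to the original buffer would not show up in the texture.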


Can anyone confirm that this is correct?

-Andrew


