Thanks Ian and Andreas,

About the algorithm: memory isn't a huge concern, so if I'm doing this operation on an array of length N, I don't mind permanently allocating an extra array of length N for storing the indices, even though I'll probably only ever use the first 20 elements of it. At the moment this is what I'm doing in my C code, to save going through the array twice.

About the GPU code, I think what you're saying is that I should have an array x, say, a global memory array J, and a global index j into J, and then do something like:

__global__ void threshold(double *x, double x0)
{
 int i = blockIdx.x * blockDim.x + threadIdx.x;
 if(x[i] > x0){
  atomicInc(&j, N);
  J[j] = i;
 }
}

(Bear with me if I'm way off, as I said I only just started programming with CUDA.)

Isn't there a danger that between the end of the atomicInc instruction and the J[j] = i instruction, another thread could do a second atomicInc, so that one of the elements of J would be skipped? It's true that this would be a rare event, but it's almost certain to happen eventually. Ah, although maybe the idea is to have global_j be the global index, and then do:

unsigned int j = atomicInc(&global_j, N);
J[j] = i;

I guess this would work even in that case?

One last technical question, I think I see how the pycuda.driver.mem_alloc function works, but how do I refer to this memory in the CUDA code? (I don't think there's an example that demonstrates this in the pycuda release.) The Nvidia CUDA documentation talks about having to manage the global memory by offsets, so I would guess you do something like this (based on the nvidia docs):

extern __device__ int J0[];
__global__ void threshold(double *x, double x0)
{
 int i = blockIdx.x * blockDim.x + threadIdx.x;
 int *J = (int*)J0;
 unsigned int *global_j = (unsigned int*)&J[N];
 if(x[i] > x0){
  unsigned int j = atomicInc(global_j, N);
  J[j] = i;
 }
}

Is that right? I'll go and have a play around with this now, but I figure it probably won't work so I'm getting my question in early. ;-)
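Or maybe the simpler thing is to sidestep the offset bookkeeping entirely and pass the pointers in as kernel arguments, with J and the counter carved out of the mem_alloc'd buffer on the Python side before launch. I'm guessing at the calling convention here, so take this as a sketch:

```cuda
// Sketch: receive the device pointers as arguments instead of via a
// named __device__ symbol.  n is the array length, passed in so the
// kernel can guard against out-of-range thread indices.
__global__ void threshold(const double *x, double x0, int n,
                          int *J, unsigned int *global_j)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n && x[i] > x0) {
        unsigned int j = atomicInc(global_j, n);
        J[j] = i;
    }
}
```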

Dan

p.s. apologies if this posts twice; I sent it from the wrong email address before, but maybe it will go through anyway.

_______________________________________________
PyCuda mailing list
[email protected]
http://tiker.net/mailman/listinfo/pycuda_tiker.net
