Hello,

I have a question and hope that you can help me.

The threads of a block are working on a common problem, each thread
iterating through its own part of it.
If some condition is met, a thread should write its threadId to a
1D output array that is smaller than the total number of threads.

I would rather not store a result for every thread as an integer,
since the condition is only met in very rare cases.

The two options I found would be

1.) to store all results in a bitfield which has one bit per thread, and set the bits with a bitwise atomic (atomicOr, or atomicAnd if a cleared bit marks a hit).

2.) to share a common index within a block and use the
return value of atomicAdd on it as the position at which to store the threadId.
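
To make the two options concrete, here is a rough sketch of what I have
in mind. It is untested; the names condition, mark_hits, collect_hits,
bits, hits, hit_count and max_hits are placeholders I made up, and the
test inside condition is only a stand-in for the real one.

```cuda
// Sketch only: all names below are hypothetical placeholders.

// Stand-in for the real (rare) condition on a thread's part of the problem.
__device__ bool condition(const float *data, unsigned int gid)
{
    return data[gid] < 0.0f;
}

// Option 1: one bit per thread. `bits` needs (n + 31) / 32 words and must
// be zeroed before the launch; atomicOr sets a bit without disturbing the
// other 31 bits of the same word.
__global__ void mark_hits(const float *data, unsigned int n,
                          unsigned int *bits)
{
    unsigned int gid = blockIdx.x * blockDim.x + threadIdx.x;
    if (gid < n && condition(data, gid))
        atomicOr(&bits[gid / 32], 1u << (gid % 32));
}

// Option 2: one counter per block in shared memory. atomicAdd returns the
// counter's previous value, which gives each hit a unique output slot.
// `hits` needs gridDim.x * max_hits entries, `hit_count` gridDim.x entries.
__global__ void collect_hits(const float *data, unsigned int n,
                             unsigned int *hits, unsigned int *hit_count,
                             unsigned int max_hits)
{
    __shared__ unsigned int count;
    if (threadIdx.x == 0)
        count = 0;
    __syncthreads();

    unsigned int gid = blockIdx.x * blockDim.x + threadIdx.x;
    if (gid < n && condition(data, gid)) {
        unsigned int slot = atomicAdd(&count, 1u);
        if (slot < max_hits)  // hits beyond the capacity are dropped
            hits[blockIdx.x * max_hits + slot] = threadIdx.x;
    }
    __syncthreads();

    // Publish how many slots of this block's segment are valid.
    if (threadIdx.x == 0)
        hit_count[blockIdx.x] = min(count, max_hits);
}
```

With option 2 the order of the stored threadIds within a block is not
deterministic, and max_hits has to be chosen large enough up front; with
option 1 the output size is fixed but the set bits still have to be
scanned afterwards.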

Is one of these ideas to be preferred? Or do you have
a better suggestion for doing this?

Kind regards,
Joe


_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda
