The findings below assume that I already have a 20 million * 57 bit int
array on the GPU.
> On Jun 6, 2018, at 3:05 AM, aseem hegshetye wrote:
Hi,
I did some testing with the number of threads: I varied the thread count and
recorded the time in seconds the pyopencl kernel took to execute.
Following are the results:
- No. of threads -- Time in seconds
- 10,000 -- 202
- 20,000 -- 170
- 24,000 -- 209
- 30,000 -- 224
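One thing worth checking when sweeping thread counts like this is that each global work size is a multiple of the work-group (local) size, since OpenCL requires the global size to be divisible by the local size. A minimal pure-Python sketch of that rounding (the work-group size of 256 is an assumption, not something stated in this thread):

```python
# Sketch: round a candidate global work size up to a multiple of the
# local (work-group) size, since OpenCL requires global % local == 0.
# The work-group size of 256 is an assumed typical value.

def round_up(global_size: int, local_size: int = 256) -> int:
    """Smallest multiple of local_size that is >= global_size."""
    return -(-global_size // local_size) * local_size

candidates = [10_000, 20_000, 24_000, 30_000]
print([round_up(n) for n in candidates])
# -> [10240, 20224, 24064, 30208]
```

If the sweep is done on raw counts like 10,000 that are not multiples of the local size, the runtime or driver may pad or reject the launch, which can distort a timing comparison.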
Hi Aseem,
This may be caused by memory access collisions and/or a lack of coalesced
memory access. This technical report gives some pointers:
https://www2.eecs.berkeley.edu/Pubs/TechRpts/2016/EECS-2016-143.pdf
Do you use atomic operations? Or maybe you have too many thread fences?
I have no
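To illustrate the coalescing point raised above: the cost of a load pattern depends on how many memory segments a group of threads touches at once. A small pure-Python model (the 128-byte segment size and 32-thread group width are typical GPU values, not taken from this thread):

```python
# Sketch: why coalescing matters. For a group of 32 threads each loading
# a 4-byte int, count how many 128-byte memory segments the group touches.
# Contiguous indexing (stride 1) hits one segment; a stride of 32 elements
# hits a separate segment per thread. Segment size and group width are
# assumed typical values, not from the thread.

SEGMENT = 128   # bytes per memory transaction
ELEM = 4        # bytes per int32
GROUP = 32      # threads issuing the load together

def segments_touched(stride: int) -> int:
    addrs = [tid * stride * ELEM for tid in range(GROUP)]
    return len({a // SEGMENT for a in addrs})

print(segments_touched(1), segments_touched(32))
# -> 1 32
```

A strided layout can thus multiply the number of memory transactions per load by the group width, which would show up as exactly the kind of slowdown reported above.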
Hi,
Does GPU speed drop exponentially as the number of threads increases beyond
a certain point? I used to allocate number of threads = number of
transactions in the data under consideration.
For a Tesla K80 I see an exponential drop in speed above 30,290 threads.
If true, is it a best practice to keep number
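One common alternative to "one thread per transaction" is a grid-stride loop: launch a fixed, capped number of threads and have each thread process every T-th item. A pure-Python sketch of the index mapping (the function name and the idea of capping are illustrative, not from this thread):

```python
# Sketch of the fixed-thread-count + stride-loop pattern: rather than
# launching one thread per transaction (e.g. 20 million), each of
# n_threads threads handles every n_threads-th item. Names here are
# illustrative, not from the thread.

def indices_for_thread(tid: int, n_items: int, n_threads: int):
    """Items a single thread would process under a grid-stride loop."""
    return list(range(tid, n_items, n_threads))

# Every item is covered exactly once across all threads:
n_items, n_threads = 100, 8
covered = sorted(i for t in range(n_threads)
                 for i in indices_for_thread(t, n_items, n_threads))
print(covered == list(range(n_items)))
# -> True
```

The same pattern is written inside an OpenCL kernel as a `for` loop starting at `get_global_id(0)` and stepping by `get_global_size(0)`, which keeps the launch size at whatever count the device handles best regardless of the data size.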