Hi all,
I have bought a Kepler GPU in order to do some numerical calculations on it,
and I would like to use PyCUDA (it looks like the best solution to me).
Unfortunately, when I run a test such as MeasureGpuarraySpeedRandom
<http://wiki.tiker.net/PyCuda/Examples/MeasureGpuarraySpeedRandom?action=fullsearch&value=linkto%3A%22PyCuda%2FExamples%2FMeasureGpuarraySpeedRandom%22&context=180>
I get the following results:
Size     |Time GPU [s]   |Size/Time GPU|Time CPU [s]     |Size/Time CPU|GPU vs CPU speedup
---------+---------------+-------------+-----------------+-------------+------------------
1024     |0.0719905126953|14224.0965047|3.09289598465e-05|33108129.2446|0.000429625497701
2048     |0.0727789160156|28140.0179079|5.74035215378e-05|35677253.6795|0.000788738341822
4096     |0.07278515625  |56275.2106478|0.00010898976326 |37581511.1208|0.00149741745261
8192     |0.0722379931641|113402.928863|0.000164551048279|49783942.9508|0.00227790171171
16384    |0.0720771630859|227311.94318 |0.000254381122589|64407294.9802|0.00352928877467
32768    |0.0722085107422|453796.923149|0.00044281665802 |73999022.8609|0.0061324718301
65536    |0.0720480078125|909615.713047|0.000749320983887|87460516.133 |0.0104003012247
131072   |0.0723209472656|1812365.64171|0.00153271682739 |85516122.5202|0.0211932626071
262144   |0.0727287304688|3604407.75345|0.00305026916504 |85941268.0706|0.041940360369
524288   |0.0723101269531|7250547.35888|0.00601688781738 |87136076.9741|0.0832094766101
1048576  |0.0627352734375|16714297.1178|0.0123564978027  |84860291.0582|0.196962524042
2097152  |0.0743136047363|28220297.0431|0.026837512207   |78142563.4322|0.361138613882
4194304  |0.074144744873 |56569133.8905|0.0583531860352  |71877891.9367|0.787017153206
8388608  |0.0736544189453|113891442.226|0.121150952148   |69240958.0877|1.64485653248
16777216 |0.0743454406738|225665701.191|0.242345166016   |69228597.6891|3.2597179305
33554432 |0.0765948486328|438076875.912|0.484589794922   |69242960.4412|6.32666300112
67108864 |0.0805058410645|833589999.343|0.970654882812   |69137718.45  |12.0569497813
134217728|0.0846059753418|1586385919.64|1.94103554688    |69147485.8439|22.9420621774
268435456|0.094531427002 |2839642482.01|3.88270039062    |69136278.6189|41.0731173089
536870912|0.111502416992 |4814881385.37|7.7108625        |69625273.6967|69.1542184286
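For context, this is roughly what I understand the example to be timing for
each size (only a simplified sketch of how I read the wiki code, not the
actual script; the helper names and the repetition count are my own, and I am
assuming pycuda.curandom.rand on the GPU side and numpy.random.rand on the
CPU side):

    import numpy as np
    from time import time

    import pycuda.autoinit
    import pycuda.driver as drv
    import pycuda.curandom as curandom

    def time_gpu(size, reps=10):
        # Generate `size` uniform random floats on the device, timed with
        # CUDA events so the kernel execution itself is included.
        start, end = drv.Event(), drv.Event()
        start.record()
        for _ in range(reps):
            curandom.rand((size,), dtype=np.float32)
        end.record()
        end.synchronize()
        return start.time_till(end) * 1e-3 / reps  # seconds per call

    def time_cpu(size, reps=10):
        # The same number of random floats generated on the host with numpy.
        t0 = time()
        for _ in range(reps):
            np.random.rand(size).astype(np.float32)
        return (time() - t0) / reps

    for size in [2 ** e for e in range(10, 30)]:
        t_gpu, t_cpu = time_gpu(size), time_cpu(size)
        print size, t_gpu, size / t_gpu, t_cpu, size / t_cpu, t_cpu / t_gpu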
I was not expecting fantastic results, but not ones this bad either.
Up to around 4M numbers the CPU is faster.
Another strange result: the GPU time is essentially constant up to 16M
numbers. I assume that what I am measuring there is, in fact, only the fixed
per-call overhead, which at about 70 ms looks quite significant.
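I suppose one way to check that reading is to fit time = overhead +
size/throughput to the GPU column above; a quick sanity-check sketch on the
numbers copied from the table (plain numpy, nothing PyCUDA-specific) would
be:

    import numpy as np

    # Sizes and GPU times copied from the table above.
    sizes = np.array([2.0 ** e for e in range(10, 30)])
    t_gpu = np.array([
        0.0719905126953, 0.0727789160156, 0.07278515625, 0.0722379931641,
        0.0720771630859, 0.0722085107422, 0.0720480078125, 0.0723209472656,
        0.0727287304688, 0.0723101269531, 0.0627352734375, 0.0743136047363,
        0.074144744873, 0.0736544189453, 0.0743454406738, 0.0765948486328,
        0.0805058410645, 0.0846059753418, 0.094531427002, 0.111502416992,
    ])

    # Linear fit t(n) = overhead + n / throughput: the intercept is the
    # fixed per-call cost, the slope gives the asymptotic generation rate.
    slope, overhead = np.polyfit(sizes, t_gpu, 1)
    print "fixed overhead: %.1f ms" % (overhead * 1e3)
    print "asymptotic rate: %.2e numbers/s" % (1.0 / slope)

If the intercept comes out near 70 ms, that would suggest the flat part of
the curve really is just the per-call cost.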
I am on Xubuntu 12.04 with CUDA 5, [email protected], [email protected],
and Python 2.7, and I am running Nsight.
I would be happy to get any comments on this.
Many thanks in advance,
Pierre.