Hi all,

I have bought a Kepler GPU in order to do some numerical calculations on it.

I would like to use PyCUDA (it looks like the best solution to me).

Unfortunately, when I run a test like
MeasureGpuarraySpeedRandom <http://wiki.tiker.net/PyCuda/Examples/MeasureGpuarraySpeedRandom?action=fullsearch&value=linkto%3A%22PyCuda%2FExamples%2FMeasureGpuarraySpeedRandom%22&context=180>

I get the following results:
Size     |Time GPU (s)   |Size/Time GPU|Time CPU (s)     |Size/Time CPU|GPU vs CPU speedup
---------+---------------+-------------+-----------------+-------------+------------------
1024     |0.0719905126953|14224.0965047|3.09289598465e-05|33108129.2446|0.000429625497701
2048     |0.0727789160156|28140.0179079|5.74035215378e-05|35677253.6795|0.000788738341822
4096     |0.07278515625  |56275.2106478|0.00010898976326 |37581511.1208|0.00149741745261
8192     |0.0722379931641|113402.928863|0.000164551048279|49783942.9508|0.00227790171171
16384    |0.0720771630859|227311.94318 |0.000254381122589|64407294.9802|0.00352928877467
32768    |0.0722085107422|453796.923149|0.00044281665802 |73999022.8609|0.0061324718301
65536    |0.0720480078125|909615.713047|0.000749320983887|87460516.133 |0.0104003012247
131072   |0.0723209472656|1812365.64171|0.00153271682739 |85516122.5202|0.0211932626071
262144   |0.0727287304688|3604407.75345|0.00305026916504 |85941268.0706|0.041940360369
524288   |0.0723101269531|7250547.35888|0.00601688781738 |87136076.9741|0.0832094766101
1048576  |0.0627352734375|16714297.1178|0.0123564978027  |84860291.0582|0.196962524042
2097152  |0.0743136047363|28220297.0431|0.026837512207   |78142563.4322|0.361138613882
4194304  |0.074144744873 |56569133.8905|0.0583531860352  |71877891.9367|0.787017153206
8388608  |0.0736544189453|113891442.226|0.121150952148   |69240958.0877|1.64485653248
16777216 |0.0743454406738|225665701.191|0.242345166016   |69228597.6891|3.2597179305
33554432 |0.0765948486328|438076875.912|0.484589794922   |69242960.4412|6.32666300112
67108864 |0.0805058410645|833589999.343|0.970654882812   |69137718.45  |12.0569497813
134217728|0.0846059753418|1586385919.64|1.94103554688    |69147485.8439|22.9420621774
268435456|0.094531427002 |2839642482.01|3.88270039062    |69136278.6189|41.0731173089
536870912|0.111502416992 |4814881385.37|7.7108625        |69625273.6967|69.1542184286
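
For context, here is roughly the kind of loop I understand that example to be timing (a simplified sketch of my own, not the exact wiki script, using pycuda.curandom on the GPU side and numpy on the CPU side):

import numpy as np
from time import time

import pycuda.autoinit              # set up a context on the default device
import pycuda.driver as drv
import pycuda.curandom as curandom

for power in range(10, 30):         # 1K up to 512M numbers, as in the table
    size = 1 << power

    # GPU: generate `size` random float32 values with curandom and wait for
    # the kernel to finish before stopping the clock
    start = time()
    curandom.rand(size, dtype=np.float32)
    drv.Context.synchronize()
    gpu_time = time() - start

    # CPU: generate the same amount with numpy
    start = time()
    np.random.rand(size).astype(np.float32)
    cpu_time = time() - start

    print("%9d | %g s GPU | %g s CPU | speedup %g"
          % (size, gpu_time, cpu_time, cpu_time / gpu_time))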


I was not expecting fantastic results, but not this bad either.
Up to around 4M numbers, the CPU is faster.
Another strange result: the GPU timing is essentially constant up to 16M numbers. I assume that this is, in fact, only the transaction overhead (launch/transfer cost), which looks quite significant at around 70 ms.
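
To check whether those ~70 ms are really a fixed per-call cost rather than generation time, I was thinking of timing only the device-side work with CUDA events, something like this (just a sketch, assuming I am using pycuda.driver.Event correctly):

import numpy as np
import pycuda.autoinit
import pycuda.driver as drv
import pycuda.curandom as curandom

curandom.rand(1024)                      # warm-up call so kernel compilation
                                         # and RNG setup are not timed below

def device_time(size):
    # Time only the on-device generation with CUDA events; returns seconds.
    start, end = drv.Event(), drv.Event()
    start.record()
    curandom.rand(size, dtype=np.float32)
    end.record()
    end.synchronize()
    return end.time_since(start) * 1e-3  # time_since() is in milliseconds

for power in (10, 15, 20, 25):
    size = 1 << power
    print("%9d numbers: %.6f s on the device" % (size, device_time(size)))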

I am under Xubuntu 12.04, CUDA 5, [email protected], [email protected], Python 2.7, running Nsight.
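
In case it is relevant, a quick way to double-check which device PyCUDA actually picks up (a small sketch using pycuda.driver.Device):

import pycuda.driver as drv

drv.init()
for i in range(drv.Device.count()):
    dev = drv.Device(i)
    cc = "%d.%d" % dev.compute_capability()
    print("device %d: %s, compute capability %s, %d MB"
          % (i, dev.name(), cc, dev.total_memory() // (1024 * 1024)))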

I will be happy to get any comments on this.

Many thanks in advance,
Pierre.





_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda
