I would not spend time trying to get a speedup from a GPU as small as the GT 610
versus your CPU, which is a costly one. You spent much more money on the CPU than
on the GPU, so don't expect an easy speedup from the GPU.
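That said, your GPU profile shows 62.7% of the time in GpuFromHost, i.e. copying
the inputs from host memory to the GPU on each of the 10000 calls. The usual
mitigation is to keep the data in float32 shared variables so it is transferred
once. A minimal sketch of that idea, assuming your script follows the standard
logistic regression tutorial (the sizes and names below are only illustrative):

import numpy as np
import theano
import theano.tensor as T

rng = np.random.RandomState(0)
N, feats = 400, 4000  # 4000 features, as in your run

# Keep the training data in float32 shared variables: with device=gpu and
# floatX=float32 they live on the GPU, so they are copied to the device once
# instead of on every call (which is what GpuFromHost measures).
x = theano.shared(rng.randn(N, feats).astype('float32'), name='x')
y = theano.shared(rng.randint(low=0, high=2, size=N).astype('float32'), name='y')
w = theano.shared(rng.randn(feats).astype('float32'), name='w')
b = theano.shared(np.float32(0.0), name='b')

p_1 = 1 / (1 + T.exp(-T.dot(x, w) - b))            # P(y = 1)
xent = -y * T.log(p_1) - (1 - y) * T.log(1 - p_1)  # cross-entropy
cost = xent.mean() + 0.01 * (w ** 2).sum()
gw, gb = T.grad(cost, [w, b])

# No explicit inputs: everything already lives in GPU-side shared storage.
train = theano.function([], cost,
                        updates=[(w, w - 0.1 * gw), (b, b - 0.1 * gb)])

for i in range(10000):
    c = train()

Even then, a GT 610 only has 48 CUDA cores, so the GEMV itself may still not
beat your 8-core CPU.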

Fred

On Fri, Oct 28, 2016 at 2:48 PM, Shantanu K. Karve <[email protected]> wrote:

> Looking at this further, it seems I need to first experiment / work at a
> much lower level, not at the Theano level, to understand what's going on
> here. So I ran a matrix multiplication of 320x320 by 640x320 via some
> sample code that comes with the CUDA toolkit. Using the GPU, that took
> 0.0088 seconds, whereas in R, my usual language for this type of work, it
> took 0.064 seconds. So the GPU is faster, all right, from this perspective.
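For a comparable check at the Theano level, here is a small timing sketch
(the 320x320 and 320x640 shapes are assumed from the CUDA matrixMul sample;
run with THEANO_FLAGS=device=gpu,floatX=float32):

import time
import numpy as np
import theano
import theano.tensor as T

A = np.random.rand(320, 320).astype('float32')
B = np.random.rand(320, 640).astype('float32')

# CPU baseline: NumPy calls the system BLAS, much like R does.
t0 = time.time()
C_cpu = A.dot(B)
print('NumPy  (CPU): %.5f s' % (time.time() - t0))

# GPU version: shared variables keep A and B on the device, so the timed
# call measures the GEMM plus the copy of the result back to the host,
# not the re-upload of the inputs.
A_g = theano.shared(A)
B_g = theano.shared(B)
f = theano.function([], T.dot(A_g, B_g))
f()  # the first call includes compilation / warm-up; time a later call
t0 = time.time()
C_gpu = f()
print('Theano (GPU): %.5f s' % (time.time() - t0))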
>
>
> On Thursday, October 27, 2016 at 3:55:44 PM UTC-7, Shantanu K. Karve wrote:
>>
>> I'm learning NN, Theano, and GPU vs. CPU. I happened to have a 48-core
>> GeForce GT 610 around, so I messed around: I added it to my AMD 8-core,
>> 32 GB machine, installed the software, and ran the logistic regression
>> example, tweaked with additional prints of progress and with N, the
>> number of features, at 4000 instead of 400. The CPU version took about
>> 33 seconds; the GPU took 139 seconds! So now I'm trying to understand the
>> profile information. I attach the two profiles. In the Classes section
>> the GPU spends
>> <% time> <sum %> <apply time> <time per call> <type> <#call> <#apply> <Class name>
>>   62.7%    62.7%      84.714s       2.12e-03s     C    40000       4   theano.sandbox.cuda.basic_ops.GpuFromHost
>>   34.9%    97.6%      47.197s       2.36e-03s     C    20000       2   theano.sandbox.cuda.blas.GpuGemv
>>    1.4%    99.0%       1.918s       2.74e-05s     C    70000       7   theano.sandbox.cuda.basic_ops.GpuElemwise
>>
>> the CPU spends
>> <% time> <sum %> <apply time> <time per call> <type> <#call> <#apply> <Class name>
>>   75.5%    75.5%      24.083s       1.20e-03s     C    20000       2   theano.tensor.blas_c.CGemv
>>   23.9%    99.4%       7.612s       9.52e-05s     C    80000       8   theano.tensor.elemwise.Elemwise
>>    0.3%    99.7%       0.103s       1.03e-05s     C    10000       1   theano.tensor.elemwise.Sum
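(For reference, profiles like the two above come from Theano's built-in
profiler: either run the script with THEANO_FLAGS=profile=True, or compile
the function with profile=True, as in the toy example below; the summary is
also printed automatically when the process exits.)

import theano
import theano.tensor as T

x = T.vector('x')
f = theano.function([x], (x ** 2).sum(), profile=True)
f([1.0, 2.0, 3.0])
f.profile.summary()  # print per-Op / per-Class timing like the tables above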
>>
>> How does one interpret this? Any other aspects one should focus on to
>> learn from?
>>
>> From the program output:
>>
>> gpu
>> 0 1.57595e+06
>> 1000 1893.22
>> 2000 1888.27
>> 3000 1888.27
>> 4000 1888.27
>> 5000 1888.27
>> 6000 1888.27
>> 7000 1888.27
>> 8000 1888.27
>> 9000 1888.27
>> Looping 10000 times took 139.552690 seconds
>>
>> cpu
>> 0 1.5654e+06
>> 1000 1881.41
>> 2000 1878.16
>> 3000 1878.15
>> 4000 1878.15
>> 5000 1878.15
>> 6000 1878.16
>> 7000 1878.16
>> 8000 1878.15
>> 9000 1878.15
>> Looping 10000 times took 33.479546 seconds
>>
>> thanks
>>