Looking at this further, it seems I need to first experiment at a much lower 
level, not at the Theano level, to understand what's going on here. So I ran 
a matrix multiplication of 320x320 by 640x320 via some sample code that 
comes with the CUDA toolkit. On the GPU that took 0.0088 seconds, whereas in 
R, my usual language for this type of work, it took 0.064 seconds. So from 
this perspective the GPU is indeed faster.
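For a CPU-side cross-check of that kind of timing, a minimal NumPy sketch is 
below. The shapes are an assumption: I've used (320, 640) times (640, 320) so 
the dimensions conform; the sample code's exact sizes and BLAS backend will of 
course change the numbers.

```python
import time
import numpy as np

# Hypothetical shapes chosen so the multiply conforms: (320, 640) x (640, 320).
rng = np.random.default_rng(0)
a = rng.standard_normal((320, 640)).astype(np.float32)
b = rng.standard_normal((640, 320)).astype(np.float32)

a @ b  # warm-up call so BLAS initialization is not included in the timing

start = time.perf_counter()
c = a @ b
elapsed = time.perf_counter() - start

print(c.shape)               # (320, 320)
print(f"{elapsed:.6f} s")
```

This only measures the multiply itself; the GPU figure from the CUDA sample 
may or may not include the host-to-device transfer, which matters a lot at 
these sizes.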

On Thursday, October 27, 2016 at 3:55:44 PM UTC-7, Shantanu K. Karve wrote:
>
> I'm learning NNs, Theano, and GPU vs. CPU performance. I happened to have a 
> 48-core GeForce GT 610 around, so I added it to my 8-core AMD machine with 
> 32 GB of RAM, installed the software, and ran the logistic regression 
> example, tweaked with additional progress prints and with N, the number of 
> features, at 4000 instead of 400. The CPU version took 32 seconds; the GPU 
> took 139 seconds! So now I'm trying to understand the profile information. 
> I attach the two profiles. In the Classes section, the GPU spends
> <% time> <sum %> <apply time> <time per call> <type> <#call> <#apply> <Class name>
>   62.7%    62.7%    84.714s    2.12e-03s    C    40000    4    theano.sandbox.cuda.basic_ops.GpuFromHost
>   34.9%    97.6%    47.197s    2.36e-03s    C    20000    2    theano.sandbox.cuda.blas.GpuGemv
>    1.4%    99.0%     1.918s    2.74e-05s    C    70000    7    theano.sandbox.cuda.basic_ops.GpuElemwise
>
> the CPU spends
> <% time> <sum %> <apply time> <time per call> <type> <#call> <#apply> <Class name>
>   75.5%    75.5%    24.083s    1.20e-03s    C    20000    2    theano.tensor.blas_c.CGemv
>   23.9%    99.4%     7.612s    9.52e-05s    C    80000    8    theano.tensor.elemwise.Elemwise
>    0.3%    99.7%     0.103s    1.03e-05s    C    10000    1    theano.tensor.elemwise.Sum
>
> How does one interpret this? Any other aspects one should focus on to 
> learn from?
>
> From the program 
>
> gpu
> 0 1.57595e+06
> 1000 1893.22
> 2000 1888.27
> 3000 1888.27
> 4000 1888.27
> 5000 1888.27
> 6000 1888.27
> 7000 1888.27
> 8000 1888.27
> 9000 1888.27
> Looping 10000 times took 139.552690 seconds
>
> cpu
> 0 1.5654e+06
> 1000 1881.41
> 2000 1878.16
> 3000 1878.15
> 4000 1878.15
> 5000 1878.15
> 6000 1878.16
> 7000 1878.16
> 8000 1878.15
> 9000 1878.15
> Looping 10000 times took 33.479546 seconds
>
> thanks
>
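
Regarding the profile in the quoted message: GpuFromHost is the op that 
copies data from host (CPU) memory onto the GPU, and 62.7% of the run going 
to its 40000 calls means the inputs are being shipped to the card on every 
iteration, swamping the actual GpuGemv work. The usual Theano remedy is to 
store the data in theano.shared variables (as float32) so it stays resident 
on the GPU. As a rough CPU-side illustration of how small one Gemv is 
relative to touching all the data, here is a NumPy sketch; the 400x4000 
shape is an assumption based on the N=4000 features mentioned above, and an 
in-memory copy is only a stand-in for a (much slower) PCIe transfer:

```python
import time
import numpy as np

# Hypothetical problem size: 400 examples x 4000 features (an assumption;
# the actual logistic regression example may use different shapes).
rng = np.random.default_rng(0)
x = rng.standard_normal((400, 4000)).astype(np.float32)
w = rng.standard_normal(4000).astype(np.float32)

def avg_time(f, reps=100):
    f()  # warm-up
    t0 = time.perf_counter()
    for _ in range(reps):
        f()
    return (time.perf_counter() - t0) / reps

gemv_t = avg_time(lambda: x @ w)     # the per-iteration compute (one Gemv)
copy_t = avg_time(lambda: x.copy())  # stand-in for moving the data each step

print(f"gemv: {gemv_t:.2e} s   copy: {copy_t:.2e} s")
```

When per-call compute is this small, any per-iteration data movement is pure 
overhead, and a low-end card like the GT 610 cannot make it back on the Gemv 
itself.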

-- 
You received this message because you are subscribed to the Google Groups 
"theano-users" group.
