There are fixed startup costs that do not amortize well over only 400 elements.

What happens when you vary the size of the array over several orders
of magnitude?

Eli

On Mon, Apr 9, 2012 at 2:05 PM, Serra, Mr. Efren, Contractor, Code
7542 <efren.serra....@nrlmry.navy.mil> wrote:
> import numpy
> """
> """
> import pycuda.driver as cuda
> import pycuda.tools
> import pycuda.gpuarray as gpuarray
> import pycuda.autoinit, pycuda.compiler
>
> a=numpy.arange(400)
> a_gpu=gpuarray.arange(400,dtype=numpy.float32)
>
> start=cuda.Event()
> end=cuda.Event()
> start.record()
> gpuarray.sum(a_gpu).get()/a.size
> end.record()
> end.synchronize()
> print "GPU array time: %fs" %(start.time_till(end)*1e-3)
>
> start.record()
> numpy.sum(a)/a.size
> end.record()
> end.synchronize()
> print "numpy array time: %fs" %(start.time_till(end)*1e-3)
>
> GPU array time: 0.000377s
> numpy array time: 0.000001s
>
> Efren A. Serra (Contractor)
> DeVine Consulting, Inc.
> Naval Research Laboratory
> Marine Meteorology Division
> 7 Grace Hopper Ave., STOP 2
> Monterey, CA 93943
> Code 7542
> Office: 831-656-4650
>
>
> _______________________________________________
> PyCUDA mailing list
> PyCUDA@tiker.net
> http://lists.tiker.net/listinfo/pycuda

_______________________________________________
PyCUDA mailing list
PyCUDA@tiker.net
http://lists.tiker.net/listinfo/pycuda

Reply via email to