i'm afraid i'm of no use there. been windows-free for going on 10 years now. :)

ananth ranga wrote:
Oh that's great, thanks a lot. Really appreciate it. I am trying to
install PyCUDA on Windows and am kind of struggling with it. Could you
please run me through it? I have VS 05 and 08 but not 03 — is that
fine?

On Tue, Jun 30, 2009 at 11:49 AM, Derek Anderson<[email protected]> wrote:
well, both matrices have to be squarish.  but even for, say, (100×120)*(120×100),
i would think not.  here are my performance numbers from when i wrote it
(includes memory transfer times):

(4160×4160)*(4160×4160) = 43.0X faster than numpy
(4096×4096)*(4096×4096) = 34.0X
(3900×3900)*(3900×3900) = 47.3X
(2048×2048)*(2048×2048) = 28.2X
(1024×1024)*(1024×1024) = 58.8X
(512×512)*(512×512) = 24.1X
(256×256)*(256×256) = 6.3X
(128×128)*(128×128) = 1.1X

CPU: Intel(R) Core(TM)2 Duo CPU E8400 @ 3.00GHz stepping 06
GPU: nVidia Corporation GeForce 8800 GT (rev a2)
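
(for context, the cpu side of a benchmark like the one above can be reproduced with plain numpy — a minimal sketch of the timing methodology, not the code that produced those numbers; the gpu side would be a cublas sgemm call, omitted here:)

```python
import time
import numpy as np

def time_numpy_matmul(n, repeats=3):
    """Time an (n x n) * (n x n) single-precision multiply with numpy.

    Returns the product and the best wall-clock time over `repeats` runs.
    """
    a = np.random.rand(n, n).astype(np.float32)
    b = np.random.rand(n, n).astype(np.float32)
    best = float("inf")
    c = None
    for _ in range(repeats):
        start = time.time()
        c = np.dot(a, b)              # BLAS sgemm under the hood
        best = min(best, time.time() - start)
    return c, best

c, secs = time_numpy_matmul(256)
print("256x256 multiply: %.4f s" % secs)
```

dividing the numpy time by the gpu time (transfers included) gives speed-up figures in the style of the table above.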

but, you *might* get a modest increase (<5x) if you're keeping the matrices
on the card and performing the multiplications many times before you pull
the results back to main memory.  (likely, if you're doing svd :)

derek


ananth ranga wrote:
Hey, mine is also a pretty evenly sized matrix — it's (120*100). So are you
suggesting that for this evenly sized small matrix I can expect a speed-up
in the SVD calculation? Or do you mean it should be a larger, evenly sized
matrix to get a good speed-up?

On Tue, Jun 30, 2009 at 11:31 AM, Derek Anderson<[email protected]> wrote:
np.  yes, for more evenly sized matrices it's much faster (for >500^2 too).
btw, if matrix multiplication is all you're looking for, i wrote a
numpy wrapper for it a while back:

http://kered.org/blog/2009-04-13/easy-python-numpy-cuda-cublas/

derek


ananth ranga wrote:
Thanks, Derek. I read a paper which suggests a speed-up of up to 60x
when the matrix size is big, and almost none for sizes below (500 *
500).

On Tue, Jun 30, 2009 at 9:53 AM, Derek Anderson<[email protected]> wrote:
my experience with trying to cuda-ize svd/nmf calculations is that they're
not really a good fit for cuda.  specifically, most of your expensive
operations are matrix multiplications over very long and narrow matrices
(m×k or k×n), where m~=n (within an order of magnitude) but k<<(m|n).
even when m~=2^16 (the max for cublas matrices) and k<2^8, i was barely
breaking even with normal cpu-based blas libs.
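
(one way to see why those shapes break even: the flop count of a (m×k)*(k×n) multiply grows like m·n·k, while the data you have to move grows like m·k + k·n + m·n, so for small k there is very little arithmetic per byte transferred.  a quick illustrative calculation — my own sketch of the intuition, not Derek's measurement:)

```python
def arithmetic_intensity(m, k, n, bytes_per_elem=4):
    """FLOPs per byte moved for a (m x k) * (k x n) single-precision multiply."""
    flops = 2.0 * m * n * k                          # one mul + one add per term
    bytes_moved = bytes_per_elem * (m * k + k * n + m * n)
    return flops / bytes_moved

# square case vs the long-and-narrow case described above
square = arithmetic_intensity(4096, 4096, 4096)
narrow = arithmetic_intensity(2**16, 255, 2**16)     # m~=2^16, k<2^8

print("square intensity: %.1f flops/byte" % square)
print("narrow intensity: %.1f flops/byte" % narrow)
```

the square case does several times more arithmetic per byte moved, which is why it amortizes the transfer cost and the narrow case doesn't.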

derek


ananth ranga wrote:
Hello people,

      I am Ranga, a new member of the group.  I have a problem of
finding the SVD of a matrix of size 120*100. On a CPU, the VTK-
implemented version takes about 5 ms to evaluate. So I was
wondering if a PyCUDA version of it could give me a better result
regarding speed.

If anyone has a PyCUDA version of the SVD calculation, could you
please help me out.
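
(for reference, the cpu baseline being discussed can be reproduced with numpy — a sketch only; the 5 ms figure in the question came from a VTK implementation, not this code:)

```python
import time
import numpy as np

# the matrix size from the question: 120 x 100
a = np.random.rand(120, 100)

start = time.time()
u, s, vt = np.linalg.svd(a, full_matrices=False)  # thin SVD
elapsed = time.time() - start

# sanity check: U * diag(s) * Vt reconstructs A
reconstructed = np.dot(u * s, vt)
print("svd of 120x100 took %.3f ms" % (elapsed * 1e3))
print("max reconstruction error:", np.abs(a - reconstructed).max())
```
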

Thanks,
ranga

_______________________________________________
PyCUDA mailing list
[email protected]
http://tiker.net/mailman/listinfo/pycuda_tiker.net



