i'm afraid i'm of no use there. been windows-free for going on 10 years
now. :)
ananth ranga wrote:
Oh that's great, thanks a lot. Really appreciate it. I am trying to
install pycuda on windows and kind of struggling with it. Could you
please run me through it? I have VS 05 and 08 but not 03, is that
fine?
On Tue, Jun 30, 2009 at 11:49 AM, Derek Anderson<[email protected]> wrote:
well, both matrices have to be squarish. but even for say 100x120*120x100,
i would think not. here were my performance numbers when i wrote it:
(includes memory transfer times)
(4160×4160)*(4160×4160) = 43.0X faster than numpy
(4096×4096)*(4096×4096) = 34.0X
(3900×3900)*(3900×3900) = 47.3X
(2048×2048)*(2048×2048) = 28.2X
(1024×1024)*(1024×1024) = 58.8X
(512×512)*(512×512) = 24.1X
(256×256)*(256×256) = 6.3X
(128×128)*(128×128) = 1.1X
CPU: Intel(R) Core(TM)2 Duo CPU E8400 @ 3.00GHz stepping 06
GPU: nVidia Corporation GeForce 8800 GT (rev a2)
but, you *might* get a modest increase (<5x) if you're keeping the matrices
on the card and performing the multiplications many times before you pull it
back to main memory. (likely, if you're doing svd :)
derek
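
For anyone wanting to reproduce the CPU side of a comparison like the one above, here is a minimal sketch that times square numpy products. It is not the benchmark used for the quoted numbers; the helper name `time_matmul`, the float32 dtype, and the chosen sizes are assumptions made for illustration. A GPU timing would have to include the host-to-device and device-to-host copies to be comparable with the figures above.

```python
import time
import numpy as np

def time_matmul(n, repeats=3, dtype=np.float32):
    """Return the best wall-clock time (seconds) for an (n x n)*(n x n) product."""
    rng = np.random.default_rng(0)
    a = rng.random((n, n), dtype=dtype)
    b = rng.random((n, n), dtype=dtype)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        np.dot(a, b)  # CPU BLAS-backed product
        best = min(best, time.perf_counter() - start)
    return best

if __name__ == "__main__":
    # Sizes are illustrative; extend the tuple to cover larger matrices.
    for n in (128, 256, 512, 1024):
        print(f"({n}x{n})*({n}x{n}): {time_matmul(n) * 1e3:.2f} ms on CPU")
```

Taking the best of a few repeats reduces noise from other processes, which matters most at the small sizes where the whole product takes well under a millisecond.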
ananth ranga wrote:
Hey, mine is also a pretty evenly sized matrix. It's (120*100). So are you
suggesting that for this evenly sized small matrix I can expect a speedup
in the SVD calculation? Or do you mean it should be a larger and
evenly sized matrix to get a good speedup?
On Tue, Jun 30, 2009 at 11:31 AM, Derek Anderson<[email protected]> wrote:
np. yes, for more evenly sized matrices it's much faster. (for >500^2
too)
btw if just matrix multiplication is what you're looking for, i wrote a
numpy wrapper for it a while back:
http://kered.org/blog/2009-04-13/easy-python-numpy-cuda-cublas/
derek
ananth ranga wrote:
Thanks derek. I read a paper which suggests a speedup of up to 60x
when the matrix size is big, and about even for sizes less than (500 *
500).
On Tue, Jun 30, 2009 at 9:53 AM, Derek Anderson<[email protected]> wrote:
my experience with trying to cuda-ize svd/nmf calculations is that they're
not really a good fit for cuda. specifically, most of your expensive
operations are matrix multiplications over very long and narrow matrices
(mxk or kxn), where m~=n (within an order of mag) but k<<(m|n). even when
m~=2^16 (the max for cublas matrices) and k<2^8, i was barely breaking even
with normal cpu-based blas libs.
derek
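
One possible way to see why long, narrow products are a hard case is to compare the arithmetic performed with the data that has to be moved. The sketch below is a back-of-the-envelope estimate, not a measurement and not something from the thread; the function name `flops_per_byte`, the float32 element size, and the "touch every matrix once" traffic model are all assumptions.

```python
def flops_per_byte(m, k, n, bytes_per_element=4):
    """Rough arithmetic intensity of an (m x k)*(k x n) product.

    Counts 2*m*k*n floating-point operations against reading both
    inputs and writing the output exactly once. Caching, tiling and
    transfer overheads are deliberately ignored.
    """
    flops = 2 * m * k * n
    bytes_moved = bytes_per_element * (m * k + k * n + m * n)
    return flops / bytes_moved

if __name__ == "__main__":
    # A square product versus a long, narrow one (k << m, n).
    print("square 4096:", flops_per_byte(4096, 4096, 4096))
    print("narrow k=128:", flops_per_byte(65536, 128, 65536))
```

Under this model, with float32 and m = n, the ratio approaches k/2 as m grows, so a small k caps how much work is done per byte moved no matter how large m and n become.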
ananth ranga wrote:
Hello people,
I am Ranga, a new member of the group. I have the problem of
finding the SVD of a matrix of size 120*100. On a CPU with the VTK
implementation it takes about 5 ms per evaluation. So I was
wondering if a pycuda version of it could give me a better result
regarding the speed.
If anyone has a pycuda version of the SVD calculation, could you please
help me out?
Thanks,
ranga
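
As a point of comparison for the timing mentioned above, here is a minimal sketch that times a CPU SVD of a 120x100 matrix with numpy. It does not use VTK or pycuda; the random input is a stand-in for the real data and is an assumption, as is the choice of `full_matrices=False`.

```python
import time
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((120, 100))  # random stand-in for the real 120x100 matrix

start = time.perf_counter()
u, s, vt = np.linalg.svd(a, full_matrices=False)
elapsed_ms = (time.perf_counter() - start) * 1e3
print(f"numpy SVD of a 120x100 matrix: {elapsed_ms:.2f} ms")

# Sanity check: the factors reproduce the input up to rounding error.
assert np.allclose(a, (u * s) @ vt)
```

A single run like this includes one-time warm-up costs, so repeating the call in a loop and taking the best time gives a fairer figure.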
_______________________________________________
PyCUDA mailing list
[email protected]
http://tiker.net/mailman/listinfo/pycuda_tiker.net