On Nov 23, 2:46 pm, "Jason Martin" <[EMAIL PROTECTED]>
wrote:
> So, as I look over the CUDA specification I don't see support for some
> important integer operations like: shift, rot, mul, and div.  I
> suppose that left shift could be implemented by repeated adds, but I
> can't see an easy way to implement right shift (if I'm missing
> something, or if rot is a simple thing to implement with just add,
> sub, and, xor, or, then please correct me).  Likewise, there's no
> access to carry bits.  Of course, we can deal without carry bits for
> this much parallelism...

left shift:
 answer = val << count;
is equivalent to:
 answer = val * 2^count; // 2^count means 2 to the power count (not C's XOR); use a table of 2^n

right shift:
 answer = val >> count;
is equivalent to (for unsigned val):
 answer = val / 2^count; // use the same table of 2^n
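
To make the table idea concrete, here is a minimal C sketch for 32-bit unsigned values. The names pow2, init_pow2, shl_via_mul and shr_via_div are made up for illustration, and it assumes count is in the range 0..31. The table is filled by repeated doubling, so no shift instruction is used anywhere.

```c
#include <stdint.h>

/* pow2[n] holds 2 to the power n for n = 0..31 (hypothetical name). */
static uint32_t pow2[32];

/* Fill the table using only addition: each entry is double the previous one. */
static void init_pow2(void)
{
    pow2[0] = 1u;
    for (int n = 1; n < 32; ++n)
        pow2[n] = pow2[n - 1] + pow2[n - 1];
}

/* Left shift as a multiply; the product wraps modulo 2^32 like val << count. */
static uint32_t shl_via_mul(uint32_t val, unsigned count)
{
    return val * pow2[count];
}

/* Right shift as an unsigned divide; the quotient truncates like val >> count. */
static uint32_t shr_via_div(uint32_t val, unsigned count)
{
    return val / pow2[count];
}
```

Call init_pow2() once before using the other two functions. The divide version only matches a logical right shift for unsigned values; signed negative values round differently under division.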

I think instead of shift, you may actually be talking about rol and
ror; it is the rotate-through-carry variants (rcl and rcr) that need
carry bits to work properly.
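
As a side note, a plain 32-bit rotate can be sketched in C with two shifts and an OR. The function names rotl32 and rotr32 are my own, and this assumes the target gives you unsigned 32-bit shifts in the first place (or the multiply/divide substitutes above).

```c
#include <stdint.h>

/* Rotate x left by n bits; n is reduced modulo 32 so n == 0 is safe. */
static uint32_t rotl32(uint32_t x, unsigned n)
{
    n &= 31u;
    return (x << n) | (x >> ((32u - n) & 31u));
}

/* Rotate x right by n bits; n is reduced modulo 32 so n == 0 is safe. */
static uint32_t rotr32(uint32_t x, unsigned n)
{
    n &= 31u;
    return (x >> n) | (x << ((32u - n) & 31u));
}
```

The extra "& 31u" on the second shift count avoids shifting by 32, which is undefined behaviour in C for a 32-bit type.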

If you want to work with values wider than the native register
width, then use a C++ class that implements Karatsuba or Toom
multiplication if the values are not too huge, and FFT-based
multiplication if they are titanic in size.
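
For a feel of the Karatsuba trick at the smallest scale, here is a rough C sketch that multiplies two 32-bit values into a 64-bit product using three half-width multiplications instead of four. The name karatsuba32 is hypothetical, and a real multi-precision class would apply the same split recursively to arrays of limbs rather than to a single word.

```c
#include <stdint.h>

/* 32x32 -> 64 bit multiply via one Karatsuba step on 16-bit halves. */
static uint64_t karatsuba32(uint32_t x, uint32_t y)
{
    /* Split each operand: x = x1 * 2^16 + x0, y = y1 * 2^16 + y0. */
    uint64_t x1 = x >> 16, x0 = x & 0xFFFFu;
    uint64_t y1 = y >> 16, y0 = y & 0xFFFFu;

    uint64_t z2 = x1 * y1;                         /* high halves */
    uint64_t z0 = x0 * y0;                         /* low halves  */
    /* Middle term x1*y0 + x0*y1, recovered from a single extra multiply. */
    uint64_t z1 = (x1 + x0) * (y1 + y0) - z2 - z0;

    /* Recombine: x*y = z2 * 2^32 + z1 * 2^16 + z0. */
    return (z2 << 32) + (z1 << 16) + z0;
}
```

The intermediates are held in 64-bit variables because (x1 + x0) * (y1 + y0) can exceed 32 bits.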

> But, it looks like a GeForce GTX 260 is only $220.  So, I think I'll
> go ahead and order one and start playing around with it.

Take a look at Microway's Tesla GPU + InfiniBand cluster with 36
quad-core CPUs, 36 Tesla 1 TFLOP GPUs and 40 Gbit/sec QDR ConnectX
InfiniBand on this page:
http://www.microway.com/tesla/

It would rank somewhere between 70 and 80 on the list of the world's
most powerful supercomputers:
http://www.top500.org/list/2008/11/100
