On Nov 23, 2:46 pm, "Jason Martin" <[EMAIL PROTECTED]> wrote:
> So, as I look over the CUDA specification I don't see support for some
> important integer operations like: shift, rot, mul, and div.  I
> suppose that left shift could be implemented by repeated adds, but I
> can't see an easy way to implement right shift (if I'm missing
> something, or if rot is a simple thing to implement with just add,
> sub, and, xor, or, then please correct me).  Likewise, there's no
> access to carry bits.  Of course, we can deal without carry bits for
> this much parallelism...

left shift:
    answer = val << count;
is equivalent to:
    answer = val * 2^count;  // 2^count = 2 to the power count; use a table of 2^n

right shift (for unsigned values):
    answer = val >> count;
is equivalent to:
    answer = val / 2^count;  // use a table of 2^n

I think instead of shift, you may actually be talking about rol and
ror, which need carry bits to work properly.

If you want to work with objects larger in precision than the register
float width, then use a C++ class that uses Karatsuba or Toom
multiplication if the values are not too huge, and FFTs if they are
titanic in size.

> But, it looks like a GeForce GTX 260 is only $220.  So, I think I'll
> go ahead and order one and start playing around with it.

Take a look at Microway's Tesla GPU + InfiniBand Cluster, with 36
quad-core CPUs, 36 Tesla 1 TFLOP GPUs and 40 Gbit/sec QDR ConnectX
InfiniBand, on this page:
http://www.microway.com/tesla/

It would be somewhere between 70 and 80 on the list of the world's
most powerful supercomputers:
http://www.top500.org/list/2008/11/100
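
The table-of-2^n idea above can be sketched in plain C as follows. This is only an illustration under stated assumptions: 32-bit unsigned values, and the names POW2, init_pow2, shl_by_table, shr_by_table and rotl_by_table are made up for this example, not taken from CUDA or MPIR. The last helper shows one way a plain rotate (bits wrap around, not rotate-through-carry) might be assembled from the two table-based shifts.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical lookup table: POW2[n] holds 2 to the power n, n = 0..31. */
static uint32_t POW2[32];

/* Fill the table by repeated doubling, using addition only. */
static void init_pow2(void) {
    uint32_t p = 1;
    for (int n = 0; n < 32; ++n) {
        POW2[n] = p;
        p += p; /* unsigned wrap-around on the final pass is harmless */
    }
}

/* Left shift as a multiply; the result wraps modulo 2^32 like val << count. */
static uint32_t shl_by_table(uint32_t val, unsigned count) {
    assert(count < 32);
    return (uint32_t)(val * POW2[count]);
}

/* Right shift as a divide; matches val >> count for unsigned values only. */
static uint32_t shr_by_table(uint32_t val, unsigned count) {
    assert(count < 32);
    return val / POW2[count];
}

/* Plain rotate left: the two halves never overlap, so add acts like or. */
static uint32_t rotl_by_table(uint32_t val, unsigned count) {
    assert(count < 32);
    if (count == 0) {
        return val;
    }
    return shl_by_table(val, count) + shr_by_table(val, 32 - count);
}
```

Call init_pow2() once before using the helpers. The divide only matches a right shift for unsigned operands; for negative signed values, division rounds toward zero, so the results can differ from an arithmetic shift.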
