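I think it is impossible to implement shift right with those primitives (add, sub, and, xor, or). Each of them propagates information only to the left (the carries and borrows of add and sub move towards the higher bits) or not at all (and, xor and or work on each bit position independently).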
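To make that concrete, here is a rough sketch in plain C (nothing CUDA-specific). The function names are my own and purely illustrative. The first function is the "left shift by repeated adds" idea. The second holds a 32 bit value in the low half of a 64 bit word and again uses only additions, so bits still only ever move left, yet the upper half ends up holding the right-shifted value. The third is only there for checking: it reads the upper half with a real right shift, on the assumption that actual hardware would expose the two halves as separate 32 bit registers.

```c
#include <assert.h>
#include <stdint.h>

/* x + x doubles x, so n repeated additions shift left by n (mod 2^32). */
static uint32_t shl_by_adds(uint32_t x, unsigned n) {
    while (n-- > 0) {
        x = x + x;
    }
    return x;
}

/* Hypothetical helper: place x in the low half of a 64 bit word and shift
   left by 32 - n using additions only. Afterwards the upper 32 bits hold
   x >> n and the lower 32 bits hold the bits that were "shifted out". */
static uint64_t shr_via_shl64(uint32_t x, unsigned n) {
    assert(n <= 32);
    uint64_t w = x;
    for (unsigned i = 0; i < 32 - n; i++) {
        w = w + w;
    }
    return w;
}

/* For checking only: uses a real right shift to read the upper half.
   Assumption: real hardware would expose the halves as separate registers. */
static uint32_t high_half(uint64_t w) {
    return (uint32_t)(w >> 32);
}
```

For example, high_half(shr_via_shl64(0xF0, 4)) gives 0xF, the same as 0xF0 >> 4.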
However, given a 32 bit quantity in the lower 32 bits of a 64 bit word, shift right by n can be simulated by shift left by 32 - n. The shifted value then sits in the upper 32 bits of the word.

Bill.

2008/11/23 Jason Martin <[EMAIL PROTECTED]>:
>
> So, as I look over the CUDA specification I don't see support for some
> important integer operations like: shift, rot, mul, and div. I
> suppose that left shift could be implemented by repeated adds, but I
> can't see an easy way to implement right shift (if I'm missing
> something, or if rot is a simple thing to implement with just add,
> sub, and, xor, or, then please correct me). Likewise, there's no
> access to carry bits. Of course, we can deal without carry bits for
> this much parallelism...
>
> But, it looks like a GeForce GTX 260 is only $220. So, I think I'll
> go ahead and order one and start playing around with it.
>
> Jason Worth Martin
> Asst. Professor of Mathematics
> http://www.math.jmu.edu/~martin
>
> On Sun, Nov 23, 2008 at 4:22 PM, Bill Hart <[EMAIL PROTECTED]> wrote:
>>
>> Sorry I mean M4RI, not GF2X.
>>
>> 2008/11/23 Bill Hart <[EMAIL PROTECTED]>:
>>> I looked up the NVIDIA CUDA docs here:
>>> http://developer.download.nvidia.com/compute/cuda/2_0/docs/NVIDIA_CUDA_Programming_Guide_2.0.pdf
>>>
>>> It looks like section C2.3 describes an atomic Xor function. That
>>> should be just what is needed for GF2X.
>>>
>>> I can see some definite potential in doing exact arithmetic too. One
>>> would implement a floating point FFT. It doesn't matter that if one
>>> wanted a proved result one would have to work with a hopelessly slow
>>> bound. With that many cores it would be irrelevant. You'd still be a
>>> factor of 30-100 times faster than a single core machine!
>>>
>>> Bill.
>>>
>>> 2008/11/23 mabshoff <[EMAIL PROTECTED]>:
>>>>
>>>> On Nov 23, 12:38 pm, "Bill Hart" <[EMAIL PROTECTED]> wrote:
>>>>> Perhaps if you, me, John C, mabshoff and the people he is working with
>>>>> all signed off on it.
>>>>
>>>> The people I am working with here is basically Clement Pernet. There
>>>> are also other people from the LinBox universe working on GPU code,
>>>> i.e. Pascal Giorgi.
>>>>
>>>> Another interesting angle here could be m4ri since the XORing engine
>>>> on the GPU should be insanely fast, but last time I talked to malb he
>>>> wasn't very enthusiastic about it.
>>>>
>>>>> I could also mention the "seed funding" EPSRC have given me through my
>>>>> grant, for hardware and my salary, specifically for developing "fast
>>>>> core arithmetic for parallel processors and platforms".
>>>>
>>>> Cool.
>>>>
>>>>> We could actually make the application look quite impressive I think.
>>>>
>>>> One would hope so.
>>>>
>>>>> Bill.
>>>>
>>>> Cheers,
>>>>
>>>> Michael

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "mpir-devel" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/mpir-devel?hl=en
-~----------~----~----~----~------~----~------~--~---
