I think it is impossible to implement shift right with those
primitives (add, sub, and, xor, or). Each of them propagates
information only to the left, i.e. towards the higher bits, or not at
all.

However, given a 32 bit quantity in the lower 32 bits of a 64 bit
word, shift right by n can be simulated by shift left by 32 - n: the
shifted result then sits in the upper 32 bits of the word.
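
A minimal sketch of that trick in C, assuming the 64 bit word is held
as two 32 bit halves and that left shift is done by repeated adds as
Jason suggests. The names word64, double64 and shr_via_shl are made up
for illustration, and the carry between the halves is recovered with
an AND and a test, since there is no carry flag to read:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 64 bit word stored as two 32 bit halves. */
typedef struct { uint32_t hi, lo; } word64;

/* Shift left by one bit using only add and and. */
static word64 double64(word64 w) {
    /* The top bit of lo is the carry into hi. */
    uint32_t carry = (w.lo & 0x80000000u) ? 1u : 0u;
    w.hi = w.hi + w.hi + carry;
    w.lo = w.lo + w.lo;
    return w;
}

/* x >> n for 0 <= n <= 32, simulated as a left shift by 32 - n. */
uint32_t shr_via_shl(uint32_t x, unsigned n) {
    assert(n <= 32u);
    word64 w = {0u, x};              /* x starts in the lower half */
    for (unsigned i = 0; i < 32u - n; i++)
        w = double64(w);
    return w.hi;                     /* result is the upper half */
}
```

This costs 32 - n doublings per shift, so it is only a way of showing
that the operation can be expressed, not a fast one.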

Bill.

2008/11/23 Jason Martin <[EMAIL PROTECTED]>:
>
> So, as I look over the CUDA specification I don't see support for some
> important integer operations like: shift, rot, mul, and div.  I
> suppose that left shift could be implemented by repeated adds, but I
> can't see an easy way to implement right shift (if I'm missing
> something, or if rot is a simple thing to implement with just add,
> sub, and, xor, or, then please correct me).  Likewise, there's no
> access to carry bits.  Of course, we can do without carry bits for
> this much parallelism...
>
> But, it looks like a GeForce GTX 260 is only $220.  So, I think I'll
> go ahead and order one and start playing around with it.
>
> Jason Worth Martin
> Asst. Professor of Mathematics
> http://www.math.jmu.edu/~martin
>
>
>
> On Sun, Nov 23, 2008 at 4:22 PM, Bill Hart <[EMAIL PROTECTED]> wrote:
>>
>> Sorry I mean M4RI, not GF2X.
>>
>> 2008/11/23 Bill Hart <[EMAIL PROTECTED]>:
>>> I looked up the NVIDIA Cuda docs here:
>>> http://developer.download.nvidia.com/compute/cuda/2_0/docs/NVIDIA_CUDA_Programming_Guide_2.0.pdf
>>>
>>> It looks like section C2.3 describes an atomic Xor function. That
>>> should be just what is needed for GF2X.
>>>
>>> I can see some definite potential in doing exact arithmetic too. One
>>> would implement a floating point FFT. It wouldn't matter that, to get
>>> a proved result, one would have to work with a hopelessly slow
>>> bound. With that many cores it would be irrelevant. You'd still be a
>>> factor of 30-100 times faster than a single core machine!
>>>
>>> Bill.
>>>
>>> 2008/11/23 mabshoff <[EMAIL PROTECTED]>:
>>>>
>>>>
>>>>
>>>> On Nov 23, 12:38 pm, "Bill Hart" <[EMAIL PROTECTED]> wrote:
>>>>> Perhaps if you, me, John C, mabshoff and the people he is working with
>>>>> all signed off on it.
>>>>
>>>> The person I am working with here is basically Clement Pernet. There
>>>> are also other people from the LinBox universe working on GPU code,
>>>> i.e. Pascal Giorgi.
>>>>
>>>> Another interesting angle here could be m4ri since the XORing engine
>>>> on the GPU should be insanely fast, but last time I talked to malb he
>>>> wasn't very enthusiastic about it.
>>>>
>>>>> I could also mention the "seed funding" EPSRC have given me through my
>>>>> grant, for hardware and my salary, specifically for developing "fast
>>>>> core arithmetic for parallel processors and platforms".
>>>>
>>>> Cool.
>>>>
>>>>> We could actually make the application look quite impressive I think.
>>>>
>>>> One would hope so.
>>>>
>>>>> Bill.
>>>>
>>>> Cheers,
>>>>
>>>> Michael
