(Changing the thread title to be a little more relevant than
"Fast computation of binomial coefficients".)
I now have a Tesla C1060 plugged into a Dell T7400 box running
RHEL5 and am learning how to use CUDA for non-trivial
computations. I'm starting to have a little free time again, as
well. Can't promise much time just yet, but hope to have more
later in the year.
Anyway, Marc Glisse pointed out one very useful property of nails
when running on GPUs: constant-time add/sub. Another is that, for
the version 1 hardware at least (which includes all shipped
products), integer multiplication on NVIDIA cards is markedly
faster for 24*24 bit products than for full-word (i.e. 32-bit)
products. An implementation with 8-bit nails allows a large
number of carry propagation steps to be postponed. I've not yet
taken measurements but expect the carry saving to more than
compensate for the smaller effective word size. Note that I'm not
saying that the high-level MPIR representation should be 24+8
bits, only that the GPU should work with that representation
internally.
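
To make the deferred-carry idea concrete, here is a minimal sketch
in plain C (not CUDA, and not MPIR code). It assumes limbs stored
least significant first in 32-bit words, with 24 value bits and
8 nail bits each. The names nail_add_lazy and nail_normalize are
made up for illustration.

```c
#include <stdint.h>
#include <stddef.h>

#define LIMB_BITS 24u
#define LIMB_MASK ((1u << LIMB_BITS) - 1u)

/* Limb-wise add with no carry propagation. Every limb is
 * independent, so on a GPU each thread could handle one limb.
 * The overflow simply accumulates in the 8 nail bits. */
void nail_add_lazy(uint32_t *r, const uint32_t *a,
                   const uint32_t *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        r[i] = a[i] + b[i];
}

/* Single carry pass: push the nail bits of each limb into the
 * next limb and leave every limb below 2^24. Returns the carry
 * out of the top limb. */
uint32_t nail_normalize(uint32_t *r, size_t n)
{
    uint32_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t t = r[i] + carry;
        r[i] = t & LIMB_MASK;
        carry = t >> LIMB_BITS;
    }
    return carry;
}
```

With this layout up to 256 normalized operands can be summed
lazily before a normalize pass is needed, since
256 * (2^24 - 1) still fits in 32 bits.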
My interests, primarily integer factorization, also lead me to
consider CRT representations, where multiplication is a
linear-time operation. However, that is almost certainly a step
too far for an MPIR implementation!
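
For anyone unfamiliar with the CRT idea, a toy sketch in plain C
follows. The three moduli (4096, 4095, 4093) are picked only
because they are obviously pairwise coprime, and the function
names are hypothetical. A number is held as its residues, a
multiply is one independent modular product per residue, and the
conversion back (Garner's method here, with a brute-force inverse)
is the expensive part.

```c
#include <stdint.h>

#define RNS_K 3
/* Pairwise coprime: a power of two and two odd numbers
 * differing by 2. Their product is about 6.8e10. */
static const uint32_t RNS_MOD[RNS_K] = {4096u, 4095u, 4093u};

/* Convert x (assumed smaller than the product of the moduli). */
void rns_from_u64(uint32_t r[RNS_K], uint64_t x)
{
    for (int i = 0; i < RNS_K; i++)
        r[i] = (uint32_t)(x % RNS_MOD[i]);
}

/* Multiplication: one independent product per residue, so the
 * cost is linear in the number of residues. */
void rns_mul(uint32_t c[RNS_K], const uint32_t a[RNS_K],
             const uint32_t b[RNS_K])
{
    for (int i = 0; i < RNS_K; i++)
        c[i] = (uint32_t)(((uint64_t)a[i] * b[i]) % RNS_MOD[i]);
}

/* Brute-force inverse of a modulo m; fine for tiny moduli. */
static uint64_t inv_mod(uint64_t a, uint64_t m)
{
    for (uint64_t t = 1; t < m; t++)
        if ((a * t) % m == 1)
            return t;
    return 0; /* no inverse: a and m are not coprime */
}

/* Garner-style reconstruction: extend x one modulus at a time
 * so that it matches every residue seen so far. */
uint64_t rns_to_u64(const uint32_t r[RNS_K])
{
    uint64_t x = 0, prod = 1;
    for (int i = 0; i < RNS_K; i++) {
        uint64_t m = RNS_MOD[i];
        uint64_t diff = (r[i] + m - x % m) % m;
        uint64_t t = (diff * inv_mod(prod % m, m)) % m;
        x += prod * t;
        prod *= m;
    }
    return x;
}
```

The result is only meaningful while the true product stays below
the product of the moduli, which is why a real implementation
would need many more (and larger) moduli.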
Paul