Hi Paul,

Great to hear you have some serious hardware hooked up!

Some of us will be in Seattle for a Sage Days conference in two weeks
(I am already here visiting), and we are planning a GPU party as a
first step towards doing some GPU computations for MPIR. I would
imagine we'll just be writing some CUDA at first. But a first step is
required before a second step, etc. :-)

[I'm also going to be giving some talks on using MPIR for fast
computations (many people don't know how to use the full power of a
large integer package).]

Integer factorisation is obviously of great interest to me personally,
as that is something I have worked on a fair bit. But it isn't
something we want to put into MPIR as it would essentially sidetrack
us from our primary mission. But don't let that put you off developing
fast subordinate routines that are useful for that and contributing
them to MPIR!

The best way to go about that is to first discuss the planned
interface of the functions you want, on the list.

When you say, "CRT representations where multiplication is a linear
time operation", are you talking about representing large integers as
a vector of integers modulo a whole lot of small (e.g. 24 bit) primes?
That could be an interesting thing to try, given the capabilities of
GPUs.
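
To make sure we mean the same thing, here is a minimal sketch in C of
that kind of representation. It is only an illustration, not a proposed
MPIR interface: the three moduli (Mersenne primes, each under 24 bits),
the type rns_t and all the function names are made up for this example.
Multiplication is one independent multiply per modulus, so it is linear
in the number of moduli; the cost moves into converting back, done here
with a Garner-style mixed-radix reconstruction.

```c
#include <stdint.h>

/* Hypothetical residue representation: three pairwise coprime moduli,
   all Mersenne primes below 2^24 (2^13-1, 2^17-1, 2^19-1). */
#define RNS_LEN 3
static const uint64_t rns_mod[RNS_LEN] = {8191, 131071, 524287};

typedef struct { uint64_t r[RNS_LEN]; } rns_t;

/* b^e mod m; safe in 64 bits because every modulus is below 2^24. */
static uint64_t powmod(uint64_t b, uint64_t e, uint64_t m)
{
    uint64_t acc = 1;
    b %= m;
    while (e) {
        if (e & 1) acc = acc * b % m;
        b = b * b % m;
        e >>= 1;
    }
    return acc;
}

/* Convert an ordinary integer to its vector of residues. */
static rns_t rns_from_u64(uint64_t x)
{
    rns_t a;
    for (int i = 0; i < RNS_LEN; i++) a.r[i] = x % rns_mod[i];
    return a;
}

/* Componentwise product: one small multiply per modulus, no carries
   between components, so each could go to its own GPU thread. */
static rns_t rns_mul(rns_t a, rns_t b)
{
    rns_t c;
    for (int i = 0; i < RNS_LEN; i++)
        c.r[i] = a.r[i] * b.r[i] % rns_mod[i];
    return c;
}

/* Garner-style reconstruction: x = v0 + v1*m0 + v2*m0*m1.
   Inverses come from Fermat's little theorem (the moduli are prime).
   The result is the unique value below m0*m1*m2. */
static uint64_t rns_to_u64(rns_t a)
{
    uint64_t v[RNS_LEN];
    for (int i = 0; i < RNS_LEN; i++) {
        uint64_t m = rns_mod[i];
        uint64_t t = a.r[i];
        for (int j = 0; j < i; j++) {
            uint64_t inv = powmod(rns_mod[j], m - 2, m);
            t = (t + m - v[j] % m) % m * inv % m;
        }
        v[i] = t;
    }
    uint64_t x = 0, radix = 1;
    for (int i = 0; i < RNS_LEN; i++) {
        x += v[i] * radix;
        radix *= rns_mod[i];
    }
    return x;
}
```

With these definitions, rns_to_u64(rns_mul(rns_from_u64(a),
rns_from_u64(b))) gives back a*b provided the true product is below
8191 * 131071 * 524287; a real implementation would need far more
moduli, and that bound is what decides how many.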

Incidentally, where does this sort of thing come in handy for
factorisation? Are you talking about the Number Field Sieve here?

Bill.

2009/5/4 Paul Leyland <[email protected]>:
>
>
>        (Changing the thread title to be a little more relevant than
>        "Fast computation of binomial coefficients".)
>
>        I now have a Tesla C1060 plugged into a Dell T7400 box running
>        RHEL5 and am learning how to use CUDA for non-trivial
>        computations.  I'm starting to have a little free time again, as
>        well.  Can't promise much time just yet, but hope to have more
>        later in the year.
>
>        Anyway, Marc Glisse indicated one very useful aspect (constant
>        time add/sub) of nails when running on GPUs.   Another is that,
>        for the version 1 hardware at least (which includes all shipped
>        products) integer multiplication on NVidia cards is markedly
>        faster for 24*24 bit products than for full-word (i.e. 32-bit)
>        products.
>        An implementation with 8-bit nails allows for a large number of
>        carry propagation steps to be postponed.  I've not yet taken
>        measurements but expect the carry saving to more than compensate
>        for the smaller effective word size.  Note that I'm not saying
>        that the high-level MPIR be 24+8 bits, only that the GPU work
>        with that representation internally.
>
>        My interests, primarily integer factorization, also lead me into
>        considering CRT representations where multiplication is a linear
>        time operation.  However, that is almost certainly a step too
>        far for an MPIR implementation!
>
>        Paul
