It turns out that for my problem it suffices to compute the
polynomials
1, f, f^2, ..., f^K
modulo N for some large but known number N, so I suspect that it may
be faster to work modulo p for various primes p and then use the
Chinese Remainder Theorem.
How large, roughly speaking, can I take the primes p?
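For concreteness, here is a minimal sketch (plain Python 3.8+; the
primes and the stand-in coefficient are illustrative, not from the
thread) of the mod-p plus CRT idea, applied to a single coefficient
known modulo several word-sized primes:

    def crt_pair(r1, m1, r2, m2):
        # Solve x = r1 (mod m1), x = r2 (mod m2) for coprime m1, m2.
        t = ((r2 - r1) * pow(m1, -1, m2)) % m2
        return r1 + m1 * t, m1 * m2

    def crt(residues, moduli):
        # Fold the two-modulus step over all the residues.
        r, m = residues[0], moduli[0]
        for r2, m2 in zip(residues[1:], moduli[1:]):
            r, m = crt_pair(r, m, r2, m2)
        return r  # the unique lift in [0, product of the moduli)

    primes = [2**61 - 1, 2**31 - 1, 4294967291]  # known primes
    c = 3**75               # stand-in coefficient, below their product
    print(crt([c % p for p in primes], primes) == c)  # True

The same idea applies coefficient by coefficient to the f^i computed
modulo each prime.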
At the moment, to get to K = 100, you are only using about 388 bits.
So the size here is not all that large, only a few limbs.
In general, arithmetic modulo primes that fit in a single limb is
faster than multiple precision arithmetic, though this is mainly due
to there being less overhead.
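As a quick sanity check on "only a few limbs" (assuming 64-bit limbs):

    bits = 388              # largest coefficient size at K = 100
    print(-(-bits // 64))   # 7: ceiling division, so seven limbs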
I did lots of experimenting. If I really go crazy with optimisation I
can get it down to about 1.93s. About another 0.05s is just taken up
figuring out which case we are in (e.g. everything fits in one limb,
or two limbs, or whatever). I could duplicate the code multiple times
for the different cases.
Thank you (everyone!) for the many extremely helpful comments and
links.
Recall that I need to compute
1, f, f^2, ..., f^K
for f in ZZ[x,y,z] and K known but large. (In fact I only need
certain coefficients of the f^i, but this does not seem to help very
much.)
I have implemented the most
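For readers following along, here is a minimal sketch (plain Python;
my own illustration, not Tom's implementation) of the computation
being described: the powers f^0, ..., f^K by repeated multiplication,
extracting only the coefficients of interest at each step:

    def mul(a, b):
        # Multiply polynomials stored as {(i, j, k): coeff} dicts.
        out = {}
        for ea, ca in a.items():
            for eb, cb in b.items():
                e = tuple(u + v for u, v in zip(ea, eb))
                out[e] = out.get(e, 0) + ca * cb
        return out

    def wanted_coeffs(f, K, wanted):
        # Yield the wanted coefficients of f^0, f^1, ..., f^K in turn.
        cur = {(0, 0, 0): 1}
        for i in range(K + 1):
            yield [cur.get(m, 0) for m in wanted]
            if i < K:
                cur = mul(cur, f)

    f = {(1, 3, 2): 1, (1, 1, 1): 2, (0, 0, 1): 1}  # x*y^3*z^2 + 2xyz + z
    for row in wanted_coeffs(f, 3, [(2, 4, 3)]):
        print(row)  # coefficient of x^2*y^4*z^3 in each power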
On May 15, 9:03 pm, William Stein wst...@gmail.com wrote:
1. On what hardware?
This was on 64-bit GNU/Linux (Fedora release 12) running on a dual
processor machine with two Intel Core 2 CPUs (each 2.4GHz, 4MB
cache). I have included the contents of /proc/cpuinfo at the bottom
of this reply.
The times I get with the new code are 28s to K = 70 and 135s to K =
100. This is on an Opteron K102 though, which probably does the
coefficient arithmetic a little faster than the core2. In fact much of
the time in this problem is probably coefficient arithmetic, I would
guess. The coefficients
Maple 14 on iMac Core i5 2.66 GHz 8GB (64-bit):
f := x*y^3*z^2 + x^2*y^2*z + x*y^3*z + x*y^2*z^2 + y^3*z^2 + y^3*z +
2*y^2*z^2 + 2*x*y*z + y^2*z + y*z^2 + y^2 + 2*y*z + z;
curr := 1:
TIMER := time[real]():
for i from 1 to 100 do
curr := expand(curr*f):
lprint(i=time[real]()-TIMER):
end do:
Hmm, actually, on my machine Magma is much slower, and that is the
latest Magma. Though perhaps we don't have the right Magma for our
machine or something.
Bill.
On 16 May, 01:22, Bill Hart goodwillh...@googlemail.com wrote:
The times I get with the new code are 28s to K = 70 and 135s to K = 100.
As a check for my implementation, how many bits does the largest
coefficient have?
Bill.
On 16 May, 01:28, Roman Pearce rpear...@gmail.com wrote:
Maple 14 on iMac Core i5 2.66 GHz 8GB (64-bit):
f := x*y^3*z^2 + x^2*y^2*z + x*y^3*z + x*y^2*z^2 + y^3*z^2 + y^3*z +
2*y^2*z^2 + 2*x*y*z + y^2*z + y*z^2 + y^2 + 2*y*z + z;
I get that f^100 is a polynomial with 3721951 terms. The largest
coefficient belongs to x^44*y^181*z^131 and is
540685566063956356849231312581525435336487979299724512007837438591842230283354998840425635151449237483722428755963200
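That also answers the bit-count question; a quick check in Python:

    # Bit length of the coefficient reported for x^44*y^181*z^131.
    c = int(
        "5406855660639563568492313125815254353364879792997245120078"
        "37438591842230283354998840425635151449237483722428755963200"
    )
    print(c.bit_length())  # 388, matching the "about 388 bits" above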
I have the right number of terms, but not quite the right coefficient,
as of yet. This is a good test to help me dig out the bug. :-)
Thanks.
By the way, is your computation running on more than one core?
Bill.
On 16 May, 02:12, Roman Pearce rpear...@gmail.com wrote:
I get that f^100 is a polynomial with 3721951 terms.
On May 15, 6:21 pm, Bill Hart goodwillh...@googlemail.com wrote:
I have the right number of terms, but not quite the right coefficient,
as of yet. This is a good test to help me dig out the bug. :-)
Do you have a division routine? I divided f^100 by f to check the
result. This is one way I check for errors.
OK, it's working now. I was adding a coefficient where I should have
been setting it.
Times didn't really change though.
Bill.
Thanks all for the very interesting comments and links to publications
and CAS's.
I've implemented the algorithm using flint2's fmpz (multiprecision)
integer type for coefficients and, at this stage, 62-bit integers
for exponents only. (However, it should be trivial to lift this
restriction.)
With a bit of fiddling I can get the Fateman benchmark down to 53.5s
on sage.math (2.66 GHz Core2/penryn) making no assumptions about the
size of the output coefficients. I've checked that at least the output
poly has the right length and coeffs of the right size.
Adjusting for the clock speed,
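For readers who don't know the Fateman benchmark: as I understand the
variant meant here, it takes f = (1 + x + y + z + t)^30 and times the
single product f*(f+1). A scaled-down sketch (Python/SymPy, my choice
for illustration; n = 8 rather than 30, since SymPy is far slower than
flint2):

    import time
    from sympy import symbols, expand

    x, y, z, t = symbols("x y z t")
    n = 8                                # the real benchmark uses 30
    f = expand((1 + x + y + z + t) ** n)
    start = time.time()
    g = expand(f * (f + 1))
    print(len(g.args), "terms in", time.time() - start, "s")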
On the other hand, I am unable to replicate the very sparse benchmark
unless I assume the result will fit in 2 limbs and allocate all the
output mpz's in advance, etc. Then I can basically replicate it. If I
use my generic, no-assumptions code it takes about 3s. I don't think I
can improve that much.
Hi Bill,
On Fri, May 14, 2010 at 3:28 PM, Bill Hart goodwillh...@googlemail.com wrote:
If I make a couple of simplifications, namely assume that the output
fits into two limbs, and that none of the polynomials has length
greater than 2^32 - 1, etc., I get pretty good times, certainly better
than reported
Actually I wasn't allocating them in slabs. I had my threadsafe
version of the flint integer format turned on. The other version
allocates mpz's in slabs, but was broken. So, having now fixed that,
I do get the time down to about 2.1s on sage.math. However, that's
not noticeably faster
Oh, sorry. I did get confused. I didn't see you had SDMP-Core2
written in your benchmark table. I hadn't realised you were quoting
SDMP times.
Bill.
In my own experience, coding with a univariate polynomial is not
efficient, especially if the polynomial is sparse. I'm using heapmult
or Lagrange interpolation in giac for polynomials represented as
sparse distributed polynomials, with monomials coded as integers if
they fit. I also experienced
Hi Bill,
in my own experience Kronecker substitution can be effective in a
number of situations. It would also automatically handle the case you
mention about working only on a subset of variables (i.e., the ones
involved in the multiplication).
I have the description of my implementation and
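To illustrate what Kronecker substitution does, a minimal sketch (plain
Python; hypothetical helper names, not Francesco's code) for two
variables, mapping y -> x^D with D chosen so exponents cannot collide
in the product:

    def kronecker_mul(a, b, D):
        # a, b: {(i, j): coeff} dicts; D must exceed every x-degree
        # occurring in the product.
        def pack(p):
            out = {}
            for (i, j), c in p.items():
                out[i + D * j] = out.get(i + D * j, 0) + c
            return out
        pa, pb = pack(a), pack(b)
        prod = {}
        for ea, ca in pa.items():       # naive univariate multiply
            for eb, cb in pb.items():
                prod[ea + eb] = prod.get(ea + eb, 0) + ca * cb
        # Unpack e = i + D*j back into the exponent pair (i, j).
        return {(e % D, e // D): c for e, c in prod.items() if c}

    f = {(1, 0): 1, (0, 1): 1}          # x + y
    print(kronecker_mul(f, f, 3))       # x^2 + 2xy + y^2

The univariate multiply in the middle is where one would plug in fast
arithmetic; that is the point of the substitution.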
On May 13, 2:45 am, parisse bernard.pari...@ujf-grenoble.fr wrote:
In my own experience, coding with a univariate polynomial is not
efficient, especially if the polynomial is sparse.
There must be some kind of inefficiency. If you use word operations
for all monomial operations then it should be fast.
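A sketch of the "word operations for all monomial operations" idea
(the 20-bits-per-exponent split is an assumption for illustration):
pack the exponents of x, y, z into one integer so that multiplying
monomials is a single addition and comparing them a single comparison.

    BITS = 20                      # per-variable exponent bound: 2^20
    MASK = (1 << BITS) - 1

    def pack(i, j, k):
        return (i << (2 * BITS)) | (j << BITS) | k

    def unpack(m):
        return (m >> (2 * BITS)) & MASK, (m >> BITS) & MASK, m & MASK

    m1 = pack(1, 3, 2)             # x * y^3 * z^2
    m2 = pack(2, 2, 1)             # x^2 * y^2 * z
    print(unpack(m1 + m2))         # (3, 5, 3): the product monomial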
Since this is turning into an all-purpose post, I'm going to crosspost
to sci.math.symbolic. I want to start by saying that the heap method
should be called Johnson's algorithm. See
http://portal.acm.org/citation.cfm?id=1086847
We've made contributions to improve it, but our actual work has
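For reference, a minimal sketch of heap-based (Johnson-style) sparse
multiplication, in plain Python rather than SDMP's C: merge the
partial products a_i * b with a heap so that the product terms come
out in sorted order.

    import heapq

    def heap_mul(a, b):
        # a, b: lists of (exponent, coeff), exponents descending.
        if not a or not b:
            return []
        # One heap entry per term of a, starting at b[0]; exponents
        # are negated so the min-heap pops the largest first.
        heap = [(-(ea + b[0][0]), i, 0) for i, (ea, _) in enumerate(a)]
        heapq.heapify(heap)
        result = []
        while heap:
            e, i, j = heapq.heappop(heap)
            c = a[i][1] * b[j][1]
            if result and result[-1][0] == -e:
                result[-1] = (-e, result[-1][1] + c)  # combine like terms
            else:
                result.append((-e, c))
            if j + 1 < len(b):      # advance this row of the product
                heapq.heappush(heap, (-(a[i][0] + b[j + 1][0]), i, j + 1))
        return [(e, c) for e, c in result if c]

    # (x^3 + 2x + 1) * (x^2 + 1) = x^5 + 3x^3 + x^2 + 2x + 1
    print(heap_mul([(3, 1), (1, 2), (0, 1)], [(2, 1), (0, 1)]))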
1. Almost regardless of f, f^K for sufficiently large K is likely to
be fairly dense.
2. You could encode f more tightly as a univariate polynomial by using
different N, depending on the degree of x, y, z, etc.
3. (Most plausible) You could use a recursive representation for f
(sketched below), which could, I
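To illustrate point 3, a sketch (hypothetical, not anyone's actual
code) of a recursive representation: a polynomial in ZZ[x,y,z] stored
as a polynomial in z whose coefficients are polynomials in y whose
coefficients are polynomials in x, here as nested dicts keyed by
exponent.

    # x*y^3*z^2 + 2*x*y*z + z in the nested form:
    f = {2: {3: {1: 1}},           # z^2 * y^3 * x
         1: {1: {1: 2},            # z   * y   * x * 2
             0: {0: 1}}}           # z   * 1

    def rec_add(p, q, depth):
        if depth == 0:             # depth 0: plain integers
            return p + q
        out = dict(p)
        for e, c in q.items():
            out[e] = rec_add(out[e], c, depth - 1) if e in out else c
        return out

    def rec_mul(p, q, depth):
        if depth == 0:
            return p * q
        out = {}
        for ep, cp in p.items():
            for eq, cq in q.items():
                term = rec_mul(cp, cq, depth - 1)
                e = ep + eq
                out[e] = rec_add(out[e], term, depth - 1) if e in out else term
        return out

    print(rec_mul(f, f, 3))        # f^2 in the nested representation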