#7013: [with patch, needs work] prime_pi and nth_prime
----------------------------------------+-----------------------------------
   Reporter:  kevin.stueve              |       Owner:  kevin.stueve            
       Type:  enhancement               |      Status:  needs_work              
   Priority:  major                     |   Milestone:  sage-4.3.2              
  Component:  number theory             |    Keywords:  primes, sieve, table,LMO
     Author:  Kevin Stueve              |    Upstream:  N/A                     
   Reviewer:  was,robertwb,GeorgSWeber  |      Merged:                          
Work_issues:                            |  
----------------------------------------+-----------------------------------

Comment(by kevin.stueve):

 {{{
 First some timings:
 from math import log

 timeit('Li(10**15)')
 125 loops, best of 3: 7.34 ms per loop

 #R from http://wstein.org/rh/rh/code/code.sage
 timeit('R(10**15)') #with prec 100
 625 loops, best of 3: 218 µs per loop

 timeit('li_approx(10**15)')
 625 loops, best of 3: 200 µs per loop

 timeit('log(10**15)')
 625 loops, best of 3: 10.7 µs per loop
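
 (For reference, a minimal sketch of one way to evaluate R(x), via the
 Gram series R(x) = 1 + sum_{k>=1} (log x)^k / (k * k! * zeta(k+1)), using
 mpmath, which ships with Sage.  The precision and number of terms are
 illustrative; the R timed above is the one from wstein.org's code.sage,
 and riemann_R is just a placeholder name.)

 # Sketch only: Gram-series evaluation of Riemann's R function.
 from mpmath import mp, mpf, log, zeta, factorial

 def riemann_R(x, terms=100):
     mp.prec = 100                  # ~100 bits, as in the timing above
     lx = log(mpf(x))
     total = mpf(1)
     for k in range(1, terms + 1):
         total += lx**k / (k * factorial(k) * zeta(k + 1))
     return total

 # e.g. riemann_R(10**15) should agree closely with pi(10**15)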

 I have some email correspondence to relay below.  Thank you so much
 to everyone who contributed below.  Every suggestion and idea was
 incredibly valuable!

 Sincerely,
 Kevin Stueve

 "The equation 13
 [http://mathworld.wolfram.com/RiemannPrimeCountingFunction.html]
 you mentioned is PROBABLY wrong: check Andrey V. Kulsha'
 post of 11/18/2008 entitled "On the explicit formula for the Prime-
 counting
 function pi(x)" on the [email protected] list. To me, it seems
 far better to compress the pi(x) data using simply pi(x)=li(x)-e(x).
 Instead
 of storing pi(x) you would store the (positive) value of e(x) rounded to
 the
 nearest integer. Note that li(x) can be computed easily and that e(x)
 should
 be of the order of sqrt(x). Replacing li(x) by R(x) would not help much,
 because the error term could be either positive or negative (one more
 bit).
 Using a few zeros of the zeta function could reduce the error term, but my
 experience is that it would take much more time to compute the
 approximation
 (it would be necessary to evaluate $li(x^rho)$ accurately, and also
 pi(sqrt(x)), pi(cbrt(x)), etc.)." TOS (Tomás Oliveira e Silva,
 http://www.ieeta.pt/~tos/index.html)
 (My experiments lead me to believe that storing R (sometimes called Ri)
 offsets will save at least 6% (with .tar.bz2) in table size over li
 offsets (with .tar.bz2).)

 "lzma gives sometimes significantly better compression than bzip" TOS
 Unfortunately, lzma is not part of Sage.

 "I see a potential problem with the approach described above [storing
 li(x) offsets]. What if li(x)
 evaluates to INTEGER.4999999999 or to INTEGER.500000001? Can it be assured
 that the same evaluation on a different architecture/compiler/operating
 system
 gives exactly the same result? This problem would be amplified in R(x) and
 some zeros of the zeta function are used. Assuming that the error of
 evaluating
 li(x) is smaller than, say, 0.1, the above problems can be solved by
 storing
 the rounded value of e(x) AND storing also the last two bits of pi(x),
 i.e.,
 storing  4 * round(e(x)) + (pi(x) mod 4)."  TOS
 (One of the li (approximation) implementations we are considering is a
 fast asymptotic series, which
 is truncated when the terms stop decreasing.  I am now fearful that a
 slight machine
 difference could cause an extra term or one less term to be included
 causing the value to unacceptably change.)

 "Knowing the last bit of pi(x) is enough if the floating point error
   is smaller than 0.5 (round to the nearest even or to the nearest odd
   integer, respectively if pi(x) mod 2 = 0 or pi(x) mod 2 = 1)" TOS
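
 (To make these two suggestions concrete, here is a minimal sketch of
 storing round(e(x)) together with the last bit of pi(x), and of the
 parity-based rounding on decode.  It assumes the li approximation used is
 within 0.5 of the true value; encode, decode and the li_approx argument
 are placeholder names, not the actual table code.)

 from math import floor

 def encode(x, pi_x, li_approx):
     e = round(li_approx(x) - pi_x)           # positive, roughly O(sqrt(x))
     return e, pi_x % 2                       # stored per table entry

 def decode(x, e, parity, li_approx):
     guess = li_approx(x) - e                 # within 1 of the true pi(x)
     n = floor(guess)
     return n if n % 2 == parity else n + 1   # snap to the stored parity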

 "If many pi(x) values are present in the database, why not split it in
   reasonably sized chunks and then encode the data in a differential form?
   I.e., store x1 and pi(x1), then store x2-x1 and pi(x2)-pi(x1), etc.
   Note that pi(x2)-pi(x1) would be close to (x2-x1)/log(x2), so the
 difference
   between the two should be stored instead (together with the last bit of
   pi(x2)). This may perhaps give smaller correction terms than using
   R(x) or more sofisticated approximations. The price to pay would be that
   the database entries would depend on all previous one of the same chunk.
   The advantages: with suitable modifications it could also be used to
   compress pi2(x) data, and the required computations are very simple."
 TOS
 (This [differential coding in reasonably sized chunks] is a much better
 idea than the tree structure I was suggesting.  I believe AB's index idea
 is equivalent to having chunks.  However, we don't need to store the
 values of x if the spacing is preset.  I tried storing the offsets
 between pi(x2)-pi(x1) and (x2-x1)/log(x2).  With tar.bz2 compression, the
 table space was less than 0.55% larger than storing offsets between
 pi(x2)-pi(x1) and R(x2)-R(x1)!  Within small intervals, the PNT (prime
 number theorem) estimate is nearly identical to R (what great insight
 from TOS!).  The PNT estimate can be calculated 18 times faster than
 li_approx or R, and 685 times faster than Sage's Li.  If any sort of
 analytic approximation is going to be used for Sage's prime_pi tables, it
 should be the method described above, along with the parity bit of
 pi(x2)-pi(x1) to guard against potential machine-dependent rounding
 issues.  I tried taking diffs of the above described offsets (i.e. one
 more order of differences than TOS suggested), and under tar.bz2 the
 table space increased.  That the method TOS described could easily be
 extended to twin primes is an extra bonus.)
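
 (Here is a minimal sketch of that scheme: equally spaced table points,
 per-entry offsets against the PNT estimate, and a parity bit per entry.
 The names, the step handling and the assumption that pi_values[k] holds
 pi(x0 + k*step) are illustrative only.)

 from math import log, floor

 def encode_chunk(pi_values, x0, step):
     # pi_values[k] is assumed to be pi(x0 + k*step)
     entries = []
     for k in range(1, len(pi_values)):
         d_pi = pi_values[k] - pi_values[k - 1]
         pnt = step / log(x0 + k * step)        # PNT estimate of d_pi
         entries.append((round(d_pi - pnt), d_pi % 2))
     return entries

 def decode_chunk(entries, pi_x0, x0, step):
     pi_values = [pi_x0]
     for k, (offset, parity) in enumerate(entries, start=1):
         guess = offset + step / log(x0 + k * step)   # close to d_pi
         n = floor(guess)
         d_pi = n if n % 2 == parity else n + 1       # snap to stored parity
         pi_values.append(pi_values[-1] + d_pi)
     return pi_values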

 "One of these days I will dump my entire pi(x) and pi2(x) database in a
   ASCII format and then I will let you know where you can fetch it. It has
   many more entries than those available on my web site." TOS
 I can't wait!

 "As it happens, I have been thinking for a while about updating the nth
 prime page to reflect the capabilities of modern hardware, but have
 been too lazy to do the computations myself.  I'll describe what I
 have in mind; let me know if you or others in the group might be
 interested in pursuing it." AB (Andrew Booker,
 http://www.maths.bris.ac.uk/people/faculty/maarb/)

 "In an ideal world, the spacing of entries around a number x in the
 precomputed table should be proportional to sqrt(x), since that is the
 minimum running time of the sieve of Eratosthenes.  Thus, for a table
 that works up to 10^18, one would want table entries spaced every 10^9
 at least.  In fact the obvious choice would be to store a table of
 pi(t_n), where t_n is the nth triangular number n(n+1)/2.  That would
 give an algorithm that runs in best possible time when using the sieve
 of Eratosthenes.  (To be clear on this point, there are faster
 algorithms than the standard sieve for primality testing a short
 interval; e.g., Galway's algorithm runs in time O(x^{1/3}), and for
 very short intervals one would prefer to use a standalone primality
 test on each number in the interval.  However, I think that the above
 strategy coupled with a heavily optimized standard sieve would be the
 ideal tradeoff of running time and table storage in practice.)  This
 would mean storing on the order of 1.4*10^9 numbers, which is nothing
 for a modern hard drive (though obviously too much to distribute with
 sage; I'm thinking more of the nth prime page where it would be stored
 centrally on a server)." AB
 (I like the idea of storing pi at triangular, square, and oblong numbers
 so that the computation time increases smoothly and so that variants of
 Landau's third problem (Legendre's conjecture) can be easily studied.
 However, I think that storing pi at multiples of powers of ten is the
 most practical option, because a lot of analysis will only need equally
 spaced values of the prime counting function.  I would like to see both
 types of tables.  Larger tables can be used at the nth prime page and in
 the online version of Sage, as well as possibly being shipped on optical
 media.)
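
 (A minimal sketch of the lookup AB describes: find the largest triangular
 number t_n = n(n+1)/2 not exceeding x in a precomputed table of pi(t_n),
 then sieve the short interval (t_n, x].  The table argument is
 hypothetical, and the sieve below is a plain, unoptimized segmented sieve
 of Eratosthenes, purely for illustration.)

 from math import isqrt

 def primes_up_to(n):
     if n < 2:
         return []
     sieve = [True] * (n + 1)
     sieve[0] = sieve[1] = False
     for p in range(2, isqrt(n) + 1):
         if sieve[p]:
             for m in range(p * p, n + 1, p):
                 sieve[m] = False
     return [p for p in range(2, n + 1) if sieve[p]]

 def count_primes_between(lo, hi):
     # number of primes p with lo < p <= hi
     if hi <= lo:
         return 0
     flags = [True] * (hi - lo)              # flags[i] <-> the integer lo+1+i
     for i in range(max(0, 2 - (lo + 1))):   # 0 and 1 are not prime
         flags[i] = False
     for p in primes_up_to(isqrt(hi)):
         start = max(p * p, (lo // p + 1) * p)
         for m in range(start, hi + 1, p):
             flags[m - lo - 1] = False
     return sum(flags)

 def prime_pi_via_table(x, table):
     # table[n] is assumed to hold pi(n*(n+1)//2)
     n = (isqrt(8 * x + 1) - 1) // 2          # largest n with n(n+1)/2 <= x
     t_n = n * (n + 1) // 2
     return table[n] + count_primes_between(t_n, x)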

 "This has other advantages as well.  For instance, it obviates the need
 to compute Li(x) for compression; with a regularly spaced table like
 pi(t_n), one gets the same compression simply by storing differences
 between consecutive values, plus an index at the beginning for faster
 searching.  (Even better is to store second differences.  In fact
 that's what I did for the original nth prime page; there the bulk of
 the table consists of 2-byte integers, even though the numbers go up
 to 10^12.)" AB
 (Yes, when the difference between the number of primes in one block and
 the next is stored instead of the number of primes in a block, the space
 used is less than 4% more than with TOS's PNT offset idea (taking a
 difference of higher order hurts compression).  However, my analysis did
 not take into account the cost of storing the parity bits in TOS's PNT
 offset idea.  An extra bit for a million table entries should add 125,000
 bytes of virtually incompressible data, making AB's second-order
 difference idea preferable, at least as far as my vote goes.  It is
 amazing that out of all the possible algorithms to calculate pi(x), the
 fastest is to use an exponentially large table of values calculated with
 brute-force sieving.  Even more amazing is that out of all the analytic
 approximations and formulas, the best way to compress that table of
 values of pi(x) seems to be simply storing differences.  There are
 probably ways to store parity bits only when an offset is very near a
 rounding point, but that would involve significant coding effort, and the
 risk of a coding error, or of a misunderstanding of how floating-point
 arithmetic works, causing incorrect values would be absolutely
 unacceptable in my opinion.  The first priority of this project is to not
 return false values.  Everything else (running times, table space, input
 ranges) is negotiable.)
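
 (AB's second-difference idea, sketched minimally: keep the first value,
 the first difference, and then only the second differences, which stay
 small for a regularly spaced table.  Everything is exact integer
 arithmetic, so no parity bits are needed; the names are illustrative.)

 def encode_second_diffs(pi_values):
     # assumes at least two table entries
     first = [pi_values[k] - pi_values[k - 1] for k in range(1, len(pi_values))]
     second = [first[k] - first[k - 1] for k in range(1, len(first))]
     return pi_values[0], first[0], second

 def decode_second_diffs(head, first_diff, second_diffs):
     pi_values = [head, head + first_diff]
     diff = first_diff
     for d2 in second_diffs:
         diff += d2
         pi_values.append(pi_values[-1] + diff)
     return pi_values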

 "Anyway, I propose a project to compute pi(t_n) for all t_n up to
 10^18.  Unfortunately that means sieving up to 10^18, which is not
 easy.  But it's certainly possible, since it has been done already by
 TOS et al.  In fact it could probably be done in less than a year on a
 cluster of reasonable size.  You could then take whatever fraction of
 the values you wanted for distribution with sage (and again I think
 that storing second differences would give compression nearly as good
 as the Li(x) trick, but with only integer arithmetic).  What do you
 think?" AB
 I can't wait for tables to be available for that interval and much larger
 ones.  A 100-million-core exaflop computer may be available by 2018!

 Following is an email exchange between Victor Miller and Hugh Montgomery
 regarding the distribution of the remainder pi(x) - Li(x) that was
 forwarded to me.

 Victor Miller wrote:

     Hugh, do you know of results about the distribution of the remainder
 pi(x) - Li(x) (assuming RH)?  In particular, if one wanted to store
 remainders like this for various random values of x, and one knew the
 distribution (which I would expect is certainly not uniform), then one
 could use various data compression techniques that depend on knowing the
 distribution to save at least a constant factor.

     Victor

     PS. One could of course try to throw in a few terms corresponding to
 the first few zeros of zeta(s) but I would think that, in practical terms,
 you might not gain very much.

 Dear Victor:

    Unconditionally one cannot say much about this error term, because we
 don't even know its order of magnitude.  Assume RH.  Then pi - li does
 not have a limiting distribution because its general size is growing.
 So consider

                                       pi(x) - Li(x)
                                      ----------------- .
                                      x^{1/2} / log x

 This will generally lie between -1.25 and -0.75, but even this does not
 have a limiting distribution.  The point is that its wobbles become
 slower and slower, so that for any given order of magnitude, the quantity
 is almost constant.  What is needed is an exponential change of variable.
 Put

                              f(y) = (psi(e^y) - e^y) / e^{y/2}.

 Aurel Wintner showed that (on RH) this has a limiting distribution.  If
 one wanted to use data to get an idea of this distribution, it would be
 difficult to get very many independent data points, due to the
 exponential increase of the argument.  If one assumes that the ordinates
 gamma > 0 of the zeros of zeta(s) are linearly independent over the
 rationals, then the terms e^{i*gamma*y} act like independent random
 variables, and one can develop approximations to the distribution
 function of f.  The distribution function is even, its density is
 everywhere positive, and it tends to zero rapidly (almost
 doubly-exponentially) at infinity.  For each gamma > 0, let X_gamma be a
 random variable uniformly distributed on [0,1], and suppose that the
 X_gamma are independent.  Then the limiting distribution of f (assuming
 RH & the linear independence of the gamma) is the same as for the random
 variable

                      2 sum_{gamma > 0} (sin 2*Pi*X_gamma) / |rho| .

 Here rho = 1/2 + i*gamma.  One could sample this variable at random
 points to develop an approximation to the distribution function of f.

                                                       Cheers,
                                                                    Hugh
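
 (A minimal sketch of sampling Hugh's random model numerically with
 mpmath: zetazero(k) gives the kth nontrivial zero 1/2 + i*gamma_k, and
 each sample draws independent uniform X_gamma for the first num_zeros
 ordinates.  The truncation to finitely many zeros and the parameter
 choices are illustrative only.)

 import random
 from mpmath import mp, zetazero, sqrt, sin, pi as mp_pi

 def sample_f(num_zeros=100, samples=1000):
     mp.dps = 20
     gammas = [zetazero(k).imag for k in range(1, num_zeros + 1)]
     weights = [1 / sqrt(g * g + 0.25) for g in gammas]    # 1/|rho|
     values = []
     for _ in range(samples):
         s = sum(w * sin(2 * mp_pi * random.random()) for w in weights)
         values.append(2 * s)
     return values          # empirical approximation to the distribution of f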
 }}}

-- 
Ticket URL: <http://trac.sagemath.org/sage_trac/ticket/7013#comment:40>
Sage <http://www.sagemath.org>
Sage: Creating a Viable Open Source Alternative to Magma, Maple, Mathematica, 
and MATLAB
