Re: [Chicken-users] big prime number

2016-01-27 Thread Peter Bex
On Mon, Jan 25, 2016 at 03:00:46PM +0100, Peter Bex wrote:
> Unfortunately, I just found out that something goes wrong when compiling
> a program containing bignum literals: The number data can get truncated,
> causing it to go from 22M to 2.1M which of course can be printed much
> quicker.  So we cheered too soon.
> 
> In this program, the compiler actually does most of the work because the
> value can be statically computed.  Running it in the interpreter gives
> a much longer run time, like with the numbers egg under CHICKEN 4 :(
> 
> I'll have to figure out why it's so slow, but also why CHICKEN 5 is
> messing up.  It's a great test case though! :)

OK, here's an update with my findings, for those still interested.

The reason CHICKEN 5 stops with a small output is that the compiler
actually constant-folds the result of the (- (expt ...) 1) expression,
resulting in a HUGE bignum literal which will be compiled into the C
source.  This literal is encoded as a hex string, preceded by its size,
encoded in 24 bits.  Unfortunately, the hex string's size alone takes
up 27 bits!

The compiler should bail out when encountering a literal that's too large
to encode.  Currently it just masks out the higher bits, effectively
silently truncating the value.  This is easy to fix.  I think it's
probably a good idea to also drop folded constants when they are too
large to encode, opting to evaluate the value at runtime, instead.
We could also choose a more efficient "packed" binary representation of
bignum literals, but that'll need even more special code that's only
needed for these extreme edge cases, and results in harder to debug code
in the normal case.  Hex is a nice hackable internal representation for
literals in C code.

BTW, the reason we're not constant folding the result string (which we
could easily do) is that number->string needs to yield a fresh string
every time according to the spec.  This _could_ perhaps be special-cased
by tagging freshly allocated folded constant, so we know at runtime that
we must copy the constant every time it's used.  If we did that, compiling
this source file would take about 5 minutes, and running it would take
less than a second or so! :)


Having said all that, I'll explain the real reason the numbers
egg/CHICKEN 5 is so slow.  When converting a number to a decimal
string, we use the "remainder tree" algorithm.  This simply calculates
10^N, with N about half the number's size in digits.  Then the number
is divided by 10^N, resulting in two halves (quotient and remainder).
These two halves are then individually converted to string recursively,
divide-and-conquer style.

There is only one know faster algorithm which is "scaled remainder tree",
but I don't grok this.  If anyone does and wants to contribute some code
to make numbers use this algorithm, please be my guest!  Anyway, even
with the current algorithm it should be much faster than it is.
The reason it's relatively slow is because the division (and the
calculation of 10^N itself) uses multiplication a lot.  And we currently
only have the Karatsuba multiplication algorithm, which is suboptimal on
extremely large numbers.  In such cases, an FFT-based multiplication
would perform better.  GMP (thus, Racket, Guile and Haskell(?)) uses FFT
at these sizes, and so does Gambit.

Unfortunately, I don't grok the FFT itself, let alone multiplication
based on it (using floating-point numbers, which sounds like madness!).
The code we'd have to add for this is probably pretty large as well at
relatively low benefit for most practical use cases.  All of this means
we probably won't get support for FFT multiplication any time soon
unless a kind soul is going to build it in a small bit of code.

Having said all that, I'm not sure how useful it is to even convert
such a huge number to decimal; it's a good test of bignum implementation
quality, I'll grant you that, but it doesn't really warrant more work on
the numbers egg for now, IMO.  By the way, to make things look a little
less bleak try the following things, just for fun:

1) Converting to hex instead of decimal, and comparing those timings with
   the other Schemes.
2) Comparing CHICKEN's performance on your program with Gauche,
   MIT Scheme, Scheme48, Chibi Scheme, Ruby and Python.  We're faster
   than most things not based on GMP (except, of course(!) Gambit).

So, wrapping up: CHICKEN is doing pretty well at medium-to-large bignums,
but if you want to deal with truly astronomical numbers efficiently,
you'll probably have to go to a different implementation, or you can
wait/help out until we can compete at that level.

Cheers,
Peter


signature.asc
Description: Digital signature
___
Chicken-users mailing list
Chicken-users@nongnu.org
https://lists.nongnu.org/mailman/listinfo/chicken-users


Re: [Chicken-users] big prime number

2016-01-25 Thread Peter Bex
On Mon, Jan 25, 2016 at 09:11:40AM +0100, Kristian Lein-Mathisen wrote:
> Yes, indeed! CHICKEN 5 is exciting :) Thanks again Peter, for your ongoing
> efforts in pushing this forward!

Unfortunately, I just found out that something goes wrong when compiling
a program containing bignum literals: The number data can get truncated,
causing it to go from 22M to 2.1M which of course can be printed much
quicker.  So we cheered too soon.

In this program, the compiler actually does most of the work because the
value can be statically computed.  Running it in the interpreter gives
a much longer run time, like with the numbers egg under CHICKEN 4 :(

I'll have to figure out why it's so slow, but also why CHICKEN 5 is
messing up.  It's a great test case though! :)

Cheers,
Peter


signature.asc
Description: Digital signature
___
Chicken-users mailing list
Chicken-users@nongnu.org
https://lists.nongnu.org/mailman/listinfo/chicken-users


Re: [Chicken-users] big prime number

2016-01-25 Thread Kristian Lein-Mathisen
Yes, indeed! CHICKEN 5 is exciting :) Thanks again Peter, for your ongoing
efforts in pushing this forward!

K.

On Mon, Jan 25, 2016 at 1:49 AM, Dan Leslie  wrote:

>
> Peter Bex  writes:
>
> > Now, the good news is that I also ran the program under CHICKEN 5 and
> > it took just under 17 seconds to complete.  Most likely this is because
> > the whole thing can be done completely inline, without any CPS calls,
> > which means a lot less allocation, which in turn means a lot less
> > garbage collections need to be performed.  So again many thanks to Felix
> > for pushing me to make all operators have inlineable C functions!
>
> I am very much looking forward to Chicken 5.
>
> :D
>
> -Dan
>
> ___
> Chicken-users mailing list
> Chicken-users@nongnu.org
> https://lists.nongnu.org/mailman/listinfo/chicken-users
>
>
___
Chicken-users mailing list
Chicken-users@nongnu.org
https://lists.nongnu.org/mailman/listinfo/chicken-users


Re: [Chicken-users] big prime number

2016-01-24 Thread Dan Leslie

Peter Bex  writes:

> Now, the good news is that I also ran the program under CHICKEN 5 and
> it took just under 17 seconds to complete.  Most likely this is because
> the whole thing can be done completely inline, without any CPS calls,
> which means a lot less allocation, which in turn means a lot less
> garbage collections need to be performed.  So again many thanks to Felix
> for pushing me to make all operators have inlineable C functions!

I am very much looking forward to Chicken 5.

:D

-Dan


signature.asc
Description: PGP signature
___
Chicken-users mailing list
Chicken-users@nongnu.org
https://lists.nongnu.org/mailman/listinfo/chicken-users


Re: [Chicken-users] big prime number

2016-01-24 Thread Peter Bex
On Sun, Jan 24, 2016 at 03:25:04PM -0500, Claude Marinier wrote:
> Bonjour,

Hello Claude,

> Here is the program (minus timing code).
> 
> (use numbers)
> (define big-prime (- (expt 2 74207281) 1))
> (define big-print-str (number->string big-prime))
> (define outport (open-output-file "big-prime.txt"))
> (display big-prime outport) (newline outport)
> (close-output-port outport)
> 
> A colleague claims a Haskell program calculating the prime and writing it
> to a file ran in 17 seconds. That's 89 times faster.
> 
> Am I missing something? Is 4.10.1 faster?

Ouch, that's pretty bad indeed.  I ran the program on my Lenovo X230,
and it took 619 seconds, which is consistent with your findings.  I took
a look at the profile (yay for the new profiler!) and it looks like it's
spending a WHOLE lot of time in Karatsuba multiplication, which is
probably due to the "integer-power" call in integer->string.  I'll have
to look into this deeper, to see whether this can be improved at all.

Now, the good news is that I also ran the program under CHICKEN 5 and
it took just under 17 seconds to complete.  Most likely this is because
the whole thing can be done completely inline, without any CPS calls,
which means a lot less allocation, which in turn means a lot less
garbage collections need to be performed.  So again many thanks to Felix
for pushing me to make all operators have inlineable C functions!

Cheers,
Peter


signature.asc
Description: Digital signature
___
Chicken-users mailing list
Chicken-users@nongnu.org
https://lists.nongnu.org/mailman/listinfo/chicken-users


[Chicken-users] big prime number

2016-01-24 Thread Claude Marinier
Bonjour,

Spurred by the recent excitement about a new largest prime number (1), I
wanted to see how long it would take to just write the number. So I wrote a
simple program and added timing to produce the following.

claude@hibou:~/Programming/scheme$ ./big-prime
time to compute big prime : 1.074 seconds
time to convert it to a string : 778.617 seconds
time to write it to a file : 710.805 seconds

I compiled the program on Debian 8 64-bit with Chicken 8.10.0 and Numbers
4.3 and "-O3". The CPU is an old dual-core Pentium E2200 @ 2.20GHz.

Version 4.10.0 (rev b259631)
linux-unix-clang-x86-64 [ 64bit manyargs dload ptables ]
compiled 2015-08-04 on yves.more-magic.net (Linux)

Here is the program (minus timing code).

(use numbers)
(define big-prime (- (expt 2 74207281) 1))
(define big-print-str (number->string big-prime))
(define outport (open-output-file "big-prime.txt"))
(display big-prime outport) (newline outport)
(close-output-port outport)

A colleague claims a Haskell program calculating the prime and writing it
to a file ran in 17 seconds. That's 89 times faster.

Am I missing something? Is 4.10.1 faster?

Back in June, there was talk of Numbers performance. Where do we stand now?
Should I try the code in GIT?

Thank you.

P.S. He may have written the number in binary. That would explain a lot.
Still, am I missing something? Hum ... I should probably use cpu-time
instead of current-milliseconds but (oddly) I do not know have to deal with
the two returned values. :-(

(1) http://www.mersenne.org/primes/?press=M74207281

-- 
Claude Marinier
___
Chicken-users mailing list
Chicken-users@nongnu.org
https://lists.nongnu.org/mailman/listinfo/chicken-users