Re: [Chicken-users] big prime number
On Mon, Jan 25, 2016 at 03:00:46PM +0100, Peter Bex wrote: > Unfortunately, I just found out that something goes wrong when compiling > a program containing bignum literals: The number data can get truncated, > causing it to go from 22M to 2.1M which of course can be printed much > quicker. So we cheered too soon. > > In this program, the compiler actually does most of the work because the > value can be statically computed. Running it in the interpreter gives > a much longer run time, like with the numbers egg under CHICKEN 4 :( > > I'll have to figure out why it's so slow, but also why CHICKEN 5 is > messing up. It's a great test case though! :) OK, here's an update with my findings, for those still interested. The reason CHICKEN 5 stops with a small output is that the compiler actually constant-folds the result of the (- (expt ...) 1) expression, resulting in a HUGE bignum literal which will be compiled into the C source. This literal is encoded as a hex string, preceded by its size, encoded in 24 bits. Unfortunately, the hex string's size alone takes up 27 bits! The compiler should bail out when encountering a literal that's too large to encode. Currently it just masks out the higher bits, effectively silently truncating the value. This is easy to fix. I think it's probably a good idea to also drop folded constants when they are too large to encode, opting to evaluate the value at runtime, instead. We could also choose a more efficient "packed" binary representation of bignum literals, but that'll need even more special code that's only needed for these extreme edge cases, and results in harder to debug code in the normal case. Hex is a nice hackable internal representation for literals in C code. BTW, the reason we're not constant folding the result string (which we could easily do) is that number->string needs to yield a fresh string every time according to the spec. This _could_ perhaps be special-cased by tagging freshly allocated folded constant, so we know at runtime that we must copy the constant every time it's used. If we did that, compiling this source file would take about 5 minutes, and running it would take less than a second or so! :) Having said all that, I'll explain the real reason the numbers egg/CHICKEN 5 is so slow. When converting a number to a decimal string, we use the "remainder tree" algorithm. This simply calculates 10^N, with N about half the number's size in digits. Then the number is divided by 10^N, resulting in two halves (quotient and remainder). These two halves are then individually converted to string recursively, divide-and-conquer style. There is only one know faster algorithm which is "scaled remainder tree", but I don't grok this. If anyone does and wants to contribute some code to make numbers use this algorithm, please be my guest! Anyway, even with the current algorithm it should be much faster than it is. The reason it's relatively slow is because the division (and the calculation of 10^N itself) uses multiplication a lot. And we currently only have the Karatsuba multiplication algorithm, which is suboptimal on extremely large numbers. In such cases, an FFT-based multiplication would perform better. GMP (thus, Racket, Guile and Haskell(?)) uses FFT at these sizes, and so does Gambit. Unfortunately, I don't grok the FFT itself, let alone multiplication based on it (using floating-point numbers, which sounds like madness!). The code we'd have to add for this is probably pretty large as well at relatively low benefit for most practical use cases. All of this means we probably won't get support for FFT multiplication any time soon unless a kind soul is going to build it in a small bit of code. Having said all that, I'm not sure how useful it is to even convert such a huge number to decimal; it's a good test of bignum implementation quality, I'll grant you that, but it doesn't really warrant more work on the numbers egg for now, IMO. By the way, to make things look a little less bleak try the following things, just for fun: 1) Converting to hex instead of decimal, and comparing those timings with the other Schemes. 2) Comparing CHICKEN's performance on your program with Gauche, MIT Scheme, Scheme48, Chibi Scheme, Ruby and Python. We're faster than most things not based on GMP (except, of course(!) Gambit). So, wrapping up: CHICKEN is doing pretty well at medium-to-large bignums, but if you want to deal with truly astronomical numbers efficiently, you'll probably have to go to a different implementation, or you can wait/help out until we can compete at that level. Cheers, Peter signature.asc Description: Digital signature ___ Chicken-users mailing list Chicken-users@nongnu.org https://lists.nongnu.org/mailman/listinfo/chicken-users
Re: [Chicken-users] big prime number
On Mon, Jan 25, 2016 at 09:11:40AM +0100, Kristian Lein-Mathisen wrote: > Yes, indeed! CHICKEN 5 is exciting :) Thanks again Peter, for your ongoing > efforts in pushing this forward! Unfortunately, I just found out that something goes wrong when compiling a program containing bignum literals: The number data can get truncated, causing it to go from 22M to 2.1M which of course can be printed much quicker. So we cheered too soon. In this program, the compiler actually does most of the work because the value can be statically computed. Running it in the interpreter gives a much longer run time, like with the numbers egg under CHICKEN 4 :( I'll have to figure out why it's so slow, but also why CHICKEN 5 is messing up. It's a great test case though! :) Cheers, Peter signature.asc Description: Digital signature ___ Chicken-users mailing list Chicken-users@nongnu.org https://lists.nongnu.org/mailman/listinfo/chicken-users
Re: [Chicken-users] big prime number
Yes, indeed! CHICKEN 5 is exciting :) Thanks again Peter, for your ongoing efforts in pushing this forward! K. On Mon, Jan 25, 2016 at 1:49 AM, Dan Lesliewrote: > > Peter Bex writes: > > > Now, the good news is that I also ran the program under CHICKEN 5 and > > it took just under 17 seconds to complete. Most likely this is because > > the whole thing can be done completely inline, without any CPS calls, > > which means a lot less allocation, which in turn means a lot less > > garbage collections need to be performed. So again many thanks to Felix > > for pushing me to make all operators have inlineable C functions! > > I am very much looking forward to Chicken 5. > > :D > > -Dan > > ___ > Chicken-users mailing list > Chicken-users@nongnu.org > https://lists.nongnu.org/mailman/listinfo/chicken-users > > ___ Chicken-users mailing list Chicken-users@nongnu.org https://lists.nongnu.org/mailman/listinfo/chicken-users
Re: [Chicken-users] big prime number
Peter Bexwrites: > Now, the good news is that I also ran the program under CHICKEN 5 and > it took just under 17 seconds to complete. Most likely this is because > the whole thing can be done completely inline, without any CPS calls, > which means a lot less allocation, which in turn means a lot less > garbage collections need to be performed. So again many thanks to Felix > for pushing me to make all operators have inlineable C functions! I am very much looking forward to Chicken 5. :D -Dan signature.asc Description: PGP signature ___ Chicken-users mailing list Chicken-users@nongnu.org https://lists.nongnu.org/mailman/listinfo/chicken-users
Re: [Chicken-users] big prime number
On Sun, Jan 24, 2016 at 03:25:04PM -0500, Claude Marinier wrote: > Bonjour, Hello Claude, > Here is the program (minus timing code). > > (use numbers) > (define big-prime (- (expt 2 74207281) 1)) > (define big-print-str (number->string big-prime)) > (define outport (open-output-file "big-prime.txt")) > (display big-prime outport) (newline outport) > (close-output-port outport) > > A colleague claims a Haskell program calculating the prime and writing it > to a file ran in 17 seconds. That's 89 times faster. > > Am I missing something? Is 4.10.1 faster? Ouch, that's pretty bad indeed. I ran the program on my Lenovo X230, and it took 619 seconds, which is consistent with your findings. I took a look at the profile (yay for the new profiler!) and it looks like it's spending a WHOLE lot of time in Karatsuba multiplication, which is probably due to the "integer-power" call in integer->string. I'll have to look into this deeper, to see whether this can be improved at all. Now, the good news is that I also ran the program under CHICKEN 5 and it took just under 17 seconds to complete. Most likely this is because the whole thing can be done completely inline, without any CPS calls, which means a lot less allocation, which in turn means a lot less garbage collections need to be performed. So again many thanks to Felix for pushing me to make all operators have inlineable C functions! Cheers, Peter signature.asc Description: Digital signature ___ Chicken-users mailing list Chicken-users@nongnu.org https://lists.nongnu.org/mailman/listinfo/chicken-users
[Chicken-users] big prime number
Bonjour, Spurred by the recent excitement about a new largest prime number (1), I wanted to see how long it would take to just write the number. So I wrote a simple program and added timing to produce the following. claude@hibou:~/Programming/scheme$ ./big-prime time to compute big prime : 1.074 seconds time to convert it to a string : 778.617 seconds time to write it to a file : 710.805 seconds I compiled the program on Debian 8 64-bit with Chicken 8.10.0 and Numbers 4.3 and "-O3". The CPU is an old dual-core Pentium E2200 @ 2.20GHz. Version 4.10.0 (rev b259631) linux-unix-clang-x86-64 [ 64bit manyargs dload ptables ] compiled 2015-08-04 on yves.more-magic.net (Linux) Here is the program (minus timing code). (use numbers) (define big-prime (- (expt 2 74207281) 1)) (define big-print-str (number->string big-prime)) (define outport (open-output-file "big-prime.txt")) (display big-prime outport) (newline outport) (close-output-port outport) A colleague claims a Haskell program calculating the prime and writing it to a file ran in 17 seconds. That's 89 times faster. Am I missing something? Is 4.10.1 faster? Back in June, there was talk of Numbers performance. Where do we stand now? Should I try the code in GIT? Thank you. P.S. He may have written the number in binary. That would explain a lot. Still, am I missing something? Hum ... I should probably use cpu-time instead of current-milliseconds but (oddly) I do not know have to deal with the two returned values. :-( (1) http://www.mersenne.org/primes/?press=M74207281 -- Claude Marinier ___ Chicken-users mailing list Chicken-users@nongnu.org https://lists.nongnu.org/mailman/listinfo/chicken-users