Hi,

There are actually three MapReduce example programs for computing pi:

  pi      - uses a quasi-Monte Carlo (qMC) method, a powerful method that can evaluate arbitrary integrals but is not particularly good at computing pi.
  bbp     - uses a BBP formula; each task computes a few digits of pi at a specific position (e.g. task 1 computes the 1st - 4th digits, task 2 computes the 5th - 8th digits, etc.).
  distbbp - also uses a BBP formula but evaluates the formula in a distributed manner.

pi is only able to compute ~10 digits even with a large number of samples. I got the following result in HADOOP-4437 with 1000 maps and 10000000 samples per map:

  Job Finished in 67.337 seconds
  Estimated value of PI is 3.14159264520000000000

bbp is able to compute millions of digits (I forget whether it could scale to billions, but it definitely won't work well on trillions). See HADOOP-5052.

distbbp is able to compute digits of pi up to the quadrillionth (10^15th) position using a large cluster. Note that it skips to a particular position and computes the digits starting at that position. See MAPREDUCE-637 and MAPREDUCE-1923. See also the articles at the end.

Note that bbp and distbbp are available in 2.0.0 and above (also 0.21 and above) but in neither 1.x.x nor 0.20.x.

Thanks for being interested in it!
Tsz-Wo

------------------
- The Two Quadrillionth Bit of Pi is 0! Distributed Computation of Pi with Apache Hadoop
  http://arxiv.org/abs/1008.3171
- BBC News: Pi record smashed as team finds two-quadrillionth digit
  http://www.bbc.co.uk/news/technology-11313194
- New Scientist: New pi record exploits Yahoo's computers
  http://www.newscientist.com/article/dn19465-new-pi-record-exploits-yahoos-computers.html
- CNN Money Tech: Yahoo exec finds two-quadrillionth digit of pi
  http://cnnmoneytech.tumblr.com/post/1137357695/yahoo-exec-finds-two-quadrillionth-digit-of-pi
- David Bailey (mathematician): Yahoo! researcher computes binary digits of pi beginning at two quadrillionth digit
  http://experimentalmath.info/blog/2010/09/yahoo-researcher-computes-binary-digits-of-pi-beginning-at-two-quadrillionth-digit/
- Communications of the ACM: New Pi Record Exploits Yahoo's Computers
  http://cacm.acm.org/news/99207-new-pi-record-exploits-yahoos-computers
- Communications of the ACM: Math at Web Speed
  http://mags.acm.org/communications/201011?pg=20#pg20
- Computing Now (IEEE): Yahoo Sets Record for Pi Bit Calculation
  http://www.computer.org/portal/web/news/home/-/blogs/3147549
- The Register: Yahoo! boffin scores pi's two quadrillionth bit
  http://www.theregister.co.uk/2010/09/16/pi_record_at_yahoo/
- ReadWriteCloud: A Cloud Computing Milestone: Yahoo! Reaches the 2 Quadrillionth Bit of Pi
  http://www.readwriteweb.com/cloud/2010/09/a-cloud-computing-milestone-ya.php
- ZDNet: Hadoop used to calculate Pi's two quadrillionth bit
  http://www.zdnet.co.uk/blogs/mapping-babel-10017967/hadoop-used-to-calculate-pis-two-quadrillionth-bit-10018670/

________________________________
From: Alex Paransky <ap...@standardset.com>
To: common-user@hadoop.apache.org
Sent: Tuesday, May 8, 2012 5:35 PM
Subject: Hadoop calculates PI

So, I installed Hadoop on my iMac via port install hadoop and, after working through a few configuration issues, tried to test the setup by calculating pi. Unfortunately, I got this answer:

Estimated value of Pi is *3.14800000000000000000*

which is not what I expected. Is there something that I missed?

Thanks for any help you can offer.

Here is the job output:

hadoop-1.0.2 $ hadoop-bin hadoop jar $HADOOP_HOME/hadoop-examples-*.jar pi 10 100
Warning: $HADOOP_HOME is deprecated.
Number of Maps  = 10
Samples per Map = 100
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Wrote input for Map #3
Wrote input for Map #4
Wrote input for Map #5
Wrote input for Map #6
Wrote input for Map #7
Wrote input for Map #8
Wrote input for Map #9
Starting Job
12/05/08 16:15:12 INFO mapred.FileInputFormat: Total input paths to process : 10
12/05/08 16:15:13 INFO mapred.JobClient: Running job: job_201205081614_0001
12/05/08 16:15:14 INFO mapred.JobClient:  map 0% reduce 0%
12/05/08 16:15:28 INFO mapred.JobClient:  map 20% reduce 0%
12/05/08 16:15:34 INFO mapred.JobClient:  map 40% reduce 0%
12/05/08 16:15:37 INFO mapred.JobClient:  map 40% reduce 6%
12/05/08 16:15:40 INFO mapred.JobClient:  map 60% reduce 6%
12/05/08 16:15:46 INFO mapred.JobClient:  map 80% reduce 13%
12/05/08 16:15:52 INFO mapred.JobClient:  map 100% reduce 26%
12/05/08 16:16:01 INFO mapred.JobClient:  map 100% reduce 100%
12/05/08 16:16:06 INFO mapred.JobClient: Job complete: job_201205081614_0001
12/05/08 16:16:06 INFO mapred.JobClient: Counters: 27
12/05/08 16:16:06 INFO mapred.JobClient:   Job Counters
12/05/08 16:16:06 INFO mapred.JobClient:     Launched reduce tasks=1
12/05/08 16:16:06 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=49813
12/05/08 16:16:06 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
12/05/08 16:16:06 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
12/05/08 16:16:06 INFO mapred.JobClient:     Launched map tasks=10
12/05/08 16:16:06 INFO mapred.JobClient:     Data-local map tasks=10
12/05/08 16:16:06 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=31329
12/05/08 16:16:06 INFO mapred.JobClient:   File Input Format Counters
12/05/08 16:16:06 INFO mapred.JobClient:     Bytes Read=1180
12/05/08 16:16:06 INFO mapred.JobClient:   File Output Format Counters
12/05/08 16:16:06 INFO mapred.JobClient:     Bytes Written=97
12/05/08 16:16:06 INFO mapred.JobClient:   FileSystemCounters
12/05/08 16:16:06 INFO mapred.JobClient:     FILE_BYTES_READ=226
12/05/08 16:16:06 INFO mapred.JobClient:     HDFS_BYTES_READ=2410
12/05/08 16:16:06 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=239538
12/05/08 16:16:06 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=215
12/05/08 16:16:06 INFO mapred.JobClient:   Map-Reduce Framework
12/05/08 16:16:06 INFO mapred.JobClient:     Map output materialized bytes=280
12/05/08 16:16:06 INFO mapred.JobClient:     Map input records=10
12/05/08 16:16:06 INFO mapred.JobClient:     Reduce shuffle bytes=252
12/05/08 16:16:06 INFO mapred.JobClient:     Spilled Records=40
12/05/08 16:16:06 INFO mapred.JobClient:     Map output bytes=180
12/05/08 16:16:06 INFO mapred.JobClient:     Total committed heap usage (bytes)=1931190272
12/05/08 16:16:06 INFO mapred.JobClient:     Map input bytes=240
12/05/08 16:16:06 INFO mapred.JobClient:     Combine input records=0
12/05/08 16:16:06 INFO mapred.JobClient:     SPLIT_RAW_BYTES=1230
12/05/08 16:16:06 INFO mapred.JobClient:     Reduce input records=20
12/05/08 16:16:06 INFO mapred.JobClient:     Reduce input groups=20
12/05/08 16:16:06 INFO mapred.JobClient:     Combine output records=0
12/05/08 16:16:06 INFO mapred.JobClient:     Reduce output records=0
12/05/08 16:16:06 INFO mapred.JobClient:     Map output records=20
Job Finished in 53.422 seconds
Estimated value of Pi is 3.14800000000000000000
hadoop-1.0.2 $
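
For what it's worth, the 3.148 in the job output above is about what one should expect from "pi 10 100", which uses only 10 x 100 = 1000 sample points in total. The estimate has the form 4 * (points inside the circle) / (total points), so with 1000 points it can only be a multiple of 0.004, and 3.148 corresponds to 787 of the 1000 points landing inside. Below is a minimal Python sketch of that kind of qMC estimate. It is not the Hadoop source: it assumes the points come from a two-dimensional Halton sequence with bases 2 and 3, and the names radical_inverse and estimate_pi are made up for illustration.

```python
import math


def radical_inverse(index: int, base: int) -> float:
    """Van der Corput radical inverse: mirror the digits of `index`
    (written in `base`) around the radix point, giving a value in [0, 1)."""
    result = 0.0
    fraction = 1.0 / base
    while index > 0:
        result += (index % base) * fraction
        index //= base
        fraction /= base
    return result


def estimate_pi(num_maps: int, samples_per_map: int) -> float:
    """Estimate pi from the fraction of quasi-random points in the unit
    square that fall inside the inscribed circle of radius 0.5.

    Assumption (illustrative only): points are Halton points with bases
    2 and 3, indexed 1..N. The split into maps only sets the total N here.
    """
    total = num_maps * samples_per_map
    inside = 0
    for i in range(1, total + 1):
        # Shift the point so the circle is centered at the origin.
        x = radical_inverse(i, 2) - 0.5
        y = radical_inverse(i, 3) - 0.5
        if x * x + y * y <= 0.25:
            inside += 1
    # Area of circle / area of square = pi / 4.
    return 4.0 * inside / total


if __name__ == "__main__":
    # Same sample counts as the command in the question: 10 maps x 100 samples.
    print(estimate_pi(10, 100))
```

Raising either argument of the pi example (for instance many more samples per map) gives more digits, as in the HADOOP-4437 result quoted in the reply, but the qMC approach converges slowly compared with the BBP-based programs.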