Hadoop uses something like the Monte Carlo method to compute pi. The estimate gets more accurate as the simulation is given more samples, and therefore more CPU time. That's why it's kind of a cool test of a distributed system. Try giving it more maps with more samples; somewhere around 2-4 orders of magnitude more.
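To make the idea concrete, here is a minimal single-machine sketch of a Monte Carlo estimate of pi. It is not the code Hadoop runs: the function name `estimate_pi`, the `seed` parameter, and the use of Python's pseudo-random generator are all assumptions made for illustration, and the Hadoop example may choose its sample points differently.

```python
import random


def estimate_pi(num_samples, seed=0):
    """Estimate pi from points drawn uniformly in the unit square.

    Hypothetical helper for illustration only; not part of Hadoop.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    inside = 0
    for _ in range(num_samples):
        x, y = rng.random(), rng.random()
        # Count the point if it falls inside the quarter circle of radius 1.
        if x * x + y * y <= 1.0:
            inside += 1
    # The quarter circle covers pi/4 of the square, so scale the ratio by 4.
    return 4.0 * inside / num_samples


if __name__ == "__main__":
    # More samples should, on average, give a closer estimate.
    for n in (1, 100, 10_000, 100_000):
        print(n, estimate_pi(n))
```

With a single sample this sketch can only return 0 or 4, which is why very small sample counts give such coarse answers.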
Read about the Monte Carlo method here:
http://math.fullerton.edu/mathews/n2003/montecarlopimod.html

-- Elliott Clark

On Tue, May 8, 2012 at 5:45 PM, John Hancock <jhancock1...@gmail.com> wrote:

> Alex,
>
> Give it parameters 1 1 and it will tell you pi is about 4.
>
> I think what really helps get better precision is making the second
> parameter larger since that is the number of samples.
>
> -John
>
> On Tue, May 8, 2012 at 8:35 PM, Alex Paransky <ap...@standardset.com>
> wrote:
>
> > So, I installed Hadoop on my imac via port install hadoop and after
> > working through a few configuration issues tried to test the setup
> > with calculation of PI. Unfortunately, I got this answer:
> >
> > Estimated value of Pi is *3.14800000000000000000*
> >
> > Which is not what I expected. Is there something that I missed?
> >
> > Thanks for any help you can offer.
> >
> > Here is the job output:
> > hadoop-1.0.2 $ hadoop-bin hadoop jar $HADOOP_HOME/hadoop-examples-*.jar pi 10 100
> > Warning: $HADOOP_HOME is deprecated.
> >
> > Number of Maps = 10
> > Samples per Map = 100
> > Wrote input for Map #0
> > Wrote input for Map #1
> > Wrote input for Map #2
> > Wrote input for Map #3
> > Wrote input for Map #4
> > Wrote input for Map #5
> > Wrote input for Map #6
> > Wrote input for Map #7
> > Wrote input for Map #8
> > Wrote input for Map #9
> > Starting Job
> > 12/05/08 16:15:12 INFO mapred.FileInputFormat: Total input paths to process : 10
> > 12/05/08 16:15:13 INFO mapred.JobClient: Running job: job_201205081614_0001
> > 12/05/08 16:15:14 INFO mapred.JobClient: map 0% reduce 0%
> > 12/05/08 16:15:28 INFO mapred.JobClient: map 20% reduce 0%
> > 12/05/08 16:15:34 INFO mapred.JobClient: map 40% reduce 0%
> > 12/05/08 16:15:37 INFO mapred.JobClient: map 40% reduce 6%
> > 12/05/08 16:15:40 INFO mapred.JobClient: map 60% reduce 6%
> > 12/05/08 16:15:46 INFO mapred.JobClient: map 80% reduce 13%
> > 12/05/08 16:15:52 INFO mapred.JobClient: map 100% reduce 26%
> > 12/05/08 16:16:01 INFO mapred.JobClient: map 100% reduce 100%
> > 12/05/08 16:16:06 INFO mapred.JobClient: Job complete: job_201205081614_0001
> > 12/05/08 16:16:06 INFO mapred.JobClient: Counters: 27
> > 12/05/08 16:16:06 INFO mapred.JobClient: Job Counters
> > 12/05/08 16:16:06 INFO mapred.JobClient: Launched reduce tasks=1
> > 12/05/08 16:16:06 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=49813
> > 12/05/08 16:16:06 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
> > 12/05/08 16:16:06 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
> > 12/05/08 16:16:06 INFO mapred.JobClient: Launched map tasks=10
> > 12/05/08 16:16:06 INFO mapred.JobClient: Data-local map tasks=10
> > 12/05/08 16:16:06 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=31329
> > 12/05/08 16:16:06 INFO mapred.JobClient: File Input Format Counters
> > 12/05/08 16:16:06 INFO mapred.JobClient: Bytes Read=1180
> > 12/05/08 16:16:06 INFO mapred.JobClient: File Output Format Counters
> > 12/05/08 16:16:06 INFO mapred.JobClient: Bytes Written=97
> > 12/05/08 16:16:06 INFO mapred.JobClient: FileSystemCounters
> > 12/05/08 16:16:06 INFO mapred.JobClient: FILE_BYTES_READ=226
> > 12/05/08 16:16:06 INFO mapred.JobClient: HDFS_BYTES_READ=2410
> > 12/05/08 16:16:06 INFO mapred.JobClient: FILE_BYTES_WRITTEN=239538
> > 12/05/08 16:16:06 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=215
> > 12/05/08 16:16:06 INFO mapred.JobClient: Map-Reduce Framework
> > 12/05/08 16:16:06 INFO mapred.JobClient: Map output materialized bytes=280
> > 12/05/08 16:16:06 INFO mapred.JobClient: Map input records=10
> > 12/05/08 16:16:06 INFO mapred.JobClient: Reduce shuffle bytes=252
> > 12/05/08 16:16:06 INFO mapred.JobClient: Spilled Records=40
> > 12/05/08 16:16:06 INFO mapred.JobClient: Map output bytes=180
> > 12/05/08 16:16:06 INFO mapred.JobClient: Total committed heap usage (bytes)=1931190272
> > 12/05/08 16:16:06 INFO mapred.JobClient: Map input bytes=240
> > 12/05/08 16:16:06 INFO mapred.JobClient: Combine input records=0
> > 12/05/08 16:16:06 INFO mapred.JobClient: SPLIT_RAW_BYTES=1230
> > 12/05/08 16:16:06 INFO mapred.JobClient: Reduce input records=20
> > 12/05/08 16:16:06 INFO mapred.JobClient: Reduce input groups=20
> > 12/05/08 16:16:06 INFO mapred.JobClient: Combine output records=0
> > 12/05/08 16:16:06 INFO mapred.JobClient: Reduce output records=0
> > 12/05/08 16:16:06 INFO mapred.JobClient: Map output records=20
> > Job Finished in 53.422 seconds
> > Estimated value of Pi is 3.14800000000000000000
> > hadoop-1.0.2 $
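
As a rough check on the reported figure, the sketch below works backwards from the estimate in the job output, assuming the estimate is computed as 4 * (points inside the circle) / (total points), with 10 maps of 100 samples each. The helper names `hits_from_estimate` and `standard_error` are hypothetical, and the error formula assumes plain independent random sampling, which may not match how the Hadoop example picks its points.

```python
import math


def hits_from_estimate(estimate, total_samples):
    """Recover the inside-the-circle count implied by an estimate (hypothetical helper)."""
    return round(estimate * total_samples / 4)


def standard_error(total_samples):
    """Approximate one-sigma error of 4 * hits / total under independent sampling."""
    p = math.pi / 4  # probability that a uniform point lands inside the circle
    return 4 * math.sqrt(p * (1 - p) / total_samples)


total = 10 * 100                          # maps * samples per map from the command line
hits = hits_from_estimate(3.148, total)   # count implied by the reported estimate

if __name__ == "__main__":
    print("implied hits:", hits, "of", total)
    print("observed error:", abs(3.148 - math.pi))
    print("typical error at this sample count:", standard_error(total))
    print("typical error with 100x the samples:", standard_error(100 * total))
```

Under these assumptions the expected error shrinks with the square root of the sample count, so 100 times more samples buys roughly one more correct decimal digit.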