Re: Hadoop calculates PI

Elliott Clark Tue, 08 May 2012 17:53:40 -0700

Hadoop uses something like the Monte Carlo method to compute pi.  The
estimate gets more accurate as the simulation is given more samples, which
costs more CPU time. That's why it's kind of a cool test of a distributed
system.  Try giving it more maps with more samples; somewhere around 2-4
orders of magnitude more.
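
For a feel of why more samples help, here is a minimal sketch of a plain
Monte Carlo estimate in Python. It is not the code Hadoop's pi example runs;
the function names (estimate_pi, standard_error) and the fixed seed are made
up for illustration, and it assumes independent uniform random points in the
unit square.

```python
import math
import random


def estimate_pi(num_samples, seed=0):
    """Estimate pi by sampling points uniformly in the unit square.

    The fraction of points that land inside the quarter circle of radius 1
    approximates pi/4, so multiplying by 4 gives the estimate.
    """
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatable runs
    inside = 0
    for _ in range(num_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / num_samples


def standard_error(num_samples):
    """Approximate one-sigma error of estimate_pi for independent samples."""
    p = math.pi / 4.0  # probability that a point lands inside the quarter circle
    return 4.0 * math.sqrt(p * (1.0 - p) / num_samples)


if __name__ == "__main__":
    for n in (1, 1_000, 1_000_000):
        print(n, estimate_pi(n), standard_error(n))
```

With a single sample the estimate can only be 0 or 4, and under these
assumptions the error shrinks roughly like 1/sqrt(N). For 1,000 samples
(10 maps of 100 samples each) that works out to an expected error of about
0.05, so an answer of 3.148 is well within it.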


Read about the Monte Carlo method here:
http://math.fullerton.edu/mathews/n2003/montecarlopimod.html
--
Elliott Clark


On Tue, May 8, 2012 at 5:45 PM, John Hancock <jhancock1...@gmail.com> wrote:

> Alex,
>
> Give it parameters 1 1 and it will tell you pi is about 4.
>
> I think what really helps get better precision is making the second
> parameter larger since that is the number of samples.
>
> -John
>
> On Tue, May 8, 2012 at 8:35 PM, Alex Paransky <ap...@standardset.com>
> wrote:
>
> > So, I installed Hadoop on my iMac via port install hadoop and after working
> > through a few configuration issues tried to test the setup with calculation
> > of PI.  Unfortunately, I got this answer:
> >
> > Estimated value of Pi is *3.14800000000000000000*
> >
> > Which is not what I expected.  Is there something that I missed?
> >
> > Thanks for any help you can offer.
> >
> > Here is the job output:
> > hadoop-1.0.2 $ hadoop-bin hadoop jar $HADOOP_HOME/hadoop-examples-*.jar pi 10 100
> > Warning: $HADOOP_HOME is deprecated.
> >
> > Number of Maps  = 10
> > Samples per Map = 100
> > Wrote input for Map #0
> > Wrote input for Map #1
> > Wrote input for Map #2
> > Wrote input for Map #3
> > Wrote input for Map #4
> > Wrote input for Map #5
> > Wrote input for Map #6
> > Wrote input for Map #7
> > Wrote input for Map #8
> > Wrote input for Map #9
> > Starting Job
> > 12/05/08 16:15:12 INFO mapred.FileInputFormat: Total input paths to process : 10
> > 12/05/08 16:15:13 INFO mapred.JobClient: Running job: job_201205081614_0001
> > 12/05/08 16:15:14 INFO mapred.JobClient:  map 0% reduce 0%
> > 12/05/08 16:15:28 INFO mapred.JobClient:  map 20% reduce 0%
> > 12/05/08 16:15:34 INFO mapred.JobClient:  map 40% reduce 0%
> > 12/05/08 16:15:37 INFO mapred.JobClient:  map 40% reduce 6%
> > 12/05/08 16:15:40 INFO mapred.JobClient:  map 60% reduce 6%
> > 12/05/08 16:15:46 INFO mapred.JobClient:  map 80% reduce 13%
> > 12/05/08 16:15:52 INFO mapred.JobClient:  map 100% reduce 26%
> > 12/05/08 16:16:01 INFO mapred.JobClient:  map 100% reduce 100%
> > 12/05/08 16:16:06 INFO mapred.JobClient: Job complete: job_201205081614_0001
> > 12/05/08 16:16:06 INFO mapred.JobClient: Counters: 27
> > 12/05/08 16:16:06 INFO mapred.JobClient:   Job Counters
> > 12/05/08 16:16:06 INFO mapred.JobClient:     Launched reduce tasks=1
> > 12/05/08 16:16:06 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=49813
> > 12/05/08 16:16:06 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
> > 12/05/08 16:16:06 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
> > 12/05/08 16:16:06 INFO mapred.JobClient:     Launched map tasks=10
> > 12/05/08 16:16:06 INFO mapred.JobClient:     Data-local map tasks=10
> > 12/05/08 16:16:06 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=31329
> > 12/05/08 16:16:06 INFO mapred.JobClient:   File Input Format Counters
> > 12/05/08 16:16:06 INFO mapred.JobClient:     Bytes Read=1180
> > 12/05/08 16:16:06 INFO mapred.JobClient:   File Output Format Counters
> > 12/05/08 16:16:06 INFO mapred.JobClient:     Bytes Written=97
> > 12/05/08 16:16:06 INFO mapred.JobClient:   FileSystemCounters
> > 12/05/08 16:16:06 INFO mapred.JobClient:     FILE_BYTES_READ=226
> > 12/05/08 16:16:06 INFO mapred.JobClient:     HDFS_BYTES_READ=2410
> > 12/05/08 16:16:06 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=239538
> > 12/05/08 16:16:06 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=215
> > 12/05/08 16:16:06 INFO mapred.JobClient:   Map-Reduce Framework
> > 12/05/08 16:16:06 INFO mapred.JobClient:     Map output materialized bytes=280
> > 12/05/08 16:16:06 INFO mapred.JobClient:     Map input records=10
> > 12/05/08 16:16:06 INFO mapred.JobClient:     Reduce shuffle bytes=252
> > 12/05/08 16:16:06 INFO mapred.JobClient:     Spilled Records=40
> > 12/05/08 16:16:06 INFO mapred.JobClient:     Map output bytes=180
> > 12/05/08 16:16:06 INFO mapred.JobClient:     Total committed heap usage (bytes)=1931190272
> > 12/05/08 16:16:06 INFO mapred.JobClient:     Map input bytes=240
> > 12/05/08 16:16:06 INFO mapred.JobClient:     Combine input records=0
> > 12/05/08 16:16:06 INFO mapred.JobClient:     SPLIT_RAW_BYTES=1230
> > 12/05/08 16:16:06 INFO mapred.JobClient:     Reduce input records=20
> > 12/05/08 16:16:06 INFO mapred.JobClient:     Reduce input groups=20
> > 12/05/08 16:16:06 INFO mapred.JobClient:     Combine output records=0
> > 12/05/08 16:16:06 INFO mapred.JobClient:     Reduce output records=0
> > 12/05/08 16:16:06 INFO mapred.JobClient:     Map output records=20
> > Job Finished in 53.422 seconds
> > Estimated value of Pi is 3.14800000000000000000
> > hadoop-1.0.2 $
> >
>
