Hi, all

I'm using gridmix2 to test my cluster, and its README file contains
statements like the following:

+1) Three stage map/reduce job
+     Input:      500GB compressed (2TB uncompressed) SequenceFile
+                 (k,v) = (5 words, 100 words)
+                 hadoop-env: FIXCOMPSEQ
+     Compute1:   keep 10% map, 40% reduce
+     Compute2:   keep 100% map, 77% reduce
+                 Input from Compute1
+     Compute3:   keep 116% map, 91% reduce
+                 Input from Compute2
+     Motivation: Many user workloads are implemented as pipelined map/reduce
+                 jobs, including Pig workloads


Can anyone tell me what "keep 10% map, 40% reduce" means here?

Best,

-- 
Nan Zhu
School of Electronic, Information and Electrical Engineering,229
Shanghai Jiao Tong University
800,Dongchuan Road,Shanghai,China
E-Mail: zhunans...@gmail.com
