Hi all,

I'm using gridmix2 to test my cluster. Its README file contains statements like the following:

+1) Three stage map/reduce job
+   Input: 500GB compressed (2TB uncompressed) SequenceFile
+          (k,v) = (5 words, 100 words)
+          hadoop-env: FIXCOMPSEQ
+   *Compute1: keep 10% map, 40% reduce
+   Compute2: keep 100% map, 77% reduce
+          Input from Compute1
+   Compute3: keep 116% map, 91% reduce
+          Input from Compute2
+   *Motivation: Many user workloads are implemented as pipelined map/reduce
+          jobs, including Pig workloads

Can anyone tell me what "keep 10% map, 40% reduce" means here?

Best,

--
Nan Zhu
School of Electronic, Information and Electrical Engineering,229
Shanghai Jiao Tong University
800,Dongchuan Road,Shanghai,China
E-Mail: zhunans...@gmail.com