Hi all,
  I have a question about the following script, testlimit.pig:

a = load './testlimit.txt' as (x:int, y:chararray);
b = order a by x;
c = limit b 1;
store c into './testlimit.out';

In MR mode, it generates 4 MapReduce nodes (scope-11, scope-14, scope-29, scope-40):

scope-11: loads the input data and stores it to a tmp file.
scope-14: samples the tmp file (using RandomSampleLoader) and generates the quantile file:
hdfs://zly1.sh.intel.com:8020/tmp/temp2146669591/tmp300898425. I think the
quantile file contains the instance of WeightedRangePartitioner, which shows
how the keys are distributed.
scope-29: uses the quantile file to sort. My question here:
WeightedRangePartitioner only shows how the keys are distributed and makes every
reducer receive an equal amount of data from the maps. But can this guarantee a sorted result?
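
To make the question concrete, here is a minimal Python sketch of plain range partitioning as I understand the idea. It is not Pig's actual code: find_quantiles, partition_for and range_partitioned_sort are names made up for illustration, and it ignores whatever weighting WeightedRangePartitioner applies. It only shows that if the boundaries are sorted and each reducer sorts its own keys, reading the reducer outputs in partition order gives a globally sorted result.

```python
import bisect
import random


def find_quantiles(sample, num_partitions):
    """Pick num_partitions - 1 boundary keys from a sample (hypothetical helper)."""
    ordered = sorted(sample)
    if not ordered:
        return []
    return [ordered[len(ordered) * i // num_partitions]
            for i in range(1, num_partitions)]


def partition_for(key, boundaries):
    """Range partitioner: a smaller key never lands in a higher partition."""
    return bisect.bisect_right(boundaries, key)


def range_partitioned_sort(keys, num_partitions, sample_size=100, seed=0):
    """Sample, compute boundaries, partition by range, sort each partition."""
    rng = random.Random(seed)
    sample = rng.sample(keys, min(sample_size, len(keys)))
    boundaries = find_quantiles(sample, num_partitions)
    partitions = [[] for _ in range(num_partitions)]
    for key in keys:
        partitions[partition_for(key, boundaries)].append(key)
    # Each "reducer" sorts only its own keys, as the shuffle would do.
    return [sorted(part) for part in partitions]


rng = random.Random(42)
keys = [rng.randrange(1000) for _ in range(500)]
parts = range_partitioned_sort(keys, num_partitions=4)

# Concatenating the partitions in order yields a total order.
merged = [k for part in parts for k in part]
assert merged == sorted(keys)
```

In this sketch the quantile boundaries do two jobs at once: they balance the load across reducers, and because partition_for is monotonic in the key, they also order the partitions relative to each other. Whether Pig's partitioner preserves the same property is exactly what I am asking.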


#--------------------------------------------------
# Map Reduce Plan
#--------------------------------------------------
MapReduce node scope-11
Map Plan
Store(hdfs://zly1.sh.intel.com:8020/tmp/temp2146669591/tmp694083214:org.apache.pig.impl.io.InterStorage) - scope-12
|
|---a: New For Each(false,false)[bag] - scope-7
     |   |
     |   Cast[int] - scope-2
     |   |
     |   |---Project[bytearray][0] - scope-1
     |   |
     |   Cast[chararray] - scope-5
     |   |
     |   |---Project[bytearray][1] - scope-4
     |
     |---a: Load(hdfs://zly1.sh.intel.com:8020/user/root/testlimit.txt:org.apache.pig.builtin.PigStorage) - scope-0--------
Global sort: false
----------------

MapReduce node scope-14
Map Plan
b: Local Rearrange[tuple]{tuple}(false) - scope-18
|   |
|   Constant(all) - scope-17
|
|---New For Each(false)[tuple] - scope-16
     |   |
     |   Project[int][0] - scope-15
     |
     
|---Load(hdfs://zly1.sh.intel.com:8020/tmp/temp2146669591/tmp694083214:org.apache.pig.impl.builtin.RandomSampleLoader('org.apache.pig.impl.io.InterStorage','100'))
 - scope-13--------
Reduce Plan
Store(hdfs://zly1.sh.intel.com:8020/tmp/temp2146669591/tmp300898425:org.apache.pig.impl.io.InterStorage) - scope-27
|
|---New For Each(false)[tuple] - scope-26
     |   |
     |   POUserFunc(org.apache.pig.impl.builtin.FindQuantiles)[tuple] - scope-25
     |   |
     |   |---Project[tuple][*] - scope-24
     |
     |---New For Each(false,false)[tuple] - scope-23
         |   |
         |   Constant(2) - scope-22
         |   |
         |   Project[bag][1] - scope-20
         |
         |---Package(Packager)[tuple]{chararray} - scope-19--------
Global sort: false
Secondary sort: true
----------------

MapReduce node scope-29
Map Plan
b: Local Rearrange[tuple]{int}(false) - scope-30
|   |
|   Project[int][0] - scope-8
|
|---Load(hdfs://zly1.sh.intel.com:8020/tmp/temp2146669591/tmp694083214:org.apache.pig.impl.io.InterStorage) - scope-28--------
Combine Plan
Local Rearrange[tuple]{int}(false) - scope-35
|   |
|   Project[int][0] - scope-8
|
|---Limit - scope-34
     |
     |---New For Each(true)[tuple] - scope-33
         |   |
         |   Project[bag][1] - scope-32
         |
         |---Package(LitePackager)[tuple]{int} - scope-31--------
Reduce Plan
c: Store(hdfs://zly1.sh.intel.com:8020/tmp/temp2146669591/tmp538566422:org.apache.pig.impl.io.InterStorage) - scope-10
|
|---Limit - scope-39
     |
     |---New For Each(true)[tuple] - scope-38
         |   |
         |   Project[bag][1] - scope-37
         |
         |---Package(LitePackager)[tuple]{int} - scope-36--------
Global sort: true
Quantile file: hdfs://zly1.sh.intel.com:8020/tmp/temp2146669591/tmp300898425
----------------

MapReduce node scope-40
Map Plan
b: Local Rearrange[tuple]{int}(false) - scope-42
|   |
|   Project[int][0] - scope-43
|
|---Load(hdfs://zly1.sh.intel.com:8020/tmp/temp2146669591/tmp538566422:org.apache.pig.impl.io.InterStorage) - scope-41--------
Reduce Plan
c: Store(fakefile:org.apache.pig.builtin.PigStorage) - scope-49
|
|---Limit - scope-48
     |
     |---New For Each(true)[bag] - scope-47
         |   |
         |   Project[tuple][1] - scope-46
         |
         |---Package(LitePackager)[tuple]{int} - scope-45--------
Global sort: false
----------------




Best Regards,
Kelly Zhang/Zhang,Liyun
