Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Pig Wiki" for change 

The following page has been changed by RichardDing:

     * The parallelism of the merged splitter job is the maximum of the 
parallelisms of all splittee jobs.
     * The keys from inner plans are partitioned into all the buckets via the 
default hash partitioner.
+ This scheme has advantages: 
+    * Simplicity. No new partition class needed.
+    * Performance. The parallelism that users specify for a job is most likely 
determined by the number of available reducers (machines), so the merged 
parallelism conforms to the user's expectation.
  To avoid key collisions between different inner plans with this scheme, the 
PigNullableWritable class is modified to take the indexes into account when 
two keys are compared (hashed). 
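
The scheme above can be illustrated with a minimal, self-contained sketch. It
does not reproduce the actual PigNullableWritable code: the class IndexedKey and
the methods mergedParallelism and partition are hypothetical names introduced
here for illustration only. The sketch assumes that each key carries the index
of the inner plan that produced it, and that the default hash partitioner
computes (hashCode & Integer.MAX_VALUE) % numPartitions, as Hadoop's
HashPartitioner does.

```java
import java.util.Objects;

public class IndexedKeySketch {

    /** Hypothetical stand-in for a key tagged with the index of its inner plan. */
    public static final class IndexedKey {
        public final int index;     // which inner plan (splittee) produced the key
        public final String value;  // the key value itself

        public IndexedKey(int index, String value) {
            this.index = index;
            this.value = value;
        }

        // Two keys are equal only if both the value and the index match,
        // so equal values from different inner plans do not collide.
        @Override
        public boolean equals(Object o) {
            if (!(o instanceof IndexedKey)) {
                return false;
            }
            IndexedKey other = (IndexedKey) o;
            return index == other.index && Objects.equals(value, other.value);
        }

        // The index is folded into the hash as well, keeping hashCode
        // consistent with equals.
        @Override
        public int hashCode() {
            return 31 * Objects.hashCode(value) + index;
        }
    }

    /** Parallelism of the merged job: the maximum over all splittee jobs (at least 1). */
    public static int mergedParallelism(int... splitteeParallelisms) {
        int max = 1;
        for (int p : splitteeParallelisms) {
            max = Math.max(max, p);
        }
        return max;
    }

    /** Bucket selection using the same formula as Hadoop's default HashPartitioner. */
    public static int partition(IndexedKey key, int numPartitions) {
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    public static void main(String[] args) {
        int reducers = mergedParallelism(4, 10, 7);
        IndexedKey fromPlan0 = new IndexedKey(0, "user42");
        IndexedKey fromPlan1 = new IndexedKey(1, "user42");
        System.out.println("merged parallelism: " + reducers);
        System.out.println("same key? " + fromPlan0.equals(fromPlan1));
        System.out.println("bucket for plan 0: " + partition(fromPlan0, reducers));
        System.out.println("bucket for plan 1: " + partition(fromPlan1, reducers));
    }
}
```

Note that two keys with the same value but different indexes may still land in
the same bucket. What the index guarantees is that they are never treated as
the same key once they are there.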
