[Pig Wiki] Update of "PigMultiQueryPerformanceSpecification" by GuntherHagleitner

Apache Wiki Sat, 07 Feb 2009 12:55:56 -0800

The following page has been changed by GuntherHagleitner:
http://wiki.apache.org/pig/PigMultiQueryPerformanceSpecification

------------------------------------------------------------------------------
  ===== Hadoop 0.19 supports MultipleOutputs =====
  Link: http://hadoop.apache.org/core/docs/r0.19.0/api/org/apache/hadoop/mapred/lib/MultipleOutputs.html#addNamedOutput(org.apache.hadoop.mapred.JobConf,%20java.lang.String,%20java.lang.Class,%20java.lang.Class,%20java.lang.Class)
  
- All the output will still be in the same directory, but the developer can 
give name for different sets of output data. So, in our case we might name the 
output "split1" and "split2" and the output would come out to be:
+ All the output will still be in the same directory, but the developer can 
give names for different sets of output data. So, in our case we might name the 
output "split1" and "split2" and the output would come out to be:
  
  {{{
  /outdir/split1-0000
@@ -287, +287 @@

  ===== MRCompiler (Phase 2 and 3) =====
  The MR Compiler right now looks for splits, terminates the MR job at that 
point and connects the remaining operators via load and store.
  
- We'll add a new optimizer pass to look for these split scenarios. This gives 
us the ability to use the combiner plan information to make the determination 
of multipexing or not (Phase 3) and also allows us more easily to switch back 
to the old style handling, if multiple outputs are not available.
+ We'll add a new optimizer pass to look for these split scenarios. This gives 
us the ability to use the combiner plan information to make the determination 
of multiplexing or not (Phase 3) and also allows us more easily to switch back 
to the old style handling, if multiple outputs are not available.
  
  [[Anchor(Parallelism_(Phase_3))]]
  ===== Parallelism (Phase 3) =====
