Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Pig Wiki" for change 
notification.

The "NativeMapReduce" page has been changed by Aniket Mokashi.
http://wiki.apache.org/pig/NativeMapReduce?action=diff&rev1=4&rev2=5

--------------------------------------------------

  With native job support, Pig can run native MapReduce jobs written in Java 
that convert one data set into another by applying custom map and reduce 
functions of any complexity.
  
  == Native Mapreduce job specification ==
- Native Mapreduce job needs to conform to some specification defined by Pig. 
This is required as Pig specifies the input and output directory in the script 
for this job and is responsible for managing the coordination of the native job 
with the remaining pig mapreduce jobs. Pig also might need to provide some 
extra configuration like job name, input/output formats, parallelism to the 
native job. For communicating such parameters to the native job, it should 
provide some way of communication.
+ A native MapReduce job needs to conform to a specification defined by Pig. 
This is required because Pig specifies the input and output directories in the 
script for this job and is responsible for coordinating the native job with 
the remaining Pig MapReduce jobs. Pig might also need to provide extra 
configuration, such as the job name, input/output formats, and parallelism, to 
the native job. Such parameters are communicated to the native job according 
to the specification provided by Pig.
  
  The following are some approaches to achieving this-
   1. Ordered inputLoc/outputLoc parameters- This is a simple approach 
wherein native programs follow a convention so that their first and second 
parameters are treated as input and output respectively. The Pig ''native'' 
command takes the parameters required by the native MapReduce job and passes 
them to the native job as command line arguments. It is up to the native 
program to use these parameters in the operations it performs.
@@ -51, +51 @@

  FileInputFormat.setInputPaths(conf, new Path(args[0]));  
  FileOutputFormat.setOutputPath(conf, new Path(args[1]));
  }}}
+  2. getJobConf Function- Native jobs implement '''getJobConf''' method which 
returns org.apache.hadoop.mapred.JobConf object so that pig can schedule the 
job. This also provides a way to add more pig specific parame
- 
-  2. getJobConf Function-
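
  As an illustration of approach 1 only, the positional convention could be 
sketched in plain Java as below. The class name NativeJobArgs and the handling 
of arguments after the first two are assumptions made for this sketch; they 
are not part of Pig or Hadoop. A real job would pass inputLoc and outputLoc to 
FileInputFormat.setInputPaths and FileOutputFormat.setOutputPath, as in the 
snippet on the wiki page.

```java
import java.util.Arrays;

/**
 * Hypothetical helper (not part of Pig or Hadoop) that splits the command
 * line of a native job according to the ordered inputLoc/outputLoc convention.
 */
final class NativeJobArgs {
    final String inputLoc;   // args[0]: location the native job reads from
    final String outputLoc;  // args[1]: location the native job writes to
    final String[] extra;    // args[2..]: assumed to hold any further parameters

    private NativeJobArgs(String inputLoc, String outputLoc, String[] extra) {
        this.inputLoc = inputLoc;
        this.outputLoc = outputLoc;
        this.extra = extra;
    }

    /** Parses positional arguments; fails fast if input or output is missing. */
    static NativeJobArgs parse(String[] args) {
        if (args == null || args.length < 2) {
            throw new IllegalArgumentException(
                "usage: <inputLoc> <outputLoc> [extra parameters...]");
        }
        return new NativeJobArgs(
            args[0], args[1], Arrays.copyOfRange(args, 2, args.length));
    }

    public static void main(String[] args) {
        NativeJobArgs parsed = parse(args);
        // A real job would configure its JobConf from these values here.
        System.out.println("input=" + parsed.inputLoc
            + " output=" + parsed.outputLoc
            + " extra=" + Arrays.toString(parsed.extra));
    }
}
```

  Failing early when fewer than two arguments are supplied keeps a 
misconfigured script from starting a job with no input or output location.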
  
  
  
