Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Pig Wiki" for change notification.

The "NativeMapReduce" page has been changed by Aniket Mokashi.


New page:
#format wiki
#language en


This document captures the specification for native MapReduce jobs and a 
proposal for executing native MapReduce jobs inside a Pig script. This is tracked 
at *

== Introduction ==
Pig needs to provide a way to natively run MapReduce jobs written in Java. 
This has some advantages:
 1. With the ''native'' keyword, the user need not worry about coordination 
between the jobs; Pig takes care of it.
 2. Users can make use of existing Java applications without being a Java 

== Syntax ==
To support native MapReduce jobs, Pig will support the following syntax:

X = ... ;
Y = NATIVE ('mymr.jar' [, 'other.jar' ...]) STORE X INTO 'storeLocation' USING 
storeFunc LOAD 'loadLocation' USING loadFunc [params, ... ];

This stores '''X''' into '''storeLocation''', which the native MapReduce job 
reads as its input. After mymr.jar's MapReduce job has run, the data is loaded 
back from '''loadLocation''' into alias '''Y'''.
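
The ordering described above (store, then run the native job, then load) can be sketched in a few lines of Java. This is only an illustration of the proposed semantics, not Pig's implementation: the class name `NativeStatementSketch`, its fields, and the step strings are all hypothetical and do not correspond to any existing Pig API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of one NATIVE statement; not part of Pig's code base.
final class NativeStatementSketch {
    final String inputAlias;    // X in the syntax above
    final String outputAlias;   // Y in the syntax above
    final List<String> jars;    // 'mymr.jar' plus any other jars
    final String storeLocation; // where X is written for the native job to read
    final String loadLocation;  // where the native job's output is read back from

    NativeStatementSketch(String inputAlias, String outputAlias, List<String> jars,
                          String storeLocation, String loadLocation) {
        this.inputAlias = inputAlias;
        this.outputAlias = outputAlias;
        this.jars = new ArrayList<>(jars);
        this.storeLocation = storeLocation;
        this.loadLocation = loadLocation;
    }

    // Returns descriptive step strings in the order the text gives:
    // 1) store X, 2) run the native job, 3) load the result into Y.
    List<String> executionOrder() {
        List<String> steps = new ArrayList<>();
        steps.add("STORE " + inputAlias + " INTO '" + storeLocation + "'");
        steps.add("RUN " + String.join(", ", jars));
        steps.add("LOAD '" + loadLocation + "' INTO " + outputAlias);
        return steps;
    }

    public static void main(String[] args) {
        NativeStatementSketch stmt = new NativeStatementSketch(
                "X", "Y", Arrays.asList("mymr.jar"), "storeLocation", "loadLocation");
        for (String step : stmt.executionOrder()) {
            System.out.println(step);
        }
    }
}
```

The point of the sketch is that the store must complete before the native job starts, and the load can only begin after the native job finishes. This is the coordination that the user would otherwise have to manage by hand.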

== Comparison with similar features ==
=== Pig Streaming ===

=== Hive Transform ===

== Native Mapreduce job specification ==
A native MapReduce job needs to conform to a specification defined by Pig. Pig 
specifies the input and output directories for this job and is responsible for 

== Implementation Details ==

== References ==

 1. <<Anchor(ref1)>> PIG-506, "Does pig need a NATIVE keyword?"
 2. <<Anchor(ref2)>> Pig Wiki, "Pig Streaming Functional Specification"
 3. <<Anchor(ref3)>> Hive Wiki, "Transform/Map-Reduce Syntax"
