The following page has been changed by ChrisOlston:

New page:
---+++ What is Pig:

 * Pig has two parts:
   * A language for processing data, called <i>Pig Latin</i>.
   * A set of <i>evaluation mechanisms</i> for evaluating a Pig Latin program. 
Current evaluation mechanisms include (1) local evaluation in a single JVM, (2) 
evaluation by translation into one or more Map-Reduce jobs, executed using 

---+++ Pig Latin programs:

 * Pig Latin has built-in relational-style operations such as filter, project, 
group, and join. Pig Latin also has a map operation that applies a custom user 
function to every member of a set. In Pig Latin, the map operation is called 

 * Additionally, users can incorporate their own custom code into essentially 
any Pig Latin operation. For example, if a user has a function that determines 
whether a given image contains a human face, the user can ask Pig to filter 
images according to this function. Pig then evaluates the function over the 
images on the user's behalf. If the evaluation mechanism incorporates 
parallelism, as is the case with the Hadoop evaluation mechanism, then the 
user's function is executed in parallel.
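
The face-detection example above can be sketched in plain Python rather than Pig Latin. This is only an illustration under stated assumptions: =contains_face= is a hypothetical stand-in for the user's function, images are represented as small dictionaries, and a thread pool stands in for a parallel evaluation mechanism. None of these names come from Pig itself.

```python
from concurrent.futures import ThreadPoolExecutor


def contains_face(image):
    # Hypothetical user function: a real one would inspect pixel data.
    # Assumption: each image is a dict carrying a precomputed "faces" count.
    return image["faces"] > 0


def filter_sequential(images, predicate):
    # Local evaluation: apply the predicate to every element, one at a time.
    return [img for img in images if predicate(img)]


def filter_parallel(images, predicate, workers=4):
    # Parallel evaluation: the same predicate, evaluated by a pool of workers.
    # executor.map yields results in input order, so both strategies agree.
    with ThreadPoolExecutor(max_workers=workers) as executor:
        keep = list(executor.map(predicate, images))
    return [img for img, k in zip(images, keep) if k]


images = [
    {"name": "a.jpg", "faces": 2},
    {"name": "b.jpg", "faces": 0},
    {"name": "c.jpg", "faces": 1},
]

sequential = filter_sequential(images, contains_face)
parallel = filter_parallel(images, contains_face)
print([img["name"] for img in parallel])
```

The user writes only the predicate. Whether it runs sequentially or in parallel is decided by the evaluation strategy, which is the separation described above.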

---+++ Data:

 * Pig can process data of any format. Some standard formats, e.g. tab-delimited 
text files, are supported via built-in capabilities. A user can add 
support for a file format by writing a function that parses the bytes of a file 
into objects in Pig's data model, and vice versa.
 * Pig's data model is similar to the relational data model, except that tuples 
can be nested. For example, you can have a table of tuples, where the third 
field of each tuple contains a table. In Pig, tables are called bags.
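
The two points above can be sketched in plain Python. This is a rough illustration, not Pig's actual loader interface: =parse_tab_delimited= and =store_tab_delimited= are hypothetical names, and a Python list is assumed as a stand-in for a bag.

```python
def parse_tab_delimited(data):
    # Parse raw bytes into a bag (here a list) of tuples, one tuple per line.
    text = data.decode("utf-8")
    return [tuple(line.split("\t")) for line in text.splitlines() if line]


def store_tab_delimited(bag):
    # The reverse direction: serialize a bag of flat string tuples to bytes.
    return "".join("\t".join(fields) + "\n" for fields in bag).encode("utf-8")


raw = b"alice\t2007\nbob\t2008\n"
bag = parse_tab_delimited(raw)

# A nested tuple: the third field is itself a bag (a table of tuples).
nested = ("alice", 2007, [("page1", 3), ("page2", 5)])

# Nested bags can be processed like any other bag, e.g. summing a field.
inner_total = sum(count for _, count in nested[2])
```

A flat relational row could not hold the third field of =nested= directly. Allowing a bag inside a tuple is the difference from the relational model that the text describes.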

