I saw a change on the Pig Wiki front page:

- [http://incubator.apache.org/pig/ Pig] is a platform for analyzing large data sets. Pig's language, Pig Latin, is a simple query algebra that lets you express data transformations such as merging data sets, filtering them, and applying functions to records or groups of records. Users can create their own functions to do special-purpose processing.
+ [http://incubator.apache.org/pig/ Pig] is a dataflow programming environment for processing very large files. Pig's language is called Pig Latin. A Pig Latin program consists of a directed acyclic graph where each node represents an operation that transforms data. Operations are of two flavors: (1) relational-algebra style operations such as join, filter, project; (2) functional-programming style operators such as map, reduce.

Is there any change in philosophy? What is the difference between "a platform for analyzing large data sets" and a "dataflow programming environment"? Does the term "dataflow programming environment" imply that Pig can run across multiple file systems at the same time?

Cheers,
Pi
