Re: Change in Pig Wiki

Chris Olston Mon, 10 Mar 2008 15:50:40 -0700

No change in philosophy. I just think "platform for analyzing data" is too generic. I've talked to a lot of people at a lot of institutions, and people "get" what a dataflow program is.


-Chris



On Mar 10, 2008, at 3:27 PM, pi song wrote:

I saw a change in Pig Wiki frontpage :-
- [http://incubator.apache.org/pig/ Pig] is a platform for analyzing large
data sets. Pig's language, Pig Latin, is a simple query algebra that lets
you express data transformations such as merging data sets, filtering them,
and applying functions to records or groups of records. Users can create
their own functions to do special-purpose processing.

+ [http://incubator.apache.org/pig/ Pig] is a dataflow programming
environment for processing very large files. Pig's language is called Pig
Latin. A Pig Latin program consists of a directed acyclic graph where each
node represents an operation that transforms data. Operations are of two
flavors: (1) relational-algebra style operations such as join, filter,
project; (2) functional-programming style operators such as map, reduce.

Is there any change in philosophy? What is the difference between "a
platform for analyzing large data sets" and "dataflow programming
environment"? Does the term "data flow programming environment" imply that
Pig can run across multiple file systems at the same time?

Cheers,
Pi


--
Christopher Olston, Ph.D.
Sr. Research Scientist
Yahoo! Research
