The following page has been changed by OlgaN:
http://wiki.apache.org/pig/PigLatin

New page:
= Introduction to Pig Latin =

So you want to learn Pig Latin. Welcome! Let's begin with the data types.

== Data Types ==

Every piece of data in Pig has one of these four types:

 * A '''Data Atom''' is a simple atomic data value. It is stored as a string
   but can be used as either a string or a number (see [#FilterS Filters]).
   Examples of data atoms are 'apache.org' and '1.0'.
 * A '''Tuple''' is a data record consisting of a sequence of "fields". Each
   field is a piece of data of any type (data atom, tuple, data bag, or data
   map). We denote tuples with < > bracketing. An example of a tuple is
   <apache.org, 1.0>.
 * A '''Data Bag''' is a collection of tuples (duplicate tuples are allowed,
   so it is a multiset rather than a strict set). You may think of it as a
   "table", except that Pig does not require that the tuple field types match,
   or even that the tuples have the same number of fields! (It is up to you
   whether you want these properties.) We denote bags by { } bracketing. Thus,
   a data bag could be {<apache.org,1.0>, <flickr.com,0.8>}.
 * A '''Data Map''' is a map from keys that are string literals to values that
   can be any data type. Think of it as a !HashMap<String,X> where X can be any
   of the four Pig data types. A Data Map supports the expected get and put
   interface. We denote maps by [ ] bracketing, with ":" separating the key and
   the value, and ";" separating successive key-value pairs. Thus, a data map
   could be [ 'apache' : <'pig', 'hadoop'> ; 'cnn' : 'news' ]. Here, the key
   'apache' is mapped to the tuple with two atomic fields, 'pig' and 'hadoop',
   while the key 'cnn' is mapped to the data atom 'news'.
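
As a rough analogy only, the four types above can be mimicked with ordinary Python values. This sketch is not Pig Latin syntax and not how Pig stores data internally; the helper `as_number` and the variable names are hypothetical and exist only for illustration. It assumes a plain list is a fair stand-in for a bag, since a list keeps duplicates and allows tuples of different lengths.

```python
# Illustrative Python analogy for the four Pig data types.
# NOT Pig's implementation; every name below is hypothetical.


def as_number(atom: str) -> float:
    """Interpret a data atom (stored as a string) as a number."""
    return float(atom)


# Data atoms: stored as strings, usable as either a string or a number.
domain = "apache.org"
rank = "1.0"

# Tuple: a sequence of fields; a field may itself be any of the four types.
record = (domain, rank)             # <apache.org, 1.0>

# Data bag: a collection of tuples. Duplicates are allowed and tuples need
# not share a field count, so a list models it more closely than a set.
bag = [
    ("apache.org", "1.0"),
    ("flickr.com", "0.8"),
    ("apache.org", "1.0"),          # duplicate tuple is kept
    ("cnn.com",),                   # tuple with a different number of fields
]

# Data map: string keys mapped to values of any type.
data_map = {
    "apache": ("pig", "hadoop"),    # key mapped to a tuple of two atoms
    "cnn": "news",                  # key mapped to a data atom
}

if __name__ == "__main__":
    print(as_number(rank) + 1)      # the atom '1.0' used as a number
    print(bag.count(record))        # how many times <apache.org, 1.0> occurs
    print(data_map["apache"][0])    # "get" on the map, then first tuple field
```

The list-for-bag choice is the main design decision here: a Python `set` would silently drop the duplicate `<apache.org, 1.0>` tuple, which contradicts the bag definition given above.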
