Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Pig Wiki" for change notification.

The following page has been changed by OlgaN:

   * A '''Data Bag''' is a set of tuples (duplicate tuples are allowed). You 
may think of it as a "table", except that Pig does not require that the tuple 
field types match, or even that the tuples have the same number of fields! (It 
is up to you whether you want these properties.) We denote bags by { } 
bracketing. Thus, a data bag could be {<,1.0>, <,0.8>}
   * A '''Data Map''' is a map from keys that are string literals to values 
that can be any data type. Think of it as a !HashMap<String,X> where X can be 
any of the 4 Pig data types. A Data Map supports the expected get and put 
interface. We denote maps by [ ] bracketing, with ":" separating the key and 
the value, and ";" separating successive key-value pairs. Thus, a data map 
could be [ 'apache' : <'pig', 'hadoop'> ; 'cnn' : 'news' ]. Here, the key 
'apache' is mapped to the tuple with 2 atomic fields 'pig' and 'hadoop', while 
the key 'cnn' is mapped to the data atom 'news'.
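
The bag and map descriptions above can be sketched in Python. This is only an illustration under stated assumptions, not Pig's actual API: tuples are modeled as Python tuples, bags as lists, and maps as dicts. The `render` helper and the bag contents are made up for this example; only the map value comes from the text.

```python
# Hypothetical Python stand-ins for Pig's data types; this is not Pig's API.
# Assumption: tuple -> Python tuple, bag -> list, map -> dict with str keys.

# A bag may hold duplicate tuples, and its tuples need not have the same
# number of fields or the same field types. (Values here are invented.)
bag = [(1, 2.5), (1, 2.5), ("x",)]

# The data map from the text: string keys, values of any data type.
data_map = {"apache": ("pig", "hadoop"), "cnn": "news"}


def render(item):
    """Format an item in the bracket notation used on this page."""
    if isinstance(item, tuple):
        return "<" + ", ".join(render(x) for x in item) + ">"
    if isinstance(item, list):
        return "{" + ", ".join(render(x) for x in item) + "}"
    if isinstance(item, dict):
        pairs = " ; ".join(f"'{k}' : {render(v)}" for k, v in item.items())
        return "[ " + pairs + " ]"
    if isinstance(item, str):
        return f"'{item}'"
    return str(item)


print(render(data_map))
```

The map lookup `data_map["apache"]` plays the role of the "get" in the get and put interface mentioned above, and assigning to a key plays the role of "put".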
+ #DataItems
+ == Data Items ==
+ Data can be referred to in various powerful and convenient ways in Pig. Any 
data referred to is called a Data Item. We will illustrate all these ways by 
using the following example tuple.
+ `t = < 1, {<2,3>,<4,6>,<5,7>}, ['yahoo':'search']>`
+ Thus, ''t'' has 3 fields. Let these fields have names f1, f2, f3. Field f1 is 
an atom with value 1. Field f2 is a bag having 3 tuples. Field f3 is a data map 
having 1 key.
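
As a rough aid to reading the example, the tuple ''t'' can be mirrored in Python. This is a sketch under the assumption that a tuple maps to a Python tuple, a bag to a list, and a data map to a dict; it is not how Pig itself represents data.

```python
# Hypothetical Python mirror of the example tuple t; not Pig's representation.
t = (1, [(2, 3), (4, 6), (5, 7)], {"yahoo": "search"})

# Name the three fields f1, f2, f3 as in the text.
f1, f2, f3 = t

print(f1)       # the atom
print(len(f2))  # number of tuples in the bag
print(len(f3))  # number of keys in the data map
```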
+ The following table lists the various methods of referring to data.
+ || Method of Referring to Data || Example || Value for example tuple ''t'' || 
Notes ||
+ || '''Constant''' || ''' '1.0' ''', or ''' '' ''', or ''' 'blah' ''' 
|| Value constant irrespective of ''t'' || ||
