Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Pig Wiki" for change 
notification.

The following page has been changed by AlanGates:
http://wiki.apache.org/pig/PigTypesFunctionalSpec

------------------------------------------------------------------------------
  Currently (pig release 1.2) pig has four data types:
   * bag:  A collection of tuples.
   * tuple:  An ordered set of data (each datum may be of any type).
-  * map:  A set of key value pairs, where each key is an atom and each value a 
datum (of any type).
+  * map:  A set of key value pairs, where each key is a string and each value 
a datum (of any type).
   * atom:  A single valued datum.  Currently these are always strings.
  
  Pig will now have the following data types:
   * bag:  A collection of data (not necessarily tuples).
   * tuple:   An ordered set of data (no changes).
-  * map:  A set of key value pairs, where each key is an atom and each value a 
datum (of any type).  This is not a change except that atoms will not 
necessarily be strings now.
+  * map:  A set of key value pairs, where each key is an atom and each value a 
datum (of any type).
   * int:  32 bit integer
   * long:  64 bit integer
   * float:  32 bit floating point
   * double:  64 bit floating point
-  * chararray:  Character array.  Includes the concept of encoding (i.e., how 
are the characters encoded).  Initially supported encodings are UTF16 and none 
(no encoding known or implied, data is simply handled as a series of bytes).
-  * unknown:  The type of this data is unknown or unspecified.
+  * chararray:  Character array with characters stored in a unicode format.
+  * bytearray: Array of bytes.
  
  Eventually, user defined types (UDTs) may be added.  They are not specified
  at this time.  However, nothing should be done in the implementation of types
@@ -59, +59 @@

  LOAD 'myfile' AS (colname [type], ...)
  }}}
  
+ This will be legal anywhere AS is legal.  Note that in many instances, using
+ this syntax will result in a cast, since the results of a FOREACH ... GENERATE
+ statement will in most cases already have a type.
- This will only be legal on AS in LOAD.  Once this is done the types will be
- inferable from the output of other operations.  For example, given the script:
- {{{
- a = LOAD 'myfile' AS (name chararray, salary double, underlings long);
- b = GROUP a ALL;
- c = FOREACH b GENERATE AVG(salary), AVG(underlings);
- }}}
- The system will be able to track that a.salary is a double and a.underlings is
- a long without the user supplying that information.
  
  Users will not be required to define the type of a given datum.  If the type
+ is not provided the datum will be assigned the type bytearray.  No conversions
+ will be done on bytearrays unless users coerce them into other types (see 
[#Expression_Operators expression operators below] for details).
- is not provided the datum will be assigned the type unknown.  Most expression 
evaluations can still be used with unknown types (see
- [#Expression_Operators expression operators below] for details).  Once an 
unknown is coerced into a type by applying
- an expression evaluation the field will retain that new type, not revert to 
unknown.  Leaving data as type unknown allows users to work with
- data where they either do not want to provide datatypes or do not know them.
- However, there is a performance and functionality cost.  Certain performance
- optimizations are only possible if the types are known (for example computing
- a SUM as a long value instaed of a double).
  
  When defining complex types, users have two choices.  They can define them
  completely.  For example, column a is a bag of tuples, each tuple having size
  3, with types (long, int, long).  They can also define a complex type to
- contain type unknown.  Thus column a could be described as a bag of unknown,
+ contain an unknown type.  Thus column a could be described as a bag of 
unknown,
  or a bag of tuples of unknown.
  
  The complete syntax for describing types will be:
- || '''type declaration'''    || '''comments'''                                
                              ||
+ || '''type declaration''' || '''comments'''                                   
                                    ||
- || bag(type)             ||                                                   
                      ||
+ || bag(type | unknown)   ||                                                   
                                    ||
- || tuple(...)            || indicates contents of tuple are not known (but 
not necessarily unknown) ||
+ || tuple(...)            || indicates contents of tuple are not known         
                                    ||
- || tuple(type [, type])  ||                                                   
                      ||
+ || tuple(type [, type])  ||                                                   
                                    ||
- || map(type)             || type indicates key type, value types are always 
unknown                 ||
+ || map(type)             || type indicates key type, value types default to 
bytearray but can be cast to any type ||
- || int                   ||                                                   
                      ||
+ || int                   ||                                                   
                                    ||
- || long                  ||                                                   
                      ||
+ || long                  ||                                                   
                                    ||
- || float                 ||                                                   
                      ||
+ || float                 ||                                                   
                                    ||
- || double                ||                                                   
                      ||
+ || double                ||                                                   
                                    ||
- || chararray[(encoding)] || valid encodings are UTF16 and none, if not 
specified defaults to none   ||
- || unknown               ||                                                   
                      ||
+ || chararray             ||                                                   
                                    ||
+ || bytearray             ||                                                   
                                    ||
  
  For example:
  {{{
- a = LOAD 'myfile' AS (garage bag(unknown), links bag(chararray(UTF16)), page 
bag(tuple(...)), coordinates bag(tuple(float, chararray)));
+ a = LOAD 'myfile' AS (garage bag(unknown), links bag(chararray), page 
bag(tuple(...)), coordinates bag(tuple(float, bytearray)));
  }}}
  
  Users must also be able to declare types for their user defined functions
+ (UDFs).  `DEFINE` will be changed to have the following syntax:
- (UDFs).  The syntax for this will follow the specification given in
- PigExternalFunctionDev.  Specifically, `DEFINE` will be changed to have the
- following syntax:
  
  {{{
  DEFINE alias '=' type funcspec '(' arglist ')' ';'
  
  alias: [a-zA-Z_][a-zA-Z_0-9]+
  
- type: bag | tuple | map | int | long | float | double | chararray['(' 
encoding ')'] | unknown
+ type: bag | tuple | map | int | long | float | double | chararray | bytearray
  
  funcspec: pathelement
            funcspec '.' pathelement
@@ -128, +115 @@

  As with other types,
  users will not be required to declare the signature of their UDFs, in which 
case
  any arguments can be passed to the UDF and the UDFs return value will be
- treated as type unknown.
+ treated as type bytearray.
  
  If the function signature is defined, but used in a different way, this will
  result in a compile time error.  For example, the following will result in an
@@ -155, +142 @@

  Currently, the only supported constants are strings, which must be enclosed in
  single quotes.
  
- String constants will be changed to be assumed to be of type chararray(none).
+ String constants will be assumed to be of type chararray.
  Users can specify these constants as ASCII characters (e.g. `x MATCHES 'abc'` 
)
  or as octal constants (e.g. `x MATCHES '\0141\0142\0143'` ).  This allows
  users to use non-ASCII characters in expressions.  Casts to type
- chararray(UTF16) will be supported (see [#Cast_Operators casts] below).
+ bytearray will be supported (see [#Cast_Operators casts] below).
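The equivalence between the two example constants above can be checked with a minimal Python sketch. It assumes each escape is a backslash, a literal `0`, and exactly three octal digits, as in the example; the helper name `decode_octal_escapes` is hypothetical and is not part of pig.

```python
import re


def decode_octal_escapes(literal):
    """Replace each backslash-0-ooo escape with the character it encodes."""
    # Assumed escape shape: backslash, a literal 0, then three octal digits.
    return re.sub(
        r"\\0([0-7]{3})",
        lambda match: chr(int(match.group(1), 8)),
        literal,
    )


# A raw string keeps Python from interpreting the backslashes itself.
decoded = decode_octal_escapes(r"\0141\0142\0143")
print(decoded)
```

Octal 141, 142 and 143 are the code points of `a`, `b` and `c`, so the octal form names the same three characters as `'abc'`.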
  
  Support will be added for numeric
  constants.  By default any numeric constant matching the regular expression
@@ -194, +181 @@

  == Nulls ==
  The concept of NULL is not currently supported in pig.  This
  forces errors in some cases where we would prefer not to error out (such as
+ divide by 0 and failed UDF calls).
- divide by 0 and failed UDF calls).  The basic approach to null values in pig
- is detailed on the page NullValuesInPig.  While that document introduces the
- concept of a multi-null (to indicate that a bag did not finish processing when
- an error occurred and there should possibly be more elements in the bag), that
- concept will not be implemented in pig at this time.
  
  All existing operators will be changed to work with Nulls.  Those changes 
will conform to SQL NULL semantics.  The following table describes how each 
operator will
  handle a null value:
  
- || '''Operator''' || '''Interaction''' ||
+ || '''Operator'''                || '''Interaction''' ||
- || Equality operators || If either sub-expression is null, the result of the 
equality comparison will be null ||
+ || Equality operators            || If either sub-expression is null, the 
result of the equality comparison will be null ||
- || Matches || If either the string being matched against or the string 
defining the match is null, the result will be null ||
+ || Matches                       || If either the string being matched 
against or the string defining the match is null, the result will be null ||
- || Is null || returns true if the tested value is null (duh!) ||
+ || Is null                       || returns true if the tested value is null 
(duh!) ||
- || Arithmetic operators, concat || If either sub-expression is null, the 
resulting expression is null ||
+ || Arithmetic operators, concat  || If either sub-expression is null, the 
resulting expression is null ||
- || Size || If the tested object is null, size will return null ||
+ || Size                          || If the tested object is null, size will 
return null ||
  || Dereference of a map or tuple || If the dereferenced map or tuple is null, 
the result will be null. ||
- || Cast || Casting a null from one type to another will result in a null ||
+ || Cast                          || Casting a null from one type to another 
will result in a null ||
- || Aggregate || Aggregate functions will ignore nulls, just as they do in 
SQL.  This requires changes to the existing implementations. ||
+ || Aggregate                     || Aggregate functions will ignore nulls, 
just as they do in SQL.  This requires changes to the existing implementations. 
||
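The rules in the table above can be sketched in Python, with `None` standing in for a pig null. This is only an illustration of the SQL-style semantics described here, not pig's implementation; all function names are hypothetical, and returning null for an aggregate over no non-null values is an assumption borrowed from SQL.

```python
def null_safe_add(left, right):
    """Arithmetic: a null operand makes the whole expression null."""
    if left is None or right is None:
        return None
    return left + right


def null_safe_equals(left, right):
    """Equality: comparing against null yields null, not True or False."""
    if left is None or right is None:
        return None
    return left == right


def is_null(value):
    """IS NULL: the one test that returns a real boolean for a null."""
    return value is None


def sum_ignoring_nulls(values):
    """Aggregate: nulls are skipped rather than poisoning the result."""
    present = [value for value in values if value is not None]
    if not present:
        # Assumption: like SQL, an aggregate over no non-null values is null.
        return None
    return sum(present)
```

Note that `null_safe_equals(None, None)` is null rather than true, which is why a separate IS NULL test is needed.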
  
  '''Generating Nulls'''
  
@@ -220, +203 @@

     * A user defined function returning an error for a given row without 
returning a fatal error.
     * A null value in the input data.
     * Dereference of a map key that does not exist in a map.  For example, 
given the map info containing [name#fred, phone#5551212], if the user does 
info#address, a null will be returned.
-    * Dereference of a tuple into a non-existent field.  While this may be 
suprising in that an invalid dereference would seem to be an error, pig's free 
flowing data model requires that not all tuples in a relation have the same 
number of entries, which may lead to some tuple dereferences into non-existent 
fields.
+    * Dereference of a tuple into a non-existent field.  While this may be 
surprising in that an invalid dereference would seem to be an error, pig's free 
form data model requires that not all tuples in a relation have the same number 
of entries, which may lead to some tuple dereferences into non-existent fields.
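The two dereference cases in the list above can be sketched in Python, again with `None` standing in for null. The helper names `map_lookup` and `tuple_field` are hypothetical; the map contents come from the example in the list.

```python
def map_lookup(mapping, key):
    """Dereference a map by key; a missing key or a null map gives null."""
    if mapping is None:
        return None
    return mapping.get(key)


def tuple_field(record, position):
    """Dereference a tuple by position; a non-existent field gives null."""
    if record is None:
        return None
    if 0 <= position < len(record):
        return record[position]
    return None


info = {"name": "fred", "phone": "5551212"}
short_row = ("fred",)
```

Here `map_lookup(info, "address")` and `tuple_field(short_row, 1)` both produce null instead of raising an error, which lets relations hold tuples of differing lengths.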
  
  [[Anchor(Expression_Operators)]]
  == Expression Operators ==
@@ -255, +238 @@

   * not yet:  The use of this operator with this/these types is valid, but 
will not be initially implemented.
   * yes:  This operator is supported with this/these types.
  
+ A note about the use of bytearray.  In most cases bytearray types can be
+ coerced into other types.  This is to maintain backward compatibility with
+ current pig, and to maintain easy use of the language without requiring casts
+ everywhere in the code.
+ 
  [[Anchor(Comparitors)]]
  === Comparitors ===
  Comparitors are currently "perlish" in that they require the user to define
  the type of the operands.  If the user wishes to compare two operands
  numerically, = = is used, whereas `eq` is used for comparing two operands as
  strings.  The existing string comparitors `eq`, `ne`, `lt`, `lte`, `gt`, 
`gte` will
+ continue to be supported, but they should be considered deprecated.
+ It will now be legal to use = = , ! = , etc. with data of type chararray.
- continue to be supported.  But they will only be necessary when working with
- data of type unknown.  It will now be legal to use = = , ! = , etc. with data 
of
- type chararray.
  
  [[Anchor(Numeric_Equals_and_Notequals)]]
  ==== Numeric Equals and Notequals ====
- ||                || '''bag''' || '''tuple''' || '''map''' || '''int''' || 
'''long''' || '''float''' || '''double''' || '''chararray''' || '''unknown''' ||
+ ||                 || '''bag''' || '''tuple''' || '''map''' || '''int''' || 
'''long''' || '''float''' || '''double''' || '''chararray''' || '''bytearray''' 
||
- || '''bag'''          || error || error   || error || error || error  || 
error   || error    || error       || error     ||
+ || '''bag'''       || error     || error       || error     || error     || 
error      || error       || error        || error           || error     ||
- || '''tuple'''        ||       || Tuple A is equal to tuple B iff they have 
the same size s, and for all 0 <= i < s A[i] = = B[i] || error || error || 
error  || error   || error    || error       || error     ||
+ || '''tuple'''     ||           || Tuple A is equal to tuple B iff they have 
the same size s, and for all 0 <= i < s A[i] = = B[i] || error || error || 
error  || error   || error    || error       || error     ||
- || '''map'''          ||       ||         || Map A is equal to map B iff A 
and B have the same number of entries, and for every key k1 in A with a value 
of v1, there is a key k2 in B with a value of v2, such that k1 = = k2 and v1 = 
= v2 || error || error  || error   || error    || error       || error     ||
+ || '''map'''       ||           ||             || Map A is equal to map B iff 
A and B have the same number of entries, and for every key k1 in A with a value 
of v1, there is a key k2 in B with a value of v2, such that k1 = = k2 and v1 = 
= v2 || error || error  || error   || error    || error       || error     ||
- || '''int'''          ||       ||         ||       || yes   || yes    || yes  
   || yes      || error       || as int    ||
+ || '''int'''       ||           ||             ||           || yes       || 
yes        || yes         || yes          || error           || as int    ||
- || '''long'''         ||       ||         ||       ||       || yes    || yes  
   || yes      || error       || as long   ||
+ || '''long'''      ||           ||             ||           ||           || 
yes        || yes         || yes          || error           || as long   ||
- || '''float'''        ||       ||         ||       ||       ||        || yes  
   || yes      || error       || as float  ||
+ || '''float'''     ||           ||             ||           ||           ||   
         || yes         || yes          || error           || as float  ||
- || '''double'''       ||       ||         ||       ||       ||        ||      
   || yes      || error       || as double ||
+ || '''double'''    ||           ||             ||           ||           ||   
         ||             || yes          || error           || as double ||
- || '''chararray'''    ||       ||         ||       ||       ||        ||      
   ||          || Two chararrays can be compared if they have the same 
encoding.  Given a chararray A with encoding X (where X is not none) and 
chararray B with encoding none, chararray B can be encoded using X, and the 
comparison done.  || as chararray with encoding none ||
- || '''unknown'''      ||       ||         ||       ||       ||        ||      
   ||          ||             || as double ||
+ || '''chararray''' ||           ||             ||           ||           ||   
         ||             ||              || yes             || as chararray ||
+ || '''bytearray''' ||           ||             ||           ||           ||   
         ||             ||              ||                 || yes       ||
  
  [[Anchor(String_equality_and_Inequality_Comparators.)]]
  ==== String equality and Inequality Comparators. ====
- These include `eq` `ne` `lt` `lte` `gt` `gte` .
+ These include `eq` `ne` `lt` `lte` `gt` `gte` .  They are only valid for use 
with chararray and bytearray types, and should be considered deprecated.
  
+ ||                 || '''chararray'''          || '''bytearray''' ||
- These are only valid for use with chararray and unknown types.  They are only 
required when comparing two unknowns, and the user wishes to
- force the comparison to be done as chararrays.
- 
- ||                || '''chararray'''              || '''unknown'''            
           ||
- || '''chararray'''    || uses = = , etc. function || as chararray with 
encoding none ||
+ || '''chararray''' || uses = = , etc. function || as chararray    ||
- || '''unknown'''      ||                          || as chararray with 
encoding none ||
+ || '''bytearray''' ||                          || yes             ||
  
  
  [[Anchor(Numeric_Inequality_Operators_Except_Notequals)]]
  ==== Numeric Inequality Operators Except Notequals ====
- ||                || '''bag''' || '''tuple''' || '''map''' || '''int''' || 
'''long''' || '''float''' || '''double''' || '''chararray''' || '''unknown''' ||
+ ||                 || '''bag''' || '''tuple''' || '''map''' || '''int''' || 
'''long''' || '''float''' || '''double''' || '''chararray''' || '''bytearray''' 
||
- || '''bag'''          || error || error   || error || error || error  || 
error   || error    || error       || error     ||
- || '''tuple'''        ||       || error   || error || error || error  || 
error   || error    || error       || error     ||
- || '''map'''          ||       ||         || error || error || error  || 
error   || error    || error       || error     ||
- || '''int'''          ||       ||         ||       || yes   || yes    || yes  
   || yes      || error       || as int    ||
+ || '''bag'''       || error     || error       || error     || error     || 
error      || error       || error        || error           || error     ||
+ || '''tuple'''     ||           || error       || error     || error     || 
error      || error       || error        || error           || error     ||
+ || '''map'''       ||           ||             || error     || error     || 
error      || error       || error        || error           || error     ||
+ || '''int'''       ||           ||             ||           || yes       || 
yes        || yes         || yes          || error           || as int    ||
- || '''long'''         ||       ||         ||       ||       || yes    || yes  
   || yes      || error       || as long   ||
+ || '''long'''      ||           ||             ||           ||           || 
yes        || yes         || yes          || error           || as long   ||
- || '''float'''        ||       ||         ||       ||       ||        || yes  
   || yes      || error       || as float  ||
+ || '''float'''     ||           ||             ||           ||           ||   
         || yes         || yes          || error           || as float  ||
- || '''double'''       ||       ||         ||       ||       ||        ||      
   || yes      || error       || as double ||
+ || '''double'''    ||           ||             ||           ||           ||   
         ||             || yes          || error           || as double ||
- || '''chararray'''    ||       ||         ||       ||       ||        ||      
   ||          || Two chararrays can be compared if they have the same 
encoding.  Given a chararray A with encoding X (where X is not none) and 
chararray B with encoding none, chararray B can be encoded using X, and the 
comparison done.  || as chararray ||
- || '''unknown'''      ||       ||         ||       ||       ||        ||      
   ||          ||             || as double ||
+ || '''chararray''' ||           ||             ||           ||           ||   
         ||             ||              || yes             || as chararray ||
+ || '''bytearray''' ||           ||             ||           ||           ||   
         ||             ||              ||                 || yes       ||
  
  
  [[Anchor(MATCHES)]]
  ==== MATCHES ====
- ||                || '''chararray''' || '''unknown'''                       ||
- || '''chararray'''    || Two chararrays can be compared if they have the same 
encoding.  Given a chararray A with encoding X (where X is not none) and 
chararray B with encoding none, chararray B can be encoded using X, and the 
comparison done.  || as chararray with encoding none ||
- || '''unknown'''      ||             || as chararray with encoding none ||
+ ||                 || '''chararray''' || '''bytearray''' ||
+ || '''chararray''' || yes             || as chararray    ||
+ || '''bytearray''' ||                 || yes             ||
  
  [[Anchor(IS_NULL_)]]
  ==== IS NULL  ====
  This is a new operator.
  
+ This operator will be added to check if a datum is null.  It can be applied 
to any data type.
- This operator will be added to check if a datum is null.  For SQL 
compatibility the syntax IS NOT NULL will also be supported.  It can be
- applied to any data type.
  
  [[Anchor(Binary_Operators)]]
  === Binary Operators ===
@@ -324, +307 @@

  ==== Multiplication and Division ====
  These are the operators `*` and `/` .
  
- ||                || '''bag''' || '''tuple''' || '''map''' || '''int''' || 
'''long''' || '''float''' || '''double''' || '''chararray''' || '''unknown''' ||
+ ||                 || '''bag''' || '''tuple''' || '''map''' || '''int''' || 
'''long''' || '''float''' || '''double''' || '''chararray''' || '''bytearray''' 
||
- || '''bag'''          || error || error   || error || not yet || not yet || 
not yet || not yet || error     || as double, not yet ||
- || '''tuple'''        ||       || error   || error || not yet || not yet || 
not yet || not yet || error     || error     ||
- || '''map'''          ||       ||         || error || error || error  || 
error   || error    || error       || error     ||
+ || '''bag'''       || error     || error       || error     || not yet   || 
not yet    || not yet     || not yet      || error           || error           
||
+ || '''tuple'''     ||           || error       || error     || not yet   || 
not yet    || not yet     || not yet      || error           || error           
||
+ || '''map'''       ||           ||             || error     || error     || 
error      || error       || error        || error           || error           
||
- || '''int'''          ||       ||         ||       || yes   || yes    || yes  
   || yes      || error       || as int    ||
+ || '''int'''       ||           ||             ||           || yes       || 
yes        || yes         || yes          || error           || as int          
||
- || '''long'''         ||       ||         ||       ||       || yes    || yes  
   || yes      || error       || as long   ||
+ || '''long'''      ||           ||             ||           ||           || 
yes        || yes         || yes          || error           || as long         
||
- || '''float'''        ||       ||         ||       ||       ||        || yes  
   || yes      || error       || as float  ||
+ || '''float'''     ||           ||             ||           ||           ||   
         || yes         || yes          || error           || as float        ||
- || '''double'''       ||       ||         ||       ||       ||        ||      
   || yes      || error       || as double ||
+ || '''double'''    ||           ||             ||           ||           ||   
         ||             || yes          || error           || as double       ||
- || '''chararray'''    ||       ||         ||       ||       ||        ||      
   ||          || error       || error     ||
- || '''unknown'''      ||       ||         ||       ||       ||        ||      
   ||          ||             || as double ||
+ || '''chararray''' ||           ||             ||           ||           ||   
         ||             ||              || error           || error           ||
+ || '''bytearray''' ||           ||             ||           ||           ||   
         ||             ||              ||                 || error           ||
  
  [[Anchor(Modulo)]]
  ==== Modulo ====
- This is a new operator, `%` .
+ This is a new operator, `%`.  It is valid only for integer types.
  
+ ||                 || '''int''' || '''long''' || '''bytearray''' ||
+ || '''int'''       || yes       || yes        || as int          ||
+ || '''long'''      ||           || yes        || as long         ||
+ || '''bytearray''' ||           ||            || error           ||
- ||                || '''bag''' || '''tuple''' || '''map''' || '''int''' || 
'''long''' || '''float''' || '''double''' || '''chararray''' || '''unknown''' ||
- || '''bag'''          || error || error   || error || error || error  || 
error   || error    || error       || error     ||
- || '''tuple'''        ||       || not yet || error || error || error  || 
error   || error    || error       || as tuple, not yet ||
- || '''map'''          ||       ||         || error || error || error  || 
error   || error    || error       || error     ||
- || '''int'''          ||       ||         ||       || yes   || yes    || 
error   || error    || error       || as int    ||
- || '''long'''         ||       ||         ||       ||       || yes    || 
error   || error    || error       || as long   ||
- || '''float'''        ||       ||         ||       ||       ||        || 
error   || error    || error       || error     ||
- || '''double'''       ||       ||         ||       ||       ||        ||      
   || error    || error       || error     ||
- || '''chararray'''    ||       ||         ||       ||       ||        ||      
   ||          || error       || error     ||
- || '''unknown'''      ||       ||         ||       ||       ||        ||      
   ||          ||             || as long   ||
  
  We may choose not to implement the mod operator right away, as there are no
  immediate user requests for it.
@@ -358, +335 @@

  ==== Addition and Subtraction ====
  These are the operators `+` and `-` .
  
- ||                || '''bag''' || '''tuple''' || '''map''' || '''int''' || 
'''long''' || '''float''' || '''double''' || '''chararray''' || '''unknown''' ||
+ ||                 || '''bag''' || '''tuple''' || '''map''' || '''int''' || 
'''long''' || '''float''' || '''double''' || '''chararray''' || '''bytearray''' 
||
- || '''bag'''          || error || error   || error || error || error  || 
error   || error    || error       || error     ||
- || '''tuple'''        ||       || not yet || error || error || error  || 
error   || error    || error       || as tuple, not yet ||
- || '''map'''          ||       ||         || error || error || error  || 
error   || error    || error       || error     ||
+ || '''bag'''       || error     || error       || error     || error     || 
error      || error       || error        || error           || error           
||
+ || '''tuple'''     ||           || not yet     || error     || error     || 
error      || error       || error        || error           || error           
||
+ || '''map'''       ||           ||             || error     || error     || 
error      || error       || error        || error           || error           
||
- || '''int'''          ||       ||         ||       || yes   || yes    || yes  
   || yes      || error       || as int    ||
+ || '''int'''       ||           ||             ||           || yes       || 
yes        || yes         || yes          || error           || as int          
||
- || '''long'''         ||       ||         ||       ||       || yes    || yes  
   || yes      || error       || as long   ||
+ || '''long'''      ||           ||             ||           ||           || 
yes        || yes         || yes          || error           || as long         
||
- || '''float'''        ||       ||         ||       ||       ||        || yes  
   || yes      || error       || as float  ||
+ || '''float'''     ||           ||             ||           ||           ||   
         || yes         || yes          || error           || as float        ||
- || '''double'''       ||       ||         ||       ||       ||        ||      
   || yes      || error       || as double ||
+ || '''double'''    ||           ||             ||           ||           ||   
         ||             || yes          || error           || as double       ||
- || '''chararray'''    ||       ||         ||       ||       ||        ||      
   ||          || error       || error     ||
- || '''unknown'''      ||       ||         ||       ||       ||        ||      
   ||          ||             || as double ||
+ || '''chararray''' ||           ||             ||           ||           ||   
         ||             ||              || error           || error           ||
+ || '''bytearray''' ||           ||             ||           ||           ||   
         ||             ||              ||                 || error           ||
  
  [[Anchor(Concat)]]
  === Concat ===
  This is a new operator.
  
  A new operator concat will be added for chararrays.
- ||                || '''chararray''' || '''unknown''' ||
+ ||                 || '''chararray''' || '''bytearray''' ||
+ || '''chararray''' || yes             || as chararray    ||
+ || '''bytearray''' ||                 || yes             ||
- || '''chararray'''    || Two char arrays can be concatenated if they have the 
same encoding.  Given a char array A with encoding X (where X is not none) and 
char array B with encoding none, char array B can be encoded using encoding X, 
and the concatenation then done.  The resulting chararrays will have encoding 
X.  || as chararray ||
- || '''unknown'''      ||             || as chararray(none) ||
- 
  
  [[Anchor(Unary_Operators)]]
  === Unary Operators ===
@@ -393, +369 @@

  || '''float'''         || yes   ||
  || '''double'''        || yes   ||
  || '''chararray'''     || error ||
- || '''unknown'''       || as double ||
+ || '''bytearray'''     || as double ||
  
  [[Anchor(Size_)]]
  ==== Size  ====
@@ -408, +384 @@

  || '''float'''         || returns 1 ||
  || '''double'''        || returns 1 ||
  || '''chararray'''     || returns number of characters in the array ||
- || '''unknown'''       || as chararray(none) ||
+ || '''bytearray'''     || as number of bytes in the array ||
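The difference between the chararray and bytearray rows above can be sketched in Python, where `str` stands in for chararray and `bytes` for bytearray. Only the rows visible in this part of the table are covered; the function name `size` and the use of UTF-8 for the byte form are assumptions made for the illustration.

```python
def size(value):
    """Size per the visible table rows, with None standing in for null."""
    if value is None:
        return None  # size of a null is null
    if isinstance(value, str):
        return len(value)  # chararray: number of characters
    if isinstance(value, bytes):
        return len(value)  # bytearray: number of bytes
    if isinstance(value, float):
        return 1  # float and double: always 1
    raise TypeError("type not covered by this sketch")


text = "na\u00efve"             # five characters, one of them non-ASCII
raw = text.encode("utf-8")      # assumption: UTF-8 chosen for the byte form
```

The same data gives `size(text)` of 5 but `size(raw)` of 6, so declaring a field as chararray rather than leaving it as bytearray can change the result of size.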
  
  [[Anchor(Tuple_Dereference_Operator)]]
  ==== Tuple Dereference Operator ====
  This is the operator `.` .
  
  For tuples, dereferencing into the tuple via . will continue to be supported.  This dereferencing can be done via name or position.  For example `mytuple.myfield`
- and `mytuple.$0` are both valid dereferences.  If the dot operator is applied to an unkonwn type, the type will be assumed to be a tuple.
+ and `mytuple.$0` are both valid dereferences.  If the dot operator is applied to a bytearray, the bytearray will be assumed to be a tuple.
  
  [[Anchor(Map_Dereference_Operator)]]
  ==== Map Dereference Operator ====
  This is the operator `#` .
  
  For maps, dereferencing into the map via # will continue to be supported.  This dereferencing must be done by key, for example `mymap#mykey`
- If the pound operator is applied to an unkonwn type, the type will be assumed to be a map.
+ If the pound operator is applied to a bytearray, the bytearray will be assumed to be a map.
  
  [[Anchor(Cast_Operators)]]
  === Cast Operators ===
- Casts will be added to the language.  Casts will only be supported between atomic types.
+ Casts will be added to the language.
  
  C/Java like syntax will be used for casts (e.g. `(int)mydouble`).
  
- Cast operators.  Applies only to atomic types
- ||                || '''to'''  ||||||||||||||
- || '''from'''         || '''int''' || '''long''' || '''float''' || '''double''' || '''chararray''' || '''unknown''' ||
- || '''int'''          ||       || yes    || yes     || yes      || yes || error     ||
- || '''long'''         || yes   ||        || yes     || yes      || yes || error     ||
- || '''float'''        || yes   || yes    ||         || yes      || yes || error     ||
- || '''double'''       || yes   || yes    || yes     ||          || yes || error     ||
- || '''chararray'''    || yes   || yes    || yes     || yes      || Casts between different character encodings || error     ||
- || '''unknown'''      || yes   || yes    || yes     || yes      || yes ||           ||
+ Cast operators.
+ ||                 || '''to'''  ||             ||           ||           ||            ||             ||              ||                 ||                 ||
+ || '''from'''      || '''bag''' || '''tuple''' || '''map''' || '''int''' || '''long''' || '''float''' || '''double''' || '''chararray''' || '''bytearray''' ||
+ || '''bag'''       ||           || error       || error     || error     || error      || error       || error        || error           || not yet         ||
+ || '''tuple'''     || error     ||             || error     || error     || error      || error       || error        || error           || not yet         ||
+ || '''map'''       || error     || error       ||           || error     || error      || error       || error        || error           || not yet         ||
+ || '''int'''       || error     || error       || error     ||           || yes        || yes         || yes          || yes             || yes             ||
+ || '''long'''      || error     || error       || error     || yes       ||            || yes         || yes          || yes             || yes             ||
+ || '''float'''     || error     || error       || error     || yes       || yes        ||             || yes          || yes             || yes             ||
+ || '''double'''    || error     || error       || error     || yes       || yes        || yes         ||              || yes             || yes             ||
+ || '''chararray''' || error     || error       || error     || yes       || yes        || yes         || yes          ||                 || yes             ||
+ || '''bytearray''' || yes       || yes         || yes       || yes       || yes        || yes         || yes          || yes             ||                 ||
  
  Downcasts may cause loss of data (e.g., casting from long to int may drop bits).
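
Since the spec borrows C/Java cast syntax, Java's own narrowing conversions give a concrete picture of the data loss a downcast can cause. This sketch assumes Pig's casts behave like Java's, which the spec does not state explicitly. The class and method names are hypothetical.

```java
// Hypothetical illustration of lossy downcasts, using plain Java casts.
final class DowncastDemo {
    // long -> int keeps only the low 32 bits.
    static int longToInt(long value) {
        return (int) value;
    }

    // double -> int truncates toward zero, dropping the fraction.
    static int doubleToInt(double value) {
        return (int) value;
    }

    // double -> float rounds to the nearest representable 32-bit value.
    static float doubleToFloat(double value) {
        return (float) value;
    }

    public static void main(String[] args) {
        System.out.println(longToInt(4294967297L));     // 2^32 + 1 becomes 1
        System.out.println(doubleToInt(3.99));          // 3
        System.out.println(doubleToFloat(0.1) == 0.1);  // false: precision was lost
    }
}
```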
  
@@ -448, +427 @@

  
  In the following chart, the entry `error` indicates that applying that aggregate function to a field of this type is an error.  A data type (e.g. long) indicates that applying that aggregate function to a field of this type returns the indicated data type.
- ||                || '''int''' || '''long''' || '''float''' || '''double''' || '''chararray''' || '''unknown''' ||
+ ||        || '''int''' || '''long''' || '''float''' || '''double''' || '''chararray''' || '''bytearray''' ||
- || COUNT          || long  || long   || long    || long     || long        || long      ||
+ || COUNT  || long      || long       || long        || long         || long            || long            ||
- || SUM            || long  || long   || double  || double   || error       || as double ||
+ || SUM    || long      || long       || double      || double       || error           || as double       ||
- || AVG            || long  || long   || double  || double   || error       || as double ||
+ || AVG    || long      || long       || double      || double       || error           || as double       ||
- || MIN            || int   || long   || float   || double   || chararray   || as double ||
+ || MIN    || int       || long       || float       || double       || chararray       || bytearray       ||
- || MAX            || int   || long   || float   || double   || chararray   || as double ||
+ || MAX    || int       || long       || float       || double       || chararray       || bytearray       ||
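
The chart widens SUM over int fields to long. The spec does not give a reason, but one plausible motivation is overflow: a sum of many 32 bit values can exceed the 32 bit range. The Java sketch below illustrates that effect only. It is not Pig's SUM implementation, and the class and method names are hypothetical.

```java
// Hypothetical illustration of why SUM over int might return long.
final class SumDemo {
    // Accumulates in a long, mirroring the "int -> long" entry for SUM.
    static long sumAsLong(int[] values) {
        long total = 0L;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    // Accumulates in an int; wraps around silently on overflow.
    static int sumAsInt(int[] values) {
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        int[] values = {Integer.MAX_VALUE, 1};
        System.out.println(sumAsLong(values)); // 2147483648
        System.out.println(sumAsInt(values));  // -2147483648 (wrapped)
    }
}
```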
  
  [[Anchor(Argument_Construction_for_Functions)]]
  === Argument Construction for Functions ===
