The following page has been changed by OlgaN:
http://wiki.apache.org/pig/PigStreamingFunctionalSpec

------------------------------------------------------------------------------
  S = stream A through `stream.pl`;
  }}}
  
- In the example above, `DefaultSerializer` is used that takes tuples out of A 
and converts them into tab delimited lines that are passed to `stream.pl`. If 
A was a result of a grouping operation, the `DefaultSerializer` would also 
flatten the data. The output of streaming is processed by `DefaultDeserializer` 
one line at a time and split on tabs. 
+ In the example above, the default serializer (!PigStorage) takes tuples out 
of A and converts them into tab-delimited lines that are passed to 
`stream.pl`. The default deserializer (!PigStorage) processes the output of 
streaming one line at a time, splitting each line on tabs. 
  
  The user can provide an alternative delimiter to the default 
(de)serializer via the `define` command:
  
  {{{
- define X `stream.pl` input(stdin using DefaultSerializer('^A')) output 
(stdout using DefaultDeserializer('^A'));
+ define X `stream.pl` input(stdin using PigStorage('^A')) output (stdout using 
PigStorage('^A'));
  S = stream A through X;
  }}}
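
The delimiter handling described above can be sketched outside of Pig. The snippet below is only an illustration in Python, not Pig code: the helper names `serialize` and `deserialize` are hypothetical, and it assumes that `'^A'` in the `define` example denotes the Ctrl-A character (0x01). It shows tuples being written as delimited lines and the streamed output being split back into fields one line at a time.

```python
# Hypothetical helpers for illustration only; these names are not part of Pig.
CTRL_A = "\x01"  # assumption: '^A' in the define example means Ctrl-A (0x01)


def serialize(tuples, delimiter="\t"):
    """Write each tuple as one delimited line (tab by default)."""
    return [delimiter.join(str(field) for field in t) for t in tuples]


def deserialize(lines, delimiter="\t"):
    """Split each line on the delimiter, one line at a time.

    Fields come back as strings; no type information survives the round trip.
    """
    return [tuple(line.rstrip("\n").split(delimiter)) for line in lines]


rows = [("alice", 1), ("bob", 2)]
lines = serialize(rows, CTRL_A)           # what the streaming command would read
round_trip = deserialize(lines, CTRL_A)   # what comes back from its output
```

Both sides must agree on the delimiter, which is why the `define` example names it for `input` and for `output`.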
  
@@ -209, +209 @@

  S = stream A through X;
  }}}
  
- The following serializers/deserializer will be part of pig distribution:
+ In addition to !PigStorage, the following serializers/deserializers will be 
part of the Pig distribution:
  
-  1. !DefaultSerializer, !DefaultDeserializer as described above (This is 
going to be PigStorage)
+  1. !BinarySerializer, !BinaryDeserializer - treats the entire file as a byte 
stream - no formatting or interpretation.
   2. !PythonSerializer, !PythonDeserializer 
-  3. !BinarySerializer, !BinaryDeserializer - treats the entire file as byte 
stream - no formatting or interpretation.
  
  Each deserializer will implement the `LoadFunc` interface. Each serializer 
will implement the `StoreFunc` interface. The `StoreFunc` interface will be 
extended with a `void flatten() throws OperationNotSupportedException;` method 
that indicates that the data needs to be flattened before it is 
serialized. A class can choose not to support this functionality and throw 
an exception.
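
The opt-out contract for `flatten()` can be sketched as follows. The real interfaces are Java, so this Python version is only an illustration of the control flow. The class names `TabSerializer` and `RawBytesSerializer`, the helper `request_flatten`, and the exception class are all hypothetical stand-ins, and the sketch does not model what flattening does to the data.

```python
class OperationNotSupportedError(Exception):
    """Stand-in for the OperationNotSupportedException named in the spec."""


class TabSerializer:
    """Hypothetical serializer that accepts a flatten request."""

    def __init__(self):
        self.flatten_requested = False

    def flatten(self):
        # Record that data must be flattened before it is serialized.
        self.flatten_requested = True


class RawBytesSerializer:
    """Hypothetical serializer that chooses not to support flattening."""

    def flatten(self):
        raise OperationNotSupportedError("flattening is not supported")


def request_flatten(serializer):
    """Ask for flattening; report whether the serializer accepted the request."""
    try:
        serializer.flatten()
        return True
    except OperationNotSupportedError:
        return False
```

A caller that needs flattened data can check the result of `request_flatten` up front and fail early, rather than discovering the limitation while data is being written.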
  
