Dear Wiki user, You have subscribed to a wiki page or wiki category on "Pig Wiki" for change notification.
The following page has been changed by OlgaN:
http://wiki.apache.org/pig/PigStreamingFunctionalSpec

------------------------------------------------------------------------------
S = stream A through `stream.pl`;
}}}

- In the example above, `DefaultSerializer` is used that takes tuples out of A and converts them into tab delimitted lines that are passed to `stream.pl`. If A was a result of a grouping operation, the `DefaultSerializer` would also flatten the data. The output of streaming is processed by `DefaultDeserializer` one line at a time and split on tabs.
+ In the example above, default serialize (!PigStorage) is used that takes tuples out of A and converts them into tab delimitted lines that are passed to `stream.pl`. The output of streaming is processed by default deserializer (!PigStorage) one line at a time and split on tabs.

The user would be able to provide an alternative delimiter to default (de)serializer via `define command`:

{{{
- define X `stream.pl` input(stdin using DefaultSerializer('^A')) output (stdout using DefaultDeserializer('^A'));
+ define X `stream.pl` input(stdin using PigStorage('^A')) output (stdout using PigStorage('^A'));
S = stream A through X;
}}}

@@ -209, +209 @@

S = stream A through X;
}}}

- The following serializers/deserializer will be part of pig distribution:
+ In addition to !PigStorage the following serializers/deserializer will be part of pig distribution:

-  1. !DefaultSerializer, !DefaultDeserializer as described above (This is going to be PigStorage)
+  1. !BinarySerializer, !BinaryDeserializer - treats the entire file as byte stream - no formating or interpretation.
   2. !PythonSerializer, !PythonDeserializer
-  3. !BinarySerializer, !BinaryDeserializer - treats the entire file as byte stream - no formating or interpretation.

Each deserializer will be implementing `LoadFunc` interface. Each serializer will be implementing `StoreFunc` interface.
The `StoreFunc` interface will be extended with a `void flatten() throws OperationNotSupportedException;` method that indicates that the data needs to be flattened before it is serialized. A class can choose not to support this functionality and throw an exception.
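
As a rough illustration of the serializer contract described above, the following Java sketch shows a serializer that turns a tuple into a delimited line and declines to support flattening. It is not Pig's actual API: `FlattenableStoreFunc`, `DelimitedSerializer`, and the `serialize` method are hypothetical names invented for this example, a tuple is modelled as a plain `List`, and the nested `OperationNotSupportedException` is a local stand-in because the text does not say which package the real exception comes from.

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

class DelimitedSerializerSketch {

    // Local stand-in for the exception named in the text (assumption: checked exception).
    static class OperationNotSupportedException extends Exception {
        OperationNotSupportedException(String message) {
            super(message);
        }
    }

    // Hypothetical, reduced view of a StoreFunc-like interface with the proposed flatten() method.
    interface FlattenableStoreFunc {
        // Converts one tuple (modelled here as a List of fields) into one output line.
        String serialize(List<?> tuple);

        // Requests that data be flattened before serialization; implementations may refuse.
        void flatten() throws OperationNotSupportedException;
    }

    // Joins the fields of a tuple with a configurable delimiter, one tuple per line.
    static class DelimitedSerializer implements FlattenableStoreFunc {
        private final String delimiter;

        DelimitedSerializer(String delimiter) {
            this.delimiter = delimiter;
        }

        @Override
        public String serialize(List<?> tuple) {
            StringJoiner line = new StringJoiner(delimiter);
            for (Object field : tuple) {
                line.add(String.valueOf(field));
            }
            return line.toString();
        }

        @Override
        public void flatten() throws OperationNotSupportedException {
            // This sketch chooses not to support flattening.
            throw new OperationNotSupportedException("flatten is not supported by this serializer");
        }
    }

    public static void main(String[] args) {
        DelimitedSerializer serializer = new DelimitedSerializer(",");
        System.out.println(serializer.serialize(Arrays.asList("a", 1, "b"))); // prints a,1,b
        try {
            serializer.flatten();
        } catch (OperationNotSupportedException e) {
            System.out.println("flatten refused: " + e.getMessage());
        }
    }
}
```

A caller that needs flattened data would call `flatten()` first and fall back to some other strategy if the exception is thrown; how Pig itself would react to the refusal is not specified in the text above.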