[Pig Wiki] Update of PigStreamingFunctionalSpec by XuZhang

2008-03-24 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Pig Wiki for change 
notification.

The following page has been changed by XuZhang:
http://wiki.apache.org/pig/PigStreamingFunctionalSpec

--
  If `ship` and `cache` options are not specified, pig will attempt to ship the 
binary in the following way:
  
 * If the first word of the streaming command is `perl` or `python`, pig 
will assume that the binary is the first string it encounters that does not 
start with a dash.
-* Otherwise, pig will attempt to ship the first string from the command 
line as long as it does not come from `/bin, /user/bin, /user/local/bin`. It 
will determine that by scanning the path if an absolute path is provided or by 
executing `which`. The paths can be made configurable via `set stream.skippath 
paths` option.
+* Otherwise, pig will attempt to ship the first string from the command 
line as long as it does not come from `/bin, /usr/bin, /usr/local/bin`. It will 
determine that by scanning the path if an absolute path is provided or by 
executing `which`. The paths can be made configurable via `set stream.skippath 
paths` option.
  
  To prevent a command from being shipped, an empty list can be passed to 
the `ship` clause.
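
The auto-ship rules above can be sketched as follows. This is only an illustration of the heuristic as the spec words it, not Pig's implementation: the names `candidate_to_ship` and `DEFAULT_SKIP_PATHS` and the injectable `which` parameter are hypothetical, and returning nothing when `which` cannot resolve the command is an assumption, since the spec does not cover that case.

```python
import posixpath
import shlex
import shutil

# Default directories whose binaries are never shipped (from the spec text).
# The spec says these can be overridden via `set stream.skippath`.
DEFAULT_SKIP_PATHS = ("/bin", "/usr/bin", "/usr/local/bin")


def candidate_to_ship(command, skip_paths=DEFAULT_SKIP_PATHS, which=shutil.which):
    """Return the file the heuristic would ship, or None if nothing is shipped.

    Hypothetical helper: `which` is injectable so the lookup can be stubbed.
    """
    words = shlex.split(command)
    if not words:
        return None

    # Rule 1: for perl/python, ship the first argument that is not an option.
    if words[0] in ("perl", "python"):
        for word in words[1:]:
            if not word.startswith("-"):
                return word
        return None

    # Rule 2: otherwise consider the first word of the command line.
    first = words[0]
    resolved = first if posixpath.isabs(first) else which(first)
    if resolved is None:
        # Assumption: an unresolvable command is not shipped.
        return None

    # Skip binaries that live in one of the well-known system directories.
    if posixpath.dirname(resolved) in skip_paths:
        return None
    return resolved
```

For example, `candidate_to_ship("perl -w script.pl")` picks `script.pl`, while a command that resolves to `/usr/bin/awk` is left alone because it is expected to exist on every node.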
  
@@ -191, +191 @@

  
   1. !DefaultSerializer, !DefaultDeserializer as described above (This is 
going to be PigStorage)
   2. !PythonSerializer, !PythonDeserializer 
-  3. !BinarSerailzie, !BinaryDeserializer - treats the entire file as a byte 
stream - no formatting or interpretation.
+  3. !BinarySerializer, !BinaryDeserializer - treats the entire file as a byte 
stream - no formatting or interpretation.
  
  Each deserializer will implement the `LoadFunc` interface, and each 
serializer will implement the `StoreFunc` interface. The `StoreFunc` interface 
will be extended with a `void flatten() throws OperationNotSupportedException;` 
method that indicates that the data needs to be flattened before it is 
serialized. A class can choose not to support this functionality and throw 
an exception.
  
@@ -237, +237 @@

  Y = stream X through Z;
  }}}
  
- This tells pig that the streaming application stored its complete output into 
a file called `outputfile` in the task's working directory and that the content 
of that file should be serialized into Y using !MySerializer. 
+ This tells pig that the streaming application stored its complete output into 
a file called `outputfile` in the task's working directory and that the content 
of that file should be deserialized into Y using MyDeserializer. 
  
  A user can specify multiple outputs, but only the first one will be 
automatically loaded; the rest will be stored in DFS, using the file name 
specified in the output as an absolute path:
  


[Pig Wiki] Update of PigStreamingFunctionalSpec by XuZhang

2008-03-24 Thread Apache Wiki
The following page has been changed by XuZhang:
http://wiki.apache.org/pig/PigStreamingFunctionalSpec

--
  Y = stream X through Z;
  }}}
  
- This tells pig that the streaming application stored its complete output into 
a file called `outputfile` in the task's working directory and that the content 
of that file should be deserialized into Y using MyDeserializer. 
+ This tells pig that the streaming application stored its complete output into 
a file called `outputfile` in the task's working directory and that the content 
of that file should be deserialized into Y using `MyDeserializer`. 
  
  A user can specify multiple outputs, but only the first one will be 
automatically loaded; the rest will be stored in DFS, using the file name 
specified in the output as an absolute path: