Apache Wiki
Tue, 17 Nov 2009 19:30:39 -0800
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Pig Wiki" for change notification.
The "PigStreamingFunctionalSpec" page has been changed by MarcioSilva.
The comment on this change is: correcting what appears to be a typo..
http://wiki.apache.org/pig/PigStreamingFunctionalSpec?action=diff&rev1=47&rev2=48

--------------------------------------------------

  Streaming can have three separate meaning in the context of Pig project:

   1. A specific way of submitting jobs to Hadoop: Hadoop Streaming
-  2. A form of processing in which the entire portion of the dataset that corresponds to a task in sent to the task and output streams out. There is no temporal or causal correspondence between an input record and specific output records.
+  2. A form of processing in which the entire portion of the dataset that corresponds to a task is sent to the task and output streams out. There is no temporal or causal correspondence between an input record and specific output records.
   3. The use of non-Java functions with Pig.

  The goal of Pig with respect to streaming is to support #2 for (a)Java UDFs, (b)non-Java UDFs and (c)user specified binaries/scripts. We will start with (c) since it would be most beneficial for the users. It is not our goal to be feature-by-feature compatible with Hadoop streaming as it is too open-ended and might force us to implement features that we don't necessarily want in Pig.
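
Editorial illustration, not part of the wiki page or of this change: the second meaning of streaming can be sketched as a user-specified script of kind (c) that consumes every record in its task's portion of the data and then emits output records that do not map one-to-one onto input records. The function name `stream_task`, the tab-separated record layout, and the choice of counting keys are all assumptions made for this example; nothing here reflects how Pig actually invokes such a script.

```python
from collections import Counter


def stream_task(records):
    """Consume all input records for one task, then stream out results.

    Hypothetical example: counts how often each first tab-separated
    field occurs. The record layout is an assumption, not a Pig format.
    """
    counts = Counter()
    for record in records:
        # Take the first tab-separated field as the key.
        key = record.rstrip("\n").split("\t", 1)[0]
        counts[key] += 1

    # Output is produced only after the whole portion has been read,
    # so no output record corresponds to one specific input record.
    for key in sorted(counts):
        yield f"{key}\t{counts[key]}"


# Three input records yield two output records.
for line in stream_task(["a\t1\n", "b\t2\n", "a\t3\n"]):
    print(line)
```

In a standalone script the same function could be fed `sys.stdin` instead of a list, so that records arrive on standard input and results leave on standard output.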