Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Pig Wiki" for change 
notification.

The following page has been changed by CorinneC:
http://wiki.apache.org/pig/PigTutorial

------------------------------------------------------------------------------
   * Uses the GROUP command to group records by n-gram only.
   * Calls the !ScoreGenerator UDF to calculate a "popularity" score for the 
n-gram.
   * Uses the GENERATE command to assign names to the fields.
-  * Uses the FILTER command to move all records with a score less than or 
equal to 2.0.
+  * Uses the [http://wiki.apache.org/pig/PigLatin#FILTER:_Getting_rid_of_data_you_are_not_interested_in_ FILTER] command to move all records with a score less than or equal to 2.0.
   * Uses the ORDER command to sort the remaining records by hour and score.
   * Uses the !PigStorage function to store the results. The output file 
contains a list of n-grams with the following fields: '''hour''', '''ngram''', 
'''score''', '''count''', '''mean''' 
  
@@ -124, +124 @@

   * Uses the GROUP command to group the records by n-gram and hour. 
   * Uses the COUNT function to get the count (occurrences) of each n-gram. 
   * Uses the GENERATE command to assign names to the fields.
-  * Uses the FILTER command to get the n-grams for hour ‘00’ 
-  * Uses the FILTER command to get the n-grams for hour ‘12’ 
+  * Uses the [http://wiki.apache.org/pig/PigLatin#FILTER:_Getting_rid_of_data_you_are_not_interested_in_ FILTER] command to get the n-grams for hour ‘00’ 
+  * Uses the [http://wiki.apache.org/pig/PigLatin#FILTER:_Getting_rid_of_data_you_are_not_interested_in_ FILTER] command to get the n-grams for hour ‘12’ 
   * Uses the JOIN command to join the n-grams in hour “00” and  hour 
“12” by field $0
   * Uses the COUNT function to get the count (occurrences) of the n-grams in 
both “00” and “12” 
   * Uses the !PigStorage function to store the results. The output file 
contains a list of n-grams with the following fields: '''hour''', 
'''count00''', '''count12'''
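
As a rough illustration of the GROUP, COUNT, FILTER, and JOIN steps listed in the second hunk above, here is a minimal Python sketch. It is not the tutorial's Pig script: the function name `compare_hours`, the `(ngram, hour)` input shape, and the returned tuple layout are assumptions made for this example only.

```python
from collections import Counter


def compare_hours(records, first="00", second="12"):
    """Hypothetical helper: records is an iterable of (ngram, hour) pairs."""
    # GROUP by (ngram, hour) and COUNT the occurrences of each pair.
    counts = Counter(records)

    # FILTER: keep only the n-gram counts for each hour of interest.
    first_counts = {ngram: n for (ngram, hour), n in counts.items() if hour == first}
    second_counts = {ngram: n for (ngram, hour), n in counts.items() if hour == second}

    # JOIN on the n-gram: keep only n-grams that occur in both hours.
    shared = first_counts.keys() & second_counts.keys()
    return sorted((ngram, first_counts[ngram], second_counts[ngram]) for ngram in shared)


# Assumed sample data, not taken from the tutorial's query log.
sample = [
    ("pig latin", "00"),
    ("pig latin", "00"),
    ("pig latin", "12"),
    ("hadoop", "00"),
    ("hive", "12"),
]
result = compare_hours(sample)
```

Only n-grams seen in both hours survive the join, so `result` holds a single row for "pig latin" with its count in each hour.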
