The "UDFsUsingScriptingLanguages" page has been changed by Aniket Mokashi.
http://wiki.apache.org/pig/UDFsUsingScriptingLanguages?action=diff&rev1=3&rev2=4

--------------------------------------------------

   '''schemaFunction''' defines a delegate function and is not registered with Pig.
   
  When no decorator is specified, Pig assumes the output datatype is bytearray and converts the output of the script function to bytearray. This is consistent with Pig's behavior for Java UDFs.
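
The difference between a decorated and an undecorated function can be sketched in plain Python, outside Pig. The `outputSchema` defined here is only a stand-in so the snippet runs on its own (inside Pig the decorator is supplied by the scripting engine); the attribute name `schema`, the helper `declared_schema`, and both sample functions are assumptions made for illustration.

```python
# Stand-in for Pig's outputSchema decorator so this sketch runs outside Pig.
# Assumption: it does nothing except record the schema string on the function.
def outputSchema(schema_str):
    def wrap(func):
        func.schema = schema_str
        return func
    return wrap


# Decorated: the output type is declared explicitly.
@outputSchema("word:chararray")
def helloworld():
    return 'Hello, World'


# Not decorated: per the paragraph above, Pig would treat the result as bytearray.
def square_plain(num):
    return num * num


def declared_schema(func):
    # Hypothetical helper: return the declared schema, or the bytearray default.
    return getattr(func, 'schema', 'bytearray')
```

In this sketch `declared_schema(helloworld)` yields the declared string, while `declared_schema(square_plain)` falls back to `bytearray`, mirroring the default described above.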
+ 
  ''Sample Schema String'' - y:{t:(word:chararray,num:long)}. The variable names inside the schema string are not used anywhere; they only make the syntax identifiable to the parser.
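
As a rough illustration of the shape this sample schema describes, the plain-Python sketch below returns a bag (a list) of two-field tuples. The function name `word_lengths` and the input data are made up; in a Pig script the function would carry the schema through a decorator, which is left out here so the snippet runs standalone. How Pig maps the Python integer onto `long` is not shown.

```python
# The sample schema string from the text: a bag y of tuples t,
# each tuple holding a chararray field and a long field.
SCHEMA = "y:{t:(word:chararray,num:long)}"


def word_lengths(words):
    # Hypothetical UDF body: one (word, num) tuple per input word.
    # In Pig this function would be decorated with the schema string above.
    return [(w, len(w)) for w in words]


bag = word_lengths(["pig", "latin"])
```

Each element of `bag` is a tuple with two fields, in the same order as the fields named in the schema string.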
  
  == Inline Scripts ==
@@ -92, +93 @@

  def percent(num, total):
    return num * 100 / total
  
- #CommaFormat-
+ ####################
+ # String Functions #
+ ####################
+ #commaFormat- format a number with commas, 12345-> 12,345
  @outputSchema("t:(numformat:chararray)")
  def commaFormat(num):
    return '{:,}'.format(num)
  
- ####################
- # String Functions #
- ####################
- 
+ #concatMult4- concatenate four words
+ @outputSchema("t:(concat:chararray)")
+ def concatMult4(word1, word2, word3, word4):
+   return word1+word2+word3+word4
  
  #######################
  # Data Type Functions #
  #######################
+ #collectBag- collect the elements of a bag into another bag
+ #This is a useful UDF after a group operation
+ @outputSchema("y:{t:(len:int,word:chararray)}")
+ def collectBag(bag):
+   outBag = []
+   for word in bag:
+     tup=(len(bag), word[1])
+     outBag.append(tup)
+   return outBag
  
+ # A few comments:
+ # Pig mandates that a bag be a bag of tuples; Python UDFs should follow this pattern.
+ # Tuples in Python are immutable, so appending to a tuple is not possible.
  
  }}}
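
The bag-of-tuples pattern and tuple immutability can both be checked in plain Python without Pig. This is a minimal sketch under stated assumptions: a grouped bag is modelled as a list of tuples, the sample data and the name `collect_words` are invented for illustration, and the schema decorator is omitted so the snippet runs standalone.

```python
def collect_words(bag):
    # Build the output bag as a list whose elements are all tuples.
    out_bag = []
    for tup in bag:
        # Tuples are immutable, so create a fresh tuple for each element
        # rather than trying to append to an existing one.
        out_bag.append((len(bag), tup[1]))
    return out_bag


# Made-up input: two tuples as they might look after a group operation.
grouped = [("a", "apple"), ("a", "avocado")]
result = collect_words(grouped)

# Appending to a tuple fails because tuples have no append method.
try:
    grouped[0].append("x")
    appended = True
except AttributeError:
    appended = False
```

The list grows through `append`, while each element stays a tuple, which matches the requirement that a bag contain only tuples.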
- 
  == Performance ==
  === Jython ===
  
