Santhosh Srinivasan commented on PIG-732:

Review comments for the outputSchema method in the UDFs


SearchQuery returns a String, so its outputSchema method should return a Schema 
with a single column of type CHARARRAY. You could use one of the following two 
approaches:

1. If you wish to name the column "query", use the following:
Schema s = new Schema();
s.add(new Schema.FieldSchema("query", DataType.CHARARRAY));
return s;

2. If you wish to use a generated name, use the following:
Schema s = new Schema();
s.add(new Schema.FieldSchema(getSchemaName(this.getClass().getName().toLowerCase(),
input), DataType.CHARARRAY));
return s;

The relevant portion of the patch is shown below.

+  @Override
+  public Schema outputSchema(Schema input) {
+    try {
+      Schema s = new Schema();
+      s.add(new Schema.FieldSchema("query", DataType.CHARARRAY));
+      return new Schema(new Schema.FieldSchema(getSchemaName(this.getClass()
+          .getName().toLowerCase(), input), s, DataType.CHARARRAY));
+    } catch (Exception e) {
+      return null;
+    }
+  }
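
To make the difference in shape concrete, here is a minimal sketch that uses
stand-in classes, not Pig's own Schema API. Shape, Field, and the generated
name "searchquery_1" are hypothetical names introduced only for illustration.
It contrasts the single flat CHARARRAY column requested above with the
wrapped structure the patch builds, where a CHARARRAY field also carries an
inner schema.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in types for illustration only; these are NOT Pig's Schema classes.
public class SchemaShapeSketch {

    static final String CHARARRAY = "chararray";

    // A field has an alias, a type, and optionally a nested schema.
    static final class Field {
        final String alias;
        final String type;
        final Shape nested; // null for a simple scalar column

        Field(String alias, String type, Shape nested) {
            this.alias = alias;
            this.type = type;
            this.nested = nested;
        }
    }

    // A schema is an ordered list of fields.
    static final class Shape {
        final List<Field> fields = new ArrayList<>();

        Shape add(Field f) {
            fields.add(f);
            return this;
        }

        // Renders the schema as text, recursing into nested schemas.
        String describe() {
            StringBuilder sb = new StringBuilder("(");
            for (int i = 0; i < fields.size(); i++) {
                Field f = fields.get(i);
                if (i > 0) sb.append(", ");
                sb.append(f.alias).append(": ").append(f.type);
                if (f.nested != null) sb.append(f.nested.describe());
            }
            return sb.append(")").toString();
        }
    }

    // Shape the review asks for: one flat chararray column.
    static Shape recommended() {
        return new Shape().add(new Field("query", CHARARRAY, null));
    }

    // Shape the patch builds: a chararray field that also carries an inner schema.
    static Shape asInPatch(String generatedName) {
        Shape inner = new Shape().add(new Field("query", CHARARRAY, null));
        return new Shape().add(new Field(generatedName, CHARARRAY, inner));
    }

    public static void main(String[] args) {
        System.out.println(recommended().describe());
        System.out.println(asInPatch("searchquery_1").describe());
    }
}
```

Running main prints the flat shape first and the wrapped shape second, which
shows the extra level that the two snippets above avoid.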

> Utility UDFs 
> -------------
>                 Key: PIG-732
>                 URL: https://issues.apache.org/jira/browse/PIG-732
>             Project: Pig
>          Issue Type: New Feature
>            Reporter: Ankur
>            Priority: Minor
>         Attachments: udf.v1.patch, udf.v2.patch, udf.v3.patch
> Two utility UDFs and their respective test cases.
> 1. TopN - Accepts the number of tuples (N) to retain in the output, the field 
> number (type long) to use for comparison, and a sorted/unsorted bag of tuples. 
> It outputs a bag containing the top N tuples.
> 2. SearchQuery - Accepts an encoded URL from any of the 4 search engines 
> (Yahoo, Google, AOL, Live) and extracts and normalizes the search query 
> present in it.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
