[
https://issues.apache.org/jira/browse/PIG-732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12696669#action_12696669
]
Santhosh Srinivasan commented on PIG-732:
-----------------------------------------
Review comments for the outputSchema method in the UDFs
Index:
contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/evaluation/util/SearchQuery.java
==========================================================================
SearchQuery returns a String, so its outputSchema method should return a Schema
with a single column of type CHARARRAY. You could use either of the following
two approaches:
1. If you wish to name the column "query", use the following:
{code}
Schema s = new Schema();
s.add(new Schema.FieldSchema("query", DataType.CHARARRAY));
return s;
{code}
2. If you wish to use a generated name, use the following:
{code}
Schema s = new Schema();
s.add(new Schema.FieldSchema(
    getSchemaName(this.getClass().getName().toLowerCase(), input),
    DataType.CHARARRAY));
return s;
{code}
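The relevant portion of the patch is shown below. Note that it builds the
schema s as in approach 1, but then returns a new Schema that wraps s inside
another CHARARRAY field instead of returning s directly.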
{code}
+    @Override
+    public Schema outputSchema(Schema input) {
+        try {
+            Schema s = new Schema();
+            s.add(new Schema.FieldSchema("query", DataType.CHARARRAY));
+            return new Schema(new Schema.FieldSchema(getSchemaName(this.getClass()
+                    .getName().toLowerCase(), input), s, DataType.CHARARRAY));
+        } catch (Exception e) {
+            return null;
+        }
+    }
+}
{code}
> Utility UDFs
> -------------
>
> Key: PIG-732
> URL: https://issues.apache.org/jira/browse/PIG-732
> Project: Pig
> Issue Type: New Feature
> Reporter: Ankur
> Priority: Minor
> Attachments: udf.v1.patch, udf.v2.patch, udf.v3.patch
>
>
> Two utility UDFs and their respective test cases.
> 1. TopN - Accepts number of tuples (N) to retain in output, field number
> (type long) to use for comparison, and a sorted/unsorted bag of tuples. It
> outputs a bag containing the top N tuples.
> 2. SearchQuery - Accepts an encoded URL from any of the 4 search engines
> (Yahoo, Google, AOL, Live) and extracts and normalizes the search query
> present in it.
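
For readers without the patch at hand, the behaviour described for the two
UDFs can be sketched in plain Java. This is only an illustration, not the code
in the attached patches: it uses no Pig types, the class and method names
(UtilityUdfSketch, topN, extractQuery) are hypothetical, "top" is assumed to
mean largest value first, the query parameter names ("q" and "p") are assumed,
and "normalize" is assumed to mean lower-casing and collapsing whitespace.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URLDecoder;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;
import java.util.PriorityQueue;

/** Plain-Java sketch of the two utilities; hypothetical names, no Pig types. */
final class UtilityUdfSketch {

    /** Keeps the n rows with the largest value in column `field`, largest first. */
    static List<long[]> topN(int n, int field, List<long[]> rows) {
        if (n <= 0) {
            return new ArrayList<>();
        }
        Comparator<long[]> byField = Comparator.comparingLong(r -> r[field]);
        // Min-heap holding at most n rows: the smallest retained row is at the head.
        PriorityQueue<long[]> heap = new PriorityQueue<>(byField);
        for (long[] row : rows) {
            heap.add(row);
            if (heap.size() > n) {
                heap.poll(); // evict the current smallest row
            }
        }
        List<long[]> out = new ArrayList<>(heap);
        out.sort(byField.reversed());
        return out;
    }

    /** Returns the normalized search query found in `url`, or null if there is none. */
    static String extractQuery(String url) {
        if (url == null) {
            return null;
        }
        String rawQuery;
        try {
            rawQuery = URI.create(url).getRawQuery();
        } catch (IllegalArgumentException e) {
            return null; // not a parseable URI
        }
        if (rawQuery == null) {
            return null;
        }
        for (String pair : rawQuery.split("&")) {
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                continue;
            }
            String key = pair.substring(0, eq);
            // Assumed parameter names: "q" (Google, AOL, Live) and "p" (Yahoo).
            if (key.equals("q") || key.equals("p")) {
                try {
                    String decoded = URLDecoder.decode(pair.substring(eq + 1), "UTF-8");
                    // Assumed normalization: trim, lower-case, collapse whitespace runs.
                    return decoded.trim().toLowerCase(Locale.ROOT).replaceAll("\\s+", " ");
                } catch (UnsupportedEncodingException | IllegalArgumentException e) {
                    return null; // malformed percent-encoding
                }
            }
        }
        return null;
    }
}
```

With these assumptions,
extractQuery("http://www.google.com/search?q=Apache+Pig&hl=en") returns
"apache pig". The bounded heap in topN keeps memory proportional to N rather
than to the size of the input bag, which is why the input does not need to be
sorted beforehand.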