[ https://issues.apache.org/jira/browse/PIG-732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12696669#action_12696669 ]
Santhosh Srinivasan commented on PIG-732:
-----------------------------------------

Review comments for the outputSchema method in the UDFs

Index: contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/evaluation/util/SearchQuery.java
==========================================================================

SearchQuery returns a String, so the outputSchema method should return a Schema with a single column of type CHARARRAY. You could use one of the following two approaches:

1. If you wish to call the column "query", then use the following:

{code}
Schema s = new Schema();
s.add(new Schema.FieldSchema("query", DataType.CHARARRAY));
return s;
{code}

2. If you wish to use a generated name, then use the following:

{code}
Schema s = new Schema();
s.add(new Schema.FieldSchema(getSchemaName(this.getClass().getName().toLowerCase(), input), DataType.CHARARRAY));
return s;
{code}

The relevant portion of the patch is shown below.

{code}
+    @Override
+    public Schema outputSchema(Schema input) {
+        try {
+            Schema s = new Schema();
+            s.add(new Schema.FieldSchema("query", DataType.CHARARRAY));
+            return new Schema(new Schema.FieldSchema(getSchemaName(this.getClass()
+                    .getName().toLowerCase(), input), s, DataType.CHARARRAY));
+        } catch (Exception e) {
+            return null;
+        }
+    }
+}
{code}

> Utility UDFs
> -------------
>
>                 Key: PIG-732
>                 URL: https://issues.apache.org/jira/browse/PIG-732
>             Project: Pig
>          Issue Type: New Feature
>            Reporter: Ankur
>            Priority: Minor
>         Attachments: udf.v1.patch, udf.v2.patch, udf.v3.patch
>
>
> Two utility UDFs and their respective test cases.
> 1. TopN - Accepts number of tuples (N) to retain in output, field number
> (type long) to use for comparison, and a sorted/unsorted bag of tuples. It
> outputs a bag containing top N tuples.
> 2. SearchQuery - Accepts an encoded URL from any of the 4 search engines
> (Yahoo, Google, AOL, Live) and extracts and normalizes the search query
> present in it.
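
For readers without the patch at hand, here is a rough, standalone sketch of the kind of String-to-String extraction the SearchQuery description implies. It is not the code from udf.v3.patch and does not use the Pig API. The class name SearchQuerySketch, the method extractQuery, the parameter names ("q", "p", "query"), and the normalization rules (trim, collapse whitespace, lowercase) are all assumptions made for illustration.

{code}
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of piggybank.
final class SearchQuerySketch {

    // Assumed parameter names that carry the search terms; not taken from the patch.
    private static final List<String> QUERY_PARAMS = Arrays.asList("q", "p", "query");

    // Returns the normalized search query, or null if none can be extracted.
    static String extractQuery(String url) {
        if (url == null) {
            return null;
        }
        int qmark = url.indexOf('?');
        if (qmark < 0) {
            return null;
        }
        String queryString = url.substring(qmark + 1);
        int hash = queryString.indexOf('#');
        if (hash >= 0) {
            // Drop any fragment that follows the query string.
            queryString = queryString.substring(0, hash);
        }
        for (String pair : queryString.split("&")) {
            int eq = pair.indexOf('=');
            if (eq <= 0 || !QUERY_PARAMS.contains(pair.substring(0, eq))) {
                continue;
            }
            try {
                // Decodes '+' and %XX escapes.
                String decoded = URLDecoder.decode(pair.substring(eq + 1), "UTF-8");
                // Assumed normalization: trim, collapse whitespace runs, lowercase.
                String normalized = decoded.trim()
                        .replaceAll("\\s+", " ")
                        .toLowerCase(Locale.ROOT);
                return normalized.isEmpty() ? null : normalized;
            } catch (UnsupportedEncodingException | IllegalArgumentException e) {
                // Malformed escape sequence: treat as "no query".
                return null;
            }
        }
        return null;
    }
}
{code}

Since a helper like this yields a single String per input, it matches the single CHARARRAY column that the review comment asks outputSchema to declare.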