[ https://issues.apache.org/jira/browse/PIG-2375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13172003#comment-13172003 ]

Prashant Kommireddi commented on PIG-2375:
------------------------------------------

The problem does not seem to be that the incorrect outputSchema() is invoked. Rather, the root/parent UDF is always instantiated before the actual overriding UDF is invoked: getFieldSchema() (from UserFuncExpression) is invoked on the root UDF before it is called on the overriding UDF.

{code}
@Override
public LogicalSchema.LogicalFieldSchema getFieldSchema() throws FrontendException {
    if (fieldSchema != null)
        return fieldSchema;
    LogicalSchema inputSchema = new LogicalSchema();
    List<Operator> succs = plan.getSuccessors(this);

    if (succs != null) {
        for (Operator lo : succs) {
            if (((LogicalExpression) lo).getFieldSchema() == null) {
                inputSchema = null;
                break;
            }
            inputSchema.addField(((LogicalExpression) lo).getFieldSchema());
        }
    }

    // Since ef only set one time, we never change its value, so we can optimize it by instantiate only once.
    // This significantly optimize the performance of frontend (PIG-1738)
    if (ef == null)
        ef = (EvalFunc<?>) PigContext.instantiateFuncFromSpec(mFuncSpec);

    ef.setUDFContextSignature(signature);
    Properties props = UDFContext.getUDFContext().getUDFProperties(ef.getClass());
    if (Util.translateSchema(inputSchema) != null)
        props.put("pig.evalfunc.inputschema." + signature, Util.translateSchema(inputSchema));
    // Store inputSchema into the UDF context
    ef.setInputSchema(Util.translateSchema(inputSchema));

    //WHY DOES THIS NEED TO BE CALLED ON THE EVALFUNC THAT IS NOT USED
    Schema udfSchema = ef.outputSchema(Util.translateSchema(inputSchema));

    if (udfSchema != null) {
        Schema.FieldSchema fs;
        if (udfSchema.size() == 0) {
    . . . .
{code}

Why would getFieldSchema() need to be invoked on the root UDF when exec() actually needs to be invoked on an overriding EvalFunc?

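To make the call order in question concrete, here is a minimal toy model in plain Java. It is not Pig code: Udf, RootUdf, OverridingUdf, resolve() and the string "schemas" are hypothetical stand-ins for EvalFunc, the root UDF, LogFieldValues and getArgToFuncMapping(). It only contrasts asking the root for its output schema before the mapping is applied with asking after the mapping has selected an implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class OverloadSchemaSketch {
    /** Records which UDF had outputSchema() called, in call order. */
    static final List<String> CALLS = new ArrayList<>();

    /** Hypothetical stand-in for EvalFunc; schemas are plain strings here. */
    interface Udf {
        String outputSchema(String inputSignature);
    }

    /** Stand-in for the root/parent UDF (return value is arbitrary). */
    static class RootUdf implements Udf {
        public String outputSchema(String inputSignature) {
            CALLS.add("RootUdf");
            return "chararray";
        }
    }

    /** Stand-in for the overriding UDF (return value is arbitrary). */
    static class OverridingUdf implements Udf {
        public String outputSchema(String inputSignature) {
            CALLS.add("OverridingUdf");
            return "tuple";
        }
    }

    /** Plays the role of getArgToFuncMapping(): input signature -> implementation. */
    static Udf resolve(String inputSignature) {
        Map<String, Udf> mapping = new HashMap<>();
        mapping.put("(tuple,chararray)", new RootUdf());
        mapping.put("(tuple,tuple)", new OverridingUdf());
        Udf udf = mapping.get(inputSignature);
        if (udf == null) {
            throw new IllegalArgumentException("No UDF for " + inputSignature);
        }
        return udf;
    }

    /** Order described in the comment: the root is asked before the mapping is applied. */
    static List<String> schemaBeforeResolution(String inputSignature) {
        CALLS.clear();
        new RootUdf().outputSchema(inputSignature);
        resolve(inputSignature).outputSchema(inputSignature);
        return new ArrayList<>(CALLS);
    }

    /** Alternative order: apply the mapping first, then ask only the selected UDF. */
    static List<String> schemaAfterResolution(String inputSignature) {
        CALLS.clear();
        resolve(inputSignature).outputSchema(inputSignature);
        return new ArrayList<>(CALLS);
    }

    public static void main(String[] args) {
        System.out.println(schemaBeforeResolution("(tuple,tuple)"));
        System.out.println(schemaAfterResolution("(tuple,tuple)"));
    }
}
```

In this model, a root outputSchema() that assumes a (tuple, chararray) input would see the (tuple, tuple) signature in the first ordering but never in the second. Whether Pig can defer the schema call this way is an open question here, not something the sketch establishes.
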
> Incorrect outputSchema is invoked when overloading UDF in 0.9.1
> ---------------------------------------------------------------
>
>                 Key: PIG-2375
>                 URL: https://issues.apache.org/jira/browse/PIG-2375
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.9.1
>            Reporter: Prashant Kommireddi
>            Assignee: Prashant Kommireddi
>             Fix For: 0.9.1
>
>         Attachments: LogFieldValue.java, LogFieldValues.java
>
>
> When overloading a UDF with getArgToFuncMapping() the parent/root UDF outputSchema() is being called.
> {code}
> @Override
> public List<FuncSpec> getArgToFuncMapping() throws FrontendException {
>     List<FuncSpec> funcList = new ArrayList<FuncSpec>();
>     Schema s = new Schema();
>     s.add(new Schema.FieldSchema(null, DataType.TUPLE));
>     s.add(new Schema.FieldSchema(null, DataType.CHARARRAY));
>     funcList.add(new FuncSpec(this.getClass().getName(), s));
>     Schema s1 = new Schema();
>     s1.add(new Schema.FieldSchema(null, DataType.TUPLE));
>     s1.add(new Schema.FieldSchema(null, DataType.TUPLE));
>     funcList.add(new FuncSpec(LogFieldValues.class.getName(), s1));
>     return funcList;
> }
> {code}
> In the above function, "LogFieldValues" is used when the input is (tuple, tuple) but the outputSchema() is invoked from the root UDF.