[ 
https://issues.apache.org/jira/browse/PIG-2375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13172003#comment-13172003
 ] 

Prashant Kommireddi commented on PIG-2375:
------------------------------------------

The problem does not seem to be that the incorrect outputSchema() is invoked.
Rather, the root/parent UDF is always instantiated before the actual overriding
UDF is invoked.

getFieldSchema() (from UserFuncExpression) is invoked on the root UDF before it
is called on the overriding UDF.

{code}
    @Override
    public LogicalSchema.LogicalFieldSchema getFieldSchema() throws FrontendException {
        if (fieldSchema!=null)
            return fieldSchema;
        
        LogicalSchema inputSchema = new LogicalSchema();
        List<Operator> succs = plan.getSuccessors(this);

        if (succs!=null) {
            for(Operator lo : succs){
                if (((LogicalExpression)lo).getFieldSchema()==null) {
                    inputSchema = null;
                    break;
                }
                inputSchema.addField(((LogicalExpression)lo).getFieldSchema());
            }
        }

        // Since ef only set one time, we never change its value, so we can
        // optimize it by instantiate only once.
        // This significantly optimize the performance of frontend (PIG-1738)
        if (ef==null)
            ef = (EvalFunc<?>) PigContext.instantiateFuncFromSpec(mFuncSpec);
        
        ef.setUDFContextSignature(signature);
        Properties props = UDFContext.getUDFContext().getUDFProperties(ef.getClass());
        if(Util.translateSchema(inputSchema)!=null)
            props.put("pig.evalfunc.inputschema."+signature, Util.translateSchema(inputSchema));
        // Store inputSchema into the UDF context
        ef.setInputSchema(Util.translateSchema(inputSchema));
        
//WHY DOES THIS NEED TO BE CALLED ON THE EVALFUNC THAT IS NOT USED
        Schema udfSchema = ef.outputSchema(Util.translateSchema(inputSchema));

        if (udfSchema != null) {
            Schema.FieldSchema fs;
            if(udfSchema.size() == 0) {
.
.
.
.

{code}

Why would getFieldSchema() need to be invoked on the root UDF when exec()
actually needs to be invoked on an overriding EvalFunc?
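
To make the call order in question concrete, here is a small stand-alone toy
model. It uses no Pig classes: ToyFunc, Root, Overload, resolveRootFirst and
resolveChosenOnly are all hypothetical names invented for illustration, and the
"chosen only" variant is just a contrast case, not a proposed patch. It only
shows the difference between asking the root function for its schema before
dispatching on argument types and asking the selected function alone.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class OverloadSketch {

    /** Toy stand-in for a UDF. This is NOT Pig's EvalFunc. */
    interface ToyFunc {
        // Records the call in the log and returns a made-up output type.
        String outputSchema(List<String> inputTypes, List<String> log);

        // Maps an argument-type signature to the function that handles it.
        Map<List<String>, ToyFunc> overloads();
    }

    /** Hypothetical overload that handles (tuple, tuple). */
    static class Overload implements ToyFunc {
        public String outputSchema(List<String> inputTypes, List<String> log) {
            log.add("Overload.outputSchema");
            return "bag";
        }

        public Map<List<String>, ToyFunc> overloads() {
            return new LinkedHashMap<>();
        }
    }

    /** Hypothetical root function that handles (tuple, chararray) itself. */
    static class Root implements ToyFunc {
        public String outputSchema(List<String> inputTypes, List<String> log) {
            log.add("Root.outputSchema");
            return "chararray";
        }

        public Map<List<String>, ToyFunc> overloads() {
            Map<List<String>, ToyFunc> mapping = new LinkedHashMap<>();
            mapping.put(Arrays.asList("tuple", "chararray"), this);
            mapping.put(Arrays.asList("tuple", "tuple"), new Overload());
            return mapping;
        }
    }

    /** Asks the root for its schema first, then dispatches (the order described above). */
    static List<String> resolveRootFirst(ToyFunc root, List<String> argTypes) {
        List<String> log = new ArrayList<>();
        root.outputSchema(argTypes, log); // called on the root regardless of argTypes
        ToyFunc chosen = root.overloads().getOrDefault(argTypes, root);
        chosen.outputSchema(argTypes, log);
        return log;
    }

    /** Contrast case: dispatches first, then asks only the chosen function. */
    static List<String> resolveChosenOnly(ToyFunc root, List<String> argTypes) {
        List<String> log = new ArrayList<>();
        ToyFunc chosen = root.overloads().getOrDefault(argTypes, root);
        chosen.outputSchema(argTypes, log);
        return log;
    }

    public static void main(String[] args) {
        List<String> argTypes = Arrays.asList("tuple", "tuple");
        System.out.println(resolveRootFirst(new Root(), argTypes));
        System.out.println(resolveChosenOnly(new Root(), argTypes));
    }
}
```

In this toy, the first call logs the root's outputSchema() before the
overload's, while the second logs only the overload's. Whether the frontend can
safely skip the root call depends on how the best-fit function is selected,
which the sketch does not attempt to model.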
                
> Incorrect outputSchema is invoked when overloading UDF in 0.9.1
> ---------------------------------------------------------------
>
>                 Key: PIG-2375
>                 URL: https://issues.apache.org/jira/browse/PIG-2375
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.9.1
>            Reporter: Prashant Kommireddi
>            Assignee: Prashant Kommireddi
>             Fix For: 0.9.1
>
>         Attachments: LogFieldValue.java, LogFieldValues.java
>
>
> When overloading a UDF with getArgToFuncMapping() the parent/root UDF 
> outputSchema() is being called. 
> {code}
>   @Override
>     public List<FuncSpec> getArgToFuncMapping() throws FrontendException {
>         List<FuncSpec> funcList = new ArrayList<FuncSpec>();
>         Schema s = new Schema();
>         s.add(new Schema.FieldSchema(null, DataType.TUPLE));
>         s.add(new Schema.FieldSchema(null, DataType.CHARARRAY));
>         funcList.add(new FuncSpec(this.getClass().getName(), s));
>         Schema s1 = new Schema();
>         s1.add(new Schema.FieldSchema(null, DataType.TUPLE));
>         s1.add(new Schema.FieldSchema(null, DataType.TUPLE));
>         funcList.add(new FuncSpec(LogFieldValues.class.getName(), s1));
>         return funcList;
>     }
> {code}
> In the above function, "LogFieldValues" is used when the input is (tuple, 
> tuple) but the outputSchema() is invoked from the root UDF.
