[jira] Commented: (PIG-597) Pig does not handle correctly the case where * is passed to UDF

2009-01-13 Thread Shravan Matthur Narayanamurthy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663306#action_12663306
 ] 

Shravan Matthur Narayanamurthy commented on PIG-597:


The exception is being thrown from ARITY where it is trying to convert the 
first field of the tuple into a tuple. However, since we have a star, the tuple 
is not wrapped inside another tuple and hence the exception.
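
A minimal sketch of the failure described above, using plain Java lists as 
stand-ins for Pig tuples. The class and method names here are hypothetical and 
this is not the actual ARITY source; it only illustrates why casting the first 
field fails once the star has been flattened into the argument tuple.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StarArgDemo {
    // Hypothetical stand-in for ARITY: assumes field 0 of the argument
    // tuple is itself a tuple and returns that inner tuple's size.
    public static int arityOfFirstField(List<Object> args) {
        // Throws ClassCastException when field 0 is a plain value.
        List<?> inner = (List<?>) args.get(0);
        return inner.size();
    }

    // Argument tuple when the row is wrapped: (row)
    public static List<Object> wrapped(List<Object> row) {
        List<Object> args = new ArrayList<>();
        args.add(row);
        return args;
    }

    // Argument tuple when the star is flattened: the row's fields are
    // copied directly into the argument tuple.
    public static List<Object> flattened(List<Object> row) {
        return new ArrayList<>(row);
    }

    public static void main(String[] unused) {
        // Strings stand in for the DataByteArray fields of a loaded row.
        List<Object> row = Arrays.<Object>asList("a", "b", "c");

        System.out.println(arityOfFirstField(wrapped(row)));
        try {
            arityOfFirstField(flattened(row));
        } catch (ClassCastException e) {
            System.out.println("cast failed: first field is not a tuple");
        }
    }
}
```

In the wrapped case the cast succeeds because field 0 is the row itself; in the 
flattened case field 0 is a single column value, which mirrors the 
"DataByteArray cannot be cast to Tuple" error in the report.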

This was done to model the trunk behavior, in which there is an implicit 
flatten in front of a *. If we want to retain this behavior, we need to change 
ARITY and the other functions that were written with the assumption that 
POUserFunc will wrap anything inside a tuple. However, most of these functions 
will then be useless when we have a UDF which outputs a tuple. For example, say 
we have a function which returns a tuple and we want to find its arity: 
ARITY(TupleRetUDF(*)) will always return one, since POUserFunc will wrap the 
output of TupleRetUDF into another tuple and ARITY will have been changed to 
return just the size of the input tuple and not the size of the first field.
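
The ARITY(TupleRetUDF(*)) example can be sketched as follows. This is an 
illustration only: Java lists stand in for tuples, and tupleRetUdf, wrap, 
arityOfInput and arityOfFirstField are hypothetical names, not Pig APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class NestedArityDemo {
    // Hypothetical UDF that returns a three-field tuple.
    public static List<Object> tupleRetUdf() {
        return Arrays.<Object>asList("x", "y", "z");
    }

    // Models the wrapping step: the UDF result becomes field 0 of a new
    // one-field argument tuple.
    public static List<Object> wrap(Object value) {
        List<Object> args = new ArrayList<>();
        args.add(value);
        return args;
    }

    // ARITY variant that counts the fields of the argument tuple itself.
    public static int arityOfInput(List<Object> args) {
        return args.size();
    }

    // ARITY variant that counts the fields of the first field.
    public static int arityOfFirstField(List<Object> args) {
        return ((List<?>) args.get(0)).size();
    }

    public static void main(String[] unused) {
        List<Object> args = wrap(tupleRetUdf());
        System.out.println(arityOfInput(args));      // 1, whatever the UDF returns
        System.out.println(arityOfFirstField(args)); // 3, the UDF result's size
    }
}
```

Counting the argument tuple gives one for any wrapped UDF result, which is why 
that variant stops being useful for nested calls.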

However, if we comment out this code, then we need to modify FindQuantiles to 
account for the fact that everything will be wrapped inside a tuple, and the 
behavior is no longer conditional upon the use of a star. I think this is 
better, and Olga seems to agree as per her previous comment. Any other 
thoughts? Retain trunk behavior or change it?

 Pig does not handle correctly the case where * is passed to UDF
 --

 Key: PIG-597
 URL: https://issues.apache.org/jira/browse/PIG-597
 Project: Pig
  Issue Type: Bug
Reporter: Olga Natkovich
Assignee: Shravan Matthur Narayanamurthy

 Script:
 ==
 A = LOAD 'foo' USING PigStorage('\t');
 B = FILTER A BY ARITY(*)  5;
 DUMP B;
 Error:
 =
 2009-01-05 21:46:56,355 [main] ERROR
 org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc
 - Caught error from UDF
 org.apache.pig.builtin.ARITY[org.apache.pig.data.DataByteArray cannot be cast 
 to org.apache.pig.data.Tuple [org.apache.pig.data.DataByteArray cannot be 
 cast to org.apache.pig.data.Tuple]
 Problem:
 ===
 Santhosh tracked this to the following code in POUserFunc.java:
 if(op instanceof POProject &&
         op.getResultType() == DataType.TUPLE){
     POProject projOp = (POProject)op;
     if(projOp.isStar()){
         Tuple trslt = (Tuple) temp.result;
         Tuple rslt = (Tuple) res.result;
         for(int i=0;i<trslt.size();i++)
             rslt.append(trslt.get(i));
         continue;
     }
 }
 It seems to be unwrapping the tuple before passing it to the function. There 
 are no comments, so we are not sure why it is there; we will need to run tests 
 to see if removing it would solve this issue and not create others.
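
The two behaviors under discussion can be compared side by side in a small 
sketch. It assumes Java lists in place of Pig tuples; buildArgs and its 
flattenStar flag are hypothetical and only model the effect of keeping or 
removing the star branch quoted above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ArgAssemblyDemo {
    // Builds the argument tuple handed to a UDF from one projected input.
    // With flattenStar set, a star projection is unwrapped field by field,
    // as in the quoted branch; otherwise the input is appended as a single
    // field.
    public static List<Object> buildArgs(List<Object> projected,
                                         boolean isStar,
                                         boolean flattenStar) {
        List<Object> res = new ArrayList<>();
        if (isStar && flattenStar) {
            for (int i = 0; i < projected.size(); i++) {
                res.add(projected.get(i));
            }
        } else {
            res.add(projected);
        }
        return res;
    }

    public static void main(String[] unused) {
        List<Object> row = Arrays.<Object>asList("a", "b", "c");
        System.out.println(buildArgs(row, true, true).size());  // 3 fields
        System.out.println(buildArgs(row, true, false).size()); // 1 field
    }
}
```

With the branch in place the UDF sees the row's columns directly; without it 
the UDF sees one field holding the whole row, which is the shape ARITY expects.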

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-597) Pig does not handle correctly the case where * is passed to UDF

2009-01-12 Thread Olga Natkovich (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663008#action_12663008
 ] 

Olga Natkovich commented on PIG-597:


Commenting out this code breaks ORDER BY. I believe that this change is still 
correct and the sampling code needs to be changed to work with it.

