[ 
https://issues.apache.org/jira/browse/PIG-565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12895835#action_12895835
 ] 

Olga Natkovich commented on PIG-565:
------------------------------------

DIFF handles all the types as expected. However, the documentation in the 0.7.0 
release is slightly off. 

Current docs: "The DIFF function compares two fields in a tuple. If the field 
values match, null is returned. If the field values do not match, the 
non-matching elements are returned."

Should say something like: 

"DIFF takes two bags as arguments and compares them. Any tuples that are in 
one bag but not the other are returned in a bag. If the bags match, an empty bag 
is returned. If the fields are not bags, then they will be wrapped in tuples 
and returned in a bag if they do not match, or an empty bag will be returned if 
the two records match. The implementation assumes that both bags being passed 
to this function will fit entirely into memory simultaneously. If that is not 
the case, the UDF will still function, but it will be very slow."
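
To make the proposed wording concrete, here is a minimal Python sketch of the 
behaviour it describes. It is only an illustrative model, not Pig's actual 
implementation: the function name diff is hypothetical, a Python list stands 
in for a bag, a Python tuple stands in for a Pig tuple, and the order of the 
returned tuples is an arbitrary choice of this sketch (bags are unordered).

```python
def diff(left, right):
    """Illustrative model of the DIFF semantics described above (assumption:
    lists model bags, tuples model Pig tuples; this is not Pig's code)."""
    # Non-bag fields: wrap each value in a tuple and return both in a bag
    # if they differ; return an empty bag if they match.
    if not isinstance(left, list) or not isinstance(right, list):
        return [] if left == right else [(left,), (right,)]

    # Bags: both are held in memory at once as sets, mirroring the stated
    # assumption that both bags fit entirely into memory.
    left_set, right_set = set(left), set(right)

    # Keep the tuples that appear in one bag but not in the other.
    only_left = [t for t in left if t not in right_set]
    only_right = [t for t in right if t not in left_set]
    return only_left + only_right


if __name__ == "__main__":
    print(diff([(1,), (2,)], [(2,), (3,)]))  # tuples unique to either bag
    print(diff([(1,)], [(1,)]))              # matching bags give an empty bag
    print(diff(4, 5))                        # non-bag fields that differ
```

In this model, matching inputs always yield an empty bag rather than null, 
which is the main point the current documentation gets wrong.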

I will reassign this bug to Corinne once I am done with it.




> Several built-in functions no longer support bytearray
> ------------------------------------------------------
>
>                 Key: PIG-565
>                 URL: https://issues.apache.org/jira/browse/PIG-565
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.2.0
>            Reporter: Olga Natkovich
>            Assignee: Olga Natkovich
>             Fix For: 0.8.0
>
>
> ARITY
> DIFF
> TOKENIZE
> All we need to do is to add lookup tables.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
