Jira issue: https://issues.apache.org/jira/browse/PIG-1841
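
For anyone landing on this thread later, here is a minimal Java sketch of how a tuple-size function could return 1 for every row. It is only an illustration, not the actual Pig source: the thread says TupleSize is the culprit but does not describe the bug, so the mistake modeled here (counting the UDF's arguments instead of the fields of the tuple passed as the first argument) is an assumption. The class and method names are hypothetical, and plain java.util lists stand in for Pig tuples.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TupleSizeSketch {

    // Hypothetical stand-in for a UDF's exec(): `input` holds the UDF
    // arguments, so the tuple handed to SIZE(a_tuple) sits in field 0.
    static int sizeOfFirstArgument(List<Object> input) {
        Object arg = input.get(0);
        return ((List<?>) arg).size();
    }

    // Assumed mistake: counting the arguments themselves. With a single
    // argument this is always 1, whatever the tuple contains.
    static int sizeOfArgumentList(List<Object> input) {
        return input.size();
    }

    public static void main(String[] args) {
        // Models the dumped value ((a,b,c,d,e)): one field holding a 5-field tuple.
        List<Object> aTuple = new ArrayList<>(Arrays.asList("a", "b", "c", "d", "e"));
        List<Object> input = new ArrayList<>();
        input.add(aTuple);

        System.out.println(sizeOfFirstArgument(input)); // prints 5
        System.out.println(sizeOfArgumentList(input));  // prints 1
    }
}
```

The second method reproduces the "always 1" symptom from the original question, while the first returns the field count one would expect from SIZE on a tuple.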
On Tue, Feb 1, 2011 at 8:59 PM, Daniel Dai <[email protected]> wrote:
> Oh, I am wrong. SIZE is the right UDF to use. The issue is caused by
> TupleSize, as Eric pointed out a moment ago.
>
> Daniel
>
>
> Dmitriy Ryaboy wrote:
>
>> Daniel, if that's actually the case we need to fix the javadocs. Cause
>> they are pretty explicit...
>>
>> /**
>> * Find the number of fields in a tuple. Expected input is a tuple,
>> * output is an integer.
>> * @deprecated Use {@link SIZE} instead.
>> */
>> public class ARITY extends EvalFunc<Integer> {
>>
>> On Tue, Feb 1, 2011 at 12:10 PM, Daniel Dai <[email protected]> wrote:
>>
>>
>>> You cannot get the size of a tuple using SIZE. Use ARITY instead.
>>>
>>> Daniel
>>>
>>> Xavier Stevens wrote:
>>>
>>>
>>>> I've written a regular expression EvalFunc similar to ExtractAll except
>>>> this is called FindAll. It returns a tuple of all strings found that
>>>> match the given pattern. The syntax looks like this:
>>>>
>>>> A = FOREACH raw_data GENERATE FindAll(field_str, '[^/]+') AS a_tuple;
>>>>
>>>> I dumped some return tuples which look something like this:
>>>>
>>>> ((a,b,c,d,e))
>>>>
>>>> I'm trying to get the size of the tuple so I can filter out certain
>>>> entries. If I simply do:
>>>>
>>>> B = FOREACH A GENERATE SIZE(a_tuple);
>>>> DUMP B;
>>>>
>>>> I always get a size of 1. I thought maybe this was due to the
>>>> surrounding bag so I tried to FLATTEN(FindAll(...)). Now I'm getting an
>>>> error from SIZE saying it can't convert a string to a DataBag.
>>>>
>>>> Any idea what's going on here?
>>>>
>>>> Thanks,
>>>>
>>>>
>>>> -Xavier
>>>>
>>>>
>>>>
>>>
>>>
>>
>
--
*Charles Ferreira Gonçalves *
http://homepages.dcc.ufmg.br/~charles/
UFMG - ICEx - Dcc
Cel.: 55 31 87741485
Tel.: 55 31 34741485
Lab.: 55 31 34095840