Hi Kelvin,

The input parameters to the UDF are wrapped inside a tuple, and that tuple is
passed as the input to the UDF's exec function.
In the case of -
>> grunt> C = FOREACH A GENERATE UDF.SumAll((tuple(long,long,long))aa);
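the exec function gets a Tuple with one field, which is itself a
tuple(long,long,long).
That is, in exec(Tuple input), input.get(0) will return the tuple(long,long,long).
So input.getAll() returns a list with a single element, the inner tuple, which
is why its elements cannot be cast to Long.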
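
To make the nesting concrete, here is a minimal sketch in plain Java. It uses
java.util.List as a stand-in for Pig's Tuple so that it runs without Pig on the
classpath; the class and method names are hypothetical. With the real API you
would call input.get(0) and cast the result to Tuple instead.

```java
import java.util.List;

// Plain-Java model of the argument wrapping.
// ASSUMPTION: List<Object> stands in for Pig's Tuple; this is not the Pig API.
class NestedTupleSketch {

    // Mimics exec(Tuple input) when the UDF is called with ONE tuple argument.
    static long sumAll(List<Object> input) {
        // The outer "tuple" has a single field: the inner (long,long,long) tuple.
        @SuppressWarnings("unchecked")
        List<Object> inner = (List<Object>) input.get(0);

        long sum = 0;
        for (Object field : inner) {
            sum += (Long) field; // the longs live one level down
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Object> inner = List.of(1L, 2L, 3L); // the value of aa after the cast
        List<Object> input = List.of(inner);      // what exec would receive

        System.out.println(input.size());  // prints 1
        System.out.println(sumAll(input)); // prints 6
    }
}
```

Iterating over the outer list directly would hand you the inner tuple, not a
Long, which matches the cast error you saw.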

On the other hand, if you called the UDF this way -
>> grunt> C = FOREACH A GENERATE UDF.SumAll((long)a1,(chararray)a2);
then in exec(Tuple input), input.get(0) will return a long and input.get(1)
will return a chararray.
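
The flat case can be sketched the same way. Again this is plain Java with
java.util.List standing in for Pig's Tuple, and the class and method names are
hypothetical. In Pig's Java API a long arrives as a Long and a chararray as a
String.

```java
import java.util.List;

// Plain-Java model of a UDF called with two scalar arguments.
// ASSUMPTION: List<Object> stands in for Pig's Tuple; this is not the Pig API.
class FlatArgsSketch {

    // Mimics exec(Tuple input) for SumAll((long)a1, (chararray)a2).
    static String describe(List<Object> input) {
        Long a1 = (Long) input.get(0);     // first argument: long
        String a2 = (String) input.get(1); // second argument: chararray
        return a1 + ":" + a2;
    }

    public static void main(String[] args) {
        List<Object> input = List.of(42L, "hello"); // one field per argument
        System.out.println(input.size());     // prints 2
        System.out.println(describe(input));  // prints 42:hello
    }
}
```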

I hope this answers your question.

Thanks,
Thejas




On 11/5/09 9:15 PM, "Kelvin Moss" <[email protected]> wrote:

>  
> Thanks for the reply. I understand that a Tuple can have more than one field.
> That is why I was expecting Tuple.getAll to return all the fields in the
> Tuple. But as it turns out, it returns a Tuple. That made me think that maybe
> Tuple.getAll returns all the Tuples in the Tuple, but a Tuple like this is not
> valid, right?
>  
> ((1,2,3),(4,5,6))
>  
> It should be enclosed in a bag like {(1,2,3),(4,5,6)}. Or maybe I am
> confusing things?
>  
> Thanks!
> 
> --- On Thu, 11/5/09, Jeff Zhang <[email protected]> wrote:
> 
> 
> From: Jeff Zhang <[email protected]>
> Subject: Re: Accessing fields in Tuple
> To: [email protected]
> Date: Thursday, November 5, 2009, 7:44 PM
> 
> 
> The input is the set of arguments you provide to your UDF. It is of tuple
> type. A tuple can have more than one element, which means your UDF can have
> more than one argument. Here you provide one argument, which is of tuple
> type, to your UDF.
> So the first element of input is a tuple.
> 
> 
> Jeff Zhang
> 
> 
> On Thu, Nov 5, 2009 at 2:23 AM, Kelvin Moss <[email protected]> wrote:
> 
>> Hi all,
>> 
>> I have the following data file
>> 
>> (1L,2L,3L)
>> (4L,2L,1L)
>> (8L,3L,4L)
>> 
>> I am trying to write a UDF (like sum) that would add the fields in a Tuple.
>> This works --
>> 
>> public class SumAll extends EvalFunc<Long> {
>> public Long exec(Tuple input) {
>> try {
>> return sum(input);
>> } catch (NumberFormatException e) {
>> // TODO Auto-generated catch block
>> e.printStackTrace();
>> } catch (ExecException e) {
>> // TODO Auto-generated catch block
>> e.printStackTrace();
>> }
>> return 0L;
>> }
>> 
>> static protected Long sum(Tuple input) throws ExecException,
>> NumberFormatException {
>>       long sum = 0;
>> 
>>       List<Object> values = input.getAll();
>>       for (Iterator<Object> it = values.iterator(); it.hasNext();) {
>>           Tuple t = (Tuple)it.next();
>>           sum += (Long)t.get(0);
>>           sum += (Long)t.get(1);
>>           sum += (Long)t.get(2);
>>        }
>>        return sum;
>> }
>> 
>> }
>> 
>> grunt> A = LOAD 'data2' as aa:bytearray;
>> grunt> C = FOREACH A GENERATE UDF.SumAll((tuple(long,long,long))aa);
>> grunt> dump C;
>> 2009-11-05 10:07:09,266 [main] INFO
>> org.apache.pig.backend.local.executionengine.LocalPigLauncher - Successfully
>> stored result in: "file:/tmp/temp1206478472/tmp-577036369"
>> 2009-11-05 10:07:09,267 [main] INFO
>> org.apache.pig.backend.local.executionengine.LocalPigLauncher - Records
>> written : 3
>> 2009-11-05 10:07:09,267 [main] INFO
>> org.apache.pig.backend.local.executionengine.LocalPigLauncher - Bytes
>> written : 0
>> 2009-11-05 10:07:09,267 [main] INFO
>> org.apache.pig.backend.local.executionengine.LocalPigLauncher - 100%
>> complete!
>> 2009-11-05 10:07:09,267 [main] INFO
>> org.apache.pig.backend.local.executionengine.LocalPigLauncher - Success!!
>> (6L)
>> (7L)
>> (15L)
>> grunt>
>> 
>> Initially I thought that such a loop would work
>> 
>> static protected Long sum(Tuple input) throws ExecException,
>> NumberFormatException {
>> long sum = 0;
>> 
>> List<Object> values = input.getAll(); // Would give all fields in Tuple??
>> for (Iterator<Object> it = values.iterator(); it.hasNext();) {
>>      sum += (Long)it.next();
>> }
>> return sum;
>> }
>> 
>> But I get an error that Tuple can't be cast to Long. So my question is:
>> what is input.getAll() returning? What is the structure of the data that
>> gets passed to the exec function?
>> 
>> Thanks!
>> 
>> 
>> 
> 
> 
> 
