Thanks for the reply. I understand that a Tuple can have more than one field. 
That is why I was expecting Tuple.getAll to return all the fields in the 
Tuple. But as it turns out, it returns a Tuple. That made me think that maybe 
Tuple.getAll returns all the Tuples in the Tuple, but a Tuple like this is not 
valid, right?
 
((1,2,3),(4,5,6))
 
It should be enclosed in a bag like {(1,2,3),(4,5,6)}. Or maybe I am confusing 
things? 
 
Thanks!

--- On Thu, 11/5/09, Jeff Zhang <[email protected]> wrote:


From: Jeff Zhang <[email protected]>
Subject: Re: Accessing fields in Tuple
To: [email protected]
Date: Thursday, November 5, 2009, 7:44 PM


The input holds the arguments you pass to your UDF, and it is itself of tuple
type. A tuple can have more than one element, which means your UDF can take
more than one argument. Here you pass a single argument, of tuple type, to
your UDF.
So the first element of input is a tuple.
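
To make the nesting concrete, here is a minimal sketch in plain Java. It uses java.util.List as a stand-in for Pig's Tuple (an assumption, so that the example runs without Pig on the classpath), and NestedTupleSketch and sumFirstArgument are hypothetical names, not part of Pig.

```java
import java.util.Arrays;
import java.util.List;

class NestedTupleSketch {

    // "input" stands in for the argument tuple handed to exec().
    // Its first element is the (long,long,long) tuple passed to the UDF,
    // so we unwrap that element before touching the individual fields.
    static long sumFirstArgument(List<Object> input) {
        @SuppressWarnings("unchecked")
        List<Object> row = (List<Object>) input.get(0);
        long sum = 0;
        for (Object field : row) {
            sum += (Long) field; // each field of the inner tuple is a Long
        }
        return sum;
    }

    public static void main(String[] args) {
        // One row of the data file, modelled as a list of Long fields.
        List<Object> row = Arrays.<Object>asList(1L, 2L, 3L);
        // The UDF was called with one argument, so input has one element.
        List<Object> input = Arrays.<Object>asList(row);

        System.out.println(input.size());            // number of UDF arguments
        System.out.println(sumFirstArgument(input)); // sum of the row's fields
    }
}
```

With the actual Pig types, the same idea would be to cast input.get(0) to Tuple and loop over that tuple's fields, rather than casting each element of input.getAll() to Long.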


Jeff Zhang


On Thu, Nov 5, 2009 at 2:23 AM, Kelvin Moss <[email protected]> wrote:

> Hi all,
>
> I have the following data file
>
> (1L,2L,3L)
> (4L,2L,1L)
> (8L,3L,4L)
>
> I am trying to write a UDF (like sum) that would add the fields in Tuple.
> This works --
>
> public class SumAll extends EvalFunc<Long> {
> public Long exec(Tuple input) {
> try {
> return sum(input);
> } catch (NumberFormatException e) {
> // TODO Auto-generated catch block
> e.printStackTrace();
> } catch (ExecException e) {
> // TODO Auto-generated catch block
> e.printStackTrace();
> }
> return 0L;
> }
>
> static protected Long sum(Tuple input) throws ExecException,
> NumberFormatException {
>      long sum = 0;
>
>      List<Object> values = input.getAll();
>      for (Iterator<Object> it = values.iterator(); it.hasNext();) {
>          Tuple t = (Tuple)it.next();
>          sum += (Long)t.get(0);
>          sum += (Long)t.get(1);
>          sum += (Long)t.get(2);
>       }
>       return sum;
> }
>
> }
>
> grunt> A = LOAD 'data2' as aa:bytearray;
> grunt> C = FOREACH A GENERATE UDF.SumAll((tuple(long,long,long))aa);
> grunt> dump C;
> 2009-11-05 10:07:09,266 [main] INFO
> org.apache.pig.backend.local.executionengine.LocalPigLauncher - Successfully
> stored result in: "file:/tmp/temp1206478472/tmp-577036369"
> 2009-11-05 10:07:09,267 [main] INFO
> org.apache.pig.backend.local.executionengine.LocalPigLauncher - Records
> written : 3
> 2009-11-05 10:07:09,267 [main] INFO
> org.apache.pig.backend.local.executionengine.LocalPigLauncher - Bytes
> written : 0
> 2009-11-05 10:07:09,267 [main] INFO
> org.apache.pig.backend.local.executionengine.LocalPigLauncher - 100%
> complete!
> 2009-11-05 10:07:09,267 [main] INFO
> org.apache.pig.backend.local.executionengine.LocalPigLauncher - Success!!
> (6L)
> (7L)
> (15L)
> grunt>
>
> Initially I thought that such a loop would work
>
> static protected Long sum(Tuple input) throws ExecException,
> NumberFormatException {
> long sum = 0;
>
> List<Object> values = input.getAll(); // Would give all fields in Tuple??
> for (Iterator<Object> it = values.iterator(); it.hasNext();) {
>     sum += (Long)it.next();
> }
> return sum;
> }
>
> But I get an error that Tuple can't be cast to Long. So my question is:
> what is input.getAll() returning? What is the structure of the data that
> gets passed to the exec function?
>
> Thanks!
>
>
>



      
