Right. Russell, the reason the DynamicInvokers weren't working is that
InvokeForString expects a function that returns a String. randomUUID
returns a UUID, not a String.
You could of course call this trivially using JRuby UDFs (less work
than the Java version).
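
To make the return-type point concrete, here is a minimal sketch of a
static wrapper that does return a String. UuidStrings and
randomUuidString are made-up names, not part of Pig or the JDK, and
whether InvokeForString in a given Pig version accepts a zero-argument
method is not confirmed anywhere in this thread, so treat the Pig side
as untested.

```java
import java.util.UUID;

// Hypothetical helper class; the name is an assumption for illustration.
public class UuidStrings {

    // Static and returning String: UUID.randomUUID() itself returns a
    // java.util.UUID, so the conversion to String happens here.
    public static String randomUuidString() {
        return UUID.randomUUID().toString();
    }

    public static void main(String[] args) {
        // Prints a different 36-character UUID string on every run.
        System.out.println(randomUuidString());
    }
}
```

A method of this shape (static, returning String) is the kind of target
the invoker is described as expecting; UUID.randomUUID on its own is not.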

D

On Sun, May 27, 2012 at 2:39 PM, Dragan Nedeljkovic <[email protected]> wrote:
> You have to call UUID.randomUUID() to get a UUID, but you cannot use DEFINE
> to do that since DEFINE does not support methods that return arbitrary
> classes.
>
> Wrapping it in a UDF works just fine:
>
> package piggybank;
>
> import java.io.IOException;
> import java.util.UUID;
>
> import org.apache.pig.EvalFunc;
> import org.apache.pig.data.Tuple;
>
> // Pig UDF that ignores its input and returns a random UUID as a string.
> public class CreateUUID extends EvalFunc<String>
> {
>     @Override
>     public String exec(Tuple input) throws IOException
>     {
>         try
>         {
>             return UUID.randomUUID().toString();
>         }
>         catch (Exception e)
>         {
>             // Throwing an exception will cause the task to fail.
>             throw new IOException("Something bad happened!", e);
>         }
>     }
> }
> // eof
>
>
> register 'mypiggybank.jar';
> define CreateUUID piggybank.CreateUUID();
>
> input_lines = LOAD 'test_CreateUUID.in' AS (line:chararray);
> describe input_lines;
> dump input_lines;
>
> new_list = FOREACH input_lines GENERATE line, CreateUUID();
> describe new_list;
> dump new_list;
>
> -- eof
>
>
>>________________________________
>> From: Russell Jurney <[email protected]>
>>To: [email protected]
>>Sent: Sunday, May 27, 2012 4:56:07 PM
>>Subject: Re: Create rdbms like sequence in Pig on Pig Relation
>>
>>It helps, but I am not able to invoke java.util.UUID.toString, maybe
>>because it doesn't take an argument.  This is from the docs:
>>
>>DEFINE UrlDecode InvokeForString('java.net.URLDecoder.decode', 'String String');
>>encoded_strings = LOAD 'encoded_strings.txt' as (encoded:chararray);
>>decoded_strings = FOREACH encoded_strings GENERATE UrlDecode(encoded, 'UTF-8');
>>
>>
>>Maybe I forgot, but is this how I do it?
>>
>>DEFINE UUID InvokeForString('java.util.UUID.toString');
>>with_uuid = FOREACH my_stuff generate UUID(), *;
>>
>>
>>Sorry, I only understand example code - not APIs. My Java is quite weak.
>>
>>http://docs.oracle.com/javase/6/docs/api/java/util/UUID.html#toString()
>>
>>On Sun, May 27, 2012 at 2:33 AM, Subir S <[email protected]> wrote:
>>
>>> I hope this helps: the DynamicInvoker feature in Pig, added in 0.8.0.
>>>
>>>
>>> http://squarecog.wordpress.com/2010/08/20/upcoming-features-in-pig-0-8-dynamic-invokers/
>>>
>>> Thanks
>>>
>>> On 5/24/12, Russell Jurney <[email protected]> wrote:
>>> > Thanks, I mean how do you invoke it directly in grunt> from Pig?
>>> >
>>> > I keep messing it up for the last 30 minutes. Should I check the settings
>>> > on my pacemaker, I feel like Fabio on NyQuil messing with this.
>>> >
>>> > On Wed, May 23, 2012 at 10:19 PM, Subir S <[email protected]>
>>> > wrote:
>>> >
>>> >> Hope this helps ->
>>> >> http://www.javapractices.com/topic/TopicAction.do?Id=56
>>> >>
>>> >> and this ->
>>> >>
>>> >>
>>> http://docs.oracle.com/javase/1.5.0/docs/api/java/util/UUID.html#randomUUID%28%29
>>> >>
>>> >> Thanks
>>> >>
>>> >>
>>> >>
>>> >> On Thu, May 24, 2012 at 10:42 AM, Russell Jurney
>>> >> <[email protected]>wrote:
>>> >>
>>> >> > How do you invoke java.util.UUID.randomUUID?  There is no invoker that
>>> >> > doesn't take an arg?
>>> >> >
>>> >> > On Sun, May 20, 2012 at 6:26 PM, Rajesh Balamohan <
>>> >> > [email protected]> wrote:
>>> >> >
>>> >> > > I don't think so. However, it's a single-line Java call. You can
>>> >> > > create a custom UDF for this and use it in your code.
>>> >> > >
>>> >> > > java.util.UUID.randomUUID();
>>> >> > >
>>> >> > > ~Rajesh.B
>>> >> > >
>>> >> > > On Sun, May 20, 2012 at 8:15 AM, DIPESH KUMAR SINGH
>>> >> > > <[email protected]>wrote:
>>> >> > >
>>> >> > > > Thanks Rajesh.
>>> >> > > >
>>> >> > > > Is GUID a built in UDF?
>>> >> > > >
>>> >> > > >
>>> >> > > > --
>>> >> > > > Dipesh
>>> >> > > >
>>> >> > > > On Sun, May 20, 2012 at 8:06 AM, Rajesh Balamohan <
>>> >> > > > [email protected]> wrote:
>>> >> > > >
>>> >> > > > > If you do not care about sequence numbers and the intention is
>>> >> > > > > just to create a unique key, you can use a GUID, which doesn't
>>> >> > > > > require any synchronization at all (all mappers can run in
>>> >> > > > > parallel).
>>> >> > > > >
>>> >> > > > > The approach I suggested in my earlier mail comes into the
>>> >> > > > > picture mainly for sequence numbers.
>>> >> > > > >
>>> >> > > > > ~Rajesh.B
>>> >> > > > >
>>> >> > > > > On Sun, May 20, 2012 at 8:02 AM, Rajesh Balamohan <
>>> >> > > > > [email protected]> wrote:
>>> >> > > > >
>>> >> > > > > > Pig doesn't have that facility yet. Moreover, it's not very
>>> >> > > > > > efficient to do this in Pig/MR as it requires synchronization.
>>> >> > > > > >
>>> >> > > > > > However, if this is unavoidable for you, the following can be
>>> >> > > > > > considered:
>>> >> > > > > >
>>> >> > > > > > 1. Maintaining the sequence number details in ZooKeeper.
>>> >> > > > > > 2. Having a simple structure in an HBase table (seqNumber -->
>>> >> > > > > > Value). You can get a bucket of values (ex: 1000-2000) from
>>> >> > > > > > this and use it in your UDF. When the range is depleted, you
>>> >> > > > > > have to query/update the HBase table again (ex: 3000-4000).
>>> >> > > > > > There are corner cases which need to be handled.
>>> >> > > > > >
>>> >> > > > > >
>>> >> > > > > > ~Rajesh.B
>>> >> > > > > >
>>> >> > > > > >
>>> >> > > > > > On Sat, May 19, 2012 at 12:04 AM, DIPESH KUMAR SINGH <
>>> >> > > > > > [email protected]> wrote:
>>> >> > > > > >
>>> >> > > > > >> Sorry if my point was not clear.
>>> >> > > > > >>
>>> >> > > > > >> I wish to create a sequence on a Pig relation.
>>> >> > > > > >>
>>> >> > > > > >> Say, for example, I have a relation with data:
>>> >> > > > > >> (John, A-1)
>>> >> > > > > >> (Jack, B-2)
>>> >> > > > > >> (Jim, C-1)
>>> >> > > > > >>
>>> >> > > > > >> I want to create a sequence, i.e. add one more column to the
>>> >> > > > > >> relation, like a counter, and keep increasing the count for
>>> >> > > > > >> each record read. The expected output should be something
>>> >> > > > > >> like this:
>>> >> > > > > >>
>>> >> > > > > >> (If 200 is the start sequence.)
>>> >> > > > > >> (John, A-1, 201)
>>> >> > > > > >> (Jack, B-2, 202)
>>> >> > > > > >> (Jim, C-1, 203)
>>> >> > > > > >>
>>> >> > > > > >> Could you please suggest how to proceed on this?
>>> >> > > > > >>
>>> >> > > > > >> Thanks,
>>> >> > > > > >> Dipesh
>>> >> > > > > >>
>>> >> > > > > >> On Fri, May 18, 2012 at 6:50 AM, Thejas Nair <
>>> >> > > [email protected]>
>>> >> > > > > >> wrote:
>>> >> > > > > >>
>>> >> > > > > >> > What do you mean by 'rdbms like sequence' ?
>>> >> > > > > >> > Thanks,
>>> >> > > > > >> > Thejas
>>> >> > > > > >> >
>>> >> > > > > >> >
>>> >> > > > > >> > On 5/16/12 10:41 AM, DIPESH KUMAR SINGH wrote:
>>> >> > > > > >> >
>>> >> > > > > >> >> I want to create an RDBMS-like sequence on a Pig relation.
>>> >> > > > > >> >>
>>> >> > > > > >> >> Is there any existing UDF which could do this?
>>> >> > > > > >> >>
>>> >> > > > > >> >> I am a bit new to Pig; kindly suggest how to proceed.
>>> >> > > > > >> >>
>>> >> > > > > >> >>
>>> >> > > > > >> >> Thanks & Regards,
>>> >> > > > > >> >>
>>> >> > > > > >> >
>>> >> > > > > >> >
>>> >> > > > > >>
>>> >> > > > > >>
>>> >> > > > > >> --
>>> >> > > > > >> Dipesh Kr. Singh
>>> >> > > > > >>
>>> >> > > > > >
>>> >> > > > > >
>>> >> > > > > >
>>> >> > > > > > --
>>> >> > > > > > ~Rajesh.B
>>> >> > > > > >
>>> >> > > > >
>>> >> > > > >
>>> >> > > > >
>>> >> > > > > --
>>> >> > > > > ~Rajesh.B
>>> >> > > > >
>>> >> > > >
>>> >> > > >
>>> >> > > >
>>> >> > > > --
>>> >> > > > Dipesh Kr. Singh
>>> >> > > >
>>> >> > >
>>> >> > >
>>> >> > >
>>> >> > > --
>>> >> > > ~Rajesh.B
>>> >> > >
>>> >> >
>>> >> >
>>> >> >
>>> >> > --
>>> >> > Russell Jurney twitter.com/rjurney [email protected]
>>> >> > datasyndrome.com
>>> >> >
>>> >>
>>> >
>>> >
>>> >
>>> > --
>>> > Russell Jurney twitter.com/rjurney [email protected]
>>> > datasyndrome.com
>>> >
>>>
>>
>>
>>
>>--
>>Russell Jurney twitter.com/rjurney [email protected] datasyndrome.com
>>
>>
>>

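The bucket-of-values idea from Rajesh's earlier mail (quoted above) can
be sketched roughly as follows. This is only an illustration under
assumptions: BlockSequence is a hypothetical class, and an in-memory
AtomicLong stands in for the shared counter that would really live in
HBase or ZooKeeper. No HBase or ZooKeeper API is used here.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

// Hypothetical sketch: hands out sequence numbers from locally cached blocks,
// so the shared counter is contacted once per block instead of once per record.
public class BlockSequence {
    private final LongSupplier blockStartSource; // reserves a block, returns its first value
    private final long blockSize;
    private long next;   // next value to hand out
    private long limit;  // exclusive end of the current block

    public BlockSequence(LongSupplier blockStartSource, long blockSize) {
        if (blockSize <= 0) {
            throw new IllegalArgumentException("blockSize must be positive");
        }
        this.blockStartSource = blockStartSource;
        this.blockSize = blockSize;
        // next == limit means "no block yet", which forces a fetch on first use.
        this.next = 0;
        this.limit = 0;
    }

    public synchronized long nextValue() {
        if (next >= limit) {
            // Current block is used up: reserve a fresh one from the shared source.
            next = blockStartSource.getAsLong();
            limit = next + blockSize;
        }
        return next++;
    }

    public static void main(String[] args) {
        // Stand-in for the shared store; 201 follows the "start sequence 200" example.
        AtomicLong shared = new AtomicLong(201);
        LongSupplier reserve = () -> shared.getAndAdd(1000);

        BlockSequence taskA = new BlockSequence(reserve, 1000);
        BlockSequence taskB = new BlockSequence(reserve, 1000);

        System.out.println(taskA.nextValue()); // 201
        System.out.println(taskB.nextValue()); // 1201
        System.out.println(taskA.nextValue()); // 202
    }
}
```

Values are unique across tasks but not contiguous, and a task that dies
loses the unused rest of its block, which is one of the corner cases
mentioned in that mail.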