Mike,

Any suggestions on doing it for consecutive IDs?

On Aug 5, 2016 9:08 AM, "Tony Lane" <[email protected]> wrote:
> Mike.
>
> I have figured out how to do this. Thanks for the suggestion. It works
> great. I am trying to figure out the performance impact of this.
>
> thanks again
>
>
> On Fri, Aug 5, 2016 at 9:25 PM, Tony Lane <[email protected]> wrote:
>
>> @mike - this looks great. How can I do this in Java? What is the
>> performance implication on a large dataset?
>>
>> @sonal - I can't have a collision in the values.
>>
>> On Fri, Aug 5, 2016 at 9:15 PM, Mike Metzger <[email protected]>
>> wrote:
>>
>>> You can use the monotonically_increasing_id method to generate
>>> guaranteed unique (but not necessarily consecutive) IDs. Calling something
>>> like:
>>>
>>> df.withColumn("id", monotonically_increasing_id())
>>>
>>> You don't mention which language you're using but you'll need to pull in
>>> the sql.functions library.
>>>
>>> Mike
>>>
>>> On Aug 5, 2016, at 9:11 AM, Tony Lane <[email protected]> wrote:
>>>
>>> Ayan - basically I have a dataset with structure, where bid are unique
>>> string values
>>>
>>> bid: String
>>> val: integer
>>>
>>> I need unique int values for these string bids to do some processing
>>> in the dataset
>>>
>>> like
>>>
>>> id: int (unique integer id for each bid)
>>> bid: String
>>> val: integer
>>>
>>>
>>>
>>> -Tony
>>>
>>> On Fri, Aug 5, 2016 at 6:35 PM, ayan guha <[email protected]> wrote:
>>>
>>>> Hi
>>>>
>>>> Can you explain a little further?
>>>>
>>>> best
>>>> Ayan
>>>>
>>>> On Fri, Aug 5, 2016 at 10:14 PM, Tony Lane <[email protected]>
>>>> wrote:
>>>>
>>>>> I have a row with structure like
>>>>>
>>>>> identifier: String
>>>>> value: int
>>>>>
>>>>> All identifiers are unique and I want to generate a unique long id for
>>>>> the data and get a row object back for further processing.
>>>>>
>>>>> I understand using the zipWithUniqueId function on RDD, but that would
>>>>> mean first converting to RDD and then joining back the RDD and dataset
>>>>>
>>>>> What is the best way to do this?
>>>>>
>>>>> -Tony
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Best Regards,
>>>> Ayan Guha
>>>>
>>>
>>>
>>
>
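
On the consecutive-ID question: a rough sketch in plain Java (no Spark dependency) of why monotonically_increasing_id leaves gaps and what a consecutive numbering would need instead. As far as the Spark documentation describes it, the current implementation puts the partition index in the upper 31 bits and the row position within the partition in the lower 33 bits, so IDs jump at every partition boundary. Consecutive IDs need the size of every earlier partition first, which is roughly what RDD.zipWithIndex does (it has to run a Spark job to count the partitions when there is more than one). The class and method names below (IdSketch, monotonicId, partitionOffsets, consecutiveId) are made up for illustration and are not Spark API.

```java
// Hypothetical illustration only: plain Java, not Spark API.
final class IdSketch {

    // Mirrors the documented layout of monotonically_increasing_id:
    // partition index in the upper bits, row position in the lower 33 bits.
    static long monotonicId(int partitionIndex, long rowInPartition) {
        return ((long) partitionIndex << 33) + rowInPartition;
    }

    // Starting offset of each partition = total rows in all earlier partitions.
    // In a real job the sizes would come from a counting pass over the data.
    static long[] partitionOffsets(long[] partitionSizes) {
        long[] offsets = new long[partitionSizes.length];
        long running = 0L;
        for (int i = 0; i < partitionSizes.length; i++) {
            offsets[i] = running;
            running += partitionSizes[i];
        }
        return offsets;
    }

    // Consecutive id = partition offset + position inside the partition.
    static long consecutiveId(long[] offsets, int partitionIndex, long rowInPartition) {
        return offsets[partitionIndex] + rowInPartition;
    }

    public static void main(String[] args) {
        long[] sizes = {3L, 2L};               // assumed: two partitions, 3 and 2 rows
        long[] offsets = partitionOffsets(sizes);
        for (int p = 0; p < sizes.length; p++) {
            for (long r = 0; r < sizes[p]; r++) {
                System.out.println("partition " + p + ", row " + r
                        + ": monotonic=" + monotonicId(p, r)
                        + ", consecutive=" + consecutiveId(offsets, p, r));
            }
        }
    }
}
```

With the two assumed partitions, the first row of partition 1 gets monotonic ID 8589934592 (2^33) but consecutive ID 3. The extra counting pass is the performance cost to weigh on a large dataset, so if uniqueness is all you need, the gap-leaving IDs are the cheaper choice.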
