Re: Replicating a row n times
I suggest using `monotonicallyIncreasingId`, which is highly efficient. Note, however, that the IDs it generates are not consecutive.

On Fri, Sep 29, 2017 at 3:21 PM, Kanagha Kumar wrote:
> Thanks for the response.
> I can use either row_number() or monotonicallyIncreasingId to generate
> unique IDs, as in
> https://hadoopist.wordpress.com/2016/05/24/generate-unique-ids-for-each-rows-in-a-spark-dataframe/
>
> I'm looking for a Java example that uses this to replicate a single row n
> times, either by appending a rownum column generated as above or by using
> the explode function.
>
> Ex:
>
> ds.withColumn("ROWNUM", org.apache.spark.sql.functions.explode(columnEx));
>
> columnEx needs to be of type array in order for explode to work.
>
> Any suggestions are helpful.
> Thanks
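To illustrate why these IDs are unique but not consecutive, here is a minimal Java sketch of the layout that the Spark documentation describes for monotonically_increasing_id: the partition ID in the upper 31 bits and the record number within each partition in the lower 33 bits. The names MonotonicIdSketch and idFor are hypothetical and only model the arithmetic; this is not Spark's own code.

```java
class MonotonicIdSketch {
    // Assumed layout (from the Spark docs for monotonically_increasing_id):
    // upper 31 bits = partition ID, lower 33 bits = record number in the partition.
    static final int RECORD_BITS = 33;

    // Hypothetical helper: the ID a record at `recordIndex` in `partitionId` would get.
    static long idFor(int partitionId, long recordIndex) {
        return ((long) partitionId << RECORD_BITS) + recordIndex;
    }

    public static void main(String[] args) {
        // Two partitions holding two records each.
        for (int p = 0; p < 2; p++) {
            for (long r = 0; r < 2; r++) {
                System.out.println("partition " + p + ", record " + r + " -> " + idFor(p, r));
            }
        }
    }
}
```

Under this layout the IDs are 0 and 1 in partition 0, then 8589934592 and 8589934593 in partition 1. A ROWNUM column built this way works as a unique key, but it has large gaps wherever a partition boundary falls.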
Re: Replicating a row n times
Thanks for the response. I can use either row_number() or monotonicallyIncreasingId to generate unique IDs, as in https://hadoopist.wordpress.com/2016/05/24/generate-unique-ids-for-each-rows-in-a-spark-dataframe/

I'm looking for a Java example that uses this to replicate a single row n times, either by appending a rownum column generated as above or by using the explode function.

Ex:

ds.withColumn("ROWNUM", org.apache.spark.sql.functions.explode(columnEx));

columnEx needs to be of type array in order for explode to work.

Any suggestions are helpful.
Thanks

On Thu, Sep 28, 2017 at 7:21 PM, ayan guha wrote:
> How about using row number for primary key?
>
> Select row_number() over (), * from table
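On the question of how to hand explode an array from Java, one possible route is to build the array as a Spark SQL expression string rather than as a Column. The sketch below uses only the JDK to assemble that string; the class and method names (ExplodeExprSketch, explodeExpr) are hypothetical, and the Spark call in the trailing comment has not been run here and assumes a Dataset<Row> named ds as in the message above.

```java
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ExplodeExprSketch {
    // Hypothetical helper: builds a Spark SQL expression such as
    // "explode(array(1, 2, 3))" for n = 3.
    static String explodeExpr(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be at least 1");
        }
        String literals = IntStream.rangeClosed(1, n)
                .mapToObj(i -> Integer.toString(i))
                .collect(Collectors.joining(", "));
        return "explode(array(" + literals + "))";
    }

    public static void main(String[] args) {
        System.out.println(explodeExpr(3));
        // Assumed usage with Spark on the classpath (not executed in this sketch):
        // Dataset<Row> replicated = ds.selectExpr("*", explodeExpr(n) + " AS ROWNUM");
    }
}
```

Each input row would then appear n times, with ROWNUM running from 1 to n. The same should be achievable without SQL strings by collecting functions.lit(i) values into a Column[] and passing it to functions.array(...), then wrapping the result in functions.explode(...).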
Re: Replicating a row n times
How about using row number for primary key?

Select row_number() over (), * from table

On Fri, 29 Sep 2017 at 10:21 am, Kanagha Kumar wrote:
> Hi,
>
> I'm trying to replicate a single row from a dataset n times and create a
> new dataset from it. While replicating, however, I need one column's value
> to change for each replica, since it will end up as the primary key when
> the data is finally stored.
>
> I looked at the following reference:
> https://stackoverflow.com/questions/40397740/replicate-spark-row-n-times
>
> import org.apache.spark.sql.functions._
> val result = singleRowDF
>   .withColumn("dummy", explode(array((1 until 100).map(lit): _*)))
>   .selectExpr(singleRowDF.columns: _*)
>
> How can I create a column from an array of values in Java and pass it to
> the explode function? Suggestions are helpful.
>
> Thanks
> Kanagha

--
Best Regards,
Ayan Guha