There is a way: you can use
org.apache.spark.sql.functions.monotonicallyIncreasingId, which gives each row
of your DataFrame a unique ID.
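To make the behaviour of these IDs more concrete, here is a rough sketch in plain Python (no Spark needed) of the layout described in the Spark API docs, as far as I understand it: the partition index goes in the upper bits and the record number within the partition goes in the lower 33 bits. The names monotonic_id and ids_for are made up for illustration and are not part of Spark.

```python
# Sketch of the documented ID layout, NOT Spark's actual code.
# Assumption: partition index in the upper bits, per-partition record
# number in the lower 33 bits. Function names here are hypothetical.
RECORD_BITS = 33


def monotonic_id(partition_index: int, record_number: int) -> int:
    """Combine a partition index and a record number into one 64-bit style ID."""
    if not 0 <= record_number < (1 << RECORD_BITS):
        raise ValueError("record number does not fit in 33 bits")
    return (partition_index << RECORD_BITS) | record_number


def ids_for(partition_sizes):
    """Return the IDs for a dataset given the row count of each partition."""
    return [
        monotonic_id(partition, record)
        for partition, size in enumerate(partition_sizes)
        for record in range(size)
    ]


# Two partitions with three rows each.
print(ids_for([3, 3]))
# [0, 1, 2, 8589934592, 8589934593, 8589934594]
```

So the values are unique and increasing but not consecutive, and they depend only on where a row sits (partition and position), not on the row's content. A job that is recomputed after a failure would start numbering from the same values again, and the same row could get a different ID if the partitioning changes.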
On Tue, Oct 18, 2016 10:36 AM, ayan guha guha.a...@gmail.com
wrote:
Do you have a primary key or unique identifier in your data, even if it takes
multiple columns to make a composite key? In other words, can your data
contain two exactly identical rows that need different unique IDs? Also, does
the ID have to be numeric?
You may want to pursue a hashing algorithm, such as one from the SHA group, to con
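The hashing idea could look roughly like the sketch below. It is plain Python with hashlib rather than Spark code, and it assumes each row has some columns that together form a composite key. The helper name row_id and the column names in the example are hypothetical.

```python
import hashlib
from typing import Mapping, Sequence


def row_id(row: Mapping[str, object], key_columns: Sequence[str]) -> str:
    """Derive a deterministic ID from the composite-key columns of one row.

    Hypothetical helper: each value is length-prefixed before hashing so that
    ("ab", "c") and ("a", "bc") do not produce the same digest.
    """
    digest = hashlib.sha256()
    for column in key_columns:
        value = str(row[column]).encode("utf-8")
        digest.update(len(value).to_bytes(8, "big"))
        digest.update(value)
    return digest.hexdigest()


# Example rows; the column names are made up for illustration.
first = {"customer": "c-17", "order_date": "2016-10-17", "amount": 250}
second = {"customer": "c-18", "order_date": "2016-10-17", "amount": 250}

keys = ["customer", "order_date"]
print(row_id(first, keys) == row_id(first, keys))   # True: stable across reruns
print(row_id(first, keys) == row_id(second, keys))  # False: different key
```

Because the ID depends only on the row's content, a rerun after a failure produces the same ID for the same row. The trade-offs are that two rows with identical key columns get the same ID, and that the result is a hex string; if a numeric ID is required, part of the digest can be converted to an integer, at the cost of a higher chance of collisions.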
Can anyone help me out?
On Mon, Oct 17, 2016 at 7:27 PM, Saurav Sinha
wrote:
> Hi,
>
> I am in a situation where I want to generate a unique ID for each row.
>
> I have used monotonicallyIncreasingId, but it gives increasing values
> and starts generating from the beginning again if it fails.
>
> I have two quest
Hi,
I am in a situation where I want to generate a unique ID for each row.
I have used monotonicallyIncreasingId, but it gives increasing values and
starts generating from the beginning again if it fails.
I have two questions here:
Q1. Does this method give me a unique ID even in a failure situation, because I
want to