This computes the MD5 hash of column id of Dataset ds and stores it in a new column:

ds.withColumn("id hash", md5($"id")).show(false)

Test with this Dataset ds:

import org.apache.spark.sql.types._
val ds = spark.range(10).select($"id".cast(StringType))

The available hash functions are md5, sha, sha1, sha2 and hash: https://spark.apache.org/docs/2.4.5/api/sql/index.html
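If the hash should cover the whole row rather than a single column, one possible approach is to combine all columns before hashing. The following is only a sketch for spark-shell, not a definitive solution: the DataFrame df, its column names, the "|" separator and the output column names row_sha2 and row_hash are made up for illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, concat_ws, hash, sha2}

// In spark-shell a session already exists; getOrCreate() simply reuses it.
val spark = SparkSession.builder().master("local[*]").appName("row-hash-sketch").getOrCreate()
import spark.implicits._

// Hypothetical example data, only for illustration.
val df = Seq((1L, "a", 1.5), (2L, "b", 2.5)).toDF("id", "name", "score")

// Cast every column to string so the values can be concatenated.
val allAsString = df.columns.map(c => col(c).cast("string"))

val hashed = df
  // SHA-256 over all column values joined with "|".
  // Caveat: concat_ws skips nulls, so rows that differ only in
  // where a null sits can produce the same hash.
  .withColumn("row_sha2", sha2(concat_ws("|", allAsString: _*), 256))
  // Alternative: hash accepts several columns directly and returns an int.
  .withColumn("row_hash", hash(df.columns.map(col): _*))

hashed.show(false)
```

sha2 gives a 64-character hex string, which is better suited to detecting changed rows, while hash is cheaper but only 32 bits wide, so collisions are more likely on large data.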

Enrico


On 28.02.20 at 13:56, Chetan Khatri wrote:
Hi Spark Users,
How can I compute a hash of each row and store it in a new column of a DataFrame? Could someone help me?

Thanks


