Greetings!

I need an efficient way in PySpark to compare each row with the previous row
across all of its attributes. My intent is to leverage Spark's distributed
execution as much as possible so the job runs fast. Could anyone suggest a
suitable algorithm or command? Here is a snapshot of the underlying data I
need to compare:

[image: Inline image 1]

Thanks in advance!
