Efficient way to compare the current row with previous row contents

Debabrata Ghosh Mon, 12 Feb 2018 04:10:29 -0800

Hi,

Greetings!

I need an efficient way in PySpark to compare the current row with the
previous row across all attributes. My intent is to make the best use of
Spark's distributed execution so that the comparison runs quickly. Can
anyone suggest a suitable algorithm or command? Here is a snapshot of the
underlying data that I need to compare:


[image: Inline image 1]

Thanks in advance!

D
