You should look into window functions for Spark SQL; `lag` in particular gives you access to the previous row's values.
Debabrata Ghosh <[email protected]> wrote on Mon., Feb 12, 2018 at
13:10:

> Hi,
>                  Greetings !
>
> I need an efficient way in PySpark to execute a comparison (on all
> the attributes) between the current row and the previous row. My intent
> is to leverage Spark's distributed framework to the best extent so that
> I can achieve good speed. Could anyone please suggest a suitable
> algorithm / command? Here is a snapshot of the underlying data which I
> need to compare:
>
> [image: Inline image 1]
>
> Thanks in advance !
>
> D
>
