OK. I've worked it out.

from pyspark.sql.functions import col
df = df.withColumn('diff', col('A') - col('B'))

On Sun, May 7, 2017 at 11:49 AM, Zeming Yu <zemin...@gmail.com> wrote:

> Say I have the following dataframe with two numeric columns A and B,
> what's the best way to add a column showing the difference between the two
> columns?
>
> +-----------------+----------+
> |                A|         B|
> +-----------------+----------+
> |786.3199999999999|    786.12|
> |           786.12|    786.12|
> |           786.42|    786.12|
> |           786.72|    786.12|
> |           786.92|    786.12|
> |           786.92|    786.12|
> |           786.72|    786.12|
> |           786.72|    786.12|
> |           827.72|    786.02|
> |           827.72|    786.02|
> +-----------------+----------+
>
>
> I could probably figure out how to do this via a UDF, but are UDFs generally
> slower?
>
>
> Thanks!
>
>
