I am having trouble applying a UDF to a column of Vectors in PySpark, as
illustrated here:
from pyspark import SparkContext
from pyspark.sql import Row
from pyspark.sql.types import DoubleType
from pyspark.sql.functions import udf
from pyspark.mllib.linalg import Vectors
FeatureRow = Row('i
I have a Spark DataFrame that looks like:
| id | value | bin |
|----|-------|-----|
| 1 | 3.4 | 2 |
| 2 | 2.6 | 1 |
| 3 | 1.8 | 1 |
| 4 | 9.6 | 2 |
I have a function `f` that takes an array of values and returns a number. I
want to add a column to the