Re: Issue with PySpark UDF on a column of Vectors

2015-06-18 Thread Xiangrui Meng
ts passed a Python
> tuple like this:
>
> (1, None, None, [9.7, 1.0, -3.2])
>
> Is it not possible to use UDFs on DataFrame columns of Vectors?
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Issue-with-PySpark-UDF-on-a
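
For readers hitting the same behaviour, here is a minimal sketch of one possible workaround: unpacking the tuple by hand inside the UDF body. It assumes the tuple is laid out as (type, size, indices, values), with type 1 meaning dense and type 0 meaning sparse, which is consistent with the example (1, None, None, [9.7, 1.0, -3.2]) but is not confirmed anywhere in this thread. The helper name tuple_to_dense_list is hypothetical and is not part of PySpark.

```python
from typing import List, Optional, Sequence, Tuple

# ASSUMPTION: the UDF receives the vector as (type, size, indices, values),
# where type == 1 is a dense vector and type == 0 is a sparse vector.
RawVector = Tuple[int, Optional[int], Optional[Sequence[int]], Sequence[float]]


def tuple_to_dense_list(raw: RawVector) -> List[float]:
    """Hypothetical helper: turn the raw tuple into a plain list of floats."""
    vec_type, size, indices, values = raw
    if vec_type == 1:
        # Dense: the values are already in positional order.
        return [float(v) for v in values]
    if vec_type == 0:
        # Sparse: scatter the stored values into a zero-filled list.
        if size is None or indices is None:
            raise ValueError("sparse tuple needs both size and indices")
        dense = [0.0] * size
        for i, v in zip(indices, values):
            dense[i] = float(v)
        return dense
    raise ValueError("unexpected vector type tag: %r" % (vec_type,))


# The tuple from the message above decodes to its three values.
print(tuple_to_dense_list((1, None, None, [9.7, 1.0, -3.2])))
```

A function like this could be called at the top of the UDF body before any vector arithmetic. Whether that is needed at all depends on the Spark version in use.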

Issue with PySpark UDF on a column of Vectors

2015-06-18 Thread calstad
, -3.2])

Is it not possible to use UDFs on DataFrame columns of Vectors?

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Issue-with-PySpark-UDF-on-a-column-of-Vectors-tp23393.html

Issue with PySpark UDF on a column of Vectors

2015-06-17 Thread Colin Alstad
I am having trouble using a UDF on a column of Vectors in PySpark which can be illustrated here:

from pyspark import SparkContext
from pyspark.sql import Row
from pyspark.sql.types import DoubleType
from pyspark.sql.functions import udf
from pyspark.mllib.linalg import Vectors

FeatureRow = Row('i