Thanks, I tried that but could not get it to work. unlabeledTest is a JavaPairRDD<String, Vector>, and each vector is a DenseVector. I added "import org.apache.spark.sql.SQLContext.implicits$", but there is no toDF() method available. I am using Java, not Scala.
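
Since the Scala implicits and toDF() are not usable from Java, one possible workaround is to build the rows and the schema by hand and pass both to createDataFrame. The sketch below assumes Spark 1.5 with the org.apache.spark.mllib.linalg vector types. The class name VectorFrames and the "id" column are hypothetical names chosen for illustration, and "features" is used only because it is the default input column name for the ML pipeline stages. The ClassCastException quoted below most likely occurs because createDataFrame(rdd, Vector.class) treats the class as a JavaBean and expects a StructType, whereas Vector maps to VectorUDT.

```java
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.mllib.linalg.Vector;
import org.apache.spark.mllib.linalg.VectorUDT;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SQLContext;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.Metadata;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;
import scala.Tuple2;

// Hypothetical helper class; not part of Spark.
public final class VectorFrames {
  private VectorFrames() {}

  // Converts (key, vector) pairs into a DataFrame with columns "id" and "features".
  public static DataFrame toDataFrame(SQLContext sqlContext,
                                      JavaPairRDD<String, Vector> unlabeledTest) {
    // Wrap each pair in a generic Row instead of relying on bean reflection.
    JavaRDD<Row> rows = unlabeledTest.map(
        new Function<Tuple2<String, Vector>, Row>() {
          @Override
          public Row call(Tuple2<String, Vector> pair) {
            return RowFactory.create(pair._1(), pair._2());
          }
        });

    // Declare the schema explicitly; VectorUDT is the SQL type for MLlib vectors.
    // The column names are assumptions made for this sketch.
    StructType schema = new StructType(new StructField[] {
        new StructField("id", DataTypes.StringType, false, Metadata.empty()),
        new StructField("features", new VectorUDT(), false, Metadata.empty())
    });

    return sqlContext.createDataFrame(rows, schema);
  }
}
```

With this helper, "DataFrame production = VectorFrames.toDataFrame(sqlContext, unlabeledTest);" should give a DataFrame that can be passed to the fitted model's transform() method. Keeping the key as an "id" column is optional; it only makes it easier to match predictions back to the original records.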

2015-09-18 20:02 GMT+03:00 Feynman Liang <fli...@databricks.com>:

> What is the type of unlabeledTest?
>
> SQL should be using the VectorUDT we've defined for Vectors
> <https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/mllib/linalg/Vectors.scala#L183>
> so you should be able to just "import sqlContext.implicits._" and then call
> "rdd.toDF()" on your RDD to convert it into a DataFrame.
>
> On Fri, Sep 18, 2015 at 7:32 AM, Yasemin Kaya <godo...@gmail.com> wrote:
>
>> Hi,
>>
>> I am using *Spark 1.5, ML Pipeline Decision Tree
>> <http://spark.apache.org/docs/latest/ml-decision-tree.html#output-columns>*
>> to get the tree's probability, so I have to convert my data to a DataFrame.
>> Creating the model works without problems, but when I apply the model to my
>> data, the conversion to a DataFrame fails. My data type is
>> *JavaPairRDD<String, Vector>*, and I create the DataFrame like this:
>>
>> DataFrame production = sqlContext.createDataFrame(
>> unlabeledTest.values(), Vector.class);
>>
>> *The error is:*
>> Exception in thread "main" java.lang.ClassCastException:
>> org.apache.spark.mllib.linalg.VectorUDT cannot be cast to
>> org.apache.spark.sql.types.StructType
>>
>> I know that if I use the LabeledPoint type there is no problem. But the
>> data has no label; I want to predict the label, which is why I am applying
>> the model to it.
>>
>> Is there a way to handle this?
>> Thanks.
>>
>> Best,
>> yasemin
>> --
>> hiç ender hiç

--
hiç ender hiç