Please check this section in the user guide: http://spark.apache.org/docs/latest/sql-programming-guide.html#inferring-the-schema-using-reflection
You need `import sqlContext.implicits._` to use `toDF()`.

-Xiangrui

On Mon, Mar 16, 2015 at 2:34 PM, Jay Katukuri <jkatuk...@apple.com> wrote:
> Hi Xiangrui,
> Thanks a lot for the quick reply.
>
> I am still facing an issue.
>
> I have tried the code snippet that you suggested:
>
>     val ratings = purchase.map { line =>
>       line.split(',') match { case Array(user, item, rate) =>
>         (user.toInt, item.toInt, rate.toFloat)
>     }.toDF("user", "item", "rate")}
>
> For this, I got the error below:
>
>     error: ';' expected but '.' found.
>     [INFO] }.toDF("user", "item", "rate")}
>     [INFO]  ^
>
> When I tried the code below:
>
>     val ratings = purchase.map ( line =>
>       line.split(',') match { case Array(user, item, rate) =>
>         (user.toInt, item.toInt, rate.toFloat)
>       }).toDF("user", "item", "rate")
>
> I got this error:
>
>     error: value toDF is not a member of org.apache.spark.rdd.RDD[(Int, Int, Float)]
>     [INFO] possible cause: maybe a semicolon is missing before `value toDF'?
>     [INFO] }).toDF("user", "item", "rate")
>
> I have looked at the document that you shared and tried the following code:
>
>     case class Record(user: Int, item: Int, rate: Double)
>     val ratings = purchase.map(_.split(','))
>       .map(r => Record(r(0).toInt, r(1).toInt, r(2).toDouble))
>       .toDF("user", "item", "rate")
>
> For this, I got the error below:
>
>     error: value toDF is not a member of org.apache.spark.rdd.RDD[Record]
>
> Appreciate your help!
>
> Thanks,
> Jay
>
> On Mar 16, 2015, at 11:35 AM, Xiangrui Meng <men...@gmail.com> wrote:
>
> Try this:
>
>     val ratings = purchase.map { line =>
>       line.split(',') match { case Array(user, item, rate) =>
>         (user.toInt, item.toInt, rate.toFloat)
>     }.toDF("user", "item", "rate")
>
> Doc for DataFrames:
> http://spark.apache.org/docs/latest/sql-programming-guide.html
>
> -Xiangrui
>
> On Mon, Mar 16, 2015 at 9:08 AM, jaykatukuri <jkatuk...@apple.com> wrote:
>
> Hi all,
> I am trying to use the new ALS implementation under
> org.apache.spark.ml.recommendation.ALS.
>
> The new method to invoke for training seems to be:
>
>     override def fit(dataset: DataFrame, paramMap: ParamMap): ALSModel
>
> How do I create a DataFrame object from a ratings data set that is on HDFS?
>
> Whereas the method in the old ALS implementation under
> org.apache.spark.mllib.recommendation.ALS was:
>
>     def train(
>         ratings: RDD[Rating],
>         rank: Int,
>         iterations: Int,
>         lambda: Double,
>         blocks: Int,
>         seed: Long
>       ): MatrixFactorizationModel
>
> My code to run the old ALS train method is as below:
>
>     val sc = new SparkContext(conf)
>
>     val pfile = args(0)
>     val purchase = sc.textFile(pfile)
>     val ratings = purchase.map(_.split(',') match {
>       case Array(user, item, rate) =>
>         Rating(user.toInt, item.toInt, rate.toInt)
>     })
>
>     val model = ALS.train(ratings, rank, numIterations, 0.01)
>
> Now, for the new ALS fit method, I am trying to use the code below, but I am
> getting a compilation error:
>
>     val als = new ALS()
>       .setRank(rank)
>       .setRegParam(regParam)
>       .setImplicitPrefs(implicitPrefs)
>       .setNumUserBlocks(numUserBlocks)
>       .setNumItemBlocks(numItemBlocks)
>
>     val sc = new SparkContext(conf)
>
>     val pfile = args(0)
>     val purchase = sc.textFile(pfile)
>     val ratings = purchase.map(_.split(',') match {
>       case Array(user, item, rate) =>
>         Rating(user.toInt, item.toInt, rate.toInt)
>     })
>
>     val model = als.fit(ratings.toDF())
>
> I get an error that the method toDF() is not a member of
> org.apache.spark.rdd.RDD[org.apache.spark.ml.recommendation.ALS.Rating[Int]].
>
> Appreciate the help!
>
> Thanks,
> Jay
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/RDD-to-DataFrame-for-using-ALS-under-org-apache-spark-ml-recommendation-ALS-tp22083.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
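[Editor's note: the two fixes the thread converges on are (1) the match block in the first snippet is missing its own closing brace before `.toDF`, which caused the "';' expected but '.' found" error, and (2) `toDF()` only exists once `import sqlContext.implicits._` is in scope. A minimal sketch of the corrected brace structure, using a plain Scala List in place of the RDD so it runs without Spark; `purchase` here is illustrative sample data, not from the thread:]

```scala
// Stand-in for the RDD of CSV lines; in Spark this would be sc.textFile(pfile).
val purchase = List("1,101,4.5", "2,102,3.0")

// Corrected closure: the match expression is closed with its own brace
// BEFORE the map's closing brace -- the missing `}` was what produced
// the "';' expected but '.' found" compile error in the thread.
val ratings = purchase.map { line =>
  line.split(',') match {
    case Array(user, item, rate) =>
      (user.toInt, item.toInt, rate.toFloat)
  }
}

// On a real RDD you would additionally need, per Xiangrui's reply,
//   import sqlContext.implicits._
// in scope before calling .toDF("user", "item", "rate").
println(ratings)
```

The same brace-balance fix applies to the `Record` case-class variant: once the implicits import is present, `.toDF(...)` on `RDD[Record]` (or on the tuple RDD) should resolve.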
---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org