Some additional context: since I am using features of Spark 1.3.0, I downloaded Spark 1.3.0 and used spark-submit from there. The cluster is still on Spark 1.2.0. So it looks to me that at runtime the executors could not find some Spark 1.3.0 libraries, even though I ran spark-submit from my downloaded Spark 1.3.0.
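A likely cause here is a Scala binary-version mismatch rather than missing libraries: spark-sql_2.11-1.3.0.jar (passed in the command quoted below) is built for Scala 2.11, while the default Spark 1.3.0 download and a Spark 1.2.0 cluster run Scala 2.10, and a NoSuchMethodError on scala.reflect.api.JavaUniverse.runtimeMirror is the classic symptom of mixing the two. A minimal sbt sketch of a build that keeps the versions aligned (the thread does not show the actual build file, so the name and versions here are assumptions):

    // build.sbt (hypothetical) -- match the Scala version of the Spark
    // binaries that will run the job (2.10.x for stock Spark 1.3.0)
    name := "alsnew"
    scalaVersion := "2.10.4"
    libraryDependencies ++= Seq(
      // "provided": the cluster supplies these jars at runtime
      "org.apache.spark" %% "spark-core"  % "1.3.0" % "provided",
      "org.apache.spark" %% "spark-sql"   % "1.3.0" % "provided",
      "org.apache.spark" %% "spark-mllib" % "1.3.0" % "provided"
    )

The %% operator appends the Scala binary suffix (_2.10) to each artifact name, so a _2.11 jar cannot sneak into a _2.10 runtime.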
On Apr 6, 2015, at 1:37 PM, Jay Katukuri <jkatuk...@apple.com> wrote:

> Here is the command that I have used:
>
> spark-submit —class packagename.ALSNew --num-executors 100 --master yarn ALSNew.jar -jar spark-sql_2.11-1.3.0.jar hdfs://input_path
>
> Btw, I could run the old ALS in the mllib package.
>
> On Apr 6, 2015, at 12:32 PM, Xiangrui Meng <men...@gmail.com> wrote:
>
>> So ALSNew.scala is your own application. Did you add it with spark-submit or spark-shell? The correct command should look like:
>>
>> spark-submit --class your.package.name.ALSNew ALSNew.jar [options]
>>
>> Please check the documentation:
>> http://spark.apache.org/docs/latest/submitting-applications.html
>>
>> -Xiangrui
>>
>> On Mon, Apr 6, 2015 at 12:27 PM, Jay Katukuri <jkatuk...@apple.com> wrote:
>>
>>> Hi,
>>>
>>> Here is the stack trace:
>>>
>>> Exception in thread "main" java.lang.NoSuchMethodError: scala.reflect.api.JavaUniverse.runtimeMirror(Ljava/lang/ClassLoader;)Lscala/reflect/api/JavaUniverse$JavaMirror;
>>>   at ALSNew$.main(ALSNew.scala:35)
>>>   at ALSNew.main(ALSNew.scala)
>>>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>   at java.lang.reflect.Method.invoke(Method.java:483)
>>>   at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:569)
>>>   at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:166)
>>>   at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:189)
>>>   at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:110)
>>>   at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>>>
>>> Thanks,
>>> Jay
>>>
>>> On Apr 6, 2015, at 12:24 PM, Xiangrui Meng <men...@gmail.com> wrote:
>>>
>>> Please attach the full stack trace. -Xiangrui
>>>
>>> On Mon, Apr 6, 2015 at 12:06 PM, Jay Katukuri <jkatuk...@apple.com> wrote:
>>>
>>> Hi all,
>>>
>>> I got a runtime error while running the ALS:
>>>
>>> Exception in thread "main" java.lang.NoSuchMethodError: scala.reflect.api.JavaUniverse.runtimeMirror(Ljava/lang/ClassLoader;)Lscala/reflect/api/JavaUniverse$JavaMirror;
>>>
>>> The error that I am getting is at the following code:
>>>
>>> val ratings = purchase.map ( line =>
>>>   line.split(',') match { case Array(user, item, rate) =>
>>>     (user.toInt, item.toInt, rate.toFloat)
>>> }).toDF()
>>>
>>> Any help is appreciated! I have tried passing the spark-sql jar using -jar spark-sql_2.11-1.3.0.jar.
>>>
>>> Thanks,
>>> Jay
>>>
>>> On Mar 17, 2015, at 12:50 PM, Xiangrui Meng <men...@gmail.com> wrote:
>>>
>>> Please remember to copy the user list next time. I might not be able to respond quickly. There are many others who can help or who can benefit from the discussion. Thanks! -Xiangrui
>>>
>>> On Tue, Mar 17, 2015 at 12:04 PM, Jay Katukuri <jkatuk...@apple.com> wrote:
>>>
>>> Great, Xiangrui. It works now.
>>>
>>> Sorry that I needed to bug you :)
>>>
>>> Jay
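Two details in the Apr 6 command quoted near the top are worth flagging: `—class` contains an em dash (likely a mail-client substitution for `--class`), and `-jar` is not a spark-submit option; extra jars go in `--jars`, and all options must precede the application jar, with application arguments after it. A sketch of what the corrected invocation might look like, reusing the placeholder names from the thread:

    spark-submit \
      --class packagename.ALSNew \
      --master yarn \
      --num-executors 100 \
      --jars spark-sql_2.11-1.3.0.jar \
      ALSNew.jar \
      hdfs://input_path

Note that, per the version note at the top of the thread, the extra jar would also need to be a _2.10 build of spark-sql to match a Scala 2.10 runtime.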
>>> On Mar 17, 2015, at 11:48 AM, Xiangrui Meng <men...@gmail.com> wrote:
>>>
>>> Please check this section in the user guide:
>>> http://spark.apache.org/docs/latest/sql-programming-guide.html#inferring-the-schema-using-reflection
>>>
>>> You need `import sqlContext.implicits._` to use `toDF()`.
>>>
>>> -Xiangrui
>>>
>>> On Mon, Mar 16, 2015 at 2:34 PM, Jay Katukuri <jkatuk...@apple.com> wrote:
>>>
>>> Hi Xiangrui,
>>>
>>> Thanks a lot for the quick reply. I am still facing an issue.
>>>
>>> I have tried the code snippet that you suggested:
>>>
>>> val ratings = purchase.map { line =>
>>>   line.split(',') match { case Array(user, item, rate) =>
>>>     (user.toInt, item.toInt, rate.toFloat)
>>> }.toDF("user", "item", "rate”)}
>>>
>>> For this, I got the below error:
>>>
>>> error: ';' expected but '.' found.
>>> [INFO] }.toDF("user", "item", "rate”)}
>>> [INFO]  ^
>>>
>>> When I tried the below code:
>>>
>>> val ratings = purchase.map ( line =>
>>>   line.split(',') match { case Array(user, item, rate) =>
>>>     (user.toInt, item.toInt, rate.toFloat)
>>> }).toDF("user", "item", "rate")
>>>
>>> I got:
>>>
>>> error: value toDF is not a member of org.apache.spark.rdd.RDD[(Int, Int, Float)]
>>> [INFO] possible cause: maybe a semicolon is missing before `value toDF'?
>>> [INFO] }).toDF("user", "item", "rate")
>>>
>>> I have looked at the document that you shared and tried the following code:
>>>
>>> case class Record(user: Int, item: Int, rate: Double)
>>> val ratings = purchase.map(_.split(',')).map(r => Record(r(0).toInt, r(1).toInt, r(2).toDouble)).toDF("user", "item", "rate")
>>>
>>> For this, I got the below error:
>>>
>>> error: value toDF is not a member of org.apache.spark.rdd.RDD[Record]
>>>
>>> Appreciate your help!
>>>
>>> Thanks,
>>> Jay
>>>
>>> On Mar 16, 2015, at 11:35 AM, Xiangrui Meng <men...@gmail.com> wrote:
>>>
>>> Try this:
>>>
>>> val ratings = purchase.map { line =>
>>>   line.split(',') match { case Array(user, item, rate) =>
>>>     (user.toInt, item.toInt, rate.toFloat)
>>> }.toDF("user", "item", "rate")
>>>
>>> Doc for DataFrames:
>>> http://spark.apache.org/docs/latest/sql-programming-guide.html
>>>
>>> -Xiangrui
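As quoted, the "Try this" snippet above is missing the closing brace of the map block before `.toDF`, which, together with the curly quote in `"rate”`, would account for the "';' expected but '.' found" error upthread; the two "value toDF is not a member" errors come from the missing implicits import that the Mar 17 reply points out. A corrected sketch, assuming `sc` is the SparkContext and `purchase` is the ratings RDD from the thread:

    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    import sqlContext.implicits._  // brings toDF() into scope for RDDs

    val ratings = purchase.map { line =>
      line.split(',') match {
        case Array(user, item, rate) =>
          (user.toInt, item.toInt, rate.toFloat)
      }
    }.toDF("user", "item", "rate")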
>>> On Mon, Mar 16, 2015 at 9:08 AM, jaykatukuri <jkatuk...@apple.com> wrote:
>>>
>>> Hi all,
>>>
>>> I am trying to use the new ALS implementation under org.apache.spark.ml.recommendation.ALS. The new method to invoke for training seems to be
>>>
>>> override def fit(dataset: DataFrame, paramMap: ParamMap): ALSModel
>>>
>>> How do I create a DataFrame object from a ratings data set that is on HDFS? By contrast, the method in the old ALS implementation under org.apache.spark.mllib.recommendation.ALS was
>>>
>>> def train(
>>>     ratings: RDD[Rating],
>>>     rank: Int,
>>>     iterations: Int,
>>>     lambda: Double,
>>>     blocks: Int,
>>>     seed: Long
>>> ): MatrixFactorizationModel
>>>
>>> My code to run the old ALS train method is as below:
>>>
>>> val sc = new SparkContext(conf)
>>>
>>> val pfile = args(0)
>>> val purchase = sc.textFile(pfile)
>>> val ratings = purchase.map(_.split(',') match { case Array(user, item, rate) =>
>>>   Rating(user.toInt, item.toInt, rate.toInt)
>>> })
>>>
>>> val model = ALS.train(ratings, rank, numIterations, 0.01)
>>>
>>> Now, for the new ALS fit method, I am trying to use the below code, but getting a compilation error:
>>>
>>> val als = new ALS()
>>>   .setRank(rank)
>>>   .setRegParam(regParam)
>>>   .setImplicitPrefs(implicitPrefs)
>>>   .setNumUserBlocks(numUserBlocks)
>>>   .setNumItemBlocks(numItemBlocks)
>>>
>>> val sc = new SparkContext(conf)
>>>
>>> val pfile = args(0)
>>> val purchase = sc.textFile(pfile)
>>> val ratings = purchase.map(_.split(',') match { case Array(user, item, rate) =>
>>>   Rating(user.toInt, item.toInt, rate.toInt)
>>> })
>>>
>>> val model = als.fit(ratings.toDF())
>>>
>>> I get an error that the method toDF() is not a member of org.apache.spark.rdd.RDD[org.apache.spark.ml.recommendation.ALS.Rating[Int]].
>>>
>>> Appreciate the help!
>>>
>>> Thanks,
>>> Jay
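Pulling the pieces of the thread together, a minimal end-to-end sketch of the new ml API usage in Spark 1.3.0 might look like the following. The column names and the plain fit(DataFrame) call are assumptions based on the 1.3 documentation, and the rank/regParam values are placeholders:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.ml.recommendation.ALS
    import org.apache.spark.sql.SQLContext

    object ALSNew {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("ALSNew"))
        val sqlContext = new SQLContext(sc)
        import sqlContext.implicits._  // required for toDF()

        // args(0): e.g. an hdfs:// path to "user,item,rating" lines
        val purchase = sc.textFile(args(0))
        val ratings = purchase.map { line =>
          line.split(',') match {
            case Array(user, item, rate) =>
              (user.toInt, item.toInt, rate.toFloat)
          }
        }.toDF("user", "item", "rating")  // assumed default ml ALS columns

        val als = new ALS()
          .setRank(10)        // placeholder value
          .setRegParam(0.01)  // placeholder value
        val model = als.fit(ratings)
      }
    }

Built and submitted against matching Scala and Spark versions, per the notes earlier in the thread, this avoids both the toDF compile error and the runtime NoSuchMethodError.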