Re: Spark 1.3.1 - SQL Issues
Thanks a bunch

On 21 May 2015 07:11, "Davies Liu" wrote:
> The docs had been updated.
>
> You should convert the DataFrame to RDD by `df.rdd`
Re: Spark 1.3.1 - SQL Issues
The docs had been updated.

You should convert the DataFrame to RDD by `df.rdd`

On Mon, Apr 20, 2015 at 5:23 AM, ayan guha wrote:
> Hi
> Just upgraded to Spark 1.3.1.
>
> I am getting a warning
>
> Warning (from warnings module):
>   File "D:\spark\spark-1.3.1-bin-hadoop2.6\spark-1.3.1-bin-hadoop2.6\spark-1.3.1-bin-hadoop2.6\python\pyspark\sql\context.py", line 191
>     warnings.warn("inferSchema is deprecated, please use createDataFrame instead")
> UserWarning: inferSchema is deprecated, please use createDataFrame instead
>
> However, documentation still says to use inferSchema.
> Here: http://spark.apache.org/docs/latest/sql-programming-guide.htm in
> section
>
> Also, I am getting an error in mllib.ALS.train function when passing
> dataframe (do I need to convert the DF to RDD?)
>
> Code:
> training = ssc.sql("select userId,movieId,rating from ratings where partitionKey < 6").cache()
> print type(training)
> model = ALS.train(training,rank,numIter,lmbda)
>
> Error:
>
> Rank:8 Lmbda:1.0 iteration:10
>
> Traceback (most recent call last):
>   File "D:\Project\Spark\code\movie_sql.py", line 109, in
>     bestConf = getBestModel(sc,ssc,training,validation,validationNoRating)
>   File "D:\Project\Spark\code\movie_sql.py", line 54, in getBestModel
>     model = ALS.train(trainingRDD,rank,numIter,lmbda)
>   File "D:\spark\spark-1.3.1-bin-hadoop2.6\spark-1.3.1-bin-hadoop2.6\spark-1.3.1-bin-hadoop2.6\python\pyspark\mllib\recommendation.py", line 139, in train
>     model = callMLlibFunc("trainALSModel", cls._prepare(ratings), rank,
> iterations,
>   File "D:\spark\spark-1.3.1-bin-hadoop2.6\spark-1.3.1-bin-hadoop2.6\spark-1.3.1-bin-hadoop2.6\python\pyspark\mllib\recommendation.py", line 127, in _prepare
>     assert isinstance(ratings, RDD), "ratings should be RDD"
> AssertionError: ratings should be RDD
>
> --
> Best Regards,
> Ayan Guha
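
For anyone landing on this thread later, here is a rough sketch of what the two suggestions might look like in PySpark. It is only an illustration, not code from the original poster: the helper names `to_dataframe`, `row_to_rating` and `train_from_dataframe` are made up for this example, and it assumes the DataFrame has exactly the three columns userId, movieId, rating in that order, as in the query above.

```python
def to_dataframe(sql_context, rdd):
    # createDataFrame is the replacement named in the deprecation warning
    # for inferSchema; sql_context is assumed to be an existing SQLContext.
    return sql_context.createDataFrame(rdd)


def row_to_rating(row):
    # A Row behaves like a tuple, so it can be unpacked positionally.
    # Assumes the column order (userId, movieId, rating).
    user_id, movie_id, rating = row
    return (int(user_id), int(movie_id), float(rating))


def train_from_dataframe(df, rank, num_iter, lmbda):
    # Imported inside the function so row_to_rating can be used
    # (and tested) on a machine without Spark installed.
    from pyspark.mllib.recommendation import ALS

    # df.rdd gives the underlying RDD of Row objects, which satisfies the
    # isinstance(ratings, RDD) assertion shown in the traceback.
    ratings = df.rdd.map(row_to_rating)
    return ALS.train(ratings, rank, num_iter, lmbda)
```

With these in place, the failing call would become something like `model = train_from_dataframe(training, rank, numIter, lmbda)`. Mapping each Row to a plain tuple is a cautious choice rather than a confirmed requirement; passing `training.rdd` directly may also work.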