Hi Yanbo,
You mean pyspark.mllib.recommendation, right? That is the one used in the 
official tutorial.

Thank you,

From: Yanbo Liang <yblia...@gmail.com>
Date: Friday, 4 December 2015 03:17
To: Felix Cheung <felixcheun...@hotmail.com>
Cc: Roberto Pagliari <roberto.pagli...@asos.com>, 
"user@spark.apache.org" <user@spark.apache.org>
Subject: Re: Python API Documentation Mismatch

Hi Roberto,

There are two ALS implementations available: 
ml.recommendation.ALS
<http://spark.apache.org/docs/latest/api/python/pyspark.ml.html#module-pyspark.ml.recommendation>
and 
mllib.recommendation.ALS
<http://spark.apache.org/docs/latest/api/python/pyspark.mllib.html#module-pyspark.mllib.recommendation>.
They have different usage and methods. I know it's confusing that Spark provides 
two versions of the same algorithm. I strongly recommend using the ALS 
algorithm in the ML package.

Yanbo
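
To make the difference between the two packages concrete, here is a minimal sketch of how each ALS is typically called. The function names, the import aliases MLlibALS and MLALS, and the default argument values are illustrative assumptions, not part of either API; the imports sit inside the functions so the snippet loads even where pyspark is not installed.

```python
def train_with_mllib(ratings_rdd, rank=10, num_iterations=10):
    """RDD-based API (pyspark.mllib): train() is called on the ALS class itself.

    ratings_rdd is assumed to be an RDD of Rating objects or
    (user, product, rating) tuples, as in the tutorial.
    """
    from pyspark.mllib.recommendation import ALS as MLlibALS
    return MLlibALS.train(ratings_rdd, rank, num_iterations)


def train_with_ml(ratings_df, rank=10, max_iter=5):
    """DataFrame-based API (pyspark.ml): build an estimator, then call fit().

    ratings_df is assumed to be a DataFrame with "user", "item" and
    "rating" columns, as in the API documentation example.
    """
    from pyspark.ml.recommendation import ALS as MLALS
    als = MLALS(rank=rank, maxIter=max_iter,
                userCol="user", itemCol="item", ratingCol="rating")
    return als.fit(ratings_df)
```

Importing each class under its own alias avoids the situation where a bare ALS name refers to one package while the calling code follows the documentation of the other.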

2015-12-04 1:31 GMT+08:00 Felix Cheung <felixcheun...@hotmail.com>:
Please open an issue in JIRA, thanks!





On Thu, Dec 3, 2015 at 3:03 AM -0800, "Roberto Pagliari" 
<roberto.pagli...@asos.com> wrote:

Hello,
I believe there is a mismatch between the API documentation (1.5.2) and the 
software currently available.

Not all functions mentioned here
http://spark.apache.org/docs/latest/api/python/pyspark.ml.html#module-pyspark.ml.recommendation

are, in fact, available. For example, the code below from the tutorial works:

# Build the recommendation model using Alternating Least Squares
rank = 10
numIterations = 10
model = ALS.train(ratings, rank, numIterations)

The alternative shown in the API documentation, however, does not: it complains 
that ALS takes no arguments. Also, by inspecting the module with Python 
utilities, I could not find several methods mentioned in the API docs.

>>> df = sqlContext.createDataFrame(
...     [(0, 0, 4.0), (0, 1, 2.0), (1, 1, 3.0), (1, 2, 4.0), (2, 1, 1.0), (2, 2, 5.0)],
...     ["user", "item", "rating"])
>>> als = ALS(rank=10, maxIter=5)
>>> model = als.fit(df)


Thank you,

