Github user debasish83 commented on the pull request:
https://github.com/apache/spark/pull/3098#issuecomment-88346990
I reran the MAP (mean average precision) computation on MovieLens (ml-1m) with varying ranks:
Example run:
./bin/spark-submit --master spark://TUSCA09LMLVT00C.local:7077 --class
org.apache.spark.examples.mllib.MovieLensALS --jars
~/.m2/repository/com/github/scopt/scopt_2.10/3.2.0/scopt_2.10-3.2.0.jar
--total-executor-cores 4 --executor-memory 4g --driver-memory 1g
./examples/target/spark-examples_2.10-1.4.0-SNAPSHOT.jar --lambda 0.065
--metrics map ~/datasets/ml-1m/ratings.dat
rank = default
Got 1000209 ratings from 6040 users on 3706 movies.
Training: 800187, test: 200022.
Test users 6035 MAP 0.03499984595868497
rank = 25
Got 1000209 ratings from 6040 users on 3706 movies.
Training: 799385, test: 200824.
Test users 6034 MAP 0.042580954047373255
rank = 50
Got 1000209 ratings from 6040 users on 3706 movies.
Training: 800289, test: 199920.
Test users 6036 MAP 0.048958415806933275
rank = 100
Got 1000209 ratings from 6040 users on 3706 movies.
Training: 801148, test: 199061.
Test users 6038 MAP 0.05503487765882986
MAP increases with rank (from about 0.035 at the default rank to about 0.055 at rank 100), and the numbers are consistent with my earlier runs.
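For readers unfamiliar with the metric, here is a minimal sketch of how mean average precision is commonly computed from per-user ranked recommendations and held-out relevant items. This is an illustration, not the code in this PR: the function names `average_precision` and `mean_average_precision` are hypothetical, and it assumes each ranked list contains no duplicate items and that a user's precision sum is normalized by the number of relevant items.

```python
from typing import Hashable, Iterable, Sequence, Set, Tuple


def average_precision(predicted: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Average precision for one user (hypothetical helper).

    predicted: items ranked best-first, assumed to contain no duplicates.
    relevant:  held-out items the user actually interacted with.
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for i, item in enumerate(predicted):
        if item in relevant:
            hits += 1
            # Precision at this cut-off: hits so far / items seen so far.
            precision_sum += hits / (i + 1)
    # Assumption: normalize by the number of relevant items.
    return precision_sum / len(relevant)


def mean_average_precision(
    per_user: Iterable[Tuple[Sequence[Hashable], Set[Hashable]]]
) -> float:
    """Mean of the per-user average precision over all test users."""
    scores = [average_precision(pred, rel) for pred, rel in per_user]
    return sum(scores) / len(scores) if scores else 0.0
```

With this definition, a user whose relevant items rarely appear near the top of the ranked list contributes a score close to zero, which is one reason MAP values on sparse rating data tend to be small.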