GitHub user debasish83 opened a pull request:
https://github.com/apache/spark/pull/5869
[SPARK-4231][MLLIB][Examples] MAP calculation added to examples.MovieLensALS
The MAP calculation driver for MovieLensALS was not part of the SPARK-3066
merge. This PR adds the driver.
@mengxr the results have changed compared to my earlier runs. Do you know
whether any internal ALS tuning has changed? I remember the per-user
regularization change for implicit feedback, but that should not affect the
explicit-feedback results.
MAP calculation:
./bin/spark-submit --master spark://TUSCA09LMLVT00C.local:7077 \
  --class org.apache.spark.examples.mllib.MovieLensALS \
  --jars ~/.m2/repository/com/github/scopt/scopt_2.10/3.2.0/scopt_2.10-3.2.0.jar \
  --total-executor-cores 4 --executor-memory 4g --driver-memory 1g \
  ./examples/target/spark-examples_2.10-1.4.0-SNAPSHOT.jar \
  --lambda 0.065 --metrics map ~/datasets/ml-1m/ratings.dat
Got 1000209 ratings from 6040 users on 3706 movies.
Training: 800163, test: 200046.
Test users 6035 MAP 0.019697998843987024
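For readers unfamiliar with the metric, here is a minimal sketch of mean average precision (MAP) over per-user ranked recommendations. It assumes one common definition, in which each user's precision sum is normalized by the number of relevant items for that user. It is plain Python rather than the Scala driver in this PR, and the function names `average_precision` and `mean_average_precision` are hypothetical.

```python
def average_precision(predicted, relevant):
    """Average precision of one ranked list against a set of relevant items.

    Assumption: the precision sum is divided by the number of relevant
    items, and a user with no relevant items scores 0.0.
    """
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(predicted, start=1):
        if item in relevant:
            hits += 1
            # Precision at this rank, counted only at positions with a hit.
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(pairs):
    """Mean of average_precision over (predicted, relevant) pairs, one per user."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    return sum(average_precision(p, r) for p, r in pairs) / len(pairs)
```

With this definition, a user whose relevant items appear at ranks 1 and 3 of the recommendation list scores (1/1 + 2/3) / 2, and a user with no hits scores 0.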
RMSE calculation:
./bin/spark-submit --master spark://TUSCA09LMLVT00C.local:7077 \
  --class org.apache.spark.examples.mllib.MovieLensALS \
  --jars ~/.m2/repository/com/github/scopt/scopt_2.10/3.2.0/scopt_2.10-3.2.0.jar \
  --total-executor-cores 4 --executor-memory 4g --driver-memory 1g \
  ./examples/target/spark-examples_2.10-1.4.0-SNAPSHOT.jar \
  --lambda 0.065 --metrics rmse ~/datasets/ml-1m/ratings.dat
Got 1000209 ratings from 6040 users on 3706 movies.
Training: 800116, test: 200093.
Test RMSE = 0.8558133665979457
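As a reference point for the number above, here is a minimal sketch of root mean squared error over (predicted, actual) rating pairs. It is plain Python under the standard definition, not the code path used by the example driver, and the function name `rmse` is hypothetical.

```python
import math


def rmse(pairs):
    """Root mean squared error over an iterable of (predicted, actual) pairs."""
    squared_errors = [(predicted - actual) ** 2 for predicted, actual in pairs]
    if not squared_errors:
        raise ValueError("rmse needs at least one (predicted, actual) pair")
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

For example, `rmse([(3.0, 3.0), (4.0, 2.0)])` averages squared errors of 0 and 4 and takes the square root of 2.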
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/debasish83/spark irmetrics
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/5869.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #5869
----
commit 9b3951f558e5673eb475c575f14876421b5a3abc
Author: Debasish Das <[email protected]>
Date: 2014-11-05T01:23:09Z
validate user/product on MovieLens dataset through user input and compute
map measure along with rmse
commit cd3ab31cb9b244bae2b45396a6269ed1dc59151b
Author: Debasish Das <[email protected]>
Date: 2014-11-05T22:43:11Z
merged with AbstractParams serialization bug
commit 4bbae0f248ca8747b47ecf852d5aba19c9b39dab
Author: Debasish Das <[email protected]>
Date: 2014-11-05T23:23:02Z
comments fixed as per scalastyle
commit 9fa063e1eb172d68248e03797a54acc738543592
Author: Debasish Das <[email protected]>
Date: 2014-11-06T00:05:24Z
import scala.math.round
commit 10cbb37a7881867d801ae6630ffc0d09b3feebf9
Author: Debasish Das <[email protected]>
Date: 2014-11-08T06:31:40Z
provide ratio for topN product validation; generate MAP and prec@k metric
for movielens dataset
commit f38a1b59e27907f2aa9bd732c5f9147b738d3a0f
Author: Debasish Das <[email protected]>
Date: 2014-11-08T06:45:13Z
use sampleByKey for per user sampling
commit d144f57a58c9424365f1242f90961386c016641e
Author: Debasish Das <[email protected]>
Date: 2014-11-12T04:56:46Z
recommendAll API to MatrixFactorizationModel, uses topK finding using
BoundedPriorityQueue similar to RDD.top
commit 7163a5c21b394d8bd89694a9f08aa1b446c71956
Author: Debasish Das <[email protected]>
Date: 2014-11-19T21:58:45Z
Added API for batch user and product recommendation; MAP calculation for
product recommendation per user using randomized split
commit 3f97c499004aa58dfa1b51b8d2cbd6e5776f5fb1
Author: Debasish Das <[email protected]>
Date: 2014-11-19T23:38:45Z
fixed spark coding style for imports
commit ee9957144bc2d145c91fc4a4b894ccd2ee6bc2b9
Author: Debasish Das <[email protected]>
Date: 2015-04-01T01:52:27Z
addressed initial review comments;merged with master;added tests for batch
predict APIs in matrix factorization
commit 98fa4243dc6041290bdde51e1e899a8be7576470
Author: Debasish Das <[email protected]>
Date: 2015-04-01T01:59:57Z
updated with master
commit 3a0c4eb7f81ee0845f4945d395f6652c965f941b
Author: Debasish Das <[email protected]>
Date: 2015-04-01T04:31:01Z
updated with spark master
commit 3640409ac2dd2ea7ab5e67a520726f2387d137e3
Author: Debasish Das <[email protected]>
Date: 2015-05-02T23:17:45Z
MAP calculation driver added to MovieLensALS
----