GitHub user debasish83 opened a pull request:

    https://github.com/apache/spark/pull/5869

    [SPARK-4231][MLLIB][Examples] MAP calculation added to examples.MovieLensALS

    The MAP calculation driver for MovieLensALS was not part of the SPARK-3066
    merge. This PR adds the driver.
    
    @mengxr, the results have changed compared to my old runs. Any idea whether
    some internal ALS tuning has changed? (I remember a per-user regularization
    change for implicit feedback, but that should not change explicit results.)
    
    MAP calculation:
    
    ./bin/spark-submit --master spark://TUSCA09LMLVT00C.local:7077 \
      --class org.apache.spark.examples.mllib.MovieLensALS \
      --jars ~/.m2/repository/com/github/scopt/scopt_2.10/3.2.0/scopt_2.10-3.2.0.jar \
      --total-executor-cores 4 --executor-memory 4g --driver-memory 1g \
      ./examples/target/spark-examples_2.10-1.4.0-SNAPSHOT.jar \
      --lambda 0.065 --metrics map ~/datasets/ml-1m/ratings.dat
    
    Got 1000209 ratings from 6040 users on 3706 movies.
    Training: 800163, test: 200046.
    Test users 6035 MAP 0.019697998843987024
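
For reference, a minimal sketch of how a mean average precision (MAP) figure like the one above is commonly defined. This is not the code in this PR: the function names and the input layout (one ranked recommendation list and one held-out relevant set per test user) are assumptions made for illustration, and the ranked list is assumed to contain no duplicates.

```python
def average_precision(ranked, relevant):
    """Average precision for one user.

    ranked   -- recommended item ids, best first (assumed duplicate-free)
    relevant -- held-out item ids the user actually interacted with
    """
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            # Precision at this cut-off, counted only at positions that hit.
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(per_user):
    """Mean of the per-user average precision over (ranked, relevant) pairs."""
    scores = [average_precision(ranked, relevant) for ranked, relevant in per_user]
    return sum(scores) / len(scores) if scores else 0.0
```

With this definition, a user whose held-out items never appear in the recommended list contributes 0 to the mean, which is one reason MAP on a sparse test split tends to be small.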
    
    RMSE calculation:
    
    ./bin/spark-submit --master spark://TUSCA09LMLVT00C.local:7077 \
      --class org.apache.spark.examples.mllib.MovieLensALS \
      --jars ~/.m2/repository/com/github/scopt/scopt_2.10/3.2.0/scopt_2.10-3.2.0.jar \
      --total-executor-cores 4 --executor-memory 4g --driver-memory 1g \
      ./examples/target/spark-examples_2.10-1.4.0-SNAPSHOT.jar \
      --lambda 0.065 --metrics rmse ~/datasets/ml-1m/ratings.dat
    
    Got 1000209 ratings from 6040 users on 3706 movies.
    Training: 800116, test: 200093.
    Test RMSE = 0.8558133665979457
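
For comparison with the ranking metric, a minimal sketch of the standard root-mean-square error over (predicted, actual) rating pairs. The function name and input layout are assumptions for illustration and are not taken from the MovieLensALS example itself.

```python
import math


def rmse(pairs):
    """Root-mean-square error over an iterable of (predicted, actual) ratings."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("rmse needs at least one (predicted, actual) pair")
    squared_error = sum((predicted - actual) ** 2 for predicted, actual in pairs)
    return math.sqrt(squared_error / len(pairs))
```

Unlike MAP, this scores only the ratings present in the test split and says nothing about how the items would be ranked for a user.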


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/debasish83/spark irmetrics

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/5869.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #5869
    
----
commit 9b3951f558e5673eb475c575f14876421b5a3abc
Author: Debasish Das <[email protected]>
Date:   2014-11-05T01:23:09Z

    validate user/product on MovieLens dataset through user input and compute
    map measure along with rmse

commit cd3ab31cb9b244bae2b45396a6269ed1dc59151b
Author: Debasish Das <[email protected]>
Date:   2014-11-05T22:43:11Z

    merged with AbstractParams serialization bug

commit 4bbae0f248ca8747b47ecf852d5aba19c9b39dab
Author: Debasish Das <[email protected]>
Date:   2014-11-05T23:23:02Z

    comments fixed as per scalastyle

commit 9fa063e1eb172d68248e03797a54acc738543592
Author: Debasish Das <[email protected]>
Date:   2014-11-06T00:05:24Z

    import scala.math.round

commit 10cbb37a7881867d801ae6630ffc0d09b3feebf9
Author: Debasish Das <[email protected]>
Date:   2014-11-08T06:31:40Z

    provide ratio for topN product validation; generate MAP and prec@k metric
    for movielens dataset

commit f38a1b59e27907f2aa9bd732c5f9147b738d3a0f
Author: Debasish Das <[email protected]>
Date:   2014-11-08T06:45:13Z

    use sampleByKey for per user sampling

commit d144f57a58c9424365f1242f90961386c016641e
Author: Debasish Das <[email protected]>
Date:   2014-11-12T04:56:46Z

    recommendAll API to MatrixFactorizationModel, uses topK finding using
    BoundedPriorityQueue similar to RDD.top

commit 7163a5c21b394d8bd89694a9f08aa1b446c71956
Author: Debasish Das <[email protected]>
Date:   2014-11-19T21:58:45Z

    Added API for batch user and product recommendation; MAP calculation for
    product recommendation per user using randomized split

commit 3f97c499004aa58dfa1b51b8d2cbd6e5776f5fb1
Author: Debasish Das <[email protected]>
Date:   2014-11-19T23:38:45Z

    fixed spark coding style for imports

commit ee9957144bc2d145c91fc4a4b894ccd2ee6bc2b9
Author: Debasish Das <[email protected]>
Date:   2015-04-01T01:52:27Z

    addressed initial review comments;merged with master;added tests for batch
    predict APIs in matrix factorization

commit 98fa4243dc6041290bdde51e1e899a8be7576470
Author: Debasish Das <[email protected]>
Date:   2015-04-01T01:59:57Z

    updated with master

commit 3a0c4eb7f81ee0845f4945d395f6652c965f941b
Author: Debasish Das <[email protected]>
Date:   2015-04-01T04:31:01Z

    updated with spark master

commit 3640409ac2dd2ea7ab5e67a520726f2387d137e3
Author: Debasish Das <[email protected]>
Date:   2015-05-02T23:17:45Z

    MAP calculation driver added to MovieLensALS

----

