Github user debasish83 commented on the pull request:
https://github.com/apache/spark/pull/5005#issuecomment-84499918
Even after cleaning up the iterator, adding an in-place gemv, and creating the
state once and re-using its memory, the first iteration of Breeze NNLS is still
slower than mllib NNLS. The remaining iterations are fine:
Breeze NNLS:
./bin/spark-submit --master spark://TUSCA09LMLVT00C.local:7077 \
  --class org.apache.spark.examples.mllib.MovieLensALS \
  --jars ~/.m2/repository/com/github/scopt/scopt_2.10/3.2.0/scopt_2.10-3.2.0.jar \
  --total-executor-cores 1 \
  ./examples/target/spark-examples_2.10-1.3.0-SNAPSHOT.jar --rank 50 \
  --numIterations 2 --nonNegative ~/datasets/ml-1m/ratings.dat
Got 1000209 ratings from 6040 users on 3706 movies.
Training: 800702, test: 199507.
Running Breeze NNLSSolver
Test RMSE = 1.9630818565404409.
TUSCA09LMLVT00C:spark-brznnls v606014$ grep solveTime ./work/app-20150321192419-0000/0/stderr
15/03/21 19:24:29 INFO ALS: solveTime 249.982 ms
15/03/21 19:24:29 INFO ALS: solveTime 83.31 ms
15/03/21 19:24:30 INFO ALS: solveTime 91.942 ms
15/03/21 19:24:30 INFO ALS: solveTime 92.131 ms
15/03/21 19:24:31 INFO ALS: solveTime 58.648 ms
15/03/21 19:24:31 INFO ALS: solveTime 54.959 ms
15/03/21 19:24:32 INFO ALS: solveTime 93.302 ms
15/03/21 19:24:33 INFO ALS: solveTime 110.504 ms
15/03/21 19:24:33 INFO ALS: solveTime 57.124 ms
15/03/21 19:24:34 INFO ALS: solveTime 55.7 ms
mllib NNLS:
export solver=mllib; ./bin/spark-submit --master spark://TUSCA09LMLVT00C.local:7077 \
  --class org.apache.spark.examples.mllib.MovieLensALS \
  --jars ~/.m2/repository/com/github/scopt/scopt_2.10/3.2.0/scopt_2.10-3.2.0.jar \
  --total-executor-cores 1 \
  ./examples/target/spark-examples_2.10-1.3.0-SNAPSHOT.jar --rank 50 \
  --numIterations 2 --nonNegative ~/datasets/ml-1m/ratings.dat
Got 1000209 ratings from 6040 users on 3706 movies.
Training: 800702, test: 199507.
Test RMSE = 1.9630818565404409.
TUSCA09LMLVT00C:spark-brznnls v606014$ grep solveTime ./work/app-20150321192553-0001/0/stderr
15/03/21 19:26:02 INFO ALS: solveTime 88.237 ms
15/03/21 19:26:02 INFO ALS: solveTime 61.216 ms
15/03/21 19:26:03 INFO ALS: solveTime 88.628 ms
15/03/21 19:26:03 INFO ALS: solveTime 76.532 ms
15/03/21 19:26:04 INFO ALS: solveTime 44.945 ms
15/03/21 19:26:04 INFO ALS: solveTime 44.895 ms
15/03/21 19:26:05 INFO ALS: solveTime 82.933 ms
15/03/21 19:26:06 INFO ALS: solveTime 83.018 ms
15/03/21 19:26:06 INFO ALS: solveTime 48.138 ms
15/03/21 19:26:07 INFO ALS: solveTime 49.3 ms
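To compare the two runs beyond eyeballing the grep output, the solveTime
values can be summarized with a small script. What follows is only a sketch:
it is plain Python rather than anything in the Spark or Breeze build, the
names solve_times and summarize are hypothetical, and the log lines are
copied from the two runs above. It reports the first solve, the mean over all
solves, and the mean with the first solve excluded.

```python
import re
from statistics import mean

# Matches the "solveTime <number> ms" suffix of the ALS log lines quoted above.
SOLVE_TIME = re.compile(r"solveTime\s+([0-9]+(?:\.[0-9]+)?)\s+ms")


def solve_times(log_text):
    """Return every solveTime value (in ms) found in log_text, in order."""
    return [float(m.group(1)) for m in SOLVE_TIME.finditer(log_text)]


def summarize(times):
    """Return (first solve, mean of all solves, mean excluding the first)."""
    return times[0], mean(times), mean(times[1:])


# Log lines copied from the Breeze NNLS run (app-20150321192419-0000).
breeze_log = """
15/03/21 19:24:29 INFO ALS: solveTime 249.982 ms
15/03/21 19:24:29 INFO ALS: solveTime 83.31 ms
15/03/21 19:24:30 INFO ALS: solveTime 91.942 ms
15/03/21 19:24:30 INFO ALS: solveTime 92.131 ms
15/03/21 19:24:31 INFO ALS: solveTime 58.648 ms
15/03/21 19:24:31 INFO ALS: solveTime 54.959 ms
15/03/21 19:24:32 INFO ALS: solveTime 93.302 ms
15/03/21 19:24:33 INFO ALS: solveTime 110.504 ms
15/03/21 19:24:33 INFO ALS: solveTime 57.124 ms
15/03/21 19:24:34 INFO ALS: solveTime 55.7 ms
"""

# Log lines copied from the mllib NNLS run (app-20150321192553-0001).
mllib_log = """
15/03/21 19:26:02 INFO ALS: solveTime 88.237 ms
15/03/21 19:26:02 INFO ALS: solveTime 61.216 ms
15/03/21 19:26:03 INFO ALS: solveTime 88.628 ms
15/03/21 19:26:03 INFO ALS: solveTime 76.532 ms
15/03/21 19:26:04 INFO ALS: solveTime 44.945 ms
15/03/21 19:26:04 INFO ALS: solveTime 44.895 ms
15/03/21 19:26:05 INFO ALS: solveTime 82.933 ms
15/03/21 19:26:06 INFO ALS: solveTime 83.018 ms
15/03/21 19:26:06 INFO ALS: solveTime 48.138 ms
15/03/21 19:26:07 INFO ALS: solveTime 49.3 ms
"""

breeze_times = solve_times(breeze_log)
mllib_times = solve_times(mllib_log)

for name, times in (("breeze", breeze_times), ("mllib", mllib_times)):
    first, overall, rest = summarize(times)
    print(f"{name}: first={first:.3f} ms, "
          f"mean={overall:.3f} ms, mean excluding first={rest:.3f} ms")
```

Reporting the first solve separately keeps a one-off startup cost from
dominating the average, so the steady-state solves of the two solvers can be
compared directly.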
This is the version I will push to Breeze. It would be great if you could
take a look at the Breeze NNLS and give some pointers on the first
iteration.