[https://issues.apache.org/jira/browse/SPARK-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377357#comment-14377357]
Debasish Das edited comment on SPARK-2426 at 3/24/15 3:23 PM:
--------------------------------------------------------------
[~acopich] From your earlier comment: "Anyway, l2 regularized stochastic matrix
decomposition problem is defined as follows
Minimize w.r.t. W and H : ||R - W*H|| + \lambda(||W|| + ||H||)
under non-negativeness and normalization constraints." Could you please point
me to a good reference with applications to collaborative filtering/topic
modeling? Stochastic matrix decomposition is what we can do in this PR now:
https://github.com/apache/spark/pull/3221. Isn't there a log term that
multiplies with R to make it a KL-divergence loss? Maybe the log term can be
removed under the non-negativity and normalization constraints? @mengxr, any
ideas here? If we can do that, we can target the KL-divergence loss from Lee's
paper: http://hebb.mit.edu/people/seung/papers/ls-lponm-99.pdf
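For reference, the log term I have in mind is the one in the generalized KL
divergence; a sketch of the loss in Lee and Seung's NMF formulation (my
paraphrase, not a quote from the paper):

    % Generalized KL-divergence loss for NMF; the R_{ij} \log(\cdot) term
    % is the "log term that multiplies with R" mentioned above.
    \min_{W \ge 0,\; H \ge 0} \; D(R \,\|\, WH)
      = \sum_{i,j} \Big( R_{ij} \log \frac{R_{ij}}{(WH)_{ij}}
        - R_{ij} + (WH)_{ij} \Big)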
For the MAP loss, I will open a PR in a week through JIRA
https://issues.apache.org/jira/browse/SPARK-6323. I am very curious how much
slower we get compared to stochastic matrix decomposition using ALS. The MAP
loss looks like a strong contender to LDA and can natively handle counts (it
does not need regression-style datasets, which are difficult to get in
practical setups where people normally don't give ratings and satisfaction
must be inferred from viewing time, etc.).
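As a concrete illustration of how a KL/count-based loss is optimized, here is
a minimal single-machine sketch of Lee and Seung's multiplicative updates
using Breeze (klNmf is an illustrative name; this is not the MLlib ALS code
path):

    import breeze.linalg.DenseMatrix

    // Sketch: multiplicative updates for the KL-divergence NMF objective.
    // r holds non-negative counts; w and h stay non-negative by construction.
    def klNmf(r: DenseMatrix[Double], rank: Int,
              iters: Int): (DenseMatrix[Double], DenseMatrix[Double]) = {
      val eps = 1e-9  // guard against division by zero
      var w = DenseMatrix.rand[Double](r.rows, rank)
      var h = DenseMatrix.rand[Double](rank, r.cols)
      val ones = DenseMatrix.ones[Double](r.rows, r.cols)
      for (_ <- 0 until iters) {
        // H <- H .* (W^T (R ./ WH)) ./ (W^T 1)
        val ratioH = r /:/ (w * h).map(_ + eps)
        h = h *:* ((w.t * ratioH) /:/ (w.t * ones).map(_ + eps))
        // W <- W .* ((R ./ WH) H^T) ./ (1 H^T)
        val ratioW = r /:/ (w * h).map(_ + eps)
        w = w *:* ((ratioW * h.t) /:/ (ones * h.t).map(_ + eps))
      }
      (w, h)
    }

Because the updates are multiplicative, non-negativity of W and H is preserved
automatically given a non-negative R, which is why this style of solver suits
count data.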
> Quadratic Minimization for MLlib ALS
> ------------------------------------
>
> Key: SPARK-2426
> URL: https://issues.apache.org/jira/browse/SPARK-2426
> Project: Spark
> Issue Type: New Feature
> Components: MLlib
> Affects Versions: 1.3.0
> Reporter: Debasish Das
> Assignee: Debasish Das
> Original Estimate: 504h
> Remaining Estimate: 504h
>
> Current ALS supports least squares and nonnegative least squares.
> I presented ADMM- and IPM-based Quadratic Minimization solvers to be used
> for the following ALS problems:
> 1. ALS with bounds
> 2. ALS with L1 regularization
> 3. ALS with Equality constraint and bounds
> Initial runtime comparisons are presented at Spark Summit.
> http://spark-summit.org/2014/talk/quadratic-programing-solver-for-non-negative-matrix-factorization-with-spark
> Based on Xiangrui's feedback, I am currently comparing the ADMM-based
> Quadratic Minimization solvers with the IPM-based QpSolvers and the default
> ALS/NNLS. I will keep updating the runtime comparison results.
> For integration, the detailed plan is as follows:
> 1. Add QuadraticMinimizer and Proximal algorithms in mllib.optimization
> 2. Integrate QuadraticMinimizer in mllib ALS
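As a side note for readers, each of the three constrained ALS problems listed
in the description reduces to a per-row quadratic program; a sketch in
standard QP notation (my summary, not the PR's exact formulation):

    % Per-row ALS subproblem with Gram matrix A^T A and ridge \lambda:
    \min_{x} \; \tfrac{1}{2}\, x^{\top} (A^{\top}A + \lambda I)\, x
              \;-\; (A^{\top} b)^{\top} x
    % 1. bounds:               l \le x \le u
    % 2. L1 regularization:    add \lambda_1 \|x\|_1 to the objective
    % 3. equality and bounds:  \mathbf{1}^{\top} x = 1,\; x \ge 0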