I'm cool with an NMF/PMF addition; you can make pull requests for further 
comments.

> On Dec 22, 2014, at 10:00 PM, 梁明强 <[email protected]> wrote:
> 
> Dear sir,
> 
> I read through the Mahout source code, and it does not seem to include a
> Matrix Factorization (MF) method.
> 
> MF is very useful in recommender systems, especially when the dataset is very
> sparse. Even though SVD and QR decomposition have been applied successfully
> to some practical problems, they suffer from severe overfitting when applied
> to sparse matrices.
> 
> In fact, Matrix Factorization is quite different from Matrix Decomposition,
> although they look very similar. Simply put, Matrix Decomposition is used to
> reduce dimensionality and needs the original matrix to be dense, whereas
> Matrix Factorization is usually used to predict users' ratings of items when
> the original matrix is very sparse.
> 
> So I would like to implement some Matrix Factorization methods for Mahout,
> such as Non-negative Matrix Factorization (NMF) and Probabilistic Matrix
> Factorization (PMF).
> 
> If my proposal sounds interesting, do we need to create a new git branch, or
> should I just fork and send a pull request?
> 
> Here is my preliminary implementation of NMF in Java, using multiplicative
> update rules. As a next step I want to implement PMF, optimize both, and
> implement a distributed version.
> 
> 
> package org.apache.mahout.math;
> 
> import java.util.Random;
> import org.apache.mahout.math.function.DoubleFunction;
> public class NMF{
>    private Matrix w;
>    private Matrix h;
> 
>    /**
>     * @param v original matrix to factorize (entries assumed non-negative)
>     * @param r model order (rank of the factorization)
>     * @param steps maximum number of update steps
>     * @param errMax stop once the objective drops by less than this per step
>     */
>    public NMF(Matrix v, int r, int steps, double errMax) {
>        double oldObj, obj;
>        int n = v.rowSize();
>        int m = v.columnSize();
> 
>        w = new DenseMatrix(n, r).assign(RANDOMF);
>        h = new DenseMatrix(r, m).assign(RANDOMF);
> 
>        for (int step = 0; step < steps; step++) {
> 
>            oldObj = calObjectFunction(v, w, h);
> 
>            // update h: h <- h .* (w^T v) ./ (w^T w h)
>            Matrix wt = w.transpose();
>            Matrix wtv = wt.times(v);
>            Matrix wtwh = wt.times(w).times(h);
>            for (int i = 0; i < r; i++) {
>                for (int j = 0; j < m; j++) {
>                    if(wtwh.get(i, j) > 0)
>                        h.set(i, j, h.get(i,j)*
>                            (wtv.get(i,j)/wtwh.get(i,j)) );
>                }
>            }
> 
>            // update w: w <- w .* (v h^T) ./ (w h h^T)
>            Matrix vht = v.times(h.transpose());
>            Matrix whht = w.times(h).times(h.transpose());
>            for (int i = 0; i < n; i++) {
>                for (int j = 0; j < r; j++) {
>                    if(whht.get(i, j) > 0)
>                        w.set(i, j, w.get(i,j)*
>                            (vht.get(i,j)/whht.get(i,j)) );
> 
>                }
>            }
> 
> 
>            obj = calObjectFunction(v, w, h);
>            double decrease = oldObj - obj;
>            if (decrease < errMax) break;
> 
>        }
> 
>    }
> 
>    /**
>     * Calculate the objective function: the sum of squared entries of v - w*h
>     * @param v original matrix
>     * @param w left factor matrix
>     * @param h right factor matrix
>     * @return value of the objective function
>     */
>    static double calObjectFunction(Matrix v, Matrix w, Matrix h){
>        Matrix wh = w.times(h);
>        Matrix minus = v.minus(wh);
>        double err = 0;
>        for(int i=0; i<minus.rowSize(); i++)
>            for(int j=0; j<minus.columnSize(); j++)
>                err += minus.get(i, j) * minus.get(i, j);
>        return err;
>    }
> 
> 
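>    // Shared generator, so a new Random is not created for every matrix entry
>    private static final Random RANDOM = new Random();
> 
>    // Ignores its argument and returns a uniform random value in [0, 1)
>    private static final DoubleFunction RANDOMF = new DoubleFunction() {
>        @Override
>        public double apply(double a) {
>            return RANDOM.nextDouble();
>        }
>    };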
> 
>    public Matrix getW() {
>        return w;
>    }
> 
>    public Matrix getH() {
>        return h;
>    }
> 
> 
> }
> 
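
For anyone who wants to try the multiplicative update rules from the quoted
message without a Mahout checkout, here is a minimal sketch on plain double[][]
arrays. It is not the code from the message and does not use any Mahout API;
the class name NmfSketch, the helper names, the fixed seed, and the small EPS
added to each denominator (instead of skipping zero denominators) are all
assumptions made for illustration.

```java
import java.util.Random;

/** Plain-array sketch of NMF with multiplicative updates (hypothetical names, no Mahout). */
final class NmfSketch {
    // Assumption of this sketch: a tiny constant keeps denominators away from zero.
    private static final double EPS = 1e-9;

    static double[][] multiply(double[][] a, double[][] b) {
        int n = a.length, k = b.length, m = b[0].length;
        double[][] c = new double[n][m];
        for (int i = 0; i < n; i++)
            for (int p = 0; p < k; p++)
                for (int j = 0; j < m; j++)
                    c[i][j] += a[i][p] * b[p][j];
        return c;
    }

    static double[][] transpose(double[][] a) {
        double[][] t = new double[a[0].length][a.length];
        for (int i = 0; i < a.length; i++)
            for (int j = 0; j < a[0].length; j++)
                t[j][i] = a[i][j];
        return t;
    }

    /** Sum of squared entries of v - w*h. */
    static double objective(double[][] v, double[][] w, double[][] h) {
        double[][] wh = multiply(w, h);
        double err = 0;
        for (int i = 0; i < v.length; i++)
            for (int j = 0; j < v[0].length; j++)
                err += (v[i][j] - wh[i][j]) * (v[i][j] - wh[i][j]);
        return err;
    }

    static double[][] randomMatrix(int rows, int cols, Random rng) {
        double[][] a = new double[rows][cols];
        for (double[] row : a)
            for (int j = 0; j < cols; j++)
                row[j] = rng.nextDouble();
        return a;
    }

    /** Runs at most maxSteps updates and returns {w, h}. */
    static double[][][] factorize(double[][] v, int r, int maxSteps, double tol, long seed) {
        Random rng = new Random(seed);
        int n = v.length, m = v[0].length;
        double[][] w = randomMatrix(n, r, rng);
        double[][] h = randomMatrix(r, m, rng);
        double oldObj = objective(v, w, h);
        for (int step = 0; step < maxSteps; step++) {
            // h <- h .* (w^T v) ./ (w^T w h)
            double[][] wt = transpose(w);
            double[][] wtv = multiply(wt, v);
            double[][] wtwh = multiply(multiply(wt, w), h);
            for (int i = 0; i < r; i++)
                for (int j = 0; j < m; j++)
                    h[i][j] *= wtv[i][j] / (wtwh[i][j] + EPS);
            // w <- w .* (v h^T) ./ (w h h^T), using the freshly updated h
            double[][] ht = transpose(h);
            double[][] vht = multiply(v, ht);
            double[][] whht = multiply(w, multiply(h, ht));
            for (int i = 0; i < n; i++)
                for (int j = 0; j < r; j++)
                    w[i][j] *= vht[i][j] / (whht[i][j] + EPS);
            double obj = objective(v, w, h);
            if (oldObj - obj < tol) break; // objective has stopped improving
            oldObj = obj;
        }
        return new double[][][] {w, h};
    }

    public static void main(String[] args) {
        double[][] v = {{5, 3, 0, 1}, {4, 0, 0, 1}, {1, 1, 0, 5}, {1, 0, 0, 4}, {0, 1, 5, 4}};
        double[][][] wh = factorize(v, 2, 500, 1e-6, 42L);
        System.out.println("objective after fitting: " + objective(v, wh[0], wh[1]));
    }
}
```

Because every update multiplies an entry by a ratio of non-negative numbers,
w and h stay non-negative as long as v and the random start are non-negative.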

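The quoted message's point about sparse rating matrices can be illustrated by
fitting factors to the observed ratings only, so that missing entries are not
treated as zeros. The sketch below uses plain stochastic gradient descent with
L2 regularization. It is a generic illustration, not the PMF implementation
the message proposes and not a Mahout API; the class name SparseMfSketch, the
toy ratings, and the hyperparameters are assumptions.

```java
import java.util.Random;

/** Sketch: factorize only the observed entries of a sparse rating matrix with SGD. */
final class SparseMfSketch {
    final double[][] userFactors;
    final double[][] itemFactors;

    SparseMfSketch(int numUsers, int numItems, int rank, long seed) {
        Random rng = new Random(seed);
        userFactors = smallRandom(numUsers, rank, rng);
        itemFactors = smallRandom(numItems, rank, rng);
    }

    private static double[][] smallRandom(int rows, int cols, Random rng) {
        double[][] a = new double[rows][cols];
        for (double[] row : a)
            for (int j = 0; j < cols; j++)
                row[j] = 0.1 * rng.nextDouble();
        return a;
    }

    /** Predicted rating: dot product of the user's and the item's factor vectors. */
    double predict(int user, int item) {
        double dot = 0;
        for (int k = 0; k < userFactors[user].length; k++)
            dot += userFactors[user][k] * itemFactors[item][k];
        return dot;
    }

    /** Sum of squared errors over the observed (user, item, rating) triples only. */
    double squaredError(int[] users, int[] items, double[] ratings) {
        double err = 0;
        for (int n = 0; n < ratings.length; n++) {
            double d = ratings[n] - predict(users[n], items[n]);
            err += d * d;
        }
        return err;
    }

    void train(int[] users, int[] items, double[] ratings,
               int epochs, double learningRate, double reg) {
        for (int e = 0; e < epochs; e++) {
            for (int n = 0; n < ratings.length; n++) {
                int u = users[n], i = items[n];
                double err = ratings[n] - predict(u, i);
                for (int k = 0; k < userFactors[u].length; k++) {
                    double pu = userFactors[u][k], qi = itemFactors[i][k];
                    // Gradient step on the squared error plus the L2 penalty
                    userFactors[u][k] += learningRate * (err * qi - reg * pu);
                    itemFactors[i][k] += learningRate * (err * pu - reg * qi);
                }
            }
        }
    }

    public static void main(String[] args) {
        // Six observed ratings out of a 3 x 3 user-item matrix (toy data)
        int[] users = {0, 0, 1, 1, 2, 2};
        int[] items = {0, 1, 0, 2, 1, 2};
        double[] ratings = {5, 3, 4, 1, 1, 5};
        SparseMfSketch mf = new SparseMfSketch(3, 3, 2, 7L);
        mf.train(users, items, ratings, 2000, 0.01, 0.02);
        System.out.println("training error: " + mf.squaredError(users, items, ratings));
        System.out.println("prediction for unobserved (user 0, item 2): " + mf.predict(0, 2));
    }
}
```

The three unobserved cells never enter the loss; predict() can still be called
on them afterwards, which is the recommender use case the message describes.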