Gábor Hermann created FLINK-4961:
------------------------------------

             Summary: SGD for Matrix Factorization
                 Key: FLINK-4961
                 URL: https://issues.apache.org/jira/browse/FLINK-4961
             Project: Flink
          Issue Type: New Feature
          Components: Machine Learning Library
            Reporter: Gábor Hermann
            Assignee: Gábor Hermann


We have started an implementation of distributed stochastic gradient descent 
for matrix factorization based on Gemulla et al. [1].

The main problem with distributed SGD in general is the conflicting updates of 
the model variable. In case of matrix factorization we can avoid conflicting 
updates by carefully deciding in each iteration step which blocks of the rating 
matrix we should use to update the corresponding blocks of the user and item 
matrices (see Figure 1. in paper).

Although a general SGD solver might seem relevant for this issue, we can do 
much better in the special case of matrix factorization. E.g. in case of a 
linear regression model, the model is broadcasted in every iteration. As the 
model is typically small in that case, we can only avoid conflicts by having a 
"global" model. Based on this, the general SGD solver is a different issue.

To give more details, the algorithm works as follows.

We randomly create user and item vectors, then randomly partition them into 
{{k}} user and {{k}} item blocks. Based on these factor blocks we partition the 
rating matrix to {{k * k}} blocks correspondingly.

In one iteration step we choose {{k}} non-conflicting rating blocks, i.e. we 
should not choose two rating blocks simultaneously with the same user or item 
block. This is done by assigning a rating block ID to every user and item 
block. We match the user, item, and rating blocks by the current rating block 
ID, and update the user and item factors by the ratings locally. We also update 
the rating block ID for the factor blocks, thus in the next iteration we use 
other rating blocks to update the factors.

In {{k}} iteration we sweep through the whole rating matrix of {{k * k}} blocks 
(so instead of {{numberOfIterationSteps}} iterations we should do {{k * 
numberOfIterationSteps}} iterations).

[1] [http://people.mpi-inf.mpg.de/~rgemulla/publications]/gemulla11dsgd.pdf



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to