Hi,
One more thing regarding the performance of the SVD-based recommender.
There is one single reason why it is so slow: during training there is a
phase where it calculates the dot product of two vectors for every
single entry. If I want to train on the Netflix dataset (100m entries)
using 60 iterations and 64 features, that comes to 100m*64*60
multiplications. But this could be improved, since only one
entry in each vector is updated at each iteration. If the rest were
cached, that would reduce the whole thing to 100m*60 multiplications.
I implemented this just now; let me know if you are interested in adding it.
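
To make the idea concrete, here is a rough Python sketch. It is not the actual patch, just an illustration under the assumption that features are trained one at a time (Funk-style), so that all other features stay fixed while one is updated. The names train_naive, train_cached, lr and init are made up for this example.

```python
def train_naive(ratings, n_users, n_items, n_features, n_epochs,
                lr=0.01, init=0.1):
    """Recompute the full dot product for every rating on every pass."""
    U = [[init] * n_features for _ in range(n_users)]
    V = [[init] * n_features for _ in range(n_items)]
    for f in range(n_features):
        for _ in range(n_epochs):
            for u, i, r in ratings:
                # n_features multiplications per rating
                pred = sum(U[u][k] * V[i][k] for k in range(n_features))
                err = r - pred
                uf, vf = U[u][f], V[i][f]
                U[u][f] = uf + lr * err * vf
                V[i][f] = vf + lr * err * uf
    return U, V


def train_cached(ratings, n_users, n_items, n_features, n_epochs,
                 lr=0.01, init=0.1):
    """Same updates, but the fixed part of each dot product is cached."""
    U = [[init] * n_features for _ in range(n_users)]
    V = [[init] * n_features for _ in range(n_items)]
    # total[n]: full dot product for rating n with the current factors
    total = [n_features * init * init] * len(ratings)
    for f in range(n_features):
        # rest[n]: contribution of every feature except f.
        # It does not change while feature f is being trained.
        rest = [total[n] - U[u][f] * V[i][f]
                for n, (u, i, _) in enumerate(ratings)]
        for _ in range(n_epochs):
            for n, (u, i, r) in enumerate(ratings):
                # one multiplication per rating instead of n_features
                pred = rest[n] + U[u][f] * V[i][f]
                err = r - pred
                uf, vf = U[u][f], V[i][f]
                U[u][f] = uf + lr * err * vf
                V[i][f] = vf + lr * err * uf
        # fold the trained feature back into the running total
        total = [rest[n] + U[u][f] * V[i][f]
                 for n, (u, i, _) in enumerate(ratings)]
    return U, V
```

Both functions should produce the same factors up to floating-point rounding. The cached version pays for it with one extra float per rating, which for 100m entries is a noticeable but manageable amount of memory.
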
Tamas