A small number of new users rarely changes the overall picture much, so you
can usually express their behavior quite accurately in terms of the old
basis vectors.  Expressed mathematically, if you have a large number of
vectors sampled from some relatively low-rank subspace, then the basis for
that subspace as estimated from your first sample will be pretty good, and
thus will also be a decent basis for another tranche of vectors from the
same distribution.
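
To make that concrete, here is a rough sketch of "folding in" a new user by
projecting their rating vector onto the existing basis.  This is plain
Python, not Mahout's API: the names fold_in and relative_residual are made
up for illustration, the basis vectors are assumed to be orthonormal (as
the singular vectors from an SVD would be), and missing ratings are simply
treated as zeros, which is a simplification.

```python
import math

def fold_in(item_basis, ratings):
    """Project a new user's ratings onto an existing item basis.

    item_basis: list of k basis vectors, each of length n_items,
                assumed orthonormal (hypothetical input, not a Mahout type).
    ratings:    list of n_items ratings for the new user.
    Returns (coords, reconstruction).
    """
    # Coordinates of the user in the old basis: one dot product per vector.
    coords = [sum(b_j * r_j for b_j, r_j in zip(b, ratings))
              for b in item_basis]
    # Map the coordinates back to item space to get predicted ratings.
    recon = [sum(c * b[j] for c, b in zip(coords, item_basis))
             for j in range(len(ratings))]
    return coords, recon

def relative_residual(ratings, recon):
    """Fraction of the rating vector's norm that the old basis misses."""
    err = math.sqrt(sum((r - p) ** 2 for r, p in zip(ratings, recon)))
    return err / math.sqrt(sum(r ** 2 for r in ratings))

# Two orthonormal basis vectors over four items (toy data).
basis = [[0.5, 0.5, 0.5, 0.5],
         [0.5, 0.5, -0.5, -0.5]]

# A user who lies exactly in the span of the old basis.
coords, recon = fold_in(basis, [4, 4, 2, 2])
print(coords, recon)  # [6.0, 2.0] [4.0, 4.0, 2.0, 2.0]

# A user who is only partly explained by the old basis.
ratings = [5, 3, 1, 1]
_, recon2 = fold_in(basis, ratings)
print(relative_residual(ratings, recon2))
```

If the relative residual stays small for incoming users, the old basis is
still doing its job.  If it starts to creep up across many new users, that
is one possible signal that it is time to refactorize.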

So the answer to your question is yes: there are easy assumptions that
simplify this process, and you can assume a negligible effect and just
rebuild the factorization occasionally.

On Fri, Feb 25, 2011 at 10:42 AM, Chris Schilling <[email protected]> wrote:

> Hello,
>
> So, I have begun working with the SVDRecommender implementation.
>  Obviously, a matrix factorization technique will not play nice when we try
> to add an anon/new user (say via the PAUDM) because we have not
> re-factorized.
>
> So my question is more from the linear algebra standpoint:  is it possible
> to estimate the new singular vector  when adding a small amount of
> information (i.e. a new user or a new item) to the matrix?  Are there
> assumptions that can be made to simplify this estimate?  For instance, could
> we assume that the addition of a single row or column will have negligible
> effect on the already computed singular vectors?
>
> If this is possible, I would be interested in playing around with the
> implementation.  Can you point me in the right direction?  I'll do some more
> research over the next few days as well.
>
>
> Finally, I have a question about the sequential factorizations in Mahout.
>  Once I have my original matrix factorized, what is the easiest way to
> serialize the results of the factorization for reading back in at a later
> time?  Would I loop through all the items and users and dump the features
> for each?  How to read that back in then?
>
> Thanks for all the help, Much appreciated,
> Chris
>
>
