Well, as you might already have guessed, I am building a product recommender system for my thesis.
I am planning to evaluate ALS (both implicit and explicit) as well as item-similarity recommendation for users with at least a few known products. However, the majority of users have seen only a single product (or 2-3). For them, I want to recommend the most popular items from the clusters that their known products belong to (as a workaround for the cold-start problem).

Furthermore, I expect the clusters to show which "kind" of products users like. This might give me some information about how well the ALS and similarity recommenders fit a user's area of interest (an early evaluation), or at least let me estimate whether the chosen approach will work at all.

On May 6, 2013, at 9:09 PM, Ted Dunning <[email protected]> wrote:

> I don't even think that clustering is all that necessary.
>
> The reduced cooccurrence matrix will give you items related to each item.
>
> You can use something like PCA, but SVD is just as good here due to near
> zero mean. You could SSVD or ALS from Mahout to do this analysis and then
> use k-means on the right singular vectors (aka item representation).
>
> What is the high level goal that you are trying to solve with this
> clustering?
>
> On Mon, May 6, 2013 at 12:01 PM, Dominik Hübner <[email protected]> wrote:
>
>> And running the clustering on the cooccurrence matrix or doing PCA by
>> removing eigenvalues/vectors?
>>
>> On May 6, 2013, at 8:52 PM, Ted Dunning <[email protected]> wrote:
>>
>>> On Mon, May 6, 2013 at 11:29 AM, Dominik Hübner <[email protected]> wrote:
>>>
>>>> Oh, and I forgot how the views and sales are used to build product
>>>> vectors. As of now, I implemented binary vectors, vectors counting the
>>>> number of views and sales (e.g 1view=1count, 1sale=10counts) and ordinary
>>>> vectors ( view => 1, sale=>5).
>>>
>>> I would recommend just putting the view and sale in different columns and
>>> doing cooccurrence analysis on this.
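
A minimal sketch of the cluster-based cold-start fallback described at the top of this message, in plain Python. It assumes the item-to-cluster assignment already exists (for example from k-means on item vectors) and that popularity is simply the number of interactions per item. The function names, the `item_cluster` mapping and the `(user, item)` tuple layout are hypothetical and chosen only for illustration; nothing here is taken from Mahout.

```python
from collections import Counter, defaultdict


def build_cluster_popularity(item_cluster, interactions):
    """Group items by cluster, most popular first.

    item_cluster: dict mapping item id -> cluster id (assumed precomputed).
    interactions: iterable of (user, item) pairs, e.g. views or sales.
    """
    # Popularity = raw interaction count per item (an assumption of this sketch).
    popularity = Counter(item for _, item in interactions)
    by_cluster = defaultdict(list)
    for item, cluster in item_cluster.items():
        by_cluster[cluster].append(item)
    # Sort each cluster by descending popularity; break ties by item id
    # so that the result is deterministic.
    for items in by_cluster.values():
        items.sort(key=lambda i: (-popularity[i], i))
    return by_cluster


def recommend_cold_start(seen_items, item_cluster, by_cluster, n=3):
    """Recommend up to n popular items from the clusters of the seen items."""
    seen = set(seen_items)
    recs = []
    for item in seen_items:
        cluster = item_cluster.get(item)
        if cluster is None:
            continue  # item has no cluster assignment, nothing to fall back on
        for candidate in by_cluster[cluster]:
            if candidate not in seen and candidate not in recs:
                recs.append(candidate)
    return recs[:n]


if __name__ == "__main__":
    item_cluster = {"a": 0, "b": 0, "c": 0, "d": 1}
    interactions = [("u1", "a"), ("u2", "b"), ("u3", "b"),
                    ("u4", "c"), ("u4", "b"), ("u6", "c"), ("u5", "d")]
    by_cluster = build_cluster_popularity(item_cluster, interactions)
    print(recommend_cold_start(["a"], item_cluster, by_cluster))
```

For a user with several seen products, this sketch lists candidates cluster by cluster in the order the products were seen. Merging the clusters by global popularity would be an equally reasonable choice.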
