Are you looking to build a product recommender based on your own design? Or do you want to build one based on existing methods?
If you want to use existing methods, clustering has essentially no role. I think that composite approaches that use item meta-data and different kinds of behavioral cues are important to best performance.

On Mon, May 6, 2013 at 12:35 PM, Dominik Hübner <[email protected]> wrote:

> Well, as you already might have guessed, I am building a product
> recommender system for my thesis.
>
> I am planning to evaluate ALS (both, implicit and explicit) as well as
> item-similarity recommendation for users with at least a few known
> products. Nevertheless, the majority of users only has seen a single (or
> 2-3) product(s). I want to recommend them the most popular items from
> clusters, their only product comes from (as a workaround for the cold-start
> problem). Furthermore, I expect to be able to see which "kind" of products
> users like. This might provide me some information about how well ALS and
> similarity recommenders fit the user's area of interest (an early
> evaluation) or at least to estimate if the chosen approach will work in
> some way.
>
> On May 6, 2013, at 9:09 PM, Ted Dunning <[email protected]> wrote:
>
> > I don't even think that clustering is all that necessary.
> >
> > The reduced cooccurrence matrix will give you items related to each item.
> >
> > You can use something like PCA, but SVD is just as good here due to near
> > zero mean. You could SSVD or ALS from Mahout to do this analysis and then
> > use k-means on the right singular vectors (aka item representation).
> >
> > What is the high level goal that you are trying to solve with this
> > clustering?
> >
> > On Mon, May 6, 2013 at 12:01 PM, Dominik Hübner <[email protected]> wrote:
> >
> >> And running the clustering on the cooccurrence matrix or doing PCA by
> >> removing eigenvalues/vectors?
> >>
> >> On May 6, 2013, at 8:52 PM, Ted Dunning <[email protected]> wrote:
> >>
> >>> On Mon, May 6, 2013 at 11:29 AM, Dominik Hübner <[email protected]> wrote:
> >>>
> >>>> Oh, and I forgot how the views and sales are used to build product
> >>>> vectors. As of now, I implemented binary vectors, vectors counting the
> >>>> number of views and sales (e.g 1view=1count, 1sale=10counts) and
> >>>> ordinary vectors ( view => 1, sale=>5).
> >>>>
> >>>
> >>> I would recommend just putting the view and sale in different columns and
> >>> doing cooccurrence analysis on this.
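
For illustration, here is a minimal sketch of the suggestion at the bottom of the quoted thread: keep views and sales as separate columns and count cooccurrences across them, instead of folding both into one weighted vector. The toy event log, the function names and the use of raw counts are all assumptions made for this example. A real system would probably apply some significance test to the counts and run the analysis at scale (for instance in Mahout) rather than in plain Python.

```python
from collections import Counter, defaultdict
from itertools import permutations

# Hypothetical toy interaction log: (user, action, item).
EVENTS = [
    ("u1", "view", "A"), ("u1", "view", "B"), ("u1", "sale", "B"),
    ("u2", "view", "A"), ("u2", "view", "C"), ("u2", "sale", "B"),
    ("u3", "view", "A"), ("u3", "view", "C"),
    ("u4", "view", "B"), ("u4", "sale", "C"),
]


def build_user_columns(events):
    """Map each user to the set of (action, item) columns they touched.

    Views and sales of the same item end up in different columns.
    """
    columns = defaultdict(set)
    for user, action, item in events:
        columns[user].add((action, item))
    return columns


def cooccurrence(user_columns):
    """For every ordered pair of distinct columns, count users having both."""
    counts = defaultdict(Counter)
    for cols in user_columns.values():
        for a, b in permutations(sorted(cols), 2):
            counts[a][b] += 1
    return counts


def related(counts, column, action, top_n=3):
    """Items whose `action` column cooccurs most often with `column`."""
    scored = [
        (item, n)
        for (act, item), n in counts[column].items()
        if act == action and item != column[1]
    ]
    # Highest count first; ties broken by item name for a stable order.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))[:top_n]


counts = cooccurrence(build_user_columns(EVENTS))

# For a user who has only viewed item "A":
print(related(counts, ("view", "A"), "view"))  # items also viewed
print(related(counts, ("view", "A"), "sale"))  # items bought
```

Because the lookup starts from a single (action, item) column, this also covers the users who have only seen one product, without a clustering step in between.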
