One more question for now, @Ted: what do you mean by sparsification and reconstruction?

On May 7, 2013, at 12:19 AM, Ted Dunning <[email protected]> wrote:

> Truly cold start is best handled by recommending the most popular items.
>
> If you know *anything* at all such as geo or browser or OS, then you can
> use that to recommend using conventional techniques (that is, you can
> recommend for the characteristics rather than for the person).
>
> Within a very few interactions, however, real recommendations will kick in.
>
> My lately preferred approach is to derive indicators using sparsification
> or ALS+reconstruction. These indicators can be historical items or static
> items such as geo information. These indicators can be combined in a
> single step using a search engine.
>
> On Mon, May 6, 2013 at 2:58 PM, Dominik Hübner <[email protected]> wrote:
>
>> The cluster was mostly intended for tackling the cold start problem for
>> new users.
>> I want to build a recommender based on existing components or to be
>> precise a combination of them.
>>
>> Unfortunately, the only product meta-data I currently have is the product
>> price. Furthermore, this is a project
>> I am working on alone. As a consequence, the approaches I can examine in
>> the given time are limited.
>>
>> Would using ALS and ranking its outcome by e.g. frequent item set
>> algorithms be something worth looking into?
>> Or did you mean something different?
>>
>> My personal goal is to build a recommender providing acceptable results
>> using the data I currently have available.
>> Of course, this will only serve as a basis for further improvements where
>> necessary or if further information can be obtained.
>>
>> On May 6, 2013, at 11:21 PM, Ted Dunning <[email protected]> wrote:
>>
>>> Are you looking to build a product recommender based on your own design?
>>> Or do you want to build one based on existing methods?
>>>
>>> If you want to use existing methods, clustering has essentially no role.
>>>
>>> I think that composite approaches that use item meta-data and different
>>> kinds of behavioral cues are important to best performance.
>>>
>>> On Mon, May 6, 2013 at 12:35 PM, Dominik Hübner <[email protected]> wrote:
>>>
>>>> Well, as you already might have guessed, I am building a product
>>>> recommender system for my thesis.
>>>>
>>>> I am planning to evaluate ALS (both, implicit and explicit) as well as
>>>> item -similarity recommendation for users with at least a few known
>>>> products. Nevertheless, the majority of users only has seen a single (or
>>>> 2-3) product(s). I want to recommend them the most popular items from
>>>> clusters, their only product comes from (as a workaround for the cold-start
>>>> problem). Furthermore, I expect to be able to see which "kind" of products
>>>> users like. This might provide me some information about how well ALS and
>>>> similarity recommenders fit the user's area of interest (an early
>>>> evaluation) or at least to estimate if the chosen approach will work in
>>>> some way.
>>>>
>>>> On May 6, 2013, at 9:09 PM, Ted Dunning <[email protected]> wrote:
>>>>
>>>>> I don't even think that clustering is all that necessary.
>>>>>
>>>>> The reduced cooccurrence matrix will give you items related to each item.
>>>>>
>>>>> You can use something like PCA, but SVD is just as good here due to near
>>>>> zero mean. You could SSVD or ALS from Mahout to do this analysis and then
>>>>> use k-means on the right singular vectors (aka item representation).
>>>>>
>>>>> What is the high level goal that you are trying to solve with this
>>>>> clustering?
>>>>>
>>>>> On Mon, May 6, 2013 at 12:01 PM, Dominik Hübner <[email protected]> wrote:
>>>>>
>>>>>> And running the clustering on the cooccurrence matrix or doing PCA by
>>>>>> removing eigenvalues/vectors?
>>>>>>
>>>>>> On May 6, 2013, at 8:52 PM, Ted Dunning <[email protected]> wrote:
>>>>>>
>>>>>>> On Mon, May 6, 2013 at 11:29 AM, Dominik Hübner <[email protected]> wrote:
>>>>>>>
>>>>>>>> Oh, and I forgot how the views and sales are used to build product
>>>>>>>> vectors. As of now, I implemented binary vectors, vectors counting the
>>>>>>>> number of views and sales (e.g 1view=1count, 1sale=10counts) and ordinary
>>>>>>>> vectors ( view => 1, sale=>5).
>>>>>>>
>>>>>>> I would recommend just putting the view and sale in different columns and
>>>>>>> doing cooccurrence analysis on this.
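
For reference, here is how I currently read the earliest suggestion in the thread (view and sale in different columns, then cooccurrence analysis). This is only a minimal sketch in plain NumPy, not Mahout code: the toy events, the `related` helper and all other names are made up for illustration, and it uses raw cooccurrence counts without any sparsification step.

```python
import numpy as np

# Hypothetical toy data: (user, item, event type) triples, for illustration only.
events = [
    ("u1", "A", "view"), ("u1", "B", "view"), ("u1", "B", "sale"),
    ("u2", "A", "view"), ("u2", "B", "sale"),
    ("u3", "B", "view"), ("u3", "C", "view"),
]

users = sorted({u for u, _, _ in events})
# One column per (item, event type) pair, so views and sales stay separate
# instead of being merged into a single weighted count.
columns = sorted({(i, e) for _, i, e in events})
u_idx = {u: k for k, u in enumerate(users)}
c_idx = {c: k for k, c in enumerate(columns)}

# Binary user-by-column interaction matrix.
A = np.zeros((len(users), len(columns)), dtype=int)
for u, i, e in events:
    A[u_idx[u], c_idx[(i, e)]] = 1

# Cooccurrence matrix: entry [p, q] counts the users who have both
# column p and column q. The diagonal (self-cooccurrence) is dropped.
C = A.T @ A
np.fill_diagonal(C, 0)


def related(column, top_n=3):
    """Return up to top_n columns that cooccur most often with `column`."""
    row = C[c_idx[column]]
    order = np.argsort(-row, kind="stable")[:top_n]
    return [(columns[k], int(row[k])) for k in order if row[k] > 0]


print(related(("A", "view")))
```

With real data the matrix would have to be sparse, and the raw counts would presumably need some filtering before they are useful as indicators; the sketch leaves both out.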
