Thank you for the great discussion. I actually thought that would create a user profile; I guess we need some experiments to find out.

By the way, I think the methods we generally use for recommendation do not fit real life well enough, especially in the e-commerce domain. Time is a very important parameter that we shouldn't underestimate. Even on a shopping site without a recommender system, the decision to view or buy an item depends heavily on previous choices. That's why I thought we should sometimes treat a user's history as a sequence, not a set. My first HMM ideas came from this, but Sean said that it may be overkill. What do you think about HMM? Is it worth trying?
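To make the "sequences, not sets" point concrete, here is a minimal sketch. It is not an HMM and not Mahout code, just first-order transition counts over time-ordered histories; the function names and the toy data are made up for illustration. It only shows that ordered counts are asymmetric, unlike the usual item-to-item similarity.

```python
from collections import Counter, defaultdict


def transition_counts(histories):
    """Count ordered (previous item -> next item) transitions.

    Each history is assumed to be a list of item ids ordered by time.
    """
    counts = defaultdict(Counter)
    for history in histories:
        # Pair every item with its immediate successor.
        for prev_item, next_item in zip(history, history[1:]):
            counts[prev_item][next_item] += 1
    return counts


def next_item_candidates(counts, last_item, top_n=3):
    """Rank candidate next items by the observed share of transitions."""
    followers = counts.get(last_item, Counter())
    total = sum(followers.values())
    if total == 0:
        return []  # last_item was never followed by anything
    return [(item, n / total) for item, n in followers.most_common(top_n)]


# Toy, made-up histories (hypothetical item ids), ordered by time.
histories = [
    ["camera", "memory_card", "tripod"],
    ["camera", "memory_card"],
    ["camera", "camera_bag"],
    ["memory_card", "card_reader"],
]

counts = transition_counts(histories)
ranked = next_item_candidates(counts, "camera")

# The counts are directional: camera -> memory_card was seen,
# memory_card -> camera was not.
print(counts["camera"]["memory_card"], counts["memory_card"]["camera"])
print(ranked)
```

If something this simple already gives sensible next-item lists in experiments, that would be a useful baseline before trying an HMM.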
On Wed, Jun 16, 2010 at 10:11 AM, Ted Dunning <[email protected]> wrote:

> Quite possibly this would work quite well.
>
> The only difference between what you and I said was that I suggest
> eliminating many items from the previous item list to avoid spurious
> recommendations. The weighting of the IR engine will help fight that, but I
> would rather not keep the connections if they don't have any relevance.
>
> On Wed, Jun 16, 2010 at 12:05 AM, Gökhan Çapan <[email protected]> wrote:
>
> > Ted,
> > I am not sure that I understood your suggestion correctly. But, I've come up
> > with an idea after reading.
> > If we create a dictionary-like structure with a high-weighted predecessor
> > field, and a previous items field whose entries are constructed like;
> > - an item as the key
> > - its predecessor item in predecessor field
> > - other previous items in the third field
> > Do you think results of a search query with user's recent history yields to
> > a reasonable, ranked list of possible next items?
> >
> > On Tue, Jun 15, 2010 at 8:12 PM, Ted Dunning <[email protected]> wrote:
> >
> > > You have most of the workings available to do a reasonable job of this in
> > > Mahout. The simplest method in my mind is to grovel the logs and emit pairs
> > > of items with the key being the last item and previous items being the
> > > value. Roughly this format should give you what you need for doing
> > > cooccurrence counting and LLR reduction. The remaining pairs can be
> > > sparsified and indexed using Lucene and can probably also be fed into the
> > > Taste part of Mahout. The default Lucene IDF weighting will do a decent job
> > > of emulating Naive Bayes so you can feed in a user's recent history as a
> > > query so that is a reasonable implementation as well.
> > >
> > > On Tue, Jun 15, 2010 at 3:38 AM, Gökhan Çapan <[email protected]> wrote:
> > >
> > > > Hi,
> > > > This is not a question specific to Mahout library. I hope you'll be
> > > > interested.
> > > >
> > > > While recommending to a user, we take his ratings to items, or some
> > > > implicit ratings like his purchase history, click history, etc. into
> > > > account. Item based collaborative filtering techniques generally compute
> > > > item-to-item similarities in a symmetrical way ( sim(item1,item2) =
> > > > sim(item2,item1). This is the nature of a distance measure).
> > > >
> > > > What if we consider user's historical data as a sequence, and want to
> > > > predict the successor item? For example, in an e-commerce domain, we may
> > > > want to find the item to buy after buying some other items. For example, if
> > > > we have a user vector u, where u_ti is the item that user was interested in
> > > > time t_i, what are the possible values of u_current?
> > > >
> > > > Considering active user's interest to items at a specific time as states,
> > > > can we see predicting user's current interest as the unobserved state and
> > > > the user data as an HMM? I do not know well HMM, do you think that point of
> > > > view to the problem seems reasonable? Do you have any ideas/suggestions
> > > > about other solutions if it is not a good way?
> > > > --
> > > > Gökhan Çapan
> >
> > --
> > Gökhan Çapan

--
Gökhan Çapan
