Ted, I am not sure I understood your suggestion correctly, but reading it gave me an idea. Suppose we build a dictionary-like structure (an index) in which each entry is constructed as follows:

- the item itself as the key
- its immediate predecessor in a highly weighted predecessor field
- the other previous items in a third field

Do you think a search query built from a user's recent history would yield a reasonable, ranked list of possible next items?
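To make the idea concrete, here is a minimal Python sketch of the structure I have in mind. It is only an illustration under several assumptions: the names (`build_index`, `rank_next_items`), the field weights, and the choice to skip items already in the history are all hypothetical, and plain occurrence counts stand in for the scoring a real Lucene index would apply.

```python
from collections import Counter, defaultdict

# Hypothetical field weights: the predecessor field counts more than
# the field holding the other previous items.
PREDECESSOR_WEIGHT = 3.0
PREVIOUS_WEIGHT = 1.0


def build_index(sessions):
    """Build one entry per item from ordered item sequences.

    Each entry records how often other items appeared immediately
    before the key item ("predecessor") or earlier than that ("previous").
    """
    index = defaultdict(lambda: {"predecessor": Counter(), "previous": Counter()})
    for session in sessions:
        for pos in range(1, len(session)):
            entry = index[session[pos]]
            entry["predecessor"][session[pos - 1]] += 1
            for earlier in session[: pos - 1]:
                entry["previous"][earlier] += 1
    return index


def rank_next_items(index, history):
    """Score every indexed item against the user's recent history.

    Every history item is looked up in both fields; matches in the
    predecessor field are boosted. Items already in the history are
    skipped (an assumption, not a requirement of the idea).
    """
    seen = set(history)
    scores = {}
    for item, entry in index.items():
        if item in seen:
            continue
        score = 0.0
        for h in history:
            score += PREDECESSOR_WEIGHT * entry["predecessor"][h]
            score += PREVIOUS_WEIGHT * entry["previous"][h]
        if score > 0:
            scores[item] = score
    # Highest score first; ties broken by item name for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    sessions = [["a", "b", "c"], ["a", "b", "d"], ["x", "b", "c"]]
    print(rank_next_items(build_index(sessions), ["a", "b"]))
```

With the three toy sessions above, a history of `a, b` ranks `c` ahead of `d`, because `b` precedes `c` twice but `d` only once.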
On Tue, Jun 15, 2010 at 8:12 PM, Ted Dunning <[email protected]> wrote:

> You have most of the workings available to do a reasonable job of this in
> Mahout. The simplest method in my mind is to grovel the logs and emit pairs
> of items with the key being the last item and previous items being the
> value. Roughly this format should give you what you need for doing
> cooccurrence counting and LLR reduction. The remaining pairs can be
> sparsified and indexed using Lucene and can probably also be fed into the
> Taste part of Mahout. The default Lucene IDF weighting will do a decent job
> of emulating Naive Bayes so you can feed in a user's recent history as a
> query so that is a reasonable implementation as well.
>
> On Tue, Jun 15, 2010 at 3:38 AM, Gökhan Çapan <[email protected]> wrote:
>
> > Hi,
> > This is not a question specific to Mahout library. I hope you'll be
> > interested.
> >
> > While recommending to a user, we take his ratings to items, or some
> > implicit ratings like his purchase history, click history, etc. into
> > account. Item based collaborative filtering techniques generally compute
> > item-to-item similarities in a symmetrical way ( sim(item1,item2) =
> > sim(item2,item1). This is the nature of a distance measure).
> >
> > What if we consider user's historical data as a sequence, and want to
> > predict the successor item? For example, in an e-commerce domain, we may
> > want to find the item to buy after buying some other items. For example, if
> > we have a user vector u, where uti is the item that user was interested in
> > time ti, what are the possible values of ucurrent?
> >
> > Considering active user's interest to items at a specific time as states,
> > can we see predicting user's current interest as the unobserved state and
> > the user data as an HMM? I do not know well HMM, do you think that point of
> > view to the problem seems reasonable? Do you have any ideas/suggestions
> > about other solutions if it is not a good way?
> >
> > --
> > Gökhan Çapan
> >
>

--
Gökhan Çapan
