You have most of the pieces you need to do a reasonable job of this in Mahout.

The simplest method, to my mind, is to grovel through the logs and emit pairs of items, with the key being the latest item and the value being an item that preceded it in the same user's history. Because the pairs are ordered, the counts are not symmetric, which is what you want when predicting a successor. That format should give you roughly what you need for cooccurrence counting and LLR (log-likelihood ratio) reduction.

The pairs that remain after the LLR cut can be sparsified and indexed using Lucene, and can probably also be fed into the Taste part of Mahout. With Lucene, you feed in a user's recent history as the query. The default IDF weighting does a decent job of emulating Naive Bayes, so that is a reasonable implementation as well.
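To make the pair emission and LLR scoring concrete, here is a minimal sketch in plain Python rather than Mahout. Everything in it is an assumption for illustration: the names emit_pairs, llr and score_pairs are made up, the window of 3 previous items is arbitrary, and the 2x2 counts are taken over emitted pairs rather than over users. The LLR itself is the usual entropy form of the G^2 statistic.

```python
from collections import Counter
from math import log


def emit_pairs(history, window=3):
    """Yield (previous_item, later_item) pairs from one user's time-ordered history.

    `window` (an assumed parameter) limits how far back we look.
    """
    for i, current in enumerate(history):
        for prev in history[max(0, i - window):i]:
            yield (prev, current)


def x_log_x(x):
    return 0.0 if x == 0 else x * log(x)


def entropy(*counts):
    # Unnormalized entropy of a set of counts.
    return x_log_x(sum(counts)) - sum(x_log_x(c) for c in counts)


def llr(k11, k12, k21, k22):
    """Log-likelihood ratio (G^2) for a 2x2 contingency table."""
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    # Clamp tiny negative values caused by floating-point round-off.
    return max(0.0, 2.0 * (row + col - mat))


def score_pairs(histories, window=3):
    """Count ordered pairs across all histories and score each with LLR."""
    pair_counts = Counter()
    for history in histories:
        pair_counts.update(emit_pairs(history, window))

    prev_totals, next_totals = Counter(), Counter()
    for (prev, nxt), n in pair_counts.items():
        prev_totals[prev] += n
        next_totals[nxt] += n
    total = sum(pair_counts.values())

    scores = {}
    for (prev, nxt), k11 in pair_counts.items():
        k12 = prev_totals[prev] - k11   # prev followed by something else
        k21 = next_totals[nxt] - k11    # nxt preceded by something else
        k22 = total - k11 - k12 - k21   # neither
        scores[(prev, nxt)] = llr(k11, k12, k21, k22)
    return scores
```

Keeping only the pairs whose score clears some threshold is the reduction step. Note that LLR flags any strong association, including items that follow each other less often than chance, so a real pipeline would probably also check that the observed count exceeds the expected one.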
On Tue, Jun 15, 2010 at 3:38 AM, Gökhan Çapan <[email protected]> wrote:

> Hi,
> This is not a question specific to Mahout library. I hope you'll be
> interested.
>
> While recommending to a user, we take his ratings to items, or some
> implicit ratings like his purchase history, click history, etc. into
> account. Item based collaborative filtering techniques generally compute
> item-to-item similarities in a symmetrical way ( sim(item1,item2) =
> sim(item2,item1). This is the nature of a distance measure).
>
> What if we consider user's historical data as a sequence, and want to
> predict the successor item? For example, in an e-commerce domain, we may
> want to find the item to buy after buying some other items. For example, if
> we have a user vector u, where uti is the item that user was interested in
> time ti, what are the possible values of ucurrent?
>
> Considering active user's interest to items at a specific time as states,
> can we see predicting user's current interest as the unobserved state and
> the user data as an HMM? I do not know well HMM, do you think that point of
> view to the problem seems reasonable? Do you have any ideas/suggestions
> about other solutions if it is not a good way?
> --
> Gökhan Çapan
