Quite possibly this would work well. The only difference between your suggestion and mine is that I would eliminate many items from the previous-items list to avoid spurious recommendations. The weighting of the IR engine will help fight that, but I would rather not keep connections that have no relevance.
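To make the pruning step concrete, here is a minimal sketch in plain Python, not Mahout code. It assumes each session is an ordered list of item ids, treats every item in turn as the "last" item for the items before it, and keeps a (next item, previous item) connection only if its log-likelihood ratio clears a threshold. The names `emit_pairs` and `prune`, and the threshold itself, are made up for illustration.

```python
import math
from collections import Counter


def x_log_x(x):
    # x * ln(x), with the usual convention that 0 * ln(0) = 0.
    return x * math.log(x) if x > 0 else 0.0


def entropy(*counts):
    # Unnormalized entropy of a set of counts.
    return x_log_x(sum(counts)) - sum(x_log_x(c) for c in counts)


def llr(k11, k12, k21, k22):
    # Log-likelihood ratio for a 2x2 contingency table.
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    # Clamp tiny negative values caused by floating-point round-off.
    return max(0.0, 2.0 * (row + col - mat))


def emit_pairs(sessions):
    # Assumption: each session is a list of item ids in time order.
    # Emit (next_item, previous_item) for every earlier item in the session.
    for items in sessions:
        for i in range(1, len(items)):
            for prev in set(items[:i]):
                yield items[i], prev


def prune(sessions, threshold):
    # Count cooccurrences, then drop connections with a low LLR score.
    pair_counts = Counter(emit_pairs(sessions))
    next_counts = Counter()
    prev_counts = Counter()
    for (nxt, prev), n in pair_counts.items():
        next_counts[nxt] += n
        prev_counts[prev] += n
    total = sum(pair_counts.values())

    kept = {}
    for (nxt, prev), k11 in pair_counts.items():
        k12 = next_counts[nxt] - k11    # nxt seen after some other item
        k21 = prev_counts[prev] - k11   # prev seen before some other item
        k22 = total - k11 - k12 - k21   # pairs involving neither
        score = llr(k11, k12, k21, k22)
        if score >= threshold:
            kept.setdefault(nxt, {})[prev] = score
    return kept
```

The surviving `kept[next_item]` entries are what would be indexed as the previous-items field. The threshold has to be tuned on real data, and since LLR measures strength of association in either direction, a production version would probably also check that the pair occurs more often than expected rather than less.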

On Wed, Jun 16, 2010 at 12:05 AM, Gökhan Çapan wrote:

> Ted,
> I am not sure that I understood your suggestion correctly. But, I've come up
> with an idea after reading.
> If we create a dictionary-like structure with a high-weighted predecessor
> field, and a previous items field whose entries are constructed like;
> - an item as the key
> - its predecessor item in predecessor field
> - other previous items in the third field
> Do you think results of a search query with user's recent history yields to
> a reasonable, ranked list of possible next items?
>
> On Tue, Jun 15, 2010 at 8:12 PM, Ted Dunning wrote:
>
> > You have most of the workings available to do a reasonable job of this in
> > Mahout. The simplest method in my mind is to grovel the logs and emit pairs
> > of items with the key being the last item and previous items being the
> > value. Roughly this format should give you what you need for doing
> > cooccurrence counting and LLR reduction. The remaining pairs can be
> > sparsified and indexed using Lucene and can probably also be fed into the
> > Taste part of Mahout. The default Lucene IDF weighting will do a decent job
> > of emulating Naive Bayes so you can feed in a user's recent history as a
> > query so that is a reasonable implementation as well.
> >
> > On Tue, Jun 15, 2010 at 3:38 AM, Gökhan Çapan wrote:
> >
> > > Hi,
> > > This is not a question specific to Mahout library. I hope you'll be
> > > interested.
> > >
> > > While recommending to a user, we take his ratings to items, or some
> > > implicit ratings like his purchase history, click history, etc. into
> > > account. Item based collaborative filtering techniques generally compute
> > > item-to-item similarities in a symmetrical way ( sim(item1,item2) =
> > > sim(item2,item1). This is the nature of a distance measure).
> > >
> > > What if we consider user's historical data as a sequence, and want to
> > > predict the successor item? For example, in an e-commerce domain, we may
> > > want to find the item to buy after buying some other items. For example, if
> > > we have a user vector u, where u_ti is the item that user was interested in
> > > time t_i, what are the possible values of u_current?
> > >
> > > Considering active user's interest to items at a specific time as states,
> > > can we see predicting user's current interest as the unobserved state and
> > > the user data as an HMM? I do not know well HMM, do you think that point of
> > > view to the problem seems reasonable? Do you have any ideas/suggestions
> > > about other solutions if it is not a good way?
> > > --
> > > Gökhan Çapan
>
> --
> Gökhan Çapan
