Understood. Catalog categories, tags, etc. will make good metadata to include in the query, and putting them in separate fields allows us to boost each one separately. UserIDs that have interacted with the item is an interesting idea.
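To make the per-field boosting concrete, here is a minimal sketch that builds a Lucene-syntax query string with one boosted clause per metadata field. The field names (categories, tags, user_ids), the terms, and the boost values are all made up for illustration, and the sketch assumes the terms are plain tokens that need no escaping.

```python
from urllib.parse import urlencode


def build_boosted_query(field_terms, boosts):
    """Build a query string with one boosted clause per field.

    field_terms maps a field name to the list of terms to match in it;
    boosts maps a field name to its boost (defaults to 1.0).
    Assumes terms are simple tokens with no special query characters.
    """
    clauses = []
    for field, terms in field_terms.items():
        if not terms:
            continue  # skip fields with nothing to query
        boost = boosts.get(field, 1.0)
        clauses.append(f"{field}:({' '.join(terms)})^{boost}")
    return " ".join(clauses)


# Hypothetical fields and weights, chosen only to show the shape of the query.
query = build_boosted_query(
    {
        "categories": ["italian", "pizza"],
        "tags": ["elegant"],
        "user_ids": ["u17", "u42"],
    },
    {"categories": 2.0, "tags": 1.5, "user_ids": 0.5},
)

# URL-encoded parameters for a select request; "fl" and "rows" are standard
# Solr request parameters.
params = urlencode({"q": query, "fl": "id,score", "rows": 10})
print(query)
```

Because each field gets its own clause, the weights can be tuned independently without reindexing.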
However the specific case I'm describing is not about content similarity. Talking here about item-item similarity exactly as encoded in the similarity matrix. The order or rank of these item-item similarities should be preserved and I was proposing doing so with the order of the itemID terms in the document. The query will return history based recs ranked by the order Solr applies. The doc itself for any item contains similar items ordered by their similarity magnitude, precalculated in Mahout RowSimilarityJob.

On Jul 24, 2013, at 7:19 PM, Ted Dunning <[email protected]> wrote:

Content based item similarity is a fine thing to include in a separate field.

In addition, it is reasonable to describe a person's history in terms of the meta-data on the items they have interacted with. That allows you to build a set of socially driven meta-data indicators as well. This can be useful in the restaurant example where you might find that "elegant" or "home-style" might be good indicators for different restaurants even if those terms don't appear in a restaurant description.

Sent from my iPhone

On Jul 23, 2013, at 18:26, Pat Ferrel <[email protected]> wrote:

> Honestly not trying to make this more complicated but…
>
> In the purely Mahout cross-recommender we got a ranked list of similar items
> for any item so we could combine personal history-based recs with
> non-personalized item similarity-based recs wherever we had an item context.
> In a past ecom case the item similarity recs were quite useful when a user
> was looking at an item already. In that case even if the user was unknown we
> could make item similarity-based recs.
>
> How about if we order the items in the doc by rank in the existing fields
> since they are just text? Then we would do user-history-based queries on the
> fields for recs and docs[itemID].field to get the ordered list of items out
> of any doc. Doing an ensemble would require weights though. Unless someone
> knows a rank based method for combining results. I guess you could vote or
> add rank numbers of like items or the log thereof...
>
> I assume the combination of results from [B'B] and [B'A] will be a query over
> both fields with some boost or other to handle ensemble weighting. But if you
> want to add item similarity recs another method must be employed, no?
>
> From past experience I strongly suspect item similarity rank is not something
> we want to lose so unless someone has a better idea I'll just order the IDs
> in the fields and call it good for now.
>
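For what it's worth, the "add rank numbers of like items or the log thereof" idea from the quoted message can be sketched in a few lines. This is only one possible reading of it, not something settled in the thread: each item's 1-based rank is summed across the lists, and an item missing from a list is charged that list's length plus one (an assumption made here so that appearing in more lists helps). The item IDs and list contents are invented for illustration.

```python
import math


def combine_by_rank(ranked_lists, use_log=False):
    """Merge ranked lists by summing each item's 1-based rank across lists.

    An item absent from a list is charged len(list) + 1 for that list.
    With use_log=True, log(1 + rank) is summed instead of the raw rank.
    Lower totals are better; ties keep first-seen order.
    """
    # Collect items in first-seen order so that ties sort deterministically.
    all_items = []
    for ranked in ranked_lists:
        for item in ranked:
            if item not in all_items:
                all_items.append(item)

    totals = {}
    for item in all_items:
        total = 0.0
        for ranked in ranked_lists:
            rank = ranked.index(item) + 1 if item in ranked else len(ranked) + 1
            total += math.log1p(rank) if use_log else rank
        totals[item] = total

    # sorted() is stable, so equal totals stay in first-seen order.
    return sorted(all_items, key=lambda item: totals[item])


# Hypothetical inputs: recs from a user-history query, and the ordered
# similar-item IDs read out of the current item's doc.
history_recs = ["i3", "i7", "i1"]
similar_items = ["i7", "i2", "i3"]

combined = combine_by_rank([history_recs, similar_items])
combined_log = combine_by_rank([history_recs, similar_items], use_log=True)
print(combined)
```

This needs no score normalization or ensemble weights, since only positions are used. The cost is that similarity magnitudes are thrown away, so two lists with very different confidence count equally.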
