(Uncanny, I was researching Grooveshark just minutes ago for unrelated reasons... Good to hear from any company that is doing recommendations and is willing to talk about it. I know of a number that can't or won't, unfortunately.)
Yeah, sounds like we're all on the same page.

One key point in what I think everyone is talking about is that this is not simply a matter of removing items *after* recommendations are computed. Filtering after the fact risks removing most or all of the recommended items. It needs to be done during the process of selecting recommendations.

But beyond that, it's a simple idea and just a question of implementation. In the non-Hadoop code this is the "Rescorer", which does more than provide a way to remove items: it can rearrange recommendations in general, according to some logic.

I think it's likely easy and useful to imitate this with a simple, optional Mapper/Reducer phase in the nascent "RecommenderJob" pipeline that Sebastian is now helping expand into something more configurable and general purpose.

Sean

On Mon, Aug 23, 2010 at 8:25 PM, Chris Bates <[email protected]> wrote:
> Hi all,
>
> I'm new to this forum and haven't seen the code you are talking about, so
> take this with a grain of salt. The way we handle "banned items" at
> Grooveshark is to post-process the itemID pairs in Hive. If a user dislikes
> a recommended song/artist, an item pair is stored in HDFS and then when the
> recs are computed, those banned user-item pairs are taken into account.
> Here is an example query:
>
> SELECT DISTINCT st.uid, st.simuid, IF(b.uid=st.uid,1,0) as banned FROM
> streams_u2u st LEFT OUTER JOIN bannedsimusers b ON (b.simuid=st.simuid);
>
> That query will print out a 1 or a 0 if the recommended item pair is banned
> or not. Hive also supports case statements (I think), so you can make a
> range of "banned-ness" I guess. Just another solution to the "dislike"
> problem.
>
> Chris
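
To make the point about filtering during selection concrete, here is a minimal Python sketch. It is not Mahout code: `top_n`, `ban`, the item IDs and the scores are all hypothetical, and the rescoring callback is only loosely modeled on the Rescorer idea (it may change a score, or return NaN to exclude an item). It compares dropping banned items after the top 2 have been chosen with excluding them while the top 2 are being chosen.

```python
import heapq
import math


def top_n(scores, n, rescore=None):
    """Pick the n best items, applying the optional rescore hook during selection.

    scores:  dict of item ID -> estimated preference (made-up data)
    rescore: optional function (item, score) -> new score; NaN means "exclude"
    """
    candidates = []
    for item, score in scores.items():
        if rescore is not None:
            score = rescore(item, score)
        if math.isnan(score):
            continue  # excluded before it can occupy a top-n slot
        candidates.append((score, item))
    return [item for _, item in heapq.nlargest(n, candidates)]


scores = {"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.6, "e": 0.5}
banned = {"a", "b"}

# Post-filtering: choose the top 2 first, then drop the banned items.
post_filtered = [item for item in top_n(scores, 2) if item not in banned]


# Filtering during selection: banned items never compete for a slot.
def ban(item, score):
    return float("nan") if item in banned else score


in_process = top_n(scores, 2, rescore=ban)

print(post_filtered)  # []
print(in_process)     # ['c', 'd']
```

With these made-up numbers, post-filtering leaves the user with nothing, while filtering during selection still returns two items. Because the hook returns a score rather than a yes/no answer, the same mechanism could also boost or demote items instead of only removing them.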
