Nice question. I have answers I like. Really, it would be better to find words that mean thing-being-recommended-to and thing-being-recommended. I couldn't find easy, general terms that were more intuitive than "user" and "item". These things need not be actual people or products, so the terms are inaccurate, but they connote the right ways of thinking about what they are and how they work.

You could also say that since both can be anything, there should be at most one term for both: a thing or entity. I don't like this, on the same grounds: it makes things harder to think about in practice. Is that "thingID" in the code the thing being recommended, or the thing being recommended to?

More importantly, I don't think users and items are entirely symmetric, even though you could plug items in for users and vice versa. For instance, one is 'causing' the ratings and the other isn't. It's harder to make future predictions about the black-box source of new, surprising data. That is, I may learn something quite new about you from your 1000th rating, when you rate your first classical music album ever; the 1000th rating for that same album probably doesn't add much new info. Users, the causers, are more variable. And I think you do tend to have an independent/dependent variable, so to speak, in any setup.

And the algorithms sort of embed that asymmetry. Item-based recommenders aren't quite the same as user-based ones. For example, the item-based approach rather encourages you to pre-compute item-item similarity, since this is likely to be relatively fixed, items being the dependent variable.

On Tue, Oct 18, 2011 at 9:24 AM, Dan Brickley <[email protected]> wrote:
> As an aside, I've notice this 'users' terminology lurking in the
> background of RowSimilarityJob (eg. in JIRA discussion).
>
> My use of it last week seemed perfectly reasonable; but rows were
> books (or bibliographic records), with feature columns from library
> topic codes. Does the 'user' terminology suggest it's really focussed
> on recommendations?
>
> I'm used to seeing this in the Taste part of Mahout, where sometimes
> it's suggested we can re-use recommender pieces by eg. thinking more
> broadly and 'recommending topics to books' or vice versa. This makes
> sense but introduces an extra layer of conceptual confusion. Is there
> any important sense in which rows (or columns?) in RowSimilarityJob
> ought to be thought of as users? Or the values/weights as preferences?
>
> cheers,
>
> Dan
