I misunderstood the original question then. Thing-thing similarity is a key piece of most recommender algorithms. RowSimilarityJob is reused for the distributed computation. In those senses, the answer is 'yes'.

But I think the answer is 'no'. The similarity metrics are not derived from recommenders and are used in other contexts. You can compute thing-thing similarity for other reasons. There is no user-item asymmetry at this level, I think. Using the user-item terms is not well motivated here. At the same time, I don't think it hurts much, and it lets you understand the relation to recommendations (a primary user of this general component) more easily.

On Thu, Oct 20, 2011 at 12:06 PM, Dan Brickley <[email protected]> wrote:
> In general, I completely agree with your perspective here. Even when
> everything bottoms out as matrix maths underneath, that doesn't mean
> that developers should only ever see that abstraction in their
> day-to-day hacking. Mahout lets you adopt at various levels; Taste
> gives almost a drop-in running service; the bin/mahout utility and
> recommender APIs give a variety of high level entry points, and then
> of course being open source, Java developers can jump into the code at
> any level that suits their need. For lots of those entry points,
> 'user' and 'item' are a great way to present things.
>
> Anyhow, I think my question still holds: is the 'bin/mahout
> rowsimilarity' piece of Mahout something that should be understood
> primarily as a recommendations-oriented component? For my application
> I was seeking just 'the most similar books' for any given book, to
> feed those affinities to Gephi for visual mapping. I could
> conceptualise this in terms of recommending I guess; but I didn't. So
> that's why I was mildly surprised when I noticed that others in Jira
> and email did seem to think of RowSimilarityJob in
> recommendation-oriented terms (ie. users and items). I completely
> agree that those are useful notions to have in the APIs and utilities,
> I just somehow wasn't expecting it right there (just as I wouldn't
> expect it on the more mathsy APIs either).
>
> cheers,
>
> Dan
>
> ps. as an aside, your points here also remind me of a few passages in
> http://en.wikipedia.org/wiki/Six_Degrees:_The_Science_of_a_Connected_Age
> that emphasise how a purely mathematical perspective on
> networks/graphs can obscure the ways in which different kinds of
> network can usefully be understood, and that sometimes you do need to
> think about the social context alongside the maths...
>
>> On Tue, Oct 18, 2011 at 9:24 AM, Dan Brickley <[email protected]> wrote:
>>> As an aside, I've noticed this 'users' terminology lurking in the
>>> background of RowSimilarityJob (eg. in JIRA discussion).
>>>
>>> My use of it last week seemed perfectly reasonable; but rows were
>>> books (or bibliographic records), with feature columns from library
>>> topic codes. Does the 'user' terminology suggest it's really focussed
>>> on recommendations?
>>>
>>> I'm used to seeing this in the Taste part of Mahout, where sometimes
>>> it's suggested we can re-use recommender pieces by eg. thinking more
>>> broadly and 'recommending topics to books' or vice versa. This makes
>>> sense but introduces an extra layer of conceptual confusion. Is there
>>> any important sense in which rows (or columns?) in RowSimilarityJob
>>> ought to be thought of as users? Or the values/weights as preferences?
>>>
>>> cheers,
>>>
>>> Dan
>
