On Jul 22, 2011, at 7:23 AM, Niall Riddell wrote:

> I've gone through MIA and felt the rowsimilarityjob was a
> possibility; however, I understand that a JIRA has been raised to make
> this potentially less general, and in its current form it may not
> match my performance/cost criteria (i.e. high/low).

I don't think the goal of the JIRA issue is to make it less general; it's to make the cases that can benefit from smarter use of the co-occurrences scale better. I see no reason why the existing format can't also be maintained for those similarity measures that can't benefit from more map-side calculation.

-Grant
