I don't know a good name for that. The problem is that a quadratic
number of pairs needs to be emitted here. In our collaborative filtering
code, we solve this through downsampling.
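
A rough sketch of what downsampling could look like, written as plain Java rather than as the actual Mahout code. The class name PairDownsampler, the method names, and the cap maxSize are all made up for illustration. The idea is that n values yield n(n-1)/2 pairs, so capping n per key bounds both the memory used and the number of pairs emitted. The sample is taken with reservoir sampling, so the values are streamed once and never held in full. In a real reducer you would copy the primitive out with get() while iterating, since Hadoop reuses the same Writable instance for every value.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of Mahout or Hadoop.
public class PairDownsampler {

    // Reservoir sampling: keeps at most maxSize values from the stream,
    // each value having the same chance of ending up in the sample.
    static List<Long> sample(Iterator<Long> values, int maxSize, Random random) {
        List<Long> reservoir = new ArrayList<>(maxSize);
        long seen = 0;
        while (values.hasNext()) {
            long value = values.next();
            seen++;
            if (reservoir.size() < maxSize) {
                reservoir.add(value);
            } else {
                // Pick a slot in [0, seen); keep the value only if the
                // slot falls inside the reservoir.
                long slot = (long) (random.nextDouble() * seen);
                if (slot < maxSize) {
                    reservoir.set((int) slot, value);
                }
            }
        }
        return reservoir;
    }

    // Builds every unordered pair (i < j) of the sampled values.
    static List<long[]> pairs(List<Long> items) {
        List<long[]> result = new ArrayList<>();
        for (int i = 0; i < items.size(); i++) {
            for (int j = i + 1; j < items.size(); j++) {
                result.add(new long[] {items.get(i), items.get(j)});
            }
        }
        return result;
    }
}
```

In a reducer, the inner loop would call context.write for each pair instead of collecting them in a list. The cap is a trade-off: a smaller maxSize loses more co-occurrences for very active keys, but keeps the output per key bounded.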
--sebastian
On 04/08/2014 10:08 AM, Reinis Vicups wrote:
Hi,
this is not directly a Mahout question, but I figured that you guys most
likely can answer it.
Actually I have two questions:
1. This: {(1,2); (1,3); (2,3)} is not a full cartesian product, right? It
is missing (1,1); (2,2); (3,3); (2,1);.... My question is: what is it
called? Partial cartesian? Asymmetric cartesian?
2. If I try to build the product I described above in a reducer, what
would be the best practice? My current code looks like this:
    @Override
    public void reduce(final VarLongWritable key, final Iterable<VarLongWritable> values, final Context context) {
        final VarLongWritable[] valueArray = Iterables.toArray(values, VarLongWritable.class);
        for (int i = 0; i < valueArray.length; i++) {
            for (int j = i + 1; j < valueArray.length; j++) {
                context.write(new PairWritable(valueArray[i].get(), valueArray[j].get()), customerPreferenceWritable);
            }
        }
    }
I don't feel quite right about this solution, since I make a copy of the
values in "valueArray" and believe that it will cause
OutOfMemoryErrors with larger data sets.
thanks and br
reinis