>
>
> Accessing a separate SequenceFile from within a Mapper is *way inefficient*
> (orders of magnitude slower).
>
> You want to do a map-side join.  This is what is done in MatrixMultiplyJob -
> your Mapper gets an IntWritable as key, and the value is a Pair of
> VectorWritables - one from each matrix.
>

Excellent. Any idea what the Hadoop 0.20.2 equivalent for
CompositeInputFormat is? :)
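
For anyone following along, here is a rough plain-Java sketch of the idea
behind a map-side join as I understand it: both inputs are already sorted by
the same key, so they can be merged in one pass and each matching key comes
out with one value from each side. This is not the Hadoop API. The class and
method names (MergeJoinSketch, Joined, mergeJoin) are made up for
illustration, Strings stand in for the VectorWritables, and it assumes each
key appears at most once per input.

```java
import java.util.ArrayList;
import java.util.List;

public class MergeJoinSketch {

    // Hypothetical record type: a row key plus one value from each input.
    public static final class Joined {
        public final int key;
        public final String left;
        public final String right;

        public Joined(int key, String left, String right) {
            this.key = key;
            this.left = left;
            this.right = right;
        }

        @Override
        public String toString() {
            return key + ":(" + left + "," + right + ")";
        }
    }

    // Inner join of two inputs, each sorted ascending by key with unique keys.
    // Both inputs are read once, front to back; no random lookups are needed.
    public static List<Joined> mergeJoin(int[] leftKeys, String[] leftVals,
                                         int[] rightKeys, String[] rightVals) {
        List<Joined> out = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < leftKeys.length && j < rightKeys.length) {
            if (leftKeys[i] < rightKeys[j]) {
                i++; // key only present on the left so far, skip it
            } else if (leftKeys[i] > rightKeys[j]) {
                j++; // key only present on the right so far, skip it
            } else {
                // Same key on both sides: emit the pair, as a mapper would receive it.
                out.add(new Joined(leftKeys[i], leftVals[i], rightVals[j]));
                i++;
                j++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[] aKeys = {0, 1, 3};
        String[] aRows = {"a0", "a1", "a3"};
        int[] bKeys = {1, 2, 3};
        String[] bRows = {"b1", "b2", "b3"};
        System.out.println(mergeJoin(aKeys, aRows, bKeys, bRows));
    }
}
```

The point of the sketch is only that a sorted merge touches each record once,
which is why it beats opening a second SequenceFile from inside the Mapper.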
