+1. Hmm, it is tricky.

On Tue, Jan 6, 2009 at 11:04 AM, Edward J. Yoon <[email protected]> wrote:
> If we collect the blocks into one table during blocking_mapred(), we
> get locality and it will be faster.
>
> row Key       column:A     column:B
> c(0, 0) +=    a(0, 0) *    b(0, 0)
> c(0, 0) +=    a(0, 1) *    b(1, 0)
> c(0, 0) +=    a(0, 2) *    b(2, 0)
> c(0, 0) +=    a(0, 3) *    b(3, 0)
> c(0, 1) +=    a(0, 0) *    b(0, 1)
> c(0, 1) +=    a(0, 1) *    b(1, 1)
> ...
>
> What do you think?
>
> On Mon, Jan 5, 2009 at 10:30 AM, Edward J. Yoon <[email protected]> wrote:
> > Hama trunk doesn't work for large matrix multiplication: it fails
> > with mapred.task.timeout and scanner.timeout exceptions. I tried a
> > 1,000,000 * 1,000,000 matrix multiplication on 100 nodes. (The rest
> > works fine.)
> >
> > To reduce repeated reads of the same block, I considered the approach
> > described below, but the work done by each map seems too large.
> >
> > ----
> > // c[i][k] += a[i][j] * b[j][k];
> >
> > map() {
> >   SubMatrix a = value.get();
> >
> >   for (RowResult row : scan) {
> >     collect : c[i][k] = a * b[j][k];
> >   }
> > }
> >
> > reduce() {
> >   c[i][k] += c[i][k];
> > }
> > ----
> >
> > Should we increase mapred.task.timeout and scanner.timeout?
> > Or is there a better idea?
> >
> > --
> > Best Regards, Edward J. Yoon @ NHN, corp.
> > [email protected]
> > http://blog.udanax.org
>
> --
> Best Regards, Edward J. Yoon @ NHN, corp.
> [email protected]
> http://blog.udanax.org
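
For reference, here is a minimal in-memory sketch of the collected-table idea quoted above, in plain Java with no Hadoop or HBase dependency. It is only an illustration, not Hama code: BlockMultiplySketch, BlockPair, collectBlocks, and mapMultiply are hypothetical names, and it assumes square matrices whose size is divisible by the block size. Each collected row holds the key c(i, k) together with a(i, j) and b(j, k), so a map call needs no second scan, and the reduce step only sums partial blocks per key.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for the proposed collected table.
// None of these names come from Hama or HBase.
class BlockMultiplySketch {

    // One table row: key c(i, k), column:A = a(i, j), column:B = b(j, k).
    static final class BlockPair {
        final int i, k;
        final double[][] a, b;
        BlockPair(int i, int k, double[][] a, double[][] b) {
            this.i = i; this.k = k; this.a = a; this.b = b;
        }
    }

    // Copy the s x s block at block coordinates (bi, bj) out of m.
    static double[][] block(double[][] m, int bi, int bj, int s) {
        double[][] out = new double[s][s];
        for (int r = 0; r < s; r++) {
            System.arraycopy(m[bi * s + r], bj * s, out[r], 0, s);
        }
        return out;
    }

    // Blocking step: emit one row per (i, j, k) so every map input is self-contained.
    static List<BlockPair> collectBlocks(double[][] a, double[][] b, int s) {
        int nb = a.length / s; // assumes square matrices with n divisible by s
        List<BlockPair> rows = new ArrayList<>();
        for (int i = 0; i < nb; i++)
            for (int k = 0; k < nb; k++)
                for (int j = 0; j < nb; j++)
                    rows.add(new BlockPair(i, k, block(a, i, j, s), block(b, j, k, s)));
        return rows;
    }

    // Map step: multiply the two blocks stored in a single row.
    static double[][] mapMultiply(BlockPair p) {
        int s = p.a.length;
        double[][] c = new double[s][s];
        for (int r = 0; r < s; r++)
            for (int m = 0; m < s; m++)
                for (int col = 0; col < s; col++)
                    c[r][col] += p.a[r][m] * p.b[m][col];
        return c;
    }

    // Reduce step: sum the partial products that share the key c(i, k),
    // then copy each summed block into the result matrix.
    static double[][] multiply(double[][] a, double[][] b, int s) {
        int n = a.length;
        Map<String, double[][]> sums = new HashMap<>();
        for (BlockPair p : collectBlocks(a, b, s)) {
            double[][] partial = mapMultiply(p);
            double[][] acc = sums.computeIfAbsent(p.i + "," + p.k, key -> new double[s][s]);
            for (int r = 0; r < s; r++)
                for (int col = 0; col < s; col++)
                    acc[r][col] += partial[r][col];
        }
        double[][] c = new double[n][n];
        for (Map.Entry<String, double[][]> e : sums.entrySet()) {
            String[] ik = e.getKey().split(",");
            int i = Integer.parseInt(ik[0]);
            int k = Integer.parseInt(ik[1]);
            for (int r = 0; r < s; r++)
                System.arraycopy(e.getValue()[r], 0, c[i * s + r], k * s, s);
        }
        return c;
    }
}
```

Note that in this sketch every block of A and B is copied once per block row or column of C, so the collected table trades extra storage for locality. Whether that trade is acceptable at 1,000,000 * 1,000,000 is exactly the tricky part.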
