Hmm, revision 725570 (Samuel's logic) is faster than revision 726934 at large scale.

On Thu, Dec 11, 2008 at 5:50 PM, Samuel Guo <[email protected]> wrote:
> Oh, the code is not necessary. The JVM will set the initial value of a new
> array. Sorry.
>
> On Thu, Dec 11, 2008 at 4:37 PM, Edward J. Yoon <[email protected]> wrote:
>
>> Oh, one question.
>>
>> Why do we need to fill C with zeros?
>>
>>   public SubMatrix mult(SubMatrix b) {
>>     double[][] C = new double[this.getRows()][b.getColumns()];
>>     for (int i = 0; i < this.getRows(); i++) {
>>       Arrays.fill(C[i], 0);
>>     }
>>
>>     for (int i = 0; i < this.getRows(); i++) {
>>       for (int j = 0; j < b.getColumns(); j++) {
>>         for (int k = 0; k < this.getColumns(); k++) {
>>           C[i][j] += this.get(i, k) * b.get(k, j);
>>         }
>>       }
>>     }
>>
>>     return new SubMatrix(C);
>>   }
>>
>>
>> On Thu, Dec 11, 2008 at 5:30 PM, Samuel Guo <[email protected]> wrote:
>> > On Thu, Dec 11, 2008 at 2:36 PM, Edward J. Yoon <[email protected]> wrote:
>> >
>> >> If we remove the 'reduce phase', I guess we can reduce the disk I/O
>> >> operations.
>> >
>> > Yes.
>> >
>> >> In the map, read { Constants.BLOCK_STARTROW, Constants.BLOCK_ENDROW,
>> >> Constants.BLOCK_STARTCOLUMN, Constants.BLOCK_ENDCOLUMN } instead of {
>> >> Constants.COLUMN }, and write blocks directly.
>> >
>> > Two methods to be considered:
>> > 1) We need an InputFormat that partitions the matrix table according to
>> > the row boundaries of the blocks.
>> > This should be done carefully to make sure a single block will not be
>> > divided across two or more mappers.
>> >
>> > 2) Like what RandomMatrixMap does, we just tell the mappers the
>> > row/column boundaries of the blocks of a matrix table.
>> > Scanning that portion of the table will be done in a mapper.
>> >
>> > I think 1) may be better than 2).
>> > An InputFormat can get the locality of a range of the table to let MR
>> > know how to move the MR computations close to it.
>> > In 2), if we do it like RandomMatrixMap, we may lose some locality
>> > information of the table, so that the network transfer overhead may
>> > increase.
>> >
>> > It is just my guess and thoughts.
>> >
>> >> What do you think?
>> >>
>> >> --
>> >> Best Regards, Edward J. Yoon @ NHN, corp.
>> >> [email protected]
>> >> http://blog.udanax.org
>> >
>>
>> --
>> Best Regards, Edward J. Yoon @ NHN, corp.
>> [email protected]
>> http://blog.udanax.org
>

--
Best Regards, Edward J. Yoon @ NHN, corp.
[email protected]
http://blog.udanax.org
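
P.S. For reference, here is a minimal sketch of the quoted mult() without the Arrays.fill pass, relying on the point Samuel made above: Java initializes every element of a new double array to 0.0. The class name BlockMultiplySketch and the plain double[][] arguments are assumptions made so the example stands alone; this is not the SubMatrix class itself. It also assumes non-empty, rectangular inputs whose inner dimensions match.

```java
import java.util.Arrays;

// Hypothetical stand-alone class for illustration; not Hama's SubMatrix.
class BlockMultiplySketch {

    // Multiplies a (n x m) by b (m x p) and returns the n x p product.
    // Assumes both arrays are non-empty, rectangular, and a[0].length == b.length.
    static double[][] mult(double[][] a, double[][] b) {
        int rows = a.length;
        int inner = b.length;
        int cols = b[0].length;

        // Every element of a newly allocated double array starts at 0.0,
        // so no Arrays.fill(c[i], 0) pass is needed before accumulating.
        double[][] c = new double[rows][cols];

        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                for (int k = 0; k < inner; k++) {
                    c[i][j] += a[i][k] * b[k][j];
                }
            }
        }
        return c;
    }

    public static void main(String[] args) {
        double[][] a = {{1, 2}, {3, 4}};
        double[][] b = {{5, 6}, {7, 8}};
        // prints [[19.0, 22.0], [43.0, 50.0]]
        System.out.println(Arrays.deepToString(mult(a, b)));
    }
}
```

Dropping the fill loop saves one extra pass over C per multiplication, but it does not change the result.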
