Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hama Wiki" for change notification.
The following page has been changed by udanax:
http://wiki.apache.org/hama/PerformanceEvaluation

------------------------------------------------------------------------------
- == Benchmarks ==
+  * work in progress.
+  * See also : http://blog.udanax.org/2009/01/distributed-matrix-multiplication-with.html
- This performance contains data load and export operations.
-
- Dependencies Information :
-
-  * Hadoop 0.18.2
-  * Hbase 0.18.1
-
- Hardware Information :
-
-  * 4 Intel(R) Xeon(R) CPU 2.33GHz, SATA hard disk, Physical Memory 16,626,844 KB
- ----
-
-  * Dense matrix add
-  * Dense matrix multiply
-
- ||<bgcolor="#ececec"> Version ||<bgcolor="#ececec"> Operation ||<bgcolor="#ececec"> Cluster Size ||<bgcolor="#ececec"> Rows ||<bgcolor="#ececec"> Columns ||<bgcolor="#ececec"> Total Maps ||<bgcolor="#ececec"> Total Reduces ||<bgcolor="#ececec"> Time (seconds) ||<bgcolor="#ececec"> Bytes Read ||<bgcolor="#ececec"> Bytes Written||<bgcolor="#ececec"> mapred.child.java.opts ||
- ||Trunk 718158 || Mult ||2 node ||300 ||300 ||2||2||12 seconds ||1,464,484 || 2,929,092|| -Xmx200m ||
- ||Trunk 720735 || Mult ||2 node ||1,000 ||1,000 ||2||2||20 seconds || 16,166,452 || 32,333,028 || -Xmx200m ||
- ||Trunk 722320 || Add || 2 node ||3,000 ||3,000 ||4||2||298 seconds || 1,053,503,366 || 1,575,781,107 || -Xmx200m ||
- ||Trunk 722320 || Mult ||2 node ||3,000 ||3,000 ||4||2||124 seconds || 590,672,392 || 872,228,808 || -Xmx200m ||
- ||Trunk 722320 || Mult ||2 node ||5,000 ||5,000 ||50||4||912 seconds || 24,434,034,076 || 34,631,558,186 || -Xmx200m ||
-
- {{{
- NOTE: The following numbers are obtained by using poe+ on the entire code, including minimal I/O and matrix construction.
-
- Matrix-Matrix Multiply of 5,000 by 5,000 dense matrix
-
-   Mflip/s  Wall sec  Library
-   -------  --------  -------------------------------------------
-     8,300        30  PESSL PDGEMM (16 processors)
-     7,900        32  ScaLAPACK routine PDGEMM (16 processors)
-     7,900        32  ESSL-SMP routine DGEMM (16 threads)
-     7,900        32  NAG-SMP routine F01CKF (16 threads)
-     1,200       213  ESSL routine DGEMM
-
- Matrix-Matrix Multiply of 20,000 by 20,000 dense matrix
-
-   Mflip/s  Wall sec  Library and configuration
-   -------  --------  -------------------------------------------
-   158,900       100  ScaLAPACK PDGEMM (256 proc, 16 nodes)
-   146,200       110  PESSL PDGEMM (256 proc, 16 nodes)
-   105,400       150  ScaLAPACK PDGEMM (144 proc, 9 nodes, block 128)
-   100,960       160  PESSL PDGEMM (144 proc, 9 nodes, block 128)
-    79,400       200  PESSL PDGEMM (144 proc, 9 nodes, block 1024)
-    74,800       214  ScaLAPACK PDGEMM (144 proc, 9 nodes, block 1024)
-    55,000       290  PESSL PDGEMM (64 proc, 4 nodes)
-    50,000       320  ScaLAPACK PDGEMM (64 proc, 4 nodes)
-    27,160       590  PESSL PDGEMM (32 proc, 2 nodes)
-    25,630       625  ScaLAPACK PDGEMM (32 proc, 2 nodes)
-    15,800     1,010  PESSL PDGEMM (16 Proc, 1 node)
-    15,600     1,025  ScaLAPACK PDGEMM (16 Proc, 1 node)
-
- Matrix-Matrix Multiply of Larger Dense Matrix
-
-   Gflip/s  Wall sec  Size     Library and configuration
-   -------  --------  -------  -------------------------------------------
-     163.6     1,529   50,000  ScaLAPACK PDGEMM (256 proc, 16 nodes)
-     163.4     1,531   50,000  PESSL PDGEMM (256 proc, 16 nodes)
-     179.6    11,141  100,000  PESSL PDGEMM (256 proc, 16 nodes, 128 block)
-     210.7     9,495  100,000  ScaLAPACK PDGEMM (256 proc, 16 nodes, 128 block)
- }}}
- ----
-
-  * Dense LU factorization
-  * Transpose
-  * Matrix tridiagonalization, for eigenvalue computations of symmetric matrices.
-
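Reader's note on the removed reference tables: the Mflip/s and Gflip/s figures appear roughly consistent with the conventional count of 2n^3 operations for an n-by-n dense multiply, divided by the wall time. The Python sketch below recomputes a few rows under that assumption. The function name `dense_matmul_rate` is hypothetical, and the 2n^3 count is an assumption rather than something the page states; the rates were measured with poe+, which may count operations differently.

```python
def dense_matmul_rate(n, wall_seconds):
    """Estimated operations per second for an n-by-n dense multiply.

    Assumes the conventional 2 * n**3 operation count (n**3 multiplies
    plus n**3 adds); this is an assumption, not taken from the page.
    """
    return 2 * n ** 3 / wall_seconds


# (matrix size n, wall seconds, reported rate in operations per second),
# copied from the removed tables above.
rows = [
    (5_000, 30, 8_300e6),       # PESSL PDGEMM, 16 processors
    (50_000, 1_529, 163.6e9),   # ScaLAPACK PDGEMM, 256 proc
    (100_000, 11_141, 179.6e9), # PESSL PDGEMM, 256 proc, 128 block
    (100_000, 9_495, 210.7e9),  # ScaLAPACK PDGEMM, 256 proc, 128 block
]

for n, secs, reported in rows:
    estimated = dense_matmul_rate(n, secs)
    print(f"n={n:>7,}  wall={secs:>6,} s  "
          f"estimated={estimated / 1e9:7.1f} G/s  "
          f"reported={reported / 1e9:7.1f} G/s")
```

The estimates land within about one percent of the reported values for these rows, which suggests the wall times in the tables are rounded. The same function can be applied to the Hama rows, keeping in mind that those timings include data load and export.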
