Hello,
I just wanted to start a discussion about two topics: BSP on MPI, and BSP for shared-memory architectures. On the first, GD has already shown a 10x performance improvement for Apache Hama applications using MPI + InfiniBand without any changes.
The second topic is BSP for shared-memory architectures. As you know, some algorithms generate a lot of communication and many all-reduce operations among processors. To address this problem, we might want to add a new BSP package for shared memory. What do you think?
--
Best Regards,
Edward J. Yoon
