thanks. - erik
> BSP is essentially all about "the inner loop". In this loop, you do
> the work, and, at the bottom of the loop, you tell everyone what you
> have done.
>
> So you are either computing or communicating. Which means, on your
> $100M computer, that you are using about $50M of it over time. Which
> is undesirable.
>
> Nowadays, people work fairly hard to ensure that while computation is
> happening, the network is busy moving data.
>
> This problem with BSP is well known, which is why some folks have
> tried to time-share the nodes in the following
> way (www.ccs3.lanl.gov/pal/publications/papers/petrini01:feng.pdf):
> have N jobs (N usually 2). While N-1 jobs are using the network, and
> hence not computing, have 1 job computing. Of course, matching this
> all up is hard, and most compute jobs typically are sized to use all
> of memory, so this approach has not been used much. The nodes on the
> big machines are typically not shared between jobs.
>
> BSP was an interesting idea but is not commonly used any more, at
> least on the systems I know about. Rather, people work hard to overlap
> communication and computation.
>
> ron
> p.s. for more recent work see: www.cs.unm.edu/~fastos/06meeting/sft.pdf
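
To put rough numbers on the utilization argument quoted above, here is a minimal sketch in Python. It is a toy cost model, not anything from ron's message: it assumes every step costs the same fixed compute time and the same fixed communication time, and, for the overlapped case, that data dependencies allow one step's results to be in flight while the next step computes. The function names are made up for illustration.

```python
def bsp_time(steps, t_compute, t_comm):
    """Total time when each step computes, then communicates, with no overlap."""
    return steps * (t_compute + t_comm)


def overlapped_time(steps, t_compute, t_comm):
    """Total time when step i's communication runs during step i+1's compute.

    Assumption: an idealized two-stage pipeline. The first compute and the
    last communication cannot be hidden; every other step costs whichever
    of the two phases is longer.
    """
    if steps == 0:
        return 0.0
    return t_compute + (steps - 1) * max(t_compute, t_comm) + t_comm


def cpu_utilization(steps, t_compute, total_time):
    """Fraction of wall-clock time the processors spend computing."""
    return steps * t_compute / total_time


if __name__ == "__main__":
    steps, t_compute, t_comm = 100, 1.0, 1.0
    for name, model in (("bsp", bsp_time), ("overlapped", overlapped_time)):
        total = model(steps, t_compute, t_comm)
        util = cpu_utilization(steps, t_compute, total)
        print(f"{name}: total={total:.1f} utilization={util:.3f}")
```

With equal compute and communication times, the non-overlapped loop keeps the processors busy exactly half the time, which is the "$50M of your $100M computer" figure. In this model the overlapped schedule approaches full use as the number of steps grows. Real codes fall somewhere in between, depending on how much of the communication can actually be hidden.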
