I agree with Devon that analysis is essential when discussing performance issues. Here are three examples where analysis (or the lack of it) was critical.
1. When floating point vector processors were first made available for the IBM system running APL that I supported, there was an immediate clamor to acquire one for APL. The assumption was that, as a vector-oriented language, APL should benefit immediately and show significant performance improvement with little or no reprogramming at the application level. Interest remained strong until an analysis showed that the time spent in floating point at all was 5% or less for the APL applications we were running. Even if the vector processor reduced the floating point processing time to zero, the benefit would be practically unnoticeable. (A short sketch of this arithmetic appears after the quoted thread below.)

2. As server-based systems became all the rage, there were more proposals to move applications off the mainframe systems to servers. One of these proposals had an interesting result. The team that proposed this server-based APL migration had not bothered with any performance analysis. To them, it was obvious that any server implementation must inherently be better than any mainframe implementation. The migration was done to woefully underpowered server hardware and failed completely, due to impossibly slow performance.

3. One application programmer came to me with CPU-intensive APL code that ran weekly and was not completing by the required time. I reviewed the code. It was a solid program, coded with the finest COBOL-like scalar techniques. I spent an hour or so unrolling the loops, matricizing the code, and validating the results. The final APL program ran in 1-2% of the time of the original, with exactly the same output. (A sketch of that kind of rewrite also appears after the quoted thread below.)

-- David Mitchell

On 2/15/2010 13:49, Devon McCormick wrote:
> Raul - yes - there's always been a lot of hand-waving magic about the
> benefits of parallel processing but many a pretty ship of theory has
> foundered on the cold hard rocks of fact. Until you consider a specific
> problem and do the work, you can't make any claims about the benefits.
>
> In fact, it's easy to offer a "dis-proof of concept": parallelize
>
>    1 2+3 4
>
> I bet any parallel version of this will lose to the non-parallel version -
> there's no way the overhead cost of breaking up and re-assembling the
> results of this small piece of work is less than simply doing it.
>
> We talked about this at the last NYCJUG and I'm glad to see it's still a
> pressing topic as this will motivate me to update the wiki for February's
> meeting.
>
> Regards,
>
> Devon
>
> On Mon, Feb 15, 2010 at 12:33 PM, Raul Miller<[email protected]> wrote:
>
>> On Mon, Feb 15, 2010 at 11:00 AM, bill lam<[email protected]> wrote:
>>> Apart from the startup time and memory cost, the biggest problems are
>>> the need for the programmer to tailor the algorithm for each of its
>>> applications, the synchronisation of sessions, and the memory bandwidth
>>> it takes to transfer data between sessions. OTOH the low-level solution
>>> is transparent; J programmers would not even need to be aware of its
>>> existence. Henry Rich had also suggested this approach, if memory serves.
>>
>> One problem with the low level approach seems to be that, so
>> far, no one wants to fund the investigative costs associated with
>> this change.
>>
>> To my knowledge no one has even shown a "proof of concept"
>> implementation for any primitive.
>>
>> --
>> Raul
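To put rough numbers on example 1: this is just Amdahl's law, and a two-line J sketch (the 5% figure is from the anecdote; the 10x factor is a made-up illustration, not a measurement of any actual vector unit) shows why a small floating-point fraction caps the payoff:

   NB. Amdahl's law: overall speedup when a fraction x of the run time
   NB. is accelerated by a factor y
   amdahl =: 4 : '% (1-x) + x%y'
   0.05 amdahl _     NB. 5% of time in floating point, infinitely fast FP
1.05263
   0.05 amdahl 10    NB. same 5%, with a (hypothetical) 10x FP speedup
1.04712

Either way, the whole job gets barely 5% faster, which is the "practically unnoticeable" result of the analysis.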
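And a minimal J illustration of the kind of rewrite in example 3, replacing an explicit scalar loop with a single array expression. The computation here (sum of squares) is invented for the sketch; the original was an APL production job, not this:

   NB. scalar, one-item-at-a-time style (illustrative only)
   sumsqLoop =: 3 : 0
     t =. 0
     for_item. y do. t =. t + item * item end.
     t
   )
   NB. the "matricized" equivalent: one array expression, no explicit loop
   sumsqArray =: +/@:*:
   (sumsqLoop -: sumsqArray) 1 2 3 4 5    NB. both give 55
1

The point of the anecdote is that the array form lets the interpreter do the looping in compiled primitives, which is where the 1-2% of the original run time came from.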
