I wanted to gather some analysis of parallelism for matrix multiplication. Amdahl's law essentially compares the speed of fully serial execution with the speed when some parallelism is introduced into the system.
Say I have Tcount threads used for computation on a system with NCores cores, and all matrices are NxN. If the code were completely sequential, then I guess N^3 would be the amount of work done. What happens when Tcount threads are used? Say each thread performs the calculation for a certain block of rows: with matrix dimension N and Tcount threads, each thread calculates N/Tcount rows, and each row takes N^2 work units serially.

How much time would it now take with threads? I thought it would be N^3 work serially, but with parallelism it would be (N/Tcount) * N^2 * (Tcount/NCores)... But this causes Tcount to cancel out, which looks absurd. Any suggestions?

--
You received this message because you are subscribed to the Google Groups "Algorithm Geeks" group. To post to this group, send email to [email protected]. To unsubscribe from this group, send email to [email protected]. For more options, visit this group at http://groups.google.com/group/algogeeks?hl=en.
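As a minimal sketch of the row-partitioning scheme described above (my own function names, plain Python threading; note that in CPython the GIL serializes CPU-bound threads, so this only illustrates how the N rows split into roughly N/Tcount chunks per thread, not the actual speedup):

```python
import threading

def matmul_rows(A, B, C, row_start, row_end):
    # One thread's share: rows [row_start, row_end) of C = A * B.
    # Each row costs N^2 multiply-adds, so a thread does (rows * N^2) work.
    inner = len(B)
    cols = len(B[0])
    for i in range(row_start, row_end):
        for j in range(cols):
            C[i][j] = sum(A[i][p] * B[p][j] for p in range(inner))

def parallel_matmul(A, B, tcount):
    n = len(A)
    C = [[0] * len(B[0]) for _ in range(n)]
    # Partition the N rows into Tcount contiguous chunks (last chunk may be short).
    chunk = (n + tcount - 1) // tcount
    threads = []
    for t in range(tcount):
        lo, hi = t * chunk, min((t + 1) * chunk, n)
        if lo >= hi:
            break
        th = threading.Thread(target=matmul_rows, args=(A, B, C, lo, hi))
        threads.append(th)
        th.start()
    for th in threads:
        th.join()
    return C
```

With a process- or core-based backend (e.g. multiprocessing, or threads in a GIL-free runtime), the wall-clock time of this scheme is governed by min(Tcount, NCores) rather than Tcount alone.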
