Also, as Lance mentioned, the "coefficient of performance" per core is usually lower for distributed methods than for an iterative method. It is hard (if even possible) to achieve 100% scalability here. Simply put, if you have 5 computers to solve the same problem, it will not be solved 5 times faster than with a comparable method on a single computer.
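To put rough numbers on the "5 computers" point, here is a minimal sketch using Amdahl's law. That is my own choice of illustration, not something measured on Mahout. The function names and the parallel fraction of 0.9 are hypothetical, and the model ignores the communication and I/O costs that a real cluster job also pays, so actual speedups would likely be lower still.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Idealized speedup under Amdahl's law (no communication overhead).

    parallel_fraction: share of the work that can run in parallel (0..1).
                       This value is an assumption, not a Mahout measurement.
    workers:           number of machines or cores.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def parallel_efficiency(parallel_fraction: float, workers: int) -> float:
    """Speedup per worker; 1.0 would mean perfect (100%) scalability."""
    return amdahl_speedup(parallel_fraction, workers) / workers


if __name__ == "__main__":
    p = 0.9  # hypothetical parallel fraction
    for n in (1, 5, 50):
        print(n, round(amdahl_speedup(p, n), 2), round(parallel_efficiency(p, n), 2))
```

With these assumed numbers, 5 machines give a speedup of about 3.6 rather than 5, and the per-machine efficiency keeps dropping as more machines are added.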

On Wed, Aug 1, 2012 at 11:29 AM, Dmitriy Lyubimov <[email protected]> wrote:
> I only know of comparisons between parallel algorithms. There's a
> performance and accuracy comparison between Mahout's SSVD and Lanczos
> done in the dissertation of N. Halko (see the link on the SSVD page of
> the Mahout wiki). There's also a "Heigen" SVD paper that discusses a
> distributed modified Lanczos method in a proprietary Hadoop-based
> implementation at Yahoo. Even though it doesn't draw side-by-side
> comparisons, it does present benchmark figures for the Heigen
> implementation, so one can approximately draw comparisons between
> Heigen and Mahout methods.
>
> W.r.t. parallel vs. non-parallel, IMO the bottom line is
> practicality, not necessarily speed. There are some SVD problems for
> which one might argue that a single-computer solution is not practical
> and which a distributed algorithm may actually shift into the realm of
> practical solutions (in the sense that you don't need days to solve
> them). But IMO direct comparison still doesn't make a lot of sense.
>
> On Sat, Jul 28, 2012 at 9:27 AM, mohsen jadidi <[email protected]>
> wrote:
>> Thank you for your replies. What I am interested to know is, if I want
>> to compute the SVD of a huge matrix, how much faster my computation
>> gets by using Mahout.
>>
>> On Fri, Jul 27, 2012 at 8:12 PM, Dmitriy Lyubimov <[email protected]> wrote:
>>
>>> IMO it doesn't make much sense to compare a non-parallel and a parallel
>>> algorithm (assuming they are running approximately the same flops-sized
>>> computation). Which is probably why there are not so many such
>>> comparisons (I don't know any).
>>>
>>> However, there are studies comparing parallel approaches (e.g. certain
>>> Mahout vs. Giraph methods) given the same amount of flops capacity in a
>>> cluster, but I think you need to be more specific because there are
>>> too many areas of interest you are talking about.
>>>
>>> On Fri, Jul 27, 2012 at 8:57 AM, mohsen jadidi <[email protected]>
>>> wrote:
>>> > Hey all,
>>> >
>>> > I am looking for some case studies that have evaluated some of
>>> > Mahout's algorithm implementations, such as different decompositions
>>> > or different classifiers. I just want to know how much faster Mahout
>>> > is compared to regular non-parallel algorithms. I couldn't find
>>> > anything useful.
>>> >
>>> > Thanks in advance,
>>> >
>>> > --
>>> > Mohsen Jadidi
>>>
>>
>> --
>> Mohsen Jadidi
