Re: performance study

Dmitriy Lyubimov Wed, 01 Aug 2012 11:37:23 -0700

Also, as Lance mentioned, the per-core "coefficient of performance" of
distributed methods is usually lower than that of an iterative method.
It is hard (if even possible) to achieve 100% scalability here. Simply
put, if you have 5 computers to solve the same problem, it will not be
solved 5 times faster than with a comparable method on a single computer.
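
To put rough numbers on the "5 computers" point, here is a minimal sketch
using Amdahl's law. This is a textbook model swapped in for illustration,
not anything measured on Mahout. The function names and the 10% serial
fraction are assumptions made up for the example, and the model ignores
communication and job-startup overhead, so real clusters tend to do worse.

```python
def amdahl_speedup(workers: int, serial_fraction: float) -> float:
    """Upper bound on speedup under Amdahl's law.

    serial_fraction is the share of the single-machine run time that
    cannot be parallelized (a hypothetical input, not a measured value).
    """
    if workers < 1:
        raise ValueError("workers must be >= 1")
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be in [0, 1]")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / workers)


def parallel_efficiency(workers: int, serial_fraction: float) -> float:
    """Speedup per worker; 1.0 would mean perfect (100%) scalability."""
    return amdahl_speedup(workers, serial_fraction) / workers


if __name__ == "__main__":
    serial_fraction = 0.10  # assumed for illustration only
    for workers in (1, 2, 5, 10):
        speedup = amdahl_speedup(workers, serial_fraction)
        efficiency = parallel_efficiency(workers, serial_fraction)
        print(f"{workers:2d} workers: speedup {speedup:.2f}x, "
              f"efficiency {efficiency:.0%}")
```

Under that assumed serial fraction, 5 workers give at most about a 3.6x
speedup, which is the sense in which per-core performance drops as the
cluster grows.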


On Wed, Aug 1, 2012 at 11:29 AM, Dmitriy Lyubimov <[email protected]> wrote:
> I only know of comparisons between parallel algorithms. There's a
> performance and accuracy comparison between Mahout's SSVD and Lanczos
> in the dissertation of N. Halko (see the link on the SSVD page of the
> Mahout wiki). There's also a "Heigen" SVD paper that discusses a
> distributed modified Lanczos method in a proprietary Hadoop-based
> implementation at Yahoo. Even though it doesn't draw side-by-side
> comparisons, it does present benchmark figures for the Heigen
> implementation, so one can approximately compare the Heigen and Mahout
> methods.
>
> W.r.t. parallel vs. non-parallel, IMO the bottom line is
> practicality, not necessarily speed. There are some SVD problems for
> which one might argue that a single-computer solution is not
> practical, and which a distributed algorithm may actually shift into
> the realm of practical solutions (in the sense that you don't need
> days to solve them). But IMO a direct comparison still doesn't make a
> lot of sense.
>
> On Sat, Jul 28, 2012 at 9:27 AM, mohsen jadidi <[email protected]> wrote:
>> Thank you for your replies. What I am interested to know is: if I want
>> to compute the SVD of a huge matrix, how much faster will my computation
>> get by using Mahout?
>>
>> On Fri, Jul 27, 2012 at 8:12 PM, Dmitriy Lyubimov <[email protected]> wrote:
>>
>>> IMO it doesn't make much sense to compare a non-parallel and a parallel
>>> algorithm (assuming they are running computations of approximately the
>>> same size in flops). Which is probably why there are not so many such
>>> studies (I don't know of any).
>>>
>>> However, there are studies comparing parallel approaches (e.g. certain
>>> Mahout vs. Giraph methods) given the same amount of flops capacity in a
>>> cluster, but I think you need to be more specific because there are
>>> too many areas of interest you are talking about.
>>>
>>> On Fri, Jul 27, 2012 at 8:57 AM, mohsen jadidi <[email protected]> wrote:
>>> > Hey all,
>>> >
>>> > I am looking for some case studies that have evaluated some of Mahout's
>>> > algorithm implementations, such as the different decompositions or
>>> > classifiers. I just want to know how much faster Mahout is compared to
>>> > regular non-parallelized algorithms. I couldn't find anything useful.
>>> >
>>> > Thanks in advance,
>>> >
>>> > --
>>> > Mohsen Jadidi
>>>
>>
>>
>>
>> --
>> Mohsen Jadidi
