Hi Robin,

Yes, it included your changes. In order to run the test on the Movielens 1M
dataset, you have to apply this small patch:
https://gist.github.com/sscdotopen/5416101
Furthermore, you have to download the Movielens 1M dataset here:
http://www.grouplens.org/node/73

You then have to convert the ratings.dat file, which turns lines of the form
userID::movieID::rating::timestamp into userID,movieID,rating:

cat ratings.dat | sed -e 's/::/,/g' | cut -d, -f1,2,3 > movielens.csv

Best,
Sebastian

On 18.04.2013 22:45, Robin Anil wrote:
> BTW, did this include the changes I made in the trunk recently? I would
> also like to profile that code and see if we can squeeze more out of our
> Vectors and Matrices. Could you point me to how I can run the 1M example?
>
> Robin
>
> Robin Anil | Software Engineer | +1 312 869 2602 | Google Inc.
>
>
> On Thu, Apr 18, 2013 at 3:43 PM, Robin Anil <robin.a...@gmail.com> wrote:
>
>> I was just emailing something similar on Mahout (see my email). I saw
>> the TU Berlin name and I thought you would do something about it :)
>> This is excellent. Investigating this might be one of the next-gen
>> pieces of work on our Vectors.
>>
>> Robin Anil | Software Engineer | +1 312 869 2602 | Google Inc.
>>
>>
>> On Thu, Apr 18, 2013 at 3:37 PM, Sebastian Schelter <s...@apache.org> wrote:
>>
>>> Hi there,
>>>
>>> With regard to Robin mentioning JBlas [1] recently when we talked
>>> about the performance of our vector operations: I ported the solving
>>> code for ALS to JBlas today and got some awesome results.
>>>
>>> For the Movielens 1M dataset and a factorization of rank 100, the
>>> runtime per iteration dropped from 50 seconds to less than 7 seconds.
>>> I will run some tests with the distributed version and larger
>>> datasets in the next few days, but from what I've seen we should
>>> really take a closer look at JBlas, at least for operations on dense
>>> matrices.
>>>
>>> Best,
>>> Sebastian
>>>
>>> [1] http://mikiobraun.github.io/jblas/
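For reference, below is a minimal, self-contained sketch of what the
JBlas-based solve boils down to. It is not the actual code from the patch in
the gist; the class and method names (JBlasAlsSolveSketch, solveUser) and the
lambda value in the toy example are made up for illustration, and the
lambda * n_u regularization term assumes the usual ALS-WR formulation.

import org.jblas.DoubleMatrix;
import org.jblas.Solve;

public class JBlasAlsSolveSketch {

  // Sketch of one ALS least-squares subproblem: solves
  //   (Y^T Y + lambda * n_u * I) x_u = Y^T r_u
  // where Y holds the factor vectors of the items the user rated (one per
  // row) and r_u the corresponding ratings.
  static DoubleMatrix solveUser(DoubleMatrix ratedItemFactors,
                                DoubleMatrix ratings,
                                double lambda) {
    int numRatings = ratedItemFactors.getRows();
    int rank = ratedItemFactors.getColumns();

    DoubleMatrix Yt = ratedItemFactors.transpose();

    // left-hand side: Y^T Y + lambda * n_u * I
    DoubleMatrix lhs = Yt.mmul(ratedItemFactors)
        .addi(DoubleMatrix.eye(rank).muli(lambda * numRatings));

    // right-hand side: Y^T r_u
    DoubleMatrix rhs = Yt.mmul(ratings);

    // lhs is symmetric positive definite for lambda > 0, so the
    // Cholesky-based solver applies; Solve.solve(lhs, rhs) works as well
    return Solve.solvePositive(lhs, rhs);
  }

  public static void main(String[] args) {
    // toy example: one user with 3 rated items, rank-2 factorization
    DoubleMatrix ratedItemFactors = new DoubleMatrix(new double[][] {
        { 0.1, 0.7 },
        { 0.4, 0.2 },
        { 0.9, 0.3 }
    });
    DoubleMatrix ratings = new DoubleMatrix(new double[] { 5.0, 3.0, 4.0 });

    System.out.println(solveUser(ratedItemFactors, ratings, 0.065));
  }
}

The same per-user pattern applies symmetrically when recomputing the item
factors from the fixed user factors.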