Hi Sean,

I simply used the Solve.solve() method; I guess it uses a QR decomposition internally. I can provide a copy of the code if you want to have a look.
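[Editor's note: Sebastian offers to share his code above; what follows is not that code, but a minimal, hypothetical sketch of the kind of dense solve being discussed. With jblas the whole step is one call, `Solve.solve(A, b)`, which (as far as I recall) delegates to LAPACK's gesv, an LU-based solver rather than QR. The plain-Java Gaussian elimination below stands in for that native routine purely for illustration.]

```java
// Minimal dense Ax = b solver: Gaussian elimination with partial
// pivoting. A stand-in for the LAPACK routine that jblas's
// Solve.solve() calls; assumes A is square and nonsingular.
public class DenseSolve {

    static double[] solve(double[][] a, double[] b) {
        int n = b.length;
        // Work on copies so the caller's data is untouched.
        double[][] m = new double[n][];
        for (int i = 0; i < n; i++) m[i] = a[i].clone();
        double[] x = b.clone();

        for (int col = 0; col < n; col++) {
            // Partial pivoting: pick the largest remaining entry in this column.
            int pivot = col;
            for (int row = col + 1; row < n; row++)
                if (Math.abs(m[row][col]) > Math.abs(m[pivot][col])) pivot = row;
            double[] tmpRow = m[col]; m[col] = m[pivot]; m[pivot] = tmpRow;
            double tmp = x[col]; x[col] = x[pivot]; x[pivot] = tmp;

            // Eliminate entries below the pivot.
            for (int row = col + 1; row < n; row++) {
                double f = m[row][col] / m[col][col];
                for (int k = col; k < n; k++) m[row][k] -= f * m[col][k];
                x[row] -= f * x[col];
            }
        }

        // Back-substitution on the upper-triangular system.
        for (int row = n - 1; row >= 0; row--) {
            for (int k = row + 1; k < n; k++) x[row] -= m[row][k] * x[k];
            x[row] /= m[row][row];
        }
        return x;
    }

    public static void main(String[] args) {
        // 4x + y = 1, x + 3y = 2  -->  solution approximately (1/11, 7/11)
        double[] x = solve(new double[][] {{4, 1}, {1, 3}}, new double[] {1, 2});
        System.out.println(x[0] + " " + x[1]);
    }
}
```

In the ALS inner loop this solve is performed once per user (and once per item) per iteration, which is why handing the small dense system to an optimized native LAPACK pays off so visibly.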
Best,
Sebastian

On 18.04.2013 22:56, Sean Owen wrote:
> I'm always interested in optimizing the bit where you solve Ax = b, which
> I so recently went on about. That's a dense-matrix problem. Is there a
> QR decomposition available?
>
> I tried getting this part to run on a GPU, and it worked, but wasn't
> faster. Somehow it was still slower to push the smallish dense matrix
> onto the card so many times per second. The same issue is identified here,
> so I'm interested to hear whether the direct buffer approach makes this
> a win.
>
> On Thu, Apr 18, 2013 at 9:51 PM, Dmitriy Lyubimov <dlie...@gmail.com> wrote:
>> I looked at jblas some time ago, a year or two back.
>>
>> It's a fast bridge to LAPACK, and LAPACK is very hard to beat. But I
>> think I convinced myself it lacks support for sparse matrices. It will
>> still work nicely, though, for many blockified algorithms such as ALS-WR,
>> which try to avoid doing BLAS level 3 operations on sparse data.
>>
>> On Thu, Apr 18, 2013 at 1:45 PM, Robin Anil <robin.a...@gmail.com> wrote:
>>
>>> BTW, did this include the changes I made in trunk recently? I would also
>>> like to profile that code and see if we can squeeze more out of our
>>> Vectors and Matrices. Could you point me to how I can run the 1M example?
>>>
>>> Robin
>>>
>>> Robin Anil | Software Engineer | +1 312 869 2602 | Google Inc.
>>>
>>> On Thu, Apr 18, 2013 at 3:43 PM, Robin Anil <robin.a...@gmail.com> wrote:
>>>
>>>> I was just emailing something similar on Mahout (see my email). I saw
>>>> the TU Berlin name and thought you would do something about it :) This
>>>> is excellent. Investigating this is maybe one of the next-generation
>>>> pieces of work on our Vectors.
>>>>
>>>> Robin Anil | Software Engineer | +1 312 869 2602 | Google Inc.
>>>>
>>>> On Thu, Apr 18, 2013 at 3:37 PM, Sebastian Schelter <s...@apache.org> wrote:
>>>>
>>>>> Hi there,
>>>>>
>>>>> With regard to Robin mentioning JBlas [1] recently when we talked about
>>>>> the performance of our vector operations, I ported the solving code for
>>>>> ALS to JBlas today and got some awesome results.
>>>>>
>>>>> For the MovieLens 1M dataset and a factorization of rank 100, the
>>>>> runtime per iteration dropped from 50 seconds to less than 7 seconds. I
>>>>> will run some tests with the distributed version and larger datasets in
>>>>> the next few days, but from what I've seen, we should really take a
>>>>> closer look at JBlas, at least for operations on dense matrices.
>>>>>
>>>>> Best,
>>>>> Sebastian
>>>>>
>>>>> [1] http://mikiobraun.github.io/jblas/
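[Editor's note: on the "direct buffer approach" Sean mentions above, here is a minimal sketch of the underlying idea, under the assumption that the point is to keep matrix data in off-heap memory that native code (LAPACK via JNI, or a GPU driver) can read without an extra copy. The class and values are illustrative only.]

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.DoubleBuffer;

// A direct ByteBuffer lives outside the JVM heap, so native code can
// access it in place instead of the JNI layer copying a Java array on
// every call -- the per-call transfer cost Sean describes.
public class DirectBufferDemo {
    public static void main(String[] args) {
        int n = 4;
        // Allocate off-heap storage for n doubles, in the platform's
        // native byte order (what a native BLAS/LAPACK would expect).
        ByteBuffer raw = ByteBuffer.allocateDirect(n * Double.BYTES)
                                   .order(ByteOrder.nativeOrder());
        DoubleBuffer buf = raw.asDoubleBuffer();

        // Fill the buffer; native code handed this buffer would see the
        // same memory without any copying.
        for (int i = 0; i < n; i++) buf.put(i, i * 1.5);

        System.out.println(buf.isDirect() + " " + buf.get(2));
    }
}
```

The trade-off is that direct buffers are more expensive to allocate and are not garbage-collected as promptly as heap arrays, so they pay off when one buffer is reused across many native calls, exactly the "many times per second" pattern in the ALS inner loop.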