Re: Performance of ALS

Sebastian Schelter Thu, 18 Apr 2013 14:13:04 -0700

Let us know the results! :)

I think in the case of ALS we can even use Solve.solveSymmetric(), since
the matrix we solve against is symmetric.
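
To illustrate why symmetry helps: assuming the usual regularized ALS update, each factor vector x is the solution of (Y^T Y + lambda * I) x = Y^T r, and that matrix is symmetric positive definite for lambda > 0. The sketch below solves such a system with a Cholesky factorization in plain Java. It is not jblas code and makes no claim about what Solve.solveSymmetric() does internally; the class name SymmetricSolveSketch and the method solveSpd are hypothetical names chosen for this example.

```java
import java.util.Arrays;

// Hypothetical illustration class; not part of jblas or Mahout.
final class SymmetricSolveSketch {

    // Solves A x = b for a symmetric positive definite matrix A
    // using the Cholesky factorization A = L * L^T.
    // Only the lower triangle of A is read.
    static double[] solveSpd(double[][] a, double[] b) {
        int n = b.length;
        double[][] l = new double[n][n];

        // Build the lower-triangular factor L row by row.
        for (int i = 0; i < n; i++) {
            for (int j = 0; j <= i; j++) {
                double sum = a[i][j];
                for (int k = 0; k < j; k++) {
                    sum -= l[i][k] * l[j][k];
                }
                if (i == j) {
                    if (sum <= 0.0) {
                        throw new IllegalArgumentException("matrix is not positive definite");
                    }
                    l[i][i] = Math.sqrt(sum);
                } else {
                    l[i][j] = sum / l[j][j];
                }
            }
        }

        // Forward substitution: solve L y = b.
        double[] y = new double[n];
        for (int i = 0; i < n; i++) {
            double sum = b[i];
            for (int k = 0; k < i; k++) {
                sum -= l[i][k] * y[k];
            }
            y[i] = sum / l[i][i];
        }

        // Back substitution: solve L^T x = y.
        double[] x = new double[n];
        for (int i = n - 1; i >= 0; i--) {
            double sum = y[i];
            for (int k = i + 1; k < n; k++) {
                sum -= l[k][i] * x[k];
            }
            x[i] = sum / l[i][i];
        }
        return x;
    }

    public static void main(String[] args) {
        // Small made-up symmetric positive definite system, for demonstration only.
        double[][] a = {{4.0, 2.0}, {2.0, 3.0}};
        double[] b = {10.0, 8.0};
        System.out.println(Arrays.toString(solveSpd(a, b)));
    }
}
```

A symmetric solver only needs one triangle of the matrix and skips the pivoting and orthogonalization work of a general-purpose or least-squares solver, which is why it is a natural candidate for the small dense systems ALS solves many times per iteration.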


Best,
Sebastian

On 18.04.2013 23:07, Sean Owen wrote:
> Good lead -- from
> https://github.com/mikiobraun/jblas/blob/master/src/main/java/org/jblas/Solve.java
> it looks like it's an SVD. Definitely took a search to figure out what
> 'gelsd' does in LAPACK! I'll see if I can test-drive this too to see
> if it bumps performance. That would be great, JNI is a much smaller
> requirement than a GPU!
> 
> On Thu, Apr 18, 2013 at 10:01 PM, Sebastian Schelter <s...@apache.org> wrote:
>> Hi Sean,
>>
>> I simply used the Solve.solve() method, I guess it uses a QR
>> decomposition internally. I can provide a copy of the code if you want
>> to have a look.
>>
>> Best,
>> Sebastian
>>
>> On 18.04.2013 22:56, Sean Owen wrote:
>>> I'm always interested in optimizing the bit where you solve Ax=B which
>>> I so recently went on about. That's a dense-matrix problem. Is there a
>>> QR decomposition available?
>>>
>>> I tried getting this part to run on a GPU, and it worked, but wasn't
>>> faster. Still somehow it was slower to push the smallish dense matrix
>>> onto the card so many times per second. Same issue is identified here
>>> so I'm interested to hear if this is a win by using the direct buffer
>>> approach.
>>>
>>> On Thu, Apr 18, 2013 at 9:51 PM, Dmitriy Lyubimov <dlie...@gmail.com> wrote:
>>>> I looked at jblas a year or two ago.
>>>>
>>>> It's a fast bridge to LAPACK, and LAPACK is very hard to beat. But I
>>>> think I convinced myself that it lacks support for sparse data. It will
>>>> still work nicely for many blockified algorithms such as ALS-WR, which try
>>>> to avoid doing BLAS level 3 operations on sparse data.
>>>>
>>>>
>>>> On Thu, Apr 18, 2013 at 1:45 PM, Robin Anil <robin.a...@gmail.com> wrote:
>>>>
>>>>> BTW, did this include the changes I made in the trunk recently? I would
>>>>> also like to profile that code and see if we can squeeze more out of our
>>>>> Vectors and Matrices. Could you point me to how I can run the 1M example?
>>>>>
>>>>> Robin
>>>>>
>>>>> Robin Anil | Software Engineer | +1 312 869 2602 | Google Inc.
>>>>>
>>>>>
>>>>> On Thu, Apr 18, 2013 at 3:43 PM, Robin Anil <robin.a...@gmail.com> wrote:
>>>>>
>>>>>> I was just emailing something similar on Mahout (see my email). I saw the
>>>>>> TU Berlin name and I thought you would do something about it :) This is
>>>>>> excellent. Investigating this could be part of the next-gen work on
>>>>>> Vectors.
>>>>>>
>>>>>>
>>>>>> Robin Anil | Software Engineer | +1 312 869 2602 | Google Inc.
>>>>>>
>>>>>>
>>>>>> On Thu, Apr 18, 2013 at 3:37 PM, Sebastian Schelter <s...@apache.org> wrote:
>>>>>>
>>>>>>> Hi there,
>>>>>>>
>>>>>>> with regard to Robin mentioning JBlas [1] recently when we talked about
>>>>>>> the performance of our vector operations, I ported the solving code for
>>>>>>> ALS to JBlas today and got some awesome results.
>>>>>>>
>>>>>>> For the MovieLens 1M dataset and a factorization of rank 100, the
>>>>>>> runtime per iteration dropped from 50 seconds to less than 7 seconds. I
>>>>>>> will run some tests with the distributed version and larger datasets in
>>>>>>> the next few days, but from what I've seen we should really take a closer
>>>>>>> look at JBlas, at least for operations on dense matrices.
>>>>>>>
>>>>>>> Best,
>>>>>>> Sebastian
>>>>>>>
>>>>>>> [1] http://mikiobraun.github.io/jblas/
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>
