It's done in one iteration.
This is the R from QR factorization:
5.0663   5.8122   4.9704   4.3987   6.3400   4.5970   5.0334   4.2581   3.3808   5.3250
     0   2.4036   1.1722   2.3296   1.6580   0.4575   1.1706   2.1040   1.6738   1.4839
     0        0   1.5085   0.0966   1.2581   0.5236   0.4712  -0.0411   0.3143   0.5957
     0        0        0   1.8682   0.1834  -0.3244  -0.0073   0.3817   1.1673   0.4783
     0        0        0        0   1.9569   0.8666   0.3201  -0.4167   0.0732   0.3114
     0        0        0        0        0   1.3520   0.2326  -0.1156  -0.2793   0.0103
     0        0        0        0        0        0   1.1689   0.3151   0.0590   0.0435
     0        0        0        0        0        0        0   1.6296  -0.3494  -0.0024
     0        0        0        0        0        0        0        0   1.4307   0.1803
     0        0        0        0        0        0        0        0        0   1.1404
On Thu, Apr 4, 2013 at 3:46 PM, Koobas <[email protected]> wrote:
> Sorry, the image was off.
> This is more like it:
>
> [image: Inline image 1]
>
>
> On Thu, Apr 4, 2013 at 3:38 PM, Koobas <[email protected]> wrote:
>
>> k was 10
>>
>>
>> On Thu, Apr 4, 2013 at 3:37 PM, Koobas <[email protected]> wrote:
>>
>>> No major problems:
>>>
>>> A =
>>>
>>> 1 4 3 0 0
>>> 0 0 3 0 0
>>> 0 4 0 3 2
>>> 5 0 2 0 3
>>> 0 0 0 5 0
>>> 2 4 0 0 0
>>>
>>>
>>> XY =
>>>
>>> 0.9605 3.4671 2.4266 0.0542 0.1976
>>> 0.1638 0.0900 2.2520 -0.0241 0.0057
>>> 0.3024 3.4562 0.0997 2.6573 1.3113
>>> 4.2061 0.1756 1.7800 0.0045 2.4648
>>> -0.1467 0.1345 -0.0401 4.0637 0.2787
>>> 1.5208 3.3953 0.2292 0.0489 0.3508
>>>
>>>
>>> RMSE =
>>>
>>> 0.4084
>>>
>>> [image: Inline image 1]
>>>
>>>
>>> On Thu, Apr 4, 2013 at 3:04 PM, Koobas <[email protected]> wrote:
>>>
>>>>
>>>>
>>>>
>>>> On Thu, Apr 4, 2013 at 2:23 PM, Sean Owen <[email protected]> wrote:
>>>>
>>>>> Does it complete without problems? It may complete without error but
>>>>> the result may be garbage. The matrix that's inverted is not going to
>>>>> be singular due to round-off. Even if it's not you may find that the
>>>>> resulting vectors are infinite or very large. In particular I at least
>>>>> had to make the singularity threshold a lot larger than
>>>>> Double.MIN_VALUE in the QR decomposition.
>>>>>
>>>> No, not at all, it completes very nicely.
>>>> I print the residual and also eyeball a plot of the matrix.
>>>> It does the job.
>>>>
>>>>
>>>>> Try some simple dummy data like below, with maybe k=10. If it
>>>>> completes with an error, that's a problem!
>>>>>
>>>>
>>>> Okay, let me try it....
>>>>
>>>>>
>>>>> 0,0,1
>>>>> 0,1,4
>>>>> 0,2,3
>>>>> 1,2,3
>>>>> 2,1,4
>>>>> 2,3,3
>>>>> 2,4,2
>>>>> 3,0,5
>>>>> 3,2,2
>>>>> 3,4,3
>>>>> 4,3,5
>>>>> 5,0,2
>>>>> 5,1,4
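For reference, the triples above are (user, item, rating), and loading them gives the 6 x 5 matrix A shown earlier in the thread. A minimal sketch (my own, not Mahout's or the Matlab loader):

```python
import numpy as np

# Read the user,item,rating triples from the message into a dense
# ratings matrix; zeros stand for unobserved entries.
triples = [
    (0, 0, 1), (0, 1, 4), (0, 2, 3), (1, 2, 3), (2, 1, 4), (2, 3, 3),
    (2, 4, 2), (3, 0, 5), (3, 2, 2), (3, 4, 3), (4, 3, 5), (5, 0, 2),
    (5, 1, 4),
]
n_users = 1 + max(u for u, _, _ in triples)
n_items = 1 + max(i for _, i, _ in triples)
A = np.zeros((n_users, n_items))
for u, i, r in triples:
    A[u, i] = r
```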
>>>>>
>>>>> On Thu, Apr 4, 2013 at 7:05 PM, Koobas <[email protected]> wrote:
>>>>> > I took MovieLens 100K data without ratings and ran non-weighted ALS
>>>>> > in Matlab.
>>>>> > I set the number of features k=2000, which is larger than the input
>>>>> > matrix (1000 x 1700).
>>>>> > I used QR to do the inversion.
>>>>> > It runs without problems.
>>>>> > Can you share your data?
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Thu, Apr 4, 2013 at 1:10 PM, Koobas <[email protected]> wrote:
>>>>> >
>>>>> >> Just to throw in another bit,
>>>>> >> just like Ted was saying:
>>>>> >> if you take the largest singular value over the smallest singular
>>>>> >> value, you get your condition number.
>>>>> >> If it turns out to be 10^16, then you're losing all the digits of
>>>>> >> double precision accuracy,
>>>>> >> meaning that your solver is nothing more than a random number
>>>>> >> generator.
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >> On Thu, Apr 4, 2013 at 12:21 PM, Dan Filimon <[email protected]> wrote:
>>>>> >>
>>>>> >>> For what it's worth, here's what I remember from my Numerical
>>>>> >>> Analysis course.
>>>>> >>>
>>>>> >>> The thing we were taught to use to figure out whether a matrix is
>>>>> >>> ill-conditioned is its condition number (k(A) = norm(A) * norm(A^-1)).
>>>>> >>> Here's a nice explanation of it [1].
>>>>> >>>
>>>>> >>> Suppose you want to solve Ax = b. How much worse a result will you
>>>>> >>> get using A if you're not really solving Ax = b but
>>>>> >>> A(x + delta) = b + epsilon (x is still a solution for Ax = b)?
>>>>> >>> So, by perturbing the b vector by epsilon, how much worse is delta
>>>>> >>> going to be? There's a short proof [1, page 4], but the inequality
>>>>> >>> you get is:
>>>>> >>>
>>>>> >>> norm(delta) / norm(x) <= k(A) * norm(epsilon) / norm(b)
>>>>> >>>
>>>>> >>> The rule of thumb is that if m = log10(k(A)), you lose m digits of
>>>>> >>> accuracy. So, equivalently, if m' = log2(k(A)), you lose m' bits of
>>>>> >>> accuracy.
>>>>> >>> Since floats are 32 bits, you can decide that, say, at most 2 bits
>>>>> >>> may be lost, therefore any k(A) > 4 is not acceptable.
>>>>> >>>
>>>>> >>> Anyway, there are lots of possible norms, and you need to look at
>>>>> >>> ways of actually interpreting the condition number, but from what I
>>>>> >>> learned this is probably the thing you want to be looking at.
>>>>> >>>
>>>>> >>> Good luck!
>>>>> >>>
>>>>> >>> [1] http://www.math.ufl.edu/~kees/ConditionNumber.pdf
>>>>> >>> [2] http://www.rejonesconsulting.com/CS210_lect07.pdf
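The digits-lost rule of thumb quoted above can be checked numerically; here is a minimal NumPy sketch (my own toy matrix, not data from this thread):

```python
import numpy as np

# A deliberately near-singular 2x2 matrix to illustrate the rule of
# thumb m = log10(k(A)): roughly m decimal digits of accuracy are lost
# when solving Ax = b.
A = np.array([[1.0, 1.0],
              [1.0, 1.0 + 1e-8]])

k_A = np.linalg.cond(A)        # ratio of largest to smallest singular value
digits_lost = np.log10(k_A)    # rule-of-thumb digits of accuracy lost
```

With a perturbation of about 1e-8 in the second row, `k_A` lands around 10^8, so roughly 8 of the ~16 significant decimal digits of a double are gone.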
>>>>> >>>
>>>>> >>>
>>>>> >>> On Thu, Apr 4, 2013 at 5:26 PM, Sean Owen <[email protected]> wrote:
>>>>> >>>
>>>>> >>> > I think that's what I'm saying, yes. Small rows of X shouldn't
>>>>> >>> > become large rows of A -- and similarly small changes in X
>>>>> >>> > shouldn't mean large changes in A. Not quite the same thing, but
>>>>> >>> > both are relevant. I see that this is just the ratio of the
>>>>> >>> > largest and smallest singular values. Is there an established
>>>>> >>> > procedure for evaluating the ill-conditionedness of matrices --
>>>>> >>> > like a principled choice of threshold above which you say it's
>>>>> >>> > ill-conditioned, based on k, etc.?
>>>>> >>> >
>>>>> >>> > On Thu, Apr 4, 2013 at 3:19 PM, Koobas <[email protected]> wrote:
>>>>> >>> > > So, the problem is that the kxk matrix is ill-conditioned, or
>>>>> >>> > > is there more to it?
>>>>> >>> > >
>>>>> >>> >
>>>>> >>>
>>>>> >>
>>>>> >>
>>>>>
>>>>
>>>>
>>>
>>
>