Sorry, the image was off.
This is more like it:

[image: Inline image 1]


On Thu, Apr 4, 2013 at 3:38 PM, Koobas <[email protected]> wrote:

> k was 10
>
>
> On Thu, Apr 4, 2013 at 3:37 PM, Koobas <[email protected]> wrote:
>
>> No major problems:
>>
>> A =
>>
>>      1     4     3     0     0
>>      0     0     3     0     0
>>      0     4     0     3     2
>>      5     0     2     0     3
>>      0     0     0     5     0
>>      2     4     0     0     0
>>
>>
>> XY =
>>
>>     0.9605    3.4671    2.4266    0.0542    0.1976
>>     0.1638    0.0900    2.2520   -0.0241    0.0057
>>     0.3024    3.4562    0.0997    2.6573    1.3113
>>     4.2061    0.1756    1.7800    0.0045    2.4648
>>    -0.1467    0.1345   -0.0401    4.0637    0.2787
>>     1.5208    3.3953    0.2292    0.0489    0.3508
>>
>>
>> RMSE =
>>
>>     0.4084
>>
>> [image: Inline image 1]
>>
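The run above can be sketched as a minimal non-weighted ALS in NumPy (illustrative only: `lstsq` stands in for the QR-based solve from the thread, and the seed and iteration count are arbitrary):

```python
import numpy as np

# The dummy ratings from the thread as (row, col, value) triples.
triples = [(0, 0, 1), (0, 1, 4), (0, 2, 3), (1, 2, 3), (2, 1, 4),
           (2, 3, 3), (2, 4, 2), (3, 0, 5), (3, 2, 2), (3, 4, 3),
           (4, 3, 5), (5, 0, 2), (5, 1, 4)]
A = np.zeros((6, 5))
for i, j, v in triples:
    A[i, j] = v

k = 10                                   # more features than either dimension
rng = np.random.default_rng(0)
X = rng.standard_normal((6, k))
Y = rng.standard_normal((k, 5))

for _ in range(20):
    # Least-squares solves tolerate the rank-deficient k x k system
    # that an explicit inverse would choke on.
    X = np.linalg.lstsq(Y.T, A.T, rcond=None)[0].T   # fix Y, solve for X
    Y = np.linalg.lstsq(X, A, rcond=None)[0]         # fix X, solve for Y

RMSE = np.sqrt(np.mean((A - X @ Y) ** 2))
```

With k larger than both dimensions and no regularization, the dense 6x5 matrix is fit essentially exactly, so this sketch's RMSE lands near zero; the 0.4084 above is from Koobas's own Matlab run, whose exact setup isn't shown.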
>>
>> On Thu, Apr 4, 2013 at 3:04 PM, Koobas <[email protected]> wrote:
>>
>>>
>>>
>>>
>>> On Thu, Apr 4, 2013 at 2:23 PM, Sean Owen <[email protected]> wrote:
>>>
>>>> Does it complete without problems? It may complete without error but
>>>> the result may be garbage. The matrix that's inverted is not going to
>>>> be exactly singular, due to round-off. Even if it's not, you may find
>>>> that the resulting vectors are infinite or very large. In particular,
>>>> I at least had to make the singularity threshold a lot larger than
>>>> Double.MIN_VALUE in the QR decomposition.
>>>>
>>> No, not at all, it completes very nicely.
>>> I print the residual and also eyeball the plot of the matrix.
>>> It does the job.
>>>
>>>
>>>> Try some simple dummy data like below, with maybe k=10. If it
>>>> completes without error, that's a problem!
>>>>
>>>
>>> Okay, let me try it....
>>>
>>>>
>>>> 0,0,1
>>>> 0,1,4
>>>> 0,2,3
>>>> 1,2,3
>>>> 2,1,4
>>>> 2,3,3
>>>> 2,4,2
>>>> 3,0,5
>>>> 3,2,2
>>>> 3,4,3
>>>> 4,3,5
>>>> 5,0,2
>>>> 5,1,4
>>>>
>>>> On Thu, Apr 4, 2013 at 7:05 PM, Koobas <[email protected]> wrote:
>>>> > I took the MovieLens 100K data without ratings and ran non-weighted
>>>> > ALS in Matlab.
>>>> > I set the number of features to k=2000, which is larger than the
>>>> > input matrix (1000 x 1700).
>>>> > I used QR to do the inversion.
>>>> > It runs without problems.
>>>> > Can you share your data?
>>>> >
>>>> >
>>>> >
>>>> > On Thu, Apr 4, 2013 at 1:10 PM, Koobas <[email protected]> wrote:
>>>> >
>>>> >> Just to throw in another bit, just like Ted was saying:
>>>> >> if you take the largest singular value over the smallest singular
>>>> >> value, you get your condition number.
>>>> >> If it turns out to be 10^16, then you're losing all the digits of
>>>> >> double precision accuracy, meaning that your solver is nothing more
>>>> >> than a random number generator.
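That σmax/σmin definition is easy to check numerically. A tiny NumPy sketch (the matrix here is an arbitrary illustration, not data from the thread):

```python
import numpy as np

# Condition number as the ratio of the largest to the smallest singular value.
M = np.array([[1.0, 2.0],
              [3.0, 4.0]])
s = np.linalg.svd(M, compute_uv=False)   # singular values, descending
kappa = s[0] / s[-1]
# In the 2-norm this equals norm(M) * norm(M^-1), i.e. np.linalg.cond(M).
# A kappa near 1e16 would wipe out all ~16 decimal digits of a double.
```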
>>>> >>
>>>> >>
>>>> >>
>>>> >>
>>>> >> On Thu, Apr 4, 2013 at 12:21 PM, Dan Filimon <[email protected]> wrote:
>>>> >>
>>>> >>> For what it's worth, here's what I remember from my Numerical
>>>> >>> Analysis course.
>>>> >>>
>>>> >>> The thing we were taught to use to figure out whether a matrix is
>>>> >>> ill-conditioned is its condition number (k(A) = norm(A) *
>>>> >>> norm(A^-1)). Here's a nice explanation of it [1].
>>>> >>>
>>>> >>> Suppose you want to solve Ax = b. How much worse a result will you
>>>> >>> get using A if you're not really solving Ax = b but A(x + delta) =
>>>> >>> b + epsilon (x is still a solution for Ax = b)?
>>>> >>> So, by perturbing the b vector by epsilon, how much worse is delta
>>>> >>> going to be? There's a short proof [1, page 4] but the inequality
>>>> >>> you get is:
>>>> >>>
>>>> >>> norm(delta) / norm(x) <= k(A) * norm(epsilon) / norm(b)
>>>> >>>
>>>> >>> The rule of thumb is that if m = log10(k(A)), you lose m digits of
>>>> >>> accuracy. So, equivalently, if m' = log2(k(A)), you lose m' bits of
>>>> >>> accuracy.
>>>> >>> Since floats are 32 bits, you can decide that, say, at most 2 bits
>>>> >>> may be lost, therefore any k(A) > 4 is not acceptable.
>>>> >>>
>>>> >>> Anyway, there are lots of possible norms and you need to look at
>>>> >>> ways of actually interpreting the condition number, but from what I
>>>> >>> learned this is probably the thing you want to be looking at.
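Dan's inequality can be demonstrated on a classically ill-conditioned system. A sketch using a 6x6 Hilbert matrix (the matrix choice and perturbation size are illustrative, not from the thread):

```python
import numpy as np

# Verify norm(delta)/norm(x) <= k(A) * norm(epsilon)/norm(b) numerically.
n = 6
A = np.array([[1.0 / (i + j + 1) for j in range(n)] for i in range(n)])  # Hilbert
x = np.ones(n)
b = A @ x
epsilon = 1e-10 * np.ones(n)            # small perturbation of the right-hand side
delta = np.linalg.solve(A, b + epsilon) - x

lhs = np.linalg.norm(delta) / np.linalg.norm(x)
rhs = np.linalg.cond(A) * np.linalg.norm(epsilon) / np.linalg.norm(b)
# lhs <= rhs; with k(A) ~ 1.5e7 here, roughly 7 decimal digits are at risk.
```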
>>>> >>>
>>>> >>> Good luck!
>>>> >>>
>>>> >>> [1] http://www.math.ufl.edu/~kees/ConditionNumber.pdf
>>>> >>> [2] http://www.rejonesconsulting.com/CS210_lect07.pdf
>>>> >>>
>>>> >>>
>>>> >>> On Thu, Apr 4, 2013 at 5:26 PM, Sean Owen <[email protected]> wrote:
>>>> >>>
>>>> >>> > I think that's what I'm saying, yes. Small rows of X shouldn't
>>>> >>> > become large rows of A -- and similarly small changes in X
>>>> >>> > shouldn't mean large changes in A. Not quite the same thing, but
>>>> >>> > both are relevant. I see that this is just the ratio of the
>>>> >>> > largest and smallest singular values. Is there an established
>>>> >>> > procedure for evaluating the ill-conditioned-ness of matrices --
>>>> >>> > like a principled choice of threshold above which you say it's
>>>> >>> > ill-conditioned, based on k, etc.?
>>>> >>> >
>>>> >>> > On Thu, Apr 4, 2013 at 3:19 PM, Koobas <[email protected]> wrote:
>>>> >>> > > So, the problem is that the kxk matrix is ill-conditioned, or
>>>> >>> > > is there more to it?
>>>> >>> > >
>>>> >>> >
>>>> >>>
>>>> >>
>>>> >>
>>>>
>>>
>>>
>>
>
