Thank you, that makes sense. Looking forward to 0.4!
On Wednesday, September 16, 2015 at 10:20:50 PM UTC-4, Simon Kornblith
wrote:
>
> \ is supposed to use pivoted QR by default, which is a standard way to solve
> least squares when X is not full rank. It gives me a LAPACKException on 0.3
> too, but it works for me on 0.4:
>
> julia> X2 = [1 2 3;
> 1 2 5
> 1 2 -8
> 2 4 23];
>
> julia> y = [2.3, 5.6, 9.2, 10.5];
>
> julia> X2\y
> 3-element Array{Float64,1}:
> 1.28112
> 2.56223
> -0.146502
>
> You can explicitly tell Julia to solve using pivoted QR, which works on
> 0.3:
>
> julia> qrfact(X2, pivot=true)\y # on Julia 0.4: qrfact(X2, Val{true})\y
> 3-element Array{Float64,1}:
> 1.28112
> 2.56223
> -0.146502
>
> Or you can solve using SVD:
>
> julia> svdfact(X2)\y
> 3-element Array{Float64,1}:
> 1.28112
> 2.56223
> -0.146502
>
> Pivoted QR should be slightly faster than SVD, but both should be much
> faster than rref.
>
> Simon
>
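[As a sanity check, the two factorizations above do solve the same least-squares problem; here is a sketch in Julia 0.4-era syntax (`qrfact`/`svdfact`; on 0.3, use `qrfact(X2, pivot=true)`) comparing their answers on the rank-deficient example:]

```julia
X2 = [1 2 3;
      1 2 5
      1 2 -8
      2 4 23]
y = [2.3, 5.6, 9.2, 10.5]

# Pivoted QR and the SVD both return the minimum-norm least-squares
# solution for this example, so the two answers should agree up to rounding.
x_qr  = qrfact(X2, Val{true}) \ y
x_svd = svdfact(X2) \ y
norm(x_qr - x_svd)   # should be near zero
```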
> On Wednesday, September 16, 2015 at 10:02:49 AM UTC-4, Cedric St-Jean
> wrote:
>>
>> Sorry, "singular" was the wrong word. I meant that the matrix does not
>> have full rank: the columns are not linearly independent.
>>
>> # Linearly independent columns
>> X1 = [1 2 3;
>> 1 2 5
>> 1 2 -8
>> 1 4 23]
>> # Linearly dependent columns (4th row changed)
>> X2 = [1 2 3;
>> 1 2 5
>> 1 2 -8
>> 2 4 23]
>> y = [2.3, 5.6, 9.2, 10.5]
>>
>> X1 \ y
>> 3-element Array{Float64,1}:
>>  -8.18265
>>   6.94133
>>  -0.394898
>>
>> X2 \ y
>> LAPACKException(2)
>>
>>
>> This is because the linear regression problem has no unique solution when
>> the matrix does not have full rank. The usual remedy is regularization
>> (e.g. ridge regression), but selecting a linearly independent subset of the
>> columns seems like a valid strategy too, if only prediction accuracy
>> matters.
>>
>> Hence: isn't that a valid non-pedagogical use of rref?
>>
>> Cédric
>>
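[On the regularization route mentioned above, a minimal ridge-regression sketch; the penalty `lambda` is an illustrative value, not something from this thread:]

```julia
X2 = [1 2 3;
      1 2 5
      1 2 -8
      2 4 23]
y = [2.3, 5.6, 9.2, 10.5]

# The ridge penalty makes the normal equations nonsingular even though
# X2 is rank deficient (its second column is twice its first).
lambda = 0.1
beta = (X2'X2 + lambda*eye(3)) \ (X2'y)
```

As `lambda` shrinks toward zero, `beta` approaches the minimum-norm least-squares solution that the SVD solve returns.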
>> On Wednesday, September 16, 2015 at 12:13:55 AM UTC-4, Steven G. Johnson
>> wrote:
>>>
>>>
>>>
>>> On Tuesday, September 15, 2015 at 10:22:47 AM UTC-4, Cedric St-Jean
>>> wrote:
>>>>
>>>> In the discussion of issue 9804
>>>> <https://github.com/JuliaLang/julia/pull/9804#issuecomment-140404574>,
>>>> the general consensus seemed to be:
>>>>
>>>> > Right. "rref" has no use in real numerical coding, but every use in
>>>> pedagogy.
>>>>
>>>>
>>>> I had a linear regression yield a PosDefError (the X matrix contained
>>>> only integers and was singular). I solved it by selecting a linearly
>>>> independent subset of columns
>>>> <http://math.stackexchange.com/questions/199125/how-to-remove-linearly-dependent-rows-cols>.
>>>>
>>>> This uses the reduced row echelon form, rref. Is there a simpler way of
>>>> this that I'm missing?
>>>>
>>>
>>> A \ x, where A is a non-square matrix, finds a least-squares solution via
>>> QR. This is normally the "right" way to do regression.
>>>
>>> (You can also compute the QR factorization of A explicitly to get a rank
>>> estimate; google "rank revealing QR" for more info on this sort of thing.)
>>>
>>
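[To make the rank-revealing-QR suggestion concrete, here is a sketch in Julia 0.4-era syntax (the tolerance is an illustrative choice) that uses pivoted QR both to estimate the numerical rank and to pick a linearly independent column subset, i.e. the job the rref approach was doing:]

```julia
X2 = [1 2 3;
      1 2 5
      1 2 -8
      2 4 23]

F = qrfact(X2, Val{true})       # pivoted QR: X2[:, F[:p]] == Q*R
R = F[:R]
tol = 1e-10 * abs(R[1, 1])      # R's diagonal decreases in magnitude
r = sum(abs(diag(R)) .> tol)    # numerical rank estimate (2 for this X2)
cols = F[:p][1:r]               # indices of an independent column subset
Xind = X2[:, cols]              # full-rank design matrix
```

`Xind \ y` then solves an ordinary full-rank least-squares problem, at the cost of dropping the pivoted-out columns.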