\ is supposed to use pivoted QR by default, which is a standard way to solve
least squares when X is not full rank. It gives me a LAPACKException on 0.3
too, but it works for me on 0.4:
julia> X2 = [1 2 3;
1 2 5
1 2 -8
2 4 23];
julia> y = [2.3, 5.6, 9.2, 10.5];
julia> X2\y
3-element Array{Float64,1}:
  1.28112
  2.56223
 -0.146502
You can explicitly tell Julia to solve using pivoted QR, which works on 0.3:
julia> qrfact(X2, pivot=true)\y # on Julia 0.4: qrfact(X2, Val{true})\y
3-element Array{Float64,1}:
  1.28112
  2.56223
 -0.146502
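You can also use the pivoted factorization to get a rough rank estimate for
X2, along the lines of the "rank revealing QR" idea: count the diagonal
entries of R that are not negligible. A sketch (the 1e-10 tolerance is an
arbitrary choice of mine, not a LAPACK default):
julia> F = qrfact(X2, pivot=true);  # on Julia 0.4: qrfact(X2, Val{true})
julia> d = abs(diag(F[:R]));        # diagonal of R, largest magnitude first
julia> sum(d .> 1e-10 * d[1])       # number of non-negligible entries
This should report 2 for this X2, since its second column is twice its first.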
Or you can solve using SVD:
julia> svdfact(X2)\y
3-element Array{Float64,1}:
  1.28112
  2.56223
 -0.146502
Pivoted QR should be slightly faster than SVD, but both should be much
faster than rref.
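If instead you want a unique regularized answer, as in the ridge regression
Cédric mentions below, one minimal sketch is to add a penalty to the normal
equations (λ here is an arbitrary choice, and this formulation can be less
numerically stable than a QR-based solve for ill-conditioned X):
julia> λ = 0.1;
julia> (X2'X2 + λ*eye(size(X2, 2))) \ (X2'y)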
Simon
On Wednesday, September 16, 2015 at 10:02:49 AM UTC-4, Cedric St-Jean wrote:
>
> Sorry, "singular" was the wrong word. I meant that the matrix does not
> have full rank: the columns are not linearly independent.
>
> # Linearly independent columns
> X1 = [1 2 3;
> 1 2 5
> 1 2 -8
> 1 4 23]
> # Linearly dependent columns (4th row changed)
> X2 = [1 2 3;
> 1 2 5
> 1 2 -8
> 2 4 23]
> y = [2.3, 5.6, 9.2, 10.5]
>
> X1 \ y
> 3-element Array{Float64,1}:
>  -8.18265
>   6.94133
>  -0.394898
>
> X2 \ y
> LAPACKException(2)
>
>
> This is because there is no unique solution to the linear regression
> problem when the matrix does not have full rank. The usual solution is
> regularization (e.g. ridge regression), but selecting a linearly independent
> subset of the columns seems like a valid strategy too, if only prediction
> accuracy matters.
>
> Hence: isn't that a valid non-pedagogical use of rref?
>
> Cédric
>
> On Wednesday, September 16, 2015 at 12:13:55 AM UTC-4, Steven G. Johnson
> wrote:
>>
>>
>>
>> On Tuesday, September 15, 2015 at 10:22:47 AM UTC-4, Cedric St-Jean wrote:
>>>
>>> In the discussion of issue 9804
>>> <https://github.com/JuliaLang/julia/pull/9804#issuecomment-140404574>,
>>> the general consensus seemed to be:
>>>
>>> > Right. "rref" has no use in real numerical coding, but every use in
>>> pedagogy.
>>>
>>>
>>> I had a linear regression yield a PosDefError (the X matrix contained
>>> only integers and was singular). I solved it by selecting a linearly
>>> independent subset of columns
>>> <http://math.stackexchange.com/questions/199125/how-to-remove-linearly-dependent-rows-cols>.
>>>
>>> This uses the reduced row echelon form, rref. Is there a simpler way of doing
>>> this that I'm missing?
>>>
>>
>> A \ x, where A is a non-square matrix, finds the least-squares solution via
>> QR. This is normally the "right" way to do regression.
>>
>> (You can also compute the QR factorization of A explicitly to get a rank
>> estimate; google "rank revealing QR" for more info on this sort of thing.)
>>
>