Thank you, that makes sense. Looking forward to 0.4!
On Wednesday, September 16, 2015 at 10:20:50 PM UTC-4, Simon Kornblith
wrote:
>
> \ is supposed to use pivoted QR by default, which is a standard way to solve
> least squares when X is not full rank. It gives me a LAPACKException on 0.3
> too, but it works for me on 0.4:
>
> julia> X2 = [1 2 3;
> 1 2 5
> 1 2 -8
> 2 4 23];
>
> julia> y = [2.3, 5.6, 9.2, 10.5];
>
> julia> X2\y
> 3-element Array{Float64,1}:
> 1.28112
> 2.56223
> -0.146502
>
> You can explicitly tell Julia to solve using pivoted QR, which works on
> 0.3:
>
> julia> qrfact(X2, pivot=true)\y # on Julia 0.4: qrfact(X2, Val{true})\y
> 3-element Array{Float64,1}:
> 1.28112
> 2.56223
> -0.146502
>
> Or you can solve using SVD:
>
> julia> svdfact(X2)\y
> 3-element Array{Float64,1}:
> 1.28112
> 2.56223
> -0.146502
>
> Pivoted QR should be slightly faster than SVD, but both should be much
> faster than rref.
>
> Simon
>
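[As a sanity check, the two factorizations above do solve the same least-squares problem; here is a sketch in Julia 0.4-era syntax (`qrfact`/`svdfact`; on 0.3, use `qrfact(X2, pivot=true)`) comparing their answers on the rank-deficient example:]

```julia
X2 = [1 2 3;
      1 2 5
      1 2 -8
      2 4 23]
y = [2.3, 5.6, 9.2, 10.5]

# Pivoted QR and the SVD both return the minimum-norm least-squares
# solution for this example, so the two answers should agree up to rounding.
x_qr  = qrfact(X2, Val{true}) \ y
x_svd = svdfact(X2) \ y
norm(x_qr - x_svd)   # should be near zero
```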
> On Wednesday, September 16, 2015 at 10:02:49 AM UTC-4, Cedric St-Jean
> wrote:
>>
>> Sorry, "singular" was the wrong word. I meant that the matrix does not
>> have full rank: the columns are not linearly independent.
>>
>> # Linearly independent columns
>> X1 = [1 2 3;
>> 1 2 5
>> 1 2 -8
>> 1 4 23]
>> # Linearly dependent columns (4th row changed)
>> X2 = [1 2 3;
>> 1 2 5
>> 1 2 -8
>> 2 4 23]
>> y = [2.3, 5.6, 9.2, 10.5]
>>
>> X1 \ y
>> 3-element Array{Float64,1}:
>>  -8.18265
>>   6.94133
>>  -0.394898
>>
>> X2 \ y
>> LAPACKException(2)
>>
>>
>> This is because the linear regression problem has no unique solution when
>> the matrix does not have full rank. The usual remedy is regularization
>> (e.g. ridge regression), but selecting a linearly independent subset of the
>> columns seems like a valid strategy too, if only prediction accuracy
>> matters.
>>
>> Hence: isn't that a valid non-pedagogical use of rref?
>>
>> Cédric
>>
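[On the regularization route mentioned above, a minimal ridge-regression sketch; the penalty `lambda` is an illustrative value, not something from this thread:]

```julia
X2 = [1 2 3;
      1 2 5
      1 2 -8
      2 4 23]
y = [2.3, 5.6, 9.2, 10.5]

# The ridge penalty makes the normal equations nonsingular even though
# X2 is rank deficient (its second column is twice its first).
lambda = 0.1
beta = (X2'X2 + lambda*eye(3)) \ (X2'y)
```

As `lambda` shrinks toward zero, `beta` approaches the minimum-norm least-squares solution that the SVD solve returns.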
>> On Wednesday, September 16, 2015 at 12:13:55 AM UTC-4, Steven G. Johnson
>> wrote:
>>>
>>>
>>>
>>> On Tuesday, September 15, 2015 at 10:22:47 AM UTC-4, Cedric St-Jean
>>> wrote:
>>>>
>>>> In the discussion of issue 9804
>>>> <https://github.com/JuliaLang/julia/pull/9804#issuecomment-140404574>,
>>>> the general consensus seemed to be:
>>>>
>>>> > Right. "rref" has no use in real numerical coding, but every use in
>>>> pedagogy.
>>>>
>>>>
>>>> I had a linear regression yield a PosDefError (the X matrix contained
>>>> only integers and was singular). I solved it by selecting a linearly
>>>> independent subset of columns
>>>> <http://math.stackexchange.com/questions/199125/how-to-remove-linearly-dependent-rows-cols>.
>>>>
>>>> This uses the reduced row echelon form, rref. Is there a simpler way of
>>>> this that I'm missing?
>>>>
>>>
>>> A \ x, where A is a non-square matrix, finds a least-squares solution via
>>> QR. This is normally the "right" way to do regression.
>>>
>>> (You can also compute the QR factorization of A explicitly to get a rank
>>> estimate; google "rank revealing QR" for more info on this sort of thing.)
>>>
>>
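[To make the rank-revealing-QR suggestion concrete, here is a sketch in Julia 0.4-era syntax (the tolerance is an illustrative choice) that uses pivoted QR both to estimate the numerical rank and to pick a linearly independent column subset, i.e. the job the rref approach was doing:]

```julia
X2 = [1 2 3;
      1 2 5
      1 2 -8
      2 4 23]

F = qrfact(X2, Val{true})       # pivoted QR: X2[:, F[:p]] == Q*R
R = F[:R]
tol = 1e-10 * abs(R[1, 1])      # R's diagonal decreases in magnitude
r = sum(abs(diag(R)) .> tol)    # numerical rank estimate (2 for this X2)
cols = F[:p][1:r]               # indices of an independent column subset
Xind = X2[:, cols]              # full-rank design matrix
```

`Xind \ y` then solves an ordinary full-rank least-squares problem, at the cost of dropping the pivoted-out columns.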