Le 23/01/2018 à 08:47, Martin Maechler a écrit :
Serguei Sokol <so...@insa-toulouse.fr>
on Mon, 22 Jan 2018 17:57:47 +0100 writes:
> Le 22/01/2018 à 17:40, Keith O'Hara a écrit :
>> This behavior is noted in the qr documentation, no?
>>
>> rank - the rank of x as computed by the decomposition(*): always full
rank in the LAPACK case.
> For a me a "full rank matrix" is a matrix the rank of which is indeed
min(nrow(A), ncol(A))
> but here the meaning of "always is full rank" is somewhat confusing.
Does it mean
> that only full rank matrices must be submitted to qr() when LAPACK=TRUE?
> May be there is a jargon where "full rank" is a synonym of min(nrow(A),
ncol(A)) for any matrix
> but the fix to stick with commonly admitted rank definition (i.e. the
number of linearly independent
> columns in A) is so easy. Why to discard lapack case from it (even
properly documented)?
Because 99.5% of caller to qr() never look at '$rank',
so why should we compute it every time qr() is called?
1. Because R already does it for linpack so it would be consistent to do so for
lapack too.
2. Because R pretends that it is a part of a returned qr class.
3. Because its calculation is a negligible fraction of QR itself.
==> Matrix :: rankMatrix() does use "qr" as one of its several methods.
--------------
As wiser people than me have said (I'm paraphrasing, don't find a nice
citation):
While the rank of a matrix is a very well defined concept in
mathematics (theory), its practical computation on a finite
precision computer is much more challenging.
True. It is indeed depending of round-off errors during QR calculations and
tolerance
setting but putting it just as min(nrow(A), ncol(A)) and still calling it rank of
"full rank"
is by far the most misleading choice to my mind.
Once again, if we are already calculating it for linpack let do it in most
consistent
way for lapack too. I can propose a patch if you will.
Serguei.
The ?rankMatrix help page (package Matrix, part of your R)
https://stat.ethz.ch/R-manual/R-devel/library/Matrix/html/rankMatrix.html
starts with the following 'Description'
__ Compute ‘the’ matrix rank, a well-defined functional in theory(*), somewhat
ambigous in practice. We provide several methods, the default corresponding to
Matlab's definition.
__ (*) The rank of a n x m matrix A, rk(A) is the maximal number of linearly
independent columns (or rows); hence rk(A) <= min(n,m).
>>> On Jan 22, 2018, at 11:21 AM, Serguei Sokol <so...@insa-toulouse.fr>
wrote:
>>>
>>> Hi,
>>>
>>> I have noticed different rank values calculated by qr() depending on
>>> LAPACK parameter. When it is FALSE (default) a true rank is estimated
and returned.
>>> Unfortunately, when LAPACK is set to TRUE, the min(nrow(A), ncol(A))
is returned
>>> which is only occasionally a true rank.
>>>
>>> Would not it be more consistent to replace the rank in the latter case
by something
>>> based on the following pseudo code ?
>>>
>>> d=abs(diag(qr))
>>> rank=sum(d >= d[1]*tol)
>>>
>>> Here, we rely on the fact column pivoting is activated in the called
lapack routine (dgeqp3)
>>> and diagonal term in qr matrix are put in decreasing order (according
to their absolute values).
>>>
>>> Serguei.
>>>
>>> How to reproduce:
>>>
>>> a=diag(2)
>>> a[2,2]=0
>>> qaf=qr(a, LAPACK=FALSE)
>>> qaf$rank # shows 1. OK it's the true rank value
>>> qat=qr(a, LAPACK=TRUE)
>>> qat$rank #shows 2. Bad, it's not the expected value.
>>>
> --
> Serguei Sokol
> Ingenieur de recherche INRA
> Cellule mathématique
> LISBP, INSA/INRA UMR 792, INSA/CNRS UMR 5504
> 135 Avenue de Rangueil
> 31077 Toulouse Cedex 04
> tel: +33 5 6155 9849
> email: so...@insa-toulouse.fr
> http://www.lisbp.fr
______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel