Dear Prof Therneau,
I have explored a bit more on this issue - I found that for this specific
problem,
one can get the older convergent behavior with just setting toler.chol to 4
e-12.
This is only a little larger than 2x the current default =
.Machine$double.eps^.75
= 1.8e-12.
Since you have mentioned 1e-10 a few times, and the change happened between
a few specific svn commits (r11513 and r11516), I had assumed that the parameter
must have been re-tuned - but am rather surprised that this is *not* the case.
It appears that toler.chol have always been .Machine$double.eps^.75 since at
least
June 2000. This means the meaning of it, and how it is used, has changed between
those commits? This is worrying.
Also on the general issues of warnings - it warns on the 5 column matrix
having rank less than 5, so it doesn't help before/after the changes (rank 3 or
rank 4).
But convergence and non-convergence probably should be treated quite
differently - the answer from non-convergence is definitely wrong, but the
rank 3 convergence result was supposed to be a valid answer to an scientific
question.
This rather suggests that those two should be treated/shown differently.
I'd also be interested to hear what went wrong with r11515 (memory consumption
just
climbs...). Am quite tempted to set this as a character-building exercise to
some
of my more capable students. :-).
Hin-Tak
On Wed, 18/12/13, Terry Therneau thern...@mayo.edu wrote:
Subject: Re: Coxph convergence
To: r-help@R-project.org, Hin-Tak Leung ht...@users.sourceforge.net,
David Winsemius dwinsem...@comcast.net
Date: Wednesday, 18 December, 2013, 19:12
I'll re-enter the fray.
The data set is an example where coxph is incorrect; due to
round off error it is treating
a 5 column rank 3 matrix as if it were rank 4. This of
course results in 0 digits of
precision.
Immediate fix, for the user, is to add
toler.chol= 1e-10 to their coxph call. It is
very likely that they will never ever have to change this
value.
I will look into changing the default from its current value
of .Machine$double.eps^.75.
However, I first have to check that this does not break
anything else. All epsilon
constants are a delicate dance between mutiple data sets,
and anyone with long experience
in numerical anlysis will tell you that it is impossible to
find constants that will work
for every data set. This is true for linear models,
logistic, Cox, ... you name it.
In summary:
I appreciate the example.
I'll add to my list of nasty problems.
I may be able to fix long term, and maybe not.
Changing the constant may break something
else.
I've given a good short term fix.
Terry T.
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.