Re: [R] Coxph convergence

2014-02-05 Thread Hin-Tak Leung
Dear Prof Therneau,

I have explored this issue a bit more. I found that, for this specific problem,
one can recover the older convergent behavior simply by setting toler.chol to
4e-12. This is only a little larger than twice the current default,
.Machine$double.eps^.75 = 1.8e-12.
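
To make the comparison concrete, here is a minimal sketch of the two settings;
the formula, data set and variable names below are placeholders, not the
original problem:

    library(survival)

    ## The current default tolerance under discussion:
    .Machine$double.eps^0.75      # about 1.8e-12

    ## Overriding it for one fit; 'd', 'time', 'status', 'a' and 'b' are
    ## hypothetical stand-ins for the real data.
    fit <- coxph(Surv(time, status) ~ a + b + a:b, data = d,
                 toler.chol = 4e-12)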

Since you have mentioned 1e-10 a few times, and the change happened between two
specific svn commits (r11513 and r11516), I had assumed that the parameter must
have been re-tuned, but I am rather surprised to find that this is *not* the
case. It appears that toler.chol has always been .Machine$double.eps^.75, since
at least June 2000. Does this mean that its meaning, and the way it is used,
changed between those commits? That is worrying.

Also, on the general issue of warnings: coxph warns that the 5-column matrix has
rank less than 5, so the warning does not distinguish the behavior before and
after the change (rank 3 versus rank 4). But convergence and non-convergence
should probably be treated quite differently: the answer from a non-converged
fit is definitely wrong, whereas the rank-3 converged result was supposed to be
a valid answer to a scientific question. This rather suggests that the two
situations should be reported differently.
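
For what it's worth, a hedged sketch of how a user might separate the two
situations programmatically; the warning text matched below, and the model and
data names, are assumptions rather than anything the package guarantees:

    library(survival)

    ## Collect warnings so a non-convergence warning is not lost among the
    ## routine rank-deficiency ones.
    warns <- character(0)
    fit <- withCallingHandlers(
      coxph(Surv(time, status) ~ a * b, data = d),   # placeholder model/data
      warning = function(w) {
        warns <<- c(warns, conditionMessage(w))
        invokeRestart("muffleWarning")
      }
    )

    ## Heuristics; the exact warning wording may differ between survival versions.
    did_not_converge <- any(grepl("did not converge", warns))
    dropped_columns  <- names(coef(fit))[is.na(coef(fit))]   # rank-deficient terms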

I'd also be interested to hear what went wrong with r11515 (memory consumption
just keeps climbing...). I am quite tempted to set this as a character-building
exercise for some of my more capable students. :-)

Hin-Tak


On Wed, 18/12/13, Terry Therneau thern...@mayo.edu wrote:

 Subject: Re: Coxph convergence
 To: r-help@R-project.org, Hin-Tak Leung ht...@users.sourceforge.net, David Winsemius dwinsem...@comcast.net
 Date: Wednesday, 18 December, 2013, 19:12

 I'll re-enter the fray.
 The data set is an example where coxph is incorrect; due to round-off error it
 is treating a 5-column rank-3 matrix as if it were rank 4. This of course
 results in 0 digits of precision.
    The immediate fix, for the user, is to add toler.chol = 1e-10 to their
 coxph call. It is very likely that they will never ever have to change this
 value.

 I will look into changing the default from its current value of
 .Machine$double.eps^.75. However, I first have to check that this does not
 break anything else. All epsilon constants are a delicate dance between
 multiple data sets, and anyone with long experience in numerical analysis will
 tell you that it is impossible to find constants that will work for every data
 set. This is true for linear models, logistic, Cox, ... you name it.

 In summary:
 I appreciate the example.
 I'll add it to my list of nasty problems.
 I may be able to fix it long term, and maybe not. Changing the constant may
 break something else.
 I've given a good short term fix.

 Terry T.

Re: [R] Coxph convergence

2013-12-18 Thread Terry Therneau

I'll re-enter the fray.
The data set is an example where coxph is incorrect; due to round-off error it is
treating a 5-column rank-3 matrix as if it were rank 4. This of course results in
0 digits of precision.

  The immediate fix, for the user, is to add toler.chol = 1e-10 to their coxph
call. It is very likely that they will never ever have to change this value.
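
Sketched with placeholder variable and data names, the workaround looks like
this; the tolerance can also be set through coxph.control:

    library(survival)

    ## Pass the tolerance directly in the call ...
    fit <- coxph(Surv(time, status) ~ a + b + a:b, data = d,
                 toler.chol = 1e-10)

    ## ... or set it via the control object.
    fit <- coxph(Surv(time, status) ~ a + b + a:b, data = d,
                 control = coxph.control(toler.chol = 1e-10))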

I will look into changing the default from its current value of
.Machine$double.eps^.75. However, I first have to check that this does not break
anything else. All epsilon constants are a delicate dance between multiple data
sets, and anyone with long experience in numerical analysis will tell you that it
is impossible to find constants that will work for every data set. This is true
for linear models, logistic, Cox, ... you name it.

In summary:
I appreciate the example.
I'll add it to my list of nasty problems.
I may be able to fix it long term, and maybe not. Changing the constant may break
something else.

I've given a good short term fix.

Terry T.



On 12/18/2013 05:00 AM, r-help-requ...@r-project.org wrote:

Your comprehension of the issue seems to be entirely wrong. Between r11513 and
r11516, some tuning of internal parameters was done, so the process of finding the
rank of a singular matrix no longer converges (within the time/tolerance implicitly
specified). Warnings are issued, but there are miscellaneous warnings before and
after as well (and one gets desensitised to them). Also, there is the nature of the
problem, which is to test for the possibility of interactions, or the lack thereof,

outcome ~ factor A + factor B + factor A x factor B

or just extra terms in outcome ~ factor A + factor B + ... as an exploration of
auxiliary effects. More often than not the extra terms make no difference, and the
matrix involved just isn't the nicest to manipulate; that is in the nature of this
kind of exploratory work.

Professor Therneau replied that it is possible to get the older convergent
behaviour by manually tuning some of the convergence criterion parameters. I
responded that, while that is possible, one is often simultaneously exploring many
models with many possible auxiliary effects (or the lack thereof), so manual tuning
for each is neither feasible nor appropriate; and we sort of left it at that.
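
In R formula notation the kind of exploratory fit described above might be
sketched as follows; the factor names, data set and use of qr() to inspect the
design matrix are illustrative assumptions, not part of the original report:

    library(survival)

    ## a * b expands to a + b + a:b, i.e. main effects plus their interaction.
    fit <- coxph(Surv(time, status) ~ a * b, data = d)

    ## With sparse factor combinations the design matrix is often rank
    ## deficient; a quick check of its numerical rank:
    X <- model.matrix(~ a * b, data = d)
    qr(X)$rank    # compare with ncol(X)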

