Re: [R] strange `nls' behaviour

2007-11-13 Thread Joerg van den Hoff
On Mon, Nov 12, 2007 at 06:59:56PM -0500, Duncan Murdoch wrote:
 On 12/11/2007 2:56 PM, Joerg van den Hoff wrote:
 On Mon, Nov 12, 2007 at 11:09:21AM -0500, Duncan Murdoch wrote:
 On 11/12/2007 9:14 AM, Joerg van den Hoff wrote:
 On Mon, Nov 12, 2007 at 07:36:34AM -0500, Duncan Murdoch wrote:
 On 11/12/2007 6:51 AM, Joerg van den Hoff wrote:
 I initially thought, this should better be posted to r-devel
 but alas! no response. 
 I think the reason there was no response is that your example is too 
 complicated.  You're doing a lot of strange things (fitfunc as a result 
 of deriv, using as.name, as.call, as.formula, etc.)  You should 
 simplify 
 thanks for the feedback.
 
 concerning  lot  of  strange  things:  OK.  I  thought the
 context might be important (why, for heaven's sake  do  you
 want  to  do  this!?), but, then, maybe not. so the easiest
 way to trigger a similar (not the  identical)  behaviour  is
 something like
 
 f <- function (n) {
   #-
   #define n data points for a (hardcoded) model:
   #---
   x <- seq(0, 5, length  = n)
   y <- 2 * exp(-1*x) + 2; 
   y <- rnorm(y,y, 0.01*y)
 
   #the model (same as the above hardcoded one):
   model <- y ~ a * exp (-b*x) + c
 
   #call nls as usual:
   res1 <- try(nls(model, start = c(a=2, b=1, c=2)))
 
   #call it a bit differently:
   res2 <- nls(y ~ eval(model[[3]]), start = c(a=2, b=1, c=2))
 
   list(res1 = res1, res2 = res2)
   #-
 }
 I'd say the problem is relying on the default for the envir parameter to 
 eval.  It's reasonable to expect nls to set things up so that terms in 
 the model formula are taken from the right place, but when your model 
 formula is playing with evaluation, you should be very careful to make 
 sure the evaluation takes place in the right context.
 
 agreed.
 
 The default for envir is parent.frame(), and that can't be right: that 
 will see local variables in whatever function called it, so if one of 
 them has a variable called model, you'll subscript it.
 
 for one, in my actual application, I _do_ specify envir (and
 remember the above was really  not  the  original  situation
 where the subsetting of `model' is actually done only in the
 `deriv' call).   second, if I'm not missing  something,  doing
 the  evaluation  in  parent.frame()  is  fine  in  the above
 example and does not explain  the  error,  whose  triggering
 solely depends on the number of data points used.
 
 You are missing something.  parent.frame() will be evaluated by eval(), 
 so it refers to whatever function calls eval().  You don't know what 
 function that will be, because it's buried deep within nls().
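
A minimal sketch of the point above, not taken from the thread: the default `envir = parent.frame()` is resolved from inside eval(), so the lookup starts in whichever function happened to call eval(). The helper names and the environment `chosen_env` are hypothetical and have nothing to do with the internals of nls().

```r
# Hypothetical environment holding the object we actually want to find
chosen_env <- new.env()
assign("model", "model in chosen environment", envir = chosen_env)

# eval() without envir: lookup starts in the frame of the calling function,
# so the caller's own local `model` is found first
caller_with_local_model <- function(expr) {
  model <- "local model in the caller of eval()"
  eval(expr)
}

# eval() with an explicit envir: lookup starts where we say it should
caller_with_explicit_envir <- function(expr, envir) {
  model <- "local model in the caller of eval()"
  eval(expr, envir = envir)
}

expr <- quote(model)
res_default  <- caller_with_local_model(expr)
res_explicit <- caller_with_explicit_envir(expr, chosen_env)
```

With the default, the result depends on who calls eval(); with an explicit environment it does not.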

thanks for taking the time.

I  see. will remember to be even more careful with `eval' in
the future.

 
 Perhaps you think that the formula
 
 y ~ eval(model[[3]])
 
 is the same as the original one?  It's not.  Try printing it:

no, I understand roughly what lazy evaluation means :-)

 
 > y ~ eval(model[[3]])
 y ~ eval(model[[3]])
 
 The eval doesn't get called at this point, it gets called for every step 
 of the least squares minimization.  Who knows what parent.frame() means 
 then?
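
To illustrate the remark above (a sketch only, using nothing beyond base R): creating the formula does not run eval(); the right-hand side is stored as an unevaluated call, and `model` is merely a symbol that has to be looked up later.

```r
model <- y ~ a * exp(-b * x) + c
frml  <- y ~ eval(model[[3]])

rhs        <- frml[[3]]        # right-hand side, still unevaluated
rhs_class  <- class(rhs)       # a call object
rhs_fun    <- rhs[[1]]         # the symbol `eval`
frml_names <- all.vars(frml)   # variable names that appear in the formula
```

Note that `a`, `b`, `c` and `x` do not appear among the names of `frml`; only `y` and `model` do.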
 
 If I were trying to do what you're doing, I would construct the formula 
 before the call to nls, and pass that.  I.e. something like the 
 following silly code:
 
 model2 <- model
 model2[[3]] <- model[[3]] # The eval() is implicit
 res2 <- nls(model2, start = c(a=2, b=1, c=2))
 
 
 I  was  trying that (more or less) in my original example, 
 I think, but I will reexamine my application and see, whether I can  bypass
 the problem somehow along these lines.
 
 If you really want to put eval() in a formula, then I can't see how you 
 could avoid an explicit specification of the envir parameter.  So I'd 
 call this a bug in your code.
 
 
 as I said, this seems right in principle (still, if the call
 does happen at some well  defined  place  such  as  a  small
 function local to a user visible one, the eval without envir
 might be quite OK) but not w.r.t. explaining the nls  crash.
 
 No, it's not okay.  Your formula is looking in places it shouldn't.  It 
 makes absolutely no sense for your formula to depend on what function 
 evaluates it.  It should only depend on the data that is available in 
 the data argument to nls() or visible in the current environment.

yes, accepted, as above.

 
 katherine  mullen  seems  to  have located the exact spot in
 `nls' where something goes wrong. so, for now I think martin
 might be right (bug in my thinking by assuming I'm allowed
 to use eval in this place), but otherwise it is  a  property
 (or bug) of `nls'.
 
 She may have discovered where the error call was triggered, but things 
 went wrong in this example when you mis-used eval().  If you think your 

I  don't  think  so  (but  may be wrong, of course): I again
stepped through the `nls'  call  with  `debug'  to  see  the
effect of katharine's 

[R] strange `nls' behaviour

2007-11-12 Thread Joerg van den Hoff

I initially thought, this should better be posted to r-devel
but alas! no response. so  I  try  it  here.  sorry  for  the
lengthy explanation but it seems unavoidable. to quickly see
the problem simply copy the little example below and execute

f(n=5)

which  crashes. called with n !=  5 (and of course n > 3 since
there are 3 parameters in the model...) everything is as  it
should be.

in detail:
I  stumbled over the following _very_ strange behaviour/error
when using `nls' which  I'm  tempted  (despite  the  implied
dangers) to call a bug:

I've  written a driver for `nls' which allows specifying the
model and the data vectors using arbitrary  symbols.   these
are  internally  mapped  to  consistent names, which poses a
slight complication when using `deriv' to  provide  analytic
derivatives. the following fragment gives the idea:

#-
f <- function(n = 4) {

   x <- seq(0, 5, length  = n)

   y <- 2 * exp(-1*x) + 2; 
   y <- rnorm(y,y, 0.01*y)

   model <- y ~ a * exp (-b*x) + c

   fitfunc <- deriv(model[[3]], c("a", "b", "c"), c("a", "b", "c", "x"))

   #standard call of nls:
   res1 <- nls(y ~ fitfunc(a, b, c, x), start = c(a=1, b=1, c=1))

   call.fitfunc <- 
   c(list(fitfunc), as.name("a"), as.name("b"), as.name("c"), as.name("x"))
   call.fitfunc <- as.call(call.fitfunc)
   frml <- as.formula(y ~ eval(call.fitfunc))

   #computed call of nls:
   res2 <- nls(frml, start = c(a=1, b=1, c=1))

   list(res1 = res1, res2 = res2)
}
#-

the  argument  `n'   defines  the number of (simulated) data
points x/y which are going to be fitted by some model ( here
y ~ a*exp(-b*x)+c )

the first call to `nls' is the standard way of calling `nls'
when knowing all the variable and parameter names.

the second call (yielding `res2') uses a constructed formula
in `frml' (which in this example is of course not necessary,
but  in  the general case 'a,b,c,x,y' are not a priori known
names).

now, here is the problem: the call

f(4)

runs fine/consistently, as does every call with n > 5.

BUT: for n = 5 (i.e. issuing f(5))

the second fit leads to the error message:

Error in model.frame(formula, rownames, variables, varnames, extras, 
extranames,  : 
invalid type (language) for variable 'call.fitfunc'

I  cornered  this  to a spot in `nls' where a model frame is
constructed in variable `mf'.  the parsing/constructing here
seems  simply  to  be messed up for n = 5: `call.fitfunc' is
interpreted as variable.

I,  moreover, empirically noted that the problem occurs when
the total number of  parameters  plus  dependent/independent
variables  equals  the number of data points (in the present
example a,b,c,x,y).

so it is not the 'magic' number of 5 but rather the identity
of data vector length and number of parameters+variables  in
the model which leads to the problem.
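
One reading that is consistent with this observation, offered as an assumption rather than something established in the thread: if `nls' decides which names in the formula are data variables by comparing their lengths with the length of the response, then the call object itself qualifies as soon as n equals its length. The sketch below only checks the two ingredients in base R; `fitfunc' here is a hypothetical stand-in for the result of `deriv'.

```r
# Hypothetical stand-in for the function returned by deriv()
fitfunc <- function(a, b, c, x) a * exp(-b * x) + c

# Same construction as in the example: function plus four argument symbols
call.fitfunc <- as.call(list(fitfunc, as.name("a"), as.name("b"),
                             as.name("c"), as.name("x")))
frml <- y ~ eval(call.fitfunc)

frml_names  <- all.vars(frml)        # names appearing in the formula
call_length <- length(call.fitfunc)  # one function + four arguments
```

So the name `call.fitfunc' is visible in the formula and the object behind it has length 5, which matches the n at which the error appears.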

this  is  with  2.5.0  (which  hopefully  is  not considered
ancient) and MacOSX 10.4.10.

any ideas?

thanks

joerg

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] strange `nls' behaviour

2007-11-12 Thread Katharine Mullen
I can confirm this behavior on R-2.6.0 but don't have time to look into it
further at the moment.



Re: [R] strange `nls' behaviour

2007-11-12 Thread Martin Maechler
 DM == Duncan Murdoch [EMAIL PROTECTED]
 on Mon, 12 Nov 2007 07:36:34 -0500 writes:

DM On 11/12/2007 6:51 AM, Joerg van den Hoff wrote:
 I initially thought, this should better be posted to r-devel
 but alas! no response. 

DM I think the reason there was no response is that your example is too 
DM complicated.  You're doing a lot of strange things (fitfunc as a result 
DM of deriv, using as.name, as.call, as.formula, etc.)  You should simplify 
DM it down to isolate the bug.  That's a lot of work, but you're the one in 
DM the best position to do it.  I'd say there's at least an even chance 
DM that the bug is in your code rather than in nls().

yes.. and.. no : 
- His code is quite peculiar, but I think only slightly too complicated

- one could argue that the bug is in Joerg's thinking that
  something like
nls(y ~ eval(fitfunc), )

  should be working at all.
  But then he had found by experiment that it (accidentally I'd say)
  does work in many cases.

DM And 2.5.0 *is* ancient; please confirm the bug exists in R-patched if it 
DM turns out to be an R bug.

You are right, but indeed (as has Kate just said) it *does*
exist in current R versions.

I agree that the behavior of nls() here is sub-optimal.
It *should* be consistent, i.e. work the same for n=4,5,6,..

I had spent about an hour after Joerg's R-devel posting,
and found to be too busy with more urgent matters --
unfortunately forgetting to give *some* feedback about my findings.

It may well be that we find that nls() should give an
(intelligible) error message in such eval() cases - rather than
only in one case...

Martin Maechler



Re: [R] strange `nls' behaviour

2007-11-12 Thread Joerg van den Hoff
On Mon, Nov 12, 2007 at 03:25:38PM +0100, Martin Maechler wrote:
  DM == Duncan Murdoch [EMAIL PROTECTED]
  on Mon, 12 Nov 2007 07:36:34 -0500 writes:
 
 DM On 11/12/2007 6:51 AM, Joerg van den Hoff wrote:
  I initially thought, this should better be posted to r-devel
  but alas! no response. 
 
 DM I think the reason there was no response is that your example is too 
 DM complicated.  You're doing a lot of strange things (fitfunc as a result 
 DM of deriv, using as.name, as.call, as.formula, etc.)  You should simplify 
 DM it down to isolate the bug.  That's a lot of work, but you're the one in 
 DM the best position to do it.  I'd say there's at least an even chance 
 DM that the bug is in your code rather than in nls().
 
 yes.. and.. no : 
 - His code is quite peculiar, but I think only slightly too complicated

thanks  for  this  response. I just tried to give an easier
example in response to duncan's mail.

 
 - one could argue that the bug is in Joerg's thinking that
   something like
   nls(y ~ eval(fitfunc), )
 
   should be working at all.
   But then he had found by experiment that it (accidentally I'd say)
   does work in many cases.
 
 DM And 2.5.0 *is* ancient; please confirm the bug exists in R-patched if it 
 DM turns out to be an R bug.
 
 You are right, but indeed (as has Kate just said) it *does*
 exist in current R versions.
 
 I agree that the behavior of nls() here is sub-optimal.
 It *should* be consistent, i.e. work the same for n=4,5,6,..
 
 I had spent about an hour after Joerg's R-devel posting,
 and found to be too busy with more urgent matters --
 unfortunately forgetting to give *some* feedback about my findings.
 
 It may well be that we find that nls() should give an
 (intelligible) error message in such eval() cases - rather than
 only in one case...

well,  if at all possible, it would _really_ be nice to have
the opportunity  to  pass  `eval'  constructs  through  the
`model'  argument  to `nls'. I'm far from understanding what
actually is going on in `model.frame', but if it parses  the
thing  correctly  (accidentally  or  not)  except  in  one
special case, isn't  there  a  clean  way  to  repair  the
special  case?  in my application I found the ability to use
`eval' constructs here really helpful. at least  I  did  not
find another way to be able to use `deriv' information _and_
allow the user to use arbitrary symbols  in  specifying  the
model  formula. at least I'd prefer the present situation of
"works nearly always" to a consistent "works never".

joerg

 

Re: [R] strange `nls' behaviour

2007-11-12 Thread Katharine Mullen
On about line 547 in nls.R there is
  mf$formula <-  # replace by one-sided linear model formula
  as.formula(paste("~", paste(varNames[varIndex], collapse = "+")),
   env = environment(formula))

If this is replaced with

mf$formula <-  # replace by one-sided linear model formula
  as.formula(paste("~", paste(varNames[1], collapse = "+")),
   env = environment(formula))

then the behaviour seems to be OK at least for Joerg's examples.

It remains to be determined whether this is a _good_ solution (that is, if
it's a general solution to get the desired behavior).
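
To make the difference between the two versions concrete, here is a small sketch that builds both one-sided formulas outside of nls(). The values of `varNames` and `varIndex` are assumptions chosen to mimic Joerg's f(5) case (both names flagged as data variables); they are not copied from a debugging session, and the `env` argument is left out for simplicity.

```r
# Assumed values for the failing case, for illustration only
varNames <- c("y", "call.fitfunc")
varIndex <- c(TRUE, TRUE)

# Formula as built by the existing line in nls.R
original <- as.formula(paste("~", paste(varNames[varIndex], collapse = "+")))

# Formula as built by the proposed replacement
patched  <- as.formula(paste("~", paste(varNames[1], collapse = "+")))

original_names <- all.vars(original)
patched_names  <- all.vars(patched)
```

Under these assumptions the existing line hands the name of a language object to model.frame() as if it were a data column, while the replacement keeps only the first name. Whether the first name is always the right choice is the open question raised above.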


Re: [R] strange `nls' behaviour

2007-11-12 Thread Duncan Murdoch
On 11/12/2007 9:14 AM, Joerg van den Hoff wrote:
 On Mon, Nov 12, 2007 at 07:36:34AM -0500, Duncan Murdoch wrote:
 On 11/12/2007 6:51 AM, Joerg van den Hoff wrote:
 I initially thought, this should better be posted to r-devel
 but alas! no response. 
 
 I think the reason there was no response is that your example is too 
 complicated.  You're doing a lot of strange things (fitfunc as a result 
 of deriv, using as.name, as.call, as.formula, etc.)  You should simplify 
 
 thanks for the feedback.
 
 concerning  lot  of  strange  things:  OK.  I  thought the
 context might be important (why, for heaven's sake  do  you
 want  to  do  this!?), but, then, maybe not. so the easiest
 way to trigger a similar (not the  identical)  behaviour  is
 something like
 
 f <- function (n) {
   #-
   #define n data points for a (hardcoded) model:
   #---
   x <- seq(0, 5, length  = n)
   y <- 2 * exp(-1*x) + 2; 
   y <- rnorm(y,y, 0.01*y)
 
   #the model (same as the above hardcoded one):
   model <- y ~ a * exp (-b*x) + c
 
   #call nls as usual:
   res1 <- try(nls(model, start = c(a=2, b=1, c=2)))
 
   #call it a bit differently:
   res2 <- nls(y ~ eval(model[[3]]), start = c(a=2, b=1, c=2))
 
   list(res1 = res1, res2 = res2)
   #-
 }

I'd say the problem is relying on the default for the envir parameter to 
eval.  It's reasonable to expect nls to set things up so that terms in 
the model formula are taken from the right place, but when your model 
formula is playing with evaluation, you should be very careful to make 
sure the evaluation takes place in the right context.

The default for envir is parent.frame(), and that can't be right: that 
will see local variables in whatever function called it, so if one of 
them has a variable called model, you'll subscript it.

If I were trying to do what you're doing, I would construct the formula 
before the call to nls, and pass that.  I.e. something like the 
following silly code:

model2 <- model
model2[[3]] <- model[[3]] # The eval() is implicit
res2 <- nls(model2, start = c(a=2, b=1, c=2))

If you really want to put eval() in a formula, then I can't see how you 
could avoid an explicit specification of the envir parameter.  So I'd 
call this a bug in your code.
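
The suggestion of constructing the formula before the call can be sketched as a complete function. This is only an illustration under assumptions: the function name, the fixed seed, n = 20 and the placeholder symbol are all made up here, and the data are simulated as in the example above.

```r
# Hypothetical wrapper: build the formula first, then hand it to nls()
fit_with_prebuilt_formula <- function(n = 20) {
  set.seed(1)                                  # assumed, for reproducibility
  x <- seq(0, 5, length.out = n)
  y <- 2 * exp(-1 * x) + 2
  y <- rnorm(n, mean = y, sd = 0.01 * y)       # 1% relative noise

  model <- y ~ a * exp(-b * x) + c

  # Start from a formula created in this frame, then splice in the
  # right-hand side; no eval() remains inside the formula
  frml <- y ~ rhs_placeholder
  frml[[3]] <- model[[3]]

  nls(frml, start = c(a = 2, b = 1, c = 2))
}

fit <- fit_with_prebuilt_formula()
```

Because the spliced formula contains only `y`, `x` and the parameters, nls() sees the same thing it would see for a formula written out by hand, and nothing depends on where an eval() happens to be called from.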

Duncan Murdoch


 this is without all the overhead of my first example and now
 (since not quite the same) the problem  arises  at  n  ==  3
 where  the  fit  can't  really  proceed  (there  are  also 3
 parameters -- the first example was more  relevant  in  this
 respect)  but  anyway  the  second nls-call does not proceed
 beyond the parsing phase of `model.frame'.
 
 so,  I  can't  see  room for a real bug in my code, but very
 well a chance that I misuse `nls'  (i.e.  not  understanding
 what really is tolerable for the `model' argument of `nls').
 but  if the latter is not the case, I would think there is a
 bug in `nls'  (or,  actually,  rather  in  `model.frame'  or
 whatever)  when  parsing  the  nls call.


 
 
 it down to isolate the bug.  Thats a lot of work, but you're the one in 
 the best position to do it.  I'd say there's at least an even chance 
 that the bug is in your code rather than in nls().
 
 And 2.5.0 *is* ancient; please confirm the bug exists in R-patched if it 
 turns out to be an R bug.
 
 if  need  be,  I'll  do  that  (if  I  get it compiled under
 macosX). but assuming  that  you  have  R-patched  installed
 anyway, I would appreciate if you would copy the new example
 above or the old one  below  to  your  R-  prompt  and  see,
 whether  it  crashes  with  the same error message if called
 with the special number of data points (3 for the new, 5 for
 the  old  example)?  and  if  so,  maybe  bring  this to the
 attention of d. bates?
 
 
 j. van den hoff
 
 Duncan Murdoch
 
 
 
 
 

Re: [R] strange `nls' behaviour

2007-11-12 Thread Joerg van den Hoff
On Mon, Nov 12, 2007 at 11:09:21AM -0500, Duncan Murdoch wrote:
 On 11/12/2007 9:14 AM, Joerg van den Hoff wrote:
 On Mon, Nov 12, 2007 at 07:36:34AM -0500, Duncan Murdoch wrote:
 On 11/12/2007 6:51 AM, Joerg van den Hoff wrote:
 I initially thought, this should better be posted to r-devel
 but alas! no response. 
 
 I think the reason there was no response is that your example is too 
 complicated.  You're doing a lot of strange things (fitfunc as a result 
 of deriv, using as.name, as.call, as.formula, etc.)  You should simplify 
 
 thanks for the feedback.
 
 concerning  lot  of  strange  things:  OK.  I  thought the
 context might be important (why, for heaven's sake  do  you
 want  to  do  this!?), but, then, maybe not. so the easiest
 way to trigger a similar (not the  identical)  behaviour  is
 something like
 
 f <- function (n) {
   #-
   #define n data points for a (hardcoded) model:
   #---
   x <- seq(0, 5, length  = n)
   y <- 2 * exp(-1*x) + 2; 
   y <- rnorm(y,y, 0.01*y)
 
   #the model (same as the above hardcoded one):
   model <- y ~ a * exp (-b*x) + c
 
   #call nls as usual:
   res1 <- try(nls(model, start = c(a=2, b=1, c=2)))
 
   #call it a bit differently:
   res2 <- nls(y ~ eval(model[[3]]), start = c(a=2, b=1, c=2))
 
   list(res1 = res1, res2 = res2)
   #-
 }
 
 I'd say the problem is relying on the default for the envir parameter to 
 eval.  It's reasonable to expect nls to set things up so that terms in 
 the model formula are taken from the right place, but when your model 
 formula is playing with evaluation, you should be very careful to make 
 sure the evaluation takes place in the right context.

agreed.

 
 The default for envir is parent.frame(), and that can't be right: that 
 will see local variables in whatever function called it, so if one of 
 them has a variable called model, you'll subscript it.

for one, in my actual application, I _do_ specify envir (and
remember the above was really not the original situation,
where the subsetting of `model' is actually done only in the
`deriv' call). second, if I'm not missing something, doing
the evaluation in parent.frame() is fine in the above
example and does not explain the error, whose triggering
depends solely on the number of data points used.

 
 If I were trying to do what you're doing, I would construct the formula 
 before the call to nls, and pass that.  I.e. something like the 
 following silly code:
 
 model2 <- model
 model2[[3]] <- model[[3]] # The eval() is implicit
 res2 <- nls(model2, start = c(a=2, b=1, c=2))
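
For readers following along, the suggestion quoted above can be tried as a standalone script. This is only a sketch: the fixed seed, n = 20 and the explicit data argument are assumptions added here for reproducibility and are not part of the original messages.

```r
# Sketch only: seed, n and the data frame are illustrative assumptions.
set.seed(1)
n <- 20
x <- seq(0, 5, length.out = n)
y <- 2 * exp(-1 * x) + 2
y <- rnorm(n, mean = y, sd = 0.01 * y)

model <- y ~ a * exp(-b * x) + c

# Build the formula before calling nls(), so that it contains the model
# expression itself rather than a call to eval() that is resolved later.
model2 <- model
model2[[3]] <- model[[3]]

# Pass the data explicitly so the fit does not depend on where it is called.
res2 <- nls(model2, data = data.frame(x = x, y = y),
            start = list(a = 2, b = 1, c = 2))
coef(res2)
```

Because model2 holds the plain expression, nothing in the fit depends on which internal function ends up evaluating the right-hand side.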


I was trying that (more or less) in my original example,
I think, but I will reexamine my application and see whether I can bypass
the problem somehow along these lines.

 
 If you really want to put eval() in a formula, then I can't see how you 
 could avoid an explicit specification of the envir parameter.  So I'd 
 call this a bug in your code.


as I said, this seems right in principle (still, if the call
does happen at some well defined place such as a small
function local to a user visible one, the eval without envir
might be quite OK) but not w.r.t. explaining the nls crash.
katherine mullen seems to have located the exact spot in
`nls' where something goes wrong. so, for now I think martin
might be right (bug in my thinking by assuming I'm allowed
to use eval in this place), but otherwise it is a property
(or bug) of `nls'.

joerg van den hoff

 
 Duncan Murdoch
 
 
 this is without all the overhead of my first example and now
 (since not quite the same) the problem arises at n == 3
 where the fit can't really proceed (there are also 3
 parameters -- the first example was more relevant in this
 respect) but anyway the second nls-call does not proceed
 beyond the parsing phase of `model.frame'.
 
 so,  I  can't  see  room for a real bug in my code, but very
 well a chance that I misuse `nls'  (i.e.  not  understanding
 what really is tolerable for the `model' argument of `nls').
 but  if the latter is not the case, I would think there is a
 bug in `nls'  (or,  actually,  rather  in  `model.frame'  or
 whatever)  when  parsing  the  nls call.
 
 
 
 
 it down to isolate the bug.  That's a lot of work, but you're the one in 
 the best position to do it.  I'd say there's at least an even chance 
 that the bug is in your code rather than in nls().
 
 And 2.5.0 *is* ancient; please confirm the bug exists in R-patched if it 
 turns out to be an R bug.
 
 if  need  be,  I'll  do  that  (if  I  get it compiled under
 macosX). but assuming  that  you  have  R-patched  installed
 anyway, I would appreciate if you would copy the new example
 above or the old one  below  to  your  R-  prompt  and  see,
 whether  it  crashes  with  the same error message if called
 with the special number of data points (3 for the new, 5 for
 the  

Re: [R] strange `nls' behaviour

2007-11-12 Thread Duncan Murdoch
On 12/11/2007 2:56 PM, Joerg van den Hoff wrote:
 On Mon, Nov 12, 2007 at 11:09:21AM -0500, Duncan Murdoch wrote:
 On 11/12/2007 9:14 AM, Joerg van den Hoff wrote:
 On Mon, Nov 12, 2007 at 07:36:34AM -0500, Duncan Murdoch wrote:
 On 11/12/2007 6:51 AM, Joerg van den Hoff wrote:
 I initially thought, this should better be posted to r-devel
 but alas! no response. 
 I think the reason there was no response is that your example is too 
 complicated.  You're doing a lot of strange things (fitfunc as a result 
 of deriv, using as.name, as.call, as.formula, etc.)  You should simplify 
 thanks for the feedback.

 concerning  lot  of  strange  things:  OK.  I  thought the
 context might be important (why, for heaven's sake  do  you
 want  to  do  this!?), but, then, maybe not. so the easiest
 way to trigger a similar (not the  identical)  behaviour  is
 something like

 f <- function (n) {
   #-
   #define n data points for a (hardcoded) model:
   #---
   x <- seq(0, 5, length = n)
   y <- 2 * exp(-1*x) + 2
   y <- rnorm(y, y, 0.01*y)

   #the model (same as the above hardcoded one):
   model <- y ~ a * exp(-b*x) + c

   #call nls as usual:
   res1 <- try(nls(model, start = c(a=2, b=1, c=2)))

   #call it a bit differently:
   res2 <- nls(y ~ eval(model[[3]]), start = c(a=2, b=1, c=2))

   list(res1 = res1, res2 = res2)
   #-
 }
 I'd say the problem is relying on the default for the envir parameter to 
 eval.  It's reasonable to expect nls to set things up so that terms in 
 the model formula are taken from the right place, but when your model 
 formula is playing with evaluation, you should be very careful to make 
 sure the evaluation takes place in the right context.
 
 agreed.
 
 The default for envir is parent.frame(), and that can't be right: that 
 will see local variables in whatever function called it, so if one of 
 them has a variable called model, you'll subscript it.
 
 for one, in my actual application, I _do_ specify envir (and
 remember the above was really  not  the  original  situation
 where the subsetting of `model' is actually done only in the
 `deriv' call).   second, if I'm not missing  something,  doing
 the  evaluation  in  parent.frame()  is  fine  in  the above
 example and does not explain  the  error,  whose  triggering
 solely depends on the number of data points used.

You are missing something.  parent.frame() will be evaluated by eval(), 
so it refers to whatever function calls eval().  You don't know what 
function that will be, because it's buried deep within nls().
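
A small sketch of this point, outside of nls() altogether. The function evaluate_rhs() is a hypothetical stand-in for whatever internal code ends up evaluating the formula; it is not taken from the nls() sources, and the parameter values are made up for illustration.

```r
# Top-level objects, as in the example from the thread (values are made up).
model <- y ~ a * exp(-b * x) + c
a <- 2; b <- 1; c <- 2; x <- 0

# The right-hand side that was put into the formula.
rhs <- quote(eval(model[[3]]))

# Hypothetical stand-in for a function buried inside the fitting code.
evaluate_rhs <- function(expr) {
  model <- y ~ a + 100   # unrelated local that happens to be called `model`
  eval(expr)             # envir defaults to parent.frame(): this function's frame
}

top    <- eval(rhs)          # finds the top-level `model`: 2 * exp(0) + 2
inside <- evaluate_rhs(rhs)  # finds the local `model` instead: 2 + 100
c(top = top, inside = inside)
```

The same expression gives two different answers depending on who evaluates it, which is why an eval() without an explicit envir inside a formula is fragile.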

Perhaps you think that the formula

y ~ eval(model[[3]])

is the same as the original one?  It's not.  Try printing it:

> y ~ eval(model[[3]])
y ~ eval(model[[3]])

The eval doesn't get called at this point, it gets called for every step 
of the least squares minimization.  Who knows what parent.frame() means 
then?
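
A quick way to see this for yourself, as a sketch using only base R (the object name f2 is made up here):

```r
model <- y ~ a * exp(-b * x) + c
f2 <- y ~ eval(model[[3]])

# The right-hand side is stored as an unevaluated call to eval().
f2[[3]]
is.call(f2[[3]])

# Variables that each formula refers to: the second formula mentions
# `model`, not the parameters a, b, c or the covariate x.
all.vars(model)
all.vars(f2)
```

So code that inspects the formula sees a variable named `model` on the right-hand side, and the actual model expression only appears once something evaluates the eval() call.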

 If I were trying to do what you're doing, I would construct the formula 
 before the call to nls, and pass that.  I.e. something like the 
 following silly code:

 model2 <- model
 model2[[3]] <- model[[3]] # The eval() is implicit
 res2 <- nls(model2, start = c(a=2, b=1, c=2))
 
 
 I  was  trying that (more or less) in my original example, 
 I think, but I will reexamine my application and see, whether I can  bypass
 the problem somehow along these lines.
 
 If you really want to put eval() in a formula, then I can't see how you 
 could avoid an explicit specification of the envir parameter.  So I'd 
 call this a bug in your code.
 
 
 as I said, this seems right in principle (still, if the call
 does happen at some well  defined  place  such  as  a  small
 function local to a user visible one, the eval without envir
 might be quite OK) but not w.r.t. explaining the nls  crash.

No, it's not okay.  Your formula is looking in places it shouldn't.  It 
makes absolutely no sense for your formula to depend on what function 
evaluates it.  It should only depend on the data that is available in 
the data argument to nls() or visible in the current environment.

 katherine  mullen  seems  to  have located the exact spot in
 `nls' where something goes wrong. so, for now I think martin
 might be right (bug in my thinking by assuming I'm allowed
 to use eval in this place), but otherwise it is  a  property
 (or bug) of `nls'.

She may have discovered where the error call was triggered, but things 
went wrong in this example when you mis-used eval().  If you think your 
original code used it properly, then simplify that down to a minimal 
example that triggers the bug.  Don't put things in the example unless 
they are necessary to trigger the error.  For example:

  - drop the wrapper function f.
  - fix n
  - use fixed data, not random data.
  - use an extremely simple model, e.g. y ~ a
Etc.
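
As a rough sketch of what such a stripped-down starting point might look like: the data values and the one-parameter model y ~ a * x below are assumptions chosen for illustration, not something from the original report. The idea would be to start from a baseline like this that works, then add back one ingredient at a time until the error appears.

```r
# No wrapper function, fixed n, fixed (non-random) data.
x <- c(1, 2, 3, 4)
y <- c(1.1, 1.9, 3.2, 3.9)

# A deliberately trivial one-parameter model (illustrative assumption).
model <- y ~ a * x

# Baseline fit with the data passed explicitly.
fit <- nls(model, data = data.frame(x = x, y = y), start = list(a = 1))
coef(fit)
```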

Duncan Murdoch


 
 joerg van den hoff
 
 Duncan Murdoch


 this is without all the overhead of my first