[R] problem with nls starting values

2012-09-27 Thread Benedikt Gehr

Hi

I would like to fit a non-linear regression to the following data:

quantiles <- seq(0.05, 0.95, 0.05)
slopes <- c( 0.00e+00,      1.622074e-04,  3.103918e-03,  2.169135e-03,
             9.585523e-04,  1.412327e-03,  4.288103e-05, -1.351171e-04,
             2.885810e-04, -4.574773e-04, -2.368968e-03, -3.104634e-03,
            -5.833970e-03, -6.011945e-03, -7.737697e-03, -8.203058e-03,
            -7.809603e-03, -6.623985e-03, -9.414477e-03)
plot(slopes ~ quantiles)

I want to fit two models: asymptotic decay and logistic decay (S-shaped).
I tried the self-starting functions SSlogis and SSasymp like this:

dframe <- data.frame(slopes, quantiles)
names(dframe) <- c("slopes", "quantiles")
summary(mod1 <- nls(slopes ~ SSlogis(quantiles, Asym, xmid, scal), data = dframe))
summary(mod1 <- nls(slopes ~ SSasymp(quantiles, Asym, resp0, lrc), data = dframe))


I also tried to specify the starting values myself, but I usually can't
even get nls() started: it's always some singular-gradient error or
some other related error message (stopped after 50 iterations, etc.).
When I leave out some values from the middle quantiles I manage to fit a
3-parameter logistic model, but if I use all the values it no longer
works.
I then simulated perfect asymptotic-decay data and tried to fit an
nls() with the correct parameter values, but it won't work either. What
am I doing wrong?
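
For illustration, two things worth checking (a sketch only; the start values
below are rough guesses, not fitted numbers):

## (a) what the self-starter proposes before nls() starts iterating;
##     this call may itself stop with a singular-gradient error on these data
getInitial(slopes ~ SSasymp(quantiles, Asym, resp0, lrc), data = dframe)

## (b) an explicit-start fit of a simple exponential decay, with warnOnly = TRUE
##     so a failed convergence is reported instead of aborting
nls(slopes ~ a - b * exp(-c * quantiles), data = dframe,
    start = list(a = -0.01, b = -0.01, c = 1),
    control = nls.control(warnOnly = TRUE))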


Any help would be most appreciated

Best

benedikt

--
Benedikt Gehr
Ph.D. Student

Institute of Evolutionary Biology and Environmental Studies
University of Zurich
Winterthurerstrasse 190
CH-8057 Zurich

Office 13 J 36b
Phone: +41 (0)44 635 49 72
http://www.ieu.uzh.ch/staff/phd/gehr.html



Re: [R] problem with nls starting values

2012-09-27 Thread Benedikt Gehr

Thanks for your reply.

I agree that an lm model would fit just as well; however, from a
mechanistic point of view the expectation would be a non-linear
relationship.


Also when I simulate data as in

y_val <- 115 - 118*exp(-0.12*(seq(1, 100) + rnorm(100, 0, 0.8)))
x_val <- seq(1, 100)
plot(y_val ~ x_val)
summary(mod1 <- nls(y_val ~ a - b*exp(-c*x_val), start = list(a = 115, b = 118, c = 0.12)))

I do not get convergence. I must obviously be doing something wrong.
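
For comparison, a sketch with additive noise instead of noise inside the
exponent (start values deliberately offset from the truth), which nls()
typically handles without trouble:

set.seed(1)
x_val <- 1:100
y_val <- 115 - 118*exp(-0.12*x_val) + rnorm(100, 0, 0.8)
summary(nls(y_val ~ a - b*exp(-c*x_val),
            start = list(a = 110, b = 110, c = 0.1)))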

Thanks for the help

Benedikt

On 27.09.2012 at 20:00, Bert Gunter wrote:

My guess:

You probably are overfitting your data. A straight line does about as
well as anything except for the 3 high leverage points, which the
minimization is probably having trouble with.

-- Bert






Re: [R] problem with nls starting values

2012-09-27 Thread Benedikt Gehr
Now I feel very silly! I swear I was trying this for a long time and it
didn't work. Now that I have closed and restarted R, it also works on my
machine.


So is the only problem that my model is overparameterized for the data I
have? But shouldn't it still be possible to fit an nls to these data?
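
For what it's worth, a quick way to look at Bert's straight-line point on the
original data would be an AIC comparison (a sketch; it assumes the
self-starting fit converges on the full data, which may not be the case):

lm_fit  <- lm(slopes ~ quantiles, data = dframe)
nls_fit <- nls(slopes ~ SSasymp(quantiles, Asym, resp0, lrc), data = dframe)
AIC(lm_fit, nls_fit)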


thanks for the help

On 27.09.2012 at 21:27, Berend Hasselman wrote:

y_val <- 115 - 118*exp(-0.12*(seq(1, 100) + rnorm(100, 0, 0.8)))
x_val <- seq(1, 100)
plot(y_val ~ x_val)
summary(mod1 <- nls(y_val ~ a - b*exp(-c*x_val), start = list(a = 115, b = 118, c = 0.12)))






[R] order a data frame by date with order

2012-05-16 Thread Benedikt Gehr
Hi

I have a rather large data frame (7000 rows with 28 columns) which I want
to sort by date. Below is an example of the data frame. The date column
is called DT, is a factor, and looks like this:

class(res.merge$DT)
[1] "factor"
head(res.merge$DT)
[1] 17.3.2012 13:54:02 17.3.2012 14:00:07 17.3.2012 14:30:25 17.3.2012 15:01:15
[5] 17.3.2012 15:32:14 17.3.2012 16:01:29
2530 Levels: 1.4.2012 00:00:52 1.4.2012 00:30:29 ... 9.5.2012 15:30:50

res.merge is the unordered data frame. Now I want to order the data frame
with:

res.ordered <- res.merge[order(as.POSIXct(as.character(res.merge$DT),
                                          format = "%d.%m.%Y %H:%M:%S")), ]

This works, in fact; however, for some reason there are always two entries
that end up at the end of the data frame for no obvious reason (see below;
09.05.2012 is the most recent date). And this is the case for different
data frames. The two entries at the end are always 25.3.2012 02:00:xx and
25.3.2012 02:30:xx.

Can anybody tell me what the problem is? Any help is most appreciated.

Best Benedikt

res.ordered[2545:2549,]
                     DT Typ  NOD     Day_s DOW_s   Time_s     Long      Lat
2547  9.5.2012 14:30:56 GPS 1893  9.5.2012    We 14:30:00 7.452218 46.43579
2548  9.5.2012 15:02:09 GPS 1893  9.5.2012    We 15:00:35 7.451983 46.43583
2549  9.5.2012 15:30:50 GPS 1893  9.5.2012    We 15:30:00 7.451973 46.43597
1845 25.3.2012 02:00:18 GPS 1848 25.3.2012    So 02:00:01 7.454266 46.45414
1846 25.3.2012 02:30:16 GPS 1848 25.3.2012    So 02:30:00 7.454413 46.45437
     Height TOF Status FO_GPS GPS_N AOT     Day_e DOW_e   Time_e   BV Temp  SOG
2547 1182.8   3      A      1   143  55  9.5.2012    We 14:30:56 3735   31 0.09
2548 1182.8   3      A      1   143  94  9.5.2012    We 15:02:09 3637   32 0.02
2549 1176.5   3      A      1   143  50  9.5.2012    We 15:30:50 3730   29 0.17
1845 1295.2   3      A      1   151  17 25.3.2012    So 02:00:18 3715    7 0.18
1846 1287.3   3      A      1   144  16 25.3.2012    So 02:30:16 3720    8 0.14
     Heading  SAE  HAE BW_2 BW_3  X..
2547   24.90 3.81 9.47 3666 3625 9.08
2548    7.86 0.51 7.17 3593 3586 9.11
2549  344.72 2.86 4.10 3662 3623 9.12
1845  335.54 3.53 5.63 3618 3618 0.81
1846   75.37 5.44 8.96 3618 3618 0.81




Re: [R] order a data frame by date with order

2012-05-16 Thread Benedikt Gehr
Hi,
many thanks for your answer. If I set tz = "GMT" it does the job. Great!
Thanks!

cheers

Benedikt
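
A sketch of the fix (column names as in the post above): the 25.3.2012 02:xx
times do not exist in the local CET/CEST time zone because of the switch to
daylight saving time, so as.POSIXct() returns NA for them, and those rows sort
to the end, unless a fixed time zone is given.

res.merge$DT_time <- as.POSIXct(as.character(res.merge$DT),
                                format = "%d.%m.%Y %H:%M:%S", tz = "GMT")
res.ordered <- res.merge[order(res.merge$DT_time), ]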

On 16.05.2012 at 12:20, jholtman [via R] wrote:
 Is this a daylight saving time problem? Check your timezone and see
 when it occurred; these times might not be legal.

 Sent from my iPad


[R] true time series lags behind fitted values in arima model

2010-10-29 Thread Benedikt Gehr

Hi

I am fitting an arima model to some time series X. When I compared the
fitted values of the model to the true time series, I realized that the
true time series lags one time step behind the fitted values of the
arima model, and this is the case for any model. When I did a simple
linear regression using lm() as a check, I found the same result: the
true series lags behind the fitted values. I don't understand this;
what am I doing wrong? Below I have copied some of the code to
demonstrate the issue.


## Analysis using arima()

X <- c(6.841067, 6.978443, 6.984755, 7.007225, 7.161198, 7.169790, 7.251534,  # the true time series
       7.336429, 7.356600, 7.413271, 7.404165, 7.480869, 7.498686, 7.429809,
       7.302747, 7.168251, 7.124798, 7.094881, 7.119132, 7.049250, 6.961049,
       7.013442, 6.915243, 6.758036, 6.665078, 6.730523, 6.702005, 6.905522,
       7.005191, 7.308986)

model100 <- arima(X, order = c(1,0,0), include.mean = TRUE, method = "ML")  # the arima model

resid100 <- residuals(model100)
Xfit100  <- X - resid100                                         # the fitted values
ts.plot(cbind(Xfit100, X), col = c(2, 4), main = "ARIMA(1,0,0)") # plot the true series vs the fitted values
legend(20, 7.5, c("true values", "fitted values"), pch = c(16, 16),
       col = c("blue", "red"))


## Same analysis using lm()

Y  <- X[2:30]   # X(t): the series without the first value
X1 <- X[1:29]   # X(t-1)

time <- seq(1978, 2006)
model1 <- lm(Y ~ X1)
summary(model1)
arima.intercept <- model1[[1]][[1]] / (1 - model1[[1]][[2]]); arima.intercept
model100[[1]][[2]]
plot(Y ~ time, type = "b")                        # plot the two series
points(fitted(model1) ~ time, type = "b", col = "green")
legend(1995, 7.5, c("true values", "fitted values"), pch = c(16, 16),
       col = c("black", "green"))
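
The apparent lag is expected if the fitted values are one-step-ahead
forecasts: each fitted value at time t is built from X[t-1]. A small check (a
sketch continuing the code above; it assumes the estimated AR coefficient is
positive):

cor(Xfit100[-1], X[-length(X)])  # essentially 1: fitted[t] is a linear function of X[t-1]
cor(Xfit100[-1], X[-1])          # smaller: fitted[t] forecasts X[t], it does not copy it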



Thank you very much for the help

best wishes

Benedikt



[R] problem with arima() function

2010-10-26 Thread benedikt . gehr

   Hi
   I posted this problem yesterday but didn't get a reply, so I am trying again today.
   I hope someone can help me with this.
   Thank you very much for the help.
   Cheers
   Benedikt

   I would like to use arima() to find the best arima model for a time
   series. The default in arima apparently is to use conditional sum of
   squares (CSS) to find the starting values and then use ML for the rest (as
   described on the help page).
   Now, using the default may lead to error messages saying "non-stationary
   AR part in CSS". When changing the default from CSS-ML to ML only, the
   minimization works. As far as I understand, arima doesn't require
   stationarity, but apparently CSS does.
   Can anyone tell me what exactly the CSS method does? And why is CSS-ML
   the default in R? For efficiency reasons? Because ML and CSS-ML give
   exactly the same estimates when applied to the same data. I tried to find
   out on Google but I couldn't find anything useful or understandable to
   me as a non-statistician.
   Here are some data that cause the error message:
   X <- c(6.841067, 6.978443, 6.984755, 7.007225, 7.161198, 7.169790, 7.251534,
          7.336429, 7.356600, 7.413271, 7.404165, 7.480869, 7.498686, 7.429809,
          7.302747, 7.168251, 7.124798, 7.094881, 7.119132, 7.049250, 6.961049,
          7.013442, 6.915243, 6.758036, 6.665078, 6.730523, 6.702005, 6.905522,
          7.005191, 7.308986)

   model.examp <- arima(X, order = c(7,0,0), include.mean = TRUE)                 # gives an error
   model.examp <- arima(X, order = c(7,0,0), include.mean = TRUE, method = "ML")  # gives no error
   Any help on this would be most appreciated.
   Many thanks for the help.
   Best wishes
   Benedikt


[R] non-stationary ar part in css

2010-10-25 Thread Benedikt Gehr

Hi

I would like to use arima() to find the best arima model for my time
series. The default in arima apparently is to use conditional sum of
squares (CSS) to find the starting values and then ML (as described on the
help page).
Now, using the default may lead to error messages saying "non-stationary
AR part in CSS". When changing the default to ML only, the minimization
works. As far as I understand, arima doesn't require stationarity, but
apparently CSS does.
Can anyone tell me what exactly the CSS method does? And why is CSS-ML
the default in R? For efficiency reasons? Because ML and CSS-ML give
exactly the same estimates when applied to the same data. I tried to find
out on Google but I couldn't find anything useful or understandable to me
as a non-statistician.


Here are some data that cause the error message:

X <- c(6.841067, 6.978443, 6.984755, 7.007225, 7.161198, 7.169790, 7.251534,
       7.336429, 7.356600, 7.413271, 7.404165, 7.480869, 7.498686, 7.429809,
       7.302747, 7.168251, 7.124798, 7.094881, 7.119132, 7.049250, 6.961049,
       7.013442, 6.915243, 6.758036, 6.665078, 6.730523, 6.702005, 6.905522,
       7.005191, 7.308986)


model.examp <- arima(X, order = c(7,0,0), include.mean = TRUE)                 # gives an error
model.examp <- arima(X, order = c(7,0,0), include.mean = TRUE, method = "ML")  # gives no error
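
A sketch of how the estimation methods can be compared side by side on the
same series (a low-order model is used here so that all three calls run
without error):

fit_css   <- arima(X, order = c(1,0,0), method = "CSS")
fit_ml    <- arima(X, order = c(1,0,0), method = "ML")
fit_cssml <- arima(X, order = c(1,0,0), method = "CSS-ML")
rbind(CSS = coef(fit_css), ML = coef(fit_ml), "CSS-ML" = coef(fit_cssml))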


Any help on this would be most appreciated.

Many thanks for the help.

best wishes

Benedikt



[R] the arima()-function and AICc

2010-09-28 Thread Benedikt Gehr

 Hi

I'm trying to fit arima models with the arima() function and I have two 
questions.


##
##1. ##
##
I have n observations in my time series. Now, no matter what arima(p,d,q)
model I fit, I always get n residuals. How is that possible?
For example: if I try this out myself on an AR(1) and calculate the
fitted values from the estimated coefficients, I can only calculate n-1
residuals. What is the first residual, residuals[1], in the model?

Does it have something to do with the mean?

Here is what I did (X is the time series I'm analysing):

X <- c(6.770705, 6.842524, 6.881832, 6.896694, 7.004967, 7.065750,
       7.139447, 7.227818, 7.274945, 7.333097, 7.350763, 7.404271, 7.426247,
       7.394454, 7.303650, 7.176984, 7.170972, 7.113736, 7.154326, 7.136678,
       7.103826, 7.146775, 7.084247, 7.016302, 6.784539, 6.705846, 6.709989,
       6.851557, 6.973064, 7.232223)


## The AR(1) model

model10 <- arima(X, order = c(1,0,0), include.mean = TRUE)

mu <- model10[[1]][[2]]
a  <- model10[[1]][[1]]

## Get the fitted values and residuals of the arima(1,0,0) model

fitted <- vector(mode = "numeric")
E      <- vector(mode = "numeric")
for (i in 2:30) {
  fitted[i] <- a*(X[i-1] - mu) + mu
  E[i]      <- X[i] - fitted[i]
}
fitted
E                    # innovations
residuals(model10)   # compare with the residuals from the arima() model
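
Regarding the first residual, a small check (a sketch; it rests on the
assumption that the first one-step prediction of the fitted model is simply
the estimated mean):

c(X[1] - mu, residuals(model10)[1])   # these two numbers should essentially agree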


##
##2. ##
##
I want to calculate the AICc value for model selection. Is there a way
to calculate the AICc from the model output without doing it manually? I
guess I could write a function myself somehow, but it's always nice to
have a built-in function against which to compare what one does. I know
that there has been some discussion on this on the list, but I didn't
find anything that helped me. I tried AICctab from the bbmle package, but
I couldn't figure out how to get my output from arima into the function.
I tried making a list of the logLik(model) output and several other
things, but it never worked.
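
A sketch of the manual route, for comparison (k counts the AR coefficient,
the mean, and the innovation variance; the usual small-sample correction is
added to the ordinary AIC):

k <- length(coef(model10)) + 1                 # +1 for sigma^2
n <- length(X)
AIC(model10) + 2 * k * (k + 1) / (n - k - 1)   # AICc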


Any help with this is most appreciated

many thanks in advance

cheers

Beni



[R] Re: an alternative to R for nonlinear stat models

2010-06-16 Thread benedikt . gehr

   Hi Paul

   Oh, OK, sorry. I am sending you below a copy of the R code I was using. It's very
   possible that my programming is slow as well; I'm very happy to learn.

   Sorry the code is a bit long, but it's structured like this and contains three
   main functions plus the optimization:



   1. Input - (I didn't send the data)

   2. Function to calculate the cell probabilities for the multinomial
   likelihood

   3. Function to calculate the multinomial

   4. Function to be optimized

   5. Then I set the starting values

   6. Optimization using mle2



   Hope that helps



   Cheers



   Beni

   ----- Paul Hiemstra p.hiems...@geo.uu.nl wrote: -----

 To: benedikt.g...@ieu.uzh.ch
 From: Paul Hiemstra p.hiems...@geo.uu.nl
 Date: 16.06.2010 09:36
 Cc: r-help@r-project.org, da...@otter-rsch.com
 Subject: Re: [R] an alternative to R for nonlinear stat models
 Hi Beni,
 Thanks for posting information that might be useful for people on the
 list. The only thing is that without a little more detail on how exactly
 you implemented things in R, we are left to guess whether the performance
 issues are a problem of R or whether your particular implementation was
 the problem. There are ways of implementing the same R code such that one
 version takes minutes and another 1-2 seconds. Furthermore, you are
 giving us no option to defend R ;).
 cheers,
 Paul
 --
 Drs. Paul Hiemstra
 Department of Physical Geography
 Faculty of Geosciences
 University of Utrecht
 Heidelberglaan 2
 P.O. Box 80.115
 3508 TC Utrecht
 Phone:  +3130 274 3113 Mon-Tue
 Phone:  +3130 253 5773 Wed-Fri
 http://intamap.geo.uu.nl/~paul
 http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770
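
A toy illustration of Paul's point about implementation choices (a made-up
example, unrelated to the model in question): the same sum of squares, first
computed element by element, then vectorised.

x <- runif(1e7)
system.time({ s <- 0; for (xi in x) s <- s + xi^2 })  # explicit loop: noticeably slower
system.time(sum(x^2))                                 # vectorised: near-instant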



[R] Re: an alternative to R for nonlinear stat models

2010-06-16 Thread benedikt . gehr

   Hi

   I think my first answer didn't go through due to the attachment,
   so I have copy-pasted my last answer, including the code, below.

   Sorry about that.



   Hi Paul

   Oh, OK, sorry. I am sending you below a copy of the R code I was using. It's very
   possible that my programming is slow as well; I'm very happy to learn.

   Sorry the code is a bit long, but it's structured like this and contains three
   main functions plus the optimization:



   1. Input - (I didn't send the data)

   2. Function to calculate the cell probabilities for the multinomial
   likelihood

   3. Function to calculate the multinomial

   4. Function to be optimized

   5. Then I set the starting values

   6. Optimization using mle2



   Hope that helps



   Cheers



   Beni

###
###

library(boot)
library(bbmle)
library(demogR)

## Simulated harvest numbers of a cohort harvested over 30 years (including
## random noise)
pop    <- N.true    # true population size
cohort <- C         # harvest matrix
R <- nrow(cohort)
L <- ncol(cohort)

h <- matrix(rep(Q, R),     ncol = a,     byrow = TRUE)   # estimated harvest rates
S <- matrix(rep(P[-1], R), ncol = a - 1, byrow = TRUE)   # survival rates

##
## Function to calculate the cell probabilities
##

predprob <- function(S = S, h = h) {
  R <- nrow(cohort)
  L <- ncol(cohort)
  p <- matrix(rep(0, R * L), ncol = L)   # matrix of cell probabilities
  p[R, 1] <- h[R, 1]
  p[1, L] <- h[1, L]
  ay <- rep(0, L + R + 1)   # vector of last cell probabilities, 1 - sum(rest); Gove (2002)
  ay[L + R] <- 1 - h[1, L]
  ay[2]     <- 1 - h[R, 1]
  index <- 3    # ay[1] unused, ay[2] = 1-h[R,1], ay[L+R] = 1-h[1,L], ay[L+R+1] unused

  for (i in -(R-2):(L-2)) {   # calculating the cell probabilities after Gove (2002)

    p[row(p) == col(p) - i] <-
      odiag(h, i) * c(1, cumprod((1 - odiag(h[1:(R-1), 1:(L-1)], i)) *
                                 odiag(S[1:(R-1), ], i)))

    ay[index] <- 1 - sum(odiag(p[1:R, ], i))

    index <- index + 1
  }

  p <- rbind(p, ay[1:L])
  p <- cbind(p, ay[length(ay):(L+1)])
  return(p)
}

predprob(S, h)   # test the function

   
####################################################################
## Function to calculate the multinomial - from Ben Bolker
####################################################################

## multinom WITHOUT ROUNDING & without error-checking
dmnom <- function(x, prob, log = FALSE) {
  r <- lgamma(sum(x) + 1) + sum(x * log(prob) - lgamma(x + 1))
  if (log) r else exp(r)
}

###
## Function to be optimized
###

mfun <- function(logN, S = S, h = h, cohort = cohort) {
  L <- ncol(cohort)
  R <- nrow(cohort)
  d <- matrix(rep(NA, (R+1) * (L+1)), ncol = L + 1)  # last row and last column of the matrix
  d[1:R, 1:L] <- log(cohort)                         # are the values to be optimized - log-transformed data!

  index  <- 1
  index1 <- (R + L)

  for (i in -(R-1):(L-1)) {                          # values for the top row
    if (i <= -(R-L)) {
      d[(R+1), (index+1)] <- log(exp(logN[index]) - sum(odiag(cohort, i)))
    } else if (i > -(R-L) & i < 1) {
      d[index1, (L+1)]    <- log(exp(logN[index]) - sum(odiag(cohort, i)))
    } else if (i > 0) {
      d[index1, (L+1)]    <- log(exp(logN[index]) - sum(odiag(cohort, i)))
    }

    index  <- index + 1
    index1 <- index1 - 1
  }

  pp <- predprob(S, h)

  lhood <- 0                       # the likelihood function - to be maximized

  for (i in -(R-1):(L-1)) {
    lhood <- lhood + (-dmnom(exp(odiag(d, i)), prob = odiag(pp, i), log = TRUE))
  }
  return(lhood)
}

logN <- log(rep(1000, (R+L-1)))    # test the function with test values
mfun(logN, S, h, cohort)

   
##
## Starting values for the optimization
##

N <- rep(500, (R+L-1))              # starting values for the optimization: N - x1 - x2 - x3
svec <- log(N)                      # transform to avoid constraints (x > 0)
names(svec) <- paste("logN", 1:(R+L-1), sep = "")
parnames(mfun) <- names(svec)

##
## Lower bounds for the L-BFGS-B method
##

index <- 1
lower <- rep(0, (R+L-1))

for (i in -(R-1):(L-1)) {
  lower[index] <- log(sum(odiag(cohort, i)))
  index <- index + 1
}

lower <- lower + 0.001
names(lower) <- names(svec)
exp(lower)                          # check lower bounds

#
## Optimization with mle2
#
## use bounded optimization

m1 <- mle2(mfun, start = svec,
           method = "L-BFGS-B", lower = lower, upper = 10,  # lower must be larger than
           vecpar = TRUE,                                   # sum(odiag(x)); if smaller, NaNs are produced
           data = list(S = S, h = h, cohort = cohort))

coef(m1)
summary(m1)
Nest <- exp(coef(m1))               # estimated cohort sizes for row and column 1


[R] an alternative to R for nonlinear stat models

2010-06-15 Thread benedikt . gehr

   Hi
   I implemented the age-structure model in Gove et al. (2002) in R, which is a
   nonlinear statistical model. However, running the model in R was very slow,
   so Dave Fournier suggested using the AD Model Builder (ADMB) software package and
   helped me implement the model there.
   ADMB was incredibly fast in running the model:
   while running the model in R took 5-10 minutes, depending on the settings,
   in ADMB it took 1-2 seconds!
   I'm reporting this so that people who have performance issues with nonlinear
   statistical models in R will know that there is a good free alternative for
   more difficult problems.
   There is also a help platform equivalent to the one for R, and the people
   running it are extremely helpful.
   I hope this might help someone.
   Cheers
   Beni


[R] running admb from R using system()

2010-05-31 Thread Benedikt Gehr

Hi

I'm trying to run an ADMB model from R by using the system() command.
The ADMB model runs fine when I run it from the ADMB command line or
when using Emacs. However, when I try it with system(), R crashes
every time. I tried using the R command line and the RGui, and it crashes
in both. I also tried it from different computers.


What I do is first set the directory where I keep the ADMB template
file with the corresponding ADMB data file, and then invoke the system
command either step by step or directly. I can build the model, but when
executing the model.exe file, R crashes.


##
## setting the working directory
setwd("Documents and Settings\\Beni User\\Desktop\\Transfer\\Model Fournier\\admb models\\simulated data\\stochastic\\pooled age classes")


## building the model - this works fine
system('makeadm stocpool')

## running the model - makes R crash
system('stoc.exe')
###
## or directly like this:
setwd("Documents and Settings\\Beni User\\Desktop\\Transfer\\Model Fournier\\admb models\\simulated data\\stochastic\\pooled age classes")

system("./stocpool")   # makes R crash right away

Does anyone know why this happens and how I can make this work?
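
One diagnostic worth trying (a sketch; the path below is made up and would
need to point at the real executable): running the program through system2()
with its output captured sometimes gives a more informative failure message
than a bare system() call.

out <- system2("C:/path/to/stocpool.exe", stdout = TRUE, stderr = TRUE)
out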

Thanks a lot for the help!!

cheers

Beni



[R] read in data file into R

2010-05-31 Thread Benedikt Gehr

Hi
I'm trying to read a data file with output from another program (ADMB)
into R for further analysis. However, I have not been very successful. The
file extension of the data file is file.rep, but it also doesn't help when
I change it to file.txt.


I have two problems/questions:

1. The file is a single line of n values, separated by a single space or
tab each. These values represent a time series of length n. How can I
make a numeric vector out of these data?
When I use the read.table command to read in the file, R produces a
list of as many objects as there are values (n). However, what I need is
a vector of length n in order to work with the data. When I try to
coerce the list into a single vector using as.vector, this doesn't work.
When I specify sep = "\n", I get a list where all the n values are
treated as one value, and I can't extract single values.


2. Related to the issue above: when I have a data file which consists of
two objects, one a matrix and the other a vector, can I read the file
into R all at once as a list with two objects and then extract the matrix
and the vector and work with them? Or is it necessary to first make two
files, one for each object?


Below I have copied a subset of my data files for illustration. This
seems a very silly question, but I just didn't manage to do it.

Thanks a lot for the help

cheers

beni

This is a subset of my data file for 1.:

Time series of reconstructed populations
3709.17 2660.93 2045.36 2090.33 2096.93 2205.65 2083.72 1797.53 1884.61 
1946.59 2101.66 2220.03 2080.04 2097.07 2332.9 2325.47 2091.67 2091.54 
2072.38 2025.31 1919.54 1781.95 1867.96 1685.12 1826.31 1654.25 1593.84 
1430.96 1539.89 1587.35 1472.32 1737.02 1510.37 1570.15 1723.21 1755.3 
1843.85 1829.2 1880.63 1916.79 1945.86 2096.64 2246.67 2101.16 2134.39 
2018.1 2174.04 


This is a subset of the data file for 2.:
Reconstructed population
203.026 200.005 205.206 217.36 279.415 750.965
495.041 91.3615 162.004 147.748 156.499 492.444
463.284 222.768 74.0028 116.643 106.379 303.677
468.042 208.478 180.442 53.282 83.9828 194.375
460.216 210.619 168.867 129.918 38.3631 135.857
461.88 207.097 170.601 121.584 93.5413 80.3142
474.857 207.846 167.749 122.833 87.5406 98.5
479.117 213.686 168.355 120.779 88.4396 101.233
480.269 215.603 173.085 121.216 86.961 102.94
483.206 216.121 174.638 124.622 87.2753 102.538
486.657 217.443 175.058 125.739 89.7275 102.608
490.516 218.996 176.128 126.042 90.5324 104.401
494.019 220.732 177.386 126.813 90.7501 105.676
497.345 222.308 178.793 127.718 91.305 106.327
500.797 223.805 180.07 128.731 91.9571 106.979
504.331 225.359 181.282 129.65 92.6863 107.701
507.892 226.949 182.54 130.523 93.3482 108.507
511.458 228.551 183.829 131.429 93.9768 109.296
515.039 230.156 185.127 132.357 94.629 110.054
518.65 231.767 186.426 133.291 95.2967 110.818
522.291 233.393 187.732 134.227 95.9696 111.595
525.957 235.031 189.048 135.167 96.6434 112.381
529.648 236.681 190.375 136.115 97.32 113.171
533.364 238.342 191.711 137.07 98.0025 113.964
537.106 240.014 193.057 138.032 98.6905 114.763
541.61 241.698 194.411 139.001 99.3832 115.569
545.435 243.725 195.775 139.976 100.081 116.38
549.312 245.446 197.417 140.958 100.783 117.197
553.294 247.19 198.811 142.14 101.49 118.019
557.349 248.982 200.224 143.144 102.341 118.847

time series of the reconstructed population
1855.98 1545.1 1286.75 1188.6 1143.84 1135.02 1159.33 1171.61 1180.07 
1188.4 1197.23 1206.61 1215.38 1223.8 1232.34 1241.01 1249.76 1258.54 
1267.36 1276.25 1285.21 1294.23 1303.31 1312.45 1321.66 1331.67 1341.37 
1351.11 1360.94 1370.89
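
A sketch of one way to read both layouts with scan() (the skip/nlines/ncol
values below are guesses based on the snippets shown above, not something
taken from the real files):

## question 1: scan() returns a plain numeric vector
ts_vec <- scan("file.rep", skip = 1)            # skip the title line

## question 2: read each block of the combined file separately
pop_mat <- matrix(scan("file2.rep", skip = 1, nlines = 30),
                  ncol = 6, byrow = TRUE)
ts_vec2 <- scan("file2.rep", skip = 33)         # the trailing time series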




[R] optimize a joint lieklihood with mle2

2010-03-24 Thread Benedikt Gehr, Institut für Evolutionsbiologie und Umweltwissenschaften

Hi

I'm trying to maximize a joint likelihood of two likelihoods (likelihood 1 and
likelihood 2) in mle2, where the parameters I estimate in likelihood 2 go
into likelihood 1. In likelihood 1 I estimate the vector logN, with
length 37, and for likelihood 2 I estimate a vector s of length 8.

The values of s in likelihood 2 are used in likelihood 1.

I have 2 questions:

##1
I manage to write the joint likelihood (it's very long, which is why I don't
put it here), but I don't know how to write the correct argument for the
mle2 function. Here is what I've tried:


##
The likelihood function to be optimized is called mfun(logN, s, h = h,
cohort = cohort). I want to estimate the vectors logN and s. I define the
starting values for logN as svec1 and for s as svec2.

then I write for the optimization:

m1 <- mle2(mfun, start = list(svec1 = logN, svec2 = s),
           method = "L-BFGS-B", lower = lower, upper = 10,
           vecpar = TRUE,
           data = list(h = h, cohort = cohort))

but I get the error:
Error in mle2(mfun, start = list(svec1 = logN, svec2 = s), method =
"L-BFGS-B",  :
  some named arguments in 'start' are not arguments to the specified
log-likelihood function


How can I correctly write the function for this?

#

##2
Besides, I'm using the constrained optimization method L-BFGS-B, but I would
only like to use it for logN and not for s. Is that at all possible? I guess
I could use the same limits also for s, because s falls in the range of the
logN values, but how can I tell mle2 that I am combining 2 likelihoods?
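
One pattern that addresses both questions (a sketch only; it assumes mfun is
rewritten to accept a single parameter vector and split it internally, and
the wrapper name is made up):

mfun_joint <- function(pars, h, cohort) {
  logN <- pars[grep("^logN", names(pars))]    # first block: the 37 logN values
  s    <- pars[grep("^s",    names(pars))]    # second block: the 8 s values
  mfun(logN, s, h = h, cohort = cohort)       # existing joint negative log-likelihood
}

theta <- c(svec1, svec2)
names(theta) <- c(paste0("logN", seq_along(svec1)),
                  paste0("s",    seq_along(svec2)))
parnames(mfun_joint) <- names(theta)

m1 <- mle2(mfun_joint, start = theta, vecpar = TRUE,
           method = "L-BFGS-B",
           lower = c(lower, rep(-Inf, length(svec2))),   # bounds only bite on logN
           upper = c(rep(10, length(svec1)), rep(Inf, length(svec2))),
           data = list(h = h, cohort = cohort))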


thanks so much for the help

Benedikt



[R] likelihood profiles for multiple parameters

2010-03-24 Thread Benedikt Gehr, Institut für Evolutionsbiologie und Umweltwissenschaften

Hi

I'm using mle2 for the optimization of a multinomial likelihood. The model fit
is not very good, and I would like to look at the likelihood profiles.
However, I'm optimizing many parameters (~40), so using the function profile()
takes forever and is not practical. How can I calculate approximate
likelihood profiles? Is there a function for this in R?
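
One cheap approximation (a sketch; m1 stands for the fitted mle2 object):
quadratic (Wald) intervals built from the estimated covariance matrix, which
need no re-optimization at all.

est <- coef(m1)
se  <- sqrt(diag(vcov(m1)))
cbind(lower = est - 1.96 * se, upper = est + 1.96 * se)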


thanks a lot for the help

Beni



Re: [R] using near-zero probabilities in optimization

2010-03-10 Thread Benedikt Gehr, Institut für Evolutionsbiologie und Umweltwissenschaften
    p[row(p) == col(p) - i] <-
      odiag(h, i) * c(1, cumprod((1 - odiag(h[1:(R-1), 1:(L-1)], i)) *
                                 odiag(S[1:(R-1), ], i)))

    ay[index] <- 1 - sum(odiag(p[1:R, ], i))

    index <- index + 1
  }

  p <- rbind(p, ay[1:L])
  p <- cbind(p, ay[length(ay):(L+1)])
  return(p)
}

predprob(S, h)   # test the function


####################################################################
## Function to calculate the multinomial - from Ben Bolker
####################################################################

## multinom WITHOUT ROUNDING & without error-checking
dmnom <- function(x, prob, log = FALSE) {
  r <- lgamma(sum(x) + 1) + sum(x * log(prob) - lgamma(x + 1))
  if (log) r else exp(r)
}

###
## Function to be optimized
###

mfun <- function(logN, S = S, h = h, cohort = cohort) {
  L <- ncol(cohort)
  R <- nrow(cohort)
  d <- matrix(rep(NA, (R+1) * (L+1)), ncol = L + 1)  # last row and last column of the matrix
  d[1:R, 1:L] <- log(cohort)                         # are the values to be optimized - log-transformed data!

  index  <- 1
  index1 <- (R + L)

  for (i in -(R-1):(L-1)) {                          # values for the top row
    if (i <= -(R-L)) {
      d[(R+1), (index+1)] <- log(exp(logN[index]) - sum(odiag(cohort, i)))
    } else if (i > -(R-L) & i < 1) {
      d[index1, (L+1)]    <- log(exp(logN[index]) - sum(odiag(cohort, i)))
    } else if (i > 0) {
      d[index1, (L+1)]    <- log(exp(logN[index]) - sum(odiag(cohort, i)))
    }

    index  <- index + 1
    index1 <- index1 - 1
  }

  pp <- predprob(S, h)

  lhood <- 0                       # the likelihood function - to be maximized

  for (i in -(R-1):(L-1)) {
    lhood <- lhood + (-dmnom(exp(odiag(d, i)), prob = odiag(pp, i), log = TRUE))
  }
  return(lhood)
}

logN <- log(rep(1000, (R+L-1)))    # test the function with test values
mfun(logN, S, h, cohort)

##
## Starting values for the optimization
##

N <- rep(500, (R+L-1))              # starting values for the optimization: N - x1 - x2 - x3
svec <- log(N)                      # transform to avoid constraints (x > 0)
names(svec) <- paste("logN", 1:(R+L-1), sep = "")
parnames(mfun) <- names(svec)

##
## Lower bounds for the L-BFGS-B method
##

index <- 1
lower <- rep(0, (R+L-1))

for (i in -(R-1):(L-1)) {
  lower[index] <- log(sum(odiag(cohort, i)))
  index <- index + 1
}

lower <- lower + 0.001
names(lower) <- names(svec)
exp(lower)                          # check lower bounds

#
## Optimization with mle2
#

## use bounded optimization

m1 <- mle2(mfun, start = svec,
           method = "L-BFGS-B", lower = lower, upper = 10,  # lower must be larger than
           vecpar = TRUE,                                   # sum(odiag(x)); if smaller, NaNs are produced
           data = list(S = S, h = h, cohort = cohort))

coef(m1)

Nest <- exp(coef(m1))               # estimated cohort sizes for row and column 1




#
## Analyse the goodness of fit
#

Nya <- matrix(rep(0, length(cohort)), ncol = L)   # project the N estimates into a matrix

Nya[1, 1:L] <- Nest[R:length(Nest)]
Nya[2:R, 1] <- rev(Nest[1:(R-1)])


stot <- (1 - Q[1:(a-1)]) * P[2:a]    # derived survival - product of (1-h(i)) and s(i)

smat <- matrix(rep(stot, R), ncol = L - 1, byrow = TRUE)

for (i in 1:ncol(smat)) {            # project the N estimates through time
  Nya[2:R, i+1] <- Nya[1:(R-1), i] * smat[1:(R-1), i]
}


###
## Expected harvest values compared with observed harvest numbers
###

exp <- Nya * h
obs <- cohort

gof <- ((obs - exp)^2) / exp         # goodness of fit for the single estimates

gof.stat <- sum(gof)                 # overall GOF statistic
p.value  <- 1 - pchisq(gof.stat, (length(obs) - 1))   # p-value with appropriate degrees of freedom

p.value

plot(obs ~ exp, main = "Observed harvest numbers vs expected from the model",
     xlim = c(0, 200), ylim = c(0, 200))

abline(0, 1)

##
## Plots
##
## Plot the true population trend and compare it with the estimated trend

plot(rowSums(N.true) ~ seq(1:y), ylim = c(0, 5000), col = "red", pch = 17,
     main = "True vs estimated population trend")

lines(rowSums(N.true) ~ seq(1:y), col = "red")

points(rowSums(Nya) ~ seq(1:y), ylim = c(0, 5000), col = "blue", pch = 16)
lines(rowSums(Nya) ~ seq(1:y), col = "blue")

###
###
##
## END
##

Thanks a lot for the help

Benedikt



On Tue, 9 Mar 2010 15:34:09 (UTC), Ben Bolker bol...@ufl.edu wrote:

 [quoted message truncated in the archive]

[R] using near-zero probabilities in optimization

2010-03-09 Thread Benedikt Gehr

Hi there

I am using mle2 for a multinomial likelihood optimization problem. My
function works fine when I'm using simulated data; however, my cell
probabilities for the true data in the multinomial likelihood are
sometimes very small (in some cases 0.00...), and the estimated point
estimates fit the true values quite poorly. Is there a way to handle
near-zero probabilities in maximum likelihood optimization?
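
One common safeguard (a sketch; p stands for the vector of cell
probabilities): bound the probabilities away from zero before they are handed
to the likelihood, then renormalise.

p_safe <- pmax(p, 1e-12)         # floor the near-zero cells
p_safe <- p_safe / sum(p_safe)   # renormalise so they still sum to 1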


Thanks a lot for the help

best wishes

Benedikt



[R] function odiag(): assigning values to off-diagonal

2010-03-01 Thread Benedikt Gehr

Hi
I'm trying to use the function odiag(x) for matrix calculations. I would
like to assign new values to an off-diagonal of a matrix.

With the diag(x) function I can write something like

p <- matrix(seq(1:12), ncol = 4)
p.new <- matrix(rep(0, 12), ncol = 4)

diag(p.new) <- diag(p)
p.new

But this won't work with odiag. How can I turn odiag(x) into something
like diag(x) in order to assign new values to off-diagonals?
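
One workaround (a sketch): build a logical index with row() and col(); such
an index can be used on the left-hand side of an assignment.

p     <- matrix(1:12, ncol = 4)
p.new <- matrix(0, nrow = 3, ncol = 4)
k <- 1                                              # k-th superdiagonal
p.new[row(p.new) == col(p.new) - k] <- p[row(p) == col(p) - k]
p.new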

In the end I would like to do something like this:

h <- matrix(rep(c(0.2, 0.3, 0.3, 1), 4), ncol = 4, byrow = TRUE)  # harvest rates for the different years

S <- matrix(rep(c(0.9, 0.8, 0.7), 4), ncol = 3, byrow = TRUE)     # survival rates

for (i in 0:(L-2)) {

  odiag(p, i) <-
    odiag(h, i) * c(1, cumprod((1 - odiag(h[1:R, 1:(L-1)], i)) * odiag(S[1:R, 1:(L-1)], i)))

  p[Rp-i, Lp] <- 1 - sum(odiag(p[1:Rp, -Lp], i))   ## add one more value

  p[1, L]  <- h[1, L]
  p[2, Lp] <- 1 - p[1, L]
}

Does anybody know how I can do this?

Thank you very much for the help

Best wishes

Ben



[R] using mle2 for multinomial model optimization

2010-02-12 Thread Benedikt Gehr

Hi there
I'm trying to find the MLE of a multinomial model, L(N, h, S | x). N is the
only parameter I want to estimate; it enters through the number of successes
for the last cell probability. These successes are given by:
p^(N-x1-x2-...-xi).
All the other parameters (i.e. h and S) I know from somewhere else.
Here is what I've tried to do so far for an imaginary data set:

###

cohort <- c(50, 54, 50)   # imaginary harvest numbers of a cohort harvested
                          # over 3 years


l <- length(cohort) + 1   # nr of cell probabilities
h <- c(0.2, 0.3, 1)       # harvest rates for the 3 different years
S <- c(0.9, 0.8)          # survival rates

mfun <- function(d) {
  S <- S                  # survival rate
  h <- h                  # harvest rate
  l <- length(d)
  p <- rep(0, l)          # cell probabilities
  p[1] <- h[1]

  prod <- (1 - h[1]) * S[1]

  for (i in 2:(l-1)) {
    p[i] <- prod * h[i]
    prod <- prod * (1 - h[i]) * S[i]
  }
  p[l] <- 1 - sum(p[-l])                      # last cell probability
  -dmultinom(exp(d), prob = p, log = TRUE)    # exp(d): back-transform the estimates
}

c <- c(cohort, 100)       # untransformed initial values for the optimization;
                          # 100 = N - x1 - x2 - x3

nvec <- c(rep("x", l-1), "N")   # names for the initial vector
nrs  <- c(1:(l-1), 1)           # nrs for the initial vector
svec <- log(c)                  # transformation of the data to avoid constraints (x > 0)

names(svec) <- paste(nvec, nrs, sep = "")
parnames(mfun) <- names(svec)

m1 <- mle2(mfun, start = svec,
           vecpar = TRUE, fixed = svec[1:(l-1)])   # only estimate N - x1 - x2 - x3;
                                                   # everything else is fixed


coef(m1)


The function mfun works; however, the mle2 call won't. How can I
fix this?
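
Since N is the only free quantity here, a brute-force check is also possible
(a sketch using the objects defined above; the grid limits are arbitrary):
evaluate mfun on a grid of candidate values for the last cell and take the
minimum.

Nlast <- 1:2000                                          # candidate values of N - x1 - x2 - x3
nll   <- sapply(Nlast, function(n) mfun(c(log(cohort), log(n))))
Nlast[which.min(nll)]                                    # value with the smallest negative log-likelihood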


Thanks so much for your help

Beni



[R] compute maximum likelihood estimator for a multinomial function

2009-11-04 Thread Benedikt Gehr

Hi there

I am trying to learn how to compute MLEs in R for a multinomial negative
log-likelihood function.
I am using for this the book by B. Bolker, "Ecological Models and Data in
R", chapter 6: "Likelihood and all that". But he has no example for
multinomial functions.


What I did is the following:
I first defined a function for the negative log-likelihood:

###
log.multi <- function(d, p) {
  -sum(dmultinom(d, prob = p, log = TRUE))
}

## then defined the parameters

d <- data       ## (in my case counts of animals found dead in different
                ##  age classes 1:19)
p <- d/sum(d)   ## or some values close to the analytical solution n(i)/N
                ## for the probabilities


## then I used the optim function like this:

opt1 <- optim(par = p, fn = log.multi, d = d)
#

This works perfectly when I use some simple imaginary numbers instead of
my real data, e.g. d <- c(3, 6, 9).


However, for my real data this doesn't work, apparently because the
differences between the different counts are too large.

Here are my data:

###
d <- count
d
[1]  7 15  9  3  6 13 11 17 15 36 38 52 49 53 33 20  6  2  1
###

I get the error message: probabilities can neither be negative nor all 0.
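
A common way around that error (a sketch, not from the book): optimise on an
unconstrained scale and map back to probabilities with a softmax, so the
optimiser can never propose negative probabilities.

softmax <- function(z) exp(z) / sum(exp(z))
nll <- function(z, d) -dmultinom(d, prob = softmax(z), log = TRUE)
opt1 <- optim(par = rep(0, length(d)), fn = nll, d = d, method = "BFGS")
round(softmax(opt1$par), 3)   # close to the observed proportions d / sum(d)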


So I wanted to try the mle() or mle2() functions from the packages
stats4 and bbmle, respectively, but I just can't get them running.

I tried it like this:

###
library(bbmle)

log.multi <- function(d, p.est) {
  -sum(dmultinom(x = d, prob = p.est, log = TRUE))
}

d <- c(3, 5, 7)
p.est <- d/sum(d)

m1 <- mle2(minuslogl = log.multi, start = list(p.est = p.est),
           data = list(N = sum(d), d = d))
###

I get the error:
Error in dmultinom(x = d, prob = p.est, log = T) :
  x[] and prob[] must be equal length vectors.

And no matter what I tried (and I tried for a long time!!), I can't
figure out what the problem is. I'm sure it is a very silly mistake, but I
don't know what it is.

I think it must have something to do with how I define the parameters.

Could anyone help me with this?

Thank you so much in advance

Benedikt
