Re: [R] Trying to replicate error message in subset()
Okay. First, I apologise for my poor use of terminology (error vs. warning). Second, thanks for pointing out what I hadn't noticed before: when I pass one case for selection to subset(), I get all observations. When I pass two cases (as a vector), I get every second observation for both cases (if both are present; if not, I just get every second observation for the one that is present). The same happens for three cases, as pointed out by Petr below.

So, trying the %in% operator, I get slightly different behaviour, but the selection still seems dependent on the length of the vector given as a selector:

b <- c("D", "F", "A")
new2.dat <- subset(ex.dat, b %in% x1)
new2.dat
           y1 x1 x2
1  2.34870479  A  B
3  1.66090055  A  B
5 -0.07904798  A  B
7  2.07053656  A  B
9  2.97980444  A  B
...

Now I just get every second observation, over all cases of x1. Probably it doesn't get as far as A because F is not present? According to the documentation on subset(), the function works on rows of data frames. I'm guessing this explains the behaviour I'm seeing: somehow, the length of the vector being passed as the 'subset' argument is dictating which rows to evaluate.

So, can anyone offer advice on how to select EVERY instance of multiple cases in a data frame (i.e., all cases of both A and D from ex.dat), or will subset() always be tied to the length of the 'subset' argument when a vector is passed to it?

Cheers, Mike

At 02:46 AM 12/02/2007, Petr Pikal wrote:

Hi

It is not an error, it is just a warning (a beeping tea kettle with boiling water is also not an error :-) and it tells you pretty explicitly what is wrong. See the length of your objects:

a <- c("D", "F", "A")
new.dat <- subset(ex.dat, x1 == a)
Warning messages:
1: longer object length is not a multiple of shorter object length in: is.na(e1) | is.na(e2)
2: longer object length is not a multiple of shorter object length in: `==.default`(x1, a)
new.dat
           y1 x1 x2
3   0.5977786  A  B
6   2.5470739  A  B
9   0.9128595  A  B
12  1.0953531  A  D
15  2.4984470  A  D
18  1.7289529  A  D
61 -0.4848938  D  B
...

You can do better with the %in% operator.
HTH
Petr

On 12 Feb 2007 at 1:51, Michael Rennie wrote:

Date sent: Mon, 12 Feb 2007 01:51:54 -0500
To: r-help@stat.math.ethz.ch
From: Michael Rennie [EMAIL PROTECTED]
Subject: [R] Trying to replicate error message in subset()

Hi there,

I am trying to replicate an error message in subset() to see what it is that I'm doing wrong with the datasets I am trying to work with. Essentially, I am trying to pass a string vector to subset() in order to select a specific collection of cases (i.e., I have data for these cases in one table, and want to select data from another table that match up with the cases in the first table). The error I get is as follows:

Warning messages:
1: longer object length is not a multiple of shorter object length in: is.na(e1) | is.na(e2)
2: longer object length is not a multiple of shorter object length in: `==.default`(LAKE, g)

Here is an example case I've been working with (which works) that I've been trying to break, such that I can get this error message and figure out what I am doing wrong in my case:

y1 <- rnorm(100, 2)
x1 <- rep(1:5, each = 20)
x2 <- rep(1:2, each = 10, times = 5)
ex.dat <- data.frame(cbind(y1, x1, x2))
ex.dat$x1 <- factor(ex.dat$x1, labels = c("A", "B", "C", "D", "E"))
ex.dat$x2 <- factor(ex.dat$x2, labels = c("B", "D"))
a <- c("D", "F")
a
new.dat <- subset(ex.dat, x1 == a)
new.dat

I thought maybe I was getting errors because I had cases in my selection vector ('a' in this case) that weren't in my ex.dat list, but subset() handles this fine and just gives me what it can find in the larger list. Any thoughts on how I can replicate the error? As far as I can tell, the only difference between the case where I am getting errors and the example above is that the levels of x1 in my case are words (i.e., "Smelly", "Howdy"), but strings are strings, aren't they?

Mike

Michael Rennie
Ph.D. Candidate, University of Toronto at Mississauga
3359 Mississauga Rd. N.
Mississauga, ON L5L 1C6
Ph: 905-828-5452  Fax: 905-828-3792
www.utm.utoronto.ca/~w3rennie

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

Petr Pikal [EMAIL PROTECTED]
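The fix Petr is hinting at: put the data-frame column on the left-hand side of %in%, so each row's x1 is tested for membership in the selection vector. Written the other way round (as in the follow-up above), the condition is a short logical vector that subset() recycles down the rows, which is exactly the every-second-row pattern reported. A minimal sketch, reusing the thread's example data:

```r
# Rebuild the example data from the thread
y1 <- rnorm(100, 2)
x1 <- rep(1:5, each = 20)
ex.dat <- data.frame(y1, x1)
ex.dat$x1 <- factor(ex.dat$x1, labels = c("A", "B", "C", "D", "E"))

a <- c("D", "F")

# Wrong way round: 'a %in% x1' has length 2 (one test per element of 'a'),
# and subset() recycles c(TRUE, FALSE) down all 100 rows
bad <- subset(ex.dat, a %in% x1)
nrow(bad)   # 50: every second row, regardless of its x1 value

# Right way round: one test per row of the data frame
good <- subset(ex.dat, x1 %in% a)
nrow(good)  # 20: all the D rows; the absent "F" is simply ignored
```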
[R] Trying to replicate error message in subset()
Hi there,

I am trying to replicate an error message in subset() to see what it is that I'm doing wrong with the datasets I am trying to work with. Essentially, I am trying to pass a string vector to subset() in order to select a specific collection of cases (i.e., I have data for these cases in one table, and want to select data from another table that match up with the cases in the first table). The error I get is as follows:

Warning messages:
1: longer object length is not a multiple of shorter object length in: is.na(e1) | is.na(e2)
2: longer object length is not a multiple of shorter object length in: `==.default`(LAKE, g)

Here is an example case I've been working with (which works) that I've been trying to break, such that I can get this error message and figure out what I am doing wrong in my case:

y1 <- rnorm(100, 2)
x1 <- rep(1:5, each = 20)
x2 <- rep(1:2, each = 10, times = 5)
ex.dat <- data.frame(cbind(y1, x1, x2))
ex.dat$x1 <- factor(ex.dat$x1, labels = c("A", "B", "C", "D", "E"))
ex.dat$x2 <- factor(ex.dat$x2, labels = c("B", "D"))
a <- c("D", "F")
a
new.dat <- subset(ex.dat, x1 == a)
new.dat

I thought maybe I was getting errors because I had cases in my selection vector ('a' in this case) that weren't in my ex.dat list, but subset() handles this fine and just gives me what it can find in the larger list. Any thoughts on how I can replicate the error? As far as I can tell, the only difference between the case where I am getting errors and the example above is that the levels of x1 in my case are words (i.e., "Smelly", "Howdy"), but strings are strings, aren't they?

Mike

Michael Rennie
Ph.D. Candidate, University of Toronto at Mississauga
3359 Mississauga Rd. N.
Mississauga, ON L5L 1C6
Ph: 905-828-5452  Fax: 905-828-3792
www.utm.utoronto.ca/~w3rennie
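To the literal question in this post (how to reproduce the warning): with == the recycling warning only appears when the number of rows is not an exact multiple of the selector's length. A length-2 selector divides the 100 rows evenly, so it recycles silently; a length-3 selector does not. A sketch, reusing the example data:

```r
y1 <- rnorm(100, 2)
x1 <- factor(rep(1:5, each = 20), labels = c("A", "B", "C", "D", "E"))
ex.dat <- data.frame(y1, x1)

# length 2: 100 %% 2 == 0, so == recycles without complaint
quiet <- subset(ex.dat, x1 == c("D", "F"))

# length 3: 100 %% 3 != 0, which reproduces the recycling warning
w <- tryCatch(subset(ex.dat, x1 == c("D", "F", "A")),
              warning = function(w) conditionMessage(w))
w   # "longer object length is not a multiple of shorter object length"
```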
[R] Losing factor levels when moving variables from one context to another
Hi there,

I'm currently trying to figure out how to keep my factor levels for a variable when moving it from one data frame or matrix to another. Example below:

vec1 <- rep(10, 5)
vec2 <- rep(30, 5)
vec3 <- rep(80, 5)
vecs <- c(vec1, vec2, vec3)
resp <- rnorm(15, 2)
dat <- as.data.frame(cbind(resp, vecs))
dat$vecs <- factor(dat$vecs)
dat

R returns:

                  resp vecs
1     1.57606068767956   10
2     2.30271782269308   10
3     2.39874788444542   10
4    0.963987738423353   10
5     2.03620782454740   10
6  -0.0706713324725649   30
7     1.49001721222926   30
8     2.00587718501980   30
9    0.450576585429981   30
10    2.87120375367357   30
11    2.25575058079324   80
12    2.03471288724508   80
13    2.67432066972984   80
14    1.74102136279177   80
15    2.29827581276955   80

and now:

newvar <- rnorm(15, 4)
newdat <- as.data.frame(cbind(newvar, dat$vecs))
newdat

R returns:

     newvar V2
1  4.300788  1
2  5.295951  1
3  5.099849  1
4  3.211045  1
5  3.703554  1
6  3.693826  2
7  5.314679  2
8  4.70      2
9  3.534515  2
10 4.037401  2
11 4.476808  3
12 4.842449  3
13 3.109677  3
14 4.752961  3
15 4.445216  3

I seem to have lost everything I once had associated with vecs, and it's turned my actual values into arbitrary groupings. I assume this has something to do with the behaviour of factors? Does anyone have any suggestions on how to get my original levels, etc., back?

Cheers, Mike

Michael Rennie
Ph.D. Candidate, University of Toronto at Mississauga
3359 Mississauga Rd. N.
Mississauga, ON L5L 1C6
Ph: 905-828-5452  Fax: 905-828-3792
www.utm.utoronto.ca/~w3rennie
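What happens above is that cbind() builds a numeric matrix, so the factor is silently converted to its underlying integer codes before as.data.frame() ever sees it. Passing the columns to data.frame() directly keeps the factor intact. A sketch with the same toy data:

```r
vecs <- factor(c(rep(10, 5), rep(30, 5), rep(80, 5)))
resp <- rnorm(15, 2)
dat  <- data.frame(resp, vecs)          # no cbind(): the factor survives

newvar <- rnorm(15, 4)
newdat <- data.frame(newvar, vecs = dat$vecs)
levels(newdat$vecs)                     # "10" "30" "80": labels preserved

# If only the integer codes are left, the labels can be restored from
# the original factor's levels:
codes <- as.integer(dat$vecs)
back  <- factor(levels(dat$vecs)[codes], levels = levels(dat$vecs))
identical(back, dat$vecs)               # TRUE
```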
Re: [R] plotting results from tapply; using indexing to incorporate error bars
Hi Marc,

Thanks for the examples; this is great, and has helped me understand a lot more about what's going on in the plotting functions. Now that I'm trying to work error bars into this, I was curious if someone might give me a hand indexing this properly, so that I can format my error bars to match the formatting of the grouping variables that match the lines in the plot.

Here's a re-worked example from my first submission where I calculate standard errors for plotting on the figure, plot the error bars, and try to index them to match the appropriate lines. The values appear to be in the right place (I've turned on the legend option for the interaction plot to help verify this); however, my attempts at matching the colours on the bars to the colours on the lines fail miserably (as you'll see if you execute the code below). Is there any way to assign my colours to match them in a way that makes more sense?

Thanks for your help so far,
Mike

y1 <- rnorm(40, 2)
x1 <- rep(1:2, each = 20)
x2 <- rep(1:2, each = 10, times = 2)
ex.dat <- data.frame(cbind(y1, x1, x2))
ex.dat$x1 <- factor(ex.dat$x1, labels = c("A", "B"))
ex.dat$x2 <- factor(ex.dat$x2, labels = c("C", "D"))
attach(ex.dat)

xbar <- tapply(ex.dat$y1, ex.dat[, -1], mean)
xbar
s <- tapply(ex.dat$y1, ex.dat[, -1], sd)
n <- tapply(ex.dat$y1, ex.dat[, -1], length)
sem <- s / sqrt(n)
sem
row.names(xbar)
xbar[, 1]

# from Marc Schwartz #
# par(mfcol = c(1, 3))
options(graphics.record = TRUE)

# using plot
with(ex.dat, plot(1:2, xbar[, 1], ylim = range(y1), type = "b",
                  col = "red", lty = c("dashed", "solid"), xaxt = "n",
                  xlab = "x1", ylab = "mean of y1"))
with(ex.dat, points(1:2, xbar[, 2], col = "blue", type = "b"))
axis(1, at = 1:2, labels = c("A", "B"))

# using matplot
matplot(1:2, xbar, col = c("red", "blue"), pch = 21, type = "b",
        ylim = range(y1), lty = c("dashed", "solid"), xaxt = "n",
        xlab = "x1", ylab = "mean of y1")
axis(1, at = 1:2, labels = c("A", "B"))
arrows(1:2, xbar + sem, 1:2, xbar - sem, col = c("red", "blue"),
       angle = 90, code = 3, length = .1)

# using interaction.plot
with(ex.dat, interaction.plot(x1, x2, response = y1,
                              type = "b", pch = 21, col = c("red", "blue"),
                              ylim = range(ex.dat$y1)))
arrows(1:2, xbar + sem, 1:2, xbar - sem, col = c("red", "blue"),
       angle = 90, code = 3, length = .05)
# as you can see, though the values for the standard errors match the
# appropriate means, the assignment of colours does not.

At 12:46 PM 26/01/2007, Marc Schwartz wrote:

On Fri, 2007-01-26 at 11:50 -0500, Michael Rennie wrote:

Hi there,

I'm trying to plot what is returned from a call to tapply, and can't figure out how to do it. My guess is that it has something to do with the inclusion of row names when you ask for the values you're interested in, but if anyone has any ideas on how to get it to work, that would be stellar. Here's some example code:

y1 <- rnorm(40, 2)
x1 <- rep(1:2, each = 20)
x2 <- rep(1:2, each = 10, times = 2)
ex.dat <- data.frame(cbind(y1, x1, x2))
ex.dat$x1 <- factor(ex.dat$x1, labels = c("A", "B"))
ex.dat$x2 <- factor(ex.dat$x2, labels = c("C", "D"))
attach(ex.dat)
xbar <- tapply(ex.dat$y1, ex.dat[, -1], mean)
xbar

# values I'd like to plot:
row.names(xbar)  # levels of x1
xbar[, 1]        # mean response of y1 for group C (x2) across x1

# plot mean response y1 for group C against x1 (i.e., using x2 as a grouping factor):
plot(row.names(xbar), xbar[, 1], ylim = range(y1))
# This is where things break down. The error message states that I need
# finite xlim values, but I haven't assigned anything to xlim.

# If I just plot the data:
plot(x1, y1)
# This works fine.

# And, I can get this to work:
stripchart(xbar[1, ] ~ row.names(xbar), vert = T)

# However, I'd like to then add the values for the second set of means
# (i.e., mean values for group D against x1, or xbar[, 2]) to the plot.
# I tried following up the previous command with:
points(row.names(xbar), xbar[, 2])
# But that returns an error as well (NAs introduced by coercion).

Any suggestions?

Cheers, Mike

PS: some of you might suggest I use interaction.plot, since this is essentially what I'm building here. True, but I can't get error bars using interaction.plot.
I'm therefore trying to build my own version where I can specify the inclusion of error bars. Presumably interaction.plot has figured out how to do what I'm attempting, so I have some faith that I am on the right track.

Michael,

The problem is that when you are using the rownames for 'xbar', they are a character vector:

str(rownames(xbar))
 chr [1:2] "A" "B"

When you attempt to use the same values from 'ex.dat$x1', they are factors, which are being coerced to their numeric equivalents so that they can work as x axis values:

str(ex.dat$x1)
 Factor w/ 2 levels "A","B": 1 1 1 1 1 1 1 1 1 1 ...
as.numeric(ex.dat$x1)
[1
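On the colour mismatch: xbar and sem are 2 x 2 matrices (rows = x1 levels, columns = x2 groups), and arrows() runs through them column by column, while col = c("red", "blue") is recycled arrow by arrow. Repeating each colour once per row keeps a whole x2 trace in one colour. A sketch (assuming red is meant for the first x2 level and blue for the second, as in the matplot call):

```r
y1 <- rnorm(40, 2)
x1 <- factor(rep(1:2, each = 20), labels = c("A", "B"))
x2 <- factor(rep(1:2, each = 10, times = 2), labels = c("C", "D"))
ex.dat <- data.frame(y1, x1, x2)

xbar <- tapply(ex.dat$y1, ex.dat[, -1], mean)
s    <- tapply(ex.dat$y1, ex.dat[, -1], sd)
n    <- tapply(ex.dat$y1, ex.dat[, -1], length)
sem  <- s / sqrt(n)

interaction.plot(ex.dat$x1, ex.dat$x2, ex.dat$y1, type = "b", pch = 21,
                 col = c("red", "blue"), ylim = range(ex.dat$y1))
# arrows() flattens the matrices column-major: (A,C), (B,C), (A,D), (B,D);
# rep(..., each = nrow(xbar)) keeps column C red and column D blue
arrows(1:2, xbar + sem, 1:2, xbar - sem,
       col = rep(c("red", "blue"), each = nrow(xbar)),
       angle = 90, code = 3, length = 0.05)
```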
[R] plotting results from tapply
Hi there,

I'm trying to plot what is returned from a call to tapply, and can't figure out how to do it. My guess is that it has something to do with the inclusion of row names when you ask for the values you're interested in, but if anyone has any ideas on how to get it to work, that would be stellar. Here's some example code:

y1 <- rnorm(40, 2)
x1 <- rep(1:2, each = 20)
x2 <- rep(1:2, each = 10, times = 2)
ex.dat <- data.frame(cbind(y1, x1, x2))
ex.dat$x1 <- factor(ex.dat$x1, labels = c("A", "B"))
ex.dat$x2 <- factor(ex.dat$x2, labels = c("C", "D"))
attach(ex.dat)
xbar <- tapply(ex.dat$y1, ex.dat[, -1], mean)
xbar

# values I'd like to plot:
row.names(xbar)  # levels of x1
xbar[, 1]        # mean response of y1 for group C (x2) across x1

# plot mean response y1 for group C against x1 (i.e., using x2 as a grouping factor):
plot(row.names(xbar), xbar[, 1], ylim = range(y1))
# This is where things break down. The error message states that I need
# finite xlim values, but I haven't assigned anything to xlim.

# If I just plot the data:
plot(x1, y1)
# This works fine.

# And, I can get this to work:
stripchart(xbar[1, ] ~ row.names(xbar), vert = T)

# However, I'd like to then add the values for the second set of means
# (i.e., mean values for group D against x1, or xbar[, 2]) to the plot.
# I tried following up the previous command with:
points(row.names(xbar), xbar[, 2])
# But that returns an error as well (NAs introduced by coercion).

Any suggestions?

Cheers, Mike

PS: some of you might suggest I use interaction.plot, since this is essentially what I'm building here. True, but I can't get error bars using interaction.plot. I'm therefore trying to build my own version where I can specify the inclusion of error bars. Presumably interaction.plot has figured out how to do what I'm attempting, so I have some faith that I am on the right track.

Michael Rennie
Ph.D. Candidate, University of Toronto at Mississauga
3359 Mississauga Rd. N.
Mississauga, ON L5L 1C6
Ph: 905-828-5452  Fax: 905-828-3792
www.utm.utoronto.ca/~w3rennie
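The root cause (also explained in Marc Schwartz's reply earlier in this thread) is that rownames(xbar) is a character vector, which plot() cannot coerce to x positions. Plotting against the level positions and labelling the axis afterwards sidesteps the problem; a sketch with the same example data:

```r
y1 <- rnorm(40, 2)
x1 <- factor(rep(1:2, each = 20), labels = c("A", "B"))
x2 <- factor(rep(1:2, each = 10, times = 2), labels = c("C", "D"))
ex.dat <- data.frame(y1, x1, x2)
xbar <- tapply(ex.dat$y1, ex.dat[, -1], mean)

# plot against 1:nrow(xbar), then put the level names on the axis
xpos <- seq_len(nrow(xbar))
plot(xpos, xbar[, 1], ylim = range(y1), type = "b", xaxt = "n",
     xlab = "x1", ylab = "mean of y1")
points(xpos, xbar[, 2], type = "b", col = "blue")
axis(1, at = xpos, labels = rownames(xbar))
```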
[R] extracting F, df and r squared using sapply?
Hi all,

How does one extract relevant information from a regression output besides just the coefficients? I've been able to modify the example given under help(by) to give me some additional information, but not everything I need. If you adjust the call statement from what is listed by adding the summary() statement, like so:

tmp <- by(warpbreaks, warpbreaks$tension,
          function(x) summary(lm(breaks ~ wool, data = x)))
sapply(tmp, coef)

I am able to get the coefficients and the p-value (as well as 5 other rows of information I don't need, corresponding to the standard errors of the coefficients, the t-values, and the p-value of the intercept, but I can delete all this later). If you look at tmp, all the F statistics, r-squared values, etc. are there, but is there an easy way to get at them in tabulated form? Specifically, is there a simple way I can adjust the sapply() command to extract the F statistic, r-squared, and degrees of freedom (in addition to the coefficients) and have them appear as rows in the output?

Ultimately, I have 200+ groups that I am running regressions on in the same dataset, so I need some way to extract the relevant information in tabulated form. And no smart-alec comments about Bonferroni corrections :)

Thanks, Mike

Michael Rennie
Ph.D. Candidate
University of Toronto at Mississauga
3359 Mississauga Rd. N.
Mississauga, ON L5L 1C6
Ph: 905-828-5452  Fax: 905-828-3792
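A summary.lm object stores these as $r.squared and $fstatistic (a named vector with elements value, numdf and dendf), so the sapply() call can pull them out alongside the coefficients. A sketch with the same warpbreaks example:

```r
tmp <- by(warpbreaks, warpbreaks$tension,
          function(x) summary(lm(breaks ~ wool, data = x)))

# one column per group; transpose for one row per group
stats <- sapply(tmp, function(s)
  c(coef(s)[, "Estimate"],
    r.squared = s$r.squared,
    F   = s$fstatistic[["value"]],
    df1 = s$fstatistic[["numdf"]],
    df2 = s$fstatistic[["dendf"]]))

t(stats)   # tension groups as rows, statistics as columns
```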
[R] extracting the t-statistic: just the numbers, please
Hi there,

I am quite sure there is an easy answer to this, but I am unsure how to gather a bunch of t-statistics in an organized format. I am trying to generate a list of t-statistics for a randomization routine. If I try to collect a bunch of t-statistics from a run, this is what happens:

M <- 10
simt <- NULL
for (i in 1:M) {
  perm <- sample(site, replace = F)
  permute <- cbind(perm, site, a, b, c)
  m <- order(perm)
  m1 <- cbind(perm[m], site[m], a[m], b[m], c[m])
  black <- c(m1[1:5, 5], m1[11:15, 5])
  # black
  white <- c(m1[6:10, 5], m1[16:20, 5])
  # white
  sims <- t.test(black, white, var.equal = FALSE, mu = 0)$statistic
  simt <- c(simt, sims)
  # simt
}  # Next i (next simulation)

simt
         t          t          t          t          t          t          t
 0.3474150  0.1542973 -0.4044992  1.243     -0.2933944 -0.5809257  0.7799080
         t          t          t
-1.4132713  1.2048335 -0.6596936

Which gives me a list, but not in a form that I can do anything with. This is in stark contrast to what happens when requesting p-values, which gives output like this:

M <- 10
simt <- NULL
for (i in 1:M) {
  perm <- sample(site, replace = F)
  permute <- cbind(perm, site, a, b, c)
  m <- order(perm)
  m1 <- cbind(perm[m], site[m], a[m], b[m], c[m])
  black <- c(m1[1:5, 5], m1[11:15, 5])
  white <- c(m1[6:10, 5], m1[16:20, 5])
  sims <- t.test(black, white, var.equal = FALSE, mu = 0)$p.value
  simt <- c(simt, sims)
}  # Next i (next simulation)

simt
[1] 0.6763749 0.7480091 0.9447851 0.3342029 0.7852635 0.3199006 0.5272153
[8] 0.3863616 0.7333693 0.7268907

Now THAT'S what I'd like to get for my t-statistics: a nice vector (simt) that I can deal with later, rather than the output I am currently getting (the first output above). Does anyone know a way to extract JUST the t-statistics from t.test(), without the "t" character header, so I can generate a nice little vector? Alternatively, can I manipulate the output I am currently getting for the t-statistics so that I can isolate just the numbers?

--
Michael Rennie
Ph.D. Candidate
University of Toronto at Mississauga
3359 Mississauga Rd. N.
Mississauga, ON L5L 1C6
Ph: 905-828-5452  Fax: 905-828-3792
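The $statistic component is a named numeric ("t" is its name, which is what prints as the header); stripping the name with as.numeric() or unname() before collecting gives the plain vector. A self-contained sketch, with simulated samples standing in for the poster's black/white permutation groups:

```r
set.seed(1)
# as.numeric() drops the "t" name from each statistic as it is collected
simt <- replicate(10, as.numeric(t.test(rnorm(10), rnorm(10))$statistic))
simt   # a plain numeric vector: no "t" headers, ready for later use
```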
[R] Kudos to the R support team
Hi there,

I just wanted to take a quick second to thank everyone who maintains this service. Just in case you don't hear it enough: there is a plethora of people who rely on this service (as evidenced by the help archives) and, even if they don't say it, they are greatly thankful for what you do here. Not only are you providing a service to people who are learning R, but you are indeed helping increase R literacy. Soon, they'll be speaking R on the subway.

Thank you all, once again. You are all wonderful people.

--
Michael Rennie
Ph.D. Candidate
University of Toronto at Mississauga
3359 Mississauga Rd. N.
Mississauga, ON L5L 1C6
Ph: 905-828-5452  Fax: 905-828-3792
[R] modelling nested random effects with interactions in R
Hi there,

Please excuse this elementary question, but I have been fumbling with this for hours and can't seem to get it right. I have a nested ANOVA, with the random factor lakefac nested within the fixed factor fishfac, plus an additional fixed factor habfac. If I treat everything as fixed effects it's admittedly not the correct model, but I can at least get this to work:

nested <- anova(lm(ltotinv ~ habfac + fishfac/lakefac + habfac:fishfac +
                   habfac:(lakefac %in% fishfac)))
nested
Analysis of Variance Table

Response: ltotinv
                       Df  Sum Sq Mean Sq F value    Pr(>F)
habfac                  2 17.3140  8.6570 25.0568 3.534e-08 ***
fishfac                 1  0.9131  0.9131  2.6428   0.11057
fishfac:lakefac         2  2.3802  1.1901  3.4447   0.04000 *
habfac:fishfac          2 13.0101  6.5050 18.8281 9.196e-07 ***
habfac:fishfac:lakefac  4  3.0103  0.7526  2.1783   0.08557 .
Residuals              48 16.5838  0.3455
---

So now I try to run it using the lme4 package, treating lakefac as random:

lakernd <- lme(ltotinv ~ habfac + fishfac/lakefac + habfac:fishfac +
               habfac:(lakefac %in% fishfac), random = ~ lakefac)
Error in .class1(object) : Argument data is missing, with no default
lakernd
Error: Object "lakernd" not found

The lme help file suggests that if data is not specified, it defaults to whatever object is currently in use in the environment (as was the case in the fixed-effects model; I am using a matrix called "use" in this example). If I (naively) simply try to add data = use to the above formula, this happens:

lakernd <- lme(ltotinv ~ habfac + fishfac/lakefac + habfac:fishfac +
               habfac:(lakefac %in% fishfac), data = use, random = ~ lakefac)
Error in as(data, "data.frame") : No method or default for coercing "matrix" to "data.frame"

So how do I get lme() to use my data matrix "use" for the model? My guess is that my syntax is off, but does anyone have any suggestions on how to fix it? Unfortunately, the only resources I have available to me at the moment are an S-Plus v. 4 guide to statistics, the archived help files on the R website, and the examples and help files in R. Your help with this problem is greatly appreciated; I do hope to hear from someone as to how I might get this to work.

Sincerely,

--
Michael Rennie
Ph.D. Candidate
University of Toronto at Mississauga
3359 Mississauga Rd. N.
Mississauga, ON L5L 1C6
Ph: 905-828-5452  Fax: 905-828-3792
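One possible route, sketched under stated assumptions: lme() here is nlme's, the data below are simulated stand-ins for the poster's matrix "use", and the model treats lake as a random intercept rather than reproducing every interaction term. The key steps are converting the matrix to a data frame, making the classification variables factors, and moving lakefac out of the fixed formula into the random part:

```r
library(nlme)

# Simulated stand-in for the poster's matrix 'use': 3 habitats,
# 2 fish classes, 3 lakes nested in each fish class, 4 replicates
set.seed(42)
d <- expand.grid(habfac = factor(1:3), rep = 1:4, lakefac = factor(1:6))
d$fishfac <- factor(ifelse(as.integer(as.character(d$lakefac)) <= 3, 1, 2))
d$ltotinv <- rnorm(nrow(d), 2, 0.5)

# had the data really been a matrix: d <- as.data.frame(use), then
# factor() each classification column before fitting
lakernd <- lme(ltotinv ~ habfac * fishfac, random = ~ 1 | lakefac, data = d)
anova(lakernd)
```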
[R] Testing significance in a design with unequal but proportional sample sizes
Hi all,

I have a rather un-ideal dataset that I am trying to work with, and would appreciate any advice you have on the matter. I have 4 years' worth of data taken at 3 depth zones, from which samples have been taken at random. I am looking at the abundance of organism A between depth zones and across years, and am interested in the possible interaction of organism A's distribution shifting between depth zones over time. Unfortunately, the sample sizes (n) differ between depth zones, as follows:

                    Year
                  1   2   3   4
Depth Zone 1     15  15  15  15
Depth Zone 2     10  10  10  10
Depth Zone 3      5   5   5   5

As such, I have a two-way ANOVA with unequal but proportional subclass numbers. Sokal and Rohlf (3rd ed., 1995) have a nifty method of working out sums of squares in this type of scenario (pages 357-358, box 11.6). However, they don't tell you how to calculate the probabilities, but refer the reader on to Snedecor and Cochran (1967), which I am on my way to consult shortly.

I'm curious as to whether there is a more straightforward way of coding this in R, rather than having to more or less customize my own statistical test. I found some discussions in the archives revolving around type III sums of squares from 2001, but the lack of consensus in the discussion did little to assure me that I should try this approach. Anyone with advice, code or suggestions, I'd love to hear any of it.

Cheers, Mike

--
Michael Rennie
Ph.D. Candidate
University of Toronto at Mississauga
3359 Mississauga Rd. N.
Mississauga, ON L5L 1C6
Ph: 905-828-5452  Fax: 905-828-3792
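One base-R route, sketched with simulated counts ('zone', 'year' and 'abund' are hypothetical column names, not the poster's): with sum-to-zero contrasts, drop1() over the full scope gives marginal (Type-III-style) F tests, one standard answer to the unequal-subclass problem; with proportional subclass numbers the sequential anova() table is also interpretable.

```r
set.seed(7)
# Proportional design from the post: n = 15/10/5 per depth zone, 4 years
d <- data.frame(
  zone  = factor(rep(rep(1:3, times = c(15, 10, 5)), times = 4)),
  year  = factor(rep(1:4, each = 30)),
  abund = rpois(120, 20)
)

op <- options(contrasts = c("contr.sum", "contr.poly"))
fit <- lm(abund ~ zone * year, data = d)
drop1(fit, . ~ ., test = "F")   # marginal F test for each term
options(op)
```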
Re: [R] Excel can do what R can't?????
Quoting Roger D. Peng [EMAIL PROTECTED]:

> I'm having a little difficulty understanding this thread. If Excel can
> do the job correctly and suits your needs, why not just use Excel?

Primarily because I don't know how to automate this in Excel. The reason for doing this is that I eventually need it to go through 1000 rows of input variables, so that I can get output with error associated with it. Plus, people think you're cool if you can do it in R.

> As far as I know, 'optim' cannot optimize a function subject to
> arbitrary equality constraints. The 'constrOptim' function allows for
> linear inequality constraints, but in general you will either have to
> reformulate the problem or add your own penalty into the objective
> function. Also, just a small note, but using lexical scoping in your
> problem would eliminate the need to have all those variables defined in
> the global environment (but otherwise it won't change anything).

What's lexical scoping?

Mike

--
Michael Rennie
M.Sc. Candidate
University of Toronto at Mississauga
3359 Mississauga Rd. N.
Mississauga, ON L5L 1C6
Ph: 905-828-5452  Fax: 905-828-3792
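To the closing question: lexical scoping means a function created inside another function can read the variables of the function that created it, so model constants can be bundled with the objective instead of living in the global environment. A toy illustration (the quadratic here is just a placeholder objective, not the poster's model):

```r
make.obj <- function(Wt, Hgt) {
  # the returned function "remembers" Wt and Hgt from this call
  function(q) (Wt - q[1])^2 + (Hgt - q[2])^2
}

f <- make.obj(Wt = 10, Hgt = 3)
f(c(10, 3))             # 0: the constants travel with the function
optim(c(0, 0), f)$par   # converges near c(10, 3) without any globals
```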
Re: [R] Excel can do what R can't?????
Hi Spencer,

I know I submitted a beastly amount of code, but I'm not sure how to simplify it much further and still successfully address the problem that I am having. The reason is that the function begins

f <- function(q)

at the top of the iterative loop. This is what takes q and generates Wtmod and Hgtmod at the end of the iterative loop; the assignment to f occurs at the bottom of the iterative loop. So, yes, the call to f is performing an immediate computation, but based on arguments that are coming out of the iterative loop above it, arguments which depend on q <- c(p, ACT). Maybe this is the problem: I've got too much going on between my function definition and its assignment, but I don't know how to get around it.

So, I'm not sure if your example will work. The output from the iterative process is Wtmod and Hgtmod, and I want to minimize the difference between them and my observed endpoints (Wt, Hgt). The numbers I am varying to reach this optimization are in the iterative loop (p, ACT), so re-defining these outputs as x's and getting it to vary those doesn't do me much good unless they are directly linked to the output of the iterative loop above.

Last, it's not even that I'm getting error messages anymore; I just can't get the solution that I get from Excel. If I let R find the solution and give it starting values of c(1, 2), it gives me an optimization solution, but an extremely poor one. However, if I give it the answer I got from Excel, it comes right back with the same answer and solution I get from Excel. Using the 'trace' function, I can see that R gets stuck in a specific region of parameter space when looking for the optimum and just appears to give up. Even when it re-sets itself, it keeps going back to this region, and thus doesn't even try the full range of the parameter space I've defined before it stops and gives me the wrong answer.
I can try cleaning up the code and see if I can re-submit it, but what I am trying to program is so parameter heavy that 90% of it is just defining these at the top of the file.

Thank you for the suggestions,
Mike

Quoting Spencer Graves [EMAIL PROTECTED]:

> The phrase:
>
> f <- 10*((((Wt - Wtmod)^2)/Wt) + (((Hgt - Hgtmod)^2)/Hgt))^2; f
>
> is an immediate computation, not a function. If you want a function, try
> something like the following:
>
> f <- function(x){
>   Wt     <- x[1]
>   Wtmod  <- x[2]
>   Hgt    <- x[3]
>   Hgtmod <- x[4]
>   10*((((Wt - Wtmod)^2)/Wt) + (((Hgt - Hgtmod)^2)/Hgt))^2
> }
>
> OR
>
> f <- function(x, X){
>   Wt     <- X[, 1]
>   Hgt    <- X[, 2]
>   Wtmod  <- x[1]
>   Hgtmod <- x[2]
>   10*((((Wt - Wtmod)^2)/Wt) + (((Hgt - Hgtmod)^2)/Hgt))^2
> }
>
> 'par' in optim is the starting values for x. Pass X to f via '...' in the
> call to optim. If you can't make this work, please submit a toy example
> with the code and error messages. Please limit your example to 3
> observations, preferably whole numbers, so someone else can read your
> question in seconds. If it is any longer than that, it should be ignored.
>
> Hope this helps.
> Spencer Graves
>
> M.Kondrin wrote:
>
>> ?optim
>> optim(par, fn, gr = NULL,
>>       method = c("Nelder-Mead", "BFGS", "CG", "L-BFGS-B", "SANN"),
>>       lower = -Inf, upper = Inf, control = list(), hessian = FALSE, ...)
>> ...
>> fn: A function to be minimized (or maximized), with first argument the
>> vector of parameters over which minimization is to take place. It should
>> return a scalar result.
>>
>> Your fn defined as:
>> f <- 10*((((Wt - Wtmod)^2)/Wt) + (((Hgt - Hgtmod)^2)/Hgt))^2; f
>> What is its first argument, I wonder? I think you have just an
>> ill-defined R function (although for Excel it may be OK, I do not know)
>> and optim just chokes on it.

--
Michael Rennie
M.Sc. Candidate
University of Toronto at Mississauga
3359 Mississauga Rd. N.
Mississauga, ON L5L 1C6
Ph: 905-828-5452  Fax: 905-828-3792
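Spencer's point, sketched concretely: the whole 365-day loop goes inside the function body, so optim() can re-run it for every candidate q. Everything numeric below (starting state, update rules, observed endpoints) is a made-up stand-in for the real bioenergetics model, not the poster's equations:

```r
Wt <- 120; Hgt <- 9        # observed endpoints (hypothetical values)

f <- function(q) {
  p <- q[1]; ACT <- q[2]
  if (p <= 0 || ACT < 0) return(1e10)  # keep the search in valid space
  W <- 10; Hg <- 1                     # starting state
  for (day in 1:365) {                 # the iterative loop lives inside f
    W <- W + p * W^0.6 - ACT * 0.01 * W
    if (!is.finite(W) || W <= 0) return(1e10)
    Hg <- Hg + 0.02 * p
  }
  10 * ((Wt - W)^2 / Wt + (Hgt - Hg)^2 / Hgt)
}

fit <- optim(c(1, 2), f)   # optim varies q and re-runs the loop each time
fit$par
```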
RE: [R] Excel can do what R can't?????
Hi, Reid Do the values of W and Hg over time for a given q agree between R and Excel? Not the optimal value of q, just the trajectories for fixed q (trying several values for q). If I take the iterative loop out of the function, and ask for values of Hgmod, Hgtmod, and f, then I get EXACTLY what I get out of Excel. It's the optimization that seems to be the problem. If I trace the solutions, R isn't even exploring the full parameter space I tell it to look in. SO, the iterative loop is correct, and doing what it's supposed to, since values of p, ACT match exactly what they do in excel- it's something about how R is examining the possibilities in the optimization process that is giving me different answers between the two. I dunno- I'm going to tinker with it some more tonight. Mike Reid -Original Message- From: Michael Rennie [mailto:[EMAIL PROTECTED] Sent: Wednesday, July 16, 2003 2:47 PM To: Huntsinger, Reid Subject: RE: [R] Excel can do what R can't? Hi, Reid At 02:09 PM 7/16/03 -0400, you wrote: R is good at automating specific kinds of complex loops, namely those that can be vectorized, or that can be written to draw on otherwise built-in facilities. It's usually reasonable for other kinds of loops but not spectacularly fast. You can write this part in C, though, quite easily, and R provides very convenient utilities for this. As for your code: You seem to have a system of equations that relates W and Hg to their one-period-ago values. It might clarify things if you coded this as a function: input time t values and q, output time t + 1 values. (You wouldn't need any arrays.) Then f would just iterate this function and calculate the criterion. Wouldn't I still need to loop this function to get it through 365 days? Is there a big difference, then, between this and what I've got? Does the trajectory of (W, Hg) for given q in R seem correct? Does it agree with Excel? What does the criterion function look like? 
You could plot it in R and perhaps see if the surface is complicated, in which case a simple grid search might work for you. When I give R the values that I get in excel for p, ACT, the function solution is actually more precise than what I get in Excel; the values for p, ACT come back identical (then again, they are exactly what I assigned..) But, if I leave R on it's own to find the solution, it keeps getting jammed in a particular region. I've never done any function plotting in R, but it would help if I could see what kind of surface I get for f as a function of p, ACT- this would at least force R to examine the full range of values specified by the upper and lower limits I've set (which it isn't doing under the 'optim' command). Mike Reid Huntsinger -Original Message- From: Michael Rennie [mailto:[EMAIL PROTECTED] Sent: Wednesday, July 16, 2003 11:18 AM To: Spencer Graves Cc: R-Help; M.Kondrin Subject: Re: [R] Excel can do what R can't? Hi, Spencer I know I submitted a beastly ammount of code, but I'm not sure how to simplify it much further, and still sucessfully address the problem that i am having. The reason being is that the funciton begins f- function (q) At the top of the iterative loop. This is what takes q and generates Wtmod, Hgtmod at the end of the iterative loop. the assignment to f occurs at the bottom of the iterative loop. So, yes, the call to f is performing an immediate computation, but based on arguments that are coming out of the iterative loop above it, arguments which depend on q-(p, ACT). Maybe this is the problem; I've got too much going on between my function defenition and it's assignment, but I don't know how to get around it. So, I'm not sure if your example will work- the output from the iterative process is Wtmod, Hgtmod, and I want to minimize the difference between them and my observed endpoints (Wt, Hgt). 
The numbers I am varying to reach this optimization are in the iterative loop (p, ACT), so re-defining these outputs as x's and getting it to vary these doesn't do me much good unless they are directly linked to the output of the iterative loop above it. Last, it's not even that I'm getting error messages anymore- I just can't get the solution that I get from Excel. If I try to let R find the solution and give it starting values of c(1,2), it gives me an optimization solution, but an extremely poor one. However, if I give it the answer I got from Excel, it comes right back with the same answer and solutions I get from Excel. Using the 'trace' function, I can see that R gets stuck in a specific region of parameter space in looking for the optimization and just appears to give up. Even when it re-sets itself, it keeps going back to this region, and thus doesn't even try a full range of the parameter space I've defined before it stops and gives me the wrong answer. I can try cleaning up the code and see if I can re-submit it, but what I am trying to program is so parameter heavy that 90
RE: [R] Excel can do what R can't?????
Hi, Reid and Spencer- I think I've figured something out pretty critical to the problem. Looking at my 'solver' options, I have a condition added that 'Hgtmod = Hgt'. Without this conditional statement, I have to run solver 3-4 times before I get a final solution. MEANING- solver and R, when left to their own devices, suck equally at finding a solution with similar starting points. BUT, given a conditional statement that demands Hgt = Hgtmod, it gives it somewhere to look within the given parameter space. So, the million dollar question: Is there any way of setting up a conditional statement like this in 'optim', to specify a solution such that Hgtmod = Hgt? Or, write it into function f? The control statements 'fnscale' and 'parscale' look like candidates, but will only help me if I build Hgtmod into the optimization statement- can I do that? How do you specify multiple parameters? Or should I specify two functions to optimize- one, function f, and a second, something like g <- (Hgt - Hgtmod)^2 Can I do that? If this is all I need to do, I am off to the races, and owe you both a beer! I'm going to try some stuff. All is not lost! If you have any ideas, I'd love to hear them. Thanks again for everything. Quoting Huntsinger, Reid [EMAIL PROTECTED]: I don't understand how you can take the loop out of the function and still get values for the final timepoint. And whether the optimal parameter values agree wasn't my question. First I'd like to determine whether Hg and W (in your code) have the same value in R as they do in Excel, for a few possible values of the parameter q. Yes- this was what I was trying to communicate (albeit unsuccessfully) in my last e-mail. For values of 1,2 or 1.5,1.5 or whatever I try, I get the same answers for f, Wtmod, Hgtmod in Excel as I do in R. Understand, though, that in order to determine the agreement for these values, I'm not using an optimization function of any sort- just plugging in numbers and seeing what I get back.
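[Editor's sketch of one standard answer to the question above: optim() has no equality-constraint argument, but a soft constraint can be folded into the objective as a penalty term. `run.model` is a hypothetical stand-in for the 365-day iteration, returning the model endpoints for a given q; its toy formulas are invented for the example, and lambda is a penalty weight the user chooses.]

```r
Wt  <- 32.2        # observed endpoints, from the posted command file
Hgt <- 0.110

# Hypothetical stand-in for the iterative model (toy linear dynamics).
run.model <- function(q)
  list(Wtmod  = 30 * q[1] + q[2],
       Hgtmod = 0.05 * q[1] + 0.02 * q[2])

lambda <- 100      # penalty weight: larger = harder push toward Hgtmod = Hgt

f.pen <- function(q) {
  out <- run.model(q)
  (Wt - out$Wtmod)^2 / Wt +
    (Hgt - out$Hgtmod)^2 / Hgt +
    lambda * (Hgt - out$Hgtmod)^2    # penalty enforcing Hgtmod ~ Hgt
}

fit <- optim(c(1, 1), f.pen, method = "L-BFGS-B",
             lower = c(0, 0), upper = c(2, 10))
```

This is the usual way to mimic a solver-style equality condition with a single-objective optimizer; it avoids needing two separate functions f and g.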
That's what I meant about taking the 365-day loop out of the optimization function; even though I am still calculating a value for f, I am no longer defining it at the top of the loop with 'f <- function(q)'; I am simply performing the calculation f <- 10*((((Wt-Wtmod)^2)/Wt) + (((Hgt-Hgtmod)^2)/Hgt)) ; f in order to find out if the value of f is the same in Excel and in R given the same starting parameters of p, ACT. Sorry- I didn't mean to be unclear. Then, if so, whether the surface over which you're minimizing is complicated or not. As I said, if it is, you can't expect local optimizers to work very well without good starting values, and they really don't explore the whole parameter space--that would be too slow for their usual applications. Optimization approaches like grid search or simulated annealing do try to cover the parameter space and may be better suited to your use. I would certainly try plotting and grid search just to see what's happening, since that's clearly possible with two parameters. This is next on the list- I suspect that this is where I may be running into problems. But if this is where I am running into problems, then does that mean that Excel is using a better optimization process than R? Since they are recognizing the same parameter space over p, ACT, as evidenced by the fact that they offer identical solutions for the same values of p, ACT, then shouldn't they be reaching the same solution? AH! Eureka moment- see top of e-mail. Thanks again for your suggestions. Mike Reid Huntsinger -Original Message- From: Spencer Graves [mailto:[EMAIL PROTECTED] Sent: Wednesday, July 16, 2003 5:20 PM To: Michael Rennie Cc: Huntsinger, Reid; R-Help Subject: Re: [R] Excel can do what R can't? I'm confused: I've done this type of thing by programming the same objective function in R (or S-Plus) and Excel.
After the answers from my objective function in R match the answers in Excel, then I pass that objective function to something like optim, which then finds the same answers as solver in Excel. Your latest description makes me wonder if the function you pass to optim tries to do some of the optimization that optim is supposed to do. ??? hope this helps. spencer graves Michael Rennie wrote: Hi, Reid Do the values of W and Hg over time for a given q agree between R and Excel? Not the optimal value of q, just the trajectories for fixed q (trying several values for q). If I take the iterative loop out of the function, and ask for values of Hgmod, Hgtmod, and f, then I get EXACTLY what I get out of Excel. It's the optimization that seems to be the problem. If I trace the solutions, R isn't even exploring the full parameter space I tell it to look in. SO, the iterative loop is correct, and doing what it's supposed to, since values of p, ACT match exactly what they do in excel
[R] Excel can do what R can't?????
[temp.dat attachment: 366 daily records in three columns (Day, jday, Temp); the full table of values is omitted here for length.] Michael Rennie M.Sc. Candidate University of Toronto at Mississauga 3359 Mississauga Rd. N. Mississauga, ON L5L 1C6 Ph: 905-828-5452 Fax: 905-828-3792 __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
Re: [R] Excel can do what R can't?????
At 11:47 AM 7/15/03 -0700, Jerome Asselin wrote: Mike, The definition of your function f() seems quite inefficient. You could vectorize it, which would shorten and speed up your code, especially if M is large. Hi, Jerome I don't think I can vectorize it, since in the iteration loop the value for each [i] is dependent on the value of [i-1], so I require the loop to go through each [i] before I can get my values for any particular vector (variable). I actually had my program operating this way in the first place, but I get all sorts of warnings, and the 'optim' function especially doesn't seem to appreciate it. See the R introduction file available online to learn how to do it if you don't already know how. Also, you have to return only one argument. Unless I'm wrong, your function wants to return Wtmod, Hgtmod, q and f. I don't think this would change anything in this case, but you should definitely clean this up! The calls to Wtmod, q, and Hgtmod are all just residual from the development of the loop inside function f. I would like to get the last line of 'bioday' reported from within the loop, had I figured out the optimization, but that point is rather moot unless I can get the optimization functioning. One more piece of advice... If you can simplify your example into a few lines of ready-to-execute code with a toy dataset, then it's easy for everyone to try it out and you can get more feedback. The code you've included is quite large and cumbersome. For one thing, you could easily have removed the lines of code that were commented out. Meanwhile, I would suggest that you go back to the basics of R to clean up your code. Thanks for the advice- every bit helps if I eventually get this thing to work. Mike Sorry I can't be more helpful. Jerome On July 15, 2003 10:46 am, Michael Rennie wrote: Hi there I thought this would be of particular interest to people using 'optim' functions and perhaps people involved with R development.
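[Editor's note on the vectorization exchange above: Mike is right that a general order-1 recurrence needs a loop, but the special case of a *linear* recurrence x[i] = a*x[i-1] + b[i] can be vectorized with stats::filter(). A toy illustration, not the bioenergetics model:]

```r
# Linear recurrence: x[i] = a * x[i-1] + b[i], x[0] taken as 0.
b <- c(1, 2, 3, 4)
a <- 0.5

# Loop version.
x.loop <- numeric(4)
x.loop[1] <- b[1]
for (i in 2:4) x.loop[i] <- a * x.loop[i - 1] + b[i]

# Vectorized version: filter() with method = "recursive" computes
# y[i] = b[i] + a * y[i-1] in one call.
x.vec <- as.numeric(stats::filter(b, a, method = "recursive"))

all.equal(x.loop, x.vec)   # TRUE
```

The W/Hg recurrences in the thread are nonlinear in W, so the loop (or C code, as Reid suggested) really is needed there; this trick only covers the linear case.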
I've been beaten down by R trying to get it to perform an optimization on a mass-balance model. I've written the same program in Excel, and using the 'solver' function, it comes up with an answer for my variables (p, ACT, which I've assigned to q in R) that gives a solution to the function f in about 3 seconds, with a value of the function around 0.0004. R, on the other hand, appears to get stuck in local minima and spits back an approximation that is close to the p, ACT values Excel gives, but not nearly precise enough for my needs, and not nearly as precise as Excel, and it takes about 3 minutes. Also, the solution for the value it returns for the function is about 8 orders of magnitude greater than the Excel version, so I can't really say the function is approximating zero. I was able to determine this using a trace command on function f, which is listed below. This is very likely due to the fact that I've made some coding error along the way, or have done something else wrong, but I don't know. Either way, I am shocked and surprised that a program like Excel is outperforming R. I've attached my command file and the dataset temp.dat at the bottom of this e-mail for anyone who would like to fiddle around with it, and if you come up with something, PLEASE let me know- In the meantime, I've got to start fiddling with Excel and figuring out how to automate the solver calculation. Briefly, the point of the program is to approximate the model output from an iterative calculation, Wtmod and Hgtmod, to user-specified endpoints Wt and Hgt, by seeking the optimal values of p, ACT involved in the iterative process. Also, if you're interested in recent correspondence that explains the point of the program a bit, and how the function ties in to the iterative process, search the R-help forum for e-mails entitled [R] problem with coding for 'optim' in R. Thanks also to Roger Peng and numerous others for helping me get this far.
The whole point of me doing this in R was because it's supposed to be spectacularly fast at automating complex loops, but it seems to be falling short for this application. Hopefully it's something wrong with my coding and not with R itself. Mike R COMMAND FILE: #perch.R # # Hewett and Johnson bioenergetics # # model combined with # # Trudel MMBM to estimate # # Consumption in perch in R code # # Execute with # # R --vanilla perch.R perch.out # #USER INPUT BELOW #Weight at time 0 Wo <- 9.2 #Hg concentration at time 0 (ugHg/g wet weight) Hgo <- 0.08 #Weight at time t Wt <- 32.2 #Hg concentration at time t (ugHg/g wet weight) Hgt <- 0.110 #Prey methylmercury concentration (as constant) Hgp <- 0.033 #Prey caloric value (as constant) Pc <- 800 #Energy density of fish (as constant, calories) Ef <- 1000 #Maturity status, 0=immature, 1=mature Mat
[R] problem with coding for 'optim' in R
)*(1+(1+40/Ya)^0.5)^2)/400 S <- 0.172 FA <- 0.158 FB <- -0.222 FG <- 0.631 UA <- 0.0253 UB <- 0.58 UG <- -0.299 #Mass balance model parameters EA <- 0.002938 EB <- -0.2 EQ <- 0.066 a <- 0.8 #Specifying sex-specific parameters GSI <- NULL if (Sex==1) GSI <- 0.05 else + if (Sex==2) GSI <- 0.17 # Define margin of error functions #merror <- function(phat,M,alpha) # (1-alpha)*100% merror for a proportion #{ #z <- qnorm(1-alpha/2) #merror <- z * sqrt(phat*(1-phat)/M) # M is (Monte Carlo) sample size #merror #} #Bring in temp file temper <- scan("temp.dat", na.strings = ".", list(Day=0, jday=0, Temp=0)) Read 366 records Day <- temper$Day ; jday <- temper$jday ; Temp <- temper$Temp ; temp <- cbind(Day, jday, Temp) #Day = number of days modelled, jday=julian day, Temp = daily avg. temp. #temp[,2] Vc <- (CTM-(temp[,3]))/(CTM-CTO) Vr <- (RTM-(temp[,3]))/(RTM-RTO) comp <- cbind(Day, jday, Temp, Vc, Vr) #comp bio <- matrix(NA, ncol=13, nrow=length(Day)) W <- NULL C <- NULL ASMR <- NULL SMR <- NULL A <- NULL F <- NULL U <- NULL SDA <- NULL Gr <- NULL Hg <- NULL Ed <- NULL GHg <- NULL K <- NULL Expegk <- NULL EGK <- NULL p <- NULL ACT <- NULL #p <- 0.558626306252032 #ACT <- 1.66764519286918 q <- c(p,ACT) #introduce function to solve f <- function (q) + { + + M <- length(Day) #number of days iterated + + for (i in 1:M) + { + + #Bioenergetics model + + if (Day[i]==1) W[i] <- Wo else + if (jday[i]==121 && Mat==1) W[i] <- (W[i-1]-(W[i-1]*GSI*1.2)) else + W[i] <- (W[i-1]+(Gr[i-1]/Ef)) + #W + + #W <- Wo + + C[i] <- p*CA*(W[i]^CB)*((comp[i,4])^Xc)*(exp(Xc*(1-(comp[i,4]*Pc + + ASMR[i] <- ACT*RA*(W[i]^RB)*((comp[i,5])^Xa)*(exp(Xa*(1-(comp[i,5] + + SMR[i] <- (ASMR[i]/ACT) + + A[i] <- (ASMR[i]-SMR[i]) + + F[i] <- (FA*((comp[i,3])^FB)*(exp(FG*p))*C[i]) + + U[i] <- (UA*((comp[i,3])^UB)*(exp(UG*p))*(C[i]-F[i])) + + SDA[i] <- (S*(C[i]-F[i])) + + Gr[i] <- (C[i]-(ASMR[i]+F[i]+U[i]+SDA[i])) + + #Trudel MMBM + + if (Day[i]==1) Hg[i] <- Hgo else Hg[i] <- a*Hgp*(C[i-1]/Pc/W[i-1])/EGK[i-1]*(1-Expegk[i-1])+(Hg[i-1]*Expegk[i-1]) + + Ed[i] <- EA*(W[i]^EB)*(exp(EQ*(comp[i,3]))) + + GHg[i] <- Gr[i]/Ef/W[i] + + if (Sex==1)
K[i] <- (((0.1681*(10^(1.3324+(0.000453*Hg[i])))/1000)/Hg[i])*GSI)/M else + if (Sex==2) K[i] <- (((0.1500*(10^(0.8840+(0.000903*Hg[i])))/1000)/Hg[i])*GSI)/M + # = dw/ww conversion * gonad ~ body conc'n function (ng/g) / convert to ug/g + # then express as Q times GSI gives K / M gives daily K + + EGK[i] <- (Ed[i] + GHg[i] + (K[i]*Mat)) + + Expegk[i] <- exp(-1*EGK[i]) + + bio <- cbind(W, C, ASMR, SMR, A, F, U, SDA, Gr, Ed, GHg, EGK, Hg) + + } + + #warnings() + + dimnames(bio) <- list(NULL, c("W", "C", "ASMR", "SMR", "A", "F", "U", "SDA", "Gr", "Ed", "GHg", "EGK", "Hg")) + + bioday <- cbind(jday, W, C, ASMR, SMR, A, F, U, SDA, Gr, Ed, GHg, EGK, Hg) + + dimnames(bioday) <- list(NULL, c("jday", "W", "C", "ASMR", "SMR", "A", "F", "U", "SDA", "Gr", "Ed", "GHg", "EGK", "Hg")) + + #bioday + + Wtmod <- bioday[length(W),2] + Wtmod + + Hgtmod <- bioday[length(Hg),14] + Hgtmod + + f <- (((Wt-Wtmod)^2 + (Hgt-Hgtmod)^2)/2)^2 ; f + + q + + #warnings() + + write.table(bioday, file = "perch.csv", append = FALSE, sep = ",", na = "NA", col.names = TRUE) + + } #nlm(f,c(1,1)) optim(c(1,1), f, method = "L-BFGS-B", + lower = c(0, 0), upper = c(2, 10)) Error in "[<-"(*tmp*, i, value = (W[i - 1] + (Gr[i - 1]/Ef))) : nothing to replace with Execution halted Michael Rennie M.Sc. Candidate University of Toronto at Mississauga 3359 Mississauga Rd. N. Mississauga, ON L5L 1C6 Ph: 905-828-5452 Fax: 905-828-3792
Re: [R] problem with coding for 'optim' in R
Hi, Roger, At 01:37 PM 7/14/03 -0700, Roger D. Peng wrote: It's important to remember that in R functions return whatever happens to be the last element of the function block, unless there is an explicit 'return' statement. Your function 'f' in the second example is written incorrectly and will not work in 'optim'. The last element in the function block is: write.table(bioday, file = "perch.csv", append = FALSE, sep = ",", na = "NA", col.names = TRUE) which I assume is *not* the value you want the function to return. Your function 'f' is returning whatever 'write.table' returns, which is nothing useful. My guess is that you want your function 'f' to return the value 'f' defined in the function as f <- (((Wt-Wtmod)^2 + (Hgt-Hgtmod)^2)/2)^2 So this statement should be the last line of your function. This is valuable information. Thanks very much. I'll move things about and see what happens. Also, your function 'f' (still from the second output) doesn't use the value 'q' at all, so I can't see how the optimizer can optimize a function that ignores its parameters. From what I've read and the examples I've encountered, the 'optim' function expects the first entry in optim(x, f, etc.) to be the starting point for the variable that you specify earlier in the loop under f <- function (x) If I am wrong on this, then this could be giving me problems as well. -roger Michael Rennie M.Sc. Candidate University of Toronto at Mississauga 3359 Mississauga Rd. N. Mississauga, ON L5L 1C6 Ph: 905-828-5452 Fax: 905-828-3792
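[Editor's illustration of Roger's point above, in miniature: an R function returns the value of its last evaluated expression unless return() is used, so a trailing side effect like write.table() silently becomes the return value.]

```r
# Wrong shape: the last expression is the side effect, so the objective
# value computed in `val` is thrown away.
f.bad <- function(x) {
  val <- x^2
  invisible(NULL)   # stands in for write.table(); ITS value gets returned
}

# Right shape: do any side effects first, and make the objective value
# the last expression (or use an explicit return(val)).
f.good <- function(x) {
  val <- x^2
  # side effects such as write.table() can go here
  val
}

f.good(3)   # 9
```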
Re: [R] problem with coding for 'optim' in R
RA <- 34.992 #0.0108*3240 cal/g O2, converting weight of O2 to cal RB <- 0.8 #same as 1+(-0.2) see above... RQ <- 2.1 RTO <- 28 RTM <- 33 Za <- (log(RQ))*(RTM-RTO) Ya <- (log(RQ))*(RTM-RTO+2) Xa <- ((Za^2)*(1+(1+40/Ya)^0.5)^2)/400 S <- 0.172 FA <- 0.158 FB <- -0.222 FG <- 0.631 UA <- 0.0253 UB <- 0.58 UG <- -0.299 #Mass balance model parameters EA <- 0.002938 EB <- -0.2 EQ <- 0.066 a <- 0.8 #Specifying sex-specific parameters GSI <- NULL if (Sex==1) GSI <- 0.05 else + if (Sex==2) GSI <- 0.17 # Define margin of error functions #merror <- function(phat,M,alpha) # (1-alpha)*100% merror for a proportion #{ #z <- qnorm(1-alpha/2) #merror <- z * sqrt(phat*(1-phat)/M) # M is (Monte Carlo) sample size #merror #} #Bring in temp file temper <- scan("temp.dat", na.strings = ".", list(Day=0, jday=0, Temp=0)) Read 366 records Day <- temper$Day ; jday <- temper$jday ; Temp <- temper$Temp ; temp <- cbind(Day, jday, Temp) #Day = number of days modelled, jday=julian day, Temp = daily avg. temp. #temp[,2] Vc <- (CTM-(temp[,3]))/(CTM-CTO) Vr <- (RTM-(temp[,3]))/(RTM-RTO) comp <- cbind(Day, jday, Temp, Vc, Vr) #comp bio <- matrix(NA, ncol=13, nrow=length(Day)) W <- NULL C <- NULL ASMR <- NULL SMR <- NULL A <- NULL F <- NULL U <- NULL SDA <- NULL Gr <- NULL Hg <- NULL Ed <- NULL GHg <- NULL K <- NULL Expegk <- NULL EGK <- NULL p <- NULL ACT <- NULL p <- 1 #0.558626306252032 ACT <- 1 #1.66764519286918 q <- c(p,ACT) #specify starting values #q0 <- c(p = 1, ACT = 1) #introduce function to solve f <- function (q) + { + + + M <- length(Day) #number of days iterated + + for (i in 1:M) + { + + #Bioenergetics model + + if (Day[i]==1) W[i] <- Wo else + if (jday[i]==121 && Mat==1) W[i] <- (W[i-1]-(W[i-1]*GSI*1.2)) else + W[i] <- (W[i-1]+(Gr[i-1]/Ef)) + #W + + #W <- Wo + + C[i] <- q[1]*CA*(W[i]^CB)*((comp[i,4])^Xc)*(exp(Xc*(1-(comp[i,4]*Pc + + ASMR[i] <- q[2]*RA*(W[i]^RB)*((comp[i,5])^Xa)*(exp(Xa*(1-(comp[i,5] + + SMR[i] <- (ASMR[i]/q[2]) + + A[i] <- (ASMR[i]-SMR[i]) + + F[i] <- (FA*((comp[i,3])^FB)*(exp(FG*p))*C[i]) + + U[i] <- (UA*((comp[i,3])^UB)*(exp(UG*p))*(C[i]-F[i])) + + SDA[i] <- (S*(C[i]-F[i])) + + Gr[i] <-
(C[i]-(ASMR[i]+F[i]+U[i]+SDA[i])) + + #Trudel MMBM + + if (Day[i]==1) Hg[i] <- Hgo else Hg[i] <- a*Hgp*(C[i-1]/Pc/W[i-1])/EGK[i-1]* (1-Expegk[i-1])+(Hg[i-1]*Expegk[i-1]) + + Ed[i] <- EA*(W[i]^EB)*(exp(EQ*(comp[i,3]))) + + GHg[i] <- Gr[i]/Ef/W[i] + + if (Sex==1) K[i] <- (((0.1681*(10^(1.3324+(0.000453*Hg[i])))/1000)/Hg[i]) *GSI)/M else + if (Sex==2) K[i] <- (((0.1500*(10^(0.8840+(0.000903*Hg[i])))/1000)/Hg[i])*GSI)/M + # = dw/ww conversion * gonad ~ body conc'n function (ng/g) / convert to ug/g + # then express as Q times GSI gives K / M gives daily K + + EGK[i] <- (Ed[i] + GHg[i] + (K[i]*Mat)) + + Expegk[i] <- exp(-1*EGK[i]) + + bio <- cbind(W, C, ASMR, SMR, A, F, U, SDA, Gr, Ed, GHg, EGK, Hg) + + } + + #warnings() + + dimnames(bio) <- list(NULL, c("W", "C", "ASMR", "SMR", "A", "F", "U", "SDA", "Gr", "Ed", "GHg", "EGK", "Hg")) + + bioday <- cbind(jday, W, C, ASMR, SMR, A, F, U, SDA, Gr, Ed, GHg, EGK, Hg) + + dimnames(bioday) <- list(NULL, c("jday", "W", "C", "ASMR", "SMR", "A", "F", "U", "SDA", "Gr", "Ed", "GHg", "EGK", "Hg")) + + #bioday + + Wtmod <- bioday[length(W),2] + Wtmod + + Hgtmod <- bioday[length(Hg),14] + Hgtmod + + q + + f <- (((Wt-Wtmod)^2 + (Hgt-Hgtmod)^2)/2)^2 ; f + + #warnings() + + #write.table(bioday, file = "perch.csv", append = FALSE, sep = ",", na = "NA", col.names = TRUE) + + + + #nlm(f,c(1,1)) + } optim(q, f, method = "L-BFGS-B", + lower = c(0, 0), upper = c(2, 10)) Error in optim(q, f, method = "L-BFGS-B", lower = c(0, 0), upper = c(2, : L-BFGS-B needs finite values of 'fn' Execution halted -- Michael Rennie M.Sc. Candidate University of Toronto at Mississauga 3359 Mississauga Rd. N. Mississauga ON L5L 1C6 Ph: 905-828-5452 Fax: 905-828-3792
Re: [R] solving multiple parameter starting values in nlm
Hi, there At 12:28 AM 7/8/03 -0700, Spencer Graves wrote: Did you work through the examples at the end of the nlm documentation? The first example there shows that nlm expects a single vector argument over which f is to be minimized. This suggests that your first construction should not work, while your second should be closer. I've gone through and now attempted a number of varieties on this theme. Stated generally, the newest variation I am hung up on is: x <- NULL y <- NULL q <- c(x, y) f <- function (q) { { for (i in 1:length(day)) { ... I then use x and y in my loop for daily iterations, where each element is dependent upon the value of the previous iteration, and obtain values for c, d, and try to minimize the following function } f <- ((a^2 - c^2)^2 + (b^2 - d^2)^2)^2/2 ; f } nlm (f, 1, r=1) I've squared all the results, the differences, and the overall value as the function to minimize, such that the differences between the user input values (a, b) and the model output (c, d) are exaggerated so as to obtain a sufficient minimization. This function works on its own when performed after the daily iteration loop, but it doesn't even get that far in this version of the nlm loop. Also, the loop with only my daily iterations works fine (without the nlm loop), no warnings, all is good. But once I embed it within the loop for function f, my program stalls on the daily iterations. I get error messages that seem to indicate that it isn't even going through the first iteration properly. If, however, I code my input variables as q[1] and q[2] in the loop, instead of x, y, it solves the loop correctly, but comes back with a billion warnings that the number of items to replace is not a multiple of replacement length. I think that it wants my q vector to be the same length as the iterations in the daily loop, 365, but gets mad when it isn't, because it possesses only 2 elements, (x, y) or (q[1], q[2]).
As well, when I run the daily iteration loops OUTSIDE the nlm loop with the q[1] and q[2] coding, I get the same error messages (the number of items to replace is not a multiple of replacement length). Despite this, it goes through the program and reports a solution. My fear is that even though I am getting the right answers from the daily loop in this last version, if I am getting errors, then I shouldn't just simply ignore them and continue. Or should I? Are the errors simply a result of a technicality? Personally, I would feel better if I could do it without them. Also, did you try running your function by itself, checking to make sure it computed correctly? The final line for the second function as I read it below appears OUTSIDE the function definition as a stand-alone expression to be executed once between the function definition and the nlm call. Yes, or at least a modified version of it. See above. Also, have you tried optim? It seems to be more general and robust, requiring fewer assumptions to function. The default method for optim does not require the function to be differentiable, while nlm does. Yes, and optim is stalling in the same places and generating the same errors as 'nlm'. Any other thoughts or suggestions for what I'm doing wrong? hope this helps. spencer graves Michael Rennie wrote: Hi there I am having trouble figuring out how to get an nlm function to report estimates for two parameter values in an estimation. The way I've got it goes something like this: f <- function (q, r) { here, I have a second loop which uses q, r to give me values for c, d below. a and b are already specified; this loop is a mass-balance function where I am trying to find values of q, r to give me c and d. I want c and d (mass-balance model outputs) to approximate a and b (known endpoints that I want the model to attain), respectively.
Thus, the function I want to minimize is: fu <- ((a^2 - c^2) + (b^2 - d^2))/2 ; fu } nlm (f, 1, r=1) This doesn't return estimates for each parameter, and I get error messages saying something like 'replaced missing value with maximum' or something. All I did here was try (and fail, obviously) to model my needs against the examples given in help(nlm), but they are using vectors as inputs, and I only need 2 values input, and 2 returned. I've also tried specifying the function as f <- function (q) { mass-balance loop } fu <- ((a^2 - c^2) + (b^2 - d^2))/2 ; fu nlm(f, c(1,1)) specifying c(1,1) as the starting values and coding q[1] and q[2] in my mass-balance loop instead of q, r. But this doesn't work either, and I get messages that I think are indicating that my mass-balance loop isn't computing properly (number of items to replace is not a multiple of replacement length- an error I don't get when I run the mass-balance loop outside the nlm function and just specify q, r). Does anyone know how to specify two parameters in the function, and ask for estimates of both back from nlm
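[Editor's minimal example of the calling convention Spencer describes above: nlm() and optim() want ONE function of ONE vector argument, returning ONE number; the parameters are the vector's elements. The quadratic objective here is invented purely to show the shape.]

```r
# One vector argument q; the two parameters are q[1] and q[2].
f <- function(q) (q[1] - 2)^2 + (q[2] + 1)^2

# nlm: starting values go in as one vector; estimates come back as one vector.
fit <- nlm(f, p = c(1, 1))
fit$estimate          # close to c(2, -1)

# Equivalent optim call; $par holds both estimates.
optim(c(1, 1), f)$par
```

There is no way to get nlm to treat `f(q, r)` as two scalar parameters; r would be passed along as a fixed extra argument, not optimized.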
[R] strange error message
Hi, there I have a loop that is producing data, but is also generating an error message that I can't understand. Here's the loop and the error message: bio <- matrix(NA, ncol=9, nrow=366) W <- NULL M <- length(Day) #number of days iterated for (i in 1:M) + { + + + if (Day[i]==1) W[i] <- Wo else W[i] <- (W[i-1]+(Gr[i]/Ef)) + + + C <- p*CA*(W^CB)*(((comp[,3])^Xc)*(exp(Xc*(1-(comp[,3])*Pc + + ASMR <- ACT*RA*(W^RB)*(((comp[,4])^Xa)*(exp(Xa*(1-(comp[,4]) + + SMR <- (ASMR/ACT) + + A <- (ASMR-SMR) + + F <- (FA*((comp[,2])^FB)*(exp(FG*p))*C) + + U <- (UA*((comp[,2])^UB)*(exp(UG*p))*(C-F)) + + SDA <- (S*(C-F)) + + Gr <- (C-(ASMR+F+U+SDA)) + + bio <- cbind(W, C, ASMR, SMR, A, F, U, SDA, Gr) + + } There were 50 or more warnings (use warnings() to see the first 50) warnings function (...) { if (!(n <- length(last.warning))) return() names <- names(last.warning) cat("Warning message", if (n > 1) "s", ":\n", sep = "") for (i in 1:n) { out <- if (n == 1) names[i] else paste(i, ": ", names[i], sep = "") if (length(last.warning[[i]])) { temp <- deparse(last.warning[[i]]) out <- paste(out, "in:", temp[1], if (length(temp) > 1) "...") } cat(out, ..., fill = TRUE) } } dimnames(bio) <- list(NULL, c("W", "C", "ASMR", "SMR", "A", "F", "U", "SDA", "Gr")) bio W C ASMR SMR A F U [1,] 9.20 233.6647 107.5640 64.50050 43.06345 31.93755 15.840142 Also, does anyone know why I might be getting differences in the same calculation between R and Excel? Is there any way to keep R from rounding your numbers, or to specify the number of decimal places you want for an element? -- Michael Rennie M.Sc. Candidate University of Toronto at Mississauga 3359 Mississauga Rd. N. Mississauga ON L5L 1C6 Ph: 905-828-5452 Fax: 905-828-3792
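[Editor's note: the recycling warnings in the loop above typically come from mixing a vector that grows one element per pass (W, length i) with full-length columns like comp[,3] (length 366). Preallocating and indexing everything by [i] avoids the mismatch. A toy sketch with invented growth, not the real model:]

```r
M  <- 366
W  <- numeric(M)      # preallocate instead of W <- NULL
Gr <- numeric(M)
Wo <- 9.2
Ef <- 1000

for (i in 1:M) {
  # every vector is read and written at index i, so lengths always agree
  W[i]  <- if (i == 1) Wo else W[i - 1] + Gr[i - 1] / Ef
  Gr[i] <- 0.05 * W[i]   # hypothetical daily growth term
}

length(W)   # 366, with no recycling warnings
```

(On the rounding question: R computes in double precision just like Excel; print() merely displays 7 significant digits by default. print(x, digits = 15) or options(digits = 15) shows more without changing the stored values.)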
[R] if else statements in R
Hi, there I am a grad student struggling to get my syntax right in a model I am building in R. I am trying to specify a variable, W, in a loop, such that on the first iteration it is equal to Wo, my starting value, which I have already specified with Wo <- 9.2 On any other iteration, I want W to equal the previous value of W in the iteration, plus an increment from that previous iteration, divided by another value. This then becomes the value used in all calculations in the iteration at hand. Here is my code, which does not work: W <- function(w) { if (comp[i,1]=1) W[i] <- Wo else W[i] <- (work[i-1,5]work[i-1,13]*/Ef) } work is the dataframe that I am rbinding my iterations into; the W variable is the 5th variable in that frame, and the additive to W is the 13th variable in that frame. However, this is where the program stalls on me: { + + W <- function(w) + { + if (comp[i,1]=1) W[i] <- Wo else Error: syntax error Execution halted I've also attempted this variation: W <- function(w) if (comp[i,1]=1) {W[i] <- Wo} else W[i] <- (work[i-1,5]work[i-1,13]*/Ef) which meets with an equal lack of success and stalls in the same spot as the previous version. Does anyone have any hints on how to make this if else statement work to do what I need? This is my first time trying to use these statements, and the S-plus manual I have is proving to be terribly unhelpful. What am I doing wrong? Mike Michael Rennie M.Sc. Candidate University of Toronto at Mississauga 3359 Mississauga Rd. N. Mississauga, ON L5L 1C6 Ph: 905-828-5452 Fax: 905-828-3792
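[Editor's sketch of the working shape of this if/else. Two fixes relative to the post: comparison is `==` (a bare `=` inside `if (...)` is the syntax error R reports), and no wrapper function is needed, since `if` is an expression whose value can be assigned. The `work` matrix here is invented filler:]

```r
Wo   <- 9.2
Ef   <- 1000
work <- matrix(1, nrow = 10, ncol = 13)   # stand-in for the real data frame
W    <- numeric(10)

for (i in 1:10) {
  # first iteration gets the starting value; later ones build on row i-1
  W[i] <- if (i == 1) Wo else work[i - 1, 5] + work[i - 1, 13] / Ef
}

W[1]   # 9.2
```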
[R] Constructing loops in which i+1st element depends on ith
I feel greedy posting for help twice in one day; please forgive me, but the thesis can't wait.

I have been trying to get an if/else statement working in a loop I am writing, in which I specify a variable value of 9.2 (Wo) on the first iteration, but for subsequent iterations have it adopt a value as a function of variables from the previous iteration. When I comment out the if/else statement, let W = Wo, and adjust the rest of the file so I am reading the first row of the variables specified instead of entire columns in my calculations, the calculations all work nicely without errors: I get the numbers I ought to in one row of bio. When I activate the loop and try to run the file with the if/else, it looks like it wants to work, but here's where it stalls:

Error in "[<-"(*tmp*, i, value = ((bio[i - 1, 1] * bio[i - 1, 9])/Ef)) :
        nothing to replace with
Execution halted

(see complete data file below)... I guess it's because it can't find the alternate case, because it hasn't written it yet. So, my problem is two-fold:

1. How can I write my loop properly so that at the end of each iteration, it records the row of data from bio such that I can cbind it back to my Day data?
2. How can I specify the loop to use information from the ith iteration in calculations for the i+1st?

I'm pretty sure my problem is in where I create and specify files in the loop, but beyond that, I don't know. Here's the part of my command file that's giving me trouble. Everything before this is simply specifying variable names to numbers, which is not giving me problems.
#Bring in temp file
temper <- scan("temp2.dat", na.strings = ".", list(Day=0, Temp=0))
#Day = day on which iteration begins, starts at day 1, ends on 366
#Temp = temperature
Day <- temper$Day ; Temp <- temper$Temp ; temp <- cbind(Day, Temp)
#temp [,2]
p <- 0.558626306252032
ACT <- 1.66764519286918
Vc <- ((CTM-temp[,2])/(CTM-CTO))
Vr <- ((RTM-temp[,2])/(RTM-RTO))
comp <- cbind(Day, Temp, Vc, Vr)
bio <- NULL
M <- length(Day)   #number of days iterated
for (i in 1:M)
{
weight <- function(Day)
{
W <- NULL
if (Day[i]==1) W[i] <- Wo else {W[i] <- ((bio[i-1,1]*bio[i-1,9])/Ef)
}
W
}
W <- weight(Day)
#W <- Wo
C <- p*CA*(W^CB)*((comp[,3]^Xc)*(exp(Xc*(1-comp[,3]))))*Pc
ASMR <- (ACT*RA*(W^(RB))*((comp[,4]^Xa)*(exp(Xa*(1-comp[,4])))))
SMR <- (ASMR/ACT)
A <- (ASMR-SMR)
F <- (FA*(comp[,2]^FB)*(exp(FG*p))*C)
U <- (UA*(comp[,2]^UB)*(exp(UG*p))*(C-F))
SDA <- (S*(C-F))
Gr <- (C-(ASMR+F+U+SDA))
bio <- rbind(c(W, C, ASMR, SMR, A, F, U, SDA, Gr))
dimnames(bio) <- list(NULL, c("W", "C", "ASMR", "SMR", "A", "F", "U", "SDA", "Gr"))
}
bio

I would be grateful for any suggestions people might have.

Thanks,

Mike
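One way around the "nothing to replace with" problem is to preallocate bio as a matrix before the loop and fill it one row per iteration, so that row i-1 already exists when row i is computed. A stripped-down sketch, using two placeholder columns and an invented growth rule rather than the real bioenergetics calculations:

```r
# Sketch: row i of bio depends on row i-1. Preallocate the whole matrix,
# seed row 1 from the starting value, then fill each later row from the
# previous one. Wo and Ef keep the poster's names; the columns and the
# value of Gr are simplified stand-ins for the real model.
Wo <- 9.2
Ef <- 2.0
M  <- 5                               # number of days iterated

bio <- matrix(NA_real_, nrow = M, ncol = 2,
              dimnames = list(NULL, c("W", "Gr")))

for (i in 1:M) {
  W  <- if (i == 1) Wo else (bio[i - 1, "W"] * bio[i - 1, "Gr"]) / Ef
  Gr <- 1.1                           # placeholder for the growth calculation
  bio[i, ] <- c(W, Gr)                # record this iteration's row
}
bio
```

Because every row exists (filled with NA) from the start, `bio[i - 1, ]` is always a valid index, and the finished matrix can be cbind-ed back onto Day afterwards.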
[R] Creating a loop that works....
Hi there,

First off, thanks to everyone who has helped me so far. Sorry to keep pestering you all. I'm including my code here, and I will comment down it where it is that I am having problems figuring out how to write this damn thing.

temper <- scan("temp2.dat", na.strings = ".", list(Day=0, Temp=0))
Read 366 records
Day <- temper$Day ; Temp <- temper$Temp ; temp <- cbind(Day, Temp)
#Day = number of days modelled, Temp = daily avg. temp.
#temp [,2]
p <- 0.558626306252032
ACT <- 1.66764519286918
Vc <- ((CTM-temp[,2])/(CTM-CTO))
Vr <- ((RTM-temp[,2])/(RTM-RTO))
comp <- cbind(Day, Temp, Vc, Vr)
bio <- NULL
M <- length(Day)   #number of days iterated
for (i in 1:M)
+ {
+
+ weight <- function(Day)
+ {
+ W <- NULL
+ if (Day[i]==1) {W[i] <- Wo}
+ else if (Day[i]>1) {W[i] <- ((bio[i-1,1]*bio[i-1,9])/Ef)
+ }
+ W
+ }
+
+ W <- weight(Day)

The problem, as many of you have already identified, is right here. I hope I finally have the syntax right, but even if the if/else is coded properly, I don't think R can find the values in the second condition I list. I need W in each step of the iteration to change slightly, based on the mess of calculations below (which use parameters that I have already specified). After all the calculations are made, I hope to get values in bio[i,1] and bio[i,9] corresponding to the iteration that just occurred, then return to the top of the loop to combine them into the value of W for the next iteration.

What I think is happening here is that R is looking for values in the condition before they are actually there; the way I've written it, they can't be there until I get past the conditional step. Which means I am coding this all wrong. That said, I'm not sure how else to do it: the value of W in the next iteration is dependent on the values of Gr and W in the previous iteration, with the exception of the first one (Day=1). I've tried defining bio as

bio <- matrix(NA, ncol=9, nrow=366)

but that doesn't help either. Perhaps my rbind at the end of the file is incorrect?
I think maybe I'm getting mixed up between calculating vectors and values: should I be specifying [i] for everything below where I am now specifying vectors?

+ #W <- Wo
+
+ C <- p*CA*(W^CB)*((comp[,3]^Xc)*(exp(Xc*(1-comp[,3]))))*Pc
+
+ ASMR <- (ACT*RA*(W^(RB))*((comp[,4]^Xa)*(exp(Xa*(1-comp[,4])))))
+
+ SMR <- (ASMR/ACT)
+
+ A <- (ASMR-SMR)
+
+ F <- (FA*(comp[,2]^FB)*(exp(FG*p))*C)
+
+ U <- (UA*(comp[,2]^UB)*(exp(UG*p))*(C-F))
+
+ SDA <- (S*(C-F))
+
+ Gr <- (C-(ASMR+F+U+SDA))
+ #Day, Temp, Vc, Vr, W, C, ASMR, SMR, A, F, U, SDA, Gr
+
+ bio <- rbind(c(W, C, ASMR, SMR, A, F, U, SDA, Gr))
+
+ dimnames(bio) <- list(NULL, c("W", "C", "ASMR", "SMR", "A", "F", "U", "SDA", "Gr"))
+
+ }
Error: length of 'dimnames' [2] not equal to array extent
Execution halted
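On the final error: if W, C, ASMR and the rest are each vectors of length M, then c(W, C, ...) strings them end to end into one long vector, so the single row built by rbind() has 9*M entries and the nine dimnames no longer match the array extent. Computing scalars inside the loop (indexing comp[i,2] rather than taking the whole column comp[,2]) and appending one named row per iteration avoids this. A sketch with invented placeholder formulas, not the real model:

```r
# Sketch: per-iteration scalars, one named row appended to bio each pass.
# Wo and Ef keep the poster's names; comp is a tiny made-up stand-in for
# the real Day/Temp matrix, and the C and Gr formulas are placeholders.
Wo <- 9.2
Ef <- 2.0
comp <- cbind(Day = 1:4, Temp = c(4, 5, 6, 7))

bio <- NULL
for (i in 1:nrow(comp)) {
  W  <- if (i == 1) Wo else (bio[i - 1, "W"] + bio[i - 1, "Gr"]) / Ef
  C  <- 0.1 * comp[i, "Temp"]                  # scalar: uses row i only
  Gr <- C * W                                  # placeholder growth term
  bio <- rbind(bio, c(W = W, C = C, Gr = Gr))  # append one named row
}
bio
```

Naming the elements in c(W = W, C = C, Gr = Gr) gives the matrix its column names automatically, so no separate dimnames() call is needed and the extents always match.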