[R] Markov chain resources and questions
Can someone give me a pointer to where I should be looking for Markov chain resources in R? Longer term, I am also interested in the question of whether explanatory variables can be coupled to a probability transition matrix to assist in predicting the next state that an object/system will go into. I'm imagining that this gets kind of ugly when you have nominal data (i.e., the next state) that you are trying to predict using a transition matrix and you want to try to boost your predictive power by incorporating other regressor variables. Can this be done, how, and is there something in R that does this? Another issue is how to assess the potential usefulness of the probability transition matrix and the corresponding frequency matrix. If your frequency matrix only has a few observations in each cell then it is not too useful for predictive purposes. Also, if the probabilities are close to .50 in all the cells, again it is not useful because it is not informative about the next state. Is there research that speaks to the issue of assessing the utility or informativeness of the transition matrix for predictive purposes? Regards, Paul Meagher __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
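As a concrete starting point for the estimation side of the question, here is a minimal sketch (not from the thread; the state sequence is made up) of building a frequency matrix and a row-normalised transition matrix in base R:

```r
## Count observed first-order transitions, then row-normalise to get
## an estimated transition matrix. The state sequence is invented.
states <- c("A", "B", "A", "A", "C", "B", "A", "C", "C", "A", "B", "B")
from <- states[-length(states)]   # state at time t
to   <- states[-1]                # state at time t+1
freq <- table(from, to)           # the frequency matrix
P <- prop.table(freq, margin = 1) # rows sum to 1
```

Sparse cells in `freq` are exactly the situation the poster worries about: `prop.table` happily normalises a row with only a couple of observations, so inspecting `rowSums(freq)` before trusting `P` is a cheap sanity check.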
Re: [R] Markov chain resources and questions
I just got several hits from www.r-project.org - search - R site search for "Markov chain", "Markov chain estimation", etc. Have you tried that? Hope this helps. spencer graves
Re: [R] Markov chain resources and questions
On 19 Oct 2003 at 11:02, Paul Meagher wrote: You could look at chapter 5 of Jim Lindsey's online document "The Statistical Analysis of Stochastic Processes in Time" at his website, www.luc.ac.be/~jlindsey. At this site there is also a collection of R functions for his examples. Kjetil Halvorsen
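On the poster's follow-up question of coupling regressors to the transition probabilities, one standard approach — offered here as an assumption, not something endorsed in the thread — is to model the next state as a multinomial logit of the current state plus covariates, e.g. with nnet::multinom. A sketch on simulated data (the data-generating mechanism is entirely made up):

```r
## Predict the next state from the current state plus a covariate x.
library(nnet)
set.seed(1)
n   <- 500
cur <- factor(sample(c("A", "B"), n, replace = TRUE))  # current state
x   <- rnorm(n)                                        # extra regressor
pA  <- plogis(ifelse(cur == "A", 0.5, -0.5) + x)       # P(next is "A")
nxt <- factor(ifelse(runif(n) < pA, "A", "B"))         # next state
fit <- multinom(nxt ~ cur + x, trace = FALSE)
## Transition probabilities now depend on x:
predict(fit, newdata = data.frame(cur = factor("A", levels = c("A", "B")),
                                  x = 1), type = "probs")
```

With only two states this reduces to logistic regression; with more states, multinom estimates one set of coefficients per non-reference next state, so each row of the transition matrix becomes a function of the covariates.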
[R] problem with win.metafile( )
R1.8.0, Win2k: When I paste the code

win.metafile(file = "//.../plot1.wmf", width = 8.5, height = 6.25)
lset(list(background = list(col = "white")))
xyplot(y ~ x | ID, data = Group1,
       scales = list(alternating = FALSE),
       ylim = c(.75, y.max),
       panel = function(x, y, panel.number, ...)
       {
           panel.superpose(x = Group1$TIME, y = Group1$y,
                           subscripts = TRUE, groups = Group1$ID,
                           type = 'l', col = gray(.6))
           sup.sym <- trellis.par.get("superpose.symbol")
           panel.xyplot(x, y, type = 'b', col = "black",
                        lwd = 4, cex = 1.5, ...)
           panel.loess(x = Group1$TIME, y = Group1$y,
                       col = gray(.2), lwd = 3, lty = 5, span = 1/3)
       })
dev.off()

into R1.8.0 I get the following messages:

> lset(list(background = list(col = "white")))
Error in get(x, envir, mode, inherits) :
        variable "win.metafile://.../plot1.wmf" was not found
> xyplot( .. )
Error in get(x, envir, mode, inherits) :
        variable "win.metafile://.../plot1.wmf" was not found
> dev.off()
null device
          1

If I change the first command in this script to

trellis.device(postscript, file = "//.../plot1.ps", color = TRUE)

I get no errors. I also get no errors if I run the above script in R1.7.1. Are there any known issues with win.metafile( ) in R1.8.0? Should I reinstall? Any help would be appreciated. -david paul
Re: [R] problem with win.metafile( ): traceback()
For the first error message:

> win.metafile(file = "//.../plot1.wmf",
+              width = 8.5, height = 6.25)
> lset(list(background = list(col = "white")))
Error in get(x, envir, mode, inherits) :
        variable "win.metafile://.../plot1.wmf" was not found
> traceback()
4: get(device)
3: trellis.device(device = .Device, new = FALSE)
2: trellis.par.get(item)
1: lset(list(background = list(col = "white")))

For the second error message:

> xyplot(y ~ x | ID, data = Group1,
+        scales = list(alternating = FALSE),
+        ylim = c(.75, y.max),
+        panel = function(x, y, panel.number, ...)
+        {
+            panel.superpose(x = Group1$TIME, y = Group1$y,
+                            subscripts = TRUE, groups = Group1$ID,
+                            type = 'l', col = gray(.6))
+            sup.sym <- trellis.par.get("superpose.symbol")
+            panel.xyplot(x, y, type = 'b', col = "black",
+                         lwd = 4, cex = 1.5, ...)
+            #panel.loess(x = Group1$TIME, y = Group1$y,
+            #            col = gray(.2), lwd = 3, lty = 5, span = 1/3)
+        })
Error in get(x, envir, mode, inherits) :
        variable "win.metafile://.../plot1.wmf" was not found
> traceback()
6: get(device)
5: trellis.device(device = .Device, new = FALSE)
4: trellis.par.get("add.text")
3: trellis.skeleton(aspect = "fill", strip = TRUE,
       panel = function(x, y, panel.number, ...) {
           panel.superpose(x = Group1$TIME, y = Group1$y,
                           subscripts = TRUE, groups = Group1$ID,
                           type = "l", col = gray(0.6))
           sup.sym <- trellis.par.get("superpose.symbol")
           panel.xyplot(x, y, type = "b", col = "black",
                        lwd = 4, cex = 1.5, ...)
       })
2: do.call("trellis.skeleton", c(list(aspect = aspect, strip = strip,
       panel = panel), dots))
1: xyplot(y ~ x | ID, data = Group1, scales = list(alternating = FALSE),
       ylim = c(0.75, y.max), panel = function(x, y, panel.number, ...) {
           panel.superpose(x = Group1$TIME, y = Group1$y,
                           subscripts = TRUE, groups = Group1$ID,
                           type = "l", col = gray(0.6))
           sup.sym <- trellis.par.get("superpose.symbol")
           panel.xyplot(x, y, type = "b", col = "black",
                        lwd = 4, cex = 1.5, ...)
       })

-----Original Message-----
From: Paul, David A
Sent: Sunday, October 19, 2003 12:28 PM
To: '[EMAIL PROTECTED]'
Subject: [R] problem with win.metafile( )
Re: [R] problem with win.metafile( ): traceback()
On Sunday 19 October 2003 11:49, Paul, David A wrote:

> For the first error message:
>
> > win.metafile(file = "//.../plot1.wmf",
> +              width = 8.5, height = 6.25)
> > lset(list(background = list(col = "white")))
> Error in get(x, envir, mode, inherits) :
>         variable "win.metafile://.../plot1.wmf" was not found

Could you check what the value of the .Device variable (and .Devices as well) is at this point? And not that it should matter, but what happens if you use

trellis.device(win.metafile, file = "//.../plot1.wmf", width = 8.5, height = 6.25)

Deepayan
Re: [R] Strange behaviour
Hi. [EMAIL PROTECTED] wrote: As an absolute beginner I'm reading and practicing with the Verzani doc to learn R. Now, being an expert LaTeX user who wants to integrate the graphical capabilities of R and LaTeX, using the Simple library and the simple.scatterplot examples I had a go at: 1) Including the resulting graph into a doc.snw then compiled through Sweave + latex; 2) Producing the graph in pdf format directly (using pdf(filename) at the very beginning, before issuing the simple.scatterplot command) and including it into the LaTeX file via \includegraphics. In both cases the graph is a set of numbered rectangles. Investigating the pdf files generated by R, I found that they are made of 2 pages: the first contains those nasty rectangles while the second contains the right graph that should be inserted. The same result comes out if I try the layout command example in the help of the same package. My question is: is there a way to tell R to produce the second page only as a final pdf file? If not, any suggestion for a way out? The nasty rectangles are the output of the layout.show() function. This function draws a simple diagram (consisting of nasty rectangles) to indicate the regions that a call to layout() has set up. It is designed to help users understand what on earth the layout() function is doing. (It is NOT a necessary part of setting up an arrangement of plots using the layout() function.) I suspect that the author of simpleR may have accidentally left the layout.show() call in simple.scatterplot() when copying the example from the layout() help file (apologies to John Verzani if this is an unfair diagnosis). So the immediate solution to your problem is to remove the line

layout.show(nf)

from simple.scatterplot(). The output should then be a single page, which should include OK in LaTeX. The larger problem of how to get at individual pages of output is probably best solved using something like the onefile argument to devices.
For example, look at the files produced by

pdf(onefile = FALSE)
example(layout)

and at the help page for pdf() to see more about how to do this. Hope that helps Paul -- Dr Paul Murrell Department of Statistics The University of Auckland Private Bag 92019 Auckland New Zealand 64 9 3737599 x85392 [EMAIL PROTECTED] http://www.stat.auckland.ac.nz/~paul/
[R] Oceanographic lattice plots
System info: Red Hat 9.0, R Version 1.8.0, ESS 5.1.21, Emacs 21.2.1 --- Hello. I've been working with Paul Murrell here in New Zealand to develop plots of temperature and density profiles at 22 stations spanning a frontal zone, and then overlaying the isothermal and mixed layer depths (similar to Kara et al. 2003 JGR Oceans 108(C3) pg. 24-5, figure 2). Paul Murrell has a package called gridBase which works with R 1.8.0 that does the job nicely. I've put the plot on a publicly accessible ftp site: ftp.niwa.co.nz/incoming/mcclatchie/misc/ocean.plot.pdf Here is the code used to generate the plot that Paul developed with my collaboration. ---

panel.t.s.profiles.mld.paul.alternate <- function()
{
  ## Purpose:
  ## --
  ## Arguments:
  ## --
  ## Author: Paul Murrell & Sam McClatchie, Date: 16 Oct 2003, 09:43
  require(gridBase)
  ## zt, p, zsigma, and lat are in the workspace
  ## ILD data generated by calculate.MLD.and.ILD(0.8) etc
  ## ILD values are in the workspace; convert to positive values
  ILD0.2 <- -ILD0.2
  ILD0.2 <- matrix(c(ILD0.2, NA, NA), ncol = 8, byrow = TRUE)
  ILD0.2.vec <- c(ILD0.2[1, ], ILD0.2[2, ], ILD0.2[2, ])
  ILD0.5 <- -ILD0.5
  ILD0.5 <- matrix(c(ILD0.5, NA, NA), ncol = 8, byrow = TRUE)
  ILD0.5.vec <- c(ILD0.5[1, ], ILD0.5[2, ], ILD0.5[2, ])
  ILD0.8 <- -ILD0.8
  ILD0.8 <- matrix(c(ILD0.8, NA, NA), ncol = 8, byrow = TRUE)
  ILD0.8.vec <- c(ILD0.8[1, ], ILD0.8[2, ], ILD0.8[2, ])
  ## dummy MLD data
  MLD0.5 <- ILD0.5 * 0.75
  MLD0.5.vec <- c(MLD0.5[1, ], MLD0.5[2, ], MLD0.5[2, ])
  nrow <- 3
  ncol <- 8
  grid.newpage()
  par(xpd = NA, col.axis = "grey")
  # Use grid to make a layout of rows and columns.
  # Each row consists of a plot region with a label area on top
  # (i.e., a Trellis-like arrangement), hence the nrow*2.
  # The label area is 1 line high; the plot areas
  # consume the remaining height.
  # Maybe you could add extra rows and cols to this layout to
  # create small gaps between each plot.
  # This layout is within a viewport which leaves margins for axes
  # and labels.
  push.viewport(viewport(x = unit(4, "lines"), y = unit(4, "lines"),
                         width = unit(1, "npc") - unit(6, "lines"),
                         height = unit(1, "npc") - unit(6, "lines"),
                         just = c("left", "bottom"),
                         layout = grid.layout(nrow*2, ncol,
                             heights = unit(rep(c(1, 1), nrow),
                                            rep(c("lines", "null"), nrow)))))
  for (i in 1:nrow) {
    for (j in 1:ncol) {
      index <- (i - 1)*ncol + j
      evenrow <- i %% 2 == 0
      evencol <- j %% 2 == 0
      if (index < 23) {
        # Go to plot region i, j
        # Set the yscale for doing the overlay later
        push.viewport(viewport(layout.pos.row = i*2, layout.pos.col = j,
                               yscale = c(400, 0)))
        grid.rect(gp = gpar(col = "grey"))
        # Draw first plot
        # Here's where we use gridBase to put a plot into a grid viewport
        # The par(plt=gridPLT()) makes the plotting region line up with
        # the current grid viewport (pushed two lines ago)
        par(plt = gridPLT(), new = TRUE, cex = 0.8)
        plot(zt[, index], -p[, 1], ylim = c(400, 0), xlim = c(6, 15),
             type = 'l', xlab = "", ylab = "", axes = FALSE, xpd = FALSE)
        # Draw axes (only do some)
        if (j == 1 && !evenrow) axis(2, col = "grey")
        if (i == nrow && !evencol) axis(1, col = "grey")
        par(new = TRUE)
        # Draw second plot
        plot(zsigmat[, index], -p[, 1], ylim = c(400, 0), type = 'l', lty = 2,
             xlab = "", ylab = "", xlim = c(26.1, 27.4),
             axes = FALSE, xpd = FALSE)
        # Draw axes
        if ((j == ncol || index == 22) && evenrow) axis(4, col = "grey")
        pop.viewport()
        # Draw the latitude labels
        push.viewport(viewport(layout.pos.row = i*2 - 1, layout.pos.col = j,
                               gp = gpar(col = "grey", fill = "light grey")))
        grid.rect()
        grid.text(round(lat[index, ], digits = 2), gp = gpar(col = "white"))
        # Draw top axes
        if (i == 1 && evencol) {
          par(plt = gridPLT())
          axis(3, col = "grey")
        }
        pop.viewport()
      }
    }
  }
  # Overlay mixed layer depth lines.
  # Here we use some grid functions to do drawing.
  # The 0.5 means half way across the region;
  # "native" means that the value is relative
  # to the yscale we set up when we created the viewport.
  for (i in 1:nrow) {
    for (j in 1:ncol) {
      index <- (i - 1)*ncol + j
      if (index < 23) {
        push.viewport(viewport(layout.pos.row = i*2, layout.pos.col = j,
                               yscale = c(400, 0)))
        # Do a move.to in the first column and a line.to otherwise
        if (j == 1)
          grid.move.to(0.5, unit(MLD0.5[i, j], "native"))
        else
          grid.line.to(0.5, unit(MLD0.5[i, j], "native"),
                       gp = gpar(lty = "dotted"))
        pop.viewport()
      }
    }
  }
  # Draw the overlay points last to overwrite the overlay lines
  for (i in 1:nrow) {
    for (j in 1:ncol) {
      index <-
[R] lattice error
Hello, I tried to open lattice, but I get the following error:

> library(lattice)
Error in loadNamespace(i, c(lib.loc, .libPaths()), keep.source) :
        package `grid' does not have a name space
Error in library(lattice) : package/namespace load failed

I retyped it after loading grid but the same message appeared. Could it be that it is caused by a recently done update.packages()? Any idea what might cause this problem? Using

> R.version.string
[1] "R version 1.7.1, 2003-06-16"

Thanks in advance, Martin
Re: [R] lattice error
I started to notice this when I moved to 1.8.0 and, e.g., load MASS. Rob
[R] Running RMySQL with SuSE 8.2?
Since there doesn't appear to be an RMySQL rpm for SuSE 8.*, does anyone know if the 7.3 version will work with the SuSE 8.2 rpms of R and DBI? The package installs without complaint, but when I try to run

con <- dbConnect(dbDriver("MySQL"), dbname = "test")

I get the error

Error in dbConnect(dbDriver("MySQL")) : couldn't find function ".valueClassTest"

(This is my first attempt to access an RDBMS from R, so I could be doing something else wrong.) Any ideas as to what might be generating this error, or as to combinations of rpms that will work under SuSE 8.2, would be appreciated. (I took a stab at compiling RMySQL from src, but I don't have the MySQL src installed and I'd rather not get involved in this if I can avoid it.) Thanks, Barnet Wagman
Re: [R] help with legend()
Hi. Deepayan Sarkar wrote: On Friday 17 October 2003 02:20, Martin Maechler wrote: >>>>> "PaulSch" == Schwarz, Paul [EMAIL PROTECTED] on Wed, 15 Oct 2003 12:09:11 -0700 writes: PaulSch> I am converting some S-PLUS scripts that I use for creating manuscript figures to R so that I can take advantage of the plotmath capabilities. In my S-PLUS scripts I like to use the key() function for adding legends to plots. AFAIK key() in S+ is from the trellis library section. The corresponding R package, trellis (^^^ lattice, actually :-)), has a draw.key() function that may work similarly to S-PLUS' key() {Deepayan?}. That's correct. Of course, the S-PLUS key() works with non-trellis graphs as well, whereas draw.key() will produce a grid object and hence work with grid graphics only. (I haven't checked Paul's new gridBase package; that may enable using this for base graphics as well.) gridBase makes it possible, although it takes a little bit of work. Here's a simple example.

## First a standard base plot (mangled example from lattice):
data(OrchardSprays)
attach(OrchardSprays)
tmt <- sort(as.numeric(treatment))
dec <- decrease[order(as.numeric(treatment))]
row <- rowpos[order(as.numeric(treatment))]
plot(tmt, dec, type = "n")
for (i in unique(row)) {
    subset <- row == i
    lines(tmt[subset], dec[subset], col = i)
}
## Now load lattice (to produce the key) and gridBase (to combine the
## lattice key with the base plot):
library(lattice)
library(gridBase)
## Align grid viewports with base plot:
par(new = TRUE)
vps <- baseViewports()
push.viewport(vps$inner, vps$figure, vps$plot)
## Create lattice key:
key <- draw.key(list(lines = list(col = 1:8),
                     text = list(lab = as.character(unique(OrchardSprays$rowpos))),
                     columns = 4,
                     title = "Row position",
                     background = par("bg"), border = TRUE))
## Use a grid viewport to position the key 3mm in from the top-left
## corner of the plot (NOTE this doesn't quite work properly -- the
## width and height of the key are not calculated correctly
## [Deepayan: It's an error in grid and I'm working on a fix]):
push.viewport(viewport(x = unit(3, "mm"),
                       y = unit(1, "npc") - unit(3, "mm"),
                       width = unit(1, "grobwidth", key),
                       height = unit(1, "grobheight", key),
                       just = c("left", "top")))
grid.draw(key)
# This just shows where the viewport is
# and shows how it is too big for the key
grid.rect(gp = gpar(col = "grey"))
## Clean up:
pop.viewport(4)

Paul -- Dr Paul Murrell Department of Statistics The University of Auckland Private Bag 92019 Auckland New Zealand 64 9 3737599 x85392 [EMAIL PROTECTED] http://www.stat.auckland.ac.nz/~paul/
Re: [R] lattice error
On Sunday 19 October 2003 16:31, Martin Wegmann wrote:

> could it be that it is caused by a recently done update.packages()?

Yes. You haven't upgraded R to 1.8.0, but are trying to use a version of lattice meant to work with 1.8.0 and above. (It would probably have worked if grid had been updated as well, but I think that won't happen due to some internal restructuring.) Deepayan
[R] Fitting a Weibull/NaNs
I'm trying to fit a Weibull distribution to some data via maximum likelihood estimation. I'm following the procedure described by Doug Bates in his "Using Open Source Software to Teach Mathematical Statistics", but I keep getting warnings about NaNs being converted to maximum positive value:

llfunc <- function(x) { -sum(dweibull(AM, shape = x[1], scale = x[2], log = TRUE)) }
mle <- nlm(llfunc, c(shape = 1.5, scale = 40), hessian = TRUE)

Warning messages:
1: NaNs produced in: dweibull(x, shape, scale, log)
2: NA/Inf replaced by maximum positive value
3: NaNs produced in: dweibull(x, shape, scale, log)
4: NA/Inf replaced by maximum positive value

Can someone offer some advice here? Thanks, -Ekr -- [Eric Rescorla [EMAIL PROTECTED] http://www.rtfm.com/
Re: [R] Fitting a Weibull/NaNs
I have not used nlm, but that happens routinely with function minimizers trying to test negative values for one or more components of x. My standard approach to something like this is to parameterize llfunc in terms of log(shape) and log(scale), as follows:

llfunc <- function(x) { -sum(dweibull(AM, shape = exp(x[1]), scale = exp(x[2]), log = TRUE)) }

Have you tried this? If not, I suspect the warnings will disappear when you do. If they don't, I suggest you rewrite llfunc to store nlglk <- (-sum(...)) and then print out x whenever nlglk is NA, Inf, or NaN. Hope this helps. spencer graves
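A self-contained version of the log-parameterisation suggested above, with simulated data standing in for the poster's AM vector (the true shape/scale values below are arbitrary):

```r
## Fit a Weibull by ML on the log(shape), log(scale) scale.
set.seed(42)
AM <- rweibull(200, shape = 1.5, scale = 40)  # stand-in for the real data
llfunc <- function(x) {
  -sum(dweibull(AM, shape = exp(x[1]), scale = exp(x[2]), log = TRUE))
}
## optimise on the log scale, then back-transform
mle <- nlm(llfunc, c(shape = log(1.5), scale = log(40)), hessian = TRUE)
exp(mle$estimate)  # estimated shape and scale
```

Because exp() maps the whole real line onto (0, Inf), nlm can never propose an invalid shape or scale, which is why the NaN warnings disappear.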
[R] Processing logic for Huge Data set
Hello All, I am new to R. I am trying to process a huge data set: a matrix containing four columns, say x1, x2, x3, x4, and n rows. I want to aggregate the matrix by x1 and compute statistics based on columns x2, x3, x4. I tried the aggregate function but it gave me a memory allocation error (which did not surprise me), so I ended up performing a for loop over the values of x1 and subsetting the matrix based on x1. However, I have a hunch that there should be a less expensive way of doing this processing. Any ideas or tips to optimize this processing logic would be greatly appreciated. Manoj
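One cheaper pattern than looping over subsets, offered as a sketch on simulated data (column names and sizes are invented): rowsum() makes a single pass over the matrix to produce per-group column sums, and dividing by the group counts gives per-group means.

```r
## Group-wise column sums and means without aggregate() or a loop.
set.seed(1)
n <- 1e5
m <- cbind(x1 = sample(1:50, n, replace = TRUE),  # grouping column
           x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n))
sums   <- rowsum(m[, c("x2", "x3", "x4")], group = m[, "x1"])
counts <- as.vector(table(m[, "x1"]))
means  <- sums / counts   # group-wise means of x2, x3, x4
```

Both rowsum() and table() order the groups by sorted unique value of x1, so the division lines up row by row; other statistics (medians, variances) can be had from tapply() over one column at a time, which avoids materialising all the subsets at once.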
Re: [R] Fitting a Weibull/NaNs
Spencer Graves [EMAIL PROTECTED] writes:

> My standard approach to something like this is to parameterize llfunc
> in terms of log(shape) and log(scale), as follows:
>
> llfunc <- function(x) { -sum(dweibull(AM, shape = exp(x[1]),
>                                       scale = exp(x[2]), log = TRUE)) }
>
> Have you tried this? If not, I suspect the warnings will disappear.

This works. I've got some more questions, though: (1) Does it introduce bias to work with the logs like this? (2) My original data set had zero values. I added .5 experimentally, which is how I got to this data set. This procedure doesn't work on the original data set. Instead I get (with the numbers below being the values that caused problems):

[1] 0.41 3.70 1.00
[1] 0.41 3.70 1.00
[1] 0.410001 3.70 1.00
[1] 0.41 3.74 1.00
[1] 0.41 3.70 1.01

Warning messages:
1: NA/Inf replaced by maximum positive value
2: NA/Inf replaced by maximum positive value
3: NA/Inf replaced by maximum positive value
4: NA/Inf replaced by maximum positive value

Thanks, -Ekr -- [Eric Rescorla [EMAIL PROTECTED] http://www.rtfm.com/
Re: [R] Fitting a Weibull/NaNs
If the algorithm works properly, you should get exactly the same answer using a linear or a log scale for the parameters. The bigger question is not bias but the accuracy of a normal approximation for confidence intervals and regions. I have evaluated this by making contour plots of the log(likelihood). I use outer to compute this over an appropriate grid of the parameters. Then I use contour [or image with contour(..., add = TRUE)] to see the result. After I get a picture, I may specify the levels, using, e.g., the fact that 2*log(likelihood ratio) is approximately chi-square with 2 degrees of freedom. The normality assumption says that the contours should be close to elliptical. I've also fit log(likelihood) to a parabola in the parameters, possibly after deleting points beyond the 0.001 level for chi-square(2). If I get a good fit, I'm happy. If not, I try a different parameterization. When I've done this, I've found that I tend to get more nearly normal contours by moving the constraint to (-Inf) rather than leaving it at 0, i.e., by working on the log scale. Bates and Watts (1988) Nonlinear Regression Analysis and Its Applications (Wiley) explain that parameter-effects curvature seems to be vastly greater than the intrinsic curvature of the nonlinear manifold onto which a response vector is projected by nonlinear least squares. This is different from maximum likelihood, but I believe that this principle would still likely apply. Does this make sense? spencer graves

p.s. I don't understand what you are saying about "0.41 3.70 1.00" below. You are giving me a set of three numbers when you are trying to estimate two parameters and getting NAs, Infs and NaNs. I don't understand. Are you printing out x when the log(likelihood) is NA, NaN or Inf? If yes, is one component of x = 0?
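A small sketch of the contour check described above, on simulated Weibull data (the grid ranges are arbitrary, not from the thread):

```r
## Contour the Weibull log-likelihood over a (shape, scale) grid.
set.seed(1)
AM <- rweibull(100, shape = 1.5, scale = 40)
ll <- function(shape, scale) sum(dweibull(AM, shape, scale, log = TRUE))
shapes <- seq(1.0, 2.2, length = 40)
scales <- seq(30, 55, length = 40)
z <- outer(shapes, scales, Vectorize(ll))  # ll is scalar, so Vectorize it
## contours at roughly the 50% and 95% likelihood-ratio levels
contour(shapes, scales, z,
        levels = max(z) - qchisq(c(0.5, 0.95), df = 2) / 2,
        xlab = "shape", ylab = "scale")
```

Roughly elliptical contours support the normal approximation for intervals; strongly banana-shaped ones suggest trying another parameterization, as Spencer describes.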
Re: [R] Fitting a Weibull/NaNs
Spencer Graves [EMAIL PROTECTED] writes:

> Bates and Watts (1988) Nonlinear Regression Analysis and Its
> Applications (Wiley) explain that parameter-effects curvature seems to
> be vastly greater than the intrinsic curvature of the nonlinear
> manifold onto which a response vector is projected by nonlinear least
> squares. This is different from maximum likelihood, but I believe that
> this principle would still likely apply. Does this make sense?

Some :)

> p.s. I don't understand what you are saying about "0.41 3.70 1.00"
> below. You are giving me a set of three numbers when you are trying to
> estimate two parameters and getting NAs, Infs and NaNs. Are you
> printing out x when the log(likelihood) is NA, NaN or Inf? If yes, is
> one component of x = 0?

Doh! Typographical error to R. I had the hessian=TRUE clause inside the c(). Doesn't make any difference for the results, though. I'm doing the following:

> llfunc <-
+ function (zzz) {
+     tmp <- -sum(dweibull(d$Age.Month, shape = exp(zzz[1]),
+                          scale = exp(zzz[2]), log = TRUE))
+     if (is.infinite(tmp) | is.na(tmp)) { print(zzz) }
+     tmp
+ }
> mle <- nlm(llfunc, c(shape = .37, scale = 4.0), hessian = TRUE)
[1] 0.37 4.00
[1] 0.37 4.00
[1] 0.370001 4.00
[1] 0.37 4.04
[1] 0.3701 4.
[1] 0.3700 4.0004
[1] 0.3702 4.
[1] 0.3701 4.0004
[1] 0.3700 4.0008
Warning messages:
1: NA/Inf replaced by maximum positive value
2: NA/Inf replaced by maximum positive value
3: NA/Inf replaced by maximum positive value
4: NA/Inf replaced by maximum positive value
5: NA/Inf replaced by maximum positive value
6: NA/Inf replaced by maximum positive value
7: NA/Inf replaced by maximum positive value
8: NA/Inf replaced by maximum positive value

I'm a little vague on how this is supposed to work, but when I just compute

-sum(dweibull(d$Age.Month, shape = 1.5, scale = 40, log = TRUE))

I get Inf. The problem seems to be that some of the values of d$Age.Month are 0, and since the Weibull density is 0 at 0, the log likelihood comes out insane. (I'm getting 0 values due to quantization error.) OTOH, when I remove the 0 values it works great, but that seems kind of ad hoc. Is there some standard fix for this? Thanks much, -Ekr -- [Eric Rescorla [EMAIL PROTECTED] http://www.rtfm.com/
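One standard fix, offered as an assumption rather than thread consensus: treat the quantised ages as interval-censored, so a recorded 0 means "somewhere in [0, 1)" rather than exactly 0. survival::survreg fits a Weibull to interval-censored data directly; a sketch on simulated data:

```r
## Interval-censored Weibull fit for quantised (floored) observations.
library(survival)
set.seed(1)
age <- floor(rweibull(200, shape = 1.5, scale = 40))  # quantised; may contain 0s
lower <- ifelse(age == 0, NA, age)  # NA lower bound = left-censored on [0, 1)
upper <- age + 1
fit <- survreg(Surv(lower, upper, type = "interval2") ~ 1, dist = "weibull")
## survreg uses a location-scale parameterisation; translate back:
c(shape = 1 / fit$scale, scale = unname(exp(coef(fit))))
```

This uses the likelihood of the interval each observation actually fell in, so zeros contribute real information instead of a -Inf log-density, and no ad hoc constant need be added to the data.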