Re: [R] help with sockets in R
readLines() is for a text-mode connection; readChar() is for a binary-mode connection. Given that you asked for possible re-encoding by the 'encoding' argument, you cannot safely mix them (text-mode access is re-encoded, binary-mode is not). However, we don't know if re-encoding was active in your case since we don't know your locale. Either don't specify an encoding and re-encode the response in R or use readLines() to read the complete response and split it up later. For a different approach with read/write.socket() see tests/internet.R in the R sources. Please do note that the posting guide asked you for 'at a minimum' information (which includes the locale) and a reproducible example. On Tue, 21 Sep 2010, Christopher Bare wrote: Hi R gurus, I'm trying to use a ReSTful web service from within R. Specifically, I need to make HTTP PUT requests. I'm able to make the request and that goes well enough, but I'm having trouble properly consuming the HTTP response. The problem comes in when I'm trying to parse out the response body. I get the length of the response body from the Content-Length header. I then try to use readChar(con, nchars=content.length). The result I'm seeing is that the first few characters of the response body are cut off. My code looks like this: http.request - function(host, path, request, port=80) { con - socketConnection(host=host, port=port, open=w+, blocking=TRUE, encoding=UTF-8) writeLines(request, con) write(wrote request, stderr()) flush(stderr()) # build response object response - list() response$status - readLines(con, n=1) response$headers - character(0) repeat{ ss - readLines(con, n=1) write(ss, stderr()) flush(stderr()) if (ss == ) break key.value - strsplit(ss, :\\s*) response$headers[key.value[[1]][1]] - key.value[[1]][2] } if (any(names(response$headers)=='Content-Length')) { content.length - as.integer(response$headers['Content-Length']) response$body - readChar(con, nchars=content.length) } close(con) } response$body ends up with e,\id\:\some_doc\,\rev\:\7-906e06a7744780ef93126adc6f8f10ef\}\n when it should be: {\ok\:true,\id\:\some_doc\,\rev\:\7-906e06a7744780ef93126adc6f8f10ef\} Is mixing readLines and readChars on the same connection causing bad juju? If it is, what's the recommended what to do such a thing? Thanks for any help!! -Chris __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Package for calculating bandwidths
Thx so much, especially to Dennis. -- View this message in context: http://r.789695.n4.nabble.com/Package-for-calculating-bandwidths-tp2548091p2549796.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Help with mclust package
Hi, I am trying to run th mclust package on a variable Tuberculin indurations recorded as mm. The file has only one variable. When I run the package I get NULL value for mu and sigma. Can anybody say why? This is the program: library(mclust) mc-Mclust(x.trab,G=1:9,warn=TRUE) mc mc$mu sqrt(mc$sigmasq) and the output I get is library(mclust) mc-Mclust(x.trab,G=1:9,warn=TRUE) There were 11 warnings (use warnings() to see them) mc best model: unequal variance with 3 components mc$mu NULL sqrt(mc$sigmasq) Error in sqrt(mc$sigmasq) : Non-numeric argument to mathematical function warnings() Warning messages: 1: In meV(data = data, z = z, prior = prior, control = control, ... : sigma-squared falls below threshold 2: In mclustBIC(data = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, ... : there are missing groups 3: In meV(data = data, z = z, prior = prior, control = control, ... : sigma-squared falls below threshold 4: In mclustBIC(data = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, ... : there are missing groups 5: In meV(data = data, z = z, prior = prior, control = control, ... : sigma-squared falls below threshold 6: In mclustBIC(data = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, ... : there are missing groups 7: In meV(data = data, z = z, prior = prior, control = control, ... : sigma-squared falls below threshold 8: In mclustBIC(data = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, ... : there are missing groups 9: In meV(data = data, z = z, prior = prior, control = control, ... : sigma-squared falls below threshold 10: In mclustBIC(data = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, ... : there are missing groups 11: In meV(data = data, z = z, prior = prior, control = control, ... : sigma-squared falls below threshold Please help. Basilea Watson Biostatistician Tuberculosis Research Centre, India [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] extracting random effects from model formula
Hi R-users I would like to extract the random effects (1|SITE, 1+SPECIES|SITE and BA|SITE) from this model formula: Full_model - formula (VAR ~ (1|SITE) + (1+SPECIES|SITE) + (BA|SITE) + HEIGHT + COND + NN_DIST) I tried: terms(Full_model) labels(terms(Full_model)) but I could not distinguish between random and fixed effects. thanks Lorenzo [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] labels in (box)plot
Hi Dennis, Thanks for your answer (for some reason, the only one). I could do that for sure, but I think it looks better if the labels are all horizontal. Anyway, it looks like what I wanted to do is not really possible. I have to play with the arguments I already know... Regards, Ivan Le 9/21/2010 23:27, Dennis Murphy a écrit : Why don't you just rotate the labels 90 degrees? D. On Tue, Sep 21, 2010 at 7:23 AM, Ivan Calandra ivan.calan...@uni-hamburg.de mailto:ivan.calan...@uni-hamburg.de wrote: Dear users, I would like all the ticks on a boxplot (x and y) to be labeled I have checked all the par() arguments but couldn't find what I'm looking for Here is an example to show it: df - structure(list(SPECSHOR = structure(c(1L, 1L, 1L, 3L, 3L, 3L, 3L, 3L, 4L, 4L), .Label = c(cotau, dibic, eqgre, gicam), class = factor), Sq122.median = c(2.335835, 1.76091, 1.64717, 1.285505, 1.572405, 1.86761, 1.82541, 1.62458, 0.157813, 0.864523)), .Names = c(SPECSHOR, Sq122.median), class = data.frame, row.names = c(9L, 16L, 23L, 74L, 83L, 90L, 98L, 109L, 121L, 139L)) par(mfrow=c(2,2)) ## not necessary in that example, but I do need it with my real data boxplot(Sq122.median~SPECSHOR, data=df, horizontal=TRUE) ## the upper label on the y-axis (gicam) is not shown I know I can decrease the size of the labels by setting cex.axis=0.8, but I would prefer that all labels are always there, independent of their size, without me setting their size explicitly, and even if they then overlap. Am I clear? Is it possible, and how? Thanks in advance Ivan -- Ivan CALANDRA PhD Student University of Hamburg [[alternative HTML version deleted]] __ R-help@r-project.org mailto:R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Ivan CALANDRA PhD Student University of Hamburg Biozentrum Grindel und Zoologisches Museum Abt. Säugetiere Martin-Luther-King-Platz 3 D-20146 Hamburg, GERMANY +49(0)40 42838 6231 ivan.calan...@uni-hamburg.de ** http://www.for771.uni-bonn.de http://webapp5.rrz.uni-hamburg.de/mammals/eng/mitarbeiter.php [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] invalid 'row.names' length error when running scatterplots or plot in R Commander
Thank you everyone for your helpful feedback. To clarify a few things that people have been asking: Part of the problem comes from the fact that these are external students, so our conversations are over email and I'm not there to try things. I admit my shortcomings when it comes to R script writing. We use R Commander only in the course and play around with script within the Script Window when needed. I am realizing that R Commander can do things slightly differently to R itself. Scatterplot is a function within R commander and it sounds like it isn't a function in R itself--plot does the same thing. The error message that they received is the entire error message that R Commander produces. This dataset is one of ours and does not come from an R package. The problem happens on any dataset whether it is read from an .rda or .csv file. I've used these datasets with Rcmdr over the last year and several hundred students without problem before now. I don't teach attach() to the students, but had these two use it in this instance because Rcmdr wasn't recognizing dataset$variable. plot(Calf~Bicep, data=Mass) will also work. The students can correctly run stats using the data and the appropriate variables are present, etc, suggesting that the data are being read in OK. I will get the students to run the code in R itself to see if that works and also push for full output including str(), ls(), sessionInfo(), traceback() etc from the students and will report back. Cheers, Sam -Original Message- From: Dejian Zhao [mailto:dejian.z...@gmail.com] Sent: Monday, 20 September 2010 5:32 PM To: r-help@r-project.org Cc: Samantha McKenzie Subject: Re: [R] invalid 'row.names' length error when running scatterplots or plot in R Commander I did not reproduce the error either, because I failed to find the function scattperplot in your script. :D I suggest that you check whether they read the data into R correctly. On 2010-9-20 10:00, Samantha McKenzie wrote: Hello, I teach statistics and use R Commander for teaching. I have 2 students out of 169 that can't get scatterplots or plot to work. I have had them update packages, restart R/R Commander/their computers and even reinstall R/R Commander. One is using Windows 7 on a new pc and the other is a pc user (not sure the OS). They are both using R2.11.1 and R Commander 1.6-0. The data look like this: Mass Mass Fore Bicep Chest Neck Shoulder Waist Height Calf Thigh Head 1 77.0 28.5 33.5 100.0 38.5114.0 85.0 178.0 37.5 53.0 58.0 2 85.5 29.5 36.5 107.0 39.0119.0 90.5 187.0 40.0 52.0 59.0 3 63.0 25.0 31.0 94.0 36.5102.0 80.5 175.0 33.0 49.0 57.0 4 80.5 28.5 34.0 104.0 39.0114.0 91.5 183.0 38.0 50.0 60.0 5 79.5 28.5 36.5 107.0 39.0114.0 92.0 174.0 40.0 53.0 59.0 6 94.0 30.5 38.0 112.0 39.0121.0 101.0 180.0 39.5 57.5 59.0 7 66.0 26.5 29.0 93.0 35.0105.0 76.0 177.5 38.5 50.0 58.5 8 69.0 27.0 31.0 95.0 37.0108.0 84.0 182.5 36.0 49.0 60.0 9 65.0 26.5 29.0 93.0 35.0112.0 74.0 178.5 34.0 47.0 55.5 10 58.0 26.5 31.0 96.0 35.0103.0 76.0 168.5 35.0 46.0 58.0 This script: scatterplot(Calf~Bicep, reg.line=lm, smooth=FALSE, spread=FALSE, + boxplots=FALSE, span=0.5, data=Mass) Produces this error: invalid 'row.names' length The data look fine and correlation/regression can be done on them with correct output. Just not scatterplots. I tried a work around using the following script, but still with the same result: attach(Mass) Mass str(Mass) names(Mass) plot(Calf~Bicep) abline(lm(Calf~Bicep)) I cannot repeat the error, nor have I found much information on that error message, and I'm a bit stumped why these two students are getting the error even after one has reinstalled the program. Cheers, Sam McKenzie The University of Queensland __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] removed data is still there!
Hi, I agree with you that levels should not be automatically dropped after subsetting. However, I think there should/can be an extra argument to make it possible (the default being no dropping). I have no example in mind, but I guess it is possible that sometimes, one want to show only some levels. Would it be a bad approach? Anyway, it is not that complicated to use factor() again. Ivan Le 9/21/2010 23:22, Greg Snow a écrit : This comes up every now and then. The fact is that the behavior of R in not throwing away information unless explicitly told to, is a feature, and one that I don't want to see go away. Yes in your example doing a table or plot based on iris1$Species gives meaningless results, but anything you do with that column in now meaningless, why do you care if there is extra information in a column that you should not be doing anything further with anyways? Does it really make sense to use that column for anything now? It is a bit like a teacher bemoaning the fact that half of his/her students scored below the class median. Now some proposes that all factors should have levels dropped after subsetting, this is worse than useless, consider the following made up example: tmp1- rep( c(1:5,1:5), c(10,20,30,20,0,0,10,20,30,20) ) result- factor(tmp1, levels=1:5, labels=c('Strongly Disagree', 'Disagree', 'No Opinion', 'Agree', 'Strongly Agree') ) my.df- data.frame( result=result, sex = rep( c('M','F'), each=80 ) ) df.m.2- df.m.1- my.df[ my.df$sex=='M', ] df.f.2- df.f.1- my.df[ my.df$sex=='F', ] df.m.1[]- lapply( df.m.1, factor ) df.f.1[]- lapply( df.f.1, factor ) dev.new() par(mfrow=c(2,1)) barplot(table(df.m.1$result), main='Males') barplot(table(df.f.1$result), main='Females') dev.new() par(mfrow=c(2,1)) barplot(table(df.m.2$result), main='Males') barplot(table(df.f.2$result), main='Females') Which pair of plots is more meaningful? Easier to read? Not misleading? -- Ivan CALANDRA PhD Student University of Hamburg Biozentrum Grindel und Zoologisches Museum Abt. Säugetiere Martin-Luther-King-Platz 3 D-20146 Hamburg, GERMANY +49(0)40 42838 6231 ivan.calan...@uni-hamburg.de ** http://www.for771.uni-bonn.de http://webapp5.rrz.uni-hamburg.de/mammals/eng/mitarbeiter.php [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Problem with outer()
Dear all, I have following piece of codes: xx - seq(-2,2, length.out=11) mat1 - cbind(rep(43, 5), rnorm(5)) mat2 - cbind(rep(53, 5), rnorm(5)) outer(c(1,-1), xx, function(x,y) { AA - sign(x) BB - sign(y) CC - abs(y) DD1 - mat1[,2]-mat1[1,1] DD2 - mat2[,2]-mat2[1,1] EE - (DD1 - 0) *AA + DD2*BB res1 - mean(EE)/mat1[1,1] res2 - ifelse(quantile(EE/mat1[1,1], 0.05) -0.65, quantile(EE/mat1[1,1], 0.05), paste( -0.65)) return(paste(res1, res2, sep=--)) } ) While running this code I am getting warnings as well as error: outer(c(1,-1), xx, function(x,y) { + AA - sign(x) + BB - sign(y) + CC - abs(y) + DD1 - mat1[,2]-mat1[1,1] + DD2 - mat2[,2]-mat2[1,1] + EE - (DD1 - 0) *AA + DD2*BB + res1 - mean(EE)/mat1[1,1] + res2 - ifelse(quantile(EE/mat1[1,1], 0.05) -0.65, quantile(EE/mat1[1,1], 0.05), paste( -0.65)) + return(paste(res1, res2, sep=--)) + } + ) Error in dim(robj) - c(dX, dY) : dims [product 22] do not match the length of object [1] In addition: Warning messages: 1: In (DD1 - 0) * AA : longer object length is not a multiple of shorter object length 2: In DD2 * BB : longer object length is not a multiple of shorter object length I am able to trace the warning, which comes from multiplication with AA BB. However could not find the correct way to tackle this warning. Neither the error. Can somebody help me where I was wrong? Thanks __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Colorramp in Maptools, how to choose min and max values for the fg= argument
Hi Jannis, thanks a lot for your reply. Unfortunately the solution you proposed does not work. Maybe the reason is, that plot.Map only accepts hsv colours and I do not know to convert the rgb colours to the right colour space. Isn't there a possibility to specify minimal an maximal values when using colorRampPalette to only choose a certain part of the colorRamp(Palette)? I could than define a new colorset jet.colors2, which is picks out the colours between min(Vector)/10 and max(Vector)/10? jet.colors = colorRampPalette(c(#7F, blue, #007FFF, cyan, #7FFF7F, yellow, #FF7F00, red, #7F)) Intervalls - cut(as.numeric(Matrix[,nc]),n, na.rm=TRUE) } fgs - jet.colors(n)[Intervalls] Maybe this is a bit confusing... Thanks anyway for any help! Katrin Jannis schrieb: Hi Katrin, your problem is not related to maptools, rather to the base function colorRampPalette. In my opinion the easiest way to get what you want is to use colorRamp: Vector=c(2,3,4,5,6) color.ramp - colorRamp(c('white','black')) map.colors - color.ramp(Vector/10) The 10 refers to the maximum value that you specified. You just have to adjust the colors. HTH Jannis --- schaber kscha...@ipp.mpg.de schrieb am Di, 21.9.2010: Von: schaber kscha...@ipp.mpg.de Betreff: [R] Colorramp in Maptools, how to choose min and max values for the fg= argument An: r-help@r-project.org Datum: Dienstag, 21. September, 2010 15:42 Uhr Hello, I am using maptools for plooting geographical data. The colour of the region indicates some region dependent value (population for example). I pass the colours of the regions to the plot.Map function by defining the foreground colour: jet.colors = colorRampPalette(c(#7F, blue, #007FFF, cyan, #7FFF7F, yellow, #FF7F00, red, #7F)) n=100 cols - jet.colors(n) Intervalls = cut(as.numeric(Vector),n, na.rm=TRUE) fgs -cols[Intervalls] ## is there some option/possibility here to choose colours differently ?? plot.Map(Shape.map, fg=fgs, ol=NA, bty=n, xlab=, ylab=, xaxt=n, yaxt=n) My problem is the following: I would like to choose minimal and maximal value of the colour palette, different from the minimal and maximal values of the Vector. For example: Vector=c(2,3,4,5,6), but the maximal colour should correspond to 10 and the minimal to 0. I know, that maptools is not the most actual solution to plot maps, but I am building on an old code and therefore do not want to change it. I would very much appreciate any help or hints!! Katrin -Integrierter Anhang folgt- __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Colorramp in Maptools, how to choose min and max values for the fg= argument
On Wed, Sep 22, 2010 at 9:56 AM, schaber kscha...@ipp.mpg.de wrote: Hi Jannis, thanks a lot for your reply. Unfortunately the solution you proposed does not work. Maybe the reason is, that plot.Map only accepts hsv colours and I do not know to convert the rgb colours to the right colour space. Isn't there a possibility to specify minimal an maximal values when using colorRampPalette to only choose a certain part of the colorRamp(Palette)? I could than define a new colorset jet.colors2, which is picks out the colours between min(Vector)/10 and max(Vector)/10? jet.colors = colorRampPalette(c(#7F, blue, #007FFF, cyan, #7FFF7F, yellow, #FF7F00, red, #7F)) Intervalls - cut(as.numeric(Matrix[,nc]),n, na.rm=TRUE) } fgs - jet.colors(n)[Intervalls] Maybe this is a bit confusing... If you want to control precisely the mapping of numeric values to colours, then you could try my colourschemes package from R-forge: install.packages(colourschemes, repos=http://R-Forge.R-project.org;) then you create a function that maps values to colours by, for example, interpolating colours at fixed values, eg: rs = rampInterpolate ( limits =c(-2 , 2), ramp = c(red, yellow , blue )) and then: rs(2) [1] # # gives blue rs(-2) [1] #FFFF # gives red rs(0.5) [1] #BFBF40FF # nearer yellow than blue so now whatever the range of your data you can just use the rs() function to get the same colours for the same values. d=data.frame(x=1:10,y=1:10,V1=runif(10,0,2),V2=runif(10,-2,0),V3=runif(10,-2,2)) plot(d$x,d$y,col=rs(d$V1)) plot(d$x,d$y,col=rs(d$V2)) plot(d$x,d$y,col=rs(d$V3)) Barry __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] efficient list indexing
Hello everyone, I need some help with lists inside lists (a good way to emulate a struct)\ Assume that there is a small list called fred: fred - list(happy = 1:10, name = squash) and a big list called bigfred that includes fred list 5 times bigfred - rep(fred,5) Is it possible somehow to index all these sublists(fred) inside bigfred with a more direct way like this: bigfred[1] shows the first sublist fred bigfred[2][2] shows the second sublist fred, the second element of the fred list So far I found some way to do this by refering to the sublists by the following: bigfred[1+index*length(fred)] where index shows the beginning of a sublist. I would like to thank you in advance for your help Best Regards Alex [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] efficient list indexing
I'm confused by what you are looking for. There's some slight possibility that you are looking for double bracket subscripting with a vector: list(a=1:5, b=letters)[[c(2,4)]] [1] d On 22/09/2010 10:58, Alaios wrote: Hello everyone, I need some help with lists inside lists (a good way to emulate a struct)\ Assume that there is a small list called fred: fred- list(happy = 1:10, name = squash) and a big list called bigfred that includes fred list 5 times bigfred- rep(fred,5) Is it possible somehow to index all these sublists(fred) inside bigfred with a more direct way like this: bigfred[1] shows the first sublist fred bigfred[2][2] shows the second sublist fred, the second element of the fred list So far I found some way to do this by refering to the sublists by the following: bigfred[1+index*length(fred)] where index shows the beginning of a sublist. I would like to thank you in advance for your help Best Regards Alex [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Patrick Burns pbu...@pburns.seanet.com http://www.burns-stat.com (home of 'Some hints for the R beginner' and 'The R Inferno') __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] randomForest - partialPlot - Reg
From: Vijayan Padmanabhan Dear R Group I had an observation that in some cases, when I use the randomForest model to create partialPlot in R using the package randomForest the y-axis displays values that are more than -1! It is a classification problem that i was trying to address. Any insights as to how the y axis can display value more than -1 for some variables? Am i missing something! Yes, the Detail section of the help page for partialPlot, or specifically, what the function is plotting for a classification model. Andy Thanks Regards Vijayan Padmanabhan Can you avoid printing this? Think of the environment before printing the email. -- - Please visit us at www.itcportal.com ** This Communication is for the exclusive use of the intended recipient (s) and shall not attach any liability on the originator or ITC Ltd./its Subsidiaries/its Group Companies. If you are the addressee, the contents of this email are intended for your use only and it shall not be forwarded to any third party, without first obtaining written authorisation from the originator or ITC Ltd./its Subsidiaries/its Group Companies. It may contain information which is confidential and legally privileged and the same shall not be used or dealt with by any third party in any manner whatsoever without the specific consent of ITC Ltd./its Subsidiaries/its Group Companies. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Notice: This e-mail message, together with any attachme...{{dropped:11}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] group means of multi-way table?
On 09/22/2010 08:52 AM, Simon Kiss wrote: hello, can someone tell me how to generate the means for a data frame that looks like this? My data frame has many more variables, but I won't bother you with those; these are the one's that I'm interested in. Needless to say, z is the variable in which I'm interested. I'd like to find out the mean score of z for NDP managers, Conservative managers and Liberal managers and then for a few other configurations. Ive played around with aggregate, tapply and by, but I can't get it to work. Cordially, Simon Kiss mydata=data.frame(cbind(x,y,z)) mydata$x=as.factor(sample(c(labourers, salaried, managers), size=300, replace=TRUE)) mydata$y=as.factor(sample(c(NDP, Green, Liberal, Conservative), size=300, replace=TRUE)) mydata$z=as.numeric(sample(1:4, size=300, replace=TRUE)) Hi Simon, This looks a bit like what brkdn in the prettyR package does. If you want the means nested within subgroups, maybe brkdnNest in the plotrix package will do what you want. Jim __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] efficient list indexing
Also, it may be that you want bigfred to be a list of 5 lists each of 2 elements (happy and name) rather than a list of 10 elements. Thus (also using double bracketing) fred- list(happy = 1:10, name = squash) bigfred - replicate(5, fred, FALSE) bigfred[[2]][[2]] hth Keith J Patrick Burns pbu...@pburns.seanet.com wrote in message news:4c99e118.2010...@pburns.seanet.com... I'm confused by what you are looking for. There's some slight possibility that you are looking for double bracket subscripting with a vector: list(a=1:5, b=letters)[[c(2,4)]] [1] d On 22/09/2010 10:58, Alaios wrote: Hello everyone, I need some help with lists inside lists (a good way to emulate a struct)\ Assume that there is a small list called fred: fred- list(happy = 1:10, name = squash) and a big list called bigfred that includes fred list 5 times bigfred- rep(fred,5) Is it possible somehow to index all these sublists(fred) inside bigfred with a more direct way like this: bigfred[1] shows the first sublist fred bigfred[2][2] shows the second sublist fred, the second element of the fred list So far I found some way to do this by refering to the sublists by the following: bigfred[1+index*length(fred)] where index shows the beginning of a sublist. I would like to thank you in advance for your help Best Regards Alex [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Patrick Burns pbu...@pburns.seanet.com http://www.burns-stat.com (home of 'Some hints for the R beginner' and 'The R Inferno') __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] predict.lrm ( Design package) poor performance?
% correct is an improper scoring rule and a discontinuous one to boot. So it will not always agree with more proper scoring rules. When you have a more difficult task, e.g., discriminating more categories, indexes such as the generalized c-index that utilize all the categories will recognize the difficulty of the task and give a lower value. No cause for alarm. Frank - Frank Harrell Department of Biostatistics, Vanderbilt University -- View this message in context: http://r.789695.n4.nabble.com/predict-lrm-Design-package-tp2546894p2550118.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Referencing factor name
Simple problem - I want the ylab to automatically pick up x1 rather than having to define x1 in the plot statement. x1-c(1.2,2,3);x2-c(1,2.1,2.6) y-x1 plot(1:3,y, ylab=x1) There must be a way of accessing the name x1 somehow, but unfortunately I don't know how to search for it. Any help would be great, -- View this message in context: http://r.789695.n4.nabble.com/Referencing-factor-name-tp2550131p2550131.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] efficient list indexing
Hi: I believe we had this discussion yesterday, http://r.789695.n4.nabble.com/Object-oriented-programming-in-R-td2538541.html#a2538916 but since you chose to repeat that message, it clearly wasn't enough, so start with http://cran.r-project.org/doc/manuals/R-intro.html#Lists http://stackoverflow.com/questions/2050790/how-to-correctly-use-lists-in-r With the following example as a reference: Empl - list(employee = Anna, spouse = Fred, children = 3, child.ages = c(4, 7, 9)) we quote Venables and Ripley (2002, p. 45): It is important to appreciate the difference between [ and [[. The [ form extracts sub-vectors, so Empl[2] is a list of length one, whereas Empl[[2]] is a component (a character vector of length one). In that context, consider the following: Empl$employee [1] Anna Empl$child.ages[2] [1] 7 x - spouse; Empl[[x]] [1] Fred Empl[[3]] [1] 3 Empl[[4]] [1] 4 7 9 Empl[[4]][2] [1] 7 Empl[4] $child.ages [1] 4 7 9 Empl[4][2] # Why? $NA NULL Empl[2] $spouse [1] Fred Empl[[2]] # Why the difference? [1] Fred A somewhat more expansive discussion is found on pp. 12-13 of Venables and Ripley (2000), S Programming. Dennis On Wed, Sep 22, 2010 at 2:58 AM, Alaios ala...@yahoo.com wrote: Hello everyone, I need some help with lists inside lists (a good way to emulate a struct)\ Assume that there is a small list called fred: fred - list(happy = 1:10, name = squash) and a big list called bigfred that includes fred list 5 times bigfred - rep(fred,5) Is it possible somehow to index all these sublists(fred) inside bigfred with a more direct way like this: bigfred[1] shows the first sublist fred bigfred[2][2] shows the second sublist fred, the second element of the fred list So far I found some way to do this by refering to the sublists by the following: bigfred[1+index*length(fred)] where index shows the beginning of a sublist. I would like to thank you in advance for your help Best Regards Alex [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Referencing factor name
Hi: I presume you mean x1-c(1.2,2,3) y - 'x1' plot(1:3, get(y), ylab = y) ?? HTH, Dennis On Wed, Sep 22, 2010 at 4:44 AM, Paul Chatfield p.s.chatfi...@reading.ac.uk wrote: Simple problem - I want the ylab to automatically pick up x1 rather than having to define x1 in the plot statement. x1-c(1.2,2,3);x2-c(1,2.1,2.6) y-x1 plot(1:3,y, ylab=x1) There must be a way of accessing the name x1 somehow, but unfortunately I don't know how to search for it. Any help would be great, -- View this message in context: http://r.789695.n4.nabble.com/Referencing-factor-name-tp2550131p2550131.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] predict.lrm ( Design package) poor performance?
Thats great thanks I guess it is hard to not use % as a performance measure when that is what is commonly used in everyday life. So when i come to predicting the response of new data ( using the estimated mean Y ) which i am more comfortable with i can say - Species A - 2.12 - Therefore this is category 2 Species B - 2.72 - Therefore this is category 3 (on a side note, i had no species with a rating of 6 - the upper category?) The problem comes in explaining this to my peers who are non-statsistcally minded. What you are saying is if i bootstrapp and report the c-values and Bieber scores etc this is sufficient to give an indication of confidence? On 22 Sep 2010, at 12:36, Frank Harrell wrote: % correct is an improper scoring rule and a discontinuous one to boot. So it will not always agree with more proper scoring rules. When you have a more difficult task, e.g., discriminating more categories, indexes such as the generalized c-index that utilize all the categories will recognize the difficulty of the task and give a lower value. No cause for alarm. Frank - Frank Harrell Department of Biostatistics, Vanderbilt University -- View this message in context: http://r.789695.n4.nabble.com/predict-lrm-Design-package-tp2546894p2550118.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] plot.ts versus plot.zoo
plot.ts has an argument yax.flip, plot.zoo does not. Is there a way to make the yaxis flip in plot.zoo? I tried using a custom panel function: panel.yaxis-function(...) { npnl-parent.frame$panel.number if (npnl %% 2 == 0) { axis(side=3) } else { axis(side=2) } } This leads to a blank window. I am stuck with the intricacies of the plotting and axis generation here. Thank you in advance, Alex van der Spek __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Survival curve mean adjusted for covariate
do the same thing for female and then take the weighted average of the two means. -- View this message in context: http://r.789695.n4.nabble.com/Survival-curve-mean-adjusted-for-covariate-tp2548387p2550178.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Survival curve mean adjusted for covariate
do the same thing for female and then take the weighted average of the two means. -- View this message in context: http://r.789695.n4.nabble.com/Survival-curve-mean-adjusted-for-covariate-tp2548387p2550179.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plot.ts versus plot.zoo
On Wed, Sep 22, 2010 at 8:07 AM, Alex van der Spek do...@xs4all.nl wrote: plot.ts has an argument yax.flip, plot.zoo does not. Is there a way to make the yaxis flip in plot.zoo? I tried using a custom panel function: panel.yaxis-function(...) { npnl-parent.frame$panel.number if (npnl %% 2 == 0) { axis(side=3) } else { axis(side=2) } } Depending on your zoo series, you may be able to do this: plot(as.ts(z), yax.flip = TRUE) -- Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] kstest vs shapirotest
Dear R-users Idea: Analysing tree height frequency with hist(), normal distribution (ks.test shapiro.test) and skewness (package e1071 - thanks a lot for this useful package)as an indication of possible self-thinning in an experimental tree stand. Problem: Results from the ks.test and the shapiro.test are not comparable (see example of both tests). Tree height is a nice continuous variable. Sample size is around n= 250-350. Does anybode know about a problem in ks tests using a large sample size and working with subsets (e.g file[,])? Comparing tests with qqplots, I would appreciate shapiro, but I am wondering about the results from ks (test ist not very sensitive, D=1, p=2.2e-16 many times? Thanks Sibylle shapiro.test(Biotree[Ac,]$Height2008) Shapiro-Wilk normality test data: Biotree[Ac, ]$Height2008 W = 0.9908, p-value = 0.05175 ks.test(Biotree[Ac,]$Height2008, pnorm) One-sample Kolmogorov-Smirnov test data: Biotree[Ac, ]$Height2008 D = 1, p-value 2.2e-16 alternative hypothesis: two-sided Warning message: In ks.test(Biotree[Ac, ]$Height2008, pnorm) : cannot compute correct p-values with ties -- __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] apply union function vectorially
unique(unlist(list.array)) -- View this message in context: http://r.789695.n4.nabble.com/apply-union-function-vectorially-tp2550162p2550193.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] efficient list indexing
Great one! Thanks this will simplify a lot addresing. Best REgards From: Keith Jewell k.jew...@campden.co.uk To: r-h...@stat.math.ethz.ch Sent: Wed, September 22, 2010 1:25:27 PM Subject: Re: [R] efficient list indexing Also, it may be that you want bigfred to be a list of 5 lists each of 2 elements (happy and name) rather than a list of 10 elements. Thus (also using double bracketing) fred- list(happy = 1:10, name = squash) bigfred - replicate(5, fred, FALSE) bigfred[[2]][[2]] hth Keith J Patrick Burns pbu...@pburns.seanet.com wrote in message news:4c99e118.2010...@pburns.seanet.com... I'm confused by what you are looking for. There's some slight possibility that you are looking for double bracket subscripting with a vector: list(a=1:5, b=letters)[[c(2,4)]] [1] d On 22/09/2010 10:58, Alaios wrote: Hello everyone, I need some help with lists inside lists (a good way to emulate a struct)\ Assume that there is a small list called fred: fred- list(happy = 1:10, name = squash) and a big list called bigfred that includes fred list 5 times bigfred- rep(fred,5) Is it possible somehow to index all these sublists(fred) inside bigfred with a more direct way like this: bigfred[1] shows the first sublist fred bigfred[2][2] shows the second sublist fred, the second element of the fred list So far I found some way to do this by refering to the sublists by the following: bigfred[1+index*length(fred)] where index shows the beginning of a sublist. I would like to thank you in advance for your help Best Regards Alex [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Patrick Burns pbu...@pburns.seanet.com http://www.burns-stat.com (home of 'Some hints for the R beginner' and 'The R Inferno') __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Favor about Rimage Packages in R
Hello,everyone: I facing a problem in Rimage package.I cannot read the images with size 364X364 in jpeg format.What happen to it?After I run it, it give me an error Can't Open fie.I try many time already. But I can read an JPEG image than have smaller size or bigger size than 364X364.What should I do?Thank you very much. Chuan [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] eigen and svd
Dear R-helpers, could anybody explain me briefly what is the difference between eigenvectors returned by 'eigen' and 'svd' functions and how they are related? Thanks in advance Ondrej Mikula __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] efficient list indexing
For some reason I did not receive your email. Sorry for the inconvenience caused From: Dennis Murphy djmu...@gmail.com Cc: Rhelp r-help@r-project.org Sent: Wed, September 22, 2010 1:46:28 PM Subject: Re: [R] efficient list indexing Hi: I believe we had this discussion yesterday, http://r.789695.n4.nabble.com/Object-oriented-programming-in-R-td2538541.html#a2538916 but since you chose to repeat that message, it clearly wasn't enough, so start with http://cran.r-project.org/doc/manuals/R-intro.html#Lists http://stackoverflow.com/questions/2050790/how-to-correctly-use-lists-in-r With the following example as a reference: Empl - list(employee = Anna, spouse = Fred, children = 3, child.ages = c(4, 7, 9)) we quote Venables and Ripley (2002, p. 45): It is important to appreciate the difference between [ and [[. The [ form extracts sub-vectors, so Empl[2] is a list of length one, whereas Empl[[2]] is a component (a character vector of length one). In that context, consider the following: Empl$employee [1] Anna Empl$child.ages[2] [1] 7 x - spouse; Empl[[x]] [1] Fred Empl[[3]] [1] 3 Empl[[4]] [1] 4 7 9 Empl[[4]][2] [1] 7 Empl[4] $child.ages [1] 4 7 9 Empl[4][2] # Why? $NA NULL Empl[2] $spouse [1] Fred Empl[[2]] # Why the difference? [1] Fred A somewhat more expansive discussion is found on pp. 12-13 of Venables and Ripley (2000), S Programming. Dennis Hello everyone, I need some help with lists inside lists (a good way to emulate a struct)\ Assume that there is a small list called fred: fred - list(happy = 1:10, name = squash) and a big list called bigfred that includes fred list 5 times bigfred - rep(fred,5) Is it possible somehow to index all these sublists(fred) inside bigfred with a more direct way like this: bigfred[1] shows the first sublist fred bigfred[2][2] shows the second sublist fred, the second element of the fred list So far I found some way to do this by refering to the sublists by the following: bigfred[1+index*length(fred)] where index shows the beginning of a sublist. I would like to thank you in advance for your help Best Regards Alex [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Newey West and Singular Matrix
dear R experts: I am writing my own little newey-west standard error function, with heteroskedasticity and arbitrary x period autocorrelation corrections. including my function in this post here may help others searching for something similar. it is working quite well, except on occasion, it complains that Error in solve.default(crossprod(x.na.omitted, x.na.omitted)) : system is computationally singular: reciprocal condition number = 3.63797e-23 I know that lm can do the inversion, so I presume that there is a more stable way than qr.solve . I looked into lm, then into lm.fit, and it seems to invoke dqrls . is this the recommended way, or is there a higher-level more stable matrix inversion routine that I could use? help is, as always, appreciated. (also, if you see something else silly in my code, let me know, please.) regards, /iaw se.neweywest - function( lmobject.withxtrue, ar.terms =0 ) { assert( (class(lmobject.withxtrue)==lm), [se.white] works only on 'lm' objects, not on , class(lmobject.withxtrue), objects \n ) x.na.omitted - lmobject.withxtrue$x assert( class(x.na.omitted)==matrix, [se.white] internal error---have no X matrix. did you invoke with 'x=TRUE'?\n) r.na.omitted - residuals( lmobject.withxtrue ) diagband.matrix - function( m, ar.terms ) { nomit - m-ar.terms-1 mm - matrix( TRUE, nrow=m, ncol=m ) mm[1:nomit,(ncol(mm)-nomit+1):ncol(mm)] - (lower.tri( matrix(TRUE, nrow=nomit, ncol=nomit) )) mm[(ncol(mm)-nomit+1):ncol(mm),1:nomit] - (upper.tri( matrix(TRUE, nrow=nomit, ncol=nomit) )) mm } ##V(b) = inv(X'X) X' diag(e^2) X inv(X'X) invx - qr.solve( crossprod( x.na.omitted, x.na.omitted ) ) if (!ar.terms) resid.matrix - diag( r.na.omitted^2 ) else { full - r.na.omitted %*% t(r.na.omitted) ## the following is not particularly good. see, we could zero out also ## items which are multiplications with a missing residual for example, ## if we do an ar1 correction, and obs 5 is missing, then the AR term on ## 4 and 6 could be set to 0. right now, we just adjust for an add'l ## term. maskmatrix - diagband.matrix( length(r.na.omitted), ar.terms ) resid.matrix - full * maskmatrix } invx.x - invx %*% t(x.na.omitted) vmat - invx.x %*% resid.matrix %*% t(invx.x) sqrt(diag(vmat)) } __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] kstest vs shapirotest
In general Shapiro's normality test is more powerful than the KS. For this specific case, I don't see the significantly different results from both tests. The normality assumption in this example seems to be questionable. -- View this message in context: http://r.789695.n4.nabble.com/kstest-vs-shapirotest-tp2550192p2550243.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] speeding up regressions using ddply
Hi, I have a data set that I'd like to run logistic regressions on, using ddply to speed up the computation of many models with different combinations of variables. I would like to run regressions on every unique two-variable combination in a portion of my data set, but I can't quite figure out how to do using ddply. The data set looks like this, with status as the binary dependent variable and V1:V8 as potential independent variables in the logistic regression: m - matrix(rnorm(288), nrow = 36) colnames(m) - paste('V', 1:8, sep = '') x - data.frame( status = factor(rep(rep(c('D','L'), each = 6), 3)), as.data.frame(m)) I used melt to put my data frame into a more workable format require(reshape) xm - melt(x, id = 'status') Here is the basic shape of the function I'd like to apply to every combination of variables in the dataset: h- function(df) { attach(df) log.glm - (glm(status ~ value1+ value2 , family=binomial(link=logit), na.action=na.omit)) #What I can't figure out is how to specify 2 different variables (I've put value1 and value2 as placeholders) from the xm to include in the model glm.summary-summary(log.glm) aic - extractAIC(log.glm) coef - coef(glm.summary) list(Est1=coef[1,2], Est2=coef[3,2], AIC=aic[2]) #or whatever other output here } And then I'd like to use ddply to speed up the computations. require(pplyr) output-dddply(xm, .(variable), as.data.frame.function(h)) output I can easily do this using ddply when I only want to use 1 variable in the model, but can't figure out how to do it with two variables. Many thanks for any hints! Ali Alison Macalady Ph.D. Candidate University of Arizona School of Geography and Development Laboratory of Tree Ring Research __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] best model cp mallow
Hi, I'm using the Cp and Radj method for the selection of variable, with the library leaps. I need to save the best model in a file. If I use the adjr2 method it's work, but I've a problem with Cp method: adjr - leaps(x,y,method=adjr2) maxadjr(adjr) 12312 0.713 0.56 ok... but whit Cp method don't work adjr - leaps(x,y,method=Cp) maxadjr(adjr) Errore in order(l$a) : l'argomento 1 non è di tipo vector Thanks, Alfredo -- View this message in context: http://r.789695.n4.nabble.com/best-model-cp-mallow-tp2550015p2550015.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] predict.lrm ( Design package) poor performance?
Thats great thanks, I suppose it is hard to move away from a more traditional measure of performance such a percentage correct, at least for the relatively amateur statisticians among us who have been graded on such a system. The difficulty comes in reporting the effectiveness of the model to my peers. I have a c-value of 0.71 and a Bieber score of 0.199. So when it comes to predicting the response of newdata (using the estimated mean Y, which i am more comfortable understanding) i.e species 1 - mean = 2.12 therefore on the response scale this is 2 species 2 - mean = 2.98 therefore on the response scale this is 3 etc (on a side note, using this approach no species had a mean of 6, which is the last ordinal category?) A common question is how confident are you that the species has that response. What you are saying is using bootsrapping, and quoting more proper scoring rules such as c-values and Bieber score explains the confidence sufficiently? Chris On 22 Sep 2010, at 12:36, Frank Harrell wrote: % correct is an improper scoring rule and a discontinuous one to boot. So it will not always agree with more proper scoring rules. When you have a more difficult task, e.g., discriminating more categories, indexes such as the generalized c-index that utilize all the categories will recognize the difficulty of the task and give a lower value. No cause for alarm. Frank - Frank Harrell Department of Biostatistics, Vanderbilt University -- View this message in context: http://r.789695.n4.nabble.com/predict-lrm-Design-package-tp2546894p2550118.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] extracting random effects from model formula
Hi again, Sorry, probably I was not clear enough. I was wondering if there is a way in R to identify (and extract) only the random effects, which, because I am using the lmer function, are the terms with the symbol | on the left side of the grouping variable (SITE in my example). Thanks Lorenzo From: Lorenzo Cattarino Sent: Wednesday, 22 September 2010 5:23 PM To: r-help@r-project.org Subject: extracting random effects from model formula Hi R-users I would like to extract the random effects (1|SITE, 1+SPECIES|SITE and BA|SITE) from this model formula: Full_model - formula (VAR ~ (1|SITE) + (1+SPECIES|SITE) + (BA|SITE) + HEIGHT + COND + NN_DIST) I tried: terms(Full_model) labels(terms(Full_model)) but I could not distinguish between random and fixed effects. thanks Lorenzo [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] merge verctor and matrix
hi, can anyone tell me how to merge a vector and a matrix? v=c(1,4,2) names(v)=c(e,r,t) m=matrix(c(r,t,r,s,e,5,6,7,8,9),nr=5) colnames(m)=c(c1,c2) I want to do like merge(v, m, by.x=names,by.y=c1) I got error Error in fix.by(by.x, x) : 'by' must specify valid column(s) thanks jian [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Problem with ggplot2 - Boxplot
Hi, I am using ggplot2 to create a boxplot that summarizes a continuous variable. This code works fine for me on one PC however when I use it on another it doesnt. The structure of the dataset AHT_TopCD is SubReason=Categorical variable, AHT=Continuous variable. The code for the boxplot: require(ggplot2) qplot(SubReason,AHT,data=AHT_TopCD,geom=boxplot,main=AHT Spread - By Sub-Reason,xlab=AHT,colour=SubReason,alpha = I(1 / 5))+ + coord_flip() + scale_x_discrete(breaks=NA) The error I get is : Error in get(make_aesthetics, env = x, inherits = TRUE)(x, ...) : could not find function empty I do not understand this error. Can anyone help me with this please? Also, let me know if you have any questions or require clarification on anything here. Regards, Raoul -- View this message in context: http://r.789695.n4.nabble.com/Problem-with-ggplot2-Boxplot-tp2549970p2549970.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] extracting random effects from model formula
actually I need to extract the random effect from the formula, not the model any idea? From: Sacha Viquerat [mailto:sacha.v...@googlemail.com] Sent: Wed 22/09/2010 5:55 PM To: Lorenzo Cattarino Subject: Re: [R] extracting random effects from model formula Am 22.09.2010 09:22, schrieb Lorenzo Cattarino: Hi R-users I would like to extract the random effects (1|SITE, 1+SPECIES|SITE and BA|SITE) from this model formula: Full_model- formula (VAR ~ (1|SITE) + (1+SPECIES|SITE) + (BA|SITE) + HEIGHT + COND + NN_DIST) I tried: terms(Full_model) labels(terms(Full_model)) but I could not distinguish between random and fixed effects. thanks Lorenzo [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. try ranef(model). [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problem with ggplot2 - Boxplot
Hi, are you sure you have the same version R + packages version on both ? -- View this message in context: http://r.789695.n4.nabble.com/Problem-with-ggplot2-Boxplot-tp2549970p2549997.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Error in eval(expr, envir, enclos)
Hello, I just figured out that there was some problem with my dataset. So, the Regression is working fine now. Thanks a lot for all your help and suggestions. Uttara -- View this message in context: http://r.789695.n4.nabble.com/Error-in-eval-expr-envir-enclos-tp2547917p2550252.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] apply union function vectorially
Hello everybody, I have a list.array which is a list containing arrays, lets say list.array= list( c(1,2,3),c(5,6,7),c(1,4,6,7,8) ); I would like to apply the union function to those 3 vectors to get the union of the three : [1,2,3,4,5,6,7,8] I tried do.call(what=union,args=list.array) but this is not working... Do you have a non loop solution for this ? Regards -- View this message in context: http://r.789695.n4.nabble.com/apply-union-function-vectorially-tp2550162p2550162.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] eigen and svd
svd() does not return eigeinvectors! -- View this message in context: http://r.789695.n4.nabble.com/eigen-and-svd-tp2550210p2550257.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] best model cp mallow
I think you need to set up a cut-off of Cp and then get the good values of Cp from adjr$Cp. -- View this message in context: http://r.789695.n4.nabble.com/best-model-cp-mallow-tp2550015p2550283.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] eigen and svd
X = U D V' ## D are the singular values of X X'X = V D^2 V' ## D^2 are the eigenvalues of X'X V is the same in both factorizations. 2010/9/22 OndÅej Mikula onmik...@gmail.com Dear R-helpers, could anybody explain me briefly what is the difference between eigenvectors returned by 'eigen' and 'svd' functions and how they are related? Thanks in advance Ondrej Mikula __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] merge verctor and matrix
Try this: merge(m, v, by.x = 'c1', by.y = 0, all = TRUE, sort = FALSE) On Wed, Sep 22, 2010 at 4:57 AM, Yuan Jian jayuan2...@yahoo.com wrote: hi, can anyone tell me how to merge a vector and a matrix? v=c(1,4,2) names(v)=c(e,r,t) m=matrix(c(r,t,r,s,e,5,6,7,8,9),nr=5) colnames(m)=c(c1,c2) I want to do like merge(v, m, by.x=names,by.y=c1) I got error Error in fix.by(by.x, x) : 'by' must specify valid column(s) thanks jian [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] merge verctor and matrix
v=data.frame(c1=c(e,r,t),v=c(1,4,2) ) m=matrix(c(r,t,r,s,e,5,6,7,8,9),nr=5) colnames(m)=c(c1,c2) m=as.data.frame(m) merge(v, m, by =c1 ) c1 v c2 1 e 1 9 2 r 4 5 3 r 4 7 4 t 2 6 -- View this message in context: http://r.789695.n4.nabble.com/merge-verctor-and-matrix-tp2550280p2550315.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] merge verctor and matrix
On Sep 22, 2010, at 3:57 AM, Yuan Jian wrote: hi, can anyone tell me how to merge a vector and a matrix? v=c(1,4,2) names(v)=c(e,r,t) m=matrix(c(r,t,r,s,e,5,6,7,8,9),nr=5) colnames(m)=c(c1,c2) I want to do like merge(v, m, by.x=names,by.y=c1) I got error Error in fix.by(by.x, x) : 'by' must specify valid column(s) When v is coerced to a data.frame, its names become row.names: merge(m, v, by.x=c1, by.y=row.names, all=TRUE) c1 c2 y 1 e 9 1 2 r 5 4 3 r 7 4 4 s 8 NA 5 t 6 2 merge(v, m, by.y=c1, by.x=row.names, all=TRUE) Row.names x c2 1 e 1 9 2 r 4 5 3 r 4 7 4 s NA 8 5 t 2 6 thanks jian [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] extracting random effects from model formula
On Sep 22, 2010, at 5:09 AM, Lorenzo Cattarino wrote: actually I need to extract the random effect from the formula, not the model any idea? From: Sacha Viquerat [mailto:sacha.v...@googlemail.com] Sent: Wed 22/09/2010 5:55 PM To: Lorenzo Cattarino Subject: Re: [R] extracting random effects from model formula Am 22.09.2010 09:22, schrieb Lorenzo Cattarino: Hi R-users I would like to extract the random effects (1|SITE, 1+SPECIES| SITE and BA|SITE) from this model formula: modterm - attr(terms(Full_model), term.labels) modterm[grep(\\|, attr(terms(Full_model), term.labels) )] [1] 1 | SITE 1 + SPECIES | SITE BA | SITE Full_model- formula (VAR ~ (1|SITE) + (1+SPECIES|SITE) + (BA|SITE) + HEIGHT + COND + NN_DIST) I tried: terms(Full_model) labels(terms(Full_model)) but I could not distinguish between random and fixed effects. thanks Lorenzo [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. try ranef(model). [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] coxme AIC score and p-value mismatch??
The AICs do not seem right to me either. Unless I am missing something, it appears that the formula: AIC= -2x logLik -2k is being applied, rather than: AIC= -2x logLik +2k Meaning models with fewer degrees of freedom are being penalised. So in your example I make the degrees of freedom 9.61 +2*2= 13.61 for Integrated logLik 23.78 +2*13.7 for Penalised logLik. Someone please let me know if I am mistaken here! -- View this message in context: http://r.789695.n4.nabble.com/coxme-AIC-score-and-p-value-mismatch-tp2333980p2550348.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] eigen and svd
It does for real-symmetric and complex Hermitian matrices, i.e. the $u and $v from svd() are the same as $vectors from eigen() for Hermitian matrices. There might be differences in signs, but that does not matter. Of course the singular values and eigenvalues are identical too. Ravi. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Peng, C Sent: Wednesday, September 22, 2010 9:13 AM To: r-help@r-project.org Subject: Re: [R] eigen and svd svd() does not return eigeinvectors! -- View this message in context: http://r.789695.n4.nabble.com/eigen-and-svd-tp2550210p2550257.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] coxme AIC score and p-value mismatch??
I think I've figured it out, the AIC column is the IMPROVEMENT in AIC compared to the null model. So bigger values are better. -- View this message in context: http://r.789695.n4.nabble.com/coxme-AIC-score-and-p-value-mismatch-tp2333980p2550409.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] speeding up regressions using ddply
Hi Alison, On Wed, Sep 22, 2010 at 11:05 AM, Alison Macalady a...@kmhome.org wrote: Hi, I have a data set that I'd like to run logistic regressions on, using ddply to speed up the computation of many models with different combinations of variables. In my experience ddply is not particularly fast. I use it a lot because it is flexible and has easy to understand syntax, not for it's speed. I would like to run regressions on every unique two-variable combination in a portion of my data set, but I can't quite figure out how to do using ddply. I'm not sure ddply is the tool for this job. The data set looks like this, with status as the binary dependent variable and V1:V8 as potential independent variables in the logistic regression: m - matrix(rnorm(288), nrow = 36) colnames(m) - paste('V', 1:8, sep = '') x - data.frame( status = factor(rep(rep(c('D','L'), each = 6), 3)), as.data.frame(m)) You can use combn to determine the combinations you want: Varcombos - combn(names(x)[-1], 2) From there you can do a loop, something like results - list() for(i in 1:dim(Varcombos)[2]) { log.glm - glm(as.formula(paste(status ~ , Varcombos[1,i], + , Varcombos[2,i], sep=)), family=binomial(link=logit), na.action=na.omit, data=x) glm.summary-summary(log.glm) aic - extractAIC(log.glm) coef - coef(glm.summary) results[[i]] - list(Est1=coef[1,2], Est2=coef[3,2], AIC=aic[2]) #or whatever other output here names(results)[i] - paste(Varcombos[1,i], Varcombos[2,i], sep=_) } I'm sure you could replace the loop with something more elegant, but I'm not really sure how to go about it. I used melt to put my data frame into a more workable format require(reshape) xm - melt(x, id = 'status') Here is the basic shape of the function I'd like to apply to every combination of variables in the dataset: h- function(df) { attach(df) log.glm - (glm(status ~ value1+ value2 , family=binomial(link=logit), na.action=na.omit)) #What I can't figure out is how to specify 2 different variables (I've put value1 and value2 as placeholders) from the xm to include in the model glm.summary-summary(log.glm) aic - extractAIC(log.glm) coef - coef(glm.summary) list(Est1=coef[1,2], Est2=coef[3,2], AIC=aic[2]) #or whatever other output here } And then I'd like to use ddply to speed up the computations. require(pplyr) output-dddply(xm, .(variable), as.data.frame.function(h)) output I can easily do this using ddply when I only want to use 1 variable in the model, but can't figure out how to do it with two variables. I don't think this approach can work. You are saying split up xm by variable and then expecting to be able to reference different levels of variable within each split, an impossible request. Hope this helps, Ista Many thanks for any hints! Ali Alison Macalady Ph.D. Candidate University of Arizona School of Geography and Development Laboratory of Tree Ring Research __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Ista Zahn Graduate student University of Rochester Department of Clinical and Social Psychology http://yourpsyche.org __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] speeding up regressions using ddply
There has been a recent addition of parallel processing capabilities to plyr (I believe v1.2 and later), along with a dataframe iterator construct. Both have improved performance of ddply greatly for multicore/cluster computing. So we now have the niceness of plyr's grammar with pretty good performance. From the plyr NEWS file: Version 1.2 (2010-09-09) -- NEW FEATURES * l*ply, d*ply, a*ply and m*ply all gain a .parallel argument that when TRUE, applies functions in parallel using a parallel backend registered with the foreach package: x - seq_len(20) wait - function(i) Sys.sleep(0.1) system.time(llply(x, wait)) # user system elapsed # 0.007 0.005 2.005 library(doMC) registerDoMC(2) system.time(llply(x, wait, .parallel = TRUE)) # user system elapsed # 0.020 0.011 1.038 On 9/22/10 10:41 AM, Ista Zahn wrote: Hi Alison, On Wed, Sep 22, 2010 at 11:05 AM, Alison Macaladya...@kmhome.org wrote: Hi, I have a data set that I'd like to run logistic regressions on, using ddply to speed up the computation of many models with different combinations of variables. In my experience ddply is not particularly fast. I use it a lot because it is flexible and has easy to understand syntax, not for it's speed. I would like to run regressions on every unique two-variable combination in a portion of my data set, but I can't quite figure out how to do using ddply. I'm not sure ddply is the tool for this job. The data set looks like this, with status as the binary dependent variable and V1:V8 as potential independent variables in the logistic regression: m- matrix(rnorm(288), nrow = 36) colnames(m)- paste('V', 1:8, sep = '') x- data.frame( status = factor(rep(rep(c('D','L'), each = 6), 3)), as.data.frame(m)) You can use combn to determine the combinations you want: Varcombos- combn(names(x)[-1], 2) From there you can do a loop, something like results- list() for(i in 1:dim(Varcombos)[2]) { log.glm- glm(as.formula(paste(status ~ , Varcombos[1,i], + , Varcombos[2,i], sep=)), family=binomial(link=logit), na.action=na.omit, data=x) glm.summary-summary(log.glm) aic- extractAIC(log.glm) coef- coef(glm.summary) results[[i]]- list(Est1=coef[1,2], Est2=coef[3,2], AIC=aic[2]) #or whatever other output here names(results)[i]- paste(Varcombos[1,i], Varcombos[2,i], sep=_) } I'm sure you could replace the loop with something more elegant, but I'm not really sure how to go about it. I used melt to put my data frame into a more workable format require(reshape) xm- melt(x, id = 'status') Here is the basic shape of the function I'd like to apply to every combination of variables in the dataset: h- function(df) { attach(df) log.glm- (glm(status ~ value1+ value2 , family=binomial(link=logit), na.action=na.omit)) #What I can't figure out is how to specify 2 different variables (I've put value1 and value2 as placeholders) from the xm to include in the model glm.summary-summary(log.glm) aic- extractAIC(log.glm) coef- coef(glm.summary) list(Est1=coef[1,2], Est2=coef[3,2], AIC=aic[2]) #or whatever other output here } And then I'd like to use ddply to speed up the computations. require(pplyr) output-dddply(xm, .(variable), as.data.frame.function(h)) output I can easily do this using ddply when I only want to use 1 variable in the model, but can't figure out how to do it with two variables. I don't think this approach can work. You are saying split up xm by variable and then expecting to be able to reference different levels of variable within each split, an impossible request. Hope this helps, Ista Many thanks for any hints! Ali Alison Macalady Ph.D. Candidate University of Arizona School of Geography and Development Laboratory of Tree Ring Research __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Abhijit Dasgupta, PhD Director and Principal Statistician ARAASTAT Ph: 301.385.3067 E: adasgu...@araastat.com W: http://www.araastat.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] reliability of the level-1 random coefficients (lme4)
Hello everyone, I want to estimate the reliability *of the level*-*1 random coefficients (the one *reported in the HLM output) using the software R. Does anyone know how to get this statistic from R? I'm using the function lmer of the package lme4 to estimate multilevel models. I tried to use the formula but I can't find specific information such as the sigma squared. Thank you, Luana Marotta [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Odp: Problem with outer()
Hi It is difficult to give some help as you did not provide any clue what the result shall be. basically outer takes 2 vectors and evaluate function for each combination of elements in both vectors. However the function has to be vectorised and your function is not. fff=function(x,y) { + AA - sign(x) + BB - sign(y) + CC - abs(y) + DD1 - mat1[,2]-mat1[1,1] + DD2 - mat2[,2]-mat2[1,1] + EE - (DD1 - 0) *AA + DD2*BB + res1 - mean(EE)/mat1[1,1] + res2 - ifelse(quantile(EE/mat1[1,1], 0.05) + -0.65, quantile(EE/mat1[1,1], 0.05), paste( -0.65)) + return(paste(res1, res2, sep=--)) + } fff(1:5, 3:8) [1] -2.22113283836956-- -0.65 Warning messages: 1: In DD2 * BB : longer object length is not a multiple of shorter object length 2: In (DD1 - 0) * AA + DD2 * BB : longer object length is not a multiple of shorter object length Therefore you need to redefine DD1, DD2 and EE computation. Regards Petr r-help-boun...@r-project.org napsal dne 22.09.2010 10:18:07: Dear all, I have following piece of codes: xx - seq(-2,2, length.out=11) mat1 - cbind(rep(43, 5), rnorm(5)) mat2 - cbind(rep(53, 5), rnorm(5)) outer(c(1,-1), xx, function(x,y) { AA - sign(x) BB - sign(y) CC - abs(y) DD1 - mat1[,2]-mat1[1,1] DD2 - mat2[,2]-mat2[1,1] EE - (DD1 - 0) *AA + DD2*BB res1 - mean(EE)/mat1[1,1] res2 - ifelse(quantile(EE/mat1[1,1], 0.05) -0.65, quantile(EE/mat1[1,1], 0.05), paste( -0.65)) return(paste(res1, res2, sep=--)) } ) While running this code I am getting warnings as well as error: outer(c(1,-1), xx, function(x,y) { + AA - sign(x) + BB - sign(y) + CC - abs(y) + DD1 - mat1[,2]-mat1[1,1] + DD2 - mat2[,2]-mat2[1,1] + EE - (DD1 - 0) *AA + DD2*BB + res1 - mean(EE)/mat1[1,1] + res2 - ifelse(quantile(EE/mat1[1,1], 0.05) -0.65, quantile(EE/mat1[1,1], 0.05), paste( -0.65)) + return(paste(res1, res2, sep=--)) + } + ) Error in dim(robj) - c(dX, dY) : dims [product 22] do not match the length of object [1] In addition: Warning messages: 1: In (DD1 - 0) * AA : longer object length is not a multiple of shorter object length 2: In DD2 * BB : longer object length is not a multiple of shorter object length I am able to trace the warning, which comes from multiplication with AA BB. However could not find the correct way to tackle this warning. Neither the error. Can somebody help me where I was wrong? Thanks __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Asking Favor
Hi what exactly do you want. You say you can read your data x - read.jpeg(system.file(data, cat.jpg, package=rimage)) plot(x) What do you mean by pixel image? By reading a picture you get an object imagematrix str(x) imagematrix [1:420, 1:418, 1:3] 0.255 0.251 0.247 0.247 0.255 ... - attr(*, type)= chr rgb - attr(*, class)= chr [1:2] imagematrix array Do you want to extract part of this object? Here it is. plot(imagematrix(x[30:150, 30:150,])) Go through help pages and try examples from them. Regards Petr r-help-boun...@r-project.org napsal dne 21.09.2010 14:52:35: Hi Chuan, I'm forwarding your question to the list because I haven't used the rimage package... It's best if you post questions to the list anyway because you are more likely to get a fast and useful answer. On 20 September 2010 23:03, chuan zun liang wrote: Dear Michael: I am so sorry,disturb again.I can plot jpeg image in R under package rimage But How I can it into pixel image?I want exact some part of image.I found this function in package spatstat...Is it possible I do it?Thank a lot Chuan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] defining set of variables in a formula
Dear fellow R users, I am trying to conduct a regression analysis. I have thousands of variables. The names are V1, V2,V2000 Is there an easy way to include these variables in the regression? my model is something like that: model- lm(y~V1+V2+.+V2000, data=data) Thanks so much in advance, Ozlem __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] defining set of variables in a formula
Ozlem, Just read ?formula, where it says: There are two special interpretations of ‘.’ in a formula. The usual one is in the context of a ‘data’ argument of model fitting functions and means ‘all columns not otherwise in the formula’: see ‘terms.formula’. In the context of ‘update.formula’, *only*, it means ‘what was previously in this part of the formula’. So, lm(y ~ . , data = data) But what do you hope to conclude from an analysis with thousands of dependent variables? Ozlem Yanmaz wrote: Dear fellow R users, I am trying to conduct a regression analysis. I have thousands of variables. The names are V1, V2,V2000 Is there an easy way to include these variables in the regression? my model is something like that: model- lm(y~V1+V2+.+V2000, data=data) Thanks so much in advance, Ozlem __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] defining set of variables in a formula
Hi, Try with .: model- lm(y~., data=data) From ?formula: There are two special interpretations of |.| in a formula. The usual one is in the context of a |data| argument of model fitting functions and means 'all columns not otherwise in the formula': see |terms.formula http://127.0.0.1:24287/library/stats/help/terms.formula|. HTH, Ivan Le 9/22/2010 17:23, Ozlem Yanmaz a écrit : Dear fellow R users, I am trying to conduct a regression analysis. I have thousands of variables. The names are V1, V2,V2000 Is there an easy way to include these variables in the regression? my model is something like that: model- lm(y~V1+V2+.+V2000, data=data) Thanks so much in advance, Ozlem __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Ivan CALANDRA PhD Student University of Hamburg Biozentrum Grindel und Zoologisches Museum Abt. Säugetiere Martin-Luther-King-Platz 3 D-20146 Hamburg, GERMANY +49(0)40 42838 6231 ivan.calan...@uni-hamburg.de ** http://www.for771.uni-bonn.de http://webapp5.rrz.uni-hamburg.de/mammals/eng/mitarbeiter.php [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] For loop with ifelse help
Hello R users, I have 2 files (file1 and f2) and I am trying to sum columns 6:10 of a specific row in f2 and append it in file 1 if the state variable in file 1 equals the rowname in f2. Below is an example of the code I wrote using a for loop, but it not working (i.e it only works for the last number (10) in the loop). Can someone tell me how to fix? Many thanks ! file1 - data.frame(ID=seq(1:30), state=sample(1:10, 30, replace=TRUE)); file1 ID state 1 1 7 2 2 7 3 3 6 4 4 4 5 5 5 6 6 7 7 710 8 8 1 9 9 1 10 10 5 . file2 - matrix(seq(1:100),nrow=10) f2 - as.data.frame(file2); f2 V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 1 1 11 21 31 41 51 61 71 81 91 2 2 12 22 32 42 52 62 72 82 92 3 3 13 23 33 43 53 63 73 83 93 4 4 14 24 34 44 54 64 74 84 94 5 5 15 25 35 45 55 65 75 85 95 6 6 16 26 36 46 56 66 76 86 96 7 7 17 27 37 47 57 67 77 87 97 8 8 18 28 38 48 58 68 78 88 98 9 9 19 29 39 49 59 69 79 89 99 10 10 20 30 40 50 60 70 80 90 100 for (i in length(f2)) { file1$chksum - ifelse ((file1$state==rownames(f2)[i]), rowSums(f2[rownames(f2)[i], 6:10]), 0) } print(file1) ID state chksum 1 1 7 0 2 2 7 0 3 3 6 0 4 4 4 0 5 5 5 0 6 6 7 0 7 710400 8 8 1 0 9 9 1 0 10 10 5 0 11 1110400 12 12 9 0 13 1310400 14 14 9 0 15 15 5 0 16 16 3 0 17 17 1 0 18 18 7 0 19 19 7 0 20 20 2 0 21 21 3 0 22 22 8 0 23 23 8 0 24 24 4 0 25 25 6 0 26 26 6 0 27 27 3 0 28 28 3 0 29 29 5 0 30 30 5 0 -- View this message in context: http://r.789695.n4.nabble.com/For-loop-with-ifelse-help-tp2550547p2550547.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problem with ggplot2 - Boxplot
That implies you need to update your version of plyr. Hadley On Wed, Sep 22, 2010 at 4:10 AM, RaoulD raoul.t.dso...@gmail.com wrote: Hi, I am using ggplot2 to create a boxplot that summarizes a continuous variable. This code works fine for me on one PC however when I use it on another it doesnt. The structure of the dataset AHT_TopCD is SubReason=Categorical variable, AHT=Continuous variable. The code for the boxplot: require(ggplot2) qplot(SubReason,AHT,data=AHT_TopCD,geom=boxplot,main=AHT Spread - By Sub-Reason,xlab=AHT,colour=SubReason,alpha = I(1 / 5))+ + coord_flip() + scale_x_discrete(breaks=NA) The error I get is : Error in get(make_aesthetics, env = x, inherits = TRUE)(x, ...) : could not find function empty I do not understand this error. Can anyone help me with this please? Also, let me know if you have any questions or require clarification on anything here. Regards, Raoul -- View this message in context: http://r.789695.n4.nabble.com/Problem-with-ggplot2-Boxplot-tp2549970p2549970.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Assistant Professor / Dobelman Family Junior Chair Department of Statistics / Rice University http://had.co.nz/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] failure to access packages
I am helping a fellow worker get R up and running, and he has run into a peculiar problem I've not encountered in previous install situations. From the Rconsole menu choice, he can set CRAN mirror to USA CA2, but when he selects load packages a very truncated list of packages appears (many packages don't appear at all). If he issues chooseCRANmirror() from the prompt, he can still select USA CA2, but still can't see all the packages that are there. If he issues local({pkg - select.list(sort(.packages(all.available = TRUE)),graphics=TRUE) + if(nchar(pkg)) library(pkg, character.only=TRUE)}) he gets this error: Error in select.list(sort(.packages(all.available = TRIE)), graphics = TRUE : Unused argument(s) (graphics = TRUE) When I issue same command from prompt, I get the expected list of packages. Has anyone encountered this problem? I've searched CRAN help, but got nowhere. Gregory A. Graves, Lead Scientist Everglades REstoration COoordination and VERification (RECOVER) Restoration Sciences Department South Florida Water Management District Phones: DESK: 561 / 682 - 2429 CELL: 561 / 719 - 8157 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Can ucminf be installed in 64 bit R and one more question?
thanks. Duncan. previously I try to install the 32 bit ucminf on a 64 bit R thus it cannot fit in. thanks for the link and I have downloaded and installed it. it works perfect. Nan - Original Message From: Duncan Murdoch murdoch.dun...@gmail.com To: Hey Sky heyskywal...@yahoo.com Cc: R r-help@r-project.org Sent: Tue, September 21, 2010 12:51:04 PM Subject: Re: [R] Can ucminf be installed in 64 bit R and one more question? On 21/09/2010 11:43 AM, Hey Sky wrote: Hey, Duncan thanks for your reply. I am not sure which version i have installed but I downloaded it from http://cran.skazkaforyou.com/. when I check the R installed, it says 2.11.1. I do not know I answered your question or not. if not, where I can find them? (in fact, I did not notice/find there are many versions of 64 bit R.) 2.11.1 should be fine, but perhaps you tried to install the 32 bit version of the package into 64 bit R. You can get the 64 bit package from http://probability.ca/cran/bin/windows64/contrib/2.11/ucminf_1.0-5.zip but first install the package it depends on, http://probability.ca/cran/bin/windows64/contrib/2.11/numDeriv_2009.2-1.zip Duncan Murdoch Nan - Original Message From: Duncan Murdochmurdoch.dun...@gmail.com To: Hey Skyheyskywal...@yahoo.com Cc: Rr-help@r-project.org Sent: Tue, September 21, 2010 6:51:57 AM Subject: Re: [R] Can ucminf be installed in 64 bit R and one more question? On 20/09/2010 8:36 PM, Hey Sky wrote: Hey, R Users my windows is 64 bit windows 7. I am trying to install the package ucminf into my 64 bit version R but cannot. the package I downloaded is from http://cran.r-project.org/web/packages/ucminf/index.html and I installed it with the install from local zip files, due to I did not connect my computer to internet. did anyone meet this problem and is there a version of ucminf for 64 bit R? Binary installs are specific to particular R versions. There are several versions of R with 64 bit Windows builds now; which one are you using? Duncan Murdoch the question is: why the ucminf (for 32 bit R) with option hessian=3 always give hessian matrix while most of other methods failed (includiing the option hessian=1 which using numDeriv)? thanks for any information Nan from Montreal __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] For loop with ifelse help
On Sep 22, 2010, at 11:42 AM, Pele wrote: Hello R users, I have 2 files (file1 and f2) and I am trying to sum columns 6:10 of a specific row in f2 and append it in file 1 if the state variable in file 1 equals the rowname in f2. Below is an example of the code I wrote using a for loop, but it not working (i.e it only works for the last number (10) in the loop). Can someone tell me how to fix? Many thanks ! file1 - data.frame(ID=seq(1:30), state=sample(1:10, 30, replace=TRUE)); file1 ID state 1 1 7 2 2 7 3 3 6 4 4 4 5 5 5 6 6 7 7 710 8 8 1 9 9 1 10 10 5 . file2 - matrix(seq(1:100),nrow=10) f2 - as.data.frame(file2); f2 V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 1 1 11 21 31 41 51 61 71 81 91 2 2 12 22 32 42 52 62 72 82 92 3 3 13 23 33 43 53 63 73 83 93 4 4 14 24 34 44 54 64 74 84 94 5 5 15 25 35 45 55 65 75 85 95 6 6 16 26 36 46 56 66 76 86 96 7 7 17 27 37 47 57 67 77 87 97 8 8 18 28 38 48 58 68 78 88 98 9 9 19 29 39 49 59 69 79 89 99 10 10 20 30 40 50 60 70 80 90 100 for (i in length(f2)) { file1$chksum - ifelse ((file1$state==rownames(f2)[i]), rowSums(f2[rownames(f2)[i], 6:10]), 0) } That looks overly complex and inefficient (not to mention wrong): Try: res - merge(file1, rowSums(file2), by.x=state, by.y=row.names, all=TRUE) names(res)[4] - chksum If you need it sorted by ID then: res[order(res$ID],] -- David. print(file1) ID state chksum 1 1 7 0 2 2 7 0 3 3 6 0 4 4 4 0 5 5 5 0 6 6 7 0 7 710400 8 8 1 0 9 9 1 0 10 10 5 0 11 1110400 12 12 9 0 13 1310400 14 14 9 0 15 15 5 0 16 16 3 0 17 17 1 0 18 18 7 0 19 19 7 0 20 20 2 0 21 21 3 0 22 22 8 0 23 23 8 0 24 24 4 0 25 25 6 0 26 26 6 0 27 27 3 0 28 28 3 0 29 29 5 0 30 30 5 0 -- View this message in context: http://r.789695.n4.nabble.com/For-loop-with-ifelse-help-tp2550547p2550547.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] bctrans: Box-Cox Transformation Problem
Hello, I'm currently trying to model the movement of a slope (v.obs) with a regression model. The data can be found following the given links: either http://www.sendspace.com/file/dnugwc or http://rapidshare.com/files/420569660/sel.day.txt I want to use the Box-Cox transformation to normalize the response as well as the predictor variables. The scatterplot looks like this: library(zoo) library(alr3) load(sel.day.txt) sel.p1-window(sel, start=as.POSIXct(2008-04-05), end=as.POSIXct(2009-04-01)) pairs(~v.obs+ snow+ HH6.1+ Q.Enz+ pcpt+ qd,data=sel.p1,gap=0.4,cex.labels=1.5) In Sheather: A Modern Approach to Regression with R the function bctrans is used to calculate lambda for the variables. I use yeo.johnson since there are values=0 in the data. Doing this creates following output: 2 summary(bctrans(~v.obs+ snow+ pcpt+ Q.Enz+ qd+ HH6.1, data=sel.p1, family=yeo.johnson)) yeo.johnson Transformations to Multinormality Est.Power Std.Err. Wald(Power=0) Wald(Power=1) v.obs -49.9674 5.5747 -8.9632 -9.1426 snow-4.1130 0.3326 -12.3655 -15.3719 pcpt 0.6111 0.08117.5341 -4.7950 Q.Enz -0.8584 0.0904 -9.4967 -20.5601 qd -26.1100 2.3432 -11.1427 -11.5695 HH6.1 -6.0205 0.0023-2653.7643-3094.5528 LRT df p.value LR test, all lambda equal 0 549.4523 6 0 LR test, all lambda equal 1 1414.1770 6 0 So what to do with that. I tried transforming my variables with the Est.Power given in the output. I rounded the values more or less arbitrarily for the first try: v.obs-(sel.p1$v.obs^(-0.5)-1)/-0.5 snow-(sel.p1$snow^(-4)-1)/-4 pcpt-(sel.p1$pcpt^(0.5)-1)/0.5 Q.Enz-(sel.p1$Q.Enz^(-0.9)-1)/-0.9 qd-(sel.p1$qd^(-26)-1)/-26 HH6.1-(sel.p1$HH6.1^(-6)-1)/-6 trans-merge(v.obs,qd,pcpt,snow,HH6.1,Q.Enz) This gives me a lot of -Inf's which I d'ont like too much. I thought about transforming the data first, e.g v.obs-v.obs*10^5. But that doesn't seem the right way, and doing that i often get errors from bctrans: 2 summary(bctrans(~ v.obs+ snow+ pcpt+ Q.Enz+ qd+ HH6.1, data=sel.p1, family=yeo.johnson)) Error in optim(start, neg.kernel.profile.logL, hessian = TRUE, method = L-BFGS-B, : L-BFGS-B needs finite values of 'fn' These errors also happen when i try another formula without the response variable: 2 summary(bctrans(~ snow+ pcpt+ Q.Enz+ qd+ HH6.1, data=sel.p1, family=yeo.johnson)) Error in optim(start, neg.kernel.profile.logL, hessian = TRUE, method = L-BFGS-B, : L-BFGS-B needs finite values of 'fn' Does anybody have an idea how to cope with the data to get proper parameters for the transformation? Thanks a lot Axel Kasparek TU München __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Can ucminf be installed in 64 bit R and one more question?
hey, Ravi yes. I have tried the hessian() in the numDeriv package and it is the same package ucminf() uses to calculate the hessian matrix while having the option hessian=1. maybe I should avoid the word fail but instead using some others. anyway, what I mean in the former post is the hessian(), the optim(...hessian=T...) or ucminf(...hessian=1...) give a NaN for std.err, which a negative variance appears. but the ucminf( ... hessian=3 ...) will not generate negative variance. I am curious how the hessian here has been calculated. hope I expressed the situation clear PS, for the zero hessian matrix I met before, I think it might be the function curve is too flat at the initial value, since the second order derivation shows the curvature of the curve. any possible suggestions for it? Nan - Original Message From: Ravi Varadhan rvarad...@jhmi.edu To: Hey Sky heyskywal...@yahoo.com; R r-help@r-project.org Sent: Tue, September 21, 2010 12:22:37 PM Subject: RE: [R] Can ucminf be installed in 64 bit R and one more question? I don’t understand your question about the hessian. Did you try hessian() function in numDeriv on the result from optim directly rather than using ucminf? Also, what do you mean by `failed'? What error message did you get? Ravi. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Hey Sky Sent: Monday, September 20, 2010 8:36 PM To: R Subject: [R] Can ucminf be installed in 64 bit R and one more question? Hey, R Users my windows is 64 bit windows 7. I am trying to install the package ucminf into my 64 bit version R but cannot. the package I downloaded is from http://cran.r-project.org/web/packages/ucminf/index.html and I installed it with the install from local zip files, due to I did not connect my computer to internet. did anyone meet this problem and is there a version of ucminf for 64 bit R? the question is: why the ucminf (for 32 bit R) with option hessian=3 always give hessian matrix while most of other methods failed (includiing the option hessian=1 which using numDeriv)? thanks for any information Nan from Montreal __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Unique subsetting question
Hi all, I'm looking at a large data set, and I'm interested in removing rows where only one variable is duplicated. Here's an example: presidents Qtr1 Qtr2 Qtr3 Qtr4 1945 NA 87 82 75 1946 63 50 43 32 1947 35 60 54 55 1948 36 39 NA NA 1949 69 57 57 51 1950 45 37 46 39 1951 36 24 32 23 1952 25 32 NA 32 1953 59 74 75 60 1954 71 61 71 57 1955 71 68 79 73 1956 76 71 67 75 1957 79 62 63 57 1958 60 49 48 52 1959 57 62 61 66 1960 71 62 61 57 1961 72 83 71 78 1962 79 71 62 74 1963 76 64 62 57 1964 80 73 69 69 1965 71 64 69 62 1966 63 46 56 44 1967 44 52 38 46 1968 36 49 35 44 1969 59 65 65 56 1970 66 53 61 52 1971 51 48 54 49 1972 49 61 NA NA 1973 68 44 40 27 1974 28 25 24 24 See how in 1954 and 1955, the Qtr1 approval rating is the same? Let's say I wanted to return the presidents data frame, but only have unique values for Qtr1. I doesn't matter which years are displayed for duplicated values-- it just matters that each value is not displayed more than once. Any way I can do this but still have it be a data frame that shows Qtr2, 3, and 4 values? Thanks in advance, Andrew -- View this message in context: http://r.789695.n4.nabble.com/Unique-subsetting-question-tp2550453p2550453.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Unique subsetting question
Hi, Take a look at ?duplicated and ?unique HTH, Ivan Le 9/22/2010 16:55, AndrewPage a écrit : Hi all, I'm looking at a large data set, and I'm interested in removing rows where only one variable is duplicated. Here's an example: presidents Qtr1 Qtr2 Qtr3 Qtr4 1945 NA 87 82 75 1946 63 50 43 32 1947 35 60 54 55 1948 36 39 NA NA 1949 69 57 57 51 1950 45 37 46 39 1951 36 24 32 23 1952 25 32 NA 32 1953 59 74 75 60 1954 71 61 71 57 1955 71 68 79 73 1956 76 71 67 75 1957 79 62 63 57 1958 60 49 48 52 1959 57 62 61 66 1960 71 62 61 57 1961 72 83 71 78 1962 79 71 62 74 1963 76 64 62 57 1964 80 73 69 69 1965 71 64 69 62 1966 63 46 56 44 1967 44 52 38 46 1968 36 49 35 44 1969 59 65 65 56 1970 66 53 61 52 1971 51 48 54 49 1972 49 61 NA NA 1973 68 44 40 27 1974 28 25 24 24 See how in 1954 and 1955, the Qtr1 approval rating is the same? Let's say I wanted to return the presidents data frame, but only have unique values for Qtr1. I doesn't matter which years are displayed for duplicated values-- it just matters that each value is not displayed more than once. Any way I can do this but still have it be a data frame that shows Qtr2, 3, and 4 values? Thanks in advance, Andrew -- Ivan CALANDRA PhD Student University of Hamburg Biozentrum Grindel und Zoologisches Museum Abt. Säugetiere Martin-Luther-King-Platz 3 D-20146 Hamburg, GERMANY +49(0)40 42838 6231 ivan.calan...@uni-hamburg.de ** http://www.for771.uni-bonn.de http://webapp5.rrz.uni-hamburg.de/mammals/eng/mitarbeiter.php [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Crash report: regexpr(a{2-}, )
On 9/21/2010 8:04 PM, Henrik Bengtsson wrote: Each of the following calls crash (core dumps) R (R --vanilla) on various versions and OSes: regexpr(a{2-}, ) sub(a{2-}, ) gsub(a{2-}, ) EXAMPLES: To add another (windows) example it also crashes the 2.12.0 alpha build: sessionInfo() R version 2.12.0 alpha (2010-09-20 r52948) Platform: i386-pc-mingw32/i386 (32-bit) ... regexpr(a{2-}, ) Assertion failed: iter-max == -1 || iter-max == 1, file tre-compile.c, line 1825 This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information. sessionInfo() R version 2.11.1 Patched (2010-09-16 r52949) Platform: i386-pc-mingw32 (32-bit) ... regexpr(a{2-}, ) Assertion failed: iter-max == -1 || iter-max == 1, file tre-compile.c, line 1825 This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information. sessionInfo() R version 2.12.0 Under development (unstable) (2010-09-14 r52910) Platform: i386-pc-mingw32/i386 (32-bit) ... regexpr(a{2-}, ) Assertion failed: iter-max == -1 || iter-max == 1, file tre-compile.c, line 1825 This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information. sessionInfo() R version 2.11.0 Patched (2010-05-09 r51960) x86_64-unknown-linux-gnu ... regexpr(a{2-}, ) R: tre-compile.c:1825: tre_ast_to_tnfa: Assertion `iter-max == -1 || iter-max == 1' failed. Aborted /Henrik -- Brian S. Diggs, PhD Senior Research Associate, Department of Surgery Oregon Health Science University __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] For loop with ifelse help
Hi David - thanks for your suggestion, but I am trying to avoid doing any merging and sorting for this step because the real file I will be working with has about 20 million records. If I can get this loop or something similar to work will be good enough. thanks again.. -- View this message in context: http://r.789695.n4.nabble.com/For-loop-with-ifelse-help-tp2550547p2550656.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] failure to access packages
The problem R ver is 2.10.1 - I am using 2.11.1 Running windows xp sp3 - same system as what I use The suggested solution is for the user to upgrade to 2.11.1 to resolve package incompatibilities Gregory A. Graves, Lead Scientist Everglades REstoration COoordination and VERification (RECOVER) Restoration Sciences Department South Florida Water Management District Phones: DESK: 561 / 682 - 2429 CELL: 561 / 719 - 8157 -Original Message- From: Graves, Gregory Sent: Wednesday, September 22, 2010 11:45 AM To: 'r-help@r-project.org' Cc: King, Christopher Subject: RE: failure to access packages I am helping a fellow worker get R up and running, and he has run into a peculiar problem I've not encountered in previous install situations. From the Rconsole menu choice, he can set CRAN mirror to USA CA2, but when he selects load packages a very truncated list of packages appears (many packages don't appear at all). If he issues chooseCRANmirror() from the prompt, he can still select USA CA2, but still can't see all the packages that are there. If he issues local({pkg - select.list(sort(.packages(all.available = TRUE)),graphics=TRUE) + if(nchar(pkg)) library(pkg, character.only=TRUE)}) he gets this error: Error in select.list(sort(.packages(all.available = TRIE)), graphics = TRUE : Unused argument(s) (graphics = TRUE) When I issue same command from prompt, I get the expected list of packages. Has anyone encountered this problem? I've searched CRAN help, but got nowhere. Gregory A. Graves, Lead Scientist Everglades REstoration COoordination and VERification (RECOVER) Restoration Sciences Department South Florida Water Management District Phones: DESK: 561 / 682 - 2429 CELL: 561 / 719 - 8157 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] kstest vs shapirotest
The way that you called ks.test below your null hypothesis is that your data comes from a normal distribution with mean 0 and standard deviation 1. Now I am not familiar with your data, but I would be very surprised in general to find a population of trees where half of them had negative heights so it is not surprising that the ks test shows your data to not be consistent with the null of normal with mean 0, sd 1. If you are interested in normality of populations based on samples greater than 250 then you should look into SnowsPenultimateNormalityTest() in the TeachingDemos package, note however that the documentation is likely to be more useful than the test itself. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r- project.org] On Behalf Of Sibylle Stöckli Sent: Wednesday, September 22, 2010 6:32 AM To: r-help@r-project.org Subject: [R] kstest vs shapirotest Dear R-users Idea: Analysing tree height frequency with hist(), normal distribution (ks.test shapiro.test) and skewness (package e1071 - thanks a lot for this useful package)as an indication of possible self-thinning in an experimental tree stand. Problem: Results from the ks.test and the shapiro.test are not comparable (see example of both tests). Tree height is a nice continuous variable. Sample size is around n= 250-350. Does anybode know about a problem in ks tests using a large sample size and working with subsets (e.g file[,])? Comparing tests with qqplots, I would appreciate shapiro, but I am wondering about the results from ks (test ist not very sensitive, D=1, p=2.2e-16 many times? Thanks Sibylle shapiro.test(Biotree[Ac,]$Height2008) Shapiro-Wilk normality test data: Biotree[Ac, ]$Height2008 W = 0.9908, p-value = 0.05175 ks.test(Biotree[Ac,]$Height2008, pnorm) One-sample Kolmogorov-Smirnov test data: Biotree[Ac, ]$Height2008 D = 1, p-value 2.2e-16 alternative hypothesis: two-sided Warning message: In ks.test(Biotree[Ac, ]$Height2008, pnorm) : cannot compute correct p-values with ties -- __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Ordinal mixed model
Hello, I am trying to build a generalised linear mixed model. My dependent variable is ordinal. I have a random factor (7 individuals), and a repeated measure where the dependent variable was measured three times for each of four attempts (so the repeats are nested). I also have a few covariates. I am a complete novice in R, being used to using SPSS. SPSS lets me build an ordinal model with repeated measures, but can't include a random factor. So that is really the hurdle, is this possible in R. And is there a way to do this that could be explained to someone who has no experience in R? Any help would be greatly appreciated. Kind Regards, ptwal __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] statistic term in boot function
Hi There, Just a question regarding the function that is specified to boot (I have read the help, the manual and online examples.). The description of boot says that the second argument of statistic (non parametric bootstrap) must be a vector of indices, frequencies or weights which define the bootstrap sample. If what I will be bootstrapping is e.g., the mean of a vector of length N and I want each observation in the vector to be given equal weight, should the second argument be a vector containing N 1's? Thanks beforehand for your help. Best, A -- View this message in context: http://r.789695.n4.nabble.com/statistic-term-in-boot-function-tp2550629p2550629.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Unique subsetting question
I understand how duplicated and unique work for a list where all parts of a given row are duplicated, or how to find duplicated values if I'm just looking at that first column, but in this case the rows for 1954 and 1955 are not completely the same; only quarter 1 is duplicated, so I'm not sure how to apply either duplicated or unique in that case. Thanks, Andrew -- View this message in context: http://r.789695.n4.nabble.com/Unique-subsetting-question-tp2550453p2550651.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Unique subsetting question
On Sep 22, 2010, at 12:35 PM, AndrewPage wrote: I understand how duplicated and unique work for a list where all parts of a given row are duplicated, or how to find duplicated values if I'm just looking at that first column, but in this case the rows for 1954 and 1955 are not completely the same; only quarter 1 is duplicated, so I'm not sure how to apply either duplicated or unique in that case. Did you understand that the example data you are using is neither a data.frame nor a matrix? Thanks, Andrew -- View this message in context: http://r.789695.n4.nabble.com/Unique-subsetting-question-tp2550453p2550651.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] help with sockets in R
Hi, Thanks for the advice! My locale and encoding setting follow: Sys.getlocale() [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8 getOption(encoding) [1] native.enc I was indeed able to solve my immediate problem by using readLines to read the whole response and parse it later, following the example in an old version of OmegaHat's SSOAP package. The httpRequest package also reads the entire response as a unit, but does so using readBin and rawToChar in one case and read.socket in another. OmegaHat's RCurl library deserves further investigation and might be the smartest way to go. Thanks also for the pointer to tests/internet.R, which uses read.socket. The R Data Import/Export guide states that socketConnection is preferred over the make.socket/read.socket methods. But with several different options, it's hard to figure out which way to go. My understanding of HTTP is very limited, but might reading the response 'til the server closes the socket run afoul of some of the more advanced uses of HTTP? Thanks, -chris On Tue, Sep 21, 2010 at 11:43 PM, Prof Brian Ripley rip...@stats.ox.ac.uk wrote: readLines() is for a text-mode connection; readChar() is for a binary-mode connection. Given that you asked for possible re-encoding by the 'encoding' argument, you cannot safely mix them (text-mode access is re-encoded, binary-mode is not). However, we don't know if re-encoding was active in your case since we don't know your locale. Either don't specify an encoding and re-encode the response in R or use readLines() to read the complete response and split it up later. For a different approach with read/write.socket() see tests/internet.R in the R sources. Please do note that the posting guide asked you for 'at a minimum' information (which includes the locale) and a reproducible example. On Tue, 21 Sep 2010, Christopher Bare wrote: Hi R gurus, I'm trying to use a ReSTful web service from within R. Specifically, I need to make HTTP PUT requests. I'm able to make the request and that goes well enough, but I'm having trouble properly consuming the HTTP response. The problem comes in when I'm trying to parse out the response body. I get the length of the response body from the Content-Length header. I then try to use readChar(con, nchars=content.length). The result I'm seeing is that the first few characters of the response body are cut off. My code looks like this: http.request - function(host, path, request, port=80) { con - socketConnection(host=host, port=port, open=w+, blocking=TRUE, encoding=UTF-8) writeLines(request, con) write(wrote request, stderr()) flush(stderr()) # build response object response - list() response$status - readLines(con, n=1) response$headers - character(0) repeat{ ss - readLines(con, n=1) write(ss, stderr()) flush(stderr()) if (ss == ) break key.value - strsplit(ss, :\\s*) response$headers[key.value[[1]][1]] - key.value[[1]][2] } if (any(names(response$headers)=='Content-Length')) { content.length - as.integer(response$headers['Content-Length']) response$body - readChar(con, nchars=content.length) } close(con) } response$body ends up with e,\id\:\some_doc\,\rev\:\7-906e06a7744780ef93126adc6f8f10ef\}\n when it should be: {\ok\:true,\id\:\some_doc\,\rev\:\7-906e06a7744780ef93126adc6f8f10ef\} Is mixing readLines and readChars on the same connection causing bad juju? If it is, what's the recommended what to do such a thing? Thanks for any help!! -Chris __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Unique subsetting question
I just figured that out, but the real data I'm using is a data frame for sure, so I'll find another example. -- View this message in context: http://r.789695.n4.nabble.com/Unique-subsetting-question-tp2550453p2550736.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] problem opening pdf device on Windows 7
I can not open PDf device. Acrobat is closed. I have checked archives but could not find a solution. What should I do? cont.cdfplot(myanalysis.pdf, myanalysis$CDF, ylbl.r=Stream Length (km)) Error in pdf(file = pdffile, width = width, height = height) : unable to start device pdf In addition: Warning message: In pdf(file = pdffile, width = width, height = height) : cannot open 'pdf' file argument 'myanalysis.pdf' Then I tried: pdf() Error in pdf() : unable to start device pdf In addition: Warning message: In pdf() : cannot open 'pdf' file argument 'Rplots.pdf' __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] legend
Hi, there is a function to plot survival curves: library(survival) plot.KM - function(survival, x, x_cut.off, main='', label='') { plot(survfit(survival ~ I(x = x_cut.off)), main=main) legend('bottomleft', c(expression(label = x_cut.off),expression(label x_cut.off))) } Now, I need to determine as the argument what appears in the legend. I want plot.KM(survival, x, x_cut.off=0.5, main='', label='ABC') but what I get is ('label x_cut.off') in legend instead of 'ABC0.5'. Symbol '=' should be in mathematical form. will appreciate any help. robert -- View this message in context: http://r.789695.n4.nabble.com/legend-tp2550747p2550747.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Doing operations by grouping variable
-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Seth W Bigelow Sent: Tuesday, September 21, 2010 4:22 PM To: bill.venab...@csiro.au Cc: r-help@r-project.org Subject: Re: [R] Doing operations by grouping variable Aah, that is the sort of truly elegant solution I have been seeking. And it's wrapped up in a nice programming shortcut to boot (i.e., the within statement). I retract anything I may have said about tapply being clunky. Many thanks --Seth Dr. Seth W. Bigelow Biologist, USDA-FS Pacific Southwest Research Station 1731 Research Park Drive, Davis California bill.venab...@csiro.au 09/21/2010 03:15 PM To sbige...@fs.fed.us You left out the subscript. Why not just do d - within(data.frame(group = rep(1:5, each = 5), variable = rnorm(25)), scaled - variable/tapply(variable, group, max)[group]) This approach can be tricky when there is more than one grouping variable. E.g., suppose we have grouping variables g1 and g2: d - data.frame(x=1:10, g1=LETTERS[rep(11:12,each=5)], g2=letters[rep(21:23,c(3,3,4))]) d x g1 g2 1 1 K u 2 2 K u 3 3 K u 4 4 K v 5 5 K v 6 6 L v 7 7 L w 8 8 L w 9 9 L w 10 10 L w and we want to divide each x value by it max for each g1*g2 group (6 possible groups, of which 4 are in the data). You can extend Bill V.'s approach with with(d, x/tapply(x, list(g1,g2), FUN=max)[cbind(g1,g2)]) [1] 0.333 0.667 1.000 0.800 1.000 [6] 1.000 0.700 0.800 0.900 1.000 That would fail if g1 and g2 were not factors but were integer vectors. Try it with di - data.frame(x=1:10, g1=rep(11:12,each=5), g2=rep(21:23,c(3,3,4))) with(di, x/tapply(x, list(g1,g2), FUN=max)[cbind(g1,g2)]) Error in tapply(x, list(g1, g2), FUN = max)[cbind(g1, g2)] : subscript out of bounds To avoid that problem you can call tapply with no FUN to get the indices to subscript by with(d, x/tapply(x, list(g1,g2), FUN=max)[tapply(x, list(g1, g2))]) [1] 0.333 0.667 1.000 0.800 1.000 [6] 1.000 0.700 0.800 0.900 1.000 The misleadingly named ave() can avoid the need to do the subscripting after tapply but has other problems with(d, x/ave(x, g1, g2, FUN=max)) [1] 0.333 0.667 1.000 0.800 1.000 [6] 1.000 0.700 0.800 0.900 1.000 Warning messages: 1: In FUN(X[[6L]], ...) : no non-missing arguments to max; returning -Inf 2: In FUN(X[[6L]], ...) : no non-missing arguments to max; returning -Inf It gives the right answer but it is calling FUN even for the empty interaction groups. For some FUN's this would abort the call, not just give a warning. In any case it is a waste of time. In either case you can also use the interaction() function to change the multiple grouping vectors into one: d - within(d, interaction(g1, g2, drop=TRUE)) with(d, x/ave(x, g1g2, FUN=max)) [1] 0.333 0.667 1.000 0.800 1.000 [6] 1.000 0.700 0.800 0.900 1.000 with(d, x/tapply(x, g1g2, FUN=max)[g1g2]) K.u K.u K.u K.v K.v L.v 0.333 0.667 1.000 0.800 1.000 1.000 L.w L.w L.w L.w 0.700 0.800 0.900 1.000 with(d, x/tapply(x, g1g2, FUN=max)[tapply(x, g1g2)]) K.u K.u K.u K.v K.v L.v 0.333 0.667 1.000 0.800 1.000 1.000 L.w L.w L.w L.w 0.700 0.800 0.900 1.000 The names are probably unwanted in the tapply cases; use unname to get rid of them. Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com and be done with it? (Warning: if you replace the second '-' above by '=', it will not work. It is NOT true that you can always replace '-' by '=' for assignment. Why?) Bill Venables. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Unique subsetting question
Hi Andrew, You can use duplicated() to index the rows you wish to keep, like this: test.dat - data.frame(a=c(1,1:5,5:10), b=1:12, c=letters[1:12]) #make up data duplicated(test.dat$a) # see what duplicated() function does !duplicated(test.dat$a) # see how we can invert using the ! function so that we get non-duplicated test.dat[!duplicated(test.dat$a),] # this is the important bit: use indexing to select non-duplicated rows. Best, Ista On Wed, Sep 22, 2010 at 12:35 PM, AndrewPage savejar...@yahoo.com wrote: I understand how duplicated and unique work for a list where all parts of a given row are duplicated, or how to find duplicated values if I'm just looking at that first column, but in this case the rows for 1954 and 1955 are not completely the same; only quarter 1 is duplicated, so I'm not sure how to apply either duplicated or unique in that case. Thanks, Andrew -- View this message in context: http://r.789695.n4.nabble.com/Unique-subsetting-question-tp2550453p2550651.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Ista Zahn Graduate student University of Rochester Department of Clinical and Social Psychology http://yourpsyche.org __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Unique subsetting question
How about this: s = c(aa, bb, cc, , aa, dd, , aa) n = c(2, 3, 5, 6, 7, 8, 9, 3) b = c(TRUE, FALSE, TRUE, TRUE, FALSE, TRUE, TRUE, FALSE) df = data.frame(n, s, b) # df is a data frame I want to display df with no value in s occurring more than once. Also, I want to delete the rows where s contains . -- View this message in context: http://r.789695.n4.nabble.com/Unique-subsetting-question-tp2550453p2550769.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] reshape is re-ordering my variables
On 09/21/2010 09:44 PM, Dennis Murphy wrote: Hi: Reshaping multiple variables is nontrivial. Try the following (untested): reshape(rcw, idvar = 'ICU', varying = list(c(paste('Q6.RC', 1:4, sep = '.'), c(paste('Q6.FT.RC', 1:4, 'years', sep = '.'), c(paste('Q6.FT.RC', 1:4, 'months', sep = '.'), c(paste('Q6.PT.RC', 1:4, 'years', sep = '.'), c(paste('Q6.PT.RC', 1:4, 'months', sep = '.')), v.names = c(init,FTy,FTm,PTy,PTm), direction = 'long') Thanks. Your approach worked although there were unnecessary 'c(' in varying component. The command that seems to have worked for me is: rcl - reshape(rcw, idvar = 'ICU', varying = list(paste('Q6.RC', 1:4, sep = '.'), paste('Q6.FT.RC', 1:4, 'years', sep = '.'), paste('Q6.FT.RC', 1:4, 'months', sep = '.'), paste('Q6.PT.RC', 1:4, 'years', sep = '.'), paste('Q6.PT.RC', 1:4, 'months', sep = '.')), v.names = c(init,FTy,FTm,PTy,PTm), direction = 'long') So, thanks again for pointing me in the right direction here. Kevin The list contains the subgroups of the variables you want combined and v.names, as you appear to know, provides new names for the reshaped columns. My template example also has a times variable, but it may not be necessary in your case. HTH, Dennis On Tue, Sep 21, 2010 at 12:01 PM, Kevin E. Thorpe kevin.tho...@utoronto.ca mailto:kevin.tho...@utoronto.ca wrote: Is it an undocumented (at least I missed it if it's documented) feature of the reshape function to do numeric variables followed by character? I ask because that seems to be the case below. str(rcw) 'data.frame': 23 obs. of 21 variables: $ ICU : int 1 18 17 9 22 19 6 16 25 26 ... $ Q6.RC.1 : chr SM JF IW MS ... $ Q6.FT.RC.1.years : int 0 8 12 3 9 1 5 16 5 5 ... $ Q6.FT.RC.1.months: int 0 0 0 0 0 0 0 6 0 0 ... $ Q6.PT.RC.1.years : int 2 0 0 1 2 0 0 0 0 0 ... $ Q6.PT.RC.1.months: int 0 0 0 0 0 0 0 0 0 0 ... $ Q6.RC.2 : chr BA ML TM YL ... $ Q6.FT.RC.2.years : int 0 0 7 3 0 9 0 0 0 0 ... $ Q6.FT.RC.2.months: int 0 0 0 0 0 9 0 0 0 0 ... $ Q6.PT.RC.2.years : int 2 10 2 0 0 9 0 5 0 0 ... $ Q6.PT.RC.2.months: int 0 0 0 0 8 9 1 0 6 6 ... $ Q6.RC.3 : chr LL TM 9 9 ... $ Q6.FT.RC.3.years : int 6 0 9 9 9 9 0 9 0 0 ... $ Q6.FT.RC.3.months: int 0 0 9 9 9 9 0 9 0 0 ... $ Q6.PT.RC.3.years : int 0 8 9 9 9 9 0 9 0 0 ... $ Q6.PT.RC.3.months: int 0 0 9 9 9 9 1 9 4 4 ... $ Q6.RC.4 : chr 9 IW 9 9 ... $ Q6.FT.RC.4.years : int 9 0 9 9 9 9 9 9 9 9 ... $ Q6.FT.RC.4.months: int 9 0 9 9 9 9 9 9 9 9 ... $ Q6.PT.RC.4.years : int 9 12 9 9 9 9 9 9 9 9 ... $ Q6.PT.RC.4.months: int 9 0 9 9 9 9 9 9 9 9 ... This data frame needs to be converted to long format with 5 variables repeating over 4 observations. rcl - reshape(rcw,idvar=ICU,varying=2:21,direction=long,v.names=c(init,FTy,FTm,PTy,PTm)) str(rcl) 'data.frame': 92 obs. of 7 variables: $ ICU : int 1 18 17 9 22 19 6 16 25 26 ... $ time: int 1 1 1 1 1 1 1 1 1 1 ... $ init: int 0 0 0 0 0 0 0 6 0 0 ... $ FTy : int 0 8 12 3 9 1 5 16 5 5 ... $ FTm : int 0 0 0 0 0 0 0 0 0 0 ... $ PTy : int 2 0 0 1 2 0 0 0 0 0 ... $ PTm : chr SM JF IW MS ... - attr(*, reshapeLong)=List of 4 ..$ varying:List of 5 .. ..$ FTm : chr Q6.FT.RC.1.months Q6.FT.RC.2.months Q6.FT.RC.3.months Q6.FT.RC.4.months .. ..$ FTy : chr Q6.FT.RC.1.years Q6.FT.RC.2.years Q6.FT.RC.3.years Q6.FT.RC.4.years .. ..$ PTm : chr Q6.PT.RC.1.months Q6.PT.RC.2.months Q6.PT.RC.3.months Q6.PT.RC.4.months .. ..$ PTy : chr Q6.PT.RC.1.years Q6.PT.RC.2.years Q6.PT.RC.3.years Q6.PT.RC.4.years .. ..$ init: chr Q6.RC.1 Q6.RC.2 Q6.RC.3 Q6.RC.4 .. ..- attr(*, v.names)= chr init FTy FTm PTy ... .. ..- attr(*, times)= int 1 2 3 4 ..$ v.names: chr init FTy FTm PTy ... ..$ idvar : chr ICU ..$ timevar: chr time In the result, the values in the first of the varying variables goes into the last variable while the other values are shifted left. The attributes in the result are correct, but the contents of rcl$PTm are what I expected in rcl$init. sessionInfo() R version 2.11.1 Patched (2010-07-21 r52598) Platform: i686-pc-linux-gnu (32-bit) locale: [1] LC_CTYPE=en_US LC_NUMERIC=C
Re: [R] Unique subsetting question
Thanks-- that works for what I'm trying to do. I was also wondering, in the data frame example you gave, if I just wanted to get rid of rows where the a value is 5, how would I do that? -- View this message in context: http://r.789695.n4.nabble.com/Unique-subsetting-question-tp2550453p2550836.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Unique subsetting question
Hi Andrew, Perhaps you did not notice my previous email. The answer is still the same (see below): On Wed, Sep 22, 2010 at 1:48 PM, AndrewPage savejar...@yahoo.com wrote: How about this: s = c(aa, bb, cc, , aa, dd, , aa) n = c(2, 3, 5, 6, 7, 8, 9, 3) b = c(TRUE, FALSE, TRUE, TRUE, FALSE, TRUE, TRUE, FALSE) df = data.frame(n, s, b) # df is a data frame I want to display df with no value in s occurring more than once. df - df[!duplicated(df$s),] Also, I want to delete the rows where s contains . Same idea here: df[s != ,] -Ista -- View this message in context: http://r.789695.n4.nabble.com/Unique-subsetting-question-tp2550453p2550769.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Ista Zahn Graduate student University of Rochester Department of Clinical and Social Psychology http://yourpsyche.org __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Extracting bins and frequencies from frequency table
Dear R users, I would like to great a frequency table from raw data and then access the classes/bins and their respective frequencies separately. Here the code to create the frequency tables: x1 - c(1,5,1,1,2,2,3,4,5,3,2,3,6,4,3,8) t1 - table(x1) print(t1[1]) Its easy to plot this, but how do I actually access the frequencies alone and the bins alone? Basically I am looking to get: bins - c(1, 2, 3, 4, 5, 6, 8) freq - c(3, 3, 4, 2, 2, 1, 1) When running print(t1[1]) I only get one pair. It seems to be organized that way. Is there a better way? Perhaps 'table' is not the right approach? Thanks a lot, Ralf __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Unique subsetting question
I already gave you three examples of how this works. Your last request can be done in exactly the same way. Give it a try and see what happens (use example data of course!). As a last resort you could read the documentation: ?Comparison ?Extract -Ista On Wed, Sep 22, 2010 at 2:22 PM, AndrewPage savejar...@yahoo.com wrote: Thanks-- that works for what I'm trying to do. I was also wondering, in the data frame example you gave, if I just wanted to get rid of rows where the a value is 5, how would I do that? -- View this message in context: http://r.789695.n4.nabble.com/Unique-subsetting-question-tp2550453p2550836.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Ista Zahn Graduate student University of Rochester Department of Clinical and Social Psychology http://yourpsyche.org __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Extracting bins and frequencies from frequency table
Hi Ralf try hist() obl-hist(x1, plot=FALSE) it returns midpoints and their respective frequencies. You can specify the breakpoints as well. ?hist for details. Mike On Wed, Sep 22, 2010 at 1:44 PM, Ralf B ralf.bie...@gmail.com wrote: Dear R users, I would like to great a frequency table from raw data and then access the classes/bins and their respective frequencies separately. Here the code to create the frequency tables: x1 - c(1,5,1,1,2,2,3,4,5,3,2,3,6,4,3,8) t1 - table(x1) print(t1[1]) Its easy to plot this, but how do I actually access the frequencies alone and the bins alone? Basically I am looking to get: bins - c(1, 2, 3, 4, 5, 6, 8) freq - c(3, 3, 4, 2, 2, 1, 1) When running print(t1[1]) I only get one pair. It seems to be organized that way. Is there a better way? Perhaps 'table' is not the right approach? Thanks a lot, Ralf __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Extracting bins and frequencies from frequency table
Ralf - Try bins = as.numeric(names(t1)) freqs = as.vector(t1) - Phil Spector Statistical Computing Facility Department of Statistics UC Berkeley spec...@stat.berkeley.edu On Wed, 22 Sep 2010, Ralf B wrote: Dear R users, I would like to great a frequency table from raw data and then access the classes/bins and their respective frequencies separately. Here the code to create the frequency tables: x1 - c(1,5,1,1,2,2,3,4,5,3,2,3,6,4,3,8) t1 - table(x1) print(t1[1]) Its easy to plot this, but how do I actually access the frequencies alone and the bins alone? Basically I am looking to get: bins - c(1, 2, 3, 4, 5, 6, 8) freq - c(3, 3, 4, 2, 2, 1, 1) When running print(t1[1]) I only get one pair. It seems to be organized that way. Is there a better way? Perhaps 'table' is not the right approach? Thanks a lot, Ralf __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Extracting bins and frequencies from frequency table
Try this: as.data.frame(table(x1)) On Wed, Sep 22, 2010 at 3:44 PM, Ralf B ralf.bie...@gmail.com wrote: Dear R users, I would like to great a frequency table from raw data and then access the classes/bins and their respective frequencies separately. Here the code to create the frequency tables: x1 - c(1,5,1,1,2,2,3,4,5,3,2,3,6,4,3,8) t1 - table(x1) print(t1[1]) Its easy to plot this, but how do I actually access the frequencies alone and the bins alone? Basically I am looking to get: bins - c(1, 2, 3, 4, 5, 6, 8) freq - c(3, 3, 4, 2, 2, 1, 1) When running print(t1[1]) I only get one pair. It seems to be organized that way. Is there a better way? Perhaps 'table' is not the right approach? Thanks a lot, Ralf __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] legend
On Sep 22, 2010, at 1:39 PM, threshold wrote: Hi, there is a function to plot survival curves: library(survival) plot.KM - function(survival, x, x_cut.off, main='', label='') { plot(survfit(survival ~ I(x = x_cut.off)), main=main) legend('bottomleft', c(expression(label = x_cut.off),expression(label x_cut.off))) } This is untested because survival is not an object for the rest of us to survfit: plot.KM - function(survival, x, x_cut.off, main='', label='') { #really kind of worried about whether this part will work, but perhaps you already have this running? plot(survfit(survival ~ I(x = x_cut.off)), main=main) # ## perhaps something like this ... untested d/t no data example legend('bottomleft', labels= bquote( .(label)=.(x_cut.off), .(label).(x_cut.off) ) ) } (And if you have made such a Surv object, perhaps naming it something other than the package name would be less confusing to us ordinary mortals.) -- David. Now, I need to determine as the argument what appears in the legend. I want plot.KM(survival, x, x_cut.off=0.5, main='', label='ABC') but what I get is ('label x_cut.off') in legend instead of 'ABC0.5'. Symbol '=' should be in mathematical form. will appreciate any help. robert -- View this message in context: http://r.789695.n4.nabble.com/legend-tp2550747p2550747.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] legend
On Sep 22, 2010, at 2:58 PM, David Winsemius wrote: On Sep 22, 2010, at 1:39 PM, threshold wrote: Hi, there is a function to plot survival curves: library(survival) plot.KM - function(survival, x, x_cut.off, main='', label='') { plot(survfit(survival ~ I(x = x_cut.off)), main=main) legend('bottomleft', c(expression(label = x_cut.off),expression(label x_cut.off))) } This is untested because survival is not an object for the rest of us to survfit: plot.KM - function(survival, x, x_cut.off, main='', label='') { #really kind of worried about whether this part will work, but perhaps you already have this running? plot(survfit(survival ~ I(x = x_cut.off)), main=main) # ## perhaps something like this ... untested d/t no data example legend('bottomleft', labels= bquote( .(label)=.(x_cut.off), ^^no^^ .(label).(x_cut.off) ) ) (labels is a argument to text() but not to legend(). That should be : legend('bottomleft', legend= bquote( .(label)=.(x_cut.off), .(label).(x_cut.off) ) ) } (And if you have made such a Surv object, perhaps naming it something other than the package name would be less confusing to us ordinary mortals.) -- David. Now, I need to determine as the argument what appears in the legend. I want plot.KM(survival, x, x_cut.off=0.5, main='', label='ABC') but what I get is ('label x_cut.off') in legend instead of 'ABC0.5'. Symbol '=' should be in mathematical form. will appreciate any help. robert David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] problem opening pdf device on Windows 7
On 09/22/2010 05:28 PM, anupam wrote: I can not open PDf device. Acrobat is closed. I have checked archives but could not find a solution. What should I do? cont.cdfplot(myanalysis.pdf, myanalysis$CDF, ylbl.r=Stream Length (km)) Error in pdf(file = pdffile, width = width, height = height) : unable to start device pdf In addition: Warning message: In pdf(file = pdffile, width = width, height = height) : cannot open 'pdf' file argument 'myanalysis.pdf' Then I tried: pdf() Error in pdf() : unable to start device pdf In addition: Warning message: In pdf() : cannot open 'pdf' file argument 'Rplots.pdf' And your working directory is writable for you? Otherwise, Change dir from the File menu to somewhere that is. -- Peter Dalgaard Center for Statistics, Copenhagen Business School Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Multiple Lorenz curves in one diagram
Hi group, I would like to draw multiple Lorenz curves in a single plot using data already prepared. Here is a simple example: require(lawstat) lorenz.curve(c(1,2,3),c(4,5,4)) lorenz.curve(c(1,2,3),c(4,2,1)) This example draws two separate graphs. How can I combine them in a distinguishable way? I tried ?polygon without success... Ralf __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Multiple Lorenz curves in one diagram
On Sep 22, 2010, at 3:32 PM, Ralf B wrote: Hi group, I would like to draw multiple Lorenz curves in a single plot using data already prepared. Here is a simple example: require(lawstat) lorenz.curve(c(1,2,3),c(4,5,4)) #You can get a half-assed solution by separating the two plot calls with: par(new=TRUE) lorenz.curve(c(1,2,3),c(4,2,1), ylab=) # But it still overwrites the annotations, and there is no switch to turn them off. You can easily create a similar plot function with the legend entries commented out. The lorenz.curve code can be accessed just by typing its name. This example draws two separate graphs. How can I combine them in a distinguishable way? I tried ?polygon without success... Ralf David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] legend
On Sep 22, 2010, at 3:03 PM, David Winsemius wrote: On Sep 22, 2010, at 2:58 PM, David Winsemius wrote: On Sep 22, 2010, at 1:39 PM, threshold wrote: Hi, there is a function to plot survival curves: library(survival) plot.KM - function(survival, x, x_cut.off, main='', label='') { plot(survfit(survival ~ I(x = x_cut.off)), main=main) legend('bottomleft', c(expression(label = x_cut.off),expression(label x_cut.off))) } This is untested because survival is not an object for the rest of us to survfit: plot.KM - function(survival, x, x_cut.off, main='', label='') { #really kind of worried about whether this part will work, but perhaps you already have this running? plot(survfit(survival ~ I(x = x_cut.off)), main=main) # ## perhaps something like this ... untested d/t no data example legend('bottomleft', labels= bquote( .(label)=.(x_cut.off), ^^no^^ .(label).(x_cut.off) ) ) (labels is a argument to text() but not to legend(). That should be : legend('bottomleft', legend= bquote( .(label)=.(x_cut.off), .(label).(x_cut.off) ) ) Or not: I stripped out the suspect fitting and plotting with unreproducible code and worked on a more simple function and the above does not work: plot.KM - function(survival, x, x_cut.off, main='', label='') { plot(1~1, main=main);\ L= list(bquote(.(label) = .(x_cut.off)), bquote(.(label) . (x_cut.off) ) ) legend('bottomleft', legend=sapply(L, as.expression)) } The sapply strategy on a list of bquote()-ed expressions was suggested by Gabor Grothendieck to Thomas Lumley on rhelp in 2005 for just such an application. http://finzi.psych.upenn.edu/R/Rhelp02/archive/58403.html -- David. } (And if you have made such a Surv object, perhaps naming it something other than the package name would be less confusing to us ordinary mortals.) -- David. Now, I need to determine as the argument what appears in the legend. I want plot.KM(survival, x, x_cut.off=0.5, main='', label='ABC') but what I get is ('label x_cut.off') in legend instead of 'ABC0.5'. Symbol '=' should be in mathematical form. will appreciate any help. robert David Winsemius, MD West Hartford, CT David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] speeding up regressions using ddply
Why do you want to do this? If there is just a small part of the logistic regression that you are interested in, then there may be a way to compute or approximate that more quickly than doing a full glm fit on every pair. It seems unlikely that you would get much meaning out of that many full regressions, but there may be some piece that you are looking for that getting just that could lend itself to further graphing/analysis. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r- project.org] On Behalf Of Alison Macalady Sent: Wednesday, September 22, 2010 5:05 AM To: r-help@r-project.org Subject: [R] speeding up regressions using ddply Hi, I have a data set that I'd like to run logistic regressions on, using ddply to speed up the computation of many models with different combinations of variables. I would like to run regressions on every unique two-variable combination in a portion of my data set, but I can't quite figure out how to do using ddply. The data set looks like this, with status as the binary dependent variable and V1:V8 as potential independent variables in the logistic regression: m - matrix(rnorm(288), nrow = 36) colnames(m) - paste('V', 1:8, sep = '') x - data.frame( status = factor(rep(rep(c('D','L'), each = 6), 3)), as.data.frame(m)) I used melt to put my data frame into a more workable format require(reshape) xm - melt(x, id = 'status') Here is the basic shape of the function I'd like to apply to every combination of variables in the dataset: h- function(df) { attach(df) log.glm - (glm(status ~ value1+ value2 , family=binomial(link=logit), na.action=na.omit)) #What I can't figure out is how to specify 2 different variables (I've put value1 and value2 as placeholders) from the xm to include in the model glm.summary-summary(log.glm) aic - extractAIC(log.glm) coef - coef(glm.summary) list(Est1=coef[1,2], Est2=coef[3,2], AIC=aic[2]) #or whatever other output here } And then I'd like to use ddply to speed up the computations. require(pplyr) output-dddply(xm, .(variable), as.data.frame.function(h)) output I can easily do this using ddply when I only want to use 1 variable in the model, but can't figure out how to do it with two variables. Many thanks for any hints! Ali Alison Macalady Ph.D. Candidate University of Arizona School of Geography and Development Laboratory of Tree Ring Research __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Unique subsetting question
Oops, yeah I didn't see that. Thanks, Andrew -- View this message in context: http://r.789695.n4.nabble.com/Unique-subsetting-question-tp2550453p2550865.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.