[R] Help With Fleiss Kappa
Hi All, I am using fleiss kappa for inter rater agreement. Are there any know issues with Fleiss kappa calculation in R? Even when I supply mock data with total agreement among the raters I do not get a kappa value of 1. instead I am getting negative values. I am using the irr package version 0.70 Any help is much appreciated. Thanks and Regards M [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] arr.lcol in plotmat from diagram package
Rajesh, A quick reply to your questions concerning the diagram package: To use different colors per arrow: just define arr.col as a Matrix or a data.frame, and give the color number or colorname to each arrow: To use numbers: M - matrix(nrow=4,ncol=4,data=0) M[2,1]-1 ;M[4,2]-2;M[3,4]-3;M[1,3]-4 pp-plotmat(M,pos=c(1,2,1),curve=0.2,name=letters[1:4],lwd=1,box.lwd=2, cex.txt=0.8,arr.type=triangle,box.size=0.1,box.type=rect, box.prop=0.5,arr.len=0.6,arr.col=M) To use color names: arr.col - as.data.frame(M) arr.col[2,1]-red arr.col[4,2]-black arr.col[3,4]-darkgreen arr.col[1,3]-orange pp-plotmat(M,pos=c(1,2,1),curve=0.2,name=letters[1:4],lwd=1,box.lwd=2, cex.txt=0.8,arr.type=triangle,box.size=0.1,box.type=rect, box.prop=0.5, arr.len=0.8,arr.col=arr.col) Karline __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] for (n in SystemResults$EnTime) return EnTime[n] until reaching (all)
Hi r-help-boun...@r-project.org napsal dne 12.07.2009 22:24:29: On Sun, Jul 12, 2009 at 1:05 PM, David Winsemiusdwinsem...@comcast.net wrote: On Jul 12, 2009, at 3:35 PM, David Winsemius wrote: On Jul 12, 2009, at 2:53 PM, Mark Knecht wrote: As a test I tried to print down to the string (all) and then break but this code and everything I've tried so far is terribly wrong. Every attempt prints lots of error messages. I'm not grasping at all what I'm doing wrong or what's the right way to do this sort of thing. Clearly my first for loop isn't a success! for(n in SystemResults$EnTime) { if(SystemResults$EnTime[n] == (all)) break) Inside the loop, shouldn't you be comparing to n?? As you have it now, the values of that factor are probably being used as indices to itself. (Not good.) Also not good is the use of break. It looks to be fairly severely deprecated at this point Appears I am wrong about this. I was basing my assumption on this interaction with the R interpreter: ?break Error in genericForPrimitive(f) : methods may not be defined for primitive function break in this version of R But: ?Control ... suggests that break-ing out of for loops remains acceptable. David Winsemius, MD Heritage Laboratories West Hartford, CT Hi David, Thanks for the response. It is helping. I think the break is required as your suggestion doesn't exit the loop i there is more data like mine. It just skips printing the (all) but incorrectly prints the other copies down lower in the data frame: tf - factor(c(53 , 906 , 919 , 932 , 945 , 958 , 1011 , 1024 , (all), 53 , 906 , 919 , 932 , 945 , 958 , 1011 , 1024 , (all) ) ) for(n in tf ) {if (n != (all)) print(n)} [1] 53 [1] 906 [1] 919 [1] 932 [1] 945 [1] 958 [1] 1011 [1] 1024 [1] 53 [1] 906 [1] 919 [1] 932 [1] 945 [1] 958 [1] 1011 [1] 1024 whereas the else break gets me out: tf - factor(c(53 , 906 , 919 , 932 , 945 , 958 , 1011 , 1024 , (all), 53 , 906 , 919 , 932 , 945 , 958 , 1011 , 1024 , (all) ) ) for(n in tf ) {if (n != (all)) print(n) else break} [1] 53 [1] 906 [1] 919 [1] 932 [1] 945 [1] 958 [1] 1011 [1] 1024 My confusion here is really how the 'n' is being used. I thought it was just an index - a number that gets used inside of the curly braces like other languages I've used. It seems it isn't that at all but really operates as something that returns the actual value of the position in the factor. I was trying to reference the location in the list but for is already returning the value. Strange, but I'm sure there are good reasons. Please note that I anot a prgrammer and have no formal training so it hardly matters what I think! :-) I'm now wondering if I'd be better off to try using ?match to find the first position of (all) instead of using the for loop? If match like that? min(which(tf %in% (all))) Regards Petr returned a number then I think I'd be more comfortable, but maybe I should keep going the way I am. It seems I'm maybe getting the getting the correct answer now but I'm concerned that it's coming back with quotes. I can get around that using print(as.integer(x)) but all this coercion stuff that R is doing is giving me fits. I just don't have my head around it yet. None the less R is already giving me visibility into my data that I've not had before so overall the results are strongly positive. Thanks, Mark __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] simulation in multilevel model
Dear List Members, I am running a random intercept and random slope (2 cross-level interactions) MLM using package lme4 (individuals nested in countries, number of 2nd level units is 21). My DV (member of a trade union or not) is dichotomous, and thus it is a logistic multilevel model. The model converges and the results are significant. I would like to ask what solution can be used, if I want to simulate the model results for a hypothetical individual. More precisely, I am interested in comparing the probabilities of being a trade union member across countries, using a hypothetical individual. For this, I have the values for all the independent variables, but a simple calculation based on the coefficients is not enough, since it does not encompass the random effects in the calculation of the probabilities. What is the solution to estimate these log odds, using the model coefficients and still including tha random effects? Thank you, -- Zoltan Fazekas MA Political Science Department CEU, Budapest [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Problems in plotting with abline
Dear R-users, I am using R(a package igraph) to calculate certain topological features of networks. When I try to draw a plot between these features I get an error. Following is the code I am using : * plot(met_eco_deg,met_eco_bet) lmout-lm(met_eco_bet ~ met_eco_deg) abline(lmout) Error in int_abline(a = a, b = b, h = h, v = v, untf = untf, ...) : plot.new has not been called yet* *met_eco_deg* and *met_eco_bet* are the two objects generated from igraph package. I don't get any error when I just use the plot function. Can anyone help me out ?Thanks in advance. Regards, Anupam Sinha [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problems in plotting with abline
Thanks a million for your suggestion. It works. On Mon, Jul 13, 2009 at 2:29 PM, Alain Guillet alain.guil...@uclouvain.bewrote: Hello, The error message says there is no opened graphical windows. You cannot use abline if there is no open graphical windows. Try this: plot(met_eco_deg,met_eco_bet) abline(lmout) Regards, Alain anupam sinha wrote: Dear R-users, I am using R(a package igraph) to calculate certain topological features of networks. When I try to draw a plot between these features I get an error. Following is the code I am using : * plot(met_eco_deg,met_eco_bet) lmout-lm(met_eco_bet ~ met_eco_deg) abline(lmout) Error in int_abline(a = a, b = b, h = h, v = v, untf = untf, ...) : plot.new has not been called yet* *met_eco_deg* and *met_eco_bet* are the two objects generated from igraph package. I don't get any error when I just use the plot function. Can anyone help me out ?Thanks in advance. Regards, Anupam Sinha [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Alain Guillet Statistician and Computer Scientist SMCS - Institut de statistique - Université catholique de Louvain Bureau d.126 Voie du Roman Pays, 20 B-1348 Louvain-la-Neuve Belgium tel: +32 10 47 30 50 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Odp: Problems in plotting with abline
Hi r-help-boun...@r-project.org napsal dne 13.07.2009 10:45:50: Dear R-users, I am using R(a package igraph) to calculate certain topological features of networks. When I try to draw a plot between these features I get an error. Following is the code I am using : * plot(met_eco_deg,met_eco_bet) lmout-lm(met_eco_bet ~ met_eco_deg) abline(lmout) Error in int_abline(a = a, b = b, h = h, v = v, untf = untf, ...) : plot.new has not been called yet* *met_eco_deg* and *met_eco_bet* are the two objects generated from igraph package. I don't get any error when I just use the plot function. Can anyone help me out ?Thanks in advance. I do not have any experience in igraph but the error from abline states that plot is not initialised. I suspected that igraph is based on grid graphics and in this case you can not easily mix base and grid graphics, however from docs it seems that plain igraph plots are based on base graphics therefore it shall work. If your syntax is really only those 3 lines one after another do you see a plot after plot(met_eco_deg,met_eco_bet)? If not this is the problem. If after plot some plot appears and abline issues an error, than the problem is not so simple and without reproducible example it would be quite tricky to find out what is going on. Regards Petr Regards, Anupam Sinha [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Combine two matricies
Hi, I have two matricies a and x: a-matrix(c(3,4,5,2,3,4,1,1,2), nrow=3, ncol=3) [,1] [,2] [,3] [1,]321 [2,]431 [3,]542 x-matrix(c(3, NA, NA, NA, 2, 5, NA, 2, 2), nrow=3, ncol=3) [,1] [,2] [,3] [1,]3 NA NA [2,] NA22 [3,] NA52 I wish to combine these two into one matrix using the values from x where x has values, and values from a where x has NA's, giving a new matrix which would look like this: ax-matrix(c(3,4,5,2,2,5,1,2,2), nrow=3, ncol=3) [,1] [,2] [,3] [1,]321 [2,]422 [3,]552 I want an automatic way of doing this as my actual application is a much larger matrix. Thanks in advance Tom _ [[elided Hotmail spam]] [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Combine two matricies
try this: a - matrix(c(3,4,5,2,3,4,1,1,2), nrow=3, ncol=3) x - matrix(c(3, NA, NA, NA, 2, 5, NA, 2, 2), nrow=3, ncol=3) ind - is.na(x) x[ind] - a[ind] x I hope it helps. Best, Dimitris Tom Liptrot wrote: Hi, I have two matricies a and x: a-matrix(c(3,4,5,2,3,4,1,1,2), nrow=3, ncol=3) [,1] [,2] [,3] [1,]321 [2,]431 [3,]542 x-matrix(c(3, NA, NA, NA, 2, 5, NA, 2, 2), nrow=3, ncol=3) [,1] [,2] [,3] [1,]3 NA NA [2,] NA22 [3,] NA52 I wish to combine these two into one matrix using the values from x where x has values, and values from a where x has NA's, giving a new matrix which would look like this: ax-matrix(c(3,4,5,2,2,5,1,2,2), nrow=3, ncol=3) [,1] [,2] [,3] [1,]321 [2,]422 [3,]552 I want an automatic way of doing this as my actual application is a much larger matrix. Thanks in advance Tom _ [[elided Hotmail spam]] [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Dimitris Rizopoulos Assistant Professor Department of Biostatistics Erasmus University Medical Center Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands Tel: +31/(0)10/7043478 Fax: +31/(0)10/7043014 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Combine two matricies
On Mon, Jul 13, 2009 at 11:31 AM, Tom Liptrot tomlipt...@hotmail.com wrote: I wish to combine these two into one matrix using the values from x where x has values, and values from a where x has NA's, giving a new matrix which would look like this: This should do the trick: x[which(is.na(x))]=a[which(is.na(x))] -- Michael Knudsen micknud...@gmail.com http://www.google.com/profiles/micknudsen __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Combine two matricies
The following code should do it. This assumes matrices a and x are of the same dimension which is why you can index a as below x[is.na(x)==TRUE]-a[is.na(x)==TRUE] -- View this message in context: http://www.nabble.com/Combine-two-matricies-tp24458609p24458797.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Add grand mean to every entry in a matrix
Hi, I have a matrix: mymat - matrix(runif(10*4), ncol=4) I wish to subtract the column means, down the colums, subtract the row means from the rows and add back the grand mean - all the means should be the means of mymat rather than of the new matrix. How can I do this? Any help much appreciated. Thanks Tom _ [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Odp: Combine two matricies
Hi matrix is virtually a vector so you can find index of values of x and add those values to propper places in ax ax - a idx - which(!is.na(x)) ax[idx] - x[idx] ax [,1] [,2] [,3] [1,]321 [2,]422 [3,]552 Regards Petr r-help-boun...@r-project.org napsal dne 13.07.2009 11:31:12: Hi, I have two matricies a and x: a-matrix(c(3,4,5,2,3,4,1,1,2), nrow=3, ncol=3) [,1] [,2] [,3] [1,]321 [2,]431 [3,]542 x-matrix(c(3, NA, NA, NA, 2, 5, NA, 2, 2), nrow=3, ncol=3) [,1] [,2] [,3] [1,]3 NA NA [2,] NA22 [3,] NA52 I wish to combine these two into one matrix using the values from x where x has values, and values from a where x has NA's, giving a new matrix which would look like this: ax-matrix(c(3,4,5,2,2,5,1,2,2), nrow=3, ncol=3) [,1] [,2] [,3] [1,]321 [2,]422 [3,]552 I want an automatic way of doing this as my actual application is a much larger matrix. Thanks in advance Tom _ [[elided Hotmail spam]] [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Add grand mean to every entry in a matrix
On Mon, 2009-07-13 at 09:06 +, Tom Liptrot wrote: Hi, I have a matrix: mymat - matrix(runif(10*4), ncol=4) I wish to subtract the column means, down the colums, subtract the row means from the rows and add back the grand mean - all the means should be the means of mymat rather than of the new matrix. How can I do this? Any help much appreciated. See ?sweep as one way to remove a set of statistics from a matrix. To compute the statistics to sweep, you should look at ?rowMeans and ?colMeans, plus ?mean (for the overall mean of the matrix). The function below encapsulates the various steps that will do the manipulation for you. I've added a conversion for the input object if it is a data frame as mean() works differently on a matrix compared to a df. (Your actual data is more likely to be in a data frame than a matrix initially.) set.seed(1) mymat - matrix(runif(10*4), ncol=4) grandMean - function(m) { if((df - is.data.frame(m))) m - data.matrix(m) rm - rowMeans(m) cm - colMeans(m) gm - mean(m) m - sweep(m, 1, rm, -) # 1 means from the rows m - sweep(m, 2, cm, -) # 2 means from the cols m - m + gm # here we treat m as a vector if(df) # return to a data.frame if one originally m - as.data.frame(m) return(m) } grandMean(mymat) grandMean(as.data.frame(mymat)) HTH G Thanks Tom -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] How to perform a Likelihood-ratio test?
Hi. I would like to use a likelihood-ratio test to compare a linear model (lm()) to another linear model containing the first one to see if the extra factors are needed - but I was unable to find any help on how to do that. Could you help me and tell me what to do? Or tell me where to find help on this topic? Many thanks in advance! Lars _ s. It's easy! aspxmkt=en-us [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Bug in package.skeleton, R 2.9.0?
Dear all, I am using package.skeleton to build a small packages of misc function for personal use. I have recently discovered that the option force=TRUE doesn't seem to do what is meant to do. Here's what I'm doing: setwd(/Users/danielk/Documents/R/packages/dk) files - paste(codebase, dir(codebase, pattern=.R), sep=/) package.skeleton(name=dk, force=TRUE, code_files=files) Creating directories ... Creating DESCRIPTION ... Creating Read-and-delete-me ... Copying code files ... Making help files ... Done. Further steps are described in './dk/Read-and-delete-me'. Now, everything seems fine, but changes to files in me codebase folder, doesn't come along if the folder dk/R already contains the files, even though I use force=TRUE. If I remove the dk/R folder or the dk folder altogether, the changes come along so to me it seems that it's the overwrite part that doesn't work as it should - or am I doing something wrong here? To me, it seems that the function safe.dir.create (which is defined in package.skeleton never overwrites folders, yielding force=TRUE useless. See below for sessionInfo. Thanks a bunch Daniel sessionInfo() R version 2.9.0 (2009-04-17) i386-apple-darwin8.11.1 locale: en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets methods base loaded via a namespace (and not attached): [1] tools_2.9.0 -- Contact information: Daniel Klevebring M. Sc. Eng., Ph.D. Student Dept of Gene Technology Royal Institute of Technology, KTH SE-106 91 Stockholm, Sweden Visiting address: Roslagstullsbacken 21, B3 Delivery address: Roslagsvägen 30B, 104 06, Stockholm Invoice address: KTH Fakturaserice, Ref DAKL KTHBIO, Box 24075, SE-10450 Stockholm E-mail: dan...@biotech.kth.se Phone: +46 8 5537 8337 (Office) Phone: +46 704 71 65 91 (Mobile) Web: http://www.biotech.kth.se/genetech/index.html Fax: +46 8 5537 8481 -- Contact information: Daniel Klevebring M. Sc. Eng., Ph.D. Student Dept of Gene Technology Royal Institute of Technology, KTH SE-106 91 Stockholm, Sweden Visiting address: Roslagstullsbacken 21, B3 Delivery address: Roslagsvägen 30B, 104 06, Stockholm Invoice address: KTH Fakturaserice, Ref DAKL KTHBIO, Box 24075, SE-10450 Stockholm E-mail: dan...@biotech.kth.se Phone: +46 8 5537 8337 (Office) Phone: +46 704 71 65 91 (Mobile) Web: http://www.biotech.kth.se/genetech/index.html Fax: +46 8 5537 8481 MSN messenger: klevebr...@msn.com [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] pbc data
Hi there, Can anyone please help me because I am going to get crazy with the pbc data set. I just want to apply simple cox regression in the data set. I am a beginner in R but I don't think I am doing anything wrong. I have the book of Fleming and Harrington 1990. I perform cox regression by typing: out- coxph(Surv(times/365,status)~log(bili)+log(proth)+edema+log(albumin)+age) out Call: coxph(formula = Surv(times/365, status) ~ log(bili) + log(proth) + edema + log(albumin) + age) coef exp(coef) se(coef) z p log(bili) 0.8636 2.3716 0.08294 10.41 0.0e+00 log(proth) 2.3868 10.8791 0.76851 3.11 1.9e-03 edema 0.8963 2.4505 0.27141 3.30 9.6e-04 log(albumin) -2.5069 0.0815 0.65292 -3.84 1.2e-04 age 0.0396 1.0404 0.00767 5.16 2.4e-07 Likelihood ratio test=231 on 5 df, p=0 n=416 (2 observations deleted due to missingness) Which is not exactly what fleming presents in table 4.6.3 page 195. Edema coef is 0.8592. I searched the net (please google: modern regression methods for survival data stockholm) and there some slides there. The results of the slides are again slightly different and the analysis seems to be in R. So we all disagree. (I notice that 2 observation are excluded in my analysis, but not in the slides' analysis) Also I check the age column, in the book the first two elemets are, 58.7652 and 56.4463 while in R 58.80548 56.48493. Age looks totally different. Can anyone please help me. Also in the article Survival model predictive accuracy and ROC curves , which is free just google it, in page 27 the results seem to be different again. (caution with log(alb) or alb). Does anyone has any information about this data set that clarifies things for me?? Leo. ___ ×ñçóéìïðïéåßôå Yahoo!; ÂáñåèÞêáôå ôá åíï÷ëçôéêÜ ìçíýìáôá (spam); Ôï Yahoo! Mail äéáèÝôåé ôçí êáëýôåñç äõíáôÞ ðñïóôáóßá êáôÜ ôùí åíï÷ëçôéêþí ìçíõìÜôùí http://login.yahoo.com/config/mail?.intl=gr [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] zero cells in one variable in logistic regression
Dear all. I am sort of beginner with R. I do logistic regression with binomial response variable and several continuous and categorical variables. In one categorical variable, zero cell occures (2x2 table looks like 7 - 0 23 - 25 This leads to overestimating of odds ratio and inflated confidence interval for odds for given variable. The variable is significant in univariate test. I do not necessarilly need odd ratio, but I need the explained deviance by this variable and I really want to keep this variable in the model. It probably matters for explained deviance. How to treat this problem? Thanks for help, Anna Bucharova -- View this message in context: http://www.nabble.com/zero-cells-in-one-variable-in-logistic-regression-tp24458629p24458629.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Times series adjustment
Dear all, I want make correction depth of a bathymetric data set. To do so, I have the depth data set sample every second (a depth at each second) in one hand, and in the other hand, I have a tide variation level data set sample every 250 ms. The time register in each data sets (tide and bathymetric) is express in seconds followinf this format : hh:mm:ss.ss I would like to rectify the depth (bathymetry) by substrating the tide at each depth time. I tried to construct time series using pastecs package but didn't succed Example of the tide data set: Time Correction 11:55:07.58 -2.64 11:55:07.68 -2.64 11:55:07.88 -2.64 11:55:08.28 -2.64 11:55:08.38 -2.64 Example of the bathymetric data: XY Date Time Depth -2.620748 48.75121 01/00/00 12:07:16.000 -25.6 -2.620714 48.75121 01/00/00 12:07:17.000 -25.8 -2.620698 48.75121 01/00/00 12:07:17.000 -26.0 -2.620682 48.75121 01/00/00 12:07:18.000 -26.1 -2.620667 48.75121 01/00/00 12:07:18.000 -26.1 How can I construct the time series based on that two data sets allowing me then to to the depth rectification ? Regards -- Cordialement Emmanuel Poizot Cnam/Intechmer B.P. 324 50103 Cherbourg Cedex Phone (Direct) : (00 33)(0)233887342 Fax : (00 33)(0)233887339 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] zero cells in one variable in logistic regression
On Jul 13, 2009, at 5:37 AM, anna.bucharova wrote: Dear all. I am sort of beginner with R. I do logistic regression with binomial response variable and several continuous and categorical variables. In one categorical variable, zero cell occures (2x2 table looks like 7 - 0 23 - 25 This leads to overestimating of odds ratio and inflated confidence interval for odds for given variable. The variable is significant in univariate test. I do not necessarilly need odd ratio, but I need the explained deviance by this variable and I really want to keep this variable in the model. It probably matters for explained deviance. How to treat this problem? Thanks for help, Anna Bucharova -- You might consider glmrob in package:robustbase. See http://www.jstatsoft.org/v10/i04 David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Times series adjustment
Try this: # First read in the data using chron times class Lines.Bary - XY Date Time Depth -2.620748 48.75121 01/00/00 12:07:16.000 -25.6 -2.620714 48.75121 01/00/00 12:07:17.000 -25.8 -2.620698 48.75121 01/00/00 12:07:17.000 -26.0 -2.620682 48.75121 01/00/00 12:07:18.000 -26.1 -2.620667 48.75121 01/00/00 12:07:18.000 -26.1 Lines.Depth - Time Correction 11:55:07.58 -2.64 11:55:07.68 -2.64 11:55:07.88 -2.64 11:55:08.28 -2.64 11:55:08.38 -2.64 library(chron) DF.Depth - read.table(textConnection(Lines.Depth), header = TRUE, as.is = TRUE) DF.Depth$Time - times(DF.Depth$Time) DF.Bary - read.table(textConnection(Lines.Bary), header = TRUE, as.is = TRUE) DF.Bary$Time - times(DF.Bary$Time) # Now perform the calculation using findInterval DF.Bary$Depth - DF.Bary$Depth + DF.Depth[findInterval(DF.Bary$Time, DF.Depth$Time), Correction] On Mon, Jul 13, 2009 at 6:11 AM, Poizot Emmanuelemmanuel.poi...@cnam.fr wrote: Dear all, I want make correction depth of a bathymetric data set. To do so, I have the depth data set sample every second (a depth at each second) in one hand, and in the other hand, I have a tide variation level data set sample every 250 ms. The time register in each data sets (tide and bathymetric) is express in seconds followinf this format : hh:mm:ss.ss I would like to rectify the depth (bathymetry) by substrating the tide at each depth time. I tried to construct time series using pastecs package but didn't succed Example of the tide data set: Time Correction 11:55:07.58 -2.64 11:55:07.68 -2.64 11:55:07.88 -2.64 11:55:08.28 -2.64 11:55:08.38 -2.64 Example of the bathymetric data: X Y Date Time Depth -2.620748 48.75121 01/00/00 12:07:16.000 -25.6 -2.620714 48.75121 01/00/00 12:07:17.000 -25.8 -2.620698 48.75121 01/00/00 12:07:17.000 -26.0 -2.620682 48.75121 01/00/00 12:07:18.000 -26.1 -2.620667 48.75121 01/00/00 12:07:18.000 -26.1 How can I construct the time series based on that two data sets allowing me then to to the depth rectification ? Regards -- Cordialement Emmanuel Poizot Cnam/Intechmer B.P. 324 50103 Cherbourg Cedex Phone (Direct) : (00 33)(0)233887342 Fax : (00 33)(0)233887339 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Partial Correlation
Why do we get Partial correlation values greater than 1? I have used the default function pcor.mat :-- I have manipulated the default pcor.mat function a bit so ignore tha variables corr_type,element1_in_no,element2_in_no,P.Please ignore the “pairwise” section and have a look at athe “listwise ” part i.e else part. *pcor.mat - function(x,y,z,method=p,na.rm=T,corr_type,element1_in_no,element2_in_no,P){ * ** *print(pcor.mat)* *x - c(x)* *y - c(y)* *z - as.data.frame(z)* *print(z)* *#print(element1_in_no)* *#print(element1_in_no)* ** * * *if(dim(z)[2] == 0){ * *stop(There should be given data\n)* *}* * * *data - data.frame(x,y,z)* ** ** *if(corr_type==pairwise)* *{* ** *print(inside pairwise)* *rxx.z -P[as.numeric(element1_in_no),as.numeric(element2_in_no)]* *#print(rxx.z)* *#print(rxx.z)* ** *return(rxx.z)* * * *}* *else* *{* *print(inside listwise)* *if(na.rm == T){* *data = na.omit(data)* *}* ** *xdata - na.omit(data.frame(data[,c(1,2)])) #i1,C1* *print(printing xdata...)* *print(xdata)* *Sxx - cov(xdata,xdata,m=method)* * print(Sxx...)* *print(Sxx )* * * *xzdata - na.omit(data)* *xdata - data.frame(xzdata[,c(1,2)])* *zdata - data.frame(xzdata[,-c(1,2)])* *print(zdata..)* *print(zdata)* *Sxz - cov(xdata,zdata,m=method)* * print(Sxz. )* *print(Sxz)* * * *zdata - na.omit(data.frame(data[,-c(1,2)]))* *Szz - cov(zdata,zdata,m=method)* *print(Szz )* *print(Szz)* *#print(new type par corr)* *#P-partialCorr_matrix(data)* ** *}* ** ** *# is Szz positive definite?* *zz.ev - eigen(Szz)$values* *if(min(zz.ev)[1]0){* ** *stop(\'Szz\' is not positive definite!\n) * *}* ** ** *# partial correlation* *Sxx.z - Sxx - Sxz %*% solve(Szz) %*% t(Sxz)* *print(Sxx.z)* *print(qr(Sxx.z))* *rxx.z - cov2cor(Sxx.z)[1,2]* ** *return(rxx.z)* *}* *Cov2cor function:-* *cov2cor-function (V) * *{* * print(inside cov2cor)* * * * * * p - (d - dim(V))[1]* *if (!is.numeric(V) || length(d) != 2L || p != d[2L]) * *stop('V' is not a square numeric matrix)* *Is - sqrt(1/diag(V))* *print(Is)* *print(Is)* *if (any(!is.finite(Is))) * *warning(diag(.) had 0 or NA entries; non-finite result is doubtful)* *r - V* *r[] - Is * V * rep(Is, each = p)* * print(r)* * print(r[])* * * *r[cbind(1L:p, 1L:p)] - 1* *r* ** *}* * * Sxx , Sxz , Szz all these three values I have calculated and they match with SPSS result. After that I am not understanding why my partial correlation result doesn’t match with the SPSS result. Sxx.z - Sxx - Sxz %*% solve(Szz) %*% t(Sxz) # how to calculate in SPSS.After this as you can see in pcor.mat function rxx.z - cov2cor(Sxx.z)[1,2] is computed. Don’t know where things are going wrong.** *I will attach the datas of categories and items which form the categories.* *Category 1 has items -1,5 * *Category 2 has items – 2,4,6,7,8,9,11* *Category 3 has items -3,12,14* *Category 4 has items -10,13,15* * Category values are actually mean of the items which form the category.* Here in pcor.mat
Re: [R] Splitting dataset for Tuning Parameter with Cross Validation
Seems to me if splitting once for all the bias will be big and if splitting once for each choice of parameters the variance ill be big. In LibSVM, for each choice of (c, gamma), the searching script grid.py calls svm_cross_validation() which has a random split of the dataset. So seems to me it is the second method. As to the first one, I come to it in Ch 7 Section 10 of The Elements of Statistical Learning by Hastie where it says first split the dataset, then evaluate validation error CV(alpha) and vary the complexity parameter value alpha to find the one giving smallest validation error. It appears to me the splitting is once for all choices of the complexity parameter. Thanks! --- On Sun, 7/12/09, Tim timlee...@yahoo.com wrote: From: Tim timlee...@yahoo.com Subject: [R] Splitting dataset for Tuning Parameter with Cross Validation To: r-h...@stat.math.ethz.ch Date: Sunday, July 12, 2009, 6:58 PM Hi, My question might be a little general. I have a number of values to select for the complexity parameters in some classifier, e.g. the C and gamma in SVM with RBF kernel. The selection is based on which values give the smallest cross validation error. I wonder if the randomized splitting of the available dataset into folds is done only once for all those choices for the parameter values, or once for each choice? And why? Thanks and regards! __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Help With Fleiss Kappa
If you apply the function to a simple dataframe or show your code, you might be able to get more accurate help. I've used the IRR package in the past and haven't noticed any problems (maybe I overlooked them ?) david freedman mehdi ebrahim wrote: Hi All, I am using fleiss kappa for inter rater agreement. Are there any know issues with Fleiss kappa calculation in R? Even when I supply mock data with total agreement among the raters I do not get a kappa value of 1. instead I am getting negative values. I am using the irr package version 0.70 Any help is much appreciated. Thanks and Regards M [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://www.nabble.com/Help-With-Fleiss-Kappa-tp24456963p24461821.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Fliess Kappa
Hello Everyone I am calculating Fleiss Kappa, I have 28 raters, 5 Subjects and 5 ratings (i.e. A dataset with 28 columns and 5 rows). The problem is that there are 2 missing values in the data. (Right now I have manually deleted the values in csv file and then imported it in R 1) Would it better to replace those with 0 or should those be omitted? By omission I will be left with only 26 raters. 2) Also when we use na.omit(objectname) it deleted missing values row-wise, how can data be omitted column-wise? 3) My second problem is that overall agreement comes to zero, whereas the data is not showing agreement to be close to zero. The ratings 4 and 5 are being given maximum time (the ratings are 1: total disagreement and 5: Total Agreement) whereas for 1 and 2 its very less. Results I got using agree() agree(demoA) Percentage agreement (Tolerance=0) Subjects = 5 Raters = 26 %-agree = 0 can any1 point where I might be wrong? 4) Can we calculate percentage of agreement for each subject? If yes how? Please help me for these queries Thank you very much in advance Regards Sunita -- View this message in context: http://www.nabble.com/Fliess-Kappa-tp24460901p24460901.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] R in rearranging equations
Hi, can anyone tell me if R can be used to rearrange very complicated equations in terms of one of the variables? I have: dx/dt = a*b*m*y*(1-x)-r*x and, having set: dy/dx = 0, need to rearrange in terms of x. The problem I have is that I don't know how to rearrange equations when the variables are not yet defined (I get messages warning me that x etc can't be found). I'm not sure if R can actually do this (though I know programs like Maxima can) but ideally the solution we get should be in R. I am also aware that this isn't difficult to solve by hand, but the remaining equations I have to work through have up to double the number of variables. Any help will be appreciated! Stenort -- View this message in context: http://www.nabble.com/R-in-rearranging-equations-tp24461619p24461619.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] subset range selections
How would I write the two selections each in a single subset command? 1) Two non-overlapping time ranges I want to collect together - before 10AM and after noon. Should be an OR function: X = subset(A, t1000) + subset(A, t1200) 2) One range between two defined times like after 10AM and before noon. Should be an AND function: X1 = subset(A, t1000) X = subset(X1, t1200) Thanks, Mark P.S. - The help system seems very difficult for finding this sort of information! __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Anomaly in sample() function
Hi all Maybe someone knows a way to solve this anomaly in sample(): I like to compute a sample (n=100) with replications from a population of 2500 units but if I draw repeated samples from it I dont get what seems to be a representative sample if I look at other partitions of the population. Enclosed is the population g99 with 4 columns: (units, partition 1 (site), partition 2 (type), weights); and my R program. The problem: Some categories from partition 2 (type) which I use to check for representativeness, deviates up to 20 percentage points from the population. As criterion I have computed the mean difference and the SD of the relative frequencies between sample and pop. What mean deviation is to expect? Thanks for any ideas, W. Polasek dimnames(g99)[[1]] =paste(g99[,1]) s1= g99[paste(sample(g99[,1], 100, F, g99[,4])),1:4] dim(s1) j2 =table(s1[,3]) #sample density j2g= table(g99[,3]) #pop density chisq.test(j2g,j2) p2=100*j2g / sum(j2g) #rel. frequency in pop pd=p2-100* j2/sum(j2) #difference of rel. frequency between pop and sample round(rbind(j2g, p2, pd),2) sum(abs(pd));sd(pd) #look for the 'best' representative sample __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] 95% Confidence Intervals for AUC - $auc.samples from the Daim Package
Hi I am trying to perform a bootstrap estimate of classification accuracy of a logistic regression using the 'Daim' package in r using the code at the bottom of this post, this all works great and I get the .632+ misclassification accuracy, specificity, sensitivity, AUC etc etc but what I would like is to access the list of AUC for each of the bootstrap samples as I need calculate the 95% confidence intervals for the AUC of the ROC curve for this data using this model. I was hoping to get this from the 2.5% and 97.5% quartiles of the AUCs of the 10,000 bootstrap samples that were calculated. I have set returnSample to TRUE which I think should mean the data from each bootstrap is saved and when I run auc(x) I get the following auc(x,auc.samples) $auc.632p [1] 0.8296788 $auc.632 [1] 0.8302014 $auc.loob [1] 0.8216521 $auc.app [1] 0.8455208 $auc.samples list() my question is how do I access the $auc.samples list() or write it to a csv file or something similar to get the 2.5% and 96.5% quartile range from? or is there another way to get the 95% CI for the AUC? Many thanks in advance Lara lara.har...@bbsrc.ac.uk library(Daim) mylda - function(formula,train,test){ model - lda(formula,train) predict(model,test)$posterior[,pos] } x-Daim(OUTBREAK2 ~ COHESION.2 + COHESION.23 + ED.2 + PLAND.12 + PLAND.2 + PLAND.25 + PLAND.26 + ALTITUDE_MEAN + SLOPE_MEAN + LSI, model=mylda,data=landscape,labpos=1, control=Daim.control(method=boot,number=1),returnSample=TRUE, cutoff=0.71) auc(x) summary(x) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] subset range selections
1) X - subset(A, (t 1000) | (t 1200)) 2) X - subset(A, (t 1000) (t 1200)) On Mon, Jul 13, 2009 at 9:47 AM, Mark Knechtmarkkne...@gmail.com wrote: How would I write the two selections each in a single subset command? 1) Two non-overlapping time ranges I want to collect together - before 10AM and after noon. Should be an OR function: X = subset(A, t1000) + subset(A, t1200) 2) One range between two defined times like after 10AM and before noon. Should be an AND function: X1 = subset(A, t1000) X = subset(X1, t1200) Thanks, Mark P.S. - The help system seems very difficult for finding this sort of information! __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R in rearranging equations
rSymPy and Ryacas can do algebraic manipulation. See http://rsympy.googlecode.com http://ryacas.googlecode.com library(rSymPy) Loading required package: rJava sympy(var('a b m r y x')) [1] (a, b, m, r, y, x) sympy(solve(a*b*m*y*(1-x)-r*x, x)) [1] [-a*b*m*y/(-r - a*b*m*y)] library(Ryacas) a - Sym(a); b - Sym(b); m - Sym(m); r - Sym(r) x - Sym(x); y - Sym(y) Solve(a*b*m*y*(1-x)-r*x, x) [1] Starting Yacas! expression(list(x == a * b * m * y/(a * b * m * y + r))) On Mon, Jul 13, 2009 at 9:23 AM, stenortsten...@merchantsconnected.org.uk wrote: Hi, can anyone tell me if R can be used to rearrange very complicated equations in terms of one of the variables? I have: dx/dt = a*b*m*y*(1-x)-r*x and, having set: dy/dx = 0, need to rearrange in terms of x. The problem I have is that I don't know how to rearrange equations when the variables are not yet defined (I get messages warning me that x etc can't be found). I'm not sure if R can actually do this (though I know programs like Maxima can) but ideally the solution we get should be in R. I am also aware that this isn't difficult to solve by hand, but the remaining equations I have to work through have up to double the number of variables. Any help will be appreciated! Stenort -- View this message in context: http://www.nabble.com/R-in-rearranging-equations-tp24461619p24461619.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] ?max (so far...)
Belated answer: A few remarks regarding your questions: Your running max problem could be solved in the following way: (which is a soution based o Duncan Murdoch's suggestion, but a little bit more general. foldOrbit-function(x,fun){ res-numeric(length(x)) res[1]-x[1] for (i in 2:length(x)) res[i]-fun(res[i-1],x[i]) res } or more generally applySliding-function(x,fun,winlength=length(x)){ res-numeric(length(x)) for (i in seq_along(x)) {res[i]-fun(x[(max(1,i-winlength+1)):i])} res } foldOrbit(x,max) will give you the running maxes of vector x. For max, taking the max of the max of the sequence without the last element and the last element gives the max of the whole sequence. It also works for min, sum, prod (all these are associative). applySliding is more general. The second argument is the function you want to apply in running mode. If you do not give the winlength, it will apply the function in running mode an give correct result for nonassociatve functions also. If you give the winlength, it will only use the last winlength elements of the vector. Examples: foldOrbit(1:10,max) applySliding(1:10,max) applySliding(1:10,max,3) And now, for something completely different: You seem to want to combine Excel and R in you work. Possibly you can make your work easier if you user RExcel, which is an add-in allowing to use R from within Excel. Information is available at rcom.univie.ac.at and there is (half hour long) video demonstrating how to use R from within Excel. On Jul 1, 2009, at 10:26 PM, Mark Knecht wrote: On Wed, Jul 1, 2009 at 12:54 PM, Duncan Murdochmurd...@stats.uwo.ca wrote: On 01/07/2009 1:26 PM, Mark Knecht wrote: On Wed, Jul 1, 2009 at 9:39 AM, Duncan Murdochmurd...@stats.uwo.ca wrote: On 01/07/2009 11:49 AM, Mark Knecht wrote: Hi, I have a data.frame that is date ordered by row number - earliest date first and most current last. I want to create a couple of new columns that show the max and min values from other columns *so far* - not for the whole data.frame. It seems this sort of question is really coming from my lack of understanding about how R intends me to limit myself to portions of a data.frame. I get the impression from the help files that the generic way is that if I'm on the 500th row of a 1000 row data.frame and want to limit the search max does to rows 1:500 I should use something like [1:row] but it's not working inside my function. The idea works outside the function, in the sense I can create tempt1[1:7] and the max function returns what I expect. How do I do this with row? Simple example attached. hp should be 'highest p', ll should be 'lowest l'. I get an error message Error in 1:row : NA/NaN argument Thanks, Mark SNIP HighLow = function (MyFrame) { temp1 - MyFrame$p[1:row] MyFrame$hp - max(temp1) ## Highest p temp1 - MyFrame$l[1:row] MyFrame$ll - min(temp1) ## Lowest l return(MyFrame) } You get an error in this function because you didn't define row, so R assumes you mean the function in the base package, and 1:row doesn't make sense. What you want for the highest so far is the cummax (for cumulative maximum) function. See ?cummax. Duncan Murdoch Duncon, OK, thanks. That makes sense, as long as I want the cummax from the beginning of the data.frame. (Which is exactly what I asked for!) How would I do this in the more general case if I was looking for the cummax of only the most recent 50 rows in my data.frame? What I'm trying to get down to is that as I fill in my data.frame I need to be able get a max or min or standard deviation of the previous so many rows of data - not the whole column - and I'm just not grasping how to do this. Is seems like I should be able to create a data set that's only a portion of a column while I'm in the function and then take the cummax on that, or use it as an input to a standard deviation, etc.? What you describe might be called a running max. The caTools package has a runmax function that probably does what you want. More generally, you can always write a loop. They aren't necesssrily fast or elegant, but they're pretty general. For example, to calculate the max of the previous 50 observations (or fewer near the start of a vector), you could do x - ... some vector ... result - numeric(length(x)) for (i in seq_along(x)) { result[i] - max( x[ max(1, i-49):i ]) } Duncan Murdoch Thanks for the pointer. I'll check it out. Today I've managed to get pretty much all of my Excel spreadsheet built in R except for some of the charts. It took me a week and a half in Excel. This is my 3rd full day with R. Charts are next. I appreciate your help and the help I've gotten from others. Thanks so much. cheers, Mark __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE
Re: [R] Help With Fleiss Kappa
At 6:48 PM +1200 7/13/09, mehdi ebrahim wrote: Hi All, I am using fleiss kappa for inter rater agreement. Are there any know issues with Fleiss kappa calculation in R? Even when I supply mock data with total agreement among the raters I do not get a kappa value of 1. instead I am getting negative values. I am using the irr package version 0.70 Any help is much appreciated. Mehdi, Are you by any chance giving the function the agreement matrix rather than the raw data? The kappam.fliess function seems to want the ratings rather than the agreement matrix. Bill -- William Revelle http://personality-project.org/revelle.html Professor http://personality-project.org/personality.html Department of Psychology http://www.wcas.northwestern.edu/psych/ Northwestern University http://www.northwestern.edu/ Attend ISSID/ARP:2009 http://issid.org/issid.2009/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] zero cells in one variable in logistic regression
dear anna, if you are not interested in point estimate and SE of the parameter of the aforementioned categorical variable, I believe the conventional glm(..,family=binomial) is correct. In particular, the returned deviance is reliable and also it is the relevant likelihood ratio test.. hope this helps, vito David Winsemius ha scritto: On Jul 13, 2009, at 5:37 AM, anna.bucharova wrote: Dear all. I am sort of beginner with R. I do logistic regression with binomial response variable and several continuous and categorical variables. In one categorical variable, zero cell occures (2x2 table looks like 7 - 0 23 - 25 This leads to overestimating of odds ratio and inflated confidence interval for odds for given variable. The variable is significant in univariate test. I do not necessarilly need odd ratio, but I need the explained deviance by this variable and I really want to keep this variable in the model. It probably matters for explained deviance. How to treat this problem? Thanks for help, Anna Bucharova -- You might consider glmrob in package:robustbase. See http://www.jstatsoft.org/v10/i04 David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 6626240 fax: 091 485726/485612 http://dssm.unipa.it/vmuggeo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Heckman Selection MOdel Help in R
On Mon, Jul 13, 2009 at 11:18 AM, Pathak, Sauravs.patha...@imperial.ac.uk wrote: Dear Arne I have gone through the paper and I have tried it at my end, I would really appreciate if you could address the following: 1. Based upon your suggestion I used the following: regmod2 - selection(s ~ age + gender + gemedu + gemhinc + es_gdppc + imf_pop + estbbo_m, ln_oy5_1 ~ age+ gender+fearfail+gemedu, adpopdata, method = 2step) On trying the above( notice that I have changed heckit to selection in the above command, i get the following error message Error in coef.probit(result$probit) : could not find function coef.maxLik That's weird. Which versions of R, sampleSelection, and maxLik do you use? Before trying the above I tried the following: 2. When I tried to do the Heckman selection model in stages , the first run was successful, I mean, using the following: myProbit- glm(s ~ age + gender + gemedu + gemhinc + es_gdppc + + imf_pop + estbbo_m, family = binomial(link = probit)) summary(myProbit) I am successful upto this point, but 3. When I try calculating the IMR using the following: adpopdata$IMR-invMillsRatio(myProbit)$IMR1 I get the error below Error in `$-.data.frame`(`*tmp*`, IMR, value = c(2.50039945424535, : replacement has 257358 rows, data has 343251 I guess that you have some NAs in the data so that you have the IMRs not for all observations but only for the observations witout NAs. R myIMRs - invMillsRatio(myProbit)$IMR1 should work. Is there a code to calculate IMR by hand?? Yes, inside invMillsRatio() However, why do you want to do this? what I see is that the number of rows of IMR calculated and the number of rows in the actual data set do not match (may be some missing value issues, I am not sure, if it is, how to fix it?) and hence IMR could not be added to my original data set, how do I fix this and then proceed to get correct IMR to use in my outcome equation (the OLS stage) This is really taking a lot of time, I am working on it for weeks, can you please help me kindly, If you wish I can send you the data set as well Please try to fix it yourself. Arne -Original Message- From: Arne Henningsen [mailto:arne.henning...@googlemail.com] Sent: 13 July 2009 00:56 To: Pathak, Saurav; r-help@r-project.org; otoo...@ut.ee Subject: Re: Heckman Selection MOdel Help in R Hi Saurav! On Sun, Jul 12, 2009 at 6:06 PM, Pathak, Sauravs.patha...@imperial.ac.uk wrote: I am new to R, I have to do a 2 step Heckman model, my selection equation is below which I was successful in running but I am unable to proceed further, I have so far used the following command glm(formula = s ~ age + gender + gemedu + gemhinc + es_gdppc + imf_pop + estbbo_m, family = binomial(link = probit)) My question is 1. How do i discard the non significant selection variables (one out of the seven variables above is non-significant) and calculate the Inverse Mills Ratio of the significant variables 2. I need the inverse mills ratio from the above to run the outcome equation model using OLS with the Inverse mills ratio calculated on the basis of the above probit as the control in my outcome equation, hence I need to get the IMR (Is there another direct way?) 3. How can this be done in R using my concept or otherwise does there exist another way of doing what I wish to achieve On trying regmod - heckit(s ~ age + gender + gemedu + gemhinc + es_gdppc + imf_pop + estbbo_m, ln_oy5_1 ~ age+ gender+fearfail+gemedu, adpopdata,method=2step) I get Error: could not find function heckit Error: could not find function invMillsRatio Am I missing out something, do i have to install something apart from R also, so far I have used install.packages( sampleSelection, repos=http://R-Forge.R-project.org; ) install.packages(Rcmdr, dependencies=TRUE) Even then I am unable to run heckit, please help You have to install (only once) and *load* the package before you can use it: R library( sampleSelection ) I suggest that you do NOT use function heckit but function selection, because the latter allows you to estimate the model both by the 2-step and the 1-step (ML) method (depending on argument method). Our paper about the sampleSelection package published in the JSS could be also helpful for you: http://www.jstatsoft.org/v27/i07/ Arne -- Arne Henningsen http://www.arne-henningsen.name -- Arne Henningsen http://www.arne-henningsen.name __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plm Issues
Duh! Thanks and good advice. I was using 2.7.2 because it was, until recently, the latest version working with RPy (http://rpy.sourceforge.net/). Also didn't realize plm was still actively developed. Interesting that since plm now correctly handles diff and lag operations, it actually breaks with the behavior of lm: a=ts(c(1,2,4)) lm(a~diff(a)) Error in model.frame.default(formula = a ~ diff(a), drop.unused.levels = TRUE) : variable lengths differ (found for 'diff(a)') To regress a on its difference, one needs the more laborious: a=ts(c(1,2,4)) adata=as.data.frame(cbind(a,diff(a))) colnames(adata)=c(a,diffa) lm(a~diffa,data=adata) Call: lm(formula = a ~ diffa, data = adata) Coefficients: (Intercept)diffa 02 From the R help Fitting Linear ModelsUsing time series Considerable care is needed when using lm with time series. Unless na.action = NULL, the time series attributes are stripped from the variables before the regression is done. (This is necessary as omitting NAs would invalidate the time series attributes, and if NAs are omitted in the middle of the series the result would no longer be a regular time series.) Even if the time series attributes are retained, they are not used to line up series, so that the time shift of a lagged or differenced regressor would be ignored. It is good practice to prepare a data argument by ts.intersectts.union.html(..., dframe = TRUE), then apply a suitable na.action to that data frame and call lm with na.action = NULL so that residuals and fitted values are time series. On Sat, Jul 11, 2009 at 10:53 PM, milton ruser milton.ru...@gmail.comwrote: The first think one need to do when has a so old version, is update it :-) After, if the problem remain, try get help with the colleagues. best milton On Thu, Jul 9, 2009 at 10:58 AM, Damien Moore damienlmo...@gmail.comwrote: Hi List I'm having difficulty understanding how plm should work with dynamic formulas. See the commands and output below on a standard data set. Notice that the first summary(plm(...)) call returns the same result as the second (it shouldn't if it actually uses the lagged variable requested). The third call results in error (trying to use diff'ed variable in regression) Other info: I'm running R 2.7.2 on WinXP cheers *data(Gasoline,package=Ecdat) Gasoline_plm-plm.data(Gasoline,c(country,year)) pdim(Gasoline_plm) **Balanced Panel: n=18, T=19, N=342 * *summary(plm(lgaspcar~lincomep,data=Gasoline_plm**)) **Oneway (individual) effect Within Model Call: plm(formula = lgaspcar ~ lincomep, data = Gasoline_plm) Balanced Panel: n=18, T=19, N=342 Residuals : Min. 1st Qu. Median 3rd Qu. Max. -0.40100 -0.08410 -0.00858 0.08770 0.73400 Coefficients : Estimate Std. Error t-value Pr(|t|) lincomep -0.761830.03535 -21.551 2.2e-16 *** --- Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1 Total Sum of Squares: 17.061 Residual Sum of Squares: 6.9981 Multiple R-Squared: 0.58981 F-statistic: 464.442 on 323 and 1 DF, p-value: 0.036981 ** summary(plm(lgaspcar~lag(lincomep),data=Gasoline_plm)) **Oneway (individual) effect Within Model Call: plm(formula = lgaspcar ~ lag(lincomep), data = Gasoline_plm) Balanced Panel: n=18, T=19, N=342 Residuals : Min. 1st Qu. Median 3rd Qu. Max. -0.40100 -0.08410 -0.00858 0.08770 0.73400 Coefficients : Estimate Std. Error t-value Pr(|t|) lag(lincomep) -0.761830.03535 -21.551 2.2e-16 *** --- Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1 Total Sum of Squares: 17.061 Residual Sum of Squares: 6.9981 Multiple R-Squared: 0.58981 F-statistic: 464.442 on 323 and 1 DF, p-value: 0.036981 * *summary(plm(lgaspcar~diff(lincomep),data=Gasoline_plm))* *Error in model.frame.default(formula = lgaspcar ~ diff(lincomep), data = mydata, : variable lengths differ (found for 'diff(lincomep)') * [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] variance explained by each predictor in GAM
Many thanks for the advice David. I would really like to figure out, though, how to get the contribution of each factor to the Rsq - something like a Beta coefficient for GAM. Ideas? KC On Sun, Jul 12, 2009 at 5:41 PM, David Winsemius dwinsem...@comcast.netwrote: On Jul 12, 2009, at 5:06 PM, Kayce Anderson wrote: Hi, I am using mgcv:gam and have developed a model with 5 smoothed predictors and one factor. gam1 - gam(log.sp~ s(Spr.precip,bs=ts) + s(Win.precip,bs=ts) + s( Spr.Tmin,bs=ts) + s(P.sum.Tmin,bs=ts) + s( Win.Tmax,bs=ts) +factor(site),data=dat3) The total deviance explained = 70.4%. I would like to extract the variance explained by each predictor. Is there a straightforward way to do this? I have tried dropping a term and recalculating the model, but the edf's change if there is any correlation among variables, thereby making all of the relationships different. I haven't yet figured out how to fix the smoothing terms- I get syntax error messages. Among other variations, I tried, for example, log.sp~s(Spr.precip, sp=3.9, fx=TRUE) +... ?anova.gam Obviously I cannot test this with your dat3. You get an F-statistic for each s() term by default and you are referred to saummary.gam for further explanation. David Winsemius, MD Heritage Laboratories West Hartford, CT [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] weighted cox ph model
Dear all, I am struggling with the weighted Efron approximation. I really need some help. I am confused by the explanation regarding applying weights to Efron approximation on the paper of “A Package for Survival Analysis in S” (Terry M. Therneau, Feb 1999). In the section 3.3.7 Weighted Cox models, it states that, for a simple example with 5 subjects in time order and 1 and 2 died at the same time, the second term of the weighted partial likelihood is either w1r1/ ( w1r1 + w3r3+ w4r4+ w5r5) or w1r1/ (w2r2+ w3r3+ w4r4+ w5r5) The author suggests that to compute the Efron approximation one can simply separately replace the numerator with 0.5 (w1r1 + w2r2) and the denominator with 0.5w1r1 + 0.5w2r2+ w3r3+ w4r4+ w5r5. Does this suggestion refer to the second term (stated above) of the likelihood function only ? If so should the first term of the likelihood function should still be w2r2/ (w1r1 + w2r2+ w3r3+ w4r4+ w5r5) or w1r1/ (w1r1 + w2r2+ w3r3+ w4r4+ w5r5)? Then to compute the Efron approximation, should the numerator of the this likelihood function should be 0.5 (w1r1 + w2r2)* 0.5 (w1r1 + w2r2), i.e. the likelihood function should be [0.5 (w1r1 + w2r2)/ (w1r1 + w2r2+ w3r3+ w4r4+ w5r5) ]* [0.5 (w1r1 + w2r2)/( 0.5w1r1 + 0.5w2r2+ w3r3+ w4r4+ w5r5)]? Since we add the probability to the numerator of the second term, we should add the probability to the numerator of the first term as well. If 1 and 2 each have the probability of 0.5 for the deaths in the second term they should also have the probability of 0.5 for the deaths in the first term. That’s my thinking. If it is wrong, please correct it for me. I will really appreciate it. By the way, I tried another weighted Efron approximation, which can yield the same result with R if and only if the weights for tied data are the same. Otherwise it yields the different results but close to the R’s. The formula has been attached with this message already. Could someone give me some hints regarding my formula, please? Any suggestions will be sincerely appreciated. Bessy http://www.nabble.com/file/p24462598/Weighted%2BEfron%2BApproximation_Nabble_R.doc Weighted+Efron+Approximation_Nabble_R.doc -- View this message in context: http://www.nabble.com/weighted-cox-ph-model-tp24462598p24462598.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] subset range selections
Thanks Jim. How does one search the help system for info on simple logic like this? On Mon, Jul 13, 2009 at 6:59 AM, jim holtmanjholt...@gmail.com wrote: 1) X - subset(A, (t 1000) | (t 1200)) 2) X - subset(A, (t 1000) (t 1200)) On Mon, Jul 13, 2009 at 9:47 AM, Mark Knechtmarkkne...@gmail.com wrote: How would I write the two selections each in a single subset command? 1) Two non-overlapping time ranges I want to collect together - before 10AM and after noon. Should be an OR function: X = subset(A, t1000) + subset(A, t1200) 2) One range between two defined times like after 10AM and before noon. Should be an AND function: X1 = subset(A, t1000) X = subset(X1, t1200) Thanks, Mark P.S. - The help system seems very difficult for finding this sort of information! __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plm Issues
The dyn and dynlm packages can handle time series in lm and glm. (dyn can also handle many additional regression functions as well) In the case of dyn just write dyn$lm instead of lm like this: library(dyn) a - ts(c(1, 2, 4)) dyn$lm(a ~ diff(a)) Call: lm(formula = dyn(a ~ diff(a))) Coefficients: (Intercept) diff(a) 02 On Mon, Jul 13, 2009 at 10:17 AM, Damien Mooredamienlmo...@gmail.com wrote: Duh! Thanks and good advice. I was using 2.7.2 because it was, until recently, the latest version working with RPy (http://rpy.sourceforge.net/). Also didn't realize plm was still actively developed. Interesting that since plm now correctly handles diff and lag operations, it actually breaks with the behavior of lm: a=ts(c(1,2,4)) lm(a~diff(a)) Error in model.frame.default(formula = a ~ diff(a), drop.unused.levels = TRUE) : variable lengths differ (found for 'diff(a)') To regress a on its difference, one needs the more laborious: a=ts(c(1,2,4)) adata=as.data.frame(cbind(a,diff(a))) colnames(adata)=c(a,diffa) lm(a~diffa,data=adata) Call: lm(formula = a ~ diffa, data = adata) Coefficients: (Intercept) diffa 0 2 From the R help Fitting Linear ModelsUsing time series Considerable care is needed when using lm with time series. Unless na.action = NULL, the time series attributes are stripped from the variables before the regression is done. (This is necessary as omitting NAs would invalidate the time series attributes, and if NAs are omitted in the middle of the series the result would no longer be a regular time series.) Even if the time series attributes are retained, they are not used to line up series, so that the time shift of a lagged or differenced regressor would be ignored. It is good practice to prepare a data argument by ts.intersectts.union.html(..., dframe = TRUE), then apply a suitable na.action to that data frame and call lm with na.action = NULL so that residuals and fitted values are time series. On Sat, Jul 11, 2009 at 10:53 PM, milton ruser milton.ru...@gmail.comwrote: The first think one need to do when has a so old version, is update it :-) After, if the problem remain, try get help with the colleagues. best milton On Thu, Jul 9, 2009 at 10:58 AM, Damien Moore damienlmo...@gmail.comwrote: Hi List I'm having difficulty understanding how plm should work with dynamic formulas. See the commands and output below on a standard data set. Notice that the first summary(plm(...)) call returns the same result as the second (it shouldn't if it actually uses the lagged variable requested). The third call results in error (trying to use diff'ed variable in regression) Other info: I'm running R 2.7.2 on WinXP cheers *data(Gasoline,package=Ecdat) Gasoline_plm-plm.data(Gasoline,c(country,year)) pdim(Gasoline_plm) **Balanced Panel: n=18, T=19, N=342 * *summary(plm(lgaspcar~lincomep,data=Gasoline_plm**)) **Oneway (individual) effect Within Model Call: plm(formula = lgaspcar ~ lincomep, data = Gasoline_plm) Balanced Panel: n=18, T=19, N=342 Residuals : Min. 1st Qu. Median 3rd Qu. Max. -0.40100 -0.08410 -0.00858 0.08770 0.73400 Coefficients : Estimate Std. Error t-value Pr(|t|) lincomep -0.76183 0.03535 -21.551 2.2e-16 *** --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Total Sum of Squares: 17.061 Residual Sum of Squares: 6.9981 Multiple R-Squared: 0.58981 F-statistic: 464.442 on 323 and 1 DF, p-value: 0.036981 ** summary(plm(lgaspcar~lag(lincomep),data=Gasoline_plm)) **Oneway (individual) effect Within Model Call: plm(formula = lgaspcar ~ lag(lincomep), data = Gasoline_plm) Balanced Panel: n=18, T=19, N=342 Residuals : Min. 1st Qu. Median 3rd Qu. Max. -0.40100 -0.08410 -0.00858 0.08770 0.73400 Coefficients : Estimate Std. Error t-value Pr(|t|) lag(lincomep) -0.76183 0.03535 -21.551 2.2e-16 *** --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Total Sum of Squares: 17.061 Residual Sum of Squares: 6.9981 Multiple R-Squared: 0.58981 F-statistic: 464.442 on 323 and 1 DF, p-value: 0.036981 * *summary(plm(lgaspcar~diff(lincomep),data=Gasoline_plm))* *Error in model.frame.default(formula = lgaspcar ~ diff(lincomep), data = mydata, : variable lengths differ (found for 'diff(lincomep)') * [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the
Re: [R] subset range selections
You can find the operators with: ?| This will show you logical operators. You can do: ?+ to get the others. On Mon, Jul 13, 2009 at 10:21 AM, Mark Knechtmarkkne...@gmail.com wrote: Thanks Jim. How does one search the help system for info on simple logic like this? On Mon, Jul 13, 2009 at 6:59 AM, jim holtmanjholt...@gmail.com wrote: 1) X - subset(A, (t 1000) | (t 1200)) 2) X - subset(A, (t 1000) (t 1200)) On Mon, Jul 13, 2009 at 9:47 AM, Mark Knechtmarkkne...@gmail.com wrote: How would I write the two selections each in a single subset command? 1) Two non-overlapping time ranges I want to collect together - before 10AM and after noon. Should be an OR function: X = subset(A, t1000) + subset(A, t1200) 2) One range between two defined times like after 10AM and before noon. Should be an AND function: X1 = subset(A, t1000) X = subset(X1, t1200) Thanks, Mark P.S. - The help system seems very difficult for finding this sort of information! __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Splitting dataset for Tuning Parameter with Cross Validation
Here is what the train() function in the caret package does by default (you can change this behavior; see below). Using the entire data set, estimate the RBF parameter using the sigest() function in the kernlab package (which, if I recall correctly involves the median of a sample of kernel matrix values). Using this fixed value, the cost function is varied over a common set of held-out samples. More specifically, every value of the cost parameter is evaluated on the same exact folds. I've been able to achieve pretty good performance this way in almost every case where I've done the comparison, Based on these performance values, you can select the cost function based on the best performance. There are also ways of selecting the simplest model that is within the uncertainty of the numerically optimal model (that is done using the selectionFunction argument of trainControl). I should also note that you can tune across any grid of cost and sigma (this is done via the tuneGrid argument of train()). Max __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Issues with file.info?
Yes. You are right that is the trick:) Here is what I'm seeing: input_path-c(C:/WINDOWS/system32) file.info(list.files(input_path)) size isdir mode mtime ctime atime exe $winnt$.inf NANA NA NA NA NA NA 1025 NANA NA NA NA NA NA ... xsi.jar NA NA NA NA NA NA NA xsi.zip NA NA NA NA NA NA NA zipfldr.dll NA NA NA NA NA NA NA input_path-c(C:/WINDOWS/system32/winmine.exe) file.info(input_path) size isdir mode mtime ctime atime exe C:/WINDOWS/system32/winmine.exe 119808 FALSE 777 2002-12-31 07:00:00 2008-08-15 14:28:37 2009-07-13 09:39:32 win32 Not sure why, file.info is not working for the array of input files/folders/etc. but successfully works when an individual file in input. Thanks again for your insights and feedback. --- On Thu, 7/9/09, Rolf Turner r.tur...@auckland.ac.nz wrote: From: Rolf Turner r.tur...@auckland.ac.nz Subject: Re: [R] Issues with file.info? To: Jason Rupert jasonkrup...@yahoo.com Cc: R-help@r-project.org R-help@r-project.org Date: Thursday, July 9, 2009, 3:13 PM On 10/07/2009, at 6:02 AM, Jason Rupert wrote: Are there any tricks associated with file.info? I just tried it on a directory folder and it returned NA for all fields for all files. I tried it on a different folder with different files and it still returned NA. I tried it on a specific file and it returned all the proper info correctly. Just wondering if there are any tricks I've overlooked. Yes. You've overlooked the trick of telling us about the specifics of your operating system, version of R, etc., and of showing exactly what commands you ``tried'', i.e. of reading the Posting Guide. cheers, Rolf Turner ## Attention:This e-mail message is privileged and confidential. If you are not theintended recipient please delete the message and notify the sender.Any views or opinions presented are solely those of the author. This e-mail has been scanned and cleared by MailMarshalwww.marshalsoftware.com ## __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Nonlinear Least Squares nls() programming help
MathZero wrote: Hi, I am trying to use the nls() function to closely approximate a vector of values, colC and I'm running into trouble. I am not sure how if I am asking the program to do what I think its doing, because the same minimization in Excel's Solver does not run into problems. Fitting a function is an approximation, trying to find a minimum. Think of frozen mountain lake surrounded by mountains. Excel's Solver will report the highest tip of the snowflake on the lake, if it finds it. nls will find out that the lake is essentially flat compare to the surrounding and tell you this fact in unkind word. In your case, I assume there is an (additional?) problem that the solution might be non-identifyable. It could required additional constraints, for example that one coefficient is always larger than the other. Dieter -- View this message in context: http://www.nabble.com/Nonlinear-Least-Squares-nls%28%29-programming-help-tp24445472p24463097.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] adjusting survival using coxph
I have what I *think* should be a simple problem in R, and hope someone might be able to help me. I'm working with cancer survival data, and would like to calculate adjusted survival figures based on the age of the patient and the tumour classification. A friendly statistician told me I should use Cox proportional hazards to do this, and I've made some progress with using the coxph function. However, there doesn't seem to be a simple way to get adjusted survival figures, and I'm still a bit unsure whether I'm using the function correctly. I've found plenty of examples of how to plot survival curves for a single catagorical variable, like tumour stage. However, its a bit less obvious how to properly combine multiple variables (like stage and grade) - examples I've found aren't very helpful. For instance, can someone explain what these two function calls are doing differently? cox.model1 - coxph(Surv(time,outcome)~stage+grade,method=breslow) cox.model2 - coxph(Surv(time,outcome)~stage+grade +stage:grade,method=breslow) Also, am I right to think that this is how I would plot the survival curves for stage in a 50 year old patient? cox.model3 - coxph(Surv(time,outcome)~stage+I (age-50),method=breslow) plot(survfit(cox.model3,newdata=data.frame(stage2=c (III,IV),age=c(50,50 Finally, assuming I manage to correctly call the coxph function in the first place (!), what is the sensible way to adjust the survival for each patient? My understanding of what I'm trying/hoping to do has led me down the following dark alley; 1) pick a sensible centre point (median age, modal stage/grade) as my 'standard' case 2) obtain the survival probabilities for this 'standard' case from the Cox model 3) for each patient find the probability of their observed survival, for their specific age/stage/grade 4) the adjusted survival is the survival time from the 'standard' case that matches the observed survival probability This seems long-winded, and for all I know might be completely the wrong way to do it. hopefully someone will be able to offer some friendly advice! Apologies for the long-ish post, and thanks in advance for any help. Chris Jones. --- Gynaecological Cancer Research Laboratories, UCL EGA Institute for Women's Health, University College London, Paul O'Gorman Building, 72 Huntley Street, London WC1E 6DD United Kingdom Telephone; 020 3108 2007 Fax; 020 3108 2010 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] subset range selections
On 7/13/2009 10:21 AM, Mark Knecht wrote: Thanks Jim. How does one search the help system for info on simple logic like this? Look in the manuals (the pdf ones), rather than the man pages. For example, expressions like the ones below are in the Introduction to R, section 2.4, Logical Vectors. (The table of contents contains Logical vectors, within Simple manipulations.) Duncan Murdoch On Mon, Jul 13, 2009 at 6:59 AM, jim holtmanjholt...@gmail.com wrote: 1) X - subset(A, (t 1000) | (t 1200)) 2) X - subset(A, (t 1000) (t 1200)) On Mon, Jul 13, 2009 at 9:47 AM, Mark Knechtmarkkne...@gmail.com wrote: How would I write the two selections each in a single subset command? 1) Two non-overlapping time ranges I want to collect together - before 10AM and after noon. Should be an OR function: X = subset(A, t1000) + subset(A, t1200) 2) One range between two defined times like after 10AM and before noon. Should be an AND function: X1 = subset(A, t1000) X = subset(X1, t1200) Thanks, Mark P.S. - The help system seems very difficult for finding this sort of information! __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Issues with file.info?
On 7/13/2009 10:42 AM, Jason Rupert wrote: Yes. You are right that is the trick:) Here is what I'm seeing: input_path-c(C:/WINDOWS/system32) file.info(list.files(input_path)) size isdir mode mtime ctime atime exe $winnt$.inf NANA NA NA NA NA NA 1025 NANA NA NA NA NA NA ... xsi.jar NA NA NA NA NA NA NA xsi.zip NA NA NA NA NA NA NA zipfldr.dll NA NA NA NA NA NA NA input_path-c(C:/WINDOWS/system32/winmine.exe) file.info(input_path) size isdir mode mtime ctime atime exe C:/WINDOWS/system32/winmine.exe 119808 FALSE 777 2002-12-31 07:00:00 2008-08-15 14:28:37 2009-07-13 09:39:32 win32 Not sure why, file.info is not working for the array of input files/folders/etc. but successfully works when an individual file in input. Look at what list.files() returns: there is no path, just the filename. If you want to use file.info, you need the full path, so you'd use file.info(list.files(input_path,full=TRUE)) (That takes a long time, by the way: there are a lot of files there, and they all need to be opened for inspection!) Duncan Murdoch Thanks again for your insights and feedback. --- On Thu, 7/9/09, Rolf Turner r.tur...@auckland.ac.nz wrote: From: Rolf Turner r.tur...@auckland.ac.nz Subject: Re: [R] Issues with file.info? To: Jason Rupert jasonkrup...@yahoo.com Cc: R-help@r-project.org R-help@r-project.org Date: Thursday, July 9, 2009, 3:13 PM On 10/07/2009, at 6:02 AM, Jason Rupert wrote: Are there any tricks associated with file.info? I just tried it on a directory folder and it returned NA for all fields for all files. I tried it on a different folder with different files and it still returned NA. I tried it on a specific file and it returned all the proper info correctly. Just wondering if there are any tricks I've overlooked. Yes. You've overlooked the trick of telling us about the specifics of your operating system, version of R, etc., and of showing exactly what commands you ``tried'', i.e. of reading the Posting Guide. cheers, Rolf Turner ## Attention:This e-mail message is privileged and confidential. If you are not theintended recipient please delete the message and notify the sender.Any views or opinions presented are solely those of the author. This e-mail has been scanned and cleared by MailMarshalwww.marshalsoftware.com ## __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] math expressions in graphs
Dear useRs, I want to put a math expression in the ylab in my plot which should be a Delta and a 'y' with a trace and a hat above it and a 't' as subscription. How could I manage to do it? Thanks in advance, Regards, Rafael. [[elided Yahoo spam]] [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] variance explained by each predictor in GAM
You can get some idea by doing something like the following, which compares the r^2 for models b and b2, i.e. with and without s(x2). It keeps the smoothing parameters fixed for the comparison. (s(x,fx=TRUE) removes penalization altogether btw, which is not what was wanted). dat - gamSim(1,n=400,dist=normal,scale=2) b-gam(y~s(x0)+s(x1)+s(x2)+s(x3),data=dat) b2-gam(y~s(x0)+s(x1)+s(x3),sp=b$sp[-3],data=dat) summary(b2)$dev.expl summary(b)$dev.expl On Monday 13 July 2009 15:09, Kayce Anderson wrote: Many thanks for the advice David. I would really like to figure out, though, how to get the contribution of each factor to the Rsq - something like a Beta coefficient for GAM. Ideas? KC On Sun, Jul 12, 2009 at 5:41 PM, David Winsemius dwinsem...@comcast.netwrote: On Jul 12, 2009, at 5:06 PM, Kayce Anderson wrote: Hi, I am using mgcv:gam and have developed a model with 5 smoothed predictors and one factor. gam1 - gam(log.sp~ s(Spr.precip,bs=ts) + s(Win.precip,bs=ts) + s( Spr.Tmin,bs=ts) + s(P.sum.Tmin,bs=ts) + s( Win.Tmax,bs=ts) +factor(site),data=dat3) The total deviance explained = 70.4%. I would like to extract the variance explained by each predictor. Is there a straightforward way to do this? I have tried dropping a term and recalculating the model, but the edf's change if there is any correlation among variables, thereby making all of the relationships different. I haven't yet figured out how to fix the smoothing terms- I get syntax error messages. Among other variations, I tried, for example, log.sp~s(Spr.precip, sp=3.9, fx=TRUE) +... ?anova.gam Obviously I cannot test this with your dat3. You get an F-statistic for each s() term by default and you are referred to saummary.gam for further explanation. David Winsemius, MD Heritage Laboratories West Hartford, CT [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Simon Wood, Mathematical Sciences, University of Bath, Bath, BA2 7AY UK +44 1225 386603 www.maths.bath.ac.uk/~sw283 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] variance explained by each predictor in GAM
It appears you are conflating beta coefficients (individual covariate effect measures) with overall model fit measures. Beta coefficients are not directly comparable to R-squared measures in ordinary least squares analyses, so why would they be so in gam models? I cannot tell whether you actually looked at anova.gam or consulted Wood's book (which is really the better place to go for advice regarding mgcv function questions. You were offered an apportioned F- statistic for each covariate smooth. How was that not satisfactory? What is your goal in this search? -- DW -- On Jul 13, 2009, at 10:09 AM, Kayce Anderson wrote: Many thanks for the advice David. I would really like to figure out, though, how to get the contribution of each factor to the Rsq - something like a Beta coefficient for GAM. Ideas? KC On Sun, Jul 12, 2009 at 5:41 PM, David Winsemius dwinsem...@comcast.net wrote: On Jul 12, 2009, at 5:06 PM, Kayce Anderson wrote: Hi, I am using mgcv:gam and have developed a model with 5 smoothed predictors and one factor. gam1 - gam(log.sp~ s(Spr.precip,bs=ts) + s(Win.precip,bs=ts) + s( Spr.Tmin,bs=ts) + s(P.sum.Tmin,bs=ts) + s( Win.Tmax,bs=ts) +factor(site),data=dat3) The total deviance explained = 70.4%. I would like to extract the variance explained by each predictor. You may need to define your terms and there may not be an analog Is there a straightforward way to do this? I have tried dropping a term and recalculating the model, but the edf's change if there is any correlation among variables, thereby making all of the relationships different. I haven't yet figured out how to fix the smoothing terms- I get syntax error messages. Among other variations, I tried, for example, log.sp~s(Spr.precip, sp=3.9, fx=TRUE) +... ?anova.gam Obviously I cannot test this with your dat3. You get an F-statistic for each s() term by default and you are referred to saummary.gam for further explanation. David Winsemius, MD Heritage Laboratories West Hartford, CT David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] object .trPaths not found
Uwe Ligges schrieb: I have to admit that I have no idea what we are talking about here (yes, I tend to forget many things these days) - and you have not cited the original message, unfortunately (nor have you specifies R versions, Tinn-R versions and both OS versions, but just one) ... Best wishes, Uwe Original question rkevinbur...@charter.net wrote: I am running an R script with Tinn-R (2.2.0.1) and I get the error message Error in source(.trPaths[4], echo = TRUE, max.deparse.length = 150) : object .trPaths not found Any solutions? System R 2.8.1 Tinn-R 2.2.02 windows Xp professional Servicepack 3 This combination works on my Pc without problems And the code is working on the other system with Tinn-R problems with the R-Editor without problems so it seems to be a Tinn_R problem but maybe anybody has a solution for that. Regards Knut __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] math expressions in graphs
Dear Rafael, Try the following which seems to be close to what you asked for: plot(1, ylab = expression( Delta ~ hat(y) [t] ) ) HTH, Jorge On Mon, Jul 13, 2009 at 10:56 AM, Rafael Moral rafa_moral2...@yahoo.com.brwrote: Dear useRs, I want to put a math expression in the ylab in my plot which should be a Delta and a 'y' with a trace and a hat above it and a 't' as subscription. How could I manage to do it? Thanks in advance, Regards, Rafael. [[elided Yahoo spam]] [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Lattice xyplot: same scales within one factor
I am using R 2.8.1 and lattice to produce xyplots conditioned on two factors. What I would like is to have the scales be free between values of one factor, but some within. Thus, in this example, xyplot(mpg ~ disp | factor(gear) + factor(cyl), mtcars, scales=list(x=list(relation=free))) rather than having the x scales be free within a gear as well, I want it to be the same for the gear, but free between cyl (and thus only have x scales below the bottom panels with no scales or white space in the middle between panels). Any help would be greatly appreciated! -Orion __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] dse model setup help
I tried to specify a model in dse1 but something isn't right. Anybody have any tips? model-SS(F=f,G=g,H=h,Q=q,z0=z,P0=p) Error in locateSS(model$R, constants$R, R, p, p, plist) : The dimension of something in the SS model structure is bad. dim(f) [1] 5 5 dim(g) [1] 5 1 dim(h) [1] 1 5 dim(q) [1] 5 5 dim(z) [1] 5 1 dim(p) [1] 5 5 thanks, Bob __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] survSplit with data.frame containing a Surv object
Dear All, since years I am struggling with Surv objects in data.frames. The following seems to have to do with it. See below the modified example from the help page of survSplit. The original works, as expected. If, however, a Surv object is added to the data.frame, each record gets doubled. Is there some solution other than avoiding Surv objects in data.frames? Thanks, Heinz require(survival) ## from the help page aml3-survSplit(aml,cut=c(5,10,50),end=time,start=start, event=status,episode=i) summary(aml) summary(aml3) coxph(Surv(time,status)~x,data=aml) ## the same coxph(Surv(start,time,status)~x,data=aml3) ## added to show doubling of records aml.so - aml aml.so$surv.object - with(aml, Surv(time, status)) aml3.so - survSplit(aml.so ,cut=c(5,10,50),end=time,start=start, event=status,episode=i) summary(aml3.so) sessionInfo('survival') R version 2.9.1 Patched (2009-07-07 r48910) i386-pc-mingw32 locale: LC_COLLATE=German_Switzerland.1252;LC_CTYPE=German_Switzerland.1252;LC_MONETARY=German_Switzerland.1252;LC_NUMERIC=C;LC_TIME=German_Switzerland.1252 attached base packages: character(0) other attached packages: [1] survival_2.35-4 loaded via a namespace (and not attached): [1] base_2.9.1 graphics_2.9.1 grDevices_2.9.1 methods_2.9.1 [5] splines_2.9.1 stats_2.9.1 utils_2.9.1 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] variance explained by each predictor in GAM
Simon,That produced exactly what I was looking for. Thanks so much for the humble help. KC On Mon, Jul 13, 2009 at 9:10 AM, Simon Wood s.w...@bath.ac.uk wrote: You can get some idea by doing something like the following, which compares the r^2 for models b and b2, i.e. with and without s(x2). It keeps the smoothing parameters fixed for the comparison. (s(x,fx=TRUE) removes penalization altogether btw, which is not what was wanted). dat - gamSim(1,n=400,dist=normal,scale=2) b-gam(y~s(x0)+s(x1)+s(x2)+s(x3),data=dat) b2-gam(y~s(x0)+s(x1)+s(x3),sp=b$sp[-3],data=dat) summary(b2)$dev.expl summary(b)$dev.expl On Monday 13 July 2009 15:09, Kayce Anderson wrote: Many thanks for the advice David. I would really like to figure out, though, how to get the contribution of each factor to the Rsq - something like a Beta coefficient for GAM. Ideas? KC On Sun, Jul 12, 2009 at 5:41 PM, David Winsemius dwinsem...@comcast.netwrote: On Jul 12, 2009, at 5:06 PM, Kayce Anderson wrote: Hi, I am using mgcv:gam and have developed a model with 5 smoothed predictors and one factor. gam1 - gam(log.sp~ s(Spr.precip,bs=ts) + s(Win.precip,bs=ts) + s( Spr.Tmin,bs=ts) + s(P.sum.Tmin,bs=ts) + s( Win.Tmax,bs=ts) +factor(site),data=dat3) The total deviance explained = 70.4%. I would like to extract the variance explained by each predictor. Is there a straightforward way to do this? I have tried dropping a term and recalculating the model, but the edf's change if there is any correlation among variables, thereby making all of the relationships different. I haven't yet figured out how to fix the smoothing terms- I get syntax error messages. Among other variations, I tried, for example, log.sp~s(Spr.precip, sp=3.9, fx=TRUE) +... ?anova.gam Obviously I cannot test this with your dat3. You get an F-statistic for each s() term by default and you are referred to saummary.gam for further explanation. David Winsemius, MD Heritage Laboratories West Hartford, CT [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Simon Wood, Mathematical Sciences, University of Bath, Bath, BA2 7AY UK +44 1225 386603 www.maths.bath.ac.uk/~sw283 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to perform a Likelihood-ratio test?
Hi Lars, Using the definition of Likelihood-ratio test is straightforward, Jeff Lars Bergemann wrote: Hi. I would like to use a likelihood-ratio test to compare a linear model (lm()) to another linear model containing the first one to see if the extra factors are needed - but I was unable to find any help on how to do that. Could you help me and tell me what to do? Or tell me where to find help on this topic? Many thanks in advance! Lars _ s. It's easy! aspxmkt=en-us [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://www.nabble.com/How-to-perform-a-Likelihood-ratio-test--tp24460221p24463269.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] graph: axis label font
Hi, excuse me for my english, i am using R on windows and i have to do several graphs with axis labels and the axis text thicks has a specified font type, (Arial) and a specified font size. How can i do these? Thank you in advance -- View this message in context: http://www.nabble.com/graph%3A-axis-label-font-tp24463974p24463974.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] getting a timeseries element into a string
Thanks Ron. I apologize for not properly specifying which class/package I was using. I am using the timeSeries class from the timeSeries Package, version 2100.83 from rmetrics.org The code you supplied works for me! -- thanks again. Warm regards, Andrew -- Ron Burns-2 wrote: Here is what I do: R library(fBasics) R ts-dummyDailySeries(x = rnorm(365), units = NULL, zone = , FinCenter =) R class(ts) [1] timeSeries attr(,package) [1] timeSeries R (s - as.character(time(ts)[1])) [1] 1970-01-01 R class(s) [1] character tradenet wrote: I added a reproducible example to my question... ts-dummyDailySeries(x = rnorm(365), units = NULL, zone = , FinCenter = ) (ts[1,0]) #returns first date in return series GMT 1970-01-01 ttt-(sprintf(%s,ts[1,0])) print(ttt) character(0) ttt-(ts[1,0]) print(ttt) GMT 1970-01-01 what I want to get is a string containing 1970-01-01 Thank you. tradenet wrote: I have a timeseries object, ts, and want to get the first date in the series into a string so I can concatenate it with a SQL query. Input and output are shown below. I must be missing something very basic, but I can't seem to pry the data (2008-07-01) into a string variable. Any suggestions would be appreciated. Thank you, Andrew =script: class(ts) (ts[1,0]) #returns first date in return series ttt-(sprintf(%s,ts[1,0])) print(ttt) =output: class(ts) [1] timeSeries attr(,package) [1] timeSeries (ts[1,0]) #returns first date in return series GMT 2008-07-01 ttt-(sprintf(%s,ts[1,0])) print(ttt) character(0) -- R. R. Burns Physicist (Retired) Oceanside, CA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://www.nabble.com/getting-a-timeseries-element-into-a-string-tp24429876p24465343.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to perform a Likelihood-ratio test?
Le lundi 13 juillet 2009 à 13:04 +0200, Lars Bergemann a écrit : Hi. I would like to use a likelihood-ratio test to compare a linear model (lm()) to another linear model containing the first one to see if the extra factors are needed - but I was unable to find any help on how to do that. Could you help me and tell me what to do? Or tell me where to find help on this topic? ?anova? Check the test argument ...:-) Emmanuel Charpentier __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to perform a Likelihood-ratio test?
On Mon, 13 Jul 2009, Jeff Xu wrote: Hi Lars, Using the definition of Likelihood-ratio test is straightforward, To which I would add, following the instructions in the _posting guide_ is equally straightforward. viz help.search(loglikelihood) takes you to ?stats::logLik Chuck Jeff Lars Bergemann wrote: Hi. I would like to use a likelihood-ratio test to compare a linear model (lm()) to another linear model containing the first one to see if the extra factors are needed - but I was unable to find any help on how to do that. Could you help me and tell me what to do? Or tell me where to find help on this topic? Many thanks in advance! Lars _ s. It's easy! aspxmkt=en-us [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://www.nabble.com/How-to-perform-a-Likelihood-ratio-test--tp24460221p24463269.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Charles C. Berry(858) 534-2098 Dept of Family/Preventive Medicine E mailto:cbe...@tajo.ucsd.edu UC San Diego http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to perform a Likelihood-ratio test?
Dear Lars, One way would be anova(model1, model2) See ?anova for more information. HTH, Jorge On Mon, Jul 13, 2009 at 7:04 AM, Lars Bergemann lars.bergem...@hotmail.comwrote: Hi. I would like to use a likelihood-ratio test to compare a linear model (lm()) to another linear model containing the first one to see if the extra factors are needed - but I was unable to find any help on how to do that. Could you help me and tell me what to do? Or tell me where to find help on this topic? Many thanks in advance! Lars _ s. It's easy! aspxmkt=en-us [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Help me get this function to work...
I have a function (see below). This function has one object, ID. If I run the loops by itself using a character value (ie,VFFF1-7), then the loops work fine. However, when I try to insert the character value via the function call, it doesn't work. I don't get an error, but the TotalCover.df dataframe does not update according to the loop criteria. Any obvious problems that you can see? Cover Function# #Define Variables Quadrats.df-unique(data.df$Quadrat) TotalCover.df-cbind(0:750/10,0,0,0,0,0,0) colnames(TotalCover.df)- c(Station,Shrub,Tree,Invasive,Herb,Litter,Bare) Shrub.df-data.df[data.df$Layer==Shrub,] Tree.df-data.df[data.df$Layer==Tree,] Cover.fn-function(ID){ ShrubCover.df-Shrub.df[Shrub.df$Quadrat==ID,] for (j in 1:length(ShrubCover.df[,Quadrat])){ for (i in 1:751){ if(TotalCover.df[i,Station]=ShrubCover.df[j,Start] TotalCover.df[i,Station]= ShrubCover.df[j,Stop]) TotalCover.df[i,Shrub]- 1 } } TreeCover.df-Tree.df[Tree.df$Quadrat==ID,] for (j in 1:length(TreeCover.df[,Quadrat])){ for (i in 1:751){ if(TotalCover.df[i,Station]=TreeCover.df[j,Start] TotalCover.df[i,Station]= TreeCover.df[j,Stop]) TotalCover.df[i,Tree]- 1 } } } Cover.fn(VFFF1-7) _ right from Hotmail. Check it out. M_WL_QA_HM_celebrity_photos2_072009cat=celebrity __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] read.delim skips first column (why?)
Hi people, I have a text file like this one posted: snp_id genechromosome distance_from_gene_center positionpop1pop2pop3pop4pop5pop6pop7 rs2129081 RAPT2 3 -129993 upstream 0.439009 1.169210NA 0.2330200.093042NA -0.902596 rs1202698 RAPT2 3 -128695 upstream NA 1.815000NA 0.3990791.8142701.382950 NA rs1163207 RAPT2 3 -128224 upstream NA NA NA NA NA NA NA rs1834127 RAPT2 3 -128106 upstream NA NA NA NA NA NA 2.180670 rs2114211 RAPT2 3 -126738 upstream -0.468279 -1.447620 NA 0.010616-0.414581 NA 0.550447 rs2113151 RAPT2 3 -124620 upstream -0.897660 -1.971020 NA -0.920327 -0.764658 NA 0.337127 rs2524130 RAPT2 3 -123029 upstream -0.109795 -0.004646 -0.412059 1.1167400.667567 -0.924529 0.962841 rs1381318 RAPT2 3 -12818 upstream -0.911662 -1.791580 NA -0.945716 -1.239640 NA 0.004876 rs2113319 RAPT2 3 -122028 upstream -0.911662 -1.738610 NA -0.945716 -1.240950 NA -0.005318 When I use read.delim (or any read function) on it, R skips the first column, and I don' understand why. For example: $: R data = read.delim('snp_file.txt', head=T, sep='\t') Now, I would expect data$snp_id to contain snp ids, and data$gene to contain gene names; but it is not like this: data$snp_id [1] RAPT2 RAPT2 RAPT2 RAPT2 RAPT2 RAPT2 RAPT2 RAPT2 RAPT2 Levels: RAPT2 data$gene [1] 3 3 3 3 3 3 3 3 3 summary(data) snp_id gene chromosome distance_from_gene_center RAPT2:9 Min. :3 Min. :-129993 upstream:9 1st Qu.:3 1st Qu.:-128224 Median :3 Median :-126738 Mean :3 Mean :-113806 3rd Qu.:3 3rd Qu.:-123029 Max. :3 Max. : -12818 data$pop7 [1] NA NA NA NA NA NA NA NA NA Notice that it did use snp_id as the header for the first column, but it skips completely al the data from that column, and all the fields are shifted, so the last column is filled with NA values. What I am doing wrong? Can it be a problem of my data files? I have tried to modify them a bit (add new columns, etc..) but it didn't work. I am running R from an Ubuntu system: sessionInfo() R version 2.9.1 (2009-06-26) i486-pc-linux-gnu locale: LC_CTYPE=it_IT.UTF-8;LC_NUMERIC=C;LC_TIME=it_IT.UTF-8;LC_COLLATE=it_IT.UTF-8;LC_MONETARY=C;LC_MESSAGES=it_IT.UTF-8;LC_PAPER=it_IT.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=it_IT.UTF-8;LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base -- Giovanni Dall'Olio, phd student Department of Biologia Evolutiva at CEXS-UPF (Barcelona, Spain) My blog on bioinformatics: http://bioinfoblog.it [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] graph: axis label font
On Mon, Jul 13, 2009 at 5:31 PM, serbringbracard...@email.it wrote: excuse me for my english, i am using R on windows and i have to do several graphs with axis labels and the axis text thicks has a specified font type, (Arial) and a specified font size. How can i do these? Thank you in advance Interesting question, I didn't know the answer to, so I tried to look it up. There might be some help towards the bottom of this page: http://www.statmethods.net/advgraphs/parameters.html It seems to be specific for Windows, so I can't test it myself. -- Michael Knudsen micknud...@gmail.com http://lifeofknudsen.blogspot.com/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Help get this simple function to work...
I have a function (see below). This function has one object, ID. If I run the loops by themselves using a character value (ie,VFFF1-7) instead of the function object, then the loops work fine. However, when I try to insert the character value via the function call, it doesn't work. I don't get an error, but the TotalCover.df dataframe does not update according to the loop criteria. Any obvious problems that you can see? Cover Function# #Define Variables Quadrats.df-unique(data.df$Quadrat) TotalCover.df-cbind(0:750/10,0,0,0,0,0,0) colnames(TotalCover.df)- c(Station,Shrub,Tree,Invasive,Herb,Litter,Bare) Shrub.df-data.df[data.df$Layer==Shrub,] Tree.df-data.df[data.df$Layer==Tree,] Cover.fn-function(ID){ ShrubCover.df-Shrub.df[Shrub.df$Quadrat==ID,] for (j in 1:length(ShrubCover.df[,Quadrat])){ for (i in 1:751){ if(TotalCover.df[i,Station]=ShrubCover.df[j,Start] TotalCover.df[i,Station]= ShrubCover.df[j,Stop]) TotalCover.df[i,Shrub]- 1 } } TreeCover.df-Tree.df[Tree.df$Quadrat==ID,] for (j in 1:length(TreeCover.df[,Quadrat])){ for (i in 1:751){ if(TotalCover.df[i,Station]=TreeCover.df[j,Start] TotalCover.df[i,Station]= TreeCover.df[j,Stop]) TotalCover.df[i,Tree]- 1 } } } Cover.fn(VFFF1-7) -- View this message in context: http://www.nabble.com/Help-get-this-simple-function-to-work...-tp24466533p24466533.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] survSplit with data.frame containing a Surv object
On Mon, 13 Jul 2009, Heinz Tuechler wrote: Dear All, since years I am struggling with Surv objects in data.frames. The following seems to have to do with it. See below the modified example from the help page of survSplit. The original works, as expected. If, however, a Surv object is added to the data.frame, each record gets doubled. Is there some solution other than avoiding Surv objects in data.frames? I think you can modify survSplit so that it will properly handle Surv objects. Change this line: newdata - lapply(data, rep, ntimes + 1) to this: newdata - lapply(data, function(x) { x - as.matrix(x) x[rep(1:nrow(x), ntimes + 1),] }) or something similar that results Surv objects being rep()'ed rowwise rather than elementwise and returned as objects of the right dimension (rather than as a vector). Caveat: This works in the example you give, but I've not tested this extensively. HTH, Chuck Thanks, Heinz require(survival) ## from the help page aml3-survSplit(aml,cut=c(5,10,50),end=time,start=start, event=status,episode=i) summary(aml) summary(aml3) coxph(Surv(time,status)~x,data=aml) ## the same coxph(Surv(start,time,status)~x,data=aml3) ## added to show doubling of records aml.so - aml aml.so$surv.object - with(aml, Surv(time, status)) aml3.so - survSplit(aml.so ,cut=c(5,10,50),end=time,start=start, event=status,episode=i) summary(aml3.so) sessionInfo('survival') R version 2.9.1 Patched (2009-07-07 r48910) i386-pc-mingw32 locale: LC_COLLATE=German_Switzerland.1252;LC_CTYPE=German_Switzerland.1252;LC_MONETARY=German_Switzerland.1252;LC_NUMERIC=C;LC_TIME=German_Switzerland.1252 attached base packages: character(0) other attached packages: [1] survival_2.35-4 loaded via a namespace (and not attached): [1] base_2.9.1 graphics_2.9.1 grDevices_2.9.1 methods_2.9.1 [5] splines_2.9.1 stats_2.9.1 utils_2.9.1 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Charles C. Berry(858) 534-2098 Dept of Family/Preventive Medicine E mailto:cbe...@tajo.ucsd.edu UC San Diego http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Help me get this function to work...
In R a function only returns the last evaluation, so you need to wrap up all of the local results into a list at the end of the function. On Jul 13, 2009, at 1:23 PM, Chip Maney wrote: I have a function (see below). This function has one object, ID. If I run the loops by itself using a character value (ie,VFFF1-7), then the loops work fine. However, when I try to insert the character value via the function call, it doesn't work. I don't get an error, but the TotalCover.df dataframe does not update according to the loop criteria. Any obvious problems that you can see? Cover Function# #Define Variables Quadrats.df-unique(data.df$Quadrat) TotalCover.df-cbind(0:750/10,0,0,0,0,0,0) colnames(TotalCover.df)- c(Station,Shrub,Tree,Invasive,Herb,Litter,Bare) Shrub.df-data.df[data.df$Layer==Shrub,] Tree.df-data.df[data.df$Layer==Tree,] Cover.fn-function(ID){ ShrubCover.df-Shrub.df[Shrub.df$Quadrat==ID,] for (j in 1:length(ShrubCover.df[,Quadrat])){ for (i in 1:751){ if (TotalCover.df[i,Station]=ShrubCover.df[j,Start] TotalCover.df[i,Station]= ShrubCover.df[j,Stop]) TotalCover.df[i,Shrub]- 1 } } TreeCover.df-Tree.df[Tree.df$Quadrat==ID,] for (j in 1:length(TreeCover.df[,Quadrat])){ for (i in 1:751){ if (TotalCover.df[i,Station]=TreeCover.df[j,Start] TotalCover.df[i,Station]= TreeCover.df[j,Stop]) TotalCover.df[i,Tree]- 1 } } # Perhaps something like: list(ShrubCover.df, TreeCover.df, TotalCover.df) } Cover.fn(VFFF1-7) _ right from Hotmail. Check it out. M_WL_QA_HM_celebrity_photos2_072009cat=celebrity __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] how to run a R program with input arguments
I am a beginner in R and know only a little about it yet. I have a script written in R language, named as a.txt for example. I am using a Linux machine, at present I only know that I can type R in the terminal and then copy-paste the content in a.txt to the R's interface to execute the program. However, I want to know if there is any method that allows me to add some input arguments directly after the R in the terminal. Specifically, I want to type the following in the cmd line: R (some flags or option) a.txt then the program will begin to run. Besides, if the program will read a data file first, can I also specify the data file in the command line? Then the complete command will become: R a.txt data.txt This is important for a beginner. Thanks very much!:working: -- View this message in context: http://www.nabble.com/how-to-run-a-R-program-with-input-arguments-tp24465852p24465852.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] object .trPaths not found
Hi, using a variable named .trPaths is a feature and a frequent nuisance of recent Tinn-R and also explained in the Tinn-R docs. Solution 1: In Tinn-R select the menu: R -- Configure -- and then Temporarily or Permanent Solution 2: Manually edit file R-installation-directory/etc/Rconfig.site The minimal line to add is: .trPaths - paste(paste(Sys.getenv('APPDATA'), '\\Tinn-R\\tmp\\', sep=''), c('', 'search.txt', 'objects.txt', 'file.r', 'selection.r', 'block.r', 'lines.r'), sep='') Hope it helps Thomas P. Knut Krueger schrieb: Uwe Ligges schrieb: I have to admit that I have no idea what we are talking about here (yes, I tend to forget many things these days) - and you have not cited the original message, unfortunately (nor have you specifies R versions, Tinn-R versions and both OS versions, but just one) ... Best wishes, Uwe Original question rkevinbur...@charter.net wrote: I am running an R script with Tinn-R (2.2.0.1) and I get the error message Error in source(.trPaths[4], echo = TRUE, max.deparse.length = 150) : object .trPaths not found Any solutions? System R 2.8.1 Tinn-R 2.2.02 windows Xp professional Servicepack 3 This combination works on my Pc without problems And the code is working on the other system with Tinn-R problems with the R-Editor without problems so it seems to be a Tinn_R problem but maybe anybody has a solution for that. Regards Knut __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Help me get this function to work...
On Mon, Jul 13, 2009 at 12:08 PM, David Winsemiusdwinsem...@comcast.net wrote: In R a function only returns the last evaluation, so you need to wrap up all of the local results into a list at the end of the function. SNIP David Winsemius, MD Heritage Laboratories West Hartford, CT How important it is to wrap the list in a return statement, ala return(list(ShrubCover.df, TreeCover.df, TotalCover.df)) or answer - list(ShrubCover.df, TreeCover.df, TotalCover.df) return(answer) Thanks, Mark __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to run a R program with input arguments
See commandArgs() function. On Mon, Jul 13, 2009 at 4:15 PM, edisonying edisonying1...@gmail.comwrote: I am a beginner in R and know only a little about it yet. I have a script written in R language, named as a.txt for example. I am using a Linux machine, at present I only know that I can type R in the terminal and then copy-paste the content in a.txt to the R's interface to execute the program. However, I want to know if there is any method that allows me to add some input arguments directly after the R in the terminal. Specifically, I want to type the following in the cmd line: R (some flags or option) a.txt then the program will begin to run. Besides, if the program will read a data file first, can I also specify the data file in the command line? Then the complete command will become: R a.txt data.txt This is important for a beginner. Thanks very much!:working: -- View this message in context: http://www.nabble.com/how-to-run-a-R-program-with-input-arguments-tp24465852p24465852.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] \dQuote in packages
Thank you! (That was easy to fix.) How does one deal with quoting (in \reference)? The following line causes problems: \references{Sela, Rebecca J., and Simonoff, Jeffrey S., \dQuote{RE-EM Trees: A New Data Mining Approach for Longitudinal Data}.} The error given is: Warning in parse_Rd(./man/predict.Rd, encoding = unknown) : ./man/predict.Rd:28: unknown macro '\dquote' *** error on file ./man/predict.Rd Error : ./man/predict.Rd:28: Unrecognized macro \dquote The manual for writing R packages said I should not just use the character . What should I be using here? Thanks again! Rebecca - Original Message - From: Uwe Ligges lig...@statistik.tu-dortmund.de To: Rebecca Sela rs...@stern.nyu.edu Cc: r-help r-help@r-project.org Sent: Friday, July 10, 2009 8:17:44 PM GMT -05:00 US/Canada Eastern Subject: Re: [R] \dQuote in packages Rebecca Sela wrote: Here is one Rd file with problems, now inline so that it can be read: \name{simpleREEMdata} \docType{data} \alias{simpleREEMdata} \title{Sample Data for RE-EM trees} \description{ This data set is consists of a panel of 50 individuals with 12 observations per individual. The data is based on a regression tree with an initial split based on a dummy variable (\code{D}) and a second split based on time in the branch where \code{D=1}. The observations include both randomly generated individual-specific effects and observation-specific errors. } \format{ The data has 600 rows and 5 columns. The columns are: insert here: \itemize{ \item{\code{Y}}{the target variable} \item{\code{t}}{a numeric predictor (time)} \item{\code{D}}{a catergorical predictor with two levels, 0 and 1} \item{\code{ID}}{the identifier for each individual} \item{\code{X}}{another covariate (which is intentionally unrelated to the target variable)} insert here: } or in other words, you need an itemize environment in order to use \item within \format, see the manual Writing R Extensions. Best, Uwe } \references{Sela, Rebecca J., and Simonoff, Jeffrey S., \dQuote{RE-EM Trees: A New Data Mining Approach for Longitudinal Data}.} \keyword{datasets} Thanks again for your help! Rebecca - Original Message - From: Uwe Ligges lig...@statistik.tu-dortmund.de To: Rebecca Sela rs...@stern.nyu.edu Cc: r-help r-help@r-project.org Sent: Thursday, July 9, 2009 6:05:46 AM GMT -05:00 US/Canada Eastern Subject: Re: [R] \dQuote in packages Rebecca, the attachments have been stripped off by the mailing list. Rebecca Sela wrote: That's good to know. I have attached three Rd files that gave errors (others gave identical errors). I would love to know what is wrong with them. I'm using 2.1.1 because that is what is installed on the Linux computer I have access to. (I haven't bothered figuring out how to assemble a package in Windows.) You should *really* upgrade! That version is outdated for several years now. How to do it on Windows: See the R Installation and Administration manual with its corresponding section. Best, Uwe Thank you for your help! Rebecca - Original Message - From: Uwe Ligges lig...@statistik.tu-dortmund.de To: Rebecca Sela rs...@stern.nyu.edu Cc: r-help r-help@r-project.org Sent: Wednesday, July 8, 2009 6:11:33 PM GMT -05:00 US/Canada Eastern Subject: Re: [R] \dQuote in packages The difference you are experiencing is the new Rd2 parser that is more picky now (but also prevents to produce wrong documentation files). If you make the code of the Rd available, someone might be able to help. Are you really under R-2.1.1 ??? That is really ancient! Best, Uwe Ligges Rebecca Sela wrote: I am in the process of submitting a package to CRAN. R CMD check ran successfully on the package on my local computer, using R version 2.1.1. However, on the computers for CRAN (with version 2.10.0), the following errors occurred: Warning in parse_Rd(./man/predict.Rd, encoding = unknown) : ./man/predict.Rd:28: unknown macro '\dquote' *** error on file ./man/predict.Rd Error : ./man/predict.Rd:28: Unrecognized macro \dquote Warning in parse_Rd(./man/print.Rd, encoding = unknown) : ./man/print.Rd:17: unexpected UNKNOWN '\sideeffects' Warning in parse_Rd(./man/simpleREEMdata.Rd, encoding = unknown) : ./man/simpleREEMdata.Rd:10: unknown macro '\item' Are \dquote, \sideeffects, and \item not supported in newer versions of R? Is there some underlying problem that I should fix that makes these show up? Thank you very much. Rebecca __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __
[R] .RData signature
Is there a way to identify the beginning and end of an .RData file? I am trying to restore lost files on opensolaris where the table of contents is overwritten and will be reading the raw disk byte by byte. I looked at several files in hex and the end seems different, while most of them, but not all begin with RDX2.X Thank you. Stephen C. Bond __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] error: optim(rho, n2ll.rho, method = method, control = control, beta = parm$beta, : initial value in 'vmmin' is not finite
Hi Brandy, The `vmmin' refers to a variable metric algorithm, which is a quasi-Newton method for optimization. This algorithm is used when `method=BFGS' in the optim() call. The quasi-Newton methods iteratively build an approopriate Hessian matrix, which is of dimension p x p, where p is the problem size. In your case the hessian matrix is 4000 x 4000, which BFGS isi unable to handle. If the code for the function lnam() and dependent functions is entirely in R, you can try a different algorithm than BFGS. For example, you can try CG, which can handle high-dimensional optimization. However, before doing that I would make sure that CG works well and gives same answer as BFGS on smaller problems. If CG doesn't cut it, then I would try the spg() function in the BB package. If the code for lnam() is not in R, then I would contact the package author to help you out with incorporating other optimizers for your high-dimensional problem. Hope this helps, Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: rvarad...@jhmi.edu Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.h tml -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Brandy Lee Aven Sent: Thursday, July 09, 2009 11:11 PM To: r-help@r-project.org Subject: [R] error: optim(rho, n2ll.rho, method = method, control = control, beta = parm$beta, : initial value in 'vmmin' is not finite I am trying to use the lnam autocorrelation model from the SNA package. I have it running for smaller adjacency matrices (1,500) it works just fine but when my matrices are bigger 4000+. I get the error: lnam1_01.adj- lnam(data01$adopt,x01,ec2001.csr) Error in optim(rho, n2ll.rho, method = method, control = control, beta = parm$beta, : initial value in 'vmmin' is not finite I have looked at the lnam code and cant even figure out what vmmin is. Is there anyway around this? Am I doing something wrong? What makes me think that its about the size of the adjacency matrix is that I can run the same command on similar objects that are just smaller and it works fine. please help! sessionInfo() R version 2.9.1 (2009-06-26) x86_64-pc-linux-gnu locale: C attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] numDeriv_2009.2-1 sna_2.0-1 loaded via a namespace (and not attached): [1] tools_2.9.1 class(data01$adopt) #This is the response vector y [1] integer data01$adopt[1:10] # Its just a binary outcome for all vertices [1] 0 0 0 0 0 0 0 0 0 0 ..until 4,003 class(x01) #X01 is a matrix of my six covariates for all vertices [1] matrix #here is the an example of the data x01[1:10,1:6] on01 indegree outdegree between eigen numalters01 1 1 0 0 0 1 1 19 1 0 1 0 0 1 123 1 0 1 0 0 1 140 1 0 1 0 0 1 169 1 0 1 0 0 1 189 1 0 1 0 0 1 195 1 0 1 0 0 2 204 1 0 1 0 0 1 231 1 0 2 0 0 1 252 1 0 3 0 0 4 # this is the adjacency matrix (in Sparse matrix format) that causes the error. I have another that is 10,500 and does the same thing. dim(ec2001.csr) [1] 4003 4003 class(ec2001.csr) [1] matrix.csr attr(,package) [1] SparseM ec2001.csr[1:10,1:10] #here is what it looks like 1 19 123 140 169 189 195 204 231 252 1 1 0 0 0 0 0 0 0 0 0 19 0 0 0 1 0 0 0 0 0 0 123 0 0 0 0 0 0 0 0 0 0 140 0 0 0 0 0 0 0 0 0 0 169 0 0 0 0 0 0 0 0 0 0 189 0 0 1 0 0 0 0 0 0 0 195 0 0 0 0 0 0 0 0 0 0 204 0 0 0 0 0 0 0 0 0 0 231 0 0 0 0 0 0 0 0 0 0 252 0 0 0 0 0 0 0 0 0 0 #There are also no infinite values in the objects. is.infinite(x01) [1] FALSE . N is.infinite(data01$adopt) [1] FALSE .N is.infinite(ec2001.csr) [1] FALSE __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide
[R] regression with replication
Dear all, I would like to fit a linear regression with replication (on each year, observation is replicated, e.g 4 times). The independent variable ranges for instance 1-5 year, so I expect to have a linear fit of 5 points. For that purpose I do these (with dummy variables x and y): x-rep(seq(1:5),4) y-rnorm(20) linreg-lm(y~x) fitted.values(linreg) # why produce 20 points of estimate? predict(linreg)# why produce 20 points of estimate? Please somebody explain: 1. why both fitted.values and predict functions produced 20 points of estimate, NOT 5 points. 2. is lm(y~x) correct to solve this regression case, or there's a correct procedure. Many thanks. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Help me get this function to work...
How important it is to wrap the list in a return statement, ala return(list(ShrubCover.df, TreeCover.df, TotalCover.df)) or answer - list(ShrubCover.df, TreeCover.df, TotalCover.df) return(answer) --- Completely Un. Consult the R Docs, especially the R Language Definition manual, for answers to this and your other numerous basic questions. That's what they're there for. -- Bert Gunter Genentech, Inc. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] testing equality of two dependent correlations + normality issue
Hi, I am turning to you with a (hopefully simple?) stats question. I would like to test equality of two correlation coefficients in a setting with three variables X,Y,Z, i.e. equality of r(X,Y) and r(Z,Y). I have found a formula to transform the 2 dependent correlations difference to t-distribution with N-3 df: t = (rxy - rzy)* SQRT[{(n - 3)(1 + rxz)}/ {2(1 - rxy^2 - rxz^2 - rzy^2 + 2rxy*rxz*rzy)}] (Blalock, H., 1972. Social Statistics. NY: McGraw-Hill. Page 406-7). Am actually not sure whether this is exact or approximate (even given normality assumption, the Fisher's Z-transform which this is - I assume - based on, is approximate, right?). But to make it a bit more complicated, Shapiro-Wilks test of normality gives p=0.022 for variable X. Therefore assuming normality may not be safe (justifiable) at all? What do I do then? Do I report this test as assymptotically valid, or do I run some other test? Any ideas? Many thanks in advance, Jaroslav -- View this message in context: http://www.nabble.com/testing-equality-of-two-dependent-correlations-%2B-normality-issue-tp24465768p24465768.html Sent from the R help mailing list archive at Nabble.com. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] re ference
Hello R users, I have developed code in R that generates populations of non-overlapping random polygons (polygonal maps) for testing certain stereological estimation methods. I have tried to look for articles or papers online on the generation of random polygonal maps and have found very little. There seems to be a much greater concern with generating single random polygons rather than whole maps of random polygons. Does anyone have an idea of where I should look or what keywords to use in my search? I really appreciate your patience with R novices like myself. -- View this message in context: http://www.nabble.com/reference-tp24467408p24467408.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Reducing arrays for comparison with each other...
Hi, First up, thanks again for all the help I'm getting on this list. I'm making great headway in analyzing my experimental data on an experiment by experiment basis. No way I could have done this in the time I've done it without your help. This email is partially a question about R but is also soliciting overall guidance if anyone wants to give it. I somehow managed to get an electrical engineering degree decades ago from the University of California system without ever being required to take a statistics class so in that area I'm lacking background. Maybe there's a good subject someone can point me toward. (Keep it simple though!) OK, I've now got dozens of experiments sitting in individual data.frames. In most cases the data.frames look more or less like the example below - Experiment Start time down the left side, Experiment Completion time across the top and results in the cells. The following data started at 830 and is 3 minute data so the first measurement came at 833, then 836, etc. EnTime 836 842 845 848 851 854 900 903 1 833 -3860 -772 -938 -7720 -386 2 8360 -386 00 0 -246 714 -632 3 83900 00 -38600 -772 4 84200 -3860 00 -682 0 5 84500 00 000 0 6 84800 00 -38600 0 7 85100 00 000 0 8 85400 00 00 -386 0 9 85700 00 000 0 10 90000 00 000 0 11 90300 00 00 The issue I'm thinking about now is how to compare experiments when the time increment for each is different? This one is 3 minutes, others are 1 minute, 5 minutes, 13 minutes, etc. My initial thought is to try and roll things up into common sizes, like 30 minute or 60 minute chunks. Granted, I won't get the same number of measurements in each chunk, but it's a start and would yield a data.frame that's easy to view in the Rgui console so I'd be happy with that, but is that a good method in terms of the statistics of the problem? should I be paying attention to the number of experiments, maybe as a percentage of the whole, or something else? I don't know. Anyway, I'd like to try rolling this data up by time increment to start and I'm wondering if there any easy ways to do that? Taking 15 minute increments I would like to get something like EnTime845 900 915 1 845 SUM SUM SUM 2 900 SUM SUM SUM 4 915 SUM SUM SUM where the SUM is the sum of the pieces making up this new cell. I tried using subset but apparently it doesn't work on arrays like this, telling me that EnTime=830 isn't valid for factors. I can certainly find ways to munge the data set, changing EnTime values but that seems like stuff I shouldn'y be doing. that said it's pretty easy to do I suspect. (Modulo 15, etc.) Is there a way to use cast() to grab time increments. Instead of cast(Results, EnTime ~ ExTime, ...) returning by the exact value, can cast collect values together in groups? Any ideas appreciated before I start banging away at doing my less interesting methods. Thanks, Mark __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Help me get this function to work...
On 7/13/2009 3:21 PM, Mark Knecht wrote: On Mon, Jul 13, 2009 at 12:08 PM, David Winsemiusdwinsem...@comcast.net wrote: In R a function only returns the last evaluation, so you need to wrap up all of the local results into a list at the end of the function. SNIP David Winsemius, MD Heritage Laboratories West Hartford, CT How important it is to wrap the list in a return statement, ala return(list(ShrubCover.df, TreeCover.df, TotalCover.df)) or answer - list(ShrubCover.df, TreeCover.df, TotalCover.df) return(answer) Those are almost identical. The only difference is that the second version will leave the local variable answer in the evaluation frame. For most functions the evaluation frame disappears after the return, but sometimes it lives a while longer. For example, you can return it explicitly by putting environment() as one of the return values. More commonly it is returned implicitly as the environment of a function created during evaluation, e.g. buildf - function() { + f - function(x) x + 1 + return(f) + } g - buildf() g(5) [1] 6 ls(environment(g)) [1] f The last command looks in the environment of g, and sees the local variable f there, of which g is a copy. If you had done answer - f return(answer) you'd also see answer there. Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] SweaveListingUtils question
Hello, recently I read about the SweaveListingUtils package and now I want to try it out. However, I can not make it work... Below a minimal example. The problem seems to be the following line (generated by SweaveListingPreparations()?): \ifthenelse{\boolean{swe...@gin}}{\setkeys{gin}{width=0.6\textwidth}}{}% If I comment out this line, it works. What can I do about this? I am using the \begin{Scode} notation instead of Rnw files. May this be the problem? For some technical reason, I would like to stick on this notation. Any help appreciated, kind regards, Karsten. begin example \documentclass[9pt]{beamer} \usepackage{fancyvrb} \usepackage{listings} % choose language and char set \usepackage[ngerman]{babel} \usepackage[ansinew]{inputenc} % \SweaveOpts{prefix.string=Routput/parcel, keep.source=TRUE} \begin{Scode}{results=tex, echo=FALSE} require(SweaveListingUtils) SweaveListingoptions(intermediate = FALSE) SweaveListingPreparations() \end{Scode} \begin{document} \begin{frame}[fragile] \frametitle{Your title} \begin{lstlisting}options(digits=3) set.seed(1) require(Hmisc) age - rnorm(1000,50,10) sex - sample(c('female','male'),1000,TRUE) out - histbackback(split(age, sex), probability=TRUE, xlim=c(-.06,.06), main = 'Back to Back Histogram') #! just adding color barplot(-out$left, col=red , horiz=TRUE, space=0, add=TRUE, axes=FALSE) barplot(out$right, col=blue, horiz=TRUE, space=0, add=TRUE, axes=FALSE) \end{lstlisting} \end{frame} \begin{Scode}{echo=FALSE} unloadNamespace(SweaveListingUtils) \end{Scode} \end{document} % end example [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] .RData signature
On 7/13/2009 3:38 PM, sten...@go.com wrote: Is there a way to identify the beginning and end of an .RData file? I am trying to restore lost files on opensolaris where the table of contents is overwritten and will be reading the raw disk byte by byte. I looked at several files in hex and the end seems different, while most of them, but not all begin with RDX2.X No, there is nothing special written at the end of a file. Many .RData files are compressed (by internal gzip), so the RDX2 header is hidden, and you'll see bytes 1F 8B instead. (I don't think we write the trailer bytes on the gzip stream, but I'm not sure about that.) Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] \dQuote in packages
On 7/13/2009 3:27 PM, Rebecca Sela wrote: Thank you! (That was easy to fix.) How does one deal with quoting (in \reference)? The following line causes problems: \references{Sela, Rebecca J., and Simonoff, Jeffrey S., \dQuote{RE-EM Trees: A New Data Mining Approach for Longitudinal Data}.} The error given is: Warning in parse_Rd(./man/predict.Rd, encoding = unknown) : ./man/predict.Rd:28: unknown macro '\dquote' *** error on file ./man/predict.Rd Error : ./man/predict.Rd:28: Unrecognized macro \dquote The manual for writing R packages said I should not just use the character . What should I be using here? Are you sure you showed us the right line? The error message says \dquote, but the line contains \dQuote. \dquote is wrong, \dQuote is fine. Duncan Murdoch Thanks again! Rebecca - Original Message - From: Uwe Ligges lig...@statistik.tu-dortmund.de To: Rebecca Sela rs...@stern.nyu.edu Cc: r-help r-help@r-project.org Sent: Friday, July 10, 2009 8:17:44 PM GMT -05:00 US/Canada Eastern Subject: Re: [R] \dQuote in packages Rebecca Sela wrote: Here is one Rd file with problems, now inline so that it can be read: \name{simpleREEMdata} \docType{data} \alias{simpleREEMdata} \title{Sample Data for RE-EM trees} \description{ This data set is consists of a panel of 50 individuals with 12 observations per individual. The data is based on a regression tree with an initial split based on a dummy variable (\code{D}) and a second split based on time in the branch where \code{D=1}. The observations include both randomly generated individual-specific effects and observation-specific errors. } \format{ The data has 600 rows and 5 columns. The columns are: insert here: \itemize{ \item{\code{Y}}{the target variable} \item{\code{t}}{a numeric predictor (time)} \item{\code{D}}{a catergorical predictor with two levels, 0 and 1} \item{\code{ID}}{the identifier for each individual} \item{\code{X}}{another covariate (which is intentionally unrelated to the target variable)} insert here: } or in other words, you need an itemize environment in order to use \item within \format, see the manual Writing R Extensions. Best, Uwe } \references{Sela, Rebecca J., and Simonoff, Jeffrey S., \dQuote{RE-EM Trees: A New Data Mining Approach for Longitudinal Data}.} \keyword{datasets} Thanks again for your help! Rebecca - Original Message - From: Uwe Ligges lig...@statistik.tu-dortmund.de To: Rebecca Sela rs...@stern.nyu.edu Cc: r-help r-help@r-project.org Sent: Thursday, July 9, 2009 6:05:46 AM GMT -05:00 US/Canada Eastern Subject: Re: [R] \dQuote in packages Rebecca, the attachments have been stripped off by the mailing list. Rebecca Sela wrote: That's good to know. I have attached three Rd files that gave errors (others gave identical errors). I would love to know what is wrong with them. I'm using 2.1.1 because that is what is installed on the Linux computer I have access to. (I haven't bothered figuring out how to assemble a package in Windows.) You should *really* upgrade! That version is outdated for several years now. How to do it on Windows: See the R Installation and Administration manual with its corresponding section. Best, Uwe Thank you for your help! Rebecca - Original Message - From: Uwe Ligges lig...@statistik.tu-dortmund.de To: Rebecca Sela rs...@stern.nyu.edu Cc: r-help r-help@r-project.org Sent: Wednesday, July 8, 2009 6:11:33 PM GMT -05:00 US/Canada Eastern Subject: Re: [R] \dQuote in packages The difference you are experiencing is the new Rd2 parser that is more picky now (but also prevents to produce wrong documentation files). If you make the code of the Rd available, someone might be able to help. Are you really under R-2.1.1 ??? That is really ancient! Best, Uwe Ligges Rebecca Sela wrote: I am in the process of submitting a package to CRAN. R CMD check ran successfully on the package on my local computer, using R version 2.1.1. However, on the computers for CRAN (with version 2.10.0), the following errors occurred: Warning in parse_Rd(./man/predict.Rd, encoding = unknown) : ./man/predict.Rd:28: unknown macro '\dquote' *** error on file ./man/predict.Rd Error : ./man/predict.Rd:28: Unrecognized macro \dquote Warning in parse_Rd(./man/print.Rd, encoding = unknown) : ./man/print.Rd:17: unexpected UNKNOWN '\sideeffects' Warning in parse_Rd(./man/simpleREEMdata.Rd, encoding = unknown) : ./man/simpleREEMdata.Rd:10: unknown macro '\item' Are \dquote, \sideeffects, and \item not supported in newer versions of R? Is there some underlying problem that I should fix that makes these show up? Thank you very much. Rebecca __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] object .trPaths not found
Thomas Petzoldt schrieb: Hi, using a variable named .trPaths is a feature and a frequent nuisance of recent Tinn-R and also explained in the Tinn-R docs. Solution 1: In Tinn-R select the menu: R -- Configure -- and then Temporarily or Permanent Solution 2: Manually edit file R-installation-directory/etc/Rconfig.site The minimal line to add is: .trPaths - paste(paste(Sys.getenv('APPDATA'), '\\Tinn-R\\tmp\\', sep=''), c('', 'search.txt', 'objects.txt', 'file.r', 'selection.r', 'block.r', 'lines.r'), sep='') Hope it helps no for me - I do not switched to permanent and i did not have the Rconfig.site file edited but it is working So i do not know *why it is working* ;-) I hope yes for the others (I will forward it) on the computers where it is not working Thank you Knut __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] survSplit with data.frame containing a Surv object
At 20:18 13.07.2009, Charles C. Berry wrote: On Mon, 13 Jul 2009, Heinz Tuechler wrote: Dear All, since years I am struggling with Surv objects in data.frames. The following seems to have to do with it. See below the modified example from the help page of survSplit. The original works, as expected. If, however, a Surv object is added to the data.frame, each record gets doubled. Is there some solution other than avoiding Surv objects in data.frames? I think you can modify survSplit so that it will properly handle Surv objects. Change this line: newdata - lapply(data, rep, ntimes + 1) to this: newdata - lapply(data, function(x) { x - as.matrix(x) x[rep(1:nrow(x), ntimes + 1),] }) or something similar that results Surv objects being rep()'ed rowwise rather than elementwise and returned as objects of the right dimension (rather than as a vector). Caveat: This works in the example you give, but I've not tested this extensively. HTH, Chuck Thanks, Heinz require(survival) ## from the help page aml3-survSplit(aml,cut=c(5,10,50),end=time,start=start, event=status,episode=i) summary(aml) summary(aml3) coxph(Surv(time,status)~x,data=aml) ## the same coxph(Surv(start,time,status)~x,data=aml3) ## added to show doubling of records aml.so - aml aml.so$surv.object - with(aml, Surv(time, status)) aml3.so - survSplit(aml.so ,cut=c(5,10,50),end=time,start=start, event=status,episode=i) summary(aml3.so) sessionInfo('survival') R version 2.9.1 Patched (2009-07-07 r48910) i386-pc-mingw32 locale: LC_COLLATE=German_Switzerland.1252;LC_CTYPE=German_Switzerland.1252;LC_MONETARY=German_Switzerland.1252;LC_NUMERIC=C;LC_TIME=German_Switzerland.1252 attached base packages: character(0) other attached packages: [1] survival_2.35-4 loaded via a namespace (and not attached): [1] base_2.9.1 graphics_2.9.1 grDevices_2.9.1 methods_2.9.1 [5] splines_2.9.1 stats_2.9.1 utils_2.9.1 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Charles C. Berry(858) 534-2098 Dept of Family/Preventive Medicine E mailto:cbe...@tajo.ucsd.edu UC San Diego http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901 Thank you Chuck, it seems to work also with my real data, but I noted that in the example aml$x, which is a factor, gets converted to character in aml3.so. Maybe, if I find the time, I should look at as.data.frame.matrix and rbind for Surv objects. Thanks again, Heinz __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Help get this simple function to work...
Your basic problem is that you are trying to change the value of a global object (TotalCover.df) from within a function. A better approach is to pass in the object that you are changing and then return it as the value of the function. You can then assign it back to the original object if you want. I think something like this should work, but the code you have is not executable since some objects are not defined (e.g., data.df): #Define Variables Quadrats.df-unique(data.df$Quadrat) TotalCover.df-cbind(0:750/10,0,0,0,0,0,0) colnames(TotalCover.df)- c(Station,Shrub,Tree,Invasive,Herb,Litter,Bare) Shrub.df-data.df[data.df$Layer==Shrub,] Tree.df-data.df[data.df$Layer==Tree,] Cover.fn-function(ID, TC.df){ ShrubCover.df-Shrub.df[Shrub.df$Quadrat==ID,] for (j in 1:length(ShrubCover.df[,Quadrat])){ for (i in 1:751){ if(TC.df[i,Station]=ShrubCover.df[j,Start] TC.df[i,Station]= ShrubCover.df[j,Stop]) TC.df[i,Shrub]- 1 } } TreeCover.df-Tree.df[Tree.df$Quadrat==ID,] for (j in 1:length(TreeCover.df[,Quadrat])){ for (i in 1:751){ if(TC.df[i,Station]=TreeCover.df[j,Start] TC.df[i,Station]= TreeCover.df[j,Stop]) TC.df[i,Tree]- 1 } } TC.df # return value } Cover.fn(VFFF1-7, TotalCover.df) On Mon, Jul 13, 2009 at 2:06 PM, chipmaneychipma...@hotmail.com wrote: I have a function (see below). This function has one object, ID. If I run the loops by themselves using a character value (ie,VFFF1-7) instead of the function object, then the loops work fine. However, when I try to insert the character value via the function call, it doesn't work. I don't get an error, but the TotalCover.df dataframe does not update according to the loop criteria. Any obvious problems that you can see? Cover Function# #Define Variables Quadrats.df-unique(data.df$Quadrat) TotalCover.df-cbind(0:750/10,0,0,0,0,0,0) colnames(TotalCover.df)- c(Station,Shrub,Tree,Invasive,Herb,Litter,Bare) Shrub.df-data.df[data.df$Layer==Shrub,] Tree.df-data.df[data.df$Layer==Tree,] Cover.fn-function(ID){ ShrubCover.df-Shrub.df[Shrub.df$Quadrat==ID,] for (j in 1:length(ShrubCover.df[,Quadrat])){ for (i in 1:751){ if (TotalCover.df[i,Station]=ShrubCover.df[j,Start] TotalCover.df[i,Station]= ShrubCover.df[j,Stop]) TotalCover.df[i,Shrub]- 1 } } TreeCover.df-Tree.df[Tree.df$Quadrat==ID,] for (j in 1:length(TreeCover.df[,Quadrat])){ for (i in 1:751){ if (TotalCover.df[i,Station]=TreeCover.df[j,Start] TotalCover.df[i,Station]= TreeCover.df[j,Stop]) TotalCover.df[i,Tree]- 1 } } } Cover.fn(VFFF1-7) -- View this message in context: http://www.nabble.com/Help-get-this-simple-function-to-work...-tp24466533p24466533.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] re gression with replication
AgusSusanto wrote: Dear all, I would like to fit a linear regression with replication (on each year, observation is replicated, e.g 4 times). The independent variable ranges for instance 1-5 year, so I expect to have a linear fit of 5 points. For that purpose I do these (with dummy variables x and y): x-rep(seq(1:5),4) y-rnorm(20) linreg-lm(y~x) fitted.values(linreg) # why produce 20 points of estimate? predict(linreg)# why produce 20 points of estimate? Please somebody explain: 1. why both fitted.values and predict functions produced 20 points of estimate, NOT 5 points. Because R doesn't know your regression is replicated. It just knows you gave it 20 pairs of predictors and responses. Perhaps you want predict(linreg,newdata=list(x=unique(x))) ? 2. is lm(y~x) correct to solve this regression case, or there's a correct procedure. If you expected the samples within year to be independent of each other (which seems a little dicey) then this would be correct; otherwise the coefficients will be correct but the residual df etc. will be wrong. You could try something like ymeans - tapply(y,x,mean) xvals - unique(x) lm(ymeans ~ xvals) or a more complicated approach (probably unnecessary here): library(nlme) lme(y~x,random=~1|year) [I think] -- View this message in context: http://www.nabble.com/regression-with-replication-tp24468326p24470213.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] survSplit with data.frame containing a Surv object
On Mon, 13 Jul 2009, Heinz Tuechler wrote: At 20:18 13.07.2009, Charles C. Berry wrote: On Mon, 13 Jul 2009, Heinz Tuechler wrote: Dear All, since years I am struggling with Surv objects in data.frames. The following seems to have to do with it. See below the modified example from the help page of survSplit. The original works, as expected. If, however, a Surv object is added to the data.frame, each record gets doubled. Is there some solution other than avoiding Surv objects in data.frames? I think you can modify survSplit so that it will properly handle Surv objects. Change this line: newdata - lapply(data, rep, ntimes + 1) to this: newdata - lapply(data, function(x) { x - as.matrix(x) x[rep(1:nrow(x), ntimes + 1),] }) or something similar that results Surv objects being rep()'ed rowwise rather than elementwise and returned as objects of the right dimension (rather than as a vector). Caveat: This works in the example you give, but I've not tested this extensively. HTH, Chuck Thanks, Heinz require(survival) ## from the help page aml3-survSplit(aml,cut=c(5,10,50),end=time,start=start, event=status,episode=i) summary(aml) summary(aml3) coxph(Surv(time,status)~x,data=aml) ## the same coxph(Surv(start,time,status)~x,data=aml3) ## added to show doubling of records aml.so - aml aml.so$surv.object - with(aml, Surv(time, status)) aml3.so - survSplit(aml.so ,cut=c(5,10,50),end=time,start=start, event=status,episode=i) summary(aml3.so) sessionInfo('survival') R version 2.9.1 Patched (2009-07-07 r48910) i386-pc-mingw32 locale: LC_COLLATE=German_Switzerland.1252;LC_CTYPE=German_Switzerland.1252;LC_MONETARY=German_Switzerland.1252;LC_NUMERIC=C;LC_TIME=German_Switzerland.1252 attached base packages: character(0) other attached packages: [1] survival_2.35-4 loaded via a namespace (and not attached): [1] base_2.9.1 graphics_2.9.1 grDevices_2.9.1 methods_2.9.1 [5] splines_2.9.1 stats_2.9.1 utils_2.9.1 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Charles C. Berry(858) 534-2098 Dept of Family/Preventive Medicine E mailto:cbe...@tajo.ucsd.edu UC San Diego http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901 Thank you Chuck, it seems to work also with my real data, but I noted that in the example aml$x, which is a factor, gets converted to character in aml3.so. Maybe, if I find the time, I should look at as.data.frame.matrix and rbind for Surv objects. Heinz, Try newdata - lapply(data, function(x) { if (!is.matrix(x)) { rep(x,ntimes +1 ) } else { x - as.matrix(x) x[rep(1:nrow(x), ntimes + 1),]}}) HTH, Chuck Thanks again, Heinz __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Charles C. Berry(858) 534-2098 Dept of Family/Preventive Medicine E mailto:cbe...@tajo.ucsd.edu UC San Diego http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] survSplit with data.frame containing a Surv object
On Mon, 13 Jul 2009, Charles C. Berry wrote: On Mon, 13 Jul 2009, Heinz Tuechler wrote: At 20:18 13.07.2009, Charles C. Berry wrote: On Mon, 13 Jul 2009, Heinz Tuechler wrote: Dear All, since years I am struggling with Surv objects in data.frames. The following seems to have to do with it. See below the modified example from the help page of survSplit. The original works, as expected. If, however, a Surv object is added to the data.frame, each record gets doubled. Is there some solution other than avoiding Surv objects in data.frames? I think you can modify survSplit so that it will properly handle Surv objects. Change this line: newdata - lapply(data, rep, ntimes + 1) to this: newdata - lapply(data, function(x) { x - as.matrix(x) x[rep(1:nrow(x), ntimes + 1),] }) or something similar that results Surv objects being rep()'ed rowwise rather than elementwise and returned as objects of the right dimension (rather than as a vector). Caveat: This works in the example you give, but I've not tested this extensively. HTH, Chuck Thanks, Heinz require(survival) ## from the help page aml3-survSplit(aml,cut=c(5,10,50),end=time,start=start, event=status,episode=i) summary(aml) summary(aml3) coxph(Surv(time,status)~x,data=aml) ## the same coxph(Surv(start,time,status)~x,data=aml3) ## added to show doubling of records aml.so - aml aml.so$surv.object - with(aml, Surv(time, status)) aml3.so - survSplit(aml.so ,cut=c(5,10,50),end=time,start=start, event=status,episode=i) summary(aml3.so) sessionInfo('survival') R version 2.9.1 Patched (2009-07-07 r48910) i386-pc-mingw32 locale: LC_COLLATE=German_Switzerland.1252;LC_CTYPE=German_Switzerland.1252;LC_MONETARY=German_Switzerland.1252;LC_NUMERIC=C;LC_TIME=German_Switzerland.1252 attached base packages: character(0) other attached packages: [1] survival_2.35-4 loaded via a namespace (and not attached): [1] base_2.9.1 graphics_2.9.1 grDevices_2.9.1 methods_2.9.1 [5] splines_2.9.1 stats_2.9.1 utils_2.9.1 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Charles C. Berry(858) 534-2098 Dept of Family/Preventive Medicine E mailto:cbe...@tajo.ucsd.edu UC San Diego http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901 Thank you Chuck, it seems to work also with my real data, but I noted that in the example aml$x, which is a factor, gets converted to character in aml3.so. Maybe, if I find the time, I should look at as.data.frame.matrix and rbind for Surv objects. Heinz, Try newdata - lapply(data, function(x) { if (!is.matrix(x)) { rep(x,ntimes +1 ) } else { Oops. The next line is unneeded: x - as.matrix(x) Chuck x[rep(1:nrow(x), ntimes + 1),]}}) HTH, Chuck Thanks again, Heinz __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Charles C. Berry(858) 534-2098 Dept of Family/Preventive Medicine E mailto:cbe...@tajo.ucsd.edu UC San Diego http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Charles C. Berry(858) 534-2098 Dept of Family/Preventive Medicine E mailto:cbe...@tajo.ucsd.edu UC San Diego http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] nls, reach limit bounds
Hi Uyen, You do not have enough information to estimate 4 parameters in your nonlinear model. Even though you have 12 data points, only 6 are contributing independent information (you essentially have 6 replicate points). If you plot the fittted function you will see that it fits your data really well. However, you will also see that the fitted curve is far from reaching the asymptote. An implication of this is that you cannot reliably estimate b0 and b1 separately. So, you need more data, especially for larger x values in the asymptotic region. Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: rvarad...@jhmi.edu Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.h tml -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of spencerg Sent: Saturday, July 11, 2009 9:50 PM To: UyenThao Nguyen Cc: r-help@r-project.org Subject: Re: [R] nls, reach limit bounds Have you plotted the data? There are two standard sources of non-convergence problems like this: First, there may not be enough information in your data to estimate all four parameters. Second, if that's not the case, then your starting values are not appropriate. I routinely use optim or nlminb to find a sensible solution, because these general purpose optimizers are more likely to converge than nls. To do this, I write a function with a name like SSElogistic to compute the sum of squares of residuals for your model. I like to use optim with hessian=TRUE. Then I compute eigen(fit$hessian, symmetric=TRUE), with fit = the object returned by optim. If the smallest eigenvalue is negative, it says that optim found a saddle point. If the smallest eigenvalue is less than, e.g., 1e-8 times the largest, it says that the smallest eigenvector is very poorly estimated. Round those numbers off grossly to 1 significant digit, and they will likely suggest which parameter you can fix and drop from the model. Hope this helps. Spencer Graves UyenThao Nguyen wrote: Hi, I am trying to fit a 4p logistic to this data, using nls function. The function didn't freely converge; however, it converged if I put a lower and an upper bound (in algorithm port). Also, the b1.A parameter always takes value of the upper bound, which is very strange. Has anyone experienced about non-convergent of nls and how to deal with this kind of problem? Thank you very much. 3 y x 1 0.8924619 -0.31875876 2 1.1814749 -0.21467016 3 1.6148266 0.06069784 4 2.2091363 0.54032947 5 2.7019079 1.04921802 6 3.0679585 1.60745502 9 0.9436973 -0.31875876 10 1.2201133 -0.21467016 11 1.6470043 0.06069784 12 2.2090048 0.54032947 13 2.6864857 1.04921802 14 3.0673523 1.60745502 new.cont=nls.control(maxiter = 1, tol = 1e-05, minFactor = 1e-08, printEval = FALSE, warnOnly = FALSE) b0.A=.9*min(DAT$y) b1.A=1.1*max(DAT$y)-b0.A b2.A=-1*mean(DAT$x) b3.A=1 b0.A b1.A b2.A b3.A nls.mdl.A=nls(y~b0 + b1/(1+exp(-b2-b3*x)),data=DAT,start = list(b0=b0.A, b1=b1.A, b2=b2.A, b3=b3.A), lower=-10, upper=10, algorithm=port,trace=T,control=new.cont) ## [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to combine two rows (in a dataframe) into a third row?
Hi Henrique other R-helpers, Thank you for helping me last week. I used Henrique's suggestion to develop some code (below) to combine two rows in my dataframe into a third row, and then delete the original two rows. It works well. My solution is not very elegant however; if there's a function (or a better way) to accomplish this in 1-2 lines (rather than my 6) I'd appreciate knowing about it. Many thanks, Mark Na #make some data for this example data-data.frame(c(1A,1B),c(10,15)) names(data)-c(id,value) data$value-as.numeric(as.character(data$value)) #combine two lines into one by summing their values in the value column fixed-data.frame() #create empty data frame to hold fixed rows fixed-rbind(fixed, aggregate(data[value],list(substr(data[,id],1,1)), sum)) #copy previous line as necessary for other fixes names(fixed)-c(id,value) #fix column names #bind the fixed line to the main dataframe and delete the original lines data-rbind(data,fixed) #add fixed lines to data data-data[-which(c(1A,1B) %in% data$id),] #delete lines from data rownames(data) - 1:nrow(data) #renumber rows On Thu, Jul 9, 2009 at 5:58 PM, Henrique Dallazuanna www...@gmail.comwrote: Try this: aggregate(x[VALUE], list(substr(x[,ID], 1, 1)), sum) On Thu, Jul 9, 2009 at 7:27 PM, Mark Na mtb...@gmail.com wrote: Dear R-helpers, I have two rows in my dataframe: IDVALUE 1A10 1B15 and I would like to combine these two rows into a single (new) row in my dataframe: IDVALUE 125 ...simply by specifying a new value for ID and summing the two VALUES. I have been trying to do this with with rbind, but it's not working. I'd appreciate any pointers. Thanks, Mark Na [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Fortran function for quantiles
Hi, I was wondering whether there is any Fortran function or associated library for evaluating the quantiles of a set of values (something which the R-function quantile() does). Any help will be much appreciated. Thanks and regards, Dhiman Bhadra [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Heckman Selection MOdel Help in R
On Mon, Jul 13, 2009 at 4:26 PM, saurav pathakpathak.sau...@gmail.com wrote: I am using R 2.9.1, That's good! I am not sure about the version of sampleSelection and maxLik This is important! Please check the version numbers, e.g. with R help(package=maxLik) R help(package=sampleSelection) BTW: Did you install the development version of the maxLik package from R-Forge? If yes, please use the stable version that is available on CRAN. Let em explain, In my data the DV used as 's' in the formula has some missing values, which can lead to bias in our case, so I used adpopdata$s - ifelse(is.na(Ln-oy5_1),0,1) ie I convert all the missing values for the variable ln_oy5_1 to 0 and all non-missings as 1, so is the source of missing values from the IVs used in the following: myProbit- glm(s ~ age + gender + gemedu + gemhinc + es_gdppc + + imf_pop + estbbo_m, family = binomial(link = probit)) so I dont know where the missing values are coming from, can you suggest how to correct it?? They might come from the explanatory variables. Please check, e.g. with R sum( is.na( adpopdata$s ) ) R sum( is.na( adpopdata$age ) ) R sum( is.na( adpopdata$gender ) ) ... R sum( is.na( adpopdata$estbbo_m ) ) Please reply to all (i.e. including R-help) so that others who will have similar questions and problems in the future could benefit from our discussion. Arne On Mon, Jul 13, 2009 at 3:09 PM, Arne Henningsen arne.henning...@googlemail.com wrote: On Mon, Jul 13, 2009 at 11:18 AM, Pathak, Sauravs.patha...@imperial.ac.uk wrote: Dear Arne I have gone through the paper and I have tried it at my end, I would really appreciate if you could address the following: 1. Based upon your suggestion I used the following: regmod2 - selection(s ~ age + gender + gemedu + gemhinc + es_gdppc + imf_pop + estbbo_m, ln_oy5_1 ~ age+ gender+fearfail+gemedu, adpopdata, method = 2step) On trying the above( notice that I have changed heckit to selection in the above command, i get the following error message Error in coef.probit(result$probit) : could not find function coef.maxLik That's weird. Which versions of R, sampleSelection, and maxLik do you use? Before trying the above I tried the following: 2. When I tried to do the Heckman selection model in stages , the first run was successful, I mean, using the following: myProbit- glm(s ~ age + gender + gemedu + gemhinc + es_gdppc + + imf_pop + estbbo_m, family = binomial(link = probit)) summary(myProbit) I am successful upto this point, but 3. When I try calculating the IMR using the following: adpopdata$IMR-invMillsRatio(myProbit)$IMR1 I get the error below Error in `$-.data.frame`(`*tmp*`, IMR, value = c(2.50039945424535, : replacement has 257358 rows, data has 343251 I guess that you have some NAs in the data so that you have the IMRs not for all observations but only for the observations witout NAs. R myIMRs - invMillsRatio(myProbit)$IMR1 should work. Is there a code to calculate IMR by hand?? Yes, inside invMillsRatio() However, why do you want to do this? what I see is that the number of rows of IMR calculated and the number of rows in the actual data set do not match (may be some missing value issues, I am not sure, if it is, how to fix it?) and hence IMR could not be added to my original data set, how do I fix this and then proceed to get correct IMR to use in my outcome equation (the OLS stage) This is really taking a lot of time, I am working on it for weeks, can you please help me kindly, If you wish I can send you the data set as well Please try to fix it yourself. Arne -Original Message- From: Arne Henningsen [mailto:arne.henning...@googlemail.com] Sent: 13 July 2009 00:56 To: Pathak, Saurav; r-help@r-project.org; otoo...@ut.ee Subject: Re: Heckman Selection MOdel Help in R Hi Saurav! On Sun, Jul 12, 2009 at 6:06 PM, Pathak, Sauravs.patha...@imperial.ac.uk wrote: I am new to R, I have to do a 2 step Heckman model, my selection equation is below which I was successful in running but I am unable to proceed further, I have so far used the following command glm(formula = s ~ age + gender + gemedu + gemhinc + es_gdppc + imf_pop + estbbo_m, family = binomial(link = probit)) My question is 1. How do i discard the non significant selection variables (one out of the seven variables above is non-significant) and calculate the Inverse Mills Ratio of the significant variables 2. I need the inverse mills ratio from the above to run the outcome equation model using OLS with the Inverse mills ratio calculated on the basis of the above probit as the control in my outcome equation, hence I need to get the IMR (Is there another direct way?) 3. How can this be done in R using my concept or otherwise does there exist another way of doing what I wish to achieve On trying
[R] Rgraphviz ignoring outputorder attribute
Hi - since Rgraphviz was officially moved over to Bioconductor, this might be a misguided post, but it's worth a shot: I'm plotting a graph using Rgraphviz and in an attempt to force the edges to be behind the nodes (on a fairly complex graph), I set the outputorder graph attribute to edgesfirst (as is described in the graphviz docs here - http://www.graphviz.org/doc/info/attrs.html). This field does also appear (although with a short and cryptic description) in the Rgraphviz package documentation. However I still will get some stray edges that traverse over the nodes. (And yes, I've set the fillcolor to a non-transparent color.) (Also, yes, I'm setting the rest of the graph/edge/node properties correctly as all of my other attribute settings are being honored by Rgraphviz.) The problem occurs with multiple devices including pdf(), png(), and X11(). Anyone else ever have edges on top of nodes and find outputorder ignores your request? -Murat [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to run a R program with input arguments
edisonying wrote: I am a beginner in R and know only a little about it yet. I have a script written in R language, named as a.txt for example. I am using a Linux machine, at present I only know that I can type R in the terminal and then copy-paste the content in a.txt to the R's interface to execute the program. However, I want to know if there is any method that allows me to add some input arguments directly after the R in the terminal. Specifically, I want to type the following in the cmd line: R (some flags or option) a.txt then the program will begin to run. Besides, if the program will read a data file first, can I also specify the data file in the command line? Then the complete command will become: R a.txt data.txt This is important for a beginner. Thanks very much!:working: We usually have R execute scripts in the following manner: R --vanilla --slave --args [your args here] scriptFile.R Inside scriptFile.R the arguments can be retrieved by using commandArgs(trailingOnly = T). The trailingOnly flag causes only the arguments following --args to be be returned and not --args, --vanilla (no save, no restore, quick startup) and --slave (makes R run quiet). Personally, I get tired of typing R --vanilla --slave --args all the time and prefer to use Rscript. Since you are using Linux you can do the same by putting the following hashbang at the top of your file: #!/usr/bin/env Rscript Then your script can be run using: ./scriptFile.R [your args here] The arguments are still accessed inside the script using commandArgs(T) Good luck! -Charlie - Charlie Sharpsteen Undergraduate Environmental Resources Engineering Humboldt State University -- View this message in context: http://www.nabble.com/how-to-run-a-R-program-with-input-arguments-tp24465852p24471559.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] SweaveListingUtils question
Hello group, recently I read about the SweaveListingUtils package and now I would like to try it out. However I can not make it run... Below is a minimal example. The problem seems the following line, generated by the package: \ifthenelse{\boolean{swe...@gin}}{\setkeys{gin}{width=0.6\textwidth}}{}% If I comment it out, it works. What can I do about it? I am not sure if this is the right mailing list for this question since it touches both R and LaTeX. Sorry if I am wrong... Any hint appreciated, kind regards, Karsten. %% begin example \documentclass[9pt]{beamer} \usepackage{fancyvrb} \usepackage{listings} % choose language and char set \usepackage[ngerman]{babel} \usepackage[ansinew]{inputenc} \SweaveOpts{prefix.string=Routput/parcel, keep.source=TRUE} \begin{Scode}{results=tex, echo=FALSE} require(SweaveListingUtils) SweaveListingoptions(intermediate = FALSE) SweaveListingPreparations() \end{Scode} \begin{document} \begin{frame}[fragile] \frametitle{Your title} \begin{lstlisting}options(digits=3) set.seed(1) require(Hmisc) age - rnorm(1000,50,10) sex - sample(c('female','male'),1000,TRUE) out - histbackback(split(age, sex), probability=TRUE, xlim=c(-.06,.06), main = 'Back to Back Histogram') #! just adding color barplot(-out$left, col=red , horiz=TRUE, space=0, add=TRUE, axes=FALSE) barplot(out$right, col=blue, horiz=TRUE, space=0, add=TRUE, axes=FALSE) \end{lstlisting} \end{frame} \begin{Scode}{echo=FALSE} unloadNamespace(SweaveListingUtils) \end{Scode} \end{document} %% end example [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Smoothing parameter or Bandwidth (h)
Hello to everybody, I need to know the Smoothing Parameter to obtain Home Range of an animal through the Area Kernel. I have 200 locations with x and y. How can I obtain the Smoothing Parameter with R for LSCV, CV and Href method??? Thank you. -- View this message in context: http://www.nabble.com/Smoothing-parameter-or-Bandwidth-%28h%29-tp24470439p24470439.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] problem predict/poly
Dear R experts, I am observing undesired behavior of predict(fit, newdata), in case when fit object is produced by lm() involving a poly(). Here is how to reproduce: x - c(1:10) y - sin(c(1:10)) fit - lm(formula=y~poly(x, 5, raw=TRUE)) predict(fit, newdata=data.frame(x=c(1:10))) ## this works predict(fit, newdata=data.frame(x=c(1:1))) ## this is broken, error below Error in poly(x, 5, raw = TRUE) : 'degree' must be less than number of unique points The problem is in poly(): if (raw) { if (degree = length(unique(x))) stop('degree' must be less than number of unique points) This assertion is only warranted when poly is used to fit a model. But it is unnecessary and undesired if poly() is used to obtain prediction. Or am I missing something? Why I would want to obtain model prediction for a single point is another story. I am filling a matrix of model values one element at a time, so that the matrix can then be given to persp() (I am fitting a two-variable model). Alex Stolpovsky - This transmission may contain information that is privileged, confidential, legally privileged, and/or exempt from disclosure under applicable law. If you are not the intended recipient, you are hereby notified that any disclosure, copying, distribution, or use of the information contained herein (including any reliance thereon) is STRICTLY PROHIBITED. Although this transmission and any attachments are believed to be free of any virus or other defect that might affect any computer system into which it is received and opened, it is the responsibility of the recipient to ensure that it is virus free and no responsibility is accepted by JPMorgan Chase Co., its subsidiaries and affiliates, as applicable, for any loss or damage arising in any way from its use. If you received this transmission in error, please immediately contact the sender and destroy the material in its entirety, whether in electronic or hard copy format. Thank you. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to run a R program with input arguments
Thanks for your help. I found another cmd as follows: R CMD BATCH infile outfile what's the difference between this expression and your suggestion? cls59 wrote: edisonying wrote: I am a beginner in R and know only a little about it yet. I have a script written in R language, named as a.txt for example. I am using a Linux machine, at present I only know that I can type R in the terminal and then copy-paste the content in a.txt to the R's interface to execute the program. However, I want to know if there is any method that allows me to add some input arguments directly after the R in the terminal. Specifically, I want to type the following in the cmd line: R (some flags or option) a.txt then the program will begin to run. Besides, if the program will read a data file first, can I also specify the data file in the command line? Then the complete command will become: R a.txt data.txt This is important for a beginner. Thanks very much!:working: We usually have R execute scripts in the following manner: R --vanilla --slave --args [your args here] scriptFile.R Inside scriptFile.R the arguments can be retrieved by using commandArgs(trailingOnly = T). The trailingOnly flag causes only the arguments following --args to be be returned and not --args, --vanilla (no save, no restore, quick startup) and --slave (makes R run quiet). Personally, I get tired of typing R --vanilla --slave --args all the time and prefer to use Rscript. Since you are using Linux you can do the same by putting the following hashbang at the top of your file: #!/usr/bin/env Rscript Then your script can be run using: ./scriptFile.R [your args here] The arguments are still accessed inside the script using commandArgs(T) Good luck! -Charlie -- View this message in context: http://www.nabble.com/how-to-run-a-R-program-with-input-arguments-tp24465852p24472117.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.