Re: [R] Replace columns in a data.frame randomly splitted
The thing is that I've already been working with df1, and I was looking for a function that could replace values knowing rows and columns. Does it exist? Thank you, u...@host.com -- View this message in context: http://r.789695.n4.nabble.com/Replace-columns-in-a-data-frame-randomly-splitted-tp4122926p4127405.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
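[Such a function exists in base R's indexing: subscripting a data frame with a two-column matrix of (row, column) positions replaces exactly those cells in one call. A minimal sketch, with a made-up data frame and positions for illustration:

```r
## hypothetical data; any (row, col) pairs work the same way
df1 <- data.frame(a = 1:4, b = 5:8, c = 9:12)
rows <- c(1, 3)
cols <- c(2, 3)
df1[cbind(rows, cols)] <- NA  # replaces df1[1, 2] and df1[3, 3]
```
]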
Re: [R] Error adding Bigmemory package
On 30.11.2011 21:53, RogerP wrote:

I am trying to add the bigmemory package but I get the following error message:

In file included from bigmemory.cpp:14:0: ../inst/include/bigmemory/isna.hpp: In function 'bool neginf(double)': ../inst/include/bigmemory/isna.hpp:22:57: error: 'isinf' was not declared in this scope make: *** [bigmemory.o] Error 1 ERROR: compilation failed for package 'bigmemory' * removing '/usr/local/lib/R/library/bigmemory' * restoring previous '/usr/local/lib/R/library/bigmemory' The downloaded packages are in '/tmp/RtmpNrkt2a/downloaded_packages' Updating HTML index of packages in '.Library' Warning message: In install.packages(bigmemory) : installation of package 'bigmemory' had non-zero exit status

install.packages(gsl) Error in install.packages(gsl) : object 'gsl' not found install.packages(gnulib) Error in install.packages(gnulib) : object 'gnulib' not found

I am using the GNU compiler version 4.5.1 from Blastwave on a Solaris 10 box. I can't seem to find out how to compile bigmemory successfully. I can compile and add other packages.

See http://cran.r-project.org/web/checks/check_results_bigmemory.html and find that it also fails the Solaris-based CRAN checks. Hence please report to the package maintainer directly, since reports from CRAN maintainers and the CRAN results are obviously ignored. Best, Uwe Ligges

roger -- View this message in context: http://r.789695.n4.nabble.com/Error-adding-Bigmemory-package-tp4124863p4124863.html
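[As an aside, the later errors (object 'gsl' not found) are unrelated to the compiler problem: install.packages() expects the package name as a quoted string, otherwise R tries to evaluate it as a variable. Also, gnulib is a GNU C library, not an R package, so it is not on CRAN at all. A one-line sketch:

```r
## quoted string, not a bare name
install.packages("gsl")
```
]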
[R] References for book R In Action by Kabacoff
I know this is not really an R question - it is a query about a recent book on R (R In Action) by Robert Kabacoff (Manning Publications, 2011). There are many references to interesting topics in R in the book, BUT I do not find a bibliography/list of references in the book! Does anybody know if errata for the book are available some place? Thanks, Ravi -- View this message in context: http://r.789695.n4.nabble.com/References-for-book-R-In-Action-by-Kabacoff-tp4127625p4127625.html
Re: [R] References for book R In Action by Kabacoff
On 01.12.2011 10:10, Ravi Kulkarni wrote: I know this is not really an R question - it is a query about a recent book on R (R In Action) by Robert Kabacoff, (Manning Publications 2011). There are many references to interesting topics in R in the book, BUT, I do not find a bibliography/list of references in the book! Does anybody know if there are errata for the book available some place? I'd ask the author! Uwe Ligges Thanks, Ravi
[R] Odp: R code
Hi

Hi everybody, I am unable to resolve this error using the following for loop. Would appreciate help. The same loop works with for(i in 1:92), strangely. I checked the .raw input file and everything is kosher, even line 547 mentioned in the error message. I wonder if there is any problem with the paste function.

I do not believe it. In that case the message would be that R cannot open the file. You speak about one input file but actually you are reading 93 files. Which one is wrong? Did you check that one? And how did you check it? There could be some unseen characters in this row. Without some more info I doubt you will get any better advice. Petr

Thanks very much in advance.

for(i in 1:93) {
  inputdta <- read.table(file = gsub(" ", "", paste("JSR_network_[", i, "].RAW"), fixed = TRUE),
                         header = TRUE)
  colnames(inputdta) <- c("jsrid", "firmid", "jsrstart", "injsr")
  print(nrow(inputdta))
}

Message: Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : line 547 did not have 4 elements -- View this message in context: http://r.789695.n4.nabble.com/R-code-tp4125904p4125904.html
Re: [R] how big (in RAM and/or disk storage) is each of these objects in a list?
On 26.11.2011 20:36, John wrote:

On Sat, 26 Nov 2011 12:41:08 -0600 Paul Johnson <pauljoh...@gmail.com> wrote:

Greetings, friends (and others :) ) We generated a bunch of results and saved them in an RData file. We can open, use, all is well, except that the size of the saved file is quite a bit larger than we expected. I suspect there's something floating about in there that one of the packages we are using puts in, such as a spare copy of a data frame that is saved in some subtle way that has escaped my attention. Consider a list of objects. Are there ways to do these things:

1. Ask R how much memory is used by the things inside the list?

A bit late, but hopefully still helpful: See ?object.size and particularly note "Associated space (e.g. the environment of a function and what the pointer in a 'EXTPTRSXP' points to) is not included in the calculation."

2. Does as.expression(anObject) print everything in there? Or, is there a better way to convert each thing to text or some other format that you can actually read line by line to see what is in there, to see everything?

See ?dump and also carefully read the notes.

If there's no giant hidden data frame floating about, I figure I'll have to convert symmetric matrices to lower triangles or such to save space. Unless R already is automatically saving a matrix in that way but just showing me the full matrix, which I suppose is possible. If you have other ideas about general ways to make saved objects smaller, I'm open for suggestions.

Look carefully for environments attached to one or more of the objects. Best, Uwe Ligges

As an initial step, what is the result of running ls() with your RData file loaded? You should get a list of what is in memory. Using RData files can be as space-efficient or costly as the user's habits. Did you use save() or the save.image() command to produce the file?
The save.image() command stashes what is in memory, and if you've run a number of experimental procedures that did not pan out and you did not discard the results with rm(), they were saved to the .RData file along with the information you did want, a procedure rather like filing away all your work in a file drawer and then emptying the waste basket into the drawer as well. If you save the data with ascii = TRUE as an option, you can troll through the file and read what you saved. JWD
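[Building on JWD's suggestion, a small sketch of auditing an .RData file from a fresh session; the file name is hypothetical, and object.size() carries the caveats Uwe quotes above:

```r
e <- new.env()
load("results.RData", envir = e)                 # hypothetical file name
sizes <- sapply(ls(e), function(nm) object.size(get(nm, envir = e)))
sort(sizes, decreasing = TRUE)                   # largest objects first, in bytes
```
]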
Re: [R] R code
The problem might be a missing quote. Try quote = "". Or it might be a comment character, so try comment.char = "". This happens a lot if your data is not clean. Sent from my iPad

On Nov 30, 2011, at 19:48, nsaraf <nsa...@gmail.com> wrote: )
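[Another way to narrow this down, not mentioned in the thread: count.fields() reports how many fields scan() sees on each line, so it can identify both the offending file and the offending rows. The filename pattern below is a guess based on the poster's paste() call:

```r
for (i in 1:93) {
  f <- gsub(" ", "", paste("JSR_network_[", i, "].RAW"), fixed = TRUE)
  n <- count.fields(f)                      # fields per line, whitespace-separated
  if (any(n != 4)) cat(f, "bad lines:", which(n != 4), "\n")
}
```
]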
Re: [R] Drawing ticks in the 3rd and 4th row of a lattice
Dear Mr Sarkar, My apologies for not being articulate enough. Your solution worked perfectly. Many thanks, Ashim

On Tue, Nov 22, 2011 at 9:12 PM, Deepayan Sarkar <deepayan.sar...@gmail.com> wrote:

On Fri, Nov 18, 2011 at 11:22 AM, Ashim Kapoor <ashimkap...@gmail.com> wrote: Dear all, I want to draw ticks on the 3rd and 4th row of a lattice. How do I do this? In my search of the help, I discovered a parameter "alternating", which kind of says where the ticks will be but does not suffice for me.

You need to explain more clearly what you want. Your plot does already have ticks in the 3rd and 4th row. If you need the x-axes labeled in each row, you need to have something like

scales = list(x = list(relation = "free", rot = 45, ...

-Deepayan

I am running this command:

barchart(X03/1000 ~ time | Company, data = df1[which(df1$time != 1), ], horiz = F,
         scales = list(x = list(rot = 45, labels = paste("Mar", c("07", "08", "09", "10", "11")))),
         par.strip.text = list(lineheight = 1, lines = 2, cex = .75),
         ylab = "In Rs. Million", xlab = "", layout = c(3, 4), as.table = T, between = list(y = 1))

where my data is:

dput(df1)
structure(list(Company = structure(c(9L, 7L, 1L, 6L, 8L, 4L,
2L, 5L, 11L, 10L, 9L, 7L, 1L, 6L, 8L, 4L, 2L, 5L, 11L, 10L, 9L, 7L, 1L, 6L,
8L, 4L, 2L, 5L, 11L, 10L, 9L, 7L, 1L, 6L, 8L, 4L, 2L, 5L, 11L, 10L, 9L, 7L,
1L, 6L, 8L, 4L, 2L, 5L, 11L, 10L, 9L, 7L, 1L, 6L, 8L, 4L, 2L, 5L, 11L, 10L),
.Label = c("Bharat Petroleum Corpn. Ltd.", "Chennai Petroleum Corpn. Ltd.",
"Company Name", "Essar Oil Ltd.", "Hindalco Industries Ltd.",
"Hindustan Petroleum Corpn. Ltd.", "Indian Oil Corpn. Ltd.",
"Mangalore Refinery Petrochemicals Ltd.", "Reliance Industries Ltd.",
"Steel Authority Of India Ltd.", "Sterlite Industries (India) Ltd."),
class = "factor"), time = c(7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 8,
8, 8, 8, 8, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 10, 10, 10, 10, 10, 10, 10, 10,
10, 10, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1),
X03 = c(722931.1, 751620.5, 304456.3, 294868.9, 192712.6, 36695.4, 188313.4,
98954.9, 100088.7, 72379.9, 848517.5, 864562.2, 347310.9, 301022.1, 253514.5,
165661.6, 206377.7, 108897, 109336.3, 71207.6, 1003504.6, 1145993.8, 392261.5,
341086, 289737.4, 359837.2, 252964.3, 90036.2, 90474.8, 127623.2, 1411082.1,
907480.4, 364637.5, 290915.7, 255397.4, 328557.2, 202855.3, 118725.4, 116647.6,
106254.9, 1772254.7, 1204856.9, 469935.6, 313527.6, 320131.1, 384323.5,
260813.9, 137403.3, 137238.5, 136888.4, 1151658, 974902.76, 375720.36,
308284.06, 262298.6, 255014.98, 64.92, 110803.36, 110757.18, 102870.8)),
row.names = c("Reliance Industries Ltd..7", "Indian Oil Corpn. Ltd..7",
"Bharat Petroleum Corpn. Ltd..7", "Hindustan Petroleum Corpn. Ltd..7",
"Mangalore Refinery Petrochemicals Ltd..7", "Essar Oil Ltd..7",
"Chennai Petroleum Corpn. Ltd..7", "Hindalco Industries Ltd..7",
"Sterlite Industries (India) Ltd..7", "Steel Authority Of India Ltd..7",
"Reliance Industries Ltd..8", "Indian Oil Corpn. Ltd..8",
"Bharat Petroleum Corpn. Ltd..8", "Hindustan Petroleum Corpn. Ltd..8",
"Mangalore Refinery Petrochemicals Ltd..8", "Essar Oil Ltd..8",
"Chennai Petroleum Corpn. Ltd..8", "Hindalco Industries Ltd..8",
"Sterlite Industries (India) Ltd..8", "Steel Authority Of India Ltd..8",
"Reliance Industries Ltd..9", "Indian Oil Corpn. Ltd..9",
"Bharat Petroleum Corpn. Ltd..9", "Hindustan Petroleum Corpn. Ltd..9",
"Mangalore Refinery Petrochemicals Ltd..9", "Essar Oil Ltd..9",
"Chennai Petroleum Corpn. Ltd..9", "Hindalco Industries Ltd..9",
"Sterlite Industries (India) Ltd..9", "Steel Authority Of India Ltd..9",
"Reliance Industries Ltd..10", "Indian Oil Corpn. Ltd..10",
"Bharat Petroleum Corpn. Ltd..10", "Hindustan Petroleum Corpn. Ltd..10",
"Mangalore Refinery Petrochemicals Ltd..10", "Essar Oil Ltd..10",
"Chennai Petroleum Corpn. Ltd..10", "Hindalco Industries Ltd..10",
"Sterlite Industries (India) Ltd..10", "Steel Authority Of India Ltd..10",
"Reliance Industries Ltd..11", "Indian Oil Corpn. Ltd..11",
"Bharat Petroleum Corpn. Ltd..11", "Hindustan Petroleum Corpn. Ltd..11",
"Mangalore Refinery Petrochemicals Ltd..11", "Essar Oil Ltd..11",
"Chennai Petroleum Corpn. Ltd..11", "Hindalco Industries Ltd..11",
"Sterlite Industries (India) Ltd..11", "Steel Authority Of India Ltd..11",
"Reliance Industries Ltd..1", "Indian Oil Corpn. Ltd..1",
"Bharat Petroleum Corpn. Ltd..1", "Hindustan Petroleum Corpn. Ltd..1",
"Mangalore Refinery Petrochemicals Ltd..1", "Essar Oil Ltd..1",
"Chennai Petroleum Corpn. Ltd..1", "Hindalco Industries Ltd..1",
"Sterlite Industries (India) Ltd..1", "Steel Authority Of India Ltd..1"),
.Names = c("Company", "time", "X03"), reshapeLong = structure(list(
varying = structure(list(X03 = c("X03.07", "X03.08", "X03.09", "X03.10",
"X03.11", "X03.1")), .Names = "X03", v.names = "X03", times = c(7, 8, 9,
10, 11, 1)), v.names = "X03", idvar = "Company", timevar = "time"),
.Names = c("varying", "v.names", "idvar", "timevar")), class = "data.frame")
Re: [R] Non-finite finite-difference value error in eha's aftreg
Le mercredi 30 novembre 2011 à 18:41 +0100, Göran Broström a écrit :

On Wed, Nov 16, 2011 at 3:06 PM, Milan Bouchet-Valat <nalimi...@club.fr> wrote: Hi list! I'm getting an error message when trying to fit an accelerated failure time parametric model using the aftreg() function from package eha:

Error in optim(beta, Fmin, method = "BFGS", control = list(trace = as.integer(printlevel)), : non-finite finite-difference value [2]

This only happens when adding four specific covariates at the same time in the model (see below). I understand that kind of problem can come from too high correlations between my covariates, but is there anything I can do to avoid it?

Yes, remove one (or more) covariate(s). The design matrix is almost certainly singular.

Ah, thanks, that's what I feared. ;-) What do you mean, though, by almost? Is there a way to measure that?

Does something need to be improved in aftreg.fit?

Yes, error messages. I'll look at it.

Cool. It's always good to get a clear message saying that you're asking for something impossible, so that you don't try to fix it (like I did). Regards

My data set consists of 34,505 observations (years) of 2,717 individuals, which seems reasonable to me for fitting a complex model like that (covariates are all factors with fewer than 10 levels). I can send it by private mail if somebody wants to help debug this. The details of the model and errors follow, but feel free to ask for more testing. I'm using R 2.13.1 (x86_64-redhat-linux-gnu), eha 2.0-5 and survival 2.36-9. Thanks for your help!
m <- aftreg(Surv(start, end, event) ~ homo1 + sexego + dipref1 + t.since.school.q,
            data = ms, dist = "loglogistic", id = ident)
Error in optim(beta, Fmin, method = "BFGS", control = list(trace = as.integer(printlevel)), : non-finite finite-difference value [2]
Calls: aftreg -> aftreg.fit -> aftp0 -> optim

traceback()
4: optim(beta, Fmin, method = "BFGS", control = list(trace = as.integer(printlevel)), hessian = TRUE)
3: aftp0(printlevel, ns, nn, id, strata, Y, X, offset, dis, means)
2: aftreg.fit(X, Y, dist, strats, offset, init, shape, id, control, center)
1: aftreg(Surv(start, end, event) ~ homo1 + sexego + dipref1 + t.since.school.q, data = ms, dist = "loglogistic", id = ident)

-- Göran Broström
Re: [R] x, y for point of intersection
Monica has sent me some data and code for taking a quick look. As it turned out, there was a simple programming error on her side. The segm_distance() function in package 'pracma' is working correctly. And there is no minimization procedure in here; it simply solves some equations from plane geometry. Maybe the function suggested by Don is faster, I haven't checked that. Regards, Hans Werner

2011/11/29 Monica Pisica <pisican...@hotmail.com>

Hi again, Working with my real data and not with the little example I sent to the list, I discovered that the segm_distance function from package pracma does not converge to 0 in all cases, even if I increase the number of iterations to 10,000, for example. It seems that it depends on the initialization point, much like a minimization function. So my thanks go to Don, whose suggestion works for the real data as well without any problems - so far ;-) He suggested using the function crossing.psp from package spatstat. Thanks again to all who have answered and helped to solve my problem. I certainly learned a few new things. Monica

From: macque...@llnl.gov To: pisican...@hotmail.com CC: r-help@r-project.org Date: Wed, 23 Nov 2011 14:03:42 -0800 Subject: Re: [R] x, y for point of intersection

The function crossing.psp() in the spatstat package might be of use. Here's an excerpt from its help page:

crossing.psp    package:spatstat    R Documentation
Crossing Points of Two Line Segment Patterns
Description: Finds any crossing points between two line segment patterns.
Usage: crossing.psp(A, B)

-Don -- Don MacQueen Lawrence Livermore National Laboratory 7000 East Ave., L-627 Livermore, CA 94550 925-423-1062
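[For completeness, a minimal self-contained sketch of Don's crossing.psp() suggestion; the segments below are invented for illustration:

```r
library(spatstat)
W <- owin(c(0, 1), c(0, 1))                # observation window
A <- psp(0.1, 0.1, 0.9, 0.9, window = W)   # one segment pattern
B <- psp(0.1, 0.9, 0.9, 0.1, window = W)   # a segment crossing it
X <- crossing.psp(A, B)                    # point pattern of intersections
X$x; X$y                                   # x, y coordinates of the crossing point(s)
```
]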
Re: [R] SPSS - R
Dear Kristi,

Also, can anyone recommend any resources to help SPSS users learn to do things in R?

You may want to have a look at the R2STATS package, a simple GUI for linear models. Best, Yvonnick Noel University of Brittany Department of Psychology Rennes, France
[R] permutate elements in a vector
Hi all, I have a vector, e.g., A = c(10, 20, 30, 40). These 4 numbers have 4! = 24 different orderings. I want to generate a matrix with one permutation in each row, so the output would be B = 10 20 30 40 / 10 40 20 30 / ... Does anyone know how to do this easily in R? Thank you very much. Wendy -- View this message in context: http://r.789695.n4.nabble.com/permutate-elements-in-a-vector-tp4127821p4127821.html
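[One way to do this, assuming the combinat package is available (its permn() returns all permutations as a list):

```r
library(combinat)
A <- c(10, 20, 30, 40)
B <- do.call(rbind, permn(A))  # 24 x 4 matrix, one permutation per row
```
]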
[R] Raster manipulation in R
Hello everyone in the forum. To introduce myself, I would say I have a basic knowledge of R. Now I intend to implement a flood algorithm using R. I have some MatLab experience doing this, and as an example to explain more or less what I want, I have an M-file to calculate the slope from a digital elevation model:

slope=zeros(row,col);
for i=2:row-1
  for j=2:col-1
    dzdx=(raster(i,j+1)-raster(i,j-1))/(2*res);
    dzdy=(raster(i+1,j)-raster(i-1,j))/(2*res);
    slope(i,j)=sqrt(dzdx^2+dzdy^2);
  end
end;

The question is how to do the same procedure in R. All suggestions are welcome. Thanks. All best, Jorge PD: I am using R on a 64-bit Windows system -- View this message in context: http://r.789695.n4.nabble.com/Raster-manipulation-in-R-tp4127768p4127768.html
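[A near-literal translation of the MatLab loop, assuming `raster` is a numeric matrix with dimensions row x col and `res` is the cell size:

```r
slope <- matrix(0, nrow = row, ncol = col)
for (i in 2:(row - 1)) {
  for (j in 2:(col - 1)) {
    dzdx <- (raster[i, j + 1] - raster[i, j - 1]) / (2 * res)
    dzdy <- (raster[i + 1, j] - raster[i - 1, j]) / (2 * res)
    slope[i, j] <- sqrt(dzdx^2 + dzdy^2)
  }
}
```

The 'raster' package on CRAN also provides a terrain() function that computes slope from a RasterLayer directly, which may be worth a look.]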
[R] combining arima and ar function
Hi everyone, I've got a problem regarding the arima() and the ar() function for autoregressive series. I would simply like to combine them. To better understand my question, I first show you how I'm using these two functions individually (see file in the attachment).

1) apply(TSX, 2, function(x) ar(na.omit(x), method="mle")$order)
# this finds the optimal (autoregressive) order for each series (briefly speaking: for each series I get a specific number)

2) apply(TSX, 2, function(x) arima(x, order=c(1,0,0))$residuals)
# this puts an autoregressive model of order 1 on each series

What I'd like to do is combine them, such that the resulting orders (numbers) from the first function are used as orders in the second function. So in the specific example attached the results from the first function are 6, 5, 1 and these numbers should be used as the orders in function two. I tried to simply put the two functions together as follows:

apply(TSX, 2, function(x) arima(x, order=c((apply(TSX, 2, function(x) ar(na.omit(x), method="mle")$order)), 0, 0)))

However, I get the error message 'order' must be a non-negative numeric vector of length 3. Any hints? I hope you can help me with this issue. Many thanks and best regards, S.B.
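[One way to combine them without nesting the two apply() calls: compute each series' order inside the same anonymous function, so order = c(p, 0, 0) is always a length-3 vector. A sketch, using the poster's TSX:

```r
resids <- apply(TSX, 2, function(x) {
  p <- ar(na.omit(x), method = "mle")$order  # optimal AR order for this series
  arima(x, order = c(p, 0, 0))$residuals     # AR(p) fit on the same series
})
```
]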
[R] Resampling with replacement on a binary (0, 1) dataset to get Cis
...is it possible to do that? I apologize for something that must be a very trivial question for most of you but, unfortunately, it is not for me. A binary variable is measured, say, 50 times each year during 10 years. My interest is focused on the percentage of 1s with respect to the total of each year. There is no way to repeat those measures within each year and get the CIs the normal way. By the way, it would be important to get even a rough estimate of the CIs of these estimates (n1/(n1+n0)). In case this is not a blasphemy, how might it be done in R? Thanks in advance for any help -- View this message in context: http://r.789695.n4.nabble.com/Resampling-with-replacement-on-a-binary-0-1-dataset-to-get-Cis-tp4127990p4127990.html
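[A rough sketch of a nonparametric bootstrap for one year's proportion (the data are simulated here, since none were posted):

```r
set.seed(1)
x <- rbinom(50, 1, 0.3)                            # one year: 50 binary measurements
boot_p <- replicate(10000, mean(sample(x, replace = TRUE)))
quantile(boot_p, c(0.025, 0.975))                  # percentile 95% CI for the proportion
```

For a single binomial proportion, binom.test(sum(x), length(x)) gives an exact CI without any resampling.]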
Re: [R] R endnote entry
1. May I suggest that you try Mendeley for bibliographic management.

2. There are also citations for specific R packages when you download each version. Let's use the ade4 package as an example: http://cran.r-project.org/web/packages/ade4/index.html - if you select the citation info, the corresponding BibTeX input is provided:

@Article{,
  title = {The ade4 package: implementing the duality diagram for ecologists},
  author = {S. Dray and A.B. Dufour},
  journal = {Journal of Statistical Software},
  year = {2007},
  volume = {22},
  pages = {1-20},
  number = {4},
}

This is a reliable way to deal with references in other programs, i.e. LaTeX, etc. I save all my references as BibTeX into Mendeley and then have subfolders for R analysis/methods that contain the package citations. Using a .txt BibTeX file made from the ade4 package citation, my inserted citation in a Word document from Mendeley looks like this: (R Development Core Team, 2011), and the formal reference in the bibliography is: R Development Core Team. 2011. R: A Language and Environment for Statistical Computing.

If, however, you are still determined to use EndNote, then you have two ways to configure your reference: manually add a reference by copy/pasting the correct title from the citation, etc., then configure your .ens style for the appropriate journal. If you insert the citation and it becomes R.D.C.T, then it's likely the original input is fine and you have to configure your style output. Good luck!

--- Heather A. Wright, PhD candidate Ecology and Evolution of Plankton Stazione Zoologica Anton Dohrn Villa Comunale 80121 - Napoli, Italy -- View this message in context: http://r.789695.n4.nabble.com/R-endnote-entry-tp4126826p4127866.html
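[For reference, the package citation shown above can also be pulled from within R itself, which avoids copy/paste errors:

```r
citation("ade4")            # plain-text citation for the package
toBibtex(citation("ade4"))  # same entry rendered as BibTeX
```
]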
[R] efficient ways to dynamically grow a dataframe
Hi, I'm trying to write a small microsimulation in R: that is, I have a dataframe with info on N individuals for the base year and I have to grow it dynamically for T periods:

df = data.frame( id = 1:N, x = ... )

The most straightforward way to solve the problem that came to my mind is to create a new dataframe row for every period:

for(t in 1:T){
  for(i in 1:N){
    row = data.frame( id = i, t = t, x = ... )
    df = rbind(df, row)
  }
}

This is very inefficient and my PC immediately gets stuck as N is raised above some thousands. As an alternative, I created an empty dataframe for all the projected periods, and then filled it:

df1 = data.frame( id = rep(1:N, T), t = rep(1:T, each = N), x = rep(NA, N*T) )
for(t in 1:T){
  for(i in 1:N){
    x = ...
    df1[df1$id==i & df1$t==t, "x"] = x
  }
}
df = rbind(df, df1)

This is also too slow, and my PC gets stuck. I don't want to go for a matrix, because I'd lose the column names and everything would become too error-prone. Any suggestions on how to do it? Thanks in advance, Matteo -- Matteo Richiardi University of Turin Faculty of Law Department of Economics Cognetti De Martiis via Po 53, 10124 Torino Email: matteo.richia...@unito.it Tel. +39 011 670 3870 Web page: http://www.personalweb.unito.it/matteo.richiardi/
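[One common pattern (a sketch, with a placeholder computation): fill a preallocated vector by a computed index, then build the data frame once at the end. This keeps the column names while avoiding both rbind() and the logical matching on every iteration:

```r
N <- 1000; T <- 10
x <- numeric(N * T)                  # preallocate once
for (t in 1:T) {
  for (i in 1:N) {
    x[(t - 1) * N + i] <- i + t      # placeholder for the real computation
  }
}
df <- data.frame(id = rep(1:N, T), t = rep(1:T, each = N), x = x)
```
]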
Re: [R] I cannot get species scores to plot with site scores in MDS when I use a distance matrix as input. Problems with NA's?
Dear Gavin, Thanks a lot for your reply. I am still not sure if your suggestion can help me or not because I got different types of errors while trying it. I have tried my best to troubleshoot them and made some progress, but I have reached a dead end (I don't know how to go on). Below I am copying my Tinn-R script. It contains not only the R script but also the errors returned in the R console. I apologize that the script is still not really reproducible, but I just do not know how to create a sample data frame for it. Thanks again for your help. Edwin.

##Load required packages##
library(vegan)
library(MASS)
library(FD)
library(dummies)

##Load full table of species traits and taxonomic data##
AllTraits <- read.delim("SpeciesTraitsFullTbl.txt", header=T)

##Edit data##
row.names(AllTraits) <- AllTraits$species.code
TraitsNMDS <- AllTraits[, c(8:12, 14:18, 23, 25, 30)]

##Check variable class using function allClass from http://gettinggeneticsdone.blogspot.com/2010/08/quickly-find-class-of-dataframe-vectors.html##
allClass <- function(x) {unlist(lapply(unclass(x), class))}
allClass(TraitsNMDS)

##Change variables to fulfill requirements for gowdis() function##
#Define ordinal variables#
TraitsNMDS$leaf.type <- ordered(TraitsNMDS$leaf.type, levels = c("simple", "pinnate", "bipinnate"))
TraitsNMDS$dispersal.rank <- ordered(TraitsNMDS$dispersal.rank, levels = c("s_disp", "m_disp", "l_disp"))

#Coerce integer and binary variables to numeric#
TraitsNMDS$max.height <- as.numeric(TraitsNMDS$max.height)
TraitsNMDS$heterospory <- as.numeric(TraitsNMDS$heterospory)

##Test wrapper function##
wrapper <- function(x, method, ...) {gowdis(x, ...)}
WrapTest1 <- metaMDS(TraitsNMDS, distfun = wrapper, x = TraitsNMDS, asym.bin = "heterospory", ord = "podani")
##Returned error: 'Error in if (any(autotransform, noshare > 0, wascores) && any(comm < 0)) { : missing value where TRUE/FALSE needed'##

#Set autotransform, noshare and wascores to FALSE (even if TRUE is desired) to see if the problem is with the metaMDS function#
WrapTest2 <- metaMDS(TraitsNMDS, distfun = wrapper, x = TraitsNMDS, asym.bin = "heterospory", ord = "podani", autotransform = FALSE, noshare = FALSE, wascores = FALSE)
#Returned error: 'Error in FUN(X[[1L]], ...) : only defined on a data frame with all numeric variables'#

#Transform factors to dummies and numeric#
TraitsNMDSCompleteDumm <- dummy.data.frame(TraitsNMDSComplete, c("leaf.type", "dispersal.rank"))
TraitsNMDSCompleteDumm$leaf.typesimple <- as.numeric(TraitsNMDSCompleteDumm$leaf.typesimple)
TraitsNMDSCompleteDumm$leaf.typepinnate <- as.numeric(TraitsNMDSCompleteDumm$leaf.typepinnate)
TraitsNMDSCompleteDumm$leaf.typebipinnate <- as.numeric(TraitsNMDSCompleteDumm$leaf.typebipinnate)
TraitsNMDSCompleteDumm$dispersal.ranks_disp <- as.numeric(TraitsNMDSCompleteDumm$dispersal.ranks_disp)
TraitsNMDSCompleteDumm$dispersal.rankm_disp <- as.numeric(TraitsNMDSCompleteDumm$dispersal.rankm_disp)
TraitsNMDSCompleteDumm$dispersal.rankl_disp <- as.numeric(TraitsNMDSCompleteDumm$dispersal.rankl_disp)

##Re-test wrapper function##
WrapTest3 <- metaMDS(TraitsNMDSCompleteDumm, distfun = wrapper, x = TraitsNMDSCompleteDumm, asym.bin = "heterospory", ord = "podani")
#The function runs but returns an error related to the gowdis() function after the transformation of the data is done:
#'Square root transformation Wisconsin double standardization. Error in gowdis(x, ...) : w needs to be a numeric vector of length = number of variables in x'

#But testing the wrapper alone does work!#
DistMatrixWrapper <- wrapper(x = TraitsNMDSCompleteDumm, asym.bin = "heterospory", ord = "podani")
class(DistMatrixWrapper)
[1] "dist"
#I do not know why the error is produced

-- View this message in context: http://r.789695.n4.nabble.com/I-cannot-get-species-scores-to-plot-with-site-scores-in-MDS-when-I-use-a-distance-matrix-as-input-Pr-tp4103699p4127949.html
Re: [R] Bayesian Quantile regression installation
Please send your questions to the R-help list, and not directly to me. You provide your OS this time, but not any of the other information requested in the posting guide, like your version of R. What you did would be nice too. Did you use the menu option? Or install.packages() at the command line? If this is an internet-enabled computer, the latter is by far the easiest option: it will download and install the package for you.

On Thu, Dec 1, 2011 at 6:34 AM, kalam narendar reddy <narendarcse...@gmail.com> wrote: Dear Sarah Goslee, The following is the error I get if I try to use "install packages from local zip files", even though I have located bayesQR_1.3, which is a WinRAR ZIP archive file:

Error in read.dcf(file.path(pkgname, "DESCRIPTION"), c("Package", "Type")) : cannot open the connection
In addition: Warning message: In read.dcf(file.path(pkgname, "DESCRIPTION"), c("Package", "Type")) : cannot open compressed file 'bayesQR_1.3(1)/DESCRIPTION', probable reason 'No such file or directory'

The filename shown here is not the one that's in the CRAN archive - did you try to download it more than once? That may confuse Windows, I don't know. I don't use Windows (another reason you shouldn't be emailing me directly). But try renaming it without the (1) and see if that works.

Tell me what the reason for it is. I have tried to install the package from a local zip file (the binary version for Windows of Bayesian quantile regression). My OS is Windows 7 and I have followed the procedure for package installation as described in the manual for Windows. Still I am unable to install. Please understand my problem. Thanks in advance for your kind reply; please reply as early as possible.

Sarah
[R] nipals in the chemometrics package in R
Hello, I need some clarification about nipals in the chemometrics package in R. When I use nipals in chemometrics I obtain the T and P matrices. I really don't understand what to do with these two matrices to obtain the scores for every component (like in SPSS, for example): Comp1 Comp2 Comp3; quest1 0.8434 0.54333 0.3466; quest2 0.665 0.7655 0.433. Thank you very much for your help (I know that X = TP' + E)... but I don't understand the rest.
[R] How to speed up int2bin conversion?
Dear R-help members, I'm processing a large amount of MODIS data where quality assessment information is stored as an integer value for each pixel. I have to convert this number to an 8-digit binary flag to get access to the stored quality code (e.g. int2bin(165, 8) = 1 0 1 0 0 1 0 1). Unfortunately, I did not manage to find a package providing a fast function to do so. I need to run this on millions of pixels and thus wrote the following function:

int2bin <- function(x, ndigits) {
  base <- array(NA, dim = c(length(x), ndigits))
  for (q in 1:ndigits) {
    base[, ndigits - q + 1] <- (x %% 2)
    x <- (x %/% 2)
  }
  bin <- apply(base, 1, paste, collapse = "")
  return(bin)
}

Since it is still slow, I have to find a way to express this more elegantly. I'd really appreciate any help. Thanking you, with best regards Jonas Jägermeyr Potsdam Institute for Climate Impact Research Research Domain II
Re: [R] FIML with missing data in sem package
Dear Dustin, -----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of Dustin Fife Sent: November-30-11 9:51 PM To: r-help@r-project.org Subject: [R] FIML with missing data in sem package Is there a way to use full information maximum likelihood (FIML) to estimate missing data in the sem package? For example, suppose I have a dataset with complete information on X1-X3, but missing data (MAR) on X4. Is there a way to use FIML in this case? I know lavaan and OpenMx allow you to do it, but I couldn't find anything in the documentation for the sem package. Thanks! -- Dustin Fife Graduate Student Quantitative Psychology University of Oklahoma

No, the sem package doesn't handle missing data. You could, however, use another package, such as mice, for multiple imputation of the missing data. Best, John John Fox Senator William McMaster Professor of Social Statistics Department of Sociology McMaster University Hamilton, Ontario, Canada http://socserv.mcmaster.ca/jfox
[R] bigmemory on Solaris
At one point we might have gotten something working (older version?) on Solaris x86, but we were never successful on Solaris sparc that I remember -- it isn't a platform we can test and support. We believe there are problems with BOOST library compatibilities. We'll try (again) to clear up the other warnings in the logs, though. !-) We should also revisit the possibility of a CRAN BOOST library for use by a small group of packages (like bigmemory), which might make patches to BOOST easier to track and maintain. This might improve things in the long run. Jay -- John W. Emerson (Jay) Associate Professor of Statistics Department of Statistics Yale University http://www.stat.yale.edu/~jay
Re: [R] permutate elements in a vector
Wendy wrote: Hi all, I have a vector, e.g., A = c(10, 20, 30, 40). These 4 numbers have 4! = 24 different combinations. I want to generate a matrix with one combination in one row, so the output would be B = 10 20 30 40 / 10 40 20 30 / ... Does anyone know how to do this easily in R? Thank you very much. Wendy

You want permutations, not combinations.

library(e1071)
permutations(4) * 10

or

library(gtools)
A <- 10 * (1:4)
permutations(4, 4, A)

Berend -- View this message in context: http://r.789695.n4.nabble.com/permutate-elements-in-a-vector-tp4127821p4128227.html
[R] how to get inflection point in binomial glm
Dear All, I have a binomial response with one continuous predictor (d) and one factor (g) (8 levels, dummy-coded). glm(resp ~ d*g, data, family = binomial) Y = b0 + b1*X1 + b2*X2 + ... + b7*X7 How can I get the inflection point per group, i.e. the value of d at which P(d) = .5? I would be grateful for any help. Thanks in advance, René
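No reply survives in this chunk, but for a logistic model the fitted probability crosses 0.5 exactly where the linear predictor is zero, so the per-group crossing point can be solved from the coefficients. A minimal sketch, assuming the default treatment contrasts; the data frame `dat`, its generating model, and the level name "g2" are all invented for illustration:

```r
## hypothetical data, only to make the sketch self-contained
set.seed(1)
dat <- data.frame(d = rnorm(400),
                  g = factor(rep(c("g1", "g2"), each = 200)))
dat$resp <- rbinom(400, 1, plogis(-1 + 2 * dat$d + 0.5 * (dat$g == "g2")))

fit <- glm(resp ~ d * g, data = dat, family = binomial)
b <- coef(fit)
## reference level of g: P(d) = 0.5 where b0 + b_d * d = 0
d50_ref <- -b["(Intercept)"] / b["d"]
## level "g2": intercept and slope are both shifted by the dummy terms
d50_g2  <- -(b["(Intercept)"] + b["gg2"]) / (b["d"] + b["d:gg2"])
```

With treatment contrasts each non-reference level just adds its own intercept and slope offsets, so the same two-line formula applies to all 8 levels.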
Re: [R] Raster manipulation in R
jjcabrera20 wrote on 12/01/2011 04:02:47 AM: Hello everyone in the forum. To introduce myself, I would say I have a basic knowledge of R. Now I intend to implement a flood algorithm using R. I have some MatLab experience doing this, and as an example to explain more or less what I want, I have an m-file to calculate the slope from a digital elevation model:

slope=zeros(row,col);
for i=2:row-1
  for j=2:col-1
    dzdx=(raster(i,j+1)-raster(i,j-1))/(2*res);
    dzdy=(raster(i+1,j)-raster(i-1,j))/(2*res);
    slope(i,j)=sqrt(dzdx^2+dzdy^2);
  end
end;

The question is how to do the similar procedure in R. All suggestions are welcome. Thanks. All best, Jorge PS: I am using R on a 64-bit Windows system.

If I am interpreting your code correctly (I don't use MatLab myself), something like this should give you the same result in R:

# example matrix of heights
raster <- matrix(runif(20, 10, 30), nrow=4, ncol=5)
# example resolution
res <- 8.5
# dimensions of matrix
drast <- dim(raster)
# for every non-boundary point in the matrix,
# calculate the distances between its adjacent columns (dzdx) and rows (dzdy)
dzdx <- raster[2:(drast[1]-1), 3:drast[2]] - raster[2:(drast[1]-1), 1:(drast[2]-2)]
dzdy <- raster[3:drast[1], 2:(drast[2]-1)] - raster[1:(drast[1]-2), 2:(drast[2]-1)]
# calculate the slope from these distances and the resolution
slope <- sqrt(dzdx^2 + dzdy^2) / (2*res)

Jean
Re: [R] simplify source code
Hi, now I'd like to do

for (colname in c('ColName1','ColName2','ColName3')) {
  dat <- measurements$colname

But that does not work, though I can write measurements$C1 (same as measurements$'C1') (but different from measurements['C1']!). Can you give me a hint? Greetings, Christof

On 26-11-2011 23:30, Christof Kluß wrote: Hi, I would like to shorten

mod1 <- nls(ColName2 ~ ColName1, data = table, ...)
mod2 <- nls(ColName3 ~ ColName1, data = table, ...)
mod3 <- nls(ColName4 ~ ColName1, data = table, ...)
...

Is there something like

cols = c(ColName2, ColName3, ColName4, ...)
for i in ...
  mod[i-1] <- nls(ColName[i] ~ ColName1, data = table, ...)

I am looking forward to help. Christof
Re: [R] Resampling with replacement on a binary (0, 1) dataset to get Cis
On Dec 1, 2011, at 6:34 AM, lincoln wrote: ...is it possible to do that? I apologize for something that must be a very trivial question for most of you but, unfortunately, it is not for me. A binary variable is measured, say, 50 times each year during 10 years. My interest is focused on the percentage of 1s with respect to the total of each year. There is no way to repeat those measures within each year and get the CIs the normal way. By the way, it would be important to get even a rough estimate of the CIs of these estimates (n1/(n1+n0)). In case this is not a blasphemy, how might this be done in R?

?prop.test

Thanks in advance for any help. David Winsemius, MD West Hartford, CT
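A hedged illustration of David's pointer; the counts here are invented (say 30 ones out of 50 trials in a given year):

```r
## Wilson-type interval for a single yearly proportion of 1s
prop.test(x = 30, n = 50)$conf.int

## Exact binomial alternative
binom.test(x = 30, n = 50)$conf.int
```

Either gives a confidence interval for n1/(n1+n0) from one year's counts, with no resampling needed.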
Re: [R] permutate elements in a vector
Wendy wrote on 12/01/2011 04:27:09 AM: Hi all, I have a vector, e.g., A = c(10, 20, 30, 40). These 4 numbers have 4! = 24 different combinations. I want to generate a matrix with one combination in one row, so the output would be B = 10 20 30 40 / 10 40 20 30 / ... Does anyone know how to do this easily in R? Thank you very much. Wendy

This has been asked and answered in earlier posts. Search R-help for permutations. Here is one solution (see below). Jean http://tolstoy.newcastle.edu.au/R/e8/help/09/10/0983.html From: David Winsemius Date: Sat, 10 Oct 2009

require(combinat)
permn(c(23,46,70,71,89))
Re: [R] How to speed up int2bin conversion?
If we assume that you are just converting 8 bits, here is one way with a table lookup. If more than 8 bits, just partition the data and repeat. This sets up a mapping table one time for the lookup. Does 1M values in about 0.3 seconds on my computer; is this fast enough?

# initialize a vector with the 8-bit binary strings, one time
require(bitops)
b2c <- character(256)
for (i in 0:255) {
  b2c[i + 1] <- sprintf("%1d%1d%1d%1d%1d%1d%1d%1d",
    bitAnd(i, 0x80) != 0,
    bitAnd(i, 0x40) != 0,
    bitAnd(i, 0x20) != 0,
    bitAnd(i, 0x10) != 0,
    bitAnd(i, 0x8) != 0,
    bitAnd(i, 0x4) != 0,
    bitAnd(i, 0x2) != 0,
    bitAnd(i, 0x1) != 0)
}
# create vector with 1M values
x <- as.integer(1:1e6)
int2bin <- function(val) {
  b2c[bitAnd(val, 0xff) + 1]
}
system.time(int2bin(x))
#    user  system elapsed
#    0.31    0.00    0.32

On Thu, Dec 1, 2011 at 7:14 AM, Jonas Jägermeyr jonas...@pik-potsdam.de wrote: Dear R-help members, I'm processing a large amount of MODIS data where quality assessment information is stored as an integer value for each pixel. I have to convert this number to an 8-digit binary flag to get access to the stored quality code (e.g. int2bin(165, 8) = 1 0 1 0 0 1 0 1). Unfortunately, I did not manage to find a package providing a fast function to do so. I need to run this on millions of pixels and thus wrote the following function:

int2bin <- function(x, ndigits) {
  base <- array(NA, dim = c(length(x), ndigits))
  for (q in 1:ndigits) {
    base[, ndigits - q + 1] <- (x %% 2)
    x <- (x %/% 2)
  }
  bin <- apply(base, 1, paste, collapse = "")
  return(bin)
}

Since it is still slow, I have to find a way to express this more elegantly. I'd really appreciate any help. Thanking you, with best regards Jonas Jägermeyr Potsdam Institute for Climate Impact Research Research Domain II

-- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? Tell me what you want to do, not how you want to do it.
Re: [R] Problem installing the package tkrplot
Thanks Peter. I knew that R had not found the tk.h header file, but I did not know where it was. I opened my Synaptic package manager and installed tk-dev and other similar packages, and voilà! It worked. I could install tkrplot and the other packages that I needed. THANKS for your answer. On Wed, Nov 30, 2011 at 9:08 PM, peter dalgaard pda...@gmail.com wrote: On Nov 30, 2011, at 15:31, Marcos Amaris Gonzalez wrote: Hello people. My problem is that when I try to install the package tkrplot, I get the following problem: You seem to be missing the tk.h file. This is, most likely, in a tcl/tk development package that you haven't installed. This is part of your Linux distribution, not R. (You're not telling us which Linux you are running, and it can't be pinpointed any further without that information, but look for something like tk-devel and tcl-devel.)

install.packages("tkrplot")
Installing package(s) into '/home/marcos/R/i686-pc-linux-gnu-library/2.13' (as 'lib' is unspecified)
trying URL 'http://www.vps.fmvz.usp.br/CRAN/src/contrib/tkrplot_0.0-23.tar.gz'
Content type 'application/x-gzip' length 39037 bytes (38 Kb)
opened URL == downloaded 38 Kb
* installing *source* package 'tkrplot' ...
configure: creating ./config.status
config.status: creating src/Makevars
** libs
gcc -I/usr/share/R/include -I/usr/include/tcl8.5 -I/usr/include/tcl8.5 -fpic -std=gnu99 -O3 -pipe -g -c tcltkimg.c -o tcltkimg.o
tcltkimg.c:2:16: fatal error: tk.h: No such file or directory
compilation terminated.
make: *** [tcltkimg.o] Error 1
ERROR: compilation failed for package 'tkrplot'
* removing '/home/marcos/R/i686-pc-linux-gnu-library/2.13/tkrplot'
Warning in install.packages : installation of package 'tkrplot' had non-zero exit status
The downloaded packages are in '/tmp/RtmpHRfZ1k/downloaded_packages'

At the moment I have R version 2.13.2 (2011-09-30). I would be thankful for any help.
--- Marcos Amaris Gonzalez -- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com
Re: [R] FIML with missing data in sem package
On 12/01/2011 07:18 AM, John Fox wrote: To: r-help@r-project.org Subject: [R] FIML with missing data in sem package You should check out the OpenMx R package. Just search for OpenMx and SEM. You can download from the web site. It does FIML and is an excellent SEM package. Rick B.
Re: [R] simplify source code
measurements[[colname]]

?'[['

On Thu, Dec 1, 2011 at 8:34 AM, Christof Kluß ckl...@email.uni-kiel.de wrote: Hi, now I'd like to do

for (colname in c('ColName1','ColName2','ColName3')) {
  dat <- measurements$colname

But that does not work, though I can write measurements$C1 (same as measurements$'C1') (but different from measurements['C1']!). Can you give me a hint? Greetings, Christof

-- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? Tell me what you want to do, not how you want to do it.
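A small sketch of why `$colname` fails while `[[colname]]` works: `$` treats its argument as a literal column name, while `[[` evaluates it first. The data frame here is invented for illustration:

```r
measurements <- data.frame(ColName1 = 1:3, ColName2 = 4:6)
colname <- "ColName2"

measurements$colname     # NULL: looks for a column literally named "colname"
measurements[[colname]]  # 4 5 6: colname is evaluated to "ColName2" first
```

So inside the loop, `dat <- measurements[[colname]]` picks out each column in turn.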
[R] hi all.regarding quantile regression results..
I know this is not about R. After applying quantile regression with t = 0.5, 0.6 on the WBC (Wisconsin Breast Cancer) data set with 678 observations, 9 independent variables (inp1, inp2, ..., inp9) and 1 dependent variable (op), I got the following results for the beta values. When t = 0.5 (median regression) the beta values are b1=0.002641, b2=0.045746, b3=0.005282, b4=0.004397, b5=0.002641, b6=0.065807, b7=0.005282, b8=0.031394, b9=0.004993 and the intercept is -0.181388; when t = 0.6 the beta values are b1=0, b2=0.01, b3=0, b4=-0.002, b5=0.004, b6=0, b7=0.002, b8=0, b9=0 and the intercept is -0.009. Sir, how do I interpret the above beta coefficients and what do they mean exactly? Does t = 0.5 mean we are considering the first 50% of the total data? Does t = 0.6 mean we are considering the first 60% of the total data? Can we write an equation like y = intercept + b1*inp1 + b2*inp2 + b3*inp3 + b4*inp4 + b5*inp5 + b6*inp6 + b7*inp7 + b8*inp8 + b9*inp9, as in linear regression, to calculate the predicted output y, or not? If we take 5 quantiles of the data into consideration, does it mean that we are dividing the data into 5 parts? Which variables do I have to consider if the data is to be divided into 5 parts? And if I get 5 equations for 5 quantiles, what exactly does each equation represent? Can I write a single equation for the data set, as in mean regression, by combining the 5 equations of each quantile? Please reply to me. Thanks in advance. With regards, Kalam Narendar Reddy M.Tech(IT), University of Hyderabad. -- View this message in context: http://r.789695.n4.nabble.com/hi-all-regarding-quantile-regression-results-tp4128248p4128248.html
Re: [R] References for book R In Action by Kabacoff
In the ebook version there is a list of references (pp. 434-437). Date: Thu, 1 Dec 2011 10:48:45 +0100 From: lig...@statistik.tu-dortmund.de To: ravi.k...@gmail.com CC: r-help@r-project.org Subject: Re: [R] References for book R In Action by Kabacoff On 01.12.2011 10:10, Ravi Kulkarni wrote: I know this is not really an R question - it is a query about a recent book on R (R In Action) by Robert Kabacoff, (Manning Publications 2011). There are many references to interesting topics in R in the book, BUT, I do not find a bibliography/list of references in the book! Does anybody know if there are errata for the book available some place? I'd ask the author! Uwe Ligges Thanks, Ravi -- View this message in context: http://r.789695.n4.nabble.com/References-for-book-R-In-Action-by-Kabacoff-tp4127625p4127625.html
[R] Writing a function, want a string argument to define the name of the excel sheet to be called
My question is this: is there a way I can make one of the arguments of the function be a string the user can enter, and then have that be the Excel filename? I.e.,

foo <- function(x, y, NAME) {
  # make a matrix with x rows and y cols
  M <- matrix(nrow = x, ncol = y)
  # write the matrix
  write.table(M, file = "result.csv", append = TRUE, sep = ",")
}

I've had a look but I couldn't find help for this particular problem, and it's one I'd like to solve so I can make several Excel files for the analysis my actual function does. Thank you very much, Aodhán -- View this message in context: http://r.789695.n4.nabble.com/Writing-a-function-want-a-string-argument-to-define-the-name-of-the-excel-sheet-to-be-called-tp4128384p4128384.html
Re: [R] nipals in the chemometrics package in R
zz dd: If you want a typical score plot, say PC1 vs PC2, then plot the first and second columns of your score matrix (T) against each other and label the points with the row names (if available), or create a text label from wherever you have this information stored. To get an unlabeled score plot, you could simply do:

plot(T[,1], T[,2])

If you are using the package chemometrics, you should take a look at the book that goes with it too: Varmuza and Filzmoser, Multivariate Statistical Analysis in Chemometrics. If you need more guidance than this, you should probably give us more details about what you are doing. Bryan Prof. Bryan Hanson Dept of Chemistry & Biochemistry DePauw University Greencastle IN 46135 USA academic.depauw.edu/~hanson/deadpezsociety.html github.com/bryanhanson academic.depauw.edu/~hanson/UMP/Index.html On Dec 1, 2011, at 6:51 AM, zz dd wrote: Hello, I need some clarification about nipals in the chemometrics package in R. When I use nipals in chemometrics I obtain the T and P matrices. I really don't understand what to do with these two matrices to obtain the scores for every component (like in SPSS, for example): Comp1 Comp2 Comp3; quest1 0.8434 0.54333 0.3466; quest2 0.665 0.7655 0.433. Thank you very much for your help (I know that X = TP' + E)... but I don't understand the rest.
Re: [R] Writing a function, want a string argument to define the name of the excel sheet to be called
Sure, change your example as follows and then you can pass the name properly:

foo <- function(x, y, NAME = "filename.csv") {
  # make a matrix with x rows and y cols
  M <- matrix(nrow = x, ncol = y)
  # write the matrix
  write.table(M, file = NAME, append = TRUE, sep = ",")
}

Bryan Prof. Bryan Hanson Dept of Chemistry & Biochemistry DePauw University Greencastle IN 46135 USA academic.depauw.edu/~hanson/deadpezsociety.html github.com/bryanhanson academic.depauw.edu/~hanson/UMP/Index.html On Dec 1, 2011, at 8:43 AM, AOLeary wrote: My question is this: is there a way I can make one of the arguments of the function be a string the user can enter, and then have that be the Excel filename? I've had a look but I couldn't find help for this particular problem, and it's one I'd like to solve so I can make several Excel files for the analysis my actual function does. Thank you very much, Aodhán
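Since the original goal was to produce several files, the default-argument version can then be driven from a loop; a small sketch, with the filename pattern invented:

```r
foo <- function(x, y, NAME = "filename.csv") {
  M <- matrix(nrow = x, ncol = y)
  write.table(M, file = NAME, append = TRUE, sep = ",")
}

# one CSV file per run, named result_1.csv, result_2.csv, result_3.csv
for (i in 1:3) {
  foo(2, 3, NAME = sprintf("result_%d.csv", i))
}
```

Note these are plain CSV files, which Excel opens, rather than native .xls workbooks.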
Re: [R] efficient ways to dynamically grow a dataframe
First, data frames can be much slower than matrices, for example if you are changing/accessing values a lot. I would suggest that you use a matrix, since it seems that all your values are numeric. Allocate a large empty matrix to start (hopefully as large as you need). If you exceed this, you have the option of 'rbind'ing more empty rows on and continuing. This might depend on how large your final matrix might be (you did not state the boundary conditions). On Thu, Dec 1, 2011 at 6:34 AM, Matteo Richiardi matteo.richia...@unito.it wrote: Hi, I'm trying to write a small microsimulation in R: that is, I have a data frame with info on N individuals for the base year and I have to grow it dynamically for T periods:

df = data.frame(id = 1:N, x = ...)

The most straightforward way to solve the problem that came to my mind is to create a new data frame for every period:

for (t in 1:T) {
  for (i in 1:N) {
    row = data.frame(id = i, t = t, x = ...)
    df = rbind(df, row)
  }
}

This is very inefficient and my PC gets stuck immediately as N is raised above some thousands. As an alternative, I created an empty data frame for all the projected periods, and then filled it:

df1 = data.frame(id = rep(1:N, T), t = rep(1:T, each = N), x = rep(NA, N*T))
for (t in 1:T) {
  for (i in 1:N) {
    x = ...
    df1[df1$id == i & df1$t == t, "x"] = x
  }
}
df = rbind(df, df1)

This is also too slow, and my PC gets stuck. I don't want to go for a matrix, because I'd lose the column names and everything would become too error-prone. Any suggestions on how to do it? Thanks in advance, Matteo -- Matteo Richiardi University of Turin Faculty of Law Department of Economics Cognetti De Martiis via Po 53, 10124 Torino Email: matteo.richia...@unito.it Tel. +39 011 670 3870 Web page: http://www.personalweb.unito.it/matteo.richiardi/ -- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? Tell me what you want to do, not how you want to do it.
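Jim's preallocation advice can be sketched as follows. A matrix can carry column names too (via dimnames), so nothing is lost relative to the data frame approach; the rnorm() call is a placeholder for the real per-period update rule, and N and T are arbitrary:

```r
N <- 1000; T <- 20
# preallocate the full N*T result once, with named columns
m <- matrix(NA_real_, nrow = N * T, ncol = 3,
            dimnames = list(NULL, c("id", "t", "x")))
for (t in 1:T) {
  rows <- (t - 1) * N + 1:N    # block of rows belonging to period t
  m[rows, "id"] <- 1:N
  m[rows, "t"]  <- t
  m[rows, "x"]  <- rnorm(N)    # placeholder for the real update rule
}
df <- as.data.frame(m)         # convert once at the end, if a data frame is needed
```

Filling whole blocks by index avoids both the repeated rbind() and the row-by-row `df1[df1$id == i & df1$t == t, ]` lookups, which are the two slow parts of the original code.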
Re: [R] ordinal data simulation
On Dec 1, 2011, at 2:06 AM, yuying shi wrote: Dear all, I am doing an ordinal data simulation. I have a question regarding the cutoff values between categories. Suppose I have three categories; if I do a regression, there should be two cutoff values. I found some simulation code for ordinal data. However, it usually only generates a uniform random number and compares the probability with the random number. That's not a good description of what they are doing. Could anyone be so kind as to tell me the rationale behind it? I don't know why we need a random number rather than just two fixed cutoff values. One way to generate a random categorical variable constrained to three values would just be to take a uniform variate on the interval [0,1] and classify it based on which of the three segments it is in, for 0 < a < b < 1. This really is not an appropriate question for r-help, so you are advised to submit further such questions somewhere they are solicited, such as stats.stackexchange.com. -- David Winsemius, MD West Hartford, CT
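David's description can be sketched in a few lines; the cutpoints a and b below are arbitrary illustrative values:

```r
set.seed(42)
a <- 0.3; b <- 0.7                  # fixed cutpoints, 0 < a < b < 1
u <- runif(1000)                    # uniform variates on [0, 1]
y <- cut(u, breaks = c(0, a, b, 1), labels = c(1, 2, 3))
table(y) / 1000                     # proportions near a, b - a, 1 - b
```

The cutpoints stay fixed; the random number is what makes each simulated observation land in category 1, 2, or 3 with the intended probabilities.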
Re: [R] Raster manipulation in R
Hi, Alternatively you could have a look at the function terrain in the raster package; it can calculate the slope for you using different algorithms, though I am not sure if the one below is included. For typical spatial requests like this, you could also use the mailing list r-sig-geo. Cheers, Jon On 01-Dec-11 14:27, Jean V Adams wrote: jjcabrera20 wrote on 12/01/2011 04:02:47 AM: Hello everyone in the forum. To introduce myself, I would say I have a basic knowledge of R. Now I intend to implement a flood algorithm using R. I have some MatLab experience doing this, and as an example to explain more or less what I want, I have an m-file to calculate the slope from a digital elevation model:

slope=zeros(row,col);
for i=2:row-1
  for j=2:col-1
    dzdx=(raster(i,j+1)-raster(i,j-1))/(2*res);
    dzdy=(raster(i+1,j)-raster(i-1,j))/(2*res);
    slope(i,j)=sqrt(dzdx^2+dzdy^2);
  end
end;

The question is how to do the similar procedure in R. All suggestions are welcome. Thanks. All best, Jorge PS: I am using R on a 64-bit Windows system.

If I am interpreting your code correctly (I don't use MatLab myself), something like this should give you the same result in R:

# example matrix of heights
raster <- matrix(runif(20, 10, 30), nrow=4, ncol=5)
# example resolution
res <- 8.5
# dimensions of matrix
drast <- dim(raster)
# for every non-boundary point in the matrix,
# calculate the distances between its adjacent columns (dzdx) and rows (dzdy)
dzdx <- raster[2:(drast[1]-1), 3:drast[2]] - raster[2:(drast[1]-1), 1:(drast[2]-2)]
dzdy <- raster[3:drast[1], 2:(drast[2]-1)] - raster[1:(drast[1]-2), 2:(drast[2]-1)]
# calculate the slope from these distances and the resolution
slope <- sqrt(dzdx^2 + dzdy^2) / (2*res)

Jean

-- Jon Olav Skøien Joint Research Centre - European Commission Institute for Environment and Sustainability (IES) Global Environment Monitoring Unit Via Fermi 2749, TP 440, I-21027 Ispra (VA), ITALY jon.sko...@jrc.ec.europa.eu Tel: +39 0332 789206 Disclaimer: Views expressed in this email are those of the individual and do not necessarily represent official views of the European Commission.
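A hedged sketch of Jon's terrain() suggestion, assuming a properly georeferenced DEM; the filename "dem.tif" is hypothetical, and the exact finite-difference scheme may differ from the MatLab code above:

```r
library(raster)
# read a georeferenced digital elevation model (hypothetical file)
dem <- raster("dem.tif")
# slope from the 8 neighbouring cells, in degrees
slope <- terrain(dem, opt = "slope", unit = "degrees", neighbors = 8)
plot(slope)
```

terrain() takes the cell resolution and projection from the raster object itself, so no manual 2*res bookkeeping is needed.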
Re: [R] R logo in eps formt
See this earlier post for SVG logos: http://tolstoy.newcastle.edu.au/R/e12/devel/10/10/0112.html Using ImageMagick, do something like:

convert logo.svg logo.eps

On Thu, 2011-12-01 at 10:56 +0700, Ben Madin wrote: G'day all, Sorry if this message has been posted before, but searching for R is always difficult... I was hoping for a copy of the logo in EPS format. Can I produce one from R, or is one available for download? cheers Ben

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] FIML with missing data in sem package
Good idea. I'll give it a try. Thanks! On Thu, Dec 1, 2011 at 6:18 AM, John Fox j...@mcmaster.ca wrote: Dear Dustin, -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- project.org] On Behalf Of Dustin Fife Sent: November-30-11 9:51 PM To: r-help@r-project.org Subject: [R] FIML with missing data in sem package Is there a way to use full information maximum likelihood (FIML) to estimate missing data in the sem package? For example, suppose I have a dataset with complete information on X1-X3, but missing data (MAR) on X4. Is there a way to use FIML in this case? I know lavaan and openmx allow No, the sem package doesn't handle missing data. You could, however, use another package, such as mice, for multiple imputation of the missing data. Best, John John Fox Senator William McMaster Professor of Social Statistics Department of Sociology McMaster University Hamilton, Ontario, Canada http://socserv.mcmaster.ca/jfox you to do it, but I couldn't find anything in the documentation for the sem package. Thanks! -- Dustin Fife Graduate Student Quantitative Psychology University of Oklahoma [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. -- Dustin Fife Graduate Student Quantitative Psychology University of Oklahoma [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
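For readers who want to follow John's suggestion of pairing mice with sem, a minimal sketch might look like the following. Everything here is illustrative, not from the thread: `mydata`, the variables X1-X4, and the one-factor model are made up, and depending on your version of sem you may need to pass a covariance matrix S and sample size N instead of raw data. Properly pooling results across imputations should use Rubin's rules; this sketch only averages point estimates.

```r
# Sketch: multiply impute with mice, fit the same sem model to each
# completed data set. Data and model names are hypothetical.
library(mice)
library(sem)

imp <- mice(mydata, m = 5, printFlag = FALSE)   # 5 imputed data sets

model <- specifyModel(text = "
F -> X1, lam1, NA
F -> X2, lam2, NA
F -> X3, lam3, NA
F -> X4, lam4, NA
F <-> F, NA, 1
")

fits <- lapply(1:5, function(i) sem(model, data = complete(imp, i)))
# Averaging estimates only; Rubin's rules would also combine the
# within- and between-imputation variances
rowMeans(sapply(fits, coef))
```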
Re: [R] how to get inflection point in binomial glm
Assuming a logistic model, for each group solve for d at Y = 0, or equivalently for d at p = 0.5, where d is your continuous predictor, Y is the logit score, and p is the probability of success in the binomial model. After that you can get the standard error of the inflection point by a Taylor series expansion (aka the delta method). Another idea is to re-parameterize the logistic model to include the inflection point explicitly; you then get its estimate and standard error directly as a result of the fit. For example, disregarding the factor predictor g (or within each group), a logistic model such as

P(d) = 1/(1 + exp(log(1/19) * (d - d50)/(d95 - d50)))

includes the inflection point directly (d50) and is a re-parameterized version of the usual logistic model

P(d) = 1/(1 + exp(b0 + b1*d))

where the parameters b0 and b1 have been replaced by d50 and d95: the predictor value at 50% probability (the inflection point) and at 95% probability, respectively. HTH Rubén

-----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On behalf of René Mayer Sent: Thursday, 01 December 2011 14:25 To: r-help@r-project.org Subject: [R] how to get inflection point in binomial glm

Dear All, I have a binomial response with one continuous predictor (d) and one factor (g) (8 levels, dummy-coded):

glm(resp ~ d*g, data, family=binomial)

Y = b0 + b1*X1 + b2*X2 + ... + b7*X7

How can I get the inflection point per group, e.g. the d at which P(d) = .5? I would be grateful for any help. Thanks in advance, René

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
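Rubén's first suggestion (solve for d at Y = 0, then the delta method) can be sketched in base R. The simulation below is my own illustration, not from the thread: a single group with known coefficients, so the recovered inflection point can be checked. With R's glm parameterization, logit p = b0 + b1*d, the inflection point is d50 = -b0/b1.

```r
# Inflection point of a logistic fit: d50 = -b0/b1, with a
# delta-method standard error. Simulated single-group example.
set.seed(1)
d    <- runif(500, 0, 10)
p    <- plogis(-3 + 1 * d)             # true inflection at d = 3
resp <- rbinom(500, 1, p)
fit  <- glm(resp ~ d, family = binomial)

b   <- coef(fit)                       # b0 = b[1], b1 = b[2]
d50 <- unname(-b[1] / b[2])            # estimated inflection point

V  <- vcov(fit)
g  <- c(-1 / b[2], b[1] / b[2]^2)      # gradient of -b0/b1 w.r.t. (b0, b1)
se <- sqrt(drop(t(g) %*% V %*% g))     # delta-method standard error
c(d50 = d50, se = se)
```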
Re: [R] x, y for point of intersection
Hi everybody, Thanks for checking my code, Hans, it helped me see where my initial mistake was. I am sorry I assumed there was a minimization problem. In short, I had 2 wavy lines (left and right) that didn't intersect, and lots of straight parallel lines that intersect those 2 lines. I wanted to solve the intersection for the left and right lines simultaneously and put everything in a big loop, and that is where the mistake was. Meanwhile I was able to use Don's suggestion with no problem and no loop. Thanks again, Monica

Date: Thu, 1 Dec 2011 12:13:49 +0100 Subject: Re: [R] x, y for point of intersection From: hwborch...@googlemail.com To: r-help@r-project.org CC: macque...@llnl.gov; pisican...@hotmail.com; dwinsem...@comcast.net; ted.hard...@wlandres.net

Monica has sent me some data and code for taking a quick look. As it turned out, there was a simple programming error on her side. The segm_distance() function in package 'pracma' is working correctly, and there is no minimization procedure in here; it simply solves some equations from plane geometry. Maybe the function suggested by Don is faster, I haven't checked that. Regards, Hans Werner

2011/11/29 Monica Pisica pisican...@hotmail.com: Hi again, Working with my real data and not with the little example I sent to the list, I discovered that the segm_distance function from package pracma does not converge to 0 in all cases, even if I increase the number of iterations to 10,000, for example. It seems to depend on the initialization point - much like a minimization function. So my thanks go to Don, whose suggestion works for the real data as well without any problems - so far ;-) He suggested using the function crossing.psp from package spatstat. Thanks again to all who have answered and helped to solve my problem. I certainly learned a few new things.
Monica

From: macque...@llnl.gov To: pisican...@hotmail.com CC: r-help@r-project.org Date: Wed, 23 Nov 2011 14:03:42 -0800 Subject: Re: [R] x, y for point of intersection

The function crossing.psp() in the spatstat package might be of use. Here's an excerpt from its help page:

crossing.psp package:spatstat R Documentation
Crossing Points of Two Line Segment Patterns
Description: Finds any crossing points between two line segment patterns.
Usage: crossing.psp(A, B)

-Don -- Don MacQueen Lawrence Livermore National Laboratory 7000 East Ave., L-627 Livermore, CA 94550 925-423-1062

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
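A toy version of the crossing.psp() approach that worked for Monica might look like this. The segment coordinates and window are invented for illustration; a "wavy" line is approximated here by a few straight segments.

```r
# Sketch: intersection points of two line segment patterns with spatstat.
library(spatstat)

win <- owin(c(0, 10), c(0, 10))
# one "wavy" line approximated by three segments (x0, y0, x1, y1)...
A <- psp(c(1, 4, 7), c(2, 5, 3), c(4, 7, 9), c(5, 3, 8), window = win)
# ...and one straight horizontal line at y = 4
B <- psp(0, 4, 10, 4, window = win)

pts <- crossing.psp(A, B)   # point pattern of all crossing points
cbind(x = pts$x, y = pts$y)
```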
Re: [R] hi all.regarding quantile regression results..
This really isn't the appropriate forum for most of your questions: I'd suggest you work through the Wikipedia article on quantile regression and direct follow-up questions to stats.stackexchange.com. As to the R question: use the predict() function. Michael

On Dec 1, 2011, at 8:12 AM, narendarreddy kalam narendarcse...@gmail.com wrote: I know this is not about R. After applying quantile regression with t=0.5, 0.6 to the WBC (Wisconsin Breast Cancer) data set, with 678 observations, 9 independent variables (inp1, inp2, ..., inp9) and 1 dependent variable (op), I got the following beta values. When t=0.5 (median regression): b1=0.002641, b2=0.045746, b3=0.005282, b4=0.004397, b5=0.002641, b6=0.065807, b7=0.005282, b8=0.031394, b9=0.004993, and the intercept is -0.181388. When t=0.6: b1=0, b2=0.01, b3=0, b4=-0.002, b5=0.004, b6=0, b7=0.002, b8=0, b9=0, and the intercept is -0.009. Sir, how do I interpret the above beta coefficients, and what do they mean exactly? Does t=0.5 mean we are considering the first 50% of the total data? Does t=0.6 mean we are considering the first 60% of the total data? Can we write an equation like y = intercept + b1*inp1 + b2*inp2 + b3*inp3 + b4*inp4 + b5*inp5 + b6*inp6 + b7*inp7 + b8*inp8 + b9*inp9, as in linear regression, to calculate the predicted output y, or not? If we take 5 quantiles of the data, does that mean we divide the data into 5 parts? Which variables do I have to consider if the data is to be divided into 5 parts? And I got 5 equations for the 5 quantiles; what exactly does each equation represent? Can I write a single equation for the data set, as in mean regression, by combining the 5 equations of each quantile? Please reply. Thanks in advance, With regards, Kalam Narendar Reddy M.Tech(IT), University of Hyderabad.

-- View this message in context: http://r.789695.n4.nabble.com/hi-all-regarding-quantile-regression-results-tp4128248p4128248.html Sent from the R help mailing list archive at Nabble.com.
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
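To make Michael's pointer to predict() concrete, here is a small quantreg sketch; the data and variable names are invented. Note that tau is the quantile level of the conditional distribution of y given x, estimated from all observations, not "the first tau% of the data".

```r
# Sketch: quantile regression with quantreg, one set of betas per tau.
library(quantreg)

set.seed(42)
x <- runif(200)
y <- 1 + 2 * x + rnorm(200)

fit <- rq(y ~ x, tau = c(0.5, 0.6))   # median and 60th-percentile fits
coef(fit)                             # one column of coefficients per tau
# predicted conditional quantiles at a new x, one per tau
predict(fit, newdata = data.frame(x = 0.5))
```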
Re: [R] efficient ways to dynamically grow a dataframe
I'd also suggest you read Circle 2 of The R Inferno (just google it), which has some helpful tips on how to deal with these sorts of problems. Also, did you know that matrices can have column names and that rbind() preserves them? E.g.,

m <- matrix(1:6, 3)
colnames(m) <- letters[1:2]
print(m)
print(rbind(m, c(10, 11)))

Michael

On Thu, Dec 1, 2011 at 9:02 AM, jim holtman jholt...@gmail.com wrote: First, dataframes can be much slower than matrices, for example, if you are changing/accessing values a lot. I would suggest that you use a matrix, since it seems that all your values are numeric. Allocate a large empty matrix to start (hopefully as large as you need). If you exceed this, you have the option of 'rbind'ing more empty rows on and continuing. This might depend on how large your final matrix might be (you did not state the boundary conditions).

On Thu, Dec 1, 2011 at 6:34 AM, Matteo Richiardi matteo.richia...@unito.it wrote: Hi, I'm trying to write a small microsimulation in R: that is, I have a dataframe with info on N individuals for the base year and I have to grow it dynamically for T periods:

df = data.frame( id = 1:N, x = ... )

The most straightforward way to solve the problem that came to my mind is to create a new dataframe row for every period and individual:

for (t in 1:T) {
  for (i in 1:N) {
    row = data.frame( id = i, t = t, x = ... )
    df = rbind(df, row)
  }
}

This is very inefficient and my PC gets stuck immediately as N is raised above some thousands. As an alternative, I created an empty dataframe for all the projected periods, and then filled it:

df1 = data.frame( id = rep(1:N, T), t = rep(1:T, each = N), x = rep(NA, N*T) )
for (t in 1:T) {
  for (i in 1:N) {
    x = ...
    df1[df1$id == i & df1$t == t, "x"] = x
  }
}
df = rbind(df, df1)

This is also too slow, and my PC gets stuck. I don't want to go for a matrix, because I'd lose the column names and everything would become too error-prone. Any suggestions on how to do it?
Thanks in advance, Matteo -- Matteo Richiardi University of Turin Faculty of Law Department of Economics Cognetti De Martiis via Po 53, 10124 Torino Email: matteo.richia...@unito.it Tel. +39 011 670 3870 Web page: http://www.personalweb.unito.it/matteo.richiardi/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? Tell me what you want to do, not how you want to do it. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
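Combining Jim's advice (preallocate a matrix) with Michael's (matrices keep column names), a sketch of filling the result by row index rather than growing it with rbind() in a loop might look like this. The sizes and the rnorm() update rule are placeholders for the real simulation.

```r
# Sketch: preallocate an (N*T) x 3 named matrix and fill it block by
# block, instead of rbind()-ing one row at a time.
N <- 1000
T <- 50                            # note: shadows the TRUE alias T
m <- matrix(NA_real_, nrow = N * T, ncol = 3,
            dimnames = list(NULL, c("id", "t", "x")))
for (t in 1:T) {
  rows <- (t - 1) * N + 1:N        # block of rows for period t
  m[rows, "id"] <- 1:N
  m[rows, "t"]  <- t
  m[rows, "x"]  <- rnorm(N)        # placeholder for the real update rule
}
df <- as.data.frame(m)             # convert once, at the end
```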
[R] round to specific intervals
Dear R users/helpers, I am wondering whether there is an existing function with which you can round numbers to a set of values. I know you can use 5 * round(x/5) for rounding to the nearest 5 or so, but what if the interval size is not constant? For example:

## Not run
test <- rnorm(100)
round(test, c(1, 5, 10, 20, 50))

so that test is rounded to the closest value in the vector. Thanks for the help. Cheers,

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Nomogram with stratified cph in rms package, how to get failure probability
Hazard() is not implemented except for parametric survival models. You are not calling nomogram() correctly; in rms the plotting step is separated from the nomogram computations. To plot cumulative event rates do something like:

mort10 <- function(lp) 1 - surv(10, lp)

and tell nomogram about mort10 using fun=. Frank

min wrote: Hello, I am using Dr. Harrell's rms package to make a nomogram. I was able to make a beautiful one. However, I want to change the 5-year survival probability to the 5-year failure probability. I couldn't get the hazard rate from Hazard(f1) because I used cph for the model. Here is my code:

library(rms)
f1 <- cph(Surv(retime,dfs) ~ age+her2+t_stage+n_stage+er+grade+cytcyt+Cyt_PCDK2, data=data11, surv=T, x=T, y=T, time.inc=5)
surv <- Survival(f1)
haz <- Hazard(f1)

Here is the error:

Error in UseMethod("Hazard") : no applicable method for 'Hazard' applied to an object of class c('cph', 'Design', 'coxph')

surv10 <- function(lp) surv(10, lp)
surv5 <- function(lp) surv(5, lp)
quant <- Quantile(f1)
at.surv <- c(0.1, 0.3, 0.5, 0.7, 0.9)
at.med1 <- c(2, 3, 4, 5, 6, 7, 8, 10, 15, 20, 25, 30)
par(cex=0.8)
nom <- nomogram(f1, conf.int=F, fun=list(surv5, surv10), funlabel=c('5-Year Survival Probability', '10-Year Survival Probability'), lp=F, fun.at=c(at.surv, at.surv), label.every=1, force.label=FALSE, cex.axis=0.8, verbose=TRUE, cex.var=0.8)

How can I show failure probability instead of survival probability? I would very much appreciate any assistance in this matter. Thank you very much.

- Frank Harrell Department of Biostatistics, Vanderbilt University

-- View this message in context: http://r.789695.n4.nabble.com/Nomogram-with-stratified-cph-in-rms-package-how-to-get-failure-probability-tp4123249p4129173.html Sent from the R help mailing list archive at Nabble.com.

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] round to specific intervals
?findInterval ... would get you the endpoints, and then you could determine which is nearer. Of course, in your example everything would get rounded to 1. -- Bert

On Thu, Dec 1, 2011 at 7:55 AM, Michael Kao mkao006rm...@gmail.com wrote: Dear R users/helpers, I am wondering whether there is an existing function with which you can round numbers to a set of values. I know you can use 5 * round(x/5) for rounding to the nearest 5 or so, but what if the interval size is not constant? For example:

## Not run
test <- rnorm(100)
round(test, c(1, 5, 10, 20, 50))

so that test is rounded to the closest value in the vector. Thanks for the help. Cheers,

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

-- Bert Gunter Genentech Nonclinical Biostatistics Internal Contact Info: Phone: 467-7374 Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
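Putting Bert's findInterval() hint into a small function (my own sketch; the function name is made up, and values outside the target range are clamped to the nearest end point):

```r
# Round each value of x to the nearest element of a vector of targets.
# findInterval() locates the left endpoint of the bracketing interval;
# we then compare distances to the two candidate endpoints.
round_to <- function(x, targets) {
  targets <- sort(targets)
  i  <- findInterval(x, targets, all.inside = TRUE)
  lo <- targets[i]
  hi <- targets[i + 1]
  ifelse(x - lo <= hi - x, lo, hi)   # ties go to the lower target
}

round_to(c(0.3, 3.2, 12, 40, 100), c(1, 5, 10, 20, 50))
# -> 1 5 10 50 50
```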
Re: [R] Writing a function, want a string argument to define the name of the excel sheet to be called
Thank you very much, that does exactly what I want it to! :) Aodhán -- View this message in context: http://r.789695.n4.nabble.com/Writing-a-function-want-a-string-argument-to-define-the-name-of-the-excel-sheet-to-be-called-tp4128384p4128495.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] R, PostgresSQL and poor performance
Hi List, Apologies if this isn't the correct place for this query (I've tried a search of the mail archives but not had much joy). I'm running R (2.14.0) on a Mac (OS X 10.5.8, 2.66 GHz, 4 GB memory) and am having a few performance issues with reading data in from a Postgres database (using RPostgreSQL). My query / code are as below:

# -
library('RPostgreSQL')
drv <- dbDriver("PostgreSQL")
dbh <- dbConnect(drv, user= , password= , dbname= , host= )
sql <- "select id, date, lon, lat, date_trunc('day', date) as jday, extract('hour' from date) as hour, extract('year' from date) as year from observations where pt = 6 and date >= '1990-01-01' and date < '1995-01-01' and lon > 180 and lon < 290 and lat > -30 and lat < 30 and sst is not null"
dataIn <- dbGetQuery(dbh, sql)
# -

All variables are reals other than id, which is varchar(10), and date, which is a timestamp. Approximately 1.5 million rows are returned by the query, and it takes on the order of 10 seconds to execute using psql (the command-line client for Postgres) and a similar time using pgAdmin 3. In R it takes several minutes to run, and I'm unsure where the bottleneck is occurring. I don't believe it's in the database access, as the query disappears from the server status (of pgAdmin) quite quickly, and it only takes ~10 seconds from psql. I've only been using R for the last month or two, so any tips / advice on how to speed up the query or how to track down the source of the poor performance would be appreciated. For info, I get similarly poor performance running R (2.14.0) on a Linux workstation (Linux 2.6.16.60-0.91.1-smp x86_64, 42GB memory, 2 6-core processors) running SUSE. Thanks in advance, Dave Berry.

-- This message (and any attachments) is for the recipient ...{{dropped:8}}

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
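One way to narrow down where the time goes is to separate server-side execution from row transfer and R-type conversion, timing each stage separately. This is a diagnostic sketch, not a fix: it assumes the connection `dbh` and the query string `sql` from the post above, with credentials elided as in the original.

```r
# Sketch: split dbGetQuery() into send + fetch to see which stage is slow.
library(RPostgreSQL)
# dbh and sql as defined in the original post (credentials elided)

system.time(res <- dbSendQuery(dbh, sql))    # server-side execution only
system.time(dataIn <- fetch(res, n = -1))    # transfer + conversion to R types
dbClearResult(res)

str(dataIn)   # check which column types came back (timestamps can be costly)
```

If the fetch step dominates, converting the timestamp to simpler types in SQL (as the query already does with date_trunc/extract) or fetching in chunks may help.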
[R] How to do Hotelling's t2 test?
Hi, I want to do a two-sample Hotelling's T2 test but I can't figure out how. When I type T2.test it says there is no such function, and when I tried library(rrcov) it says there is no such package. Cheers.

-- View this message in context: http://r.789695.n4.nabble.com/How-to-do-Hotelling-s-t2-test-tp4128748p4128748.html Sent from the R help mailing list archive at Nabble.com.

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
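The "no such package" error usually just means rrcov is not installed: library() only loads packages that have already been installed with install.packages(). A sketch with made-up data:

```r
# Sketch: two-sample Hotelling T^2 test with rrcov::T2.test.
# install.packages("rrcov")   # one-time installation
library(rrcov)

set.seed(7)
x <- matrix(rnorm(60), ncol = 3)              # group 1: 20 obs, 3 variables
y <- matrix(rnorm(60, mean = 0.5), ncol = 3)  # group 2: shifted mean
T2.test(x, y)                                 # two-sample Hotelling T^2
```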
Re: [R] Resampling with replacement on a binary (0, 1) dataset to get Cis
Thanks. So suppose that for one specific year (the first of 10) the percentage of successes derives from 100 trials with 38 successes (and 62 failures); its value would be 38/100 = 0.38. I could calculate its confidence interval this way:

success <- 38
total <- 100
prop.test(success, total, p=0.5, alternative="two.sided")

1-sample proportions test with continuity correction
data: success out of total, null probability 0.5
X-squared = 5.29, df = 1, p-value = 0.02145
alternative hypothesis: true p is not equal to 0.5
95 percent confidence interval: 0.2863947 0.4829411
sample estimates: p 0.38

So it would be var$1 = 0.38, CI = 0.286-0.483. Is that correct?

-- View this message in context: http://r.789695.n4.nabble.com/Resampling-with-replacement-on-a-binary-0-1-variable-to-get-CIs-tp4127990p4129048.html Sent from the R help mailing list archive at Nabble.com.

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
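Since the thread title asks about resampling with replacement, here is a base-R percentile bootstrap for the same 38/100 proportion, for comparison with prop.test()'s interval. This is my own sketch; the number of replicates is arbitrary.

```r
# Sketch: percentile bootstrap CI for a proportion of 38 successes
# out of 100 trials, resampling the 0/1 outcomes with replacement.
set.seed(123)
obs    <- c(rep(1, 38), rep(0, 62))
boot_p <- replicate(10000, mean(sample(obs, replace = TRUE)))
quantile(boot_p, c(0.025, 0.975))   # compare with prop.test()'s CI
```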
Re: [R] How to speed up int2bin conversion?
On 12/01/2011 02:46 PM, jim holtman wrote: If we assume that you are just converting 8 bits, here is one way with a table lookup. If more than 8 bits, just partition the data and repeat. This sets up a mapping table one time for the lookup. Does 1M values in 0.3 seconds on my computer; is this fast enough?

# initialize a lookup table of 8-bit binary strings (one time)
require(bitops)
b2c <- character(256)
for (i in 0:255) {
  b2c[i + 1] <- sprintf("%1d%1d%1d%1d%1d%1d%1d%1d"
    , bitAnd(i, 0x80) != 0
    , bitAnd(i, 0x40) != 0
    , bitAnd(i, 0x20) != 0
    , bitAnd(i, 0x10) != 0
    , bitAnd(i, 0x8) != 0
    , bitAnd(i, 0x4) != 0
    , bitAnd(i, 0x2) != 0
    , bitAnd(i, 0x1) != 0
    )
}
# create vector with 1M values
x <- as.integer(1:1e6)
int2bin <- function(val) {
  b2c[bitAnd(val, 0xff) + 1]
}
system.time(int2bin(x))
#   user  system elapsed
#   0.31    0.00    0.32

On Thu, Dec 1, 2011 at 7:14 AM, Jonas Jägermeyr jonas...@pik-potsdam.de wrote: Dear R-help members, I'm processing a large amount of MODIS data where quality assessment information is stored as an integer value for each pixel. I have to convert this number to an 8-digit binary flag to get access to the stored quality code (e.g. int2bin(165, 8) = 1 0 1 0 0 1 0 1). Unfortunately, I did not manage to find a package providing a fast function to do so. I need to run this on millions of pixels and thus wrote the following function:

int2bin <- function(x, ndigits) {
  base <- array(NA, dim=c(length(x), ndigits))
  for (q in 1:ndigits) {
    base[, ndigits-q+1] <- (x %% 2)
    x <- (x %/% 2)
  }
  bin <- apply(base, 1, paste, collapse="")
  return(bin)
}

Since it is still slow, I have to find a way to express this more elegantly. I'd really appreciate any help.
Thanking you, with best regards, Jonas Jägermeyr, Potsdam Institute for Climate Impact Research, Research Domain II

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

Great, Jim! Thanks a lot. New way of thinking, appreciated. I also need 16- and 32-bit flags and hence just added these two functions:

int2bin16 <- function(val) {
  paste(int2bin(val %/% 256), int2bin(val), sep="")
}
int2bin32 <- function(val) {
  paste(int2bin(val %/% 256^3), int2bin(val %/% 256^2), int2bin(val %/% 256), int2bin(val), sep="")
}

Best, Jonas Jägermeyr, Potsdam Institute for Climate Impact Research, Research Domain II

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
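For readers on a recent R without the bitops package, the same lookup-table idea can be written with the base bitwAnd() function (available in current R; this is my own variant, not from the thread), and strtoi() gives a quick round-trip sanity check:

```r
# Base-R variant of the 8-bit lookup table, checked with strtoi().
b2c <- vapply(0:255, function(i)
  paste(as.integer(bitwAnd(i, 2^(7:0)) != 0), collapse = ""),
  character(1))

int2bin <- function(val) b2c[bitwAnd(val, 255L) + 1]

int2bin(165)                      # -> "10100101"
strtoi(int2bin(165), base = 2)    # -> 165, round-trip check
```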
Re: [R] FIML with missing data in sem package
What is your goal? I have used and like mice pretty well, but using mice + sem to try to address missingness seems like more work than using FIML in OpenMx or lavaan to try to address it. Is there a reason you want to use the sem package or a reason you do not want to use the others? I tried lavaan and couldn't get it to work. I noticed it was in beta so I figured it wasn't a good idea to use for simulations I intend to publish (especially when I can't get them to work properly). Either way, whether I use OpenMX or lavaan, it will require me to learn a new syntax, and I guess I'm just lazy. I'm comfortable with sem. On Thu, Dec 1, 2011 at 6:27 AM, Dustin Fife fife.dus...@gmail.com wrote: Good idea. I'll give it a try. Thanks! On Thu, Dec 1, 2011 at 6:18 AM, John Fox j...@mcmaster.ca wrote: Dear Dustin, -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- project.org] On Behalf Of Dustin Fife Sent: November-30-11 9:51 PM To: r-help@r-project.org Subject: [R] FIML with missing data in sem package Is there a way to use full information maximum likelihood (FIML) to estimate missing data in the sem package? For example, suppose I have a dataset with complete information on X1-X3, but missing data (MAR) on X4. Is there a way to use FIML in this case? I know lavaan and openmx allow No, the sem package doesn't handle missing data. You could, however, use another package, such as mice, for multiple imputation of the missing data. Best, John John Fox Senator William McMaster Professor of Social Statistics Department of Sociology McMaster University Hamilton, Ontario, Canada http://socserv.mcmaster.ca/jfox you to do it, but I couldn't find anything in the documentation for the sem package. Thanks! 
-- Dustin Fife Graduate Student Quantitative Psychology University of Oklahoma [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. -- Dustin Fife Graduate Student Quantitative Psychology University of Oklahoma [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joshua Wiley Ph.D. Student, Health Psychology Programmer Analyst II, ATS Statistical Consulting Group University of California, Los Angeles https://joshuawiley.com/ -- Dustin Fife Graduate Student Quantitative Psychology University of Oklahoma [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Nomogram with stratified cph in rms package, how to get failure probability
Got it. Thank you so much for the help! Min Yi MD, PhD Department of Surgical Oncology - Unit 1484 1400 Pressler Street, #FCT17.6061 University of Texas M.D. Anderson Cancer Center P.O. Box 301402 Houston, TX 77230-1402 Phone: 713-563-1874 Fax: 713-792-4689

From: Frank Harrell [via R] [mailto:ml-node+s789695n4129173...@n4.nabble.com] Sent: Thursday, December 01, 2011 10:15 AM To: Yi,Min Subject: Re: Nomogram with stratified cph in rms package, how to get failure probability

Hazard() is not implemented except for parametric survival models. You are not calling nomogram() correctly; in rms the plotting step is separated from the nomogram computations. To plot cumulative event rates do something like: mort10 <- function(lp) 1 - surv(10, lp) and tell nomogram about mort10 using fun=. Frank

Frank Harrell Department of Biostatistics, Vanderbilt University

-- View this message in context: http://r.789695.n4.nabble.com/Nomogram-with-stratified-cph-in-rms-package-how-to-get-failure-probability-tp4123249p4129184.html Sent from the R help mailing list archive at Nabble.com.

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] round to specific intervals
On 01/12/2011 10:55 AM, Michael Kao wrote: Dear R users/helpers, I am wondering whether there is an existing function with which you can round numbers to a set of values. I know you can use 5 * round(x/5) for rounding to the nearest 5 or so, but what if the interval size is not constant? For example:

## Not run
test <- rnorm(100)
round(test, c(1, 5, 10, 20, 50))

so that test is rounded to the closest value in the vector.

I think you could put together such a thing using cut(). Duncan Murdoch

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Random Forests in R
The first version of the package was created by re-writing the main program in the original Fortran as C, and calls other Fortran subroutines that were mostly untouched, so dynamic memory allocation can be done. Later versions have most of the Fortran code translated/re-written in C. Currently the only Fortran part is the node splitting in classification trees. Andy -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Peter Langfelder Sent: Thursday, December 01, 2011 12:33 AM To: Axel Urbiz Cc: R-help@r-project.org Subject: Re: [R] Random Forests in R On Wed, Nov 30, 2011 at 7:48 PM, Axel Urbiz axel.ur...@gmail.com wrote: I understand the original implementation of Random Forest was done in Fortran code. In the source files of the R implementation there is a note C wrapper for random forests: get input from R and drive the Fortran routines.. I'm far from an expert on this...does that mean that the implementation in R is through calls to C functions only (not Fortran)? So, would knowing C be enough to understand this code, or Fortran is also necessary? I haven't seen the C and Fortran code for Random Forest but I understand the note to say that R code calls some C functions that pre-process (possibly re-format etc) the data, then call the actual Random Forest method that's written in Fortran, then possibly post-process the output and return it to R. It would imply that to understand the actual Random Forest code, you will have to read the Fortran source code. Best, Peter __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
Re: [R] How to speed up int2bin conversion?
You might be interested in package bit. --- Jeff Newmiller. Sent from my phone. Please excuse my brevity. Jonas Jägermeyr jonas...@pik-potsdam.de wrote: Dear R-help members, I'm processing a large amount of MODIS data where quality assessment information is stored as an integer value for each pixel. I have to convert this number to an 8-digit binary flag to get access to the stored quality code (e.g. int2bin(165,8) = 1 0 1 0 0 1 0 1). Unfortunately, I did not manage to find a package providing a fast function to do so. I need to run this on millions of pixels and thus wrote the following function.
int2bin <- function(x, ndigits){
  base <- array(NA, dim=c(length(x), ndigits))
  for(q in 1:ndigits){
    base[, ndigits-q+1] <- (x %% 2)
    x <- (x %/% 2)
  }
  bin <- apply(base, 1, paste, collapse="")
  return(bin)
}
Since it is still slow, I have to find a way to express this more elegantly. I'd really appreciate any help. Thanking you, with best regards Jonas Jägermeyr Potsdam Institute for Climate Impact Research Research Domain II
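For what it's worth, the per-digit loop can also be expressed with base R's bitwise operators; a hedged sketch (bitwAnd/bitwShiftR are in base R from 3.0.0 on; for the 2.x series of that era the bitops package offered bitAnd/bitShiftR), not benchmarked here:

```r
## Sketch: vectorized integer-to-binary-string via bitwAnd()/bitwShiftR()
int2bin_fast <- function(x, ndigits) {
  bits <- sapply(rev(seq_len(ndigits)) - 1L,
                 function(b) bitwAnd(bitwShiftR(as.integer(x), b), 1L))
  apply(matrix(bits, nrow = length(x)), 1, paste, collapse = "")
}
int2bin_fast(165, 8)
## [1] "10100101"
```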
Re: [R] round to specific intervals
Thanks for all the responses; I managed to write a small function to do what I want based on the wonderful help.
iround <- function(x, interval){
  ## Round numbers to desired interval
  ## Args:
  ##   x:        Numeric vector to be rounded
  ##   interval: The interval the values should be rounded towards.
  ## Returns:
  ##   A numeric vector with x rounded to the desired interval.
  interval[ifelse(x < min(interval), 1, findInterval(x, interval))]
}
iround(0:100, c(1, 5, 10, 20, 50))
Cheers, and thanks for all the help again. On 1/12/2011 5:20 p.m., Duncan Murdoch wrote: On 01/12/2011 10:55 AM, Michael Kao wrote: Dear R users/helpers, I am wondering whether there is an existing function with which you can round numbers to a set of values. I know you can use 5 * round(x/5) for rounding to the nearest 5 or so, but what if the interval size is not constant? For example: ## Not run test <- rnorm(100) round(test, c(1, 5, 10, 20, 50)) so that test is rounded to the closest value in the vector. I think you could put together such a thing using cut(). Duncan Murdoch
Re: [R] how to get inflection point in binomial glm
On Dec 1, 2011, at 8:24 AM, René Mayer wrote: Dear All, I have a binomial response with one continuous predictor (d) and one factor (g) (8 levels dummy-coded). glm(resp~d*g, data, family=binomial) Y=b0+b1*X1+b2*X2 ... b7*X7 Dear Dr Mayer; I think it might be a bit more complex than that. I think you should get 15 betas rather than 8. Have you done it? how can I get the inflection point per group, e.g., P(d)=.5 Wouldn't that just be at d=1/beta in each group? (Thinking, perhaps naively, in the case of X=X1 that Pr[y==1]/(1-Pr[y==1]) = 1 = exp( beta * d * (X==X1) ) # all other terms = 0 And taking the log of both sides, and then using middle school math to solve. Oh, wait. Muffed my first try on that for sure. Need to add back both the constant intercept and the baseline d coefficient for the non-b0 levels: Pr[y==1]/(1-Pr[y==1]) = 1 = exp( beta_0 + beta_d_0*d + beta_n + beta_d_n*d*(X==Xn) ) And just Pr[y==1]/(1-Pr[y==1]) = 1 = exp( beta_0 + beta_d_0*d ) # for the reference level. This felt like an exam question in my categorical analysis course 25 years ago. (Might have gotten partial credit for my first stab, depending on how forgiving the TA was that night.) I would be grateful for any help. Thanks in advance, René -- David Winsemius, MD West Hartford, CT
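That algebra can be carried out mechanically on a fitted model: P(d) = .5 for a group exactly where that group's linear predictor crosses zero. A hedged illustration on simulated data (the variable and coefficient names below follow R's default treatment coding and are otherwise hypothetical, not the poster's):

```r
## Sketch: per-group d at which P = .5, from glm(resp ~ d * g, binomial)
set.seed(42)
d <- runif(400, -5, 5)
g <- factor(rep(c("a", "b"), each = 200))
p <- plogis(ifelse(g == "a", 1 + 0.8 * d, -1 + 1.2 * d))  # true crossings: -1.25, 0.833
resp <- rbinom(400, 1, p)
fit <- glm(resp ~ d * g, family = binomial)
cf <- coef(fit)
c(a = -cf[["(Intercept)"]] / cf[["d"]],                                # reference level
  b = -(cf[["(Intercept)"]] + cf[["gb"]]) / (cf[["d"]] + cf[["d:gb"]]))  # level "b"
## estimates should land near the true values -1.25 and 0.833
```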
Re: [R] How to do Hotelling's t2 test?
Hi, I want to do a 2-sample Hotelling's test but I can't figure out how. When I type T2.test it says there is no such function, and when I tried library(rrcov) it says there is no such package. Have you installed rrcov using install.packages()? S Ellison *** This email and any attachments are confidential. Any use...{{dropped:8}}
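A sketch of the likely fix (assumes internet access to a CRAN mirror): the package has to be installed once before library() can load it, and T2.test() lives in rrcov.

```r
## Sketch: install rrcov, then run a two-sample Hotelling T2 test
install.packages("rrcov")
library(rrcov)
## e.g. on the four iris measurement columns, split into two groups
T2.test(iris[1:50, 1:4], iris[51:100, 1:4])
```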
Re: [R] Resampling with replacement on a binary (0, 1) dataset to get CIs
On Dec 1, 2011, at 10:49 AM, lincoln wrote: Thanks. So, suppose for one specific year (the first year of 10) the percentage of successes deriving from 100 trials with 38 successes (and 62 failures): its value would be 38/100 = 0.38. I could calculate its confidence interval this way: success <- 38 total <- 100 prop.test(success, total, p=0.5, alternative="two.sided") 1-sample proportions test with continuity correction data: success out of total, null probability 0.5 X-squared = 5.29, df = 1, p-value = 0.02145 alternative hypothesis: true p is not equal to 0.5 95 percent confidence interval: 0.2863947 0.4829411 sample estimates: p 0.38 So it would be var$1=0.38, CI=0.286-0.483. Is it correct? I couldn't tell if this was homework, so I just threw out a starting point. If you were told to use a resampling method then this would just be a starting point and you would be expected to go further with the boot function. -- View this message in context: http://r.789695.n4.nabble.com/Resampling-with-replacement-on-a-binary-0-1-variable-to-get-CIs-tp4127990p4129048.html Sent from the R help mailing list archive at Nabble.com. David Winsemius, MD West Hartford, CT
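A hedged sketch of the resampling route with the boot package (ships with R); the statistic simply resamples the 0/1 vector itself:

```r
## Sketch: percentile bootstrap CI for a proportion of 38 successes in 100
library(boot)
dat <- c(rep(1, 38), rep(0, 62))
b <- boot(dat, function(d, i) mean(d[i]), R = 2000)
boot.ci(b, type = "perc")   # compare with the prop.test() interval
```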
[R] legend, lheight, and alignment
Hello, A bit of fairly simple code, yet I don't seem to be able to manipulate it quite as much as I would like: 1) It would be nice if the objects appearing in the legend were aligned, and by aligned I mean the boxes are centered over the lines. Do I need to adjust the use of NA in the code below to accomplish this? Here's how it appears on my machine: http://r.789695.n4.nabble.com/file/n4129402/example.png 2) Because I'm using a call to 'expression' in the text contained in the legend, it would be nice to space the lines of text a bit more. My feeble attempt was to increase the lheight parameter in the hopes this would affect the legend... to no avail. Further, lheight cannot be added to the argument list of legend. I've been unable to track down another parameter applicable to legend; suggestions very much appreciated.
par(lheight=2)
plot(1, 1, col="white")
legend("center",
       legend=c(expression(paste("Observed ", italic(bar(EC)[e]))),
                expression(paste("Simulated ", italic(bar(EC)[e]))),
                "test1", "test2"),
       lty = c(NA,NA,1,1), col = c("black","black","red","blue"),
       density = c(NA,25,NA,NA), border = c("black","black",NA,NA),
       fill = c("grey","black",NA,NA), angle = c(NA,45,NA,NA),
       cex = 1.25, bty = "n", xpd = TRUE)
sessionInfo() R version 2.13.2 (2011-09-30) Platform: x86_64-pc-mingw32/x64 (64-bit) locale: [1] LC_COLLATE=English_United States.1252 [2] LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 [4] LC_NUMERIC=C [5] LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base loaded via a namespace (and not attached): [1] tools_2.13.2 -- View this message in context: http://r.789695.n4.nabble.com/legend-lheight-and-alignment-tp4129402p4129402.html Sent from the R help mailing list archive at Nabble.com. 
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] FIML with missing data in sem package
On 12/01/2011 05:25 PM, Dustin Fife wrote: What is your goal? I have used and like mice pretty well, but using mice + sem to try to address missingness seems like more work than using FIML in OpenMx or lavaan to try to address it. Is there a reason you want to use the sem package or a reason you do not want to use the others? I tried lavaan and couldn't get it to work. I noticed it was in beta so I figured it wasn't a good idea to use for simulations I intend to publish (especially when I can't get them to work properly). Either way, whether I use OpenMx or lavaan, it will require me to learn a new syntax, and I guess I'm just lazy. I'm comfortable with sem. The 'beta' status of lavaan implies that things (e.g. function arguments) may change from one version to another, until most planned features are implemented. However, the computations/results themselves are pretty stable. Many SEM researchers do use lavaan today for simulation studies, and some have been published already. I can provide you with some references off-list if you are interested. Yves Rosseel. http://lavaan.org
Re: [R] R, PostgresSQL and poor performance
On Thu, Dec 1, 2011 at 10:02 AM, Berry, David I. d...@noc.ac.uk wrote: Hi List Apologies if this isn't the correct place for this query (I've tried a search of the mail archives but not had much joy). I'm running R (2.14.0) on a Mac (OSX v 10.5.8, 2.66GHz, 4GB memory) and am having a few performance issues with reading data in from a Postgres database (using RPostgreSQL). My query / code are as below # -
library('RPostgreSQL')
drv <- dbDriver("PostgreSQL")
dbh <- dbConnect(drv, user=…, password=…, dbname=…, host=…)
sql <- "select id, date, lon, lat, date_trunc('day', date) as jday,
        extract('hour' from date) as hour, extract('year' from date) as year
        from observations
        where pt = 6 and date >= '1990-01-01' and date < '1995-01-01'
        and lon > 180 and lon < 290 and lat > -30 and lat < 30
        and sst is not null"
dataIn <- dbGetQuery(dbh, sql)
If this is a large table of which the desired rows are a small fraction of all rows then be sure there are indexes on the variables in your where clause. You can also try it with the RpgSQL driver, although there is no reason to think that that would be faster. -- Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com
[R] R package source branching
Dear List, I was wondering if any of you worked on an R package which has many branches on a repository i.e. SVN. Is it recommended to branch an R package source tree based on a specific project? I know it depends on project but it would be great to hear opinions from R community. Best, Mehmet Süzen, PhD LEGAL NOTICE This message is intended for the use o...{{dropped:10}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] vector
Hi, On Thu, Dec 1, 2011 at 10:35 AM, Majid golden_boy...@yahoo.com wrote: Hi. Can you please answer my questions about R? 1. How can I write a command for a vector? For example in this sample: I have this: a1 <- c(1:10) now how can I put in the vector? I'm afraid I don't understand your question. a1 is a vector that you've created. What do you want to do with it? Sarah bye for now, Thanks a lot. Majid. -- Sarah Goslee http://www.functionaldiversity.org
[R] What's the baseline model when using coxph with factor variables?
Hi all, I'm trying to fit a Cox regression model with two factor variables but have some problems with the interpretation of the results. Considering the following model, where var1 and var2 can assume value 0 and 1: coxph(Surv(time, cens) ~ factor(var1) * factor(var2), data=temp) What is the baseline model? Is that considering the whole population or the case when both var1 and var2 = 0? Kind regards, andi __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] References for book R In Action by Kabacoff (Uwe Ligges)
The references are here: http://manning.com/kabacoff/excerpt_references.pdf (they will be included on the next printing too, got omitted by mistake) regards, Pablo [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Counting the occurrences of a character within a string
I am new to R but am an experienced SAS user, and I was hoping to get some help on counting the occurrences of a character within a string at a row level. My dataframe, x, is structured as below: Col1 abc/def ghi/jkl/mno I found this code on the board but it counts all occurrences of / in the dataframe. chr.pos <- which(unlist(strsplit(x, NULL)) == '/') chr.count <- length(chr.pos) chr.count [1] 3 I'd like to append a column, say cnt, that has the count of / for each row. Can anyone point me in the right direction or offer some code to do this? Thanks in advance for the help. Doug Esneault Privileged/Confidential Information may be contained in this message. If you are not the addressee indicated in this message (or responsible for delivery of the message to such person), you may not copy or deliver this message to anyone. In such case, you should destroy this message and kindly notify the sender by reply email. Please advise immediately if you or your employer does not consent to email for messages of this kind. Opinions, conclusions and other information in this message that do not relate to the official business of the GroupM companies shall be understood as neither given nor endorsed by it. GroupM companies are a member of WPP plc. For more information on our business ethical standards and Corporate Responsibility policies please refer to our website at http://www.wpp.com/WPP/About/
[R] Estimation of AR(1) Model with Markov Switching
Dear R users, I have been trying to obtain the MLE of the following model state 0: y_t = 2 + 0.5 * y_{t-1} + e_t state 1: y_t = 0.5 + 0.9 * y_{t-1} + e_t where e_t ~ iidN(0,1) transition probability between states is 0.2 I've generated some fake data and tried to estimate the parameters using the constrOptim() function but I can't get sensible answers using it. I've tried using nlminb and maxLik but they haven't helped. Any tips on how I could possibly rewrite my likelihood in a better way to improve my results would be welcome. Given below is my R code
# markov switching regime model
# generate data for an AR(1) markov switching model with the following pars
# state 0: y_t = 2 + 0.5 * y_{t-1} + e_t
# state 1: y_t = 0.5 + 0.9 * y_{t-1} + e_t
# where e_t ~ N(0,1)
# transition probabilities p_s0_s1 = p_s1_s0 = 0.20
# generate realisations of the state
gamma_s0 <- qnorm(0.8)
gamma_s1 <- qnorm(0.2)
gamma <- rep(0,100)
state <- rep(0,100)
# choose initial state at t=0 to be state 0
gamma[1] <- gamma_s0
state[1] <- 0
for(i in 2:100) {
  if(rnorm(1) < gamma[i-1]) {
    gamma[i] <- gamma_s0
    state[i] <- 0
  } else {
    gamma[i] <- gamma_s1
    state[i] <- 1
  }
}
# generate observations
# choose y_0 = 0
# recall state at t=1 was set to 0
y1 <- 2 + 0.5 * 0 + rnorm(1)
y <- rep(0,100)
y[1] <- y1
for(i in 2:100) {
  if(state[i]==0) {
    y[i] <- 2 + 0.5 * y[i-1] + rnorm(1)
  } else {
    y[i] <- 0.5 + 0.9 * y[i-1] + rnorm(1)
  }
}
# convert into time series object
y <- ts(y, start = 1, freq = 1)
# construct negative conditional likelihood function
neg.logl <- function(theta, data) {
  # construct parameters
  beta_s0 <- theta[1:2]
  beta_s1 <- theta[3:4]
  sigma2 <- exp(theta[5])
  gamma0 <- theta[6]
  gamma1 <- theta[7]
  # construct probabilities
  # probit specification
  p_s0_s0 <- pnorm(gamma_s0)
  p_s0_s1 <- pnorm(gamma_s1)
  p_s1_s0 <- 1 - pnorm(gamma_s0)
  p_s1_s1 <- 1 - pnorm(gamma_s1)
  # create data matrix
  X <- cbind(1, y)
  # assume ergodicity of the markov chain
  # use unconditional probabilities
  p0_s0 <- (1 - p_s1_s1) / (2 - p_s0_s0 - p_s1_s1)
  p0_s1 <- 1 - p0_s0
  # create variables
  p_s0_t_1 <- rep(0, nrow(X))
  p_s1_t_1 <- rep(0, nrow(X))
  p_s0_t <- rep(0, nrow(X))
  p_s1_t <- rep(0, nrow(X))
  f_s0 <- rep(0, nrow(X)-1)
  f_s1 <- rep(0, nrow(X)-1)
  f <- rep(0, nrow(X)-1)
  logf <- rep(0, nrow(X)-1)
  p_s0_t[1] <- p0_s0
  p_s1_t[1] <- p0_s1
  # initiate hamilton filter
  for(i in 2:nrow(X)) {
    # calculate prior probabilities using the TPT
    # TPT for this example gives us
    # p_si_t_1 = p_si_t_1_si * p_si_t + p_si_t_1_sj * p_si_t
    # where p_si_t_1 is the prob state_t = i given information @ time t-1
    # p_si_t_1_sj is the prob state_t = i given state_t_1 = j, and all info @ time t-1
    # p_si_t is the prob state_t = i given information @ time t
    # in this simple example p_si_t_1_sj = p_si_sj
    p_s0_t_1[i] <- (p_s0_s0 * p_s0_t[i-1]) + (p_s0_s1 * p_s1_t[i-1])
    p_s1_t_1[i] <- (p_s1_s0 * p_s0_t[i-1]) + (p_s1_s1 * p_s1_t[i-1])
    # calculate density function for observation i
    # f_si is density conditional on state = i
    # f is the density
    f_s0[i] <- dnorm(y[i] - X[i-1,] %*% beta_s0, sd = sqrt(sigma2))
    f_s1[i] <- dnorm(y[i] - X[i-1,] %*% beta_s1, sd = sqrt(sigma2))
    f[i] <- (f_s0[i] * p_s0_t_1[i]) + (f_s1[i] * p_s1_t_1[i])
    # calculate filtered/posterior probabilities using bayes rule
    # p_si_t is the prob that state = i given information @ time t
    p_s0_t[i] <- (f_s0[i] * p_s0_t_1[i]) / f[i]
    p_s1_t[i] <- (f_s1[i] * p_s1_t_1[i]) / f[i]
    logf[i] <- log(f[i])
  }
  logl <- sum(logf)
  return(-logl)
}
# restrict intercept in state model 0 to be greater than intercept in state model 1
# thus matrix of restrictions R is [1 0 -1 0 0 0 0]
R <- matrix(c(1,0,-1,0,0,0,0), nrow = 1)
# pick start values for the 7 unknown parameters
start_val <- matrix(runif(7), nrow = 7)
# ensures starting values are in the feasible set
start_val[1,] <- start_val[3,] + 0.1
# estimate pars
results <- constrOptim(start_val, neg.logl, grad = NULL, ui = R, ci = 0)
Regards, N -- View this message in context: http://r.789695.n4.nabble.com/Estimation-of-AR-1-Model-with-Markov-Switching-tp4129417p4129417.html Sent from the R help mailing list archive at 
Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] bigmemory on Solaris
Thanks again for your help. I've been able to add several packages; bigmemory seems to be the only one to fail, and it fails on isinf. Is there a way I can download the code and change it to include an isinf function or definition? I'm using the GNU compiler; should I have been using the SUN Studio compiler? Roger -- View this message in context: http://r.789695.n4.nabble.com/bigmemory-on-Solaris-tp4128179p4129290.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] What's the baseline model when using coxph with factor variables?
On Dec 1, 2011, at 12:00 PM, Andreas Schlicker wrote: Hi all, I'm trying to fit a Cox regression model with two factor variables but have some problems with the interpretation of the results. Considering the following model, where var1 and var2 can assume value 0 and 1: coxph(Surv(time, cens) ~ factor(var1) * factor(var2), data=temp) What is the baseline model? Is that considering the whole population or the case when both var1 and var2 = 0? This has been discussed several times in the past on rhelp. My suggestion would be to search your favorite rhelp archive using baseline hazard Therneau, since Terry Therneau is the author of survival. (The answer is closer to the first than to the second.) Kind regards, andi __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] logistic regression - glm.fit: fitted probabilities numerically 0 or 1 occurred
Sorry if this is a duplicate: this is a re-post because the pdfs mentioned below did not go through. Hello, I'm newish to R, and very new to glm. I've read a lot about my issue: Warning message: glm.fit: fitted probabilities numerically 0 or 1 occurred ... including: http://tolstoy.newcastle.edu.au/R/help/05/07/7759.html http://r.789695.n4.nabble.com/glm-fit-quot-fitted-probabilities-numerically-0-or-1-occurred-quot-td849242.html (note that I never found: MASS4 pp. 197-8. However, Ted's post was quite helpful.) This is a common question, sorry. Because it is a common issue I am posting everything I know about the issue and how I think I am not falling into the same trap as the others (but I must be, due to some reason I am not yet aware of). From the two links above I gather that my warning "glm.fit: fitted probabilities numerically 0 or 1 occurred" arises from a perfect-fit situation (i.e. the issue where all the high-value x's (predictor variables) are Y=1 (response=1) or the other way around). I don't feel my data has this issue. Please point out how it does! The list post instructions state that I can attach pdfs, so I attached plots of my data right before the call to glm. 
The attachments are plots of my data stored in variable l_yx (as can be seen in the axis names): My response (vertical axis) by row index (horizontal axis): plot(l_yx[,1], type='h') My predictor variable (vertical axis) by row index (horizontal axis): plot(l_yx[,2], type='h') So here is more info on my data frame/data (in case you can't see my pdf attachments): unique(l_yx[,1]) [1] 0 1 mean(l_yx[,2]) [1] 0.01123699 max(l_yx[,2]) [1] 14.66518 min(l_yx[,2]) [1] 0 attributes(l_yx) $dim [1] 690303 2 $dimnames $dimnames[[1]] NULL $dimnames[[2]] [1] "y" "x" With the above data I do: l_logit = glm(y~x, data=as.data.frame(l_yx), family=binomial(link="logit")) Warning message: glm.fit: fitted probabilities numerically 0 or 1 occurred Why am I getting this warning when I have data points of varying values for y=1 and y=0? In other words, I don't think I have the linear separation issue discussed in one of the links I provided. PS - Then I do this and I get an odds ratio of a crazy size: l_sm = summary(l_logit) # coef pval is $coefficients[8], log odds $coefficients[2] l_exp_coef = exp(l_logit$coefficients)[2] # exponentiate the coefficients l_exp_coef x 3161.781 So for one unit increase in the predictor variable I get a 3160.781% (3161.781 - 1 = 3160.781) increase in odds? That can't be correct either. How do I correct for this issue? (I tried multiplying the predictor variables by a constant and the odds ratio goes down, but the warning above still persists, and shouldn't the odds ratio be independent of the predictor variable's scale?) Thank you for your help! Ben
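A small fabricated example of how (near-)separation produces exactly this warning; all names below are made up, not the poster's data. Comparing the range of x within each response class is a quick screen for the problem:

```r
## Sketch: two classes with almost disjoint x ranges -- glm() typically
## warns "fitted probabilities numerically 0 or 1 occurred" here
set.seed(1)
x <- c(rnorm(50, 0), rnorm(50, 8))
y <- rep(0:1, each = 50)
fit <- glm(y ~ x, family = binomial)
tapply(x, y, range)   # the two per-class ranges barely overlap
```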
Re: [R] What's the baseline model when using coxph with factor variables?
Terry will correct me if I'm wrong, but I don't think the answer to this question is specific to the coxph function. For all the [well-written] formula-based modelling functions (essentially, those that call model.frame and model.matrix to interpret the formula) the option "contrasts" controls how factor variables are parameterized in the model matrix. contr.treatment makes the baseline the first factor level, contr.SAS makes the baseline the last, contr.sum makes the baseline the mean, etc. E.g.,
df <- data.frame(time=sin(1:20)+2, cens=rep(c(0,0,1), len=20), var1=factor(rep(0:1, each=10)), var2=factor(rep(0:1, 10)))
options(contrasts=c("contr.treatment", "contr.treatment"))
coxph(Surv(time, cens) ~ var1 + var2, data=df)
Call: coxph(formula = Surv(time, cens) ~ var1 + var2, data = df)
        coef exp(coef) se(coef)      z    p
var11 0.1640      1.18    0.822 0.1995 0.84
var21 0.0806      1.08    0.830 0.0971 0.92
Likelihood ratio test=0.05 on 2 df, p=0.974 n= 20, number of events= 6
options(contrasts=c("contr.SAS", "contr.SAS"))
coxph(Surv(time, cens) ~ var1 + var2, data=df)
Call: coxph(formula = Surv(time, cens) ~ var1 + var2, data = df)
         coef exp(coef) se(coef)       z    p
var10 -0.1640     0.849    0.822 -0.1995 0.84
var20 -0.0806     0.923    0.830 -0.0971 0.92
Likelihood ratio test=0.05 on 2 df, p=0.974 n= 20, number of events= 6
options(contrasts=c("contr.sum", "contr.sum"))
coxph(Surv(time, cens) ~ var1 + var2, data=df)
Call: coxph(formula = Surv(time, cens) ~ var1 + var2, data = df)
         coef exp(coef) se(coef)       z    p
var11 -0.0820     0.921    0.411 -0.1995 0.84
var21 -0.0403     0.960    0.415 -0.0971 0.92
Likelihood ratio test=0.05 on 2 df, p=0.974 n= 20, number of events= 6
(lm() has a contrasts argument that can override getOption("contrasts") and set different contrasts for each variable but coxph() does not have that argument.) 
Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of David Winsemius Sent: Thursday, December 01, 2011 9:36 AM To: a.schlic...@nki.nl Cc: r-help@r-project.org Subject: Re: [R] What's the baseline model when using coxph with factor variables? On Dec 1, 2011, at 12:00 PM, Andreas Schlicker wrote: Hi all, I'm trying to fit a Cox regression model with two factor variables but have some problems with the interpretation of the results. Considering the following model, where var1 and var2 can assume value 0 and 1: coxph(Surv(time, cens) ~ factor(var1) * factor(var2), data=temp) What is the baseline model? Is that considering the whole population or the case when both var1 and var2 = 0? This has been discussed several times in the past on rhelp. My suggestion would be to search your favorite rhelp archive using baseline hazard Therneau, since Terry Therneau is the author of survival. (The answer is closer to the first than to the second.) Kind regards, andi __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
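Bill's closing note about lm() can be illustrated in one call; a sketch (the data frame is made up, mirroring his example):

```r
## Sketch: per-variable contrasts via lm()'s `contrasts` argument,
## which coxph() lacks
df <- data.frame(y = rnorm(20),
                 var1 = factor(rep(0:1, each = 10)),
                 var2 = factor(rep(0:1, 10)))
lm(y ~ var1 + var2, data = df,
   contrasts = list(var1 = "contr.sum", var2 = "contr.treatment"))
```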
Re: [R] Counting the occurrences of a character within a string
## It's not a data frame -- it's just a vector. x [1] "abc/def" "ghi/jkl/mno" gsub("[^/]", "", x) [1] "/" "//" nchar(gsub("[^/]", "", x)) [1] 1 2 ?gsub ?nchar -- Bert On Thu, Dec 1, 2011 at 8:32 AM, Douglas Esneault douglas.esnea...@mecglobal.com wrote: I am new to R but am an experienced SAS user, and I was hoping to get some help on counting the occurrences of a character within a string at a row level. My dataframe, x, is structured as below: Col1 abc/def ghi/jkl/mno I found this code on the board but it counts all occurrences of / in the dataframe. chr.pos <- which(unlist(strsplit(x, NULL)) == '/') chr.count <- length(chr.pos) chr.count [1] 3 I'd like to append a column, say cnt, that has the count of / for each row. Can anyone point me in the right direction or offer some code to do this? Thanks in advance for the help. Doug Esneault 
-- Bert Gunter Genentech Nonclinical Biostatistics Internal Contact Info: Phone: 467-7374 Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
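To finish the original request (appending a cnt column per row), a sketch building on Bert's nchar/gsub idiom; the data frame below re-creates the example from the question:

```r
## Sketch: count "/" per row and append it as a column
x <- data.frame(Col1 = c("abc/def", "ghi/jkl/mno"), stringsAsFactors = FALSE)
x$cnt <- nchar(x$Col1) - nchar(gsub("/", "", x$Col1, fixed = TRUE))
x
##          Col1 cnt
## 1     abc/def   1
## 2 ghi/jkl/mno   2
```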
Re: [R] Writing a function, want a string argument to define the name of the excel sheet to be called
Just a heads up -- I don't think your code will work with an actual .xls(x) file, only .txt, .csv, etc (aka, plain text files). I may be wrong about that, but if you actually need to work with Excel files directly you will need an additional package. Michael On Thu, Dec 1, 2011 at 9:10 AM, AOLeary aodha...@gmail.com wrote: Thank you very much, that does exactly what I want it to! :) Aodhán -- View this message in context: http://r.789695.n4.nabble.com/Writing-a-function-want-a-string-argument-to-define-the-name-of-the-excel-sheet-to-be-called-tp4128384p4128495.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] transform data.frame holding answers -- data.frame holding logicals
Hello, I have a data frame, x, holding 5 persons' answers to the question of which cars they have used:

# the data frame
x <- as.data.frame(matrix(
  c('BMW', '', '', 'Mercedes', 'VW',
    '', 'Skoda', 'VW', 'BMW', '',
    '', '', 'VW', 'Skoda', ''),
  ncol = 3,
  dimnames = list(NULL, paste('v', 1:3, sep = ''))))

How do I transform this data frame into one stating whether each person has used the presented car?

  BMW Mercedes VW Skoda
1   T        F  F     F
2   F        T  T     F
3   T        F  T     T
4   F        F  F     F
5   F        F  T     T

My idea was to first find all levels by:

v <- unique(unlist(lapply(x, levels)))

But then I am stuck. Thanks for help, *S* -- Sascha Vieweg, saschav...@gmail.com
Re: [R] Replace columns in a data.frame randomly splitted
I repeat myself: any more automated solution will depend on whether your data has rownames or not. [...] Create a plain-text representation of R data using the dput() command. Another way that might make more sense is to cbind() the data you need later on before the split, and then it will be carried through. Michael

On Thu, Dec 1, 2011 at 2:54 AM, agent dunham crossp...@hotmail.com wrote: The thing is that I've already been working with df1, and I was looking for a function that could replace values knowing rows and columns. Does it exist? Thank you, u...@host.com
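For the specific question "a function that could replace values knowing rows and columns": no special function is needed. A minimal sketch (the data frame `df1` and its row names here are illustrative, not the poster's actual data) -- indexing by row and column names on the left-hand side of an assignment replaces values in place:

```r
# Replacement assignment by row names and column names.
df1 <- data.frame(a = 1:4, b = letters[1:4],
                  row.names = c("r1", "r2", "r3", "r4"),
                  stringsAsFactors = FALSE)
df1[c("r2", "r4"), "a"] <- c(20L, 40L)   # replace two cells in column a
df1["r3", "b"] <- "z"                    # replace one cell in column b
df1$a
# [1]  1 20  3 40
```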
Re: [R] logistic regression - glm.fit: fitted probabilities numerically 0 or 1 occurred
On Dec 1, 2011, at 18:54, Ben quant wrote: Sorry if this is a duplicate: this is a re-post because the pdfs mentioned below did not go through.

Still not there. Sometimes that's because your mailer doesn't label them with the appropriate MIME type (e.g. as application/octet-stream, which is arbitrary binary). Anyway, see below. [snip]

With the above data I do:

l_logit <- glm(y ~ x, data = as.data.frame(l_yx), family = binomial(link = "logit"))
Warning message:
glm.fit: fitted probabilities numerically 0 or 1 occurred

Why am I getting this warning when I have data points of varying values for y=1 and y=0? In other words, I don't think I have the linear-separation issue discussed in one of the links I provided.

I bet that you do... You can get the warning without that effect (one of my own examples is the probability of menarche in a data set that includes infants and old-age pensioners), but not with a huge odds ratio as well. Take a look at

d <- as.data.frame(l_yx)
with(d, y[order(x)])

If it comes out as all zeros followed by all ones, or vice versa, then you have the problem.

PS - Then I do this and I get an odds ratio of a crazy size:

l_sm <- summary(l_logit) # coef pval is $coefficients[8], log odds $coefficients[2]
l_exp_coef <- exp(l_logit$coefficients)[2] # exponentiate the coefficients
l_exp_coef
       x
3161.781

So for a one-unit increase in the predictor variable I get a 3160.781% (3161.781 - 1 = 3160.781) increase in odds? That can't be correct either. How do I correct for this issue? (I tried multiplying the predictor variables by a constant and the odds ratio goes down, but the warning above still persists, and shouldn't the odds ratio be independent of the scale of the predictor variable?)
-- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com
[R] Change the limits of a plot a posteriori
Hi all, how can I change the limits (xlim or ylim) of a plot that has already been created? For example, consider this naive example:

curve(dbeta(x,2,4))
curve(dbeta(x,8,13), add=TRUE, col=2)

When adding the second curve, it goes off the original limits computed by R for the first graph, which are roughly c(0, 2.1). I know two obvious solutions for this: 1) passing a sufficiently large parameter, e.g. ylim=c(0,4), to the first graphic:

curve(dbeta(x,2,4), ylim=c(0,4))
curve(dbeta(x,8,13), add=TRUE, col=2)

or 2) switching the order in which I plot the curves:

curve(dbeta(x,8,13), col=2)
curve(dbeta(x,2,4), add=TRUE)

But I wonder whether there is any way of adjusting the limits of the graphic a posteriori, once you have a plot with the undesired limits, forcing R to redraw it with the new limits but without having to execute the curve commands again. Hope I made myself clear. Best regards and thank you very much in advance.
[R] Assign name to object for each iteration in a loop.
Hi R-users, I'm trying to produce decompositions of a multiple time series, grouped by a factor (called area). I'm modifying the code in the STLperArea function of package ndvits, as this function only produces stl plots; it does not return the underlying data. I want to extract the trend component of each decomposition (x$time.series[, "trend"]) and assign it a name based on the factor area. My input data look like this. area is a factor with three (but could be many more) levels:

area
1
2
3

Ystart = 2000. TS is a time series:

    X249   X265   X281   X297 X2000113
1 0.2080 0.2165 0.2149 0.2314   0.2028
2 0.1578 0.1671 0.1577 0.1593   0.1672
3 0.1897 0.1948 0.2290 0.2292   0.2067

Here's the function:

STLpA <- function(TS, area, Ystart, period=23, nSG=c(5,5), DSG=0) {
  require(RTisean)
  for (i in 1:unique(area)) {
    vi.metric <- TS[area==i]
    filt.vi <- sav_gol(vi.metric, n=nSG, D=DSG)
    vi.sg <- ts(filt.vi[,1], start=Ystart, frequency=period)
    stld.tmp <- stl(vi.sg, s.window="periodic", robust=TRUE, na.action=na.approx)
    stld.trend <- stld.temp$time.series[, "trend"]
  }
  assign(paste("stld", i, sep="."), stld.trend)
  vi.trend <- ls(pattern="^stld..$")
  return(vi.trend)
}

When I call this function with

signal <- STLpA(TS, area, Ystart=2000, period=23, nSG=c(5,5), DSG=0)

I get this error:

Error in cat(list(...), file, sep, fill, labels, append) :
  argument 1 (type 'list') cannot be handled by 'cat'
In addition: Warning message:
In 1:unique(area) : numerical expression has 3 elements: only the first used

I'm guessing this is because I'm assigning names to each temporary stl.trend object incorrectly. Can anyone improve on my rather poor efforts here? Many thanks, Louise
[R] nested random effects with lmer
Hi, I have a multilevel situation where subjects are nested within clinics, and each subject has multiple measurements. For simplicity, suppose there are 4 clinics, 3 subjects per clinic, and each subject has 3 repeated measures. The outcome is continuous. I am trying to implement this model with the lmer function in the lme4 library:

lmeobj1 <- lmer(outcome ~ med + time + medtime + (time|clinid) + (subjid|clinid))

My data set is as follows: med has two levels, time has three levels, medtime is the interaction between med and time, subjid holds the id numbers for subjects, and clinid the id numbers for clinics. I pasted the data below. The above code is giving the same results as a fixed-effects regression. In addition, whatever I put as random effects is not making a difference at all. What am I doing wrong?

cbind(outcome, med, time, medtime, subjid, clinid)
         outcome med time medtime subjid clinid
 [1,]  0.8511317   0    0       0     10      4
 [2,]  3.9524773   0    1       0     10      4
 [3,]  2.1637332   0    2       0     10      4
 [4,]  1.6713403   1    0       0      1      1
 [5,]  1.7379582   1    1       1      1      1
 [6,]  5.0951544   1    2       2      1      1
 [7,]  0.9523377   1    0       0      2      1
 [8,]  2.4599613   1    1       1      2      1
 [9,]  5.1759744   1    2       2      2      1
[10,] -0.9495338   1    0       0      4      2
[11,]  2.9300264   1    1       1      4      2
[12,]  3.7025651   1    2       2      4      2
[13,]  1.1628554   1    0       0      6      2
[14,]  4.1536130   1    1       1      6      2
[15,]  5.0869188   1    2       2      6      2
[16,]  0.3114241   0    0       0      7      3
[17,]  0.6007636   0    1       0      7      3
[18,]  2.2783941   0    2       0      7      3
[19,] -0.7095164   0    0       0      8      3
[20,] -0.1541448   0    1       0      8      3
[21,]  1.6037837   0    2       0      8      3
[22,]  1.9994632   0    0       0      9      3
[23,]  0.8047261   0    1       0      9      3
[24,]  0.1828438   0    2       0      9      3
[25,]  2.7219649   0    0       0     12      4
[26,]  2.3046305   0    1       0     12      4
[27,]  1.1838549   0    2       0     12      4
[28,]  0.6292358   1    0       0      3      1
[29,]  2.9953366   1    1       1      3      1
[30,]  3.9291266   1    2       2      3      1
[31,]  0.8726671   0    0       0     11      4
[32,] -0.4657651   0    1       0     11      4
[33,]  3.0364098   0    2       0     11      4
[34,]  0.3046405   1    0       0      5      2
[35,]  0.8789185   1    1       1      5      2
[36,]  4.1729769   1    2       2      5      2

Thanks,
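A minimal sketch of the usual lme4 spelling for subjects nested within clinics (this is an illustrative model with simulated data, not the poster's exact analysis): use the `/` nesting operator for the grouping factors, make the grouping variables factors, and pass the data via `data=`. The fit itself runs only if lme4 is installed.

```r
# Random intercepts for clinic and for subject-within-clinic:
# (1 | clinid/subjid) expands to (1 | clinid) + (1 | clinid:subjid).
set.seed(1)
d <- data.frame(
  outcome = rnorm(36),
  med     = rep(c(0, 1), each = 18),
  time    = rep(0:2, times = 12),
  subjid  = factor(rep(1:12, each = 3)),   # 3 measurements per subject
  clinid  = factor(rep(1:4, each = 9))     # 3 subjects per clinic
)
if (requireNamespace("lme4", quietly = TRUE)) {
  fit <- lme4::lmer(outcome ~ med * time + (1 | clinid/subjid), data = d)
}
```

With simulated noise the variance components will be near zero; the point is the formula syntax, not the estimates.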
[R] question about plsr() results
Hi, with some help I learned how to use the plsr(Y~x, 2, data=my) function in R (and how to build my from vector Y and matrix x). But I still have a question about interpreting the results. In the end I want to construct a prediction function of the form:

Y = a1*x1 + a2*x2

But I do not understand how to do it. The documentation does not describe this. Thanks for help. ;)
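A minimal sketch of one way to get the a1, a2 coefficients, assuming the `pls` package (which provides plsr): `coef()` on the fitted object returns the regression coefficients on the original variable scale. The toy data below are invented for illustration; the fit runs only if `pls` is installed.

```r
# Simulated data with known coefficients a1 = 2, a2 = -1.
set.seed(1)
x <- matrix(rnorm(100 * 2), ncol = 2, dimnames = list(NULL, c("x1", "x2")))
Y <- 2 * x[, 1] - 1 * x[, 2] + rnorm(100, sd = 0.1)
my <- data.frame(Y = Y, x = I(x))       # matrix predictor kept intact via I()
if (requireNamespace("pls", quietly = TRUE)) {
  fit <- pls::plsr(Y ~ x, 2, data = my)
  # Coefficients for the 2-component model, with intercept:
  a <- drop(coef(fit, ncomp = 2, intercept = TRUE))
  a   # roughly (0, 2, -1) for this toy data
}
```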
Re: [R] Change the limits of a plot a posteriori
jcano wrote on 12/01/2011 12:12:03 PM: How can I change the limits (xlim or ylim) of a plot that has already been created? [... full question quoted above; snipped here ...]

There is no way to adjust the limits of the plot a posteriori. You could set it up so that the limits of the plot are determined directly from the data you are plotting:

x <- seq(0, 1, 0.01)
y.range <- range(dbeta(x, 2, 4), dbeta(x, 8, 13))
curve(dbeta(x, 2, 4), ylim = y.range)
curve(dbeta(x, 8, 13), ylim = y.range, add = TRUE, col = 2)

Jean
Re: [R] how to get inflection point in binomial glm
Thanks David and Rubén! @David: indeed 15 betas, I forgot the interaction terms, thanks for correcting! @Rubén: would the re-parameterization be done within nls()? How to do this practically with the factor predictor included? And yes, within each group we can solve for Y=0, getting:

0 = b0 + b1*X      | -b0
-b0 = b1*X         | /b1
-b0/b1 = X

but I was hoping there might be a more general solution for the case of multiple logistic regression. HTH, René

Zitat von David Winsemius dwinsem...@comcast.net: On Dec 1, 2011, at 8:24 AM, René Mayer wrote: Dear All, I have a binomial response with one continuous predictor (d) and one factor (g) (8 levels, dummy-coded).

glm(resp ~ d*g, data, family=binomial)
Y = b0 + b1*X1 + b2*X2 ... b7*X7

Dear Dr Mayer; I think it might be a bit more complex than that. I think you should get 15 betas rather than 8. Have you done it?

How can I get the inflection point per group, e.g., P(d)=.5?

Wouldn't that just be at d = 1/beta in each group? (Thinking, perhaps naively, in the case of X=X1 that Pr[y==1]/(1-Pr[y==1]) = 1 = exp(beta*d*(X==X1)), all other terms = 0.) And taking the log of both sides, and then use middle-school math to solve. Oh, wait. Muffed my first try on that for sure. Need to add back both the constant intercept and the baseline d coefficient for the non-b0 levels:

Pr[y==1]/(1-Pr[y==1]) = 1 = exp(beta_0 + beta_d_0*d + beta_n + beta_d_n*d*(X==Xn))

and just

Pr[y==1]/(1-Pr[y==1]) = 1 = exp(beta_0 + beta_d_0*d)  # for the reference level.

This felt like an exam question in my categorical analysis course 25 years ago. (Might have gotten partial credit for my first stab, depending on how forgiving the TA was that night.) I would be grateful for any help. Thanks in advance, René

-- David Winsemius, MD West Hartford, CT
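The -b0/b1 algebra above can be checked directly in R. A minimal sketch for a single group with simulated data (a full per-group version would apply the same formula to each group's combined intercept and slope):

```r
# At the inflection point P(d) = 0.5 the linear predictor is 0,
# so d = -intercept/slope.
set.seed(42)
d <- seq(-3, 3, length.out = 200)
p <- plogis(2 * (d - 0.5))               # true inflection at d = 0.5
y <- rbinom(length(d), 1, p)
fit <- glm(y ~ d, family = binomial)
b <- coef(fit)
inflect <- -b[1] / b[2]                  # estimated d where P = 0.5
# The fitted probability at that point is exactly 0.5 by construction:
predict(fit, data.frame(d = as.numeric(inflect)), type = "response")
```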
Re: [R] Assign name to object for each iteration in a loop.
I think part of your problem is the loop: you probably mean

for (i in unique(area))

Michael

On Thu, Dec 1, 2011 at 1:13 PM, lglew l.g...@soton.ac.uk wrote: Hi R-users, I'm trying to produce decompositions of a multiple time series, grouped by a factor (called area). [... full question quoted above; snipped here ...]
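Beyond the loop fix, the assign()/ls() pattern is usually replaced by collecting one result per group in a named list. A minimal self-contained sketch (a simple per-group transformation stands in for the stl() trend extraction, so no extra packages are needed):

```r
# Collect one "trend" per area level in a named list instead of assign().
set.seed(1)
area <- factor(c(1, 2, 3))
TS <- matrix(rnorm(15), nrow = 3)        # one row of observations per area
trends <- lapply(levels(area), function(i) {
  vi.metric <- TS[area == i, ]
  cumsum(vi.metric)                      # placeholder for the stl trend
})
names(trends) <- paste("stld", levels(area), sep = ".")
names(trends)
# [1] "stld.1" "stld.2" "stld.3"
```

The whole list can then be returned from the function in one go, which avoids both the `assign()` scoping trouble and the `ls()` pattern match.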
Re: [R] Change the limits of a plot a posteriori
On 01/12/2011 1:12 PM, jcano wrote: Hi all, how can I change the limits (xlim or ylim) of a plot that has already been created? [... full question quoted above; snipped here ...]

You can't, if you're using classic R graphics. They use an ink-on-paper model of graphics: if you want to change what you've drawn, you get a new piece of paper. Your two solutions below are the usual methods to work around this limitation. The first is the easiest method in general, though it's not particularly easy if you're using curve(). Generally, if you're planning to plot y1 vs x1 and y2 vs x2, you can set your ylim to range(c(y1, y2)) and your xlim to range(c(x1, x2)) and things are fine. If you want to follow this strategy with curve(), you need to call it twice for each curve: once to find the range of points it would plot (they are in the return value of the function), and a second time to redo the plot with the calculated xlim and ylim. Duncan Murdoch
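Duncan's two-pass suggestion can be sketched as follows: curve() invisibly returns the x and y values it plotted, so a throwaway first pass can be used just to compute the limits. (pdf(NULL) opens a null device so the probe plots produce no output.)

```r
# Pass 1: probe both curves for their y values on a null device.
pdf(NULL)
c1 <- curve(dbeta(x, 2, 4), from = 0, to = 1)
c2 <- curve(dbeta(x, 8, 13), from = 0, to = 1)
dev.off()
ylim <- range(c1$y, c2$y)

# Pass 2: redraw with limits that accommodate both curves.
pdf(NULL)
curve(dbeta(x, 2, 4), from = 0, to = 1, ylim = ylim)
curve(dbeta(x, 8, 13), add = TRUE, col = 2)
dev.off()
```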
Re: [R] transform data.frame holding answers -- data.frame holding logicals
saschaview wrote on 12/01/2011 12:30:18 PM: I have a data frame, x, holding 5 persons' answers to the question of which cars they have used. [... full question quoted above; snipped here ...]

Try this:

uniq.cars <- sort(unique(unlist(x)))
y <- t(apply(x, 1, function(xrow) match(uniq.cars, unique(xrow))))
dimnames(y)[[2]] <- uniq.cars
y2 <- !is.na(y[, -1])
y2

Jean
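An alternative minimal sketch in base R: build one logical column per car, TRUE where that car appears anywhere in the person's row. (The data frame below reproduces the five persons from the question, laid out so that each row holds one person's answers.)

```r
# One logical column per car, per-row membership test.
x <- data.frame(v1 = c('BMW', 'Mercedes', 'Skoda', '', 'VW'),
                v2 = c('',    'VW',       'VW',    '', 'Skoda'),
                v3 = c('',    '',         'BMW',   '', ''),
                stringsAsFactors = FALSE)
cars <- setdiff(sort(unique(unlist(x))), '')   # drop the empty answer
used <- sapply(cars, function(cc) apply(x == cc, 1, any))
used   # 5 x 4 logical matrix, columns BMW / Mercedes / Skoda / VW
```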
Re: [R] vector
Hi, On Thu, Dec 1, 2011 at 2:43 PM, Majid golden_boy...@yahoo.com wrote: Dear Sarah. Thanks so much. Really I am new to this software; I am working to learn it.

First, you should always send your replies to the list. That way the information can help others, and more people are available to provide advice. Second, is this homework? Third, see below.

I have this sample and I was checking it:

#Generating data====
# here we produce patterned data
#Do this and see what will happen:
1:10
# now put this series in a vector
a1 <- c(1:10)
#====
#Do this and see what will happen:
seq(1, 5, 0.5)
# now put this series in a vector
a2 <- c(seq(1, 5, 0.5))

Can you please tell me what the matter is with these commands?

Nothing is wrong with these commands. Both make vectors. The c() is unnecessary in both cases, but still works. If this is not homework, you still need to explain what you *expect* to have happen; otherwise nobody can help you. Sarah -- Sarah Goslee http://www.functionaldiversity.org
Re: [R] What's the baseline model when using coxph with factor variables?
On Dec 1, 2011, at 1:00 PM, William Dunlap wrote:

Terry will correct me if I'm wrong, but I don't think the answer to this question is specific to the coxph function.

It depends on our interpretation of the questioner's intent. My answer was predicated on the assumption that the phrase "baseline model" meant baseline survival function, S_0(t) in survival-analysis notation.

For all the [well-written] formula-based modelling functions (essentially, those that call model.frame and model.matrix to interpret the formula), the option "contrasts" controls how factor variables are parameterized in the model matrix. contr.treatment makes the baseline the first factor level, contr.SAS makes the baseline the last, contr.sum makes the baseline the mean, etc. E.g.,

df <- data.frame(time=sin(1:20)+2,
                 cens=rep(c(0,0,1), len=20),
                 var1=factor(rep(0:1, each=10)),
                 var2=factor(rep(0:1, 10)))

options(contrasts=c("contr.treatment", "contr.treatment"))
coxph(Surv(time, cens) ~ var1 + var2, data=df)
Call: coxph(formula = Surv(time, cens) ~ var1 + var2, data = df)
         coef exp(coef) se(coef)      z    p
var11  0.1640      1.18    0.822 0.1995 0.84
var21  0.0806      1.08    0.830 0.0971 0.92
Likelihood ratio test=0.05 on 2 df, p=0.974  n= 20, number of events= 6

options(contrasts=c("contr.SAS", "contr.SAS"))
coxph(Surv(time, cens) ~ var1 + var2, data=df)
Call: coxph(formula = Surv(time, cens) ~ var1 + var2, data = df)
          coef exp(coef) se(coef)       z    p
var10  -0.1640     0.849    0.822 -0.1995 0.84
var20  -0.0806     0.923    0.830 -0.0971 0.92
Likelihood ratio test=0.05 on 2 df, p=0.974  n= 20, number of events= 6

options(contrasts=c("contr.sum", "contr.sum"))
coxph(Surv(time, cens) ~ var1 + var2, data=df)
Call: coxph(formula = Surv(time, cens) ~ var1 + var2, data = df)
          coef exp(coef) se(coef)       z    p
var11  -0.0820     0.921    0.411 -0.1995 0.84
var21  -0.0403     0.960    0.415 -0.0971 0.92
Likelihood ratio test=0.05 on 2 df, p=0.974  n= 20, number of events= 6

(lm() has a contrasts argument that can override getOption("contrasts") and set different contrasts for each variable, but coxph() does not have that argument.)

Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com

-----Original Message----- From: David Winsemius, Sent: Thursday, December 01, 2011 9:36 AM, Subject: Re: [R] What's the baseline model when using coxph with factor variables?

On Dec 1, 2011, at 12:00 PM, Andreas Schlicker wrote: Hi all, I'm trying to fit a Cox regression model with two factor variables but have some problems with the interpretation of the results. Considering the following model, where var1 and var2 can assume values 0 and 1:

coxph(Surv(time, cens) ~ factor(var1) * factor(var2), data=temp)

What is the baseline model? Is that considering the whole population, or the case when both var1 and var2 = 0?

This has been discussed several times in the past on r-help. My suggestion would be to search your favorite r-help archive using: baseline hazard Therneau, since Terry Therneau is the author of survival. (The answer is closer to the first than to the second.) Kind regards, andi

David Winsemius, MD West Hartford, CT
Re: [R] vector
On Thu, Dec 1, 2011 at 3:11 PM, Majid golden_boy...@yahoo.com wrote: Hi, yes, it is homework.

Then ask your TA/instructor for help.

These are 2 commands. The first, for generating data, is (1:10), whose output is 1 2 3 ... 10, ok? The second is: a1 <- c(1:10). What is the output? I did not see anything.

Exactly as it should be. There's no problem here, except that you need to read help("<-"). Thanks, Majid. -- Sarah Goslee http://www.functionaldiversity.org
Re: [R] logistic regression - glm.fit: fitted probabilities numerically 0 or 1 occurred
Thank you for the feedback, but my data looks fine to me. Please tell me if I'm not understanding. I followed your instructions, and here is a sample of the first 500 values (info on 'd' is below that):

d <- as.data.frame(l_yx)
x <- with(d, y[order(x)])
x[1:500] # I have 1's and 0's dispersed throughout
  [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0
[101] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[the rows through [401] continue in the same pattern: mostly 0's with scattered 1's]

# I get the warning still
l_df <- as.data.frame(l_yx)
l_logit <- glm(y ~ x, data = l_df, family = binomial(link = "logit"))
Warning message:
glm.fit: fitted probabilities numerically 0 or 1 occurred

# some info on 'd' above: the response column is almost all 0's,
d[1:500, 1]
  [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[rows [101]-[401]: all 0's]
# and the predictor column holds small positive values:
d[1:500, 2]
  [1] 3.023160e-03 7.932130e-02 0.000000e+00 1.779657e-02 1.608374e-01 0.000000e+00 5.577064e-02 7.753926e-03 4.018553e-03 4.760918e-02 2.080511e-02 1.642404e-01 3.703720e-03 8.901981e-02 1.260415e-03
 [16] 2.202523e-02 3.750940e-02 4.441975e-04 9.351171e-03 8.374567e-03 0.000000e+00 8.440448e-02 5.081017e-01 2.538640e-05 1.806017e-02 2.954641e-04 1.434859e-03 6.964976e-04 0.000000e+00 1.202162e-02
[remaining rows omitted; values stay in the same range]
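Peter's separation diagnostic from earlier in the thread can be demonstrated on toy data where complete separation is built in. A minimal sketch (the data here are invented; they are not Ben's l_yx):

```r
# Complete separation: y is 0 below a cutoff in x and 1 above it.
x <- 1:10
y <- rep(0:1, each = 5)
sorted <- y[order(x)]
all(diff(sorted) >= 0)   # TRUE: all 0's precede all 1's -> separation
# The glm fit then produces the familiar warning, and the fitted
# probabilities pile up at (numerically) 0 and 1:
fit <- suppressWarnings(glm(y ~ x, family = binomial))
range(fitted(fit))       # essentially 0 and 1
```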
[R] calculate mean of multiple rows in a data frame
Dear all, I have a data frame (DF) in the following format:

                  NAME              ID   a   b   c   d
1            Control_1 probe~B01R01C01 381 213 345 653
2            Control_2 probe~B01R01C02 574 629 563 783
3            Control_1 probe~B01R09C01 673 511 521 967
4            Control_3 probe~B01R09C02  53 809 999  50
5  MM0289~RFU:11810.15 probe~B29R13C06 681  34 115 587
6   MM0289~RFU:9238.41 probe~B29R13C05 784 443  20 784
7 MM16597~RFU:36765.38 probe~B44R15C20 719 251 790 445
8 MM16597~RFU:41258.94 probe~B44R15C19 677 363 268 686

I would like to consolidate the data frame by parsing through the rows and, where the NAME is identical, consolidating into one row and returning the mean. I can do this for the first lines (Control_1 etc.) by using aggregate():

aggregate(DF[, -c(1:2)], by=list(DF$NAME), mean)

but since aggregate looks for unique lines, it won't consolidate e.g. lines 5/6 and 7/8. Is there a way of telling aggregate to use just the first part of the name (i.e. up to "~") and consolidate on that? I could pre-process the file before importing into R, but I'd like to do it within R if possible. Thanks for any suggestions.
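A minimal sketch of one way to do this within R: strip everything from "~" onward with sub(), then aggregate on the cleaned name. (Only the a and b columns and simplified IDs are reproduced here to keep the example short.)

```r
# Aggregate on the part of NAME before the first "~".
DF <- data.frame(
  NAME = c("Control_1", "Control_2", "Control_1", "Control_3",
           "MM0289~RFU:11810.15", "MM0289~RFU:9238.41",
           "MM16597~RFU:36765.38", "MM16597~RFU:41258.94"),
  ID = paste0("probe~", 1:8),
  a = c(381, 574, 673, 53, 681, 784, 719, 677),
  b = c(213, 629, 511, 809, 34, 443, 251, 363),
  stringsAsFactors = FALSE)
base.name <- sub("~.*$", "", DF$NAME)      # "MM0289~RFU:..." -> "MM0289"
out <- aggregate(DF[, c("a", "b")], by = list(NAME = base.name), mean)
out
```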
Re: [R] combining arima and ar function
I think your attachment got scrubbed so I can't verify this, but I think the difficulty is that the inner apply() returns a whole set of orders instead of just one, and that throws arima() off. Would this work?

getOrder <- function(x) ar(na.omit(x), method = "mle")$order

# Verify it matches your original call:
all.equal(apply(TSX, 2, getOrder),
          apply(TSX, 2, function(x) ar(na.omit(x), method = "mle")$order))

apply(TSX, 2, function(x) arima(x, order = c(getOrder(x), 0, 0))$residuals)

Michael

On Thu, Dec 1, 2011 at 4:57 AM, Samir Benzerfa benze...@gmx.ch wrote:

Hi everyone

I've got a problem regarding the arima() and ar() functions for autoregressive series. I would simply like to combine them. To better understand my question, I first show you how I'm using these two functions individually (see file in the attachment).

1) apply(TSX, 2, function(x) ar(na.omit(x), method = "mle")$order)
# this finds the optimal (autoregressive) order for each series (briefly speaking: for each series I get a specific number)

2) apply(TSX, 2, function(x) arima(x, order = c(1, 0, 0))$residuals)
# this fits an autoregressive model of order 1 to each series

What I'd like to do is combine them, such that the resulting orders (numbers) from the first function are used as the orders in the second function. So in the specific example attached, the results from the first function are 6, 5, 1, and these numbers should be used as the orders in function two. I tried simply putting the two functions together as follows:

apply(TSX, 2, function(x) arima(x, order = c(apply(TSX, 2, function(x) ar(na.omit(x), method = "mle")$order), 0, 0)))

However, I get the error message "'order' must be a non-negative numeric vector of length 3". Any hints?

I hope you can help me with this issue. Many thanks and best regards,

S.B.

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
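A self-contained sketch of the fix described above, run on simulated series (the poster's attachment is unavailable, so TSX here is made up with arima.sim()): the AR order is computed per column inside the same anonymous function, so each arima() call receives a single order, giving a c(p, 0, 0) vector of length 3.

```r
set.seed(1)
# Stand-in for the poster's TSX: two simulated AR series
TSX <- cbind(x1 = arima.sim(list(ar = 0.6), n = 200),
             x2 = arima.sim(list(ar = c(0.4, 0.3)), n = 200))

# Optimal AR order for a single series
getOrder <- function(x) ar(na.omit(x), method = "mle")$order

# One arima() fit per column, each with its own order; nesting a second
# apply() over all of TSX would instead pass a whole vector of orders,
# triggering "'order' must be a non-negative numeric vector of length 3"
res <- apply(TSX, 2, function(x)
  arima(x, order = c(getOrder(x), 0, 0))$residuals)
dim(res)
```

Because each column's residual vector has the same length as the input, apply() returns them as a matrix with one column per series.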
[R] Fw: calculate mean of multiple rows in a data frame
Sorry, that should look like this:

                  NAME              ID   a   b   c   d
1            Control_1 probe~B01R01C01 381 213 345 653
2            Control_2 probe~B01R01C02 574 629 563 783
3            Control_1 probe~B01R09C01 673 511 521 967
4            Control_3 probe~B01R09C02  53 809 999  50
5  MM0289~RFU:11810.15 probe~B29R13C06 681  34 115 587
6   MM0289~RFU:9238.41 probe~B29R13C05 784 443  20 784
7 MM16597~RFU:36765.38 probe~B44R15C20 719 251 790 445
8 MM16597~RFU:41258.94 probe~B44R15C19 677 363 268 686

                  NAME              ID   a   b   c   d
1            Control_1 probe~B01R01C01   3  22 926 774
2            Control_2 probe~B01R01C02 712  13  32 179
3            Control_1 probe~B01R09C01 937 824 898 668
4            Control_3 probe~B01R09C02 464 836 508  53
5  MM0289~RFU:11810.15 probe~B29R13C06  99 544 607 984
6   MM0289~RFU:9238.41 probe~B29R13C05 605 603 862 575
7 MM16597~RFU:36765.38 probe~B44R15C20 700 923 219 582
8 MM16597~RFU:41258.94 probe~B44R15C19 132 777 497 995

--- On Thu, 1/12/11, Jabez Wilson jabez...@yahoo.co.uk wrote:

From: Jabez Wilson jabez...@yahoo.co.uk
Subject: calculate mean of multiple rows in a data frame
To: R-Help r-h...@stat.math.ethz.ch
Date: Thursday, 1 December, 2011, 20:45

Dear all, I have a data frame (DF) in the format shown above. I would like to consolidate the data frame by parsing through the rows and, where NAME is identical, collapsing the rows into one and returning the mean. I can do this for the Control_* lines by using aggregate(): aggregate(DF[, -c(1:2)], by = list(DF$NAME), mean), but since aggregate() looks for unique names it won't consolidate e.g. lines 5/6 and 7/8. Is there a way of telling aggregate() to use just the first part of the name (i.e. up to the ~) and consolidate those? I could pre-process the file before importing it into R, but I'd like to do it within R if possible. Thanks for any suggestions

[[alternative HTML version deleted]]

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] logistic regression - glm.fit: fitted probabilities numerically 0 or 1 occurred
On Dec 1, 2011, at 21:32, Ben quant wrote:

> Thank you for the feedback, but my data looks fine to me. Please tell me if I'm not understanding.

Hmm, then maybe it really is a case of a transition region that is short relative to the range of your data. Notice that the warning is just that: a warning. I do notice that the distribution of your x values is rather extreme -- you stated a range of 0--14 and a mean of 0.01. And after all, an odds ratio of 3000 per unit is only a tad over a doubling per 0.1 units. Have a look at range(x[y==0]) and range(x[y==1]).

> I followed your instructions and here is a sample of the first 500 values (info on 'd' is below that):
>
> d <- as.data.frame(l_yx)
> x = with(d, y[order(x)])
> x[1:500]  # I have 1's and 0's dispersed throughout
> [1] 0 0 0 0 0 0 0 0 0 0 0 0 ... (500 values, mostly 0 with occasional 1s; full printout omitted)
>
> # I still get the warning
> l_df = as.data.frame(l_yx)
> l_logit = glm(y ~ x, data = l_df, family = binomial(link = "logit"))
> Warning message:
> glm.fit: fitted probabilities numerically 0 or 1 occurred
>
> # some info on 'd' above:
> d[1:500, 1]
> [1] 0 0 0 0 0 0 0 0 0 0 0 0 ... (all 0 in this sample; full printout omitted)
> d[1:500, 2]
> [1] 3.023160e-03 7.932130e-02 0.000000e+00 1.779657e-02 1.608374e-01 ... (small non-negative values; full printout omitted)
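A small sketch of the diagnostic suggested above, on simulated data shaped roughly like the poster's (mostly tiny x values for the 0 class, larger ones for the 1 class; all values here are invented): when the class-wise ranges of x barely overlap, glm() can drive fitted probabilities to numerical 0/1 and raise exactly this warning while still returning a usable fit.

```r
set.seed(42)
# Mostly-small x for the 0 class, larger x for the 1 class
x <- c(rexp(300, rate = 50), rexp(50, rate = 2))
y <- as.integer(x > 0.05)
y[sample(which(x <= 0.05), 5)] <- 1   # leave a little overlap

# The check suggested in the reply: compare the class-wise ranges
range(x[y == 0])
range(x[y == 1])

fit <- glm(y ~ x, family = binomial(link = "logit"))
# May warn: "fitted probabilities numerically 0 or 1 occurred"
range(fitted(fit))
```

With near-separation the slope estimate is large and some fitted values sit at machine 0 or 1, which is what triggers the warning; the coefficients themselves can still be examined as usual.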
[R] MCMCglmm error with multinomial distribution
With binomial/binary responses (0|1), running MCMCglmm with family="multinomial" terminates with:

Error in if (nJ > 1) { : missing value where TRUE/FALSE needed

With family="categorical" there are no errors. I have not looked in the code; do I need to format the responses as TRUE/FALSE, or is this just a bug?

-- Håvard Wahl Kongsgård

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.