[R] a publication quality table for summary.zeroinfl()
I will swallow my pride and post to this list for the second time in 24 hours--I have a paper due to a reviewer and I am desperate. Has anybody written code to move the output from summary() called on the results of a zeroinfl() model (from the pscl package) into a form suitable for publication? When I hit send on this message I am going to begin hand typing stars into a spreadsheet. The problem is that the zero-inflated model has two parts: a count and a zero portion--its coefficients are stored in separate arrays and there is a Log(theta) that needs to be thrown in there that is in a completely separate place in the structure of the summary function. As a result the functions that I have found for outputting summaries of other linear models all give error messages when attempted on this summary object. My ideal would be to gather the information onto the clipboard so I could paste it into Excel and do the formatting there, but any approach would be better than what I have now. Thanks, Chris __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Logistic and Linear Regression Libraries
Hi all, I'm trying to discover the options available to me for logistic and linear regression. I'm doing some tests on a dataset and want to see how different flavours of the algorithms cope. So far for logistic regression I've tried glm(MASS) and lrm (Design) and found there is a big difference. Is there a list anywhere detailing the options available which details the specific algorithms used? Thanks in advance, Phil -- View this message in context: http://old.nabble.com/Logistic-and-Linear-Regression-Libraries-tp26140248p26140248.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] Logistic and Linear Regression Libraries
glm is not, and never was, part of the MASS package. It's in the stats package. Have you sorted out why there is a big difference between the results you get using glm and lrm? Are you confident it is due to the algorithms used and not your ability to use the software? To be helpful, if disappointing, I think the answer to your question is no. You will need to seek out the algorithms from the published information on them individually. W. From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of tdm [ph...@philbrierley.com] Sent: 31 October 2009 16:53 To: r-help@r-project.org Subject: [R] Logistic and Linear Regression Libraries Hi all, I'm trying to discover the options available to me for logistic and linear regression. I'm doing some tests on a dataset and want to see how different flavours of the algorithms cope. So far for logistic regression I've tried glm(MASS) and lrm (Design) and found there is a big difference. Is there a list anywhere detailing the options available which details the specific algorithms used? Thanks in advance, Phil
Re: [R] Logistic and Linear Regression Libraries
Hi Phil, "So far for logistic regression I've tried glm(MASS) and lrm (Design) and found there is a big difference." Be sure that you mean what you say, that you are saying what you mean, and that you know what you mean when making such statements, especially on this list. glm is not in MASS, so perhaps you mean polr in package MASS. And no, there is no big difference. You are doing something wrong.

## Edited output from polr and lrm
house.lrm <- lrm(Sat ~ Infl + Type + Cont, weights = Freq, data = housing)
house.lrm

Logistic Regression Model
lrm(formula = Sat ~ Infl + Type + Cont, data = housing, weights = Freq)
<snip>
                  Coef    S.E.  Wald Z      P
y>=Medium       0.4961 0.12485    3.97 0.0001
y>=High        -0.6907 0.12547   -5.50 0.0000
Infl=Medium     0.5664 0.10465    5.41 0.0000
Infl=High       1.2888 0.12716   10.14 0.0000
Type=Apartment -0.5724 0.11924   -4.80 0.0000
Type=Atrium    -0.3662 0.15517   -2.36 0.0183
Type=Terrace   -1.0910 0.15149   -7.20 0.0000
Cont=High       0.3603 0.09554    3.77 0.0002

house.plr <- polr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing)
summary(house.plr)

Re-fitting to get Hessian
Call: polr(formula = Sat ~ Infl + Type + Cont, data = housing, weights = Freq)
Coefficients:
                   Value Std. Error   t value
InflMedium     0.5663937 0.10465285  5.412120
InflHigh       1.2888191 0.12715641 10.135699
TypeApartment -0.5723501 0.11923824 -4.800055
TypeAtrium    -0.3661866 0.15517362 -2.359851
TypeTerrace   -1.0910149 0.15148617 -7.202076
ContHigh       0.3602841 0.09553587  3.771192
<snip>

Regards, Mark. tdm wrote: Hi all, I'm trying to discover the options available to me for logistic and linear regression. I'm doing some tests on a dataset and want to see how different flavours of the algorithms cope. So far for logistic regression I've tried glm(MASS) and lrm (Design) and found there is a big difference. Is there a list anywhere detailing the options available which details the specific algorithms used?
Thanks in advance, Phil -- View this message in context: http://old.nabble.com/Logistic-and-Linear-Regression-Libraries-tp26140248p26140375.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] a publication quality table for summary.zeroinfl()
Hi Chris, "My ideal would be to gather the information onto the clipboard so I could paste it into Excel and do the formatting there, but any approach would be better than what I have now." I would never use Excel for this since there are far superior tools available. But it is very easy to assemble the two parts into a single table for further manipulation. Not sure about the clipboard route, but the following rather crude approach saves the object (a list) to a *.csv file in your working directory.

## Simple, crude approach, with theta replicated as last column in the saved *.csv
library(pscl)
data(bioChemists, package = "pscl")
fm_zinb <- zeroinfl(art ~ . | 1, data = bioChemists, dist = "negbin")
Temptab <- list(coef = rbind(summary(fm_zinb)$coefficients[[1]],
                             summary(fm_zinb)$coefficients[[2]]),
                theta = fm_zinb$theta)
getwd()
write.csv(Temptab, file = "Temptab.csv")
read.csv("Temptab.csv")

## If you have a LaTeX distribution
library(Hmisc)
latex(list(coef = rbind(summary(fm_zinb)$coefficients[[1]],
                        summary(fm_zinb)$coefficients[[2]]),
           theta = fm_zinb$theta),
      cdec = c(3, 3, 3, 4, 3))

Regards, Mark. Chris Fowler wrote: I will swallow my pride and post to this list for the second time in 24 hours--I have a paper due to a reviewer and I am desperate. Has anybody written code to move the output from summary() called on the results of a zeroinfl() model (from the pscl package) into a form suitable for publication? When I hit send on this message I am going to begin hand typing stars into a spreadsheet. The problem is that the zero-inflated model has two parts: a count and a zero portion--its coefficients are stored in separate arrays and there is a Log(theta) that needs to be thrown in there that is in a completely separate place in the structure of the summary function. As a result the functions that I have found for outputting summaries of other linear models all give error messages when attempted on this summary object.
My ideal would be to gather the information onto the clipboard so I could paste it into Excel and do the formatting there, but any approach would be better than what I have now. Thanks, Chris -- View this message in context: http://old.nabble.com/a-publication-quality-table-for-summary.zeroinfl%28%29-tp26140199p26140542.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] Back-to-back graph with lattice package
Hi, Thank you for your reply. I have looked into the epicalc package already and it will not help me this time. I realize that I wasn't clear enough. In my example I want to relate all my data to Country, i.e. the distributions of Hair, Eye color, Sex (they are dichotomous in my example) and Age should all be related to each of the five countries. pyramid in epicalc can't be used for relating e.g. Hair with Country, and I can't get the labels for countries. Furthermore, I want all the data in the nice panels of lattice as one single plot, which will occupy rather little space. So, yes, I'm looking for a work-around. When I make barcharts or dotplots from my categorical data, I first need to make a table. One work-around could be to multiply the values for male in the Sex column of the table by -1 to get negative values; then the bars or dots for male would go to the left and the female bars go to the right. Unfortunately, I don't know how to do these calculations, or whether the plot will look good. Erik 2009/10/30 David Winsemius dwinsem...@comcast.net: On Oct 30, 2009, at 9:11 AM, Erik Svensson wrote: Hi, I am an [R] and lattice beginner. I am doing an epidemiological study and want to present my data in as condensed and clear a manner as possible. The example below is fake, but representative of the data I want to show. I have a data frame with different categorical data and one column of numerical data: Country, Hair, Eyes, Sex, Age sweden, blond, blue, male, 25 sweden, dark, brown, female, 30 denmark, dark, blue, male, 22 denmark, blond, blue, female, 33 norway, dark, blue, female, 21 norway, blond, blue, male, 31 finland, dark, blond, male, 43 island, blond, blue, female, 40 ...and so on for 300 rows...
Country <- as.factor(rep(c("Sweden", "Denmark", "Norway", "Finland", "Iceland", "Sweden", "Denmark", "Norway", "Finland", "Sweden"), 50))
Hair <- as.factor(rep(c("blond", "dark", "blond", "dark", "blond"), 100))
Eyes <- as.factor(rep(c("blue", "blue", "blue", "brown"), 125))
Sex <- as.factor(rep(c("male", "female", "female", "male"), 125))
Age <- rep(c(1:20, 1:100, 76:80), 4)

Country has 5 levels, Age has many. Hair, Eyes, and Sex have two levels each. I want to do one lattice graph with 5 columns of panels with these descriptive data. I would prefer to use the lattice barchart or the corresponding dotplot with lines (except for the Age panel). The Age panel and probably the Country panel I think I can solve myself:

barchart(Country) # This only gives one panel, though...
densityplot(~ Age | Country, layout = c(1, 5), xlim = c(0, 100), scales = list(y = list(relation = "free")))

But, for the dichotomous variables Hair, Eyes, and Sex I would like to have a bihistogram aka back-to-back aka age-sex-pyramid solution. I want all bars to be horizontal, the names of the countries to the left on the y-axis. I would also like to sort the country names according to a list of their names. For the back-to-back graph I have only managed to get two panels per column instead of one:

barchart(table(Country, Hair), groups = FALSE)

My suggestion would be to look for a worked solution.

library(sos)
???pyramid

The two I found upon taking my own advice are pyramid in epicalc and pyramid.plot in plotrix. Experimenting shows that the epicalc pyramid solution is more flexible and requires less pre-processing of the data.

library(epicalc)
data(Oswego)
use(Oswego)
pyramid(age, sex)
# ... and on your data:
pyramid(Age, Hair)
# ... seems to deliver the requested plot.

How can I do all this? Is it possible at all?
library(fortunes)
fortune("Only how")

Erik Svensson -- David Winsemius, MD Heritage Laboratories West Hartford, CT
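The negate-one-level workaround Erik describes can be sketched directly in lattice. A minimal sketch, assuming made-up data standing in for the thread's example; `barchart()` on a two-way table puts the first dimension on the y-axis, and negating one level of the second dimension makes its bars extend left:

```r
library(lattice)

# hypothetical data mirroring the thread's example
set.seed(1)
Country <- rep(c("Sweden", "Denmark", "Norway", "Finland", "Iceland"), each = 60)
Sex <- sample(c("male", "female"), 300, replace = TRUE)

tab <- table(Country, Sex)
tab[, "male"] <- -tab[, "male"]   # negate one level so its bars go left

# horizontal bars, countries on the y-axis, males left / females right
barchart(tab, stack = TRUE, xlab = "count", auto.key = list(columns = 2))
```

The x-axis then shows negative counts on the left half; relabelling it with absolute values is a cosmetic fix left to the reader.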
Re: [R] a publication quality table for summary.zeroinfl()
Chris, If you want to go the clipboard route, use write.table():

d1 <- coef(summary(fm_zip2))$count
d2 <- coef(summary(fm_zip2))$zero
d <- rbind(d1, rep(NA, 4), d2)
write.table(d, file = "clipboard", col.names = NA, sep = "\t")

Now paste into Excel. -Peter Ehlers Mark Difford wrote: Hi Chris, "My ideal would be to gather the information onto the clipboard so I could paste it into Excel and do the formatting there, but any approach would be better than what I have now." I would never use Excel for this since there are far superior tools available. But it is very easy to assemble the two parts into a single table for further manipulation. Not sure about the clipboard route, but the following rather crude approach saves the object (a list) to a *.csv file in your working directory.

## Simple, crude approach, with theta replicated as last column in the saved *.csv
library(pscl)
data(bioChemists, package = "pscl")
fm_zinb <- zeroinfl(art ~ . | 1, data = bioChemists, dist = "negbin")
Temptab <- list(coef = rbind(summary(fm_zinb)$coefficients[[1]],
                             summary(fm_zinb)$coefficients[[2]]),
                theta = fm_zinb$theta)
getwd()
write.csv(Temptab, file = "Temptab.csv")
read.csv("Temptab.csv")

## If you have a LaTeX distribution
library(Hmisc)
latex(list(coef = rbind(summary(fm_zinb)$coefficients[[1]],
                        summary(fm_zinb)$coefficients[[2]]),
           theta = fm_zinb$theta),
      cdec = c(3, 3, 3, 4, 3))

Regards, Mark. Chris Fowler wrote: I will swallow my pride and post to this list for the second time in 24 hours--I have a paper due to a reviewer and I am desperate. Has anybody written code to move the output from summary() called on the results of a zeroinfl() model (from the pscl package) into a form suitable for publication? When I hit send on this message I am going to begin hand typing stars into a spreadsheet.
The problem is that the zero-inflated model has two parts: a count and a zero portion--its coefficients are stored in separate arrays and there is a Log(theta) that needs to be thrown in there that is in a completely separate place in the structure of the summary function. As a result the functions that I have found for outputting summaries of other linear models all give error messages when attempted on this summary object. My ideal would be to gather the information onto the clipboard so I could paste it into Excel and do the formatting there, but any approach would be better than what I have now. Thanks, Chris
Re: [R] Logistic and Linear Regression Libraries
Hi Bill, Thanks for your comments. You may be right in that my ability to use the software may be the problem. I was using lm to fit a model with 'target' values of 0 or 1. I then discovered there was a lrm model as well, so just replaced lm with lrm and expected it to be fine. Then I found that the lrm model was predicting values greater than 1 for logistic regression - what am I doing wrong? Initially I was just looking at AUC, which is similar for both models, although different enough for me to be concerned. My data is highly correlated so I want to use a Hessian-based algorithm rather than a non-Hessian-based algorithm, which is why I asked the initial question as to what other logistic regression models existed in R. Anyway, do you know why the lrm predict gives me a value of 3.38?

model_lr <- lm(as.formula(paste(mytarget, "~ .")), data = df_train)
model_lrA <- lrm(as.formula(paste(mytarget, "~ .")), data = df_train)
scores_lr_test <- predict(model_lr, df_test)
scores_lr_train <- predict(model_lr, df_train)
scores_lrA_test <- predict(model_lrA, df_test)
scores_lrA_train <- predict(model_lrA, df_train)
print("scores")
print(scores_lr_train[1])
print("scoresA")
print(scores_lrA_train[1])
print(colAUC(scores_lr_train, trainY))
print(colAUC(scores_lrA_train, trainY))
print(colAUC(scores_lr_test, testY))
print(colAUC(scores_lrA_test, testY))

[1] "scores"
        1
0.9887154
[1] "scoresA"
       1
3.389009
             [,1]
0 vs. 1 0.9448262
             [,1]
0 vs. 1 0.9487878
             [,1]
0 vs. 1 0.9346953
             [,1]
0 vs. 1 0.9357858

Bill.Venables wrote: glm is not, and never was, part of the MASS package. It's in the stats package. Have you sorted out why there is a big difference between the results you get using glm and lrm? Are you confident it is due to the algorithms used and not your ability to use the software? To be helpful, if disappointing, I think the answer to your question is no. You will need to seek out the algorithms from the published information on them individually. W.
From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of tdm [ph...@philbrierley.com] Sent: 31 October 2009 16:53 To: r-help@r-project.org Subject: [R] Logistic and Linear Regression Libraries Hi all, I'm trying to discover the options available to me for logistic and linear regression. I'm doing some tests on a dataset and want to see how different flavours of the algorithms cope. So far for logistic regression I've tried glm(MASS) and lrm (Design) and found there is a big difference. Is there a list anywhere detailing the options available which details the specific algorithms used? Thanks in advance, Phil -- View this message in context: http://old.nabble.com/Logistic-and-Linear-Regression-Libraries-tp26140248p26141353.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] MatLab SimBiology
Hello, Please look here: http://crantastic.org/search?q=biology Regards, Carlos. On Fri, Oct 30, 2009 at 6:12 PM, mau...@alice.it wrote: Is there any R package that implements the same capability of the MatLab toolbox called SimBiology? We are especially interested in protein-protein interactions and network analysis. As far as I know SimBiology implements a system of ODEs reflecting the kinetic chemical reactions. We would be more interested in stochastic simulations. Thank you in advance. Maura
Re: [R] Multiple linear regression with constraint (imposition)
Thank you Ravi Ravi Varadhan wrote: Simply do this: lm(y ~ I(X1 - X2)) Ravi. Ravi Varadhan, Ph.D. Assistant Professor, Division of Geriatric Medicine and Gerontology School of Medicine Johns Hopkins University Ph. (410) 502-2619 email: rvarad...@jhmi.edu - Original Message - From: CE.KA ce.kay...@yahoo.fr Date: Friday, October 30, 2009 7:48 pm Subject: Re: [R] Multiple linear regression with constraint (imposition) To: r-help@r-project.org Sorry there was a mistake: Hi R users I want to do a multiple linear regression with R With a normal model (Y=Cste+A1*X1+A2*X2) the program would be lm(Y~X1+X2) My model is Y=Cste+A1*X1+A2*X2 with the constraint A1=-A2 What is the program for such a model? Best regards -- View this message in context: http://old.nabble.com/Multiple-linear-regression-with-constraint-%28imposition%29-tp26138081p26141448.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] Multiple linear regression with constraint (imposition)
Thank you David David Winsemius wrote: On Oct 30, 2009, at 7:47 PM, CE.KA wrote: Sorry there was a mistake: I could not see what was different. Hi R users I want to do a multiple linear regression with R With a normal model (Y=Cste+A1*X1+A2*X2) the program would be lm(Y~X1+X2) My model is Y=Cste+A1*X1+A2*X2 with the constraint A1=-A2 What is the program for such a model? Try calculating X3 <- X2 - X1 and estimate the shared parameter with: lm(Y ~ X3) or instead: lm(Y ~ I(X2 - X1)) -- David Winsemius, MD Heritage Laboratories West Hartford, CT -- View this message in context: http://old.nabble.com/Multiple-linear-regression-with-constraint-%28imposition%29-tp26138081p26141453.html Sent from the R help mailing list archive at Nabble.com.
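The substitution both replies suggest is easy to verify on simulated data. A minimal sketch (variable names and data are made up): fitting lm on the single regressor X1 - X2 recovers the common slope A1, with A2 = -A1 imposed by construction:

```r
set.seed(42)
X1 <- rnorm(100)
X2 <- rnorm(100)
# true model: Y = 2 + 1.5*X1 - 1.5*X2 + noise, i.e. A1 = -A2 = 1.5
Y <- 2 + 1.5 * X1 - 1.5 * X2 + rnorm(100)

# constrained fit: one parameter serves as A1, and A2 is its negative
fit <- lm(Y ~ I(X1 - X2))
coef(fit)  # intercept near 2, common slope near 1.5
```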
Re: [R] Multicore package: sharing/modifying variable accross processes
Renaud, Package bigmemory can help you with shared-memory matrices, either in RAM or filebacked. Mutex support currently exists as part of the package, although for various reasons it will soon be abstracted from the package and provided via a new package, synchronicity. bigmemory works beautifully with multicore. Feel free to email us with questions, and we appreciate feedback. Jay Original message: Hi, I want to parallelize some computations when it's possible on multicore machines. Each computation produces a big object that I don't want to store if not necessary: in the end only the object that best fits my data has to be returned. In non-parallel mode, a single global object is updated if the current computation gets a better result than the best previously found. My plan was to use package multicore. But there is obviously an issue of concurrent access to the global result variable. Is there a way to implement something like a lock/mutex to make the procedure thread-safe? Maybe something already exists to deal with such things? It looks like package multicore runs the different processes in different environments with copy-on-change of everything when forking. Has anybody experimented with a shared environment with package multicore? -- John W. Emerson (Jay) Associate Professor of Statistics Department of Statistics Yale University http://www.stat.yale.edu/~jay
Re: [R] Logistic and Linear Regression Libraries
OK, I think I've figured it out: the predict of lrm didn't seem to pass it through the logistic function. If I do this then the value is similar to that of lm. Is this by design? Why would it be so? 1 / (1 + exp(-1 * 3.38)) = 0.967 tdm wrote: Anyway, do you know why the lrm predict gives me a value of 3.38? -- View this message in context: http://old.nabble.com/Logistic-and-Linear-Regression-Libraries-tp26140248p26141558.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] Logistic and Linear Regression Libraries
tdm wrote: OK, I think I've figured it out, the predict of lrm didn't seem to pass it through the logistic function. If I do this then the value is similar to that of lm. Is this by design? Why would it be so? Please take some time to read the help files on these functions so that you at least understand that they fit different models. ?Design:::lrm ?Design:::predict.lrm ?Design:::datadist ?lm This really is just a plainer way of saying what I said before. HTH, Mark. tdm wrote: OK, I think I've figured it out, the predict of lrm didn't seem to pass it through the logistic function. If I do this then the value is similar to that of lm. Is this by design? Why would it be so? 1 / (1 + exp(-1 * 3.38)) = 0.967 tdm wrote: Anyway, do you know why the lrm predict gives me a value of 3.38? -- View this message in context: http://old.nabble.com/Logistic-and-Linear-Regression-Libraries-tp26140248p26141711.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] Logistic and Linear Regression Libraries
On Oct 31, 2009, at 7:29 AM, tdm wrote: OK, I think I've figured it out, the predict of lrm didn't seem to pass it through the logistic function. If I do this then the value is similar to that of lm. Is this by design? Yes, at least for certain meanings of "this". When working with probabilities or proportions as the dependent variable, the estimates will be similar in the central regions of the data but diverge at the extremes, although it is linear regression that blows up when estimating probabilities. Logistic regression is by design constrained to predict a true probability even outside the range of the data, while the prediction for an ordinary least squares model has no such constraint. Why would it be so? 1 / (1 + exp(-1 * 3.38)) = 0.967 I am guessing that you have not read the help page for predict.lrm. In it Harrell clearly indicates that the default output from that function is of type "lp", or the linear predictor. Since you were apparently unaware that logistic regression and simple linear regression are different approaches to modeling, then you are probably also unaware that you need to use the inverse of the logistic transformation on the linear predictor, which is expressed on the log-odds scale, to get back a probability estimate. tdm wrote: Anyway, do you know why the lrm predict gives me a value of 3.38? == David Winsemius, MD Heritage Laboratories West Hartford, CT
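The back-transformation David describes is available directly as plogis(), the inverse logit in base R (model and data names below are taken from the thread and are not reproducible here):

```r
lp <- 3.38      # linear predictor, as returned by predict() on an lrm fit
plogis(lp)      # about 0.967, the same as 1 / (1 + exp(-lp))

# Design/rms can also return probabilities directly from the fit:
# predict(model_lrA, df_train, type = "fitted")
```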
Re: [R] MatLab SimBiology
I would have used the search strategy "stochastic". Despite the risk that such a strategy could bring up anything to do with statistics, it in fact brings me to Stefano Iacus' package sde (and refers to the accompanying book, Simulation and Inference for Stochastic Differential Equations: With R Examples). Further links on that page were to: http://bm2.genes.nig.ac.jp/RGM2/pkg.php?p=sde ... which gives a better picture of the capabilities of that package. Also a package of possible interest to this rather vague questioner might be: http://crantastic.org/packages/PSM And even though it is pitched to population biologists, I thought that a package that included a "decaying-dimerization reaction set [and] linear chain system": http://crantastic.org/packages/GillespieSSA ... might have examples of interest to biological chemists. -- David. On Oct 31, 2009, at 7:05 AM, Carlos Ortega wrote: Hello, Please look here: http://crantastic.org/search?q=biology Regards, Carlos. On Fri, Oct 30, 2009 at 6:12 PM, mau...@alice.it wrote: Is there any R package that implements the same capability of the MatLab toolbox called SimBiology? We are especially interested in protein-protein interactions and network analysis. As far as I know SimBiology implements a system of ODEs reflecting the kinetic chemical reactions. We would be more interested in stochastic simulations. Thank you in advance. Maura
David Winsemius, MD Heritage Laboratories West Hartford, CT
[R] Luis Miguel Delgado Gomez/BBK is out of the office.
I will be out of the office from 30/10/2009 and will not return until 12/11/2009. I will respond to your message when I return.
Re: [R] polar.plot
--- On Fri, 10/30/09, Tony Greig tony.gr...@gmail.com wrote: From: Tony Greig tony.gr...@gmail.com Two questions: 1 - Say I have average speed and directions for tide and I would like to plot them on a polar plot, but with different colors so I can indicate the two directions. I'm using polar.plot from the plotrix library. How can I add a second b and dir.b series to a polar.plot? You may be able to set par(new=TRUE), but it looks to me like polar.plot sets the radius for each graph, so you would need to come up with a way of dealing with this.

library(plotrix)
a <- 3
dir.a <- 85
b <- 4
dir.b <- 250
polar.plot(a, dir.a, start = 90, clockwise = T, show.grid.labels = T, radial.lim = c(0, 5), line.col = 2, lwd = 2)

2 - Which parameter in polar.plot can I use to set the orientation for the grid labels, which seem to default at 90 in my example above? start?

polar.plot(a, dir.a, start = 90, clockwise = T, show.grid.labels = T, radial.lim = c(0, 5), line.col = 2, lwd = 2)
op <- par(new = TRUE)
polar.plot(b, dir.b, start = 90, clockwise = T, show.grid.labels = T, radial.lim = c(0, 5), line.col = 1, lwd = 2)
par(op)
Re: [R] how to loop thru a matrix or data frame , and append calculations to a new data frame?
Do you mean to apply the same calculation to each element? ?apply

mydata <- data.frame(aa = 1:5, bb = 11:15)
mp5 <- function(x) x*5
mp5data <- apply(mydata, 2, mp5)
mp5data

This is functionally equivalent to a double for loop:

mydata <- data.frame(aa = 1:5, bb = 11:15)
newdata <- data.frame(matrix(rep(NA, 10), nrow = 5))
for (i in 1:length(mydata[1,])) {
  for (j in 1:5) {
    newdata[j,i] <- mydata[j,i]*5
  }
}
newdata

--- On Fri, 10/30/09, Robert Wilkins robst...@gmail.com wrote: Subject: [R] how to loop thru a matrix or data frame, and append calculations to a new data frame? To: r-help@r-project.org Received: Friday, October 30, 2009, 11:46 AM How do you do a double loop through a matrix or data frame, and within each iteration do a calculation and append it to a new, second data frame? (So, if your original matrix or data frame is 4 x 5, then 20 calculations are done, and the new data frame, which had 0 rows to start with, now has 20 rows.)
[R] Help me improving my code
Hi, I am new to R. My problem is with the ordered logistic model. Here is my question:

Generate an ordered discrete variable from the variable wrwage1 (= wages in the first full calendar quarter after benefit application) in the following way:

wage1Ordered = 1 if 0 <= wrwage1 < 1000
             = 2 if 1000 <= wrwage1 < 2000
             = 3 if 2000 <= wrwage1 < 3000
             = 4 if 3000 <= wrwage1 < 4000
             = 5 if wrwage1 >= 4000

Estimate an ordered logit model using the newly created discrete variable as the dependent variable, with the following control variables:

black = 1 if black
hispanic = 1 if hispanic
otherrace = 1 if black = 0 and hispanic = 0
female = 1 if female
wba = claimant's UI weekly benefit amount ($/week)
age = age of claimant when claim is filed (years)
bp1000 = base period wages (thousand $)
inuidur1 = duration of unemployment (weeks)

I have written the program as:

pennw <- read.table("G:/Econometrics II/pennsylvania.txt", header = TRUE)
names(pennw)
#  [1] "group"     "t0"        "t1"        "t2"        "t3"        "t4"
#  [7] "intercept" "black"     "hispanic"  "female"    "wba"       "bonus"
# [13] "age"       "recallpd"  "bp1000"    "numpay"    "wrwage0"
# [19] "wrwage1"   "wrwage2"   "wrwage3"   "wrwage4"   "inuidur1"  "regui"

wage <- pennw$wrwage1
wage[wage >= 0 & wage < 1000] <- 1
wage[wage >= 1000 & wage < 2000] <- 2
wage[wage >= 2000 & wage < 3000] <- 3
wage[wage >= 3000 & wage < 4000] <- 4
wage[wage >= 4000] <- 5
intercept <- pennw$intercept
blac <- pennw$black
hisp <- pennw$hispanic
fem <- pennw$female
wba <- pennw$wba
age <- pennw$age
bpw <- pennw$bp1000
dur <- pennw$inuidur1
fw <- factor(wage)
Ik <- model.matrix(~ fw - 1)
I1 <- Ik[,1]
I2 <- Ik[,2]
I3 <- Ik[,3]
I4 <- Ik[,4]
I5 <- Ik[,5]
yelmon <- function(theta, x) {
  mu1 <- theta[1]
  mu2 <- theta[2]
  mu3 <- theta[3]
  mu4 <- theta[4]
  mu5 <- theta[5]
  beta <- theta[6]
  logL <- sum(Ik*log(exp(mu2-(x*beta))/(1+exp(mu2-(x*beta))) - exp(mu1-(x*beta))/(1+exp(mu1-(x*beta)))) +
              Ik*log(exp(mu3-(x*beta))/(1+exp(mu3-(x*beta))) - exp(mu2-(x*beta))/(1+exp(mu2-(x*beta)))) +
              Ik*log(exp(mu4-(x*beta))/(1+exp(mu4-(x*beta))) - exp(mu3-(x*beta))/(1+exp(mu3-(x*beta)))) +
              Ik*log(exp(mu5-(x*beta))/(1+exp(mu5-(x*beta))) - exp(mu4-(x*beta))/(1+exp(mu4-(x*beta)))) +
              Ik*log(1 - exp(mu5-(x*beta))/(1+exp(mu5-(x*beta)))))
  return(-logL)
}
optim(yelmon, c(0,1), x = c(intercept, blac, hisp, fem, wba, age, bpw, dur))

I should be getting the values for all betas and mus here. The program should be done by maximum likelihood, not by calling ready-made commands. Rubel
Re: [R] Johnson-Neyman procedure (ANCOVA)
On 10/30/2009 9:20 PM, Stropharia wrote: Dear R users, Does anyone know of a package that implements the Johnson-Neyman procedure - for testing differences among groups when the regression slopes are heterogeneous (in an ANCOVA model)? I did not get any matches for functions when searching using R Site Search. If it has not been implemented in R, does anyone know of another statistical package that has this procedure? I found very old scripts available for Mathematica, but have had difficulty getting them to work. http://www.comm.ohio-state.edu/ahayes/SPSS%20programs/modprobe.htm Thanks in advance. best, Steve -- Chuck Cleland, Ph.D. NDRI, Inc. (www.ndri.org) 71 West 23rd Street, 8th floor New York, NY 10010 tel: (212) 845-4495 (Tu, Th) tel: (732) 512-0171 (M, W, F) fax: (917) 438-0894
Re: [R] Help me improving my code
This is rather obviously homework, and you have not read the Posting Guide, and you have not addressed the question of academic integrity policies that are probably in force at your university. -- David

On Oct 31, 2009, at 1:30 PM, md. jakir hussain talukdar wrote: [original message quoted in full]

David Winsemius, MD Heritage Laboratories West Hartford, CT
[R] get names of list elements
Hi, Consider

a <- 1:5; b <- 6:10
x <- list(a = a, b = b)
y <- list(a, b)

I can get c("a","b") from x using names(x). Is it also possible to get "a" and "b" from y? (The command names(y) gives NULL.) Thanks in advance. Chirok
Re: [R] how to loop thru a matrix or data frame , and append calculations to a new data frame?
On Oct 31, 2009, at 12:33 PM, John Kane wrote: Do you mean to apply the same calculation to each element? ?apply

mydata <- data.frame(aa = 1:5, bb = 11:15)
mp5 <- function(x) x*5
mp5data <- apply(mydata, 2, mp5)
mp5data

It would have been much more compact (and in the spirit of functional programming) to just do:

mp5data <- mydata*5  # binary operations applied to data frames give sensible results when the data types allow
mp5data
     aa bb
[1,]  5 55
[2,] 10 60
[3,] 15 65
[4,] 20 70
[5,] 25 75

Many times the loops are implicit in the vectorized design of R. And that solution would not result in the structure requested, for which some further straightening would be needed:

mp5data <- as.vector(as.matrix(mydata)*5)
mp5data
 [1]  5 10 15 20 25 55 60 65 70 75

-- David

This is functionally equivalent to a double for loop.

mydata <- data.frame(aa = 1:5, bb = 11:15)
newdata <- data.frame(matrix(rep(NA, 10), nrow = 5))
for (i in 1:length(mydata[1,])) {
  for (j in 1:5) {
    newdata[j,i] <- mydata[j,i]*5
  }
}
newdata

--- On Fri, 10/30/09, Robert Wilkins robst...@gmail.com wrote: Subject: [R] how to loop thru a matrix or data frame, and append calculations to a new data frame? To: r-help@r-project.org Received: Friday, October 30, 2009, 11:46 AM How do you do a double loop through a matrix or data frame, and within each iteration do a calculation and append it to a new, second data frame? (So, if your original matrix or data frame is 4 x 5, then 20 calculations are done, and the new data frame, which had 0 rows to start with, now has 20 rows.)
David Winsemius, MD Heritage Laboratories West Hartford, CT
Re: [R] get names of list elements
I don't think so. You can only get the name of something that has a name, and in this case it does not. -Ista

2009/10/31 Chirok Han chirok@gmail.com: [original question quoted in full]

-- Ista Zahn Graduate student University of Rochester Department of Clinical and Social Psychology http://yourpsyche.org
Re: [R] get names of list elements
list() does not assign names, but data.frame(a, b) does, and can be used at least in the case shown, where a and b have the same length. And, of course, a data frame is a list.

a <- 1:5; b <- 6:10
names(data.frame(a, b))
[1] "a" "b"

2009/10/31 Chirok Han chirok@gmail.com: [original question quoted in full]
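A third route, sketched below, is to capture the names at construction time, since an unnamed list like y keeps no trace of where its elements came from. `named_list` is a hypothetical helper (not from the thread) that uses substitute() to record the argument expressions as names:

```r
a <- 1:5; b <- 6:10

## setNames() attaches names explicitly (works for unequal lengths too)
y1 <- setNames(list(a, b), c("a", "b"))
names(y1)  # "a" "b"

## hypothetical helper: capture the caller's expressions as names
named_list <- function(...) {
  nms <- sapply(as.list(substitute(list(...)))[-1], deparse)
  setNames(list(...), nms)
}
y2 <- named_list(a, b)
names(y2)  # "a" "b"
```

The helper only recovers names when the arguments are passed as bare symbols; after an unnamed list has been built, the names are gone for good, as Ista says.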
[R] na.omit leaves cases with NA's intact
Strange, this normally works, but in a recent run I have a data set in xts format that has a lot of NA's in a few of the variables in the leading and trailing positions, due to some lagging calculations. Before running an analysis, I use

newdata <- na.omit(orginaldata)

and normally dim(newdata) shows the fewer rows. Now, for some reason, I do this operation and see that hundreds of rows SHOULD be removed (I can plainly see the NAs in there), and even test is.na(orginaldata$variable) and get a clear TRUE, but the case still remains after the na.omit operation. Yes, I'm spelling it right. I'm doing this with many sets of data, and it works great except for this one data set. Any idea if there are limits on when this function works, or more importantly, if there is a manual way to do it as a workaround?
Re: [R] na.omit leaves cases with NA's intact
What class is 'orginaldata' (and its columns, if a data frame)? Note that package xts defines methods for na.omit:

library(xts)
...
methods(na.omit)
[1] na.omit.data.frame* na.omit.default*    na.omit.ts*
[4] na.omit.xts*

so this is possibly something peculiar to package xts. See the footer of this message: a reproducible example would help enormously in finding the root cause here.

On Sat, 31 Oct 2009, David L. Van Brunt, Ph.D. wrote: [original message quoted in full]

-- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
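For the "manual way" the original poster asked about, complete.cases() is a common workaround; a sketch on a plain data frame (an xts object may first need coercion, e.g. with as.data.frame(), or row subsetting on its own index):

```r
## toy stand-in for the poster's data
orginaldata <- data.frame(x = c(1, NA, 3, 4), y = c(2, 5, NA, 8))

keep <- complete.cases(orginaldata)       # TRUE only where no column is NA
newdata <- orginaldata[keep, , drop = FALSE]
dim(newdata)                              # 2 rows remain of the original 4
```

Logical subsetting with complete.cases() sidesteps whatever the na.omit method for the object's class is doing.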
[R] avoiding loop
Hi all, I am trying to figure out a way to improve my code's efficiency by avoiding the use of a loop. I want to calculate a conditional mean(?) given time. For example, from the data below, I want to calculate sum((value|choice==1)/sum(value)) across time. Is there a way to do it without using a loop?

time cum_time choice value
   1        4      1     3
   1        4      0     2
   1        4      0     3
   1        4      0     3
   2        6      1     4
   2        6      0     4
   2        6      0     2
   2        6      0     4
   2        6      0     2
   2        6      0     2
   3        4      1     2
   3        4      0     3
   3        4      0     5
   3        4      0     2

My code looks like:

objective[1] = value[1] / sum(value[1:cum_time[1]])
for (i in 2:max(time)) {
  objective[i] = value[cum_time[i-1]+1] / sum(value[(cum_time[i-1]+1):cum_time[i]])
}
sum(objective)

Anyone have an idea how I can do this without using a loop?? Thanks.
Re: [R] avoiding loop
one approach is the following:

# say 'DF' is your data frame, then
with(DF, {
  ind <- choice == 1
  n <- tapply(value[ind], time[ind], sum)
  d <- tapply(value, time, sum)
  n / d
})

I hope it helps. Best, Dimitris

parkbomee wrote: [original message quoted in full]

-- Dimitris Rizopoulos Assistant Professor Department of Biostatistics Erasmus University Medical Center Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands Tel: +31/(0)10/7043478 Fax: +31/(0)10/7043014
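Dimitris's approach can be sanity-checked against the data posted in the question; the data frame below is a reconstruction of that table:

```r
DF <- data.frame(
  time   = rep(1:3, times = c(4, 6, 4)),
  choice = c(1,0,0,0,  1,0,0,0,0,0,  1,0,0,0),
  value  = c(3,2,3,3,  4,4,2,4,2,2,  2,3,5,2)
)
res <- with(DF, {
  ind <- choice == 1
  tapply(value[ind], time[ind], sum) / tapply(value, time, sum)
})
res        # 3/11, 4/18 and 2/12 for times 1, 2 and 3
sum(res)   # the scalar objective the original poster was after
```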
Re: [R] Plots with k-means
what is the dimension of your data? you might try projecting the points into planes defined by 3 cluster centers. plot, for each cluster, a density plot or histogram of distances to the cluster center, and perhaps overlay the density curve for points not in the cluster. albyn Quoting Iuri Gavronski i...@proxima.adm.br: Hi, I'm doing a k-means cluster with 6 clusters and 15 variables. Any suggestions on how to plot the results? I've tried the standard xy plot, but couldn't get much of it. Thanks in advance, Iuri.
Re: [R] Plots with k-means
oops, I see the answer in your question: 15...

Quoting Albyn Jones jo...@reed.edu: [previous message quoted in full]
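One concrete way to act on the projection advice above, sketched with simulated data standing in for the real 15-variable set: project onto the first two principal components and colour the points by cluster.

```r
set.seed(42)
X  <- matrix(rnorm(300 * 15), ncol = 15)   # stand-in for the real data
km <- kmeans(X, centers = 6)               # 6 clusters, as in the question
pc <- prcomp(X, scale. = TRUE)             # principal components for plotting

plot(pc$x[, 1], pc$x[, 2], col = km$cluster, pch = 19,
     xlab = "PC1", ylab = "PC2",
     main = "k-means clusters projected onto first two PCs")
```

With well-separated clusters, the colours form visible groups in the PC1/PC2 plane; Albyn's per-cluster distance histograms complement this by showing cluster tightness.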
Re: [R] Summing identical IDs
On Oct 30, 2009, at 3:29 PM, PDXRugger wrote: David, You are correct. I think the first two assumptions can be thrown out and only the latter two (c, d) can be considered. So how would I combine Acres for matching Bldgids based on assumptions c, d?

I also think your requested output failed to add the ImpValue of rows 10 and 11 (but did add for cases 5 & 6 as well as 8 & 9). So it's not exactly clear what you want. David

David Winsemius wrote: On Oct 29, 2009, at 5:23 PM, PDXRugger wrote: Terrific help, thank you.

dupbuild <- aggregate(DF$Acres, list(Bldgid), sum)

This line worked best. Now I'm going to challenge everyone (I think?). Consider the following:

Acres <- c(100, 101, 100, 130, 156, .5, 293, 300, .09, 100, 12.5)
Bldgid <- c(1, 2, 3, 4, 5, 5, 6, 7, 7, 8, 8)
Year <- c(1946, 1952, 1922, 1910, 1955, 1955, 1999, 1990, 1991, 2000, 2000)
ImpValue <- c(1000, 1400, 1300, 900, 5000, 1200, 500, 1000, 300, 1000, 1000)
DF = cbind(Acres, Bldgid, Year, ImpValue)
DF <- as.data.frame(DF)

I would like to do the same, except there are some rules I want to follow. I only want to aggregate the Acres if:
a) The Years are not identical
b) The ImpValues are not identical
So we are striking out a and b, and we only want to sum values where the following obtain:
c) The Years are identical and the ImpValues are not
d) The ImpValues are identical and the Years are not

Do an outer join on the DF:

mrgDF <- merge(DF, DF, by = "Bldgid", all = TRUE)
# and identify cases within that construct that satisfy the non-equalities:
with(mrgDF, mrgDF[ImpValue.x != ImpValue.y | Acres.y != Acres.x, ])

   Bldgid Acres.x Year.x ImpValue.x Acres.y Year.y ImpValue.y
6       5  156.00   1955       5000    0.50   1955       1200
7       5    0.50   1955       1200  156.00   1955       5000
11      7  300.00   1990       1000    0.09   1991        300
12      7    0.09   1991        300  300.00   1990       1000
15      8  100.00   2000       1000   12.50   2000       1000
16      8   12.50   2000       1000  100.00   2000       1000

You should be able to take it from here ... after you figure out what it is that you want. -- David.

As I review your Boolean logic, I run into serious problems. c) and d) cannot be true if a) and b) are true. So no cases satisfy all 4 specs. In particular, both of the pairs you say you want aggregated (5 + 6 and 10 + 11) violate rule a), and the second pair also violates b). -- David

but if the Acres and ImpValues are identical I would still like to add the Acres together and form one case. If the cases are put together I would also like to add the ImpValues together. So the below

    Acres Bldgid Year ImpValue
1  100.00      1 1946     1000
2  101.00      2 1952     1400
3  100.00      3 1922     1300
4  130.00      4 1910      900
5  156.00      5 1955     5000
6    0.50      5 1955     1200
7  293.00      6 1999      500
8  300.00      7 1990     1000
9    0.09      7 1991      300
10 100.00      8 2000     1000
11  12.50      8 2000     1000

would become

    Acres Bldgid Year ImpValue
1  100.00      1 1946     1000
2  101.00      2 1952     1400
3  100.00      3 1922     1300
4  130.00      4 1910      900
5  156.50      5 1955     6200
7  293.00      6 1999      500
8  300.09      7 1990     1300
10 112.50      8 2000     1000

Thanks, I gave it a bunch of shots but nothing worth posting.

PDXRugger wrote: Hello All, I would like to select records with identical IDs, then sum an attribute, then return them to the data frame as a single record. Please consider:

Acres <- c(100, 101, 100, 130, 156, .5, 293, 300, .09)
Bldgid <- c(1, 2, 3, 4, 5, 5, 6, 7, 7)
DF = cbind(Acres, Bldgid)
DF <- as.data.frame(DF)

So that:

   Acres Bldgid
1 100.00      1
2 101.00      2
3 100.00      3
4 130.00      4
5 156.00      5
6   0.50      5
7 293.00      6
8 300.00      7
9   0.09      7

Becomes:

   Acres Bldgid
1 100.00      1
2 101.00      2
3 100.00      3
4 130.00      4
5 156.50      5
7 293.00      6
8 300.09      7

dup <- unique(DF$Bldgid[duplicated(Bldgid)])
dupbuild <- DF[DF$Bldgid %in% dup, ]
dupareasum <- sum(dupbuild$Acres[duplicated(dupbuild$Bldgid)])

This sums the unique IDs of the duplicated records, not what I want. Thanks ahead of time. JR -- View this message in context: http://www.nabble.com/Summing-identical-IDs-tp26118922p26121056.html Sent from the R help mailing list archive at Nabble.com.
David Winsemius, MD Heritage Laboratories West Hartford, CT
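Setting aside the rule inconsistencies David identifies, the simplest unconditional reading (sum Acres and ImpValue within each Bldgid) is a one-liner with aggregate(). A sketch; note it deliberately ignores rules (a)-(d), and so differs from the requested output for Bldgid 8, where the poster kept a single ImpValue:

```r
DF <- data.frame(
  Acres    = c(100, 101, 100, 130, 156, 0.5, 293, 300, 0.09, 100, 12.5),
  Bldgid   = c(1, 2, 3, 4, 5, 5, 6, 7, 7, 8, 8),
  Year     = c(1946, 1952, 1922, 1910, 1955, 1955, 1999, 1990, 1991, 2000, 2000),
  ImpValue = c(1000, 1400, 1300, 900, 5000, 1200, 500, 1000, 300, 1000, 1000)
)
## sum both numeric columns within each building ID
aggregate(cbind(Acres, ImpValue) ~ Bldgid, data = DF, FUN = sum)
## Bldgid 5 becomes Acres 156.5 / ImpValue 6200; Bldgid 7 becomes 300.09 / 1300
```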
Re: [R] opacity under dispersion command under plotrix
On 10/30/2009 10:02 PM, Joe King wrote: Is there any way to make the lines in the dispersion command come forward in a plot, and allow the fill in the dispersion parameter to be transparent, so all of the lines I am using to note confidence intervals are shown? Hi Joe, I have probably missed something, as a call to dispersion with type="l" will just display lines with no fill. I'm not sure exactly what you mean by "come forward"; I'm guessing this means displayed last, so just make the call to dispersion the last one for that plot. Jim
Re: [R] polar.plot
On 10/31/2009 04:49 AM, Tony Greig wrote: [original question quoted in full]

Hi Tony, The first one is easy:

polar.plot(c(a,b), c(dir.a,dir.b), start = 90, clockwise = T, show.grid.labels = FALSE, radial.lim = c(0,5), line.col = 2, lwd = 2)

I have had one other person ask about an "add" option, and I might include that in a future version. The second one is a bit harder. You probably noticed that I changed the show.grid.labels argument to FALSE in the above.

par(xpd = TRUE)
boxed.labels(rep(0,5), 1:5, 1:5, border = NA)
par(xpd = FALSE)

This will put the labels vertically up from the center. Doing fancier things like having the labels at an angle would require calculating the positions, which isn't too hard. Jim
[R] matrix^(-1/2)
Dear R-Help Team, as an R novice I have a (maybe for you very simple) question: how do I get the following solved in R? Let R be an n x n matrix; I want \mid R \mid^{-\frac{1}{2}}. solve(R) gives me the inverse of the matrix R, however not the ^(-1/2) of the matrix... Thank you very much in advance! Best regards, Kajan Saied
Re: [R] R-help Digest, Vol 80, Issue 30
On 10/30/2009 10:35 PM, Huidong TIAN wrote: Dear friends, I will be very happy if anyone can tell me the way to change the working directory permanently? I mean not using the function setwd(), which only changes it temporarily; when you close the console, it is back to the old directory. Sys.setenv(R_USER = '') also doesn't work. Hi Huidong, There are a few ways. One is to define a .First function in the directory where R starts and include the setwd() command in that:

.First <- function() {
  setwd("/home/huidong/R")
}

If you start up a new R session, create the .First function and then answer y to:

Save workspace image? [y/n/c]:

this function will be run when you start up R. You can add other commands to customize the R session, but remember to do so in a fresh session, then quit saving the workspace image each time you change .First. Jim
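An alternative to Jim's saved-workspace approach, sketched here, is to put the same function in a .Rprofile file (in the home or startup directory), which R sources at startup without needing a saved workspace image; the path is an example, not Huidong's actual directory:

```r
## contents of ~/.Rprofile
.First <- function() {
  setwd("/home/huidong/R")   # example path; replace with the desired directory
  message("working directory set to ", getwd())
}
```

This survives `q(save = "no")`, whereas a .First stored only in the workspace image is lost if the workspace is not saved.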
Re: [R] matrix^(-1/2)
On Oct 31, 2009, at 4:39 PM, Kajan Saied wrote: [original question quoted in full]

GIYF: (and Bill Venables if friendly, too.) http://www.lmgtfy.com/?q=powers+of+matrix+r-project -- David Winsemius, MD Heritage Laboratories West Hartford, CT
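To make the pointer concrete: if the notation means the determinant term, det(R)^(-1/2) is enough; if a matrix inverse square root R^{-1/2} is wanted for a symmetric positive-definite R, it can be built from the eigendecomposition. A sketch (mat_inv_sqrt is an illustrative helper, not an existing function):

```r
mat_inv_sqrt <- function(R) {
  e <- eigen(R, symmetric = TRUE)              # R = V diag(lambda) V'
  e$vectors %*% diag(e$values^(-1/2)) %*% t(e$vectors)
}

R <- matrix(c(2, 1, 1, 2), 2, 2)
S <- mat_inv_sqrt(R)
round(S %*% S %*% R, 10)   # S S R = R^{-1} R, i.e. the identity matrix
det(R)^(-1/2)              # the scalar |R|^(-1/2), here 1/sqrt(3)
```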
Re: [R] avoiding loop
This is pretty much equivalent:

tapply(DF$value[DF$choice==1], DF$time[DF$choice==1], sum) / tapply(DF$value, DF$time, sum)

And both will probably fail if the number of groups with choice==1 is different than the number overall. -- David.

On Oct 31, 2009, at 5:14 PM, Dimitris Rizopoulos wrote: [previous message quoted in full]
Re: [R] avoiding loop
Thank you both. However, using tapply() instead of a loop does not seem to improve my code much. I am using it inside an optimization function, and it still takes longer than it should...
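[Since the computation sits inside an optimizer, a lower-overhead alternative worth trying is rowsum(), which sums by group in compiled code; multiplying by the choice indicator also sidesteps the group-mismatch failure David mentioned. A sketch on the posted data:]

```r
## The posted data, transcribed into a data frame
DF <- data.frame(
  time   = c(1,1,1,1, 2,2,2,2,2,2, 3,3,3,3),
  choice = c(1,0,0,0, 1,0,0,0,0,0, 1,0,0,0),
  value  = c(3,2,3,3, 4,4,2,4,2,2, 2,3,5,2)
)

## numerator: per-time sum of value where choice == 1
## (the indicator product keeps every time group present, even if
## some group had no choice == 1 rows)
num <- rowsum(DF$value * (DF$choice == 1), DF$time)
## denominator: per-time sum of all values
den <- rowsum(DF$value, DF$time)

sum(num / den)   ## the 'objective' total, 3/11 + 2/9 + 1/6
```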