Re: [R] Good pointers for understanding the R language implementation
This might be useful: http://adv-r.had.co.nz/ On Tue, Apr 5, 2016 at 10:31 AM, Francisco Banhawrote: > Dear All, > > I'm currently working on a project with > the purpose of remotely executing R code, which requires me to have to > work with the code of R itself. I've searched the Internet for good > information that will help me understand how R is implemented but what > I've got so far isn't detailed enough. > I've looked specifically at > CRAN's manuals on the official website but they only address this issue > briefly. I've also looked at other contents online but so far nothing > has turned up that has the level of detail that I need to properly > understand the inner workings of R. > For example, I need to understand > how exactly an expression is parsed and evaluated, because I will need > to intervene in the process to decide whether to execute it remotely or > not. > Does anyone know of good pointers that would help me understand this? > Thanks for any help! > > Best regards, > Francisco > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] reduced set of alternatives in package mlogit
Hi Jose, You're referring to your response variable when you're saying it's missing some of the choices, right? Are your response choices ever known or do they just occur with extremely low frequency? Either way, I think the mlogit package would be inappropriate for you. I imagine you would have much better luck using MCMCpack or writing a model with rstan or something Bayesian. Unless I'm missing something. Chris On Fri, Apr 1, 2016 at 8:40 AM, Jose Marcos Ferraro < jose.ferr...@logiteng.com> wrote: > -Original Message- > From: Bert Gunter [mailto:bgunter.4...@gmail.com] > Sent: quinta-feira, 31 de março de 2016 20:22 > To: Jose Marcos Ferraro> Cc: r-help@r-project.org > Subject: Re: [R] reduced set of alternatives in package mlogit > > code? example data? We can only guess based on your vague post. > > "PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code." > > Moreover, this sounds like a statistical question, not a question about R > programming, and so might be more appropriate for a statistical list like > stats.stackexchange.com . > > Cheers, > Bert > > > Bert Gunter > > "The trouble with having an open mind is that people keep coming along and > sticking things into it." > -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip ) > > > On Thu, Mar 31, 2016 at 2:51 PM, Jose Marcos Ferraro < > jose.ferr...@logiteng.com> wrote: > > I'm trying to estimate a multinomial logit model but in some choices > only alternatives from a subset of all possible alternatives can be chosen. > > At the moment I get around it by creating "dummy" variables to mean the > alternative is not available and let it estimate this coefficient as highly > negative. Is there a better way to do it? > > > > Sorry if I was not clear enough, but there is hardly any code to show. > The problem is that a parameter or function is lacking (or , mostly > likely, I can't find it), so in some sense the problem itself is that > there is no code to show. > > In what follows choice situations , alternatives, wide, and variables have > the same meaning that they have on the mlogit documentation. All variables > are alternative specific. > > 1)I want to estimate a multinomial Logit using the mlogit package > > 2)I have a dataset, made of choice situations > > 3)There is a set of alternatives > > 4)in some choice situations, not all alternatives were available, but only > a subset of them. So there are no variables for the unavailable > alternatives and the chosen alternative evidently belongs to the set of > available ones. > > 5)I use mlogit.data to prepare the dataset from a "wide" dataframe . There > is no option to have only a subset of alternatives and the resulting object > will have them all , that is, there will be a line for every alternative > and every choice situation, even if in reality some of them were not > available. The variables of these alternatives did not exist, so must be > filled with 0s or any other made up value > > 6) If ones estimate a model from this data it will be wrong > > 7) It is possible to get an "almost right" model by using a dummy variable > marking which alternatives are unavailable, for as it is only used in > alternatives that are never chosen, its coefficient will get negative with > big absolute value, in practice giving almost 0% probability for them > > 8)this is a workaround because it obligates the model to estimate a number > that should be -infinity and this is known in advance, so it's ugly and > difficult to know what the numeric consequences are as the coefficient can > never converge. In fact, I don't use it the way I described for these > reasons, preferring a more complex but almost equivalent formulation. The > important point is that I want a clean solution, not a workaround > > 9)I demand simply if mlogit package has such functionality > > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Amelia, Zelig and latex in R
Seems you're after the pooled results. Would the following work? library(Amelia) library(Zelig) library(xtable) data(africa) m = 10 imp1 - amelia(x = africa,cs=country,m=m) imp2 - amelia(x = africa,cs=country,m=m) lm.imputed1 - zelig(gdp_pc ~ trade + civlib, model=ls,data = imp1) lm.imputed2 - zelig(gdp_pc ~ trade + civlib, model=ls,data = imp2) lm1 - as.data.frame(summary(lm.imputed1)$coef) lm2 - as.data.frame(summary(lm.imputed2)$coef) lm1[,2] - ifelse(lm1[,4].001,paste(lm1[,2],***,sep= ), ifelse(lm1[,4].01,paste(lm1[,2],**,sep= ), ifelse(lm1[,4].05,paste(lm1[,2],*,sep= ), ifelse(lm1[,4].1,paste(lm1[,2],.,sep= ),lm1[,2] lm2[,2] - ifelse(lm2[,4].001,paste(lm2[,2],***,sep= ), ifelse(lm2[,4].01,paste(lm2[,2],**,sep= ), ifelse(lm2[,4].05,paste(lm2[,2],*,sep= ), ifelse(lm2[,4].1,paste(lm2[,2],.,sep= ),lm2[,2] ## ONE OPTIONS ## lms - as.data.frame(cbind(lm1[,1],lm2[,1],lm1[,2],lm2[,2])) rownames(lms) - rownames(lm1) colnames(lms) - c(Imp1.Est,Imp2.Est,Imp1.SE,Imp2.SE) xtable(lms) ## OR ## xtable(cbind(lm1[,1:2],lm2[,1:2])) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Amelia, Zelig and latex in R
Hi, I am glad you could get it to work. I don't really know I usually just use xtable and any additional formatting I need done I do in my LaTeX editor. Perhaps there isn't a nice tex format out of the box for MI data. Once you write some nice code, you could keep reusing it or better yet package it :) On Sunday, August 18, 2013, Francesco Sarracino wrote: That's right! Your advice is in the right direction and with little adjustments it did the job. However, I admit it was tricky and the result looks a bit artisanal and needs some polishing that I will do by hand in the tex code. Is it possible that there is no way to get nicely latex formatted tables concerning multiply imputed data-set? But maybe I should open a separate thread on this. Thanks a lot for your kind and patient help. Best regards, f. On 18 August 2013 15:56, Christopher Desjardins cddesjard...@gmail.comjavascript:_e({}, 'cvml', 'cddesjard...@gmail.com'); wrote: Seems you're after the pooled results. Would the following work? library(Amelia) library(Zelig) library(xtable) data(africa) m = 10 imp1 - amelia(x = africa,cs=country,m=m) imp2 - amelia(x = africa,cs=country,m=m) lm.imputed1 - zelig(gdp_pc ~ trade + civlib, model=ls,data = imp1) lm.imputed2 - zelig(gdp_pc ~ trade + civlib, model=ls,data = imp2) lm1 - as.data.frame(summary(lm.imputed1)$coef) lm2 - as.data.frame(summary(lm.imputed2)$coef) lm1[,2] - ifelse(lm1[,4].001,paste(lm1[,2],***,sep= ), ifelse(lm1[,4].01,paste(lm1[,2],**,sep= ), ifelse(lm1[,4].05,paste(lm1[,2],*,sep= ), ifelse(lm1[,4].1,paste(lm1[,2],.,sep= ),lm1[,2] lm2[,2] - ifelse(lm2[,4].001,paste(lm2[,2],***,sep= ), ifelse(lm2[,4].01,paste(lm2[,2],**,sep= ), ifelse(lm2[,4].05,paste(lm2[,2],*,sep= ), ifelse(lm2[,4].1,paste(lm2[,2],.,sep= ),lm2[,2] ## ONE OPTIONS ## lms - as.data.frame(cbind(lm1[,1],lm2[,1],lm1[,2],lm2[,2])) rownames(lms) - rownames(lm1) colnames(lms) - c(Imp1.Est,Imp2.Est,Imp1.SE,Imp2.SE) xtable(lms) ## OR ## xtable(cbind(lm1[,1:2],lm2[,1:2])) -- Francesco Sarracino, Ph.D. fsarracino https://sites.google.com/site/fsarracino/ [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Amelia, Zelig and latex in R
Does this do what you want? library(Amelia) library(Zelig) library(stargazer) library(xtable) data(africa) m = 10 imp1 - amelia(x = africa,cs=country,m=m) imp2 - amelia(x = africa,cs=country,m=m) lm.imputed1 - zelig(infl ~ trade + civlib, model=ls,data = imp1) lm.imputed2 - zelig(infl ~ trade + civlib, model=ls,data = imp2) # Stargazer for(i in 1:m){ print(stargazer(as.data.frame(summary(lm.imputed1[[i]])$coef),as.data.frame(summary(lm.imputed2[[i]])$coef))) } # xtable for(i in 1:m){ print(xtable(summary(lm.imputed1[[i]]))) print(xtable(summary(lm.imputed2[[i]]))) } On Sat, Aug 17, 2013 at 6:37 AM, Francesco Sarracino f.sarrac...@gmail.comwrote: Dear listers, I am running some OLS on multiply imputed data using Amelia. I first imputed the data with Amelia. than I run a OLS using Zelig to obtain a table of results accounting for the multiply imputed data-sets. And I'd like to do this for various models. Finally, I want to output all the models in a table of results for latex. I've tried with Stargazer because it seems to support Zelig output, but when I run stargazer on a set of objects containing the output of zelig, I get the following error: Error: unrecognized object type. this message is repeated for each model I passed to Stargazer. I am sorry I can't provide a working example, because I should make up some multiply imputed data first. Hoewever, summarizing what I did is: imputed1 - amelia(x=data1, m=10) imputed2 - amelia(x=data2, m=10) lm.imputed1 - zelig(Y ~ X + Z, data = imputed1) lm.imputed2 - zelig(Y ~ X + Z, data = imputed2) stargazer(lm.imputed1, lm.imputed2) The outcome is the error I mentioned above. Thanks in advance for all the support you can offer. Regards, f. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Amelia, Zelig and latex in R
What do you mean by results? Do you want just the estimated parameters? And are you looking for one big table with all the estimated parameters from all imputation runs? Chris On Sat, Aug 17, 2013 at 11:18 AM, Francesco Sarracino f.sarrac...@gmail.com wrote: Hi Christopher, thanks for your reply. Unfortunately, that's not what I am looking for. I would like to have a table with the results of the two models (lm.imputed1 and lm.imputed2) in two separate columns. According to stargazer syntax I should type something like: stargazer(lm.imputed1, lm.imputed2, summary = FALSE) but then I get my error: Error: Unrecognized object type. Even though your example is insightful, I can't figure out how to solve my problem. Any advice is very welcome. Regards, f. On 17 August 2013 17:02, Christopher Desjardins cddesjard...@gmail.comwrote: Does this do what you want? library(Amelia) library(Zelig) library(stargazer) library(xtable) data(africa) m = 10 imp1 - amelia(x = africa,cs=country,m=m) imp2 - amelia(x = africa,cs=country,m=m) lm.imputed1 - zelig(infl ~ trade + civlib, model=ls,data = imp1) lm.imputed2 - zelig(infl ~ trade + civlib, model=ls,data = imp2) # Stargazer for(i in 1:m){ print(stargazer(as.data.frame(summary(lm.imputed1[[i]])$coef),as.data.frame(summary(lm.imputed2[[i]])$coef))) } # xtable for(i in 1:m){ print(xtable(summary(lm.imputed1[[i]]))) print(xtable(summary(lm.imputed2[[i]]))) } On Sat, Aug 17, 2013 at 6:37 AM, Francesco Sarracino f.sarrac...@gmail.com wrote: Dear listers, I am running some OLS on multiply imputed data using Amelia. I first imputed the data with Amelia. than I run a OLS using Zelig to obtain a table of results accounting for the multiply imputed data-sets. And I'd like to do this for various models. Finally, I want to output all the models in a table of results for latex. I've tried with Stargazer because it seems to support Zelig output, but when I run stargazer on a set of objects containing the output of zelig, I get the following error: Error: unrecognized object type. this message is repeated for each model I passed to Stargazer. I am sorry I can't provide a working example, because I should make up some multiply imputed data first. Hoewever, summarizing what I did is: imputed1 - amelia(x=data1, m=10) imputed2 - amelia(x=data2, m=10) lm.imputed1 - zelig(Y ~ X + Z, data = imputed1) lm.imputed2 - zelig(Y ~ X + Z, data = imputed2) stargazer(lm.imputed1, lm.imputed2) The outcome is the error I mentioned above. Thanks in advance for all the support you can offer. Regards, f. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Francesco Sarracino, Ph.D. https://sites.google.com/site/fsarracino/ [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Amelia, Zelig and latex in R
Oh and are you looking for just the summarized results over all the imputed runs? i thought you wanted them from each iteration. On Sat, Aug 17, 2013 at 11:38 AM, Christopher Desjardins cddesjard...@gmail.com wrote: What do you mean by results? Do you want just the estimated parameters? And are you looking for one big table with all the estimated parameters from all imputation runs? Chris On Sat, Aug 17, 2013 at 11:18 AM, Francesco Sarracino f.sarrac...@gmail.com wrote: Hi Christopher, thanks for your reply. Unfortunately, that's not what I am looking for. I would like to have a table with the results of the two models (lm.imputed1 and lm.imputed2) in two separate columns. According to stargazer syntax I should type something like: stargazer(lm.imputed1, lm.imputed2, summary = FALSE) but then I get my error: Error: Unrecognized object type. Even though your example is insightful, I can't figure out how to solve my problem. Any advice is very welcome. Regards, f. On 17 August 2013 17:02, Christopher Desjardins cddesjard...@gmail.comwrote: Does this do what you want? library(Amelia) library(Zelig) library(stargazer) library(xtable) data(africa) m = 10 imp1 - amelia(x = africa,cs=country,m=m) imp2 - amelia(x = africa,cs=country,m=m) lm.imputed1 - zelig(infl ~ trade + civlib, model=ls,data = imp1) lm.imputed2 - zelig(infl ~ trade + civlib, model=ls,data = imp2) # Stargazer for(i in 1:m){ print(stargazer(as.data.frame(summary(lm.imputed1[[i]])$coef),as.data.frame(summary(lm.imputed2[[i]])$coef))) } # xtable for(i in 1:m){ print(xtable(summary(lm.imputed1[[i]]))) print(xtable(summary(lm.imputed2[[i]]))) } On Sat, Aug 17, 2013 at 6:37 AM, Francesco Sarracino f.sarrac...@gmail.com wrote: Dear listers, I am running some OLS on multiply imputed data using Amelia. I first imputed the data with Amelia. than I run a OLS using Zelig to obtain a table of results accounting for the multiply imputed data-sets. And I'd like to do this for various models. Finally, I want to output all the models in a table of results for latex. I've tried with Stargazer because it seems to support Zelig output, but when I run stargazer on a set of objects containing the output of zelig, I get the following error: Error: unrecognized object type. this message is repeated for each model I passed to Stargazer. I am sorry I can't provide a working example, because I should make up some multiply imputed data first. Hoewever, summarizing what I did is: imputed1 - amelia(x=data1, m=10) imputed2 - amelia(x=data2, m=10) lm.imputed1 - zelig(Y ~ X + Z, data = imputed1) lm.imputed2 - zelig(Y ~ X + Z, data = imputed2) stargazer(lm.imputed1, lm.imputed2) The outcome is the error I mentioned above. Thanks in advance for all the support you can offer. Regards, f. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Francesco Sarracino, Ph.D. https://sites.google.com/site/fsarracino/ [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Randomly drop a percent of data from a data.frame
Hi, I have the following data. set.seed(6245) data - data.frame(x1=rnorm(5),x2=rnorm(5),x3=rnorm(5),x4=rnorm(5)) round(data,digits=3) x1 x2 x3 x4 1 0.482 1.320 -0.859 -0.142 2 -0.753 -0.041 -0.063 0.886 3 0.028 -0.256 -0.069 0.354 4 -0.086 0.475 0.244 0.781 5 0.690 -0.181 1.274 1.633 What I would like to do is drop 20% of the data. But I want this 20% to only come from dropping data from x3 and x4. It doesn't have to be evenly, i.e. I don't care to drop 2 from x3 and 2 from x4 or make sure only one observation has missing data on only one variable. I just want to drop 20% of the data through x3 and x4 only. In other words, x1 x2 x3 x4 1 0.482 1.320 -0.859 NA 2 -0.753 -0.041 -0.063 0.886 3 0.028 -0.256 NA 0.354 4 -0.086 0.475 NA 0.781 5 0.690 -0.181 NA 1.633 OR x1 x2 x3 x4 1 0.482 1.320 NA -0.142 2 -0.753 -0.041 -0.063 0.886 3 0.028 -0.256 NA NA 4 -0.086 0.475 0.244 NA 5 0.690 -0.181 1.274 1.633 OR x1 x2 x3 x4 1 0.482 1.320 -0.859 -0.142 2 -0.753 -0.041 -0.063 NA 3 0.028 -0.256 -0.069 NA 4 -0.086 0.475 0.244 NA 5 0.690 -0.181 1.274 NA ETC. are all fine. Any ideas how I can do this? Chris [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Randomly drop a percent of data from a data.frame
Hi, Thanks for the help. What I actually ended up doing was writing a copy of for loops and I ended up getting something works. Thanks. Chris On Fri, Aug 16, 2013 at 4:34 PM, arun smartpink...@yahoo.com wrote: Hi, May be this helps: #data1 (changed `data` to `data1`) set.seed(6245) data1 - data.frame(x1=rnorm(5),x2=rnorm(5),x3=rnorm(5),x4=rnorm(5)) data1- round(data1,digits=3) data2- data1 data1[,3:4]-lapply(data1[,3:4],function(x){x1- match(x,sample(unlist(data1[,3:4]),round(0.8*length(unlist(data1[,3:4]);x[ is.na(x1)]-NA;x}) data1 # x1 x2 x3 x4 #1 0.482 1.320 NA -0.142 #2 -0.753 -0.041 -0.063 0.886 #3 0.028 -0.256 -0.069 0.354 #4 -0.086 0.475 0.244 0.781 #5 0.690 -0.181 1.274 1.633 #or data2[,3:4]-lapply(data2[,3:4],function(x){x1- match(x,sample(unlist(data2[,3:4]),round(0.8*length(unlist(data2[,3:4]);x[ is.na(x1)]-NA;x}) data2 # x1 x2 x3 x4 #1 0.482 1.320 -0.859 -0.142 #2 -0.753 -0.041 NA NA #3 0.028 -0.256 -0.069 0.354 #4 -0.086 0.475 0.244 0.781 #5 0.690 -0.181 1.274 1.633 A.K. - Original Message - From: Christopher Desjardins cddesjard...@gmail.com To: r-help@r-project.org r-help@r-project.org Cc: Sent: Friday, August 16, 2013 3:02 PM Subject: [R] Randomly drop a percent of data from a data.frame Hi, I have the following data. set.seed(6245) data - data.frame(x1=rnorm(5),x2=rnorm(5),x3=rnorm(5),x4=rnorm(5)) round(data,digits=3) x1 x2 x3 x4 1 0.482 1.320 -0.859 -0.142 2 -0.753 -0.041 -0.063 0.886 3 0.028 -0.256 -0.069 0.354 4 -0.086 0.475 0.244 0.781 5 0.690 -0.181 1.274 1.633 What I would like to do is drop 20% of the data. But I want this 20% to only come from dropping data from x3 and x4. It doesn't have to be evenly, i.e. I don't care to drop 2 from x3 and 2 from x4 or make sure only one observation has missing data on only one variable. I just want to drop 20% of the data through x3 and x4 only. In other words, x1 x2 x3 x4 1 0.482 1.320 -0.859 NA 2 -0.753 -0.041 -0.063 0.886 3 0.028 -0.256 NA 0.354 4 -0.086 0.475 NA 0.781 5 0.690 -0.181 NA 1.633 OR x1 x2 x3 x4 1 0.482 1.320 NA -0.142 2 -0.753 -0.041 -0.063 0.886 3 0.028 -0.256 NA NA 4 -0.086 0.475 0.244 NA 5 0.690 -0.181 1.274 1.633 OR x1 x2 x3 x4 1 0.482 1.320 -0.859 -0.142 2 -0.753 -0.041 -0.063 NA 3 0.028 -0.256 -0.069 NA 4 -0.086 0.475 0.244 NA 5 0.690 -0.181 1.274 NA ETC. are all fine. Any ideas how I can do this? Chris [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Combining data from different saved files with same object names into one data frame
Thanks Jim. That'll work for me! Chris On Mon, Feb 4, 2013 at 11:17 AM, jim holtman jholt...@gmail.com wrote: Try this storing them to a list with the j,k,l,m as an index: result - list() for(j in 1:2){ for(k in 1:6){ for(l in 1:2){ for(m in 1:2){ fsumstatZINB=paste(/Users/chris/Dropbox/phd/analysis/Simulation/zinbmodels/summaries/zinbsumstat-,j,k,l,m,.rdata, sep=) load(fsumstatZINB) print(zinbMeans[,1]) result[[paste(j,k,l,m)]] - zinbMeans } } } } On Mon, Feb 4, 2013 at 12:02 PM, Christopher Desjardins cddesjard...@gmail.com wrote: Hi, I am simulating data from a zero-inflated negative binomial model. I have 4 different conditions and I've saved various information about the models in zinbsumbstat-jklm.rdata (as shown below). What I want to do is print information from each of these *.rdata and store them in one large object (a matrix or a data frame). However, within each data the objects all have the same name. So I don't know how to combine these into one object. Below is the code that I could use to paste all the information into the console. for(j in 1:2){ for(k in 1:6){ for(l in 1:2){ for(m in 1:2){ fsumstatZINB=paste(/Users/chris/Dropbox/phd/analysis/Simulation/zinbmodels/summaries/zinbsumstat-,j,k,l,m,.rdata, sep=) load(fsumstatZINB) print(zinbMeans[,1]) } } } } How could I combine all the information from these loops into a matrix() or a data.frame()? Two of the datasets [1,2] are pasted in a public Dropbox. Thanks! Chris [1] https://dl.dropbox.com/u/1501309/zinbsumstat-.rdata [2] https://dl.dropbox.com/u/1501309/zinbsumstat-1112.rdata [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? Tell me what you want to do, not how you want to do it. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Simulation with cpm package
Hi, I am running the following code based on the cpm vignette's code. I believe the code is syntactically correct but it just seems to hang R. I can get this to run if I set the sims to 100 but with 2000 it just hangs. Any ideas why? Thanks, Chris library(cpm) cpmTypes - c(Kolmogorov-Smirnov,Mann-Whitney,Cramer-von-Mises) changeMagnitudes - c(1, 2, 4, 5) changeLocations - c(50,100,300) sims - 2000 ARL0 - 500 startup - 20 results - list() for (cpmType in cpmTypes) { results[[cpmType]] - matrix(numeric(length(changeMagnitudes) * length(changeLocations)), nrow = length(changeMagnitudes)) for (cm in 1:length(changeMagnitudes)) { for (cl in 1:length(changeLocations)) { print(sprintf(cpm:%s magnitude::%s location:%s, cpmType, changeMagnitudes[cm], changeLocations[cl])) temp - numeric(sims) for (s in 1:sims) { x -c(rchisq(changeLocations[cl], df=3), rchisq(2000, df=changeMagnitudes[cm])) temp[s] -detectChangePoint(x, cpmType, ARL0=ARL0, startup=startup)$detectionTime } results[[cpmType]][cm,cl] - mean(temp[temp changeLocations[cl]]) - changeLocations[cl] } } } [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Replacing NAs in long format
Thanks these different examples work perfectly. Chris On Sat, Nov 3, 2012 at 8:32 PM, arun smartpink...@yahoo.com wrote: HI Bill, It is much simpler. # with aggregate() and merge() res1-with(dat2,aggregate(seq_len(nrow(dat2)),by=list(idr=idr),FUN=function(i) with(dat2[i,], any(schyear=5 year ==0 res2-merge(dat2,res1,by=idr) colnames(res2)[4]-flag within(res2,{flag-as.integer(flag)}) #idr schyear year flag #1 1 4 -11 #2 1 501 #3 1 611 #4 1 721 #5 2 900 #6 2 1010 #7 2 1120 A.K. - Original Message - From: William Dunlap wdun...@tibco.com To: arun smartpink...@yahoo.com; Christopher Desjardins cddesjard...@gmail.com Cc: R help r-help@r-project.org Sent: Saturday, November 3, 2012 9:21 PM Subject: RE: [R] Replacing NAs in long format Or, even simpler, flag - with(dat2, ave(schyear=5 year==0, idr, FUN=any)) data.frame(dat2, flag) idr schyear year flag 1 1 4 -1 TRUE 2 1 50 TRUE 3 1 61 TRUE 4 1 72 TRUE 5 2 90 FALSE 6 2 101 FALSE 7 2 112 FALSE Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of William Dunlap Sent: Saturday, November 03, 2012 5:38 PM To: arun; Christopher Desjardins Cc: R help Subject: Re: [R] Replacing NAs in long format ave() or split-() can make that easier to write, although it may take some time to internalize the idiom. E.g., flag - rep(NA, nrow(dat2)) # add as.integer if you prefer 1,0 over TRUE,FALSE split(flag, dat2$idr) - lapply(split(dat2, dat2$idr), function(d)with(d, any(schyear=5 year==0))) data.frame(dat2, flag) idr schyear year flag 1 1 4 -1 TRUE 2 1 50 TRUE 3 1 61 TRUE 4 1 72 TRUE 5 2 90 FALSE 6 2 101 FALSE 7 2 112 FALSE or ave(seq_len(nrow(dat2)), dat2$idr, FUN=function(i)with(dat2[i,], any(schyear=5 year==0))) [1] 1 1 1 1 0 0 0 flag - ave(seq_len(nrow(dat2)), dat2$idr, FUN=function(i)with(dat2[i,], any(schyear=5 year==0))) data.frame(dat2, flag) idr schyear year flag 1 1 4 -11 2 1 501 3 1 611 4 1 721 5 2 900 6 2 1010 7 2 1120 Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com -Original Message- From: r-help-boun...@r-project.org [mailto: r-help-boun...@r-project.org] On Behalf Of arun Sent: Saturday, November 03, 2012 5:01 PM To: Christopher Desjardins Cc: R help Subject: Re: [R] Replacing NAs in long format Hi, May be this helps: dat2-read.table(text= idr schyear year 14 -1 150 161 172 290 2101 211 2 ,sep=,header=TRUE) dat2$flag-unlist(lapply(split(dat2,dat2$idr),function(x) rep(ifelse(any(apply(x,1,function(x) x[2]=5 x[3]==0)),1,0),nrow(x))),use.names=FALSE) dat2 # idr schyear year flag #1 1 4 -11 #2 1 501 #3 1 611 #4 1 721 #5 2 900 #6 2 1010 #7 2 1120 A.K. - Original Message - From: Christopher Desjardins cddesjard...@gmail.com To: jim holtman jholt...@gmail.com Cc: r-help@r-project.org Sent: Saturday, November 3, 2012 7:09 PM Subject: Re: [R] Replacing NAs in long format I have a similar sort of follow up and I bet I could reuse some of this code but I'm not sure how. Let's say I want to create a flag that will be equal to 1 if schyear = 5 and year = 0 for a given idr. For example dat idr schyear year 1 4 -1 1 50 1 61 1 72 2 90 2101 211 2 How could I make the data look like this? idr schyear year flag 1 4 -1 1 1 50 1 1 61 1 1 72 1 2 90 0 21010 211 2 0 I am not sure how to end up not getting both 0s and 1s for the 'flag' variable for an idr. For example, dat$flag = ifelse(schyear = 5 year ==0, 1, 0) Does not work because it will create: idr schyear year flag 1 4 -1 0 1 50 1 1 6
[R] Replacing NAs in long format
Hi, I have the following data: data[1:20,c(1,2,20)] idr schyear year 1 80 1 91 1 10 NA 2 4 NA 2 5 -1 2 60 2 71 2 82 2 93 2 104 2 11 NA 2 126 3 4 NA 3 5 -2 3 6 -1 3 70 3 81 3 92 3 103 3 11 NA What I want to do is replace the NAs in the year variable with the following: idr schyear year 1 80 1 91 1 10 2 2 4 -2 2 5 -1 2 60 2 71 2 82 2 93 2 104 2 11 5 2 126 3 4 -3 3 5 -2 3 6 -1 3 70 3 81 3 92 3 103 3 11 4 I have no idea how to do this. What it needs to do is make sure that for each subject (idr) that it either adds a 1 if it is preceded by a value in year or subtracts a 1 if it comes before a year value. Does that make sense? I could do this in Excel but I am at a loss for how to do this in R. Please reply to me as well as the list if you respond. Thanks! Chris [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Replacing NAs in long format
Hi Jim, Thank you so much. That does exactly what I want. Chris On Sat, Nov 3, 2012 at 1:30 PM, jim holtman jholt...@gmail.com wrote: x - read.table(text = idr schyear year + 1 80 + 1 91 + 1 10 NA + 2 4 NA + 2 5 -1 + 2 60 + 2 71 + 2 82 + 2 93 + 2 104 + 2 11 NA + 2 126 + 3 4 NA + 3 5 -2 + 3 6 -1 + 3 70 + 3 81 + 3 92 + 3 103 + 3 11 NA, header = TRUE) # you did not specify if there might be multiple contiguous NAs, # so there are a lot of checks to be made x.l - lapply(split(x, x$idr), function(.idr){ + # check for all NAs -- just return indeterminate state + if (sum(is.na(.idr$year)) == nrow(.idr)) return(.idr) + # repeat until all NAs have been fixed; takes care of contiguous ones + while (any(is.na(.idr$year))){ + # find all the NAs + for (i in which(is.na(.idr$year))){ + if ((i == 1L) (!is.na(.idr$year[i + 1L]))){ + .idr$year[i] - .idr$year[i + 1L] - 1 + } else if ((i 1L) (!is.na(.idr$year[i - 1L]))){ + .idr$year[i] - .idr$year[i - 1L] + 1 + } else if ((i nrow(.idr)) (!is.na(.idr$year[i + 1L]))){ + .idr$year[i] - .idr$year[i + 1L] -1 + } + } + } + return(.idr) + }) do.call(rbind, x.l) idr schyear year 1.11 80 1.21 91 1.31 102 2.42 4 -2 2.52 5 -1 2.62 60 2.72 71 2.82 82 2.92 93 2.10 2 104 2.11 2 115 2.12 2 126 3.13 3 4 -3 3.14 3 5 -2 3.15 3 6 -1 3.16 3 70 3.17 3 81 3.18 3 92 3.19 3 103 3.20 3 114 On Sat, Nov 3, 2012 at 1:14 PM, Christopher Desjardins cddesjard...@gmail.com wrote: Hi, I have the following data: data[1:20,c(1,2,20)] idr schyear year 1 80 1 91 1 10 NA 2 4 NA 2 5 -1 2 60 2 71 2 82 2 93 2 104 2 11 NA 2 126 3 4 NA 3 5 -2 3 6 -1 3 70 3 81 3 92 3 103 3 11 NA What I want to do is replace the NAs in the year variable with the following: idr schyear year 1 80 1 91 1 10 2 2 4 -2 2 5 -1 2 60 2 71 2 82 2 93 2 104 2 11 5 2 126 3 4 -3 3 5 -2 3 6 -1 3 70 3 81 3 92 3 103 3 11 4 I have no idea how to do this. What it needs to do is make sure that for each subject (idr) that it either adds a 1 if it is preceded by a value in year or subtracts a 1 if it comes before a year value. Does that make sense? I could do this in Excel but I am at a loss for how to do this in R. Please reply to me as well as the list if you respond. Thanks! Chris [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? Tell me what you want to do, not how you want to do it. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Replacing NAs in long format
I have a similar sort of follow up and I bet I could reuse some of this code but I'm not sure how. Let's say I want to create a flag that will be equal to 1 if schyear = 5 and year = 0 for a given idr. For example dat idr schyear year 1 4 -1 1 50 1 61 1 72 2 90 2101 211 2 How could I make the data look like this? idr schyear year flag 1 4 -1 1 1 50 1 1 61 1 1 72 1 2 90 0 21010 211 2 0 I am not sure how to end up not getting both 0s and 1s for the 'flag' variable for an idr. For example, dat$flag = ifelse(schyear = 5 year ==0, 1, 0) Does not work because it will create: idr schyear year flag 1 4 -1 0 1 50 1 1 61 0 1 72 0 2 90 0 21010 211 2 0 And thus flag changes for an idr. Which it shouldn't. Thanks, Chris On Sat, Nov 3, 2012 at 5:50 PM, Christopher Desjardins cddesjard...@gmail.com wrote: Hi Jim, Thank you so much. That does exactly what I want. Chris On Sat, Nov 3, 2012 at 1:30 PM, jim holtman jholt...@gmail.com wrote: x - read.table(text = idr schyear year + 1 80 + 1 91 + 1 10 NA + 2 4 NA + 2 5 -1 + 2 60 + 2 71 + 2 82 + 2 93 + 2 104 + 2 11 NA + 2 126 + 3 4 NA + 3 5 -2 + 3 6 -1 + 3 70 + 3 81 + 3 92 + 3 103 + 3 11 NA, header = TRUE) # you did not specify if there might be multiple contiguous NAs, # so there are a lot of checks to be made x.l - lapply(split(x, x$idr), function(.idr){ + # check for all NAs -- just return indeterminate state + if (sum(is.na(.idr$year)) == nrow(.idr)) return(.idr) + # repeat until all NAs have been fixed; takes care of contiguous ones + while (any(is.na(.idr$year))){ + # find all the NAs + for (i in which(is.na(.idr$year))){ + if ((i == 1L) (!is.na(.idr$year[i + 1L]))){ + .idr$year[i] - .idr$year[i + 1L] - 1 + } else if ((i 1L) (!is.na(.idr$year[i - 1L]))){ + .idr$year[i] - .idr$year[i - 1L] + 1 + } else if ((i nrow(.idr)) (!is.na(.idr$year[i + 1L]))){ + .idr$year[i] - .idr$year[i + 1L] -1 + } + } + } + return(.idr) + }) do.call(rbind, x.l) idr schyear year 1.11 80 1.21 91 1.31 102 2.42 4 -2 2.52 5 -1 2.62 60 2.72 71 2.82 82 2.92 93 2.10 2 104 2.11 2 115 2.12 2 126 3.13 3 4 -3 3.14 3 5 -2 3.15 3 6 -1 3.16 3 70 3.17 3 81 3.18 3 92 3.19 3 103 3.20 3 114 On Sat, Nov 3, 2012 at 1:14 PM, Christopher Desjardins cddesjard...@gmail.com wrote: Hi, I have the following data: data[1:20,c(1,2,20)] idr schyear year 1 80 1 91 1 10 NA 2 4 NA 2 5 -1 2 60 2 71 2 82 2 93 2 104 2 11 NA 2 126 3 4 NA 3 5 -2 3 6 -1 3 70 3 81 3 92 3 103 3 11 NA What I want to do is replace the NAs in the year variable with the following: idr schyear year 1 80 1 91 1 10 2 2 4 -2 2 5 -1 2 60 2 71 2 82 2 93 2 104 2 11 5 2 126 3 4 -3 3 5 -2 3 6 -1 3 70 3 81 3 92 3 103 3 11 4 I have no idea how to do this. What it needs to do is make sure that for each subject (idr) that it either adds a 1 if it is preceded by a value in year or subtracts a 1 if it comes before a year value. Does that make sense? I could do this in Excel but I am at a loss for how to do this in R. Please reply to me as well as the list if you respond. Thanks! Chris [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Data
[R] New Submission to CRAN note
Hi, I want to submit a package to CRAN and I am getting the following Note: * checking CRAN incoming feasibility ... NOTE New submission How can I take care of this? And/or is it a big deal? Thanks and sorry if this is something that I easily overlooked I have googled this topic for a while and found nothing. Chris [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] hat matrix for zeroinfl and hurdle objects
Hi, I am wondering if there is an easy way to access the hat matrix for zeroinfl and hurdle objects in the pscl library? Thanks, Chris [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Changing ungrouped cases to grouped cases
Thanks the aggregate() command is what I was looking for. Chris On Thu, Jul 19, 2012 at 9:03 PM, David L Carlson dcarl...@tamu.edu wrote: dtf - read.table(text=y A B C + 0 11 2 + 0 12 1 + 1 11 2 + 0 11 2 + 1 11 2 + 1 12 1 + 0 12 2, + header=TRUE) dtagroup - aggregate(y~A+B+C, dtf, sum) # Gets you the groups. If you need the column/row order: dtagroup - dtagroup[order(dtagroup$y, decreasing=TRUE),c(4, 1:3)] -- David L Carlson Associate Professor of Anthropology Texas AM University College Station, TX 77843-4352 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- project.org] On Behalf Of Christopher Desjardins Sent: Thursday, July 19, 2012 7:35 PM To: R help Subject: [R] Changing ungrouped cases to grouped cases Hi, I have my data the following way: y A B C 0 11 2 0 12 1 1 11 2 0 11 2 1 11 2 1 12 1 0 12 2 . . . And so on. How can I make my data look like the following: y A B C 2 1 1 2 1 1 2 1 0 1 2 2 . . . In other words how can I change my ungrouped cases into grouped cases? Thanks! Chris [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Changing ungrouped cases to grouped cases
As a follow up is there any way to also get the count for each combination? For example y A B C 0 11 2 0 12 1 1 11 2 0 11 2 1 11 2 1 12 1 0 12 2 Should become: y A B C count 2 1 1 24 1 1 2 12 0 1 2 2 1 . . . So I would know there were 2 successes out of 4. Thanks! Chris On Fri, Jul 20, 2012 at 10:41 AM, Christopher Desjardins cddesjard...@gmail.com wrote: Thanks the aggregate() command is what I was looking for. Chris On Thu, Jul 19, 2012 at 9:03 PM, David L Carlson dcarl...@tamu.eduwrote: dtf - read.table(text=y A B C + 0 11 2 + 0 12 1 + 1 11 2 + 0 11 2 + 1 11 2 + 1 12 1 + 0 12 2, + header=TRUE) dtagroup - aggregate(y~A+B+C, dtf, sum) # Gets you the groups. If you need the column/row order: dtagroup - dtagroup[order(dtagroup$y, decreasing=TRUE),c(4, 1:3)] -- David L Carlson Associate Professor of Anthropology Texas AM University College Station, TX 77843-4352 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- project.org] On Behalf Of Christopher Desjardins Sent: Thursday, July 19, 2012 7:35 PM To: R help Subject: [R] Changing ungrouped cases to grouped cases Hi, I have my data the following way: y A B C 0 11 2 0 12 1 1 11 2 0 11 2 1 11 2 1 12 1 0 12 2 . . . And so on. How can I make my data look like the following: y A B C 2 1 1 2 1 1 2 1 0 1 2 2 . . . In other words how can I change my ungrouped cases into grouped cases? Thanks! Chris [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Changing ungrouped cases to grouped cases
Hi, I have my data the following way: y A B C 0 11 2 0 12 1 1 11 2 0 11 2 1 11 2 1 12 1 0 12 2 . . . And so on. How can I make my data look like the following: y A B C 2 1 1 2 1 1 2 1 0 1 2 2 . . . In other words how can I change my ungrouped cases into grouped cases? Thanks! Chris [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Finding the column with the maximum value by row
Hi, Let's say I have the following data: a=matrix(c(1,2,4,4,2,1,1,2,4),nrow=3,byrow=T) a [,1] [,2] [,3] [1,]124 [2,]421 [3,]124 What syntax should I use to get R to tell me the column that corresponds to the maximum value for each row? For my example, I would like to get a vector that says 3, 1, 3 because the maximum value for row 1 is in column 3, the maximum value for row 2 is in column 1, and the maximum value for row 3 is in column 3. Does that make sense? Please cc me if you reply! Chris [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Getting predicted values from a zero-inflated negative binomial using zeroinfl()
Hi, I am a little confused at the output from predict() for a zeroinfl object. Here's my confusion: ## From zeroinfl package fm_zinb2 - zeroinfl(art ~ . | ., data = bioChemists, dist = negbin) ## The raw zero-inflated overdispersed data table(bioChemists$art) 0 1 2 3 4 5 6 7 8 9 10 11 12 16 19 275 246 178 84 67 27 17 12 1 2 1 1 2 1 1 ## The default output from predict. It looks like it is doing a horrible job. Does it really predict 7 zeros? table(round(predict(fm_zinb2))) 0 1 2 3 4 5 6 10 7 354 487 45 12 6 3 1 ## The output from predict using count table(round(predict(fm_zinb2,type=count))) 1 2 3 4 5 6 10 312 536 45 12 6 3 1 ## The output from predict using zero, but here it predicts 24 structural zeros? table(round(predict(fm_zinb2,type=zero))) 0 1 891 24 So my question is how do I interpret these different outputs from the zeroinf object? What are the differences? The help page just left me confused. I would expect that table(round(predict(fm_zinb2))) would be E(Y) and would most accurately track table(bioChemists$art) but I am wrong. How can I find the E(Y) that would most closely track the raw data? Please cc me if you reply. Thanks, Chris [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Lambert (1992) simulation
On Fri, Apr 27, 2012 at 1:53 AM, Achim Zeileis achim.zeil...@uibk.ac.atwrote: On Thu, 26 Apr 2012, Christopher Desjardins wrote: Hi, I am trying to replicate Lambert (1992)'s simulation with zero-inflated Poisson models. The citation is here: @article{lambert1992zero, Author = {Lambert, D.}, Journal = {Technometrics}, Pages = {1--14}, Publisher = {JSTOR}, Title = {Zero-inflated {P}oisson regression, with an application to defects in manufacturing}, Year = {1992}} Specifically I am trying to recreate Table 2. But my estimates for Gamma are not working out. Any ideas why? Your implementation of the inverse logit link is wrong, it should be exp(x)/(1 + exp(x)) and not exp(x)/exp(1 + x). In R I never code this by hand but either use qlogis()/plogis() or make.link(logit). Doh! So obvious. I really should have noticed that. Thanks! I'll have a look at the rest of your code too. Cheers, Chris Your setting resulting in an almost constant probability of zero inflation and hence gamma was completely off. Below is my cleaned up code which nicely replicates the results for n = 100. The other two would require additional work because one would need to catch non-convergence etc. hth, Z ## data-generating process dgp - function(nobs = 100) { gamma - c(-1.5, 2) beta - c(1.5, -2) x - seq(0, 1, length.out = nobs) lambda - exp(beta[1] + beta[2] * x) p - plogis(gamma[1] + gamma[2] * x) y - ifelse(runif(nobs) = p, 0, rpois(nobs, lambda = lambda)) data.frame(y = y, x = x) } ## simulation of coefficients and standard errors sim - function(nrep = 2000, ...) { onesim - function(i) { d - dgp(...) m - zeroinfl(y ~ x, data = d) c(coef(m), sqrt(diag(vcov(m } t(sapply(1:nrep, onesim)) } ## run simulation #3 library(pscl) set.seed(1) cfse - sim(nobs = 100) ## mean coefficient estimates apply(cfse[, 1:4], 2, mean) ## median coefficient estimates apply(cfse[, 1:4], 2, median) ## sd of coefficient estimates apply(cfse[, 1:4], 2, sd) ## mean of the standard deviation estimates from observed information apply(cfse[, 5:8], 2, mean) Please cc me on a reply! Thanks, Chris ## ZIP simulations based on Lambert (1992)'s conditions # Empty workspace rm(list=ls()) # Load zero-inflation package library(pscl) # Simulation conditions #3 NumSimulations=2000 X=seq(from=0,to=1,length.out=**N) Model.Matrix=cbind(rep(1,**length(X)),X) Gamma=c(-1.5,2) Beta=c(1.5,-2) P=exp(Model.Matrix%*%Gamma)/**exp(1+Model.Matrix%*%Gamma) Lambda=exp(Model.Matrix%*%**Beta) CoefSimulations=matrix(nrow=**NumSimulations,ncol=2*dim(** Model.Matrix)[2]) for(i in 1 : NumSimulations){ Lambda.Draw=rpois(N,Lambda) U=runif(N) Y=ifelse(U=P,0,Lambda.Draw) CoefSimulations[i,]=coef(**zeroinfl(Y~X|X)) } # What were the estimates? colMeans(CoefSimulations) # My gamma estimates aren't even close ... [[alternative HTML version deleted]] __** R-help@r-project.org mailing list https://stat.ethz.ch/mailman/**listinfo/r-helphttps://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/** posting-guide.html http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Lambert (1992) simulation
Hi, I am trying to replicate Lambert (1992)'s simulation with zero-inflated Poisson models. The citation is here: @article{lambert1992zero, Author = {Lambert, D.}, Journal = {Technometrics}, Pages = {1--14}, Publisher = {JSTOR}, Title = {Zero-inflated {P}oisson regression, with an application to defects in manufacturing}, Year = {1992}} Specifically I am trying to recreate Table 2. But my estimates for Gamma are not working out. Any ideas why? Please cc me on a reply! Thanks, Chris ## ZIP simulations based on Lambert (1992)'s conditions # Empty workspace rm(list=ls()) # Load zero-inflation package library(pscl) # Simulation conditions #3 NumSimulations=2000 X=seq(from=0,to=1,length.out=N) Model.Matrix=cbind(rep(1,length(X)),X) Gamma=c(-1.5,2) Beta=c(1.5,-2) P=exp(Model.Matrix%*%Gamma)/exp(1+Model.Matrix%*%Gamma) Lambda=exp(Model.Matrix%*%Beta) CoefSimulations=matrix(nrow=NumSimulations,ncol=2*dim(Model.Matrix)[2]) for(i in 1 : NumSimulations){ Lambda.Draw=rpois(N,Lambda) U=runif(N) Y=ifelse(U=P,0,Lambda.Draw) CoefSimulations[i,]=coef(zeroinfl(Y~X|X)) } # What were the estimates? colMeans(CoefSimulations) # My gamma estimates aren't even close ... [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] For loops
Hi, I am having trouble with syntax for a for loop. Here is what I am trying to do. class=c(rep(1,3),rep(2,3),rep(3,3)) out1=rnorm(length(class)) out2=rnorm(length(class)) out3=rnorm(length(class)) data=data.frame(class,out1,out2,out3) dat.split=split(data,data$class) for(i in 1:3){ sub[i]=dat.split[i] } However, the for loop doesn't work. I want to assign each split to a different data object. Better yet, how I could assign each class to a separate object and skip the splitting? Thanks, Chris [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Simplifying my code
Hi, I have a pretty simple problem. Here is the code: dat1=complete(dat.mice,1) dat2=complete(dat.mice,2) dat3=complete(dat.mice,3) dat4=complete(dat.mice,4) dat5=complete(dat.mice,5) dat6=complete(dat.mice,6) dat7=complete(dat.mice,7) dat8=complete(dat.mice,8) dat9=complete(dat.mice,9) dat10=complete(dat.mice,10) dat11=complete(dat.mice,11) dat12=complete(dat.mice,12) dat13=complete(dat.mice,13) dat14=complete(dat.mice,14) dat15=complete(dat.mice,15) dat16=complete(dat.mice,16) dat17=complete(dat.mice,17) dat18=complete(dat.mice,18) dat19=complete(dat.mice,19) dat20=complete(dat.mice,20) I would like to simplify this into a for loop. I thought this would work: for(i in 1:20){ dat[i]=complete(dat.mice,[i] } But it doesn't. Any tips? Chris [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] SPSS F-test on change in R square between hierarchical models
Hi, I am wondering if anyone knows how to perform an F-test on the change in R square between hierarchical models in R? SPSS provides this information and a researcher that I am working with is interested in getting this information. Alternatively, if someone knows how I can calculate the test statistic (SPSS calls it F-change?) and dfs that would be helpful as well. The output and the test I am looking for is available here - http://dl.dropbox.com/u/1501309/spss_rsquared.pdf Chris On Wed, Nov 23, 2011 at 6:20 AM, Smart Guy smartgu...@gmail.com wrote: Hi All, I was adding a new row of data to my data frame using rbind(). I was surprised to see that after adding new row, I lost my data frame level attibute as well as col level attribute. Please help me to insert a new row at frist or middle position so that my custom attribute is not lost. Here is what I did. age-c(15,20,18) weight-c(40,42,30) ### creating my data frame mydata - data.frame(age,weight) ### creating data frame level attribute attr(mydata,myattr)-c(myinfo) ### creating col level attribute for 'age' column ### attr(mydata$age,mycolattr)-c(mycolinfo) Checking attributes ### attributes(mydata) attributes(mydata$age) ### creating new row # newrow - data.frame(age=16, weight= 42) Inserting newrow as first row to my data frame mydata- rbind(newrow, mydata) Checking attributes again ### I lost my custom attributes attributes(mydata$age) attributes(mydata) Thanks in advance, -- SG [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Plotting just a portion of a smoother graph in ggplot2
Hi, I am using ggplot2 to with the following code: gmathk2 - qplot(time,math,colour=Kids,data=kids.ach.lm.k5,geom=smooth,method=lm,formula=y~ns(x,1)) + opts(title=Smoother Plot: Math K-5) + xlab(Time) + ylab(Math) + scale_colour_brewer(pal=Set1); gmathk2 This plots all the smoother for all the x values. What I'd like to do is plot the smoother for the x values that are only greater than or equal to 0. I don't want this: gmathk2 - qplot(time,math,colour=Kids,data=kids.ach.lm.k5,geom=smooth,method=lm,formula=y~ns(x,1)) + opts(title=Smoother Plot: Math K-5) + xlab(Time) + ylab(Math) + scale_colour_brewer(pal=Set1) + xlim(0,50); gmathk2 Because adding xlim seems to throw away the data below 0 when calculating the smoother. What I want it to do is have ggplot2 give me the same graph as the first command but just not plot the part of the smoother that is below 0. Thanks, Chris [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Suppressing the labelling of tick marks on ggplot2
Hi, I have the follow ggplot2 code I am running: ggplot(data=bb.res.math,aes(x=factor(id.bb),y=bb.math.comb,fill=BB)) + geom_bar() + facet_grid(BB~.) + scale_fill_brewer(pal=Set1) + ylab(Average Student Residual (Math)) + xlab(Student ID) The number of unique id.bb is 2207 and so my X-axis has a couple of thousands, indistinguishable tick marks that correspond to each id. I am wondering if there is a way to suppress these tick marks? Here is some test data: test id.bb bb.res.m bb.math.comb BB 1 29001 -3.0032235 BB 2 29041 -1.8758770 BB 3 346910.8182852 BB 4 42131 -2.0660215 Not BB 5 42151 -4.8622086 BB 6 439510.8873996 Not BB ggplot(data=test,aes(x=factor(id.bb),y=bb.math.comb,fill=BB)) + geom_bar() + facet_grid(BB~.) + scale_fill_brewer(pal=Set1) + ylab(Average Student Residual (Math)) + xlab(Student ID) Gives the following graph: http://dl.dropbox.com/u/1501309/ggplot2_suppress_ids.pdf I'd like to prevent 2900, 2904, 3469, etc. from showing up on the X-axis. Thanks, Chris [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Confidence bands in ggplot2
Hi, I have the following data: est sch190 sch107 sch290 sch256 sch287 sch130 sch139 4.16656026 2.64306071 4.22579866 6.12024789 4.49624748 11.12799127 1.17353917 sch140 sch282 sch161 sch193 sch156 sch288 sch352 3.48197696 -0.29659410 -1.99194986 10.23489859 7.77342138 6.77624539 9.66795001 sch368 sch225 sch301 sch105 sch353 sch291 sch179 7.20229569 4.41989204 5.61586860 5.99460203 -2.65019242 -9.42614560 -0.25874193 sch134 sch135 sch324 sch360 bb1 3.26432479 10.52555091 -0.09637968 2.49668858 -3.24173545 se sch190sch107sch290sch256sch287sch130sch139sch140 3.165127 3.710750 4.680911 6.335386 3.896302 4.907679 4.426284 4.266303 sch282sch161sch193sch156sch288sch352sch368sch225 3.303747 4.550193 3.995261 5.787374 5.017278 7.820763 7.253183 4.483988 sch301sch105sch353sch291sch179sch134sch135sch324 4.076570 7.564359 10.456522 5.705474 4.247927 5.671536 10.567093 4.138356 sch360 bb1 4.943779 1.935142 sch [1] 190 107 290 256 287 130 139 140 282 161 193 156 288 [14] 352 368 225 301 105 353 291 179 134 135 324 360 BB From this data I have created 95% confidence intervals assuming a normal distribution. lower.95ci - est - se*qnorm(.975) upper.95ci - est + se*qnorm(.975) What I'd like to do is plot the estimate (est) and have lines attach to the points located in lower.95ci and upper.95ci. Presently I am doing the following: qplot(x=as.factor(sch),y=lower.95ci) + geom_point(aes(x=as.factor(sch),y=upper.95ci),colour=black) + geom_point(aes(x=as.factor(sch), y=est),colour=red) + ylab(Value-Added) + xlab(School) + theme_bw() Which creates this graph --- http://dl.dropbox.com/u/1501309/value_added_test.pdf That's fine except that it doesn't connect the points vertically. Does anyone know how I could make the 'black' points connect to the 'red' point, i.e. show confidence bands? Thanks, Chris [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Confidence bands in ggplot2
Thanks that worked perfectly. One thing if I may. Is it possible to make the center dot red and the lines connecting the dots black? Thanks, Chris On Jul 7, 2011, at 5:10 PM, Abhijit Dasgupta, PhD wrote: You can easily do this by: qplot(x=as.factor(sch),y=est, geom='point', colour='red') + geom_pointrange(aes(x=as.factor(sch), y=est, ymin=lower.95ci, ymax=upper.95ci))+ xlab('School') + ylab(Value-added)+theme_bw() On 07/07/2011 05:55 PM, Christopher Desjardins wrote: Hi, I have the following data: est sch190 sch107 sch290 sch256 sch287 sch130 sch139 4.16656026 2.64306071 4.22579866 6.12024789 4.49624748 11.12799127 1.17353917 sch140 sch282 sch161 sch193 sch156 sch288 sch352 3.48197696 -0.29659410 -1.99194986 10.23489859 7.77342138 6.77624539 9.66795001 sch368 sch225 sch301 sch105 sch353 sch291 sch179 7.20229569 4.41989204 5.61586860 5.99460203 -2.65019242 -9.42614560 -0.25874193 sch134 sch135 sch324 sch360 bb1 3.26432479 10.52555091 -0.09637968 2.49668858 -3.24173545 se sch190sch107sch290sch256sch287sch130sch139 sch140 3.165127 3.710750 4.680911 6.335386 3.896302 4.907679 4.426284 4.266303 sch282sch161sch193sch156sch288sch352sch368 sch225 3.303747 4.550193 3.995261 5.787374 5.017278 7.820763 7.253183 4.483988 sch301sch105sch353sch291sch179sch134sch135 sch324 4.076570 7.564359 10.456522 5.705474 4.247927 5.671536 10.567093 4.138356 sch360 bb1 4.943779 1.935142 sch [1] 190 107 290 256 287 130 139 140 282 161 193 156 288 [14] 352 368 225 301 105 353 291 179 134 135 324 360 BB From this data I have created 95% confidence intervals assuming a normal distribution. lower.95ci- est - se*qnorm(.975) upper.95ci- est + se*qnorm(.975) What I'd like to do is plot the estimate (est) and have lines attach to the points located in lower.95ci and upper.95ci. Presently I am doing the following: qplot(x=as.factor(sch),y=lower.95ci) + geom_point(aes(x=as.factor(sch),y=upper.95ci),colour=black) + geom_point(aes(x=as.factor(sch), y=est),colour=red) + ylab(Value-Added) + xlab(School) + theme_bw() Which creates this graph --- http://dl.dropbox.com/u/1501309/value_added_test.pdf That's fine except that it doesn't connect the points vertically. Does anyone know how I could make the 'black' points connect to the 'red' point, i.e. show confidence bands? Thanks, Chris [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Recoding several variables into one use the most recent data
Hi, I have the following data management issue. I am trying to combine multiple years of ethnicity data into one variable called ethnic. The data looks similar to the following idethnic07ethnic08 ethnic09ethnic10 1 1 1 11 2 1 1 22 3 3 4 4NA 4 2 3 NANA So, what I'd like to do is create a variable called 'ethnic' and I'd like to have this variable be filled with the most recent data available. So ethnic10 would have the highest priority, then ethnic09, followed by ethnic08, and finally ethnic07. So the ethnic variable based on the data above would look like the following: ethnic 1 2 4 3 I thought an ifelse() statement might work but I seem to be writing over my data every time I do this. Thanks, Chris [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Change pattern in histograms in ggplot2
Hi, I am wondering if there is a way to change the pattern of the fill in histogram in ggplot2? By default the fill is solid and I'd like to add some sort of pattern to make it more visible that these are different levels of a factor. Thanks! Chris [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Changing order of facet grid in ggplot2
Hi I am running the following code: sym - c(sym1,sym2,sym4) lifedxm - c(O-BD,O-WELL,O-UNI) life - c(lifedxm,lifedxm,lifedxm) tp - c(TP-ANY,TP-ANY, TP-ANY, TP-SUB, TP-SUB, TP-SUB, TP-CLIN , TP-CLIN, TP-CLIN) data - data.frame(sym,life,tp) qplot(life,geom=bar,weight=sym,ylim=c(0,1),legend=F,data=data) + facet_grid(. ~ tp) This creates a facet grid where TP-ANY is followed by TP-CLIN and then TP-SUB. I'd like to create a grid where TP-ANY is followed by TP-SUB then TP-CLIN. Is this possible? Thanks, Chris [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Getting number of students with zeroes in long format
Hi Jorge, I want to make sure this does what I want. So I want to get a count of students that never get a suspension. Once a student has a non-zero I don't want to count that student. Each id_r is may be associated with multiple sus. Are these commands doing this? Because ... suslm[175953:nrow(suslm),c(id_r,sus)] id_r sus 999881.5 999881 1 999881.6 999881 7 999881.7 999881 0 999881.8 999881 0 999886.5 999886 0 999886.6 999886 0 999886.7 999886 0 999886.8 999886 0 999890.5 999890 0 999890.6 999890 0 999890.7 999890 0 999890.8 999890 0 999892.5 999892 0 999892.6 999892 0 999892.7 999892 0 999892.8 999892 0 999896.5 999896 0 999896.6 999896 4 999896.7 999896 3 999896.8 999896 0 999897.5 999897 0 999897.6 999897 0 999897.7 999897 0 tail(with(suslm,tapply(sus,id_r,function(x) any(x==0 999881 999886 999890 999892 999896 999897 TRUE TRUE TRUE TRUE TRUE TRUE r - with(suslm, tapply(sus, id_r, function(x) any(x 0)) tail(with(suslm, tapply(sus, id_r, function(x) any(x 0 999881 999886 999890 999892 999896 999897 TRUE FALSE FALSE FALSE TRUE FALSE Based on this 999881 and 999896 should be FALSE not TRUE I would expect if they were true for the first command they should be false for the second command right? tail(names(r[ r == TRUE ])) [1] 999752 999767 999806 999807 999881 999896 tail(names(r[ r == FALSE ])) [1] 999869 999870 999886 999890 999892 999897 This command seems to do the right thing. Is that right? On Wed, Apr 6, 2011 at 10:25 PM, Jorge Ivan Velez jorgeivanve...@gmail.comwrote: Hi Chris, Sorry I did not see your email before ;-) Here is one option: r - with(d, tapply(sus, id_r, function(x) any(x 0))) r 111516181920212224252630 3132 FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE 33 FALSE names(r[ r == TRUE ]) [1] 15 25 Regards, Jorge On Wed, Apr 6, 2011 at 5:03 PM, Christopher Desjardins wrote: Thanks. And how many could I find that have greater than 0? Chris On Wed, Apr 6, 2011 at 3:58 PM, Jorge Ivan Velez wrote: Hi Chris, Is this what you have in mind? sum(with(yourdata, tapply(sus, id_r, function(x) any(x==0 [1] 13 HTH, Jorge On Wed, Apr 6, 2011 at 4:44 PM, Christopher Desjardins wrote: Hi, I have longitudinal school suspension data on students. I would like to figure out how many students (id_r) have no suspensions (sus), i.e. have a code of '0'. My data is in long format and the first 20 records look like the following: suslm[1:20,c(1,7)] id_r sus 11 0 15 10 16 0 18 0 19 0 19 0 20 0 21 0 21 0 22 0 24 0 24 0 25 3 26 0 26 0 30 0 30 0 31 0 32 0 33 0 Each id_r is unique and I'd like to know the number of id_r that have a 0 for sus not the total number of 0. Does that make sense? Thanks! Chris [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Getting number of students with zeroes in long format
On Thu, Apr 7, 2011 at 8:07 AM, Christopher Desjardins cddesjard...@gmail.com wrote: Hi Jorge, I want to make sure this does what I want. So I want to get a count of students that never get a suspension. Once a student has a non-zero I don't want to count that student. Each id_r is may be associated with multiple sus. Are these commands doing this? Because ... Also sus = suspension suslm[175953:nrow(suslm),c(id_r,sus)] id_r sus 999881.5 999881 1 999881.6 999881 7 999881.7 999881 0 999881.8 999881 0 999886.5 999886 0 999886.6 999886 0 999886.7 999886 0 999886.8 999886 0 999890.5 999890 0 999890.6 999890 0 999890.7 999890 0 999890.8 999890 0 999892.5 999892 0 999892.6 999892 0 999892.7 999892 0 999892.8 999892 0 999896.5 999896 0 999896.6 999896 4 999896.7 999896 3 999896.8 999896 0 999897.5 999897 0 999897.6 999897 0 999897.7 999897 0 tail(with(suslm,tapply(sus,id_r,function(x) any(x==0 999881 999886 999890 999892 999896 999897 TRUE TRUE TRUE TRUE TRUE TRUE r - with(suslm, tapply(sus, id_r, function(x) any(x 0)) tail(with(suslm, tapply(sus, id_r, function(x) any(x 0 999881 999886 999890 999892 999896 999897 TRUE FALSE FALSE FALSE TRUE FALSE Based on this 999881 and 999896 should be FALSE not TRUE for tail(with(suslm,tapply(sus,id_r,function(x) any(x==0 I would expect if they were true for the first command they should be false for the second command right? tail(names(r[ r == TRUE ])) [1] 999752 999767 999806 999807 999881 999896 tail(names(r[ r == FALSE ])) [1] 999869 999870 999886 999890 999892 999897 This command seems to do the right thing. Is that right? On Wed, Apr 6, 2011 at 10:25 PM, Jorge Ivan Velez jorgeivanve...@gmail.com wrote: Hi Chris, Sorry I did not see your email before ;-) Here is one option: r - with(d, tapply(sus, id_r, function(x) any(x 0))) r 111516181920212224252630 3132 FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE 33 FALSE names(r[ r == TRUE ]) [1] 15 25 Regards, Jorge On Wed, Apr 6, 2011 at 5:03 PM, Christopher Desjardins wrote: Thanks. And how many could I find that have greater than 0? Chris On Wed, Apr 6, 2011 at 3:58 PM, Jorge Ivan Velez wrote: Hi Chris, Is this what you have in mind? sum(with(yourdata, tapply(sus, id_r, function(x) any(x==0 [1] 13 HTH, Jorge On Wed, Apr 6, 2011 at 4:44 PM, Christopher Desjardins wrote: Hi, I have longitudinal school suspension data on students. I would like to figure out how many students (id_r) have no suspensions (sus), i.e. have a code of '0'. My data is in long format and the first 20 records look like the following: suslm[1:20,c(1,7)] id_r sus 11 0 15 10 16 0 18 0 19 0 19 0 20 0 21 0 21 0 22 0 24 0 24 0 25 3 26 0 26 0 30 0 30 0 31 0 32 0 33 0 Each id_r is unique and I'd like to know the number of id_r that have a 0 for sus not the total number of 0. Does that make sense? Thanks! Chris [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Getting number of students with zeroes in long format
Hi, I have longitudinal school suspension data on students. I would like to figure out how many students (id_r) have no suspensions (sus), i.e. have a code of '0'. My data is in long format and the first 20 records look like the following: suslm[1:20,c(1,7)] id_r sus 11 0 15 10 16 0 18 0 19 0 19 0 20 0 21 0 21 0 22 0 24 0 24 0 25 3 26 0 26 0 30 0 30 0 31 0 32 0 33 0 Each id_r is unique and I'd like to know the number of id_r that have a 0 for sus not the total number of 0. Does that make sense? Thanks! Chris [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Getting number of students with zeroes in long format
On Wed, Apr 6, 2011 at 4:03 PM, Douglas Bates ba...@stat.wisc.edu wrote: On Wed, Apr 6, 2011 at 3:44 PM, Christopher Desjardins cddesjard...@gmail.com wrote: Hi, I have longitudinal school suspension data on students. I would like to figure out how many students (id_r) have no suspensions (sus), i.e. have a code of '0'. My data is in long format and the first 20 records look like the following: suslm[1:20,c(1,7)] id_r sus 11 0 15 10 16 0 18 0 19 0 19 0 20 0 21 0 21 0 22 0 24 0 24 0 25 3 26 0 26 0 30 0 30 0 31 0 32 0 33 0 Each id_r is unique and I'd like to know the number of id_r that have a 0 for sus not the total number of 0. Does that make sense? You say you have longitudinal data so may we assum that a particular id_r can occur multiple times in the data set? Yes an id_r can occur multiple times in the data set. It is not clear to me what you want the result to be for students who have no suspensions at one time but may have a suspension at another time. Are you interested in the number of students who have only zeros in the sus column? Yes. Once a student has a value other than zero I don't want to include that student in the tally. So I want to know how many students never got suspended during the study. One way to approach this task is to use tapply. I would create a data frame and convert id_r to a factor. df - within(as.data.frame(suslm), id_r - factor(id_r)) counts - with(df, lapply(sus, id_r, function(sus) all(sus == 0))) I am getting the following message: df - within(as.data.frame(suslm), id_r - factor(id_r)) counts - with(df, lapply(sus, id_r, function(sus) all(sus == 0))) Error in get(as.character(FUN), mode = function, envir = envir) : object 'id_r' of mode 'function' was not found Thanks, Chris The tapply function will split the vector sus according to the levels of id_r and apply the function to the subvectors. I just say Jorge's response and he uses the same tactic but he is looking for students who had any value of sus==0 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Quick recode of -999 to NA in R
Hi, I am trying to write a loop to recode my data from -999 to NA in R. What's the most efficient way to do this? Below is what I'm presently doing, which is inefficient. Thanks, Chris dat0 - read.table(time1.dat) colnames(dat0) - c(e1dq, e1arcp, e1dev, s1prcp, s1nrcp, s1ints, a1gpar, a1pias, a1devt) dat0[dat0$e1dq==-999.,e1dq] - NA dat0[dat0$e1arcp==-999.,e1arcp] - NA dat0[dat0$e1dev==-999.,e1dev] - NA dat0[dat0$s1prcp==-999.,s1prcp] - NA dat0[dat0$s1nrcp==-999.,s1nrcp] - NA dat0[dat0$s1ints==-999.,s1ints] - NA dat0[dat0$a1gpar==-999.,a1gpar] - NA dat0[dat0$a1pias==-999.,a1pias] - NA dat0[dat0$a1devt==-999.,a1devt] - NA [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Quick recode of -999 to NA in R
Ah ... yes. I knew that but clearly didn't at the time of my question or script writing. Thanks, Chris On Wed, Mar 30, 2011 at 8:22 AM, Henrique Dallazuanna www...@gmail.comwrote: Try: dat0 - read.table('tim1.dat', na = -999) Ah ... yes. I knew that but clearly didn't at the time of my question or script writing. Thanks, Chris On Wed, Mar 30, 2011 at 10:15 AM, Christopher Desjardins cddesjard...@gmail.com wrote: Hi, I am trying to write a loop to recode my data from -999 to NA in R. What's the most efficient way to do this? Below is what I'm presently doing, which is inefficient. Thanks, Chris dat0 - read.table(time1.dat) colnames(dat0) - c(e1dq, e1arcp, e1dev, s1prcp, s1nrcp, s1ints, a1gpar, a1pias, a1devt) dat0[dat0$e1dq==-999.,e1dq] - NA dat0[dat0$e1arcp==-999.,e1arcp] - NA dat0[dat0$e1dev==-999.,e1dev] - NA dat0[dat0$s1prcp==-999.,s1prcp] - NA dat0[dat0$s1nrcp==-999.,s1nrcp] - NA dat0[dat0$s1ints==-999.,s1ints] - NA dat0[dat0$a1gpar==-999.,a1gpar] - NA dat0[dat0$a1pias==-999.,a1pias] - NA dat0[dat0$a1devt==-999.,a1devt] - NA [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Quick recode of -999 to NA in R
On Wed, Mar 30, 2011 at 10:51 AM, peter dalgaard pda...@gmail.com wrote: On Mar 30, 2011, at 16:05 , Christopher Desjardins wrote: dat0 - read.table('tim1.dat', na = -999) Ah ... yes. I knew that but clearly didn't at the time of my question or script writing. Thanks, Chris Depending on where your data came from, you could get caught by the fact that the above is really ...na.strings=-999... and that is not going to work if the actual code is (say) -999.00. Another straightforward option is dat0[dat0 == -999] - NA Thanks Peter. That way does work well too and it does what I need. Chris -- Peter Dalgaard Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Creating 3 vectors that sum to 1
I have 3 vectors: p1, p2, and p3. I would like each vector to be any possible value between 0 and 1 and p1 + p2 + p3 = 1. I want to graph these and I've thought about using scatterplot3d(). Here's what I have so far. library(scatterplot3d) p1 - c(1,0,0,.5,.5,0,.5,.25,.25,.34,.33,.33,.8,.1,.1,.9,.05,.05) p2 - c(0,1,0,.5,0,.5,.25,.5,.25,.33,.34,.33,.1,.8,.1,.05,.9,.05) p3 - c(0,0,1,0,.5,.5,.25,.25,.5,.33,.33,.34,.1,.1,.8,.05,.05,.9) scatterplot3d(p1,p2,p3) However, I wonder if there is an easy way to create vectors p1, p2, and p3. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Creating 3 vectors that sum to 1
Thanks several of these do exactly what I want. Chris On Tue, Mar 29, 2011 at 2:42 PM, Greg Snow greg.s...@imail.org wrote: Or we could expand a bit more: require(TeachingDemos) require(gtools) n - 1000 rtrg - matrix(NA, n, 3) for (i in 1:n) rtrg[i,] - diff(c(0, sort(runif(2)), 1)) rtrg2 - matrix(NA, n, 3) for (i in 1:n) { tmp - runif(3) rtrg2[i, ] - tmp/sum(tmp) } rtrg3 - matrix( rexp(n*3), ncol=3 ) rtrg3 - rtrg3/rowSums(rtrg3) rtrg4 - rdirichlet(n, rep(1,3)) par(mfrow=c(2,2)) triplot(rtrg, pch='.') # Looks more uniformly distributed triplot(rtrg2, col=2, pch='.') # Corners are sparsely populated triplot(rtrg3, col=3, pch='.') triplot(rtrg4, col=4, pch='.') What could also be interesting in using vis.test (also TeachingDemos) to see which can be told apart from each other. My guess is that rtrg2 method will be visible different from the other 3, but the other 3 will be indistinguishable from each other. The last 2 have the advantage (the 2nd could be rewritten to have the same advantage) of being much quicker, not sure how to speed up the 1st noticibly. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: Ravi Varadhan [mailto:rvarad...@jhmi.edu] Sent: Tuesday, March 29, 2011 12:59 PM To: Ravi Varadhan Cc: Greg Snow; r-help@r-project.org Subject: Re: [R] Creating 3 vectors that sum to 1 Here is an exploration of two different 3-tuple generators (that sum to 1) based on Greg's triplot function: require(TeachingDemos) n - 1000 rtrg - matrix(NA, n, 3) for (i in 1:n) rtrg[i,] - diff(c(0, sort(runif(2)), 1)) rtrg2 - matrix(NA, n, 3) for (i in 1:n) { tmp - runif(3) rtrg2[i, ] - tmp/sum(tmp) } par(mfrow=c(2,1)) triplot(rtrg) # Looks more uniformly distributed triplot(rtrg2, col=2) # Corners are sparsely populated Ravi. Ravi Varadhan, Ph.D. Assistant Professor, Division of Geriatric Medicine and Gerontology School of Medicine Johns Hopkins University Ph. (410) 502-2619 email: rvarad...@jhmi.edu - Original Message - From: Ravi Varadhan rvarad...@jhmi.edu Date: Tuesday, March 29, 2011 2:33 pm Subject: Re: [R] Creating 3 vectors that sum to 1 To: Greg Snow greg.s...@imail.org Cc: r-help@r-project.org r-help@r-project.org The following one-liner generates uniformly distributed 3-tuples that sum to 1: diff(c(0, sort(runif(2)), 1)) More, generally you can generate n-tuples that sum to unity as: diff(c(0, sort(runif(n-1)), 1)) Ravi. Ravi Varadhan, Ph.D. Assistant Professor, Division of Geriatric Medicine and Gerontology School of Medicine Johns Hopkins University Ph. (410) 502-2619 email: rvarad...@jhmi.edu - Original Message - From: Greg Snow greg.s...@imail.org Date: Tuesday, March 29, 2011 1:02 pm Subject: Re: [R] Creating 3 vectors that sum to 1 To: Christopher Desjardins cddesjard...@gmail.com, r-help@r-project.org r-help@r-project.org Do a search for Dirichlet, that may give you the tools you need. Also for plotting 3 vectors that sum to 1, instead of a 3d scatter plot you should look into a triangle or trilinear plot, see ?triplot in the TeachingDemos package (the see also for that help page lists several other implementations in other packages as well). -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [ project.org] On Behalf Of Christopher Desjardins Sent: Tuesday, March 29, 2011 10:20 AM To: r-help@r-project.org Subject: [R] Creating 3 vectors that sum to 1 I have 3 vectors: p1, p2, and p3. I would like each vector to be any possible value between 0 and 1 and p1 + p2 + p3 = 1. I want to graph these and I've thought about using scatterplot3d(). Here's what I have so far. library(scatterplot3d) p1 - c(1,0,0,.5,.5,0,.5,.25,.25,.34,.33,.33,.8,.1,.1,.9,.05,.05) p2 - c(0,1,0,.5,0,.5,.25,.5,.25,.33,.34,.33,.1,.8,.1,.05,.9,.05) p3 - c(0,0,1,0,.5,.5,.25,.25,.5,.33,.33,.34,.1,.1,.8,.05,.05,.9) scatterplot3d(p1,p2,p3) However, I wonder if there is an easy way to create vectors p1, p2, and p3. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list PLEASE do read the posting guide guide.html and provide commented, minimal, self-contained, reproducible code
[R] Sweave and Textwrangler
Hi, I am trying to get TextWrangler to work with LaTeX and Sweave. Ideally I would call a script from TextWrangler that would run Sweave on a document, then LaTeX (using SyncTeX), and finally open the corresponding pdf in Skim. Of course I don't always need to run Sweave and would be looking for the option of LaTeX - Open. I found this http://www.xs4all.nl/~msneep/latex/ but it doesn't work for me as it gives me an error. Something similar to do that would be ideal. Additionally the option to use BibTeX would be great too. Additionally, I tried this http://www.stat.umn.edu/~arendahl/computing/index.html and Sweave.sh script and trying to alter that (and the *.engine files) to work with TextWrangler but to no avail. Does anyone do this and have a tip they could give me? Sorry if this is off-topic (if so please reply to me off the list). Thanks! Chris [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] RWinEdt and WinEdt 6.0
Hi, I am curious if the status of RWinEdt and WinEdt 6.0 has changed since this thread http://r.789695.n4.nabble.com/RWinEdt-in-WinEdt-6-td2174285.html Thanks for the help (especially Uwe Ligges for writing RWinEdt). Chris [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Calling Gnuplot from R
Hi, I am wondering if there is a way to call Gnuplot from R and/or if anyone can recommend a package on CRAN capable of doing this? Thanks, Chris PS - Please cc me on the response. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] combining 4 vectors into one
Hi, How can I perform the following y1y2y3y4 1 0 3 2 0 1 2 1 3 2 0 1 0 5 1 0 into ... y 1 0 3 2 0 1 2 1 3 2 0 1 0 5 1 0 Please cc me on reply as I subscribe to the digest. Thanks! Chris __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] combining 4 vectors into one
Thanks. That worked. Chris On 7/19/09 6:41 PM, jim holtman wrote: Try this: x y1 y2 y3 y4 1 1 0 3 2 2 0 1 2 1 3 3 2 0 1 4 0 5 1 0 as.vector(t(x)) [1] 1 0 3 2 0 1 2 1 3 2 0 1 0 5 1 0 On Sun, Jul 19, 2009 at 7:25 PM, Christopher Desjardinscddesjard...@gmail.com wrote: Hi, How can I perform the following y1y2y3y4 1 0 3 2 0 1 2 1 3 2 0 1 0 5 1 0 into ... y 1 0 3 2 0 1 2 1 3 2 0 1 0 5 1 0 Please cc me on reply as I subscribe to the digest. Thanks! Chris __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Error with r2winbugs
Hi, I am trying to do run the following model saved in C:/bugs/sus.bug model { for (i in 1:n){ y[i] ~ dpois(lamdba[i]) log(lambda[i]) - mu+bmale[male[i]]+bschn[schn[i]]+epsilon[i] # epsilon[i] ~ dnorm(0,tau.epsilon) } mu ~ dnorm(0,.0001) bmale ~ dnorm(0,.0001) tau.epsilon - pow(sigma.epsilon, -2) sigma.epsilon ~ dunif(0,100) for (j in 1:n.schn){ bschn[j] ~ dnorm(0,tau.schn) } tau.schn - pow(sigma.schn, -2) sigma.schn ~ dunif(0,100) } I am running the following syntax: enroll.wide.m - na.omit(enroll.wide) n - nrow(enroll.wide.m) y - enroll.wide.m$sus.5 sch - as.vector(enroll.wide.m$schno.5) uniq - unique(sch) n.schn - length(uniq) male - enroll.wide.m$male schn - enroll.wide.m$schno.5 data - list (n,n.schn,y,male,schn) inits - function() {list (bschn=rnorm(n.schn), mu=rnorm(1,0,100), bmale=rnorm(1,0,100), sigma.epsilon=runif(1,0,100),sigma.schn=runif(1,0,100))} parameters - c(bschn,mu,bmale,sigma.epsilon,sigma.schn) sus.sim - bugs(data,inits,parameters,C:/bugs/susc.bug,n.chains=3,n.iter=10,debug=TRUE) And get the following error: display(log) check(C:\DOCUME~1\CHRIST~1\LOCALS~1\Temp\Rtmpyg7eDj/susc.bug.txt) empty slot not allowed in variable name data(C:/DOCUME~1/CHRIST~1/LOCALS~1/Temp/Rtmpyg7eDj/data.txt) command #Bugs:data cannot be executed (is greyed out) compile(3) inits(1,C:/DOCUME~1/CHRIST~1/LOCALS~1/Temp/Rtmpyg7eDj/inits1.txt) command #Bugs:inits cannot be executed (is greyed out) inits(2,C:/DOCUME~1/CHRIST~1/LOCALS~1/Temp/Rtmpyg7eDj/inits2.txt) command #Bugs:inits cannot be executed (is greyed out) inits(3,C:/DOCUME~1/CHRIST~1/LOCALS~1/Temp/Rtmpyg7eDj/inits3.txt) command #Bugs:inits cannot be executed (is greyed out) gen.inits() command #Bugs:gen.inits cannot be executed (is greyed out) thin.updater(1) update(5) command #Bugs:update cannot be executed (is greyed out) set(bschn) command #Bugs:set cannot be executed (is greyed out) set(mu) command #Bugs:set cannot be executed (is greyed out) set(bmale) command #Bugs:set cannot be executed (is greyed out) set(sigma.epsilon) command #Bugs:set cannot be executed (is greyed out) set(sigma.schn) command #Bugs:set cannot be executed (is greyed out) set(deviance) command #Bugs:set cannot be executed (is greyed out) dic.set() command #Bugs:dic.set cannot be executed (is greyed out) update(5) command #Bugs:update cannot be executed (is greyed out) coda(*,C:/DOCUME~1/CHRIST~1/LOCALS~1/Temp/Rtmpyg7eDj/coda) command #Bugs:coda cannot be executed (is greyed out) stats(*) command #Bugs:stats cannot be executed (is greyed out) dic.stats() DIC history(*,C:/DOCUME~1/CHRIST~1/LOCALS~1/Temp/Rtmpyg7eDj/history.odc) command #Bugs:history cannot be executed (is greyed out) save(C:/DOCUME~1/CHRIST~1/LOCALS~1/Temp/Rtmpyg7eDj/log.odc) save(C:/DOCUME~1/CHRIST~1/LOCALS~1/Temp/Rtmpyg7eDj/log.txt) Any help would be greatly appreciate. Also please cc me as I am digest subscriber. Chris __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] How to combine two data frames by ID
Hi, I am trying to combine two data frames by ID. The first data frame is the whole data set and the second data frame is a subset of the first. What I would like to do is take the values from variable, p1, from the second data frame and merge them back into that variable in the first data frame. I have ID variables that match for both data frames and the data frames obviously have differing number of rows where the first data frame second data frame. I want to keep the pre existing data for this variable that is present in the first data frame and only write over the data that I edited in the second. Thanks and please cc me as I am a digest subscriber, Chris __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] ggplot2 x axis question
Hi Hadley, Thanks for the reply and the great graphing package. That code is giving me the following error: qplot(reorder(model,delta),delta,data=growthm.bic) Error in UseMethod(reorder) : no applicable method for reorder Cheers, Chris On 6/28/09 8:21 PM, hadley wickham wrote: Hi Chris, Try this: qplot(reorder(model, delta), delta, data = growthm.bic) Hadley On Sun, Jun 28, 2009 at 9:53 AM, Christopher Desjardinscddesjard...@gmail.com wrote: Hi, I have 45 models that I have named: 1, 2, 3, ... , 45 and I am trying to plot them in order of ascending BIC values. I am however unclear as to how I can get the models to line up on the x-axis by BIC and not by numeric order. For example, if model 5 has a lower BIC than 1, I want it to be the first point on the left hand side of the curve. This seems to work in plot: plot(1:45, growthm.bic$delta, type=b, xaxt = n, xlab=Model, ylab=expression(Delta[k])) # where growthm.bic$delta is my BIC value axis(1, at=1:45, labels=growthm.bic$Model) #where model is the name of the 45 models examined, i.e 1:45 Currently using qplot I have this which doesn't not work as it arranges the BIC values in order from 1:45. qplot(model,delta,data=growthm.bic) Thanks. Also please cc me as I am a digest subscriber. Chris __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] ggplot2 x axis question
Thanks Hadley that worked. Chris On 6/29/09 11:05 AM, hadley wickham wrote: In that case, try: qplot(reorder(factor(model),delta),delta,data=growthm.bic) Deepayan: do you think there should also be a numeric method for reorder? Hadley On Mon, Jun 29, 2009 at 10:39 AM, Christopher Desjardinscddesjard...@gmail.com wrote: Hi Hadley, Thanks for the reply and the great graphing package. That code is giving me the following error: qplot(reorder(model,delta),delta,data=growthm.bic) Error in UseMethod(reorder) : no applicable method for reorder Cheers, Chris On 6/28/09 8:21 PM, hadley wickham wrote: Hi Chris, Try this: qplot(reorder(model, delta), delta, data = growthm.bic) Hadley On Sun, Jun 28, 2009 at 9:53 AM, Christopher Desjardinscddesjard...@gmail.com wrote: Hi, I have 45 models that I have named: 1, 2, 3, ... , 45 and I am trying to plot them in order of ascending BIC values. I am however unclear as to how I can get the models to line up on the x-axis by BIC and not by numeric order. For example, if model 5 has a lower BIC than 1, I want it to be the first point on the left hand side of the curve. This seems to work in plot: plot(1:45, growthm.bic$delta, type=b, xaxt = n, xlab=Model, ylab=expression(Delta[k])) # where growthm.bic$delta is my BIC value axis(1, at=1:45, labels=growthm.bic$Model) #where model is the name of the 45 models examined, i.e 1:45 Currently using qplot I have this which doesn't not work as it arranges the BIC values in order from 1:45. qplot(model,delta,data=growthm.bic) Thanks. Also please cc me as I am a digest subscriber. Chris __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] ggplot2 x axis question
Hi, I have 45 models that I have named: 1, 2, 3, ... , 45 and I am trying to plot them in order of ascending BIC values. I am however unclear as to how I can get the models to line up on the x-axis by BIC and not by numeric order. For example, if model 5 has a lower BIC than 1, I want it to be the first point on the left hand side of the curve. This seems to work in plot: plot(1:45, growthm.bic$delta, type=b, xaxt = n, xlab=Model, ylab=expression(Delta[k])) # where growthm.bic$delta is my BIC value axis(1, at=1:45, labels=growthm.bic$Model) #where model is the name of the 45 models examined, i.e 1:45 Currently using qplot I have this which doesn't not work as it arranges the BIC values in order from 1:45. qplot(model,delta,data=growthm.bic) Thanks. Also please cc me as I am a digest subscriber. Chris __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Changing gird marks in ggplot2
Hi, When I zoom into a graph created in ggplot2 with the coord_cartesian(ylim=c(0,5)) option, I have no values labelled on my y-axis. For this graph ggplot2 only puts labels the y-axis at intervals of 10 (i.e. 0, 10, 20, ...). However, the major portion of the graph I am interested in is located between the values of 0 and 5 on the y-axis (thus why I am zoooming). How can I coerce ggplot2 into making the major gird marks so that 0, 1, 2, 3, 4, and 5 are shown as it is currently showing no label? Thanks. Also please cc me directly as I'm a digest subscriber. Chris [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.