Re: [R] Good pointers for understanding the R language implementation

2016-04-06 Thread Christopher Desjardins
This might be useful: http://adv-r.had.co.nz/

On Tue, Apr 5, 2016 at 10:31 AM, Francisco Banha  wrote:

> Dear All,
>
> I'm currently working on a project with
> the purpose of remotely executing R code, which requires me to have to
> work with the code of R itself. I've searched the Internet for good
> information that will help me understand how R is implemented but what
> I've got so far isn't detailed enough.
> I've looked specifically at
> CRAN's manuals on the official website but they only address this issue
> briefly. I've also looked at other contents online but so far nothing
> has turned up that has the level of detail that I need to properly
> understand the inner workings of R.
> For example, I need to understand
>  how exactly an expression is parsed and evaluated, because I will need
> to intervene in the process to decide whether to execute it remotely or
> not.
> Does anyone know of good pointers that would help me understand this?
> Thanks for any help!
>
> Best regards,
> Francisco
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] reduced set of alternatives in package mlogit

2016-04-01 Thread Christopher Desjardins
Hi Jose,

You're referring to your response variable when you're saying it's missing
some of the choices, right? Are your response choices ever known or do they
just occur with extremely low frequency? Either way, I think the mlogit
package would be inappropriate for you. I imagine you would have much
better luck using MCMCpack or writing a model with rstan or something
Bayesian. Unless I'm missing something.

Chris



On Fri, Apr 1, 2016 at 8:40 AM, Jose Marcos Ferraro <
jose.ferr...@logiteng.com> wrote:

> -Original Message-
> From: Bert Gunter [mailto:bgunter.4...@gmail.com]
> Sent: quinta-feira, 31 de março de 2016 20:22
> To: Jose Marcos Ferraro 
> Cc: r-help@r-project.org
> Subject: Re: [R] reduced set of alternatives in package mlogit
>
> code? example data?  We can only guess based on your vague post.
>
> "PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code."
>
> Moreover, this sounds like a statistical question, not a question about R
> programming, and so might be more appropriate for a statistical list like
> stats.stackexchange.com  .
>
> Cheers,
> Bert
>
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Thu, Mar 31, 2016 at 2:51 PM, Jose Marcos Ferraro <
> jose.ferr...@logiteng.com> wrote:
> > I'm trying to estimate a multinomial logit model  but in some choices
> only alternatives from a subset of all possible alternatives can be chosen.
> > At the moment I get around it by creating "dummy" variables to mean the
> alternative is not available and let it estimate this coefficient as highly
> negative. Is there a better way to do it?
> >
>
> Sorry if I was not clear enough, but  there is hardly any code to show.
> The problem is that a parameter or function is lacking (or , mostly
> likely, I can't find it), so in some sense the problem itself is that
> there is no code to show.
>
> In what follows choice situations , alternatives, wide, and variables have
> the same meaning that they have on the mlogit documentation. All variables
> are alternative specific.
>
> 1)I want to estimate a multinomial Logit  using the mlogit package
>
> 2)I have a dataset, made of choice situations
>
> 3)There is a set of alternatives
>
> 4)in some choice situations, not all alternatives were available, but only
> a subset of them. So there are no variables for the unavailable
> alternatives and the chosen alternative evidently  belongs to the set of
> available ones.
>
> 5)I use mlogit.data to prepare the dataset from a "wide" dataframe . There
> is no option to have only a subset of alternatives and the resulting object
> will have them all , that is, there will be a line for every alternative
> and every choice situation, even if in reality some of them were not
> available. The variables of these alternatives did not exist, so must be
> filled with 0s or any other made up value
>
> 6) If ones estimate a model from this data it will be wrong
>
> 7) It is possible to get an "almost right" model by using a dummy variable
> marking which alternatives are unavailable, for as it is only used in
> alternatives that are never chosen, its coefficient will get negative with
> big absolute value, in practice giving almost 0% probability for them
>
> 8)this is a workaround because it obligates the model to estimate a number
> that should be -infinity and this is known in advance, so it's ugly and
> difficult to know what the numeric consequences are as the coefficient can
> never converge. In fact, I don't use it the way I described for these
> reasons, preferring a more complex but almost equivalent formulation. The
> important point is that I want a clean solution, not a workaround
>
> 9)I demand simply if mlogit package has such functionality
>
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Amelia, Zelig and latex in R

2013-08-18 Thread Christopher Desjardins
Seems you're after the pooled results. Would the following work?



library(Amelia)
library(Zelig)
library(xtable)

data(africa)

m = 10
imp1 - amelia(x = africa,cs=country,m=m)
imp2 - amelia(x = africa,cs=country,m=m)
lm.imputed1 - zelig(gdp_pc ~ trade + civlib, model=ls,data = imp1)
lm.imputed2 - zelig(gdp_pc ~ trade + civlib, model=ls,data = imp2)

lm1 - as.data.frame(summary(lm.imputed1)$coef)
lm2 - as.data.frame(summary(lm.imputed2)$coef)
lm1[,2] - ifelse(lm1[,4].001,paste(lm1[,2],***,sep= ),
  ifelse(lm1[,4].01,paste(lm1[,2],**,sep= ),
 ifelse(lm1[,4].05,paste(lm1[,2],*,sep= ),
ifelse(lm1[,4].1,paste(lm1[,2],.,sep=
),lm1[,2]

lm2[,2] - ifelse(lm2[,4].001,paste(lm2[,2],***,sep= ),
  ifelse(lm2[,4].01,paste(lm2[,2],**,sep= ),
 ifelse(lm2[,4].05,paste(lm2[,2],*,sep= ),
ifelse(lm2[,4].1,paste(lm2[,2],.,sep=
),lm2[,2]

## ONE OPTIONS ##
lms - as.data.frame(cbind(lm1[,1],lm2[,1],lm1[,2],lm2[,2]))
rownames(lms) - rownames(lm1)
colnames(lms) - c(Imp1.Est,Imp2.Est,Imp1.SE,Imp2.SE)
xtable(lms)

## OR ##
xtable(cbind(lm1[,1:2],lm2[,1:2]))

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Amelia, Zelig and latex in R

2013-08-18 Thread Christopher Desjardins
Hi,
I am glad you could get it to work. I don't really know  I usually just
use xtable and any additional formatting I need done I do in my LaTeX
editor. Perhaps there isn't a nice tex format out of the box for MI data.
Once you write some nice code, you could keep reusing it or better yet
package it :)

On Sunday, August 18, 2013, Francesco Sarracino wrote:

 That's right!
 Your advice is in the right direction and with little adjustments it did
 the job. However, I admit it was tricky and the result looks a bit
 artisanal and needs some polishing that I will do by hand in the tex code.
 Is it possible that there is no way to get nicely latex formatted tables
 concerning multiply imputed data-set?
 But maybe I should open a separate thread on this.
 Thanks a lot for your kind and patient help.
 Best regards,
 f.


 On 18 August 2013 15:56, Christopher Desjardins 
 cddesjard...@gmail.comjavascript:_e({}, 'cvml', 'cddesjard...@gmail.com');
  wrote:

 Seems you're after the pooled results. Would the following work?



 library(Amelia)
 library(Zelig)
 library(xtable)

 data(africa)

 m = 10
 imp1 - amelia(x = africa,cs=country,m=m)
 imp2 - amelia(x = africa,cs=country,m=m)
  lm.imputed1 - zelig(gdp_pc ~ trade + civlib, model=ls,data = imp1)
 lm.imputed2 - zelig(gdp_pc ~ trade + civlib, model=ls,data = imp2)

 lm1 - as.data.frame(summary(lm.imputed1)$coef)
 lm2 - as.data.frame(summary(lm.imputed2)$coef)
 lm1[,2] - ifelse(lm1[,4].001,paste(lm1[,2],***,sep= ),
   ifelse(lm1[,4].01,paste(lm1[,2],**,sep= ),
  ifelse(lm1[,4].05,paste(lm1[,2],*,sep= ),
 ifelse(lm1[,4].1,paste(lm1[,2],.,sep=
 ),lm1[,2]

 lm2[,2] - ifelse(lm2[,4].001,paste(lm2[,2],***,sep= ),
   ifelse(lm2[,4].01,paste(lm2[,2],**,sep= ),
  ifelse(lm2[,4].05,paste(lm2[,2],*,sep= ),
 ifelse(lm2[,4].1,paste(lm2[,2],.,sep=
 ),lm2[,2]

 ## ONE OPTIONS ##
 lms - as.data.frame(cbind(lm1[,1],lm2[,1],lm1[,2],lm2[,2]))
 rownames(lms) - rownames(lm1)
 colnames(lms) - c(Imp1.Est,Imp2.Est,Imp1.SE,Imp2.SE)
 xtable(lms)

 ## OR ##
 xtable(cbind(lm1[,1:2],lm2[,1:2]))




 --
 Francesco Sarracino, Ph.D.
 fsarracino https://sites.google.com/site/fsarracino/


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Amelia, Zelig and latex in R

2013-08-17 Thread Christopher Desjardins
Does this do what you want?

library(Amelia)
library(Zelig)
library(stargazer)
library(xtable)

data(africa)

m = 10
imp1 - amelia(x = africa,cs=country,m=m)
imp2 - amelia(x = africa,cs=country,m=m)
lm.imputed1 - zelig(infl ~ trade + civlib, model=ls,data = imp1)
lm.imputed2 - zelig(infl ~ trade + civlib, model=ls,data = imp2)

# Stargazer
for(i in 1:m){
print(stargazer(as.data.frame(summary(lm.imputed1[[i]])$coef),as.data.frame(summary(lm.imputed2[[i]])$coef)))
}

# xtable
for(i in 1:m){
  print(xtable(summary(lm.imputed1[[i]])))
  print(xtable(summary(lm.imputed2[[i]])))
}


On Sat, Aug 17, 2013 at 6:37 AM, Francesco Sarracino
f.sarrac...@gmail.comwrote:

 Dear listers,

 I am running some OLS on multiply imputed data using Amelia.
 I first imputed the data with Amelia.
 than I run a OLS using Zelig to obtain a table of results accounting for
 the multiply imputed data-sets. And I'd like to do this for various models.
 Finally, I want to output all the models in a table of results for latex.

 I've tried   with  Stargazer because it seems to support Zelig output, but
 when I run stargazer on a set of objects containing the output of zelig, I
 get the following error: Error: unrecognized object type.

 this message is repeated for each model I passed to Stargazer.

 I am sorry I can't provide a working example, because I should make up some
 multiply imputed data first. Hoewever, summarizing what I did is:

 imputed1 - amelia(x=data1, m=10)
 imputed2 - amelia(x=data2, m=10)
 lm.imputed1 - zelig(Y ~ X + Z, data = imputed1)
 lm.imputed2 - zelig(Y ~ X + Z, data = imputed2)
 stargazer(lm.imputed1, lm.imputed2)
 The outcome is the error I mentioned above.
 Thanks in advance for all the support you can offer.
 Regards,
 f.

 [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Amelia, Zelig and latex in R

2013-08-17 Thread Christopher Desjardins
What do you mean by results? Do you want just the estimated parameters? And
are you looking for one big table with all the estimated parameters from
all imputation runs?

Chris


On Sat, Aug 17, 2013 at 11:18 AM, Francesco Sarracino f.sarrac...@gmail.com
 wrote:

 Hi Christopher,
 thanks for your reply. Unfortunately, that's not what I am looking for. I
 would like to have a table with the results of the two models (lm.imputed1
 and lm.imputed2) in two separate columns.
 According to stargazer syntax I should type something like:
 stargazer(lm.imputed1, lm.imputed2, summary = FALSE)
 but then I get my error:
 Error: Unrecognized object type.

 Even though your example is insightful, I  can't  figure out how to solve
 my problem.
 Any advice is very welcome.
 Regards,
 f.


 On 17 August 2013 17:02, Christopher Desjardins cddesjard...@gmail.comwrote:

 Does this do what you want?

 library(Amelia)
 library(Zelig)
 library(stargazer)
 library(xtable)

 data(africa)

  m = 10
 imp1 - amelia(x = africa,cs=country,m=m)
 imp2 - amelia(x = africa,cs=country,m=m)
 lm.imputed1 - zelig(infl ~ trade + civlib, model=ls,data = imp1)
 lm.imputed2 - zelig(infl ~ trade + civlib, model=ls,data = imp2)

 # Stargazer
 for(i in 1:m){

 print(stargazer(as.data.frame(summary(lm.imputed1[[i]])$coef),as.data.frame(summary(lm.imputed2[[i]])$coef)))
 }

 # xtable
 for(i in 1:m){
   print(xtable(summary(lm.imputed1[[i]])))
   print(xtable(summary(lm.imputed2[[i]])))
 }


 On Sat, Aug 17, 2013 at 6:37 AM, Francesco Sarracino 
 f.sarrac...@gmail.com wrote:

 Dear listers,

 I am running some OLS on multiply imputed data using Amelia.
 I first imputed the data with Amelia.
 than I run a OLS using Zelig to obtain a table of results accounting for
 the multiply imputed data-sets. And I'd like to do this for various
 models.
 Finally, I want to output all the models in a table of results for latex.

 I've tried   with  Stargazer because it seems to support Zelig output,
 but
 when I run stargazer on a set of objects containing the output of zelig,
 I
 get the following error: Error: unrecognized object type.

 this message is repeated for each model I passed to Stargazer.

 I am sorry I can't provide a working example, because I should make up
 some
 multiply imputed data first. Hoewever, summarizing what I did is:

 imputed1 - amelia(x=data1, m=10)
 imputed2 - amelia(x=data2, m=10)
 lm.imputed1 - zelig(Y ~ X + Z, data = imputed1)
 lm.imputed2 - zelig(Y ~ X + Z, data = imputed2)
 stargazer(lm.imputed1, lm.imputed2)
 The outcome is the error I mentioned above.
 Thanks in advance for all the support you can offer.
 Regards,
 f.

 [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.





 --
 Francesco Sarracino, Ph.D.
 https://sites.google.com/site/fsarracino/


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Amelia, Zelig and latex in R

2013-08-17 Thread Christopher Desjardins
Oh and are you looking for just the summarized results over all the imputed
runs? i thought you wanted them from each iteration.



On Sat, Aug 17, 2013 at 11:38 AM, Christopher Desjardins 
cddesjard...@gmail.com wrote:

 What do you mean by results? Do you want just the estimated parameters?
 And are you looking for one big table with all the estimated parameters
 from all imputation runs?

 Chris


 On Sat, Aug 17, 2013 at 11:18 AM, Francesco Sarracino 
 f.sarrac...@gmail.com wrote:

 Hi Christopher,
 thanks for your reply. Unfortunately, that's not what I am looking for. I
 would like to have a table with the results of the two models (lm.imputed1
 and lm.imputed2) in two separate columns.
 According to stargazer syntax I should type something like:
 stargazer(lm.imputed1, lm.imputed2, summary = FALSE)
 but then I get my error:
 Error: Unrecognized object type.

  Even though your example is insightful, I  can't  figure out how to
 solve my problem.
 Any advice is very welcome.
 Regards,
 f.


 On 17 August 2013 17:02, Christopher Desjardins 
 cddesjard...@gmail.comwrote:

 Does this do what you want?

 library(Amelia)
 library(Zelig)
 library(stargazer)
 library(xtable)

 data(africa)

  m = 10
 imp1 - amelia(x = africa,cs=country,m=m)
 imp2 - amelia(x = africa,cs=country,m=m)
 lm.imputed1 - zelig(infl ~ trade + civlib, model=ls,data = imp1)
 lm.imputed2 - zelig(infl ~ trade + civlib, model=ls,data = imp2)

 # Stargazer
 for(i in 1:m){

 print(stargazer(as.data.frame(summary(lm.imputed1[[i]])$coef),as.data.frame(summary(lm.imputed2[[i]])$coef)))
 }

 # xtable
 for(i in 1:m){
   print(xtable(summary(lm.imputed1[[i]])))
   print(xtable(summary(lm.imputed2[[i]])))
 }


 On Sat, Aug 17, 2013 at 6:37 AM, Francesco Sarracino 
 f.sarrac...@gmail.com wrote:

 Dear listers,

 I am running some OLS on multiply imputed data using Amelia.
 I first imputed the data with Amelia.
 than I run a OLS using Zelig to obtain a table of results accounting for
 the multiply imputed data-sets. And I'd like to do this for various
 models.
 Finally, I want to output all the models in a table of results for
 latex.

 I've tried   with  Stargazer because it seems to support Zelig output,
 but
 when I run stargazer on a set of objects containing the output of
 zelig, I
 get the following error: Error: unrecognized object type.

 this message is repeated for each model I passed to Stargazer.

 I am sorry I can't provide a working example, because I should make up
 some
 multiply imputed data first. Hoewever, summarizing what I did is:

 imputed1 - amelia(x=data1, m=10)
 imputed2 - amelia(x=data2, m=10)
 lm.imputed1 - zelig(Y ~ X + Z, data = imputed1)
 lm.imputed2 - zelig(Y ~ X + Z, data = imputed2)
 stargazer(lm.imputed1, lm.imputed2)
 The outcome is the error I mentioned above.
 Thanks in advance for all the support you can offer.
 Regards,
 f.

 [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.





 --
 Francesco Sarracino, Ph.D.
 https://sites.google.com/site/fsarracino/




[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Randomly drop a percent of data from a data.frame

2013-08-16 Thread Christopher Desjardins
Hi,
I have the following data.

 set.seed(6245)
 data - data.frame(x1=rnorm(5),x2=rnorm(5),x3=rnorm(5),x4=rnorm(5))
 round(data,digits=3)
  x1 x2 x3 x4
1  0.482  1.320 -0.859 -0.142
2 -0.753 -0.041 -0.063  0.886
3  0.028 -0.256 -0.069  0.354
4 -0.086  0.475  0.244  0.781
5  0.690 -0.181  1.274  1.633

What I would like to do is drop 20% of the data. But I want this 20% to
only come from dropping data from x3 and x4. It doesn't have to be evenly,
i.e. I don't care to drop 2 from x3 and 2 from x4 or make sure only one
observation has missing data on only one variable. I just want to drop 20%
of the data through x3 and x4 only.  In other words,

   x1 x2 x3 x4
1  0.482  1.320 -0.859 NA
2 -0.753 -0.041 -0.063  0.886
3  0.028 -0.256  NA  0.354
4 -0.086  0.475  NA  0.781
5  0.690 -0.181  NA  1.633

OR

  x1 x2 x3 x4
1  0.482  1.320 NA -0.142
2 -0.753 -0.041 -0.063  0.886
3  0.028 -0.256  NA  NA
4 -0.086  0.475  0.244  NA
5  0.690 -0.181  1.274  1.633

OR

  x1 x2 x3 x4
1  0.482  1.320 -0.859 -0.142
2 -0.753 -0.041 -0.063 NA
3  0.028 -0.256 -0.069 NA
4 -0.086  0.475  0.244 NA
5  0.690 -0.181  1.274 NA

ETC. are all fine.

Any ideas how I can do this?
Chris

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Randomly drop a percent of data from a data.frame

2013-08-16 Thread Christopher Desjardins
Hi,
Thanks for the help. What I actually ended up doing was writing a copy of
for loops and I ended up getting something works.
Thanks.
Chris


On Fri, Aug 16, 2013 at 4:34 PM, arun smartpink...@yahoo.com wrote:

 Hi,
 May be this helps:
 #data1 (changed `data` to `data1`)
 set.seed(6245)
  data1 - data.frame(x1=rnorm(5),x2=rnorm(5),x3=rnorm(5),x4=rnorm(5))
  data1- round(data1,digits=3)

 data2- data1

 data1[,3:4]-lapply(data1[,3:4],function(x){x1-
 match(x,sample(unlist(data1[,3:4]),round(0.8*length(unlist(data1[,3:4]);x[
 is.na(x1)]-NA;x})
  data1
 #  x1 x2 x3 x4
 #1  0.482  1.320 NA -0.142
 #2 -0.753 -0.041 -0.063  0.886
 #3  0.028 -0.256 -0.069  0.354
 #4 -0.086  0.475  0.244  0.781
 #5  0.690 -0.181  1.274  1.633


 #or
 data2[,3:4]-lapply(data2[,3:4],function(x){x1-
 match(x,sample(unlist(data2[,3:4]),round(0.8*length(unlist(data2[,3:4]);x[
 is.na(x1)]-NA;x})
  data2
 #  x1 x2 x3 x4
 #1  0.482  1.320 -0.859 -0.142
 #2 -0.753 -0.041 NA NA
 #3  0.028 -0.256 -0.069  0.354
 #4 -0.086  0.475  0.244  0.781
 #5  0.690 -0.181  1.274  1.633
 A.K.



 - Original Message -
 From: Christopher Desjardins cddesjard...@gmail.com
 To: r-help@r-project.org r-help@r-project.org
 Cc:
 Sent: Friday, August 16, 2013 3:02 PM
 Subject: [R] Randomly drop a percent of data from a data.frame

 Hi,
 I have the following data.

  set.seed(6245)
  data - data.frame(x1=rnorm(5),x2=rnorm(5),x3=rnorm(5),x4=rnorm(5))
  round(data,digits=3)
   x1 x2 x3 x4
 1  0.482  1.320 -0.859 -0.142
 2 -0.753 -0.041 -0.063  0.886
 3  0.028 -0.256 -0.069  0.354
 4 -0.086  0.475  0.244  0.781
 5  0.690 -0.181  1.274  1.633

 What I would like to do is drop 20% of the data. But I want this 20% to
 only come from dropping data from x3 and x4. It doesn't have to be evenly,
 i.e. I don't care to drop 2 from x3 and 2 from x4 or make sure only one
 observation has missing data on only one variable. I just want to drop 20%
 of the data through x3 and x4 only.  In other words,

x1 x2 x3 x4
 1  0.482  1.320 -0.859 NA
 2 -0.753 -0.041 -0.063  0.886
 3  0.028 -0.256  NA  0.354
 4 -0.086  0.475  NA  0.781
 5  0.690 -0.181  NA  1.633

 OR

   x1 x2 x3 x4
 1  0.482  1.320 NA -0.142
 2 -0.753 -0.041 -0.063  0.886
 3  0.028 -0.256  NA  NA
 4 -0.086  0.475  0.244  NA
 5  0.690 -0.181  1.274  1.633

 OR

   x1 x2 x3 x4
 1  0.482  1.320 -0.859 -0.142
 2 -0.753 -0.041 -0.063 NA
 3  0.028 -0.256 -0.069 NA
 4 -0.086  0.475  0.244 NA
 5  0.690 -0.181  1.274 NA

 ETC. are all fine.

 Any ideas how I can do this?
 Chris

 [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Combining data from different saved files with same object names into one data frame

2013-02-04 Thread Christopher Desjardins
Thanks Jim. That'll work for me!
Chris


On Mon, Feb 4, 2013 at 11:17 AM, jim holtman jholt...@gmail.com wrote:

 Try this storing them to a list with the j,k,l,m as an index:

 result - list()
 for(j in 1:2){
   for(k in 1:6){
 for(l in 1:2){
   for(m in 1:2){


 fsumstatZINB=paste(/Users/chris/Dropbox/phd/analysis/Simulation/zinbmodels/summaries/zinbsumstat-,j,k,l,m,.rdata,
 sep=)
 load(fsumstatZINB)
 print(zinbMeans[,1])
 result[[paste(j,k,l,m)]] - zinbMeans
   }
 }
   }
 }


 On Mon, Feb 4, 2013 at 12:02 PM, Christopher Desjardins
 cddesjard...@gmail.com wrote:
  Hi,
 
  I am simulating data from a zero-inflated negative binomial model. I
 have 4
  different conditions and I've saved various information about the models
 in
  zinbsumbstat-jklm.rdata (as shown below). What I want to do is print
  information from each of these *.rdata and store them in one large object
  (a matrix or a data frame). However, within each data the objects all
 have
  the same name. So I don't know how to combine these into one object.
 Below
  is the code that I could use to paste all the information into the
 console.
 
  for(j in 1:2){
for(k in 1:6){
  for(l in 1:2){
for(m in 1:2){
 
 
 fsumstatZINB=paste(/Users/chris/Dropbox/phd/analysis/Simulation/zinbmodels/summaries/zinbsumstat-,j,k,l,m,.rdata,
  sep=)
  load(fsumstatZINB)
  print(zinbMeans[,1])
}
  }
}
  }
 
  How could I combine all the information from these loops into a matrix()
 or
  a data.frame()?
 
  Two of the datasets [1,2] are pasted in a public Dropbox.
  Thanks!
  Chris
 
  [1] https://dl.dropbox.com/u/1501309/zinbsumstat-.rdata
  [2] https://dl.dropbox.com/u/1501309/zinbsumstat-1112.rdata
 
  [[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.



 --
 Jim Holtman
 Data Munger Guru

 What is the problem that you are trying to solve?
 Tell me what you want to do, not how you want to do it.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Simulation with cpm package

2012-11-13 Thread Christopher Desjardins
Hi,
I am running the following code based on the cpm vignette's code. I believe
the code is syntactically correct but it just seems to hang R. I can get
this to run if I set the sims to 100 but with 2000 it just hangs. Any ideas
why?
Thanks,
Chris

library(cpm)
cpmTypes - c(Kolmogorov-Smirnov,Mann-Whitney,Cramer-von-Mises)
changeMagnitudes - c(1, 2, 4, 5)
changeLocations - c(50,100,300)
sims - 2000
ARL0 - 500
startup - 20
results - list()
for (cpmType in cpmTypes) {
  results[[cpmType]] - matrix(numeric(length(changeMagnitudes) *
 length(changeLocations)), nrow =
length(changeMagnitudes))
  for (cm in 1:length(changeMagnitudes)) {
for (cl in 1:length(changeLocations)) {
  print(sprintf(cpm:%s magnitude::%s location:%s,
cpmType, changeMagnitudes[cm], changeLocations[cl]))
  temp - numeric(sims)
  for (s in 1:sims) {
x -c(rchisq(changeLocations[cl], df=3), rchisq(2000,

df=changeMagnitudes[cm]))
temp[s] -detectChangePoint(x, cpmType,
ARL0=ARL0,
startup=startup)$detectionTime
  }
  results[[cpmType]][cm,cl] - mean(temp[temp  changeLocations[cl]]) -
changeLocations[cl]
}
  } }

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Replacing NAs in long format

2012-11-04 Thread Christopher Desjardins
Thanks these different examples work perfectly.
Chris

On Sat, Nov 3, 2012 at 8:32 PM, arun smartpink...@yahoo.com wrote:

 HI Bill,
 It is much simpler.
 # with aggregate() and merge()

 res1-with(dat2,aggregate(seq_len(nrow(dat2)),by=list(idr=idr),FUN=function(i)
 with(dat2[i,], any(schyear=5  year ==0
  res2-merge(dat2,res1,by=idr)
  colnames(res2)[4]-flag
  within(res2,{flag-as.integer(flag)})
  #idr schyear year flag
 #1   1   4   -11
 #2   1   501
 #3   1   611
 #4   1   721
 #5   2   900
 #6   2  1010
 #7   2  1120


 A.K.






 - Original Message -
 From: William Dunlap wdun...@tibco.com
 To: arun smartpink...@yahoo.com; Christopher Desjardins 
 cddesjard...@gmail.com
 Cc: R help r-help@r-project.org
 Sent: Saturday, November 3, 2012 9:21 PM
 Subject: RE: [R] Replacing NAs in long format

 Or, even simpler,

  flag - with(dat2, ave(schyear=5  year==0, idr, FUN=any))
  data.frame(dat2, flag)
   idr schyear year  flag
 1   1   4   -1  TRUE
 2   1   50  TRUE
 3   1   61  TRUE
 4   1   72  TRUE
 5   2   90 FALSE
 6   2  101 FALSE
 7   2  112 FALSE

 Bill Dunlap
 Spotfire, TIBCO Software
 wdunlap tibco.com


  -Original Message-
  From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
 On Behalf
  Of William Dunlap
  Sent: Saturday, November 03, 2012 5:38 PM
  To: arun; Christopher Desjardins
  Cc: R help
  Subject: Re: [R] Replacing NAs in long format
 
  ave() or split-() can make that easier to write, although it
  may take some time to internalize the idiom.  E.g.,
 
 flag - rep(NA, nrow(dat2)) # add as.integer if you prefer 1,0 over
 TRUE,FALSE
 split(flag, dat2$idr) - lapply(split(dat2, dat2$idr),
 function(d)with(d, any(schyear=5 
  year==0)))
 data.frame(dat2, flag)
  idr schyear year  flag
1   1   4   -1  TRUE
2   1   50  TRUE
3   1   61  TRUE
4   1   72  TRUE
5   2   90 FALSE
6   2  101 FALSE
7   2  112 FALSE
  or
 ave(seq_len(nrow(dat2)), dat2$idr, FUN=function(i)with(dat2[i,],
 any(schyear=5 
  year==0)))
[1] 1 1 1 1 0 0 0
 flag - ave(seq_len(nrow(dat2)), dat2$idr,
 FUN=function(i)with(dat2[i,],
  any(schyear=5  year==0)))
 data.frame(dat2, flag)
  idr schyear year flag
1   1   4   -11
2   1   501
3   1   611
4   1   721
5   2   900
6   2  1010
7   2  1120
 
  Bill Dunlap
  Spotfire, TIBCO Software
  wdunlap tibco.com
 
 
   -Original Message-
   From: r-help-boun...@r-project.org [mailto:
 r-help-boun...@r-project.org] On Behalf
   Of arun
   Sent: Saturday, November 03, 2012 5:01 PM
   To: Christopher Desjardins
   Cc: R help
   Subject: Re: [R] Replacing NAs in long format
  
   Hi,
   May be this helps:
   dat2-read.table(text=
   idr  schyear  year
   14  -1
   150
   161
   172
   290
   2101
   211  2
   ,sep=,header=TRUE)
  
dat2$flag-unlist(lapply(split(dat2,dat2$idr),function(x)
   rep(ifelse(any(apply(x,1,function(x) x[2]=5 
  x[3]==0)),1,0),nrow(x))),use.names=FALSE)
dat2
   #  idr schyear year flag
   #1   1   4   -11
   #2   1   501
   #3   1   611
   #4   1   721
   #5   2   900
   #6   2  1010
   #7   2  1120
   A.K.
  
  
  
  
   - Original Message -
   From: Christopher Desjardins cddesjard...@gmail.com
   To: jim holtman jholt...@gmail.com
   Cc: r-help@r-project.org
   Sent: Saturday, November 3, 2012 7:09 PM
   Subject: Re: [R] Replacing NAs in long format
  
   I have a similar sort of follow up and I bet I could reuse some of this
   code but I'm not sure how.
  
   Let's say I want to create a flag that will be equal to 1 if schyear
  = 5
   and year = 0 for a given idr. For example
  
dat
  
   idr   schyear   year
   1 4   -1
   1 50
   1 61
   1 72
   2 90
   2101
   211   2
  
   How could I make the data look like this?
  
   idr   schyear   year   flag
   1 4   -1 1
   1 50 1
   1 61 1
   1 72 1
   2 90 0
   21010
   211   2 0
  
  
   I am not sure how to end up not getting both 0s and 1s for the 'flag'
   variable for an idr. For example,
  
   dat$flag = ifelse(schyear = 5  year ==0, 1, 0)
  
   Does not work because it will create:
  
   idr   schyear   year   flag
   1 4   -1 0
   1 50 1
   1 6

[R] Replacing NAs in long format

2012-11-03 Thread Christopher Desjardins
Hi,
I have the following data:

 data[1:20,c(1,2,20)]
idr  schyear year
1   80
1   91
1  10   NA
2   4   NA
2   5   -1
2   60
2   71
2   82
2   93
2  104
2  11   NA
2  126
3   4   NA
3   5   -2
3   6   -1
3   70
3   81
3   92
3  103
3  11   NA

What I want to do is replace the NAs in the year variable with the
following:

idr  schyear year
1   80
1   91
1  10   2
2   4   -2
2   5   -1
2   60
2   71
2   82
2   93
2  104
2  11   5
2  126
3   4   -3
3   5   -2
3   6   -1
3   70
3   81
3   92
3  103
3  11   4

I have no idea how to do this. What it needs to do is make sure that for
each subject (idr) that it either adds a 1 if it is preceded by a value in
year or subtracts a 1 if it comes before a year value.

Does that make sense? I could do this in Excel but I am at a loss for how
to do this in R. Please reply to me as well as the list if you respond.

Thanks!
Chris

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Replacing NAs in long format

2012-11-03 Thread Christopher Desjardins
Hi Jim,
Thank you so much. That does exactly what I want.
Chris

On Sat, Nov 3, 2012 at 1:30 PM, jim holtman jholt...@gmail.com wrote:

  x - read.table(text = idr  schyear year
 +  1   80
 +  1   91
 +  1  10   NA
 +  2   4   NA
 +  2   5   -1
 +  2   60
 +  2   71
 +  2   82
 +  2   93
 +  2  104
 +  2  11   NA
 +  2  126
 +  3   4   NA
 +  3   5   -2
 +  3   6   -1
 +  3   70
 +  3   81
 +  3   92
 +  3  103
 +  3  11   NA, header = TRUE)
   # you did not specify if there might be multiple contiguous NAs,
   # so there are a lot of checks to be made
   x.l - lapply(split(x, x$idr), function(.idr){
 + # check for all NAs -- just return indeterminate state
 + if (sum(is.na(.idr$year)) == nrow(.idr)) return(.idr)
 + # repeat until all NAs have been fixed; takes care of contiguous ones
 + while (any(is.na(.idr$year))){
 + # find all the NAs
 + for (i in which(is.na(.idr$year))){
 + if ((i == 1L)  (!is.na(.idr$year[i + 1L]))){
 + .idr$year[i] - .idr$year[i + 1L] - 1
 + } else if ((i  1L)  (!is.na(.idr$year[i - 1L]))){
 + .idr$year[i] - .idr$year[i - 1L] + 1
 + } else if ((i  nrow(.idr))  (!is.na(.idr$year[i + 1L]))){
 + .idr$year[i] - .idr$year[i + 1L] -1
 + }
 + }
 + }
 + return(.idr)
 + })
  do.call(rbind, x.l)
  idr schyear year
 1.11   80
 1.21   91
 1.31  102
 2.42   4   -2
 2.52   5   -1
 2.62   60
 2.72   71
 2.82   82
 2.92   93
 2.10   2  104
 2.11   2  115
 2.12   2  126
 3.13   3   4   -3
 3.14   3   5   -2
 3.15   3   6   -1
 3.16   3   70
 3.17   3   81
 3.18   3   92
 3.19   3  103
 3.20   3  114
 
 


 On Sat, Nov 3, 2012 at 1:14 PM, Christopher Desjardins
 cddesjard...@gmail.com wrote:
  Hi,
  I have the following data:
 
  data[1:20,c(1,2,20)]
  idr  schyear year
  1   80
  1   91
  1  10   NA
  2   4   NA
  2   5   -1
  2   60
  2   71
  2   82
  2   93
  2  104
  2  11   NA
  2  126
  3   4   NA
  3   5   -2
  3   6   -1
  3   70
  3   81
  3   92
  3  103
  3  11   NA
 
  What I want to do is replace the NAs in the year variable with the
  following:
 
  idr  schyear year
  1   80
  1   91
  1  10   2
  2   4   -2
  2   5   -1
  2   60
  2   71
  2   82
  2   93
  2  104
  2  11   5
  2  126
  3   4   -3
  3   5   -2
  3   6   -1
  3   70
  3   81
  3   92
  3  103
  3  11   4
 
  I have no idea how to do this. What it needs to do is make sure that for
  each subject (idr) that it either adds a 1 if it is preceded by a value
 in
  year or subtracts a 1 if it comes before a year value.
 
  Does that make sense? I could do this in Excel but I am at a loss for how
  to do this in R. Please reply to me as well as the list if you respond.
 
  Thanks!
  Chris
 
  [[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.



 --
 Jim Holtman
 Data Munger Guru

 What is the problem that you are trying to solve?
 Tell me what you want to do, not how you want to do it.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Replacing NAs in long format

2012-11-03 Thread Christopher Desjardins
I have a similar sort of follow up and I bet I could reuse some of this
code but I'm not sure how.

Let's say I want to create a flag that will be equal to 1 if schyear   = 5
and year = 0 for a given idr. For example

 dat

idr   schyear   year
1 4   -1
1 50
1 61
1 72
2 90
2101
211   2

How could I make the data look like this?

idr   schyear   year   flag
1 4   -1 1
1 50 1
1 61 1
1 72 1
2 90 0
21010
211   2 0


I am not sure how to end up not getting both 0s and 1s for the 'flag'
variable for an idr. For example,

dat$flag = ifelse(schyear = 5  year ==0, 1, 0)

Does not work because it will create:

idr   schyear   year   flag
1 4   -1 0
1 50 1
1 61 0
1 72 0
2 90 0
21010
211   2 0

And thus flag changes for an idr. Which it shouldn't.

Thanks,
Chris


On Sat, Nov 3, 2012 at 5:50 PM, Christopher Desjardins 
cddesjard...@gmail.com wrote:

 Hi Jim,
 Thank you so much. That does exactly what I want.
 Chris


 On Sat, Nov 3, 2012 at 1:30 PM, jim holtman jholt...@gmail.com wrote:

  x - read.table(text = idr  schyear year
 +  1   80
 +  1   91
 +  1  10   NA
 +  2   4   NA
 +  2   5   -1
 +  2   60
 +  2   71
 +  2   82
 +  2   93
 +  2  104
 +  2  11   NA
 +  2  126
 +  3   4   NA
 +  3   5   -2
 +  3   6   -1
 +  3   70
 +  3   81
 +  3   92
 +  3  103
 +  3  11   NA, header = TRUE)
   # you did not specify if there might be multiple contiguous NAs,
   # so there are a lot of checks to be made
   x.l - lapply(split(x, x$idr), function(.idr){
 + # check for all NAs -- just return indeterminate state
 + if (sum(is.na(.idr$year)) == nrow(.idr)) return(.idr)
 + # repeat until all NAs have been fixed; takes care of contiguous
 ones
 + while (any(is.na(.idr$year))){
 + # find all the NAs
 + for (i in which(is.na(.idr$year))){
 + if ((i == 1L)  (!is.na(.idr$year[i + 1L]))){
 + .idr$year[i] - .idr$year[i + 1L] - 1
 + } else if ((i  1L)  (!is.na(.idr$year[i - 1L]))){
 + .idr$year[i] - .idr$year[i - 1L] + 1
 + } else if ((i  nrow(.idr))  (!is.na(.idr$year[i +
 1L]))){
 + .idr$year[i] - .idr$year[i + 1L] -1
 + }
 + }
 + }
 + return(.idr)
 + })
  do.call(rbind, x.l)
  idr schyear year
 1.11   80
 1.21   91
 1.31  102
 2.42   4   -2
 2.52   5   -1
 2.62   60
 2.72   71
 2.82   82
 2.92   93
 2.10   2  104
 2.11   2  115
 2.12   2  126
 3.13   3   4   -3
 3.14   3   5   -2
 3.15   3   6   -1
 3.16   3   70
 3.17   3   81
 3.18   3   92
 3.19   3  103
 3.20   3  114
 
 


 On Sat, Nov 3, 2012 at 1:14 PM, Christopher Desjardins
 cddesjard...@gmail.com wrote:
  Hi,
  I have the following data:
 
  data[1:20,c(1,2,20)]
  idr  schyear year
  1   80
  1   91
  1  10   NA
  2   4   NA
  2   5   -1
  2   60
  2   71
  2   82
  2   93
  2  104
  2  11   NA
  2  126
  3   4   NA
  3   5   -2
  3   6   -1
  3   70
  3   81
  3   92
  3  103
  3  11   NA
 
  What I want to do is replace the NAs in the year variable with the
  following:
 
  idr  schyear year
  1   80
  1   91
  1  10   2
  2   4   -2
  2   5   -1
  2   60
  2   71
  2   82
  2   93
  2  104
  2  11   5
  2  126
  3   4   -3
  3   5   -2
  3   6   -1
  3   70
  3   81
  3   92
  3  103
  3  11   4
 
  I have no idea how to do this. What it needs to do is make sure that for
  each subject (idr) that it either adds a 1 if it is preceded by a value
 in
  year or subtracts a 1 if it comes before a year value.
 
  Does that make sense? I could do this in Excel but I am at a loss for
 how
  to do this in R. Please reply to me as well as the list if you respond.
 
  Thanks!
  Chris
 
  [[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.



 --
 Jim Holtman
 Data

[R] New Submission to CRAN note

2012-09-21 Thread Christopher Desjardins
Hi,
I want to submit a package to CRAN and I am getting the following Note:

* checking CRAN incoming feasibility ... NOTE
New submission

How can I take care of this? And/or is it a big deal?


Thanks and sorry if this is something that I easily overlooked I have
googled this topic for a while and found nothing.

Chris

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] hat matrix for zeroinfl and hurdle objects

2012-08-22 Thread Christopher Desjardins
Hi,

I am wondering if there is an easy way to access the hat matrix for
zeroinfl and hurdle objects in the pscl library?

Thanks,
Chris

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Changing ungrouped cases to grouped cases

2012-07-20 Thread Christopher Desjardins
Thanks the aggregate() command is what I was looking for.
Chris

On Thu, Jul 19, 2012 at 9:03 PM, David L Carlson dcarl...@tamu.edu wrote:

  dtf - read.table(text=y A   B   C
 + 0 11   2
 + 0 12   1
 + 1 11   2
 + 0 11   2
 + 1 11   2
 + 1 12   1
 + 0 12   2,
 + header=TRUE)
  dtagroup - aggregate(y~A+B+C, dtf, sum)
 # Gets you the groups. If you need the column/row order:

  dtagroup - dtagroup[order(dtagroup$y, decreasing=TRUE),c(4, 1:3)]

 --
 David L Carlson
 Associate Professor of Anthropology
 Texas AM University
 College Station, TX 77843-4352

  -Original Message-
  From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
  project.org] On Behalf Of Christopher Desjardins
  Sent: Thursday, July 19, 2012 7:35 PM
  To: R help
  Subject: [R] Changing ungrouped cases to grouped cases
 
  Hi,
  I have my data the following way:
 
  y A   B   C
  0 11   2
  0 12   1
  1 11   2
  0 11   2
  1 11   2
  1 12   1
  0 12   2
  .
  .
  .
  And so on.  How can I make my data look like the following:
  y   A  B  C
  2   1   1  2
  1   1   2  1
  0   1   2   2
  .
  .
  .
 
  In other words how can I change my ungrouped cases into grouped cases?
  Thanks!
  Chris
 
[[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide http://www.R-project.org/posting-
  guide.html
  and provide commented, minimal, self-contained, reproducible code.



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Changing ungrouped cases to grouped cases

2012-07-20 Thread Christopher Desjardins
As a follow up is there any way to also get the count for each combination?
For example

 y A   B   C
 0 11   2
 0 12   1
 1 11   2
 0 11   2
 1 11   2
 1 12   1
 0 12   2

Should become:

 y   A  B  C  count
 2   1   1  24
 1   1   2  12
 0   1   2   2   1
.
.
.
So I would know there were 2 successes out of 4.
Thanks!
Chris



On Fri, Jul 20, 2012 at 10:41 AM, Christopher Desjardins 
cddesjard...@gmail.com wrote:

 Thanks the aggregate() command is what I was looking for.
 Chris


 On Thu, Jul 19, 2012 at 9:03 PM, David L Carlson dcarl...@tamu.eduwrote:

  dtf - read.table(text=y A   B   C
 + 0 11   2
 + 0 12   1
 + 1 11   2
 + 0 11   2
 + 1 11   2
 + 1 12   1
 + 0 12   2,
 + header=TRUE)
  dtagroup - aggregate(y~A+B+C, dtf, sum)
 # Gets you the groups. If you need the column/row order:

  dtagroup - dtagroup[order(dtagroup$y, decreasing=TRUE),c(4, 1:3)]

 --
 David L Carlson
 Associate Professor of Anthropology
 Texas AM University
 College Station, TX 77843-4352

  -Original Message-
  From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
  project.org] On Behalf Of Christopher Desjardins
  Sent: Thursday, July 19, 2012 7:35 PM
  To: R help
  Subject: [R] Changing ungrouped cases to grouped cases
 
  Hi,
  I have my data the following way:
 
  y A   B   C
  0 11   2
  0 12   1
  1 11   2
  0 11   2
  1 11   2
  1 12   1
  0 12   2
  .
  .
  .
  And so on.  How can I make my data look like the following:
  y   A  B  C
  2   1   1  2
  1   1   2  1
  0   1   2   2
  .
  .
  .
 
  In other words how can I change my ungrouped cases into grouped cases?
  Thanks!
  Chris
 
[[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide http://www.R-project.org/posting-
  guide.html
  and provide commented, minimal, self-contained, reproducible code.




[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Changing ungrouped cases to grouped cases

2012-07-19 Thread Christopher Desjardins
Hi,
I have my data the following way:

y A   B   C
0 11   2
0 12   1
1 11   2
0 11   2
1 11   2
1 12   1
0 12   2
.
.
.
And so on.  How can I make my data look like the following:
y   A  B  C
2   1   1  2
1   1   2  1
0   1   2   2
.
.
.

In other words how can I change my ungrouped cases into grouped cases?
Thanks!
Chris

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Finding the column with the maximum value by row

2012-07-17 Thread Christopher Desjardins
Hi,
Let's say I have the following data:

 a=matrix(c(1,2,4,4,2,1,1,2,4),nrow=3,byrow=T)

 a

 [,1] [,2] [,3]

[1,]124

[2,]421

[3,]124

What syntax should I use to get R to tell me the column that corresponds to
the maximum value for each row?

For my example, I would like to get a vector that says 3, 1, 3 because the
maximum value for row 1 is in column 3, the maximum value for row 2 is in
column 1, and the maximum value for row 3 is in column 3.

Does that make sense?

Please cc me if you reply!

Chris

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Getting predicted values from a zero-inflated negative binomial using zeroinfl()

2012-05-05 Thread Christopher Desjardins
Hi,
I am a little confused at the output from predict() for a zeroinfl object.

Here's my confusion:

## From zeroinfl package
fm_zinb2 - zeroinfl(art ~ . | ., data = bioChemists, dist = negbin)


## The raw zero-inflated overdispersed data
 table(bioChemists$art)

  0   1   2   3   4   5   6   7   8   9  10  11  12  16  19
275 246 178  84  67  27  17  12   1   2   1   1   2   1   1

## The default output from predict. It looks like it is doing a horrible
job. Does it really predict 7 zeros?
 table(round(predict(fm_zinb2)))

  0   1   2   3   4   5   6  10
  7 354 487  45  12   6   3   1

##  The output from predict using count
 table(round(predict(fm_zinb2,type=count)))

  1   2   3   4   5   6  10
312 536  45  12   6   3   1

## The output from predict using zero, but here it predicts 24
structural zeros?
 table(round(predict(fm_zinb2,type=zero)))

  0   1
891  24


So my question is how do I interpret these different outputs from the
zeroinf object? What are the differences? The help page just left me
confused. I would expect that table(round(predict(fm_zinb2))) would be E(Y)
and would most accurately track table(bioChemists$art) but I am wrong. How
can I find the E(Y) that would most closely track the raw data?

Please cc me if you reply.
Thanks,
Chris

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Lambert (1992) simulation

2012-04-27 Thread Christopher Desjardins
On Fri, Apr 27, 2012 at 1:53 AM, Achim Zeileis achim.zeil...@uibk.ac.atwrote:

 On Thu, 26 Apr 2012, Christopher Desjardins wrote:

  Hi,
 I am trying to replicate Lambert (1992)'s simulation with zero-inflated
 Poisson models. The citation is here:

 @article{lambert1992zero,
 Author = {Lambert, D.},
 Journal = {Technometrics},
 Pages = {1--14},
 Publisher = {JSTOR},
 Title = {Zero-inflated {P}oisson regression, with an application to
 defects
 in manufacturing},
 Year = {1992}}

 Specifically I am trying to recreate Table 2. But my estimates for Gamma
 are not working out. Any ideas why?


 Your implementation of the inverse logit link is wrong, it should be
 exp(x)/(1 + exp(x)) and not exp(x)/exp(1 + x). In R I never code this by
 hand but either use qlogis()/plogis() or make.link(logit).



Doh! So obvious. I really should have noticed that. Thanks! I'll have a
look at the rest of your code too.
Cheers,
Chris



 Your setting resulting in an almost constant probability of zero inflation
 and hence gamma was completely off.

 Below is my cleaned up code which nicely replicates the results for n =
 100. The other two would require additional work because one would need to
 catch non-convergence etc.

 hth,
 Z

 ## data-generating process
 dgp - function(nobs = 100) {
  gamma - c(-1.5, 2)
  beta - c(1.5, -2)
  x - seq(0, 1, length.out = nobs)
  lambda - exp(beta[1] + beta[2] * x)
  p - plogis(gamma[1] + gamma[2] * x)
  y - ifelse(runif(nobs) = p, 0, rpois(nobs, lambda = lambda))
  data.frame(y = y, x = x)
 }

 ## simulation of coefficients and standard errors
 sim - function(nrep = 2000, ...) {
  onesim - function(i) {
d - dgp(...)
m - zeroinfl(y ~ x, data = d)
c(coef(m), sqrt(diag(vcov(m
  }
  t(sapply(1:nrep, onesim))
 }

 ## run simulation #3
 library(pscl)
 set.seed(1)
 cfse - sim(nobs = 100)

 ## mean coefficient estimates
 apply(cfse[, 1:4], 2, mean)

 ## median coefficient estimates
 apply(cfse[, 1:4], 2, median)

 ## sd of coefficient estimates
 apply(cfse[, 1:4], 2, sd)

 ## mean of the standard deviation estimates from observed information
 apply(cfse[, 5:8], 2, mean)



  Please cc me on a reply!
 Thanks,
 Chris

 ## ZIP simulations based on Lambert (1992)'s conditions

 # Empty workspace
 rm(list=ls())

 # Load zero-inflation package
 library(pscl)

 # Simulation conditions #3
 NumSimulations=2000
 X=seq(from=0,to=1,length.out=**N)
 Model.Matrix=cbind(rep(1,**length(X)),X)
 Gamma=c(-1.5,2)
 Beta=c(1.5,-2)
 P=exp(Model.Matrix%*%Gamma)/**exp(1+Model.Matrix%*%Gamma)
 Lambda=exp(Model.Matrix%*%**Beta)
 CoefSimulations=matrix(nrow=**NumSimulations,ncol=2*dim(**
 Model.Matrix)[2])

 for(i in 1 : NumSimulations){
 Lambda.Draw=rpois(N,Lambda)
 U=runif(N)
 Y=ifelse(U=P,0,Lambda.Draw)
 CoefSimulations[i,]=coef(**zeroinfl(Y~X|X))
 }

 # What were the estimates?
 colMeans(CoefSimulations) # My gamma estimates aren't even close ...

[[alternative HTML version deleted]]

 __**
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/**listinfo/r-helphttps://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/**
 posting-guide.html http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Lambert (1992) simulation

2012-04-26 Thread Christopher Desjardins
Hi,
I am trying to replicate Lambert (1992)'s simulation with zero-inflated
Poisson models. The citation is here:

@article{lambert1992zero,
Author = {Lambert, D.},
Journal = {Technometrics},
Pages = {1--14},
Publisher = {JSTOR},
Title = {Zero-inflated {P}oisson regression, with an application to defects
in manufacturing},
Year = {1992}}

Specifically I am trying to recreate Table 2. But my estimates for Gamma
are not working out. Any ideas why? Please cc me on a reply!
Thanks,
Chris

## ZIP simulations based on Lambert (1992)'s conditions

# Empty workspace
rm(list=ls())

# Load zero-inflation package
library(pscl)

# Simulation conditions #3
NumSimulations=2000
X=seq(from=0,to=1,length.out=N)
Model.Matrix=cbind(rep(1,length(X)),X)
Gamma=c(-1.5,2)
Beta=c(1.5,-2)
P=exp(Model.Matrix%*%Gamma)/exp(1+Model.Matrix%*%Gamma)
Lambda=exp(Model.Matrix%*%Beta)
CoefSimulations=matrix(nrow=NumSimulations,ncol=2*dim(Model.Matrix)[2])

for(i in 1 : NumSimulations){
Lambda.Draw=rpois(N,Lambda)
U=runif(N)
Y=ifelse(U=P,0,Lambda.Draw)
CoefSimulations[i,]=coef(zeroinfl(Y~X|X))
}

# What were the estimates?
colMeans(CoefSimulations) # My gamma estimates aren't even close ...

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] For loops

2012-04-09 Thread Christopher Desjardins
Hi,
I am having trouble with syntax for a for loop. Here is what I am trying to
do.

class=c(rep(1,3),rep(2,3),rep(3,3))
out1=rnorm(length(class))
out2=rnorm(length(class))
out3=rnorm(length(class))
data=data.frame(class,out1,out2,out3)

dat.split=split(data,data$class)
  for(i in 1:3){
  sub[i]=dat.split[i]
  }

However, the for loop doesn't work. I want to assign each split to a
different data object. Better yet, how I could assign each class to a
separate object and skip the splitting?

Thanks,
Chris

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Simplifying my code

2011-11-27 Thread Christopher Desjardins
Hi,
I have a pretty simple problem. Here is the code:

dat1=complete(dat.mice,1)

dat2=complete(dat.mice,2)

dat3=complete(dat.mice,3)

dat4=complete(dat.mice,4)

dat5=complete(dat.mice,5)

dat6=complete(dat.mice,6)

dat7=complete(dat.mice,7)

dat8=complete(dat.mice,8)

dat9=complete(dat.mice,9)

dat10=complete(dat.mice,10)

dat11=complete(dat.mice,11)

dat12=complete(dat.mice,12)

dat13=complete(dat.mice,13)

dat14=complete(dat.mice,14)

dat15=complete(dat.mice,15)

dat16=complete(dat.mice,16)

dat17=complete(dat.mice,17)

dat18=complete(dat.mice,18)

dat19=complete(dat.mice,19)

dat20=complete(dat.mice,20)


I would like to simplify this into a for loop. I thought this would work:

for(i in 1:20){

dat[i]=complete(dat.mice,[i]

}

But it doesn't. Any tips?

Chris

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] SPSS F-test on change in R square between hierarchical models

2011-11-23 Thread Christopher Desjardins
Hi,

I am wondering if anyone knows how to perform an F-test on the change in R
square between hierarchical models in R? SPSS provides this information and
a researcher that I am working with is interested in getting this
information. Alternatively, if someone knows how I can calculate the test
statistic (SPSS calls it F-change?) and dfs that would be helpful as well.

The output and the test I am looking for is available here -
http://dl.dropbox.com/u/1501309/spss_rsquared.pdf

Chris

On Wed, Nov 23, 2011 at 6:20 AM, Smart Guy smartgu...@gmail.com wrote:

 Hi All,
   I was adding a new row of data to my data frame using rbind(). I
 was surprised to see that after adding new row, I lost my data frame level
 attibute as well as  col level attribute. Please help me to insert a new
 row at frist or middle position so that my custom attribute is not lost.

  Here is what I did.

 age-c(15,20,18)
 weight-c(40,42,30)

 ### creating my data frame 
 mydata - data.frame(age,weight)

 ### creating data frame level attribute 
 attr(mydata,myattr)-c(myinfo)

 ### creating col level attribute for 'age' column  ###
 attr(mydata$age,mycolattr)-c(mycolinfo)

  Checking attributes  ###
 attributes(mydata)
 attributes(mydata$age)

 ### creating new row  #
 newrow - data.frame(age=16, weight= 42)

  Inserting newrow as first row to my data frame 
 mydata- rbind(newrow, mydata)

  Checking attributes again ###  I lost my custom attributes
 attributes(mydata$age)
 attributes(mydata)


 Thanks in advance,


 --
 SG

[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Plotting just a portion of a smoother graph in ggplot2

2011-08-04 Thread Christopher Desjardins
Hi,
I am using ggplot2 to with the following code:

gmathk2 -
qplot(time,math,colour=Kids,data=kids.ach.lm.k5,geom=smooth,method=lm,formula=y~ns(x,1))
+ opts(title=Smoother Plot: Math K-5) + xlab(Time) + ylab(Math) +
scale_colour_brewer(pal=Set1); gmathk2

This plots all the smoother for all the x values. What I'd like to do is
plot the smoother for the x values that are only greater than or equal to 0.
I don't want this:

gmathk2 -
qplot(time,math,colour=Kids,data=kids.ach.lm.k5,geom=smooth,method=lm,formula=y~ns(x,1))
+ opts(title=Smoother Plot: Math K-5) + xlab(Time) + ylab(Math)  +
scale_colour_brewer(pal=Set1) + xlim(0,50); gmathk2

Because adding xlim seems to throw away the data below 0 when calculating
the smoother. What I want it to do is have ggplot2 give me the same graph as
the first command but just not plot the part of the smoother that is below
0.

Thanks,
Chris

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Suppressing the labelling of tick marks on ggplot2

2011-07-09 Thread Christopher Desjardins
Hi,
I have the follow ggplot2 code I am running:

ggplot(data=bb.res.math,aes(x=factor(id.bb),y=bb.math.comb,fill=BB)) + 
geom_bar() + facet_grid(BB~.) + scale_fill_brewer(pal=Set1) + ylab(Average 
Student Residual (Math)) + xlab(Student ID)

The number of unique id.bb is 2207 and so my X-axis has a couple of thousands, 
indistinguishable tick marks that correspond to each id. I am wondering if 
there is a way to suppress these tick marks?

Here is some test data:

 test
  id.bb bb.res.m bb.math.comb BB
1  29001   -3.0032235 BB
2  29041   -1.8758770 BB
3  346910.8182852 BB
4  42131   -2.0660215 Not BB
5  42151   -4.8622086 BB
6  439510.8873996 Not BB

 ggplot(data=test,aes(x=factor(id.bb),y=bb.math.comb,fill=BB)) + geom_bar() + 
 facet_grid(BB~.) + scale_fill_brewer(pal=Set1) + ylab(Average Student 
 Residual (Math)) + xlab(Student ID)


Gives the following graph:

http://dl.dropbox.com/u/1501309/ggplot2_suppress_ids.pdf

I'd like to prevent 2900, 2904, 3469, etc. from showing up on the X-axis.

Thanks,
Chris
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Confidence bands in ggplot2

2011-07-07 Thread Christopher Desjardins
Hi,
I have the following data:

 est
 sch190  sch107  sch290  sch256  sch287  sch130  
sch139 
 4.16656026  2.64306071  4.22579866  6.12024789  4.49624748 11.12799127  
1.17353917 
 sch140  sch282  sch161  sch193  sch156  sch288  
sch352 
 3.48197696 -0.29659410 -1.99194986 10.23489859  7.77342138  6.77624539  
9.66795001 
 sch368  sch225  sch301  sch105  sch353  sch291  
sch179 
 7.20229569  4.41989204  5.61586860  5.99460203 -2.65019242 -9.42614560 
-0.25874193 
 sch134  sch135  sch324  sch360 bb1 
 3.26432479 10.52555091 -0.09637968  2.49668858 -3.24173545 

 se
   sch190sch107sch290sch256sch287sch130sch139sch140 
 3.165127  3.710750  4.680911  6.335386  3.896302  4.907679  4.426284  4.266303 
   sch282sch161sch193sch156sch288sch352sch368sch225 
 3.303747  4.550193  3.995261  5.787374  5.017278  7.820763  7.253183  4.483988 
   sch301sch105sch353sch291sch179sch134sch135sch324 
 4.076570  7.564359 10.456522  5.705474  4.247927  5.671536 10.567093  4.138356 
   sch360   bb1 
 4.943779  1.935142 

 sch
 [1] 190 107 290 256 287 130 139 140 282 161 193 156 
288
[14] 352 368 225 301 105 353 291 179 134 135 324 360 
BB 


From this data I have created 95% confidence intervals assuming a normal 
distribution. 

lower.95ci - est - se*qnorm(.975) 
upper.95ci - est + se*qnorm(.975) 

What I'd like to do is plot the estimate (est) and have lines attach to the 
points located in lower.95ci and upper.95ci.  Presently I am doing the 
following:

qplot(x=as.factor(sch),y=lower.95ci) + 
geom_point(aes(x=as.factor(sch),y=upper.95ci),colour=black) + 
geom_point(aes(x=as.factor(sch), y=est),colour=red) + ylab(Value-Added) + 
xlab(School) + theme_bw() 

Which creates this graph ---  
http://dl.dropbox.com/u/1501309/value_added_test.pdf

That's fine except that it doesn't connect the points vertically. Does anyone 
know how I could make the 'black' points connect to the 'red' point, i.e. show 
confidence bands?

Thanks,
Chris


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Confidence bands in ggplot2

2011-07-07 Thread Christopher Desjardins
Thanks that worked perfectly. One thing if I may. Is it possible to make the 
center dot red and the lines connecting the dots black?

Thanks,
Chris


On Jul 7, 2011, at 5:10 PM, Abhijit Dasgupta, PhD wrote:

 You can easily do this by:
 
 qplot(x=as.factor(sch),y=est, geom='point', colour='red') +
 geom_pointrange(aes(x=as.factor(sch), y=est, ymin=lower.95ci, 
 ymax=upper.95ci))+
 xlab('School') + ylab(Value-added)+theme_bw()
 
 
 
 
 On 07/07/2011 05:55 PM, Christopher Desjardins wrote:
 Hi,
 I have the following data:
 
 est
  sch190  sch107  sch290  sch256  sch287  sch130  
 sch139
  4.16656026  2.64306071  4.22579866  6.12024789  4.49624748 11.12799127  
 1.17353917
  sch140  sch282  sch161  sch193  sch156  sch288  
 sch352
  3.48197696 -0.29659410 -1.99194986 10.23489859  7.77342138  6.77624539  
 9.66795001
  sch368  sch225  sch301  sch105  sch353  sch291  
 sch179
  7.20229569  4.41989204  5.61586860  5.99460203 -2.65019242 -9.42614560 
 -0.25874193
  sch134  sch135  sch324  sch360 bb1
  3.26432479 10.52555091 -0.09637968  2.49668858 -3.24173545
 
 se
sch190sch107sch290sch256sch287sch130sch139
 sch140
  3.165127  3.710750  4.680911  6.335386  3.896302  4.907679  4.426284  
 4.266303
sch282sch161sch193sch156sch288sch352sch368
 sch225
  3.303747  4.550193  3.995261  5.787374  5.017278  7.820763  7.253183  
 4.483988
sch301sch105sch353sch291sch179sch134sch135
 sch324
  4.076570  7.564359 10.456522  5.705474  4.247927  5.671536 10.567093  
 4.138356
sch360   bb1
  4.943779  1.935142
 
 sch
  [1] 190 107 290 256 287 130 139 140 282 161 193 156 
 288
 [14] 352 368 225 301 105 353 291 179 134 135 324 360 
 BB
 
 
 From this data I have created 95% confidence intervals assuming a normal 
 distribution.
 
 lower.95ci- est - se*qnorm(.975)
 upper.95ci- est + se*qnorm(.975)
 
 What I'd like to do is plot the estimate (est) and have lines attach to the 
 points located in lower.95ci and upper.95ci.  Presently I am doing the 
 following:
 
 qplot(x=as.factor(sch),y=lower.95ci) + 
 geom_point(aes(x=as.factor(sch),y=upper.95ci),colour=black) + 
 geom_point(aes(x=as.factor(sch), y=est),colour=red) + ylab(Value-Added) 
 + xlab(School) + theme_bw()
 
 Which creates this graph ---   
 http://dl.dropbox.com/u/1501309/value_added_test.pdf
 
 That's fine except that it doesn't connect the points vertically. Does 
 anyone know how I could make the 'black' points connect to the 'red' point, 
 i.e. show confidence bands?
 
 Thanks,
 Chris
 
 
  [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Recoding several variables into one use the most recent data

2011-06-27 Thread Christopher Desjardins
Hi,
I have the following data management issue. I am trying to combine multiple
years of ethnicity data into one variable called ethnic. The data looks
similar to the following

idethnic07ethnic08   ethnic09ethnic10
1  1  1 11
2  1  1 22
3  3  4 4NA
4  2  3   NANA

So, what I'd like to do is create a variable called 'ethnic' and I'd like to
have this variable be filled with the most recent data available. So
ethnic10 would have the highest priority, then ethnic09, followed by
ethnic08, and finally ethnic07.  So the ethnic variable based on the data
above would look like the following:

ethnic
   1
   2
   4
   3

I thought an ifelse() statement might work but I seem to be writing over my
data every time I do this.

Thanks,
Chris

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Change pattern in histograms in ggplot2

2011-05-18 Thread Christopher Desjardins
Hi,
I am wondering if there is a way to change the pattern of the fill in
histogram in ggplot2? By default the fill is solid and I'd like to add some
sort of pattern to make it more visible that these are different levels of a
factor.

Thanks!
Chris

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Changing order of facet grid in ggplot2

2011-05-18 Thread Christopher Desjardins
Hi I am running the following code:

   sym - c(sym1,sym2,sym4)

lifedxm - c(O-BD,O-WELL,O-UNI)

life - c(lifedxm,lifedxm,lifedxm)

tp - c(TP-ANY,TP-ANY, TP-ANY, TP-SUB, TP-SUB, TP-SUB, TP-CLIN
, TP-CLIN, TP-CLIN)

data - data.frame(sym,life,tp)

qplot(life,geom=bar,weight=sym,ylim=c(0,1),legend=F,data=data) +
facet_grid(. ~ tp)



This creates a facet grid where TP-ANY is followed by TP-CLIN and  then
TP-SUB. I'd like to create a grid where TP-ANY is followed by TP-SUB then
TP-CLIN.


Is this possible?


Thanks,

Chris

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Getting number of students with zeroes in long format

2011-04-07 Thread Christopher Desjardins
Hi Jorge,
I want to make sure this does what I want.

So I want to get a count of students that never get a suspension. Once a
student has a non-zero I don't want to count that student. Each id_r is may
be associated with multiple sus. Are these commands doing this? Because ...

 suslm[175953:nrow(suslm),c(id_r,sus)]
   id_r  sus
999881.5 999881   1
999881.6 999881   7
999881.7 999881   0
999881.8 999881   0
999886.5 999886   0
999886.6 999886   0
999886.7 999886   0
999886.8 999886   0
999890.5 999890   0
999890.6 999890   0
999890.7 999890   0
999890.8 999890   0
999892.5 999892   0
999892.6 999892   0
999892.7 999892   0
999892.8 999892   0
999896.5 999896   0
999896.6 999896   4
999896.7 999896   3
999896.8 999896   0
999897.5 999897   0
999897.6 999897   0
999897.7 999897   0

 tail(with(suslm,tapply(sus,id_r,function(x) any(x==0
999881 999886 999890 999892 999896 999897
  TRUE   TRUE   TRUE   TRUE   TRUE   TRUE
 r - with(suslm, tapply(sus, id_r, function(x) any(x  0))
 tail(with(suslm, tapply(sus, id_r, function(x) any(x  0
999881 999886 999890 999892 999896 999897
  TRUE  FALSE  FALSE  FALSE   TRUE  FALSE

Based on this 999881 and 999896 should be FALSE not TRUE

I would expect if they were true for the first command they should be false
for the second command right?

 tail(names(r[ r == TRUE ]))
[1] 999752 999767 999806 999807 999881 999896
 tail(names(r[ r == FALSE ]))
[1] 999869 999870 999886 999890 999892 999897

This command seems to do the right thing. Is that right?


On Wed, Apr 6, 2011 at 10:25 PM, Jorge Ivan Velez
jorgeivanve...@gmail.comwrote:

 Hi Chris,

 Sorry I did not see your email before ;-)   Here is one option:

   r - with(d, tapply(sus, id_r, function(x) any(x  0)))
  r
111516181920212224252630
  3132
 FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE
 FALSE FALSE
33
 FALSE
  names(r[ r == TRUE ])
 [1] 15 25

 Regards,
 Jorge


 On Wed, Apr 6, 2011 at 5:03 PM, Christopher Desjardins  wrote:

 Thanks. And how many could I find that have greater than 0?
 Chris


 On Wed, Apr 6, 2011 at 3:58 PM, Jorge Ivan Velez  wrote:

 Hi Chris,

 Is this what you have in mind?

  sum(with(yourdata, tapply(sus, id_r, function(x) any(x==0
 [1] 13

 HTH,
 Jorge


 On Wed, Apr 6, 2011 at 4:44 PM, Christopher Desjardins  wrote:

 Hi,
 I have longitudinal school suspension data on students. I would like to
 figure out how many students (id_r) have no suspensions (sus), i.e. have
 a
 code of '0'. My data is in long format and the first 20 records look
 like
 the following:

  suslm[1:20,c(1,7)]
   id_r sus
   11   0
   15  10
   16   0
   18   0
   19   0
   19   0
   20   0
   21   0
   21   0
   22   0
   24   0
   24   0
   25   3
   26   0
   26   0
   30   0
   30   0
   31   0
   32   0
   33   0

 Each id_r is unique and I'd like to know the number of id_r that have a
 0
 for sus not the total number of 0. Does that make sense?
 Thanks!
 Chris

[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.






[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Getting number of students with zeroes in long format

2011-04-07 Thread Christopher Desjardins
On Thu, Apr 7, 2011 at 8:07 AM, Christopher Desjardins 
cddesjard...@gmail.com wrote:

 Hi Jorge,
 I want to make sure this does what I want.

 So I want to get a count of students that never get a suspension. Once a
 student has a non-zero I don't want to count that student. Each id_r is may
 be associated with multiple sus. Are these commands doing this? Because ...


Also sus = suspension


  suslm[175953:nrow(suslm),c(id_r,sus)]
id_r  sus
 999881.5 999881   1
 999881.6 999881   7
 999881.7 999881   0
 999881.8 999881   0
 999886.5 999886   0
 999886.6 999886   0
 999886.7 999886   0
 999886.8 999886   0
 999890.5 999890   0
 999890.6 999890   0
 999890.7 999890   0
 999890.8 999890   0
 999892.5 999892   0
 999892.6 999892   0
 999892.7 999892   0
 999892.8 999892   0
 999896.5 999896   0
 999896.6 999896   4
 999896.7 999896   3
 999896.8 999896   0
 999897.5 999897   0
 999897.6 999897   0
 999897.7 999897   0
 
  tail(with(suslm,tapply(sus,id_r,function(x) any(x==0
 999881 999886 999890 999892 999896 999897
   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE
  r - with(suslm, tapply(sus, id_r, function(x) any(x  0))
  tail(with(suslm, tapply(sus, id_r, function(x) any(x  0
 999881 999886 999890 999892 999896 999897
   TRUE  FALSE  FALSE  FALSE   TRUE  FALSE

 Based on this 999881 and 999896 should be FALSE not TRUE


for tail(with(suslm,tapply(sus,id_r,function(x) any(x==0


 I would expect if they were true for the first command they should be false
 for the second command right?

  tail(names(r[ r == TRUE ]))
 [1] 999752 999767 999806 999807 999881 999896
  tail(names(r[ r == FALSE ]))
 [1] 999869 999870 999886 999890 999892 999897

 This command seems to do the right thing. Is that right?


 On Wed, Apr 6, 2011 at 10:25 PM, Jorge Ivan Velez 
 jorgeivanve...@gmail.com wrote:

 Hi Chris,

 Sorry I did not see your email before ;-)   Here is one option:

   r - with(d, tapply(sus, id_r, function(x) any(x  0)))
  r
111516181920212224252630
  3132
 FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE
 FALSE FALSE
33
 FALSE
  names(r[ r == TRUE ])
 [1] 15 25

 Regards,
 Jorge


 On Wed, Apr 6, 2011 at 5:03 PM, Christopher Desjardins  wrote:

  Thanks. And how many could I find that have greater than 0?
 Chris


 On Wed, Apr 6, 2011 at 3:58 PM, Jorge Ivan Velez  wrote:

 Hi Chris,

 Is this what you have in mind?

  sum(with(yourdata, tapply(sus, id_r, function(x) any(x==0
 [1] 13

 HTH,
 Jorge


 On Wed, Apr 6, 2011 at 4:44 PM, Christopher Desjardins  wrote:

 Hi,
 I have longitudinal school suspension data on students. I would like to
 figure out how many students (id_r) have no suspensions (sus), i.e.
 have a
 code of '0'. My data is in long format and the first 20 records look
 like
 the following:

  suslm[1:20,c(1,7)]
   id_r sus
   11   0
   15  10
   16   0
   18   0
   19   0
   19   0
   20   0
   21   0
   21   0
   22   0
   24   0
   24   0
   25   3
   26   0
   26   0
   30   0
   30   0
   31   0
   32   0
   33   0

 Each id_r is unique and I'd like to know the number of id_r that have a
 0
 for sus not the total number of 0. Does that make sense?
 Thanks!
 Chris

[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.







[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Getting number of students with zeroes in long format

2011-04-06 Thread Christopher Desjardins
Hi,
I have longitudinal school suspension data on students. I would like to
figure out how many students (id_r) have no suspensions (sus), i.e. have a
code of '0'. My data is in long format and the first 20 records look like
the following:

 suslm[1:20,c(1,7)]
   id_r sus
   11   0
   15  10
   16   0
   18   0
   19   0
   19   0
   20   0
   21   0
   21   0
   22   0
   24   0
   24   0
   25   3
   26   0
   26   0
   30   0
   30   0
   31   0
   32   0
   33   0

Each id_r is unique and I'd like to know the number of id_r that have a 0
for sus not the total number of 0. Does that make sense?
Thanks!
Chris

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Getting number of students with zeroes in long format

2011-04-06 Thread Christopher Desjardins
On Wed, Apr 6, 2011 at 4:03 PM, Douglas Bates ba...@stat.wisc.edu wrote:

 On Wed, Apr 6, 2011 at 3:44 PM, Christopher Desjardins
 cddesjard...@gmail.com wrote:
  Hi,
  I have longitudinal school suspension data on students. I would like to
  figure out how many students (id_r) have no suspensions (sus), i.e. have
 a
  code of '0'. My data is in long format and the first 20 records look like
  the following:
 
  suslm[1:20,c(1,7)]
id_r sus
11   0
15  10
16   0
18   0
19   0
19   0
20   0
21   0
21   0
22   0
24   0
24   0
25   3
26   0
26   0
30   0
30   0
31   0
32   0
33   0
 
  Each id_r is unique and I'd like to know the number of id_r that have a 0
  for sus not the total number of 0. Does that make sense?

 You say you have longitudinal data so may we assum that a particular
 id_r can occur multiple times in the data set?


Yes an id_r can occur multiple times in the data set.


 It is not clear to me
 what you want the result to be for students who have no suspensions at
 one time but may have a suspension at another time.  Are you
 interested in the number of students who have only zeros in the sus
 column?


Yes. Once a student has a value other than zero I don't want to include that
student in the tally. So I want to know how many students never got
suspended during the study.



 One way to approach this task is to use tapply.  I would create a data
 frame and convert id_r to a factor.

 df - within(as.data.frame(suslm), id_r - factor(id_r))
 counts - with(df, lapply(sus, id_r, function(sus) all(sus == 0)))



I am getting the following message:

 df - within(as.data.frame(suslm), id_r - factor(id_r))
 counts - with(df, lapply(sus, id_r, function(sus) all(sus == 0)))
Error in get(as.character(FUN), mode = function, envir = envir) :
  object 'id_r' of mode 'function' was not found


Thanks,
Chris


 The tapply function will split the vector sus according to the levels
 of id_r and apply the function to the subvectors.

 I just say Jorge's response and he uses the same tactic but he is
 looking for students who had any value of sus==0


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Quick recode of -999 to NA in R

2011-03-30 Thread Christopher Desjardins
Hi,
I am trying to write a loop to recode my data from -999 to NA in R. What's
the most efficient way to do this? Below is what I'm presently doing, which
is inefficient. Thanks,
Chris


   dat0 - read.table(time1.dat)

colnames(dat0) - c(e1dq, e1arcp, e1dev, s1prcp, s1nrcp, s1ints,
a1gpar, a1pias, a1devt)

dat0[dat0$e1dq==-999.,e1dq] - NA

dat0[dat0$e1arcp==-999.,e1arcp] - NA

dat0[dat0$e1dev==-999.,e1dev] - NA

dat0[dat0$s1prcp==-999.,s1prcp] - NA

dat0[dat0$s1nrcp==-999.,s1nrcp] - NA

dat0[dat0$s1ints==-999.,s1ints] - NA

dat0[dat0$a1gpar==-999.,a1gpar] - NA

dat0[dat0$a1pias==-999.,a1pias] - NA

dat0[dat0$a1devt==-999.,a1devt] - NA

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Quick recode of -999 to NA in R

2011-03-30 Thread Christopher Desjardins
Ah ... yes. I knew that but clearly didn't at the time of my question or
script writing.
Thanks,
Chris

On Wed, Mar 30, 2011 at 8:22 AM, Henrique Dallazuanna www...@gmail.comwrote:

 Try:

 dat0 - read.table('tim1.dat', na = -999)


Ah ... yes. I knew that but clearly didn't at the time of my question or
script writing.
Thanks,
Chris


 On Wed, Mar 30, 2011 at 10:15 AM, Christopher Desjardins
 cddesjard...@gmail.com wrote:
  Hi,
  I am trying to write a loop to recode my data from -999 to NA in R.
 What's
  the most efficient way to do this? Below is what I'm presently doing,
 which
  is inefficient. Thanks,
  Chris
 
 
dat0 - read.table(time1.dat)
 
  colnames(dat0) - c(e1dq, e1arcp, e1dev, s1prcp, s1nrcp,
 s1ints,
  a1gpar, a1pias, a1devt)
 
  dat0[dat0$e1dq==-999.,e1dq] - NA
 
  dat0[dat0$e1arcp==-999.,e1arcp] - NA
 
  dat0[dat0$e1dev==-999.,e1dev] - NA
 
  dat0[dat0$s1prcp==-999.,s1prcp] - NA
 
  dat0[dat0$s1nrcp==-999.,s1nrcp] - NA
 
  dat0[dat0$s1ints==-999.,s1ints] - NA
 
  dat0[dat0$a1gpar==-999.,a1gpar] - NA
 
  dat0[dat0$a1pias==-999.,a1pias] - NA
 
  dat0[dat0$a1devt==-999.,a1devt] - NA
 
 [[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 



 --
 Henrique Dallazuanna
 Curitiba-Paraná-Brasil
 25° 25' 40 S 49° 16' 22 O


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Quick recode of -999 to NA in R

2011-03-30 Thread Christopher Desjardins
On Wed, Mar 30, 2011 at 10:51 AM, peter dalgaard pda...@gmail.com wrote:


 On Mar 30, 2011, at 16:05 , Christopher Desjardins wrote:

 
  dat0 - read.table('tim1.dat', na = -999)
 
 
  Ah ... yes. I knew that but clearly didn't at the time of my question or
  script writing.
  Thanks,
  Chris

 Depending on where your data came from, you could get caught by the fact
 that the above is really ...na.strings=-999... and that is not going to
 work if the actual code is (say) -999.00.

 Another straightforward option is dat0[dat0 == -999] - NA


Thanks Peter. That way does work well too and it does what I need.
Chris



 --
 Peter Dalgaard
 Center for Statistics, Copenhagen Business School
 Solbjerg Plads 3, 2000 Frederiksberg, Denmark
 Phone: (+45)38153501
 Email: pd@cbs.dk  Priv: pda...@gmail.com



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Creating 3 vectors that sum to 1

2011-03-29 Thread Christopher Desjardins
I have 3 vectors: p1, p2, and p3. I would like each vector to be any
possible value between 0 and 1 and p1 + p2 + p3 = 1. I want to graph these
and I've thought about using scatterplot3d(). Here's what I have so far.

library(scatterplot3d)
p1 - c(1,0,0,.5,.5,0,.5,.25,.25,.34,.33,.33,.8,.1,.1,.9,.05,.05)
p2 - c(0,1,0,.5,0,.5,.25,.5,.25,.33,.34,.33,.1,.8,.1,.05,.9,.05)
p3 - c(0,0,1,0,.5,.5,.25,.25,.5,.33,.33,.34,.1,.1,.8,.05,.05,.9)
scatterplot3d(p1,p2,p3)


However, I wonder if there is an easy way to create vectors p1, p2, and p3.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Creating 3 vectors that sum to 1

2011-03-29 Thread Christopher Desjardins
Thanks several of these do exactly what I want.
Chris

On Tue, Mar 29, 2011 at 2:42 PM, Greg Snow greg.s...@imail.org wrote:

 Or we could expand a bit more:

 require(TeachingDemos)
 require(gtools)

 n - 1000
 rtrg - matrix(NA, n, 3)
 for (i in 1:n) rtrg[i,] - diff(c(0, sort(runif(2)), 1))

 rtrg2 - matrix(NA, n, 3)
 for (i in 1:n) {
 tmp - runif(3)
 rtrg2[i, ] - tmp/sum(tmp)
 }

 rtrg3 - matrix( rexp(n*3), ncol=3 )
 rtrg3 - rtrg3/rowSums(rtrg3)

 rtrg4 - rdirichlet(n, rep(1,3))

 par(mfrow=c(2,2))
 triplot(rtrg, pch='.')  # Looks more uniformly distributed
 triplot(rtrg2, col=2, pch='.')  # Corners are sparsely populated
 triplot(rtrg3, col=3, pch='.')
 triplot(rtrg4, col=4, pch='.')




 What could also be interesting in using vis.test (also TeachingDemos) to
 see which can be told apart from each other.  My guess is that rtrg2 method
 will be visible different from the other 3, but the other 3 will be
 indistinguishable from each other.

 The last 2 have the advantage (the 2nd could be rewritten to have the same
 advantage) of being much quicker, not sure how to speed up the 1st
 noticibly.



 --
 Gregory (Greg) L. Snow Ph.D.
 Statistical Data Center
 Intermountain Healthcare
 greg.s...@imail.org
 801.408.8111


  -Original Message-
  From: Ravi Varadhan [mailto:rvarad...@jhmi.edu]
  Sent: Tuesday, March 29, 2011 12:59 PM
  To: Ravi Varadhan
  Cc: Greg Snow; r-help@r-project.org
  Subject: Re: [R] Creating 3 vectors that sum to 1
 
 
  Here is an exploration of two different 3-tuple generators (that sum to
  1) based on Greg's triplot function:
 
  require(TeachingDemos)
 
  n - 1000
  rtrg - matrix(NA, n, 3)
  for (i in 1:n) rtrg[i,] - diff(c(0, sort(runif(2)), 1))
 
  rtrg2 - matrix(NA, n, 3)
  for (i in 1:n) {
  tmp - runif(3)
  rtrg2[i, ] - tmp/sum(tmp)
  }
 
  par(mfrow=c(2,1))
  triplot(rtrg)  # Looks more uniformly distributed
  triplot(rtrg2, col=2)  # Corners are sparsely populated
 
  Ravi.
  
 
  Ravi Varadhan, Ph.D.
  Assistant Professor,
  Division of Geriatric Medicine and Gerontology
  School of Medicine
  Johns Hopkins University
 
  Ph. (410) 502-2619
  email: rvarad...@jhmi.edu
 
 
  - Original Message -
  From: Ravi Varadhan rvarad...@jhmi.edu
  Date: Tuesday, March 29, 2011 2:33 pm
  Subject: Re: [R] Creating 3 vectors that sum to 1
  To: Greg Snow greg.s...@imail.org
  Cc: r-help@r-project.org r-help@r-project.org
 
 
   The following one-liner generates uniformly distributed 3-tuples that
   sum to 1:
  
   diff(c(0, sort(runif(2)), 1))
  
   More, generally you can generate n-tuples that sum to unity as:
  
   diff(c(0, sort(runif(n-1)), 1))
  
  
   Ravi.
  
   
  
   Ravi Varadhan, Ph.D.
   Assistant Professor,
   Division of Geriatric Medicine and Gerontology
   School of Medicine
   Johns Hopkins University
  
   Ph. (410) 502-2619
   email: rvarad...@jhmi.edu
  
  
   - Original Message -
   From: Greg Snow greg.s...@imail.org
   Date: Tuesday, March 29, 2011 1:02 pm
   Subject: Re: [R] Creating 3 vectors that sum to 1
   To: Christopher Desjardins cddesjard...@gmail.com,
   r-help@r-project.org r-help@r-project.org
  
  
Do a search for Dirichlet, that may give you the tools you need.
   Also
for plotting 3 vectors that sum to 1, instead of a 3d scatter plot
   you
should look into a triangle or trilinear plot, see ?triplot in the
TeachingDemos package (the see also for that help page lists
  several
  
other implementations in other packages as well).
   
--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111
   
   
 -Original Message-
 From: r-help-boun...@r-project.org [
 project.org] On Behalf Of Christopher Desjardins
 Sent: Tuesday, March 29, 2011 10:20 AM
 To: r-help@r-project.org
 Subject: [R] Creating 3 vectors that sum to 1

 I have 3 vectors: p1, p2, and p3. I would like each vector to be
  any
 possible value between 0 and 1 and p1 + p2 + p3 = 1. I want to
  graph
 these
 and I've thought about using scatterplot3d(). Here's what I have
  so
 far.

 library(scatterplot3d)
 p1 - c(1,0,0,.5,.5,0,.5,.25,.25,.34,.33,.33,.8,.1,.1,.9,.05,.05)
 p2 - c(0,1,0,.5,0,.5,.25,.5,.25,.33,.34,.33,.1,.8,.1,.05,.9,.05)
 p3 - c(0,0,1,0,.5,.5,.25,.25,.5,.33,.33,.34,.1,.1,.8,.05,.05,.9)
 scatterplot3d(p1,p2,p3)


 However, I wonder if there is an easy way to create vectors p1,
   p2,
and
 p3.

 [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list

 PLEASE do read the posting guide
 guide.html
 and provide commented, minimal, self-contained, reproducible
  code

[R] Sweave and Textwrangler

2011-03-26 Thread Christopher Desjardins
Hi,
I am trying to get TextWrangler to work with LaTeX and Sweave. Ideally
I would call a script from TextWrangler that would run Sweave on a
document, then LaTeX (using SyncTeX), and finally
open the corresponding pdf in Skim. Of course I don't always need to
run Sweave and would be looking for the option of LaTeX - Open.  I
found this http://www.xs4all.nl/~msneep/latex/ but it doesn't work for
me as it gives me an error. Something similar to do that would be
ideal. Additionally the option to use BibTeX would be great too.

Additionally, I tried this
http://www.stat.umn.edu/~arendahl/computing/index.html
and Sweave.sh script and trying to alter that (and the *.engine files) to
work with TextWrangler but to no avail.

Does anyone do this and have a tip they could give me? Sorry if this is
off-topic (if so please reply to me off the list).

Thanks!
Chris

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] RWinEdt and WinEdt 6.0

2010-07-15 Thread Christopher Desjardins
Hi,
I am curious if the status of RWinEdt and WinEdt 6.0 has changed since this
thread
http://r.789695.n4.nabble.com/RWinEdt-in-WinEdt-6-td2174285.html

Thanks for the help (especially Uwe Ligges for writing RWinEdt).
Chris

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Calling Gnuplot from R

2010-07-08 Thread Christopher Desjardins
Hi,
I am wondering if there is a way to call Gnuplot from R and/or if anyone can
recommend a package on CRAN capable of doing this?
Thanks,
Chris
PS - Please cc me on the response.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] combining 4 vectors into one

2009-07-19 Thread Christopher Desjardins

Hi,
How can I perform the following

y1y2y3y4
1  0   3   2
0  1   2   1
3  2   0   1
0  5   1   0

into ...

 y
 1 0 3 2 0 1 2 1 3 2 0 1 0 5 1 0


Please cc me on reply as I subscribe to the digest.
Thanks!
Chris

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] combining 4 vectors into one

2009-07-19 Thread Christopher Desjardins
Thanks. That worked.
Chris

On 7/19/09 6:41 PM, jim holtman wrote:
 Try this:


 x
  
y1 y2 y3 y4
 1  1  0  3  2
 2  0  1  2  1
 3  3  2  0  1
 4  0  5  1  0

 as.vector(t(x))
  
   [1] 1 0 3 2 0 1 2 1 3 2 0 1 0 5 1 0




 On Sun, Jul 19, 2009 at 7:25 PM, Christopher
 Desjardinscddesjard...@gmail.com  wrote:

 Hi,
 How can I perform the following

 y1y2y3y4
 1  0   3   2
 0  1   2   1
 3  2   0   1
 0  5   1   0

 into ...

  
 y

   1 0 3 2 0 1 2 1 3 2 0 1 0 5 1 0


 Please cc me on reply as I subscribe to the digest.
 Thanks!
 Chris

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

  






[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Error with r2winbugs

2009-07-16 Thread Christopher Desjardins

Hi,
I am trying to do run the following model saved in C:/bugs/sus.bug

model {
  for (i in 1:n){
y[i] ~ dpois(lamdba[i])
log(lambda[i]) - mu+bmale[male[i]]+bschn[schn[i]]+epsilon[i] #
epsilon[i] ~ dnorm(0,tau.epsilon)
  }
  mu ~ dnorm(0,.0001)
  bmale ~ dnorm(0,.0001)
  tau.epsilon - pow(sigma.epsilon, -2)
  sigma.epsilon ~ dunif(0,100)
  for (j in 1:n.schn){
bschn[j] ~ dnorm(0,tau.schn)
  }
  tau.schn - pow(sigma.schn, -2)
  sigma.schn ~ dunif(0,100)
}

I am running the following syntax:

enroll.wide.m - na.omit(enroll.wide)
n - nrow(enroll.wide.m)
y - enroll.wide.m$sus.5
sch - as.vector(enroll.wide.m$schno.5)
uniq - unique(sch)
n.schn - length(uniq)
male - enroll.wide.m$male
schn - enroll.wide.m$schno.5
data - list (n,n.schn,y,male,schn)
inits - function() {list (bschn=rnorm(n.schn), mu=rnorm(1,0,100), 
bmale=rnorm(1,0,100), 
sigma.epsilon=runif(1,0,100),sigma.schn=runif(1,0,100))}

parameters - c(bschn,mu,bmale,sigma.epsilon,sigma.schn)
sus.sim - 
bugs(data,inits,parameters,C:/bugs/susc.bug,n.chains=3,n.iter=10,debug=TRUE)


And get the following error:

display(log)
check(C:\DOCUME~1\CHRIST~1\LOCALS~1\Temp\Rtmpyg7eDj/susc.bug.txt)
empty slot not allowed in variable name
data(C:/DOCUME~1/CHRIST~1/LOCALS~1/Temp/Rtmpyg7eDj/data.txt)
command #Bugs:data cannot be executed (is greyed out)
compile(3)
inits(1,C:/DOCUME~1/CHRIST~1/LOCALS~1/Temp/Rtmpyg7eDj/inits1.txt)
command #Bugs:inits cannot be executed (is greyed out)
inits(2,C:/DOCUME~1/CHRIST~1/LOCALS~1/Temp/Rtmpyg7eDj/inits2.txt)
command #Bugs:inits cannot be executed (is greyed out)
inits(3,C:/DOCUME~1/CHRIST~1/LOCALS~1/Temp/Rtmpyg7eDj/inits3.txt)
command #Bugs:inits cannot be executed (is greyed out)
gen.inits()
command #Bugs:gen.inits cannot be executed (is greyed out)
thin.updater(1)
update(5)
command #Bugs:update cannot be executed (is greyed out)
set(bschn)
command #Bugs:set cannot be executed (is greyed out)
set(mu)
command #Bugs:set cannot be executed (is greyed out)
set(bmale)
command #Bugs:set cannot be executed (is greyed out)
set(sigma.epsilon)
command #Bugs:set cannot be executed (is greyed out)
set(sigma.schn)
command #Bugs:set cannot be executed (is greyed out)
set(deviance)
command #Bugs:set cannot be executed (is greyed out)
dic.set()
command #Bugs:dic.set cannot be executed (is greyed out)
update(5)
command #Bugs:update cannot be executed (is greyed out)
coda(*,C:/DOCUME~1/CHRIST~1/LOCALS~1/Temp/Rtmpyg7eDj/coda)
command #Bugs:coda cannot be executed (is greyed out)
stats(*)
command #Bugs:stats cannot be executed (is greyed out)
dic.stats()

DIC
history(*,C:/DOCUME~1/CHRIST~1/LOCALS~1/Temp/Rtmpyg7eDj/history.odc)
command #Bugs:history cannot be executed (is greyed out)
save(C:/DOCUME~1/CHRIST~1/LOCALS~1/Temp/Rtmpyg7eDj/log.odc)
save(C:/DOCUME~1/CHRIST~1/LOCALS~1/Temp/Rtmpyg7eDj/log.txt)

Any help would be greatly appreciate. Also please cc me as I am digest 
subscriber.

Chris

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How to combine two data frames by ID

2009-07-14 Thread Christopher Desjardins

Hi,

I am trying to combine two data frames by ID. The first data frame is 
the whole data set and the second data frame is a subset of the first. 
What I would like to do is take the values from variable, p1, from the 
second data frame and merge them back into that variable in the first 
data frame. I have ID variables that match for both data frames and the 
data frames obviously have differing number of rows where the first data 
frame  second data frame. I want to keep the pre existing data for this 
variable that is present in the first data frame and only write over the 
data that I edited in the second.


Thanks and please cc me as I am a digest subscriber,
Chris

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ggplot2 x axis question

2009-06-29 Thread Christopher Desjardins
Hi Hadley,
Thanks for the reply and the great graphing package. That code is giving 
me the following error:

  qplot(reorder(model,delta),delta,data=growthm.bic)
Error in UseMethod(reorder) : no applicable method for reorder

Cheers,
Chris

On 6/28/09 8:21 PM, hadley wickham wrote:
 Hi Chris,

 Try this:

 qplot(reorder(model, delta), delta, data = growthm.bic)

 Hadley

 On Sun, Jun 28, 2009 at 9:53 AM, Christopher
 Desjardinscddesjard...@gmail.com  wrote:

 Hi,
 I have 45 models that I have named: 1, 2, 3, ... , 45 and I am trying to
 plot them in order of ascending BIC values. I am however unclear as to how I
 can get the models to line up on the x-axis by BIC and not by numeric order.
 For example, if model 5 has a lower BIC than 1, I want it to be the first
 point on the left hand side of the curve. This seems to work in plot:

 plot(1:45, growthm.bic$delta, type=b, xaxt = n, xlab=Model,
 ylab=expression(Delta[k]))   # where growthm.bic$delta is my BIC value
 axis(1, at=1:45, labels=growthm.bic$Model) #where model is the name of the
 45 models examined, i.e 1:45

 Currently using qplot I have this which doesn't not work as it arranges the
 BIC values in order from 1:45.

 qplot(model,delta,data=growthm.bic)

 Thanks. Also please cc me as I am a digest subscriber.
 Chris

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

  






[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ggplot2 x axis question

2009-06-29 Thread Christopher Desjardins
Thanks Hadley that worked.
Chris

On 6/29/09 11:05 AM, hadley wickham wrote:
 In that case, try:

 qplot(reorder(factor(model),delta),delta,data=growthm.bic)

 Deepayan: do you think there should also be a numeric method for reorder?

 Hadley

 On Mon, Jun 29, 2009 at 10:39 AM, Christopher
 Desjardinscddesjard...@gmail.com  wrote:

 Hi Hadley,
 Thanks for the reply and the great graphing package. That code is giving me
 the following error:

  
 qplot(reorder(model,delta),delta,data=growthm.bic)

 Error in UseMethod(reorder) : no applicable method for reorder

 Cheers,
 Chris

 On 6/28/09 8:21 PM, hadley wickham wrote:

 Hi Chris,

 Try this:

 qplot(reorder(model, delta), delta, data = growthm.bic)

 Hadley

 On Sun, Jun 28, 2009 at 9:53 AM, Christopher
 Desjardinscddesjard...@gmail.com  wrote:


 Hi,
 I have 45 models that I have named: 1, 2, 3, ... , 45 and I am trying to
 plot them in order of ascending BIC values. I am however unclear as to how I
 can get the models to line up on the x-axis by BIC and not by numeric order.
 For example, if model 5 has a lower BIC than 1, I want it to be the first
 point on the left hand side of the curve. This seems to work in plot:

 plot(1:45, growthm.bic$delta, type=b, xaxt = n, xlab=Model,
 ylab=expression(Delta[k]))   # where growthm.bic$delta is my BIC value
 axis(1, at=1:45, labels=growthm.bic$Model) #where model is the name of the
 45 models examined, i.e 1:45

 Currently using qplot I have this which doesn't not work as it arranges the
 BIC values in order from 1:45.

 qplot(model,delta,data=growthm.bic)

 Thanks. Also please cc me as I am a digest subscriber.
 Chris

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.





  






[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ggplot2 x axis question

2009-06-28 Thread Christopher Desjardins

Hi,
I have 45 models that I have named: 1, 2, 3, ... , 45 and I am trying to 
plot them in order of ascending BIC values. I am however unclear as to 
how I can get the models to line up on the x-axis by BIC and not by 
numeric order. For example, if model 5 has a lower BIC than 1, I want it 
to be the first point on the left hand side of the curve. This seems to 
work in plot:


plot(1:45, growthm.bic$delta, type=b, xaxt = n, xlab=Model, 
ylab=expression(Delta[k]))   # where growthm.bic$delta is my BIC value
axis(1, at=1:45, labels=growthm.bic$Model) #where model is the name of 
the 45 models examined, i.e 1:45


Currently using qplot I have this which doesn't not work as it arranges 
the BIC values in order from 1:45.


qplot(model,delta,data=growthm.bic)

Thanks. Also please cc me as I am a digest subscriber.
Chris

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Changing gird marks in ggplot2

2009-04-24 Thread Christopher Desjardins
Hi,
When I zoom into a graph created in ggplot2 with the
coord_cartesian(ylim=c(0,5)) option, I have no values labelled on my y-axis.
For this graph ggplot2 only puts labels the y-axis at intervals of 10 (i.e.
0, 10, 20, ...). However, the major portion of the graph I am interested in
is located between the values of 0 and 5 on the y-axis (thus why I am
zoooming). How can I coerce ggplot2 into making the major gird marks so that
0, 1, 2, 3, 4, and 5 are shown as it is currently showing no label?
Thanks. Also please cc me directly as I'm a digest subscriber.
Chris

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.