Re: [R] How to obtain individual log-likelihood value from glm?

2020-08-29 Thread John Smith
In the book Modern Applied Statistics with S, 4th edition, 2002, by
Venables and Ripley, there is a function logitreg on page 445, which does
provide the weighted logistic regression I asked about, judging by the loss
function. And interestingly enough, logitreg produces the same coefficients
as glm in the example I provided earlier, even with weights < 1. For the
residual deviance, logitreg also yields the same number as glm. Unless I
have misunderstood something, I am convinced that glm is a valid tool for
weighted logistic regression, despite the documentation of weights and the
somewhat questionable logLik value in the case of non-integer weights < 1.
Perhaps this is a bold claim: the documentation of weights could be revised
and logLik updated as well.

The stackexchange inquiry I provided is what I find interesting, not the
link inside that post. Sorry for the confusion.
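
A quick numerical check of that claim, sketched with optim() rather than the
book's logitreg code, minimizing the same weighted negative log-likelihood.
The data and weights follow the example quoted further down in this thread;
negll is a hypothetical helper, not the MASS function.

## sketch: direct optimization of the weighted binomial log-likelihood vs glm()
set.seed(135)
y <- c(rep(0, 50), rep(1, 50))
x <- rnorm(100)
w <- c(rep(1, 50), rep(2, 50)) / 10
negll <- function(beta, X, y, w) {
  p <- plogis(X %*% beta)                      # fitted probabilities
  -sum(w * (y * log(p) + (1 - y) * log(1 - p)))  # weighted negative log-likelihood
}
X <- cbind(1, x)
opt <- optim(c(0, 0), negll, X = X, y = y, w = w, method = "BFGS")
fit <- glm(y ~ x, family = binomial(), weights = w)  # warns about non-integer #successes
rbind(optim = opt$par, glm = coef(fit))              # the two rows agree to several digits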

On Sat, Aug 29, 2020 at 10:18 AM John Smith  wrote:

> Thanks for the very insightful thoughts. What I am trying to achieve with the
> weights is actually not new, something like
> https://stats.stackexchange.com/questions/44776/logistic-regression-with-weighted-instances.
> I thought my inquiry was not too strange, and that I could reuse some existing
> code. It is just an optimization problem at the end of the day, or not? Thanks
>
> On Sat, Aug 29, 2020 at 9:02 AM John Fox  wrote:
>
>> Dear John,
>>
>> On 2020-08-29 1:30 a.m., John Smith wrote:
>> > Thanks Prof. Fox.
>> >
>> > I am curious: what is the model estimated below?
>>
>> Nonsense, as Peter explained in a subsequent response to your prior
>> posting.
>>
>> >
>> > I guess my inquiry seems more complicated than I thought: with y being
>> 0/1, how to fit weighted logistic regression with weights <1, in the sense
>> of weighted least squares? Thanks
>>
>> What sense would that make? WLS is meant to account for non-constant
>> error variance in a linear model, but in a binomial GLM, the variance is
>> purely a function of the mean.
>>
>> If you had binomial (rather than binary 0/1) observations (i.e.,
>> binomial trials exceeding 1), then you could account for overdispersion,
>> e.g., by introducing a dispersion parameter via the quasibinomial
>> family, but that isn't equivalent to variance weights in a LM, rather to
>> the error-variance parameter in a LM.
>>
>> I guess the question is what are you trying to achieve with the weights?
>>
>> Best,
>>   John
>>
>> >
>> >> On Aug 28, 2020, at 10:51 PM, John Fox  wrote:
>> >>
>> >> Dear John
>> >>
>> >> I think that you misunderstand the use of the weights argument to
>> glm() for a binomial GLM. From ?glm: "For a binomial GLM prior weights are
>> used to give the number of trials when the response is the proportion of
>> successes." That is, in this case y should be the observed proportion of
>> successes (i.e., between 0 and 1) and the weights are integers giving the
>> number of trials for each binomial observation.
>> >>
>> >> I hope this helps,
>> >> John
>> >>
>> >> John Fox, Professor Emeritus
>> >> McMaster University
>> >> Hamilton, Ontario, Canada
>> >> web: https://socialsciences.mcmaster.ca/jfox/
>> >>
>> >>> On 2020-08-28 9:28 p.m., John Smith wrote:
>> >>> If the weights < 1, then we have different values! See an example
>> below.
>> >>> How  should I interpret logLik value then?
>> >>> set.seed(135)
>> >>>   y <- c(rep(0, 50), rep(1, 50))
>> >>>   x <- rnorm(100)
>> >>>   data <- data.frame(cbind(x, y))
>> >>>   weights <- c(rep(1, 50), rep(2, 50))
>> >>>   fit <- glm(y~x, data, family=binomial(), weights/10)
>> >>>   res.dev <- residuals(fit, type="deviance")
>> >>>   res2 <- -0.5*res.dev^2
>> >>>   cat("loglikelihood value", logLik(fit), sum(res2), "\n")
>> >>>> On Tue, Aug 25, 2020 at 11:40 AM peter dalgaard 
>> wrote:
>> >>>> If you don't worry too much about an additive constant, then half the
>> >>>> negative squared deviance residuals should do. (Not quite sure how
>> weights
>> >>>> factor in. Looks like they are accounted for.)
>> >>>>
>> >>>> -pd
>> >>>>
>> >>>>> On 25 Aug 2020, at 17:33 , John Smith  wrote:
>> >>>>>
>> >>>>> Dear R-help,
>> >&

Re: [R] How to obtain individual log-likelihood value from glm?

2020-08-29 Thread John Smith
Thanks for the very insightful thoughts. What I am trying to achieve with the
weights is actually not new, something like
https://stats.stackexchange.com/questions/44776/logistic-regression-with-weighted-instances.
I thought my inquiry was not too strange, and that I could reuse some existing
code. It is just an optimization problem at the end of the day, or not? Thanks

On Sat, Aug 29, 2020 at 9:02 AM John Fox  wrote:

> Dear John,
>
> On 2020-08-29 1:30 a.m., John Smith wrote:
> > Thanks Prof. Fox.
> >
> > I am curious: what is the model estimated below?
>
> Nonsense, as Peter explained in a subsequent response to your prior
> posting.
>
> >
> > I guess my inquiry seems more complicated than I thought: with y being
> 0/1, how to fit weighted logistic regression with weights <1, in the sense
> of weighted least squares? Thanks
>
> What sense would that make? WLS is meant to account for non-constant
> error variance in a linear model, but in a binomial GLM, the variance is
> purely a function of the mean.
>
> If you had binomial (rather than binary 0/1) observations (i.e.,
> binomial trials exceeding 1), then you could account for overdispersion,
> e.g., by introducing a dispersion parameter via the quasibinomial
> family, but that isn't equivalent to variance weights in a LM, rather to
> the error-variance parameter in a LM.
>
> I guess the question is what are you trying to achieve with the weights?
>
> Best,
>   John
>
> >
> >> On Aug 28, 2020, at 10:51 PM, John Fox  wrote:
> >>
> >> Dear John
> >>
> >> I think that you misunderstand the use of the weights argument to glm()
> for a binomial GLM. From ?glm: "For a binomial GLM prior weights are used
> to give the number of trials when the response is the proportion of
> successes." That is, in this case y should be the observed proportion of
> successes (i.e., between 0 and 1) and the weights are integers giving the
> number of trials for each binomial observation.
> >>
> >> I hope this helps,
> >> John
> >>
> >> John Fox, Professor Emeritus
> >> McMaster University
> >> Hamilton, Ontario, Canada
> >> web: https://socialsciences.mcmaster.ca/jfox/
> >>
> >>> On 2020-08-28 9:28 p.m., John Smith wrote:
> >>> If the weights < 1, then we have different values! See an example
> below.
> >>> How  should I interpret logLik value then?
> >>> set.seed(135)
> >>>   y <- c(rep(0, 50), rep(1, 50))
> >>>   x <- rnorm(100)
> >>>   data <- data.frame(cbind(x, y))
> >>>   weights <- c(rep(1, 50), rep(2, 50))
> >>>   fit <- glm(y~x, data, family=binomial(), weights/10)
> >>>   res.dev <- residuals(fit, type="deviance")
> >>>   res2 <- -0.5*res.dev^2
> >>>   cat("loglikelihood value", logLik(fit), sum(res2), "\n")
> >>>> On Tue, Aug 25, 2020 at 11:40 AM peter dalgaard 
> wrote:
> >>>> If you don't worry too much about an additive constant, then half the
> >>>> negative squared deviance residuals should do. (Not quite sure how
> weights
> >>>> factor in. Looks like they are accounted for.)
> >>>>
> >>>> -pd
> >>>>
> >>>>> On 25 Aug 2020, at 17:33 , John Smith  wrote:
> >>>>>
> >>>>> Dear R-help,
> >>>>>
> >>>>> The function logLik can be used to obtain the maximum log-likelihood
> >>>> value
> >>>>> from a glm object. This is an aggregated value, a summation of
> individual
> >>>>> log-likelihood values. How do I obtain individual values? In the
> >>>> following
> >>>>> example, I would expect 9 numbers since the response has length 9. I
> >>>> could
> >>>>> write a function to compute the values, but there are lots of
> >>>>> family members in glm, and I am trying not to reinvent wheels.
> Thanks!
> >>>>>
> >>>>> counts <- c(18,17,15,20,10,20,25,13,12)
> >>>>>  outcome <- gl(3,1,9)
> >>>>>  treatment <- gl(3,3)
> >>>>>  data.frame(treatment, outcome, counts) # showing data
> >>>>>  glm.D93 <- glm(counts ~ outcome + treatment, family = poisson())
> >>>>>  (ll <- logLik(glm.D93))
> >>>>>
> >>>>>[[alternative HTML version deleted]]
> >>>>>
> >>>>> ___

Re: [R] How to obtain individual log-likelihood value from glm?

2020-08-28 Thread John Smith
Thanks Prof. Fox. 

I am curious: what is the model estimated below?

I guess my inquiry seems more complicated than I thought: with y being 0/1, how 
to fit weighted logistic regression with weights <1, in the sense of weighted 
least squares? Thanks

> On Aug 28, 2020, at 10:51 PM, John Fox  wrote:
> 
> Dear John
> 
> I think that you misunderstand the use of the weights argument to glm() for a 
> binomial GLM. From ?glm: "For a binomial GLM prior weights are used to give 
> the number of trials when the response is the proportion of successes." That 
> is, in this case y should be the observed proportion of successes (i.e., 
> between 0 and 1) and the weights are integers giving the number of trials for 
> each binomial observation.
> 
> I hope this helps,
> John
> 
> John Fox, Professor Emeritus
> McMaster University
> Hamilton, Ontario, Canada
> web: https://socialsciences.mcmaster.ca/jfox/
> 
>> On 2020-08-28 9:28 p.m., John Smith wrote:
>> If the weights < 1, then we have different values! See an example below.
>> How  should I interpret logLik value then?
>> set.seed(135)
>>  y <- c(rep(0, 50), rep(1, 50))
>>  x <- rnorm(100)
>>  data <- data.frame(cbind(x, y))
>>  weights <- c(rep(1, 50), rep(2, 50))
>>  fit <- glm(y~x, data, family=binomial(), weights/10)
>>  res.dev <- residuals(fit, type="deviance")
>>  res2 <- -0.5*res.dev^2
>>  cat("loglikelihood value", logLik(fit), sum(res2), "\n")
>>> On Tue, Aug 25, 2020 at 11:40 AM peter dalgaard  wrote:
>>> If you don't worry too much about an additive constant, then half the
>>> negative squared deviance residuals should do. (Not quite sure how weights
>>> factor in. Looks like they are accounted for.)
>>> 
>>> -pd
>>> 
>>>> On 25 Aug 2020, at 17:33 , John Smith  wrote:
>>>> 
>>>> Dear R-help,
>>>> 
>>>> The function logLik can be used to obtain the maximum log-likelihood
>>> value
>>>> from a glm object. This is an aggregated value, a summation of individual
>>>> log-likelihood values. How do I obtain individual values? In the
>>> following
>>>> example, I would expect 9 numbers since the response has length 9. I
>>> could
>>>> write a function to compute the values, but there are lots of
>>>> family members in glm, and I am trying not to reinvent wheels. Thanks!
>>>> 
>>>> counts <- c(18,17,15,20,10,20,25,13,12)
>>>> outcome <- gl(3,1,9)
>>>> treatment <- gl(3,3)
>>>> data.frame(treatment, outcome, counts) # showing data
>>>> glm.D93 <- glm(counts ~ outcome + treatment, family = poisson())
>>>> (ll <- logLik(glm.D93))
>>>> 
>>>>   [[alternative HTML version deleted]]
>>>> 
>>>> __
>>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>> 
>>> --
>>> Peter Dalgaard, Professor,
>>> Center for Statistics, Copenhagen Business School
>>> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
>>> Phone: (+45)38153501
>>> Office: A 4.23
>>> Email: pd@cbs.dk  Priv: pda...@gmail.com
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>[[alternative HTML version deleted]]
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to obtain individual log-likelihood value from glm?

2020-08-28 Thread John Smith
If the weights are < 1, then we get different values! See the example below.
How should I interpret the logLik value then?

set.seed(135)
 y <- c(rep(0, 50), rep(1, 50))
 x <- rnorm(100)
 data <- data.frame(cbind(x, y))
 weights <- c(rep(1, 50), rep(2, 50))
 fit <- glm(y~x, data, family=binomial(), weights/10)
 res.dev <- residuals(fit, type="deviance")
 res2 <- -0.5*res.dev^2
 cat("loglikelihood value", logLik(fit), sum(res2), "\n")

On Tue, Aug 25, 2020 at 11:40 AM peter dalgaard  wrote:

> If you don't worry too much about an additive constant, then half the
> negative squared deviance residuals should do. (Not quite sure how weights
> factor in. Looks like they are accounted for.)
>
> -pd
>
> > On 25 Aug 2020, at 17:33 , John Smith  wrote:
> >
> > Dear R-help,
> >
> > The function logLik can be used to obtain the maximum log-likelihood
> value
> > from a glm object. This is an aggregated value, a summation of individual
> > log-likelihood values. How do I obtain individual values? In the
> following
> > example, I would expect 9 numbers since the response has length 9. I
> could
> > write a function to compute the values, but there are lots of
> > family members in glm, and I am trying not to reinvent wheels. Thanks!
> >
> > counts <- c(18,17,15,20,10,20,25,13,12)
> > outcome <- gl(3,1,9)
> > treatment <- gl(3,3)
> > data.frame(treatment, outcome, counts) # showing data
> > glm.D93 <- glm(counts ~ outcome + treatment, family = poisson())
> > (ll <- logLik(glm.D93))
> >
> >   [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> --
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Office: A 4.23
> Email: pd@cbs.dk  Priv: pda...@gmail.com
>
>
>
>
>
>
>
>
>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Passing formula and weights error

2020-08-28 Thread John Smith
Thanks to Duncan and Bill for very helpful tips.

On Fri, Aug 28, 2020 at 11:38 AM William Dunlap  wrote:

> Note that neither call to glm in your myglm function really works -
> the first one is using the 'weights' object from the global
> environment, not the weights argument.  E.g., in the fresh R session,
> where I avoid making unneeded assignments and use fixed x and y for
> repeatability,
>
>   > n <- 16
>   > data <- data.frame(x = log2(1:n), y = 1:n)
>   > myglm2 <- function(formula, data, weights)
>   {
>   glm(formula, data=data, family=gaussian(), weights=weights)
>   }
>   > myglm2(y~., data=data, weights=1/(1:n))
>   Error in model.frame.default(formula = formula, data = data, weights
> = weights,  :
> invalid type (closure) for variable '(weights)'
>
> The error arises because glm finds stats::weights, a function, not the
> argument called weights.  glm(), lm() and their ilk evaluate their
> weights and subset arguments in the environment of the formula.  In
> this case environment(y~.) is .GlobalEnv, not the function's
> environment.  The following function gives one way to deal with this,
> by giving formula a new environment that inherits from its original
> environment and contains the extra variables.
>
>   > myglm3 <- function(formula, data, weights)
>   {
>   envir <- list2env(list(weights=weights),
> parent=environment(formula))
>   environment(formula) <- envir
>   glm(formula, data=data, family=gaussian(), weights=weights)
>   }
>   > myglm3(y~., data=data, weights=1/(1:n))
>
>   Call:  glm(formula = formula, family = gaussian(), data = data,
> weights = weights)
>
>   Coefficients:
>   (Intercept)x
>  -0.09553  2.93352
>
>   Degrees of Freedom: 15 Total (i.e. Null);  14 Residual
>   Null Deviance:  60.28
>   Residual Deviance: 7.72 AIC: 70.42
>
> This is the same result you get with a direct call to
>   glm(y~., data=data, weights=1/(1:n))
>
> This is a common problem and I don't know if there is a FAQ on it or a
> standard function to deal with it.
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
> On Fri, Aug 28, 2020 at 8:33 AM John Smith  wrote:
> >
> > Dear R-help:
> >
> > I am writing a function based on glm and would like some variations of
> > weights. In the code below, I couldn't understand why the second glm
> > function fails and don't know how to fix it:
> >
> > Error in eval(extras, data, env) : object 'newweights' not found
> >  Calls: print ... eval ->  -> model.frame.default -> eval ->
> eval
> >  Execution halted
> >
> > ### R code
> > y <- rnorm(100)
> >  x <- rnorm(100)
> >  data <- data.frame(cbind(x, y))
> >  weights <- rep(1, 100)
> >  n <- 100
> >  myglm <- function(formula, data, weights){
> >  ## this works
> >  print(glm(formula, data, family=gaussian(), weights))
> >  ## this is not working
> >  newweights <- rep(1, n)
> >  glm(formula, data, family=gaussian(), weights=newweights)
> >  }
> >  myglm(y~., data, weights)
> >
> > [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Passing formula and weights error

2020-08-28 Thread John Smith
Dear R-help:

I am writing a function based on glm and would like some variations of
weights. In the code below, I couldn't understand why the second glm
function fails and don't know how to fix it:

Error in eval(extras, data, env) : object 'newweights' not found
 Calls: print ... eval ->  -> model.frame.default -> eval -> eval
 Execution halted

### R code
y <- rnorm(100)
 x <- rnorm(100)
 data <- data.frame(cbind(x, y))
 weights <- rep(1, 100)
 n <- 100
 myglm <- function(formula, data, weights){
 ## this works
 print(glm(formula, data, family=gaussian(), weights))
 ## this is not working
 newweights <- rep(1, n)
 glm(formula, data, family=gaussian(), weights=newweights)
 }
 myglm(y ~ ., data, weights)
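
For the archive, one more sketch of a workaround, in addition to Bill Dunlap's
environment trick in the reply above: do.call() builds the glm call with the
weight vector already evaluated, so glm() never has to look the name up.
myglm2 is a hypothetical helper name, not an existing function.

## sketch: do.call() inserts the evaluated weight vector into the call,
## so model.frame() never searches for 'newweights'
myglm2 <- function(formula, data, weights){
  newweights <- rep(1, nrow(data))
  ## (the 'weights' argument is ignored here, as in the second call above)
  do.call("glm", list(formula = formula, data = data,
                      family = gaussian(), weights = newweights))
}
myglm2(y ~ ., data, weights)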

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to obtain individual log-likelihood value from glm?

2020-08-27 Thread John Smith
Thanks Peter for a very promising tip.

On Tue, Aug 25, 2020 at 11:40 AM peter dalgaard  wrote:

> If you don't worry too much about an additive constant, then half the
> negative squared deviance residuals should do. (Not quite sure how weights
> factor in. Looks like they are accounted for.)
>
> -pd
>
> > On 25 Aug 2020, at 17:33 , John Smith  wrote:
> >
> > Dear R-help,
> >
> > The function logLik can be used to obtain the maximum log-likelihood
> value
> > from a glm object. This is an aggregated value, a summation of individual
> > log-likelihood values. How do I obtain individual values? In the
> following
> > example, I would expect 9 numbers since the response has length 9. I
> could
> > write a function to compute the values, but there are lots of
> > family members in glm, and I am trying not to reinvent wheels. Thanks!
> >
> > counts <- c(18,17,15,20,10,20,25,13,12)
> > outcome <- gl(3,1,9)
> > treatment <- gl(3,3)
> > data.frame(treatment, outcome, counts) # showing data
> > glm.D93 <- glm(counts ~ outcome + treatment, family = poisson())
> > (ll <- logLik(glm.D93))
> >
> >   [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> --
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Office: A 4.23
> Email: pd@cbs.dk  Priv: pda...@gmail.com
>
>
>
>
>
>
>
>
>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How to obtain individual log-likelihood value from glm?

2020-08-25 Thread John Smith
Dear R-help,

The function logLik can be used to obtain the maximum log-likelihood value
from a glm object. This is an aggregated value, a summation of individual
log-likelihood values. How do I obtain individual values? In the following
example, I would expect 9 numbers since the response has length 9. I could
write a function to compute the values, but there are lots of
family members in glm, and I am trying not to reinvent wheels. Thanks!

counts <- c(18,17,15,20,10,20,25,13,12)
 outcome <- gl(3,1,9)
 treatment <- gl(3,3)
 data.frame(treatment, outcome, counts) # showing data
 glm.D93 <- glm(counts ~ outcome + treatment, family = poisson())
 (ll <- logLik(glm.D93))
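
For the record, a minimal sketch for this Poisson example; other families would
need the corresponding d*() density evaluated at the fitted values.

 ll.i <- dpois(counts, lambda = fitted(glm.D93), log = TRUE)  # 9 individual values
 ll.i
 sum(ll.i)   # equals logLik(glm.D93)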

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Classification Tree Prediction Error

2020-08-25 Thread John Smith
As Bert correctly advised, this is not an R programming question. There is
some misunderstanding of how training/test data work together
in prediction. Suppose, as an extreme case, your test data had only one class.
Then you could get a misclassification rate of 0 by betting on the majority
class every time, again using data from the test set. Of course no
classification algorithm can beat that prediction, because it already uses
the truth in the test data. In conclusion, the tree model you provided has
accuracy 0.837, which is very close to the 0.85 of the majority-class bet. I
would not complain.
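
In code, the comparison is just the following (a sketch using the object names
from the quoted message below):

## naive baseline: always predict the majority class of the *test* labels
baseline <- max(table(h2.test)) / length(h2.test)
## tree accuracy: correctly classified cases on the test set
tree.acc <- sum(diag(tab)) / sum(tab)
c(baseline = baseline, tree = tree.acc)   # roughly 0.85 vs 0.837 here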

On Tue, Aug 25, 2020 at 9:19 AM Xu Jun  wrote:

> Thank you for your comment! This tree function is from the tree package.
> Although it might be a pure statistical question, it could be related to
> how the tree function is used. I will explore the site that you suggested.
> But if there is anyone who can figure it out off the top of their head, I'd
> very much appreciate it.
>
> Jun
>
> On Mon, Aug 24, 2020 at 1:01 PM Bert Gunter 
> wrote:
>
> > Purely statistical questions -- as opposed to R programming queries --
> are
> > generally off topic here.
> > Here is where they are on topic:  https://stats.stackexchange.com/
> >
> > Suggestion: when you post, do include the package name where you get
> > tree() from, as there might be
> > more than one with this function.
> >
> > Bert Gunter
> >
> > "The trouble with having an open mind is that people keep coming along
> and
> > sticking things into it."
> > -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >
> >
> > On Mon, Aug 24, 2020 at 8:58 AM Xu Jun  wrote:
> >
> >> Dear all R experts,
> >>
> >> I have a question about using cross-validation to assess results
> estimated
> >> from a classification tree model. I annotated what each line does in
> the R
> >> code chunk below. Basically, I split the data, named usedta, into 70%
> vs.
> >> 30%, with the training set having 70% and the test set 30% of the
> original
> >> cases. After splitting the data, I first run a classification tree off
> the
> >> training set, and then use the results for cross-validation using the
> test
> >> set. It turns out that if I don't have any predictors and make
> predictions
> >> by simply betting on the majority class of the zero-one coding of the
> >> binary response variable, I can do better than what the results from the
> >> classification tree would deliver in the test set. What would this imply
> >> and what would cause this problem? Does it mean that classification tree
> >> is
> >> not an appropriate method for my data; or, it's because I have too few
> >> variables? Thanks a lot!
> >>
> >> Jun Xu, PhD
> >> Professor
> >> Department of Sociology
> >> Ball State University
> >> Muncie, IN 47306
> >> USA
> >>
> >> Using the estimates, I get the following prediction rate (correct
> >> prediction) using the test set. Or we can say the misclassification
> error
> >> rate is 1-0.837 = 0.163
> >>
>> > (tab[1,1] + tab[2,2]) / sum(tab)
>> [1] 0.837
> >>
> >>
> >> Without any predictors, I can get the following rate by betting on the
> >> majority class every time, again using data from the test set. In this
> >> case, the misclassification error rate is 1-0.85 = 0.15
> >>
>> > table(h2.test)
>> h2.test
>> 1poorHlth 0goodHlth
>>       101       575
>> > 571/(571+101)
>> [1] 0.85
> >>
> >>
> >>
> >> R Code Chunk
> >>
> >> # set the seed for random number generator for replication
> >> set.seed(47306)
> >> # have the 7/3 split with 70% of the cases allotted to the training set
> >> # AND create the training set identifier
> >> class.train = sample(1:nrow(usedta), nrow(usedta)*0.7)
> >> # create the test set indicator
> >> class.test = (-class.train)
> >> # create a vector for the binary response variable from the test set
> >> # for future cross-tabulation.
> >> h2.test <- usedta$h2[class.test]
> >> # count the train set cases
> >> Ntrain = length(usedta$h2[class.train])
> >> # run the classification tree model using the training set
> >> # h2 is the binary response and other variables are predictors
> >> tree.h2 <- tree(h2 ~ age + educ + female + white + married + happy,
> >> data = usedta, subset = class.train,
> >> control = tree.control(nobs=Ntrain, mindev=0.003))
> >> # summary results
> >> summary(tree.h2)
> >> # make predictions of h2 using the test set
> >> tree.h2.pred <- predict(tree.h2, usedta[class.test,], type="class")
> >> # cross tab the predictions using the test set
> >> table(tree.h2.pred, h2.test)
> >> tab = table(tree.h2.pred, h2.test)
> >> # calculate the ratio for the correctly predicted in the test set
> >> (tab[1,1] + tab[2,2]) / sum(tab)
> >> # calculate the ratio for the correctly predicted using the naive
> approach
> >> # by betting on the majority category.
> >> table(h2.test)[2]/sum(tab)
> >>
> >> [[alternative HTML version deleted]]
> >>
> >> __
> >> 

Re: [R] [EXT] Re: Plot math symbol with string and number

2020-08-17 Thread John Smith
Thanks David for a quite interesting suggestion: Indeed I didn't know
paste0! Best!

On Mon, Aug 17, 2020 at 12:26 PM David K Stevens 
wrote:

> John - one more approach
>
> plot(y,main=parse(text=paste0('data ~~ sigma == ',s)))
>
> Good luck
>
> David Stevens
>
> On 8/17/2020 8:23 AM, John Smith wrote:
> > Thanks to Duncan, Rasmus and Bert. Will keep the very useful tips. Best!
> >
> > On Sun, Aug 16, 2020 at 9:13 PM Rasmus Liland  wrote:
> >
> >> On Sun, Aug 16, 2020 at 3:18 PM Bert wrote:
> >> | On Sun, Aug 16, 2020, 14:53 John wrote:
> >> | |
> >> | | I would like to make plots with
> >> | | titles for different data sets and
> >> | | different parameters. The first
> >> | | title doesn't show sigma as a math
> >> | | symbol, while the second one
> >> | | doesn't contain the s value as a
> >> | | numeric value
> >> | |
> >> | | s <- 1
> >> | | y <- rnorm(100)
> >> | | plot(y, main=paste("data", "sigma=", s))
> >> | | plot(y, main=expression(paste("data", sigma,"=", s)))
> >> |
> >> | ?plotmath
> >>
> >> Dear John, read ?plotmath, it is good, I
> >> was not aware of its existence; then
> >> backquote s like so:
> >>
> >>  plot(y, main=bquote(paste(
> >>"data" ~~ widehat(aleph)
> >>%notin% .(s)^.(s
> >>
> >> V
> >>
> >> r
> >>
> >  [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
> --
> David K Stevens, P.E., Ph.D.
> Professor, Environmental Engineering
> Civil and Environmental Engineering
> Utah Water Research Laboratory
> 8200 Old Main Hill
> Logan, UT  84322-8200
> 435 797 3229 - voice
> 435 797 1363 - fax
> david.stev...@usu.edu
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Plot math symbol with string and number

2020-08-17 Thread John Smith
Thanks to Duncan, Rasmus and Bert. Will keep the very useful tips. Best!

On Sun, Aug 16, 2020 at 9:13 PM Rasmus Liland  wrote:

> On Sun, Aug 16, 2020 at 3:18 PM Bert wrote:
> | On Sun, Aug 16, 2020, 14:53 John wrote:
> | |
> | | I would like to make plots with
> | | titles for different data sets and
> | | different parameters. The first
> | | title doesn't show sigma as a math
> | | symbol, while the second one
> | | doesn't contain the s value as a
> | | numeric value
> | |
> | | s <- 1
> | | y <- rnorm(100)
> | | plot(y, main=paste("data", "sigma=", s))
> | | plot(y, main=expression(paste("data", sigma,"=", s)))
> |
> | ?plotmath
>
> Dear John, read ?plotmath, it is good, I
> was not aware of its existence; then
> backquote s like so:
>
> plot(y, main=bquote(paste(
>   "data" ~~ widehat(aleph)
>   %notin% .(s)^.(s
>
> V
>
> r
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Plot math symbol with string and number

2020-08-16 Thread John Smith
Dear Helpers,

I would like to make plots with titles for different data sets and
different parameters. So a useful title should combine data name
and parameter for clarity. The following is a simplified code example with
two plots. The first title doesn't show sigma as a math symbol, while the
second one doesn't contain the s value as a numeric value - I could
manually change the s value, but when there are many different s values,
this is not a good solution. Thanks!

s <- 1
y <- rnorm(100)
plot(y, main=paste("data", "sigma=", s))
plot(y, main=expression(paste("data", sigma,"=", s)))
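
For reference, two variants that do both things at once, assuming s holds the
numeric value; they correspond to the bquote() and parse(text = paste0(...))
suggestions in the replies above:

plot(y, main = bquote("data" ~ sigma == .(s)))                 # substitutes the value of s
plot(y, main = parse(text = paste0("data ~~ sigma == ", s)))   # builds the expression from text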

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] offset in glm

2020-02-27 Thread John Smith
  It is simple to use the provided function glm with an offset, as in fit1
below. However, I also tried fit2 below, without the offset argument. The
reason I tried fit2 is that (for X the predictors and b the coefficients)
fit2: log(Claims/Holders) = Xb
is algebraically the same statement as
fit1: log(Claims) = Xb + log(Holders)

Obviously, though, the results from fit2 are different from those of fit1.

Thanks!
##
library("MASS")
  ## main-effects fit as Poisson GLM with offset
 fit1 <- glm(Claims ~ District + Group + Age + offset(log(Holders)),
  data = Insurance, family = poisson)
 coef(fit1)
 > coef(fit1)
   (Intercept) District2 District3 District4   Group.L
 -1.8105078329  0.0258681909  0.0385239271  0.2342053280  0.4297075387
   Group.Q   Group.C Age.L Age.Q Age.C
  0.0046324351 -0.0292943222 -0.3944318082 -0.0003549709 -0.0167367565

 fit2 <- glm(Claims/Holders ~ District + Group + Age,
  data = Insurance, family = poisson)
 > coef(fit2)
 (Intercept)   District2   District3   District4 Group.L Group.Q
 -1.86340418  0.17552458  0.11081521  0.15131076  0.43701544 -0.01530721
 Group.C   Age.L   Age.Q   Age.C
 -0.06033747 -0.31976743 -0.01833841 -0.01694737
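
If the aim is only to reproduce fit1's coefficients from the rate formulation,
one device (a sketch; note that R warns about non-integer Poisson responses and
that the reported logLik/AIC are not comparable) is to keep Claims/Holders as
the response but supply Holders as a prior weight, which yields the same score
equations as the offset fit:

 fit3 <- glm(Claims/Holders ~ District + Group + Age,
             data = Insurance, family = poisson, weights = Holders)
 coef(fit3)   # agrees with coef(fit1); fit2 without the weights does not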

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Running Omega in R

2017-01-16 Thread John Smith
Hi David

Thank you for your feedback
I didn’t realise you couldn’t implement omega on a single factor
This definitely makes sense the more I read on it

Thank you all very much for your time 


From: David L Carlson
Sent: Monday 16 January 2017 16:43
To: peter dalgaard; Rui Barradas
Cc: John Smith; r-help@r-project.org; reve...@northwestern.edu
Subject: RE: [R] Running Omega in R

I get an additional error message when I run your example so perhaps your 
version is not the current (1.6.12) one? 

Loading required namespace: GPArotation
Failed with error:  ‘there is no package called ‘GPArotation’’
Error in omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,  : 
  I am sorry, you need to have the  GPArotation package installed

Then I get a great deal of error output followed by the one you list. The first 
error is 

"Omega_h for 1 factor is not meaningful, just omega_t" (repeated 174 times). 
I'm copying the package maintainer as this seems like overkill.

Your problem is addressed in the first line of the manual page for omega:

"McDonald has proposed coefficient omega as an estimate of the general factor 
saturation of a test. One way to find omega is to do a factor analysis of the 
original data set, rotate the factors obliquely, do a Schmid Leiman 
transformation, and then find omega. "

You cannot rotate a single factor so you cannot compute omega.

-
David L Carlson
Department of Anthropology
> Texas A&M University
College Station, TX 77840-4352

-Original Message-
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of peter dalgaard
Sent: Sunday, January 15, 2017 2:27 PM
To: Rui Barradas <ruipbarra...@sapo.pt>
Cc: John Smith <jsmith.cours...@outlook.com>; r-help@r-project.org
Subject: Re: [R] Running Omega in R


> On 15 Jan 2017, at 20:57 , Rui Barradas <ruipbarra...@sapo.pt> wrote:
> 
> Hello,
> 
> I know nothing about package psych, so if you ask whether this is the wrong list 
> you can always try
> 
> maintainer("psych")
> [1] "William Revelle <reve...@northwestern.edu>"

...although publishing a package is not the same as issuing a blank cheque for 
free support, so you may want to dig a little deeper first. 

I'd check the documentation for whether the data are in the right format. In 
particular, can "m" be a data frame? If it wants a matrix, you probably need to 
give it one (m=as.matrix(construct) should do it).

-pd

> 
> Hope this helps,
> 
> Rui Barradas
> 
> On 15-01-2017 12:32, John Smith wrote:
>> Hi,
>> 
>> I opened a question on stack overflow I’m hoping this mailing list can help 
>> with.
>> I have a dataset below (this is made up but produces the same error I am 
>> getting)
>> 
>> structure(list(Q1 = c(4, 5, 3, 5, 4, 5, 3, 5, 5, 5, 6,
>> 3, 5, 4, 6, 5, 5, 6, 7, 4, 5, 5, 3, 4, 4, 5, 4, 3, 5, 4, 5, 5,
>> 6, 6, 3, 6, 3, 4, 4, 4, 6, 5, 3, 2, 6, 6, 4, 5, 4, 3, 6, 4, 4,
>> 5, 6, 2, 4, 3, 4, 6, 4, 6, 4, 5, 5, 6, 4, 6, 5, 5, 4, 5, 6, 6,
>> 2, 5, 4, 3, 4, 4, 4, 6, 3, 3, 5, 4, 4, 4, 5, 5, 5, 3, 6, 6, 6,
>> 6, 5, 4, 3, 5), Q2 = c(7, 4, 4, 4, 4, 6, 6, 6, 7, 6, 5,
>> 6, 5, 4, 5, 6, 6, 6, 7, 5, 4, 4, 6, 6, 4, 4, 6, 2, 6, 5, 4, 6,
>> 4, 6, 6, 6, 5, 4, 4, 4, 4, 3, 3, 4, 4, 4, 4, 6, 2, 6, 6, 5, 4,
>> 6, 6, 4, 4, 7, 6, 5, 5, 5, 5, 6, 5, 5, 4, 5, 5, 5, 4, 6, 7, 5,
>> 5, 5, 6, 5, 6, 5, 6, 7, 2, 6, 5, 7, 3, 5, 5, 3, 3, 3, 7, 4, 5,
>> 6, 6, 6, 5, 7), Q3 = c(5, 4, 5, 6, 4, 4, 5, 4, 2, 6, 5,
>> 5, 5, 5, 7, 5, 5, 6, 7, 6, 3, 6, 6, 6, 5, 6, 6, 5, 5, 4, 5, 5,
>> 6, 6, 5, 6, 5, 5, 4, 4, 6, 4, 4, 4, 4, 4, 4, 5, 5, 4, 5, 5, 4,
>> 3, 5, 4, 5, 6, 6, 6, 4, 5, 5, 5, 6, 4, 5, 5, 7, 4, 5, 6, 6, 5,
>> 5, 3, 3, 5, 4, 6, 5, 5, 1, 3, 5, 3, 2, 5, 4, 6, 6, 6, 6, 4, 6,
>> 3, 6, 6, 6, 5), Q4 = c(6, 6, 4, 7, 4, 6, 7, 6, 7, 6, 6,
>> 6, 5, 7, 7, 6, 6, 5, 7, 7, 6, 6, 7, 7, 6, 6, 6, 5, 6, 7, 5, 6,
>> 7, 5, 4, 6, 4, 3, 6, 4, 6, 6, 6, 3, 5, 7, 5, 6, 4, 6, 7, 6, 7,
>> 4, 6, 3, 5, 7, 5, 4, 6, 6, 4, 6, 5, 5, 5, 5, 7, 7, 7, 6, 6, 6,
>> 5, 6, 6, 4, 5, 7, 6, 7, 3, 5, 6, 5, 6, 5, 5, 7, 7, 6, 6, 2, 7,
>> 6, 6, 7, 7, 5)), .Names = c("Q1", "Q2", "Q3",
>> "Q4"), row.names = c(NA, 100L), class = "data.frame")
>> 
>> 
>> When i run Cronbach Alpha with R i get a result
>> 
>> psych::alpha(construct,
>>  na.rm = TRUE,
>>  title = 'myscale',
>>  n.iter = 1000)
>> 
>> 
>> When i run Omega using R i get the following error message
>> 
>> "Error in fac(r = r, nfactors = nfactors, n.obs = n.obs, rotate = rotate, : 
>> I am sorry: missing values (NAs) in the correlation matrix do not allow me 
>> to continue.
>> Ple

[R] Running Omega in R

2017-01-15 Thread John Smith
Hi,

I opened a question on stack overflow I’m hoping this mailing list can help 
with.
I have a dataset below (this is made up but produces the same error I am 
getting)

structure(list(Q1 = c(4, 5, 3, 5, 4, 5, 3, 5, 5, 5, 6,
3, 5, 4, 6, 5, 5, 6, 7, 4, 5, 5, 3, 4, 4, 5, 4, 3, 5, 4, 5, 5,
6, 6, 3, 6, 3, 4, 4, 4, 6, 5, 3, 2, 6, 6, 4, 5, 4, 3, 6, 4, 4,
5, 6, 2, 4, 3, 4, 6, 4, 6, 4, 5, 5, 6, 4, 6, 5, 5, 4, 5, 6, 6,
2, 5, 4, 3, 4, 4, 4, 6, 3, 3, 5, 4, 4, 4, 5, 5, 5, 3, 6, 6, 6,
6, 5, 4, 3, 5), Q2 = c(7, 4, 4, 4, 4, 6, 6, 6, 7, 6, 5,
6, 5, 4, 5, 6, 6, 6, 7, 5, 4, 4, 6, 6, 4, 4, 6, 2, 6, 5, 4, 6,
4, 6, 6, 6, 5, 4, 4, 4, 4, 3, 3, 4, 4, 4, 4, 6, 2, 6, 6, 5, 4,
6, 6, 4, 4, 7, 6, 5, 5, 5, 5, 6, 5, 5, 4, 5, 5, 5, 4, 6, 7, 5,
5, 5, 6, 5, 6, 5, 6, 7, 2, 6, 5, 7, 3, 5, 5, 3, 3, 3, 7, 4, 5,
6, 6, 6, 5, 7), Q3 = c(5, 4, 5, 6, 4, 4, 5, 4, 2, 6, 5,
5, 5, 5, 7, 5, 5, 6, 7, 6, 3, 6, 6, 6, 5, 6, 6, 5, 5, 4, 5, 5,
6, 6, 5, 6, 5, 5, 4, 4, 6, 4, 4, 4, 4, 4, 4, 5, 5, 4, 5, 5, 4,
3, 5, 4, 5, 6, 6, 6, 4, 5, 5, 5, 6, 4, 5, 5, 7, 4, 5, 6, 6, 5,
5, 3, 3, 5, 4, 6, 5, 5, 1, 3, 5, 3, 2, 5, 4, 6, 6, 6, 6, 4, 6,
3, 6, 6, 6, 5), Q4 = c(6, 6, 4, 7, 4, 6, 7, 6, 7, 6, 6,
6, 5, 7, 7, 6, 6, 5, 7, 7, 6, 6, 7, 7, 6, 6, 6, 5, 6, 7, 5, 6,
7, 5, 4, 6, 4, 3, 6, 4, 6, 6, 6, 3, 5, 7, 5, 6, 4, 6, 7, 6, 7,
4, 6, 3, 5, 7, 5, 4, 6, 6, 4, 6, 5, 5, 5, 5, 7, 7, 7, 6, 6, 6,
5, 6, 6, 4, 5, 7, 6, 7, 3, 5, 6, 5, 6, 5, 5, 7, 7, 6, 6, 2, 7,
6, 6, 7, 7, 5)), .Names = c("Q1", "Q2", "Q3",
"Q4"), row.names = c(NA, 100L), class = "data.frame")


When i run Cronbach Alpha with R i get a result

psych::alpha(construct,
 na.rm = TRUE,
 title = 'myscale',
 n.iter = 1000)


When i run Omega using R i get the following error message

"Error in fac(r = r, nfactors = nfactors, n.obs = n.obs, rotate = rotate, : I 
am sorry: missing values (NAs) in the correlation matrix do not allow me to 
continue.
Please drop those variables and try again. In addition: There were 50 or more 
warnings (use warnings() to see the first 50)"

psych::omega(m = construct,
  nfactors = 1, fm = "pa", n.iter = 1000, p = 0.05,
  title = "Omega", plot = FALSE, n.obs = 100)


Stackoverflow Link: 
http://stackoverflow.com/questions/41533231/running-omega-with-psych-library-in-r?noredirect=1#comment70278453_41533231

If I have the wrong mailing list, could you direct me to the appropriate one 
(I’m aware it might not be an R issue but more of a stats question)



Thank you for your time



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Creating Volatile Table in Teradata using RODBC

2015-02-15 Thread John Smith
Hi All

 

I'm trying to use R to create a temporary table in Teradata and then add
rows from a data frame into that temporary volatile table.

Based on the code below (I have changed the SQL slightly), I am able to
create the temporary table in my spool space, but when I try to add the data
frame mydata to it, R tells me it cannot find the table.

My assumption here is that it is looking for the real table as opposed to
the temp table in spool space. Can anyone see the problem?

 

# RShowDoc("teradataR", package="teradataR") - Manual
# RShowDoc("RODBC", package="RODBC")

library(RODBC)
library(teradataR)

# Import Data From Text File and remove duplicates
mydata = read.table("c:/Users/user/Desktop/ Keys.txt")
mydata.unique = unique(mydata)

# Create SQL for Temp Table in Teradata
strSQL.TempTable = "CREATE VOLATILE TABLE TEMP_Keys (keys VARCHAR (39))
  UNIQUE PRIMARY INDEX(keys) ON COMMIT PRESERVE ROWS;"

# Connect To Teradata DB
channel <- odbcConnect('DB')

# Execute Temp Table
sqlQuery(channel, paste(strSQL.TempTable))

sqlUpdate(channel, mydata, tablename = "TEMP_Keys", fast = TRUE)
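
One workaround to try (a sketch resting on the assumption that sqlUpdate/sqlSave
look the target table up in the catalog, which a session-local volatile table
never reaches): push the rows in with plain INSERT statements on the same channel.

# sketch (assumption as above); 'keys' are taken from the first column
ins <- sprintf("INSERT INTO TEMP_Keys (keys) VALUES ('%s')", mydata.unique[[1]])
for (stmt in ins) sqlQuery(channel, stmt)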

 

 

Any Help would be greatly appreciated

 

KEYWORDS: Volatile, SQL, R, RODBC, teradataR

REF:
http://stackoverflow.com/questions/24740751/error-with-creating-volatile-table-in-teradata

 


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] how to change annotations in contour function from rsm package

2014-04-04 Thread John Smith
Dear All,

I am using R 3.0.3 and rsm 2.04. Since I want to do a log transformation on my
data, I want to change the annotation "Slice at ...". For example, I want to
change "Slice at W = 1.25, L = 2" in the first graph into "Slice at W =
log(1.25), L = log(2)". Can anyone tell me how I can change it?

Thanks

library(rsm)
heli.rsm <- rsm(ave ~ block + SO(x1, x2, x3, x4), data = heli)
windows(width=10.8/2*3, height=10.8)
par(mfrow = c(2, 3))
contour(heli.rsm, ~ x1 + x2 + x3 + x4, image = TRUE)

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] how to set pch in auto.key for splom

2013-10-15 Thread John Smith
Dear All,

I am using the most current version of R (3.0.2) and lattice (0.20-24), but I
cannot set the appropriate pch type with the following code. Could anyone help me?



splom(~iris[1:4], groups = iris$Species, pch=1:3, auto.key = list(space =
'top', columns = 3, pch=1:3))
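
A sketch of the usual lattice idiom: setting the plotting symbols through
par.settings makes the panels and the auto.key draw from the same
superpose.symbol settings, so both show pch 1:3.

library(lattice)
splom(~iris[1:4], groups = iris$Species,
      par.settings = list(superpose.symbol = list(pch = 1:3)),
      auto.key = list(space = "top", columns = 3))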


Thanks

John

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] appending and merging data frames

2013-02-07 Thread John Smith
I know that the basic approach to append or merge data frames is using the 
rbind and merge commands.
However, if I understand things correctly, for both commands one needs to do 
quite a bit of additional programming to get, e.g., behavior like the Stata append 
and merge commands, or to achieve some things which I think users need quite 
frequently.

E.g. for appending, the data frames must have identical column names. In order 
to rename columns, or to add columns with missing values where necessary, 
additional programming is needed.
For merging, all matches get combined, so it is not easily possible to check 
for 1:1 or 1:n matches or to limit the join to such kinds of matches, is it?
Those are just examples, there are a number of additional details that would be 
useful to be able to control for merging/appending (maybe at the expense of 
restricting the operation to just data frames).

So my question is: are there any packages or existing utility functions which 
would provide append and merge functionality at a slightly higher 
(user-friendly) level?

Although I am quite a noob, I would be prepared to give it a try and program 
these myself, but I have the feeling that this must be so common that maybe it 
would mean re-inventing the wheel?



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] appending and merging data frames

2013-02-07 Thread John Smith
Hi Gerrit,

as I said in my original email, I already know both the
merge and the rbind commands, but I think that many
standard situations (I have given just a few) require
rather clumsy ad-hoc programming. So I was wondering
if there are any packages or existing code that would
make it easier to handle the diverse append/merge
tasks that tend to occur rather frequently.
For at least some of these tasks, other stats
packages, like Stata, do provide out-of-the-box
commands or command options.
Again, I just want to know if there is already some
package or code for this out there somewhere, especially
for appending tasks (automatically taking care of 
missing variables, renaming, data types etc.)
The RSiteSearch() function is helpful and came up
with slightly different versions of merge but nothing
for append.

John

 -Original Message-
 From: gerrit.eich...@math.uni-giessen.de
 Sent: Thu, 7 Feb 2013 16:57:13 +0100 (MET)
 To: johsmi9...@inbox.com
 Subject: Re: [R] appending and merging data frames
 
 Hello, John,
 
 as a start take a look at
 
 ?merge
 
 And to (maybe) get a bit overwhelmed at first sight use
 
  RSiteSearch("merge")
 
 
   Hth  --  Gerrit
 
 On Thu, 7 Feb 2013, John Smith wrote:
 
 I know that the basic approach to append or merge data frames is using
 the rbind and merge commands.
 However, if I understand things correctly, for both commands one needs
 to do quite some additional programming to get e.g. behavior as with the
 Stata append and morge commands or to achieve some things which I think
 users need quite frequently.
 
 E.g. for appending, the data frame must have identical column names. In
 order to rename columns or in order to add columns with missing values
 if necessary, additional programming is needed.
 For merging, all matches get combined, so it is not easily possible to
 check for 1:1 or 1:n matches or limit the join to such kind of matches,
 is it?
 Those are just examples, there are a number of additional details that
 would be useful to be able to control for merging/appending (maybe at
 the expense of restricting the operation to just data frames).
 
 So my question is: are there any packages or existing utility functions
 which would provide append and merge functionality at a slightly higher
 (user-friendly) level?
 
 Although I am quite a noob, I would be prepared to give it a try and
 program these myself, but I have the feeling that this must be so common
 that maybe it would mean re-inventing the wheel?
 
 
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] indexing operation based upon a sequence

2013-02-07 Thread John Smith
I think this is because some decimal numbers with a fixed 
number of digits after the decimal point cannot be represented
exactly as base-2 floating point numbers with a limited number of 
digits. So when they are combined arithmetically (as seq() does),
the result can be slightly different from the exact decimal value:

huh1 = seq(.1,.7,.15)
huh1 - 0.55
[1] -4.50e-01 -3.00e-01 -1.50e-01 -1.110223e-16  1.50e-01

for == to give TRUE, the numbers have to be exactly identical though.

R shows those numbers rounded to two places, so the difference is 
not immediately visible.


 -Original Message-
 From: j.brae...@uvt.nl
 Sent: Thu, 7 Feb 2013 16:41:49 +
 To: r-help@r-project.org
 Subject: [R] indexing operation based upon a sequence
 
 Dear R-list,
 
 
 We stumbled upon some weird problem when performing a simple indexing
 operation.
 Below some example code to illustrate the issue
 
 #FAILS TO FIND .55 Oo
 huh1 = seq(.1,.7,.15);huh1
 # [1] 0.10 0.25 0.40 0.55 0.70
 huh1 == .25
 # [1] FALSE  TRUE FALSE FALSE FALSE
 huh1 == .55
 # [1] FALSE FALSE FALSE FALSE FALSE
 huh1 == 0.55
 # [1] FALSE FALSE FALSE  TRUE FALSE
 
 Hence somehow one element in the sequence has become a character?
 
 #DOES NOT FAIL TO FIND .55
 huh2 = c(.1,.25,.4,.55,.7);huh3
 # [1] 0.10 0.25 0.40 0.55 0.70
 huh2 == .25
 # FALSE  TRUE FALSE FALSE FALSE
 huh2 == .55
 # [1] FALSE FALSE FALSE  TRUE FALSE
 
 This is what you would expect.
 
 #DOES NOT FAIL TO FIND .55
 huh3 = seq(.1,.7,.05);huh2
 # [1] 0.10 0.25 0.40 0.55 0.70
 huh3 == .25
 # [1] FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 FALSE FALSE
 huh3 == .55
 # [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE
 FALSE FALSE
 
 Strangely when making a longer sequence, the weird behavior does not
 persist.
 
 
 Does anyone have a clue what is going on here?
 
 
_
 platform   i386-pc-mingw32
 version.string R version 2.12.1 (2010-12-16)
 
 
 Best regards,
 
 
 Johan
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] appending and merging data frames

2013-02-07 Thread John Smith
I gave a small number of examples in my first email.

For append, in Stata any variable that is missing in the 
other dataset is automatically inserted with all values set
to the missing value. In R, one would first have to compare
the columns in both data frames and generate columns
appropriately. 

For merge, when merging dataset A with dataset B on some 
set of key variables, it is possible to specify if the
join should be done 1:1, 1:n, n:1 or n:m. Stata creates
a data structure that contains information about which 
observations could or could not be matched and which 
matches did not fit the 1:1/1:n or n:1 join pattern.

Again, I know that in R it is not hard to program all that
yourself (while programming in Stata is something different 
altogether), 
I just wondered if for these operations, which probably
are needed quite often, there are some additional packages 
or existing code that already deals with most of the situations
one encounters when appending/merging.

The background is that I am making the move from other stats
packages (including Stata) to R myself and I also want to 
motivate others to do it.

Also, before I go about implementing my own utility functions
for this, I wanted to make sure that something much better
isn't already out there (it usually is, especially when one
is a beginner).
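
To make the append case concrete, here is a sketch of the kind of helper I have
in mind (a hypothetical function, not from an existing package): it pads
whichever columns are missing with NA before stacking.

append_df <- function(a, b) {
  for (nm in setdiff(names(b), names(a))) a[[nm]] <- NA   # add b's extra columns to a
  for (nm in setdiff(names(a), names(b))) b[[nm]] <- NA   # and vice versa
  rbind(a, b[names(a)])                                   # align column order, then stack
}

append_df(df1, df2) would then behave roughly like Stata's append, minus the
label handling.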

 -Original Message-
 From: dwinsem...@comcast.net
 Sent: Thu, 7 Feb 2013 09:46:02 -0800
 To: johsmi9...@inbox.com
 Subject: Re: [R] appending and merging data frames
 
 
 On Feb 7, 2013, at 9:12 AM, John Smith wrote:
 
 Hi Gerrit,
 
 as I said in my original email, I already know both the
 merge and the rbind commands, but I think that many
 standard situations (I have given just a few) require
 rather clumsy ad-hoc programming. So I was wondering
 if there are any packages or existing code that would
 make it easier to handle the diverse append/merge
 tasks that tend to occur rather frequently.
 For at least some of these tasks, other stats
 packages, like Stata do provide out-of-the box
 commands or command options.
 Again, I just want to know if there is already some
 package or code for this out there somewhere, especially
 for appending tasks (automatically taking care of
 missing variables, renaming, data types etc.)
 The RSiteSearch() function is helpful and came up
 with slightly different versions of merge but nothing
 for append.
 
 The sqldf package provides an interface to popular database drivers. I
 do not think your question provides enough specificity to go much
 further. You say Stata provides ... something... but you do not really
 explain what that something is. My efforts to understand the Stata
 documentation for the egen command left me shaking my head in
 disbelief at its opacity, and caused me to appreciate further the
 efforts of the R developers to make our help system available. The
 most productive approach would be to present a simple example in R code.
 
 --
 David.
 
 
 John
 
 -Original Message-
 From: gerrit.eich...@math.uni-giessen.de
 Sent: Thu, 7 Feb 2013 16:57:13 +0100 (MET)
 To: johsmi9...@inbox.com
 Subject: Re: [R] appending and merging data frames
 
 Hello, John,
 
 as a start take a look at
 
 ?merge
 
 And to (maybe) get a bit overwhelmed at first sight use
 
  RSiteSearch("merge")
 
 
  Hth  --  Gerrit
 
 On Thu, 7 Feb 2013, John Smith wrote:
 
 I know that the basic approach to append or merge data frames is
 using
 the rbind and merge commands.
 However, if I understand things correctly, for both commands one
 needs
 to do quite some additional programming to get e.g. behavior as
 with the
 Stata append and morge commands or to achieve some things which I
 think
 users need quite frequently.
 
 E.g. for appending, the data frame must have identical column
 names. In
 order to rename columns or in order to add columns with missing
 values
 if necessary, additional programming is needed.
 For merging, all matches get combined, so it is not easily
 possible to
 check for 1:1 or 1:n matches or limit the join to such kind of
 matches,
 is it?
 Those are just examples, there are a number of additional details
 that
 would be useful to be able to control for merging/appending (maybe
 at
 the expense of restricting the operation to just data frames).
 
 So my question is: are there any packages or existing utility
 functions
 which would provide append and merge functionality at a slightly
 higher
 (user-friendly) level?
 
 Although I am quite a noob, I would be prepared to give it a try and
 program these myself, but I have the feeling that this must be so
 common
 that maybe it would mean re-inventing the wheel?
 
 
 GET FREE SMILEYS FOR YOUR IM  EMAIL - Learn more at
 http://www.inbox.com/smileys
 Works with AIM®, MSN® Messenger, Yahoo!® Messenger, ICQ®, Google
 Talkÿÿ
 and most webmails
 
 __
 R-help@r-project.org mailing list
 https

Re: [R] appending and merging data frames

2013-02-07 Thread John Smith
One rather extreme limitation in Stata is that there is
always just one active or loaded dataset.
So all the commands that involve more than one dataset
operate on the loaded master dataset and one or more
other datasets that are stored somewhere.

Merge in Stata joins the loaded master dataset with
another one on the harddisk. There are many options 
that influence the behavior, but roughly this is what
can be done:
When one specifies one of the join options 1:1, 1:n or
n:1, an error is shown if the key variables do not 
uniquely identify observations at the 1 side.
(so with regard to your question it is (b))
After the merge, a column is added to the resulting
dataset that indicates if the observation was matched,
if it occurred in the master dataset only or if it 
occurred in the other dataset only.
Additional options allow to restrict what will be 
kept in the merge, e.g. only observations that
occur in the master or in both and in the same way
one can restrict what pairings are allowed without
raising an error.
Another set of options allows to specify how variables
that occur in both datasets but are not key variables
are updated: e.g. always from master, always from the
other dataset or the master gets updated only if the 
current value is missing.
A merge can also be done purely on observation number
(no key variables). 
Additional options control how value labels (how 
factors are represented as readable strings) and 
variable notes are merged.
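
A minimal base-R sketch of part of that behavior -- the 1:1 uniqueness check and a match-source indicator column; the data frames, the key name and the function name below are invented for illustration:

master <- data.frame(key = c(1, 2, 3), x = c("a", "b", "c"))
using  <- data.frame(key = c(2, 3, 4), y = c(10, 20, 30))

merge_1to1 <- function(a, b, by) {
  ## Refuse the merge if the key does not uniquely identify rows on both
  ## sides -- roughly what Stata's 1:1 option enforces.
  if (anyDuplicated(a[[by]]) || anyDuplicated(b[[by]]))
    stop("key is not unique on one side; 1:1 merge not possible")
  out <- merge(a, b, by = by, all = TRUE)
  ## Stata-like indicator of where each observation came from.
  out$merge_src <- ifelse(out[[by]] %in% a[[by]] & out[[by]] %in% b[[by]],
                          "matched",
                          ifelse(out[[by]] %in% a[[by]], "master only",
                                 "using only"))
  out
}

merge_1to1(master, using, by = "key")

The many update/keep options Stata offers would need correspondingly more bookkeeping; this only sketches the shape of the problem.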

 -Original Message-
 From: wdun...@tibco.com
 Sent: Thu, 7 Feb 2013 18:18:26 +
 To: johsmi9...@inbox.com, dwinsem...@comcast.net
 Subject: RE: [R] appending and merging data frames
 
 For merge, when merging dataset A with dataset B on some
 set of key variables, it is possible to specify if the
 join should be done 1:1, 1:n, n:1 or n:m. Stata creates
 a data structure that contains information about which
 observations could or could not be matched and which
 matches did not fit the 1:1/1:n or n:1 join pattern.
 
 I don't know Stata and am curious about the above.
 If  you ask for a 1:n join but your key column[s] in your
 first input have duplicates, what does Stata do?  Does it
 (a) use the first of the duplicates to produce an answer
   and also return an object describing the problem
 (b) refuse to do the merge and return an object describing
   the problem
 or something else?  How is the data structure containing
 information about problems in the merge connected to
 the output of merge?
 
 Bill Dunlap
 Spotfire, TIBCO Software
 wdunlap tibco.com
 
 
 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
 On Behalf
 Of John Smith
 Sent: Thursday, February 07, 2013 9:59 AM
 To: David Winsemius
 Cc: r-help@r-project.org
 Subject: Re: [R] appending and merging data frames
 
 I gave a small number of examples in my first email.
 
For append, in Stata any variable that is missing in the
other dataset is automatically inserted with all values set
to the missing value. In R, one would first have to compare
the columns in both data frames and generate columns
appropriately.
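
A minimal sketch of that kind of append in base R; the two data frames and the function name are made up for illustration (packages such as plyr provide rbind.fill for the same purpose):

a <- data.frame(id = 1:2, x = c(1.5, 2.5))
b <- data.frame(id = 3:4, y = c("u", "v"))

rbind_fill_na <- function(a, b) {
  ## Add to each data frame the columns it is missing, filled with NA,
  ## then rbind on the now-identical set of columns.
  a[setdiff(names(b), names(a))] <- NA
  b[setdiff(names(a), names(b))] <- NA
  rbind(a, b[names(a)])
}

rbind_fill_na(a, b)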
 
 For merge, when merging dataset A with dataset B on some
 set of key variables, it is possible to specify if the
 join should be done 1:1, 1:n, n:1 or n:m. Stata creates
 a data structure that contains information about which
 observations could or could not be matched and which
 matches did not fit the 1:1/1:n or n:1 join pattern.
 
 Again, I know that in R it is not hard to program all that
 yourself (while programming in Stata is something different
 altogether),
 I just wondered if for these operations, which probably
 are needed quite often, there are some additional packages
 or existing code that already deals with most of the situations
 one encounters when appending/merging.
 
 The background is that I am making the move from other stats
 packages (including Stata) to R myself and I also want to
 motivate others to do it.
 
 Also, before I go about implementing my own utility functions
 for this, I wanted to make sure that something much better
 isn't already out there (it usually is, especially when one
 is a beginner).
 
 -Original Message-
 From: dwinsem...@comcast.net
 Sent: Thu, 7 Feb 2013 09:46:02 -0800
 To: johsmi9...@inbox.com
 Subject: Re: [R] appending and merging data frames
 
 
 On Feb 7, 2013, at 9:12 AM, John Smith wrote:
 
 Hi Gerrit,
 
 as I said in my original email, I already know both the
 merge and the rbind commands, but I think that many
 standard situations (I have given just a few) require
 rather clumsy ad-hoc programming. So I was wondering
 if there are any packages or existing code that would
 make it easier to handle the diverse append/merge
 tasks that tend to occur rather frequently.
For at least some of these tasks, other stats
packages, like Stata, do provide out-of-the-box
commands or command options.
 Again, I just want to know if there is already some

[R] use xyplot to plot mean and CI by groups

2012-05-29 Thread John Smith
Dear R users,

I am trying to use xyplot to draw group means and CIs. The following is the
sample code, but I want to:

   1. use different colors and symbols to draw the individual points, the CIs and
      the lines connecting group means at different time points;
   2. add jitter to the x axis so that the CIs do not overlap.

Could anyone modify the attached code to achieve this?

Thanks





library(lattice)
set.seed(123)
src <- data.frame(time = rep(c(1, 6, 24), rep(6,3)),
                  trt = rep(rep(c('C','T'), each=3), 3))
src <- transform(src, x = sqrt(time) + 2*(trt=='C') + rnorm(18),
                 trt = ordered(trt, levels=c('C','T')))
src <- src[-5,]
src$m  <- ave(src$x, src[c('time','trt')], FUN = mean)
src$sd <- ave(src$x, src[c('time','trt')], FUN = sd)
src$n  <- ave(src$x, src[c('time','trt')], FUN = length)
src <- transform(src, se = sd/sqrt(n))
src <- transform(src, cl = m - se, cu = m + se)

prepanel.ci <- function(x, y, ly, uy, subscripts, ...)
  {
    x  <- as.numeric(x)
    ly <- as.numeric(ly[subscripts])
    uy <- as.numeric(uy[subscripts])
    list(ylim = range(y, uy, ly, finite = TRUE))
  }

panel.ci <- function(x, y, ly, my, uy, subscripts, ...)
  {
    x  <- as.numeric(x); y <- as.numeric(y); my <- as.numeric(my[subscripts])
    ly <- as.numeric(ly[subscripts]); uy <- as.numeric(uy[subscripts])
    panel.xyplot(x, y, ...)
    panel.arrows(x, ly, x, uy, length = 0.25, unit = "native", angle = 90,
                 code = 3)
    panel.lines(x, my)
  }

par.settings <- list(superpose.symbol = list(col = c("red", "green", "blue"),
                                             fill = c("red", "green", "blue"),
                                             pch = 15:17),
                     superpose.line = list(col = c("red", "green", "blue")))
windows(width=10.67, height=6.60)
xyplot(x ~ time, xlab = "Time", ylab = "X", groups = trt, data = src, trt = src$trt,
       ly = src$cl, uy = src$cu, mx = src$time, my = src$m, type = "p",
       prepanel = prepanel.ci, panel = panel.superpose, panel.groups = panel.ci,
       auto.key = list(space = "top", points = TRUE, lines = TRUE, columns = 2),
       par.settings = par.settings)
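
A sketch of one way to add the requested jitter, assuming a fixed horizontal offset per treatment group is acceptable: it wraps the panel.ci function above and relies on the group.number argument that panel.superpose passes to its panel.groups function (the offset 0.3 and the wrapper name are arbitrary):

panel.ci.jitter <- function(x, y, ly, my, uy, subscripts, group.number, ...)
  {
    ## shift group 1 a little left and group 2 a little right so the
    ## confidence bars no longer sit on top of each other
    off <- (group.number - 1.5) * 0.3
    panel.ci(x + off, y, ly = ly, my = my, uy = uy,
             subscripts = subscripts, ...)
  }

xyplot(x ~ time, xlab = "Time", ylab = "X", groups = trt, data = src,
       ly = src$cl, uy = src$cu, my = src$m, type = "p",
       prepanel = prepanel.ci, panel = panel.superpose,
       panel.groups = panel.ci.jitter,
       auto.key = list(space = "top", points = TRUE, lines = TRUE, columns = 2),
       par.settings = par.settings)

Different colors and symbols per group already come from the superpose.symbol and superpose.line components of par.settings.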



[R] Performing t tests between matrices

2012-04-19 Thread John Smith
I need to perform 1000 t tests.

#I have two populations with different means
Popc1 <- rnorm(10000, 10, 2)
Popc2 <- rnorm(10000, 8, 2)

#I created two sets of samples - each set has 1000 samples, and I made a
#matrix of 20 rows and 1000 columns to fit the data

sampc1 <- matrix(, 20, 1000)
for(j in 1:1000){ sampc1[1:20, j] <- sample(Popc1, 20) }
sampc2 <- matrix(, 20, 1000)
for(j in 1:1000){ sampc2[1:20, j] <- sample(Popc2, 20) }

#I wrote a blank matrix with 1 row and 1000 columns. I am planning to fill
#this matrix with the t values that I calculate for each pair of samples
t <- matrix(, 1, 1000)

#What I need to do is to calculate the t value for each pair of samples: so
#if I were to do it manually, for the first sample in each set, I would do
#something like:

t.test(sampc1[,1], sampc2[,1])
#And I would continue doing this all the way until the 1000th pair of samples,
#so t.test(sampc1[,2], sampc2[,2])
#t.test(sampc1[,3], sampc2[,3]) and so on

#But I need to do this for all 1000 pairs of samples:
#I tried to write a loop

for(j in 1:1000) { t[, 1:1000] <- t.test(sampc1[, j], sampc2[, j])$statistic }

#but this fills each column in the matrix with the same exact t value.

#Can someone tell me how to write the code to perform each pair of t tests?
#I think I am missing something in my loop

Thanks
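
The loop fills every column with the same value because it assigns to the whole column range on every iteration. A sketch of the corrected loop, using sampc1 and sampc2 as defined above and indexing the result by j (the object names are invented):

n_tests <- ncol(sampc1)            # number of sample pairs
tvals <- numeric(n_tests)
for (j in seq_len(n_tests)) {
  ## store the j-th statistic in the j-th slot
  tvals[j] <- t.test(sampc1[, j], sampc2[, j])$statistic
}

## or, equivalently, without an explicit loop:
tvals <- sapply(seq_len(n_tests), function(j)
  t.test(sampc1[, j], sampc2[, j])$statistic)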






Re: [R] can not use plot.Predict {rms} reproduce figure 7.8 from Regression Modeling Strategies (http://biostat.mc.vanderbilt.edu/wiki/pub/Main/RmS/course2.pdf)

2012-03-15 Thread John Smith
Sorry. I am still using the 9-11 March 2011 version of course2.pdf.



On Wed, Mar 14, 2012 at 5:52 PM, David Winsemius dwinsem...@comcast.net wrote:


 On Mar 14, 2012, at 4:09 PM, John Smith wrote:

  With the most current versions of R and rms, the 4 curves are drawn in
 4 separate panels. Can anyone show me how I can get the figure exactly like
 figure 7.8 in *Regression Modeling Strategies* (
 http://biostat.mc.vanderbilt.edu/wiki/pub/Main/RmS/course2.pdf )


 In case anyone else is scratching their head wondering what connection
 there might be between figure 7.8 in that document (which happens to be a
 nomogram), they should realize that the questioner is most probably asking
 about 7.8 in the RMS *book*, and that there is no such figure in chapter 7
 of the pdf document. (Nor, for that matter, does a search for 'democrat'
 bring up any hits in the document, so I don't think the code comes from
 there either.)

 I suspect the data is here:

 http://biostat.mc.vanderbilt.edu/wiki/pub/Main/DataSets/counties.sav


 --

 David.





 On Tue, May 17, 2011 at 4:04 PM, John Smith zmr...@gmail.com wrote:

  Dear R-users,

 I am using R 2.13.0 and rms 3.3-0, but cannot reproduce figure 7.8 of
 the handouts *Regression Modeling Strategies* (
 http://biostat.mc.vanderbilt.edu/wiki/pub/Main/RmS/course2.pdf) by the
 following code. Could anyone help me figure out how to solve this?


 setwd('C:/Rharrell')
 require(rms)
 load('data/counties.sav')

 older <- counties$age6574 + counties$age75
 label(older) <- '% age >= 65, 1990'
 pdensity <- logb(counties$pop.density+1, 10)
 label(pdensity) <- 'log 10 of 1992 pop per 1990 miles^2'
 counties <- cbind(counties, older, pdensity)  # add 2 vars. not in data frame
 dd <- datadist(counties)
 options(datadist='dd')

 f <- ols(democrat ~ rcs(pdensity,4) + rcs(pop.change,3) +
          rcs(older,3) + crime + rcs(college,5) + rcs(income,4) +
          rcs(college,5) %ia% rcs(income,4) +
          rcs(farm,3) + rcs(white,5) + rcs(turnout,3), data=counties)

 incomes <- seq(22900, 32800, length=4)
 show.pts <- function(college.pts, income.pt)
   {
     s <- abs(income - income.pt) < 1650
     # Compute 10th smallest and 10th largest % college
     # educated in counties with median family income within
     # $1650 of the target income
     x <- college[s]
     x <- sort(x[!is.na(x)])
     n <- length(x)
     low <- x[10]; high <- x[n-9]
     college.pts >= low & college.pts <= high
   }
 windows()
 plot(Predict(f, college, income=incomes, conf.int=FALSE),
      xlim=c(0,35), ylim=c(30,55), lty=1, lwd=c(.25,1.5,3.5,6),
      col=c(1,1,2,2), perim=show.pts)





 David Winsemius, MD
 West Hartford, CT





Re: [R] can not use plot.Predict {rms} reproduce figure 7.8 from Regression Modeling Strategies (http://biostat.mc.vanderbilt.edu/wiki/pub/Main/RmS/course2.pdf)

2012-03-14 Thread John Smith
With the most current versions of R and rms, the 4 curves are drawn in
4 separate panels. Can anyone show me how I can get the figure exactly like
figure 7.8 in *Regression Modeling Strategies* (
http://biostat.mc.vanderbilt.edu/wiki/pub/Main/RmS/course2.pdf)

Thanks




On Tue, May 17, 2011 at 4:04 PM, John Smith zmr...@gmail.com wrote:

 Dear R-users,

 I am using R 2.13.0 and rms 3.3-0, but cannot reproduce figure 7.8 of
 the handouts *Regression Modeling Strategies* (
 http://biostat.mc.vanderbilt.edu/wiki/pub/Main/RmS/course2.pdf) by the
 following code. Could anyone help me figure out how to solve this?


 setwd('C:/Rharrell')
 require(rms)
 load('data/counties.sav')

 older <- counties$age6574 + counties$age75
 label(older) <- '% age >= 65, 1990'
 pdensity <- logb(counties$pop.density+1, 10)
 label(pdensity) <- 'log 10 of 1992 pop per 1990 miles^2'
 counties <- cbind(counties, older, pdensity)  # add 2 vars. not in data frame
 dd <- datadist(counties)
 options(datadist='dd')

 f <- ols(democrat ~ rcs(pdensity,4) + rcs(pop.change,3) +
          rcs(older,3) + crime + rcs(college,5) + rcs(income,4) +
          rcs(college,5) %ia% rcs(income,4) +
          rcs(farm,3) + rcs(white,5) + rcs(turnout,3), data=counties)

 incomes <- seq(22900, 32800, length=4)
 show.pts <- function(college.pts, income.pt)
   {
     s <- abs(income - income.pt) < 1650
     # Compute 10th smallest and 10th largest % college
     # educated in counties with median family income within
     # $1650 of the target income
     x <- college[s]
     x <- sort(x[!is.na(x)])
     n <- length(x)
     low <- x[10]; high <- x[n-9]
     college.pts >= low & college.pts <= high
   }
 windows()
 plot(Predict(f, college, income=incomes, conf.int=FALSE),
      xlim=c(0,35), ylim=c(30,55), lty=1, lwd=c(.25,1.5,3.5,6),
      col=c(1,1,2,2), perim=show.pts)




 Thanks






Re: [R] general question on Spotfire

2012-02-16 Thread John Smith
After one month of struggle, I have lost confidence in Spotfire; it's totally
not user-friendly. Yes, it is very good for certain kinds of data presentation,
but you still need to do the heavy lifting of data processing with other
software. For me, R is the perfect tool and has the best user support group.
I am sending this email just for the statisticians like me: I don't think
Spotfire can improve your efficiency.



On Sun, Jan 15, 2012 at 5:22 PM, Frank Harrell f.harr...@vanderbilt.edu wrote:

 Thanks for your note Uwe.  Yes I think a lot of the needed work was related
 to implementing R functions that many of us use that are not available in
 S-Plus, plus what to do about plotmath.  It wasn't enough to just be able
 to
 load the R package.  I don't think implementation of the needed R functions
 in S-Plus ever happened.

 Frank


 Uwe Ligges-3 wrote
 
  On 12.01.2012 17:38, Frank Harrell wrote:
  As a slight aside, Tibco/Spotfire originally planned to provide a
  capability
  to load R packages into S-Plus.  This always seemed to me to be a hard
  thing
  to do, and if my understanding is correct, this proved to be too
  difficult
  to do in S-Plus, at  least for large packages such as mine.
 
  Frank,
 
  this dates back to the times of Insightful. They already did that and
  had a module that allowed to load R packages. Of course, they had to be
  S-PLUS compatible (which is not really easy any more if package authors
  made use of R functionality that exceed the features of S-PLUS). They
  even had hired people to make some R packages S-PLUS compatible,
  R2WinBUGS was just one example that was available from their CSAN
   repositories. (Nowadays R2WinBUGS is no longer S-PLUS compatible, since
  the authors do not care too much and do not have S-PLUS licenses to
  check it.)
 
  Best,
  Uwe
 
 
 
 
 
 
 
  Frank
 
  Terry Therneau-2 wrote
 
  John,
 Spotfire is a menu driven data exploration tool, very popular here
  with biologists who found that their previous Excel based approach
  doesn't cut it for large data sets.  When TIBCO wanted to expand the
  tool with further quantitative features they made (I think) a bright
  decision to purchase S-plus and integrate it as a back end, rather than
  try to write dozens of new modules in house.  The Splus vs R aspect
 of
  the list responses misses the main point, however.
 
   Spotfire is designed to let you nose around in a data set, quickly
   plotting various aspects, zooming in on subsets (imagine a mouse-based
   version of the pinch metaphor used on the iPhone), etc.  It is a useful
   and very well designed tool; one demo was enough to make the sale and
   early growth here was explosive.  But if you already know R you can do
   those graphs already, albeit quite a bit slower.  I decided not to
   pursue proficiency in Spotfire, but that was partly because it's Windows
   based and I prefer Unix.  Also most of my work is at the
   post-exploration phase, and I would have flipped back to straight R for
   that anyway.
 
  Terry Therneau
 
 
 
 
  -
  Frank Harrell
  Department of Biostatistics, Vanderbilt University
 
 


 -
 Frank Harrell
 Department of Biostatistics, Vanderbilt University





[R] general question on Spotfire

2012-01-11 Thread John Smith
Dear R users,

I have been using R for 10 years, and I love it very much. But in my daily
job for drug discovery, some people use Spotfire. I tried Spotfire on a
couple of data sets. It seems I still need to do some data manipulation
before plotting figures. For example, I cannot plot a figure with data
arranged in rows (is this true, or am I stupid?). So far I don't see any
benefit Spotfire can provide over R. I am just wondering whether it is just
because I am new to Spotfire, or whether it is true that Spotfire is not a
good tool for statisticians.

Also, could anyone give me any suggestions on how to learn Spotfire?

Thanks

John



Re: [R] general question on Spotfire

2012-01-11 Thread John Smith
I am struggling with whether I should learn Spotfire or not. I just want some
statisticians' input.

Thanks



On Wed, Jan 11, 2012 at 10:28 AM, Duncan Murdoch
murdoch.dun...@gmail.com wrote:

 On 12-01-11 10:13 AM, John Smith wrote:

 Dear R users,

 I have been using R for 10 years, and I love it very much. But in my daily
 job for drug discovery, some people use Spotfire. I tried Spotfire on
 couple of data sets. It sounds I still need do some data manipulation
 before plot figures. For example, I can not plot figure with data arranged
 in rows (is this true, or I am stupid?).   So far I don't feel any benefit
 Spotfire can provide over R. I am just wondering whether it just because I
 am new to Spotfire, or it's true that Spotfire is not a good tool for
 statistician.

 Also could anyone give me any suggestion how to learn Spotfire?


 Shouldn't you be asking this question to Spotfire users?

 Duncan Murdoch




[R] can not use plot.Predict {rms} reproduce figure 7.8 from Regression Modeling Strategies (http://biostat.mc.vanderbilt.edu/wiki/pub/Main/RmS/course2.pdf)

2011-05-17 Thread John Smith
Dear R-users,

I am using R 2.13.0 and rms 3.3-0, but cannot reproduce figure 7.8 of the
handouts *Regression Modeling Strategies* (
http://biostat.mc.vanderbilt.edu/wiki/pub/Main/RmS/course2.pdf) by the
following code. Could anyone help me figure out how to solve this?


setwd('C:/Rharrell')
require(rms)
load('data/counties.sav')

older <- counties$age6574 + counties$age75
label(older) <- '% age >= 65, 1990'
pdensity <- logb(counties$pop.density+1, 10)
label(pdensity) <- 'log 10 of 1992 pop per 1990 miles^2'
counties <- cbind(counties, older, pdensity)  # add 2 vars. not in data frame
dd <- datadist(counties)
options(datadist='dd')

f <- ols(democrat ~ rcs(pdensity,4) + rcs(pop.change,3) +
         rcs(older,3) + crime + rcs(college,5) + rcs(income,4) +
         rcs(college,5) %ia% rcs(income,4) +
         rcs(farm,3) + rcs(white,5) + rcs(turnout,3), data=counties)

incomes <- seq(22900, 32800, length=4)
show.pts <- function(college.pts, income.pt)
  {
    s <- abs(income - income.pt) < 1650
    # Compute 10th smallest and 10th largest % college
    # educated in counties with median family income within
    # $1650 of the target income
    x <- college[s]
    x <- sort(x[!is.na(x)])
    n <- length(x)
    low <- x[10]; high <- x[n-9]
    college.pts >= low & college.pts <= high
  }
windows()
plot(Predict(f, college, income=incomes, conf.int=FALSE),
     xlim=c(0,35), ylim=c(30,55), lty=1, lwd=c(.25,1.5,3.5,6),
     col=c(1,1,2,2), perim=show.pts)




Thanks



Re: [R] can not use plot.Predict {rms} reproduce figure 7.8 from Regression Modeling Strategies (http://biostat.mc.vanderbilt.edu/wiki/pub/Main/RmS/course2.pdf)

2011-05-17 Thread John Smith
Also, I cannot reproduce figure 7.11 with

f <- Newlabels(f, list(turnout='voter turnout (%)'))
windows()
nomogram(f, interact=list(income=incomes), turnout=seq(30,100,by=10),
         lplabel='estimated % voting Democratic', cex.var=.8, cex.axis=.75)



On Tue, May 17, 2011 at 4:04 PM, John Smith zmr...@gmail.com wrote:

 Dear R-users,

 I am using R 2.13.0 and rms 3.3-0, but cannot reproduce figure 7.8 of the
 handouts *Regression Modeling Strategies* (
 http://biostat.mc.vanderbilt.edu/wiki/pub/Main/RmS/course2.pdf) by the
 following code. Could anyone help me figure out how to solve this?


 setwd('C:/Rharrell')
 require(rms)
 load('data/counties.sav')

 older <- counties$age6574 + counties$age75
 label(older) <- '% age >= 65, 1990'
 pdensity <- logb(counties$pop.density+1, 10)
 label(pdensity) <- 'log 10 of 1992 pop per 1990 miles^2'
 counties <- cbind(counties, older, pdensity)  # add 2 vars. not in data frame
 dd <- datadist(counties)
 options(datadist='dd')

 f <- ols(democrat ~ rcs(pdensity,4) + rcs(pop.change,3) +
          rcs(older,3) + crime + rcs(college,5) + rcs(income,4) +
          rcs(college,5) %ia% rcs(income,4) +
          rcs(farm,3) + rcs(white,5) + rcs(turnout,3), data=counties)

 incomes <- seq(22900, 32800, length=4)
 show.pts <- function(college.pts, income.pt)
   {
     s <- abs(income - income.pt) < 1650
     # Compute 10th smallest and 10th largest % college
     # educated in counties with median family income within
     # $1650 of the target income
     x <- college[s]
     x <- sort(x[!is.na(x)])
     n <- length(x)
     low <- x[10]; high <- x[n-9]
     college.pts >= low & college.pts <= high
   }
 windows()
 plot(Predict(f, college, income=incomes, conf.int=FALSE),
      xlim=c(0,35), ylim=c(30,55), lty=1, lwd=c(.25,1.5,3.5,6),
      col=c(1,1,2,2), perim=show.pts)




 Thanks






[R] how to delete empty levels from lattice xyplot

2011-03-02 Thread John Smith
Hello All,

I am trying to use the attached code to produce a cross-over plot. There are 13
subjects, 7 of them in the for/sal group and 6 of them in the sal/for group. But
in the xyplot, all the subjects are listed in both panels. Could anyone help me
figure out how to get rid of the empty levels?

Thanks




library(lattice)

pef1 <- c(310,310,370,410,250,380,330,370,310,380,290,260,90)
pef2 <- c(270,260,300,390,210,350,365,385,400,410,320,340,220)
id <- c(1,4,6,7,10,11,14,2,3,5,9,12,13)
sequence <- c(rep('for/sal', 7), rep('sal/for', 6))
treat1 <- c(rep('for', 7), rep('sal', 6))
treat2 <- c(rep('sal', 7), rep('for', 6))
study <- data.frame(id, sequence, treat1, pef1, treat2, pef2)

studyLong <- as.data.frame(rbind(as.matrix(study[,c('id', 'sequence', 'treat1', 'pef1')]),
                                 as.matrix(study[,c('id', 'sequence', 'treat2', 'pef2')])))
colnames(studyLong) <- c('id', 'sequence', 'treat', 'pef')

xyplot(pef ~ id | sequence, groups=treat, data=studyLong,
       auto.key=list(columns=2))
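
A sketch of one workaround, not necessarily the most elegant lattice solution: because id becomes a factor shared by both panels, build a separate plot per sequence from a subset with unused levels dropped and print the two trellis objects side by side (pef is also converted back to numeric, since the rbind(as.matrix(...)) step above turned it into a factor; the object names are invented):

studyLong$pef <- as.numeric(as.character(studyLong$pef))

p1 <- xyplot(pef ~ id, groups = treat, auto.key = list(columns = 2),
             main = "for/sal",
             data = droplevels(subset(studyLong, sequence == "for/sal")))
p2 <- xyplot(pef ~ id, groups = treat, auto.key = list(columns = 2),
             main = "sal/for",
             data = droplevels(subset(studyLong, sequence == "sal/for")))

print(p1, split = c(1, 1, 2, 1), more = TRUE)   # left half of the page
print(p2, split = c(2, 1, 2, 1))                # right half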



[R] lattice auto.key gives mismatch colors

2011-02-11 Thread John Smith
Hello All,

I am using the following code to draw a figure, but the legend given by
auto.key has mismatched colors. Could anyone help me?

I am using R2.12.1 and most current lattice on windows XP.

Thanks

John

library(lattice)

src <- data.frame(t=rep(c('A','B','C','D'), rep(8,4)),
                  s=rep(c(8132,8140,8178,8180,8224,8230,8337,8345), 4),
                  v=c(55.10, 56.00, 206.00, 5.86, 164.00, 102.00, 171.00, 280.00,
                      236.00, 91.10, 238.00, 102.00, 59.30, 227.00, 280.00, 316.00,
                      205.00, 120.00, 273.00, 98.80, 167.00, 104.00, 155.00, 370.00,
                      215.00, 97.60, 133.00, 135.00, 48.60, 135.00, 77.10, 91.90))
colors <- rgb(c(228,  55,  77, 152, 255, 255, 166, 247),
              c( 26, 126, 175,  78, 127, 255,  86, 129),
              c( 28, 184,  74, 163,   0,  51,  40, 191), maxColorValue=255)
xyplot(v~t, groups=s, type='o', data=src, col=colors,
       auto.key = list(points=TRUE, columns = 4, col=colors))
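
The mismatch typically arises because col= passed to xyplot() colors only what the panel draws, while auto.key builds the legend from the superpose.symbol theme settings. A sketch of one common fix is to put the colors into the theme so the panel and the key draw from the same source:

xyplot(v ~ t, groups = s, type = 'o', data = src,
       par.settings = simpleTheme(col = colors, pch = 16),
       auto.key = list(points = TRUE, columns = 4))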



[R] need help in understanding R code, and maybe some math

2010-05-23 Thread john smith
Hi,
I am trying to implement Higham's algorithm for correcting a non positive
definite covariance matrix. I found this code in R:
http://projects.cs.kent.ac.uk/projects/cxxr/trac/browser/trunk/src/library/Recommended/Matrix/R/nearPD.R?rev=637

I managed to understand most of it; the only line I really don't understand
is this one:
X <- tcrossprod(Q * rep(d[p], each=nrow(Q)), Q)

This line is supposed to calculate the matrix product Q*D*Q^T, where Q is an
n by m matrix and D is a diagonal matrix. What does this mean?
I also don't understand the meaning of a cross product between matrices; I
only know it between vectors.

Thanks,
Barisdad.
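
A small sketch of what that line does: tcrossprod(A, B) is A %*% t(B), and multiplying Q by rep(d, each = nrow(Q)) recycles d column-wise, i.e. it scales the j-th column of Q by d[j]; together this gives Q %*% diag(d) %*% t(Q) without ever forming the diagonal matrix. The small Q and d below are made up for illustration:

set.seed(1)
Q <- matrix(rnorm(12), nrow = 4, ncol = 3)        # n = 4, m = 3
d <- c(2, 0.5, 1)                                 # diagonal entries of D

X1 <- tcrossprod(Q * rep(d, each = nrow(Q)), Q)   # the nearPD idiom
X2 <- Q %*% diag(d) %*% t(Q)                      # the textbook formula

all.equal(X1, X2)   # TRUE: both are Q D Q^T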



[R] bug in R2WinBUGS

2009-01-09 Thread John Smith
In the newest version of R2WinBUGS, the current directory is changed to
working.directory, but it is never changed back once the bugs() call has finished.
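
A sketch of a caller-side workaround until that is fixed; the wrapper name is invented, and it simply saves and restores the working directory around the call:

library(R2WinBUGS)

bugs_keep_wd <- function(...) {
  oldwd <- getwd()
  on.exit(setwd(oldwd), add = TRUE)   # restore even if bugs() errors out
  bugs(...)                           # pass all arguments straight through
}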



Re: [R] Lattice: How to draw curves from given formulae

2008-08-04 Thread John Smith
I have another question. How can I put specific names into the strips of the
resulting plot?

For instance, in the resulting figure from the attached code, instead of
'umbrella(d)', I want to have 'UMBRELLA' in the strip.

library(lattice)

flat     <- function(d) 0 * d
linear   <- function(d) -(1.65/8) * d
logistic <- function(d) 0.015 - 1.73 / (1 + exp(1.2 * (4-d)))
umbrella <- function(d) -(1.65/3) * d + (1.65 / 36) * d^2
emax     <- function(d) -1.81 * d / (0.79 + d)
sigmoid  <- function(d) -1.70 * d^5 / (4^5 + d^5)

xyplot(flat(d) + linear(d) + logistic(d) + umbrella(d) + emax(d) + sigmoid(d) ~ d,
       data = list(d = seq(0, 8, length = 101)),
       ylab = 'Expected change from baseline in VAS at Week 6', xlab = 'Dose',
       type = "l", outer = TRUE)

Many thanks




On Tue, Jul 22, 2008 at 4:31 PM, Deepayan Sarkar
[EMAIL PROTECTED] wrote:

 On 7/22/08, John Smith [EMAIL PROTECTED] wrote:
  Thanks, Deepayan,
 
   I actually want to have a picture like the Figure 5.9 in Mixed-Effects
   Models in S and S-PLUS, which have a separate panel for each function
 and
   has the function name been printed in strip.

 xyplot(linear(d) + logistic(d) + umbrella(d) ~ d,
  data = list(d = seq(-5, 5, length = 101)),
   type = "l", outer = TRUE)

 There are several ways to make the strip labels just the names; the
 simplest is to create a data frame with all the necessary columns
 beforehand.

 -Deepayan
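
A sketch of what that suggestion can look like in practice, using the functions defined in the post above: build a long data frame whose grouping factor carries the display names, so the strips show exactly those labels (the upper-case names and object names are just an illustration):

d <- seq(0, 8, length = 101)
curves <- list(FLAT = flat, LINEAR = linear, LOGISTIC = logistic,
               UMBRELLA = umbrella, EMAX = emax, SIGMOID = sigmoid)

## one row per (dose, curve) pair; 'which' holds the label to display
long <- do.call(rbind, lapply(names(curves), function(nm)
  data.frame(d = d, y = curves[[nm]](d), which = nm)))
long$which <- factor(long$which, levels = names(curves))

xyplot(y ~ d | which, data = long, type = "l", xlab = "Dose",
       ylab = "Expected change from baseline in VAS at Week 6")

Alternatively, keeping the original formula, the strip labels can be overridden with strip = strip.custom(factor.levels = ...).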

 
   Thanks
 
 
 
   On Tue, Jul 22, 2008 at 1:08 PM, Deepayan Sarkar 
 [EMAIL PROTECTED]
   wrote:
 
 
On 7/21/08, John Smith [EMAIL PROTECTED] wrote:
 Dear R Users:

  I have a list function as:
  Flat:  y = 0
  Linear: y = -(1.65/8)d
  Logistic: y = 0.015 - 1.73/{1+exp[1.2(4-d)]}
  Umbrella: y= -(1.65/3)d + (1.65/36)d^2
  Emax: y = -1.81d/(0.79+d)
  Sigmoid Emax: y = -1.70d^5/(4^5+d^5)

  And want draw the figure as attached (those material are extracted
 from
a
  paper). Could anyone give me a sample code to do this?
   
The attachment didn't come through, but here's one approach (with
three of your functions):
   
linear <- function(d) -(1.65/8) * d
logistic <- function(d) 0.015 - 1.73 / (1 + exp(1.2 * (4-d)))
umbrella <- function(d) -(1.65/3) * d + (1.65 / 36) * d^2

xyplot(linear(d) + logistic(d) + umbrella(d) ~ d,
   data = list(d = seq(-5, 5, length = 101)),
   type = "l",
   auto.key = list(lines = TRUE, points = FALSE))
   
-Deepayan




[R] Lattice: How to draw curves from given formulae

2008-07-22 Thread John Smith
Dear R Users:

I have a list of functions as:
Flat:  y = 0
Linear: y = -(1.65/8)d
Logistic: y = 0.015 - 1.73/{1+exp[1.2(4-d)]}
Umbrella: y= -(1.65/3)d + (1.65/36)d^2
Emax: y = -1.81d/(0.79+d)
Sigmoid Emax: y = -1.70d^5/(4^5+d^5)

And I want to draw the figure as attached (that material is extracted from a
paper). Could anyone give me sample code to do this?

Thanks


Re: [R] Lattice: How to draw curves from given formulae

2008-07-22 Thread John Smith
Thanks, Deepayan,

I actually want to have a picture like Figure 5.9 in Mixed-Effects
Models in S and S-PLUS, which has a separate panel for each function and
has the function name printed in the strip.

Thanks



On Tue, Jul 22, 2008 at 1:08 PM, Deepayan Sarkar [EMAIL PROTECTED]
wrote:

 On 7/21/08, John Smith [EMAIL PROTECTED] wrote:
  Dear R Users:
 
   I have a list of functions as:
   Flat:  y = 0
   Linear: y = -(1.65/8)d
   Logistic: y = 0.015 - 1.73/{1+exp[1.2(4-d)]}
   Umbrella: y= -(1.65/3)d + (1.65/36)d^2
   Emax: y = -1.81d/(0.79+d)
   Sigmoid Emax: y = -1.70d^5/(4^5+d^5)
 
    And I want to draw the figure as attached (that material is extracted from a
    paper). Could anyone give me sample code to do this?

 The attachment didn't come through, but here's one approach (with
 three of your functions):

  linear <- function(d) -(1.65/8) * d
  logistic <- function(d) 0.015 - 1.73 / (1 + exp(1.2 * (4-d)))
  umbrella <- function(d) -(1.65/3) * d + (1.65 / 36) * d^2

  xyplot(linear(d) + logistic(d) + umbrella(d) ~ d,
    data = list(d = seq(-5, 5, length = 101)),
    type = "l",
    auto.key = list(lines = TRUE, points = FALSE))

 -Deepayan




[R] how to draw multiple curves (for given formulae) in lattice

2008-07-21 Thread John Smith
Dear R Users,

Could you please write a piece of code to draw Figure 5.9 of Mixed-Effects
Models in S and S-Plus?

Thanks



Re: [R] how to plot an user-defined function

2008-02-06 Thread John Smith
Thank you all for your help; it has been very helpful.

But I have a further question. Suppose I have the following mixed-effect
model:

thetaMixed <- function(tau, i)
  {
    w <- 1 / (s^2 + tau^2)
    mu <- sum(theta * w) / sum(w)
    b <- s[i]^2 / (s[i]^2 + tau^2)
    theta[i]*(1-b) + mu*b
  }

I want to draw all the mixed effects in a single figure using

for (i in 1:10)
{
   plot(Vectorize(thetaMixed), 0, 2, i=i)
}

and hope that plot will recognize that i is an argument for the function
thetaMixed, and that 0, 2 are the range for tau. But it fails.

Could anyone kindly help me on this issue?

Thanks
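
A sketch of one way to get all ten curves into a single figure; theta and s are not shown in the thread, so the values below are invented purely for illustration:

theta <- c(28, 8, -3, 7, -1, 1, 18, 12, 5, -4)     # made-up effect estimates
s     <- c(15, 10, 16, 11, 9, 11, 10, 18, 9, 12)   # made-up standard errors

thetaMixed <- function(tau, i) {
  w  <- 1 / (s^2 + tau^2)
  mu <- sum(theta * w) / sum(w)
  b  <- s[i]^2 / (s[i]^2 + tau^2)
  theta[i] * (1 - b) + mu * b
}

tau <- seq(0, 2, length.out = 101)
plot(NULL, xlim = range(tau), ylim = range(theta),
     xlab = "tau", ylab = "mixed effect")
for (i in seq_along(theta)) {
  ## evaluate the scalar function at every tau value for this i
  lines(tau, sapply(tau, thetaMixed, i = i), col = i)
}

Each plot() call starts a new figure, and plot.function does not forward i to the function being plotted, which is one reason the original loop fails; computing the values explicitly and adding them with lines() (or curve(..., add = TRUE)) avoids both problems.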




On Feb 5, 2008 10:45 PM, Duncan Murdoch [EMAIL PROTECTED] wrote:

 jim holtman wrote:
  Your function 'll' only returns a single value when passed a vector:
 
 
   x <- seq(0,2,.1)
   ll(x)
 
  [1] -7.571559
 
 
  'plot' expects to pass a vector to the function and have it return a
  vector of the same length; e.g.,
 
 
  sin(x)
 
   [1] 0. 0.09983342 0.19866933 0.29552021 0.38941834 0.47942554
  0.56464247 0.64421769 0.71735609
  [10] 0.78332691 0.84147098 0.89120736 0.93203909 0.96355819 0.98544973
  0.99749499 0.99957360 0.99166481
  [19] 0.97384763 0.94630009 0.90929743
 
 
  So you either have to rewrite your function, or have a loop that will
  evaluate the function at each individual point and then plot it.
 
 Or use Vectorize, e.g.

 plot(Vectorize(ll), 0, 2)

 Duncan Murdoch
  On Feb 5, 2008 7:06 PM, John Smith [EMAIL PROTECTED] wrote:
 
  Dear R-users,
 
  Suppose I have defined a likelihood function as ll(tau), how can I plot
 this
  likelihood function by calling it by plot?
 
  I want to do it like this:
 
  ll <- function(tau)
   {
     w <- 1 / (s^2 + tau^2)
     mu <- sum(theta * w) / sum(w)
     -1/2*sum((theta-mu)^2*w - log(w))
   }
  plot(ll, 0, 2)
 
 
 
  But have the following error:
  Error in xy.coords(x, y, xlabel, ylabel, log) :
   'x' and 'y' lengths differ
  In addition: Warning messages:
  1: In s^2 + tau^2 :
   longer object length is not a multiple of shorter object length
  2: In theta * w :
   longer object length is not a multiple of shorter object length
  3: In (theta - mu)^2 * w :
   longer object length is not a multiple of shorter object length
 
 
  Thanks
 
 [[alternative HTML version deleted]]
 
 
 
 
 
 
 





[R] R2WinBUGS is broken

2008-01-31 Thread John Smith
Dear R-users,

I am trying to use the following code to reproduce results from Prof.
Gelman's book, but I get the error listed below with the R2WinBUGS version
(the OpenBUGS version works). I am using R 2.6.1 on Windows XP, and all the
R packages are the most current ones. schools.bug can be found at
http://www.stat.columbia.edu/~gelman/bugsR/runningbugs.html . Can anyone
help me figure out what's going wrong?


rm(list=ls())
library(R2WinBUGS)
library('BRugs')
setwd('C:/bayesian/bda/codes/example_5_5')
data(schools)
J <- nrow(schools)
y <- schools$estimate
sigma.y <- schools$sd
data <- list(J=J, y=y, sigma.y=sigma.y)

inits <- function()
  list(theta=rnorm(J,0,100), mu.theta=rnorm(1,0,100),
       sigma.theta=runif(1,0,100))
parameters <- c('theta', 'mu.theta', 'sigma.theta')
schools.sim <- bugs(data, inits, parameters, model.file='schools.bug',
                    n.chains=3, n.iter=1000, clearWD=TRUE)
## schools.sim <- bugs(data, inits, parameters, model.file='schools.bug',
##                     n.chains=3, n.iter=1000, clearWD=TRUE,
##                     program = 'openbugs')
print(schools.sim)
plot(schools.sim)



Error in matrix(unlist(mx), ncol = n.cols, byrow = TRUE) :
  invalid 'nrow' value (too large or NA)



Thanks



[R] Hmisc: can not reproduce figure 4 of Statistical Tables and Plots using S and LATEX

2007-11-24 Thread John Smith
Dear R-users:

I cannot reproduce figure 4 of *Statistical Tables and Plots using S and
LaTeX* by Prof. Frank Harrell with the following code:

rm(list=ls())
library(Hmisc)

getHdata(pbc)
attach(pbc)
age.groups <- cut2(age, c(45,60))
g <- function(y) apply(y, 2, quantile, c(.25,.5,.75))
y <- with(pbc, cbind(Chol=chol, Bili=bili))
# You can give new column names that are not legal S names
# by enclosing them in quotes, e.g. 'Chol (mg/dl)'=chol
vars <- with(pbc, c(label(chol), label(bili)))
label(y) <- paste(vars, collapse=' and ')  # Will make nice caption in table
s3 <- summary(y ~ age.groups + ascites, fun=g, data=pbc)
s3

windows(width=10.67, height=6.60)
par(mfrow=c(1,2), oma=c(3,0,3,0))
for(ivar in 1:2)
  {
    isub <- (1:3) + (ivar-1)*3
    print(isub)
    plot(s3, which=isub, main='', xlab=vars[ivar], pch=c(91,16,93))
  }
mtitle(paste('Quartiles of', label(y)), cex.m=1.5)


Could anyone help me?

Thanks
