Re: [R] test logistic regression model

2022-11-20 Thread Mitchell Maltenfort
Agreed on the ranking of (1) vs (2)



On Sun, Nov 20, 2022 at 1:30 PM Ebert,Timothy Aaron  wrote:

> I like option 1. Option 2 may cause problems if you are pooling groups
> that do not go together. This is especially a problem if you know that the
> data is missing some groups. I would consider dropping rare groups - or
> compare results between pooling and dropping options. If the answer is the
> same in both cases then use the approach that makes your life easier with
> reviewers/clients. If the answer is different then I would go with dropping
> rare categories, or present both and highlight the difference in outcome. A
> third option is to gather more data.
>
> Tim

Re: [R] test logistic regression model

2022-11-20 Thread Ebert,Timothy Aaron
I like option 1. Option 2 may cause problems if you are pooling groups that do 
not go together. This is especially a problem if you know that the data is 
missing some groups. I would consider dropping rare groups - or compare results 
between pooling and dropping options. If the answer is the same in both cases 
then use the approach that makes your life easier with reviewers/clients. If 
the answer is different then I would go with dropping rare categories, or 
present both and highlight the difference in outcome. A third option is to 
gather more data.
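
A minimal sketch of that comparison in R, on made-up data. The data frame `df`, the cutoff `min_n` and the level name "other" are assumptions for illustration only, not anything from the original poster's data. Rare levels are either pooled into one level or dropped, and the same model is fitted both ways.

```r
# Hypothetical data: levels "c" and "d" are rare
df <- data.frame(
  f = factor(rep(c("a", "b", "c", "d"), times = c(30, 25, 2, 1))),
  y = rep(c(0, 1), length.out = 58)
)

min_n <- 5                                # assumed cutoff for "rare"; a subjective choice
counts <- table(df$f)
rare <- names(counts)[counts < min_n]

# Pooling: give every rare level the same name, which merges them
df_pooled <- df
levels(df_pooled$f)[levels(df_pooled$f) %in% rare] <- "other"

# Dropping: remove the rows in rare levels, then the unused levels
df_dropped <- droplevels(df[!(df$f %in% rare), ])

fit_pooled  <- glm(y ~ f, data = df_pooled,  family = binomial)
fit_dropped <- glm(y ~ f, data = df_dropped, family = binomial)

# Coefficients shared by both fits, side by side
cbind(pooled = coef(fit_pooled)[1:2], dropped = coef(fit_dropped))
```

With a single factor predictor the estimates for the common levels should agree between the two fits. Differences are more likely to show up once other predictors are in the model.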

Tim

-----Original Message-----
From: R-help On Behalf Of Bert Gunter
Sent: Sunday, November 20, 2022 1:06 PM
To: Mitchell Maltenfort
Cc: R-help
Subject: Re: [R] test logistic regression model

I think (2) might be a bad idea if one of the "sparse" categories has high 
predictive power. You'll lose it when you pool, will you not?
Also, there is the problem of subjectively defining "sparse."

However, 1) seems quite sensible to me. But IANAE.

-- Bert


Re: [R] test logistic regression model

2022-11-20 Thread Bert Gunter
I think (2) might be a bad idea if one of the "sparse" categories has
high predictive power. You'll lose it when you pool, will you not?
Also, there is the problem of subjectively defining "sparse."

However, 1) seems quite sensible to me. But IANAE.

-- Bert

On Sun, Nov 20, 2022 at 9:49 AM Mitchell Maltenfort  wrote:
>
> Two possible fixes occur to me
>
> 1) Redo the test/training split but within levels of factor - so you have the 
> same split within each level and each level accounted for in training and 
> testing
>
> 2) if you have a lot of levels, and perhaps sparse representation in a few, 
> consider recoding levels to pool the rare ones into an “other” category
>

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] test logistic regression model

2022-11-20 Thread Mitchell Maltenfort
Two possible fixes occur to me

1) Redo the test/training split but within levels of factor - so you have
the same split within each level and each level accounted for in training
and testing

2) if you have a lot of levels, and perhaps sparse representation in a few,
consider recoding levels to pool the rare ones into an “other” category
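
Fix 1 might look like the following base R sketch. The data frame `df`, the 70% training fraction and the seed are assumptions for illustration only. Rows are sampled separately within each level of the factor, so every level that appears in the test set has also been seen in training.

```r
set.seed(42)

# Hypothetical data with unequal group sizes
df <- data.frame(
  TG_KraftF5 = factor(rep(c("a", "b", "c"), times = c(20, 10, 6))),
  book_state = rbinom(36, 1, 0.5)
)

# Row indices grouped by factor level
rows_by_level <- split(seq_len(nrow(df)), df$TG_KraftF5)

# Within each level, pick about 70% of the rows (at least one) for training
train_idx <- unlist(lapply(rows_by_level, function(rows) {
  n_train <- max(1, floor(0.7 * length(rows)))
  rows[sample.int(length(rows), n_train)]
}), use.names = FALSE)

train <- df[train_idx, ]
test  <- df[-train_idx, ]

table(train$TG_KraftF5)
table(test$TG_KraftF5)
```

A level with a single row ends up in training only, so it never triggers the "new levels" error at prediction time.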

On Sun, Nov 20, 2022 at 11:41 AM Bert Gunter  wrote:

> small reprex:
>
> set.seed(5)
> dat <- data.frame(f = rep(c('r','g'),4), y = runif(8))
> newdat <- data.frame(f =rep(c('r','g','b'),2))
> ## convert values in newdat not seen in dat to NA
> is.na(newdat$f) <-!( newdat$f %in% dat$f)
> lmfit <- lm(y~f, data = dat)
>
> ##Result:
> > predict(lmfit,newdat)
>         1         2         3         4         5         6
> 0.4374251 0.6196527        NA 0.4374251 0.6196527        NA
>
> If this does not suffice, as Rui said, we need details of what you did.
> (predict.glm works like predict.lm)
>
>
> -- Bert


Re: [R] test logistic regression model

2022-11-20 Thread Bert Gunter
small reprex:

set.seed(5)
dat <- data.frame(f = rep(c('r','g'),4), y = runif(8))
newdat <- data.frame(f =rep(c('r','g','b'),2))
## convert values in newdat not seen in dat to NA
is.na(newdat$f) <-!( newdat$f %in% dat$f)
lmfit <- lm(y~f, data = dat)

##Result:
> predict(lmfit,newdat)
        1         2         3         4         5         6
0.4374251 0.6196527        NA 0.4374251 0.6196527        NA

If this does not suffice, as Rui said, we need details of what you did.
(predict.glm works like predict.lm)
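
The same idea applied to a binomial glm, as a sketch on made-up data. The `train` and `test` data frames here are hypothetical stand-ins that only reuse the column names from the original question. The test column is a factor rather than a character vector, to show that the NA assignment works there too.

```r
# Hypothetical training data: 6 of 10 successes in "a", 3 of 10 in "b"
train <- data.frame(
  TG_KraftF5 = factor(rep(c("a", "b"), each = 10)),
  book_state = c(0, 1, 1, 0, 1, 0, 1, 1, 0, 1,
                 0, 0, 1, 0, 0, 1, 0, 0, 0, 1)
)
# Hypothetical test data with a level "c" never seen in training
test <- data.frame(TG_KraftF5 = factor(c("a", "c", "b")))

fit <- glm(book_state ~ TG_KraftF5, data = train, family = binomial)

# Set test values not seen in training to NA
is.na(test$TG_KraftF5) <- !(test$TG_KraftF5 %in% levels(train$TG_KraftF5))

# Rows with an NA predictor get an NA prediction
pred <- predict(fit, newdata = test, type = "response")
pred
```

The second prediction comes back as NA, while the other two are the fitted probabilities for "a" and "b".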


-- Bert


On Sun, Nov 20, 2022 at 7:46 AM Rui Barradas  wrote:
>
> hello,
>
> What exactly didn't work? You say you have tried the solutions found in
> stackoverflow but without a link, we don't know which answers to which
> questions you are talking about.
> Like Bert said, if you assign NA to the new levels, present only in
> test, it should work.
>
> Can you post links to what you have tried?
>
> Hope this helps,
>
> Rui Barradas



Re: [R] test logistic regression model

2022-11-20 Thread Rui Barradas

At 15:29 on 20/11/2022, Gábor Malomsoki wrote:
> Dear Bert,
>
> Yes, was trying to fill the not existing categories with NAs, but the
> suggested solutions in stackoverflow.com unfortunately did not work.
>
> Best regards
> Gabor




Hello,

What exactly didn't work? You say you have tried the solutions found on 
Stack Overflow, but without a link we don't know which answers to which 
questions you are talking about.
As Bert said, if you assign NA to the new levels, present only in 
test, it should work.


Can you post links to what you have tried?

Hope this helps,

Rui Barradas



Re: [R] test logistic regression model

2022-11-20 Thread Gábor Malomsoki
Dear Bert,

Yes, I was trying to fill the non-existent categories with NAs, but the
suggested solutions on stackoverflow.com unfortunately did not work.

Best regards
Gabor


Bert Gunter wrote on Sun, 20 Nov 2022, 16:20:

> You can't predict results for categories that you've not seen before
> (think about it). You will need to remove those cases from your test set
> (or convert them to NA and predict them as NA).
>
> -- Bert



Re: [R] test logistic regression model

2022-11-20 Thread Bert Gunter
You can't predict results for categories that you've not seen before (think
about it). You will need to remove those cases from your test set (or
convert them to NA and predict them as NA).
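
A rough sketch of the first option (removing those cases), on made-up data. The `train` and `test` data frames below are hypothetical stand-ins that only reuse the names from the question. The levels the model was fitted on are read from the `xlevels` component of the glm object.

```r
# Hypothetical training data: 6 of 10 successes in "a", 3 of 10 in "b"
train <- data.frame(
  TG_KraftF5 = factor(rep(c("a", "b"), each = 10)),
  book_state = c(0, 1, 1, 0, 1, 0, 1, 1, 0, 1,
                 0, 0, 1, 0, 0, 1, 0, 0, 0, 1)
)
# Hypothetical test data; level "c" does not occur in train
test <- data.frame(TG_KraftF5 = factor(c("a", "b", "c", "a")))

mymodel1 <- glm(book_state ~ TG_KraftF5, data = train, family = "binomial")

# Keep only the test rows whose level was present when the model was fitted
seen <- test$TG_KraftF5 %in% mymodel1$xlevels$TG_KraftF5
test_seen <- droplevels(test[seen, , drop = FALSE])

pred <- predict(mymodel1, newdata = test_seen, type = "response")
pred
```

Here the row with level "c" is dropped and the remaining three rows are predicted without the "new levels" error.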

-- Bert

On Sun, Nov 20, 2022 at 7:02 AM Gábor Malomsoki 
wrote:

> Dear all,
>
> I have created a logistic regression model on the train df:
> mymodel1 <- glm(book_state ~ TG_KraftF5, data = train, family = "binomial")
>
> Then I try to predict with the test df:
> Predict <- predict(mymodel1, newdata = test, type = "response")
> and I get this error message:
> Error in model.frame.default(Terms, newdata, na.action = na.action, xlev =
> object$xlevels)
> Factor  "TG_KraftF5" has new levels
>
> I have tried different proposals from Stack Overflow, but unfortunately they
> did not solve the problem.
> Do you have any idea how to test a logistic regression model when you have
> different levels in the train and test df?
>
> Thank you in advance.
> Regards,
> Gabor


[R] test logistic regression model

2022-11-20 Thread Gábor Malomsoki
Dear all,

I have created a logistic regression model on the train df:
mymodel1 <- glm(book_state ~ TG_KraftF5, data = train, family = "binomial")

Then I try to predict with the test df:
Predict <- predict(mymodel1, newdata = test, type = "response")
and I get this error message:
Error in model.frame.default(Terms, newdata, na.action = na.action, xlev =
object$xlevels)
Factor  "TG_KraftF5" has new levels

I have tried different proposals from Stack Overflow, but unfortunately they
did not solve the problem.
Do you have any idea how to test a logistic regression model when you have
different levels in the train and test df?

Thank you in advance.
Regards,
Gabor



Re: [R] Embedded R: Test if initialized

2021-06-16 Thread Bert Gunter
I believe this is the wrong list for this post. See the posting guide,
linked below, for one that is more appropriate.

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Wed, Jun 16, 2021 at 12:51 PM Matthias Gondan 
wrote:

> Dear R friends,
>
> I am currently trying to write a piece of C code that uses „embedded R“,
> and for specific reasons*, I cannot keep track of whether R has already been
> initialized. So the code snippet looks like this:
>
> LibExtern char *R_TempDir;
>
> if(R_TempDir == NULL)
> …throw exception R not initialized…
>
> I have seen that the source code of Rf_initialize_R itself checks whether it
> is invoked twice (num_initialized), but this latter flag does not seem to be
> accessible, or is it?
>
> int Rf_initialize_R(int ac, char **av)
> {
>     int i, ioff = 1, j;
>     Rboolean useX11 = TRUE, useTk = FALSE;
>     char *p, msg[1024], cmdlines[1], **avv;
>     structRstart rstart;
>     Rstart Rp = &rstart;
>     Rboolean force_interactive = FALSE;
>
>     if (num_initialized++) {
>         fprintf(stderr, "%s", "R is already initialized\n");
>         exit(1);
>     }
>
>
> Is the test of the TempDir a good substitute, or should I choose another
> solution? Having said this, it may be a good idea to expose a function
> Rf_R_initialized that performs such a test.
>
> Thank you for your consideration.
>
> Best regards,
>
> Matthias
>
> *The use case is an R library that connects to swi-prolog and allows the
> „embedded“ swi-prolog to establish the reverse connection to R. In that
> case, i.e., R -> Prolog -> R, I do not want to initialize R a second time.
>

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Embedded R: Test if initialized

2021-06-16 Thread Matthias Gondan
Dear R friends,

I am currently trying to write a piece of C code that uses „embedded R“, and 
for specific reasons*, I cannot keep track of whether R has already been 
initialized. So the code snippet looks like this:

LibExtern char *R_TempDir;

if(R_TempDir == NULL)
…throw exception R not initialized…

I have seen that the source code of Rf_initialize_R itself checks whether it 
is invoked twice (num_initialized), but this latter flag does not seem to be 
accessible, or is it? 

int Rf_initialize_R(int ac, char **av)
{
    int i, ioff = 1, j;
    Rboolean useX11 = TRUE, useTk = FALSE;
    char *p, msg[1024], cmdlines[1], **avv;
    structRstart rstart;
    Rstart Rp = &rstart;
    Rboolean force_interactive = FALSE;

    if (num_initialized++) {
        fprintf(stderr, "%s", "R is already initialized\n");
        exit(1);
    }


Is the test of the TempDir a good substitute, or should I choose another 
solution? Having said this, it may be a good idea to expose a function 
Rf_R_initialized that performs such a test.

Thank you for your consideration.

Best regards,

Matthias

*The use case is an R library that connects to swi-prolog and allows the 
„embedded“ swi-prolog to establish the reverse connection to R. In that case, 
i.e., R -> Prolog -> R, I do not want to initialize R a second time.





Re: [R] test if something was plotted on pdf device

2019-09-13 Thread PIKAL Petr
Dear Duncan

Thank you for the code, I will test it or at least check what it does. In the 
meantime I found a probably easier solution.

I stay with my original code

if (dev.cur()==1) plot(ecdf(velik[,"ecd"]), main = ufil[j], col=i) else
plot(ecdf(velik[,"ecd"]), add=T, col=i)

After the plot is finished and the loop ends, I copy the result to a pdf device

dev.copy(pdf,paste(gsub(".xls", "", ufil)[j], ".pdf", sep=""))
dev.off()

With this approach I can keep my original code (almost): I check with 
dev.cur() whether a plot has been started, and save it to pdf once it is finished.

The only drawback is that the plot flashes on the screen device while it is 
being drawn, but I can live with that.
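
One possible way around the flashing, sketched under the assumption that the loop itself does all the plotting: open the pdf device first and keep a logical flag instead of asking dev.cur(). The file name `out`, the flag `first` and the simulated data are made up for illustration and stand in for the real `velik` and `ufil` objects.

```r
out <- tempfile(fileext = ".pdf")    # hypothetical output file
pdf(out)

first <- TRUE                        # nothing has been drawn on this device yet
for (i in 1:3) {
  x <- rnorm(100)                    # stand-in for velik[, "ecd"]
  if (first) {
    plot(ecdf(x), main = "example", col = i)
    first <- FALSE
  } else {
    plot(ecdf(x), add = TRUE, col = i)
  }
}

invisible(dev.off())                 # close the device so the file is written
```

Nothing is drawn on a screen device here, so there is no flashing. The trade-off is that the flag only knows about plots made inside this loop.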

Thank you again and best regards

Petr

> -----Original Message-----
> From: Duncan Murdoch
> Sent: Thursday, September 12, 2019 2:29 PM
> To: PIKAL Petr; r-help mailing list
> Subject: Re: [R] test if something was plotted on pdf device
>
> On 12/09/2019 7:10 a.m., PIKAL Petr wrote:
> > Dear all
> >
> > Is there any simple way checking whether after calling pdf device
> something was plotted into it?
> >
> > In interactive session I used
> >
> > if (dev.cur()==1) plot(ecdf(rnorm(100))) else plot(ecdf(rnorm(100)),
> > add=T, col=i) which enabled me to test if plot is open
> >
> > But when I want to call eg. pdf("test.pdf") before cycle
> > dev.cur()==1 is FALSE even when no plot is drawn and plot.new error
> comes.
> >
> >> pdf("test.pdf")
> >
> > if (dev.cur()==1) plot(ecdf(rnorm(100))) else plot(ecdf(rnorm(100)),
> > add=T, col=i)
> >
> > Error in segments(ti.l, y, ti.r, y, col = col.hor, lty = lty, lwd = lwd,  :
> >plot.new has not been called yet
> >
>
> I don't know if this is reliable or not, but you could use code like this:
>
>f <- tempfile()
>pdf(f)
>blankPlot <- recordPlot()
>dev.off()
>unlink(f)
>
>pdf("test.pdf")
>
>...  unknown operations ...
>
>if (dev.cur() == 1 || identical(recordPlot(), blankPlot))
>  plot(ecdf(rnorm(100)))
>else
>  plot(ecdf(rnorm(100)), add=TRUE, col=i)
>
>
>
> Duncan Murdoch


Re: [R] test if something was plotted on pdf device

2019-09-12 Thread Duncan Murdoch

On 12/09/2019 7:10 a.m., PIKAL Petr wrote:
> Dear all
>
> Is there any simple way checking whether after calling pdf device
> something was plotted into it?
>
> In interactive session I used
>
> if (dev.cur()==1) plot(ecdf(rnorm(100))) else plot(ecdf(rnorm(100)),
> add=T, col=i)
> which enabled me to test if plot is open
>
> But when I want to call eg. pdf("test.pdf") before cycle
> dev.cur()==1 is FALSE even when no plot is drawn and plot.new error
> comes.
>
> pdf("test.pdf")
>
> if (dev.cur()==1) plot(ecdf(rnorm(100))) else plot(ecdf(rnorm(100)),
> add=T, col=i)
>
> Error in segments(ti.l, y, ti.r, y, col = col.hor, lty = lty, lwd = lwd,  :
>    plot.new has not been called yet

I don't know if this is reliable or not, but you could use code like this:

  f <- tempfile()
  pdf(f)
  blankPlot <- recordPlot()
  dev.off()
  unlink(f)

  pdf("test.pdf")

  ...  unknown operations ...

  if (dev.cur() == 1 || identical(recordPlot(), blankPlot))
    plot(ecdf(rnorm(100)))
  else
    plot(ecdf(rnorm(100)), add=TRUE, col=i)



Duncan Murdoch
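
Editor's sketch (not proposed by anyone in the thread): one way to sidestep the question of whether the pdf device already holds a plot is to avoid asking the device at all and keep a logical flag in the loop instead. The function name plot_ecdfs, the plot title and the simulated samples below are hypothetical, chosen only for illustration.

```r
# Hypothetical helper: draw several ECDFs into one PDF file.
# A flag records whether the first plot has been drawn, so the
# state of the graphics device never has to be inspected.
plot_ecdfs <- function(samples, file) {
  pdf(file)
  on.exit(dev.off())          # close the device even if plotting fails
  first <- TRUE
  for (i in seq_along(samples)) {
    if (first) {
      plot(ecdf(samples[[i]]), col = i, main = "ECDFs")  # starts the plot
      first <- FALSE
    } else {
      plot(ecdf(samples[[i]]), add = TRUE, col = i)      # overlays on it
    }
  }
  invisible(file)
}

set.seed(1)
out <- plot_ecdfs(list(rnorm(100), rnorm(100, mean = 1)),
                  tempfile(fileext = ".pdf"))
```

Because the file is written directly, nothing flashes on a screen device; the trade-off is that the loop has to own the flag.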



[R] test if something was plotted on pdf device

2019-09-12 Thread PIKAL Petr
Dear all

Is there a simple way to check whether anything has been plotted to a pdf 
device after it was opened?

In an interactive session I used

if (dev.cur()==1) plot(ecdf(rnorm(100))) else plot(ecdf(rnorm(100)), add=T, 
col=i)

which let me test whether a plot is already open.

But when I call e.g. pdf("test.pdf") before the loop, dev.cur()==1 is FALSE 
even though nothing has been drawn yet, and I get a plot.new error.

> pdf("test.pdf")

if (dev.cur()==1) plot(ecdf(rnorm(100))) else plot(ecdf(rnorm(100)), add=T, 
col=i)

Error in segments(ti.l, y, ti.r, y, col = col.hor, lty = lty, lwd = lwd,  :
  plot.new has not been called yet

Best regards
Petr

Re: [R] test of independence

2018-12-20 Thread Greg Snow
The basic test of independence for a table based on the Chi-squared
distribution can be done using the `chisq.test` function.  This is in
the stats package which is installed and loaded by default, so you
don't need to do anything additional.  There is also the `fisher.test`
function for Fisher's exact test (similar hypotheses, different
methodology and assumptions, may be really slow on your table).

If you need more than the basics provided in those functions, then a
search of CRAN may be helpful, or give us more detail to be able to
help.
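
Editor's sketch of the basic call described above, using a made-up 16x16 table because the original poster supplied no data (the object names counts, res and res_sim are assumptions). It also shows the simulate.p.value argument of chisq.test, which may be a reasonable fallback when expected counts are small and Fisher's exact test is too slow.

```r
set.seed(42)

# Hypothetical 16x16 table of counts (Poisson noise, so rows and
# columns are independent by construction).
counts <- matrix(rpois(16 * 16, lambda = 20), nrow = 16)

# Pearson chi-squared test of independence; stats is loaded by default.
res <- chisq.test(counts)
res$parameter   # degrees of freedom: (16 - 1) * (16 - 1)
res$p.value

# Monte Carlo p-value instead of the chi-squared approximation.
res_sim <- chisq.test(counts, simulate.p.value = TRUE, B = 2000)
res_sim$p.value
```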



-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



Re: [R] test of independence

2018-12-20 Thread PIKAL Petr
Hi

Did you search CRAN? I got **many** results for

test of independence

which may or may not provide you with suitable procedures.

Cheers
Petr




[R] test of independence

2018-12-19 Thread km
Dear All,

How do I do a test of independence with a 16x16 table of counts?
Please suggest.

Regards,
KM



[R] TEST message

2018-04-24 Thread Ted Harding
Apologies for disturbance! Just checking that I can
get through to r-help.
Ted.



Re: [R] Test if data uniformly distributed (newbie)

2018-04-10 Thread Huber, Florian
Dear Mr. Savicky,

I am currently working on a project where I want to test a random number 
generator that is supposed to produce 10,000 random numbers from a continuous 
uniform distribution between 0 and 1. I am wondering whether I can use the 
chi-squared test for this, or whether the Kolmogorov-Smirnov test would be a 
better fit.

I came across one of your threads on the internet where you answer a similar 
question and thought I'd reach out to you.


Thanks in advance
Florian Huber
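
Editor's sketch of how both tests mentioned in the question could be run in base R. The vector x is a stand-in produced by runif(); in practice it would hold the 10,000 numbers from the generator under test. The choice of 20 equal-width bins is an assumption, not a recommendation from the thread.

```r
set.seed(123)
x <- runif(10000)   # stand-in for the generator's output

# Kolmogorov-Smirnov test against the continuous uniform on (0, 1);
# it uses the raw values and needs no binning.
ks <- ks.test(x, "punif")

# Chi-squared goodness-of-fit test: bin the values first.
# With a one-dimensional table, chisq.test() assumes equal
# expected probabilities for all cells by default.
bins <- cut(x, breaks = seq(0, 1, by = 0.05), include.lowest = TRUE)
chi  <- chisq.test(table(bins))

ks$p.value
chi$p.value
```

The Kolmogorov-Smirnov test is the more natural choice for continuous data because it avoids an arbitrary bin width; the chi-squared result depends on how the bins are chosen.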






[R] Test set and Train set in Caret package train function

2017-10-22 Thread Elahe chalabi via R-help
Hey all,

Does anyone know how to get the training set and test set for each fold of 
5-fold cross-validation in the caret package? Suppose I want to cross-validate 
a random forest; I do the following in caret:

set.seed(12)
train_control <- trainControl(method="cv", number=5,savePredictions = TRUE)
rfmodel <- train(Species~., data=iris, trControl=train_control, method="rf")
first_holdout <- subset(rfmodel$pred, Resample == "Fold1")
str(first_holdout)
'data.frame':   90 obs. of  5 variables:
$ pred: Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 
$ obs : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 
$ rowIndex: int  2 3 9 11 25 29 35 36 41 50 ...
$ mtry: num  2 2 2 2 2 2 2 2 2 2 ...
$ Resample: chr  "Fold1" "Fold1" "Fold1" "Fold1" ...

Are these 90 observations in Fold1 used as training set? If yes then where is 
the test set for this fold?

thanks for any help! 

Elahe
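
Editor's note and sketch. As far as I understand caret, the rows saved in rfmodel$pred are the held-out (test) predictions, one row per held-out observation and per candidate mtry value, which would explain 90 rows for Fold1: 30 held-out rows times 3 values of mtry. The training rows of a fold are then simply all the other rows; I believe the fitted object also stores the indices in rfmodel$control$index and rfmodel$control$indexOut, but treat that as an assumption to check against the caret documentation. The base-R sketch below reproduces the idea without caret; it uses plain random folds, whereas caret stratifies on the outcome, so the exact rows will differ.

```r
set.seed(12)
n <- nrow(iris)
k <- 5

# Assign each row to one of k folds at random (30 rows per fold here).
fold_id <- sample(rep(seq_len(k), length.out = n))

# For each fold: held-out (test) rows, and the remaining (training) rows.
test_idx  <- split(seq_len(n), fold_id)
train_idx <- lapply(test_idx, function(idx) setdiff(seq_len(n), idx))
names(test_idx) <- names(train_idx) <- paste0("Fold", seq_len(k))

first_test  <- iris[test_idx$Fold1, ]    # rows predicted in Fold1
first_train <- iris[train_idx$Fold1, ]   # rows the Fold1 model is fit on
```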



Re: [R] test for proportion or concordance

2017-08-03 Thread Bert Gunter
This list is about R programming, not statistics, although admittedly
there is a nonempty intersection. However, I think you would do better
posting this on a statistics list like stats.stackexchange.com.

-- Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )




[R] test for proportion or concordance

2017-08-03 Thread Adrian Johnson
Hello group,

my question is deciding what test would be appropriate for following question.

An experiment 'A' yielded 3200 observations of which 431 are
significant. Similarly, using same method, another experiment 'B' on a
different population yielded 2541 observations of which 260 are
significant.

There are 180 observations that are common between significant
observations of A and B.
(180 are common between 431 and 260).

251 significant observations are specific to A (431 - 180).
80 significant observations are specific to B (260 - 180).

The question: are the 180 observations that are common to A and B
occurring by chance?

What test would be appropriate for this scenario? (If the total number of
observations were the same in experiments A and B, I could use
Cohen's kappa for concordance, or chi-square, etc.)
Since the total observations differ between experiments A and B, I
don't know what test would be appropriate. I appreciate your help.

thanks



Re: [R] Test individual slope for each factor level in ANCOVA

2017-03-16 Thread li li
Hi John. Thanks much for your help. It is great to know this.
  Hanna


Re: [R] Test individual slope for each factor level in ANCOVA

2017-03-16 Thread Fox, John
Dear Hanna,

You can test the slope in each non-reference group as a linear hypothesis.
You didn’t make the data available for your example, so here’s an example
using the linearHypothesis() function in the car package with the Moore
data set in the same package:

- - - snip - - -

> library(car)
> mod <- lm(conformity ~ fscore*partner.status, data=Moore)
> summary(mod)

Call:
lm(formula = conformity ~ fscore * partner.status, data = Moore)

Residuals:
    Min      1Q  Median      3Q     Max
-7.5296 -2.5984 -0.4473  2.0994 12.4704

Coefficients:
                          Estimate Std. Error t value Pr(>|t|)
(Intercept)               20.79348    3.26273   6.373 1.27e-07 ***
fscore                    -0.15110    0.07171  -2.107  0.04127 *
partner.statuslow        -15.53408    4.40045  -3.530  0.00104 **
fscore:partner.statuslow   0.26110    0.09700   2.692  0.01024 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 4.562 on 41 degrees of freedom
Multiple R-squared:  0.2942,    Adjusted R-squared:  0.2426
F-statistic: 5.698 on 3 and 41 DF,  p-value: 0.002347

> linearHypothesis(mod, "fscore + fscore:partner.statuslow")
Linear hypothesis test

Hypothesis:
fscore  + fscore:partner.statuslow = 0

Model 1: restricted model
Model 2: conformity ~ fscore * partner.status

  Res.Df    RSS Df Sum of Sq      F  Pr(>F)
1     42 912.45
2     41 853.42  1    59.037 2.8363 0.09976 .
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

- - - snip - - -

In this case, there are just two levels for partner.status, but for a
multi-level factor you can simply perform more than one test.
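
Editor's sketch of an alternative that is not from the thread: instead of one linear hypothesis per level, the same model can be reparametrised so that every level gets its own slope coefficient, and the usual t-tests in the summary then test each slope against zero directly. The data below are simulated (9 weeks by 5 regions, giving 45 rows); the names dat, mod2 and slopes are hypothetical.

```r
set.seed(1)

# Simulated stand-in for the unavailable data: 5 regions, weeks 0..8.
dat <- data.frame(
  weeks  = rep(0:8, times = 5),
  region = factor(rep(c("a", "b", "c", "d", "e"), each = 9))
)
dat$response <- 1.2 - 0.02 * dat$weeks + rnorm(nrow(dat), sd = 0.1)

# Usual parametrisation: slope for "a" plus slope differences.
mod <- lm(response ~ weeks * region, data = dat)

# Same fit, but with one slope per region (no "weeks" main effect),
# so each "region?:weeks" row tests that region's slope against 0.
mod2 <- lm(response ~ region + region:weeks, data = dat)

tab    <- coef(summary(mod2))
slopes <- tab[grep(":weeks$", rownames(tab)), ]
slopes
```

For region "b", the estimate in slopes equals the sum of the weeks and weeks:regionb coefficients of mod, which is the same quantity the linearHypothesis() call tests.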


I hope this helps,

 John

-
John Fox, Professor
McMaster University
Hamilton, Ontario, Canada
Web: http://socserv.mcmaster.ca/jfox/





[R] Test individual slope for each factor level in ANCOVA

2017-03-15 Thread li li
Hi all,
   Consider the data set where there are a continuous response variable, a
continuous predictor "weeks" and a categorical variable "region" with five
levels "a", "b", "c",
"d", "e".
  I fit the ANCOVA model as follows. Here the reference level is region "a"
and there are 4 dummy variables. The interaction terms (in red below)
represent the slope
difference between each region and  the baseline region "a" and the
corresponding p-value is for testing whether this slope difference is zero.
Is there a way to directly test whether the slope corresponding to each
individual factor level is 0 or not, instead of testing the slope
difference from the baseline level?
  Thanks very much.
  Hanna






> mod <- lm(response ~ weeks*region,data)
> summary(mod)

Call:
lm(formula = response ~ weeks * region, data = data)

Residuals:
     Min       1Q   Median       3Q      Max
-0.19228 -0.07433 -0.01283  0.04439  0.24544

Coefficients:
                Estimate Std. Error t value Pr(>|t|)
(Intercept)    1.2105556  0.0954567  12.682  1.2e-14 ***
weeks         -0.021      0.0147293  -1.448    0.156
regionb       -0.0257778  0.1349962  -0.191    0.850
regionc       -0.034      0.1349962  -0.255    0.800
regiond       -0.075      0.1349962  -0.559    0.580
regione       -0.148      0.1349962  -1.098    0.280
weeks:regionb -0.0007222  0.0208304  -0.035    0.973
weeks:regionc -0.0017778  0.0208304  -0.085    0.932
weeks:regiond  0.003      0.0208304   0.144    0.886
weeks:regione  0.0301667  0.0208304   1.448    0.156
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.1082 on 35 degrees of freedom
Multiple R-squared:  0.2678,    Adjusted R-squared:  0.07946
F-statistic: 1.422 on 9 and 35 DF,  p-value: 0.2165


[R] Test

2017-01-16 Thread Robert Piliero
-- 

Robert J. Piliero

Cell: (617) 283 1020
38 Linnaean St. #6
Cambridge, MA, 02138
USA



Re: [R] Test for Homoscedesticity in R Without BP Test

2016-04-04 Thread Deepak Singh
I have tried it and got the result.
Thank you, everyone.



Re: [R] Test for Homoscedesticity in R Without BP Test

2016-04-04 Thread Achim Zeileis

On Mon, 4 Apr 2016, varin sacha via R-help wrote:

> Hi Deepak,
>
> In econometrics there is another test very often used : the white test.
> The white test is based on the comparison of the estimated variances of
> residuals when the model is estimated by OLS under the assumption of
> homoscedasticity and when the model is estimated by OLS under the
> assumption of heteroscedasticity.

The White test is a special case of the Breusch-Pagan test using a 
particular specification of the auxiliary regressors: namely all 
regressors, their squares and their cross-products. As this specification 
only makes sense if all regressors are continuous, many implementations 
have problems if there are already dummy variables, interactions, etc. in 
the regressor matrix. This is also the reason why bptest() from "lmtest" 
uses a different specification by default. However, you can utilize the 
function to carry out the White test as illustrated in:

example("CigarettesB", package = "AER")

(Of course, the AER package needs to be installed first.)

> The White test with R
>
> install.packages("bstats")
> library(bstats)
> white.test(LinearModel)

That package is no longer on CRAN as it took the code from bptest() 
without crediting its original authors and released it in a package that 
conflicted with the original license. Also, the implementation did not 
check for potential problems with dummy variables or interactions 
mentioned above.

So the bptest() implementation from "lmtest" is really recommended. Or 
alternatively ncvTest() from package "car".



> Hope this helps.
>
> Sacha







Re: [R] Test for Homoscedesticity in R Without BP Test

2016-04-04 Thread Achim Zeileis

On Mon, 4 Apr 2016, Deepak Singh wrote:


> Respected Sir,
> I am doing a project on multiple linear model fitting, and in that project
> I have to test homoscedasticity of the errors. I googled for this and
> found bptest, but in R version 3.2.4 bptest is not available.


The function is called bptest() and is implemented in package "lmtest" 
which is available for current versions of R, see

https://CRAN.R-project.org/package=lmtest

To install it, run:
install.packages("lmtest")

And then to load the package and try the function:
library("lmtest")
example("bptest")


> So please suggest a test for homoscedasticity ASAP, as we have to submit
> our report on 7-04-2016.
>
> P.S.: I have plotted the residuals against the fitted values and the plot
> looks more or less random.
>
> Thank You!



Re: [R] Test for Homoscedesticity in R Without BP Test

2016-04-04 Thread varin sacha via R-help
Hi Deepak,

In econometrics there is another test that is very often used: the White test.
The White test compares the estimated variances of the residuals when the
model is estimated by OLS under the assumption of homoscedasticity and when
it is estimated by OLS under the assumption of heteroscedasticity.


The White test with R

install.packages("bstats")
library(bstats)
white.test(LinearModel)



Hope this helps.

Sacha







Re: [R] Test for Homoscedesticity in R Without BP Test

2016-04-04 Thread John C Frain
You might google "Breusch Pagan test r" and find that the test is
implemented in the lmtest package.


[R] Test for Homoscedesticity in R Without BP Test

2016-04-04 Thread Deepak Singh
Respected Sir,
I am doing a project on multiple linear model fitting, and in that project I
have to test homoscedasticity of the errors. I googled for this and
found bptest, but in R version 3.2.4 bptest is not available.
So please suggest a test for homoscedasticity ASAP, as we have to submit
our report on 7-04-2016.

P.S.: I have plotted the residuals against the fitted values and the plot
looks more or less random.

Thank You !



Re: [R] test hypothesis in R

2016-03-23 Thread David Winsemius

> On Mar 23, 2016, at 1:44 PM, ruipbarra...@sapo.pt wrote:
> 
> Hello,
> 
> Try
> 
> ?t.test
> t.test(mA, mB, alternative = "greater")
> 
> Hope this helps,
> 
> Rui Barradas
>  
> 
> Citando Eliza Botto :
> 
>> Dear All,
>> I want to test a hypothesis in R by using student' t-test (P-values).
>> The hypothesis is that model A produces lesser error than model B at  
>> ten stations. Obviously, Null Hypothesis (H0) is that the error  
>> produces by model A is not lower than model B.

NOT "obviously". You only get to do one-sided tests when the scientific 
question would not allow the possibility of a departure to "the other side".

Two-sided tests are the norm in scientific literature, often to the 
experimenter's distress when they haven't done a thoughtful (non-optimistic) 
power analysis and their results are inconclusive as a result. Your hypothesis 
_should_ have been constructed _before_ you saw the data. That is if you want 
to be an ethical scientist.


>> The error magnitudes are
>> 
>> #model A
>>> dput(mA)
>> 
>> c(36.1956086452583, 34.9996207622861, 36.435733025221,  
>> 37.2003157636202, 36.1318687775115, 37.164132533536,  
>> 35.2028759357069, 36.7719835944373, 38.3861425339751,  
>> 37.4174132119744)
>> #model B
>>> dput(mB)
>> 
>> c(39.7655211768704, 40.1730916643841, 39.3699055738618,  
>> 39.401619831763, 41.1218634441457, 39.1968630742826,  
>> 40.5265825061639, 40.4674956975404, 40.5954427072364,  
>> 41.4875529130543)

Those are not models. They are just vectors of numbers. And they seem unlikely 
to be residual errors of a linear model since they are not centered on zero. I 
doubt there is enough in your presentation for a sensible comment on the proper 
analysis.

-- 

David.

>> 
>> Now can I test my hypothesis in R?
>> Thankyou very much in Advance,
>> Eliza
>> [[alternative HTML version deleted]]
>> 
>> __



David Winsemius
Alameda, CA, USA



Re: [R] test hypothesis in R

2016-03-23 Thread ruipbarradas
Sorry, but in your original post you said that the "Null Hypothesis (H0)  
is that the error produces by model A is not lower than model B".
If the alternative is now that model A produces less error, change to  
alternative="less". The relevant part of the help page ?t.test is

alternative = "greater" is the alternative that x has a larger mean than y.
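
To make the direction of the alternative concrete, here is a small sketch
using the two vectors from the original post. The object names res_less,
res_greater, res_two and res_paired are made up for the example. The paired
variant is only an assumption: it would apply if element i of both vectors
belongs to the same station, which the post does not state explicitly.

```r
# Error magnitudes at the ten stations, copied from the original post.
mA <- c(36.1956086452583, 34.9996207622861, 36.435733025221,
        37.2003157636202, 36.1318687775115, 37.164132533536,
        35.2028759357069, 36.7719835944373, 38.3861425339751,
        37.4174132119744)
mB <- c(39.7655211768704, 40.1730916643841, 39.3699055738618,
        39.401619831763, 41.1218634441457, 39.1968630742826,
        40.5265825061639, 40.4674956975404, 40.5954427072364,
        41.4875529130543)

# Alternative: the mean of x (mA) is smaller than the mean of y (mB).
res_less <- t.test(mA, mB, alternative = "less")

# The opposite one-sided alternative, for comparison.
res_greater <- t.test(mA, mB, alternative = "greater")

# The default two-sided test.
res_two <- t.test(mA, mB)

# Paired version, ASSUMING the i-th entries refer to the same station.
res_paired <- t.test(mA, mB, paired = TRUE, alternative = "less")

c(less = res_less$p.value,
  greater = res_greater$p.value,
  two.sided = res_two$p.value,
  paired.less = res_paired$p.value)
```

The two one-sided p-values add up to one, so picking the wrong direction
turns a very small p-value into one close to 1.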

Rui Barradas
 

Quoting Eliza Botto <eliza_bo...@outlook.com>:

> Thnx Rui,  
> Just one point though
>  
> Should it be alternative="greater" or "less"? Since alternative  
> hypothesis is that model A produced less error.
>  
> regards,
>  
> Eliza
>  
> -
> Date: Wed, 23 Mar 2016 20:44:20 +
> From: ruipbarra...@sapo.pt
> To: eliza_bo...@outlook.com
> CC: r-help@r-project.org
> Subject: Re: [R] test hypothesis in R

> Dear All,
> I want to test a hypothesis in R by using student' t-test (P-values).
> The hypothesis is that model A produces lesser error than model B at  
> ten stations. Obviously, Null Hypothesis (H0) is that the error  
> produces by model A is not lower than model B.
> The error magnitudes are
>
> #model A
>> dput(mA)
>
> c(36.1956086452583, 34.9996207622861, 36.435733025221,  
> 37.2003157636202, 36.1318687775115, 37.164132533536,  
> 35.2028759357069, 36.7719835944373, 38.3861425339751,  
> 37.4174132119744)
> #model B
>> dput(mB)
>
> c(39.7655211768704, 40.1730916643841, 39.3699055738618,  
> 39.401619831763, 41.1218634441457, 39.1968630742826,  
> 40.5265825061639, 40.4674956975404, 40.5954427072364,  
> 41.4875529130543)
>
> Now can I test my hypothesis in R?
> Thankyou very much in Advance,
> Eliza
>         [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide  
> http://www.R-project.org/posting-guide.htmland provide commented,  
> minimal, self-contained, reproducible code.

 

 


Re: [R] test hypothesis in R

2016-03-23 Thread Eliza Botto
Thanks Rui,
Just one point though:
should it be alternative="greater" or "less"? The alternative hypothesis is 
that model A produces less error.
Regards,
Eliza


Re: [R] test hypothesis in R

2016-03-23 Thread ruipbarradas
Hello,

Try

?t.test
t.test(mA, mB, alternative = "greater")

Hope this helps,

Rui Barradas
 

Quoting Eliza Botto:

> Dear All,
> I want to test a hypothesis in R by using student' t-test (P-values).
> The hypothesis is that model A produces lesser error than model B at  
> ten stations. Obviously, Null Hypothesis (H0) is that the error  
> produces by model A is not lower than model B.
> The error magnitudes are
>
> #model A
>> dput(mA)
>
> c(36.1956086452583, 34.9996207622861, 36.435733025221,  
> 37.2003157636202, 36.1318687775115, 37.164132533536,  
> 35.2028759357069, 36.7719835944373, 38.3861425339751,  
> 37.4174132119744)
> #model B
>> dput(mB)
>
> c(39.7655211768704, 40.1730916643841, 39.3699055738618,  
> 39.401619831763, 41.1218634441457, 39.1968630742826,  
> 40.5265825061639, 40.4674956975404, 40.5954427072364,  
> 41.4875529130543)
>
> Now can I test my hypothesis in R?
> Thankyou very much in Advance,
> Eliza
>         [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide  
> http://www.R-project.org/posting-guide.htmland provide commented,  
> minimal, self-contained, reproducible code.

 


[R] test hypothesis in R

2016-03-23 Thread Eliza Botto
Dear All,
I want to test a hypothesis in R using Student's t-test (p-values).
The hypothesis is that model A produces less error than model B at ten 
stations. Obviously, the null hypothesis (H0) is that the error produced by 
model A is not lower than that of model B.
The error magnitudes are 

#model A
> dput(mA)
c(36.1956086452583, 34.9996207622861, 36.435733025221, 37.2003157636202, 
36.1318687775115, 37.164132533536, 35.2028759357069, 36.7719835944373, 
38.3861425339751, 37.4174132119744)
#model B
> dput(mB)
c(39.7655211768704, 40.1730916643841, 39.3699055738618, 39.401619831763, 
41.1218634441457, 39.1968630742826, 40.5265825061639, 40.4674956975404, 
40.5954427072364, 41.4875529130543)

Now can I test my hypothesis in R?
Thankyou very much in Advance,
Eliza 


Re: [R] test if a url exists

2014-06-29 Thread Duncan Murdoch
On 29/06/2014, 7:12 AM, Hui Du wrote:
> Hi all,
>
> I need to test if a url exists. I used url.exists() in the RCurl package
>
> library(RCurl)
>
> however the test result is kind of weird. For example,
>
> url.exists("http://www.amazon.com")
> [1] FALSE
>
> although www.amazon.com is a valid url. Does anybody know how to use that
> function correctly, or another way to test url existence?

You can use the .header = TRUE option to that call to see the error 405
that it gives.

Duncan Murdoch



[R] test if a url exists

2014-06-28 Thread Hui Du
Hi all,

I need to test if a url exists. I used url.exists() in RCurl package

library(RCurl)

however the test result is kind of weird. For example,

url.exists("http://www.amazon.com")
[1] FALSE

although www.amazon.com is a valid url. Does anybody know how to use that
function correctly, or another way to test url existence?

Thanks.

HXD



Re: [R] test the return from grep or agrep

2014-03-02 Thread Prof Brian Ripley

On 01/03/2014 23:32, Hui Du wrote:

> Hi All,
>
> My sample code looks like
>
> options(stringsAsFactors = FALSE);
> clean = function(x)
> {
>  loc = agrep("ABC", x$name);
>  x[loc,]$new_name <- "NEW";
>  x;
> }
>
> name = c("12", "dad", "dfd");
> y = data.frame(name = as.character(name), idx = 1:3);
> y$new_name = y$name;
>
> z <- clean(y)
>
> The snippet does not work because I forgot to test the return value of
> agrep. If no pattern is found, it returns 0 and the following
> x[loc, ]$new_name assignment fails. I know how to fix that part. However,
> my code has many places like that, say over 100 calls to agrep or grep for
> different patterns and substitutions. Is there any smart way to fix them
> all rather than line by line?


That is not true: it returns integer(0).  (If it returned 0 it would work.)

For grep() I would recommend using grepl() instead. Otherwise

if(length(loc)) x[loc,]$new_name <- "NEW"

or

x[loc,]$new_name <- rep_len("NEW", length(loc))


Your code is full of pointless empty statements (between ; and NL): R is 
not C and ; is a separator, not a terminator.
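
A minimal sketch of the two fixes, using the data from the question. The
function names clean_guard and clean_logical and the second data frame y2
are made up for the example. The logical-index version uses agrepl(), the
approximate-matching counterpart of grepl(), which is a substitution made
here because the original code calls agrep().

```r
# Variant 1: keep agrep() and guard the assignment against integer(0).
clean_guard <- function(x) {
  loc <- agrep("ABC", x$name)
  if (length(loc)) x[loc, ]$new_name <- "NEW"
  x
}

# Variant 2: use a logical index; an all-FALSE index assigns nothing.
clean_logical <- function(x) {
  hit <- agrepl("ABC", x$name)
  x$new_name[hit] <- "NEW"
  x
}

# Data from the question: no element is close to "ABC".
y <- data.frame(name = c("12", "dad", "dfd"), idx = 1:3,
                stringsAsFactors = FALSE)
y$new_name <- y$name

# A second, hypothetical data frame in which the first row matches.
y2 <- data.frame(name = c("ABC", "xyz"), idx = 1:2,
                 stringsAsFactors = FALSE)
y2$new_name <- y2$name

z_guard <- clean_guard(y)      # runs without error, nothing replaced
z_logical <- clean_logical(y)  # same result
z2_guard <- clean_guard(y2)    # first row replaced
z2_logical <- clean_logical(y2)
```

With a logical index no length test is needed at all, which is convenient
when the same pattern has to be repeated in many places.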




> Many thanks.
>
> HXD




--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK  Fax:  +44 1865 272595



[R] test the return from grep or agrep

2014-03-01 Thread Hui Du
Hi All,

My sample code looks like

options(stringsAsFactors = FALSE);
clean = function(x)
{
loc = agrep("ABC", x$name);
x[loc,]$new_name <- "NEW";
x;
}

name = c("12", "dad", "dfd");
y = data.frame(name = as.character(name), idx = 1:3);
y$new_name = y$name;

z <- clean(y)

The snippet does not work because I forgot to test the return value of agrep. 
If no pattern is found, it returns 0 and the following x[loc, ]$new_name 
assignment fails. I know how to fix that part. However, my code has many 
places like that, say over 100 calls to agrep or grep for different patterns 
and substitutions. Is there any smart way to fix them all rather than line by 
line?

Many thanks.

HXD



[R] Test to determine if there is a difference between two means

2013-12-24 Thread wesley bell
Hi,
I have a data set where there are 20 experiments which each ran for 10 minutes. 
In each experiment an insect had a choice to spend time in one of two chambers. 
Each experiment therefore has number of seconds spent in each chamber. I want 
to know whether there is a difference in the mean time spent in each chamber.

I was going to do a t-test but was advised that there was a better way, 
something about introducing random numbers? I was hoping someone could help?

Thanks
Wes


Re: [R] Test to determine if there is a difference between two means

2013-12-24 Thread Bert Gunter
Inline below.

 Cheers,

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom.
H. Gilbert Welch




On Tue, Dec 24, 2013 at 7:38 AM, wesley bell <wesleybel...@yahoo.com> wrote:
> Hi,
> I have a data set where there are 20 experiments which each ran for 10
> minutes. In each experiment an insect had a choice to spend time in one of
> two chambers. Each experiment therefore has number of seconds spent in each
> chamber. I want to know whether there is a difference in the mean time spent
> in each chamber.

Yes, there is. Always.


> I was going to do a t-test but was advised that there was a better way,
> something about introducing random numbers? I was hoping someone could help?

This list is about R, not statistics, although they certainly overlap.
 I suggest you post on stats.stackexchange.com instead for statistics
help. Better yet, you might do well to talk with a local expert about
statistical issues, as you are obviously weak here.
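
For what it is worth on the R side only: "introducing random numbers" may
refer to a permutation (randomization) test, although the post does not say
so. The sketch below is a paired sign-flip version under that assumption.
All data are simulated and the names secs_a, secs_b, d and n_perm are
hypothetical; it also assumes the insect is always in one of the two
chambers, so that the two times add up to 600 seconds.

```r
set.seed(1)
n_exp <- 20        # number of experiments
total_sec <- 600   # each experiment ran for 10 minutes

# Simulated seconds spent in chamber A; chamber B gets the remainder.
secs_a <- round(runif(n_exp, 200, 500))
secs_b <- total_sec - secs_a

# Paired differences, one per experiment, and the observed mean difference.
d <- secs_a - secs_b
obs <- mean(d)

# Under the null of no preference the sign of each difference is arbitrary,
# so flip the signs at random many times and recompute the mean each time.
n_perm <- 9999
signs <- matrix(sample(c(-1, 1), n_perm * n_exp, replace = TRUE),
                nrow = n_perm)
perm_means <- as.vector(signs %*% d) / n_exp

# Two-sided Monte Carlo p-value (the observed value counts as one draw).
p_value <- (1 + sum(abs(perm_means) >= abs(obs))) / (n_perm + 1)
p_value
```

Whether this design is appropriate for the actual experiment is a
statistical question that the code cannot answer.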




[R] Test ADF differences in R and Eviews

2013-12-05 Thread nooldor
Hi,


In the attachment you can find the source data on which I ran adf.test() and
a screenshot of the results in R and EViews.

The results are very different. Did I miss something?

Best,
T.S.


Re: [R] Test ADF differences in R and Eviews

2013-12-05 Thread David Winsemius

On Dec 5, 2013, at 3:18 PM, nooldor wrote:

> Hi,
>
> In the attachment you can find the source data on which I ran adf.test()
> and a screenshot of the results in R and EViews.
>
> The results are very different. Did I miss something?

Yes. You missed the list of acceptable file types for r-help.

-- 
David Winsemius
Alameda, CA, USA



[R] Test for exogeneity

2013-11-11 Thread jpm miao
Hi,



   I am building a bivariate SVAR model

y_{1,t} = c_1 + \Gamma_1(1,1) y_{1,t-1} + \Gamma_1(1,2) y_{2,t-1}
          + \Gamma_2(1,1) y_{1,t-2} + \Gamma_2(1,2) y_{2,t-2} + \epsilon_{1,t}

b y_{1,t} + y_{2,t} = c_2 + \Gamma_1(2,1) y_{1,t-1} + \Gamma_1(2,2) y_{2,t-1}
          + \Gamma_2(2,1) y_{1,t-2} + \Gamma_2(2,2) y_{2,t-2} + \epsilon_{2,t}



  Now y1 is relatively exogenous in that y1 impacts y2 contemporaneously
but not the other way around. Given a bivariate dataset, is there any
statistical test (in any R package or elsewhere) that helps to justify/test
the exogeneity of y1 in the present context? Is there any reference
available?



Thanks,



Miao



Re: [R] test wilcoxon sur R help!

2013-10-24 Thread arun
Hi,
Try:
fun1 <- function(dat){
mat1 <- combn(colnames(dat1),2)
 res <- sapply(seq_len(ncol(mat1)),function(i) {x1 <- dat[,mat1[,i]]; 
wilcox.test(x1[,1],x1[,2])$p.value})
names(res) <- apply(mat1,2,paste,collapse="_")
res
}

set.seed(432)
dat1 <- as.data.frame(matrix(sample(18*10,18*10,replace=FALSE),ncol=18))

  fun1(dat1) #gives the p-value for each pair of columns




> Hi,
>
> I want to run a Wilcoxon test. I have 18 columns, each column
> corresponds to a different sample, and I want to compare each one with
> every other using a Wilcoxon test. Is this possible in one step, or do I
> have to compare them two by two?
>
> Is there code to automate this test, so that I don't have to type
> the code for each pair?
>
> Thanks!
> denisse



Re: [R] test wilcoxon sur R help!

2013-10-24 Thread Rui Barradas

Hello,

There's a bug in your function, it should be 'dat', not 'dat1'. In the 
line marked, below.


fun1 <- function(dat){
	mat1 <- combn(colnames(dat),2)  # Here, 'dat' not 'dat1'
	res <- sapply(seq_len(ncol(mat1)),function(i) {x1 <- dat[,mat1[,i]]; 
wilcox.test(x1[,1],x1[,2])$p.value})

	names(res) <- apply(mat1,2,paste,collapse="_")
	res
}


Hope this helps,

Rui Barradas



Re: [R] test wilcoxon sur R help!

2013-10-24 Thread vikram ranga
Hi,
Check out this function:-
pairwise.wilcox.test {package=stats}.

example(pairwise.wilcox.test)
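
A minimal sketch of how this could look for the 18-column case, using only
base R. The simulated data frame is the same one as in the earlier reply;
the names long and pw are made up for the example, and stack() is used here
simply as one way to get the values and a grouping factor.

```r
# Same simulated data as earlier in the thread: 10 rows, 18 columns.
set.seed(432)
dat1 <- as.data.frame(matrix(sample(18 * 10, 18 * 10, replace = FALSE),
                             ncol = 18))

# Long format: 'values' holds the observations, 'ind' the column name.
long <- stack(dat1)

# All pairwise Wilcoxon rank sum tests, with Holm adjustment (the default).
pw <- pairwise.wilcox.test(long$values, long$ind, p.adjust.method = "holm")

# Lower-triangular matrix of adjusted p-values, one entry per pair.
dim(pw$p.value)
```

The result holds one adjusted p-value for each of the choose(18, 2) = 153
pairs of columns.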




Re: [R] test wilcoxon sur R help!

2013-10-24 Thread arun
It looks much better than mine.


With p-value adjustment:
p.adjust(fun1(dat1), method = "holm", n = 153)
#

dat1$id <- 1:10
library(reshape2)
dat2 <- melt(dat1,id.vars="id")
with(dat2,pairwise.wilcox.test(value,variable))
with(dat2,pairwise.wilcox.test(value,variable,p.adj="none"))


A.K.






[R] Test if 2 samples differ if they have autocorrelation

2013-07-18 Thread Eric Jaeger
 Dear all
 
 I have a question that I am struggling to answer:
 
 Let's assume I have two time series of daily PnL data over 2 years, coming from
 two different trading strategies. I want to find out whether strategy A is better
 than strategy B. The problem is that the two series are serially correlated,
 so I cannot just do a simple t-test.
 
 I tried something like this:
 
 1. Create cumulative time series of PnL_A = C_A and of PnL_B = C_B.
 
 2. Take the difference of both: C_A - C_B = DiffPnL (to see how the
 difference evolves over time).
 
 3. Do a regression: DiffPnL = beta * time + error (I thought that if beta is
 significantly different from 0, then the two time series are different).
 
 4. Estimate beta not with OLS, but with the Newey-West method (HAC estimator),
 which corrects the standard errors and test statistics for beta for
 heteroskedasticity and autocorrelation.
 
 BUT: I read that the tests are biased when the time series are unit-root
 non-stationary (which is the case here, because I take cumulative time
 series).
 
 I am lost! This should be fairly simple: how do I test whether two samples
 differ when they are autocorrelated? Probably my approach above is completely
 wrong...
 
  
 
 Thanks for your help
 
 Best regards
 
 Eric
 
 
 


Re: [R] Test if 2 samples differ if they have autocorrelation

2013-07-18 Thread Rolf Turner



I imagine that most readers of this list will put your question in the
"too hard" basket.

That being so, here is my inexpert take on the question.

The issue is to estimate the uncertainty in the estimated difference of
the means. This uncertainty depends on the nature of the serial dependence
of the series.

Therefore, in order to get anywhere, you need to *model* this dependence.

Different models could yield very different values for the variance of
the estimated difference of the means.

If the series are observed at the same times, I would suggest taking the
pointwise difference of the two series: D_t = X_t - Y_t, say.

Fit the best ARIMA model that you can to D_t. Then the standard error of
what is incorrectly labelled "intercept" (it is actually the estimate of the
series *mean*) is the appropriate estimate of the uncertainty. The ratio of
the intercept value to its standard error is the test statistic you are
looking for.
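
The paired case described above might be sketched as follows. This is only an illustration under stated assumptions: the two series are simulated stand-ins for the real PnL data, the AR(1) order is assumed rather than selected, and the p-value relies on a normal approximation for the ratio.

```r
# Sketch: test whether the mean of D_t = X_t - Y_t differs from zero,
# using the standard error of the "intercept" from an ARIMA fit.
set.seed(1)
n <- 500
x <- 0.3 + arima.sim(model = list(ar = 0.5), n = n)  # hypothetical strategy A
y <- 0.0 + arima.sim(model = list(ar = 0.5), n = n)  # hypothetical strategy B
d <- x - y                                           # pointwise difference

# order = c(1, 0, 0) is an assumption; in practice choose it from
# diagnostics (ACF/PACF, AIC) on the real data.
fit <- arima(d, order = c(1, 0, 0))

is_mean <- names(coef(fit)) == "intercept"           # actually the series mean
est <- unname(coef(fit)[is_mean])
se  <- unname(sqrt(diag(vcov(fit)))[is_mean])
z   <- est / se                                      # test statistic
p_value <- 2 * pnorm(-abs(z))                        # two-sided, normal approximation
c(estimate = est, se = se, z = z, p = p_value)
```

Note that the model is fitted to the differences of the daily values, not to the cumulative series, so the unit-root concern raised in the question does not arise from the construction itself.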

If the series are *not* observed at the same times but can be assumed to be
independent, then model *each* series as well as you can (different models
for each series) and obtain the standard error of the intercept for each
series. Your test statistic is then the difference of the intercept
estimates divided by sqrt(se_X^2 + se_Y^2), in what I hope is an obvious
notation.
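
For the independent case, a corresponding sketch could look like this. The helper name mean_and_se, the simulated series and the model orders are all assumptions made for illustration; with real data each order would be chosen separately.

```r
# Sketch: two independent series, each with its own ARIMA model;
# the test statistic is (mean_X - mean_Y) / sqrt(se_X^2 + se_Y^2).
mean_and_se <- function(series, order) {
  fit <- arima(series, order = order)
  i <- names(coef(fit)) == "intercept"     # labelled intercept, estimates the mean
  c(mean = unname(coef(fit)[i]),
    se   = unname(sqrt(diag(vcov(fit)))[i]))
}

set.seed(2)
x <- 0.3 + arima.sim(model = list(ar = 0.5), n = 400)  # hypothetical series X
y <- 0.0 + arima.sim(model = list(ma = 0.4), n = 250)  # hypothetical series Y

mx <- mean_and_se(x, order = c(1, 0, 0))   # assumed AR(1) for X
my <- mean_and_se(y, order = c(0, 0, 1))   # assumed MA(1) for Y

z <- unname((mx["mean"] - my["mean"]) / sqrt(mx["se"]^2 + my["se"]^2))
p_value <- 2 * pnorm(-abs(z))              # two-sided, normal approximation
```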

If the series are not observed at the same times and cannot be assumed to be
independent then you probably haven't got sufficient information to answer
the question that you wish to answer.

I hope that there is some value in the foregoing.

cheers,

Rolf Turner



Re: [R] Test for column equality across matrices

2013-07-14 Thread Thiem Alrik
Dear William,

thanks a lot. I've found another nice alternative:

A <- matrix(t(expand.grid(c(1,2,3,4,5), 15, 16)), nrow = 3)
B <- combn(16, 3)

B.n <- B[, -(which(duplicated(t(cbind(A, B)))) - ncol(A))]

Best wishes,
Alrik




Re: [R] Test for column equality across matrices

2013-07-14 Thread William Dunlap
It looks like match() (and relatives like %in% and is.element) acts a bit
unpredictably on lists when the list elements are vectors of numbers of
different types. If you match integers to integers or doubles to doubles it
works as expected, but when the types don't match the results vary. I would
expect the following to give either c(1,2) or c(NA,NA) but not c(1,NA):

> match( list( c(13L,15L,16L), c(14L,15L,16L)), list( c(13.,15.,16.), c(14.,15.,16.) ))
[1]  1 NA

It works when the list elements have the same type:

> match( list( c(13L,15L,16L), c(14L,15L,16L)), list( c(13L,15L,16L), c(14L,15L,16L) ))
[1] 1 2
> match( list( c(13.,15.,16.), c(14.,15.,16.)), list( c(13.,15.,16.), c(14.,15.,16.) ))
[1] 1 2
> match( list( c(13.,15.,16.), c(14L,15L,16L)), list( c(13.,15.,16.), c(14L,15L,16L) ))
[1] 1 2

So A and B should be coerced to have a common type ('storage.mode') before
comparing them.
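
As a sketch of that advice (not code from the thread): the example below coerces both matrices to integer storage and then compares columns through string keys built with an explicit separator, which avoids matching on lists altogether. The helper name col_key and the "_" separator are assumptions made for illustration.

```r
# Sketch: drop the columns of B that also occur as columns of A.
A <- matrix(t(expand.grid(c(1, 2, 3, 4, 5), 15, 16)), nrow = 3)  # stored as double
B <- combn(16, 3)                                                # stored as integer

# Give both matrices a common storage mode before comparing.
storage.mode(A) <- "integer"
storage.mode(B) <- "integer"

# One key per column; the separator keeps e.g. (1, 15) and (11, 5) distinct.
col_key <- function(mat) apply(mat, 2, paste, collapse = "_")

keep <- !(col_key(B) %in% col_key(A))
newB <- B[, keep, drop = FALSE]
dim(newB)   # 3 rows; the 5 columns shared with A are gone, leaving 555
```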

By the way, the discrepancy might happen because match() applied to lists might
be implemented by calling deparse on each element of each list and then using
the character method of match. For sequential integers deparse uses colon
notation; e.g., c(14L,15L,16L) becomes the string "14:16". But usually deparse
puts an 'L' after integers, so they would never match with a double of the same
value.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com




[R] Test for column equality across matrices

2013-07-13 Thread Thiem Alrik
Dear list,

I have two matrices

A <- matrix(t(expand.grid(c(1,2,3,4,5), 15, 16)), nrow = 3)
B <- combn(16, 3)

Now I would like to exclude from the 560 columns in B all columns which are
identical to any one of the 5 columns in A. How could I do this?

Many thanks and best wishes,

Alrik



Re: [R] Test for column equality across matrices

2013-07-13 Thread William Dunlap
Try
   columnsOf <- function(mat) split(mat, col(mat))
   newB <- B[ , !is.element(columnsOf(B), columnsOf(A))]

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com




Re: [R] Test for column equality across matrices

2013-07-13 Thread arun
I tried it on a slightly bigger dataset:
A1 <- matrix(t(expand.grid(1:90, 15, 16)), nrow = 3)
B1 <- combn(90, 3)
which(is.element(columnsOf(B1), columnsOf(A1)))
# [1]  1067  4895  8636 12291 15861 19347 22750 26071 29311 32471 35552 38555
#[13] 41481


which(apply(t(B1),1,paste,collapse="")%in%apply(t(A1),1,paste,collapse=""))
# [1]  1067  4895  8636 12291 15861 19347 22750 26071 29311 32471 35552 38555
#[13] 41481 44331


B1[,44331]
#[1] 14 15 16


which(apply(t(A1),1,paste,collapse="")=="141516")
#[1] 14

B1New <- B1[,!apply(t(B1),1,paste,collapse="")%in%apply(t(A1),1,paste,collapse="")]
newB <- B1[ , !is.element(columnsOf(B1), columnsOf(A1))]
identical(B1New,newB)
#[1] FALSE

is.element(B1[,44331],A1[,14])
#[1] TRUE TRUE TRUE


B1Sp <- columnsOf(B1)
B1Sp[[44331]]
#[1] 14 15 16
A1Sp <- columnsOf(A1)
A1Sp[[14]]
#[1] 14 15 16
is.element(B1Sp[[44331]],A1Sp[[14]])
#[1] TRUE TRUE TRUE


A.K.





Re: [R] Test for column equality across matrices

2013-07-13 Thread arun
Hi,
One way would be:
which(apply(t(B),1,paste,collapse="")%in%apply(t(A),1,paste,collapse=""))
#[1] 105 196 274 340 395
B[,105]
#[1]  1 15 16
B[,196]
#[1]  2 15 16
B1 <- B[,!apply(t(B),1,paste,collapse="")%in%apply(t(A),1,paste,collapse="")]
dim(B1)
#[1]   3 555
dim(B)
#[1]   3 560

#or
B2 <- B[,is.na(match(interaction(as.data.frame(t(B))),interaction(as.data.frame(t(A)))))]
identical(B1,B2)
#[1] TRUE


A.K.







[R] test

2013-04-01 Thread catalin roibu
Sorry for this message it's just a test.

Thank you!

-- 
---
Catalin-Constantin ROIBU
Lecturer PhD, Forestry engineer
Forestry Faculty of Suceava
Str. Universitatii no. 13, Suceava, 720229, Romania
office phone +4 0230 52 29 78, ext. 531
mobile phone   +4 0745 53 18 01
   +4 0766 71 76 58
FAX:+4 0230 52 16 64
silvic.usv.ro



Re: [R] Test of Parallel Regression Assumption in R

2013-03-12 Thread Rune Haubo
Dear Heather,

You can make this test using the ordinal package. Here the function
clm fits cumulative link models where the ordinal logistic regression
model is a special case (using the logit link).

Let me illustrate how to test the parallel regression assumption for a
particular variable using clm in the ordinal package. I am using the
wine dataset from the same package. I fit a model with two explanatory
variables, temp and contact, and I test the parallel regression
assumption for the contact variable with a likelihood ratio test:

> library(ordinal)
Loading required package: MASS
Loading required package: ucminf
Loading required package: Matrix
Loading required package: lattice
> head(wine)
  response rating temp contact bottle judge
1       36      2 cold      no      1     1
2       48      3 cold      no      2     1
3       47      3 cold     yes      3     1
4       67      4 cold     yes      4     1
5       77      4 warm      no      5     1
6       60      4 warm      no      6     1
> fm1 <- clm(rating ~ temp + contact, data=wine)
> fm2 <- clm(rating ~ temp, nominal=~ contact, data=wine)
> anova(fm1, fm2)
Likelihood ratio tests of cumulative link models:

    formula:                nominal: link: threshold:
fm1 rating ~ temp + contact ~1       logit flexible
fm2 rating ~ temp           ~contact logit flexible

    no.par    AIC  logLik LR.stat df Pr(>Chisq)
fm1      6 184.98 -86.492
fm2      9 190.42 -86.209  0.5667  3      0.904

The idea is to fit the model under the null hypothesis (parallel
effects - fm1) and under the alternative hypothesis (non-parallel
effects for contact - fm2) and compare these models with anova(), which
performs the LR test. From the high p-value we see that the null
cannot be rejected and there is no evidence of non-parallel slopes in
this case. For additional information, I suggest that you take a look
at the following package vignette
(http://cran.r-project.org/web/packages/ordinal/vignettes/clm_tutorial.pdf),
where these kinds of tests are described more thoroughly, starting on
page 6.

I think you can also make similar tests with the VGAM package, but I
am not as well versed in that package.

Hope this helps,
Rune

Rune Haubo Bojesen Christensen
Postdoc
DTU Compute - Section for Statistics
---
Technical University of Denmark
Department of Applied Mathematics and Computer Science
Richard Petersens Plads
Building 324, Room 220
2800 Lyngby
Direct +45 45253363
Mobile +45 30264554
http://www.imm.dtu.dk



[R] Test of Parallel Regression Assumption in R

2013-03-11 Thread Heather Kettrey
Hi,

I am running an analysis with an ordinal outcome and I need to run a test
of the parallel regression assumption to determine if ordinal logistic
regression is appropriate. I cannot find a function to conduct such a test.
From searching various message boards I have seen a few useRs ask this same
question without a definitive answer - and I came across a thread that
indicated there is no such function available in any R packages. I hope
this is incorrect.

Does anyone know how to test the parallel regression assumption in R?

Thanks for your help!


-- 
Heather Hensman Kettrey
PhD Candidate
Department of Sociology
Vanderbilt University



Re: [R] Test of Parallel Regression Assumption in R

2013-03-11 Thread Bert Gunter
Heather:

You are at Vanderbilt, whose statistics department under Frank Harrell
is a veritable bastion of R and statistical wisdom. I strongly
recommend that you take a stroll over there in the lovely spring
weather and seek their help. I can't imagine how you could do better
than that!

Cheers,
Bert




-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm



Re: [R] Test of Parallel Regression Assumption in R

2013-03-11 Thread Jeff Newmiller
Perhaps you should be asking whether such an algorithm exists, regardless of
whether it is already implemented in R. However, this is the wrong place to ask
such theory questions; your local statistics expert might know, or you could
ask on a statistics theory forum such as stats.stackexchange.com. With the
answer to that question you could use the RSiteSearch function to search for
references to that algorithm, or even implement it yourself.
---
Jeff Newmiller
DCN:jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.



Re: [R] Test of Parallel Regression Assumption in R

2013-03-11 Thread Nicole Ford
Here's some code as an example; hope it helps!

library(MASS)  # provides polr()

mod <- polr(vote~age+demsat+eusup+lrself+male+retnat+union+urban, data=dat)
summary(mod)

# Compare the ordered logit with a series of binary logits, one per cutpoint
tmpdat <- list()
for(i in 1:(nlevels(dat$vote)-1)){
tmpdat[[i]] <- dat
# z = 1 if the response is at or below the i-th level
tmpdat[[i]]$z <- as.numeric(as.numeric(tmpdat[[i]]$vote) <= i)
}
form <- as.formula(z~age+demsat+eusup+lrself+male+retnat+union+urban)
mods <- lapply(tmpdat, function(x)glm(form, data=x, family=binomial))
probs <- sapply(mods, predict, type="response")  # cumulative probabilities
# category probabilities: first level, successive differences, last level
p.logits <- cbind(probs[,1], t(apply(probs, 1, diff)), 1-probs[,ncol(probs)])
p.ologit <- predict(mod, type='probs')
n <- nrow(p.logits)
bin.ll <- p.logits[cbind(1:n, dat$vote)]
ologit.ll <- p.ologit[cbind(1:n, dat$vote)]
binom.test(sum(bin.ll > ologit.ll), n)


dat$vote.fac <- factor(dat$vote, levels=1:6)
mod <- polr(dat$vote.fac~age+demsat+eusup+lrself+male+retnat+union+urban,
data=dat)

source("http://www.quantoid.net/cat_pre.R")
catpre(mod)

install.packages("rms")
library(rms)
olprobs <- predict(mod, type='probs')
pred.cat <- apply(olprobs, 1, which.max)
table(pred.cat, dat$vote)

round(prop.table(table(pred.cat, dat$vote), 2), 3)


[R] Test if mysql connection is alive

2013-02-14 Thread Frans Marcelissen
Hi fellows,
I use RMySQL. I want to reconnect if the connection is not alive anymore.

if (!connected()) con <- dbConnect(MySQL(), user="..",
password="..", host="..", db="..")

But how can I implement the test connected()?
I thought the way to do this was

connected <- function(){return (exists("con") && isIdCurrent(con))}

But that doesn't work: after some time connected() returns TRUE, but the next
dbGetQuery signals

Error in mysqlExecStatement(conn, statement, ...) : 
  RS-DBI driver: (could not run statement: MySQL server has gone away)

How can I test if the connection is still valid?
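
One possible approach, sketched here under assumptions rather than taken from the thread, is to send a trivial query and treat any error as "not alive". The function name connection_alive and the injectable ping argument are hypothetical; the default ping assumes the DBI package is installed and that "SELECT 1" is acceptable to the server.

```r
# Sketch: a connection counts as alive if a cheap round trip succeeds.
connection_alive <- function(con,
                             ping = function(con) DBI::dbGetQuery(con, "SELECT 1")) {
  if (is.null(con)) return(FALSE)
  tryCatch({
    ping(con)          # any error here (e.g. "server has gone away") means dead
    TRUE
  }, error = function(e) FALSE)
}

# Demonstration with stand-in ping functions, so no database is needed:
ok_ping   <- function(con) 1
dead_ping <- function(con) stop("MySQL server has gone away")
alive_result <- connection_alive("fake-connection", ping = ok_ping)
dead_result  <- connection_alive("fake-connection", ping = dead_ping)

# Intended use with a real connection (not run here):
# if (!connection_alive(con)) con <- dbConnect(MySQL(), user="..", ...)
```

The cost is one extra round trip per check, which is why the check is kept to the cheapest possible statement.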

Thanks
Frans



Re: [R] test for a condition in a vector for loop not working

2012-11-10 Thread scoyoc
Once again, thanks!
MVS



-
MVS
=
Matthew Van Scoyoc
Graduate Research Assistant, Ecology
Wildland Resources Department & Ecology Center
Quinney College of Natural Resources
Utah State University
Logan, UT
=
Think SNOW!




[R] Test for treatment effect in a logistic regression

2012-10-16 Thread bibek sharma
Dear R users,

I need to fit a logistic regression with a binomial response. The
objective is to compare treatment groups while controlling for other
categorical and continuous predictors. glm() with family=binomial(logit)
gives me parameter estimates as well as odds ratios, but the objective is
to test whether the treatment groups are significantly different. I have
used a Wald test but got an error message (please see the code used and
the error message below). Any suggestion is much appreciated!

wald.test(b=coef(fit), sigma=vcov(fit), Terms = 2:3) # 2 and 3 are the
# estimates for treatment group.
## Comparing Group B to Group C

l <- cbind(0, 1,-1, 0,0,0,0,0,0,0)
wald.test(b = coef(fit), Sigma = vcov(fit), L = l)

Error message:

Error in wald.test(b = coef(), sigma = vcov(), Terms = 2:3) :
  unused argument(s) (sigma = vcov())

Thanks in advance for your suggestion,
Bibek
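
If wald.test() here is the one from the aod package (an assumption, the post does not say), the error is most likely a matter of case: that function names its covariance argument `Sigma`, so `sigma =` is reported as unused. The sketch below uses only base R and simulated data (the data frame `d`, its columns and the effect sizes are made up for illustration). It shows a likelihood-ratio test for the overall treatment effect and a hand-computed Wald test for the B versus C contrast.

```r
set.seed(1)
n <- 300
d <- data.frame(
  trt = factor(sample(c("A", "B", "C"), n, replace = TRUE)),  # hypothetical groups
  x   = rnorm(n)                                              # hypothetical covariate
)
d$y <- rbinom(n, 1, plogis(-0.5 + 0.8 * (d$trt == "B") + 0.3 * d$x))

fit  <- glm(y ~ trt + x, family = binomial(link = "logit"), data = d)
fit0 <- update(fit, . ~ . - trt)          # same model without treatment

# Overall treatment effect: likelihood-ratio test on 2 degrees of freedom
lrt <- anova(fit0, fit, test = "Chisq")

# B vs C: Wald test of the contrast L %*% beta = 0
# coefficient order is (Intercept), trtB, trtC, x
L   <- rbind(c(0, 1, -1, 0))
b   <- coef(fit)
V   <- vcov(fit)
est <- L %*% b
W   <- as.numeric(t(est) %*% solve(L %*% V %*% t(L)) %*% est)
p_wald <- pchisq(W, df = nrow(L), lower.tail = FALSE)
```

The contrast row plays the same role as the `L` matrix in the post; it needs one entry per coefficient in `coef(fit)`.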



Re: [R] Test for Random Points on a Sphere

2012-10-07 Thread 周果
Hi Lorenzo,

Just a quick thought: the uniform probability density on a unit sphere is
1 / (4 pi). What about binning those random points according to their
directions and doing a chi-square test?

Regards,
Guo
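
A minimal sketch of the binning idea, under stated assumptions: the cells are made equal in area by crossing equal-width bands in z with equal-width sectors in longitude (the area of a spherical zone depends only on its height), and the sample tested is a reference sample of normalised Gaussian vectors rather than the output of the rotation algorithm. The bin counts `nz` and `nphi` are arbitrary choices.

```r
set.seed(42)
n <- 5000

# Reference sample: normalised Gaussian vectors are uniform on the unit sphere.
# Replace `p` with the n x 3 matrix of rotated unit vectors to test those.
p <- matrix(rnorm(3 * n), ncol = 3)
p <- p / sqrt(rowSums(p^2))

nz   <- 5   # number of z bands
nphi <- 8   # number of longitude sectors

zbin   <- cut(p[, 3], breaks = seq(-1, 1, length.out = nz + 1),
              include.lowest = TRUE)
phi    <- atan2(p[, 2], p[, 1])                       # longitude in (-pi, pi]
phibin <- cut(phi, breaks = seq(-pi, pi, length.out = nphi + 1),
              include.lowest = TRUE)

counts <- table(zbin, phibin)          # nz * nphi equal-area cells
res    <- chisq.test(as.vector(counts))  # equal expected counts by default
res
```

A small p-value would indicate departure from uniformity at the resolution of the chosen grid; structure finer than the cells goes undetected.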

On Sun, Oct 7, 2012 at 2:16 AM, cbe...@tajo.ucsd.edu wrote:

> Lorenzo Isella lorenzo.ise...@gmail.com writes:
>
> > Dear All,
> > I implemented an algorithm for (uniform) random rotations.
> > In order to test it, I can apply it to a unit vector (0,0,1) in
> > Cartesian coordinates.
> > The result is supposed to be a set of random, uniformly distributed,
> > points on a sphere (not the point of the algorithm, but a way to test
> > it).
> > This is what the points look like when I plot them, but other than
> > eyeballing them, can anyone suggest a test to ensure that I am really
> > generating uniform random points on a sphere?
>
> There is a substantial literature on this topic and more than one
> (metaphorical?) direction you could follow.
>
> I suggest you Google 'directional statistics' and start reading.
>
> Visit http://www.rseek.org and enter 'directional statistics' in
> the search box and click on the search button to see if there is
> something in R to meet your needs.
>
> A post to r-sig-geo might get more helpful responses once you can focus
> the question a bit more.
>
>
> HTH,
>
> Chuck
>
> > Many thanks
> >
> > Lorenzo
>
> --
> Charles C. Berry                            Dept of Family/Preventive Medicine
> cberry at ucsd edu                          UC San Diego
> http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901



Re: [R] Test for Random Points on a Sphere

2012-10-06 Thread cberry
Lorenzo Isella lorenzo.ise...@gmail.com writes:

> Dear All,
> I implemented an algorithm for (uniform) random rotations.
> In order to test it, I can apply it to a unit vector (0,0,1) in
> Cartesian coordinates.
> The result is supposed to be a set of random, uniformly distributed,
> points on a sphere (not the point of the algorithm, but a way to test
> it).
> This is what the points look like when I plot them, but other than
> eyeballing them, can anyone suggest a test to ensure that I am really
> generating uniform random points on a sphere?

There is a substantial literature on this topic and more than one
(metaphorical?) direction you could follow.

I suggest you Google 'directional statistics' and start reading.

Visit http://www.rseek.org and enter 'directional statistics' in
the search box and click on the search button to see if there is
something in R to meet your needs.

A post to r-sig-geo might get more helpful responses once you can focus
the question a bit more.


HTH,

Chuck
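
One standard test from the directional-statistics literature is the Rayleigh test, which looks at the length of the mean direction vector: for n points uniform on the unit sphere in three dimensions, 3 * n * Rbar^2 is approximately chi-squared with 3 degrees of freedom. The sketch below is a base-R illustration; `rayleigh_sphere` is a hypothetical helper name, and the samples are simulated rather than taken from the rotation algorithm.

```r
set.seed(7)
n <- 2000
p <- matrix(rnorm(3 * n), ncol = 3)
p <- p / sqrt(rowSums(p^2))            # reference sample, uniform on the sphere

rayleigh_sphere <- function(pts) {
  rbar <- sqrt(sum(colMeans(pts)^2))   # length of the mean direction vector
  stat <- 3 * nrow(pts) * rbar^2       # approx. chi-squared(3) under uniformity
  c(statistic = stat,
    p.value   = pchisq(stat, df = 3, lower.tail = FALSE))
}

rayleigh_sphere(p)

# A deliberately non-uniform sample (upper hemisphere only) for contrast
upper <- p
upper[, 3] <- abs(upper[, 3])
rayleigh_sphere(upper)
```

This test is sensitive to a preferred direction but blind to symmetric departures (for example points concentrated at both poles), so it complements a binned test rather than replacing it.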

> Many thanks
>
> Lorenzo


-- 
Charles C. Berry                            Dept of Family/Preventive Medicine
cberry at ucsd edu                          UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901



[R] Test for Random Points on a Sphere

2012-10-05 Thread Lorenzo Isella

Dear All,
I implemented an algorithm for (uniform) random rotations.
In order to test it, I can apply it to a unit vector (0,0,1) in Cartesian  
coordinates.
The result is supposed to be a set of random, uniformly distributed,  
points on a sphere (not the point of the algorithm, but a way to test it).
This is what the points look like when I plot them, but other than
eyeballing them, can anyone suggest a test to ensure that I am really  
generating uniform random points on a sphere?

Many thanks

Lorenzo



Re: [R] Test for Random Points on a Sphere

2012-10-05 Thread R. Michael Weylandt
On Fri, Oct 5, 2012 at 5:39 PM, Lorenzo Isella lorenzo.ise...@gmail.com wrote:
> Dear All,
> I implemented an algorithm for (uniform) random rotations.
> In order to test it, I can apply it to a unit vector (0,0,1) in Cartesian
> coordinates.
> The result is supposed to be a set of random, uniformly distributed, points
> on a sphere (not the point of the algorithm, but a way to test it).
> This is what the points look like when I plot them, but other than
> eyeballing them, can anyone suggest a test to ensure that I am really
> generating uniform random points on a sphere?
> Many thanks


Gut says to divide the surface into n bits of equal area and see if
the points appear uniformly in those using something chi-squared-ish,
but I'm not aware of a canonical way to do so.

Cheers,
Michael

> Lorenzo



Re: [R] Test for Random Points on a Sphere

2012-10-05 Thread Nordlund, Dan (DSHS/RDA)
> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org]
> On Behalf Of R. Michael Weylandt
> Sent: Friday, October 05, 2012 11:17 AM
> To: Lorenzo Isella
> Cc: r-help@r-project.org
> Subject: Re: [R] Test for Random Points on a Sphere
>
> On Fri, Oct 5, 2012 at 5:39 PM, Lorenzo Isella
> lorenzo.ise...@gmail.com wrote:
> > Dear All,
> > I implemented an algorithm for (uniform) random rotations.
> > In order to test it, I can apply it to a unit vector (0,0,1) in
> > Cartesian coordinates.
> > The result is supposed to be a set of random, uniformly distributed,
> > points on a sphere (not the point of the algorithm, but a way to test it).
> > This is what the points look like when I plot them, but other than
> > eyeballing them, can anyone suggest a test to ensure that I am really
> > generating uniform random points on a sphere?
> > Many thanks
>
>
> Gut says to divide the surface into n bits of equal area and see if
> the points appear uniformly in those using something chi-squared-ish,
> but I'm not aware of a canonical way to do so.
>
> Cheers,
> Michael
>
> > Lorenzo
>

I would be more inclined to use a method which is known to produce points
uniformly distributed on the surface of a sphere and not worry about testing
your results.  You might find the discussion at the following link useful.

http://mathworld.wolfram.com/SpherePointPicking.html


Hope this is helpful,

Dan
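
One of the constructions usually given for sphere point picking is to draw the height z uniformly on [-1, 1] and the longitude uniformly on [0, 2*pi). A minimal sketch of that construction follows; `runif_sphere` is a hypothetical name, not a function from any package.

```r
# Uniform points on the unit sphere: uniform height z and uniform longitude phi.
runif_sphere <- function(n) {
  z   <- runif(n, -1, 1)        # uniform height
  phi <- runif(n, 0, 2 * pi)    # uniform longitude
  r   <- sqrt(1 - z^2)          # radius of the circle at height z
  cbind(x = r * cos(phi), y = r * sin(phi), z = z)
}

set.seed(1)
pts <- runif_sphere(1000)
head(pts)
```

Drawing the polar angle itself uniformly, instead of z, would crowd points near the poles, which is the usual pitfall this construction avoids.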

Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability
Research and Data Analysis Division
Olympia, WA 98504-5204




[R] test Breslow-Day for svytable??

2012-08-31 Thread Diana Marcela Martinez Ruiz

Hi all,

I want to know how to perform the Breslow-Day test for homogeneity of
odds ratios (OR) across strata for a svytable. This test is obtained with the
following code:

epi.2by2 (dat = daty, method = "case.control" conf.level = 0.95,
units = 100, homogeneity = "breslow.day", verbose = TRUE)

where daty is a table object produced by svytable, but when I run
the code it does not return the homogeneity test.

 Thanks.  


Re: [R] test Breslow-Day for svytable??

2012-08-31 Thread John Sorkin
Suggestion:
You need to send us more information, i.e. the code that generated daty, or a
listing of the daty structure, and a copy of the listing
produced by epi.2by2
John

 
John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)

Diana Marcela Martinez Ruiz dianamm...@hotmail.com 8/31/2012 10:20 AM

> Hi all,
>
> I want to know how to perform the Breslow-Day test for homogeneity of
> odds ratios (OR) across strata for a svytable. This test is obtained with the
> following code:
>
> epi.2by2 (dat = daty, method = "case.control" conf.level = 0.95,
> units = 100, homogeneity = "breslow.day", verbose = TRUE)
>
> where daty is a table object produced by svytable, but when I run
> the code it does not return the homogeneity test.
>
> Thanks.


Re: [R] test Breslow-Day for svytable??

2012-08-31 Thread David Winsemius

On Aug 31, 2012, at 7:20 AM, Diana Marcela Martinez Ruiz wrote:

> Hi all,
>
> I want to know how to perform the Breslow-Day test for homogeneity of
> odds ratios (OR) across strata for a svytable. This test is obtained with the
> following code:
>
> epi.2by2 (dat = daty, method = "case.control" conf.level = 0.95,

missing comma here ...^

> units = 100, homogeneity = "breslow.day", verbose = TRUE)
>
> where daty is a table object produced by svytable, but when I run
> the code it does not return the homogeneity test.

You are asked in the Posting guide to copy all errors and warnings when asking
about unexpected behavior. When I run epi.2by2 on the output of a svytable
object I get no errors, but I do get warnings which I think are due to
non-integer entries in the weighted table. I also get from svytable(),
using its first example on the help page, an object that is NOT a set of 2 x 2
tables in an array of the structure expected by epi.2by2(). The fact that
epi.2by2() will report numbers with labels for a 2 x 3 table means that its
error checking is weak.

This is the output of str(dat) from one of the example on epi.2by2's help page:

> str(dat)
 table [1:2, 1:2, 1:3] 41 13 6 53 66 37 25 83 23 37 ...
 - attr(*, "dimnames")=List of 3
  ..$ Exposure: chr [1:2] "+" "-"
  ..$ Disease : chr [1:2] "+" "-"
  ..$ Strata  : chr [1:3] "20-29 yrs" "30-39 yrs" "40+ yrs"

Notice that it is a 2 x 2 x n array. (Caveat: from here on out I am simply
reading the help pages and using str() to look at the objects created to get an
idea regarding success or failure. I am not an experienced user of either
package.)  I doubt that what you got from svytable is a 2 x 2 table. As
another example you can build a 2 x 2 x n table from the built-in dataset
UCBAdmissions:

DF <- as.data.frame(UCBAdmissions)
## Now 'DF' is a data frame with a grid of the factors and the counts
## in variable 'Freq'.
dat2 <- xtabs(Freq ~ Gender + Admit + Dept, DF)
epiR::epi.2by2(dat = dat2, method = "case.control", conf.level = 0.95,
               units = 100, homogeneity = "breslow.day", verbose = TRUE)$OR.homog
#-
  test.statistic df    p.value
1       18.82551  5 0.00207139

Using svydesign and svytable I _think_ this is how one would go about
constructing a 2 x 2 x n table:

tbl2 <- svydesign(~ Gender + Admit + Dept, weights = ~Freq, data = DF)
summary(tbl2)
(tbl2by2 <- svytable(~ Gender + Admit + Dept, tbl2))
epiR::epi.2by2(dat = tbl2by2, method = "case.control", conf.level = 0.95,
               units = 100, homogeneity = "breslow.day", verbose = TRUE)$OR.homog
#---
  test.statistic df    p.value
1       18.82551  5 0.00207139

(At least I got internal consistency. I see you copied Thomas Lumley, which is 
a good idea. I'll be happy to get corrected on any point. I'm adding the 
maintainer of epiR to the recipients.)

-- 
David Winsemius, MD
Alameda, CA, USA



Re: [R] test Breslow-Day for svytable??

2012-08-31 Thread Thomas Lumley
On Sat, Sep 1, 2012 at 4:27 AM, David Winsemius dwinsem...@comcast.net wrote:

> On Aug 31, 2012, at 7:20 AM, Diana Marcela Martinez Ruiz wrote:
>
> > Hi all,
> >
> > I want to know how to perform the Breslow-Day test for homogeneity of
> > odds ratios (OR) across strata for a svytable. This test is obtained with the
> > following code:
> >
> > epi.2by2 (dat = daty, method = "case.control" conf.level = 0.95,
>
> missing comma here ...^
>
> > units = 100, homogeneity = "breslow.day", verbose = TRUE)
> >
> > where daty is a table object produced by svytable, but when I run
> > the code it does not return the homogeneity test.
>
> You are asked in the Posting guide to copy all errors and warnings when
> asking about unexpected behavior. When I run epi.2by2 on the output of a
> svytable object I get no errors, but I do get warnings which I think are due
> to non-integer entries in the weighted table. I also get from svytable(),
> using its first example on the help page, an object that is NOT a set of 2 x 2
> tables in an array of the structure expected by epi.2by2(). The fact that
> epi.2by2() will report numbers with labels for a 2 x 3 table means that its
> error checking is weak.
>
> This is the output of str(dat) from one of the examples on epi.2by2's help
> page:
>
> > str(dat)
>  table [1:2, 1:2, 1:3] 41 13 6 53 66 37 25 83 23 37 ...
>  - attr(*, "dimnames")=List of 3
>   ..$ Exposure: chr [1:2] "+" "-"
>   ..$ Disease : chr [1:2] "+" "-"
>   ..$ Strata  : chr [1:3] "20-29 yrs" "30-39 yrs" "40+ yrs"
>
> Notice that it is a 2 x 2 x n array. (Caveat: from here on out I am simply
> reading the help pages and using str() to look at the objects created to get
> an idea regarding success or failure. I am not an experienced user of either
> package.)  I doubt that what you got from svytable is a 2 x 2 table. As
> another example you can build a 2 x 2 x n table from the built-in dataset
> UCBAdmissions:
>
> DF <- as.data.frame(UCBAdmissions)
> ## Now 'DF' is a data frame with a grid of the factors and the counts
> ## in variable 'Freq'.
> dat2 <- xtabs(Freq ~ Gender + Admit + Dept, DF)
> epiR::epi.2by2(dat = dat2, method = "case.control", conf.level = 0.95,
>                units = 100, homogeneity = "breslow.day", verbose = TRUE)$OR.homog
> #-
>   test.statistic df    p.value
> 1       18.82551  5 0.00207139
>
> Using svydesign and svytable I _think_ this is how one would go about
> constructing a 2 x 2 x n table:
>
> tbl2 <- svydesign(~ Gender + Admit + Dept, weights = ~Freq, data = DF)
> summary(tbl2)
> (tbl2by2 <- svytable(~ Gender + Admit + Dept, tbl2))
> epiR::epi.2by2(dat = tbl2by2, method = "case.control", conf.level = 0.95,
>                units = 100, homogeneity = "breslow.day", verbose = TRUE)$OR.homog
> #---
>   test.statistic df    p.value
> 1       18.82551  5 0.00207139
>
> (At least I got internal consistency. I see you copied Thomas Lumley, which
> is a good idea. I'll be happy to get corrected on any point. I'm adding the
> maintainer of epiR to the recipients.)


Yes, that will give internal consistency from a data structure point
of view.  It won't give a valid test in real examples, though --
epi.2by2 doesn't know about complex sampling, and what you're passing
it is just an estimate of the population 2x2xK table.

What would work, though it's not quite the same as the Breslow-Day
test, is to use svyloglin() and do a Rao-Scott test comparing the
model with all two-way interactions ~(Gender+Dept+Admit)^2 to the
saturated model ~Gender*Dept*Admit.

-thomas
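
A rough sketch of what that comparison could look like with the survey package. The data are simulated purely for illustration: the cluster structure, the weights, and the variable names `exposure`, `disease` and `stratum` are all assumptions and stand in for a real design. Both models are fitted with svyloglin() and compared with its anova() method.

```r
library(survey)

set.seed(1)
n <- 600
d <- data.frame(
  cluster  = rep(1:30, each = 20),                    # hypothetical PSUs
  w        = runif(n, 1, 3),                          # hypothetical weights
  stratum  = factor(sample(c("s1", "s2", "s3"), n, replace = TRUE)),
  exposure = factor(rbinom(n, 1, 0.5))
)
d$disease <- factor(rbinom(n, 1, plogis(-0.5 + 0.7 * (d$exposure == "1"))))

des <- svydesign(ids = ~cluster, weights = ~w, data = d)

# All two-way interactions: a common exposure-disease association across strata
two_way   <- svyloglin(~(exposure + disease + stratum)^2, des)
# Saturated model: the association is allowed to differ by stratum
saturated <- svyloglin(~exposure * disease * stratum, des)

# Rao-Scott type comparison of the two nested models
anova(two_way, saturated)
```

Evidence against the two-way model would suggest that the exposure-disease odds ratio is not homogeneous across strata, which is the question the Breslow-Day test addresses for simple random samples.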


-- 
Thomas Lumley
Professor of Biostatistics
University of Auckland



Re: [R] test if elements of a character vector contain letters

2012-08-08 Thread Liviu Andronic
On Tue, Aug 7, 2012 at 10:26 PM, Marc Schwartz marc_schwa...@me.com wrote:
> since there are alpha-numerics present, whereas the first option will:
>
> > grepl("[^[:alnum:]]", "ab%")
> [1] TRUE
>
>
> So, use the first option.

And I should start reading more carefully. The above works fine for me.

I ended up defining the following wrappers:
is_alpha <- function(x) {grepl("[[:alpha:]]", x)}      ## Alphabetic characters
is_digit <- function(x) {grepl("[[:digit:]]", x)}      ## Digits
is_alnum <- function(x) {grepl("[[:alnum:]]", x)}      ## Alphanumeric characters
is_punct <- function(x) {grepl("[[:punct:]]", x)}      ## Punctuation characters
is_notalnum <- function(x) {grepl("[^[:alnum:]]", x)}  ## Non-alphanumeric characters


Thanks again
Liviu
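
A short usage sketch for wrappers of this kind, on a made-up input vector. Each function answers "does the element contain at least one character of this class", so an element can be TRUE for several of them at once.

```r
is_alpha    <- function(x) grepl("[[:alpha:]]", x)    # contains a letter
is_digit    <- function(x) grepl("[[:digit:]]", x)    # contains a digit
is_notalnum <- function(x) grepl("[^[:alnum:]]", x)   # contains a non-alphanumeric

x <- c("a1", "k", "26", "+", "ab%")   # illustrative input
data.frame(x,
           alpha    = is_alpha(x),
           digit    = is_digit(x),
           notalnum = is_notalnum(x))
```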



Re: [R] test if elements of a character vector contain letters

2012-08-07 Thread Liviu Andronic
On Mon, Aug 6, 2012 at 7:35 PM, Marc Schwartz marc_schwa...@me.com wrote:
> is.letter <- function(x) grepl("[[:alpha:]]", x)
> is.number <- function(x) grepl("[[:digit:]]", x)

Quick follow-up question.

I'm always reluctant to create functions that would resemble the
method of a function (here, is() ), but would in fact not be a genuine
method. So would there be any incompatibility between is() and
is.letter(), given that the latter is not a method of the former?
Is it good (or acceptable) practice to define is.letter() as above?
Would is_letter() be better?

Regards
Liviu



Re: [R] test if elements of a character vector contain letters

2012-08-07 Thread Liviu Andronic
On Mon, Aug 6, 2012 at 7:35 PM, Marc Schwartz marc_schwa...@me.com wrote:
> is.letter <- function(x) grepl("[[:alpha:]]", x)
> is.number <- function(x) grepl("[[:digit:]]", x)


Another follow-up. To test for (non-)alphanumeric one would do the following:
> x <- c(letters, 1:26, '+', '-', '%^')
> x[1:10] <- paste(x[1:10], 1:10, sep='')
> x
 [1] "a1"  "b2"  "c3"  "d4"  "e5"  "f6"  "g7"  "h8"  "i9"  "j10" "k"   "l"   "m"   "n"
[15] "o"   "p"   "q"   "r"   "s"   "t"   "u"   "v"   "w"   "x"   "y"   "z"   "1"   "2"
[29] "3"   "4"   "5"   "6"   "7"   "8"   "9"   "10"  "11"  "12"  "13"  "14"  "15"  "16"
[43] "17"  "18"  "19"  "20"  "21"  "22"  "23"  "24"  "25"  "26"  "+"   "-"   "%^"
> xb <- grepl("[[:alnum:]]", x)  ##test for alphanumeric chars
> x[xb]
 [1] "a1"  "b2"  "c3"  "d4"  "e5"  "f6"  "g7"  "h8"  "i9"  "j10" "k"   "l"   "m"   "n"
[15] "o"   "p"   "q"   "r"   "s"   "t"   "u"   "v"   "w"   "x"   "y"   "z"   "1"   "2"
[29] "3"   "4"   "5"   "6"   "7"   "8"   "9"   "10"  "11"  "12"  "13"  "14"  "15"  "16"
[43] "17"  "18"  "19"  "20"  "21"  "22"  "23"  "24"  "25"  "26"
> xb <- grepl("[[:punct:]]", x)  ##test for non-alphanumeric chars
> x[xb]
[1] "+"  "-"  "%^"

More regex rules are available on the Wiki [1]. Regards
Liviu

[1] http://en.wikipedia.org/wiki/Regular_expression



Re: [R] test if elements of a character vector contain letters

2012-08-07 Thread R. Michael Weylandt
On Tue, Aug 7, 2012 at 4:28 AM, Liviu Andronic landronim...@gmail.com wrote:
> On Mon, Aug 6, 2012 at 7:35 PM, Marc Schwartz marc_schwa...@me.com wrote:
> > is.letter <- function(x) grepl("[[:alpha:]]", x)
> > is.number <- function(x) grepl("[[:digit:]]", x)
>
> Quick follow-up question.
>
> I'm always reluctant to create functions that would resemble the
> method of a function (here, is() ), but would in fact not be a genuine
> method. So would there be any incompatibility between is() and
> is.letter(), given that the latter is not a method of the former?
> Is it good (or acceptable) practice to define is.letter() as above?
> Would is_letter() be better?

It certainly won't cause problems if you never define anything of
class letter or number.
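
To illustrate the concern: a dotted name is only treated as a method when the part before the dot is an S3 generic and the part after it matches an object's class. The sketch below uses a made-up generic `describe` and a made-up class "letter" to show a dotted function being picked up by dispatch; `is()` does not dispatch on dotted names this way, so `is.letter()` would simply be an ordinary function, and an underscore name sidesteps the question altogether.

```r
# Hypothetical S3 generic and methods, for illustration only
describe         <- function(x, ...) UseMethod("describe")
describe.default <- function(x, ...) "default"
describe.letter  <- function(x, ...) "letter method"   # looks like a method, and is one

obj <- structure("a", class = "letter")
describe(obj)     # dispatches to describe.letter
describe("a")     # plain character vector: falls through to describe.default

# An underscore name can never be mistaken for a method
is_letter <- function(x) grepl("[[:alpha:]]", x)
is_letter(c("a", "1"))
```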


> Regards
> Liviu




Re: [R] test if elements of a character vector contain letters

2012-08-07 Thread Marc Schwartz

On Aug 7, 2012, at 3:02 PM, Liviu Andronic landronim...@gmail.com wrote:

> On Mon, Aug 6, 2012 at 7:35 PM, Marc Schwartz marc_schwa...@me.com wrote:
> > is.letter <- function(x) grepl("[[:alpha:]]", x)
> > is.number <- function(x) grepl("[[:digit:]]", x)
>
>
> Another follow-up. To test for (non-)alphanumeric one would do the following:
> > x <- c(letters, 1:26, '+', '-', '%^')
> > x[1:10] <- paste(x[1:10], 1:10, sep='')
> > x
>  [1] "a1"  "b2"  "c3"  "d4"  "e5"  "f6"  "g7"  "h8"  "i9"  "j10" "k"   "l"   "m"   "n"
> [15] "o"   "p"   "q"   "r"   "s"   "t"   "u"   "v"   "w"   "x"   "y"   "z"   "1"   "2"
> [29] "3"   "4"   "5"   "6"   "7"   "8"   "9"   "10"  "11"  "12"  "13"  "14"  "15"  "16"
> [43] "17"  "18"  "19"  "20"  "21"  "22"  "23"  "24"  "25"  "26"  "+"   "-"   "%^"
> > xb <- grepl("[[:alnum:]]", x)  ##test for alphanumeric chars
> > x[xb]
>  [1] "a1"  "b2"  "c3"  "d4"  "e5"  "f6"  "g7"  "h8"  "i9"  "j10" "k"   "l"   "m"   "n"
> [15] "o"   "p"   "q"   "r"   "s"   "t"   "u"   "v"   "w"   "x"   "y"   "z"   "1"   "2"
> [29] "3"   "4"   "5"   "6"   "7"   "8"   "9"   "10"  "11"  "12"  "13"  "14"  "15"  "16"
> [43] "17"  "18"  "19"  "20"  "21"  "22"  "23"  "24"  "25"  "26"
> > xb <- grepl("[[:punct:]]", x)  ##test for non-alphanumeric chars
> > x[xb]
> [1] "+"  "-"  "%^"


That will get you values where punctuation characters are used, but there may 
be other non-alphanumeric characters in the vector. There may be ASCII control 
codes, tabs, newlines, CR, LF, spaces, etc. which would not be found by using 
[:punct:].

For example:

> grepl("[[:punct:]]", " ")
[1] FALSE


If you want to explicitly look for non-alphanumeric characters, you would be
better off using a negation of [:alnum:] such as:

grepl("[^[:alnum:]]", x)

or

!grepl("[[:alnum:]]", x)


Regards,

Marc
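
A small sketch of the point about [:punct:], on a made-up vector: whitespace and control characters are non-alphanumeric but are not punctuation, so only the negated [:alnum:] class flags them.

```r
x <- c("abc", "a b", "tab\there", "+", "")   # illustrative input

punct    <- grepl("[[:punct:]]", x)     # punctuation only
nonalnum <- grepl("[^[:alnum:]]", x)    # anything that is not a letter or digit

data.frame(x, punct, nonalnum)
```

The space in "a b" and the tab in the third element are caught only by the second pattern; the empty string matches neither.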



 
> More regex rules are available on the Wiki [1]. Regards
> Liviu
>
> [1] http://en.wikipedia.org/wiki/Regular_expression
 


Re: [R] test if elements of a character vector contain letters

2012-08-07 Thread Marc Schwartz

On Aug 7, 2012, at 3:18 PM, Marc Schwartz marc_schwa...@me.com wrote:

 
> On Aug 7, 2012, at 3:02 PM, Liviu Andronic landronim...@gmail.com wrote:
>
> > On Mon, Aug 6, 2012 at 7:35 PM, Marc Schwartz marc_schwa...@me.com wrote:
> > > is.letter <- function(x) grepl("[[:alpha:]]", x)
> > > is.number <- function(x) grepl("[[:digit:]]", x)
> >
> >
> > Another follow-up. To test for (non-)alphanumeric one would do the following:
> > > x <- c(letters, 1:26, '+', '-', '%^')
> > > x[1:10] <- paste(x[1:10], 1:10, sep='')
> > > x
> >  [1] "a1"  "b2"  "c3"  "d4"  "e5"  "f6"  "g7"  "h8"  "i9"  "j10" "k"   "l"   "m"   "n"
> > [15] "o"   "p"   "q"   "r"   "s"   "t"   "u"   "v"   "w"   "x"   "y"   "z"   "1"   "2"
> > [29] "3"   "4"   "5"   "6"   "7"   "8"   "9"   "10"  "11"  "12"  "13"  "14"  "15"  "16"
> > [43] "17"  "18"  "19"  "20"  "21"  "22"  "23"  "24"  "25"  "26"  "+"   "-"   "%^"
> > > xb <- grepl("[[:alnum:]]", x)  ##test for alphanumeric chars
> > > x[xb]
> >  [1] "a1"  "b2"  "c3"  "d4"  "e5"  "f6"  "g7"  "h8"  "i9"  "j10" "k"   "l"   "m"   "n"
> > [15] "o"   "p"   "q"   "r"   "s"   "t"   "u"   "v"   "w"   "x"   "y"   "z"   "1"   "2"
> > [29] "3"   "4"   "5"   "6"   "7"   "8"   "9"   "10"  "11"  "12"  "13"  "14"  "15"  "16"
> > [43] "17"  "18"  "19"  "20"  "21"  "22"  "23"  "24"  "25"  "26"
> > > xb <- grepl("[[:punct:]]", x)  ##test for non-alphanumeric chars
> > > x[xb]
> > [1] "+"  "-"  "%^"
>
>
> That will get you values where punctuation characters are used, but there may
> be other non-alphanumeric characters in the vector. There may be ASCII
> control codes, tabs, newlines, CR, LF, spaces, etc. which would not be found
> by using [:punct:].
>
> For example:
>
> > grepl("[[:punct:]]", " ")
> [1] FALSE
>
>
> If you want to explicitly look for non-alphanumeric characters, you would be
> better off using a negation of [:alnum:] such as:
>
> grepl("[^[:alnum:]]", x)
>
> or
>
> !grepl("[[:alnum:]]", x)
 



Actually (for the second time in two days) I need to correct myself. The second 
option would not work correctly in cases where there is a mix of alpha-numerics 
and non:

> !grepl("[[:alnum:]]", "ab%")
[1] FALSE

since there are alpha-numerics present, whereas the first option will:

> grepl("[^[:alnum:]]", "ab%")
[1] TRUE


So, use the first option.

Regards,

Marc who is heading to the coffee machine...
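
The two expressions answer different questions, which a small made-up vector makes visible: the negated class asks "is there at least one non-alphanumeric character", while negating the result asks "is there no alphanumeric character at all".

```r
x <- c("ab%", "abc", "%^", "")   # illustrative input

any_nonalnum <- grepl("[^[:alnum:]]", x)   # at least one non-alphanumeric character
no_alnum     <- !grepl("[[:alnum:]]", x)   # no alphanumeric character anywhere

data.frame(x, any_nonalnum, no_alnum)
```

They disagree on mixed strings such as "ab%" and on the empty string, which contains nothing of either kind.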



Re: [R] test if elements of a character vector contain letters

2012-08-07 Thread Liviu Andronic
On Tue, Aug 7, 2012 at 10:18 PM, Marc Schwartz marc_schwa...@me.com wrote:
> That will get you values where punctuation characters are used, but there may
> be other non-alphanumeric characters in the vector. There may be ASCII
> control codes, tabs, newlines, CR, LF, spaces, etc. which would not be found
> by using [:punct:].
>
> For example:
>
> > grepl("[[:punct:]]", " ")
> [1] FALSE
>
>
> If you want to explicitly look for non-alphanumeric characters, you would be
> better off using a negation of [:alnum:] such as:
>
[..]
>
>
> !grepl("[[:alnum:]]", x)

Good point! Thanks.
Liviu



[R] test if elements of a character vector contain letters

2012-08-06 Thread Liviu Andronic
Dear all
I'm pretty sure that I'm approaching the problem in a wrong way.
Suppose the following character vector:
> (x[1:10] <- paste(x[1:10], sample(1:10, 10), sep=''))
 [1] "a10" "b7"  "c2"  "d3"  "e6"  "f1"  "g5"  "h8"  "i9"  "j4"
> x
 [1] "a10" "b7"  "c2"  "d3"  "e6"  "f1"  "g5"  "h8"  "i9"  "j4"  "k"   "l"   "m"   "n"
[15] "o"   "p"   "q"   "r"   "s"   "t"   "u"   "v"   "w"   "x"   "y"   "z"   "1"   "2"
[29] "3"   "4"   "5"   "6"   "7"   "8"   "9"   "10"  "11"  "12"  "13"  "14"  "15"  "16"
[43] "17"  "18"  "19"  "20"  "21"  "22"  "23"  "24"  "25"  "26"


How do you test whether the elements of the vector contain at least
one letter (or at least one digit) and obtain a logical vector of the
same dimension? I came up with the following awkward function:
is_letter <- function(x, pattern=c(letters, LETTERS)){
    sapply(x, function(y){
        any(sapply(pattern, function(z) grepl(z, y, fixed=TRUE)))
    })
}

> is_letter(x)
  a10    b7    c2    d3    e6    f1    g5    h8    i9    j4     k     l     m     n     o
 TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
    p     q     r     s     t     u     v     w     x     y     z     1     2     3     4
 TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE
    5     6     7     8     9    10    11    12    13    14    15    16    17    18    19
FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
   20    21    22    23    24    25    26
FALSE FALSE FALSE FALSE FALSE FALSE FALSE
> is_letter(x, 0:9)  ##function slightly misnamed
  a10    b7    c2    d3    e6    f1    g5    h8    i9    j4     k     l     m     n     o
 TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE
    p     q     r     s     t     u     v     w     x     y     z     1     2     3     4
FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE
    5     6     7     8     9    10    11    12    13    14    15    16    17    18    19
 TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
   20    21    22    23    24    25    26
 TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE


Is there a nicer way to do this? Regards
Liviu


-- 
Do you know how to read?
http://www.alienetworks.com/srtest.cfm
http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader
Do you know how to write?
http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail



Re: [R] test if elements of a character vector contain letters

2012-08-06 Thread Bert Gunter
nzchar(x) & !is.na(x)

No?

-- Bert

On Mon, Aug 6, 2012 at 9:25 AM, Liviu Andronic landronim...@gmail.com wrote:
> Dear all
> I'm pretty sure that I'm approaching the problem in a wrong way.
> Suppose the following character vector:
> > (x[1:10] <- paste(x[1:10], sample(1:10, 10), sep=''))
>  [1] "a10" "b7"  "c2"  "d3"  "e6"  "f1"  "g5"  "h8"  "i9"  "j4"
> > x
>  [1] "a10" "b7"  "c2"  "d3"  "e6"  "f1"  "g5"  "h8"  "i9"  "j4"  "k"   "l"   "m"   "n"
> [15] "o"   "p"   "q"   "r"   "s"   "t"   "u"   "v"   "w"   "x"   "y"   "z"   "1"   "2"
> [29] "3"   "4"   "5"   "6"   "7"   "8"   "9"   "10"  "11"  "12"  "13"  "14"  "15"  "16"
> [43] "17"  "18"  "19"  "20"  "21"  "22"  "23"  "24"  "25"  "26"
>
>
> How do you test whether the elements of the vector contain at least
> one letter (or at least one digit) and obtain a logical vector of the
> same dimension? I came up with the following awkward function:
> is_letter <- function(x, pattern=c(letters, LETTERS)){
>     sapply(x, function(y){
>         any(sapply(pattern, function(z) grepl(z, y, fixed=TRUE)))
>     })
> }
>
> > is_letter(x)
>   a10    b7    c2    d3    e6    f1    g5    h8    i9    j4     k     l     m     n     o
>  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
>     p     q     r     s     t     u     v     w     x     y     z     1     2     3     4
>  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE
>     5     6     7     8     9    10    11    12    13    14    15    16    17    18    19
> FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
>    20    21    22    23    24    25    26
> FALSE FALSE FALSE FALSE FALSE FALSE FALSE
> > is_letter(x, 0:9)  ##function slightly misnamed
>   a10    b7    c2    d3    e6    f1    g5    h8    i9    j4     k     l     m     n     o
>  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE
>     p     q     r     s     t     u     v     w     x     y     z     1     2     3     4
> FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE
>     5     6     7     8     9    10    11    12    13    14    15    16    17    18    19
>  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
>    20    21    22    23    24    25    26
>  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
>
>
> Is there a nicer way to do this? Regards
> Liviu
>
>
> --
> Do you know how to read?
> http://www.alienetworks.com/srtest.cfm
> http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader
> Do you know how to write?
> http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail




-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] test if elements of a character vector contain letters

2012-08-06 Thread Rui Barradas

Hello,

Fun as an exercise in vectorization. 30 times faster. Don't look, guess.

Gave it up? Ok, here it is.


is_letter <- function(x, pattern=c(letters, LETTERS)){
    sapply(x, function(y){
        any(sapply(pattern, function(z) grepl(z, y, fixed=T)))
    })
}
# test ascii codes, just one loop.
has_letter <- function(x){
    sapply(x, function(y){
        y <- as.integer(charToRaw(y))
        any((65 <= y & y <= 90) | (97 <= y & y <= 122))
    })
}

x <- c(letters, 1:26)
x[1:10] <- paste(x[1:10], sample(1:10, 10), sep='')
x <- rep(x, 1e3)

t1 <- system.time(is_letter(x))
t2 <- system.time(has_letter(x))
rbind(t1, t2, t1/t2)
   user.self sys.self elapsed user.child sys.child
t1     15.69        0   15.74         NA        NA
t2      0.50        0    0.50         NA        NA
       31.38      NaN   31.48         NA        NA
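
The byte-code test inside has_letter() may be easier to follow on a single
element. The following is a minimal sketch, assuming plain ASCII input; the
names codes, is_ascii_letter and has_letter_v are illustrative and not part
of the original post, and vapply() is swapped in for sapply() only to
guarantee a logical result.

```r
# What has_letter() does for one element, step by step.
codes <- as.integer(charToRaw("a10"))   # one integer code per byte
print(codes)

# Codes 65..90 are "A".."Z" and 97..122 are "a".."z" in ASCII.
is_ascii_letter <- (65 <= codes & codes <= 90) | (97 <= codes & codes <= 122)
print(any(is_ascii_letter))

# Same idea over a whole vector; vapply() enforces one logical per element.
has_letter_v <- function(x) {
    vapply(x, function(y) {
        y <- as.integer(charToRaw(y))
        any((65 <= y & y <= 90) | (97 <= y & y <= 122))
    }, logical(1), USE.NAMES = FALSE)
}
print(has_letter_v(c("a10", "42", "Z")))
```

Because the comparison is on raw bytes, this sketch only recognises
unaccented letters.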




Re: [R] test if elements of a character vector contain letters

2012-08-06 Thread Martin Morgan

On 08/06/2012 09:51 AM, Rui Barradas wrote:
> Hello,
>
> Fun as an exercise in vectorization. 30 times faster. Don't look, guess.

 > system.time(res0 <- grepl("[[:alpha:]]", x))
    user  system elapsed
   0.060   0.000   0.061
 > system.time(res1 <- has_letter(x))
    user  system elapsed
   3.728   0.008   3.747
 > all.equal(res0, res1, check.attributes=FALSE)
[1] TRUE






--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793



Re: [R] test if elements of a character vector contain letters

2012-08-06 Thread Marc Schwartz
Perhaps I am missing something, but why use sapply() when grepl() is already 
vectorized?

is.letter <- function(x) grepl("[:alpha:]", x)
is.number <- function(x) grepl("[:digit:]", x)

x <- c(letters, 1:26)

x[1:10] <- paste(x[1:10], sample(1:10, 10), sep='')

x <- rep(x, 1e3)

> str(x)
 chr [1:52000] "a2" "b10" "c8" "d3" "e6" "f1" "g5" ...

> system.time(is.letter(x))
   user  system elapsed 
  0.011   0.000   0.010 

> system.time(is.number(x))
   user  system elapsed 
  0.010   0.000   0.011 


Regards,

Marc Schwartz



Re: [R] test if elements of a character vector contain letters

2012-08-06 Thread arun
Hi,

Not sure whether this is what you wanted.
x <- letters
(x[1:10] <- paste(x[1:10], sample(1:10, 10), sep=''))
x1 <- c(x, 1:26)


> x1
 [1] "a4"  "b3"  "c5"  "d2"  "e9"  "f6"  "g1"  "h8"  "i10" "j7"  "k"   "l"
[13] "m"   "n"   "o"   "p"   "q"   "r"   "s"   "t"   "u"   "v"   "w"   "x"
[25] "y"   "z"   "1"   "2"   "3"   "4"   "5"   "6"   "7"   "8"   "9"   "10"
[37] "11"  "12"  "13"  "14"  "15"  "16"  "17"  "18"  "19"  "20"  "21"  "22"
[49] "23"  "24"  "25"  "26"


> grepl("^[[:alpha:]][[:digit:]]", x1)
 [1]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE
[13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[25] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[49] FALSE FALSE FALSE FALSE

A.K.





Re: [R] test if elements of a character vector contain letters

2012-08-06 Thread Marc Schwartz

On Aug 6, 2012, at 12:06 PM, Marc Schwartz <marc_schwa...@me.com> wrote:

> Perhaps I am missing something, but why use sapply() when grepl() is already
> vectorized?
>
> is.letter <- function(x) grepl("[:alpha:]", x)
> is.number <- function(x) grepl("[:digit:]", x)

Sorry, typos in the above from my CP. Should be:

is.letter <- function(x) grepl("[[:alpha:]]", x)
is.number <- function(x) grepl("[[:digit:]]", x)

Marc
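
A small illustration of why no sapply() loop is needed: grepl() already
returns one logical value per element of its input. This is only a sketch on
a made-up five-element vector; has_alpha and has_digit are illustrative
names.

```r
x <- c("a10", "b7", "k", "1", "26")

# One call per pattern; the result has the same length as x.
has_alpha <- grepl("[[:alpha:]]", x)
has_digit <- grepl("[[:digit:]]", x)

print(has_alpha)
print(has_digit)
print(length(has_alpha) == length(x))
```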

 


Re: [R] test if elements of a character vector contain letters

2012-08-06 Thread David L Carlson
Only an extra set of brackets:

is.letter <- function(x) grepl("[[:alpha:]]", x)
is.number <- function(x) grepl("[[:digit:]]", x)

Without them, the functions are fast, but wrong.

> x
 [1] "a8"  "b5"  "c10" "d1"  "e6"  "f2"  "g4"  "h3"  "i7"  "j9"  "k"   "l"
[13] "m"   "n"   "o"   "p"   "q"   "r"   "s"   "t"   "u"   "v"   "w"   "x"
[25] "y"   "z"   "1"   "2"   "3"   "4"   "5"   "6"   "7"   "8"   "9"   "10"
[37] "11"  "12"  "13"  "14"  "15"  "16"  "17"  "18"  "19"  "20"  "21"  "22"
[49] "23"  "24"  "25"  "26"
> is.letter <- function(x) grepl("[:alpha:]", x)
> is.letter(x)
 [1]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE  TRUE
[13] FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[25] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[49] FALSE FALSE FALSE FALSE
> is.letter <- function(x) grepl("[[:alpha:]]", x)
> is.letter(x)
 [1]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
[13]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
[25]  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[49] FALSE FALSE FALSE FALSE
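
The TRUE values at positions 1, 8, 12 and 16 above are consistent with the
single-bracket pattern being read as an ordinary character set made of ':',
'a', 'l', 'p' and 'h'. A minimal sketch of that reading, assuming R's default
regex engine (not perl = TRUE); the test vector and variable names are made
up for illustration.

```r
x <- c("a8", "b5", "h3", "l", "p", "z", ":", "7")

# Single brackets: a plain set of the characters : a l p h
single <- grepl("[:alpha:]", x)

# Spelling the same set out by hand gives the same answer.
by_hand <- grepl("[:alph]", x)

# Double brackets: the POSIX class [:alpha:] inside a bracket expression.
double <- grepl("[[:alpha:]]", x)

print(rbind(single, by_hand, double))
```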

--
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352


Re: [R] test if elements of a character vector contain letters

2012-08-06 Thread Liviu Andronic
On Mon, Aug 6, 2012 at 6:42 PM, Bert Gunter <gunter.ber...@gene.com> wrote:
> nzchar(x) & !is.na(x)
>
> No?
>

It doesn't work for what I need:
> x
 [1] "a10" "b8"  "c9"  "d2"  "e3"  "f4"  "g1"  "h7"  "i6"  "j5"  "k"   "l"   "m"   "n"
[15] "o"   "p"   "q"   "r"   "s"   "t"   "u"   "v"   "w"   "x"   "y"   "z"   "1"   "2"
[29] "3"   "4"   "5"   "6"   "7"   "8"   "9"   "10"  "11"  "12"  "13"  "14"  "15"  "16"
[43] "17"  "18"  "19"  "20"  "21"  "22"  "23"  "24"  "25"  "26"
> nzchar(x) & !is.na(x)
 [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[18] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[35] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[52] TRUE


I need to have TRUE when an element contains a letter, and FALSE when
an element contains only numbers. The above returns TRUE for the
entire vector.
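
The difference between the two tests can be seen on a short vector that also
includes an empty string and a missing value. A sketch only; the vector and
the names non_empty and has_alpha are made up for illustration.

```r
x <- c("a10", "k", "26", "", NA)

# TRUE for every non-empty, non-missing string, whatever it contains.
non_empty <- nzchar(x) & !is.na(x)

# TRUE only where at least one letter is present.
has_alpha <- grepl("[[:alpha:]]", x)

print(data.frame(x, non_empty, has_alpha))
```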

Regards
Liviu





-- 
Do you know how to read?
http://www.alienetworks.com/srtest.cfm
http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader
Do you know how to write?
http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail



Re: [R] test if elements of a character vector contain letters

2012-08-06 Thread Yihui Xie
You probably mean grepl('[a-zA-Z]', x)

Regards,
Yihui
--
Yihui Xie xieyi...@gmail.com
Phone: 515-294-2465 Web: http://yihui.name
Department of Statistics, Iowa State University
2215 Snedecor Hall, Ames, IA
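
On plain ASCII input an explicit range and the POSIX class should give the
same answer, as the sketch below checks on a made-up vector. For accented
letters the two may differ, and the result can depend on the locale (R's
regex help page notes that character ranges are locale-dependent), so no
expected output is given for that part.

```r
x <- c("a10", "Z", "42", "k")

by_range <- grepl("[a-zA-Z]", x)
by_class <- grepl("[[:alpha:]]", x)
print(identical(by_range, by_class))

# An accented letter (e with acute accent); results here may vary by locale.
accented <- "\u00e9"
print(grepl("[a-zA-Z]", accented))
print(grepl("[[:alpha:]]", accented))
```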




Re: [R] test if elements of a character vector contain letters

2012-08-06 Thread Liviu Andronic
On Mon, Aug 6, 2012 at 7:35 PM, Marc Schwartz <marc_schwa...@me.com> wrote:
> is.letter <- function(x) grepl("[[:alpha:]]", x)
> is.number <- function(x) grepl("[[:digit:]]", x)
>


This does exactly what I wanted:
> x
 [1] "a10" "b8"  "c9"  "d2"  "e3"  "f4"  "g1"  "h7"  "i6"  "j5"  "k"   "l"   "m"   "n"
[15] "o"   "p"   "q"   "r"   "s"   "t"   "u"   "v"   "w"   "x"   "y"   "z"   "1"   "2"
[29] "3"   "4"   "5"   "6"   "7"   "8"   "9"   "10"  "11"  "12"  "13"  "14"  "15"  "16"
[43] "17"  "18"  "19"  "20"  "21"  "22"  "23"  "24"  "25"  "26"
> xb <- grepl("[[:alpha:]]",x)
> x[xb]  ##extract all vector elements that contain a letter
 [1] "a10" "b8"  "c9"  "d2"  "e3"  "f4"  "g1"  "h7"  "i6"  "j5"  "k"   "l"   "m"   "n"
[15] "o"   "p"   "q"   "r"   "s"   "t"   "u"   "v"   "w"   "x"   "y"   "z"
> xb <- grepl("[[:digit:]]",x)
> x[xb]  ##extract all vector elements that contain a digit
 [1] "a10" "b8"  "c9"  "d2"  "e3"  "f4"  "g1"  "h7"  "i6"  "j5"  "1"   "2"   "3"   "4"
[15] "5"   "6"   "7"   "8"   "9"   "10"  "11"  "12"  "13"  "14"  "15"  "16"  "17"  "18"
[29] "19"  "20"  "21"  "22"  "23"  "24"  "25"  "26"

Thanks all for the suggestions! Regards
Liviu



Re: [R] test parallel slopes with svyolr

2012-07-08 Thread Thomas Lumley
On Sun, Jul 8, 2012 at 2:32 AM, Diana Marcela Martinez Ruiz
<dianamm...@hotmail.com> wrote:
> Hello,
>
> I would like to know how to test the assumption of proportional odds or
> parallel lines or slopes for an ordinal logistic regression with svyolr


I wouldn't, but if someone finds a clear reference I'd be prepared to
implement it anyway.

   -thomas

-- 
Thomas Lumley
Professor of Biostatistics
University of Auckland



[R] test parallel slopes with svyolr

2012-07-07 Thread Diana Marcela Martinez Ruiz

Hello,

 I would like to know how to test the assumption of proportional odds or 
parallel lines or slopes for an ordinal logistic regression with svyolr

Thanks
  
[[alternative HTML version deleted]]



Re: [R] Test Binary File

2012-06-12 Thread Nortisiv
As an alternative to the hexview package, an external Hex-Editor may help you
investigate how the data is organised.
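
Base R can also show the raw bytes of a file without any extra package. The
following is a rough sketch under stated assumptions: the file layout (three
4-byte little-endian integers followed by two 8-byte doubles) is invented
here so the example is self-contained, and a real file would need its own
sizes and byte order.

```r
# Write a small binary file with a known, made-up layout.
tmp <- tempfile(fileext = ".bin")
con <- file(tmp, "wb")
writeBin(1:3, con, size = 4, endian = "little")
writeBin(c(1.5, 2.5), con, size = 8, endian = "little")
close(con)

# Look at the raw bytes, much as a hex editor would show them.
bytes <- readBin(tmp, what = "raw", n = file.size(tmp))
print(bytes)

# Read the same file back using the assumed layout.
con <- file(tmp, "rb")
ints <- readBin(con, what = "integer", n = 3, size = 4, endian = "little")
dbls <- readBin(con, what = "double", n = 2, size = 8, endian = "little")
close(con)
unlink(tmp)

print(ints)
print(dbls)
```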

--
View this message in context: 
http://r.789695.n4.nabble.com/Test-Binary-File-tp833690p4633075.html
Sent from the R help mailing list archive at Nabble.com.



[R] Test if a sample mean of integers with range -inf; inf is different from zero

2012-05-04 Thread Kay Cichini
Hi all,

how would you test if a sample mean of integers with range (-Inf, Inf) is
different from zero:

# my sample of integers:
c <- c(-3, -1, 0, 1, 0, 3, 4, 10, 12)

# is mean of c != 0?:
mean(c)

Thanks,
Kay

[[alternative HTML version deleted]]



Re: [R] Test if a sample mean of integers with range -inf; inf is different from zero

2012-05-04 Thread R. Michael Weylandt
mean(c) != 0

But if you mean in a statistical sense... t.test() is one possibility.
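
A minimal sketch of the t.test() route, assuming the observations are
independent and roughly normal (neither is checked here, and with nine values
it is hard to check). The vector is stored under the hypothetical name x
rather than c:

```r
# Sample from the original question, stored as x (hypothetical name)
x <- c(-3, -1, 0, 1, 0, 3, 4, 10, 12)

# Two-sided one-sample t-test of H0: the true mean equals 0
res <- t.test(x, mu = 0)

res$estimate   # sample mean
res$p.value    # p-value for H0: mean = 0
res$conf.int   # 95% confidence interval for the true mean
```

Whether the p-value is trustworthy depends on those assumptions, so treat it
as a starting point only.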

Michael

On Fri, May 4, 2012 at 5:29 AM, Kay Cichini kay.cich...@gmail.com wrote:
> Hi all,
>
> how would you test if a sample mean of integers with range -inf;inf is
> different from zero:
>
> # my sample of integers:
> c <- c(-3, -1, 0, 1, 0, 3, 4, 10, 12)
>
> # is mean of c != 0?:
> mean(c)
>
> Thanks,
> Kay



Re: [R] Test if a sample mean of integers with range -inf; inf is different from zero

2012-05-04 Thread Petr Savicky
On Fri, May 04, 2012 at 11:29:51AM +0200, Kay Cichini wrote:
> Hi all,
>
> how would you test if a sample mean of integers with range -inf;inf is
> different from zero:
>
> # my sample of integers:
> c <- c(-3, -1, 0, 1, 0, 3, 4, 10, 12)
>
> # is mean of c != 0?:
> mean(c)

Hi.

It is better to give the vector a name other than c, since c is also
the function you use to build it.

Testing whether the sample mean is zero is simple, since one can use

  mean(c) == 0

or 

  sum(c) == 0

which are equivalent even in inexact computer arithmetic.
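
As a small illustration of the two checks above, here they are applied to the
sample from the question and to a second, made-up sample that sums to zero.
The names x and z are hypothetical, chosen to avoid masking c():

```r
x <- c(-3, -1, 0, 1, 0, 3, 4, 10, 12)  # sample from the question
z <- c(-3, 1, 2)                        # made-up sample whose sum is zero

# Both checks agree for each sample
c(mean_is_zero = mean(x) == 0, sum_is_zero = sum(x) == 0)
c(mean_is_zero = mean(z) == 0, sum_is_zero = sum(z) == 0)
```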

So, I think, you are asking for a statistical test of whether the
true distribution mean is zero on the basis of a sample. Testing
this requires some additional information on the distribution.
If we do not know anything about the distribution except that the
values are integers, then the sample mean can be arbitrarily large
even if the distribution mean is zero. Consider, for example,
a uniform distribution on {-M, M} for some very large integer M.
Observing a large sample mean does not allow us to reject the null
hypothesis at any level, since a large mean may have large probability
even if the null hypothesis is true.
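
The {-M, M} example can be simulated as a rough sketch. The values of M, n
and the seed below are arbitrary choices for illustration, not part of the
original argument:

```r
set.seed(1)   # arbitrary seed, for reproducibility only
M <- 1e6      # hypothetical "very large" integer
n <- 9        # same sample size as in the question

# Draw n values uniformly from {-M, M}; the distribution mean is exactly 0
x <- sample(c(-M, M), size = n, replace = TRUE)

mean(x)            # sample mean, far from 0 on the scale of M
abs(sum(x)) >= M   # TRUE for odd n: the sum is a nonzero odd multiple of M
```

With an odd sample size the sample mean can never be zero here, although the
null hypothesis of a zero distribution mean holds by construction.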

If there is no bound on the values, then testing anything concerning
the mean may not be possible, since the expected value may not exist. Do you
have a reason to think that the true distribution has an expected value?

An example of an integer random variable without an expected value is

  s*X

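where s is uniform on {-1, 1} and X has value 2^i with probability 2^-i
for each positive integer i. Each term 2^i * 2^-i in the sum defining the
expected value of |s*X| equals 1, so that sum diverges.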
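
A rough simulation of this variable, assuming s and X are drawn independently
(the message does not say so explicitly). The number of draws and the seed
are arbitrary:

```r
set.seed(1)   # arbitrary seed, for reproducibility only
n <- 1e4      # hypothetical number of draws

# i >= 1 with P(i = k) = 2^-k: a geometric(1/2) failure count plus one
i <- rgeom(n, prob = 0.5) + 1
X <- 2^i                                          # X = 2^i with probability 2^-i
s <- sample(c(-1, 1), size = n, replace = TRUE)   # random sign, assumed independent of X
y <- s * X

# Running mean of the draws; with no expected value it need not settle down
running_mean <- cumsum(y) / seq_len(n)
range(running_mean[-(1:100)])
```

Plotting running_mean against the draw index is one way to look for the
occasional large jumps caused by rare, very large values of X.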

Hope this helps.

Petr Savicky.


