Re: [R] test logistic regression model
Agreed on the ranking of (1) vs (2).

On Sun, Nov 20, 2022 at 1:30 PM Ebert, Timothy Aaron wrote:
> I like option 1. Option 2 may cause problems if you are pooling groups
> that do not go together. This is especially a problem if you know that
> the data is missing some groups. I would consider dropping rare groups,
> or compare results between pooling and dropping options. If the answer
> is the same in both cases then use the approach that makes your life
> easier with reviewers/clients. If the answer is different then I would
> go with dropping rare categories, or present both and highlight the
> difference in outcome. A third option is to gather more data.
>
> Tim
Re: [R] test logistic regression model
I like option 1. Option 2 may cause problems if you are pooling groups
that do not go together. This is especially a problem if you know that
the data is missing some groups. I would consider dropping rare groups,
or compare results between pooling and dropping options. If the answer
is the same in both cases then use the approach that makes your life
easier with reviewers/clients. If the answer is different then I would
go with dropping rare categories, or present both and highlight the
difference in outcome. A third option is to gather more data.

Tim

-----Original Message-----
From: R-help On Behalf Of Bert Gunter
Sent: Sunday, November 20, 2022 1:06 PM
To: Mitchell Maltenfort
Cc: R-help
Subject: Re: [R] test logistic regression model

I think (2) might be a bad idea if one of the "sparse" categories has
high predictive power. You'll lose it when you pool, will you not?
Also, there is the problem of subjectively defining "sparse."

However, 1) seems quite sensible to me. But IANAE.

-- Bert
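The comparison between pooling and dropping rare levels suggested above could be sketched roughly as follows. This is only an illustration on made-up data: the data frame `dat`, the extra covariate `x`, and the cutoff `min_n = 5` for calling a level "rare" are all assumptions, not anything taken from the original poster's data.

```r
# Made-up data: levels "c" and "d" are rare (3 rows each).
dat <- data.frame(
  f = c(rep("a", 20), rep("b", 20), rep("c", 3), rep("d", 3)),
  y = c(rep(c(0, 1), 10), rep(c(0, 1, 1, 1), 5), c(1, 0, 1), c(0, 1, 0))
)
dat$x <- sin(seq_len(nrow(dat)))  # arbitrary numeric covariate

min_n  <- 5                       # hypothetical cutoff for "rare"
counts <- table(dat$f)
rare   <- names(counts)[counts < min_n]

# Option A: pool the rare levels into "other".
pooled   <- dat
pooled$f <- ifelse(dat$f %in% rare, "other", dat$f)

# Option B: drop the rows that belong to rare levels.
dropped <- dat[!(dat$f %in% rare), ]

fit_pool <- glm(y ~ f + x, data = pooled,  family = binomial)
fit_drop <- glm(y ~ f + x, data = dropped, family = binomial)

# Compare predictions for the levels that both models know about.
common <- data.frame(f = c("a", "b"), x = 0)
cmp <- cbind(
  pooled  = predict(fit_pool, newdata = common, type = "response"),
  dropped = predict(fit_drop, newdata = common, type = "response")
)
print(cmp)
```

If the two columns of `cmp` are close, the choice between pooling and dropping matters little for these levels; if they differ noticeably, that difference is what would need to be reported. The cutoff itself remains a subjective choice.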
Re: [R] test logistic regression model
I think (2) might be a bad idea if one of the "sparse" categories has
high predictive power. You'll lose it when you pool, will you not?
Also, there is the problem of subjectively defining "sparse."

However, 1) seems quite sensible to me. But IANAE.

-- Bert

On Sun, Nov 20, 2022 at 9:49 AM Mitchell Maltenfort wrote:
> Two possible fixes occur to me
>
> 1) Redo the test/training split but within levels of factor - so you
> have the same split within each level and each level accounted for in
> training and testing
>
> 2) if you have a lot of levels, and perhaps sparse representation in a
> few, consider recoding levels to pool the rare ones into an "other"
> category

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] test logistic regression model
Two possible fixes occur to me:

1) Redo the test/training split but within levels of the factor, so
you have the same split within each level and each level is accounted
for in both training and testing.

2) If you have a lot of levels, and perhaps sparse representation in a
few, consider recoding levels to pool the rare ones into an "other"
category.

On Sun, Nov 20, 2022 at 11:41 AM Bert Gunter wrote:
> small reprex:
>
> set.seed(5)
> dat <- data.frame(f = rep(c('r','g'),4), y = runif(8))
> newdat <- data.frame(f = rep(c('r','g','b'),2))
> ## convert values in newdat not seen in dat to NA
> is.na(newdat$f) <- !(newdat$f %in% dat$f)
> lmfit <- lm(y ~ f, data = dat)
>
> ## Result:
> > predict(lmfit, newdat)
>         1         2         3         4         5         6
> 0.4374251 0.6196527        NA 0.4374251 0.6196527        NA
>
> If this does not suffice, as Rui said, we need details of what you did.
> (predict.glm works like predict.lm)
>
> -- Bert
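Fix (1), splitting within levels of the factor, might look something like the sketch below. The data frame `dat`, the factor `f`, and the 70% training fraction are made up for illustration; only base R functions are used.

```r
set.seed(42)

# Made-up data with an unbalanced factor.
dat <- data.frame(
  f = factor(rep(c("a", "b", "c"), times = c(20, 10, 6))),
  y = rbinom(36, size = 1, prob = 0.5)
)

train_frac <- 0.7  # hypothetical training fraction

# Split the row indices by level, then sample within each level.
idx_by_level <- split(seq_len(nrow(dat)), dat$f)
train_idx <- unlist(lapply(idx_by_level, function(idx) {
  n_train <- max(1, floor(train_frac * length(idx)))
  idx[sample.int(length(idx), n_train)]
}), use.names = FALSE)

train <- dat[train_idx, ]
test  <- dat[-train_idx, ]

# Levels that occur in test but not in train (should be none).
unseen <- setdiff(unique(as.character(test$f)),
                  unique(as.character(train$f)))
print(unseen)
```

Sampling positions with `sample.int()` rather than calling `sample(idx, ...)` avoids the surprise that `sample()` treats a single number as `1:n`. A level with only one row ends up in training only, so it still cannot be evaluated in the test set.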
Re: [R] test logistic regression model
small reprex:

set.seed(5)
dat <- data.frame(f = rep(c('r','g'),4), y = runif(8))
newdat <- data.frame(f = rep(c('r','g','b'),2))
## convert values in newdat not seen in dat to NA
is.na(newdat$f) <- !(newdat$f %in% dat$f)
lmfit <- lm(y ~ f, data = dat)

## Result:
> predict(lmfit, newdat)
        1         2         3         4         5         6
0.4374251 0.6196527        NA 0.4374251 0.6196527        NA

If this does not suffice, as Rui said, we need details of what you did.
(predict.glm works like predict.lm)

-- Bert

On Sun, Nov 20, 2022 at 7:46 AM Rui Barradas wrote:
> Hello,
>
> What exactly didn't work? You say you have tried the solutions found in
> stackoverflow but without a link, we don't know which answers to which
> questions you are talking about.
> Like Bert said, if you assign NA to the new levels, present only in
> test, it should work.
>
> Can you post links to what you have tried?
>
> Hope this helps,
>
> Rui Barradas
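Since the original question was about glm() rather than lm(), here is a hedged analogue of the reprex above for a logistic model. The training data are made up (a binary `y` and a two-level `f`); the NA conversion step is the same as in the reprex.

```r
# Made-up training data with two levels and a binary response.
train <- data.frame(
  f = rep(c("r", "g"), each = 6),
  y = c(0, 1, 0, 1, 1, 0,   1, 1, 0, 1, 1, 1)
)
test <- data.frame(f = rep(c("r", "g", "b"), 2))

fit <- glm(y ~ f, data = train, family = binomial)

# Convert values of f not seen in training to NA, as in the reprex.
is.na(test$f) <- !(test$f %in% train$f)

# Rows with an NA predictor get an NA prediction instead of an error.
p <- predict(fit, newdata = test, type = "response")
print(p)
```

Rows 3 and 6, which held the unseen level 'b', come back as NA, while the other rows get fitted probabilities.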
Re: [R] test logistic regression model
At 15:29 on 20/11/2022, Gábor Malomsoki wrote:
> Dear Bert,
>
> Yes, I was trying to fill the non-existent categories with NAs, but
> the suggested solutions on stackoverflow.com unfortunately did not work.
>
> Best regards
> Gabor

Hello,

What exactly didn't work? You say you have tried the solutions found in
stackoverflow but without a link, we don't know which answers to which
questions you are talking about.
Like Bert said, if you assign NA to the new levels, present only in
test, it should work.

Can you post links to what you have tried?

Hope this helps,

Rui Barradas
Re: [R] test logistic regression model
Dear Bert,

Yes, I was trying to fill the non-existent categories with NAs, but
the suggested solutions on stackoverflow.com unfortunately did not work.

Best regards
Gabor

Bert Gunter wrote on Sun, Nov 20, 2022, 16:20:
> You can't predict results for categories that you've not seen before
> (think about it). You will need to remove those cases from your test
> set (or convert them to NA and predict them as NA).
>
> -- Bert
Re: [R] test logistic regression model
You can't predict results for categories that you've not seen before
(think about it). You will need to remove those cases from your test
set (or convert them to NA and predict them as NA).

-- Bert

On Sun, Nov 20, 2022 at 7:02 AM Gábor Malomsoki wrote:
> Dear all,
>
> I have created a logistic regression model on the train df:
>
> mymodel1 <- glm(book_state ~ TG_KraftF5, data = train, family = "binomial")
>
> Then I try to predict with the test df:
>
> Predict <- predict(mymodel1, newdata = test, type = "response")
>
> Then I get this error message:
>
> Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = object$xlevels)
> Factor "TG_KraftF5" has new levels
>
> I have tried different proposals from stackoverflow, but unfortunately
> they did not solve the problem.
> Do you have any idea how to test a logistic regression model when you
> have different levels in the train and test df?
>
> Thank you in advance
> Regards,
> Gabor
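The first suggestion, removing the affected cases from the test set, could be sketched like this. The variable names follow the original post, but the data and the level values "a", "b" and "c" are invented for illustration. The levels seen at fitting time are read from the `xlevels` component that glm() stores in the fitted object.

```r
# Made-up data: level "c" occurs only in the test set.
train <- data.frame(
  TG_KraftF5 = rep(c("a", "b"), each = 10),
  book_state = c(rep(c(0, 1), 5), c(0, 0, 1, 1, 1, 1, 0, 1, 1, 1))
)
test <- data.frame(TG_KraftF5 = c("a", "b", "c", "a"))

mymodel1 <- glm(book_state ~ TG_KraftF5, data = train, family = "binomial")

# Levels the fitted model knows about.
known <- mymodel1$xlevels$TG_KraftF5

# Keep only the test rows whose level was seen during fitting.
test_known <- test[test$TG_KraftF5 %in% known, , drop = FALSE]

Predict <- predict(mymodel1, newdata = test_known, type = "response")
print(Predict)
```

The dropped rows are not evaluated at all, so it is worth reporting how many test cases were excluded this way.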
[R] test logistic regression model
Dear all,

I have created a logistic regression model on the train df:

mymodel1 <- glm(book_state ~ TG_KraftF5, data = train, family = "binomial")

Then I try to predict with the test df:

Predict <- predict(mymodel1, newdata = test, type = "response")

Then I get this error message:

Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = object$xlevels)
Factor "TG_KraftF5" has new levels

I have tried different proposals from stackoverflow, but unfortunately
they did not solve the problem.
Do you have any idea how to test a logistic regression model when you
have different levels in the train and test df?

Thank you in advance
Regards,
Gabor
Re: [R] Embedded R: Test if initialized
I believe this is the wrong list for this post. See the posting guide,
linked below, for one that is more appropriate.

Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)

On Wed, Jun 16, 2021 at 12:51 PM Matthias Gondan wrote:
> Dear R friends,
>
> I am currently trying to write a piece of C code that uses "embedded R",
> and for specific reasons*, I cannot keep track of whether R has already
> been initialized. So the code snippet looks like this:
>
> LibExtern char *R_TempDir;
>
> if(R_TempDir == NULL)
>     …throw exception R not initialized…
>
> I have seen that the source code of Rf_initialize_R itself checks if it
> is invoked twice (num_initialized), but this latter flag does not seem
> to be accessible, or is it?
>
> int Rf_initialize_R(int ac, char **av)
> {
>     int i, ioff = 1, j;
>     Rboolean useX11 = TRUE, useTk = FALSE;
>     char *p, msg[1024], cmdlines[1], **avv;
>     structRstart rstart;
>     Rstart Rp = &rstart;
>     Rboolean force_interactive = FALSE;
>
>     if (num_initialized++) {
>         fprintf(stderr, "%s", "R is already initialized\n");
>         exit(1);
>     }
>
> Is the test of the TempDir a good substitute, or should I choose another
> solution? Having said this, it may be a good idea to expose a function
> Rf_R_initialized that performs such a test.
>
> Thank you for your consideration.
>
> Best regards,
>
> Matthias
>
> *The use case is an R library that connects to swi-prolog and allows the
> "embedded" swi-prolog to establish the reverse connection to R. In that
> case, i.e., R -> Prolog -> R, I do not want to initialize R a second time.

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
[R] Embedded R: Test if initialized
Dear R friends,

I am currently trying to write a piece of C code that uses "embedded R",
and for specific reasons*, I cannot keep track of whether R has already
been initialized. So the code snippet looks like this:

LibExtern char *R_TempDir;

if(R_TempDir == NULL)
    …throw exception R not initialized…

I have seen that the source code of Rf_initialize_R itself checks if it
is invoked twice (num_initialized), but this latter flag does not seem
to be accessible, or is it?

int Rf_initialize_R(int ac, char **av)
{
    int i, ioff = 1, j;
    Rboolean useX11 = TRUE, useTk = FALSE;
    char *p, msg[1024], cmdlines[1], **avv;
    structRstart rstart;
    Rstart Rp = &rstart;
    Rboolean force_interactive = FALSE;

    if (num_initialized++) {
        fprintf(stderr, "%s", "R is already initialized\n");
        exit(1);
    }

Is the test of the TempDir a good substitute, or should I choose another
solution? Having said this, it may be a good idea to expose a function
Rf_R_initialized that performs such a test.

Thank you for your consideration.

Best regards,

Matthias

*The use case is an R library that connects to swi-prolog and allows the
"embedded" swi-prolog to establish the reverse connection to R. In that
case, i.e., R -> Prolog -> R, I do not want to initialize R a second time.
Re: [R] test if something was plotted on pdf device
Dear Duncan

Thank you for the code, I will test it or at least check what it does.

I finally found a probably easier solution. I stay with my original code

if (dev.cur()==1) plot(ecdf(velik[,"ecd"]), main = ufil[j], col=i) else plot(ecdf(velik[,"ecd"]), add=TRUE, col=i)

After the plot is finished and the cycle ends, I copy the result to a
pdf device

dev.copy(pdf, paste(gsub(".xls", "", ufil)[j], ".pdf", sep=""))
dev.off()

Using this approach I could stay with my original code (almost), check
with dev.cur() whether the plot was initialised, and save it to pdf
after it is finished.

The only obstacle is that my code flashes during plotting to the basic
device; however, I can live with it.

Thank you again and best regards
Petr

> -----Original Message-----
> From: Duncan Murdoch
> Sent: Thursday, September 12, 2019 2:29 PM
> To: PIKAL Petr; r-help mailing list
> Subject: Re: [R] test if something was plotted on pdf device
>
> I don't know if this is reliable or not, but you could use code like this:
>
>    f <- tempfile()
>    pdf(f)
>    blankPlot <- recordPlot()
>    dev.off()
>    unlink(f)
>
>    pdf("test.pdf")
>
>    ... unknown operations ...
>
>    if (dev.cur() == 1 || identical(recordPlot(), blankPlot)) {
>      plot(ecdf(rnorm(100)))
>    } else {
>      plot(ecdf(rnorm(100)), add=TRUE, col=i)
>    }
>
> Duncan Murdoch
Re: [R] test if something was plotted on pdf device
On 12/09/2019 7:10 a.m., PIKAL Petr wrote:
> Dear all
>
> Is there any simple way checking whether after calling pdf device
> something was plotted into it?
>
> In interactive session I used
>
> if (dev.cur()==1) plot(ecdf(rnorm(100))) else plot(ecdf(rnorm(100)),
> add=T, col=i)
>
> which enabled me to test if plot is open
>
> But when I want to call eg. pdf("test.pdf") before cycle
> dev.cur()==1 is FALSE even when no plot is drawn and plot.new error
> comes.
>
> > pdf("test.pdf")
>
> if (dev.cur()==1) plot(ecdf(rnorm(100))) else plot(ecdf(rnorm(100)),
> add=T, col=i)
>
> Error in segments(ti.l, y, ti.r, y, col = col.hor, lty = lty, lwd = lwd, :
>   plot.new has not been called yet

I don't know if this is reliable or not, but you could use code like this:

    f <- tempfile()
    pdf(f)
    blankPlot <- recordPlot()
    dev.off()
    unlink(f)

    pdf("test.pdf")

    ... unknown operations ...

    if (dev.cur() == 1 || identical(recordPlot(), blankPlot))
      plot(ecdf(rnorm(100)))
    else
      plot(ecdf(rnorm(100)), add=TRUE, col=i)

Duncan Murdoch
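When the plotting loop is under your own control, a possibly simpler alternative to comparing recorded plots is to carry an explicit flag through the loop. The sketch below is not from the thread: the `samples` list, the `first_plot` flag, and the temporary output file are made-up stand-ins for the real data and for "test.pdf".

```r
# Hypothetical data: three samples whose ECDFs are overlaid on one page
set.seed(1)
samples <- list(rnorm(100), rnorm(100, mean = 1), rnorm(100, mean = 2))

out_file <- tempfile(fileext = ".pdf")  # stand-in for "test.pdf"
pdf(out_file)

first_plot <- TRUE  # nothing has been drawn on this device yet
for (i in seq_along(samples)) {
  if (first_plot) {
    # The first call starts the page (this is what calls plot.new)
    plot(ecdf(samples[[i]]), main = "ECDFs", col = i)
    first_plot <- FALSE
  } else {
    # Later calls draw on top of the existing plot
    plot(ecdf(samples[[i]]), add = TRUE, col = i)
  }
}
dev.off()
```

This avoids asking the device what it contains, at the cost of only working when every plotting call goes through code that updates the flag.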
[R] test if something was plotted on pdf device
Dear all

Is there any simple way checking whether after calling pdf device something was plotted into it?

In interactive session I used

    if (dev.cur()==1) plot(ecdf(rnorm(100))) else plot(ecdf(rnorm(100)), add=T, col=i)

which enabled me to test if plot is open

But when I want to call eg. pdf("test.pdf") before cycle dev.cur()==1 is FALSE even when no plot is drawn and plot.new error comes.

    > pdf("test.pdf")
    if (dev.cur()==1) plot(ecdf(rnorm(100))) else plot(ecdf(rnorm(100)), add=T, col=i)
    Error in segments(ti.l, y, ti.r, y, col = col.hor, lty = lty, lwd = lwd, :
      plot.new has not been called yet

Best regards
Petr
Re: [R] test of independence
The basic test of independence for a table based on the Chi-squared distribution can be done using the `chisq.test` function. This is in the stats package which is installed and loaded by default, so you don't need to do anything additional.

There is also the `fisher.test` function for Fisher's exact test (similar hypotheses, different methodology and assumptions, may be really slow on your table).

If you need more than the basics provided in those functions, then a search of CRAN may be helpful, or give us more detail to be able to help.

On Thu, Dec 20, 2018 at 12:08 AM km wrote:
>
> Dear All,
>
> How do I do a test of independence with 16x16 table of counts.
> Please suggest.
>
> Regards,
> KM

--
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com
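As a rough illustration of the `chisq.test` route for a table of this size, here is a minimal sketch. The 16 x 16 table is simulated (the poster's counts were never shown), so `counts` and its Poisson cells are assumptions made only for the example.

```r
set.seed(1)

# Hypothetical 16 x 16 table of counts standing in for the real data
counts <- matrix(rpois(16 * 16, lambda = 20), nrow = 16)

# Pearson chi-squared test of independence of rows and columns
res <- chisq.test(counts)
res$statistic
res$parameter      # degrees of freedom: (16 - 1) * (16 - 1) = 225
res$p.value

# The chi-squared approximation is doubtful when expected counts are small
min(res$expected)

# Monte Carlo p-value, which does not rely on the large-sample approximation
res_mc <- chisq.test(counts, simulate.p.value = TRUE, B = 2000)
res_mc$p.value
```

If many expected counts are small, the simulated p-value is usually the safer of the two; `fisher.test` accepts `simulate.p.value = TRUE` as well, which may help with the speed concern mentioned above.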
Re: [R] test of independence
Hi

Did you search CRAN? I got **many** results for test of independence which may or may not provide you with suitable procedures.

Cheers
Petr

> -----Original Message-----
> From: R-help On Behalf Of km
> Sent: Thursday, December 20, 2018 8:07 AM
> To: r-help@r-project.org
> Subject: [R] test of independence
>
> Dear All,
>
> How do I do a test of independence with 16x16 table of counts.
> Please suggest.
>
> Regards,
> KM
[R] test of independence
Dear All,

How do I do a test of independence with a 16x16 table of counts?
Please suggest.

Regards,
KM
[R] TEST message
Apologies for disturbance! Just checking that I can get through to r-help.

Ted.
Re: [R] Test if data uniformly distributed (newbie)
Dear Mr. Savicky,

I am currently working on a project where I want to test a random number generator, which is supposed to create 10,000 continuously uniformly distributed random numbers between 0 and 1. I am now wondering if I can use the chi-squared test to solve this problem or if the Kolmogorov-Smirnov test would be a better fit. I came across one of your threads on the internet where you answer a similar question and thought I'd reach out to you.

Thanks in advance
Florian Huber
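For reference, both tests mentioned in the question are available in base R. The sketch below is only an illustration: `runif()` stands in for the generator under test, and the choice of 20 equal-width bins for the chi-squared test is an arbitrary assumption, not a recommendation from the thread.

```r
set.seed(42)

# Stand-in for the output of the generator under test
x <- runif(10000)

# Kolmogorov-Smirnov test against the continuous uniform distribution on [0, 1]
ks <- ks.test(x, "punif")
ks$p.value

# Chi-squared goodness-of-fit test: bin the values, then compare the bin
# counts with equal expected frequencies (the default for a one-way table)
bins <- cut(x, breaks = seq(0, 1, by = 0.05), include.lowest = TRUE)
chi <- chisq.test(table(bins))
chi$parameter   # degrees of freedom: 20 bins - 1 = 19
chi$p.value
```

The Kolmogorov-Smirnov test works on the unbinned values, whereas the chi-squared result depends on how the bins are chosen. Neither test looks at serial dependence between successive numbers, which also matters when judging a generator.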
[R] Test set and Train set in Caret package train function
Hey all,

Does anyone know how we can get the train set and test set for each fold of 5-fold cross-validation in the Caret package? Imagine I want to do cross-validation with the random forest method; I do the following in Caret:

    set.seed(12)
    train_control <- trainControl(method="cv", number=5,savePredictions = TRUE)
    rfmodel <- train(Species~., data=iris, trControl=train_control, method="rf")
    first_holdout <- subset(rfmodel$pred, Resample == "Fold1")
    str(first_holdout)

    'data.frame': 90 obs. of 5 variables:
     $ pred    : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1
     $ obs     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1
     $ rowIndex: int 2 3 9 11 25 29 35 36 41 50 ...
     $ mtry    : num 2 2 2 2 2 2 2 2 2 2 ...
     $ Resample: chr "Fold1" "Fold1" "Fold1" "Fold1" ...

Are these 90 observations in Fold1 used as the training set? If yes, then where is the test set for this fold?

Thanks for any help!
Elahe
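One way to look at the split, sketched below, is to read the resampling indices that caret stores on the fitted object. This is not an answer given on the list; it assumes the caret and randomForest packages are installed and that `rfmodel$control$index` holds the training rows for each fold. The 90 rows in the question are most likely held-out predictions, not training rows: 30 held-out observations repeated for each of the 3 candidate `mtry` values tried during tuning.

```r
library(caret)

set.seed(12)
train_control <- trainControl(method = "cv", number = 5, savePredictions = TRUE)
rfmodel <- train(Species ~ ., data = iris, trControl = train_control, method = "rf")

# Rows used to fit the model in the first fold
fold1_train_rows <- rfmodel$control$index[[1]]

# Rows held out in the first fold: everything not used for fitting
fold1_test_rows <- setdiff(seq_len(nrow(iris)), fold1_train_rows)

fold1_train <- iris[fold1_train_rows, ]
fold1_test  <- iris[fold1_test_rows, ]

# Held-out predictions for Fold1, keeping only the finally selected mtry
fold1_pred <- subset(rfmodel$pred,
                     Resample == "Fold1" & mtry == rfmodel$bestTune$mtry)

nrow(fold1_train)
nrow(fold1_test)
nrow(fold1_pred)
```

The `rowIndex` column of `rfmodel$pred` refers to rows of the original data, so it can be compared directly with `fold1_test_rows`.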
Re: [R] test for proportion or concordance
This list is about R programming, not statistics, although admittedly there is a nonempty intersection. However, I think you would do better posting this on a statistics list like stats.stackexchange.com.

-- Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Thu, Aug 3, 2017 at 7:19 AM, Adrian Johnson wrote:
> Hello group,
>
> my question is deciding what test would be appropriate for following question.
>
> An experiment 'A' yielded 3200 observations of which 431 are
> significant. Similarly, using same method, another experiment 'B' on a
> different population yielded 2541 observations of which 260 are
> significant.
>
> There are 180 observations that are common between significant
> observations of A and B.
> (180 are common between 431 and 260).
>
> 80 observations are specific to A
> 251 observations are specific to B.
>
> The question are the 180 observations that are common between A and B
> - are these 180 common observations occurring by chance?
>
> What test would be appropriate for this scenario. (if my total
> observations are fixed between two experiments A and B, I could use
> Cohens kappa for concordance or Chi-square etc.
> Since the total observations differ between experiments A and B, I
> dont know what test would be appropriate. I appreciate your help.
>
> thanks
[R] test for proportion or concordance
Hello group,

my question is about deciding which test would be appropriate for the following problem.

An experiment 'A' yielded 3200 observations, of which 431 are significant. Similarly, using the same method, another experiment 'B' on a different population yielded 2541 observations, of which 260 are significant.

There are 180 observations that are common between the significant observations of A and B (180 are common between 431 and 260).

80 observations are specific to A
251 observations are specific to B.

The question is: are the 180 observations that are common between A and B occurring by chance?

What test would be appropriate for this scenario? (If my total observations were fixed between the two experiments A and B, I could use Cohen's kappa for concordance, or Chi-square etc.) Since the total observations differ between experiments A and B, I don't know what test would be appropriate. I appreciate your help.

thanks
Re: [R] Test individual slope for each factor level in ANCOVA
Hi John.

Thanks much for your help. It is great to know this.

Hanna

2017-03-16 8:02 GMT-04:00 Fox, John:
> Dear Hanna,
>
> You can test the slope in each non-reference group as a linear hypothesis.
> You didn’t make the data available for your example, so here’s an example
> using the linearHypothesis() function in the car package with the Moore
> data set in the same package:
>
> - - - snip - - -
>
> > library(car)
> > mod <- lm(conformity ~ fscore*partner.status, data=Moore)
> > summary(mod)
>
> Call:
> lm(formula = conformity ~ fscore * partner.status, data = Moore)
>
> Residuals:
>     Min      1Q  Median      3Q     Max
> -7.5296 -2.5984 -0.4473  2.0994 12.4704
>
> Coefficients:
>                           Estimate Std. Error t value Pr(>|t|)
> (Intercept)               20.79348    3.26273   6.373 1.27e-07 ***
> fscore                    -0.15110    0.07171  -2.107  0.04127 *
> partner.statuslow        -15.53408    4.40045  -3.530  0.00104 **
> fscore:partner.statuslow   0.26110    0.09700   2.692  0.01024 *
> ---
> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Residual standard error: 4.562 on 41 degrees of freedom
> Multiple R-squared: 0.2942, Adjusted R-squared: 0.2426
> F-statistic: 5.698 on 3 and 41 DF, p-value: 0.002347
>
> > linearHypothesis(mod, "fscore + fscore:partner.statuslow")
> Linear hypothesis test
>
> Hypothesis:
> fscore + fscore:partner.statuslow = 0
>
> Model 1: restricted model
> Model 2: conformity ~ fscore * partner.status
>
>   Res.Df    RSS Df Sum of Sq      F  Pr(>F)
> 1     42 912.45
> 2     41 853.42  1    59.037 2.8363 0.09976 .
> ---
> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> - - - snip - - -
>
> In this case, there are just two levels for partner.status, but for a
> multi-level factor you can simply perform more than one test.
>
> I hope this helps,
> John
>
> -----
> John Fox, Professor
> McMaster University
> Hamilton, Ontario, Canada
> Web: http://socserv.mcmaster.ca/jfox/
>
> On 2017-03-15, 9:43 PM, "R-help on behalf of li li" wrote:
>
> > Hi all,
> > Consider the data set where there are a continuous response variable, a
> > continuous predictor "weeks" and a categorical variable "region" with five
> > levels "a", "b", "c", "d", "e".
> > I fit the ANCOVA model as follows. Here the reference level is region "a"
> > and there are 4 dummy variables. The interaction terms (in red below)
> > represent the slope difference between each region and the baseline
> > region "a" and the corresponding p-value is for testing whether this
> > slope difference is zero.
> > Is there a way to directly test whether the slope corresponding to each
> > individual factor level is 0 or not, instead of testing the slope
> > difference from the baseline level?
> > Thanks very much.
> > Hanna
> >
> > > mod <- lm(response ~ weeks*region,data)
> > > summary(mod)
> > Call:
> > lm(formula = response ~ weeks * region, data = data)
> >
> > Residuals:
> >      Min       1Q   Median       3Q      Max
> > -0.19228 -0.07433 -0.01283  0.04439  0.24544
> >
> > Coefficients:
> >                 Estimate Std. Error t value Pr(>|t|)
> > (Intercept)    1.2105556  0.0954567  12.682  1.2e-14 ***
> > weeks         -0.021      0.0147293  -1.448    0.156
> > regionb       -0.0257778  0.1349962  -0.191    0.850
> > regionc       -0.034      0.1349962  -0.255    0.800
> > regiond       -0.075      0.1349962  -0.559    0.580
> > regione       -0.148      0.1349962  -1.098    0.280
> > weeks:regionb -0.0007222  0.0208304  -0.035    0.973
> > weeks:regionc -0.0017778  0.0208304  -0.085    0.932
> > weeks:regiond  0.003      0.0208304   0.144    0.886
> > weeks:regione  0.0301667  0.0208304   1.448    0.156
> > ---
> > Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> >
> > Residual standard error: 0.1082 on 35 degrees of freedom
> > Multiple R-squared: 0.2678, Adjusted R-squared: 0.07946
> > F-statistic: 1.422 on 9 and 35 DF, p-value: 0.2165
Re: [R] Test individual slope for each factor level in ANCOVA
Dear Hanna,

You can test the slope in each non-reference group as a linear hypothesis. You didn’t make the data available for your example, so here’s an example using the linearHypothesis() function in the car package with the Moore data set in the same package:

- - - snip - - -

> library(car)
> mod <- lm(conformity ~ fscore*partner.status, data=Moore)
> summary(mod)

Call:
lm(formula = conformity ~ fscore * partner.status, data = Moore)

Residuals:
    Min      1Q  Median      3Q     Max
-7.5296 -2.5984 -0.4473  2.0994 12.4704

Coefficients:
                          Estimate Std. Error t value Pr(>|t|)
(Intercept)               20.79348    3.26273   6.373 1.27e-07 ***
fscore                    -0.15110    0.07171  -2.107  0.04127 *
partner.statuslow        -15.53408    4.40045  -3.530  0.00104 **
fscore:partner.statuslow   0.26110    0.09700   2.692  0.01024 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 4.562 on 41 degrees of freedom
Multiple R-squared: 0.2942, Adjusted R-squared: 0.2426
F-statistic: 5.698 on 3 and 41 DF, p-value: 0.002347

> linearHypothesis(mod, "fscore + fscore:partner.statuslow")
Linear hypothesis test

Hypothesis:
fscore + fscore:partner.statuslow = 0

Model 1: restricted model
Model 2: conformity ~ fscore * partner.status

  Res.Df    RSS Df Sum of Sq      F  Pr(>F)
1     42 912.45
2     41 853.42  1    59.037 2.8363 0.09976 .
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

- - - snip - - -

In this case, there are just two levels for partner.status, but for a multi-level factor you can simply perform more than one test.

I hope this helps,
John

-----
John Fox, Professor
McMaster University
Hamilton, Ontario, Canada
Web: http://socserv.mcmaster.ca/jfox/

On 2017-03-15, 9:43 PM, "R-help on behalf of li li" wrote:

> Hi all,
> Consider the data set where there are a continuous response variable, a
> continuous predictor "weeks" and a categorical variable "region" with five
> levels "a", "b", "c", "d", "e".
> I fit the ANCOVA model as follows. Here the reference level is region "a"
> and there are 4 dummy variables. The interaction terms (in red below)
> represent the slope difference between each region and the baseline
> region "a" and the corresponding p-value is for testing whether this
> slope difference is zero.
> Is there a way to directly test whether the slope corresponding to each
> individual factor level is 0 or not, instead of testing the slope
> difference from the baseline level?
> Thanks very much.
> Hanna
>
> > mod <- lm(response ~ weeks*region,data)
> > summary(mod)
> Call:
> lm(formula = response ~ weeks * region, data = data)
>
> Residuals:
>      Min       1Q   Median       3Q      Max
> -0.19228 -0.07433 -0.01283  0.04439  0.24544
>
> Coefficients:
>                 Estimate Std. Error t value Pr(>|t|)
> (Intercept)    1.2105556  0.0954567  12.682  1.2e-14 ***
> weeks         -0.021      0.0147293  -1.448    0.156
> regionb       -0.0257778  0.1349962  -0.191    0.850
> regionc       -0.034      0.1349962  -0.255    0.800
> regiond       -0.075      0.1349962  -0.559    0.580
> regione       -0.148      0.1349962  -1.098    0.280
> weeks:regionb -0.0007222  0.0208304  -0.035    0.973
> weeks:regionc -0.0017778  0.0208304  -0.085    0.932
> weeks:regiond  0.003      0.0208304   0.144    0.886
> weeks:regione  0.0301667  0.0208304   1.448    0.156
> ---
> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Residual standard error: 0.1082 on 35 degrees of freedom
> Multiple R-squared: 0.2678, Adjusted R-squared: 0.07946
> F-statistic: 1.422 on 9 and 35 DF, p-value: 0.2165
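For a five-level factor like "region", an alternative to running several linear hypothesis tests is to refit the same model in a parametrization that reports one slope per level. This is a sketch of that idea, not something proposed in the thread; the data are simulated to mimic the layout of the question (5 regions, 9 values of weeks, 45 rows), and all numbers in it are made up.

```r
set.seed(1)

# Hypothetical data with the same layout as in the question
dat <- expand.grid(weeks = 0:8, region = letters[1:5])
dat$region <- factor(dat$region)
dat$response <- 1.2 - 0.02 * dat$weeks + rnorm(nrow(dat), sd = 0.1)

# Usual ANCOVA parametrization: slope for region "a" plus slope differences
mod <- lm(response ~ weeks * region, data = dat)

# Same model, reparametrized so that each region gets its own slope
mod_nested <- lm(response ~ region + region:weeks, data = dat)

# One row per region; the t test in each row is for "slope in this region = 0"
slope_rows <- paste0("region", levels(dat$region), ":weeks")
slopes <- coef(summary(mod_nested))[slope_rows, ]
slopes

# The slope for region "b" equals the baseline slope plus the interaction term
coef(mod)["weeks"] + coef(mod)["weeks:regionb"]
```

Both fits describe the same model (same fitted values and residual sum of squares), so the per-region t tests from `mod_nested` should agree with testing "weeks + weeks:regionb = 0" and so on in the original parametrization.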
[R] Test individual slope for each factor level in ANCOVA
Hi all,

Consider the data set where there are a continuous response variable, a continuous predictor "weeks" and a categorical variable "region" with five levels "a", "b", "c", "d", "e".

I fit the ANCOVA model as follows. Here the reference level is region "a" and there are 4 dummy variables. The interaction terms (in red below) represent the slope difference between each region and the baseline region "a" and the corresponding p-value is for testing whether this slope difference is zero.

Is there a way to directly test whether the slope corresponding to each individual factor level is 0 or not, instead of testing the slope difference from the baseline level?

Thanks very much.
Hanna

> mod <- lm(response ~ weeks*region,data)
> summary(mod)
Call:
lm(formula = response ~ weeks * region, data = data)

Residuals:
     Min       1Q   Median       3Q      Max
-0.19228 -0.07433 -0.01283  0.04439  0.24544

Coefficients:
                Estimate Std. Error t value Pr(>|t|)
(Intercept)    1.2105556  0.0954567  12.682  1.2e-14 ***
weeks         -0.021      0.0147293  -1.448    0.156
regionb       -0.0257778  0.1349962  -0.191    0.850
regionc       -0.034      0.1349962  -0.255    0.800
regiond       -0.075      0.1349962  -0.559    0.580
regione       -0.148      0.1349962  -1.098    0.280
weeks:regionb -0.0007222  0.0208304  -0.035    0.973
weeks:regionc -0.0017778  0.0208304  -0.085    0.932
weeks:regiond  0.003      0.0208304   0.144    0.886
weeks:regione  0.0301667  0.0208304   1.448    0.156
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.1082 on 35 degrees of freedom
Multiple R-squared: 0.2678, Adjusted R-squared: 0.07946
F-statistic: 1.422 on 9 and 35 DF, p-value: 0.2165
[R] Test
-- Robert J. Piliero Cell: (617) 283 1020 38 Linnaean St. #6 Cambridge, MA, 02138 USA [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Test for Homoscedesticity in R Without BP Test
I have tried and got the result. Thank you, everyone.

On Tue, Apr 5, 2016 at 12:58 AM, Achim Zeileis <achim.zeil...@uibk.ac.at> wrote:
> On Mon, 4 Apr 2016, varin sacha via R-help wrote:
>
>> Hi Deepak,
>>
>> In econometrics there is another test very often used : the white test.
>> The white test is based on the comparison of the estimated variances of
>> residuals when the model is estimated by OLS under the assumption of
>> homoscedasticity and when the model is estimated by OLS under the
>> assumption of heteroscedastic.
>
> The White test is a special case of the Breusch-Pagan test using a
> particular specification of the auxiliary regressors: namely all
> regressors, their squares and their cross-products. As this specification
> makes only sense if all regressors are continuous, many implementations
> have problems if there are already dummy variables, interactions, etc. in
> the regressor matrix. This is also the reason why bptest() from "lmtest"
> uses a different specification by default. However, you can utilize the
> function to carry out the White test as illustrated in:
>
> example("CigarettesB", package = "AER")
>
> (Of course, the AER package needs to be installed first.)
>
>> The White test with R
>>
>> install.packages("bstats")
>> library(bstats)
>> white.test(LinearModel)
>
> That package is no longer on CRAN as it took the code from bptest()
> without crediting its original authors and released it in a package that
> conflicted with the original license. Also, the implementation did not
> check for potential problems with dummy variables or interactions mentioned
> above.
>
> So the bptest() implementation from "lmtest" is really recommend. Or
> alternatively ncvTest() from package "car".
>
>> Hope this helps.
>>
>> Sacha
>>
>> From: Deepak Singh <sdeepakrh...@gmail.com>
>> To: r-help@r-project.org
>> Sent: Monday, 4 April 2016, 10:40
>> Subject: [R] Test for Homoscedesticity in R Without BP Test
>>
>> Respected Sir,
>> I am doing a project on multiple linear model fitting and in that project I
>> have to test Homoscedesticity of errors I have google for the same and
>> found bptest for the same but in R version 3.2.4 bp test is not available.
>> So please suggest me a test on homoscedesticity ASAP as we have to submit
>> our report on 7-04-2016.
>>
>> P.S. : I have plotted residuals against fitted values and it is less or
>> more random.
>>
>> Thank You !
Re: [R] Test for Homoscedesticity in R Without BP Test
On Mon, 4 Apr 2016, varin sacha via R-help wrote:

> Hi Deepak,
>
> In econometrics there is another test very often used : the white test.
> The white test is based on the comparison of the estimated variances of
> residuals when the model is estimated by OLS under the assumption of
> homoscedasticity and when the model is estimated by OLS under the
> assumption of heteroscedastic.

The White test is a special case of the Breusch-Pagan test using a particular specification of the auxiliary regressors: namely all regressors, their squares and their cross-products. As this specification only makes sense if all regressors are continuous, many implementations have problems if there are already dummy variables, interactions, etc. in the regressor matrix. This is also the reason why bptest() from "lmtest" uses a different specification by default. However, you can utilize the function to carry out the White test as illustrated in:

    example("CigarettesB", package = "AER")

(Of course, the AER package needs to be installed first.)

> The White test with R
>
> install.packages("bstats")
> library(bstats)
> white.test(LinearModel)

That package is no longer on CRAN as it took the code from bptest() without crediting its original authors and released it in a package that conflicted with the original license. Also, the implementation did not check for potential problems with dummy variables or interactions mentioned above.

So the bptest() implementation from "lmtest" is really recommended. Or alternatively ncvTest() from package "car".

> Hope this helps.
>
> Sacha
>
> From: Deepak Singh <sdeepakrh...@gmail.com>
> To: r-help@r-project.org
> Sent: Monday, 4 April 2016, 10:40
> Subject: [R] Test for Homoscedesticity in R Without BP Test
>
> Respected Sir,
> I am doing a project on multiple linear model fitting and in that project I
> have to test Homoscedesticity of errors I have google for the same and
> found bptest for the same but in R version 3.2.4 bp test is not available.
> So please suggest me a test on homoscedesticity ASAP as we have to submit
> our report on 7-04-2016.
>
> P.S. : I have plotted residuals against fitted values and it is less or
> more random.
>
> Thank You !
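To make the relation between the two tests concrete, here is a small sketch using bptest() from "lmtest" (which must be installed). The model on the built-in `cars` data is only an illustrative assumption. With a single continuous regressor there are no cross-products, so the White-type auxiliary regression contains just the regressor and its square.

```r
library(lmtest)

# Illustrative model on a built-in data set
fit <- lm(dist ~ speed, data = cars)

# Default (studentized) Breusch-Pagan test: the auxiliary regression uses
# the same regressors as the main model
bp_default <- bptest(fit)
bp_default

# White-type specification: regressor plus its square
bp_white <- bptest(fit, ~ speed + I(speed^2), data = cars)
bp_white
```

With several continuous regressors, the second argument would also need the pairwise products, e.g. `~ x1 * x2 + I(x1^2) + I(x2^2)` for hypothetical regressors `x1` and `x2`.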
Re: [R] Test for Homoscedesticity in R Without BP Test
On Mon, 4 Apr 2016, Deepak Singh wrote:

> Respected Sir,
> I am doing a project on multiple linear model fitting and in that project I
> have to test Homoscedesticity of errors I have google for the same and
> found bptest for the same but in R version 3.2.4 bp test is not available.

The function is called bptest() and is implemented in package "lmtest" which is available for current versions of R, see
https://CRAN.R-project.org/package=lmtest

To install it, run:

    install.packages("lmtest")

And then to load the package and try the function:

    library("lmtest")
    example("bptest")

> So please suggest me a test on homoscedesticity ASAP as we have to submit
> our report on 7-04-2016.
>
> P.S. : I have plotted residuals against fitted values and it is less or
> more random.
>
> Thank You !
Re: [R] Test for Homoscedesticity in R Without BP Test
Hi Deepak,

In econometrics another test is used very often: the White test. It is based on comparing the variances estimated when the model is fitted by OLS under the assumption of homoscedasticity with those estimated when the model is fitted by OLS under the assumption of heteroscedasticity.

The White test in R:

    install.packages("bstats")
    library(bstats)
    white.test(LinearModel)

Hope this helps.
Sacha

From: Deepak Singh <sdeepakrh...@gmail.com>
To: r-help@r-project.org
Sent: Monday, 4 April 2016, 10:40
Subject: [R] Test for Homoscedesticity in R Without BP Test

> Respected Sir,
> I am doing a project on multiple linear model fitting, and in that project I have to test homoscedasticity of the errors. I googled this and found bptest, but in R version 3.2.4 the BP test is not available.
> So please suggest a test for homoscedasticity as soon as possible, as we have to submit our report on 7-04-2016.
>
> P.S.: I have plotted the residuals against the fitted values and the pattern looks more or less random.
>
> Thank you!
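For readers who cannot install an extra package, the White test is often computed through an auxiliary regression: the squared OLS residuals are regressed on the regressors, their squares and their cross-products, and n times the R-squared is compared with a chi-squared distribution. The sketch below assumes that form; it is not necessarily what white.test() in bstats does. The `cars` data and all object names are illustrative.

```r
# Stand-in model with a single regressor (assumption: substitute your own)
fit <- lm(dist ~ speed, data = cars)

# Auxiliary regression: squared residuals on the regressor and its square.
# With several regressors you would also add their pairwise products.
aux <- lm(residuals(fit)^2 ~ speed + I(speed^2), data = cars)

# Test statistic and degrees of freedom (auxiliary regressors, no intercept)
white_stat <- nobs(fit) * summary(aux)$r.squared
white_df   <- length(coef(aux)) - 1
white_p    <- pchisq(white_stat, df = white_df, lower.tail = FALSE)

c(statistic = white_stat, df = white_df, p.value = white_p)
```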
Re: [R] Test for Homoscedesticity in R Without BP Test
You might google "Breusch Pagan test r" and find that the test is implemented in the lmtest package.

On 4 Apr 2016 17:28, "Deepak Singh" wrote:

> Respected Sir,
> I am doing a project on multiple linear model fitting, and in that project I have to test homoscedasticity of the errors. I googled this and found bptest, but in R version 3.2.4 the BP test is not available.
> So please suggest a test for homoscedasticity as soon as possible, as we have to submit our report on 7-04-2016.
>
> P.S.: I have plotted the residuals against the fitted values and the pattern looks more or less random.
>
> Thank you!
[R] Test for Homoscedesticity in R Without BP Test
Respected Sir,

I am doing a project on multiple linear model fitting, and in that project I have to test homoscedasticity of the errors. I googled this and found bptest, but in R version 3.2.4 the BP test is not available.

So please suggest a test for homoscedasticity as soon as possible, as we have to submit our report on 7-04-2016.

P.S.: I have plotted the residuals against the fitted values and the pattern looks more or less random.

Thank you!
Re: [R] test hypothesis in R
> On Mar 23, 2016, at 1:44 PM, ruipbarra...@sapo.pt wrote:
>
> Hello,
>
> Try
>
> ?t.test
> t.test(mA, mB, alternative = "greater")
>
> Hope this helps,
>
> Rui Barradas
>
> Quoting Eliza Botto:
>
>> Dear All,
>> I want to test a hypothesis in R by using Student's t-test (p-values).
>> The hypothesis is that model A produces less error than model B at ten stations. Obviously, the null hypothesis (H0) is that the error produced by model A is not lower than that of model B.

NOT "obviously". You only get to do one-sided tests when the scientific question would not allow the possibility of a departure to "the other side". Two-sided tests are the norm in the scientific literature, often to the experimenter's distress when they haven't done a thoughtful (non-optimistic) power analysis and their results are inconclusive as a result. Your hypothesis _should_ have been constructed _before_ you saw the data. That is, if you want to be an ethical scientist.

>> The error magnitudes are
>>
>> #model A
>>> dput(mA)
>>
>> c(36.1956086452583, 34.9996207622861, 36.435733025221,
>> 37.2003157636202, 36.1318687775115, 37.164132533536,
>> 35.2028759357069, 36.7719835944373, 38.3861425339751,
>> 37.4174132119744)
>> #model B
>>> dput(mB)
>>
>> c(39.7655211768704, 40.1730916643841, 39.3699055738618,
>> 39.401619831763, 41.1218634441457, 39.1968630742826,
>> 40.5265825061639, 40.4674956975404, 40.5954427072364,
>> 41.4875529130543)

Those are not models. They are just vectors of numbers. And they seem unlikely to be residual errors of a linear model, since they are not centered on zero. I doubt there is enough in your presentation for a sensible comment on the proper analysis.

-- David.

>> Now can I test my hypothesis in R?
>> Thank you very much in advance,
>> Eliza

David Winsemius
Alameda, CA, USA
Re: [R] test hypothesis in R
Sorry, but in your original post you said that the "Null Hypothesis (H0) is that the error produces by model A is not lower than model B". If the alternative hypothesis is that model A produces less error, change the call to alternative = "less". The relevant part of the help page ?t.test is:

    alternative = "greater" is the alternative that x has a larger mean than y.

Rui Barradas

Quoting Eliza Botto <eliza_bo...@outlook.com>:

> Thanks Rui,
> Just one point though.
>
> Should it be alternative = "greater" or "less"? The alternative hypothesis is that model A produced less error.
>
> regards,
>
> Eliza
>
> -
> Date: Wed, 23 Mar 2016 20:44:20 +
> From: ruipbarra...@sapo.pt
> To: eliza_bo...@outlook.com
> CC: r-help@r-project.org
> Subject: Re: [R] test hypothesis in R
>
> Dear All,
> I want to test a hypothesis in R by using Student's t-test (p-values).
> The hypothesis is that model A produces less error than model B at ten stations. Obviously, the null hypothesis (H0) is that the error produced by model A is not lower than that of model B.
> The error magnitudes are
>
> #model A
>> dput(mA)
>
> c(36.1956086452583, 34.9996207622861, 36.435733025221,
> 37.2003157636202, 36.1318687775115, 37.164132533536,
> 35.2028759357069, 36.7719835944373, 38.3861425339751,
> 37.4174132119744)
> #model B
>> dput(mB)
>
> c(39.7655211768704, 40.1730916643841, 39.3699055738618,
> 39.401619831763, 41.1218634441457, 39.1968630742826,
> 40.5265825061639, 40.4674956975404, 40.5954427072364,
> 41.4875529130543)
>
> Now can I test my hypothesis in R?
> Thank you very much in advance,
> Eliza
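To make the direction of the alternative concrete, here is a small sketch using the two vectors posted in this thread. With x = mA and y = mB, alternative = "less" corresponds to "the mean of mA is smaller than the mean of mB". The paired variant is only an assumption on my part: it would apply if the i-th entries of both vectors come from the same station, which the thread does not confirm.

```r
# Error magnitudes as posted in the thread
mA <- c(36.1956086452583, 34.9996207622861, 36.435733025221,
        37.2003157636202, 36.1318687775115, 37.164132533536,
        35.2028759357069, 36.7719835944373, 38.3861425339751,
        37.4174132119744)
mB <- c(39.7655211768704, 40.1730916643841, 39.3699055738618,
        39.401619831763, 41.1218634441457, 39.1968630742826,
        40.5265825061639, 40.4674956975404, 40.5954427072364,
        41.4875529130543)

# H1: mean(mA) < mean(mB), i.e. model A has the smaller error
res_less <- t.test(mA, mB, alternative = "less")

# H1: mean(mA) > mean(mB), the direction in the first reply
res_greater <- t.test(mA, mB, alternative = "greater")

# Paired version (assumption: values are matched by station)
res_paired <- t.test(mA, mB, alternative = "less", paired = TRUE)

c(less = res_less$p.value,
  greater = res_greater$p.value,
  paired_less = res_paired$p.value)
```

The two unpaired one-sided p-values sum to one, so choosing the wrong direction turns a very small p-value into one close to 1.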
Re: [R] test hypothesis in R
Thanks Rui,

Just one point though.

Should it be alternative = "greater" or "less"? The alternative hypothesis is that model A produced less error.

regards,

Eliza

Date: Wed, 23 Mar 2016 20:44:20 +
From: ruipbarra...@sapo.pt
To: eliza_bo...@outlook.com
CC: r-help@r-project.org
Subject: Re: [R] test hypothesis in R

> Hello,
>
> Try
>
> ?t.test
> t.test(mA, mB, alternative = "greater")
>
> Hope this helps,
>
> Rui Barradas
>
> Quoting Eliza Botto <eliza_bo...@outlook.com>:
>
>> Dear All,
>> I want to test a hypothesis in R by using Student's t-test (p-values).
>> The hypothesis is that model A produces less error than model B at ten stations. Obviously, the null hypothesis (H0) is that the error produced by model A is not lower than that of model B.
>> The error magnitudes are
>>
>> #model A
>> dput(mA)
>> c(36.1956086452583, 34.9996207622861, 36.435733025221,
>> 37.2003157636202, 36.1318687775115, 37.164132533536,
>> 35.2028759357069, 36.7719835944373, 38.3861425339751,
>> 37.4174132119744)
>> #model B
>> dput(mB)
>> c(39.7655211768704, 40.1730916643841, 39.3699055738618,
>> 39.401619831763, 41.1218634441457, 39.1968630742826,
>> 40.5265825061639, 40.4674956975404, 40.5954427072364,
>> 41.4875529130543)
>>
>> Now can I test my hypothesis in R?
>> Thank you very much in advance,
>> Eliza
Re: [R] test hypothesis in R
Hello,

Try

    ?t.test
    t.test(mA, mB, alternative = "greater")

Hope this helps,

Rui Barradas

Quoting Eliza Botto:

> Dear All,
> I want to test a hypothesis in R by using Student's t-test (p-values).
> The hypothesis is that model A produces less error than model B at ten stations. Obviously, the null hypothesis (H0) is that the error produced by model A is not lower than that of model B.
> The error magnitudes are
>
> #model A
>> dput(mA)
>
> c(36.1956086452583, 34.9996207622861, 36.435733025221,
> 37.2003157636202, 36.1318687775115, 37.164132533536,
> 35.2028759357069, 36.7719835944373, 38.3861425339751,
> 37.4174132119744)
> #model B
>> dput(mB)
>
> c(39.7655211768704, 40.1730916643841, 39.3699055738618,
> 39.401619831763, 41.1218634441457, 39.1968630742826,
> 40.5265825061639, 40.4674956975404, 40.5954427072364,
> 41.4875529130543)
>
> Now can I test my hypothesis in R?
> Thank you very much in advance,
> Eliza
[R] test hypothesis in R
Dear All,

I want to test a hypothesis in R by using Student's t-test (p-values). The hypothesis is that model A produces less error than model B at ten stations. Obviously, the null hypothesis (H0) is that the error produced by model A is not lower than that of model B. The error magnitudes are

#model A
> dput(mA)
c(36.1956086452583, 34.9996207622861, 36.435733025221,
37.2003157636202, 36.1318687775115, 37.164132533536,
35.2028759357069, 36.7719835944373, 38.3861425339751,
37.4174132119744)

#model B
> dput(mB)
c(39.7655211768704, 40.1730916643841, 39.3699055738618,
39.401619831763, 41.1218634441457, 39.1968630742826,
40.5265825061639, 40.4674956975404, 40.5954427072364,
41.4875529130543)

Now can I test my hypothesis in R?

Thank you very much in advance,
Eliza
Re: [R] test if a url exists
On 29/06/2014, 7:12 AM, Hui Du wrote:

> Hi all,
>
> I need to test if a URL exists. I used url.exists() from the RCurl package
>
> library(RCurl)
>
> however, the test result is kind of weird. For example,
>
> url.exists("http://www.amazon.com")
> [1] FALSE
>
> although www.amazon.com is a valid URL. Does anybody know how to use that function correctly, or another way to test URL existence?

You can use the .header = TRUE option to that call to see the error 405 that it gives.

Duncan Murdoch
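A 405 status ("Method Not Allowed") suggests the server refused the kind of request that was sent rather than that the page is missing. One possible workaround, sketched here with base R only, is to look at the status code yourself and count 405 as "exists". The helper names `status_means_exists` and `url_seems_to_exist` are hypothetical, and treating 405 as success is my assumption, not something the thread states.

```r
# Hypothetical helper: decide from an HTTP status code whether the
# resource appears to exist. 2xx/3xx count as success; 405 is also
# accepted because it only says the request method was refused.
status_means_exists <- function(status) {
  (status >= 200 & status < 400) | status == 405
}

# Hypothetical wrapper around base R's curlGetHeaders(), which returns
# the response headers with the final status code in attr(, "status").
# Any error (no network, bad host, unsupported scheme) is mapped to FALSE.
url_seems_to_exist <- function(u) {
  h <- tryCatch(curlGetHeaders(u), error = function(e) NULL)
  if (is.null(h)) {
    return(FALSE)
  }
  status_means_exists(attr(h, "status"))
}
```

Calling `url_seems_to_exist("http://www.amazon.com")` needs network access, so the result depends on the server's behaviour at the time of the call.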
[R] test if a url exists
Hi all,

I need to test if a URL exists. I used url.exists() from the RCurl package

    library(RCurl)

however, the test result is kind of weird. For example,

    url.exists("http://www.amazon.com")
    [1] FALSE

although www.amazon.com is a valid URL. Does anybody know how to use that function correctly, or another way to test URL existence?

Thanks.
HXD
Re: [R] test the return from grep or agrep
On 01/03/2014 23:32, Hui Du wrote:

> Hi All,
>
> My sample code looks like
>
> options(stringsAsFactors = FALSE);
> clean = function(x) {
>     loc = agrep("ABC", x$name);
>     x[loc,]$new_name <- "NEW";
>     x;
> }
> name = c("12", "dad", "dfd");
> y = data.frame(name = as.character(name), idx = 1:3);
> y$new_name = y$name;
> z <- clean(y)
>
> The snippet does not work because I forgot to test the return value of agrep. If no pattern is found, it returns 0 and the following x[loc, ]$new_name does not like that. I know how to fix that part. However, my code has many places like that, say over 100 calls to agrep or grep for different patterns and substitutions. Is there any smart way to fix them all rather than line by line?

That is not true: it returns integer(0). (If it returned 0 it would work.)

For grep() I would recommend using grepl() instead. Otherwise

    if(length(loc)) x[loc,]$new_name <- "NEW"

or

    x[loc,]$new_name <- rep_len("NEW", length(loc))

Your code is full of pointless empty statements (between ; and NL): R is not C, and ; is a separator, not a terminator.

> Many thanks.
>
> HXD

--
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
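The grepl() suggestion can be sketched as follows: a logical index of all FALSE simply assigns nothing, so no length check is needed. The function signature with `pattern` and `replacement` arguments is my own generalisation for illustration, not the poster's code; for approximate matching, agrepl() plays the same role as grepl().

```r
# Data as in the question
y <- data.frame(name = c("12", "dad", "dfd"), idx = 1:3,
                stringsAsFactors = FALSE)
y$new_name <- y$name

# Hypothetical generalised version of clean(): logical indexing is safe
# even when nothing matches, because an all-FALSE index selects no rows.
clean <- function(x, pattern, replacement) {
  hit <- grepl(pattern, x$name, fixed = TRUE)
  x$new_name[hit] <- replacement
  x
}

# What the integer-index version trips over: no match gives integer(0)
loc <- grep("ABC", y$name)

z_none <- clean(y, "ABC", "NEW")  # no match: new_name is left unchanged
z_hit  <- clean(y, "dad", "NEW")  # one match: only row 2 is changed
```

Wrapping the pattern and replacement in one function also addresses the "over 100 calls" concern, since the guard then lives in a single place.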
[R] test the return from grep or agrep
Hi All,

My sample code looks like

    options(stringsAsFactors = FALSE);
    clean = function(x) {
        loc = agrep("ABC", x$name);
        x[loc,]$new_name <- "NEW";
        x;
    }
    name = c("12", "dad", "dfd");
    y = data.frame(name = as.character(name), idx = 1:3);
    y$new_name = y$name;
    z <- clean(y)

The snippet does not work because I forgot to test the return value of agrep. If no pattern is found, it returns 0 and the following x[loc, ]$new_name does not like that. I know how to fix that part. However, my code has many places like that, say over 100 calls to agrep or grep for different patterns and substitutions. Is there any smart way to fix them all rather than line by line?

Many thanks.

HXD
[R] Test to determine if there is a difference between two means
Hi,

I have a data set of 20 experiments, each of which ran for 10 minutes. In each experiment an insect had a choice to spend time in one of two chambers. Each experiment therefore has the number of seconds spent in each chamber. I want to know whether there is a difference in the mean time spent in each chamber. I was going to do a t-test but was advised that there was a better way, something about introducing random numbers? I was hoping someone could help.

Thanks
Wes
Re: [R] Test to determine if there is a difference between two means
Inline below.

Cheers,
Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge is certainly not wisdom."
H. Gilbert Welch

On Tue, Dec 24, 2013 at 7:38 AM, wesley bell wesleybel...@yahoo.com wrote:

> Hi,
> I have a data set of 20 experiments, each of which ran for 10 minutes. In each experiment an insect had a choice to spend time in one of two chambers. Each experiment therefore has the number of seconds spent in each chamber. I want to know whether there is a difference in the mean time spent in each chamber.

Yes, there is. Always.

> I was going to do a t-test but was advised that there was a better way, something about introducing random numbers? I was hoping someone could help.

This list is about R, not statistics, although they certainly overlap. I suggest you post on stats.stackexchange.com instead for statistics help. Better yet, you might do well to talk with a local expert about statistical issues, as you are obviously weak here.

> Thanks
> Wes
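One possible reading of "introducing random numbers" is a randomization (permutation) test; that is a guess, since the thread never says what was meant. Under the null hypothesis of no chamber preference, the sign of each within-experiment difference is arbitrary, so randomly flipping signs gives a reference distribution for the mean difference. The data below are simulated purely for illustration (20 runs of 600 seconds, assuming the insect is always in one of the two chambers); they are not the poster's data.

```r
set.seed(1)

# Simulated, hypothetical data: seconds in chamber A and B per experiment
n_exp <- 20
time_a <- round(runif(n_exp, min = 200, max = 500))
time_b <- 600 - time_a

# Within-experiment differences and their observed mean
d <- time_a - time_b
obs_mean <- mean(d)

# Sign-flip randomization: each row of 'signs' is one random relabelling
n_perm <- 9999
signs <- matrix(sample(c(-1, 1), n_perm * n_exp, replace = TRUE),
                nrow = n_perm)
perm_means <- as.vector(signs %*% d) / n_exp

# Two-sided p-value, counting the observed arrangement itself
p_value <- (sum(abs(perm_means) >= abs(obs_mean)) + 1) / (n_perm + 1)
p_value
```

Because the two times in each experiment are tied together, the comparison is on the paired differences, not on two independent samples.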
[R] Test ADF differences in R and Eviews
Hi,

Attached you can find the source data on which I ran adf.test(), and a screenshot with the results in R and EViews. The results are very different. Did I miss something?

Best,
T.S.
Re: [R] Test ADF differences in R and Eviews
On Dec 5, 2013, at 3:18 PM, nooldor wrote:

> Hi,
>
> Attached you can find the source data on which I ran adf.test(), and a screenshot with the results in R and EViews. The results are very different. Did I miss something?

Yes. You missed the list of acceptable file types for r-help.

--
David Winsemius
Alameda, CA, USA
[R] Test for exogeneity
Hi,

I am building a bivariate SVAR model

$$y_{1t} = c_1 + \Gamma_1(1,1)\, y_{1,t-1} + \Gamma_1(1,2)\, y_{2,t-1} + \Gamma_2(1,1)\, y_{1,t-2} + \Gamma_2(1,2)\, y_{2,t-2} + \varepsilon_{1t}$$

$$b\, y_{1t} + y_{2t} = c_2 + \Gamma_1(2,1)\, y_{1,t-1} + \Gamma_1(2,2)\, y_{2,t-1} + \Gamma_2(2,1)\, y_{1,t-2} + \Gamma_2(2,2)\, y_{2,t-2} + \varepsilon_{2t}$$

Now y1 is relatively exogenous, in that y1 impacts y2 contemporaneously but not the other way around. Given a bivariate dataset, is there any statistical test (in any R package or elsewhere) that helps to justify or test the exogeneity of y1 in the present context? Is there any reference available?

Thanks,
Miao
Re: [R] test wilcoxon sur R help!
Hi,

Try:

    fun1 <- function(dat){
      mat1 <- combn(colnames(dat1),2)
      res <- sapply(seq_len(ncol(mat1)),function(i) {x1 <- dat[,mat1[,i]]; wilcox.test(x1[,1],x1[,2])$p.value})
      names(res) <- apply(mat1,2,paste,collapse="_")
      res
    }

    set.seed(432)
    dat1 <- as.data.frame(matrix(sample(18*10,18*10,replace=FALSE),ncol=18))
    fun1(dat1) # gives the p-value for each pair of columns

> Hi,
> I want to run a Wilcoxon test. I have 18 columns, each corresponding to a different sample, and I want to compare each one with every other using a Wilcoxon test. Is this possible in one step, or do I have to compare them two by two? Is there code to automate this test, so that I don't have to type the code for each pair?
> Thanks!
> denisse
Re: [R] test wilcoxon sur R help!
Hello,

There's a bug in your function: it should be 'dat', not 'dat1', in the line marked below.

    fun1 <- function(dat){
      mat1 <- combn(colnames(dat),2) # Here, 'dat' not 'dat1'
      res <- sapply(seq_len(ncol(mat1)),function(i) {x1 <- dat[,mat1[,i]]; wilcox.test(x1[,1],x1[,2])$p.value})
      names(res) <- apply(mat1,2,paste,collapse="_")
      res
    }

Hope this helps,

Rui Barradas

On 24-10-2013 20:16, arun wrote:

> Hi,
>
> Try:
>
> fun1 <- function(dat){
>   mat1 <- combn(colnames(dat1),2)
>   res <- sapply(seq_len(ncol(mat1)),function(i) {x1 <- dat[,mat1[,i]]; wilcox.test(x1[,1],x1[,2])$p.value})
>   names(res) <- apply(mat1,2,paste,collapse="_")
>   res
> }
>
> set.seed(432)
> dat1 <- as.data.frame(matrix(sample(18*10,18*10,replace=FALSE),ncol=18))
> fun1(dat1) # gives the p-value for each pair of columns
>
>> Hi,
>> I want to run a Wilcoxon test. I have 18 columns, each corresponding to a different sample, and I want to compare each one with every other using a Wilcoxon test. Is this possible in one step, or do I have to compare them two by two? Is there code to automate this test, so that I don't have to type the code for each pair?
>> Thanks!
>> denisse
Re: [R] test wilcoxon sur R help!
Hi,

Check out this function: pairwise.wilcox.test {package = stats}.

    example(pairwise.wilcox.test)

On Fri, Oct 25, 2013 at 2:15 AM, Rui Barradas ruipbarra...@sapo.pt wrote:

> Hello,
>
> There's a bug in your function: it should be 'dat', not 'dat1', in the line marked below.
>
> fun1 <- function(dat){
>   mat1 <- combn(colnames(dat),2) # Here, 'dat' not 'dat1'
>   res <- sapply(seq_len(ncol(mat1)),function(i) {x1 <- dat[,mat1[,i]]; wilcox.test(x1[,1],x1[,2])$p.value})
>   names(res) <- apply(mat1,2,paste,collapse="_")
>   res
> }
>
> Hope this helps,
>
> Rui Barradas
Re: [R] test wilcoxon sur R help!
It looks much better than mine.

With p-value adjustment:

    p.adjust(fun1(dat1), method = "holm", n = 153)
    #
    dat1$id <- 1:10
    library(reshape2)
    dat2 <- melt(dat1,id.var="id")
    with(dat2,pairwise.wilcox.test(value,variable))
    with(dat2,pairwise.wilcox.test(value,variable,p.adj="none"))

A.K.

On Friday, October 25, 2013 12:05 AM, vikram ranga babuaw...@gmail.com wrote:

> Hi,
>
> Check out this function: pairwise.wilcox.test {package = stats}.
>
> example(pairwise.wilcox.test)
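The same pairwise comparison can be sketched without reshape2 by using stack() from base R to put the 18 columns into long form. The simulated `dat1` repeats the example data from earlier in this thread; the names `long` and `res` are illustrative. With 18 columns there are choose(18, 2) = 153 pairs, which is where the n = 153 in the p.adjust() call above comes from.

```r
# Example data as used earlier in the thread
set.seed(432)
dat1 <- as.data.frame(matrix(sample(18 * 10, 18 * 10, replace = FALSE),
                             ncol = 18))

# Long format: 'values' holds the observations, 'ind' the column label
long <- stack(dat1)

# All pairwise Wilcoxon rank-sum tests, Holm-adjusted for multiplicity
res <- pairwise.wilcox.test(long$values, long$ind,
                            p.adjust.method = "holm")

# Lower-triangular matrix of adjusted p-values, one entry per pair
dim(res$p.value)
sum(!is.na(res$p.value))
```

These are unpaired tests; if the rows of the data are matched across columns, the `paired` argument of pairwise.wilcox.test() would be the thing to look at.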
[R] Test if 2 samples differ if they have autocorrelation
Dear all,

I have a question that I am struggling to answer.

Let's assume I have 2 time series of daily PnL data over 2 years, coming from 2 different trading strategies. I want to find out whether strategy A is better than strategy B. The problem is that the two series have serial correlation, hence I cannot just do a simple t-test.

I tried something like this:

1. create cumulative time series of PnL_A = C_A and of PnL_B = C_B
2. take the difference of both: C_A - C_B = DiffPnL (to see how the difference evolves over time)
3. do a regression: DiffPnL = beta * time + error (I thought that if beta is significantly different from 0, then the two time series are different)
4. estimate beta not with OLS but with the Newey-West method (HAC estimator); this corrects the test statistics and standard errors of beta for heteroskedasticity and autocorrelation

BUT: I read that the tests are biased when the time series are unit-root non-stationary (which is the case here because I take cumulative time series).

I am lost! This should be fairly simple: test whether two samples differ if they have autocorrelation? Probably my approach above is completely wrong...

Thanks for your help

Best regards
Eric
Re: [R] Test if 2 samples differ if they have autocorrelation
I imagine that most readers of this list will put your question in the "too hard" basket. That being so, here is my inexpert take on the question.

The issue is to estimate the uncertainty in the estimated difference of the means. This uncertainty depends on the nature of the serial dependence of the series. Therefore, in order to get anywhere, you need to *model* this dependence. Different models could yield very different values for the variance of the estimated difference of the means.

If the series are observed at the same times, I would suggest taking the pointwise difference of the two series: D_t = X_t - Y_t, say. Fit the best arima model that you can to D_t. Then the standard error of what is incorrectly labelled "intercept" (it is actually the estimate of the series *mean*) is the appropriate estimate of the uncertainty. The ratio of the "intercept" value to its standard error is the test statistic you are looking for.

If the series are *not* observed at the same times but can be assumed to be independent, then model *each* series as well as you can (different models for each series) and obtain the standard error of the "intercept" for each series. Your test statistic is then the difference of the "intercept" estimates divided by sqrt(se_X^2 + se_Y^2), in what I hope is an obvious notation.

If the series are not observed at the same times and cannot be assumed to be independent, then you probably haven't got sufficient information to answer the question that you wish to answer.

I hope that there is some value in the foregoing.

cheers,

Rolf Turner

On 18/07/13 21:50, Eric Jaeger wrote:

> Dear all,
>
> I have a question that I am struggling to answer.
>
> Let's assume I have 2 time series of daily PnL data over 2 years, coming from 2 different trading strategies. I want to find out whether strategy A is better than strategy B. The problem is that the two series have serial correlation, hence I cannot just do a simple t-test.
>
> I tried something like this:
>
> 1. create cumulative time series of PnL_A = C_A and of PnL_B = C_B
> 2. take the difference of both: C_A - C_B = DiffPnL (to see how the difference evolves over time)
> 3. do a regression: DiffPnL = beta * time + error (I thought that if beta is significantly different from 0, then the two time series are different)
> 4. estimate beta not with OLS but with the Newey-West method (HAC estimator); this corrects the test statistics and standard errors of beta for heteroskedasticity and autocorrelation
>
> BUT: I read that the tests are biased when the time series are unit-root non-stationary (which is the case here because I take cumulative time series).
>
> I am lost! This should be fairly simple: test whether two samples differ if they have autocorrelation? Probably my approach above is completely wrong...
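The first suggestion (series observed at the same times) can be sketched as below. The two daily PnL series are simulated AR(1) processes and the AR(1) order passed to arima() is fixed by assumption; in practice the order would have to be chosen for the data at hand. All object names are illustrative.

```r
set.seed(42)

# Simulated, hypothetical daily PnL series with serial correlation
n_days <- 500
pnl_a <- 0.05 + arima.sim(list(ar = 0.4), n = n_days)
pnl_b <- arima.sim(list(ar = 0.4), n = n_days)

# Pointwise difference D_t = X_t - Y_t (not the cumulative series)
d <- pnl_a - pnl_b

# Fit an ARMA model with a mean term; AR(1) is an assumption here
fit <- arima(d, order = c(1, 0, 0))

# The coefficient labelled "intercept" is the estimated mean of D_t
est <- coef(fit)[["intercept"]]
se  <- sqrt(fit$var.coef["intercept", "intercept"])

# Test statistic and a two-sided p-value from the normal approximation
z <- est / se
p_value <- 2 * pnorm(-abs(z))
c(estimate = est, se = se, z = z, p.value = p_value)
```

Working with the daily differences rather than the cumulative sums avoids the unit-root concern raised in the original question.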
Re: [R] Test for column equality across matrices
Dear William,

thanks a lot. I've found another nice alternative:

    A <- matrix(t(expand.grid(c(1,2,3,4,5), 15, 16)), nrow = 3)
    B <- combn(16, 3)
    B.n <- B[, -(which(duplicated(t(cbind(A, B)))) - ncol(A))]

Best wishes,
Alrik
Re: [R] Test for column equality across matrices
It looks like match() (and relatives like %in% and is.element) act a bit unpredictably on lists when the list elements are vectors of numbers of different types. If you match integers to integers or doubles to doubles it works as expected, but when the types don't match the results vary. I would expect the following to give either c(1,2) or c(NA,NA) but not c(1,NA):

    > match( list( c(13L,15L,16L), c(14L,15L,16L)), list( c(13.,15.,16.), c(14.,15.,16.) ))
    [1]  1 NA

It works when the list elements have the same type:

    > match( list( c(13L,15L,16L), c(14L,15L,16L)), list( c(13L,15L,16L), c(14L,15L,16L) ))
    [1] 1 2
    > match( list( c(13.,15.,16.), c(14.,15.,16.)), list( c(13.,15.,16.), c(14.,15.,16.) ))
    [1] 1 2
    > match( list( c(13.,15.,16.), c(14L,15L,16L)), list( c(13.,15.,16.), c(14L,15L,16L) ))
    [1] 1 2

So A and B should be coerced to have a common type ('storage.mode') before comparing them.

By the way, the discrepancy might happen because match() applied to lists might be implemented by calling deparse on each element of each list and then using the character method of match. For sequential integers deparse uses colon notation; e.g., c(14L,15L,16L) becomes the string "14:16". But usually deparse puts an 'L' after integers so they would never match with a double of the same value.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
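The suggestion to give both matrices a common storage mode before matching columns as list elements could look something like this. It is a sketch that reuses the columnsOf helper and the A and B matrices from the thread; the choice of "integer" as the common mode is an assumption that only makes sense because every entry here is a whole number.

```r
# Helper from the thread: split a matrix into a list of its columns.
columnsOf <- function(mat) split(mat, col(mat))

A <- matrix(t(expand.grid(c(1, 2, 3, 4, 5), 15, 16)), nrow = 3)  # stored as double
B <- combn(16, 3)                                                # stored as integer

# Coerce A so that both matrices share one storage mode
# (assumption: all entries are whole numbers, so nothing is lost).
storage.mode(A) <- "integer"
same_mode <- identical(storage.mode(A), storage.mode(B))

# Drop every column of B that also occurs as a column of A.
newB <- B[, !is.element(columnsOf(B), columnsOf(A))]
dim(newB)
```

Coercing B to "double" instead would serve the same purpose; what matters is that both sides of the comparison are stored the same way.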
[R] Test for column equality across matrices
Dear list,

I have two matrices

    A <- matrix(t(expand.grid(c(1,2,3,4,5), 15, 16)), nrow = 3)
    B <- combn(16, 3)

Now I would like to exclude all columns from the 560 columns in B which are identical to any 1 of the 6 columns in A. How could I do this?

Many thanks and best wishes,
Alrik
Re: [R] Test for column equality across matrices
Try

    columnsOf <- function(mat) split(mat, col(mat))
    newB <- B[ , !is.element(columnsOf(B), columnsOf(A))]

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
Re: [R] Test for column equality across matrices
I tried it on a slightly bigger dataset:

    A1 <- matrix(t(expand.grid(1:90, 15, 16)), nrow = 3)
    B1 <- combn(90, 3)
    which(is.element(columnsOf(B1), columnsOf(A1)))
    # [1]  1067  4895  8636 12291 15861 19347 22750 26071 29311 32471 35552 38555
    #[13] 41481
    which(apply(t(B1),1,paste,collapse="")%in%apply(t(A1),1,paste,collapse=""))
    # [1]  1067  4895  8636 12291 15861 19347 22750 26071 29311 32471 35552 38555
    #[13] 41481 44331
    B1[,44331]
    #[1] 14 15 16
    which(apply(t(A1),1,paste,collapse="")=="141516")
    #[1] 14
    B1New <- B1[,!apply(t(B1),1,paste,collapse="")%in%apply(t(A1),1,paste,collapse="")]
    newB <- B1[ , !is.element(columnsOf(B1), columnsOf(A1))]
    identical(B1New,newB)
    #[1] FALSE
    is.element(B1[,44331],A1[,14])
    #[1] TRUE TRUE TRUE
    B1Sp <- columnsOf(B1)
    B1Sp[[44331]]
    #[1] 14 15 16
    A1Sp <- columnsOf(A1)
    A1Sp[[14]]
    #[1] 14 15 16
    is.element(B1Sp[[44331]],A1Sp[[14]])
    #[1] TRUE TRUE TRUE

A.K.
Re: [R] Test for column equality across matrices
Hi,

One way would be:

    which(apply(t(B),1,paste,collapse="")%in%apply(t(A),1,paste,collapse=""))
    #[1] 105 196 274 340 395
    B[,105]
    #[1]  1 15 16
    B[,196]
    #[1]  2 15 16
    B1 <- B[,!apply(t(B),1,paste,collapse="")%in%apply(t(A),1,paste,collapse="")]
    dim(B1)
    #[1]   3 555
    dim(B)
    #[1]   3 560
    #or
    B2 <- B[,is.na(match(interaction(as.data.frame(t(B))),interaction(as.data.frame(t(A)))))]
    identical(B1,B2)
    #[1] TRUE

A.K.
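One caveat with pasting columns into strings is that an empty separator can make different columns collide. A small variation, sketched here with a hypothetical helper named col_key and an arbitrarily chosen "_" separator, avoids that while keeping the same idea:

```r
# Hypothetical helper: build one string key per column, with a separator so
# that e.g. (1, 15, 16) and (11, 5, 16) do not collapse to the same key.
col_key <- function(mat) apply(mat, 2, paste, collapse = "_")

A <- matrix(t(expand.grid(c(1, 2, 3, 4, 5), 15, 16)), nrow = 3)
B <- combn(16, 3)

# Keep only the columns of B whose key does not occur among the keys of A.
newB <- B[, !(col_key(B) %in% col_key(A)), drop = FALSE]

# Why the separator matters: without it, these two different triples collide.
collides <- paste(c(1, 15, 16), collapse = "") == paste(c(11, 5, 16), collapse = "")
```

Because the keys are character strings, this comparison does not care whether one matrix is stored as integer and the other as double, as long as the values print identically.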
[R] test
Sorry for this message it's just a test. Thank you! -- --- Catalin-Constantin ROIBU Lecturer PhD, Forestry engineer Forestry Faculty of Suceava Str. Universitatii no. 13, Suceava, 720229, Romania office phone +4 0230 52 29 78, ext. 531 mobile phone +4 0745 53 18 01 +4 0766 71 76 58 FAX:+4 0230 52 16 64 silvic.usv.ro
Re: [R] Test of Parallel Regression Assumption in R
Dear Heather,

You can make this test using the ordinal package. Here the function clm fits cumulative link models, of which the ordinal logistic regression model is a special case (using the logit link).

Let me illustrate how to test the parallel regression assumption for a particular variable using clm in the ordinal package. I am using the wine dataset from the same package. I fit a model with two explanatory variables, temp and contact, and I test the parallel regression assumption for the contact variable in a likelihood ratio test:

    > library(ordinal)
    Loading required package: MASS
    Loading required package: ucminf
    Loading required package: Matrix
    Loading required package: lattice
    > head(wine)
      response rating temp contact bottle judge
    1       36      2 cold      no      1     1
    2       48      3 cold      no      2     1
    3       47      3 cold     yes      3     1
    4       67      4 cold     yes      4     1
    5       77      4 warm      no      5     1
    6       60      4 warm      no      6     1
    > fm1 <- clm(rating ~ temp + contact, data=wine)
    > fm2 <- clm(rating ~ temp, nominal=~ contact, data=wine)
    > anova(fm1, fm2)
    Likelihood ratio tests of cumulative link models:

        formula:                nominal: link: threshold:
    fm1 rating ~ temp + contact ~1       logit flexible
    fm2 rating ~ temp           ~contact logit flexible

        no.par    AIC  logLik LR.stat df Pr(>Chisq)
    fm1      6 184.98 -86.492
    fm2      9 190.42 -86.209  0.5667  3      0.904

The idea is to fit the model under the null hypothesis (parallel effects - fm1) and under the alternative hypothesis (non-parallel effects for contact - fm2) and compare these models with anova(), which performs the LR test. From the high p-value we see that the null cannot be rejected and there is no evidence of non-parallel slopes in this case.

For additional information, I suggest that you take a look at the following package vignette (http://cran.r-project.org/web/packages/ordinal/vignettes/clm_tutorial.pdf), where these kinds of tests are described more thoroughly, starting on page 6.

I think you can also make similar tests with the VGAM package, but I am not as well versed in that package.

Hope this helps,
Rune

Rune Haubo Bojesen Christensen
Postdoc
DTU Compute - Section for Statistics
Technical University of Denmark
Department of Applied Mathematics and Computer Science
Richard Petersens Plads, Building 324, Room 220
2800 Lyngby
Direct +45 45253363
Mobile +45 30264554
http://www.imm.dtu.dk
[R] Test of Parallel Regression Assumption in R
Hi, I am running an analysis with an ordinal outcome and I need to run a test of the parallel regression assumption to determine if ordinal logistic regression is appropriate. I cannot find a function to conduct such a test. From searching various message boards I have seen a few useRs ask this same question without a definitive answer - and I came across a thread that indicated there is no such function available in any R packages. I hope this is incorrect. Does anyone know how to test the parallel regression assumption in R? Thanks for your help! -- Heather Hensman Kettrey PhD Candidate Department of Sociology Vanderbilt University
Re: [R] Test of Parallel Regression Assumption in R
Heather:

You are at Vanderbilt, whose statistics department under Frank Harrell is a veritable bastion of R and statistical wisdom. I strongly recommend that you take a stroll over there in the lovely spring weather and seek their help. I can't imagine how you could do better than that!

Cheers,
Bert

--
Bert Gunter
Genentech Nonclinical Biostatistics
Internal Contact Info: Phone: 467-7374
Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
Re: [R] Test of Parallel Regression Assumption in R
Perhaps you should be asking whether such an algorithm exists, regardless of whether it is already implemented in R. However, this is the wrong place to ask such theory questions... your local statistics expert might know, or you could ask on a statistics theory forum such as stats.stackexchange.com. With the answer to that question you could use the RSiteSearch function to search for references to that algorithm, or even implement it yourself.

---------------------------------------------------------------------------
Jeff Newmiller
DCN: jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---------------------------------------------------------------------------
Sent from my phone. Please excuse my brevity.
Re: [R] Test of Parallel Regression Assumption in R
Here's some code as an example. Hope it helps!

    library(MASS)  # provides polr()

    # ordered logit fit
    mod <- polr(vote~age+demsat+eusup+lrself+male+retnat+union+urban, data=dat)
    summary(mod)

    # one binary logit per cut-point of the outcome
    levs <- levels(dat$vote)
    tmpdat <- list()
    for(i in 1:(nlevels(dat$vote)-1)){
      tmpdat[[i]] <- dat
      tmpdat[[i]]$z <- as.numeric(as.numeric(tmpdat[[1]]$vote) <= levs[i])
    }
    form <- as.formula(z~age+demsat+eusup+lrself+male+retnat+union+urban)
    mods <- lapply(tmpdat, function(x)glm(form, data=x, family=binomial))
    probs <- sapply(mods, predict, type="response")

    # category probabilities from the binary logits vs. the ordered logit
    p.logits <- cbind(probs[,2], t(apply(probs, 1, diff)), 1-probs[,ncol(probs)])
    p.ologit <- predict(mod, type='probs')
    n <- nrow(p.logits)
    bin.ll <- p.logits[cbind(1:n, dat$vote)]
    ologit.ll <- p.ologit[cbind(1:n, dat$vote)]
    binom.test(sum(bin.ll > ologit.ll), n)

    dat$vote.fac <- factor(dat$vote, levels=1:6)
    mod <- polr(dat$vote.fac~age+demsat+eusup+lrself+male+retnat+union+urban, data=dat)
    source("http://www.quantoid.net/cat_pre.R")
    catpre(mod)

    install.packages("rms")
    library(rms)
    olprobs <- predict(mod, type='probs')
    pred.cat <- apply(olprobs, 1, which.max)
    table(pred.cat, dat$vote)
    round(prop.table(table(pred.cat, dat$vote), 2), 3)
[R] Test if mysql connection is alive
Hi fellows,

I use RMySQL. I want to reconnect if the connection is not alive anymore.

    if (!connected()) con <- dbConnect(MySQL(), user=.., password=.., host=.., db=..)

But how can I do the test connected()? I thought the way to do this was

    connected <- function(){return (exists("con") && isIdCurrent(con))}

But that doesn't work: after some time connected() returns TRUE, but the next dbGetQuery signals

    Error in mysqlExecStatement(conn, statement, ...) : RS-DBI driver: (could not run statement: MySQL server has gone away)

How can I test if the connection is still valid?

Thanks
Frans
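One approach that is sometimes used for this kind of problem is to send a trivial query and treat any error as "not connected", since a connection handle can look valid on the client side after the server has dropped it. The sketch below is written against a generic query function so that it runs without a database; connection_alive, fake_ok and fake_gone are hypothetical names, and using "SELECT 1" as the probe statement is an assumption.

```r
# Hypothetical helper: TRUE if a trivial query succeeds, FALSE on any error.
# `run_query` is any function taking (connection, SQL string).
connection_alive <- function(con, run_query) {
  if (is.null(con)) return(FALSE)
  tryCatch({
    run_query(con, "SELECT 1")
    TRUE
  }, error = function(e) FALSE)
}

# Stand-ins for a healthy and a dropped connection, so the sketch is runnable
# without a MySQL server.
fake_ok   <- function(con, sql) data.frame(x = 1)
fake_gone <- function(con, sql) stop("MySQL server has gone away")

alive_ok   <- connection_alive("dummy", fake_ok)
alive_gone <- connection_alive("dummy", fake_gone)
```

With RMySQL loaded, the call would presumably be connection_alive(con, dbGetQuery), followed by a fresh dbConnect() whenever it returns FALSE. The probe costs one round trip per check, so it may be preferable to wrap the real query in tryCatch and reconnect only on failure.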
Re: [R] test for a condition in a vector for loop not working
Once again, thanks! MVS - MVS = Matthew Van Scoyoc Graduate Research Assistant, Ecology Wildland Resources Department Ecology Center Quinney College of Natural Resources Utah State University Logan, UT = Think SNOW! -- View this message in context: http://r.789695.n4.nabble.com/test-for-a-condition-in-a-vector-for-loop-not-working-tp4649212p4649216.html Sent from the R help mailing list archive at Nabble.com.
[R] Test for treatment effect in a logistic regression
Dear R users,

I need to fit a logistic regression with a binomial response. The objective is to compare treatment groups while controlling for other categorical and continuous predictors. glm() with family=binomial(logit) gives me parameter estimates as well as odds ratios, but my objective is to test whether the treatment groups are significantly different. I have used the Wald test but got an error message (please see the code used and the error message). Any suggestion is much appreciated!

    wald.test(b=coef(fit),sigma=vcov(fit), Terms = 2:3) # 2 and 3 are the estimates for treatment group.
    ## Comparing Group B to Group C
    l <- cbind(0, 1,-1, 0,0,0,0,0,0,0)
    wald.test(b = coef(fit), Sigma = vcov(fit), L =1

Error message:

    Error in wald.test(b = coef(), sigma = vcov(), Terms = 2:3) : unused argument(s) (sigma = vcov())

Thanks in advance for your suggestion,
Bibek
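The error message points at the lower-case sigma argument; assuming wald.test here comes from the aod package, the covariance argument is spelled Sigma with a capital S, as in the second call. Independently of that package, the same kind of Wald test can be sketched in base R. Everything below is illustrative only: the data are simulated, the variable names (group, x, y) and effect sizes are made up, and the model has just four coefficients rather than the ten in the post.

```r
# Simulated example data (assumption: three treatment groups, one covariate).
set.seed(42)
n <- 300
dat <- data.frame(
  group = factor(sample(c("A", "B", "C"), n, replace = TRUE)),
  x     = rnorm(n)
)
eta   <- -0.5 + 0.8 * (dat$group == "B") + 0.3 * (dat$group == "C") + 0.5 * dat$x
dat$y <- rbinom(n, 1, plogis(eta))

fit <- glm(y ~ group + x, data = dat, family = binomial(link = "logit"))
# Coefficient order: (Intercept), groupB, groupC, x

# Joint Wald test that both treatment coefficients (terms 2 and 3) are zero.
idx     <- 2:3
b       <- coef(fit)[idx]
V       <- vcov(fit)[idx, idx]
W       <- as.numeric(t(b) %*% solve(V) %*% b)
p_joint <- pchisq(W, df = length(idx), lower.tail = FALSE)

# Single contrast: group B versus group C.
L          <- c(0, 1, -1, 0)
est        <- sum(L * coef(fit))
se         <- sqrt(as.numeric(t(L) %*% vcov(fit) %*% L))
p_contrast <- 2 * pnorm(-abs(est / se))
```

A likelihood-ratio comparison of the models with and without the treatment factor, via anova(..., test = "Chisq"), is a common alternative to the joint Wald test.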
Re: [R] Test for Random Points on a Sphere
Hi Lorenzo,

Just a quick thought: the uniform probability density on a unit sphere is 1/(4*pi) per unit of surface area, so what about binning those random points according to their directions and doing a chi-square test?

Regards,
Guo
Re: [R] Test for Random Points on a Sphere
Lorenzo Isella lorenzo.ise...@gmail.com writes:

> Dear All,
> I implemented an algorithm for (uniform) random rotations.
> In order to test it, I can apply it to a unit vector (0,0,1) in
> Cartesian coordinates.
> The result is supposed to be a set of random, uniformly distributed,
> points on a sphere (not the point of the algorithm, but a way to test
> it).
> This is what the points look like when I plot them, but other than
> eyeballing them, can anyone suggest a test to ensure that I am really
> generating uniform random points on a sphere?

There is a substantial literature on this topic and more than one (metaphorical?) direction you could follow.

I suggest you Google 'directional statistics' and start reading.

Visit http://www.rseek.org and enter 'directional statistics' in the search box and click on the search button to see if there is something in R to meet your needs.

A post to r-sig-geo might get more helpful responses once you can focus the question a bit more.

HTH,

Chuck

> Many thanks
>
> Lorenzo

--
Charles C. Berry
Dept of Family/Preventive Medicine
cberry at ucsd edu
UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/
La Jolla, San Diego 92093-0901
[R] Test for Random Points on a Sphere
Dear All,

I implemented an algorithm for (uniform) random rotations. In order to test it, I can apply it to a unit vector (0,0,1) in Cartesian coordinates. The result is supposed to be a set of random, uniformly distributed points on a sphere (not the point of the algorithm, but a way to test it). This is what the points look like when I plot them, but other than eyeballing them, can anyone suggest a test to ensure that I am really generating uniform random points on a sphere?

Many thanks

Lorenzo
Re: [R] Test for Random Points on a Sphere
On Fri, Oct 5, 2012 at 5:39 PM, Lorenzo Isella lorenzo.ise...@gmail.com wrote:
> [...] other than eyeballing them, can anyone suggest a test to ensure
> that I am really generating uniform random points on a sphere?

Gut says to divide the surface into n bits of equal area and see if the points appear uniformly in those using something chi-squared-ish, but I'm not aware of a canonical way to do so.

Cheers,
Michael
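The "equal-area bits plus chi-squared" idea can be sketched as below. The binning scheme is an assumption, not something proposed in the thread: it relies on the fact that for uniform points on the unit sphere the z coordinate is uniform on [-1, 1] and the longitude is uniform on (-pi, pi], so a regular grid in (z, longitude) gives cells of equal area. The 5 x 8 grid and the test points (normalised Gaussian vectors standing in for the rotated vectors) are arbitrary choices.

```r
# Stand-in for the points under test: normalised Gaussian vectors.
set.seed(123)
n <- 5000
v <- matrix(rnorm(3 * n), ncol = 3)
v <- v / sqrt(rowSums(v^2))

z   <- pmin(pmax(v[, 3], -1), 1)   # guard against rounding just outside [-1, 1]
phi <- atan2(v[, 2], v[, 1])       # longitude in (-pi, pi]

# Regular grid in (z, phi): 5 bands x 8 sectors = 40 cells of equal area.
zbin <- cut(z,   breaks = seq(-1, 1, length.out = 6),   include.lowest = TRUE)
pbin <- cut(phi, breaks = seq(-pi, pi, length.out = 9), include.lowest = TRUE)
counts <- table(zbin, pbin)

# Goodness-of-fit test against equal cell probabilities
# (the default when chisq.test() is given a plain vector of counts).
res <- chisq.test(as.vector(counts))
res$p.value
```

A small p-value would suggest non-uniformity at the resolution of the grid. The test is blind to structure finer than the cells and says nothing about dependence between successive points, so it is a sanity check rather than a proof.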
Re: [R] Test for Random Points on a Sphere
-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of R. Michael Weylandt
Sent: Friday, October 05, 2012 11:17 AM
To: Lorenzo Isella
Cc: r-help@r-project.org
Subject: Re: [R] Test for Random Points on a Sphere

> On Fri, Oct 5, 2012 at 5:39 PM, Lorenzo Isella lorenzo.ise...@gmail.com wrote:
> > [...] can anyone suggest a test to ensure that I am really generating uniform random points on a sphere?
>
> Gut says to divide the surface into n bits of equal area and see if the points appear uniformly in those using something chi-squared-ish, but I'm not aware of a canonical way to do so.
>
> Cheers,
> Michael

I would be more inclined to use a method which is known to produce points uniformly distributed on the surface of a sphere and not worry about testing your results. You might find the discussion at the following link useful.

http://mathworld.wolfram.com/SpherePointPicking.html

Hope this is helpful,

Dan

Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability
Research and Data Analysis Division
Olympia, WA 98504-5204
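One of the standard constructions for sphere point picking can be sketched as follows. The helper name `runif_sphere` is made up for this illustration; the method draws the height z uniformly on [-1, 1] and the longitude uniformly on [0, 2*pi), which gives points uniform on the unit sphere.

```r
# Hypothetical helper (not from the thread): n uniform points on the unit sphere
runif_sphere <- function(n) {
  z   <- runif(n, -1, 1)        # height is uniform on [-1, 1]
  phi <- runif(n, 0, 2 * pi)    # longitude is uniform on [0, 2*pi)
  r   <- sqrt(1 - z^2)          # radius of the horizontal circle at height z
  cbind(x = r * cos(phi), y = r * sin(phi), z = z)
}

set.seed(1)
pts <- runif_sphere(1000)
range(rowSums(pts^2))           # squared norms should all be 1 up to rounding
```

Such a sample could also serve as a reference when checking a home-grown generator.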
[R] test Breslow-Day for svytable??
Hi all,

I want to know how to perform the Breslow-Day test for homogeneity of odds ratios (OR) across strata for a svytable. This test is obtained with the following code:

epi.2by2(dat = daty, method = "case.control" conf.level = 0.95,
         units = 100, homogeneity = "breslow.day", verbose = TRUE)

where daty is a table object produced by svytable, but when I run the code it does not return the homogeneity test.

Thanks.
Re: [R] test Breslow-Day for svytable??
Suggestion: You need to send us more information, i.e. the code that generated daty, or a listing of the daty structure, and a copy of the listing produced by epi.2by2.

John

John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)

Diana Marcela Martinez Ruiz dianamm...@hotmail.com 8/31/2012 10:20 AM
> Hi all,
>
> I want to know how to perform the Breslow-Day test for homogeneity of odds ratios (OR) across strata for a svytable. This test is obtained with the following code:
>
> epi.2by2(dat = daty, method = "case.control" conf.level = 0.95,
>          units = 100, homogeneity = "breslow.day", verbose = TRUE)
>
> where daty is a table object produced by svytable, but when I run the code it does not return the homogeneity test.
>
> Thanks.
Re: [R] test Breslow-Day for svytable??
On Aug 31, 2012, at 7:20 AM, Diana Marcela Martinez Ruiz wrote:

> Hi all,
>
> I want to know how to perform the Breslow-Day test for homogeneity of odds ratios (OR) across strata for a svytable. This test is obtained with the following code:
>
> epi.2by2(dat = daty, method = "case.control" conf.level = 0.95,
                    missing comma here ...^
>          units = 100, homogeneity = "breslow.day", verbose = TRUE)
>
> where daty is a table object produced by svytable, but when I run the code it does not return the homogeneity test.

You are asked in the Posting guide to copy all errors and warnings when asking about unexpected behavior. When I run epi.2by2 on the output of a svytable object I get no errors, but I do get warnings which I think are due to non-integer entries in the weighted table. I also get from a svytable(), using its first example on the help page, an object that is NOT a set of 2 x 2 tables in an array of the structure expected by epi.2by2(). The fact that epi.2by2() will report numbers with labels for a 2 x 3 table means that its error checking is weak. This is the output of str(dat) from one of the examples on epi.2by2's help page:

> str(dat)
 table [1:2, 1:2, 1:3] 41 13 6 53 66 37 25 83 23 37 ...
 - attr(*, "dimnames")=List of 3
  ..$ Exposure: chr [1:2] "+" "-"
  ..$ Disease : chr [1:2] "+" "-"
  ..$ Strata  : chr [1:3] "20-29 yrs" "30-39 yrs" "40+ yrs"

Notice that it is a 2 x 2 x n array. (Caveat: from here on out I am simply reading the help pages and using str() to look at the objects created to get an idea regarding success or failure. I am not an experienced user of either package.) I doubt that what you got from svytable is a 2 x 2 table.

As another example you can build a 2 x 2 x n table from the built-in dataset UCBAdmissions:

DF <- as.data.frame(UCBAdmissions)
## Now 'DF' is a data frame with a grid of the factors and the counts
## in variable 'Freq'.
dat2 <- xtabs(Freq ~ Gender + Admit + Dept, DF)
epiR::epi.2by2(dat = dat2, method = "case.control", conf.level = 0.95,
               units = 100, homogeneity = "breslow.day", verbose = TRUE)$OR.homog
#-
  test.statistic df    p.value
1       18.82551  5 0.00207139

Using svydesign and svytable I _think_ this is how one would go about constructing a 2 x 2 x n table:

tbl2 <- svydesign( ~ Gender + Admit + Dept, weights = ~Freq, data = DF)
summary(tbl2)
(tbl2by2 <- svytable(~ Gender + Admit + Dept, tbl2))
epiR::epi.2by2(dat = tbl2by2, method = "case.control", conf.level = 0.95,
               units = 100, homogeneity = "breslow.day", verbose = TRUE)$OR.homog
#---
  test.statistic df    p.value
1       18.82551  5 0.00207139

(At least I got internal consistency. I see you copied Thomas Lumley, which is a good idea. I'll be happy to get corrected on any point. I'm adding the maintainer of epiR to the recipients.)

--
David Winsemius, MD
Alameda, CA, USA
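For readers who want to see what the homogeneity statistic computes on a 2 x 2 x n array, here is a minimal base-R sketch of the classical (unweighted) Breslow-Day test. The function `breslow_day` is a hypothetical helper written for this note, not part of epiR or survey; it assumes the Mantel-Haenszel estimate from `mantelhaen.test()` as the common odds ratio and takes no account of complex sampling, so it is no substitute for a design-based test.

```r
# Hypothetical helper: classical Breslow-Day test on a 2 x 2 x K array of counts
breslow_day <- function(x) {
  stopifnot(length(dim(x)) == 3, all(dim(x)[1:2] == 2))
  psi  <- unname(mantelhaen.test(x)$estimate)   # Mantel-Haenszel common OR
  stat <- 0
  for (k in seq_len(dim(x)[3])) {
    a  <- x[1, 1, k]
    r1 <- sum(x[1, , k])                        # first row total
    r2 <- sum(x[2, , k])                        # second row total
    c1 <- sum(x[, 1, k])                        # first column total
    # Expected count a_hat under the common OR solves
    #   a * (r2 - c1 + a) = psi * (r1 - a) * (c1 - a)
    roots <- Re(polyroot(c(-psi * r1 * c1,
                           r2 - c1 + psi * (r1 + c1),
                           1 - psi)))
    ok    <- roots >= max(0, c1 - r2) & roots <= min(r1, c1)
    a_hat <- roots[ok][1]
    # Variance of a under the fitted table
    v <- 1 / (1 / a_hat + 1 / (r1 - a_hat) + 1 / (c1 - a_hat) + 1 / (r2 - c1 + a_hat))
    stat <- stat + (a - a_hat)^2 / v
  }
  df <- dim(x)[3] - 1
  list(statistic = stat, df = df,
       p.value = pchisq(stat, df, lower.tail = FALSE))
}

dat2 <- xtabs(Freq ~ Gender + Admit + Dept, as.data.frame(UCBAdmissions))
res  <- breslow_day(dat2)
res
```

The result can be compared with the `$OR.homog` output quoted in the message above.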
Re: [R] test Breslow-Day for svytable??
On Sat, Sep 1, 2012 at 4:27 AM, David Winsemius dwinsem...@comcast.net wrote:
> [...]
> Using svydesign and svytable I _think_ this is how one would go about constructing a 2 x 2 table:
>
> tbl2 <- svydesign( ~ Gender + Admit + Dept, weights = ~Freq, data = DF)
> (tbl2by2 <- svytable(~ Gender + Admit + Dept, tbl2))
> epiR::epi.2by2(dat = tbl2by2, method = "case.control", conf.level = 0.95,
>                units = 100, homogeneity = "breslow.day", verbose = TRUE)$OR.homog
> #---
>   test.statistic df    p.value
> 1       18.82551  5 0.00207139
>
> (At least I got internal consistency. I see you copied Thomas Lumley, which is a good idea. I'll be happy to get corrected on any point. I'm adding the maintainer of epiR to the recipients.)

Yes, that will give internal consistency from a data structure point of view. It won't give a valid test in real examples, though -- epi.2by2 doesn't know about complex sampling, and what you're passing it is just an estimate of the population 2x2xK table.

What would work, though it's not quite the same as the Breslow-Day test, is to use svyloglin() and do a Rao-Scott test comparing the model with all two-way interactions ~(Gender+Dept+Admit)^2 to the saturated model ~Gender*Dept*Admit.

-thomas

--
Thomas Lumley
Professor of Biostatistics
University of Auckland
Re: [R] test if elements of a character vector contain letters
On Tue, Aug 7, 2012 at 10:26 PM, Marc Schwartz marc_schwa...@me.com wrote:
> since there are alpha-numerics present, whereas the first option will:
>
> > grepl("[^[:alnum:]]", "ab%")
> [1] TRUE
>
> So, use the first option.

And I should start reading more carefully. The above works fine for me. I ended up defining the following wrappers:

is_alpha    <- function(x) {grepl("[[:alpha:]]", x)}   ##Alphabetic characters
is_digit    <- function(x) {grepl("[[:digit:]]", x)}   ##Digits
is_alnum    <- function(x) {grepl("[[:alnum:]]", x)}   ##Alphanumeric characters
is_punct    <- function(x) {grepl("[[:punct:]]", x)}   ##Punctuation characters
is_notalnum <- function(x) {grepl("[^[:alnum:]]", x)}  ##Non-Alphanumeric characters

Thanks again
Liviu
Re: [R] test if elements of a character vector contain letters
On Mon, Aug 6, 2012 at 7:35 PM, Marc Schwartz marc_schwa...@me.com wrote:
> is.letter <- function(x) grepl("[[:alpha:]]", x)
> is.number <- function(x) grepl("[[:digit:]]", x)

Quick follow-up question. I'm always reluctant to create functions that would resemble the method of a function (here, is() ), but would in fact not be a genuine method. So would there be any incompatibility between is() and is.letter(), given that the latter is not a method of the former? Is it good (or acceptable) practice to define is.letter() as above? Would is_letter() be better?

Regards
Liviu
Re: [R] test if elements of a character vector contain letters
On Mon, Aug 6, 2012 at 7:35 PM, Marc Schwartz marc_schwa...@me.com wrote:
> is.letter <- function(x) grepl("[[:alpha:]]", x)
> is.number <- function(x) grepl("[[:digit:]]", x)

Another follow-up. To test for (non-)alphanumeric one would do the following:

> x <- c(letters, 1:26, '+', '-', '%^')
> x[1:10] <- paste(x[1:10], 1:10, sep='')
> x
 [1] "a1"  "b2"  "c3"  "d4"  "e5"  "f6"  "g7"  "h8"  "i9"  "j10" "k"   "l"   "m"   "n"
[15] "o"   "p"   "q"   "r"   "s"   "t"   "u"   "v"   "w"   "x"   "y"   "z"   "1"   "2"
[29] "3"   "4"   "5"   "6"   "7"   "8"   "9"   "10"  "11"  "12"  "13"  "14"  "15"  "16"
[43] "17"  "18"  "19"  "20"  "21"  "22"  "23"  "24"  "25"  "26"  "+"   "-"   "%^"
> xb <- grepl("[[:alnum:]]",x) ##test for alphanumeric chars
> x[xb]
 [1] "a1"  "b2"  "c3"  "d4"  "e5"  "f6"  "g7"  "h8"  "i9"  "j10" "k"   "l"   "m"   "n"
[15] "o"   "p"   "q"   "r"   "s"   "t"   "u"   "v"   "w"   "x"   "y"   "z"   "1"   "2"
[29] "3"   "4"   "5"   "6"   "7"   "8"   "9"   "10"  "11"  "12"  "13"  "14"  "15"  "16"
[43] "17"  "18"  "19"  "20"  "21"  "22"  "23"  "24"  "25"  "26"
> xb <- grepl("[[:punct:]]",x) ##test for non-alphanumeric chars
> x[xb]
[1] "+"  "-"  "%^"

More regex rules are available on the Wiki [1].

Regards
Liviu

[1] http://en.wikipedia.org/wiki/Regular_expression
Re: [R] test if elements of a character vector contain letters
On Tue, Aug 7, 2012 at 4:28 AM, Liviu Andronic landronim...@gmail.com wrote:
> On Mon, Aug 6, 2012 at 7:35 PM, Marc Schwartz marc_schwa...@me.com wrote:
> > is.letter <- function(x) grepl("[[:alpha:]]", x)
> > is.number <- function(x) grepl("[[:digit:]]", x)
>
> Quick follow-up question. I'm always reluctant to create functions that would resemble the method of a function (here, is() ), but would in fact not be a genuine method. So would there be any incompatibility between is() and is.letter(), given that the latter is not a method of the former? Is it good (or acceptable) practice to define is.letter() as above? Would is_letter() be better?

It certainly won't cause problems if you never define anything of class "letter" or "number".
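A small sketch of why the name is mostly a style question here. The object `obj` and its class "letter" are invented for the illustration. Since methods::is() is an ordinary function rather than an S3 generic, it does not look up a function called is.letter() even when an object of class "letter" exists.

```r
# is.letter() as defined in the thread: TRUE where an element contains a letter
is.letter <- function(x) grepl("[[:alpha:]]", x)

# Hypothetical object that happens to carry the S3 class "letter"
obj <- structure("123", class = "letter")

# is() only checks class membership; it never calls is.letter()
class_check <- methods::is(obj, "letter")

# Calling is.letter() directly tests the content for letters
content_check <- is.letter(unclass(obj))

class_check
content_check
is.letter("a1")
```

Using an underscore name such as is_letter() avoids the visual resemblance to a method altogether.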
Re: [R] test if elements of a character vector contain letters
On Aug 7, 2012, at 3:02 PM, Liviu Andronic landronim...@gmail.com wrote:

> On Mon, Aug 6, 2012 at 7:35 PM, Marc Schwartz marc_schwa...@me.com wrote:
> > is.letter <- function(x) grepl("[[:alpha:]]", x)
> > is.number <- function(x) grepl("[[:digit:]]", x)
>
> Another follow-up. To test for (non-)alphanumeric one would do the following:
> > x <- c(letters, 1:26, '+', '-', '%^')
> > x[1:10] <- paste(x[1:10], 1:10, sep='')
> [...]
> > xb <- grepl("[[:punct:]]",x) ##test for non-alphanumeric chars
> > x[xb]
> [1] "+"  "-"  "%^"

That will get you values where punctuation characters are used, but there may be other non-alphanumeric characters in the vector. There may be ASCII control codes, tabs, newlines, CR, LF, spaces, etc. which would not be found by using [:punct:]. For example:

> grepl("[[:punct:]]", " ")
[1] FALSE

If you want to explicitly look for non-alphanumeric characters, you would be better off using a negation of [:alnum:] such as:

grepl("[^[:alnum:]]", x)

or

!grepl("[[:alnum:]]", x)

Regards,

Marc

> More regex rules are available on the Wiki [1].
>
> Regards
> Liviu
>
> [1] http://en.wikipedia.org/wiki/Regular_expression
Re: [R] test if elements of a character vector contain letters
On Aug 7, 2012, at 3:18 PM, Marc Schwartz marc_schwa...@me.com wrote:

> On Aug 7, 2012, at 3:02 PM, Liviu Andronic landronim...@gmail.com wrote:
> > [...]
> > > xb <- grepl("[[:punct:]]",x) ##test for non-alphanumeric chars
> > > x[xb]
> > [1] "+"  "-"  "%^"
>
> That will get you values where punctuation characters are used, but there may be other non-alphanumeric characters in the vector. There may be ASCII control codes, tabs, newlines, CR, LF, spaces, etc. which would not be found by using [:punct:]. For example:
>
> > grepl("[[:punct:]]", " ")
> [1] FALSE
>
> If you want to explicitly look for non-alphanumeric characters, you would be better off using a negation of [:alnum:] such as:
>
> grepl("[^[:alnum:]]", x)
>
> or
>
> !grepl("[[:alnum:]]", x)

Actually (for the second time in two days) I need to correct myself. The second option would not work correctly in cases where there is a mix of alpha-numerics and non:

> !grepl("[[:alnum:]]", "ab%")
[1] FALSE

since there are alpha-numerics present, whereas the first option will:

> grepl("[^[:alnum:]]", "ab%")
[1] TRUE

So, use the first option.

Regards,

Marc
who is heading to the coffee machine...
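The difference between the two options can be seen side by side on a small test vector. The vector and the variable names below are made up for illustration; the third pattern, anchored with ^ and $, is an extra variant not discussed in the thread.

```r
x <- c("abc", "ab%", "12 3", "+-", "")

# At least one non-alphanumeric character somewhere in the element
has_non_alnum <- grepl("[^[:alnum:]]", x)

# No alphanumeric character at all (a different question)
no_alnum <- !grepl("[[:alnum:]]", x)

# Every character is alphanumeric and the element is non-empty
only_alnum <- grepl("^[[:alnum:]]+$", x)

data.frame(x, has_non_alnum, no_alnum, only_alnum)
```

Note how "ab%" and "12 3" are flagged by the first test but not by the second, and how the empty string counts as having no alphanumerics.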
Re: [R] test if elements of a character vector contain letters
On Tue, Aug 7, 2012 at 10:18 PM, Marc Schwartz marc_schwa...@me.com wrote:
> That will get you values where punctuation characters are used, but there may be other non-alphanumeric characters in the vector. There may be ASCII control codes, tabs, newlines, CR, LF, spaces, etc. which would not be found by using [:punct:]. For example:
>
> > grepl("[[:punct:]]", " ")
> [1] FALSE
>
> If you want to explicitly look for non-alphanumeric characters, you would be better off using a negation of [:alnum:] such as:
> [..]
> !grepl("[[:alnum:]]", x)

Good point! Thanks.
Liviu
[R] test if elements of a character vector contain letters
Dear all

I'm pretty sure that I'm approaching the problem in a wrong way. Suppose the following character vector:

> (x[1:10] <- paste(x[1:10], sample(1:10, 10), sep=''))
 [1] "a10" "b7"  "c2"  "d3"  "e6"  "f1"  "g5"  "h8"  "i9"  "j4"
> x
 [1] "a10" "b7"  "c2"  "d3"  "e6"  "f1"  "g5"  "h8"  "i9"  "j4"  "k"   "l"   "m"   "n"
[15] "o"   "p"   "q"   "r"   "s"   "t"   "u"   "v"   "w"   "x"   "y"   "z"   "1"   "2"
[29] "3"   "4"   "5"   "6"   "7"   "8"   "9"   "10"  "11"  "12"  "13"  "14"  "15"  "16"
[43] "17"  "18"  "19"  "20"  "21"  "22"  "23"  "24"  "25"  "26"

How do you test whether the elements of the vector contain at least one letter (or at least one digit) and obtain a logical vector of the same dimension? I came up with the following awkward function:

is_letter <- function(x, pattern=c(letters, LETTERS)){
    sapply(x, function(y){
        any(sapply(pattern, function(z) grepl(z, y, fixed=T)))
    })
}

> is_letter(x)
  a10    b7    c2    d3    e6    f1    g5    h8    i9    j4     k     l     m     n     o
 TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
    p     q     r     s     t     u     v     w     x     y     z     1     2     3     4
 TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE
    5     6     7     8     9    10    11    12    13    14    15    16    17    18    19
FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
   20    21    22    23    24    25    26
FALSE FALSE FALSE FALSE FALSE FALSE FALSE
> is_letter(x, 0:9) ##function slightly misnamed
  a10    b7    c2    d3    e6    f1    g5    h8    i9    j4     k     l     m     n     o
 TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE
    p     q     r     s     t     u     v     w     x     y     z     1     2     3     4
FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE
    5     6     7     8     9    10    11    12    13    14    15    16    17    18    19
 TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
   20    21    22    23    24    25    26
 TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE

Is there a nicer way to do this?

Regards
Liviu

--
Do you know how to read?
http://www.alienetworks.com/srtest.cfm
http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader
Do you know how to write?
http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail
Re: [R] test if elements of a character vector contain letters
nzchar(x) & !is.na(x)

No?

-- Bert

On Mon, Aug 6, 2012 at 9:25 AM, Liviu Andronic landronim...@gmail.com wrote:
> Dear all
> I'm pretty sure that I'm approaching the problem in a wrong way.
> [...]
> How do you test whether the elements of the vector contain at least one letter (or at least one digit) and obtain a logical vector of the same dimension? I came up with the following awkward function:
>
> is_letter <- function(x, pattern=c(letters, LETTERS)){
>     sapply(x, function(y){
>         any(sapply(pattern, function(z) grepl(z, y, fixed=T)))
>     })
> }
> [...]
> Is there a nicer way to do this?
>
> Regards
> Liviu

--
Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
Re: [R] test if elements of a character vector contain letters
Hello,

Fun as an exercise in vectorization. 30 times faster. Don't look, guess.

Gave it up? Ok, here it is.

is_letter <- function(x, pattern=c(letters, LETTERS)){
    sapply(x, function(y){
        any(sapply(pattern, function(z) grepl(z, y, fixed=T)))
    })
}

# test ascii codes, just one loop.
has_letter <- function(x){
    sapply(x, function(y){
        y <- as.integer(charToRaw(y))
        any((65 <= y & y <= 90) | (97 <= y & y <= 122))
    })
}

x <- c(letters, 1:26)
x[1:10] <- paste(x[1:10], sample(1:10, 10), sep='')
x <- rep(x, 1e3)

t1 <- system.time(is_letter(x))
t2 <- system.time(has_letter(x))
rbind(t1, t2, t1/t2)
   user.self sys.self elapsed user.child sys.child
t1     15.69        0   15.74         NA        NA
t2      0.50        0    0.50         NA        NA
       31.38      NaN   31.48         NA        NA

On 06-08-2012 17:25, Liviu Andronic wrote:
> Dear all
> I'm pretty sure that I'm approaching the problem in a wrong way.
> [...]
> How do you test whether the elements of the vector contain at least one letter (or at least one digit) and obtain a logical vector of the same dimension? I came up with the following awkward function:
> [...]
> Is there a nicer way to do this?
>
> Regards
> Liviu
Re: [R] test if elements of a character vector contain letters
On 08/06/2012 09:51 AM, Rui Barradas wrote:
> Hello,
>
> Fun as an exercise in vectorization. 30 times faster. Don't look, guess.

> system.time(res0 <- grepl("[[:alpha:]]", x))
   user  system elapsed
  0.060   0.000   0.061
> system.time(res1 <- has_letter(x))
   user  system elapsed
  3.728   0.008   3.747
> all.equal(res0, res1, check.attributes=FALSE)
[1] TRUE

> Gave it up? Ok, here it is.
>
> [...]
>
> # test ascii codes, just one loop.
> has_letter <- function(x){
>     sapply(x, function(y){
>         y <- as.integer(charToRaw(y))
>         any((65 <= y & y <= 90) | (97 <= y & y <= 122))
>     })
> }
>
> x <- c(letters, 1:26)
> x[1:10] <- paste(x[1:10], sample(1:10, 10), sep='')
> x <- rep(x, 1e3)
> [...]

--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793
Re: [R] test if elements of a character vector contain letters
Perhaps I am missing something, but why use sapply() when grepl() is already vectorized?

is.letter <- function(x) grepl("[[:alpha:]]", x)
is.number <- function(x) grepl("[[:digit:]]", x)

x <- c(letters, 1:26)
x[1:10] <- paste(x[1:10], sample(1:10, 10), sep='')
x <- rep(x, 1e3)

> str(x)
 chr [1:52000] "a2" "b10" "c8" "d3" "e6" "f1" "g5" ...

> system.time(is.letter(x))
   user  system elapsed
  0.011   0.000   0.010

> system.time(is.number(x))
   user  system elapsed
  0.010   0.000   0.011

Regards,

Marc Schwartz

On Aug 6, 2012, at 11:51 AM, Rui Barradas ruipbarra...@sapo.pt wrote:

> Hello,
>
> Fun as an exercise in vectorization. 30 times faster. Don't look, guess.
>
> Gave it up? Ok, here it is.
>
> [...]
>
> # test ascii codes, just one loop.
> has_letter <- function(x){
>     sapply(x, function(y){
>         y <- as.integer(charToRaw(y))
>         any((65 <= y & y <= 90) | (97 <= y & y <= 122))
>     })
> }
> [...]
Re: [R] test if elements of a character vector contain letters
Hi,

Not sure whether this is what you wanted.

  x <- letters
  (x[1:10] <- paste(x[1:10], sample(1:10, 10), sep=''))
  x1 <- c(x, 1:26)
  x1
   [1] "a4"  "b3"  "c5"  "d2"  "e9"  "f6"  "g1"  "h8"  "i10" "j7"  "k"   "l"
  [13] "m"   "n"   "o"   "p"   "q"   "r"   "s"   "t"   "u"   "v"   "w"   "x"
  [25] "y"   "z"   "1"   "2"   "3"   "4"   "5"   "6"   "7"   "8"   "9"   "10"
  [37] "11"  "12"  "13"  "14"  "15"  "16"  "17"  "18"  "19"  "20"  "21"  "22"
  [49] "23"  "24"  "25"  "26"

  grepl("^[[:alpha:]][[:digit:]]", x1)
   [1]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE
  [13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
  [25] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
  [37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
  [49] FALSE FALSE FALSE FALSE

A.K.

----- Original Message -----
From: Liviu Andronic landronim...@gmail.com
To: r-help@r-project.org Help r-help@r-project.org
Cc:
Sent: Monday, August 6, 2012 12:25 PM
Subject: [R] test if elements of a character vector contain letters

> [...]
> How do you test whether the elements of the vector contain at least
> one letter (or at least one digit) and obtain a logical vector of the
> same dimension?
> [...]
> Is there a nicer way to do this?
>
> Regards
> Liviu
Re: [R] test if elements of a character vector contain letters
On Aug 6, 2012, at 12:06 PM, Marc Schwartz marc_schwa...@me.com wrote:

> Perhaps I am missing something, but why use sapply() when grepl() is already vectorized?
>
>   is.letter <- function(x) grepl("[:alpha:]", x)
>   is.number <- function(x) grepl("[:digit:]", x)

Sorry, typos in the above from my CP. Should be:

  is.letter <- function(x) grepl("[[:alpha:]]", x)
  is.number <- function(x) grepl("[[:digit:]]", x)

Marc

> [...]
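A quick way to convince oneself that the vectorized call gives the same answer as the two looped versions is to run all three on a small vector. The sketch below copies is_letter and has_letter from the thread and uses Marc's corrected pattern; the test vector and the res_* names are made up for illustration.

```r
# Liviu's original nested-sapply version, as posted in the thread.
is_letter <- function(x, pattern = c(letters, LETTERS)) {
  sapply(x, function(y) {
    any(sapply(pattern, function(z) grepl(z, y, fixed = TRUE)))
  })
}

# Rui's version: one loop over elements, testing ASCII codes.
has_letter <- function(x) {
  sapply(x, function(y) {
    y <- as.integer(charToRaw(y))
    any((65 <= y & y <= 90) | (97 <= y & y <= 122))
  })
}

# Marc's corrected version: a single vectorized grepl() call.
is.letter <- function(x) grepl("[[:alpha:]]", x)

# Illustrative test vector (not the one from the thread).
x <- c("a10", "b7", "k", "Z", "1", "26")

# sapply() returns a named vector, so drop the names before comparing.
res_nested <- unname(is_letter(x))
res_ascii  <- unname(has_letter(x))
res_grepl  <- is.letter(x)

print(rbind(res_nested, res_ascii, res_grepl))
```

All three agree on plain ASCII input like this; the grepl() version is simply the shortest and avoids the per-element function calls.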
Re: [R] test if elements of a character vector contain letters
Only an extra set of brackets:

  is.letter <- function(x) grepl("[[:alpha:]]", x)
  is.number <- function(x) grepl("[[:digit:]]", x)

Without them, the functions are fast, but wrong.

  x
   [1] "a8"  "b5"  "c10" "d1"  "e6"  "f2"  "g4"  "h3"  "i7"  "j9"  "k"   "l"
  [13] "m"   "n"   "o"   "p"   "q"   "r"   "s"   "t"   "u"   "v"   "w"   "x"
  [25] "y"   "z"   "1"   "2"   "3"   "4"   "5"   "6"   "7"   "8"   "9"   "10"
  [37] "11"  "12"  "13"  "14"  "15"  "16"  "17"  "18"  "19"  "20"  "21"  "22"
  [49] "23"  "24"  "25"  "26"

  is.letter <- function(x) grepl("[:alpha:]", x)
  is.letter(x)
   [1]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE  TRUE
  [13] FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
  [25] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
  [37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
  [49] FALSE FALSE FALSE FALSE

  is.letter <- function(x) grepl("[[:alpha:]]", x)
  is.letter(x)
   [1]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
  [13]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
  [25]  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
  [37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
  [49] FALSE FALSE FALSE FALSE

--
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of Marc Schwartz
Sent: Monday, August 06, 2012 12:07 PM
To: Rui Barradas
Cc: r-help
Subject: Re: [R] test if elements of a character vector contain letters

> Perhaps I am missing something, but why use sapply() when grepl() is already vectorized?
>
>   is.letter <- function(x) grepl("[:alpha:]", x)
>   is.number <- function(x) grepl("[:digit:]", x)
> [...]
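The pattern of TRUEs in the single-bracket output above (a, h, l, p) is what one would expect if "[:alpha:]" is read as an ordinary bracket expression listing the literal characters ':', 'a', 'l', 'p' and 'h'. A small sketch that makes this visible, assuming R's default regex engine (with perl = TRUE the single-bracket form may be rejected instead); the test vector and the names single and double are only for illustration:

```r
# Illustrative vector: some elements contain one of ':', 'a', 'l', 'p', 'h',
# others contain different letters or only digits.
x <- c("a8", "b5", "h3", "k", "l", "p", ":", "1", "26")

# Single brackets: treated as a set of literal characters.
single <- grepl("[:alpha:]", x)

# Double brackets: the POSIX class [:alpha:] inside a bracket expression.
double <- grepl("[[:alpha:]]", x)

print(data.frame(x = x, single = single, double = double))
```

Note that "b5" and "k" contain letters but are missed by the single-bracket form, while ":" is matched even though it is not a letter.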
Re: [R] test if elements of a character vector contain letters
On Mon, Aug 6, 2012 at 6:42 PM, Bert Gunter gunter.ber...@gene.com wrote:
> nzchar(x) & !is.na(x)
>
> No?

It doesn't work for what I need:

  x
   [1] "a10" "b8"  "c9"  "d2"  "e3"  "f4"  "g1"  "h7"  "i6"  "j5"  "k"   "l"   "m"   "n"
  [15] "o"   "p"   "q"   "r"   "s"   "t"   "u"   "v"   "w"   "x"   "y"   "z"   "1"   "2"
  [29] "3"   "4"   "5"   "6"   "7"   "8"   "9"   "10"  "11"  "12"  "13"  "14"  "15"  "16"
  [43] "17"  "18"  "19"  "20"  "21"  "22"  "23"  "24"  "25"  "26"

  nzchar(x) & !is.na(x)
   [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
  [18] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
  [35] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
  [52] TRUE

I need to have TRUE when an element contains a letter, and FALSE when an element contains only numbers. The above returns TRUE for the entire vector.

Regards
Liviu

> On Mon, Aug 6, 2012 at 9:25 AM, Liviu Andronic landronim...@gmail.com wrote:
>> [...]
>> How do you test whether the elements of the vector contain at least
>> one letter (or at least one digit) and obtain a logical vector of the
>> same dimension?
>> [...]

--
Do you know how to read?
http://www.alienetworks.com/srtest.cfm
http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader
Do you know how to write?
http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail
Re: [R] test if elements of a character vector contain letters
You probably mean grepl('[a-zA-Z]', x)

Regards,
Yihui
--
Yihui Xie xieyi...@gmail.com
Phone: 515-294-2465 Web: http://yihui.name
Department of Statistics, Iowa State University
2215 Snedecor Hall, Ames, IA

On Mon, Aug 6, 2012 at 3:29 PM, Liviu Andronic landronim...@gmail.com wrote:
> On Mon, Aug 6, 2012 at 6:42 PM, Bert Gunter gunter.ber...@gene.com wrote:
>> nzchar(x) & !is.na(x)
>>
>> No?
>
> It doesn't work for what I need:
> [...]
> I need to have TRUE when an element contains a letter, and FALSE when
> an element contains only numbers. The above returns TRUE for the
> entire vector.
>
> Regards
> Liviu
Re: [R] test if elements of a character vector contain letters
On Mon, Aug 6, 2012 at 7:35 PM, Marc Schwartz marc_schwa...@me.com wrote:
>   is.letter <- function(x) grepl("[[:alpha:]]", x)
>   is.number <- function(x) grepl("[[:digit:]]", x)

This does exactly what I wanted:

  x
   [1] "a10" "b8"  "c9"  "d2"  "e3"  "f4"  "g1"  "h7"  "i6"  "j5"  "k"   "l"   "m"   "n"
  [15] "o"   "p"   "q"   "r"   "s"   "t"   "u"   "v"   "w"   "x"   "y"   "z"   "1"   "2"
  [29] "3"   "4"   "5"   "6"   "7"   "8"   "9"   "10"  "11"  "12"  "13"  "14"  "15"  "16"
  [43] "17"  "18"  "19"  "20"  "21"  "22"  "23"  "24"  "25"  "26"

  xb <- grepl("[[:alpha:]]", x)
  x[xb]  ## extract all vector elements that contain a letter
   [1] "a10" "b8"  "c9"  "d2"  "e3"  "f4"  "g1"  "h7"  "i6"  "j5"  "k"   "l"   "m"   "n"
  [15] "o"   "p"   "q"   "r"   "s"   "t"   "u"   "v"   "w"   "x"   "y"   "z"

  xb <- grepl("[[:digit:]]", x)
  x[xb]  ## extract all vector elements that contain a digit
   [1] "a10" "b8"  "c9"  "d2"  "e3"  "f4"  "g1"  "h7"  "i6"  "j5"  "1"   "2"   "3"   "4"
  [15] "5"   "6"   "7"   "8"   "9"   "10"  "11"  "12"  "13"  "14"  "15"  "16"  "17"  "18"
  [29] "19"  "20"  "21"  "22"  "23"  "24"  "25"  "26"

Thanks all for the suggestions!

Regards
Liviu
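Since elements such as "a10" satisfy both tests, it can be handy to combine the two logical vectors into a single label per element. The following is only a sketch built on the two patterns from the thread; the vector and the names has_alpha, has_digit, only_digits and kind are hypothetical.

```r
# Illustrative input, including an empty string as an edge case.
x <- c("a10", "k", "7", "26", "")

has_alpha <- grepl("[[:alpha:]]", x)   # contains at least one letter
has_digit <- grepl("[[:digit:]]", x)   # contains at least one digit

# Anchored pattern: the whole element consists of digits and nothing else.
only_digits <- grepl("^[[:digit:]]+$", x)

# One label per element, derived from the two "contains" tests.
kind <- ifelse(has_alpha & has_digit, "mixed",
        ifelse(has_alpha, "letters",
        ifelse(has_digit, "digits", "neither")))

print(data.frame(x = x, has_alpha, has_digit, only_digits, kind))
```

The anchored pattern is the stricter test: it is FALSE for "a10" even though that element contains digits.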
Re: [R] test parallel slopes with svyolr
On Sun, Jul 8, 2012 at 2:32 AM, Diana Marcela Martinez Ruiz dianamm...@hotmail.com wrote:
> Hello, I would like to know how to test the assumption of proportional
> odds or parallel lines or slopes for an ordinal logistic regression
> with svyolr

I wouldn't, but if someone finds a clear reference I'd be prepared to implement it anyway.

-thomas

--
Thomas Lumley
Professor of Biostatistics
University of Auckland
[R] test parallel slopes with svyolr
Hello,

I would like to know how to test the assumption of proportional odds (parallel lines or slopes) for an ordinal logistic regression fitted with svyolr.

Thanks
Re: [R] Test Binary File
As an alternative to the hexView package, an external hex editor may help you investigate how the data is organised.
[R] Test if a sample mean of integers with range -inf; inf is different from zero
Hi all,

how would you test if a sample mean of integers with range -inf;inf is different from zero:

  # my sample of integers:
  c <- c(-3, -1, 0, 1, 0, 3, 4, 10, 12)

  # is mean of c 0?:
  mean(c)

Thanks,
Kay
Re: [R] Test if a sample mean of integers with range -inf; inf is different from zero
  mean(c) != 0

But if you mean in a statistical sense... t.test() is one possibility.

Michael

On Fri, May 4, 2012 at 5:29 AM, Kay Cichini kay.cich...@gmail.com wrote:
> Hi all,
>
> how would you test if a sample mean of integers with range -inf;inf is
> different from zero:
>
>   # my sample of integers:
>   c <- c(-3, -1, 0, 1, 0, 3, 4, 10, 12)
>
>   # is mean of c 0?:
>   mean(c)
>
> Thanks,
> Kay
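For completeness, here is roughly what the t.test() suggestion looks like on Kay's data. This is only a sketch: it assumes a one-sample t-test is appropriate for these values (which is a modelling decision, not something the data alone settles), and it stores the sample in x rather than c to avoid masking the c() function.

```r
# Kay's sample, stored under a name that does not clash with c().
x <- c(-3, -1, 0, 1, 0, 3, 4, 10, 12)

# Exact check: is the sample mean itself different from zero?
mean(x) != 0

# One-sample, two-sided t-test of H0: true mean = 0.
tt <- t.test(x, mu = 0)
print(tt)

# Individual pieces of the result, if needed programmatically.
tt$estimate    # sample mean
tt$parameter   # degrees of freedom (n - 1)
tt$p.value     # two-sided p-value
```

With only nine observations the result leans heavily on the approximate-normality assumption behind the t-test.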
Re: [R] Test if a sample mean of integers with range -inf; inf is different from zero
On Fri, May 04, 2012 at 11:29:51AM +0200, Kay Cichini wrote:
> Hi all,
>
> how would you test if a sample mean of integers with range -inf;inf is
> different from zero:
>
>   # my sample of integers:
>   c <- c(-3, -1, 0, 1, 0, 3, 4, 10, 12)
>
>   # is mean of c 0?:
>   mean(c)

Hi.

It is better to use a name for the vector other than c, since c is also a function, which you use as well.

Testing whether the sample mean is zero is simple, since one can use

  mean(c) == 0

or

  sum(c) == 0

which are equivalent even in inexact computer arithmetic.

So, I think, you are asking for a statistical test of whether the true distribution mean is zero, on the basis of a sample. Testing this requires some additional information on the distribution.

If we do not know anything about the distribution except that the values are integers, then the sample mean can be arbitrarily large even if the distribution mean is zero. Consider, for example, a uniform distribution on {-M, M} for some very large integer M. Observing a large sample mean does not allow us to reject the null hypothesis at any level, since a large mean may have large probability even if the null hypothesis is true.

If there is no bound on the values, then testing anything concerning the mean may not be possible, since the expected value may not exist. Do you have a reason to think that the true distribution has an expected value?

An example of an integer random variable without an expected value is s*X, where s is uniform on {-1, 1} and X has value 2^i with probability 2^-i for each positive integer i.

Hope this helps.

Petr Savicky.
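The last example can be explored by simulation. For X taking the value 2^i with probability 2^-i, every term of the sum defining E|s*X| equals 2^i * 2^-i = 1, so the sum diverges and the expected value does not exist. The sketch below draws from this distribution and tracks the running sample mean; the seed, the sample size and the variable names are arbitrary choices for illustration.

```r
set.seed(1)
n <- 1e5

# rgeom() counts failures before the first success (support 0, 1, 2, ...),
# so with prob = 0.5, i = rgeom + 1 satisfies P(i = k) = 2^-k for k >= 1.
i <- rgeom(n, prob = 0.5) + 1
x <- 2^i

# Random sign, uniform on {-1, 1}.
s <- sample(c(-1, 1), n, replace = TRUE)
z <- s * x

# Running sample mean after 1, 2, ..., n observations.
running_mean <- cumsum(z) / seq_len(n)

# Inspect the running mean at a few checkpoints.
print(running_mean[c(10, 100, 1000, 10000, 100000)])

# The largest single draw shows how much one observation can move the mean.
print(max(abs(z)))
```

Rerunning with different seeds typically shows the running mean jumping whenever a very large power of two is drawn, which is the practical face of the missing expected value.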