[R] Avoid text wrapping of output in R console
Hello!

I am working on a statistical package called VOStat (http://vo.iucaa.ernet.in/~voi/VOStat.html), which uses a Java-based GUI to collect data and parameters from the user. Based on the inputs, an appropriate R script is generated and executed in the R console.

As an example, consider the output to be a data frame. The data frame is printed in a well-formatted way in the R console, but the formatting is lost when the output is captured in a text file, which the VOStat GUI later displays. I can format the output in Java by displaying it in tabular form with grid lines, but I run into problems when R wraps the output onto a new line, for instance when the number of columns is large. A trivial example:

    new_df <- data.frame("League Position"=1, Team="Manchester City", "Games played"=38, "Games won"=28, "Games drawn"=5, "Games lost"=5, "Goals scored"=93, "Goals conceded"=29, "Goal difference"=64, Points=89)
    print(new_df, row.names=FALSE)

     League.Position            Team Games.played Games.won Games.drawn Games.lost Goals.scored
                   1 Manchester City           38        28           5          5           93
     Goals.conceded Goal.difference Points
                 29              64     89

So my question is whether there is a way to prevent R from wrapping the output, so that all columns of a row can be displayed on a single line in the console, or whether I should start thinking of alternative ways to do the formatting.

Many thanks for your help.

Regards
Tejas Kale
IUCAA, Pune

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
[R] error in install R
Dear All,

I installed RStudio Server on CentOS, but I cannot run it. I see this error:

    Stopping rstudio-server: [ OK ]
    rserver[16661]: ERROR Unable to find libR.so in expected locations within R Home directory /usr/local/lib/R; LOGGED FROM: bool core::r_util::unnamed::detectRLocationsUsingR(const std::string, core::FilePath*, core::FilePath*, core::config_utils::Variables*, std::string*) /root/rstudio/src/cpp/core/r_util/REnvironmentPosix.cpp:532
    rserver[16661]: ERROR R shared library (/usr/local/lib/R/lib/libR.so) not found. If this is a custom build of R, was it built with the --enable-R-shlib option?; LOGGED FROM: bool core::r_util::unnamed::validateREnvironment(const core::r_util::EnvironmentVars, const core::FilePath, std::string*) /root/rstudio/src/cpp/core/r_util/REnvironmentPosix.cpp:357
    R shared library (/usr/local/lib/R/lib/libR.so) not found. If this is a custom build of R, was it built with the --enable-R-shlib option?
    Starting rstudio-server: [ OK ]

Please guide me.

Best Wishes,
Soheila
Re: [R] Avoid text wrapping of output in R console
On 2012-05-28 08:12, Tejas Kale wrote:
> So my question is whether there is a way to prevent R from wrapping the output so that all columns of a row can displayed in a single line in the console or should I start thinking of alternate ways to do the formatting?

This should do:

    options(width=1)

--
Best regards,
Krzysztof Mitko
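A note on the suggestion above: according to the documentation for options(), valid values for width run from 10 to 10000, so it is a value near the upper limit that keeps a wide data frame on one line. The following is a minimal sketch, assuming the column names from the original post are quoted strings (they contain spaces) and using capture.output() as a stand-in for however VOStat captures the console text.

```r
# Data frame from the original question; check.names = TRUE (the default)
# turns "League Position" into League.Position, matching the posted output.
new_df <- data.frame("League Position" = 1, Team = "Manchester City",
                     "Games played" = 38, "Games won" = 28, "Games drawn" = 5,
                     "Games lost" = 5, "Goals scored" = 93,
                     "Goals conceded" = 29, "Goal difference" = 64,
                     Points = 89)

# Widen the print width for this call only, then restore the old setting.
old <- options(width = 10000)
wide_lines <- capture.output(print(new_df, row.names = FALSE))
options(old)

# For comparison: at a narrow width the columns are split into several blocks.
old <- options(width = 60)
wrapped_lines <- capture.output(print(new_df, row.names = FALSE))
options(old)

length(wide_lines)     # one header line plus one data row
length(wrapped_lines)  # more lines, because the print is wrapped
```

Restoring the previous value with options(old) keeps the change from leaking into the rest of a generated script.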
Re: [R] Factanal fits
On 28/05/2012 02:20, Hunsicker, Lawrence wrote:
> Greetings, all:
>
> I am using factanal in R. When I enter a matrix or a formula, the print method winds up with something like this:
>
>   Test of the hypothesis that 6 factors are sufficient.
>   The chi square statistic is 28.1 on 22 degrees of freedom.
>   The p-value is 0.172
>
> But when I enter a covmat, the print method winds up with something like this:
>
>   The degrees of freedom for the model is 22 and the fit was 0.0904
>
> The actual factanal print method is suppressed, so I can't figure out how the two calculations are done, or how they relate to one another. Can any of you help?

No, it is not. You can find it by getS3method, for example.

> Many thanks in advance for any insight any of you can give me.

To do the tests you need the number of observations. I expect you used 'covmat' incorrectly, but you were too unhelpful to actually show us what you did.

> Larry Hunsicker
> Professor, Internal Medicine, U. Iowa College of Medicine
>
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

And that means YOU. No HTML (as we asked), and include reproducible code for what you did and claim does not work as you want.

--
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
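The two points in this reply (the print method can be retrieved, and the test needs the number of observations) can be illustrated with a short sketch. It uses the built-in ability.cov data set rather than the poster's data, which was never shown, so the numbers will not match the post; the object names are only illustrative.

```r
# ability.cov is a list with components cov, center and n.obs,
# so factanal() can pick up the number of observations from it.
fit_list <- factanal(factors = 2, covmat = ability.cov)

# A bare covariance matrix carries no sample size: no test is computed.
fit_matrix <- factanal(factors = 2, covmat = ability.cov$cov)

# Supplying n.obs alongside the bare matrix restores the test.
fit_matrix_n <- factanal(factors = 2, covmat = ability.cov$cov,
                         n.obs = ability.cov$n.obs)

is.null(fit_matrix$STATISTIC)    # TRUE: only "degrees of freedom ... fit" is printed
fit_matrix_n$STATISTIC           # chi-square statistic
fit_matrix_n$PVAL                # its p-value

# The print method is not exported, but it can be retrieved for inspection.
print_method <- getS3method("print", "factanal")
```

Printing fit_matrix and fit_matrix_n side by side shows the two output forms described in the question.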
Re: [R] rpart space in column names
It isn't easy to write code that works with column names that have spaces. You could rewrite rpart, or just rename the columns in your data frame to work around the bug. See ?names.

---
Jeff Newmiller
DCN: jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.

Raji raji.sanka...@gmail.com wrote:
> Hi,
>
> Our data has column names with spaces in them. The names in the data frame are:
>
>   [1] "Sepal Length" "Sepal Width" "Petal Length" "Petal Width" "Species"
>
> When I try to use the column names in the rpart function, it gives the following error:
>
>   rp <- rpart(as.factor(`Species`)~`Sepal Length`)
>   Error in `[.data.frame`(frame, predictors) : undefined columns selected
>
> But a similar call works for the kmeans/nnet functions. For example:
>
>   nn <- nnet(as.factor(`Species`)~`Sepal Length`,size=3)
>
> Is there any way in which column names with spaces can be used in the rpart function, as they are in the nnet/kmeans functions?
>
> Thanks in advance for your help,
> Raji
Re: [R] rpart space in column names
On 28/05/2012 08:27, Jeff Newmiller wrote:
> It isn't easy to write code that works with column names that have spaces. You could rewrite rpart, or just rename the columns in your data frame to work around the bug. See ?names.

In any case, rpart pre-dates the `` notation that made this possible. Note that this looks very like the iris data set, which does have syntactic names. Not that we have the reproducible code the posting guide asked for.

From ?formula:

  Variable names can be quoted by backticks ‘`like this`’ in formulae, although there is no guarantee that all code using formulae will accept such non-syntactic names.

> Raji raji.sanka...@gmail.com wrote:
>> Hi,
>>
>> Our data has column names with spaces in them. The names in the data frame are:
>>
>>   [1] "Sepal Length" "Sepal Width" "Petal Length" "Petal Width" "Species"
>>
>> When I try to use the column names in the rpart function, it gives the following error:
>>
>>   rp <- rpart(as.factor(`Species`)~`Sepal Length`)

You don't need `` unless the name is non-syntactic, e.g. contains a space.

>>   Error in `[.data.frame`(frame, predictors) : undefined columns selected
>>
>> But a similar call works for the kmeans/nnet functions. For example:
>>
>>   nn <- nnet(as.factor(`Species`)~`Sepal Length`,size=3)
>>
>> Is there any way in which column names with spaces can be used in the rpart function, as they are in the nnet/kmeans functions?
>>
>> Thanks in advance for your help,
>> Raji

--
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
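The renaming workaround suggested in this thread could look roughly like the sketch below. It assumes the poster's data resembles iris with spaces in the column names (the actual data was never posted), and it uses make.names() as one possible way to produce syntactic names; the rpart call is guarded in case the package is not installed.

```r
# Stand-in for the poster's data: iris with the spaced names from the post.
df <- iris
names(df) <- c("Sepal Length", "Sepal Width", "Petal Length",
               "Petal Width", "Species")

# make.names() converts non-syntactic names to syntactic ones
# (spaces become dots); unique = TRUE guards against collisions.
names(df) <- make.names(names(df), unique = TRUE)
names(df)

# With syntactic names the formula needs no backticks.
if (requireNamespace("rpart", quietly = TRUE)) {
  rp <- rpart::rpart(Species ~ Sepal.Length, data = df)
  print(class(rp))
}
```

If the original spaced names are needed for display later, they can be kept in a separate character vector before renaming.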
Re: [R] How to change width of bar when there are very few bars?
On 05/28/2012 02:44 AM, Manish Gupta wrote:
> Thanks it works! How can i make horizontal bar graph using barp?

Hi Manish,
The sad fact is that I haven't gotten around to coding it yet.

Jim
[R] import contingency table
hello everyone,

I often work on contingency tables that I create from a data.frame (with the table() function), but a friend sent me an Excel sheet which *already is* a contingency table (just a simple 2-way table!).

Any clue on how to import it into R (keeping row names and column names)? Every tutorial I come across only mentions the table transformation, never the import of such data. I only found read.ftable() but couldn't get it to work.

Any help appreciated,

Sylv
Re: [R] import contingency table
Hello,

Try function read.xls in library gdata.

Also, a good way of avoiding such doubts is

    library(sos)
    findFn('xls')

It returns read.xls as the first line.

Hope this helps,

Rui Barradas
Re: [R] import contingency table
Thanks Rui,

but my problem is not reading an xls file (I have already converted it to csv), but rather reading a contingency table into R and telling R that it is actually a contingency table, and not a data.frame...

File below, if it helps...

Sylv

,AUC,Alin,BLG,BrDep,CRF,CrfMkt,CAS,Casto,Confo,ElecDep,Geant,Halle,KIA,LerMrl,Match,METRO,MNP,SimpMkt
Strasbg,4,0,0,2,3,0,0,6,2,1,2,1,0,2,3,2,3,6
Paris,0,0,0,0,10,1,5,2,4,0,5,1,0,0,0,3,7,7
Brest,3,0,0,2,8,0,5,9,4,0,5,0,2,0,0,0,0,0
Lyon,0,0,0,1,4,2,8,2,3,0,5,1,0,0,0,0,4,5
Nice,3,0,0,0,3,2,5,1,2,0,2,0,0,0,0,2,2,0
Limg,3,0,0,1,4,2,3,0,0,0,3,0,0,0,0,1,0,4
Toulse,0,0,0,1,5,4,3,2,2,0,5,0,0,0,0,2,1,5
Nancy,0,0,0,2,3,1,1,8,2,0,2,0,1,0,2,3,2,4
Lille,0,0,0,0,6,8,0,0,2,2,3,1,0,1,5,1,2,6
Mtplier,0,0,0,0,7,3,4,1,0,1,4,0,0,0,0,1,6,3
Aix,0,4,0,0,9,2,5,1,0,0,5,0,0,0,0,1,7,5
Senart,0,0,0,1,10,3,5,0,5,0,6,0,0,0,0,0,3,3
Grenbl,0,0,0,0,3,2,5,3,1,0,5,0,0,0,0,0,0,4
Angers,0,0,0,2,8,0,4,0,4,0,4,0,2,0,0,0,3,3
Brdx,3,0,0,2,4,3,3,0,1,0,5,0,2,0,0,1,3,4
Dijon,0,0,0,1,8,2,5,3,4,0,5,0,0,0,0,2,1,0
Rouen,3,0,0,1,2,0,2,0,3,1,2,1,2,0,0,0,0,6
Re: [R] import contingency table
no, the problem is that the lines in my file do not correspond to individuals but to a variable, just like the columns. My file is already a contingency table, with each cell being a frequency. Here is a sample of it:

***
,AUC,Alin,BLG,BrDep,CRF,CMkt,CAS,Casto,Confo,ElDep,Geant,Halle,KIA,LMrl,Match,MET,MNP,SM,
Strasbg,4,0,0,2,3,0,0,6,2,1,2,1,0,2,3,2,3,6
Paris,0,0,0,0,10,1,5,2,4,0,5,1,0,0,0,3,7,7
Brest,3,0,0,2,8,0,5,9,4,0,5,0,2,0,0,0,0,0
Lyon,0,0,0,1,4,2,8,2,3,0,5,1,0,0,0,0,4,5
Nice,3,0,0,0,3,2,5,1,2,0,2,0,0,0,0,2,2,0
Limg,3,0,0,1,4,2,3,0,0,0,3,0,0,0,0,1,0,4
Toulse,0,0,0,1,5,4,3,2,2,0,5,0,0,0,0,2,1,5
Nancy,0,0,0,2,3,1,1,8,2,0,2,0,1,0,2,3,2,4
Lille,0,0,0,0,6,8,0,0,2,2,3,1,0,1,5,1,2,6
Mtplier,0,0,0,0,7,3,4,1,0,1,4,0,0,0,0,1,6,3
Aix,0,4,0,0,9,2,5,1,0,0,5,0,0,0,0,1,7,5
Senart,0,0,0,1,10,3,5,0,5,0,6,0,0,0,0,0,3,3
Grenbl,0,0,0,0,3,2,5,3,1,0,5,0,0,0,0,0,0,4
Angers,0,0,0,2,8,0,4,0,4,0,4,0,2,0,0,0,3,3
Brdx,3,0,0,2,4,3,3,0,1,0,5,0,2,0,0,1,3,4
Dijon,0,0,0,1,8,2,5,3,4,0,5,0,0,0,0,2,1,0
Rouen,3,0,0,1,2,0,2,0,3,1,2,1,2,0,0,0,0,6
**

I know how to read it into a df or a matrix; if it were a df or matrix, I could turn it into a table, but this is already a contingency table.

For example, the first number, 4, is the number of people in the city Strasbg (first row) who work at AUC (first column) (this is Auchan, actually).

I do not have the original file where each row would be an individual. I just have that flat file, with variables on the rows and variables on the columns, and frequencies in each cell, and I wonder how to read it into R while telling it that this is a frequency/contingency table.

I can't believe there is no way of getting around it (or maybe the sun struck too heavily on my head).

Sylv

2012/5/28 Nicolas Iderhoff nicolasiderh...@googlemail.com:
> Wouldn't it work for you to read the data into a matrix/df so you can transform it into a table()? If you're worried about the names of the cols/rows, you can always do read.table(..)[,1] to get the row names, for example, and put them into the matrix with rownames().
>
> On 28.05.2012 at 13:49, sylvain willart wrote:
>> There is no indication in ?table on how to read in a contingency table (only on how to transform a data frame or matrix into a contingency table). When I read my file with read.table() and run is.table(), I get FALSE for an answer, and the function as.table() leads to an error message.
>>
>> Sylv
>>
>> 2012/5/28 Nicolas Iderhoff nicolasiderh...@googlemail.com:
>>> Try ?table
[R] rms::cr.setup and Hmisc::fit.mult.impute
I have fitted a proportional odds model, but would like to compare it to a continuation ratio model. However, I am unable to fit the CR model _including_ imputed data. I guess my troubles start with setting up the data for the CR model. Any hint is appreciated!

Christian

    library(Hmisc)
    library(rms)
    library(mice)

    ## simulating data (taken from rms::residuals.lrm)
    set.seed(1)
    n <- 400
    age <- rnorm(n, 50, 10)
    blood.pressure <- rnorm(n, 120, 15)
    L <- .05*(age-50) + .03*(blood.pressure-120)
    p12 <- plogis(L)
    p2 <- plogis(L-1)
    p <- cbind(1-p12, p12-p2, p2)
    cp <- matrix(cumsum(t(p)) - rep(0:(n-1), rep(3,n)), byrow=TRUE, ncol=3)
    y <- (cp < runif(n)) %*% rep(1,3)
    y <- as.vector(y)

    ## generating missing data
    age[1:40] <- NA
    blood.pressure[30:70] <- NA

    ## multiple imputation using mice::mice
    d <- as.matrix(cbind(y, age, blood.pressure))
    imp <- mice(d, seed=123)

    ## some cleanup
    rm(y, age, blood.pressure)
    d <- as.data.frame(d)

    ## proportional odds model
    b <- fit.mult.impute(y~age+blood.pressure, lrm, xtrans=imp, data=d)

    ## continuation ratio model
    attach(d)
    u <- cr.setup(y)
    detach(d)
    attach(d[u$subs,])
    y <- u$y
    cohort <- u$cohort
    c <- lrm(y~cohort*(age+blood.pressure))

    ## CR model with imputed data
    q <- fit.mult.impute(y~cohort*(age+blood.pressure), lrm, xtrans=imp)
    Error in model.frame.default(formula = formula, data = completed.data, :
      variable lengths differ (found for 'cohort')
Re: [R] import contingency table
On Monday 28 May 2012 at 15:19 +0200, sylvain willart wrote:
> no, the problem is that the lines in my file do not correspond to individuals, but are variables, just like are the columns, my file is already a contingency table, with each cell being a frequency
> [...]
> I know how to read it into a df or a matrix, if it was a df or matrix, i could turn it into a table, but this is already a contingency table

If it's already a matrix, just call as.table() on it, and you'll get a table object.

> for example, the first number 4, is the number of people being in city Strasbg (first row) and working at AUC (first column) (this is Auchan actually)
>
> I do not have the original file where each row would be an individual, I just have that flat file, with variables on the rows and variables on the colums, and frequencies in each cell, And I wonder how to read it in R telling him this is a frequency/contingency table
>
> I can't believe there are no way of getting aroud it (or maybe the sun stroke to heavy on my head)

Please call dput() on the data as you have imported it, so that we can precisely discuss the problem.

Regards
[R] need help in logistic regression
Hello everyone,

I tried to understand the relationship between temperature and the death of an organism by using logistic regression.

    glm(formula = Death ~ Temperature, family = binomial(link = logit), data = mydata)

    Coefficients:
                Estimate Std. Error z value Pr(>|z|)
    (Intercept) -87.9161     7.7987  -11.27   <2e-16 ***
    Temperature   2.9532     0.2616   11.29   <2e-16 ***

From the above summary, I understand that

    log odds of death = -87.9161 + 2.9532 * Temperature
    odds = exp(log odds)
    probability = odds / (1 + odds)

Assuming my data are normally distributed (mean = 0, standard deviation = 0.35), and I want to run it for n = 10,000, how do I get from log odds to probability?

Regards,
Eddie
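The conversion asked about here can be sketched in a few lines. The coefficients are copied from the summary in the post; everything else is an assumption made only for illustration: the post does not say what the normal distribution applies to, so the sketch treats it as noise (mean 0, sd 0.35) around a hypothetical temperature of 30.

```r
# Coefficients copied from the posted summary.
b0 <- -87.9161
b1 <- 2.9532

# Assumed setup (not from the post): temperatures scattered around 30.
set.seed(42)
n <- 10000
temperature <- 30 + rnorm(n, mean = 0, sd = 0.35)

# Log odds -> probability. plogis() is the inverse logit,
# i.e. the same as exp(x) / (1 + exp(x)).
log_odds <- b0 + b1 * temperature
prob <- plogis(log_odds)
prob_by_hand <- exp(log_odds) / (1 + exp(log_odds))

all.equal(prob, prob_by_hand)
summary(prob)

# Simulated 0/1 death outcomes, one per temperature.
death <- rbinom(n, size = 1, prob = prob)
mean(death)
```

With the fitted glm object itself, predict(fit, newdata, type = "response") returns the same probabilities without copying coefficients by hand.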
Re: [R] import contingency table
Ok, try the following.

    df2table <- function(x, Var1="Var1", Var2="Var2"){
        tbl <- as.matrix(x)
        dnames <- list(rownames(x), colnames(x))
        names(dnames) <- c(Var1, Var2)
        attr(tbl, "dimnames") <- dnames
        attr(tbl, "class") <- "table"
        tbl
    }

    df2table(xls_contingency)  # using default names

Rui Barradas

On 28-05-2012 15:00, Milan Bouchet-Valat wrote:
> If it's already a matrix, just call as.table() on it, and you'll get a table object.
> [...]
> Please call dput() on the data as you have imported it, so that we can precisely discuss the problem.
Re: [R] import contingency table
How about this?

    exdf <- read.table("clipboard", sep=",", header=T, row.names=1)
    extbl <- as.table(as.matrix(exdf))

--
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352
Re: [R] import contingency table
It works. There goes my clever function.

Rui Barradas

On 28-05-2012 16:31, David L Carlson wrote:

How about this?

    exdf <- read.table("clipboard", sep=",", header=TRUE, row.names=1)
    extbl <- as.table(as.matrix(exdf))

--
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Rui Barradas
Sent: Monday, May 28, 2012 9:43 AM
To: Milan Bouchet-Valat
Cc: r-help
Subject: Re: [R] import contingency table

Ok, try the following.

    df2table <- function(x, Var1="Var1", Var2="Var2"){
        tbl <- as.matrix(x)
        dnames <- list(rownames(x), colnames(x))
        names(dnames) <- c(Var1, Var2)
        attr(tbl, "dimnames") <- dnames
        attr(tbl, "class") <- "table"
        tbl
    }
    df2table(xls_contingency) # using default names

Rui Barradas

On 28-05-2012 15:00, Milan Bouchet-Valat wrote:

On Monday 28 May 2012 at 15:19 +0200, sylvain willart wrote:

> No, the problem is that the lines in my file do not correspond to individuals but are variables, just like the columns. My file is already a contingency table, with each cell being a frequency. Here is a sample of it:
>
> ***
> ,AUC,Alin,BLG,BrDep,CRF,CMkt,CAS,Casto,Confo,ElDep,Geant,Halle,KIA,LMrl,Match,MET,MNP,SM,
> Strasbg,4,0,0,2,3,0,0,6,2,1,2,1,0,2,3,2,3,6
> Paris,0,0,0,0,10,1,5,2,4,0,5,1,0,0,0,3,7,7
> Brest,3,0,0,2,8,0,5,9,4,0,5,0,2,0,0,0,0,0
> Lyon,0,0,0,1,4,2,8,2,3,0,5,1,0,0,0,0,4,5
> Nice,3,0,0,0,3,2,5,1,2,0,2,0,0,0,0,2,2,0
> Limg,3,0,0,1,4,2,3,0,0,0,3,0,0,0,0,1,0,4
> Toulse,0,0,0,1,5,4,3,2,2,0,5,0,0,0,0,2,1,5
> Nancy,0,0,0,2,3,1,1,8,2,0,2,0,1,0,2,3,2,4
> Lille,0,0,0,0,6,8,0,0,2,2,3,1,0,1,5,1,2,6
> Mtplier,0,0,0,0,7,3,4,1,0,1,4,0,0,0,0,1,6,3
> Aix,0,4,0,0,9,2,5,1,0,0,5,0,0,0,0,1,7,5
> Senart,0,0,0,1,10,3,5,0,5,0,6,0,0,0,0,0,3,3
> Grenbl,0,0,0,0,3,2,5,3,1,0,5,0,0,0,0,0,0,4
> Angers,0,0,0,2,8,0,4,0,4,0,4,0,2,0,0,0,3,3
> Brdx,3,0,0,2,4,3,3,0,1,0,5,0,2,0,0,1,3,4
> Dijon,0,0,0,1,8,2,5,3,4,0,5,0,0,0,0,2,1,0
> Rouen,3,0,0,1,2,0,2,0,3,1,2,1,2,0,0,0,0,6
> **
>
> I know how to read it into a df or a matrix. If it were a df or a matrix, I could turn it into a table, but this is already a contingency table.

If it's already a matrix, just call as.table() on it, and you'll get a table object.

> For example, the first number, 4, is the number of people who are in the city Strasbg (first row) and work at AUC (first column; this is Auchan, actually). I do not have the original file where each row would be an individual. I just have that flat file, with variables on the rows, variables on the columns, and frequencies in each cell. I wonder how to read it into R while telling it that this is a frequency/contingency table. I can't believe there is no way of getting around it (or maybe the sun struck too heavily on my head).

Please call dput() on the data as you have imported it, so that we can precisely discuss the problem.

Regards

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
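A minimal sketch of the approach discussed in this thread, assuming the flat file is comma-separated with city names in the first column. The inline sample and the dimension names "City" and "Store" are made up for illustration; the real file would be passed to read.table() by name.

```r
# Hypothetical inline sample standing in for the poster's CSV file
txt <- ",AUC,CRF,Casto
Strasbg,4,3,6
Paris,0,10,2
Brest,3,8,9"

# Read with the first column as row names, then convert matrix -> table
exdf  <- read.table(text = txt, sep = ",", header = TRUE, row.names = 1)
extbl <- as.table(as.matrix(exdf))

# Optional: name the two dimensions (these names are illustrative)
names(dimnames(extbl)) <- c("City", "Store")

print(extbl)
```

Once the object has class "table", functions that expect a contingency table should accept it directly.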
Re: [R] Reading a bunch of csv files into R
OK, a couple of things (I only looked through quickly):

1. R doesn't allow variable names to begin with a number. Be sure you don't try that.
2. What's the overall goal here? Read them in, change the name, then write them out? Let us know and it will be easier to help you.
3. Regardless of your goal, I think you are overthinking the solution. Let us know what you want to accomplish and we can shorten it up, I'm sure.

Bryan

On May 28, 2012, at 11:20 AM, HJ YAN wrote:

Dear Rui, Kevin, Bryan and Nutter

Thank you so much for your very helpful hints! Now I have extracted all the file names and managed to edit them using the code (1)-(4) below, and obtained the name format I wanted:

    (1) files <- list.files(path = "myworking directory", pattern = NULL, all.files = FALSE, full.names = FALSE, recursive = FALSE, ignore.case = FALSE, include.dirs = FALSE)
    (2) filenames <- files[grep("[.]csv", files)]
        [1] "512180_20120523150757.csv" "513687_20120523181947.csv" "513690_20120524112111.csv" "521858_20120524091428.csv" "523215_20120523123419.csv" ...(a few hundred more...)
    (3) data_names <- gsub("[.]csv", "", filenames)
    (4) NAME <- paste("Data", data_names, sep=".")

Up to here I got NAME containing all the names I'm going to use:

    NAME
    [1] "Data.512180_20120523150757" "Data.513687_20120523181947" "Data.513690_20120524112111" "Data.521858_20120524091428" "Data.523215_20120523123419"

But I still haven't successfully read the whole bunch of csv files into R and named them as expected. E.g. I want to read 512180_20120523150757.csv into R and name it Data.512180_20120523150757, and so on. For a single file we can just write

    Data.512180_20120523150757 <- read.csv("512180_20120523150757.csv")

If any of the following commands (as you suggested) worked, my question would be sorted out. But I got error messages for every attempt.

(i)

    df.list <- lapply(seq_len(filenames), read.csv)
    Error in seq_len(filenames) : argument must be coercible to non-negative integer
    In addition: Warning message:
    In is.vector(X) : NAs introduced by coercion

    filenames
    [1] "512180_20120523150757.csv" "513687_20120523181947.csv" "513690_20120524112111.csv" "521858_20120524091428.csv"
    [5] "523215_20120523123419.csv" ...

(ii) None of the following code works:

    myDir = "myworking directory"
    #for(i in 1:length(filenames)){assign(NAME[i], read.csv(file.path(myDir, filenames[i])))}
    #for(i in 1:5){assign(NAME[i], read.csv(file.path=myDir, filenames[i]))}
    setwd("myworking directory")
    #for(i in 1:5){assign(NAME[i], read.csv( filenames[i]))}

    Warning messages:
    1: In N[i] <- read.csv(filenames[i]) : number of items to replace is not a multiple of replacement length
    2: In N[i] <- read.csv(filenames[i]) : number of items to replace is not a multiple of replacement length
    3: In N[i] <- read.csv(filenames[i]) : number of items to replace is not a multiple of replacement length
    4: In N[i] <- read.csv(filenames[i]) : number of items to replace is not a multiple of replacement length
    5: In N[i] <- read.csv(filenames[i]) : number of items to replace is not a multiple of replacement length

It seems I am getting there, but could you spot where my code went wrong, please?

Many thanks again!
HJ

On Fri, May 25, 2012 at 8:36 PM, Rui barradas rui1...@sapo.pt wrote:

Hello,

Or maybe put the data frames in a list:

    df.list <- lapply(seq_len(filenames), read.csv, ...) # '...other...' are options you might want to pass, (like headers=TRUE)
    names(df.list) <- data_names

Now access the data frames by number in the list or by name in data_names.

Hope this helps,
Rui Barradas

On 25-05-2012 20:08, Nutter, Benjamin wrote:

For example:

    myDir <- "some file path"
    filenames <- list.files(myDir)
    filenames <- filenames[grep("[.]csv", filenames)]
    data_names <- gsub("[.]csv", "", filenames)
    for(i in 1:length(filenames)) assign(data_names[i], read.csv(file.path(myDir, filenames[i])))

Benjamin Nutter | Biostatistician | Quantitative Health Sciences
Cleveland Clinic | 9500 Euclid Ave. | Cleveland, OH 44195 | (216) 445-1365

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Kevin Wright
Sent: Friday, May 25, 2012 2:55 PM
To: HJ YAN
Cc: r-help@r-project.org
Subject: Re: [R] Reading a bunch of csv files into R

See ?dir

Assign the value to a vector and loop over the elements of the vector.

Kevin

On Fri, May 25, 2012 at 12:16 PM, HJ YAN yhj...@googlemail.com wrote:

Dear R users

I am struggling with a data importing issue: I have some hundreds of csv files that need to be read into R for further analysis. All those csv files are named in one of three formats: (1) strings, e.g. London_Oxford street; (2) integers, e.g. 1234_5678; (3) combined, e.g. London_1234. I intend to use read.csv("_xxx.csv") but I only dealt with
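The errors reported in this thread are consistent with `seq_len()` being handed a character vector: it expects a single count, so iterating over the file names themselves avoids the problem. The following is a minimal sketch under that assumption. The temporary directory and the two tiny CSV files are created only to make the example self-contained; they are not part of the original data.

```r
# Self-contained demo: write two small CSV files into a temporary directory.
myDir <- file.path(tempdir(), "csv_demo")
dir.create(myDir, showWarnings = FALSE)
write.csv(data.frame(a = 1:3),
          file.path(myDir, "512180_20120523150757.csv"), row.names = FALSE)
write.csv(data.frame(a = 4:6),
          file.path(myDir, "513687_20120523181947.csv"), row.names = FALSE)

# Collect the file names and build the "Data.<name>" labels
filenames  <- list.files(myDir, pattern = "[.]csv$")
data_names <- paste("Data", sub("[.]csv$", "", filenames), sep = ".")

# Iterate over the file paths themselves, not over seq_len(filenames)
df.list <- lapply(file.path(myDir, filenames), read.csv)
names(df.list) <- data_names

# Access one data frame by name
df.list[["Data.512180_20120523150757"]]
```

Keeping the data frames in a named list, rather than creating hundreds of separate objects with assign(), usually makes later processing with lapply() or a loop simpler.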
Re: [R] need help in logistic regression
Look at ?predict.glm

    mydata.glm <- glm(formula = Death ~ Temperature, family = binomial(link = "logit"), data = mydata)

and see that predict(mydata.glm, type="response") gives the predictions on the probability scale.

On Mon, May 28, 2012 at 10:16 AM, eddie smith eddie...@gmail.com wrote:

Hello everyone,

I tried to understand the relationship between temperature and the death of an organism by using logistic regression.

    glm(formula = Death ~ Temperature, family = binomial(link = "logit"), data = mydata)

    Coefficients:
                 Estimate Std. Error z value Pr(>|z|)
    (Intercept)  -87.9161     7.7987  -11.27   <2e-16 ***
    Temperature    2.9532     0.2616   11.29   <2e-16 ***

From the above summary, I could understand that log odds of death = -87.9161 + 2.9532*Temperature. Odds = exp(log[odds]). Probability = odds/(1+odds).

Assuming my data is randomly normal distributed with (u=0, standard deviation=0.35), and I want to run it for n=10,000, how do I get to probability from log odds?

Regards,
Eddie
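As a sketch of the conversion described above, the log odds can be turned into probabilities either by hand, with plogis(), or by asking predict() for the response scale. The data below are simulated stand-ins for `mydata`, and the coefficients used to simulate them are made up.

```r
set.seed(1)
# Simulated stand-in for `mydata` (the generating coefficients are made up)
Temperature <- runif(200, 25, 35)
Death <- rbinom(200, 1, plogis(-30 + 1 * Temperature))
mydata <- data.frame(Death, Temperature)

fit <- glm(Death ~ Temperature, family = binomial(link = "logit"), data = mydata)

log_odds <- predict(fit, type = "link")            # linear predictor (log odds)
p_manual <- exp(log_odds) / (1 + exp(log_odds))    # odds / (1 + odds)
p_plogis <- plogis(log_odds)                       # inverse logit, same result
p_resp   <- predict(fit, type = "response")        # probability scale directly

all.equal(p_manual, p_plogis)
all.equal(unname(p_plogis), unname(p_resp))

# Using the coefficients quoted in the thread for a single temperature
plogis(-87.9161 + 2.9532 * 30)
```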
[R] multiple plot in ICEInfer
Hi

I'm working with the excellent ICEinfer package, which calculates bootstrapped cost-effectiveness ratios:

Obenchain, B. (2011). ICEinfer: Incremental Cost-Effectiveness (ICE) Statistical Inference from Two Unbiased Samples. R package version 1.0-0. http://CRAN.R-project.org/package=ICEinfer

    # Display the Bootstrap ICE Uncertainty Distribution...
    plot(dpunc)
    # to create 'ICEwedge': an Incremental Cost-Effectiveness Bootstrap Confidence Wedge...
    dpwdg <- ICEwedge(dpunc)
    dpwdg
    plot(dpwdg)

I can then draw separate plots for values of lambda (the willingness to pay for an increment of one unit of output, e.g. quality-adjusted life years), giving a cost acceptability curve.

    # Computing VAGR Acceptability and ALICE Curves...
    dpacc <- ICEalice(dpwdg)
    plot(dpacc)

What I haven't figured out how to do is get all these curves onto one plot, so that I can present a sensitivity analysis, say of different types of cost and their influence on the cost acceptability curve. I've noticed with this package before that I cannot alter the plots, for example by changing axis titles; it just doesn't work. I want to get four different curves from different analyses on one plot. Am I doing something wrong?

I'd appreciate any advice from anyone who has used this package.

Paul
Re: [R] Kolmogorov-Smirnov test and the plot of max distance between two ecdf curves
Thanks Rui, that's what I was looking for.

I have another related question: why is there a difference between the max distance D calculated with ks.test() and the max distance D "manually" calculated as in (2)? I guess it has something to do with the fact that KS is obtained with a maximisation that depends on the range of x values, which is not necessarily the same in the two approaches. Any thoughts about this?

maxbre
Re: [R] Problem with strptime
Fantastic, thanks very much Richard. The addition of 'tz="GMT"' worked a treat.

Best wishes,
Des

From: Richard M. Heiberger [mailto:r...@temple.edu]
Sent: 28 May 2012 02:09
To: Des Callaghan
Cc: r-help@r-project.org
Subject: Re: [R] Problem with strptime

Some of your dates are displayed BST and some GMT. High probability your dates span the break between summer time and regular time, when certain hours do not exist. (In the US we go from 0200 directly to 0301 in the spring when we move from standard time to daylight time.) 0230 would therefore be displayed as NA. You will need to take control of the time zone, probably by forcing GMT at all times.

On Sun, May 27, 2012 at 3:56 PM, Des Callaghan des.callag...@ecostudy.co.uk wrote:

Hello Forum,

I have a problem with the strptime function. With the 'data1' dataset below it works fine, but with the 'data2' dataset something goes wrong (see final line below). Both data1 and data2 are in exactly the same original format; the only difference is that they span different dates. Please help, since it is driving me nuts! Many thanks.

Best wishes,
Des

-

    data1=read.table("data1.txt",header=T,sep="\t")
    datetime1=strptime(data1$Date, "%a %b %d %H:%M:%S %Y") #example line from data1 'Tue Aug 16 03:00:01 2011'
    summary(datetime1)
                         Min.                   1st Qu.                    Median                      Mean
    2011-08-15 21:00:01 BST   2011-10-08 01:00:01 BST   2011-11-30 05:00:01 GMT   2011-11-30 04:38:47 GMT
                      3rd Qu.                      Max.
    2012-01-22 09:00:01 GMT   2012-03-15 13:00:01 GMT
    min(datetime1)
    [1] 2011-08-15 21:00:01 BST

    data2=read.table("data2.txt",header=T,sep="\t")
    datetime2=strptime(data2$Date, "%a %b %d %H:%M:%S %Y") #example line from data2 'Sun Nov 27 13:07:01 2011'
    summary(datetime2)
                         Min.                   1st Qu.                    Median                      Mean
    2011-11-27 01:07:01 GMT   2012-01-09 20:07:01 GMT   2012-02-22 15:07:01 GMT   2012-02-22 15:26:16 GMT
                      3rd Qu.                      Max.
    2012-04-06 12:07:01 BST   2012-05-20 07:07:01 BST
    min(datetime2)
    [1] NA
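The fix that worked in this thread can be sketched as follows. The three time stamps are invented; the middle one falls in the hour that is skipped in UK local time on 25 March 2012. Parsing the `%a` and `%b` fields assumes an English locale. How a non-existent local time is handled without `tz` can vary by platform, so only the GMT behaviour is shown.

```r
# Invented time stamps in the poster's format; the second one falls in the
# hour that UK local time skips when the clocks go forward.
x <- c("Sun Mar 25 00:30:01 2012",
       "Sun Mar 25 01:30:01 2012",
       "Sun Mar 25 02:30:01 2012")
fmt <- "%a %b %d %H:%M:%S %Y"

# Parse as GMT so that every clock time is valid
datetime_gmt <- strptime(x, fmt, tz = "GMT")

anyNA(as.POSIXct(datetime_gmt))   # checks that no time failed to convert
min(datetime_gmt)
summary(as.POSIXct(datetime_gmt))
```

Forcing one fixed time zone only makes sense if the logger that produced the data did not itself switch between GMT and BST.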
[R] stats q: multiple imputation and quantile regression
Dear list,

This is perhaps more of a statistics question than an R question, but perhaps someone could help me out anyway. I'm doing sociological research and am currently in the process of familiarizing myself with the basic concepts of multiple imputation. Eventually, my goal is to perform quantile regression on a large data set, where one non-negative discrete variable contains missing values, which I'm hoping to impute using multiple imputation. The variable in question has between 5% and 20% missing values (depending on the sample I'm using).

Here's my question: is it acceptable to use a linear-regression-based model for imputation of the values of my non-negative discrete predictor variable, even though the aim is to use quantile regression for the substantive analysis? Section 2 (page 6) in Joseph L Schafer's "Multiple Imputation: A primer" (Statistical Methods in Medical Research 1999, Vol 8, pp 3-15) gives me the impression that I might have a problem if the predictor's distribution is skewed and I'm mainly interested in conditional quantiles rather than means for my substantive analysis.

Any pointers you could give me would be greatly appreciated.

Best,
Irene P.
[R] zoo: variable gets modified at making zoo object
I'm doing:

    alyL32007z <- zoo(alyL32007, alyL32007$time)
    range(time(alyL32007z))
    [1] "2007-01-01 00:00:00 UTC" "2007-12-31 23:30:00 UTC"

But then, while the original variable is:

    summary(alyL32007$NEE_st)
       Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
    -15.340  -1.615  -0.054  -0.814   0.750   8.965   11124

the variable within the zoo object is different:

    summary(alyL32007z$NEE_st)
         Index                      alyL32007z$NEE_st
     Min.   :2007-01-01 00:00:00     0.335 :    7
     1st Qu.:2007-04-02 05:52:30     0.582 :    7
     Median :2007-07-02 11:45:00     0.611 :    7
     Mean   :2007-07-02 11:45:00     0.063 :    6
     3rd Qu.:2007-10-01 17:37:30     0.069 :    6
     Max.   :2007-12-31 23:30:00    (Other): 6363
                                     NA's   :11124

and I get an error at plotting:

    plot(alyL32007z$NEE_st)
    Error in plot.window(...) : invalid 'ylim' value

Any help appreciated,
Thanks
Agus

--
Dr. Agustin Lobo
Institut de Ciencies de la Terra Jaume Almera (CSIC)
Lluis Sole Sabaris s/n
08028 Barcelona
Spain
Tel. 34 934095410
Fax. 34 934110012
e-mail agustin.l...@ictja.csic.es
https://sites.google.com/site/aloboaleu/
[R] R quantreg - supWald Test
Hi folks =)

I am trying to compute a supWald test in R, to show that a subset of my regressors is significantly different from 0, but I am not able to compute this test. My setup looks as follows. I am regressing

    fit1 <- rq(Y~X1+X2+X3, tau=tau)

I know that if I want to show that e.g. X2 is significantly different from zero, the quantreg package calculates the corresponding p-value. But I am trying to test the linear restriction that X2 and X3 are both different from 0. The test statistic is easy to compute; the only thing that remains is the standard error, and unfortunately I do not have a clue how to compute it, as it contains a consistent estimator of the density of Y at the tau-th quantile.

Thank you very much for your help.

Cheers,
Stefan
[R] R quantreg anova: How to change summary se-type
Hi folks =)

I want to check whether a coefficient has an impact in a quantile regression (by applying the sup-Wald test over a given quantile range [0.05, 0.95]). Therefore I am doing the following calculations:

    taus <- 5:95/100
    a <- numeric(length(taus))
    for (k in seq_along(taus)) {
        fitrestricted   <- rq(Y~X1+X2, tau=taus[k])
        fitunrestricted <- rq(Y~X1+X2+X3, tau=taus[k])
        a[k] <- anova(fitrestricted, fitunrestricted)$table$Tn  # gives the test value
    }
    supW <- max(a)

As anova is using the summary.rq function, I want to change the standard error method used (the default, se="nid", leads to mistakes; I prefer se="ker"). Do you know how to pass this information in the anova syntax?

Thank you very much,
Stefan
Re: [R] Kolmogorov-Smirnov test and the plot of max distance between two ecdf curves
Hello,

That's a very difficult question. See

Marsaglia, Tsang, Wang (2003) http://www.jstatsoft.org/v08/i18/
Simard, L'Ecuyer (2011) http://www.jstatsoft.org/v39/i11

R's ks functions are a port of Marsaglia et al. to the .C interface.

Rui Barradas

maxbre wrote:

> Thanks Rui, that's what I was looking for.
>
> I have another related question: why is there a difference between the max distance D calculated with ks.test() and the max distance D "manually" calculated as in (2)? I guess it has something to do with the fact that KS is obtained with a maximisation that depends on the range of x values, which is not necessarily the same in the two approaches. Any thoughts about this?
>
> maxbre
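One possible source of such a discrepancy, offered only as a guess since the "manual" computation in (2) is not shown here, is where the two empirical CDFs are evaluated. The sketch below uses simulated data and compares the statistic from ks.test() with the maximum difference taken at the pooled sample points and on a coarse grid.

```r
set.seed(42)
x <- rnorm(50)
y <- rnorm(60, mean = 0.5)

Fx <- ecdf(x)
Fy <- ecdf(y)

# The ECDFs only jump at observed values, so the largest gap is attained
# at one of the pooled sample points.
pooled   <- sort(c(x, y))
D_manual <- max(abs(Fx(pooled) - Fy(pooled)))
D_ks     <- unname(ks.test(x, y)$statistic)

# Evaluating on a coarse grid instead can miss the maximum.
grid   <- seq(min(pooled), max(pooled), length.out = 20)
D_grid <- max(abs(Fx(grid) - Fy(grid)))

c(manual = D_manual, ks.test = D_ks, grid = D_grid)
```

The grid-based value can never exceed the pooled-point value, so a manual D that is smaller than the one from ks.test() would be consistent with this explanation.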
[R] simulation of levene's test
Hello,

I am trying to run a simulation of Levene's test to find the p-value, but the error "replacement has length zero" occurs. Could anyone help me fix this problem?

    asim <- 1000
    pv <- rep(NA, asim)
    for(i in 1:asim) {
        print(i)
        set.seed(i)
        g1 <- rnorm(20,0,2)
        g2 <- rnorm(20,0,2)
        g3 <- rnorm(20,0,2)
        x <- c(g1,g2,g3)
        group <- as.factor(c(rep(1,20),rep(2,20),rep(3,20)))
        library(Rcmdr)
        pv[i] <- leveneTest(x,group)$p.value
    }
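A likely cause of the error is that the object returned by leveneTest() is an ANOVA-style table with no `p.value` component, so `$p.value` is NULL and cannot be assigned into `pv[i]`; the p-value sits in the column named "Pr(>F)". To keep the sketch below free of package dependencies, it uses a small hand-written median-centred Levene test (the helper `levene_p` is hypothetical, not a function from car or Rcmdr) and a reduced number of replications.

```r
# Hypothetical helper: median-centred Levene test using base R only.
# It runs a one-way ANOVA on absolute deviations from the group medians.
levene_p <- function(x, group) {
  group <- as.factor(group)
  z <- abs(x - ave(x, group, FUN = median))
  anova(lm(z ~ group))[["Pr(>F)"]][1]
}

asim  <- 200                       # fewer replications than in the post
pv    <- rep(NA_real_, asim)
group <- factor(rep(1:3, each = 20))

for (i in seq_len(asim)) {
  set.seed(i)
  x <- c(rnorm(20, 0, 2), rnorm(20, 0, 2), rnorm(20, 0, 2))
  pv[i] <- levene_p(x, group)
}

mean(pv < 0.05)   # empirical rejection rate under equal variances
```

With the package function itself, the same column-based extraction should apply, e.g. `leveneTest(x, group)[["Pr(>F)"]][1]`, and `library()` is better called once before the loop.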
Re: [R] Kolmogorov-Smirnov test and the plot of max distance between two ecdf curves
Thanks for the help: I'll have a look at the papers.

max

On 28/05/2012 12:31, Rui Barradas [via R] wrote:

> Hello,
>
> That's a very difficult question. See
>
> Marsaglia, Tsang, Wang (2003) http://www.jstatsoft.org/v08/i18/
> Simard, L'Ecuyer (2011) http://www.jstatsoft.org/v39/i11
>
> R's ks functions are a port of Marsaglia et al. to the .C interface.
>
> Rui Barradas
>
> maxbre wrote:
>
>> Thanks Rui, that's what I was looking for.
>>
>> I have another related question: why is there a difference between the max distance D calculated with ks.test() and the max distance D "manually" calculated as in (2)? I guess it has something to do with the fact that KS is obtained with a maximisation that depends on the range of x values, which is not necessarily the same in the two approaches. Any thoughts about this?
>>
>> maxbre
[R] GLMNET AUC vs. MSE
Hello -

I am using glmnet to generate a model for multiple cohorts i. For each i, I run 5 separate models, each with a different x variable. I want to compare the fit statistic for each i and x combination.

When I use auc, the output in some cases is < .5 (.49). In addition, if I compare mean MSE (with upper and lower bounds), there is no difference across my various x variables, but mean AUC (with upper and lower bounds) shows differentiation.

My basic questions are: should I not expect AUC to lie between .5 and 1, and which model fit measurement is most appropriate for comparing across models (if the various statistics are telling a somewhat inconsistent story)?

Thanks in advance for any advice. Below is my code and sample output for AUC/MSE.

    xc <- split(dataS$P1_retained, dataS$TotalHours_R)
    yc <- split(dataS$x, dataS$TotalHours_R)
    for (i in 1:length(yc)) {
        fit=cv.glmnet(as.matrix(yc[[i]]), y=xc[[i]], alpha=.05, type="mse", nfolds=10, standardize=TRUE, family="binomial")
        c_output = c(i, fit$cvlo[fit$lambda==fit$lambda.1se], fit$cvm[fit$lambda==fit$lambda.1se], fit$cvup[fit$lambda==fit$lambda.1se])
        names(c_output) = names(output_x)
        output_x = rbind(output_x, t(c_output))

        fit1=cv.glmnet(as.matrix(yc[[i]]), y=xc[[i]], alpha=.05, type="auc", nfolds=10, standardize=TRUE, family="binomial")
        c_output1 = c(i, fit1$cvlo[fit1$lambda==fit1$lambda.1se], fit1$cvm[fit1$lambda==fit1$lambda.1se], fit1$cvup[fit1$lambda==fit1$lambda.1se])
        names(c_output1) = names(output_x1)
        output_x1 = rbind(output_x1, t(c_output1))

        fit2=cv.glmnet(as.matrix(yc[[i]]), y=xc[[i]], alpha=.05, type="class", nfolds=10, standardize=TRUE, family="binomial")
        c_output2 = c(i, fit2$cvlo[fit2$lambda==fit2$lambda.1se], fit2$cvm[fit2$lambda==fit2$lambda.1se], fit2$cvup[fit2$lambda==fit2$lambda.1se])
        names(c_output2) = names(output_x2)
        output_x2 = rbind(output_x2, t(c_output2))
    }

    COHORT LB_MSE_X MEAN_MSE_X UB_MSE_X LB_AUC_X MEAN_AUC_X UB_AUC_X LB_CLASS_X MEAN_CLASS_X UB_CLASS_X
         0     0.44       0.44     0.44     0.50       0.50     0.50       0.33         0.33       0.33
         1     0.42       0.42     0.42     0.51       0.51     0.52       0.30         0.30       0.30
         2     0.40       0.40     0.40     0.50       0.50     0.50       0.28         0.28       0.28
         3     0.36       0.37     0.37     0.51       0.51     0.51       0.24         0.24       0.24
         4     0.35       0.35     0.35     0.51       0.51     0.51       0.22         0.23       0.23
         5     0.33       0.33     0.33     0.51       0.51     0.52       0.21         0.21       0.21
         6     0.32       0.32     0.32     0.51       0.51     0.51       0.20         0.20       0.20
         7     0.30       0.31     0.31     0.52       0.52     0.52       0.19         0.19       0.19
         8     0.29       0.29     0.30     0.52       0.52     0.52       0.18         0.18       0.18
         9     0.28       0.29     0.29     0.52       0.52     0.52       0.17         0.17       0.17
        10     0.28       0.28     0.28     0.52       0.53     0.53       0.17         0.17       0.17
[R] Estimation of parameters represented in matrix using MLE package
Dear R-list members,

I have a problem with the estimation of parameters that appear inside a covariance matrix, using maximum likelihood. The problem is essentially a multivariate Gaussian random field model. The likelihood function is

    L(m, sigma_S^2, sigma_N^2; F1) = 1 / (2*pi * sqrt(det(Sigma))) * exp{ -1/2 (F1 - m)' Sigma^(-1) (F1 - m) }

The covariance matrix in the formula is Sigma. Its elements depend on variables such as sigma_S^2 and sigma_N^2, and it is my intention to maximize the likelihood over the variables in the matrix using the MLE package in R.

In this regard my concerns are whether the MLE package is able to accept a matrix containing the variables to be optimized and, if so, how to represent the matrix (or any other data structure with non-numeric symbols as variables in it) so that it can be used by the MLE function.

Any help of this sort will be highly appreciated.

Thanks and regards,
B.Nataraj
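One common way to handle this, sketched here under several assumptions, is not to store symbols in the matrix at all but to write a function that rebuilds the numeric covariance matrix from the current parameter values each time the likelihood is evaluated. The covariance model (exponential correlation with a fixed range plus a nugget), the simulated locations, and the names `make_Sigma` and `negloglik` are all made up for illustration, and the general n-dimensional Gaussian density is used. The optimiser is base R's optim(); a function of named scalar arguments could be passed to stats4::mle() in a similar way.

```r
set.seed(1)
n <- 40
s <- sort(runif(n, 0, 10))     # hypothetical 1-D site locations
D <- as.matrix(dist(s))        # pairwise distances between sites

# Assumed covariance model: signal variance S2 with exponential correlation
# (range fixed at 2) plus noise variance N2 on the diagonal.
make_Sigma <- function(S2, N2) S2 * exp(-D / 2) + N2 * diag(n)

# One simulated realisation F1 with m = 5, S2 = 2, N2 = 0.5
F1 <- drop(5 + t(chol(make_Sigma(2, 0.5))) %*% rnorm(n))

# Negative log-likelihood of the multivariate Gaussian.
# The variances are optimised on the log scale to keep them positive.
negloglik <- function(par) {
  m     <- par[1]
  Sigma <- make_Sigma(exp(par[2]), exp(par[3]))
  R     <- chol(Sigma)                              # Sigma = t(R) %*% R
  z     <- backsolve(R, F1 - m, transpose = TRUE)   # solves t(R) z = F1 - m
  0.5 * n * log(2 * pi) + sum(log(diag(R))) + 0.5 * sum(z^2)
}

fit <- optim(c(mean(F1), 0, 0), negloglik)
c(m = fit$par[1], S2 = exp(fit$par[2]), N2 = exp(fit$par[3]))
```

With a single realisation of the field the variance estimates can be quite noisy, so the fitted values should not be expected to match the simulation inputs closely.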
[R] how to plot data in each list simultaneously
Dear Jim and the rest of the R users,

I initially had the following data set:

    01.01.1967 0.87
    02.01.1967 0.87
    03.01.1968 0.87
    04.01.1968 0.87
    05.01.1969 0.87
    06.01.1969 0.87
    07.01.1970 0.87
    08.01.1970 0.87
    09.01.1971 0.87
    10.01.1971 0.87
    11.01.1972 0.87
    12.01.1972 0.87
    13.01.1973 0.69
    14.01.1973 0.70
    15.01.1974 0.71
    16.01.1974 0.72

I wanted to reshape it into the following format:

    1967 1968 1969 1970 1971 1972 1973 1974
    1 0.87 0.87 0.87 0.87 0.71
    2 0.87 0.87 0.87 0.87 0.72

With your help, using the following code, I managed to convert it into the desired format.

    # extract years from the dates
    qmu$year <- as.numeric(sapply(strsplit(as.character(qmu$V1),"[.]"),"[",3))
    # get a vector of the unique years
    uyears <- unique(qmu$year)
    # make an empty list
    newqmu <- list()
    # populate the list year by year
    for(i in 1:length(uyears)) newqmu[[i]] <- qmu$V2[qmu$year==uyears[i]]

Now, is there a way to plot, simultaneously, the values of each list element (which contains the data for one year) against its respective day numbers, using just a single command? As you know, I have the problem of leap years in my data set, so I can't rely on a 365-day index and apply it to all the list elements.

Your help will be highly appreciated.

Regards
uzair
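One way to get all years into a single plotting call, sketched here with a tiny made-up data set, is to pad each year's vector with NA up to the longest year and hand the resulting matrix to matplot(). Leap years then simply have one more non-missing row. The toy `qmu` below only mimics the structure described in the post.

```r
# Small stand-in for `qmu` (V1 = date string, V2 = value); values are made up
qmu <- data.frame(
  V1 = c("01.01.1967", "02.01.1967", "03.01.1967",
         "01.01.1968", "02.01.1968", "03.01.1968", "04.01.1968"),
  V2 = c(0.87, 0.85, 0.80, 0.69, 0.70, 0.71, 0.72)
)
qmu$year <- as.numeric(sapply(strsplit(as.character(qmu$V1), "[.]"), "[", 3))

# One vector per year, named by year
newqmu <- split(qmu$V2, qmu$year)

# Pad every year with NA to the length of the longest one (days x years matrix)
ndays <- max(lengths(newqmu))
ymat  <- sapply(newqmu, function(v) c(v, rep(NA, ndays - length(v))))

# A single call draws one line per year
matplot(seq_len(ndays), ymat, type = "l", lty = 1,
        xlab = "Day of year", ylab = "Value")
legend("topright", legend = colnames(ymat),
       col = seq_len(ncol(ymat)), lty = 1)
```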
[R] standard error
Dear all,

I want to determine the standard error or the mean squared error of the parameter estimates for beta and eta, based on the real data. Any help on how to obtain these estimated errors would be appreciated.

    library(survival)
    d <- data.frame(ob=c(149971, 70808, 133518, 145658, 175701, 50960, 126606, 82329), state=1)
    s <- Surv(d$ob,d$state)
    sr <- survreg(s~1,dist="weibull")
    beta <- 1/sr$scale
    p1=(beta)
    p1
    eta <- exp(sr$coefficients[1])
    b=(eta)
    b

Thank you
Chris Guure
Researcher
Institute for Mathematical Research
UPM
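A rough sketch of one standard approach: survreg() stores the variance matrix of its own parameters (the intercept and the log of the scale), and a first-order delta-method approximation carries those variances over to beta = 1/scale and eta = exp(intercept). This is an approximation, with only eight observations it may be crude, and the positions assumed for the two parameters in `sr$var` hold for an intercept-only model like this one.

```r
library(survival)

d  <- data.frame(ob = c(149971, 70808, 133518, 145658,
                        175701, 50960, 126606, 82329),
                 state = 1)
sr <- survreg(Surv(ob, state) ~ 1, data = d, dist = "weibull")

beta <- 1 / sr$scale                 # Weibull shape
eta  <- exp(unname(coef(sr)[1]))     # Weibull scale

# sr$var is the variance matrix for (intercept, log(scale)), in that order
V <- sr$var

# Delta method: beta = exp(-log(scale)), so |d beta / d log(scale)| = beta;
# eta = exp(intercept), so d eta / d intercept = eta.
se_beta <- beta * sqrt(V[2, 2])
se_eta  <- eta  * sqrt(V[1, 1])

c(beta = beta, se_beta = se_beta, eta = eta, se_eta = se_eta)
```

A bootstrap over the eight observations would be an alternative way to check these values.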
Re: [R] Hash Table - Select and Change Data iniside Matrix
Hi Michael,

Beyond the equal-to comparison, how can I use other conditions? For example, age between two values, or age greater than or equal to a value, and so on. How?

From: Michael Weylandt [via R] [mailto:ml-node+s789695n4631319...@n4.nabble.com]
Sent: Friday, May 25, 2012 7:48 PM
To: Akkara, Antony (GE Energy, Non-GE)
Subject: Re: Hash Table - Select and Change Data iniside Matrix

There aren't empty values in R, nor is it likely you have a matrix of this form, but perhaps a data frame. Perhaps this works for you. If dat is the name of your data.frame,

    dat[dat$AGE == 30, "TRUE/FALSE"] <- TRUE

Next time do use dput() to give a reproducible example of your data. If it's very large, just limit it to the first 30 rows or so with dput(head(dats, 30)).

Michael

On Fri, May 25, 2012 at 9:43 AM, Rantony [hidden email] wrote:

Hi,

Here I have a matrix like this:

    NAME  AGE  PLACE      TRUE/FALSE
    ABC   20   INDIA
    XYZ   30   FRANCE
    PQR   40   USA
    MNO   30   KENIYA
    DEF   25   AUSTRALIA

Here, the TRUE/FALSE column contains empty values. My requirement is to change the TRUE/FALSE column value to TRUE wherever AGE = 30.

Note: I don't want to use any loop to do this. The main intention is to avoid loops, because there is a bulk of data.

The final matrix should look like this:

    NAME  AGE  PLACE      TRUE/FALSE
    ABC   20   INDIA
    XYZ   30   FRANCE     TRUE
    PQR   40   USA
    MNO   30   KENIYA     TRUE
    DEF   25   AUSTRALIA

Immediate help required.

Yours,
Antony.
[R] Hash Table - Select and Change Data iniside Matrix Using Between
Hi,

Here I have a matrix like this:

    NAME  AGE  PLACE      TRUE/FALSE
    ABC   20   INDIA
    XYZ   30   FRANCE
    PQR   40   USA
    MNO   30   KENIYA
    DEF   25   AUSTRALIA
    GTY   34   CANADA
    BNH   38   JAPAN

Here, the TRUE/FALSE column contains empty values. My requirement is to change the TRUE/FALSE column value to TRUE wherever AGE >= 32.

Note: I don't want to use any loop to do this. The main intention is to avoid loops, because there is a bulk of data.

The final matrix should look like this:

    NAME  AGE  PLACE      TRUE/FALSE
    ABC   20   INDIA
    XYZ   30   FRANCE
    PQR   40   USA
    MNO   30   KENIYA
    DEF   25   AUSTRALIA
    GTY   34   CANADA     TRUE
    BNH   38   JAPAN      TRUE

Finally I got one solution like this. If dat is the name of your data.frame,

    dat[dat$AGE == 30, "TRUE/FALSE"] <- TRUE

But what should I use if I want to set TRUE where AGE is between 30 and 40?

Immediate help required.
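Range conditions work the same way as the equality test: comparison operators are vectorised, and two conditions can be combined with `&`, so no loop is needed. In the sketch below the data frame is rebuilt from the table in the post, and the column name `FLAG` is a stand-in for the "TRUE/FALSE" column.

```r
# Data frame rebuilt from the post; FLAG stands in for the "TRUE/FALSE" column
dat <- data.frame(
  NAME  = c("ABC", "XYZ", "PQR", "MNO", "DEF", "GTY", "BNH"),
  AGE   = c(20, 30, 40, 30, 25, 34, 38),
  PLACE = c("INDIA", "FRANCE", "USA", "KENIYA", "AUSTRALIA", "CANADA", "JAPAN"),
  FLAG  = NA,
  stringsAsFactors = FALSE
)

# AGE between 30 and 40, both ends included
dat$FLAG[dat$AGE >= 30 & dat$AGE <= 40] <- TRUE

# A one-sided condition, e.g. AGE greater than or equal to 32, as a logical vector
ge32 <- dat$AGE >= 32

dat
```

Rows that do not match keep NA in `FLAG`; replacing the assignment with `dat$FLAG <- dat$AGE >= 30 & dat$AGE <= 40` would instead fill in FALSE for them.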
[R] Why R order files as 1 10 100 not 1 2 3 ?
The code given below worked well. However, when I typed dir1 to see the results, I found that R orders the files as:

      [1] data1.flt   data10.flt  data100.flt data101.flt
      [5] data102.flt data103.flt data104.flt data105.flt
      [9] data106.flt data107.flt data108.flt data109.flt
     [13] data11.flt  data110.flt data111.flt data112.flt
     [17] data113.flt data114.flt data115.flt data116.flt
     . . to . .
    [357] data91.flt  data92.flt  data93.flt  data94.flt
    [361] data95.flt  data96.flt  data97.flt  data98.flt
    [365] data99.flt

which will lead to wrong results. How can I tell R to read the files in order from 1 to 365? Something like:

    [1] data1.flt data2.flt data3.flt data4.flt

not like:

    [1] data1.flt data10.flt data100.flt data101.flt

Here is the code:

    dir1 <- list.files("C:\\Users\\Amin\\Desktop\\2001", "*.flt", full.names = TRUE)
    results <- list()
    for (.files in seq_along(dir1)) {
      file1 <- readBin(dir1[.files], double(), size = 4, n = w * 67420, signed = TRUE)
      results[[length(results) + 1L]] <- file1[file1 != -]*10
    }
    for (i in seq_along(results)) {
      fileName <- sprintf("C:\\Users\\aalyaari\\Desktop\\New folder (2)\\NewFile%03d.bin", i)
      writeBin(as.integer(results[[i]]), fileName, size = 2)
    }
Re: [R] Reading a bunch of csv files into R
Dear Rui, Kevin, Bryan and Nutter,

Thank you so much for your very helpful hints!

I have now extracted all the file names and edited them using steps (1)-(4) below, which gives the name format I wanted:

    (1) files <- list.files(path = "myworking directory", pattern = NULL, all.files = FALSE, full.names = FALSE, recursive = FALSE, ignore.case = FALSE, include.dirs = FALSE)
    (2) filenames <- files[grep("[.]csv", files)]
        [1] "512180_20120523150757.csv" "513687_20120523181947.csv" "513690_20120524112111.csv" "521858_20120524091428.csv" "523215_20120523123419.csv"
        ...(a few hundred more...)
    (3) data_names <- gsub("[.]csv", "", filenames)
    (4) NAME <- paste("Data", data_names, sep = ".")

Up to here, NAME contains all the names I am going to use:

    NAME
    [1] "Data.512180_20120523150757" "Data.513687_20120523181947" "Data.513690_20120524112111" "Data.521858_20120524091428" "Data.523215_20120523123419"

But I still haven't successfully read the whole bunch of csv files into R and named them as expected. For example, I want to read 512180_20120523150757.csv into R and name it Data.512180_20120523150757, and so on. For a single file we can just write:

    Data.512180_20120523150757 <- read.csv("512180_20120523150757.csv")

If any of the following commands (as you suggested) worked, my question would be sorted out. But I got error messages for every attempt.

(i)

    df.list <- lapply(seq_len(filenames), read.csv)
    Error in seq_len(filenames) : argument must be coercible to non-negative integer
    In addition: Warning message:
    In is.vector(X) : NAs introduced by coercion

    filenames
    [1] "512180_20120523150757.csv" "513687_20120523181947.csv" "513690_20120524112111.csv" "521858_20120524091428.csv"
    [5] "523215_20120523123419.csv" ...

(ii) None of the following code works:

    myDir = "myworking directory"
    # for (i in 1:length(filenames)) {assign(NAME[i], read.csv(file.path(myDir, filenames[i])))}
    # for (i in 1:5) {assign(NAME[i], read.csv(file.path=myDir, filenames[i]))}
    setwd("myworking directory")
    # for (i in 1:5) {assign(NAME[i], read.csv(filenames[i]))}

    Warning messages:
    1: In N[i] <- read.csv(filenames[i]) : number of items to replace is not a multiple of replacement length
    (the same warning is repeated for items 2 to 5)

It seems I am getting there, but could you spot where my code went wrong, please?

Many thanks again!
HJ

On Fri, May 25, 2012 at 8:36 PM, Rui barradas rui1...@sapo.pt wrote:
> Hello,
>
> Or maybe put the data frames in a list:
>
>     df.list <- lapply(seq_len(filenames), read.csv, ...)
>     # '...other...' are options you might want to pass (like headers=TRUE)
>     names(df.list) <- data_names
>
> Now access the data frames by number in the list or by name in data_names.
>
> Hope this helps,
> Rui Barradas
>
> Em 25-05-2012 20:08, Nutter, Benjamin escreveu:
>> For example:
>>
>>     myDir <- "some file path"
>>     filenames <- list.files(myDir)
>>     filenames <- filenames[grep("[.]csv", filenames)]
>>     data_names <- gsub("[.]csv", "", filenames)
>>     for (i in 1:length(filenames)) assign(data_names[i], read.csv(file.path(myDir, filenames[i])))
>>
>> Benjamin Nutter | Biostatistician | Quantitative Health Sciences
>> Cleveland Clinic | 9500 Euclid Ave. | Cleveland, OH 44195 | (216) 445-1365
>>
>> -----Original Message-----
>> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Kevin Wright
>> Sent: Friday, May 25, 2012 2:55 PM
>> To: HJ YAN
>> Cc: r-help@r-project.org
>> Subject: Re: [R] Reading a bunch of csv files into R
>>
>> See ?dir
>> Assign the value to a vector and loop over the elements of the vector.
>>
>> Kevin
>>
>> On Fri, May 25, 2012 at 12:16 PM, HJ YAN yhj...@googlemail.com wrote:
>>> Dear R users
>>>
>>> I am struggling with a data importing issue: I have some hundreds of csv files that need to be read into R for further analysis. All those csv files are named in one of three formats:
>>>
>>> (1) strings: e.g. London_Oxford street
>>> (2) Integer: e.g. 1234_5678
>>> (3) combined: e.g. London_1234
>>>
>>> I intend to use read.csv("_xxx.csv"), but I have only dealt with single documents before, and when there are no more than 20 files I do not bother to look for a more efficient way.
>>>
>>> Is there any clever way to avoid typing in all these hundreds of names by hand, maybe using an R package, or writing some code in another language if it is not too difficult to learn?
>>>
>>> Any thoughts/hints please? Many thanks in advance!
>>>
>>> HJ
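The seq_len(filenames) error quoted above arises because seq_len() expects a single non-negative integer, not a character vector of file names. A minimal sketch of one way around it, assuming all files sit in one directory: pass the file paths themselves to lapply() and name the resulting list afterwards. The temporary directory, the two tiny demo files, and the names csv_dir and df_list are made up for illustration only.

```r
# Create two small demo CSV files in a temporary directory (illustrative only).
csv_dir <- file.path(tempdir(), "csv_demo")
dir.create(csv_dir, showWarnings = FALSE)
write.csv(data.frame(a = 1:3),
          file.path(csv_dir, "512180_20120523150757.csv"), row.names = FALSE)
write.csv(data.frame(a = 4:6),
          file.path(csv_dir, "London_1234.csv"), row.names = FALSE)

# Collect the csv file names and build the desired object names.
filenames  <- list.files(csv_dir, pattern = "[.]csv$")
data_names <- paste("Data", sub("[.]csv$", "", filenames), sep = ".")

# Iterate over the paths themselves, not over seq_len(filenames).
df_list <- lapply(file.path(csv_dir, filenames), read.csv)
names(df_list) <- data_names
```

Each data frame can then be reached as df_list[["Data.London_1234"]] or by position, which avoids creating hundreds of separate objects with assign().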
Re: [R] zoo: variable gets modified at making zoo object
On Mon, May 28, 2012 at 5:35 AM, Agustin Lobo agustin.l...@ictja.csic.es wrote:
> I'm doing:
>
>     alyL32007z <- zoo(alyL32007, alyL32007$time)
>     range(time(alyL32007z))
>     [1] "2007-01-01 00:00:00 UTC" "2007-12-31 23:30:00 UTC"
>
> But then, while the original variable is:
>
>     summary(alyL32007$NEE_st)
>        Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
>     -15.340  -1.615  -0.054  -0.814   0.750   8.965   11124
>
> the variable within the zoo object is different:
>
>     summary(alyL32007z$NEE_st)
>          Index                     alyL32007z$NEE_st
>      Min.   :2007-01-01 00:00:00   0.335  :   7
>      1st Qu.:2007-04-02 05:52:30   0.582  :   7
>      Median :2007-07-02 11:45:00   0.611  :   7
>      Mean   :2007-07-02 11:45:00   0.063  :   6
>      3rd Qu.:2007-10-01 17:37:30   0.069  :   6
>      Max.   :2007-12-31 23:30:00   (Other): 6363
>                                    NA's   :11124
>
> and I get an error at plotting:
>
>     plot(alyL32007z$NEE_st)
>     Error in plot.window(...) : invalid 'ylim' value
>
> Any help appreciated,
> Thanks
> Agus
>
> --
> Dr. Agustin Lobo
> Institut de Ciencies de la Terra Jaume Almera (CSIC)
> Lluis Sole Sabaris s/n 08028 Barcelona Spain
> Tel. 34 934095410 Fax. 34 934110012
> e-mail agustin.l...@ictja.csic.es
> https://sites.google.com/site/aloboaleu/
>
> __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

Read the last two lines and particularly the part about posting reproducible code.

--
Statistics Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
Re: [R] simulation of levene's test
Dear Dila,

Try the following:

    library(Rcmdr)
    asim <- 1000
    pv <- NULL
    for (i in 1:asim) {
      print(i)
      set.seed(i)
      g1 <- rnorm(20, 0, 2)
      g2 <- rnorm(20, 0, 2)
      g3 <- rnorm(20, 0, 2)
      x <- c(g1, g2, g3)
      group <- as.factor(c(rep(1, 20), rep(2, 20), rep(3, 20)))
      pv <- c(pv, leveneTest(x, group)$"Pr(>F)"[1])
    }

Best
Ozgur

-----
Ozgur ASAR
Research Assistant
Middle East Technical University
Department of Statistics
06531, Ankara Turkey
Ph: 90-312-2105309
http://www.stat.metu.edu.tr/people/assistants/ozgur/
Re: [R] Why R order files as 1 10 100 not 1 2 3 ?
It's because those are character strings and they are sorted lexically (i.e., alphabetically). You can probably get the order you prefer with the mixedsort/mixedorder functions of the gtools package. Take a look at this:

    x <- paste0("data", 1:100, ".fit")
    order(x)
    sort(x)

    library(gtools)
    mixedorder(x)
    mixedsort(x)

Best,
Michael
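If installing gtools is not an option, the same ordering can be sketched in base R by sorting on the number embedded in each file name. This is only a sketch and assumes each base name contains exactly one run of digits, as in data1.flt ... data365.flt; the sample vector below is invented for illustration.

```r
# Sample file names in the lexical order that list.files() would return.
x <- paste0("data", c(1, 10, 100, 2, 25, 3), ".flt")

# basename() drops the directory part, so digits in a path such as
# ".../2001/" do not interfere; then keep only the digits of the file name.
num <- as.integer(gsub("[^0-9]", "", basename(x)))

# Reorder the file names by their numeric part.
x_sorted <- x[order(num)]
```

With the original code, the equivalent step would be reordering dir1 once, right after list.files(), so that the later loops see the files in numeric order.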
Re: [R] Hash Table - Select and Change Data iniside Matrix
Greater than or equal to is simply >= like most languages. For more complicated questions, simply combine booleans. E.g., between 5 and 25:

    with(dat, (AGE < 25) & (AGE > 5))

and so on.

Michael

On Mon, May 28, 2012 at 9:34 AM, Rantony antony.akk...@ge.com wrote:
> Hi Michel,
>
> Besides the "equal to" comparison, how can I use other comparisons? For example: Age between, or Age greater than or equal to, and so on. How?
>
> From: Michael Weylandt [via R] [mailto:ml-node+s789695n4631319...@n4.nabble.com]
> Sent: Friday, May 25, 2012 7:48 PM
> To: Akkara, Antony (GE Energy, Non-GE)
> Subject: Re: Hash Table - Select and Change Data iniside Matrix
>
> There aren't empty values in R, nor is it likely you have a matrix of this form, but perhaps a data frame. Perhaps this works for you. If dat is the name of your data.frame:
>
>     dat[dat$AGE == 30, "TRUE/FALSE"] <- TRUE
>
> Next time do use dput() to give a reproducible example of your data -- if it's very large, just limit it to the first 30 rows or so with dput(head(dats, 30))
>
> Michael
>
> On Fri, May 25, 2012 at 9:43 AM, Rantony [hidden email] wrote:
>> Hi,
>>
>> I have a matrix like this:
>>
>>     NAME  AGE  PALCE      TRUE/FALSE
>>     ABC   20   INDIA
>>     XYZ   30   FRANCE
>>     PQR   40   USA
>>     MNO   30   KENIYA
>>     DEF   25   AUSTRALIA
>>
>> Here, the TRUE/FALSE column contains empty values. My requirement is to change the TRUE/FALSE column value to TRUE where AGE = 30.
>>
>> Note: I don't want to use any loop. The main intention is to avoid loops, because there is a large amount of data.
>>
>> The final matrix should look like this:
>>
>>     NAME  AGE  PALCE      TRUE/FALSE
>>     ABC   20   INDIA
>>     XYZ   30   FRANCE     TRUE
>>     PQR   40   USA
>>     MNO   30   KENIYA     TRUE
>>     DEF   25   AUSTRALIA
>>
>> Immediate help required.
>>
>> Yours,
>> Antony.
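To make the "combine booleans" advice concrete, here is a small sketch of a loop-free "between" selection on the example data from this thread. The column name FLAG is a hypothetical stand-in for the poster's TRUE/FALSE column (chosen only because it is a syntactically valid name), and the bounds 30 and 40 are taken as inclusive, which is an assumption.

```r
# Example data from the thread; FLAG starts out as missing (NA).
dat <- data.frame(
  NAME  = c("ABC", "XYZ", "PQR", "MNO", "DEF", "GTY", "BNH"),
  AGE   = c(20, 30, 40, 30, 25, 34, 38),
  PALCE = c("INDIA", "FRANCE", "USA", "KENIYA", "AUSTRALIA", "CANADA", "JAPAN"),
  FLAG  = NA,
  stringsAsFactors = FALSE
)

# A vectorised "between": both comparisons are evaluated for every row at once
# and combined element-wise with &.
in_range <- dat$AGE >= 30 & dat$AGE <= 40

# Assign TRUE only to the selected rows; no explicit loop is needed.
dat$FLAG[in_range] <- TRUE
```

Because the comparison and the assignment are both vectorised, this scales to large data frames without a for loop.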
Re: [R] standard error
On May 28, 2012, at 5:20 AM, Christopher Kelvin wrote:
> Dear all,
>
> I want to determine the standard error or the mean squared error of the parameter estimates for beta and eta based on the real data. Any help on how to obtain these estimated errors?
>
>     library(survival)
>     d <- data.frame(ob = c(149971, 70808, 133518, 145658, 175701, 50960, 126606, 82329), state = 1)
>     s <- Surv(d$ob, d$state)
>     sr <- survreg(s ~ 1, dist = "weibull")
>     beta <- 1/sr$scale
>     p1 = (beta)
>     p1
>     eta <- exp(sr$coefficients[1])
>     b = (eta)
>     b

The usual approach is to rely on the normality of the parameters on the Weibull scale and then back-transform coef +/- 1.96*se(coef). You get these with summary(sr).

--
David.
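The back-transformation described above can be sketched as follows. This is an illustration under stated assumptions, not necessarily what David had in mind in detail: it takes the standard errors from the square roots of the diagonal of sr$var (assumed to be ordered as intercept, then log(scale)), uses eta = exp(intercept) and beta = 1/scale as in the question, and adds delta-method standard errors as an extra. The names est, se, eta_ci and beta_ci are introduced here for illustration.

```r
library(survival)

d <- data.frame(ob = c(149971, 70808, 133518, 145658,
                       175701, 50960, 126606, 82329),
                state = 1)
sr <- survreg(Surv(ob, state) ~ 1, data = d, dist = "weibull")

# Estimates on the scale where normality is assumed:
# the intercept and log(scale).
est <- unname(c(sr$coefficients[1], log(sr$scale)))
se  <- unname(sqrt(diag(sr$var)))

# Point estimates on the original Weibull parameterisation.
eta  <- exp(est[1])      # scale parameter
beta <- exp(-est[2])     # shape parameter, i.e. 1 / sr$scale

# Back-transformed approximate 95% intervals: coef +/- 1.96 * se(coef).
eta_ci  <- exp(est[1] + c(-1.96, 1.96) * se[1])
beta_ci <- rev(exp(-(est[2] + c(-1.96, 1.96) * se[2])))

# Delta-method standard errors on the original scale.
eta_se  <- eta * se[1]
beta_se <- beta * se[2]
```

The intervals are asymmetric around eta and beta because they are symmetric only on the log scale; with eight observations they should be read as rough approximations.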
Re: [R] simulation of levene's test
On Mon, May 28, 2012 at 12:14 PM, Özgür Asar oa...@metu.edu.tr wrote:
> Dear Dila,
>
> Try the following:
>
>     library(Rcmdr)

Or avoid the unnecessary overhead of Rcmdr and use library(car), which provides leveneTest, instead.

>     asim <- 1000
>     pv <- NULL

It's also far more efficient to preallocate pv and then simply put things into it:

    pv <- vector("numeric", 1000)

>     for (i in 1:asim) {
>       print(i)
>       set.seed(i)

Setting the seed each loop seems excessive, but I suppose it's a matter of taste.

>       g1 <- rnorm(20, 0, 2)
>       g2 <- rnorm(20, 0, 2)
>       g3 <- rnorm(20, 0, 2)
>       x <- c(g1, g2, g3)

Is there any reason not to do this as

    x <- rnorm(60, 0, 2)

>       group <- as.factor(c(rep(1, 20), rep(2, 20), rep(3, 20)))

and this as

    as.factor(rep(1:3, each = 20))

>       pv <- c(pv, leveneTest(x, group)$"Pr(>F)"[1])

Once you preallocate pv, change this to

    pv[i] <- leveneTest(x, group)$"Pr(>F)"[1]

But it's even better not to use the dollar sign shortcut here (defensive programming and all that -- particularly with nonstandard names, which I'm pretty sure won't give a big error here but will elsewhere):

    pv[i] <- leveneTest(x, group)[["Pr(>F)"]][1]

And even better would be to do this all using the replicate function, but I'll leave that as an exercise to the reader.

Michael

>     }
>
> Best
> Ozgur
>
> -----
> Ozgur ASAR
> Research Assistant
> Middle East Technical University
> Department of Statistics
> 06531, Ankara Turkey
> Ph: 90-312-2105309
> http://www.stat.metu.edu.tr/people/assistants/ozgur/
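One possible shape for the replicate() version is sketched below. To keep the example free of package dependencies it does not call car::leveneTest; instead the hypothetical helper levene_p() computes a Levene-type test directly, as a one-way ANOVA on absolute deviations from the group medians (which, as far as I know, matches the default centering in car, but treat that as an assumption).

```r
# Hypothetical helper: median-centred Levene test via one-way ANOVA
# on the absolute deviations from each group's median.
levene_p <- function(x, group) {
  dev <- abs(x - ave(x, group, FUN = median))
  anova(lm(dev ~ group))[["Pr(>F)"]][1]
}

set.seed(1)                              # one seed for the whole simulation
group <- factor(rep(1:3, each = 20))     # three groups of 20 observations

# replicate() re-evaluates the expression 1000 times and collects the p-values.
pv <- replicate(1000, levene_p(rnorm(60, 0, 2), group))

# Empirical rejection rate at the 5% level under equal variances.
rejection_rate <- mean(pv < 0.05)
```

Since all three groups are drawn with the same standard deviation, the rejection rate estimates the test's type I error; replicate() handles both the looping and the preallocation.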
Re: [R] Hash Table - Select and Change Data iniside Matrix Using Between
I already answered this. Don't double post questions.
Re: [R] How do I diagnosis what's wrong with R crash?
On 27.05.2012 19:50, Michael wrote:
> R Console closed and exited quietly after loading RODBC... I suspect that's a segmentation fault crash...

Could be. Have you read the posting guide? How could we reproduce this? Which OS, version of R, version of RODBC?

> but how do I figure out what exactly is the problem? where can I find the core dump?

Ask the vendor of your unstated OS. Locations of core dumps are not related to R.

Uwe Ligges

> Thanks a lot! (but this doesn't occur when I use R in RStudio...)
Re: [R] Reading a bunch of csv files into R
Here's what I would do then, to keep it simple.

1. Put all the relevant csv files into a single directory.
2. setwd() to that directory.
3. Use the approach I suggested before:

    files <- list.files(pattern = "\\.(csv|CSV)$")
    for (i in 1:length(files)) {
      temp <- read.csv(files[i], header = FALSE)
      # ... do whatever you want with the contents of temp ...
    }

Under "...do whatever you want..." the contents of each individual file are temporarily in the data frame 'temp'. Use the decoded file names (in files[]) to figure out what you need to do with that particular file's contents. Then do it.

Since it sounds like you need to hold each 'temp' for possible combination with other 'temp's, you could initialize an empty list of the right size (faster), then store each 'temp' in it (which from your note was where you are headed). That would mean changing the above to something like this (in approx/pseudo code):

    files <- list.files(pattern = "\\.(csv|CSV)$")
    myList <- vector("list", length(files))
    names(myList) <- paste("Data", files, sep = ".")
    for (i in 1:length(files)) {
      # Double brackets store the whole data frame as element i;
      # single brackets (myList[i] <- ...) would not do what you want here.
      myList[[i]] <- read.csv(files[i], header = FALSE)
      # I'd stick with numerical indices for lists.
      # Indexing of lists is a pain but once you get it they rock.
      # See ?"[" and study it carefully.
      # One thing it says which is helpful is:
      # "The most important distinction between [, [[ and $ is that the [ can
      # select more than one element whereas the other two select a single element."
    }

This gets them all read in, with each csv as a data frame in myList (so myList is a list of data frames). Now you can loop over myList and work on the data itself (and edit the file names as you go). It sounds like you would have to grep for phrases in the list element names (names(myList)) to figure out which ones you want. You could grep and subset myList and basically turn it into related chunks of the original.

HTH.
Bryan
[R] question how to add Standard Deviation as Whiskers in a simple plot
Dear Researchers,

Sorry for this simple question. I have a point plot of mean values and I wish to add lines showing the standard deviation as whiskers. I calculated mean+sd and mean-sd, but I cannot figure out how to add the lines.

    mydata <- data.frame(
      mean   = c(0.42,0.41,0.41,0.43,0.45,0.43,0.43,0.42,0.44,0.45,0.45,0.45,0.46,0.43,0.42,0.37,0.44,0.46,0.46,0.39,0.40),
      sdUP   = c(0.58,0.56,0.55,0.57,0.61,0.55,0.57,0.59,0.61,0.60,0.57,0.60,0.62,0.57,0.59,0.56,0.57,0.61,0.61,0.56,0.54),
      sdDOWN = c(0.26,0.26,0.28,0.29,0.30,0.30,0.29,0.26,0.28,0.31,0.34,0.30,0.31,0.30,0.25,0.19,0.31,0.31,0.31,0.22,0.25))
    plot(mydata$mean, type = "o", ylab = "mean", xlab = "class")

Thanks in advance, and sorry for any disturbance.
Gianni
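One common base-graphics way to draw such whiskers is arrows() with flat heads at both ends. The sketch below uses only the first five rows of the posted data to stay short; the ylim choice and the head length of 0.05 inches are arbitrary illustrative settings.

```r
# First five rows of the posted data (shortened for illustration).
mydata <- data.frame(
  mean   = c(0.42, 0.41, 0.41, 0.43, 0.45),
  sdUP   = c(0.58, 0.56, 0.55, 0.57, 0.61),
  sdDOWN = c(0.26, 0.26, 0.28, 0.29, 0.30)
)
x <- seq_len(nrow(mydata))

# Widen the y-axis so the whiskers are not clipped.
plot(x, mydata$mean, type = "o",
     ylim = range(mydata$sdDOWN, mydata$sdUP),
     ylab = "mean", xlab = "class")

# angle = 90 with code = 3 draws a flat cap at both ends of each segment,
# which gives the usual error-bar look.
arrows(x, mydata$sdDOWN, x, mydata$sdUP,
       angle = 90, code = 3, length = 0.05)
```

If plain lines without caps are enough, segments(x, mydata$sdDOWN, x, mydata$sdUP) does the same job.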
[R] Arima model, breusch godfrey/breusch pagan test
Hi all,

I estimated a simple regression model (ror_xxx ~ ror_spi_xxx) and assessed the quality of this estimation. After detecting indications of autocorrelation and an AR(1) process, I used an arima model:

    absi.arima = arima(ror_absi, order = c(1,0,0), xreg = ror_spi_absi)

Output:

    absi.arima

    Call:
    arima(x = ror_absi, order = c(1, 0, 0), xreg = ror_spi_absi)

    Coefficients:
              ar1  intercept  ror_spi_absi
          -0.5377     -1e-04       -0.0060
    s.e.   0.0752      3e-04        0.0215

    sigma^2 estimated as 1.579e-05:  log likelihood = 513.49,  aic = -1018.97

This eliminated the arch effect in my model, but I want to check whether there is still any autocorrelation in my model (with the Breusch-Godfrey test, bgtest). My question is how to implement this with the bgtest function. Since it requires either the exact equation of the model or a fitted lm model, I do not have any idea what to do now. Is there a simple solution for my problem? I have the same question for the Breusch-Pagan test.

Any suggestions are highly appreciated!

Kind regards,
Andi
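This is not an answer to the bgtest question itself, but one commonly used alternative for checking leftover autocorrelation in an arima fit is a Ljung-Box test on its residuals, available in base R as Box.test(). The sketch below swaps that test in for Breusch-Godfrey; the simulated series x and y are stand-ins for ror_spi_absi and ror_absi, and the lag of 10 is an arbitrary illustrative choice.

```r
set.seed(42)
n <- 200

# Simulated stand-ins for the poster's series (not the real data).
x <- rnorm(n)                                      # plays the role of ror_spi_absi
y <- 0.5 * x + arima.sim(list(ar = -0.5), n = n)   # plays the role of ror_absi

# Regression with AR(1) errors, as in the post.
fit <- arima(y, order = c(1, 0, 0), xreg = x)

# Ljung-Box test on the residuals; fitdf = 1 accounts for the one
# estimated AR coefficient when computing the degrees of freedom.
lb <- Box.test(residuals(fit), lag = 10, type = "Ljung-Box", fitdf = 1)
lb$p.value
```

A large p-value would indicate no evidence of remaining autocorrelation up to the chosen lag; tsdiag(fit) shows a related graphical check.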
Re: [R] R quantreg anova: How to change summary se-type
Stefan,

You could try this: make a private version of anova.rqlist and change the call to lapply that computes the summaries so that it uses se = "ker" instead of se = "nid". This is about 20 lines into the code. Please let me know if this does what you would like to do. Could you also explain what you mean by "leads to mistakes" below?

Thanks,
Roger

url:    www.econ.uiuc.edu/~roger      Roger Koenker
email   rkoen...@uiuc.edu             Department of Economics
vox:    217-333-4558                  University of Illinois
fax:    217-244-6678                  Urbana, IL 61801

On May 28, 2012, at 7:54 AM, stefan23 wrote:
> Hi folks =)
>
> I want to check whether a coefficient has an impact in a quantile regression, by applying the sup-Wald test for a given quantile range [0.05, 0.95]. Therefore I am doing the following calculations:
>
>     taus <- 5:95/100
>     a <- numeric(length(taus))
>     for (k in seq_along(taus)) {
>       fitrestricted <- rq(Y ~ X1 + X2, tau = taus[k])
>       fitunrestricted <- rq(Y ~ X1 + X2 + X3, tau = taus[k])
>       a[k] <- anova(fitrestricted, fitunrestricted)$table$Tn  # gives the test value
>     }
>     supW <- max(a)
>
> As anova uses the summary.rq function, I want to change the standard error method used (the default se = "nid" leads to mistakes; I prefer se = "ker"). Do you know how to pass this information in the anova syntax?
>
> Thank you very much
> Stefan
[R] Factanal fits
Many thanks to Michael Weylandt and Prof. Ripley for answers to yesterday's query.

1. The response to methods(print) is that the print.princomp method is non-visible, not suppressed, as I misquoted. The method can be located either by getAnywhere(print.princomp), as suggested by Michael, or by getS3method(f = 'print', class = 'factanal'), as suggested by Prof. Ripley. I learn something new about R every day!

2. Prof. Ripley is correct, of course. To print the "Test of the hypothesis that X factors are sufficient" when submitting a covmat, the factanal function needs to know n.obs. The following call, packaging n.obs with the covmat, worked perfectly:

    cor3.fa1 <- factanal(factors = 6, covmat = list(cov = cor3, n.obs = 418))

Again, thanks.

Larry Hunsicker
Prof. Medicine, U. Iowa College of Medicine
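For readers without the cor3 matrix, the same pattern can be tried on simulated data. In this sketch the six variables and the single common factor are invented for illustration; only the covmat = list(cov = ..., n.obs = ...) form and the getS3method() call come from the message above.

```r
set.seed(1)
n <- 418

# Six simulated variables sharing one common factor (illustrative data only).
f <- rnorm(n)
X <- sapply(1:6, function(j) 0.7 * f + rnorm(n))
colnames(X) <- paste0("V", 1:6)

# Supplying n.obs alongside the correlation matrix lets factanal()
# compute the test that the chosen number of factors is sufficient.
fa <- factanal(factors = 1, covmat = list(cov = cor(X), n.obs = n))
fa$STATISTIC   # chi-square statistic of the sufficiency test
fa$dof         # its degrees of freedom
fa$PVAL        # its p-value

# The non-visible print method can be retrieved like this.
print_method <- getS3method("print", "factanal")
```

Without n.obs, the fit still succeeds but the STATISTIC and PVAL components are absent, so the hypothesis test line is not printed.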
[R] RCurl postForm() not working for me
Hello,

I am trying the postForm() function on a very simple webpage: http://www.colby.edu/chemistry/PChem/Hartree.html

I am simply trying to fill the Hartrees text field with the value 100. But running this:

    url = "http://www.colby.edu/chemistry/PChem/Hartree.html"
    test = postForm(url, H = 100)
    cat(test, file = "test.html")
    shell.exec("test.html")

returns an identical webpage with an empty HTML form, so nothing changes. Why is this not working? I have tried postForm() on several other pages with the same results. I do get getForm() going, though.

If anybody has ANY experience with these RCurl functions, please help. ANY input is appreciated. The example functions in the package don't seem to work at all for postForm(); I think all the pages are outdated. I also couldn't find any plain documentation on omegahat.org, and I can't load the html pages.

Best
Sven
Re: [R] Reading a bunch of csv files into R
Dear Bryan

Thank you so much for your prompt reply! Please see my responses below, marked with =>, in your reply... Many thanks again!

HJ

On Mon, May 28, 2012 at 4:45 PM, Bryan Hanson han...@depauw.edu wrote:

OK, a couple of things (I only looked through quickly):

1. R doesn't allow variable names to begin with a number. Be sure you don't try that.

Yes, I understand this. Some of my csv file names begin with a number, so I put 'Data' in front of them using 'NAME <- paste("Data", data_names, sep=".")' as shown in my last email.

2. What's the overall goal here? Read them in, change the name, then write them out? Let us know and it will be easier to help you.

=> The overall goal is this: for my current study I receive hundreds of csv files every two weeks, and I need to read them into R for further analysis. The data are recorded at 10-minute intervals and are collected every two weeks from a few hundred monitors. So I want to know how to do these jobs more efficiently:

(i) Read them into R; put the data from the same monitors together and check for missing values; manipulate the data in the way we need, e.g. according to region or monitoring type, which involves aggregating the whole group (or a subgroup) of the data, etc.;

(ii) Edit the names, because sometimes we want to match names in one format to another, e.g. 512180_20120523150757 ==> London_2012_May_23rd_15:07:57 (i.e. Location name_Year_Month_Day_Hour_Minute_Second);

(iii) If (i) and (ii) can be done, I would think writing them out to csv would not be too difficult. We mainly do the analysis in R and have had no need for csv output so far...

3. Regardless of your goal, I think you are overthinking the solution. Let us know what you want to accomplish and we can shorten it up, I'm sure.

=> I am trying to input the data as a list, which might be easier, but I am not sure whether another data type has an advantage over that...
    Data1 <- list( NAME)
    [1] NAME Data.512180_20120523150757 Data.513687_20120523181947 Data.513690_20120524112111 Data.521858_20120524091428 Data.523215_20120523123419
    for(i in 1:length(filenames)) {Data1[[i]] <- read.csv(filenames[i])}

But when I tried to access the components of this list 'Data1', only the first of the three methods shown below works, and I think the other two are more useful for me. Any ideas??

    (1) Data1[[1]]                                *** this one works
    (2) Data1[["Data.512180_20120523150757"]]     *** this one doesn't work
    (3) Data1$Data.512180_20120523150757          *** this one doesn't work

Hope I have made myself clear here. Thanks!

HJ

Bryan

On May 28, 2012, at 11:20 AM, HJ YAN wrote:

Dear Rui, Kevin, Bryan and Nutter

Thank you so much for your very helpful hints! I have now extracted all the file names and managed to edit them using the code (1)-(4) below, and obtained the name format I wanted:

    (1) files <- list.files(path = "myworking directory", pattern = NULL, all.files = FALSE, full.names = FALSE, recursive = FALSE, ignore.case = FALSE, include.dirs = FALSE)
    (2) filenames <- files[grep("[.]csv", files)]
        [1] "512180_20120523150757.csv" "513687_20120523181947.csv" "513690_20120524112111.csv" "521858_20120524091428.csv" "523215_20120523123419.csv" ...(a few hundred more...)
    (3) data_names <- gsub("[.]csv", "", filenames)
    (4) NAME <- paste("Data", data_names, sep=".")

Up to here I have NAME containing all the names I'm going to use:

    > NAME
    [1] "Data.512180_20120523150757" "Data.513687_20120523181947" "Data.513690_20120524112111" "Data.521858_20120524091428" "Data.523215_20120523123419"

But I still haven't successfully read the whole bunch of csv files into R and named them as expected. E.g. I want to read 512180_20120523150757.csv into R and name it Data.512180_20120523150757, and so on. For a single file we can just write

    Data.512180_20120523150757 <- read.csv("512180_20120523150757.csv")

If any of the following commands (as you suggested) worked, my question would be sorted out. But I got error messages for every attempt...
(i)

    df.list <- lapply(seq_len(filenames), read.csv)
    Error in seq_len(filenames) : argument must be coercible to non-negative integer
    In addition: Warning message: In is.vector(X) : NAs introduced by coercion

    > filenames
    [1] "512180_20120523150757.csv" "513687_20120523181947.csv" "513690_20120524112111.csv" "521858_20120524091428.csv"
    [5] "523215_20120523123419.csv" ...

(ii) None of the following code works...

    myDir = "myworking directory"
    #for(i in 1:length(filenames)){assign(NAME[i], read.csv(file.path(myDir, filenames[i])))}
    #for(i in 1:5){assign(NAME[i], read.csv(file.path=myDir, filenames[i]))}
    setwd("myworking directory")
    #for(i in 1:5){assign(NAME[i], read.csv( filenames[i]))}

    Warning messages:
    1: In N[i] <- read.csv(filenames[i]) : number of items to replace is not a multiple of replacement length
    2: In N[i] <- read.csv(filenames[i]) : number of items to replace is not a multiple of
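The access problems described in this thread (lookup by name failing, seq_len(filenames) erroring) can be avoided by iterating over the file names themselves and then naming the list. This is a minimal sketch, not the posters' final solution: the temporary directory, the tiny demo files and the column name value are hypothetical, and only the file-name pattern is taken from the thread.

```r
# Hypothetical setup: write two small csv files into a temporary directory
# so the sketch runs anywhere; the file names mimic those in the thread.
my_dir <- file.path(tempdir(), "csv_demo")
dir.create(my_dir, showWarnings = FALSE)
write.csv(data.frame(value = 1:3),
          file.path(my_dir, "512180_20120523150757.csv"), row.names = FALSE)
write.csv(data.frame(value = 4:6),
          file.path(my_dir, "513687_20120523181947.csv"), row.names = FALSE)

# Collect the csv file names and build names that do not start with a digit.
filenames <- list.files(my_dir, pattern = "[.]csv$")
NAME <- paste("Data", sub("[.]csv$", "", filenames), sep = ".")

# Iterate over the file paths themselves (not seq_len(filenames)),
# then attach the names so that [["..."]] and $ lookups work.
Data1 <- lapply(file.path(my_dir, filenames), read.csv)
names(Data1) <- NAME

print(names(Data1))
print(Data1[["Data.512180_20120523150757"]])
print(Data1$Data.513687_20120523181947$value)
```

Keeping the data frames in one named list, rather than creating hundreds of objects with assign(), also makes later steps such as lapply() over all monitors straightforward.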
Re: [R] Hash Table - Select and Change Data iniside Matrix Using Between
Hi,

I guess this is what you are looking for:

    dat[dat$AGE <= 40 & dat$AGE >= 30, "TRUE/FALSE"] <- TRUE

A.K.

----- Original Message -----
From: Rantony antony.akk...@ge.com
To: r-help@r-project.org
Sent: Monday, May 28, 2012 10:01 AM
Subject: [R] Hash Table - Select and Change Data iniside Matrix Using Between

Hi,

I have a matrix like this:

    NAME  AGE  PALCE      TRUE/FALSE
    ABC   20   INDIA
    XYZ   30   FRANCE
    PQR   40   USA
    MNO   30   KENIYA
    DEF   25   AUSTRALIA
    GTY   34   CANADA
    BNH   38   JAPAN

Here the TRUE/FALSE column contains empty values. My requirement is to change the TRUE/FALSE column value to TRUE wherever AGE >= 32.

Note: I don't want to use any loop. The main intention is to avoid a loop, because there is a lot of data.

The final matrix should look like this:

    NAME  AGE  PALCE      TRUE/FALSE
    ABC   20   INDIA
    XYZ   30   FRANCE
    PQR   40   USA
    MNO   30   KENIYA
    DEF   25   AUSTRALIA
    GTY   34   CANADA     TRUE
    BNH   38   JAPAN      TRUE

I finally got one solution like this. If dat is the name of your data.frame,

    dat[dat$AGE == 30, "TRUE/FALSE"] <- TRUE

But what should I use if I want to change the value to TRUE where AGE is between 30 and 40? Immediate help required.

-- View this message in context: http://r.789695.n4.nabble.com/Hash-Table-Select-and-Change-Data-iniside-Matrix-Using-Between-tp4631582.html
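The loop-free "between" update suggested in this thread can be sketched as follows. The data frame is rebuilt by hand from the table in the question (column names, including PALCE, are copied as posted), and the inclusive bounds 30 and 40 are an assumption about what "between" means here.

```r
# Rebuild the example table; check.names = FALSE keeps "TRUE/FALSE" as a column name.
dat <- data.frame(
  NAME  = c("ABC", "XYZ", "PQR", "MNO", "DEF", "GTY", "BNH"),
  AGE   = c(20, 30, 40, 30, 25, 34, 38),
  PALCE = c("INDIA", "FRANCE", "USA", "KENIYA", "AUSTRALIA", "CANADA", "JAPAN"),
  check.names = FALSE, stringsAsFactors = FALSE
)
dat[["TRUE/FALSE"]] <- NA   # the column starts out empty

# Vectorised condition: one logical value per row, no loop needed.
in_range <- dat$AGE >= 30 & dat$AGE <= 40

# Assign TRUE only to the rows where the condition holds.
dat[in_range, "TRUE/FALSE"] <- TRUE

print(dat)
```

Because the comparison and the assignment are both vectorised, the same two lines scale to large data frames without an explicit loop.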
Re: [R] question how to add Standard Deviation as Whiskers in a simple plot
On May 28, 2012, at 9:55 AM, gianni lavaredo wrote:

> Dear Researchers,
>
> sorry for this simple question. I have a point plot with mean values and I wish to plot lines with the standard deviation as whiskers. I calculated mean+sd and mean-sd, but I cannot figure out how to add the lines.
>
>     mydata <- data.frame(mean = c(0.42,0.41,0.41,0.43,0.45,0.43,0.43,0.42,0.44,0.45,0.45,0.45,0.46,0.43,0.42,0.37,0.44,0.46,0.46,0.39,0.40),
>                          sdUP = c(0.58,0.56,0.55,0.57,0.61,0.55,0.57,0.59,0.61,0.60,0.57,0.60,0.62,0.57,0.59,0.56,0.57,0.61,0.61,0.56,0.54),
>                          sdDOWN = c(0.26,0.26,0.28,0.29,0.30,0.30,0.29,0.26,0.28,0.31,0.34,0.30,0.31,0.30,0.25,0.19,0.31,0.31,0.31,0.22,0.25))
>     plot(mydata$mean, type="o", ylab="mean", xlab="class")
>
> thanks in advance and sorry for any disturbance

If you tried lines() using either sdUP or sdDOWN you saw nothing, because the ylim was set by default using only the information in the mean vector. Try:

    plot(mydata$mean, type="o", ylab="mean", xlab="class",
         ylim=range(c(mydata$sdUP, mydata$sdDOWN)))
    lines(mydata$sdDOWN, col="blue", lty=3)
    lines(mydata$sdUP, col="blue", lty=3)

-- David.
Re: [R] Arima model, breusch godfrey/breusch pagan test
On Mon, 28 May 2012, and_mue wrote:

> Hi all
>
> I estimated a simple regression model (ror_xxx~ror_spi_xxx) and assessed the quality of this estimation. After detecting indications of autocorrelation and an AR(1) process, I used an arima model:
>
>     absi.arima = arima(ror_absi, order=c(1,0,0), xreg=ror_spi_absi)
>
> Output:
>
>     > absi.arima
>     Call: arima(x = ror_absi, order = c(1, 0, 0), xreg = ror_spi_absi)
>     Coefficients:
>               ar1  intercept  ror_spi_absi
>           -0.5377     -1e-04       -0.0060
>     s.e.   0.0752      3e-04        0.0215
>     sigma^2 estimated as 1.579e-05: log likelihood = 513.49, aic = -1018.97
>
> This eliminated the arch effect in my model, but I want to check whether there is still any autocorrelation in my model (with the Breusch-Godfrey test, bgtest). My question is how to implement this with the bgtest function. Since it requires either the exact equation of the model or a fitted lm model, I do not have any idea what to do now. Is there a simple solution for my problem? The same question arises when using the Breusch-Pagan test.

The bgtest() and bptest() functions from package lmtest expect a fitted lm object. To apply them to the residuals of another model you can fit a simple constant-only model:

    m <- lm(residuals(absi.arima) ~ 1)
    bgtest(m)

It would probably be more common to consider Box-type tests as conducted by tsdiag(absi.arima).

hth,
Z

> Any suggestions are highly appreciated! Kind regards, Andi
>
> -- View this message in context: http://r.789695.n4.nabble.com/Arima-model-breusch-godfrey-breusch-pagan-test-tp4631617.html
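The advice above can be sketched end to end on simulated data. Everything here is illustrative: the series x and y are made up, the lag of 10 is an arbitrary choice, and the lmtest call is guarded because that package may not be installed. Box.test() from the stats package is used as one example of the Box-type tests mentioned in the reply.

```r
set.seed(1)
n <- 200

# Simulated regressor and response with AR(1) errors (hypothetical data).
x <- rnorm(n)
y <- 0.5 * x + as.numeric(arima.sim(list(ar = -0.5), n = n))

# Regression with AR(1) errors, as in the thread.
fit <- arima(y, order = c(1, 0, 0), xreg = x)
res <- as.numeric(residuals(fit))

# Box-type test on the residuals; fitdf = 1 accounts for the one AR coefficient.
lb <- Box.test(res, lag = 10, type = "Ljung-Box", fitdf = 1)
print(lb)

# Constant-only lm on the residuals, so that lm-based tests can be applied.
m <- lm(res ~ 1)
if (requireNamespace("lmtest", quietly = TRUE)) {
  print(lmtest::bgtest(m))   # Breusch-Godfrey test for remaining autocorrelation
}
```

A large p-value from either test would suggest that no autocorrelation remains in the residuals at the lags examined.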
Re: [R] fda modeling
Hi, Troels: I'm still trying to understand the structure of your data. Please check the discussion below. If what I suggest is correct, it should make the analysis much more routine and therefore easier requiring less time to analyze. On 5/21/2012 1:33 PM, Troels Ring wrote: Dear friends - We have 25 rats, 14 of these subjected to partial removal of kidney tissue, 11 to sham operation, and then followed for 6 weeks. So far we have data on 26 urine metabolites measured by NMR 7 times during the observation. So you collected urine samples at 7 different times on each rat throughout the experiment, separated out 26 different metabolites and measured each of those 7 using Nuclear Magnetic Resonance (NMR)? What were the ages of the rats at the time of the operation and at the times that each of the 7 urine samples were collected? In particular, were the 7 urine samples equally spaced? If yes, that could simplify the analysis. The greater the time differences between samples and between rats, the more difficult the analysis potentially. What were the ages of the 25 rats? Were they all from the same litter? If no, how were they related? The worst possible case is that you have 14 from one litter and 11 from another. If that's the case, then any difference you see between the two groups could be a litter effect. If they are one rat from each of 25 litters, that would simplify the statistical analyses. Scientifically, the best might be to have at 4 or 5 rats from each of 6 litters, assigned with at least 2 rats to experiment and 2 to control from each litter. You probably don't have that, but the litter effect is likely to be important and that needs to be part of the analysis, I think. I have smoothed the measurements by b.splines in fda including a roughness penalty, and inspecting the mean curves for nephrectomized and sham animals indicate differences for several of the metabolites. 
Now the real idea is to use the NMR measurements to understand what goes on in the kidneys since we know the partial removal of kidney tissue will result in progressive damage in the kidneys - the nature of that is what we want to understand. We have a blood sample from the rats just prior to sacrifice, and the creatinine concentration there is a good proxy for renal function. So you have one measure of creatinine for each rat measured just prior to sacrifice? So the course of concentrations of the metabolites are thought to be valuable in understanding the physiology. Some of these are thought to be correlated. We have two groups where sham animals have better renal function than partially nephrectomized, but there is variation in both groups which is also interesting - some animals progress more rapidly after the same operation than others - we would like to know why. The data are available (eventually - the resulting blood tests still are missing) if anyone would like to have a look but the main issue is if it is at all feasible to make fda work on such a problem. I suggest you forget about fda at least initially and start with simpler, more traditional tools. Later, you may or may not want to return to fda. I suggest you proceed as follows: I. DATA CLEANING: Make normal probability plots of everything: I'd start with making one normal probability plot for each of the 26 metabolites. Normally distributed data with approximately the same mean and standard deviation will look approximately like a straight line. The scientist's dream with this is the image of two lines with a gap in the middle, with the two lines corresponding exactly to the two groups (nephrectomized vs. controls). It's more likely that you will see mostly one distribution with a few observations away from a moderately straight line in the middle. 
If you see this, you should check the records and samples for the deviant observations to see if you can find, e.g., a data entry error or a problem with mishandling a sample. If you can't fix any observation that way, you should replace the numbers with NA (not available = missing). Another possibility is you see several little clusters corresponding to the litters. Or you might see curvature to the line; with curvature, if all the numbers are positive, you should try normal plots of the logarithms. If that helps straighten out the lines, you should analyze the logarithms not the raw numbers. I usually do this with something like qqnorm(x, datax=TRUE). The use of datax means that with one or more outliers, the slope of the center portion will be closer to 45 degrees and therefore more easily processed with the naked eye. II. UNIVARIATE ANALYSES: After data cleaning, I'd then use something like lme{nlme} to analyze each response variable (metabolite or creatinine) separately. I recommend lme, because it is exactly what is needed for this
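The data-cleaning step described above (normal probability plots with datax=TRUE, trying logarithms when the line curves, and replacing unfixable values with NA) might look like the sketch below. The concentrations are simulated and the outlier is planted, so all numbers are hypothetical.

```r
set.seed(42)

# Hypothetical right-skewed concentrations for one metabolite, plus one
# planted outlier standing in for a data-entry error.
conc <- c(rlnorm(49, meanlog = 1, sdlog = 0.6), 500)

grDevices::pdf(NULL)   # null device, so the sketch also runs without a display
qq_raw <- qqnorm(conc, datax = TRUE, main = "raw values")
qq_log <- qqnorm(log(conc), datax = TRUE, main = "log values")
invisible(dev.off())

# Flag the most deviant observation; if the records cannot explain or
# correct it, replace it with NA rather than deleting the row.
suspect <- which.max(conc)
conc_clean <- replace(conc, suspect, NA)
print(suspect)
```

In an interactive session the pdf(NULL) and dev.off() lines can be dropped so the two plots appear on screen for visual inspection.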
Re: [R] question how to add Standard Deviation as Whiskers in a simple plot
What I am looking for is bars drawn from the mean points of the plot, in boxplot style. I tried several forums, but it is not clear to me how to create these bars.

Gianni

On Mon, May 28, 2012 at 7:13 PM, David Winsemius dwinsem...@comcast.net wrote:

> On May 28, 2012, at 9:55 AM, gianni lavaredo wrote:
>
>> Dear Researchers,
>>
>> sorry for this simple question. I have a point plot with mean values and I wish to plot lines with the standard deviation as whiskers. I calculated mean+sd and mean-sd, but I cannot figure out how to add the lines.
>>
>>     mydata <- data.frame(mean=c(0.42,0.41,0.41,0.43,0.45,0.43,0.43,0.42,0.44,0.45,0.45,0.45,0.46,0.43,0.42,0.37,0.44,0.46,0.46,0.39,0.40),
>>                          sdUP=c(0.58,0.56,0.55,0.57,0.61,0.55,0.57,0.59,0.61,0.60,0.57,0.60,0.62,0.57,0.59,0.56,0.57,0.61,0.61,0.56,0.54),
>>                          sdDOWN=c(0.26,0.26,0.28,0.29,0.30,0.30,0.29,0.26,0.28,0.31,0.34,0.30,0.31,0.30,0.25,0.19,0.31,0.31,0.31,0.22,0.25))
>>     plot(mydata$mean, type="o", ylab="mean", xlab="class")
>>
>> thanks in advance and sorry for any disturbance
>
> If you tried lines() using either sdUP or sdDOWN you saw nothing, because the ylim was set by default using only the information in the mean vector. Try:
>
>     plot(mydata$mean, type="o", ylab="mean", xlab="class",
>          ylim=range(c(mydata$sdUP, mydata$sdDOWN)))
>     lines(mydata$sdDOWN, col="blue", lty=3)
>     lines(mydata$sdUP, col="blue", lty=3)
>
> -- David.
Re: [R] question how to add Standard Deviation as Whiskers in a simple plot
These posts are useful:

http://myowelt.blogspot.com.br/2008/03/beautiful-error-bars-in-r.html
http://mapas.mma.gov.br/i3geo/pacotes/rlib/win/gplots/html/plotCI.html

Walmes.

==
Walmes Marques Zeviani
LEG (Laboratório de Estatística e Geoinformação, 25.450418 S, 49.231759 W)
Departamento de Estatística - Universidade Federal do Paraná
fone: (+55) 41 3361 3573
VoIP: (3361 3600) 1053 1173
e-mail: wal...@ufpr.br
twitter: @walmeszeviani
homepage: http://www.leg.ufpr.br/~walmes
linux user number: 531218
==
Re: [R] question how to add Standard Deviation as Whiskers in a simple plot
Hello,

The function 'arrows' with angle=90 can do the job.

    mydata <- data.frame(mean=c(0.42,0.41,0.41,0.43,0.45,0.43,0.43,0.42,0.44,0.45,0.45,0.45,0.46,0.43,0.42,0.37,0.44,0.46,0.46,0.39,0.40),
                         sdUP=c(0.58,0.56,0.55,0.57,0.61,0.55,0.57,0.59,0.61,0.60,0.57,0.60,0.62,0.57,0.59,0.56,0.57,0.61,0.61,0.56,0.54),
                         sdDOWN=c(0.26,0.26,0.28,0.29,0.30,0.30,0.29,0.26,0.28,0.31,0.34,0.30,0.31,0.30,0.25,0.19,0.31,0.31,0.31,0.22,0.25))
    x <- 1:nrow(mydata)
    with(mydata, plot(1, type="n", xlim=c(1, nrow(mydata)), ylim=c(min(sdDOWN), max(sdUP))))
    with(mydata, points(x, mean))
    with(mydata, arrows(x, mean, x, sdUP, angle=90, length=0.1))
    with(mydata, arrows(x, mean, x, sdDOWN, angle=90, length=0.1))

Hope this helps,

Rui Barradas

On 28-05-2012 18:24, gianni lavaredo wrote:

> What I am looking for is bars drawn from the mean points of the plot, in boxplot style. I tried several forums, but it is not clear to me how to create these bars.
>
> Gianni
>
> On Mon, May 28, 2012 at 7:13 PM, David Winsemius dwinsem...@comcast.net wrote:
>
>> On May 28, 2012, at 9:55 AM, gianni lavaredo wrote:
>>
>>> Dear Researchers,
>>>
>>> sorry for this simple question. I have a point plot with mean values and I wish to plot lines with the standard deviation as whiskers. I calculated mean+sd and mean-sd, but I cannot figure out how to add the lines.
>>>
>>>     mydata <- data.frame(mean=c(0.42,0.41,0.41,0.43,0.45,0.43,0.43,0.42,0.44,0.45,0.45,0.45,0.46,0.43,0.42,0.37,0.44,0.46,0.46,0.39,0.40),
>>>                          sdUP=c(0.58,0.56,0.55,0.57,0.61,0.55,0.57,0.59,0.61,0.60,0.57,0.60,0.62,0.57,0.59,0.56,0.57,0.61,0.61,0.56,0.54),
>>>                          sdDOWN=c(0.26,0.26,0.28,0.29,0.30,0.30,0.29,0.26,0.28,0.31,0.34,0.30,0.31,0.30,0.25,0.19,0.31,0.31,0.31,0.22,0.25))
>>>     plot(mydata$mean, type="o", ylab="mean", xlab="class")
>>>
>>> thanks in advance and sorry for any disturbance
>>
>> If you tried lines() using either sdUP or sdDOWN you saw nothing, because the ylim was set by default using only the information in the mean vector. Try:
>>
>>     plot(mydata$mean, type="o", ylab="mean", xlab="class",
>>          ylim=range(c(mydata$sdUP, mydata$sdDOWN)))
>>     lines(mydata$sdDOWN, col="blue", lty=3)
>>     lines(mydata$sdUP, col="blue", lty=3)
>>
>> -- David.
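The two ideas in this thread (setting ylim from the whisker range, and drawing the whiskers with arrows()) can be combined in one short sketch. It uses only the first five rows of the posted data to stay compact, and draws each whisker in a single arrows() call with code = 3, which is a variation on the two-call approach shown in the thread rather than a reproduction of it.

```r
# First five rows of the data posted in the thread.
mydata <- data.frame(mean   = c(0.42, 0.41, 0.41, 0.43, 0.45),
                     sdUP   = c(0.58, 0.56, 0.55, 0.57, 0.61),
                     sdDOWN = c(0.26, 0.26, 0.28, 0.29, 0.30))
x <- seq_len(nrow(mydata))

# The y axis must cover the whiskers, not just the means.
y_limits <- range(c(mydata$sdUP, mydata$sdDOWN))

grDevices::pdf(NULL)   # null device, so the sketch also runs without a display
plot(x, mydata$mean, type = "o", xlab = "class", ylab = "mean", ylim = y_limits)

# code = 3 draws a flat head (angle = 90) at both ends of each segment.
arrows(x, mydata$sdDOWN, x, mydata$sdUP, angle = 90, code = 3, length = 0.05)
invisible(dev.off())

print(y_limits)
```

In an interactive session, remove the pdf(NULL) and dev.off() lines to see the plot on screen.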
[R] Rank a numerical variable
Hello,

Is there any function in R to transform a continuous numerical variable into a ranked variable?

Thanks

-----
Mario Garrido Escudero
PhD student
Dpto. de Biología Animal, Ecología, Parasitología, Edafología y Qca. Agrícola
Universidad de Salamanca

-- View this message in context: http://r.789695.n4.nabble.com/Rank-a-numerical-variable-tp4631627.html
Re: [R] Kolmogorov-Smirnov test and the plot of max distance between two ecdf curves
Just a final correction. I was wrong: stats::ks.test doesn't use only Marsaglia et al. It's even clearly written in the help page. Read the documentation before stating!

Rui Barradas

On 28-05-2012 11:51, maxbre wrote:

> thanks for the help: I'll have a look at the papers
>
> max
>
> On 28/05/2012 12:31, Rui Barradas [via R] wrote:
>
>> Hello,
>>
>> That's a very difficult question. See
>>
>> Marsaglia, Tsang, Wang (2003) http://www.jstatsoft.org/v08/i18/
>> Simard, L'Ecuyer (2011) http://www.jstatsoft.org/v39/i11
>>
>> R's ks functions are a port of Marsaglia et al. to the .C interface.
>>
>> Rui Barradas
>>
>> maxbre wrote
>>
>>> thanks rui, that's what I was looking for
>>>
>>> I have another related question:
>>> - why is there a difference between the max distance D calculated with ks.test() and the max distance D "manually" calculated as in (2)?
>>>
>>> I guess it has something to do with the fact that KS is obtained with a maximisation that depends on the range of x values, not necessarily coincident in the two different approaches
>>>
>>> ...any thought about this?
>>>
>>> maxbre
>
> -- View this message in context: http://r.789695.n4.nabble.com/Kolmogorov-Smirnov-test-and-the-plot-of-max-distance-between-two-ecdf-curves-tp4631437p4631573.html
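On the question of why a manually computed maximum distance can differ from the D reported by ks.test(): one possible cause, in line with the poster's guess, is the set of x values at which the two ecdf curves are compared. The sketch below uses simulated samples (hypothetical data without ties) and evaluates both ecdfs at every pooled observation, where the two-sample distance is attained.

```r
set.seed(123)

# Two hypothetical samples; continuous data, so no ties are expected.
a <- rnorm(50)
b <- rnorm(60, mean = 0.5)

Fa <- ecdf(a)
Fb <- ecdf(b)

# Evaluate both step functions at every observed point of the pooled sample.
pooled <- sort(c(a, b))
gaps <- abs(Fa(pooled) - Fb(pooled))

D_manual <- max(gaps)                  # largest vertical gap
x_at_max <- pooled[which.max(gaps)]    # where to draw the distance on a plot

D_test <- unname(ks.test(a, b)$statistic)

print(c(manual = D_manual, ks.test = D_test))
```

If the manual calculation instead uses an evenly spaced grid over some range, the grid may skip the jump where the gap is largest, which would give a smaller D than ks.test().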
[R] Question about extracting certain rows from one column in a data.frame
I was wondering if there is a quick way to extract certain rows from a data set in R?

I have a data.frame, LOG, where one column, sample_data_tx, holds 62 different types of treatment. I've sub-selected the rows that contain the names PLO and NOY to make a new vector, which I call Test. Here's my code so far:

    ##In LOG data set, Test set is every treatment, PLO and NOY##
    ##Select rows in the LOG data set that contain Noy##
    Noy <- which(LOG$sample_data_tx == "Noy")
    ##Select rows in the LOG data set that contain PLO##
    PLO <- which(LOG$sample_data_tx == "PLO")
    ##Make Test Set##
    Test <- c(Noy, PLO)
    > Test
    [1]  8 24 50 23 29 46 55

Within the data.frame LOG, I would now like to make another vector, Training, that contains every row in the column sample_data_tx except rows 8, 24, 50, 23, 29, 46, 55.

Test is also an integer vector, and I am hoping to make a hierarchical plot with both the Test and Training vectors, so I am not sure whether I first need to convert the data from integer to numeric form?

I am new to R so all help is appreciated. Thanks in advance.
Re: [R] Rank a numerical variable
?rank

Michael

On Mon, May 28, 2012 at 2:17 PM, gaiarrido gaiarr...@usal.es wrote:

> hello, Is there any function in R to transform a numerical continuous variable into a ranked variable? Thanks
>
> -----
> Mario Garrido Escudero
> PhD student
> Dpto. de Biología Animal, Ecología, Parasitología, Edafología y Qca. Agrícola
> Universidad de Salamanca
>
> -- View this message in context: http://r.789695.n4.nabble.com/Rank-a-numerical-variable-tp4631627.html
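A short illustration of the rank() function pointed to above, using a made-up vector with one tie. The ties.method argument controls how equal values are ranked.

```r
# Hypothetical continuous variable with a tie (1.5 appears twice).
x <- c(3.2, 1.5, 4.8, 1.5, 2.0)

r_avg <- rank(x)                         # default: ties get the average rank
r_min <- rank(x, ties.method = "min")    # ties share the smallest rank

print(r_avg)
print(r_min)
```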
Re: [R] Rank a numerical variable
Hi,

read An Introduction to R. Is ?rank what you are looking for?

kd

On 2012.05.28 at 20:17, gaiarrido wrote:

> hello, Is there any function in R to transform a numerical continuous variable into a ranked variable? Thanks
>
> -----
> Mario Garrido Escudero
> PhD student
> Dpto. de Biología Animal, Ecología, Parasitología, Edafología y Qca. Agrícola
> Universidad de Salamanca
>
> -- View this message in context: http://r.789695.n4.nabble.com/Rank-a-numerical-variable-tp4631627.html
Re: [R] Question about extracting certain rows from one column in a data.frame
On Mon, May 28, 2012 at 3:16 PM, Kelly Cool kellycoo...@yahoo.com wrote:

> I was wondering if there is a quick way to extract certain rows from a data set in R?
>
> I have a data.frame, LOG, where one column, sample_data_tx, holds 62 different types of treatment. I've sub-selected the rows that contain the names PLO and NOY to make a new vector, which I call Test. Here's my code so far:
>
>     ##In LOG data set, Test set is every treatment, PLO and NOY##
>     ##Select rows in the LOG data set that contain Noy##
>     Noy <- which(LOG$sample_data_tx == "Noy")
>     ##Select rows in the LOG data set that contain PLO##
>     PLO <- which(LOG$sample_data_tx == "PLO")
>     ##Make Test Set##
>     Test <- c(Noy, PLO)
>     > Test
>     [1]  8 24 50 23 29 46 55
>
> Within the data.frame LOG, I would now like to make another vector, Training, that contains every row in the column sample_data_tx except rows 8, 24, 50, 23, 29, 46, 55.

I think you're looking for negative indexing (which is, in my opinion, pretty much the best thing ever). E.g.,

    x <- letters[1:10]
    x[1:3]     # First three letters
    x[-(1:3)]  # Without the first three letters
    x[-4]      # Leave out d

etc. Of course, for this case, you might also want the subset function:

    subset(LOG, sample_data_tx %in% c("Noy", "PLO"))

> Test is also an integer vector, and I am hoping to make a hierarchical plot with both the Test and Training vectors, so I am not sure whether I first need to convert the data from integer to numeric form?

No, almost always these sorts of conversions will be taken care of for you automatically.

Best,
Michael

> I am new to R so all help is appreciated. Thanks in advance.
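Applied to the poster's setting, negative indexing gives the Training rows directly from the Test indices. The small LOG data frame below is a made-up stand-in (the treatment labels A, B, C and the value column are hypothetical); only the column name sample_data_tx and the labels Noy and PLO come from the thread.

```r
# Hypothetical stand-in for the poster's LOG data frame.
LOG <- data.frame(sample_data_tx = c("A", "Noy", "B", "PLO", "C", "Noy"),
                  value = 1:6,
                  stringsAsFactors = FALSE)

# Row numbers of the test treatments.
Test <- which(LOG$sample_data_tx %in% c("Noy", "PLO"))

# Negative indexing: every row except those in Test.
Training <- LOG[-Test, ]
TestSet  <- LOG[Test, ]

# Equivalent selection with subset() and a negated condition.
Training2 <- subset(LOG, !(sample_data_tx %in% c("Noy", "PLO")))

print(Training)
```

One caveat with this sketch: if Test is empty, LOG[-Test, ] returns no rows at all, so a logical condition such as the one used for Training2 is the safer choice when the test set might be empty.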
Re: [R] Question about extracting certain rows from one column in a data.frame
Hi Kelly,

Check ?subset in the R console. Here is a piece of code (untested):

    subset(LOG, sample_data_tx %in% c("Noy", "PLO"))

HTH,
Jorge.-

On Mon, May 28, 2012 at 3:16 PM, Kelly Cool wrote:

> I was wondering if there is a quick way to extract certain rows from a data set in R?
>
> I have a data.frame, LOG, where one column, sample_data_tx, holds 62 different types of treatment. I've sub-selected the rows that contain the names PLO and NOY to make a new vector, which I call Test. Here's my code so far:
>
>     ##In LOG data set, Test set is every treatment, PLO and NOY##
>     ##Select rows in the LOG data set that contain Noy##
>     Noy <- which(LOG$sample_data_tx == "Noy")
>     ##Select rows in the LOG data set that contain PLO##
>     PLO <- which(LOG$sample_data_tx == "PLO")
>     ##Make Test Set##
>     Test <- c(Noy, PLO)
>     > Test
>     [1]  8 24 50 23 29 46 55
>
> Within the data.frame LOG, I would now like to make another vector, Training, that contains every row in the column sample_data_tx except rows 8, 24, 50, 23, 29, 46, 55.
>
> Test is also an integer vector, and I am hoping to make a hierarchical plot with both the Test and Training vectors, so I am not sure whether I first need to convert the data from integer to numeric form?
>
> I am new to R so all help is appreciated. Thanks in advance.
[R] Rcurl, postForm()
Dear colleagues,

Could I get some assistance using postForm() to scrape the business names and addresses at this website: http://www.brantford.ca/business/LocalBusinessCommunity/Pages/BusinessDirectorySearch.aspx

I've read through http://www.omegahat.org/RCurl/RCurlJSS.pdf and scoured the web for tutorials, but I can't crack it. I'm aware that this is probably a pretty basic question, but I need some help regardless.

Yours,
Simon Kiss

    library(XML)
    library(RCurl)
    library(scrapeR)
    library(RHTMLForms)
    #Set URL
    bus <- c('http://www.brantford.ca/business/LocalBusinessCommunity/Pages/BusinessDirectorySearch.aspx')
    #Scrape URL
    orig <- getURLContent(url=bus)
    #Parse doc
    doc <- htmlParse(orig[[1]], asText=TRUE)
    #Get the forms
    forms <- getNodeSet(doc, "//form")
    forms[[1]]
    #These are the input nodes
    getNodeSet(forms[[1]], ".//input")
    #These are the select nodes
    getNodeSet(forms[[1]], ".//select")

*
Simon J. Kiss, PhD
Assistant Professor, Wilfrid Laurier University
73 George Street
Brantford, Ontario, Canada N3T 2C9
Cell: +1 905 746 7606
[R] importing multiple file form folder
Hi all,

I have a set of files (which is growing) in a folder. The files are text files. The files have this form: ... with numbers for Length (m) going up to 2000 ...

I just need the data from the first two columns (Length (m) and Temperature (C)), and no data before that. The Length (m) values are always the same. My final dataset should look like this: column 1 as Length (m); column 2 as Temperature from the first file; column 3 as Temperature from the second file; and so on.

I know how to import this manually, but I can't seem to find a way to automate it. The problem is that the number of files will keep growing for quite some time, so automation is necessary. Any help is greatly appreciated.

m
Re: [R] Rcurl, postForm()
On 28/05/12 20:46, Simon Kiss wrote:
> Dear colleagues, Could I get some assistance using postForm() to scrape the business names and addresses at this website: http://www.brantford.ca/business/LocalBusinessCommunity/Pages/BusinessDirectorySearch.aspx
> [...]

Hey Simon,

I just had a look at the source of the web page; if I am not mistaken, it involves JavaScript. I am trying the same on a different page, but couldn't get help either. If you get a solution from somewhere, please let me know.

Sven
Re: [R] importing multiple file form folder
I managed to sort something out with a for loop, but it's still not working ok. It loops through all files in the folder and imports each file from line 763 on. Then it just takes the second column (Temperature) and binds the columns (cbind). BUT it just binds the values of the last file instead of EACH file. Any ideas? Attached are two files for easier understanding.

thanks, m

http://r.789695.n4.nabble.com/file/n4631640/channel_1_20120509_153744_1.ddf channel_1_20120509_153744_1.ddf
http://r.789695.n4.nabble.com/file/n4631640/channel_1_20120509_154744_1.ddf channel_1_20120509_154744_1.ddf
Re: [R] importing multiple file form folder
Sure, I have lots of ideas, mostly involving you overwriting your results with each iteration. But unless you post your code to the list, I'll never know if my ideas are right.

Please read the posting guide. Using the Nabble interface does not exempt you from posting manners.

Sarah

On Mon, May 28, 2012 at 5:25 PM, mpavlic matevz.pav...@gi-zrmk.si wrote:
> I managed to sort something out with a for loop, but it's still not working ok. [...] BUT it just binds the values of the last file instead of EACH file. Any ideas?

--
Sarah Goslee
http://www.functionaldiversity.org
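Since the original loop was never posted, the following is only a sketch of the overwriting problem Sarah suspects and one common way around it. The file contents, the folder name and the skip value are invented for the demonstration; the real files would need the skip count discussed elsewhere in this thread.

```r
# Two small stand-in files (hypothetical content) written to a temporary folder
folder <- file.path(tempdir(), "ddf_demo")
dir.create(folder, showWarnings = FALSE)
writeLines(c("header", "0\t10.5", "1\t11.0"), file.path(folder, "a.ddf"))
writeLines(c("header", "0\t20.5", "1\t21.0"), file.path(folder, "b.ddf"))

files <- sort(list.files(folder, pattern = "\\.ddf$", full.names = TRUE))

# Pre-allocate one slot per file, so iteration i cannot overwrite iteration i - 1
temps <- vector("list", length(files))
for (i in seq_along(files)) {
  dat <- read.table(files[i], skip = 1, sep = "\t")
  temps[[i]] <- dat[, 2]   # keep only the temperature column
}

# Bind once, after the loop has collected every file
result <- do.call(cbind, temps)
```

A loop that assigns to the same object on every pass (for example `result <- cbind(dat[, 2])`) keeps only the last file, which matches the symptom described.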
Re: [R] glm(weights) and standard errors
Thanks Peter for your clarifications. Yes, the definition I'm looking for is:

- I have 0.1 observations identical to this one, i.e. this row and nine others similar (but not identical) to it together represent a single observation.

> in lm/glm ... the weights are really only relative

This is the problem I would like to get around.

> do we get the extra variability of the variance right?

The Wood et al paper suggests modifications to the weights to adjust for the varying amount of missingness in covariates.

I know Thomas (we're both in Auckland) so I'll ask him about the survey package.

-----Original Message-----
From: peter dalgaard [mailto:pda...@gmail.com]
Sent: Friday, 25 May 2012 9:37p
To: ilai
Cc: Steve Taylor; r-help@r-project.org
Subject: Re: [R] glm(weights) and standard errors

Weighting can be confusing: there are three standard forms of weighting which you need to be careful not to mix up, and I suspect that the imputation weights are really a 4th version.

First, there is case (replication) vs. precision weighting. A weight of 10 means one of

- I have 10 observations identical to this one
- This observation has a variance of sigma^2/10, as if it were the average of 10 observations.

There are also sampling weights:

- For each observation like this, I have 10 similar observations in the population (and I want to estimate a population parameter like the national average income or the percentage of votes at a hypothetical general election).

What R does in lm/glm is precision weights. Notice that when the variance is estimated from data, the weights are really only relative: if all observations are weighted equally (all 10, say), you get a 10-fold increase in the estimated sigma^2 and a tenfold decrease in the unscaled variance-covariance matrix. So the net result is that the standard errors are the same (but they won't be if the weights are unequal).

The three weighting schemes share the same formula for the estimates, but differ both in the estimated variance and df, and in the formula for the standard errors.

Sampling weights are the domain of the survey package, but I don't think it does replication weights (someone called Thomas may chime in and educate me otherwise). I'm not quite sure, but I think you can get from a precision-weighted analysis to a case-weighted one just by adjusting the DF for error (changing the residual df to df+sum(w)-n, and sigma^2 proportionally).

Imputation weights look like the opposite of case weights: you give 10 observations when in fact you have only one. An educated guess would be that you could do something similar as for case weights. In this case sum(w) will be much less than n, so you will decrease the residual df rather than increase it. I get this nagging feeling that it might still not be quite right, though: in the cases where the imputations actually differ, do we get the extra variability of the variance right? Or maybe we don't need to care. There is a literature on the subject.

On May 25, 2012, at 09:21 , ilai wrote:

I'm confused (I bet David is too). First and last models are the same, what do SE's have to do with anything?

    naive <- glm(extra ~ group, data=sleep)
    imputWrong <- glm(extra ~ group, data=sleep10)
    imput <- glm(extra ~ group, data=sleep10, weights=rep(0.1, nrow(sleep10)))
    lapply(list(naive, imputWrong, imput), anova)
    sapply(list(naive, imputWrong, imput), function(x) vcov(x)[1,1]/vcov(x)[2,2])
    # or another way to see it (adjust for the DF)
    coef(summary(naive))[2,2] - sqrt(198)/sqrt(18) * coef(summary(imput))[2,2]
    coef(summary(naive))[2,2] - sqrt(198)/sqrt(18) * coef(summary(imputWrong))[2,2]

Are you sure you are interpreting Wood et al. correctly? (I haven't read it, this is not rhetorical)

On Wed, May 23, 2012 at 7:49 PM, Steve Taylor steve.tay...@aut.ac.nz wrote:

Re: coef(summary(glm(extra ~ group, data=sleep[ rep(1:nrow(sleep), 10L), ] )))

Your (corrected) suggestion is the same as one of mine, and doesn't do what I'm looking for.

-----Original Message-----
From: David Winsemius [mailto:dwinsem...@comcast.net]
Sent: Tuesday, 22 May 2012 3:37p
To: Steve Taylor
Cc: r-help@r-project.org
Subject: Re: [R] glm(weights) and standard errors

On May 21, 2012, at 10:58 PM, Steve Taylor wrote:

Is there a way to tell glm() that rows in the data represent a certain number of observations other than one? Perhaps even fractional values? Using the weights argument has no effect on the standard errors. Compare the following; is there a way to get the first and last models to produce the same results?

    data(sleep)
    coef(summary(glm(extra ~ group, data=sleep)))
    coef(summary(glm(extra ~ group, data=sleep, weights=rep(10L, nrow(sleep)))))

Here's a reasonably simple way to do it:

    coef(summary(glm(extra ~ group, data=sleep[ rep(10L,nrow(sleep)), ] )))

--
David.

    sleep10 = sleep[rep(1:nrow(sleep),10),]
    coef(summary(glm(extra ~ group,
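Peter Dalgaard's remark that equal weights are only relative can be checked directly on the built-in sleep data. This is a small sketch, not a solution to the imputation-weight question; the helper name se() is introduced here for illustration only.

```r
data(sleep)
n <- nrow(sleep)

fit_plain <- glm(extra ~ group, data = sleep)
fit_w10   <- glm(extra ~ group, data = sleep, weights = rep(10, n))
fit_rep10 <- glm(extra ~ group, data = sleep[rep(seq_len(n), 10), ])

# Hypothetical helper: pull the standard errors out of a fitted model
se <- function(fit) coef(summary(fit))[, "Std. Error"]

se(fit_plain)   # baseline
se(fit_w10)     # same as baseline: a constant weight cancels out
se(fit_rep10)   # smaller: 200 rows are treated as 200 independent observations

# The replicated fit differs from the baseline only through the residual df
se(fit_rep10) * sqrt(198 / 18)
```

The last line rescales by the ratio of residual degrees of freedom (198 for the replicated data, 18 for the original), which is the same adjustment ilai applies in the quoted code.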
Re: [R] importing multiple file form folder
Hi,

if you're on a Mac, I would recommend Automator. If you're on Unix, I would recommend a handy bash script with regex. And on Windows, I don't know; you could do regex in R, couldn't you?

On 28.05.2012 at 21:02, mpavlic wrote:
> Hi all, I have a set of files (which is growing) in a folder. The files are text files. [...] I know how to import this manually, but I can't seem to find a way to automate it.
[R] convert an image to matrix, color frequencies
Dear R-users,

Please excuse me in advance for this basic question. I'm trying to compare the coloration patterns of three spider species. To do that, I was trying to convert each image to a pixel matrix and compare the matrices. So I'd like to know how to convert an image to a pixel matrix, and whether there is some way to find the color frequency and distribution in the pictures using R. I have found some posts describing the opposite procedure, but not this specifically.

Thanks a lot, any help will be greatly appreciated!
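As a rough sketch of the colour-frequency step: once an image has been decoded into a height x width x 3 array of values in [0, 1] (the layout that, for example, png::readPNG returns for an RGB file), base R is enough to count colours. The tiny array below is built by hand and stands in for a real photograph; reading the actual files is left out, and real images would probably need their colours binned first.

```r
# Stand-in for a decoded image: height x width x 3 array with values in [0, 1]
img <- array(0, dim = c(2, 2, 3))
img[1, 1, ] <- c(1, 0, 0)   # red
img[1, 2, ] <- c(1, 0, 0)   # red
img[2, 1, ] <- c(0, 0, 1)   # blue
img[2, 2, ] <- c(0, 0, 0)   # black

# One hex colour string per pixel, put back into matrix form
hex <- rgb(as.vector(img[, , 1]), as.vector(img[, , 2]), as.vector(img[, , 3]))
pixel_matrix <- matrix(hex, nrow = dim(img)[1])

# Absolute and relative frequency of each colour
freq <- sort(table(pixel_matrix), decreasing = TRUE)
rel_freq <- freq / sum(freq)
```

Two images can then be compared through their rel_freq tables, for instance after rounding the channel values so that near-identical shades fall into the same bin.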
[R] convert from float32 to 16B
I just want to convert from float32 to 16-bit integers with a scale factor of 10. I wonder why some files were converted correctly while others were not; that is, the results for some files are weird. The original files are all ok!

    dir1 <- list.files("C:\\New folder (13)", "*.img", full.names = TRUE)
    results <- list()
    for (.files in seq_along(dir1)) {
      file2 <- readBin(dir1[.files], double(), size = 4, n = 360*720, signed = TRUE)
      file2[file2 != -] <- file2[file2 != -]*10
      results[[length(results) + 1L]] <- file2
      fileName <- sprintf("C:\\SWdown_21_%d.bin", .files)
      writeBin(as.integer(results[[.files]]), fileName, size = 2)
    }
Re: [R] zoo: variable gets modified at making zoo object
Thanks for your interest. I've put the data frame alyL32007 in http://dl.dropbox.com/u/3180464/alyL32007.rda ready to be used with load()

Agus

On Mon, May 28, 2012 at 6:11 PM, Gabor Grothendieck ggrothendi...@gmail.com wrote:
> On Mon, May 28, 2012 at 5:35 AM, Agustin Lobo agustin.l...@ictja.csic.es wrote:
>> I'm doing:
>>
>>     alyL32007z <- zoo(alyL32007, alyL32007$time)
>>     range(time(alyL32007z))
>>     [1] "2007-01-01 00:00:00 UTC" "2007-12-31 23:30:00 UTC"
>>
>> But then, while the original variable is:
>>
>>     summary(alyL32007$NEE_st)
>>        Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
>>     -15.340  -1.615  -0.054  -0.814   0.750   8.965   11124
>>
>> the variable within the zoo object is different:
>>
>>     summary(alyL32007z$NEE_st)
>>          Index                     alyL32007z$NEE_st
>>      Min.   :2007-01-01 00:00:00   0.335  :    7
>>      1st Qu.:2007-04-02 05:52:30   0.582  :    7
>>      Median :2007-07-02 11:45:00   0.611  :    7
>>      Mean   :2007-07-02 11:45:00   0.063  :    6
>>      3rd Qu.:2007-10-01 17:37:30   0.069  :    6
>>      Max.   :2007-12-31 23:30:00   (Other): 6363
>>                                    NA's   :11124
>>
>> and I get an error at plotting:
>>
>>     plot(alyL32007z$NEE_st)
>>     Error in plot.window(...) : invalid 'ylim' value
>>
>> Any help appreciated,
>> Thanks
>> Agus
>
> Read the last two lines and particularly the part about posting reproducible code.
>
> --
> Statistics Software Consulting
> GKX Group, GKX Associates Inc.
> tel: 1-877-GKX-GROUP
> email: ggrothendieck at gmail.com

--
Dr. Agustin Lobo
Institut de Ciencies de la Terra Jaume Almera (CSIC)
Lluis Sole Sabaris s/n, 08028 Barcelona, Spain
Tel. 34 934095410  Fax. 34 934110012
e-mail agustin.l...@ictja.csic.es
https://sites.google.com/site/aloboaleu/
Re: [R] importing multiple file form folder
Hello,

I've named your file 'file1.txt' and with readLines("file1.txt") saw 25 lines, then a header, then a table of tab-separated values. The header is full of blanks, such as the ones in 'length (m)' and 'temperature (°C)', which makes it impractical to read as column names. So if 'flist' is your list of files, try the following.

    flist <- "file1.txt"
    # First, read only one and keep only the lengths column
    Length <- read.table(flist[1], skip=26)[, 1]
    # Then read the temperatures from all files and cbind them into a matrix
    Temp <- do.call(cbind, lapply(flist, function(x) read.table(x, skip=26)[, 2]))
    # Tidy up
    colnames(Temp) <- paste("Temperature", seq_len(ncol(Temp)), sep=".")
    # And put it all together
    result <- cbind(Length=Length, Temp)

Hope this helps,
Rui Barradas

On 28-05-2012 21:02, mpavlic wrote:
> Hi all, I have a set of files (which is growing) in a folder. The files are text files. [...] I know how to import this manually, but I can't seem to find a way to automate it.
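Because the folder keeps growing, it may help to wrap the approach above in a function that rebuilds the file list on every call. This is only a sketch: the function name, its defaults and the demo files are assumptions made for illustration, and the real data would need skip = 26 as in Rui's code.

```r
# Hypothetical helper: name and defaults are illustrative, not from the thread
read_temperatures <- function(folder, pattern = "\\.txt$", skip = 0, sep = "\t") {
  files <- sort(list.files(folder, pattern = pattern, full.names = TRUE))
  read_col <- function(f, col) read.table(f, skip = skip, sep = sep)[, col]
  # One temperature column per file, bound side by side
  temps <- do.call(cbind, lapply(files, read_col, col = 2))
  colnames(temps) <- paste("Temperature", seq_along(files), sep = ".")
  # The lengths are identical across files, so read them once
  cbind(Length = read_col(files[1], 1), temps)
}

# Demo with made-up files in a temporary folder
folder <- file.path(tempdir(), "growing_folder")
dir.create(folder, showWarnings = FALSE)
writeLines(c("0\t10", "1\t11"), file.path(folder, "run1.txt"))
first <- read_temperatures(folder)

# A new file arrives; calling the function again picks it up
writeLines(c("0\t20", "1\t21"), file.path(folder, "run2.txt"))
second <- read_temperatures(folder)
```

Re-running the function reads every file again, which is simple but slow for very large folders; caching the columns already read would be a possible refinement.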
Re: [R] zoo: variable gets modified at making zoo object
On Mon, May 28, 2012 at 5:35 AM, Agustin Lobo agustin.l...@ictja.csic.es wrote:
> I'm doing:
>
>     alyL32007z <- zoo(alyL32007, alyL32007$time)

The POSIXct time is erroneously being used twice: once as part of the data and once as the index. It should be:

    alyL32007z <- zoo(alyL32007[-1], alyL32007$time)

or

    alyL32007z <- read.zoo(alyL32007, FUN = identity)

--
Statistics Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
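Why leaving the time column in the data spoils the numeric variable can be illustrated in base R, without zoo. The sketch assumes that the data frame is turned into a matrix at some point, which is the likely mechanism here but is not confirmed in the thread; the two-row data frame is a made-up stand-in for alyL32007.

```r
# Hypothetical stand-in for alyL32007: a time column plus a numeric column
df <- data.frame(
  time = as.POSIXct(c("2007-01-01 00:00:00", "2007-01-01 00:30:00"), tz = "UTC"),
  NEE_st = c(0.335, -1.615)
)

with_time    <- as.matrix(df)       # time column included
without_time <- as.matrix(df[-1])   # time column dropped first

typeof(with_time)      # "character"
typeof(without_time)   # "double"
```

A matrix can hold only one type, so the date-time column forces every cell to text. That would explain why summary() reports counts of distinct strings and why plot() cannot find a numeric ylim.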
Re: [R] convert from float32 to 16B
Most likely you have a bug in your program. Have you looked at the results of your calculations before writing them out? Since you have provided no data, we cannot reproduce what you are doing to show where the error might be, or the correct way of doing it. In almost all cases, if you think you are getting weird results, it is because your calculations are producing the weird results. It is time to learn debugging 101; there are plenty of tools that will let you examine your results and determine where your errors are.

On Mon, May 28, 2012 at 4:10 PM, sam84 samiye...@yahoo.co.uk wrote:
> I just want to convert from float32 to 16-bit integers with a scale factor of 10. I wonder why some files were converted correctly while others were not. [...]

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
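One check in the spirit of the advice above is to look at the range of the scaled values before writing them. The thread never identifies the actual cause, so treating the 16-bit range as the culprit is only an assumption; the three input values below are invented for the demonstration.

```r
# Hypothetical example data: three values written as 4-byte floats
src <- tempfile(fileext = ".img")
writeBin(c(12.34, 250.5, 4000), src, size = 4)

x <- readBin(src, what = "numeric", size = 4, n = 3)
scaled <- round(x * 10)   # as.integer() alone would truncate instead of round

# A signed 16-bit integer can only represent -32768..32767
out_of_range <- scaled < -32768 | scaled > 32767
print(range(scaled))
print(sum(out_of_range))  # number of values that cannot be stored in 2 bytes

# Write the values that fit, then read them back to confirm the round trip
dst <- tempfile(fileext = ".bin")
writeBin(as.integer(scaled[!out_of_range]), dst, size = 2)
back <- readBin(dst, what = "integer", size = 2, n = sum(!out_of_range), signed = TRUE)
```

If sum(out_of_range) is non-zero for the files that come out weird, a smaller scale factor or a 4-byte integer output would be worth considering.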
Re: [R] R quantreg anova: How to change summary se-type
Hi Roger,

thank you very much for your fast response. First of all, the mistakes I mentioned are all of the same type:

    In summary.rq(x, se = "nid", covariance = TRUE) : 22 non-positive fis

That is the reason why I want to change the se procedure: in my experience the problem disappears when using se = "ker". Unfortunately I am not able to follow your answer, as I do not have much experience with R. I changed summary.rq so that se = "ker" is used as the default, but anova.rq does not seem to react to this change. How can I produce my private version? Do I have to build my own quantreg package?

Thank you very much for your help.

cheers
Stefan

On 28.05.2012 19:05, Roger Koenker wrote:
> Stefan,
>
> You could try this: make a private version of anova.rqlist and change the call to lapply that computes summaries so that se = "ker" instead of se = "nid". Please let me know if this does what you would like to do. This is about 20 lines into the code. Could you also explain what you mean by "leads to mistakes" below?
>
> Thanks,
> Roger
>
> url: www.econ.uiuc.edu/~roger    Roger Koenker
> email rkoen...@uiuc.edu          Department of Economics
> vox: 217-333-4558                University of Illinois
> fax: 217-244-6678                Urbana, IL 61801
>
> On May 28, 2012, at 7:54 AM, stefan23 wrote:
>> Hi folks =)
>>
>> I want to check whether a coefficient has an impact in a quantile regression, by applying the sup-Wald test over a given quantile range [0.05, 0.95]. Therefore I am doing the following calculations:
>>
>>     taus <- 5:95/100
>>     a <- numeric(length(taus))
>>     for (k in seq_along(taus)) {
>>       fitrestricted <- rq(Y ~ X1 + X2, tau = taus[k])
>>       fitunrestricted <- rq(Y ~ X1 + X2 + X3, tau = taus[k])
>>       a[k] <- anova(fitrestricted, fitunrestricted)$table$Tn  # gives the test value
>>     }
>>     supW <- max(a)
>>
>> As anova uses the summary.rq function, I want to change the standard error method used (the default se = "nid" leads to mistakes, I prefer se = "ker"). Do you know how to handle this in the anova syntax?
>>
>> Thank you very much
>> Stefan
[R] XY correlation
Hi there,

We have 25m XY pairs to be correlated. The data are bank financial data and are smooth over time; observations for x and y are 32 quarters each. Testing 25m relationships exhaustively will take forever, and this task is easily over-engineered. We'll use the best of the XY relationships to predict. *

We're swamped by choice in R packages. None seem to be readily comparable. Can someone help us with which package(s) is/are most apt? That is, which can test for correlation time-efficiently, given the properties of the data? Good advice will save us a lot of time. (And yes, to conserve time, we'll employ foreach and doSNOW.)

Steve

* X and Y are series both 50 quarters in length. Y is offset 8 quarters forward, such that X (periods 9 to 50) is compared to Y (1 to 42). Best fit (x actual vs x fitted using y actual) is determined. The method identified is used with Y (43 to 50) to predict X for the final 8 quarters.
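For plain Pearson correlation at the stated offset, a package may not be needed at all: many pairs can be processed at once with matrix arithmetic. The sketch below assumes each pair is stored as matching columns of two matrices X and Y, uses random numbers in place of the bank data, and uses 1000 pairs instead of 25m; none of these choices come from the post.

```r
set.seed(1)
n_q <- 50; lag <- 8; n_pairs <- 1000             # illustrative sizes only
X <- matrix(rnorm(n_q * n_pairs), nrow = n_q)    # one column per X series
Y <- matrix(rnorm(n_q * n_pairs), nrow = n_q)    # matching Y series

# Align X (periods 9..50) with Y (periods 1..42), as in the footnote
Xa <- X[(lag + 1):n_q, , drop = FALSE]
Ya <- Y[1:(n_q - lag), , drop = FALSE]

# Column-wise Pearson correlation without a loop:
# standardise each column, multiply element-wise, sum and divide by n - 1
r <- colSums(scale(Xa) * scale(Ya)) / (nrow(Xa) - 1)

# Indices of the ten strongest relationships
best <- order(abs(r), decreasing = TRUE)[1:10]
```

With 25m pairs the matrices would have to be processed in chunks of columns to fit in memory, and each chunk could be handed to a foreach worker.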
Re: [R] importing multiple file form folder
I believe my already posted solution works. I've just tried it with your examples.

    url <- c("http://r.789695.n4.nabble.com/file/n4631640/channel_1_20120509_153744_1.ddf",
             "http://r.789695.n4.nabble.com/file/n4631640/channel_1_20120509_154744_1.ddf")
    flist <- url

And the rest is exactly the same. But, ok, I'll post it again.

    Length <- read.table(flist[1], skip=26)[, 1]
    Temp <- do.call(cbind, lapply(flist, function(x) read.table(x, skip=26)[, 2]))
    colnames(Temp) <- paste("Temperature", seq_len(ncol(Temp)), sep=".")
    result <- cbind(Length=Length, Temp)
    head(result)
           Length Temperature.1 Temperature.2
    [1,] -747.200       325.138       800.000
    [2,] -746.185        18.874      -200.000
    [3,] -745.171       488.420       800.000
    [4,] -744.156        69.484       -78.434
    [5,] -743.142       -70.252       129.180
    [6,] -742.127      -200.000      -200.000

On 28-05-2012 22:25, mpavlic wrote:
> I managed to sort something out with a for loop, but it's still not working ok. It loops through all files in the folder and imports each file from line 763 on.

Line 763 on? I'm off by 737...

> Then it just takes the second column (Temperature) and binds the columns (cbind). BUT it just binds the values of the last file instead of EACH file. Any ideas? Attached are two files for easier understanding.

If you ask for help, read the answers, please; the second file changed nothing. (Nor does it hurt to check what was posted before.)

> thanks, m

Rui Barradas
[R] setting parameters equal in lm
Forgive me if this is a trivial question, but I couldn't find an answer in former forums. I'm trying to reproduce some SAS results where they set two parameters equal. For example:

    y = b1X1 + b2X2 + b1X3

Notice that the variables X1 and X3 both have the same slope and the intercept has been removed. How do I get an estimate of this regression model? I know how to remove the intercept (-1 somewhere after the tilde). But how about setting parameters equal? I have used the car package to set up linear hypotheses:

    X1 = rnorm(20, 10, 5); X2 = rnorm(20, 10, 5); X3 = rnorm(20, 10, 5)
    Y = .5*X1 + 3*X2 + .5*X3 + rnorm(20, 0, 15)
    data.set = data.frame(cbind(X1, X2, X3, Y))
    linMod = lm(Y ~ X1 + X2 + X3, data=data.set)
    require(car)
    linearHypothesis(linMod, c("(Intercept) = 0", "X1 - X3 = 0"))

(Forgive the unconventional use of the equal sign; old habit.) Unfortunately, the linearHypothesis is always compared to a full model (where the parameters are freely estimated). I want to have an ANOVA summary table for the reduced model. Any ideas? Thanks in advance for the help!

--
Dustin Fife
PhD Student, Quantitative Psychology
University of Oklahoma
Re: [R] setting parameters equal in lm
I don't know how it ties into the tools car gives you, but one (quick and dirty) way to do this is to simply regress on

    Y ~ aX2 + b(X1+X3)

or in R code something like:

    lm(Y ~ X2 + I(X1+X3), data = data.set)

which gives a linear model you can play around with. Note the I() function [that's the capital letter immediately preceding J], which tells R to interpret that term "AsIs", so that + means arithmetic addition rather than adding a term to the formula.

Hope this helps,
Michael

On Mon, May 28, 2012 at 11:14 PM, Dustin Fife fife.dus...@gmail.com wrote:
> I'm trying to reproduce some SAS results where they set two parameters equal. For example: y = b1X1 + b2X2 + b1X3 [...] I want to have an ANOVA summary table for the reduced model. Any ideas?
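Putting the two pieces of this thread together, a sketch of the reduced model with the intercept removed, and of its ANOVA tables, could look as follows. The simulated data mirror the question; the seed and the object names reduced and full are introduced here for illustration.

```r
set.seed(42)
X1 <- rnorm(20, 10, 5); X2 <- rnorm(20, 10, 5); X3 <- rnorm(20, 10, 5)
Y <- 0.5 * X1 + 3 * X2 + 0.5 * X3 + rnorm(20, 0, 15)
data.set <- data.frame(X1, X2, X3, Y)

# Reduced model: no intercept, X1 and X3 forced to share one slope
reduced <- lm(Y ~ 0 + X2 + I(X1 + X3), data = data.set)
# Full model: intercept and three free slopes
full <- lm(Y ~ X1 + X2 + X3, data = data.set)

coef(reduced)          # two coefficients: b2 and the shared b1
anova(reduced)         # ANOVA table for the reduced model on its own
anova(reduced, full)   # F test of the two restrictions together
```

The comparison in the last line tests the same two restrictions as the linearHypothesis() call in the question (zero intercept and equal slopes), while anova(reduced) gives the summary table for the restricted fit itself.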
Re: [R] plz help. how to filter/group/sort data on mass data
I now know how to sort, filter and group. Can anyone answer another question? Is there any function in R like *lead* and *lag* in SQL? They are relative-position functions. We can use them to solve problems such as year-on-year comparisons and link relative ratios. Can anyone give me a tip?
Re: [R] plz help. how to filter/group/sort data on mass data
Type ??lag at the R command line.

---
Jeff Newmiller
DCN: jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.

bestbird bestbird7...@gmail.com wrote:
> Is there any function in R like *lead* and *lag* in SQL? They are relative-position functions. [...]
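For plain vectors, SQL-style lead and lag can be sketched in base R with head() and tail(). Note that stats::lag() works on the time attributes of ts objects and does not shift an ordinary vector, so it is not used here. The yearly values are made up for illustration.

```r
sales <- c(100, 110, 121, 133.1)   # hypothetical yearly values

# lag: value from the previous row; lead: value from the next row
lag1  <- c(NA, head(sales, -1))
lead1 <- c(tail(sales, -1), NA)

# Link relative ratio (current / previous) and year-on-year growth
link_ratio <- sales / lag1
growth <- diff(sales) / head(sales, -1)
```

Within a grouped data frame, the same shift can be applied per group, for example through ave(), so that the first row of each group gets NA instead of a value from the previous group.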
Re: [R] How to measure level of similarity of two data frames
Kel,

in addition, and depending on how you define similarity, you might want to look into the RV coefficient as a measure of it. It is related to a correlation, so similarity here means carrying similar information rather than necessarily having a small Euclidean distance. coeffRV in FactoMineR would be one option for computing it.

HTH, Michael

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of Lamke
Sent: Saturday, 26 May 2012 20:05
To: r-help@r-project.org
Subject: [R] How to measure level of similarity of two data frames

Hi group, I've been thinking of calculating the Euclidean distance between corresponding columns of two data frames, each consisting of standardized numerical columns. However, I don't know if there's a way of summarizing the overall distance with some kind of metric. If anyone knows a proper way of doing so and/or a package, I would greatly appreciate your suggestions. Thanks very much! Kel

-- View this message in context: http://r.789695.n4.nabble.com/How-to-measure-level-of-similarity-of-two-data-frames-tp4631466.html Sent from the R help mailing list archive at Nabble.com.
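To make the two notions of similarity in this thread concrete, here is a rough base-R sketch. It computes the textbook RV coefficient, tr(XX'YY') / sqrt(tr(XX'XX') tr(YY'YY')) on column-centred matrices, and a single overall Euclidean (Frobenius) distance, which is the square root of the summed squared column-wise distances. The function names rv_coefficient and frobenius_distance and the simulated data are assumptions made for illustration. This is not the FactoMineR implementation, and coeffRV may report additional quantities.

```r
# Hypothetical helper: basic RV coefficient between two tables that share rows.
rv_coefficient <- function(X, Y) {
  X <- as.matrix(X)
  Y <- as.matrix(Y)
  stopifnot(nrow(X) == nrow(Y))
  X <- scale(X, center = TRUE, scale = FALSE)  # centre columns only
  Y <- scale(Y, center = TRUE, scale = FALSE)
  # tr(XX'YY') equals the squared Frobenius norm of X'Y
  num <- sum(crossprod(X, Y)^2)
  den <- sqrt(sum(crossprod(X)^2) * sum(crossprod(Y)^2))
  num / den
}

# Hypothetical helper: one overall Euclidean distance between two tables
# of identical shape (columns are compared pairwise, in order).
frobenius_distance <- function(X, Y) {
  X <- as.matrix(X)
  Y <- as.matrix(Y)
  stopifnot(identical(dim(X), dim(Y)))
  sqrt(sum((X - Y)^2))
}

# Simulated standardized data, for illustration only.
set.seed(1)
a <- as.data.frame(scale(matrix(rnorm(60), nrow = 20)))
b <- as.data.frame(scale(as.matrix(a) + matrix(rnorm(60, sd = 0.5), nrow = 20)))

rv_coefficient(a, b)      # close to 1 when both tables carry similar information
frobenius_distance(a, b)  # 0 only when the tables are identical
```

The RV coefficient is unchanged when one table is multiplied by a constant, whereas the Frobenius distance is not, which is the distinction drawn above between similar information and small distance.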
[R] community finding in a graph and heatplot
Hi everyone,

I am using the fastgreedy.community function to get the $merges matrix and the $modularity vector. This serves my purpose of testing the modularity of my graph. But I am also keen to plot the heat map and dendrogram based on the $merges dendrogram matrix. I know that heatplot does the graphics part, but I am not sure whether the dendrogram generated by heatplot will match the one given by fastgreedy.community in all cases, or whether the heat map will represent the same clustering. Tell me if my apprehension is incorrect. Otherwise please let me know of any alternatives.

Here is the code I am testing so far:

# http://igraph.sourceforge.net/doc/R/modularity.html
# http://igraph.sourceforge.net/doc/R/fastgreedy.community.html
# http://igraph.sourceforge.net/doc/R/graph.constructors.html
library(igraph)
library(made4)
g <- graph(c(1,2, 2,3, 3,1, 4,5)-1, , FALSE)
print(g)
ModuleInfo <- fastgreedy.community(g)
print(ModuleInfo)
heatplot(c(1,2, 2,3, 3,1, 4,5))

Thanks
Fayez
Grad student UIUC IL, USA
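One possible way to make sure the heat map uses the same clustering as the community detection is to convert the community result to a dendrogram and hand that dendrogram to the plotting function, instead of letting the plot recluster the data. The sketch below is only an illustration under several assumptions: it uses the function names of recent igraph releases (make_graph, cluster_fast_greedy, as_adjacency_matrix) rather than the 2012 names in the post, it uses base heatmap() in place of made4's heatplot(), and the small example graph is made up.

```r
library(igraph)

# Made-up example graph: two triangles joined by a single edge.
g <- make_graph(c(1,2, 2,3, 3,1, 3,4, 4,5, 5,6, 6,4), directed = FALSE)

# Fast greedy modularity optimisation (fastgreedy.community in old igraph).
fc <- cluster_fast_greedy(g)

# Dendrogram built from the merges found by the community detection.
dend <- as.dendrogram(fc)

# Dense adjacency matrix to display as the heat map.
adj <- as_adjacency_matrix(g, sparse = FALSE)

# Supplying the dendrogram as Rowv stops heatmap() from running its own
# hierarchical clustering; symm = TRUE reuses it for the columns.
heatmap(adj, Rowv = dend, symm = TRUE, scale = "none")
```

With this approach the row and column order of the heat map comes from the community dendrogram, so the two displays cannot disagree. Whether heatplot() accepts a precomputed dendrogram in the same way would need to be checked in the made4 documentation.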