Re: [R] install local packages
[EMAIL PROTECTED] wrote:
Hello all, I'm trying to install a local package under Windows. Two ways I've tried:
1. using the menu Packages > Install package(s) from local zip files. My .zip file is mclust.zip, but it shows these errors:
Error in gzfile(file, "r"): unable to open connection
In addition: Warning messages:
1. error -1 in extracting from zip file
2. cannot open compressed file 'mclust/DESCRIPTION'

Probably you got the wrong version of mclust (you have told us neither your version of R, nor the version of mclust, nor where you got mclust from). Currently, CRAN has mclust_2.1-11.zip in its binary section for R-2.2.x for Windows. Hence simply type install.packages("mclust") with a running internet connection, or download the recent version from CRAN.
Uwe Ligges

2. using the function install.packages(). The command I use is
install.packages(mclust.zip, D:\sfu\BC project\clustering project\stuff from Jeffrey\flowCytometryClustering, repos=NULL, destdir=C:\Program Files\R\rw2011\library)
but the error is: object 'mclust.zip' not found.
Could you please help me with how I can install local packages? Thanks a lot!

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] install local packages
1.: Did you try to extract the zip file manually? It seems that something is wrong with the zip file, not the procedure. Try downloading (or compiling) it again.
2.: You should not use \ in your paths; use either \\ or / instead.
Best, Ales Ziberna
Re: [R] New to R
Welcome. Hope you will enjoy it. Petr.
On 22 Mar 2006 at 15:24, [EMAIL PROTECTED] wrote: Subject: [R] New to R (...)
Petr Pikal [EMAIL PROTECTED]
Re: [R] sqlSave
Dear Tobias,
I finally succeeded in exporting my dataframes from R into Access with this script:
library(RODBC)
canal <- odbcConnectAccess("D:/Floristique.mdb")
sqlSave(channel=canal, dat=flore, tablename="Floristique", rownames=FALSE, safer=FALSE, fast=FALSE, varTypes=c(dates="Date"))
odbcClose(canal)
My problem in exporting my dataframe flore seems to be closely linked to the name of its column of dates, which was formerly "date" and which I changed to "dates"; since then my export has worked (?). The varTypes argument works as described on the sqlSave() help page: "an optional named character vector giving the DBMS's datatypes to be used for some (or all) of the columns if a table is to be created". However, I have only been able to export my dataframe with the dates column as "Date", not as POSIX, though dates are in POSIX when imported into R with sqlQuery (?). I still have questions about the functioning of these functions, but I must acknowledge that I haven't yet reviewed many things about the subject.
Best regards, jacques

Brandt, T. (Tobias) wrote:
Dear Jacques, I refer to your post on r-help on 20 March 2006. I see that you had some trouble with sqlSave. I also noticed that you said you tried to use varTypes but didn't understand it properly. I've had the same problem, in that I'm trying to use sqlSave to save my dataframes, which contain dates, to a database, but have run into problems. My solution until now has been to save the dates as characters and then convert them to dates with a query. However, I would prefer to import them as dates in the first place. Perhaps this can be done with varTypes, but I have been unable to figure out how to use it properly. (...) I look forward to your response. Kind regards, Tobias Brandt

-----Original Message----- From: [EMAIL PROTECTED] On Behalf Of Jacques VESLOT, Sent: 20 March 2006 12:30 PM, To: r-help@stat.math.ethz.ch, Subject: Re: [R] sqlSave
OK, I finally found what's wrong - the date column name.
Jacques VESLOT wrote:
Dear R-users, I tried to export a dataframe from R to Access this way:
library(RODBC)
channel <- odbcConnectAccess("d:/test.mdb")
sqlSave(channel=channel, flore, rownames=FALSE)
odbcClose(channel)
But I always got this error message:
Erreur dans sqlSave(channel = channel, flore, Florist) : [RODBC] ERROR: Could not SQLExecDirect 37000 -3553 [Microsoft][Pilote ODBC Microsoft Access] Erreur de syntaxe dans la définition de champ. [i.e. syntax error in the field definition]
Note that I succeeded in exporting this dataframe to Excel and then importing the Excel file from Access. I tried to find where the problem comes from by exporting the columns one by one, as follows:
library(RODBC)
canal <- odbcConnectAccess("d:/test.mdb")
for (i in names(flore)) { cat(i); sqlSave(channel=canal, dat=flore[i]) }
odbcClose(canal)
I could export every column but one, named "date", which consists of dates. I tried to export this column as POSIX, as Date and even as character, but without success; I still had the same error message. I also tried varTypes, though I am not sure how to use this argument correctly. I did:
sqlSave(channel=canal, dat=flore, varTypes=c(date="Date"))
sqlSave(channel=canal, dat=flore, varTypes=c(date="Date/Heure"))
But still the same error message. Maybe it's in Windows, but I don't understand...
Thanks for helping, jacques
R.version: platform i386-pc-mingw32; arch i386; os mingw32; system i386, mingw32; major 2; minor 2.1; year 2005; month 12; day 20; svn rev 36812; language R
Re: [R] NLME Covariates
HLM question? Is there a minimum number of observations required for a category? I have individuals in work teams, and I have incomplete data for all the teams; sometimes I only have data for one person in a team. I assume that HLM can't work there! But what would be the minimum? At the moment I have a sample of about 240 individuals in about 100 teams, with team sizes from 2 to 5. Any advice? Thanks, Robin
_ Dr Robin Martin, School of Psychology, University of Queensland, Brisbane, QLD 4072, Australia. Level 1, Room 132, McElwain Bldg. tel. +61 7 3365 6392 fax. +61 7 3365 4466 email [EMAIL PROTECTED] web-page http://www.psy.uq.edu.au/people/personal.html?id=275
Re: [R] a question related to clustering
Sounds like you should go to the Bioconductor website http://www.bioconductor.org/ and get plugged into their extensive and professionally written software instead of shaking and baking your own.
-- Bert Gunter, Genentech Non-Clinical Statistics, South San Francisco, CA
"The business of the statistician is to catalyze the scientific learning process." - George E. P. Box

-----Original Message----- From: [EMAIL PROTECTED] On Behalf Of Li, Aiguo (NIH/NCI) [C], Sent: Wednesday, March 22, 2006 2:16 PM, To: r-help@stat.math.ethz.ch, Subject: [R] a question related to clustering
Hello all, I have a fundamental question related to clustering. As you know, when we do statistical analysis, such as a t-test or ANOVA, we like to log-transform our data to make it normally distributed. My question is: should we cluster on the signal intensities of Affy chips, or on the log-transformed data? The dendrogram profiles are certainly different for the two types of data. Thanks, AG
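The effect AG describes can be reproduced with a small self-contained sketch (simulated, intensity-like data; not Affy data and not Bioconductor code):

```r
set.seed(42)
# Skewed, intensity-like values: 8 "genes" x 5 "arrays" (purely illustrative)
m <- matrix(rexp(40, rate = 1/1000), nrow = 8)
hc_raw <- hclust(dist(m))        # cluster on raw intensities
hc_log <- hclust(dist(log2(m)))  # cluster on log2-transformed intensities
# Plot side by side to compare the dendrograms:
# par(mfrow = c(1, 2)); plot(hc_raw); plot(hc_log)
```

With skewed data the largest raw values dominate the Euclidean distances, so the two dendrograms generally differ.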
Re: [R] install local packages
Where did you get mclust.zip? CRAN shows mclust_2.1-11.zip. Why not just let R install mclust directly from the CRAN mirror at SFU? Use the menu: Packages > Install package(s). As to your install.packages() call: you need forward slashes, contriburl=, and you probably want a different destdir (or just leave the default NULL).
Peter Ehlers
Re: [R] install local packages
Take a look at the R for Windows FAQ: http://cran.us.r-project.org/bin/windows/base/rw-FAQ.html
-- U.M.A. http://sophie.fata.unam.mx/
Re: [R] install local packages
1. My best guess, and it is only a guess, is that the package wasn't built properly. That's assuming that when you used the menu, you navigated to the zip file.
2. Please review the documentation for the install.packages() function. When repos is NULL, your first argument is supposed to be a character vector of one or more file paths to the zip files. The second argument is not normally needed. Also, in this case, you should not use destdir: it's a temporary holding place used during the installation process.
-Don
At 12:33 PM -0800 3/22/06, [EMAIL PROTECTED] wrote: (...)
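Putting Don's two points together, a corrected call might look like the sketch below (the path reuses the poster's own directories purely as an illustration; note the forward slashes and the quoted string, which avoid both the backslash problem and the "object 'mclust.zip' not found" error):

```r
# With repos = NULL, the first argument is a character vector of file paths
# to the local zip file(s); destdir is not needed.
zipfile <- "D:/sfu/BC project/clustering project/stuff from Jeffrey/flowCytometryClustering/mclust.zip"
# install.packages(zipfile, repos = NULL)   # uncomment to run the actual install
```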
[R] lme convergence
Dear All:
lme(sbp ~ cengirth, data = family, random = ~ 1 | familyid)
converges, but
lme(sbp ~ cengirth, data = family, random = ~ cengirth | familyid)
does not. I get the following message:
Error in lme.formula(sbp ~ cengirth, data = family, random = ~cengirth | : iteration limit reached without convergence (9)
The data have 488 rows and 154 familyid levels. For some familyid levels there is only one row, and for those cengirth is 0, as it is a variable centered within familyid. I would appreciate your pointing me to the source of the problem. Ashraf
_ M. Ashraf Chaudhary, Ph.D., Department of International Health, Johns Hopkins Bloomberg School of Public Health, 615 N. Wolfe St. Room W5506, Baltimore MD 21205-2179. (410) 502-0741 / fax: (410) 502-6733 [EMAIL PROTECTED]
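A common first step with "iteration limit reached" is to raise the iteration limits via lmeControl(); the sketch below shows the mechanics on the built-in Orthodont data, since the poster's `family` data are not available. (Whether higher limits rescue the posted model is another matter: singleton familyid levels whose cengirth is identically 0 give the random slope little to estimate, which may itself be the cause.)

```r
library(nlme)
# Raise the optimizer's iteration limits above the defaults:
ctrl <- lmeControl(maxIter = 100, msMaxIter = 100, niterEM = 50)
fm <- lme(distance ~ age, data = Orthodont,
          random = ~ age | Subject, control = ctrl)
```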
[R] about histograms
x <- rnorm(1000)
hist(x)
hist(x, nclass=100)
The two histograms look different; I checked the help file but am still confused... sorry if it is too simple. Linda
[R] rnorm
Just curious: why put the "r" in rnorm for the Normal Distribution? Linda
[R] RMySQL's column limit
Dear R-users,
First, thank you to the developers for the very useful R library RMySQL. While using this library I received an error message:
RS-DBI driver: (could not run statement: Too many columns)
The statement that generated the error was:
dbWriteTable(dbcon, "simdataseries", template, overwrite = TRUE, row.names = FALSE)
I am assuming this is an RMySQL rather than a MySQL limit. If that is the case, I was wondering just what this limit is and whether it is possible to raise it. Thanks again for all the hard work.
Sincerely, Mark
-- Mark Van De Vyver, PhD -- My research is available from my SSRN author page: http://ssrn.com/author=36577 -- Finance Discipline, School of Business, The University of Sydney, Sydney NSW 2006, Australia. Telephone: +61 2 9351-6452 Fax: +61 2 9351-6461
[R] Estimation of skewness from quantiles of near-normal distribution
I have summary statistics from many sets (10,000's) of near-normal continuous data. From previously generated QQ plots of these data I can see visually that most of them are normal, with a few which are not. I have the raw data for a few (700) of these sets. I have applied several tests of normality, skew, and kurtosis to these sets to see which test might yield a parameter that identifies the sets which are visibly non-normal on the QQ plot. My conclusion thus far has been that the skew is the best determinant of non-normality for these particular data.
Given that I do not have ready access to the sets (10,000's) of data, only to summary statistics which have been calculated on them, is there a method by which I may estimate the skew given the following summary statistics: the 0.1%, 1%, 5%, 10%, 25%, 75%, 90%, 95%, 99%, and 99.9% quantiles, plus mean, median, N, and sigma?
N is usually about 900, so I would discount the 0.1%, 1%, 99%, and 99.9% quantiles as unreliable due to noisiness in the distributions. I know that there are, for instance, general rules for calculating the sigma of a normal distribution given quantiles, and so am wondering if there are any general rules for calculating skew given a set of quantiles, mean, and sigma. I am currently thinking of trying polynomial fits on the QQ plot using the raw data I have and then empirically trying to derive a relationship between the quantiles and the skew.
Thank you for any ideas.
Leif Kirschenbaum, Senior Yield Engineer, Reflectivity, Inc. (408) 737-8100 x307 [EMAIL PROTECTED]
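One standard skewness measure that uses only quantiles, and so can be computed from exactly these summary statistics, is Bowley's quartile skewness, together with its analogue based on the 10%/90% points (often attributed to Kelly). A sketch:

```r
# Bowley's quartile skewness: 0 for a symmetric distribution, bounded in (-1, 1).
bowley_skew <- function(q25, q50, q75) (q75 + q25 - 2 * q50) / (q75 - q25)

# Same idea using the 10% and 90% quantiles, which the poster also has:
decile_skew <- function(q10, q50, q90) (q90 + q10 - 2 * q50) / (q90 - q10)

bowley_skew(qnorm(.25), qnorm(.5), qnorm(.75))  # 0 for the (symmetric) normal
bowley_skew(qexp(.25), qexp(.5), qexp(.75))     # positive for a right-skewed distribution
```

These are robust, bounded measures and are not on the same scale as the moment-based sample skewness, so a calibration against the 700 raw-data sets (as the poster proposes) would still be needed.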
[R] about read txt file
I have a txt file and use the following command to read it into R:
z <- read.table("c:/temp/test.txt")
but when I type z the output is totally a mess. The test.txt is structured with the first line as the variable names, and when I use Excel to open it, all the values are well positioned. Why is it totally a mess in R? Linda
[R] new to R
z <- read.table("c:/temp/q.txt")
Warning message: incomplete final line found by readTableHeader on 'c:/temp/q.txt'
What does that mean? My q.txt is like:
1, 2, 3,
33, 44, 88,
23, 43, 69,
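The warning only means that the last line of the file is not terminated by a newline; read.table still reads the data. A self-contained illustration (using a temporary file instead of the poster's q.txt):

```r
f <- tempfile(fileext = ".txt")
# writeLines() terminates every line, including the last, so no warning is produced:
writeLines(c("1, 2, 3,", "33, 44, 88,", "23, 43, 69,"), f)
z <- read.table(f, sep = ",")
dim(z)  # 3 rows; the trailing commas create a 4th, all-NA column
```

Note that without sep="," this data would be read as single character fields, which is a separate problem from the warning.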
[R] R 2.2.1: install packages
Hi, when using a laptop and trying to install packages for R for Windows 2.2.1, I get the following error:
utils:::menuInstallPkgs()
--- Please select a CRAN mirror for use in this session ---
Warning: unable to access index for repository http://cran.dk.r-project.org/bin/windows/contrib/2.2
Warning: unable to access index for repository http://www.stats.ox.ac.uk/pub/RWin/bin/windows/contrib/2.2
Error in install.packages(NULL, .libPaths()[1], dependencies = TRUE, type = type) : no packages were specified
This works fine with R for Windows 2.2.0, and I have this problem only with my laptop. A similar problem occurs when I try to update my MiKTeX installation on the laptop. Any suggestions? Hannu Kahra
Re: [R] rnorm
linda.s wrote: just curious, why put r in the rnorm for The Normal Distribution?
From ?rnorm, Value: 'dnorm' gives the density, 'pnorm' gives the distribution function, 'qnorm' gives the quantile function, and 'rnorm' generates random deviates.
-- Chuck Cleland, Ph.D. NDRI, Inc., 71 West 23rd Street, 8th floor, New York, NY 10010. tel: (212) 845-4495 (Tu, Th) tel: (732) 452-1424 (M, W, F) fax: (917) 438-0894
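The same d/p/q/r prefix scheme works for every distribution in R (norm, unif, binom, ...); a quick check of the four functions for the normal:

```r
dnorm(0)       # density at 0, i.e. 1/sqrt(2*pi)
pnorm(0)       # distribution function: P(X <= 0) = 0.5
qnorm(0.5)     # quantile function: the 50% quantile is 0
set.seed(1)
rnorm(3)       # three random deviates
```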
[R] read txt?
Which kind of txt file can be read into R? Can anyone give me an example? I used a txt file and the result is:
Warning message: incomplete final line found by readTableHeader on 'c:/temp/q.txt'
Linda
Re: [R] about read txt file
linda.s wrote: I have a txt file and use the following command to read it into R: z <- read.table("c:/temp/test.txt") and I type z but the output is totally a mess; the test.txt is structured with the first line as the variable names, and when I use Excel to open it, all the values are well positioned. Why in R is it totally a mess?

Because R needs some details about the data file, for example:
z <- read.table("c:/temp/test.txt", header=TRUE, sep=",", na.strings="NA", dec=",", strip.white=TRUE)
You can also take a look at ?read.table
E.
Re: [R] about read txt file
On 3/23/06, Cuvelier Etienne [EMAIL PROTECTED] wrote: (...)
Thank you. There are many things for me to learn from the tutorial; I will do it step by step. Thanks! Linda
Re: [R] read txt?
What kind of file are you trying to read in R? If it is comma-delimited, then try:
z <- read.table("c:/temp/q.txt", sep=",", header=TRUE)  ## header=TRUE if you have variable names in your file
[R] Sampling in R
Hi all! I was wondering if you can help me with a little sampling issue I'm having in R. I have a database with 100 observations. I need to sample a sample size of n=9, 200 times (i.e. get 200 samples of size 9). N (pop. size) = 100. Each sample can't contain the same observation more than one time (i.e. the program needs to check whether the obs. has already been sampled into the sample; if it has, it should put it back and sample another obs.), though between samples I obviously need replacement. Then I need to draw (I think the best is with a histogram) the distribution of the mean of each of the 200 samples. I guess I need a loop for the 'sample' function in R. I could really use your help. Regards, Noam Friedman.
Re: [R] about histograms
Hi,
On 23 Mar 2006 at 0:02, linda.s wrote: (...)
From the help page: 'nclass' numeric (integer). For S(-PLUS) compatibility only; nclass is equivalent to breaks for a scalar or character argument. 'breaks' is one of: a vector giving the breakpoints between histogram cells, a single number giving the number of cells for the histogram, ... E.g. without breaks (or nclass) defined, hist uses hist(x, breaks = "Sturges", ...). I do not know the details, but the number of breaks it picks is definitely lower than 100.
HTH, Petr
Petr Pikal [EMAIL PROTECTED]
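The difference Petr describes can be inspected without drawing anything, since hist() returns its breakpoints when called with plot = FALSE:

```r
set.seed(123)
x <- rnorm(1000)
h_default <- hist(x, plot = FALSE)                # breaks chosen by the "Sturges" rule
h_100     <- hist(x, nclass = 100, plot = FALSE)  # roughly 100 cells requested
length(h_default$breaks)   # far fewer breakpoints than with nclass = 100
length(h_100$breaks)
```

With many more (hence narrower) cells, each cell catches fewer points, which is why the two histograms look so different.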
Re: [R] Sampling in R
Does the following do what you want?
x <- 1:100
s <- matrix(0, nrow=200, ncol=9)
for (i in 1:200) { s[i, ] <- sample(x, 9) }
m <- rowMeans(s)
hist(m)
The default behavior of sample() is sampling without replacement.
Thomas
Re: [R] Sampling in R
On 3/23/06 5:52 AM, Noam Friedman [EMAIL PROTECTED] wrote: (...) I guess I need to do a loop for the 'sample' function in R.

That sounds correct.

(...) I could really use your help.

x <- rnorm(100)
meanvec <- vector()
for (j in 1:200) {
  y <- sample(x, 9)
  meanvec[j] <- mean(y)
}
hist(meanvec)
If this example code doesn't get you started, then you will probably need to be more specific about where you are getting stuck.
Sean
Re: [R] Sampling in R
I prefer the function resample as it is given in the examples for ?sample:
resample <- function(x, size, ...) {
  if (length(x) <= 1) {
    if (!missing(size) && size == 0) x[FALSE] else x
  } else sample(x, size, ...)
}
Anyway, if an observation cannot appear twice in a sample, you have to use replace=FALSE, that is, no replacement. The solution of Thomas is very good, although you have to check whether your data are going to be in the matrix as row vectors or column vectors. If the former, use byrow=TRUE in the matrix() call; if the latter, Thomas's solution is perfect. Instead of using a for loop, I prefer:
t(replicate(200, resample(x, 9)))
if you want a matrix where the samples are rows, or
replicate(200, resample(x, 9))
if you want a matrix where the samples are columns. So the program can be:
hist(rowMeans(t(replicate(200, resample(x, 9)))))
I hope this is useful for you!
Regards, Pedro Canadilla, Oracle DBA
[R] RGui: windows-record and command history
a) How can I set the recording of all windows() history to true, forever? I want something like windows(record = TRUE), but not just for the window that opens then: for all windows I will ever open.
b) Scrolling up in RGui (Windows 2000) to see past commands is nice, but: is it possible to type e.g. "wi" and then the up arrow and see the last command that started with "wi" (like windows())? I know this feature from Matlab (oops, one of the forbidden words here? ;) ) and it's nice to have.
Thomas
Re: [R] gray level values
# clear everything:
rm(list=ls(all=TRUE)); detach(); graphics.off()

# make a test matrix:
( m <- matrix((1:12)/12, 3, 4) )

library(gplots)

# display the test matrix:
image(m, col=colorpanel(12, "darkblue", "yellow", "white"), axes=FALSE)

# here's the action part (no index checking, lame, illustrative code only, with advance apologies, &c.):
image.ij <- function(i, j) return( t(m)[dim(t(m))[1]:1, ][i, j] )

# we're seeing a 1 in the top right-hand corner of the image at coordinates:
i <- 1
j <- 3
image.ij(i, j)

# cf. what's in the matrix at those coordinates:
m[i, j]

Hello. I have a matrix with n1 rows and n2 columns called 'example'. If I do image(example), R shows me an image with 480x480 pixels. How can I obtain the gray level of the pixel in row i and column j? Thanks, Arnau.
[R] Best Solution: difference between 2 dates: IN MONTHS the way Mothers compute it
Folks: If you need to determine the difference between 2 dates in months, the best solution is described by David Reiner, below. Phil Smith, CDC

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Sent: Fri 3/10/2006 3:52 PM To: Smith, Phil; r-help@stat.math.ethz.ch Subject: RE: [R] difference between 2 dates: IN MONTHS the way Mothers compute it

I just solved this for dealing with deliverables for the CBOT's treasury futures:

length(seq(from=x, to=y, by="months")) - 1

David L. Reiner

-Original Message- From: [EMAIL PROTECTED] [mailto:r-help-[EMAIL PROTECTED] On Behalf Of Smith, Phil Sent: Friday, March 10, 2006 12:20 PM To: r-help@stat.math.ethz.ch Subject: [R] difference between 2 dates: IN MONTHS the way Mothers compute it

Hi R-people: I need a function to compute the number of months between 2 dates, in the same way a mother would do it. For example, if a kid is born on February 6, the number of months between that date and March 7 is exactly 1 month, although it is only 29 days. Thank you! Phil Smith, CDC
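A quick check of David's one-liner, assuming the two endpoints are Date objects (the wrapper name months.between is mine, not from the thread):

```r
## Months between two dates, "mother style": count how many monthly
## anniversaries of the start date have passed by the end date.
months.between <- function(from, to) {
  length(seq(from = from, to = to, by = "months")) - 1
}

## Feb 6 to Mar 7 is exactly 1 month, even though it is only 29 days:
months.between(as.Date("2006-02-06"), as.Date("2006-03-07"))  # 1
```

Note that seq(..., by = "months") steps through the monthly anniversaries of the start date, which is exactly the counting rule Phil asked for.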
Re: [R] Sampling in R
From: Sean Davis On 3/23/06 5:52 AM, Noam Friedman [EMAIL PROTECTED] wrote: Hi all! I was wondering if you can help me with a little sampling issue I'm having in R. I have a database with 100 observations. I need to sample n=9, 200 times (i.e. get 200 samples of size 9). N (pop. size) = 100. Each sample can't contain the same observation more than once (i.e. the program needs to check whether the obs. has already been sampled into the sample - if it has, it should put it back and sample another obs.). Then, I need to draw (I think the best is a histogram) the distribution of the mean of each of the 200 samples. I guess I need a loop around the 'sample' function in R.

That sounds correct.

I could really use your help.

x <- rnorm(100)
meanvec <- vector()
for (j in 1:200) {
  y <- sample(x, 9)
  meanvec[j] <- mean(y)
}
hist(meanvec)

If the samples are not needed beyond the calculations Noam mentioned, then this might save some memory:

samp.means <- replicate(200, mean(sample(x, 9)))
hist(samp.means)

Andy

If this example code doesn't get you started, then you will probably need to be more specific about where you are getting stuck. Sean
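For completeness, a self-contained sketch of the simulation discussed in this thread (the set.seed() call is my addition, for reproducibility; sample() draws without replacement by default, which is what Noam's constraint actually requires):

```r
set.seed(1)                 # reproducibility (not in the original posts)
x <- rnorm(100)             # stand-in for the 100 observations

# each call to sample(x, 9) is one sample of size 9 without replacement;
# replicate() collects the 200 sample means directly
samp.means <- replicate(200, mean(sample(x, 9)))

hist(samp.means, main = "Means of 200 samples of size 9")
```

Every sample mean necessarily lies between min(x) and max(x), which is a cheap sanity check on the simulation.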
[R] X11, fonts, R-2.0.1, R-2.2.1 and R-devel
Hello, I am having some problems with the X11 display on a Gentoo Linux laptop with R compiled manually. (https://stat.ethz.ch/pipermail/r-help/2006-March/089701.html) While I can open the X11 device and use it when I am using 'ion' as a window manager, I can't open it when using 'gnome', due to a problem related to fonts:

-8---
Error in X11() : could not find any X11 fonts
Check that the Font Path is correct.
-8---

I have tried and compiled R-2.0.1 and _it works_. With the latest stable version of R it does not work, and with the latest development version (22 March 06) it does not work either. It is not due to the Xorg installation, because the display is opened when using a window manager other than gnome (and another version of R). It is not something related to the compilation options, for the same reason. So my last guess is that it is a problem with R, X11 and gnome specifically. Any ideas or suggestions? I have googled, and somebody on the FreeBSD lists talked about more or less the same problems, but it seems without success: http://www.archivum.info/[EMAIL PROTECTED]/2005-06/msg00313.html http://www.archivum.info/[EMAIL PROTECTED]/2005-06/msg00307.html

I paste the sessions and the errors, if it helps:

-8--- R-2.0.1
$ ./bin/R
R : Copyright 2004, The R Foundation for Statistical Computing
Version 2.0.1 (2004-11-15), ISBN 3-900051-07-0
R is free software and comes with ABSOLUTELY NO WARRANTY. You are welcome to redistribute it under certain conditions. Type 'license()' or 'licence()' for distribution details. R is a collaborative project with many contributors. Type 'contributors()' for more information and 'citation()' on how to cite R or R packages in publications. Type 'demo()' for some demos, 'help()' for on-line help, or 'help.start()' for a HTML browser interface to help. Type 'q()' to quit R.
WARNING: UTF-8 locales are not currently supported
> X11()
> q()
Save workspace image? [y/n/c]: n
-8---

Works. The display is opened.
-8--- R-2.2.1
$ ./bin/R
R : Copyright 2005, The R Foundation for Statistical Computing
Version 2.2.1 (2005-12-20 r36812), ISBN 3-900051-07-0
[startup banner as above]
> X11()
Error in X11() : could not find any X11 fonts
Check that the Font Path is correct.
> q()
-8---

Doesn't work.

-8--- R-devel
$ ./bin/R
R : Copyright 2006, The R Foundation for Statistical Computing
Version 2.3.0 Under development (unstable) (2006-03-22 r37566), ISBN 3-900051-07-0
[startup banner as above]
> X11()
Error in X11(display, width, height, pointsize, if (is.null(gamma)) 1 else gamma,  : invalid 'width' or 'height'
> x <- rnorm(50)
> y <- rnorm(50)
> plot(x, y)
Error in X11(display, width, height, pointsize, if (is.null(gamma)) 1 else gamma,  : invalid 'width' or 'height'
> X11(width=200, height=200)
Error in X11(width = 200, height = 200) : could not find any X11 fonts
Check that the Font Path is correct.
-8---

-- Xavier Fernández i Marín
[R] Cross correlation in time series
Hi list, I'm working on time series of (bio)physical data explaining (or not) the net ecosystem exchange of a system (roughly the CO2-in versus CO2-out balance). I decomposed the time series of the various explanatory variables according to scale (wavelet decomposition). With the coefficients I got from the wavelet decomposition I applied a (multiple) regression, giving some expected results: the net ecosystem exchange (CO2 balance) is mostly determined by the light regime (driving photosynthesis) and the water availability (inducing stress if absent), and this over all the scales. So, sadly, pinpointing a certain scale for a certain process wasn't possible. I had hoped to see, for example, a relation between air temperature and a response of the vegetation/ecosystem at a certain scale. Global trends are present but short-term responses are not. Having used regressions in this previous analysis, I now think that - considering that some processes show some lag in responding to a changing variable - a one-on-one regression of time series might have been the wrong approach, because it assumes an almost direct one-on-one relation between action and reaction. So what I'm looking for is a method to determine that after, let's say, 5 warm days, the net ecosystem exchange peaks as well - that is, a way to determine a certain lag between time series. I found a post in the archives discussing convolve(), but this doesn't seem to be the right thing. Also ccf() is mentioned; does this calculate a correlation coefficient as two time series are shifted past each other for certain lag distances, or am I mistaken? If so, what is the implication of using a smaller and smaller sampling window (increasing your time resolution, say going from year level to month to day level)? Any input on how to compare two time series would be appreciated. Sorry for the rather theoretical/technical post.
Best regards, Koen Hufkens
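As a concrete illustration of Koen's ccf() question above - yes, it computes the correlation between the two series over a range of shifts - here is a sketch with an artificial 5-step lag (series and names are invented for the example; per ?ccf, ccf(x, y) estimates the correlation between x[t+k] and y[t], so a response that lags its driver shows up at a negative k):

```r
set.seed(42)
x <- as.numeric(arima.sim(model = list(ar = 0.8), n = 500))  # "driver" series
y <- c(rep(0, 5), x[1:495]) + rnorm(500, sd = 0.1)           # response, lagged by 5 steps

cc <- ccf(x, y, lag.max = 20, plot = FALSE)
best.lag <- cc$lag[which.max(cc$acf)]  # lag with the strongest cross-correlation
best.lag                               # should be near -5 for this construction
```

The sign convention is the usual stumbling block: because ccf(x, y) correlates x[t+k] with y[t], a driver that leads the response by 5 steps peaks at k = -5, not +5.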
[R] front-end problem while using nnet and tune.nnet
Dear R people, I am using tune.nnet from the e1071 package to tune the parameters for nnet. I am using the following syntax:

tuneclass <- c(rep(1,46), rep(2,15))
tunennet <- tune.nnet(x=traindata, y=tuneclass, size=c(50,75,100), decay=c(0,0.005,0.010), MaxNWts = 2)

Here traindata is the training data that I want to tune for nnet, a matrix with 61 rows (samples) and 200 columns (genes). tuneclass is the class indicator vector, which says that the first 46 samples of traindata are of class 1 and the next 15 samples are of class 2. The problem I encountered was that when I used a larger size like 50 or more, it gave me the following error: "R for Windows terminal front-end has encountered a problem and needs to close. We are sorry for the inconvenience." I didn't get this error for size 25, but even then the program took a huge amount of time. I really don't understand what the problem is or how to solve it. Can anyone please help me asap? Thanks in advance, Regards, Madhurima.
Re: [R] RGui: windows-record and command history
On 3/23/2006 7:35 AM, Thomas Steiner wrote: a) How can I set the recording of all windows()-history forever to true? I want something like windows(record = TRUE) but not just for the window that opens then, but for all windows I will open ever. options(graphics.record=TRUE) will make that happen for the rest of the session. To really make it happen forever, you need to put this line in your Rprofile (see ?Rprofile for where that comes from). Watch out though: the graphics history is stored in your current workspace in memory, and it can get big. You might find you're running out of memory if you store everything, and you'll find your .RData files quite large if you save your workspace. On my todo list (but not for 2.3.0) is the possibility of setting a default history length, perhaps defaulting to saving the last 2 or 3 pages. b) Scrolling up in RGui (windows 2000) to see past commands is nice, but: Is it possible to type eg wi and the arrow up and see the last command that started with wi (like windows()). I know this feature from Matlab (Uops, one of the forbidden words here? ;) ) and it's nice to have it. We have things like that on platforms that use the readline library for input, but Rgui doesn't. It would be nice, but it's a fair bit of work to implement properly and it's not on my todo list. Duncan Murdoch
[R] Create graphs from text console
Hi I am using R 2.2.1 under Linux (SuSE 10) and would like to know if it is possible to create graphs, i.e. jpeg(filename=fn) try( coplot( mor$I_Morisita ~ mor$Year | mor$RunStr, show.given=FALSE) ) dev.off() from a text console? It gives me an error message on the jpeg() command: Error in X11(..snip..) unable to start device jpeg In addition: warning message: unable to open connection to X11 display. Are there any ways to create the plot and save it into a jpeg file from a text console? Thanks, Rainer -- -- Rainer M. Krug, Dipl. Phys. (Germany), MSc Conservation Biology (UCT) Department of Conservation Ecology University of Stellenbosch Matieland 7602 South Africa Tel:+27 - (0)72 808 2975 (w) Fax:+27 - (0)21 808 3304 Cell:+27 - (0)83 9479 042 email:[EMAIL PROTECTED] [EMAIL PROTECTED]
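One workaround from that era, since the jpeg() device on Unix needed a running X11 server: the bitmap() device renders through Ghostscript and needs no display at all. This is a sketch only - it assumes Ghostscript is installed and on the path, and 'mor' is Rainer's own data frame, so the coplot() line is untested:

```r
# JPEG via Ghostscript instead of X11; no display connection required
bitmap(file = "coplot.jpeg", type = "jpeg", width = 6, height = 6, res = 150)
try( coplot(I_Morisita ~ Year | RunStr, data = mor, show.given = FALSE) )
dev.off()
```

The same idea works for PNG output via type = "png16m". On modern R this problem has largely gone away, since jpeg()/png() can render without X11.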
Re: [R] RGui: windows-record and command history
Duncan == Duncan Murdoch [EMAIL PROTECTED] on Thu, 23 Mar 2006 08:48:52 -0500 writes: Duncan On 3/23/2006 7:35 AM, Thomas Steiner wrote: b) Scrolling up in RGui (windows 2000) to see past commands is nice, but: Is it possible to type eg wi and the arrow up and see the last command that started with wi (like windows()). I know this feature from Matlab (Uops, one of the forbidden words here? ;) ) and it's nice to have it. Duncan We have things like that on platforms that use the Duncan readline library for input, but Rgui doesn't. Also note that ESS (Emacs Speaks Statistics) has exactly this feature on all platforms! Martin Maechler, ETH Zurich Duncan It would be nice, but it's a fair bit of work to Duncan implement properly and it's not on my todo list. Duncan Duncan Murdoch
Re: [R] RGui: windows-record and command history
Duncan Murdoch wrote: On 3/23/2006 7:35 AM, Thomas Steiner wrote: a) How can I set the recording of all windows()-history forever to true? I want something like windows(record = TRUE) but not just for the window that opens then, but for all windows I will open ever. options(graphics.record=TRUE) will make that happen for the rest of the session. To really make it happen forever, you need to put this line in your Rprofile (see ?Rprofile for where that comes from). Watch out though: the graphics history is stored in your current workspace in memory, and it can get big. You might find you're running out of memory if you store everything, and you'll find your .RData files quite large if you save your workspace.

And also, remember that a single graph history is shared by the various graph windows. This may lead to unexpected results if you work with several devices at once.

On my todo list (but not for 2.3.0) is the possibility of setting a default history length, perhaps defaulting to saving the last 2 or 3 pages.

That would be really great! Philippe Grosjean

b) Scrolling up in RGui (windows 2000) to see past commands is nice, but: Is it possible to type eg wi and the arrow up and see the last command that started with wi (like windows()). I know this feature from Matlab (Uops, one of the forbidden words here? ;) ) and it's nice to have it. We have things like that on platforms that use the readline library for input, but Rgui doesn't. It would be nice, but it's a fair bit of work to implement properly and it's not on my todo list. Duncan Murdoch
[R] Default lag.max in ACF
Hi, The default value for lag.max in the acf implementation is 10*log10(N). There are several publications recommending setting lag.max to:
- N/4 (Box and Jenkins, 1970; Chatfield, 1975; Anderson, 1976; Pankratz, 1983; Davis, 1986; etc.)
- sqrt(N) + 10 (Cryer, 1986)
- 20 <= N <= 40 (Brockwell and Davis)
Why does R use 10*log10(N) as the default? Please give me a reference to a book or article where the recommendation to use lag.max = 10*log10(N) is proposed and explained. Thanks, Rafal
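For a sense of how much these recommendations differ in practice, the candidate defaults can be computed side by side, and the default can always be overridden explicitly (N = 1000 is an arbitrary series length for illustration):

```r
N <- 1000
c(r.default   = 10 * log10(N),  # what acf() uses: 30 lags
  box.jenkins = N / 4,          # 250 lags
  cryer       = sqrt(N) + 10)   # about 41.6 lags

# overriding the default explicitly:
x <- rnorm(N)
a <- acf(x, lag.max = N / 4, plot = FALSE)
dim(a$acf)[1]  # 251: lags 0 through 250
```

Whatever the reference for 10*log10(N), nothing stops you from passing your preferred lag.max directly.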
[R] clogit question
Hi, I am playing with clogit(case~spontaneous+induced+strata(stratum),data=infert) from the clogit help file. This line works. 1. But why doesn't strata(stratum) have a coefficient like spontaneous and induced? 2. When I remove strata(stratum) from the command, this function seems to keep running forever. Why? 3. I think the equation for clogit looks like P = 1/(1 + exp(-(a + b*x + c*y + ...))). In this example, I think spontaneous is x and induced is y, so b is the coefficient for spontaneous and c is the coefficient for induced. Where can I find a? Thank you. Daniel Chan Meteorologist Georgia Forestry Commission P O Box 819 Macon, GA 31202 Tel: 478-751-3508 Fax: 478-751-3465
[R] lme plot
Hi all, I have a question regarding mixed-effects models: I'm trying to plot e.g. fitted vs. residuals for each level of the random effects, and I keep getting the same error. I guess this might be a problem with the graphics capabilities of R. Is there any way to obtain those plots?

library(nlme)
attach(ergoStool)
names(ergoStool)
[1] "effort"  "Type"    "Subject"
m1 <- lme(effort ~ Type, random = ~ 1 | Subject)
# resid and fitted for each level of Subject:
plot(m1, form = resid(., type = "p") ~ fitted(.) | Subject, abline = 0)
Error in as.vector(x, "list") : cannot coerce to vector

Thanks, Leonardo. Leonardo D. Bacigalupe, Department of Animal and Plant Sciences, University of Sheffield, Sheffield S10 2TN, United Kingdom [EMAIL PROTECTED]
Re: [R] RMySQL's column limit
Hi, Mark Van De Vyver wrote: Dear R-users, First, thank you to the developers for the very useful R library RMySQL. While using this library I received an error message: RS-DBI driver: (could not run statement: Too many columns) The statement that generated the error was: dbWriteTable(dbcon, "simdataseries", template, overwrite = TRUE, row.names = FALSE). I am assuming this is an RMySQL rather than a MySQL limit.

We need more information, e.g., what's dim(template), what version of MySQL you're using, etc.

If that is the case, I was wondering just what this limit is and whether it is possible to raise it? Thanks again for all the hard work. Sincerely, Mark

-- Mark Van De Vyver, PhD -- My research is available from my SSRN Author page: http://ssrn.com/author=36577 -- Finance Discipline, School of Business, The University of Sydney, Sydney NSW 2006, Australia. Telephone: +61 2 9351-6452 Fax: +61 2 9351-6461

-- David
[R] still unclear about the parameters of Moran's I
I have read the introduction to Moran's I in the spdep package, but I am still unclear about the parameters of moran() and can't calculate Moran's I. For example, I have a dataset like the following (only an example):

longitude  latitude  x
110.23     32.53     10
109.52     33.21     20

I want to use moran(x, listw, n, S0, zero.policy=FALSE, NAOK=FALSE), but I can't make out the meaning of the parameters listw and S0 because of my poor understanding of R. Could anybody give me some code showing how to calculate Moran's I? Thanks in advance! -- Kind Regards
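To make the arguments concrete, here is a minimal sketch with the two example points above. It is hedged in two ways: the k = 1 nearest-neighbour definition is an arbitrary illustration (the neighbour definition must be justified for your own data), and two points are far too few for a meaningful statistic. listw is a spatial-weights object built from a neighbours list, and S0 is the sum of all the weights:

```r
library(spdep)  # assumed installed from CRAN

coords <- cbind(c(110.23, 109.52), c(32.53, 33.21))  # longitude, latitude
x      <- c(10, 20)

nb <- knn2nb(knearneigh(coords, k = 1))  # neighbours list (k = 1 is arbitrary)
lw <- nb2listw(nb, style = "W")          # 'listw': row-standardised spatial weights
n  <- length(x)
S0 <- Szero(lw)                          # 'S0': the sum of all weights

moran(x, lw, n, S0)
# for the statistic plus a significance test in one step:
# moran.test(x, lw)
```

In practice moran.test() is usually more convenient, since it builds n and S0 from the listw object itself and reports a p-value.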
[R] invalid variable type in model.frame within a function
Dear expeRts, I came across the following error in using model.frame:

# make a data.frame
jet=data.frame(y=rnorm(10),x1=rnorm(10),x2=rnorm(10),rvar=rnorm(10))
# specification of the formula
mf1=y~x1+x2
# make the model.frame
mf=model.frame(formula=mf1,data=jet,weights=rvar)

Which gives the desired output:

> mf
            y         x1         x2  (weights)
1   0.8041254  0.1815366  0.4999551  1.4957814
2  -0.2546224  1.9368141 -2.2373186  0.7579341
3   0.8627935 -0.6690416  1.3948077 -0.2107092
4   0.3951245  0.5733776 -1.2926074 -0.3289226
5  -1.4805766 -0.6113256  1.1635959  0.2300376
6  -0.7418800 -0.1610305  0.4057340 -0.2280754
7  -1.1420962 -0.9363492 -0.4811192 -0.9258711
8   0.3507427  1.8744646  1.3227931  0.5292313
9   1.4196519  0.1340283 -1.3970614 -0.7189726
10 -1.0164708 -0.2044681 -0.6825873 -0.1719102

However, doing this inside another function like this:

makemodelframe <- function(formula,data,weights) {
  mf=model.frame(formula=formula,data=data,weights=weights)
  mf
}

produces the following error:

> makemodelframe(mf1,jet,weights=rvar)
Error in model.frame(formula, rownames, variables, varnames, extras, extranames, : invalid variable type

Searching the R-help archives I came across bug reports about this, but couldn't figure out whether the bug was solved or whether there are work-arounds available.

Platform:
platform powerpc-apple-darwin7.9.0
arch     powerpc
os       darwin7.9.0
system   powerpc, darwin7.9.0
status
major    2
minor    2.1
year     2005
month    12
day      20
svn rev  36812
language R

Any hints are welcome, Best, Ingmar
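One standard work-around (not from the thread itself - it is the same idiom lm() and friends use internally) is to rebuild the call and re-evaluate it, so that weights = rvar reaches model.frame() unevaluated and gets looked up inside the data frame:

```r
# Rebuild the call so 'weights = rvar' is evaluated by model.frame()
# inside 'data', instead of failing on the way into the wrapper.
makemodelframe <- function(formula, data, weights) {
  m <- match.call()                 # the call exactly as the user wrote it
  m[[1]] <- as.name("model.frame")  # swap model.frame in as the function
  eval(m, parent.frame())           # evaluate in the caller's environment
}

jet <- data.frame(y = rnorm(10), x1 = rnorm(10), x2 = rnorm(10),
                  rvar = rnorm(10))
mf <- makemodelframe(y ~ x1 + x2, data = jet, weights = rvar)
names(mf)  # "y" "x1" "x2" "(weights)"
```

The trick is that match.call() captures rvar as an unevaluated symbol, and model.frame() then resolves it against the columns of data, just as it does when called directly.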
Re: [R] RGui: windows-record and command history
On 3/23/06, Duncan Murdoch [EMAIL PROTECTED] wrote: On 3/23/2006 7:35 AM, Thomas Steiner wrote: a) How can I set the recording of all windows()-history forever to true? I want something like windows(record = TRUE) but not just for the window that opens then, but for all windows I will open ever. options(graphics.record=TRUE) will make that happen for the rest of the session. To really make it happen forever, you need to put this line in your Rprofile (see ?Rprofile for where that comes from). Watch out though: the graphics history is stored in your current workspace in memory, and it can get big. You might find you're running out of memory if you store everything, and you'll find your .RData files quite large if you save your workspace. On my todo list (but not for 2.3.0) is the possibility of setting a default history length, perhaps defaulting to saving the last 2 or 3 pages. Would it be feasible to have history on disk or perhaps the last m in memory and the last n (possibly Inf) on disk? Is your todo list published anywhere?
Re: [R] RGui: windows-record and command history
Duncan Murdoch wrote: On 3/23/2006 7:35 AM, Thomas Steiner wrote: a) How can I set the recording of all windows()-history forever to true? I want something like windows(record = TRUE) but not just for the window that opens then, but for all windows I will open ever. options(graphics.record=TRUE) will make that happen for the rest of the session. To really make it happen forever, you need to put this line in your Rprofile (see ?Rprofile for where that comes from). Watch out though: the graphics history is stored in your current workspace in memory, and it can get big. You might find you're running out of memory if you store everything, and you'll find your .RData files quite large if you save your workspace. On my todo list (but not for 2.3.0) is the possibility of setting a default history length, perhaps defaulting to saving the last 2 or 3 pages. [snip]

Duncan, This may be asking too much, but would it be possible to consider implementing selective graph removal from the graph history? I use graphics.record=TRUE frequently for comparing graphs, but often find that I'd like to kill one of the graphs while keeping the others in memory. Peter Ehlers
[R] gam y-axis interpretation
Sorry if this is an obvious question... I'm estimating a simple binomial generalized additive model using the gam function in the package mgcv. The model makes sense given my data, and the predicted values also make sense given what I know about the data. However, I'm having trouble interpreting the y-axis of the plot of the gam object. The y-axis is labeled s(x,2.52), which I understand to basically mean a smoothing estimator with approximately 2.52 degrees of freedom. The y-axis in my case ranges from -2 to 6, and I thought that it would be possible to convert the y-axis estimate to a probability via exp(Y)/(1+exp(Y)). So for instance, my lowest y-axis estimate is -2, for a probability of:

> exp(-2)/(1+exp(-2))
[1] 0.1192029

However, if I use the predict function my lowest estimate is -3.53862893, for a probability of 2.8%. The 2.8% estimate is a much better estimate than 11.9% given my specific data, so I'm clearly not interpreting the plot correctly. The help files say plot.gam provides "the component smooth functions that make it up, on the scale of the linear predictor". I'm just not sure what that description means. Does someone have another description that might help me grasp the plot? Similar plots are on page 286 of Venables and Ripley (3rd Edition)... Thanks, Paul
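What "on the scale of the linear predictor" means can be checked directly: the curve plot.gam() draws is the (centred) component smooth alone, and the full linear predictor adds the model intercept, so probabilities must come from the complete predictor, not from the plotted smooth. A sketch with simulated data (all names and the data-generating process are illustrative, not Paul's model):

```r
library(mgcv)  # ships with R as a recommended package
set.seed(1)
x <- runif(200)
y <- rbinom(200, 1, binomial()$linkinv(3 * sin(2 * pi * x) - 1))
b <- gam(y ~ s(x), family = binomial)

eta    <- as.numeric(predict(b, type = "link"))            # full linear predictor
smooth <- as.numeric(predict(b, type = "terms")[, "s(x)"]) # what plot.gam() draws

# the plotted smooth differs from eta exactly by the intercept:
stopifnot(isTRUE(all.equal(eta, as.numeric(coef(b)[1]) + smooth)))

# probabilities come from the full linear predictor:
p <- binomial()$linkinv(eta)
```

So applying exp(Y)/(1+exp(Y)) to the plot's y-axis omits the intercept (and, in multi-term models, every other term), which is exactly the discrepancy Paul observed between the plot and predict().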
Re: [R] clogit question
On Thu, 23 Mar 2006, Dan Chan wrote: Hi, I am playing with clogit(case~spontaneous+induced+strata(stratum),data=infert) from the clogit help file. This line works.

Yes, this is one of the nice features of R (that the examples work).

1. But why doesn't strata(stratum) have a coefficient like spontaneous and induced?

Because that's the whole point of conditional logistic regression. It is used in situations where the stratum coefficients can't be estimated reliably and where using ordinary maximum likelihood would give the wrong answer. When it is called conditional logistic regression it is usually used in matched case-control studies. [Econometricians use the same estimator but by a different name.]

2. When I remove strata(stratum) from the command, this function seems to keep running forever. Why?

It is effectively putting all the observations in the same stratum. The computation required is exponential in the number of observations per stratum and will thus take, to a first approximation, forever.

3. I think the equation for clogit looks like P = 1/(1 + exp(-(a + b*x + c*y + ...))). In this example, I think spontaneous is x and induced is y, so b is the coefficient for spontaneous and c is the coefficient for induced. Where can I find a?

You can't. Conditional logistic regression gives only the odds ratios, exp(b) and exp(c). Since the infert data come from a matched case-control study, you couldn't get meaningful probabilities out of them anyway.

Conditional logistic regression is a specialised and unusual estimator. It is theoretically interesting for being a genuinely useful example of an estimator that is consistent in a problem where the MLE is inconsistent. Apart from that it is of interest mostly to epidemiologists. If you really need to (or just want to) understand it, you should read up on case-control studies. If you are near a university with a medical school, they will have Breslow, N. E. and N. E. Day (1980). Statistical Methods in Cancer Research: Vol. 1 - The Analysis of Case-Control Studies. Lyon, France, IARC Scientific Publications, which I think is the best reference. There's probably stuff on the web, too. -thomas

Thomas Lumley, Assoc. Professor, Biostatistics [EMAIL PROTECTED] University of Washington, Seattle
Re: [R] clogit question
Dear Dan, I think you need more (theoretical) background here. clogit() in package survival performs conditional logistic regression where you have several groups (the strata, the matched sets). There is an intercept for each stratum in the model, but you do not obtain them, since estimation is carried out via the conditional likelihood, i.e. *given* the sufficient statistics for the intercepts themselves. Have a look at a standard book on categorical data analysis. If you need estimates for the intercepts, (try to) use glm():

glm(case ~ spontaneous + induced + factor(stratum), family = binomial, data = infert)

best, vito

Dan Chan wrote: Hi, I am playing with clogit(case~spontaneous+induced+strata(stratum),data=infert) from the clogit help file. This line works. 1. But why doesn't strata(stratum) have a coefficient like spontaneous and induced? 2. When I remove strata(stratum) from the command, this function seems to keep running forever. Why? 3. I think the equation for clogit looks like P = 1/(1 + exp(-(a + b*x + c*y + ...))). In this example, I think spontaneous is x and induced is y, so b is the coefficient for spontaneous and c is the coefficient for induced. Where can I find a? Thank you. Daniel Chan Meteorologist Georgia Forestry Commission P O Box 819 Macon, GA 31202 Tel: 478-751-3508 Fax: 478-751-3465

-- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 6626240 fax: 091 485726/485612
Re: [R] RGui: windows-record and command history
On 3/23/2006 10:29 AM, Gabor Grothendieck wrote: On 3/23/06, Duncan Murdoch [EMAIL PROTECTED] wrote: On 3/23/2006 7:35 AM, Thomas Steiner wrote: a) How can I set the recording of all windows()-history forever to true? I want something like windows(record = TRUE) but not just for the window that opens then, but for all windows I will open ever. options(graphics.record=TRUE) will make that happen for the rest of the session. To really make it happen forever, you need to put this line in your Rprofile (see ?Rprofile for where that comes from). Watch out though: the graphics history is stored in your current workspace in memory, and it can get big. You might find you're running out of memory if you store everything, and you'll find your .RData files quite large if you save your workspace. On my todo list (but not for 2.3.0) is the possibility of setting a default history length, perhaps defaulting to saving the last 2 or 3 pages. Would it be feasible to have history on disk or perhaps the last m in memory and the last n (possibly Inf) on disk? The history is just another R object. Saving big R objects on disk might be desirable, but it would be a big change, so I'd call it infeasible. I wouldn't want to get into special-casing this particular R object: that way lies madness. However, since it is just an R object, it's available for R code to work with, so someone who was interested in doing this could write a contributed package that did it. Is your todo list published anywhere? No, it's too embarrassing how long some things have been sitting on it. Duncan Murdoch
Re: [R] RGui: windows-record and command history
On 3/23/06, Duncan Murdoch [EMAIL PROTECTED] wrote: [earlier quoted text snipped] The history is just another R object. Saving big R objects on disk might be desirable, but it would be a big change, so I'd call it infeasible. I wouldn't want to get into special-casing this particular R object: that way lies madness. However, since it is just an R object, it's available for R code to work with, so someone who was interested in doing this could write a contributed package that did it. Are there R-level facilities to manipulate the history, not just the top?
[R] nlme for groupedData with inner and outer factors
Hello, I am having trouble specifying a suitable nlme model. My data structure is described by

gd <- groupedData(ppath ~ lcut | exp, outer = ~ bait, inner = ~ weight, data = d)

i.e. the response (ppath) of several subjects (sub) was measured at levels of a continuous variable (lcut). Subjects were given one level of a factor (bait), and all subjects were measured at two levels of another factor (weight). Therefore bait varies among subjects and weight varies within subjects. The relationship ppath ~ lcut for each subject and weight appears to follow a logistic curve, with xmid and scal affected by bait and weight. There is also a random effect of subject on xmid and scal. Any help with formulating the correct model would be greatly appreciated. Many thanks, Dan Bebber Department of Plant Sciences University of Oxford p.s. Part of my data are shown below:

    sub lcut       ppath bait weight
1  pv1_  0.0 1.000000000    1      0
2  pv1_  0.1 0.827773821    1      0
3  pv1_  0.2 0.380102502    1      0
4  pv1_  0.3 0.209151878    1      0
5  pv1_  0.4 0.076929304    1      0
6  pv1_  0.5 0.065681564    1      0
7  pv1_  0.6 0.020670108    1      0
8  pv1_  0.7 0.012817021    1      0
9  pv1_  0.8 0.008661514    1      0
10 pv1_  0.9 0.011568323    1      0
11 pv19  0.0 1.000000000    1      0
12 pv19  0.1 0.668390291    1      0
13 pv19  0.2 0.343318462    1      0
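One way to write down the model Dan describes — purely a sketch, since the thread contains no fitted answer — is a self-starting logistic (SSlogis) with bait, weight and their interaction in the fixed effects for xmid and scal, and subject-level random effects on both; the starting values below are placeholders:

```r
library(nlme)

fit <- nlme(ppath ~ SSlogis(lcut, Asym, xmid, scal),
            data   = gd,
            fixed  = list(Asym ~ 1,
                          xmid ~ bait * weight,
                          scal ~ bait * weight),
            random = xmid + scal ~ 1 | sub,
            start  = c(1,              # Asym
                       0.2, 0, 0, 0,   # xmid: intercept, bait, weight, bait:weight
                       0.1, 0, 0, 0))  # scal: intercept, bait, weight, bait:weight
summary(fit)
```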
Re: [R] generating multivariate autocorrelated time series
If you can specify an ARMA or state-space model for the vector process, then you can do this with simulate in the dse bundle. Paul Gilbert Thomas Petzoldt wrote: Hello expeRts, for an application in hydrology I need to generate multivariate (log)normally distributed time series with given auto- and cross-correlations. While this is simple for the univariate case (e.g. with conditional normal sampling), it seems to be not so trivial for multivariate time series (according to papers available on this topic). An example: I have several (e.g. 3) time series (which are, of course, *correlated* measurements in reality): z <- ts(matrix(rnorm(300), 100, 3), start=c(1961, 1), frequency=12) and I want to get the vector for the next time step(s): z[n+1, 1:3], respecting the autocorrelations from that matrix up to a given lag value: a <- acf(z, lag=2) My question: Does anybody know about a solution (function, package, example etc...) available in R? Thanks a lot! Thomas P. --- http://tu-dresden.de/Members/thomas.petzoldt
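For illustration (base R, not the dse route Paul mentions): fit a vector autoregression to the multivariate series with ar() and generate the next value from the fitted coefficient matrices plus an innovation drawn from the estimated innovation covariance, which preserves the estimated auto- and cross-correlations up to the fitted lag:

```r
set.seed(1)
z <- ts(matrix(rnorm(300), 100, 3), start = c(1961, 1), frequency = 12)

fit <- ar(z, aic = FALSE, order.max = 2)  # multivariate Yule-Walker fit
n   <- nrow(z)
zc  <- sweep(z, 2, fit$x.mean)            # center the series

## z[n+1] = mean + sum_k A_k (z[n+1-k] - mean) + e
pred <- fit$x.mean
for (k in seq_len(fit$order))
  pred <- pred + fit$ar[k, , ] %*% zc[n + 1 - k, ]

e <- t(chol(fit$var.pred)) %*% rnorm(3)   # innovation with fitted covariance
z.next <- as.vector(pred + e)
z.next
```

For lognormal series one would fit on the log scale and exponentiate the simulated values.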
[R] conservative robust estimation in (nonlinear) mixed models
Conservative robust estimation methods do not appear to be currently available in the standard mixed model methods for R, where by conservative robust estimation I mean methods which work almost as well as the methods based on assumptions of normality when the assumption of normality *IS* satisfied. We are considering adding such a conservative robust estimation option for the random effects to our AD Model Builder mixed model package, glmmADMB, for R, and perhaps extending it to do robust estimation for linear mixed models at the same time. An obvious candidate is to assume something like a mixture of normals. I have tested this in a simple linear mixed model using 5% contamination with a normal with 3 times the standard deviation, which seems to be a common assumption. Simulation results indicate that when the random effects are normally distributed this estimator is about 3% less efficient, while when the random effects are contaminated with 5% outliers the estimator is about 23% more efficient, where by 23% more efficient I mean that one would have to use a sample size about 23% larger to obtain the same size confidence limits for the parameters. Question: I wonder if there are other distributions besides a mixture of normals which might be preferable. Three things to keep in mind are: 1.) It should be likelihood based so that the standard likelihood based tests are applicable. 2.) It should work well when the random effects are normally distributed, so that things that are already fixed don't get broken. 3.) In order to implement the method efficiently it is necessary to be able to produce code for calculating the inverse of the cumulative distribution function. This enables one to extend methods based on the Laplace approximation for the random effects (i.e. the Laplace approximation itself, adaptive Gaussian integration, adaptive importance sampling) to the new distribution. Dave -- David A. Fournier P.O. Box 2040, Sidney, B.C. V8L 3S3 Canada Phone/FAX 250-655-3364 http://otter-rsch.com
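For reference, the 5% contamination model described above can be written down directly; the mixture CDF has no closed-form inverse, so in this sketch a root-finder stands in for the inverse CDF required in point 3:

```r
## contaminated normal: (1 - eps) N(0, s^2) + eps N(0, (3s)^2)
dcnorm <- function(x, eps = 0.05, s = 1)
  (1 - eps) * dnorm(x, sd = s) + eps * dnorm(x, sd = 3 * s)

pcnorm <- function(q, eps = 0.05, s = 1)
  (1 - eps) * pnorm(q, sd = s) + eps * pnorm(q, sd = 3 * s)

## inverse CDF by numerical root-finding
qcnorm <- function(p, eps = 0.05, s = 1)
  sapply(p, function(pp)
    uniroot(function(x) pcnorm(x, eps, s) - pp,
            interval = c(-20 * s, 20 * s))$root)

qcnorm(c(0.025, 0.5, 0.975))  # heavier tails than qnorm at the same levels
```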
Re: [R] R 2.2.1: install packages
Hannu Kahra wrote: Hi, when using a laptop and trying to install packages for R for Windows 2.2.1 I get the following error utils:::menuInstallPkgs() --- Please select a CRAN mirror for use in this session --- Warning: unable to access index for repository http://cran.dk.r-project.org/bin/windows/contrib/2.2 Warning: unable to access index for repository http://www.stats.ox.ac.uk/pub/RWin/bin/windows/contrib/2.2 Error in install.packages(NULL, .libPaths()[1], dependencies = TRUE, type = type) : no packages were specified This works fine with R for Windows 2.2.0. I have this problem only with my laptop. A similar problem occurs when I try to update my MikTeX installation on the laptop. This suggests that it is not an R nor a MikTeX problem but a problem with the internet connection of your laptop, hence completely off-topic here. Please check your network settings and your firewall / proxy stuff. Uwe Ligges Any suggestions? Hannu Kahra
Re: [R] Create graphs from text console
Rainer M Krug wrote: Hi I am using R 2.2.1 under Linux (SuSE 10) and would like to know if it is possible to create graphs, i.e. jpeg(filename=fn) try( coplot( mor$I_Morisita ~ mor$Year | mor$RunStr, show.given=FALSE) ) dev.off() from a text console? It gives me an error message on the jpeg() command: Error in X11(..snip..) unable to start device jpeg In addition: Warning message: unable to open connection to X11 display. Are there any ways to create the plot and save it into a jpeg file from a text console? Via bitmap() (i.e. postscript with ghostscript postprocessing). Uwe Ligges Thanks, Rainer
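Uwe's suggestion, spelled out (bitmap() drives ghostscript, which must be installed and findable; the filename and resolution here are just illustrative):

```r
## renders via postscript + ghostscript, so no X11 display is required
bitmap(file = "coplot.jpeg", type = "jpeg", res = 150)
coplot(I_Morisita ~ Year | RunStr, data = mor, show.given = FALSE)
dev.off()
```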
[R] warning message using lmer()?
Dear all, I use lmer to fit a mixed effect model. It gives some warnings. What can I do about this? Here is the function and the warning messages:

model.growth.mcas5 <- lmer(response ~ monthElapsed + (monthElapsed | studentID),
    data = mcas5, family = binomial(link = logit), method = 'ML')

Warning messages: 1: nlminb returned message false convergence (8) in: LMEopt(x = mer, value = cv) 2: nlminb returned message false convergence (8) in: LMEopt(x = mer, value = cv) 3: nlminb returned message false convergence (8) in: LMEopt(x = mer, value = cv) I tried to increase the maximum iteration limit by adding control=list(msMaxIter=600, maxIter=300, PQLmaxIt=100). That does not seem to help. What can I do next? Thanks all! Mingyu Feng Department of Computer Science, Tutoring Research Group Worcester Polytechnic Institute
Re: [R] How to compare areas under ROC curves calculated with ROCR package
The seROC routine you included is a very good approximation to the standard error of the Mann-Whitney-Wilcoxon / area under the ROC curve statistic. It is derived from negative exponential models, but works very well in general (e.g. Hanley and McNeil, Diagnostic Radiology, 1982, v. 143, p. 29). A more general estimator of the variance is given by Campbell, Douglas and Bailey, Proc. Computers in Cardiology, 1988, p. 267. I've implemented that in R code included below. It is not an unbiased estimator, but it is very close. The cROC function is probably not what you want, however. It assumes that the data from the two different area measures are independent. You said your measures are from the same dataset. Your different AUC measures will be highly correlated. There are a number of methods to deal with correlated ROC curves in existence. If you are interested in performing hypothesis testing on the difference in AUC of two parameters, I would suggest a permutation test. Permuting the ranks of the data between parameters is simple and works well. -Frank

AuROC <- function(neg, pos) {
  ## empirical area under the ROC / Wilcoxon-Mann-Whitney statistic,
  ## together with its empirical variance; goes as O(n*log(n)).
  nx <- length(neg); ny <- length(pos); nall <- nx + ny
  rankall <- rank(c(neg, pos))       # ranks of all samples with respect to one another
  rankpos <- rankall[(nx + 1):nall]  # ranks of the positive cases
  ranksum <- sum(rankpos) - ny*(ny + 1)/2   # sum of ranks of positives among negatives
  ranky   <- rank(pos)               # ranks of just the y's (positives) among themselves
  rankyx  <- rankpos - ranky         # ranks of the y's among the x's (negatives)
  p21 <- sum(rankyx*rankyx - rankyx)/nx/(nx - 1)/ny   # term in variance
  rankx <- rank(neg)                 # ranks of x's (negatives) among each other
  ## reverse ranks of x's with respect to y's
  rankxy <- ny - rankall[1:nx] + rankx
  p12 <- sum(rankxy*rankxy - rankxy)/nx/ny/(ny - 1)   # another variance term
  a <- ranksum/ny/nx                 # the empirical area
  v <- (a*(1 - a) + (ny - 1)*(p12 - a*a) + (nx - 1)*(p21 - a*a))/nx/ny
  c(a, v)  # return vector containing the Mann-Whitney stat and the variance
}

Laurent Fanchon wrote: Dear all, I try to compare the performances of several parameters to diagnose lameness in dogs. I have several ROC curves from the same dataset. I plotted the ROC curves and calculated AUC with the ROCR package. I would like to compare the AUC. I used the following program I found in the R-help archives: From: Bernardo Rangel Tura Date: Thu 16 Dec 2004 - 07:30:37 EST

seROC <- function(AUC, na, nn) {
  a  <- AUC
  q1 <- a/(2 - a)
  q2 <- (2*a^2)/(1 + a)
  se <- sqrt((a*(1 - a) + (na - 1)*(q1 - a^2) + (nn - 1)*(q2 - a^2))/(nn*na))
  se
}

cROC <- function(AUC1, na1, nn1, AUC2, na2, nn2, r) {
  se1 <- seROC(AUC1, na1, nn1)
  se2 <- seROC(AUC2, na2, nn2)
  sed <- sqrt(se1^2 + se2^2 - 2*r*se1*se2)
  zad <- (AUC1 - AUC2)/sed
  p <- dnorm(zad)
  a <- list(zad, p)
  a
}

The author of this script says: The first function (seROC) calculates the standard error of a ROC curve, the second function (cROC) compares ROC curves. What do you think of this script? Is there any function to do it better in ROCR? Any help would be greatly appreciated. Laurent Fanchon DVM, MS Ecole Nationale Vétérinaire d'Alfort FRANCE
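Frank's closing suggestion of a permutation test can be sketched as follows (not from the thread; auc() here is a plain rank-based AUC, and par1, par2 are the two markers measured on the same cases with binary outcome y):

```r
auc <- function(x, y) {
  ## rank-based (Mann-Whitney) AUC of marker x for 0/1 outcome y
  r  <- rank(x)
  n1 <- sum(y == 1); n0 <- sum(y == 0)
  (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

perm.auc.test <- function(par1, par2, y, B = 2000) {
  ## paired permutation test: under the null the two markers are
  ## exchangeable within a case, so swap each case's pair at random
  obs <- auc(par1, y) - auc(par2, y)
  d <- replicate(B, {
    swap <- runif(length(y)) < 0.5
    auc(ifelse(swap, par2, par1), y) - auc(ifelse(swap, par1, par2), y)
  })
  mean(abs(d) >= abs(obs))   # two-sided permutation p-value
}
```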
[R] AIC mathematical artefact or computation problem ?
Dear R user, I have made many logistic regressions (glm function) with a second order polynomial formula on a data set containing 440 observations of 96 variables. I’ve made the plot of AIC versus the frequency (presence/observations) of each variable and I obtain a nearly perfect arch effect, symmetric about a frequency of 0.5. I obtain the same effect with deterministic data. Maybe I’ve missed something, but I have found nothing that could explain this in the theoretical calculation. Could it be due to the computation under R, or is the AIC value a function of frequency? Thanks for your consideration Lionel Humbert PhD candidate Inter-University Forest Ecology Research Group University of Quebec in Montreal
[R] Intercepts in linear models.
A colleague asked me if there is a way to specify with a ***variable*** (say ``cflag'') whether there is an intercept in a linear model. She had in mind something like lm(y ~ x - cflag) where cflag could be 0 or 1; if it's 0 an intercept should be fitted, if it's 1 then no intercept. This doesn't work ``of course''. The cflag just gets treated as another predictor, and because it is of the wrong length an error is generated. The best I could come up with was lm(as.formula(paste("y ~ x -", cflag))) which is kludgy. It can (of course!) also be done using an if statement. Is there a sexier way? cheers, Rolf Turner
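One arguably less kludgy route (a sketch; cflag as in the post, with 1 meaning no intercept) builds the formula as an object instead of pasting strings:

```r
fit.with.flag <- function(y, x, cflag) {
  f <- if (cflag) y ~ x - 1 else y ~ x   # choose the formula directly
  lm(f)
}

set.seed(1)
x <- 1:10
y <- 2 * x + rnorm(10)
coef(fit.with.flag(y, x, 0))   # includes '(Intercept)'
coef(fit.with.flag(y, x, 1))   # slope only
```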
Re: [R] invalid variable type in model.frame within a function
Examine the source code of lm to determine the proper way of doing this. Following that gives:

makemodelframe <- function(formula, data, weights) {
  mf <- match.call()
  mf[[1]] <- as.name("model.frame")
  eval(mf, parent.frame())
}

On 3/23/06, Ingmar Visser [EMAIL PROTECTED] wrote: Dear expeRts, I came across the following error in using model.frame:

# make a data.frame
jet = data.frame(y = rnorm(10), x1 = rnorm(10), x2 = rnorm(10), rvar = rnorm(10))
# spec of formula
mf1 = y ~ x1 + x2
# make the model.frame
mf = model.frame(formula = mf1, data = jet, weights = rvar)

Which gives the desired output:

> mf
            y         x1         x2  (weights)
1   0.8041254  0.1815366  0.4999551  1.4957814
2  -0.2546224  1.9368141 -2.2373186  0.7579341
3   0.8627935 -0.6690416  1.3948077 -0.2107092
4   0.3951245  0.5733776 -1.2926074 -0.3289226
5  -1.4805766 -0.6113256  1.1635959  0.2300376
6  -0.7418800 -0.1610305  0.4057340 -0.2280754
7  -1.1420962 -0.9363492 -0.4811192 -0.9258711
8   0.3507427  1.8744646  1.3227931  0.5292313
9   1.4196519  0.1340283 -1.3970614 -0.7189726
10 -1.0164708 -0.2044681 -0.6825873 -0.1719102

However, doing this inside another function like this:

makemodelframe <- function(formula, data, weights) {
  mf = model.frame(formula = formula, data = data, weights = weights)
  mf
}

produces the following error: makemodelframe(mf1, jet, weights = rvar) Error in model.frame(formula, rownames, variables, varnames, extras, extranames, : invalid variable type Searching the R-help archives I came across bug reports about this but couldn't figure out whether the bug was solved or whether there are work-arounds available. platform: powerpc-apple-darwin7.9.0, arch powerpc, os darwin7.9.0, system powerpc, darwin7.9.0, status, major 2, minor 2.1, year 2005, month 12, day 20, svn rev 36812, language R. Any hints are welcome, Best, Ingmar
Re: [R] DIfference between weights options in lm GLm and gls.
In my tests, gls did NOT give the same answers as lm and glm, and I don't know why; perhaps someone else will enlighten us both. I got the same answers from lm and glm. Since you report different results, please supply a reproducible example. I tried the following:

set.seed(1)
DF <- data.frame(x = 1:8, xf = rep(c("a", "b"), 4), y = rnorm(8), w = 1:8, one = rep(1, 8))
fit.lm.w  <- lm(y ~ x, DF, weights = w)
fit.glm.w <- glm(y ~ x, data = DF, weights = w)
fit.gls.w <- gls(y ~ x, data = DF, weights = varFixed(~w))

> coef(fit.lm.w)
(Intercept)           x
 -0.2667521   0.0944190
> coef(fit.glm.w)
(Intercept)           x
 -0.2667521   0.0944190
> coef(fit.gls.w)
(Intercept)           x
 -0.5924727   0.1608727

I also tried several variants of this. I know this does not answer your questions, but I hope it will contribute to an answer. spencer graves

Goeland wrote: Dear r-users, Can anyone explain exactly the difference between the weights options in lm, glm and gls? I tried the following code, but the results are different.

> lm1
Call: lm(formula = y ~ x)
Coefficients: (Intercept) 0.1183, x 7.3075

> lm2
Call: lm(formula = y ~ x, weights = W)
Coefficients: (Intercept) 0.04193, x 7.30660

> lm3
Call: lm(formula = ys ~ Xs - 1)
Coefficients: Xs 0.04193, Xsx 7.30660

Here ys <- y*sqrt(W) and Xs <- sqrt(W)*cbind(1, x), so we can see that weights for lm means scaling X and y. But for glm and gls I tried:

> glm1
Call: glm(formula = y ~ x)
Coefficients: (Intercept) 0.1183, x 7.3075
Degrees of Freedom: 1242 Total (i.e. Null); 1241 Residual
Null Deviance: 1049000 Residual Deviance: 28210 AIC: 7414

> glm2
Call: glm(formula = y ~ x, weights = W)
Coefficients: (Intercept) 0.1955, x 7.3053
Degrees of Freedom: 1242 Total (i.e. Null); 1241 Residual
Null Deviance: 1548000 Residual Deviance: 44800 AIC: 11670

> glm3
Call: glm(formula = y ~ x, weights = 1/W)
Coefficients: (Intercept) 0.03104, x 7.31033
Degrees of Freedom: 1242 Total (i.e. Null); 1241 Residual
Null Deviance: 798900 Residual Deviance: 19900 AIC: 5285

> glm4
Call: glm(formula = ys ~ Xs - 1)
Coefficients: Xs 2.687, Xsx 6.528
Degrees of Freedom: 1243 Total (i.e. Null); 1241 Residual
Null Deviance: 449 Residual Deviance: 506700 AIC: 11000

With weights, glm did not give the same results as lm; why? Also for gls, I used varFixed here.

> gls3
Generalized least squares fit by REML
Model: y ~ x  Data: NULL
Log-restricted-likelihood: -3737.392
Coefficients: (Intercept) 0.03104214, x 7.31032540
Variance function: Structure: fixed weights, Formula: ~W
Degrees of freedom: 1243 total; 1241 residual
Residual standard error: 4.004827

> gls4
Generalized least squares fit by REML
Model: ys ~ Xs - 1  Data: NULL
Log-restricted-likelihood: -5500.311
Coefficients: Xs 2.687205, Xsx 6.527893
Degrees of freedom: 1243 total; 1241 residual
Residual standard error: 20.20705

We can see the relation between glm and gls with weights as I expected, but what is the difference between lm, gls and glm? Why? Thanks so much! Goeland [EMAIL PROTECTED] 2006-03-16
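A plausible reconciliation, consistent with the numbers quoted above (glm with weights = 1/W matches gls with varFixed(~W)): lm/glm weights w mean variance proportional to 1/w, whereas varFixed(~w) declares variance proportional to w — the two conventions are reciprocal. A quick check of this reading (a sketch, assuming nlme is available):

```r
library(nlme)

set.seed(1)
DF <- data.frame(x = 1:8, y = rnorm(8), w = 1:8)
DF$winv <- 1 / DF$w

fit.lm  <- lm(y ~ x, data = DF, weights = w)           # variance ~ 1/w
fit.gls <- gls(y ~ x, data = DF, weights = varFixed(~winv))

all.equal(coef(fit.lm), coef(fit.gls))  # should now report TRUE
```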
Re: [R] RGui: windows-record and command history
Martin == Martin Maechler [EMAIL PROTECTED] on Thu, 23 Mar 2006 15:04:23 +0100 writes: Duncan == Duncan Murdoch [EMAIL PROTECTED] on Thu, 23 Mar 2006 08:48:52 -0500 writes: Duncan On 3/23/2006 7:35 AM, Thomas Steiner wrote: Martin b) Scrolling up in RGui (Windows 2000) to see past commands is nice, but: Is it possible to type e.g. wi and then the up arrow and see the last command that started with wi (like windows())? I know this feature from Matlab (Uops, one of the forbidden words here? ;) ) and it's nice to have it. Duncan We have things like that on platforms that use the readline library for input, but Rgui doesn't. Martin Also note that ESS (Emacs Speaks Statistics) has exactly this feature on all platforms! uuhm... exactly (using the [up] arrow key) only if you use our (recommended but not default) emacs setting below; otherwise, you have to use 'C-c M-r' instead of the up arrow:

(eval-after-load "comint"
  '(progn
     (setq comint-scroll-to-bottom-on-output 'others) ; not current
     ;; default: (setq comint-scroll-to-bottom-on-input nil)
     (setq comint-scroll-show-maximum-output t)
     ;; this is the key part:
     (define-key comint-mode-map [up]
       'comint-previous-matching-input-from-input)
     (define-key comint-mode-map [down]
       'comint-next-matching-input-from-input)
     (define-key comint-mode-map "\C-a" 'comint-bol)))

[for the non-emacs experts: the above is emacs-lisp; you have to add it to your home-directory/.emacs or a site-wide emacs initialization file {such as 'site-start.el'}.] Martin Maechler, ETH Zurich Duncan It would be nice, but it's a fair bit of work to implement properly and it's not on my todo list. Duncan Duncan Murdoch
Re: [R] Time-Series, multiple measurements, ANOVA model over time points, analysis advice
If this were my problem, I might start by considering each stimulus-response pair as one observation, and I'd break the MEG into separate time series, each starting roughly 1 second before the stimulus and ending roughly 1 second after. If you've averaged many of these, I'm guessing you must have done something like this already. From plots of the averages, plus from autocorrelation and partial autocorrelation functions of the individual series, I'd then try to develop a parsimonious model for the changes. By fitting this model to each response, I could hopefully condense the data from a few thousand observations in each series to a small number of parameters. Then you could try to model the differences in the estimated parameters. Are you familiar with Pinheiro and Bates (2000) Mixed-Effects Models in S and S-Plus (Springer)? This book and the companion nlme package have facilities for handling the kinds of models you describe. However, I doubt if the software will handle the volume of data you have. You may have to condense the data, e.g., by averaging sequences of roughly 100 observations each, matched somehow to the stimulus and response events, etc. hope this helps. spencer graves Darren Weber wrote: Hi, I have some general questions about statistical analysis for a research dataset and a request for advice on using R and associated packages for a valid analysis of this data. I can only pose the problem as how to run multiple ANOVA tests on time series data, with reasonable control of the family-wise error rate. If we run analyses at many small sections of a long time series, the Type-I family-wise error rate is a concern. Is it important to consider the temporal dependence in the time-series data? There are papers on randomization tests for the sort of time-series data we have (e.g., Blair and Karniski, 1993), but these papers often report only t-test comparisons, not F-tests with 2+ factors. BACKGROUND: We have a dataset from a human neuroimaging experiment. Subjects view a screen, with a fixation point at the center. A cue arrow appears, directing their attention to the lower left or right visual field (cue left or cue right; this is one factor in our analysis). After 1 sec, a stimulus (S) appears in the lower left or right visual quadrant and subjects have to respond to it only if it appeared in the cued location. This sequence of events repeated hundreds of times. Each trial comprised a cue followed 1 sec later by a stimulus (cue -> S), with a longer gap (about 2 sec) between trials. The brain activity was measured with magnetoencephalography (MEG), with a very high sample rate (1200 Hz). The activity from 275 MEG sensors was segmented precisely in relation to the onset of the cue. Each of these segments is known as an 'event-related field' or ERF, and the segments for every cue-left or cue-right trial were averaged across trials (to improve the signal-to-noise ratio of the event-related activity). We have data that are averaged ERFs over several hundred trials. The averaged ERF for cue left or cue right was used to estimate the brain source activity (the details are not relevant here). A small example dataset and R scripts are available via ftp://ftp.mrsc.ucsf.edu/pub/dweber/cortical_timeSeries.tar These example data are from one brain region of interest (roi), called the middle frontal gyrus (MFG). We have estimated activity in this brain region for the left and right cerebral hemisphere (this is one factor in the analysis). These data are for a short period prior to the cue (-300 ms) and a longer period after the cue (1400 ms; the S appeared at 1000 ms). There are 8 subjects in this dataset. Each subject has an ERF. ANALYSIS to DATE: For each time bin of about 20 ms duration, from -300 to 1400 ms, we need to evaluate the ANOVA model, MFGactivity = CUE + HEMI + error, where the CUE (left, right) and HEMI (left, right) interactions are important. These two factors are within-subjects factors (i.e., repeated measures for each subject). In classical terms, this is a split-plot design. The data frame and aov model are specified in the R scripts of the download. Given time bins of about 20 ms and a time series of -300 to 1400 ms at small increments of 1-2 ms, we have a lot of analyses in just one brain region. How can we do this analysis and minimize family-wise error rates? Is it possible to run a permutation analysis for an ANOVA model? R scripts in the download: source("Rscript_HiN_cortical_roi_analysis_aov_specifics.R") This will run ANOVA on several time bins of the data. The time bins are defined in Rscript_HiN_cortical_roi_analysis_setup.R, which is sourced by the script above. Reference: Blair RC, Karniski W. 1993. An alternative method for significance testing of waveform difference potentials. Psychophysiology 30:518--524.
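The split-plot aov Darren describes could be coded per time bin roughly as follows (a sketch; the data frame bin.data and its column names are assumptions based on the post, with subject as the block factor):

```r
## one time bin: CUE and HEMI are within-subject (repeated) factors
fit <- aov(MFGactivity ~ CUE * HEMI + Error(subject / (CUE * HEMI)),
           data = bin.data)
summary(fit)
```

A permutation version would refit this after shuffling condition labels within subjects, accumulating the F statistics to build a null distribution.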
Re: [R] invalid variable type in model.frame within a function
On Thu, 23 Mar 2006, Ingmar Visser wrote: Dear expeRts, I came across the following error in using model.frame: # make a data.frame jet = data.frame(y = rnorm(10), x1 = rnorm(10), x2 = rnorm(10), rvar = rnorm(10)) # spec of formula mf1 = y ~ x1 + x2 # make the model.frame mf = model.frame(formula = mf1, data = jet, weights = rvar) Which gives the desired output: output snipped However, doing this inside another function like this: makemodelframe <- function(formula, data, weights) { mf = model.frame(formula = formula, data = data, weights = weights); mf } produces the following error: makemodelframe(mf1, jet, weights = rvar) Error in model.frame(formula, rownames, variables, varnames, extras, extranames, : invalid variable type Searching the R-help archives I came across bug reports about this but couldn't figure out whether the bug was solved or whether there are work-arounds available. It's not a bug. There have been bug reports about related issues (and also about this issue, but they tend to be marked not a bug). If you think about it, how could makemodelframe(mf1, jet, weights = rvar) possibly work? R passes variables by value, so rvar has to be evaluated before the function is called. But rvar is not the name of any global variable (it's just a column in a data frame), so how can R know where to look? The reason that people think it might work is by analogy with model.frame and the regression commands, where model.frame(y ~ x, data = d, weights = w) does somehow retrieve d$w as the weight. This analogy tends to override programming common sense and make people believe that R will somehow know where to find the weights. Now, since model.frame() *does* manage to find the weights, it must be possible, and it is. That doesn't make it a good idea, though. Regression commands and model.frame() do some fairly advanced trickery to make it work. This is documented on developer.r-project.org. I don't think it's a good idea for people to write code like this. I should admit (especially since it's Lent at the moment, and so is an appropriate time to repent one's past errors) that I lobbied Ross and Robert to make model.frame() work compatibly with S-PLUS in its treatment of weights= arguments (when porting the survival package, nearly ten years ago). They were reluctant at the time, and I now think they were right, although this level of S-PLUS compatibility might have been unavoidable. I would advise writing your code so that the call looks like makemodelframe(mf1, jet, weights = ~rvar). That is, pass all the variables that are going to be evaluated in the data= argument as formulas (or as quoted expressions). This is basically what lme() does, where you supply two formulas and then various other bits and pieces as objects. It is what my survey package does. Then a user can do makemodelframe(mf1, jet, weights = rvar) if rvar is a variable in the current environment and makemodelframe(mf1, jet, weights = ~rvar) if rvar is a variable in the data= argument, and both will work. There is some discussion of this in a note on Nonstandard evaluation on the developer.r-project.org webpage, including a function that will produce a single model frame from multiple formulas. Now, I think there are some exceptions to this recommendation, and I don't have a very clear definition of them. I think of them as macro-like functions that evaluate a supplied expression in some special context. Functions like this in base R include with() and capture.output(), and you will find some more nice simple examples in the mitools package. For these functions it really isn't ambiguous where the evaluation takes place. A related issue is functions such as the plot() methods that use the unevaluated forms of their arguments as labels. Again, the evaluation of the labels isn't ambiguous, because it doesn't even happen. With a few exceptions like these, though, I think it's a bad idea to subvert the pass-by-value illusion in R. This was a lot more than you probably wanted to know, but the alternative answer is the traditional "Doctor, it hurts when I do this." "Don't do that, then." -thomas
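Thomas's formula-passing advice might look like this in practice (a sketch, not code from the thread; it sidesteps model.frame()'s nonstandard evaluation by resolving the weights itself, and it ignores na.action bookkeeping for brevity):

```r
makemodelframe <- function(formula, data, weights = NULL) {
  ## weights may be a one-sided formula (~rvar, looked up in 'data')
  ## or an ordinary vector from the caller's environment
  if (inherits(weights, "formula"))
    weights <- eval(weights[[2]], data, environment(weights))
  mf <- model.frame(formula, data = data)
  if (!is.null(weights))
    mf[["(weights)"]] <- weights
  mf
}

jet <- data.frame(y = rnorm(10), x1 = rnorm(10),
                  x2 = rnorm(10), rvar = rnorm(10))
head(makemodelframe(y ~ x1 + x2, jet, weights = ~rvar))
```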
Re: [R] Create graphs from text console
Also check out the GDD package created by Simon Urbanek if bitmap does not fit your needs. On some systems, bitmap is slow or produces an inferior quality plot, in part because anti-aliasing is not supported (at least on my system). GDD, however, produces excellent anti-aliased graphs using the GD Graphics Library with Freetype support instead of ghostscript. That said, GDD is still in beta and has a couple of missing features, such as lack of support for different line widths in graphs (as of 0.1-7). --Robert -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Uwe Ligges Sent: Thursday, March 23, 2006 11:11 AM To: [EMAIL PROTECTED] Cc: R help list Subject: Re: [R] Create graphs from text console Rainer M Krug wrote: Hi I am using R 2.2.1 under Linux (SuSE 10) and would like to know if it is possible to create graphs, i.e. jpeg(filename=fn) try( coplot( mor$I_Morisita ~ mor$Year | mor$RunStr, show.given=FALSE) ) dev.off() from a text console? It gives me an error message on the jpeg() command: Error in X11(..snip..) unable to start device jpeg In addition: warning message: unable to open connection to X11 display. Are there any ways to create the plot and save it into a jpeg file from a text console? Via bitmap() (i.e. postscript with ghostscript postprocessing). Uwe Ligges Thanks, Rainer
Re: [R] RGui: windows-record and command history
On 3/23/2006 10:34 AM, Peter Ehlers wrote:

Duncan Murdoch wrote:

On 3/23/2006 7:35 AM, Thomas Steiner wrote:

a) How can I set the recording of all windows() history forever to true? I want something like windows(record = TRUE), but not just for the window that opens then, rather for all windows I will ever open.

options(graphics.record=TRUE) will make that happen for the rest of the session. To really make it happen forever, you need to put this line in your Rprofile (see ?Rprofile for where that comes from). Watch out though: the graphics history is stored in your current workspace in memory, and it can get big. You might find you're running out of memory if you store everything, and you'll find your .RData files quite large if you save your workspace. On my todo list (but not for 2.3.0) is the possibility of setting a default history length, perhaps defaulting to saving the last 2 or 3 pages.

[snip]

Duncan, this may be asking too much, but would it be possible to consider implementing selective graph removal from the graph history? I use graphics.record=TRUE frequently for comparing graphs, but often find that I'd like to kill one of the graphs while keeping others in memory.

That's probably not too hard to do in R code. You just need to look at the source in src/library/grDevices/R/windows/windows.R for the print.SavedPlots method, and maybe the C-level code in src/library/grDevices/src/devWindows.c for a bit more help on the interpretation of the .SavedPlots object, and then it should be fairly straightforward to write a function that deletes an entry in the history. (It's a list with 5 components, the first 4 of which describe the current state and what the user is looking at, and the last of which is a list containing the actual recorded plots.)

Duncan Murdoch
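Based on Duncan's description of the (undocumented) .SavedPlots layout, a deletion helper might look like the sketch below. The function name is invented, and the meaning of the components is an assumption taken from his message; the layout may change between R versions.

```r
# Hypothetical helper: drop plot number i from the Windows graphics history.
# Assumes .SavedPlots is a list whose last (5th) component is the list of
# recorded plots -- undocumented internals, per Duncan's message.
deleteSavedPlot <- function(i) {
  h <- get(".SavedPlots", envir = globalenv())
  h[[5]] <- h[[5]][-i]                 # remove the i-th recorded plot
  assign(".SavedPlots", h, envir = globalenv())
}
```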
Re: [R] RGui: windows-record and command history
On 3/23/2006 10:46 AM, Gabor Grothendieck wrote:

On 3/23/06, Duncan Murdoch [EMAIL PROTECTED] wrote:

On 3/23/2006 10:29 AM, Gabor Grothendieck wrote:

On 3/23/06, Duncan Murdoch [EMAIL PROTECTED] wrote:

On 3/23/2006 7:35 AM, Thomas Steiner wrote:

a) How can I set the recording of all windows() history forever to true? I want something like windows(record = TRUE), but not just for the window that opens then, rather for all windows I will ever open.

options(graphics.record=TRUE) will make that happen for the rest of the session. To really make it happen forever, you need to put this line in your Rprofile (see ?Rprofile for where that comes from). Watch out though: the graphics history is stored in your current workspace in memory, and it can get big. You might find you're running out of memory if you store everything, and you'll find your .RData files quite large if you save your workspace. On my todo list (but not for 2.3.0) is the possibility of setting a default history length, perhaps defaulting to saving the last 2 or 3 pages.

Would it be feasible to have history on disk, or perhaps the last m in memory and the last n (possibly Inf) on disk?

The history is just another R object. Saving big R objects on disk might be desirable, but it would be a big change, so I'd call it infeasible. I wouldn't want to get into special-casing this particular R object: that way lies madness. However, since it is just an R object, it's available for R code to work with, so someone who was interested in doing this could write a contributed package that did it.

Are there R-level facilities to manipulate the history, not just the top?

Sure, it's a regular R object. You will need to read the source to know how to interpret it, and since it's undocumented there's a risk of changes in future R versions, but it's not very complicated. See my message to Peter.

Duncan Murdoch
Re: [R] Create graphs from text console
From: Uwe Ligges

Rainer M Krug wrote:

Hi, I am using R 2.2.1 under Linux (SuSE 10) and would like to know if it is possible to create graphs from a text console, i.e.

    jpeg(filename = fn)
    try(coplot(mor$I_Morisita ~ mor$Year | mor$RunStr, show.given = FALSE))
    dev.off()

It gives me an error message on the jpeg() command:

    Error in X11(..snip..): unable to start device jpeg
    In addition: Warning message: unable to open connection to X11 display.

Are there any ways to create the plot and save it into a jpeg file from a text console?

Via bitmap() (i.e. postscript with ghostscript postprocessing).

Or alternatively, look in the list archive about using Xvfb; e.g., http://finzi.psych.upenn.edu/R/Rhelp02a/archive/63018.html

Andy

Uwe Ligges

Thanks, Rainer
Re: [R] Estimation of skewness from quantiles of near-normal distribution
This pertains to the first paragraph: you can use the D'Agostino test, which is an omnibus test combining both skewness and kurtosis and has high power, instead of only the skewness of the data. Try ?dagoTest.

Ahmed

Leif Kirschenbaum [EMAIL PROTECTED] wrote:

I have summary statistics from many sets (10,000's) of near-normal continuous data. From previously generated QQ-plots of these data I can see visually that most of them are normal, with a few which are not. I have the raw data for a few (700) of these sets. I have applied several tests of normality, skew, and kurtosis to these sets to see which test might yield a parameter that identifies the sets which are visibly non-normal on the QQ-plot. My conclusion thus far has been that the skew is the best determinant of non-normality for these particular data.

Given that I do not have ready access to the sets (10,000's) of data, only to summary statistics which have been calculated on these sets, is there a method by which I may estimate the skew given the following summary statistics: the 0.1%, 1%, 5%, 10%, 25%, 75%, 90%, 95%, 99%, and 99.9% quantiles, plus mean, median, N, and sigma? N is usually about 900, so I would discount the 0.1%, 1%, 99%, and 99.9% quantiles as unreliable due to noisiness in the distributions.

I know that there are, for instance, general rules for calculating the sigma of a normal distribution from quantiles, and so am wondering if there are any general rules for calculating skew given a set of quantiles, mean, and sigma. I am currently thinking of trying polynomial fits on the QQ-plot using the raw data I have and then empirically trying to derive a relationship between the quantiles and the skew.

Thank you for any ideas.

Leif Kirschenbaum
Senior Yield Engineer
Reflectivity, Inc.
(408) 737-8100 x307
[EMAIL PROTECTED]
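One quantile-only possibility not mentioned in the thread: Bowley's quartile skewness, which needs nothing beyond the 25%, 50%, and 75% points already in the available summary statistics. It is a rough symmetry screen, not a substitute for the moment-based skew; the function name and example values are illustrative.

```r
# Bowley (quartile) skewness: ((Q3 - median) - (median - Q1)) / (Q3 - Q1).
# Lies in [-1, 1] and is 0 for any symmetric distribution.
bowley.skew <- function(q25, q50, q75) {
  ((q75 - q50) - (q50 - q25)) / (q75 - q25)
}
bowley.skew(q25 = -0.674, q50 = 0, q75 = 0.674)  # symmetric quantiles give 0
```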
[R] R code to generate Sweave input ? Sweave squared ?
For some recurrent tasks, I would like to programmatically generate input for Sweave. While I could do that with many languages, in particular some starting with the letter P, I wouldn't mind advancing two more positions in the alphabet and using R itself to generate input for Sweave. In other words, I want to write R code that can write Rnw input files which will be turned into R code ... that ends up as dvi/pdf output.

Has anybody done this before? Are there any frameworks to (re-)use?

Regards, Dirk

--
Hell, there are no rules here - we're trying to accomplish something. -- Thomas A. Edison
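A bare-bones illustration of the idea (file name and chunk contents invented for the example): write the .Rnw file from R with writeLines(), then hand it to Sweave().

```r
# Generate a minimal Sweave source file from R, then weave it.
rnw <- c("\\documentclass{article}",
         "\\begin{document}",
         "<<summaryChunk, echo=TRUE>>=",
         "summary(rnorm(100))",
         "@",
         "\\end{document}")
writeLines(rnw, "generated.Rnw")
Sweave("generated.Rnw")   # produces generated.tex for latex/pdflatex
```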
Re: [R] Intercepts in linear models.
You can use update() to modify a formula:

    fo <- y ~ x
    fo <- update(fo, . ~ . - 1)

will remove the intercept. See ?update.formula.

On 3/23/06, Rolf Turner [EMAIL PROTECTED] wrote:

A colleague asked me if there is a way to specify with a ***variable*** (say ``cflag'') whether there is an intercept in a linear model. She had in mind something like

    lm(y ~ x - cflag)

where cflag could be 0 or 1; if it's 0 an intercept should be fitted, if it's 1 then no intercept. This doesn't work ``of course''. The cflag just gets treated as another predictor, and because it is of the wrong length an error is generated. The best I could come up with was

    lm(as.formula(paste("y ~ x -", cflag)))

which is kludgy. It can (of course!) also be done using an if statement. Is there a sexier way?

cheers, Rolf Turner
[R] Writing a function to fit ALSOS models. Problem with normalization?
Dear all,

Below is my attempt at a function to fit Alternating Least Squares Optimal Scaling (ALSOS) models, as described in Young (1981), "Quantitative Analysis of Qualitative Data", and Jacoby (1999), "Levels of Measurement and Political Research: An Optimistic View". I would welcome any comments on coding style, tips, tricks, etc. I also have a specific problem: the output tends to give very small coefficients and very large fitted values for specific factor levels. Am I doing the normalization right?

Cheers,
David

    library(car)  # for recode

    alsos.fit = function(y, x, tol = 1e-7) {
      # y is a vector or factor or ordered factor
      # x is a data frame of vectors/factors/ordereds
      # we treat the y's as the first column of the matrix
      x = cbind(y, x)
      x = x[complete.cases(x), ]
      ox = x
      x.facs = sapply(x, is.factor)
      x.ords = sapply(x, is.ordered)
      # start with our numeric values, whatever they are
      x = sapply(x, as.numeric)
      old.SSE = Inf
      while (TRUE) {
        # least squares regression with an intercept
        ols = lm.fit(cbind(rep(1, nrow(x)), x[, -1]), x[, 1])
        b = ols$coefficients
        SSE = drop(ols$residuals %*% ols$residuals)
        if (old.SSE - SSE < tol) {
          factor.scores = list()
          for (i in (1:ncol(x))[x.facs]) {
            nm = colnames(x)[i]
            factor.scores[[nm]] = tapply(x[, i], ox[, i], function(foo) foo[1])
            names(factor.scores[[nm]]) = levels(ox[, i])
          }
          return(list(factor.scores = factor.scores,
                      scaled.y = x[, 1],
                      scaled.x = x[, -1],
                      coefficients = b,
                      SSE = SSE))
        }
        old.SSE = SSE
        mx = nx = x
        mx[] = 0
        nx[] = 0
        for (i in (1:ncol(x))[x.facs]) {
          # optimal scaling
          if (i == 1)
            nx[, i] = ols$fitted.values
          else
            nx[, i] = (x[, 1] - cbind(rep(1, nrow(x)), x[, c(-1, -i)]) %*% b[-i]) / b[i]
          # create within-category means
          tmpfac = factor(ox[, i], labels = 1:nlevels(ox[, i]))
          catmeans = tapply(nx[, i], tmpfac, mean)
          # ensure ordinal values are correctly ordered
          if (x.ords[i]) {
            tmp = kruskal.ordering(nx[, i], tmpfac)
            tmpfac = tmp$tmpfac
            catmeans = tmp$catmeans
          }
          # set values to within-category means
          mx[, i] = catmeans[tmpfac]
          # normalize according to Young (1981)
          mx[, i] = mx[, i] * (nx[, i] %*% nx[, i]) / (mx[, i] %*% mx[, i])
        }
        x[, x.facs] = mx[, x.facs]
      }
    }

    # as described in Kruskal (1964)
    kruskal.ordering = function(numeric.data, tmpfac) {
      j = 1
      upact = TRUE
      while (TRUE) {
        # vector with as many items as tmpfac has categories
        catmeans = tapply(numeric.data, tmpfac, mean)
        # have we finished?
        if (j > nlevels(tmpfac))
          return(list(catmeans = catmeans, tmpfac = tmpfac))
        if ((j == nlevels(tmpfac) || catmeans[j] <= catmeans[j + 1]) &&
            (j == 1 || catmeans[j] >= catmeans[j - 1])) {
          j = j + 1
          upact = TRUE
          next
        }
        if (upact) {
          if (j < nlevels(tmpfac) && catmeans[j] > catmeans[j + 1]) {
            tmpfac = recode(tmpfac, paste(j, ":", j + 1, "='", j + 1, "'", sep = ""))
            levels(tmpfac) = 1:nlevels(tmpfac)
          }
          upact = FALSE
        } else {
          if (j > 1 && catmeans[j] < catmeans[j - 1]) {
            tmpfac = recode(tmpfac, paste(j - 1, ":", j, "='", j, "'", sep = ""))
            levels(tmpfac) = 1:nlevels(tmpfac)
            j = j - 1
          }
          upact = TRUE
        }
      }
    }
Re: [R] Difference between weights options in lm, glm and gls.
Hi, Spencer,

For your call to gls you actually want:

    fit.gls.w <- gls(y ~ x, data = DF, weights = varFixed(~ 1/w))

HTH,
--sundar

Spencer Graves wrote:

In my tests, gls did NOT give the same answers as lm and glm, and I don't know why; perhaps someone else will enlighten us both. I got the same answers from lm and glm. Since you report different results, please supply a replicable example. I tried the following:

    set.seed(1)
    DF <- data.frame(x = 1:8, xf = rep(c("a", "b"), 4), y = rnorm(8), w = 1:8, one = rep(1, 8))
    fit.lm.w  <- lm(y ~ x, DF, weights = w)
    fit.glm.w <- glm(y ~ x, data = DF, weights = w)
    fit.gls.w <- gls(y ~ x, data = DF, weights = varFixed(~w))
    coef(fit.lm.w)   # (Intercept) -0.2667521, x 0.0944190
    coef(fit.glm.w)  # (Intercept) -0.2667521, x 0.0944190
    coef(fit.gls.w)  # (Intercept) -0.5924727, x 0.1608727

I also tried several variants of this. I know this does not answer your questions, but I hope it will contribute to an answer.

spencer graves

Goeland wrote:

Dear r-users,

Can anyone explain exactly the difference between the weights options in lm, glm and gls? I tried the following code, but the results are different.

    lm1: lm(formula = y ~ x)                    # (Intercept) 0.1183,  x 7.3075
    lm2: lm(formula = y ~ x, weights = W)       # (Intercept) 0.04193, x 7.30660
    lm3: lm(formula = ys ~ Xs - 1)              # Xs 0.04193, Xsx 7.30660

Here ys <- y*sqrt(W) and Xs <- sqrt(W)*cbind(1, x), so we can see that weights for lm mean a rescaling of X and y. But for glm and gls I tried:

    glm1: glm(formula = y ~ x)                  # (Intercept) 0.1183,  x 7.3075
          # 1242 total df, 1241 residual; null deviance 1049000, residual deviance 28210, AIC 7414
    glm2: glm(formula = y ~ x, weights = W)     # (Intercept) 0.1955,  x 7.3053
          # null deviance 1548000, residual deviance 44800, AIC 11670
    glm3: glm(formula = y ~ x, weights = 1/W)   # (Intercept) 0.03104, x 7.31033
          # null deviance 798900, residual deviance 19900, AIC 5285
    glm4: glm(formula = ys ~ Xs - 1)            # Xs 2.687, Xsx 6.528
          # null deviance 449, residual deviance 506700, AIC 11000

With weights, glm did not give the same results as lm. Why? Also for gls, I used varFixed here:

    gls3: gls(y ~ x), variance function varFixed(~W)
          # (Intercept) 0.03104214, x 7.31032540
          # log-restricted-likelihood -3737.392; residual standard error 4.004827
          # 1243 total df, 1241 residual
    gls4: gls(ys ~ Xs - 1)
          # Xs 2.687205, Xsx 6.527893
          # log-restricted-likelihood -5500.311; residual standard error 20.20705

We can see the relation between glm and gls with weights as I expected, but what is the difference between lm, gls and glm? Why?

Thanks so much!

Goeland
2006-03-16
[R] Accessing specific values of linear regression model
I am just starting to learn R, and I'm struggling with a lot of the basic stuff. The online documentation is generally good, but I have been unable to find the answer for my problem this time.

I am running linear regression on a particular set of data and plotting it. I've got the regression line plotted and everything is fine so far, except for one thing: I want to display the SSE from the regression and I don't know how to access it. In my script I have:

    model <- lm(y ~ x)

and I was able to get the r^2 value that I wanted with:

    summary(model)$r.squared

But I don't know how to get the SSE. When I print summary(aov(model)) I get the following:

                Df  Sum Sq Mean Sq F value    Pr(>F)
    x            1 1793928 1793928  67.463 3.447e-11 ***
    Residuals   56 1489118   26591
    ---
    Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

So it's there (I can see it!), but how do I get to it? Any help would be much appreciated.

Thanks,
Kelly
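For the record, the residual sum of squares can be extracted without parsing printed output; any of the following give the same number (assuming the poster's y and x are in the workspace):

```r
model <- lm(y ~ x)
deviance(model)                          # RSS for an lm fit
sum(resid(model)^2)                      # the same, computed directly
summary(aov(model))[[1]][["Sum Sq"]][2]  # "Residuals" row of the anova table
```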
Re: [R] Difference between weights options in lm, glm and gls.
Hi, Sundar:

Thanks, Sundar. That should have been obvious to me. However, I hadn't used varFixed before, and evidently I thought about it for only 1 ms instead of the required 2. With that change, I get the same answers for all three.

Best Wishes,
spencer

Sundar Dorai-Raj wrote:

Hi, Spencer,

For your call to gls you actually want:

    fit.gls.w <- gls(y ~ x, data = DF, weights = varFixed(~ 1/w))

HTH,
--sundar

[snip: earlier quoted text from Spencer Graves and Goeland]
Re: [R] Difference between weights options in lm, glm and gls.
Spencer Graves [EMAIL PROTECTED] writes:

In my tests, gls did NOT give the same answers as lm and glm, and I don't know why; perhaps someone else will enlighten us both.

The weights argument in gls (and gnls/lme/nlme) specifies the variance, not the actual weight, which is the reciprocal. (This is to my mind a somewhat curious design decision, as is the fact that varPower and varConstPower actually specify the SD rather than the variance.)

[snip: Spencer's quoted example and Goeland's original question]

--
Peter Dalgaard, Dept. of Biostatistics, University of Copenhagen
Øster Farimagsgade 5, Entr. B, PO Box 2099, 1014 Cph. K, Denmark
Ph: (+45) 35327918  Fax: (+45) 35327907  ([EMAIL PROTECTED])
Re: [R] RMySQL's column limit
My apologies. I'm using R v2.2.1 via the JGR editor GUI, RMySQL v0.5.7, under Windows XP with the latest patches. The MySQL server version is 4.1.12; I'm pretty sure it's running on a Linux box. The dimension of template is 2000. I know the error kicks in at 3000, but I haven't iterated to find the exact point; if it would help, I can do this.

Regards,
Mark

On 3/24/06, David James [EMAIL PROTECTED] wrote:

Hi,

Mark Van De Vyver wrote:

Dear R-users,

First, thank you to the developers for the very useful R library RMySQL. While using this library I received an error message:

    RS-DBI driver: (could not run statement: Too many columns)

The statement that generated the error was:

    dbWriteTable(dbcon, "simdataseries", template, overwrite = TRUE, row.names = FALSE)

I am assuming this is an RMySQL rather than a MySQL limit.

We need more information, e.g., what's dim(template), what version of MySQL you're using, etc.

If that is the case, I was wondering just what this limit is and if it is possible to raise it? Thanks again for all the hard work.

Sincerely,
Mark

--
David

--
Mark Van De Vyver, PhD
My research is available from my SSRN Author page: http://ssrn.com/author=36577
Finance Discipline, School of Business, The University of Sydney
Sydney NSW 2006, Australia
Telephone: +61 2 9351-6452  Fax: +61 2 9351-6461
Re: [R] RGui: windows-record and command history
Duncan Murdoch wrote:

[snip: the earlier exchange on recording the graphics history]

That's probably not too hard to do in R code. You just need to look at the source in src/library/grDevices/R/windows/windows.R for the print.SavedPlots method, and maybe the C-level code in src/library/grDevices/src/devWindows.c for a bit more help on the interpretation of the .SavedPlots object, and then it should be fairly straightforward to write a function that deletes an entry in the history. (It's a list with 5 components, the first 4 of which describe the current state and what the user is looking at, and the last of which is a list containing the actual recorded plots.)

Duncan Murdoch

Thanks, Duncan. I'll have a look at it.

Peter
[R] comparative density estimates
I have two series of events over time and I want to construct a graph of the relative frequency/density of these events that allows their distributions to be sensibly compared. The events are the milestone items in my project on milestones in the history of data visualization [1], and I want to compare trends in Europe vs. North America.

I decided to use a graph of two overlaid density estimates with rug plots, but then the question arises of how to choose the bandwidth (BW) for the two series to allow them to be sensibly compared, because the range of time and the total frequency differ for the two series. To avoid clutter on this list, I've placed the data and R code at http://euclid.psych.yorku.ca/SCS/Gallery/milestone/Test/kde-bug/mileyears4.R

I have two versions of this graph: one selecting an optimal BW for each series separately, and the other using the adjust= argument of density() to approximately equate the BW to the value determined for the whole series combined. The two versions (done with SAS) are shown at
http://euclid.psych.yorku.ca/SCS/Gallery/milestone/Test/kde-bug/mileyears32.gif
http://euclid.psych.yorku.ca/SCS/Gallery/milestone/Test/kde-bug/mileyears33.gif

The densities in the first are roughly equivalent to the R code

    d1 <- density(sub1, from=1500, to=1990, bw="SJ", adjust=1)
    d2 <- density(sub2, from=1500, to=1990, bw="SJ", adjust=1)

the second to

    d1 <- density(sub1, from=1500, to=1990, bw="SJ", adjust=2.5)
    d2 <- density(sub2, from=1500, to=1990, bw="SJ", adjust=0.75)

The second graph seems to me to oversmooth the more extensive data from Europe and undersmooth the data from North America.

- Any comments or suggestions?
- Are there other methods I should consider?

I did find overlap.Density() in the DAAG package, but perversely, it uses a bw= argument to select a B/W (grayscale) plot.

thanks,
-Michael

[1] http://www.math.yorku.ca/SCS/Gallery/milestone/

--
Michael Friendly                 Email: [EMAIL PROTECTED]
Professor, Psychology Dept., York University
Voice: 416 736-5115 x66249       Fax: 416 736-5814
4700 Keele Street, Toronto, ONT M3J 1P3 CANADA
http://www.math.yorku.ca/SCS/friendly.html
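One way to force comparability in R (not from the thread's SAS code): compute a single bandwidth from the pooled data and pass it to both density() calls, so the two curves are smoothed identically. Here sub1 and sub2 are assumed to be the two milestone-year vectors from the linked script.

```r
# Overlay two density estimates computed with one common bandwidth.
bw <- bw.SJ(c(sub1, sub2))          # Sheather-Jones bandwidth from pooled data
d1 <- density(sub1, bw = bw, from = 1500, to = 1990)
d2 <- density(sub2, bw = bw, from = 1500, to = 1990)
plot(d1, ylim = range(d1$y, d2$y), main = "Milestones: Europe vs. North America")
lines(d2, lty = 2)
rug(sub1, side = 1)                 # rugs distinguish the two samples
rug(sub2, side = 3)
```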
Re: [R] RGui: windows-record and command history
Hi Duncan Murdoch wrote: On 3/23/2006 10:46 AM, Gabor Grothendieck wrote: On 3/23/06, Duncan Murdoch [EMAIL PROTECTED] wrote: On 3/23/2006 10:29 AM, Gabor Grothendieck wrote: On 3/23/06, Duncan Murdoch [EMAIL PROTECTED] wrote: On 3/23/2006 7:35 AM, Thomas Steiner wrote: a) How can I set the recording of all windows()-history forever to true? I want something like windows(record = TRUE) but not just for the window that opens then, but for all windows I will open ever. options(graphics.record=TRUE) will make that happen for the rest of the session. To really make it happen forever, you need to put this line in your Rprofile (see ?Rprofile for where that comes from). Watch out though: the graphics history is stored in your current workspace in memory, and it can get big. You might find you're running out of memory if you store everything, and you'll find your .RData files quite large if you save your workspace. On my todo list (but not for 2.3.0) is the possibility of setting a default history length, perhaps defaulting to saving the last 2 or 3 pages. Would it be feasible to have history on disk or perhaps the last m in memory and the last n (possibly Inf) on disk? The history is just another R object. Saving big R objects on disk might be desirable, but it would be a big change, so I'd call it infeasible. I wouldn't want to get into special-casing this particular R object: that way lies madness. However, since it is just an R object, it's available for R code to work with, so someone who was interested in doing this could write a contributed package that did it. Are there R-level facilities to manipulate the history, not just the top? Sure, it's a regular R object. You will need to read the source to know how to interpret it, and since it's undocumented there's a risk of changes in future R versions, but it's not very complicated. See my message to Peter. Be careful with this. 
The objects that are recorded on the display list are calls to graphics functions PLUS state information in a raw binary format. The display list was originally intended for reuse within the same R session (for redrawing the screen). If you try to save it and use it between sessions or (worse) between versions of R you could run into some nasty problems. For example, what if the graphics function interface has changed? what if the raw binary state information format has changed? what if the required packages are not installed? At best, your saved object produces errors; at worst it becomes completely useless and is unrecoverable. Paul -- Dr Paul Murrell Department of Statistics The University of Auckland Private Bag 92019 Auckland New Zealand 64 9 3737599 x85392 [EMAIL PROTECTED] http://www.stat.auckland.ac.nz/~paul/ __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
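Duncan's suggestion above -- putting the recording option in the startup profile -- can be sketched as a one-line addition (a minimal sketch; the profile's location varies by platform, see ?Rprofile):

```r
## In ~/.Rprofile (or the site-wide Rprofile; see ?Rprofile for locations):
## record every page drawn on screen devices, for the whole session.
options(graphics.record = TRUE)
```

Note Paul's caveat above still applies: the recorded history lives in the workspace and can grow large.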
Re: [R] Intercepts in linear models.
update() simply calls the model fitting function again using the new arguments, which means one redundant model fit. What I would do is construct the formula beforehand, e.g., myformula <- if (cflag) y ~ x else y ~ x - 1 or something like that. Andy From: Gabor Grothendieck You can use update to modify a formula: fo <- y ~ x fo <- update(fo, . ~ . - 1) will remove the intercept. See ?update.formula On 3/23/06, Rolf Turner [EMAIL PROTECTED] wrote: A colleague asked me if there is a way to specify with a ***variable*** (say ``cflag'') whether there is an intercept in a linear model. She had in mind something like lm(y ~ x - cflag) where cflag could be 0 or 1; if it's 0 an intercept should be fitted, if it's 1 then no intercept. This doesn't work ``of course''. The cflag just gets treated as another predictor and, because it is of the wrong length, an error is generated. The best I could come up with was lm(as.formula(paste("y ~ x -", cflag))) which is kludgy. It can (of course!) also be done using an if statement. Is there a sexier way? cheers, Rolf Turner __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
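A minimal sketch of the construct-the-formula-first approach, using Rolf's convention that cflag = 1 means no intercept (x, y and cflag here are placeholder data, not from the thread):

```r
## Placeholder data and flag; cflag = 1 drops the intercept, 0 keeps it.
set.seed(1)
x <- runif(20)
y <- 2 * x + rnorm(20)
cflag <- 1

## Build the formula once from the flag, then fit as usual.
myformula <- if (cflag) y ~ x - 1 else y ~ x
fit <- lm(myformula)
coef(fit)  # no "(Intercept)" term when cflag == 1
```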
Re: [R] Accessing specific values of linear regression model
You said you were fitting a regression model, yet you used summary(aov(model))? It's rather unusual to use aov() for regression models... If you used lm() to fit the regression model, the following might help: y <- rnorm(10); x <- runif(10) fit <- lm(y ~ x) anova(fit) Analysis of Variance Table Response: y Df Sum Sq Mean Sq F value Pr(>F) x 1 1.2526 1.2526 1.6872 0.2302 Residuals 8 5.9397 0.7425 str(anova(fit)) Classes 'anova' and 'data.frame': 2 obs. of 5 variables: $ Df : int 1 8 $ Sum Sq : num 1.25 5.94 $ Mean Sq: num 1.253 0.742 $ F value: num 1.69 NA $ Pr(>F) : num 0.23 NA - attr(*, "heading")= chr "Analysis of Variance Table\n" "Response: y" anova(fit)["x", "Sum Sq"] [1] 1.252643 Andy From: Kelly Vincent I am just starting to learn R, and I'm struggling with a lot of the basic stuff. The online documentation is generally good, but I have been unable to find the answer for my problem this time. I am running linear regression on a particular set of data and plotting it. I've got the regression line plotted and everything is fine so far, except for one thing: I want to display the SSE from the regression and I don't know how to access it. In my script I have: model <- lm(y ~ x) and I was able to get the r^2 value that I wanted with: summary(model)$r.squared But I don't know how to get the SSE. When I print summary(aov(model)) I get the following: Df Sum Sq Mean Sq F value Pr(>F) x 1 1793928 1793928 67.463 3.447e-11 *** Residuals 56 1489118 26591 --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 So it's there (I can see it!) but how do I get to it? Any help would be much appreciated. Thanks, Kelly __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
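A minimal sketch pulling the residual sum of squares (SSE) straight out of an lm() fit, without going through aov() at all (placeholder data):

```r
set.seed(1)
x <- runif(20)
y <- 2 * x + rnorm(20)
fit <- lm(y ~ x)

## Three equivalent ways to get the residual sum of squares:
deviance(fit)                      # for lm objects, deviance() is the RSS
sum(residuals(fit)^2)              # computed by hand from the residuals
anova(fit)["Residuals", "Sum Sq"]  # extracted from the anova table
```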
Re: [R] Intercepts in linear models.
Rolf Turner wrote: [question quoted in full above: a colleague wants a variable, cflag, to control whether lm() fits an intercept] something like lm(if (cflag) y ~ x - 0 else y ~ x, ...) Kjetil __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] kmeans Clustering
Dear WizaRds, My goal is to program the VS-KM algorithm by Brusco and Cradit 01 and I have come to a complete stop in my efforts. Maybe anybody is willing to follow my thoughts and offer some help. In a first step, I want to use a single variable for the partitioning process. As the center matrix I use the objects that belong to the cluster I found with the hierarchical Ward algorithm. Then, I have to take all possible variable pairs and apply kmeans again, which is quite confusing to me. Here is what I do: ## 0. data mat <- matrix( c(6,7,8,2,3,4,12,14,14, 14,15,13,3,1,2,3,4,2, 15,3,10,5,11,7,13,6,1, 15,4,10,6,12,8,12,7,1), ncol=9, byrow=TRUE ) rownames(mat) <- paste("v", 1:4, sep="") tmat <- t(mat) ## 1. Provide clusters via Ward: ward <- hclust(d=dist(tmat), method="ward", members=NULL) ## 2. Compute cluster centers and create center matrix for kmeans: groups <- cutree(ward, k=3, h=NULL) centroids <- vector(mode="numeric", length=3) obj <- vector(mode="list", length=3) for (i in 1:3){ where <- which(groups==i) # which object belongs to which group? centroids[i] <- mean( tmat[where, ] ) obj[[i]] <- tmat[where, ] } P <- vector(mode="numeric", dim(mat)[2]) pj <- vector(mode="list", length=dim(mat)[1]) for (i in 1:dim(mat)[1]){ pj[[i]] <- kmeans( tmat[,i], centers=centroids, iter.max=10, algorithm="MacQueen") P <- rbind(P, pj[[i]]$cluster) } P <- P[-1,] ## gives a matrix of partitions using each single variable ## (I'm sure P can be programmed much easier) ## 3. kmeans using all possible pairs of variables, here just e.g. variables 1 and 3: wjk <- kmeans(tmat[,c(1,3)], centers=centroids, iter.max=10, algorithm="MacQueen") ### which, of course, gives an error message since centroids is not a matrix of the cluster centers. How on earth do I correctly construct a matrix of centers corresponding to the pairwise variables? Is it always the same matrix no matter which pair of variables I choose? I apologize for my lack of clustering knowledge and expertise - any help is welcome. Thank you very much. 
Many greetings mark __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
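One possible answer to the question above, sketched as a continuation of the poster's own objects tmat and groups (this is a suggestion, not a reply from the thread): for k-means on a pair of variables, centers must be a 3 x 2 matrix holding the Ward-group means of exactly those two columns, so it is a different matrix for every pair.

```r
pair <- c(1, 3)  # the chosen pair of variables (columns of tmat)

## Row i is the mean of the i-th Ward group on just those two columns.
centers <- t(sapply(1:3, function(i)
  colMeans(tmat[groups == i, pair, drop = FALSE])))

wjk <- kmeans(tmat[, pair], centers = centers,
              iter.max = 10, algorithm = "MacQueen")
```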
Re: [R] conservative robust estimation in (nonlinear) mixed models
I know of two fairly common models for robust methods. One is the contaminated normal that you mentioned. The other is Student's t. A normal plot of the data or of residuals will often indicate whether the assumption of normality is plausible or not; when the plot indicates problems, it will often also indicate whether a contaminated normal or Student's t would be better. Using Student's t introduces one additional parameter. A contaminated normal would introduce two; however, in many applications, the contamination proportion (or its logit) will often be highly correlated with the ratio of the contamination standard deviation to that of the central portion of the distribution. Thus, in some cases, it's often wise to fix the ratio of the standard deviations and estimate only the contamination proportion. hope this helps. spencer graves dave fournier wrote: Conservative robust estimation methods do not appear to be currently available in the standard mixed model methods for R, where by conservative robust estimation I mean methods which work almost as well as the methods based on assumptions of normality when the assumption of normality *IS* satisfied. We are considering adding such a conservative robust estimation option for the random effects to our AD Model Builder mixed model package, glmmADMB, for R, and perhaps extending it to do robust estimation for linear mixed models at the same time. An obvious candidate is to assume something like a mixture of normals. I have tested this in a simple linear mixed model using 5% contamination with a normal with 3 times the standard deviation, which seems to be a common assumption. 
Simulation results indicate that when the random effects are normally distributed this estimator is about 3% less efficient, while when the random effects are contaminated with 5% outliers the estimator is about 23% more efficient, where by 23% more efficient I mean that one would have to use a sample size about 23% larger to obtain the same size confidence limits for the parameters. Question: I wonder if there are other distributions besides a mixture of normals which might be preferable. Three things to keep in mind are: 1.) It should be likelihood based so that the standard likelihood based tests are applicable. 2.) It should work well when the random effects are normally distributed so that things that are already fixed don't get broken. 3.) In order to implement the method efficiently it is necessary to be able to produce code for calculating the inverse of the cumulative distribution function. This enables one to extend methods based on the Laplace approximation for the random effects (i.e. the Laplace approximation itself, adaptive Gaussian integration, adaptive importance sampling) to the new distribution. Dave __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
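The contaminated normal discussed above (5% contamination with 3 times the standard deviation) can be sketched as a simple two-component mixture density in R; the function name dcnorm and the defaults are made up for illustration:

```r
## Density of a contaminated normal: with probability 1 - p an observation
## comes from N(0, s^2), with probability p from the wider "outlier"
## component N(0, (k*s)^2). Common choice: p = 0.05, k = 3.
dcnorm <- function(x, p = 0.05, k = 3, s = 1)
  (1 - p) * dnorm(x, sd = s) + p * dnorm(x, sd = k * s)

curve(dcnorm(x), from = -6, to = 6, ylab = "density")
curve(dnorm(x), add = TRUE, lty = 2)  # plain normal for comparison
```

The heavier tails of the mixture are what downweight outlying random effects relative to the plain normal likelihood.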
Re: [R] RGui: windows-record and command history
on 3/23/2006 2:42 PM Paul Murrell said the following: [earlier thread quoted in full above, ending with Paul's warning about saving the graphics display list between sessions or R versions] For that reason, there could be some benefit in saving desired graphics externally as files -- under Windows, savePlot() can be used -- and starting with a fresh graphics history in each R session. I regularly use the R command .SavedPlots <- NULL in scripts to get rid of any old history. This seems to work fine, but of course I would appreciate comments from those more knowledgeable telling me what's wrong with it. MHP -- Michael Prager, Ph.D. Population Dynamics Team, NMFS SE Fisheries Science Center NOAA Center for Coastal Fisheries and Habitat Research Beaufort, North Carolina 28516 http://shrimp.ccfhrb.noaa.gov/~mprager/ Opinions expressed are personal, not official. No government endorsement of any product is made or implied. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] PCA, Source analysis and Unmixing, environmental forensics
I am using R for environmental forensics (determination of the sources and/or groupings in mixtures of organic chemicals in the field). The goal is to determine if there are groups of samples with similar/dissimilar compositions, and to assign samples to a potential source or a mixture of sources based on the composition (unmixing and source allocation). Typically there are 10 to 50 chemicals that have been analyzed in all of the samples. In most cases concentrations are converted to proportion of total, as we are interested in composition rather than simple dilution. I have had great success with ratio analysis; simple exploratory analysis such as property-property plots etc.; and cluster analysis such as principal components analysis (PCA) and hierarchical cluster analysis. I have also been experimenting with glyph representation, k-means clusters, and similar procedures as documented in MASS. More recently I have been experimenting with Independent Components Analysis (ICA). Another commonly used procedure is polytopic vector analysis (PVA), a procedure for which I have no implementation in R and so I have not tried it out. My question is, can anyone suggest: A) Other procedures in R that I may have missed and should investigate B) Your experience and/or hints for using ICA and presenting the results C) An implementation of PVA in R I have not found Thanks in advance, Mike __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] comparative density estimates
Michael, very nice and interesting plots! One alternative idea to compare the proportion of milestone items (that does not really answer the bandwidth question) in Europe and North America might be a conditional density plot. After running your R source code, you could do: where <- factor(c(rep("North America", length(sub1)), rep("Europe", length(sub2)))) year <- c(sub1, sub2) cdplot(where ~ year, bw = "sj") showing the decrease in the European proportion. Internally, this first computes the unconditional density as in plot(density(year, bw = "sj")) and then the density for Europe with the same bandwidth. Best wishes, Z On Thu, 23 Mar 2006 14:25:53 -0500 Michael Friendly wrote: I have two series of events over time and I want to construct a graph of the relative frequency/density of these events that allows their distributions to be sensibly compared. The events are the milestone items in my project on milestones in the history of data visualization [1], and I want to compare trends in Europe vs. North America. I decided to use a graph of two overlaid density estimates with rug plots, but then the question arises of how to choose the bandwidth (BW) for the two series to allow them to be sensibly compared, because the range of time and total frequency differ for the two series. To avoid clutter on this list, I've placed the data and R code at http://euclid.psych.yorku.ca/SCS/Gallery/milestone/Test/kde-bug/mileyears4.R I have two versions of this graph, one selecting an optimal BW for each separately and the other using the adjust= argument of density() to approximately equate the BW to the value determined for the whole series combined. 
The two versions (done with SAS) are shown at http://euclid.psych.yorku.ca/SCS/Gallery/milestone/Test/kde-bug/mileyears32.gif http://euclid.psych.yorku.ca/SCS/Gallery/milestone/Test/kde-bug/mileyears33.gif The densities in the first are roughly equivalent to the R code d1 <- density(sub1, from=1500, to=1990, bw="sj", adjust=1) d2 <- density(sub2, from=1500, to=1990, bw="sj", adjust=1) the second to d1 <- density(sub1, from=1500, to=1990, bw="sj", adjust=2.5) d2 <- density(sub2, from=1500, to=1990, bw="sj", adjust=0.75) The second graph seems to me to oversmooth the more extensive data from Europe and undersmooth the data from North America. - any comments or suggestions? - are there other methods I should consider? I did find overlap.Density() in the DAAG package, but perversely, it uses a bw= argument to select a BW/grayscale plot. thanks, -Michael [1] http://www.math.yorku.ca/SCS/Gallery/milestone/ -- Michael Friendly Email: [EMAIL PROTECTED] Professor, Psychology Dept. York University Voice: 416 736-5115 x66249 Fax: 416 736-5814 4700 Keele Street http://www.math.yorku.ca/SCS/friendly.html Toronto, ONT M3J 1P3 CANADA __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
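The adjust= workaround above can also be done directly: compute a single Sheather-Jones bandwidth from the pooled data and pass the same number to both density() calls. A minimal sketch, with placeholder data standing in for the two milestone series (the real data are at the URL in the post):

```r
## Placeholder series standing in for the European / North American dates.
set.seed(1)
sub1 <- rnorm(200, mean = 1780, sd = 90)   # "Europe"
sub2 <- rnorm(60,  mean = 1880, sd = 50)   # "North America"

## One bandwidth, estimated from the pooled data, used for both series.
bw <- bw.SJ(c(sub1, sub2))
d1 <- density(sub1, bw = bw, from = 1500, to = 1990)
d2 <- density(sub2, bw = bw, from = 1500, to = 1990)

plot(d1, main = "Milestones, common SJ bandwidth")
lines(d2, lty = 2)
rug(sub1)
rug(sub2, col = "grey")
```

This makes the two curves directly comparable, at the cost of using a bandwidth that is optimal for neither series alone.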
[R] outer() function
Greetings R-help community, I am relatively new to R, which may be why I am having trouble understanding this problem. I am trying to use outer() to generate a graphable surface of a function. If there is a better way to do this, I would appreciate the insight. Otherwise, could someone suggest a method to get the outer() function to work here? Below is my simplified R program. Further down is the output. Thanks in advance, Kyle ### data <- c(0, 1, 2, 3) x <- c(0,2,4) y <- c(0,1,2) f <- function(x, y) sum(data*x)+y f(0,0); f(2,0); f(4,0); f(0,1); f(2,1); f(4,1); f(0,2); f(2,2); f(4,2); outer(x, y, f) f <- function(x, y) x-x+y-y+force(sum(data^x)) outer(x, y, f) ## data <- c(0, 1, 2, 3) x <- c(0,2,4) y <- c(0,1,2) f <- function(x, y) sum(data*x)+y f(0,0); f(2,0); f(4,0); [1] 0 [1] 12 [1] 24 f(0,1); f(2,1); f(4,1); [1] 1 [1] 13 [1] 25 f(0,2); f(2,2); f(4,2); [1] 2 [1] 14 [1] 26 outer(x, y, f) [,1] [,2] [,3] [1,] 20 21 22 [2,] 20 21 22 [3,] 20 21 22 Warning message: longer object length is not a multiple of shorter object length in: data * x __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] PCA, Source analysis and Unmixing, environmental forensics
Review all the task views at: http://cran.r-project.org/src/contrib/Views/ On 3/23/06, Mike Bock [EMAIL PROTECTED] wrote: [original message quoted in full above] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] RGui: windows-record and command history
Hi Michael H. Prager wrote: [earlier thread quoted in full above] For that reason, there could be some benefit in saving desired graphics externally as files -- under Windows, savePlot() can be used -- and starting with a fresh graphics history in each R session. I regularly use the R command .SavedPlots <- NULL in scripts to get rid of any old history. This seems to work fine, but of course I would appreciate comments from those more knowledgeable telling me what's wrong with it. FWIW, my recommendation would be to use R code as the primary storage format. It's useful to have the plot as a PDF or WMF file, but if you ever need to change anything, you need the original R code. Of course, this means you need to keep the underlying data, and possibly analysis code as well, but you should be doing that anyway. Paul -- Dr Paul Murrell Department of Statistics The University of Auckland Private Bag 92019 Auckland New Zealand 64 9 3737599 x85392 [EMAIL PROTECTED] http://www.stat.auckland.ac.nz/~paul/ __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
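Prager's workflow can be sketched as follows (a hedged example; savePlot() is specific to the Windows graphics device, and the filename is a placeholder):

```r
## Draw something on the current (Windows) screen device ...
plot(rnorm(100), type = "l")

## ... save it externally instead of relying on the graphics history ...
savePlot(filename = "myplot", type = "wmf")  # also "pdf", "png", "ps", ...

## ... then drop any recorded history so it is not carried in .RData.
.SavedPlots <- NULL
```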
Re: [R] conservative robust estimation in (nonlinear) mixed models
Ok, since Spencer has dived in, I'll go public (I made some prior private remarks to David because I didn't think they were worth wasting the list's bandwidth on. Heck, they may still not be...) My question: isn't the difficult issue which levels of the (co)variance hierarchy get longer-tailed distributions, rather than which distributions are used to model long tails? Seems to me that there is an inherent identifiability issue here, and even more so with nonlinear models. It's easy to construct examples where it all essentially depends on your priors. Cheers, Bert -- Bert Gunter Genentech Non-Clinical Statistics South San Francisco, CA -----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Spencer Graves Sent: Thursday, March 23, 2006 12:34 PM To: [EMAIL PROTECTED] Cc: r-help@stat.math.ethz.ch Subject: Re: [R] conservative robust estimation in (nonlinear) mixed models [Spencer Graves's reply and Dave Fournier's original message quoted in full above] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] outer() function
On Thu, 23 Mar 2006, Kyle LaMalfa wrote: [original message quoted in full above] It's a FAQ (7.17). -thomas Thomas Lumley Assoc. Professor, Biostatistics [EMAIL PROTECTED] University of Washington, Seattle __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
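The FAQ entry boils down to this: outer() calls f just once, with two long vectors holding the full grid of (x, y) pairs, so f must be vectorized in both arguments -- sum(data * x) instead collapses the whole grid to a single number. A minimal sketch of the usual fix, wrapping the scalar function with Vectorize():

```r
data <- c(0, 1, 2, 3)
x <- c(0, 2, 4)
y <- c(0, 1, 2)

f <- function(x, y) sum(data * x) + y  # scalar-only: sum() defeats outer()

## Vectorize() makes f accept the vector arguments outer() supplies,
## applying the scalar version element by element.
outer(x, y, Vectorize(f))
##      [,1] [,2] [,3]
## [1,]    0    1    2
## [2,]   12   13   14
## [3,]   24   25   26
```

This now matches the nine scalar calls f(0,0), f(2,0), ... shown in the original post.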
Re: [R] RMySQL's column limit
Mark Van De Vyver wrote: My apologies, I'm using R v2.2.1 via the JGR Editor GUI, RMySQL v0.5.7 under Windows XP with latest patches. MySQL is server version: 4.1.12 I'm pretty sure it's running on a linux box. It turns out that this may be a MySQL limit: http://bugs.mysql.com/bug.php?id=4117 The dimension of template is 2000; I know the error kicks in at 3000, but haven't iterated to find the exact point - if it would help I can do this. Regards Mark On 3/24/06, David James [EMAIL PROTECTED] wrote: Hi, Mark Van De Vyver wrote: Dear R-users, First, thank you to the developers for the very useful R library RMySQL. While using this library I received an error message: RS-DBI driver: (could not run statement: Too many columns) The statement that generated the error was: dbWriteTable(dbcon, "simdataseries", template, overwrite = TRUE, row.names = FALSE) I am assuming this is an RMySQL rather than MySQL limit. We need more information, e.g., what's dim(template), what version of MySQL you're using, etc. If that is the case I was wondering just what this limit is and if it is possible to raise it? Thanks again for all the hard work. Sincerely Mark -- Mark Van De Vyver, PhD -- My research is available from my SSRN Author page: http://ssrn.com/author=36577 -- Finance Discipline School of Business The University of Sydney Sydney NSW 2006 Australia Telephone: +61 2 9351-6452 Fax: +61 2 9351-6461 -- Mark Van De Vyver, PhD -- My research is available from my SSRN Author page: http://ssrn.com/author=36577 -- Finance Discipline School of Business The University of Sydney Sydney NSW 2006 Australia Telephone: +61 2 9351-6452 Fax: +61 2 9351-6461 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! 
http://www.R-project.org/posting-guide.html -- David -- Mark Van De Vyver, PhD -- My research is available from my SSRN Author page: http://ssrn.com/author=36577 -- Finance Discipline School of Business The University of Sydney Sydney NSW 2006 Australia Telephone: +61 2 9351-6452 Fax: +61 2 9351-6461 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- Jeffrey Horner Computer Systems Analyst School of Medicine 615-322-8606 Department of Biostatistics Vanderbilt University __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] outer() function
This is a FAQ: http://cran.r-project.org/doc/manuals/R-FAQ.html#Why-does-outer_0028_0029-behave-strangely-with-my-function_003f

On 3/23/06, Kyle LaMalfa [EMAIL PROTECTED] wrote: Greetings R-help community, I am relatively new to R, which may be why I am having trouble understanding this problem. I am trying to use outer() to generate a graphable surface of a function. If there is a better way to do this, I would appreciate the insight. Otherwise, could someone suggest a method to get the outer() function to work here? Below is my simplified R program. Further down is the output. Thanks in advance, Kyle

###
data <- c(0, 1, 2, 3)
x <- c(0, 2, 4)
y <- c(0, 1, 2)
f <- function(x, y) sum(data*x) + y
f(0,0); f(2,0); f(4,0);
f(0,1); f(2,1); f(4,1);
f(0,2); f(2,2); f(4,2);
outer(x, y, f)
f <- function(x, y) x - x + y - y + force(sum(data^x))
outer(x, y, f)

## output
data <- c(0, 1, 2, 3)
x <- c(0, 2, 4)
y <- c(0, 1, 2)
f <- function(x, y) sum(data*x) + y
f(0,0); f(2,0); f(4,0);
[1] 0
[1] 12
[1] 24
f(0,1); f(2,1); f(4,1);
[1] 1
[1] 13
[1] 25
f(0,2); f(2,2); f(4,2);
[1] 2
[1] 14
[1] 26
outer(x, y, f)
     [,1] [,2] [,3]
[1,]   20   21   22
[2,]   20   21   22
[3,]   20   21   22
Warning message:
longer object length is not a multiple of shorter object length in: data * x

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
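As the FAQ explains, outer() calls the function once with the full (recycled) vectors of x and y values, so a function built around sum() collapses everything to a single number instead of working element-wise. A minimal sketch of the usual fix, wrapping the scalar function with Vectorize() (using Kyle's data from the thread):

```r
data <- c(0, 1, 2, 3)
x <- c(0, 2, 4)
y <- c(0, 1, 2)

## Scalar function: sum() collapses its argument, so outer()
## cannot apply it to whole vectors at once.
f <- function(x, y) sum(data * x) + y

## Vectorize() makes f apply element-wise, which is what outer() needs.
fv <- Vectorize(f)
outer(x, y, fv)
##      [,1] [,2] [,3]
## [1,]    0    1    2
## [2,]   12   13   14
## [3,]   24   25   26
```

The result matches the nine values Kyle computed by hand (f(0,0)=0, f(2,0)=12, ..., f(4,2)=26), arranged with x along the rows and y along the columns.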
Re: [R] new to R
Hi Linda,

Did you already get a reply to your question? If not, try adding a new line at the end of the text (just hit Enter after 69, the last number in your data) and save the file. You also want to use the argument sep in read.table. Since you have a comma at the end of each row, you can either manually delete it and use read.table, or just import the file the way it is and then delete the last variable (V4) created because of the extra comma, i.e.

z <- read.table("q.txt", sep=",")
z
  V1 V2 V3 V4
1  1  2  3 NA
2 33 44 88 NA
3 23 43 69 NA
# V4 is an artifact of the extra comma at the end of each row
newz <- z[,-4]  # Deletes V4

I hope this helps,

Francisco

From: linda.s [EMAIL PROTECTED] To: R-help@stat.math.ethz.ch Subject: [R] new to R Date: Thu, 23 Mar 2006 01:05:21 -0800

z <- read.table("c:/temp/q.txt")
Warning message:
incomplete final line found by readTableHeader on 'c:/temp/q.txt'

what does that mean? my q.txt is like:
1, 2, 3,
33, 44, 88,
23, 43, 69,

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
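Francisco's trick can be made a little more general by dropping every all-NA column instead of hard-coding column 4; a sketch, assuming the same file q.txt from the thread:

```r
z <- read.table("q.txt", sep = ",")
## Drop every column that is entirely NA (the artifact created by
## the trailing comma on each row), whatever its position.
newz <- z[, colSums(!is.na(z)) > 0]
```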
Re: [R] new to R
On 3/23/06, Francisco J. Zagmutt [EMAIL PROTECTED] wrote: Hi Linda, Did you already get a reply to your question? If not, try adding a new line at the end of the text (just hit Enter after 69, the last number in your data) and save the file. You also want to use the argument sep in read.table. Since you have a comma at the end of each row, you can either manually delete it and use read.table, or just import the file the way it is and then delete the last variable (V4) created because of the extra comma, i.e.

z <- read.table("q.txt", sep=",")
z
  V1 V2 V3 V4
1  1  2  3 NA
2 33 44 88 NA
3 23 43 69 NA
# V4 is an artifact of the extra comma at the end of each row
newz <- z[,-4]  # Deletes V4

I hope this helps, Francisco

It works! Thanks, Linda

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] pairs function from graphics
Hi, I have two questions about the options panel, upper.panel, and lower.panel in pairs(). 1. I see examples using panel.smooth, panel.cor, and panel.hist. Are there other plots that can be used here? 2. Can we have different plots within the upper diagonal? For example, in a matrix of scatterplots, I want to know if we can have a scatter plot for element (1,2) and a smooth plot for element (1,3). Thanks. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
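Partial answers, based on ?pairs: (1) any function with the signature function(x, y, ...) that draws into the open panel can be supplied, not just the panel.* helpers shown in example(pairs); (2) pairs() only distinguishes upper.panel, lower.panel, and diag.panel, so, as far as I know, giving elements (1,2) and (1,3) different plot types requires building the scatterplot matrix by hand (e.g. with par(mfrow=...)). A sketch with different upper and lower panels:

```r
## Lower triangle: plain scatterplots (points is itself a valid
## panel function). Upper triangle: scatterplots with a lowess
## smooth (panel.smooth ships with the graphics package).
pairs(iris[1:4],
      lower.panel = points,
      upper.panel = panel.smooth)
```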
Re: [R] RMySQL's column limit
Thanks for pointing that out. I've run the CREATE TABLE statement uploaded with that bug report and it fails for the same reason using MySQL 4.1.12 on Linux and 5.0.18-nt, so this is not a limitation of RMySQL. Regards, Mark

On 3/24/06, Jeffrey Horner [EMAIL PROTECTED] wrote: Mark Van De Vyver wrote: My apologies. I'm using R v2.2.1 via the JGR Editor GUI, RMySQL v0.5.7 under Windows XP with the latest patches. MySQL is server version 4.1.12; I'm pretty sure it's running on a Linux box. It turns out that this may be a MySQL limit: http://bugs.mysql.com/bug.php?id=4117 The dimension of template is 2000; I know the error kicks in at 3000, but I haven't iterated to find the exact point - if it would help I can do this. Regards, Mark

On 3/24/06, David James [EMAIL PROTECTED] wrote: Hi, Mark Van De Vyver wrote: Dear R-users, First, thank you to the developers for the very useful R library RMySQL. While using this library I received an error message:

RS-DBI driver: (could not run statement: Too many columns)

The statement that generated the error was:

dbWriteTable(dbcon, "simdataseries", template, overwrite = TRUE, row.names = FALSE)

I am assuming this is an RMySQL rather than a MySQL limit. We need more information, e.g., what's dim(template), what version of MySQL you're using, etc. If that is the case, I was wondering just what this limit is and if it is possible to raise it? Thanks again for all the hard work. Sincerely, Mark

--
Mark Van De Vyver, PhD
My research is available from my SSRN Author page: http://ssrn.com/author=36577
Finance Discipline, School of Business, The University of Sydney, Sydney NSW 2006, Australia
Telephone: +61 2 9351-6452 Fax: +61 2 9351-6461

-- David

--
Jeffrey Horner, Computer Systems Analyst, School of Medicine, 615-322-8606, Department of Biostatistics, Vanderbilt University

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
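For reference, MySQL documents a hard limit of 4096 columns per table, and the effective limit is often lower because of the 65,535-byte row-size cap. For data this wide, one workaround is to split the columns across several narrower tables sharing a row key. A hypothetical, untested sketch (write_wide and the chunk size are my own names, not part of RMySQL; dbcon is assumed to be an open connection as in the thread):

```r
library(RMySQL)

## Split a very wide data frame across several narrower tables,
## each carrying a shared row_id so the pieces can be re-joined.
write_wide <- function(con, name, df, chunk = 500) {
    df$row_id <- seq_len(nrow(df))
    id <- ncol(df)  # position of the row_id column
    groups <- split(seq_len(id - 1), ceiling(seq_len(id - 1) / chunk))
    for (i in seq_along(groups)) {
        dbWriteTable(con, paste(name, i, sep = "_"),
                     df[, c(groups[[i]], id)],
                     overwrite = TRUE, row.names = FALSE)
    }
}

## write_wide(dbcon, "simdataseries", template)
```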