Re: [R] Errors with systemfit package and systemfitClassic()
Hi iamisha1:

Sorry for answering so late!

On Wednesday 09 May 2007 23:47, [EMAIL PROTECTED] wrote:
> I get the following error message after using the systemfit package's
> function 'systemfitClassic':
>
> Error in data[[eqnVar]] : subscript out of bounds
>
> when I do this:
>
> MSYS1 <- cbind(Y, Num, F, PO, PD, GO, GD)
> MigOLS1 <- systemfitClassic("OLS", F ~ PO + PD + GO + GD, eqnVar = "Num", timeVar = "Y", data = MSYS1)
>
> and I get this error message:

Argument data must be a data.frame (please read the documentation!). Hence,

systemfitClassic( [...], data = as.data.frame( MSYS1 ) )

or

MSYS1 <- as.data.frame( cbind(Y, Num, F, PO, PD, GO, GD) )

should work.

Arne

> Error in inherits(x, "factor") : attempt to select more than one element
>
> when I do this (removing quotes from columns set as 'eqnVar' and 'timeVar'):
>
> MSYS1 <- cbind(Y, Num, F, PO, PD, GO, GD)
> MigOLS1 <- systemfitClassic("OLS", F ~ PO + PD + GO + GD, eqnVar = Num, timeVar = Y, data = MSYS1)
>
> When I query 'typeof()' I get the following:
>
> Y: Integer
> Num: Integer
> F: Integer
> PO: Integer
> PD: Integer
> GO: Double
> GD: Double
>
> I have set my data up in a manner analogous to that in the examples in
> the systemfit documentation. Also, the panel is balanced. If it matters,
> here are some descriptions of the data:
>
> Y: Year
> Num: ID of Flow
> F: Flow
> PO: Origin Population
> PD: Destination Population
> GO: Origin GDP
> GD: Destination GDP
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Arne Henningsen
Department of Agricultural Economics
University of Kiel
Olshausenstr. 40
D-24098 Kiel (Germany)
Tel: +49-431-880 4445
Fax: +49-431-880 1397
[EMAIL PROTECTED]
http://www.uni-kiel.de/agrarpol/ahenningsen/
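The difference Arne points to can be checked in base R alone. The following is a minimal sketch with made-up values (the numbers, and the name Flow used in place of F so that the shorthand for FALSE is not masked, are assumptions for illustration only). cbind() returns a matrix, on which `[[` with a column name fails, while as.data.frame() yields an object that can be indexed by column name.

```r
# Made-up panel data: two flows observed in two years (illustrative values only)
Y    <- c(2000L, 2001L, 2000L, 2001L)   # year
Num  <- c(1L, 1L, 2L, 2L)               # flow ID
Flow <- c(10L, 12L, 7L, 9L)             # flow size
GO   <- c(1.5, 1.6, 2.1, 2.2)           # origin GDP

m <- cbind(Y, Num, Flow, GO)            # a numeric matrix, not a data frame
is.data.frame(m)

# Indexing the matrix by column name with [[ raises an error;
# capture the message instead of stopping
res <- tryCatch(m[["Num"]], error = function(e) conditionMessage(e))
print(res)

d <- as.data.frame(m)                   # now column lookup by name works
d[["Num"]]
```

Note that cbind() also coerces all columns to a common type, so the integer columns become doubles once they are combined with GO.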
Re: [R] opening a file from within a zipfile that is online
'description' has to be the file path of a zip file. You will have to download it first.

On Wed, 6 Jun 2007, [EMAIL PROTECTED] wrote:
> Hi
>
> Reading the help for ?unz I was wondering if I can read data into R from
> within a zip file that is on some website, like maybe:
>
> dtaa = read.table(unz("http://www.ats.ucla.edu/stat/examples/alsm/alsm.zip", "Ch01pr19.dat"))
>
> Thanks for letting me know if you came across such a thing before.
>
> Toby

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
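The download-first approach can be sketched as follows. This is a minimal illustration, not code from the thread: the helper name read_table_from_zip_url is made up, and whether the URL from the question is still reachable is not something the sketch can guarantee, so the actual call is left commented out.

```r
# Download a zip archive to a temporary file, then read one member with unz().
# `url` and `member` are supplied by the caller; extra arguments go to read.table().
read_table_from_zip_url <- function(url, member, ...) {
  tmp <- tempfile(fileext = ".zip")
  on.exit(unlink(tmp))                        # remove the download afterwards
  download.file(url, tmp, mode = "wb", quiet = TRUE)
  read.table(unz(tmp, member), ...)           # unz() needs a local file path
}

# Usage with the file named in the question (requires network access):
# dtaa <- read_table_from_zip_url(
#   "http://www.ats.ucla.edu/stat/examples/alsm/alsm.zip", "Ch01pr19.dat")
```

The mode = "wb" argument matters on Windows, where a text-mode download would corrupt the archive.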
[R] Creating an Access (.mdb) database using R
Hello!

I have a short question: Is it possible to create a (non-existing) Access database using R (and if yes, how)? I need to create a new database and then insert a few tables into it.

Thank you in advance,
Moshe Olshansky
[EMAIL PROTECTED]
[R] Suppressing the large amount of white space in heatmap.2 in gplots
Hi

OK, quick question - I can suppress the calculation and drawing of the column dendrogram by using Colv=FALSE and dendrogram="row", but that leaves me with a large amount of white space at the top of the plot where the dendrogram would have been drawn...

Is there a way of getting rid of that?

Thanks
Mick
[R] Display Multiple page lattice plots
Gudday,

I am generating a series of lattice contourplots that are conditioned on a variable (Year) that has 27 different levels. If I try to put them all on one plot, it ends up pretty messy and you can't really read anything, so instead I have set the layout to 3x3, thus generating three pages of nine plots each. The problem is that I can't display all these on screen at once, because each subsequent page overwrites the previous one. I have found in the mailing lists how to print them to separate files without any problems, e.g.

p <- contourplot(log10(x) ~ lat*long | Year, data=data.tbl, layout=c(3,3))
png(file="Herring Distribution%02d.png", width=800, height=800)
print(p)
dev.off()

but there doesn't seem to be anything about how to output multiple pages to the screen... I suspect that I may need to use the page=... option in the contourplot command, but I can't seem to make it work.

It's a simple, and not particularly important problem, but it sure is bugging me!

Thanks for the advice in advance.

Cheers,
Mark
Re: [R] How to load a big txt file
Dear Chung-hong Chan,

Thanks! Can you recommend a text editor for splitting? I used UltraEdit and TextPad but could not find a way to split files with them.

Sincerely,
Alex

On 6/6/07, Chung-hong Chan [EMAIL PROTECTED] wrote:
> Easy solution will be split your big txt files by text editor. e.g. 5000
> rows each. and then combine the dataframes together into one.
>
> On 6/7/07, ssls sddd [EMAIL PROTECTED] wrote:
>> Dear list,
>>
>> I need to read a big txt file (around 130Mb; 23800 rows and 49 columns)
>> for downstream clustering analysis. I first used
>>
>> Tumor <- read.table("Tumor.txt", header = TRUE, sep = "\t")
>>
>> but it took a long time and failed. However, it had no problem if I just
>> put data of 3 columns. Is there any way which can load this big file?
>>
>> Thanks for any suggestions!
>>
>> Sincerely,
>> Alex
>
> --
> The scientists of today think deeply instead of clearly. One must be
> sane to think clearly, but one can think deeply and be quite insane.
> Nikola Tesla
> http://www.macgrass.com
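The split-and-combine idea quoted above does not strictly need a text editor. The following is a minimal sketch, not something posted in the thread, that reads a tab-separated file in chunks from an open connection and binds the pieces together. The function name read_in_chunks and the default chunk size of 5000 rows are assumptions for illustration.

```r
# Read a delimited text file in chunks of `chunk_rows` data rows and
# combine the chunks into one data frame.
read_in_chunks <- function(path, chunk_rows = 5000, sep = "\t") {
  con <- file(path, open = "r")          # keep the connection open between reads
  on.exit(close(con))

  # First chunk: consumes the header line plus up to chunk_rows data lines
  first <- read.table(con, header = TRUE, sep = sep, nrows = chunk_rows)
  chunks <- list(first)

  repeat {
    # Later chunks have no header; reuse the column names of the first chunk.
    # read.table() raises an error once the connection is exhausted.
    nxt <- tryCatch(
      read.table(con, header = FALSE, sep = sep, nrows = chunk_rows,
                 col.names = names(first)),
      error = function(e) NULL
    )
    if (is.null(nxt)) break
    chunks[[length(chunks) + 1]] <- nxt
    if (nrow(nxt) < chunk_rows) break    # a short chunk means end of file
  }
  do.call(rbind, chunks)
}

# Hypothetical usage with the file name from the thread:
# Tumor <- read_in_chunks("Tumor.txt", chunk_rows = 5000)
```

Chunking limits how much text is parsed per call, but the combined data frame still has to fit in memory, so this helps with parsing overhead rather than with total size.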
Re: [R] How to load a big txt file
Dear Michael,

It consists of 238305 rows and 50 columns including the header and row names.

Thanks!

Alex

On 6/7/07, michael watson (IAH-C) [EMAIL PROTECTED] wrote:
> Erm... Is that a typo? Are we really talking 23800 rows and 49 columns?
> Because that doesn't seem that many
>
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of ssls sddd
> Sent: 07 June 2007 10:48
> To: r-help@stat.math.ethz.ch
> Subject: Re: [R] How to load a big txt file
>
> Dear Chung-hong Chan,
>
> Thanks! Can you recommend a text editor for splitting? I used UltraEdit
> and TextPad but did not find they can split files.
>
> Sincerely,
> Alex
>
> On 6/6/07, Chung-hong Chan [EMAIL PROTECTED] wrote:
>> Easy solution will be split your big txt files by text editor. e.g. 5000
>> rows each. and then combine the dataframes together into one.
>>
>> On 6/7/07, ssls sddd [EMAIL PROTECTED] wrote:
>>> Dear list,
>>>
>>> I need to read a big txt file (around 130Mb; 23800 rows and 49 columns)
>>> for downstream clustering analysis. I first used
>>>
>>> Tumor <- read.table("Tumor.txt", header = TRUE, sep = "\t")
>>>
>>> but it took a long time and failed. However, it had no problem if I just
>>> put data of 3 columns. Is there any way which can load this big file?
>>>
>>> Thanks for any suggestions!
>>>
>>> Sincerely,
>>> Alex
>>
>> --
>> The scientists of today think deeply instead of clearly. One must be
>> sane to think clearly, but one can think deeply and be quite insane.
>> Nikola Tesla
>> http://www.macgrass.com
Re: [R] Creating an Access (.mdb) database using R
On Wed, 6 Jun 2007, Moshe Olshansky wrote:
> Hello!
>
> I have a short question: Is it possible to create a (non-existing) Access
> database using R (and if yes, how)? I need to create a new database and
> then insert a few tables into it.

Short answer: yes, if you are using Windows (you did not say).

Slightly longer answer 1: If you have the ODBC drivers installed,

library(RODBC)
ch <- odbcDriverConnect("Driver={Microsoft Access Driver (*.mdb)}")

will allow you to create a database and select it.

Slightly longer answer 2: Use DCOM to control Access if you have that installed. I'll leave you to do your own homework on this one.

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
Re: [R] How to load a big txt file
Dear Jim,

Thanks a lot! The size of the text file is 189,588,541 bytes. It consists of 238305 rows (including the header) and 50 columns (the first column is for ID and the rest for 49 samples).

The first row looks like:

ID AIRNS_p_Sty5_Mapping250K_Sty_A09_50156.cel AIRNS_p_Sty5_Mapping250K_Sty_A11_50188.cel AIRNS_p_Sty5_Mapping250K_Sty_A12_50204.cel AIRNS_p_Sty5_Mapping250K_Sty_B09_50158.cel AIRNS_p_Sty5_Mapping250K_Sty_C01_50032.cel AIRNS_p_Sty5_Mapping250K_Sty_C12_50208.cel AIRNS_p_Sty5_Mapping250K_Sty_D03_50066.cel AIRNS_p_Sty5_Mapping250K_Sty_D08_50146.cel AIRNS_p_Sty5_Mapping250K_Sty_F03_50070.cel AIRNS_p_Sty5_Mapping250K_Sty_F12_50214.cel AIRNS_p_Sty5_Mapping250K_Sty_G09_50168.cel DOLCE_p_Sty7_Mapping250K_Sty_B04_53892.cel DOLCE_p_Sty7_Mapping250K_Sty_B06_53924.cel DOLCE_p_Sty7_Mapping250K_Sty_C05_53910.cel DOLCE_p_Sty7_Mapping250K_Sty_C10_53990.cel DOLCE_p_Sty7_Mapping250K_Sty_D05_53912.cel DOLCE_p_Sty7_Mapping250K_Sty_E01_53850.cel DOLCE_p_Sty7_Mapping250K_Sty_G12_54030.cel DOLCE_p_Sty7_Mapping250K_Sty_H06_53936.cel DOLCE_p_Sty7_Mapping250K_Sty_H08_53968.cel DOLCE_p_Sty7_Mapping250K_Sty_H11_54016.cel DOLCE_p_Sty7_Mapping250K_Sty_H12_54032.cel GUSTO_p_Sty20_Mapping250K_Sty_C08_81736.cel GUSTO_p_Sty20_Mapping250K_Sty_E03_81660.cel GUSTO_p_Sty20_Mapping250K_Sty_H02_81650.cel HEWED_p_250KSty_Plate_20060123_GOOD_B01_46246.cel HEWED_p_250KSty_Plate_20060123_GOOD_C06_46328.cel HEWED_p_250KSty_Plate_20060123_GOOD_F02_46270.cel HEWED_p_250KSty_Plate_20060123_GOOD_G04_46304.cel HOCUS_p_Sty4_Mapping250K_Sty_B05_55060.cel HOCUS_p_Sty4_Mapping250K_Sty_B12_55172.cel HOCUS_p_Sty4_Mapping250K_Sty_E05_55066.cel SOARS_p_Sty23_Mapping250K_Sty_B07_89024.cel SOARS_p_Sty23_Mapping250K_Sty_C01_88930.cel SOARS_p_Sty23_Mapping250K_Sty_C11_89090.cel SOARS_p_Sty23_Mapping250K_Sty_F07_89032.cel SOARS_p_Sty23_Mapping250K_Sty_H08_89052.cel SOARS_p_Sty23_Mapping250K_Sty_H10_89084.cel VINOS_p_Sty8_Mapping250K_Sty_A04_54082.cel VINOS_p_Sty8_Mapping250K_Sty_A07_54130.cel VINOS_p_Sty8_Mapping250K_Sty_B08_54148.cel VINOS_p_Sty8_Mapping250K_Sty_D01_54040.cel VINOS_p_Sty8_Mapping250K_Sty_D05_54104.cel VINOS_p_Sty8_Mapping250K_Sty_E04_54090.cel VINOS_p_Sty8_Mapping250K_Sty_E12_54218.cel VINOS_p_Sty8_Mapping250K_Sty_G01_54046.cel VINOS_p_Sty8_Mapping250K_Sty_G12_54222.cel VOLTS_p_Sty9_Mapping250K_Sty_G09_57916.cel VOLTS_p_Sty9_Mapping250K_Sty_H12_57966.cel

and the second row looks like:

SNP_A-1780271 1.8564200401306 1.5095599889755 1.7315399646759 1.530769944191 1.6576000452042 1.474179983139 2.1564099788666 1.77572267 1.5979499816895 2.164146185 1.980849981308 2.180370092392 1.8782299757004 2.1485500335693 1.5325000286102 1.7232999801636 2.2281200885773 1.938169482 1.8546999692917 2.1590900421143 2.1928400993347 2.0253200531006 2.6680200099945 2.7435901165009 2.0804998874664 3.2142300605774 2.1001501083374 2.147579908371 3.5244200229645 1.374480009079 1.6613099575043 3.1606800556183 2.0917000770569 1.872725613 1.8952000141144 1.813570022583 1.8180899620056 2.2553699016571 1.927329428 1.6766400337219 1.3424600362778 1.5666999816895 1.7180800437927 1.9548699855804 1.999694824 2.2242999076843 1.7591500282288 2.0480198860168 2.638689994812

Thanks a lot!

Sincerely,
Alex

On 6/6/07, jim holtman [EMAIL PROTECTED] wrote:
> It would be useful if you could post the first couple of rows of the data
> so we can see what it looks like.
>
> On 6/6/07, ssls sddd [EMAIL PROTECTED] wrote:
>> Dear list,
>>
>> I need to read a big txt file (around 130Mb; 23800 rows and 49 columns)
>> for downstream clustering analysis. I first used
>>
>> Tumor <- read.table("Tumor.txt", header = TRUE, sep = "\t")
>>
>> but it took a long time and failed. However, it had no problem if I just
>> put data of 3 columns. Is there any way which can load this big file?
>>
>> Thanks for any suggestions!
>>
>> Sincerely,
>> Alex
>
> --
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
>
> What is the problem you are trying to solve?
Re: [R] How to load a big txt file
Erm... Is that a typo? Are we really talking 23800 rows and 49 columns? Because that doesn't seem that many

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of ssls sddd
Sent: 07 June 2007 10:48
To: r-help@stat.math.ethz.ch
Subject: Re: [R] How to load a big txt file

> Dear Chung-hong Chan,
>
> Thanks! Can you recommend a text editor for splitting? I used UltraEdit
> and TextPad but did not find they can split files.
>
> Sincerely,
> Alex
>
> On 6/6/07, Chung-hong Chan [EMAIL PROTECTED] wrote:
>> Easy solution will be split your big txt files by text editor. e.g. 5000
>> rows each. and then combine the dataframes together into one.
>>
>> On 6/7/07, ssls sddd [EMAIL PROTECTED] wrote:
>>> Dear list,
>>>
>>> I need to read a big txt file (around 130Mb; 23800 rows and 49 columns)
>>> for downstream clustering analysis. I first used
>>>
>>> Tumor <- read.table("Tumor.txt", header = TRUE, sep = "\t")
>>>
>>> but it took a long time and failed. However, it had no problem if I just
>>> put data of 3 columns. Is there any way which can load this big file?
>>>
>>> Thanks for any suggestions!
>>>
>>> Sincerely,
>>> Alex
>>
>> --
>> The scientists of today think deeply instead of clearly. One must be
>> sane to think clearly, but one can think deeply and be quite insane.
>> Nikola Tesla
>> http://www.macgrass.com
Re: [R] R help
scott flemming wrote:
> Hi,
>
> I wonder whether R can finish the following project: I want to make a
> chart to represent 10 genes. Each gene has orientation and length.
> Therefore, a gene can be represented by arrows. Can R be used to draw 10
> arrows in one line ?

Hi Scott,

Maybe the feather.plot function in the plotrix package is what you want.

Jim
[R] update packages with R on Vista: error
Dear R-list,

I have encountered the following error message trying to update R packages:

update.packages(ask='graphics')
Warning in install.packages(update[instlib == l, "Package"], l, contriburl = contriburl, :
  'lib' is not writable
Error in install.packages(update[instlib == l, "Package"], l, contriburl = contriburl, :
  unable to install packages

I remember I did not have the problem on the last update, when R installed the files in the Documents/R folder on my user account.

Any ideas how to handle this? I made the directories completely writable, so I do not know where the problem is now (especially since update worked before...)

Stefan

PS: Tinn-R 1.19.2.3 + R 2.5.0 on Vista Business
Re: [R] Display Multiple page lattice plots
This works on my Windows machine starting off at a new R session:

options(graphics.record = TRUE)
library(lattice)
xyplot(uptake ~ conc | Plant, CO2, layout = c(2,2))

Now switch focus to the graphics window and you can PgUp and PgDn through them.

There are several variations to this:

1. use graphics.record option as shown above
2. the graphics.record option is passed to the windows driver so explicitly call windows yourself. See ?windows
3. prior to your xyplot switch focus to the graphics window and a History menu appears. Use that to turn on recording.

On 6/7/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
> Gudday,
>
> I am generating a series of lattice contourplots that are conditioned on
> a variable (Year) that has 27 different levels. If I try and put them all
> on one plot, it ends up pretty messy and you can't really read anything,
> so instead I have set the layout to 3x3, thus generating three pages of
> nine plots each. The problem is that I can't display all these on screen
> at once, because each subsequent page overwrites the previous one. I have
> found in the mailing lists how to print them to separate files without
> any problems eg.
>
> p <- contourplot(log10(x) ~ lat*long | Year, data=data.tbl, layout=c(3,3))
> png(file="Herring Distribution%02d.png", width=800, height=800)
> print(p)
> dev.off()
>
> but there doesn't seem to be anything about how to output multiple pages
> to the screen... I suspect that I may need to use the page=... option in
> contourplot command, but I can't seem to make it work.
>
> Its a simple, and not particularly important problem, but it sure is
> bugging me!
>
> Thanks for the advice in advance.
>
> Cheers,
> Mark
[R] RODBC and placeholders?
Hello,

I have a question about interacting with MySQL from R. I have a vector of ids, and for each id I would like to query my database, retrieve 3 values, and combine the results for all the ids into a data frame.

Currently I have been using RODBC for single queries, but I have not found anything in the documentation about using placeholders.

I hope someone can help.

Thanks,
James Morris
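One way to approach the task described above without placeholders is to build a single query covering all ids and send it once. The following is a minimal sketch under stated assumptions: the table name measurements, the columns id, v1, v2, v3, and the helper build_id_query are all hypothetical, and the ids are assumed to be integers. The RODBC call is left commented out because it needs an open channel.

```r
# Build one SELECT statement that fetches the value columns for all ids.
# Table and column names are hypothetical defaults for illustration.
build_id_query <- function(ids,
                           table = "measurements",
                           id_col = "id",
                           value_cols = c("v1", "v2", "v3")) {
  ids <- as.integer(ids)                   # coerce so only numbers reach the SQL text
  stopifnot(length(ids) > 0, !anyNA(ids))
  sprintf("SELECT %s FROM %s WHERE %s IN (%s)",
          paste(c(id_col, value_cols), collapse = ", "),
          table, id_col,
          paste(ids, collapse = ", "))
}

ids <- c(3, 7, 11)
sql <- build_id_query(ids)
print(sql)

# With an open RODBC channel `ch` (for example from odbcConnect()), the query
# would be sent with sqlQuery(), which returns a data frame:
# library(RODBC)
# res <- sqlQuery(ch, sql)
```

A single IN query returns one data frame with a row per matching id, which avoids both the loop and the step of binding per-id results together. Coercing the ids to integer is a guard against malformed input ending up in the SQL string.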
Re: [R] update packages with R on Vista: error
See the rw-FAQ, which describes this in detail.

Almost certainly you are trying to update the package 'cluster' which is in the main library. But as you used the GUI, we can't see that.

On Thu, 7 Jun 2007, Stefan Grosse wrote:
> Dear R-list,
>
> I have encountered the following error message trying to update R
> packages:
>
> update.packages(ask='graphics')
> Warning in install.packages(update[instlib == l, "Package"], l, contriburl = contriburl, :
>   'lib' is not writable
> Error in install.packages(update[instlib == l, "Package"], l, contriburl = contriburl, :
>   unable to install packages
>
> I remember did not have the problem on the last update where R installed
> the files then in the Documents/R folder on my user account.
>
> Any ideas how to handle this? I made the directories completely writable
> so I do not know where the problem is now (especially since update worked
> before...)
>
> Stefan
>
> PS: Tinn-R 1.19.2.3 + R 2.5.0 on Vista Business

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
Re: [R] How to load a big txt file
I took your data and duped the data line so I had 100,000 rows and it took 40 seconds to read in when specifying colClasses

system.time(x <- read.table('/tempxx.txt', header=TRUE, colClasses=c('factor', rep('numeric', 49))))
   user  system elapsed
  40.98    0.46   42.39
str(x)
'data.frame': 102272 obs. of 50 variables:
 $ ID : Factor w/ 1 level "SNP_A-1780271": 1 1 1 1 1 1 1 1 1 1 ...
 $ AIRNS_p_Sty5_Mapping250K_Sty_A09_50156.cel : num 1.86 1.86 1.86 1.86 1.86 ...
 $ AIRNS_p_Sty5_Mapping250K_Sty_A11_50188.cel : num 1.51 1.51 1.51 1.51 1.51 ...
 $ AIRNS_p_Sty5_Mapping250K_Sty_A12_50204.cel : num 1.73 1.73 1.73 1.73 1.73 ...
 $ AIRNS_p_Sty5_Mapping250K_Sty_B09_50158.cel : num 1.53 1.53 1.53 1.53 1.53 ...
 $ AIRNS_p_Sty5_Mapping250K_Sty_C01_50032.cel : num 1.66 1.66 1.66 1.66 1.66 ...
 $ AIRNS_p_Sty5_Mapping250K_Sty_C12_50208.cel : num 1.47 1.47 1.47 1.47 1.47 ...
 $ AIRNS_p_Sty5_Mapping250K_Sty_D03_50066.cel : num 2.16 2.16 2.16 2.16 2.16 ...
 $ AIRNS_p_Sty5_Mapping250K_Sty_D08_50146.cel : num 1.78 1.78 1.78 1.78 1.78 ...
 $ AIRNS_p_Sty5_Mapping250K_Sty_F03_50070.cel : num 1.60 1.60 1.60 1.60 1.60 ...
 $ AIRNS_p_Sty5_Mapping250K_Sty_F12_50214.cel : num 2.16 2.16 2.16 2.16 2.16 ...
 $ AIRNS_p_Sty5_Mapping250K_Sty_G09_50168.cel : num 1.98 1.98 1.98 1.98 1.98 ...
 $ DOLCE_p_Sty7_Mapping250K_Sty_B04_53892.cel : num 2.18 2.18 2.18 2.18 2.18 ...
 $ DOLCE_p_Sty7_Mapping250K_Sty_B06_53924.cel : num 1.88 1.88 1.88 1.88 1.88 ...
 $ DOLCE_p_Sty7_Mapping250K_Sty_C05_53910.cel : num 2.15 2.15 2.15 2.15 2.15 ...
 $ DOLCE_p_Sty7_Mapping250K_Sty_C10_53990.cel : num 1.53 1.53 1.53 1.53 1.53 ...
 $ DOLCE_p_Sty7_Mapping250K_Sty_D05_53912.cel : num 1.72 1.72 1.72 1.72 1.72 ...
 $ DOLCE_p_Sty7_Mapping250K_Sty_E01_53850.cel : num 2.23 2.23 2.23 2.23 2.23 ...
 $ DOLCE_p_Sty7_Mapping250K_Sty_G12_54030.cel : num 1.94 1.94 1.94 1.94 1.94 ...
 $ DOLCE_p_Sty7_Mapping250K_Sty_H06_53936.cel : num 1.85 1.85 1.85 1.85 1.85 ...
 $ DOLCE_p_Sty7_Mapping250K_Sty_H08_53968.cel : num 2.16 2.16 2.16 2.16 2.16 ...
 $ DOLCE_p_Sty7_Mapping250K_Sty_H11_54016.cel : num 2.19 2.19 2.19 2.19 2.19 ...
 $ DOLCE_p_Sty7_Mapping250K_Sty_H12_54032.cel : num 2.03 2.03 2.03 2.03 2.03 ...
 $ GUSTO_p_Sty20_Mapping250K_Sty_C08_81736.cel : num 2.67 2.67 2.67 2.67 2.67 ...
 $ GUSTO_p_Sty20_Mapping250K_Sty_E03_81660.cel : num 2.74 2.74 2.74 2.74 2.74 ...
 $ GUSTO_p_Sty20_Mapping250K_Sty_H02_81650.cel : num 2.08 2.08 2.08 2.08 2.08 ...
 $ HEWED_p_250KSty_Plate_20060123_GOOD_B01_46246.cel: num 3.21 3.21 3.21 3.21 3.21 ...
 $ HEWED_p_250KSty_Plate_20060123_GOOD_C06_46328.cel: num 2.1 2.1 2.1 2.1 2.1 ...
 $ HEWED_p_250KSty_Plate_20060123_GOOD_F02_46270.cel: num 2.15 2.15 2.15 2.15 2.15 ...
 $ HEWED_p_250KSty_Plate_20060123_GOOD_G04_46304.cel: num 3.52 3.52 3.52 3.52 3.52 ...
 $ HOCUS_p_Sty4_Mapping250K_Sty_B05_55060.cel : num 1.37 1.37 1.37 1.37 1.37 ...
 $ HOCUS_p_Sty4_Mapping250K_Sty_B12_55172.cel : num 1.66 1.66 1.66 1.66 1.66 ...
 $ HOCUS_p_Sty4_Mapping250K_Sty_E05_55066.cel : num 3.16 3.16 3.16 3.16 3.16 ...
 $ SOARS_p_Sty23_Mapping250K_Sty_B07_89024.cel : num 2.09 2.09 2.09 2.09 2.09 ...
 $ SOARS_p_Sty23_Mapping250K_Sty_C01_88930.cel : num 1.87 1.87 1.87 1.87 1.87 ...
 $ SOARS_p_Sty23_Mapping250K_Sty_C11_89090.cel : num 1.90 1.90 1.90 1.90 1.90 ...
 $ SOARS_p_Sty23_Mapping250K_Sty_F07_89032.cel : num 1.81 1.81 1.81 1.81 1.81 ...
 $ SOARS_p_Sty23_Mapping250K_Sty_H08_89052.cel : num 1.82 1.82 1.82 1.82 1.82 ...
 $ SOARS_p_Sty23_Mapping250K_Sty_H10_89084.cel : num 2.26 2.26 2.26 2.26 2.26 ...
 $ VINOS_p_Sty8_Mapping250K_Sty_A04_54082.cel : num 1.93 1.93 1.93 1.93 1.93 ...
 $ VINOS_p_Sty8_Mapping250K_Sty_A07_54130.cel : num 1.68 1.68 1.68 1.68 1.68 ...
 $ VINOS_p_Sty8_Mapping250K_Sty_B08_54148.cel : num 1.34 1.34 1.34 1.34 1.34 ...
 $ VINOS_p_Sty8_Mapping250K_Sty_D01_54040.cel : num 1.57 1.57 1.57 1.57 1.57 ...
 $ VINOS_p_Sty8_Mapping250K_Sty_D05_54104.cel : num 1.72 1.72 1.72 1.72 1.72 ...
 $ VINOS_p_Sty8_Mapping250K_Sty_E04_54090.cel : num 1.95 1.95 1.95 1.95 1.95 ...
 $ VINOS_p_Sty8_Mapping250K_Sty_E12_54218.cel : num 1.44 1.44 1.44 1.44 1.44 ...
 $ VINOS_p_Sty8_Mapping250K_Sty_G01_54046.cel : num 2.22 2.22 2.22 2.22 2.22 ...
 $ VINOS_p_Sty8_Mapping250K_Sty_G12_54222.cel : num 1.76 1.76 1.76 1.76 1.76 ...
 $ VOLTS_p_Sty9_Mapping250K_Sty_G09_57916.cel : num 2.05 2.05 2.05 2.05 2.05 ...
 $ VOLTS_p_Sty9_Mapping250K_Sty_H12_57966.cel : num 2.64 2.64 2.64 2.64 2.64 ...

On 6/7/07, ssls sddd [EMAIL PROTECTED] wrote:
> Dear Jim,
>
> Thanks a lot! The size of the text file is 189,588,541 bytes. It consists
> of 238305 rows (including the header) and 50 columns (the first column is
> for ID and the rest for 49 samples).
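The colClasses approach can be tried on a small stand-in file before running it on the real data. The following is a minimal sketch with made-up contents (three sample columns instead of 49; the column names and SNP ids are assumptions for illustration). Declaring the column types up front spares read.table() from guessing them, and nrows and comment.char are further hints documented in ?read.table.

```r
# Write a tiny tab-separated stand-in for the real file (made-up values)
tmp <- tempfile(fileext = ".txt")
writeLines(c("ID\tsample1\tsample2\tsample3",
             "SNP_A-1\t1.86\t1.51\t1.73",
             "SNP_A-2\t1.53\t1.66\t1.47"), tmp)

n_samples <- 3   # would be 49 for the file described in the thread

x <- read.table(tmp, header = TRUE, sep = "\t",
                colClasses = c("factor", rep("numeric", n_samples)),
                nrows = 10,          # an upper bound on rows; a mild overestimate is fine
                comment.char = "")   # turn off comment scanning
str(x)
unlink(tmp)
```

For the real file the call would use the actual path, n_samples <- 49, and an nrows value slightly above the known row count.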
[R] Use R in a pipeline as a filter
Hi,

how can I use R in a pipeline like this

$ ./generate-data | R --script-file=Script.R | ./further-analyse-data > result.dat

Assume a column-based output of ./generate-data, e.g. something like:

1 1 1
2 4 8
3 9 27
4 16 64

The R commands that process the data should come from Script.R and should print to stdout (Script.R could for example calculate the square of every entry or calculate the mean of the columns, ...)

The output should be printed to stdout, such that further-analyse-data can use the output.

Can some R expert code that for me please? I would be very happy. I am also happy about information on how to do that myself, although I don't think I know enough to do that myself.

Thank you for your consideration,
Micha
[R] Conditional Sequential Gaussian Simulation
Hello,

I'm wondering if there are any packages/functions that can perform conditional sequential Gaussian simulation.

I'm following an article written by Grunwald, Reddy, Prenger and Fisher 2007. Modeling of the spatial variability of biogeochemical soil properties in a freshwater ecosystem. Ecological Modelling 201: 521 - 535, and would like to explore this methodology.

Thanks
Steve

Steve Friedman, PhD
Everglades Division
Senior Environmental Scientist, Landscape Ecology
South Florida Water Management District
3301 Gun Club Road
West Palm Beach, Florida 33406
email: [EMAIL PROTECTED]
Office: 561 - 682 - 6312
Fax: 561 - 682 - 5980

If you are not doing what you truly enjoy its your obligation to yourself to change.
Re: [R] Use R in a pipeline as a filter
This is one of the things that 'Rscript' is for: see 'An Introduction to R' (section B.4 in the HTML version, http://cran.r-project.org/doc/manuals/R-intro.html#Scripting-with-R).

You haven't even told us your version of R or OS (see the posting guide): you need R >= 2.5.0 for this.

But your 'example' would be

./generate-data | Rscript Script.R | ./further-analyse-data > result.dat

On Thu, 7 Jun 2007, [EMAIL PROTECTED] wrote:
> Hi,
>
> how can I use R in a pipline like this
>
> $ ./generate-data | R --script-file=Script.R | ./further-analyse-data > result.dat
>
> Assume a column based output of ./generate-data, e.g. something like:
>
> 1 1 1
> 2 4 8
> 3 9 27
> 4 16 64
>
> The R commands that process the data should come from Script.R and should
> print to stdout (Script.R could for example calculate the square of every
> entry or calculate the mean of the columns, ...)
>
> The output should be printed to stdout, such that further-analyse-data
> can use the output.
>
> Can some R expert code that for me please? I would be very happy. I am
> also happy about information how to do that myself although I dont think
> I know enough to do that myself.
>
> Thank you for your consideration,
> Micha

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
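The contents of Script.R are not given in the thread. The following is a minimal sketch of what such a filter might look like, assuming whitespace-separated numeric columns without a header and taking "square every entry" from the question as the task. The function name filter_square is made up for illustration.

```r
# Script.R (sketch): read whitespace-separated numeric columns from standard
# input, square every entry, and write the result to standard output.
filter_square <- function(input = file("stdin"), output = stdout()) {
  # file("stdin") is the process's standard input, which is what a pipe feeds
  d <- read.table(input, header = FALSE)
  write.table(d^2, file = output, quote = FALSE,
              row.names = FALSE, col.names = FALSE)
  invisible(d)
}

# The last line of Script.R would then be the call itself:
# filter_square()
```

With that final call uncommented, the script would sit in the pipeline as `./generate-data | Rscript Script.R | ./further-analyse-data > result.dat`. Under Rscript, file("stdin") is used rather than stdin(), since the latter refers to the console or script input instead of the piped data.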
Re: [R] Spectral analysis
David and Ted,

since David asked about wavelets, there are some examples in the packages Wavethresh and Waveslim that could be useful. Waveslim deals with time series whose length may or may not be a power of 2, but which must be regularly spaced. Wavethresh 3 (http://www.maths.bris.ac.uk/~wavethresh/) has methods to analyze irregular time series, as suggested by Ted, but their length must be a power of 2.

Regards, Rogerio

-- Original header --
From: [EMAIL PROTECTED]
To: David LEDU [EMAIL PROTECTED]
Cc: r-help@stat.math.ethz.ch
Date: Wed, 06 Jun 2007 22:34:21 +0100 (BST)
Subject: Re: [R] Spectral analysis

> On 06-Jun-07 20:55:09, David LEDU wrote:
>
>> Hi all, I am dealing with paleoceanographic data and I have a C14 time series and one other variable. I would like to perform a spectral analysis (fft or wavelet) and plot it. Unfortunately I don't know the exact script to do this. Could anybody send me an example to perform my spectral analysis? Thank you, David
>
> There are a lot of possible ramifications to your query, but for a basic spectral analysis of a series you can use the function spectrum() in the stats package. What is the role of the other variable?
>
> Best wishes, Ted.
>
> E-Mail: (Ted Harding) [EMAIL PROTECTED] Fax-to-email: +44 (0)870 094 0861 Date: 06-Jun-07 Time: 22:34:07 -- XFMail --
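A minimal sketch of the spectrum() call Ted mentions, run on a made-up, regularly spaced series (a sine wave at 0.1 cycles per sample plus noise); the simulated data and variable names are assumptions, not David's C14 record.

```r
# Simulated, regularly spaced series: one dominant frequency plus noise.
set.seed(1)
n <- 200
t_idx <- seq_len(n)
x <- sin(2 * pi * 0.1 * t_idx) + rnorm(n, sd = 0.3)

# Raw periodogram via stats::spectrum(); plot = FALSE only returns the estimate.
sp <- spectrum(x, plot = FALSE)

# Frequency (in cycles per sampling interval) with the largest spectral density.
peak_freq <- sp$freq[which.max(sp$spec)]
peak_freq

# plot(sp) would draw the periodogram on a log scale.
```

An irregularly sampled record would first need interpolating onto a regular grid, or one of the wavelet methods mentioned above.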
Re: [R] update packages with R on Vista: error
Actually the packages R wants to update are: VR, cluster, lattice, mgcv, nlme and rcompgen.

As described in the R for Windows FAQ, I created a .Renviron file containing the path to the win-library that R already created (R_LIBS=C: ... ). I also tried to add R_LIBS= as an Rgui parameter from within Tinn-R. Additionally I tried to leave a file named Renviron.site in the etc directory. Nothing has worked thus far.

Interestingly, installing packages works fine even without specifying the R_LIBS path manually with any of the above-mentioned methods. Even more puzzling: when I install e.g. nlme manually via install.packages("nlme") it works, but R still wants to update it, even though e.g. library(nlme), ?nlme shows that the latest version is installed.

I would guess there is some problem with the library path variable in the update program...

Stefan

Original Message
Subject: Re: [R] update packages with R on Vista: error
From: Prof Brian Ripley [EMAIL PROTECTED]
To: Stefan Grosse [EMAIL PROTECTED]
Date: 07.06.2007 13:01

> See the rw-FAQ, which describes this in detail. Almost certainly you are trying to update the package 'cluster' which is in the main library. But as you used the GUI, we can't see that.
>
> On Thu, 7 Jun 2007, Stefan Grosse wrote:
>
>> Dear R-list, I have encountered the following error message trying to update R packages:
>>
>>     > update.packages(ask='graphics')
>>     Warning in install.packages(update[instlib == l, "Package"], l, contriburl = contriburl, :
>>       'lib' is not writable
>>     Error in install.packages(update[instlib == l, "Package"], l, contriburl = contriburl, :
>>       unable to install packages
>>
>> As I remember, I did not have this problem on the last update, when R installed the files in the Documents/R folder of my user account. Any ideas how to handle this? I made the directories completely writable so I do not know where the problem is now (especially since update worked before...)
>>
>> Stefan
>>
>> PS: Tinn-R 1.19.2.3 + R 2.5.0 on Vista Business
[R] ordered logistic regression
Hi there,

I tried to run an ordered logistic regression with polr. So far it worked after I turned my data into factors. But here's my problem: my output looks like this:

    Call:
    polr(formula = factor(fulltest[, 1]) ~ factor(fulltest[, 2]) + factor(fulltest[, 10]), method = "logistic")

    Coefficients:
    factor(fulltest[, 2])0     1.0850358
    factor(fulltest[, 2])1     0.8269005
    factor(fulltest[, 2])2     0.8850035
    factor(fulltest[, 2])3     1.0263442
    factor(fulltest[, 10])0    1.3724189
    factor(fulltest[, 10])1    1.8258853
    factor(fulltest[, 10])2    2.2263393
    factor(fulltest[, 10])3    2.5234381

    Intercepts:
           -1|0         0|1         1|2         2|3
    -2.42111430 -2.13077351  0.07966516  2.85951997

    Residual Deviance: 1219.493
    AIC: 1243.493

As far as I understand, a dummy variable is introduced for every possible outcome (0, 1, 2, 3). This is nice because it contains a lot of information, but far more than I need, and I have too many variables to investigate to use dummies for every single one of them.

Can I get one coefficient per variable? Is there another package for logistic regression? Did I use factor() the wrong way?

Thanks in advance, matthias
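One possible answer, sketched here on simulated data rather than on `fulltest`: leaving a predictor numeric instead of wrapping it in factor() gives a single slope for it, on the assumption (which needs checking for real data) that its effect is linear on the log-odds scale. All variable names below are made up.

```r
library(MASS)  # provides polr()

# Simulated ordinal response driven by two integer-coded predictors.
set.seed(42)
n  <- 400
x1 <- sample(0:3, n, replace = TRUE)
x2 <- sample(0:3, n, replace = TRUE)
latent <- 0.8 * x1 + 0.5 * x2 + rlogis(n)
y <- cut(latent, breaks = c(-Inf, 1, 2, 3, Inf),
         labels = c("0", "1", "2", "3"), ordered_result = TRUE)

# Predictors kept numeric: one coefficient per variable.
fit_numeric <- polr(y ~ x1 + x2, method = "logistic")

# Predictors as factors: one dummy coefficient per non-baseline level.
fit_factor <- polr(y ~ factor(x1) + factor(x2), method = "logistic")

coef(fit_numeric)
coef(fit_factor)
```

Only the response has to be a factor for polr(); comparing the two fits (for example by AIC) is one way to judge whether the linear coding loses much.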
[R] How to get the number of modes using kde2d
Hi,

Silverman's paper shows how to find the modes of one-dimensional data, and there is software based on it at http://www.stanford.edu/~kasparr/software/silverman.r. For two-dimensional data I use kde2d to smooth the data first, which gives me a matrix of densities over the grid of X (one dimension) by Y (the other dimension). If I sort X and Y before I pass the values to kde2d(x, y, c(hx, hy)), the shape of the persp plot changes.

Does anyone know how to get the modes out of the two-dimensional data programmatically? Also, if I want to get the minimum bandwidths for X and Y that give modes = 2, is the solution unique?

Thanks, pat
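One simple way to count modes programmatically, offered as a sketch rather than as Silverman's method: treat every grid cell whose density is strictly larger than all eight neighbours as a mode. The helper `find_modes` and the simulated data are assumptions for illustration. Note also that x and y must stay paired, so sorting them separately changes the data and hence the surface.

```r
library(MASS)  # provides kde2d()

# Return the (row, col) grid indices of strict local maxima of a matrix.
find_modes <- function(z) {
  nr <- nrow(z)
  nc <- ncol(z)
  # Pad with -Inf so that border cells can be compared like interior ones.
  padded <- matrix(-Inf, nr + 2, nc + 2)
  padded[2:(nr + 1), 2:(nc + 1)] <- z
  is_max <- matrix(TRUE, nr, nc)
  for (di in -1:1) {
    for (dj in -1:1) {
      if (di == 0 && dj == 0) next
      neighbour <- padded[(2:(nr + 1)) + di, (2:(nc + 1)) + dj]
      is_max <- is_max & (z > neighbour)
    }
  }
  which(is_max, arr.ind = TRUE)
}

# Simulated paired data with two clusters; do not sort x and y separately.
set.seed(1)
x <- c(rnorm(200, -2), rnorm(200, 2))
y <- c(rnorm(200, -2), rnorm(200, 2))
dens  <- kde2d(x, y, n = 50)
modes <- find_modes(dens$z)

# Grid coordinates of the detected modes (z[i, j] belongs to x[i], y[j]).
cbind(x = dens$x[modes[, "row"]], y = dens$y[modes[, "col"]])
```

The number of modes found depends on the bandwidths and the grid resolution, so small spurious maxima in the tails are possible.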
[R] comparison of two logistic regression models
Dear list members!

Could you help me? I would like to compare two models: a) a logistic regression model with 3 factors as independent variables; b) a logistic regression model with 3 factors and one random effect as independent variables (function glmmPQL). AIC is not available with PQL and model comparison using ANOVA is not possible. What should I do?

Thanks in advance. Anna-Maria
Re: [R] Rdonlp2 - an extension library for constrained optimization
Hello R-list,

I have released an updated version (0.3-1) of Rdonlp2. Some (fatal) bugs which could kill the interpreter should be fixed. In addition, the user-visible changes are:

* *.mes and *.pro files are not created if name=NULL (this is the default) in donlp2().
* The machine epsilons defined in R are used for internal calculations (step size, etc.).
* The numeric Hessian is now evaluated at the optimum and calculated with the algorithm specified in 'difftype' in donlp2.control(). Setting difftype=2 will produce (roughly) the same value as optim() does.

I sincerely appreciate the users who sent me useful comments. Windows binary, OSX universal binary and source files are available at: http://arumat.net/Rdonlp2/

Regards, TAMURA Ryuichi, mailto: [EMAIL PROTECTED]
Re: [R] comparison of two logistic regression models
You could use lmer in the lme4 package to fit the logistic regression with random effect as it does report the AIC.

On 07/06/07, Anna-Maria Tyriseva [EMAIL PROTECTED] wrote:

> Dear list members! Could you help me? I would like to compare two models: a) logistic regression model, 3 factors as independents b) logistic regression model, 3 factors and one random effect as independents (function glmmPQL). AIC are not available with PQL and model comparison using ANOVA is not possible. What should I do? Thanks in advance. Anna-Maria

-- David Barron Said Business School University of Oxford Park End Street Oxford OX1 1HP
Re: [R] names not inherited in functions
Not sure what you are going to get. Can you shorten your functions and specify some example data? Then please tell us what your expected result is.

Best, Uwe Ligges

david dav wrote:

> Dear all, I'd like to keep the names of variables when calling them in a function. An example might help to understand my problem: the following function puts counts and percentages of a data.frame called tablo into a new data frame. The step nom.chiffr[1] <- names(vari) is useless, as names from the original data.frame aren't kept in the function environment. Hoping I use appropriate R vocabulary, I thank you for your help. David
>
>     descriptif <- function (tablo) {
>       descriptifvar <- function (vari) {
>         table(vari)
>         length(vari[!is.na(vari)])
>         chiffr <- cbind(table(vari), 100*table(vari)/(length(vari[!is.na(vari)])))
>         nom.chiffr <- rep(NA, dim(table(vari)))
>         if (is.null(names(vari))) nom.chiffr[1] <- paste(i, "") else nom.chiffr[1] <- names(vari)
>         chiffr <- data.frame(names(table(vari)), chiffr)
>         rownames(chiffr) <- NULL
>         chiffr <- data.frame(nom.chiffr, chiffr)
>         return(chiffr)
>       }
>       res <- rep(NA, 4)
>       for (i in 1 : ncol(tablo)) res <- rbind(res, descriptifvar(tablo[,i]))
>       colnames(res) <- c("variable", "niveau", "effectif", "pourcentage")
>       return(res[-1,])
>     }
>     # NB I used this function on a data.frame with only factors in
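A possible reading of the problem, given as a hedged sketch: tablo[, i] returns a bare vector, so the column name is already lost before the inner function sees it. Looping over names(tablo) and passing each name along is one way around that. The function `describe_factors` below is illustrative and not a drop-in replacement for David's code.

```r
# Counts and percentages per level for every column of a data frame of
# factors, keeping the column name in the 'variable' column.
describe_factors <- function(tablo) {
  pieces <- lapply(names(tablo), function(nm) {
    counts <- table(tablo[[nm]])          # NA values are dropped by default
    data.frame(variable    = nm,
               niveau      = names(counts),
               effectif    = as.vector(counts),
               pourcentage = 100 * as.vector(counts) / sum(counts),
               stringsAsFactors = FALSE)
  })
  do.call(rbind, pieces)
}

# Small made-up example.
d <- data.frame(g = factor(c("a", "a", "b")),
                h = factor(c("x", "y", "y")))
describe_factors(d)
```

Here the name is repeated on every row of a variable's block rather than only on the first; that is a design choice and easy to change.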
Re: [R] problem with Axis labels
ramakanth reddy wrote:

> Hi, I am using the pamr.plotsurvival function to plot the KM curves. How can I change the x axis and y axis labels according to my interest?

If we are talking about the most recent version of package pamr (you forgot to tell us these details): in R, type pamr.plotsurvival and see that this function is rather a proof of concept than a well designed function for general use. You might want to extend the function by allowing xlab / ylab and other arguments. Additionally, you might want to remove a couple of hard-coded values in order to make the function more generally usable. I am pretty sure the authors/maintainer (CCing the maintainer) of pamr will be happy about your contributions, if you submit well designed improvements.

Best regards, Uwe Ligges

> Thanks
[R] aggregate by two columns, sum not working while mean is
Dear Fellow Rers,

I have a table that looks like this:

    ca, la, 12
    ca, sd, 22
    ca, la, 33
    nm, al, 9
    ma, lx, 18
    ma, bs, 90
    ma, lx, 22

I want to sum the 3rd column grouped by the first and the second column, so that the result looks like this table:

    ca, la, 45
    ca, sd, 22
    nm, al, 9
    ma, lx, 40
    ma, bs, 90

The "ca, la" and "ma, lx" rows are sums. I tried aggregate(table,list(table$V1,table$V2),sum/mean); sum was not working while mean worked. Can anybody give a hint?

Thanks. Guanrao
Re: [R] aggregate by two columns, sum not working while mean is
This seems to work fine:

    > x <- "ca, la, 12
    + ca, sd, 22
    + ca, la, 33
    + nm, al, 9
    + ma, lx, 18
    + ma, bs, 90
    + ma, lx, 22
    + "
    > table <- read.csv(textConnection(x), header=FALSE)
    > aggregate(table$V3,list(table$V1,table$V2),mean)
      Group.1 Group.2    x
    1      nm      al  9.0
    2      ma      bs 90.0
    3      ca      la 22.5
    4      ma      lx 20.0
    5      ca      sd 22.0
    > aggregate(table$V3,list(table$V1,table$V2),sum)
      Group.1 Group.2  x
    1      nm      al  9
    2      ma      bs 90
    3      ca      la 45
    4      ma      lx 40
    5      ca      sd 22

On 6/7/07, Guanrao Chen [EMAIL PROTECTED] wrote:

> Dear Fellow Rers, I have a table looks like this: ca, la, 12 ca, sd, 22 ca, la, 33 nm, al, 9 ma, lx, 18 ma, bs, 90 ma, lx, 22 I want to sum the 3rd column grouped by the first and the second column, so the result look like this table: ca, la, 45 ca, sd, 22 nm, al, 9 ma, lx, 40 ma, bs, 90 The two rows with are sums. I tried aggregate(table,list(table$V1,table$V2),sum/mean), sum was not working while mean worked. Can anybody give a hint? Thanks. Guanrao

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve?
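A likely explanation for the original failure, stated with some caution: passing the whole data frame to aggregate() applies the function to V1 and V2 as well, and sum() is not defined for factor columns, whereas aggregating only V3 works. The formula interface expresses the same grouping; the sketch below uses the thread's data, with `tab` as an arbitrary object name.

```r
# The data from the thread, read from an inline string.
tab <- read.csv(text = "ca,la,12
ca,sd,22
ca,la,33
nm,al,9
ma,lx,18
ma,bs,90
ma,lx,22", header = FALSE, stringsAsFactors = FALSE)

# Sum and mean of V3 for each (V1, V2) combination; only V3 is aggregated.
sums  <- aggregate(V3 ~ V1 + V2, data = tab, FUN = sum)
means <- aggregate(V3 ~ V1 + V2, data = tab, FUN = mean)
sums
```

Calling the object `tab` rather than `table` also avoids masking the base function table().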
[R] Garch question
Hello,

I am trying to fit a GARCH model, but continue to get error messages. Here is what I am trying and the error message I get:

    > garchmodel <- garchFit(~arma(0,1), ~garch(0,1), series=junk)
    Error in as.data.frame.default(data) : cannot coerce class "formula" into a data.frame

junk is a time series. Any suggestions?

Thanks in advance, alison

-- Alison Weir, PhD Statistics Faculty Advisor University of Toronto Mississauga (905) 828 -3946
Re: [R] Garch question
On 07/06/07, alison weir [EMAIL PROTECTED] wrote:

> Hello, I am trying to fit a GARCH model, but continue to get error messages. Here is what I am trying and the error message I get:
>
>     > garchmodel <- garchFit(~arma(0,1), ~garch(0,1), series=junk)
>     Error in as.data.frame.default(data) : cannot coerce class "formula" into a data.frame
>
> junk is a time series.

I am using R version 2.5.0.

> Any suggestions? Thanks in advance, alison

-- Alison Weir, PhD Statistics Faculty Advisor University of Toronto Mississauga (905) 828 -3946
Re: [R] comparison of two logistic regression models
On Thu, 7 Jun 2007, David Barron wrote:

> You could use lmer in the lme4 package to fit the logistic regression with random effect as it does report the AIC.

Indeed you could (lmer reports _an approximation_ to the AIC), but AIC comparison between these two models is not valid: whereas they are nested, the smaller model is on the boundary of the parameter space for the larger one, and that violates one of the assumptions in the derivation of AIC. From simulation studies I have heard seminar talks about, it makes a large practical difference as well.

Without knowing why Anna-Maria wants to 'compare two models', I could not begin to offer advice. Generally one should test how well each does the task to hand, whatever that is.

> On 07/06/07, Anna-Maria Tyriseva [EMAIL PROTECTED] wrote:
>
>> Dear list members! Could you help me? I would like to compare two models: a) logistic regression model, 3 factors as independents b) logistic regression model, 3 factors and one random effect as independents (function glmmPQL). AIC are not available with PQL and model comparison using ANOVA is not possible. What should I do? Thanks in advance. Anna-Maria

-- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
[R] reading BMP or TIFF files
I realize that this question has been asked before (2003);

    From: Yi-Xiong Zhou
    Date: Sat 22 Nov 2003 - 10:57:35 EST

but I am hoping that the answer has changed. Namely, I would rather read the BMP (or TIFF) files directly instead of putting them through a separate utility for conversion, as suggested by

    From: Prof Brian Ripley
    Date: Sat 22 Nov 2003 - 15:23:33 EST

> Even easier is to convert .bmp to .pnm by an external utility. For example, `convert' from the ImageMagick suite (www.imagemagick.org) can do this.

Thanks, Robert Meglen [EMAIL PROTECTED]
[R] which syntax to use for ordered logit
Hi everybody,

I would like to fit an ordered logistic model, but can't figure out which syntax / library fits best. I have the answer possibilities in a matrix (-1 0 1 2 3); these are saved as factors. I guess I need something pretty basic. I tried VGAM and polr, but did not get what I wanted. Which library / package / syntax would you prefer?

Thanks in advance! matthias
[R] Using Akima with nearly-gridded data
I am using the Akima interpolation package to generate an interpolated color contour plot. It is working very well, except for one problem. The data that I have represents real-time readings from a thermistor string vs. time, so the data points are often very nearly in a rectangular array, since the thermistors are read at regular time intervals and they are equally spaced physically. However, readings are sometimes delayed or missed, so I cannot assume that it will be a regular grid. Hence Akima. However, Akima simply will not work if the first three points are collinear (which is easy to get around), and it often leaves blank triangles in seemingly arbitrary places in the plot. It seems that the algorithm in Akima for building the triangles that it uses internally to do the interpolation is having a very hard time dealing with nearly regularly-spaced data points. The only way I have found to get Akima to work, is to slightly perturb the data points by adding random seconds to the times (the temperatures are read every 5 minutes, so a few seconds aren't going to matter). More recently I have had some luck simply feeding the points into the algorithm in a pseudo-randomized order. But then, of course, the outcome is largely the luck of the draw and sometimes the plot still ends up with a scattering of white triangles, or artifacts on the edges of the plot. Does anyone have any suggestions as to how to make this work consistently?

-- Tom Hansen Senior Information Processing Consultant UWM Great Lakes WATER Institute www.glwi.uwm.edu [EMAIL PROTECTED]
[R] how to input data from the keyboard
Hello everybody,

I wish to input data from the keyboard. In C++ it would look like this:

    printf("Input parameter Alpha= ");
    scanf("%d", &alpha);

How would this be done in R? Thanks for your help.

Bye, Miguel.

-- View this message in context: http://www.nabble.com/how-to-input-data-from-the-keyboard-tf3885387.html#a11013164 Sent from the R help mailing list archive at Nabble.com.
[R] new data frame for loop
I have a data frame with three columns, one coded as a factor. I would like to separate my data out into separate data frames according to the factor level. Below is a simple example to illustrate. I can get R to return the data in the correct format but cannot work out how to get separate data frames. I am a newcomer to programming in R so am not sure what I am missing!

Thanks, Emily

    a <- seq(1,20, by=2)
    b <- seq(1,30, by=3)
    ID <- as.factor(c(1,1,1,2,2,2,3,3,3,3))
    df <- data.frame(a,b,ID)
    for(i in 1:length(unique(ID))) {
      df2 <- subset(df, select=a:b, ID==ID[i])
    }
Re: [R] how to input data from the keyboard
Please do your homework: help.search("input")

Bert Gunter Genentech Nonclinical Statistics

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Miguel Caro
Sent: Thursday, June 07, 2007 11:01 AM
To: r-help@stat.math.ethz.ch
Subject: [R] how to input data from the keyboard

> Hello everybody, i wish to input data from the keyboard. In C++ it would seem like this: printf("Input parameter Alpha= "); scanf("%d", alpha); how would be in R? Thanks for your help. Bye Miguel.
>
> -- View this message in context: http://www.nabble.com/how-to-input-data-from-the-keyboard-tf3885387.html#a11013164 Sent from the R help mailing list archive at Nabble.com.
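For the archive, a short sketch of what that search leads to: readline() prompts for one line in an interactive session, and readLines() on a connection does the same job for scripts. The wrapper `read_alpha` and its arguments are made up for illustration.

```r
# Prompt for an integer, roughly like printf() + scanf("%d", ...) in C.
# con = NULL uses readline(), which only waits for input interactively;
# for scripts, pass a connection such as file("stdin").
read_alpha <- function(prompt = "Input parameter Alpha= ", con = NULL) {
  if (is.null(con)) {
    answer <- readline(prompt)
  } else {
    cat(prompt)
    answer <- readLines(con, n = 1)
  }
  as.integer(answer)   # NA (with a warning) if the text is not a number
}

# Interactive use:  alpha <- read_alpha()
# Scripted use:     alpha <- read_alpha(con = file("stdin"))
```

scan(n = 1) is another base R option when several numeric values are to be read at once.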
Re: [R] new data frame for loop
Hi Emily,

Emily Broccoli wrote:

> I have a data frame with three columns, one coded as a factor. I would like to separate my data out into separate data frames according to the factor level. Below is a simple example to illustrate. I can get R to return the data in the correct format but cannot work out how to get separate data frames. I am a newcomer to programming in R so am not sure what I am missing! Thanks, Emily
>
>     a <- seq(1,20, by=2)
>     b <- seq(1,30, by=3)
>     ID <- as.factor(c(1,1,1,2,2,2,3,3,3,3))
>     df <- data.frame(a,b,ID)

The function split will give you a list of data frames split according to a factor:

    > split(df, df$ID)
    $`1`
      a b ID
    1 1 1  1
    2 3 4  1
    3 5 7  1

    $`2`
       a  b ID
    4  7 10  2
    5  9 13  2
    6 11 16  2

    $`3`
        a  b ID
    7  13 19  3
    8  15 22  3
    9  17 25  3
    10 19 28  3

See ?split.

HTH, Tobias

-- Tobias Verbeke - Consultant Business Decision Benelux Rue de la révolution 8 1000 Brussels - BELGIUM +32 499 36 33 15 [EMAIL PROTECTED]
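A short follow-up sketch on using the result: the list returned by split() can be indexed by factor level, and the original loop works too once each piece is stored in a list instead of overwriting df2. The object names `pieces` and `pieces_loop` are illustrative.

```r
a  <- seq(1, 20, by = 2)
b  <- seq(1, 30, by = 3)
ID <- factor(c(1, 1, 1, 2, 2, 2, 3, 3, 3, 3))
df <- data.frame(a, b, ID)

# One data frame per level, keeping only columns a and b.
pieces <- split(df[, c("a", "b")], df$ID)
pieces[["2"]]            # the rows with ID == 2

# The loop from the question, fixed to keep every piece:
pieces_loop <- list()
for (lev in levels(df$ID)) {
  pieces_loop[[lev]] <- subset(df, ID == lev, select = a:b)
}
```

Keeping the pieces in a named list usually makes later steps easier than creating separately named data frames, since lapply() can then run the same analysis on each.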
Re: [R] reading BMP or TIFF files
On 6/7/07, Bob Meglen [EMAIL PROTECTED] wrote:

> I realize that this question has been asked before (2003); From: Yi-Xiong Zhou Date: Sat 22 Nov 2003 - 10:57:35 EST but I am hoping that the answer has changed. Namely, I would rather read the BMP (or TIFF) files directly instead of putting them through a separate utility for conversion as suggested by,

What would you like to do with the images? The GdkPixbuf bindings provided by the RGtk2 package allow you to read both types of images. In conjunction with the cairoDevice package, you could mix the image with R graphics. Another way might be to use some Java library via rJava and use the Java graphics device.

Michael

> From: Prof Brian Ripley Date: Sat 22 Nov 2003 - 15:23:33 EST Even easier is to convert .bmp to .pnm by an external utility. For example, `convert' from the ImageMagick suite (www.imagemagick.org) can do this. Thanks, Robert Meglen [EMAIL PROTECTED]
[R] Averaging across rows columns
I use Windows, R version 2.4.1. I have a dataset in which columns 1-3 are replicates, 4-6 are replicates, etc. I need to calculate an average for every set of replicates (columns 1-3, 4-6, 7-9, etc.) AND each set of replicates should be averaged every 14 rows. (For more detail: to measure fruit color using a spectrometer, I recorded three readings per fruit, the replicates, that I need to average to get one reading per fruit; each row is a point in the light spectrum and I need to calculate an average reading every 5 nm, i.e. every 14 rows, for each fruit.)

Someone proposed the following to another user who wanted an average across columns:

    a <- matrix(rnorm(360),nr=10)
    b <- rep(1:12,each=3)
    avgmat <- aggregate(a,by=list(b))

I tried doing this to get started with the columns first, but it asks for an argument FUN that has no default. The help for aggregate isn't helping me much (a new R user) to discover what value to give to FUN: 'average' doesn't seem to exist, and 'sum' (whatever it is supposed to sum) gives an error saying that arguments should have the same length.

Any help will be much appreciated! Silvia.

-- View this message in context: http://www.nabble.com/Averaging-across-rows---columns-tf3885900.html#a11014649 Sent from the R help mailing list archive at Nabble.com.
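One way to do both averages, sketched in base R under the assumption that the readings are in a numeric matrix (rows are wavelengths, columns are replicates in consecutive groups). In R the averaging function is called mean, and aggregate() groups rows rather than columns, which would explain the length error. The helper `block_mean` is made up for illustration.

```r
# Average a numeric matrix over consecutive blocks of rows and columns.
block_mean <- function(m, row_block, col_block) {
  row_group <- (seq_len(nrow(m)) - 1) %/% row_block   # 0,0,...,1,1,...
  col_group <- (seq_len(ncol(m)) - 1) %/% col_block
  # Average the replicate columns first; rowsum() adds up rows per group,
  # so work on the transpose and divide by the group sizes.
  col_avg <- t(rowsum(t(m), col_group) / as.vector(table(col_group)))
  # Then average each block of rows.
  rowsum(col_avg, row_group) / as.vector(table(row_group))
}

# Made-up example: 28 wavelengths, 2 fruits with 3 replicates each.
m   <- matrix(1:168, nrow = 28, ncol = 6)
res <- block_mean(m, row_block = 14, col_block = 3)
res
```

The result has one row per 14-row band and one column per fruit; a trailing incomplete block is averaged over however many rows or columns it contains.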
[R] MITOOLS: Error in eval(expr, envir, enclos) : invalid 'envir' argument
R-users and helpers:

I am using Amelia, mitools and cmprsk to fit cumulative incidence curves to multiply imputed datasets. The error message that I get,

    Error in eval(expr, envir, enclos) : invalid 'envir' argument

occurs when I try to fit models to the 50 imputed datasets using the with.imputationList function of mitools. The problem seems to occur intermittently, depending on the type of model that I try to fit to the datasets as well as the previous code that has been executed during the R session. I have read the previous postings for similar problems and have tried renaming many of my objects, which has not solved the problem.

What is weird is that I have not been able to reproduce the problem using other standard survival datasets (like pbc). It therefore seems to have something to do with my particular analysis, likely the names of my objects. I cannot find the source of the problem and would greatly appreciate any help.

Brant

Below is my session information and some code demonstrating the issue occurring with coxph.

    > sessionInfo()
    R version 2.5.0 (2007-04-23)
    i386-pc-mingw32

    locale:
    LC_COLLATE=English_United States.1252;LC_CTYPE=English_United

    attached base packages:
    [1] splines grid stats graphics grDevices utils
    [7] datasets methods base

    other attached packages:
          cmprsk      mitools       Amelia     survival    RGraphics latticeExtra
         "2.1-7"        "1.0"     "1.1-23"       "2.31"      "1.0-6"      "0.2-1"
         lattice      foreign         MASS
        "0.15-8"     "0.8-20"     "7.2-34"

    > str(utt.mi)  # My dataset
    'data.frame': 168 obs. of 25 variables:
     $ age      : num 79.5 67.1 63.7 76.9 69.0 ...
     $ gender   : Factor w/ 2 levels "0","1": 1 2 2 2 1 2 2 2 2 2 ...
     $ symptoms : Factor w/ 2 levels "0","1": 1 2 1 1 2 1 2 1 1 2 ...
     $ site     : Factor w/ 3 levels "1","2","3": 1 1 2 1 2 1 1 2 1 3 ...
     $ multifoc : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 2 ...
     $ ctnm     : Factor w/ 2 levels "1","2": 1 NA 2 1 2 2 1 NA 1 2 ...
     $ prebca   : Factor w/ 2 levels "0","1": 1 1 1 1 2 1 2 1 1 1 ...
     $ precystec: Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
     $ surgery  : Factor w/ 2 levels "1","2": 1 1 2 1 2 1 1 1 1 1 ...
     $ ptnm.t   : Factor w/ 5 levels "0","1","2","3",..: 3 3 5 1 2 4 2 1 1 5 ...
     $ grade    : Factor w/ 3 levels "1","2","3": 2 2 3 2 2 3 2 1 1 3 ...
     $ histol   : Factor w/ 2 levels "0","1": 2 2 2 2 2 2 2 2 2 2 ...
     $ postbca  : Factor w/ 2 levels "0","1": 2 1 2 2 2 NA 2 1 1 1 ...
     $ postcyst : Factor w/ 2 levels "0","1": 1 1 1 2 1 1 1 1 1 1 ...
     $ chemo    : Factor w/ 2 levels "0","1": 1 1 2 1 1 2 2 1 1 2 ...
     $ mets     : Factor w/ 2 levels "0","1": 1 2 2 1 2 2 2 1 1 2 ...
     $ status   : Factor w/ 4 levels "1","2","3","4": 1 3 2 1 3 3 3 1 1 3 ...
     $ futime   : num 10.46 1.15 2.43 2.83 6.82 ...
     $ smk      : Factor w/ 2 levels "0","1": 2 2 2 1 2 1 2 1 2 2 ...
     $ surg.yr  : int 88 94 92 93 86 85 95 98 91 85 ...
     $ nodes    : Factor w/ 2 levels "0","1": 1 1 2 1 1 2 1 1 1 2 ...
     $ os       : num 0 1 0 0 1 1 1 0 0 1 ...
     $ css      : num 0 1 0 0 1 1 1 0 0 1 ...
     $ rfs      : num 0 1 1 0 1 1 1 0 0 1 ...
     $ comp     : num 0 1 1 0 1 1 1 0 0 1 ...

    > set.seed(200)
    > M <- 50  # Number of imputations
    > am.imp <- amelia(utt.mi, m=M, p2s=1, startvals=1, write.out=F,
    +     idvars=c('os','css','rfs','comp'),
    +     noms=c('gender','symptoms','site','multifoc','ctnm','prebca','precystec',
    +         'smk','surgery','ptnm.t','nodes','grade','histol','postbca',
    +         'postcyst','chemo','mets','status'),
    +     sqrts=c('futime'))
    -- Imputation 1 --
     1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40
    snip
    -- Imputation 50 --
     1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26

    > MIset <- imputationList(am.imp[1:M])
    > mifit <- with(MIset,
    +     coxph(Surv(futime, os) ~ age + symptoms + ctnm + smk))
    Error in eval(expr, envir, enclos) : invalid 'envir' argument
Re: [R] Display Multiple page lattice plots
On 6/7/07, [EMAIL PROTECTED] wrote:

> Gudday,
>
> I am generating a series of lattice contourplots that are conditioned on a
> variable (Year) that has 27 different levels. If I try to put them all on
> one plot, it ends up pretty messy and you can't really read anything, so
> instead I have set the layout to 3x3, thus generating three pages of nine
> plots each. The problem is that I can't display all these on screen at
> once, because each subsequent page overwrites the previous one. I have
> found in the mailing lists how to print them to separate files without any
> problems, e.g.
>
>   p <- contourplot(log10(x) ~ lat*long | Year, data=data.tbl, layout=c(3,3))
>   png(file="Herring Distribution%02d.png", width=800, height=800)
>   print(p)
>   dev.off()
>
> but there doesn't seem to be anything about how to output multiple pages
> to the screen... I suspect that I may need to use the page=... option in
> the contourplot command, but I can't seem to make it work. It's a simple,
> and not particularly important problem, but it sure is bugging me!

You haven't told us what you want to happen exactly. Gabor's solution will work (on Windows), and a multi-page PDF file is a similar option that's portable. Here's another option if you want multiple windows:

  xyplot(1:10 ~ 1:10 | gl(3, 1, 10), layout = c(1, 1),
         page = function(n) dev.copy(x11))

You should replace x11 with the appropriate choice on your platform. This will produce an extra copy of the last page, which you can suppress by making use of 'n' inside your page function. (Unfortunately page = function(n) x11() does not work, even though that would have been more natural.)

Another option is to print your trellis object in parts; e.g.

  p <- contourplot(log10(x) ~ lat*long | Year, data=data.tbl, layout=c(3,3))
  x11()
  p[1:9]
  x11()
  p[10:18]
  x11()
  p[19:27]

-Deepayan
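The two portable options mentioned in this reply (a multi-page PDF, and printing the trellis object in parts) can be sketched as follows. This is only an illustration and assumes the lattice package is installed; the data frame `d` and its columns are made up and stand in for data.tbl from the original post, and xyplot is used instead of contourplot to keep the data simple.

```r
library(lattice)

# Hypothetical stand-in for data.tbl: 27 conditioning levels, 10 points each
set.seed(1)
d <- data.frame(x = rnorm(270), y = rnorm(270),
                Year = factor(rep(1980:2006, each = 10)))

p <- xyplot(y ~ x | Year, data = d, layout = c(3, 3))

# Option 1: one multi-page PDF (27 panels at 9 per page)
out <- tempfile(fileext = ".pdf")
pdf(out)
print(p)
dev.off()

# Option 2: print the trellis object in parts, one device page per part
parts <- list(1:9, 10:18, 19:27)
out2 <- tempfile(fileext = ".pdf")
pdf(out2)
for (idx in parts) print(p[idx])
dev.off()
```

For on-screen display, the same loop should work with a new screen device (x11(), windows() or similar, depending on platform) opened before each print() instead of a single pdf() call.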
Re: [R] Mandriva Spring 2007 and R
Thank you everyone for all of your suggestions!! I am going to try compiling R from source; it should be the best exercise to broaden my understanding of Linux.

Best,
Jonathan

Roland Rau [EMAIL PROTECTED] wrote:

> Hi Jonathan,
>
> Jonathan Morse wrote:
>> I am new to Linux (not to R) and recently installed Mandriva Spring 2007
>> on my partitioned hard drive. My next objective is to install R in the
>> Linux environment; unfortunately Mandriva is not one of the Linux
>> distributions available for download... Could someone please let me know
>> which distribution I should use?
>
> One possibility is, of course, that you compile it yourself for your
> computer. Compiling R was my first shot at compiling programs when I was
> new to Linux, and it was not very difficult. It is described nicely in the
> R Installation and Administration manual:
> http://cran.r-project.org/doc/manuals/R-admin.html
>
> Basically, you only need to take care of the following steps to get
> started:
> - did you download and unpack the source distribution (see section 1.1 of
>   the manual)?
> - do you have the required tools installed (see section A.1 of the
>   manual)? (C compiler, Fortran compiler, libreadline, libjpeg, libpng,
>   tex/latex, Perl5, xorg-x11-dev)
> - compilation (see section 2.1 in the manual)
>
> I hope this helps.
>
> Best,
> Roland
Re: [R] Averaging across rows & columns
Check out rowMeans to average over the replicate columns first, i.e.:

  means <- data.frame(t1=rowMeans(a[,1:3]), t2=rowMeans(a[,4:6]))  # and so on for the remaining triplets

Then, if you want to aggregate every 14 rows:

  aggregate(means, by=list(rows=rep(1:(nrow(means)/14), each=14)), mean)

Or something...

-----Original Message-----
From: [EMAIL PROTECTED] on behalf of Silvia Lomascolo
Sent: Thu 07/06/2007 8:26 PM
To: r-help@stat.math.ethz.ch
Subject: [R] Averaging across rows & columns

I use Windows, R version 2.4.1. I have a dataset in which columns 1-3 are replicates, 4-6 are replicates, etc. I need to calculate an average for every set of replicates (columns 1-3, 4-6, 7-9, etc.) AND each set of replicates should be averaged every 14 rows. (For more detail: to measure fruit color using a spectrometer, I recorded three readings per fruit, the replicates, that I need to average to get one reading per fruit; each row is a point in the light spectrum and I need to calculate an average reading every 5 nm, i.e. every 14 rows, for each fruit.)

Someone proposed to another user who wanted an average across columns to do

  a <- matrix(rnorm(360), nr=10)
  b <- rep(1:12, each=3)
  avgmat <- aggregate(a, by=list(b))

I tried doing this to get started with the columns first, but it asks for an argument FUN that has no default. The help for aggregate isn't helping me much (a new R user) to discover what value to give to FUN: 'average' doesn't seem to exist, and 'sum' (whatever it is supposed to sum) gives an error saying that arguments should have the same length.

Any help will be much appreciated!

Silvia.
Re: [R] Averaging across rows & columns
michael watson (IAH-C) wrote:

> Check out rowMeans to average over replicate columns first, ie:
>
>   means <- data.frame(t1=rowMeans(a[,1:3]), t2=rowMeans(a[,4:6]), etc)
>
> Then, if you want to aggregate every 14 rows:
>
>   aggregate(means, by=list(rows=rep(1:(nrow(means)/14), each=14)), mean)
>
> Or something...

YES! This seems to work. Thank you!
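The rowMeans-then-aggregate approach from this thread can be sketched end to end as below. The matrix `a` is simulated and its size (28 rows, 6 columns, so two fruits with three replicate readings each) is an assumption made only for illustration. Note that aggregate() groups rows, not columns, which is why the earlier attempt with a grouping vector of length 36 on a 10-row matrix complained about argument lengths.

```r
# Hypothetical data: 28 spectrum rows, 6 columns = 2 fruits x 3 replicates
set.seed(42)
a <- matrix(rnorm(28 * 6), nrow = 28)

# Step 1: average each block of three replicate columns
fruit <- rep(1:(ncol(a) / 3), each = 3)
means <- sapply(split(seq_len(ncol(a)), fruit),
                function(cols) rowMeans(a[, cols, drop = FALSE]))
colnames(means) <- paste("fruit", unique(fruit), sep = "")

# Step 2: average every 14 consecutive rows; FUN must be given explicitly
block <- rep(1:(nrow(means) / 14), each = 14)
avg <- aggregate(as.data.frame(means), by = list(block = block), FUN = mean)
print(avg)
```

The sketch assumes the number of columns is a multiple of 3 and the number of rows a multiple of 14; real data that does not divide evenly would need the grouping vectors built differently.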
Re: [R] update packages with R on Vista: error
If R is installed within Program Files, one of Vista's security settings (User Account Control) may interfere with the update process. The setting may be disabled globally by choosing: Windows (Start) menu, Control Panel, User Accounts and Family Safety (green title), User Accounts (green title), and Turn User Account Control on or off (very bottom). You will be prompted for permission to continue; click Continue. On the next screen you will see a checkbox titled "Use User Account Control (UAC) to help protect your computer". Uncheck this and click the OK button to save the changes. Windows Vista will now allow programs, including R, to update files in Program Files.

Rod.

2007/6/7, Stefan Grosse [EMAIL PROTECTED]:

> Actually the packages R wants to update are: VR, cluster, lattice, mgcv,
> nlme and rcompgen.
>
> As described in the R-Win-FAQ, I created a .Renviron file containing the
> path to the win-library that R already created (R_LIBS=C: ... ). I also
> tried to add R_LIBS= as an Rgui parameter from within Tinn-R. Additionally
> I tried placing a file named Renviron.site in the etc directory. Nothing
> has worked thus far.
>
> Interestingly, installing packages works fine even without specifying the
> R_LIBS path manually with any of the above mentioned methods. Even more
> puzzling, when I install e.g. nlme manually via install.packages("nlme")
> it works, but R still wants to update it, even though library(nlme), ?nlme
> shows that the latest version is installed.
>
> I would guess there is some problem with the library path variable in the
> update program...
>
> Stefan
>
> -------- Original Message --------
> Subject: Re: [R] update packages with R on Vista: error
> From: Prof Brian Ripley [EMAIL PROTECTED]
> To: Stefan Grosse [EMAIL PROTECTED]
> Date: 07.06.2007 13:01
>
>> See the rw-FAQ, which describes this in detail. Almost certainly you are
>> trying to update the package 'cluster' which is in the main library. But
>> as you used the GUI, we can't see that.
>>
>> On Thu, 7 Jun 2007, Stefan Grosse wrote:
>>
>>> Dear R-list,
>>>
>>> I have encountered the following error message trying to update R
>>> packages:
>>>
>>>   update.packages(ask='graphics')
>>>   Warning in install.packages(update[instlib == l, "Package"], l, contriburl = contriburl, :
>>>     'lib' is not writable
>>>   Error in install.packages(update[instlib == l, "Package"], l, contriburl = contriburl, :
>>>     unable to install packages
>>>
>>> I do not remember having this problem on the last update, when R
>>> installed the files in the Documents/R folder on my user account. Any
>>> ideas how to handle this? I made the directories completely writable, so
>>> I do not know where the problem is now (especially since update worked
>>> before...)
>>>
>>> Stefan
>>>
>>> PS: Tinn-R 1.19.2.3 + R 2.5.0 on Vista Business
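Since the warning above says that 'lib' is not writable, one way to see which library is involved is to list the library search path together with its write permissions. The following is only a sketch: `update_writable` is a hypothetical helper name, not part of R, and file.access() is documented as not always reliable on Windows, so its answer should be treated as a hint rather than proof.

```r
# List every library on the search path and whether it appears writable.
# file.access(mode = 2) returns 0 when write permission is granted.
libs <- .libPaths()
writable <- file.access(libs, mode = 2) == 0
print(data.frame(library = libs, writable = writable))

# Hypothetical helper: update only packages in libraries we can write to.
# It is defined but not called here because it needs network access.
update_writable <- function() {
  ok <- libs[writable]
  if (length(ok) > 0) update.packages(lib.loc = ok, ask = FALSE)
}
```

Packages that live only in the main (non-writable) library, such as the recommended packages listed in the thread, would be skipped by this helper rather than updated.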
[R] character to time problem
I am trying to clean up some dates and I am clearly doing something wrong. I have laid out an example that seems to show what is happening with the real data. The coding is lousy but it looks like it should have worked.

Can anyone suggest a) why I am getting that NA appearing after the strptime() command and b) why the NA is disappearing in the sort()? It happens with na.rm=TRUE and na.rm=FALSE.

  aa <- data.frame( c("12/05/2001", " ", "30/02/1995", NA, "14/02/2007", "M" ) )
  names(aa) <- "times"
  aa[is.na(aa)] <- "M"
  aa[aa==" "] <- "M"
  bb <- unlist(subset(aa, aa[,1] !="M"))
  dates <- strptime(bb, "%d/%m/%Y")
  dates
  sort(dates)

Session Info

R version 2.4.1 (2006-12-18)
i386-pc-mingw32

locale:
LC_COLLATE=English_Canada.1252;
LC_CTYPE=English_Canada.1252;
LC_MONETARY=English_Canada.1252;
LC_NUMERIC=C;LC_TIME=English_Canada.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
gdata Hmisc
2.3.1 3.3-2

(Yes I know I'm out of date but I don't like upgrading just as I am finishing a project)

Thanks
Re: [R] character to time problem
Perhaps you want one of these:

> sort(as.Date(aa$times, "%d/%m/%Y"))
[1] "1995-03-02" "2001-05-12" "2007-02-14"

> sort(as.Date(aa$times, "%d/%m/%Y"), na.last = TRUE)
[1] "1995-03-02" "2001-05-12" "2007-02-14" NA NA
[6] NA
[R] power of logistic regression in case control design
Hi All,

This is not directly related to R, but I am posting the question here since there are a lot of statistics experts on this list.

I want to calculate the power of logistic regression using the likelihood ratio test in an unmatched case-control design. The paper I have read is "Power calculations for likelihood ratio tests in generalized linear models" by Steven G. Self (1992). It gives a formula for calculating power for a prospective cohort design. Can I plug in the various parameters from a case-control study? That is, can I pretend that the case-control data I have come from a prospective cohort study? I guess I can do this, since in principle logistic regression is valid for estimating the odds ratio in both case-control and cohort designs.

Thanks a lot!
[R] Ubu edgy + latest CRAN R + Rmpi = no go
I'm just curious if anyone else has had problems with this configuration. I added the CRAN repository to apt and installed R 2.5.0 with apt-get. I then did an install.packages("Rmpi") on the cluster nodes. Rmpi loads and lamhosts() shows the nodes, but mpi.spawn.Rslaves() fails (something to do with temp files?).

Rmpi works fine with the Edgy-native version of R (2.3.x) when installing Edgy's r-cran-rmpi with apt. (But I need some other packages that only work in 2.4+!) Could this be a problem with the latest Ubuntu debs on CRAN? The Rmpi author says his R 2.5 setup works fine.

CC me please as I'm not subscribed.

THK

--
Timothy H. Keitt, University of Texas at Austin
Contact info and schedule at http://www.keittlab.org/tkeitt/
Reprints at http://www.keittlab.org/tkeitt/papers/
ODF attachment? See http://www.openoffice.org/
Re: [R] character to time problem
Hi John,

a) The NA appears because '30/02/1995' is not a valid date.

> strptime('30/02/1995', "%d/%m/%Y")
[1] NA

b) dates has the classes shown below, so sort() dispatches to sort.POSIXlt, which in turn sets na.last to NA by default. ?order details how NAs are handled in ordering data via na.last.

> class(dates)
[1] "POSIXt" "POSIXlt"

> methods(sort)
[1] sort.default sort.POSIXlt

> sort.POSIXlt
function (x, decreasing = FALSE, na.last = NA, ...)
x[order(as.POSIXct(x), na.last = na.last, decreasing = decreasing)]
<environment: namespace:base>

After resetting the Feb. date the code works.

HTH,
-jason
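Both points from this thread (an impossible date parses to NA, and sort() drops NA unless na.last says otherwise) can be reproduced in a few lines. This is a minimal sketch using as.Date rather than strptime, and it assumes a current version of R, where an impossible date such as 30 February is parsed as NA.

```r
x <- c("12/05/2001", "30/02/1995", "14/02/2007")

# 30 February does not exist, so parsing yields NA for that element
d <- as.Date(x, format = "%d/%m/%Y")
is.na(d)

# The default na.last = NA silently removes missing values
sorted_default <- sort(d)

# na.last = TRUE keeps them, placed at the end
sorted_keep <- sort(d, na.last = TRUE)

length(sorted_default)
length(sorted_keep)
```

Checking is.na() on the parsed dates before sorting is a cheap way to catch invalid entries in real data instead of losing them silently.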
[R] Tools For Preparing Data For Analysis
As noted on the R-project web site itself ( www.r-project.org -> Manuals -> R Data Import/Export ), it can be cumbersome to prepare messy and dirty data for analysis with the R tool itself. I've also seen at least one S programming book (one of the yellow Springer ones) that says, more briefly, the same thing. The R Data Import/Export page recommends examples using SAS, Perl, Python, and Java. It takes a bit of courage to say that ( when you go to a corporate software web site, you'll never see a page saying "This is the type of problem that our product is not the best at, here's what we suggest instead" ). I'd like to provide a few more suggestions, especially for volunteers who are willing to evaluate new candidates.

SAS is fine if you're not paying for the license out of your own pocket. But maybe one reason you're using R is that you don't have thousands of spare dollars. Using Java for data cleaning is an exercise in sado-masochism; Java has a learning curve (almost) as difficult as C++.

There are different types of data transformation, and for some data preparation problems an all-purpose programming language is a good choice ( e.g. Perl, or maybe Python/Ruby ). Perl, for example, has excellent regular expression facilities.

However, for some types of complex, demanding data preparation problems, an all-purpose programming language is a poor choice. For example: cleaning up and preparing clinical lab data and adverse event data. You could do it in Perl, but it would take way, way too much time. A specialized programming language is needed. And since data transformation is quite different from data query, SQL is not the ideal solution either.

There are only three statistical programming languages that are well-known, all dating from the 1970s: SPSS, SAS, and S. SAS is more popular than S for data cleaning.

If you're an R user with difficult data preparation problems, frankly you are out of luck, because the products I'm about to mention are new, unknown, and therefore regarded as immature. And while the founders of these products would be very happy if you kicked the tires, most people don't like to look at brand new products. Most innovators and inventors don't realize this; I've learned it the hard way.

But if you are a volunteer who likes to help out by evaluating, comparing, and reporting upon new candidates, you could certainly help out R users and the developers of the products by kicking the tires of these products. And there is a huge need for such volunteers.

1. DAP
This is an open source implementation of SAS.
The founder: Susan Bassein
Find it at: directory.fsf.org/math/stats (GNU GPL)

2. PSPP
This is an open source implementation of SPSS. The relatively early version number might not give a good idea of how mature the data transformation features are; it reflects the fact that he has only started doing the statistical tests.
The founder: Ben Pfaff, either a grad student or professor at Stanford CS dept.
Also at: directory.fsf.org/math/stats (GNU GPL)

3. Vilno
This uses a programming language similar to SPSS and SAS, but quite unlike S. Essentially, it's a substitute for the SAS datastep, and it also transposes data and calculates averages and such. (No t-tests or regressions in this version.) I created this, mainly during the years 2001-2006. It's version 0.85, and has a fairly low bug rate, in my opinion. The tarball includes about 100 or so test cases used for debugging: for logical calculation errors, but not for extremely high volumes of data. The maintenance of Vilno has slowed down, because I am currently (desperately) looking for employment. But once I've found new employment and living quarters and settled in, I will continue to enhance Vilno in my spare time.
The founder: that would be me, Robert Wilkins
Find it at: code.google.com/p/vilno ( GNU GPL ) ( In particular, the tarball at code.google.com/p/vilno/downloads/list , since I have yet to figure out how to use Subversion ).

4. Who knows?
It was not easy to find out about the existence of DAP and PSPP. So who knows what else is out there. However, I think you'll find a lot more statistics software ( regression, etc. ) out there, and not so much data transformation software. Not many people work on data preparation software. In fact, the category is so obscure that there isn't one agreed term: data cleaning, data munging, data crunching, or just getting the data ready for analysis.
[R] Nonlinear Regression
Hello,

I followed the example on page 59, chapter 11 of the 'Introduction to R' manual. I entered my own x,y data and used least squares. My function has 5 parameters: p[1], p[2], p[3], p[4], p[5]. I plotted the x-y data. Then I used lines(spline(xfit,yfit)) to overlay best-fit curves on the data while changing the parameters.

My question is how to calculate the residual sum of squares. In the example they have the following:

  df <- data.frame(x=x, y=y)
  fit <- nls(y ~ SSmicmen(x, Vm, K), df)
  fit

In the second line, how would I input my function? Would it be:

  fit <- nls(y ~ myfunction(p[1], p[2], p[3], p[4], p[5]), df)

where myfunction is the actual function? My function doesn't have a name, so should I just enter it?

Thanks

--
View this message in context: http://www.nabble.com/Nonlinear-Regression-tf3886617.html#a11016968
Sent from the R help mailing list archive at Nabble.com.
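As a hedged illustration of what the question is asking for (it is not a reply from the thread): nls() accepts the model expression written directly in the formula, with named parameters and starting values supplied through `start`. The exponential model, the simulated data and the starting values below are all assumptions chosen so the example runs on its own; a five-parameter model would follow the same pattern with five names in `start`.

```r
# Simulated data from a two-parameter exponential decay plus noise
set.seed(1)
x <- seq(0, 10, length.out = 50)
y <- 5 * exp(-0.3 * x) + rnorm(50, sd = 0.1)
df <- data.frame(x = x, y = y)

# The model function is written inline in the formula;
# parameters get names and starting values via 'start'
fit <- nls(y ~ a * exp(-b * x), data = df, start = list(a = 4, b = 0.2))

# Residual sum of squares, computed two equivalent ways
rss <- sum(resid(fit)^2)
rss_from_deviance <- deviance(fit)
print(c(rss, rss_from_deviance))
```

Printing `fit` also reports the residual sum of squares alongside the parameter estimates.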
Re: [R] reading BMP or TIFF files
See the EBImage package on Bioconductor.

/Henrik

On 6/7/07, Bob Meglen [EMAIL PROTECTED] wrote:

> I realize that this question has been asked before (2003),
>
>   From: Yi-Xiong Zhou
>   Date: Sat 22 Nov 2003 - 10:57:35 EST
>
> but I am hoping that the answer has changed. Namely, I would rather read
> the BMP (or TIFF) files directly instead of putting them through a separate
> utility for conversion, as suggested by
>
>   From: Prof Brian Ripley
>   Date: Sat 22 Nov 2003 - 15:23:33 EST
>
>   Even easier is to convert .bmp to .pnm by an external utility. For
>   example, `convert' from the ImageMagick suite (www.imagemagick.org)
>   can do this.
>
> Thanks,
> Robert Meglen
> [EMAIL PROTECTED]
Re: [R] Tools For Preparing Data For Analysis
An additional option for Windows users is Micro Osiris
http://www.microsiris.com/

best
robert
[R] How to do clustering
Dear List,

I have another question to bother you with, about how to do clustering. My data consists of 49 columns (49 variables) and 238804 rows. I would like to do hierarchical clustering (unsupervised clustering and PCA). So far I have tried pvclust (www.is.titech.ac.jp/~shimo/prog/pvclust/), but R always fails with a "cannot allocate memory" error. I am curious which other packages can perform the clustering analysis in a memory-efficient way.

Meanwhile, is there any way that I can extract the features of each cluster? In other words, I would like to identify which features are responsible for classifying these variables (samples).

Thanks a lot!

Sincerely,
Alex
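The following is not an answer from the thread, only a sketch under a stated assumption: that the goal is to cluster the 49 columns (samples) rather than the 238804 rows. In that case the distance object holds only 49*48/2 values, so base R's hclust() is light on memory, and the PCA loadings give one rough indication of which rows separate the samples. The simulated matrix is much smaller than the real data and the column names are made up.

```r
# Simulated stand-in: 2000 rows (features) by 49 columns (samples)
set.seed(1)
x <- matrix(rnorm(2000 * 49), nrow = 2000,
            dimnames = list(NULL, paste("s", 1:49, sep = "")))

# Cluster the 49 columns: dist() on t(x) has only 49*48/2 entries
hc <- hclust(dist(t(x)), method = "average")
groups <- cutree(hc, k = 3)

# PCA with samples as observations; the loadings (rotation) indicate
# which rows contribute most to each principal component
pc <- prcomp(t(x))
top_rows <- order(abs(pc$rotation[, 1]), decreasing = TRUE)[1:10]
print(table(groups))
```

The choice of average linkage and of three clusters is arbitrary here; clustering the rows instead would need a different, more memory-conscious approach.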
Re: [R] How to load a big txt file
Dear Jim, It works great. I appreciate your help. Sincerely, Alex

On 6/7/07, jim holtman [EMAIL PROTECTED] wrote:

I took your data and duped the data line so I had 100,000 rows and it took 40 seconds to read in when specifying colClasses

system.time(x <- read.table('/tempxx.txt', header=TRUE, colClasses=c('factor', rep('numeric',49))))
   user  system elapsed
  40.98    0.46   42.39
str(x)
'data.frame': 102272 obs. of 50 variables:
 $ ID                                               : Factor w/ 1 level "SNP_A-1780271": 1 1 1 1 1 1 1 1 1 1 ...
 $ AIRNS_p_Sty5_Mapping250K_Sty_A09_50156.cel       : num 1.86 1.86 1.86 1.86 1.86 ...
 $ AIRNS_p_Sty5_Mapping250K_Sty_A11_50188.cel       : num 1.51 1.51 1.51 1.51 1.51 ...
 $ AIRNS_p_Sty5_Mapping250K_Sty_A12_50204.cel       : num 1.73 1.73 1.73 1.73 1.73 ...
 $ AIRNS_p_Sty5_Mapping250K_Sty_B09_50158.cel       : num 1.53 1.53 1.53 1.53 1.53 ...
 $ AIRNS_p_Sty5_Mapping250K_Sty_C01_50032.cel       : num 1.66 1.66 1.66 1.66 1.66 ...
 $ AIRNS_p_Sty5_Mapping250K_Sty_C12_50208.cel       : num 1.47 1.47 1.47 1.47 1.47 ...
 $ AIRNS_p_Sty5_Mapping250K_Sty_D03_50066.cel       : num 2.16 2.16 2.16 2.16 2.16 ...
 $ AIRNS_p_Sty5_Mapping250K_Sty_D08_50146.cel       : num 1.78 1.78 1.78 1.78 1.78 ...
 $ AIRNS_p_Sty5_Mapping250K_Sty_F03_50070.cel       : num 1.60 1.60 1.60 1.60 1.60 ...
 $ AIRNS_p_Sty5_Mapping250K_Sty_F12_50214.cel       : num 2.16 2.16 2.16 2.16 2.16 ...
 $ AIRNS_p_Sty5_Mapping250K_Sty_G09_50168.cel       : num 1.98 1.98 1.98 1.98 1.98 ...
 $ DOLCE_p_Sty7_Mapping250K_Sty_B04_53892.cel       : num 2.18 2.18 2.18 2.18 2.18 ...
 $ DOLCE_p_Sty7_Mapping250K_Sty_B06_53924.cel       : num 1.88 1.88 1.88 1.88 1.88 ...
 $ DOLCE_p_Sty7_Mapping250K_Sty_C05_53910.cel       : num 2.15 2.15 2.15 2.15 2.15 ...
 $ DOLCE_p_Sty7_Mapping250K_Sty_C10_53990.cel       : num 1.53 1.53 1.53 1.53 1.53 ...
 $ DOLCE_p_Sty7_Mapping250K_Sty_D05_53912.cel       : num 1.72 1.72 1.72 1.72 1.72 ...
 $ DOLCE_p_Sty7_Mapping250K_Sty_E01_53850.cel       : num 2.23 2.23 2.23 2.23 2.23 ...
 $ DOLCE_p_Sty7_Mapping250K_Sty_G12_54030.cel       : num 1.94 1.94 1.94 1.94 1.94 ...
 $ DOLCE_p_Sty7_Mapping250K_Sty_H06_53936.cel       : num 1.85 1.85 1.85 1.85 1.85 ...
 $ DOLCE_p_Sty7_Mapping250K_Sty_H08_53968.cel       : num 2.16 2.16 2.16 2.16 2.16 ...
 $ DOLCE_p_Sty7_Mapping250K_Sty_H11_54016.cel       : num 2.19 2.19 2.19 2.19 2.19 ...
 $ DOLCE_p_Sty7_Mapping250K_Sty_H12_54032.cel       : num 2.03 2.03 2.03 2.03 2.03 ...
 $ GUSTO_p_Sty20_Mapping250K_Sty_C08_81736.cel      : num 2.67 2.67 2.67 2.67 2.67 ...
 $ GUSTO_p_Sty20_Mapping250K_Sty_E03_81660.cel      : num 2.74 2.74 2.74 2.74 2.74 ...
 $ GUSTO_p_Sty20_Mapping250K_Sty_H02_81650.cel      : num 2.08 2.08 2.08 2.08 2.08 ...
 $ HEWED_p_250KSty_Plate_20060123_GOOD_B01_46246.cel: num 3.21 3.21 3.21 3.21 3.21 ...
 $ HEWED_p_250KSty_Plate_20060123_GOOD_C06_46328.cel: num 2.1 2.1 2.1 2.1 2.1 ...
 $ HEWED_p_250KSty_Plate_20060123_GOOD_F02_46270.cel: num 2.15 2.15 2.15 2.15 2.15 ...
 $ HEWED_p_250KSty_Plate_20060123_GOOD_G04_46304.cel: num 3.52 3.52 3.52 3.52 3.52 ...
 $ HOCUS_p_Sty4_Mapping250K_Sty_B05_55060.cel       : num 1.37 1.37 1.37 1.37 1.37 ...
 $ HOCUS_p_Sty4_Mapping250K_Sty_B12_55172.cel       : num 1.66 1.66 1.66 1.66 1.66 ...
 $ HOCUS_p_Sty4_Mapping250K_Sty_E05_55066.cel       : num 3.16 3.16 3.16 3.16 3.16 ...
 $ SOARS_p_Sty23_Mapping250K_Sty_B07_89024.cel      : num 2.09 2.09 2.09 2.09 2.09 ...
 $ SOARS_p_Sty23_Mapping250K_Sty_C01_88930.cel      : num 1.87 1.87 1.87 1.87 1.87 ...
 $ SOARS_p_Sty23_Mapping250K_Sty_C11_89090.cel      : num 1.90 1.90 1.90 1.90 1.90 ...
 $ SOARS_p_Sty23_Mapping250K_Sty_F07_89032.cel      : num 1.81 1.81 1.81 1.81 1.81 ...
 $ SOARS_p_Sty23_Mapping250K_Sty_H08_89052.cel      : num 1.82 1.82 1.82 1.82 1.82 ...
 $ SOARS_p_Sty23_Mapping250K_Sty_H10_89084.cel      : num 2.26 2.26 2.26 2.26 2.26 ...
 $ VINOS_p_Sty8_Mapping250K_Sty_A04_54082.cel       : num 1.93 1.93 1.93 1.93 1.93 ...
 $ VINOS_p_Sty8_Mapping250K_Sty_A07_54130.cel       : num 1.68 1.68 1.68 1.68 1.68 ...
 $ VINOS_p_Sty8_Mapping250K_Sty_B08_54148.cel       : num 1.34 1.34 1.34 1.34 1.34 ...
 $ VINOS_p_Sty8_Mapping250K_Sty_D01_54040.cel       : num 1.57 1.57 1.57 1.57 1.57 ...
 $ VINOS_p_Sty8_Mapping250K_Sty_D05_54104.cel       : num 1.72 1.72 1.72 1.72 1.72 ...
 $ VINOS_p_Sty8_Mapping250K_Sty_E04_54090.cel       : num 1.95 1.95 1.95 1.95 1.95 ...
 $ VINOS_p_Sty8_Mapping250K_Sty_E12_54218.cel       : num 1.44 1.44 1.44 1.44 1.44 ...
 $ VINOS_p_Sty8_Mapping250K_Sty_G01_54046.cel       : num 2.22 2.22 2.22 2.22 2.22 ...
 $ VINOS_p_Sty8_Mapping250K_Sty_G12_54222.cel       : num 1.76 1.76 1.76 1.76 1.76 ...
 $ VOLTS_p_Sty9_Mapping250K_Sty_G09_57916.cel       : num 2.05 2.05 2.05 2.05 2.05 ...
 $ VOLTS_p_Sty9_Mapping250K_Sty_H12_57966.cel       : num 2.64 2.64 2.64 2.64 2.64 ...

On 6/7/07, ssls sddd
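For readers who want to try the `colClasses` approach from this reply on their own machine, here is a small self-contained sketch. It writes a made-up tab-separated file with one ID column and 49 numeric columns to a temporary location and reads it back; the file contents, the tab separator, and the `nrows` and `comment.char` settings are assumptions for illustration, not details from the original thread.

```r
set.seed(42)
n_cols <- 49
n_rows <- 100
tmp <- tempfile(fileext = ".txt")

# Made-up data: an ID column followed by 49 numeric columns (X1..X49)
demo <- data.frame(ID = paste0("SNP_", seq_len(n_rows)),
                   matrix(round(runif(n_rows * n_cols, 1, 3), 2), ncol = n_cols))
write.table(demo, tmp, sep = "\t", quote = FALSE, row.names = FALSE)

# Declaring the column types up front spares read.table from guessing them;
# nrows and comment.char = "" are further optional hints that can help on big files
x <- read.table(tmp, header = TRUE, sep = "\t",
                colClasses = c("factor", rep("numeric", n_cols)),
                nrows = n_rows, comment.char = "")
unlink(tmp)
```

Running `str(x)` afterwards should show a factor `ID` column followed by 49 `num` columns, in the same shape as the output quoted above.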
Re: [R] How to do clustering
Sorry, I hit send before finishing my thoughts... As for clustering microarray data, you might want to consider the Bioconductor mailing list... [EMAIL PROTECTED]

b

On Jun 7, 2007, at 10:42 PM, ssls sddd wrote:

Dear List, I have another question to bother you about how to do clustering. My data consists of 49 columns (49 variables) and 238804 rows. I would like to do hierarchical clustering (unsupervised clustering and PCA). So far I tried pvclust (www.is.titech.ac.jp/~shimo/prog/pvclust/) but I always had the problem like for R like cannot allocate the memory. I am curious about what else packages can perform the clustering analysis while memory efficient. Meanwhile, is there any way that I can extract the features of each cluster. In other words, I would like to identify which are responsible for classifying these variables (samples). Thanks a lot! Sincerely, Alex
Re: [R] How to do clustering
Hi Alex, just in case you're trying to get genotypes from the Affymetrix 500K set, you might want to check the oligo package available on Bioconductor.

best, b

On Jun 7, 2007, at 10:42 PM, ssls sddd wrote:

Dear List, I have another question to bother you about how to do clustering. My data consists of 49 columns (49 variables) and 238804 rows. I would like to do hierarchical clustering (unsupervised clustering and PCA). So far I tried pvclust (www.is.titech.ac.jp/~shimo/prog/pvclust/) but I always had the problem like for R like cannot allocate the memory. I am curious about what else packages can perform the clustering analysis while memory efficient. Meanwhile, is there any way that I can extract the features of each cluster. In other words, I would like to identify which are responsible for classifying these variables (samples). Thanks a lot! Sincerely, Alex
Re: [R] Tools For Preparing Data For Analysis
Robert Wilkins wrote:

As noted on the R-project web site itself (www.r-project.org -> Manuals -> R Data Import/Export), it can be cumbersome to prepare messy and dirty data for analysis with the R tool itself. I've also seen at least one S programming book (one of the yellow Springer ones) that says, more briefly, the same thing. The R Data Import/Export page recommends examples using SAS, Perl, Python, and Java. It takes a bit of courage to say that (when you go to a corporate software web site, you'll never see a page saying "This is the type of problem that our product is not the best at, here's what we suggest instead"). I'd like to provide a few more suggestions, especially for volunteers who are willing to evaluate new candidates.

SAS is fine if you're not paying for the license out of your own pocket. But maybe one reason you're using R is that you don't have thousands of spare dollars. Using Java for data cleaning is an exercise in sado-masochism; Java has a learning curve (almost) as difficult as C++.

There are different types of data transformation, and for some data preparation problems an all-purpose programming language is a good choice (e.g. Perl, or maybe Python/Ruby). Perl, for example, has excellent regular expression facilities.

However, for some types of complex, demanding data preparation problems, an all-purpose programming language is a poor choice. For example: cleaning up and preparing clinical lab data and adverse event data - you could do it in Perl, but it would take way, way too much time. A specialized programming language is needed. And since data transformation is quite different from data query, SQL is not the ideal solution either.

We deal with exactly those kinds of data solely using R. R is exceptionally powerful for data manipulation, just a bit hard to learn.
Many examples are at http://biostat.mc.vanderbilt.edu/twiki/pub/Main/RS/sintro.pdf

Frank

There are only three statistical programming languages that are well-known, all dating from the 1970s: SPSS, SAS, and S. SAS is more popular than S for data cleaning.
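To make the point about data manipulation in R a little more concrete, here is a minimal base R sketch of a few typical cleaning steps on lab-style data. The data frame, its column names, and the use of -99 as a missing-value code are invented for illustration; none of this is taken from the thread or from the sintro.pdf document.

```r
# Made-up long-format lab data with string values and a sentinel code
labs <- data.frame(
  subject = c("S01", "S01", "S02", "S02", "S03"),
  test    = c("ALT", "AST", "ALT", "AST", "ALT"),
  value   = c("34", "-99", "41", "28", " 52 "),
  stringsAsFactors = FALSE
)

# Trim whitespace, convert to numeric, recode the assumed -99 sentinel as NA
labs$value <- as.numeric(trimws(labs$value))
labs$value[which(labs$value == -99)] <- NA

# One row per subject, one column per test
wide <- reshape(labs, idvar = "subject", timevar = "test", direction = "wide")

# Mean per test; the formula interface drops rows with missing values
means <- aggregate(value ~ test, data = labs, FUN = mean)
```

Here `wide` has the columns `subject`, `value.ALT` and `value.AST`, with `NA` wherever a subject has no usable result for a test.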
[R] evaluating variables in the context of a data frame
Given

D = data.frame(o=gl(2,1,4))

this works as I expected:

evalq(o, D)
[1] 1 2 1 2
Levels: 1 2

but neither of these does:

f <- function(x, dat) evalq(x, dat)
f(o, D)
Error in eval(expr, envir, enclos) : object "o" not found

g <- function(x, dat) eval(x, dat)
g(o, D)
Error in eval(x, dat) : object "o" not found

What am I doing wrong? This seems to be what the help files say you do to evaluate arguments in the context of a passed-in data frame...

zw
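A possible explanation, offered as a sketch rather than as the list's answer: inside `f`, `evalq(x, dat)` quotes the literal symbol `x`, which is not a column of `dat`, so R falls back to the function's argument `x` and tries to evaluate `o` in the caller's environment. Inside `g`, `eval(x, dat)` evaluates the argument `x` before `dat` is ever consulted. Capturing the unevaluated argument with `substitute()` avoids both problems. The function name `h` is made up for this example.

```r
D <- data.frame(o = gl(2, 1, 4))

# substitute(x) returns the expression the caller typed (here the symbol o)
# without evaluating it; eval() then looks it up in dat first and in the
# caller's environment second
h <- function(x, dat) eval(substitute(x), dat, parent.frame())

res <- h(o, D)
print(res)
```

The same pattern extends to whole expressions, for example `h(as.integer(o) * 2, D)`.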
[R] Need Help with robustbase package: fitnorm2 and plotnorm2
This is my first post requesting help on this mailing list. I am new to R and just learning my way around; my apologies for any breach in posting etiquette.

I am attempting to run some sample code and am confused by this error message:

Loading required package: rrcov
Error in fitNorm2(fdat[, "FSC-H"], fdat[, "SSC-H"], scalefac = ScaleFactor) :
  Required package rrcov could not be found.
In addition: Warning message:
there is no package called 'rrcov' in: library(package, lib.loc = lib.loc, character.only = TRUE, logical = TRUE,

which I get when I run the sample code below.

I am running Ubuntu Linux with all the R packages listed in the Synaptic package manager (universe). I loaded the prada Bioconductor package as instructed in the comments, and robustbase was downloaded and installed with the command

sudo R CMD INSTALL robustbase_0.2-7.tar.gz

The robustbase folder is in /usr/local/lib/R/site-library/. When I type 'library(robustbase)' no error appears, so I believe robustbase is installed correctly.

The sample code was taken from FCS-prada.pdf and was written in 2005. I understand that rrcov was made part of the robustbase package sometime in the past year. This may be the cause of the problem but, if it is, I have no idea how to fix it.

Thank you in advance for helping out! Below you will find the code that generates the error and the complete output of the code. Let me know what I can do to get up and running!

Matt

# prada Bioconductor package
# http://www.bioconductor.org/repository/devel/vignette/norm2.pdf
# To install prada:
# source("http://www.bioconductor.org/biocLite.R")
# biocLite("prada")
library(prada)
filepath <- system.file("extdata", "fas-Bcl2-plate323-04-04.A01", package = "prada")
print(filepath)
sampdat <- readFCS(filepath)
fdat <- exprs(sampdat)
print(dim(fdat))
print(colnames(fdat))
plot(fdat[, "FSC-H"], fdat[, "SSC-H"], pch = 20, col = "#303030",
     xlab = "FSC", ylab = "SSC", main = "Scatter plot FSC vs SSC")
# All of this goes as the help documentation suggests it should

# 2. Show selections for various scale factors
savepar <- par(mfrow = c(2, 2))
for (ScaleFactor in c(1.0, 1.5, 2.0, 2.5)) {
  # The next line gives the error I've included below.
  nfit <- fitNorm2(fdat[, "FSC-H"], fdat[, "SSC-H"], scalefac = ScaleFactor)
  plotNorm2(nfit, selection = TRUE, ellipse = TRUE,
            xlab = "FSC-H", ylab = "SSC-H",
            main = paste("SSC-H vs. FSC-H (ScaleFactor=", ScaleFactor, ")", sep = ""))
}
par(savepar)

Complete output:

Loading required package: Biobase
Loading required package: tools
Welcome to Bioconductor
  Vignettes contain introductory material. To view, type 'openVignette()' or start with 'help(Biobase)'. For details on reading vignettes, see the openVignette help page.
Loading required package: RColorBrewer
Loading required package: grid
Loading required package: geneplotter
Loading required package: annotate
KernSmooth 2.22 installed
Copyright M. P. Wand 1997
[1] "/usr/local/lib/R/site-library/prada/extdata/fas-Bcl2-plate323-04-04.A01"
[1] 21158
$P1N $P2N $P3N $P4N $P5N $P6N $P7N $P8N
FSC-H SSC-H FL1-H FL2-H FL3-H FL2-A FL4-H Time
Loading required package: rrcov
Error in fitNorm2(fdat[, "FSC-H"], fdat[, "SSC-H"], scalefac = ScaleFactor) :
  Required package rrcov could not be found.
In addition: Warning message:
there is no package called 'rrcov' in: library(package, lib.loc = lib.loc, character.only = TRUE, logical = TRUE,
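One thing worth checking, offered as a guess rather than a confirmed diagnosis: the error says R cannot find a package named rrcov, and rrcov is distributed on CRAN as its own package, separate from robustbase, so having robustbase installed would not satisfy that requirement. The sketch below reports which of a set of packages cannot be loaded. The helper name `missing_pkgs` is made up for this example, and the install line is left commented out because it needs network access and write permission to the library directory.

```r
# Return the subset of package names whose namespaces cannot be loaded
missing_pkgs <- function(pkgs) {
  available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!available]
}

needed <- c("robustbase", "rrcov")
todo <- missing_pkgs(needed)
print(todo)

# If rrcov is listed, installing it from CRAN is the usual next step:
# install.packages(todo)
```

On a system-wide installation such as the one described, `install.packages()` may need to be run from an R session started with sufficient privileges, or pointed at a personal library with its `lib` argument.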