Re: [R] SVM regression
Thank you very much!
Eleni

On Fri, Dec 11, 2009 at 7:19 PM, Steve Lianoglou mailinglist.honey...@gmail.com wrote:

Hi Eleni,

On Dec 11, 2009, at 12:04 PM, Eleni Christodoulou wrote:

Dear R users, I am trying to apply SVM regression to a set of microarray data. I am using the function svm() from the package e1071. Can anyone tell me what the residuals value represents? I have some observed values y_obs for the parameter that I want to estimate, and I would expect that svm$residuals = y_obs - svm$fitted. However, this does not happen. Does anyone have any idea about that?

This actually is what's happening. The $residuals that are reported in the model are against your *scaled* y-vector. So, with your data:

R> m <- svm(x,y)
R> all(scale(y) - predict(m,x) == m$residuals)
[1] TRUE

-steve
--
Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
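To illustrate how residuals on a scaled response relate to residuals on the original scale, here is a minimal base-R sketch. It does not use e1071 at all: lm() is only a stand-in for svm(), and the simulated x and y are made up for the illustration. The point is just the arithmetic: a residual computed against scale(y) equals the original-scale residual divided by sd(y).

```r
set.seed(1)
x <- rnorm(50)
y <- 3 + 2 * x + rnorm(50)                 # simulated data (assumption)

y_scaled <- as.numeric(scale(y))           # centre and scale the response
fit <- lm(y_scaled ~ x)                    # stand-in model, NOT svm()

res_scaled <- y_scaled - fitted(fit)       # residuals on the scaled response

# Back-transform the fitted values and compute original-scale residuals
fitted_orig <- fitted(fit) * sd(y) + mean(y)
res_orig <- y - fitted_orig

# The two sets of residuals differ only by the factor sd(y)
all.equal(unname(res_orig), unname(res_scaled * sd(y)))
```

So if residuals are reported against the scaled y-vector, multiplying them by sd(y) should recover residuals in the units of the observed values.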
Re: [R] About zero-inflation poisson model
On Fri, 11 Dec 2009, Xiongqing Zhang wrote:

Hello all, I am Xiongqing Zhang, from Beijing, China. I found you through this web page: http://finzi.psych.upenn.edu/Rhelp08/2008-February/154627.html. I am not very familiar with the R-project software.

Look at http://www.R-project.org/, especially the Documentation section in the menu.

But I want to estimate the parameters and errors of a zero-inflated Poisson model. Can you help me?

Look at http://www.jstatsoft.org/v27/i08/

The data is in the attachment. Thank you. I would very much appreciate your help.

Best regards,
Xiongqing Zhang
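As a rough illustration of what "parameters and errors" means for a zero-inflated Poisson model, here is a minimal base-R sketch that fits an intercept-only model by maximum likelihood. It is not the method of the paper linked above; the simulated data, the true values 0.3 and 2.5, and the function name negloglik are assumptions made up for the example.

```r
set.seed(42)
n <- 2000
pi_true <- 0.3        # assumed probability of a structural zero
lambda_true <- 2.5    # assumed Poisson mean

# Simulate: with probability pi_true the count is forced to zero
is_zero <- rbinom(n, 1, pi_true)
y <- ifelse(is_zero == 1, 0, rpois(n, lambda_true))

# Negative log-likelihood; parameters live on the logit / log scale
negloglik <- function(par, y) {
  p <- plogis(par[1])
  lambda <- exp(par[2])
  ll <- ifelse(y == 0,
               log(p + (1 - p) * exp(-lambda)),
               log(1 - p) + dpois(y, lambda, log = TRUE))
  -sum(ll)
}

fit <- optim(c(0, 0), negloglik, y = y, method = "BFGS", hessian = TRUE)

est <- c(pi = plogis(fit$par[1]), lambda = exp(fit$par[2]))
se_link <- sqrt(diag(solve(fit$hessian)))   # standard errors on the link scale
est
se_link
```

The standard errors come from the inverse Hessian and are on the logit and log scales; for models with covariates, a dedicated package is likely the more convenient route.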
Re: [R] some problems with ram usage and warnings
Date: Fri, 11 Dec 2009 21:52:30 -0500

On Dec 11, 2009, at 11:08 AM, Tom Knockinger wrote:

Hi, I am new to the R-project, but until now I have found solutions for every problem in tutorials, R wikis and this mailing list. Now I have some problems which I can't solve with this knowledge.

[snip]

2) RAM usage and program shutdowns

length(data) is usually between 50 and 1000, so it takes some space in RAM (approx. 100-200 MB), which is no problem. I use some analysis code which results in about 500-700 MB RAM usage, also not a real problem. The results are matrices of 50x14 to 1000x14, so they are small enough to work with afterwards: create plots, or do some more analysis. So I wrote a function which does the analysis one file after another and keeps only the results in a list. But after about 2-4 files my R process uses about 1500 MB, and then the troubles begin.

Windows?

Yes, I use R 2.9.1 under Windows.

[snip]

It is possible to call the garbage collector with gc(). Supposedly that should not be necessary, since garbage collection is automatic, but I have the impression that it helps prevent situations that otherwise lead to virtual memory getting invoked on the Mac (which I also thought should not be happening, but I will swear that it does.)

--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT

Thanks for this advice. I tried gc(), but it seems that it doesn't do anything, or not enough, to reduce the process memory in my case. I called the function tmp <- load.report(file) to load the data from the same file several times, and called memory.size(), gc(), memory.size() after each function call. Here are the results:

        before gc()                  | after gc()
count   memory.size()  process ram   | memory.size()  process ram
init     10             20MB         |
1        97            220MB         | 47             202MB
2       128            363MB         | 48             357MB
3       126            466MB         | 50             466MB
4       131            629MB         | 52             629MB

So it seems that at the beginning it releases some memory, but not enough. Also, R itself (memory.size()) shows good values after gc(), but these values don't have anything to do with the real process memory usage. Or there are huge memory leaks in the Windows binaries.

The called function is:

load.report <- function( reportname ) {
  library(XML)
  xml <- xmlTreeParse(reportname, useInternal=TRUE)
  globalid <- as.character(getNodeSet(xml, "//@gid"))
  sysid <- as.integer(getNodeSet(xml, "//@sid"))
  xmldataset = getNodeSet(xml, "/test/data")
  free(xml)
  xmldata <- sapply(xmldataset, function(el) xmlValue(el))
  dftlist <- lapply(1:length(xmldata), function(i) list(
    data.frame(gid=globalid[i],sid=sysid[i]),
    load.csvTable(xmldata,i)) )
  return(dftlist)
}

which uses this helper function. I wrote the helper to get rid of these warnings, but it only reduced them from more than 50 to about 3 or 5 each time the main function is called.

load.csvTable <- function( xmldata, pos ) {
  res = read.table(textConnection(xmldata[pos]), header=TRUE, sep = c(";"))
  closeAllConnections()
  return(res)
}

Maybe you or someone else has some additional advice.

Thanks
Tom
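On the connection warnings mentioned above, one possible variation of the helper is to keep a handle on the text connection and close exactly that connection when the function exits, instead of calling closeAllConnections(). This is only a sketch: the function name read_csv_text and the sample string are hypothetical, and it does not address the process memory growth.

```r
# Read semicolon-separated text held in a character string.
# The connection is closed explicitly on exit, so no unused
# connection is left behind for R to warn about later.
read_csv_text <- function(txt, sep = ";") {
  con <- textConnection(txt)
  on.exit(close(con))
  read.table(con, header = TRUE, sep = sep)
}

df <- read_csv_text("a;b\n1;2\n3;4")
df
```

Closing only the connection that was opened also avoids interfering with any other connections the session may have open.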
[R] read.csv to read output of system()?
Dear list,

I have a file that is comma delimited but contains some erroneous non-delimiter commas. I would like to replace these commas with semicolons and then read the corrected file into R as a data frame. I want to do this from within R, without changing the original data file.

My current idea of how to do this would be to use system("sed ...") and feed the result to read.csv(), but I cannot figure out how to combine the two. Minimal example:

system("echo \"one,two,three\" > file.csv") # create mockup file
read.csv(file=system("sed -e 's/,/;/' file.csv")) # this does not work

I think the answer must be in ?connections, maybe pipe(), but I have fiddled with these and cannot figure it out.

Marianne
--
Marianne Promberger PhD, King's College London
http://promberger.info
R version 2.10.0 (2009-10-26)
Ubuntu 9.04
[R] cube root on array
Dear R developers, is that right?

> -27^(1/3)
[1] -3
> c(-27,27)^(1/3)
[1] NaN 3

i'm using sign( c(-27,27) ) * abs( c(-27,27)) ^(1/3) , thanks
Re: [R] cube root on array
On Dec 12, 2009, at 9:31 AM, Rodrigo Tsai wrote:

Dear R developers, is that right?

> -27^(1/3)
[1] -3

> library(fortunes)
> fortune("^")

Thomas Lumley: The precedence of ^ is higher than that of unary minus. It may be surprising, [...]
Hervé Pagès: No, it's not surprising. At least to me... In the country where I grew up, I've been teached that -x^2 means -(x^2) not (-x)^2.
   -- Thomas Lumley and Hervé Pagès (both explaining that operator precedence is working perfectly well)
      R-devel (January 2006)

Also see R FAQ 7.33:
http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-are-powers-of-negative-numbers-wrong_003f

Using the example in the FAQ:

> as.list(quote(-27^(1/3)))
[[1]]
`-`

[[2]]
27^(1/3)

So what you see above is the consequence of operator precedence, thus:

> (-27)^(1/3)
[1] NaN

which is what you are getting below for the first value in the vector.

> c(-27,27)^(1/3)
[1] NaN 3

i'm using sign( c(-27,27) ) * abs( c(-27,27)) ^(1/3) , thanks

That seems to be a reasonable approach and, if memory serves, has been posted to the list previously.

HTH,
Marc Schwartz
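The sign/abs approach from the thread can be wrapped in a small helper. This is just a sketch; the name cbrt is made up here and is not a base R function.

```r
# Real cube root that also works for negative numbers:
# take the root of the magnitude and restore the sign afterwards.
cbrt <- function(x) sign(x) * abs(x)^(1/3)

cbrt(c(-27, 27))

# For comparison, the two expressions discussed above:
-27^(1/3)      # parsed as -(27^(1/3)) because ^ binds tighter than unary minus
(-27)^(1/3)    # NaN: fractional power of a negative number
```

Because sign() and abs() are vectorised, the helper works on whole vectors without any special handling.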
[R] Antw: Export R output to Word/RTF?
I am using SWord from statconn: http://rcom.univie.ac.at/download.html

It allows you to put R commands into Word (similar to odfWeave). Output is directed to Word, including figures and tables. It is still a beta version but works fine.

Frank Bloos

Wenjie Lee wenjieleemaill...@gmail.com 12.12.2009 00:28

Hi R Experts,

I'm aware of the pdf(), jpeg(), ... functions. But:

1. Is it also possible to export graphs directly to Word or RTF? I usually copy and paste graphs, but the resolution is not so great.
2. Also, is it possible to export your output to a Word file? I use the sink() function to export it to text files.

Any suggestions? Thanks,
Wenjie Lee
Re: [R] read.csv to read output of system()?
On Dec 12, 2009, at 7:54 AM, Marianne Promberger wrote:

Dear list,

I have a file that is comma delimited but contains some erroneous non-delimiter commas. I would like to replace these commas with semicolons and then read the corrected file into R as a data frame. I want to do this from within R, without changing the original data file.

My current idea of how to do this would be to use system("sed ...") and feed the result to read.csv(), but I cannot figure out how to combine the two. Minimal example:

system("echo \"one,two,three\" > file.csv") # create mockup file
read.csv(file=system("sed -e 's/,/;/' file.csv")) # this does not work

I think the answer must be in ?connections, maybe pipe(), but I have fiddled with these and cannot figure it out.

You need to figure out how to do multiple replacements unless it is only the first comma that you are targeting:

> readLines(pipe("sed -e 's/,/;/' ~/file.csv"))
[1] "one;two,three"

Marianne
--
Marianne Promberger PhD, King's College London
http://promberger.info

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
Re: [R] read.csv to read output of system()?
Thanks for both replies. Let me start by giving a better minimal example, although indeed the regex replacement is not my problem.

system("echo \"var1,var2,var3\none,two,three\none,this is a comment,with commas.,three\" > file.csv")

On 12/12/09 11:02, David Winsemius wrote:

You need to figure out how to do multiple replacements unless it is only the first comma that you are targeting:

> readLines(pipe("sed -e 's/,/;/' ~/file.csv"))
[1] "one;two,three"

Lovely. What I really need is read.csv, and this works (with my regex, which is good enough for the existing data; I will optimize it later as needed):

read.csv(pipe("sed -e 's/\\( [a-zA-Z]\\+\\),/\\1;/g' file.csv"))

I can't understand why I didn't try this. I think what I tried was pipe("... file.csv |") (with a Unix pipe symbol at the end). Thanks!

Jon Baron ba...@psych.upenn.edu 12-Dec-09 16:21:

gsub(readLines("file.csv"),",",";")

Using gsub would be even neater, as it would really be self-contained in R.

gsub("( [A-Za-z]+),","\\1;",readLines("file.csv"))

seems to work fine, but how do I get this into a data frame?

Marianne
--
Marianne Promberger PhD, King's College London
http://promberger.info
R version 2.10.0 (2009-10-26)
Ubuntu 9.04
Re: [R] read.csv to read output of system()?
On Dec 12, 2009, at 12:01 PM, Marianne Promberger wrote:

Thanks for both replies. Let me start by giving a better minimal example, although indeed the regex replacement is not my problem.

system("echo \"var1,var2,var3\none,two,three\none,this is a comment,with commas.,three\" > file.csv")

On 12/12/09 11:02, David Winsemius wrote:

You need to figure out how to do multiple replacements unless it is only the first comma that you are targeting:

> readLines(pipe("sed -e 's/,/;/' ~/file.csv"))
[1] "one;two,three"

Lovely. What I really need is read.csv, and this works (with my regex, which is good enough for the existing data; I will optimize it later as needed):

I didn't post a read.csv version; I thought the application was obvious.

read.csv(pipe("sed -e 's/\\( [a-zA-Z]\\+\\),/\\1;/g' file.csv"))

I can't understand why I didn't try this. I think what I tried was pipe("... file.csv |") (with a Unix pipe symbol at the end). Thanks!

Jon Baron ba...@psych.upenn.edu 12-Dec-09 16:21:

gsub(readLines("file.csv"),",",";")

Using gsub would be even neater, as it would really be self-contained in R.

gsub("( [A-Za-z]+),","\\1;",readLines("file.csv"))

> txt <- gsub("( [A-Za-z]+),","\\1;",readLines("file.csv"))
> read.csv(textConnection(txt), header=TRUE)
  var1                           var2  var3
1  one                            two three
2  one this is a comment;with commas. three

seems to work fine, but how do I get this into a data frame?

Marianne
--
Marianne Promberger PhD, King's College London
http://promberger.info
R version 2.10.0 (2009-10-26)
Ubuntu 9.04

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
Re: [R] read.csv to read output of system()?
David Winsemius dwinsem...@comcast.net 12-Dec-09 17:12:

> txt <- gsub("( [A-Za-z]+),","\\1;",readLines("file.csv"))
> read.csv(textConnection(txt), header=TRUE)
  var1                           var2  var3
1  one                            two three
2  one this is a comment;with commas. three

Wonderful, thanks a lot!

Marianne
--
Marianne Promberger PhD, King's College London
http://promberger.info
R version 2.10.0 (2009-10-26)
Ubuntu 9.04
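The pure-R route from this thread can be tried without creating a file or calling sed. The sketch below keeps the sample lines in a character vector (an assumption made so that the example is self-contained) and uses the text= argument of read.csv(), which newer R versions offer as a shorthand for wrapping the strings in textConnection().

```r
# Sample content, as in the mock-up file from the thread
lines <- c("var1,var2,var3",
           "one,two,three",
           "one,this is a comment,with commas.,three")

# Replace a comma that directly follows " word" with a semicolon;
# the delimiter commas are not preceded by a space and stay untouched.
fixed <- gsub("( [A-Za-z]+),", "\\1;", lines)

df <- read.csv(text = fixed, header = TRUE, stringsAsFactors = FALSE)
df
```

With a real file, readLines("file.csv") would take the place of the character vector. The regular expression is the one from the thread and is only good enough for data where stray commas follow a space and a word.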
[R] matched pair proportion test
Hi, ALL,

Is there any function in R that does the exact test for matched-pair proportions (one-sided), which I assume is binomial(b+c, .5)?

Thanks,
Annie
[R] simple ts.plot question
I have a simple question regarding plots of time series in R. I have to plot conc against time for each individual and display all of them in the same panel, for the built-in dataset Indometh in R.

I have six time series, say subject1.ts, subject2.ts, ..., subject6.ts. The observations are taken at an interval of 0.25 hr. All of the series range from 0.25 hr to 8.00 hr. How can I plot all six time series in the same panel?

I am eagerly waiting for your reply. Thanks in advance.
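One possible way to draw all six subjects in a single panel with base graphics is to reshape the data into a time-by-subject matrix and pass it to matplot(). This is a sketch, not the only approach; the object name conc_by_subject is made up here.

```r
data(Indometh)   # built-in dataset: Subject, time, conc

# One row per sampling time, one column per subject.
# Each cell holds a single observation, so mean() just returns it.
conc_by_subject <- with(Indometh, tapply(conc, list(time, Subject), mean))

times <- as.numeric(rownames(conc_by_subject))

# All subjects in one panel, one line type and colour per subject
matplot(times, conc_by_subject, type = "b", pch = 1:6, lty = 1:6, col = 1:6,
        xlab = "time (hr)", ylab = "conc")
legend("topright", legend = colnames(conc_by_subject),
       pch = 1:6, lty = 1:6, col = 1:6, title = "Subject")
```

If the series are already stored as separate ts objects, ts.plot() accepts several series at once, but working from the data frame avoids building them by hand.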
Re: [R] by function ??
Thanks for all the help. They all worked, but I'm stuck again. I've tried searching, but I'm not sure how to word my search, as nothing came up.

Here is my new hurdle: my data has 7 observations and my results have 2 answers. Here is my data:

  LEAID     ratio
3  6307 0.720
1  6307 0.7623810
2  6307 0.860
4  6307 0.920
5  8300 0.5678462
7  8300 0.770
6  8300 0.830

> median <- summaryBy(ratio ~ LEAID, data = Dataset, FUN = median)
> print(median)
  LEAID ratio.median
1  6307 0.8111905
2  8300 0.770

Now what I want is a way to compute abs(ratio - median) by LEAID for each observation, to produce something like this:

  LEAID     ratio   abs
3  6307 0.720     .0912
1  6307 0.7623810 .0488
2  6307 0.860     .0488
4  6307 0.920     .1088
5  8300 0.5678462 .2022
7  8300 0.770     .
6  8300 0.830     .0600

Thanks,
L.A.

Ista Zahn wrote:

Hi, I think you want

by(TestData[ , "RATIO"], LEAID, median)

-Ista

On Tue, Dec 8, 2009 at 8:36 PM, L.A. ro...@millect.com wrote:

I'm just learning and this is probably very simple, but I'm stuck. I'm trying to understand by().

This works:
by(TestData, LEAID, summary)

But this doesn't:
by(TestData, LEAID, median(RATIO))
ERROR: could not find function "FUN"

HELP!
Thanks, LA
--
View this message in context: http://n4.nabble.com/by-function-tp955789p955789.html
Sent from the R help mailing list archive at Nabble.com.

--
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org

--
View this message in context: http://n4.nabble.com/by-function-tp955789p962666.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] matched pair proportion test
annie Zhang wrote:

Hi, ALL,

Is there any function in R that does the exact test for matched-pair proportions (one-sided), which I assume is binomial(b+c, .5)?

binom.test should fit nicely, I think.

Thanks,
Annie

--
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalga...@biostat.ku.dk)              FAX: (+45) 35327907
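A short sketch of how binom.test() could be applied to the discordant pairs. The counts b = 15 and c = 5 are hypothetical, chosen only to make the example run; under the null hypothesis, b follows binomial(b + c, 0.5).

```r
b <- 15   # hypothetical: pairs with (yes, no)
c <- 5    # hypothetical: pairs with (no, yes)

# One-sided exact test: is b larger than expected under p = 0.5?
res <- binom.test(b, b + c, p = 0.5, alternative = "greater")
res$p.value

# The same tail probability computed directly from the binomial distribution
pbinom(b - 1, b + c, 0.5, lower.tail = FALSE)
```

Use alternative = "less" for the opposite one-sided hypothesis; the concordant pairs do not enter the test.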
[R] Need help to complete missing value (Date and Time) in Sp500 Data
Dear all,

here is my problem; maybe someone can help to solve it. I have two time series from different stock markets with different lengths (the difference is about 4000 observations). I would like to add the missing times to one of these series, with the proper timestamps (they are minute data), and set the value to 0, since I need both series to have the same length for my calculation. I tried to use seq() and merge() in R, but without success because of the format of the date.

The data of the series I would like to complete and extend to the length of the other series looks like this:

Date;open;hight;low;close;Volume
02.04.2008 09:00;6.749,24;6.755,55;6.746,89;6.754,11;0
02.04.2008 09:01;6.754,70;6.754,70;6.748,13;6.749,55;0
02.04.2008 09:02;6.749,36;6.757,00;6.745,50;6.749,38;0
02.04.2008 09:03;6.748,08;6.753,84;6.748,08;6.753,84;0
02.04.2008 09:04;6.753,79;6.755,59;6.752,18;6.752,41;0
02.04.2008 09:05;6.753,23;6.753,23;6.748,17;6.748,47;0
02.04.2008 09:06;6.749,43;6.750,62;6.748,22;6.748,26;0
02.04.2008 09:07;6.748,26;6.748,89;6.745,54;6.745,54;0
02.04.2008 09:08;6.745,49;6.746,58;6.744,82;6.745,58;0
02.04.2008 09:09;6.745,62;6.745,98;6.741,47;6.741,55;0
02.04.2008 09:10;6.741,58;6.741,73;6.737,21;6.739,85;0
02.04.2008 09:11;6.739,10;6.742,81;6.738,24;6.742,53;0
02.04.2008 09:12;6.742,32;6.742,80;6.740,42;6.741,81;0
02.04.2008 09:13;6.741,84;6.744,78;6.741,84;6.744,60;0
02.04.2008 09:14;6.744,60;6.744,60;6.740,54;6.740,54;0
02.04.2008 09:15;6.740,45;6.740,67;6.736,32;6.737,68;0
02.04.2008 09:16;6.737,72;6.740,68;6.737,45;6.740,11;0
02.04.2008 09:17;6.740,04;6.746,34;6.740,04;6.746,34;0
02.04.2008 09:18;6.746,21;6.750,64;6.746,21;6.749,99;0
02.04.2008 09:19;6.750,61;6.752,95;6.749,07;6.750,69;0
02.04.2008 09:20;6.750,82;6.751,01;6.748,20;6.750,74;0
02.04.2008 09:21;6.750,57;6.752,98;6.748,62;6.752,98;0
02.04.2008 09:22;6.752,74;6.756,24;6.752,74;6.753,84;0
02.04.2008 09:23;6.753,90;6.755,51;6.752,70;6.753,05;0

Thanks in advance for any help!
H.

--
View this message in context: http://n4.nabble.com/Need-help-to-complete-missing-value-Date-and-Time-in-Sp500-Data-tp962671p962671.html
Sent from the R help mailing list archive at Nabble.com.
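A minimal base-R sketch of one way to approach this: parse the timestamps, convert the German-style numbers, build a complete minute grid, merge, and fill the gaps with 0 as requested. The three sample rows (with 09:02 and 09:03 deliberately left out), the UTC time zone and the helper name to_num are assumptions for the illustration.

```r
txt <- "Date;open;hight;low;close;Volume
02.04.2008 09:00;6.749,24;6.755,55;6.746,89;6.754,11;0
02.04.2008 09:01;6.754,70;6.754,70;6.748,13;6.749,55;0
02.04.2008 09:04;6.753,79;6.755,59;6.752,18;6.752,41;0"

# Read everything as text first; the numbers use "." for thousands
# and "," as the decimal mark, so they need converting by hand.
raw <- read.table(text = txt, sep = ";", header = TRUE,
                  colClasses = "character")

to_num <- function(s) {
  as.numeric(sub(",", ".", gsub(".", "", s, fixed = TRUE), fixed = TRUE))
}

prices <- data.frame(
  Date = as.POSIXct(raw$Date, format = "%d.%m.%Y %H:%M", tz = "UTC"),
  lapply(raw[-1], to_num)
)

# Complete grid of minutes between the first and last observation
grid <- data.frame(Date = seq(min(prices$Date), max(prices$Date), by = "min"))

# Left join onto the grid: missing minutes appear as NA rows
full <- merge(grid, prices, by = "Date", all.x = TRUE)

# Fill the gaps with 0, as asked for in the post
for (col in setdiff(names(full), "Date")) {
  full[[col]][is.na(full[[col]])] <- 0
}
full
```

To match the length of the second series, the grid could instead be built from that series' timestamps. Whether 0 is a sensible fill value for prices depends on the later calculation; carrying the last observation forward is a common alternative.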
[R] rpart - classification and regression trees (CART)
Hi,

I had a question regarding the rpart command in R. I used seven continuous predictor variables in the model, and the variable called TB122 was chosen for the first split. But in looking at the output, there are 4 variables that improve the predicted membership equally (TB122, TB139, TB144, and TB118). The output is pasted below.

Node number 1: 268 observations,    complexity param=0.6
  predicted class=0  expected loss=0.3
    class counts:   197    71
   probabilities: 0.735 0.265
  left son=2 (188 obs) right son=3 (80 obs)
  Primary splits:
      TB122 < 80  to the left, improve=50, (0 missing)
      TB139 < 90  to the left, improve=50, (0 missing)
      TB144 < 90  to the left, improve=50, (0 missing)
      TB118 < 90  to the left, improve=50, (0 missing)
      TB129 < 100 to the left, improve=40, (0 missing)

I need to know what method R is using to select the best variable for the node. Somewhere I read that the best split = greatest improvement in predictive accuracy = maximum homogeneity of yes/no groups resulting from the split = reduction of impurity. I also read that the Gini index, Chi-square, or G-square can be used to evaluate the level of impurity.

For this function in R:

1) Why exactly did R pick TB122 over the other variables, despite the fact that they all had the same level of improvement? Was TB122 chosen to be the first node because the groups TB122<80 and TB122>=80 were the most homogeneous (i.e. had the least impurity)?

2) If R is using impurity to determine the best nodes, which method (the Gini index, Chi-square, or G-square) is R using?

Thanks!
Katie

--
View this message in context: http://n4.nabble.com/rpart-classification-and-regression-trees-CART-tp962680p962680.html
Sent from the R help mailing list archive at Nabble.com.
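For classification trees, rpart uses the Gini index by default (the alternative, "information", can be requested through parms = list(split = "information")). As I understand the rpart documentation, the reported improve value is n times the decrease in impurity. The base-R sketch below only illustrates that arithmetic: the parent counts are taken from the output above, but the left/right class counts are hypothetical, since the post does not show them, and the function names are made up.

```r
# Gini impurity of a node given its class counts
gini <- function(counts) {
  p <- counts / sum(counts)
  1 - sum(p^2)
}

# n * (parent impurity - weighted child impurity),
# written with node sizes as weights
gini_improve <- function(parent, left, right) {
  sum(parent) * gini(parent) -
    sum(left) * gini(left) -
    sum(right) * gini(right)
}

parent <- c(197, 71)   # class counts at the root, from the post
left   <- c(180, 8)    # hypothetical split of the 188 left observations
right  <- c(17, 63)    # hypothetical split of the 80 right observations

gini(parent)
gini_improve(parent, left, right)
```

Two candidate splits that send the same observations left and right necessarily get the same improvement, which is one way several variables can tie exactly.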
Re: [R] by function ??
-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of L.A.
Sent: Saturday, December 12, 2009 12:39 PM
To: r-help@r-project.org
Subject: Re: [R] by function ??

Thanks for all the help. They all worked, but I'm stuck again. I've tried searching, but I'm not sure how to word my search, as nothing came up.

Here is my new hurdle: my data has 7 observations and my results have 2 answers. Here is my data:

  LEAID     ratio
3  6307 0.720
1  6307 0.7623810
2  6307 0.860
4  6307 0.920
5  8300 0.5678462
7  8300 0.770
6  8300 0.830

> median <- summaryBy(ratio ~ LEAID, data = Dataset, FUN = median)
> print(median)
  LEAID ratio.median
1  6307 0.8111905
2  8300 0.770

Now what I want is a way to compute abs(ratio - median) by LEAID for each observation, to produce something like this:

  LEAID     ratio   abs
3  6307 0.720     .0912
1  6307 0.7623810 .0488
2  6307 0.860     .0488
4  6307 0.920     .1088
5  8300 0.5678462 .2022
7  8300 0.770     .
6  8300 0.830     .0600

Try ave(), as in

> Dataset$abs <- with(Dataset, ave(ratio, LEAID, FUN=function(x)abs(x-median(x))))
> Dataset
  LEAID     ratio       abs
3  6307 0.720     0.0911905
1  6307 0.7623810 0.0488095
2  6307 0.860     0.0488095
4  6307 0.920     0.1088095
5  8300 0.5678462 0.2021538
7  8300 0.770     0.000
6  8300 0.830     0.060

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

Thanks,
L.A.

Ista Zahn wrote:

Hi, I think you want

by(TestData[ , "RATIO"], LEAID, median)

-Ista

On Tue, Dec 8, 2009 at 8:36 PM, L.A. ro...@millect.com wrote:

I'm just learning and this is probably very simple, but I'm stuck. I'm trying to understand by().

This works:
by(TestData, LEAID, summary)

But this doesn't:
by(TestData, LEAID, median(RATIO))
ERROR: could not find function "FUN"

HELP!
Thanks, LA
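The ave() suggestion can be tried as a self-contained example. The data frame below is rebuilt from the values shown in the thread; the column name abs_dev is a choice made here to avoid reusing the name of the abs() function.

```r
Dataset <- data.frame(
  LEAID = c(6307, 6307, 6307, 6307, 8300, 8300, 8300),
  ratio = c(0.72, 0.7623810, 0.86, 0.92, 0.5678462, 0.77, 0.83)
)

# ave() applies FUN within each LEAID group and returns a vector
# of the same length as ratio, aligned with the original rows.
Dataset$abs_dev <- with(Dataset,
                        ave(ratio, LEAID,
                            FUN = function(x) abs(x - median(x))))
Dataset
```

This differs from by() or summaryBy(), which return one value per group; ave() returns one value per observation, which is what a per-row deviation needs.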
Re: [R] Regularized gamma function/ incomplete gamma function
RV> I would be very grateful if you could help me with:
RV> Given the regularized gamma function
RV> Reg = int_0^r (x^(k-1) e^(-x)) dx / int_0^Inf (x^(k-1) e^(-x)) dx ; 0 < r < Inf
RV> (which is eventually the ratio of the incomplete gamma
RV> function by the gamma function),

and which is exactly what R's pgamma() is!

RV> does anyone know of a
RV> package in R that would evaluate the derivative of the
RV> inverse of Reg with respect to k? I am aware that the
RV> function Rgamma.inv of the package Zipfr evaluates
RV> the inverse of Reg

[ well, the package's name is 'zipfR'; as you should know, case does matter on decent computer environments.

Indeed, it seems that the author of zipfR has not been aware either that the (scaled, aka regularized) incomplete gamma (and beta, for that matter!) functions have been part of R all along. ... ... well, inspecting his code reveals he did know it. But why then on earth provide all the new foogamma() functions, all trivially defined via pgamma(), qgamma() and gamma()?? Never mind ... Let's get to answering your question. ]

RV> the inverse of Reg and I'm wondering whether there is a
RV> function that would evaluate the derivative of the
RV> inverse.

I'm a bit shocked by the lack of basic calculus knowledge both in your question and even more in the answers. I'm pretty sure that even before I started studying math, I knew the formula for getting the derivative of an inverse. The mnemonic trick is

  dy / dx = 1 / (dx / dy)

Spelled out, that's

  d/dy f^{-1}(y) = 1 / f'(x) = 1 / f'(f^{-1}(y))

Now if you apply this to f(x) = pgamma(x, a), then the derivative of the inverse of the regularized incomplete gamma function, i.e. the derivative of qgamma(), is simply

  1 / dgamma(qgamma(x, a), a)

You can easily check this by comparing with the results from 'numDeriv' if you want, or just with the simple one-liner below (computing the difference ratio as an approximate differential ratio). For a = 1.25 and x = 0.2, e.g.:

> sapply(10^-(3:9), function(e) diff(qgamma(.2 + c(-e,e), sh = 1.25))/(2*e))
[1] 1.675105 1.675103 1.675103 1.675103 1.675103 1.675103 1.675103
> 1/dgamma(qgamma(0.2, sh = 1.25), sh = 1.25)
[1] 1.675103

Martin Maechler, ETH Zurich

RV> Alternatively, a good numerical integration package, or
RV> simply a function that could evaluate the integral
RV> int_0^r (log(x) x^(k-1) e^(-x)) dx; 0 < r < Inf, would be
RV> useful. I tried the function int of the package
RV> rmutil but I'm not sure whether it is accurate,
RV> especially for small values of k. Does R have a powerful
RV> numerical integration package that can deal with such
RV> functions, especially when the limit is close to zero or + or - Inf?

RV> Many thanks for this opportunity to post our queries,
RV> Amy
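The check described above can be written out as a small script. This is a sketch using the same values as the post (a = 1.25, x = 0.2); the step size h and the variable names are choices made here.

```r
a <- 1.25   # shape parameter
p <- 0.2    # point at which the inverse is differentiated

# Analytic derivative of the inverse: 1 / f'(f^{-1}(p))
analytic <- 1 / dgamma(qgamma(p, shape = a), shape = a)

# Central difference approximation of d/dp qgamma(p, a)
h <- 1e-6
numeric <- (qgamma(p + h, shape = a) - qgamma(p - h, shape = a)) / (2 * h)

c(analytic = analytic, numeric = numeric)
all.equal(analytic, numeric, tolerance = 1e-6)
```

Note that this is the derivative with respect to the probability argument; a derivative with respect to the shape k could be approximated in the same finite-difference style by perturbing shape instead of p.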
Re: [R] save an object by dynamicly created name
saveObject() and loadObject() use save() and load() in base. The default is to compress the data when saving, which takes some time. Using saveObject(..., compress=FALSE) is probably faster, but takes up more disk space. ...and make sure you don't work against a file system over a network, because that can slow things down (doesn't sound like you do).

/H

On Sun, Nov 8, 2009 at 6:26 AM, Hao Cen h...@andrew.cmu.edu wrote:

Hi Henrik, I am using your saveObject/loadObject to handle over 1000 matrices. It worked beautifully. Because I need to load those matrices often to evaluate a few functions on them, and the matrices do not all fit in memory at once, is there a way to speed up the loading part? I tried saving all the binary files to /dev/shm (the shared memory section in Linux) but the speed of loadObject on /dev/shm remains the same as on disk.

Thanks
Hao

-Original Message-
From: henrik.bengts...@gmail.com [mailto:henrik.bengts...@gmail.com] On Behalf Of Henrik Bengtsson
Sent: Monday, November 02, 2009 12:34 AM
To: David Winsemius
Cc: r-help@r-project.org; jeffc
Subject: Re: [R] save an object by dynamicly created name

On Sun, Nov 1, 2009 at 9:18 PM, David Winsemius dwinsem...@comcast.net wrote:
On Nov 1, 2009, at 11:28 PM, Henrik Bengtsson wrote:
On Sun, Nov 1, 2009 at 7:48 PM, David Winsemius dwinsem...@comcast.net wrote:
On Nov 1, 2009, at 10:16 PM, Henrik Bengtsson wrote:

  path <- "data";
  dir.create(path);
  for (i in 1:10) {
    m <- i:5;
    filename <- sprintf("m%02d.Rbin", i);
    pathname <- file.path(path, filename);
    save(m, file=pathname);
  }

That would result in each of the ten files containing an object with the same name == m. (Also on my system R data files have type Rdta.) So I thought what was requested might have been a slight mod:

  path <- "~/";
  dir.create(path);
  for (i in 1:10) {
    assign( paste("m", i, sep=""), i:5)
    filename <- sprintf("m%02d.Rdta", i)
    pathname <- file.path(path, filename)
    obj = get(paste("m", i, sep=""))
    save(obj, file=pathname)
  }

Then a more convenient solution is to use saveObject() and loadObject() of R.utils. saveObject() does not save the name of the object saved.

The OP asked for this outcome: I would like to save m as m1, m2, m3 ..., to file /home/data/m1, /home/data/m2, home/data/m3, ...

If you want to save multiple objects, then wrap them up in a list.

I agree that a list would make sense if it were to be stored in one file, although it was not what was requested.

That comment was not for the OP, but for saveObject()/loadObject() in general.

But wouldn't that require assign()-ing a name before list()-wrapping?

Nope, the whole point of using saveObject()/loadObject() is to save the objects/values without the names that you happen to choose in the current session, and to avoid overwriting existing ones in your next session. My example could also have been:

  library(R.utils);
  saveObject(list(a=1,b=LETTERS,c=Sys.time()), file="foo.Rbin");
  y <- loadObject("foo.Rbin");
  z <- loadObject("foo.Rbin");
  stopifnot(identical(y,z));

If you really want to attach the elements of the saved list, do:

  attachLocally(loadObject("foo.Rbin"));
  str(a)
   num 1
  str(b)
   chr [1:26] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J" ...
  str(c)
   POSIXct[1:1], format: "2009-11-01 21:30:41"

I suppose we ought to mention that the use of assign to create a variable is a FAQ ... 7.21? Yep, I have now referred to it a sufficient number of times to refer to it by number.
http://cran.r-project.org/doc/FAQ/R-FAQ.html#How-can-I-turn-a-string-into-a-variable_003f

My personal take on assign() and get() is that if you find yourself using them (at this level), there is a good chance there exists a better solution that you should use instead. My $.02

/H

-- David

loadObject() does not assign variables, but instead returns them. Example:

  library(R.utils);
  x <- list(a=1,b=LETTERS,c=Sys.time());
  saveObject(x, file="foo.Rbin");
  y <- loadObject("foo.Rbin");
  stopifnot(identical(x,y));

So, for the original example, I'd recommend:

  library(R.utils);
  path <- "data";
  mkdirs(path);
  for (i in 1:10) {
    m <- i:5;
    filename <- sprintf("m%02d.Rbin", i);
    saveObject(m, file=filename, path=path);
  }

and loading the objects back as:

  for (i in 1:10) {
    filename <- sprintf("m%02d.Rbin", i);
    m <- loadObject(filename, path=path);
    print(m);
  }

/Henrik

-- David.

/H

On Sun, Nov 1, 2009 at 6:53 PM, jeffc h...@andrew.cmu.edu wrote:

Hi, I would like to save a few dynamically created objects to disk. The following is the basic flow of the code segment

  for (i in 1:10) {
    m = i:5
    save(m, file = ...) ## ???
  }

To distinguish different objects to be saved, I would like to save m as m1, m2, m3 ..., to file /home/data/m1, /home/data/m2, home/data/m3, ... I tried a couple of methods on translating between object names and strings (below) but couldn't get it to work.
https://stat.ethz.ch/pipermail/r-help/2008-November/178965.html
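As a sketch of the same idea (store the value, not the session name) without R.utils, base R's saveRDS() and readRDS() can stand in for saveObject() and loadObject(). This is a substitute technique, not what the thread itself used; the directory under tempdir() and the .rds file names are assumptions for illustration.

```r
# Write ten objects to dynamically named files, one value per file
out_dir <- file.path(tempdir(), "data")
dir.create(out_dir, showWarnings = FALSE)

for (i in 1:10) {
  m <- i:5
  saveRDS(m, file = file.path(out_dir, sprintf("m%02d.rds", i)))
}

# Read them back into a list; the caller chooses the names at load time
mats <- lapply(1:10, function(i) {
  readRDS(file.path(out_dir, sprintf("m%02d.rds", i)))
})
names(mats) <- sprintf("m%d", 1:10)
```

Like save(), saveRDS() takes a compress argument, so compress = FALSE may be worth timing when load speed matters more than disk space.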
Re: [R] by function ??
On Dec 12, 2009, at 3:38 PM, L.A. wrote:

> Thanks for all the help, They all worked, But I'm stuck again. I've tried searching, but I'm not sure how to word my search as nothing came up. Here is my new hurdle, my data has 7 observations and my results have 2 answers:
> Here is my data
>
>   LEAID ratio
> 3 6307 0.720
> 1 6307 0.7623810
> 2 6307 0.860
> 4 6307 0.920
> 5 8300 0.5678462
> 7 8300 0.770
> 6 8300 0.830
>
> median<-summaryBy(ratio ~ LEAID, data = Dataset, FUN = median)
> print(median)
>   LEAID ratio.median
> 1 6307 0.8111905
> 2 8300 0.770
>
> Now what I want is a way to compute abs(ratio - median) by LEAID for each observation to produce something like this

?ave # creates a vector of length = length of original data.frame

abs( ave(dtst$ratio, dtst$LEAID, FUN=median)-dtst$ratio)
[1] 0.0911905 0.0488095 0.0488095 0.1088095 0.2021538 0.000 0.060

>   LEAID ratio abs
> 3 6307 0.720 .0912
> 1 6307 0.7623810 .0488
> 2 6307 0.860 .0488
> 4 6307 0.920 .1088
> 5 8300 0.5678462 .2022
> 7 8300 0.770 .0000
> 6 8300 0.830 .0600
>
> Thanks, L.A.
>
> Ista Zahn wrote:
>> Hi, I think you want
>> by(TestData[ , "RATIO"], LEAID, median)
>> -Ista
>>
>> On Tue, Dec 8, 2009 at 8:36 PM, L.A. ro...@millect.com wrote:
>>> I'm just learning and this is probably very simple, but I'm stuck. I'm trying to understand the by(). This works.
>>> by(TestData, LEAID, summary)
>>> But, This doesn't.
>>> by(TestData, LEAID, median(RATIO))
>>> ERROR: could not find function FUN
>>> HELP! Thanks, LA

--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
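The ave() suggestion above can be run end to end on the seven observations from the question. In this sketch the data frame name dtst follows the reply, while the column name abs_dev is an assumption introduced here.

```r
# Data as posted in the question
dtst <- data.frame(
  LEAID = c(6307, 6307, 6307, 6307, 8300, 8300, 8300),
  ratio = c(0.72, 0.762381, 0.86, 0.92, 0.5678462, 0.77, 0.83)
)

# ave() returns the group median repeated for every row of its group,
# so the subtraction lines up observation by observation
group_median <- ave(dtst$ratio, dtst$LEAID, FUN = median)
dtst$abs_dev <- abs(dtst$ratio - group_median)

print(dtst)
```

Because ave() keeps the original row order, no merge or sort is needed before attaching the result as a new column.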
[R] simple ts.plot question
Respected Sir, I have a simple question regarding plots of time series in R. I have to plot conc against time for each individual and display them in the same panel for the built-in dataset Indometh in R. I have six time series, say subject1.ts, subject2.ts, ..., subject6.ts. The observations are taken at an interval of 0.25 hr. All of the series range from 0.25 hr to 8.00 hr. How can I plot all six time series in the same panel? I am eagerly waiting for your reply. Thanks in advance.
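One way this could be approached, sketched with base graphics only: reshape Indometh so that each subject is a column, then draw all columns with matplot(). The object names conc_wide and times are assumptions, and plotting against the recorded time column avoids having to assume a regular sampling interval.

```r
# One row per sampling time, one column per subject
# (each time/subject cell holds a single observation, so mean() just returns it)
conc_wide <- with(Indometh, tapply(conc, list(time, Subject), mean))
times <- as.numeric(rownames(conc_wide))

# All six concentration curves in a single panel
matplot(times, conc_wide, type = "l", lty = 1, col = 1:6,
        xlab = "Time (hr)", ylab = "Concentration (mcg/ml)")
legend("topright", legend = colnames(conc_wide), lty = 1, col = 1:6,
       title = "Subject")
```

If genuine ts objects with a common regular spacing are available, ts.plot() accepts several series at once and would be the more direct route.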
[R] Call STATA code from within R
Dear R Users, Do you know whether there is any way of calling STATA from within R (possibly in a way similar to how WinBUGS can be called from within R using the function bugs())? Thanks, Manuel
[R] Create sequence given start and end vector
How can I create the following without the 'for' loop?

start=c(1,10,20)
end=c(4,15,27)
out=c()
for (i in 1:length(start)) { out=c(out,start[i]:end[i]) }
out
[1] 1 2 3 4 10 11 12 13 14 15 20 21 22 23 24 25 26 27

I know there must be an easier (and, hopefully, faster) way.

Many thanks in advance,
Kevin Ummel
Central European University
Department of Environmental Science and Policy
Re: [R] Create sequence given start and end vector
Hi Kevin, Here is a suggestion using mapply():

start <- c(1,10,20)
end <- c(4,15,27)
do.call(c, mapply( seq, start, end))

See ?mapply and ?do.call for more information.

HTH, Jorge

On Sat, Dec 12, 2009 at 2:27 PM, Kevin Ummel wrote:
> How can I create the following without the 'for' loop?
> start=c(1,10,20)
> end=c(4,15,27)
> out=c()
> for (i in 1:length(start)) { out=c(out,start[i]:end[i]) }
> out
> [1] 1 2 3 4 10 11 12 13 14 15 20 21 22 23 24 25 26 27
> I know there must be an easier (and, hopefully, faster) way.
> Many thanks in advance,
> Kevin Ummel
> Central European University
> Department of Environmental Science and Policy
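A small hedged variation on the mapply() suggestion: when every range happens to have the same length, mapply() simplifies its result to a matrix and do.call(c, ...) no longer receives a list, so asking for SIMPLIFY = FALSE is safer. The second form below is a loop-free alternative built on sequence(); the variable names are assumptions for illustration.

```r
start <- c(1, 10, 20)
end   <- c(4, 15, 27)

# SIMPLIFY = FALSE always yields a list, whatever the range lengths are
out_mapply <- unlist(mapply(seq, start, end, SIMPLIFY = FALSE))

# Fully vectorized: repeat each start, then add the within-range offsets 0, 1, 2, ...
len <- end - start + 1
out_vec <- rep(start, len) + sequence(len) - 1

# Equal-length ranges illustrate why SIMPLIFY matters
simplified <- mapply(seq, c(1, 5), c(3, 7))                    # a 3 x 2 matrix
as_list    <- mapply(seq, c(1, 5), c(3, 7), SIMPLIFY = FALSE)  # a list of 2
```

The sequence() form avoids one function call per range, which may matter when start and end are long.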
Re: [R] Create sequence given start and end vector
Also what about

c(seq(1,4,1),seq(10,15,1),seq(20,27,1))

Joe King
j...@joepking.com
Never throughout history has a man who lived a life of ease left a name worth remembering. --Theodore Roosevelt

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Jorge Ivan Velez
Sent: Saturday, December 12, 2009 1:43 PM
To: Kevin Ummel
Cc: r-help@r-project.org
Subject: Re: [R] Create sequence given start and end vector

> Hi Kevin, Here is a suggestion using mapply():
> start <- c(1,10,20)
> end <- c(4,15,27)
> do.call(c, mapply( seq, start, end))
> See ?mapply and ?do.call for more information.
> HTH, Jorge
>
> On Sat, Dec 12, 2009 at 2:27 PM, Kevin Ummel wrote:
>> How can I create the following without the 'for' loop?
>> start=c(1,10,20)
>> end=c(4,15,27)
>> out=c()
>> for (i in 1:length(start)) { out=c(out,start[i]:end[i]) }
>> out
>> [1] 1 2 3 4 10 11 12 13 14 15 20 21 22 23 24 25 26 27
>> I know there must be an easier (and, hopefully, faster) way.
>> Many thanks in advance,
>> Kevin Ummel
>> Central European University
>> Department of Environmental Science and Policy
Re: [R] help with graphing -- Points in my graph are not apparent, always displayed in steps
Hi Philip, I must confess that I did not understand what the problem is. Could you clarify it a little bit more?

Cheers
miltinho
brazil=toronto

On Sat, Dec 12, 2009 at 12:39 AM, philip robinson robin...@students.wwu.edu wrote:
> I am trying to graphically represent a large set of data whose result is not strictly uniform.
> http://n4.nabble.com/file/n961629/egraph_rules_list_2.png
> The scatter plot to the left has all of the data rising in steps; however, I know that there are cases within my data that do not fit the dotted line.
>
> temp <- edat[edat$R1SC > 0,,]
> png("egraph_rules_list_1.png",width=800,height=700,res=72);
> par(mfrow=c(2,2));
> qqplot(x=temp$words,y=temp$R1SC,ylab="With Rules applied SC Shortlist",xlab="Number of Words",col="blue",main="Subordinating Conjunctions\n(Number of Words)",type="p");
> hist(temp$R1SC/temp$words,col=heat.colors(max(temp$R1SC)),main="Subordinating Conjunctions \n/ Number of Words");
> temp <- edat[edat$R1CC > 0,,]
> qqplot(x=temp$words,y=temp$R1CC,ylab="With Rules applied CC Shortlist",xlab="Number of Words",col="purple",main="Coordinating Conjunctions\n(Number of Words)",type="p");
> hist(temp$R1CC/temp$words,col=heat.colors(max(temp$R1CC)),main="Coordinating Conjunctions \n/ Number of Words");
> dev.off();
>
> your help is much appreciated
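One possible explanation, offered here as an assumption because the thread does not confirm it: qqplot() sorts x and y independently before plotting, so the pairing between a text's word count and its conjunction count is lost and the points can only rise in steps. A sketch with made-up data (words and counts are hypothetical stand-ins for temp$words and temp$R1SC):

```r
set.seed(1)
words  <- sample(5:60, 200, replace = TRUE)  # hypothetical word counts
counts <- rpois(200, lambda = words / 15)    # hypothetical conjunction counts

# qqplot() returns the coordinates it would draw: both vectors are sorted
qq <- qqplot(words, counts, plot.it = FALSE)

# An ordinary scatter plot keeps each (words, counts) pair together,
# so observations that break the trend stay visible
plot(words, counts, xlab = "Number of Words", ylab = "Conjunctions")
```

If the goal is to see individual cases rather than to compare two distributions, plot() (perhaps with jitter() for the discrete counts) is probably the better fit.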
[R] multi-dimensional array with different number of dimensions?
Hi, Is it possible to assign to an array with different dimensions? That is to say, supposing a three dimensional array, the third dimension of the array has matrices of different sizes?

array
, , 1
    [1] [2] [3]
[1]  1   1   1
, , 2
    [1] [2] [3]
[1]  1   1   1
[2]  1   1   1
, , 3
    [1] [2] [3]
[1]  1   1   1
[2]  1   1   1
[3]  1   1   1

something like this??

Thanks,
B
[R] Replace NAs in a range of data frame columns
Dear all, I'm stuck on a seemingly trivial task that I need to perform for many datasets. Basically, I want to replace NA with 0 in a specified range of columns in a data frame. I know the first and last columns to be recoded only by name. I can select the columns like this:

a[match('first',names(a)): match('last',names(a))]

The question is how I can replace all NA with 0 in this subset of the data?

Thanks and greetings,
Michael
Re: [R] Replace NAs in a range of data frame columns
hi Michael, the following code should work

b <- a[match('first',names(a)): match('last',names(a))]
b[is.na(b)] <- 0
a[match('first',names(a)): match('last',names(a))] <- b

cheers,
Patrizio

2009/12/13 Michael Scharkow mich...@underused.org:
> Dear all, I'm stuck in a seemingly trivial task that I need to perform for many datasets. Basically, I want to replace NA with 0 in a specified range of columns in a dataframe. I know the first and last column to be recoded only by its name. I can select the columns starting like this
> a[match('first',names(a)): match('last',names(a))]
> The question is how can replace all NA with 0 in this subset of the data?
> Thanks and greetings, Michael

--
+-
| Patrizio Frederic, PhD
| Assistant Professor,
| Department of Economics,
| University of Modena and Reggio Emilia,
| Via Berengario 51,
| 41100 Modena, Italy
|
| tel: +39 059 205 6727
| fax: +39 059 205 6947
| mail: patrizio.frede...@unimore.it
+-
Re: [R] matched pair proportion test
Yes, thanks. It's exactly what I want.

Annie

On Sat, Dec 12, 2009 at 12:46 PM, Peter Dalgaard p.dalga...@biostat.ku.dk wrote:
> annie Zhang wrote:
>> Hi, ALL, Is there any function in R that does the exact test for the matched pair proportions (one sided), which I assume is binomial(b+c, .5).
>
> binom.test should fit nicely, I think.
>
>> Thanks, Annie
>
> --
>    O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
>   c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
>  (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
> ~~~~~~~~~~ - (p.dalga...@biostat.ku.dk)              FAX: (+45) 35327907
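To make the binom.test() suggestion concrete, here is a minimal sketch of the one-sided exact test on the discordant pairs. The counts n10 and n01 (playing the roles of b and c in the question) are hypothetical numbers chosen for illustration.

```r
n10 <- 12  # hypothetical: pairs that are "yes" under condition 1, "no" under condition 2
n01 <- 4   # hypothetical: pairs that are "no" under condition 1, "yes" under condition 2

# Under the null, n10 ~ Binomial(n10 + n01, 0.5); concordant pairs carry no information
res <- binom.test(n10, n10 + n01, p = 0.5, alternative = "greater")
res$p.value

# The same tail probability computed directly from the binomial density
direct <- sum(dbinom(n10:(n10 + n01), size = n10 + n01, prob = 0.5))
```

Choosing alternative = "greater" or "less" depends on which direction of change the one-sided hypothesis refers to.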
Re: [R] multi-dimensional array with different number of dimensions?
On Dec 12, 2009, at 5:36 PM, parkbomee wrote:

> Hi, Is it possible to assign to an array with different dimensions? That is to say, supposing a three dimensional array, the third dimension of the array has matrices of different sizes?

Use a list. Or populate with NA's

> array
> , , 1
>     [1] [2] [3]
> [1]  1   1   1
> , , 2
>     [1] [2] [3]
> [1]  1   1   1
> [2]  1   1   1
> , , 3
>     [1] [2] [3]
> [1]  1   1   1
> [2]  1   1   1
> [3]  1   1   1
> something like this??
> Thanks, B

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
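Both options from the reply can be sketched in a few lines. The object names mats and padded are assumptions; the shapes (1 x 3, 2 x 3 and 3 x 3 matrices of ones) follow the question.

```r
# Option 1: a list, where each element keeps its own dimensions
mats <- lapply(1:3, function(n) matrix(1, nrow = n, ncol = 3))
sapply(mats, nrow)

# Option 2: a regular 3 x 3 x 3 array, with unused cells left as NA
padded <- array(NA_real_, dim = c(3, 3, 3))
for (k in seq_along(mats)) {
  padded[seq_len(nrow(mats[[k]])), , k] <- mats[[k]]
}
padded[, , 1]
```

The list is usually easier to work with (lapply() applies a function to each matrix), while the padded array only pays off when later code really needs array indexing.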
Re: [R] Create sequence given start and end vector
On Sat, 12 Dec 2009, Jorge Ivan Velez wrote:

> Hi Kevin, Here is a suggestion using mapply():
> start <- c(1,10,20)
> end <- c(4,15,27)
> do.call(c, mapply( seq, start, end))

...which is what I would usually do. But for heavy duty applications, the IRanges package and function may be worth studying:

require(IRanges) # from bioConductor.org
as.vector( IRanges( start, end ) )
[1] 1 2 3 4 10 11 12 13 14 15 20 21 22 23 24 25 26 27

An introduction to the package is at
http://bioconductor.org/packages/2.5/bioc/vignettes/IRanges/inst/doc/IRangesOverview.pdf

HTH, Chuck

> See ?mapply and ?do.call for more information.
> HTH, Jorge
>
> On Sat, Dec 12, 2009 at 2:27 PM, Kevin Ummel wrote:
>> How can I create the following without the 'for' loop?
>> start=c(1,10,20)
>> end=c(4,15,27)
>> out=c()
>> for (i in 1:length(start)) { out=c(out,start[i]:end[i]) }
>> out
>> [1] 1 2 3 4 10 11 12 13 14 15 20 21 22 23 24 25 26 27
>> I know there must be an easier (and, hopefully, faster) way.
>> Many thanks in advance,
>> Kevin Ummel
>> Central European University
>> Department of Environmental Science and Policy

Charles C. Berry                            (858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu               UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901
Re: [R] Replace NAs in a range of data frame columns
On Dec 12, 2009, at 6:15 PM, Patrizio Frederic wrote:

> hi Michael, the following code should work
> b <- a[match('first',names(a)): match('last',names(a))]
> b[is.na(b)] <- 0

This might not throw an error:

b <- apply(a[match('first',names(a)): match('last',names(a))], 1:2, function(x) ifelse(is.na(x), 0, x) )

> a[match('first',names(a)): match('last',names(a))] <- b

And this might actually replace a range of columns:

a[ , match('first',names(a)): match('last',names(a))] <- b

more cheer;
David

> cheers, Patrizio
>
> 2009/12/13 Michael Scharkow mich...@underused.org:
>> Dear all, I'm stuck in a seemingly trivial task that I need to perform for many datasets. Basically, I want to replace NA with 0 in a specified range of columns in a dataframe. I know the first and last column to be recoded only by its name. I can select the columns starting like this
>> a[match('first',names(a)): match('last',names(a))]
>> The question is how can replace all NA with 0 in this subset of the data?
>> Thanks and greetings, Michael
>
> --
> +-
> | Patrizio Frederic, PhD
> | Assistant Professor,
> | Department of Economics,
> | University of Modena and Reggio Emilia,
> | Via Berengario 51,
> | 41100 Modena, Italy
> |
> | tel: +39 059 205 6727
> | fax: +39 059 205 6947
> | mail: patrizio.frede...@unimore.it
> +-

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
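A self-contained sketch of the column-range replacement discussed in this thread, on a small made-up data frame. The columns id, mid and other and all values are hypothetical; only the names first and last come from the question.

```r
# Hypothetical data: NAs both inside and outside the first..last range
a <- data.frame(
  id    = 1:4,
  first = c(1, NA, 3, NA),
  mid   = c(NA, 2, NA, 4),
  last  = c(NA, NA, 1, 2),
  other = c(NA, 1, NA, 1)
)

# Positions of the columns to recode, located by name
cols <- match("first", names(a)):match("last", names(a))

# Replace NA by 0 only inside that range; 'id' and 'other' are untouched
a[cols][is.na(a[cols])] <- 0
a
```

This assumes the selected columns are numeric; for factor columns, assigning 0 would need the level to exist first.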
Re: [R] Have you used RGoogleDocs and RGoogleData?
Farrel Buchinsky wrote:
> It Works! Thanks a lot! It's great.

Thanks for letting me know. Glad that fixed things for you.

> What were your few minor, but important, changes - in a nutshell. I will not understand unless you describe it as high level issues.

Basically, recognizing the type of a document, e.g. a spreadsheet or word processing document or generic document. The changes made the detection more robust or more consistent with any changes at Google.

D.

> Farrel Buchinsky
> Google Voice Tel: (412) 567-7870
>
> On Fri, Dec 11, 2009 at 19:07, Duncan Temple Lang dun...@wald.ucdavis.edu wrote:
>> Hi Farrel
>> I have taken a look at the problems using RGoogleDocs to read spreadsheets and was able to reproduce the problem I believe you were having. A few minor, but important, changes and I can read spreadsheets again and apparently still other types of documents.
>> I have put an updated version of the source of the package with these changes. It is available from
>> http://www.omegahat.org/RGoogleDocs/RGoogleDocs_0.4-1.tar.gz
>> There is a binary for Windows in
>> http://www.omegahat.org/RGoogleDocs/RGoogleDocs_0.4-1.zip
>> Hopefully this will cure the problems you have been experiencing. I'd appreciate knowing either way.
>> Thanks, D.
>>
>> Farrel Buchinsky wrote:
>>> Both of these applications fulfill a great need of mine: to read data directly from google spreadsheets that are private to myself and one or two collaborators. Thanks to the authors. I had been using RGoogleDocs for about 6 months (maybe more) but have had to stop using it in the past month since, for some reason that I do not understand, it no longer reads google spreadsheets. I loved it. Its loss depresses me. I started using RGoogleData, which works. I have noticed that both packages read data slowly. RGoogleData is much slower than RGoogleDocs used to be. Both seem a lot slower than if one manually downloaded a google spreadsheet as a csv and then used the read.csv function - but then I would not be able to use scripts and execute without finding and futzing.
>>> Can anyone explain in English why these packages read slower than a csv download?
>>> Can anyone explain what the core difference is between the two packages?
>>> Can anyone share their experience with reading Google data straight into R?
>>> Farrel Buchinsky
>>> Google Voice Tel: (412) 567-7870
>>> Sent from Pittsburgh, Pennsylvania, United States
[R] xtabs - missing combination
Dear list, I am trying to make a contingency table with xtabs but I am getting a 0 where I expect a 'NA'. Here is a simple example:

options(stringsAsFactors = FALSE)
rn <- LETTERS[1:4]
df1 <- data.frame(r07 = rep(rn, each=4), r08 = rep(rn, 4), value = 1:16)
xtabs(value ~ r07 + r08, df1)
# Delete the combination [A, C]
df1 <- df1[-3,]
# Set 'value' for this combination to 0
df1[13, 3] <- 0
# This is the output I want
tapply(df1[, "value"], df1[, c("r07", "r08")], c)
# but using 'xtabs' I get a 0 for [A, C]
xtabs(value ~ r07 + r08, df1)

Hmm, what have I missed... Thanks for any help!

Best,
Patrick
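A sketch of one possible workaround, not taken from the thread: xtabs() sums the left-hand side within each cell, and an empty cell sums to 0, so a second xtabs() call that counts rows can be used to mark the empty cells as NA. The names totals and counts are assumptions for illustration.

```r
rn <- LETTERS[1:4]
df1 <- data.frame(r07 = rep(rn, each = 4), r08 = rep(rn, 4), value = 1:16,
                  stringsAsFactors = FALSE)
df1 <- df1[-3, ]  # drop the [A, C] combination, as in the question

totals <- xtabs(value ~ r07 + r08, df1)  # sum of 'value' per cell (0 if empty)
counts <- xtabs(~ r07 + r08, df1)        # number of rows per cell

# Cells with no rows are missing, not zero
totals[counts == 0] <- NA
totals
```

This keeps a genuine 0 (a combination that is present with value 0) distinct from a combination that never occurs.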
[R] Non-linear Weibull model for aggregated parasite data
Hi, I am trying to fit a non-linear model for a parasite dataset. Initially, I tried log-transforming the data and conducting a 2-way ANCOVA, and found that the assumptions of equal population variances and normality were violated. Gaba et al. (2005) suggest that the Weibull distribution is best for highly aggregated parasite distributions, and that it performs better (lower type 1 and type 2 error rates) than models using normal (with log-transformed data) and negative binomial error structures. I have looked at the R help site and had no success in conducting the analysis, so I had no choice but to turn to the R masters.

The dependent variable is coccidiaopg (a fecal egg count) and the independent variables are age (continuous), year (continuous), sex (2 level factor), and season (2 level factor). The variable sex is a nested factor in season because different individuals were sampled during the different seasons.

I may need to talk with a local statistician, but if it is simple for someone to help with the code to execute this analysis in R, I would be very grateful. Also, I am unsure how to estimate the starting parameters for shape and scale. Thank you.

Best,
Daniel Eacker
Re: [R] confint for glm (general linear model)
for an example,

counts <- c(18,17,15,20,10,20,25,13,12)
outcome <- gl(3,1,9); treatment <- gl(3,3)
glm.D93 <- glm(counts ~ outcome + treatment, family=poisson())
confint(glm.D93)
confint.default(glm.D93) # based on asymptotic normality

to verify the confidence interval (confint.default(glm.D93)) for outcome2

-4.542553e-01 + c(-1,1) * 0.2021708 * qt(0.975,df=4)
-1.0155714 0.1070608

does not give me

outcome2 -0.8505027 -0.05800787

as in confint.default(glm.D93)

Thanks
[R] too large dimension problem
Dear R family

When I run the command below, the error message came up. It seems like the problem is about computer capacity. It would be appreciated if anyone could give me a solution.

###
N <- 415884
tau <- diag(1, N)[c(N, 1:(N - 1)),]
Error in array(0, c(n, p)) : 'dim' specifies too large an array

Best
Moohwan
[R] Hmisc filled bands colors
Hi all, I am still using the Hmisc package and I like the filled bands part. Is there a way, though, to have different groups have different colors of bands, maybe a lighter version of the color of the line that is used?

---
Joe King, M.A.
Ph.D. Student
University of Washington - Seattle
206-913-2912
j...@joepking.com
---
Never throughout history has a man who lived a life of ease left a name worth remembering. --Theodore Roosevelt
[R] lines don't wrap. must scroll horizontally to see/edit a long line in R GUI
I'm facing this problem on R GUI version 2.10.0 on Windows Vista. I have not changed Windows settings or R GUI settings much except to change from MDI to SDI. Someone else reported this problem a few months ago: https://stat.ethz.ch/pipermail/r-help/2009-April/195714.html but it wasn't followed up. I'd change the settings on Preferences, but there's no help explaining the various options. (That's another problem, by the way... Is there a link you know?) I guess fixing this would be straightforward. Hope someone who did will reply. Thanks.
Re: [R] lines don't wrap. must scroll horizontally to see/edit a long line in R GUI
Hi Viju, Here is a suggestion:

R> options(width = 80)
R> 1:120

HTH, Jorge

On Sat, Dec 12, 2009 at 10:34 PM, Viju Moses wrote:
> I'm facing this problem on R GUI version 2.10.0 on Windows Vista. I have not changed Windows settings or R GUI settings much except to change from MDI to SDI. Someone else reported this problem a few months ago: https://stat.ethz.ch/pipermail/r-help/2009-April/195714.html but it wasn't followed up. I'd change the settings on Preferences, but there's no help explaining the various options. (That's another problem, by the way... Is there a link you know?) I guess fixing this would be straightforward. Hope someone who did will reply. Thanks.
[R] Easily switch between R GUI and R term on Windows?
Is there an easy way to switch between R GUI and R terminal? I'm currently using R GUI 2.10.0 on Windows Vista and would like to see something like the R terminal found in Linux. But I may also want to switch back to GUI. I don't remember seeing the terminal option at the R download site, or during installation. I've set the pager style to SDI, and that has helped partly. Thanks.
Re: [R] lines don't wrap. must scroll horizontally to see/edit a long line in R GUI
Dear Jorge,

Thanks. But. (My R console width is set by default at 91, which is fine. Output for R> 1:120 wraps correctly already.)

The problem is that when I am typing a command, it does not wrap like it does in this email while typing. As I reach the right border, a horizontal scroll bar appears, and as I keep typing further, the initial part of the line starts disappearing beyond the left border of the console. So, I am unable to see a long line completely without moving the scroll bar right and left.

From: Jorge Ivan Velez [mailto:jorgeivanve...@gmail.com]
Sent: Sunday, December 13, 2009 9:09
To: Viju Moses
Cc: r-help@r-project.org
Subject: Re: [R] lines don't wrap. must scroll horizontally to see/edit a long line in R GUI

> Hi Viju, Here is a suggestion:
> R> options(width = 80)
> R> 1:120
> HTH, Jorge
>
> On Sat, Dec 12, 2009 at 10:34 PM, Viju Moses wrote:
>> I'm facing this problem on R GUI version 2.10.0 on Windows Vista. I have not changed Windows settings or R GUI settings much except to change from MDI to SDI. Someone else reported this problem a few months ago: https://stat.ethz.ch/pipermail/r-help/2009-April/195714.html but it wasn't followed up. I'd change the settings on Preferences, but there's no help explaining the various options. (That's another problem, by the way... Is there a link you know?) I guess fixing this would be straightforward. Hope someone who did will reply. Thanks.
Re: [R] too large dimension problem
On Dec 12, 2009, at 8:31 PM, Moohwan Kim wrote:

> Dear R family
> When I run the command below, the error message came up. It seems like the problem is about computer capacity. It would be appreciated if anyone could give me a solution.

You would want not just to store such an object but also to manipulate it, so the answer must be: buy a computer with 500GB of RAM.

> ###
> N <- 415884
> tau <- diag(1, N)[c(N, 1:(N - 1)),]
> Error in array(0, c(n, p)) : 'dim' specifies too large an array
> Best
> Moohwan

--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
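A hedged alternative to materializing the matrix: tau as defined in the question is a cyclic row-shift permutation, so multiplying by it only reorders elements, and the same effect can be had by indexing. The helper shift_rows() is a hypothetical name introduced here, and how tau was meant to be used downstream is an assumption, since the thread does not say.

```r
N <- 415884
N^2 * 8 / 1e12  # rough size, in terabytes, of a dense N x N matrix of doubles

# Hypothetical helper: same result as tau %*% x, without ever building tau
shift_rows <- function(x) {
  n <- NROW(x)
  idx <- c(n, seq_len(n - 1))  # last row first, then rows 1..n-1
  if (is.matrix(x)) x[idx, , drop = FALSE] else x[idx]
}

# Check against the explicit permutation matrix on a small case
n <- 6
tau_small <- diag(1, n)[c(n, 1:(n - 1)), ]
x <- rnorm(n)
agree <- all.equal(as.vector(tau_small %*% x), shift_rows(x))

# The full-size vector is no problem
big <- shift_rows(seq_len(N))
```

If an actual matrix object is required, a sparse representation (for example from the Matrix package) stores only the N non-zero entries.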
Re: [R] confint for glm (general linear model)
On Dec 12, 2009, at 8:19 PM, casperyc wrote:

> for an example,
> counts <- c(18,17,15,20,10,20,25,13,12)
> outcome <- gl(3,1,9); treatment <- gl(3,3)
> glm.D93 <- glm(counts ~ outcome + treatment, family=poisson())
> confint(glm.D93)
> confint.default(glm.D93) # based on asymptotic normality
> to verify the confidence interval (confint.default(glm.D93)) for outcome2
> -4.542553e-01 + c(-1,1) * 0.2021708 * qt(0.975,df=4)
> -1.0155714 0.1070608
> does not give me
> outcome2 -0.8505027 -0.05800787
> as in confint.default(glm.D93)

But this does (up to rounding anyway):

coef(summary(glm.D93))[2,1] + c(-1,1) * coef(summary(glm.D93))[2,2]*qnorm(0.975)
[1] -0.85050267 -0.05800787

I can understand thinking that the CI's might be t-distributed but the usual formulation is that they are normally distributed.

--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
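The check in the reply can be extended to every coefficient at once. This sketch rebuilds the Wald intervals from the estimates and standard errors with normal quantiles and compares them with confint.default(); the names est, se and wald are assumptions for illustration.

```r
counts <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome <- gl(3, 1, 9)
treatment <- gl(3, 3)
glm.D93 <- glm(counts ~ outcome + treatment, family = poisson())

est <- coef(glm.D93)
se  <- sqrt(diag(vcov(glm.D93)))

# Wald interval: estimate +/- z * standard error, with z from the normal distribution
z <- qnorm(0.975)
wald <- cbind(lower = est - z * se, upper = est + z * se)

wald["outcome2", ]
same <- all.equal(unname(wald), unname(confint.default(glm.D93)))
```

confint(glm.D93), by contrast, uses profile likelihood, so its limits will generally differ a little from these Wald limits.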