[R] Are there ANOVA for compositional data?
The compositional data are x_i = (x_i1, x_i2, ..., x_in); for each fixed i, x_ij > 0 and sum_j(x_ij) = 1.
I want to compare the means (u_i) of several groups, i.e.
H0: u_1 = u_2 = ... = u_N
or, for a single component j,
H0_j: u_1j = u_2j = ... = u_Nj
Are there any ANOVA-type tools to do this work in R?

Thanks,
WEN S Q

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] About compositional data analysis
Well, one place to start is to read the following vignette:
http://finzi.psych.upenn.edu/R/library/compositions/doc/UsingCompositions.pdf
This was found using the search function RSiteSearch("compositional data") in R.

You may also want to study

@Article{aitchison82,
  author  = {J. Aitchison},
  title   = {The statistical analysis of compositional data},
  journal = jrssb,
  year    = 1982,
  volume  = 44,
  number  = 2,
  pages   = {139-177},
  annote  = {With discussion.}
}

@Book{aitchison86,
  author    = {J. Aitchison},
  title     = {The Statistical Analysis of Compositional Data},
  publisher = {Chapman and Hall},
  year      = 1986,
  series    = {Monographs on Statistics and Applied Probability},
  address   = {London},
  annote    = {A greatly expanded version of the original 1982 paper, with lots of examples of hypothesis testing}
}

Best regards
Frede Aakmann Tøgersen

-----Original message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On behalf of S.Q. WEN
Sent: 17 October 2006 06:50
To: R-help@stat.math.ethz.ch
Subject: [R] About compositional data analysis

> The compositional data are x_i = (x_i1, x_i2, ..., x_in); for each fixed i, x_ij > 0 and sum_j(x_ij) = 1.
> I want to compare the means (u_i) of several groups, i.e.
> H0: u_1 = u_2 = ... = u_N
> or
> H0: u_11 = u_21 = ... = u_N1
> Are there any ANOVA-type tools to do this work in R?
>
> Thanks,
> WEN S Q
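
One common route from Aitchison's log-ratio approach is to transform the compositions and then run an ordinary MANOVA on the transformed values. The following is only a minimal base-R sketch under stated assumptions: the data are simulated, the names `comp`, `group` and `alr` are hypothetical, and the additive log-ratio (alr) with the last part as divisor is just one of several possible transforms. The compositions package mentioned above offers dedicated tools and is not used here.

```r
set.seed(1)

# Hypothetical data: 3 groups, 20 compositions each, 3 parts per composition
n_per_group <- 20
group <- factor(rep(c("A", "B", "C"), each = n_per_group))
raw <- matrix(rexp(3 * n_per_group * 3), ncol = 3)
raw[group == "C", 1] <- raw[group == "C", 1] * 3  # shift group C for illustration
comp <- raw / rowSums(raw)                        # each row is > 0 and sums to 1

# Additive log-ratio transform: log of each part relative to the last part
alr <- log(comp[, -ncol(comp)] / comp[, ncol(comp)])

# MANOVA on the transformed data tests equality of the group mean vectors
fit <- manova(alr ~ group)
res <- summary(fit, test = "Pillai")
print(res)
```

Component-wise hypotheses of the second kind could then be looked at with summary.aov(fit), keeping in mind that each alr coordinate is a ratio against the chosen divisor part rather than a raw component.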
Re: [R] Error Correcting Codes, Simplex
On 10/16/06, Björn Egert [EMAIL PROTECTED] wrote:
> > On 10/8/06, Egert, Bjoern [EMAIL PROTECTED] wrote:
> > > Hello,
> > > Is there a way in R to construct an (error correcting) binary code, e.g. for a source alphabet containing integers from 1 to, say, 255, with the property that each pair of distinct codewords of length m is at Hamming distance exactly m/2?
> > > It was suggested that I use so-called simplex codes, which should be fairly standard, but I haven't found a direct way to do so via R packages. That's why I ask whether there might be an indirect way to solve this problem.
> > >
> > > Example:
> > > v1 = c(1,2,3,4)
> > > v2 = c(1,2,5,6)
> > > similarity(v1,v2) = 0.5 (because 2 out of 4 elements are equal).
> > > Obviously, a plain binary representation would yield a different similarity:
> > > binary(v1) = 001 010 011 100
> > > binary(v2) = 001 010 101 110
> > > similarity(binary(v1),binary(v2)) = 9/12
> > >
> > > Remark: The focus here is not on error correction, but rather on a binary encoding that retains the similarity of the elements of vectors.
> > > Many thanks,
> > > Bjoern
> >
> > Bjoern,
> > NB: I'm an R newbie and I only know a bit about error correcting codes. I haven't seen any responses to your questions and I don't know if you still have a need, but it is certainly possible to construct forward error correction codes with all the great math capability in R.
> > It seems you want to generate code words that still have the original bits present. These are systematic codes and there are lots of them available to use. Many codes are specified by the code word length (n), the number of original data bits in each code word (k), and the minimum Hamming distance of the code words (d) as an [n,k,d] code. Simplex codes have these parameters: [2^k - 1, k, 2^(k - 1)]. These codes could be generated as a simple matrix multiply in R, but are you sure that's what you want? The code words will be quite long.
> > Regards,
> > Richard Graham
>
> Hello,
> thank you. Yes, basically, that's what I want. Just a binary encoding of an arbitrary integer value (or vector of integers) with the property that each pair of distinct integer values has an equal Hamming distance (m/2), so as to be able to do a similarity search.
> I got the idea from: Gionis: Efficient and Tunable Similar Set Retrieval (Chap 3.2)
> regards
> Bjoern

Bjoern,
I read only the section of the paper you mention and I'll trust that the stated properties of simplex codes are true. I haven't researched or verified it.

[from http://magma.maths.usyd.edu.au/]
Magma is a large, well-supported software package designed to solve computationally hard problems in algebra, number theory, geometry and combinatorics. It provides a mathematically rigorous environment for computing with algebraic, number-theoretic, combinatoric and geometric objects.

I don't understand a fraction of its capability but I still find it to be very useful. In fact, they have an online calculator that will give you the generator matrix you want. The online Magma calculator is at:
http://magma.maths.usyd.edu.au/calc/

To calculate the generator matrix I think you are asking for, go to the above URL and cut/paste the following command:

ExtendCode(SimplexCode(8));

Click Evaluate and the output window will contain a [256, 8, 128] Linear Code over GF(2). You'll need to massage this a bit to use it as a matrix for R. I'd use Ruby to do this, but anything will do. If you want to encode more or fewer than 8 bits, you can modify the above argument to SimplexCode. I used ExtendCode so that the codeword length == Dmin * 2.

The Gionis claim that I'll research or verify sometime is that _every_ pair of simplex code words of length m has Hamming distance == m/2. If you have a reference to a proof, I'd like to read it (like I said, I only know a bit about ECC).

Good luck with your work!
Richard Graham
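
The "simple matrix multiply" mentioned in this thread can be sketched directly in base R, without going through Magma. This is an illustrative sketch only: the function name `extended_simplex` is made up here, the generator matrix is built by taking every nonzero binary k-vector as a column, and the extension step appends an overall parity bit. Because the code is linear and every nonzero simplex codeword has weight 2^(k-1), any two distinct codewords differ in exactly 2^(k-1) positions, which is half the extended length 2^k when k >= 2.

```r
# Build all 2^k codewords of the extended simplex code as rows of a 0/1 matrix.
# Assumption: k >= 2, so that all codeword weights are even.
extended_simplex <- function(k) {
  # All 2^k binary messages of length k, one per row (first row is all zeros)
  msgs <- as.matrix(expand.grid(rep(list(0:1), k)))
  dimnames(msgs) <- NULL
  # Generator matrix: its columns are all nonzero binary k-vectors
  G <- t(msgs[-1, , drop = FALSE])          # k x (2^k - 1)
  cw <- (msgs %*% G) %% 2                   # arithmetic over GF(2)
  # Extend with an overall parity bit so the length becomes 2^k
  cbind(cw, rowSums(cw) %% 2)
}

cw <- extended_simplex(3)                   # 8 codewords of length 8
# Pairwise Hamming distances (Manhattan distance on 0/1 vectors)
d <- as.matrix(dist(cw, method = "manhattan"))
print(unique(d[upper.tri(d)]))
```

With k = 8 this gives 256 codewords of length 256, so an integer v in 0..255 could be mapped to row v + 1; as noted in the thread, the codewords get long quickly.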
[R] Review process for new packages
Hi all,
I'm currently working on a creditmetrics package, which includes functions for computing the CreditMetrics credit risk model. I guess it will be finished in a few days. My question now is: is there some review process before sending it to CRAN, or is it reviewed after it has been sent?

best regards
Andreas
Re: [R] lda
Pieter Vermeesch wrote:
> I'm trying to do a linear discriminant analysis on a dataset of three classes (Affinities), using the MASS library:
>
> data.frame2 <- na.omit(data.frame1)
> data.ld = lda(AFFINITY ~ ., data.frame2, prior = c(1,1,1)/3)
>
> Error in var(x - group.means[g, ]) : missing observations in cov/cor
>
> What does this error message mean and how can I get rid of it?

What does str(data.frame2) tell us?

Uwe Ligges

> Thanks!
> Pieter
Re: [R] lda
>>>>> "Pieter" == Pieter Vermeesch [EMAIL PROTECTED]
>>>>>     on Mon, 16 Oct 2006 19:15:59 +0200 writes:

    Pieter> I'm trying to do a linear discriminant analysis on a
    Pieter> dataset of three classes (Affinities), using the
    Pieter> MASS library:
                 ^^^^^^^
No, no! MASS *package* (please!)

    Pieter> data.frame2 <- na.omit(data.frame1)
    Pieter> data.ld = lda(AFFINITY ~ ., data.frame2, prior = c(1,1,1)/3)

    Pieter> Error in var(x - group.means[g, ]) : missing observations in cov/cor

    Pieter> What does this error message mean and how can I get rid of it?

You have (+ or -) 'Inf' data values, which na.omit() does not omit, and 'x - group.means[g, ]' contains 'Inf - Inf', which is NaN.

Ideally, MASS:::lda.default() would check for such a case and give a more user-friendly error message.

    Pieter> Thanks!

You're welcome.
Martin Maechler, ETH Zurich
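
The diagnosis above can be reproduced on a toy data frame. The sketch below uses made-up data (the column names `g`, `x` and `y` are hypothetical, not Pieter's variables) to show that na.omit() keeps rows containing -Inf, and one possible way to drop such rows with is.finite() before calling lda().

```r
# Toy data (hypothetical): one NA and one -Inf in the numeric columns
df <- data.frame(
  g = factor(c("a", "b", "a", "b", "a")),
  x = c(1, -Inf, 3, NA, 5),
  y = c(2, 4, 6, 8, 10)
)

# na.omit() removes the row containing NA but keeps the row containing -Inf
cleaned <- na.omit(df)

# Inf - Inf is NaN, which is what breaks the covariance computation
print(is.nan(Inf - Inf))

# Keep only rows whose numeric columns are all finite
num_cols <- vapply(cleaned, is.numeric, logical(1))
not_finite <- !is.finite(as.matrix(cleaned[, num_cols, drop = FALSE]))
finite_only <- cleaned[rowSums(not_finite) == 0, ]
print(finite_only)
```

Whether dropping such rows is appropriate depends on where the infinite values come from (for example a log of zero), so this is a way to locate them rather than a recommendation to discard them.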
Re: [R] Variance of fitted value in lm
You can get these via the predict function.

On 17/10/06, Li Zhang [EMAIL PROTECTED] wrote:
> Hi,
> I am wondering if a linear model lm(y ~ x1 + x2) calculates the variance of a fitted value.
> Thank you
> Li

--
David Barron
Said Business School
University of Oxford
Park End Street
Oxford OX1 1HP
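
A short sketch of what this looks like in practice, using simulated data (the variables and coefficients below are invented for illustration). predict() on an lm fit accepts se.fit = TRUE and then returns the standard error of each fitted value; squaring it gives the variance. The cross-check against sigma^2 times the hat-matrix diagonal is included only to show where the number comes from.

```r
set.seed(42)

# Hypothetical data for illustration
n  <- 30
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + 2 * x1 - x2 + rnorm(n)

fit  <- lm(y ~ x1 + x2)
pred <- predict(fit, se.fit = TRUE)

# Estimated variance of each fitted value = squared standard error
var_fitted <- pred$se.fit^2

# Cross-check: residual variance times the leverage h_ii
var_check <- summary(fit)$sigma^2 * hatvalues(fit)

print(head(cbind(fitted = pred$fit, variance = var_fitted)))
```

The same call with a newdata argument gives the corresponding quantities for new covariate values.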
Re: [R] lda
Dear Martin and Uwe,

I did indeed have a few -Inf values in my data frame. Few enough that I didn't notice them when I inspected my data. Thanks a lot for helping me better understand the MASS *package* :-)

Pieter

--
Pieter Vermeesch
ETH Zürich, Isotope Geology and Mineral Resources
Clausiusstrasse 25, NW C 85, CH-8092 Zurich, Switzerland
email: [EMAIL PROTECTED], tel: +41 44 632 4643
[R] Question about managing searching path
Hi all,
I'm having some trouble with managing the search path. In a function, I need to attach some data set at the beginning and detach it at the end, say:

myfunction <- function() { attach(mylist); ... ; detach(mylist) }

The problem is that, since I am still debugging this code, it sometimes hits an error and ends before reaching the end, so the data is left on the search path. What's the right way to make sure mylist is detached no matter what?

Thanks a lot.
best
Re: [R] Question about managing searching path
Dear Tong,
I think on.exit() does the job. Namely:

attach(YourData)
on.exit(detach(YourData))

vito

Tong Wang wrote:
> Hi all,
> I'm having some trouble with managing the search path. In a function, I need to attach some data set at the beginning and detach it at the end, say:
>
> myfunction <- function() { attach(mylist); ... ; detach(mylist) }
>
> The problem is that, since I am still debugging this code, it sometimes hits an error and ends before reaching the end, so the data is left on the search path. What's the right way to make sure mylist is detached no matter what?
>
> Thanks a lot.
> best

--
Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 6626240
fax: 091 485726/485612
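
To see the on.exit() suggestion in action, here is a small sketch in which the function fails on purpose. The search-path name "mylist_tmp" and the simulated error are assumptions made for the illustration; attaching under an explicit name simply makes it easy to check afterwards that the entry is gone.

```r
myfunction <- function(mylist) {
  # Attach under an explicit (hypothetical) name so it is easy to find again
  attach(mylist, name = "mylist_tmp")
  # Registered immediately: runs when the function exits, normally or by error
  on.exit(detach("mylist_tmp", character.only = TRUE))

  stop("simulated error while debugging")
}

result <- try(myfunction(list(a = 1, b = 2)), silent = TRUE)

# The attached list is no longer on the search path despite the error
print("mylist_tmp" %in% search())
```

Registering the on.exit() call directly after attach() matters: anything that fails between the two lines would still leave the data attached.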
Re: [R] New package Ryacas
Maybe there are timing problems using that setup with sockets? I once tried VMware (not with Ryacas but just to try it out) and found it slow, as can be expected with an emulated environment. Since you have Windows XP, just use the Windows version of Ryacas directly.

On 10/17/06, Simon Blomberg [EMAIL PROTECTED] wrote:
> Hi Gabor,
>
> I'm running Quantian (Debian) inside a VMware virtual machine, on a Windows XP host. I installed the latest version of yacas from the source tarball. I remembered to ./configure --enable-server to allow server connections. make and make install worked ok, after some fiddling. I checked that the yacas server option worked by doing yacas --server, and then telnet'ing to 127.0.0.1 to check. It worked fine. I installed Ryacas. I then tried it out and got the following error:
>
> library(Ryacas)
> Loading required package: XML
> yacas('Integrate(x)x;')
> [1] "Starting Yacas!"
> Error in socketConnection(host = "127.0.0.1", port = 9734, server = FALSE, :
>   unable to open connection
> In addition: Warning message:
> 127.0.0.1:9734 cannot be opened
> Accepting requests from port 9734
>
> I tried again (stubborn, I guess):
>
> yacas('Integrate(x)x;')
> [1] "Starting Yacas!"
> Accepting requests from port 9734
> YacasServer Could not bind to the socket : Address already in use
> /usr/local/lib/R/site-library/Ryacas/yacdir/R.ys(1) : File not found
> CommandLine(1) : Expecting ) closing bracket for sub-expression, but got x instead
>
> Any ideas where I may be going wrong? I don't know anything about sockets. I've cross-posted to r-sig-debian. They may be interested.
>
> Cheers,
> Simon.
>
> --
> Simon Blomberg, B.Sc.(Hons.), Ph.D, M.App.Stat.
> Centre for Resource and Environmental Studies
> The Australian National University
> Canberra ACT 0200 Australia
> T: +61 2 6125 7800 email: Simon.Blomberg_at_anu.edu.au
> F: +61 2 6125 0757
> CRICOS Provider # 00120C
>
> The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. - John Tukey.
[R] RODBC and NULL values
Dear All,

Writing sooner than I thought I'd need to.

I'm using R 2.4 on Mac OS X, with RODBC, PostgreSQL 8.1 and Actual's ODBC driver. I have all my data in Filemaker 8.5, but it is automatically exported into PostgreSQL for analysis, as Filemaker's ODBC and JDBC access is awful, slow and has a tendency to crash.

I have disability data where for each patient there is a survival time in years from disease onset to a particular disease stage, namely unilateral support, bilateral support, wheelchair use, and death. Valid values may include NULL (patient hasn't reached that stage), 0 (for example, patient needed support immediately at disease onset), and any positive integer.

When I query the database manually using psql, it is clear there are NULL values.

  3 |  3 | 18 |    | 27 | 1
    |    |    |    | 13 | 1
  1 |  5 |    |    | 10 | 0
 10 | 13 | 13 |    | 22 | 0

However, these are all converted to zeros when I use RODBC's sqlQuery(), making interpretation impossible. I have tried using the nullstring and na.strings options, but these don't seem to have any effect. I have tried various combinations of "NULL", "NA" and "". Forgive my awkward SQL.

channel = odbcConnect("ataxia", uid="mark")
disease = sqlQuery(channel, "select calc_survival_unilateral_support as unlateral, calc_survival_bilateral_support as bilateral, calc_survival_wheelchair as wheelchair, calc_survival_death as death, calc_follow_up as followup, has_family_history_ataxia as familial from clinical, patient where clinical.patient_fk = patient_id and excluded=0 and calc_walking_disability_valid=1")
disease # and show results

127   3   3  18   0  27   1
128   0   0   0   0  13   1
129   1   5   0   0  10   0
130  10  13  13   0  22   0

It doesn't seem to be the old "repeating rows NULL" bug talked about here: http://tolstoy.newcastle.edu.au/R/help/04/07/0803.html

Is this because my ODBC driver is not returning the correct values for RODBC to parse? Is there any way of debugging this (the intricacies of ODBC are beyond my skill), and is my only alternative to store a non-valid number in the database (999?) and use my query or R to remove those data points afterwards?

Looking in the archives, there are lots of people asking about how to convert NAs to numeric, but I want the NAs passed through unaltered!

Many thanks in advance,

Mark
--
Dr. Mark Wardle
Clinical research fellow and Specialist Registrar in Neurology,
C2-B2 link, Cardiff University, Heath Park, CARDIFF, CF14 4XN. UK
Re: [R] New package Ryacas
Just one other comment. If you want to try running Linux over Windows you might want to check out how the AndLinux project (google to find) is progressing. I had tried it about a year ago and it was much faster than VMware, although at that time it was still a bit immature.

On 10/17/06, Gabor Grothendieck [EMAIL PROTECTED] wrote:
> Maybe there are timing problems using that setup with sockets? I once tried VMware (not with Ryacas but just to try it out) and found it slow, as can be expected with an emulated environment. Since you have Windows XP, just use the Windows version of Ryacas directly.
Re: [R] RODBC and NULL values
Mark Wardle wrote:
> ...
> Is this because my ODBC driver is not returning the correct values for RODBC to parse? Is there any way of debugging this (the intricacies of ODBC are beyond my skill), and is my only alternative to store a non-valid number in the database (999?) and use my query or R to remove those data points afterwards?
> ...

Actually, it appears that the Actual ODBC driver isn't returning the data properly. I've just tested it using Excel and it returns zeros for NULLs. I wasn't able to use iodbctest, as it got very confused and tried to connect to a MySQL database (which I don't have).

There is no magic RODBC can do to fix this. It's a bit odd, as I use Filemaker to export data via raw SQL commands against the ODBC driver, and that does cope with NULLs, but it appears that fetching, at least with Excel and RODBC, does not.

I was just going to try installing Rdbi to see whether that has better luck, but I can't access CRAN this morning. Hopefully the 403 Forbidden message will be temporary!

So unless anyone knows a better alternative, I shall have to store nonsense values rather than NULLs in the database (or fix it within the SELECT query as a quick hack solution instead).

Best wishes,

Mark
--
Dr. Mark Wardle
Clinical research fellow and Specialist Registrar in Neurology,
C2-B2 link, Cardiff University, Heath Park, CARDIFF, CF14 4XN. UK
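
If the sentinel-value workaround described here is used, the recoding back to NA on the R side is a one-liner. The sketch below is only an illustration: the data frame is typed in by hand to mimic the four rows shown in the thread, with 999 standing in for NULL, and no database connection is involved.

```r
# Hypothetical query result in which the database stored 999 instead of NULL
disease <- data.frame(
  unilateral = c(3, 999, 1, 10),
  bilateral  = c(3, 999, 5, 13),
  wheelchair = c(18, 999, 999, 13),
  death      = c(999, 999, 999, 999),
  followup   = c(27, 13, 10, 22),
  familial   = c(1, 1, 0, 0)
)

sentinel <- 999

# Replace every occurrence of the sentinel with NA
disease[disease == sentinel] <- NA

print(disease)
print(mean(disease$wheelchair, na.rm = TRUE))
```

The sentinel must be a value that can never occur as a real survival time, otherwise genuine observations would be turned into NA as well.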
Re: [R] how can i compute the average of three blocks for each column ?
Hi

I haven't seen any answer yet, so I'll try. From your not very clear explanation I suspect you want to do some block aggregation.

> test
  block x1 x2 x3 x4 x5
1     1 23 22 23 24 23
2     1 21 25 26 21 39
3     1 23 24 22 23 23
4     2 20 21 23 24 28
5     2 32 23 34 24 26
6     2 19 34 34 13 34
7     3 12 32 23 34 19
8     3 23 24 25 26 27
9     3 12 78 23 24 24

> by(test[,-1], test$block, mean)
test$block: 1
  x1   x2   x3   x4   x5
22.3 23.7 23.7 22.7 28.3
------------------------------------------------------------
test$block: 2
  x1   x2   x3   x4   x5
23.7 26.0 30.3 20.3 29.3
------------------------------------------------------------
test$block: 3
  x1   x2   x3   x4   x5
15.7 44.7 23.7 28.0 23.3

> aggregate(test[,-1], list(test$block), mean)
  Group.1   x1   x2   x3   x4   x5
1       1 22.3 23.7 23.7 22.7 28.3
2       2 23.7 26.0 30.3 20.3 29.3
3       3 15.7 44.7 23.7 28.0 23.3

Regarding your second question about plotting, see the arguments in par, especially mar or mai.

HTH
Petr

On 15 Oct 2006 at 22:22, Yen Ngo wrote:

Date sent: Sun, 15 Oct 2006 22:22:19 +0200 (CEST)
From: Yen Ngo [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Subject: [R] how can i compute the average of three blocks for each column ?

> Dear all,
> I want to compute the average of the three blocks for each x-variable (each x-variable corresponds to a slide in the code below). How can I do that?
>
> block x1 x2 x3 x4 x5
>     1 23 22 23 24 23
>     1 21 25 26 21 39
>     1 23 24 22 23 23
>     2 20 21 23 24 28
>     2 32 23 34 24 26
>     2 19 34 34 13 34
>     3 12 32 23 34 19
>     3 23 24 25 26 27
>     3 12 78 23 24 24
>
> # read table of data for this slide (= x1)
> a <- read.table(file = slide[i], header=T, sep='\t', na.strings="NA")
> #length(a$ID)
> # Eliminate neg. and pos. controls from the dataset. The logical negation of the %in% function
> # tells subset to only select those rows where the ID column does not contain either "empty" or "none"
> new <- subset(a, !ID %in% c("empty", "none", ""))
> #length(new$ID)
> #new[1:20,c(1,4,5,9)]
> # the first five columns give position identifiers, including a column with the block
> layout = new[,1:5]
> layout[1:30,]
> # the 9th column gives the median foreground = values of the x-variables
> fg1 = as.matrix(new[,9])
> length(fg1)
> mean(fg1)   # calculate the mean of x1
>
> ## I tried to do something like:
> block1 = fg1[layout$Block==1,]
> block2 = fg1[layout$Block==1,]
> block2 = fg1[layout$Block==1,]
> average = (block1+block2+block3)/3
> ## but it did not work.
>
> ## How can I calculate the means of the remaining x-variables?
> # Read data for the remaining slides = x2, x3, x4, x5
> for (i in 2:num.slides){
>   na1 <- strsplit(na[[i]][k], ".txt")
>   na2 <- strsplit(na1[[1]][1], "-")
>   bat = na2[[1]][1]
>   sli = na2[[1]][2]
>   nslide <- cbind(nslide, as.numeric(sli))
>   # nslide is a vector giving the number of the slide in the batch
>   # read table of data for this slide
>   a <- read.table(file=slide[i], header=T, sep='\t', na.strings="NA")
>   new <- subset(a, !ID %in% c("empty", "none", ""))
>   # append FG data to the matrices containing the slides already read
>   fg1 = cbind(fg1, as.matrix(new[,9]))
> }
> colnames(fg1) = nslide
> fg <- data.frame(peptide=c(new$Name), fg1)
> fg <- edit(fg)
>
> Another question: I have three graphs which are displayed one after another with a large space between them. Can I move these graphs closer to each other by making them bigger, and how? Below is the code that I have written for plotting the graphs.
>
> par(mfrow=c(3,1))
> for (j in 1:3) {
>   boxplot(split(pos$y[pos$Block==j], pos$Slide[pos$Block==j]),
>           col="lightgray", cex=.65, outline=TRUE,
>           main=paste("Positive Controls Block", j))
> }
>
> Thank you for your help,
> Regards,
> Yen

Petr Pikal
[EMAIL PROTECTED]
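
For readers on a current R version, note that mean() no longer works column-wise on a data frame, so the by(..., mean) call above would need colMeans instead. The following sketch rebuilds the nine-row example as a data frame (typed in from the listing in this thread) and computes the per-block column means in two equivalent ways; the object names are only illustrative.

```r
# The example data from this thread, typed in as a data frame
test <- data.frame(
  block = rep(1:3, each = 3),
  x1 = c(23, 21, 23, 20, 32, 19, 12, 23, 12),
  x2 = c(22, 25, 24, 21, 23, 34, 32, 24, 78),
  x3 = c(23, 26, 22, 23, 34, 34, 23, 25, 23),
  x4 = c(24, 21, 23, 24, 24, 13, 34, 26, 24),
  x5 = c(23, 39, 23, 28, 26, 34, 19, 27, 24)
)

# Mean of every x-variable within each block (formula interface)
block_means <- aggregate(. ~ block, data = test, FUN = mean)
print(round(block_means, 1))

# The same numbers as a matrix, using colMeans on each block's rows
by_block <- t(sapply(split(test[, -1], test$block), colMeans))
print(round(by_block, 1))
```

Averaging the three block means of a column (for example mean(block_means$x1)) then gives the per-slide average of the blocks that the original question asks about.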
Re: [R] grep function with patterns list...
>>>>> "Anupam" == Anupam Tyagi [EMAIL PROTECTED]
>>>>>     on Mon, 16 Oct 2006 18:15:06 +0000 (UTC) writes:

    Anupam> Hi Stephane,
    Anupam> Stéphane CRUVEILLER scruveil at genoscope.cns.fr writes:
    >> is there a way to pass a list of patterns to the grep function?
    >> I vaguely remember something with %in% operator...

    Anupam> I think you are looking for the %in% and %nin% which
    Anupam> are part of Design package, and also in Hmisc
    Anupam> library. You have to install and load these packages
    Anupam> to access these functions.

Hmm, '%in%' has been part of standard R for years ...

Martin
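
Since the original question was about passing several patterns to grep(), a small base-R sketch may help to separate the two ideas. The vectors `x` and `patterns` are invented for the example. Joining the patterns with "|" gives a single regular expression with alternation, whereas %in% tests exact equality of whole elements and does no pattern matching.

```r
x        <- c("apple pie", "banana", "cherry tart", "apple")
patterns <- c("apple", "cherry")

# One regular expression that matches any of the patterns
hits <- grep(paste(patterns, collapse = "|"), x)
print(x[hits])

# %in% (base R) compares whole elements, so only the exact string "apple" matches
exact <- x %in% patterns
print(x[exact])
```

If the patterns may contain regular-expression metacharacters, they would need escaping before being pasted together; that is not handled in this sketch.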
[R] Convert Contingency Table to Flat File
Hello All,
Is there any R function out there to turn a multi-way contingency table back into a flat-file table of individual rows and attribute columns?

Thanks!
marco
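
One way to do this with base R alone, sketched here on the built-in Titanic table (chosen only because it ships with R): as.data.frame() turns the table into one row per cell with a Freq column, and repeating each row Freq times yields one row per individual. The object names are illustrative.

```r
tab <- Titanic                      # 4-way contingency table shipped with R

# One row per cell, with the cell count in column Freq
counts <- as.data.frame(tab)

# Repeat each cell's row Freq times and drop the count column
flat <- counts[rep(seq_len(nrow(counts)), counts$Freq), names(counts) != "Freq"]
rownames(flat) <- NULL

print(head(flat))
print(nrow(flat))                   # one row per individual
```

Cross-tabulating the result again with table(flat) should reproduce the original counts, which is a convenient sanity check.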
Re: [R] RODBC and NULL values
What sqltype(s) are your variables? For numeric types, RODBC merely maps values the ODBC driver says are NULL to NA. Since you appear not to have character data,

  nullstring: character string to be used when reading 'SQL_NULL_DATA' character items from the database.
  na.strings: character string(s) to be mapped to 'NA' when reading character data.

are not relevant to you.

At least on Windows and Linux the PostgreSQL 8.1 ODBC driver works correctly, and NULLs in numeric columns are mapped to NAs in R. (There is an example in my test suite.)

On Tue, 17 Oct 2006, Mark Wardle wrote:

> Dear All,
>
> Writing sooner than I thought I'd need to.
>
> I'm using R 2.4 on Mac OS X, with RODBC, PostgreSQL 8.1 and Actual's ODBC driver. I have all my data in Filemaker 8.5, but it is automatically exported into PostgreSQL for analysis, as Filemaker's ODBC and JDBC access is awful, slow and has a tendency to crash.
>
> I have disability data where for each patient there is a survival time in years from disease onset to a particular disease stage, namely unilateral support, bilateral support, wheelchair use, and death. Valid values may include NULL (patient hasn't reached that stage), 0 (for example, patient needed support immediately at disease onset), and any positive integer.
>
> When I query the database manually using psql, it is clear there are NULL values.
>
>   3 |  3 | 18 |    | 27 | 1
>     |    |    |    | 13 | 1
>   1 |  5 |    |    | 10 | 0
>  10 | 13 | 13 |    | 22 | 0

No, it is not clear. It is clear that there are values which are printed as blank or empty strings.

> However, these are all converted to zeros when I use RODBC's sqlQuery(), making interpretation impossible. I have tried using the nullstring and na.strings options, but these don't seem to have any effect. I have tried various combinations of "NULL", "NA" and "". Forgive my awkward SQL.
>
> channel = odbcConnect("ataxia", uid="mark")
> disease = sqlQuery(channel, "select calc_survival_unilateral_support as unlateral, calc_survival_bilateral_support as bilateral, calc_survival_wheelchair as wheelchair, calc_survival_death as death, calc_follow_up as followup, has_family_history_ataxia as familial from clinical, patient where clinical.patient_fk = patient_id and excluded=0 and calc_walking_disability_valid=1")
> disease # and show results
>
> 127   3   3  18   0  27   1
> 128   0   0   0   0  13   1
> 129   1   5   0   0  10   0
> 130  10  13  13   0  22   0
>
> It doesn't seem to be the old "repeating rows NULL" bug talked about here: http://tolstoy.newcastle.edu.au/R/help/04/07/0803.html

That was about R 1.9.1, about a problem solved long before then. Let's not drag up ancient history.

> Is this because my ODBC driver is not returning the correct values for RODBC to parse? Is there any way of debugging this (the intricacies of ODBC are beyond my skill), and is my only alternative to store a non-valid number in the database (999?) and use my query or R to remove those data points afterwards?

Find out what the types involved are. Perhaps try as.is=FALSE?

> Looking in the archives, there are lots of people asking about how to convert NAs to numeric, but I want the NAs passed through unaltered!

Since the mapping of NULLs to NAs works in other examples, I find it hard to see how this can be an RODBC issue.

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
[R] Calculate NAs from known data: how to?
Hi

In a dataset I have length and age for cod. The age, however, is only given for 40-100% of the fish. What I need to do is to fill in the NAs in a correct way, so that age has a value for each length. This is to be done for each sample separately (there are 324 samples), meaning the NAs for sampleno 1 shall be calculated from the known values from sampleno 1.

As, for example, a length of 55 cm can correspond to both 4 and 5 years, I guess a fish with NA age and length 55 cm should be given a random age according to a probability, for example 55 cm = 4 years has p = 75%, while 55 cm = 5 years has p = 25%. Those p-values should be calculated from the real data.

How can this be done in R, and what is the right way to do it? Sample number 1 is given below.

Best regards
Torleif Markussen Lunde

length age sampleno
55 5 1
45 4 1
55 4 1
55 5 1
60 6 1
45 5 1
52 5 1
48 4 1
51 6 1
53 4 1
54 5 1
48 5 1
50 6 1
55 6 1
55 4 1
50 5 1
49 5 1
40 4 1
50 6 1
36 4 1
46 6 1
35 3 1
41 3 1
44 5 1
36 3 1
29 2 1
28 2 1
32 2 1
31 2 1
30 2 1
29 2 1
32 2 1
28 2 1
25 2 1
27 2 1
27 2 1
24 2 1
27 2 1
24 2 1
19 1 1
23 1 1
23 1 1
20 1 1
23 1 1
19 1 1
17 1 1
53 5 1
58 5 1
52 4 1
42 3 1
50 5 1
94 7 1
35 3 1
71 7 1
52 6 1
50 6 1
45 4 1
52 5 1
37 3 1
45 4 1
59 5 1
47 4 1
48 4 1
39 3 1
37 3 1
31 3 1
39 2 1
39 2 1
31 2 1
40 3 1
52 5 1
62 5 1
72 5 1
53 5 1
61 5 1
54 6 1
54 5 1
63 6 1
58 5 1
45 4 1
43 4 1
55 4 1
39 3 1
39 3 1
58 5 1
65 6 1
52 6 1
48 3 1
49 3 1
44 3 1
45 4 1
35 2 1
38 3 1
30 2 1
29 1 1
27 1 1
44 NA 1
48 NA 1
37 NA 1
27 NA 1
30 NA 1
67 NA 1
28 NA 1
65 NA 1
42 NA 1
27 NA 1
37 NA 1
30 NA 1
28 NA 1
26 NA 1
36 NA 1
29 NA 1
32 NA 1
45 NA 1
39 NA 1
27 NA 1
29 NA 1
28 NA 1
27 NA 1
53 NA 1
21 NA 1
15 NA 1
23 NA 1
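
The random assignment described in the question can be sketched as drawing each missing age from the ages actually observed at the same length within the same sample, which automatically uses the empirical proportions as probabilities. This is a minimal illustration, not a recommended stock-assessment method: the function name `impute_age` and the toy data are invented, and lengths with no aged fish in the sample are simply left as NA (in practice one might bin lengths or borrow from neighbouring lengths).

```r
# Draw missing ages from the observed ages at the same length and sample.
# d must have columns length, age and sampleno.
impute_age <- function(d) {
  known <- d[!is.na(d$age), ]            # donors come from the original data only
  for (i in which(is.na(d$age))) {
    donors <- known$age[known$sampleno == d$sampleno[i] &
                        known$length   == d$length[i]]
    if (length(donors) > 0) {
      # Index-based sampling avoids sample()'s special case for a single number
      d$age[i] <- donors[sample.int(length(donors), 1)]
    }
  }
  d
}

set.seed(123)
# Toy data: at 55 cm three fish are aged 4 and one is aged 5
toy <- data.frame(
  length   = c(55, 55, 55, 55, 45, 55, 45, 99),
  age      = c(4, 4, 4, 5, 4, NA, NA, NA),
  sampleno = 1
)
filled <- impute_age(toy)
print(filled)
```

Because the donors are taken per sampleno, running the function on the full data set treats all 324 samples separately, as requested.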
Re: [R] RODBC and NULL values
Prof Brian Ripley wrote:
> What sqltype(s) are your variables?

The variables are all numeric.

> For numeric types, RODBC merely maps values the ODBC driver says are NULL to NA. Since you appear not to have character data,
>
>   nullstring: character string to be used when reading 'SQL_NULL_DATA' character items from the database.
>   na.strings: character string(s) to be mapped to 'NA' when reading character data.
>
> are not relevant to you.

I thought that, but was grasping at straws because at that point I didn't know whether it was a problem with the ODBC driver misinforming RODBC about the correct character types.

> At least on Windows and Linux the PostgreSQL 8.1 ODBC driver works correctly, and NULLs in numeric columns are mapped to NAs in R. (There is an example in my test suite.)

I'm using Actual's ODBC driver. In my previous email, I did a test with another ODBC client (Microsoft Excel/Query) and found it too was misinterpreting NULL values as zero, concluding it was an issue with the ODBC driver itself. However, I was wrong: using the iodbctest program, the ODBC driver *is* successfully returning NULLs. It is only with Microsoft Excel/Query and R that I am having the problem of these empty spaces/NULL characters being converted to zeros.

...

>> When I query the database manually using psql, it is clear there are NULL values.
>>
>>   3 |  3 | 18 |    | 27 | 1
>>     |    |    |    | 13 | 1
>>   1 |  5 |    |    | 10 | 0
>>  10 | 13 | 13 |    | 22 | 0
>
> No, it is not clear. It is clear that there are values which are printed as blank or empty strings.

I *think* postgresql is regarding them as NULL values. I don't know whether this proves it? (The first two must be functionally equivalent.)

ataxia=# select count(calc_survival_bilateral_support) from clinical;
 count
-------
    53
(1 row)

ataxia=# select count(calc_survival_bilateral_support) from clinical where calc_survival_bilateral_support is NOT NULL;
 count
-------
    53
(1 row)

ataxia=# select count(*) from clinical;
 count
-------
   140
(1 row)

> Find out what the types involved are. Perhaps try as.is=FALSE?

Have done, and I'm afraid it doesn't change anything.

> Since the mapping of NULLs to NAs works in other examples, I find it hard to see how this can be an RODBC issue.

Perhaps it is a peculiarity in my set-up, or I'm missing something obvious and making some assumption somewhere. I will retrace my steps! Perhaps I should use a different approach, but I always have difficulty giving up on a problem unsolved!

--
Dr. Mark Wardle
Clinical research fellow and Specialist Registrar in Neurology,
C2-B2 link, Cardiff University, Heath Park, CARDIFF, CF14 4XN. UK
Re: [R] Question about managing searching path
On 10/17/2006 3:49 AM, Tong Wang wrote:
> Hi all,
> I'm having some trouble with managing the search path. In a function, I need to attach some data set at the beginning and detach it at the end, say,
>
>   myfunction <- function() { attach(mylist); ... detach(mylist) }
>
> The problem is, since I am still debugging this code, sometimes it hits an error and ends before reaching the end, so the data is left in the search path. What's the right way to make sure mylist is detached no matter what?

on.exit(), as Vito said. But you may find that doing the calculations in with() does a better job, i.e.

  with(mylist, [do something])

The advantages of with() are:

- it takes precedence over local variables; attach (by default) comes behind local variables and the global environment. This may mean your code fails when a user happens to have variables with the same name defined.
- it is a temporary change, so no detach is needed.

Duncan Murdoch
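The two options mentioned in this reply can be sketched as follows. This is only a minimal illustration; `mylist`, `f_attach` and `f_with` are made-up names, and the `stop()` call merely simulates a failure during debugging.

```r
# Hypothetical example data; not from the original post
mylist <- list(a = 1:3, b = 10)

# Option 1: attach, but register the detach straight away so that it
# runs whenever the function exits, including after an error
f_attach <- function() {
  attach(mylist)
  on.exit(detach(mylist))
  stop("simulated failure while debugging")
}
try(f_attach(), silent = TRUE)
"mylist" %in% search()   # FALSE: nothing was left on the search path

# Option 2: evaluate inside with(); the search path is never modified
f_with <- function() with(mylist, sum(a) + b)
f_with()                 # 16
```

With `with()` there is nothing to clean up, which is why it tends to be the safer choice inside functions.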
Re: [R] Review process for new packages
On 10/17/2006 2:22 AM, Andreas Wittmann wrote:
> Hi all,
> I'm currently working on a creditmetrics package, which includes functions for computing the CreditMetrics credit risk model. I expect it will be finished in a few days. My question now is: is there some review process before sending it to CRAN, or is it reviewed after it has been sent?

You should read the instructions in the Writing R Extensions manual, and make sure it passes R CMD check without errors or warnings, before you send it. CRAN will run its own checks on a number of different platforms, and if your package doesn't pass, they'll probably ask you to fix it -- but you should do your best to make their job easier by getting it right before you send it.

If your package passes those checks, it will likely be posted to CRAN. (There are exceptions, e.g. if they notice your license is not compatible with CRAN, etc.) There's no review process to decide whether your package is useful or well-written. If you want that kind of review you should submit it to the Journal of Statistical Software.

Duncan Murdoch
[R] Finding out about objects and classes
When R help simply states something like:

  Value: An object of class 'loess'.

How do I find out more about that class? Shouldn't there be a link in the help file or something?

ATB
Mick
Re: [R] Convert Contingency Table to Flat File
On Tue, Oct 17, 2006 at 03:08:49AM -0700, Marco LO wrote:
> Is there any R function out there to turn a multi-way contingency table back into a flat file table of individual rows and attribute columns?

Are you looking for something like this?

  # generate some data
  x = sample(c(0,1), 100, replace=TRUE)
  y = sample(c(0,1), 100, replace=TRUE)
  z = sample(c(0,1), 100, replace=TRUE)

  # contingency table
  mytab = table(x,y,z)

  # flat contingency table
  as.data.frame( mytab )

cu
Philipp

--
Dr. Philipp Pagel                        Tel. +49-8161-71 2131
Dept. of Genome Oriented Bioinformatics  Fax. +49-8161-71 2186
Technical University of Munich
Science Center Weihenstephan
85350 Freising, Germany

and

Institute for Bioinformatics / MIPS      Tel. +49-89-3187 3675
GSF - National Research Center           Fax. +49-89-3187 3585
for Environment and Health
Ingolstädter Landstrasse 1
85764 Neuherberg, Germany
http://mips.gsf.de/staff/pagel
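Note that as.data.frame() on a table gives one row per cell plus a Freq column. If one row per individual observation is wanted, the cell rows can be repeated Freq times. The following is a sketch of that extra step, not part of the original reply; the object names `freq` and `flat` and the seed are arbitrary.

```r
set.seed(42)  # arbitrary seed, only to make the sketch repeatable
x <- sample(c(0, 1), 100, replace = TRUE)
y <- sample(c(0, 1), 100, replace = TRUE)
z <- sample(c(0, 1), 100, replace = TRUE)
mytab <- table(x, y, z)

# One row per cell, with the count in column Freq
freq <- as.data.frame(mytab)

# Repeat each cell row Freq times to get one row per observation
flat <- freq[rep(seq_len(nrow(freq)), freq$Freq), c("x", "y", "z")]
rownames(flat) <- NULL

nrow(flat)                  # 100, one row per original observation
all(table(flat) == mytab)   # TRUE: re-tabulating recovers the counts
```

The columns of `flat` are factors; convert them with as.numeric(as.character(...)) if numeric codes are needed.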
Re: [R] Review process for new packages
One thing you might want to do is run R CMD check with both the development and released versions of R, since CRAN will check it against both:

http://cran.r-project.org/src/contrib/checkSummary.html
Re: [R] Finding out about objects and classes
  apropos("loess")
  help.search("loess")
  methods(class = "loess")
  class?loess  # in this case it does not return anything but sometimes it does
  RSiteSearch("loess")
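Besides searching the help system, it often helps to look at an actual object of the class. A small sketch, assuming the built-in `cars` data set as a stand-in for real data:

```r
# Fit a loess model on a built-in data set (cars is only an example)
fit <- loess(dist ~ speed, data = cars)

class(fit)                    # "loess"
names(fit)                    # the components stored in the object
str(fit, max.level = 1)       # compact overview of each component
methods(class = "loess")      # S3 methods defined for this class
getAnywhere("predict.loess")  # source of a method, even if not exported
```

For S3 classes such as this one there is usually no formal class definition; the object is a list, and the "Value" section of the help page for the function that creates it is typically the closest thing to documentation.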
Re: [R] Calculate NAs from known data: how to?
Torleif Markussen Lunde wrote:
> In a dataset I have length and age for cod. The age, however, is only given for 40-100% of the fish. What I need to do is to fill in the NAs in a correct way, so that age has a value for each length. This is to be done for each sample separately (there are 324 samples), meaning the NAs for sampleno 1 shall be calculated from the known values from sampleno 1.
>
> Since, for example, a fish of length 55 cm can be either 4 or 5 years old, I guess a fish with NA age and length 55 cm should be given a random age according to a probability: for example, 55 cm = 4 years has p=75%, while 55 cm = 5 years has p=25%. Those p-values should be calculated from the real data.
>
> How can this be done in R, and what is the right way to do it?

Given the size of your sample, wouldn't it be more statistically valid to set the age of the NA records to the mean age of records of matching length? I suppose you could also use resampling or a bootstrap, but I'm not sure that adding randomization will give results that are any more statistically valid than using the mean.

Regards,
- Brian
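The random-draw idea from the question can be sketched as below. This is one possible reading, not a recommendation on which approach is statistically preferable. The helper `impute_age`, the toy data frame `cod`, and the fallback to the nearest known length (for lengths with no aged fish) are all assumptions made for illustration; it also assumes every sample contains at least one fish with a known age.

```r
# Fill NA ages within one sample by drawing from the ages observed
# at the same length (or at the nearest known length if none match)
impute_age <- function(d) {
  known <- d[!is.na(d$age), ]
  for (i in which(is.na(d$age))) {
    gap  <- abs(known$length - d$length[i])
    pool <- known$age[gap == min(gap)]
    # Index with sample.int(): sample(pool, 1) would misbehave when
    # pool has length 1, because sample(x, 1) then draws from 1:x.
    d$age[i] <- pool[sample.int(length(pool), 1)]
    # For the mean-based alternative, use round(mean(pool)) instead.
  }
  d
}

# Toy data in the same layout as the post (values are made up)
cod <- data.frame(
  length   = c(55, 55, 55, 55, 45, 45, 30, 55, 45, 31),
  age      = c( 4,  4,  4,  5,  4,  4,  2, NA, NA, NA),
  sampleno = 1
)

set.seed(1)  # arbitrary seed so the draw is repeatable
# Apply the imputation to each sample separately
filled <- do.call(rbind, lapply(split(cod, cod$sampleno), impute_age))
filled
```

Drawing from `pool` reproduces the empirical proportions automatically: here a 55 cm fish gets age 4 with probability 3/4 and age 5 with probability 1/4.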
[R] ANOVA and Levene's test in nested model
Dear All,

I have already sent a message concerning Levene's test in a nested model, but I didn't get any answer. Optimistically, I hope to get an answer this time. I also raise a new question related to the whole model, because I haven't found any sure answer as to whether I am analysing it in a suitable way or not. I really have tried to do my homework.

I have a response variable (y) and four factors (a, b, c, d). One of these four factors (d) is nested within another factor (c). In addition, I would like to take into account only 2nd degree interactions in my model. I tried to analyse this model in the following ways (both gave the same results):

  model1 <- aov(y~(a+b+c)^2 + Error(d))
  model2 <- aov(y~(a+b+c)^2 + Error(d%in%c))

Is this correct? I guess another option would be lme in package nlme:

  model3 <- lme(y~(a+b+c)^2, random=~1|d)
  anova(model3)

I would also like to test homogeneity of variances in this model using Levene's test. How can that be done in this kind of case?

I'd appreciate any advice.

With kind regards
-Emilia

-
Emilia Pippola, research assistant
University of Oulu
Personal address (NEW!): Rautalammintie 3 B 306 FIN-00550 Helsinki, Finland
Mobile: +358-50-5402551
E-mail: [EMAIL PROTECTED]
[R] Mixed effect model in R
Hi,

I am analysing an experiment that has one fixed factor (6 conditions) and two random factors (11 subjects, 24 images in the conditions). I read somewhere else that you can also see such a design as a nested experiment with the hierarchy: subjects -> condition -> image. For some analyses I have one response variable and for others I have more. The response variables are non-normally distributed.

Now the question: Is there a package that can deal with such a design? I would like to use a generalized linear model. Are there GLMs that are extended to do multivariate analysis (for the 2 random + 1 fixed variable design)? And what do you call such a design?

Last question: Can you suggest some literature about such a problem? I am quite unsure concerning the analysis.

Thanks for any advice
lisra
[R] CTRL-C behaviour with RODBC on Solaris2.8
After loading the RODBC package version 1.1-7, Ctrl-C changes its behaviour: it quits R and returns to the (unix) command prompt on the solaris2.8 platform here.

Here's what happened before and after loading RODBC:

  > for (i in 1:10^5) rnorm(10)
  ^C
  > library(RODBC)
  > for (i in 1:10^5) rnorm(10)
  ^C
  bash-3.00$

  platform        sparc-sun-solaris2.8
  arch            sparc
  os              solaris2.8
  system          sparc, solaris2.8
  status
  major           2
  minor           3.1
  year            2006
  month           06
  day             01
  svn rev         38247
  language        R
  version.string  Version 2.3.1 (2006-06-01)

This version of R was built with gcc-3.3 (too old?) and with ODBC_INCLUDE and ODBC_LIBS pointing to the non-standard locations /quant/temp/jagat/usr/local/include and /quant/temp/jagat/usr/local/lib, respectively. I will be glad to provide further details. Any ideas on how to correct this would be greatly appreciated.

Thanks

--
Jagat K. Sheth
Prepayment Modeling and Economics
Wells Fargo Home Mortgage
7911 Forsyth Boulevard Suite 500, M5001-061
Clayton, MO 63105
Tel: (314)-726-4496
Fax: (314)-726-4483
[EMAIL PROTECTED]
[R] Book recommendation for newbie to stats and R?
I'm trying to learn statistics and R at the same time. I have an undergraduate science degree and one year of calculus (30 years ago), but never took a stats course. I hope to take some stats courses in the next year, but thought I would start to see how much I could teach myself.

I work for an organization that analyses behavior change communication programs regarding HIV/AIDS and reproductive health. A typical question we're trying to answer is, "Watching which television programs in South Africa is related to an increased use of condoms?" All of our work is in the social sciences, I'd say. I'd like to help analyze our data using R.

I found these titles that may teach me both stats and R:

--Data Analysis and Graphics Using R by John Maindonald, John Braun
--Introductory Statistics with R by Peter Dalgaard
--Statistics: An Introduction using R by Michael J. Crawley
--Using R for Introductory Statistics by John Verzani

I recognize some of the authors by their postings here. Can anyone recommend any of these books over the others? I'm interested in a book from which I can learn statistics by reading the chapters and working out the exercises and problems, so having access to many or all of the problem solutions is important.

Do you have any other recommendations for me in learning both R and stats? Is it an impossible quest to learn enough stats by myself to be useful in analyzing real data sets?

Thanks so much for your advice and suggestions.

Kevin Zembower
Center for Communication Programs
Bloomberg School of Public Health
Johns Hopkins University
www.jhuccp.org
[R] Some questions on Rpart algorithm
Hello:

I am using rpart and would like more background on how the splits are made and how to interpret results, and also on how to properly use text(.rpart). I have looked through Venables and Ripley and through the rpart help and still have some questions. If there is a source (say, Breiman et al.) on decision trees that would clear this all up, please let me know. The questions below pertain to a classification task (i.e., I'm using the class method). Many thanks in advance.

(1) I'd like text(.rpart) to print percentages of each class rather than counts. I don't see an option for this, so I would like to modify text.rpart. However, I can't find the source since it is a hidden method. How can I find the source?

(2) printcp prints a table with columns cp, nsplit, rel error, xerror, xstd. I am guessing that cp is complexity, nsplit is the number of the split, rel error is the error on the test set, xerror is the cross-validation error and xstd is the standard deviation of the error across the cross-validation sets. Is there any documentation on this? For instance, how exactly is complexity computed?

(3) What's a loss matrix? Is it the cost placed on each type of misclassification?

(4) [More of a methodology question] In practice, when would one use different costs on different splitting variables?

Thanks for any help on this.

Jeff
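On question (1), hidden S3 methods can generally be displayed with getAnywhere() or the ::: operator. The sketch below assumes the rpart package is installed and uses its bundled `kyphosis` data purely as an example; it also shows where the printcp() table lives in the fitted object.

```r
library(rpart)

# Example classification tree on a data set shipped with rpart
fit <- rpart(Kyphosis ~ Age + Number + Start,
             data = kyphosis, method = "class")

# Show the source of the non-exported method
getAnywhere("text.rpart")   # rpart:::text.rpart is an equivalent route

# The table printed by printcp() is stored in the fit as a matrix
printcp(fit)
colnames(fit$cptable)
```

The help pages ?printcp and ?rpart.object are a reasonable starting point for the meaning of the columns. As far as the rpart documentation describes it, "rel error" is computed on the training data relative to the root node, not on a separate test set.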
Re: [R] Book recommendation for newbie to stats and R?
On 10/17/06, Zembower, Kevin [EMAIL PROTECTED] wrote:
> I work for an organization that analyses behavior change communication programs regarding HIV/AIDS and reproductive health. A typical question we're trying to answer is, "Watching which television programs in South Africa is related to an increased use of condoms?" All of our work is in the social sciences, I'd say. I'd like to help analyze our data using R.

I recently bought Peter Dalgaard's book and have found it to be quite helpful.

jab

--
John Bollinger, CFA, CMT
www.BollingerBands.com

If you advance far enough, you arrive at the beginning.
Re: [R] Book recommendation for newbie to stats and R?
Kevin --

There are at least two that I recommend: Using R for Introductory Statistics, by John Verzani, published by Chapman & Hall, 2005, and Introductory Statistics with R, by Peter Dalgaard (a frequent contributor to this list), published by Springer (in paperback), 2002. Of these, IMHO you will find more basic, fundamental, ground-level stats in Verzani (which is also longer by about 40%), but more elegant, insightful use of R and more creative ideas in Dalgaard. These two, together with the R Introduction that comes with R and maybe Jon Baron's notes on the use of R in psychology, will get you off on the right foot.

Good luck!

Ben Fairbank

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Zembower, Kevin
Sent: Tuesday, October 17, 2006 9:08 AM
To: r-help@stat.math.ethz.ch
Subject: [R] Book recommendation for newbie to stats and R?

I'm trying to learn statistics and R at the same time. I have an undergraduate science degree and one year of calculus (30 years ago), but never took a stats course. I hope to take some stats courses in the next year, but thought I would start to see how much I could teach myself.

I work for an organization that analyses behavior change communication programs regarding HIV/AIDS and reproductive health. A typical question we're trying to answer is, "Watching which television programs in South Africa is related to an increased use of condoms?" All of our work is in the social sciences, I'd say. I'd like to help analyze our data using R.

I found these titles that may teach me both stats and R:

--Data Analysis and Graphics Using R by John Maindonald, John Braun
--Introductory Statistics with R by Peter Dalgaard
--Statistics: An Introduction using R by Michael J. Crawley
--Using R for Introductory Statistics by John Verzani

I recognize some of the authors by their postings here. Can anyone recommend any of these books over the others?
I'm interested in a book from which I can learn statistics by reading the chapters and working out the exercises and problems, so having access to many or all of the problem solutions is important.

Do you have any other recommendations for me in learning both R and stats? Is it an impossible quest to learn enough stats by myself to be useful in analyzing real data sets?

Thanks so much for your advice and suggestions.

Kevin Zembower
Center for Communication Programs
Bloomberg School of Public Health
Johns Hopkins University
www.jhuccp.org
Re: [R] split-plot analysis with lme()
Thanks, that clarifies things. And this gets all 5 interaction degrees of freedom:

  oats <- read.table("testlme.dat", header=TRUE)
  # This is a subset of the standard data set with
  # the combination Variety=Golden Rain, nitro=0 deleted
  oats$nitro <- factor(oats$nitro)
  attach(oats)
  library(nlme)
  M <- model.matrix(~Variety*nitro)
  fit <- lme(yield ~ Variety+nitro+M[,7:11], random=~1|Block/Variety)
  anova(fit)

On Sun, Oct 15, 2006 at 08:28:43AM -0700, Spencer Graves wrote:
> The problem in your example is that 'lme' doesn't know how to handle the Variety*nitro interaction when all 12 combinations are not present. The error message "singularity in backsolve" means that with data for only 11 combinations, which is what you have in your example, you can only estimate 11 linearly independent fixed-effect coefficients, not the 12 required by this model:
>
>   1 for intercept + (3-1) for Variety + (4-1) for nitro + (3-1)*(4-1) for Variety*nitro = 12.
>
> Since 'nitro' is a fixed effect only, you can get what you want by keeping it as a numeric variable and manually specifying the (at most 5, not 6) interaction contrasts you want, something like the following:
>
>   fit2. <- lme(yield ~ Variety+nitro+I(nitro^2)+I(nitro^3)
>                +Variety:(nitro+I(nitro^2)),
>                data=Oats, random=~1|Block/Variety,
>                subset=!(Variety == "Golden Rain" & nitro == 0))
>
> NOTE: This gives us 4 degrees of freedom for the interaction. With all the data, we can estimate 6. Therefore, there should be some way to get 5, but so far I haven't figured out an easy way to do that. Perhaps someone else will enlighten us both.
>
> Even without a method for estimating an interaction term with 5 degrees of freedom, I hope I've at least answered your basic question.
>
> Best Wishes,
> Spencer Graves
>
> i.m.s.white wrote:
>> Dear R-help,
>>
>> Why can't lme cope with an incomplete whole plot when analysing a split-plot experiment? For example:
>>
>> R : Copyright 2006, The R Foundation for Statistical Computing
>> Version 2.3.1 (2006-06-01)
>>
>> > library(nlme)
>> > attach(Oats)
>> > nitro <- ordered(nitro)
>> > fit <- lme(yield ~ Variety*nitro, random=~1|Block/Variety)
>> > anova(fit)
>>               numDF denDF   F-value p-value
>> (Intercept)       1    45 245.14333  <.0001
>> Variety           2    10   1.48534  0.2724
>> nitro             3    45  37.68560  <.0001
>> Variety:nitro     6    45   0.30282  0.9322
>>
>> # Excellent! However ---
>>
>> > fit2 <- lme(yield ~ Variety*nitro, random=~1|Block/Variety, subset=
>> + !(Variety == "Golden Rain" & nitro == 0))
>> Error in MEEM(object, conLin, control$niterEM) :
>>         Singularity in backsolve at level 0, block 1
>>
>> --
>> I.White
>> University of Edinburgh
>> Ashworth Laboratories, West Mains Road
>> Edinburgh EH9 3JT
>> Fax: 0131 650 6564 Tel: 0131 650 5490
>> E-mail: [EMAIL PROTECTED]
Re: [R] Error Correcting Codes, Simplex
On Tue, 17 Oct 2006, Richard Graham wrote:
> On 10/16/06, Björn Egert [EMAIL PROTECTED] wrote:
>> On 10/8/06, Egert, Bjoern [EMAIL PROTECTED] wrote:
>>> Hello,
>>>
>>> Is there a way in R to construct an (error correcting) binary code, e.g. for a source alphabet containing integers from 1 to say 255, with the property that each pair of distinct codewords of length m is at Hamming distance exactly m/2?
>>>
>>> It was suggested that I use so-called simplex codes, which should be fairly standard, but I haven't found a direct way to do so via R packages; that's why I ask whether there might be an indirect way to solve this problem.

The survey package has a function hadamard() to construct Hadamard matrices, which are what simplex codes come from.

-thomas
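To illustrate why a Hadamard matrix answers the question, here is a base-R sketch. It deliberately does not call survey::hadamard(); instead it uses the standard Sylvester doubling construction, and the helper name `sylvester` is made up. Distinct rows of a Hadamard matrix of order m are orthogonal, so they agree in exactly m/2 positions and differ in the other m/2.

```r
# Sylvester construction: doubling k times gives a 2^k x 2^k matrix of
# +1/-1 entries whose distinct rows are mutually orthogonal
sylvester <- function(k) {
  H <- matrix(1L, 1, 1)
  for (i in seq_len(k)) {
    H <- rbind(cbind(H, H), cbind(H, -H))
  }
  H
}

H    <- sylvester(8)        # order 256: enough rows for symbols 1..255
code <- (1L - H) %/% 2L     # map +1 -> 0 and -1 -> 1 to get binary codewords

# Manhattan distance between 0/1 vectors is the Hamming distance
hamming  <- as.matrix(dist(code, method = "manhattan"))
off_diag <- hamming[upper.tri(hamming)]
range(off_diag)             # 128 128, i.e. exactly m/2 for m = 256
```

Each row of `code` is one codeword of length m = 256. Dropping the first column (which is constant) gives codewords of length 255 with the same pairwise distance of 128.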
Re: [R] Mixed effect model in R
Packages that might be interesting for you are nlme and lme4, and as a book: Pinheiro and Bates, Mixed-Effects Models in S and S-PLUS.
Re: [R] RODBC and NULL values
On Tue, 2006-10-17 at 11:49 +0100, Mark Wardle wrote:
> Prof Brian Ripley wrote:
>> What sqltype(s) are your variables?
>
> The variables are all numeric.

I don't think this is an RODBC issue. I've had similar problems with numeric variables in FileMaker without using RODBC. I have exported numeric variables containing non-numeric strings from FileMaker to MySQL. Since MySQL won't allow non-numeric characters in numeric variables, empty strings and other non-numeric values were replaced by the default (0). I suppose that's the same with PostgreSQL.

You might have more luck if you first convert your variables to TEXT in FileMaker and then import them into PostgreSQL, where you can convert them back to numeric after fixing the NULL values. It's a bit of extra work... In my case, the strategy I used was to export the FileMaker data into XML format and then run an XSLT script to insert the data as TEXT into MySQL, where I could detect and fix non-numeric strings.

Regards,
Jerome

--
Jerome Asselin, M.Sc., Agent de recherche, RHCE
CHUM -- Centre de recherche
3875 rue St-Urbain, 3e etage // Montreal QC H2W 1V1
Tel.: 514-890-8000 Poste 15914; Fax: 514-412-7106
Re: [R] New package Ryacas
Simon,

  > library(Ryacas)
  Loading required package: XML
  > yacas("Integrate(x) x")
  [1] Starting Yacas!
  Accepting requests from port 9734
  expression(x^2/2)
  > yacas('Integrate(x) x')
  expression(x^2/2)
  > yacas('Integrate(x)x')
  expression(x^2/2)
  > yacas('Integrate(x)x;')
  CommandLine(1) : Expecting ) closing bracket for sub-expression, but got ; instead

The 'Accepting ...' message shows yacas was already running, which is also the case on your system (hence the "socket already in use" message). You can ignore that. On Mac OS, and I guess several other unix/linux environments, yacas will remain running.

If you leave off the ;, it should work.

As this is quite Ryacas/yacas specific, for future messages I'll respond directly.

Regards,
Rob

On Oct 17, 2006, at 1:39 AM, Gabor Grothendieck wrote:
> Maybe there are timing problems using that setup with sockets? I once tried VMware (not with Ryacas but just to try it out) and found it slow, as can be expected with an emulated environment. Since you have Windows XP, just use the Windows version of Ryacas directly.
>
> On 10/17/06, Simon Blomberg [EMAIL PROTECTED] wrote:
>> Hi Gabor,
>>
>> I'm running Quantian (Debian) inside a VMware virtual machine, on a Windows XP host. I installed the latest version of yacas from the source tarball. I remembered to ./configure --enable-server to allow server connections. make and make install worked ok, after some fiddling. I checked that the yacas server option worked, by doing yacas --server, and then telnet'ing to 127.0.0.1 to check. It worked fine. I installed Ryacas. I then tried it out and got the following error:
>>
>>   > library(Ryacas)
>>   Loading required package: XML
>>   > yacas('Integrate(x)x;')
>>   [1] Starting Yacas!
>>   Error in socketConnection(host = 127.0.0.1, port = 9734, server = FALSE, :
>>           unable to open connection
>>   In addition: Warning message:
>>   127.0.0.1:9734 cannot be opened
>>   Accepting requests from port 9734
>>
>> I tried again (stubborn, I guess):
>>
>>   > yacas('Integrate(x)x;')
>>   [1] Starting Yacas!
>>   Accepting requests from port 9734
>>   YacasServer Could not bind to the socket : Address already in use
>>   /usr/local/lib/R/site-library/Ryacas/yacdir/R.ys(1) : File not found
>>   CommandLine(1) : Expecting ) closing bracket for sub-expression, but got x instead
>>
>> Any ideas where I may be going wrong? I don't know anything about sockets. I've cross-posted to r-sig-debian. They may be interested.
>>
>> Cheers,
>> Simon.
>>
>> --
>> Simon Blomberg, B.Sc.(Hons.), Ph.D, M.App.Stat.
>> Centre for Resource and Environmental Studies
>> The Australian National University
>> Canberra ACT 0200 Australia
>> T: +61 2 6125 7800 email: Simon.Blomberg_at_anu.edu.au
>> F: +61 2 6125 0757
>> CRICOS Provider # 00120C
>>
>> The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. - John Tukey.
[R] Error: STRING_ELT() can only be applied to a 'character vector', not a 'builtin'
I have a daily job that attaches hundreds of pseudo-packages containing data as promise objects (DDP's, ref: g.data package), and plots the results to a multi-page pdf device. Sometimes it fails. Under R-2.2.1 it just gave segfaults. Under R-2.3.1 it gave this error message:

     *** caught segfault ***
    address (nil), cause 'memory not mapped'

    Traceback:
     1: load(system.file("data", paste(i, "RData", sep = "."), package = pkg), env)
     2: g.data.load(tm.time, hist.20051012)
     3: g.inorder(93500, tm.time, 16)
    aborting ...
    Segmentation fault

Under R-2.4.0, it now gives this message:

    Error: STRING_ELT() can only be applied to a 'character vector', not a 'builtin'

(which appears to be generated inside main/memory.c). I'm sorry I can't give a reproducible example, because it seems to happen randomly, and at different points in the process. So this is just a shot in the dark -- does anybody recognize this behavior? TIA.

-- David Brahm ([EMAIL PROTECTED])
[R] barplot error
I created a data frame called OSA. Here is what it looks like:

      no.surgery surgery
    0        0.4     6.9
    6        0.2     0.3

I have also attached it as an R data file. I cannot understand why I am getting the following error:

    > barplot(OSA)
    Error in barplot.default(OSA) : 'height' must be a vector or a matrix

OSA is a data.frame, which means R should see it as a matrix. What am I not understanding?

--
Farrel Buchinsky
Mobile: (412) 779-1073
[R] Re : Book recommendation for newbie to stats and R?
You can try the statistics book by T. H. Wonnacott. It is a good introduction to statistics for the social sciences, including economics, medicine, ...

Justin BEM
Student Statistician-Economist Engineer
BP 294 Yaoundé. Tel. (00237)9597295.

----- Original message -----
From: Ben Fairbank [EMAIL PROTECTED]
To: Zembower, Kevin [EMAIL PROTECTED]; r-help@stat.math.ethz.ch
Sent: Tuesday, 17 October 2006, 15:18:56
Subject: Re: [R] Book recommendation for newbie to stats and R?

> Kevin --
>
> There are at least two that I recommend: Using R for Introductory Statistics, John Verzani, published by Chapman & Hall, 2005, and Introductory Statistics with R, by Peter Dalgaard (a frequent contributor to this list), published by Springer (in paperback), 2002. Of these, IMHO you will find more basic, fundamental, ground-level stat in Verzani (which is also longer by about 40%), but more elegant, insightful use of R and more creative ideas in Dalgaard. These two, together with the R Introduction that comes with R and maybe Jon Baron's notes on the use of R in psychology, will get you off on the right foot. Good luck!
>
> Ben Fairbank
>
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Zembower, Kevin
> Sent: Tuesday, October 17, 2006 9:08 AM
> To: r-help@stat.math.ethz.ch
> Subject: [R] Book recommendation for newbie to stats and R?
>
>> I'm trying to learn statistics and R at the same time. I have an undergraduate science degree and one year of calculus (30 years ago), but never took a stats course. I hope to take some stats courses in the next year, but thought I would start to see how much I could teach myself.
>>
>> I work for an organization that analyses behavior change communication programs regarding HIV/AIDS and reproductive health. A typical question we're trying to answer is, "Watching which television programs in South Africa is related to an increased use of condoms?" All of our work is in the social sciences, I'd say. I'd like to help analyze our data using R.
>>
>> I found these titles that may teach me both stats and R:
>> --Data Analysis and Graphics Using R by John Maindonald, John Braun
>> --Introductory Statistics with R by Peter Dalgaard
>> --Statistics: An Introduction using R by Michael J. Crawley
>> --Using R for Introductory Statistics by John Verzani
>>
>> I recognize some of the authors by their postings here. Can anyone recommend any of these books over the others? I'm interested in a book from which I can learn statistics by reading the chapters and working out the exercises and problems; therefore, having access to many or all of the problem solutions is important.
>>
>> Do you have any other recommendations for me in learning both R and stats? Is it an impossible quest to learn enough stats by myself to be useful in analyzing real data sets?
>>
>> Thanks so much for your advice and suggestions.
>>
>> Kevin Zembower
>> Center for Communication Programs
>> Bloomberg School of Public Health
>> Johns Hopkins University
>> www.jhuccp.org
[R] crush in edit()
Dear all, I am new to R system. When I tried to edit data read from a csv file, R system crushed, I got an error message as follows: edit(data) *** buffer overflow detected ***: /usr/lib/R/bin/exec/R terminated === Backtrace: = /lib/libc.so.6(__chk_fail+0x41)[0x49d020b1] /lib/libc.so.6[0x49d034a2] /usr/lib/R/modules//R_X11.so[0x33ed7a] /usr/lib/R/modules//R_X11.so[0x34050d] /usr/lib/R/modules//R_X11.so[0x341858] /usr/lib/R/modules//R_X11.so(RX11_dataentry+0xa25)[0x342f45] /usr/lib/R/lib/libR.so[0xa34675] /usr/lib/R/lib/libR.so[0x954ed6] /usr/lib/R/lib/libR.so(Rf_eval+0x483)[0x925b23] /usr/lib/R/lib/libR.so[0x929ed8] /usr/lib/R/lib/libR.so(Rf_eval+0x483)[0x925b23] /usr/lib/R/lib/libR.so[0x926a37] /usr/lib/R/lib/libR.so(Rf_eval+0x483)[0x925b23] /usr/lib/R/lib/libR.so(Rf_applyClosure+0x2a7)[0x928117] /usr/lib/R/lib/libR.so[0x95661f] /usr/lib/R/lib/libR.so(Rf_usemethod+0x609)[0x957a89] /usr/lib/R/lib/libR.so[0x95825e] /usr/lib/R/lib/libR.so(Rf_eval+0x483)[0x925b23] /usr/lib/R/lib/libR.so(Rf_applyClosure+0x2a7)[0x928117] /usr/lib/R/lib/libR.so(Rf_eval+0x2f4)[0x925994] /usr/lib/R/lib/libR.so(Rf_ReplIteration+0x311)[0x945361] /usr/lib/R/lib/libR.so[0x945571] /usr/lib/R/lib/libR.so(run_Rmainloop+0x60)[0x9458c0] /usr/lib/R/lib/libR.so(Rf_mainloop+0x1c)[0x9458ec] /usr/lib/R/bin/exec/R(main+0x46)[0x80486f6] /lib/libc.so.6(__libc_start_main+0xdc)[0x49c3b4e4] /usr/lib/R/bin/exec/R[0x80485f1] === Memory map: 00111000-0012f000 r-xp fd:00 16943095 /usr/lib/R/library/grDevices/libs/grDevices.so 0012f000-0013 rwxp 0001d000 fd:00 16943095 /usr/lib/R/library/grDevices/libs/grDevices.so 0013-00181000 r-xp fd:00 16976568 /usr/lib/R/library/stats/libs/stats.so 00181000-00183000 rwxp 00051000 fd:00 16976568 /usr/lib/R/library/stats/libs/stats.so 00339000-00352000 r-xp fd:00 15959326 /usr/lib/R/modules/R_X11.so 00352000-00353000 rwxp 00018000 fd:00 15959326 /usr/lib/R/modules/R_X11.so 00353000-0035f000 rwxp 00353000 00:00 0 0048-00496000 r-xp fd:00 15303387 /usr/lib/gconv/SJIS.so 
00496000-00498000 rwxp 00015000 fd:00 15303387 /usr/lib/gconv/SJIS.so 0056e000-00598000 r-xp fd:00 16452204 /usr/lib/R/lib/libRblas.so 00598000-00599000 rwxp 00029000 fd:00 16452204 /usr/lib/R/lib/libRblas.so 00848000-00851000 r-xp fd:00 15204401 /lib/libnss_files-2.4.so 00851000-00852000 r-xp 8000 fd:00 15204401 /lib/libnss_files-2.4.so 00852000-00853000 rwxp 9000 fd:00 15204401 /lib/libnss_files-2.4.so 00885000-00abd000 r-xp fd:00 16452203 /usr/lib/R/lib/libR.so 00abd000-00aca000 rwxp 00238000 fd:00 16452203 /usr/lib/R/lib/libR.so 00aca000-00b61000 rwxp 00aca000 00:00 0 00c47000-00c4d000 r-xp fd:00 16944203 /usr/lib/R/library/methods/libs/methods.so 00c4d000-00c4e000 rwxp 5000 fd:00 16944203 /usr/lib/R/library/methods/libs/methods.so 00eb6000-00f31000 r-xp fd:00 15242987 /usr/lib/libgfortran.so.1.0.0 00f31000-00f32000 rwxp 0007b000 fd:00 15242987 /usr/lib/libgfortran.so.1.0.0 00f44000-00f45000 r-xp fd:00 15303344 /usr/lib/gconv/ISO8859-1.so 00f45000-00f47000 rwxp fd:00 15303344 /usr/lib/gconv/ISO8859-1.so 08048000-08049000 r-xp fd:00 15796032 /usr/lib/R/bin/exec/R 08049000-0804a000 rwxp fd:00 15796032 /usr/lib/R/bin/exec/R 09ef7000-0af9f000 rwxp 09ef7000 00:00 0 [heap] 49c08000-49c09000 r-xp 49c08000 00:00 0 [vdso] 49c09000-49c22000 r-xp fd:00 15206828 /lib/ld-2.4.so 49c22000-49c23000 r-xp 00018000 fd:00 15206828 /lib/ld-2.4.so 49c23000-49c24000 rwxp 00019000 fd:00 15206828 /lib/ld-2.4.so 49c26000-49d53000 r-xp fd:00 15206829 /lib/libc-2.4.so 49d53000-49d55000 r-xp 0012d000 fd:00 15206829 /lib/libc-2.4.so 49d55000-49d56000 rwxp 0012f000 fd:00 15206829 /lib/libc-2.4.so 49d56000-49d59000 rwxp 49d56000 00:00 0 49d5b000-49d7e000 r-xp fd:00 15206830 /lib/libm-2.4.so 49d7e000-49d7f000 r-xp 00022000 fd:00 15206830 /lib/libm-2.4.so 49d7f000-49d8 rwxp 00023000 fd:00 15206830 /lib/libm-2.4.so 49d82000-49d84000 r-xp fd:00 15206831 /lib/libdl-2.4.so 49d84000-49d85000 r-xp 1000 fd:00 15206831 /Aborted I am using R 2.4.0 i386 on Fedora core 5, any one please help me on this? 
Thank you very much.
Re: [R] barplot error
On 10/17/06, Farrel Buchinsky [EMAIL PROTECTED] wrote:

> I created a dataframe called OSA here is what it looks like
>
>       no.surgery surgery
>     0        0.4     6.9
>     6        0.2     0.3
>
> I have also attached it as an R data file. I cannot understand why I am getting the following error.
>
>     > barplot(OSA)
>     Error in barplot.default(OSA) : 'height' must be a vector or a matrix
>
> OSA is a data.frame which means R should see it as a matrix. What am I not understanding?

A data.frame is not the same as a matrix. Try one of these using the builtin BOD data frame:

    barplot(as.matrix(BOD))

    barplot(data.matrix(BOD))

    barplot.data.frame <- function(height, ...) barplot(as.matrix(height), ...)
    barplot(BOD)
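The conversion suggested above can be sketched on data shaped like the original question. This is only an illustration: the OSA values below are a guess at the table in the question (in particular, the row labels "0" and "6" are an assumption), and beside = TRUE and the legend are choices of this sketch, not part of the reply.

```r
# Hypothetical reconstruction of the OSA data frame from the question;
# the row labels "0" and "6" are an assumption.
OSA <- data.frame(no.surgery = c(0.4, 0.2),
                  surgery    = c(6.9, 0.3),
                  row.names  = c("0", "6"))

is.matrix(OSA)        # FALSE: a data frame is a list of columns
m <- as.matrix(OSA)   # numeric matrix; row and column names are kept

# One group of bars per column, one bar per row.
barplot(m, beside = TRUE, legend.text = rownames(m))
```

If the rows rather than the columns should form the groups, passing t(m) instead of m would be the obvious variation.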
[R] Re : Book recommendation for newbie to stats and R?
The exact reference is: Wonnacott, T., Wonnacott, R., Introductory Statistics for Business and Economics, New York, 1990.

Justin BEM
Student Statistician-Economist Engineer
BP 294 Yaoundé. Tel. (00237)9597295.
[R] Re : [PS] Re : Book recommendation for newbie to stats and R?
This was debated on the list a few weeks ago. It is not efficient to study both from a single book. Statistics is so vast that a single book would not be enough to describe all its twists and turns. The same is true of R. And, you know, R is programming: some rudiments of algorithmics are necessary. Have you tried "R pour les débutants" by Emmanuel Paradis? (http://cran.r-project.org/doc/contrib/Paradis-rdebuts_fr.pdf) That is where I started. There is also MASS by Venables and Ripley, which is very rich, but you need certain prerequisites.

Justin BEM
Student Statistician-Economist Engineer
BP 294 Yaoundé. Tel. (00237)9597295.

----- Original message -----
From: Ben Fairbank [EMAIL PROTECTED]
To: justin bem [EMAIL PROTECTED]
Sent: Tuesday, 17 October 2006, 16:57:17
Subject: RE: [PS] Re : [R] Book recommendation for newbie to stats and R?

> Justin, thank you very much. Is Mr. Wonnacott's book about statistics only, or about statistics _and_ R?
>
> Ben Fairbank
Re: [R] Calculate NAs from known data: how to?
On 17-Oct-06 Torleif Markussen Lunde wrote:

> Hi
>
> In a dataset I have length and age for cod. The age, however, is only given for 40-100% of the fish. What I need to do is to fill in the NAs in a correct way, so that age has a value for each length. This is to be done for each sample separately (there are 324 samples), meaning the NAs for sample no 1 shall be calculated from the known values from sample no 1. As, for example, length 55 cm can be both 4 and 5 years, I guess a fish with NA age and length 55 cm should be given a random age with a given probability, for example 55 cm = 4 years has p=75%, while 55 cm = 5 years has p=25%. Those p-values should be calculated from the real data. How can this be done in R, and what is the right way to do it? Sample number 1 is given below.
> [snip]

A question with many ramifications!

First of all, there are several possible approaches to imputing missing values. You are wise in recognising that there is uncertainty in this, in general and also for your data set.

For this, I would normally recommend a Multiple Imputation approach, since this would proceed by sampling from a posterior distribution for Age given Length, as estimated by Maximum Likelihood from your data. The differing results of successive imputations then exhibit a variability corresponding to the uncertainty about what value to impute.

Furthermore, when subsequent analyses (such as estimating the parameters of a growth curve from Age and Length, or estimating population dynamics from Age distributions in successive years) are carried out, these can be done for each of the imputations in the multiple set, and the results (and estimated standard errors) can be combined to give an overall estimate and the uncertainty in this, which includes not only the variability in the complete data but also the uncertainty due to imputation.

For this approach, I would be inclined to start with Schafer's norm or mix packages, available in R. But see below.
However, I have had a look at the data for Sample 1 which you included. This throws up several features which should be taken into account, and which indicate that blind use of an imputation package may not be the best approach.

First, I made a CSV file from your table (3 columns: length, age, sample). Then:

    D <- read.csv("LengthAge.csv")
    A <- D$age
    L <- D$length

Now: index which lines have NA for age, then a histogram of Length when Age is present:

    ix0 <- is.na(A)
    hist(L[!ix0], breaks=5*(0:20))

Superimpose a histogram of Length when Age is NA:

    hist(L[ix0], add=TRUE, breaks=5*(0:20), col="red")
    hist(L[!ix0], add=TRUE, breaks=5*(0:20))

(the repetition of L[!ix0] is done because the red has overlaid it for one length range, and the repetition restores it).

This immediately shows that, in general, missing Age is unusual, except for Length in the range (25:30), in which it is the majority.

An alternative picture of the same scene appears if you first make the histogram of all lengths:

    X11()  ## to get a second graphics window
    hist(L, breaks=5*(0:20))
    hist(L[ix0], add=TRUE, breaks=5*(0:20), col="red")

Now you can ask why there should be so many NAs for Age in the Length range (25:30) and, indeed (2nd histogram), why there are so many specimens anyway in that Length range compared with their neighbours. Comparing the two histograms indicates that the excess in that Length range arises from the fish with NA Ages: in the first histogram the number of non-NA Ages in (25:30) is very comparable with the numbers of all fish in other Length ranges.

So my immediate suspicion is that there is something special about the Length range (25:30) in relation to whether Age is recorded or not.

The next thing that emerges from the first histogram is the sharp decline in numbers above Length=55. I therefore wonder why this also happens. Is 55cm a magic (e.g. legal) length threshold for cod? And is 30cm also special in some way?
If so, could there be some pressure on whoever records the data not to take measurements of Age when the fish is near but under 30cm, or to record a value of Length just below the threshold (i.e. incorrectly)? Does this happen in your other data sets?

As well as these "why" questions, however, there is a technical issue arising from the fact that many missing Ages occur in a very limited range of Length:

    sum(ix0)
    [1] 27   ### total number of Age = NA

    sum((L[ix0] > 25) & (L[ix0] <= 30))
    [1] 12   ### Age = NA in (25:30)

    sum((L[!ix0] > 25) & (L[!ix0] <= 30))
    [1] 11   ### Age is known in (25:30)

This is: whether Age=NA is uninformative about Age given recorded Length. More precisely, whether

    Prob[Age=NA | true Age, recorded Length] = Prob[Age=NA | recorded Length]

since many imputation techniques depend on this. If it is not true, then being missing is informative about Age (i.e. the distribution of Age at Length is different for Age=NA cases than for Age != NA cases), though you would not be able to ascertain this without a suitable model for
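For comparison, the simple scheme proposed in the original question (draw each missing age at random, with probabilities taken from the fish of the same length in the same sample) can be sketched in R as below. This is not the multiple-imputation approach recommended above, and it takes for granted that missingness is uninformative given length, which is exactly the assumption questioned in this reply. The function names impute_age and impute_by_sample are hypothetical, as is the fallback to the nearest observed length when no fish of the same length has a known age; the column names length, age and sample follow the CSV described above.

```r
# Draw each missing age from the observed ages of fish with the same
# length; if no such fish exists, use the nearest observed length instead.
impute_age <- function(len, age) {
  known <- !is.na(age)
  if (!any(known)) return(age)          # nothing to sample from
  for (i in which(!known)) {
    d    <- abs(len[known] - len[i])    # distance to each fish of known age
    pool <- age[known][d == min(d)]     # ages at the closest length
    # index into the pool: sample() on a single number would sample 1:x
    age[i] <- pool[sample.int(length(pool), 1)]
  }
  age
}

# Apply the imputation separately within each sample.
impute_by_sample <- function(D) {
  for (s in unique(D$sample)) {
    rows <- D$sample == s
    D$age[rows] <- impute_age(D$length[rows], D$age[rows])
  }
  D
}
```

Because ages are drawn from the empirical frequencies, a length at which three fish are aged 4 and one is aged 5 yields age 4 with probability 0.75. Calling impute_by_sample() several times gives several completed data sets, which is the crude analogue of the "multiple" in multiple imputation.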
[R] Variance of Y_hat in a linear model
       Y    X    Z
    42.0  7.0 33.0
    33.0  4.0 41.0
    75.0 16.0  7.0
    28.0  3.0 49.0
    91.0 21.0  5.0
    55.0  8.0 31.0

    data <- read.table("d.txt", header=TRUE)
    mod <- lm(data$Y ~ data$X + data$Z)
    predict(mod)
           1        2        3        4        5        6
    44.69961 34.22997 76.63735 29.32986 91.09000 48.01321

In the lm, the predicted (fitted) Y_1_hat is 44.69961. Is there a function to give me the variance of Y_1_hat? Neither anova nor summary gives this value.

Thank You
Re: [R] Mixed effect model in R
Please always reply to the list as well, as there might always be someone answering faster or better (or it could be that I am wrong, so someone might correct me).

Indeed, Pinheiro/Bates assume Gaussian error terms... but I am not really sure whether that is what you meant by a non-normally distributed response variable, or by non-normal data. However, from the CRAN Social Sciences task view [ http://cran.r-project.org/src/contrib/Views/SocialSciences.html ]:

> Mixed-effects models: The recommended nlme package (http://cran.r-project.org/src/contrib/Descriptions/nlme.html), associated with Pinheiro and Bates, Mixed-Effects Models in S and S-PLUS (Springer, 2000), fits linear and nonlinear mixed-effects models, commonly used in the social sciences for hierarchical and longitudinal data. Generalized linear mixed-effects models may be fit by the glmmPQL function in the MASS package, and by the lmer function in the Matrix package (http://cran.r-project.org/src/contrib/Descriptions/Matrix.html), which is related to the lme4 package (http://cran.r-project.org/src/contrib/Descriptions/lme4.html), which largely supersedes nlme for linear mixed models. Also see the lmeSplines (http://cran.r-project.org/src/contrib/Descriptions/lmeSplines.html) and lmm (http://cran.r-project.org/src/contrib/Descriptions/lmm.html) packages.

Lina Jansen wrote:

> 2006/10/17, Stefan Grosse [EMAIL PROTECTED]:
>
>> Interesting packages for you might be the nlme and lme4 packages and, as a book, Pinheiro/Bates, Mixed-Effects Models in S and S-Plus.
>
> Thank you for the answer. I am always unsure concerning the non-normality. Can I use nlme and lme4 with non-normal data? At first, I thought they would work like an ANOVA but with random and fixed effects.
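As a small illustration of the glmmPQL route mentioned in the task-view text, here is a sketch for a non-Gaussian (binary) response with a random intercept per subject. It uses the bacteria data set that ships with MASS and follows the example in the glmmPQL help page; it is not a model for the poster's data, which were not shown. The verbose = FALSE argument only suppresses the iteration log.

```r
library(MASS)   # provides glmmPQL() and the bacteria data; relies on nlme

# y is a yes/no factor, measured repeatedly on each child (ID), so a
# binomial family with a random intercept per child is a natural start.
fit <- glmmPQL(y ~ trt + I(week > 2), random = ~ 1 | ID,
               family = binomial, data = bacteria, verbose = FALSE)

summary(fit)
```

In current versions of lme4, generalized models of this kind are fitted with glmer() rather than lmer(); the formula there would carry the random effect as (1 | ID).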
Re: [R] Variance of Y_hat in a linear model
Using the builtin BOD data set try this: predict(lm(demand ~., BOD), se.fit = TRUE)
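To spell out how that answers the question: the se.fit component is the standard error of each fitted value, so its square is the estimated variance of Y_hat. The sketch below uses the builtin BOD data (columns Time and demand) rather than the poster's d.txt, and checks the result against the textbook formula sigma^2 * h_ii, where h_ii are the diagonal elements of the hat matrix. The variable names are choices of this sketch.

```r
# BOD is a small data frame shipped with R (columns Time and demand).
mod <- lm(demand ~ Time, data = BOD)
p   <- predict(mod, se.fit = TRUE)

var_fit <- p$se.fit^2        # estimated variance of each fitted value

# The same quantity from first principles: sigma^2 * diag(H).
s2      <- p$residual.scale^2
var_alt <- s2 * hatvalues(mod)

all.equal(unname(var_fit), unname(var_alt))
```

Note that this is the variance of the estimated mean response at each observed x; the variance relevant to predicting a new observation would add s2 to each value.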
Re: [R] Error: STRING_ELT() can only be applied to a 'character vector', not a 'builtin'
I suspect you have a protection problem. The specific message you quote indicates that STRING_ELT is being called on an object of inappropriate type: but it is quite likely that it is being called on uninitialized memory as the intended object has been garbage-collected.

Messages from a corrupted R session do not always make sense: see the debugging info in `Writing R Extensions' and especially the use of gctorture and valgrind.

Followups to R-devel, please: this looks very like a programming issue.

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
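A rough sketch of how gctorture might be applied to a job like this: gctorture(TRUE) makes R run a garbage collection at almost every allocation, so an unprotected object tends to fail immediately and reproducibly rather than at random. The wrapper name with_gctorture is hypothetical, and the commented call is merely copied from the traceback in the original report. Expect the wrapped code to run very slowly, so wrap the smallest piece that still fails.

```r
# Evaluate an expression with GC torture switched on, restoring the
# previous setting afterwards even if the expression fails.
with_gctorture <- function(expr) {
  old <- gctorture(TRUE)     # returns the previous setting
  on.exit(gctorture(old))
  force(expr)                # the lazy argument is evaluated here
}

# In the failing job one might wrap the call seen in the traceback:
# with_gctorture(g.data.load(tm.time, hist.20051012))

# Harmless demonstration:
res <- with_gctorture(sum(sqrt(1:10)))
```

From a shell, the whole session can also be run under valgrind with R -d valgrind, as described in the same manual.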
[R] if statement error
Hi List,

I was not able to make this work. I know it is a simple one, sorry to bother. Give me some hints pls. Thanks!

Jen

    if(length(real.d)>=30 && length(real.b)>=30 && beta1*beta2*theta1*theta2>0 ) { r <- 1; corr <- 1; }

real.d and real.b are two vectors; beta1, beta2, theta1, and theta2 are constants. The error occurred like this:

    Error in if (length(real.d) >= 30 && length(real.b) >= 30 && beta1 * beta2 * :
            missing value where TRUE/FALSE needed
Re: [R] if statement error
Jenny Stadt wrote:

> I was not able to make this work. I know it is a simple one, sorry to bother. Give me some hints pls. Thanks!

Are you a C programmer? :-)

>     if(length(real.d)>=30 && length(real.b)>=30 && beta1*beta2*theta1*theta2>0 ) { r <- 1; corr <- 1; }

I _think_ you should use & instead of &&. And drop the second ;.

Also, don't forget that "return x" is wrong [it took me a long time to figure out that R != C, and it's just return(x)].

Alberto Monteiro
Re: [R] if statement error
Jenny Stadt <jennystadt at yahoo.ca> writes:

>     if(length(real.d)>=30 && length(real.b)>=30 && beta1*beta2*theta1*theta2>0 ) { r <- 1; corr <- 1; }
>
> real.d and real.b are two vectors; beta1, beta2, theta1, and theta2 are constants. The error occurred like this:
>
>     Error in if (length(real.d) >= 30 && length(real.b) >= 30 && beta1 * beta2 * :
>             missing value where TRUE/FALSE needed

Please follow the advice and provide a full example, where beta1 really is a vector. This works for me below, but it gives the message you mentioned if you uncomment the second line.

Dieter

    beta1 = beta2 = theta1 = theta2 = 1.0
    #beta1 = NULL
    real.d = runif(35)
    real.b = runif(35)
    r=corr=0
    if( length(real.d)>=30 && length(real.b)>=30 && beta1*beta2*theta1*theta2>0 ) { r <- 1; corr <- 1; }
Re: [R] if statement error
Jenny, are there any missing values in your vectors? If so, what effect do you think this will have on an expression like the one required by the if statement, which must resolve fully to either TRUE or FALSE?

Regards, Mike
Re: [R] if statement error
Jenny,

The following example works:

    real.d <- rep(NA, 30)
    real.b <- rep(NA, 30)
    b1 = runif(1); b2 = runif(1); t1 = runif(1); t2 = runif(1)
    if (length(real.d) >= 30 && length(real.b) >= 30 && b1*b2*t1*t2 > 0) {bool = TRUE}
    bool
    [1] TRUE

But this one doesn't:

    real.d <- rep(NA, 30)
    real.b <- rep(NA, 30)
    b1 = runif(1); b2 = runif(1); t1 = runif(1); t2 = NA
    if (length(real.d) >= 30 && length(real.b) >= 30 && b1*b2*t1*t2 > 0) {bool = TRUE}
    Error in if (length(real.d) >= 30 && length(real.b) >= 30 && b1 * b2 * :
      missing value where TRUE/FALSE needed

NAs in the vectors make no difference, because only their lengths are tested, and && is correct. So it appears that at least one of your scalars is missing.

JFL
Re: [R] if statement error
On 17 Oct 2006, at 18:34, Alberto Monteiro wrote:
> Jenny Stadt wrote:
>> I was not able to make this work. I know it is a simple one, sorry to bother. Give me some hints pls. Thanks!
> Are you a C programmer? :-)
>> if (length(real.d) >= 30 && length(real.b) >= 30 && beta1*beta2*theta1*theta2 > 0) { r <- 1; corr <- 1; }
> I _think_ you should use & instead of &&. And drop the second ;.

The && is correct in this case.

& is the vector logical AND operator in R (and, analogously, the bitwise AND in C).
&& is the lazy scalar (atomic) logical AND operator in C and R. If it operates on a vector in R, it ignores all but the first element.
See help("&&").

Since if() in R is scalar (atomic), the && is appropriate.

The second ';' is syntactically correct in R and C, although optional in R.

-Alex

Out of interest, for a vector equivalent to if, see help(ifelse).

> Also, don't forget that "return x" is wrong [it took me a long time to figure out that R != C, and it's just return(x)]
> Alberto Monteiro
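The distinction discussed in this thread can be checked directly. The following is only a small sketch with made-up values, not code from any of the posters. Note that, as far as I know, recent versions of R signal an error when && is given an operand longer than one, rather than silently using the first element.

```r
# Element-wise AND: one result per element, NA propagates.
elementwise <- c(TRUE, FALSE, NA) & TRUE

# Scalar AND is lazy: the right-hand side is not evaluated
# once the left-hand side is FALSE.
lazy <- FALSE && stop("never evaluated")

# An NA operand only matters when it can change the answer.
false_and_na <- FALSE && NA   # FALSE
true_and_na  <- TRUE && NA    # NA

# if() needs a single non-missing TRUE/FALSE, so an NA condition fails.
msg <- tryCatch(
  if (TRUE && NA) "yes" else "no",
  error = function(e) conditionMessage(e)
)
print(msg)
```

This is why a single missing scalar at the end of an && chain produces the "missing value where TRUE/FALSE needed" error, while NAs inside a vector whose length() is tested do not.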
Re: [R] Convert Contingency Table to Flat File
On Tue, 2006-10-17 at 13:09 +0200, Philipp Pagel wrote:
> On Tue, Oct 17, 2006 at 03:08:49AM -0700, Marco LO wrote:
>> Is there any R function out there to turn a multi-way contingency table back to a flat file table of individual rows and attribute columns?
>
> Are you looking for something like this?
>
>     # generate some data
>     x = sample(c(0,1), 100, replace=T)
>     y = sample(c(0,1), 100, replace=T)
>     z = sample(c(0,1), 100, replace=T)
>     # contingency table
>     mytab = table(x,y,z)
>     # flat contingency table
>     as.data.frame( mytab )

This thread reminds me of a discussion a while back, which I cannot seem to find in the archives at the moment.

The steps elucidated by Philipp result in a flattened contingency table, which contains the various cross-classifying factors as unique rows plus a frequency column indicating the number of occurrences of each unique row. It does not, however, result in what might be considered the original 'raw data frame' containing a single row per observation, if that is what one desires.

In other words, we get the following:

    set.seed(1)
    x <- sample(c(0, 1), 100, replace = TRUE)
    y <- sample(c(0, 1), 100, replace = TRUE)
    z <- sample(c(0, 1), 100, replace = TRUE)

    # contingency table
    mytab <- table(x, y, z)
    mytab
    , , z = 0

       y
    x    0  1
      0 17 19
      1 11 15

    , , z = 1

       y
    x    0  1
      0  6 10
      1 12 10

    # flattened contingency table
    FCT <- as.data.frame(mytab)
    FCT
      x y z Freq
    1 0 0 0   17
    2 1 0 0   11
    3 0 1 0   19
    4 1 1 0   15
    5 0 0 1    6
    6 1 0 1   12
    7 0 1 1   10
    8 1 1 1   10

In order to take 'FCT' and convert it to 'raw data rows', we can do the following:

    expand.dft <- function(x, na.strings = "NA", as.is = FALSE, dec = ".")
    {
      # Take each row in the source data frame table and replicate it
      # using the Freq value
      DF <- sapply(1:nrow(x), function(i) x[rep(i, each = x$Freq[i]), ],
                   simplify = FALSE)

      # Take the above list and rbind it to create a single DF
      # Also subset the result to eliminate the Freq column
      DF <- subset(do.call(rbind, DF), select = -Freq)

      # Now apply type.convert to the character coerced factor columns
      # to facilitate data type selection for each column
      DF <- as.data.frame(lapply(DF, function(x)
                          type.convert(as.character(x), na.strings = na.strings,
                                       as.is = as.is, dec = dec)))

      # Return data frame
      DF
    }

    # Now use expand.dft() on the table from above
    new.DF <- expand.dft(FCT)
    str(new.DF)
    'data.frame':   100 obs. of  3 variables:
     $ x: int  0 0 0 0 0 0 0 0 0 0 ...
     $ y: int  0 0 0 0 0 0 0 0 0 0 ...
     $ z: int  0 0 0 0 0 0 0 0 0 0 ...

    # Re-create the multi-way table
    new.tab <- table(new.DF)
    new.tab
    , , z = 0

       y
    x    0  1
      0 17 19
      1 11 15

    , , z = 1

       y
    x    0  1
      0  6 10
      1 12 10

    # Compare to initial mytab
    identical(new.tab, mytab)
    [1] TRUE

So, if one needs it, expand.dft() can be used to take a multi-way contingency table that has been coerced to a data frame and convert it back to the raw data frame. I'm not sure if this functionality is available elsewhere, but thought that it might be helpful.

I included the use of type.convert() in order to make a reasonable attempt at restoring the original data types, as omitting this step leaves all columns as factors.

I wonder if it might make sense to add an 'expand' argument to as.data.frame.table(), which would default to FALSE. It could then be set to TRUE and utilize expand.dft() to take the additional step and return the raw data frame as above.

Anyway, I hope that this might be helpful.

Regards,

Marc Schwartz
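For readers who only need the row expansion and are happy to keep the columns as factors, the same idea can be sketched more compactly by replicating row indices. This is an alternative illustration, not Marc's function; the names idx and raw are made up here, and no type conversion is attempted.

```r
set.seed(1)
x <- sample(c(0, 1), 100, replace = TRUE)
y <- sample(c(0, 1), 100, replace = TRUE)
z <- sample(c(0, 1), 100, replace = TRUE)

mytab <- table(x, y, z)
FCT <- as.data.frame(mytab)

# Repeat each row index as many times as its Freq value says
idx <- rep(seq_len(nrow(FCT)), times = FCT$Freq)

# Pick those rows and drop the Freq column; columns stay factors
raw <- FCT[idx, names(FCT) != "Freq", drop = FALSE]
rownames(raw) <- NULL

# Tabulating the expanded rows gives back the original counts
new.tab <- table(raw)
print(all(as.vector(new.tab) == as.vector(mytab)))
```

Because the columns remain factors with all their levels, cells with a zero count still appear when the expanded data are tabulated again.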
Re: [R] if statement error
Thank you all for the advice here. I followed the suggestion to check the values of the parameters, and found two possible causes of the problem. The first was a missing value in real.d / real.b; the second was beta2 being NA. I fixed the data set and the error no longer shows up. Thank you very much!

Jen
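One way to find such a missing scalar before the if() fails is to test the parameters explicitly. The following is a hedged sketch with made-up values, not Jen's actual code; treating an NA condition as FALSE via isTRUE() is an assumption about the desired behaviour and may not suit every analysis.

```r
beta1 <- 1; beta2 <- NA; theta1 <- 1; theta2 <- 1   # made-up values
real.d <- runif(35)
real.b <- runif(35)

# Report which scalar parameters are missing
params <- c(beta1 = beta1, beta2 = beta2, theta1 = theta1, theta2 = theta2)
missing_params <- names(params)[is.na(params)]
print(missing_params)

# The condition is NA here, because the product involves an NA
cond <- length(real.d) >= 30 && length(real.b) >= 30 && prod(params) > 0

# isTRUE() is TRUE only for a single, non-missing TRUE
if (isTRUE(cond)) {
  r <- 1; corr <- 1
} else {
  r <- 0; corr <- 0
}
```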
Re: [R] Some questions on Rpart algorithm
With regard to your first question, here's a function I used a couple of times to get plots similar to those you're looking for. (Search the list for how to find the source code. Also, there's a reference other than MASS on the ?rpart page.)

    #bogdan romocea 2006-06
    #adapted source code from
    # - text.rpart() from package mvpart
    # - functions$text from rpart()
    # to get acceptable plots of classification trees
    #the tweaked tree plots show the following:
    # - size of each node (counts and percentages)
    # - splitting rules
    # - % cases in each node, or counts
    # - targets with more than 3 categories are properly labelled through colors
    #   (unlike in text.rpart() from mvpart)
    #example:
    # x <- rpart(..., method="class")
    # plot(x, uniform=TRUE, margin=0.02)
    # my.tree.text(x, ncomp.offset=4)
    my.tree.text <- function(x, percent=TRUE, pct.decimals=0, ncomp.offset=2,
        clr=c("red","yellow","blue","green","brown","purple","navy"))
    {
      frame <- x$frame ; col <- names(frame)
      method <- x$method ; ylevels <- attr(x, "ylevels")
      xy <- rpartco(x) ; node <- as.numeric(row.names(x$frame))
      leaves <- rep(TRUE, nrow(frame))
      bar.vals <- x$functions$bar(yval2 = frame$yval2)
      node.size <- rowSums(bar.vals)
      node.title <- paste(node.size, " / ", round(100*node.size/node.size[1]), "%", sep="")
      #---the node barplots
      sub.barplot(xy$x, xy$y, bar.vals, leaves, xadj=1, yadj=1, bord=TRUE, line=TRUE, col=clr)
      rx <- range(xy$x) ; ry <- range(xy$y)
      #---the legend
      if (!is.null(ylevels)) bar.labs <- ylevels
      else bar.labs <- dimnames(x$y)[[2]]
      legend(min(xy$x) - 0.1 * rx, max(xy$y) + 0.05 * ry, bar.labs,
             col = clr, pch = 15, bty = "n")
      text(xy$x[leaves], xy$y[leaves], labels=node.title, pos=3, cex=1.5, offset=1)
      #---the splitting rules
      cxy <- par("cxy")
      left.child <- match(2 * node, node)
      right.child <- match(node * 2 + 1, node)
      rows <- labels(x, pretty = pretty)
      text(xy$x, xy$y + 0.5 * cxy[2], rows[left.child], pos=2, col="navy")
      text(xy$x, xy$y + 0.5 * cxy[2], rows[right.child], pos=4, col="navy")
      #---target composition per node (% or counts)
      if (is.null(frame$yval2)) yval <- frame$yval[leaves]
      else yval <- frame$yval2[leaves,]
      nclass <- (ncol(yval) - 1)/2
      counts <- yval[, 1 + (1:nclass)]
      group <- yval[, 1]
      if (!is.null(bar.labs)) group <- bar.labs[group]
      if (percent) {
        #identical(counts / rowSums(counts), prop.table(counts,1))
        nbr <- round(100*prop.table(counts,1), pct.decimals)
        #t1 <- apply(matrix(nbr,ncol=nclass),2,paste,"%",sep="")
        #t2 <- apply(matrix(t1,ncol=nclass),1,paste,collapse="/")
        t2 <- apply(matrix(nbr,ncol=nclass), 1, paste, collapse="|")
        nlab <- paste(format(group,justify="left"), "\n%: ", t2, sep = "")
      } else {
        t2 <- apply(matrix(counts,ncol=nclass), 1, paste, collapse="|")
        nlab <- paste(format(group,justify="left"), "\nN: ", t2, sep = "")
      }
      text(xy$x[leaves], xy$y[leaves], labels=nlab, pos=1, offset=ncomp.offset)
    }

> -----Original Message-----
> From: [EMAIL PROTECTED] On Behalf Of Marcus, Jeffrey
> Sent: Tuesday, October 17, 2006 10:03 AM
> To: r-help@stat.math.ethz.ch
> Subject: [R] Some questions on Rpart algorithm
>
> Hello: I am using rpart and would like more background on how the splits are made and how to interpret results, and also on how to properly use text(.rpart). I have looked through Venables and Ripley and through the rpart help and still have some questions. If there is a source (say, Breiman et al) on decision trees that would clear this all up, please let me know. The questions below pertain to a classification task (i.e., I'm using the class method). Many thanks in advance.
>
> (1) I'd like text(.rpart) to print percentages of each class rather than counts. I don't see an option for this, so I would like to modify text.rpart. However, I can't find the source since it is a method that's hidden. How can I find the source?
>
> (2) printcp prints a table with columns cp, nsplit, rel error, xerror, xstd. I am guessing that cp is complexity, nsplit is the number of the split, rel error is the error on the test set, xerror is the cross-validation error and xstd is the standard deviation of the error across the cross-validation sets. Is there any documentation on this? For instance, how exactly is complexity computed?
>
> (3) What's a loss matrix? Is it the cost placed on each type of misclassification?
>
> (4) [More of a methodology question] In practice, when would one use different costs on different splitting variables?
>
> Thanks for any help on this.
> Jeff
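On question (1), finding the source of a hidden S3 method: base R offers getS3method() and getAnywhere(), and the pkg:::name form also reaches unexported objects. The sketch below wraps getS3method() in a small helper; find_method is a hypothetical name introduced here for illustration, and the rpart part only runs if that package is installed.

```r
# Hypothetical helper: return the function implementing an S3 method,
# whether or not the package exports it; NULL if there is none.
find_method <- function(generic, class) {
  getS3method(generic, class, optional = TRUE)
}

# A method from 'stats', which is always available
print_lm <- find_method("print", "lm")

# The case from the thread: text() for rpart objects
if (requireNamespace("rpart", quietly = TRUE)) {
  text_rpart <- find_method("text", "rpart")
  # print(text_rpart)            # uncomment to display the source
  # getAnywhere("text.rpart")    # reports where the object was found
}
```

Printing the returned function shows its source, which can then be copied and adapted in the way the function above adapts text.rpart().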
[R] Merging tables of differing lengths
Hi,

I am using the table() function on two different vectors to obtain a frequency distribution for each:

    tabtyp1 <- table(wintype1)
    tabtyp2 <- table(wintype2)

The resulting tables look like this:

    tabtyp1
    wintype1
        0     2     3     4     5     6     7     8
    16826 10031  1636   797   239   399    63     6

    tabtyp2
    wintype2
        0     2     3     4     5     6     7     8    10
    16857 10012  1703   788   171   375    77    14     3

What I want to do is merge these two tables into a 2X10 table in order to do a chi-square test. Given the unequal number of columns, all my attempts are failing. I also have instances where the number of columns is the same but the values are different, e.g., one table has values for 7 and not 8, while the other lacks 7s but has 8s. Any suggestions?
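One possible approach, sketched here with small made-up vectors rather than the poster's data, is to skip the two separate tables and cross-tabulate a group label against the pooled values. Every value that occurs in either vector then gets a column, with zeros where one group lacks it. The names group and value are illustrative.

```r
# Made-up stand-ins for the poster's vectors
wintype1 <- c(0, 0, 2, 2, 3, 7)
wintype2 <- c(0, 2, 2, 3, 8, 8, 10)

# One label per observation, then a single two-way table
group <- rep(c("type1", "type2"),
             times = c(length(wintype1), length(wintype2)))
value <- c(wintype1, wintype2)
tab <- table(group, value)
print(tab)

# With realistic sample sizes the test can be run on tab directly.
# The simulated p-value is used here only because the toy counts
# are far too small for the chi-square approximation.
res <- chisq.test(tab, simulate.p.value = TRUE, B = 2000)
```

If the two tables already exist and the raw vectors are gone, the same layout can be reached by building factors with a shared set of levels before tabulating.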
Re: [R] Adding per-panel text to panel strips in lattice xyplot
On 10/13/06, Frank E Harrell Jr [EMAIL PROTECTED] wrote:
> I would like to add auxiliary information to the bottom of two strips on each panel that comes from a table look-up using the values of two variables that define the panel. For example I might panel on sex and race, showing 3 randomly chosen time series in each panel and want to add (n=100) in the bottom strip to indicate the 3 curves were sampled from 100. Is there a not-too-hard way to do that? I would like to do this both with and without groups= and superposition, but especially with.

There might be, but it might be easier with some changes to lattice. Can you give a minimal example so that we can try out ideas?

Deepayan
Re: [R] Re : Generate a random bistochastic matrix
Thank you, Ravi,

>>>>> "Ravi" == Ravi Varadhan [EMAIL PROTECTED]
>>>>>     on Mon, 16 Oct 2006 18:54:16 -0400 writes:

    Ravi> Martin, I don't think that a doubly stochastic matrix
    Ravi> can be obtained from an arbitrary positive rectangular
    Ravi> matrix. There is a theorem by Sinkhorn (Am Math Month
    Ravi> 1967) on the diagonal equivalence of matrices with
    Ravi> prescribed row and column sums. It shows that given a
    Ravi> positive matrix A (m x n), there is a unique matrix DAE
    Ravi> (where D and E are m x m and n x n diagonal matrices)
    Ravi> with row sums k*r_i (i = 1, ..., m) and column sums c_j
    Ravi> (j = 1, ..., n) such that k = \sum_j c_j / \sum_i r_i.
    Ravi> Therefore, the alternating row and column
    Ravi> normalization algorithm (same as the iterative
    Ravi> proportional fitting algorithm for contingency tables)
    Ravi> will alternate between row and column sums being
    Ravi> unity, while the other sum alternates between k and
    Ravi> 1/k.

    Ravi> Here is a slight modification of your algorithm for
    Ravi> the rectangular case:

    Ravi> bistochMat.rect <- function(m, n, tol = 1e-7, maxit = 1000) {
    Ravi>   ## Purpose: Random bistochastic *square* matrix (M_{ij}):
    Ravi>   ##   M_{ij} >= 0; sum_i M_{ij} = sum_j M_{ij} = 1 (for all i, j)
    Ravi>   ## ----------------------------------------------------------
    Ravi>   ## Arguments: n: (n * n) matrix dimension;
    Ravi>   ## ----------------------------------------------------------
    Ravi>   ## Author: Martin Maechler, Date: 16 Oct 2006, 14:47
    Ravi>   stopifnot(maxit >= 1, tol >= 0)
    Ravi>   M <- matrix(runif(m*n), m, n)
    Ravi>   for(i in 1:maxit) {
    Ravi>     rM <- rowSums(M)
    Ravi>     M <- M / rep((rM), n)
    Ravi>     cM <- colSums(M)
    Ravi>     M <- M / rep((cM), each=m)
    Ravi>     if(all(abs(c(rM,cM) - 1) < tol))
    Ravi>       break
    Ravi>   }
    Ravi>   ## cat("needed", i, "iterations\n")
    Ravi>   ## or rather
    Ravi>   attr(M, "iter") <- i
    Ravi>   M
    Ravi> }

    Ravi> Using this algorithm for an 8 x 4 matrix, for example, we get:

    Ravi> M <- bistochMat.rect(8,4)
    Ravi> apply(M,1,sum)
    Ravi> [1] 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5
    Ravi> apply(M,2,sum)
    Ravi> [1] 1 1 1 1

Of course I had tried that too, before I posted and limited the problem to square matrices.

    Ravi> Clearly the algorithm didn't converge according to
    Ravi> your convergence criterion, but the row sums oscillate
    Ravi> between 1 and 0.5, and the column sums oscillate
    Ravi> between 2 and 1, respectively.

Indeed, and I had tried similar examples. The interesting thing is really the theorem you mention, a consequence of which seems to be that simple row and column scaling iterations would indeed not converge.

Intuitively, I'd still expect that a relatively simple modification of the algorithm should lead to convergence. Your following statement seems to indicate so, or do I misunderstand its consequences?

    Ravi> It is interesting to note that the above algorithm
    Ravi> converges if we use the infinity norm, instead of the
    Ravi> 1-norm, to scale the rows and columns, i.e. we divide
    Ravi> rows and columns by their maxima.
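The oscillation described above can be reproduced with a few lines. This is a minimal sketch under assumed settings (seed, dimensions and a fixed iteration count chosen here), not the function from the thread: each pass first scales rows to sum to 1, then columns to sum to 1, and the sums are recorded after each half-step.

```r
set.seed(42)
m <- 8; n <- 4
M <- matrix(runif(m * n), m, n)

for (i in 1:200) {
  # Row step: M / rowSums(M) recycles down columns, so row i is
  # divided by its own sum
  M <- M / rowSums(M)
  cols_after_row_step <- colSums(M)   # these settle near m/n = 2

  # Column step: divide column j by its own sum
  M <- M / rep(colSums(M), each = m)
  rows_after_col_step <- rowSums(M)   # these settle near n/m = 0.5
}

print(round(rows_after_col_step, 6))
print(round(cols_after_row_step, 6))
```

The totals explain the pattern: after the row step the entries sum to m, after the column step they sum to n, so for m != n both sets of sums cannot equal 1 at the same time.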
[R] Predicted value at a new level in lm
       Y     X     Z
    42.0   7.0  33.0
    33.0   4.0  41.0
    75.0  16.0   7.0
    28.0   3.0  49.0
    91.0  21.0   5.0
    55.0   8.0  31.0

    data <- read.table("d.txt", header=TRUE)
    mod <- lm(data$Y ~ data$X + data$Z)

I would like to know the predicted value at a new level, say X=10, Z=30. Is there a function available to calculate it directly?

Thank you
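A sketch of one way to do this with predict(): the data frame below is typed in from the table in the post as I read it, and the names dat, new and pred are illustrative. The model is fitted with a formula on column names plus a data argument, because predict() matches the columns of newdata by name, which does not work for terms written as data$X.

```r
# Values typed in from the post (assumed reading of the table)
dat <- data.frame(Y = c(42, 33, 75, 28, 91, 55),
                  X = c(7, 4, 16, 3, 21, 8),
                  Z = c(33, 41, 7, 49, 5, 31))

# Fit using column names and data=, not dat$Y ~ dat$X + dat$Z
mod <- lm(Y ~ X + Z, data = dat)

# Predicted value at the new level X = 10, Z = 30
new <- data.frame(X = 10, Z = 30)
pred <- predict(mod, newdata = new)
print(pred)

# The same number from the coefficients: intercept + b_X*10 + b_Z*30
by_hand <- sum(coef(mod) * c(1, 10, 30))

# Optionally, a prediction interval around the fitted value
pred_int <- predict(mod, newdata = new, interval = "prediction")
```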
Re: [R] Adding per-panel text to panel strips in lattice xyplot
Deepayan Sarkar wrote:
> On 10/13/06, Frank E Harrell Jr [EMAIL PROTECTED] wrote:
>> I would like to add auxiliary information to the bottom of two strips on each panel that comes from a table look-up using the values of two variables that define the panel. For example I might panel on sex and race, showing 3 randomly chosen time series in each panel and want to add (n=100) in the bottom strip to indicate the 3 curves were sampled from 100. Is there a not-too-hard way to do that? I would like to do this both with and without groups= and superposition, but especially with.
>
> There might be, but it might be easier with some changes to lattice. Can you give a minimal example so that we can try out ideas?
>
> Deepayan

Thanks for your note Deepayan. The difficulty is that the quantity to add may need to be obtained by a table look-up given the current panel strip values. I have gotten around this by duplicating the lookup values (here sizecluster) to correspond to x and y, then using subscripts. The code snippet below does not put the extra value in a strip but right under the bottom strip. Better would be inside the bottom strip.

    textfun <- function(subscripts) {
      if(!length(subscripts)) return()
      size <- sizecluster[subscripts[1]]
      txt <- paste('N=', size, sep='')
      grid.text(txt, x=.005, y=.99, just=c(0,1),
                gp=gpar(fontsize=9, col=gray(.25)))
    }

    xyplot(Y ~ X | distribution*cluster, groups=curve,
           xlab=xlab, ylab=ylab, xlim=xlim, ylim=ylim, as.table=TRUE,
           panel=function(x, y, subscripts, ...) {
             panel.superpose(x, y, subscripts, ...)
             textfun(subscripts)
           })

Thanks
--
Frank E Harrell Jr
Professor and Chair, Department of Biostatistics
School of Medicine, Vanderbilt University
[R] looking for a cleaner way to do something
I have two numeric vectors, each of length 17, and both are named in exactly the same way. So temp1 is

    obsnum     p     m pppmp   ...
      1417    52    63    85   ...

and temp2 is

    obsnum     p     m pppmp   ...
      1213    41    50    97   ...

What I want is a resultant matrix with 2 rows and 16 columns, where the 16 columns are columns 2:17 divided by the respective first element of each vector (so 52, 63 and 85 should all get divided by 1417, and 41, 50 and 97 should all be divided by 1213). It doesn't have to retain the column names because I can just assign them again when I am assigning the rownames.

Below is my code:

    resultmatrix <- rbind(temp1, temp2)
    resultmatrix <- resultmatrix[,2:17] / resultmatrix[,1]
    colnames(resultmatrix) <- colnames(temp1)
    rownames(resultmatrix) <- c("group1", "group2")

I'm pretty sure the above will work, but it seemed kind of ugly and I am wondering if there is a better way, because I am trying to improve in R. If there isn't, that's fine. Thanks.
Re: [R] Re : Generate a random bistochastic matrix
Martin,

Sinkhorn's theorem excludes the possibility of obtaining a doubly stochastic matrix of the form D%*%A%*%E, i.e. one that is diagonally equivalent to a given positive rectangular matrix A. But it doesn't say that one can't obtain a doubly stochastic matrix B from A by some set of operations other than multiplying by diagonal matrices. This throws up a number of issues: does such a B always exist, is it unique in some sense, and if so, what is its relationship to A? This seems like a really hard problem. If it can be set up as an optimization problem, then perhaps we could establish conditions under which a solution would exist.

In iterative proportional fitting for contingency tables, the row sums and the column sums both add up to the grand total, so there is no problem. Also, in the case of the infinity norm, the constraints are much looser, so convergence is easy.

I also wonder about the physical reality of this problem, i.e. is there a physical problem that can give rise to a rectangular doubly stochastic matrix? In Markov chain problems, one always gets a square matrix. I am not familiar with graph theory applications, where doubly stochastic matrices play a useful role.

Ravi.

Ravi Varadhan, Ph.D.
Assistant Professor, The Center on Aging and Health
Division of Geriatric Medicine and Gerontology
Johns Hopkins University
Re: [R] looking for a cleaner way to do something
Try this:

    X <- structure(11:15, .Names = letters[1:5])
    Y <- structure(21:25, .Names = letters[1:5])
    rbind(group1 = X, group2 = Y)
    tab <- rbind(group1 = X, group2 = Y)
    tab[,-1] / tab[,1]
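Why the one-line division works: a matrix is stored column by column, so the length-2 vector tab[, 1] is recycled down each column and every row ends up divided by its own first element. The sketch below repeats the toy vectors from the reply and compares the result with sweep(), which states the row-wise intent explicitly; the names ratio and ratio_sweep are illustrative.

```r
X <- structure(11:15, .Names = letters[1:5])
Y <- structure(21:25, .Names = letters[1:5])
tab <- rbind(group1 = X, group2 = Y)

# Recycling: the divisor c(11, 21) lines up with the two rows
# of every column
ratio <- tab[, -1] / tab[, 1]

# The same computation with the row-wise intent spelled out
ratio_sweep <- sweep(tab[, -1, drop = FALSE], 1, tab[, 1], "/")

print(ratio)
print(all.equal(ratio, ratio_sweep))
```

Row and column names are carried over from rbind(), so no renaming step is needed afterwards.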
Re: [R] looking for a cleaner way to do something
Sorry, there was an extra line in there. It should be:

    X <- structure(11:15, .Names = letters[1:5])
    Y <- structure(21:25, .Names = letters[1:5])
    tab <- rbind(group1 = X, group2 = Y)
    tab[,-1] / tab[,1]

On 10/17/06, Gabor Grothendieck [EMAIL PROTECTED] wrote:
> Try this:
>
>     X <- structure(11:15, .Names = letters[1:5])
>     Y <- structure(21:25, .Names = letters[1:5])
>     rbind(group1 = X, group2 = Y)
>     tab <- rbind(group1 = X, group2 = Y)
>     tab[,-1] / tab[,1]
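The division in the reply works because R stores a matrix column by column and recycles the shorter operand, so the length-2 vector tab[, 1] lines up with the two rows of every column. A minimal sketch with made-up values follows; the vectors temp1 and temp2 and their names are hypothetical stand-ins for the poster's length-17 vectors, and sweep() is shown only as a more explicit way to say the same thing.

```r
# Hypothetical stand-ins for the poster's named vectors (only 4 elements here).
temp1 <- c(obsnum = 1417, a = 52, b = 63, c = 85)
temp2 <- c(obsnum = 1213, a = 41, b = 50, c = 97)

# Stack the vectors as rows; the row names are set in the same call.
tab <- rbind(group1 = temp1, group2 = temp2)

# Divide every remaining column by the first column.
# tab[, 1] has one value per row and is recycled down each column.
ratios <- tab[, -1] / tab[, 1]

# The same operation spelled out: sweep the row statistic out with "/".
ratios_sweep <- sweep(tab[, -1], 1, tab[, 1], "/")

print(ratios)
```

Row and column names carry over from rbind(), so no separate rownames() or colnames() assignment is needed.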
Re: [R] Predicted value at a new level in lm
You can use the predict.lm method!

-------- Original message --------
From: [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Cc:
Date: Tue, 17 Oct 2006 12:34:12 -0700 (PDT)
Subject: [R] Predicted value at a new level in lm

>        Y    X    Z
>     42.0  7.0 33.0
>     33.0  4.0 41.0
>     75.0 16.0  7.0
>     28.0  3.0 49.0
>     91.0 21.0  5.0
>     55.0  8.0 31.0
>
>     data<-read.table("d.txt",header=TRUE)
>     mod<-lm(data$Y~data$X+data$Z)
>
> I would like to know the predicted value at a new level, say X=10, Z=30.
> Is there a function available to calculate it directly?
> Thank you
Re: [R] Adding per-panel text to panel strips in lattice xyplot
On 10/17/06, Frank E Harrell Jr [EMAIL PROTECTED] wrote:
> Deepayan Sarkar wrote:
>> On 10/13/06, Frank E Harrell Jr [EMAIL PROTECTED] wrote:
>>> I would like to add auxiliary information to the bottom of two strips
>>> on each panel that comes from a table look-up using the values of two
>>> variables that define the panel. For example I might panel on sex and
>>> race, showing 3 randomly chosen time series in each panel and want to
>>> add (n=100) in the bottom strip to indicate the 3 curves were sampled
>>> from 100. Is there a not-too-hard way to do that? I would like to do
>>> this both with and without groups= and superposition, but especially
>>> with.
>>
>> There might be, but it might be easier with some changes to lattice.
>> Can you give a minimal example so that we can try out ideas?
>>
>> Deepayan
>
> Thanks for your note Deepayan. The difficulty is that the quantity to add
> may need to be obtained by a table look-up given current panel strip
> values. I have gotten around this by duplicating the lookup values (here
> sizecluster) to correspond to x and y then using subscripts. The code
> snippet below does not put the extra value in a strip but right under the
> bottom strip. Better would be inside the bottom strip.
>
>     textfun <- function(subscripts) {
>       if(!length(subscripts)) return()
>       size <- sizecluster[subscripts[1]]
>       txt <- paste('N=',size,sep='')
>       grid.text(txt, x=.005, y=.99, just=c(0,1),
>                 gp=gpar(fontsize=9, col=gray(.25)))
>     }
>     xyplot(Y ~ X | distribution*cluster, groups=curve,
>            xlab=xlab, ylab=ylab, xlim=xlim, ylim=ylim, as.table=TRUE,
>            panel=function(x, y, subscripts, ...) {
>              panel.superpose(x, y, subscripts, ...)
>              textfun(subscripts)
>            })

Well, the strip function has always been passed an argument called
'which.panel' (and the latest lattice has a function called
'which.packet()' which gives the same information inside a panel function
as well). It seems to me that that's the thing you want to use. E.g.

    library(lattice)
    dotplot(variety ~ yield | site * year, data = barley,
            strip = function(..., which.panel) print(which.panel))

prints

    [1] 1 1
    [1] 1 1
    [1] 2 1
    ...
    [1] 5 2
    [1] 6 2
    [1] 6 2

-Deepayan
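The which.panel idea can be taken one step further: a custom strip function can draw the usual strip and then add text looked up from the panel's level index. The following is only a sketch under stated assumptions, not the approach either poster settled on. The lookup table n_by_site and the function name count_strip are made up for illustration, and the counts are simply the number of rows per site in the barley data.

```r
library(lattice)
library(grid)

# Hypothetical lookup table: one value per level of the first
# conditioning variable (here, the number of rows for each site).
n_by_site <- table(barley$site)

# Custom strip: draw the default strip, then add the looked-up count.
# lattice calls the strip function once per conditioning variable;
# which.given says which variable, which.panel gives the level indices.
count_strip <- function(which.given, which.panel, factor.levels, ...) {
  strip.default(which.given = which.given, which.panel = which.panel,
                factor.levels = factor.levels, ...)
  if (which.given == 1) {
    n <- n_by_site[which.panel[1]]
    grid.text(paste0("n=", n), x = unit(0.02, "npc"), just = "left",
              gp = gpar(fontsize = 7, col = gray(0.25)))
  }
}

p <- dotplot(variety ~ yield | site * year, data = barley,
             strip = count_strip)
```

Printing p draws the plot. Because the text is drawn while the strip viewport is current, it lands inside the strip for the first conditioning variable rather than in the panel area.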
Re: [R] Merging tables of differing lengths
> What I want to do is merge these two tables into a 2X10 table in order to
> do a chi-square test. Given the unequal number of columns, all my

You might want to try something like:

    levs <- unique(c(wintype1, wintype2))
    table(factor(wintype1, levels=levs))
    table(factor(wintype2, levels=levs))

Hadley
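To make the suggestion concrete, here is a small sketch of how the two tables could be stacked once both factors share the same levels. The contents of wintype1 and wintype2 are invented for illustration; only the variable names come from the thread.

```r
# Made-up category vectors with different sets of observed values.
wintype1 <- c("a", "b", "b", "c")
wintype2 <- c("b", "c", "d", "d", "d")

# Use the union of the observed values as the common set of levels,
# so that categories missing from one vector get a count of zero.
levs <- sort(unique(c(wintype1, wintype2)))

tab <- rbind(group1 = table(factor(wintype1, levels = levs)),
             group2 = table(factor(wintype2, levels = levs)))
print(tab)

# The stacked table can then be passed to chisq.test(); with counts this
# small R warns that the chi-squared approximation may be inaccurate.
res <- suppressWarnings(chisq.test(tab))
```

With real data the same three steps apply unchanged: common levels, two tables, rbind().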
Re: [R] Predicted value at a new level in lm
On Tue, 2006-10-17 at 12:34 -0700, Li Zhang wrote:
>        Y    X    Z
>     42.0  7.0 33.0
>     33.0  4.0 41.0
>     75.0 16.0  7.0
>     28.0  3.0 49.0
>     91.0 21.0  5.0
>     55.0  8.0 31.0
>
>     data<-read.table("d.txt",header=TRUE)
>     mod<-lm(data$Y~data$X+data$Z)
>
> I would like to know the predict value at a new level, say X=10 Z=30

    dat <- scan()
    1: 42.0 7.0 33.0
    4: 33.0 4.0 41.0
    7: 75.0 16.0 7.0
    10: 28.0 3.0 49.0
    13: 91.0 21.0 5.0
    16: 55.0 8.0 31.0
    19:
    Read 18 items
    dat <- as.data.frame(matrix(dat, ncol = 3, byrow = TRUE))
    names(dat) <- c("X","Y","Z")
    dat
       X  Y  Z
    1 42  7 33
    2 33  4 41
    3 75 16  7
    4 28  3 49
    5 91 21  5
    6 55  8 31
    mod <- lm(Y ~ X + Z, data = dat) # note use of data argument
    pred <- predict(mod, newdata = list(X = 10, Z = 30))
    pred
    [1] -0.003469617

or

    pred <- predict(mod, newdata = data.frame(X = 10, Z = 30))
    pred
    [1] -0.003469617

HTH

G

> Is there a function available to calculate it directly?
> Thank you

--
Gavin Simpson                 [t] +44 (0)20 7679 0522
ECRC                          [f] +44 (0)20 7679 0565
UCL Department of Geography
Pearson Building              [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street
London, UK                    [w] http://www.ucl.ac.uk/~ucfagls/cv/
WC1E 6BT                      [w] http://www.ucl.ac.uk/~ucfagls/
*Note new Address and Fax and Telephone numbers from 10th April 2006*
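Two details are worth spelling out. First, a formula written as data$Y ~ data$X + data$Z gives predict() no variable names to match against newdata, which is why the reply refits the model with the data argument. Second, the column order in the original post is Y, X, Z, whereas the reply labels the columns X, Y, Z; the sketch below follows the order in the original post, so its result will not match the value printed in the reply. It is a minimal illustration only, with the data typed in directly rather than read from d.txt.

```r
# Data from the original post, with columns in the posted order Y, X, Z.
dat <- data.frame(
  Y = c(42, 33, 75, 28, 91, 55),
  X = c(7, 4, 16, 3, 21, 8),
  Z = c(33, 41, 7, 49, 5, 31)
)

# Fit with bare variable names plus data=, so predict() can look up
# X and Z by name in newdata.
mod <- lm(Y ~ X + Z, data = dat)

# New level at which a prediction is wanted.
new_obs <- data.frame(X = 10, Z = 30)

pred <- predict(mod, newdata = new_obs)

# Same point prediction, with a 95% prediction interval alongside it.
pred_int <- predict(mod, newdata = new_obs, interval = "prediction")

# The prediction is just the fitted linear combination of coefficients.
by_hand <- sum(coef(mod) * c(1, 10, 30))
```

The interval argument is optional; it is included because a point prediction from six observations says little without some indication of its uncertainty.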
[R] barplot question
i'm doing a bar plot and there are 16 column variables. is there a way to
make the variable names go down instead of across when you do the barplot?
because the names are so long, the barplot just shows 3 names and leaves
the rest out. if i could rotate the names 90 degrees, it would probably fit
a lot more. or maybe i can use space to make the horizontal width longer?
I looked up ?barplot but i'm not sure. when 1st and 2nd are on the bottom,
things look fine but i'm not as interested in those 2 barplots.

i didn't use any special options. i just did

    barplot(probsignmatrix)
    barplot(t(probsignmatrix))
    barplot(probsignmatrix,beside=T)
    barplot(t(probsignmatrix),beside=T)

i put probsignmatrix below in case someone wants to see what i mean because
it may not be clear. i don't expect anyone to type it in but rounding would
still show what i mean. thanks a lot.

            pcount pmpppcount pmmppcount pmmmpcount
    1st 0.03477157 0.02842640 0.03157360 0.03365482
    2nd 0.04648895 0.02901495 0.03092490 0.03064044

            pcount     mcount pppmmcount ppmmmcount
    1st 0.04010152 0.03553299 0.03989848 0.04182741
    2nd 0.04108420 0.03998700 0.03958062 0.04059655

        ppmppcount ppmmpcount pppmpcount ppmpmcount
    1st 0.02817259 0.03203046 0.02781726 0.02218274
    2nd 0.03039662 0.03027471 0.02901495 0.02170026

        pmpmpcount pmpmmcount pmmpmcount pmppmcount
    1st 0.01771574 0.02289340 0.02583756 0.02390863
    2nd 0.01601105 0.02287874 0.02165962 0.02267555
[R] cluster in R
hi,

is there some good summary on clustering methods in R? It seems there are
many packages involving it.

And I have two questions on clustering here:

1. Is there a way of evaluating the effectiveness (or separation) of a
   clustering (rather than by visualization)?
2. Is there a search method (like genetic search) which can help find the
   best subset of attributes which gives the best separation?

Thanks,

--
Weiwei Shi, Ph.D
Research Scientist
GeneGO, Inc.

Did you always know? No, I did not. But I believed...
---Matrix III
Re: [R] barplot question
Try

    RSiteSearch("rotate barplot labels")

Then read the first thread for an example of what you want to do.

Cheers

Francisco

Dr. Francisco J. Zagmutt
College of Veterinary Medicine and Biomedical Sciences
Colorado State University

From: Leeds, Mark (IED) [EMAIL PROTECTED]
To: R-help@stat.math.ethz.ch
Subject: [R] barplot question
Date: Tue, 17 Oct 2006 17:15:43 -0400

> i'm doing a bar plot and there are 16 column variables. is there a way to
> make the variable names go down instead of across when you do the barplot?
> because the names are so long, the barplot just shows 3 names and leaves
> the rest out. if i could rotate the names 90 degrees, it would probably
> fit a lot more.
Re: [R] barplot question
Mark

> i'm doing a bar plot and there are 16 column variables. is there a way to
> make the variable names go down instead of across when you do the barplot?
> because the names are so long, the barplot just shows 3 names and leaves
> the rest out. if i could rotate the names 90 degrees, it would probably
> fit a lot more.

Is this the sort of thing you mean:

    temp <- barplot(rnorm(16, 3))
    # xpd=TRUE lets the labels be drawn below the plot region
    text(temp, rep(-0.2, 16), paste('trt', 1:16), srt=90, adj=1, xpd=TRUE)

Peter Alspach

> or maybe i can use space to make the horizontal width longer? I looked up
> ?barplot but i'm not sure. when 1st and 2nd are on the bottom, things look
> fine but i'm not as interested in those 2 barplots. i didn't use any
> special options. i just did
>
>     barplot(probsignmatrix)
>     barplot(t(probsignmatrix))
>     barplot(probsignmatrix,beside=T)
>     barplot(t(probsignmatrix),beside=T)
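An alternative to placing the labels by hand is to let barplot() rotate them itself. The sketch below is one possible approach, not something proposed in the thread: it uses the standard graphics parameters las (label orientation) and mar (margin size), and the matrix probs with its column names is simulated stand-in data, not the poster's probsignmatrix.

```r
# Simulated 2 x 16 matrix with long column names (hypothetical data).
set.seed(1)
probs <- matrix(runif(32, 0.01, 0.05), nrow = 2,
                dimnames = list(c("1st", "2nd"),
                                paste0("pattern", 1:16, "count")))

# Enlarge the bottom margin so the rotated names are not cut off,
# remembering the old settings so they can be restored afterwards.
op <- par(mar = c(8, 4, 2, 1))

# las = 2 draws axis labels perpendicular to the axis;
# cex.names shrinks the bar labels slightly.
mids <- barplot(probs, beside = TRUE, las = 2, cex.names = 0.8,
                legend.text = TRUE)

par(op)
```

barplot() returns the bar midpoints, here one row per group, which is what a manual text() call would need if finer control over label placement is wanted.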
[R] EM Algorthm help library norm
Hello,

i need some help concerning the library norm. i have to impute some missing
values using the EM algorithm. The help offered for the library doesn't
really help me; maybe somebody has already worked on the EM algorithm or
multiple imputation.

some fictitious data:

    x1 x2
    50 60
    24  .
    26 20
    87  .
    21  .

Problem: EM algorithm in R calculating the missing values in x2.

Thanks in advance.
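A minimal sketch of the usual sequence of calls in the norm package follows, assuming the package is installed and that missing values are coded as NA rather than ".". The data are simulated for illustration, because the five rows in the question contain only two complete cases, which is too few to estimate a covariance reliably. Note that imp.norm() draws the missing values at random given the EM estimates; it does not fill in conditional means.

```r
# Only run when the 'norm' package is available (an assumption).
if (requireNamespace("norm", quietly = TRUE)) {
  # Simulated two-column data with some x2 values set to missing.
  set.seed(42)
  x1 <- rnorm(50, mean = 50, sd = 10)
  x2 <- 10 + 0.8 * x1 + rnorm(50, sd = 5)
  x2[sample(50, 15)] <- NA
  x <- cbind(x1 = x1, x2 = x2)

  # Preliminary bookkeeping on the missingness pattern.
  s <- norm::prelim.norm(x)

  # EM estimates of the mean vector and covariance matrix.
  theta <- norm::em.norm(s, showits = FALSE)
  est <- norm::getparam.norm(s, theta)   # list with elements mu and sigma

  # The package's random number generator must be seeded before imputing.
  norm::rngseed(1234567)

  # One random imputation of the missing entries given the EM estimates.
  ximp <- norm::imp.norm(s, theta, x)
}
```

For multiple imputation the last step would be repeated several times, typically with parameter values drawn by data augmentation rather than held fixed at the EM estimates.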
Re: [R] cluster in R
On 10/17/06, Weiwei Shi [EMAIL PROTECTED] wrote:
> is there some good summary on clustering methods in R? It seems there are
> many packages involving it.

Gabor provided this very useful link a couple of days back.

http://cran.r-project.org/src/contrib/Views/Cluster.html

jab
--
John Bollinger, CFA, CMT
www.BollingerBands.com

If you advance far enough, you arrive at the beginning.
Re: [R] cluster in R
hi,

I just happened to find that page. But it seems too brief to me. For
example, my project involves an undetermined number of clusters and an
undetermined set of attributes for the samples to be clustered. What kind
of methods should I start with?

Thanks a lot for the prompt reply.

W.

On 10/17/06, Gabor Grothendieck [EMAIL PROTECTED] wrote:
> Go to the R home page (google for R), click on CRAN in left pane, choose
> a mirror, click on Task Views in left pane and choose Cluster.

--
Weiwei Shi, Ph.D
Research Scientist
GeneGO, Inc.

Did you always know? No, I did not. But I believed...
---Matrix III
Re: [R] cluster in R
Go to the R home page (google for R), click on CRAN in left pane, choose a
mirror, click on Task Views in left pane and choose Cluster.

On 10/17/06, Weiwei Shi [EMAIL PROTECTED] wrote:
> hi,
> is there some good summary on clustering methods in R? It seems there are
> many packages involving it.
>
> And I have two questions on clustering here:
> 1. Is there a way of evaluating the effectiveness (or separation) of a
>    clustering (rather than by visualization)?
> 2. Is there a search method (like genetic search) which can help find the
>    best subset of attributes which gives the best separation?
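On the first question in the thread (a numerical measure of separation, and a way to choose the number of clusters), one commonly used option is the average silhouette width from the cluster package, which ships with R as a recommended package. The sketch below is an illustration on simulated data, not a recommendation from the posters; the data, the candidate range 2:5, and the variable names are all assumptions.

```r
library(cluster)

# Simulated data: two well-separated groups in two dimensions.
set.seed(1)
x <- rbind(matrix(rnorm(100, mean = 0, sd = 0.3), ncol = 2),
           matrix(rnorm(100, mean = 3, sd = 0.3), ncol = 2))

# Candidate numbers of clusters.
ks <- 2:5

# For each k, partition around medoids and record the average
# silhouette width (ranges from -1 to 1; larger means better separated).
avg_width <- sapply(ks, function(k) pam(x, k)$silinfo$avg.width)

# Pick the k with the largest average silhouette width.
best_k <- ks[which.max(avg_width)]

print(data.frame(k = ks, avg_width = round(avg_width, 3)))
```

The same criterion could in principle be used to compare subsets of attributes, by recomputing the clustering and its silhouette width on each candidate subset, although an exhaustive search grows quickly with the number of attributes.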
Re: [R] barplot question
On 10/17/06, Leeds, Mark (IED) [EMAIL PROTECTED] wrote:
> i'm doing a bar plot and there are 16 column variables. is there a way to
> make the variable names go down instead of across when you do the barplot?
> because the names are so long, the barplot just shows 3 names and leaves
> the rest out. if i could rotate the names 90 degrees, it would probably
> fit a lot more.

Don't know if you want to go this way, but try

    library(lattice)
    barchart(t(probsignmatrix))

-Deepayan
Re: [R] Input buffer overflow
No one answered this question but to answer my own question I did notice
that since I posted this, there have been changes to parse.Rd in the
development version of R:

https://svn.r-project.org/R/trunk/src/library/base/man/parse.Rd

indicating a limit of 8192 bytes on the size of strings which can be
parsed.

On 10/15/06, Gabor Grothendieck [EMAIL PROTECTED] wrote:
> In gsubfn I replace matches with strings that represent calls to a
> function and then perform
>
>     paste(eval(parse(text= ...)), collapse = "")
>
> on the result. One user of gsubfn is using it with very long strings
> (over 20,000 characters) and the parse is giving an input buffer
> overflow. Here is an artificial example:
>
>     s <- paste(rep("X", 25000), collapse = "")
>     out <- parse(text = shQuote(s))
>     Error in parse(text = shQuote(s)) : input buffer overflow
>
> Is there a way to increase the limit?
[R] Semi-definite programming
Dear R users

Are any R users aware of implementations of semi-definite optimization in
R? For example, has anybody interfaced any of the numerous public domain C
libraries for SDP to R, and would they be willing to share?

My objective here is to implement variants of Maximum Variance Unfolding as
outlined in, for example,
http://www.icml2006.org/icml_documents/camera-ready/131_A_Duality_View_of_Sp.pdf

Any ideas would be warmly appreciated.

Simon Gatehouse
School of Biological, Earth and Environmental Sciences
University of New South Wales
Re: [R] Adding per-panel text to panel strips in lattice xyplot
This looks like an application for bottom.strips and right.strips, in
addition to the current left.strips and strips, in each panel. The new
feature is that these new strips might be associated with different
information than the regular strips that reflect the levels of the
conditioning variables.
Re: [R] overlapping intervals
Hello Jim,
thank you very much for this elegant code. It does exactly what I need.
I'll check findInterval as well.
Thanks again
Giovanni

On Oct 16, 2006, at 7:22 AM, jim holtman wrote:
> Here is a more general way that looks for the transitions:
>
>     series1<-cbind(Start=c(10,21,40,300),End=c(20,26,70,350))
>     series2<-cbind(Start=c(25,60,210,500),End=c(40,100,400,1000))
>     x <- rbind(series1, series2)
>     # create +1 for start and -1 for end
>     x.s <- rbind(cbind(x[,1], 1), cbind(x[,2], -1))
>     # sort by start times
>     x.s <- x.s[order(x.s[,1]),]
>     # cumsum is a count of the transitions
>     x.s <- cbind(x.s, cumsum(x.s[,2]))
>     # c(1,2) is start and c(-1,1) is the end of an overlap
>     cbind(x.s[x.s[,2] == 1 & x.s[,3] == 2, 1],
>           x.s[x.s[,2] == -1 & x.s[,3] == 1, 1])
>          [,1] [,2]
>     [1,]   25   26
>     [2,]   40   40
>     [3,]   60   70
>     [4,]  300  350
>
> On 10/15/06, Giovanni Coppola [EMAIL PROTECTED] wrote:
>> Hello everybody,
>> I have two series of intervals, and I'd like to output the shared
>> regions. For example:
>>
>>     series1<-cbind(Start=c(10,21,40,300),End=c(20,26,70,350))
>>     series2<-cbind(Start=c(25,60,210,500),End=c(40,100,400,1000))
>>
>>     series1
>>          Start End
>>     [1,]    10  20
>>     [2,]    21  26
>>     [3,]    40  70
>>     [4,]   300 350
>>
>>     series2
>>          Start  End
>>     [1,]    25   40
>>     [2,]    60  100
>>     [3,]   210  400
>>     [4,]   500 1000
>>
>> I'd like to have something like this as result:
>>
>>     shared
>>          Start End
>>     [1,]    25  26
>>     [2,]    60  70
>>     [3,]   300 350
>>
>> I found this post, but the solution finds the regions shared across all
>> the intervals.
>> http://finzi.psych.upenn.edu/R/Rhelp02a/archive/59594.html
>>
>> Can anybody help me with this?
>> Thanks
>> Giovanni
>
> --
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
>
> What is the problem you are trying to solve?
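The transition-counting idea in the quoted code can be wrapped in a small function. The version below is a sketch, not Jim Holtman's code verbatim: the function name shared_intervals is made up, ties are broken explicitly so that a start sorts before an end at the same position, and zero-length overlaps such as 40 to 40 are dropped to match the result Giovanni asked for. It assumes that the intervals within each series do not overlap one another.

```r
# Find regions covered by an interval from each of two series.
# a, b: two-column matrices (Start, End); intervals within one series
# are assumed not to overlap each other.
shared_intervals <- function(a, b) {
  x <- rbind(a, b)
  # One event per endpoint: +1 at each start, -1 at each end.
  ev <- rbind(cbind(x[, 1], 1), cbind(x[, 2], -1))
  # Sort by position; at equal positions put starts before ends.
  ev <- ev[order(ev[, 1], -ev[, 2]), ]
  # Running count of how many intervals are currently open.
  depth <- cumsum(ev[, 2])
  # An overlap begins when a start raises the count to 2,
  # and finishes when an end lowers it back to 1.
  out <- cbind(Start = ev[ev[, 2] == 1 & depth == 2, 1],
               End   = ev[ev[, 2] == -1 & depth == 1, 1])
  # Drop zero-length overlaps where two intervals merely touch.
  out[out[, "End"] > out[, "Start"], , drop = FALSE]
}

series1 <- cbind(Start = c(10, 21, 40, 300), End = c(20, 26, 70, 350))
series2 <- cbind(Start = c(25, 60, 210, 500), End = c(40, 100, 400, 1000))

shared <- shared_intervals(series1, series2)
print(shared)
# Three rows: 25-26, 60-70 and 300-350
```

Removing the final filtering line reproduces the four-row result in the quoted message, including the touching point at 40.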
[R] strange error in mtrace
Dear useRs,

I am experiencing a very strange error with Mark Bravington's package
debug. I haven't seen it before. Here is the sample session:

    > library(debug)
    Loading required package: mvbutils
    MVBUTILS: no tasks vector found in ROOT
    Loading required package: tcltk
    Loading Tcl/Tk interface ... done
    > x<-function() return(1)
    > mtrace(x)
    > x()
    Error in attr(value, "row.names") <- rlabs :
            row names must be 'character' or 'integer', not 'double'
    > mtrace(x,FALSE)
    > x()
    [1] 1
    > mtrace(x)
    > x()
    Error in attr(value, "row.names") <- rlabs :
            row names must be 'character' or 'integer', not 'double'

This happened with every function that I tried to debug.

I use R 2.4.0 on Linux and on Windows. The debug version is 1.1.0 and the
mvbutils version is 1.1.1 on both systems. Linux R was compiled from
sources. The packages were installed from the Internet repositories using
the install.packages function on both systems. update.packages only asks
for a repository for the current session and does nothing, which means that
everything is at the latest version.

What could be wrong?

---
Best regards,
Vladimir
Re: [R] tcltk crashes with bad color with text widget
On 10/17/2006 8:36 AM, Peter Dalgaard wrote:
> Duncan Murdoch writes:
>
>> On 10/16/2006 10:47 PM, Alex Couture-Beil wrote:
>>> Hello
>>>
>>> I have been playing with tcl/tk in R 2.4.0 on Windows XP and have managed to crash R by supplying tcl/tk with an incorrect color. Is this a bug? Is there a way for me to test the color to see if it is a valid tcl/tk color, to avoid this?
>>>
>>> > tt=tktoplevel()
>>> > tklabel(parent=tt, text="hello world", foreground="reed")
>>> Error in structure(.External("dotTclObjv", objv, PACKAGE = "tcltk"), class = "tclObj") :
>>>         [tcl] unknown color name "reed".
>>>
>>> An error is displayed as one would expect, however when I try
>>>
>>> > tktext(parent=tt, foreground="blaaack")
>>>
>>> R crashes, rather than displaying an error as tklabel did. This, however, does not happen on my FreeBSD machine, which displays an error similar to the one for tklabel and does not crash.
>>
>> I see the same crash in Windows, occurring deep in one of the TCL routines, where it tries to work with a font, but the font has not been assigned.
>>
>> TK on Windows uses a different display driver than FreeBSD does, so this could be a TK bug, rather than an R bug, and it does look like that. Alternatively, we might be ignoring an error generated in TK, in which case it is our bug: but the tklabel example makes that sound wrong.
>>
>> To verify, it would be nice to try the same commands in wish (or some other TCL/TK platform). Do you know the pure TCL equivalent?
>
> Should be close to this
>
> toplevel .1
> label .1.1 -text "hello world" -foreground reed
> text .1.2 -foreground blaack
>
> (and it doesn't crash on my machine, in R or wish)

I only see the crash in R, not in wish. It happens when, in the midst of destroying the partially created text widget, TK needs to create the Windows window corresponding to it. Windows sends some messages to the newly created window (including WM_NCCREATE), these lead to TK servicing idle events, and one of those involves looking at the incomplete text object, and that's when things crash.

I don't know why wish can handle the error properly; maybe it just got lucky, or maybe it handles the error differently. I think I'm going to have to give up on this one.

Duncan Murdoch
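On the original question of testing a color before handing it to a widget: one possibility, sketched here under assumptions, is to ask Tk to resolve the name with the Tcl command `winfo rgb` and trap the error in R. The helper name `is_valid_tk_color` is hypothetical (it is not part of tcltk), and this has not been checked against the Windows crash described above; it only avoids creating a widget with the unverified color.

```r
# Hypothetical helper (not part of tcltk): returns TRUE if Tk can resolve
# the color name, FALSE if the Tcl call raises an error such as
# "unknown color name". "." is the Tk root window.
is_valid_tk_color <- function(color, window = ".") {
  tryCatch({
    tcltk::tcl("winfo", "rgb", window, color)
    TRUE
  }, error = function(e) FALSE)
}

# Usage, guarded because it needs a working Tk display.
if (interactive() && capabilities("tcltk")) {
  print(is_valid_tk_color("red"))
  print(is_valid_tk_color("reed"))
}
```

A caller could then run this check first and only call tktext() when it returns TRUE.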
Re: [R] Convert Contingency Table to Flat File
Just a quick update on this thread.

The version of expand.dft() that I posted earlier has a bug in it. This is the result of the use of lapply() and the evaluation of the additional arguments passed to type.convert(). I noted this when testing the function on the UCBAdmissions data set, which is a multi-way table used in some help file examples such as ?as.data.frame.table.

Here is a corrected version:

    expand.dft <- function(x, na.strings = "NA", as.is = FALSE, dec = ".")
    {
      # Repeat each row of x according to its Freq count
      DF <- sapply(1:nrow(x), function(i) x[rep(i, each = x$Freq[i]), ],
                   simplify = FALSE)

      # Stack the pieces and drop the Freq column
      DF <- subset(do.call("rbind", DF), select = -Freq)

      # Re-convert each column, passing the arguments on to type.convert()
      for (i in 1:ncol(DF))
      {
        DF[[i]] <- type.convert(as.character(DF[[i]]),
                                na.strings = na.strings,
                                as.is = as.is, dec = dec)
      }

      DF
    }

Thus if we now take the UCBAdmissions multi-way table data and convert it to a flat contingency table:

    > FCT <- as.data.frame(UCBAdmissions)
    > FCT
          Admit Gender Dept Freq
    1  Admitted   Male    A  512
    2  Rejected   Male    A  313
    3  Admitted Female    A   89
    4  Rejected Female    A   19
    5  Admitted   Male    B  353
    6  Rejected   Male    B  207
    7  Admitted Female    B   17
    8  Rejected Female    B    8
    9  Admitted   Male    C  120
    10 Rejected   Male    C  205
    11 Admitted Female    C  202
    12 Rejected Female    C  391
    13 Admitted   Male    D  138
    14 Rejected   Male    D  279
    15 Admitted Female    D  131
    16 Rejected Female    D  244
    17 Admitted   Male    E   53
    18 Rejected   Male    E  138
    19 Admitted Female    E   94
    20 Rejected Female    E  299
    21 Admitted   Male    F   22
    22 Rejected   Male    F  351
    23 Admitted Female    F   24
    24 Rejected Female    F  317

Thus, there should be:

    > sum(FCT$Freq)
    [1] 4526

rows in the final 'raw' data frame.

    > DF <- expand.dft(FCT)
    > str(DF)
    'data.frame':   4526 obs. of  3 variables:
     $ Admit : Factor w/ 2 levels "Admitted","Rejected": 1 1 1 1 1 1 1 1 1 1 ...
     $ Gender: Factor w/ 2 levels "Female","Male": 2 2 2 2 2 2 2 2 2 2 ...
     $ Dept  : Factor w/ 6 levels "A","B","C","D",..: 1 1 1 1 1 1 1 1 1 1 ...

Note that the three columns are coerced back to factors, which is of course the default behavior for data frames.

If we now use:

    > DF <- expand.dft(FCT, as.is = TRUE)
    > str(DF)
    'data.frame':   4526 obs. of  3 variables:
     $ Admit : chr  "Admitted" "Admitted" "Admitted" "Admitted" ...
     $ Gender: chr  "Male" "Male" "Male" "Male" ...
     $ Dept  : chr  "A" "A" "A" "A" ...

The three columns stay as character vectors. It was this behavior that did not work properly in the first version.

HTH,

Marc Schwartz
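For comparison, the row expansion itself can also be written as a single indexing step. This is only a sketch, not the function posted in this thread: the name `expand_rows` is hypothetical, and it skips the type.convert() pass, so columns keep whatever type they have in the input.

```r
# Hypothetical alternative to the sapply()/rbind approach: repeat each row
# index according to its frequency, then index the data frame once.
expand_rows <- function(x, freq = "Freq") {
  idx <- rep(seq_len(nrow(x)), times = x[[freq]])
  out <- x[idx, setdiff(names(x), freq), drop = FALSE]
  rownames(out) <- NULL  # drop the duplicated row labels
  out
}

FCT <- as.data.frame(UCBAdmissions)
DF2 <- expand_rows(FCT)

# Cross-tabulating the expanded rows should give back the original counts.
round_trip_ok <- all(table(DF2) == UCBAdmissions)
```

Because the factor levels are carried over unchanged, table(DF2) has the same layout as UCBAdmissions, which makes the round-trip check straightforward.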
[R] Multiple histograms in one plot
Hi all,

I'm trying to plot multiple histograms in one plot (cross-validation values of model parameters), but I cannot seem to reduce the margins enough to fit as many of them in as I would like. I'm using split.screen to divide the window into a 5x4 grid, then plotting with hist. I've tried explicitly reducing the margins with par(mar=c(1,1,1,1)), but it doesn't seem to have any effect. Visually, there is a lot of whitespace and very little histogram in my results.

Can anyone suggest either a better method to visualize these results, a better way to plot histograms, or a way to actually reduce the margins used? The intent is to give a sense of how well-constrained the various model parameters are.

Thanks,
Johann
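One possible approach, sketched under assumptions since the original plotting code was not posted: set the grid and the margins in the same par() call with mfrow instead of split.screen, so the margin setting applies to every panel. The data here are simulated stand-ins for the cross-validation values, and the parameter names are made up.

```r
set.seed(1)

# Simulated stand-in for cross-validation values: 20 parameters, 200 draws each.
cv_values <- replicate(20, rnorm(200), simplify = FALSE)
names(cv_values) <- paste0("param", seq_along(cv_values))

# Draw to a temporary PDF so the example runs without a screen device.
out_file <- tempfile(fileext = ".pdf")
pdf(out_file, width = 8, height = 10)

# mfrow gives the 5x4 grid; mar shrinks the margins (bottom, left, top, right)
# and mgp pulls the axis labels closer to the axes.
op <- par(mfrow = c(5, 4), mar = c(2, 2, 1.5, 0.5),
          mgp = c(1.2, 0.4, 0), cex.main = 0.9)
for (nm in names(cv_values)) {
  hist(cv_values[[nm]], main = nm, xlab = "", ylab = "",
       col = "grey80", border = "white")
}
par(op)
dev.off()
```

If split.screen is kept, it may be worth setting par(mar=...) after each screen() call rather than once beforehand, on the assumption that the margin setting is stored per screen.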
Re: [R] crush in edit()
The crash is triggered by the stack smashing protector. The calls to wcsrtombs() pass sizeof(wcs), the size of the wide-character source buffer, as the output length instead of the size of the destination buffer s.

--- src/modules/X11/dataentry.c.orig    2006-09-04 23:41:34.0 +0900
+++ src/modules/X11/dataentry.c 2006-10-18 11:31:43.0 +0900
@@ -1046,7 +1046,7 @@
 for(j=0;*(wcspc+j)!=L'\0';j++)wcs[j]=*(wcspc+j);
 wcs[j]=L'\0';
 w_p=wcs;
-cnt=wcsrtombs(s,(const wchar_t **)w_p,sizeof(wcs),NULL);
+cnt=wcsrtombs(s,(const wchar_t **)w_p,sizeof(s)-1,NULL);
 s[cnt]='\0';
 if (textwidth(s, strlen(s)) (bw - text_offset)) break;
 *(++wcspc) = L'';
@@ -1056,7 +1056,7 @@
 for(j=0;*(wcspc+j)!=L'\0';j++)wcs[j]=*(wcspc+j);
 wcs[j]=L'\0';
 w_p=wcs;
-cnt=wcsrtombs(s,(const wchar_t **)w_p,sizeof(wcs),NULL);
+cnt=wcsrtombs(s,(const wchar_t **)w_p,sizeof(s)-1,NULL);
 s[cnt]='\0';
 if (textwidth(s, strlen(s)) (bw - text_offset)) break;
 *(wcspbuf + i - 2) = L'';
@@ -1066,7 +1066,7 @@
 for(j=0;*(wcspc+j)!=L'\0';j++) wcs[j]=*(wcspc+j);
 wcs[j]=L'\0';
 w_p=wcs;
-cnt=wcsrtombs(s,(const wchar_t **)w_p,sizeof(wcs),NULL);
+cnt=wcsrtombs(s,(const wchar_t **)w_p,sizeof(s)-1,NULL);
 drawtext(x_pos + text_offset, y_pos + box_h - text_offset, s, cnt);
@@ -2398,6 +2398,7 @@
 int cnt;
 char last_mbs[8];
 char *mbs;
+size_t bytes;
 mbs = (str == NULL) ? buf : str;
@@ -2411,8 +2412,8 @@
 if(wcs[0] == L'\0') return 0;
 memset(last_mbs, 0, sizeof(last_mbs));
-wcrtomb(last_mbs, wcs[cnt-1], mb_st);
-return(strlen(last_mbs));
+bytes=wcrtomb(last_mbs, wcs[cnt-1], mb_st); /* -Wall */
+return(bytes);
 #else
 return(1);
 #endif

2006/10/18, crazybuddy Vincent:
> Dear all,
>
> I am new to the R system. When I tried to edit data read from a csv file, R crashed, and I got an error message as follows:
>
> > edit(data)
> *** buffer overflow detected ***: /usr/lib/R/bin/exec/R terminated
> === Backtrace: =
> /lib/libc.so.6(__chk_fail+0x41)[0x49d020b1]
> /lib/libc.so.6[0x49d034a2]
> /usr/lib/R/modules//R_X11.so[0x33ed7a]
> /usr/lib/R/modules//R_X11.so[0x34050d]
> /usr/lib/R/modules//R_X11.so[0x341858]
> /usr/lib/R/modules//R_X11.so(RX11_dataentry+0xa25)[0x342f45]
> /usr/lib/R/lib/libR.so[0xa34675]
> /usr/lib/R/lib/libR.so[0x954ed6]
> /usr/lib/R/lib/libR.so(Rf_eval+0x483)[0x925b23]
> /usr/lib/R/lib/libR.so[0x929ed8]
> /usr/lib/R/lib/libR.so(Rf_eval+0x483)[0x925b23]
> /usr/lib/R/lib/libR.so[0x926a37]
> /usr/lib/R/lib/libR.so(Rf_eval+0x483)[0x925b23]
> /usr/lib/R/lib/libR.so(Rf_applyClosure+0x2a7)[0x928117]
> /usr/lib/R/lib/libR.so[0x95661f]
> /usr/lib/R/lib/libR.so(Rf_usemethod+0x609)[0x957a89]
> /usr/lib/R/lib/libR.so[0x95825e]
> /usr/lib/R/lib/libR.so(Rf_eval+0x483)[0x925b23]
> /usr/lib/R/lib/libR.so(Rf_applyClosure+0x2a7)[0x928117]
> /usr/lib/R/lib/libR.so(Rf_eval+0x2f4)[0x925994]
> /usr/lib/R/lib/libR.so(Rf_ReplIteration+0x311)[0x945361]
> /usr/lib/R/lib/libR.so[0x945571]
> /usr/lib/R/lib/libR.so(run_Rmainloop+0x60)[0x9458c0]
> /usr/lib/R/lib/libR.so(Rf_mainloop+0x1c)[0x9458ec]
> /usr/lib/R/bin/exec/R(main+0x46)[0x80486f6]
> /lib/libc.so.6(__libc_start_main+0xdc)[0x49c3b4e4]
> /usr/lib/R/bin/exec/R[0x80485f1]
> === Memory map:
> 00111000-0012f000 r-xp fd:00 16943095 /usr/lib/R/library/grDevices/libs/grDevices.so
> 0012f000-0013 rwxp 0001d000 fd:00 16943095 /usr/lib/R/library/grDevices/libs/grDevices.so
> 0013-00181000 r-xp fd:00 16976568 /usr/lib/R/library/stats/libs/stats.so
> 00181000-00183000 rwxp 00051000 fd:00 16976568 /usr/lib/R/library/stats/libs/stats.so
> 00339000-00352000 r-xp fd:00 15959326 /usr/lib/R/modules/R_X11.so
> 00352000-00353000 rwxp 00018000 fd:00 15959326 /usr/lib/R/modules/R_X11.so
> 00353000-0035f000 rwxp 00353000 00:00 0
> 0048-00496000 r-xp fd:00 15303387 /usr/lib/gconv/SJIS.so
> 00496000-00498000 rwxp 00015000 fd:00 15303387 /usr/lib/gconv/SJIS.so
> 0056e000-00598000 r-xp fd:00 16452204 /usr/lib/R/lib/libRblas.so
> 00598000-00599000 rwxp 00029000 fd:00 16452204 /usr/lib/R/lib/libRblas.so
> 00848000-00851000 r-xp fd:00 15204401 /lib/libnss_files-2.4.so
> 00851000-00852000 r-xp 8000 fd:00 15204401 /lib/libnss_files-2.4.so
> 00852000-00853000 rwxp 9000 fd:00 15204401 /lib/libnss_files-2.4.so
> 00885000-00abd000 r-xp fd:00 16452203 /usr/lib/R/lib/libR.so
> 00abd000-00aca000 rwxp 00238000 fd:00 16452203 /usr/lib/R/lib/libR.so
> 00aca000-00b61000 rwxp 00aca000 00:00 0
> 00c47000-00c4d000 r-xp fd:00 16944203 /usr/lib/R/library/methods/libs/methods.so
> 00c4d000-00c4e000 rwxp 5000 fd:00 16944203 /usr/lib/R/library/methods/libs/methods.so
> 00eb6000-00f31000 r-xp fd:00 15242987 /usr/lib/libgfortran.so.1.0.0
> 00f31000-00f32000 rwxp 0007b000 fd:00 15242987 /usr/lib/libgfortran.so.1.0.0
> 00f44000-00f45000 r-xp fd:00 15303344 /usr/lib/gconv/ISO8859-1.so
> 00f45000-00f47000 rwxp fd:00 15303344
[R] MARS help?
I'm trying to use mars{mda} to model functions that look fairly close to a sequence of straight line segments. Unfortunately, 'mars' seems to totally miss the obvious places for the knots in the apparent first order spline model, and I wonder if someone can suggest a better way to do this. The following example consists of a slight downward trend followed by a jump up after t1=4, followed by a more marked downward trend after t1=5:

    Dat0 <- cbind(t1=1:10, x=c(1, 0, 0, 90, 99, 95, 90, 87, 80, 77))
    library(mda)
    fit0 <- mars(Dat0[, 1, drop=FALSE], Dat0[, 2], penalty=.001)
    plot(Dat0, type="l")
    lines(Dat0[, 1], fit0$fitted.values, lty=2, col="red")

Are there 'mars' options I'm missing or other software I should be using? I've got thousands of traces crudely like this of different lengths, and I want an automated way of summarizing similar traces in terms of a fixed number of knots and associated slopes for each linear spline segment max(0, t1-t.knot).

Thanks,
Spencer Graves
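To make the target model concrete, the linear spline basis max(0, t1-t.knot) can be fitted directly with lm() once the knots are fixed. This is only a sketch of the model form, not a replacement for automatic knot selection: the knot locations (3, 4, 5) are assumptions picked by eye for this one trace, and `hinge` is a hypothetical helper name.

```r
Dat0 <- data.frame(t1 = 1:10, x = c(1, 0, 0, 90, 99, 95, 90, 87, 80, 77))

# Hinge basis function max(0, t - knot) for a single knot.
hinge <- function(t, knot) pmax(0, t - knot)

# Assumed knot locations, chosen by eye for this example only.
knots <- c(3, 4, 5)

# Add one hinge column per knot (h3, h4, h5) to the data frame.
for (k in knots) Dat0[[paste0("h", k)]] <- hinge(Dat0$t1, k)

# Ordinary least squares on t1 plus the hinge columns.
fit_hinge <- lm(x ~ ., data = Dat0)

# Each hinge coefficient is a change in slope at its knot, so the cumulative
# sum gives the slope of each successive linear segment.
slopes <- cumsum(coef(fit_hinge)[-1])
```

With the knots fixed, each trace is summarized by `knots` and `slopes`; choosing the knots automatically for thousands of traces is the part this sketch leaves open.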
[R] sqlSave, fast=F option, bug?
Hi,

Using the fast=F option, sqlSave saves without matching column names. It looks like a bug to me. Here's a simple (artificial) example.

Create a dataframe and save it to a database table "test" as follows:

    df <- data.frame(T=1, S=10)
    sqlSave(channel, df, "test", rownames=F)

The table now looks like

    T   S
    1  10

If I create another dataframe and save as follows

    df <- data.frame(S=20, T=2)
    sqlSave(channel, df, "test", rownames=F, append=T)

then table "test" now looks like

    T   S
    1  10
    2  20

The important point is that although S was the first column of df, sqlSave checked the column names and matched the corresponding columns of df and table "test".

However, if I now create another dataframe and save it using the fast=F option as follows

    df <- data.frame(S=30, T=3)
    sqlSave(channel, df, "test", rownames=F, append=T, fast=F)

the table "test" now looks like

     T   S
     1  10
     2  20
    30   3

In other words, sqlSave didn't check column names, it simply mapped column 1 to column 1 and column 2 to column 2.

This cannot be right. Opinions?

I'm using R 2.3.0 and package RODBC 1.1-7 on Windows XP with MS SQL Server.
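Until the behavior is clarified, one way to sidestep the problem is to put the data frame columns into the table's order before saving. The sketch below shows only the reordering step in base R; `align_columns` is a hypothetical helper, and the target column order is typed in by hand here rather than read from the database.

```r
# Hypothetical helper: reorder (and check) data frame columns so that they
# match the column order of the target table.
align_columns <- function(df, target_names) {
  missing_cols <- setdiff(target_names, names(df))
  if (length(missing_cols) > 0) {
    stop("columns missing from data frame: ",
         paste(missing_cols, collapse = ", "))
  }
  df[, target_names, drop = FALSE]
}

# The table "test" in the example above has columns T, S in that order.
df <- data.frame(S = 30, T = 3)
aligned <- align_columns(df, c("T", "S"))

# 'aligned' would then be passed to sqlSave() in place of 'df'.
```

With the columns already in table order, a positional mapping and a name-based mapping give the same result, so the fast=F call should no longer swap the values.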
Re: [R] Re : Book recommendation for newbie to stats and R?
Hi Kevin,

justin bem <justin_bem at yahoo.fr> writes:
> Exact reference is: Wonnacott, T., Wonnacott, R., Introductory Statistics for Business and Economics, New York, 1990

Though not about R, a good book to read for analyzing non-experimental data (and even experimental data) is "Identification Problems in the Social Sciences" by Charles Manski. It is a small, clearly written book, with examples. Working out a reasonable answer (including caveats) to the kind of typical problem you described in your initial post will benefit from it. You should at least consider it an important supplement. See the link below.

Anupam.

http://www.hup.harvard.edu/catalog/MANIDE.html
Re: [R] Review process for new packages
Hello,

Duncan Murdoch <murdoch at stats.uwo.ca> writes:
> On 10/17/2006 2:22 AM, Andreas Wittmann wrote:
>> Hi all,
>>
>> I'm currently working on a creditmetrics package which includes functions for computing the credit risk model creditmetrics. I guess it will be finished in a few days. My question now is, does there exist some review process before sending it to CRAN, or is it reviewed after having sent it?
>
> There's no review process to decide whether your package is useful or well-written. If you want that kind of review you should submit it to the Journal of Statistical Software.

Although this is a sensitive issue, it is unfortunate that such a review (or comment, if that is a more suitable word) process is not available for R. Is it possible to have some process where people can provide comments, even if it is not a journal review? It could help in improving the quality of packages submitted to R, in reducing bugs, or simply in catching errors (coding and non-coding) that the author may have overlooked. Would contributing something to R on a provisional basis, then asking for comments, and then submitting a final version work?

It may also help to require the author to include a mathematical description of what has been submitted, if it is a statistical function. This is because most new users find it difficult to read R code at the level of functions. They may also not be familiar with the statistical concept under the name used, but may know it mathematically, because different disciplines have developed their own specialized terminology (with some variation) as discipline-specific statistical applications have evolved. I think this would make R more accessible to a wider user base.

Anupam.
Re: [R] strange error in mtrace
I vaguely remember seeing a bug report for the debug library a few days ago but I can't find the thread. I think it was a compatibility issue with 2.4.0 but the maintainer was already working on it.

Maybe somebody else can provide a link to the specific posting?

I am sorry I can't be of more assistance.

Best regards,

Francisco

Dr. Francisco J. Zagmutt
College of Veterinary Medicine and Biomedical Sciences
Colorado State University

> From: Vladimir Eremeev
> Reply-To: Vladimir Eremeev
> To: r-help@stat.math.ethz.ch
> Subject: [R] strange error in mtrace
> Date: Tue, 17 Oct 2006 18:33:45 -0800
>
> Dear useRs,
>
> I am experiencing a very strange error with Mark Bravington's package debug. I haven't seen it before.
>
> Here is the sample session:
>
> > library(debug)
> Loading required package: mvbutils
> MVBUTILS: no tasks vector found in ROOT
> Loading required package: tcltk
> Loading Tcl/Tk interface ... done
> > x<-function() return(1)
> > mtrace(x)
> > x()
> Error in attr(value, "row.names") <- rlabs :
>         row names must be 'character' or 'integer', not 'double'
> > mtrace(x,FALSE)
> > x()
> [1] 1
> > mtrace(x)
> > x()
> Error in attr(value, "row.names") <- rlabs :
>         row names must be 'character' or 'integer', not 'double'
>
> This happened with every function I tried to debug.
>
> I use R 2.4.0 on Linux and on Windows. The debug version is 1.1.0 and the mvbutils version is 1.1.1 on both systems. Linux R was compiled from sources. The packages were installed from the Internet repositories using the install.packages function on both systems.
>
> update.packages only asks for a repository for the current session and then does nothing, which means that everything is at the latest version.
>
> What could be wrong?
>
> ---
> Best regards,
> Vladimir