Re: [R] information
On 10/16/2009 05:08 AM, rolm...@apolo.acatlan.unam.mx wrote: Hi!!, i'm Leonardo Olmedo, i'm student of master in Applied Mathematics in UAM and profesor in UNAM from Mexico. My thesis is a Proposal to test a central composite null hypothesis and alternative hypotheses bilateral in normal distribution. I did a program in R for obtain the significance level from test. I have a doubt: How can i do let my program be part of R and/or how i should have to write for this? Hi Leonardo, I think you are asking if you can contribute the function(s) you have written. This can be done by creating a package containing the function(s), help pages and optionally data sets to show how the function(s) work. Complete instructions are in the file Writing R Extensions (R-exts.html) that is part of the R distribution. Start the HTML help system and have a look. If you get stuck, let us know. Jim Si es mas facil a conocer in espanol, lee R-exts.html en su navegador HTML. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] calculating p-values by row for data frames
-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Christoph Heuck Sent: Donnerstag, 15. Oktober 2009 17:51 To: r-help@r-project.org Subject: [R] calculating p-values by row for data frames Hello R-users, I am looking for an elegant way to calculate p-values for each row of a data frame. My situation is as follows: I have a gene expression results from a microarray with 64 samples looking at 25626 genes. The results are in a data frame with the dimensions 64 by 25626 I want to create a volcano plot of difference of means vs. -log(10) of the p-values, comparing normal samples to abnormal samples. The results of both type of samples are all in my data frame. Now, I have found a way to calculate the p-value using a for (i in 1:25626) loop (see below): df.normal #dataframe, which only contains the normal samples df.samples #dataframe, which only contains abnormal samples DM=rowMeans(df.normal)-rowMeans(df.samples) #gives me a dataframe with the difference of means PV=array(1,c(25626,1)) for (i in 1:25626){ VL=t.test(matrix.b[i,],matrix.a[i,]) V=as.numeric(VL[3]) V=-log10(V) PV[i,1]=V} plot(DM, PV, main=title,xlab=x.lab, ylab=-log(10) P-Values,pch=20)} It takes around 3-5 minutes to generate the volcano plot this way. I will be running arrays which will look at 2.2 million sites this approach will then take way too long. I was wondering if there is a more elegant way to calculate the p-values for an array/fataframe/matrix in a row-by row fashion, which is similar to rowMeans. I thought writing a function to get the p-value and then using apply(x,1,function) would be the best. I have the function which will give me the p-value p.value = function (x,y){ PV=as.numeric(t.test(x,y)[3]) } and I can get a result if I test it only on one row (below is 6 by 10 data frame example of my original data) RRR X259863X267862 X267906X300875 X300877 X300878 MSPI0406S0183 -3.2257205 -3.2248899 2.85590082 -2.6293602 -3.5054348 -2.62817269 MSPI0406S0238 -2.6661903 -3.1135020 2.17073881 -3.2357307 -2.3309775 -1.76078452 MSPI0406S0239 -1.7636439 -0.6702877 0.19471126 -0.7397132 -1.4332662 -0.24822470 MSPI0406S0300 0.6471381 -0.2638928 -0.61876054 -0.9180127 0.2539848 -0.63122203 MSPI0406S0301 0.9207208 0.2164267 -0.33238846 -1.1450717 -0.2935584 -1.01659802 MSPI0406S0321 -0.4073272 -0.2852402 -0.08085746 -0.4109428 -0.2185432 -0.39736137 MSPI0406S0352 -0.7074175 -0.6987548 -1.22004647 -0.8570551 -0.5083861 -0.09267928 MSPI0406S0353 -0.2745682 0.3012990 -0.64787221 -0.5654195 0.4265007 -0.65963404 MSPI0406S0354 -1.1858394 -1.4388609 -0.07329722 -2.0010785 -1.3245696 -1.43216984 MSPI0406S0360 -1.4599809 -1.4929059 0.63453235 -1.1476760 -1.5849922 -1.03187399 zz=p.value(RRR[1,1:3],RRR[1,4:6]) zz $p.value [1] 0.485727 but I cannot do this row by row using apply xxx=apply(RRR,1,p.value(RRR[,1:3],RRR[,4:6])) xxx - apply(RRR, 1, function(x) p.value(x[1:3],x[4:6])) works for me. Check the examples in ?apply. HTH, Michael Error in match.fun(FUN) : 'p.value(RRR[, 1:3], RRR[, 4:6])' is not a function, character or symbol Does anyone have any suggestions? Thanks in advance Christoph Heuck Albert Einstein College of Medicine __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problems with rJava and tm packages
On 10/15/2009 08:12 PM, PDXRugger wrote: I am looking to do some text analysis using R and have run into some issues with some of the packages. Im not sure if its my goofy Vista OS or what but using R 2.8.1 i s relatively successful loading the text but the rJava package was messed up somehow: library(tm) library(rJava) Error in if (!nchar(javahome)) stop(JAVA_HOME is not set and could not be determined from the registry) : argument is of length zero On windows rJava tries to guess where java lives at runtime by poking into the registry. It is possible that the registry has changed for vista ... I don't know What you can do is tell rJava where java lives yourself by setting the JAVA_HOME environment variable. ... Now the fact that you see argument is of length zero is a bug because the test should be : if( is.null(javahome) || !length(javahome) || nchar(javahome ) ) ..; but fixing this bug won't fix your problem. In addition: Warning message: package 'rJava' was built under R version 2.9.1 Error : .onLoad failed in 'loadNamespace' for 'rJava' Error: package/namespace load failed for 'rJava' You need to either install an older version of rJava or a more recent version of R. #Set documents directory DIR- G:/TextSearch/Speeches #Load corpus speech- Corpus(DirSource(DIR), readerControl = list(reader = readPlain, + language = en_US, load = TRUE)) #Remove stopwords speech- tmMap(speech, stripWhitespace) speech A corpus with 2 text documents tdm-TermDocumentMatrix(speech) Error in if (!nchar(javahome)) stop(JAVA_HOME is not set and could not be determined from the registry) : argument is of length zero Error: .onLoad failed in 'loadNamespace' for 'rJava' So the initial question is whats going on with the rJava package? I get the same error when i try and load the package and then when i try and utilize a function from the package. I tried installing 2.9.2 and ran into more problems when running the lines: utils:::menuInstallPkgs() Quit R, Start R again without loading the tm package, and then try again Warning: package 'tm' is in use and will not be installed speech- tmMap(speech, stripWhitespace) Error: could not find function tmMap the package is installed correctly but its not able to pick it up in this version of R. Again, im not sure if its somehting with Vista or what. Thanks guys and gals Cheers, JR -- Romain Francois Professional R Enthusiast +33(0) 6 28 91 30 30 http://romainfrancois.blog.free.fr |- http://tr.im/BcPw : celebrating R commit #5 |- http://tr.im/ztCu : RGG #158:161: examples of package IDPmisc `- http://tr.im/yw8E : New R package : sos __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] metaMDS NMDS: use of alternative distances?
Kim Vanselow Vanselow at gmx.de writes: Dear r-helpers! How can I integrate other distances (in the form of a dist object) into function metaMDS? The problem: metaMDS needs the original data.frame for the calculation and only the default distances of function vegdist are allowed. Any suggestions are greatly appreciated! Thank you, Kim Kim, A small addition to Gavin's reply: the option of having your own dissimilarities was added in the latest release version (1.15-4). If you do not have that in your version of vegan, you should upgrade. Cheers, Jari Oksanen __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Converting dataframe to matrix
Hi, I'm experimenting with a few learners that require a matrix as their input. (Currently svmpath, vbmp, etc.) I currently have a dataframe with 50 columns and 20,000 rows. I tried using: x - as.matrix(my_data.frame) If I then as, is.matrix(x), I get TRUE. However everywhere I've tried to use the matrix returns errors. Is there a step I'm missing? Thanks! -N __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Discriminant plot
Hello Alejo, Please, keep sending your post to the R-help mailing list in order other people can also answer. The type of lda_analysis is lda and that is normal and it also is perfectly normal to find a different type for predict(lda_analysis)$x. Moreover the example of the lda() function about iris gives me the exact same types for the object z (of the example) and for predict(z). When you plot lda_analysis, you use the function plot.lda whereas you use the function plot for the predict object. As I told you in my previous e-mail the predicted class are not the class of X$G3 so it is normal if the two plots are not exactly the same. which(predict(lda_analysis)$class != X$G3) gives you all the observations that are predicted in a different category from X$G3. Look at this points and you can see they are the only different points from the two plots (the coordinates are the same). Alain Alejo C.S. wrote: Hi Alain, I thought (in the worng way I see) that the predict function applied to an object of class lda returned the coordinates of the discriminant axes. When doing the same to iris data, the original classes are the same than those returned by predict. Is not the case with my data, if you compare the original classes with those returned by predict(), the are different. I'm really confused now... Regards, Alejo 2009/10/15, Alain Guillet alain.guil...@uclouvain.be mailto:alain.guil...@uclouvain.be: Hi Alejo, According to my knowledge the two plots are different because in the first one a point belongs to a group depending on its group in the data whereas in the second plot a point belongs to the group predicted by the linear discriminant analysis. I hope somebody will correct me if I am wrong. Alain Alejo C.S. wrote: Hi Alain, this is the code: library(MASS) library(mda) #data attached, first column G3 group membership X - read.table(data, header=T) lda_analysis - lda(formula(X), data=X) plot(lda_analysis, col=palette()[X$G3]) #the above plot is completely different to: plot(predict(lda_analysis)$x, type=n) text(predict(lda_analysis)$x, labels=predict(lda_analysis)$class, col=palette()[predict(lda_analysis)$class]) The above code only reproduce the first plot using predict to obtain coordinates and classes for the first tow discriminant axis. Thanks , Alejo -- Alain Guillet Statistician and Computer Scientist SMCS - Institut de statistique - Université catholique de Louvain Bureau c.316 Voie du Roman Pays, 20 B-1348 Louvain-la-Neuve Belgium tel: +32 10 47 30 50 -- Alain Guillet Statistician and Computer Scientist SMCS - Institut de statistique - Université catholique de Louvain Bureau c.316 Voie du Roman Pays, 20 B-1348 Louvain-la-Neuve Belgium tel: +32 10 47 30 50 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] reference on fisher.test()
Peng Yu wrote: On Thu, Oct 15, 2009 at 4:19 PM, RICHARD M. HEIBERGER r...@temple.edu wrote: On Thu, Oct 15, 2009 at 4:56 PM, Peng Yu pengyu...@gmail.com wrote: Can somebody point me a book on Fisher's exact test? I looked a few webpages. But the descriptions on the webpages are not very complete. Is there a book on that covers all the aspect of Fisher's exact test that is implemented in R? Section 15.2 of my book (Statistical Analysis and Data Display, with Burt Holland and published by Springer) shows a detailed example. It doesn't mention odd ratio. The general idea of basing the inference on the noncentral hypergeometric distribution is something I have first seen in BreslowDay's famous 1980 book on case-control studies, including the fact that the conditional MLE differs from the ordinary OR. (I'm sure there's an earlier reference, but I happened to be a grad student when that book came out...) The rest of what R does is carbon copied from similar procedures for the binomial distribution. I wouldn't know what kind of book to look for for that sort of minutiae. Alan Agresti is a possible source. -- O__ Peter Dalgaard Øster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - (p.dalga...@biostat.ku.dk) FAX: (+45) 35327907 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] reference on fisher.test()
Hi fexact.c points you to the original ACM paper: /* ALGORITHM 643, COLLECTED ALGORITHMS FROM ACM. THIS WORK PUBLISHED IN TRANSACTIONS ON MATHEMATICAL SOFTWARE, VOL. 19, NO. 4, DECEMBER, 1993, PP. 484-488. - You may find the discussion in the vignette(fishervig) in the aylmer package helpful. HTH Robin Peter Dalgaard wrote: Peng Yu wrote: On Thu, Oct 15, 2009 at 4:19 PM, RICHARD M. HEIBERGER r...@temple.edu wrote: On Thu, Oct 15, 2009 at 4:56 PM, Peng Yu pengyu...@gmail.com wrote: Can somebody point me a book on Fisher's exact test? I looked a few webpages. But the descriptions on the webpages are not very complete. Is there a book on that covers all the aspect of Fisher's exact test that is implemented in R? Section 15.2 of my book (Statistical Analysis and Data Display, with Burt Holland and published by Springer) shows a detailed example. It doesn't mention odd ratio. The general idea of basing the inference on the noncentral hypergeometric distribution is something I have first seen in BreslowDay's famous 1980 book on case-control studies, including the fact that the conditional MLE differs from the ordinary OR. (I'm sure there's an earlier reference, but I happened to be a grad student when that book came out...) The rest of what R does is carbon copied from similar procedures for the binomial distribution. I wouldn't know what kind of book to look for for that sort of minutiae. Alan Agresti is a possible source. -- Robin K. S. Hankin Uncertainty Analyst University of Cambridge 19 Silver Street Cambridge CB3 9EP 01223-764877 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Converting dataframe to matrix
On Fri, Oct 16, 2009 at 01:33:14AM -0700, Noah Silverman wrote: Hi, I'm experimenting with a few learners that require a matrix as their input. (Currently svmpath, vbmp, etc.) I currently have a dataframe with 50 columns and 20,000 rows. I tried using: x - as.matrix(my_data.frame) If I then as, is.matrix(x), I get TRUE. However everywhere I've tried to use the matrix returns errors. Without more information I can't even start to guess what is going wrong. Please give a short, reproducible example of what you did and what errors you encountered. as.matrix() should suffice for creating a matrix from a data.frame : foo - data.frame(1:4, 4:1, sqrt(1:4), log(4:1)) foo X1.4 X4.1 sqrt.1.4. log.4.1. 114 1.00 1.3862944 223 1.414214 1.0986123 332 1.732051 0.6931472 441 2.00 0.000 det(foo) Error in UseMethod(determinant) : no applicable method for determinant det(as.matrix(foo)) [1] -0.1092489 So probably your problem is somewhere else. cu Philipp -- Dr. Philipp Pagel Lehrstuhl für Genomorientierte Bioinformatik Technische Universität München Wissenschaftszentrum Weihenstephan Freising, Germany http://webclu.bio.wzw.tum.de/~pagel/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Converting dataframe to matrix
I think you may be correct. I've manage to get the data into a format that the function accepts. The error appears to be because I have negative values in my data: Error in apply(safeNormCDF(s), 1, prod) : dim(X) must have a positive length On 10/16/09 1:51 AM, Philipp Pagel wrote: On Fri, Oct 16, 2009 at 01:33:14AM -0700, Noah Silverman wrote: Hi, I'm experimenting with a few learners that require a matrix as their input. (Currently svmpath, vbmp, etc.) I currently have a dataframe with 50 columns and 20,000 rows. I tried using: x- as.matrix(my_data.frame) If I then as, is.matrix(x), I get TRUE. However everywhere I've tried to use the matrix returns errors. Without more information I can't even start to guess what is going wrong. Please give a short, reproducible example of what you did and what errors you encountered. as.matrix() should suffice for creating a matrix from a data.frame : foo- data.frame(1:4, 4:1, sqrt(1:4), log(4:1)) foo X1.4 X4.1 sqrt.1.4. log.4.1. 114 1.00 1.3862944 223 1.414214 1.0986123 332 1.732051 0.6931472 441 2.00 0.000 det(foo) Error in UseMethod(determinant) : no applicable method for determinant det(as.matrix(foo)) [1] -0.1092489 So probably your problem is somewhere else. cu Philipp __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Beveridge-Nelson decomposition
Dear all, Is there some function in R to carry out Beveridge-Nelson decomposition? Is there some function in R to carry out seasonal adjustments of time series? Is there some function in R to carry out x11 or x12 seasonal adjustments ? Thanks! Best regards, wang [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Converting dataframe to matrix
On Fri, Oct 16, 2009 at 01:55:03AM -0700, Noah Silverman wrote: I think you may be correct. I've manage to get the data into a format that the function accepts. The error appears to be because I have negative values in my data: Error in apply(safeNormCDF(s), 1, prod) : dim(X) must have a positive length Sounds like safeNormCDF() does not return a matrix but a vector. What does dim(safeNormCDF(s)) say? apply(1:9, 1, sum) Error in apply(1:9, 1, sum) : dim(X) must have a positive length apply(matrix(1:9, nrow=3), 1, sum) [1] 12 15 18 apply(matrix(1:9, nrow=1), 1, sum) [1] 45 cu Philipp -- Dr. Philipp Pagel Lehrstuhl für Genomorientierte Bioinformatik Technische Universität München Wissenschaftszentrum Weihenstephan Freising, Germany http://webclu.bio.wzw.tum.de/~pagel/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Help - Correlation between data.
hi everybody, I'm a student, and this is the first time I write on this forum! I'm looking for statistical help hoping somebody can answer me! This is my problem: I have 2 temporal series. The firstone is a series of mesured data (height of monitorated points), the second is a series of temperature (in Celsius degree). Using Matlab I have built the two graphs (Measured Data - Time Temperature - Time). Looking those graphs I can surely say that there is a clear correlation beetween theme, and also that the measured data are surely influenced by the variations of temperature. Unfortunately my statistical knowledges are not that large so using R seems quite difficult to me. My question is: is there a code already written the can compare the 2 temporal series and can find the correlation between the data??? And also: is there a code that can correct the Measured Data from the influence of temperature and return a clean data??? if you want I can send you the .txt file with data. Thanks Antonio Gonelli - University of Ferrara - Italy -- View this message in context: http://www.nabble.com/Help---Correlation-between-data.-tp25921464p25921464.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] two graphs 1 x-axis
Dear R-people I have a question concerning plotting graphs. Here an example dataset a-c(1,2,3,4,5,6) b-c(3,5,4,6,1,1) c-c(1,1,1,1,1,1) d-as.data.frame(cbind(a,b,c)) plot.new() plot(d$a, d$b, col=red) par(new=TRUE) plot(d$a,d$c, col=red, pch=|) What I would want is to plot de second plot under the first plot. So not in the the first plot. There is a way to divide your graph in 2 or 3 parts and use the same x-axis but I do not seem to get it right. Could somebody help me out? Thanks in advance! Regards, Naomi Disclaimer: De informatie opgenomen in dit bericht (en bijlagen) kan vertrouwelijk zijn en is uitsluitend bestemd voor de geadresseerde(n). Indien u dit bericht ten onrechte ontvangt, wordt u geacht de inhoud niet te gebruiken, de afzender direct te informeren en het bericht te vernietigen. Aan dit bericht kunnen geen rechten of plichten worden ontleend. -- Disclaimer: The information contained in this message may be confidential and is intended to be exclusively for the addressee. Should you receive this message unintentionally, you are expected not to use the contents herein, to notify the sender immediately and to destroy the message. No rights can be derived from this message. Please consider the environment before printing this email __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] two graphs 1 x-axis
On 10/16/2009 09:22 PM, Duijvesteijn, Naomi wrote: Dear R-people I have a question concerning plotting graphs. Here an example dataset a-c(1,2,3,4,5,6) b-c(3,5,4,6,1,1) c-c(1,1,1,1,1,1) d-as.data.frame(cbind(a,b,c)) plot.new() plot(d$a, d$b, col=red) par(new=TRUE) plot(d$a,d$c, col=red, pch=|) What I would want is to plot de second plot under the first plot. So not in the the first plot. There is a way to divide your graph in 2 or 3 parts and use the same x-axis but I do not seem to get it right. Could somebody help me out? Hi Naomi, Try this: par(mfrow=c(2,1)) plot(d$a,d$b,col=red) plot(d$a,d$c,col=red,pch=|) You may want to change the margins (par(mar=c(...)) to get the spacing the way you want it. Jim __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] two graphs 1 x-axis
On Fri, Oct 16, 2009 at 12:22:06PM +0200, Duijvesteijn, Naomi wrote: I have a question concerning plotting graphs. Here an example dataset a-c(1,2,3,4,5,6) b-c(3,5,4,6,1,1) c-c(1,1,1,1,1,1) d-as.data.frame(cbind(a,b,c)) plot.new() plot(d$a, d$b, col=red) par(new=TRUE) plot(d$a,d$c, col=red, pch=|) What I would want is to plot de second plot under the first plot. So not in the the first plot. There is a way to divide your graph in 2 or 3 parts and use the same x-axis but I do not seem to get it right. Could somebody help me out? Yes, use something alng these lines: par(mrfow=c(2,1)) plot(d$a, d$b, col=red) plot(d$a, d$c, col=red, pch=|) As both plots use the same data for X you are set. If you need to force two datasets with different x-ranges into the same range, you can use the xlim parameter to define the desired range. cu Philipp -- Dr. Philipp Pagel Lehrstuhl für Genomorientierte Bioinformatik Technische Universität München Wissenschaftszentrum Weihenstephan Freising, Germany http://webclu.bio.wzw.tum.de/~pagel/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] generalization of tabulate()
Using the generalized inner product defined in this post: https://www.stat.math.ethz.ch/pipermail/r-help/2006-July/109311.html try this: cbind(S, d = rowSums(inner(S, obs, identical))) On Fri, Oct 16, 2009 at 4:29 AM, Robin Hankin rk...@cam.ac.uk wrote: Hi I want a generalization of tabulate() which works on rows of a matrix. Suppose I have an integer matrix 'observation': observation y1 y2 y3 1 4 0 1 4 0 2 0 3 4 1 0 0 5 0 0 1 4 2 0 3 Each row corresponds to a (multivariate) observation. Note that the first two rows are identical: this means that data c(1,4,0) was observed twice. Now suppose I can list the sample space: S [1,] 5 0 0 [2,] 4 1 0 [3,] 3 2 0 [4,] 2 3 0 [5,] 1 4 0 [6,] 0 5 0 [7,] 4 0 1 [8,] 3 1 1 [9,] 2 2 1 [10,] 1 3 1 [11,] 0 4 1 [12,] 3 0 2 [13,] 2 1 2 [14,] 1 2 2 [15,] 0 3 2 [16,] 2 0 3 [17,] 1 1 3 [18,] 0 2 3 [19,] 1 0 4 [20,] 0 1 4 [21,] 0 0 5 (thus each row corresponds to a point in my sample space). Now what I need to do is to construct a new matrix, which uses the 'observation' matrix above, which is a sort of table: desired y1 y2 y3 d [1,] 5 0 0 0 [2,] 4 1 0 1 [3,] 3 2 0 0 [4,] 2 3 0 0 [5,] 1 4 0 2 [6,] 0 5 0 1 [7,] 4 0 1 0 [8,] 3 1 1 0 [9,] 2 2 1 0 [10,] 1 3 1 0 [11,] 0 4 1 0 [12,] 3 0 2 0 [13,] 2 1 2 0 [14,] 1 2 2 0 [15,] 0 3 2 0 [16,] 2 0 3 2 [17,] 1 1 3 0 [18,] 0 2 3 0 [19,] 1 0 4 0 [20,] 0 1 4 1 [21,] 0 0 5 0 Thus the 'd' column counts the number of times that each row occurs in variable 'observation'. So desired[5,4]=2 because the observation corresponding to desired[5,1:3] (viz c(1,4,0)) occurred twice. And desired[1,4]=0 because the observation corresponding to desired[1,1:3] (viz c(5,0,0)) did not occur once (it was not observed). In my application I have dim(S) ~= c(5,4e6). I've tried merge(), stack(), reshape(), but the best I can do is the (derisory): require(partitions) obs - matrix(as.integer(c( 1, 4, 0, 1, 4, 0, 2, 0, 3, 4, 1, 0, 0, 5, 0, 0, 1, 4, 2, 0, 3 )),ncol=3,byrow=TRUE) S - t(compositions(5,3)) d - rep(0,nrow(S)) for(i in seq_len(nrow(obs))){ for(j in seq_len(nrow(S))){ if(all(obs[i,,drop=TRUE] == S[j,,drop=TRUE])){ d[j] - d[j]+1 } } } S - cbind(S,d) Anyone got anything better before I try C? -- Robin K. S. Hankin Uncertainty Analyst University of Cambridge 19 Silver Street Cambridge CB3 9EP 01223-764877 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] reference on fisher.test()
On Fri, 16 Oct 2009, Robin Hankin wrote: Hi fexact.c points you to the original ACM paper: Well, you'll get a better idea from the help page as to the real 'original' source reference: the reference below is to a revised version in a remark. And indeed Agresti's book (first edition on the help page, also has a 2002 second edition) is a good source for the 'minutiae'. /* ALGORITHM 643, COLLECTED ALGORITHMS FROM ACM. THIS WORK PUBLISHED IN TRANSACTIONS ON MATHEMATICAL SOFTWARE, VOL. 19, NO. 4, DECEMBER, 1993, PP. 484-488. - You may find the discussion in the vignette(fishervig) in the aylmer package helpful. HTH Robin Peter Dalgaard wrote: Peng Yu wrote: On Thu, Oct 15, 2009 at 4:19 PM, RICHARD M. HEIBERGER r...@temple.edu wrote: On Thu, Oct 15, 2009 at 4:56 PM, Peng Yu pengyu...@gmail.com wrote: Can somebody point me a book on Fisher's exact test? I looked a few webpages. But the descriptions on the webpages are not very complete. Is there a book on that covers all the aspect of Fisher's exact test that is implemented in R? Section 15.2 of my book (Statistical Analysis and Data Display, with Burt Holland and published by Springer) shows a detailed example. It doesn't mention odd ratio. The general idea of basing the inference on the noncentral hypergeometric distribution is something I have first seen in BreslowDay's famous 1980 book on case-control studies, including the fact that the conditional MLE differs from the ordinary OR. (I'm sure there's an earlier reference, but I happened to be a grad student when that book came out...) The rest of what R does is carbon copied from similar procedures for the binomial distribution. I wouldn't know what kind of book to look for for that sort of minutiae. Alan Agresti is a possible source. -- Robin K. S. Hankin Uncertainty Analyst University of Cambridge 19 Silver Street Cambridge CB3 9EP 01223-764877 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Urgent help requested to modify a script
I am hoping someone will tak up this chalenge (I am new to R) I have inheritied an R script but need to change it. The script currently includes hardcoded file locations on lines 12,166 and 167. I need to modify this script to allow the folder to be passed as a command line argument to Rscript.exe Can anybody help please? Regards, Ian http://www.nabble.com/file/p25924237/My_script.R My_script.R -- View this message in context: http://www.nabble.com/Urgent-help-requested-to-modify-a-script-tp25924237p25924237.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Urgent help requested to modify a script
On 10/16/2009 02:01 PM, one2luv wrote: I am hoping someone will tak up this chalenge (I am new to R) I have inheritied an R script but need to change it. The script currently includes hardcoded file locations on lines 12,166 and 167. I need to modify this script to allow the folder to be passed as a command line argument to Rscript.exe Can anybody help please? Regards, Ian http://www.nabble.com/file/p25924237/My_script.R My_script.R ?commandArgs -- Romain Francois Professional R Enthusiast +33(0) 6 28 91 30 30 http://romainfrancois.blog.free.fr |- http://tr.im/BcPw : celebrating R commit #5 |- http://tr.im/ztCu : RGG #158:161: examples of package IDPmisc `- http://tr.im/yw8E : New R package : sos __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] currency conversion function?
On 10/15/09, Henrique Dallazuanna www...@gmail.com wrote: foo - function(from, to, date){ url - http://www.oanda.com/convert/classic?script=..%2Fconvert%2Fclassiclanguage=envalue=1; params - sprintf(%sdate=%sexch=%sexch2=margin_fixed=0expr=%sexpr2=SUBMIT=Convert+Nowlang=endate_fmt=us, url, format(as.Date(date), %m/%d/%y), to, from) Lines - readLines(params) value - gsub(.*(.+[0-9]\\.[0-9]+).*, \\1, grep(nl, grep(from, grep(to, Lines, value = TRUE), value = TRUE), value = TRUE)) as.numeric(value) } Hmm.. There are still some inconsistencies: foo('EUR', 'RUB', '2009-10-16') ### slightly wrong [1] 43.830 foo('RUB', 'EUR', '2009-10-16') [1] 0.0228 #Friday, October 16, 2009 #1 Euro = 43.85951 Russian Rouble #1 Russian Rouble (RUB) = 0.02280 Euro (EUR) #Median price = 43.82953 / 43.85951 (bid/ask) It seems that EUR/RUB reports the median data. For other pairs (say, EUR/PEN) the discrepancies can be more obvious. Thank you Liviu __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Estimation in a changepoint regression with R
Package bcp does Bayesian changepoint analysis, though not in the general regression framework. The most recent reference is Bioinformatics 24(19) 2143-2148; doi: 10.1093/bioinformatics/btn404; slightly older is JSS 23(3). Both reference some alternatives you might want to consider (including strucchange, among others). Jay Message: 4 Date: Thu, 15 Oct 2009 03:56:22 -0700 (PDT) From: FMH kagba2...@yahoo.com Subject: [R] Estimation in a changepoint regression with R To: r-help@r-project.org Message-ID: 365399.56401...@web38303.mail.mud.yahoo.com Content-Type: text/plain; charset=iso-8859-1 Dear All, I'm trying to do the estimation in a changepoint regression problem via R, but never found any suitable function which might help me to do this. Could someone give me a hand?on this matter? Thank you. -- John W. Emerson (Jay) Associate Professor of Statistics Department of Statistics Yale University http://www.stat.yale.edu/~jay __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] reference on fisher.test()
For some alternative views (and references) for FET see: http://www.stat.columbia.edu/~cook/movabletype/archives/2009/10/what_is_the_bay.html#comments kjetil On Fri, Oct 16, 2009 at 8:38 AM, Prof Brian Ripley rip...@stats.ox.ac.uk wrote: On Fri, 16 Oct 2009, Robin Hankin wrote: Hi fexact.c points you to the original ACM paper: Well, you'll get a better idea from the help page as to the real 'original' source reference: the reference below is to a revised version in a remark. And indeed Agresti's book (first edition on the help page, also has a 2002 second edition) is a good source for the 'minutiae'. /* ALGORITHM 643, COLLECTED ALGORITHMS FROM ACM. THIS WORK PUBLISHED IN TRANSACTIONS ON MATHEMATICAL SOFTWARE, VOL. 19, NO. 4, DECEMBER, 1993, PP. 484-488. - You may find the discussion in the vignette(fishervig) in the aylmer package helpful. HTH Robin Peter Dalgaard wrote: Peng Yu wrote: On Thu, Oct 15, 2009 at 4:19 PM, RICHARD M. HEIBERGER r...@temple.edu wrote: On Thu, Oct 15, 2009 at 4:56 PM, Peng Yu pengyu...@gmail.com wrote: Can somebody point me a book on Fisher's exact test? I looked a few webpages. But the descriptions on the webpages are not very complete. Is there a book on that covers all the aspect of Fisher's exact test that is implemented in R? Section 15.2 of my book (Statistical Analysis and Data Display, with Burt Holland and published by Springer) shows a detailed example. It doesn't mention odd ratio. The general idea of basing the inference on the noncentral hypergeometric distribution is something I have first seen in BreslowDay's famous 1980 book on case-control studies, including the fact that the conditional MLE differs from the ordinary OR. (I'm sure there's an earlier reference, but I happened to be a grad student when that book came out...) The rest of what R does is carbon copied from similar procedures for the binomial distribution. I wouldn't know what kind of book to look for for that sort of minutiae. Alan Agresti is a possible source. -- Robin K. S. Hankin Uncertainty Analyst University of Cambridge 19 Silver Street Cambridge CB3 9EP 01223-764877 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] two graphs 1 x-axis
Hi Naomi, Take a look at the lattice package for plotting. An example using your data: library(lattice) library(reshape) a-c(1,2,3,4,5,6) b-c(3,5,4,6,1,1) c-c(1,1,1,1,1,1) bla = data.frame(a,b,c) # melt is from reshape bla2 = melt(bla, id.vars = a) xyplot(value~a | variable, bla2, layout = c(1,2), strip = strip.custom(factor.levels = c(a vs b, a vs c))) ?melt ?xyplot xyplot takes care that the axis are equal, no need to set it yourself. Lattice is a bit harder to get to know than the 'normal' plotting system in R, but is great for multivariate data. cheers and good luck, Paul Duijvesteijn, Naomi wrote: Dear R-people I have a question concerning plotting graphs. Here an example dataset a-c(1,2,3,4,5,6) b-c(3,5,4,6,1,1) c-c(1,1,1,1,1,1) d-as.data.frame(cbind(a,b,c)) plot.new() plot(d$a, d$b, col=red) par(new=TRUE) plot(d$a,d$c, col=red, pch=|) What I would want is to plot de second plot under the first plot. So not in the the first plot. There is a way to divide your graph in 2 or 3 parts and use the same x-axis but I do not seem to get it right. Could somebody help me out? Thanks in advance! Regards, Naomi Disclaimer: De informatie opgenomen in dit bericht (en bijlagen) kan vertrouwelijk zijn en is uitsluitend bestemd voor de geadresseerde(n). Indien u dit bericht ten onrechte ontvangt, wordt u geacht de inhoud niet te gebruiken, de afzender direct te informeren en het bericht te vernietigen. Aan dit bericht kunnen geen rechten of plichten worden ontleend. -- Disclaimer: The information contained in this message may be confidential and is intended to be exclusively for the addressee. Should you receive this message unintentionally, you are expected not to use the contents herein, to notify the sender immediately and to destroy the message. No rights can be derived from this message. Please consider the environment before printing this email __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Drs. Paul Hiemstra Department of Physical Geography Faculty of Geosciences University of Utrecht Heidelberglaan 2 P.O. Box 80.115 3508 TC Utrecht Phone: +3130 274 3113 Mon-Tue Phone: +3130 253 5773 Wed-Fri http://intamap.geo.uu.nl/~paul __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] reference on fisher.test()
For me, the classical reference for FET is: @BOOK{Siegel56, title = {Nonparametric Statistics for the Behavioral Sciences}, publisher = {McGraw-Hill}, year = {1956}, author = {Siegel, Sidney}, address = {New York} } Tom Prof Brian Ripley wrote: On Fri, 16 Oct 2009, Robin Hankin wrote: Hi fexact.c points you to the original ACM paper: Well, you'll get a better idea from the help page as to the real 'original' source reference: the reference below is to a revised version in a remark. And indeed Agresti's book (first edition on the help page, also has a 2002 second edition) is a good source for the 'minutiae'. /* ALGORITHM 643, COLLECTED ALGORITHMS FROM ACM. THIS WORK PUBLISHED IN TRANSACTIONS ON MATHEMATICAL SOFTWARE, VOL. 19, NO. 4, DECEMBER, 1993, PP. 484-488. - You may find the discussion in the vignette(fishervig) in the aylmer package helpful. HTH Robin Peter Dalgaard wrote: Peng Yu wrote: On Thu, Oct 15, 2009 at 4:19 PM, RICHARD M. HEIBERGER r...@temple.edu wrote: On Thu, Oct 15, 2009 at 4:56 PM, Peng Yu pengyu...@gmail.com wrote: Can somebody point me a book on Fisher's exact test? I looked a few webpages. But the descriptions on the webpages are not very complete. Is there a book on that covers all the aspect of Fisher's exact test that is implemented in R? Section 15.2 of my book (Statistical Analysis and Data Display, with Burt Holland and published by Springer) shows a detailed example. It doesn't mention odd ratio. The general idea of basing the inference on the noncentral hypergeometric distribution is something I have first seen in BreslowDay's famous 1980 book on case-control studies, including the fact that the conditional MLE differs from the ordinary OR. (I'm sure there's an earlier reference, but I happened to be a grad student when that book came out...) The rest of what R does is carbon copied from similar procedures for the binomial distribution. I wouldn't know what kind of book to look for for that sort of minutiae. Alan Agresti is a possible source. -- Robin K. S. Hankin Uncertainty Analyst University of Cambridge 19 Silver Street Cambridge CB3 9EP 01223-764877 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] currency conversion function?
On 10/15/09, Jeff Ryan jeff.a.r...@gmail.com wrote: getFX(EUR/USD,from=2009-04-01) Indeed, with the date correctly specified, the function no longer generates errors. There is one issue though (similar to the one in the code posted by Henrique): getFX(EUR/PEN,from=2009-10-16) [1] EURPEN EURPEN EUR.PEN 2009-10-16 4.3197 getFX(PEN/EUR,from=2009-10-16) [1] PENEUR PENEUR ### doesn't coincide with the value on the web page PEN.EUR 2009-10-16 0.2377 #Friday, October 16, 2009 #1 Euro = 4.31973 Peruvian Nuevo Sol #1 Peruvian Nuevo Sol (PEN) = 0.23150 Euro (EUR) While EUR/PEN is correct, PEN/EUR seems wrong. I spotted this in other currencies, too. If you are looking for additional FX data, the FRED archive (St. Louis Fed) is very good as well... http://research.stlouisfed.org/fred2/categories/94 getSymbols(DEXUSEU, src=FRED) Thank you for the pointer. Best Liviu __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to install JGR manually?
(cc'ing JGR specific list) Hello On 10/15/09, Carl Witthoft c...@witthoft.com wrote: Here's the problem: on Windows, the 'jgr.exe' tool starts up by checking for a connecting to the 'net in order to grab the support packages. Well, we have machines at work that are not and never will be connected to the Internet. I tried manually installing all the packages (JGR, Rjava, etc) but the jgr.exe still tries to find a connection. Have you (manually) installed all JGR dependencies: iplots, JavaGD, etc.? What is the exact message that JGR pops out? Does JGR refuse to start at all? Liviu __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] : Question about correlation between data.
hi everybody, I'm a student, and I'm new using R! I'm looking for statistical help hoping somebody can answer me! This is my problem: I have 2 temporal series. The firstone is a series of mesured data (height of monitorated points), the second is a series of temperature (in Celsius degree). Using Matlab I have built the two graphs (Measured Data - Time Temperature - Time). Looking those graphs I can surely say that there is a clear correlation beetween theme, and also that the measured data are surely influenced by the variations of temperature. Unfortunately my statistical knowledges are not that large so using R seems quite difficult to me. My question is: is there a code already written the can compare the 2 temporal series and can find the correlation between the data??? And also: is there a code that can correct the Measured Data from the influence of temperature and return a clean data??? if you want I can send you the .txt file with data. Thanks Antonio Gonelli - University of Ferrara - Italy __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] package installation from source
Dear all I noticed from NEWS 2.11.0,dev SIGNIFICANT USER-VISIBLE CHANGES o Packages must have been installed under R 2.10.0 or later, as the current help system is the only one now supported. So I tried to follow instructions in manual, Duncan Murdoch presentation and help pages to prepare and accomplish installation of a set of functions I use. However in R 2.11.0dev and too in R 2.10.0dev I was not able to finish it. I installed Rtools210 I opened commander window and or R program with --vanilla option. I coppied all my functions and data to R and run package.skeleton package.skeleton(list=ls(), name=fun) Creating directories ... Creating DESCRIPTION ... Creating Read-and-delete-me ... Saving functions and data ... Making help files ... Done. Further steps are described in './fun/Read-and-delete-me'. Package structure seems to be ok list.files(D:/temp/fun, recursive=TRUE) [1] data/modely1.rda data/modely2.rda data/modely3.rda [4] data/stand.rda DESCRIPTION man/addLine.Rd [7] man/azce.Rd man/bayerf.h.Rd man/bayerf.Rd snip [112] R/tridy.RR/trigrid.R R/truehist.R [115] R/voda.tlak.RR/vodiv2rozkl.R R/warmcold.R [118] R/weighted.mean.RR/weighted.var.R R/write.excel.R [121] Read-and-delete-me I changed PATH by PATH = D:\programy\Rtools\bin; D:\programy\Rtools\perl\bin; D:\programy\Rtools\MinGW\bin; D:\programy\R-2.10.0dev\bin; %PATH% When I try to do install.packages(D:/temp/fun, repos=NULL, type=source) sh nenˇ n zvem vnitýnˇho ani vnŘjçˇho pýˇkazu, spustiteln‚ho programu nebo d vkov‚ho souboru. Warning message: In install.packages(D:/temp/fun, repos = NULL, type = source) : installation of package 'D:/temp/fun' had non-zero exit status which basically tells that sh is not a name of command, exe or bat file. The same I get when I try to run D:\temp R CMD INSTALL fun Please can you give me any suggestion for making installation work. I remember when going from 1.9 to 2.0 versions it was also necessary to install my bunch of functions, however at that time I need to use something like make, make install or so and gather all necessary programs myself. Best regards Petr __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Confidence intervals and p-values of ARMAX model
Is there any way to get confidence intervals and/or p-values in the ARMAX model? I tried the search using: ?ar ?arma ?arima But, AFAIK, ar and arma don't accept exogenous (independent) variables, and arima(x, order, xreg) does not return statistics on the xreg terms. BTW, just in case someone tries to get this information from Wikipedia, the entries in articles Autoregressive model, Autoregressive moving average model, and Autoregressive integrated moving average that mention R were written by me - so there's no point in repeating them (but corrections are welcome!) :-) Alberto Monteiro __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] reference on fisher.test()
And indeed Agresti's book (first edition on the help page, also has a 2002 second edition) is a good source for the 'minutiae'. I find the description in his 1992 survey article even better, especially with regard to the noncentral hypergeometric distribution: @article{Agresti:92, Author = {Agresti, Alan}, Journal = {Statistical Science}, Number = 1, Pages = {131--153}, Source = {JSTOR}, Title = {A Survey of Exact Inference for Contingency Tables}, Volume = 7, Year = 1992} Best regards, Stefan Evert [ stefan.ev...@uos.de | http://purl.org/stefan.evert ] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] currency conversion function?
On Fri, Oct 16, 2009 at 8:04 AM, Liviu Andronic landronim...@gmail.com wrote: On 10/15/09, Jeff Ryan jeff.a.r...@gmail.com wrote: getFX(EUR/USD,from=2009-04-01) Indeed, with the date correctly specified, the function no longer generates errors. There is one issue though (similar to the one in the code posted by Henrique): getFX(EUR/PEN,from=2009-10-16) [1] EURPEN EURPEN EUR.PEN 2009-10-16 4.3197 getFX(PEN/EUR,from=2009-10-16) [1] PENEUR PENEUR ### doesn't coincide with the value on the web page PEN.EUR 2009-10-16 0.2377 #Friday, October 16, 2009 #1 Euro = 4.31973 Peruvian Nuevo Sol #1 Peruvian Nuevo Sol (PEN) = 0.23150 Euro (EUR) While EUR/PEN is correct, PEN/EUR seems wrong. I spotted this in other currencies, too. When I look on the site now: onversion Table: PEN to EUR (Interbank rate) Time period: 10/10/09 to 10/16/09. Daily averages: 10/10/2009,0.23960 10/11/2009,0.24120 10/12/2009,0.24120 10/13/2009,0.24170 10/14/2009,0.24090 10/15/2009,0.23630 10/16/2009,0.23770 Which seems to match... This is from the FXHistory section. getFX(PEN/EUR, from=2009-10-16) [1] PENEUR PENEUR PEN.EUR 2009-10-16 0.2377 Is it possible you are looking at a 'live' quote? Or maybe just getting an update at precisely the wrong time? Given the 24-7 nature of the market, I think this just reflects when they choose to mark the price for history quotes. Jeff If you are looking for additional FX data, the FRED archive (St. Louis Fed) is very good as well... http://research.stlouisfed.org/fred2/categories/94 getSymbols(DEXUSEU, src=FRED) Thank you for the pointer. Best Liviu -- Jeffrey Ryan jeffrey.r...@insightalgo.com ia: insight algorithmics www.insightalgo.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] two graphs 1 x-axis
You can use ggplot2. library(ggplot2) a-c(1,2,3,4,5,6) b-c(3,5,4,6,1,1) c-c(1,1,1,1,1,1) dframe = data.frame(a,b,c);dframe melt.dframe - melt(dframe, id= a);melt.dframe qplot(a,value,data=melt.dframe) + facet_grid(variable~.,scales=free) Felipe D. Carrillo Supervisory Fishery Biologist Department of the Interior US Fish Wildlife Service California, USA --- On Fri, 10/16/09, Duijvesteijn, Naomi naomi.duijveste...@ipg.nl wrote: From: Duijvesteijn, Naomi naomi.duijveste...@ipg.nl Subject: [R] two graphs 1 x-axis To: r-help@r-project.org r-help@r-project.org Date: Friday, October 16, 2009, 3:22 AM Dear R-people I have a question concerning plotting graphs. Here an example dataset a-c(1,2,3,4,5,6) b-c(3,5,4,6,1,1) c-c(1,1,1,1,1,1) d-as.data.frame(cbind(a,b,c)) plot.new() plot(d$a, d$b, col=red) par(new=TRUE) plot(d$a,d$c, col=red, pch=|) What I would want is to plot de second plot under the first plot. So not in the the first plot. There is a way to divide your graph in 2 or 3 parts and use the same x-axis but I do not seem to get it right. Could somebody help me out? Thanks in advance! Regards, Naomi Disclaimer: De informatie opgenomen in dit bericht (en bijlagen) kan vertrouwelijk zijn en is uitsluitend bestemd voor de geadresseerde(n). Indien u dit bericht ten onrechte ontvangt, wordt u geacht de inhoud niet te gebruiken, de afzender direct te informeren en het bericht te vernietigen. Aan dit bericht kunnen geen rechten of plichten worden ontleend. -- Disclaimer: The information contained in this message may be confidential and is intended to be exclusively for the addressee. Should you receive this message unintentionally, you are expected not to use the contents herein, to notify the sender immediately and to destroy the message. No rights can be derived from this message. Please consider the environment before printing this email -Inline Attachment Follows- __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] using a custom color sequence for image()
Hi, I'd like to use a custom color sequence (black - low values, green - high values) in am image() plot. While I can specify colors (say a sequence of grays) to the col argument, the ordering is getting messed up. I have two questions: 1. How can I get a sequence of say 256 colors starting from black and ending in green? 2. How is this specified to image() such that it uses the colors in the proper ordering? Thanks, -- Rajarshi Guha NIH Chemical Genomics Center [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] [stats-rosuda-devel] how to install JGR manually?
On Oct 16, 2009, at 9:14 , Liviu Andronic wrote: (cc'ing JGR specific list) Yes, thanks, stats-rosuda-devel is where this post should go. Carl, On 10/15/09, Carl Witthoft c...@witthoft.com wrote: Here's the problem: on Windows, the 'jgr.exe' tool starts up by checking for a connecting to the 'net in order to grab the support packages. Well, we have machines at work that are not and never will be connected to the Internet. I tried manually installing all the packages (JGR, Rjava, etc) but the jgr.exe still tries to find a connection. What this means that you have either installed them in the wrong library or you have not installed them all. The launcher actually tells you exactly which packages are missing so it should be easy to install them manually. If in doubt, remove your preferences file .JGRprefsrc (in your home). If you're still stuck, please tell us exactly your setup (where you installed R, the packages etc.). Also try to run JGR with --debug and it will create a file C:\JGRdebug.txt with additional information that may be helpful. Cheers, Simon __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Generating a stochastic matrix with a specified second dominant eigenvalue
A valiant attempt, Albyn! Unfortunately, the matrix B is not guaranteed to be a stochastic matrix. In fact, it is not even guaranteed to be a real matrix. Your procedure can generate a B that contains negative elements or even complex elements. M = matrix(runif(9),nrow=3) M = M/apply(M,1,sum) e=eigen(M) e$values[2]= .7 Q = e$vectors Qi = solve(Q) B = Q %*% diag(e$values) %*% Qi eigen(B)$values [1] 1. 0.7000 -0.04436574 apply(B,1,sum) [1] 1 1 1 B [,1] [,2] [,3] [1,] 0.77737077 0.3340768 -0.1114476 [2,] 0.20606226 0.2601840 0.5337537 [3,] 0.08326022 0.2986603 0.6180794 Note that B[1,3] is negative. Another example: M = matrix(runif(9),nrow=3) M = M/apply(M,1,sum) e=eigen(M) e$values[2]= .95 Q = e$vectors Qi = solve(Q) B = Q %*% diag(e$values) %*% Qi eigen(B)$values [1] 1.-0.i 0.9500+0.i -0.09348883-0.02904173i apply(B,1,sum) [1] 1+0i 1-0i 1+0i B [,1] [,2] [,3] [1,] 0.6558652-0.550613i 0.2408879+0.2212234i 0.1032469+0.3293896i [2,] 0.1683119+1.594515i 0.6954317-0.7378503i 0.1362564-0.8566647i [3,] 0.2812210-2.462385i 0.2135648+1.2029636i 0.5052143+1.2594216i Note that B has complex elements. So, I took your algorithm and embedded it in an iterative procedure to keep repeating your steps until it found a B matrix that is real and non-negative. Here is that function: e2stochMat - function(N, e2, maxiter) { iter - 0 while (iter = maxiter) { iter - iter + 1 M - matrix(runif(N*N), nrow=N) M - M / apply(M,1,sum) e - eigen(M) e$values[2] -e2 Q - e$vectors B - Q %*% diag(e$values) %*% solve(Q) real - all (abs(Im(B)) 1.e-16) positive - all (Re(B) 0) if (real positive) break } list(stochMat=B, iter=iter) } e2stochMat(N=3, e2=0.95, maxiter=1) # works e2stochMat(N=5, e2=0.95, maxiter=1) # fails This works for very small N, say, N = 3, but it fails for larger N. The probability of success is a decreasing function of N and e2. So, the algorithm fails for large N and for values of e2 close to 1. Thanks for trying. Best, Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: rvarad...@jhmi.edu Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.h tml -Original Message- From: Albyn Jones [mailto:jo...@reed.edu] Sent: Thursday, October 15, 2009 6:56 PM To: Ravi Varadhan Cc: r-help@r-project.org Subject: Re: [R] Generating a stochastic matrix with a specified second dominant eigenvalue I just tried the following shot in the dark: generate an N by N stochastic matrix, M. I used M = matrix(runif(9),nrow=3) M = M/apply(M,1,sum) e=eigen(M) e$values[2]= .7 (pick your favorite lambda, you may need to fiddle with the others to guarantee this is second largest.) Q = e$vectors Qi = solve(Q) B = Q %*% diag(e$values) %*% Qi eigen(B)$values [1] 1. 0.7000 -0.08518772 apply(B,1,sum) [1] 1 1 1 I haven't proven that this must work, but it seems to. Since you can verify that it worked afterwards, perhaps the proof is in the pudding. albyn On Thu, Oct 15, 2009 at 06:24:20PM -0400, Ravi Varadhan wrote: Hi, Given a positive integer N, and a real number \lambda such that 0 \lambda 1, I would like to generate an N by N stochastic matrix (a matrix with all the rows summing to 1), such that it has the second largest eigenvalue equal to \lambda (Note: the dominant eigenvalue of a stochastic matrix is 1). I don't care what the other eigenvalues are. The second eigenvalue is important in that it governs the rate at which the random process given by the stochastic matrix converges to its stationary distribution. Does anyone know of an algorithm to do this? Thanks for any help, Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: rvarad...@jhmi.edu Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan. html http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.h tml [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide
[R] negative length vectors are not allowed in wilcox.exact() and perm.test()
Dear R friends, I want to compare two datasets and I get the message Error in .Call(cpermdist2, ma = as.integer(m), mb = as.integer(col), : negative length vectors are not allowed after specifying the exact test. I'm using the exactRankTests package. Do you suggest me using the coin library, or is there anything wrong with my data? Kind regards, David -- Jetzt kostenlos herunterladen: Internet Explorer 8 und Mozilla Firefox 3.5 - sicherer, schneller und einfacher! http://portal.gmx.net/de/go/atbrowser __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] using a custom color sequence for image()
Rajarshi Guha asked: Hi, I'd like to use a custom color sequence (black - low values, green - high values) in am image() plot. While I can specify colors (say a sequence of grays) to the col argument, the ordering is getting messed up. I have two questions: 1. How can I get a sequence of say 256 colors starting from black and ending in green? 2. How is this specified to image() such that it uses the colors in the proper ordering? Does image(x, y, z) (whatever are the x, y and z) work correctly? If so, probably this is what you want: image(x, y, z, nlevels=256, col=rgb(0, seq(0, 1, length=256), 0)) Explanation: rgb(0, seq(0, 1, length=256), 0) creates a vector of colours that begin with black = #0 and ends up with green = #00FF00. Parameter nlevels = 256 forces the image to use all colours. Alberto Monteiro __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] package installation from source
On 10/16/2009 9:31 AM, Petr PIKAL wrote: Dear all I noticed from NEWS 2.11.0,dev SIGNIFICANT USER-VISIBLE CHANGES o Packages must have been installed under R 2.10.0 or later, as the current help system is the only one now supported. So I tried to follow instructions in manual, Duncan Murdoch presentation and help pages to prepare and accomplish installation of a set of functions I use. However in R 2.11.0dev and too in R 2.10.0dev I was not able to finish it. I installed Rtools210 I opened commander window and or R program with --vanilla option. I coppied all my functions and data to R and run package.skeleton package.skeleton(list=ls(), name=fun) Creating directories ... Creating DESCRIPTION ... Creating Read-and-delete-me ... Saving functions and data ... Making help files ... Done. Further steps are described in './fun/Read-and-delete-me'. Package structure seems to be ok list.files(D:/temp/fun, recursive=TRUE) [1] data/modely1.rda data/modely2.rda data/modely3.rda [4] data/stand.rda DESCRIPTION man/addLine.Rd [7] man/azce.Rd man/bayerf.h.Rd man/bayerf.Rd snip [112] R/tridy.RR/trigrid.R R/truehist.R [115] R/voda.tlak.RR/vodiv2rozkl.R R/warmcold.R [118] R/weighted.mean.RR/weighted.var.R R/write.excel.R [121] Read-and-delete-me I changed PATH by PATH = D:\programy\Rtools\bin; D:\programy\Rtools\perl\bin; D:\programy\Rtools\MinGW\bin; D:\programy\R-2.10.0dev\bin; %PATH% When I try to do install.packages(D:/temp/fun, repos=NULL, type=source) sh nenˇ n zvem vnitýnˇho ani vnŘjçˇho pýˇkazu, spustiteln‚ho programu nebo d vkov‚ho souboru. Warning message: In install.packages(D:/temp/fun, repos = NULL, type = source) : installation of package 'D:/temp/fun' had non-zero exit status which basically tells that sh is not a name of command, exe or bat file. There should be a sh.exe in d:\programy\Rtools\bin if that's where you installed the Rtools. Are you sure you have the path set the way you think you do? Remember that PATH is an environment variable, and environment variable settings are local: so setting the PATH in a CMD window has no effect on R or any other command window. You need to use the Control Panel to set the default system path, and then restart any process that needs to see it. Duncan Murdoch The same I get when I try to run D:\temp R CMD INSTALL fun Please can you give me any suggestion for making installation work. I remember when going from 1.9 to 2.0 versions it was also necessary to install my bunch of functions, however at that time I need to use something like make, make install or so and gather all necessary programs myself. Best regards Petr __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Generating a stochastic matrix with a specified second dominant eigenvalue
On Oct 15, 2009, at 6:24 PM, Ravi Varadhan wrote: Hi, Given a positive integer N, and a real number \lambda such that 0 \lambda 1, I would like to generate an N by N stochastic matrix (a matrix with all the rows summing to 1), such that it has the second largest eigenvalue equal to \lambda (Note: the dominant eigenvalue of a stochastic matrix is 1). I don't care what the other eigenvalues are. The second eigenvalue is important in that it governs the rate at which the random process given by the stochastic matrix converges to its stationary distribution. Does anyone know of an algorithm to do this? I surely don't. My linear algebra is a distant and not entirely pleasant memory. I went searching and ended up at Google books reading a couple of texts on real analysis and stochastic matrices. Both citations reminded me that a matrix induces or implies a polynomial form over powers of some matrix (? constructed out of the basis vectors?) with the eigenvalues representing the coefficients. I believe that form can be solved for, but I don't know the process. It made me wonder if constructing a matrix out of powers of an appropriate stochastic basis would deliver the sought for resultant. Such a construction is outside my abilities. -- David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] negative length vectors are not allowed in wilcox.exact() and perm.test()
David Croll wrote: I want to compare two datasets and I get the message Error in .Call(cpermdist2, ma = as.integer(m), mb = as.integer(col), : negative length vectors are not allowed after specifying the exact test. I'm using the exactRankTests package. Do you suggest me using the coin library, or is there anything wrong with my data? Hard to say. Reproducible example please ... ? -- View this message in context: http://www.nabble.com/%22negative-length-vectors-are-not-allowed%22-in-wilcox.exact%28%29-and-perm.test%28%29-tp25926434p25927290.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] negative length vectors are not allowed in wilcox.exact() and perm.test()
David Croll wrote: I want to compare two datasets and I get the message Error in .Call(cpermdist2, ma = as.integer(m), mb = as.integer(col), : negative length vectors are not allowed after specifying the exact test. I'm using the exactRankTests package. Do you suggest me using the coin library, or is there anything wrong with my data? Yes. You did not show your data. Dieter -- View this message in context: http://www.nabble.com/%22negative-length-vectors-are-not-allowed%22-in-wilcox.exact%28%29-and-perm.test%28%29-tp25926434p25927297.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] tapply() and using factor() on a factor
Thank you Mohamed and Bill for your replies. (I did not send the data because it is unwieldy.) Yes Bill, the issue arises directly from what you had guessed. I was working with a subset of the data (which implicitly had factors for the complete data set). On this, what is the best way take a subset of the data which ignores these extraneous factors? log-data.frame(Flag=1:2, RequestID=factor(letters[1:2],levels=letters[1:10])) log2 -subset(log, RequestID==a) levels(log2$RequestID) [1] a b c d e f g h i j In other words, how do I take a subset which yields a as the only level for log2? Alex -Original Message- From: William Dunlap [mailto:wdun...@tibco.com] Sent: Thursday, October 15, 2009 11:59 PM To: Alexander Peterhansl; r-help@r-project.org Subject: RE: [R] tapply() and using factor() on a factor -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Alexander Peterhansl Sent: Thursday, October 15, 2009 2:50 PM To: r-help@r-project.org Subject: [R] tapply() and using factor() on a factor Dear List, Shouldn't result1 and result2 be equal in the following case? Note that log$RequestID is a factor. That is, is.factor(log$RequestID) yields TRUE. result1 - tapply(log$Flag,factor(log$RequestID),sum) result2 - tapply(log$Flag,log$RequestID,sum) Showing us the output of dput(log) (or str(log) and summary(log)) would let people discover the problem more readily. Since you didn't I'll guess what the dataset may contain. If log$RequestID is a factor with lots of unused levels tapply will output an NA for each unused level. factor(log$RequestID) will create a new set of levels, only those actually used, so tapply will not be forced to fill those spots with NA's. E.g., log-data.frame(Flag=1:2, RequestID=factor(letters[1:2], levels=letters[1:10])) tapply(log$Flag, log$RequestID, sum) a b c d e f g h i j 1 2 NA NA NA NA NA NA NA NA tapply(log$Flag, factor(log$RequestID), sum) a b 1 2 I suppose tapply(X,INDEX,FUN) could call FUN(X[0]) to see how to fill the cells with no data behind them, but it doesn't. Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com Yet, when I summarize the output, I get the following: summary(result1) Min.1st Qu. Median Mean 3rd Qu.Max. 11.00 11.00 11.00 26.06 11.00 101.00 summary(result2) Min. 1st Qu. Median Mean 3rd Qu.Max.NA's 11.00 11.00 11.0026.06 11.00 101.00 978.00 Why does result2 have 978 NA's? Any help on this would be appreciated. Alex [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Determining a linear model based on a factor
I guess I should disclose up front that am not a statistician by schooling, but I am intersted in getting the terminology correct so please correct it if I butcher it too badly. I have been able to very easily build a linear model showing the correlation between two variables, e.g. year built and square footage: HomeSqFt_lm-lm(as.numeric(as.character(SqFootage)) ~ as.numeric(as.character(Home_Year_Built)), data=Home_DF) summary(HomeSqFt_lm) I would like to, however, be able to use lm to produce the a linear model using the same variables for different neighborhoods, e.g. square footage vs. build year for neighborhood 1, etc. Is that possible using the lm() command?An example of my dataset is shown below. sample_size-200 Home_SqFootage-sample(1200:3600, size=sample_size, rep=T) Home_Year_Built-sample(1989:2008, size=sample_size, rep=T) Home_Year_Sold-sample(1989:2008, size=sample_size, rep=T) Neighborhood-sample(1:4, size=sample_size, rep=T) Home_DF-data.frame(SqFootage=Home_SqFootage, YearBuilt=as.character(Home_Year_Built), YearSold=as.character(Home_Year_Sold), Neighborhood=Neighborhood) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Matrixes as data
Hola! I am working on a problem where data points are (square) matrices. Is there a way to make a vector of matrices, such that it can be stored in a data.frame? Can that be done with S3? or do I have to learn S4 objects methods? Kjetil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Proper syntax for using varConstPower in nlme
Michael A. Gilchrist wrote: - nlme(Count ~ quad.PBMC.model(aL, aN, T0), + data = tissueData, + weights = varConstPower(form =~ Count), + start = list( fixed = c(rep(1000, 8), -2, -2) ), + fixed = list(T0 ~ TypeTissue-1, aL ~ 1, aN ~ 1), + random = aL + aN ~ 1|Tissue + ) Error in MEestimate(nlmeSt, grpShrunk) : Singularity in backsolve at level 0, block 1 You could use varPower(form=~fitted()) (the default, also varPower()). In my experience this runs into some limitation quickly with nlme, because some boundary conditions make convergence fail. Try varPower(fixed = 0.5) first and play with the number. You should only use varConstPower when you have problems with values that cover a large range, coming close to zero, which could make varPower go havoc. Always do a plot of the result; the default plot gives you residual, and some indication how to proceed. Dieter -- View this message in context: http://www.nabble.com/Proper-syntax-for-using-varConstPower-in-nlme-tp25914277p25927578.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Matrixes as data
On Fri, Oct 16, 2009 at 4:36 PM, Kjetil Halvorsen kjetilbrinchmannhalvor...@gmail.com wrote: Hola! I am working on a problem where data points are (square) matrices. Is there a way to make a vector of matrices, such that it can be stored in a data.frame? Can that be done with S3? or do I have to learn S4 objects methods? If the matrices are all the same size then you could store them in an array, which is essentially a 3 or more dimensional matrix. Otherwise, you can store them in a list, and get them by number: foo = list(matrix(1:9,3,3),matrix(1:16,4,4)) foo[[1]] foo[[2]] and so forth. You'll only need to create new object classes (with S3 or S4) if you want special behaviour of vectors of these things (such as plot(foo) doing something sensible). With S3 it's easy: class(foo)=squareMatrixVector plot.squareMatrixVector=function(x,y,...){ cat(ouch\n) } plot(foo) ouch Barry __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] populating an array
R doesn't access arrays like C, use [i,j] to access a 2-d array, e.g.: my_array - array(0,dim=c(2,2)) for(i in seq(1,2,by=1)){ + for(j in seq(1,2,by=1)){ + my_array[i,j] = i+j + } + } my_array [,1] [,2] [1,]23 [2,]34 tdm wrote: Hi, Can someone please give me a pointer as to how I can set values of an array? Why does the code below not work? my_array - array(dim=c(2,2)) my_array[][] = 0 my_array [,1] [,2] [1,]00 [2,]00 for(i in seq(1,2,by=1)){ for(j in seq(1,2,by=1)){ my_array[i][j] = 5 } } Warning messages: 1: In my_array[i][j] = 5 : number of items to replace is not a multiple of replacement length 2: In my_array[i][j] = 5 : number of items to replace is not a multiple of replacement length my_array [,1] [,2] [1,]50 [2,]50 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Test model for singular gradient matrix
Hello, I am working with a real-time hydrologic modeling system, and I am using R (R batch script on Linux) to create a non-linear relationship (exponential) between observed data. I want to extract the non-linear coefficients (“b” and “m”) if the relationship can be created, if the relationship cannot be created I will use default “b” and “m” coefficients. I keep getting an error of singular gradient matrix (see below). I want to test whether I can create the relationship (because if I cannot the script crashes) and use the model extracted coefficients or use default coefficients. Model in R batch script: fit.nls - nls(P~(b*exp(m*Z)), start=list(m=0.015,b=0.017), control=list(maxiter=200)) Error in nlsModel(formula, mf, start, wts) : singular gradient matrix at initial parameter estimates ** the initial parameter values are also the default Question: 1) Is there a way to test fit.nls (or the data) prior to see if this error occurs. 2) Would this test be set up as an if statement? if (fit is good) {proceed model coefficients} else {use default coefficients}? Thanks for all your help, Doug *** Douglas M. Hultstrand Ph.D. Candidate Earth Sciences Watershed Science Program Colorado State University http://www.cnr.colostate.edu/~dmhultst/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Removing Embedded Null characters from text/html
[David contacted me directly, so I am sending my off-line reply to the list just for the record in case others encounter a simple problem.] Hi David. No problem contacting me at all. I saw your mail at one point on the mailing list, but didn't have a chance to respond. Indeed, it seems like there is some embedded null in the string. I need to investigate more about what is happening with the encoding, etc. and whether it is on the RCurl or R side. But for the meantime, the following two approaches seem to get around the problem: 1) just use htmlParse(url) on the URL directly, i.e. don't use RCurl. We only need basic HTTP facilities and htmlParse() (or more specifically libxml2) provides these for us. 2) If you need RCurl to manage the connection and communication for the HTTP request, use txt = rawToChar(getURLContent(url, binary = TRUE)) # You'll see a warning about truncation htmlParse(txt, asText = TRUE) BTW, use htmlTreeParse() or htmlParse(). I use the latter and then XPath expression via getNodeSet() or xpathApply() to extract content from the document. HTH, D. David Young wrote: Hi, I'm trying to download some data from the web and am running into problems with 'embedded null' characters. These seem to indicate to R that it should stop processing the page so I'd like to remove them. I've been looking around and can't seem to identify exactly what the character is and consequently how to remove it. # THE CODE WORKS ON THIS PAGE library(RCurl) library(XML) theurl - http://en.wikipedia.org/wiki/Brazil_national_football_team; webpage - getURL(theurl) # BUT DOES NOT WORK HERE DUE TO EMBEDDED NULL CHARACTERS theurl - http://screen.yahoo.com/b?pr=1/s=nmdb=stocksvw=0b=21; webpage - getURL(theurl) Error in curlPerform(curl = curl, .opts = opts, .encoding = .encoding) : Failed writing body (1371 != 1461) In addition: Warning messages: 1: In curlPerform(curl = curl, .opts = opts, .encoding = .encoding) : truncating string with embedded nul: 'ttp://finance. ## I DELETED SOME HERE FOR BREVITY## al\nData and [... truncated] 2: In curlPerform(curl = curl, .opts = opts, .encoding = .encoding) : only read 1371 of the 1461 input bytes/characters # THIS CODE COPIES THE PROBLEMATIC PAGE TO MY COMPUTER destfile-file:///C:/projects/stock data/data/test.htm download.file ( theurl , destfile , quiet = TRUE ) # WHICH LEAVES ME WITH JUST IDENTIFYING WHAT CHARACTER IS CAUSING THE # PROBLEM AND THEN GETTING RID OF IT. I'd appreciate any advice. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] tapply() and using factor() on a factor
On Oct 16, 2009, at 11:33 AM, Alexander Peterhansl wrote: Thank you Mohamed and Bill for your replies. (I did not send the data because it is unwieldy.) Yes Bill, the issue arises directly from what you had guessed. I was working with a subset of the data (which implicitly had factors for the complete data set). On this, what is the best way take a subset of the data which ignores these extraneous factors? log-data.frame(Flag=1:2, RequestID=factor(letters[1:2],levels=letters[1:10])) log2 -subset(log, RequestID==a) levels(log2$RequestID) [1] a b c d e f g h i j log2$RequestID - factor(log2$RequestID) You might think that log2 -subset(log, RequestID==a, drop=TRUE) might do that task, but it clearly doesn't. -- DW In other words, how do I take a subset which yields a as the only level for log2? Alex -Original Message- From: William Dunlap [mailto:wdun...@tibco.com] Sent: Thursday, October 15, 2009 11:59 PM To: Alexander Peterhansl; r-help@r-project.org Subject: RE: [R] tapply() and using factor() on a factor -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Alexander Peterhansl Sent: Thursday, October 15, 2009 2:50 PM To: r-help@r-project.org Subject: [R] tapply() and using factor() on a factor Dear List, Shouldn't result1 and result2 be equal in the following case? Note that log$RequestID is a factor. That is, is.factor(log$RequestID) yields TRUE. result1 - tapply(log$Flag,factor(log$RequestID),sum) result2 - tapply(log$Flag,log$RequestID,sum) Showing us the output of dput(log) (or str(log) and summary(log)) would let people discover the problem more readily. Since you didn't I'll guess what the dataset may contain. If log$RequestID is a factor with lots of unused levels tapply will output an NA for each unused level. factor(log$RequestID) will create a new set of levels, only those actually used, so tapply will not be forced to fill those spots with NA's. E.g., log-data.frame(Flag=1:2, RequestID=factor(letters[1:2], levels=letters[1:10])) tapply(log$Flag, log$RequestID, sum) a b c d e f g h i j 1 2 NA NA NA NA NA NA NA NA tapply(log$Flag, factor(log$RequestID), sum) a b 1 2 I suppose tapply(X,INDEX,FUN) could call FUN(X[0]) to see how to fill the cells with no data behind them, but it doesn't. Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com Yet, when I summarize the output, I get the following: summary(result1) Min.1st Qu. Median Mean 3rd Qu.Max. 11.00 11.00 11.00 26.06 11.00 101.00 summary(result2) Min. 1st Qu. Median Mean 3rd Qu.Max.NA's 11.00 11.00 11.0026.06 11.00 101.00 978.00 Why does result2 have 978 NA's? Any help on this would be appreciated. David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Division of data frame and deletion of values from column
Hi all, I guess this might be an easy question, but I've searched multiple help pages without finding any answear... so now I put my trust in you! I have a data frame (36 variables and 556 observations). One column contains three factors, and I would like to divide the data frame into three new ones, based on the value of the factors, thereby having only one value for all elements of the particular column in each of the data frames. The reason is that I later will create plots and do statistical analyzes on these data frames, and I don't want those factors affecting the result. ID Weight Age_days ... 1 18 76.1 106 2 19 77.0 175 3 20 78.1 121 4 21 78.2 121 5 22 78.8 106 6 23 76.3 106 . . . I also have another column containing several factors, of which I would like to exclude one (get NA instead). ID Weight Age_days Value_ID ... 1 18 76.1 106 high 2 19 77.0 175 low 3 20 78.1 121middle 4 21 78.2 121 high 5 22 78.8 106 high 6 23 76.3 106number -- exclude 7 24 76.9 175 low . . . I really hope someone could help me, though you might think it's too easy... Best regards, Joel _ Hitta kärleken nu i vår! http://dejting.se.msn.com/channel/index.aspx?trackingid=1002952 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Matrixes as data
Thanks. The points of having the column of matrices (all the same dimension) in a data.frame, is that there are also other data, each matrix is at a location, so there are geographical coordinates and possibly other measurements at the same location. Kjetil On Fri, Oct 16, 2009 at 12:46 PM, Barry Rowlingson b.rowling...@lancaster.ac.uk wrote: On Fri, Oct 16, 2009 at 4:36 PM, Kjetil Halvorsen kjetilbrinchmannhalvor...@gmail.com wrote: Hola! I am working on a problem where data points are (square) matrices. Is there a way to make a vector of matrices, such that it can be stored in a data.frame? Can that be done with S3? or do I have to learn S4 objects methods? If the matrices are all the same size then you could store them in an array, which is essentially a 3 or more dimensional matrix. Otherwise, you can store them in a list, and get them by number: foo = list(matrix(1:9,3,3),matrix(1:16,4,4)) foo[[1]] foo[[2]] and so forth. You'll only need to create new object classes (with S3 or S4) if you want special behaviour of vectors of these things (such as plot(foo) doing something sensible). With S3 it's easy: class(foo)=squareMatrixVector plot.squareMatrixVector=function(x,y,...){ cat(ouch\n) } plot(foo) ouch Barry __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Division of data frame and deletion of values from column
On Oct 16, 2009, at 12:16 PM, Joel Fürstenberg-Hägg wrote: Hi all, I guess this might be an easy question, but I've searched multiple help pages without finding any answear... so now I put my trust in you! I have a data frame (36 variables and 556 observations). One column contains three factors, and I would like to divide the data frame into three new ones, based on the value of the factors, thereby having only one value for all elements of the particular column in each of the data frames. The reason is that I later will create plots and do statistical analyzes on these data frames, and I don't want those factors affecting the result. ID Weight Age_days ... 1 18 76.1 106 2 19 77.0 175 3 20 78.1 121 4 21 78.2 121 5 22 78.8 106 6 23 76.3 106 . . . ?split I also have another column containing several factors, of which I would like to exclude one (get NA instead). ID Weight Age_days Value_ID ... 1 18 76.1 106 high 2 19 77.0 175 low 3 20 78.1 121middle 4 21 78.2 121 high 5 22 78.8 106 high 6 23 76.3 106number -- exclude 7 24 76.9 175 low . . . ?is.na# which has an is.na(- capacity I really hope someone could help me, though you might think it's too easy... Best regards, Joel David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] generalization of tabulate()
If I have correctly understood what Robin wants, I don't think Gabor's elaborate solution is necessary (though I grant that it may be more general). It's also slow due to the apply's. A more straightforward and faster approach is to convert the rows to individual character strings via paste() and then use table() and match() to get your counts. ## Example data tbl - matrix(sample(0:9,24,rep=TRUE),ncol=3) ## sample space obs - tbl[sample(1:6,12,rep=TRUE),] ## sample ## Now the code ## Use paste to convert each row into a character string tblRow - do.call(paste,c(data.frame(tbl),sep=.)) obsRow - do.call(paste,c(data.frame(obs),sep=.)) d - rep(0,nrow(tbl)) ## initialize vector of counts counts - table(obsRow) ## Let (the fast) table() do the work d[match(names(counts),tblRow)] - counts ## vector of counts ## here are results from a sample run: tbl [,1] [,2] [,3] [1,]634 [2,]060 [3,]427 [4,]033 [5,]902 [6,]789 [7,]753 [8,]718 obs [,1] [,2] [,3] [1,]902 [2,]033 [3,]634 [4,]902 [5,]789 [6,]060 [7,]427 [8,]060 [9,]902 [10,]902 [11,]789 [12,]789 d [1] 1 2 1 1 4 3 0 0 There may well be more elegant ways to do this, too. Cheers, Bert Gunter Genentech Nonclinical Biostatistics -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Gabor Grothendieck Sent: Friday, October 16, 2009 4:27 AM To: Robin Hankin Cc: r-help@r-project.org Subject: Re: [R] generalization of tabulate() Using the generalized inner product defined in this post: https://www.stat.math.ethz.ch/pipermail/r-help/2006-July/109311.html try this: cbind(S, d = rowSums(inner(S, obs, identical))) On Fri, Oct 16, 2009 at 4:29 AM, Robin Hankin rk...@cam.ac.uk wrote: Hi I want a generalization of tabulate() which works on rows of a matrix. Suppose I have an integer matrix 'observation': observation y1 y2 y3 1 4 0 1 4 0 2 0 3 4 1 0 0 5 0 0 1 4 2 0 3 Each row corresponds to a (multivariate) observation. Note that the first two rows are identical: this means that data c(1,4,0) was observed twice. Now suppose I can list the sample space: S [1,] 5 0 0 [2,] 4 1 0 [3,] 3 2 0 [4,] 2 3 0 [5,] 1 4 0 [6,] 0 5 0 [7,] 4 0 1 [8,] 3 1 1 [9,] 2 2 1 [10,] 1 3 1 [11,] 0 4 1 [12,] 3 0 2 [13,] 2 1 2 [14,] 1 2 2 [15,] 0 3 2 [16,] 2 0 3 [17,] 1 1 3 [18,] 0 2 3 [19,] 1 0 4 [20,] 0 1 4 [21,] 0 0 5 (thus each row corresponds to a point in my sample space). Now what I need to do is to construct a new matrix, which uses the 'observation' matrix above, which is a sort of table: desired y1 y2 y3 d [1,] 5 0 0 0 [2,] 4 1 0 1 [3,] 3 2 0 0 [4,] 2 3 0 0 [5,] 1 4 0 2 [6,] 0 5 0 1 [7,] 4 0 1 0 [8,] 3 1 1 0 [9,] 2 2 1 0 [10,] 1 3 1 0 [11,] 0 4 1 0 [12,] 3 0 2 0 [13,] 2 1 2 0 [14,] 1 2 2 0 [15,] 0 3 2 0 [16,] 2 0 3 2 [17,] 1 1 3 0 [18,] 0 2 3 0 [19,] 1 0 4 0 [20,] 0 1 4 1 [21,] 0 0 5 0 Thus the 'd' column counts the number of times that each row occurs in variable 'observation'. So desired[5,4]=2 because the observation corresponding to desired[5,1:3] (viz c(1,4,0)) occurred twice. And desired[1,4]=0 because the observation corresponding to desired[1,1:3] (viz c(5,0,0)) did not occur once (it was not observed). In my application I have dim(S) ~= c(5,4e6). I've tried merge(), stack(), reshape(), but the best I can do is the (derisory): require(partitions) obs - matrix(as.integer(c( 1, 4, 0, 1, 4, 0, 2, 0, 3, 4, 1, 0, 0, 5, 0, 0, 1, 4, 2, 0, 3 )),ncol=3,byrow=TRUE) S - t(compositions(5,3)) d - rep(0,nrow(S)) for(i in seq_len(nrow(obs))){ for(j in seq_len(nrow(S))){ if(all(obs[i,,drop=TRUE] == S[j,,drop=TRUE])){ d[j] - d[j]+1 } } } S - cbind(S,d) Anyone got anything better before I try C? -- Robin K. S. Hankin Uncertainty Analyst University of Cambridge 19 Silver Street Cambridge CB3 9EP 01223-764877 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide
[R] Cannot calculate mean() for double vector
I've been using R recently to analyze some data, but I'm having a problem using the mean() function. I imported the original data set as a vector of integers, x and then calculated a exponential moving average of the data, x_ema. This part worked fine. Then, I tried to find the mean squared error between the original series and the moving average, using mse = mean((x - x_ema)^2). This gives N/A as a result, which seems to be the result of the mean function. When I run mean() on x_ema, which is of data type double, it always returns N/A. I can find the mean of the original integer data just fine, as well as for a simple test vector of 10 doubles, but it never seems to return a usable result for the x_ema vector. Is this because the vector is too long? (Both x and x_ema contain around 4 values). Am I running into some memory limit? I can do other calculations with x_ema just fine, it's only the mean() function that doesn't seem to work. All the information in the help seems to indicate that this operation should work without a problem. If it matters at all, I'm using R 2.9.2 running on Windows Vista. I'd appreciate any help or suggestions as to what might be wrong. Thank you, Reuben Bellika __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Test model for singular gradient matrix
dmhultst wrote: Hello, I am working with a real-time hydrologic modeling system, and I am using R (R batch script on Linux) to create a non-linear relationship (exponential) between observed data. I want to extract the non-linear coefficients (“b” and “m”) if the relationship can be created, if the relationship cannot be created I will use default “b” and “m” coefficients. I keep getting an error of singular gradient matrix (see below). I want to test whether I can create the relationship (because if I cannot the script crashes) and use the model extracted coefficients or use default coefficients. Model in R batch script: fit.nls - nls(P~(b*exp(m*Z)), start=list(m=0.015,b=0.017), control=list(maxiter=200)) Error in nlsModel(formula, mf, start, wts) : singular gradient matrix at initial parameter estimates ** the initial parameter values are also the default Question: 1) Is there a way to test fit.nls (or the data) prior to see if this error occurs. 2) Would this test be set up as an if statement? if (fit is good) {proceed model coefficients} else {use default coefficients}? The standard R approach to this is try(), i.e. something like result - it.nls - try(nls(P~(b*exp(m*Z)), start=list(m=0.015,b=0.017), control=list(maxiter=200))) if (inherits(result,try-error))) { values - default } else { values - coef(it.nls) } -- View this message in context: http://www.nabble.com/Test-model-for-singular-gradient-matrix-tp25927673p25928893.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Subset returning unexpected result
Steve Murray wrote: fa3c85ac-06ab-4635-9642-8909552c0...@comcast.net Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Bill=2C It seems to be 'character' - odd...! Not really very odd. I'll wager that your data was originally recorded in Excel or equivalent and that the column for 'latitude' was in 'general' format, which is MS's default and which very few Excel users seem to take the trouble to change to 'number' format. I'll further wager that somewhere in that column you have a non-number entry such as 'nil', '51..345' or even 'could not be measured because ...'. Don't laugh, I've see all of these and many more. That's why it's critical to issue a str(mydata) right after any data import. Cheers, Peter Ehlers str(int1901$Latitude) =A0chr [1:61537] 5.75 6.25 6.75 7.25 7.75 8.25 ... Thanks again=2C Steve What does str(int1901) show to be the type for Latitude? (I'm guessing it's a factor.) -- David Winsemius=2C MD Heritage Laboratories West Hartford=2C CT =20 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Generating a stochastic matrix with a specified second dominant eigenvalue
ooops! I tested it on a 3x3, and a 4x4, and it worked both times. Then I ran it once more and pasted it in without looking carefully... Here's another half-baked idea. Generate a random graph with given degree distribution, turn the incidence matrix into a stochastic matrix (ie the transition probability is 1/d where d is the degree of the vertex). Now all we need is the connection between the degree distribution and the second eigenvalue :-) If you want lambda close to one, perhaps you could generate two graphs, then add vertices joining the two graphs until the second eigenvalue is the right size of course, this will give a random lambda, not exactly your chosen value. albyn On Fri, Oct 16, 2009 at 10:32:48AM -0400, Ravi Varadhan wrote: A valiant attempt, Albyn! Unfortunately, the matrix B is not guaranteed to be a stochastic matrix. In fact, it is not even guaranteed to be a real matrix. Your procedure can generate a B that contains negative elements or even complex elements. M = matrix(runif(9),nrow=3) M = M/apply(M,1,sum) e=eigen(M) e$values[2]= .7 Q = e$vectors Qi = solve(Q) B = Q %*% diag(e$values) %*% Qi eigen(B)$values [1] 1. 0.7000 -0.04436574 apply(B,1,sum) [1] 1 1 1 B [,1] [,2] [,3] [1,] 0.77737077 0.3340768 -0.1114476 [2,] 0.20606226 0.2601840 0.5337537 [3,] 0.08326022 0.2986603 0.6180794 Note that B[1,3] is negative. Another example: M = matrix(runif(9),nrow=3) M = M/apply(M,1,sum) e=eigen(M) e$values[2]= .95 Q = e$vectors Qi = solve(Q) B = Q %*% diag(e$values) %*% Qi eigen(B)$values [1] 1.-0.i 0.9500+0.i -0.09348883-0.02904173i apply(B,1,sum) [1] 1+0i 1-0i 1+0i B [,1] [,2] [,3] [1,] 0.6558652-0.550613i 0.2408879+0.2212234i 0.1032469+0.3293896i [2,] 0.1683119+1.594515i 0.6954317-0.7378503i 0.1362564-0.8566647i [3,] 0.2812210-2.462385i 0.2135648+1.2029636i 0.5052143+1.2594216i Note that B has complex elements. So, I took your algorithm and embedded it in an iterative procedure to keep repeating your steps until it found a B matrix that is real and non-negative. Here is that function: e2stochMat - function(N, e2, maxiter) { iter - 0 while (iter = maxiter) { iter - iter + 1 M - matrix(runif(N*N), nrow=N) M - M / apply(M,1,sum) e - eigen(M) e$values[2] -e2 Q - e$vectors B - Q %*% diag(e$values) %*% solve(Q) real - all (abs(Im(B)) 1.e-16) positive - all (Re(B) 0) if (real positive) break } list(stochMat=B, iter=iter) } e2stochMat(N=3, e2=0.95, maxiter=1) # works e2stochMat(N=5, e2=0.95, maxiter=1) # fails This works for very small N, say, N = 3, but it fails for larger N. The probability of success is a decreasing function of N and e2. So, the algorithm fails for large N and for values of e2 close to 1. Thanks for trying. Best, Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: rvarad...@jhmi.edu Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.h tml -Original Message- From: Albyn Jones [mailto:jo...@reed.edu] Sent: Thursday, October 15, 2009 6:56 PM To: Ravi Varadhan Cc: r-help@r-project.org Subject: Re: [R] Generating a stochastic matrix with a specified second dominant eigenvalue I just tried the following shot in the dark: generate an N by N stochastic matrix, M. I used M = matrix(runif(9),nrow=3) M = M/apply(M,1,sum) e=eigen(M) e$values[2]= .7 (pick your favorite lambda, you may need to fiddle with the others to guarantee this is second largest.) Q = e$vectors Qi = solve(Q) B = Q %*% diag(e$values) %*% Qi eigen(B)$values [1] 1. 0.7000 -0.08518772 apply(B,1,sum) [1] 1 1 1 I haven't proven that this must work, but it seems to. Since you can verify that it worked afterwards, perhaps the proof is in the pudding. albyn On Thu, Oct 15, 2009 at 06:24:20PM -0400, Ravi Varadhan wrote: Hi, Given a positive integer N, and a real number \lambda such that 0 \lambda 1, I would like to generate an N by N stochastic matrix (a matrix with all the rows summing to 1), such that it has the second largest eigenvalue equal to \lambda (Note: the dominant eigenvalue of a stochastic matrix is 1). I don't care what the other eigenvalues are. The second eigenvalue is important in that it governs the rate at which the random process given by the stochastic matrix converges to its
Re: [R] Division of data frame and deletion of values from column
?subset ?split --- On Fri, 10/16/09, Joel Fürstenberg-Hägg joel_furstenberg_h...@hotmail.com wrote: From: Joel Fürstenberg-Hägg joel_furstenberg_h...@hotmail.com Subject: [R] Division of data frame and deletion of values from column To: r-help@r-project.org Received: Friday, October 16, 2009, 12:16 PM Hi all, I guess this might be an easy question, but I've searched multiple help pages without finding any answear... so now I put my trust in you! I have a data frame (36 variables and 556 observations). One column contains three factors, and I would like to divide the data frame into three new ones, based on the value of the factors, thereby having only one value for all elements of the particular column in each of the data frames. The reason is that I later will create plots and do statistical analyzes on these data frames, and I don't want those factors affecting the result. ID Weight Age_days ... 1 18 76.1 106 2 19 77.0 175 3 20 78.1 121 4 21 78.2 121 5 22 78.8 106 6 23 76.3 106 . . . I also have another column containing several factors, of which I would like to exclude one (get NA instead). ID Weight Age_days Value_ID ... 1 18 76.1 106 high 2 19 77.0 175 low 3 20 78.1 121 middle 4 21 78.2 121 high 5 22 78.8 106 high 6 23 76.3 106 number -- exclude 7 24 76.9 175 low . . . I really hope someone could help me, though you might think it's too easy... Best regards, Joel _ Hitta kärleken nu i vår! http://dejting.se.msn.com/channel/index.aspx?trackingid=1002952 [[alternative HTML version deleted]] -Inline Attachment Follows- __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ Ask a question on any topic and get answers from real people. Go to Yahoo! Answers and share wha __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Problem with the stl function
Hi there, My name is Renan X. Cortes, student of Statistics, from south of Brazil, and I'd like to ask you a few questions about decomposition of time series. In R, when I fit the decomposition using the stl function, an object is returned when ask the summary of the fit, called STL.seasonal (%), STL.trend (%) and STL.remainder (%). Once the decomposition is additive, I thought that this would be some kind of decompositon of the variability of the time serie in terms of seasonal, trend and residual unexplained. Just like a factorial analysis. But, the sum os the %'s isn't one. In fact, in some cases the value of the STL.seasonal or STL.trend exceeds 100%. When I read the paper of the help of the function STL: A Seasonal-Trend Decomposition Procedure Based on Loess, I've seen that the % of the components due the decomposition was constructed under the eigenvalues of the operators matrices T and S. But It's not clear for me, in the stl function in R what exactly does these %'s means. What does these values means? Is there a practical interpretation for them? If so, which is? I attached a file showing the values. This doubt is killing my nights of sleep. Best regards, Renan Xavier Cortes Departament os Statistics Universidade Federal do Rio Grande do Sul, Brasil _ Facebook. cial-network-basics.aspx?ocid=PID23461::T:WLMTAGL:ON:WL:en-xm:SI_SB_2:092 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Cannot calculate mean() for double vector
Have you tried: mean(x) mean(as.numeric(as.character(x))) mean(x_ema) mean(as.numeric(as.character(x_ema))) What is the result of the following: which(is.na(as.numeric(as.character(x_ema Abit hard since you don't provide the data, but there may be an NA or character value that is causing the error. Hope this helps at a bit. --- On Fri, 10/16/09, Reuben Bellika rube...@gmail.com wrote: From: Reuben Bellika rube...@gmail.com Subject: [R] Cannot calculate mean() for double vector To: r-help@r-project.org Date: Friday, October 16, 2009, 12:06 PM I've been using R recently to analyze some data, but I'm having a problem using the mean() function. I imported the original data set as a vector of integers, x and then calculated a exponential moving average of the data, x_ema. This part worked fine. Then, I tried to find the mean squared error between the original series and the moving average, using mse = mean((x - x_ema)^2). This gives N/A as a result, which seems to be the result of the mean function. When I run mean() on x_ema, which is of data type double, it always returns N/A. I can find the mean of the original integer data just fine, as well as for a simple test vector of 10 doubles, but it never seems to return a usable result for the x_ema vector. Is this because the vector is too long? (Both x and x_ema contain around 4 values). Am I running into some memory limit? I can do other calculations with x_ema just fine, it's only the mean() function that doesn't seem to work. All the information in the help seems to indicate that this operation should work without a problem. If it matters at all, I'm using R 2.9.2 running on Windows Vista. I'd appreciate any help or suggestions as to what might be wrong. Thank you, Reuben Bellika __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] DCC Garch model estimation
Hi all: I am a master student writing my master thesis about using the DCC garch model analysing the correlation between the insurance stock market and the local market. Right now I have a question about how to estimate the DCC garch, I know all the other meanings of the arguments in the ccgarch package, but still dont know how to estimate the dcc.para in the function dcc.estimation with for example two columns of time series returns. Can you tell me how to get the dcc.para matrix in R? Really thanks. Best Regards. Wenkai Li liwenkai1...@hotmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Cannot calculate mean() for double vector
Reuben Bellika wrote: I've been using R recently to analyze some data, but I'm having a problem using the mean() function. I imported the original data set as a vector of integers, x and then calculated a exponential moving average of the data, x_ema. This part worked fine. Then, I tried to find the mean squared error between the original series and the moving average, using mse = mean((x - x_ema)^2). This gives N/A as a result, which seems to be the result of the mean function. When I run mean() on x_ema, which is of data type double, it always returns N/A. So, x_ema includes one (or more) NA (and not N/A) in it. Test: if (any(is.na(x_ema))) cat(Oops! NAs in x_ema\n) If you want to get which of them are na: which(is.na(x_ema)) Alberto Monteiro __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Different way of scaling data
Hi, I have a data.frame that I need to scale. I've been using the scale function and it works nicely. Some of the libraries I'm testing won't accept negative values for data, so I need to find a way to scale the data from 0 to 1 Any ideas? Thans! __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] negative length vectors are not allowed in wilcox.exact() and perm.test()
Ben Bolker wrote: David Croll wrote: I want to compare two datasets and I get the message Error in .Call(cpermdist2, ma = as.integer(m), mb = as.integer(col), : negative length vectors are not allowed after specifying the exact test. I'm using the exactRankTests package. Do you suggest me using the coin library, or is there anything wrong with my data? Hard to say. Reproducible example please ... ? Yes, can we get an example of a negative length vector, please. Pat __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] negative length vectors are not allowed in wilcox.exact() and perm.test()
Number of politicians' promises kept? --- On Fri, 10/16/09, Patrick Burns pbu...@pburns.seanet.com wrote: From: Patrick Burns pbu...@pburns.seanet.com Subject: Re: [R] negative length vectors are not allowed in wilcox.exact() and perm.test() To: Ben Bolker bol...@ufl.edu Cc: r-help@r-project.org Received: Friday, October 16, 2009, 1:45 PM Ben Bolker wrote: David Croll wrote: I want to compare two datasets and I get the message Error in .Call(cpermdist2, ma = as.integer(m), mb = as.integer(col), : negative length vectors are not allowed after specifying the exact test. I'm using the exactRankTests package. Do you suggest me using the coin library, or is there anything wrong with my data? Hard to say. Reproducible example please ... ? Yes, can we get an example of a negative length vector, please. Pat __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ Make your browsing faster, safer, and easier with the new Internet Explorer® 8. Opt ernetexplorer/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Cannot calculate mean() for double vector
OK. It looks like I just have several NA values at the start of my array: which (is.na(x_ema)) [1] 1 2 3 4 5 6 7 8 9 That make sense, because the moving average is not defined for those positions. I'll just have to set those values to zero: x_ema = replace(x_ema, which(is.na(x_ema)), 0) which (is.na(x_ema)) integer(0) The mean() call works now and I can get on with my work. I'll have to remember to condition the data like this in the future. Thanks for the help! Reuben Bellika __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Different way of scaling data
library(reshape) ?rescaler I think something along the lines of rescaler(data.frame, type=range) should do what you want. --- On Fri, 10/16/09, Noah Silverman n...@smartmediacorp.com wrote: From: Noah Silverman n...@smartmediacorp.com Subject: [R] Different way of scaling data To: r help r-help@r-project.org Received: Friday, October 16, 2009, 1:41 PM Hi, I have a data.frame that I need to scale. I've been using the scale function and it works nicely. Some of the libraries I'm testing won't accept negative values for data, so I need to find a way to scale the data from 0 to 1 Any ideas? Thans! __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to right-align labels in dotchart
Oops replying to the wrong post but anyway does this do what you want? aa - c(3,6,3,5,8) lbs - c('cat','goat', 'elephant', 'horse', 'whale') dotchart(aa, pch=(16), col = 1:5, main=A Dotchart) axis(side = 2, seq_along(aa), lbs, las=1) --- On Thu, 10/15/09, Sean Carmody seancarm...@gmail.com wrote: From: Sean Carmody seancarm...@gmail.com Subject: [R] How to right-align labels in dotchart To: R Help Mailing List r-help@r-project.org Received: Thursday, October 15, 2009, 7:51 PM I have only just discovered the joys of the dotchart (since I am reading William Cleveland's -- Sean Carmody The Stubborn Mule http://www.stubbornmule.net http://twitter.com/seancarmody [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ Looking for the perfect gift? Give the gift of Flickr! http://www.flickr.com/gift/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Cannot calculate mean() for double vector
On Fri, Oct 16, 2009 at 2:01 PM, Reuben Bellika rube...@gmail.com wrote: OK. It looks like I just have several NA values at the start of my array: which (is.na(x_ema)) [1] 1 2 3 4 5 6 7 8 9 That make sense, because the moving average is not defined for those positions. I'll just have to set those values to zero: x_ema = replace(x_ema, which(is.na(x_ema)), 0) No! Your mean is now biased toward zero! see ?mean and read the part about na.rm. -Ista which (is.na(x_ema)) integer(0) The mean() call works now and I can get on with my work. I'll have to remember to condition the data like this in the future. Thanks for the help! Reuben Bellika __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Ista Zahn Graduate student University of Rochester Department of Clinical and Social Psychology http://yourpsyche.org __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Cannot calculate mean() for double vector
Are you sure that's the right solution? If you set them to 0, those values are averaged in with the others, and that could make a substantial difference. A much better approach: mean(x_ema, na.rm=TRUE) - see the help for mean for more information. Sarah On Fri, Oct 16, 2009 at 2:01 PM, Reuben Bellika rube...@gmail.com wrote: OK. It looks like I just have several NA values at the start of my array: which (is.na(x_ema)) [1] 1 2 3 4 5 6 7 8 9 That make sense, because the moving average is not defined for those positions. I'll just have to set those values to zero: x_ema = replace(x_ema, which(is.na(x_ema)), 0) which (is.na(x_ema)) integer(0) The mean() call works now and I can get on with my work. I'll have to remember to condition the data like this in the future. Thanks for the help! Reuben Bellika -- Sarah Goslee http://www.functionaldiversity.org __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] How odds ratio is computed in fisher.test()?
I'm wondering how odds ratio is computed. I thought that it is (n11/n12)/(n21/n22), but it is not what fisher.test() computes. Could somebody let me know? n11=3 n12=1 n21=1 n22=3 n1_=n11+n12 n2_=n21+n22 n_1=n11+n21 n_2=n12+n22 x=rbind(c(n11,n12),c(n21,n22)) threshold=dhyper(n11,n1_,n2_,n_1) probability=dhyper(0:n_1,n1_,n2_,n_1) sum(probability[probability=threshold]) [1] 0.4857143 (n11/n12)/(n21/n22) [1] 9 fisher.test(x) Fisher's Exact Test for Count Data data: x p-value = 0.4857 alternative hypothesis: true odds ratio is not equal to 1 95 percent confidence interval: 0.2117329 621.9337505 sample estimates: odds ratio 6.408309 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Cannot calculate mean() for double vector
-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Reuben Bellika Sent: Friday, October 16, 2009 11:02 AM To: r-help@r-project.org Subject: Re: [R] Cannot calculate mean() for double vector OK. It looks like I just have several NA values at the start of my array: which (is.na(x_ema)) [1] 1 2 3 4 5 6 7 8 9 That make sense, because the moving average is not defined for those positions. I'll just have to set those values to zero: x_ema = replace(x_ema, which(is.na(x_ema)), 0) By the way, you can save some typing and maybe increase your understanding of R's semantics by replacing that line with x_ema[is.na(x_ema)] - 0 When subscripting by a logical variable, read the '[' as 'such that': Values in x_ema such that x_ema is NA are changed to zero. which() is never needed when subscripting. replace() is also a waste of typing and memory if you use the same variable on the left side of the assignment and as the first argument to replace(). I'll leave it to others to comment on the statistical issues. Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com which (is.na(x_ema)) integer(0) The mean() call works now and I can get on with my work. I'll have to remember to condition the data like this in the future. Thanks for the help! Reuben Bellika __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] How can I run a function to a piece of text?
Dear users, I got really stuck trying to apply a function to a piece of code that I created using different string functions. To make things really easy, this is a wee example: x-c(1:10) script-x, trim = 0, na.rm = FALSE##script created by a number of paste() and rep() steps mean(script) ##function that I want to apply: doesn't work Is there any way to convert the script class so that the function mean() can read it as if it were text typed in the console? Thanks and have a superb day Javier -- View this message in context: http://www.nabble.com/How-can-I-run-a-function-to-a-piece-of-text--tp25930315p25930315.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How can I run a function to a piece of text?
Based solely on what you told us, this can be done using eval(parse(text=...)) cmd - sprintf(mean(%s), script) eval(parse(text = cmd)) However, with more context, there may be a better solution. See, for example, install.packages(fortunes) library(fortunes) fortune(parse()) HTH, --sundar On Fri, Oct 16, 2009 at 11:39 AM, Javier PB j.perez-barbe...@macaulay.ac.uk wrote: Dear users, I got really stuck trying to apply a function to a piece of code that I created using different string functions. To make things really easy, this is a wee example: x-c(1:10) script-x, trim = 0, na.rm = FALSE ##script created by a number of paste() and rep() steps mean(script) ##function that I want to apply: doesn't work Is there any way to convert the script class so that the function mean() can read it as if it were text typed in the console? Thanks and have a superb day Javier -- View this message in context: http://www.nabble.com/How-can-I-run-a-function-to-a-piece-of-text--tp25930315p25930315.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How odds ratio is computed in fisher.test()?
From ?fisher.test: estimate: an estimate of the odds ratio. Note that the _conditional_ Maximum Likelihood Estimate (MLE) rather than the unconditional MLE (the sample odds ratio) is used. Only present in the 2 by 2 case. from fisher.test: mle - function(x) { if (x == lo) return(0) if (x == hi) return(Inf) mu - mnhyper(1) if (mu x) uniroot(function(t) mnhyper(t) - x, c(0, 1))$root else if (mu x) 1/uniroot(function(t) mnhyper(1/t) - x, c(.Machine$double.eps, 1))$root else 1 } ESTIMATE - mle(x) Peng Yu wrote: I'm wondering how odds ratio is computed. I thought that it is (n11/n12)/(n21/n22), but it is not what fisher.test() computes. Could somebody let me know? n11=3 n12=1 n21=1 n22=3 n1_=n11+n12 n2_=n21+n22 n_1=n11+n21 n_2=n12+n22 x=rbind(c(n11,n12),c(n21,n22)) threshold=dhyper(n11,n1_,n2_,n_1) probability=dhyper(0:n_1,n1_,n2_,n_1) sum(probability[probability=threshold]) [1] 0.4857143 (n11/n12)/(n21/n22) [1] 9 fisher.test(x) Fisher's Exact Test for Count Data data: x p-value = 0.4857 alternative hypothesis: true odds ratio is not equal to 1 95 percent confidence interval: 0.2117329 621.9337505 sample estimates: odds ratio 6.408309 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://www.nabble.com/How-odds-ratio-is-computed-in-fisher.test%28%29--tp25929899p25930580.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How can I run a function to a piece of text?
Javier PB wrote: I got really stuck trying to apply a function to a piece of code that I created using different string functions. To make things really easy, this is a wee example: x-c(1:10) script-x, trim = 0, na.rm = FALSE##script created by a number of paste() and rep() steps mean(script) ##function that I want to apply: doesn't work Is there any way to convert the script class so that the function mean() can read it as if it were text typed in the console? That's a really tricky question. I think that there are two solutions, both of them pass through an alternate way of calling functions. Namely: for each function that can be called as f(x=a, y=b, z=c), you can use do.call(f, list(x=a,y=b,z=c)) or even do.call(f, list(a, b, c)). Let's test this. f - function(x, y) x + 2 * y f(1, 2) do.call(f, list(1, 2)) So far so good. The problem now is that you don't have a list(x, trim = 0, na.rm = FALSE), but a string! But it's a piece of cake to convert a string to an R object (but only after you know the magic words, and they are eval(parse(text=string))): eval(parse(text = paste(list(, script, ), sep = )) So, a slow and didatical(sp?) solution to your problem is: x-c(1:10) script-x, trim = 0, na.rm = FALSE # Step 1: convert script to a list - but still as a string script.list.string - paste(list(, script, ), sep = ) # Step 2: convert script to a list script.list - eval(parse(text = script.list.string)) # Step 3: call the function do.call(mean, script.list) The second way (and simpler) is to enclose the mean function into the script string, and then invoke the magic words: x-c(1:10) script-x, trim = 0, na.rm = FALSE # Step 1: convert script to the calling of mean - but still as a string mean.string - paste(mean(, script, ), sep = ) # Step 2: compute it eval(parse(text = mean.string)) Alberto Monteiro __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to install JGR manually?
It doesn't get that far. I launch jgr.exe and it announces that it can't connect to the internet, and shuts itself down. I've installed every dependent package listed in the jgr documentation. Carl Oct 16, 2009 09:14:38 AM, landronim...@gmail.com wrote: === (cc'ing JGR specific list) Hello On 10/15/09, Carl Witthoft wrote: Here's the problem: on Windows, the 'jgr.exe' tool starts up by checking for a connecting to the 'net in order to grab the support packages. Well, we have machines at work that are not and never will be connected to the Internet. I tried manually installing all the packages (JGR, Rjava, etc) but the jgr.exe still tries to find a connection. Have you (manually) installed all JGR dependencies: iplots, JavaGD, etc.? What is the exact message that JGR pops out? Does JGR refuse to start at all? Liviu __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How can I run a function to a piece of text?
An alternative could be: do.call(mean, as.list(parse(text = unlist(strsplit(script, , ) On Fri, Oct 16, 2009 at 3:39 PM, Javier PB j.perez-barbe...@macaulay.ac.uk wrote: Dear users, I got really stuck trying to apply a function to a piece of code that I created using different string functions. To make things really easy, this is a wee example: x-c(1:10) script-x, trim = 0, na.rm = FALSE ##script created by a number of paste() and rep() steps mean(script) ##function that I want to apply: doesn't work Is there any way to convert the script class so that the function mean() can read it as if it were text typed in the console? Thanks and have a superb day Javier -- View this message in context: http://www.nabble.com/How-can-I-run-a-function-to-a-piece-of-text--tp25930315p25930315.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Breusch-pagan and white test - check homoscedasticity
Hi r-programmers, I performe Breusch-Pagan tests (bptest in package lmtest) to check the homoscedasticity of the residuals from a linear model and I carry out carry out White's test via bptest (formula, ~ x * z + I(x^2) + I(z^2)) include all regressors and the squares/cross-products in the auxiliary regression. But what can I do if I want find coefficient and p-values of variables x, z, x*z, I(x^2), I(z^2) ? **I wish find out which is responsible of heteroscedasticity... Can anyone help? thanking you in advance, Gautier RENAULT [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to right-align labels in dotchart
Thanks very much John, that works. I had tried using las=1 inside the dotchart function itself, but it seems to be one of those occasions where the par parameter is over-ridden by the main plotting function. You can see the results here: http://www.stubbornmule.net/2009/10/asylum-seekers/ Now if only I can nudge the labels a little further to the right... Regards, Sean. On Sat, Oct 17, 2009 at 5:05 AM, John Kane jrkrid...@yahoo.ca wrote: Oops replying to the wrong post but anyway does this do what you want? aa - c(3,6,3,5,8) lbs - c('cat','goat', 'elephant', 'horse', 'whale') dotchart(aa, pch=(16), col = 1:5, main=A Dotchart) axis(side = 2, seq_along(aa), lbs, las=1) --- On Thu, 10/15/09, Sean Carmody seancarm...@gmail.com wrote: From: Sean Carmody seancarm...@gmail.com Subject: [R] How to right-align labels in dotchart To: R Help Mailing List r-help@r-project.org Received: Thursday, October 15, 2009, 7:51 PM I have only just discovered the joys of the dotchart (since I am reading William Cleveland's -- Sean Carmody The Stubborn Mule http://www.stubbornmule.net http://twitter.com/seancarmody [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ Looking for the perfect gift? Give the gift of Flickr! http://www.flickr.com/gift/ -- Sean Carmody The Stubborn Mule http://www.stubbornmule.net http://twitter.com/seancarmody [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] what's the R code for wavelet decomposition (Haar transformation)?
Dear all, Using R function dwt, it seems that I cannot specify the wavelet transformation like Haar. What's the R code for wavelet decomposition which allows me to specify Haar wavelet transformation? Of course, if it can include db2, that is even better. In general, I want an R function like matlab code dwt. Thanks in advance! Zhen Li __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to install JGR manually?
On 10/16/09, c...@witthoft.com c...@witthoft.com wrote: It doesn't get that far. I launch jgr.exe and it announces that it can't connect to the internet, and shuts itself down. In R, does the following work? require(JGR) JGR() Liviu __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Frequencies, proportions cumulative proportions
Dear R-Helpers, I've looked high and low for a function that provides frequencies, proportions and cumulative proportions side-by-side. Below is the table I need. Is there a function that already does it? Thanks, Bob # Generate some test scores myValues - c(70:95) Score - ( sample( myValues, size=1000, replace=TRUE) ) head(Score) [1] 77 71 81 88 83 93 # Get frequencies proportions myTable - data.frame( table(Score) ) myTable$Prop - prop.table( myTable$Freq ) myTable$CumProp - cumsum( myTable$Prop ) # Print result myTable Score Freq Prop CumProp 1 70 44 0.044 0.044 2 71 42 0.042 0.086 3 72 40 0.040 0.126 4 73 40 0.040 0.166 5 74 43 0.043 0.209 6 75 45 0.045 0.254 7 76 46 0.046 0.300 8 77 40 0.040 0.340 9 78 46 0.046 0.386 1079 43 0.043 0.429 1180 37 0.037 0.466 1281 29 0.029 0.495 1382 33 0.033 0.528 1483 39 0.039 0.567 1584 31 0.031 0.598 1685 32 0.032 0.630 1786 31 0.031 0.661 1887 37 0.037 0.698 1988 30 0.030 0.728 2089 33 0.033 0.761 2190 43 0.043 0.804 2291 41 0.041 0.845 2392 37 0.037 0.882 2493 39 0.039 0.921 2594 42 0.042 0.963 2695 37 0.037 1.000 = Bob Muenchen (pronounced Min'-chen), Manager, Research Computing Support U of TN Office of Information Technology Stokely Management Center, Suite 200 916 Volunteer Blvd., Knoxville, TN 37996-0520 Voice: (865) 974-5230 FAX: (865) 974-8655 Please help us improve: http://oit.utk.edu/feedback Web: http://oit.utk.edu/research Map to Office: http://www.utk.edu/maps Newsletter: http://oit.utk.edu/research/news.php = __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Modify the plot from countCDFxt
Hello, I am running the plot from countCDFxt (popbio). I would like to report the y-axis as a percent instead of the log scale (e^01...). I can add an axis with axis(2, 0:1, line =2) but I'm having trouble understanding how to assign the tic marks (with 'at ='). I'd like to tell it to make tics with the percent value (or decimal if that's my only option) at the equivalent of e^-1, e^-2, etc, but the values don't seem to line up correctly. I'd also like to replace the provided axis with my axis instead of just placing my axis out a line but I'm unsure how to do that. library(popbio) logNE - log(erbr$NEast[-1]/erbr$NEast[-6]) countCDFxt(mean(logNE), var(logNE), nt=5, Nc=1317, Ne=200) axis(2, 0:1, labels = c(0.018, 0.050, 0.135, 0.368, 1), at = c(0.018, 0.050, 0.135, 0.368, 1), line =1) Year NEast NWest 2004 731 1732 2005 898 2004 2006 714 1130 2007 1748 1722 2008 1901 1661 2009 1317 1563 1. How do I tell axis() to report as a percent? 2. Why don't my values in label and at match the given axis? 3. How do I replace the axis label in countCDFxt? Thanks for any help, Michelle __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] [stats-rosuda-devel] how to install JGR manually?
On Oct 16, 2009, at 16:49 , Liviu Andronic wrote: On 10/16/09, c...@witthoft.com c...@witthoft.com wrote: It doesn't get that far. I launch jgr.exe and it announces that it can't connect to the internet, and shuts itself down. In R, does the following work? require(JGR) JGR() No. JGR() creates a start script on Unix machines, but Windows uses the jgr.exe launcher instead. Cheers, Simon __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Matrixes as data
Magnus Torfason-2 wrote: Hola! I am working on a problem where data points are (square) matrices. Is there a way to make a vector of matrices, such that it can be stored in a data.frame? I agree with previous posters that in most cases, you would want to store matrices in a list. However, if you already have a bunch of data in a data.frame, and really want your matrices to hang around with that data (for example for persistent storage purposes), you could use the serialize() function to convert them into a string. I have the following functions (object to character and character to object), that I have used for that exactly: # This function stores an R object as a character string. # Useful for storing in places that don't accept an R list # or other objects. otc - function(o) { return(rawToChar(serialize(o, NULL, ascii=TRUE))) } # This function accepts an object that has been serialized as # a character string and returns the original object. cto - function(c) { return(unserialize(charToRaw(c))) } # Demo code x - matrix(1:4,nrow=2) s - otc(x) d - data.frame(a=c,stringsAsFactors=FALSE) cto(d$a[1]) The alternative, if you're going to be working with these kinds of data *a lot*, is to create a class for the matrices+auxiliary data object (for S3, you can store this as a list of (list of matrices, data frames); for S4, store the list of matrices and the data frame as separate slots in the object). Then write accessor functions that make sense. -- View this message in context: http://www.nabble.com/Matrixes-as-data-tp25927469p25932504.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Frequencies, proportions cumulative proportions
On 16-Oct-09 20:51:06, Muenchen, Robert A (Bob) wrote: Dear R-Helpers, I've looked high and low for a function that provides frequencies, proportions and cumulative proportions side-by-side. Below is the table I need. Is there a function that already does it? Thanks, Bob # Generate some test scores myValues - c(70:95) Score - ( sample( myValues, size=1000, replace=TRUE) ) head(Score) [1] 77 71 81 88 83 93 # Get frequencies proportions myTable - data.frame( table(Score) ) myTable$Prop - prop.table( myTable$Freq ) myTable$CumProp - cumsum( myTable$Prop ) # Print result myTable Score Freq Prop CumProp 1 70 44 0.044 0.044 2 71 42 0.042 0.086 3 72 40 0.040 0.126 4 73 40 0.040 0.166 5 74 43 0.043 0.209 6 75 45 0.045 0.254 7 76 46 0.046 0.300 8 77 40 0.040 0.340 9 78 46 0.046 0.386 1079 43 0.043 0.429 1180 37 0.037 0.466 1281 29 0.029 0.495 1382 33 0.033 0.528 1483 39 0.039 0.567 1584 31 0.031 0.598 1685 32 0.032 0.630 1786 31 0.031 0.661 1887 37 0.037 0.698 1988 30 0.030 0.728 2089 33 0.033 0.761 2190 43 0.043 0.804 2291 41 0.041 0.845 2392 37 0.037 0.882 2493 39 0.039 0.921 2594 42 0.042 0.963 2695 37 0.037 1.000 I don't think there is a ready-made one, but it is very little effort to make your own: mkMyTable - function(X){ Table - data.frame( table(X) ) Table$Prop - prop.table( Table$Freq ) Table$CumProp - cumsum( Table$Prop ) Table } myTable - mkMyTable(Score) Hoping this helps! Ted. E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk Fax-to-email: +44 (0)870 094 0861 Date: 16-Oct-09 Time: 22:48:06 -- XFMail -- __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Proper syntax for using varConstPower in nlme
Hi Dieter, Thanks for the reply. I had played with the initial conditions, but apparently not enough. I finally found some that avoided the singularity issue. Many thanks. More generally, I went over the documentation yet again in PB and I still find it a bit confusing. They talk about using form = ~fitted(.) when discussing varPower, but the rest of the documentation seems to suggest that form = ~... should be used to indicate which covariate you assume the variance changes with. Could you or someone else provide some clarification? Thanks. Mike On Fri, 16 Oct 2009, Dieter Menne wrote: Michael A. Gilchrist wrote: - nlme(Count ~ quad.PBMC.model(aL, aN, T0), + data = tissueData, + weights = varConstPower(form =~ Count), + start = list( fixed = c(rep(1000, 8), -2, -2) ), + fixed = list(T0 ~ TypeTissue-1, aL ~ 1, aN ~ 1), + random = aL + aN ~ 1|Tissue + ) Error in MEestimate(nlmeSt, grpShrunk) : Singularity in backsolve at level 0, block 1 You could use varPower(form=~fitted()) (the default, also varPower()). In my experience this runs into some limitation quickly with nlme, because some boundary conditions make convergence fail. Try varPower(fixed = 0.5) first and play with the number. You should only use varConstPower when you have problems with values that cover a large range, coming close to zero, which could make varPower go havoc. Always do a plot of the result; the default plot gives you residual, and some indication how to proceed. Dieter -- View this message in context: http://www.nabble.com/Proper-syntax-for-using-varConstPower-in-nlme-tp25914277p25927578.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Matrixes as data
Create a matrix out of a list. In this example column 1 contains matrices, column 2 contains the number 1 and 2 and column 3 contains letters: L - matrix(list(m, m+10, 1, 2, a, b), 2); L [,1] [,2] [,3] [1,] Integer,4 1a [2,] Numeric,4 2b L[[2,1]] [,1] [,2] [1,] 11 13 [2,] 12 14 On Fri, Oct 16, 2009 at 12:18 PM, Kjetil Halvorsen kjetilbrinchmannhalvor...@gmail.com wrote: Thanks. The points of having the column of matrices (all the same dimension) in a data.frame, is that there are also other data, each matrix is at a location, so there are geographical coordinates and possibly other measurements at the same location. Kjetil On Fri, Oct 16, 2009 at 12:46 PM, Barry Rowlingson b.rowling...@lancaster.ac.uk wrote: On Fri, Oct 16, 2009 at 4:36 PM, Kjetil Halvorsen kjetilbrinchmannhalvor...@gmail.com wrote: Hola! I am working on a problem where data points are (square) matrices. Is there a way to make a vector of matrices, such that it can be stored in a data.frame? Can that be done with S3? or do I have to learn S4 objects methods? If the matrices are all the same size then you could store them in an array, which is essentially a 3 or more dimensional matrix. Otherwise, you can store them in a list, and get them by number: foo = list(matrix(1:9,3,3),matrix(1:16,4,4)) foo[[1]] foo[[2]] and so forth. You'll only need to create new object classes (with S3 or S4) if you want special behaviour of vectors of these things (such as plot(foo) doing something sensible). With S3 it's easy: class(foo)=squareMatrixVector plot.squareMatrixVector=function(x,y,...){ cat(ouch\n) } plot(foo) ouch Barry __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Frequencies, proportions cumulative proportions
Ted, I know how to do that. It's just such a standard display in SAS, SPSS and Stata that I figured someone had done it and I had just overlooked it. Thanks! Bob I don't think there is a ready-made one, but it is very little effort to make your own: mkMyTable - function(X){ Table - data.frame( table(X) ) Table$Prop - prop.table( Table$Freq ) Table$CumProp - cumsum( Table$Prop ) Table } myTable - mkMyTable(Score) Hoping this helps! Ted. E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk Fax-to-email: +44 (0)870 094 0861 Date: 16-Oct-09 Time: 22:48:06 -- XFMail -- __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Breusch-pagan and white test - check homoscedasticity
On Fri, 16 Oct 2009, Gautier RENAULT wrote: Hi r-programmers, I performe Breusch-Pagan tests (bptest in package lmtest) to check the homoscedasticity of the residuals from a linear model and I carry out carry out White's test via bptest (formula, ~ x * z + I(x^2) + I(z^2)) include all regressors and the squares/cross-products in the auxiliary regression. But what can I do if I want find coefficient and p-values of variables x, z, x*z, I(x^2), I(z^2) ? **I wish find out which is responsible of heteroscedasticity... To take a reproducible example (cigarette consumption from Baltagi's book): ## packages and data library(AER) data(CigarettesB) ## regression cig_lm2 - lm(packs ~ price + income, data = CigarettesB) ## White test bptest(cig_lm2, ~ income * price + I(income^2) + I(price^2), data = CigarettesB) The auxiliary regression that is used in this test cannot be extracted from bptest() but you can easily run it yourself by hand: ## auxiliary regression aux - residuals(cig_lm2)^2 - mean(residuals(cig_lm2)^2) aux_lm - lm(aux ~ income * price + I(income^2) + I(price^2), data = CigarettesB) The test statistic is then the n * R-squared: ## test statistic nrow(CigarettesB) * summary(aux_lm)$r.squared And then you can also look at the details of the auxiliary model: summary(aux_lm) However, this does not have to be very conclusive as in this particular example... hth, Z Can anyone help? thanking you in advance, Gautier RENAULT [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] RODBC sqlSave does not append the records to a DB2 table
I am running R version 2.9.2 on Windows XP OS with RODBC version Version: 1.3-0. Has anyone out there in the R user community successfully appended records to a DB2 table on a remote database using the sqlSave function in the RODBC package? (or by any other means from R?) I posed a similar question a few months ago and unfortunately, did not receive a response. I was hoping recent upgrades to our DB2 on the database, and I installed the current version RODBC. Unfortunately, it did not bring any joy. I asked the database adminstrator try it, and she had a similar experience. No error message is returned, but the record is not inserted to the table. For testing purposes, I have a very simple one-row, three-column data.frame (se2) I want to insert into a DB2 table. sqlSave(channel, se2, tablename = STORAGE.TEST_APPEND2, append = TRUE, + rownames = FALSE, colnames = FALSE, + verbose = TRUE, + safer = TRUE, addPK = FALSE, + fast = FALSE, test = FALSE, nastring = NULL) Query: INSERT INTO STORAGE.TEST_APPEND2 ( MACRONAME, MACROUSER, MACRO_RT ) VALUES ( 's_ej_mach_config_vz', 'jones2', 5 ) I don't get any error message, but when I check the table row count, the record has not been added to the table. Any suggestions for how to resolve are appreciated! Sincerely, Elaine McGovern Jones ISC Tape and DASD Storage Products Characterization and Failure Analysis Engineering Phone: 408 705-9588 Internal tieline: 587-9588 jon...@us.ibm.com [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] custom selfStart model works with getInitial but not nls
Hello, I'm having problems creating and using a selfStart model with nlme. Briefly, I've defined the model, a selfStart object, and then combined them to make a selfStart.default model. If I apply getInitial to the selfStart model, I get results. However, if I try usint it with nls or nlsList, these routines complain about a lack of initial conditions. If someone could point out what I'm doing wrong, I'd greatly appreciate it. Details: ## Nonlinear model I want to fit to the data const.PBMC.tcell.model - function(B0, t, aL, aN, T0){ Tb0 = B0; x = exp(-log(aL) + log(T0*aL+(-1+exp(t * aL))*Tb0 * aN) - t * aL); return(x); } ##Define selfStart routine const.PBMC.tcell.selfStart- function(mCall, LHS, data){ t0 = 0; t1 = 24; t2 = 48; ##Get B0 Value B0 = data[1, B0]; T0 = mean(data[data$Time==t0, Count]); T1 = mean(data[data$Time==t1, Count]); T2 = mean(data[data$Time==t2, Count]); if(T0 T2){ ##increase -- doesn't work stop(paste(Error in const.PBMC.tcell.start: T0 T2 for data: , data[1, ])); } ##Estimate aL based on exponential decline from t=0 to t=24 aLVal = -(log(T1) - log(T0))/(t1-t0); ##Estimate aNVal based on final value aNVal = aLVal*T2/B0; values = list(aLVal, aNVal, T0); names(values) - mCall[c(aL, aN, T0)]; #mimic syntax used by PB return(values) } ##Now create new model with selfStart attributes const.PBMC.tcell.modelSS- selfStart(model = const.PBMC.tcell.model, initial=const.PBMC.tcell.selfStart) ##Test routines using getInitial -- This works getInitial(Count ~ const.PBMC.tcell.modelSS(B0, Time,aL, aN, T0), data = tissueData) [1] 0.05720924 $aL [1] 0.05720924 $aN [1] 0.1981895 $T0 [1] 1360.292 ##Now try to use the SS model -- this doesn't work nls(Count ~ const.PBMC.tcell.modelSS(B0, Time,aL, aN, T0), data = tissueData) Error in numericDeriv(form[[3L]], names(ind), env) : Missing value or an infinity produced when evaluating the model In addition: Warning message: In nls(Count ~ const.PBMC.tcell.modelSS(B0, Time, aL, aN, T0), data = tissueData) : No starting values specified for some parameters. Intializing 'aL', 'aN', 'T0' to '1.'. Consider specifying 'start' or using a selfStart model __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Frequencies, proportions cumulative proportions
Muenchen, Robert A (Bob) wrote: Ted, I know how to do that. It's just such a standard display in SAS, SPSS and Stata that I figured someone had done it and I had just overlooked it. Thanks! Bob I don't think there is a ready-made one, but it is very little effort to make your own: mkMyTable - function(X){ Table - data.frame( table(X) ) Table$Prop - prop.table( Table$Freq ) Table$CumProp - cumsum( Table$Prop ) Table } myTable - mkMyTable(Score) Hoping this helps! Ted. I think CrossTable in gmodels does what Bob is after: CrossTable(gmodels) R Documentation Cross Tabulation with Tests for Factor Independence Description An implementation of a cross-tabulation function with output similar to S-Plus crosstabs() and SAS Proc Freq (or SPSS format) with Chi-square, Fisher and McNemar tests of the independence of all table factors. David Scott -- _ David Scott Department of Statistics The University of Auckland, PB 92019 Auckland 1142,NEW ZEALAND Phone: +64 9 923 5055, or +64 9 373 7599 ext 85055 Email: d.sc...@auckland.ac.nz, Fax: +64 9 373 7018 Director of Consulting, Department of Statistics __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Odp: : Question about correlation between data.
Petr PIKAL wrote: Hi r-help-boun...@r-project.org napsal dne 16.10.2009 15:24:05: hi everybody, I'm a student, and I'm new using R! I'm looking for statistical help hoping somebody can answer me! This is my problem: I have 2 temporal series. The firstone is a series of mesured data (height of monitorated points), the second is a series of temperature (in Celsius degree). Using Matlab I have built the two graphs (Measured Data - Time Temperature - Time). Looking those graphs I can surely say that there is a clear correlation beetween theme, and also that the measured data are surely influenced by the variations of temperature. Unfortunately my statistical knowledges are not that large so using R seems quite difficult to me. My question is: is there a code already written the can compare the 2 temporal series and can find the correlation between the data??? If the relationship is linear than lm(values~temperature, ...) shall suffice if it is nonlinear than you can look e.g. to ?nls And also: is there a code that can correct the Measured Data from the influence of temperature and return a clean data??? maybe ?predict. Regards Petr This sounds a little dangerous to me. Antonio is wanting to determine correlations between *time series* if I understand correctly. The time series need to be prewhitened or the correlations between successive observations modeled in some way. Just using lm can be very misleading because of the violation of the independence assumption. If Antonio does not understand these comments he needs to consult a local statistician. David Scott -- _ David Scott Department of Statistics The University of Auckland, PB 92019 Auckland 1142,NEW ZEALAND Phone: +64 9 923 5055, or +64 9 373 7599 ext 85055 Email: d.sc...@auckland.ac.nz, Fax: +64 9 373 7018 Director of Consulting, Department of Statistics __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Recommendation on a probability textbook (conditional probability)
I need to refresh my memory on Probability Theory, especially on conditional probability. In particular, I want to solve the following two problems. Can somebody point me some good books on Probability Theory? Thank you! 1. Z=X+Y, where X and Y are independent random variables and their distributions are known. Now, I want to compute E(X | Z = z). 2.Suppose that I have $I \times J$ random number in I by J cells. For the random number in the cell on the i'th row and the j's column, it follows Poisson distribution with the parameter $\mu_{ij}$. I want to compute P(n_{i1},n_{i2},...,n_{iJ} | \sum_{j=1}^J n_{ij}), which the probability distribution in a row conditioned on the row sum. Some book directly states that the conditional distribution is a multinomial distribution with parameters (p_{i1},p_{i2},...,p_{iJ}), where p_{ij} = \mu_{ij}/\sum_{j=1}^J \mu_{ij}. But I'm not sure how to derive it. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Recommendation on a probability textbook (conditional probability)
What's the title? On Fri, Oct 16, 2009 at 8:16 PM, Yi Du abraham...@gmail.com wrote: Hogg's book is enough for you considering your problems. Yi On Fri, Oct 16, 2009 at 7:12 PM, Peng Yu pengyu...@gmail.com wrote: I need to refresh my memory on Probability Theory, especially on conditional probability. In particular, I want to solve the following two problems. Can somebody point me some good books on Probability Theory? Thank you! 1. Z=X+Y, where X and Y are independent random variables and their distributions are known. Now, I want to compute E(X | Z = z). 2.Suppose that I have $I \times J$ random number in I by J cells. For the random number in the cell on the i'th row and the j's column, it follows Poisson distribution with the parameter $\mu_{ij}$. I want to compute P(n_{i1},n_{i2},...,n_{iJ} | \sum_{j=1}^J n_{ij}), which the probability distribution in a row conditioned on the row sum. Some book directly states that the conditional distribution is a multinomial distribution with parameters (p_{i1},p_{i2},...,p_{iJ}), where p_{ij} = \mu_{ij}/\sum_{j=1}^J \mu_{ij}. But I'm not sure how to derive it. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Yi Du __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Recommendation on a probability textbook (conditional probability)
On Fri, Oct 16, 2009 at 9:37 PM, Peng Yu pengyu...@gmail.com wrote: What's the title? Introduction to Probability. On Fri, Oct 16, 2009 at 8:16 PM, Yi Du abraham...@gmail.com wrote: Hogg's book is enough for you considering your problems. Yi On Fri, Oct 16, 2009 at 7:12 PM, Peng Yu pengyu...@gmail.com wrote: I need to refresh my memory on Probability Theory, especially on conditional probability. In particular, I want to solve the following two problems. Can somebody point me some good books on Probability Theory? Thank you! 1. Z=X+Y, where X and Y are independent random variables and their distributions are known. Now, I want to compute E(X | Z = z). 2.Suppose that I have $I \times J$ random number in I by J cells. For the random number in the cell on the i'th row and the j's column, it follows Poisson distribution with the parameter $\mu_{ij}$. I want to compute P(n_{i1},n_{i2},...,n_{iJ} | \sum_{j=1}^J n_{ij}), which the probability distribution in a row conditioned on the row sum. Some book directly states that the conditional distribution is a multinomial distribution with parameters (p_{i1},p_{i2},...,p_{iJ}), where p_{ij} = \mu_{ij}/\sum_{j=1}^J \mu_{ij}. But I'm not sure how to derive it. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Yi Du __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Ista Zahn Graduate student University of Rochester Department of Clinical and Social Psychology http://yourpsyche.org __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.