Re: [R] Removing duplicated rows within a matrix, with missing data as wildcards
Quoting Petr Pikal [EMAIL PROTECTED]:

> Hi
>
> It's a bit tricky, but
>
>     dup <- apply(x, 2, duplicated)   # which are duplicated
>     isna <- apply(x, 2, is.na)       # which are NA
>     check <- dup | isna              # which are either
>
> and here is your result
>
>     x[rowSums(check) != 3, ]
>          [,1] [,2] [,3]
>     [1,]    1    3    2
>     [2,]    2    1    3
>     [3,]    3    2   NA

Hi,

The above doesn't work. No need to have NAs in x:

    x <- matrix(c(2,2,1,3,2,3), ncol=2, byrow=TRUE)
    x
         [,1] [,2]
    [1,]    2    2
    [2,]    1    3
    [3,]    2    3
    dup <- apply(x, 2, duplicated)
    x[rowSums(dup) != 2, ]
         [,1] [,2]
    [1,]    2    2
    [2,]    1    3

Look at 'dup':

    dup
          [,1]  [,2]
    [1,] FALSE FALSE
    [2,] FALSE FALSE
    [3,]  TRUE  TRUE

Yes, each element in the last row is a duplicate in its own column, but this doesn't mean that the row as a whole is a duplicate.

Cheers,
H.

> Regards
> Petr
>
> On 8 Mar 2007 at 10:14, stacey thompson wrote:
>
> Date sent: Thu, 8 Mar 2007 10:14:37 -0500
> From: stacey thompson [EMAIL PROTECTED]
> To: r-help@stat.math.ethz.ch
> Subject: [R] Removing duplicated rows within a matrix, with missing data as wildcards
>
> > I'd like to remove duplicated rows within a matrix, with missing data
> > being treated as wildcards. For example
> >
> >     x <- matrix((1:3), 5, 3)
> >     x[4,2] = NA
> >     x[3,3] = NA
> >     x
> >          [,1] [,2] [,3]
> >     [1,]    1    3    2
> >     [2,]    2    1    3
> >     [3,]    3    2   NA
> >     [4,]    1   NA    2
> >     [5,]    2    1    3
> >
> > I would like to obtain
> >
> >          [,1] [,2] [,3]
> >     [1,]    1    3    2
> >     [2,]    2    1    3
> >     [3,]    3    2   NA
> >
> > From the R-help archives, I learned about unique(x) and duplicated(x).
> > However, unique(x) returns
> >
> >     unique(x)
> >          [,1] [,2] [,3]
> >     [1,]    1    3    2
> >     [2,]    2    1    3
> >     [3,]    3    2   NA
> >     [4,]    1   NA    2
> >
> > and duplicated(x) gives
> >
> >     duplicated(x)
> >     [1] FALSE FALSE FALSE FALSE  TRUE
> >
> > I have tried various na.action's, but with unique(x) I get errors at best, e.g.
> >
> >     unique(x, na.omit(x))
> >     Error: argument 'incomparables != FALSE' is not used (yet)
> >
> > How might I tackle this?
> >
> > Thanks,
> > -stacey
> >
> > --
> > -stacey lee thompson-
> > Stagiaire post-doctorale
> > Institut de recherche en biologie végétale
> > Université de Montréal
> > 4101 Sherbrooke Est
> > Montréal, Québec H1X 2B2 Canada
> > [EMAIL PROTECTED]

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
[R] pdf device bounding box?
I apologize if I don't fully understand your question, but the pdf device has a MediaBox, which is equivalent to the BoundingBox in an EPS file. The PDFs from R are defined nicely using height/width dimensions, and work well when embedded with pdflatex, etc. For example:

    pdf("test.pdf", height=3, width=3)
    plot(1:10)
    dev.off()

and view the (partially binary) output in your shell:

    less -N test.pdf

On line 117 of this file, I see

    /MediaBox [0 0 216 216]

which is a 3in by 3in box measured in PostScript points (72 points per inch).

I don't understand how you are mixing this in with the epstopdf command. If you want to make both a PDF and an EPS, my best advice is to do both directly from R (see ?postscript for EPS file generation; the same example as above will have

    %%BoundingBox: 0 0 216 216

on line 10), and your output for both formats should be clean, simple, and good enough for publishers and everyone else to use.

Just one caution: if you have a Windows computer and R 2.5.1 (which is most of us), make sure you write EPS files before loading up a PDF device (PR#9517).
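The MediaBox check above can also be done without leaving R. The following is only a sketch: it assumes a reasonably recent R in which pdf() accepts compress = FALSE, and it writes to a temporary file rather than test.pdf.

```r
# Write a 3in x 3in PDF and look for its MediaBox entry.
f <- tempfile(fileext = ".pdf")
pdf(f, width = 3, height = 3, compress = FALSE)  # uncompressed, so the file is mostly text
plot(1:10)
dev.off()

# The file still contains a few binary bytes, hence warn = FALSE and useBytes = TRUE.
pdf_lines <- readLines(f, warn = FALSE)
mediabox <- grep("/MediaBox", pdf_lines, value = TRUE, useBytes = TRUE)
mediabox

# 3 inches * 72 points per inch
3 * 72
```

The line number on which the entry appears depends on the plot, so searching for the key is more robust than counting lines.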
Re: [R] help with zicounts
Jaap:

> I have simulated data from a zero-inflated Poisson model, and would
> like to use a package like zicounts to test my code of fitting the
> model. My question is: can I use zicounts directly with the following
> simulated data?

I guess you can use zicounts, but personally I'm more familiar with zeroinfl() from package pscl (because I have written this function :)). With that you can easily do:

>     beta.true <- 1.0
>     gamma.true <- 1.0
>     n <- 1000
>     x <- matrix(rnorm(n), n, 1)
>     pi <- expit(x * beta.true)   # expit() is not in base R; plogis() is equivalent
>     mu <- exp(x * gamma.true)
>     y <- numeric(n)              # blank vector
>     z <- (runif(n) < pi)         # logical: T with prob p_i, F otherwise
>     y[z] <- rpois(sum(z), mu[z]) # draw y_i ~ Poisson(mu_i) where z_i = T
>     y[!z] <- 0                   # set y_i = 0 where z_i = F

    library(pscl)
    zeroinfl(y ~ 0 + x | 0 + x)

which by default fits a ZIP (with log link and logit inflation).

hth,
Z
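For checking one's own fitting code without any extra package, the ZIP likelihood can be maximised directly with optim(). This is only a sketch under stated assumptions: zip_nll is a hypothetical helper name, plogis() stands in for expit(), and the parametrisation mirrors zeroinfl(y ~ 0 + x | 0 + x), where the second coefficient models the probability of a structural zero.

```r
set.seed(1)
n <- 1000
x <- rnorm(n)
p_pois <- plogis(1.0 * x)          # probability that y_i comes from the Poisson part
mu <- exp(1.0 * x)
z <- runif(n) < p_pois
y <- numeric(n)
y[z] <- rpois(sum(z), mu[z])       # all other y_i stay 0

# Negative log-likelihood of a ZIP model with
#   log(mu_i) = x_i * par[1],  logit(P(structural zero)) = x_i * par[2]
zip_nll <- function(par, y, x) {
  mu <- exp(x * par[1])
  p0 <- plogis(x * par[2])
  ll <- ifelse(y == 0,
               log(p0 + (1 - p0) * exp(-mu)),
               log(1 - p0) + dpois(y, mu, log = TRUE))
  -sum(ll)
}

fit <- suppressWarnings(optim(c(0, 0), zip_nll, y = y, x = x, method = "BFGS"))
fit$par
```

Because the simulation draws the Poisson membership with probability plogis(x), the structural-zero probability is plogis(-x), so one would expect the two estimates to land near 1 and -1 respectively.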
[R] GLMM in lme4 and Tweedie dist.
Hi there,

I've been wanting to fit a GLMM and I'm not completely sure I'm doing things right. As I said in a previous message, my response variable is continuous with many zeros, so I was having a hard time finding an appropriate error distribution. I read some previous help mails given to other people advising them to use the Tweedie distribution. I'm still not sure if this would be appropriate for my data set, for I'm a beginner and really don't follow all the details.

So I ran a GLMM using this distribution. I ran it for several models to do model selection with AIC later. I used the following script, where the file GLMM_tweedie (line 2) has a list of all the models I want to run, each one in the form

    [ x=lmer(GGgiv ~ Rank_1 + Rank_diff + DAI + Gen_dy*Rank_diff + Gen_dy*DAI + Gen_dy + (1| D_1) + (1| D_2),
             family = tweedie(var.power=1,link.power=0), offset=log(Dt), data=data) ]

    data=read.csv(file="GLMM_data.csv")
    models <- read.table("GLMM_tweedie.txt", sep="\t")
    data$Ggrec_Dtlog = log(data$Ggrec_Dt+1)
    models <- as.vector(models[,1])
    totres=c()
    for (i in 1:79) {
      model=models[i]
      res=eval(parse(text=model))
      res=AIC(logLik(x))
      res=as.vector(res)
      totres=rbind(totres,res)
    }

The output would then be just a list of the AIC of each model. For one of the models (the one in the [] above) I'm getting the following error message, and I don't know what it means:

    CHOLMOD warning: matrix not positive definite
    Error in objective(.par, ...) : Cholmod error `matrix not positive definite' at file:../Supernodal/t_cholmod_super_numeric.c, line 614

Could anybody give me some advice on using Tweedie distributions, and does anybody have an idea what this error message means?

Thanks a lot in advance,
Cheers,
Cristina.
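The loop above (fit every candidate model, keep its AIC) can be written a little more compactly by storing formulas rather than whole calls. The sketch below is an illustration only: it swaps lmer() with a Tweedie family for plain glm() on the built-in mtcars data so that it runs without extra packages, and the candidate formulas are made up.

```r
# Candidate models stored as formula strings (hypothetical examples).
candidates <- c(
  m1 = "mpg ~ wt",
  m2 = "mpg ~ wt + hp",
  m3 = "mpg ~ wt * hp"
)

# Fit each model and collect its AIC in a named numeric vector.
aic <- vapply(candidates, function(f) {
  AIC(glm(as.formula(f), family = gaussian(), data = mtcars))
}, numeric(1))

aic[order(aic)]   # smallest AIC first
```

Wrapping the fit in tryCatch() would let the loop continue past a model that fails, such as the one producing the Cholmod error, while recording NA for it.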
Re: [R] dendrogram / clusteranalysis plotting
On Fri, 2007-03-09 at 01:01 +0100, bunny , lautloscrew.com wrote:
> Dear all,
>
> i performed a cluster analysis - which worked so far...
> i plotted the dendrogram and there are sooo many branches, a rough sketch
> would be enough ;)
> i tried max.level therefore, which worked, but not for the plot...

(re-)read ?dendrogram. The function cut.dendrogram() can prune a tree's lower branches. You can plot the returned object's $upper component, which is itself an object of class dendrogram. There is an example in ?dendrogram of using cut.

HTH

G

> i used the following
>
>     plot(hcd, nodePar = nP, str(hcd, max.level = 1))
>
> the output on the terminal was:
>
>     --[dendrogram w/ 2 branches and 196 members at h = 2.70]
>       |--[dendrogram w/ 2 branches and 34 members at h = 1.79] ..
>       `--[dendrogram w/ 2 branches and 162 members at h = 1.95] ..
>
> which is great! but i can't get it done for the plot, the plot always
> shows all the branches...!
>
> does anybody know how to fix this one?
>
> thx in advance
>
> -m.

--
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Gavin Simpson                     [t] +44 (0)20 7679 0522
ECRC                              [f] +44 (0)20 7679 0565
UCL Department of Geography
Pearson Building                  [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street
London, UK                        [w] http://www.ucl.ac.uk/~ucfagls/
WC1E 6BT                          [w] http://www.freshwaters.org.uk/
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
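As a small illustration of the pruning described above, here is a sketch on the built-in USArrests data; the cut height of 150 is an arbitrary choice for this data set and not something taken from the original poster's tree.

```r
# Build a dendrogram and prune it at a chosen height.
hcd <- as.dendrogram(hclust(dist(USArrests)))
parts <- cut(hcd, h = 150)

class(parts$upper)      # the truncated tree is itself a dendrogram
length(parts$lower)     # number of branches that were cut off

# Plot only the truncated tree; each leaf stands for one pruned branch.
plot(parts$upper)
```

Each element of parts$lower is again a dendrogram, so an individual branch can still be plotted on its own when more detail is needed.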
Re: [R] how can i group branches of a dendrogram
On Fri, 2007-03-09 at 02:00 +0100, bunny , lautloscrew.com wrote:
> Hi all,
>
> how can i group branches of a dendrogram ?

Err... you'll need to give us more than that to go on. What do you mean by group? Draw a marker round broad clusters, or prune them? Or something else?

I just replied with an answer that deals with pruning back objects of class dendrogram, but if this is not what you mean in this mail, reply with an example of what you tried and a description of exactly what you want to do, and maybe someone can help.

G

> thx in advance

--
Gavin Simpson
Re: [R] Removing duplicated rows within a matrix, with missing data as wildcards
Hi again,

Your problem as you formulated it is not clearly defined. For example, what do you want to do with this matrix:

    x <- matrix(c(1, NA, 3, NA, 2, 3), ncol=3, byrow=TRUE)
    x
         [,1] [,2] [,3]
    [1,]    1   NA    3
    [2,]   NA    2    3

Remove row 1, row 2 or nothing?

Maybe you want to proceed in 2 steps:
  (1) remove strictly duplicated rows
  (2) remove rows with at least 1 NA that match a row with no NAs

In this case you would not remove any row from x.

The removeLooseDupRows() function below does (2) only. If you want (1) and (2), you need to combine it with unique() by doing either removeLooseDupRows(unique(x)) or unique(removeLooseDupRows(x)) (both should always give the same result).

    removeLooseDupRows <- function(x)
    {
        if (nrow(x) <= 1)
            return(x)
        # sort the rows by column 1, then column 2, etc.
        ii <- do.call(order, args=lapply(seq_len(ncol(x)), function(col) x[ , col]))
        dup_index <- logical(nrow(x))
        i0 <- -1   # index of the most recent row without NAs
        for (k in 1:length(ii)) {
            i <- ii[k]
            if (any(is.na(x[i, ]))) {
                if (i0 == -1)
                    next
                if (any(x[i, ] != x[i0, ], na.rm=TRUE))
                    next
                dup_index[i] <- TRUE
            } else {
                i0 <- i
            }
        }
        x[!dup_index, ]
    }

    x <- matrix((1:3), 5, 3)
    x[4,2] = NA
    x[3,3] = NA
    x
         [,1] [,2] [,3]
    [1,]    1    3    2
    [2,]    2    1    3
    [3,]    3    2   NA
    [4,]    1   NA    2
    [5,]    2    1    3
    removeLooseDupRows(x)
         [,1] [,2] [,3]
    [1,]    1    3    2
    [2,]    2    1    3
    [3,]    3    2   NA
    [4,]    2    1    3
    removeLooseDupRows(unique(x))
         [,1] [,2] [,3]
    [1,]    1    3    2
    [2,]    2    1    3
    [3,]    3    2   NA

Cheers,
H.
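Steps (1) and (2) above can also be sketched without sorting, by comparing every row that contains NAs against every complete row. This is not the function from the message: remove_wildcard_dups is a hypothetical name, and the approach does a full pairwise comparison, so it is only meant for matrices of moderate size.

```r
remove_wildcard_dups <- function(x) {
  x <- unique(x)                                    # step (1): strict duplicates
  complete <- x[complete.cases(x), , drop = FALSE]  # rows without any NA
  has_na <- !complete.cases(x)

  # TRUE if 'row' agrees with some complete row wherever 'row' is not NA
  matches_complete <- function(row) {
    any(apply(complete, 1, function(ref) all(is.na(row) | row == ref)))
  }

  drop <- has_na & apply(x, 1, matches_complete)    # step (2)
  x[!drop, , drop = FALSE]
}

x <- matrix((1:3), 5, 3)
x[4, 2] <- NA
x[3, 3] <- NA
res <- remove_wildcard_dups(x)
res
```

Rows with NAs are only ever compared against complete rows, so an ambiguous pair such as (1, NA, 3) and (NA, 2, 3) is left untouched, in line with the two-step definition.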
[R] Problem with ci.lmer() in package:gmodels
Dear Friends,

Please note that in the following, CI lower > CI upper:

    require(lme4)
    require(gmodels)
    fm2 <- lmer(Reaction ~ Days + (1|Subject) + (0+Days|Subject), sleepstudy)
    ci(fm2)
                 Estimate  CI lower   CI upper Std. Error p-value
    (Intercept) 251.66693 266.06895 238.630280   7.056447       0
    Days         10.52773  13.63372   7.389946   1.646900       0

_____________________________
Professor Michael Kubovy
University of Virginia
Department of Psychology
USPS:     P.O.Box 400400    Charlottesville, VA 22904-4400
Parcels:  Room 102 Gilmer Hall, McCormick Road    Charlottesville, VA 22903
Office:   B011    +1-434-982-4729
Lab:      B019    +1-434-982-4751
Fax:      +1-434-982-4766
WWW:      http://www.people.virginia.edu/~mk9y/
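A quick plausibility check on the reported table is to compute rough Wald intervals from the estimates and standard errors and compare them with the two CI columns. This is only a sketch: ci() may use a different method, so the numbers are not expected to agree exactly, only to show which bound is which.

```r
est <- c("(Intercept)" = 251.66693, Days = 10.52773)
se  <- c("(Intercept)" = 7.056447,  Days = 1.646900)

# Rough 95% Wald intervals: estimate -/+ 1.96 * standard error
wald <- cbind(lower = est - qnorm(0.975) * se,
              upper = est + qnorm(0.975) * se)

# The values as printed by ci(fm2) in the message above
reported <- cbind(lower = c(266.06895, 13.63372),
                  upper = c(238.630280, 7.389946))
rownames(reported) <- names(est)

wald
reported[, "lower"] > reported[, "upper"]   # TRUE for both rows
```

The Wald bounds sit close to the reported values with the labels exchanged, which suggests the two columns are simply swapped in the output rather than miscomputed.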
[R] Is the gmodels package being maintained?
Dear r-helpers,

I sent a cc of a recent message about a problem with ci.lmer() in the gmodels package to the author (Gregory R Warnes), and the message bounced. If the author or someone else is maintaining this package or this function, would you kindly supplement the author's name and/or address with a current maintainer and/or provide a current email address?

_____________________________
Professor Michael Kubovy
University of Virginia
Department of Psychology
[R] dendrogram again
Hi all,

ok, i know i can cut a dendrogram, which i did. all i get is three objects that are dendrograms themselves, for example: myd$upper, myd$lower[[1]], myd$lower[[2]] and so on. of course i can plot them separately now, but the lower parts still have hundreds of branches. i'll need a 30" widescreen to watch the whole picture.

what i'd like to do is group the lower branches, so that i get a dendrogram with a few branches, splitting only in the upper levels. In terms of the cluster analysis, i just want to have a few bigger clusters.

thx,

m.

P.S.: putting parts of a cut dendrogram back together into one could be an idea? is it somehow possible?
Re: [R] Off topic:Spam on R-help increase?
I figured it wasn't only happening to me... :) hmm, did the spam filter fail?

> Folks:
>
> In the past 2 days I have seen a large increase of spam getting into
> R-help. Are others experiencing this problem? If so, has there been some
> change to the spam filters on the R-servers? If not, is the problem on my
> end?
>
> Feel free to reply privately.
>
> Thanks.
>
> Bert Gunter
> Genentech Nonclinical Statistics
> South San Francisco, CA 94404
> 650-467-7374
[R] Installing R on Ubuntu 6.10 via apt-get
Hi

I'm using Linux Ubuntu 6.10 on a Pentium D 2.8.

Well, following http://cran.r-project.org/bin/linux/ubuntu/README I wrote in the sources.list

    # R
    deb http://CRAN.R-project.org/bin/linux/ubuntu edgy/
    deb http://www.vps.fmvz.usp.br/CRAN/bin/linux/ubuntu edgy/

But after typing apt-get update I got

    Falha ao baixar http://www.vps.fmvz.usp.br/CRAN/bin/linux/ubuntu/edgy/Release  Unable to find expected entry Packages in Meta-index file (malformed Release file?)
    Falha ao baixar http://CRAN.R-project.org/bin/linux/ubuntu/edgy/Release  Unable to find expected entry Packages in Meta-index file (malformed Release file?)
    W: Conflicting distribution: http://www.vps.fmvz.usp.br edgy/ Release (expected edgy but got )
    W: Conflicting distribution: http://CRAN.R-project.org edgy/ Release (expected edgy but got )

(PS. Falha ao baixar = Failed to download)

The key of Vincent Goulet seems to be OK. Am I doing something wrong, or is there really a problem with the Release file?

Many thanks!

Antonio Olinto
Re: [R] Installing R on Ubuntu 6.10 via apt-get
Hi Antonio

Look at this: http://help.nceas.ucsb.edu/index.php/Installing_R_on_Ubuntu

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O

On 09/03/07, Antonio Olinto [EMAIL PROTECTED] wrote:
> Hi
>
> I'm using Linux Ubuntu 6.10 on a Pentium D 2.8.
> [...]
> Am I doing something wrong, or is there really a problem with the Release file?
Re: [R] Off topic:Spam on R-help increase?
Horacio Castellini wrote:
> I figured it wasn't only happening to me... :) hmm, did the spam filter fail?

Yes, it is everyone who reads R-news, but I wonder, what is the word for spam in Spanish?

Jim
[R] Deconvolution of a spectrum
Dear useRs,

I have a curve which is a mixture of Gaussian curves (for example a UV emission or absorption spectrum). Do you have any suggestions on how to search for an optimal set of Gaussian peaks to fit the curve? I know that it is a very complex problem, but maybe there is a way to do it?

My first idea is to use nls() with a very large model function and compare AIC values, but it is very difficult to suggest any starting points for the algorithm.

Searching Google, I have found only descriptions of commercial software for doing such deconvolution (Origin, PeakFit), without any information about the algorithms used. No ready-to-use function in any language.

I have tried an Mclust workaround for this problem: generating a large dataset for which the spectrum is a histogram and feeding it into Mclust. The results seem sensible, but this is a very ugly and imprecise method.

Thanks for any help,

Luke
Re: [R] Deconvolution of a spectrum
On Fri, Mar 09, 2007 at 01:25:24PM +0100, Lukasz Komsta wrote:
> I have a curve which is a mixture of Gaussian curves (for example UV
> emission or absorption spectrum). Do you have any suggestions how to
> implement searching for optimal set of Gaussian peaks to fit the curve?
> [...]
> First supposement is to use a nls() with very large functions, and
> compare AIC value, but it is very difficult to suggest any starting
> points for algorithm.

I would try `nls'. We have used `nls' for fitting magnetic resonance spectra consisting of <=~ 10 gaussian peaks. This works OK if the input data are reasonable (not too noisy, peak amplitudes above noise level, peak distance not unreasonably smaller than peak width, i.e. peak overlap such that peaks are still more or less identifiable visually).

Of course you must invest effort in getting the start values (automatically or manually) right. If your data are good, you might get good start values for the positions (the means of the gaussians) with an approach that was floating around the r-help list in 11/2005, which I adapted as follows:

    peaks <- function (series, span = 3, what = c("max", "min"), do.pad = TRUE,
                       add.to.plot = FALSE, ...)
    {
        if ((span <- as.integer(span)) %% 2 != 1)
            stop("'span' must be odd")
        if (!is.numeric(series))
            stop("`peaks' needs numeric input")
        what <- match.arg(what)
        if (is.null(dim(series)) || min(dim(series)) == 1) {
            series <- as.numeric(series)
            x <- seq(along = series)
            y <- series
        } else if (nrow(series) == 2) {
            x <- series[1, ]
            y <- series[2, ]
        } else if (ncol(series) == 2) {
            x <- series[, 1]
            y <- series[, 2]
        }
        if (span == 1)
            return(list(x = x, y = y, pos = rep(TRUE, length(y)),
                        span = span, what = what, do.pad = do.pad))
        if (what == "min")
            z <- embed(-y, span)
        else
            z <- embed(y, span)
        s <- span %/% 2
        s1 <- s + 1
        # a point is a peak if it is the first maximum within its window
        v <- max.col(z, "first") == s1
        if (do.pad) {
            pad <- rep(FALSE, s)
            v <- c(pad, v, pad)
            idx <- v
        } else
            idx <- c(rep(FALSE, s), v)
        val <- list(x = x[idx], y = y[idx], pos = v, span = span,
                    what = what, do.pad = do.pad)
        if (add.to.plot == TRUE)
            points(val, ...)
        val
    }

This looks for local maxima in the vector (y-values) or 2-dim array (x/y-matrix) `series' in a neighborhood of each point defined by `span'. If you first plot your data and then call the above on the data with 'add.to.plot = TRUE', the results of the peak search are added to your plot (and you can modify this plotting via the `...' argument).

Maybe this works for your data to get the peak position estimates (and the amplitudes in the next step) right. Frequently the standard deviation estimates can be set to some fixed value for any given experiment.

And of course distant parts of your spectrum won't have anything to do with each other, so you can split up the fitting to help `nls' along a bit.

joerg
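To make the workflow above concrete, here is a sketch on simulated data: local maxima supply start values for the positions and amplitudes, the widths start at a fixed guess, and nls() does the rest. Everything specific in it is an assumption for illustration (the gauss() helper, two peaks, the noise level, a window of 31 points, a start width of 0.5); real spectra will need their own choices.

```r
set.seed(42)
gauss <- function(x, a, m, s) a * exp(-(x - m)^2 / (2 * s^2))

# Simulated spectrum: two overlapping-ish Gaussian peaks plus noise
d <- data.frame(x = seq(0, 10, length.out = 400))
d$y <- gauss(d$x, 5, 3, 0.6) + gauss(d$x, 3, 6.5, 0.8) +
  rnorm(nrow(d), sd = 0.05)

# Local maxima within a sliding window (same embed/max.col idea as peaks())
span <- 31
s <- span %/% 2
cand <- which(max.col(embed(d$y, span), "first") == s + 1) + s

# Keep the two highest candidates; baseline noise also yields small maxima
top <- cand[order(d$y[cand], decreasing = TRUE)][1:2]

fit <- nls(y ~ gauss(x, a1, m1, s1) + gauss(x, a2, m2, s2), data = d,
           start = list(a1 = d$y[top[1]], m1 = d$x[top[1]], s1 = 0.5,
                        a2 = d$y[top[2]], m2 = d$x[top[2]], s2 = 0.5))
round(coef(fit), 2)
```

With more peaks the same pattern applies, but the number of candidates to keep then has to come from inspection of the plot or from a noise threshold.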
Re: [R] Is the gmodels package being maintained?
Michael Kubovy wrote:
> Dear r-helpers,
>
> I sent a cc of a recent message about a problem with ci.lmer() in the
> gmodels package to the author (Gregory R Warnes), and the message
> bounced. If the author or someone else is maintaining this package or
> this function, would you kindly supplement the author's name and/or
> address with a current maintainer and/or provide a current email
> address?

Haven't heard that Greg should be out of circulation. You might try the address from his homepage: [EMAIL PROTECTED]

--
Peter Dalgaard, Dept. of Biostatistics, University of Copenhagen
Øster Farimagsgade 5, Entr.B, PO Box 2099, 1014 Cph. K, Denmark
Ph: (+45) 35327918   FAX: (+45) 35327907   ([EMAIL PROTECTED])
Re: [R] dendrogram again
On Fri, 2007-03-09 at 12:17 +0100, bunny , lautloscrew.com wrote:
> Hi all,
>
> ok, i know i can cut a dendrogram, which i did. all i get is three
> objects that a dendrograms itself. for example: myd$upper,
> myd$lower[[1]], myd$lower[[2]] and so on. of course i can plot them
> seperately now. but the lower parts still have hundreds of branches.
> [...]
> In terms of the cluster analysis, i just want to have a few bigger
> clusters.

Again, perhaps I'm missing something, but if I understand you correctly (again no example I can follow: what is myd and how did you create it?), you only want to plot the upper part of the dendrogram and not the lower branches.

If so, then this /is/ on ?dendrogram and you /do/ use cut() to do it:

    'cut.dendrogram()' returns a list with components '$upper' and
    '$lower', the first is a truncated version of the original tree,
    also of class 'dendrogram', the latter a list with the branches
    obtained from cutting the tree, each a 'dendrogram'.

So to only show the pruned tree, you just plot $upper. It does say that $upper is a dendrogram and that it is the truncated version of the original tree, which is what I understand you to be asking for.

This example shows it in action (this is what I mean by a reproducible example; I'm using package vegan as I am familiar with this data set):

    require(vegan)  ## if FALSE, install it
    data(varespec)
    hc <- hclust(vegdist(varespec, "bray"), method = "ward")
    hc <- as.dendrogram(hc)
    ## this is the full dendrogram - too many nodes, so prune
    plot(hc)
    ## let's take four clusters and prune it back
    hc.pruned <- cut(hc, h = 1)  # can't specify k, so read the height off the
                                 # first plot - cutting at h = 1 gives 4 clusters
    ## plot only the upper part of the tree, showing only the 4 clusters
    plot(hc.pruned$upper, center = TRUE)

Is this what you want? If not, using the example I provide above, tell us exactly what you want to achieve.

HTH

G

--
Gavin Simpson
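Since cut() on a dendrogram takes a height rather than a number of clusters, one way around reading the height off the plot is to derive it from the merge heights stored in the hclust object. The following is only a sketch on the built-in USArrests data with average linkage; it assumes a linkage method whose merge heights are non-decreasing and distinct around the cut.

```r
hc <- hclust(dist(USArrests), method = "average")
k <- 4                      # desired number of clusters
n <- length(hc$order)       # number of observations

# hc$height holds the n - 1 merge heights in increasing order; any height
# between merge n - k and merge n - k + 1 leaves exactly k clusters.
h <- mean(hc$height[c(n - k, n - k + 1)])

pruned <- cut(as.dendrogram(hc), h = h)
length(pruned$lower)        # number of pruned branches
table(cutree(hc, k = k))    # cluster sizes for comparison

plot(pruned$upper, center = TRUE)
```

cutree() gives the cluster memberships directly, which is often all that is needed when the goal is a few bigger clusters rather than a picture of the tree.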
[R] R and clinical studies
Does anyone know if for clinical studies the FDA would accept statistical analyses performed with R?

Delphine Fontaine
Re: [R] color key of heatmap.2
XinMeng wrote:
> Hi all:
>
> The color key of heatmap.2 is as follows if I use the redgreen style:
>   low level: red
>   high level: green
>
> And what I want is:
>   low level: green
>   high level: red

?colorpanel

Best,

Jim

> How can I do it then?
>
> Thanks a lot for your help!
>
> My best!

--
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623

**
Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues.
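For readers without the help page at hand, the idea is simply to build a palette that runs from green at the low end to red at the high end and pass it as the col argument. The sketch below uses base R's colorRampPalette() instead of colorpanel() so that it runs without gplots; the number of colours (75) and the black midpoint are arbitrary choices.

```r
# Palette running green (low) -> black -> red (high)
green_to_red <- colorRampPalette(c("green", "black", "red"))(75)

head(green_to_red, 1)   # colour used for the lowest values
tail(green_to_red, 1)   # colour used for the highest values

# Assumed usage with gplots (not run here):
# library(gplots)
# heatmap.2(as.matrix(mtcars), col = green_to_red)
```

The same vector works with base heatmap() or image(), since both accept a col argument as well.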
Re: [R] Is the gmodels package being maintained?
Hi,

Finding his email address was not immediate, but I finally did, and did bring the problem to Greg's attention @ rochester, and the message didn't bounce this time.

On Mar 9, 2007, at 7:43 AM, Peter Dalgaard wrote:
> Michael Kubovy wrote:
> > I sent a cc of a recent message about a problem with ci.lmer() in the
> > gmodels package to the author (Gregory R Warnes), and the message
> > bounced. [...]
>
> Haven't heard that Greg should be out of circulation. You might try the
> address from his homepage: [EMAIL PROTECTED]

_____________________________
Professor Michael Kubovy
University of Virginia
Department of Psychology
Re: [R] Removing duplicated rows within a matrix, with missing data as wildcards
Hi H.,

Your response has improved the clarity of my thinking. Kind thanks. Also, your use of seq_len() prompted me to update from R version 2.3.1 on this machine.

For your matrix

> x <- matrix(c(1, NA, 3, NA, 2, 3), ncol=3, byrow=TRUE)
> x
     [,1] [,2] [,3]
[1,]    1   NA    3
[2,]   NA    2    3

I would want to delete either x[1,] or x[2,] but not both. Practically, your removeLooseDupRows(x)

removeLooseDupRows <- function(x)
{
    if (nrow(x) <= 1)
        return(x)
    ii <- do.call(order, args=lapply(seq_len(ncol(x)), function(col) x[ , col]))
    dup_index <- logical(nrow(x))
    i0 <- -1
    for (k in 1:length(ii)) {
        i <- ii[k]
        if (any(is.na(x[i, ]))) {
            if (i0 == -1)
                next
            if (any(x[i, ] != x[i0, ], na.rm=TRUE))
                next
            dup_index[i] <- TRUE
        } else {
            i0 <- i
        }
    }
    x[!dup_index, ]
}

should leave no such ambiguous cases for my data, as nrow(x) is very high with few NA in each x. For example, a row of (1, 2, 3) is very likely to exist in my data.

However, to find the row numbers of any remaining ambiguous matches, should they exist, using the example:

> x <- matrix(c(1, NA, 3, NA, 2, 3, 1, 3, 2, 2, 1, 3, 1, NA, 2, 2, 1, 3), ncol=3, byrow=TRUE)
> x
     [,1] [,2] [,3]
[1,]    1   NA    3
[2,]   NA    2    3
[3,]    1    3    2
[4,]    2    1    3
[5,]    1   NA    2
[6,]    2    1    3

after your suggested

> removeLooseDupRows(x)
     [,1] [,2] [,3]
[1,]    1   NA    3
[2,]   NA    2    3
[3,]    1    3    2
[4,]    2    1    3
[5,]    2    1    3
> q <- removeLooseDupRows(unique(x))
> q
     [,1] [,2] [,3]
[1,]    1   NA    3
[2,]   NA    2    3
[3,]    1    3    2
[4,]    2    1    3

I could

# ambiguous matches in matrix form
> apply(q, 1, function(row1) apply(q, 1, function(row2) all(is.na(row1) | is.na(row2) | row1==row2)))
      [,1]  [,2]  [,3]  [,4]
[1,]  TRUE  TRUE FALSE FALSE
[2,]  TRUE  TRUE FALSE FALSE
[3,] FALSE FALSE  TRUE FALSE
[4,] FALSE FALSE FALSE  TRUE

# indices of ambiguous matches
> m <- which(apply(q, 1, function(row1) apply(q, 1, function(row2) all(is.na(row1) | is.na(row2) | row1==row2))), arr=T)
> m
     row col
[1,]   1   1
[2,]   2   1
[3,]   1   2
[4,]   2   2
[5,]   3   3
[6,]   4   4

# put in order and omit duplicates
> m2 <- unique(t(apply(m, 1, sort)))
> m2
     [,1] [,2]
[1,]    1    1
[2,]    1    2
[3,]    2    2
[4,]    3    3
[5,]    4    4

# show the ambiguous matches
> m2[m2[,1]!=m2[,2], drop=F]
[1] 1 2

...and proceed from there. This solution came from another helpful R-help respondent to my poorly-defined problem.

Appreciative thanks to everyone for your instructive help.

Cheers, stacey

-- -stacey lee thompson- Stagiaire post-doctorale Institut de recherche en biologie végétale Université de Montréal 4101 Sherbrooke Est Montréal, Québec H1X 2B2 Canada [EMAIL PROTECTED]
[R] autoload libraries at startup
Hi All I was wondering if there is a way I can specify in R that it should load libraries automatically at startup, so that I do not have to manually issue the command. Thanks Toby __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] autoload libraries at startup
Hi, I do not know if this is the best way, but have a look at .Rprofile, a text file that lives in the R root directory and is executed at startup. You could put library() commands in that. See ?Startup for more information. Regards JS --- John Seers Institute of Food Research Norwich Research Park Colney Norwich NR4 7UA tel +44 (0)1603 251497 fax +44 (0)1603 507723 e-mail [EMAIL PROTECTED] e-disclaimer at http://www.ifr.ac.uk/edisclaimer/ Web sites: www.ifr.ac.uk www.foodandhealthnetwork.com
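A minimal sketch of what such a startup file could contain. The package names below are placeholders chosen only because they ship with R, and the file location is an assumption: per ?Startup, R looks for a user .Rprofile in the current working directory and then in the user's home directory, while the site-wide file is Rprofile.site under R_HOME/etc.

```r
# Hypothetical contents of a user-level .Rprofile.
# Replace the placeholder names in 'pkgs' with the packages you actually use.
local({
  pkgs <- c("stats", "utils")   # placeholders, not a recommendation
  for (p in pkgs) {
    # character.only = TRUE lets library() take the name from a variable
    suppressPackageStartupMessages(library(p, character.only = TRUE))
  }
})
```

Wrapping the code in local() keeps the helper variables out of the workspace. ?Startup also describes the defaultPackages option as another way to control what is attached at startup.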
[R] understanding print.summary.lm and perhaps print/show in general
I'm trying to understand how R prints summary.lm objects and trying to change it slightly for a summary function that calculates standard errors using an alternative method. I've found that I can modify a summary.lm object and then it prints the modified way, but I want to change a few things in the print method that I think I might just be able to do. One is that I want the coefficients table to print a different header (other than Std. Error). I've tried changing the column names of the summary$coef matrix, and this works for calls to printCoefmat, but it still prints out Std. Error when I pass the summary.lm to the command line by itself. I don't understand this behavior. When I do this (enter an object on the command line by itself), does it then call the print / show method associated with that object's class, in this case, summary.lm? Below is some sample code to reproduce the behavior I don't understand and a comment regarding the result I don't understand.

Cheers, Paul

#
lma <- lm(dist ~ speed, data=cars)
suma <- summary(lma)
colnames(suma$coef) <- c(LETTERS[1:4])
printCoefmat(suma$coef) # prints what I expect
suma
# the above is the print behavior my question regards:
# why does the coefficients matrix have in its header
# the usual Estimate Std. Error t value Pr(>|t|)
# I expect A B C D as above in the call to printCoefmat
Re: [R] understanding print.summary.lm and perhaps print/show in general
Paul,

Usually summary methods perform some computations if needed and then change the class of the original object so that a print method can be called for the new summary object. In this case, this is done at the end of the summary.lm method:

    ...
    if (!is.null(z$na.action))
        ans$na.action <- z$na.action
    class(ans) <- "summary.lm"
    ans
}

So print.summary.lm then does all the work of displaying the summary.lm object. To see that function, do

getAnywhere(print.summary.lm)

You can then modify that function as needed.

-Christos

Christos Hatzis, Ph.D. Nuvera Biosciences, Inc. 400 West Cummings Park Suite 5350 Woburn, MA 01801 Tel: 781-938-3830 www.nuverabio.com
[R] convert pixels into axis coordinates in R
Dear R users, I've two questions: 1) Does anybody have a clue how to convert pixels from a JPEG graphic (e.g. something like a square of 100x100 px) into axis coordinate values in R? 2) Is there any possibility to extend the locator function in such a way that locator() outputs all coordinates from a plot at once, without clicking on the graph? Thanks for any hints. Regards, P. Stencel
Re: [R] Removing duplicated rows within a matrix, with missing data as wildcards
you could also try something like the following:

x <- matrix(c(1, NA, 3, NA, 2, 3, 1, 3, 2, 2, 1, 3, 1, NA, 2, 2, 1, 3), ncol=3, byrow=TRUE)
wildcardVals <- 1:3 # possible wildcard values
ind <- complete.cases(x)
nc <- ncol(x)
nr <- nrow(x[ind, ])
nwld <- length(wildcardVals)
posb <- apply(x[!ind, , drop = FALSE], 1, function(y){
    out <- matrix(y, nwld, nc, by = TRUE)
    out[, is.na(y)] <- wildcardVals
    t(out)
})
posb <- matrix(c(posb), ncol = nc, by = TRUE)
keep.ind <- duplicated(rbind(x[ind, ], posb))
keep.ind[-(1:nr)] <- apply(matrix(keep.ind[-(1:nr)], nc = nwld, by = TRUE), 1,
    function(x) if(any(x)) rep(TRUE, length(x)) else x)
out <- rbind(x[ind, ], matrix(rep(x[!ind, ], each = nwld), nc = nc))
unique(out[!keep.ind, ])

I hope it works ok.

Best, Dimitris

Dimitris Rizopoulos Ph.D. Student Biostatistical Centre School of Public Health Catholic University of Leuven Address: Kapucijnenvoer 35, Leuven, Belgium Tel: +32/(0)16/336899 Fax: +32/(0)16/337015 Web: http://med.kuleuven.be/biostat/ http://www.student.kuleuven.be/~m0390867/dimitris.htm
[R] Right truncation data
Hi, Does anybody know how to perform a Cox model analysis for right-truncated data? All my data are right truncated, since I only have patients who entered a hospital for a particular disease, and I would like to model the age at hospital entrance as the response variable. Thanks in advance. Isaac Subirana. [EMAIL PROTECTED]
Re: [R] Deconvolution of a spectrum
Lukasz Komsta [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]
> I have a curve which is a mixture of Gaussian curves (for example a UV emission or absorption spectrum). Do you have any suggestions on how to implement a search for the optimal set of Gaussian peaks to fit the curve? I know that it is a very complex problem, but maybe it is possible to do it? My first idea is to use nls() with very large functions and compare the AIC values, but it is very difficult to suggest any starting points for the algorithm.

Perhaps these notes will be helpful if you don't have too much noise in your data: http://research.stowers-institute.org/efg/R/Statistics/MixturesOfDistributions/index.htm

efg Earl F. Glynn Stowers Institute for Medical Research
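The nls() idea mentioned in the question can be sketched on synthetic data. Everything below is an assumption made for illustration: the helper gauss(), the two-peak test curve, the noise level, and above all the starting values, which are deliberately placed near the true peaks. It is not taken from the linked notes, and real spectra will need starting values read off the curve (for instance from its local maxima).

```r
# Synthetic "spectrum": the sum of two Gaussian peaks plus a little noise.
set.seed(1)
gauss <- function(x, a, m, s) a * exp(-(x - m)^2 / (2 * s^2))  # hypothetical helper
x <- seq(0, 10, length.out = 200)
y <- gauss(x, 2, 3, 0.5) + gauss(x, 1, 6, 1) + rnorm(length(x), sd = 0.01)

# Fit a two-peak model; nls() needs reasonable starting values for every parameter.
fit2 <- nls(y ~ gauss(x, a1, m1, s1) + gauss(x, a2, m2, s2),
            start = list(a1 = 1.5, m1 = 2.8, s1 = 0.6,
                         a2 = 0.8, m2 = 6.3, s2 = 1.2))
est <- coef(fit2)   # estimated heights, positions and widths
aic2 <- AIC(fit2)   # can be compared with fits using other numbers of peaks
```

Models with different numbers of peaks can be fitted the same way and ranked with AIC(), as the question suggests; the hard part remains choosing the starting values.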
Re: [R] color key of heatmap.2
mycolors <- rev(redgreen(length)), where length is the number of colours you want.
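A minimal sketch of building a reversed key (green for low, red for high) with base R only. The number of colours n and the variable names are assumptions; the gplots calls appear only in comments because that package may not be installed.

```r
n <- 75  # hypothetical number of colours in the key

# Base R: interpolate from green (low) through black to red (high).
low_green_high_red <- colorRampPalette(c("green", "black", "red"))(n)

# With gplots loaded, greenred(n) or rev(redgreen(n)) should serve the same
# purpose, e.g. heatmap.2(x, col = greenred(n)) for a numeric matrix x.
```

The resulting character vector of hex colours can be passed as the col argument of heatmap.2.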
[R] Matrix conversion question
Hello, Please help - I'm blanking on this ... I have a matrix like this:

     [,1] [,2]
[1,]    1    2
[2,]    1    3
[3,]    2    3

and would like to have a list of vectors, where a vector contains the entries in a matrix row ... Can somebody nudge me to the place I need to go? Thanks, Joh
Re: [R] R and clinical studies
Delphine, Please see the following message posted a week ago: http://comments.gmane.org/gmane.comp.lang.r.general/80175. HTH, -Mat -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Delphine Fontaine Sent: Friday, March 09, 2007 8:29 AM To: r-help@stat.math.ethz.ch Subject: [R] R and clinical studies Does anyone know if for clinical studies the FDA would accept statistical analyses performed with R ? Delphine Fontaine __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Matrix conversion question
Try split(x, row(x)) -Christos
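For illustration, the suggestion applied to the matrix from the question, next to an explicit lapply() version that some may find easier to read. The variable names rows and rows2 are made up for this sketch.

```r
x <- matrix(c(1, 2, 1, 3, 2, 3), ncol = 2, byrow = TRUE)

# row(x) has the same shape as x and holds each element's row index,
# so split() groups the elements of x by the row they came from.
rows <- split(x, row(x))

# Equivalent explicit version: one list element per row.
rows2 <- lapply(seq_len(nrow(x)), function(i) x[i, ])
```

split() names the list elements "1", "2", "3"; the lapply() version returns an unnamed list.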
Re: [R] understanding print.summary.lm and perhaps print/show in general
Another solution is to look into the code of summary.lm, a few lines above, where the (dim)names are assigned. Based on this, you may try

lma <- lm(dist ~ speed, data=cars)
suma <- summary(lma)
colnames(suma$coef) <- c(LETTERS[1:4])
printCoefmat(suma$coef) # prints what I expect
suma
dimnames(suma$coefficients) <- list(rownames(suma$coefficients), c(LETTERS[1:4]))
suma

You might also find it useful to read the chapter on generic functions in the R-lang (R Language Definition) manual.

Petr

-- Petr Klasterecky Dept. of Probability and Statistics Charles University in Prague Czech Republic
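A short sketch of what appears to be going on in this thread: reading suma$coef partially matches the component named coefficients, but assigning through suma$coef uses exact matching and so adds a new component called coef, leaving coefficients (which print.summary.lm displays) untouched. The variable names below are illustrative only.

```r
lma <- lm(dist ~ speed, data = cars)

# Assigning via the abbreviated name creates a separate "coef" component.
suma <- summary(lma)
colnames(suma$coef) <- LETTERS[1:4]
has_coef <- "coef" %in% names(suma)           # a new element was added
orig_names <- colnames(suma$coefficients)     # still the usual headers

# Using the full component name changes what print.summary.lm shows.
suma2 <- summary(lma)
colnames(suma2$coefficients) <- LETTERS[1:4]
new_names <- colnames(suma2$coefficients)
```

Printing suma2 then shows the A B C D headers, although the p-value formatting may differ because printCoefmat looks at the name of the last column.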
[R] How to create a list that grows automatically
Dear R users I would like to know if there is a way to create a list or an array (or anything) which grows automatically as more elements are put into it. What I want to find is something equivalent to an ArrayList object of Java language. In Java, I can do the following thing: // Java code ArrayList myArray = new ArrayList(); myArray.add(object1); myArray.add(object2); // End of java code Thanks in advance. Young-Jin Lee [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
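For comparison with the Java snippet above, a minimal R sketch: an ordinary R list grows when you assign one position past its current end. The stored objects here are placeholders.

```r
my_list <- list()                                # start empty

# Append by assigning to the position just past the current end.
my_list[[length(my_list) + 1]] <- "object1"      # placeholder element
my_list[[length(my_list) + 1]] <- 1:3            # elements may differ in type

# c() with a wrapped element also appends.
my_list <- c(my_list, list(TRUE))

n_items <- length(my_list)
```

When the final size is known in advance, preallocating with vector("list", n) and filling by index avoids repeated copying.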
[R] lpSolve space problem in R 2.4.1 on Windows XP
Hi. I am trying to use the linear optimizer from package lpSolve in R 2.4.1 on Windows XP (Version 5.1). The problem I am trying to solve has 2843 variables (2841 integer, 2 continuous) and 8524 constraints, and I have 2 Gb of memory. After I load the input data into R, I have at most 1.5 Gb of memory available. If I start the lp with significantly less memory available (say 1 Gb), I get an error message from R:

Error: cannot allocate vector of size 189459 Kb

If I close all my other windows and try to maximize the available memory to the full 1.5 Gb, I can watch the memory get filled up until only about 400 Mb is left, at which point I get a Windows error message:

R for Windows GUI front-end has encountered a problem and needs to close. We are sorry for the inconvenience.

This behavior persists even when I relax the integer constraints, and eliminate the 2841 constraints that restrict the integer variables to values <= 1, so I'm just running a standard lp with 2843 variables and 5683 constraints. I have been able to get the full MIP formulation to work correctly on some very small problems (~10 variables and 25 constraints). Here is the code for a working example:

> library(lpSolve)
> (v1=rev(1:8))
[1] 8 7 6 5 4 3 2 1
> (csv1=cumsum(as.numeric(v1)))
[1]  8 15 21 26 30 33 35 36
> (lencsv1=length(csv1))
[1] 8
> (Nm1=lencsv1-1)
[1] 7
> (Np1=lencsv1+1)
[1] 9
> ngp=3
> f.obj=c(1,1,rep(0,Nm1))
> f.int=3:Np1
> bin.con=cbind(rep(0,Nm1),rep(0,Nm1),diag(Nm1))
> bin.dir=rep("<=",Nm1)
> bin.rhs=rep(1,Nm1)
> gp.con=c(0,0,rep(1,Nm1))
> gp.dir="="
> (gp.rhs=ngp-1)
[1] 2
> ub.con=cbind(rep(-1,rep(Nm1)),rep(0,Nm1),!upper.tri(matrix(nrow=Nm1,ncol=Nm1)))
> ub.dir=rep("<=",Nm1)
> (ub.rhs=csv1[1:Nm1]*ngp/csv1[lencsv1])
[1] 0.667 1.250 1.750 2.167 2.500 2.750 2.917
> lb.con=cbind(rep(0,Nm1),rep(1,rep(Nm1)),!upper.tri(matrix(nrow=Nm1,ncol=Nm1)))
> lb.dir=rep(">=",Nm1)
> lb.rhs=ub.rhs
> f.con=rbind(bin.con,gp.con,ub.con,lb.con)
> f.dir=c(bin.dir,gp.dir,ub.dir,lb.dir)
> f.rhs=c(bin.rhs,gp.rhs,ub.rhs,lb.rhs)
> lglp=lp("min",f.obj,f.con,f.dir,f.rhs,int.vec=f.int)
> lglp$objval
[1] 0.917
> lglp$solution
[1] 0.000 0.917 0.000 1.000 0.000 1.000 0.000
[8] 0.000 0.000

What this is doing is taking the points of v1 and dividing them into contiguous groups (the variable ngp is the number of groups) such that the sums of the v1 values are as close as possible to equal within the three groups. So, for v1 = c(8,7,6,5,4,3,2,1), the groups c(8,7), c(6,5), c(4,3,2,1), with sums 15, 11, 10, are the best such split, and the solution vector shows that the splitting occurs after the second and fourth elements.

Anyway, I am wondering... Are 3000 variables and 8500 constraints usually too much for lpSolve to handle in 1.5 Gb of memory? Is there a possible bug (in R or in Windows) that leads to the Windows error when the memory falls below 400 Mb? Is there a problem with my formulation that makes it unstable even after the integer constraints are removed?

Thanks!

-- TMK -- 212-460-5430 home 917-656-5351 cell
[R] Extracting the p of F statistics from lm
I need to extract the p value from an ANOVA done with lm model fitting:

fitting <- lm(var ~ group)
Sfitting <- summary(fitting)

Sfitting[10][1] gives the F value and the degrees of freedom but I am not able to get the p value. The function df should give a p value given an F but I am not able to make it work. I found only something about aov in the R help and I am not able to make it work.

Massimo Cressoni
Re: [R] Matrix conversion question
Christos Hatzis wrote: Try split(x, row(x)) H! THE ELEGANCE! Thanks a lot! Joh __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R GUI in Ubuntu?
OK, so I did:

sudo R CMD javareconf

followed by the following in R as root:

install.packages("JGR", dep = TRUE)

which I think went OK because if I do:

library()

then JGR is listed. From the terminal:

Packages in library '/usr/local/lib/R/site-library':
JavaGD   Java Graphics Device
JGR      JGR - Java Gui for R
rJava    Low-level R to Java interface

BUT, if I then run:

JGR()

then I get:

Error: could not find function "JGR"

I am confused...?!?

Thanks in advance, Andy

Dirk Eddelbuettel wrote:
> On Thu, Mar 08, 2007 at 07:05:15PM +0100, Andy Weller wrote:
> > Dear all, I am very new to R and find the terminal-based UI a little daunting. (That's probably the wrong thing to say!) Having searched the Packages it seems that I can have either a Gnome-based or Java-based GUI for my Ubuntu machine. However, I can get neither to work. Having run R as root, I then run the following command:
> >
> > install.packages("gnomeGUI", dependencies=TRUE)
> >
> > The output of which is:
> >
> > checking for gnomeConf.sh file in /usr/local/lib... not found
> > configure: error: conditional HAVE_ORBIT was never defined. Usually this means the macro was only invoked conditionally.
> > ERROR: configuration failed for package 'gnomeGUI'
> > * Removing '/usr/local/lib/R/site-library/gnomeGUI'
> >
> > I have checked to see if I have all dependencies installed - it seems as though I have. No luck! So I try the Java-based GUI with:
> >
> > install.packages("JGR", dep=TRUE)
> > library(JGR)
> > JGR()
> >
> > No luck. So, out of R I try:
> >
> > sudo R CMD javareconf
>
> I think you are close. Do the JGR install _after_ the javareconf as it needs the correct values. Also make sure you use the Sun Java packages you get for Ubuntu. Hope this helps, Dirk
>
> > Then in R, if I check the library with:
> >
> > library(JGR)
> >
> > I get:
> >
> > Error: .onLoad failed in 'loadNamespace' for 'rJava'
> > Error: package 'rJava' could not be loaded
> >
> > HMMmmm - still no joy! I guess I am missing something very basic here?! Thanks in advance, Andy
Re: [R] Extracting the p of F statistics from lm
Date: Fri, 09 Mar 2007 11:18:46 -0500 From: Cressoni, Massimo (NIH/NHLBI) [F] [EMAIL PROTECTED]
> I need to extract the p value from an ANOVA done with lm model fitting:
>
> fitting <- lm(var ~ group)
> Sfitting <- summary(fitting)
>
> Sfitting[10][1] gives the F value and the degrees of freedom but I am not able to get the p value. The function df should give a p value given an F but I am not able to make it work.

The function pf should.

> I found only something about aov in the R help and I am not able to make it work. Massimo Cressoni

-- Giovanni Petris [EMAIL PROTECTED] Department of Mathematical Sciences University of Arkansas - Fayetteville, AR 72701 Ph: (479) 575-6324, 575-8630 (fax) http://definetti.uark.edu/~gpetris/
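A minimal sketch of the pf() suggestion, using the built-in cars data as a stand-in for the poster's var and group (which are not shown in the thread). The fstatistic component of a summary.lm object is a named vector holding the F value and its two degrees of freedom.

```r
fit <- lm(dist ~ speed, data = cars)     # stand-in model, not the poster's data
fstat <- summary(fit)$fstatistic         # named vector: value, numdf, dendf

# Upper-tail probability of the F distribution gives the overall p value.
p_value <- unname(pf(fstat["value"], fstat["numdf"], fstat["dendf"],
                     lower.tail = FALSE))
p_value
```

Note that df() is the density of the F distribution, whereas pf() is its distribution function, which is why pf() is the one to use here.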
Re: [R] lpSolve space problem in R 2.4.1 on Windows XP
If R is closed that way (i.e. crashes), it is a bug by definition: either in R or (more probable) in the package. Can you please contact the package maintainer to sort things out. Thanks, Uwe Ligges Talbot Katz wrote: Hi. I am trying to use the linear optimizer from package lpSolve in R 2.4.1 on Windows XP (Version 5.1). The problem I am trying to solve has 2843 variables (2841 integer, 2 continuous) and 8524 constraints, and I have 2 Gb of memory. After I load the input data into R, I have at most 1.5 Gb of memory available. If I start the lp with significantly less memory available (say 1 Gb), I get an error message from R: Error: cannot allocate vector of size 189459 Kb If I close all my other windows and try to maximize the available memory to the full 1.5 Gb, I can watch the memory get filled up until only about 400 Mb is left, at which point I get a Windows error message: R for Windows GUI front-end has encountered a problem and needs to close. We are sorry for the inconvenience. This behavior persists even when I relax the integer constraints, and eliminate the 2841 constraints that restrict the integer variables to values = 1, so I'm just running a standard lp with 2843 variables and 5683 constraints. I have been able to get the full MIP formulation to work correctly on some very small problems (~10 variables and 25 constraints). 
Here is the code for a working example:
library(lpSolve)
(v1=rev(1:8))
[1] 8 7 6 5 4 3 2 1
(csv1=cumsum(as.numeric(v1)))
[1] 8 15 21 26 30 33 35 36
(lencsv1=length(csv1))
[1] 8
(Nm1=lencsv1-1)
[1] 7
(Np1=lencsv1+1)
[1] 9
ngp=3
f.obj=c(1,1,rep(0,Nm1))
f.int=3:Np1
bin.con=cbind(rep(0,Nm1),rep(0,Nm1),diag(Nm1))
bin.dir=rep("<=",Nm1)
bin.rhs=rep(1,Nm1)
gp.con=c(0,0,rep(1,Nm1))
gp.dir="="
(gp.rhs=ngp-1)
[1] 2
ub.con=cbind(rep(-1,rep(Nm1)),rep(0,Nm1),!upper.tri(matrix(nrow=Nm1,ncol=Nm1)))
ub.dir=rep("<=",Nm1)
(ub.rhs=csv1[1:Nm1]*ngp/csv1[lencsv1])
[1] 0.667 1.250 1.750 2.167 2.500 2.750 2.917
lb.con=cbind(rep(0,Nm1),rep(1,rep(Nm1)),!upper.tri(matrix(nrow=Nm1,ncol=Nm1)))
lb.dir=rep(">=",Nm1)
lb.rhs=ub.rhs
f.con=rbind(bin.con,gp.con,ub.con,lb.con)
f.dir=c(bin.dir,gp.dir,ub.dir,lb.dir)
f.rhs=c(bin.rhs,gp.rhs,ub.rhs,lb.rhs)
lglp=lp("min",f.obj,f.con,f.dir,f.rhs,int.vec=f.int)
lglp$objval
[1] 0.917
lglp$solution
[1] 0.000 0.917 0.000 1.000 0.000 1.000 0.000
[8] 0.000 0.000
What this is doing is taking the points of v1 and dividing them into contiguous groups (the variable ngp is the number of groups) such that the sums of the v1 values are as close as possible to equal within the three groups. So, for v1 = c(8,7,6,5,4,3,2,1), the groups c(8,7), c(6,5), c(4,3,2,1), with sums 15,11,10, are the best such split, and the solution vector shows that the splitting occurs after the second and fourth elements.
Anyway, I am wondering... Are 3000 variables and 8500 constraints usually too much for lpSolve to handle in 1.5 Gb of memory? Is there a possible bug (in R or in Windows) that leads to the Windows error when the memory falls below 400 Mb? Is there a problem with my formulation that makes it unstable even after the integer constraints are removed? Thanks!
-- TMK -- 212-460-5430 home 917-656-5351 cell
[R] time demean model matrix
Suppose I have longitudinal data and want to use the econometric strategy of de-meaning a model matrix by time. For the sake of illustration, 'mat' is a model matrix for 3 individuals, each with 3 observations, where "1" denotes that individual i was in group j at time t and "0" otherwise.
mat <- matrix(c(1,1,0,0,0,0,0,0,1,0,0,0,1,1,1,0,0,0,0,0,1,0,0,0,1,1,0), ncol=3)
mat <- data.frame(mat, id=gl(3,3))
I can conceive of two ways of de-meaning: either use an explicit loop or use mapply, both of which are below.
# put this in a loop over each column to create the de-meaned X matrix
mat2 <- matrix(0, 9,3)
for(i in 1:3){ mat2[,i] <- mat[,i] - ave(mat[,i], mat$id) }
# Or use mapply as follows
mat[,1:3]-mapply(ave, mat[,1:3], MoreArgs=list(mat$id))
Both work, but they require that the model matrix is explicitly created and then used in the regression. For example, assume I am using the star data in the mlmRev package:
data(star, package='mlmRev')
I would first need to explicitly create the model matrix for the fixed effects as follows and then use the strategy above to de-mean this matrix.
mat <- model.matrix(lm(math ~ -1 + sch, star))
Of course in R, this is rather inefficient, as one generally only needs to have a factor for any independent variables and the model matrix is created for you when using lm(). So, my question is whether there is a more efficient way of creating the time de-meaned model matrix. Or is the solution above the kind of strategy that must be used for this situation? Harold
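One possible way to shorten the loop in the question, offered only as a sketch and not as an answer from the list: wrap the column-wise ave() call in a small helper (demean_by is a hypothetical name) and apply it directly to whatever model.matrix() returns. The toy data frame d and its columns are made up for illustration.

```r
mat <- matrix(c(1,1,0,0,0,0,0,0,1,
                0,0,0,1,1,1,0,0,0,
                0,0,1,0,0,0,1,1,0), ncol = 3)
id <- gl(3, 3)

# Subtract the within-group mean from every column of a numeric matrix
demean_by <- function(X, g) {
  X - apply(X, 2, ave, g)
}

mat2 <- demean_by(mat, id)

# The same helper applied to a model matrix built from a factor
# (toy data frame, hypothetical names)
d <- data.frame(sch = factor(c("a", "a", "b", "b", "a", "b")),
                id  = gl(3, 2))
X  <- model.matrix(~ -1 + sch, data = d)
Xd <- demean_by(X, d$id)
```

The model matrix is still built explicitly here; the helper only removes the need to write the loop by hand.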
Re: [R] R GUI in Ubuntu?
I should also add that:
library(JGR)
gives me the following output:
Loading required package: rJava
Error in dyn.load(x, as.logical(local), as.logical(now)) : unable to load shared library '/usr/local/lib/R/site-library/rJava/libs/rJava.so': /usr/local/lib/R/site-library/rJava/libs/rJava.so: undefined symbol: JNI_GetCreatedJavaVMs
Error: .onLoad failed in 'loadNamespace' for 'rJava'
Error: package 'rJava' could not be loaded
I have Sun's Java installed and thought rJava installed without problems... Thanks, Andy
Andy Weller wrote: OK, so I did:
sudo R CMD javareconf
followed by the following in R as root:
install.packages("JGR", dep = TRUE)
which I think went OK because if I do:
library()
then JGR is listed. From the terminal:
Packages in library '/usr/local/lib/R/site-library':
JavaGD    Java Graphics Device
JGR       JGR - Java Gui for R
rJava     Low-level R to Java interface
BUT, if I then run:
JGR()
then I get:
Error: could not find function "JGR"
I am confused...?!? Thanks in advance, Andy
Dirk Eddelbuettel wrote: On Thu, Mar 08, 2007 at 07:05:15PM +0100, Andy Weller wrote: Dear all, I am very new to R and find the terminal-based UI a little daunting. (That's probably the wrong thing to say!) Having searched the Packages it seems that I can have either a Gnome-based or Java-based GUI for my Ubuntu machine. However, I can get neither to work. Having run R as root, I then run the following command:
install.packages("gnomeGUI", dependencies=TRUE)
The output of which is:
checking for gnomeConf.sh file in /usr/local/lib... not found
configure: error: conditional HAVE_ORBIT was never defined. Usually this means the macro was only invoked conditionally.
ERROR: configuration failed for package 'gnomeGUI'
* Removing '/usr/local/lib/R/site-library/gnomeGUI'
I have checked to see if I have all dependencies installed - it seems as though I have. No luck! So I try the Java-based GUI with:
install.packages("JGR", dep=TRUE)
library(JGR)
JGR()
No luck.
So, out of R I try: sudo R CMD javareconf I think you are close. Do the JGR install _after_ the javareconf as it needs the correct values. Also make sure you use the Sun Java packages you get for Ubuntu. Hope this helps, Dirk Then in R, if I check the library with: library(JGR) I get: Error: .onLoad failed in 'loadNamespace' for 'rJava' Error: package 'rJava' could not be loaded HMMmmm - still no joy! I guess I am missing something very basic here?! Thanks in advance, Andy __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to create a list that grows automatically
Young-Jin Lee asked: I would like to know if there is a way to create a list or an array (or anything) which grows automatically as more elements are put into it.
??? I think this is the default behaviour of R arrays:
x <- vector(length=0) # create a vector of zero length
x[1] <- 2
x[10] <- 3
x[length(x) + 1] <- 4
x # 2 NA NA ... 3 4
What I want to find is something equivalent to an ArrayList object of Java language. In Java, I can do the following thing:
// Java code
ArrayList myArray = new ArrayList();
myArray.add(object1);
myArray.add(object2);
// End of java code
myArray <- vector(length=0)
myArray <- c(myArray, object1)
myArray <- c(myArray, object2)
myArray # array with 2 strings
Alberto Monteiro
Re: [R] Off topic:Spam on R-help increase?
DB == Douglas Bates [EMAIL PROTECTED] on Tue, 6 Mar 2007 11:57:28 -0600 writes: DB On 3/6/07, Bert Gunter [EMAIL PROTECTED] wrote: In the past 2 days I have seen a large increase of spam getting into R-help. Are others experiencing this problem? If so, has there been some change to the spam filters on the R-servers? If not, is the problem on my end? DB There has indeed been an increase in the amount of spam making it DB through to the list. We apologize for the inconvenience. Regretably DB we will not be able to do much about it until the beginning of next DB week. DB Martin Maechler is on vacation at present and I am administering the DB lists until he returns. Most of the time this works even though the DB mail servers are in Zurich Switzerland and I am in Madison, WI, USA. DB However, in the last two days we have had a surge in spam and quite a DB bit of it is getting through the filters. DB The filters are catching some of the spam. I think the main DB difference in the last two days has been that the level of spam to the DB lists has increased but it could be that something has happened to the DB filters too. I've been back today, well relaxed and tanned from the nice vacation; thanks to all of you for taking such an interest in it:-) ;-) With a work back-log of almost 4 weeks, I hadn't dared to look into my R-lists inbox of 2400 messages until about an hour ago. Fortunately it's not the spammers that would have become smarter (well they have or their hired geeks, but already a few months ago, not just now). The main problem has ``just'' been disk-server, then network and file mount problems on the mail server that were unfortunately not seen at first by our IT staff. As a consequence, there had also been enormous ( 24 hours) delays in mail delivery, maybe less visible on the mailing list side of it. 
As far as I can see/guess now, the spam problem should have lasted only about one to two days --- too long of course for you, but at least not till I had returned to work. Yes indeed, we are sorry for this, but no, we cannot promise it won't happen again :-\ Martin DB All the lists except R-help only allow postings from subscribers so DB there should very little spam on the other lists. DB This subscriber-only policy can be difficult for people like me who DB receive email at one address but send it from another. Either the DB sender must remember to use the account that is registered for the DB list or the list administrator must manually approve the posting. DB Even worse, such a policy dissuades new useRs from posting because DB they get a response that their message has been held pending manual DB approval by the administrator. Sometimes they react by reposting the DB message, then re-reposting, then ... DB We have avoided instituting such a policy on R-help because of the DB level of administrative work that will be involved and our desire not DB to dissuade new useRs from posting to the list. DB However, if this keeps up we may need to reconsider. DB I would ask for the list subscribers to bear with us until Martin DB returns and can check on whether something has gone wrong with the DB filters. DB __ DB R-help@stat.math.ethz.ch mailing list DB https://stat.ethz.ch/mailman/listinfo/r-help DB PLEASE do read the posting guide http://www.R-project.org/posting-guide.html DB and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Duplicate rows of matrix
Hello, my problem is the following: I have a matrix A and a vector B which contains as many elements as A has rows. I need to build a matrix C which contains B[i] copies of the row A[i,], and this for each row of A. If for example A is
     [1]  [2]
[1]  8    9.4
[2]  4.2  1.1
and B is (3,1), then C will be:
     [1]  [2]
[1]  8    9.4
[2]  8    9.4
[3]  8    9.4
[4]  4.2  1.1
I have some working code which goes through all the rows of A and for each row does a rbind(C, A[i,]) B[i] times. But this is quite time consuming given that each rbind rebuilds a new matrix ... is there any faster way? I can think of some minor improvements, like building a matrix C of zeros containing as many columns as A and as many rows as the sum of the elements of B ... and then filling it. But I was more looking for some already implemented function/package, is there any? Thanx
[R] GLM: order of terms in model
Dear R-helpers, I have been analysing data using a GLM. My model is as follows:
mod <- glm(V ~ T + as.factor(A) + N, family=gaussian)
and I am using anova(mod, test="F") to get the analysis of deviance table and the fraction of deviance explained by each term. T and A dominate with respect to their Deviance, with T having a larger effect than A (about twice). However, if I reverse T and A in the model, I get that A now explains more deviance than T. My questions are: 1) What is it due to? 2) Is there any way around this? How do I find which model is best, and/or can I use another method that won't be sensitive to the order of the terms? Thanks, Christian Reply to: [EMAIL PROTECTED]
Re: [R] Duplicate rows of matrix
Try this:
a <- matrix(c(8, 4.2, 9.4, 1.1),2)
b <- c(3,1)
a[rep(1:nrow(a), b), ]
-Christos Christos Hatzis, Ph.D. Nuvera Biosciences, Inc. 400 West Cummings Park Suite 5350 Woburn, MA 01801 Tel: 781-938-3830 www.nuverabio.com
-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Bruno C. Sent: Friday, March 09, 2007 12:17 PM To: r-help Subject: [R] Duplicate rows of matrix
Hello, my problem is the following: I have a matrix A and a vector B which contains as many elements as A has rows. I need to build a matrix C which contains B[i] copies of the row A[i,], and this for each row of A. If for example A is
     [1]  [2]
[1]  8    9.4
[2]  4.2  1.1
and B is (3,1), then C will be:
     [1]  [2]
[1]  8    9.4
[2]  8    9.4
[3]  8    9.4
[4]  4.2  1.1
I have some working code which goes through all the rows of A and for each row does a rbind(C, A[i,]) B[i] times. But this is quite time consuming given that each rbind rebuilds a new matrix ... is there any faster way? I can think of some minor improvements, like building a matrix C of zeros containing as many columns as A and as many rows as the sum of the elements of B ... and then filling it. But I was more looking for some already implemented function/package, is there any? Thanx
Re: [R] How to create a list that grows automatically
This is a bad idea as it can greatly slow things down (the details were discussed several times on this list). What you want to do is define the length of your vector/list from the start, then grow it (by a large margin) only if it becomes full.
lst <- vector(mode="list", length=10) #assuming 100k nodes are enough
#populate the list, then remove the unused nodes if you care to
lst <- lst[sapply(lst, function(x) {!is.null(x)})]
-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Young-Jin Lee Sent: Friday, March 09, 2007 11:08 AM To: r-help Subject: [R] How to create a list that grows automatically
Dear R users, I would like to know if there is a way to create a list or an array (or anything) which grows automatically as more elements are put into it. What I want to find is something equivalent to an ArrayList object of Java language. In Java, I can do the following thing:
// Java code
ArrayList myArray = new ArrayList();
myArray.add(object1);
myArray.add(object2);
// End of java code
Thanks in advance. Young-Jin Lee
Re: [R] Duplicate rows of matrix
On Fri, 9 Mar 2007 18:17:04 +0100, Bruno C. [EMAIL PROTECTED] wrote: Hello my problem is the following: I have a matrix A and a vector B which contains as many rows as A. I need to build a matrix C which contains B[i]-times the row A[i,] and this for each line of A.
How about:
C <- A[rep(seq(nrow(A)), B), ]
Completely untested, because you didn't provide example code. -- Seb
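For reference, a small runnable check of the row-index idea from the replies, using the numbers given in the original question. The use of seq_len() and drop = FALSE is an addition here, not part of either reply; it is meant to keep the result a matrix even when A has a single column or B selects a single row.

```r
A <- matrix(c(8, 4.2, 9.4, 1.1), nrow = 2)
B <- c(3, 1)

# Repeat row index i B[i] times, then subset once; no repeated rbind needed
idx <- rep(seq_len(nrow(A)), times = B)
C <- A[idx, , drop = FALSE]
C
```

Because the subsetting happens in one step, the result is allocated once, which avoids the cost of rebuilding the matrix on every rbind call.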
Re: [R] lpSolve space problem in R 2.4.1 on Windows XP
Hello Sam Buttrey. Uwe Ligges from the r-help list asked me to forward this message to the maintainer of the lpSolve package, because R 2.4.1 is crashing when I run lp. I saw your name listed in the lpSolve help file. If you need more detail, please let me know. Thanks! -- TMK -- 212-460-5430 home 917-656-5351 cell
From: Uwe Ligges [EMAIL PROTECTED] To: Talbot Katz [EMAIL PROTECTED] CC: r-help@stat.math.ethz.ch Subject: Re: [R] lpSolve space problem in R 2.4.1 on Windows XP Date: Fri, 09 Mar 2007 17:51:30 +0100 If R is closed that way (i.e. crashes), it is a bug by definition: either in R or (more probable) in the package. Can you please contact the package maintainer to sort things out. Thanks, Uwe Ligges
Re: [R] GLM: order of terms in model
This is FAQ 7.18: "Why does the output from anova() depend on the order of factors in the model?" -thomas
On Fri, 9 Mar 2007, Christian Landry wrote: Dear R-helpers, I have been analysing data using a GLM. My model is as follows:
mod <- glm(V ~ T + as.factor(A) + N, family=gaussian)
and I am using anova(mod, test="F") to get the analysis of deviance table and the fraction of deviance explained by each term. T and A dominate with respect to their Deviance, with T having a larger effect than A (about twice). However, if I reverse T and A in the model, I get that A now explains more deviance than T. My questions are: 1) What is it due to? 2) Is there any way around this? How do I find which model is best, and/or can I use another method that won't be sensitive to the order of the terms? Thanks, Christian Reply to: [EMAIL PROTECTED]
Thomas Lumley Assoc. Professor, Biostatistics [EMAIL PROTECTED] University of Washington, Seattle
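To make the order dependence concrete, here is a small sketch on the built-in mtcars data (a stand-in; the poster's V, T, A and N are not available). anova() on a single glm adds terms sequentially, so each term is credited with the deviance it explains after the terms listed before it. drop1() is shown as one order-insensitive alternative, since it compares the full model with the model lacking each term in turn; whether that is the right comparison for a given analysis is a separate judgment.

```r
# Same model, two term orders (mtcars used purely as example data)
m1 <- glm(mpg ~ wt + factor(cyl) + hp, family = gaussian, data = mtcars)
m2 <- glm(mpg ~ factor(cyl) + wt + hp, family = gaussian, data = mtcars)

# Sequential analysis of deviance: depends on the order of the terms
a1 <- anova(m1, test = "F")
a2 <- anova(m2, test = "F")
a1["wt", "Deviance"]   # wt entered first
a2["wt", "Deviance"]   # wt entered after factor(cyl)

# Each term dropped from the full model: does not depend on the order
d1 <- drop1(m1, test = "F")
d2 <- drop1(m2, test = "F")
d1["wt", ]
d2["wt", ]
```

The two fitted models are identical; only the bookkeeping of which term gets credit for shared explanatory power changes between a1 and a2.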
Re: [R] Reformulated matrices dimensions limitation problem
Have a look at the help page for memory.size and memory.limit. The help says you can use these functions on Windows and another approach with Unix. Once you know the available memory you can calculate the total matrix size that fits in it (knowing that a real number takes 8 bytes). I would recommend using up to 70-80% of the available memory for your matrix. Maciej. [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
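The arithmetic described above can be sketched as follows. The amount of available memory is passed in as a plain number (1 GiB here is a made-up figure), max_elements is a hypothetical helper name, and the 0.75 fraction is simply one value inside the 70-80% range suggested above.

```r
bytes_per_double <- 8   # a real (double) value takes 8 bytes

# Number of doubles that fit in a given share of the available memory
max_elements <- function(available_bytes, fraction = 0.75) {
  floor(available_bytes * fraction / bytes_per_double)
}

available <- 1024^3               # hypothetical: 1 GiB free
n_el <- max_elements(available)   # total number of matrix cells
n_el
floor(sqrt(n_el))                 # side of the largest square matrix
```

Operations on the matrix usually create temporary copies, which is one reason to leave the remaining headroom unused.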
[R] use nnet
I want to adjust weight decay and number of hidden units for nnet by a loop like
for (decay) {
  for (number of unit) {
    for (#run) {
      model <- nnet()
      test.error <-
    }
  }
}
For example: I set decay=0.1, size=3, maxit=200; for this set I run 10 times and calculate the test error. After that I want to get a matrix like this:
decay  size  maxit  #run  test_error
0.1    3     200    1     1.2
0.1    3     200    2     1.1
0.1    3     200    3     1.0
0.1    3     200    4     3.4
0.1    3     200    5     ..
0.1    3     200    6     ..
0.1    3     200    7     ..
0.1    3     200    8     ..
0.1    3     200    9     ..
0.1    3     200    10    ..
0.2    3     200    1     1.2
0.2    3     200    2     1.1
0.2    3     200    3     1.0
0.2    3     200    4     3.4
0.2    3     200    5     ..
0.2    3     200    6     ..
0.2    3     200    7     ..
0.2    3     200    8     ..
0.2    3     200    9     ..
0.2    3     200    10    ..
I am not sure if this is the correct way to do this. Has anyone tuned these parameters like this before? thanks, Aimin
Re: [R] use nnet
AM, I have a piece of junk on my blog. Here it is.
# USE CROSS-VALIDATION TO DO A GRID-SEARCH FOR  #
# THE OPTIMAL SETTINGS (WEIGHT DECAY AND NUMBER #
# OF HIDDEN UNITS) OF NEURAL NETS               #
library(nnet);
library(MASS);
data(Boston);
X <- I(as.matrix(Boston[-14]));
# STANDARDIZE PREDICTORS
st.X <- scale(X);
Y <- I(as.matrix(Boston[14]));
boston <- data.frame(X = st.X, Y);
# DIVIDE DATA INTO TESTING AND TRAINING SETS
set.seed(2005);
test.rows <- sample(1:nrow(boston), 100);
test.set <- boston[test.rows, ];
train.set <- boston[-test.rows, ];
# INITIATE A NULL TABLE
sse.table <- NULL;
# SEARCH FOR OPTIMAL WEIGHT DECAY
# RANGE OF WEIGHT DECAYS SUGGESTED BY B. RIPLEY
for (w in c(0.0001, 0.001, 0.01)) {
  # SEARCH FOR OPTIMAL NUMBER OF HIDDEN UNITS
  for (n in 1:10) {
    # INITIATE A NULL VECTOR
    sse <- NULL;
    # FOR EACH SETTING, RUN NEURAL NET MULTIPLE TIMES
    for (i in 1:10) {
      # INITIATE THE RANDOM STATE FOR EACH NET
      set.seed(i);
      # TRAIN NEURAL NETS
      net <- nnet(Y~X, size = n, data = train.set, rang = 0.1, linout = TRUE, maxit = 1, decay = w, skip = FALSE, trace = FALSE);
      # CALCULATE SSE FOR TESTING SET
      test.sse <- sum((test.set$Y - predict(net, test.set))^2);
      # APPEND EACH SSE TO A VECTOR
      if (i == 1) sse <- test.sse else sse <- rbind(sse, test.sse);
    }
    # APPEND AVERAGED SSE WITH RELATED PARAMETERS TO A TABLE
    sse.table <- rbind(sse.table, c(WT = w, UNIT = n, SSE = mean(sse)));
  }
}
# PRINT OUT THE RESULT
print(sse.table);
http://statcompute.spaces.live.com/Blog/cns!39C8032DBD1321B7!290.entry
-- WenSui Liu A lousy statistician who happens to know a little programming (http://spaces.msn.com/statcompute/blog)
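If a table with one row per run is wanted, as in the layout asked for in the question, a sketch along the following lines may help. It is not the code from the blog: the grid values, the use of medv ~ . on the unscaled Boston data, and mean squared error as the test error are all illustrative assumptions, and only standard nnet() arguments (size, decay, maxit, linout, trace) are used.

```r
library(nnet)
library(MASS)   # provides the Boston data

set.seed(2005)
test.rows <- sample(nrow(Boston), 100)
train.set <- Boston[-test.rows, ]
test.set  <- Boston[test.rows, ]

# One row per (decay, size, maxit, run); the values are illustrative only
grid <- expand.grid(run = 1:3, size = c(2, 3), decay = c(0.1, 0.2),
                    maxit = 200)

# Fit one net per row and record its error on the held-out rows
grid$test_error <- mapply(function(decay, size, maxit, run) {
  set.seed(run)   # a different random start for each run
  net <- nnet(medv ~ ., data = train.set, size = size, decay = decay,
              maxit = maxit, linout = TRUE, trace = FALSE)
  mean((test.set$medv - predict(net, test.set))^2)
}, grid$decay, grid$size, grid$maxit, grid$run)

# Average over runs for each parameter setting
aggregate(test_error ~ decay + size, data = grid, FUN = mean)
```

Keeping the per-run rows makes it possible to look at the spread across random starts as well as the mean for each setting.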
Re: [R] use nnet
AM, Sorry. please ignore the top box in the code. It is not actually a cv validation but just a simple split-sample validation. sorry for confusion. -- WenSui Liu A lousy statistician who happens to know a little programming (http://spaces.msn.com/statcompute/blog)
[R] Applying some equations over all unique combinations of 4 variables
# I have a data set that looks like this. A bit more complicated actually, with
# three factor levels, but these calculations need to be done on one factor at a
# time. I then have a set of different rates that are applied to it.

# dataset
cata <- c(1, 1, 6, 1, 1, 2)
catb <- c(1, 2, 3, 4, 5, 6)
doga <- c(3, 5, 3, 6, 4, 0)
data1 <- data.frame(cata, catb, doga)
rm(cata, catb, doga)
data1

# start rates
# names for lists
fnams <- c("af", "pf", "cf", "mf")
mnams <- c("am", "pm", "cm", "mm")

# Current layout of the rate data frames
alphahill <- list(af <- c("a1", "a2", "a3"), pf <- c("d1", "d2", "d3"),
                  cf <- c("f1", "f2"), mf <- c("h1", "h2"))
names(alphahill) <- fnams
betahill <- list(am <- c("b1", "b2", "b3"), pm <- c("e1", "e2", "e3"),
                 cm <- c("g1", "g2"), mm <- c("j1", "j2"))
names(betahill) <- mnams

hilltop <- list(af <- data.frame(a1 <- 1:4, a2 <- 2:5, a3 <- 3:6),
                pf <- data.frame(d1 <- 4:1, d2 <- 5:2, d3 <- 6:3),
                cf <- data.frame(f1 <- 1:4, f2 <- 3:6),
                mf <- data.frame(h1 <- 1:4, h2 <- 2:5))
hilldown <- list(am <- data.frame(b1 <- 4:1, b2 <- 5:2, b3 <- 6:3),
                 pm <- data.frame(e1 <- 5:1, e2 <- 1:5, e3 <- 6:2),
                 cm <- data.frame(g1 <- 5:1, g2 <- 1:5),
                 mm <- data.frame(j1 <- 1:4, j2 <- 5:2))
names(hilltop) <- fnams
names(hilldown) <- mnams

for (i in 1:4) {
  names(hilltop[[i]]) <- alphahill[[i]]
  names(hilldown[[i]]) <- betahill[[i]]
}
rm(a1, a2, a3, b1, b2, b3, d1, d2, d3, e1, e2, e3, f1, f2, g1, g2, h1, h2, j1, j2,
   fnams, mnams, af, am, cf, cm, mf, mm, pf, pm)

# Now that's out of the way.
# Assuming I am reading this problem correctly I should have
# 648 possible combinations for each row of data, that is,
# unique combinations where I need
#   (af*am) * (pf*pm) * (cf*cm) * mf * mm
#   ie (3*3) * (3*3) * (2*2) * (2*2)
# based on the idea that there are 9 unique combinations for af am and so on.
#     af am
#   1 a1 b1
#   2 a2 b1
#   3 a3 b1
#   4 a1 b2
#   5 a2 b2
#   6 a3 b2
#   7 a1 b3
#   8 a2 b3
#   9 a3 b3

# I have a set of equations of the form:
#   P1 <- af*cata + pf*catb^cf + mf*doga
#   S1 <- am*cata + pm*catb^cm + mm*doga

# Is there any way that I can do something like this and keep track of
# what condition is what, since I need to be able to sum the P1s and P2
# for each combination (or a subset of them)? I suspect it may be a fairly
# straightforward apply problem but I am having a real problem with it.

# I am only likely to need to report perhaps 15 combinations, but at the
# moment I don't see any easy way to do them, and doing all possible outcomes and
# extracting the required ones looks like a better and safer approach if it can
# be done. And it will save a lot of time if we suddenly need a few new comparisons.

# Any help would be greatly appreciated.
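One way to enumerate the combinations described above and keep track of which is which is `expand.grid()`. The following is only a sketch under stated assumptions: it rebuilds the two name lists with `=` rather than `<-` inside `list()`, and the `id` column and the `wanted` subset are illustrative additions, not part of the original post. Note that the product as written, (3*3) * (3*3) * (2*2) * (2*2), evaluates to 1296 rather than 648. How the 4-row rate data frames should be applied to the 6-row data is not stated in the post, so the sketch stops at labelling the combinations.

```r
# Rate names as in the post, stored as named list elements.
alphahill <- list(af = c("a1", "a2", "a3"), pf = c("d1", "d2", "d3"),
                  cf = c("f1", "f2"),       mf = c("h1", "h2"))
betahill  <- list(am = c("b1", "b2", "b3"), pm = c("e1", "e2", "e3"),
                  cm = c("g1", "g2"),       mm = c("j1", "j2"))

# One row per combination of the eight rate choices.
combos <- expand.grid(c(alphahill, betahill), stringsAsFactors = FALSE)
nrow(combos)

# An id column (hypothetical) makes it easy to refer back to a condition.
combos$id <- seq_len(nrow(combos))

# The nine af/am pairs listed in the post.
unique(combos[, c("af", "am")])

# Extracting a subset of conditions to report (arbitrary illustration).
wanted <- subset(combos, af == "a1" & am == "b1" & cf == "f2")
nrow(wanted)
```

Each row of `combos` can then be passed to whatever function evaluates P1 and S1, with the `id` carried along in the result.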
Re: [R] How to create a list that grows automatically
> I would like to know if there is a way to create a list or an array (or
> anything) which grows automatically as more elements are put into it. What I
> want to find is something equivalent to an ArrayList object of Java language.
> In Java, I can do the following thing:
>
> // Java code
> ArrayList myArray = new ArrayList();
> myArray.add(object1);
> myArray.add(object2);
> // End of java code

As others have mentioned, you can do this with lists in R. However, there is an important difference between ArrayLists in Java and lists in R. In Java, when an ArrayList grows past its bound, it doesn't allocate just enough space, it allocates a lot more, so the next time you add past the end of the array, there's space already reserved. This gives (IIRC) amortised O(n) behaviour for n appends. R doesn't do this, however, so it has to copy the entire array every time, giving O(n^2) behaviour.

Hadley
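To make the point about growing versus reserving space concrete, here is a minimal sketch comparing the two styles. The variable names and the size `n` are arbitrary choices for illustration; both loops build the same list, and the second avoids repeated reallocation by reserving all slots up front.

```r
n <- 1000

# Growing: append one element at a time, as with ArrayList.add().
grown <- list()
for (i in seq_len(n)) {
  grown[[length(grown) + 1]] <- i^2
}

# Preallocating: reserve all n slots once, then fill them in place.
prealloc <- vector("list", n)
for (i in seq_len(n)) {
  prealloc[[i]] <- i^2
}

identical(grown, prealloc)
```

When the final length is known in advance, the preallocated form is the usual recommendation; when it is not, growing the list still works, at the cost described above.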
Re: [R] Applying some equations over all unique combinations of 4 variables
I just realised after posting that I have two vectors of the wrong length. The corrected program is:

# dataset
cata <- c(1, 1, 6, 1, 1, 2)
catb <- c(1, 2, 3, 4, 5, 6)
doga <- c(3, 5, 3, 6, 4, 0)
data1 <- data.frame(cata, catb, doga)
rm(cata, catb, doga)
data1

# start rates
# names for lists
fnams <- c("af", "pf", "cf", "mf")
mnams <- c("am", "pm", "cm", "mm")

# Current layout of the rate data frames
alphahill <- list(af <- c("a1", "a2", "a3"), pf <- c("d1", "d2", "d3"),
                  cf <- c("f1", "f2"), mf <- c("h1", "h2"))
names(alphahill) <- fnams
betahill <- list(am <- c("b1", "b2", "b3"), pm <- c("e1", "e2", "e3"),
                 cm <- c("g1", "g2"), mm <- c("j1", "j2"))
names(betahill) <- mnams

hilltop <- list(af <- data.frame(a1 <- 1:4, a2 <- 2:5, a3 <- 3:6),
                pf <- data.frame(d1 <- 4:1, d2 <- 5:2, d3 <- 6:3),
                cf <- data.frame(f1 <- 1:4, f2 <- 3:6),
                mf <- data.frame(h1 <- 1:4, h2 <- 3:6))
hilldown <- list(am <- data.frame(b1 <- 4:1, b2 <- 5:2, b3 <- 6:3),
                 pm <- data.frame(e1 <- 4:1, e2 <- 1:4, e3 <- 6:3),
                 cm <- data.frame(g1 <- 4:1, g2 <- 1:4),
                 mm <- data.frame(j1 <- 1:4, j2 <- 4:1))
names(hilltop) <- fnams
names(hilldown) <- mnams

for (i in 1:4) {
  names(hilltop[[i]]) <- alphahill[[i]]
  names(hilldown[[i]]) <- betahill[[i]]
}
rm(a1, a2, a3, b1, b2, b3, d1, d2, d3, e1, e2, e3, f1, f2, g1, g2, h1, h2, j1, j2,
   fnams, mnams, af, am, cf, cm, mf, mm, pf, pm)
[R] Reg. strings and numeric data in matrix.
Hi All,

Sorry for this basic question, as I am new to R. I would like to know: is it possible to have a matrix with some columns holding numeric data and others holding character (string) data? How do I read this type of data from a flat file?

Thanks very much,
mallika

Mallika Veeramalai, Ph.D.,
Postdoctoral Associate, Bioinformatics Systems Biology,
Burnham Institute for Medical Research, La Jolla, CA 92037, USA.
phone : +1 858 646 3100 ext: 3627
Fax : +1 858 795 5249
Web : http://bioinformatics.burnham.org/~mallika/
Email : [EMAIL PROTECTED] (or) [EMAIL PROTECTED]
[R] About cex=: how to improve resolution?
Hi,

I need to plot a graph with a fixed circle and with a series of points of different size. Here is a simplified example:

angle <- pi/180*c(0:360)
x <- seq(0,2,by=0.2)
y <- seq(0,2,by=0.2)
z <- seq(0,1,by=0.1)
par(pty="s")
plot(-2:2,-2:2,type="n")
lines(cos(angle),sin(angle))
points(x,y,cex=z)

The size of the points compared to the circle (of radius 1) is important and bears a meaning. But instead of having 11 points with increasing size, I only obtain points of the same size when cex=0.1/0.2/0.3/0.4 or cex=0.5/0.6/0.7 or cex=0.8/0.9/1.0.

Please, does anyone know if there is a way of improving the resolution of cex= *without* changing the size of the circle of radius 1 and keeping the same axis?

Thanks in advance!!!
Luca
[R] Extracting text from a character string
I have a set of character strings like below:

data3[1]
[1] "CB01_0171_03-27-2002-(Sample 26609)-(126)"

I am trying to extract the text 03-27-2002 and convert this into a date for the same record. I keep looking at the grep function, however I cannot quite get it to work.

grep("\d\d-\d\d-\d\d\d\d",data3[1],perl=TRUE,value=TRUE)

Any hints?

Shawn Way
[R] MCMC logit
Hi,

I have a dataset with a binary outcome Y (0/1) and 4 covariates (X1, X2, X3, X4). I am trying to use MCMClogit to fit a logistic regression using MCMC. I am getting an error where it doesn't identify the covariates, although the data are being read in correctly. The dataset is a sample of the actual dataset. Below is my code:

# retrieve data
# considering four covariates
d.df=as.data.frame(read.table("c:/tina/phd/thesis/data/modified_data1.1.txt",header=T,sep=","))
y=d.df[,ncol(d.df)]
x=d.df[,1:4]
c.df=cbind(y,x)
#x=cbind(1,x)
p <- ncol(c.df)

# marginal log-prior of beta[]
logpriorfun <- function(beta, mu, gshape, grate)
{
  logprior = -p*log(2) + log(gamma(p+gshape)) - log(gamma(gshape)) +
    gshape*log(grate) - (p+gshape)*log(grate+sum(abs(beta)))
  return(logprior)
}

require(MCMCpack)
Loading required package: MCMCpack
Loading required package: coda
Loading required package: lattice
Loading required package: MASS
##
## Markov Chain Monte Carlo Package (MCMCpack)
## Copyright (C) 2003-2007 Andrew D. Martin and Kevin M. Quinn
##
## Support provided by the U.S. National Science Foundation
## (Grants SES-0350646 and SES-0350613)
##
[1] TRUE
Warning message:
package 'MASS' was built under R version 2.4.1

a0 = 0.5
b0 = 1
mu0 = 0
beta.init=list(c(0, rep(0.1,4)), c(0, rep(-0.1,4)), c(0, rep(0, 4)))
burnin.cycles = 1000
mcmc.cycles = 25000

# three chains
post.list <- lapply(beta.init, function(vec)
{
  posterior <- MCMClogit(y~x1+x2+x3+x4, data=c.df, burnin=burnin.cycles, mcmc=mcmc.cycles,
                         thin=5, tune=0.5, beta.start=vec, user.prior.density=logpriorfun, logfun=TRUE,
                         mu=mu0, gshape=a0, grate=b0)
  return(posterior)
})
Error in eval(expr, envir, enclos) : object "x1" not found

Any suggestions will be greatly appreciated.

Thanks,
Anamika
Re: [R] Reg. strings and numeric data in matrix.
Mallika Veeramalai <mallikav at burnham.org> writes:

> I would like to know, is it possible to consider a matrix with some columns
> having numeric data and some other's with characters (strings) data? How do
> I get this type of data from a flat file.

It's called a data frame. See the Introduction to R, and the help for read.table and read.csv. (The character data will get made into factors unless you use as.is=TRUE or specify colClasses.)
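As a small illustration of the two options mentioned in the reply, the sketch below reads mixed numeric and character columns into a data frame. The column names and values are invented for the example, and the data are held in a string (via the `text` argument of `read.csv`) only so that the example is self-contained; with a real flat file, its path would be the first argument.

```r
# A small CSV held in a string; the contents are made up for illustration.
csv_text <- "id,name,score
1,alpha,3.5
2,beta,4.0
3,gamma,2.5"

# colClasses fixes the type of each column explicitly.
df <- read.csv(text = csv_text,
               colClasses = c("integer", "character", "numeric"))
str(df)

# as.is = TRUE keeps character columns as character instead of factors.
df2 <- read.csv(text = csv_text, as.is = TRUE)
sapply(df2, class)
```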
Re: [R] About cex=: how to improve resolution?
If the size of the circle is important, then you may want to use the symbols function with the circles argument rather than points and cex. Use the inches argument to set the size (in inches) of the largest circle; the other circles will then be scaled accordingly. Or, if you set inches=FALSE, the circles will be scaled to the x-axis.

Hope this helps,

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
(801) 408-8111

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Luca Quaglia
Sent: Friday, March 09, 2007 1:11 PM
To: r-help@stat.math.ethz.ch
Subject: [R] About cex=: how to improve resolution?
Re: [R] Reg. strings and numeric data in matrix.
--- Mallika Veeramalai [EMAIL PROTECTED] wrote:

> Hi All,
>
> Sorry for this basic question as I am new to this R. I would like to know,
> is it possible to consider a matrix with some columns having numeric data
> and some other's with characters (strings) data? How do I get this type of
> data from a flat file.
>
> Thanks very much,
> mallika

If I understand the question, the answer is NO. A matrix must hold one type of data. I think that what you want is a data.frame, which allows mixed categories of data. Try this to see the difference:

a <- c('a','b','c')
b <- c(1,2,3)
aa <- cbind(a,b)
aa
class(aa)
bb <- data.frame(a,b)
bb
class(bb)
Re: [R] Reg. strings and numeric data in matrix.
See
?data.frame
?read.table
and please read (appropriate parts of) the Introduction to R manual.

Petr

Mallika Veeramalai wrote:

> Hi All,
>
> Sorry for this basic question as I am new to this R. I would like to know,
> is it possible to consider a matrix with some columns having numeric data
> and some other's with characters (strings) data? How do I get this type of
> data from a flat file.
>
> Thanks very much,
> mallika

--
Petr Klasterecky
Dept. of Probability and Statistics
Charles University in Prague
Czech Republic
Re: [R] dendrogram again
Though your example helped, I am still trying to get exactly what I want. Here's how I get my dendrogram, and my example (using ward instead of average doesn't make any difference for my case, I believe):

hc=hclust(dist(mymatrix),"average")
hcd=as.dendrogram(hc)
plot(hcd)

The major problem in my case is that my matrix has 196 lines, which means the end of the dendrogram has almost 200 end nodes. Of course I know about cut by now ;) but somehow the branches that I get from cut don't help me.

If I look at the full dendrogram I have about 25 nodes. I just want to label about 5 of them (from the upper end). Everything that happens below is not of interest to me.

Do you know how to label some specific nodes? I'll try max.levels...

thx in advance
m.-

On 09.03.2007 at 13:52, Gavin Simpson wrote:

> require(vegan) ## if false install it
> data(varespec)
> hc <- hclust(vegdist(varespec, "bray"), method = "ward")
> hc <- as.dendrogram(hc)
> ## this is the full dendrogram - too many nodes, so prune
> plot(hc)
> ## lets take four clusters and prune it back
> hc.pruned <- cut(hc, h = 1) # can't specify k so read height of first
>                             # plot - cutting at h = 1 gives 4 clusters
> # plot only the upper part of the tree showing only the 4 clusters
> plot(hc.pruned$upper, center = TRUE)
Re: [R] Reg. strings and numeric data in matrix.
A matrix can only have 1 type of data, so if you try to include both strings and numbers in a matrix, the numbers will be converted to strings. Another type of data object is a data frame. A data frame works much like a matrix in many ways, but allows some columns to be numbers and others to be strings (though usually strings are converted to factors).

You should read (or reread) the manual "An Introduction to R": section 5 talks about matrices, then section 6 talks about data frames (and lists). Section 7 shows how to read data from files into data frames. Those 3 sections should answer your questions below.

Hope this helps,

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
(801) 408-8111

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Mallika Veeramalai
Sent: Friday, March 09, 2007 1:03 PM
To: r-help@stat.math.ethz.ch
Subject: [R] Reg. strings and numeric data in matrix.

> Hi All,
>
> Sorry for this basic question as I am new to this R. I would like to know,
> is it possible to consider a matrix with some columns having numeric data
> and some other's with characters (strings) data? How do I get this type of
> data from a flat file.
>
> Thanks very much,
> mallika
Re: [R] Extracting text from a character string
Try replacing \d with \\d throughout your pattern. The R parser is trying to interpret the \ before the grep function ever sees it. By backslashing the backslashes, the parser ends up putting a single backslash in the pattern for grep to see.

Hope this helps,

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
(801) 408-8111

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Shawn Way
Sent: Friday, March 09, 2007 1:12 PM
To: r-help@stat.math.ethz.ch
Subject: [R] Extracting text from a character string

> I have a set of character strings like below:
>
> data3[1]
> [1] "CB01_0171_03-27-2002-(Sample 26609)-(126)"
>
> I am trying to extract the text 03-27-2002 and convert this into a date for
> the same record. I keep looking at the grep function, however I cannot
> quite get it to work.
>
> grep("\d\d-\d\d-\d\d\d\d",data3[1],perl=TRUE,value=TRUE)
>
> Any hints?
>
> Shawn Way
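The escaping advice can be tried directly on the string from the question. The sketch below also shows one thing worth keeping in mind: `grep(..., value = TRUE)` returns the whole element that matches, not just the matched text. Pulling out only the date with `regmatches()` and `regexpr()` is an addition here, not something proposed in the thread; the variable names are arbitrary.

```r
x <- "CB01_0171_03-27-2002-(Sample 26609)-(126)"

# "\\d" in R source becomes the two characters \d in the pattern itself.
pattern <- "\\d\\d-\\d\\d-\\d\\d\\d\\d"

# grep() with value = TRUE returns the whole matching element ...
grep(pattern, x, perl = TRUE, value = TRUE)

# ... whereas regmatches() with regexpr() returns only the matched text.
date_text <- regmatches(x, regexpr(pattern, x, perl = TRUE))
date_text
```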
Re: [R] Extracting text from a character string
On Fri, 2007-03-09 at 15:23 -0500, Shawn Way wrote:

> I have a set of character strings like below:
>
> data3[1]
> [1] "CB01_0171_03-27-2002-(Sample 26609)-(126)"
>
> I am trying to extract the text 03-27-2002 and convert this into a date for
> the same record. I keep looking at the grep function, however I cannot
> quite get it to work.
>
> grep("\d\d-\d\d-\d\d\d\d",data3[1],perl=TRUE,value=TRUE)
>
> Any hints?

At least two different ways:

Vec <- "CB01_0171_03-27-2002-(Sample 26609)-(126)"

1. Using substr(), if your source vector is a fixed format:

# Get the 11th thru the 20th character
substr(Vec, 11, 20)
[1] "03-27-2002"

2. Using sub() for a more generalized approach:

# Use a back reference, returning the value pattern within the
# parens
sub(".+([0-9]{2}-[0-9]{2}-[0-9]{4}).+", "\\1", Vec)
[1] "03-27-2002"

See ?substr, ?sub and ?regex

HTH,

Marc Schwartz
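The question also asked for the extracted text to be converted into a date, which the replies leave open. A minimal sketch, assuming the text is in month-day-year order (the sample value 03-27-2002 can only be read that way), combines the sub() approach above with `as.Date()`. The second string in `ids` is a made-up record added to show that the calls are vectorised.

```r
ids <- c("CB01_0171_03-27-2002-(Sample 26609)-(126)",
         "CB01_0172_11-05-2003-(Sample 26610)-(127)")  # second entry is hypothetical

# Pull out the dd-dd-dddd piece of every element.
date_text <- sub(".+([0-9]{2}-[0-9]{2}-[0-9]{4}).+", "\\1", ids)

# Convert to Date objects, stating the input format explicitly.
dates <- as.Date(date_text, format = "%m-%d-%Y")
dates
class(dates)
```

Stating `format` explicitly matters here, because the default formats tried by `as.Date()` expect the year first.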
Re: [R] Extracting text from a character string
Try this:

library(gsubfn)
x <- "CB01_0171_03-27-2002-(Sample 26609)-(126)"
unlist(strapply(x, "..-..-...."))

The gsubfn home page is at: http://code.google.com/p/gsubfn/

On 3/9/07, Shawn Way [EMAIL PROTECTED] wrote:

> I have a set of character strings like below:
>
> data3[1]
> [1] "CB01_0171_03-27-2002-(Sample 26609)-(126)"
>
> I am trying to extract the text 03-27-2002 and convert this into a date for
> the same record. I keep looking at the grep function, however I cannot
> quite get it to work.
>
> grep("\d\d-\d\d-\d\d\d\d",data3[1],perl=TRUE,value=TRUE)
>
> Any hints?
>
> Shawn Way
Re: [R] Extracting text from a character string
Actually, I am thinking of strsplit().

On 3/9/07, Shawn Way [EMAIL PROTECTED] wrote:

> I have a set of character strings like below:
>
> data3[1]
> [1] "CB01_0171_03-27-2002-(Sample 26609)-(126)"
>
> I am trying to extract the text 03-27-2002 and convert this into a date for
> the same record. I keep looking at the grep function, however I cannot
> quite get it to work.
>
> grep("\d\d-\d\d-\d\d\d\d",data3[1],perl=TRUE,value=TRUE)
>
> Any hints?
>
> Shawn Way

--
WenSui Liu
A lousy statistician who happens to know a little programming
(http://spaces.msn.com/statcompute/blog)
[R] About cex=: how to improve resolution?
Hi,

I need to plot a graph with a circle of radius 1 and with a series of points of different size. The size of these points compared to the fixed circle is important and bears a meaning.

Here is a simplified version of the code I'm using:

x <- seq(0,2,by=0.2)
y <- x
z <- seq(0,1,by=0.1)
angle <- pi/180*c(0:359)
par(pty="s")
plot(-2:2,-2:2,type="n")
lines(cos(angle),sin(angle))
points(x,y,cex=z)

I obtain points of the same size when cex=0.1/0.2/0.3/0.4 or cex=0.5/0.6/0.7 or cex=0.8/0.9/1.0.

Please, does anyone know if there is a way of improving the resolution of cex in order to have 10 points *all* of different size (respecting the above values of cex)? The circle is fixed at radius 1, the values of cex are in relation to that, and they shouldn't be modified.

Thanks,
Luca
[R] dendrogram - got it , just need to label :)
Hi all, Hi Gavin,

thx for your help. I finally found out what I want to do and how to fix it: I just needed to get some more levels, as my cut level was too small.

Two questions remain:

a) Can I somehow scale the twigs after cutting?
b) How can I label the nodes, and how do I choose which ones to label?

thx !!
-m.
Re: [R] Reformulated matrices dimensions limitation problem
Have a look at the help page for memory.size and memory.limit. The help says you can use these functions on Windows and another approach with Unix. Once you know the available memory you can calculate the total matrix size that fits in it (knowing that a real number takes 8 bytes). I would recommend using up to 70-80% of the available memory for your matrix. But then there's the overhead for each R object, which is very non-trivial (but not so bad as to be completely depressing...).

--e
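The arithmetic suggested above (8 bytes per real number, and using only part of the available memory) can be sketched as follows. The helper names `matrix_bytes` and `max_square_dim` and the default fraction of 0.75 are assumptions made for this illustration, not functions from R or from the thread.

```r
# Bytes needed for the data of an n x p numeric (double) matrix.
matrix_bytes <- function(n, p) n * p * 8

# Largest square numeric matrix that fits in a given share of a memory
# budget given in bytes (both helpers are hypothetical).
max_square_dim <- function(avail_bytes, fraction = 0.75) {
  floor(sqrt(avail_bytes * fraction / 8))
}

# A 10000 x 10000 matrix of doubles, in MiB.
matrix_bytes(10000, 10000) / 2^20

# Compare the estimate with what R reports for a real object;
# the small difference is the per-object overhead.
m <- matrix(0, 1000, 1000)
as.numeric(object.size(m))
matrix_bytes(1000, 1000)
```

Intermediate copies made during computation can need several times this amount, which is one more reason to stay well below the limit.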
Re: [R] MCMC logit
As the error message clearly indicates, the function MCMClogit is unable to find the variable x1 (possibly x2, x3, and x4 also) in the data frame c.df. Check the names of the variables in that data frame and make sure that the names correspond to the formula specification.

Hope this helps,
Ravi.

---
Ravi Varadhan, Ph.D.
Assistant Professor, The Center on Aging and Health
Division of Geriatric Medicine and Gerontology
Johns Hopkins University
Ph: (410) 502-2619
Fax: (410) 614-9625
Email: [EMAIL PROTECTED]
Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Anamika Chaudhuri
Sent: Friday, March 09, 2007 3:27 PM
To: r-help@stat.math.ethz.ch
Subject: [R] MCMC logit
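The check suggested in the reply can be sketched as below. Since the original data file is not available, `d.df` here is a simulated stand-in, and its column names (`V1` to `V4`, `resp`) are assumptions; the real names come from the header line of the file. The point is only that `cbind(y, x)` keeps whatever names the covariate columns already had, so the formula `y ~ x1 + x2 + x3 + x4` works only if the columns really are called x1 to x4.

```r
set.seed(1)

# Simulated stand-in for the file contents; column names are hypothetical.
d.df <- data.frame(V1 = rnorm(10), V2 = rnorm(10), V3 = rnorm(10),
                   V4 = rnorm(10), resp = rbinom(10, 1, 0.5))

y <- d.df[, ncol(d.df)]
x <- d.df[, 1:4]
c.df <- cbind(y, x)

# Inspect the names actually present in the data frame.
names(c.df)

# Rename the columns so that they match the formula.
names(c.df) <- c("y", paste("x", 1:4, sep = ""))

# Every variable used in the formula should now be found in c.df.
all(all.vars(y ~ x1 + x2 + x3 + x4) %in% names(c.df))
```

Alternatively, the formula can be written with the names that `names(c.df)` reports.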
Re: [R] About cex=: how to improve resolution?
Replace

points(x,y,cex=z)

with

symbols(x, y, circles=z/10, inches=FALSE, add=TRUE)
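Put together with the code from the question, the suggestion looks like the sketch below. The divisor of 10 is taken from the reply; with `inches = FALSE` the radii are in x-axis units, so each circle's size is directly comparable with the unit circle. The variable `radius` is introduced here only for readability.

```r
angle <- pi / 180 * (0:360)
x <- seq(0, 2, by = 0.2)
y <- x
z <- seq(0, 1, by = 0.1)

# Radii in axis units; z/10 is the scaling used in the reply.
radius <- z / 10

par(pty = "s")                       # square plotting region
plot(-2:2, -2:2, type = "n")         # same axes as in the question
lines(cos(angle), sin(angle))        # circle of radius 1
symbols(x, y, circles = radius, inches = FALSE, add = TRUE)
```

Because the radii are continuous values in user coordinates, all the nonzero sizes are drawn distinctly, unlike the `cex` values that collapsed into three groups.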
[R] Using large datasets: can I overload the subscript operator?
Hello,

I do some computations on datasets that come from climate models. These data are huge arrays, significantly larger than typically available RAM, so they have to be accessed row-by-row, or rather slice-by-slice, depending on the task. I would like to make an R package to easily access such datasets within R. The C++ backend is ready and being used under Windows/.Net/Visual Basic, but I have yet to learn the specifics of R programming to make a good R interface.

I think it should be possible to make a package (call it "slice") that could be used like this:

library (slice)
dataset <- load.virtualarray ("dataset_definition.xml")
ordinaryvector <- dataset [ , 2, 3] # Load a portion of the data from disk and extract it

In the above, "dataset" is an object that holds a definition of a 3-dimensional large dataset, and "ordinaryvector" is an ordinary R vector. The subscripting operator fetches the necessary data from disk and extracts the required slice, taking care of caching and other technical details.

So, my questions are:

Has anyone ever made a similar extension, with virtual (lazy) arrays?

Can the subscript operator be overloaded like that in R? (I know it can be in S, at least for vectors.)

And a tough one: is it possible to make an expression like "[1]" (without quotes) meaningful in R? At the moment it results in a syntax error. I would like to make it return an object of a special class that gets interpreted when subscripting my virtual array as "drop this dimension", like this:

dataset [, 2, 3, drop = F]     # Return a 3-dimensional array
dataset [, [2], 3, drop = F]   # Return a 2-dimensional array
dataset [, [2], [3], drop = F] # Return a 1-dimensional array, like dataset [, 2, 3]

Thanks in advance for any help,
Maciej.
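On the second question, the subscript operator can be given a class-specific meaning in R by defining an S3 method for `[`. The sketch below is only an illustration of that mechanism: the `virtualarray` constructor, its `backend` and `dim` fields, and the fixed three-dimensional signature are all assumptions, and the "backend" is an ordinary in-memory array standing in for the disk-based storage described in the post.

```r
# Constructor for a hypothetical "virtualarray"; a real one would hold a
# file handle or dataset definition instead of the data itself.
virtualarray <- function(data) {
  structure(list(backend = data, dim = dim(data)), class = "virtualarray")
}

# S3 method for '[' on objects of class "virtualarray" (3 dimensions only).
"[.virtualarray" <- function(x, i, j, k, drop = TRUE) {
  d <- x$dim
  # An empty subscript means "all indices along this dimension".
  if (missing(i)) i <- seq_len(d[1])
  if (missing(j)) j <- seq_len(d[2])
  if (missing(k)) k <- seq_len(d[3])
  # A real backend would fetch only this slice from disk here.
  x$backend[i, j, k, drop = drop]
}

dataset <- virtualarray(array(1:24, dim = c(2, 3, 4)))

ordinaryvector <- dataset[, 2, 3]          # plain vector
ordinaryvector
dim(dataset[, 2, 3, drop = FALSE])         # dimensions kept
```

A bare `[2]` cannot be given a meaning this way, since the parser rejects it before any method is called; a wrapper function around the index (for instance a hypothetical `dropdim(2)`) that returns an object of a special class would be one way to carry the same information into the method.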
Re: [R] Extracting the p of F statistics from lm
On Mar 9, 2007, at 11:18 AM, Cressoni, Massimo ((NIH/NHLBI)) [F] wrote:

> I need to extract the p value from an ANOVA done with lm model
>
>     fitting <- lm(var ~ group)
>     Sfitting <- summary(fitting)
>
> Sfitting[10][1] gives the F value and the degrees of freedom but I am not able to get the p value.

Try

    Sfitting[4]$coefficients[,4]

I'm not sure that this is the best way, but it works with the example for lm():

    summary(lm.D9)[4]$coefficients[,4]
    #  (Intercept)     groupTrt
    # 9.547128e-15 2.490232e-01

_
Professor Michael Kubovy
University of Virginia
Department of Psychology
USPS:     P.O.Box 400400, Charlottesville, VA 22904-4400
Parcels:  Room 102, Gilmer Hall, McCormick Road, Charlottesville, VA 22903
Office:   B011  +1-434-982-4729
Lab:      B019  +1-434-982-4751
Fax:      +1-434-982-4766
WWW:      http://www.people.virginia.edu/~mk9y/
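The coefficient table holds the per-term t-test p-values. The p value of the overall F test is not stored in the summary object, but it can be computed from the `fstatistic` element with pf(). A minimal sketch, assuming the data below (re-typed from the example on the ?lm help page) are the ones behind lm.D9; the names p_overall and p_anova are only illustrative:

```r
# Data re-typed from the ?lm example (assumed to match lm.D9 above).
ctl <- c(4.17, 5.58, 5.18, 6.11, 4.50, 4.61, 5.17, 4.53, 5.33, 5.14)
trt <- c(4.81, 4.17, 4.41, 3.59, 5.87, 3.83, 6.03, 4.89, 4.32, 4.69)
group <- gl(2, 10, 20, labels = c("Ctl", "Trt"))
weight <- c(ctl, trt)
lm.D9 <- lm(weight ~ group)

Sfitting <- summary(lm.D9)

# fstatistic is a named vector: value, numdf, dendf.
fstat <- Sfitting$fstatistic

# Upper-tail probability of the F distribution = p value of the overall F test.
p_overall <- pf(fstat["value"], fstat["numdf"], fstat["dendf"],
                lower.tail = FALSE)

# The same number is available from the ANOVA table of the fit.
p_anova <- anova(lm.D9)[["Pr(>F)"]][1]

print(unname(p_overall))
print(p_anova)
```

With a single two-level factor the F test and the t test on groupTrt coincide, which is why the coefficient-table route happens to give the same number in this particular example; with more groups only the pf() or anova() route gives the overall p value.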
[R] H0 and H1 probabilities in Cohen's Effect Size w for X2 test
Dear all,

I've been delighted to just notice that Cohen's formulas for Effect Size 'w' and the associated power have been implemented in the 'pwr' package (thanks to Stéphane Champely and others). There is one aspect, though, that perplexes me. I'm doing some last-minute post hoc analyses, meaning that my sample size (N=3404) has long been fixed, and I'm interested in assessing the ES and power after the fact.

As far as I can deduce from the implementation of the ES.w2 formula or Cohen's (1992) own article, it seems to me that the probabilities p(H0) and p(H1) would simply be the expected and observed absolute frequencies divided by the sample size N, in that the 'true' probabilities are the observed proportions and the null probabilities the expected ones. If this is correct, then the effect size and the power statistics can easily be calculated with the 'pwr' package. However, this entails that the noncentrality parameter lambda=N*w^2 is equal to the chi-squared statistic X^2.

> observed
    p   h   m    a
X 119  64  36   37
Y 594 323 776 1455

> expected
          p         h         m         a
X  53.62162  29.10458  61.06698  112.2068
Y 659.37838 357.89542 750.93302 1379.7932

> observed.p
           p          h          m          a
X 0.03495887 0.01880141 0.01057579 0.01086957
Y 0.17450059 0.09488837 0.22796710 0.42743831

> expected.p
           p           h          m          a
X 0.01575253 0.008550112 0.01793977 0.03296322
Y 0.19370693 0.105139664 0.22060312 0.40534465

> ES.w2(observed.p)
[1] 0.2406104
> ES.w1(expected.p,observed.p)
[1] 0.2406104
> pwr.chisq.test(w=ES.w1(expected.p,observed.p),N=3404,sig.level=.05, df=3)

     Chi squared power calculation

              w = 0.2406104
              N = 3404
             df = 3
      sig.level = 0.05
          power = 1

NOTE: N is the number of observations

> lambda <- 3404*ES.w1(observed.p,expected.p)^2
> lambda
[1] 240.9289
> pchisq(qchisq(p=.05,df=3,lower.tail=F),ncp=lambda,df=3,lower=F)
[1] 1

Have I missed or misunderstood something here altogether? Should the alternative H0 probabilities be estimated by e.g. some sort of fitting? Any pointers, suggestions or assistance would be greatly appreciated.
-Antti Arppe
--
Antti Arppe - Master of Science (Engineering)
Researcher doctoral student (Linguistics)
E-mail: [EMAIL PROTECTED]
WWW: http://www.ling.helsinki.fi/~aarppe
--
Work: Department of General Linguistics, University of Helsinki
Work address: P.O. Box 9 (Siltavuorenpenger 20 A), 00014 University of Helsinki, Finland
Work telephone: +358 9 19129312 (int'l) 09-19129312 (in Finland)
Work telefax: +358 9 19129307 (int'l) 09-19129307 (in Finland)
--
Private address: Fleminginkatu 25 E 91, 00500 Helsinki, Finland
Private telephone: +358 50 5909015 (int'l) 050-5909015 (in Finland)
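The identity lambda = N*w^2 = X^2 mentioned in the post can be checked numerically in base R, without the pwr package. A minimal sketch, assuming the observed counts quoted in the post and a plain test of independence; the variable names are illustrative only:

```r
# Observed counts re-typed from the post.
observed <- matrix(c(119,  64,  36,   37,
                     594, 323, 776, 1455),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(c("X", "Y"), c("p", "h", "m", "a")))
N <- sum(observed)

# Expected counts under independence of rows and columns.
expected <- outer(rowSums(observed), colSums(observed)) / N

p1 <- observed / N   # "alternative" cell probabilities (observed proportions)
p0 <- expected / N   # null cell probabilities

# Cohen's w with the null probabilities in the denominator.
w <- sqrt(sum((p1 - p0)^2 / p0))
lambda <- N * w^2

# Pearson chi-squared statistic, by hand and via chisq.test().
X2 <- sum((observed - expected)^2 / expected)
X2_builtin <- unname(chisq.test(observed, correct = FALSE)$statistic)

# Same formula with the two sets of probabilities swapped.
w_swapped <- sqrt(sum((p0 - p1)^2 / p1))

print(c(w = w, lambda = lambda, X2 = X2, swapped = N * w_swapped^2))
```

Under these assumptions lambda and X^2 agree (both about 197.07, i.e. 3404 * 0.2406104^2). The value 240.9289 in the post is what the swapped version gives, which suggests the `lambda` line passed observed.p and expected.p to ES.w1 in the opposite order from the earlier calls; this reading assumes ES.w1 takes the null probabilities as its first argument.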
[R] Table Construction from calculations
Hi-

I am trying to create a table of values by adding pairs of vectors, but am running into some problems. The problem is best expressed by a simple example.

Starting with a data table basis:

      atom   x   y   z
    1   Cu 0.0 0.0 0.0
    2   Cu 0.5 0.5 0.5

I want to add 0.5 0.5 0.5 (and also the 0 0 0, but it wouldn't change the values below so I won't refer to it in the rest of the example) to a list of vectors in the form of latpoints:

      V1 V2 V3
    1  0  0  0
    2  0  0  1
    3  0  0  2
    4  0  0  3
    5  0  1  1

so that I end up with a table such as:

     V1  V2  V3
    0.5 0.5 0.5
    0.5 0.5 1.5
    0.5 0.5 2.5
    0.5 0.5 3.5
    0.5 1.5 1.5

I've tried many variations on the following (not just cat, but most of the data/data.table options):

    test = for(i in 1:5) {cat(basis[1,2:4] + latticemultipliers[i,], append=TRUE)}

However, I either end up with an error telling me that cat doesn't handle type 'list' or with a table with length of 1 such as:

        x   y   z
    2 0.5 1.5 1.5

which is simply the last value that the loop calculates. Does anyone know what function handles lists of the form I am using, or have a better suggestion on how to get the form that I want?

Thanks in advance,
Seth Imhoff
Re: [R] Table Construction from calculations
Your data table basis is actually a data frame, whose first column is non-numeric. That's what is causing the problem. Try removing the first column of the data frame before adding the row to your matrix:

    test <- latpoints + basis[2, -1]

-Christos

Christos Hatzis, Ph.D.
Nuvera Biosciences, Inc.
400 West Cummings Park, Suite 5350
Woburn, MA 01801
Tel: 781-938-3830
www.nuverabio.com
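One caveat: basis[2, -1] is still a one-row data frame, and as far as I am aware arithmetic between two data frames requires equal dimensions, so the line above may stop with an error on a 5-row latpoints. A hedged sketch of one way around this, turning the row into a plain numeric vector with unlist() and adding it column by column. The data are re-typed from the question, and shift_lattice is a hypothetical helper name, not an existing function:

```r
# Data re-typed from the question.
basis <- data.frame(atom = c("Cu", "Cu"),
                    x = c(0, 0.5), y = c(0, 0.5), z = c(0, 0.5))
latpoints <- data.frame(V1 = c(0, 0, 0, 0, 0),
                        V2 = c(0, 0, 0, 0, 1),
                        V3 = c(0, 1, 2, 3, 1))

# Hypothetical helper: add offset[j] to column j of a data frame of points.
shift_lattice <- function(points, offset) {
  as.data.frame(Map(`+`, points, offset))
}

# One basis atom: unlist() turns the one-row data frame into a numeric vector.
offset <- unlist(basis[2, c("x", "y", "z")])
shifted <- shift_lattice(latpoints, offset)
print(shifted)

# All basis atoms at once, stacked into one table.
all_atoms <- do.call(rbind, lapply(seq_len(nrow(basis)), function(i) {
  data.frame(atom = basis$atom[i],
             shift_lattice(latpoints, unlist(basis[i, c("x", "y", "z")])))
}))
print(all_atoms)
```

Map() pairs each column with its own offset, so the result does not depend on how R recycles a short vector across a rectangular object.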
Re: [R] Using large datasets: can I overload the subscript operator?
On 3/9/2007 6:47 PM, Maciej Radziejewski wrote:

> Hello, I do some computations on datasets that come from climate models. These data are huge arrays, significantly larger than typically available RAM, so they have to be accessed row by row, or rather slice by slice, depending on the task. I would like to make an R package to easily access such datasets within R. The C++ backend is ready and being used under Windows/.Net/Visual Basic, but I have yet to learn the specifics of R programming to make a good R interface.
>
> I think it should be possible to make a package (call it "slice") that could be used like this:
>
>     library(slice)
>     dataset <- load.virtualarray("dataset_definition.xml")
>     ordinaryvector <- dataset[, 2, 3]  # Load a portion of the data from disk and extract it
>
> In the above, "dataset" is an object that holds a definition of a 3-dimensional large dataset, and "ordinaryvector" is an ordinary R vector. The subscripting operator fetches the necessary data from disk and extracts the required slice, taking care of caching and other technical details.
>
> So, my questions are: Has anyone ever made a similar extension, with virtual (lazy) arrays?

Yes, e.g. the SQLiteDF package.

> Can the subscript operator be overloaded like that in R? (I know it can be in S, at least for vectors.)

Yes.

> And a tough one: is it possible to make an expression like [1] (without quotes) meaningful in R? At the moment it results in a syntax error. I would like to make it return an object of a special class that gets interpreted, when subscripting my virtual array, as "drop this dimension", like this:
>
>     dataset[, 2, 3, drop = F]      # Return a 3-dimensional array
>     dataset[, [2], 3, drop = F]    # Return a 2-dimensional array
>     dataset[, [2], [3], drop = F]  # Return a 1-dimensional array, like dataset[, 2, 3]

No, that's not legal S or R syntax. However, you might be able to define a special object D and use syntax like

    dataset[, D[2], 3, drop = F]

Duncan Murdoch

> Thanks in advance for any help,
> Maciej.
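Both suggestions can be sketched with S3 methods for `[`. This is only an illustration under stated assumptions: virtual_array, the "dropper" and "drop_index" classes, and the fetch callback are hypothetical names invented here, and an in-memory array stands in for the disk backend.

```r
# Hypothetical constructor: 'fetch' is a function(i, j, k) returning a 3-d array.
virtual_array <- function(dims, fetch) {
  structure(list(dims = dims, fetch = fetch), class = "virtual_array")
}

# Marker object: D[2] means "index 2, and drop this dimension".
D <- structure(list(), class = "dropper")
"[.dropper" <- function(x, i) {
  stopifnot(length(i) == 1)
  structure(list(index = i), class = "drop_index")
}

# S3 method for subscripting a 3-dimensional virtual array.
"[.virtual_array" <- function(x, i, j, k, drop = TRUE) {
  d <- x$dims
  if (missing(i)) i <- seq_len(d[1])
  if (missing(j)) j <- seq_len(d[2])
  if (missing(k)) k <- seq_len(d[3])
  idx <- list(i, j, k)
  # Which subscripts were wrapped in D[...]?
  marked <- vapply(idx, inherits, logical(1), what = "drop_index")
  idx <- lapply(idx, function(a) if (inherits(a, "drop_index")) a$index else a)
  out <- x$fetch(idx[[1]], idx[[2]], idx[[3]])  # always 3-dimensional
  if (drop) return(base::drop(out))
  # With drop = FALSE, remove only the dimensions marked with D[...].
  array(out, dim = dim(out)[!marked])
}

# Stand-in for the disk backend: an ordinary array captured in a closure.
backing <- array(1:24, dim = c(2, 3, 4))
dataset <- virtual_array(dim(backing),
                         function(i, j, k) backing[i, j, k, drop = FALSE])

ordinaryvector <- dataset[, 2, 3]
print(ordinaryvector)
print(dim(dataset[, 2, 3, drop = FALSE]))
print(dim(dataset[, D[2], 3, drop = FALSE]))
print(dim(dataset[, D[2], D[3], drop = FALSE]))
```

A real package would replace the closure with calls into the C++ backend and would probably generalise the method beyond three dimensions; the dispatch mechanism stays the same.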
Re: [R] Table Construction from calculations
On Mar 9, 2007, at 9:23 PM, Seth Imhoff wrote:

> I am trying to create a table of values by adding pairs of vectors, but am running into some problems. The problem is best expressed by a simple example.
>
> Starting with a data table basis:
>
>       atom   x   y   z
>     1   Cu 0.0 0.0 0.0
>     2   Cu 0.5 0.5 0.5
>
> I want to add 0.5 0.5 0.5 (and also the 0 0 0, but it wouldn't change the values below so I won't refer to it in the rest of the example) to a list of vectors in the form of latpoints:
>
>       V1 V2 V3
>     1  0  0  0
>     2  0  0  1
>     3  0  0  2
>     4  0  0  3
>     5  0  1  1
>
> so that I end up with a table such as:
>
>      V1  V2  V3
>     0.5 0.5 0.5
>     0.5 0.5 1.5
>     0.5 0.5 2.5
>     0.5 0.5 3.5
>     0.5 1.5 1.5
>
> I've tried many variations on the following (not just cat, but most of the data/data.table options):
>
>     test = for(i in 1:5) {cat(basis[1,2:4] + latticemultipliers[i,], append=TRUE)}
>
> However, I either end up with an error telling me that cat doesn't handle type 'list' or with a table with length of 1 such as:
>
>         x   y   z
>     2 0.5 1.5 1.5

Is this what you want?

> (latpoints <- data.frame(matrix(c(0,0,0,0,0,1,0,0,2,0,0,3,0,1,1), nrow = 5, byrow = T)))
  X1 X2 X3
1  0  0  0
2  0  0  1
3  0  0  2
4  0  0  3
5  0  1  1
> (latpoints <- latpoints + c(0.5, 0.5, 0.5))
   X1  X2  X3
1 0.5 0.5 0.5
2 0.5 0.5 1.5
3 0.5 0.5 2.5
4 0.5 0.5 3.5
5 0.5 1.5 1.5

This is an important feature of R called vectorization (see, e.g., cran.r-project.org/doc/contrib/Paradis-rdebuts_en.pdf or www.ms.washington.edu/stat390/winter07/R_primer.pdf), which allows you to avoid writing loops.

_
Professor Michael Kubovy
University of Virginia
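A note on recycling order, since it matters as soon as the three offsets differ: a short vector added to a matrix (or data frame) is recycled down the columns, not along the rows. The example above works because all three offsets are equal. A small sketch with made-up offsets, using sweep() for the row-wise addition:

```r
m <- matrix(0, nrow = 5, ncol = 3)
v <- c(0.1, 0.2, 0.3)   # illustrative offsets, one per column

# Plain addition recycles v down each column in turn (column-major order).
naive <- m + v

# sweep() adds v[j] to every element of column j.
rowwise <- sweep(m, 2, v, "+")

print(naive[1, ])    # first row mixes the offsets
print(rowwise[1, ])  # first row is exactly v
```

An equivalent idiom is t(t(m) + v), which transposes so that the column-wise recycling lines up with the intended rows.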
Re: [R] Using large datasets: can I overload the subscript operator?
Look at the netcdf packages. A lot of output from climate models is in netcdf anyway. It can take all sorts of slices and strides.

-Roy M.

**
The contents of this message do not reflect any position of the U.S. Government or NOAA.
**
Roy Mendelssohn
Supervisory Operations Research Analyst
NOAA/NMFS Environmental Research Division
Southwest Fisheries Science Center
1352 Lighthouse Avenue
Pacific Grove, CA 93950-2097
e-mail: [EMAIL PROTECTED] (Note new e-mail address)
voice: (831)-648-9029
fax: (831)-648-8440
www: http://www.pfeg.noaa.gov/

Old age and treachery will overcome youth and skill.
[R] long character string problem
Hi All

I have 2 very long character strings (550 chars) and I want to put them together as expressions with c(). The problem is that I also get these double quotes, as seen below in 'fct'. How can I remove these double quotes? I tried as.name() but it did not work (because of size?). These are creating trouble with subsequent programs, which I tested with strings that for some reason do not have these double quotes (see very bottom).

> cum1
[1] "A11*(X11*x1+X21*x2)+1*sqrt(B11*(X11*x1+X21*x2)^2+C11)A12*(X12*x1+X22*x2)+1*sqrt(B12*(X12*x1+X22*x2)^2+C12)A13*(X13*x1+X23*x2)+-1*sqrt(B13*(X13*x1+X23*x2)^2+C13)A14*(X14*x1+X24*x2)+-1*sqrt(B14*(X14*x1+X24*x2)^2+C14)A15*(X15*x1+X25*x2)+1*sqrt(B15*(X15*x1+X25*x2)^2+C15)A16*(X16*x1+X26*x2)+1*sqrt(B16*(X16*x1+X26*x2)^2+C16)A17*(X17*x1+X27*x2)+1*sqrt(B17*(X17*x1+X27*x2)^2+C17)A18*(X18*x1+X28*x2)+1*sqrt(B18*(X18*x1+X28*x2)^2+C18)A19*(X19*x1+X29*x2)+-1*sqrt(B19*(X19*x1+X29*x2)^2+C19)A110*(X110*x1+X210*x2)+1*sqrt(B110*(X110*x1+X210*x2)^2+C110)"
> cum2
[1] "A21*(X11*x1+X21*x2)+1*sqrt(B21*(X11*x1+X21*x2)^2+C21)A22*(X12*x1+X22*x2)+1*sqrt(B22*(X12*x1+X22*x2)^2+C22)A23*(X13*x1+X23*x2)+-1*sqrt(B23*(X13*x1+X23*x2)^2+C23)A24*(X14*x1+X24*x2)+-1*sqrt(B24*(X14*x1+X24*x2)^2+C24)A25*(X15*x1+X25*x2)+1*sqrt(B25*(X15*x1+X25*x2)^2+C25)A26*(X16*x1+X26*x2)+1*sqrt(B26*(X16*x1+X26*x2)^2+C26)A27*(X17*x1+X27*x2)+1*sqrt(B27*(X17*x1+X27*x2)^2+C27)A28*(X18*x1+X28*x2)+1*sqrt(B28*(X18*x1+X28*x2)^2+C28)A29*(X19*x1+X29*x2)+-1*sqrt(B29*(X19*x1+X29*x2)^2+C29)A210*(X110*x1+X210*x2)+1*sqrt(B210*(X110*x1+X210*x2)^2+C210)"
> fct = c(as.expression(cum1), as.expression(cum2))
> fct
expression("A11*(X11*x1+X21*x2)+1*sqrt(B11*(X11*x1+X21*x2)^2+C11)A12*(X12*x1+X22*x2)+1*sqrt(B12*(X12*x1+X22*x2)^2+C12)A13*(X13*x1+X23*x2)+-1*sqrt(B13*(X13*x1+X23*x2)^2+C13)A14*(X14*x1+X24*x2)+-1*sqrt(B14*(X14*x1+X24*x2)^2+C14)A15*(X15*x1+X25*x2)+1*sqrt(B15*(X15*x1+X25*x2)^2+C15)A16*(X16*x1+X26*x2)+1*sqrt(B16*(X16*x1+X26*x2)^2+C16)A17*(X17*x1+X27*x2)+1*sqrt(B17*(X17*x1+X27*x2)^2+C17)A18*(X18*x1+X28*x2)+1*sqrt(B18*(X18*x1+X28*x2)^2+C18)A19*(X19*x1+X29*x2)+-1*sqrt(B19*(X19*x1+X29*x2)^2+C19)A110*(X110*x1+X210*x2)+1*sqrt(B110*(X110*x1+X210*x2)^2+C110)", "A21*(X11*x1+X21*x2)+1*sqrt(B21*(X11*x1+X21*x2)^2+C21)A22*(X12*x1+X22*x2)+1*sqrt(B22*(X12*x1+X22*x2)^2+C22)A23*(X13*x1+X23*x2)+-1*sqrt(B23*(X13*x1+X23*x2)^2+C23)A24*(X14*x1+X24*x2)+-1*sqrt(B24*(X14*x1+X24*x2)^2+C24)A25*(X15*x1+X25*x2)+1*sqrt(B25*(X15*x1+X25*x2)^2+C25)A26*(X16*x1+X26*x2)+1*sqrt(B26*(X16*x1+X26*x2)^2+C26)A27*(X17*x1+X27*x2)+1*sqrt(B27*(X17*x1+X27*x2)^2+C27)A28*(X18*x1+X28*x2)+1*sqrt(B28*(X18*x1+X28*x2)^2+C28)A29*(X19*x1+X29*x2)+-1*sqrt(B29*(X19*x1+X29*x2)^2+C29)A210*(X110*x1+X210*x2)+1*sqrt(B210*(X110*x1+X210*x2)^2+C210)")

> fct = c(expression(2*x1^3-7*x2^2-9), expression(x1^2-x2^3+1))
> fct
expression(2 * x1^3 - 7 * x2^2 - 9, x1^2 - x2^3 + 1)
Re: [R] understanding print.summary.lm and perhaps print/show in general
Petr,

Thanks, you set me on the right path. It turns out that the behavior that surprised me is this: changing $coef is not the same as changing $coefficients. The first changes the value, but the second changes the value and also makes the print/show method change its output from when the summary.lm was created. I think the sample code below highlights this behavior nicely. Why would you want this behavior?

Cheers,
Paul

R code:

    lma <- lm(dist ~ speed, data=cars)
    suma <- summary(lma)
    colnames(suma$coef) <- c(LETTERS[1:4])
    dimnames(suma$coef)  # after setting colnames, dimnames of coefficients variable is set
    suma                 # but printing is still the old print
    dimnames(suma$coefficients) <- list(names(suma$coefficients), c(LETTERS[1:4]))
    dimnames(suma$coef)  # no change in dimnames from before
    suma                 # but the summary output is now refreshed!

##

Another solution is to look into the code of summary.lm a few lines above where the (dim)names are assigned. Based on this, you may try

    lma <- lm(dist ~ speed, data=cars)
    suma <- summary(lma)
    colnames(suma$coef) <- c(LETTERS[1:4])
    printCoefmat(suma$coef)  # prints what I expect
    suma
    dimnames(suma$coefficients) <- list(names(suma$coefficients), c(LETTERS[1:4]))
    suma

You might also find reading the chapter on generic functions in the R-lang (R language definition) manual useful.

Petr
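A likely explanation for the difference, sketched below: reading with `$` allows partial matching of list names, so suma$coef finds the element "coefficients", but assigning with `$<-` matches names exactly, so the assignment creates a new element literally called "coef". The print method keeps using "coefficients", which is untouched. The variable names are illustrative only:

```r
lma <- lm(dist ~ speed, data = cars)
suma <- summary(lma)

names_before <- names(suma)

# Reading: partial matching, 'coef' resolves to 'coefficients'.
# Writing: exact matching, so a NEW element named 'coef' is added.
colnames(suma$coef) <- LETTERS[1:4]

names_after <- names(suma)
print(setdiff(names_after, names_before))   # the newly created element

# The element that print.summary.lm uses is unchanged ...
print(colnames(suma$coefficients))
# ... while the exact match 'coef' now shadows it for reads via $coef.
print(colnames(suma$coef))
```

Spelling out the full element name (suma$coefficients) on both sides avoids the surprise.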
Re: [R] dendrogram - got it , just need to label :)
Here is one example of labeling nodes, borrowing code from the help page for the dendrapply() function.

    local({
      edgeLab <<- function(n) {
        if(!is.leaf(n)) {
          a <- attributes(n)
          i <<- i+1
          attr(n, "edgetext") <- format(i)
        }
        n
      }
      i <- 0
    })
    dL <- dendrapply(as.dendrogram(hclust(dist(iris[, 1:4]), method = "single")), edgeLab)
    plot(dL)

This labels the edges above the nodes.

Martin Maechler and Robert Gentleman are developing the dendrogram objects suite of functions. As I have had to label nodes in S-PLUS, I'd like to put in a request for a few more control parameters for edge/internal node labeling:

- Allow the label without the polygon surrounding it. The polygon can obliterate too much of the dendrogram for larger sample sizes. Perhaps an edgePar polygon plot logical "p.plot" taking values TRUE (default) and FALSE to omit the polygon.

- Allow the label to appear near the node at the base of the edge. Perhaps an edgePar text location parameter "t.pos" taking values in (0.0, 1.0), where 0.5 is in the middle of the edge (the default) and 1.0 is at the base of the edge.

Since clusters are not always identified by 'cutting' the dendrogram (e.g. in the iris single linkage dendrogram plot we want to identify internal nodes that are large runts or whose vertical edge lengths are considerably longer than average), it is useful to be able to identify nodes deeper in the tree. This is aided by having access to internal node labels/names and being able to extract internal nodes by those labels/names.

Best

Steven McKinney
Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre
email: [EMAIL PROTECTED]
tel: 604-675-8000 x7561
BCCRC Molecular Oncology
675 West 10th Ave, Floor 4
Vancouver B.C. V5Z 1L3
Canada

-----Original Message-----
From: [EMAIL PROTECTED] on behalf of bunny , lautloscrew.com
Sent: Fri 3/9/2007 2:02 PM
To: R-help@stat.math.ethz.ch
Subject: [R] dendrogram - got it , just need to label :)

> Hi all, Hi Gavin,
>
> thx for your help, i finally found out what i want to do and how to fix it. just needed to get some more levels, my cut level was too small...
>
> two questions remain...
> a) can i somehow scale the twigs after cutting ?
> b) how can i label the nodes and how to label which one...
>
> thx !!
>
> -m.
Re: [R] long character string problem
I think you are looking for

    fct <- c(parse(text=cum1), parse(text=cum2))

although you need to include operators before your A coefficients (for example, ...C11)A12*...
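The difference between as.expression() and parse(text = ) on a character string can be seen on the short strings from the bottom of the original post. A minimal sketch; the names quoted and fct are illustrative:

```r
cum1 <- "2*x1^3-7*x2^2-9"
cum2 <- "x1^2-x2^3+1"

# as.expression() on a character vector just wraps the strings:
# the result prints with double quotes and cannot be evaluated as a formula.
quoted <- c(as.expression(cum1), as.expression(cum2))
print(is.character(quoted[[1]]))

# parse(text = ) runs the R parser on the text and returns unevaluated calls.
fct <- c(parse(text = cum1), parse(text = cum2))
print(is.call(fct[[1]]))

# The parsed expressions can be evaluated once x1 and x2 are supplied.
val1 <- eval(fct[[1]], list(x1 = 1, x2 = 1))   # 2 - 7 - 9
val2 <- eval(fct[[2]], list(x1 = 1, x2 = 1))   # 1 - 1 + 1
print(c(val1, val2))

# Text without an operator between terms is not valid R and fails to parse.
bad <- try(parse(text = "sqrt(C11)A12"), silent = TRUE)
print(inherits(bad, "try-error"))
```

This is why the long strings need an operator (or a separator that is then split on) between the successive terms before parse() will accept them.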
Re: [R] long character string problem
First of all, your expressions are not legal. For example

    sqrt(B11*(X11*x1+X21*x2)^2+C11)A12*(X12*x1+X22*x2)

should be

    sqrt(B11*(X11*x1+X21*x2)^2+C11)*A12*(X12*x1+X22*x2)

Here is a rewrite of your first equation that seems to present the correct results:

> x <- expression(A11*(X11*x1+X21*x2)+1*sqrt(B11*(X11*x1+X21*x2)^2+C11)*A12*
+ (X12*x1+X22*x2)+1*sqrt(B12*(X12*x1+X22*x2)^2+C12)*A13*(X13*x1+X23*x2)+
+ -1*sqrt(B13*(X13*x1+X23*x2)^2+C13)*A14*(X14*x1+X24*x2)+-1*sqrt(B14*(X14*x1+X24*x2)^2+
+ C14)*A15*(X15*x1+X25*x2)+1*sqrt(B15*(X15*x1+X25*x2)^2+C15)*A16*(X16*x1+X26*x2)+
+ 1*sqrt(B16*(X16*x1+X26*x2)^2+C16)*A17*(X17*x1+X27*x2)+1*sqrt(B17*(X17*x1+X27*x2)^2+C17)*A18*
+ (X18*x1+X28*x2)+1*sqrt(B18*(X18*x1+X28*x2)^2+C18)*A19*(X19*x1+X29*x2)+-1*
+ sqrt(B19*(X19*x1+X29*x2)^2+C19)*A110*(X110*x1+X210*x2)+1*sqrt(B110*(X110*x1+X210*x2)^2+C110))
> x
expression(A11 * (X11 * x1 + X21 * x2) + 1 * sqrt(B11 * (X11 *
    x1 + X21 * x2)^2 + C11) * A12 * (X12 * x1 + X22 * x2) + 1 *
    sqrt(B12 * (X12 * x1 + X22 * x2)^2 + C12) * A13 * (X13 * x1 +
    X23 * x2) + -1 * sqrt(B13 * (X13 * x1 + X23 * x2)^2 + C13) *
    A14 * (X14 * x1 + X24 * x2) + -1 * sqrt(B14 * (X14 * x1 + X24 *
    x2)^2 + C14) * A15 * (X15 * x1 + X25 * x2) + 1 * sqrt(B15 *
    (X15 * x1 + X25 * x2)^2 + C15) * A16 * (X16 * x1 + X26 * x2) +
    1 * sqrt(B16 * (X16 * x1 + X26 * x2)^2 + C16) * A17 * (X17 *
    x1 + X27 * x2) + 1 * sqrt(B17 * (X17 * x1 + X27 * x2)^2 + C17) *
    A18 * (X18 * x1 + X28 * x2) + 1 * sqrt(B18 * (X18 * x1 + X28 *
    x2)^2 + C18) * A19 * (X19 * x1 + X29 * x2) + -1 * sqrt(B19 *
    (X19 * x1 + X29 * x2)^2 + C19) * A110 * (X110 * x1 + X210 *
    x2) + 1 * sqrt(B110 * (X110 * x1 + X210 * x2)^2 + C110))

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to
Re: [R] use nnet
Thank you very much. I have another question about nnet: if I set size=0 and skip=TRUE, the network has just an input layer and an output layer. Is this also called a perceptron network?

Thanks,
Aimin Yan

At 12:39 PM 3/9/2007, Wensui Liu wrote:

AM,

Sorry, please ignore the top box in the code. It is not actually a cross-validation but just a simple split-sample validation. Sorry for the confusion.

On 3/9/07, Wensui Liu [EMAIL PROTECTED] wrote:

AM,

I have a piece of junk on my blog. Here it is.

# USE CROSS-VALIDATION TO DO A GRID-SEARCH FOR
# THE OPTIMAL SETTINGS (WEIGHT DECAY AND NUMBER
# OF HIDDEN UNITS) OF NEURAL NETS

library(nnet);
library(MASS);
data(Boston);
X <- I(as.matrix(Boston[-14]));
# STANDARDIZE PREDICTORS
st.X <- scale(X);
Y <- I(as.matrix(Boston[14]));
boston <- data.frame(X = st.X, Y);

# DIVIDE DATA INTO TESTING AND TRAINING SETS
set.seed(2005);
test.rows <- sample(1:nrow(boston), 100);
test.set <- boston[test.rows, ];
train.set <- boston[-test.rows, ];

# INITIATE A NULL TABLE
sse.table <- NULL;

# SEARCH FOR OPTIMAL WEIGHT DECAY
# RANGE OF WEIGHT DECAYS SUGGESTED BY B. RIPLEY
for (w in c(0.0001, 0.001, 0.01)) {
  # SEARCH FOR OPTIMAL NUMBER OF HIDDEN UNITS
  for (n in 1:10) {
    # INITIATE A NULL VECTOR
    sse <- NULL;
    # FOR EACH SETTING, RUN NEURAL NET MULTIPLE TIMES
    for (i in 1:10) {
      # INITIATE THE RANDOM STATE FOR EACH NET
      set.seed(i);
      # TRAIN NEURAL NETS
      net <- nnet(Y~X, size = n, data = train.set, rang = 0.1, linout = TRUE,
                  maxit = 1, decay = w, skip = FALSE, trace = FALSE);
      # CALCULATE SSE FOR TESTING SET
      test.sse <- sum((test.set$Y - predict(net, test.set))^2);
      # APPEND EACH SSE TO A VECTOR
      if (i == 1) sse <- test.sse else sse <- rbind(sse, test.sse);
    }
    # APPEND AVERAGED SSE WITH RELATED PARAMETERS TO A TABLE
    sse.table <- rbind(sse.table, c(WT = w, UNIT = n, SSE = mean(sse)));
  }
}
# PRINT OUT THE RESULT
print(sse.table);

http://statcompute.spaces.live.com/Blog/cns!39C8032DBD1321B7!290.entry

On 3/9/07, Aimin Yan [EMAIL PROTECTED] wrote:

I want to adjust the weight decay and the number of hidden units for nnet with a loop like

for(decay) {
  for(number of unit) {
    for(#run) {
      model <- nnet()
      test.error <-
    }
  }
}

For example, I set decay=0.1, size=3, maxit=200. For this setting I run 10 times and calculate the test error. After that I want to get a matrix like this:

decay  size  maxit  #run  test_error
0.1    3     200    1     1.2
0.1    3     200    2     1.1
0.1    3     200    3     1.0
0.1    3     200    4     3.4
0.1    3     200    5     ..
0.1    3     200    6     ..
0.1    3     200    7     ..
0.1    3     200    8     ..
0.1    3     200    9     ..
0.1    3     200    10    ..
0.2    3     200    1     1.2
0.2    3     200    2     1.1
0.2    3     200    3     1.0
0.2    3     200    4     3.4
0.2    3     200    5     ..
0.2    3     200    6     ..
0.2    3     200    7     ..
0.2    3     200    8     ..
0.2    3     200    9     ..
0.2    3     200    10    ..

I am not sure if this is the correct way to do this. Has anyone tuned these parameters like this before?

Thanks,
Aimin

--
WenSui Liu
A lousy statistician who happens to know a little programming
(http://spaces.msn.com/statcompute/blog)
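The table asked for in the original question (one row per decay, size and run) can be built by looping over a grid of settings. The following is only a sketch under stated assumptions: the grid values, the three runs per setting and the object names grid and results are illustrative and do not come from the thread, and test_error here is the sum of squared errors on a held-out split of the Boston data, so it is a split-sample validation rather than a cross-validation.

```r
library(nnet)   # recommended package, normally installed with R
library(MASS)   # provides the Boston data set

# Standardise the 13 predictors; keep the response (medv) on its original scale.
boston <- data.frame(scale(Boston[, -14]), medv = Boston$medv)

set.seed(2005)
test.rows <- sample(nrow(boston), 100)
test.set  <- boston[test.rows, ]
train.set <- boston[-test.rows, ]

# One row per (decay, size, run) combination; the values are illustrative only.
grid <- expand.grid(run = 1:3, size = c(1, 3), decay = c(0.1, 0.2))
grid$maxit <- 200
grid$test_error <- NA_real_

for (k in seq_len(nrow(grid))) {
  set.seed(grid$run[k])   # a different random start for each run
  net <- nnet(medv ~ ., data = train.set,
              size = grid$size[k], decay = grid$decay[k],
              maxit = grid$maxit[k], linout = TRUE, trace = FALSE)
  # Sum of squared errors on the held-out rows
  grid$test_error[k] <- sum((test.set$medv - predict(net, test.set))^2)
}

# Reorder the columns to match the layout in the question
results <- grid[, c("decay", "size", "maxit", "run", "test_error")]

# Average over runs to compare settings
aggregate(test_error ~ decay + size, data = results, FUN = mean)
```

Keeping every run in results, instead of only the mean, also makes it possible to look at how much the error varies between random starts.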
[R] read a irregular text file data into dataframe()
I am using R 2.4.1 to read a text file with the data structure shown below. When I read the file into R using

tData <- read.table("c:\\test.txt")

it gives an error saying the data set has irregular columns. However, I need to use data of this type. Is there any alternative in R?

~ 0010 0028 0061 0088 0010 0042 0084 0004 0010 0055 0010 0018 0040 0042 0010 0046 0059 0010 0016 0042 0055 0010 0012 0018 0054 0010 0034 0042 0102 0081 0001 0076 0085 0080 0086 0017 0032 0081 0004 0010 0055 0010 0042 0061 0080 0010 0017 0078 0084 0006 0010 0040 0042 0075 0080 0005 0028 0032 0006 0010 0040 0061

--
Lecturer J. Joshua Thomas
KDU College Penang Campus
Research Student, University Sains Malaysia
Re: [R] use nnet
No, it is called regression. ^_^

On 3/9/07, Aimin Yan [EMAIL PROTECTED] wrote:

Thank you very much. I have another question about nnet: if I set size=0 and skip=TRUE, the network has just an input layer and an output layer. Is this also called a perceptron network?

--
WenSui Liu
A lousy statistician who happens to know a little programming
(http://spaces.msn.com/statcompute/blog)
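The answer above can be checked on a small example. With size = 0, skip = TRUE and linout = TRUE, the network has only a bias and direct input-to-output weights and is fitted by least squares, so it should land close to what lm() gives. This is a sketch on simulated data; the data frame d and its coefficients are made up for illustration, and how closely the two fits agree depends on the optimiser converging.

```r
library(nnet)

set.seed(1)
x <- rnorm(50)
d <- data.frame(x = x, y = 2 + 3 * x + rnorm(50, sd = 0.5))

# No hidden units: only the bias and the skip-layer weight from x to the output.
net <- nnet(y ~ x, data = d, size = 0, skip = TRUE, linout = TRUE, trace = FALSE)
fit <- lm(y ~ x, data = d)

length(net$wts)                         # number of weights in the network
cbind(nnet = net$wts, lm = coef(fit))   # compare the fitted weights with lm()
```

Without linout = TRUE the output unit is logistic, in which case the same architecture corresponds to a logistic regression instead.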
Re: [R] read a irregular text file data into dataframe()
I don't know of any canned function to do this, but you can write your own function (see contents below) to:

(1) open a file connection
(2) read the number of fields
(3) create an empty matrix with the number of rows and the maximum number of columns of your data
(4) rewind to the beginning of the file
(5) scan line by line and fill the matrix
(6) close the file connection
(7) convert the matrix to a data frame
(8) use the function type.convert() to automatically convert numerical columns to mode numeric (since scan(), as I've specified it, reads in everything as mode character, which converts the holding matrix's mode to character from its default of logical).

The function below will work for your example data set, but to make it more general you can add arguments like 'what' to scan() and 'sep' to both count.fields() and scan(); depending on whether you have column names, you can modify it accordingly as well.

# call function with this line
df <- read.irregular("c:\\test.txt")

# this is the function
read.irregular <- function(filenm) {
  fileID <- file(filenm, open="rt")
  nFields <- count.fields(fileID)
  mat <- matrix(nrow=length(nFields), ncol=max(nFields))
  invisible(seek(fileID, where=0, origin="start", rw="read"))
  for(i in 1:nrow(mat)) {
    mat[i, 1:nFields[i]] <- scan(fileID, what="", nlines=1, quiet=TRUE)
  }
  close(fileID)
  df <- as.data.frame(mat)
  df[] <- lapply(df, type.convert, as.is=TRUE)
  return(df)
}

Hope this helps.

--- j.joshua thomas [EMAIL PROTECTED] wrote:

I am using R 2.4.1 to read a text file with the data structure shown below. When I read the file into R using tData <- read.table("c:\\test.txt") it gives an error saying the data set has irregular columns. However, I need to use data of this type. Is there any alternative in R?
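The same eight steps can also be done without rewinding the connection, by reading all lines first and padding the short rows. This is a sketch of an alternative, not the function from the message above: read_ragged is a hypothetical name, the file is assumed to be whitespace-delimited, and the three-line temporary file only stands in for c:\\test.txt.

```r
# Hypothetical helper (not part of base R): read a whitespace-delimited file
# whose rows have different numbers of fields, padding short rows with NA.
read_ragged <- function(path) {
  lines  <- readLines(path)
  lines  <- lines[nzchar(trimws(lines))]            # skip blank lines
  fields <- strsplit(trimws(lines), "[[:space:]]+")
  width  <- max(lengths(fields))
  # Pad each row to the same length, then stack the rows into a character matrix
  padded <- lapply(fields, function(f) c(f, rep(NA_character_, width - length(f))))
  mat <- do.call(rbind, padded)
  df  <- as.data.frame(mat, stringsAsFactors = FALSE)
  # Convert columns to numeric where possible; note that this drops leading zeros
  df[] <- lapply(df, type.convert, as.is = TRUE)
  df
}

# Usage with a small temporary file standing in for the real data
tmp <- tempfile(fileext = ".txt")
writeLines(c("0010 0028 0061 0088", "0010 0042 0084", "0004 0010"), tmp)
df <- read_ragged(tmp)
df
```

If the four-digit values are codes rather than numbers, the type.convert() step can be left out so that they stay as character strings such as "0010".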
Re: [R] read a irregular text file data into dataframe()
read.table("c:\\test.txt", fill=TRUE)

Petr

j.joshua thomas wrote:

I am using R 2.4.1 to read a text file with the data structure shown below. When I read the file into R using tData <- read.table("c:\\test.txt") it gives an error saying the data set has irregular columns. However, I need to use data of this type. Is there any alternative in R?

--
Petr Klasterecky
Dept. of Probability and Statistics
Charles University in Prague
Czech Republic
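One caveat with fill=TRUE: read.table() decides the number of columns from the first five lines of input unless col.names is supplied, so a wider row further down the file can be handled incorrectly. A sketch of a more defensive call follows, assuming a whitespace-delimited file; the temporary file and the names n_col and tData are illustrative, and colClasses = "character" is an added choice to keep codes such as 0010 from being turned into numbers.

```r
tmp <- tempfile(fileext = ".txt")
writeLines(c("0010 0028",
             "0010 0042 0084",
             "0004 0010",
             "0010 0018",
             "0040 0042",
             "0010 0046 0059 0061 0088"), tmp)   # the widest row is the sixth line

n_col <- max(count.fields(tmp))   # widest row in the whole file

tData <- read.table(tmp, fill = TRUE,
                    col.names = paste0("V", seq_len(n_col)),   # forces n_col columns
                    colClasses = "character")                  # keep "0010" as text
dim(tData)
```

Short rows are padded with blank fields, so every row ends up with n_col entries.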
Re: [R] Using large datasets: can I overload the subscript operator?
On Sat, 10 Mar 2007, Maciej Radziejewski wrote:

> Hello,

The http://www.met.rdg.ac.uk/cag/rclim/ site may have some useful leads. In addition, you'll find ideas in two packages created by Tim Keitt, rgdal, and Rdbi+RdbiPgSQL (now on Bioconductor).

> I do some computations on datasets that come from climate models. These
> data are huge arrays, significantly larger than typically available RAM,
> so they have to be accessed row-by-row, or rather slice-by-slice,
> depending on the task. I would like to make an R package to easily access
> such datasets within R. The C++ backend is ready and being used under
> Windows/.Net/Visual Basic, but I have yet to learn the specifics of R
> programming to make a good R interface.

Look at the Matrix package for examples - you may need finalizers to tidy up memory allocation - see examples in rgdal. The key thing will be thinking through how to implement the R objects as classes, probably not simply reflecting the C++ classes. Classes are covered in the Green Book (Chambers 1998) and Venables & Ripley (2000) S Programming.

> I think it should be possible to make a package (call it "slice") that
> could be used like this:
>
> library (slice)
> dataset <- load.virtualarray ("dataset_definition.xml")
> ordinaryvector <- dataset [ , 2, 3] # Load a portion of the data from disk and extract it
>
> In the above, "dataset" is an object that holds a definition of a
> 3-dimensional large dataset, and "ordinaryvector" is an ordinary R vector.
> The subscripting operator fetches the necessary data from disk and extracts
> the required slice, taking care of caching and other technical details.
>
> So, my questions are:
>
> Has anyone ever made a similar extension, with virtual (lazy) arrays?
>
> Can the subscript operator be overloaded like that in R? (I know it can be
> in S, at least for vectors.)

Yes, there are many examples, see the Matrix package for some that use new-style classes (in language issues like this, R is S, the differences are in scoping).

> And a tough one: is it possible to make an expression like "[1]" (without
> quotes) meaningful in R? At the moment it results in a syntax error. I
> would like to make it return an object of a special class that gets
> interpreted when subscripting my virtual array as "drop this dimension",
> like this:

Most likely not in this context, because [ in this context will not be what you want. But if your "[.dataset" method is careful about examining its arguments, you ought to be able to get the result you want. You'll likely learn a good deal from looking for example at the code in the Matrix package.

> dataset [, 2, 3, drop = F] # Return a 3-dimensional array
> dataset [, [2], 3, drop = F] # Return a 2-dimensional array
> dataset [, [2], [3], drop = F] # Return a 1-dimensional array, like dataset [, 2, 3]
>
> Thanks in advance for any help,
>
> Maciej.

--
Roger Bivand
Economic Geography Section, Department of Economics,
Norwegian School of Economics and Business Administration,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: [EMAIL PROTECTED]
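To make the advice about a careful "[" method concrete, here is a minimal sketch using old-style (S3) classes, not the new-style classes of the Matrix package that the reply points to. Everything in it is hypothetical: virtualarray() and dropdim() are made-up names, the array is fixed at three dimensions, and an in-memory array stands in for the disk-based C++ backend. Since a bare [2] cannot be parsed, dropdim(2) plays the role of the "drop this dimension" marker.

```r
# Hypothetical constructor: 'fetch' is a function(i, j, k) returning a 3-d array.
virtualarray <- function(fetch, dims) {
  structure(list(fetch = fetch, dims = dims), class = "virtualarray")
}

# Hypothetical marker standing in for the unparseable [2]:
# select this index and drop the dimension even when drop = FALSE.
dropdim <- function(index) structure(list(index = index), class = "dropdim")

"[.virtualarray" <- function(x, i, j, k, drop = TRUE) {
  # Empty subscripts such as x[, 2, 3] arrive as missing arguments
  idx <- list(if (missing(i)) NULL else i,
              if (missing(j)) NULL else j,
              if (missing(k)) NULL else k)
  marked <- vapply(idx, inherits, logical(1), what = "dropdim")
  for (d in seq_along(idx)) {
    if (marked[d]) idx[[d]] <- idx[[d]]$index                  # unwrap the marker
    if (is.null(idx[[d]])) idx[[d]] <- seq_len(x$dims[d])      # empty means "all"
  }
  out  <- x$fetch(idx[[1]], idx[[2]], idx[[3]])   # only this slice is loaded
  size <- dim(out)
  # Drop a dimension of extent 1 if it was marked, or if drop = TRUE
  gone <- size == 1 & (marked | drop)
  if (sum(!gone) <= 1) as.vector(out) else array(out, dim = size[!gone])
}

# An in-memory array standing in for the data on disk
backing <- array(seq_len(4 * 3 * 3), dim = c(4, 3, 3))
dataset <- virtualarray(function(i, j, k) backing[i, j, k, drop = FALSE],
                        dim(backing))

dataset[, 2, 3]                                   # ordinary vector
dim(dataset[, 2, 3, drop = FALSE])                # three dimensions kept
dim(dataset[, dropdim(2), 3, drop = FALSE])       # second dimension dropped
dataset[, dropdim(2), dropdim(3), drop = FALSE]   # same as dataset[, 2, 3]
```

A real package would replace the fetch function with a call into the compiled backend and would probably add caching and a finalizer, as suggested in the reply.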