[R] p-generalized normal distribution
Hello, I would like to know if there is already an R-package available for computing the density, distribution function, quantiles and random numbers of the p-generalized normal distribution. Best regards, Steve Kalke __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] reference on contr.helmert and typo on its help page.
On Mon, 2009-11-09 at 19:31 -0600, Peng Yu wrote: On Sun, Nov 8, 2009 at 7:32 PM, John Fox j...@mcmaster.ca wrote: Dear Peng, I'm tempted to try to get an entry in the fortunes package but will instead try to answer your questions directly: I can not install 'fortunes'. What are the fortunes packages about? install.packages(fortunes, repos=http://R-Forge.R-project.org;) Warning: unable to access index for repository http://R-Forge.R-project.org/bin/macosx/leopard/contrib/2.9 Warning message: In getDependencies(pkgs, dependencies, available, lib) : package ‘fortunes’ is not available It is on CRAN. just do: install.packages(fortunes) Choose a repository near to you and it should install. There appears to be a problem with R-forge and Mac OsX binaries. The code you show above works for me on linux and R2.9.x G I did read Section 9.1.2 and various other textbooks before posting my questions. But each reference uses slightly different notations and terminology. I get confused and would like a description that summaries everything so that I don't have to refer to many different resources. May I ask a few questions on the section in your textbook? Which variable in Section 9.1.2 is a matrix of contrasts mentioned in the help page of 'contr.helmert'? Which matrix of contrast in R corresponds to dummy regression? With different R formula, e.g. y ~ x vs. y ~ x -1, $X_F$ (mentioned on page 189) is different and hence $\beta_F$ (mentioned in eq. 9.3) is be different. So my understanding is that the matrix of contrast should depend on the formula. But it is not according to the help page of contr.helmert. If the model is simply y ~ A, for the factor A, then cbind(1, contrasts(A)) is what I call X_B, the row-basis of the model matrix. As I explain in the section that you read, the level means are mu = X_B beta, and thus beta = X_B^-1 mu = 0 are the hypotheses tested by the contrasts. Moreover, if, as in Helmert contrasts, the columns of X_B are orthogonal, then so are the rows of X_B^-1, and the latter are simply rescalings of the former. That allows one conveniently to code the hypotheses directly in X_B; all this is also explained in that section of my book, and is essentially what Peter D. told you. In R, contr.treatment and contr.SAS provide dummy-variable (0/1) coding of regressors, differing only in the selection of the reference level. What is the mathematical definition of polynomial contrasts? Why polynomial contrasts are the default contrasts for ordered factors? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Binary packages on R-forge (was reference on contr.helmert and typo on its help page.)
On Tue, 10 Nov 2009, Gavin Simpson wrote: On Mon, 2009-11-09 at 19:31 -0600, Peng Yu wrote: On Sun, Nov 8, 2009 at 7:32 PM, John Fox j...@mcmaster.ca wrote: Dear Peng, I'm tempted to try to get an entry in the fortunes package but will instead try to answer your questions directly: I can not install 'fortunes'. What are the fortunes packages about? install.packages(fortunes, repos=http://R-Forge.R-project.org;) Warning: unable to access index for repository http://R-Forge.R-project.org/bin/macosx/leopard/contrib/2.9 Warning message: In getDependencies(pkgs, dependencies, available, lib) : package ‘fortunes’ is not available It is on CRAN. just do: install.packages(fortunes) Choose a repository near to you and it should install. There appears to be a problem with R-forge and Mac OsX binaries. The code you show above works for me on linux and R2.9.x My understanding is that R-forge only supports binary builds for current versions of R, so for both Windows and Mac OS X that means R 2.10.x only. Others can use type=source (the default under Linux), but this is another reason to keep your R up-to-date (as the posting guide mentions): as my Mac is up-to-date this worked for me. (The posting guide also mentions using a sensible subject line, so I've changed this one.) -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] linear trend line and a quadratic trend line.
Here is a solution using ggplot2 n - 50 Duncan - data.frame(income = runif(n, 0, 5)) Duncan$prestige - with(Duncan, 0.01 * income - .001 * income ^2 + rnorm(n, sd = 10)) library(ggplot2) ggplot(Duncan, aes(y = prestige, x = income)) + #define the data geom_point() + #add the data points geom_smooth(method = lm, colour = red) + #add a linear trend geom_smooth(method = lm, formula = y ~ poly(x, 2), color = green) #add the quadratic trend HTH, Thierry ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance Gaverstraat 4 9500 Geraardsbergen Belgium tel. + 32 54/436 185 thierry.onkel...@inbo.be www.inbo.be To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey -Oorspronkelijk bericht- Van: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] Namens Eric Fail Verzonden: zondag 8 november 2009 18:21 Aan: r-help@r-project.org Onderwerp: [R] linear trend line and a quadratic trend line. Dear list users How is it possible to visualise both a linear trend line and a quadratic trend line on a plot of two variables? Here my almost working exsample. data(Duncan) attach(Duncan) plot(prestige ~ income) abline(lm(prestige ~ income), col=2, lwd=2) Now I would like to add yet another trend line, but this time a quadratic one. So I have two trend lines. One linear trend line and a quadratic trend line. A trend line as if I had taken I(income^2) into the equation. I know I can make two models and compare them using anova, but for pedagogical resons I wold like to visualise it. I know the trick from my past as an SPSS user, but I would like to do this in R as well. Se it in SPSS http://www.childrens-mercy.org/stats/weblog2006/QuadraticRegression.asp Thanks in adcance Eric __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Druk dit bericht a.u.b. niet onnodig af. Please do not print this message unnecessarily. Dit bericht en eventuele bijlagen geven enkel de visie van de schrijver weer en binden het INBO onder geen enkel beding, zolang dit bericht niet bevestigd is door een geldig ondertekend document. The views expressed in this message and any annex are purely those of the writer and may not be regarded as stating an official position of INBO, as long as the message is not confirmed by a duly signed document. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Plotting Mona clustering result
Hi all, Is there a way of plotting a 'decision tree' from the results of Mona in the cluster package. The default bannerplot is not quite what I'm after - I would like a plot of the binary decision tree. Thanks Zoë [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Getting Sphericity Tests for Within Subject Repeated Measure Anova (using car package)
Check out the reshape package for transforming data from long to wide and vice versa. Yet I still don't know what problem you've encountered with ezANOVA. Using the data you just sent, where Day now has 3 levels, I reformat back to the presumably original long format and find that ezANOVA returns the same sphericity tests as John's solution (which is expected because ezANOVA is a wrapper to Anova): library(ez) a = read.table( 'Sergios_wide_data.txt' , header=T ) b = melt.data.frame( data=a , id.vars=c('subject','treatment') , variable_name='day' ) ezANOVA( data=b , dv=.(value) , sid=.(subject) , within=.(treatment,day) ) On Mon, Nov 9, 2009 at 10:20 PM, Sergios (Sergey) Charntikov sergios...@gmail.com wrote: Thank you very much. Finally got it to work. However, I had to recode it from: columns: subject/treatment/DV (where all my response data was in one DV column) to columns: subject/treatment/day1/day2/day3/ (where my response data is now in three different columns). Is there a way to do that without hand recoding (cutting and pasting in spreadsheet) by hand? Thank you for your help. Glad it works as is. Sincerely, Sergios Charntikov (Sergey), MA Behavioral Neuropharmacology Lab Department of Psychology University of Nebraska-Lincoln Lincoln, NE 68588-0308 USA On Mon, Nov 9, 2009 at 7:12 PM, John Fox j...@mcmaster.ca wrote: Dear Sergios, Why don't you try what I suggested originally? Adapted to this data set, mod - lm(cbind(day1, day2, day3) ~ Treatment, data=Dataset) idata - data.frame(Day=factor(1:3)) summary(Anova(mod, idata=idata, idesign=~Day)) Peter Dalgaard also pointed toward an article that describes how to do the same thing with anova(). Regards, John -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Sergios (Sergey) Charntikov Sent: November-09-09 7:13 PM To: Mike Lawrence Cc: r-help@r-project.org Subject: Re: [R] Getting Sphericity Tests for Within Subject Repeated Measure Anova (using car package) Hi Mike, I tried to run my data in SPSS and it works fine without any problems, plug in my levels, plug in my covariate (since it is all within) and get my Mauchly Tests. I tried to rearrange the data so it looks like this subj/treatment/day1/day2/day3 subject treatment day1 day2 day3 1 1 8 8 8 1 2 5 7 5 2 1 7 4 4 2 2 4 5 7 3 1 8 6 4 3 2 5 2 4 4 1 2 9 4 4 2 1 9 1 5 1 4 8 1 5 2 7 8 2 6 1 4 7 2 6 2 4 5 2 When I try mlmfit - lm(Dataset~1), I get invalid type (list) for variable 'Dataset When I try mod - lm(cbind(day1, day2, day3) ~ Treatment, data=Dataset) idata- data.frame(factor(rep(c(Dataset$day1, Dataset$day2, Dataset$day3))), ordered(Dataset$Treatment)) Anova(mod, idata=idata, idesign=~Dataset$Treatment) I get: Terms in the intra-subject model matrix are not orthogonal. When I try is.matrix(Dataset) - I get no. My original mock Dataset (attached in txt) is below. Maybe I am not coding it right? I would hate to recode all my data for SPSS, since at the end I would need to show that Sphericity was not violated. Subj Trtmt Sessn Response 1 N 1 5 1 D 1 6 1 N 2 4 1 D 2 7 2 N 1 8 2 D 1 9 2 N 2 2 2 D 2 1 3 N 1 4 3 D 1 5 3 N 2 6 3 D 2 2 4 N 1 5 4 D 1 6 4 N 2 4 4 D 2 7 5 N 1 8 5 D 1 9 5 N 2 2 5 D 2 1 6 N 1 4 6 D 1 5 6 N 2 6 6 D 2 2 Sincerely, Sergios Charntikov (Sergey), MA Behavioral Neuropharmacology Lab Department of Psychology University of Nebraska-Lincoln Lincoln, NE 68588-0308 USA On Mon, Nov 9, 2009 at 5:29 PM, Mike Lawrence mike.lawre...@dal.ca wrote: No luck as in...? What error did you encounter? In your example data set, you only have 2 levels of each within-Ss factor, in which case you shouldn't expect to obtain tests of sphericity; as far as I understand it, sphericity necessarily holds when for repeated measures with only 2 levels and tests are really only possible for repeated measures with 3 or more levels. I think it's analogous to how you don't need to test homogeneity of variance when performing a paired t-test; the test ends up representing the pairs as single distribution of difference scores with a single variance. Mike On Mon, Nov 9, 2009 at 5:30 PM, Sergios (Sergey) Charntikov sergios...@gmail.com wrote: Tried EZanova, no luck with my particular dataset.
[R] Nelson- Siegel - (Yield Curve - Smoothening of curve)
I am Julia Cains from Brisbane. This is my first mail to this group and I have recently started learning the R language.  I am trying to learn the smoothening of the yield curve. However, I came across the CRAN package â âYieldCurveâ meant for Modelling and estimation of the yield curve. The problem is I am not able to understand whether this package will help me to carry out smoothening of the curve.  The related pdf file consists of following R code.   YieldCurve-package Modelling and estimation of the yield curve (Page 2)  # __ data(FedYieldCurve) tau - c(3, 6, 12, 60, 84, 120) mediumTerm - c(12,60,84) NSParameters - Nelson.Siegel( rate=FedYieldCurve[1:10,], maturity=tau, MidTau=mediumTerm ) y - NSrates(NSParameters[5,1:3], NSParameters$lambda[5],tau) plot(tau,FedYieldCurve[5,],main=Fitting Nelson-Siegel yield curve, type=o) lines(tau,y, col=2) legend(topleft,legend=c(observed yield curve,fitted yield curve), col=c(1,2),lty=1)  # __ The plot gives me two curves (1) Observed yield curve and (2) Fitted yield curve.   Unfortunately I am not able to interpret the result.  The input has 330 records (i.e. from January 1982 to June 2009) and each record has 6 yields given (for the period 3months, 6 months, 1 year, 5 years, 7 years and 10 years.)  Unfortunately I am not that good in Mathematics. Can someone help me understand whether this code is meant for smoothening of the yield curve. I shall be highly obliged.  Regards  Julia  *** Only a man of Worth sees Worth in other men *** [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Is it possible to detect whether running as Rscript?
I would like to write a block of code that runs when a script is being run from Rscript, but not to run if the same file is being source()d, or submitted via ESS, etc. As a gloss I would like to write a file that looks somewhat like: #!/usr/bin/env Rscript my.func - function(...) { #do something... } if ( #is.running.Rscript# ) { do.call(my.func, commandArgs(trailingOnly=TRUE)) } If it helps, I'm thinking of something similar to the Python idiom of writing #!/usr/bin/env python #class, function definitions if (__name__=='__main__'): #code to be executed when running this file directly The intent is to use R's debugging facilities to help solve a problem that occurs when a script is invoked (from a makefile) with particular arguments. So if the makefile invokes my.script datafile then I can track down the problem by sourcing the script into R and trying my.func(datafile) Is this doable, or is there a better alternative? Peter __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] contrast in lme
Dear R-users I'm modelling some longitudinal data (1 response variable measured at 6 occasions, 1 baseline, one treatment variable) collected in the same subjects using the following model: library(nlme) model.lme - lme(response ~ V0+ time+ tt + tt:time, random = ~1|subject, correlation = corSymm(form = ~ 1 | subject), na.action=na.omit, data=arm.total2) Then I would lke to estimate the effect of treatment at time = 6 using the package contrast library(contrast) contrast(model.lme , a = list(tt=1,time=6), b = list(tt=0,time=6) ) But I get the following error message Erreur dans gendata.default(fit = list(modelStruct = list(reStruct = list( : not enough factors I'm obviously doing something wrong. Thanks for your help. Jacques -- View this message in context: http://old.nabble.com/contrast-in-lme-tp26281409p26281409.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Is it possible to detect whether running as Rscript?
Peter Meilstrup wrote: I would like to write a block of code that runs when a script is being run from Rscript, but not to run if the same file is being source()d, or submitted via ESS, etc. As a gloss I would like to write a file that looks somewhat like: #!/usr/bin/env Rscript my.func - function(...) { #do something... } if ( #is.running.Rscript# ) { do.call(my.func, commandArgs(trailingOnly=TRUE)) } If it helps, I'm thinking of something similar to the Python idiom of writing #!/usr/bin/env python #class, function definitions if (__name__=='__main__'): #code to be executed when running this file directly The intent is to use R's debugging facilities to help solve a problem that occurs when a script is invoked (from a makefile) with particular arguments. So if the makefile invokes my.script datafile then I can track down the problem by sourcing the script into R and trying my.func(datafile) Is this doable, or is there a better alternative? I don't know if it gives enough detail, but you could look at the interactive() and/or commandArgs(). Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Getting Sphericity Tests for Within Subject Repeated Measure Anova (using car package)
Oops, I see now that despite repeated subject names, treatment is a between-Ss variable, so you need to use this to get the equivalent of Anova: library(ez) a = read.table( 'Sergios_wide_data.txt' , header=T ) b = melt.data.frame( data=a , id.vars=c('subject','treatment') , variable_name='day' ) b$subject=paste(b$subject,b$treatment) #create unique sids ezANOVA( data=b , dv=.(value) , sid=.(subject) , within=.(day) , between=.(treatment) ) On Tue, Nov 10, 2009 at 6:47 AM, Mike Lawrence mike.lawre...@dal.ca wrote: Check out the reshape package for transforming data from long to wide and vice versa. Yet I still don't know what problem you've encountered with ezANOVA. Using the data you just sent, where Day now has 3 levels, I reformat back to the presumably original long format and find that ezANOVA returns the same sphericity tests as John's solution (which is expected because ezANOVA is a wrapper to Anova): library(ez) a = read.table( 'Sergios_wide_data.txt' , header=T ) b = melt.data.frame( data=a , id.vars=c('subject','treatment') , variable_name='day' ) ezANOVA( data=b , dv=.(value) , sid=.(subject) , within=.(treatment,day) ) On Mon, Nov 9, 2009 at 10:20 PM, Sergios (Sergey) Charntikov sergios...@gmail.com wrote: Thank you very much. Finally got it to work. However, I had to recode it from: columns: subject/treatment/DV (where all my response data was in one DV column) to columns: subject/treatment/day1/day2/day3/ (where my response data is now in three different columns). Is there a way to do that without hand recoding (cutting and pasting in spreadsheet) by hand? Thank you for your help. Glad it works as is. Sincerely, Sergios Charntikov (Sergey), MA Behavioral Neuropharmacology Lab Department of Psychology University of Nebraska-Lincoln Lincoln, NE 68588-0308 USA On Mon, Nov 9, 2009 at 7:12 PM, John Fox j...@mcmaster.ca wrote: Dear Sergios, Why don't you try what I suggested originally? Adapted to this data set, mod - lm(cbind(day1, day2, day3) ~ Treatment, data=Dataset) idata - data.frame(Day=factor(1:3)) summary(Anova(mod, idata=idata, idesign=~Day)) Peter Dalgaard also pointed toward an article that describes how to do the same thing with anova(). Regards, John -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Sergios (Sergey) Charntikov Sent: November-09-09 7:13 PM To: Mike Lawrence Cc: r-help@r-project.org Subject: Re: [R] Getting Sphericity Tests for Within Subject Repeated Measure Anova (using car package) Hi Mike, I tried to run my data in SPSS and it works fine without any problems, plug in my levels, plug in my covariate (since it is all within) and get my Mauchly Tests. I tried to rearrange the data so it looks like this subj/treatment/day1/day2/day3 subject treatment day1 day2 day3 1 1 8 8 8 1 2 5 7 5 2 1 7 4 4 2 2 4 5 7 3 1 8 6 4 3 2 5 2 4 4 1 2 9 4 4 2 1 9 1 5 1 4 8 1 5 2 7 8 2 6 1 4 7 2 6 2 4 5 2 When I try mlmfit - lm(Dataset~1), I get invalid type (list) for variable 'Dataset When I try mod - lm(cbind(day1, day2, day3) ~ Treatment, data=Dataset) idata- data.frame(factor(rep(c(Dataset$day1, Dataset$day2, Dataset$day3))), ordered(Dataset$Treatment)) Anova(mod, idata=idata, idesign=~Dataset$Treatment) I get: Terms in the intra-subject model matrix are not orthogonal. When I try is.matrix(Dataset) - I get no. My original mock Dataset (attached in txt) is below. Maybe I am not coding it right? I would hate to recode all my data for SPSS, since at the end I would need to show that Sphericity was not violated. Subj Trtmt Sessn Response 1 N 1 5 1 D 1 6 1 N 2 4 1 D 2 7 2 N 1 8 2 D 1 9 2 N 2 2 2 D 2 1 3 N 1 4 3 D 1 5 3 N 2 6 3 D 2 2 4 N 1 5 4 D 1 6 4 N 2 4 4 D 2 7 5 N 1 8 5 D 1 9 5 N 2 2 5 D 2 1 6 N 1 4 6 D 1 5 6 N 2 6 6 D 2 2 Sincerely, Sergios Charntikov (Sergey), MA Behavioral Neuropharmacology Lab Department of Psychology University of Nebraska-Lincoln Lincoln, NE 68588-0308 USA On Mon, Nov 9, 2009 at 5:29 PM, Mike Lawrence mike.lawre...@dal.ca wrote: No luck as in...? What error did you encounter? In your example data set, you only have 2 levels of each within-Ss factor, in which case you shouldn't expect to obtain tests of sphericity; as far as I understand it, sphericity necessarily
[R] Error: cannot allocate vector of size...
I'm trying to import a table into R the file is about 700MB. Here's my first try: DD-read.table(01uklicsam-20070301.dat,header=TRUE) Error: cannot allocate vector of size 15.6 Mb In addition: Warning messages: 1: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : Reached total allocation of 1535Mb: see help(memory.size) 2: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : Reached total allocation of 1535Mb: see help(memory.size) 3: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : Reached total allocation of 1535Mb: see help(memory.size) 4: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : Reached total allocation of 1535Mb: see help(memory.size) Then I tried memory.limit(size=4095) and got DD-read.table(01uklicsam-20070301.dat,header=TRUE) Error: cannot allocate vector of size 11.3 Mb but no additional errors. Then optimistically to clear up the workspace: rm() DD-read.table(01uklicsam-20070301.dat,header=TRUE) Error: cannot allocate vector of size 15.6 Mb Can anyone help? I'm confused by the values even: 15.6Mb, 1535Mb, 11.3Mb? I'm working on WinXP with 2 GB of RAM. Help says the maximum obtainable memory is usually 2Gb. Surely they mean GB? The file I'm importing has about 3 million cases with 100 variables that I want to crosstabulate each with each. Is this completely unrealistic? Thanks! Maja -- View this message in context: http://old.nabble.com/Error%3A-cannot-allocate-vector-of-size...-tp26282348p26282348.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Compile package in version R-2.10.0 (Windows XP)
Under version 2.7.0 I could without problems compile R-packages, I wrote myself (Windows XP). In the relevant path I have at the beginning of the path the following enterings: PATH=c:\Rtools\bin;c:\Rtools\perl\bin;c:\Rtools\MinGW\bin;C:\Programme\MiKTeX~1\ miktex\bin;C:\Programme\MiKTeX 2.5\miktex\bin;c:\Programme\HTML Help Workshop;c: \programme\R\bin;c:\windows;c:\windows\system32; which works for 2.7.0 - but no longer for 2.10.0 What can I do? Paul Ruppen, Pädagogische Hochschule Wallis (Switzerland) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] write data frame in a list
Hi, I have got a data frame: df=data.frame(x=c(3,6,7),y=c(2,7,4)) and I would like to write my values from data frame to list using loop, for example: lista=list() for (i in 1: length(?)){ lista[[?]][?] = df [?] } But I havn't got any idea what I should put in places where I put a question mark? As a rusult I would like to get a list like this: list: [[1]] 3,6,7 [[2]] 2,7,4 I wll be very happy if somebody can help me -- View this message in context: http://old.nabble.com/write-data-frame-in-a-list-tp26281645p26281645.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] R-help(tune.svm)
Hello, my name is Xavier I'm student of UPC in Barcelnoa and I've read your R-help about tune.svm and i have the same problem. Have you solved the problem?? Do you know how tune.svm works?? Do you know why it's different from svm?? https://stat.ethz.ch/pipermail/r-help/2005-August/077427.html Thank you [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] How to do ADF test and KPSS test in R
Dear all, How to do ADF test ¡¢KPSS¡¢ PP¡¢GLS test in R£¿ Thanks a lot ! [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] write data frame in a list
Is this what you want: df=data.frame(x=c(3,6,7),y=c(2,7,4)) df x y 1 3 2 2 6 7 3 7 4 as.list(df) $x [1] 3 6 7 $y [1] 2 7 4 On Tue, Nov 10, 2009 at 6:05 AM, Grzes gregori...@gmail.com wrote: Hi, I have got a data frame: df=data.frame(x=c(3,6,7),y=c(2,7,4)) and I would like to write my values from data frame to list using loop, for example: lista=list() for (i in 1: length(?)){ lista[[?]][?] = df [?] } But I havn't got any idea what I should put in places where I put a question mark? As a rusult I would like to get a list like this: list: [[1]] 3,6,7 [[2]] 2,7,4 I wll be very happy if somebody can help me -- View this message in context: http://old.nabble.com/write-data-frame-in-a-list-tp26281645p26281645.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Models for Discrete Choice in R
Iuri Gavronski wrote: Frank, I certainly can't speak for Emmanuel. I don't know his reasons. The reason I've posted this question is the fact that (as far as I understood), ordinal regression is based on logistic regression (or probit), and logistic regression expects a formula like dichotomous ~ ratio1 + ratio2 + ... + ration. However, most examples I've found in Design, MASS and VGAM test models like ordinal ~ categorical1 + categorical2 + ... + categoricaln. I wonder if it is just coincidence or I have just found the wrong functions. Logistic regression includes binary, ordinal, and polytomous (multinomial) cases. Binary logistic regression needs a binary response. Ordinal logistic regression (usually proportional odds but can be other flavors such as continuation ratio model) needs an ordinal response. The polr and lrm functions work this way. Frank Best, Iuri. On Mon, Nov 9, 2009 at 11:43 AM, Frank E Harrell Jr f.harr...@vanderbilt.edu wrote: Emmanuel Charpentier wrote: Le dimanche 08 novembre 2009 à 19:05 -0600, Frank E Harrell Jr a écrit : Emmanuel Charpentier wrote: Le dimanche 08 novembre 2009 à 17:07 -0200, Iuri Gavronski a écrit : Hi, I would like to fit Logit models for ordered data, such as those suggested by Greene (2003), p. 736. Does anyone suggests any package in R for that? look up the polr function in package MASS (and read the relevant pages in VR4 and some quoted references...) or the slightly more sophisticated (larger range of models) lrm function in F. Harrell's Design (now rms) packge (but be aware that Design is a huge beast witch carries its own computing universe, based on (strong) Harrell's view of what a regression analysis should be : reading his book is, IMHO, necessary to understand his choices and agree (or disgree) with them). If you have a multilevel model (a. k. a. one random effect grouping), the repolr packge aims at that, but I've been unable to use it recently (numerical exceptions). By the way, my dependent variable is ordinal and my independent variables are ratio/intervalar. Numeric ? Then maybe some recoding/transformation is in order ... in which case Design/rms might or might not be useful. I'm not clear on what recoding or transformation is needed for an ordinal dependent variable and ratio/interval independent variables, nor why rms/Design would not be useful. I was thinking about transformations/recoding of the *independent* variables... Emmanuel Charpentier I realize that; still unclear. Frank __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Error: cannot allocate vector of size...
A little simple math. You have 3M rows with 100 items on each row. If read in this would be 300M items. If numeric, 8 bytes/item, this is 2.4GB. Given that you are probably using a 32 bit version of R, you are probably out of luck. A rule of thumb is that your largest object should consume at most 25% of your memory since you will probably be making copies as part of your processing. Given that, is you want to read in 100 variables at a time, I would say your limit would be about 500K rows to be reasonable. So you have a choice; read in fewer rolls, read in all 3M rows but at 20 columns per read, put the data in a database and extract what you need. Unless you go to a 64-bit version of R you will probably not be able to have the whole file in memory at one time. On Tue, Nov 10, 2009 at 7:10 AM, maiya maja.zaloz...@gmail.com wrote: I'm trying to import a table into R the file is about 700MB. Here's my first try: DD-read.table(01uklicsam-20070301.dat,header=TRUE) Error: cannot allocate vector of size 15.6 Mb In addition: Warning messages: 1: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : Reached total allocation of 1535Mb: see help(memory.size) 2: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : Reached total allocation of 1535Mb: see help(memory.size) 3: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : Reached total allocation of 1535Mb: see help(memory.size) 4: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : Reached total allocation of 1535Mb: see help(memory.size) Then I tried memory.limit(size=4095) and got DD-read.table(01uklicsam-20070301.dat,header=TRUE) Error: cannot allocate vector of size 11.3 Mb but no additional errors. Then optimistically to clear up the workspace: rm() DD-read.table(01uklicsam-20070301.dat,header=TRUE) Error: cannot allocate vector of size 15.6 Mb Can anyone help? I'm confused by the values even: 15.6Mb, 1535Mb, 11.3Mb? I'm working on WinXP with 2 GB of RAM. Help says the maximum obtainable memory is usually 2Gb. Surely they mean GB? The file I'm importing has about 3 million cases with 100 variables that I want to crosstabulate each with each. Is this completely unrealistic? Thanks! Maja -- View this message in context: http://old.nabble.com/Error%3A-cannot-allocate-vector-of-size...-tp26282348p26282348.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Compile package in version R-2.10.0 (Windows XP)
Paul Ruppen wrote: Under version 2.7.0 I could without problems compile R-packages, I wrote myself (Windows XP). In the relevant path I have at the beginning of the path the following enterings: PATH=c:\Rtools\bin;c:\Rtools\perl\bin;c:\Rtools\MinGW\bin;C:\Programme\MiKTeX~1\ miktex\bin;C:\Programme\MiKTeX 2.5\miktex\bin;c:\Programme\HTML Help Workshop;c: \programme\R\bin;c:\windows;c:\windows\system32; which works for 2.7.0 - but no longer for 2.10.0 What can I do? At least: - update Rtools - update MikTeX Uwe Ligges Paul Ruppen, Pädagogische Hochschule Wallis (Switzerland) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R process gets killed spontaneously
Peng Yu wrote: My R process has been killed for a few times, although the system administrator did not do so. It happened when R attempted to allocate a lot of memory. I'm wondering whether R would spontaneously kill itself if it can not allocate enough memory? It does not kill itself. If it was not some out-of-memory kill process of your OS, then you may have seen some segfault. We need reproducible code and all the details described in the posting guide. Uwe Ligges __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Numeric formatting question
Hi All, I have am using Sweave and the \Sexpr{} to place some numeric variables in my tex document. I want to format the number prior to entry so they read slightly more elegantly. Say i have the following numbers x - 0.00487324 y - 0.00432 z - 0.567 I would like to have the numbers displayed as follows x1 - 0.0049 y1 - 0.0043 z1 - 0.57 I've seen i can use sprintf(%.3f, pi) for example to get the formating after the decimal place, but i can't figure out an elegant way to find the position of the first non-zero entry to allow me to substitute this value into the sprintf command. Can anyone offer any advise? Thanks in advance Mike [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Numeric formatting question
Michael Pearmain-2 wrote: Hi All, I have am using Sweave and the \Sexpr{} to place some numeric variables in my tex document. I want to format the number prior to entry so they read slightly more elegantly. Say i have the following numbers x - 0.00487324 y - 0.00432 z - 0.567 I would like to have the numbers displayed as follows x1 - 0.0049 y1 - 0.0043 z1 - 0.57 \Sexpr{round(x,2)} Dieter -- View this message in context: http://old.nabble.com/Numeric-formatting-question-tp26283386p26283422.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Numeric formatting question
On Nov 10, 2009, at 7:23 AM, Michael Pearmain wrote: Hi All, I have am using Sweave and the \Sexpr{} to place some numeric variables in my tex document. I want to format the number prior to entry so they read slightly more elegantly. Say i have the following numbers x - 0.00487324 y - 0.00432 z - 0.567 I would like to have the numbers displayed as follows x1 - 0.0049 y1 - 0.0043 z1 - 0.57 I've seen i can use sprintf(%.3f, pi) for example to get the formating after the decimal place, but i can't figure out an elegant way to find the position of the first non-zero entry to allow me to substitute this value into the sprintf command. Can anyone offer any advise? Thanks in advance Mike It looks like you want the numbers formatted to 2 significant digits, rather than to a fixed number of decimal places. Thus, use: format(x, digits = 2, scientific = FALSE) [1] 0.0049 format(y, digits = 2, scientific = FALSE) [1] 0.0043 format(z, digits = 2, scientific = FALSE) [1] 0.57 See ?format and note the 'digits' and 'scientific' arguments. HTH, Marc Schwartz __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] problem with executing r-function from windows command prompt
Hi all, I am trying to execute one function from windows command prompt and I am trying to execute queries like source(myRfile.R) but itis throughing runtime errors can any one help me how can i run my *.R file and how can i call a function in that *.R file thanks in advance kiran. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] write data frame in a list
Thanks jholtman for advice but I need do it using a loop. (Because my code is a little more complicated). jholtman wrote: Is this what you want: df=data.frame(x=c(3,6,7),y=c(2,7,4)) df x y 1 3 2 2 6 7 3 7 4 as.list(df) $x [1] 3 6 7 $y [1] 2 7 4 On Tue, Nov 10, 2009 at 6:05 AM, Grzes gregori...@gmail.com wrote: Hi, I have got a data frame: df=data.frame(x=c(3,6,7),y=c(2,7,4)) and I would like to write my values from data frame to list using loop, for example: lista=list() for (i in 1: length(?)){ lista[[?]][?] = df [?] } But I havn't got any idea what I should put in places where I put a question mark? As a rusult I would like to get a list like this: list: [[1]] 3,6,7 [[2]] 2,7,4 I wll be very happy if somebody can help me -- View this message in context: http://old.nabble.com/write-data-frame-in-a-list-tp26281645p26281645.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://old.nabble.com/write-data-frame-in-a-list-tp26281645p26283178.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Error: cannot allocate vector of size...
OK, it's the simple math that's confusing me :) So you're saying 2.4GB, while windows sees the data as 700KB. Why is that different? And lets say I could potentially live with e.g. 1/3 of the cases - that would make it .8GB, which should be fine? But then my question is if there is any way to sample the rows in read.table? Or what would be the best way of importing a random third of my cases? Thanks! M. jholtman wrote: A little simple math. You have 3M rows with 100 items on each row. If read in this would be 300M items. If numeric, 8 bytes/item, this is 2.4GB. Given that you are probably using a 32 bit version of R, you are probably out of luck. A rule of thumb is that your largest object should consume at most 25% of your memory since you will probably be making copies as part of your processing. Given that, is you want to read in 100 variables at a time, I would say your limit would be about 500K rows to be reasonable. So you have a choice; read in fewer rolls, read in all 3M rows but at 20 columns per read, put the data in a database and extract what you need. Unless you go to a 64-bit version of R you will probably not be able to have the whole file in memory at one time. On Tue, Nov 10, 2009 at 7:10 AM, maiya maja.zaloz...@gmail.com wrote: I'm trying to import a table into R the file is about 700MB. Here's my first try: DD-read.table(01uklicsam-20070301.dat,header=TRUE) Error: cannot allocate vector of size 15.6 Mb In addition: Warning messages: 1: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : Reached total allocation of 1535Mb: see help(memory.size) 2: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : Reached total allocation of 1535Mb: see help(memory.size) 3: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : Reached total allocation of 1535Mb: see help(memory.size) 4: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : Reached total allocation of 1535Mb: see help(memory.size) Then I tried memory.limit(size=4095) and got DD-read.table(01uklicsam-20070301.dat,header=TRUE) Error: cannot allocate vector of size 11.3 Mb but no additional errors. Then optimistically to clear up the workspace: rm() DD-read.table(01uklicsam-20070301.dat,header=TRUE) Error: cannot allocate vector of size 15.6 Mb Can anyone help? I'm confused by the values even: 15.6Mb, 1535Mb, 11.3Mb? I'm working on WinXP with 2 GB of RAM. Help says the maximum obtainable memory is usually 2Gb. Surely they mean GB? The file I'm importing has about 3 million cases with 100 variables that I want to crosstabulate each with each. Is this completely unrealistic? Thanks! Maja -- View this message in context: http://old.nabble.com/Error%3A-cannot-allocate-vector-of-size...-tp26282348p26282348.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://old.nabble.com/Error%3A-cannot-allocate-vector-of-size...-tp26282348p26283467.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] write data frame in a list
If you really want a loop, then do: df=data.frame(x=c(3,6,7),y=c(2,7,4)) new.list - list() for (i in names(df)){ + new.list[[i]] - df[[i]] + } new.list $x [1] 3 6 7 $y [1] 2 7 4 On Tue, Nov 10, 2009 at 8:12 AM, Grzes gregori...@gmail.com wrote: Thanks jholtman for advice but I need do it using a loop. (Because my code is a little more complicated). jholtman wrote: Is this what you want: df=data.frame(x=c(3,6,7),y=c(2,7,4)) df x y 1 3 2 2 6 7 3 7 4 as.list(df) $x [1] 3 6 7 $y [1] 2 7 4 On Tue, Nov 10, 2009 at 6:05 AM, Grzes gregori...@gmail.com wrote: Hi, I have got a data frame: df=data.frame(x=c(3,6,7),y=c(2,7,4)) and I would like to write my values from data frame to list using loop, for example: lista=list() for (i in 1: length(?)){ lista[[?]][?] = df [?] } But I havn't got any idea what I should put in places where I put a question mark? As a rusult I would like to get a list like this: list: [[1]] 3,6,7 [[2]] 2,7,4 I wll be very happy if somebody can help me -- View this message in context: http://old.nabble.com/write-data-frame-in-a-list-tp26281645p26281645.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://old.nabble.com/write-data-frame-in-a-list-tp26281645p26283178.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Akaike weight in R
I am using lm simulation in R and try to find the AICc and Akaike weight of the model. I searched out that using package AICcmodavg AICc is easily to get. I wonder how can I get the Akaike weight, any function I can use to generate it? Thanks in advance. Sunny [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Error: cannot allocate vector of size...
maiya wrote: OK, it's the simple math that's confusing me :) So you're saying 2.4GB, while windows sees the data as 700KB. Why is that different? 700_MB_, I assume! In a nutshell, a single column and a spacer takes 2 bytes per subject, but a floating point variable takes 8, and R is not good at detecting things that can be compressed. At best, you can ensure that variables are read as integers or factors, bringing the storage requirements down to four bytes. The ff package might be something for you. The most strongly compressed items have trouble with storing NA values, though. -p And lets say I could potentially live with e.g. 1/3 of the cases - that would make it .8GB, which should be fine? But then my question is if there is any way to sample the rows in read.table? Or what would be the best way of importing a random third of my cases? Thanks! M. jholtman wrote: A little simple math. You have 3M rows with 100 items on each row. If read in this would be 300M items. If numeric, 8 bytes/item, this is 2.4GB. Given that you are probably using a 32 bit version of R, you are probably out of luck. A rule of thumb is that your largest object should consume at most 25% of your memory since you will probably be making copies as part of your processing. Given that, is you want to read in 100 variables at a time, I would say your limit would be about 500K rows to be reasonable. So you have a choice; read in fewer rolls, read in all 3M rows but at 20 columns per read, put the data in a database and extract what you need. Unless you go to a 64-bit version of R you will probably not be able to have the whole file in memory at one time. On Tue, Nov 10, 2009 at 7:10 AM, maiya maja.zaloz...@gmail.com wrote: I'm trying to import a table into R the file is about 700MB. Here's my first try: DD-read.table(01uklicsam-20070301.dat,header=TRUE) Error: cannot allocate vector of size 15.6 Mb In addition: Warning messages: 1: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : Reached total allocation of 1535Mb: see help(memory.size) 2: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : Reached total allocation of 1535Mb: see help(memory.size) 3: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : Reached total allocation of 1535Mb: see help(memory.size) 4: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : Reached total allocation of 1535Mb: see help(memory.size) Then I tried memory.limit(size=4095) and got DD-read.table(01uklicsam-20070301.dat,header=TRUE) Error: cannot allocate vector of size 11.3 Mb but no additional errors. Then optimistically to clear up the workspace: rm() DD-read.table(01uklicsam-20070301.dat,header=TRUE) Error: cannot allocate vector of size 15.6 Mb Can anyone help? I'm confused by the values even: 15.6Mb, 1535Mb, 11.3Mb? I'm working on WinXP with 2 GB of RAM. Help says the maximum obtainable memory is usually 2Gb. Surely they mean GB? The file I'm importing has about 3 million cases with 100 variables that I want to crosstabulate each with each. Is this completely unrealistic? Thanks! Maja -- View this message in context: http://old.nabble.com/Error%3A-cannot-allocate-vector-of-size...-tp26282348p26282348.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- O__ Peter Dalgaard Øster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - (p.dalga...@biostat.ku.dk) FAX: (+45) 35327907 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Error: cannot allocate vector of size...
Check out: http://www.mail-archive.com/r-h...@stat.math.ethz.ch/msg79590.html for sampling a large file. On Tue, Nov 10, 2009 at 8:32 AM, maiya maja.zaloz...@gmail.com wrote: OK, it's the simple math that's confusing me :) So you're saying 2.4GB, while windows sees the data as 700KB. Why is that different? And lets say I could potentially live with e.g. 1/3 of the cases - that would make it .8GB, which should be fine? But then my question is if there is any way to sample the rows in read.table? Or what would be the best way of importing a random third of my cases? Thanks! M. jholtman wrote: A little simple math. You have 3M rows with 100 items on each row. If read in this would be 300M items. If numeric, 8 bytes/item, this is 2.4GB. Given that you are probably using a 32 bit version of R, you are probably out of luck. A rule of thumb is that your largest object should consume at most 25% of your memory since you will probably be making copies as part of your processing. Given that, is you want to read in 100 variables at a time, I would say your limit would be about 500K rows to be reasonable. So you have a choice; read in fewer rolls, read in all 3M rows but at 20 columns per read, put the data in a database and extract what you need. Unless you go to a 64-bit version of R you will probably not be able to have the whole file in memory at one time. On Tue, Nov 10, 2009 at 7:10 AM, maiya maja.zaloz...@gmail.com wrote: I'm trying to import a table into R the file is about 700MB. Here's my first try: DD-read.table(01uklicsam-20070301.dat,header=TRUE) Error: cannot allocate vector of size 15.6 Mb In addition: Warning messages: 1: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : Reached total allocation of 1535Mb: see help(memory.size) 2: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : Reached total allocation of 1535Mb: see help(memory.size) 3: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : Reached total allocation of 1535Mb: see help(memory.size) 4: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : Reached total allocation of 1535Mb: see help(memory.size) Then I tried memory.limit(size=4095) and got DD-read.table(01uklicsam-20070301.dat,header=TRUE) Error: cannot allocate vector of size 11.3 Mb but no additional errors. Then optimistically to clear up the workspace: rm() DD-read.table(01uklicsam-20070301.dat,header=TRUE) Error: cannot allocate vector of size 15.6 Mb Can anyone help? I'm confused by the values even: 15.6Mb, 1535Mb, 11.3Mb? I'm working on WinXP with 2 GB of RAM. Help says the maximum obtainable memory is usually 2Gb. Surely they mean GB? The file I'm importing has about 3 million cases with 100 variables that I want to crosstabulate each with each. Is this completely unrealistic? Thanks! Maja -- View this message in context: http://old.nabble.com/Error%3A-cannot-allocate-vector-of-size...-tp26282348p26282348.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://old.nabble.com/Error%3A-cannot-allocate-vector-of-size...-tp26282348p26283467.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] when vectorising does not work: silent function fail?
Dear All, I'm using apply to do some genetic association analysis along a chromosome, with many thousands markers. For each marker the analysis is the same, so I was planning to use apply(chrom, 2, somefunction) In the specific case I do: my.results = apply(chr, 2, function(x){anova(lrm( cpstc.f ~ x + time.cpstc + age + sex + mri))[1,3]}) This is all good and well in theory, but in practice the lrm() model will fail for some markers and wreck the whole process. Failure for some markers is no problem for me, but *predicting* which markers will fail can be hugely problematic. I then though of creating some fucntion to catch the error messages that would otherwise scr*w things over: my.lrm = function(x){ pol = NULL pol = lrm( cpstc.f ~ x + time.cpstc + age + sex + mri) if(length(pol) 0) rez = anova(pol)[1,3] if(length(pol) == 0) rez = 1 rez} my.results = apply(chr, 2, my.lrm) Still no joy, even adding try() in the evaluation and options(show.error.messages = F) I am at loss on how to get the darn function to bail out *silently* if needs be so I can just smack a replacement value in --which would also have the benefit of keeping the order of the markers. Any idea will be gratefully acknowledged. Best, Federico -- Federico C. F. Calboli Department of Epidemiology and Public Health Imperial College, St Mary's Campus Norfolk Place, London W2 1PG Tel +44 (0)20 7594 1602 Fax (+44) 020 7594 3193 f.calboli [.a.t] imperial.ac.uk f.calboli [.a.t] gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] when vectorising does not work: silent function fail?
Have you tried something like this: my.results = apply(chr, 2, function(x){ result - try(anova(lrm( cpstc.f ~ x + time.cpstc + age + sex + mri))[1,3]) if (inherits(result, try-error)) return(NULL) result }) This should catch the error and have NULL in that list element. On Tue, Nov 10, 2009 at 10:04 AM, Federico Calboli f.calb...@imperial.ac.uk wrote: Dear All, I'm using apply to do some genetic association analysis along a chromosome, with many thousands markers. For each marker the analysis is the same, so I was planning to use apply(chrom, 2, somefunction) In the specific case I do: my.results = apply(chr, 2, function(x){anova(lrm( cpstc.f ~ x + time.cpstc + age + sex + mri))[1,3]}) This is all good and well in theory, but in practice the lrm() model will fail for some markers and wreck the whole process. Failure for some markers is no problem for me, but *predicting* which markers will fail can be hugely problematic. I then though of creating some fucntion to catch the error messages that would otherwise scr*w things over: my.lrm = function(x){ pol = NULL pol = lrm( cpstc.f ~ x + time.cpstc + age + sex + mri) if(length(pol) 0) rez = anova(pol)[1,3] if(length(pol) == 0) rez = 1 rez} my.results = apply(chr, 2, my.lrm) Still no joy, even adding try() in the evaluation and options(show.error.messages = F) I am at loss on how to get the darn function to bail out *silently* if needs be so I can just smack a replacement value in --which would also have the benefit of keeping the order of the markers. Any idea will be gratefully acknowledged. Best, Federico -- Federico C. F. Calboli Department of Epidemiology and Public Health Imperial College, St Mary's Campus Norfolk Place, London W2 1PG Tel +44 (0)20 7594 1602 Fax (+44) 020 7594 3193 f.calboli [.a.t] imperial.ac.uk f.calboli [.a.t] gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] drop unused levels in subset.data.frame
Dear list, subset has a 'drop' argument that I had often mistaken for the one in [.factor which removes unused levels. Clearly it doesn't work that way, as shown below, d - data.frame(x = factor(letters[1:15]), y = factor(LETTERS[1:3])) s - subset(d, y==A, drop=TRUE) str(s) 'data.frame': 5 obs. of 2 variables: $ x: Factor w/ 15 levels a,b,c,d,..: 1 4 7 10 13 $ y: Factor w/ 3 levels A,B,C: 1 1 1 1 1 The subset still retains all the unused factor levels. I wonder how people usually get rid of all unused levels in a data.frame after subsetting? I came up with this but I may have missed a better built-in solution, dropit - function (d, columns = names(d), ...) { d[columns] = lapply(d[columns], [, drop=TRUE, ...) d } str(dropit(s)) 'data.frame': 5 obs. of 2 variables: $ x: Factor w/ 5 levels a,d,g,j,..: 1 2 3 4 5 $ y: Factor w/ 1 level A: 1 1 1 1 1 Best regards, baptiste __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] drop unused levels in subset.data.frame
On Nov 10, 2009, at 10:49 AM, baptiste auguie wrote: Dear list, subset has a 'drop' argument that I had often mistaken for the one in [.factor which removes unused levels. Clearly it doesn't work that way, as shown below, d - data.frame(x = factor(letters[1:15]), y = factor(LETTERS[1:3])) s - subset(d, y==A, drop=TRUE) str(s) 'data.frame': 5 obs. of 2 variables: $ x: Factor w/ 15 levels a,b,c,d,..: 1 4 7 10 13 $ y: Factor w/ 3 levels A,B,C: 1 1 1 1 1 The subset still retains all the unused factor levels. I wonder how people usually get rid of all unused levels in a data.frame after subsetting? I came up with this but I may have missed a better built-in solution, dropit - function (d, columns = names(d), ...) { d[columns] = lapply(d[columns], [, drop=TRUE, ...) d } If you are looking for a one-liner, then consider: data.frame(lapply(s, function(x) if (is.factor(x)){ factor(x)} else {x})) I added a numeric column to make sure I had not clobbered a non-factor variable. d - data.frame(x = factor(letters[1:15]), y = factor(LETTERS[1:3]), N=1:15) s - subset(d, y==A, drop=TRUE) str( data.frame(lapply(s, function(x) if (is.factor(x)){ factor(x)} else {x})) ) 'data.frame': 5 obs. of 3 variables: $ x: Factor w/ 5 levels a,d,g,j,..: 1 2 3 4 5 $ y: Factor w/ 1 level A: 1 1 1 1 1 $ N: int 1 4 7 10 13 str(dropit(s)) 'data.frame': 5 obs. of 2 variables: $ x: Factor w/ 5 levels a,d,g,j,..: 1 2 3 4 5 $ y: Factor w/ 1 level A: 1 1 1 1 1 Best regards, baptiste __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] drop unused levels in subset.data.frame
On Nov 10, 2009, at 9:49 AM, baptiste auguie wrote: Dear list, subset has a 'drop' argument that I had often mistaken for the one in [.factor which removes unused levels. Clearly it doesn't work that way, as shown below, d - data.frame(x = factor(letters[1:15]), y = factor(LETTERS[1:3])) s - subset(d, y==A, drop=TRUE) str(s) 'data.frame': 5 obs. of 2 variables: $ x: Factor w/ 15 levels a,b,c,d,..: 1 4 7 10 13 $ y: Factor w/ 3 levels A,B,C: 1 1 1 1 1 The subset still retains all the unused factor levels. I wonder how people usually get rid of all unused levels in a data.frame after subsetting? I came up with this but I may have missed a better built-in solution, dropit - function (d, columns = names(d), ...) { d[columns] = lapply(d[columns], [, drop=TRUE, ...) d } str(dropit(s)) 'data.frame': 5 obs. of 2 variables: $ x: Factor w/ 5 levels a,d,g,j,..: 1 2 3 4 5 $ y: Factor w/ 1 level A: 1 1 1 1 1 There is a page in the R wiki here: http://wiki.r-project.org/rwiki/doku.php?id=tips:data-manip:drop_unused_levels that has some approaches. HTH, Marc Schwartz __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Comparison of vectors in a matrix
Hi, I have a matrix with two columns, and the elements of the matrix are vectors. So for example, in line 3 of column 1 I have a vector v31=(marc, robert, marie). What I need to do is to compare all vectors in column 1 and 2, so as to get, for example setdiff(v31,v32) into a new column. Is there a way to do this in R? Thanks! -- View this message in context: http://old.nabble.com/Comparison-of-vectors-in-a-matrix-tp26284855p26284855.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Calculating the percentage of explained deviance in lmer
Dear all, I am trying to calculate some measure of the amount of variability in the response variable that is explained by a model fitted in lmer m1-lmer(response-var ~ Condition+(1|Site/Area/Transect),family=binomial) . I've seen from the literature that the precentage of explained deviance is a common measure. How can I calculate it? Thanks a lot for your help, I hope this is not too shallow. Best Fabio -- Fabio Bulleri, PhD Dipartimento di Biologia Università di Pisa Via A. Volta 6, I-56126, Pisa, Italy Tel. +39 050 2211448 Fax +39 050 2211410 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] 2 significant digits
Hi, How to represent a rounded number ending with 0 with 2-significant digits? If I have for ex, 0.8031 and I use signif or round with digits = 2, I'll get 0.8. If I use format, I get character type (even if I pass number as parameter) and if I convert with as.numeric, I'll lose one significant digit (0): format(13.7, nsmall = 2) [1] 13.70 as.numeric( format(13.7, nsmall = 2)) [1] 13.7 Regards, Carol __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Implementation of the Shuffled Complex Evol ution (SCE-UA) Algorithm
Good evening list, I'm looking for an R implementation of the Shuffled Complex Evolutionâ (SCE-UA) algorithm after Duan et al. (1993). Does anybody know if there is an extension/ package existing that contains it? Thanks very much for your help! Cheers, Simon Duan QY, Gupta KV, Sorooshian S (1993) Shuffled Complex Evolution Approach for Effective and Efficient Global Minimization. In Jour. of optimization theorie and applications. Vol 76, 3, 501-521 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] NetCDF output in R
Dear CSAG R users, I will be glad if someone can point out what I am doing wrong or not doing at all in this. I am trying to write out netcdf file in R. I have 26 time step but only the first time step is written. For example: library(ncdf) path - '/home/work/' forecast - open.ncdf(paste(path,'cam.1980.2005.nc',sep=)) fore - get.var.ncdf(forecast,'ppt') lon - get.var.ncdf(forecast,'lon') lat - get.var.ncdf(forecast,'lat') dim(fore)[3] 26 times - 1:dim(fore)[3] write.netcdf.time(paste(path,'cam_fore.nc',sep=), fore,lon,lat,times) [1] put.var.ncdf: warning: you asked to write 1440 values, but the passed data array has 37440 entries! [[1]] [1] 6 Warning message: In 1:nt : numerical expression has 26 elements: only the first used # function for writing out the netcdf file # write.netcdf.time - function(filename='outputfile.nc',data,lons,lats,nt){ lon - dim.def.ncdf('lon','degrees_east',lons) lat - dim.def.ncdf('lat','degrees_north',lats) times - 1:nt tdim - dim.def.ncdf('time','days since 1980-01-01', times, unlim=TRUE) # levs - dim.def.ncdf('lev','pressure',levs) var - var.def.ncdf('data','unitless',list(lon,lat,tdim),-999.9) ncid - create.ncdf(filename,list(var)) put.var.ncdf(ncid, var, data) close.ncdf(ncid) } ##end of function### Thank you. Nana Browne There is no key to happiness. The door is always open. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] 2 significant digits
On Nov 10, 2009, at 10:22 AM, carol white wrote: Hi, How to represent a rounded number ending with 0 with 2-significant digits? If I have for ex, 0.8031 and I use signif or round with digits = 2, I'll get 0.8. If I use format, I get character type (even if I pass number as parameter) and if I convert with as.numeric, I'll lose one significant digit (0): format(13.7, nsmall = 2) [1] 13.70 as.numeric( format(13.7, nsmall = 2)) [1] 13.7 Regards, Carol See ?sprintf if you want to force two decimal places, as opposed to two significant digits: sprintf(%.2f, 0.8031) [1] 0.80 A trailing zero is not a significant digit, hence it will not be displayed if you are formatting based upon significant digits, as opposed to a fixed number of decimal places. If you want to force the formatting of a numeric value for output, it will end up being a character, which should not be a problem, since you are presumably looking to display or output the number for reading, not to be used for subsequent calculations. Conceptually, you need to separate how R stores numeric values internally, versus how numeric values are displayed (printed) during output. By default, R stores numeric values internally as double precision floats to allow for mathematical calculations to be performed on those values. The printing of those numbers for output in an R console or GUI, is by default based upon significant digits (see ?print.default), but you can alter that behavior using various functions such as sprintf, to enable pretty output for formatted reports and tables, etc. HTH, Marc Schwartz __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] polygon kills X-server
2009/11/10 Uwe Ligges lig...@statistik.tu-dortmund.de: This one is extraordinary dangerous: it also killed my kind of X server called Windows completely so that I had to reset the machine. Perhaps it should be debugged on the Linux side with less serious side effects The best way to debug this might be via a nested X server - something like Xnest or Xephyr - or possibly on a virtual machine in the hope that the virtual machine's display system crashes before the host's does. I haven't tried this yet! Barry __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] drop unused levels in subset.data.frame
Neat, I reinvented the wheel! Would that seem like a useful example at the end of the help page for ?subset ? (it currently has very little to say about drop). Thanks also to David for the alternative idea. Best regards, baptiste 2009/11/10 Marc Schwartz marc_schwa...@me.com: On Nov 10, 2009, at 9:49 AM, baptiste auguie wrote: Dear list, subset has a 'drop' argument that I had often mistaken for the one in [.factor which removes unused levels. Clearly it doesn't work that way, as shown below, d - data.frame(x = factor(letters[1:15]), y = factor(LETTERS[1:3])) s - subset(d, y==A, drop=TRUE) str(s) 'data.frame': 5 obs. of 2 variables: $ x: Factor w/ 15 levels a,b,c,d,..: 1 4 7 10 13 $ y: Factor w/ 3 levels A,B,C: 1 1 1 1 1 The subset still retains all the unused factor levels. I wonder how people usually get rid of all unused levels in a data.frame after subsetting? I came up with this but I may have missed a better built-in solution, dropit - function (d, columns = names(d), ...) { d[columns] = lapply(d[columns], [, drop=TRUE, ...) d } str(dropit(s)) 'data.frame': 5 obs. of 2 variables: $ x: Factor w/ 5 levels a,d,g,j,..: 1 2 3 4 5 $ y: Factor w/ 1 level A: 1 1 1 1 1 There is a page in the R wiki here: http://wiki.r-project.org/rwiki/doku.php?id=tips:data-manip:drop_unused_levels that has some approaches. HTH, Marc Schwartz __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] drop unused levels in subset.data.frame
If you don't want to preserve factor levels when subsetting use characters. There are very few other differences in behavior. Hadley On Tuesday, November 10, 2009, baptiste auguie baptiste.aug...@googlemail.com wrote: Dear list, subset has a 'drop' argument that I had often mistaken for the one in [.factor which removes unused levels. Clearly it doesn't work that way, as shown below, d - data.frame(x = factor(letters[1:15]), y = factor(LETTERS[1:3])) s - subset(d, y==A, drop=TRUE) str(s) 'data.frame': 5 obs. of 2 variables: $ x: Factor w/ 15 levels a,b,c,d,..: 1 4 7 10 13 $ y: Factor w/ 3 levels A,B,C: 1 1 1 1 1 The subset still retains all the unused factor levels. I wonder how people usually get rid of all unused levels in a data.frame after subsetting? I came up with this but I may have missed a better built-in solution, dropit - function (d, columns = names(d), ...) { d[columns] = lapply(d[columns], [, drop=TRUE, ...) d } str(dropit(s)) 'data.frame': 5 obs. of 2 variables: $ x: Factor w/ 5 levels a,d,g,j,..: 1 2 3 4 5 $ y: Factor w/ 1 level A: 1 1 1 1 1 Best regards, baptiste __ r-h...@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- http://had.co.nz/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R process gets killed spontaneously
Hi Peng, in a very simplistic manner, what happens is that the Operating System thinks it is too dangerous to let the R process to use so much memory. So, to protect the whole system, it kills R, before the system becomes unstable. I've been looking at the problem you observed last week and, as I promised, I will come back to you with my findings sometime this week. I was already informed on how to reduce the memory footprint when creating the specific object you're interested in, but still need to run some tests. Will get back to you soon, b On Nov 10, 2009, at 2:23 PM, Peng Yu wrote: 2009/11/10 Uwe Ligges lig...@statistik.tu-dortmund.de: Peng Yu wrote: My R process has been killed for a few times, although the system administrator did not do so. It happened when R attempted to allocate a lot of memory. I'm wondering whether R would spontaneously kill itself if it can not allocate enough memory? It does not kill itself. If it was not some out-of-memory kill process of your OS, then you may have seen some segfault. We need reproducible code and all the details described in the posting guide. Please see this thread about the necessary information. http://www.mail-archive.com/r-help@r-project.org/msg74804.html __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] phase determination
Hi, I'm trying to determine the phase of irregularly sampled data. Is there any particular reason why both spec.pgram and spec.ls return phase-NULL for vectors? Thank you. Lisandro x Lisandro Benedetti-Cecchi Associate Professor in Ecology Department of Biology - University of Pisa Via Derna 1, 56126 Pisa, Italy Office: 39 050 2211413 Fax: 39 050 2211410 e-mail: lbenede...@biologia.unipi.it http://www.discat.unipi.it/BiolMar/people/LBC/LBC.htm http://www.unipi.it [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] NetCDF output in R
Dear R users, I will be glad if someone can point out what I am doing wrong or not doing at all in this. I am trying to write out netcdf file in R. I have 26 time step but only the first time step is written. For example: library(ncdf) path - '/home/work/' forecast - open.ncdf(paste(path,'cam.1980.2005.nc',sep=)) fore - get.var.ncdf(forecast,'ppt') lon - get.var.ncdf(forecast,'lon') lat - get.var.ncdf(forecast,'lat') dim(fore)[3] 26 times - 1:dim(fore)[3] write.netcdf.time(paste(path,'cam_fore.nc',sep=), fore,lon,lat,times) [1] put.var.ncdf: warning: you asked to write 1440 values, but the passed data array has 37440 entries! [[1]] [1] 6 Warning message: In 1:nt : numerical expression has 26 elements: only the first used # function for writing out the netcdf file # write.netcdf.time - function(filename='outputfile.nc',data,lons,lats,nt){ lon - dim.def.ncdf('lon','degrees_east',lons) lat - dim.def.ncdf('lat','degrees_north',lats) times - 1:nt tdim - dim.def.ncdf('time','days since 1980-01-01', times, unlim=TRUE) # levs - dim.def.ncdf('lev','pressure',levs) var - var.def.ncdf('data','unitless',list(lon,lat,tdim),-999.9) ncid - create.ncdf(filename,list(var)) put.var.ncdf(ncid, var, data) close.ncdf(ncid) } ##end of function### Thank you. Nana Browne There is no key to happiness. The door is always open. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Error: cannot allocate vector of size...
On Tue, 10 Nov 2009, maiya wrote: OK, it's the simple math that's confusing me :) So you're saying 2.4GB, while windows sees the data as 700KB. Why is that different? Your data are stored on disk as a text file (in CSV format, in fact), not as numbers. This can take up less space. And lets say I could potentially live with e.g. 1/3 of the cases - that would make it .8GB, which should be fine? But then my question is if there is any way to sample the rows in read.table? Or what would be the best way of importing a random third of my cases? A better solution is probably to read a subset of the columns at a time. The easiest way to do this is probably to read the data into a SQLite database with the 'sqldf' package, but another solution is to use the colClasses= argument to read.table() and specify NULL for the classes of the columns you don't want to read. There are other ways as well. It might even be faster to do the cross-tabulations in a database and read the resulting summaries into R to compute any statistics you need. Thanks! M. jholtman wrote: A little simple math. You have 3M rows with 100 items on each row. If read in this would be 300M items. If numeric, 8 bytes/item, this is 2.4GB. Given that you are probably using a 32 bit version of R, you are probably out of luck. A rule of thumb is that your largest object should consume at most 25% of your memory since you will probably be making copies as part of your processing. Given that, is you want to read in 100 variables at a time, I would say your limit would be about 500K rows to be reasonable. So you have a choice; read in fewer rolls, read in all 3M rows but at 20 columns per read, put the data in a database and extract what you need. Unless you go to a 64-bit version of R you will probably not be able to have the whole file in memory at one time. On Tue, Nov 10, 2009 at 7:10 AM, maiya maja.zaloz...@gmail.com wrote: I'm trying to import a table into R the file is about 700MB. Here's my first try: DD-read.table(01uklicsam-20070301.dat,header=TRUE) Error: cannot allocate vector of size 15.6 Mb In addition: Warning messages: 1: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : Reached total allocation of 1535Mb: see help(memory.size) 2: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : Reached total allocation of 1535Mb: see help(memory.size) 3: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : Reached total allocation of 1535Mb: see help(memory.size) 4: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : Reached total allocation of 1535Mb: see help(memory.size) Then I tried memory.limit(size=4095) and got DD-read.table(01uklicsam-20070301.dat,header=TRUE) Error: cannot allocate vector of size 11.3 Mb but no additional errors. Then optimistically to clear up the workspace: rm() DD-read.table(01uklicsam-20070301.dat,header=TRUE) Error: cannot allocate vector of size 15.6 Mb Can anyone help? I'm confused by the values even: 15.6Mb, 1535Mb, 11.3Mb? I'm working on WinXP with 2 GB of RAM. Help says the maximum obtainable memory is usually 2Gb. Surely they mean GB? The file I'm importing has about 3 million cases with 100 variables that I want to crosstabulate each with each. Is this completely unrealistic? Thanks! Maja -- View this message in context: http://old.nabble.com/Error%3A-cannot-allocate-vector-of-size...-tp26282348p26282348.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://old.nabble.com/Error%3A-cannot-allocate-vector-of-size...-tp26282348p26283467.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Thomas Lumley Assoc. Professor, Biostatistics tlum...@u.washington.eduUniversity of Washington, Seattle __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and
Re: [R] when vectorising does not work: silent function fail?
On Tue, 10 Nov 2009, jim holtman wrote: Have you tried something like this: my.results = apply(chr, 2, function(x){ result - try(anova(lrm( cpstc.f ~ x + time.cpstc + age + sex + mri))[1,3]) if (inherits(result, try-error)) return(NULL) result }) This should catch the error and have NULL in that list element. It's probably more useful to return a numeric vector of the right size, with tryCatch() tryCatch(anova(lrm( cpstc.f ~ x + time.cpstc + age + sex + mri))[1,3], error=function(e) NA) Also, you probably get less data copying by using a for() or while() loop than by using apply() in this context. Finally, the overhead of formula parsing and model matrix construction repeated thousands of times probably dominates this computation; if it isn't just a one-off it would probably be worth a lower-level implementation. -thomas On Tue, Nov 10, 2009 at 10:04 AM, Federico Calboli f.calb...@imperial.ac.uk wrote: Dear All, I'm using apply to do some genetic association analysis along a chromosome, with many thousands markers. For each marker the analysis is the same, so I was planning to use apply(chrom, 2, somefunction) In the specific case I do: my.results = apply(chr, 2, function(x){anova(lrm( cpstc.f ~ x + time.cpstc + age + sex + mri))[1,3]}) This is all good and well in theory, but in practice the lrm() model will fail for some markers and wreck the whole process. Failure for some markers is no problem for me, but *predicting* which markers will fail can be hugely problematic. I then though of creating some fucntion to catch the error messages that would otherwise scr*w things over: my.lrm = function(x){ pol = NULL pol = lrm( cpstc.f ~ x + time.cpstc + age + sex + mri) if(length(pol) 0) rez = anova(pol)[1,3] if(length(pol) == 0) rez = 1 rez} my.results = apply(chr, 2, my.lrm) Still no joy, even adding try() in the evaluation and options(show.error.messages = F) I am at loss on how to get the darn function to bail out *silently* if needs be so I can just smack a replacement value in --which would also have the benefit of keeping the order of the markers. Any idea will be gratefully acknowledged. Best, Federico -- Federico C. F. Calboli Department of Epidemiology and Public Health Imperial College, St Mary's Campus Norfolk Place, London W2 1PG Tel +44 (0)20 7594 1602 Fax (+44) 020 7594 3193 f.calboli [.a.t] imperial.ac.uk f.calboli [.a.t] gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Thomas Lumley Assoc. Professor, Biostatistics tlum...@u.washington.eduUniversity of Washington, Seattle __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] when vectorising does not work: silent function fail?
On 10 Nov 2009, at 17:16, tlum...@u.washington.edu wrote: On Tue, 10 Nov 2009, jim holtman wrote: Have you tried something like this: my.results = apply(chr, 2, function(x){ result - try(anova(lrm( cpstc.f ~ x + time.cpstc + age + sex + mri))[1,3]) if (inherits(result, try-error)) return(NULL) result }) This should catch the error and have NULL in that list element. It's probably more useful to return a numeric vector of the right size, with tryCatch() tryCatch(anova(lrm( cpstc.f ~ x + time.cpstc + age + sex + mri)) [1,3], error=function(e) NA) Both suggestions work beautifully, thanks to both! Also, you probably get less data copying by using a for() or while() loop than by using apply() in this context. I'm happy to take this in, I am not sure I can see how it would work (the less data copying) though. Finally, the overhead of formula parsing and model matrix construction repeated thousands of times probably dominates this computation; if it isn't just a one-off it would probably be worth a lower-level implementation. The markers are ~ 1.5M, but I don't do this kind of analysis often enough to justify any more work than a measly script. Cheers, Fede -thomas On Tue, Nov 10, 2009 at 10:04 AM, Federico Calboli f.calb...@imperial.ac.uk wrote: Dear All, I'm using apply to do some genetic association analysis along a chromosome, with many thousands markers. For each marker the analysis is the same, so I was planning to use apply(chrom, 2, somefunction) In the specific case I do: my.results = apply(chr, 2, function(x){anova(lrm( cpstc.f ~ x + time.cpstc + age + sex + mri))[1,3]}) This is all good and well in theory, but in practice the lrm() model will fail for some markers and wreck the whole process. Failure for some markers is no problem for me, but *predicting* which markers will fail can be hugely problematic. I then though of creating some fucntion to catch the error messages that would otherwise scr*w things over: my.lrm = function(x){ pol = NULL pol = lrm( cpstc.f ~ x + time.cpstc + age + sex + mri) if(length(pol) 0) rez = anova(pol)[1,3] if(length(pol) == 0) rez = 1 rez} my.results = apply(chr, 2, my.lrm) Still no joy, even adding try() in the evaluation and options(show.error.messages = F) I am at loss on how to get the darn function to bail out *silently* if needs be so I can just smack a replacement value in --which would also have the benefit of keeping the order of the markers. Any idea will be gratefully acknowledged. Best, Federico -- Federico C. F. Calboli Department of Epidemiology and Public Health Imperial College, St Mary's Campus Norfolk Place, London W2 1PG Tel +44 (0)20 7594 1602 Fax (+44) 020 7594 3193 f.calboli [.a.t] imperial.ac.uk f.calboli [.a.t] gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Thomas Lumley Assoc. Professor, Biostatistics tlum...@u.washington.eduUniversity of Washington, Seattle -- Federico C. F. Calboli Department of Epidemiology and Public Health Imperial College, St. Mary's Campus Norfolk Place, London W2 1PG Tel +44 (0)20 75941602 Fax +44 (0)20 75943193 f.calboli [.a.t] imperial.ac.uk f.calboli [.a.t] gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] NetCDF output in R
On Tue, 10 Nov 2009, nana wrote: I will be glad if someone can point out what I am doing wrong or not doing at all in this. Sending the same message to both r-help and r-devel is one thing you are doing wrong. I am trying to write out netcdf file in R. I have 26 time step but only the first time step is written. For example: library(ncdf) path - '/home/work/' forecast - open.ncdf(paste(path,'cam.1980.2005.nc',sep=)) fore - get.var.ncdf(forecast,'ppt') lon - get.var.ncdf(forecast,'lon') lat - get.var.ncdf(forecast,'lat') dim(fore)[3] 26 times - 1:dim(fore)[3] write.netcdf.time(paste(path,'cam_fore.nc',sep=), fore,lon,lat,times) [1] put.var.ncdf: warning: you asked to write 1440 values, but the passed data array has 37440 entries! [[1]] [1] 6 Warning message: In 1:nt : numerical expression has 26 elements: only the first used This looks like the cause of your problem. The function has times - 1:nt implying that nt is a single number, but you are passing the vector 'times' as the argument. # function for writing out the netcdf file # write.netcdf.time - function(filename='outputfile.nc',data,lons,lats,nt){ lon - dim.def.ncdf('lon','degrees_east',lons) lat - dim.def.ncdf('lat','degrees_north',lats) times - 1:nt tdim - dim.def.ncdf('time','days since 1980-01-01', times, unlim=TRUE) # levs - dim.def.ncdf('lev','pressure',levs) var - var.def.ncdf('data','unitless',list(lon,lat,tdim),-999.9) ncid - create.ncdf(filename,list(var)) put.var.ncdf(ncid, var, data) close.ncdf(ncid) } ##end of function### Thank you. Nana Browne There is no key to happiness. The door is always open. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Thomas Lumley Assoc. Professor, Biostatistics tlum...@u.washington.eduUniversity of Washington, Seattle __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] merge data
df1 -- dataframe with column date and several other columns. #rows 40k Several of the dates are repeated. df2 -- dataframe with two columns date and index. #rows ~130 This is really a map from date to index. I would like to create a column called index in df1 which has the corresponding index from df2. The following works: index - NULL for(wk in df1$week){ index - c(index,df2$index[df2$week==wk]) } and then add index to df1. Can you please suggest a better way of doing this? I didn't think merge was suitable for this...is it? THANKS. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Interfacing R and C++
Hi, I m currently working on interfacing R and C++ (passing a matrix from R to C++, doing some stuff, returning a vector of results). There seem to be a number of ways of doing this, such as rcpp and swig. Is there a recommended way of doing this? In the long run I would like what I m working on to be portable across OS etc. Nathan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Titles on panel graphs created in zoo
I have a plotted a stacked panel graph (single x axis and multiple y axis) using the package zoo and would like to add a title for each separate panel. I am using the script: z - with(mydata,zoo(cbind(mydata$Water.level,mydata$Submerged.plants, mydata$Crayfish.CPUE,mydata$Carp.CPUE),Year)) plot(z,type=b,pch=16,lty=2,xlab=Year,ylab=c(Metres,Realtive density, CPUE,CPUE),main=) Any help would be much appreciated. M -- View this message in context: http://old.nabble.com/Titles-on-panel-graphs-created-in-zoo-tp26285713p26285713.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] HEEELP!!!!
Hello. My name is Ana. I´m doing an eology master, and I´m just learning how R works. I have a Mac OS X 10.5.6, and I´m tryng to run just a simple ANOVA nanalyses. I dowloaded R version 2.10.0, and it seems I have problems with the script. I don´t know what to do, I´ve already change the languages, be sure of being working in the correct directory and doesn´t work the script is: #1. example data01-read.table(ch01.txt,header=T) names(data01) attach(data01) FERTIL-factor(FERTIL) model01-(YIELD~FERTIL) summary(model01) anova(model01) all my classmates, even those with a mac operator, wrote the same, and it worked. When copying the script in the R console, it seems that there is a problem with the ~ symbol. this symbol is not in the keyboard, so I select it from spetial characters, and then i selected with shift an key, and still, it doesn´t works I attached th file with the data. i hope you can help me! Ana. FERTIL YIELD YIELDM VARIETY SEX DBH FLOWERS YIELD2 1 6.2725.12 1 2 174 234 6.27 1 5.3617.25 1 1 171 110 5.36 1 6.3926.42 1 2 212 512 6.39 1 4.8516.08 1 1 187 133 4.85 1 5.9922.15 1 2 227 11075.99 1 7.1415.92 1 1 274 624 7.14 1 5.0840.25 2 2 86 8 5.08 1 4.0735.25 2 2 161 333 4.07 1 4.3531.98 2 1 285 637 4.35 1 4.9536.52 2 1 182 225 4.95 2 3.0743.32 2 2 225 512 3.07 2 3.2937.12 1 133 47 3.29 2 4.0418.33 2 154 164 4.04 2 4.1922.63 2 71 1 4.19 2 3.4125.93 1 133 36 3.41 2 3.7515.05 3 1 267 524 3.75 2 4.8728.05 4 2 166 198 4.87 2 3.9428.55 4 2 145 225 3.94 2 6.2833.24 2 207 548 NA 2 3.1531.68 4 1 216 272 3.15 3 4.0430.32 4 2 157 164 4.04 3 3.7927.58 4 2 87 5 3.79 3 4.56NA NA 1 185 133 4.56 3 4.55NA NA 1 183 133 4.55 3 4.55NA NA 2 193 419 4.55 3 4.53NA NA 1 264 442 4.53 3 3.53NA NA 1 83 3 3.53 3 3.71NA NA 2 300 14823.71 3 7 NA NA 1 242 333 NA 3 4.61NA NA 2 316 11384.61 NA NA NA NA 2 217 561 NA NA NA NA NA 1 235 488 NA NA NA NA NA 2 189 282 NA NA NA NA NA 1 242 536 NA NA NA NA NA 2 84 11 NA NA NA NA NA 2 135 148 NA NA NA NA NA 1 267 512 NA NA NA NA NA 2 122 70 NA NA NA NA NA 2 280 1107NA NA NA NA NA 2 295 941 NA NA NA NA NA 2 139 89 NA NA NA NA NA 2 165 181 NA NA NA NA NA 2 81 8 NA NA NA NA NA 1 190 173 NA NA NA NA NA 2 68 1 NA NA NA NA NA 2 314 1091NA NA NA NA NA 1 284 561 NA NA NA NA NA 2 77 3 NA NA NA NA NA 1 228 312 NA NA NA NA NA 2 282 1398NA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] How to use package-rsm to generate experimental design
Hi guys, I am a totally begginer with R. I am planning to use R to design and optimize my experiment. The experiment includes 4 factors, and three of the fators have 3 levels, and the last factor has 6 levels. I am having a really hard time to learn this program on my own and didn't find enough useful material to get start. It would be appreciated if any one send me useful materials, especially those on how to use the pakage-RSM(Response surface methodoly) to generate experiment design. Thanks! Rob -- View this message in context: http://old.nabble.com/How-to-use-package-rsm-to-generate-experimental-design-tp26287354p26287354.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] source() vs attach()0
Hello, After hours of googling I could not resolve the following (although it seems simple): I would like to put subfunctions in a separate .R file that is then called with source() from inside several main functions. A crude example would be as follows: file subtest.R ** subtest - function() { foo - foo+1 } ** *** main function test-function(foo) { source(test.R,local=TRUE) subtest() return(foo) } ** Then executing the code test(1) works as prescribed. But of course, each function call to test() 'sources' (i.e. parse() and eval()) the file subtest.R again and again. (try e.g. changing line 2 in subtest.R to 'foo- foo +2' and run 'test(1)' ) How can I 'attach' the function subtest inside the main function, such that it is not evaluated again at each function call? And this while maintaining the above 'subfunction' functionality (i.e. attaching the subfunction under the right environment)? I was not able to resolve that - thanks in advance for any help! all the best Stefan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Comparison of vectors in a matrix
Nice problem! If I understand you correctly, here's how to do it (with list-based matrices): set.seed(1) (x - matrix(lapply(rpois(10,2)+1, function(k) sample(letters[1:10], size=k)), ncol=2, dimnames=list(1:5,c(A,B A B 1 Character,2 Character,5 2 Character,2 Character,5 3 Character,3 Character,3 4 Character,5 Character,3 5 Character,2 i x[1,1] [[1]] [1] c b x[1,2] [[1]] [1] c d a j f (y - cbind(x, A-B=apply(x, 1, function(ab) setdiff(ab[[1]], ab[[2]] A B A-B 1 Character,2 Character,5 b 2 Character,2 Character,5 g 3 Character,3 Character,3 Character,3 4 Character,5 Character,3 Character,2 5 Character,2 i Character,2 y[1,3] [[1]] [1] b -- Tony Plate esterhazy wrote: Hi, I have a matrix with two columns, and the elements of the matrix are vectors. So for example, in line 3 of column 1 I have a vector v31=(marc, robert, marie). What I need to do is to compare all vectors in column 1 and 2, so as to get, for example setdiff(v31,v32) into a new column. Is there a way to do this in R? Thanks! __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] merge data
On Nov 10, 2009, at 12:36 PM, Chuck White wrote: df1 -- dataframe with column date and several other columns. #rows 40k Several of the dates are repeated. df2 -- dataframe with two columns date and index. #rows ~130 This is really a map from date to index. I would like to create a column called index in df1 which has the corresponding index from df2. The following works: index - NULL for(wk in df1$week){ index - c(index,df2$index[df2$week==wk]) } and then add index to df1. Can you please suggest a better way of doing this? I didn't think merge was suitable for this...is it? THANKS. I think merge should work, but if you really have looked at the various arguments, tested reasonable examples and are still convinced it wouldn't, then see what you get with: df1 - data.frame(dt = Sys.Date() - sample(100:120, 30, replace=TRUE), 1:30) df2 - data.frame(dt2 = Sys.Date() -100:120, index=LETTERS[1:21]) df1$index - df2[ match(df1$dt,df2$dt2), index] df1 dt X1.30 index 1 2009-07-30 1 D 2 2009-07-16 2 R 3 2009-07-23 3 K 4 2009-07-29 4 E 5 2009-07-15 5 S 6 2009-08-02 6 A 7 2009-07-18 7 P 8 2009-07-21 8 M 9 2009-07-27 9 G 10 2009-07-2610 H 11 2009-07-3111 C 12 2009-07-2612 H 13 2009-07-1813 P 14 2009-07-2314 K 15 2009-07-2115 M 16 2009-07-1916 O 17 2009-07-1417 T 18 2009-07-1618 R 19 2009-07-1519 S 20 2009-07-1320 U 21 2009-07-2821 F 22 2009-07-2022 N 23 2009-07-2423 J 24 2009-07-2024 N 25 2009-07-1625 R 26 2009-07-3026 D 27 2009-07-1427 T 28 2009-08-0228 A 29 2009-07-1929 O 30 2009-07-2630 H I tried merge(df1, df2, by.x=1, by.y=1) and got the same result modulo the order of the output. -- David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Why arima does not work when the time series is a costant?
Hi! One of my time series happens to be a constant. When I call arima with c(0, 0, 0), it gives me error. Here is an example: ts1 - ts(rep(1, 29)) fit - arima(ts1, order=c(0,0,0)) When I run the second line above, it gives me the following error Error in decompose(ts(x[1L:wind], start = start(x), frequency = f), seasonal) : time series has no or less than 2 periods I was confused why arima does not work if my time series is a constant. It should be very simple - resulted model should be a constant with the value of 1. Thanks, Aijun. -- View this message in context: http://old.nabble.com/Why-arima-does-not-work-when-the-time-series-is-a-costant--tp26287374p26287374.html Sent from the R help mailing list archive at Nabble.com. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] HEEELP!!!!
Hello On 11/10/09, Ana María Prieto prieto.anama...@gmail.com wrote: When copying the script in the R console, it seems that there is a problem with the ~ symbol. this symbol is not in the keyboard, so I select it from spetial characters, For what language is your keyboard designed? If it is for English, similar to this one [1], try a British English layout, and then the ~ sign will (possibly) be on the \ button (press shift+\). I'm on Linux, but I hope it would be similar for Mac. Liviu [1] http://www.mrtotes.co.uk/UK_Mac_Keyboard.jpg __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Interfacing R and C++
On 11/10/2009 12:35 PM, Nathan Harmston wrote: Hi, I m currently working on interfacing R and C++ (passing a matrix from R to C++, doing some stuff, returning a vector of results). There seem to be a number of ways of doing this, such as rcpp and swig. Is there a recommended way of doing this? In the long run I would like what I m working on to be portable across OS etc. Just follow the instructions in the Writing R Extensions manual. You're not doing anything very unusual, so rcpp and swig may be overkill. Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] HEEELP!!!!
On Nov 10, 2009, at 11:24 AM, Ana María Prieto wrote: Hello. My name is Ana. I´m doing an eology master, and I´m just learning how R works. I have a Mac OS X 10.5.6, and I´m tryng to run just a simple ANOVA nanalyses. I dowloaded R version 2.10.0, and it seems I have problems with the script. I don´t know what to do, I´ve already change the languages, be sure of being working in the correct directory and doesn´t work the script is: #1. example data01-read.table(ch01.txt,header=T) names(data01) attach(data01) FERTIL-factor(FERTIL) model01-(YIELD~FERTIL) I suspect that the problem is not with the lack of a tilde (~) or the misinterpretation of the special character that you seem to have gotten into this mail message, but rather with your failure to type the two letters lm thusly: model01 - lm(YIELD~FERTIL) I have 3 Macs of varying vintages (one desktop and two laptops) and they all have the tilde in the upper left corner just below the escape key. Future submissions to the R-help list should be accompanied by a more informative subject line. That sort of Subject is considered rude (some might even say infantile) behavior in this arena. -- David. summary(model01) anova(model01) all my classmates, even those with a mac operator, wrote the same, and it worked. When copying the script in the R console, it seems that there is a problem with the ~ symbol. this symbol is not in the keyboard, so I select it from spetial characters, and then i selected with shift an key, and still, it doesn´t works I attached th file with the data. i hope you can help me! Ana. -- David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Monte Carlo Simulation in R...
Hi, Dear R users, I'm wondering if I can do Monte Carlo Simulation in R. My problem is like this: I know variable X follows Gamma distribution with shape parameter 0.067 and scale parameter 0.008. The sum of the X is 2000. I need R help me to simulate a vector of X that satisfies both the probability distribution and the sum. Anyone has a clue to this? Much appreciated. Regards Garry [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] All possible combinations of functions within a function
Dear All, I wrote a function for cluster analysis to compute cophenetic correlations between dissimilarity matrices (using the VEGAN library) and cluster analyses of every possible clustering algorithm (SEE ATTACHED) http://old.nabble.com/file/p26288610/cor.coef.R cor.coef.R . As it is now, it is extremely long, and for the future I was hoping to find a more efficient way of doing this sort of thing. To give you an outline of the function, first I create the dissimiarity matrices using all possible methods in the VEGAN command vegdist, then create the clusters using all possible algorithms in hclust and the dissimilarity matrices I crated, then create a table, and in one column, list all combinations, and in the other, compute and put the cophenetic correlation. Any help would be appreciated! I'm pretty new to writing my own functions but I see great time-saving potential. Thanks! Mike -- View this message in context: http://old.nabble.com/All-possible-combinations-of-functions-within-a-function-tp26288610p26288610.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Command-line arguments and --interactive
On Tue, 10 Nov 2009, Duncan Murdoch wrote: --interactive tells R that there is a human producing the input stream, so it can ask questions and expect them to be answered. In your experiments with it, your input stream was the pipe holding the output of echo, and R got confused because that pipe wouldn't answer its question. I see. That makes sense, and thank you. Your problem is that you want an input stream that starts out from your fixed code and then switches to your shell's stdin. Correct. Or, you might consider it an ability to add a command or two to the effective end of .Rprofile. I think you can do that on some systems by saving your input to a file then concatenating it to stdin, e.g. something like this: echo '10*5; scan()' test.R cat test.R - | R --interactive I don't know if there's a way to do this in one line, and I'd expect some oddities. Creating a new file progammatically is difficult. Another solution, suggested by peterdc in the #R irc channel, was to run R itself and then use the system() command to complete the prior-to-launching-R task...that is the solution I'm going with, but thanks very much for your help! --Adam __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Monte Carlo Simulation in R...
On 11/10/2009 1:25 PM, Hongwei Dong wrote: Hi, Dear R users, I'm wondering if I can do Monte Carlo Simulation in R. My problem is like this: I know variable X follows Gamma distribution with shape parameter 0.067 and scale parameter 0.008. The sum of the X is 2000. I need R help me to simulate a vector of X that satisfies both the probability distribution and the sum. Anyone has a clue to this? Much appreciated. Your requirements are slightly contradictory or incomplete. Here's one way to fully specify the problem: The X_i values are independent Gammas, with the given shape and scale. You want to simulate from the joint distribution conditional on the event sum(X) == 2000. Is that your problem? I don't know how to do the simulation, but maybe someone else does. Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] All possible combinations of functions within a function
Here is a hint of how you might want to do the first part. You might want to study the 'lappy' function vegList - lapply(c('bray', ..., 'binomial'), function(.method){ vegdist(x, method=.method) }) clustList - lapply(vegList, function(.dist){ lapply(c('average', ..., 'centroid'), function(.dist){ hclust(.veg, .dist) }) }) rest left as exercise for the reader. On Tue, Nov 10, 2009 at 2:05 PM, bikemike42 ml...@tamu.edu wrote: Dear All, I wrote a function for cluster analysis to compute cophenetic correlations between dissimilarity matrices (using the VEGAN library) and cluster analyses of every possible clustering algorithm (SEE ATTACHED) http://old.nabble.com/file/p26288610/cor.coef.R cor.coef.R . As it is now, it is extremely long, and for the future I was hoping to find a more efficient way of doing this sort of thing. To give you an outline of the function, first I create the dissimiarity matrices using all possible methods in the VEGAN command vegdist, then create the clusters using all possible algorithms in hclust and the dissimilarity matrices I crated, then create a table, and in one column, list all combinations, and in the other, compute and put the cophenetic correlation. Any help would be appreciated! I'm pretty new to writing my own functions but I see great time-saving potential. Thanks! Mike -- View this message in context: http://old.nabble.com/All-possible-combinations-of-functions-within-a-function-tp26288610p26288610.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] All possible combinations of functions within a function
Hi, From what I understand of your code, you might find the following construct useful, funs - c(mean, sum, sd, diff) x - 1:10 lapply(funs, do.call, args=list(x)) and then working with lists rather than naming every object individually. You might find mapply useful too when you have to pass several parameters. HTH, baptiste 2009/11/10 bikemike42 ml...@tamu.edu: Dear All, I wrote a function for cluster analysis to compute cophenetic correlations between dissimilarity matrices (using the VEGAN library) and cluster analyses of every possible clustering algorithm (SEE ATTACHED) http://old.nabble.com/file/p26288610/cor.coef.R cor.coef.R . As it is now, it is extremely long, and for the future I was hoping to find a more efficient way of doing this sort of thing. To give you an outline of the function, first I create the dissimiarity matrices using all possible methods in the VEGAN command vegdist, then create the clusters using all possible algorithms in hclust and the dissimilarity matrices I crated, then create a table, and in one column, list all combinations, and in the other, compute and put the cophenetic correlation. Any help would be appreciated! I'm pretty new to writing my own functions but I see great time-saving potential. Thanks! Mike -- View this message in context: http://old.nabble.com/All-possible-combinations-of-functions-within-a-function-tp26288610p26288610.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Generate Random Draw from Gamma Distribution Re: Monte Carlo Simulation in R...
Exactly! Thanks, Duncan. Let me re-phrase me question like this: 1) X_i values are independent Gammas, with the shape 0.067 and scale 0.008 2) Min(X)=1 and Max(X)=85 3) SUM(X)=2000 4) Do I also have to define the number of draws? if yes, it could be 250. Based on these restrictions, I want to generate random draw. I'm wondering how I can do this in R. Thanks. Garry On Tue, Nov 10, 2009 at 11:17 AM, Duncan Murdoch murd...@stats.uwo.cawrote: On 11/10/2009 1:25 PM, Hongwei Dong wrote: Hi, Dear R users, I'm wondering if I can do Monte Carlo Simulation in R. My problem is like this: I know variable X follows Gamma distribution with shape parameter 0.067 and scale parameter 0.008. The sum of the X is 2000. I need R help me to simulate a vector of X that satisfies both the probability distribution and the sum. Anyone has a clue to this? Much appreciated. Your requirements are slightly contradictory or incomplete. Here's one way to fully specify the problem: The X_i values are independent Gammas, with the given shape and scale. You want to simulate from the joint distribution conditional on the event sum(X) == 2000. Is that your problem? I don't know how to do the simulation, but maybe someone else does. Duncan Murdoch [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] do.call and timeSeries
Bierbryer, Andrew wrote: Does anyone know why the following code hangs on the do.call, but works fine when I either comment out the require(timeSeries) or only do 2 levels of a for loop instead of 3? Thanks, Andrew Bierbryer require(timeSeries) num - 1 x.list - list() for ( i in 1:10 ) { for ( j in 1:20 ) { for ( k in 1:30 ) { x.list[[num]] - cbind(num,10) num - num + 1 } } } cat('calling do.call\n') x.df - do.call(rbind,x.list) cat('called do.call\n') This seems like a highly unusual way to generate x.df, but it works fine for me. No idea why this doesn't work for you. Here's my sessionInfo; what's yours? sessionInfo() R version 2.10.0 Patched (2009-11-02 r50295) i386-pc-mingw32 locale: [1] LC_COLLATE=English_Canada.1252 LC_CTYPE=English_Canada.1252 [3] LC_MONETARY=English_Canada.1252 LC_NUMERIC=C [5] LC_TIME=English_Canada.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] timeSeries_2100.84 timeDate_2100.86 -Peter Ehlers [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Generate Random Draw from Gamma Distribution Re: Monte Carlo Simulation in R...
On Nov 10, 2009, at 2:26 PM, Hongwei Dong wrote: Exactly! Thanks, Duncan. Let me re-phrase me question like this: 1) X_i values are independent Gammas, with the shape 0.067 and scale 0.008 2) Min(X)=1 and Max(X)=85 You might want to check that your parameterization in in agreement with that used by the rgamma function. Simply using those numbers yields a distribution that does not look as though it would get many qualifying samples. Here are 20 draws without any exclusions outside a range: rgamma(20, shape=0.067, scale = 0.008) [1] 2.213459e-03 2.815705e-05 2.381306e-04 2.264602e-07 1.293713e-07 7.680773e-38 6.441082e-15 6.168961e-13 [9] 5.089033e-06 1.571858e-16 9.869878e-12 1.813121e-13 1.253287e-11 1.852885e-04 4.212802e-07 1.774495e-25 [17] 1.892984e-07 5.927422e-17 1.322638e-12 4.327472e-05 http://finzi.psych.upenn.edu/R/Rhelp02/archive/31459.html 3) SUM(X)=2000 4) Do I also have to define the number of draws? if yes, it could be 250. Based on these restrictions, I want to generate random draw. I'm wondering how I can do this in R. Thanks. Garry On Tue, Nov 10, 2009 at 11:17 AM, Duncan Murdoch murd...@stats.uwo.cawrote: On 11/10/2009 1:25 PM, Hongwei Dong wrote: Hi, Dear R users, I'm wondering if I can do Monte Carlo Simulation in R. My problem is like this: I know variable X follows Gamma distribution with shape parameter 0.067 and scale parameter 0.008. The sum of the X is 2000. I need R help me to simulate a vector of X that satisfies both the probability distribution and the sum. Anyone has a clue to this? Much appreciated. Your requirements are slightly contradictory or incomplete. Here's one way to fully specify the problem: The X_i values are independent Gammas, with the given shape and scale. You want to simulate from the joint distribution conditional on the event sum(X) == 2000. Is that your problem? I don't know how to do the simulation, but maybe someone else does. Duncan Murdoch [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Generate Random Draw from Gamma Distribution Re: Monte Carlo Simulation in R...
Thanks. I tried the rgamma function too. But I'm still wondering how I can set the min, max, and sum of the variates created by the random draws. Anyone has a clue? Thanks. Garry On Tue, Nov 10, 2009 at 11:47 AM, David Winsemius dwinsem...@comcast.netwrote: On Nov 10, 2009, at 2:26 PM, Hongwei Dong wrote: Exactly! Thanks, Duncan. Let me re-phrase me question like this: 1) X_i values are independent Gammas, with the shape 0.067 and scale 0.008 2) Min(X)=1 and Max(X)=85 You might want to check that your parameterization in in agreement with that used by the rgamma function. Simply using those numbers yields a distribution that does not look as though it would get many qualifying samples. Here are 20 draws without any exclusions outside a range: rgamma(20, shape=0.067, scale = 0.008) [1] 2.213459e-03 2.815705e-05 2.381306e-04 2.264602e-07 1.293713e-07 7.680773e-38 6.441082e-15 6.168961e-13 [9] 5.089033e-06 1.571858e-16 9.869878e-12 1.813121e-13 1.253287e-11 1.852885e-04 4.212802e-07 1.774495e-25 [17] 1.892984e-07 5.927422e-17 1.322638e-12 4.327472e-05 http://finzi.psych.upenn.edu/R/Rhelp02/archive/31459.html 3) SUM(X)=2000 4) Do I also have to define the number of draws? if yes, it could be 250. Based on these restrictions, I want to generate random draw. I'm wondering how I can do this in R. Thanks. Garry On Tue, Nov 10, 2009 at 11:17 AM, Duncan Murdoch murd...@stats.uwo.ca wrote: On 11/10/2009 1:25 PM, Hongwei Dong wrote: Hi, Dear R users, I'm wondering if I can do Monte Carlo Simulation in R. My problem is like this: I know variable X follows Gamma distribution with shape parameter 0.067 and scale parameter 0.008. The sum of the X is 2000. I need R help me to simulate a vector of X that satisfies both the probability distribution and the sum. Anyone has a clue to this? Much appreciated. Your requirements are slightly contradictory or incomplete. Here's one way to fully specify the problem: The X_i values are independent Gammas, with the given shape and scale. You want to simulate from the joint distribution conditional on the event sum(X) == 2000. Is that your problem? I don't know how to do the simulation, but maybe someone else does. Duncan Murdoch [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Heritage Laboratories West Hartford, CT [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] do.call and timeSeries
Peter - I am using 2.8.1 on linux. When I use 2.10.0 on the pc, it works, so it must be a version issue. sessionInfo() R version 2.8.1 (2008-12-22) x86_64-pc-linux-gnu locale: LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.U TF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME= C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATI ON=C attached base packages: [1] stats graphics grDevices utils datasets methods base -Original Message- From: Peter Ehlers [mailto:ehl...@ucalgary.ca] Sent: Tuesday, November 10, 2009 2:38 PM To: Bierbryer, Andrew Cc: r-help@r-project.org Subject: Re: [R] do.call and timeSeries Bierbryer, Andrew wrote: Does anyone know why the following code hangs on the do.call, but works fine when I either comment out the require(timeSeries) or only do 2 levels of a for loop instead of 3? Thanks, Andrew Bierbryer require(timeSeries) num - 1 x.list - list() for ( i in 1:10 ) { for ( j in 1:20 ) { for ( k in 1:30 ) { x.list[[num]] - cbind(num,10) num - num + 1 } } } cat('calling do.call\n') x.df - do.call(rbind,x.list) cat('called do.call\n') This seems like a highly unusual way to generate x.df, but it works fine for me. No idea why this doesn't work for you. Here's my sessionInfo; what's yours? sessionInfo() R version 2.10.0 Patched (2009-11-02 r50295) i386-pc-mingw32 locale: [1] LC_COLLATE=English_Canada.1252 LC_CTYPE=English_Canada.1252 [3] LC_MONETARY=English_Canada.1252 LC_NUMERIC=C [5] LC_TIME=English_Canada.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] timeSeries_2100.84 timeDate_2100.86 -Peter Ehlers [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Formatted contingency tables with (%)
Quite often, I need those tables: x - sample(c(a, b, c), 40, rep=T) y - sample(c(X, Y), 40, rep=T) (tbl - table(x, y)) (z - as.factor(paste(as.vector(tbl), (, round(prop.table(as.vector(tbl)) * 100, 1), %), sep=))) matrix(as.factor(z), nrow=3, dimnames=dimnames(tbl)) But the result looks ugly and is not copypaste-able for LaTeX verbatim or table environment, moreover, the \ is not what I want in the printout. How to achieve: y x X Y a 3 (7.5%) 7 (17.5%) b 9 (22.5%) 5 (12.5%) c 6 (15.0%) 10 (25.0%) Thank you for help or hints. Sören __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Automating Plot Commands using a Loop
Hello, Thank you very much. Your string makes perfect sense to me, but I get an error when I try this: Data-read.csv(Datacull.txt,header=T,row.names=1) TData-t(Data) PlotFunction-function (x) { par(mfrow=c(3,6)) for (i in colnames(x)) { plot(x[[i]],type=o,axes=F,xlab='',ylab='',ylim=c(0,2),main=i) axis(1,at=1:6,lab=c(A,B,C,D,E,F)) axis(2,at=seq(0,2,.25)) }} PlotFunction(TData) I get an error: Error in x[[i]] : subscript out of bounds If I try to hard code it without using a function for (i in colnames(TData)) { plot(TData[[i]],type=o,axes=F,xlab='',ylab='',ylim=c(0,2),main=i) axis(1,at=1:6,lab=c(A,B,C,D,E,F)) axis(2,at=seq(0,2,.25)) }} I get the similar error: Error in TData[[i]] : subscript out of bounds What does this mean? jholtman wrote: try this: Data - read.table(textConnection( Alpha Beta Gamma Delta A .1 .2 .3.4 B .2.3 .4.5 C .8 .9 .43 .13 D .13 .34 .34 .3), header=TRUE) closeAllConnections() par(mfrow=c(2,2)) for (i in colnames(Data)){ plot(Data[[i]],type=o,axes=F,xlab='', ylab='', ylim=c(0,1), main=i) axis(1,at=1:4,lab=c(A,B,C,D)) axis(2,at=seq(0, 1, .25)) # changed your axis to work } On Mon, Nov 9, 2009 at 3:38 PM, Koraelus korae...@yahoo.com wrote: I have three matrices with the same row and column names, but different data. e.g. Data Alpha Beta Gamma Delta A .1 .2 .3 .4 B .2 .3 .4 .5 C .8 .9 .43 .13 D .13 .34 .34 .3 For each column, I would like to create a separate plot on a single window. I currently am using the cmd-line: windows() par(mfrow=c(2,2)) plot(Data$Alpha,type=o,axes=F,ann=F,ylim=c(0,1)) axis(1,at=1:4,lab=c(A,B,C,D) axis(2,at=.25*0:1) plot(Data$Beta,type=o,axes=F,ann=F,ylim=c(0,1)) axis(1,at=1:4,lab=c(A,B,C,D) axis(2,at=.25*0:1) etc. I would like to automate this as much as possible. I tried to write a function, but clearly it would involve some sort of loop... I don't know how to do anything like that. Thanks, koraelus -- View this message in context: http://old.nabble.com/Automating-Plot-Commands-using-a-Loop-tp26273268p26273268.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://old.nabble.com/Automating-Plot-Commands-using-a-Loop-tp26273268p26289078.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Generate Random Draw from Gamma Distribution Re: Monte Carlo Simulation in R...
I think he means rate = 0.008, so he is looking for: rgamma(n, shape=0.067, rate=0.008) Even then his problem is not well-posed. You cannot have both independent gamma rv's and have them sum to 2000. Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: rvarad...@jhmi.edu Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.h tml -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of David Winsemius Sent: Tuesday, November 10, 2009 2:47 PM To: Hongwei Dong Cc: R-help Forum; Duncan Murdoch Subject: Re: [R] Generate Random Draw from Gamma Distribution Re: Monte Carlo Simulation in R... On Nov 10, 2009, at 2:26 PM, Hongwei Dong wrote: Exactly! Thanks, Duncan. Let me re-phrase me question like this: 1) X_i values are independent Gammas, with the shape 0.067 and scale 0.008 2) Min(X)=1 and Max(X)=85 You might want to check that your parameterization in in agreement with that used by the rgamma function. Simply using those numbers yields a distribution that does not look as though it would get many qualifying samples. Here are 20 draws without any exclusions outside a range: rgamma(20, shape=0.067, scale = 0.008) [1] 2.213459e-03 2.815705e-05 2.381306e-04 2.264602e-07 1.293713e-07 7.680773e-38 6.441082e-15 6.168961e-13 [9] 5.089033e-06 1.571858e-16 9.869878e-12 1.813121e-13 1.253287e-11 1.852885e-04 4.212802e-07 1.774495e-25 [17] 1.892984e-07 5.927422e-17 1.322638e-12 4.327472e-05 http://finzi.psych.upenn.edu/R/Rhelp02/archive/31459.html 3) SUM(X)=2000 4) Do I also have to define the number of draws? if yes, it could be 250. Based on these restrictions, I want to generate random draw. I'm wondering how I can do this in R. Thanks. Garry On Tue, Nov 10, 2009 at 11:17 AM, Duncan Murdoch murd...@stats.uwo.cawrote: On 11/10/2009 1:25 PM, Hongwei Dong wrote: Hi, Dear R users, I'm wondering if I can do Monte Carlo Simulation in R. My problem is like this: I know variable X follows Gamma distribution with shape parameter 0.067 and scale parameter 0.008. The sum of the X is 2000. I need R help me to simulate a vector of X that satisfies both the probability distribution and the sum. Anyone has a clue to this? Much appreciated. Your requirements are slightly contradictory or incomplete. Here's one way to fully specify the problem: The X_i values are independent Gammas, with the given shape and scale. You want to simulate from the joint distribution conditional on the event sum(X) == 2000. Is that your problem? I don't know how to do the simulation, but maybe someone else does. Duncan Murdoch [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Generate Random Draw from Gamma Distribution Re: Monte Carlo Simulation in R...
May be you are interested in the first `n' for which the sum of iid gamma rvs exceeds 2000, subject to the min-max constraints on each rv. If so, the following one-liner will give it to you: which(cumsum(pmax(1, pmin(rgamma(500, shape=0.067, rate=0.008), 85))) 2000)[1] Note that I have used a slightly inefficient code (a somewhat generous N of 500), but this permits economy of code (on-liner!). Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: rvarad...@jhmi.edu Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.h tml -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Hongwei Dong Sent: Tuesday, November 10, 2009 2:57 PM To: David Winsemius Cc: R-help Forum Subject: Re: [R] Generate Random Draw from Gamma Distribution Re: Monte Carlo Simulation in R... Thanks. I tried the rgamma function too. But I'm still wondering how I can set the min, max, and sum of the variates created by the random draws. Anyone has a clue? Thanks. Garry On Tue, Nov 10, 2009 at 11:47 AM, David Winsemius dwinsem...@comcast.netwrote: On Nov 10, 2009, at 2:26 PM, Hongwei Dong wrote: Exactly! Thanks, Duncan. Let me re-phrase me question like this: 1) X_i values are independent Gammas, with the shape 0.067 and scale 0.008 2) Min(X)=1 and Max(X)=85 You might want to check that your parameterization in in agreement with that used by the rgamma function. Simply using those numbers yields a distribution that does not look as though it would get many qualifying samples. Here are 20 draws without any exclusions outside a range: rgamma(20, shape=0.067, scale = 0.008) [1] 2.213459e-03 2.815705e-05 2.381306e-04 2.264602e-07 1.293713e-07 7.680773e-38 6.441082e-15 6.168961e-13 [9] 5.089033e-06 1.571858e-16 9.869878e-12 1.813121e-13 1.253287e-11 1.852885e-04 4.212802e-07 1.774495e-25 [17] 1.892984e-07 5.927422e-17 1.322638e-12 4.327472e-05 http://finzi.psych.upenn.edu/R/Rhelp02/archive/31459.html 3) SUM(X)=2000 4) Do I also have to define the number of draws? if yes, it could be 250. Based on these restrictions, I want to generate random draw. I'm wondering how I can do this in R. Thanks. Garry On Tue, Nov 10, 2009 at 11:17 AM, Duncan Murdoch murd...@stats.uwo.ca wrote: On 11/10/2009 1:25 PM, Hongwei Dong wrote: Hi, Dear R users, I'm wondering if I can do Monte Carlo Simulation in R. My problem is like this: I know variable X follows Gamma distribution with shape parameter 0.067 and scale parameter 0.008. The sum of the X is 2000. I need R help me to simulate a vector of X that satisfies both the probability distribution and the sum. Anyone has a clue to this? Much appreciated. Your requirements are slightly contradictory or incomplete. Here's one way to fully specify the problem: The X_i values are independent Gammas, with the given shape and scale. You want to simulate from the joint distribution conditional on the event sum(X) == 2000. Is that your problem? I don't know how to do the simulation, but maybe someone else does. Duncan Murdoch [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Heritage Laboratories West Hartford, CT [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Formatted contingency tables with (%)
On Nov 10, 2009, at 3:07 PM, soeren.vo...@eawag.ch wrote: Quite often, I need those tables: x - sample(c(a, b, c), 40, rep=T) y - sample(c(X, Y), 40, rep=T) (tbl - table(x, y)) (z - as.factor(paste(as.vector(tbl), (, round(prop.table(as.vector(tbl)) * 100, 1), %), sep=))) matrix(as.factor(z), nrow=3, dimnames=dimnames(tbl)) But the result looks ugly and is not copypaste-able for LaTeX verbatim or table environment, moreover, the \ is not what I want in the printout. How to achieve: y x X Y a 3 (7.5%) 7 (17.5%) b 9 (22.5%) 5 (12.5%) c 6 (15.0%) 10 (25.0%) Thank you for help or hints. In addition to my other thought: library(xtable) xtable(tbl) % latex table generated in R 2.10.0 by xtable 1.5-5 package % Tue Nov 10 15:44:14 2009 \begin{table}[ht] \begin{center} \begin{tabular}{rrr} \hline X Y \\ \hline a6 10 \\ b 124 \\ c35 \\ \hline \end{tabular} \end{center} \end{table} Sören __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] gsub does not support \b?
Hello, can someone help? How come gsub(\bINDS\b,INDUSTRIES,ADVANCED ENERGY INDS) [1] ADVANCED ENERGY INDS not ADVANCED ENERGY INDUSTRIES Thanks. Richard [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] creating multiple plots using a splitting factor
Thank you for the responses. The Lattice library is indeed useful for producing the graphs in which I am interested, and I appreciate the clarification between a function and the result of a function. Ideally, I would like to be able to page through the graphs rather than (or in addition to) having them displayed on the same page. Is there a way to do this? Danielle B. Johnston, Habitat Researcher Colorado Division of Wildlife -Original Message- From: Phil Spector [mailto:spec...@stat.berkeley.edu] Sent: Tuesday, November 10, 2009 12:46 AM To: Johnston, Danielle Subject: Re: [R] creating multiple plots using a splitting factor Danielle - What you meant to say was lapply(splitlist, function(x)hist(x$distance_cm, breaks=10)) but you might also be interested in trying library(lattice) histogram(~distance_cm|site,data=seeddist2) - Phil Spector Statistical Computing Facility Department of Statistics UC Berkeley spec...@stat.berkeley.edu On Mon, 9 Nov 2009, Johnston, Danielle wrote: Hello, I am new to R. I often collect data at multiple sites and need to create separate graphs (such as scatterplots or histograms) of specific variables for each site. I have tried to do this by splitting the data frame and then using lapply, but it seems that the graphing commands cannot be called as functions. Here is a sample of my data, called seeddist2: siteDaysSinceRelease distance_cm 10 GVM 1 17.8 11 GVM 1 17.8 12 GVM 1 14.0 13 GVM 1 14.0 14 GVM 1 14.9 15 GVM 1 25.4 16 WRR 1 25.4 17 WRR 1 35.0 18 WRR 1 45.0 19 WRR 1 55.0 20 WRR 1 60.0 Here is what I tried to get separate histograms of distance_cm by site: splitlist- split(seeddist2, site) lapply(splitlist, hist(distance_cm, breaks=10)) I then get an error message saying that match.fun didn't find the function. Is there another way to produce multiple graphs at once? Thank you, Danielle B. Johnston, Habitat Researcher Colorado Division of Wildlife [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R process gets killed spontaneously
This was happening to me on Red Hat Linux when I was running huge jobs within a screen session. By any chance are your R processes running within a screen session? (screen is a very nice program that will keep your sessions alive after you log out, but it was killing off my big memory jobs for unknown reasons.) Eric - Original message - From: Peng Yu pengyu...@gmail.com To: r-h...@stat.math.ethz.ch Date: Tue, 10 Nov 2009 10:23:18 -0600 Subject: Re: [R] R process gets killed spontaneously 2009/11/10 Uwe Ligges lig...@statistik.tu-dortmund.de: Peng Yu wrote: My R process has been killed for a few times, although the system administrator did not do so. It happened when R attempted to allocate a lot of memory. I'm wondering whether R would spontaneously kill itself if it can not allocate enough memory? It does not kill itself. If it was not some out-of-memory kill process of your OS, then you may have seen some segfault. We need reproducible code and all the details described in the posting guide. Please see this thread about the necessary information. http://www.mail-archive.com/r-help@r-project.org/msg74804.html __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] All possible combinations of functions within a function
Thank you Baptiste and Jim- I look forward to trying these ideas out when I have a chance. Mike baptiste auguie wrote: Hi, From what I understand of your code, you might find the following construct useful, funs - c(mean, sum, sd, diff) x - 1:10 lapply(funs, do.call, args=list(x)) and then working with lists rather than naming every object individually. You might find mapply useful too when you have to pass several parameters. HTH, baptiste 2009/11/10 bikemike42 ml...@tamu.edu: Dear All, I wrote a function for cluster analysis to compute cophenetic correlations between dissimilarity matrices (using the VEGAN library) and cluster analyses of every possible clustering algorithm (SEE ATTACHED) http://old.nabble.com/file/p26288610/cor.coef.R cor.coef.R . As it is now, it is extremely long, and for the future I was hoping to find a more efficient way of doing this sort of thing. To give you an outline of the function, first I create the dissimiarity matrices using all possible methods in the VEGAN command vegdist, then create the clusters using all possible algorithms in hclust and the dissimilarity matrices I crated, then create a table, and in one column, list all combinations, and in the other, compute and put the cophenetic correlation. Any help would be appreciated! I'm pretty new to writing my own functions but I see great time-saving potential. Thanks! Mike -- View this message in context: http://old.nabble.com/All-possible-combinations-of-functions-within-a-function-tp26288610p26288610.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Michael L. Treglia Graduate Student Wildlife and Fisheries Sciences Texas AM University Lab: (979)862-7245 Cell: (917)841-5603 ml...@tamu.edu http://people.tamu.edu/~mlt35 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] standardGeneric seems slow; any way to get around it?
Hi, I'm running some routines with standard matrix operations like solve() and diag(). When I do a profile, the lead item under total time is standardGeneric(). Furthermore, solve() and diag() have much greater total time than self time. ??? I assume there is some time-consuming decision going on in the usual functions; is there any way to avoid that and go straight to the calculaions? Thanks John Tillinghast [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R process gets killed spontaneously
I run R in gnome-terminal. Is it what you referred as 'screen session'? On Tue, Nov 10, 2009 at 3:01 PM, rmail...@justemail.net wrote: This was happening to me on Red Hat Linux when I was running huge jobs within a screen session. By any chance are your R processes running within a screen session? (screen is a very nice program that will keep your sessions alive after you log out, but it was killing off my big memory jobs for unknown reasons.) Eric - Original message - From: Peng Yu pengyu...@gmail.com To: r-h...@stat.math.ethz.ch Date: Tue, 10 Nov 2009 10:23:18 -0600 Subject: Re: [R] R process gets killed spontaneously 2009/11/10 Uwe Ligges lig...@statistik.tu-dortmund.de: Peng Yu wrote: My R process has been killed for a few times, although the system administrator did not do so. It happened when R attempted to allocate a lot of memory. I'm wondering whether R would spontaneously kill itself if it can not allocate enough memory? It does not kill itself. If it was not some out-of-memory kill process of your OS, then you may have seen some segfault. We need reproducible code and all the details described in the posting guide. Please see this thread about the necessary information. http://www.mail-archive.com/r-help@r-project.org/msg74804.html __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Formatted contingency tables with (%)
On Nov 10, 2009, at 3:07 PM, soeren.vo...@eawag.ch wrote: Quite often, I need those tables: x - sample(c(a, b, c), 40, rep=T) y - sample(c(X, Y), 40, rep=T) (tbl - table(x, y)) (z - as.factor(paste(as.vector(tbl), (, round(prop.table(as.vector(tbl)) * 100, 1), %), sep=))) matrix(as.factor(z), nrow=3, dimnames=dimnames(tbl)) But the result looks ugly and is not copypaste-able for LaTeX verbatim or table environment, moreover, the \ is not what I want in the printout. How to achieve: y x X Y a 3 (7.5%) 7 (17.5%) b 9 (22.5%) 5 (12.5%) c 6 (15.0%) 10 (25.0%) Thank you for help or hints. library(gmodels) CrossTable(tbl, prop.c=F, prop.r=F, prop.chisq=F) # author, Marc Schwartz Cell Contents |-| | N | | N / Table Total | |-| Total Observations in Table: 40 | y x | X | Y | Row Total | -|---|---|---| a | 6 |10 |16 | | 0.150 | 0.250 | | -|---|---|---| b |12 | 4 |16 | | 0.300 | 0.100 | | -|---|---|---| c | 3 | 5 | 8 | | 0.075 | 0.125 | | -|---|---|---| Column Total |21 |19 |40 | -|---|---|---| -- David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] gsub does not support \b?
Ok, I figured it out. My stupid mistake, should be \\b instead of \b. From: Tan, Richard Sent: Tuesday, November 10, 2009 3:36 PM To: 'r-help@r-project.org' Subject: gsub does not support \b? Hello, can someone help? How come gsub(\bINDS\b,INDUSTRIES,ADVANCED ENERGY INDS) [1] ADVANCED ENERGY INDS not ADVANCED ENERGY INDUSTRIES Thanks. Richard [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] R echo code chunk runs off the page using Lyx and Sweave
I am not really sure where in the interactions this is handled, but I would like to keep echo-ed R code chunks from running past the right margin and off the page. I started with R and options(width=n), but this does not seem to do anything (in the context of a document -- line command works just fine). I have beating my head against different Lyx document settings without anything to show. Anyone point me in the right direction? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Generate Random Draw from Gamma Distribution Re: Monte Carlo Simulation in R...
Sorry for the confusion. Let me put it in this way. Here we have 2000 people and we want to put them into 150 groups. The distribution of the group size follows the Gamma distribution with shape parameter 0.067 and scale parameter 0.008. At the same time, the minimum group size is 1, and the largest one should not be bigger than 85. My questions: Can I generate a set of groups that follow the above rules by generating random draws? By the way, I also confused by the rate and scale parameters in R. I did the distribution test in SPSS and got those shape and scale parameters. In SPSS Q-Q plot, the scale parameter is 0.008. I noticed that someone mentioned that this is actually the rate. I'm confused. Thanks a lot. Garry On Tue, Nov 10, 2009 at 12:15 PM, Ravi Varadhan rvarad...@jhmi.edu wrote: I think he means rate = 0.008, so he is looking for: rgamma(n, shape=0.067, rate=0.008) Even then his problem is not well-posed. You cannot have both independent gamma rv's and have them sum to 2000. Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: rvarad...@jhmi.edu Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.h tml -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of David Winsemius Sent: Tuesday, November 10, 2009 2:47 PM To: Hongwei Dong Cc: R-help Forum; Duncan Murdoch Subject: Re: [R] Generate Random Draw from Gamma Distribution Re: Monte Carlo Simulation in R... On Nov 10, 2009, at 2:26 PM, Hongwei Dong wrote: Exactly! Thanks, Duncan. Let me re-phrase me question like this: 1) X_i values are independent Gammas, with the shape 0.067 and scale 0.008 2) Min(X)=1 and Max(X)=85 You might want to check that your parameterization in in agreement with that used by the rgamma function. Simply using those numbers yields a distribution that does not look as though it would get many qualifying samples. Here are 20 draws without any exclusions outside a range: rgamma(20, shape=0.067, scale = 0.008) [1] 2.213459e-03 2.815705e-05 2.381306e-04 2.264602e-07 1.293713e-07 7.680773e-38 6.441082e-15 6.168961e-13 [9] 5.089033e-06 1.571858e-16 9.869878e-12 1.813121e-13 1.253287e-11 1.852885e-04 4.212802e-07 1.774495e-25 [17] 1.892984e-07 5.927422e-17 1.322638e-12 4.327472e-05 http://finzi.psych.upenn.edu/R/Rhelp02/archive/31459.html 3) SUM(X)=2000 4) Do I also have to define the number of draws? if yes, it could be 250. Based on these restrictions, I want to generate random draw. I'm wondering how I can do this in R. Thanks. Garry On Tue, Nov 10, 2009 at 11:17 AM, Duncan Murdoch murd...@stats.uwo.cawrote: On 11/10/2009 1:25 PM, Hongwei Dong wrote: Hi, Dear R users, I'm wondering if I can do Monte Carlo Simulation in R. My problem is like this: I know variable X follows Gamma distribution with shape parameter 0.067 and scale parameter 0.008. The sum of the X is 2000. I need R help me to simulate a vector of X that satisfies both the probability distribution and the sum. Anyone has a clue to this? Much appreciated. Your requirements are slightly contradictory or incomplete. Here's one way to fully specify the problem: The X_i values are independent Gammas, with the given shape and scale. You want to simulate from the joint distribution conditional on the event sum(X) == 2000. Is that your problem? I don't know how to do the simulation, but maybe someone else does. Duncan Murdoch [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to suppress the output from stepAIC?
Hi Jack, try stepAIC with trace parameter: stepAIC(...,trace=FALSE) Regards, M Jack Luo jluo.rh...@gmail.com a écrit : Hi, I am now running a cross-validation using coxph coupled with stepAIC for model selection, is there anyway to suppress the output? It's too much. -Jack [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R process gets killed spontaneously
I think he was referring to the actual 'screen' command.. but, I digress... Is there any way you can put your commands in a script and execute them from the command line so that you see the actual memory and GC output from R in realtime? I use the following (in WinXP) to debug any faulty processes or detect if the job was actually completed successfully: (adapted for Linux, *not* winxp) R --verbose MorningStartup.r morningsummary-log.txt That should yield some more results which hopefully should help you investigate. The pertinent stuff is output to the console. -cedrick Peng Yu wrote: I run R in gnome-terminal. Is it what you referred as 'screen session'? On Tue, Nov 10, 2009 at 3:01 PM, rmail...@justemail.net wrote: This was happening to me on Red Hat Linux when I was running huge jobs within a screen session. By any chance are your R processes running within a screen session? (screen is a very nice program that will keep your sessions alive after you log out, but it was killing off my big memory jobs for unknown reasons.) Eric - Original message - From: Peng Yu pengyu...@gmail.com To: r-h...@stat.math.ethz.ch Date: Tue, 10 Nov 2009 10:23:18 -0600 Subject: Re: [R] R process gets killed spontaneously 2009/11/10 Uwe Ligges lig...@statistik.tu-dortmund.de: Peng Yu wrote: My R process has been killed for a few times, although the system administrator did not do so. It happened when R attempted to allocate a lot of memory. I'm wondering whether R would spontaneously kill itself if it can not allocate enough memory? It does not kill itself. If it was not some out-of-memory kill process of your OS, then you may have seen some segfault. We need reproducible code and all the details described in the posting guide. Please see this thread about the necessary information. http://www.mail-archive.com/r-help@r-project.org/msg74804.html __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Generate Random Draw from Gamma Distribution Re: Monte Carlo Simulation in R...
By the way, maybe the number of groups can be determined endogenously. It will be better if I do not have to set the total number of groups exogenously. Thanks Garry On Tue, Nov 10, 2009 at 1:29 PM, Hongwei Dong pdxd...@gmail.com wrote: Sorry for the confusion. Let me put it in this way. Here we have 2000 people and we want to put them into 150 groups. The distribution of the group size follows the Gamma distribution with shape parameter 0.067 and scale parameter 0.008. At the same time, the minimum group size is 1, and the largest one should not be bigger than 85. My questions: Can I generate a set of groups that follow the above rules by generating random draws? By the way, I also confused by the rate and scale parameters in R. I did the distribution test in SPSS and got those shape and scale parameters. In SPSS Q-Q plot, the scale parameter is 0.008. I noticed that someone mentioned that this is actually the rate. I'm confused. Thanks a lot. Garry On Tue, Nov 10, 2009 at 12:15 PM, Ravi Varadhan rvarad...@jhmi.eduwrote: I think he means rate = 0.008, so he is looking for: rgamma(n, shape=0.067, rate=0.008) Even then his problem is not well-posed. You cannot have both independent gamma rv's and have them sum to 2000. Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: rvarad...@jhmi.edu Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.h tmlhttp://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.html -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of David Winsemius Sent: Tuesday, November 10, 2009 2:47 PM To: Hongwei Dong Cc: R-help Forum; Duncan Murdoch Subject: Re: [R] Generate Random Draw from Gamma Distribution Re: Monte Carlo Simulation in R... On Nov 10, 2009, at 2:26 PM, Hongwei Dong wrote: Exactly! Thanks, Duncan. Let me re-phrase me question like this: 1) X_i values are independent Gammas, with the shape 0.067 and scale 0.008 2) Min(X)=1 and Max(X)=85 You might want to check that your parameterization in in agreement with that used by the rgamma function. Simply using those numbers yields a distribution that does not look as though it would get many qualifying samples. Here are 20 draws without any exclusions outside a range: rgamma(20, shape=0.067, scale = 0.008) [1] 2.213459e-03 2.815705e-05 2.381306e-04 2.264602e-07 1.293713e-07 7.680773e-38 6.441082e-15 6.168961e-13 [9] 5.089033e-06 1.571858e-16 9.869878e-12 1.813121e-13 1.253287e-11 1.852885e-04 4.212802e-07 1.774495e-25 [17] 1.892984e-07 5.927422e-17 1.322638e-12 4.327472e-05 http://finzi.psych.upenn.edu/R/Rhelp02/archive/31459.html 3) SUM(X)=2000 4) Do I also have to define the number of draws? if yes, it could be 250. Based on these restrictions, I want to generate random draw. I'm wondering how I can do this in R. Thanks. Garry On Tue, Nov 10, 2009 at 11:17 AM, Duncan Murdoch murd...@stats.uwo.cawrote: On 11/10/2009 1:25 PM, Hongwei Dong wrote: Hi, Dear R users, I'm wondering if I can do Monte Carlo Simulation in R. My problem is like this: I know variable X follows Gamma distribution with shape parameter 0.067 and scale parameter 0.008. The sum of the X is 2000. I need R help me to simulate a vector of X that satisfies both the probability distribution and the sum. Anyone has a clue to this? Much appreciated. Your requirements are slightly contradictory or incomplete. Here's one way to fully specify the problem: The X_i values are independent Gammas, with the given shape and scale. You want to simulate from the joint distribution conditional on the event sum(X) == 2000. Is that your problem? I don't know how to do the simulation, but maybe someone else does. Duncan Murdoch [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]]
Re: [R] Generate Random Draw from Gamma Distribution Re: Monte Carlo Simulation in R...
If the number of groups can be set endogenously my previous email about the smallest `n' would apply. You can view this as a waiting time problem. Here is one approach: x - round(pmax(1, pmin(rgamma(500, shape=0.067, rate=0.008), 85))) csx - cumsum(x) ind - which(csx 2000)[1] xg - c(x[1:(ind-1)], 2000 - csx[ind-1]) # group sizes sum(xg) length(xg) # number of groups However, note that the elements of `xg' are not gamma variables (due to min-max constraint), and they are all not independent (due to sum constraint). Hope this helps, Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: rvarad...@jhmi.edu Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.h tml -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Hongwei Dong Sent: Tuesday, November 10, 2009 4:44 PM To: R-help Forum Subject: Re: [R] Generate Random Draw from Gamma Distribution Re: Monte Carlo Simulation in R... By the way, maybe the number of groups can be determined endogenously. It will be better if I do not have to set the total number of groups exogenously. Thanks Garry On Tue, Nov 10, 2009 at 1:29 PM, Hongwei Dong pdxd...@gmail.com wrote: Sorry for the confusion. Let me put it in this way. Here we have 2000 people and we want to put them into 150 groups. The distribution of the group size follows the Gamma distribution with shape parameter 0.067 and scale parameter 0.008. At the same time, the minimum group size is 1, and the largest one should not be bigger than 85. My questions: Can I generate a set of groups that follow the above rules by generating random draws? By the way, I also confused by the rate and scale parameters in R. I did the distribution test in SPSS and got those shape and scale parameters. In SPSS Q-Q plot, the scale parameter is 0.008. I noticed that someone mentioned that this is actually the rate. I'm confused. Thanks a lot. Garry On Tue, Nov 10, 2009 at 12:15 PM, Ravi Varadhan rvarad...@jhmi.eduwrote: I think he means rate = 0.008, so he is looking for: rgamma(n, shape=0.067, rate=0.008) Even then his problem is not well-posed. You cannot have both independent gamma rv's and have them sum to 2000. Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: rvarad...@jhmi.edu Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.h tmlhttp://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadh an.html -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of David Winsemius Sent: Tuesday, November 10, 2009 2:47 PM To: Hongwei Dong Cc: R-help Forum; Duncan Murdoch Subject: Re: [R] Generate Random Draw from Gamma Distribution Re: Monte Carlo Simulation in R... On Nov 10, 2009, at 2:26 PM, Hongwei Dong wrote: Exactly! Thanks, Duncan. Let me re-phrase me question like this: 1) X_i values are independent Gammas, with the shape 0.067 and scale 0.008 2) Min(X)=1 and Max(X)=85 You might want to check that your parameterization in in agreement with that used by the rgamma function. Simply using those numbers yields a distribution that does not look as though it would get many qualifying samples. Here are 20 draws without any exclusions outside a range: rgamma(20, shape=0.067, scale = 0.008) [1] 2.213459e-03 2.815705e-05 2.381306e-04 2.264602e-07 1.293713e-07 7.680773e-38 6.441082e-15 6.168961e-13 [9] 5.089033e-06 1.571858e-16 9.869878e-12 1.813121e-13 1.253287e-11 1.852885e-04 4.212802e-07 1.774495e-25 [17] 1.892984e-07 5.927422e-17 1.322638e-12 4.327472e-05 http://finzi.psych.upenn.edu/R/Rhelp02/archive/31459.html 3) SUM(X)=2000 4) Do I also have to define the number of draws? if yes, it could be 250. Based on these restrictions, I want to generate random draw. I'm wondering how I can do this in R. Thanks. Garry On Tue, Nov 10, 2009 at 11:17 AM, Duncan Murdoch murd...@stats.uwo.cawrote: On 11/10/2009 1:25 PM, Hongwei Dong wrote: Hi, Dear R users, I'm wondering if I can do Monte Carlo Simulation in R. My problem is like this: I know variable X follows Gamma distribution with shape parameter 0.067 and scale parameter 0.008. The sum of
Re: [R] Generate Random Draw from Gamma Distribution Re: Monte Carlo Simulation in R...
On Nov 10, 2009, at 4:29 PM, Hongwei Dong wrote: Sorry for the confusion. Let me put it in this way. Here we have 2000 people and we want to put them into 150 groups. The distribution of the group size follows the Gamma distribution with shape parameter 0.067 and scale parameter 0.008. At the same time, the minimum group size is 1, and the largest one should not be bigger than 85. My questions: Can I generate a set of groups that follow the above rules by generating random draws? By the way, I also confused by the rate and scale parameters in R. I did the distribution test in SPSS and got those shape and scale parameters. In SPSS Q-Q plot, the scale parameter is 0.008. I noticed that someone mentioned that this is actually the rate. I'm confused. I think you may want your scale (from whatever dataset it was determined) to be 1/0.008 Read this message on an SPSS mailing list: http://www.listserv.uga.edu/cgi-bin/wa?A2=ind9902L=spssx-lD=0F=PP=7633 and then look at the parameterization for rgamma in the r help pages. I seem to remember that Terry Therneau has also pointed out this ambiguous scale parameter issue for gamma in the past. (This is, by the way, beginning to sound quite a bit like a homework problem.) -- David Thanks a lot. Garry On Tue, Nov 10, 2009 at 12:15 PM, Ravi Varadhan rvarad...@jhmi.edu wrote: I think he means rate = 0.008, so he is looking for: rgamma(n, shape=0.067, rate=0.008) Even then his problem is not well-posed. You cannot have both independent gamma rv's and have them sum to 2000. Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: rvarad...@jhmi.edu Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.h tml -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org ] On Behalf Of David Winsemius Sent: Tuesday, November 10, 2009 2:47 PM To: Hongwei Dong Cc: R-help Forum; Duncan Murdoch Subject: Re: [R] Generate Random Draw from Gamma Distribution Re: Monte Carlo Simulation in R... On Nov 10, 2009, at 2:26 PM, Hongwei Dong wrote: Exactly! Thanks, Duncan. Let me re-phrase me question like this: 1) X_i values are independent Gammas, with the shape 0.067 and scale 0.008 2) Min(X)=1 and Max(X)=85 You might want to check that your parameterization in in agreement with that used by the rgamma function. Simply using those numbers yields a distribution that does not look as though it would get many qualifying samples. Here are 20 draws without any exclusions outside a range: rgamma(20, shape=0.067, scale = 0.008) [1] 2.213459e-03 2.815705e-05 2.381306e-04 2.264602e-07 1.293713e-07 7.680773e-38 6.441082e-15 6.168961e-13 [9] 5.089033e-06 1.571858e-16 9.869878e-12 1.813121e-13 1.253287e-11 1.852885e-04 4.212802e-07 1.774495e-25 [17] 1.892984e-07 5.927422e-17 1.322638e-12 4.327472e-05 http://finzi.psych.upenn.edu/R/Rhelp02/archive/31459.html 3) SUM(X)=2000 4) Do I also have to define the number of draws? if yes, it could be 250. Based on these restrictions, I want to generate random draw. I'm wondering how I can do this in R. Thanks. Garry On Tue, Nov 10, 2009 at 11:17 AM, Duncan Murdoch murd...@stats.uwo.cawrote: On 11/10/2009 1:25 PM, Hongwei Dong wrote: Hi, Dear R users, I'm wondering if I can do Monte Carlo Simulation in R. My problem is like this: I know variable X follows Gamma distribution with shape parameter 0.067 and scale parameter 0.008. The sum of the X is 2000. I need R help me to simulate a vector of X that satisfies both the probability distribution and the sum. Anyone has a clue to this? Much appreciated. Your requirements are slightly contradictory or incomplete. Here's one way to fully specify the problem: The X_i values are independent Gammas, with the given shape and scale. You want to simulate from the joint distribution conditional on the event sum(X) == 2000. Is that your problem? I don't know how to do the simulation, but maybe someone else does. Duncan Murdoch [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list
Re: [R] R echo code chunk runs off the page using Lyx and Sweave
options(width=n) is supposed to work, and does for me. I don't use Lyx though... -Ista On Tue, Nov 10, 2009 at 4:27 PM, Mark Connolly mark_conno...@acm.org wrote: I am not really sure where in the interactions this is handled, but I would like to keep echo-ed R code chunks from running past the right margin and off the page. I started with R and options(width=n), but this does not seem to do anything (in the context of a document -- line command works just fine). I have beating my head against different Lyx document settings without anything to show. Anyone point me in the right direction? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Ista Zahn Graduate student University of Rochester Department of Clinical and Social Psychology http://yourpsyche.org __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R echo code chunk runs off the page using Lyx and Sweave
Ista Zahn wrote: options(width=n) is supposed to work, and does for me. I don't use Lyx though... -Ista On Tue, Nov 10, 2009 at 4:27 PM, Mark Connolly mark_conno...@acm.org wrote: I am not really sure where in the interactions this is handled, but I would like to keep echo-ed R code chunks from running past the right margin and off the page. I started with R and options(width=n), but this does not seem to do anything (in the context of a document -- line command works just fine). I have beating my head against different Lyx document settings without anything to show. Anyone point me in the right direction? \SweaveOpts{keep.source=TRUE} in your LaTeX code will (I think) keep whatever manual formatting you do, in all code chunks (or use keep.source=TRUE) for particular code chunks of concern -- View this message in context: http://old.nabble.com/R-echo-code-chunk-runs-off-the-page-using-Lyx-and-Sweave-tp26291535p26292226.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Generate Random Draw from Gamma Distribution Re: Monte Carlo Simulation in R...
On Nov 10, 2009, at 5:00 PM, David Winsemius wrote: On Nov 10, 2009, at 4:29 PM, Hongwei Dong wrote: Sorry for the confusion. Let me put it in this way. Here we have 2000 people and we want to put them into 150 groups. The distribution of the group size follows the Gamma distribution with shape parameter 0.067 and scale parameter 0.008. At the same time, the minimum group size is 1, and the largest one should not be bigger than 85. My questions: Can I generate a set of groups that follow the above rules by generating random draws? By the way, I also confused by the rate and scale parameters in R. I did the distribution test in SPSS and got those shape and scale parameters. In SPSS Q-Q plot, the scale parameter is 0.008. I noticed that someone mentioned that this is actually the rate. I'm confused. I think you may want your scale (from whatever dataset it was determined) to be 1/0.008 Note that Ravi Varadhan and I may not be not disagreeing since: rgamma function (n, shape, rate = 1, scale = 1/rate) snipped You are fairly far out in the right tail of that distribution: round(rgamma(200, shape=0.067, scale = 1/0.008)) [1] 12 0 0 0 20 0 9 0 54 0 0 0 2 8 49 0 26 0 0 0 0 2 0 0 0 0 [27] 0 0 3 0 0 1 0 0 0 11 0 0 1 256 0 3 0 9 0 0 0 0 0 1 0 0 [53] 0 0 0 0 1 0 0 0 2 1 0 0 0 0 13 0 0 0 0 0 0 0 152 157 0 3 [79] 4 0 0 0 0 26 0 34 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 [105] 4 0 0 0 0 0 0 0 2 0 1 0 5 2 0 18 0 0 0 0 0 0 19 0 190 0 [131] 0 0 2 0 0 0 0 83 0 0 0 0 0 4 0 0 0 0 0 20 0 9 1 1 0 0 [157] 0 4 16 0 28 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 23 0 0 [183] 0 0 0 0 6 1 1 0 1 0 0 0 1 0 0 0 0 0 I'm wondering if the Gamma distribution is the correct distribution to be fitting? Is there some sort of -- David Read this message on an SPSS mailing list: http://www.listserv.uga.edu/cgi-bin/wa?A2=ind9902L=spssx-lD=0F=PP=7633 and then look at the parameterization for rgamma in the r help pages. I seem to remember that Terry Therneau has also pointed out this ambiguous scale parameter issue for gamma in the past. (This is, by the way, beginning to sound quite a bit like a homework problem.) -- David Thanks a lot. Garry On Tue, Nov 10, 2009 at 12:15 PM, Ravi Varadhan rvarad...@jhmi.edu wrote: I think he means rate = 0.008, so he is looking for: rgamma(n, shape=0.067, rate=0.008) Even then his problem is not well-posed. You cannot have both independent gamma rv's and have them sum to 2000. Ravi. --- Ravi Varadhan, Ph.D. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org ] On Behalf Of David Winsemius Sent: Tuesday, November 10, 2009 2:47 PM To: Hongwei Dong Cc: R-help Forum; Duncan Murdoch Subject: Re: [R] Generate Random Draw from Gamma Distribution Re: Monte Carlo Simulation in R... On Nov 10, 2009, at 2:26 PM, Hongwei Dong wrote: Exactly! Thanks, Duncan. Let me re-phrase me question like this: 1) X_i values are independent Gammas, with the shape 0.067 and scale 0.008 2) Min(X)=1 and Max(X)=85 You might want to check that your parameterization in in agreement with that used by the rgamma function. Simply using those numbers yields a distribution that does not look as though it would get many qualifying samples. Here are 20 draws without any exclusions outside a range: rgamma(20, shape=0.067, scale = 0.008) [1] 2.213459e-03 2.815705e-05 2.381306e-04 2.264602e-07 1.293713e-07 7.680773e-38 6.441082e-15 6.168961e-13 [9] 5.089033e-06 1.571858e-16 9.869878e-12 1.813121e-13 1.253287e-11 1.852885e-04 4.212802e-07 1.774495e-25 [17] 1.892984e-07 5.927422e-17 1.322638e-12 4.327472e-05 http://finzi.psych.upenn.edu/R/Rhelp02/archive/31459.html 3) SUM(X)=2000 4) Do I also have to define the number of draws? if yes, it could be 250. Based on these restrictions, I want to generate random draw. I'm wondering how I can do this in R. Thanks. Garry On Tue, Nov 10, 2009 at 11:17 AM, Duncan Murdoch murd...@stats.uwo.cawrote: On 11/10/2009 1:25 PM, Hongwei Dong wrote: Hi, Dear R users, I'm wondering if I can do Monte Carlo Simulation in R. My problem is like this: I know variable X follows Gamma distribution with shape parameter 0.067 and scale parameter 0.008. The sum of the X is 2000. I need R help me to simulate a vector of X that satisfies both the probability distribution and the sum. Anyone has a clue to this? Much appreciated. Your requirements are slightly contradictory or incomplete. Here's one way to fully specify the problem: The X_i values are
Re: [R] Titles on panel graphs created in zoo
Please read the last line on every message to r-help and particularly note the requirement to provide reproducible code. We don't have your data so its not reproducible. Also please clarify what you want where. On Tue, Nov 10, 2009 at 10:46 AM, MichelleJ m.jack...@qmul.ac.uk wrote: I have a plotted a stacked panel graph (single x axis and multiple y axis) using the package zoo and would like to add a title for each separate panel. I am using the script: z - with(mydata,zoo(cbind(mydata$Water.level,mydata$Submerged.plants, mydata$Crayfish.CPUE,mydata$Carp.CPUE),Year)) plot(z,type=b,pch=16,lty=2,xlab=Year,ylab=c(Metres,Realtive density, CPUE,CPUE),main=) Any help would be much appreciated. M -- View this message in context: http://old.nabble.com/Titles-on-panel-graphs-created-in-zoo-tp26285713p26285713.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] qplot error
When I invoke qplot, I get the following error: Error in rename.default(x, .base_to_ggplot) : object '.data' not found I would appreciate any advice. sessionInfo: R version 2.10.0 (2009-10-26) i386-pc-mingw32 locale: [1] LC_COLLATE=English_United States.1252 [2] LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 [4] LC_NUMERIC=C [5] LC_TIME=English_United States.1252 attached base packages: [1] grid grDevices datasets splines graphics stats [7] tcltk utils methods base other attached packages: [1] epicalc_2.10.0.0 ggplot2_0.8.3 reshape_0.8.3 [4] plyr_0.1.9 proto_0.3-8MCMCpack_1.0-4 [7] coda_0.13-4polycor_0.7-7 sfsmisc_1.0-8 [10] mvtnorm_0.9-8 xtable_1.5-6 prettyR_1.5 [13] lme4_0.999375-32 Matrix_0.999375-31 effects_2.0-10 [16] colorspace_1.0-1 nnet_7.3-1 mvnormtest_0.1-7 [19] gmodels_2.15.0 gtools_2.6.1 latticeExtra_0.6-4 [22] lattice_0.17-26RColorBrewer_1.0-2 doBy_4.0.5 [25] foreign_0.8-38 rms_2.1-0 e1071_1.5-20 [28] class_7.3-1car_1.2-16 mitools_2.0 [31] MASS_7.3-3 svSocket_0.9-48TinnR_1.0.3 [34] R2HTML_1.59-1 Hmisc_3.7-0survival_2.35-7 loaded via a namespace (and not attached): [1] cluster_1.12.1 gdata_2.6.1svMisc_0.9-56 tools_2.10.0 Respectfully, Frank Lawrence __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.