[R] Plotting time data for various countries in same graph
Hi,

I have the following kind of data:

    Time    Country  Values
    2010Q1  India     5
    2010Q2  India     7
    2010Q3  India     5
    2010Q4  India     9
    2010Q1  China    10
    2010Q2  China     6
    2010Q3  China     9
    2010Q4  China    14

I need to plot a graph with time on the x-axis, Values on the y-axis, and two lines, one per country (India and China). I don't have much experience with graphics in R. I tried

    ggplot(data, aes(x=Time, y=Values, colour=Country))

but this does not help. Can anyone help me with this?

-- 
Anindya Sankar Dey

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] robustbase adjbox segfault - memory not mapped
On Mon, 4 Mar 2013 22:47:10 +0530, Baan <baanba...@gmail.com> wrote:

> Thank you Martin. Look forward to the fix.

Committed to the R-forge version of robustbase. It was a simple integer overflow, indeed, necessarily happening when the sample size was >= 2^16.5. I'm planning to submit robustbase_0.9-7 to CRAN today.

Martin

> Regards
> Baan
>
> On Monday 04 March 2013 10:19 PM, Martin Maechler wrote:
>> On Mon, 4 Mar 2013 15:02:02 +0530, Baan <baanba...@gmail.com> wrote:
>>> Hi, I encountered a segfault, "memory not mapped" error when using adjbox in robustbase. In trying to recreate the issue I found that the error occurs only for large sample size. Here is the code.
>>>
>>>     > require(robustbase)
>>>     Loading required package: robustbase
>>>     > x <- rnorm(10)
>>>     > y <- rep(1, 10)
>>>     > adjbox(x ~ y)   ## gives a plot
>>>     > x <- rnorm(1)
>>>     > y <- rep(1, 1)
>>>     > adjbox(x ~ y)   ## gives a plot
>>>     > x <- rnorm(10)
>>>     > y <- rep(1, 10)
>>>     > adjbox(x ~ y)
>>>
>>>      *** caught segfault ***
>>>     address 0xfffcc47af530, cause 'memory not mapped'
>>>
>>>     Traceback:
>>>      1: .C(mc_C, x, n, eps = eps, iter = c.iter, medc = double(1))
>>>      2: mcComp(x, doReflect, eps1 = eps1, eps2 = eps2, maxit = maxit, trace.lev = trace.lev)
>>>      3: mc.default(x, ..., na.rm = TRUE)
>>>      4: mc(x, ..., na.rm = TRUE)
>>>      5: adjboxStats(unclass(groups[[i]]), coef = range, doReflect = doReflect)
>>>      6: adjbox.default(split(mf[[response]], mf[-response]), ...)
>>>      7: adjbox(split(mf[[response]], mf[-response]), ...)
>>>      8: adjbox.formula(x ~ y)
>>>      9: adjbox(x ~ y)
>>
>> Indeed, I (as maintainer of robustbase) can reproduce the segfault *even* though you did not specify the random seed... So this should be fixed ... hopefully within a week or so, but I am not promising anything, given my busy schedule!
>>
>> Martin Maechler, ETH Zurich
>>
>>> My setup details:
>>>
>>>     R --version
>>>     R version 2.15.2 (2012-10-26) -- "Trick or Treat"
>>>
>>>     Package:          robustbase
>>>     Version:          0.9-5
>>>     Date:             2012-03-01
>>>     Packaged:         2013-03-01 16:34:03 UTC; maechler
>>>     NeedsCompilation: yes
>>>     Repository:       CRAN
>>>     Date/Publication: 2013-03-01 18:31:33
>>>     Built:            R 2.15.2; x86_64-pc-linux-gnu; 2013-03-04 05:54:20 UTC; unix
>>>     Platform:         x86_64-pc-linux-gnu (64-bit)
>>>
>>>     uname -a
>>>     Linux R 2.6.32-5-amd64 #1 SMP Mon Feb 25 00:26:11 UTC 2013 x86_64 GNU/Linux
>>>     Debian squeeze
>>>
>>> Could someone pls help.
>>> Regards
>>> Baan
[R] how to construct bivariate joint cumulative pdf from bivariate joint pdf
Hello,

I am using sm.density() to find the bivariate joint PDF of events. For example:

    x <- cbind(rnorm(30), rnorm(30))
    den <- sm.density(x)

I then take the joint PDF from den$estimate in order to construct the joint cumulative distribution. However, the sum of all the values in den$estimate is not equal to 1 (even after multiplying by the grid cell size). Could anyone help?

Thanks.
mc
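A minimal sketch of the bookkeeping involved, using a hand-rolled product Gaussian kernel estimate in base R rather than sm.density() itself. The bandwidths `h` and the grid limits are assumptions chosen for illustration. The point is that density values times the cell area sum to roughly 1 only when the grid extends far enough into the tails; a grid cut close to the data range would leave some mass out.

```r
set.seed(1)
x <- cbind(rnorm(30), rnorm(30))

h  <- c(0.5, 0.5)                      # assumed bandwidths, for illustration only
gx <- seq(-5, 5, length.out = 201)     # grid wide enough to cover the tails
gy <- seq(-5, 5, length.out = 201)

# Kernel contributions of every data point at every grid coordinate
kx <- outer(gx, x[, 1], function(g, xi) dnorm(g, mean = xi, sd = h[1]))
ky <- outer(gy, x[, 2], function(g, xi) dnorm(g, mean = xi, sd = h[2]))

# Product-kernel density estimate on the 201 x 201 grid
f <- kx %*% t(ky) / nrow(x)

# Area of one grid cell; the density must be multiplied by this before summing
cell  <- diff(gx)[1] * diff(gy)[1]
total <- sum(f) * cell

# Joint CDF on the grid: cumulative sums along both dimensions
cdf <- t(apply(apply(f, 2, cumsum), 1, cumsum)) * cell

total
cdf[length(gx), length(gy)]
```

The same two steps (multiply by the cell area, then accumulate along both axes) should carry over to a grid of density values returned by another estimator, provided the evaluation grid is regular.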
Re: [R] aov() and anova() making faulty F-tests
On Mar 6, 2013, at 03:56, Rolf Turner wrote:

> Your subject line is patent nonsense. The aov() and anova() functions have been around for decades. If they were doing something wrong it would have been noticed long since. You should realize that the fault is in your understanding, not in these functions.
>
> I cannot really follow your convoluted and messy code, but it would appear that you want to consider M and I to be random effects.

Only M and M:I, AFAICT. And, yes, it is messy; in particular, I refuse to believe that y~M*I has generated output with lowercase m and i!

> Where have you informed aov() as to the presence of these random effects?

To be specific, try y ~ I + Error(M + M:I).

> Without the random effects, aov() is just telling you that there is a highly significant interaction between M and I, and beyond that, no sensible comparisons can be made.
>
> cheers,
> Rolf Turner
>
> On 03/06/2013 03:36 PM, PatGauthier wrote:
>> Dear useRs,
>>
>> I've just encountered a serious problem involving the F-test being carried out in aov() and anova(). In the provided example, aov() is not making the correct F-test for a hypothesis involving the expected mean square (EMS) of a factor divided by the EMS of another factor (i.e., instead of the error EMS).
>>
>> Here is the example:
>>
>>     Source     Expected Mean Square      df
>>     M_i        σ² + 18σ²_M                1
>>     I_j        σ² + 6σ²_MI + 12Ф(I)       2
>>     MI_ij      σ² + 6σ²_MI                2
>>     ε_(ijk)l   σ²                        30
>>
>> The clear test for I_j is EMS(I) / EMS(MI) -> F(2,2)
>>
>> However, observe the following example carried out in R:
>>
>>     M <- rep(c("M1", "M2"), each = 18)
>>     I <- as.ordered(rep(rep(c(5,10,15), each = 6), 2))
>>     y <- c(44,39,48,40,43,41,27,20,25,21,28,22,35,30,29,34,31,38,
>>            12,7,6,11,7,12,15,10,12,17,11,13,22,15,27,22,21,19)
>>     dat <- data.frame(M, I, y)
>>     summary(aov(y~M*I, data = dat))
>>
>>                 Df Sum Sq Mean Sq F value   Pr(>F)
>>     m            1 3136.0  3136.0  295.85  < 2e-16 ***
>>     i            2  513.7   256.9   24.23 5.45e-07 ***
>>     m:i          2  969.5   484.7   45.73 7.77e-10 ***
>>     Residuals   30  318.0    10.6
>>     ---
>>
>> In this example aov has taken the F-ratio of MS(I) / MS(ε) -> F(2,30) = 24.23 with F-crit = qf(0.95,2,3) = 9.55 -> significant
>>
>> However, as stated above, the correct F-ratio is MS(I) / MS(MI) -> F(2,2) = 0.53 with F-crit = qf(0.95,2,2) = 19 -> non-significant
>>
>> Why is aov() miscalculating the F-ratio, and is there a way to fix this without prior knowledge of the appropriate test (e.g., EMS(I)/EMS(MI))?
>>
>> Thanks for your help,

-- 
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk Priv: pda...@gmail.com
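The suggestion in this thread can be sketched as follows, using the data posted above. The first part forms the ratio MS(I) / MS(M:I) by hand from the fixed-effects table; the second part passes the random terms to aov() through Error(), as suggested. The names `tab`, `f_I` and `p_I` are illustrative, and M is converted to a factor explicitly, which is an assumption made so that the code also runs on R versions where data.frame() no longer converts strings.

```r
M <- factor(rep(c("M1", "M2"), each = 18))
I <- as.ordered(rep(rep(c(5, 10, 15), each = 6), 2))
y <- c(44,39,48,40,43,41,27,20,25,21,28,22,35,30,29,34,31,38,
       12,7,6,11,7,12,15,10,12,17,11,13,22,15,27,22,21,19)
dat <- data.frame(M, I, y)

# Fixed-effects table: every term is tested against the residual mean square
tab <- anova(lm(y ~ M * I, data = dat))
print(tab)

# Manual test of I against the M:I mean square, i.e. F(2, 2)
f_I <- tab["I", "Mean Sq"] / tab["M:I", "Mean Sq"]
p_I <- pf(f_I, df1 = tab["I", "Df"], df2 = tab["M:I", "Df"], lower.tail = FALSE)
c(F = f_I, p = p_I)

# The same comparison obtained by declaring the error strata to aov()
summary(aov(y ~ I + Error(M + M:I), data = dat))
```

In the Error() fit, the line for I should appear in the M:I stratum, where it is compared with that stratum's residual rather than with the within-cell error.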
[R] lm Regression takes 24+ GB RAM - Error message
Hello,

I am a rather inexperienced R user (I learned the language one month ago) and have run into the following problem on a local computer with 6 cores, 24 GB RAM and 64-bit R 2.15. I didn't install any additional packages.

1. Via the read.table command I load a data table (with different data types) which is about 730 MB large
2. I add 2 calculated columns
3. I split the dataset by 5 criteria
4. I run the lm command on the split with the calculated columns as the variables

The RAM consumption goes up rapidly and stays at 24 GB for a couple of minutes. The result:

    Error: cannot allocate vector size of 5.0 Mb
    In addition: There were 50 or more warnings (use warnings() to see the first 50)
    -- Reached total allocation of 24559Mb

My code works perfectly fine for a smaller dataset. I am surprised by the errors, as the CPU should do all the work with the lm calculations and the output cannot be that large, can it? (I cannot check the object size of the lm object due to the error.)

Right now I am running only 1 linear model, but actually I wanted to run 6!

Is Windows putting some restrictions on R regarding RAM usage? Can I change any settings? A RAM upgrade is not an option. Do I need to use a different R package instead (bigmemory?)?

Thanks in advance for your help!!
[R] Understanding lm-based analysis of fractional factorial experiments
All,

I have just returned to R after a decade of absence, and it is good to see that R has become such a great success! I'm trying to bring Design of Experiments into some aspects of software performance evaluation, and to teach myself that, I picked up "Experiments: Planning, Analysis and Optimization" by Wu and Hamada. I am trying to reproduce an analysis in the book using lm, but have to conclude that I don't understand what lm does in this context, even though I end up at the desired result. I'm currently using R 2.15.2 on a recent Fedora system, but I get the same result on Debian Wheezy and Debian Squeeze. I think the discussion below can be followed without having the book at hand, though.

I'm working with tables 5.2 and 5.5 in the above-mentioned book. Table 5.2 contains data from the "Leaf spring experiment". The dataset is also in this zip file:
ftp://ftp.wiley.com/public/sci_tech_med/experiments-planning/data%20sets.zip

I've learned from the book that the effects can be found by fitting a linear model and doubling the coefficients. So, I do

    leaf <- read.table("/ifi/bifrost/a03/kjekje/fag/experimental-planning/book-datasets/LeafSpring table 5.2.dat",
                       col.names=c("B", "C", "D", "E", "Q", paste("r", 1:3, sep=""), "yavg", "ssq", "lnssq"))
    leaf.lm <- lm(yavg ~ B * C * D * E * Q, data=leaf)
    leaf.lm

    Call:
    lm(formula = yavg ~ B * C * D * E * Q, data = leaf)

    Coefficients:
       (Intercept)   7.54000
                B+   0.07003
                C+   0.32333
                D+  -0.09668
                E+   0.07668
                Q+  -0.33670
             B+:C+   0.01335
             B+:D+   0.11995
             C+:D+   0.02335
             B+:E+        NA
             C+:E+        NA
             D+:E+        NA
             B+:Q+   0.22915
             C+:Q+  -0.25745
             D+:Q+   0.28255
             E+:Q+   0.05415
          B+:C+:D+        NA
          B+:C+:E+        NA
          B+:D+:E+        NA
          C+:D+:E+        NA
          B+:C+:Q+   0.04160
          B+:D+:Q+  -0.16160
          C+:D+:Q+  -0.18840
          B+:E+:Q+        NA
          C+:E+:Q+        NA
          D+:E+:Q+        NA
       B+:C+:D+:E+        NA
       B+:C+:D+:Q+        NA
       B+:C+:E+:Q+        NA
       B+:D+:E+:Q+        NA
       C+:D+:E+:Q+        NA
    B+:C+:D+:E+:Q+        NA

(seems there is little I can do about the line breaks here, sorry)

However, the book (table 5.5) has 0.221 for the main effect of B and 0.176 for C, and the above is neither this, nor half of it.

Now, I can reproduce what's in the book with

    lm(yavg ~ B, data=leaf)

    Call:
    lm(formula = yavg ~ B, data = leaf)

    Coefficients:
    (Intercept)           B+
         7.5254       0.2213

    lm(yavg ~ C, data=leaf)

    Call:
    lm(formula = yavg ~ C, data = leaf)

    Coefficients:
    (Intercept)           C+
         7.5479       0.1763

This assumes lm does in fact double the coefficient in this case, but here the intercept varies, which doesn't seem correct, and I cannot find the interactions as easily in the same way.

Now, I try the effects() function, and get familiar numbers:

    effects(leaf.lm)

    (Intercept)  -30.54415
             B+   -0.44250
             C+    0.35250
             D+   -0.05750
             E+   -0.20750
             Q+   -0.51920
          B+:C+   -0.03415
          B+:D+   -0.03915
          C+:D+    0.07085
          B+:Q+   -0.16915
          C+:Q+    0.33085
          D+:Q+   -0.10755
          E+:Q+    0.05415
       B+:C+:Q+   -0.02080
       B+:D+:Q+    0.08080
       C+:D+:Q+   -0.09420

and indeed, I have verified that effects(leaf.lm)/2 gives me the expected result.

So, I have found the correct answer, but I don't understand why. I have read the documentation for effects() and looked through the relevant chapter in "Statistical Models in S", but all I got from that was a hint in the phrase "the effects are the uncorrelated single-degree-of-freedom", which is somewhat different from the coefficients. I can't make out from the book (Wu & Hamada) why the coefficients should be any different from the effects; to the contrary, it is quite clear from equation (5.8) in the book that the coefficients they use are effects(leaf.lm)/4.

So, there are at least two points of confusion here. One is how coef() differs from effects() in the case of fractional factorial experiments. The other is the factor 1/4 between the coefficients used by Wu & Hamada and the values returned by effects(), as I would think from the theory I've read that it should be a factor 2.

Best regards,

Kjetil
-- 
Kjetil Kjernsmo
PhD Research Fellow, University of Oslo, Norway
Semantic Web / SPARQL Query Federation
kje...@ifi.uio.no http://www.kjetil.kjernsmo.net/
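One likely contributor to the discrepancy between coef() and the book's effects is the coding of the factors. The sketch below uses a made-up 2x2 full factorial (not the leaf spring data) to compare the textbook factorial effect with the coefficients lm() returns under numeric -1/+1 coding and under R's default treatment contrasts. All names and response values are hypothetical.

```r
# Made-up 2x2 full factorial; A varies fastest
d <- expand.grid(A = c(-1, 1), B = c(-1, 1))
d$y <- c(10, 14, 11, 19)

# Factorial (main) effect of A: mean response at A = +1 minus mean at A = -1
effA <- mean(d$y[d$A == 1]) - mean(d$y[d$A == -1])

# Numeric -1/+1 coding: the coefficient is half the factorial effect
fit_pm  <- lm(y ~ A * B, data = d)
coef_pm <- unname(coef(fit_pm)["A"])

# Default treatment contrasts on factors: with the interaction in the model,
# the coefficient is the effect of A at the baseline level of B only
df <- transform(d,
                A = factor(A, labels = c("-", "+")),
                B = factor(B, labels = c("-", "+")))
fit_tr  <- lm(y ~ A * B, data = df)
coef_tr <- unname(coef(fit_tr)["A+"])

c(effect = effA, pm_coding = coef_pm, treatment = coef_tr)
```

Under this reading, doubling a coefficient recovers the factorial effect only for the -1/+1 coding. With treatment contrasts and interactions present, each main-effect coefficient is a simple effect at the baseline of the other factors, which would also explain why the one-factor fits above reproduce the book's numbers.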
Re: [R] Error message
Most likely either 'lower' or 'upper' is NA. Put

    options(error = recover)

in your script to stop on the error and examine the values. You need to learn debugging 101 to help yourself out.

On Mar 5, 2013, at 16:00, li li <hannah@gmail.com> wrote:

> Dear all,
> I got an error message when running the following code. Can anyone give any suggestions on fixing this type of error? Thank you very much in advance.
> Hanna
>
>     > integrand <- function(x, rho, a, b, z){
>     +   x1 <- x[1]
>     +   x2 <- x[2]
>     +   Sigma <- matrix(c(1, rho, rho, 1), 2, 2)
>     +   mu <- rep(0, 2)
>     +   f <- pmnorm(c((z-a*x1)/b, (z-a*x2)/b), mu, Sigma) * dmnorm(c(0,0), mu, diag(2))
>     +   f
>     + }
>     > adaptIntegrate(integrand, lower=rep(-Inf, 2), upper=c(2,2),
>     +                rho=0.1, a=0.6, b=0.3, z=3, maxEval=1)
>     Error in if (any(lower > upper)) stop("lower>upper integration limits") :
>       missing value where TRUE/FALSE needed
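The error message itself can be reproduced without any integration package, which may help when tracking it down. This is a small sketch with made-up limits; `check_limits` is a hypothetical helper, not part of any package.

```r
# An NA in either limit makes any(lower > upper) evaluate to NA,
# and if (NA) raises exactly the message reported above
lower <- c(-Inf, NA)
upper <- c(2, 2)

msg <- tryCatch(
  if (any(lower > upper)) "limits reversed" else "limits ok",
  error = function(e) conditionMessage(e)
)
msg

# A defensive check to run before calling the integrator
check_limits <- function(lower, upper) {
  if (anyNA(lower) || anyNA(upper)) return("NA in integration limits")
  if (any(lower > upper)) return("lower > upper")
  "ok"
}

check_limits(lower, upper)
check_limits(c(-Inf, -Inf), c(2, 2))
```

Calling such a check on the values actually passed inside the failing function, for example from the browser opened by options(error = recover), should show which limit turned into NA.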
Re: [R] Plotting time data for various countries in same graph
Hi Anindya,

This might be a start for you:

    asd.df <- read.table(text="Time Country Values
     2010Q1 India 5
     2010Q2 India 7
     2010Q3 India 5
     2010Q4 India 9
     2010Q1 China 10
     2010Q2 China 6
     2010Q3 China 9
     2010Q4 China 14", header=TRUE, stringsAsFactors=TRUE)
    # Time is read as a factor (stringsAsFactors=TRUE makes this explicit
    # on newer R versions), so it can be used directly in plotting
    as.numeric(asd.df$Time)
    # [1] 1 2 3 4 1 2 3 4
    plot(as.numeric(asd.df$Time)[asd.df$Country == "India"],
         asd.df$Values[asd.df$Country == "India"],
         type="l", col=4, lwd=2, xaxt="n", xlab="Financial Quarter",
         ylab="Value", ylim=c(0,14))
    lines(as.numeric(asd.df$Time)[asd.df$Country == "China"],
          asd.df$Values[asd.df$Country == "China"],
          col=2, lwd=2)
    axis(1, at=1:4, labels=paste("Q", 1:4, sep=""))
    legend(2, 2, c("India","China"), lty=1, lwd=2, col=c(4,2))

Jim
[R] combining column having same values
Dear useRs,

I have a matrix in the following form

    [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]
       1    1    3    2    3    1    1    2    3     3     2

and the following is my desired output (combining the column headers that have the same values):

    a <- 1,2,6,7
    b <- 3,5,9,10
    c <- 4,8,11

Thanks in advance

Elisa
Re: [R] faulty F-tests
Thanks so much. I see my foolish ways now.
Re: [R] lm and Formula tutorial
Dear Alex,

Here are some URLs:

http://data.princeton.edu/R/linearModels.html
http://www.r-bloggers.com/r-tutorial-series-simple-linear-regression/

Regards,
Eva

--- On Wed, 6/3/13, Alaios <ala...@yahoo.com> wrote:

> From: Alaios <ala...@yahoo.com>
> Subject: [R] lm and Formula tutorial
> To: R help <R-help@r-project.org>
> Date: Wednesday, 6 March 2013, 08:08
>
> Dear all,
> Last night I was reading the lm and the Formula manual pages, and I have to admit that I had a tough time understanding their syntax. Is there a simpler guide for dummies like me to start with?
>
> I would like to thank you in advance for your help
>
> Regards
> Alex
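As a small starting point for the formula syntax asked about here, the sketch below fits a few models to the built-in mtcars data. The choice of dataset and variables is purely illustrative.

```r
# response ~ predictors; the data frame supplies the variables
fit1 <- lm(mpg ~ wt, data = mtcars)            # one predictor
fit2 <- lm(mpg ~ wt + hp, data = mtcars)       # two additive predictors
fit3 <- lm(mpg ~ wt * hp, data = mtcars)       # wt + hp + their interaction wt:hp
fit4 <- lm(mpg ~ wt + I(wt^2), data = mtcars)  # I() protects arithmetic inside a formula
fit5 <- lm(mpg ~ ., data = mtcars[, c("mpg", "wt", "hp")])  # "." means all other columns

# The terms each formula expands to can be read off the coefficient names
names(coef(fit3))
names(coef(fit5))
```

In a formula, `+` adds a term, `:` denotes an interaction, and `*` is shorthand for the main effects plus their interaction, so `wt * hp` and `wt + hp + wt:hp` describe the same model.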
Re: [R] combining column having same values
Dear Eliza,

Your question is not very clear. I think you are looking for the which() function.

Best regards,

Thierry

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25, 1070 Anderlecht, Belgium

-----Original message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On behalf of eliza botto
Sent: Wednesday 6 March 2013 12:26
To: r-help@r-project.org
Subject: [R] combining column having same values
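A brief sketch of how which() applies to the vector from the question, together with split() as an alternative that handles all values at once. The names `v` and `groups` are illustrative.

```r
v <- c(1, 1, 3, 2, 3, 1, 1, 2, 3, 3, 2)

# Positions holding one particular value
pos1 <- which(v == 1)

# Positions grouped by value in one step; list names are the values themselves
groups <- split(seq_along(v), v)

pos1
groups
```

The list is ordered by value (1, 2, 3), so the element for value 3 corresponds to "b" and the element for value 2 to "c" in the desired output.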
Re: [R] boxplot with frequencies(counts)
On 03/06/2013 12:45 AM, km wrote:

> Dear All,
>
> I have a table as follows:
>
>     position  type  count
>     1         2     100
>     1         3      51
>     1         5      64
>     1         8      81
>     1         6      32
>     2         2      41
>     2         3      85
>
> and so on.
>
> Normally I would have a vector of values 2,3,4,5... for each position and plot them by position. But now I have counts of these types. Is there a way to compute a boxplot from this kind of data?

Hi KM,

We must assume that the "type" variable is to be used as a value; otherwise you would want something like a frequency plot by the two factors position and type. (If this is the case, I would suggest a nested bar plot.) Here is a fairly awful kludge that will get you a boxplot ("tdf" is your table):

    # repeat each (position, type) pair "count" times
    reprow <- function(x) matrix(rep(x[1:2], x[3]), ncol=2, byrow=TRUE)
    replist <- apply(as.matrix(tdf), 1, reprow)
    repmat <- replist[[1]]
    # start at 2, since the first element is already in repmat
    for(i in 2:length(replist)) repmat <- rbind(repmat, replist[[i]])
    # type (column 2) grouped by position (column 1)
    boxplot(repmat[,2] ~ repmat[,1])

With apologies
Jim
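The same expansion can be sketched more compactly by repeating row indices with rep(). The data frame below holds only the seven rows shown in the question, and `tdf`, `expanded` and `bp` are illustrative names.

```r
tdf <- data.frame(
  position = c(1, 1, 1, 1, 1, 2, 2),
  type     = c(2, 3, 5, 8, 6, 2, 3),
  count    = c(100, 51, 64, 81, 32, 41, 85)
)

# Repeat each row index "count" times to get one row per observation
expanded <- tdf[rep(seq_len(nrow(tdf)), tdf$count), c("position", "type")]

# Box plot statistics of type by position; drop plot = FALSE to draw the plot
bp <- boxplot(type ~ position, data = expanded, plot = FALSE)
bp$stats
```

The third row of `bp$stats` holds the medians per position. For large counts this expansion can use a lot of memory, in which case computing weighted quantiles directly would be the alternative.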
Re: [R] lm Regression takes 24+ GB RAM - Error message
On Wed, Mar 6, 2013 at 9:51 AM, Jonas125 <schleeberge...@pg.com> wrote:

> Hello,
>
> I am a rather inexperienced R user (I learned the language one month ago) and have run into the following problem on a local computer with 6 cores, 24 GB RAM and 64-bit R 2.15. I didn't install any additional packages.
>
> 1. Via the read.table command I load a data table (with different data types) which is about 730 MB large
> 2. I add 2 calculated columns
> 3. I split the dataset by 5 criteria
> 4. I run the lm command on the split with the calculated columns as the variables
>
> The RAM consumption goes up rapidly and stays at 24 GB for a couple of minutes. The result:
>
>     Error: cannot allocate vector size of 5.0 Mb
>     In addition: There were 50 or more warnings (use warnings() to see the first 50)
>     -- Reached total allocation of 24559Mb

So it seems R has access to all your memory. My guess is that you have so-called "factors" (categorical variables) in your dataset, and this makes the linear regression a much larger calculation (in the intermediate steps) than you might realize, because the design matrix has to deal with all the crossed categories.

Can you provide the output of str(DATA_SET)?

MW

> My code works perfectly fine for a smaller dataset. I am surprised by the errors, as the CPU should do all the work with the lm calculations and the output cannot be that large, can it? (I cannot check the object size of the lm object due to the error.)
>
> Right now I am running only 1 linear model, but actually I wanted to run 6!
>
> Is Windows putting some restrictions on R regarding RAM usage? Can I change any settings? A RAM upgrade is not an option. Do I need to use a different R package instead (bigmemory?)?

Not a bad idea.

> Thanks in advance for your help!!
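The effect of factors on the size of the design matrix can be checked on a small made-up dataset before fitting anything large. The sketch below uses hypothetical factors with 50 and 40 levels; none of these names or sizes come from the original poster's data.

```r
n <- 2000
d <- data.frame(
  y  = seq_len(n) / n,
  x  = rep_len(c(0.5, 1.5, 2.5), n),
  f1 = factor(rep_len(1:50, n)),   # 50 levels
  f2 = factor(rep_len(1:40, n))    # 40 levels
)

# Additive model: intercept + x + 49 + 39 dummy columns
cols_add <- ncol(model.matrix(y ~ x + f1 + f2, d))

# Crossed factors add 49 * 39 interaction columns on top of that
cols_int <- ncol(model.matrix(y ~ x + f1 * f2, d))

# Rough size in bytes of a dense double-precision design matrix with n rows
bytes_int <- n * cols_int * 8

c(additive = cols_add, crossed = cols_int, bytes = bytes_int)
```

With millions of rows, a few thousand columns of this kind already amounts to tens of gigabytes for the design matrix alone, before lm() makes any working copies.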
Re: [R] Plotting time data for various countries in same graph
Hello,

You forgot to use a geom. Also, to have Time as the x-axis variable you need to do a conversion.

    library(ggplot2)

    dat <- read.table(text = "
    Time Country Values
    2010Q1 India 5
    2010Q2 India 7
    2010Q3 India 5
    2010Q4 India 9
    2010Q1 China 10
    2010Q2 China 6
    2010Q3 China 9
    2010Q4 China 14
    ", header = TRUE)

    # turn "2010Q1" into the number 2010.1, and so on
    dat$Time <- as.numeric(sub("Q", "\\.", dat$Time))

    p <- ggplot(dat, aes(x=Time, y=Values, colour=Country))
    p + geom_line()

Hope this helps,

Rui Barradas
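The conversion above maps 2010Q1..2010Q4 to 2010.1..2010.4, which is fine within a single year. For data spanning several years, an evenly spaced numeric time may be preferable. The helper below is a hypothetical sketch that assumes labels of the exact form "YYYYQn".

```r
# Convert labels such as "2010Q3" to year + (quarter - 1) / 4
quarter_to_num <- function(q) {
  q    <- as.character(q)
  year <- as.numeric(substr(q, 1, 4))
  qtr  <- as.numeric(substr(q, 6, 6))
  year + (qtr - 1) / 4
}

quarter_to_num(c("2010Q1", "2010Q4", "2011Q1"))
```

With this scale, consecutive quarters are always 0.25 apart, including across the step from Q4 of one year to Q1 of the next.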
[R] Ggplot2: Moving legend, change fill and removal of space between plots when using grid.arrange() possible use of facet_grid?
Hi,

# For publications, I am not allowed to repeat the axes. I have tried to remove the axes using
# yaxt="n", but it did not work. I have not understood how to do this in ggplot2. Can you help me?
# I also do not want loads of space between the graphs (see the script below with dummy data).
# If I could make it look like the examples on the (nice) examples page
# http://www.ling.upenn.edu/~joseff/rstudy/summer2010_ggplot2_intro.html
# using facet_grid(), I would be very, very happy.
# I also do not want the geometric points to be filled, and the fill="white" command
# does not seem to work. Why? And are there alternatives?
# Furthermore, I would like to add legends inside the plot area instead of on the side,
# like when you use plotrix and brkdn.plot:
#   legend("topright", c("A", "B"), pch=c(0,1), bg="white", lty = 1:2, cex=1, bty="n")
# This did not work in ggplot2. What are my alternatives?
# I have searched the internet extensively; if I have missed something obvious, it was due to
# tiredness and not to laziness.

# Some dummy data:
mydata <- data.frame(factor1 = factor(rep(LETTERS[1:6], each = 80)),
                     factor2 = factor(rep(c(1:5), each = 16)),
                     factor3 = factor(rep(c(1:4), each = 4)),
                     var1 = rnorm(120, mean = rep(c(0, 3, 5), each = 40),
                                  sd = rep(c(1, 2, 3), each = 20)),
                     var2 = rnorm(120, mean = rep(c(6, 7, 8), each = 40),
                                  sd = rep(c(1, 2, 3), each = 20)))

# Splitting data into 3 data frames (based on factor1)
# If I could do this using for example facet_wrap() or facet_grid(), I would be very
# happy! I have tried but failed with that method.
DataAB <- mydata[(mydata$factor1) %in% c("A", "B"), ]
DataCD <- mydata[(mydata$factor1) %in% c("C", "D"), ]
DataEF <- mydata[(mydata$factor1) %in% c("E", "F"), ]
DataAB

library(plyr)
library(ggplot2)

# Plot: levels A and B:
# Summary (means etc)
SummAB <- ddply(DataAB, .(factor3, factor1), summarize,
                mean = mean(var1, na.rm = FALSE),
                sdv = sd(var1, na.rm = FALSE),
                se = 1.96*(sd(var1, na.rm=FALSE)/sqrt(length(var1))))
SummAB

p1 <- ggplot(SummAB, aes(factor3, mean, colour = factor1,
                         group = factor1, shape = factor1)) +
  geom_point(aes(shape=factor(factor1)), color="black", fill="white",
             position = "dodge", width = 0.3, size=3) +
  geom_line(aes(linetype=factor1), color = "black", size = 0.5) +
  geom_errorbar(aes(ymin = mean - sdv, ymax = mean + sdv),
                width = 0.3, position = "dodge", color = "black", size=0.3) +
  theme_bw() +
  ylab(expression(paste("my measured stuff"))) +
  xlab("factor3") +
  ggtitle("") +
  labs(color = "factor1", shape = "factor1", group = "factor1", linetype = "factor1")
p1

# Plot: levels C and D:
# Summary (means etc)
SummCD <- ddply(DataCD, .(factor3, factor1), summarize,
                mean = mean(var1, na.rm = FALSE),
                sdv = sd(var1, na.rm = FALSE),
                se = 1.96*(sd(var1, na.rm=FALSE)/sqrt(length(var1))))

p2 <- ggplot(SummCD, aes(factor3, mean, colour = factor1,
                         group = factor1, shape = factor1)) +
  geom_point(aes(shape=factor(factor1)), color="black", fill="white",
             position = "dodge", width = 0.3, size=3) +
  geom_line(aes(linetype=factor1), color = "black", size = 0.5) +
  geom_errorbar(aes(ymin = mean - sdv, ymax = mean + sdv),
                width = 0.3, position = "dodge", color = "black", size=0.3) +
  theme_bw() +
  ylab(expression(paste("my measured stuff"))) +
  xlab("factor3") +
  ggtitle("") +
  labs(color = "factor1", shape = "factor1", group = "factor1", linetype = "factor1")
p2

# Plot: levels E and F:
# Summary (means etc)
SummEF <- ddply(DataEF, .(factor3, factor1), summarize,
                mean = mean(var1, na.rm = FALSE),
                sdv = sd(var1, na.rm = FALSE),
                se = 1.96*(sd(var1, na.rm=FALSE)/sqrt(length(var1))))

p3 <- ggplot(SummEF, aes(factor3, mean, colour = factor1,
                         group = factor1, shape = factor1)) +
  geom_point(aes(shape=factor(factor1)), color="black", fill="white", # Why is the fill command not working?
             position = "dodge", width = 0.3, size=3) +
  geom_line(aes(linetype=factor1), color = "black", size = 0.5) +
  geom_errorbar(aes(ymin = mean - sdv, ymax = mean + sdv),
                width = 0.3, position = "dodge", color = "black", size=0.3) +
  theme_bw() +
  ylab(expression(paste("my measured stuff"))) +
  xlab("factor3") +
  ggtitle("") +
  labs(color = "factor1", shape = "factor1", group = "factor1", linetype = "factor1")
p3

library(gridExtra)
sidebysideplot <- grid.arrange(p1, p2, p3, ncol=2)

Anna Zakrisson Braeunlich
PhD student
Department of Ecology, Environment and Plant Sciences
Stockholm University
Svante Arrheniusv. 21A
SE-106 91 Stockholm
Sweden
Lives in
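A possible direction for the faceting and fill questions, sketched under several assumptions: the summary is computed with base aggregate() instead of ddply(), a hypothetical `panel` column pairs the factor1 levels, and a reasonably recent ggplot2 is installed (the plotting part is skipped otherwise). Shapes 21 to 25 are the point shapes that have a fill; the default shapes ignore the fill setting, which may be why fill = "white" had no visible effect.

```r
set.seed(1)
mydata <- data.frame(factor1 = factor(rep(LETTERS[1:6], each = 80)),
                     factor3 = factor(rep(1:4, each = 4)),
                     var1 = rnorm(120, mean = rep(c(0, 3, 5), each = 40),
                                  sd = rep(c(1, 2, 3), each = 20)))

# One summary table for all six levels instead of three separate ones
summ <- aggregate(var1 ~ factor1 + factor3, data = mydata, FUN = mean)
names(summ)[3] <- "mean"
summ$sdv <- aggregate(var1 ~ factor1 + factor3, data = mydata, FUN = sd)$var1

# Hypothetical panel variable: A and B share a panel, then C and D, then E and F
lookup <- c(A = "A-B", B = "A-B", C = "C-D", D = "C-D", E = "E-F", F = "E-F")
summ$panel <- factor(lookup[as.character(summ$factor1)])

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(summ, aes(factor3, mean, group = factor1,
                        shape = factor1, linetype = factor1)) +
    geom_line() +
    geom_errorbar(aes(ymin = mean - sdv, ymax = mean + sdv),
                  width = 0.3, linetype = "solid") +
    geom_point(size = 3, fill = "white") +                     # fill applies to shapes 21-25
    scale_shape_manual(values = c(21, 22, 21, 22, 21, 22)) +
    facet_wrap(~ panel, nrow = 1) +                            # shared y axis, drawn once
    theme_bw() +
    theme(panel.spacing = grid::unit(0, "lines"),              # no gap between panels
          legend.position = "bottom")
  print(p)
}
```

Faceting draws the y axis only on the leftmost panel, which addresses the repeated axes. Moving the legend inside the plotting area is also controlled through theme(), but the exact arguments have changed between ggplot2 versions, so the theme documentation for the installed version is the place to check.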
Re: [R] combining column having same values
Hi,

Try this:

    mat1 <- as.matrix(read.table(text="
    1 1 3 2 3 1 1 2 3 3 2
    ", sep="", header=FALSE))
    res <- lapply(1:3, function(i) which(mat1==i))
    names(res) <- c("a","c","b")
    res
    #$a
    #[1] 1 2 6 7
    #$c
    #[1]  4  8 11
    #$b
    #[1]  3  5  9 10

A.K.

----- Original Message -----
From: eliza botto <eliza_bo...@hotmail.com>
To: r-help@r-project.org
Sent: Wednesday, March 6, 2013 6:26 AM
Subject: [R] combining column having same values
Re: [R] Ggplot2: Moving legend, change fill and removal of space between plots when using grid.arrange() possible use of facet_grid?
Look at the function melt() and use the factor columns as id variables. You should be able to do what you want with facet_grid(). I have found Inkscape useful for building legends and modifying axis labels. I know this is only a partial answer, but I hope it helps.

Stephen

On Wed 06 Mar 2013 06:32:42 AM CST, Anna Zakrisson wrote:
> Hi,
>
> # For publications, I am not allowed to repeat the axes. I have tried to remove the axes using
> # yaxt="n", but it did not work. I have not understood how to do this in ggplot2. Can you help me?
> # I also do not want loads of space between the graphs (see the script with dummy data below).
> # If I could make it look like the examples on the (nice) examples page
> # http://www.ling.upenn.edu/~joseff/rstudy/summer2010_ggplot2_intro.html
> # using facet_grid(), I would be very, very happy.
> # I also do not want the geometric points to be filled, and the fill="white" command
> # does not seem to work. Why? Are there alternatives?
> # Furthermore, I would like to add legends inside the plot area instead of on the side,
> # as when you use plotrix and brkdn.plot():
> legend("topright", c("A", "B"), pch=c(0,1), bg="white", lty = 1:2, cex=1, bty="n")
> # This did not work in ggplot2. What are my alternatives?
> # I have searched the internet extensively; if I have missed something obvious, it was due to
> # tiredness and not to laziness.
>
> # Some dummy data:
> mydata <- data.frame(factor1 = factor(rep(LETTERS[1:6], each = 80)),
>                      factor2 = factor(rep(c(1:5), each = 16)),
>                      factor3 = factor(rep(c(1:4), each = 4)),
>                      var1 = rnorm(120, mean = rep(c(0, 3, 5), each = 40), sd = rep(c(1, 2, 3), each = 20)),
>                      var2 = rnorm(120, mean = rep(c(6, 7, 8), each = 40), sd = rep(c(1, 2, 3), each = 20)))
>
> # Splitting data into 3 data frames (based on factor1)
> # If I could do this using for example facet_wrap() or facet_grid(), I would be very
> # happy! I have tried that method but failed.
> DataAB <- mydata[(mydata$factor1) %in% c("A", "B"), ]
> DataCD <- mydata[(mydata$factor1) %in% c("C", "D"), ]
> DataEF <- mydata[(mydata$factor1) %in% c("E", "F"), ]
> DataAB
>
> library(plyr)
> library(ggplot2)
>
> # Plot: levels A and B:
> # Summary (means etc.)
> SummAB <- ddply(DataAB, .(factor3,factor1), summarize,
>                 mean = mean(var1, na.rm = FALSE),
>                 sdv = sd(var1, na.rm = FALSE),
>                 se = 1.96*(sd(var1, na.rm=FALSE)/sqrt(length(var1))))
> SummAB
>
> p1 <- ggplot(SummAB, aes(factor3, mean, colour = factor1, group = factor1, shape = factor1)) +
>   geom_point(aes(shape=factor(factor1)), color="black", fill="white",
>              position = "dodge", width = 0.3, size=3) +
>   geom_line(aes(linetype=factor1), color = "black", size = 0.5) +
>   geom_errorbar(aes(ymin = mean - sdv , ymax = mean + sdv),
>                 width = 0.3, position = "dodge", color = "black", size=0.3) +
>   theme_bw() +
>   ylab(expression(paste("my measured stuff"))) +
>   xlab("factor3") +
>   ggtitle("") +
>   labs(color = "factor1", shape = "factor1", group = "factor1", linetype = "factor1")
> p1
>
> # Plot: levels C and D:
> # Summary (means etc.)
> SummCD <- ddply(DataCD, .(factor3,factor1), summarize,
>                 mean = mean(var1, na.rm = FALSE),
>                 sdv = sd(var1, na.rm = FALSE),
>                 se = 1.96*(sd(var1, na.rm=FALSE)/sqrt(length(var1))))
>
> p2 <- ggplot(SummCD, aes(factor3, mean, colour = factor1, group = factor1, shape = factor1)) +
>   geom_point(aes(shape=factor(factor1)), color="black", fill="white",
>              position = "dodge", width = 0.3, size=3) +
>   geom_line(aes(linetype=factor1), color = "black", size = 0.5) +
>   geom_errorbar(aes(ymin = mean - sdv , ymax = mean + sdv),
>                 width = 0.3, position = "dodge", color = "black", size=0.3) +
>   theme_bw() +
>   ylab(expression(paste("my measured stuff"))) +
>   xlab("factor3") +
>   ggtitle("") +
>   labs(color = "factor1", shape = "factor1", group = "factor1", linetype = "factor1")
> p2
>
> # Plot: levels E and F:
> # Summary (means etc.)
> SummEF <- ddply(DataEF, .(factor3,factor1), summarize,
>                 mean = mean(var1, na.rm = FALSE),
>                 sdv = sd(var1, na.rm = FALSE),
>                 se = 1.96*(sd(var1, na.rm=FALSE)/sqrt(length(var1))))
>
> p3 <- ggplot(SummEF, aes(factor3, mean, colour = factor1, group = factor1, shape = factor1)) +
>   geom_point(aes(shape=factor(factor1)), color="black", fill="white", # Why is the fill command not working?
>              position = "dodge", width = 0.3, size=3) +
>   geom_line(aes(linetype=factor1), color = "black", size = 0.5) +
>   geom_errorbar(aes(ymin = mean - sdv , ymax = mean + sdv),
>                 width = 0.3, position = "dodge", color = "black", size=0.3) +
>   theme_bw() +
>   ylab(expression(paste("my measured stuff"))) +
>   xlab("factor3") +
>   ggtitle("") +
>   labs(color = "factor1", shape = "factor1", group = "factor1", linetype = "factor1")
> p3
>
> library(gridExtra)
> sidebysideplot <- grid.arrange(p1, p2, p3, ncol=2)
>
> Anna Zakrisson Braeunlich
> PhD student
> Department of Ecology Environment and Plant Sciences
> Stockholm University
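A minimal sketch of the facet-based approach suggested in the reply, assuming simplified dummy data and a reasonably recent ggplot2. The column name panel, the pairing of levels into A-B, C-D and E-F, and the use of aggregate() instead of ddply() are illustrative choices, not part of the original script. It relies on two facts: fill only affects point shapes 21 to 25, which would explain why fill="white" had no visible effect with the default shapes, and a numeric legend.position places the legend inside the plot area (newer ggplot2 releases prefer legend.position.inside and may warn about the numeric form).

```r
library(ggplot2)

set.seed(1)
# Dummy data in the spirit of the question: 480 rows, 6 x 4 factor cells
mydata <- data.frame(
  factor1 = factor(rep(LETTERS[1:6], each = 80)),
  factor3 = factor(rep(rep(1:4, each = 4), length.out = 480)),
  var1    = rnorm(480, mean = 3, sd = 2)
)

# Mean and sd per cell with base R; yields columns var1.mean and var1.sdv
summ <- do.call(data.frame,
                aggregate(var1 ~ factor1 + factor3, data = mydata,
                          FUN = function(v) c(mean = mean(v), sdv = sd(v))))

# Hypothetical panel variable: two levels of factor1 share each panel
summ$panel <- c(A = "A-B", B = "A-B", C = "C-D", D = "C-D",
                E = "E-F", F = "E-F")[as.character(summ$factor1)]

pd <- position_dodge(width = 0.3)
p <- ggplot(summ, aes(factor3, var1.mean, group = factor1)) +
  geom_line(aes(linetype = factor1), position = pd) +
  geom_errorbar(aes(ymin = var1.mean - var1.sdv, ymax = var1.mean + var1.sdv),
                width = 0.3, position = pd) +
  # shapes 21-25 have a separate fill, so fill = "white" gives open symbols
  geom_point(aes(shape = factor1), fill = "white", size = 3, position = pd) +
  scale_shape_manual(values = rep(c(21, 22), 3)) +
  scale_linetype_manual(values = rep(c("solid", "dashed"), 3)) +
  facet_wrap(~ panel, nrow = 1) +  # one shared y axis instead of three
  labs(x = "factor3", y = "my measured stuff") +
  theme_bw() +
  theme(legend.position = c(0.93, 0.8))  # legend inside the plot area

# print(p) draws the figure
```

Because the panels come from one plot, the y axis is drawn once and the gaps that grid.arrange() leaves between separate plots disappear.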
[R] chi square exact test
SPSS offers a chi-square exact test for one-dimensional data with a small sample size (6). What is the comparable function in R?

Kind Regards
Knut
Re: [R] Good practice for data() for R-packages
On 05.03.2013 11:21, Johannes Radinger wrote:
> Hi,
>
> I am compiling an R package and have two tables (.rda files) that are used by the functions in my package. In the manual for ?data (http://stat.ethz.ch/R-manual/R-patched/library/utils/html/data.html) there is a section on good practice for such sysdata. However, two things are not yet clear to me:
>
> 1) Probably I need to take the second approach: "For objects which are system data, for example lookup tables used in calculations within the function, use a file ‘R/sysdata.rda’ in the package sources or create the objects by R code at package installation time." But what if I have two rda files? Is sysdata.rda a fixed name (naming convention)?

Put both objects into the same rda file.

> 2) How should these rda tables be used/loaded in the package functions? Is data() still working/okay?

The objects will be available in your NAMESPACE.

Best,
Uwe Ligges

> /johannes
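The advice above can be sketched as follows. The object names lookup_a and lookup_b and the package name mypkg are hypothetical, and a temporary directory stands in for the package source tree; in a real package the file would be R/sysdata.rda.

```r
# Two hypothetical lookup tables used internally by package functions
lookup_a <- data.frame(key = c("x", "y"), value = c(1.5, 2.5))
lookup_b <- c(low = 0.1, high = 0.9)

# Stand-in for the package root; a real package would use its own R/ directory
pkg_root <- file.path(tempdir(), "mypkg")
dir.create(file.path(pkg_root, "R"), recursive = TRUE, showWarnings = FALSE)
sysdata <- file.path(pkg_root, "R", "sysdata.rda")

# save() accepts several objects, so one file holds both tables
save(lookup_a, lookup_b, file = sysdata)

# Check that both objects come back from the single file
e <- new.env()
loaded <- load(sysdata, envir = e)
print(loaded)
```

Inside the package, functions would then refer to lookup_a and lookup_b directly, with no data() call, since objects from R/sysdata.rda are loaded into the package namespace and not exported.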
Re: [R] chi square exact test
A quick Google search produces multiple results. Good luck. :)

~Nicole Ford
Ph.D. Student
Graduate Assistant/Instructor
Department of Government and International Affairs
University of South Florida
office: SOC 012M

On Mar 6, 2013, at 6:30 AM, Knut Krueger r...@knut-krueger.de wrote:
> SPSS offers a chi-square exact test for one-dimensional data with a small sample size (6). What is the comparable function in R?
>
> Kind Regards
> Knut
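For readers who land on this thread, here is a hedged sketch of two base R options for a one-dimensional (goodness-of-fit) table with n = 6. The counts are made up, exact_gof() is a hypothetical helper written for this illustration, and no claim is made that it matches the algorithm SPSS uses. The first option is the Monte Carlo p-value built into chisq.test(); the second enumerates every possible table and sums the multinomial probabilities of those at least as extreme as the observed one.

```r
counts <- c(4, 1, 1)  # hypothetical observed counts, n = 6

# Option 1: simulated p-value from base R
set.seed(42)
mc <- chisq.test(counts, p = rep(1 / 3, 3), simulate.p.value = TRUE, B = 10000)

# Option 2: exact p-value by full enumeration (only sensible for small n)
exact_gof <- function(obs, p = rep(1 / length(obs), length(obs))) {
  n <- sum(obs)
  k <- length(obs)
  # all count vectors of length k that sum to n
  grid <- expand.grid(rep(list(0:n), k - 1))
  grid <- grid[rowSums(grid) <= n, , drop = FALSE]
  outcomes <- cbind(as.matrix(grid), n - rowSums(grid))
  chisq_stat <- function(o) sum((o - n * p)^2 / (n * p))
  stats <- apply(outcomes, 1, chisq_stat)
  probs <- apply(outcomes, 1, dmultinom, prob = p)
  # probability of tables with a statistic at least as large as observed
  sum(probs[stats >= chisq_stat(obs) - 1e-9])
}

p_exact <- exact_gof(counts)
print(c(simulated = mc$p.value, exact = p_exact))
```

With only two categories, binom.test() gives an exact test directly.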
Re: [R] How to combine conditional argument and logical argument in R to create subset of data...
Hi HJ,

Tem2 <- as.data.frame(Tem1)
res <- do.call(rbind, split(Tem2, Tem2$V1))
row.names(res) <- 1:nrow(res)
head(res, 7)
#   V1 V2
#1 111  1
#2 111  2
#3 111  3
#4 111  4
#5 111 13
#6 111 14
#7 111 15

A.K.

From: HJ YAN yhj...@googlemail.com
To: arun smartpink...@yahoo.com
Cc: r-help@r-project.org
Sent: Wednesday, March 6, 2013 8:24 AM
Subject: Re: [R] How to combine conditional argument and logical argument in R to create subset of data...

> Hi Arun
>
> Thank you so much for the help, that's really helpful!!
>
> Also I have a quick question about the code below where I can not see why it doesn't work... I know the I shou
>
> V1 <- c(rep(111,4),rep(222,4),rep(333,4),rep(111,4),rep(222,4),rep(333,3))
> V2 <- c(1:23)
> Tem1 <- cbind(V1,V2)
>
> So Tem1 looks like...
>
> Tem1
>        V1 V2
>  [1,] 111  1
>  [2,] 111  2
>  [3,] 111  3
>  [4,] 111  4
>  [5,] 222  5
>  [6,] 222  6
>  [7,] 222  7
>  [8,] 222  8
>  [9,] 333  9
> [10,] 333 10
> [11,] 333 11
> [12,] 333 12
> [13,] 111 13
> [14,] 111 14
> [15,] 111 15
> [16,] 111 16
> [17,] 222 17
> [18,] 222 18
> [19,] 222 19
> [20,] 222 20
> [21,] 333 21
> [22,] 333 22
> [23,] 333 23
>
> I would like the outcome to be...
>
>  V1 V2
> 111  1
> 111  2
> 111  3
> 111  4
> 111 13
> 111 14
> 111 15
> 111 16
> 222  5
> 222  6
> 222  7
> 222  8
> 222 17
> 222 18
> 222 19
> 222 20
> 333  9
> 333 10
> 333 11
> 333 12
> 333 21
> 333 22
> 333 23
>
> So I tried code as below
>
> Tem3 <- c(NA,NA)
> for(i in length(unique(Tem1[,1]))){
>   Tem2 <- subset(Tem1,Tem1[,1]==unique(Tem1[,1])[i])
>   Tem3 <- rbind(Tem3,Tem2)
>   Tem3
> }
> Tem4 <- Tem3[-1,]
>
> And only get this...
>
>  V1 V2
> 333  9
> 333 10
> 333 11
> 333 12
> 333 21
> 333 22
> 333 23
>
> I tried to run the code step by step, e.g. letting i=1, then i=2, then i=3, and updating my Tem3. I did get what I wanted, but wondered why it did not work in the loop above...??
>
> Many thanks in advance!
> HJ
>
> On Wed, Mar 6, 2013 at 4:36 AM, arun smartpink...@yahoo.com wrote:
>> Hi,
>>
>> b[b[,4]>15 & (b[,1]>4|is.na(b[,1])) & (b[,2]>4|is.na(b[,2])),]
>> #     [,1] [,2] [,3] [,4] [,5]
>> #[1,]    6   NA   NA   16   20
>> #[2,]   NA    5   NA   17   21
>>
>> A.K.
>>
>> ----- Original Message -----
>> From: HJ YAN yhj...@googlemail.com
>> To: r-help@r-project.org
>> Sent: Tuesday, March 5, 2013 9:33 PM
>> Subject: [R] How to combine conditional argument and logical argument in R to create subset of data...
>>
>>> Dear R user
>>>
>>> I have data created using the code below
>>>
>>> b <- matrix(2:21,nrow=4)
>>> b[,1:3]=NA
>>> b[4,2]=5
>>> b[3,1]=6
>>>
>>> Now the data is
>>>
>>> b
>>>      [,1] [,2] [,3] [,4] [,5]
>>> [1,]   NA   NA   NA   14   18
>>> [2,]   NA   NA   NA   15   19
>>> [3,]    6   NA   NA   16   20
>>> [4,]   NA    5   NA   17   21
>>>
>>> I want to keep rows where column 4 is greater than 15 and the values in columns 1 and 2 are either greater than 4 or 'NA'. So I would like to have my outcome as below...
>>>
>>> [3,]    6   NA   NA   16   20
>>> [4,]   NA    5   NA   17   21
>>>
>>> I thought something like the code below was going to work, but it only returns the last row, e.g. NA 5 NA 17 21.
>>>
>>> bb <- b[which( (b[,2]>4 | b[,2]==NA) & (b[,1]>4 | b[,1]==NA) & b[,4]>15) ,])
>>>
>>> Please could anyone help?
>>>
>>> Many thanks in advance
>>> HJ
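A short sketch of why the is.na() version is needed for the matrix b from the original question. The variable name keep is introduced here for illustration. A comparison with NA, such as x == NA, always evaluates to NA and never to TRUE, so a condition written that way cannot select the NA cells; is.na() is the test that returns TRUE for them.

```r
b <- matrix(2:21, nrow = 4)
b[, 1:3] <- NA
b[4, 2] <- 5
b[3, 1] <- 6

# Comparing with NA never gives TRUE or FALSE, only NA
print(b[, 2] == NA)

# is.na() supplies the TRUE needed for the "or is NA" part of each condition
keep <- b[, 4] > 15 &
  (b[, 1] > 4 | is.na(b[, 1])) &
  (b[, 2] > 4 | is.na(b[, 2]))
bb <- b[keep, ]
print(bb)
```

The logical index works here because NA | TRUE is TRUE and FALSE & NA is FALSE, so keep contains no NA for this data and which() is not required.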
Re: [R] Finding the knots in a smoothing spline using nknots
Thanks, David! That makes sense. I shall re-read the manual page.

Regards,
Mike Nielsen

On Wed, Feb 27, 2013 at 12:19 PM, David Winsemius dwinsem...@comcast.net wrote:
> On Feb 27, 2013, at 6:39 AM, Mike Nielsen wrote:
>> Hi r-helpers.
>>
>> Please forgive my ignorance, but I would like to plot a smoothing spline (smooth.spline) from package stats, and show the knots in the plot, and I can't seem to figure out where smooth.spline has located the knots (when I use nknots). Unfortunately, I don't know a lot about splines, but I know that they provide me an easy way to estimate the location of local maxima and minima on varying time-scales (number of knots) in my original data.
>>
>> I see there is a fit$knot, but it's not clear to me what those values are: for some reason I had expected that they would be contained in my original y values, but they're not.
>
> It appears they are in the range [0, 1], and ss$fit$min and ss$fit$range provide the scaling data (for the x-values rather than the y-values):
>
> unique(ss$fit$knot)
>  [1] 0.00000000 0.04095904 0.08291708 0.12487512 0.16583417 0.20779221 0.24975025 0.29070929
>  [9] 0.33266733 0.37462537 0.41658342 0.45754246 0.49950050 0.54145854 0.58241758 0.62437562
> [17] 0.66633367 0.70829171 0.74925075 0.79120879 0.83316683 0.87412587 0.91608392 0.95804196
> [25] 1.00000000
>
> I would think that in your case, with x0 being 0, you could just use ss$fit$range*unique(ss$fit$knot) as your knot positions. In the more general case you would need to add ss$fit$min. I tried confirming this hunch by looking at Statistical Models in S, in MASS 4e, and at the R code, but the R code calls a FORTRAN routine, so you would need to pull the source to confirm.
>
> --
> David.
>
>> I tried generating nknots equally spaced points in my x, but when I plotted the points that corresponded to my original y values at those equally-spaced x values, I found that the spline did not pass through them, which, perhaps naively, I thought it might.
>>
>> Also, the manual says that yin comprises "the y values used at the unique y values" -- should this read "at the unique x values"?
>>
>> Could someone kindly point to a resource where I can get a slightly fuller explanation? I looked at the code for smooth.spline, but can't readily follow it.
>>
>> Here's a toy example:
>>
>> x <- seq(from=0,to=4*pi,length=1002)
>> y <- sin(x)
>> ss <- smooth.spline(x,y=y,all.knots=F,nknots=25)
>> ss
>> Call:
>> smooth.spline(x = x, y = y, all.knots = F, nknots = 25)
>>
>> Smoothing Parameter  spar= -0.4573636  lambda= 1.006117e-09 (14 iterations)
>> Equivalent Degrees of Freedom (Df): 26.99935
>> Penalized Criterion: 3.027077e-06
>> GCV: 3.190666e-09
>>
>> str(ss)
>> List of 15
>>  $ x       : num [1:1002] 0 0.0126 0.0251 0.0377 0.0502 ...
>>  $ y       : num [1:1002] 2.88e-05 1.26e-02 2.51e-02 3.77e-02 5.02e-02 ...
>>  $ w       : num [1:1002] 1 1 1 1 1 1 1 1 1 1 ...
>>  $ yin     : num [1:1002] 0 0.0126 0.0251 0.0377 0.0502 ...
>>  $ data    :List of 3
>>   ..$ x: num [1:1002] 0 0.0126 0.0251 0.0377 0.0502 ...
>>   ..$ y: num [1:1002] 0 0.0126 0.0251 0.0377 0.0502 ...
>>   ..$ w: num [1:1002] 1 1 1 1 1 1 1 1 1 1 ...
>>  $ lev     : num [1:1002] 0.2238 0.177 0.1399 0. 0.0891 ...
>>  $ cv.crit : num 3.19e-09
>>  $ pen.crit: num 3.03e-06
>>  $ crit    : num 3.19e-09
>>  $ df      : num 27
>>  $ spar    : num -0.457
>>  $ lambda  : num 1.01e-09
>>  $ iparms  : Named int [1:3] 1 0 14
>>   ..- attr(*, "names")= chr [1:3] "icrit" "ispar" "iter"
>>  $ fit     :List of 5
>>   ..$ knot : num [1:31] 0 0 0 0 0.041 ...
>>   ..$ nk   : num 27
>>   ..$ min  : num 0
>>   ..$ range: num 12.6
>>   ..$ coef : num [1:27] 2.88e-05 1.72e-01 5.19e-01 9.04e-01 1.05 ...
>>   ..- attr(*, "class")= chr "smooth.spline.fit"
>>  $ call    : language smooth.spline(x = x, y = y, all.knots = F, nknots = 25)
>>  - attr(*, "class")= chr "smooth.spline"
>>
>> Many thanks!
>>
>> Regards,
>> Mike Nielsen
>
> David Winsemius
> Alameda, CA, USA
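David's suggestion can be sketched with the toy example from the thread. The name knots_x is introduced here for illustration, and the sketch assumes that the fit component keeps the knot, min and range fields shown in the str() output above.

```r
x <- seq(from = 0, to = 4 * pi, length.out = 1002)
y <- sin(x)
ss <- smooth.spline(x, y = y, all.knots = FALSE, nknots = 25)

# fit$knot is stored on a [0, 1] scale with repeated boundary knots;
# unique() drops the repeats, then min and range map back to the x scale
knots_x <- ss$fit$min + ss$fit$range * unique(ss$fit$knot)

plot(x, y, type = "l", col = "grey")
lines(ss, col = "blue")        # fitted spline
abline(v = knots_x, lty = 3)   # knot locations on the x axis
```

The knots are locations on the x axis, not y values. A smoothing spline is a penalized fit and not an interpolant, so it need not pass exactly through the data points at those locations.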
Re: [R] Understanding lm-based analysis of fractional factorial experiments
Kjetil Kjernsmo <kjekje at ifi.uio.no> writes:
> All,
>
> I have just returned to R after a decade of absence, and it is good to see that R has become such a great success! I'm trying to bring Design of Experiments into some aspects of software performance evaluation, and to teach myself that, I picked up "Experiments: Planning, Analysis and Optimization" by Wu and Hamada. I try to reproduce an analysis in the book using lm, but have to conclude I don't understand what lm does in this context, even though I end up at the desired result.
>
> I'm currently using R 2.15.2 on a recent Fedora system, but I get the same result on Debian Wheezy and Debian Squeeze. I think the discussion below can be followed without having the book at hand though.

Just a quick thought (sorry for removing context): what happens if you use sum-to-zero contrasts throughout, i.e.

options(contrasts=c("contr.sum", "contr.poly"))

... ?
Re: [R] How to combine conditional argument and logical argument in R to create subset of data...
Hi,

You can also try this:

Tem3 <- list()
for(i in unique(Tem1[,1])) {
  Tem3[[i]] <- subset(Tem1, Tem1[,1]==i)
  Tem4 <- do.call(rbind, Tem3)
}
head(Tem4)
#      V1 V2
#[1,] 111  1
#[2,] 111  2
#[3,] 111  3
#[4,] 111  4
#[5,] 111 13
#[6,] 111 14

# or
Tem3 <- c(NA,NA)
for(i in unique(Tem1[,1])) {
  Tem2 <- subset(Tem1, Tem1[,1]==i)
  Tem3 <- rbind(Tem3, Tem2)
  Tem5 <- Tem3[-1,]
}
head(Tem5)
# V1 V2
# 111  1
# 111  2
# 111  3
# 111  4
# 111 13
# 111 14

A.K.
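A sketch of what goes wrong in the for loop from the question, using the same Tem1. The names u, res and res2 are introduced here for illustration. length(unique(Tem1[,1])) is the single number 3, so for(i in 3) runs the body once with i = 3 and only the 333 rows are collected; seq_along() produces 1, 2, 3.

```r
V1 <- c(rep(111, 4), rep(222, 4), rep(333, 4), rep(111, 4), rep(222, 4), rep(333, 3))
V2 <- 1:23
Tem1 <- cbind(V1, V2)

u <- unique(Tem1[, 1])
print(length(u))      # one value, so `for (i in length(u))` iterates once
print(seq_along(u))   # the indices the loop was meant to visit

# Corrected loop: visit every unique value and stack the matching rows
res <- NULL
for (i in seq_along(u)) {
  res <- rbind(res, Tem1[Tem1[, 1] == u[i], , drop = FALSE])
}

# Loop-free alternative: order() keeps ties in their original order
res2 <- Tem1[order(Tem1[, 1]), ]
```

Starting from NULL avoids the dummy c(NA, NA) row that has to be removed afterwards.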
Re: [R] Function completely locks up my computer if the input is too big
On Tuesday 05 March 2013 at 15:19 -0800, Benjamin Caldwell wrote:
> Hi all,
>
> Thanks for the suggestions. Updating the function as below to break the problem into chunks seemed to do the trick - perhaps there is a relatively small limit to the size of a vector that R can work with?

On the contrary: it is because the limit supported by R is higher than your computer's memory that the OS tries to allocate too much memory and swaps to death. Admittedly, an OS should be smart enough not to completely freeze in the process, but that's just how it is... (If R did not support such a long vector, you would just get a nice error message, that's all.)

Regards

> Best
>
> rotate <- function(x,y,tilt,threshold){
>   df.main <- data.frame(x,y)
>   if(length(x) > threshold){
>     l <- round(length(x)/threshold, 0)
>     dfchunk <- split(df.main, factor(sort(rank(row.names(df.main))%%l)))
>     n <- length(summary(dfchunk)[,1])
>     xy <- vector("list", n)
>     for (i in 1:n){
>       wk.df <- dfchunk[[i]]
>       x <- wk.df$x
>       y <- wk.df$y
>       d2 <- x^2+y^2
>       rotate.dis <- sqrt(d2)
>       or.rad <- atan(x/y)
>       or.deg <- Rad2Deg(or.rad)
>       or.deg[is.na(or.deg)] <- 0
>       tilt.in <- tilt + or.deg
>       xy[[i]] <- data.frame(Pol2Car(distance=rotate.dis, deg=tilt.in))
>     }
>     xy <- do.call(rbind, xy[1:n])
>   } else {
>     d2 <- x^2+y^2
>     rotate.dis <- sqrt(d2)
>     or.rad <- atan(x/y)
>     or.deg <- Rad2Deg(or.rad)
>     n <- length(or.deg)
>     for(i in 1:n){
>       if(is.na(or.deg[i])==TRUE) {or.deg[i] <- 0}
>     }
>     tilt.in <- tilt + or.deg
>     xy <- data.frame(Pol2Car(distance=rotate.dis, deg=tilt.in))
>   }
>   xy
> }
>
> Ben Caldwell
> Graduate Fellow
> University of California, Berkeley
>
> On Tue, Mar 5, 2013 at 1:44 PM, Peter Alspach peter.alsp...@plantandfood.co.nz wrote:
>> Tena koe Benjamin
>>
>> I haven't looked at your code in detail, but in general ifelse is slow and can generally be avoided. For example,
>>
>> ben <- 1:10^7
>> system.time(BEN <- ifelse(ben<10, NA, -ben))
>>    user  system elapsed
>>    1.31    0.24    1.56
>> system.time({BEN1 <- -ben; BEN1[BEN1 > -10] <- NA})
>>    user  system elapsed
>>    0.17    0.03    0.20
>> all.equal(BEN, BEN1)
>> [1] TRUE
>>
>> HTH ...
>>
>> Peter Alspach
>>
>> -----Original Message-----
>> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Benjamin Caldwell
>> Sent: Wednesday, 6 March 2013 10:18 a.m.
>> To: r-help
>> Subject: [R] Function completely locks up my computer if the input is too big
>>
>>> Dear r-help,
>>>
>>> Somewhere in my innocuous function to rotate an object in Cartesian space I've created a monster that completely locks up my computer (requires a hard reset every time). I don't know if this is a useful description to anyone - the mouse still responds, but not the keyboard and not Windows Explorer.
>>>
>>> The script only does this when the input matrix is large, and so my initial runs didn't catch it, as I used a smaller matrix to speed up the test runs. When I tried an input matrix with a number of dimensions in the same order of magnitude as the data I want to read in, R and my computer choked. This was a surprise for me, as I've always been able to break execution in the past or do other tasks. So I tried it again, and still no dice.
>>>
>>> Now I need the function to work, as subsequent functions/functionality are dependent on it, and I can't see anything on the face of it that would likely cause the issue.
>>>
>>> Any insight on why this happens in general or specifically in my case is appreciated. Running R 15.2, Platform: x86_64-w64-mingw32/x64 (64-bit) on a Windows 7 machine with 4 mb RAM. In the meantime I suppose I'll write a loop to do this function piece-wise for larger data and see if that helps.
>>>
>>> Script is attached and appended below.
>>>
>>> Thanks
>>>
>>> Ben Caldwell
>>>
>>> #compass to polar coordinates
>>> compass2polar <- function(x) {-x+90}
>>>
>>> #degrees (polar) to radians
>>> Deg2Rad <- function(x) {(x*pi)/180}
>>>
>>> # radians to degrees
>>> Rad2Deg <- function (rad) (rad/pi)*180
>>>
>>> # polar to cartesian coordinates - assumes degrees are those from a compass. Output is a list, x and y of equal length
>>> Pol2Car <- function(distance,deg) {
>>>   rad <- Deg2Rad(compass2polar(deg))
>>>   rad <- rep(rad, length(distance))
>>>   x <- ifelse(is.na(distance), NA, distance * cos(rad))
>>>   y <- ifelse(is.na(distance), NA, distance * sin(rad))
>>>   x <- round(x,2)
>>>   y <- round(y,2)
>>>   cartes <- list(x,y)
>>>   name <- c('x','y')
>>>   names(cartes) <- name
>>>   cartes
>>> }
>>>
>>> #rotate an object, with assumed origin at 0,0, in any number of degrees
>>> rotate <- function(x,y,tilt){
>>>   d2 <- x^2+y^2
>>>   rotate.dis <- sqrt(d2)
>>>   or.rad <- atan(x/y)
>>>   or.deg <- Rad2Deg(or.rad)
>>>   n <- length(or.deg)
>>>   for(i in 1:n){
>>>     if(is.na(or.deg[i])==TRUE) {or.deg[i] <- 0}
>>>   }
>>>   # browser()
>>>   tilt.in <- tilt + or.deg
>>>   xy <- Pol2Car(distance=rotate.dis, deg=tilt.in)
>>>   # if(abs(tilt) >= 0) {
>>>   # shift.frame <- cbind(xy$x, xy$y)
>>>   # shift.frame.val <-
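As an aside on memory use, here is a minimal sketch of a fully vectorized rotation about the origin. This is the standard rotation-matrix formula, swapped in for illustration, and not the poster's function: rotate_xy() is a hypothetical name, positive angles turn counter-clockwise (the thread's helpers use compass bearings), and nothing is rounded. It allocates only a handful of vectors of the input length and contains no element-wise loop or ifelse().

```r
# Rotate points (x, y) about the origin by tilt_deg degrees, counter-clockwise
rotate_xy <- function(x, y, tilt_deg) {
  theta <- tilt_deg * pi / 180
  data.frame(
    x = x * cos(theta) - y * sin(theta),
    y = x * sin(theta) + y * cos(theta)
  )
}

pts <- rotate_xy(x = c(1, 0, NA), y = c(0, 2, 1), tilt_deg = 90)
print(pts)
```

Missing coordinates simply propagate as NA, so no special handling is needed for them.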
Re: [R] Understanding lm-based analysis of fractional factorial experiments
Hi,

On Wed, Mar 6, 2013 at 5:46 AM, Kjetil Kjernsmo kje...@ifi.uio.no wrote:
> All,
>
> I have just returned to R after a decade of absence, and it is good to see that R has become such a great success! I'm trying to bring Design of Experiments into some aspects of software performance evaluation, and to teach myself that, I picked up "Experiments: Planning, Analysis and Optimization" by Wu and Hamada. I try to reproduce an analysis in the book using lm, but have to conclude I don't understand what lm does in this context, even though I end up at the desired result.
>
> I'm currently using R 2.15.2 on a recent Fedora system, but I get the same result on Debian Wheezy and Debian Squeeze. I think the discussion below can be followed without having the book at hand though.

I have my doubts...

> I'm working with tables 5.2 and 5.5 in the above mentioned book. Table 5.2 contains data from the "Leaf spring experiment". The dataset is also in this zip file:
> ftp://ftp.wiley.com/public/sci_tech_med/experiments-planning/data%20sets.zip
>
> I've learned from the book that the effects can be found using a linear model and doubling the coefficients. So, I do
>
> leaf <- read.table("/ifi/bifrost/a03/kjekje/fag/experimental-planning/book-datasets/LeafSpring table 5.2.dat", col.names=c("B", "C", "D", "E", "Q", paste("r", 1:3, sep=""), "yavg", "ssq", "lnssq"))
> leaf.lm <- lm(yavg ~ B * C * D * E * Q, data=leaf)

That is complete nonsense:

dim(leaf)
[1] 16 11
length(coef(leaf.lm))
[1] 32

So you are trying to estimate 32 coefficients from 16 data points. That is never going to work.

> leaf.lm
>
> Call:
> lm(formula = yavg ~ B * C * D * E * Q, data = leaf)
>
> Coefficients:
>    (Intercept)              B+              C+              D+              E+
>        7.54000         0.07003         0.32333        -0.09668         0.07668
>             Q+           B+:C+           B+:D+           C+:D+           B+:E+
>       -0.33670         0.01335         0.11995         0.02335              NA
>          C+:E+           D+:E+           B+:Q+           C+:Q+           D+:Q+
>             NA              NA         0.22915        -0.25745         0.28255
>          E+:Q+        B+:C+:D+        B+:C+:E+        B+:D+:E+        C+:D+:E+
>        0.05415              NA              NA              NA              NA
>       B+:C+:Q+        B+:D+:Q+        C+:D+:Q+        B+:E+:Q+        C+:E+:Q+
>        0.04160        -0.16160        -0.18840              NA              NA
>       D+:E+:Q+     B+:C+:D+:E+     B+:C+:D+:Q+     B+:C+:E+:Q+     B+:D+:E+:Q+
>             NA              NA              NA              NA              NA
>    C+:D+:E+:Q+  B+:C+:D+:E+:Q+
>             NA              NA
>
> (seems there is little I can do about the line breaks here, sorry)
>
> However, the book (table 5.5), has 0.221 for the main effect of B and 0.176, and the above is neither this, nor half of it.
>
> Now, I can reproduce what's in the book with
>
> lm(yavg ~ B, data=leaf)
>
> Call:
> lm(formula = yavg ~ B, data = leaf)
>
> Coefficients:
> (Intercept)           B+
>      7.5254       0.2213
>
> lm(yavg ~ C, data=leaf)
>
> Call:
> lm(formula = yavg ~ C, data = leaf)
>
> Coefficients:
> (Intercept)           C+
>      7.5479       0.1763
>
> Assuming lm does in fact double the coefficient in this case,

I have no idea what this means.

> but here the intercept varies, which doesn't seem correct,

You mean that the intercept for lm(yavg ~ B, data=leaf) differs from the intercept for lm(yavg ~ C, data=leaf)? If so, that is expected. The intercept is the expected value of yavg when all predictors are zero. The expected value for B = 0 does not have to be the same as the expected value for C = 0.

> nor can I as trivially find the interactions the same way.

What way?

> Now, I try the effects() function, and get familiar numbers:
>
> effects(leaf.lm)
> (Intercept)          B+          C+          D+          E+          Q+
>   -30.54415    -0.44250     0.35250    -0.05750    -0.20750    -0.51920
>       B+:C+       B+:D+       C+:D+       B+:Q+       C+:Q+       D+:Q+
>    -0.03415    -0.03915     0.07085    -0.16915     0.33085    -0.10755
>       E+:Q+    B+:C+:Q+    B+:D+:Q+    C+:D+:Q+
>     0.05415    -0.02080     0.08080    -0.09420
>
> and indeed, I have verified that effects(leaf.lm)/2 gives me the expected result. So, I have found the correct answer, but I don't understand why.
>
> I have read the documentation for effects() as well as looked through the relevant chapter in Statistical Models in S, but from that all I got was that I suppose there is a hint in the phrase "the effects are the uncorrelated single-degree-of-freedom", and that is somewhat different from the coefficients, but I can't make out from the book (Wu & Hamada) why the coefficients should be any different than the effects; to the contrary, it is quite clear from equation (5.8) in the book that the coefficients they use are effects(leaf.lm)/4.
>
> So, there are at least two points of confusion here, one is how coef() differs from effects() in the case of fractional factorial experiments, and the other is the
[R] Generating unique filenames.
Hi,

I am trying to create unique filenames for my output text files. The idea is that I would like to append a string to ".zsc.txt" so that all my files are uniquely named but have a similar format.

I have tried adding the string variable to ".zsc.txt" while creating the output file name, i.e. in the write.table function. This is what I have tried:

write.table(x, file=str,".",".zsc.txt");

but it isn't working. Would appreciate everyone's input on the matter.

Thanks :)
Re: [R] Generating unique filenames.
On 06.03.2013 15:20, Sahana Srinivasan wrote:
> Hi,
>
> I am trying to create unique filenames for my output text files. The idea is that I would like to append a string to ".zsc.txt" so that all my files are uniquely named but have a similar format.
>
> I have tried adding the string variable to ".zsc.txt" while creating the output file name, i.e. in the write.table function. This is what I have tried:
>
> write.table(x, file=str,".",".zsc.txt");
>
> but it isn't working.

?paste

Uwe Ligges

> Would appreciate everyone's input on the matter.
>
> Thanks :)
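To spell out the ?paste hint: the file argument of write.table() has to be a single character string, so the pieces are glued together first. A minimal sketch follows; the value of str, the toy data frame x and the use of tempdir() are placeholders for illustration.

```r
str <- "sample01"                      # placeholder for the poster's string
x <- data.frame(id = 1:3, z = c(0.2, -1.1, 0.7))

# paste0() concatenates without a separator; paste(str, ".zsc.txt", sep = "")
# is the equivalent spelling for older R versions
out_name <- paste0(str, ".zsc.txt")
out_file <- file.path(tempdir(), out_name)

write.table(x, file = out_file)
print(out_name)
```

In the original call, everything after file=str was passed on as further unnamed arguments to write.table() and did not become part of the file name.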
Re: [R] Generating unique filenames.
Hello,

On Mar 6, 2013, at 9:20 AM, Sahana Srinivasan wrote:
> Hi,
>
> I am trying to create unique filenames for my output text files. The idea is that I would like to append a string to ".zsc.txt" so that all my files are uniquely named but have a similar format.
>
> I have tried adding the string variable to ".zsc.txt" while creating the output file name, i.e. in the write.table function. This is what I have tried:
>
> write.table(x, file=str,".",".zsc.txt");
>
> but it isn't working.

I think you are looking for the paste() or the newish paste0() function to assemble the parts of your filename. In the example below I make a unique name out of a timestamp and a path. You might have a different unique name to use instead of the timestamp. Note: use file.path() to build up a filename that includes a path description.

path <- "/my/own/path"
appendage <- ".zsc.txt"
string <- format(Sys.time(), format = "%Y-%j-%H%M%S")
outputFile <- file.path(path, paste0(string, appendage))
write.table(x, file = outputFile)

Cheers,
Ben

> Would appreciate everyone's input on the matter.
>
> Thanks :)

Ben Tupper
Bigelow Laboratory for Ocean Sciences
60 Bigelow Drive, P.O. Box 380
East Boothbay, Maine 04544
http://www.bigelow.org
Re: [R] lm and Formula tutorial
I may be wrong, but I believe what makes it difficult is that the Help file assumes some background in linear model statistics that you may not have. I suggest that you look for a tutorial on linear models first and then re-read the Help.

Incidentally, the provenance of the syntax is GLIM (correction requested if I'm wrong), and the McCullagh-Nelder book on GLMs has a chapter on linear models and syntax that is relevant, iirc.

-- Bert

On Tue, Mar 5, 2013 at 11:08 PM, Alaios ala...@yahoo.com wrote:
> Dear all,
>
> I was reading the lm and the Formula manual pages last night, and I have to admit that I had a tough time understanding their syntax. Is there a simpler guide for dummies like me to start with?
>
> I would like to thank you in advance for your help
>
> Regards
> Alex

--
Bert Gunter
Genentech Nonclinical Biostatistics
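As a small starting point for the formula syntax itself, the sketch below expands a few formulas into the terms they stand for. The helper name labels_of is introduced here for illustration; terms() and the built-in mtcars data are part of base R.

```r
# Show which model terms a formula expands to
labels_of <- function(f) attr(terms(f), "term.labels")

print(labels_of(y ~ a + b))          # main effects only
print(labels_of(y ~ a * b))          # a + b + a:b, i.e. with interaction
print(labels_of(y ~ (a + b + c)^2))  # main effects and all two-way interactions
print(labels_of(y ~ a * b - a:b))    # "-" removes a term again

# A fitted example: one numeric predictor plus a three-level factor
fit <- lm(mpg ~ wt + factor(cyl), data = mtcars)
print(coef(fit))
```

In a formula, the operators +, *, :, ^ and - describe which terms enter the model; they are not arithmetic. Wrapping an expression in I(), as in y ~ I(a + b), restores the arithmetic meaning.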
Re: [R] Understanding lm-based analysis of fractional factorial experiments
On 03/06/2013 02:50 PM, Ben Bolker wrote:

> Just a quick thought (sorry for removing context): what happens if you use sum-to-zero contrasts throughout, i.e.
>
>   options(contrasts=c("contr.sum", "contr.poly")) ... ?

That works (except for the sign)! What would this mean?

Kjetil
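A small sketch of what the contrast choice changes, using a made-up 2^2 factorial rather than the leaf spring data (the data frame d and its values are assumptions for illustration only). With sum-to-zero contrasts the coefficient is half the classical factorial effect, with the sign flipped because contr.sum codes the first level as +1. With the default treatment contrasts, the "main effect" coefficient in an interaction model is the effect at the baseline level of the other factor only.

```r
# Made-up 2^2 factorial, one run per cell (illustration only).
d <- data.frame(
  A = factor(c("-", "+", "-", "+"), levels = c("-", "+")),
  B = factor(c("-", "-", "+", "+"), levels = c("-", "+")),
  y = c(10, 14, 11, 19)
)

# Classical main effect of A: mean(y at A = "+") - mean(y at A = "-").
effect_A <- mean(d$y[d$A == "+"]) - mean(d$y[d$A == "-"])

# Sum-to-zero contrasts code the first level as +1 and the last as -1.
fit_sum <- lm(y ~ A * B, data = d,
              contrasts = list(A = "contr.sum", B = "contr.sum"))

# Treatment contrasts: the A coefficient is the effect of A at the
# baseline level of B only, not averaged over B.
fit_trt <- lm(y ~ A * B, data = d,
              contrasts = list(A = "contr.treatment", B = "contr.treatment"))

coef(fit_sum)
coef(fit_trt)
```

This is consistent with the remark elsewhere in the thread that "effect = 2 x coefficient" holds only under the orthogonal (-1, 1) parameterization.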
Re: [R] Understanding lm-based analysis of fractional factorial experiments
As Ista indicates, the basic issue is that the OP does not understand linear modeling and is therefore just thrashing around with lm. For example, the statement about effects being double coefficient is only true with the orthogonal (-1,1) parameterization of the contrasts. So I suggest the OP either find some local statistical help or start reading up on linear models, rather than wasting further time and space here. -- Bert On Wed, Mar 6, 2013 at 6:17 AM, Ista Zahn istaz...@gmail.com wrote: Hi, On Wed, Mar 6, 2013 at 5:46 AM, Kjetil Kjernsmo kje...@ifi.uio.no wrote: All, I have just returned to R after a decade of absence, and it is good to see that R has become such a great success! I'm trying to bring Design of Experiments into some aspects of software performance evaluation, and to teach myself that, I picked up Experiments: Planning, Analysis and Optimization by Wu and Hamada. I try to reproduce an analysis in the book using lm, but have to conclude I don't understand what lm does in this context, even though I end up at the desired result. I'm currently using R 2.15.2 on a recent Fedora system, but I get the same result on Debian Wheezy and Debian Squeeze. I think the discussion below can be followed without having the book at hand though. I have my doubts... I'm working with tables 5.2 and 5.5 in the above mentioned book. Table 5.2 contains data from the Leaf spring experiment. The dataset is also in this zip file: ftp://ftp.wiley.com/public/sci_tech_med/experiments-planning/data%20sets.zip I've learned from the book that the effects can be found using a linear model and double the coefficients. 
So, I do leaf - read.table(/ifi/bifrost/a03/kjekje/fag/experimental-planning/book-datasets/LeafSpring table 5.2.dat, col.names=c(B, C, D, E, Q, paste(r, 1:3, sep=), yavg, ssq, lnssq)) leaf.lm - lm(yavg ~ B * C * D * E * Q, data=leaf) That is complete nonsense: dim(leaf) [1] 16 11 length(coef(leaf.lm)) [1] 32 So you are trying to estimate 32 coefficients from 16 data points. That is never going to work. leaf.lm Call: lm(formula = yavg ~ B * C * D * E * Q, data = leaf) Coefficients: (Intercept) B+ C+ D+ E+ 7.54000 0.07003 0.32333-0.09668 0.07668 Q+ B+:C+ B+:D+ C+:D+ B+:E+ -0.33670 0.01335 0.11995 0.02335 NA C+:E+ D+:E+ B+:Q+ C+:Q+ D+:Q+ NA NA 0.22915-0.25745 0.28255 E+:Q+B+:C+:D+B+:C+:E+B+:D+:E+ C+:D+:E+ 0.05415 NA NA NA NA B+:C+:Q+B+:D+:Q+C+:D+:Q+B+:E+:Q+ C+:E+:Q+ 0.04160-0.16160-0.18840 NA NA D+:E+:Q+ B+:C+:D+:E+ B+:C+:D+:Q+ B+:C+:E+:Q+ B+:D+:E+:Q+ NA NA NA NA NA C+:D+:E+:Q+ B+:C+:D+:E+:Q+ NA NA (seems there is little I can do about the line breaks here, sorry) However, the book (table 5.5), has 0.221 for the main effect of B and 0.176, and the above is neither this, nor half of it. Now, I can reproduce what's in the book with lm(yavg ~ B, data=leaf) Call: lm(formula = yavg ~ B, data = leaf) Coefficients: (Intercept) B+ 7.5254 0.2213 lm(yavg ~ C, data=leaf) Call: lm(formula = yavg ~ C, data = leaf) Coefficients: (Intercept) C+ 7.5479 0.1763 Assuming lm does in fact double the coefficient in this case, I have no idea what this means. but here the intercept varies, which doesn't seem correct, You mean that the intercept for lm(yavg ~ B, data=leaf) differs from the intercept for lm(yavg ~ C, data=leaf) ? If so that is expected. The intercept is the expected value of yavg when all predictors are zero. The expected value for B = zero does not have to be the same as the expected value for C = 0. nor can I as trivially find the interactions the same way. What way? 
Now, I try the effects() function, and get familiar numbers: effects(leaf.lm) (Intercept) B+ C+ D+ E+ Q+ -30.54415-0.44250 0.35250-0.05750-0.20750-0.51920 B+:C+ B+:D+ C+:D+ B+:Q+ C+:Q+ D+:Q+ -0.03415-0.03915 0.07085-0.16915 0.33085-0.10755 E+:Q+B+:C+:Q+B+:D+:Q+C+:D+:Q+ 0.05415-0.02080 0.08080-0.09420 and indeed, I have verified that effects(leaf.lm)/2 gives me the expected result. So, I have found the correct answer, but I don't understand why. I have read the documentation for effects() as well as looked through the relevant chapter in Statistical Models in S, but from that all I got was that I suppose there is a hint in the phrase
Re: [R] CARET and NNET fail to train a model when the input is high dimensional
James, I did a fresh install from CRAN to get caret_5.15-61 and ran your code with method.name = nnet and grid.len = 3. I don't get an error, although there were issues: In nominalTrainWorkflow(dat = trainData, info = trainInfo, ... : There were missing values in resampled performance measures. The results had: Resampling results across tuning parameters: size decay ROCSens Spec ROC SD Sens SD Spec SD 1 0 0.521 0.52 0.521 0.0148 0.0312 0.00901 1 1e-04 0.513 0.528 0.498 0.00616 0.00386 0.00552 1 0.10.515 0.522 0.514 0.0169 0.0284 0.0426 3 0 NaNNaNNaNNA NA NA 3 1e-04 NaNNaNNaNNA NA NA 3 0.1NaNNaNNaNNA NA NA 5 0 NaNNaNNaNNA NA NA 5 1e-04 NaNNaNNaNNA NA NA 5 0.1NaNNaNNaNNA NA NA To test more, I ran: test - nnet(trX, trY, size = 3, decay = 0) Error in nnet.default(trX, trY, size = 3, decay = 0) : too many (2107) weights So, you need to pass in MaxNWts to nnet() with a value that let's you fit the model. Off the top of my head, you could use something like: MaxNWts = length(levels(trY))*(max(my.grid$.size) * (nCol + 1) + max(my.grid$.size) + 1) Also, this one of the methods for getting help (the other is to just email me). I also try to keep up on stack exchange too. Max On Tue, Mar 5, 2013 at 9:47 PM, James Jong ribonucle...@gmail.com wrote: The following code fails to train a nnet model in a random dataset using caret: nR - 700 nCol - 2000 myCtrl - trainControl(method=cv, number=3, preProcOptions=NULL, classProbs = TRUE, summaryFunction = twoClassSummary) trX - data.frame(replicate(nR, rnorm(nCol))) trY - runif(1)*trX[,1]*trX[,2]^2+runif(1)*trX[,3]/trX[,4] trY - as.factor(ifelse(sign(trY)0,'X1','X0')) my.grid - createGrid(method.name, grid.len, data=trX) my.model - train(trX,trY,method=method.name ,trace=FALSE,trControl=myCtrl,tuneGrid=my.grid, metric=ROC) print(Done) The error I get is: task 2 failed - arguments imply differing number of rows: 1334, 666 However, everything works if I reduce nR to, say 20. Any thoughts on what may be causing this? 
Is there a place where I could report this bug other than this mailing list?

Here is my session info:

sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
[1] C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] nnet_7.3-5 pROC_1.5.4 caret_5.15-052 foreach_1.4.0
[5] cluster_1.14.3 plyr_1.8 reshape2_1.2.2 lattice_0.20-13

loaded via a namespace (and not attached):
[1] codetools_0.2-8 compiler_2.15.2 grid_2.15.2 iterators_1.0.6
[5] stringr_0.6.2 tools_2.15.2

Thanks, James

-- Max
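The "too many (2107) weights" message can be reproduced by hand, which also gives a value to pass as MaxNWts. The sketch below assumes a single hidden layer, no skip-layer connections, and one output unit; the helper n_weights is hypothetical and not part of nnet or caret.

```r
# Number of weights in a single-hidden-layer network without skip-layer
# connections: each hidden unit has one weight per input plus a bias, and
# each output unit has one weight per hidden unit plus a bias.
n_weights <- function(n_inputs, size, n_outputs = 1) {
  (n_inputs + 1) * size + (size + 1) * n_outputs
}

# replicate(nR, rnorm(nCol)) yields nCol rows and nR columns, so the
# example data has 700 predictors (and 2000 rows).
n_weights(n_inputs = 700, size = 3)   # 2107, the count in the error message

# Largest hidden layer in the tuning grid shown above (sizes 1, 3, 5).
max_weights <- n_weights(n_inputs = 700, size = 5)

# Assumed usage, not run here: extra arguments to train() go on to nnet().
# my.model <- train(trX, trY, method = "nnet", MaxNWts = max_weights, ...)
```

Max's expression with nCol + 1 gives a larger number than strictly needed for this data, which is harmless since MaxNWts is only an upper limit.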
Re: [R] combining column having same values
Hello,

Try

  x <- scan(text = "1 1 3 2 3 1 1 2 3 3 2")
  sapply(unique(x), function(.x) which(x == .x))

Hope this helps,
Rui Barradas

On 06-03-2013 11:26, eliza botto wrote:

> Dear useRs, I have a matrix in the following form
>
>   [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]
>      1    1    3    2    3    1    1    2    3     3     2
>
> and the following is my desired output (combining the column indices that have the same value):
>
>   a <- 1,2,6,7
>   b <- 3,5,9,10
>   c <- 4,8,11
>
> Thanks in advance, Elisa
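An alternative sketch using split(), which returns a named list directly. The difference from the sapply() version is the ordering: split() sorts the groups by value, while unique() keeps order of first appearance.

```r
x <- c(1, 1, 3, 2, 3, 1, 1, 2, 3, 3, 2)

# split() groups the positions by value; the list names are the distinct
# values, in sorted order.
groups <- split(seq_along(x), x)
groups
```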
Re: [R] Understanding lm-based analysis of fractional factorial experiments
On Mar 6, 2013, at 4:46 AM, Kjetil Kjernsmo kje...@ifi.uio.no wrote: All, I have just returned to R after a decade of absence, and it is good to see that R has become such a great success! I'm trying to bring Design of Experiments into some aspects of software performance evaluation, and to teach myself that, I picked up Experiments: Planning, Analysis and Optimization by Wu and Hamada. I try to reproduce an analysis in the book using lm, but have to conclude I don't understand what lm does in this context, even though I end up at the desired result. I'm currently using R 2.15.2 on a recent Fedora system, but I get the same result on Debian Wheezy and Debian Squeeze. I think the discussion below can be followed without having the book at hand though. I'm working with tables 5.2 and 5.5 in the above mentioned book. Table 5.2 contains data from the Leaf spring experiment. The dataset is also in this zip file: ftp://ftp.wiley.com/public/sci_tech_med/experiments-planning/data%20sets.zip I've learned from the book that the effects can be found using a linear model and double the coefficients. So, I do leaf - read.table(/ifi/bifrost/a03/kjekje/fag/experimental-planning/book-datasets/LeafSpring table 5.2.dat, col.names=c(B, C, D, E, Q, paste(r, 1:3, sep=), yavg, ssq, lnssq)) leaf.lm - lm(yavg ~ B * C * D * E * Q, data=leaf) leaf.lm I'll ignore the rest of your question, in the hope that this will answer them sufficiently. You probably want a simple linear model, specified in R using + instead of *. leaf.lm - lm(yavg ~ B + C + D + E + Q, data=leaf) leaf.lm Call: lm(formula = yavg ~ B + C + D + E + Q, data = leaf) Coefficients: (Intercept) B+ C+ D+ E+ Q+ 7.50084 0.22125 0.17625 0.02875 0.10375 -0.25960 Does this give you the numbers you expect? 
Peter

> Best regards, Kjetil
> -- Kjetil Kjernsmo PhD Research Fellow, University of Oslo, Norway Semantic Web / SPARQL Query Federation kje...@ifi.uio.no http://www.kjetil.kjernsmo.net/
Re: [R] Understanding lm-based analysis of fractional factorial experiments
On 03/06/2013 04:18 PM, Peter Claussen wrote:

> I'll ignore the rest of your question, in the hope that this will answer them sufficiently.

OK!

> You probably want a simple linear model, specified in R using + instead of *.
>
>   leaf.lm <- lm(yavg ~ B + C + D + E + Q, data=leaf)
>   leaf.lm
>
>   Call: lm(formula = yavg ~ B + C + D + E + Q, data = leaf)
>   Coefficients:
>   (Intercept)       B+       C+       D+       E+       Q+
>       7.50084  0.22125  0.17625  0.02875  0.10375 -0.25960
>
> Does this give you the numbers you expect?

Well, it partly gives the numbers I expect, but I want the interactions as well, so it is only a partial answer.

Best, Kjetil
Re: [R] Understanding lm-based analysis of fractional factorial experiments
On Mar 6, 2013, at 9:23 AM, Kjetil Kjernsmo kje...@ifi.uio.no wrote:

> On 03/06/2013 04:18 PM, Peter Claussen wrote:
>> I'll ignore the rest of your question, in the hope that this will answer them sufficiently.
>
> OK!
>
>> You probably want a simple linear model, specified in R using + instead of *.
>>
>>   leaf.lm <- lm(yavg ~ B + C + D + E + Q, data=leaf)
>>   leaf.lm
>>
>>   Call: lm(formula = yavg ~ B + C + D + E + Q, data = leaf)
>>   Coefficients:
>>   (Intercept)       B+       C+       D+       E+       Q+
>>       7.50084  0.22125  0.17625  0.02875  0.10375 -0.25960
>>
>> Does this give you the numbers you expect?
>
> Well, it partly gives the numbers I expect, but I want the interactions as well, so it is only a partial answer.

But you don't have enough data points to estimate all of the possible interactions; that's why you have NA in your original results. You could add just the first order interactions manually, i.e., + B:C + B:D …

Peter

> Best, Kjetil
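Rather than typing each B:C term by hand, the formula operator ^ expands to all main effects plus interactions up to the given order. The sketch below only inspects the formula with terms(), so it needs no data. Whether every one of these terms is estimable in the leaf spring design is a separate question: the NA entries in the original output suggest some two-factor interactions are aliased, and lm() would still report NA for those.

```r
# (B + C + D + E + Q)^2 expands to all main effects plus all two-factor
# interactions; terms() shows the expansion without needing any data.
f <- yavg ~ (B + C + D + E + Q)^2
labels <- attr(terms(f), "term.labels")
labels
length(labels)   # 5 main effects + choose(5, 2) = 10 interactions
```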
[R] Troubles with labeling x axis
Hi!

I have problems with labeling the x axis while plotting time series data. I have 40 monthly measurements. One period lasts 4 months. I'd like to have 40 ticks on the x axis (10 larger, the rest smaller) and labels just at the beginning of each period, just like in the image http://r.789695.n4.nabble.com/file/n4660465/2221.jpg

My code leaves the x axis empty:

  data <- read.csv(file="CSV files/Komen.csv", head=TRUE, sep=";")
  dataTimeSeries <- ts(data, frequency=12, start=c(2000,4))
  dataTimeSeries
       Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
  2000               7  45  47   3  24 132  35  32  28
  2001 161  48  31  33 161 154 420  19 149  44  54  16
  2002 152  94  43  64 193  85  98  77 236  87  72  47
  2003 196 120  51  27 143  99  56
  require(graphics)
  plot.ts(dataTimeSeries, xaxt="n", xlab="Perioda", ylab="Opazovane vrednosti", type='l', col='red')
  axis(side=1, at=seq(1,40,4), labels=seq(1,10,1))

Thanks in advance for any help!
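One likely reason the axis comes out empty: a ts object is plotted against its time index (2000.25, 2000.33, ...), so tick positions 1, 5, 9, ... fall far outside the plotted range. A minimal sketch with made-up values standing in for the CSV data (the series y and the tick lengths are assumptions for illustration):

```r
# Stand-in for the 40 monthly observations (made-up values).
set.seed(1)
y <- ts(round(runif(40, 0, 200)), frequency = 12, start = c(2000, 4))

# A ts object is plotted against time(y), so tick positions must be given
# on that scale rather than as 1..40.
tt <- as.numeric(time(y))
period_start <- tt[seq(1, 40, by = 4)]   # first month of each 4-month period

plot(y, xaxt = "n", xlab = "Perioda", ylab = "Opazovane vrednosti", col = "red")
axis(side = 1, at = tt, labels = FALSE, tcl = -0.25)          # 40 small ticks
axis(side = 1, at = period_start, labels = 1:10, tcl = -0.6)  # 10 labelled ticks
```

The first axis() call draws the minor ticks without labels; the second overlays longer, labelled ticks at the start of each period.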
Re: [R] How to combine conditional argument and logical argument in R to create subset of data...
Hi Arun Thank you so much for the help, that's really helpful!! Also I have a quick question about the code below where I can not see why it doesn't work... I know the I shou V1-c(rep(111,4),rep(222,4),rep(333,4),rep(111,4),rep(222,4),rep(333,3)) V2-c(1:23) Tem1-cbind(V1,V2) So Tem 1 looks like... Tem1 V1 V2 [1,] 111 1 [2,] 111 2 [3,] 111 3 [4,] 111 4 [5,] 222 5 [6,] 222 6 [7,] 222 7 [8,] 222 8 [9,] 333 9 [10,] 333 10 [11,] 333 11 [12,] 333 12 [13,] 111 13 [14,] 111 14 [15,] 111 15 [16,] 111 16 [17,] 222 17 [18,] 222 18 [19,] 222 19 [20,] 222 20 [21,] 333 21 [22,] 333 22 [23,] 333 23 I would like the outcome to be... V1 V2 111 1 111 2 111 3 111 4 111 13 111 14 111 15 111 16 222 5 222 6 222 7 222 8 222 17 222 18 222 19 222 20 333 9 333 10 333 11 333 12 333 21 333 22 333 23 So I tried code as below -- Tem3-c(NA,NA) for(i in length(unique(Tem1[,1]))){ Tem2-subset(Tem1,Tem1[,1]==unique(Tem1[,1])[i]) Tem3-rbind(Tem3,Tem2) Tem3 } Tem4-Tem3[-1,] --- And only get this... V1 V2 333 9 333 10 333 11 333 12 333 21 333 22 333 23 I tried to run the code step by step, e.g. letting i=1, then i=2, then i= 3, and updating my Tem3, I did get what I wanted, but wondered why in the loop above it did not work...?? Many thanks in advance! HJ On Wed, Mar 6, 2013 at 4:36 AM, arun smartpink...@yahoo.com wrote: Hi, b[b[,4]15 (b[,1]4|is.na(b[,1])) (b[,2]4|is.na(b[,2])),] #[,1] [,2] [,3] [,4] [,5] #[1,]6 NA NA 16 20 #[2,] NA5 NA 17 21 A.K. - Original Message - From: HJ YAN yhj...@googlemail.com To: r-help@r-project.org Cc: Sent: Tuesday, March 5, 2013 9:33 PM Subject: [R] How to combine conditional argument and logical argument in R to create subset of data... Dear R user I have data created using code below b-matrix(2:21,nrow=4) b[,1:3]=NA b[4,2]=5 b[3,1]=6 Now the data is b [,1] [,2] [,3] [,4] [,5] [1,] NA NA NA 14 18 [2,] NA NA NA 15 19 [3,] 6 NA NA 16 20 [4,] NA5 NA17 21 I want to keep data in column 4 greater than 15 and the value in column 1 2 either greater than 4 or is 'NA'. 
So I would like to have my outcome as below...

  [3,]  6 NA NA 16 20
  [4,] NA  5 NA 17 21

I thought something like the code below was going to work, but it only returns the last row, e.g. NA 5 NA 17 21.

  bb <- b[which( (b[,2]>4 | b[,2]==NA) & (b[,1]>4 | b[,1]==NA) & b[,4]>15) ,]

Please could anyone help? Many thanks in advance

HJ
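Two pitfalls in the code quoted in this thread can be shown in a few lines. First, x == NA never evaluates to TRUE, so is.na() is needed to test for missing values. Second, for (i in length(v)) runs the body once, for the last index only, which matches the output that contained only the 333 rows. The variable names keep, pieces and Tem_sorted below are introduced for this sketch.

```r
b <- matrix(2:21, nrow = 4)
b[, 1:3] <- NA
b[4, 2] <- 5
b[3, 1] <- 6

# `x == NA` is always NA, never TRUE; is.na() is the test for missingness.
keep <- b[, 4] > 15 &
  (b[, 1] > 4 | is.na(b[, 1])) &
  (b[, 2] > 4 | is.na(b[, 2]))
b[keep, ]

# for (i in length(v)) visits only the last index; seq_along(v) visits all.
V1 <- c(rep(111, 4), rep(222, 4), rep(333, 4),
        rep(111, 4), rep(222, 4), rep(333, 3))
Tem1 <- cbind(V1, V2 = 1:23)
pieces <- list()
for (i in seq_along(unique(Tem1[, 1]))) {
  pieces[[i]] <- Tem1[Tem1[, 1] == unique(Tem1[, 1])[i], , drop = FALSE]
}
Tem4 <- do.call(rbind, pieces)

# The same result without a loop: order() keeps ties in original order.
Tem_sorted <- Tem1[order(Tem1[, 1]), ]
```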
[R] Friedman test in R
Dear R users,

I am new to R and looking into using a Friedman test in R with post-hoc analysis for a time series dataset in which I am looking at changes in multiple features over three time points in a number of individuals. As well as detecting whether there is an overall difference in features between the three time points, I want to determine which features change significantly and between which time points. I therefore need to perform a post-hoc test such as the Wilcoxon signed-rank test.

I am having trouble formatting my data and applying the formula:

  friedman.test(y ~ A | B)

I think that y should be the feature measurements, A the time points and B the subjects. The data look something like this:

  subject timepoint feature1 feature2 feature3 feature4 ..
  1       1         26       32       43       45
  1       2         45       63        3       87
  1       3         23       22        4       94
  2       1         76       44       79       79
  2       2         56       56        8       76
  2       3         87       23        7       67
  etc.

My question is how I could read this table into R in a format that would allow the above test to be performed. Also, is there any way I can perform post-hoc Wilcoxon signed-rank tests to determine which features are different and between which time points?

Thanks very much in advance for any help you can offer!
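A possible sketch for the layout described above, using simulated numbers rather than the poster's data (the data frame dat, the number of subjects, and the choice of Holm adjustment are all assumptions). It keeps the long format, runs one Friedman test per feature column, and then runs paired Wilcoxon tests between time points for one feature.

```r
# Made-up data in the "long" layout described above: one row per
# subject x timepoint, one column per feature (illustration only).
set.seed(42)
dat <- expand.grid(timepoint = factor(1:3), subject = factor(1:8))
dat$feature1 <- rnorm(nrow(dat), mean = 50, sd = 10)
dat$feature2 <- rnorm(nrow(dat), mean = 40, sd = 10)

# One Friedman test per feature: response, groups (time), blocks (subject).
feature_cols <- c("feature1", "feature2")
friedman_p <- sapply(feature_cols, function(f) {
  friedman.test(dat[[f]], groups = dat$timepoint, blocks = dat$subject)$p.value
})

# Post-hoc paired Wilcoxon signed-rank tests between time points for one
# feature; rows are ordered by subject within each time point so that
# pairs line up.
dat <- dat[order(dat$timepoint, dat$subject), ]
posthoc <- pairwise.wilcox.test(dat$feature1, dat$timepoint,
                                paired = TRUE, p.adjust.method = "holm")
posthoc$p.value
```

A whitespace-separated file with the same column names could be read with read.table() and header = TRUE; subject and timepoint would then need converting with factor(). With many features, the per-feature Friedman p-values probably need a multiplicity adjustment of their own, for example via p.adjust().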
Re: [R] Ggplot2: Moving legend, change fill and removal of space between plots when using grid.arrange() possible use of facet_grid?
Dear Anna, Is this what you would like? Summ - ddply(mydata, .(factor3,factor1), summarize, mean = mean(var1, na.rm = FALSE), sdv = sd(var1, na.rm = FALSE), se = 1.96*(sd(var1, na.rm=FALSE)/sqrt(length(var1 Summ$Grouping - c(AB, AB, CD, CD, EF, EF)[Summ$factor1] Summ$factor1bis - c(0, 1, 0, 1, 0, 1)[Summ$factor1] ggplot(Summ, aes(factor3, mean, group = factor1bis, shape = factor1bis, linetype = factor1bis, ymin = mean - sdv , ymax = mean + sdv)) + geom_point(position = position_dodge(width = 0.25), size = 3) + geom_line(position = position_dodge(width = 0.25)) + geom_errorbar(width = 0.3, position = position_dodge(width = 0.25), size = 0.3) + facet_wrap(~Grouping, ncol = 2) + theme_bw() + ylab(expression(paste(my measured stuff))) + xlab(factor3) + labs(shape = factor1, group = factor1, linetype = factor1) Best regards, ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest team Biometrie Kwaliteitszorg / team Biometrics Quality Assurance Kliniekstraat 25 1070 Anderlecht Belgium + 32 2 525 02 51 + 32 54 43 61 85 thierry.onkel...@inbo.be www.inbo.be To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey -Oorspronkelijk bericht- Van: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] Namens Anna Zakrisson Verzonden: woensdag 6 maart 2013 13:33 Aan: r-help@r-project.org Onderwerp: [R] Ggplot2: Moving legend, change fill and removal of space between plots when using grid.arrange() possible use of facet_grid? Hi, # For publications, I am not allowed to repeat the axes. I have tried to remove the axes using: # yaxt=n, but it did not work. 
I have not understood how to do this in ggplot2. Can you help me? # I also do not want loads of space between the graphs (see below script with Dummy Data). # If I could make it look like the examples on the (nice) examples page: # http://www.ling.upenn.edu/~joseff/rstudy/summer2010_ggplot2_intro.html # using the facet_grid(), I would be very very happy. # I also do not want the gemoetric points to be filled and the fill=white commande # does not seem to work - why? and are there alternatives? #Furthermore, I would like to add legends to inside the plot area instead of on the side. Like when you use plotrix() and brkdn.plot: legend(topright, c(A, B), pch=c(0,1), bg=white, lty = 1:2, cex=1, bty=n) # This did not work in ggplot2. What are my alternatives. I have extensively searched the internet and have I missed something obvious, it was due to # tiredness and not to lazyness. # Some dummy data: mydata- data.frame(factor1 = factor(rep(LETTERS[1:6], each = 80)), factor2 = factor(rep(c(1:5), each = 16)), factor3 = factor(rep(c(1:4), each = 4)), var1 = rnorm(120, mean = rep(c(0, 3, 5), each = 40), sd = rep(c(1, 2, 3), each = 20)), var2 = rnorm(120, mean = rep(c(6, 7, 8), each = 40), sd = rep(c(1, 2, 3), each = 20))) # Splitting data into 3 data frames (based on factor1) # If I could do this using for example facet_wrap() or facet_grid(), I would be very # happy! I have tried but failed that method. 
DataAB - mydata[(mydata$factor1) %in% c(A, B), ] DataCD - mydata[(mydata$factor1) %in% c(C, D), ] DataEF - mydata[(mydata$factor1) %in% c(E, F), ] DataAB library(plyr) library(ggplot2) #Plot: levels A and B: # Summary (means etc) SummAB - ddply(DataAB, .(factor3,factor1), summarize, mean = mean(var1, na.rm = FALSE), sdv = sd(var1, na.rm = FALSE), se = 1.96*(sd(var1, na.rm=FALSE)/sqrt(length(var1 SummAB p1 - ggplot(SummAB, aes(factor3, mean, colour = factor1, group = factor1, shape = factor1)) + geom_point(aes(shape=factor(factor1)), color=black, fill=white, position = dodge, width = 0.3, size=3) + geom_line(aes(linetype=factor1), color = black, size = 0.5) + geom_errorbar(aes(ymin = mean - sdv , ymax = mean + sdv), width = 0.3, position = dodge, color = black, size=0.3) + theme_bw() + ylab(expression(paste(my measured stuff))) + xlab(factor3) + ggtitle() + labs(color = factor1, shape = factor1, group = factor1, linetype = factor1) p1 #Plot: levels C and D: # Summary (means etc) SummCD - ddply(DataCD, .(factor3,factor1), summarize, mean = mean(var1, na.rm = FALSE), sdv = sd(var1, na.rm = FALSE),
[R] Difficulty in caper: Error in phy$node.label[which(newNb > 0) - Ntip]
Hello,

I'm doing a comparative analysis of mammal brain and body size data. I'm following Charlie Nunn and Natalie Cooper's instructions for running PGLS in R using caper. I run into the following error when I create my comparative dataset, combining my phylogenetic tree (mammaltree) and taxon measures (mammaldata):

  Error in phy$node.label[which(newNb > 0) - Ntip] :
    only 0's may be mixed with negative subscripts

My full script is provided at the bottom. I have looked at the caper manual by David Orme to understand how comparative.data() constructs the dataset, but still cannot interpret the error. Many thanks to anyone who could provide me with insight.

Nicole Thompson
E3B Columbia University

  library(caper)
  Loading required package: ape
  Loading required package: MASS
  Loading required package: mvtnorm
  mammaldata <- read.csv("R.Mammal_data.csv", header = TRUE)
  mammaltree <- read.nexus("BEphylotree.nex")
  mammal <- comparative.data(phy = mammaltree, data = mammaldata, names.col = Taxon, vcv = TRUE, na.omit = FALSE, warn.dropped = TRUE) #names.col?
  Error in phy$node.label[which(newNb > 0) - Ntip] :
    only 0's may be mixed with negative subscripts

--
Nicole A Thompson
E3B Columbia University, NYCEP
nat2...@columbia.edu
480.522.4212
[R] R Gui frond has stopped working
I use R (2.15.2) on Windows 7 (64-bit) for data mining with the xcms package. This is a routine process for me and I didn't encounter any problems until a few weeks ago, when I got the error message "R GUI front-end has stopped working". Windows gave the following information to describe the error:

  \AppData\Local\Temp\WER5936.tmp.WERInternalMetadata.xml
  \AppData\Local\Temp\WER7243.tmp.appcompat.txt
  \AppData\Local\Temp\WER7282.tmp.mdmp

I have uninstalled and reinstalled R a couple of times, but it did not help. Any ideas how to resolve this issue? I even tried switching to 2.15.3 and to 2.15.1, but I still get the same error message.

Thanks in advance for any help
Re: [R] About basic logical operators
On 05/03/2013 7:53 PM, Victor hyk wrote: Hello everyone, I have a basic question regarding logical operators. x-seq(-1,1,by=0.02) x [1] -1.00 -0.98 -0.96 -0.94 -0.92 -0.90 -0.88 -0.86 -0.84 -0.82 -0.80 -0.78 [13] -0.76 -0.74 -0.72 -0.70 -0.68 -0.66 -0.64 -0.62 -0.60 -0.58 -0.56 -0.54 [25] -0.52 -0.50 -0.48 -0.46 -0.44 -0.42 -0.40 -0.38 -0.36 -0.34 -0.32 -0.30 [37] -0.28 -0.26 -0.24 -0.22 -0.20 -0.18 -0.16 -0.14 -0.12 -0.10 -0.08 -0.06 [49] -0.04 -0.02 0.00 0.02 0.04 0.06 0.08 0.10 0.12 0.14 0.16 0.18 [61] 0.20 0.22 0.24 0.26 0.28 0.30 0.32 0.34 0.36 0.38 0.40 0.42 [73] 0.44 0.46 0.48 0.50 0.52 0.54 0.56 0.58 0.60 0.62 0.64 0.66 [85] 0.68 0.70 0.72 0.74 0.76 0.78 0.80 0.82 0.84 0.86 0.88 0.90 [97] 0.92 0.94 0.96 0.98 1.00 x[x=0.02] [1] -1.00 -0.98 -0.96 -0.94 -0.92 -0.90 -0.88 -0.86 -0.84 -0.82 -0.80 -0.78 [13] -0.76 -0.74 -0.72 -0.70 -0.68 -0.66 -0.64 -0.62 -0.60 -0.58 -0.56 -0.54 [25] -0.52 -0.50 -0.48 -0.46 -0.44 -0.42 -0.40 -0.38 -0.36 -0.34 -0.32 -0.30 [37] -0.28 -0.26 -0.24 -0.22 -0.20 -0.18 -0.16 -0.14 -0.12 -0.10 -0.08 -0.06 [49] -0.04 -0.02 0.00 x[x0.2] [1] -1.00 -0.98 -0.96 -0.94 -0.92 -0.90 -0.88 -0.86 -0.84 -0.82 -0.80 -0.78 [13] -0.76 -0.74 -0.72 -0.70 -0.68 -0.66 -0.64 -0.62 -0.60 -0.58 -0.56 -0.54 [25] -0.52 -0.50 -0.48 -0.46 -0.44 -0.42 -0.40 -0.38 -0.36 -0.34 -0.32 -0.30 [37] -0.28 -0.26 -0.24 -0.22 -0.20 -0.18 -0.16 -0.14 -0.12 -0.10 -0.08 -0.06 [49] -0.04 -0.02 0.00 0.02 0.04 0.06 0.08 0.10 0.12 0.14 0.16 0.18 [61] 0.20 Why does x[x=0.02] return no 0.02 You don't have a 0.02 in your dataset. Evaluate x[52] - 0.02 and you won't get zero due to rounding (as Jeff said, see FAQ 7.31). but x[x0.2] return a subsample with 0.02? You don't have 0.2, either. Evaluate x[61] - 0.2 and you get a negative value. Duncan Murdoch Anyone who can tell me why? Thanks! 
Victor
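A short sketch of the usual workaround for the comparison problem discussed above (R FAQ 7.31): compare with a small tolerance rather than relying on exact equality of doubles. The helper le_tol and the tolerance 1e-8 are assumptions chosen for illustration; a suitable tolerance depends on the scale of the data.

```r
x <- seq(-1, 1, by = 0.02)

# The printed value looks like 0.02, but the stored double can differ in
# the last bits, so exact comparison is unreliable (R FAQ 7.31).
x[52] - 0.02

# Compare with a tolerance instead of relying on exact equality.
tol <- 1e-8
le_tol <- function(a, b) a <= b + tol    # hypothetical helper
x_le <- x[le_tol(x, 0.02)]

# all.equal() is the standard way to test "equal up to rounding".
isTRUE(all.equal(x[52], 0.02))
```

Another common option is to build the sequence from integers and scale it, e.g. (-50:50) / 50, and to compare on the integer scale.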
[R] print justify
Hi everyone,

I'm trying to print a table justified to the left, but it doesn't work. Any hints?

  KennArt <- data.frame(NR=c(171,172,174,175,176,177,181,411,980),
    TYP=c("Körnermais", "Corn Cob Mix", "Zuckermais",
          "Mischanbau (Silo)Mais/Sonnenblumen",
          "Mais mit Bejagungsschneise in gutem landwirtschaftlichen und ökologischen Zustand",
          "Mais mit Bejagungsschneise (Kulturpflanze)", "Hirse",
          "Silomais (Als Hauptfutter)", "Sudangras"))
  print(KennArt, justify="left")

still justifies to the right:

     NR                                                                               TYP
  1 171                                                                        Körnermais
  2 172                                                                      Corn Cob Mix
  3 174                                                                        Zuckermais
  4 175                                                Mischanbau (Silo)Mais/Sonnenblumen
  5 176 Mais mit Bejagungsschneise in gutem landwirtschaftlichen und ökologischen Zustand
  6 177                                        Mais mit Bejagungsschneise (Kulturpflanze)
  7 181                                                                             Hirse
  8 411                                                        Silomais (Als Hauptfutter)
  9 980                                                                         Sudangras

  print(KennArt[2:3,], justify="left")

doesn't leftify either, so it's not the German letters' fault.

  format(KennArt, justify="left")

does the job mostly, but the column names are still rightified... This solution is fine for me now, but I'm still wondering...

sessionInfo() returns:

  R version 2.15.1 (2012-06-22)
  Platform: x86_64-pc-mingw32/x64 (64-bit)
  locale:
  [1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252 LC_MONETARY=German_Germany.1252
  [4] LC_NUMERIC=C LC_TIME=German_Germany.1252
  attached base packages:
  [1] graphics grDevices datasets utils stats methods base
  other attached packages:
  [1] foreign_0.8-52 fortunes_1.5-0 BerryFunctions_1.0 evd_2.3-0
  loaded via a namespace (and not attached):
  [1] tools_2.15.1

Thanks ahead,
Berry
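One thing that may be worth trying: print.data.frame() has a right argument (default TRUE) that controls the alignment of strings, which is separate from the justify argument of format(). A minimal sketch on a shortened version of the table (the three rows kept here are just a subset for illustration):

```r
KennArt <- data.frame(
  NR  = c(172, 181, 980),
  TYP = c("Corn Cob Mix", "Hirse", "Sudangras")
)

# right = FALSE asks print.data.frame() to left-align the entries.
print(KennArt, right = FALSE)
```

As far as I know this left-aligns the column headers as well, which format(KennArt, justify = "left") does not do.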
Re: [R] How to combine conditional argument and logical argument in R to create subset of data...
Hi,
No problem.

V1 <- rep(c(rep(111,5),rep(222,5),rep(333,5)),2)
length(V1)
#[1] 30
V2 <- c(1:30) #should be the same length as V1
Tem1 <- cbind(V1,V2)
Tem2 <- Tem1[1:20,]
Tem1[!Tem1[,2]%in%Tem2[,2],]
#       V1 V2
# [1,] 222 21
# [2,] 222 22
# [3,] 222 23
# [4,] 222 24
# [5,] 222 25
# [6,] 333 26
# [7,] 333 27
# [8,] 333 28
# [9,] 333 29
#[10,] 333 30
#or
subset(Tem1,!V2%in% Tem2[,2])
#or
Tem1[is.na(match(Tem1[,2],Tem2[,2])),]
#       V1 V2
# [1,] 222 21
# [2,] 222 22
# [3,] 222 23
# [4,] 222 24
# [5,] 222 25
# [6,] 333 26
# [7,] 333 27
# [8,] 333 28
# [9,] 333 29
#[10,] 333 30

A.K.

From: HJ YAN yhj...@googlemail.com
To: arun smartpink...@yahoo.com
Sent: Wednesday, March 6, 2013 10:33 AM
Subject: Re: [R] How to combine conditional argument and logical argument in R to create subset of data...

Thank you SO MUCH Arun!!!

That's brilliant -- I've learnt some very useful new R commands now, e.g. 'do.call' and 'split'. And I see where my code went wrong now. I greatly appreciate your prompt reply.

Also, I wonder if there exists a package that can find the difference between two data frames, e.g. when one is a subset of the other? E.g.

V1 <- rep(c(rep(111,5),rep(222,5),rep(333,5)),2)
V2 <- c(1:23)
Tem1 <- cbind(V1,V2)
Tem2 <- Tem1[1:20,]

How do I get an outcome like

[21,] 333 21
[22,] 333 22
[23,] 333 23

P.S. I used 'setdiff' before, but it seems it only works for vectors and not for data frames??

Sorry for so many questions today, as I'm coding for a work deadline tonight.

Many thanks!
Cheers
HJ

On Wed, Mar 6, 2013 at 1:55 PM, arun smartpink...@yahoo.com wrote:

Hi,
You can also try this:

Tem3 <- list()
for(i in unique(Tem1[,1])) {
  Tem3[[i]] <- subset(Tem1,Tem1[,1]==i)
  Tem4 <- do.call(rbind,Tem3)
}
head(Tem4)
#      V1 V2
#[1,] 111  1
#[2,] 111  2
#[3,] 111  3
#[4,] 111  4
#[5,] 111 13
#[6,] 111 14

#or
Tem3 <- c(NA,NA)
for(i in unique(Tem1[,1])) {
  Tem2 <- subset(Tem1, Tem1[,1]==i)
  Tem3 <- rbind(Tem3,Tem2)
  Tem5 <- Tem3[-1,]
}
head(Tem5)
#  V1 V2
# 111  1
# 111  2
# 111  3
# 111  4
# 111 13
# 111 14

A.K.
From: HJ YAN yhj...@googlemail.com
To: arun smartpink...@yahoo.com
Cc: r-help@r-project.org
Sent: Wednesday, March 6, 2013 8:24 AM
Subject: Re: [R] How to combine conditional argument and logical argument in R to create subset of data...

Hi Arun

Thank you so much for the help, that's really helpful!!

Also I have a quick question about the code below where I can not see why it doesn't work... I know the I shou

V1 <- c(rep(111,4),rep(222,4),rep(333,4),rep(111,4),rep(222,4),rep(333,3))
V2 <- c(1:23)
Tem1 <- cbind(V1,V2)

So Tem1 looks like...

Tem1
       V1 V2
 [1,] 111  1
 [2,] 111  2
 [3,] 111  3
 [4,] 111  4
 [5,] 222  5
 [6,] 222  6
 [7,] 222  7
 [8,] 222  8
 [9,] 333  9
[10,] 333 10
[11,] 333 11
[12,] 333 12
[13,] 111 13
[14,] 111 14
[15,] 111 15
[16,] 111 16
[17,] 222 17
[18,] 222 18
[19,] 222 19
[20,] 222 20
[21,] 333 21
[22,] 333 22
[23,] 333 23

I would like the outcome to be...

 V1 V2
111  1
111  2
111  3
111  4
111 13
111 14
111 15
111 16
222  5
222  6
222  7
222  8
222 17
222 18
222 19
222 20
333  9
333 10
333 11
333 12
333 21
333 22
333 23

So I tried code as below

--
Tem3 <- c(NA,NA)
for(i in length(unique(Tem1[,1]))){
  Tem2 <- subset(Tem1,Tem1[,1]==unique(Tem1[,1])[i])
  Tem3 <- rbind(Tem3,Tem2)
  Tem3
}
Tem4 <- Tem3[-1,]
---

And only get this...

 V1 V2
333  9
333 10
333 11
333 12
333 21
333 22
333 23

I tried to run the code step by step, e.g. letting i=1, then i=2, then i=3, and updating my Tem3, I did get what I wanted, but wondered why in the loop above it did not work...??

Many thanks in advance!
HJ

On Wed, Mar 6, 2013 at 4:36 AM, arun smartpink...@yahoo.com wrote:

Hi,

b[b[,4]>15 & (b[,1]>4|is.na(b[,1])) & (b[,2]>4|is.na(b[,2])),]
#     [,1] [,2] [,3] [,4] [,5]
#[1,]    6   NA   NA   16   20
#[2,]   NA    5   NA   17   21

A.K.

----- Original Message -----
From: HJ YAN yhj...@googlemail.com
To: r-help@r-project.org
Cc:
Sent: Tuesday, March 5, 2013 9:33 PM
Subject: [R] How to combine conditional argument and logical argument in R to create subset of data...
Dear R user

I have data created using the code below:

b <- matrix(2:21,nrow=4)
b[,1:3]=NA
b[4,2]=5
b[3,1]=6

Now the data is

b
     [,1] [,2] [,3] [,4] [,5]
[1,]   NA   NA   NA   14   18
[2,]   NA   NA   NA   15   19
[3,]    6   NA   NA   16   20
[4,]   NA    5   NA   17   21

I want to keep the rows where column 4 is greater than 15 and the values in columns 1 and 2 are either greater than 4 or 'NA'. So I would like to have my outcome as below...

[3,]    6   NA   NA   16   20
[4,]   NA    5   NA   17   21

I thought something like the code below would work, but it only returns the last row, i.e. NA 5 NA 17 21.

...
bb<-b[which( (b[,2]>4 | b[,2]==NA) & (b[,1]>4 | b[,1]==NA)
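For readers following the original question, here is a minimal sketch of why the `== NA` comparisons do not select anything on their own and how `is.na()` handles it. It rebuilds the matrix `b` from the post; the helper name `ok` is hypothetical and only there for illustration. The result agrees with the two rows shown in arun's answer.

```r
# Rebuild the example matrix from the question.
b <- matrix(2:21, nrow = 4)
b[, 1:3] <- NA
b[4, 2] <- 5
b[3, 1] <- 6

# Comparing anything with NA gives NA, never TRUE.
na_comparison <- b[, 1] == NA

# Hypothetical helper: TRUE where a value is missing or greater than 4.
# is.na(v) | v > 4 is TRUE for missing values because TRUE | NA is TRUE.
ok <- function(v) is.na(v) | v > 4

keep <- b[, 4] > 15 & ok(b[, 1]) & ok(b[, 2])
result <- b[keep, , drop = FALSE]
result
```

The `drop = FALSE` keeps the result a matrix even when only one row qualifies.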
Re: [R] How to combine conditional argument and logical argument in R to create subset of data...
Just to add:

Tem1[Tem1[,2]%in%setdiff(Tem1[,2],Tem2[,2]),]

A.K.
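On the earlier question in this thread of why the for loop returned only the 333 rows: `length(unique(Tem1[,1]))` is the single number 3, so `for(i in length(...))` runs the body once with i = 3. The sketch below uses `seq_along()` instead and, as an alternative that is not from the thread, a sort on the first column. The names `groups`, `pieces`, `by_loop` and `by_order` are made up for the example.

```r
V1 <- c(rep(111, 4), rep(222, 4), rep(333, 4), rep(111, 4), rep(222, 4), rep(333, 3))
V2 <- 1:23
Tem1 <- cbind(V1, V2)

groups <- unique(Tem1[, 1])

# length(groups) is one number, so this loop body runs a single time.
for (i in length(groups)) print(i)

# seq_along(groups) is 1, 2, 3, so every group is visited.
pieces <- vector("list", length(groups))
for (i in seq_along(groups)) {
  pieces[[i]] <- Tem1[Tem1[, 1] == groups[i], , drop = FALSE]
}
by_loop <- do.call(rbind, pieces)

# Without a loop: order() keeps tied rows in their original sequence.
by_order <- Tem1[order(Tem1[, 1]), ]
```

Collecting the pieces in a list and binding once at the end also avoids the placeholder row `c(NA, NA)` that has to be removed afterwards.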
Re: [R] print justify
I don't know about justify as an arg to print, but the following should qualify as a hint.

format(c('a','aa','aaa'), justify='left')
[1] "a  " "aa " "aaa"

tmp <- data.frame(a=c('a','aa','aaa'))
print(tmp,justify='left')
    a
1   a
2  aa
3 aaa

tmp$b <- format(c('a','aa','aaa'),justify='left')
tmp
    a   b
1   a a
2  aa aa
3 aaa aaa

-Don

--
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062
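As a possible follow-up to the hint above: the data frame print method has a `right` argument that controls alignment, which is probably what the `justify` attempt was after. The sketch below uses a two-row stand-in for KennArt with ASCII strings; it is an illustration, not a tested answer for the original table.

```r
# Small stand-in for the KennArt table from the question.
KennArt <- data.frame(NR = c(172, 181),
                      TYP = c("Corn Cob Mix", "Hirse"),
                      stringsAsFactors = FALSE)

# Default: strings are right-aligned.
print(KennArt)

# right = FALSE left-aligns the cells and the column names.
print(KennArt, right = FALSE)

# Keep the printed lines so the alignment can be inspected.
default_lines <- capture.output(print(KennArt))
left_lines <- capture.output(print(KennArt, right = FALSE))
```

Note that with `right = FALSE` the numeric column is left-aligned as well, since all cells are formatted to text before printing.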
Re: [R] CARET and NNET fail to train a model when the input is high dimensional
Thank you Max.

I presume that in order to use caret with nnet and MaxNWts, I would have to write my custom method for train that supports this new argument. From what I read, when writing my custom method, I would need to define functions parameters, model, prediction, prob and sort and pass them to trainControl.

However, if all I need is a new parameters function (in order to pass the MaxNWts argument to nnet), is there a way to reuse the other functions (model, prediction, prob and sort) that are already defined for the nnet method?

James

On Wed, Mar 6, 2013 at 9:59 AM, Max Kuhn mxk...@gmail.com wrote:

James,

I did a fresh install from CRAN to get caret_5.15-61 and ran your code with method.name = "nnet" and grid.len = 3.

I don't get an error, although there were issues:

In nominalTrainWorkflow(dat = trainData, info = trainInfo, ... :
  There were missing values in resampled performance measures.

The results had:

Resampling results across tuning parameters:

  size  decay  ROC    Sens   Spec   ROC SD   Sens SD  Spec SD
  1     0      0.521  0.52   0.521  0.0148   0.0312   0.00901
  1     1e-04  0.513  0.528  0.498  0.00616  0.00386  0.00552
  1     0.1    0.515  0.522  0.514  0.0169   0.0284   0.0426
  3     0      NaN    NaN    NaN    NA       NA       NA
  3     1e-04  NaN    NaN    NaN    NA       NA       NA
  3     0.1    NaN    NaN    NaN    NA       NA       NA
  5     0      NaN    NaN    NaN    NA       NA       NA
  5     1e-04  NaN    NaN    NaN    NA       NA       NA
  5     0.1    NaN    NaN    NaN    NA       NA       NA

To test more, I ran:

test <- nnet(trX, trY, size = 3, decay = 0)
Error in nnet.default(trX, trY, size = 3, decay = 0) :
  too many (2107) weights

So, you need to pass in MaxNWts to nnet() with a value that lets you fit the model. Off the top of my head, you could use something like:

MaxNWts = length(levels(trY))*(max(my.grid$.size) * (nCol + 1) + max(my.grid$.size) + 1)

Also, this is one of the methods for getting help (the other is to just email me). I also try to keep up on Stack Exchange.

Max

On Tue, Mar 5, 2013 at 9:47 PM, James Jong ribonucle...@gmail.com wrote:

The following code fails to train a nnet model on a random dataset using caret:

nR <- 700
nCol <- 2000
myCtrl <- trainControl(method="cv", number=3, preProcOptions=NULL, classProbs = TRUE, summaryFunction = twoClassSummary)
trX <- data.frame(replicate(nR, rnorm(nCol)))
trY <- runif(1)*trX[,1]*trX[,2]^2+runif(1)*trX[,3]/trX[,4]
trY <- as.factor(ifelse(sign(trY)>0,'X1','X0'))
my.grid <- createGrid(method.name, grid.len, data=trX)
my.model <- train(trX,trY,method=method.name,trace=FALSE,trControl=myCtrl,tuneGrid=my.grid, metric="ROC")
print("Done")

The error I get is:

task 2 failed - arguments imply differing number of rows: 1334, 666

However, everything works if I reduce nR to, say, 20.

Any thoughts on what may be causing this? Is there a place where I could report this bug other than this mailing list?

Here is my session info:

sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
[1] C

attached base packages:
[1] stats  graphics  grDevices  utils  datasets  methods  base

other attached packages:
[1] nnet_7.3-5      pROC_1.5.4      caret_5.15-052  foreach_1.4.0
[5] cluster_1.14.3  plyr_1.8        reshape2_1.2.2  lattice_0.20-13

loaded via a namespace (and not attached):
[1] codetools_0.2-8  compiler_2.15.2  grid_2.15.2  iterators_1.0.6
[5] stringr_0.6.2    tools_2.15.2

Thanks,
James
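To make the "too many (2107) weights" message above concrete, here is a rough sketch of the weight count for a single-hidden-layer network with bias terms and no skip-layer connections. The function name `n_weights` is hypothetical. With 700 predictor columns (trX has nR = 700 columns), 3 hidden units and one output unit, the count reproduces the 2107 reported in the error; the sizes 1, 3 and 5 are taken from the results table.

```r
# (p + 1) * size weights feed the hidden layer (p inputs plus a bias),
# (size + 1) * n_out weights feed the output layer (hidden units plus a bias).
n_weights <- function(p, size, n_out = 1) {
  (p + 1) * size + (size + 1) * n_out
}

# 700 predictors, 3 hidden units, 1 output unit.
weights_size3 <- n_weights(p = 700, size = 3)

# Largest network in the tuning grid from the results table.
sizes <- c(1, 3, 5)
max_weights <- max(n_weights(p = 700, size = sizes))
```

A value such as `max_weights` could then be supplied as `MaxNWts` to nnet(), whose default limit is 1000. Whether train() forwards `MaxNWts` to nnet() in the same way it forwards `trace = FALSE` in the code above is an assumption worth checking against the caret documentation.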
Re: [R] lm Regression takes 24+ GB RAM - Error message
The data table (and obviously the split) only contains character and numeric data.

I found that 4 regressions in a row work if I don't use the calculated columns as variables but 2 of the original columns. RAM usage stays below 3 GB!

--> Why does R have such problems with the calculated columns? Their calculation is already done before the regression starts.

It's like this:

Create the calculated columns:

Dataset$ExtraColumn1 <- Dataset$ColumnA / Dataset$ColumnB
Dataset$ExtraColumn2 <- Dataset$ColumnC / Dataset$ColumnD

Perform the split of the dataset incl. calculated columns (the criteria for the split have a hierarchy):

Datasplit <- split(Dataset, paste(Dataset$ColumnE, Dataset$ColumnE))

Perform the regression on the split data:

Regression1 <- lapply(Datasplit, function(d) lm(ExtraColumn1 ~ ExtraColumn2, d, na.action = na.omit, singular.ok = TRUE))

BTW: There are no NA values in the data source.

What is my mistake?

When I calculate the columns I might divide by zero (=Inf). Could that create the problem in the regression?

Thanks,
Jonas

--
View this message in context: http://r.789695.n4.nabble.com/lm-Regression-takes-24-GB-RAM-Error-message-tp4660434p4660496.html
Re: [R] lm Regression takes 24+ GB RAM - Error message
On Wednesday, 6 March 2013 at 08:31 -0800, Jonas125 wrote:
> The data table (and obviously the split) only contains character and numeric data.
>
> I found that 4 regressions in a row work if I don't use the calculated columns as variables but 2 of the original columns. RAM usage stays below 3 GB!
>
> --> Why does R have such problems with the calculated columns? Their calculation is already done before the regression starts.
>
> It's like this:
>
> Create the calculated columns:
> Dataset$ExtraColumn1 <- Dataset$ColumnA / Dataset$ColumnB
> Dataset$ExtraColumn2 <- Dataset$ColumnC / Dataset$ColumnD
>
> Perform the split of the dataset incl. calculated columns (the criteria for the split have a hierarchy):
> Datasplit <- split(Dataset, paste(Dataset$ColumnE, Dataset$ColumnE))
>
> Perform the regression on the split data:
> Regression1 <- lapply(Datasplit, function(d) lm(ExtraColumn1 ~ ExtraColumn2, d, na.action = na.omit, singular.ok = TRUE))
>
> BTW: There are no NA values in the data source.
>
> What is my mistake?

What's the value of length(Datasplit)? Have you tried running regressions manually on Datasplit[[1]] and calling object.size() on the result to see how large it is?

Regards

> When I calculate the columns I might divide by zero (=Inf). Could that create the problem in the regression?
>
> Thanks,
> Jonas
Re: [R] About basic logical operators
> x[x<0.2]
> [1] -1.00 -0.98 -0.96 -0.94 -0.92 -0.90 -0.88 -0.86 -0.84 -0.82 -0.80 -0.78

This is a bit off the original topic, but you should really put spaces around the <. Otherwise you might be surprised when you compare x to -0.2 instead of +0.2:

x<-seq(-1,1,by=0.02)
x[x<-0.2]
numeric(0)
x
[1] 0.2

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Duncan Murdoch
Sent: Wednesday, March 06, 2013 7:57 AM
To: Victor hyk
Cc: r-help@r-project.org
Subject: Re: [R] About basic logical operators

On 05/03/2013 7:53 PM, Victor hyk wrote:
> Hello everyone,
>
> I have a basic question regarding logical operators.
>
> x<-seq(-1,1,by=0.02)
> x
>  [1] -1.00 -0.98 -0.96 -0.94 -0.92 -0.90 -0.88 -0.86 -0.84 -0.82 -0.80 -0.78
> [13] -0.76 -0.74 -0.72 -0.70 -0.68 -0.66 -0.64 -0.62 -0.60 -0.58 -0.56 -0.54
> [25] -0.52 -0.50 -0.48 -0.46 -0.44 -0.42 -0.40 -0.38 -0.36 -0.34 -0.32 -0.30
> [37] -0.28 -0.26 -0.24 -0.22 -0.20 -0.18 -0.16 -0.14 -0.12 -0.10 -0.08 -0.06
> [49] -0.04 -0.02  0.00  0.02  0.04  0.06  0.08  0.10  0.12  0.14  0.16  0.18
> [61]  0.20  0.22  0.24  0.26  0.28  0.30  0.32  0.34  0.36  0.38  0.40  0.42
> [73]  0.44  0.46  0.48  0.50  0.52  0.54  0.56  0.58  0.60  0.62  0.64  0.66
> [85]  0.68  0.70  0.72  0.74  0.76  0.78  0.80  0.82  0.84  0.86  0.88  0.90
> [97]  0.92  0.94  0.96  0.98  1.00
>
> x[x<=0.02]
>  [1] -1.00 -0.98 -0.96 -0.94 -0.92 -0.90 -0.88 -0.86 -0.84 -0.82 -0.80 -0.78
> [13] -0.76 -0.74 -0.72 -0.70 -0.68 -0.66 -0.64 -0.62 -0.60 -0.58 -0.56 -0.54
> [25] -0.52 -0.50 -0.48 -0.46 -0.44 -0.42 -0.40 -0.38 -0.36 -0.34 -0.32 -0.30
> [37] -0.28 -0.26 -0.24 -0.22 -0.20 -0.18 -0.16 -0.14 -0.12 -0.10 -0.08 -0.06
> [49] -0.04 -0.02  0.00
>
> x[x<0.2]
>  [1] -1.00 -0.98 -0.96 -0.94 -0.92 -0.90 -0.88 -0.86 -0.84 -0.82 -0.80 -0.78
> [13] -0.76 -0.74 -0.72 -0.70 -0.68 -0.66 -0.64 -0.62 -0.60 -0.58 -0.56 -0.54
> [25] -0.52 -0.50 -0.48 -0.46 -0.44 -0.42 -0.40 -0.38 -0.36 -0.34 -0.32 -0.30
> [37] -0.28 -0.26 -0.24 -0.22 -0.20 -0.18 -0.16 -0.14 -0.12 -0.10 -0.08 -0.06
> [49] -0.04 -0.02  0.00  0.02  0.04  0.06  0.08  0.10  0.12  0.14  0.16  0.18
> [61]  0.20
>
> Why does x[x<=0.02] return no 0.02

You don't have a 0.02 in your dataset. Evaluate x[52] - 0.02 and you won't get zero due to rounding (as Jeff said, see FAQ 7.31).

> but x[x<0.2] return a subsample with 0.2?

You don't have 0.2, either. Evaluate x[61] - 0.2 and you get a negative value.

Duncan Murdoch

> Anyone who can tell me why? Thanks!
>
> Victor
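One common way to work around the rounding described above is to compare with a small tolerance, or to use all.equal() for equality checks. The sketch below is only an illustration; the tolerance `tol` is an assumed value, chosen to be far smaller than the 0.02 step and far larger than the rounding error.

```r
x <- seq(-1, 1, by = 0.02)

# Classic illustration of binary rounding (R FAQ 7.31).
exact_equal <- 0.1 + 0.2 == 0.3                    # FALSE
nearly_equal <- isTRUE(all.equal(0.1 + 0.2, 0.3))  # TRUE

# Assumed tolerance for this grid.
tol <- 1e-9

# "Less than or equal to 0.02", allowing for rounding in x[52].
at_most_002 <- x[x <= 0.02 + tol]

# "Strictly less than 0.2", excluding the value that prints as 0.20.
below_02 <- x[x < 0.2 - tol]

length(at_most_002)
length(below_02)
```

Another option with the same effect is to round the grid first, for example with round(x, 2), before comparing.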
Re: [R] chi square exact test
On 06.03.2013 at 14:27, Nicole Ford wrote:

Dear Nicole,

you may be surprised, but I do know Google, and I use it before asking here. If you are more familiar with Google, please help me find the search term that leads to an R function for an exact chi-square test usable as a one-column test for a sample size of less than 6.

You are welcome to use this search: http://www.giyf.com/chi%20square%20exact

Thanks in advance
Knut

> A quick google search produces multiple results. Good luck. :)
>
> ~Nicole Ford
> Ph.D. Student
> Graduate Assistant/ Instructor
> Department of Government and International Affairs
> University of South Florida
> office: SOC 012M
>
> Sent from my iPhone
>
> On Mar 6, 2013, at 6:30 AM, Knut Krueger r...@knut-krueger.de wrote:
>
>> SPSS is offering a chi square exact test for one dimensional data with small sample size (<6). What is the comparable function in R?
>>
>> Kind Regards
>> Knut
[R] rainbow producing colors that do not differ sufficiently
R 2.15.2
OS X

Colleagues,

I often use rainbow to select colors. I encountered a surprise with rainbow(11). It yielded three greens (in positions 4-6). The first two of these are quite similar. The man pages suggest that this might be the case:

"equispaced hues in RGB space tend to cluster at the red, green and blue primaries"

The following code illustrates the problem -- the colors labeled 4 and 5 are quite similar.

plot(1, type="n", xlim=c(1, 10), ylim=c(0, 1), axes=F, xlab="", ylab="")
for (which in 3:7)
    {
    rect(which - 1, 0, which, 1, border=NA, col=rainbow(11)[which])
    text(which - 0.5, 0.8, which)
    text(which - 0.5, 0.2, rainbow(11)[which], srt=90)
    }

In this case, I overcame the problem by replacing one element of the rainbow vector with a different green. Is there some better approach to this by which I could automate the entire process but prevent this similarity of colors?

Dennis

Dennis Fisher MD
P < (The "P Less Than" Company)
Phone: 1-866-PLessThan (1-866-753-7784)
Fax: 1-866-PLessThan (1-866-753-7784)
www.PLessThan.com
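One possible approach, offered as a sketch and not as a reply from the list, is to space the hues in HCL space with the base function hcl(), which tends to avoid the clustering the man page quote describes. The chroma of 100, luminance of 65 and starting hue of 15 are assumptions picked for illustration and may need tuning.

```r
n <- 11

# n hues equally spaced around the colour wheel, dropping the last point
# because 375 degrees is the same hue as 15 degrees.
hues <- seq(15, 375, length.out = n + 1)[seq_len(n)]

# Fixed chroma and luminance (assumed values).
cols <- hcl(h = hues, c = 100, l = 65)

# Same kind of strip plot as in the question, for all n colours.
plot(1, type = "n", xlim = c(0, n), ylim = c(0, 1), axes = FALSE, xlab = "", ylab = "")
for (i in seq_len(n)) {
  rect(i - 1, 0, i, 1, border = NA, col = cols[i])
  text(i - 0.5, 0.8, i)
  text(i - 0.5, 0.2, cols[i], srt = 90)
}
```

Because chroma and luminance are held constant, neighbouring colours differ mainly in hue, so how distinguishable they are should still be checked by eye for the chosen n.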
Re: [R] lm Regression takes 24+ GB RAM - Error message
length(Datasplit) = 7100

I did a regression for Datasplit[[1]] and the calculated columns --> the object size is 70 MB. Quite large.

Assuming that R cannot handle Inf values in regressions (I didn't have the time to google it): how can I avoid the calculation of infinite values? Something like: if the denominator would be zero, choose 0.001 as the denominator instead.

Dataset[is.infinite(Dataset)] <- 0 does not work for me --> default method not implemented for type 'list'

class(Dataset) = data.frame

--
View this message in context: http://r.789695.n4.nabble.com/lm-Regression-takes-24-GB-RAM-Error-message-tp4660434p4660501.html
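The error quoted above arises because is.infinite() has no method for a whole data frame. A minimal sketch of a column-by-column replacement follows; the small `Dataset` here is a hypothetical stand-in that reuses the column names from the posts, and replacing with NA (so that na.omit can drop the rows) is one choice among several, not something the thread prescribes.

```r
# Hypothetical stand-in for Dataset.
Dataset <- data.frame(ColumnA = c(1, 2, 3),
                      ColumnB = c(2, 0, 4),
                      ColumnE = c("a", "b", "a"),
                      stringsAsFactors = FALSE)
Dataset$ExtraColumn1 <- Dataset$ColumnA / Dataset$ColumnB  # 2 / 0 gives Inf

# Apply is.infinite() to each numeric column separately.
numeric_cols <- vapply(Dataset, is.numeric, logical(1))
Dataset[numeric_cols] <- lapply(Dataset[numeric_cols], function(col) {
  col[is.infinite(col)] <- NA
  col
})
Dataset
```

Note that 0 / 0 gives NaN, which is.infinite() does not flag; is.na() is TRUE for NaN, so na.omit would already drop such rows.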
Re: [R] lm Regression takes 24+ GB RAM - Error message
On Wednesday, 6 March 2013 at 09:18 -0800, Jonas125 wrote:
> length(Datasplit) = 7100
>
> I did a regression for Datasplit[[1]] and the calculated columns --> the object size is 70 MB. Quite large.

7100*70/1024 = 485 (GB)

No wonder you run out of memory quite fast. You probably do not need to store the whole lm objects: usually you need coefficients, R-squared, things like that. So instead of returning the objects, return a vector or a list with only the elements you need; you will save much space.

And if you really need the objects, set these lm() arguments to FALSE to make the result smaller:

model, x, y, qr: logicals. If 'TRUE' the corresponding components of the fit (the model frame, the model matrix, the response, the QR decomposition) are returned.

> Assuming that R cannot handle Inf values in regressions (I didn't have the time to google it): how can I avoid the calculation of infinite values? Something like: if the denominator would be zero, choose 0.001 as the denominator instead.
>
> Dataset[is.infinite(Dataset)] <- 0 does not work for me --> default method not implemented for type 'list'
>
> class(Dataset) = data.frame

I don't understand why you think infinite values can trigger a memory problem. Why don't you just try it?

lm(c(1, Inf) ~ c(1, 2))
Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
  NA/NaN/Inf in 'y'

lm(c(1, 2) ~ c(1, Inf))
Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
  NA/NaN/Inf in 'x'

So, if anything, this would stop your lapply() call sooner or later, and save your machine from freezing.

Regards
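The advice above, to return only what is needed from each fit, could be sketched as follows. The data are simulated stand-ins and the names `fit_summary`, `results`, `full` and `slim` are made up for the example; only lm(), coef(), summary() and object.size() are taken as given.

```r
set.seed(1)

# Hypothetical stand-in data: three groups of 20 rows each.
Dataset <- data.frame(g = rep(c("a", "b", "c"), each = 20), x = rnorm(60))
Dataset$y <- 2 * Dataset$x + rnorm(60)
Datasplit <- split(Dataset, Dataset$g)

# Keep only the coefficients and R-squared of each fit.
fit_summary <- function(d) {
  fit <- lm(y ~ x, data = d)
  c(coef(fit), r.squared = summary(fit)$r.squared)
}
results <- t(sapply(Datasplit, fit_summary))
results

# If the fitted object itself is needed, drop the stored model frame and QR.
full <- lm(y ~ x, data = Datasplit[[1]])
slim <- lm(y ~ x, data = Datasplit[[1]], model = FALSE, qr = FALSE)
c(full = object.size(full), slim = object.size(slim))
```

The result is one small row per group instead of one lm object per group. A fit stored with `qr = FALSE` can no longer be passed to summary(), so anything derived from the summary has to be extracted before slimming.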
Re: [R] chi square exact test
On Wednesday, 6 March 2013 at 18:03 +0100, Knut Krueger wrote:
> Dear Nicole,
>
> you may be surprised, but I do know Google, and I use it before asking here. If you are more familiar with Google, please help me find the search term that leads to an R function for an exact chi-square test usable as a one-column test for a sample size of less than 6.
>
> You are welcome to use this search: http://www.giyf.com/chi%20square%20exact
>
> Thanks in advance
> Knut

See ?fisher.test.

Regards
Re: [R] chi square exact test
On 06.03.2013 at 18:29, Milan Bouchet-Valat wrote:
> See ?fisher.test.

The Fisher test needs two columns; I need a one-column exact test. From its help page:

x    either a two-dimensional contingency table in matrix form, or a factor object.
y    a factor object; ignored if x is a matrix.

Knut
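For a single column of counts, one option in base R is chisq.test() with simulate.p.value = TRUE, which gives a Monte Carlo p-value for the goodness-of-fit case. For very small samples the exact p-value can also be computed by enumerating every possible table, as in the sketch below. The function name `exact_gof_test` is hypothetical, and whether this ordering by the chi-square statistic matches what SPSS reports is an assumption that has not been checked.

```r
# Exact goodness-of-fit p-value: sum the multinomial probabilities of all
# tables whose chi-square statistic is at least as large as the observed one.
exact_gof_test <- function(observed, p = rep(1 / length(observed), length(observed))) {
  n <- sum(observed)
  k <- length(observed)

  # All ways to distribute n observations over k categories, one per row.
  compositions <- function(n, k) {
    if (k == 1) return(matrix(n, nrow = 1))
    do.call(rbind, lapply(0:n, function(first) {
      cbind(first, compositions(n - first, k - 1))
    }))
  }
  tables <- unname(compositions(n, k))

  chisq_stat <- function(counts) sum((counts - n * p)^2 / (n * p))
  stats <- apply(tables, 1, chisq_stat)
  probs <- apply(tables, 1, dmultinom, prob = p)

  # Small tolerance so that ties with the observed statistic are included.
  sum(probs[stats >= chisq_stat(observed) - 1e-9])
}

# Example: 5 observations over 3 equally likely categories.
exact_gof_test(c(4, 1, 0))
```

The enumeration grows quickly with the sample size and the number of categories, so this is only practical for the small samples the question is about. With two categories and equal probabilities it agrees with binom.test().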
[R] Random Sampling
When the population values are not distributed symmetrically about the mean, reporting the mean and standard deviation can give the reader an inaccurate impression of the distribution of values in the population.

I would like to generate random samples with the same mean and standard deviation, but not necessarily the same distribution.

Thanks
Angelo
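One simple way to do this, sketched here under the assumption that matching the sample mean and sample standard deviation exactly is what is wanted, is to draw from any distribution and then shift and scale the draw. The function name `rescale_to` and the target values 10 and 3 are made up for the example.

```r
set.seed(42)

# Shift and scale a sample so its mean and sd equal the targets exactly.
rescale_to <- function(z, target_mean, target_sd) {
  target_mean + target_sd * (z - mean(z)) / sd(z)
}

symmetric <- rescale_to(rnorm(1000), target_mean = 10, target_sd = 3)
skewed <- rescale_to(rexp(1000), target_mean = 10, target_sd = 3)

# Same mean and sd, different shape.
c(mean(symmetric), sd(symmetric))
c(mean(skewed), sd(skewed))
summary(symmetric)
summary(skewed)
```

Plotting hist(symmetric) next to hist(skewed) shows two samples that share the same two summary numbers but look quite different, which is the point made in the post.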
Re: [R] Ggplot2: Moving legend, change fill and removal of space between plots when using grid.arrange() possible use of facet_grid?
For the simplest of the issues, use scale_shape(solid = FALSE) to get hollow points.

Using your data (below) this seems to work:

p1 <- ggplot(SummAB, aes(factor3, mean, colour = factor1, group = factor1, shape = factor1)) +
  scale_y_continuous(guide_legend(legend.position=c(4 ,6))) +
  scale_shape(solid = FALSE) +
  guides(colour = guide_legend(title.position = "right")) +
  geom_point(aes(shape=factor(factor1)), color="black", fill="white", position = "dodge", width = 0.3, size=3) +
  geom_line(aes(linetype=factor1), color = "black", size = 0.5) +
  geom_errorbar(aes(ymin = mean - sdv , ymax = mean + sdv), width = 0.3, position = "dodge", color = "black", size=0.3) +
  theme_bw() +
  ylab(expression(paste("my measured stuff"))) +
  xlab("factor3") +
  ggtitle("") +
  labs(color = "factor1", shape = "factor1", group = "factor1", linetype = "factor1")
p1

John Kane
Kingston ON Canada

-----Original Message-----
From: a...@ecology.su.se
Sent: Wed, 06 Mar 2013 13:32:42 +0100
To: r-help@r-project.org
Subject: [R] Ggplot2: Moving legend, change fill and removal of space between plots when using grid.arrange() possible use of facet_grid?

Hi,

# For publications, I am not allowed to repeat the axes. I have tried to remove the axes using:
# yaxt="n", but it did not work. I have not understood how to do this in ggplot2. Can you help me?
# I also do not want loads of space between the graphs (see the script below with dummy data).
# If I could make it look like the examples on the (nice) examples page:
# http://www.ling.upenn.edu/~joseff/rstudy/summer2010_ggplot2_intro.html
# using facet_grid(), I would be very very happy.

# I also do not want the geometric points to be filled, and the fill="white" command
# does not seem to work - why? And are there alternatives?

# Furthermore, I would like to add legends inside the plot area instead of on the side, like when you use plotrix and brkdn.plot:
legend("topright", c("A", "B"), pch=c(0,1), bg="white", lty = 1:2, cex=1, bty="n")
# This did not work in ggplot2. What are my alternatives?
# I have extensively searched the internet, and if I have missed something obvious, it was due to
# tiredness and not to laziness.

# Some dummy data:
mydata <- data.frame(factor1 = factor(rep(LETTERS[1:6], each = 80)),
                     factor2 = factor(rep(c(1:5), each = 16)),
                     factor3 = factor(rep(c(1:4), each = 4)),
                     var1 = rnorm(120, mean = rep(c(0, 3, 5), each = 40), sd = rep(c(1, 2, 3), each = 20)),
                     var2 = rnorm(120, mean = rep(c(6, 7, 8), each = 40), sd = rep(c(1, 2, 3), each = 20)))

# Splitting data into 3 data frames (based on factor1)
# If I could do this using for example facet_wrap() or facet_grid(), I would be very
# happy! I have tried that method but failed.
DataAB <- mydata[(mydata$factor1) %in% c("A", "B"), ]
DataCD <- mydata[(mydata$factor1) %in% c("C", "D"), ]
DataEF <- mydata[(mydata$factor1) %in% c("E", "F"), ]
DataAB

library(plyr)
library(ggplot2)

# Plot: levels A and B:
# Summary (means etc)
SummAB <- ddply(DataAB, .(factor3,factor1), summarize,
                mean = mean(var1, na.rm = FALSE),
                sdv = sd(var1, na.rm = FALSE),
                se = 1.96*(sd(var1, na.rm=FALSE)/sqrt(length(var1))))
SummAB

p1 <- ggplot(SummAB, aes(factor3, mean, colour = factor1, group = factor1, shape = factor1)) +
  geom_point(aes(shape=factor(factor1)), color="black", fill="white", position = "dodge", width = 0.3, size=3) +
  geom_line(aes(linetype=factor1), color = "black", size = 0.5) +
  geom_errorbar(aes(ymin = mean - sdv , ymax = mean + sdv), width = 0.3, position = "dodge", color = "black", size=0.3) +
  theme_bw() +
  ylab(expression(paste("my measured stuff"))) +
  xlab("factor3") +
  ggtitle("") +
  labs(color = "factor1", shape = "factor1", group = "factor1", linetype = "factor1")
p1

# Plot: levels C and D:
# Summary (means etc)
SummCD <- ddply(DataCD, .(factor3,factor1), summarize,
                mean = mean(var1, na.rm = FALSE),
                sdv = sd(var1, na.rm = FALSE),
                se = 1.96*(sd(var1, na.rm=FALSE)/sqrt(length(var1))))

p2 <- ggplot(SummCD, aes(factor3, mean, colour = factor1, group = factor1, shape = factor1)) +
  geom_point(aes(shape=factor(factor1)), color="black", fill="white", position = "dodge", width =
0.3, size=3) + geom_line(aes(linetype=factor1), color = black, size = 0.5) + geom_errorbar(aes(ymin = mean - sdv , ymax = mean + sdv), width = 0.3, position = dodge, color = black, size=0.3) + theme_bw() + ylab(expression(paste(my measured stuff))) + xlab(factor3) + ggtitle() + labs(color = factor1, shape =
Re: [R] need help using read.fortran
On 06/03/2013 12:57 PM, jsdroyster wrote:

Hello kind and R-knowledgeable souls! I am trying to use read.fortran to read in old datasets in 80-column-card format with no separators between variables (just 80 columns of solid digits). I comprehend the instructions for specifying the columns for each variable, but I can't understand how to assign the variable names after reading the help pages for read.fortran, read.fwf and read.table. I tried putting a col.names section in the read.fortran statement:

AN35 <- data.frame(read.fortran(filename, c(I9,4I2,2I1,3I2,I1,16I2, I1,I5,I1,I3,I2, A4,3A1,A2,A1),
  header = FALSE, skip=0, sep=@,
  col.names = paste(idno,empmo,empyr,birthmo,birthyr,sex,race,teno,testmo,testyr,
    testtyp,L500,L1k,L2k,L3k,L4k,L6k,L8k,RT1k,R500,R1k,R2k,R3k,R4k,R6k,R8k,
    HPD,dept,shift,TWA,envclas,jobcode,hobby.med.STS,audclas,disp),

The paste() call will try to find variables with those names and concatenate their contents. That's not what you want. You want something like

col.names = c("idno", "empmo", ...)

  row.names = (idno) ))

I also tried a separate dimnames statement like so:

dimnames(AN35)[[2]] <- c(idno,empmo,empyr,birthmo,birthyr,sex,race,
  teno,testmo,testyr,testtyp,L500,L1k,L2k,L3k,L4k,L6k,L8k,RT1k,
  R500,R1k,R2k,R3k,R4k,R6k,R8k,HPD,dept,shift,TWA,
  envclas,jobcode,hobby,med,STS,audclas,disp)

I copied this from some documentation but I have no clue what the [[2]] means.

dimnames() is a function that returns a list of row names and column names. The column names are the second component, so

dimnames(AN35)[[2]] <- something

changes the column names.

Duncan Murdoch

If anyone has one good example that would help me a lot! Thanks in advance!

Julie
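To make the two points in the reply concrete, here is a minimal sketch, assuming a made-up four-field card layout (the file contents, the shortened format vector and the four names are illustrative only, not Julie's real 37-field layout). It passes the names as a character vector through col.names and then shows that dimnames(x)[[2]] is simply the column names.

```r
# Hypothetical fixed-width data: 9-digit id, 2-digit month, 2-digit year, 1-char sex
ff <- tempfile()
writeLines(c("0000000010385M",
             "0000000021290F"), ff)

# Format strings and column names are quoted character vectors;
# col.names is handed on to read.fwf()/read.table()
an35 <- read.fortran(ff, c("I9", "2I2", "A1"),
                     col.names = c("idno", "empmo", "empyr", "sex"))
print(an35)

# dimnames() returns list(row names, column names); [[2]] picks the column names
print(dimnames(an35)[[2]])

# Renaming afterwards works the same way as names(an35) <- ...
dimnames(an35)[[2]] <- c("id", "month", "year", "sex")
print(names(an35))
unlink(ff)
```

For a data frame, names(x), colnames(x) and dimnames(x)[[2]] all refer to the same thing, so any of them can be used for the assignment.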
[R] need help using read.fortran
Hello kind and R-knowledgeable souls! I am trying to use read.fortran to read in old datasets in 80-column-card format with no separators between variables (just 80 columns of solid digits). I comprehend the instructions for specifying the columns for each variable, but I can't understand how to assign the variable names after reading the help pages for read.fortran, read.fwf and read.table. I tried putting a col.names section in the read.fortran statement:

AN35 <- data.frame(read.fortran(filename, c(I9,4I2,2I1,3I2,I1,16I2, I1,I5,I1,I3,I2, A4,3A1,A2,A1),
  header = FALSE, skip=0, sep=@,
  col.names = paste(idno,empmo,empyr,birthmo,birthyr,sex,race,teno,testmo,testyr,
    testtyp,L500,L1k,L2k,L3k,L4k,L6k,L8k,RT1k,R500,R1k,R2k,R3k,R4k,R6k,R8k,
    HPD,dept,shift,TWA,envclas,jobcode,hobby.med.STS,audclas,disp),
  row.names = (idno) ))

I also tried a separate dimnames statement like so:

dimnames(AN35)[[2]] <- c(idno,empmo,empyr,birthmo,birthyr,sex,race,
  teno,testmo,testyr,testtyp,L500,L1k,L2k,L3k,L4k,L6k,L8k,RT1k,
  R500,R1k,R2k,R3k,R4k,R6k,R8k,HPD,dept,shift,TWA,
  envclas,jobcode,hobby,med,STS,audclas,disp)

I copied this from some documentation but I have no clue what the [[2]] means. If anyone has one good example that would help me a lot! Thanks in advance!

Julie
Re: [R] robustbase adjbox segfault - memory not mapped
Glad to know. Thanks. Regards Baan On Wednesday 06 March 2013 02:15 PM, Martin Maechler wrote: B == Baan baanba...@gmail.com on Mon, 4 Mar 2013 22:47:10 +0530 writes: B Thank you Martin. Look forward to the fix. Committed to the R-forge version of robustbase. It was a simple integer overflow, indeed, necessarily happening when the sample size was = 2^16.5. I'm planning to submit robustbase_0.9-7 to CRAN today. Martin B Regards B Baan B On Monday 04 March 2013 10:19 PM, Martin Maechler wrote: B == Baan baanba...@gmail.com on Mon, 4 Mar 2013 15:02:02 +0530 writes: B Hi, I encountered a segfault, memory not mapped error B when using adjbox in robustbase. In trying to recreate B the issue I found that the error occurs only for large B sample size. Here is the code. require(robustbase) B Loading required package: robustbase x - rnorm(10) y - rep(1, 10) adjbox(x ~ y) ## gives a plot x - rnorm(1) y - rep(1, 1) adjbox(x ~ y) ## gives a plot x - rnorm(10) y - rep(1, 10) adjbox(x ~ y) B *** caught segfault *** B address 0xfffcc47af530, cause 'memory not mapped' B Traceback: B 1: .C(mc_C, x, n, eps = eps, iter = c.iter, medc = double(1)) B 2: mcComp(x, doReflect, eps1 = eps1, eps2 = eps2, maxit = maxit, B trace.lev = trace.lev) B 3: mc.default(x, ..., na.rm = TRUE) B 4: mc(x, ..., na.rm = TRUE) B 5: adjboxStats(unclass(groups[[i]]), coef = range, doReflect = doReflect) B 6: adjbox.default(split(mf[[response]], mf[-response]), ...) B 7: adjbox(split(mf[[response]], mf[-response]), ...) B 8: adjbox.formula(x ~ y) B 9: adjbox(x ~ y) Indeed, I (as maintainer of robustbase) can reproduce the segfault *even* though you did not specify the random seed... So this should be fixed ... hopefully within a week or so, but I am not promising anything, given my busy schedule! 
Martin Maechler, ETH Zurich [] B My setup details: B R --version B R version 2.15.2 (2012-10-26) -- Trick or Treat B Package:robustbase B Version:0.9-5 B Date: 2012-03-01 B Packaged: 2013-03-01 16:34:03 UTC; maechler B NeedsCompilation: yes B Repository: CRAN B Date/Publication: 2013-03-01 18:31:33 B Built: R 2.15.2; x86_64-pc-linux-gnu; 2013-03-04 05:54:20 B UTC; unix B Platform: x86_64-pc-linux-gnu (64-bit) B uname -a B Linux R 2.6.32-5-amd64 #1 SMP Mon Feb 25 00:26:11 UTC 2013 x86_64 GNU/Linux B Debian squeeze B Could someone pls help. B Regards B Baan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Graph from Glantz
Hi, I'd like to draw a graph like this one from Stanton Glantz's book, Primer of Biostatistics.

Thanks
Angelo
Re: [R] Ggplot2: Moving legend, change fill and removal of space between plots when using grid.arrange() possible use of facet_grid?
Placing a legend:

z <- ggplot(mtcars, aes(wt, mpg, colour = factor(cyl))) + geom_point()
z + theme(legend.position = c(.5, .5))

Currently this does not appear to work in RStudio but seems fine if I use gedit or if I run R in a terminal session.

John Kane
Kingston ON Canada

-----Original Message-----
From: a...@ecology.su.se
Sent: Wed, 06 Mar 2013 13:32:42 +0100
To: r-help@r-project.org
Subject: [R] Ggplot2: Moving legend, change fill and removal of space between plots when using grid.arrange() possible use of facet_grid?

Hi,
# For publications, I am not allowed to repeat the axes. I have tried to remove the axes using
# yaxt="n", but it did not work. I have not understood how to do this in ggplot2. Can you help me?
# I also do not want loads of space between the graphs (see the script below with dummy data).
# If I could make it look like the examples on the (nice) examples page
# http://www.ling.upenn.edu/~joseff/rstudy/summer2010_ggplot2_intro.html
# using facet_grid(), I would be very, very happy.
# I also do not want the geometric points to be filled, and the fill="white" command
# does not seem to work. Why? Are there alternatives?
# Furthermore, I would like to add legends inside the plot area instead of on the side,
# as when you use brkdn.plot() from plotrix:
legend("topright", c("A", "B"), pch=c(0,1), bg="white", lty = 1:2, cex=1, bty="n")
# This did not work in ggplot2. What are my alternatives?
# I have extensively searched the internet, and if I have missed something obvious, it was due to
# tiredness and not to laziness.
# Some dummy data: mydata- data.frame(factor1 = factor(rep(LETTERS[1:6], each = 80)), factor2 = factor(rep(c(1:5), each = 16)), factor3 = factor(rep(c(1:4), each = 4)), var1 = rnorm(120, mean = rep(c(0, 3, 5), each = 40), sd = rep(c(1, 2, 3), each = 20)), var2 = rnorm(120, mean = rep(c(6, 7, 8), each = 40), sd = rep(c(1, 2, 3), each = 20))) # Splitting data into 3 data frames (based on factor1) # If I could do this using for example facet_wrap() or facet_grid(), I would be very # happy! I have tried but failed that method. DataAB - mydata[(mydata$factor1) %in% c(A, B), ] DataCD - mydata[(mydata$factor1) %in% c(C, D), ] DataEF - mydata[(mydata$factor1) %in% c(E, F), ] DataAB library(plyr) library(ggplot2) #Plot: levels A and B: # Summary (means etc) SummAB - ddply(DataAB, .(factor3,factor1), summarize, mean = mean(var1, na.rm = FALSE), sdv = sd(var1, na.rm = FALSE), se = 1.96*(sd(var1, na.rm=FALSE)/sqrt(length(var1 SummAB p1 - ggplot(SummAB, aes(factor3, mean, colour = factor1, group = factor1, shape = factor1)) + geom_point(aes(shape=factor(factor1)), color=black, fill=white, position = dodge, width = 0.3, size=3) + geom_line(aes(linetype=factor1), color = black, size = 0.5) + geom_errorbar(aes(ymin = mean - sdv , ymax = mean + sdv), width = 0.3, position = dodge, color = black, size=0.3) + theme_bw() + ylab(expression(paste(my measured stuff))) + xlab(factor3) + ggtitle() + labs(color = factor1, shape = factor1, group = factor1, linetype = factor1) p1 #Plot: levels C and D: # Summary (means etc) SummCD - ddply(DataCD, .(factor3,factor1), summarize, mean = mean(var1, na.rm = FALSE), sdv = sd(var1, na.rm = FALSE), se = 1.96*(sd(var1, na.rm=FALSE)/sqrt(length(var1 p2 - ggplot(SummCD, aes(factor3, mean, colour = factor1, group = factor1, shape = factor1)) + geom_point(aes(shape=factor(factor1)), color=black, fill=white, position = dodge, width = 0.3, size=3) + geom_line(aes(linetype=factor1), color = black, size = 0.5) + geom_errorbar(aes(ymin = mean - sdv , ymax = 
mean + sdv), width = 0.3, position = dodge, color = black, size=0.3) + theme_bw() + ylab(expression(paste(my measured stuff))) + xlab(factor3) + ggtitle() + labs(color = factor1, shape = factor1, group = factor1, linetype = factor1) p2 #Plot: levels C and D: # Summary (means etc) SummEF - ddply(DataEF, .(factor3,factor1), summarize, mean = mean(var1, na.rm = FALSE), sdv = sd(var1, na.rm = FALSE), se = 1.96*(sd(var1, na.rm=FALSE)/sqrt(length(var1 p3 - ggplot(SummEF, aes(factor3, mean, colour = factor1, group = factor1, shape = factor1)) + geom_point(aes(shape=factor(factor1)), color=black, fill=white, #Why is the fill commando not working? position = dodge, width = 0.3, size=3) +
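The original post asks whether the three separate plots could instead be one faceted plot with hollow points. The following is only a sketch under assumptions: the dummy data are simplified from the post, the `pair` column and the base-R aggregate() summary are introduced here for illustration (the post uses plyr::ddply), and shapes 21/22 with fill = "white" are one of several ways to get unfilled symbols. Facets share the y axis, so the axis is drawn once per row rather than once per plot.

```r
library(ggplot2)

set.seed(1)
# Simplified dummy data: 6 levels of factor1, 4 levels of factor3, 5 values per cell
mydata <- data.frame(
  factor1 = factor(rep(LETTERS[1:6], each = 20)),
  factor3 = factor(rep(1:4, times = 30)),
  var1    = rnorm(120)
)

# Hypothetical grouping column: A/B, C/D and E/F each become one panel
pair_of <- c(A = "A-B", B = "A-B", C = "C-D", D = "C-D", E = "E-F", F = "E-F")
mydata$pair <- factor(unname(pair_of[as.character(mydata$factor1)]))

# Cell means and standard deviations (base R stand-in for the ddply() calls)
summ <- aggregate(var1 ~ factor3 + factor1 + pair, data = mydata, FUN = mean)
names(summ)[names(summ) == "var1"] <- "mean"
summ$sdv <- aggregate(var1 ~ factor3 + factor1 + pair, data = mydata, FUN = sd)$var1

pd <- position_dodge(width = 0.3)
p <- ggplot(summ, aes(factor3, mean, group = factor1)) +
  geom_errorbar(aes(ymin = mean - sdv, ymax = mean + sdv), width = 0.3, position = pd) +
  geom_line(aes(linetype = factor1), position = pd) +
  # shapes 21 and 22 have a separate fill, so fill = "white" gives hollow-looking points
  geom_point(aes(shape = factor1), position = pd, size = 3, fill = "white") +
  scale_shape_manual(values = c(A = 21, B = 22, C = 21, D = 22, E = 21, F = 22)) +
  facet_wrap(~ pair, nrow = 1) +   # one panel per pair, shared y axis
  theme_bw()
print(p)
```

The legend can then be moved with theme(legend.position = ...) as in the reply above; the exact argument form accepted may depend on the installed ggplot2 version.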
Re: [R] chi square exact test
On Wednesday 6 March 2013 at 18:38 +0100, Knut Krueger wrote:

On 06.03.2013 18:29, Milan Bouchet-Valat wrote:

On Wednesday 6 March 2013 at 18:03 +0100, Knut Krueger wrote:

On 06.03.2013 14:27, Nicole Ford wrote:

Dear Nicole, maybe you are wondering, but I know Google and I use Google before I ask here. If you are more familiar with Google, please help me find the search term that leads to the R function for an exact chi-square test usable as a one-column test for a sample size of less than 6.

You are welcome to use this search: http://www.giyf.com/chi%20square%20exact

Thanks in advance
Knut

See ?fisher.test.

The Fisher test needs two columns; I need a one-column exact test:

x: either a two-dimensional contingency table in matrix form, or a factor object.
y: a factor object; ignored if x is a matrix.

Sorry, I missed that part. Can you tell us more about the test you do in SPSS? Are you testing the adequacy of a given distribution to the data? In short: what do you test? Is that test documented somewhere? I found this document, but there does not seem to be such a test there: http://www.sussex.ac.uk/its/pdfs/SPSS_Exact_Tests_20.pdf

Regards
Knut
Re: [R] Graph from Glantz
No link and/or no attached file. The list tends to strip most attachments to reduce virus attacks.

John Kane
Kingston ON Canada

-----Original Message-----
From: angeloscozzare...@tiscali.it
Sent: Wed, 6 Mar 2013 19:53:18 +0100
To: r-help@r-project.org
Subject: [R] Graph from Glantz

Hi, I'd like to draw a graph like this one from Stanton Glantz book, Primer of Biostatistics.

Thanks
Angelo
Re: [R] Help with a function and text
Hi, can I understand why this message was rejected ? Thanks, Eliano Sent from my iPhone On 6 Mar 2013, at 19:18, Eliano eliano.m.marq...@gmail.com wrote: Hi everyone, I am writing some code to generate a function. I am passing that code to a dataset which i'm importing in R, e.g. Test=read.table('C:/test.txt', header=F, sep='\t', na.strings='NA', dec='.', strip.white=TRUE) Test V1 (if(nclusters0){OptmizationInputs[3,3]*beta[1]}else{0})+ (if(nclusters1){OptmizationInputs[3,3]*beta[1]}else{0})+ V1 has inside a code for a function. I'm having problems with 2 things: 1 - I need to take out from V1 all that appears in the text, i tried a replace but did not work. Test=replace(Test,' ', ' ') , did not work. 2 - Writing a function like this : nlog=function(par) { beta=par[1:n] Measure=Test[1] # would this read the text? return(Measure) } So i need to use that code inside the function as above. Any suggestion on how you would do this? Kind Regards, Eliano -- View this message in context: http://r.789695.n4.nabble.com/Help-with-a-function-and-text-tp4660523.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Ggplot2: Moving legend, change fill and removal of space between plots when using grid.arrange() possible use of facet_grid?
Replying to my own post: RStudio is doing this fine once I had restarted R. I must have had some strange stuff loaded that I had not realised was there.

John Kane
Kingston ON Canada

-----Original Message-----
From: jrkrid...@inbox.com
Sent: Wed, 6 Mar 2013 11:16:28 -0800
To: a...@ecology.su.se, r-help@r-project.org
Subject: Re: [R] Ggplot2: Moving legend, change fill and removal of space between plots when using grid.arrange() possible use of facet_grid?

Placing a legend:

z <- ggplot(mtcars, aes(wt, mpg, colour = factor(cyl))) + geom_point()
z + theme(legend.position = c(.5, .5))

Currently this does not appear to work in RStudio but seems fine if I use gedit or if I run R in a terminal session.

John Kane
Kingston ON Canada

-----Original Message-----
From: a...@ecology.su.se
Sent: Wed, 06 Mar 2013 13:32:42 +0100
To: r-help@r-project.org
Subject: [R] Ggplot2: Moving legend, change fill and removal of space between plots when using grid.arrange() possible use of facet_grid?

Hi,
# For publications, I am not allowed to repeat the axes. I have tried to remove the axes using
# yaxt="n", but it did not work. I have not understood how to do this in ggplot2. Can you help me?
# I also do not want loads of space between the graphs (see the script below with dummy data).
# If I could make it look like the examples on the (nice) examples page
# http://www.ling.upenn.edu/~joseff/rstudy/summer2010_ggplot2_intro.html
# using facet_grid(), I would be very, very happy.
# I also do not want the geometric points to be filled, and the fill="white" command
# does not seem to work. Why? Are there alternatives?
# Furthermore, I would like to add legends inside the plot area instead of on the side,
# as when you use brkdn.plot() from plotrix:
legend("topright", c("A", "B"), pch=c(0,1), bg="white", lty = 1:2, cex=1, bty="n")
# This did not work in ggplot2. What are my alternatives?
# I have extensively searched the internet, and if I have missed something obvious, it was due to
# tiredness and not to laziness.
# Some dummy data: mydata- data.frame(factor1 = factor(rep(LETTERS[1:6], each = 80)), factor2 = factor(rep(c(1:5), each = 16)), factor3 = factor(rep(c(1:4), each = 4)), var1 = rnorm(120, mean = rep(c(0, 3, 5), each = 40), sd = rep(c(1, 2, 3), each = 20)), var2 = rnorm(120, mean = rep(c(6, 7, 8), each = 40), sd = rep(c(1, 2, 3), each = 20))) # Splitting data into 3 data frames (based on factor1) # If I could do this using for example facet_wrap() or facet_grid(), I would be very # happy! I have tried but failed that method. DataAB - mydata[(mydata$factor1) %in% c(A, B), ] DataCD - mydata[(mydata$factor1) %in% c(C, D), ] DataEF - mydata[(mydata$factor1) %in% c(E, F), ] DataAB library(plyr) library(ggplot2) #Plot: levels A and B: # Summary (means etc) SummAB - ddply(DataAB, .(factor3,factor1), summarize, mean = mean(var1, na.rm = FALSE), sdv = sd(var1, na.rm = FALSE), se = 1.96*(sd(var1, na.rm=FALSE)/sqrt(length(var1 SummAB p1 - ggplot(SummAB, aes(factor3, mean, colour = factor1, group = factor1, shape = factor1)) + geom_point(aes(shape=factor(factor1)), color=black, fill=white, position = dodge, width = 0.3, size=3) + geom_line(aes(linetype=factor1), color = black, size = 0.5) + geom_errorbar(aes(ymin = mean - sdv , ymax = mean + sdv), width = 0.3, position = dodge, color = black, size=0.3) + theme_bw() + ylab(expression(paste(my measured stuff))) + xlab(factor3) + ggtitle() + labs(color = factor1, shape = factor1, group = factor1, linetype = factor1) p1 #Plot: levels C and D: # Summary (means etc) SummCD - ddply(DataCD, .(factor3,factor1), summarize, mean = mean(var1, na.rm = FALSE), sdv = sd(var1, na.rm = FALSE), se = 1.96*(sd(var1, na.rm=FALSE)/sqrt(length(var1 p2 - ggplot(SummCD, aes(factor3, mean, colour = factor1, group = factor1, shape = factor1)) + geom_point(aes(shape=factor(factor1)), color=black, fill=white, position = dodge, width = 0.3, size=3) + geom_line(aes(linetype=factor1), color = black, size = 0.5) + geom_errorbar(aes(ymin = mean - sdv , ymax = 
mean + sdv), width = 0.3, position = dodge, color = black, size=0.3) + theme_bw() + ylab(expression(paste(my measured stuff))) + xlab(factor3) + ggtitle() + labs(color = factor1, shape = factor1, group = factor1, linetype = factor1) p2 #Plot: levels C and D: # Summary (means etc) SummEF - ddply(DataEF, .(factor3,factor1), summarize, mean =
Re: [R] Learning the R way – A Wish
On 06/03/2013 07:20, Andrew Hoerner wrote:

Dear Patrick--

After the official Core Team's R manuals and the individual function help pages, I have found The R Inferno to be the single most useful piece of documentation when I have gotten stuck with an R problem. It is the only introduction that seems to be aware of the ambiguities present in the official documentation and of some of the ways one can get stuck in traps of misunderstanding. Plus, it is enjoyably witty.

When I first started using it, I found it ranged from very useful to pretty frustrating. I did not always understand what the examples you presented were trying to say. It is still true that I occasionally wish for a little more discursive explanatory style, but as time goes by I

Actually I find myself sometimes thinking the same thing.

Pat

find that I am increasingly likely to get the point just from the example.

Many thanks, Andrew

On Tue, Mar 5, 2013 at 1:46 AM, Patrick Burns pbu...@pburns.seanet.com wrote:

Andrew,

That sounds like a sensible document you propose. Perhaps I'll do a few blog posts along that vein -- thanks. I presume you know of 'The R Inferno', which does a little of what you want.

Pat

On 04/03/2013 23:42, andrewH wrote:

There is something that I wish I had that I think would help me a lot to be a better R programmer, that I think would probably help many others as well. I put the wish out there in the hopes that someone might think it was worth doing at some point.

I wish I had the code of some substantial, widely used package – lm, say – heavily annotated and explained at roughly the level of R knowledge of someone who has completed an intro statistics course using R and picked up some R along the way.
The idea is that you would say what the various blocks of code are doing, why the authors chose to do it this way rather than some other way, point out coding techniques that save time or memory or prevent errors relative to alternatives, and generally, to explain what it does and point out and explain as many of the smarter features as possible. Ideally, this would include a description at least at the conceptual level if not at the code level of the major C functions that the package calls, so that you understand at least what is happening at that level, if not the nitty-gritty details of coding. I imagine this as a piece of annotated code, but maybe it could be a video of someone, or some couple of people, scrolling through the code and talking about it. Or maybe something more like a wiki page, with various people contributing explanations for different lines, sections, and practices. I am learning R on my own from books and the internet, and I think I would learn a lot from a chatty line-by-line description of some substantial block of code by someone who really knows what he or she is doing – perhaps with a little feedback from some people who are new about where they get lost in the description. There are a couple of particular things that I personally would hope to get out of this. First, there are lots of instances of good coding practice that I think most people pick up from other programmers or by having individual bits of code explained to them that are pretty hard to get from books and help files. I think this might be a good way to get at them. Second, there are a whole bunch of functions in R that I call meta-programming functions – don’t know if they have a more proper name. These are things that are intended primarily to act on R language objects or to control how R objects are evaluated. They include functions like call, match.call, parse and deparse, deparen, get, envir, substitute, eval, etc. 
Although I have read the individual documentation for many of these command, and even used most of them, I don’t think I have any fluency with them, or understand well how and when to code with them. I think reading a good-sized hunk of code that uses these functions to do a lot of things that packages often need to do in the best-practice or standard R way, together with comments that describe and explain them would help a lot with that. (There is a good smaller-scale example of this in Friedrich Leisch’s tutorial on creating R packages). These are things I think I probably share with many others. I
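As a small-scale illustration of the "meta-programming" functions named above, here is a toy sketch (the function show_call and the variable a are invented for this example and do not come from lm or any package). It shows match.call() capturing the call, substitute() capturing an unevaluated argument, deparse() turning it into text, and parse()/call()/eval() going the other way.

```r
# A toy function that inspects how it was called
show_call <- function(x, ...) {
  cl   <- match.call()       # the call as the user wrote it, with argument names filled in
  expr <- substitute(x)      # the unevaluated expression supplied for x
  list(call  = cl,
       label = deparse(expr),               # expression -> character string
       value = eval(expr, parent.frame()))  # evaluate it in the caller's environment
}

a <- 2
res <- show_call(a + 3)
print(res$call)
print(res$label)
print(res$value)

# Text -> expression -> value
e <- parse(text = "a * 10")[[1]]
print(class(e))
print(eval(e))

# Building a call object by hand and evaluating it
cl2 <- call("sum", 1, 2, 3)
print(cl2)
print(eval(cl2))
```

Modelling functions often use the same pieces together: match.call() to record the call for printing, and substitute()/deparse() to label output with the names of the supplied arguments.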
Re: [R] Troubles with labeling x axis
On 2013-03-06 06:07, iDa wrote:

Hi! I have problems with labeling the x axis while plotting time series data. I have 40 monthly measurements. One period lasts 4 months. I'd like to have 40 ticks on the x axis (10 larger, the rest smaller) and labels just at the beginning of each period, just like in the image http://r.789695.n4.nabble.com/file/n4660465/2221.jpg

My code leaves the x axis empty:

data <- read.csv(file="CSV files/Komen.csv", head=TRUE, sep=";")
dataTimeSeries <- ts(data, frequency=12, start=c(2000,4))
dataTimeSeries

     Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
2000               7  45  47   3  24 132  35  32  28
2001 161  48  31  33 161 154 420  19 149  44  54  16
2002 152  94  43  64 193  85  98  77 236  87  72  47
2003 196 120  51  27 143  99  56

require(graphics)
plot.ts(dataTimeSeries, xaxt="n", xlab="Perioda", ylab="Opazovane vrednosti", type='l', col='red')
axis(side=1, at=seq(1,40,4), labels=seq(1,10,1))

Thanks in advance for any help!

Have a look at what par("usr") gives to see that your 'at' setting makes no sense.

Peter Ehlers
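To spell out the hint about par("usr"): a ts object with frequency 12 starting in April 2000 is plotted against decimal years (roughly 2000.25 to 2003.5), so tick positions 1, 5, 9, ... fall far outside the plotting region. The sketch below uses simulated counts (not the Komen.csv data) and places the ticks at the series' own time values; the variable names are illustrative.

```r
set.seed(42)
# 40 simulated monthly values starting April 2000 (stand-in for the real data)
y <- ts(rpois(40, lambda = 80), frequency = 12, start = c(2000, 4))

plot(y, xaxt = "n", xlab = "Perioda", ylab = "Opazovane vrednosti", col = "red")

# User coordinates of the plot: x runs over decimal years, not 1..40
print(par("usr")[1:2])

tt <- as.numeric(time(y))            # x position of each of the 40 observations
axis(side = 1, at = tt, labels = FALSE, tcl = -0.25)  # 40 small ticks

major <- tt[seq(1, 40, by = 4)]      # first month of each 4-month period
axis(side = 1, at = major, labels = 1:10)             # 10 larger, labelled ticks
```

The same idea applies to any ts plot: take positions from time(y) rather than from the observation index.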
[R] Course: Beginner's Guide to MCMC, GLM and GAM with R
There are a few places left on the following course:

Beginner's Guide to MCMC, GLM and GAM with R
When: 10 - 13 June 2013
Where: SAMS, Oban, Scotland
Further information: http://www.highstat.com/statscourse.htm
Flyer: http://www.highstat.com/Courses/Flyer2013June_SAMS.pdf

Kind regards,
Alain Zuur
[R] Issue when reading a table into R
Hello everyone,

I was reading a table into R, and when trying to retrieve it the following message appeared:

[ reached getOption("max.print") -- omitted 469376 rows ]

Does this mean that R left out 469376 rows? Or is R keeping those 469376 rows as well, and the limitation is only for printing purposes?

Thanks in advance for any help,
Best regards,
Paul
Re: [R] Issue when reading a table into R
On Wed, 6 Mar 2013, Paul Bernal wrote:

I was reading a table into R, and when trying to retrieve it the following message appeared:

[ reached getOption("max.print") -- omitted 469376 rows ]

Does this mean that R left out 469376 rows? Or R is taking those 469376 rows as well and the limitation is only for printing purposes?

Paul,

I see this message when I look at the contents of a data frame that is very large. The data are all there, but there is a limit to the number of rows that will be 'printed' to the display. If you use the str() function you'll see the number of rows as well as descriptions of the column contents.

HTH,
Rich
Re: [R] Issue when reading a table into R
On 06/03/2013 3:58 PM, Paul Bernal wrote:

Hello everyone, I was reading a table into R, and when trying to retrieve it the following message appeared:

[ reached getOption("max.print") -- omitted 469376 rows ]

Does this mean that R left out 469376 rows? Or R is taking those 469376 rows as well and the limitation is only for printing purposes?

Only for printing. You can find out what the object looks like internally with str(x) or a similar function.

Duncan Murdoch
Re: [R] Issue when reading a table into R
It is just a limitation on the printing of the data to the console. Change the 'max.print' option if you want more lines output to the console.

On Wed, Mar 6, 2013 at 3:58 PM, Paul Bernal paulberna...@gmail.com wrote:

Hello everyone, I was reading a table into R, and when trying to retrieve it the following message appeared:

[ reached getOption("max.print") -- omitted 469376 rows ]

Does this mean that R left out 469376 rows? Or R is taking those 469376 rows as well and the limitation is only for printing purposes?

Thanks in advance for any help,
Best regards,
Paul

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
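The three replies agree that nothing is lost. A short sketch of how one might check this, using a simulated data frame (the object name big and its columns are made up here) rather than Paul's table:

```r
# A data frame large enough to exceed the default print limit
big <- data.frame(id = seq_len(500000), x = rnorm(500000))

# All rows are present in the object, whatever the console shows
print(dim(big))
str(big)              # compact summary: number of rows plus column types
print(head(big, 3))   # look at the first few rows instead of printing everything

# max.print only controls how many entries print() writes to the console
old <- options(max.print = 200)   # options() returns the previous settings
print(getOption("max.print"))
options(old)                      # restore the previous limit
```

Inspecting with dim(), str(), head() or summary() is usually more practical than raising max.print for a table of this size.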
[R] how to get xmlToList() to retry if http fails
Hi,

I am using xmlToList() in a loop with a call to a web service, per the code below.

# Loop thru target locs
for(i in 1:num.target.locs) {
  url <- paste(sep="/", "http://www.earthtools.org/timezone", lat[i], lon[i])
  tmp <- xmlToList(url)
  df$time.offset[i] <- tmp$offset
  system("sleep 1")  # wait 1 second per requirements of above web service
} # end loop thru target locations

Failure struck midway through my loop, with the message below.

failed to load HTTP resource
Error: 1: failed to load HTTP resource

I presume that the web service failed to respond in this instance. How can I trap the error and have it retry after waiting a second or two, instead of exiting?

Thanks.

--Scott Waichler
Pacific Northwest National Laboratory
Richland, WA, USA
scott.waich...@pnnl.gov
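One possible shape for such a retry is sketched below with tryCatch(). The helper with_retry and the stand-in function flaky are hypothetical names invented for this illustration; flaky simulates a service that fails twice before answering, so the sketch runs without network access or the XML package.

```r
# Call f(...) up to max_tries times, pausing `wait` seconds after each failure
with_retry <- function(f, ..., max_tries = 5, wait = 1) {
  for (attempt in seq_len(max_tries)) {
    result <- tryCatch(f(...), error = function(e) e)  # trap the error instead of exiting
    if (!inherits(result, "error")) return(result)
    message("attempt ", attempt, " failed: ", conditionMessage(result))
    if (attempt < max_tries) Sys.sleep(wait)
  }
  stop(result)  # give up: re-signal the last error
}

# Stand-in for the web service call: fails on the first two calls
calls <- 0
flaky <- function(x) {
  calls <<- calls + 1
  if (calls < 3) stop("failed to load HTTP resource")
  x * 2
}

out <- with_retry(flaky, 21, wait = 0)
print(out)
print(calls)

# In the loop above the call might then read (untested against the real service):
#   tmp <- with_retry(xmlToList, url, wait = 2)
```

Sys.sleep(1) is also a portable replacement for system("sleep 1") when pausing between requests.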
Re: [R] chi square exact test
Actually, the http://www.sussex.ac.uk/its/pdfs/SPSS_Exact_Tests_20.pdf file indicates that for small samples and a one-way chi square test, SPSS uses a multinomial distribution to tabulate the distribution of chi square for a given N, K, and probability of membership in each group. In package stats, the dmultinom() function can be used to accomplish this. The last example on the help page shows the steps.

--
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of Milan Bouchet-Valat
Sent: Wednesday, March 06, 2013 1:17 PM
To: Knut Krueger
Cc: r-h...@stat.math.ethz.ch
Subject: Re: [R] chi square exact test

On Wednesday, 6 March 2013 at 18:38 +0100, Knut Krueger wrote:
On 06.03.2013 at 18:29, Milan Bouchet-Valat wrote:
On Wednesday, 6 March 2013 at 18:03 +0100, Knut Krueger wrote:
On 06.03.2013 at 14:27, Nicole Ford wrote:

Dear Nicole, maybe you are wondering, but I know Google and I use Google before asking here. If you are more familiar with Google, please help me find the search term that leads to the R function for an exact chi square test usable as a one-column test for a sample size less than 6.

You are welcome to use this search: http://www.giyf.com/chi%20square%20exact

Thanks in advance, Knut

See ?fisher.test.

fisher test needs two columns; I need a one column exact test.

|x| either a two-dimensional contingency table in matrix form, or a factor object.
|y| a factor object; ignored if |x| is a matrix.

Sorry, I missed that part. Can you tell us more about the test you do in SPSS? Are you testing the adequacy of a given distribution to the data? In short: what do you test? Is that test documented somewhere? I found this document, but there does not seem to be such a test there: http://www.sussex.ac.uk/its/pdfs/SPSS_Exact_Tests_20.pdf

Regards

Knut
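The multinomial approach described in the reply above can be sketched in a few lines of base R. This is only an illustration under stated assumptions, not the SPSS procedure itself: the helper names compositions() and exact_chisq_gof() are made up for this example, equal cell probabilities are assumed by default, and the full enumeration is only practical for small N and K.

```r
# Hypothetical helper: all ways to distribute n observations over k cells.
compositions <- function(n, k) {
  if (k == 1) return(matrix(n, nrow = 1))
  unname(do.call(rbind, lapply(0:n, function(first) {
    cbind(first, compositions(n - first, k - 1))
  })))
}

# Hypothetical helper: exact p-value for a one-way chi-square statistic,
# obtained by summing multinomial probabilities (stats::dmultinom) over all
# tables whose statistic is at least as large as the observed one.
exact_chisq_gof <- function(observed,
                            p = rep(1 / length(observed), length(observed))) {
  n <- sum(observed)
  expected <- n * p
  chisq <- function(x) sum((x - expected)^2 / expected)
  tables <- compositions(n, length(observed))
  probs <- apply(tables, 1, dmultinom, prob = p)
  stats <- apply(tables, 1, chisq)
  # small tolerance so that tied statistics are not lost to rounding error
  sum(probs[stats >= chisq(observed) - 1e-9])
}

p_value <- exact_chisq_gof(c(4, 1, 0))  # 5 observations in 3 equally likely cells
p_value
```

For the counts c(4, 1, 0) the tables at least as extreme are the permutations of (5, 0, 0) and (4, 1, 0), so the p-value is 33/243, about 0.136.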
Re: [R] Issue when reading a table into R
Since nobody else has mentioned it: if you are seeing that message when you are reading data in, then you probably failed to assign the data to an R object.

    mydata <- read.table("somefile")  # correct
    read.table("somefile")            # will simply print your data to the console, not save it

I'm not entirely sure what you meant by "retrieve", so maybe you already knew this. You can use e.g. dim(mydata) to find out whether it's the size you expect.

Sarah

On Wed, Mar 6, 2013 at 3:58 PM, Paul Bernal paulberna...@gmail.com wrote:

Hello everyone, I was reading a table into R, and when trying to retrieve it the following message appeared:

[ reached getOption("max.print") -- omitted 469376 rows ]

Does this mean that R left out 469376 rows? Or R is taking those 469376 rows as well and the limitation is only for printing purposes? Thanks in advance for any help, Best regards, Paul

-- Sarah Goslee http://www.functionaldiversity.org
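A small sketch of the point made in this thread: the 'max.print' option limits what is shown on the console, not what is stored. The data frame below is made up for illustration; with real data the object would come from read.table().

```r
# Stand-in for a large table that has been read in and assigned.
mydata <- data.frame(id = seq_len(500000), value = seq_len(500000) / 10)

dim(mydata)             # rows and columns actually held in memory
getOption("max.print")  # current console printing limit
head(mydata, 3)         # inspect a few rows instead of printing everything

old <- options(max.print = 20)  # change the printing limit
rows_after <- nrow(mydata)      # the stored data are unaffected
options(old)                    # restore the previous setting
rows_after
```

Checking dim() or nrow() after reading is usually quicker than printing the object and counting.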
Re: [R] How to combine conditional argument and logical argument in R to create subset of data...
Hi, How about this:

    indxTem1 <- paste0(Tem1[,1], Tem1[,2])
    indxTem2 <- paste0(Tem2[,1], Tem2[,2])
    Tem1[!indxTem1 %in% indxTem2, ]
    #        V1 V2
    #  [1,] 333 11
    #  [2,] 111 16
    #  [3,] 111 17
    #  [4,] 111 20
    #  [5,] 222 21
    #  [6,] 222 22
    #  [7,] 222 23
    #  [8,] 222  1
    #  [9,] 222  2
    # [10,] 333  3
    # [11,] 333  4
    # [12,] 333  5
    # [13,] 333  6
    # [14,] 333  7

A.K.
[R] Inverse function using FDA
Hi, Does anyone know how (or whether or not it's possible) to output an inverse of a functional object? I haven't found a way, but since derivatives etc. can be computed using the fda package it seems like this should be possible using this package or another designed for functional data analysis. Thanks, Zoe Richards [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to combine conditional argument and logical argument in R to create subset of data...
Dear Arun

Thanks a million for your prompt reply and I love all four ways in your reply. Tried the code and just realised an issue here: in my real work, my data is about 4GB large and I'm sure that there are many duplicated values in V2, so that is to say my V1 and V2 should be something like

    V1 <- rep(c(rep(111,5),rep(222,5),rep(333,5)),2)  # V1 here are some data index with lots of repeated numeric values
    V2 <- c(1:23, 1:7)                                # there are also duplicated values in V2
    Tem1 <- cbind(V1,V2)
    Tem2 <- Tem1[c(1:10,12:15,18:19),]                # I know that Tem2 is a subset of Tem1...

So how do I get outcome of the difference of Tem1 and Tem2 if the values in V2 having duplicates?

     V1 V2
    333 11
    111 16
    111 17
    111 20
    222 21
    222 22
    222 23
    222  1
    222  2
    333  3
    333  4
    333  5
    333  6
    333  7

Massive thanks
HJ

On Wed, Mar 6, 2013 at 4:12 PM, arun smartpink...@yahoo.com wrote:

Just to add:

    Tem1[Tem1[,2] %in% setdiff(Tem1[,2], Tem2[,2]), ]

A.K.

- Original Message -
From: arun smartpink...@yahoo.com
To: HJ YAN yhj...@googlemail.com
Cc: R help r-help@r-project.org
Sent: Wednesday, March 6, 2013 11:06 AM
Subject: Re: [R] How to combine conditional argument and logical argument in R to create subset of data...

Hi, No problem.

    V1 <- rep(c(rep(111,5),rep(222,5),rep(333,5)),2)
    length(V1)
    #[1] 30
    V2 <- c(1:30)  # should be the same length as V1
    Tem1 <- cbind(V1,V2)
    Tem2 <- Tem1[1:20,]
    Tem1[!Tem1[,2] %in% Tem2[,2], ]
    #        V1 V2
    #  [1,] 222 21
    #  [2,] 222 22
    #  [3,] 222 23
    #  [4,] 222 24
    #  [5,] 222 25
    #  [6,] 333 26
    #  [7,] 333 27
    #  [8,] 333 28
    #  [9,] 333 29
    # [10,] 333 30
    #or
    subset(Tem1, !V2 %in% Tem2[,2])
    #or
    Tem1[is.na(match(Tem1[,2], Tem2[,2])), ]
    #        V1 V2
    #  [1,] 222 21
    #  [2,] 222 22
    #  [3,] 222 23
    #  [4,] 222 24
    #  [5,] 222 25
    #  [6,] 333 26
    #  [7,] 333 27
    #  [8,] 333 28
    #  [9,] 333 29
    # [10,] 333 30

A.K.

From: HJ YAN yhj...@googlemail.com
To: arun smartpink...@yahoo.com
Sent: Wednesday, March 6, 2013 10:33 AM
Subject: Re: [R] How to combine conditional argument and logical argument in R to create subset of data...

Thank you SO MUCH Arun!!! That's brilliant-- I've learnt some very useful new R commands now, e.g. 'do.call' and 'split'. And I see where my code went wrong now. I greatly appreciate your prompt reply. Also, I wonder if there exists a package that can find the difference between two data frames, e.g. when one is a subset of the other? e.g.

    V1 <- rep(c(rep(111,5),rep(222,5),rep(333,5)),2)
    V2 <- c(1:23)
    Tem1 <- cbind(V1,V2)
    Tem2 <- Tem1[1:20,]

How do I get outcome like

    [21,] 333 21
    [22,] 333 22
    [23,] 333 23

P.S. I used 'setdiff' before, but seems it only works for vectors but not for dataframe?? Sorry for so many questions today, as I'm coding for a work deadline tonight. Many thanks!

Cheers
HJ

On Wed, Mar 6, 2013 at 1:55 PM, arun smartpink...@yahoo.com wrote:

Hi, You can also try this:

    Tem3 <- list()
    for(i in unique(Tem1[,1])) {
      Tem3[[i]] <- subset(Tem1, Tem1[,1]==i)
      Tem4 <- do.call(rbind, Tem3)
    }
    head(Tem4)
    #       V1 V2
    # [1,] 111  1
    # [2,] 111  2
    # [3,] 111  3
    # [4,] 111  4
    # [5,] 111 13
    # [6,] 111 14
    #or
    Tem3 <- c(NA,NA)
    for(i in unique(Tem1[,1])) {
      Tem2 <- subset(Tem1, Tem1[,1]==i)
      Tem3 <- rbind(Tem3, Tem2)
      Tem5 <- Tem3[-1,]
    }
    head(Tem5)
    #  V1 V2
    # 111  1
    # 111  2
    # 111  3
    # 111  4
    # 111 13
    # 111 14

A.K.

From: HJ YAN yhj...@googlemail.com
To: arun smartpink...@yahoo.com
Cc: r-help@r-project.org
Sent: Wednesday, March 6, 2013 8:24 AM
Subject: Re: [R] How to combine conditional argument and logical argument in R to create subset of data...

Hi Arun

Thank you so much for the help, that's really helpful!! Also I have a quick question about the code below where I can not see why it doesn't work... I know the I shou

    V1 <- c(rep(111,4),rep(222,4),rep(333,4),rep(111,4),rep(222,4),rep(333,3))
    V2 <- c(1:23)
    Tem1 <- cbind(V1,V2)

So Tem1 looks like...

    Tem1
           V1 V2
     [1,] 111  1
     [2,] 111  2
     [3,] 111  3
     [4,] 111  4
     [5,] 222  5
     [6,] 222  6
     [7,] 222  7
     [8,] 222  8
     [9,] 333  9
    [10,] 333 10
    [11,] 333 11
    [12,] 333 12
    [13,] 111 13
    [14,] 111 14
    [15,] 111 15
    [16,] 111 16
    [17,] 222 17
    [18,] 222 18
    [19,] 222 19
    [20,] 222 20
    [21,] 333 21
    [22,] 333 22
    [23,] 333 23

I would like the outcome to be...

     V1 V2
    111  1
    111  2
    111  3
    111  4
    111 13
    111 14
    111 15
    111 16
    222  5
    222  6
    222  7
    222  8
    222 17
    222 18
    222 19
    222 20
    333  9
    333 10
    333 11
    333 12
    333 21
    333 22
    333 23

So I tried code as below

    Tem3 <- c(NA,NA)
    for(i in length(unique(Tem1[,1]))){
      Tem2 <- subset(Tem1, Tem1[,1]==unique(Tem1[,1])[i])
      Tem3 <- rbind(Tem3, Tem2)
      Tem3
    }
    Tem4 <- Tem3[-1,]

And only get this... V1 V2
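A note on the loop quoted at the end of this message: for(i in length(x)) visits only the single value length(x), so only the last group is ever collected, whereas seq_along(x) visits 1, 2, ..., length(x). The sketch below uses the data from the earliest message in the thread; the names groups, pieces, by_loop and by_order are made up for this illustration, and the order() variant is an alternative, not something proposed in the thread.

```r
V1 <- c(rep(111, 4), rep(222, 4), rep(333, 4), rep(111, 4), rep(222, 4), rep(333, 3))
V2 <- 1:23
Tem1 <- cbind(V1, V2)

# Loop version with the index fixed: seq_along() instead of length().
groups <- unique(Tem1[, 1])
pieces <- vector("list", length(groups))
for (i in seq_along(groups)) {
  pieces[[i]] <- Tem1[Tem1[, 1] == groups[i], , drop = FALSE]
}
by_loop <- do.call(rbind, pieces)

# Without a loop: order() keeps tied rows in their original order,
# so rows are grouped by V1 and V2 stays in sequence within each group.
by_order <- Tem1[order(Tem1[, 1]), ]

head(by_order)
```

Collecting the pieces in a pre-sized list and binding once at the end also avoids growing a matrix with rbind() on every iteration.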
Re: [R] how to get xmlToList() to retry if http fails
Hi,

On Mar 6, 2013, at 4:12 PM, Waichler, Scott R wrote:

Hi, I am using xmlToList() in a loop with a call to a web service, per the code below.

    # Loop thru target locs
    for (i in 1:num.target.locs) {
      url <- paste(sep = "/", "http://www.earthtools.org/timezone", lat[i], lon[i])
      tmp <- xmlToList(url)
      df$time.offset[i] <- tmp$offset
      system("sleep 1")  # wait 1 second per requirements of above web service
    }  # end loop thru target locations

Failure struck midway through my loop, with the message below.

    failed to load HTTP resource
    Error: 1: failed to load HTTP resource

You can wrap it in a try function as in the following (untested). I have made the thing stop if the second try fails, but you may want to do something more useful. Check out tryCatch, too.

    for (i in 1:num.target.locs) {
      url <- paste(sep = "/", "http://www.earthtools.org/timezone", lat[i], lon[i])
      tmp <- try(xmlToList(url))
      if (inherits(tmp, "try-error")) {
        Sys.sleep(2)
        tmp <- try(xmlToList(url))
        if (inherits(tmp, "try-error")) stop("Error fetching data")
      }
      df$time.offset[i] <- tmp$offset
      system("sleep 1")  # wait 1 second per requirements of above web service
    }  # end loop thru target locations

Cheers, Ben

I presume that the web service failed to respond in this instance. How can I trap the error and have it retry after waiting a second or two, instead of exiting? Thanks.

--Scott Waichler
Pacific Northwest National Laboratory
Richland, WA, USA
scott.waich...@pnnl.gov

Ben Tupper
Bigelow Laboratory for Ocean Sciences
60 Bigelow Drive, P.O. Box 380
East Boothbay, Maine 04544
http://www.bigelow.org
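Following up on the tryCatch hint, here is one possible way to generalise the retry to any number of attempts. It is a sketch only: retry() is a made-up helper, not part of the XML package, and the flaky() function merely simulates a service that fails twice so that the example runs without network access.

```r
# Hypothetical helper: call fun() and retry on error, waiting between attempts.
retry <- function(fun, max_tries = 3, wait = 2) {
  for (attempt in seq_len(max_tries)) {
    result <- tryCatch(fun(), error = function(e) e)
    if (!inherits(result, "error")) return(result)
    message("Attempt ", attempt, " failed: ", conditionMessage(result))
    if (attempt < max_tries) Sys.sleep(wait)
  }
  stop("All ", max_tries, " attempts failed: ", conditionMessage(result))
}

# Simulated web service call: fails on the first two calls, then succeeds.
calls <- 0
flaky <- function() {
  calls <<- calls + 1
  if (calls < 3) stop("failed to load HTTP resource")
  "ok"
}

out <- retry(flaky, max_tries = 5, wait = 0)
out

# Inside the original loop this would presumably become:
#   tmp <- retry(function() xmlToList(url))
```

Passing the call as a function keeps the helper independent of xmlToList(), so the same wrapper can be reused for other requests.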
Re: [R] Inverse function using FDA
On 2013-03-06 12:54, zoe richards wrote: Hi, Does anyone know how (or whether or not it's possible) to output an inverse of a functional object? I haven't found a way, but since derivatives etc. can be computed using the fda package it seems like this should be possible using this package or another designed for functional data analysis. Thanks, Zoe Richards What does your question mean? Possibly, you could 'invert' a mean function, but I have no idea what that would accomplish. Can you provide an example of just what you want to do? Peter Ehlers __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to combine conditional argument and logical argument in R to create subset of data...
Hi, I am not sure I understand it correctly. In the example you gave, there are duplicated rows in Tem1, i.e. (222 6), (222 7), (333 11), but these rows are also present in Tem2. Is there any chance of triplicates etc.? Also, you wanted to have rows that are not common to Tem1 and Tem2, i.e. (111 1) is the first row in both.

    indxTem1 <- paste0(Tem1[,1], Tem1[,2])
    indxTem2 <- paste0(Tem2[,1], Tem2[,2])
    res <- rbind(Tem1[!indxTem1 %in% indxTem2, ], Tem1[duplicated(Tem1), ])
    res
    #        V1 V2
    #  [1,] 333 12
    #  [2,] 111 16
    #  [3,] 111 17
    #  [4,] 111 20
    #  [5,] 222 21
    #  [6,] 222 22
    #  [7,] 222 23
    #  [8,] 333  4
    #  [9,] 333  5
    # [10,] 333  6
    # [11,] 333  7
    # [12,] 222  6
    # [13,] 222  7
    # [14,] 333 11

In cases of more replicates (triplicates, etc.), how do you want to process them? Also, here the duplicate rows were found only in Tem1.

A.K.
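One possible way to handle duplicates, triplicates and so on in this thread's problem is to number the repeated rows before comparing, so that the second copy of (222, 6) in Tem1 is treated as different from the first. This is a sketch rather than anything proposed by the posters: row_labels() is a made-up helper, the "_" separator is added because paste0() without a separator could in principle merge distinct pairs (11 and 1 versus 1 and 11), and building character keys may be costly on data of several GB.

```r
V1 <- rep(c(rep(111, 5), rep(222, 5), rep(333, 5)), 2)
V2 <- c(1:23, 6, 7, 11, 4, 5, 6, 7)
Tem1 <- cbind(V1, V2)
Tem2 <- Tem1[c(1:11, 13:15, 18:19), ]

# Hypothetical helper: label each row by its (V1, V2) pair plus a running
# count of how many times that pair has appeared so far.
row_labels <- function(m) {
  key <- paste(m[, 1], m[, 2], sep = "_")
  occurrence <- ave(seq_along(key), key, FUN = seq_along)
  paste(key, occurrence, sep = "#")
}

# Rows of Tem1 with no counterpart in Tem2, counting multiplicity.
res <- Tem1[!row_labels(Tem1) %in% row_labels(Tem2), ]
res
```

With the data above this keeps 14 rows, in their original Tem1 order, and the V2 column matches the target outcome listed in the question.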
Re: [R] Help with a function and text
On Mar 6, 2013, at 11:25 AM, Eliano Marques wrote: Hi, can I understand why this message was rejected ? Thanks, Eliano First hit on a Markmail search: http://markmail.org/message/5xog3ayx4amprsdx?q=list:org%2Er-project%2Er-help+nabble+rejected -- David. Sent from my iPhone On 6 Mar 2013, at 19:18, Eliano eliano.m.marq...@gmail.com wrote: Hi everyone, I am writing some code to generate a function. I am passing that code to a dataset which i'm importing in R, e.g. Test=read.table('C:/test.txt', header=F, sep='\t', na.strings='NA', dec='.', strip.white=TRUE) Test V1 (if(nclusters0){OptmizationInputs[3,3]*beta[1]}else{0})+ (if(nclusters1){OptmizationInputs[3,3]*beta[1]}else{0})+ V1 has inside a code for a function. I'm having problems with 2 things: 1 - I need to take out from V1 all that appears in the text, i tried a replace but did not work. Test=replace(Test,' ', ' ') , did not work. 2 - Writing a function like this : nlog=function(par) { beta=par[1:n] Measure=Test[1] # would this read the text? return(Measure) } So i need to use that code inside the function as above. Any suggestion on how you would do this? Kind Regards, Eliano -- View this message in context: http://r.789695.n4.nabble.com/Help-with-a-function-and-text-tp4660523.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius Alameda, CA, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Inverse function using FDA
I am trying to register the unemployment rate and the inverse of the inflation rate to investigate the Phillips curve (http://www.econlib.org/library/Enc/PhillipsCurve.html) by looking at the resulting warping function.
Re: [R] How to combine conditional argument and logical argument in R to create subset of data...
Hi Arun

Massive thanks for the hint about making use of 'paste0'! But coincidentally there was no pair of data exactly the same in indxTem1 and indxTem2 in the previous example. I changed the data as below, which is very likely to be in my real data...

    V1 <- rep(c(rep(111,5),rep(222,5),rep(333,5)),2)  # V1 here are some data index with lots of repeated numeric values
    V2 <- c(1:23, 6,7,11,4,5,6,7)                     # there are also duplicated values in V2
    Tem1 <- cbind(V1,V2)
    Tem2 <- Tem1[c(1:11,13:15,18:19),]                # I know that Tem2 is a subset of Tem1...

And my target outcome is the difference between Tem1 and Tem2 as below:

     V1 V2
    333 12
    111 16
    111 17
    111 20
    222 21
    222 22
    222 23
    222  6
    222  7
    333 11
    333  4
    333  5
    333  6
    333  7

Many thanks
HJ
[R] Fwd: How to conditionally remove dataframe rows?
Hi, I have a data frame with two columns. I need to remove duplicated rows in the first column, but I need to do it conditionally on the values of the second column. Example:

      Point_counts Psi_Sp
    1 A            0
    2 A            1
    3 B            1
    4 B            2
    5 B            0
    6 C            1
    7 D            1
    8 D            2

I need to turn this data frame into one without duplicated rows in Point_counts (one visit per point) but keep the ones with the maximum value of Psi_Sp, e.g. remove row 1 and keep row 2, or remove rows 3 and 5 and keep row 4. At the end I want a data frame like the one below:

      Point_counts Psi_Sp
    1 A            1
    2 B            2
    3 C            0
    4 D            2

How can I do it? I found several ways to edit data frames, but unfortunately I could not use any of them.

I appreciate any help,
Francisco
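Two base R sketches for keeping one row per point with the largest Psi_Sp, using the example data as posted. The object names visits, best and keep are made up for this illustration. Note that with the data as posted, point C has a single row with Psi_Sp equal to 1, so the "maximum" rule gives 1 for C rather than the 0 shown in the desired output.

```r
visits <- data.frame(
  Point_counts = c("A", "A", "B", "B", "B", "C", "D", "D"),
  Psi_Sp       = c(0, 1, 1, 2, 0, 1, 1, 2)
)

# Option 1: maximum of Psi_Sp within each point.
best <- aggregate(Psi_Sp ~ Point_counts, data = visits, FUN = max)
best

# Option 2: sort so the largest Psi_Sp comes first within each point, then
# keep the first row per point. This keeps any additional columns intact.
ord  <- visits[order(visits$Point_counts, -visits$Psi_Sp), ]
keep <- ord[!duplicated(ord$Point_counts), ]
keep
```

Option 1 is the shortest when only the two columns matter; option 2 is the safer choice when the data frame carries other columns that should travel with the selected row.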
Re: [R] Help with a function and text
Thanks. Btw are you able to help with my issue? Thanks, Eliano

Sent from my iPhone
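On the second part of the original question in this thread (using code stored as text inside a function), one common base R route is parse() followed by eval(). The sketch below rests on several assumptions: the comparison operators in the posted text were lost in the archive, so nclusters > 0 and nclusters > 1 are guesses; the names measure_text, measure_expr and inputs are made up; and the function signature is simplified. Evaluating text as code runs whatever the file contains, so this only makes sense for files you trust.

```r
# Assumed stand-in for the text held in the first column of the imported table.
measure_text <- paste(
  "(if (nclusters > 0) {inputs[3, 3] * beta[1]} else {0}) +",
  "(if (nclusters > 1) {inputs[3, 3] * beta[2]} else {0})"
)

# Turn the text into an unevaluated R expression once, outside the function.
measure_expr <- parse(text = measure_text)

nlog <- function(par, nclusters, inputs) {
  beta <- par
  # Evaluate inside this function so that beta, nclusters and inputs are visible.
  eval(measure_expr, envir = environment())
}

inputs <- matrix(1:9, nrow = 3)  # inputs[3, 3] is 9
nlog(c(2, 3), nclusters = 2, inputs = inputs)
```

If the text comes from read.table(), reading it with stringsAsFactors = FALSE (or converting with as.character()) avoids passing a factor to parse().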
Re: [R] How to transpose it in a fast way?
On Wed, Mar 6, 2013 at 4:18 PM, Yao He yao.h.1...@gmail.com wrote:

> Dear all:
>
> I have a big data file of 6 columns and 6 rows like that:
>
>   AA AC AA AA ... AT
>   CC CC CT CT ... TC
>   ...
>
> I want to transpose it, and the output is a new file like that:
>
>   AA CC ...
>   AC CC ...
>   AA CT ...
>   AA CT ...
>   ...
>   AT TC ...
>
> The key point is that I can't read it into R with read.table() because the data is too large, so I tried this:
>
>   c <- file("silygenotype.txt", "r")
>   geno_t <- list()
>   repeat {
>     line <- readLines(c, n = 1)
>     if (length(line) == 0) break  # end of file
>     line <- unlist(strsplit(line, "\t"))
>     geno_t <- cbind(geno_t, line)
>   }
>   write.table(geno_t, "xxx.txt")
>
> It works, but it is too slow. How can I optimize it?

I hate to be negative, but this will also not work on a 6x 6 matrix. At some point R will complain either about the lack of memory or about you trying to allocate a vector that is too long.

I think your best bet is to look at file-backed data packages (for example, package bigmemory). Look at this URL:

http://cran.r-project.org/web/views/HighPerformanceComputing.html

and scroll down to "Large memory and out-of-memory data". Some of the packages may have the functionality you are looking for and may do it faster than your code.

If this doesn't help, you _may_ be able to make your code work, albeit slowly, if you replace the cbind() by data.frame. cbind() will in this case produce a matrix, and matrices are limited to fewer than 2^31 elements, which is less than 6 times 6. A data.frame is a special type of list and so _may_ be able to handle that many elements, given enough system RAM. There are experts on this list who will correct me if I'm wrong.

If you are on a Linux system, you can use split (type "man split" at the shell prompt to see help) to split the file into smaller chunks of, say, 5000 lines or so. Process each file separately, write it into a separate output file, then use the Linux utility paste to paste the files side by side into the final output.

Further, if you want to make it faster, do not grow geno_t by cbind'ing a new column to it in each iteration. Pre-allocate a matrix or data frame of an appropriate number of rows and columns and fill it out as you go. But it will still be slow, which I think is due to the inherent slowness of readLines and possibly strsplit.

HTH,
Peter
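The pre-allocation advice above can be sketched roughly as follows. This is only an illustration under stated assumptions: the tiny temporary file stands in for the real genotype file, the names infile, outfile and chunk_size are made up for the example, and the whole transposed result is still assumed to fit in memory, so the size limits discussed above still apply to the real data.

```r
# Hypothetical stand-in for the genotype file: 2 rows, 4 tab-separated columns
infile <- tempfile(fileext = ".txt")
writeLines(c("AA\tAC\tAA\tAT", "CC\tCC\tCT\tTC"), infile)

# First pass: count lines and fields without keeping the data itself
nf     <- count.fields(infile, sep = "\t")
n_rows <- length(nf)
n_cols <- nf[1]

# Pre-allocate the transposed matrix once instead of growing it with cbind()
geno_t <- matrix(NA_character_, nrow = n_cols, ncol = n_rows)

# Second pass: read the file in chunks and fill one column per input line
con <- file(infile, "r")
chunk_size <- 1000L   # assumed value; tune to the available memory
i <- 0L
repeat {
  lines <- readLines(con, n = chunk_size)
  if (length(lines) == 0) break          # end of file
  for (f in strsplit(lines, "\t", fixed = TRUE)) {
    i <- i + 1L
    geno_t[, i] <- f                     # input row i becomes output column i
  }
}
close(con)

outfile <- tempfile(fileext = ".txt")
write.table(geno_t, outfile, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)
```

Reading several lines per readLines() call and filling a fixed-size matrix avoids the repeated copying that cbind() causes in the original loop.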
Re: [R] How to transpose it in a fast way?
On Wed, Mar 6, 2013 at 7:56 PM, Peter Langfelder peter.langfel...@gmail.com wrote:

> On Wed, Mar 6, 2013 at 4:18 PM, Yao He yao.h.1...@gmail.com wrote:
>
>> [...] It works, but it is too slow. How can I optimize it?
>
> I hate to be negative, but this will also not work on a 6x 6 matrix. At some point R will complain either about the lack of memory or about you trying to allocate a vector that is too long.

Maybe this depends on the R version. I have not tried it, but the development version of R can handle much larger vectors. See

http://stat.ethz.ch/R-manual/R-devel/library/base/html/LongVectors.html

Yao He, if you are feeling adventurous you could give the development version of R a try.

Best,
Ista
Re: [R] Fwd: How to conditionally remove dataframe rows?
On Mar 6, 2013, at 3:21 PM, Francisco Carvalho Diniz wrote:

> Hi,
>
> I have a data frame with two columns. I need to remove duplicated rows in the first column, but I need to do it conditionally on the values of the second column.
>
> Example:
>
>     Point_counts Psi_Sp
>   1 A            0
>   2 A            1
>   3 B            1
>   4 B            2
>   5 B            0
>   6 C            1
>   7 D            1
>   8 D            2
>
> I need to turn this data frame into one without duplicated rows in Point_counts (one visit per point) but maintain the ones with the maximum value of Psi_Sp, e.g. remove row 1 and maintain 2, or remove rows 3 and 5 and maintain 4. At the end I want a data frame like the one below:

Try this:

  # put desired rows at top of each Point_counts category
  dfrm <- dfrm[ order(dfrm[[1]], -dfrm[[2]]), ]
  # then take top item in each category
  dfrm[ !duplicated(dfrm[[1]]), ]

>     Point_counts Psi_Sp
>   1 A            1
>   2 B            2
>   3 C            0
>   4 D            2
>
> How can I do it? I found several ways to edit data frames, but unfortunately I could not use any of them.
>
> I appreciate

-- 
David Winsemius
Alameda, CA, USA
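For anyone wanting to try the order()/duplicated() idea from this reply directly, here is a small self-contained sketch. The data frame is rebuilt from the example in the question; the names sorted and res are only illustrative.

```r
# Example data as posted in the question
dfrm <- data.frame(
  Point_counts = c("A", "A", "B", "B", "B", "C", "D", "D"),
  Psi_Sp       = c(0, 1, 1, 2, 0, 1, 1, 2),
  stringsAsFactors = FALSE
)

# Sort by point, with the largest Psi_Sp first within each point
sorted <- dfrm[order(dfrm$Point_counts, -dfrm$Psi_Sp), ]

# duplicated() flags every row after the first in each group, so negating it
# keeps exactly the row with the maximum Psi_Sp per point
res <- sorted[!duplicated(sorted$Point_counts), ]
rownames(res) <- NULL
res
```

With the posted input, point C keeps the value 1, because the input contains no row with C and 0.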
Re: [R] Fwd: How to conditionally remove dataframe rows?
Hi,

  dfrm <- read.table(text = "
    Point_counts Psi_Sp
  1 A 0
  2 A 1
  3 B 1
  4 B 2
  5 B 0
  6 C 1
  7 D 1
  8 D 2
  ", sep = "", header = TRUE, stringsAsFactors = FALSE)
  res <- do.call(rbind, lapply(split(dfrm, dfrm$Point_counts),
                               function(x) x[which.max(x$Psi_Sp), ]))
  row.names(res) <- 1:nrow(res)
  res
  #   Point_counts Psi_Sp
  # 1            A      1
  # 2            B      2
  # 3            C      1   # your input data doesn't have 0
  # 4            D      2

A.K.

----- Original Message -----
From: Francisco Carvalho Diniz chicocdi...@gmail.com
To: r-help@r-project.org
Sent: Wednesday, March 6, 2013 6:21 PM
Subject: [R] Fwd: How to conditionally remove dataframe rows?
Re: [R] multiple plots and looping assistance requested (revised codes)
Hi Irucka,

I tried it and was able to plot it without any errors. Here, your code indicates you need two lines.

  temper[[i]][1]
  temper[[1]][1]  # which is column 1
  #   Month
  # 1     1
  # 2     2
  # 3     3
  temper[[1]][2]
  #   Data1
  # 1   1.5
  # 2  12.3
  # 3  11.4

Suppose I use names(temper) instead of seq_along(temper):

  pdf("irucka.pdf")
  lapply(names(temper), function(i) {
    plot(as.matrix(temper[[i]][1]), as.matrix(temper[[i]][2]),
         main = "Fluxmaster versus EGRET/WRTDS \n Seasonal Flux Sum", sub = i,
         xlab = "Calendar Year Timesteps", ylab = "Total Flux (kg/season)")
    lines(temper[[i]][1])
    lines(temper[[i]][2])
  })
  dev.off()

which may not be the one you wanted.

A.K.

From: Irucka Embry iruc...@mail2world.com
To: smartpink...@yahoo.com
Sent: Wednesday, March 6, 2013 9:32 PM
Subject: Re: [R] multiple plots and looping assistance requested (revised codes)

Hi Arun,

I was only able to plot by changing from names(temper) to seq_along(temper) and by providing a numeric column entry for the [i] index. My problem has been trying to figure out how to index each column by skipping column 1. Do you have any suggestions?

  tempernow <- lapply(seq_along(temper), function(i) {
    plot(as.matrix(temp[[i]][1]), as.matrix(temp[[i]][2]),
         main = "Fluxmaster versus EGRET/WRTDS \n Seasonal Flux Sum", sub = i,
         xlab = "Calendar Year Timesteps", ylab = "Total Flux (kg/season)")
    lines(temp[[i]][1], temp[[i]][2])
  })
  Error in xy.coords(x, y) : (list) object cannot be coerced to type 'double'

Thank you.

Irucka
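One possible reading of the error quoted above is that single-bracket indexing such as temp[[i]][1] returns a one-column data frame (a list), which lines() cannot coerce to numbers, whereas [[ ]] returns a plain vector. The sketch below illustrates that, and also loops over every column except the first. It is only a sketch: the list temper is a made-up stand-in (only the Month and first Data1 values come from the thread), and the names site1, site2, Data2 and outfile are hypothetical.

```r
# Hypothetical stand-in for the list of data frames called `temper` in the thread
temper <- list(
  site1 = data.frame(Month = 1:3, Data1 = c(1.5, 12.3, 11.4), Data2 = c(2.0, 10.1, 9.7)),
  site2 = data.frame(Month = 1:3, Data1 = c(0.8, 7.2, 6.5),  Data2 = c(1.1, 6.9, 7.3))
)

outfile <- tempfile(fileext = ".pdf")
pdf(outfile)
for (nm in names(temper)) {
  d     <- temper[[nm]]
  x     <- d[[1]]              # [[ ]] gives a numeric vector, not a data frame
  ycols <- seq_along(d)[-1]    # indices of every column except the first

  # Empty plot sized to hold all series, one page per list element
  plot(x, d[[2]], type = "n", ylim = range(unlist(d[ycols])),
       main = "Fluxmaster versus EGRET/WRTDS \n Seasonal Flux Sum", sub = nm,
       xlab = "Calendar Year Timesteps", ylab = "Total Flux (kg/season)")

  # One line per data column, plotted against column 1
  for (j in ycols) lines(x, d[[j]], col = j)
}
invisible(dev.off())
```

Looping over names(temper) keeps the element name available for the subtitle, while seq_along(d)[-1] handles the "skip column 1" part of the question.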
Re: [R] Help with a function and text
On Mar 6, 2013, at 3:44 PM, Eliano wrote:

> Thanks. Btw are you able to help with my issue?
>
> Thanks, Eliano

I'm sorry, I was too busy answering the question from 'Eliano' over on StackOverflow. I didn't have time to address this one. (Please do note that cross-posting questions to Rhelp is contrary to advice in the Posting Guide.)

You might also do further searching in the Archives with the search terms:

  substitute text eval

and perhaps narrow it down further with contributors named:

  grothendieck, dunlap, ligges, venables

-- 
David Winsemius
Alameda, CA, USA
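As a pointer to what the suggested search terms lead to, here is a minimal sketch of turning stored text into code with parse() and eval(). It is not taken from the thread: the comparison operators in the text are assumed (they were lost in the archived message), and the values given to OptmizationInputs and nclusters, as well as the name code_lines, are made up so the example runs on its own.

```r
# Hypothetical stand-ins for the objects mentioned in the question
OptmizationInputs <- matrix(1:9, nrow = 3)   # element [3,3] is 9
nclusters <- 1

# Text as it might be stored in Test$V1 (the `>` comparisons are an assumption)
code_lines <- c(
  "(if (nclusters > 0) {OptmizationInputs[3,3]*beta[1]} else {0})+",
  "(if (nclusters > 1) {OptmizationInputs[3,3]*beta[1]} else {0})+"
)

# Join the pieces and drop the trailing "+" so the text is a complete expression
expr_text <- sub("\\+\\s*$", "", paste(code_lines, collapse = " "))
expr <- parse(text = expr_text)[[1]]

nlog <- function(par) {
  beta <- par
  eval(expr)   # evaluated inside the function, so the local `beta` is visible
}

nlog(c(2, 3))
```

Evaluating text as code runs whatever the file contains, so this approach is only reasonable when the input file is trusted.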
[R] Multivariate Power Test?
Generic question: I am familiar with generic power calculations in R; however, a lot of the data I primarily work with is multivariate. Is there any package or function that you would recommend for conducting such a power analysis? Any recommendations would be appreciated.

Thank you for your time,
Charles
Re: [R] (no subject)
Is there an R package that uses sampling weights in multilevel modeling? The survey package does not handle multilevel modeling, and the weights option in the lmer and nlmer functions from lme4 (used for multilevel modeling) is for weighted least squares estimation. Suggestions from anyone with experience in this subject (including creating weights from strata and sampling unit variables) would be helpful. For example, when analyzing data clustered in schools, how should one use the student's sampling weight, the school's sampling weight, or both?

Peter Maclean
Department of Economics
UDSM