Re: [R] Basic misunderstanding, or problem with my installation?
Thanks to the three people who saw what I missed. I typed my code in Libre Office as < followed by -, and that program converted those two characters into a single left arrow symbol. I copied the commands from Libre into R without noticing that that had happened. Weird.

On 12/31/2013 7:54 PM, Sarah Goslee wrote:

Hi David, Your code is showing up here with arrow symbols. If it's an actual cut and paste, that's your problem: assignment in R is the two-character <- and not an arrow symbol. Otherwise your code looks fine. Sarah

On Tuesday, December 31, 2013, David Parkhurst wrote:

I've just uninstalled and then reinstalled R on my Windows 7 machine. To test my understanding of data frames, I'm trying the following code. (I plan to do other things with it, if it would only work.) Here's the code, which seems pretty basic to me:

ls()
nums ← c(1,2,3,4,5)
ltrs ← c(“a”,”b”,”c”,”d”,”e”)
df1 ← data.frame(nums,ltrs)

Here's what happens when I try to run it:

ls()
character(0)
nums ← c(1,2,3,4,5)
Error: unexpected input in nums \
ltrs ← c(“a”,”b”,”c”,”d”,”e”)
Error: unexpected input in ltrs \
df1 ← data.frame(nums,ltrs)
Error: unexpected input in df1 \

Am I really misunderstanding the basics, or is there something wrong with my installation? David

-- Sarah Goslee http://www.stringpage.com http://www.sarahgoslee.com http://www.functionaldiversity.org

R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Basic misunderstanding, or problem with my installation?
On Wed, Jan 1, 2014 at 4:36 AM, David Parkhurst parkh...@imap.iu.edu wrote:

Thanks to the three people who saw what I missed. I typed my code in Libre Office as < followed by -, and that program converted those two characters into a single left arrow symbol. I copied the commands from Libre into R without noticing that that had happened. Weird.

You should probably install RStudio and use that! You'll get syntax highlighting, bracket matching, and no magic conversion of arrows! www.rstudio.com

Barry
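A quick way to catch this kind of paste damage before running the code is to look for characters outside printable ASCII. The following is only a sketch: the vector `pasted` is a made-up stand-in for lines copied from a word processor, and "\u2190" is assumed to be the arrow that was substituted for <-.

```r
# Hypothetical pasted lines; "\u2190" is the left arrow assumed to have replaced "<-"
pasted <- c("nums \u2190 c(1,2,3,4,5)", "nums <- c(1,2,3,4,5)")

# TRUE for every line that contains a character outside printable ASCII
has_non_ascii <- grepl("[^\\x20-\\x7E]", pasted, perl = TRUE)

# Put the two-character assignment operator back in place of the arrow
fixed <- gsub("\u2190", "<-", pasted, fixed = TRUE)
```

Typing code in a plain-text editor, or switching off autocorrect in the word processor, avoids the substitution in the first place.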
Re: [R] howto join matrices produced by rcorr()
This is an unstable process. I suggest using the bootstrap to get a confidence interval for the rank of each correlation coefficient among all non-diagonal correlations.

Frank Harrell
Department of Biostatistics, Vanderbilt University
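A rough sketch of what such a bootstrap could look like is given below. It is not taken from the thread: the data matrix `x` is simulated, plain `cor()` stands in for `rcorr()` so that the example needs no extra package, and the number of resamples `B` and the percentile interval are arbitrary choices.

```r
set.seed(1)
# Simulated stand-in data: 200 observations on 4 variables (an assumption)
x <- matrix(rnorm(200 * 4), ncol = 4, dimnames = list(NULL, paste0("v", 1:4)))

# Logical mask selecting each off-diagonal correlation once (lower triangle)
lower <- lower.tri(diag(ncol(x)))

B <- 500  # number of bootstrap resamples, chosen arbitrarily
ranks <- replicate(B, {
  idx <- sample(nrow(x), replace = TRUE)  # resample rows with replacement
  rank(cor(x[idx, ])[lower])              # rank the 6 correlations in this resample
})

# Percentile interval for the rank of each correlation: one row per correlation
ci <- t(apply(ranks, 1, quantile, probs = c(0.025, 0.975)))
```

Wide intervals in `ci` would indicate that the ordering of the correlations is not well determined by the data.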
[R] expression in xlab for a plot
Dear All, Happy new year! I wonder if you could help with the following. We have:

hist(runif(1000,0,100),xlab=expression(AUC[0 - 24]~ (xyz)),ylab="Frequency")

The plan is to have part of the xlab expression change dynamically. Specifically, the values 0 and 24 should update automatically, based on vectors low and high, so if we had:

low <- 24
high <- 48

then we should get this:

hist(runif(1000,0,100),xlab=expression(AUC[24 - 48]~ (xyz)),ylab="Frequency")

I appreciate your insights, Andras
Re: [R] expression in xlab for a plot
Hi, maybe this helps:

hist(runif(1000,0,100),xlab=bquote(AUC[.(low) - .(high)]~ (xyz)),ylab="Frequency")

A.K.
Re: [R] expression in xlab for a plot
Hi, Another way would be to use ?substitute()

hist(runif(1000,0,100),xlab=substitute(expression(AUC[low - high]~ (xyz)),list(low=low,high=high)),ylab="Frequency")

A.K.
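The bquote() suggestion can be checked without looking at a plot, because the label is an ordinary language object. The sketch below assumes low and high are single numbers, as in the question, and draws to a null PDF device only so that it runs without opening a window.

```r
low <- 24
high <- 48

# .() splices the current values of low and high into the unevaluated label
lab <- bquote(AUC[.(low) - .(high)] ~ (xyz))

pdf(NULL)  # null device: nothing is written to disk
hist(runif(1000, 0, 100), xlab = lab, ylab = "Frequency")
invisible(dev.off())
```

Changing low and high and rebuilding `lab` updates the subscript on the next plot.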
[R] mgcv - Markov random field option
Dear R-users, Happy new year to all! I have been using the mgcv package, and I have run some models using the mrf option for spatial data. But I have found it quite hard to interpret the results. I could not find much documentation or many examples on that, so I was wondering if anyone could help me find some. Thanks a lot. Cheers

-- Helena Baptista
[R] to modify a dataframe
Dear All, From the dataframe df1

df1 <- structure(list(Nom = structure(1:9, .Label = c("A1", "A2", "A3", "B1", "B2", "C1", "C2", "C3", "C4"), class = "factor"), Pays1 = c(1, 1, 0, 0, 1, 0, 0, 0, 0), Pays2 = c(0, 0, 0, 1, 1, 0, 1, 0, 1), Pays3 = c(0, 0, 0, 0, 1, 0, 0, 0, 0), Pays4 = c(1, 0, 0, 0, 0, 0, 1, 0, 1), Pays5 = c(1, 1, 0, 0, 0, 0, 0, 0, 0)), .Names = c("Nom", "Pays1", "Pays2", "Pays3", "Pays4", "Pays5"), row.names = c(1L, 3L, 4L, 2L, 5L, 6L, 7L, 8L, 9L), class = "data.frame")

I am looking for a way to build the new dataframe df2

df2 <- structure(list(Nom = structure(1:9, .Label = c("A1", "A2", "A3", "B1", "B2", "C1", "C2", "C3", "C4"), class = "factor"), Pays1 = c(1, 1, 1, 1, 1, 0, 0, 0, 0), Pays2 = c(0, 0, 0, 1, 1, 1, 1, 1, 1), Pays3 = c(0, 0, 0, 1, 1, 0, 0, 0, 0), Pays4 = c(1, 1, 1, 0, 0, 1, 1, 1, 1), Pays5 = c(1, 1, 1, 0, 0, 0, 0, 0, 0)), .Names = c("Nom", "Pays1", "Pays2", "Pays3", "Pays4", "Pays5"), row.names = c(NA, -9L), class = "data.frame")

The purpose is to transform df1 into df2 by giving, for every group of rows A, B and C and for each column, the value 1 if the group contains at least one value equal to 1, and the value 0 otherwise.

Thanks for your help

-- Michel ARNAUD Chargé de mission auprès du DRH DGDRD-Drh - TA 174/04 Av Agropolis 34398 Montpellier cedex 5 tel : 04.67.61.75.38 fax : 04.67.61.57.87 port: 06.47.43.55.31
Re: [R] to modify a dataframe
1. Thank you for the clear reproducible example. This made it easy to see what you wanted and provide an answer. Hopefully a correct one!

2. Many ways to do this. Here's one, but others may be better.

Step 1: First create a grouping factor for Nom to group the separate row labels into the logical groups you have specified:

grp <- factor(substring(df1$Nom,1,1))

Note that you may need to use regular expressions or some other method to do this if your naming system is more complex than you have shown.

Step 2: Create your new structure:

df2 <- df1
df2[,-1] <- lapply(df1[-1],function(x)ave(x,grp,FUN=max))

HTH. Cheers, Bert

Bert Gunter Genentech Nonclinical Biostatistics (650) 467-7374

Data is not information. Information is not knowledge. And knowledge is certainly not wisdom. H. Gilbert Welch
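The two steps can be run end to end on the data from the question. This is a sketch under one assumption: the group is the first character of Nom, which holds for the labels shown but not necessarily for other naming schemes. The result is stored in `res` (a name chosen here) to keep it apart from the df2 given in the question.

```r
# Data from the question, rebuilt with data.frame() for brevity
df1 <- data.frame(
  Nom   = factor(c("A1", "A2", "A3", "B1", "B2", "C1", "C2", "C3", "C4")),
  Pays1 = c(1, 1, 0, 0, 1, 0, 0, 0, 0),
  Pays2 = c(0, 0, 0, 1, 1, 0, 1, 0, 1),
  Pays3 = c(0, 0, 0, 0, 1, 0, 0, 0, 0),
  Pays4 = c(1, 0, 0, 0, 0, 0, 1, 0, 1),
  Pays5 = c(1, 1, 0, 0, 0, 0, 0, 0, 0)
)

# Step 1: grouping factor taken from the first character of Nom (assumed naming scheme)
grp <- factor(substring(as.character(df1$Nom), 1, 1))

# Step 2: replace every value by the maximum of its group, column by column
res <- df1
res[-1] <- lapply(df1[-1], function(x) ave(x, grp, FUN = max))
```

Because the columns only hold 0 and 1, the group maximum is 1 exactly when the group contains at least one 1.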
Re: [R] to modify a dataframe
Hello, Here's one way.

lst1 <- lapply(split(df1, gsub("[0-9]", "", df1$Nom)), function(x){
  x[, -1] <- lapply(x[, -1], function(y){
    z <- if(any(y == 1)) 1 else 0
    rep(z, length(y))
  })
  x
})
df3 <- do.call(rbind, lst1)
rownames(df3) <- NULL
identical(df2, df3) # TRUE

Hope this helps, Rui Barradas
Re: [R] to modify a dataframe
Hi, You could try:

df3 <- df1
library(plyr)
df3[,-1] <- ddply(df1,.(Nom1=gsub("\\d+","",Nom)),colwise(function(x) rep(max(x),length(x[,-1]
attr(df3,"row.names") <- attr(df2,"row.names")
identical(df2,df3) #[1] TRUE

A.K.
Re: [R] seq_len and loops
Göran Broström wrote:

2. However, Bill (and Henrik) raised the question of replacing '1' with '1L'; I understand the meaning of that, but does it matter (in practice)?

On 12/22/2013 06:57 PM, William Dunlap wrote: for (i in seq_len(x - 1) + 1) should be efficient and safe. Oops, not safe when x is 0. Also, the '+ 1' should be '+ 1L' to get the same answer as seq_len(x)[-1].

It depends what your practice involves. seq_len(n)[-1], 2:n, and seq_len(n-1)+1L all produce an integer vector (if 0 < n < 2^31 or so). seq_len(n-1)+1 produces a numeric (double precision floating point) vector. Integers and numerics have different properties which might affect your results, but in many cases you will not care. Integers use 4 bytes of memory, numerics 8. Integers have 32 bits of precision, numerics 52. Integers range from -2^31+1 to 2^31-1 and arithmetic which would give a result outside of that range results in NA (with a warning). Numerics range from c. -2^1023 (c. -10^308) to c. 2^1023 (c. 10^308) and arithmetic which would give a result outside of that range results in +-Inf.

If you prefer a sequence to be numeric, then use as.numeric(seq_len(n)), as.numeric(seq_len(n))[-1], or seq_len(n)+1 when making it. If you prefer integers, then use seq_len(n), seq_len(n)[-1], or seq_len(n)+1L. If you don't care, do whatever seems easiest at the time.

Göran Broström wrote:

if (x > 1){ for (x in 2:x){ ... is the easiest, most effective, and most easy-to-understand.

The dangerous part of that idiom is what you do in the 'else' part of the 'if' statement. Do both clauses make objects with the same names and types? I mildly prefer avoiding if statements because it makes reasoning about the results of the code more complicated.

Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com

-----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Göran Broström Sent: Tuesday, December 31, 2013 4:10 PM To: r-help@R-project.org Subject: Re: [R] seq_len and loops

Thanks for the answers from Duncan, Bill, Gabor, and Henrik. You convinced me that

1. The solution if (x > 1){ for (x in 2:x){ ... is the easiest, most effective, and most easy-to-understand.

2. However, Bill (and Henrik) raised the question of replacing '1' with '1L'; I understand the meaning of that, but does it matter (in practice)?

3. No one commented upon i <- 1 while (i < x){ i <- i + 1 ... } I suppose that it means that it is the best solution.

Thanks, and Happy New Year 2014! Göran

On 12/22/2013 06:57 PM, William Dunlap wrote: for (i in seq_len(x - 1) + 1) should be efficient and safe. Oops, not safe when x is 0. Also, the '+ 1' should be '+ 1L' to get the same answer as seq_len(x)[-1]. Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com

-----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Duncan Murdoch Sent: Saturday, December 21, 2013 3:52 PM To: Göran Broström; R-help@r-project.org Subject: Re: [R] seq_len and loops

On 13-12-21 6:50 PM, Duncan Murdoch wrote: On 13-12-21 5:57 PM, Göran Broström wrote:

I was recently reminded on this list that "Using 1:ncol() is bad practice (seq_len is designed for that purpose)" (Ripley). This triggers the following question: What is good practice for 2:ncol(x)? (This is not a joke; in a recursive situation it often makes sense to perform the calculation for the start value i = 1, then continue with a loop over the rest, the Fortran way;) I usually use

if (ncol(x) > 1) for (i in 2:ncol(x)){

but I can think of

for (i in seq_len(x - 1)){ I <- i + 1

and

i <- 1 while (i < ncol(x)){ i <- i + 1

What is good practice (efficient and safe)?

for (i in seq_len(x - 1) + 1) should be efficient and safe.

Oops, not safe when x is 0. A little less efficient, but clearer would be for (i in seq_len(x)[-1])

Duncan Murdoch
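The points made in this thread can be checked directly at the console. The snippet below is only an illustration; the variable names are invented for it.

```r
n <- 1L

# 2:n counts downward when n < 2, so a loop over it would run for i = 2 and i = 1
bad <- 2:n

# seq_len(n)[-1] is empty when n is 0 or 1, so a loop over it is skipped
safe <- seq_len(n)[-1]

# Adding the double 1 promotes the sequence to double; adding 1L keeps it integer
as_double  <- seq_len(5) + 1
as_integer <- seq_len(5) + 1L

# seq_len() rejects a negative length, which is why seq_len(x - 1) fails when x is 0
fails_at_zero <- inherits(tryCatch(seq_len(0 - 1), error = function(e) e), "error")

# Integer arithmetic that leaves the representable range gives NA (with a warning)
overflow <- suppressWarnings(.Machine$integer.max + 1L)
```

For loop indices the integer/double distinction rarely matters, as Bill notes; the empty-sequence behaviour is the part that affects correctness.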
Re: [R] looping through columns in a matrix or data frame
Hi, Maybe this helps. Using your function:

mapply(less,test,4)
#or
invisible(mapply(less,test,4))
#[1] 2 3
#[1] 3
#or
for(i in 1:ncol(test)){ less(test[,i],4)}
#[1] 2 3
#[1] 3

A.K.

Hi, I'm trying to figure out how to loop through columns in a matrix or data frame, but what I've been finding online has not been very clear. I've written the following simple function that I can use on a column to extract all values that are less than a specified number. Consider the following example using that function to extract all values less than 4 from column1 of the table test:

less <- function(x,y){print(x[which(x < y)])}

test
  column1 column2
1       2       3
2       3       4
3       4       5

less(test[,1],4)
[1] 2 3

What I want to do is loop that function over all the columns in the table. Note: I realize that this is a silly example and there are better ways to do this particular function in R, so please don't respond with better ways to extract values less than a given number. The question that I am interested in is merely how do I loop over the columns. If you could respond by modifying my silly function so that it will loop, that would be the most helpful response. Thanks for the advice!
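If the per-column results should be kept rather than printed, lapply() over the data frame is one option, since a data frame is a list of columns. In this sketch the table `test` is rebuilt from the values in the question, and `less` is rewritten to return its result instead of printing it, which is a small departure from the original function.

```r
# Table from the question
test <- data.frame(column1 = c(2, 3, 4), column2 = c(3, 4, 5))

# Variant of the poster's function that returns the values instead of printing them
less <- function(x, y) x[which(x < y)]

# lapply() visits each column in turn; y = 4 is passed on to less()
res <- lapply(test, less, y = 4)
```

The result is a named list with one element per column, so columns may yield different numbers of values.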
[R] Question: reproducibility of random sampling with replacement
Dear All, I would like to ask for your help on the reproducibility of random sampling with replacement. For example, one re-samples the rows of a residual matrix with replacement and uses the new residual matrix thus obtained to produce a statistic; this is repeated a certain number of times.

My question: will the above procedure ever be reproducible by setting a seed? Namely, given the same residual matrix, if Ted applies the above process and so does Jack, will they get the same results by setting a seed?

My attempt: setting the seed does not stop the command sample from returning different samples, as the following code shows:

x = 1:20
S = matrix(0,5,20)
for (i in 1:5) { S[i,] = sample(x, replace=FALSE) }
set.seed(123)
T = matrix(0,5,20)
for (i in 1:5) { T[i,] = sample(x, replace=FALSE) }
sum(S==T)

I would appreciate any comments and/or suggestions on this. Regards, Chee
Re: [R] Question: reproducibility of random sampling with replacement
Hello, Inline.

On 01-01-2014 22:12, Chee Chen wrote:

My question: will the above procedure ever be reproducible by setting a seed?

Yes.

My attempt: setting the seed does not stop the command sample from returning different samples,

Yes it does, you are simply not setting the seed before the first for loop.

Rui Barradas
Re: [R] Question: reproducibility of random sampling with replacement
Why on earth would you expect S and T to be the same given what you have done? "I am unable to rightly apprehend the confusion of ideas that could provoke such a question" (Charles Babbage).

You have to set the *same* seed before each construction. I.e. do set.seed(123) before creating S; then do set.seed(123) again before creating T. If you do so, S and T will be identical.

cheers, Rolf Turner

P. S. T is not a good name for an object; too easy to confuse with TRUE. Not an egregious sin, but to be avoided. R. T.
Re: [R] Question: reproducibility of random sampling with replacement
If you want to reproduce the same sequence twice, then you need to set the seed at the beginning of each calculation. You are only doing it for the second calculation below.

Jeff Newmiller
Research Engineer (Solar/Batteries/Software/Embedded Controllers)

Sent from my phone. Please excuse my brevity.
Re: [R] Question: reproducibility of random sampling with replacement
You have to set the same seed before each random number generation! You did not do this.

Bert Gunter Genentech Nonclinical Biostatistics (650) 467-7374

Data is not information. Information is not knowledge. And knowledge is certainly not wisdom. H. Gilbert Welch
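The advice given in all of these replies can be sketched as follows. The helper `draw()` is a name made up for this illustration; it sets the seed and then fills the matrix, so two calls with the same seed go through the same sequence of random numbers.

```r
x <- 1:20

# Hypothetical helper: set the seed, then draw five resamples of x with replacement
draw <- function(seed) {
  set.seed(seed)
  m <- matrix(0L, 5, 20)
  for (i in 1:5) m[i, ] <- sample(x, replace = TRUE)
  m
}

S1 <- draw(123)  # first run
S2 <- draw(123)  # second run with the same seed, hence the same draws
```

Two people running this with the same seed should get the same matrix as long as they also use the same R version and random number generator settings, since the default sampling algorithm has changed between R releases.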
[R] How to verify char variables contain at least one value
Happy new year fellows, I am trying to do something I believe should be fairly straightforward but I cannot find my way out. My dataset d2 is 26 rows by 245 columns, exclusively char variables. I would like to check whether at least one column from V13 to V239 (they are in numerical sequence) has been filled in, so I try

d2$check <- c(d2$V13:d2$V239)

and/or

d2$check <- paste(d2$V13:d2$V239,sep="")

but I get (translated from Italian):

Error in d2$V13:d2$V239 : argument NA/NaN

I have tried nchar but the same error occurs. I have also tried to run the above functions on a smaller subset of variables (V13, V14, V15, see below for details) just to double check in case some variable is erroneously in another format, but the same error occurs.

d2$V13
[1] da -5.1% a -10%
[9]
[17]
[25]
d2$V14
[1] da -10.1% a -15%
[9]
[17]
[25]
d2$V15
[1]

Can anyone suggest an alternative function for me to create a variable that checks whether there is at least one value for each of the 26 records I need to analyze? Thank you in advance, Luca
[R] R crashes with memory errors on a 256GB machine (and system shows only 60GB usage)
Hi All, I have a terrible issue I can't seem to debug, and it is halting my work completely. I have R 3.0.2 installed on a Linux machine (Arch Linux, latest) which I built specifically for running high-memory-use models. The system is a 16-core, 256 GB RAM machine. It worked well at the start, but in recent days I keep getting errors and crashes regarding memory use, such as "cannot create vector size of XXX", "not enough memory", etc. When looking at top (the Linux system monitor) I see that I barely reach 60 GB of RAM (out of 256 GB). I really don't know how to debug this and my whole work is halted because of it, so any help would be greatly appreciated. Best wishes, Z
Re: [R] How to verify char variables contain at least one value
On 01/02/2014 05:17 PM, Luca Meyer wrote:

Can anyone suggest an alternative function for me to create a variable that checks whether there is at least one value for each of the 26 records I need to analyze?

Hi Luca, Perhaps you are looking for something like this:

d2check <- unlist(apply(as.matrix(d2[,paste("V",13:239,sep="")]),1,nchar))
# to test for any non-empty rows
any(d2check)

Jim
Re: [R] How to verify char variables contain at least one value
Hello, Luca, also a happy new year! It's not quite clear to me what you want to do, but note first that the :-operator is a short-cut for seq() with by = 1 (look at ?seq), and that it usually (!) does not work on columns of data frames. Exception: when used for the argument select of function subset().

Second, you seem to want to check in each row of d2 if there is any entry different from "", right? So, does

apply( subset( d2, select = V13:V239), 1, function( x) any( x != ""))

do what you want?

Hth -- Gerrit
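A vectorised variant of the same row-wise check is sketched below on a made-up three-row data frame with only V13 to V15; on the real data the column range would be 13:239. It assumes that unfilled cells are empty strings rather than NA.

```r
# Small stand-in for d2 (made-up values); empty strings mark unfilled cells
d2 <- data.frame(
  V13 = c("", "da -5.1% a -10%", ""),
  V14 = c("", "", ""),
  V15 = c("da -10.1% a -15%", "", ""),
  stringsAsFactors = FALSE
)

# Build the column names instead of applying ":" to the columns themselves
cols <- paste0("V", 13:15)

# Count the non-empty cells per row; a row passes if it has at least one
d2$check <- unname(rowSums(d2[, cols] != "") > 0)
```

If missing values can occur, rowSums(..., na.rm = TRUE) would treat NA cells as unfilled.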